Methodology guide

Survivorship Bias in Public Trader Rankings — How to Spot It

Public trader rankings systematically over-state performance because losers quietly disappear. Here is the math, the evidence, and the fix.

By NakedPnL Research · May 7, 2026 · 13 min read

TL;DR

Survivorship bias arises when failed accounts are removed from a ranking, so only winners remain visible.
The mutual fund literature shows survivorship bias inflates published returns by 0.5–1.5% annually depending on the period.
Crypto influencer rankings are far worse — losing accounts disappear with no record at all, often within weeks of a blow-up.
An honest registry must retain every historical entry, including delisted and inactive accounts, with the original chain intact.
NakedPnL's append-only hash chain makes it impossible to silently remove a track record after the fact.

Survivorship bias is one of the most studied flaws in published performance data, and one of the most aggressively exploited. The mechanism is simple: any ranking that only shows currently active accounts will systematically over-state the average performance of the underlying population, because the worst performers have already left the dataset.

Academic finance has measured the bias for decades in mutual funds, hedge funds, and managed futures. The numbers are not subtle. In crypto, the problem is significantly worse, because the population of public traders churns faster, the platforms publishing their performance have no obligation to retain history, and the loudest voices on social media are by definition the ones who have not yet blown up.

How the bias arises

Imagine a population of 1,000 traders. In year one, half lose money and half make money. The losers, embarrassed, stop posting their P&L. The platforms hosting their dashboards delete their accounts after 90 days of inactivity. The winners stay visible. At the end of the year, the published 'top 100' list shows an average return of, say, 60%, while the average return of the original 1,000 is closer to zero. The gap exists because the bottom 500 are no longer counted.

The bias compounds over multiple years. Apply the same survival-filtering logic across five years and the published ranking is dominated by traders who happened to be lucky in every single intervening period. A simple coin-flip model makes the point: if every trader has a 50% chance of a winning year regardless of skill and drops out after a losing one, (1/2)^5 = 1/32 of the starting cohort would still be visible after five years, and their published track records would look extraordinary.
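The 1-in-32 figure can be checked with a short simulation. This is a minimal sketch, assuming each zero-skill trader has an independent 50% chance of a winning year and stays visible only while every year so far has been a win. The function name and parameters are illustrative.

```python
import random

def surviving_fraction(n_traders=100_000, years=5, p_win=0.5, seed=0):
    """Fraction of a zero-skill cohort that wins every year and so stays visible."""
    rng = random.Random(seed)
    survivors = 0
    for _ in range(n_traders):
        # A trader stays visible only if every yearly coin flip is a win.
        if all(rng.random() < p_win for _ in range(years)):
            survivors += 1
    return survivors / n_traders

# Should land close to 1/32 = 0.03125; the exact value depends on the seed.
print(surviving_fraction())
```

Every one of those survivors shows five winning years in a row, even though the model contains no skill at all.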

The mutual fund literature

The seminal study is Brown, Goetzmann, Ibbotson, and Ross (1992), 'Survivorship Bias in Performance Studies'. Working with a multi-decade mutual fund dataset that included both surviving and dead funds, they showed that databases that excluded the dead funds over-stated cross-sectional fund performance and dramatically over-stated persistence of returns.

“Persistence of performance is largely an artifact of survivorship bias. When the analysis is corrected for survival, much of the apparent persistence disappears.”
— Brown, Goetzmann, Ibbotson, Ross (1992)

Subsequent work refined the magnitude. Malkiel (1995) estimated US equity mutual fund survivorship bias at roughly 150 basis points per year over 1982–1991. Carhart, Carpenter, Lynch, and Musto (2002) studied the same effect in US mutual funds and found that the bias grows with the length of the sample period. The pattern is consistent across asset classes, eras, and geographies: any sample restricted to currently-living entities over-states the population mean.

Why crypto rankings are worse

Three structural features amplify the problem in crypto-trader rankings:

Faster failure rate. A regulated mutual fund manager who underperforms gets fired over years; a leveraged perp trader can see a seven-figure account liquidated in hours and disappear from social media the same week.
No retention requirement. The major influencer-tracker platforms typically delete or hide accounts that go inactive, fail KYC, or specifically request removal — there is no SEC-equivalent recordkeeping rule keeping the failures visible.
Self-selection at entry. Crypto P&L is voluntarily published by traders who already think they are good. The implicit filter is even tighter than for mutual funds, where the manager has at least committed years of career capital to the fund.

The combination produces a published cohort that bears almost no resemblance to the underlying population of traders. A typical 'top 100 traders this month' panel is, in effect, a snapshot of the lottery winners with the millions of losing tickets thrown away.

A worked example of the magnitude

Consider a simulation: 10,000 traders, each drawing one independent return per year from a normal distribution with mean 0 and standard deviation 50%. At the end of each year, remove the bottom decile of the traders still visible (the most embarrassed losers stop publishing). Repeat for five years.

Year	Surviving traders	True mean of original cohort	Mean of survivors	Bias
0	10,000	0%	0%	0%
1	9,000	0%	+6.4%	+6.4%
2	8,100	0%	+12.0%	+12.0%
3	7,290	0%	+16.8%	+16.8%
4	6,561	0%	+21.0%	+21.0%
5	5,905	0%	+24.6%	+24.6%

Simulated bias from removing the bottom decile each year, even when true mean skill is zero. By year 5 the published 'top traders' show a 24.6% mean return that is entirely an artifact of the filtering.

These numbers are illustrative, not measured, but they point in the same direction as what the academic literature finds in real datasets. The fundamental claim, replicated across hundreds of papers since the 1990s, is that survivor-only samples cannot be used to estimate population skill without an explicit correction.
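The decile-removal experiment described above can be sketched in a few lines. This version makes two assumptions the text leaves open: selection each year uses only that year's return, and the reported figure is the survivors' mean return for that year. Other choices (for example, ranking on cumulative return) give different figures, so the output should be read for direction and rough size, not compared digit for digit with the table.

```python
import random
import statistics

def simulate(n_traders=10_000, years=5, sigma=0.5, seed=0):
    """Zero-skill cohort; each year the bottom decile by that year's return drops out."""
    rng = random.Random(seed)
    alive = n_traders
    rows = []
    for year in range(1, years + 1):
        # Every trader still visible draws an independent N(0, sigma) return.
        returns = sorted(rng.gauss(0.0, sigma) for _ in range(alive))
        # The worst 10% stop publishing; only the rest remain in the ranking.
        survivors = returns[alive // 10:]
        alive = len(survivors)
        rows.append((year, alive, statistics.fmean(survivors)))
    return rows

for year, alive, mean_ret in simulate():
    print(f"year {year}: {alive} survivors, mean return of survivors {mean_ret:+.1%}")
```

The true mean of every draw is zero by construction, so any positive mean among the survivors comes from the filtering alone.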

Backfill bias is the silent twin

Survivorship bias has a less famous but equally damaging cousin: backfill bias. When a new fund or trader joins a database, they typically only do so after a successful initial run. Their first year of returns is then back-published into the database as if it had been recorded contemporaneously. The published track record of the database as a whole picks up an artificial boost from the implicit selection at entry.

Fung and Hsieh (2000) measured the backfill bias in commercial hedge fund databases at roughly 140 basis points per year. Combined with survivorship, the two biases together can shift the published mean of a hedge fund index by 250–300 basis points relative to the true population return — an enormous adjustment, larger than the entire excess return claimed by the index in many years.
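The entry-selection mechanism behind backfill bias can be illustrated with a toy model. This is a sketch under stated assumptions, not the method used by Fung and Hsieh: every fund has a true mean return of zero, first-year returns are normal with a 20% standard deviation, and a fund joins the database (back-publishing its first year) only if that year was positive.

```python
import random
import statistics

def backfill_demo(n_funds=20_000, sigma=0.2, seed=1):
    """Compare the true mean first-year return with the mean the database shows."""
    rng = random.Random(seed)
    first_year = [rng.gauss(0.0, sigma) for _ in range(n_funds)]
    # Assumed entry rule: only funds with a positive first year choose to join.
    backfilled = [r for r in first_year if r > 0]
    true_mean = statistics.fmean(first_year)
    database_mean = statistics.fmean(backfilled)
    return true_mean, database_mean

true_mean, database_mean = backfill_demo()
print(f"true mean {true_mean:+.2%}, backfilled database mean {database_mean:+.2%}")
```

The entry rule here is deliberately extreme, so the gap is far larger than the measured 140 basis points. The point is only that selection at entry lifts the published mean without any change in skill.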

What an honest registry has to do

Eliminating survivorship and backfill bias from a public registry requires four properties:

Append-only history. Once a track record is recorded, it cannot be removed, edited, or quietly deactivated. A trader who blows up stays in the registry with their full history visible.
Single-direction time. New entries always start at their join date; no prior performance can be back-published as if it had been observed contemporaneously.
Public delisting. If a trader is removed (regulatory takedown, KYC failure, withdrawal at user request), the removal must be a recorded event with date and reason, not a silent deletion.
Independent re-verifiability. Anyone — researcher, regulator, journalist — must be able to reconstruct the full historical state of the registry, including currently-inactive entries, from a public hash chain.
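The first three properties can be sketched as a small in-memory data structure. This is an illustration only, not NakedPnL's implementation: the `Event` and `Registry` names and their fields are hypothetical, and the caller supplies the date here, whereas a real system would take it from its own clock.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Event:
    trader: str
    day: date
    kind: str      # "snapshot" or "delisted"
    detail: str    # NAV as text, or the delisting reason

class Registry:
    def __init__(self):
        self._events = []     # append-only log; no method removes or edits entries
        self._last_day = {}   # trader -> date of that trader's latest event

    def record(self, event):
        last = self._last_day.get(event.trader)
        # Single-direction time: nothing may be dated before the latest entry.
        if last is not None and event.day < last:
            raise ValueError("backdated entry rejected")
        self._events.append(event)
        self._last_day[event.trader] = event.day

    def history(self, trader):
        # Delisted traders keep their full history, including the delisting event.
        return [e for e in self._events if e.trader == trader]

reg = Registry()
reg.record(Event("alice", date(2026, 1, 1), "snapshot", "nav=100"))
reg.record(Event("alice", date(2026, 1, 2), "snapshot", "nav=0"))
reg.record(Event("alice", date(2026, 1, 3), "delisted", "withdrawal at user request"))
print(len(reg.history("alice")))
```

A delisting is just another appended event with a date and a reason, so the blow-up and the exit both stay on the record.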

How NakedPnL's chain prevents quiet removal

Every NavSnapshot row in NakedPnL is hashed with SHA-256 over the canonicalized exchange response (contentHash), and chain-linked into the trader's full history (chainHash = SHA-256(previousChainHash + contentHash)). The genesis entry uses the literal string 'genesis' as previous hash. Each day's chain head from every active account is then aggregated into a Merkle tree, and the Merkle root is anchored to Bitcoin via OpenTimestamps.
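The chain-linking formula can be sketched as follows. The details are assumptions, not NakedPnL's documented encoding: canonicalization is taken to be JSON with sorted keys, "+" is read as string concatenation of hex digests, and the snapshot fields (`date`, `nav`) are hypothetical.

```python
import hashlib
import json

GENESIS = "genesis"  # literal previous hash for the first entry, as described above

def content_hash(exchange_response: dict) -> str:
    # Assumed canonical form: JSON with sorted keys and no extra whitespace.
    canonical = json.dumps(exchange_response, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def chain_hash(previous_chain_hash: str, c_hash: str) -> str:
    # Assumed reading of "+": string concatenation of the two values.
    return hashlib.sha256((previous_chain_hash + c_hash).encode("utf-8")).hexdigest()

def build_chain(snapshots):
    """Link each snapshot to everything recorded before it."""
    prev = GENESIS
    chain = []
    for snap in snapshots:
        c = content_hash(snap)
        prev = chain_hash(prev, c)
        chain.append({"contentHash": c, "chainHash": prev})
    return chain

chain = build_chain([
    {"date": "2026-05-01", "nav": "100.00"},
    {"date": "2026-05-02", "nav": "97.50"},
])
print(chain[-1]["chainHash"])
```

Because each chainHash depends on the previous one, changing or dropping any earlier snapshot changes every later hash in that trader's history.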

This produces a structural guarantee. Removing a trader after the fact would change every Merkle root computed since they joined, so hiding the removal would require either a SHA-256 second preimage for each affected root (currently considered computationally infeasible) or falsifying every Bitcoin timestamp anchor since the relevant publication date. The cost of forgery is, in practice, prohibitive. The append-only property is enforced by math, not by NakedPnL's policy.
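A small Merkle-root sketch shows why dropping one account is detectable. The tree construction is an assumption (pairwise SHA-256 over concatenated hex strings, duplicating the last node on odd levels); the source does not specify NakedPnL's exact layout, and the chain heads here are placeholder values.

```python
import hashlib

def sha256_hex(data: str) -> str:
    return hashlib.sha256(data.encode("utf-8")).hexdigest()

def merkle_root(leaves):
    """Root of a simple binary Merkle tree over hex-string leaves."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumed rule: duplicate the last node on odd levels
        level = [sha256_hex(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

# Placeholder chain heads for three hypothetical accounts.
heads = [sha256_hex(f"chain-head-{name}") for name in ("alice", "bob", "carol")]
root_all = merkle_root(heads)
root_without_bob = merkle_root([heads[0], heads[2]])
print(root_all != root_without_bob)
```

Once the original root has been anchored, a registry that later omits one account can no longer reproduce it from the remaining chain heads.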

What the chain does not solve

Cryptographic append-only-ness prevents NakedPnL from quietly removing a track record. It does not prevent a trader from never registering in the first place — the population of registered users is still self-selected. A trader who suspects they are about to blow up can choose not to publish, and there is no way to force them to.

The honest answer is that survivorship bias at the join boundary is a fundamental limit of any voluntary system. What the registry can guarantee is that once a trader has chosen to register, their full record is permanent. The cohort of currently-visible traders will still be self-selected; the cohort of historically-visible traders will not have any quiet exits hidden from view.

How to read any trader ranking critically

When you encounter a public trader ranking — anywhere, including NakedPnL — apply this checklist:

Is the full historical roster visible, or only currently active entries? An honest registry shows both.
Can you see traders whose accounts went to zero or were closed? If not, those failures are being hidden.
Is there a publicly verifiable history that cannot be edited after the fact (hash chain, append-only ledger, archival mirror)? Without this, silent removal is trivial.
Are start dates consistent — do all displayed track records begin with the trader's actual join date, or have some been backfilled with prior unverified history?
Are inactive accounts retained for at least the same duration as active ones, with their final state preserved?

If a ranking fails any of these tests, the published numbers are not safe to use as input to any due diligence process. The averages, percentiles, and 'top X traders' lists are functions of the filtering, not of the underlying skill.

Frequently asked questions

What is the difference between survivorship bias and look-ahead bias?

Survivorship bias arises when failed entities are removed from a sample after the fact, distorting the cross-section of who is visible. Look-ahead bias arises when information that would not have been available at decision time is used to evaluate a strategy retroactively. Both are forms of dataset corruption that systematically inflate published performance, but they operate via different mechanisms and require different fixes.

Has anyone measured survivorship bias in crypto trader rankings specifically?

There is limited published academic work on crypto influencer rankings specifically because the underlying datasets are typically proprietary and not retained. Studies of crypto fund returns (Bianchi, Babiak, Dickerson 2022, among others) find magnitudes broadly consistent with the hedge fund literature — survivorship adjustments of 100–250 basis points per year are typical. Inferring from those, public crypto trader rankings, which are even less retention-disciplined than crypto funds, likely face larger biases.

If NakedPnL is append-only, what happens when a trader requests deletion under GDPR?

GDPR right-to-erasure applies to personally identifiable information, not to anonymized historical performance data. NakedPnL's response to a verified deletion request is to remove the trader's identity (handle, display name, profile metadata, identity-linked auth records) while retaining the underlying NAV snapshots and chain entries in fully anonymized form. This preserves the integrity of the registry's historical aggregate statistics while honoring the user's privacy rights. The trader's individual page becomes inaccessible; the chain remains complete.

Could NakedPnL secretly downrank a trader who blew up, even if the chain is preserved?

The chain prevents removal, not display ordering. The default registry view is sortable, and a malicious operator could in principle hide bad performers from the default view while leaving the chain intact. This is why every individual chain export at /verify/chain/[handle] is independently downloadable and the full historical data is queryable via the public v1 API — anyone can replicate the registry view with their own ordering rules and verify the underlying numbers.

Does this mean older NakedPnL traders look better on average?

It means survivors do, by construction. A trader who joined two years ago and is still active has demonstrated something — at minimum, that they did not blow up in those two years. That is informative but it is not the same as 'good'. The longest-tenured traders should be evaluated against their full history, not just the current visible position. NakedPnL's verification depth tier (Bronze/Silver/Gold) explicitly weights tenure, which makes this trade-off transparent.

What about traders who pause and resume publishing?

Pausing is recorded in the chain as a gap. The chain timestamps make it impossible to backdate the resumption — if a trader stops publishing during a drawdown and resumes after recovery, the gap is permanently visible and the missing days are not back-filled. This is one of the structural reasons NakedPnL refuses to interpolate missing NAV snapshots, even when the underlying exchange data could plausibly support it.
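A reader holding a trader's snapshot dates can find such gaps directly. This is a minimal sketch assuming one snapshot per calendar day; the function name and input format are illustrative and do not describe the NakedPnL API.

```python
from datetime import date, timedelta

def find_gaps(snapshot_days):
    """Return (first_missing_day, last_missing_day) pairs between consecutive snapshots."""
    days = sorted(set(snapshot_days))
    gaps = []
    for earlier, later in zip(days, days[1:]):
        if (later - earlier).days > 1:
            gaps.append((earlier + timedelta(days=1), later - timedelta(days=1)))
    return gaps

days = [date(2026, 3, 1), date(2026, 3, 2), date(2026, 3, 10), date(2026, 3, 11)]
for start, end in find_gaps(days):
    print(f"no snapshots from {start} to {end}")
```

A gap that lines up with a market drawdown is worth a closer look before trusting the returns on either side of it.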

References