The Most Common Crypto Backtesting Pitfalls and How to Actually Avoid Them in 2026

Quant Trading · 2026-05-30 · 比特三棱镜 Editorial Team

Plenty of people write strategies. Far fewer write honest backtests. I have seen too many “Sharpe above 4” curves destroyed within a week of going live, and the cause is almost never the alpha decaying. The cause is that the backtest lied, and the lies fall into three categories: look-ahead bias, liquidity distortion, and fee miscounting. This post unpacks each one with concrete fixes and a checklist I personally walk through every time I finish a backtest.

Three common crypto backtesting pitfalls compared in one diagram: look-ahead bias, liquidity distortion, and fee miscounting

Pitfall one: look-ahead bias, the most hidden and most fatal

Look-ahead bias means your strategy at time t uses information that, in reality, was only knowable after t. The reason it is so dangerous is that nothing in your code literally says “use the future”; the leak hides in the data structure itself.

Classic leak patterns include:

Close-price signal, close-price fill: computing an indicator on the close of a 5-minute candle and immediately filling at the same close. In live trading, that close exists only at the boundary; you can fill no earlier than the next tick, and price has usually moved.
Aggregated price feeds: using something like a CoinGecko daily aggregate price to backtest a daily strategy, when the aggregation window may cross UTC boundaries in non-obvious ways, so the “daily” value can include trades from after your signal time.
Survivor-only universe: running today’s “top 50 by market cap” list over historical data from 2022. Coins that died in between have been silently dropped, so the universe contains only survivors.
Future-settled funding: perpetual strategies often pull funding rates from an endpoint that returns the “final value of the 8-hour window” timestamped at the start of the window, so reading at t leaks information about t+8h.
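
The funding leak in the last item can be made concrete with a small sketch: re-stamp each funding record with the time it actually became knowable, then look up only records available at or before t. The record layout `(window_start, rate)`, the helper names, and the fixed 8-hour window are assumptions for illustration, not any exchange's real schema.

```python
from bisect import bisect_right
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(hours=8)  # assumed funding window length


def available_at(records):
    """Re-stamp (window_start, rate) rows with the time the rate became knowable.

    Assumes the feed stamps the final rate at the START of its window,
    so the value is only known once the window has closed.
    """
    return sorted((start + WINDOW, rate) for start, rate in records)


def funding_known_at(restamped, t):
    """Return the latest rate whose availability time is <= t, or None."""
    times = [ts for ts, _ in restamped]
    i = bisect_right(times, t)
    return restamped[i - 1][1] if i else None


utc = timezone.utc
raw = [
    (datetime(2026, 1, 1, 0, tzinfo=utc), 0.0001),
    (datetime(2026, 1, 1, 8, tzinfo=utc), 0.0003),
]
feed = available_at(raw)

# At 09:00 only the 00:00-08:00 window has closed, so 0.0001 is the
# latest usable value; a lookup keyed on window_start would return 0.0003.
print(funding_known_at(feed, datetime(2026, 1, 1, 9, tzinfo=utc)))
```

Whether a given endpoint stamps at the start or the end of the window has to be checked per data source; the re-stamping step is only needed in the first case.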

There is only one way to find these leaks: for every field that feeds a signal, ask whether that value was truly available at t. If the field is “an aggregate over a window”, be suspicious by default. My recommendation is to bake a lag parameter into the engine that delays every computed indicator by one bar by default, so a signal acting at bar t sees only values computed through bar t-1, and to disable it only once you have personally audited the data source.
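
As a minimal sketch of that default lag, assuming indicators are plain Python lists aligned to bars: the `sma` and `lag` helpers below are hypothetical names, not part of any particular backtesting library.

```python
def sma(values, window):
    """Simple moving average; None until `window` closes are available."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window:i + 1]) / window)
    return out


def lag(series, bars=1):
    """Delay a series so the value at index i is the one computed at i - bars."""
    if bars <= 0:
        return list(series)
    padded = [None] * bars + list(series)
    return padded[:len(series)]


closes = [100.0, 101.0, 103.0, 102.0, 105.0]
raw = sma(closes, 2)        # value at i uses the close of bar i itself
safe = lag(raw, bars=1)     # value at i uses closes up to bar i - 1 only

print(raw)   # [None, 100.5, 102.0, 102.5, 103.5]
print(safe)  # [None, None, 100.5, 102.0, 102.5]
```

Making `bars=1` the default and requiring an explicit `bars=0` to opt out keeps the unaudited path the safe one.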

If you have not yet built up a mental model of backtesting tooling, the crypto backtesting starter is a useful prerequisite.

Pitfall two: liquidity distortion, you are never outside the market

This one is subtler because you are not using wrong data; you are using right data while forgetting that you are part of the market.

Backtests typically fill at mid or last price, ignore slippage, and ignore book depth. That produces gorgeous numbers on small caps, but live:

Scenario	Backtest assumption	Reality
100k market buy on a small cap	Filled at mid	Eats five levels, average slips 0.8%
TWAP’d large order	No impact	You push the price up against yourself
Asian morning thin window	Same as always	Spreads triple, depth halves
Cancel/replace	Instant	200ms latency, miss the best level

The fix has two tiers:

Basic tier: simulate fills against the top-N levels of the orderbook with cumulative depth, and anything exceeding depth fills at a weighted “sweep” price.
Advanced tier: record real L2 snapshots (every 100ms is enough) and replay them, running an order matcher against each snapshot including cancel/replace latency.

For strategies that hold for minutes-to-hours and trade small relative to depth — like funding-rate arbitrage or grids — tier one is enough. For market-making or scalping you have no choice but tier two. The grid trading strategy tutorial contains a useful “hold longer to swallow depth” mental model.
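
A rough sketch of the basic tier for a market buy follows. The book layout (a list of `(price, size)` ask levels) and the `overflow_penalty` haircut for quantity beyond visible depth are assumptions; the haircut in particular is a tunable guess, not a measured value.

```python
def sweep_fill(asks, qty, overflow_penalty=0.002):
    """Average fill price for a market buy of `qty` against ask levels.

    asks: list of (price, size), sorted from best (lowest) to worst.
    Quantity beyond visible depth is priced at the worst visible level
    marked up by `overflow_penalty` (an assumed haircut).
    """
    if qty <= 0:
        raise ValueError("qty must be positive")
    if not asks:
        raise ValueError("empty book")
    remaining = qty
    cost = 0.0
    for price, size in asks:
        take = min(remaining, size)   # consume this level up to its size
        cost += take * price
        remaining -= take
        if remaining <= 0:
            break
    if remaining > 0:                 # order is larger than visible depth
        cost += remaining * asks[-1][0] * (1 + overflow_penalty)
    return cost / qty


book = [(100.0, 5), (100.5, 5), (101.0, 10)]
print(sweep_fill(book, 3))    # fits inside the best level
print(sweep_fill(book, 12))   # walks three levels
print(sweep_fill(book, 25))   # exceeds visible depth, haircut applies
```

A sell is symmetric against the bid side. Comparing the returned price with the mid gives the per-order slippage the backtest should charge.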

Pitfall three: fees, funding, and borrow — small leaks that drain the boat

The third class is technically the easiest, yet it traps the most people, because everyone subconsciously thinks “0.04% per leg is nothing.” It is everything.

Run the math. Suppose a high-frequency grid pays maker -0.005% (a rebate) and taker 0.04%, and turns over ten times a day (ten round trips, so twenty fills). A lazy backtest just multiplies 0.04% × 20 = 0.8% in daily fees and feels honest. Live:

Your fills are a mix, not one flat rate. Maybe 60% maker, 40% taker.
Maker rebates may not actually credit (some exchanges tier the rebate and settle monthly).
Funding settles every 8 hours and applies to both sides; when funding is positive shorts receive and longs pay.
Shorting spot requires borrow interest, and cross-exchange arbitrage adds withdrawal fees and bridge fees.

A serious backtest separately models each field:

maker_fee / taker_fee: read from the exchange tier you actually qualify for, including VIP discounts.
funding_rate: real settled values from the historical endpoint (mind the timezone and the settlement window), never a hand-picked average.
borrow_rate: APR for margin borrowing (for example, to short spot), accrued over the holding time.
withdrawal_fee / bridge_fee: enumerated per chain and per asset for any cross-venue play.
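
One way to keep those fields separate is a small cost model like the sketch below. The class name, the `credit_rebates` switch, and the simple (non-compounded) borrow accrual are assumptions made for illustration; rates are fractions of notional, so 0.04% is written as 0.0004.

```python
from dataclasses import dataclass


@dataclass
class CostModel:
    maker_fee: float              # negative means a rebate
    taker_fee: float
    borrow_apr: float = 0.0
    credit_rebates: bool = True   # set False if rebates are not reliably credited

    def trade_fee(self, notional, is_maker):
        """Fee paid on one fill; a negative result is a rebate received."""
        rate = self.maker_fee if is_maker else self.taker_fee
        if is_maker and rate < 0 and not self.credit_rebates:
            rate = 0.0
        return notional * rate

    def funding_cost(self, signed_notional, funding_rate):
        """Positive notional = long. With a positive rate, longs pay and shorts receive."""
        return signed_notional * funding_rate

    def borrow_cost(self, borrowed_notional, hours_held):
        # Simple accrual, an approximation that ignores compounding.
        return borrowed_notional * self.borrow_apr * hours_held / (24 * 365)


model = CostModel(maker_fee=-0.00005, taker_fee=0.0004, borrow_apr=0.10)
fills = [(10_000, True)] * 12 + [(10_000, False)] * 8   # 60% maker, 40% taker
fees = sum(model.trade_fee(n, m) for n, m in fills)
funding = model.funding_cost(-10_000, 0.0001)            # short position receives
borrow = model.borrow_cost(10_000, 24)
print(fees, funding, borrow)
```

Withdrawal and bridge fees are left out here because they are flat per-transfer amounts that depend on chain and asset; they fit naturally as a lookup table added on top.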

My personal habit is to output two curves: gross and net. Gross tells you about alpha; net tells you about executability. If the gap exceeds 30% of the gross return, the strategy is probably not ready to go live. The funding rate primer covers the settlement mechanics in more depth.
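
A small sketch of the gross-versus-net comparison, assuming per-period returns and per-period costs are already expressed as fractions of equity. Measuring the gap as the share of gross profit consumed by costs is one reasonable reading of the 30% rule, not the only one.

```python
def equity_curve(returns, start=1.0):
    """Compound per-period returns into an equity curve."""
    curve = [start]
    for r in returns:
        curve.append(curve[-1] * (1 + r))
    return curve


def cost_gap(gross_returns, costs):
    """Share of gross profit consumed by costs; None if gross profit <= 0."""
    net_returns = [g - c for g, c in zip(gross_returns, costs)]
    gross = equity_curve(gross_returns)
    net = equity_curve(net_returns)
    gross_profit = gross[-1] - gross[0]
    if gross_profit <= 0:
        return None
    net_profit = net[-1] - net[0]
    return (gross_profit - net_profit) / gross_profit


gross_returns = [0.01, -0.005, 0.02]
costs = [0.002, 0.002, 0.002]
gap = cost_gap(gross_returns, costs)
print(f"costs consume {gap:.1%} of gross profit")
if gap > 0.30:
    print("gap above 30%: probably not ready to go live")
```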

A pre-flight checklist for every backtest

The three pitfalls above translate into very specific items. This is the checklist I walk through every single time:

[ ] Every indicator is lagged by at least one bar before generating signals
[ ] Universe uses the membership at the historical date, not the current snapshot
[ ] Funding uses settled values, not aggregates that leak the future
[ ] Fills simulate orderbook depth, not mid/last
[ ] Cancel/replace carries 100-300ms latency
[ ] Maker/taker split is modelled separately, not lumped as taker
[ ] Cross-venue strategies include withdrawal + bridge fees + transit time
[ ] Both gross and net equity curves are plotted and the gap is reasonable
[ ] Walk-forward or non-overlapping out-of-sample tests are performed
[ ] The strategy is checked across three regimes — bull, bear, range — separately
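
For the walk-forward item, a minimal split generator might look like the sketch below. The function name and the rolling fixed-length training window are assumptions; an expanding window is an equally valid choice.

```python
def walk_forward_splits(n_bars, train, test):
    """Yield (train_range, test_range) pairs of bar indices.

    Each training window is immediately followed by its test window,
    and the window advances by `test` bars so test windows never overlap.
    """
    if train <= 0 or test <= 0:
        raise ValueError("train and test must be positive")
    start = 0
    while start + train + test <= n_bars:
        train_idx = range(start, start + train)
        test_idx = range(start + train, start + train + test)
        yield train_idx, test_idx
        start += test


splits = list(walk_forward_splits(n_bars=10, train=4, test=2))
for train_idx, test_idx in splits:
    print(list(train_idx), "->", list(test_idx))
```

Parameters are fitted on each training range only and then scored on the test range that follows it, so no test bar ever informs the parameters used to trade it.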

A strategy that still keeps 60% of its returns after this checklist is worth real capital. A backtest is not there to make you happy. A backtest is there to schedule the unpleasant surprises before they cost real money.

Treat the backtest as a mirror, not a proof

Most people write backtests in “prove the strategy works” mode, which subtly biases every choice towards making the numbers look good. Flip the mindset: a backtest is a mirror that exposes the strategy’s blind spots. If a backtest still prints money no matter how you tweak the parameters, the strategy is not the impressive part — the backtest is broken. When backtests start to disappoint you — a regime that fails, a coin that breaks — you finally see the real shape of the edge. Respecting that shape matters more than chasing one more decimal of Sharpe.