Backtest Metrics That Matter: Expectancy, R:R, Drawdown & More

In short

Judge a backtest by expectancy (the headline: average R per trade, net of costs), max drawdown and longest losing streak (whether you could survive it), and profit factor. Win rate alone is the metric that lies — meaningless without the reward-to-risk ratio beside it.

Expectancy — the Headline Number

Expectancy = (win rate × average win) − (loss rate × average loss)

It folds frequency and magnitude into one figure: the average R (or pips) you earn per trade. It must be positive after the full cost stack or there’s no edge. Example: 45% winners at +15 pips, 55% losers at −8 pips → 0.45×15 − 0.55×8 = +2.35 pips/trade gross; subtract ~1 pip costs → +1.35 net. Everything else is context for this number.
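As a minimal sketch of the formula above, the worked example can be reproduced in a few lines of Python. The function name `expectancy` and the flat `cost_per_trade` parameter are illustrative assumptions, not part of any particular tool.

```python
def expectancy(win_rate, avg_win, avg_loss, cost_per_trade=0.0):
    """Average result per trade, in the same unit as avg_win and avg_loss.

    avg_loss is a positive magnitude (8 means an average loss of 8 pips).
    cost_per_trade is an assumed flat charge applied to every trade.
    """
    if not 0.0 <= win_rate <= 1.0:
        raise ValueError("win_rate must be between 0 and 1")
    loss_rate = 1.0 - win_rate
    gross = win_rate * avg_win - loss_rate * avg_loss
    return gross - cost_per_trade


# The example from the text: 45% winners at +15 pips, 55% losers at -8 pips.
gross = expectancy(0.45, 15, 8)
net = expectancy(0.45, 15, 8, cost_per_trade=1.0)
print(f"gross: {gross:+.2f} pips/trade, net: {net:+.2f} pips/trade")
```

The same function works in R if the averages are given as R-multiples instead of pips.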

Win Rate — the Metric That Lies Alone

A 40% win rate is excellent or terrible depending entirely on reward-to-risk. The breakeven table:

Reward : Risk	Breakeven win rate
0.5 : 1	66.7%
1 : 1	50%
1.5 : 1	40%
2 : 1	33.3%
3 : 1	25%

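Costs push every line up. At 1:1 with costs ~10% of target, true breakeven is ≈ 52.6% if the cost is charged against winners only, and 55% if it is paid on every trade, win or lose, which is the realistic case. Quote win rate only alongside R:R, or not at all.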
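The breakeven table can be regenerated with a short sketch. It assumes risk is normalised to 1R and that the cost, expressed in R, is paid on every trade, so a winner nets reward minus cost and a loser costs 1R plus cost. The name `breakeven_win_rate` and the parameter `cost_in_r` are illustrative.

```python
def breakeven_win_rate(reward_to_risk, cost_in_r=0.0):
    """Win rate at which expectancy is exactly zero.

    Solves p * (reward - cost) = (1 - p) * (1 + cost) for p,
    with risk normalised to 1R and cost paid on every trade (an assumption).
    """
    if reward_to_risk <= 0:
        raise ValueError("reward_to_risk must be positive")
    return (1.0 + cost_in_r) / (1.0 + reward_to_risk)


# Cost-free breakeven for each reward-to-risk ratio in the table.
for rr in (0.5, 1.0, 1.5, 2.0, 3.0):
    print(f"{rr} : 1 -> {breakeven_win_rate(rr):.1%}")

# At 1:1 with a cost of 0.1R on every trade, breakeven rises from 50% to 55%.
# Charging the cost against winners only would instead give 1 / 1.9, about 52.6%.
with_costs = breakeven_win_rate(1.0, cost_in_r=0.1)
print(f"1:1 with 0.1R cost -> {with_costs:.1%}")
```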

Max Drawdown — the Survival Number

The largest peak-to-trough equity decline. This decides whether you’d actually have kept trading the strategy — and it’s the number prop firms test you against. A strategy with great expectancy and a 40% drawdown is untradeable by most humans: you’d quit (or breach a firm’s limit) before the edge paid off. Read it on the equity curve, and compare it to any drawdown limit you must respect.

Longest Losing Streak — the Psychology Number

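If the backtest contains 9 consecutive losers, expect live trading to produce a streak at least that long sooner or later. The question isn’t whether it’ll happen; it’s whether you’ll keep following the rules when it does. Streak length also sets your sizing ceiling: the loss from your worst streak at your chosen risk% (roughly streak × risk%, slightly less once each loss is taken on the reduced balance) must stay inside your drawdown tolerance (and any prop limit). An 8-loss streak at 3% risk compounds to about −22%; at 1%, about −7.7%.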
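A rough sketch of both calculations follows: finding the longest losing streak in a list of trade results, and the equity lost over a streak when each trade risks a fixed fraction of current equity. The function names and the sample trade list are illustrative, and breakeven trades (result 0) are assumed to end a streak.

```python
def longest_losing_streak(results):
    """Length of the longest run of consecutive losing trades (result < 0)."""
    longest = current = 0
    for result in results:
        if result < 0:
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # a win or a breakeven trade ends the streak
    return longest


def streak_drawdown(streak, risk_fraction):
    """Fraction of equity lost after `streak` straight full losses,
    each risking `risk_fraction` of the equity remaining at that point."""
    return 1.0 - (1.0 - risk_fraction) ** streak


# Illustrative results in R-multiples, not real backtest data.
sample = [1.5, -1, -1, 2, -1, -1, -1, 0.5]
print("longest losing streak:", longest_losing_streak(sample))

for risk in (0.03, 0.01):
    print(f"8 losses at {risk:.0%} risk: -{streak_drawdown(8, risk):.1%}")
```

Compounding is why the 3% case comes out near 22% rather than the 24% that a plain streak × risk% multiplication would give.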

Profit Factor

Profit factor = gross wins ÷ gross losses

Above 1.0 is profitable; above ~1.3 after costs is respectable for a discretionary system; a figure above ~2.5 on a small sample is suspicious and usually means overfitting or too few trades. It is a single number for how much winners outweigh losers in total.
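A minimal sketch of the calculation from a list of per-trade results (pips, currency or R, ideally net of costs). The name `profit_factor` and the sample list are illustrative; returning infinity when there are no losing trades is a convention chosen here, not a standard.

```python
import math


def profit_factor(results):
    """Gross wins divided by gross losses (losses taken as a positive total)."""
    gross_wins = sum(r for r in results if r > 0)
    gross_losses = -sum(r for r in results if r < 0)
    if gross_losses == 0:
        # No losing trades: the ratio is undefined, reported here as infinity
        # when there are wins and NaN when there are no trades with a result.
        return math.inf if gross_wins > 0 else math.nan
    return gross_wins / gross_losses


# Illustrative results: two winners of 15, three losers of 8.
pf = profit_factor([15, -8, -8, 15, -8])
print(f"profit factor: {pf:.2f}")
```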

Putting Them Together

Read them as a set, never alone:

Metric	Answers	Danger if ignored
Expectancy (net)	Is there an edge?	Trading a negative-edge system
Win rate + R:R	What kind of edge?	Misjudging viability from win rate
Max drawdown	Could I hold it?	Quitting / breaching at the worst moment
Longest streak	Can I size it safely?	Over-sizing into ruin
Profit factor	How efficient?	Mistaking a fragile fit for an edge

All five fall out of a complete journal: a few spreadsheet formulas, or the built-in stats of replay tools that track P&L. Tick tools like StrategyTune compute win/loss, expectancy and streaks automatically; keep the cost columns in your own sheet, since replay tools generally do not model swap.

Frequently Asked Questions

What's a good expectancy for a trading strategy?

Any reliably positive net-of-cost expectancy is tradeable; the magnitude matters less than the reliability and the drawdown it comes with. As a rough guide, +0.2R to +0.5R per trade is a solid discretionary result over 200+ trades. Be suspicious of much higher figures on small samples; they usually shrink with more data.

Is a high win rate good or bad?

Neither, on its own: it's only meaningful next to reward-to-risk. A 70% win rate at 0.4:1 R:R loses money (0.7×0.4 − 0.3×1 = −0.02R per trade before costs); a 35% win rate at 3:1 is comfortably profitable (0.35×3 − 0.65×1 = +0.40R). High win rates also tend to pair with occasional large losers, so always check the loss distribution and drawdown rather than celebrating the percentage.

How do I calculate max drawdown from a trade list?

Build a running cumulative-equity column, then a running-maximum column. Drawdown at each trade is running equity minus running max (zero or negative); max drawdown is the most negative value. Express it as a percentage of the peak for comparability across account sizes and against prop-firm limits.
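The column-by-column recipe above maps directly onto a short Python sketch. It assumes a list of per-trade P&L in account currency and a positive starting balance; the function name `max_drawdown` and the sample figures are illustrative.

```python
from itertools import accumulate


def max_drawdown(pnl, starting_equity):
    """Return (worst drawdown in currency, worst drawdown as a fraction of its peak).

    Both values are zero or negative. Assumes starting_equity > 0.
    """
    # Running cumulative equity, including the starting balance (Python 3.8+).
    equity = list(accumulate(pnl, initial=starting_equity))
    # Running maximum of that equity column.
    running_max = list(accumulate(equity, max))
    absolute = [e - peak for e, peak in zip(equity, running_max)]
    relative = [(e - peak) / peak for e, peak in zip(equity, running_max)]
    return min(absolute), min(relative)


# Illustrative trade list, not real data.
trades = [200, -150, -100, 300, -400, 100]
dd_abs, dd_pct = max_drawdown(trades, starting_equity=10_000)
print(f"max drawdown: {dd_abs:.0f} ({dd_pct:.2%} of peak)")
```

The two values are computed independently because the largest currency drawdown and the largest percentage drawdown do not always occur at the same trade.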

Which metric matters most for prop firm challenges?

Max drawdown and longest losing streak, because challenges fail on rule breaches, not on weak expectancy. Your worst drawdown must fit comfortably inside the firm's limit at your planned risk per trade, and your worst streak must not breach the daily loss limit — simulate both before paying.
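One way to simulate both checks is sketched below. It rests on several assumptions that real firms may not share: fixed-fractional sizing on current equity, drawdown measured from the running equity peak on closed trades, and daily loss measured against equity at the start of that day. The function name, the day labels and the 10% and 5% limits are hypothetical; substitute the firm's actual rules.

```python
def simulate_limits(trades, risk_fraction, max_dd_limit, daily_loss_limit):
    """trades: chronological list of (day_label, r_multiple).

    Returns the worst peak-to-trough drawdown, the worst loss within a
    single day, and whether both stayed under the given limits.
    """
    equity = peak = 1.0
    worst_dd = worst_day = 0.0
    current_day, day_start = None, equity
    for day, r_multiple in trades:
        if day != current_day:
            current_day, day_start = day, equity  # new day: reset the reference
        equity *= 1.0 + risk_fraction * r_multiple
        peak = max(peak, equity)
        worst_dd = max(worst_dd, 1.0 - equity / peak)
        worst_day = max(worst_day, 1.0 - equity / day_start)
    return {
        "max_drawdown": worst_dd,
        "worst_day_loss": worst_day,
        "passes": worst_dd < max_dd_limit and worst_day < daily_loss_limit,
    }


# Illustrative trades: three losers on day 1, mixed results afterwards.
history = [("d1", -1), ("d1", -1), ("d1", -1), ("d2", 2),
           ("d2", -1), ("d3", -1), ("d3", -1), ("d3", 3)]
at_1pct = simulate_limits(history, 0.01, max_dd_limit=0.10, daily_loss_limit=0.05)
at_2pct = simulate_limits(history, 0.02, max_dd_limit=0.10, daily_loss_limit=0.05)
print("1% risk passes:", at_1pct["passes"])
print("2% risk passes:", at_2pct["passes"])
```

In this made-up history the same trades stay inside both limits at 1% risk but breach the daily limit at 2%, because three losers on one day compound to a loss of nearly 6%.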
