Drawdown Envelope (Monte Carlo) Deep Dive

How the platform separates structural drawdown risk from sequencing luck using 10,000 randomized trade orderings.

Every panel in the Drawdown Envelope modal explained — Monte Carlo bootstrap resampling, percentile comparison, expected range bounds, and sequencing assessment.

16 min · Intermediate

The Envelope Modal

The Drawdown Envelope modal opens from the Drawdown Envelope card on the results dashboard. It answers a question that the standard Drawdown Analysis cannot: was the observed drawdown a product of strategy structure, or did trade sequencing make it better or worse than it needed to be?

Every backtest produces a single equity curve with a single maximum drawdown. But the same set of trades, taken in a different order, would produce a different curve and a different drawdown. The Monte Carlo envelope shows you the full distribution of possible drawdowns from the same trades — 10,000 randomized orderings that reveal how sensitive the result is to sequencing.

The modal has a 2x2 grid comparing your observed drawdown to the simulated distribution, followed by a full-width interpretation section that translates the numbers into diagnostic text.

Drawdown Envelope (Monte Carlo)

Drawdown dispersion under 10,000 randomized trade orderings

↻ Re-run

Observed vs Typical

Your Max DD: 26.4%
Typical (Median): 21.6%
Difference: +4.8% above typical

Expected Range

Best case (5th): 10.1%
Worst case (95th): 49.4%
Range width: 39.3 pts

Most random orderings fall within this range

Drawdown Distribution

Bar from 10% to 49% (markers: You, Median)

Sequencing Assessment

Percentile: 66th

Slightly worse than typical, not extreme

Sensitivity: Moderate (scale: Low · Moderate · High)

Observed drawdown is slightly above typical but within expected range

Observed Sequencing Interpretation

Characteristics

  • Drawdown falls within normal simulated range
  • Not driven by extreme trade ordering
  • Outcome reflects strategy structure rather than ordering

Implications

  • Outcome influenced by sequencing, but within normal variation
  • Future drawdowns may vary within expected bounds
  • No evidence of hidden tail risk from sequencing

How the Simulation Works

The engine uses bootstrap resampling with replacement. From your set of completed trades, it randomly draws N trades (where N equals your original trade count) with replacement, meaning the same trade can be picked more than once in a single simulation while others may be skipped. Each simulation therefore produces a new hypothetical trade sequence drawn from the same empirical return distribution as the original.

The Process

  1. Extract each trade's return percentage: returnPct = totalProfit / portfolioBeforeTrade
  2. For each of 10,000 simulations, bootstrap sample N trades with replacement
  3. Sum the return percentages additively (not compounded) to build a synthetic equity curve
  4. Calculate the maximum drawdown from each synthetic curve
  5. Sort all 10,000 drawdowns to form the distribution
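The five steps above can be sketched in a few lines of Python. This is a minimal illustration, not the platform's implementation: the dictionary keys mirror the `totalProfit` and `portfolioBeforeTrade` names from step 1, while `max_drawdown`, `simulate_drawdowns`, the 1.0 starting equity, and the peak-relative drawdown definition are assumptions made for the example.

```python
import random


def max_drawdown(returns):
    """Largest peak-to-trough decline of an additive equity curve.

    Assumption: equity starts at 1.0 and drawdown is measured relative
    to the running peak.
    """
    equity = peak = 1.0
    worst = 0.0
    for r in returns:
        equity += r                      # step 3: additive, not compounded
        peak = max(peak, equity)
        worst = max(worst, (peak - equity) / peak)
    return worst


def simulate_drawdowns(trades, n_sims=10_000, seed=None):
    """Bootstrap N trades with replacement n_sims times; return sorted max drawdowns."""
    rng = random.Random(seed)
    # Step 1: per-trade return as a fraction of the portfolio before the trade.
    returns = [t["totalProfit"] / t["portfolioBeforeTrade"] for t in trades]
    n = len(returns)
    # Steps 2 and 4: resample with replacement, then measure each synthetic curve.
    sims = [max_drawdown(rng.choices(returns, k=n)) for _ in range(n_sims)]
    sims.sort()                          # step 5: sorted distribution
    return sims
```

The sorted list is the kind of structure from which P5, the median, and P95 can be read directly. Passing a `seed` makes a run repeatable; leaving it as `None` gives a fresh draw each time.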

Why Additive, Not Compounded?

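Under additive summation, the cumulative return of many small trades is approximately normal by the Central Limit Theorem, so the simulated equity paths stay roughly symmetric around their average. The drawdown histogram itself is still somewhat right-skewed (in the screenshot, the median sits 11.5 points above P5 but 27.8 points below P95). Compounding would instead make terminal equity approximately log-normal, with a pronounced right skew, making the percentile comparisons less interpretable. The additive approach sacrifices some mathematical precision for much clearer diagnostic value.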
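A small sketch makes the additive/compounded difference concrete. The helper names and the 1.0 starting equity are illustrative assumptions, not names taken from the platform.

```python
def additive_curve(returns, start=1.0):
    """Equity curve built by summing returns (the approach described above)."""
    curve, equity = [], start
    for r in returns:
        equity += r
        curve.append(equity)
    return curve


def compounded_curve(returns, start=1.0):
    """Equity curve built by multiplying (1 + r) factors, shown for contrast."""
    curve, equity = [], start
    for r in returns:
        equity *= 1.0 + r
        curve.append(equity)
    return curve
```

For the returns `[0.10, -0.10]`, the additive curve ends back at 1.0, while the compounded curve ends at 0.99, because the 10% loss applies to a larger base than the 10% gain did.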

Why With Replacement?

Bootstrap resampling with replacement is the standard statistical technique for estimating the sampling distribution of a statistic when you cannot generate new data. It preserves the original distribution of trade returns while allowing the simulation to explore different possible sequences. Without replacement, you would be limited to permutations of the exact same trades — still useful, but less robust to small-sample effects.
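The two sampling schemes differ by a single call in Python's standard library. The function names below are hypothetical and exist only to contrast the two approaches.

```python
import random


def bootstrap_sample(returns, rng):
    """With replacement: a trade may appear several times or not at all."""
    return rng.choices(returns, k=len(returns))


def permutation_sample(returns, rng):
    """Without replacement: the exact same trades in a shuffled order."""
    return rng.sample(returns, k=len(returns))
```

A permutation always sums to the same total return as the original, so only the path changes. A bootstrap sample can also end with a different total, which is why it explores a broader set of outcomes.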

Observed vs Typical

The top-left panel places your actual drawdown alongside the median of the simulated distribution.

Your Max DD

The actual maximum drawdown from your backtest, displayed in rose text. This is the same number shown in the Drawdown Analysis modal — the worst peak-to-trough decline in the equity curve.

Typical (Median)

The 50th percentile of the 10,000 simulated drawdowns. This represents what drawdown you would "typically" experience if you traded the same strategy with the same trades but in random order. It is the central tendency of the distribution — the most representative single number.

Difference

The gap between your observed drawdown and the median. Displayed in rose when your drawdown is above typical (you experienced worse-than-average sequencing) and in green when below (you got favourable sequencing).

In the screenshot, +4.8% above typical means your observed 26.4% drawdown was 4.8 percentage points worse than the median 21.6%. This is a mild adverse result — trade ordering made the drawdown somewhat worse than it needed to be, but not dramatically so.
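The panel's arithmetic is simple enough to sketch. The function name is hypothetical, and treating a difference of exactly zero as green is an assumption the article does not confirm.

```python
from statistics import median


def difference_from_typical(observed_dd, simulated_dds):
    """Return (typical, difference in points, display colour)."""
    typical = median(simulated_dds)
    diff = observed_dd - typical
    # Rose when above typical, green otherwise (zero treated as green: assumption).
    colour = "rose" if diff > 0 else "green"
    return typical, diff, colour
```

With an observed drawdown of 26.4 and a simulated median of 21.6, the difference is 4.8 points and the colour is rose.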

What "Above Typical" Actually Means

Being above the median does not mean the strategy is bad. It means the specific sequence of wins and losses in the backtest produced a slightly worse drawdown than most random orderings would have. The strategy's structural drawdown risk (as estimated by the median) is actually lower than what was observed. This is cautiously optimistic information — the true risk may be less than the backtest suggests.

Expected Range

The top-right panel shows the envelope bounds — the range within which 90% of simulated drawdowns fell.

Best Case (5th Percentile)

The most favourable plausible outcome. Only 5% of simulations produced a drawdown this small or smaller. In the screenshot, 10.1% means that with extremely lucky trade sequencing, the same trades could have produced a drawdown as low as 10.1% — less than half the observed 26.4%.

Worst Case (95th Percentile)

The pessimistic bound. Only 5% of simulations produced a drawdown this large or larger. At 49.4%, the same trades could have created a drawdown nearly twice the observed value if ordered unfavourably.

Range Width

The P95 minus P5 spread, expressed in percentage points. At 39.3 points, this is a wide range — drawdown varies enormously depending on trade order. This width is the single most important number in the panel because it indicates sequencing sensitivity.

  • Narrow range (under 10 points) — drawdown is stable regardless of ordering. The strategy's structure determines risk, not luck.
  • Moderate range (10-20 points) — some sensitivity to ordering. Plan for meaningfully different outcomes in live trading.
  • Wide range (over 20 points) — sequencing matters enormously. The observed drawdown is a weak predictor of future drawdown because a different sequence could produce dramatically different results.

A 39.3-point range means running this strategy live could plausibly produce anything from a mild 10% dip to a severe 49% drawdown, depending on trade ordering luck, and 1 in 10 simulations fell outside even those bounds. Risk management should be calibrated to the P95 bound, not the observed case.
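The bounds and the width bands above could be computed from a sorted list of simulated drawdowns along these lines. The nearest-rank percentile convention and the function names are assumptions; the platform may interpolate differently.

```python
import math


def percentile(sorted_vals, p):
    """Nearest-rank percentile of an ascending list (one of several conventions)."""
    rank = max(1, math.ceil(p * len(sorted_vals) / 100))
    return sorted_vals[rank - 1]


def envelope(sorted_dds):
    """Return (P5, P95, range width, band) for drawdowns given in percent."""
    p5 = percentile(sorted_dds, 5)
    p95 = percentile(sorted_dds, 95)
    width = p95 - p5
    if width < 10:
        band = "narrow"
    elif width <= 20:
        band = "moderate"
    else:
        band = "wide"
    return p5, p95, width, band
```

For the screenshot's bounds of 10.1 and 49.4, the width of 39.3 points lands in the wide band.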

Drawdown Distribution

The bottom-left panel visualises where your drawdown falls within the simulated distribution using a horizontal bar chart.

The Distribution Bar

The bar represents the full range of simulated drawdowns from minimum to maximum. The shaded P5-P95 region occupies the central portion — this is where 90% of outcomes fell. The colour gradient shifts from green-ish on the left (lower drawdowns) through orange in the middle to muted grey on the right (higher drawdowns).

Markers

Two markers are placed on the bar:

  • The grey dot ("You") shows where your observed drawdown sits in the distribution. In the screenshot, it falls slightly right of centre — confirming the 66th percentile position.
  • The orange line ("Median") marks the 50th percentile. The gap between the median line and your marker visualises the +4.8% difference from the first panel.

Reading Positions

If your marker is far left of the median, you experienced better-than-typical sequencing. If it is far right, your sequencing was adverse. If the markers nearly overlap, your drawdown was typical — exactly what you would expect from random ordering.

The x-axis labels show the P5 and P95 bounds (10% and 49% in this case), anchoring the visual to concrete numbers.

Sequencing Assessment

The bottom-right panel translates the distribution analysis into two categorical assessments: percentile position and sequencing sensitivity.

Percentile

Your observed drawdown's position within the simulated distribution. The 66th percentile means 66% of simulations produced a smaller drawdown and 34% produced a larger one. You are slightly on the unfavourable side of the distribution.

The description text varies by percentile range:

  • 25th or below — "Better than most random orderings"
  • 26th to 50th — "In the favourable half of simulations"
  • 51st to 75th — "Slightly worse than typical, not extreme"
  • Above 75th — "Worse than most random orderings"
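A sketch of how the percentile and its description might be derived. The function names are hypothetical, and counting only strictly smaller simulated drawdowns is an assumed tie-breaking rule.

```python
from bisect import bisect_left


def percentile_rank(observed_dd, sorted_dds):
    """Share of simulations with a strictly smaller drawdown, on a 0-100 scale."""
    return 100 * bisect_left(sorted_dds, observed_dd) / len(sorted_dds)


def describe_percentile(pct):
    """Map a percentile to the description text listed above."""
    if pct <= 25:
        return "Better than most random orderings"
    if pct <= 50:
        return "In the favourable half of simulations"
    if pct <= 75:
        return "Slightly worse than typical, not extreme"
    return "Worse than most random orderings"
```

A percentile of 66 falls in the third band, matching the text shown in the screenshot.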

Sensitivity

A three-level classification based on range width and percentile position:

  • Low — range width below 5 points (labelled "Insensitive") or percentile at or below 25 ("Favorable"). Drawdown varies minimally with ordering, or you got lucky.
  • Moderate — percentile between 26-75% with meaningful range width. The observed drawdown is influenced by sequencing but not extreme. This is the most common classification.
  • High — percentile above 75% ("Adverse"). The observed drawdown is worse than most simulations predict, suggesting unfavourable trade ordering.

The sensitivity indicator uses colour coding: green for Low/Favorable, amber for Moderate, and rose for High/Adverse.
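The classification rules can be expressed as a short function. The name `classify_sensitivity` is hypothetical, and the order of the checks (range width first, then percentile) is an assumption, since the article does not say which rule wins when several apply.

```python
def classify_sensitivity(range_width, percentile):
    """Return (level, label, colour) following the rules listed above."""
    if range_width < 5:
        return ("Low", "Insensitive", "green")
    if percentile <= 25:
        return ("Low", "Favorable", "green")
    if percentile > 75:
        return ("High", "Adverse", "rose")
    return ("Moderate", "Moderate", "amber")
```

For the screenshot's values (range width 39.3, 66th percentile), this returns the Moderate level with amber colouring.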

Observed Sequencing Interpretation

Below the 2x2 grid, a full-width panel provides structured diagnostic text in two columns: Characteristics (what was observed) and Implications (what it means going forward).

Characteristics

Three bullet points describe the observation:

  1. Position within range — "Drawdown falls within normal simulated range" if your drawdown is between P5 and P95, or "Drawdown falls outside the typical 90% simulation envelope" if it is more extreme than 90% of simulations.
  2. Ordering influence — "Not driven by extreme trade ordering" for percentiles between 5-95%, or "May be influenced by extreme trade ordering" for outlier percentiles.
  3. Always present — "Outcome reflects strategy structure rather than ordering." This anchoring statement reinforces that the strategy's design, not luck, is the primary driver.
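The three characteristics could be generated as follows. The function signature is an assumption; the strings are the ones quoted above.

```python
def characteristics(observed_dd, p5, p95, percentile):
    """Build the three Characteristics bullets from the envelope statistics."""
    inside = p5 <= observed_dd <= p95
    typical_order = 5 <= percentile <= 95
    return [
        "Drawdown falls within normal simulated range"
        if inside
        else "Drawdown falls outside the typical 90% simulation envelope",
        "Not driven by extreme trade ordering"
        if typical_order
        else "May be influenced by extreme trade ordering",
        # The third bullet is always present.
        "Outcome reflects strategy structure rather than ordering",
    ]
```

Calling `characteristics(26.4, 10.1, 49.4, 66)` reproduces the three bullets shown in the screenshot.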

Implications

Three bullet points describe forward-looking expectations:

  1. Stability assessment — varies by range width:
    • Range under 10 points: "Risk profile is stable under reordering"
    • Range 10-20 points: "Moderate sensitivity to trade ordering"
    • Range over 20 points: "Outcome influenced by sequencing, but within normal variation"
  2. Always present — "Future drawdowns may vary within expected bounds."
  3. Tail risk check — "No evidence of hidden tail risk from sequencing" for non-outlier percentiles (10-90%), or "Observed result is an outlier" for extreme percentiles.
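The implications follow the same pattern. The function name and signature are hypothetical; the thresholds and strings come from the list above.

```python
def implications(range_width, percentile):
    """Build the three Implications bullets from range width and percentile."""
    if range_width < 10:
        stability = "Risk profile is stable under reordering"
    elif range_width <= 20:
        stability = "Moderate sensitivity to trade ordering"
    else:
        stability = "Outcome influenced by sequencing, but within normal variation"
    tail = (
        "No evidence of hidden tail risk from sequencing"
        if 10 <= percentile <= 90
        else "Observed result is an outlier"
    )
    # The middle bullet is always present.
    return [stability, "Future drawdowns may vary within expected bounds", tail]
```

For a range width of 39.3 and the 66th percentile, the output matches the Implications column in the screenshot.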

Practical Use and Cross-References

The Monte Carlo envelope is most powerful when used to calibrate expectations rather than to pass/fail a strategy.

Setting Realistic Risk Limits

If the P95 worst case is 49.4%, you should plan for a drawdown near that level — not the 26.4% you observed. The observed drawdown is a single sample. The P95 is the planning boundary: "given my trades, the worst plausible drawdown from bad sequencing is around 49%." If that exceeds your tolerance, reduce position size or tighten stops before deploying.
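One rough way to turn the P95 bound into a sizing decision is to scale position size by the ratio of tolerance to P95. This is only a first-order approximation that assumes drawdown scales roughly linearly with position size; the function name and the 25% tolerance in the example are hypothetical.

```python
def size_multiplier(p95_dd_pct, tolerance_pct):
    """Fraction of the backtested position size that keeps P95 near the tolerance."""
    if p95_dd_pct <= 0 or tolerance_pct <= 0:
        raise ValueError("drawdown and tolerance must be positive")
    # Never scale above the backtested size.
    return min(1.0, tolerance_pct / p95_dd_pct)
```

With a P95 of 49.4% and a hypothetical tolerance of 25%, the multiplier is about 0.51, i.e. roughly half the backtested size. After resizing, re-running the backtest and the envelope would confirm whether the new P95 actually fits.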

Envelope + Drawdown Analysis

The Drawdown Analysis modal shows the depth, duration, and recovery pattern of the observed drawdowns. The Envelope shows whether that observation was typical. Together, they answer: "how bad was it, and how bad could it have been?" If the observed max drawdown is at the 20th percentile, the strategy was lucky — live trading will likely produce worse drawdowns. If it is at the 80th percentile, the strategy was unlucky — structural risk is actually lower than the backtest implies.

Envelope + Performance Metrics

The Performance Metrics card shows return-based ratios (Sharpe, Calmar). The envelope adds a dimension: if the range width is large, those ratios are unstable. A Calmar ratio computed from the observed drawdown could be dramatically different if computed from the P95 drawdown instead. For wide-range strategies, compute Calmar using P95 to get a conservative risk-adjusted measure.
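The effect on Calmar is easy to see with numbers. The sketch below uses the standard definition (annualised return divided by maximum drawdown); the 30% annual return is a hypothetical figure chosen for illustration, not a value from this strategy.

```python
def calmar(annual_return_pct, max_dd_pct):
    """Calmar ratio: annualised return divided by maximum drawdown."""
    if max_dd_pct <= 0:
        raise ValueError("max drawdown must be positive")
    return annual_return_pct / max_dd_pct


ANNUAL_RETURN = 30.0                           # hypothetical annualised return, in percent
observed = calmar(ANNUAL_RETURN, 26.4)         # using the observed max drawdown
conservative = calmar(ANNUAL_RETURN, 49.4)     # using the P95 drawdown
```

Under that assumed return, the ratio drops from about 1.14 to about 0.61 when the P95 drawdown replaces the observed one.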

Envelope + Behaviour

The Behaviour card shows win/loss streaks. Long losing streaks are the primary driver of sequencing risk — they create deep drawdowns when clustered at the start of trading. If the behaviour analysis shows a max losing streak of 9 (as in this strategy), the Monte Carlo distribution will naturally be wide because clustering those 9 losses together produces very different equity curves than spreading them evenly.
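The clustering effect can be demonstrated with a toy trade set. Everything here is illustrative: the helper names, the 1.0 starting equity, the peak-relative drawdown definition, and the made-up returns (nine 2% losses and nine 3% wins).

```python
def max_losing_streak(returns):
    """Longest run of consecutive negative returns."""
    longest = current = 0
    for r in returns:
        current = current + 1 if r < 0 else 0
        longest = max(longest, current)
    return longest


def max_drawdown(returns):
    """Peak-relative max drawdown of an additive curve starting at 1.0 (assumption)."""
    equity = peak = 1.0
    worst = 0.0
    for r in returns:
        equity += r
        peak = max(peak, equity)
        worst = max(worst, (peak - equity) / peak)
    return worst


clustered = [-0.02] * 9 + [0.03] * 9     # all nine losses first
alternating = [0.03, -0.02] * 9          # same trades, losses spread out
```

Both orderings contain identical trades, yet the clustered version has a losing streak of 9 and a drawdown near 18%, while the alternating version never loses more than about 2% from its peak.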

The Re-run Button

The top-right Re-run button regenerates all 10,000 simulations with fresh random seeds. Because Monte Carlo is stochastic, each run produces slightly different percentiles and bounds. If you run it three times and get materially different results each time, the distribution is unstable — typically a signal that 10,000 simulations is insufficient (rare) or that the trade set is very small. Consistent results across runs confirm the analysis is reliable.
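The same stability check can be sketched offline by repeating the simulation under several seeds and comparing the P95 estimates. The function names, the seed values, and the peak-relative drawdown definition are assumptions for the example.

```python
import random


def max_drawdown(returns):
    """Peak-relative max drawdown of an additive curve starting at 1.0 (assumption)."""
    equity = peak = 1.0
    worst = 0.0
    for r in returns:
        equity += r
        peak = max(peak, equity)
        worst = max(worst, (peak - equity) / peak)
    return worst


def p95_drawdown(returns, n_sims, seed):
    """P95 of bootstrapped max drawdowns for one seeded run."""
    rng = random.Random(seed)
    n = len(returns)
    dds = sorted(max_drawdown(rng.choices(returns, k=n)) for _ in range(n_sims))
    return dds[max(0, (95 * n_sims) // 100 - 1)]


def rerun_spread(returns, n_sims=10_000, seeds=(1, 2, 3)):
    """Return (max minus min of the P95 estimates, the estimates themselves)."""
    estimates = [p95_drawdown(returns, n_sims, s) for s in seeds]
    return max(estimates) - min(estimates), estimates
```

A spread that is small relative to the P95 itself suggests the percentiles are stable across runs; a large spread is the offline equivalent of seeing materially different results after pressing Re-run.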
