believe · v133

A FUSION-DUAL quantitative strategy on E-mini S&P 500 futures. Three models, an ML tick ensemble, an XGBoost 5‑minute bar model, and an MBP-10 microstructure classifier, each vote through independent one-cancels-all (OCA) brackets at the exchange. v133 adds an adaptive regime gate (an auto-toggle driven by 2-day rolling P&L) and cuts the traded sessions down to RTH + ETH_EUROPE only, trimming backtest maximum drawdown by about 30% versus v131 at the same profit tier. Purged walk-forward validated. Marketable-limit execution. No black boxes, no moving goalposts.

ES CME Futures
MBP-10 Order Book
0.826 Walk-Fwd AUC
v133 Production

Three independent models. One instrument. One execution path.

believe trades only ES futures on CME. We do not diversify across instruments to paper over a weak model. We diversify across decision processes: three models, trained on different horizons and different views of the same tape, each firing its own qty=1 marketable-limit bracket. If any one model is the only voice in the room, it is the only voice that trades.

01 / Tick ML

Order-flow sequence model

An LSTM-style sequence network consumes the tick stream — signed trade size, local imbalance, VPIN, Kyle’s λ, microprice drift, realised-volatility terms — and emits a short-horizon directional probability. Labels come from a triple-barrier scheme with embargo.
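
As a rough illustration of triple-barrier labelling, the sketch below assigns +1, -1, or 0 depending on which barrier a price path touches first. The function name, the barrier widths in ticks, and the step-based time limit are assumptions made for the example; the production parameters are not stated here, and the embargo step is not shown.

```python
from typing import Sequence


def triple_barrier_label(
    prices: Sequence[float],
    entry_idx: int,
    up_ticks: int,
    down_ticks: int,
    max_steps: int,
    tick_size: float = 0.25,  # ES minimum price increment
) -> int:
    """Return +1 if the upper barrier is hit first, -1 if the lower
    barrier is hit first, and 0 if the time barrier expires first.

    Barrier widths and the step-based horizon are illustrative
    assumptions, not the production settings.
    """
    if not 0 <= entry_idx < len(prices):
        raise IndexError("entry_idx is outside the price series")

    entry = prices[entry_idx]
    upper = entry + up_ticks * tick_size
    lower = entry - down_ticks * tick_size
    # The vertical (time) barrier: stop looking after max_steps observations.
    last = min(entry_idx + max_steps, len(prices) - 1)

    for i in range(entry_idx + 1, last + 1):
        if prices[i] >= upper:
            return 1
        if prices[i] <= lower:
            return -1
    return 0


# Made-up price path, for illustration only.
path = [5000.00, 5000.25, 5000.50, 5001.00, 4999.00]
label = triple_barrier_label(path, entry_idx=0, up_ticks=4, down_ticks=4, max_steps=10)
```

With this made-up path the upper barrier (entry plus four ticks) is reached before the lower one, so the label is +1. Shortening `max_steps` to 2 would let the time barrier expire first and give 0.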

02 / XGB 5m

Structural bar model

Gradient-boosted trees on 52 bar-level features: trend strength, session context, range dynamics, volatility regime, relative position in the day. One decision per closed 5-minute bar. Validated out of sample by walk-forward testing; the test fold is not touched until the purge and embargo windows have elapsed.

03 / F2_dom

MBP-10 microstructure

A LightGBM classifier on 17 live order-book features, including book imbalance, top-of-book ratio, top-three imbalance, bid and ask gradients across ten levels, rolling standard deviation of imbalance, and midprice momentum. Trained on 1.45M labelled samples. AUC of 0.826 ± 0.015 under purged K-fold.
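
To make the imbalance features concrete, here is a minimal sketch of a signed size imbalance computed over the top levels of an MBP-10 snapshot. It uses the common definition (bid size minus ask size, divided by their sum); whether F2_dom uses exactly this formula is an assumption, and the function name and sample sizes are invented for the example.

```python
from typing import Sequence


def book_imbalance(
    bid_sizes: Sequence[float],
    ask_sizes: Sequence[float],
    depth: int,
) -> float:
    """Signed size imbalance over the top `depth` levels, in [-1, 1].

    Positive values mean more resting size on the bid than on the ask.
    Level 0 is the best bid / best ask.
    """
    bid = sum(bid_sizes[:depth])
    ask = sum(ask_sizes[:depth])
    total = bid + ask
    if total == 0:
        return 0.0  # empty book: treat as balanced
    return (bid - ask) / total


# Made-up three-level snapshot, for illustration only.
bids = [30, 20, 10]
asks = [10, 20, 30]

top_of_book = book_imbalance(bids, asks, depth=1)  # best level only
top_three = book_imbalance(bids, asks, depth=3)    # top three levels
```

In this snapshot the best level leans towards the bid while the top three levels are balanced, which is the kind of disagreement between depths that a classifier can learn from.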

Our edge is in the plumbing, not the forecast.

Everyone has a model. The difference is what you do between the model and the exchange. believe is designed around four choices that most retail quant stacks quietly get wrong.

Why ES

E-mini S&P 500 is the deepest, most liquid equity-index futures contract in the world. Tight spreads, minimal size impact at qty=1, round-the-clock price discovery. It rewards microstructure honesty and punishes overfitting — the right crucible for a research-grade system.

Why microstructure

Price is the last thing to move. The order book is the cause, the tape is the effect. F2_dom reads 10 levels of bids and asks around the clock. Structural imbalance, queue gradients, top-of-book pressure — none of it shows up in a 5-minute candle. All of it shows up before the candle closes.

Why purged walk-forward

Standard K-fold CV leaks the future into the past through overlapping labels and autocorrelated features. We use López de Prado purged K-fold with embargo, triple-barrier labels, and isotonic calibration on held-out folds. The reported AUC is what a deployed model would have seen, not a curve-fit fantasy.
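
The purge-and-embargo idea can be sketched in a few lines. The generator below is a simplified illustration, not the production splitter: it assumes samples are sorted by time, that each sample carries a label start time `t0` and a label end time `t1` in the same units as `embargo`, and it drops any training sample whose label window overlaps the test fold or falls inside the embargo that follows it.

```python
from typing import Iterator, List, Sequence, Tuple


def purged_kfold(
    t0: Sequence[float],
    t1: Sequence[float],
    n_splits: int = 5,
    embargo: float = 0.0,
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs with purging and embargo.

    t0[i] is when sample i's label window opens and t1[i] is when it
    closes. Samples are assumed to be sorted by t0.
    """
    n = len(t0)
    if n < n_splits:
        raise ValueError("need at least one sample per fold")

    # Contiguous, time-ordered folds.
    bounds = [k * n // n_splits for k in range(n_splits + 1)]
    for k in range(n_splits):
        test_idx = list(range(bounds[k], bounds[k + 1]))
        test_start = min(t0[i] for i in test_idx)
        test_end = max(t1[i] for i in test_idx)
        train_idx = [
            i
            for i in range(n)
            # Keep a sample only if its label closes before the test fold
            # opens, or opens after the test fold plus the embargo.
            if t1[i] < test_start or t0[i] > test_end + embargo
        ]
        yield train_idx, test_idx


# Ten made-up samples, one per minute, each label spanning two minutes.
starts = [float(i) for i in range(10)]
ends = [s + 2.0 for s in starts]
splits = list(purged_kfold(starts, ends, n_splits=5, embargo=1.0))
```

For the middle fold (samples 4 and 5) only samples 0, 1, and 9 survive as training data: samples 2 and 3 are purged because their labels overlap the test window, and samples 6 to 8 fall inside the overlap or the embargo.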

Why marketable limit

Market orders can quietly cost you a tick on every fill. Over thousands of trades, that tick is the difference between a real edge and an imagined one. believe enters with marketable limit orders: aggressive enough to fill, disciplined enough to refuse worse prices. Every order attaches a server-side OCA bracket before acknowledgement.
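
A marketable limit price can be derived from the current quote, as in the hedged sketch below. The function name and the `max_cross_ticks` allowance are assumptions for illustration; the actual offset is not stated here. The 0.25 tick size is the ES minimum price increment.

```python
def marketable_limit_price(
    side: str,
    best_bid: float,
    best_ask: float,
    max_cross_ticks: int = 1,   # hypothetical allowance beyond the touch
    tick_size: float = 0.25,    # ES minimum price increment
) -> float:
    """Limit price that crosses the spread but caps the worst fill.

    A buy is priced at the best ask plus the allowance; a sell at the
    best bid minus the allowance. The result is snapped to the tick grid.
    """
    if side == "BUY":
        raw = best_ask + max_cross_ticks * tick_size
    elif side == "SELL":
        raw = best_bid - max_cross_ticks * tick_size
    else:
        raise ValueError("side must be 'BUY' or 'SELL'")
    return round(raw / tick_size) * tick_size


# Made-up quote, for illustration only.
buy_limit = marketable_limit_price("BUY", best_bid=5000.00, best_ask=5000.25)
sell_limit = marketable_limit_price("SELL", best_bid=5000.00, best_ask=5000.25)
```

Unlike a market order, the buy in this example cannot fill above 5000.50; if the offer moves further away before the order arrives, the order rests or is cancelled instead of chasing.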

How we measure ourselves.

We do not publish daily P&L on a marketing site. What you see below is the kind of statistic that belongs in a risk review: classification quality of the microstructure model on purged, embargoed folds; hit rate and profit factor of the full stack against 79 days of real captured tick data (RTH only, adaptive regime gate on).

0.826 F2_dom AUC
76.1% Stack Hit Rate
1.66 Profit Factor
−30% Max DD vs v131

The F2_dom_v133 classification metric is purged K-fold (5 folds, 10-minute embargo) on 1.45M labelled samples, 2026-01-27 to 2026-04-15. Stack hit rate and profit factor are from a 79-day backtest (2026-01-29 to 2026-04-17, RTH only) on real captured tick and MBP-10 data, qty=1, with commission and a ±1 to 2 tick stop-slippage model, adaptive regime gate on. The max-DD reduction compares v133 (adaptive gate) with v131 (no gate) on the same tape. Past results do not guarantee future performance.
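
For readers unfamiliar with the two stack statistics, the sketch below shows the usual definitions: hit rate is the share of trades with positive net P&L, and profit factor is gross profit divided by gross loss. Counting break-even trades as non-wins is an assumption, and the trade list is made up for the example; it is not backtest data.

```python
from typing import Sequence


def hit_rate(pnls: Sequence[float]) -> float:
    """Fraction of trades with strictly positive net P&L."""
    if not pnls:
        raise ValueError("no trades")
    wins = sum(1 for p in pnls if p > 0)
    return wins / len(pnls)


def profit_factor(pnls: Sequence[float]) -> float:
    """Gross profit divided by gross loss (infinite if there are no losses)."""
    gross_profit = sum(p for p in pnls if p > 0)
    gross_loss = -sum(p for p in pnls if p < 0)
    if gross_loss == 0:
        return float("inf")
    return gross_profit / gross_loss


# Made-up per-trade net P&L in dollars, after costs.
trades = [50.0, -25.0, 37.5, -12.5, 25.0]
hr = hit_rate(trades)
pf = profit_factor(trades)
```

Here three of five trades win (hit rate 0.6) and gross profit is three times gross loss (profit factor 3.0). The two numbers are worth reading together, since a high hit rate with small wins and large losses can still produce a profit factor below 1.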

Pick a depth.

From a 60-second investor overview down to the full methodology paper. Pages are ordered high level first.