▲ OrderFlow Backtester v3.0

Institutional-grade order flow backtesting, ML pipeline, and portfolio management platform.

Built for quant research interviews. Not a toy — a system with real execution modeling, walk-forward validation, and portfolio-level risk analysis.


What This Actually Does

Most backtesters are glorified spreadsheets. This one simulates what happens when you trade:

  • Slippage: half-spread + volatility impact + size impact (square-root market impact model)
  • Latency: 1-bar signal delay; the signal fires on one bar, execution happens on the next
  • Partial fills: 85% full-fill probability per bar; the remainder receive partial fills
  • Fees: configurable in basis points, tracked per-trade
  • Position sizing: fixed fraction, volatility-targeted (15% ann. vol), or quarter-Kelly

The ML pipeline uses walk-forward validation (not k-fold — that leaks future data in time series) and runs feature leakage detection on every training run.
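
As a sketch of the expanding-window idea (the function name and parameters are illustrative, not the pipeline's actual interface):

```python
import numpy as np

def walk_forward_splits(n_samples, n_windows=5, min_train_frac=0.4):
    """Expanding-window walk-forward splits: each fold trains on all bars
    before its test segment, so no future data leaks into training."""
    test_start = int(n_samples * min_train_frac)
    bounds = np.linspace(test_start, n_samples, n_windows + 1, dtype=int)
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        train_idx = np.arange(0, lo)   # everything strictly before the test window
        test_idx = np.arange(lo, hi)   # the next chronological segment
        yield train_idx, test_idx
```

Unlike k-fold, every training index is smaller than every test index in each fold, which is the temporal-ordering guarantee the pipeline relies on.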


Architecture

Frontend (React + Vite)                    Backend (Python + FastAPI)
┌─────────────────────┐                   ┌──────────────────────────┐
│ Dashboard            │                   │ /backtest                │
│  • Single / Portfolio│◄──── api.js ─────►│ /portfolio/backtest      │
│  • Data source select│  (retry+timeout)  │ /ml/train                │
│  • Position sizing   │                   │ /ml/insights             │
│                      │                   │ /ws/logs (WebSocket)     │
│ Results              │                   ├──────────────────────────┤
│  • Equity + Drawdown │                   │ Engine                   │
│  • Trade log (paged) │                   │  ├── strategies (5)      │
│  • Portfolio corr.   │                   │  ├── execution model     │
│  • Cost analysis     │                   │  ├── position sizing     │
│                      │                   │  ├── metrics (20+)       │
│ ML / Alpha           │                   │  └── portfolio combiner  │
│  • Walk-forward table│                   ├──────────────────────────┤
│  • Leakage check     │                   │ ML Pipeline              │
│  • Model comparison  │                   │  ├── XGBoost + LR        │
│  • SHAP + importance │                   │  ├── Walk-forward (5w)   │
│  • Signal quality    │                   │  ├── Leakage detection   │
└─────────────────────┘                   │  └── SHAP (OOS only)     │
                                           ├──────────────────────────┤
                                           │ Data Sources             │
                                           │  ├── Synthetic (GARCH)   │
                                           │  └── CSV (Yahoo, custom) │
                                           └──────────────────────────┘

No Lookahead Bias — Here's How

| Component | Prevention Method |
|---|---|
| Signals | Strategies process bars sequentially; each bar sees only past data |
| Execution | 1-bar latency delay between signal and fill |
| ML Labels | Future returns are used for labels, but the train/test split is strictly temporal |
| ML Split | 70% train → 10% val → 20% test, chronological order, no shuffling |
| Walk-Forward | Expanding window: each fold trains only on prior data |
| SHAP | Computed on the out-of-sample test set only |
| Features | All derived from rolling windows of past data |
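
The chronological split is the simplest of these guarantees; a minimal sketch (function name and signature are hypothetical):

```python
import pandas as pd

def temporal_split(df: pd.DataFrame, train: float = 0.70, val: float = 0.10):
    """Chronological 70/10/20 split: no shuffling, so validation and test
    rows always lie strictly after the training rows in time."""
    n = len(df)
    i = int(n * train)
    j = i + int(n * val)
    return df.iloc[:i], df.iloc[i:j], df.iloc[j:]
```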

Quick Start

# Backend
cd orderflow-backtester-v3
pip install -r backend/requirements.txt
uvicorn backend.main:app --reload --port 8000

# Frontend (new terminal)
cd frontend
npm install
npm run dev

Open http://localhost:5173


Strategies

| Strategy | Signal Logic | Exit Logic |
|---|---|---|
| order_flow_imbalance | Z-score of rolling OFI > ±1.5σ | Mean reversion to ±0.3σ |
| queue_exhaustion | Book-side depletion + intensity spike | Flow reversal or 12-bar timeout |
| momentum_burst | 5-bar momentum + volume spike (1.8x) | Trailing stop (vol-adjusted) |
| mean_reversion | Price z-score > ±2σ + tight spread | Z-score crosses ±0.5σ |
| composite_alpha | Weighted ensemble vote of all four | Combined signal threshold ±0.4 |
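
To make the first row concrete, here is a rough sketch of the order_flow_imbalance entry/exit logic under the thresholds above (input column and function names are assumptions; the rolling stats use past bars only, so there is no lookahead):

```python
import numpy as np
import pandas as pd

def ofi_signal(ofi: pd.Series, window: int = 50,
               entry: float = 1.5, exit_: float = 0.3) -> pd.Series:
    """Mean-reversion signal on the z-score of rolling order-flow imbalance.
    Short above +entry sigma, long below -entry sigma; flatten once the
    z-score reverts inside +/- exit_ sigma."""
    mu = ofi.rolling(window).mean()
    sd = ofi.rolling(window).std()
    z = (ofi - mu) / sd
    pos = np.zeros(len(ofi))
    for t in range(1, len(ofi)):
        if np.isnan(z.iloc[t]):
            continue                       # warm-up period: stay flat
        if pos[t - 1] == 0:
            if z.iloc[t] > entry:
                pos[t] = -1                # fade extreme buy pressure
            elif z.iloc[t] < -entry:
                pos[t] = 1                 # fade extreme sell pressure
        else:
            pos[t] = 0 if abs(z.iloc[t]) < exit_ else pos[t - 1]
    return pd.Series(pos, index=ofi.index)
```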

Execution Model

Fill Price = Mid Price
           + (Spread / 2)                    ← always pay the spread
           + (Volatility × Price × 0.1)      ← vol-proportional impact
           + (Price × 0.0001 × √size_ratio)  ← square-root market impact

With 85% full-fill probability. Remaining 15% get 50-95% partial fills.
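
The formula and fill rule above translate almost line-for-line into code. A minimal sketch (function names are illustrative, and size_ratio is assumed to be order size relative to typical bar volume):

```python
import math
import random

def fill_price(mid, spread, volatility, size_ratio, side=1):
    """Adverse fill price for a buy (side=+1) or sell (side=-1):
    half-spread + vol-proportional impact + square-root market impact."""
    impact = (spread / 2
              + volatility * mid * 0.1
              + mid * 0.0001 * math.sqrt(size_ratio))
    return mid + side * impact

def fill_fraction(rng=random):
    """85% chance of a full fill; otherwise a uniform 50-95% partial fill."""
    return 1.0 if rng.random() < 0.85 else rng.uniform(0.5, 0.95)
```

Note that impact is always adverse: buys fill above mid, sells below, so the model can only cost you, never help.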

ML Pipeline

  • Training: XGBoost (200 trees, depth 5, 0.05 learning rate, regularized)
  • Validation: walk-forward with 4-5 expanding windows
  • Comparison: XGBoost vs. logistic regression baseline
  • Features: 20 features (12 raw order flow + 8 derived: z-scores, rolling stats, composites)
  • Leakage check: flags features with |corr| > 0.5 to the target
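
The leakage check boils down to a one-liner: any feature suspiciously correlated with the label probably encodes future information. A sketch (the function name and threshold default mirror the description above, but are not the pipeline's actual API):

```python
import pandas as pd

def leakage_check(features: pd.DataFrame, target: pd.Series, thresh: float = 0.5):
    """Return features whose absolute correlation with the target exceeds
    thresh, sorted worst-first; these are flagged as likely leaks."""
    corr = features.corrwith(target).abs()
    return corr[corr > thresh].sort_values(ascending=False)
```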

Portfolio System

  • Equal weight: simple 1/N allocation
  • Risk parity: inverse-volatility weighting
  • Correlation matrix: computed from equity curve returns
  • Diversification ratio: weighted avg vol / portfolio vol
  • Warnings: auto-flagged when |corr| > 0.7 between assets
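
Two of these pieces, inverse-volatility weighting and the diversification ratio, can be sketched in a few lines (function names are illustrative, operating on a (T, N) matrix of per-strategy returns):

```python
import numpy as np

def risk_parity_weights(returns: np.ndarray) -> np.ndarray:
    """Inverse-volatility weights: lower-vol strategies get more capital."""
    inv_vol = 1.0 / returns.std(axis=0, ddof=1)
    return inv_vol / inv_vol.sum()

def diversification_ratio(returns: np.ndarray, weights: np.ndarray) -> float:
    """Weighted average of individual vols divided by portfolio vol;
    values above 1 mean the combination actually diversifies."""
    vols = returns.std(axis=0, ddof=1)
    port_vol = (returns @ weights).std(ddof=1)
    return float(weights @ vols / port_vol)
```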

API Reference

| Method | Endpoint | Description |
|---|---|---|
| GET | /health | Health check |
| GET | /strategies | List strategies |
| GET | /symbols?source=synthetic | List symbols by source |
| GET | /data-sources | List data sources |
| POST | /backtest | Single-asset backtest |
| POST | /portfolio/backtest | Multi-asset portfolio backtest |
| POST | /ml/train | Train + evaluate + walk-forward |
| POST | /ml/insights | Feature importance + SHAP |
| WS | /ws/logs | Live log streaming |

Metrics (Computed, Not Mocked)

  • Performance: Total Return, Annualized Return, Sharpe, Sortino, Calmar
  • Risk: Max Drawdown, Drawdown Duration, VaR 95%, CVaR 95%, Volatility, Skewness, Kurtosis
  • Trade: Win Rate, Avg Win/Loss, Profit Factor, Hold Duration, Max Win/Loss Streaks
  • Cost: Total Fees, Total Slippage (tracked per-trade)
  • ML: OOS Accuracy, AUC-ROC, Precision, Recall, IC, ICIR, Turnover, Signal Decay
  • Portfolio: Diversification Ratio, Correlation Matrix, Per-Asset Breakdown
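
For illustration, here is how a few of the headline numbers are conventionally computed from daily returns (the 252-day annualization factor and zero risk-free rate are assumptions, not values read from the engine):

```python
import numpy as np

TRADING_DAYS = 252  # annualization factor (assumption)

def sharpe(daily: np.ndarray) -> float:
    """Annualized Sharpe ratio from daily returns, risk-free rate 0."""
    return float(daily.mean() / daily.std(ddof=1) * np.sqrt(TRADING_DAYS))

def sortino(daily: np.ndarray) -> float:
    """Like Sharpe, but penalizes downside deviation only."""
    downside = daily[daily < 0].std(ddof=1)
    return float(daily.mean() / downside * np.sqrt(TRADING_DAYS))

def max_drawdown(equity: np.ndarray) -> float:
    """Worst peak-to-trough decline of an equity curve, as a negative fraction."""
    peak = np.maximum.accumulate(equity)
    return float(((equity - peak) / peak).min())
```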


Project Structure

orderflow-backtester-v3/
├── backend/
│   ├── config.py                 ← centralized config
│   ├── main.py                   ← FastAPI app + WebSocket
│   ├── requirements.txt
│   ├── data/
│   │   ├── base.py               ← abstract DataSource interface
│   │   ├── generator.py          ← synthetic GARCH + order flow
│   │   └── csv_loader.py         ← CSV ingestion (Yahoo, custom)
│   ├── engine/
│   │   ├── backtest.py           ← event-driven engine
│   │   ├── execution.py          ← slippage, latency, partial fills
│   │   ├── metrics.py            ← 20+ performance metrics
│   │   ├── portfolio.py          ← multi-asset portfolio engine
│   │   ├── position.py           ← sizing: fixed, vol-target, Kelly
│   │   └── strategies.py         ← 5 strategies incl. ensemble
│   ├── ml/
│   │   ├── features.py           ← feature engineering
│   │   ├── pipeline.py           ← XGBoost + SHAP + comparison
│   │   └── validation.py         ← walk-forward + leakage detection
│   └── routes/
│       ├── backtest.py           ← /backtest + /portfolio/backtest
│       └── ml.py                 ← /ml/train + /ml/insights
└── frontend/
    ├── index.html
    ├── package.json
    ├── vite.config.js
    └── src/
        ├── main.jsx
        ├── App.jsx               ← tabs + keyboard shortcuts
        ├── api.js                ← API service (retry, timeout)
        ├── Navbar.jsx            ← status + clock + tabs
        ├── Dashboard.jsx         ← config + portfolio mode
        ├── ResultsView.jsx       ← metrics + trade log + correlation
        ├── MLInsights.jsx        ← walk-forward + leakage + SHAP
        └── hooks/
            └── useBackend.js     ← connection + clock hooks

Design Decisions

Why synthetic data? — Exchange tick data costs $10K+/year. The GARCH(1,1) generator produces realistic vol clustering and order flow correlations. CSV loader supports real data when available.

Why event-driven? — Vectorized backtests make it easy to accidentally compute a signal from future bars. Bar-by-bar processing with explicit state makes that class of bug impossible.

Why XGBoost over LSTM? — For tabular order flow features with 20 columns, gradient-boosted trees consistently outperform sequence models. SHAP provides the interpretability trading desks require. The model comparison proves this empirically on each run.

Why walk-forward over k-fold? — k-fold shuffles time series data, placing future bars in the training set. Walk-forward expanding windows guarantee the model never trains on future data. The overfit ratio per window quantifies model stability.

Why quarter-Kelly? — Full Kelly is theoretically optimal but practically catastrophic — it assumes perfect edge estimation. Quarter-Kelly provides geometric growth with ~75% lower variance of outcome.
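
A minimal quarter-Kelly sizer under the classic two-outcome Kelly formula f* = p − q/b (the function name and inputs are hypothetical, not the engine's actual interface):

```python
def kelly_fraction(win_rate, avg_win, avg_loss, scale=0.25):
    """Kelly bet fraction f* = p - q/b, scaled down (quarter-Kelly by
    default) to cushion edge-estimation error. Returns 0 when edge <= 0."""
    b = avg_win / abs(avg_loss)          # payoff ratio (win size / loss size)
    f = win_rate - (1 - win_rate) / b    # full-Kelly optimal fraction
    return max(f, 0.0) * scale
```

With a 60% win rate and symmetric payoffs, full Kelly bets 20% of equity per trade; quarter-Kelly trims that to 5%, trading a little growth for much smoother equity.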


What Makes This Different From Tutorial Projects

| Tutorial Project | This Project |
|---|---|
| if price > MA: buy | Z-score of rolling OFI with adaptive exits |
| returns.mean() / returns.std() | Properly annualized Sharpe from daily returns |
| random_state=42, train_test_split() | Walk-forward validation, no shuffling |
| Mock data in frontend | All data from backend API, error states everywhere |
| Single asset only | Portfolio with correlation + risk parity |
| No execution costs | Slippage + fees + latency + partial fills, tracked per-trade |

Technologies

Backend: Python 3.12, FastAPI, NumPy, pandas, XGBoost, SHAP, scikit-learn
Frontend: React 18, Vite, Canvas charts
Protocol: REST + WebSocket


Built as a quantitative research platform for prop trading interviews. Every metric is computed from actual PnL. No mock values. No silent fallbacks.
