davidszp/norgate-pst-utils

norgate-pst-utils

Import historical futures data from Norgate Data into pysystemtrade (PST).

What is pysystemtrade?

pysystemtrade is Rob Carver's open-source systematic trading framework for futures, based on his books Systematic Trading and Leveraged Trading. It handles backtesting, live trading via Interactive Brokers, position management, and more.

What This Solves

Getting historical futures data into pysystemtrade is a common challenge. The default CSV data shipped with PST is outdated, and Interactive Brokers only provides ~1 year of history.

Norgate Data provides high-quality historical futures data going back to the 1980s for many instruments. This toolkit bridges Norgate's data format to PST's expected format.

Results

Using this toolkit, we successfully imported:

| Metric | Count |
| --- | --- |
| Contract price files | 23,315 |
| Unique instruments | 105 |
| Date range | ~1980 to present |

All 105 instruments have complete multiple prices and adjusted prices, ready for backtesting.

Overview

The import process has two phases:

PHASE 1: Export from Norgate (requires Windows)
===============================================
  Windows + NDU + norgatedata Python package
                    |
                    v
         norgate_utils/export.py
                    |
                    v
      CSV files: INSTRUMENT_YYYYMM00.csv

              ( copy files )

PHASE 2: Import to PST (any platform)
=====================================
               CSV files
                    |
                    v
  pst_import/build_multiple_prices.py
  pst_import/build_adjusted_prices.py
                    |
                    v
    Parquet database ready for backtesting

Requirements

For Phase 1 (Export)

  • Windows (Norgate Data Updater only runs on Windows)
  • Norgate Data subscription (futures data)
  • Python 3.8+ with norgatedata and pandas

For Phase 2 (Import)

  • Any platform (Linux, macOS, Windows)
  • pysystemtrade installed and configured
  • The CSV files from Phase 1

Quick Start

Step 1: Export from Norgate (on Windows)

# Install norgatedata
pip install norgatedata pandas

# Make sure NDU (Norgate Data Updater) is running

# Export all mapped instruments
python norgate_utils/export.py --output-dir ./norgate_export

# Or export specific instruments
python norgate_utils/export.py --output-dir ./norgate_export --markets CL ES GC ZN

This creates CSV files like:

norgate_export/
├── CRUDE_W_20230100.csv
├── CRUDE_W_20230200.csv
├── SP500_20240300.csv
├── GOLD_20241200.csv
└── ... (23,000+ files)
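
Each file name encodes the PST instrument code and the contract month. Below is a minimal sketch of splitting a name back into those two parts, assuming the `INSTRUMENT_YYYYMM00.csv` pattern shown above. `parse_export_filename` is a hypothetical helper, not part of the toolkit. It splits on the last underscore because instrument codes such as CRUDE_W contain underscores themselves.

```python
from pathlib import Path
from typing import Tuple


def parse_export_filename(filename: str) -> Tuple[str, str]:
    """Split 'CRUDE_W_20230100.csv' into ('CRUDE_W', '20230100').

    Hypothetical helper: assumes the INSTRUMENT_YYYYMM00.csv naming pattern.
    """
    stem = Path(filename).stem  # drop the directory and the .csv extension
    # Split on the LAST underscore: instrument codes may contain underscores.
    instrument, contract_id = stem.rsplit("_", 1)
    if len(contract_id) != 8 or not contract_id.isdigit():
        raise ValueError(f"Unexpected contract date in {filename!r}")
    return instrument, contract_id


print(parse_export_filename("CRUDE_W_20230100.csv"))  # ('CRUDE_W', '20230100')
print(parse_export_filename("SP500_20240300.csv"))    # ('SP500', '20240300')
```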

Step 2: Copy files to your PST machine

Copy the norgate_export/ folder to your pysystemtrade data directory:

# Example: copy to PST's expected location
cp -r norgate_export/ /path/to/pysystemtrade/data/futures/norgate_export/

Step 3: Import contract prices to PST

cd /path/to/pysystemtrade

# Import all contract prices from CSVs to Parquet
python sysinit/futures/contract_prices_from_csv_to_db.py

Step 4: Build multiple prices

# From the norgate-pst-utils checkout: copy the import scripts into PST
cp pst_import/*.py /path/to/pysystemtrade/sysinit/futures/

# From the PST root: build multiple prices for all instruments
cd /path/to/pysystemtrade
python sysinit/futures/build_multiple_prices.py

Step 5: Build adjusted prices

python sysinit/futures/build_adjusted_prices.py

Step 6: Fix roll calendars (if needed)

Some instruments may need roll calendar fixes. Use the pre-generated ones first:

# From the norgate-pst-utils checkout: copy pre-generated roll calendars for instruments that fail
cp roll_calendars/*.csv /path/to/pysystemtrade/data/futures/roll_calendars_csv/

# From the PST root: rebuild the affected instruments
cd /path/to/pysystemtrade
python sysinit/futures/build_multiple_prices.py
python sysinit/futures/build_adjusted_prices.py

Step 7: Verify

# From the norgate-pst-utils checkout: copy the verification script
cp pst_import/verify_data_coverage.py /path/to/pysystemtrade/sysinit/futures/

# From the PST root: run verification
cd /path/to/pysystemtrade
python sysinit/futures/verify_data_coverage.py --summary

Expected output:

OK:    104 instruments (100% coverage)
GAP:     1 instruments (missing data)   # NIKKEI-SGX_mini - known issue
ERROR:   0 instruments (critical issues)

Symbol Mapping

We mapped 115 Norgate symbols to 105 PST instruments.

The full mapping is in norgate_utils/symbols.csv.

Key Mappings

| Norgate | PST | Description |
| --- | --- | --- |
| CL | CRUDE_W | WTI Crude Oil |
| ES | SP500 | E-mini S&P 500 |
| NQ | NASDAQ | E-mini NASDAQ 100 |
| GC | GOLD | Gold |
| ZN | US10 | 10-Year Treasury |
| 6E | EUR | Euro FX |
| ZC | CORN | Corn |
| VX | VIX | VIX Volatility |

Some Norgate symbols map to the same PST instrument (e.g., ES and MES both provide S&P 500 data).
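
To see which PST instruments are fed by more than one Norgate symbol, the mapping can be inverted. The sketch below uses a small hand-written subset (a few rows from the table above plus the MES example). `NORGATE_TO_PST_SAMPLE` and `norgate_symbols_by_instrument` are illustrative names. The real mapping lives in norgate_utils/symbols.csv, whose column layout is not shown here.

```python
from collections import defaultdict

# Illustrative subset only; the full mapping is in norgate_utils/symbols.csv.
NORGATE_TO_PST_SAMPLE = {
    "CL": "CRUDE_W",
    "ES": "SP500",
    "MES": "SP500",
    "GC": "GOLD",
    "ZN": "US10",
}


def norgate_symbols_by_instrument(mapping):
    """Invert a Norgate->PST mapping into PST code -> sorted Norgate symbols."""
    grouped = defaultdict(list)
    for norgate_symbol, pst_code in mapping.items():
        grouped[pst_code].append(norgate_symbol)
    return {code: sorted(symbols) for code, symbols in grouped.items()}


grouped = norgate_symbols_by_instrument(NORGATE_TO_PST_SAMPLE)
# Keep only the PST instruments that receive data from several Norgate symbols
shared = {code: syms for code, syms in grouped.items() if len(syms) > 1}
print(shared)  # {'SP500': ['ES', 'MES']}
```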

Roll Calendars

How It Works

PST uses roll calendars to stitch individual contract prices into continuous series. These define when to roll from one contract to the next.

For most instruments, we used PST's existing roll calendars from data/futures/roll_calendars_csv/. However, many of these calendars are truncated - they don't go back as far as Norgate's contract data.
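
To illustrate what a roll calendar is used for, here is a toy sketch that stitches two contracts at a single roll date and back-adjusts the older one. It is not PST's implementation, which handles many contracts, carry contracts, and missing prices. `stitch_back_adjusted` and the sample prices are made up for illustration.

```python
import pandas as pd


def stitch_back_adjusted(near: pd.Series, far: pd.Series, roll_date) -> pd.Series:
    """Hold `near` up to and including roll_date, then `far` afterwards.

    The `near` prices are shifted by the price gap on the roll date so the
    stitched series has no artificial jump at the roll.
    """
    roll_date = pd.Timestamp(roll_date)
    gap = far.loc[roll_date] - near.loc[roll_date]
    before = near.loc[:roll_date] + gap       # label slice includes roll_date
    after = far.loc[far.index > roll_date]    # strictly after the roll
    return pd.concat([before, after])


dates = pd.date_range("2024-01-01", periods=5, freq="D")
near = pd.Series([100.0, 101.0, 102.0, 103.0, 104.0], index=dates)
far = pd.Series([105.0, 106.0, 107.0, 108.0, 109.0], index=dates)

stitched = stitch_back_adjusted(near, far, "2024-01-03")
print(stitched.tolist())  # [105.0, 106.0, 107.0, 108.0, 109.0]
```

If the roll calendar names a contract with no price on the roll date, the gap cannot be computed. That is the kind of failure described in the next section.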

Two Types of Roll Calendar Issues

1. Broken Roll Calendars (Build Failures)

Some instruments fail to build because the roll calendar references contracts that don't exist in Norgate data:

| Instrument | Issue |
| --- | --- |
| FED | Roll calendar referenced contracts before Norgate had data |
| LEANHOG | Gap in 2020 roll calendar vs Norgate data |
| VIX | Similar gap around May 2020 |

2. Truncated Roll Calendars (Lost History)

Many instruments have contract data going back further than their roll calendars. This means you're not using all available historical data:

| Instrument | Contracts From | Roll Cal From | Lost History |
| --- | --- | --- | --- |
| SPI200 | 1983 | 2022 | 39 years |
| COFFEE | 1980 | 2007 | 27 years |
| CAC | 1988 | 2009 | 21 years |
| EUROSTX | 1999 | 2014 | 15 years |
| BUND | 1999 | 2007 | 8 years |
| ... | ... | ... | ... |
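
The "Lost History" column is the roll calendar start year minus the first contract year. The sketch below reproduces that arithmetic from the rows above and filters by a minimum gap. The threshold comparison is an assumption about how such a check might work, not the logic of fix_roll_calendars.py.

```python
# (instrument, first contract year, first roll calendar year) from the table above
COVERAGE = [
    ("SPI200", 1983, 2022),
    ("COFFEE", 1980, 2007),
    ("CAC", 1988, 2009),
    ("EUROSTX", 1999, 2014),
    ("BUND", 1999, 2007),
]


def truncated_calendars(coverage, min_gap_years=1.0):
    """Return {instrument: lost_years} where the calendar starts late.

    Hypothetical helper: flags an instrument when the gap between the first
    contract and the first roll calendar entry is at least min_gap_years.
    """
    lost = {}
    for instrument, contracts_from, roll_cal_from in coverage:
        gap = roll_cal_from - contracts_from
        if gap >= min_gap_years:
            lost[instrument] = gap
    return lost


print(truncated_calendars(COVERAGE))
# {'SPI200': 39, 'COFFEE': 27, 'CAC': 21, 'EUROSTX': 15, 'BUND': 8}
print(truncated_calendars(COVERAGE, min_gap_years=20))
# {'SPI200': 39, 'COFFEE': 27, 'CAC': 21}
```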

Detecting and Fixing Roll Calendar Issues

The fix_roll_calendars.py script can detect and fix both types of issues:

# Detect truncated roll calendars (data exists before roll calendar)
python pst_import/fix_roll_calendars.py --detect

# Detect broken roll calendars (missing contracts)
python pst_import/fix_roll_calendars.py --detect-broken

# Fix all truncated roll calendars
python pst_import/fix_roll_calendars.py --fix

# Fix specific instruments
python pst_import/fix_roll_calendars.py --fix --instruments EUROSTX BUND CAC

# Adjust minimum gap threshold (default: 1 year)
python pst_import/fix_roll_calendars.py --detect --min-gap-years 0.5

# Quiet mode for scripting (returns exit code 0 on success, 1 on failure)
python pst_import/fix_roll_calendars.py --fix --quiet

The script:

  1. Regenerates roll calendars from actual contract price data
  2. Rebuilds multiple prices using the new roll calendar
  3. Rebuilds adjusted prices

Pre-Generated Roll Calendars

For convenience, the roll_calendars/ directory includes pre-generated roll calendars: custom calendars for the instruments that failed with PST defaults (FED, LEANHOG, VIX) and extended calendars for CRUDE_W and HEATOIL:

# Copy the fixed roll calendars to PST
cp roll_calendars/*.csv /path/to/pysystemtrade/data/futures/roll_calendars_csv/

Platform-Specific Setup

Norgate Data Updater (NDU) only runs on Windows. Here's how to handle this on different platforms:

Windows Users

You're all set! Install NDU and Python, then run the export script directly.

See: docs/SETUP_WINDOWS.md

Linux Users

You have two options:

  1. WinBoat (recommended) - Containerized Windows VM using Podman/Docker
  2. Dual boot or separate Windows machine

We used WinBoat successfully. It runs Windows in a container with file sharing via \\host.lan\Home.

See: docs/SETUP_LINUX.md

macOS Users

Use a Windows virtual machine:

  • Parallels Desktop
  • VMware Fusion
  • VirtualBox (free)

See: docs/SETUP_MACOS.md

Troubleshooting

"No PST mapping for Norgate market: XXX"

The export script only exports instruments that have a mapping in NORGATE_TO_PST. To add a new mapping:

  1. Find the PST instrument code in data/futures/csvconfig/instrumentconfig.csv
  2. Add the mapping to norgate_utils/export.py
  3. Re-run the export

"Missing contracts in middle of roll calendar"

This means PST's roll calendar expects contract data that doesn't exist in your Norgate export. Solutions:

  1. Use pst_import/fix_roll_calendars.py to regenerate the roll calendar from actual prices
  2. Or check if Norgate has the missing contracts

Build fails for specific instruments

Some instruments may fail if:

  • Norgate doesn't have enough contract history
  • The roll parameters don't match Norgate's available contracts

Run fix_roll_calendars.py to regenerate roll calendars from actual price data.

Verifying Your Import

After importing, verify data coverage:

# Quick summary
python pst_import/verify_data_coverage.py --summary

# Show instruments with gaps
python pst_import/verify_data_coverage.py --gaps-only

# Check specific instrument
python pst_import/verify_data_coverage.py --instrument EUR

A successful import shows 100+ instruments with "OK" status.

File Reference

norgate-pst-utils/
├── README.md                           # This file
├── norgate_utils/
│   ├── export.py                       # Main export script (run on Windows)
│   └── symbols.csv                     # Complete symbol mapping
├── pst_import/
│   ├── build_multiple_prices.py        # Build multiple prices from contracts
│   ├── build_adjusted_prices.py        # Build adjusted (back-adjusted) prices
│   ├── fix_roll_calendars.py           # Regenerate calendars (use with caution)
│   ├── extend_roll_calendars.py        # Safely extend calendars forward (preferred)
│   └── verify_data_coverage.py         # Verify import success
├── roll_calendars/
│   ├── CRUDE_W.csv                     # Extended through 2026
│   ├── FED.csv                         # Custom roll calendar for Fed Funds
│   ├── HEATOIL.csv                     # Extended through 2026
│   ├── LEANHOG.csv                     # Custom roll calendar for Lean Hogs
│   └── VIX.csv                         # Custom roll calendar for VIX
└── docs/
    ├── SETUP_WINDOWS.md
    ├── SETUP_LINUX.md
    └── SETUP_MACOS.md

IB Symbol Mapping Notes

When transitioning from Norgate historical data to Interactive Brokers for live updates, some instruments have non-obvious IB symbol mappings. PST's ib_config_futures.csv defines these mappings. Here are known gotchas:

SGX (MSCI Singapore Free Index)

  • Norgate symbol: SSG
  • PST instrument: SGX
  • IB symbol: SSG (not STI!)
  • IB exchange: SGX
  • IB multiplier: 100
  • IB tradingClass: SGP

Despite the PST instrument name "SGX", this is the MSCI Singapore Free Index (SIMSCI), NOT the Straits Times Index (STI). The price levels confirm this:

| Index | Price Level (Jan 2026) |
| --- | --- |
| MSCI Singapore Free | ~457 |
| Straits Times Index | ~4,905 |
| Norgate SSG data | ~457 |

If you incorrectly map to IB symbol STI (the Straits Times Index futures), you'll get prices ~10x too high and the market has zero volume. The correct IB symbol is SSG (which IB internally resolves to contract symbol SSG with tradingClass SGP).
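
A quick price-level comparison catches this kind of mismatch before any data is merged. The following is a minimal sketch, assuming you have a recent Norgate close and a recent IB quote to hand. `price_levels_match` and its `max_ratio` default are hypothetical choices, not part of PST or this toolkit.

```python
def price_levels_match(norgate_price, ib_price, max_ratio=2.0):
    """Return True if the two prices are within a factor of max_ratio."""
    if norgate_price <= 0 or ib_price <= 0:
        raise ValueError("prices must be positive")
    ratio = max(norgate_price, ib_price) / min(norgate_price, ib_price)
    return ratio <= max_ratio


# Price levels taken from the table above
print(price_levels_match(457, 457))   # True: MSCI Singapore Free level
print(price_levels_match(457, 4905))  # False: Straits Times Index level
```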

PST config (ib_config_futures.csv):

SGX,SSG,SGX,SGD,100,1,FALSE

PST config (instrumentconfig.csv):

  • Pointsize: 100 (multiplier matches IB contract)
  • Currency: SGD

Other Non-Obvious Mappings

| PST Instrument | Norgate | IB Symbol | Notes |
| --- | --- | --- | --- |
| JPY | 6J | JPY | Norgate quotes per 100 yen (~0.634), IB quotes per 1 yen (~0.00634). See Price Unit Mismatches below. |
| SILVER | SI | SI | Norgate quotes in cents/oz, IB in dollars/oz. Handled via priceMagnifier=100 in PST. Pointsize=10 (1000-oz SIL contract in cents). |
| COPPER | HG | HG | Norgate quotes in cents/lb, IB in dollars/lb. Handled via priceMagnifier=100 in PST. Pointsize=250 (25000-lb contract in cents). |
| DAX | FDAX | DAX | PST targets micro DAX (tradingClass FDXM, mult=1), not full DAX (mult=25) |
| MXP | 6M | 6M | IB symbol matches Norgate, but ibexecutor uses section name MXN |

Price Unit Mismatches: Norgate vs IB

This is the most dangerous data issue when combining Norgate with IB. If Norgate and IB quote the same instrument in different units, the price series will have a 10x or 100x jump at the switchover point. This silently corrupts your backtest P&L.

Affected Instruments

| PST Instrument | Norgate Unit | IB Unit | Factor | Fix |
| --- | --- | --- | --- | --- |
| JPY | USD per 100 yen (~0.634) | USD per 1 yen (~0.00634) | 100x | Divide Norgate by 100 |
| SILVER | Cents per oz (~3100) | Dollars per oz (~31.00) | 100x | priceMagnifier=100 in PST |
| COPPER | Cents per lb (~450) | Dollars per lb (~4.50) | 100x | priceMagnifier=100 in PST |
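
Whichever unit is stored, price times pointsize should give the same contract value. The short check below uses the approximate prices from the table above and the pointsize and contract-size figures from the Other Non-Obvious Mappings table. `notional` is an illustrative helper name.

```python
import math


def notional(price, pointsize):
    """Value of one contract: price multiplied by pointsize."""
    return price * pointsize


# (instrument, cents price, pointsize for cents, dollar price, contract size)
CASES = [
    ("SILVER", 3100.0, 10, 31.00, 1000),   # 1000 oz contract
    ("COPPER", 450.0, 250, 4.50, 25000),   # 25000 lb contract
]

for name, cents_price, cents_pointsize, dollar_price, contract_size in CASES:
    in_cents = notional(cents_price, cents_pointsize)
    in_dollars = notional(dollar_price, contract_size)
    # Both conventions must describe the same contract value
    print(name, in_cents, math.isclose(in_cents, in_dollars))
```

If the stored prices are in cents but Pointsize is left at the dollar-convention contract size, every P&L figure for that instrument is overstated by a factor of 100.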

How the Export Script Handles This

The export script (norgate_utils/export.py) has a NORGATE_PRICE_DIVISOR dict that automatically divides prices during export. JPY is divided by 100 so the exported CSVs match IB's convention.

For SILVER and COPPER, PST's priceMagnifier=100 in ib_config_futures.csv multiplies IB prices by 100 before storing, so they match Norgate's cents convention. No export-time correction is needed — but Pointsize must reflect the cents convention (SILVER: 10, COPPER: 250).
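
The divide-on-export step could look roughly like the sketch below. Only the name `NORGATE_PRICE_DIVISOR` and the JPY divisor of 100 come from the text above. The helper `apply_price_divisor`, the sample row, and the `VOLUME` column are assumptions for illustration and may differ from what export.py actually does.

```python
import pandas as pd

# Name taken from the text above; contents limited to the JPY case it describes.
NORGATE_PRICE_DIVISOR = {"JPY": 100.0}
PRICE_COLUMNS = ["OPEN", "HIGH", "LOW", "FINAL"]


def apply_price_divisor(instrument: str, prices: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with price columns divided by the instrument's divisor.

    Instruments without an entry use a divisor of 1, so they pass through
    unchanged. Non-price columns such as volume are never rescaled.
    """
    divisor = NORGATE_PRICE_DIVISOR.get(instrument, 1.0)
    adjusted = prices.copy()
    cols = [c for c in PRICE_COLUMNS if c in adjusted.columns]
    adjusted[cols] = adjusted[cols] / divisor
    return adjusted


raw = pd.DataFrame(
    {"OPEN": [0.630], "HIGH": [0.640], "LOW": [0.625], "FINAL": [0.634], "VOLUME": [1000]}
)
print(apply_price_divisor("JPY", raw))   # prices scaled to USD per 1 yen
print(apply_price_divisor("GOLD", raw))  # unchanged: no divisor configured
```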

How to Detect This

After combining Norgate and IB data, check for large jumps around the switchover date:

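import pandas as pd

instrument = "JPY"  # PST instrument code to check

# Load the adjusted price series written by PST
adj = pd.read_csv(f"data/futures/adjusted_prices_csv/{instrument}.csv",
                  parse_dates=["DATETIME"], index_col="DATETIME")

# Reduce to one price per business day, then take day-over-day ratios
daily = adj.iloc[:, 0].resample("B").last().dropna()
ratio = (daily / daily.shift(1)).dropna()

# A unit mismatch shows up as a ~10x or ~100x step; normal moves stay near 1
big_jumps = ratio[(ratio > 5) | (ratio < 0.2)]
if len(big_jumps) > 0:
    print(f"WARNING: {instrument} has {len(big_jumps)} suspicious jumps")
    print(big_jumps)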

If You Already Imported Without the Fix

If you imported JPY data before this fix was added, you need to correct the historical contract prices:

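import glob

import pandas as pd

files = sorted(glob.glob("path/to/parquet/futures_contract_prices/JPY#*.parquet"))
for f in files:
    df = pd.read_parquet(f)
    if df.empty:
        continue
    # Norgate rows (USD per 100 yen) sit well above 0.1; IB rows (USD per yen)
    # sit well below it. A 0.01 cut-off would misclassify per-yen prices from
    # periods when USD/JPY traded below 100 and re-divide them on a second run.
    is_norgate = df["FINAL"] > 0.1
    if is_norgate.any():
        # Rescale only the price columns of the Norgate rows
        for col in ["OPEN", "HIGH", "LOW", "FINAL"]:
            df.loc[is_norgate, col] = df.loc[is_norgate, col] / 100.0
        df.to_parquet(f)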

Then rebuild multiple prices and adjusted prices for JPY.

Known Issues

NIKKEI-SGX_mini Data Gap

Norgate data for NIKKEI-SGX_mini has a gap after 2002. The contract data switches from quarterly to monthly contracts around 2003, and some months are missing, preventing the roll calendar from being built. This is a Norgate data limitation, not a toolkit issue.

Workaround: Either exclude this instrument from your backtest, or source the missing data elsewhere.
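
To see which contract months are absent from an export, the available contract IDs can be compared against an expected cycle. The sketch below does this for `YYYYMM` strings. The sample IDs and the quarterly cycle are made up for illustration and are not the actual Norgate gaps. `missing_contract_months` is a hypothetical helper.

```python
def missing_contract_months(available, first, last, cycle_months):
    """List expected YYYYMM contracts in [first, last] that are not available.

    available: iterable of 'YYYYMM' strings
    cycle_months: months the instrument is expected to list, e.g. (3, 6, 9, 12)
    """
    have = set(available)
    first_year, last_year = int(first[:4]), int(last[:4])
    missing = []
    for year in range(first_year, last_year + 1):
        for month in cycle_months:
            contract = f"{year}{month:02d}"
            # Six-digit YYYYMM strings sort chronologically
            if first <= contract <= last and contract not in have:
                missing.append(contract)
    return missing


available = ["200303", "200306", "200312", "200403"]
print(missing_contract_months(available, "200303", "200403", (3, 6, 9, 12)))
# ['200309']
```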

Sparse Early Data

Some instruments (HEATOIL, CRUDE_W, older commodities) have sparse contract data in the early years (1970s-1980s). The fix_roll_calendars.py --fix command may produce truncated roll calendars for these.

Solution: Use the pre-generated calendars in roll_calendars/, or use extend_roll_calendars.py which only extends forward without touching historical data.

Roll Calendar Safety

The fix_roll_calendars.py script now includes a safety check: it will abort if the regenerated calendar would be smaller than the existing one (indicating data issues). Use --force to override, but this is not recommended.
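
The idea behind the safety check can be sketched as follows. This is an illustration only: it assumes "smaller" means fewer rows, and the names `RollCalendarShrinkError` and `check_calendar_not_smaller` are hypothetical, not taken from fix_roll_calendars.py.

```python
class RollCalendarShrinkError(Exception):
    """Raised when a regenerated calendar has fewer rows than the existing one."""


def check_calendar_not_smaller(existing_rows: int, regenerated_rows: int,
                               force: bool = False) -> bool:
    """Return True if overwriting may proceed; raise unless `force` is set."""
    if regenerated_rows < existing_rows and not force:
        raise RollCalendarShrinkError(
            f"regenerated calendar has {regenerated_rows} rows, "
            f"existing has {existing_rows}; refusing to overwrite"
        )
    return True


print(check_calendar_not_smaller(existing_rows=100, regenerated_rows=120))  # True
try:
    check_calendar_not_smaller(existing_rows=100, regenerated_rows=80)
except RollCalendarShrinkError as exc:
    print(f"aborted: {exc}")
```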

Preferred approach: Use extend_roll_calendars.py --extend which safely extends calendars forward without risking historical data.

Data Quality Notes

  • Norgate data is high quality - Clean, adjusted for splits, with good coverage
  • VIX history starts 2004 - VIX futures were introduced in 2004
  • Some instruments have shorter history - Crypto, micro contracts, etc.
  • Data is daily OHLCV - No intraday data from Norgate
  • 104/105 instruments work perfectly - Only NIKKEI-SGX_mini has known issues

Contributing

Found a mapping issue? Missing instrument? Please open an issue or PR.

License

MIT License - See LICENSE file.

Acknowledgments

  • Rob Carver for pysystemtrade
  • Norgate Data for quality historical futures data
  • The PST community for guidance on data formats
