LOB Book, Trade Enrichment & Snapshots
The order book is the State Engine's most performance-critical data structure. Every trade is enriched against the book state at its exact timestamp, producing L1 analytics and full-depth L2 snapshots.
The Sparse LOB Book
A fixed-size, direct-indexed structure optimized for Numba's JIT compiler:
| Structure | Size | Purpose |
|---|---|---|
| Price map | 20M int32 slots (~80 MB) | Maps tick price → level index, O(1) lookup without hashing |
| Level arrays | 3 × 200,000 entries | Price, size, side — active levels stored contiguously |
| Free list | Stack of freed indices | O(1) slot reuse after deletions |
Key Operations
- Lookup: Direct array access at `price_map[tick_price]`, which returns the level index or -1
- Insert (ADD): Pop free index, write level data, update price map
- Delete: Write -1 into price map, push index onto free list
- Modify: Update size in-place (price unchanged)
- Best bid/ask: Eagerly updated on every ADD/MODIFY that beats the current best. Cold-path re-scan (linear scan of active levels) only triggers when the previously-best level is deleted.
The book is initialized fresh per micro-batch — state carries across all events within one batch but is not persisted between batches.
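The operations above can be sketched in plain Python. This is a minimal illustration, not the production kernel: the class name `SparseBook`, its method names, and the tiny array sizes are assumptions made for the example, and the standard-library `array` module stands in for the Numba-compiled arrays. Here MODIFY only changes size, so the eager best-price update happens on ADD.

```python
from array import array

EMPTY = -1
BID, ASK = 0, 1


class SparseBook:
    """Toy direct-indexed book; sizes are shrunk for illustration (hypothetical API)."""

    def __init__(self, n_ticks=1_000, max_levels=64):
        self.price_map = array("i", [EMPTY]) * n_ticks  # tick price -> level index
        self.price = array("i", [0]) * max_levels
        self.size = array("q", [0]) * max_levels
        self.side = array("b", [0]) * max_levels
        # Free list: a stack of unused level indices (lowest index on top).
        self.free = list(range(max_levels - 1, -1, -1))
        self.best = [EMPTY, EMPTY]  # best tick price per side

    def lookup(self, tick):
        return self.price_map[tick]

    def _beats(self, side, tick):
        cur = self.best[side]
        if cur == EMPTY:
            return True
        return tick > cur if side == BID else tick < cur

    def add(self, tick, size, side):
        idx = self.free.pop()  # O(1) slot reuse
        self.price[idx], self.size[idx], self.side[idx] = tick, size, side
        self.price_map[tick] = idx
        if self._beats(side, tick):  # hot path: eager best update
            self.best[side] = tick

    def modify(self, tick, size):
        self.size[self.price_map[tick]] = size  # price unchanged

    def delete(self, tick):
        idx = self.price_map[tick]
        side = self.side[idx]
        self.price_map[tick] = EMPTY
        self.free.append(idx)
        if self.best[side] == tick:  # cold path: the best level was removed
            self._rescan(side)

    def _rescan(self, side):
        self.best[side] = EMPTY
        for idx in range(len(self.price)):
            tick = self.price[idx]
            # A slot is active only if the price map still points back at it.
            active = self.price_map[tick] == idx
            if active and self.side[idx] == side and self._beats(side, tick):
                self.best[side] = tick
```

Creating a new `SparseBook()` at the start of each micro-batch mirrors the "initialized fresh per micro-batch" behavior described above.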
Trade Enrichment
For each trade, the kernel computes L1-derived analytics from the book state at trade time:
| Field | Formula | Purpose |
|---|---|---|
| `mid_price` | (best_bid + best_ask) / 2 | Fair value estimate |
| `spread` | best_ask - best_bid | Liquidity cost |
| `effective_spread` | 2 × \|trade_price - mid_price\| | Actual execution cost |
| `book_imbalance` | (bid_sz - ask_sz) / (bid_sz + ask_sz) | Pressure asymmetry |
| `micro_price` | (bid×ask_sz + ask×bid_sz) / (bid_sz + ask_sz) | Volume-weighted fair value |
| `p_buy` | 0.0–1.0 | Probability trade was buyer-initiated |
| `signed_volume` | size × (2×p_buy - 1) | Continuous signed volume |
| `trade_direction` | +1 / -1 | Discrete buy/sell classification |
| `ret` | price_t - price_{t-1} | Trade-to-trade return |
| `log_return` | log(price_t / price_{t-1}) | Log return |
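As a rough sketch, the formulas in the table translate directly into a per-trade function. The name `enrich_trade` and its argument list are hypothetical, `p_buy` is taken as an input because its derivation is not described here, and the zero-depth guards are an assumption rather than documented behavior.

```python
import math


def enrich_trade(price, size, prev_price, best_bid, best_ask, bid_sz, ask_sz, p_buy):
    """Compute L1-derived fields for one trade (illustrative, not the kernel)."""
    mid = (best_bid + best_ask) / 2
    depth = bid_sz + ask_sz
    return {
        "mid_price": mid,
        "spread": best_ask - best_bid,
        "effective_spread": 2 * abs(price - mid),
        # Assumption: fall back to neutral values when both sides are empty.
        "book_imbalance": (bid_sz - ask_sz) / depth if depth else 0.0,
        "micro_price": (best_bid * ask_sz + best_ask * bid_sz) / depth if depth else mid,
        "signed_volume": size * (2 * p_buy - 1),
        "ret": price - prev_price,
        "log_return": math.log(price / prev_price),
    }


row = enrich_trade(price=100.75, size=10, prev_price=100.0,
                   best_bid=100.0, best_ask=101.0,
                   bid_sz=300, ask_sz=100, p_buy=0.75)
```

With a heavier bid side (300 vs. 100), the micro price in this example lands above the mid price, closer to the ask.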
Trade direction is inferred via a priority chain of three signing methods (configurable via sign_methods): DSIDE (exchange-reported flag) → QUOTE_INFER (trade vs. mid-price) → LEE_READY (quote + tick test). The sign_confidence column records which method produced the final classification.
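The priority chain can be pictured as "try each method in order and keep the first one that produces a sign". The sketch below is an assumption-laden illustration: the function name `sign_trade`, the `dside` argument, and the `"UNSIGNED"` fallback label are hypothetical, and the tick test is simplified to compare only against the immediately preceding trade price.

```python
def sign_trade(price, prev_price, mid, dside=None,
               sign_methods=("DSIDE", "QUOTE_INFER", "LEE_READY")):
    """Return (direction, method) from the first method that yields +1 or -1."""

    def quote_test():
        # +1 above the mid, -1 below, 0 when the trade prints at the mid.
        return (price > mid) - (price < mid)

    def tick_test():
        # +1 on an uptick, -1 on a downtick, 0 when the price is unchanged.
        return (price > prev_price) - (price < prev_price)

    methods = {
        "DSIDE": lambda: dside or 0,                      # exchange-reported flag, if any
        "QUOTE_INFER": quote_test,
        "LEE_READY": lambda: quote_test() or tick_test(),  # quote test, then tick test
    }
    for name in sign_methods:
        direction = methods[name]()
        if direction != 0:
            return direction, name
    # Assumption: how an unresolved trade is reported is not specified in the text.
    return 0, "UNSIGNED"
```

The second element of the returned tuple plays the role of the `sign_confidence` column: it names the method that decided the classification.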
The full enriched trade output has 28 columns and maps to the cdm_trade_enriched CDM table.
LOB Snapshots
Snapshots capture the full depth of the order book at configurable intervals and map to the cdm_lob_l2 CDM table:
snapshots:
  period_seconds: 60.0
  depth_levels: 10
  on_every_trade: false
  interval: 100
| Parameter | Description |
|---|---|
| `snapshot_on_every_trade` | Emit snapshot after every trade (high frequency, high storage) |
| `snapshot_period_seconds` | Time-based interval (0 = disabled) |
| `snapshot_interval` | Trade-count-based interval (0 = disabled) |
| `depth` | Number of price levels to capture per side (default: 10) |
Snapshots fire on four independent triggers:
- Every trade (`snapshot_on_every_trade: true`)
- Time interval (`snapshot_period_seconds` elapsed)
- Trade count (`snapshot_interval` trades)
- CUSUM threshold (`cusum_snapshot_threshold > 0`; emits when the CUSUM accumulator crosses the threshold)
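One way to picture the four triggers is a small stateful checker called once per trade. This is a sketch under stated assumptions: the class `SnapshotTrigger` and its method `should_emit` are hypothetical, each trigger keeps its own state (one reading of "independent"), the clock starts at timestamp 0, and the CUSUM accumulator is passed in because its computation is not described here.

```python
from dataclasses import dataclass, field


@dataclass
class SnapshotTrigger:
    snapshot_on_every_trade: bool = False
    snapshot_period_seconds: float = 0.0   # 0 disables the time trigger
    snapshot_interval: int = 0             # 0 disables the trade-count trigger
    cusum_snapshot_threshold: float = 0.0  # 0 disables the CUSUM trigger
    last_ts: float = field(default=0.0, init=False)
    trades: int = field(default=0, init=False)

    def should_emit(self, ts, cusum=0.0):
        """Return True if any enabled trigger fires for the trade at time ts."""
        fire = self.snapshot_on_every_trade
        period = self.snapshot_period_seconds
        if period > 0 and ts - self.last_ts >= period:
            self.last_ts = ts  # restart only the time trigger's clock
            fire = True
        self.trades += 1
        if self.snapshot_interval > 0 and self.trades >= self.snapshot_interval:
            self.trades = 0    # restart only the trade counter
            fire = True
        threshold = self.cusum_snapshot_threshold
        if threshold > 0 and abs(cusum) >= threshold:
            fire = True
        return fire
```

For example, with `snapshot_period_seconds=60.0` and `snapshot_interval=3`, the third trade fires the count trigger, and a later trade at `ts=61.0` fires the time trigger regardless of the trade count.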
Each snapshot row contains:
- L1 data: best bid/ask prices, sizes, spread, mid, weighted_mid
- L2 arrays: `bids` and `asks` as arrays of `{level, price, size, order_count}` structs
- Depth metrics: `total_bid_depth`, `total_ask_depth`, `depth_imbalance`, `vwap_bid`, `vwap_ask`
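To make the row layout concrete, the sketch below assembles the L2 arrays and depth metrics from per-side level lists. The function name `build_snapshot` is hypothetical, and several details are assumptions: levels are numbered from 1, `depth_imbalance` follows the same form as `book_imbalance`, and each side's VWAP is the size-weighted average price over the captured levels.

```python
def build_snapshot(bids, asks, depth=10):
    """bids/asks: lists of (price, size, order_count) tuples, best level first."""

    def side_view(levels):
        top = levels[:depth]  # keep at most `depth` levels per side
        total = sum(size for _, size, _ in top)
        vwap = sum(p * s for p, s, _ in top) / total if total else None
        rows = [
            {"level": i + 1, "price": p, "size": s, "order_count": n}
            for i, (p, s, n) in enumerate(top)
        ]
        return rows, total, vwap

    bid_rows, bid_depth, vwap_bid = side_view(bids)
    ask_rows, ask_depth, vwap_ask = side_view(asks)
    total = bid_depth + ask_depth
    return {
        "bids": bid_rows,
        "asks": ask_rows,
        "total_bid_depth": bid_depth,
        "total_ask_depth": ask_depth,
        "depth_imbalance": (bid_depth - ask_depth) / total if total else 0.0,
        "vwap_bid": vwap_bid,
        "vwap_ask": vwap_ask,
    }


snap = build_snapshot(
    bids=[(100.0, 30, 2), (99.0, 10, 1)],
    asks=[(101.0, 20, 1)],
)
```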
Integration Points
- StateEngine.process(): Calls `fused_kernel`, which handles all LOB book operations, trade enrichment, and snapshot emission in one pass
- StateEngineReader: Queries raw trades + LOB from source engine via engine-specific SQL generators (DuckDB, Trino, BigQuery, Snowflake, Databricks)
- StateOutputWriter: Routes `trades_enriched` → `cdm_trade_enriched` and `snapshots` → `cdm_lob_l2` CDM tables
- StateEngineConfig: Controls all book, enrichment, and snapshot parameters with three-tier resolution (fallbacks → metadata → overrides)
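The three-tier resolution can be illustrated with a layered lookup. This sketch assumes that later tiers take precedence (overrides over metadata over fallbacks), which is one reading of the arrow order above; the function name `resolve_config` and the sample keys are hypothetical and do not reflect the actual `StateEngineConfig` API.

```python
from collections import ChainMap


def resolve_config(fallbacks, metadata, overrides):
    """Merge three tiers; the first mapping in the ChainMap wins on key conflicts."""
    return dict(ChainMap(overrides, metadata, fallbacks))


resolved = resolve_config(
    fallbacks={"depth": 10, "snapshot_interval": 0},
    metadata={"depth": 20},
    overrides={"snapshot_interval": 100},
)
```

In this example `depth` comes from the metadata tier and `snapshot_interval` from the overrides tier, while any key set only in the fallbacks would pass through unchanged.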