Skip to main content

Custom SQL Backends

Add support for new database engines in the State Engine.


Custom SQL Backends (State Engine)

The State Engine generates SQL to read source data. DuckDB and BigQuery are built-in. To add another database:

Steps

  1. Subclass StateEngineSQLGenerator (abstract base in state_engine/sql/base.py).
  2. Implement three query builder methods.
  3. Register: register_generator("my_backend", MyBackendGenerator).

Required Methods

Each method receives (table, symbol, venue, start_time, end_time) and returns a SQL string:

MethodQuery must return
build_trades_sqlevent_type (2), time (epoch float), price (double), size (double), side (int), update_type (int), is_buyer_maker (int), trade_id (int64)
build_lob_sqlevent_type (1), time (epoch float), price (integer ticks), size, side, update_type (1/2/3), trade_id (null)
build_lob_snapshot_sqlSame as LOB, but rows exploded from JSON array snapshots

Implementation Checklist

  • Price-to-tick: Multiply float price by 1/tick_size, cast to integer. Use your DB's native cast (::BIGINT, SAFE_CAST, CAST, etc.).
  • Epoch extraction: Convert timestamps to epoch seconds as float64.
  • LOB update type inference: Determine ADD/MODIFY/DELETE in SQL from quantity changes (quantity=0 → DELETE, quantity>0 with existing level → MODIFY, else → ADD). The Numba kernel expects a clean integer.
  • Snapshot explosion: For JSON array storage, explode each element into individual rows with UNNEST/JSON_EXTRACT equivalents.
  • Time filtering: Apply venue/symbol/time-range filters in WHERE clauses for efficient querying.

Custom Generator Registration

from quantflow.state_engine.sql.registry import register_generator
from quantflow.state_engine.sql.base import StateEngineSQLGenerator

class MyBackendGenerator(StateEngineSQLGenerator):
def build_trades_sql(self, table, symbol, venue, start_time, end_time):
...
def build_lob_sql(self, table, symbol, venue, start_time, end_time):
...
def build_lob_snapshot_sql(self, table, symbol, venue, start_time, end_time):
...

register_generator("my_backend", MyBackendGenerator)

The historical_data_engine setting in the project config (data_processing.historical_data_engine) determines which generator is used.