
Custom Label Methods

Define custom label methods in Python using the decorator-based registry.


The Contract

Every label method follows this signature:

| Parameter | Type | Description |
|---|---|---|
| cdm_data | dict[str, pl.DataFrame] | CDM tables keyed by name |
| params | dict | Method-specific parameters from YAML |
| inputs | dict | Column name mappings (e.g., {"close": "close_price"}) |

Returns: LabelResult with fields:

  • name — label name (string)
  • df — result DataFrame with at minimum symbol, timestamp, label_value columns
  • horizon — integer (0 = backward-looking, >0 = forward-looking with N-period shift)
  • output_columns — list of column names this label produced

Key Rules

  • Use @register("my_method_name") — the string becomes the type in YAML configs.
  • Read columns via the inputs dict — never hard-code column names.
  • Use Polars' .over("symbol") for per-symbol computations.
  • Return a LabelResult.
  • Place your module where it will be imported before the orchestrator runs.

Registration

The method registry is auto-populated at import time. All files in label_engine/methods/ are auto-imported via __init__.py. To add a custom method outside the package, import your module explicitly before instantiating LabelEngine:

import my_custom_label_method # Triggers @register
from quantflow.label_engine.orchestrator import LabelEngine
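The import-order requirement follows from how decorator-based registries generally work: the decorator runs when the module body executes, so a method exists in the registry only after its module has been imported. The sketch below shows that general pattern; `_METHODS` and `get_method` are hypothetical names, and the package's actual registry may be structured differently.

```python
from typing import Callable

# Hypothetical module-level registry: method name -> label function.
_METHODS: dict[str, Callable] = {}


def register(name: str):
    """Return a decorator that records the function under `name`."""
    def decorator(fn: Callable) -> Callable:
        if name in _METHODS:
            raise ValueError(f"label method {name!r} is already registered")
        _METHODS[name] = fn
        return fn          # the function itself is left unchanged
    return decorator


def get_method(name: str) -> Callable:
    """Look up a method by the name used in the YAML `type` field."""
    try:
        return _METHODS[name]
    except KeyError:
        raise KeyError(
            f"no label method registered as {name!r}; was its module imported?"
        ) from None


# Registration happens here, as a side effect of executing the definition.
@register("my_method_name")
def my_method(cdm_data, params, inputs):
    ...
```

In a registry of this shape, a `type` whose module was never imported simply cannot be found, which is why the explicit import has to come before the engine is created.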

Built-in Methods Reference

| type | What it computes | Forward-looking? |
|---|---|---|
| triple_barrier | +1 (TP), -1 (SL), 0 (expiry) | Yes |
| fixed_horizon_return | Forward return; optional binning into classes | Yes |
| trend_scanning | +1/-1/0 CUSUM regime detection | No |
| ts_label | Direction classifier with noise band | Yes |
| meta_label | Secondary model assessing primary label correctness | Yes |
| quantile_label | Quantile-based categorical bins from continuous distribution | Yes |

Using a Custom Label

In quantflow_project.yml under label_engine:

label_engine:
  enabled: true
  historical_label_engine: polars
  labels:
    - name: my_custom_label
      type: my_method_name  # Must match the @register string
      description: Custom label description
      parameters:
        threshold: 0.01
      inputs:
        close: close
      bar_types: [time_1m]

The label_engine field accepts either:

  • A LabelEngineConfig dict with historical_label_engine and labels keys (shown above)
  • A bare list of label definitions (legacy format — auto-wrapped to {"labels": [...]})
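The two accepted shapes can be reconciled with a small normalization step. The function below is a hypothetical sketch of that auto-wrapping, written against plain dicts and lists; it is not the package's own loader, and the name `normalize_label_engine` is made up for the example.

```python
from typing import Any


def normalize_label_engine(raw: Any) -> dict:
    """Coerce either accepted shape into a dict with a `labels` key."""
    if isinstance(raw, list):
        # Legacy format: a bare list of label definitions.
        return {"labels": raw}
    if isinstance(raw, dict):
        # Current format: already a config dict; returned unchanged.
        return raw
    raise TypeError(
        f"label_engine must be a dict or a list, got {type(raw).__name__}"
    )


legacy = [{"name": "my_custom_label", "type": "my_method_name"}]
current = {"historical_label_engine": "polars", "labels": legacy}

wrapped = normalize_label_engine(legacy)
unchanged = normalize_label_engine(current)
```

After this step, downstream code can read the `labels` key without caring which format the project file used.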

Label Engine Constructor

LabelEngine takes three parameters:

| Parameter | Type | Required | Description |
|---|---|---|---|
| config | LabelEngineConfig | Yes | Label engine configuration from project metadata |
| engine | DBEngine | Yes | Database engine for reading CDM tables and writing results |
| schema_name | str | No (default: "cdm") | Database schema name for CDM tables |

To join features and labels for ML training, query both tables from the database and join on symbol and timestamp. Features are stored in {project}_feature.features; labels are in cdm.cdm_labels.
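The join itself is an ordinary two-key equality join. The sketch below runs it against in-memory SQLite databases purely so the query can be executed anywhere; the real database, the project name `demo`, the `momentum` feature column, and the assumption that `cdm.cdm_labels` exposes `symbol`, `timestamp`, and `label_value` columns are all illustrative choices, not details confirmed by this page.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Attach two in-memory databases to mimic the two schemas.
conn.execute("ATTACH DATABASE ':memory:' AS demo_feature")
conn.execute("ATTACH DATABASE ':memory:' AS cdm")

conn.execute(
    "CREATE TABLE demo_feature.features "
    "(symbol TEXT, timestamp TEXT, momentum REAL)"
)
conn.execute(
    "CREATE TABLE cdm.cdm_labels "
    "(symbol TEXT, timestamp TEXT, label_value INTEGER)"
)

conn.executemany(
    "INSERT INTO demo_feature.features VALUES (?, ?, ?)",
    [
        ("AAA", "2024-01-02T09:30", 0.5),
        ("AAA", "2024-01-02T09:31", -0.2),   # no label exists for this bar
        ("BBB", "2024-01-02T09:30", 0.1),
    ],
)
conn.executemany(
    "INSERT INTO cdm.cdm_labels VALUES (?, ?, ?)",
    [
        ("AAA", "2024-01-02T09:30", 1),
        ("BBB", "2024-01-02T09:30", -1),
    ],
)

# Inner join on (symbol, timestamp): rows without a label are dropped.
TRAINING_SQL = """
SELECT f.symbol, f.timestamp, f.momentum, l.label_value
FROM demo_feature.features AS f
JOIN cdm.cdm_labels AS l
  ON l.symbol = f.symbol AND l.timestamp = f.timestamp
ORDER BY f.symbol, f.timestamp
"""
rows = conn.execute(TRAINING_SQL).fetchall()
```

An inner join is used here so that bars without a label (for example, the most recent bars of a forward-looking label) are excluded from the training set; a left join would keep them with a null label instead.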


See the Label Engine design doc for the full architecture.