Common Workflows & Troubleshooting
Day-to-day tasks and what to do when things go wrong.
1. Generating ML Feature Sets
- Pick features from the built-in catalog or define custom ones.
- Add your chosen features to quantflow_project.yml under feature_engine.features.
- Configure labels, e.g., triple_barrier for TP/SL-based targets.
- Run: qf run --start-date 2024-01-01 --end-date 2025-12-31
- Query features and labels in DuckDB: features are in {project}_feature.features, labels in cdm.cdm_labels. Join on symbol and timestamp for ML training.
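The join in the last step can be sketched as follows. This is a minimal illustration, not the project's own tooling: the helper names build_training_query and load_training_frame, the database path argument, and the assumption that both schemas live in one DuckDB file are all hypothetical. Only the table names and join keys come from the steps above.

```python
def build_training_query(project: str) -> str:
    """Return SQL joining feature rows to labels on symbol and timestamp."""
    # Guard against typos; the project name is interpolated into the schema name.
    if not project.isidentifier():
        raise ValueError(f"unexpected project name: {project!r}")
    return (
        "SELECT *\n"
        f"FROM {project}_feature.features AS f\n"
        "JOIN cdm.cdm_labels AS l\n"
        '  USING (symbol, "timestamp")\n'
        'ORDER BY symbol, "timestamp"'
    )


def load_training_frame(db_path: str, project: str):
    """Run the join and return a pandas DataFrame (needs duckdb and pandas)."""
    import duckdb  # third-party; imported lazily so the query builder has no dependencies

    con = duckdb.connect(db_path, read_only=True)
    try:
        return con.execute(build_training_query(project)).df()
    finally:
        con.close()


# Example (hypothetical path and project name):
# df = load_training_frame("warehouse.duckdb", "crypto")
```

If features or labels were computed more than once for the same dates, several run_id values may coexist, so it is probably worth filtering each side to a single run before joining.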
2. Switching from Research to Production
- Verify batch mode works: qf run --mode research
- Ensure DolphinDB is running; check host/port in .local_config.yml.
- Configure streaming feed providers in .definitions/feed_providers/.
- Deploy: qf run --mode trade
- Monitor: qf pipeline status
Features defined once in YAML run across batch and streaming modes — no duplicate implementations. Only the execution backend differs.
3. Changing Data Providers
- Find or create a feed provider YAML in .definitions/feed_providers/.
- Set the provider name in quantflow_project.yml under sources[].historical_feed_provider or streaming_feed_provider.
- Add credentials to .local_config.yml under feed_provider_credentials.
- Run qf validate to check the configuration.
4. Adding Features
The crypto template includes 133 built-in FeatureTypes. To configure features:
- In quantflow_project.yml, add entries under feature_engine.features:

    features:
      - name: my_ofi
        type: ofi
        parameters:
          bar: imbalance_k_10

- Override any feature parameters as needed (horizon, bar type, etc.).
- Run qf validate and then qf run.
5. Reprocessing After Config Changes
After changing state engine thresholds, label definitions, or feature parameters:
# Re-run state engine for the affected date range
qf run --engine state --start-date 2026-01-01 --end-date 2026-01-31
# Re-compute features and labels
qf run --engine feature --start-date 2026-01-01 --end-date 2026-01-31
New results replace old ones for the overlapping date range (state engine) or append with a new run_id (features/labels).
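When the same date range is reprocessed often, the two commands above can be wrapped in a small script. The sketch below is an assumption-laden convenience, not part of QuantFlow: the function names are hypothetical, and it only uses the flags shown above, running the state engine first and stopping if it fails.

```python
import subprocess
from datetime import date


def reprocess_commands(start_date: str, end_date: str) -> list:
    """Build the two qf invocations, state engine first, then feature engine."""
    start, end = date.fromisoformat(start_date), date.fromisoformat(end_date)
    if start > end:
        raise ValueError("start_date must not be after end_date")
    return [
        ["qf", "run", "--engine", engine,
         "--start-date", start_date, "--end-date", end_date]
        for engine in ("state", "feature")
    ]


def reprocess(start_date: str, end_date: str) -> None:
    """Run both commands in order; check=True aborts on the first failure."""
    for cmd in reprocess_commands(start_date, end_date):
        subprocess.run(cmd, check=True)


# Example:
# reprocess("2026-01-01", "2026-01-31")
```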
6. Running Batch Pipelines via Dagster
For research workflows that benefit from asset lineage and observability:
# Start Dagster UI
dagster dev -w dagster_workspace.yaml
# Trigger full pipeline
python -c "
from quantflow.pipeline import create_runner
from quantflow.metadata import load_metadata
meta = load_metadata(project_dir='.')
runner = create_runner(meta)
runner.run(stage='all')
"
# Or run individual stages (Python, in a session where runner was created as above)
runner.run(stage='ingest')
runner.run(stage='dbt')
runner.run(stage='state_engine')
runner.run(stage='feature_engine')
Dagster tracks run history, asset materializations, and per-stage failures — useful for debugging and reproducibility. Only the batch path goes through Dagster; streaming runs independently in DolphinDB.
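For running the individual stages from a script, a thin wrapper can report which stage failed. This is a sketch under two assumptions that the text above does not confirm: that the stages should run in the order listed, and that runner.run raises an exception on failure. The run_stages helper is hypothetical; it accepts any object with a run(stage=...) method.

```python
STAGES = ("ingest", "dbt", "state_engine", "feature_engine")


def run_stages(runner, stages=STAGES):
    """Run stages in order; stop at the first failure and say where it happened."""
    completed = []
    for stage in stages:
        try:
            runner.run(stage=stage)
        except Exception as exc:
            raise RuntimeError(
                f"stage {stage!r} failed after completing {completed}"
            ) from exc
        completed.append(stage)
    return completed


# Example, using the runner construction shown above:
# from quantflow.pipeline import create_runner
# from quantflow.metadata import load_metadata
# runner = create_runner(load_metadata(project_dir="."))
# run_stages(runner, stages=("state_engine", "feature_engine"))
```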
7. Troubleshooting
"Feature type not in registry"
The type in your feature definition doesn't match any name in .definitions/feature_types/. Check for typos, and verify the YAML file exists in that directory (subdirectories are searched recursively).
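A quick way to check for typos is to list the definition files and look for near matches. The sketch below assumes, hypothetically, that a feature type's name equals the file name without its extension; if the name is declared inside the YAML instead, the lookup would need to parse each file. The find_feature_type helper is illustrative only.

```python
import difflib
from pathlib import Path


def find_feature_type(type_name, root=".definitions/feature_types"):
    """Return (path, []) on an exact match, else (None, close_matches)."""
    stems = {}
    # Search subdirectories recursively, accepting both YAML extensions.
    for pattern in ("*.yml", "*.yaml"):
        for path in Path(root).rglob(pattern):
            stems.setdefault(path.stem, path)
    if type_name in stems:
        return stems[type_name], []
    return None, difflib.get_close_matches(type_name, sorted(stems), n=3)


# Example:
# path, suggestions = find_feature_type("ofl")
# if path is None:
#     print("not found; did you mean:", suggestions)
```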
"Missing required parameter"
A feature type declares a parameter as required: true, but no value is in your config. Add it under the feature's parameters section in quantflow_project.yml, or in the feature YAML definition under .definitions/features/.
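The check behind this error can be approximated in a few lines. The dictionary shapes here are assumptions for illustration: declared maps each parameter name to its declaration (including the required flag), and configured holds the values set for the feature. The actual schema of the definition files may differ.

```python
def missing_required_params(declared: dict, configured: dict) -> list:
    """Return names of parameters marked required that have no configured value."""
    return sorted(
        name
        for name, spec in declared.items()
        if spec.get("required") and configured.get(name) is None
    )


# Example with hypothetical declarations:
# declared = {"bar": {"required": True}, "horizon": {"required": False}}
# missing_required_params(declared, {"horizon": 10})
```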
"Required input column not available"
A feature's required_inputs lists a column that doesn't exist in the source CDM tables. Verify the state engine ran successfully for the date range, and check that the column name matches what the state engine produces.
"No source columns available" (feature skipped)
The feature's first step references columns that don't exist in the merged source data. Check column name mappings and ensure the state engine completed before the feature engine runs.
DolphinDB connection refused
- Verify host/port in .local_config.yml.
- Default credentials: admin/123456.
- Ensure the DolphinDB server has enough memory for the deployed engines.
- Check that the DolphinDB service is running.
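To separate a network problem from a credentials or memory problem, a plain TCP check against the configured host and port can help. This sketch uses only the Python standard library and does not speak the DolphinDB protocol; can_connect is a hypothetical helper, and the host and port in the example are placeholders for the values in .local_config.yml.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


# Example (placeholder values; use the ones from .local_config.yml):
# print(can_connect("localhost", 8848))
```

If this returns False, the service is likely down or unreachable; if it returns True but the pipeline still fails to connect, the credentials or server resources are the more probable cause.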
Pipeline stage hangs or times out
- For batch mode: try reducing micro_batch_size in state engine config.
- For streaming mode: check qf pipeline status for queue depth; backed-up queues indicate a downstream bottleneck.
- Verify data is actually available for the requested date range.