@javfg (Member) commented Jun 19, 2025

This PR adds a way to run a single ETL or Gentropy step in the orchestrator. See the updated README.md for more info.

It also sets every cluster's idle_ttl to 5 minutes, since all steps now respawn clusters when needed.

@ireneisdoomed (Contributor) commented

I tried running the L2G training step on this branch but couldn't get it to work.

  1. I first ran into an issue with the new dependency, deepmerge: it was installed in my project but not on the Airflow instance.
    I reset the Airflow instance by running docker compose build --no-cache and then make dev.

  2. I then got a DAG error indicating that PIS_L2G was expected to run, even though I had specified that only the gentropy step should run.

Broken DAG: [/opt/airflow/dags/src/orchestration/dags/unified_pipeline.py]
Traceback (most recent call last):
  File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
  File "/opt/airflow/dags/src/orchestration/dags/unified_pipeline.py", line 495, in <module>
    step_tasks["start"].set_upstream(steps[dep]["end"])
                                     ~~~~~^^^^^
KeyError: 'pis_l2g'
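
The KeyError suggests that the dependency wiring looks up every declared dependency in the steps mapping, including ones that are not part of a single-step run. Below is a minimal sketch of that failure mode and one possible guard. It assumes a hypothetical `steps` dict keyed by step name and a hypothetical `dependencies` mapping; neither is taken from unified_pipeline.py, and plain strings stand in for Airflow tasks.

```python
# Hypothetical stand-ins: the real DAG builds Airflow tasks, not strings.
def wire_dependencies(steps, dependencies):
    """Return (upstream_end, downstream_start) edges, skipping any
    dependency that is not part of the current run."""
    edges = []
    for name, deps in dependencies.items():
        if name not in steps:
            continue  # the step itself is not scheduled in this run
        for dep in deps:
            if dep not in steps:
                # An unguarded steps[dep] lookup here would raise
                # KeyError: 'pis_l2g' in a single-step run.
                continue
            edges.append((steps[dep]["end"], steps[name]["start"]))
    return edges


# Dev run with only one step selected; its assumed upstream step is absent.
steps = {"gentropy_l2g_training": {"start": "l2g.start", "end": "l2g.end"}}
dependencies = {"gentropy_l2g_training": ["pis_l2g"]}
edges = wire_dependencies(steps, dependencies)
print(edges)
```

Whether skipping the missing dependency is the right behaviour, rather than failing with a clearer message, depends on how the orchestrator is meant to treat upstream outputs in dev runs.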

This is the config I used:

# unified_pipeline.yaml

################################################################################
# UNIFIED PIPELINE CONFIGURATION
################################################################################

# `release_name` is the prefix used as work path when the pipeline is run in
# _production mode_ (`is_dev: false`). It is where step inputs will be read
# from, and outputs will be written to.
release_name: '25.06'

# `is_dev` decides whether this is a run for a release or development purposes.
# On a dev run, only one step of the pipeline can be selected to run; and that
# step can only be from the `etl` or `gentropy` stages (PIS and PTS steps can be
# run locally in a very easy way without using the orchestrator).
# If `is_dev` is true, the `run_name` and `run_steps` parameters must be set.
is_dev: true

### NOTE: The next two settings are exclusive for dev runs.
# `dev_run_name` is the output folder for a dev run. The convention is:
# `<username>/<release_name>-<description>`
dev_run_name: 'il/new-l2g'
# `dev_run_step` is the step that will be run in a dev run.
dev_run_step: gentropy_l2g_training

# gentropy.yaml - where I configured the step
  l2g_training:
    params:
      step: locus_to_gene
      step.session.write_mode: overwrite
      step.run_mode: train
      step.wandb_run_name: '{{l2g_training_version}}'
      step.cross_validate: false
      step.hf_hub_repo_id: opentargets/locus_to_gene_xgboost
      step.hf_model_commit_message: 'chore: update model base model for {{l2g_training_version}} run'
      +step.session.extended_spark_conf: "{spark.kryoserializer.buffer.max:500m, spark.sql.autoBroadcastJoinThreshold:'-1'}"
      # INPUTS
      step.credible_set_path: '{{release_uri}}/output/credible_set'
      step.feature_matrix_path: '{{release_uri}}/intermediate/l2g_feature_matrix'
      step.gold_standard_curation_path: '{{release_uri}}/input/l2g/gold_standard.json'
      # OUTPUTS
      step.model_path: '{{output_uri}}/etc/model/locus_to_gene_model/classifier.skops'

@javfg Whenever you have some time, could you let me know if I've done anything wrong here?
