Fetch deadline callback context via Execution API at runtime #66608
seanghaeli wants to merge 7 commits
Conversation
@ramitkataria I incorporated your feedback from #64984; your reviews would be much appreciated!
ferruzzi
left a comment
Just a quick question, otherwise LGTM.
ferruzzi
left a comment
Approved pending CI passing
…in DB

Replace the simple context workaround from apache#55241 that stored serialized context in trigger kwargs. Now that apache#55068 gives the triggerer API access, fetch the DagRun and build context at execution time. This avoids DB bloat from serialized context, provides fresh (not stale) context, and enables richer context information.

The CallbackTrigger now uses SUPERVISOR_COMMS.asend(GetDagRun(...)) to fetch the DagRun details from the Execution API when it runs, rather than receiving a pre-built context dict from the scheduler.

Changes:
- deadline.py: Store only identifiers (dag_id, run_id, deadline_id, deadline_time) in callback kwargs instead of serialized context
- callback.py: Add _build_context() that fetches DagRun via Execution API; maintain backward compat for old callbacks with "context" key
- triggerer_job_runner.py: Add GetDagRun/DagRunResult to triggerer comms
- callback_supervisor.py: Add GetDagRun to executor callback comms

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
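The fetch-at-runtime idea in this commit can be sketched roughly as below; GetDagRun, StubComms, and build_context are simplified stand-ins for illustration, not the PR's actual classes:

```python
import asyncio
from dataclasses import dataclass

# Rough sketch of the commit's idea: the trigger holds only identifiers and
# asks the supervisor for the DagRun when it runs, building context fresh.
# GetDagRun and StubComms below are simplified stand-ins.

@dataclass
class GetDagRun:
    dag_id: str
    run_id: str

class StubComms:
    def __init__(self, dag_runs):
        self._dag_runs = dag_runs

    async def asend(self, msg):
        # Stand-in for the supervisor answering over the Execution API.
        return self._dag_runs[(msg.dag_id, msg.run_id)]

async def build_context(comms, dag_id, run_id, deadline_time):
    dag_run = await comms.asend(GetDagRun(dag_id=dag_id, run_id=run_id))
    # Context is assembled at execution time, so it is never stale.
    return {"dag_run": dag_run, "run_id": run_id, "deadline_time": deadline_time}
```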
The CallbackTrigger legitimately imports from airflow.sdk to communicate with the supervisor via the Execution API at runtime, similar to triggers/base.py and jobs/triggerer_job_runner.py which are already excluded.
Address review feedback: only include deadline keys that have non-None values, preventing the callback from receiving unexpected None entries.
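The non-None filtering can be sketched as a plain dict comprehension; build_deadline_kwargs and the key names are illustrative, not the PR's actual code:

```python
# Sketch of the review fix: only forward deadline keys that are set, so the
# user callback never receives unexpected None entries. Names are assumptions.

def build_deadline_kwargs(deadline_id=None, deadline_time=None, **user_kwargs):
    candidate = {"deadline_id": deadline_id, "deadline_time": deadline_time}
    # Drop keys whose value is None before merging with user-specified kwargs.
    return {**user_kwargs, **{k: v for k, v in candidate.items() if v is not None}}
```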
Force-pushed from c93d733 to 82efc3e
Thanks for pivoting away from the previous approach. This is in the right direction but I think there's still work to be done. Removing the context from DB is good but like I said in #64984, we should follow the approach used in #55068. I also want to point out that the way context works in ExecutorCallback also needs to be updated because it was using the same "temporary solution" and will break if this PR is merged.
I did a deep dive to reduce the number of iterations we have to go through and here's what I recommend based on my findings:
Context and kwargs:
- Let's use the standard Context TypedDict for the context parameter (dag_run, run_id, logical_date, etc., with task-specific fields absent)
- For deadline-specific info (deadline_id, deadline_time), let's add those to kwargs, since that's what they defined when registering the callback.
handle_miss (deadline.py):
- `{"deadline": {"id": ..., "time": ...}}` goes in callback.data["kwargs"]
- Let's not put context or DagRun identifiers in kwargs.
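A minimal sketch of the kwargs shape this recommends, with a hypothetical helper and field names:

```python
from datetime import datetime, timezone

# Illustrative shape of callback.data["kwargs"] under the reviewer's proposal:
# deadline info nested under a single "deadline" key, user kwargs alongside,
# and no context or DagRun identifiers mixed in. Names are assumptions.

def make_callback_kwargs(deadline_id, deadline_time, user_kwargs):
    return {
        **user_kwargs,
        "deadline": {"id": deadline_id, "time": deadline_time},
    }
```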
Triggerer path:
- In _create_workload (triggerer_job_runner.py), when trigger.task_instance is None but trigger.callback exists with dag_id/run_id in its data, fetch the DagRun and put it in dag_run_data on the workload (same field start_from_trigger uses).
- In create_triggers, when the workload has dag_run_data but no ti, build a Context(dag_run, run_id, logical_date, etc.) and set it as an attribute on the trigger instance (e.g. trigger_instance.context = built_context), same pattern as trigger_instance.task_instance = ti.
- CallbackTrigger.run() reads self.context instead of popping from kwargs.
Executor path:
- Adding GetDagRun to CallbackToSupervisor is good so let's keep that. Use it from inside execute_callback (the subprocess function), not from inside the trigger. When execute_callback detects it needs context (identifiers present on callback.data), it sends GetDagRun via SUPERVISOR_COMMS, builds a Context from the response, and passes it to the user's callback as a separate context parameter.
- This matches how tasks work: the subprocess asks for what it needs through comms.
This way, the implementation for context in tasks and callbacks would become similar which is the goal.
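The executor-path flow described above can be sketched as follows; GetDagRun, FakeComms, and execute_callback's signature here are simplified stand-ins, not Airflow's actual API:

```python
from dataclasses import dataclass

# Sketch of the reviewer's executor-path flow: the subprocess function
# (execute_callback), not the trigger, asks the supervisor for the DagRun
# over comms. The message class and comms object are stand-ins.

@dataclass
class GetDagRun:
    dag_id: str
    run_id: str

class FakeComms:
    """Stand-in supervisor that answers GetDagRun like the real comms channel."""
    def __init__(self, dag_runs):
        self._dag_runs = dag_runs

    def send(self, msg):
        if isinstance(msg, GetDagRun):
            return self._dag_runs[(msg.dag_id, msg.run_id)]
        raise NotImplementedError(type(msg))

def execute_callback(callback_data, user_callback, comms):
    context = None
    ids = callback_data.get("identifiers")
    if ids:  # context needed: fetch the DagRun through comms, like tasks do
        dag_run = comms.send(GetDagRun(**ids))
        context = {"dag_run": dag_run, "run_id": dag_run["run_id"]}
    return user_callback(context=context, **callback_data.get("kwargs", {}))
```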
    {attr: getattr(self, attr) for attr in ("callback_path", "callback_kwargs")},
)

async def _build_context(
Ideally, we should be minimizing any callback specific code for context and use the same type for context as the one used for tasks. So I think we should remove this function entirely
from airflow.sdk.execution_time.comms import DagRunResult, GetDagRun
from airflow.sdk.execution_time.task_runner import SUPERVISOR_COMMS

response = await SUPERVISOR_COMMS.asend(GetDagRun(dag_id=dag_id, run_id=run_id))
I don't think a trigger is supposed to directly interact with SUPERVISOR_COMMS. That's the job of the trigger runner
# Store only identifiers in kwargs; the callback executor (triggerer or executor subprocess)
# fetches the full DagRun context via the Execution API at runtime. This avoids DB bloat
# from serialized context and ensures context is fresh at execution time.
context_identifiers = {
Let's remove all these identifiers and have the triggerer supervisor fetch these like it does for tasks. I would like to keep self.callback.data["kwargs"] as minimal as possible besides the user-specified kwargs.
ferruzzi
left a comment
Withdrawing my approval for now. Ramit has put a lot of thought and planning into this project already so I'll defer to his thoughts here. Sorry for the churn.
Summary
Replace the simple context workaround from #55241 that stored serialized context in trigger kwargs (DB). Now that #55068 gives the triggerer API access, fetch the DagRun at execution time via the Execution API and build context fresh.
This avoids DB bloat from serialized context, provides fresh (not stale) context, and builds a richer context dict including logical_date, ds, ts, conf, data_interval_start/end, and the deadline info.

Changes

- deadline.py: Remove get_simple_context(). Store only identifiers (dag_id, run_id, deadline_id, deadline_time) in callback kwargs.
- callback.py: Add _build_context() that fetches the DagRun via SUPERVISOR_COMMS.asend(GetDagRun(...)). Backward compat: old callbacks with a "context" key still work.
- triggerer_job_runner.py: Add GetDagRun to the ToTriggerSupervisor union, DagRunResult to the ToTriggerRunner union, and a handler in _handle_request.
- callback_supervisor.py: Add GetDagRun to the CallbackToSupervisor union plus a handler for the executor callback path.
- Tests: GetDagRun handler test.

Testing
Ran in Breeze to verify the comms plumbing works e2e:
- GetDagRun round-trips through the triggerer's ToTriggerSupervisor → _handle_request → DagRunResult response path without breaking existing trigger handling
- SUPERVISOR_COMMS.asend() is the correct async calling pattern; it uses TriggerCommsDecoder from init_comms() with an async lock for coroutine safety in the trigger event loop
- The DagRun generated model has all fields accessed in _build_context: logical_date, data_interval_start, data_interval_end, conf
- Old callbacks with a "context" key (queued before this change) still work

Motivation
Per @ramitkataria's feedback on #64984: context should not be stored in the DB. The triggerer now has API access (#55068), so fetch it at runtime like tasks do.
Related