Fetch deadline callback context via Execution API at runtime #66608

Open
seanghaeli wants to merge 7 commits into apache:main from aws-mwaa:ghaeli/callback-context-execution-api
Open

Fetch deadline callback context via Execution API at runtime#66608
seanghaeli wants to merge 7 commits into
apache:mainfrom
aws-mwaa:ghaeli/callback-context-execution-api

Conversation

Contributor

@seanghaeli seanghaeli commented May 8, 2026

Summary

Replace the simple context workaround from #55241 that stored serialized context in trigger kwargs (DB). Now that #55068 gives the triggerer API access, fetch the DagRun at execution time via the Execution API and build context fresh.

This avoids DB bloat from serialized context, provides fresh (not stale) context, and builds a richer context dict including logical_date, ds, ts, conf, data_interval_start/end, and the deadline info.

Changes

  • deadline.py: Remove get_simple_context(). Store only identifiers (dag_id, run_id, deadline_id, deadline_time) in callback kwargs.
  • callback.py: Add _build_context() that fetches DagRun via SUPERVISOR_COMMS.asend(GetDagRun(...)). Backward compat: old callbacks with "context" key still work.
  • triggerer_job_runner.py: Add GetDagRun to ToTriggerSupervisor union, DagRunResult to ToTriggerRunner union, handler in _handle_request.
  • callback_supervisor.py: Add GetDagRun to CallbackToSupervisor union + handler for executor callback path.
  • Tests: Updated deadline model tests, added context-fetching test, backward-compat test, GetDagRun handler test.
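
As a rough illustration of the `_build_context()` flow described above, here is a runnable sketch. `GetDagRun`, `DagRunResult`, and `SUPERVISOR_COMMS` stand in for the real `airflow.sdk.execution_time.comms` objects and are stubbed out here, so only the control flow and the shape of the resulting context dict should be taken as representative:

```python
import asyncio
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class GetDagRun:
    # Stand-in for the real comms message; only identifiers are sent.
    dag_id: str
    run_id: str


@dataclass
class DagRunResult:
    # Stand-in for the fields _build_context reads off the response.
    logical_date: datetime
    conf: dict
    data_interval_start: datetime
    data_interval_end: datetime


class StubSupervisorComms:
    async def asend(self, msg: GetDagRun) -> DagRunResult:
        # A real supervisor would answer over the Execution API.
        now = datetime(2026, 5, 8, tzinfo=timezone.utc)
        return DagRunResult(now, {}, now, now)


SUPERVISOR_COMMS = StubSupervisorComms()


async def _build_context(dag_id: str, run_id: str, deadline: dict) -> dict:
    # Fetch the DagRun fresh at execution time instead of reading a
    # serialized context out of trigger kwargs.
    dag_run = await SUPERVISOR_COMMS.asend(GetDagRun(dag_id=dag_id, run_id=run_id))
    return {
        "dag_id": dag_id,
        "run_id": run_id,
        "logical_date": dag_run.logical_date,
        "ds": dag_run.logical_date.strftime("%Y-%m-%d"),
        "ts": dag_run.logical_date.isoformat(),
        "conf": dag_run.conf,
        "data_interval_start": dag_run.data_interval_start,
        "data_interval_end": dag_run.data_interval_end,
        "deadline": deadline,
    }


context = asyncio.run(_build_context("my_dag", "run_1", {"id": "d-1"}))
```

The key point is that only `dag_id`/`run_id` persist in the DB; everything else is derived from the API response when the callback fires.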

Testing

Ran in Breeze to verify the comms plumbing works e2e:

  • Confirmed GetDagRun round-trips through the triggerer's ToTriggerSupervisor → _handle_request → DagRunResult response path without breaking existing trigger handling
  • Verified SUPERVISOR_COMMS.asend() is the correct async calling pattern — uses TriggerCommsDecoder from init_comms() with async lock for coroutine safety in the trigger event loop
  • Verified the DagRun generated model has all fields accessed in _build_context: logical_date, data_interval_start, data_interval_end, conf
  • Backward compat confirmed: old callbacks with stored "context" key (queued before this change) still work
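
The backward-compat dispatch verified above can be sketched as a simple key check (names here are illustrative, not the actual PR code):

```python
def needs_fetch(callback_kwargs: dict) -> bool:
    # Callbacks queued before this change carry a pre-built serialized
    # "context" dict; newer callbacks carry only identifiers and must
    # trigger a fresh fetch via the Execution API.
    return "context" not in callback_kwargs


old_style = {"context": {"dag_id": "d", "run_id": "r"}}
new_style = {"dag_id": "d", "run_id": "r", "deadline_id": "x"}
```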

Motivation

Per @ramitkataria's feedback on #64984: context should not be stored in the DB. The triggerer now has API access (#55068), so fetch it at runtime like tasks do.

Related

@seanghaeli
Contributor Author

@ramitkataria incorporated your feedback from #64984
@ferruzzi

your reviews would be much appreciated!

@seanghaeli seanghaeli marked this pull request as ready for review May 9, 2026 03:24
@potiuk potiuk added the "ready for maintainer review" label May 11, 2026
Contributor

@ferruzzi ferruzzi left a comment

Just a quick question, otherwise LGTM.

Comment thread airflow-core/src/airflow/triggers/callback.py
Contributor

@ferruzzi ferruzzi left a comment

Approved pending CI passing

Sean Ghaeli and others added 7 commits May 13, 2026 19:26
…in DB

Replace the simple context workaround from apache#55241 that stored serialized
context in trigger kwargs. Now that apache#55068 gives the triggerer API access,
fetch the DagRun and build context at execution time.

This avoids DB bloat from serialized context, provides fresh (not stale)
context, and enables richer context information. The CallbackTrigger now
uses SUPERVISOR_COMMS.asend(GetDagRun(...)) to fetch the DagRun details
from the Execution API when it runs, rather than receiving a pre-built
context dict from the scheduler.

Changes:
- deadline.py: Store only identifiers (dag_id, run_id, deadline_id,
  deadline_time) in callback kwargs instead of serialized context
- callback.py: Add _build_context() that fetches DagRun via Execution API;
  maintain backward compat for old callbacks with "context" key
- triggerer_job_runner.py: Add GetDagRun/DagRunResult to triggerer comms
- callback_supervisor.py: Add GetDagRun to executor callback comms

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The CallbackTrigger legitimately imports from airflow.sdk to communicate
with the supervisor via the Execution API at runtime, similar to
triggers/base.py and jobs/triggerer_job_runner.py which are already
excluded.
Address review feedback: only include deadline keys that have non-None
values, preventing the callback from receiving unexpected None entries.
@seanghaeli seanghaeli force-pushed the ghaeli/callback-context-execution-api branch from c93d733 to 82efc3e Compare May 13, 2026 19:27
Contributor

@ramitkataria ramitkataria left a comment

Thanks for pivoting away from the previous approach. This is in the right direction but I think there's still work to be done. Removing the context from DB is good but like I said in #64984, we should follow the approach used in #55068. I also want to point out that the way context works in ExecutorCallback also needs to be updated because it was using the same "temporary solution" and will break if this PR is merged.

I did a deep dive to reduce the number of iterations we have to go through and here's what I recommend based on my findings:

Context and kwargs:

  • Let's use the standard Context TypedDict for the context parameter (dag_run, run_id, logical_date, etc., with task-specific fields absent)
  • For deadline-specific info (deadline_id, deadline_time), let's add those to kwargs, since that's what they defined when registering the callback.

handle_miss (deadline.py):

  • `{"deadline": {"id": ..., "time": ...}}` goes in callback.data["kwargs"]
  • Let's not put context or DagRun identifiers in kwargs.
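
A minimal sketch of that recommended handle_miss payload shape (function and parameter names are hypothetical, not the actual deadline.py code):

```python
from datetime import datetime, timezone


def build_callback_data_kwargs(
    user_kwargs: dict, deadline_id: str, deadline_time: datetime
) -> dict:
    # User-registered kwargs pass through untouched; deadline info is
    # namespaced under a single "deadline" key, and no context or
    # DagRun identifiers are stored.
    return {
        **user_kwargs,
        "deadline": {"id": deadline_id, "time": deadline_time},
    }


kwargs = build_callback_data_kwargs(
    {"channel": "#alerts"},
    deadline_id="d-1",
    deadline_time=datetime(2026, 5, 8, tzinfo=timezone.utc),
)
```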

Triggerer path:

  • In _create_workload (triggerer_job_runner.py), when trigger.task_instance is None but trigger.callback exists with dag_id/run_id in its data, fetch the DagRun and put it in dag_run_data on the workload (same field start_from_trigger uses).
  • In create_triggers, when the workload has dag_run_data but no ti, build a Context (dag_run, run_id, logical_date, etc.) and set it as an attribute on the trigger instance (e.g. trigger_instance.context = built_context), same pattern as trigger_instance.task_instance = ti.
  • CallbackTrigger.run() reads self.context instead of popping from kwargs.

Executor path:

  • Adding GetDagRun to CallbackToSupervisor is good so let's keep that. Use it from inside execute_callback (the subprocess function), not from inside the trigger. When execute_callback detects it needs context (identifiers present on callback.data), it sends GetDagRun via SUPERVISOR_COMMS, builds a Context from the response, and passes it to the user's callback as a separate context parameter.
  • This matches how tasks work: the subprocess asks for what it needs through comms.
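
A sketch of that executor-path flow, with `SUPERVISOR_COMMS` stubbed and all names illustrative (the real subprocess comms object and message types live in the Task SDK):

```python
from dataclasses import dataclass


@dataclass
class GetDagRun:
    dag_id: str
    run_id: str


class StubSupervisorComms:
    # Stand-in for the subprocess comms channel; a real supervisor would
    # answer GetDagRun over the Execution API.
    def send(self, msg: GetDagRun) -> dict:
        return {"run_id": msg.run_id, "logical_date": "2026-05-08", "conf": {}}


SUPERVISOR_COMMS = StubSupervisorComms()


def execute_callback(callback_fn, data: dict):
    kwargs = dict(data.get("kwargs", {}))
    context = None
    if "dag_id" in data and "run_id" in data:
        # The subprocess asks for what it needs through comms, like tasks do.
        dag_run = SUPERVISOR_COMMS.send(GetDagRun(data["dag_id"], data["run_id"]))
        context = {"dag_run": dag_run, "run_id": dag_run["run_id"]}
    # Context is passed as a separate parameter, not mixed into kwargs.
    return callback_fn(context=context, **kwargs)


result = execute_callback(
    lambda context, **kw: context["run_id"],
    {"dag_id": "d", "run_id": "r1", "kwargs": {}},
)
```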

This way, the implementation for context in tasks and callbacks would become similar which is the goal.

{attr: getattr(self, attr) for attr in ("callback_path", "callback_kwargs")},
)

async def _build_context(
Contributor

Ideally, we should be minimizing any callback specific code for context and use the same type for context as the one used for tasks. So I think we should remove this function entirely

from airflow.sdk.execution_time.comms import DagRunResult, GetDagRun
from airflow.sdk.execution_time.task_runner import SUPERVISOR_COMMS

response = await SUPERVISOR_COMMS.asend(GetDagRun(dag_id=dag_id, run_id=run_id))
Contributor

I don't think a trigger is supposed to directly interact with SUPERVISOR_COMMS. That's the job of the trigger runner

# Store only identifiers in kwargs; the callback executor (triggerer or executor subprocess)
# fetches the full DagRun context via the Execution API at runtime. This avoids DB bloat
# from serialized context and ensures context is fresh at execution time.
context_identifiers = {
Contributor

Let's remove all these identifiers and have the triggerer supervisor fetch these like it does for tasks. I would like to keep self.callback.data["kwargs"] as minimal as possible besides the user-specified kwargs.

Contributor

@ferruzzi ferruzzi left a comment

Withdrawing my approval for now. Ramit has put a lot of thought and planning into this project already so I'll defer to his thoughts here. Sorry for the churn.

Labels

area:deadline-alerts AIP-86 (former AIP-57), area:task-sdk, area:Triggerer, ready for maintainer review
