[Fix](ai_agg) isolate AI_AGG query_ctx per aggregate state#63080

Open

linrrzqqq wants to merge 1 commit into apache:master from linrrzqqq:fix-ai-agg-ctx

Conversation

@linrrzqqq
Collaborator

Problem Summary:

AI_AGG previously stored QueryContext in a process-level static pointer. Concurrent AI_AGG queries could overwrite that pointer, causing one query to read another query's AI resource metadata, query options, or timeout state.

The aggregate function instance now receives the query context from AggFnEvaluator and binds it to each AggregateFunctionAIAggData state when the state is created, reset, or deserialized.
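The binding pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `QueryContextSketch`, `AIAggDataSketch`, `AIAggFunctionSketch`, and all of their members are hypothetical stand-ins, and the real `QueryContext`, `AggFnEvaluator`, and `AggregateFunctionAIAggData` interfaces may look different.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for QueryContext; only the fields this sketch needs.
struct QueryContextSketch {
    std::string ai_resource;
    int64_t timeout_ms;
};

// Hypothetical per-state data: each aggregate state holds its own
// observer pointer instead of reading a process-level static.
struct AIAggDataSketch {
    const QueryContextSketch* ctx = nullptr; // not owned
    std::string buffer;

    void bind(const QueryContextSketch* c) { ctx = c; }
};

// Hypothetical aggregate function instance: it receives the context once
// and binds it to every state it creates, resets, or deserializes.
class AIAggFunctionSketch {
public:
    void set_query_context(const QueryContextSketch* c) { _ctx = c; }

    void create(AIAggDataSketch& state) const {
        state = AIAggDataSketch{};
        state.bind(_ctx);
    }

    void reset(AIAggDataSketch& state) const {
        state.buffer.clear();
        state.bind(_ctx);
    }

    void deserialize(AIAggDataSketch& state, const std::string& bytes) const {
        state.buffer = bytes;
        state.bind(_ctx);
    }

private:
    const QueryContextSketch* _ctx = nullptr; // not owned
};

// Returns true when two function instances keep their contexts separate
// across create, reset, and deserialize.
bool contexts_stay_isolated() {
    QueryContextSketch q1{"resource_a", 1000};
    QueryContextSketch q2{"resource_b", 5000};

    AIAggFunctionSketch f1;
    AIAggFunctionSketch f2;
    f1.set_query_context(&q1);
    f2.set_query_context(&q2);

    AIAggDataSketch s1;
    AIAggDataSketch s2;
    f1.create(s1);
    f2.create(s2);
    f1.reset(s1);
    f2.deserialize(s2, "partial");

    return s1.ctx == &q1 && s2.ctx == &q2 &&
           s1.ctx->timeout_ms == 1000 && s2.ctx->timeout_ms == 5000;
}
```

With a single static pointer, the second `set_query_context` call would have overwritten the first, and both states would have observed `q2`. Storing the pointer per instance and per state removes that shared write.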

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@linrrzqqq
Collaborator Author

/review

Contributor

@github-actions bot left a comment

Review result: no blocking issues found in this PR.

Critical checkpoint conclusions:

  • Goal and proof: The PR removes the process-wide static QueryContext from AI_AGG and binds the context through AggregateFunctionAIAgg into each AggregateFunctionAIAggData state on create, reset, and deserialize. The added unit test covers two aggregate function instances with different context-window settings and verifies isolation.
  • Scope and clarity: The modification is small and focused on the AI_AGG context lifetime issue.
  • Concurrency: The removed static pointer was the shared mutable state. The replacement stores context on each aggregate function instance and copies it into each aggregate state; no new shared mutable state or lock ordering concern was introduced.
  • Lifecycle: Aggregate states are initialized through create, reset, and deserialize before use. QueryContext is an observer pointer consistent with the existing RuntimeState/query lifecycle used by AggFnEvaluator.
  • Configuration: No new configuration items are added.
  • Compatibility: No storage format, thrift protocol, or function signature compatibility change is introduced.
  • Parallel paths: Nullable aggregate wrappers delegate set_query_context to the nested function, and the aggregate create/deserialize/reset paths reviewed all preserve context propagation.
  • Conditional checks: Existing context checks remain unchanged; no new speculative defensive checks were added.
  • Test coverage: The new BE unit test directly exercises the fixed cross-instance isolation behavior. I did not run tests in this review environment.
  • Observability: No new observability is required for this narrowly scoped lifecycle fix.
  • Transaction/persistence/data writes: Not applicable.
  • FE/BE variable passing: No new transmitted variables are added.
  • Performance: No meaningful overhead beyond assigning a pointer during aggregate state lifecycle operations.

User focus response: No additional user-provided focus points were supplied, and no extra issue was found while reviewing the whole PR.
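The "Parallel paths" and "Test coverage" points above could be illustrated along these lines. This is a hedged sketch under assumed names: `CtxSketch`, `IAggFnSketch`, `AIAggSketch`, and `NullableWrapperSketch` are hypothetical and do not reproduce the Doris class hierarchy or the unit test added in the PR. Only the method name `set_query_context` comes from the review text.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical context carrying a context-window setting.
struct CtxSketch {
    std::size_t context_window;
};

// Hypothetical aggregate function interface.
class IAggFnSketch {
public:
    virtual ~IAggFnSketch() = default;
    virtual void set_query_context(const CtxSketch* ctx) = 0;
    virtual std::size_t effective_window() const = 0;
};

// Hypothetical AI_AGG-like function that stores the context per instance.
class AIAggSketch : public IAggFnSketch {
public:
    void set_query_context(const CtxSketch* ctx) override { _ctx = ctx; }
    std::size_t effective_window() const override {
        return _ctx != nullptr ? _ctx->context_window : 0;
    }

private:
    const CtxSketch* _ctx = nullptr; // not owned
};

// Hypothetical nullable wrapper: it holds no context of its own and
// forwards set_query_context to the nested function.
class NullableWrapperSketch : public IAggFnSketch {
public:
    explicit NullableWrapperSketch(std::unique_ptr<IAggFnSketch> nested)
            : _nested(std::move(nested)) {}

    void set_query_context(const CtxSketch* ctx) override {
        _nested->set_query_context(ctx);
    }
    std::size_t effective_window() const override {
        return _nested->effective_window();
    }

private:
    std::unique_ptr<IAggFnSketch> _nested;
};

// Two wrapped instances with different context-window settings must not
// observe each other's value.
bool wrapper_forwards_context() {
    CtxSketch small{4096};
    CtxSketch large{32768};

    NullableWrapperSketch a(std::make_unique<AIAggSketch>());
    NullableWrapperSketch b(std::make_unique<AIAggSketch>());
    a.set_query_context(&small);
    b.set_query_context(&large);

    return a.effective_window() == 4096 && b.effective_window() == 32768;
}
```

If the wrapper dropped the call instead of forwarding it, the nested function would keep a null context and `effective_window()` would return 0 in this sketch, which is the kind of gap the "Parallel paths" check is looking for.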

@linrrzqqq
Collaborator Author

run buildall

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (12/12) 🎉

Increment coverage report
Complete coverage report

| Category | Coverage |
|---|---|
| Function Coverage | 74.38% (27999/37643) |
| Line Coverage | 58.44% (304841/521663) |
| Region Coverage | 56.12% (256308/456715) |
| Branch Coverage | 57.57% (110649/192189) |

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (12/12) 🎉

Increment coverage report
Complete coverage report

| Category | Coverage |
|---|---|
| Function Coverage | 73.76% (27764/37643) |
| Line Coverage | 57.61% (300523/521663) |
| Region Coverage | 54.78% (250184/456715) |
| Branch Coverage | 56.36% (108314/192189) |

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (12/12) 🎉

Increment coverage report
Complete coverage report

| Category | Coverage |
|---|---|
| Function Coverage | 73.76% (27764/37643) |
| Line Coverage | 57.60% (300460/521663) |
| Region Coverage | 54.79% (250225/456715) |
| Branch Coverage | 56.34% (108283/192189) |
