feat: Separate Runtime Statistics Collection from UI Updates#4205
feat: Separate Runtime Statistics Collection from UI Updates#4205kunwp1 wants to merge 5 commits intoapache:mainfrom
Conversation
Xiao-zhen-Liu
left a comment
There was a problem hiding this comment.
LGTM, left minor comments and some questions. Tested and can verify the size changes of the persisted runtime stats by adjusting this new parameter.
| status-update-interval = 500 | ||
| status-update-interval = ${?CONSTANTS_STATUS_UPDATE_INTERVAL} | ||
|
|
||
| runtime-statistics-interval = 2000 |
There was a problem hiding this comment.
It should be more explicit that this config is about persistence. Please add it in the name.
| Future.collect(futures).flatMap(_ => processLayers(rest)) | ||
| } | ||
|
|
||
| // Start processing all layers and update the frontend after completion |
There was a problem hiding this comment.
Please also update this comment as it is not just the frontend.
| ControlInvocation( | ||
| METHOD_CONTROLLER_INITIATE_QUERY_STATISTICS, | ||
| QueryStatisticsRequest(Seq.empty), | ||
| QueryStatisticsRequest(Seq.empty, updateTarget), |
There was a problem hiding this comment.
As you are having two separate timers that each send separate QueryStatisticsRequests, more requests will be sent than before. I am wondering what would be the implication of this? For example, would more frequent QueryStatistics be sent to each worker? It would be good if you can comment on this in your PR description.
| ): Future[EmptyReturn] = { | ||
| // Avoid issuing concurrent full-graph statistics queries. | ||
| // If a global query is already in progress, skip this request. | ||
| if (globalQueryStatsOngoing && msg.filterByWorkers.isEmpty) { |
There was a problem hiding this comment.
The default configs of 500ms vs. 2000ms might result in each RuntimeStatisticsPersist coinciding with ExecutionStatsUpdate. Will all the RuntimeStatisticsPersist requests be discarded?
There was a problem hiding this comment.
I wanted the runtime-statistics-interval to be independent from the existing status-update-interval and consider the case where runtime-statistics-interval is smaller than status-update-interval. But if we make an assumption that the RuntimeStatisticsPersist will always coincide with ExecutionStatsUpdate, the implementation would be much simpler and we can discard RuntimeStatisticsPersist.
There was a problem hiding this comment.
You can decide what assumptions we make. I just wonder what the behavior of this part will be if the requests are independently sent.
What changes were proposed in this PR?
This PR introduce a new configuration parameter
runtime-statistics-intervalto independently control the frequency of runtime statistics persistence, separate from the UI update frequency (status-update-interval). Previously, both UI updates and runtime statistics persistence were controlled by a single parameterstatus-update-interval. This meansfrequent UI updates (e.g., 500ms) caused excessive statistics writes to storage. This change allows independent control:
status-update-interval: Controls how often the frontend UI refreshes (default: 500ms)runtime-statistics-interval: Controls how often statistics are persisted to storage (default: 2000ms)Changes
runtime-statistics-intervalparameter (default: 2000ms) inapplication.confStatisticsUpdateTargetenum (UI_ONLY,PERSISTENCE_ONLY,BOTH_UI_AND_PERSISTENCE) toQueryStatisticsRequestRuntimeStatisticsPersistevent for statistics-only updates;ExecutionStatsUpdatenow handles UI-only updatesQueryWorkerStatisticsHandlerroutes to appropriate event based onStatisticsUpdateTargetAny related issues, documentation, discussions?
Closes #4204
How was this PR tested?
Tested with the following workflow and dataset, change the
runtime-statistics-intervalparameter to see if the runtime stats size reduces if we increase the parameter value.Iris Dataset Analysis.json
Iris.csv
Was this PR authored or co-authored using generative AI tooling?
No.