feat: adaptive lookback for monovertex #2373
Signed-off-by: Sidhant Kohli <[email protected]>
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2373      +/-   ##
==========================================
- Coverage   69.84%   69.68%   -0.16%
==========================================
  Files         361      361
  Lines       49935    50040     +105
==========================================
- Hits        34878    34872       -6
- Misses      13979    14095     +116
+ Partials     1078     1073       -5
@KeranYang - please review.
lastSeen := make(map[string]struct {
	count    float64
	seenTime int64
})

// Map to store the maximum duration for which the value of any pod was unchanged.
maxDuration := make(map[string]int64)
My point was to make the code easier to understand by restructuring the method along these lines:
maxUnchangedDuration = make(map[string]int64)
for pod in pods:
    maxUnchangedDuration[pod.name] = calculateMaxUnchangedDurationForPod(pod)
globalMaxSecs = max(maxUnchangedDuration.values())
return globalMaxSecs
With this structure, the calculateMaxUnchangedDurationForPod method doesn't need to maintain the pod-name-to-lastSeen/maxDuration mappings that we keep right now.
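The structure suggested in this comment could look roughly like the Go sketch below. It is only an illustration, not the code in this PR: the `sample` type, the `globalMaxUnchangedDuration` helper, and the choice to measure an unchanged run from its first sample to its last sample with the same count are all assumptions made for the example.

```go
package main

import "fmt"

// sample is a hypothetical (timestamp, count) observation for a single pod.
type sample struct {
	timestamp int64   // Unix seconds
	count     float64 // cumulative count reported by the pod
}

// calculateMaxUnchangedDurationForPod returns the longest stretch, in seconds,
// during which one pod's count stayed the same. Samples are assumed to be in
// ascending time order. Assumption: a stretch is measured from the first
// sample with a given count to the last consecutive sample with that count.
func calculateMaxUnchangedDurationForPod(samples []sample) int64 {
	if len(samples) == 0 {
		return 0
	}
	var maxDur int64
	runStart := samples[0] // first sample of the current run of equal counts
	for _, s := range samples[1:] {
		if s.count != runStart.count {
			runStart = s // the value changed, so a new run starts here
			continue
		}
		if d := s.timestamp - runStart.timestamp; d > maxDur {
			maxDur = d
		}
	}
	return maxDur
}

// globalMaxUnchangedDuration computes the per-pod maximum and returns the
// largest of them. No cross-pod lastSeen bookkeeping is needed.
func globalMaxUnchangedDuration(podSamples map[string][]sample) int64 {
	var globalMaxSecs int64
	for _, samples := range podSamples {
		if d := calculateMaxUnchangedDurationForPod(samples); d > globalMaxSecs {
			globalMaxSecs = d
		}
	}
	return globalMaxSecs
}

func main() {
	pods := map[string][]sample{
		"pod-a": {{0, 10}, {10, 10}, {20, 10}, {30, 15}, {40, 15}},
		"pod-b": {{0, 5}, {10, 6}, {20, 6}, {30, 6}, {40, 6}, {50, 7}},
	}
	fmt.Println(globalMaxUnchangedDuration(pods))
}
```

Because each pod is handled in isolation, the per-pod function only needs local variables, which is the simplification the comment is asking for.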
This builds on top of our current rater to derive the relevant information and calculate the required lookback window.
This would encapsulate the lookback for two scenarios
lookbackSeconds - How many seconds to look back when calculating the vertex average processing rate (tps) and pending messages; defaults to 120. The rate and pending-message metrics are critical for autoscaling, so you might need to tune this parameter to get better results. For example, suppose your data source only receives 1 minute of input every 5 minutes and you don't want the vertices to be scaled down to 0. In this case, increase lookbackSeconds to cover the full 5 minutes, so that the calculated average rate and pending messages are not 0 during the silent period, which prevents scaling down to 0.
https://numaflow.numaproj.io/user-guide/reference/autoscaling/#numaflow-autoscaling
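To make the 5-minute example concrete, here is a small Go sketch of how a windowed average behaves for a bursty source. It is not Numaflow's rate calculation: the `countSample` type and the `averageRate` function are hypothetical, and the rate is simply the difference between the last and first cumulative totals inside the window divided by the elapsed time.

```go
package main

import "fmt"

// countSample is a hypothetical cumulative-count observation.
type countSample struct {
	timestamp int64   // seconds
	total     float64 // cumulative messages processed so far
}

// averageRate returns the mean rate (messages per second) over the trailing
// lookbackSeconds, using the first and last samples that fall inside the
// window [now-lookbackSeconds, now]. Samples are assumed to be time-ordered.
func averageRate(samples []countSample, now, lookbackSeconds int64) float64 {
	start := now - lookbackSeconds
	var first, last *countSample
	for i := range samples {
		s := &samples[i]
		if s.timestamp < start || s.timestamp > now {
			continue
		}
		if first == nil {
			first = s
		}
		last = s
	}
	// Fewer than two distinct points in the window: no rate can be derived.
	if first == nil || last.timestamp == first.timestamp {
		return 0
	}
	return (last.total - first.total) / float64(last.timestamp-first.timestamp)
}

func main() {
	// 600 messages arrive during the first minute, then 4 silent minutes.
	samples := []countSample{
		{0, 0}, {60, 600}, {120, 600}, {180, 600}, {240, 600}, {300, 600},
	}
	fmt.Println(averageRate(samples, 300, 120)) // window sees only the silent period
	fmt.Println(averageRate(samples, 300, 300)) // window covers the burst as well
}
```

With the default 120-second window the rate at the end of the silent period is 0, while a 300-second window still sees the burst and reports 2 messages per second. That gap is what an adaptive lookback is meant to close without manual tuning.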
Follow up work
Operational Flow:
Data Entry: Pods report their processed message counts periodically, which are saved into a TimestampedCounts structure and pushed onto a queue.
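As a rough picture of the data-entry step, the sketch below models a timestamped snapshot of per-pod counts and a bounded queue that drops its oldest entry when full. The field names, the `boundedQueue` type, and its methods are assumptions for illustration only; the real TimestampedCounts structure and queue in the rater may differ.

```go
package main

import (
	"fmt"
	"sync"
)

// TimestampedCounts here is a simplified, hypothetical stand-in: one snapshot
// of the processed-message counts reported by each pod at a point in time.
type TimestampedCounts struct {
	timestamp int64              // Unix seconds of the snapshot
	podCounts map[string]float64 // pod name -> reported count
}

// boundedQueue is a hypothetical fixed-capacity FIFO that evicts the oldest
// snapshot when a new one is appended to a full queue.
type boundedQueue struct {
	mu       sync.Mutex
	items    []*TimestampedCounts
	capacity int
}

func newBoundedQueue(capacity int) *boundedQueue {
	if capacity < 1 {
		capacity = 1 // keep at least one slot so Append never slices an empty queue
	}
	return &boundedQueue{capacity: capacity}
}

// Append pushes a snapshot, dropping the oldest one if the queue is full.
func (q *boundedQueue) Append(tc *TimestampedCounts) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.items) == q.capacity {
		q.items = q.items[1:]
	}
	q.items = append(q.items, tc)
}

// Items returns a copy of the queued snapshots, oldest first.
func (q *boundedQueue) Items() []*TimestampedCounts {
	q.mu.Lock()
	defer q.mu.Unlock()
	out := make([]*TimestampedCounts, len(q.items))
	copy(out, q.items)
	return out
}

func main() {
	q := newBoundedQueue(3)
	for _, ts := range []int64{0, 10, 20, 30} {
		q.Append(&TimestampedCounts{
			timestamp: ts,
			podCounts: map[string]float64{"pod-a": float64(ts)},
		})
	}
	items := q.Items()
	fmt.Println(len(items), items[0].timestamp)
}
```

A bounded queue keeps memory use constant while still retaining enough history to cover the largest lookback window the rater needs to consider.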
Lookback Adjustment Process:
When the value for a pod metric changes, new data is read