[WIP] implement autoscaling #242
base: main
internal/autoscaler/workerstate.go
Outdated
	return true
}

func (w *WorkerState) AddVramSample(workload *WorkloadState, metrics *WorkerMetrics) bool {
This could be calculated with a `select max` in a Greptime flow, which would offload the in-memory calculation.

func (f *FakeMetricsProvider) GetHistoryMetrics() []*WorkerMetrics {
	metrics := []*WorkerMetrics{}
	startTime := time.Now().Add(-7 * 24 * time.Hour)
The 7-day history window should be configurable.
internal/autoscaler/recommender.go
Outdated
)

var (
	safetyMarginFraction = flag.Float64("recommendation-margin-fraction", 0.15, `Fraction of usage added as the safety margin to the recommended request`)
This should live in the pool and scheduling config, not in flags.
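A rough sketch of what reading the margin from config could look like. `RecommenderConfig` and its field are hypothetical and stand in for wherever the pool / scheduling config ends up holding this value; the 0.15 default is taken from the flag above.

```go
package main

// RecommenderConfig is a hypothetical stand-in for the relevant part of
// the pool / scheduling config (names are assumptions, not the PR's types).
type RecommenderConfig struct {
	// SafetyMarginFraction is the fraction of usage added as a safety margin.
	// nil means "not set", so an explicit 0 can still disable the margin.
	SafetyMarginFraction *float64
}

// defaultSafetyMarginFraction mirrors the current flag default.
const defaultSafetyMarginFraction = 0.15

// marginFraction resolves the configured value or falls back to the default.
func (c RecommenderConfig) marginFraction() float64 {
	if c.SafetyMarginFraction == nil {
		return defaultSafetyMarginFraction
	}
	return *c.SafetyMarginFraction
}

// withSafetyMargin adds the configured margin on top of the observed usage.
func withSafetyMargin(usage float64, cfg RecommenderConfig) float64 {
	return usage * (1 + cfg.marginFraction())
}
```

Using a pointer keeps "unset" distinct from an explicit zero, so each pool can override the margin independently instead of sharing one process-wide flag.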
internal/autoscaler/estimator.go
Outdated
// of 1 sample per minute, this metric is equal to N.
// This implementation is a very simple heuristic which looks at the total count
// of samples and the time between the first and the last sample.
func getConfidence(s *WorkloadState, confidenceInterval time.Duration) float64 {
This does not seem to be a normalized confidence factor.
internal/autoscaler/workloadstate.go
Outdated
	Resources         tfv1.Resources
	AutoScalingConfig tfv1.AutoScalingConfig

	TflopsHistogram vpa.Histogram
This is coupled to the VPA algorithm and cannot be extended with others such as XGBoost / LightGBM / Prophet / MLP.
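A possible direction, sketched with hypothetical names only: have the state hold a small interface instead of `vpa.Histogram`, so a histogram-based estimator is just one implementation. `Estimator`, `percentileEstimator`, `maxEstimator`, and `workloadEstimators` below are all illustrative; the percentile type is a naive stand-in and does not use the `vpa.Histogram` API.

```go
package main

import (
	"math"
	"sort"
)

// Estimator is a hypothetical abstraction over the recommendation algorithm.
type Estimator interface {
	AddSample(value float64)
	// Estimate returns the recommended value, or false if there is no data.
	Estimate() (float64, bool)
}

// percentileEstimator is a naive stand-in for a histogram-based estimator.
type percentileEstimator struct {
	percentile float64 // in (0, 1]
	samples    []float64
}

func (e *percentileEstimator) AddSample(v float64) { e.samples = append(e.samples, v) }

func (e *percentileEstimator) Estimate() (float64, bool) {
	if len(e.samples) == 0 {
		return 0, false
	}
	sorted := append([]float64(nil), e.samples...)
	sort.Float64s(sorted)
	// Nearest-rank percentile, clamped to valid indices.
	idx := int(math.Ceil(e.percentile*float64(len(sorted)))) - 1
	if idx < 0 {
		idx = 0
	}
	if idx >= len(sorted) {
		idx = len(sorted) - 1
	}
	return sorted[idx], true
}

// maxEstimator shows that a different strategy fits the same interface.
type maxEstimator struct {
	max  float64
	seen bool
}

func (e *maxEstimator) AddSample(v float64) {
	if !e.seen || v > e.max {
		e.max, e.seen = v, true
	}
}

func (e *maxEstimator) Estimate() (float64, bool) { return e.max, e.seen }

// workloadEstimators sketches how the state could depend on the interface only.
type workloadEstimators struct {
	Tflops Estimator
	Vram   Estimator
}
```

A model-based implementation (for example one wrapping a forecasting library) would then plug in behind the same two methods without touching the workload state.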
…e support value all
No description provided.