-
Notifications
You must be signed in to change notification settings - Fork 511
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Prometheus] Add ray_clusters_created_total
metric
#3204
base: master
Are you sure you want to change the base?
[Prometheus] Add ray_clusters_created_total
metric
#3204
Conversation
ray_clusters_created_total
metric
@troychiu PTAL, thx |
9976c1b
to
92e7b83
Compare
c446b4a
to
6fa17fc
Compare
@@ -734,6 +739,19 @@ func (r *RayClusterReconciler) reconcilePods(ctx context.Context, instance *rayv | |||
} else if len(headPods.Items) == 0 { | |||
// Create head Pod if it does not exist. | |||
logger.Info("reconcilePods: Found 0 head Pods; creating a head Pod for the RayCluster.") | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should increment after successfully create the head pod to avoid incrementing multiple times?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. thx
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's still possible to be counted multiple times. KubeRay is possible to delete the head Pod if the restart policy is Never
.
@@ -734,6 +739,19 @@ func (r *RayClusterReconciler) reconcilePods(ctx context.Context, instance *rayv | |||
} else if len(headPods.Items) == 0 { | |||
// Create head Pod if it does not exist. | |||
logger.Info("reconcilePods: Found 0 head Pods; creating a head Pod for the RayCluster.") | |||
|
|||
if r.EnableMetrics { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to have a function in metrics.go handle EnableMetrics in a general way?
|
||
if r.EnableMetrics { | ||
creatorCRDType := getCreatorCRDType(*instance) | ||
// Increment the counter for ray_clusters_created_total metric based on the creator CRD type. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we can move these checkings to ray_cluster_metrics.go to simplify the operator's logic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or maybe still in this file but move to a function? What do you think it's better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a function called emitRayClusterMetrics
in raycluster_controller.go
. It takes a metric name and decides which metric to emit based on that.
WDYT?
a4290f9
to
8baf776
Compare
8baf776
to
ef1a740
Compare
"sigs.k8s.io/controller-runtime/pkg/metrics" | ||
) | ||
|
||
var rayClustersCreatedCounter = prometheus.NewCounterVec( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's better not to use package-level variable. Can you open another PR first to improve it by:
ray_cluster_metric.go
type RayClusterMetricCollector struct {
// Metrics
rayClusterCreatedTotal: ...
}
func NewRayClusterMetricCollector(...) ... {
}
main.go
: Initialize the collectors if this feature is enabled. Pass the collectors into controllers.
Example:
@@ -734,6 +739,19 @@ func (r *RayClusterReconciler) reconcilePods(ctx context.Context, instance *rayv | |||
} else if len(headPods.Items) == 0 { | |||
// Create head Pod if it does not exist. | |||
logger.Info("reconcilePods: Found 0 head Pods; creating a head Pod for the RayCluster.") | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's still possible to be counted multiple times. KubeRay is possible to delete the head Pod if the restart policy is Never
.
151382b
to
34ba8b5
Compare
Signed-off-by: win5923 <[email protected]>
Signed-off-by: win5923 <[email protected]>
Signed-off-by: win5923 <[email protected]>
34ba8b5
to
0a3d445
Compare
Why are these changes needed?
Add
ray_clusters_created_total
metric to track the total number of RayClusters created.created_by_ray_job
: true/false-
created_by_ray_service
: true/false-
namespace
Manual test:
Related issue number
Closes #3175
Checks