Monitoring API: Add Metric server config #2322

marioferh · 2025-05-15T14:04:47Z

First PR: #1929
Related: Enhancements Proposal openshift/enhancements#1627

openshift-ci · 2025-05-15T14:05:08Z

Hello @marioferh! Some important instructions when contributing to openshift/api:
API design plays an important part in the user experience of OpenShift and as such API PRs are subject to a high level of scrutiny to ensure they follow our best practices. If you haven't already done so, please review the OpenShift API Conventions and ensure that your proposed changes are compliant. Following these conventions will help expedite the api review process for your PR.

config/v1alpha1/types_cluster_monitoring.go

marioferh · 2025-07-16T16:46:49Z

@everettraven hi, updated, aligned https://github.com/openshift/api/pull/2148/files

marioferh · 2025-07-21T09:33:05Z

/retest-required

marioferh · 2025-07-21T18:47:58Z

/test e2e-aws-ovn-hypershift-conformance

config/v1alpha1/types_cluster_monitoring.go

everettraven · 2025-08-05T18:54:14Z

config/v1alpha1/types_cluster_monitoring.go

+	//
+	// see: https://kubernetes.io/docs/tasks/debug-application-cluster/audit/#audit-policy
+	// for more information about auditing and log levels.
+	// +optional


Should this be optional?

This is the only field in the struct and we generally want to avoid being able to set something like audit: {}

I think this made sense as a required field because the parent is optional and a pointer, meaning you can detect the need to set a default audit profile when the parent is nil controller side.

I could see this field being optional if you think this struct may expand in the future to support additional configuration options where you'd want the audit profile to still default to Metadata when unspecified.

If that is the case, we should add a minimum properties constraint to the Audit struct.

I've changed to optional because is a default value then, it is not necessary to add min properties, correct?

If you don't think you'll be expanding the options here, lets make this a required field when specifying the parent field (audit). Not specifying audit will mean we perform defaulting behavior.

Otherwise, lets add that at least one property must be specified so that setting audit: {} is never a valid configuration.

config/v1alpha1/types_cluster_monitoring.go

everettraven · 2025-08-05T19:06:50Z

config/v1alpha1/types_cluster_monitoring.go

+	// Valid values are positive integers, values over 10 are usually unnecessary.
+	// When omitted, this means no opinion and the platform is left to choose a reasonable default, that is subject to change over time.
+	// The current default value is `0`.
+	// default means minimal logging


Would it make sense for us to include here what commonly used ranges of values generally refer to?

As an example, thinking along the lines of:

When set to a value in the range 0 - 2, metrics server will log pertinent information, warnings, and errors.
When set to a value in the range 2 - 4, metrics server will log request information, all errors, and some warnings.
...

Without something like that included here, I wouldn't really know what value is reasonable to put here and what information I'll get out of that.

Building on this, I wonder if instead of accepting a numeric value we should follow the Alertmanager configuration approach and have more human-readable values which we assign an arbitrary verbosity value in the controller that stamps out the metrics server configuration?

done.

You are right, but the config of metric server is numeric:
`Logging flags:

--log-flush-frequency duration Maximum number of seconds between log flushes (default 5s) --log-json-info-buffer-size quantity [Alpha] In JSON format with split output streams, the info messages can be buffered for a while to increase performance. The default value of zero bytes disables buffering. The size can be specified as number of bytes (512), multiples of 1000 (1K), multiples of 1024 (2Ki), or powers of those (3M, 4G, 5Mi, 6Gi). Enable the LoggingAlphaOptions feature gate to use this. --log-json-split-stream [Alpha] In JSON format, write error messages to stderr and info messages to stdout. The default is to write a single stream to stdout. Enable the LoggingAlphaOptions feature gate to use this. --log-text-info-buffer-size quantity [Alpha] In text format with split output streams, the info messages can be buffered for a while to increase performance. The default value of zero bytes disables buffering. The size can be specified as number of bytes (512), multiples of 1000 (1K), multiples of 1024 (2Ki), or powers of those (3M, 4G, 5Mi, 6Gi). Enable the LoggingAlphaOptions feature gate to use this. --log-text-split-stream [Alpha] In text format, write error messages to stderr and info messages to stdout. The default is to write a single stream to stdout. Enable the LoggingAlphaOptions feature gate to use this. --logging-format string Sets the log format. Permitted formats: "json" (gated by LoggingBetaOptions), "text". (default "text")

-v, --v Level number for the log level verbosity
--vmodule pattern=N,... comma-separated list of pattern=N settings for file-filtered logging (only works for text log format)
`

The metric server may be numeric, but we don't have to follow that from the operator configuration perspective. In my opinion, we should focus on UX over choosing something that maps 1:1 with what the CLI option is for metrics-server.

For example, in the operator we can have a translation layer that takes something like Info and maps it to a -v=2.

I don't know what the best value options to put here are, but something like

api/operator/v1/types.go

Line 60 in 7605f9b

// Valid values are: "Normal", "Debug", "Trace", "TraceAll".

might be a good source of inspiration.

config/v1alpha1/types_cluster_monitoring.go

everettraven · 2025-08-19T18:30:53Z

config/v1alpha1/types_cluster_monitoring.go

+	//
+	// see: https://kubernetes.io/docs/tasks/debug-application-cluster/audit/#audit-policy
+	// for more information about auditing and log levels.
+	// +optional


If you don't think you'll be expanding the options here, lets make this a required field when specifying the parent field (audit). Not specifying audit will mean we perform defaulting behavior.

Otherwise, lets add that at least one property must be specified so that setting audit: {} is never a valid configuration.

config/v1alpha1/types_cluster_monitoring.go

config/v1alpha1/tests/clustermonitoring.config.openshift.io/ClusterMonitoringConfig.yaml

marioferh · 2025-08-27T10:28:37Z

/retest-required

everettraven · 2025-08-27T14:38:08Z

verify-crd-schema check is failing on a false positive. We can safely override this.

The API looks good to me, but we have a lint failure that needs to be resolved.

marioferh · 2025-08-27T16:01:14Z

verify-crd-schema check is failing on a false positive. We can safely override this.

The API looks good to me, but we have a lint failure that needs to be resolved.

not sure why I have different issues with linter in local and in the CI, maybe different scripts?

marioferh · 2025-08-28T07:55:55Z

/retest-required

JoelSpeed · 2025-08-28T09:15:00Z

not sure why I have different issues with linter in local and in the CI, maybe different scripts?

Make sure you've rebase, make sure your code is checked out under $GOPATH/src/github.com/openshift/api, run make -c tools clean and then run the scripts again

Signed-off-by: Mario Fernandez <[email protected]>

everettraven · 2025-08-29T15:08:38Z

/approve
/lgtm

Overriding verify-crd-schema as it is failing on a false positive
/override ci/prow/verify-crd-schema

openshift-ci · 2025-08-29T15:10:38Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: everettraven, marioferh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [everettraven]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci · 2025-08-29T15:10:50Z

@everettraven: Overrode contexts on behalf of everettraven: ci/prow/verify-crd-schema

In response to this:

/approve
/lgtm

Overriding verify-crd-schema as it is failing on a false positive
/override ci/prow/verify-crd-schema

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

openshift-ci · 2025-08-29T15:10:56Z

@marioferh: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
ci/prow/e2e-gcp	`557334f`	link	false	`/test e2e-gcp`
ci/prow/e2e-aws-ovn-hypershift	`557334f`	link	true	`/test e2e-aws-ovn-hypershift`
ci/prow/e2e-aws-serial-techpreview-2of2	`557334f`	link	true	`/test e2e-aws-serial-techpreview-2of2`
ci/prow/e2e-aws-serial-2of2	`557334f`	link	true	`/test e2e-aws-serial-2of2`

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

openshift-ci bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label May 15, 2025

openshift-ci bot requested review from deads2k and JoelSpeed May 15, 2025 14:08

marioferh force-pushed the metrics_server_config_monitoring_api branch from 8592b9b to 4a25999 Compare May 15, 2025 15:39

everettraven reviewed May 21, 2025

View reviewed changes

marioferh force-pushed the metrics_server_config_monitoring_api branch from 4a25999 to d2369a1 Compare July 16, 2025 16:45

marioferh force-pushed the metrics_server_config_monitoring_api branch from d2369a1 to c38a1eb Compare July 16, 2025 16:47

everettraven reviewed Jul 22, 2025

View reviewed changes

config/v1alpha1/types_cluster_monitoring.go Outdated Show resolved Hide resolved

config/v1alpha1/types_cluster_monitoring.go Outdated Show resolved Hide resolved

everettraven reviewed Aug 5, 2025

View reviewed changes

everettraven suggested changes Aug 19, 2025

View reviewed changes

marioferh force-pushed the metrics_server_config_monitoring_api branch from f88b3e5 to 6dfbbf6 Compare August 27, 2025 13:55

Add metric server API

557334f

Signed-off-by: Mario Fernandez <[email protected]>

marioferh force-pushed the metrics_server_config_monitoring_api branch from 928d99d to 557334f Compare August 29, 2025 08:33

marioferh mentioned this pull request Aug 29, 2025

Monitoring API: Add KubeStateMetrics api #2461

Open

openshift-ci bot assigned everettraven Aug 29, 2025

openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 29, 2025

openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 29, 2025

Monitoring API: Add Metric server config #2322

Are you sure you want to change the base?

Monitoring API: Add Metric server config #2322

Uh oh!

Conversation

marioferh commented May 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

openshift-ci bot commented May 15, 2025

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

marioferh commented Jul 16, 2025

Uh oh!

marioferh commented Jul 21, 2025

Uh oh!

marioferh commented Jul 21, 2025

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

marioferh commented Aug 27, 2025

Uh oh!

everettraven commented Aug 27, 2025

Uh oh!

marioferh commented Aug 27, 2025

Uh oh!

marioferh commented Aug 28, 2025

Uh oh!

JoelSpeed commented Aug 28, 2025

Uh oh!

everettraven commented Aug 29, 2025

Uh oh!

openshift-ci bot commented Aug 29, 2025

Uh oh!

openshift-ci bot commented Aug 29, 2025

Uh oh!

openshift-ci bot commented Aug 29, 2025

Uh oh!

Uh oh!

marioferh commented May 15, 2025 •

edited

Loading