Skip to content

Conversation

marioferh
Copy link
Contributor

@marioferh marioferh commented May 15, 2025

First PR: #1929
Related: Enhancements Proposal openshift/enhancements#1627

Copy link
Contributor

openshift-ci bot commented May 15, 2025

Hello @marioferh! Some important instructions when contributing to openshift/api:
API design plays an important part in the user experience of OpenShift and as such API PRs are subject to a high level of scrutiny to ensure they follow our best practices. If you haven't already done so, please review the OpenShift API Conventions and ensure that your proposed changes are compliant. Following these conventions will help expedite the api review process for your PR.

@openshift-ci openshift-ci bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label May 15, 2025
@openshift-ci openshift-ci bot requested review from deads2k and JoelSpeed May 15, 2025 14:08
@marioferh marioferh force-pushed the metrics_server_config_monitoring_api branch from 8592b9b to 4a25999 Compare May 15, 2025 15:39
@marioferh marioferh force-pushed the metrics_server_config_monitoring_api branch from 4a25999 to d2369a1 Compare July 16, 2025 16:45
@marioferh
Copy link
Contributor Author

@marioferh marioferh force-pushed the metrics_server_config_monitoring_api branch from d2369a1 to c38a1eb Compare July 16, 2025 16:47
@marioferh
Copy link
Contributor Author

/retest-required

@marioferh
Copy link
Contributor Author

/test e2e-aws-ovn-hypershift-conformance

//
// see: https://kubernetes.io/docs/tasks/debug-application-cluster/audit/#audit-policy
// for more information about auditing and log levels.
// +optional
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be optional?

This is the only field in the struct and we generally want to avoid being able to set something like audit: {}

I think this made sense as a required field because the parent is optional and a pointer, meaning you can detect the need to set a default audit profile when the parent is nil controller side.

I could see this field being optional if you think this struct may expand in the future to support additional configuration options where you'd want the audit profile to still default to Metadata when unspecified.

If that is the case, we should add a minimum properties constraint to the Audit struct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed to optional because is a default value then, it is not necessary to add min properties, correct?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you don't think you'll be expanding the options here, lets make this a required field when specifying the parent field (audit). Not specifying audit will mean we perform defaulting behavior.

Otherwise, lets add that at least one property must be specified so that setting audit: {} is never a valid configuration.

// Valid values are positive integers, values over 10 are usually unnecessary.
// When omitted, this means no opinion and the platform is left to choose a reasonable default, that is subject to change over time.
// The current default value is `0`.
// default means minimal logging
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense for us to include here what commonly used ranges of values generally refer to?

As an example, thinking along the lines of:

When set to a value in the range 0 - 2, metrics server will log pertinent information, warnings, and errors.
When set to a value in the range 2 - 4, metrics server will log request information, all errors, and some warnings.
...

Without something like that included here, I wouldn't really know what value is reasonable to put here and what information I'll get out of that.

Building on this, I wonder if instead of accepting a numeric value we should follow the Alertmanager configuration approach and have more human-readable values which we assign an arbitrary verbosity value in the controller that stamps out the metrics server configuration?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

You are right, but the config of metric server is numeric:
`Logging flags:

  --log-flush-frequency duration         Maximum number of seconds between log flushes (default 5s)
  --log-json-info-buffer-size quantity   [Alpha] In JSON format with split output streams, the info messages can be buffered for a while to increase performance. The default value of zero bytes disables buffering. The size can be specified as number of bytes (512), multiples of 1000 (1K), multiples of 1024 (2Ki), or powers of those (3M, 4G, 5Mi, 6Gi). Enable the LoggingAlphaOptions feature gate to use this.
  --log-json-split-stream                [Alpha] In JSON format, write error messages to stderr and info messages to stdout. The default is to write a single stream to stdout. Enable the LoggingAlphaOptions feature gate to use this.
  --log-text-info-buffer-size quantity   [Alpha] In text format with split output streams, the info messages can be buffered for a while to increase performance. The default value of zero bytes disables buffering. The size can be specified as number of bytes (512), multiples of 1000 (1K), multiples of 1024 (2Ki), or powers of those (3M, 4G, 5Mi, 6Gi). Enable the LoggingAlphaOptions feature gate to use this.
  --log-text-split-stream                [Alpha] In text format, write error messages to stderr and info messages to stdout. The default is to write a single stream to stdout. Enable the LoggingAlphaOptions feature gate to use this.
  --logging-format string                Sets the log format. Permitted formats: "json" (gated by LoggingBetaOptions), "text". (default "text")

-v, --v Level number for the log level verbosity
--vmodule pattern=N,... comma-separated list of pattern=N settings for file-filtered logging (only works for text log format)
`

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The metric server may be numeric, but we don't have to follow that from the operator configuration perspective. In my opinion, we should focus on UX over choosing something that maps 1:1 with what the CLI option is for metrics-server.

For example, in the operator we can have a translation layer that takes something like Info and maps it to a -v=2.

I don't know what the best value options to put here are, but something like

// Valid values are: "Normal", "Debug", "Trace", "TraceAll".
might be a good source of inspiration.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wdyt?

//
// see: https://kubernetes.io/docs/tasks/debug-application-cluster/audit/#audit-policy
// for more information about auditing and log levels.
// +optional
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you don't think you'll be expanding the options here, lets make this a required field when specifying the parent field (audit). Not specifying audit will mean we perform defaulting behavior.

Otherwise, lets add that at least one property must be specified so that setting audit: {} is never a valid configuration.

@marioferh
Copy link
Contributor Author

/retest-required

@marioferh marioferh force-pushed the metrics_server_config_monitoring_api branch from f88b3e5 to 6dfbbf6 Compare August 27, 2025 13:55
@everettraven
Copy link
Contributor

verify-crd-schema check is failing on a false positive. We can safely override this.

The API looks good to me, but we have a lint failure that needs to be resolved.

@marioferh
Copy link
Contributor Author

verify-crd-schema check is failing on a false positive. We can safely override this.

The API looks good to me, but we have a lint failure that needs to be resolved.

not sure why I have different issues with linter in local and in the CI, maybe different scripts?

@marioferh
Copy link
Contributor Author

/retest-required

@JoelSpeed
Copy link
Contributor

not sure why I have different issues with linter in local and in the CI, maybe different scripts?

Make sure you've rebase, make sure your code is checked out under $GOPATH/src/github.com/openshift/api, run make -c tools clean and then run the scripts again

Signed-off-by: Mario Fernandez <[email protected]>
@marioferh marioferh force-pushed the metrics_server_config_monitoring_api branch from 928d99d to 557334f Compare August 29, 2025 08:33
@everettraven
Copy link
Contributor

/approve
/lgtm

Overriding verify-crd-schema as it is failing on a false positive
/override ci/prow/verify-crd-schema

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 29, 2025
Copy link
Contributor

openshift-ci bot commented Aug 29, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: everettraven, marioferh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 29, 2025
Copy link
Contributor

openshift-ci bot commented Aug 29, 2025

@everettraven: Overrode contexts on behalf of everettraven: ci/prow/verify-crd-schema

In response to this:

/approve
/lgtm

Overriding verify-crd-schema as it is failing on a false positive
/override ci/prow/verify-crd-schema

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Copy link
Contributor

openshift-ci bot commented Aug 29, 2025

@marioferh: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-gcp 557334f link false /test e2e-gcp
ci/prow/e2e-aws-ovn-hypershift 557334f link true /test e2e-aws-ovn-hypershift
ci/prow/e2e-aws-serial-techpreview-2of2 557334f link true /test e2e-aws-serial-techpreview-2of2
ci/prow/e2e-aws-serial-2of2 557334f link true /test e2e-aws-serial-2of2

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants