@maltesander maltesander commented Oct 27, 2025

Description

Part of stackabletech/issues#747
This PR adds a metrics service with additional Prometheus annotations.
The metrics service is also added to the TLS certificate so that it can be scraped.

[Screenshots: hdfs_target_health_1, hdfs_target_health_2]

BREAKING:

  • Renamed the <stacklet>-<role>-<rolegroup> service to <stacklet>-<role>-<rolegroup>-headless.
  • Switched metrics port exposure and annotations to native Prometheus metrics instead of JMX.
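
To illustrate the rename in the first item, here is a minimal sketch of how the old and new service names relate. The helper names and the example values (`simple-hdfs`, `namenode`, `default`) are hypothetical and only serve to show the pattern; the operator's own naming code may look different.

```python
def legacy_service_name(stacklet: str, role: str, rolegroup: str) -> str:
    """Service name before this change: <stacklet>-<role>-<rolegroup>."""
    return f"{stacklet}-{role}-{rolegroup}"


def headless_service_name(stacklet: str, role: str, rolegroup: str) -> str:
    """Service name after this change: <stacklet>-<role>-<rolegroup>-headless."""
    return f"{legacy_service_name(stacklet, role, rolegroup)}-headless"


if __name__ == "__main__":
    # Hypothetical stacklet, role, and rolegroup names.
    print(legacy_service_name("simple-hdfs", "namenode", "default"))
    print(headless_service_name("simple-hdfs", "namenode", "default"))
```

Anything that refers to the old service name (for example client configuration or network policies) would need to be updated to the `-headless` variant.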

Definition of Done Checklist

  • Not all of these items are applicable to all PRs; the author should update this template to leave only the boxes that are relevant
  • Please make sure all these things are done and tick the boxes

Author

  • Changes are OpenShift compatible
  • CRD changes approved
  • CRD documentation for all fields, following the style guide.
  • Helm chart can be installed and deployed operator works
  • Integration tests passed (for non trivial changes)
  • Changes need to be "offline" compatible
  • Links to generated (nightly) docs added
  • Release note snippet added

Reviewer

  • Code contains useful comments
  • Code contains useful logging statements
  • (Integration-)Test cases added
  • Documentation added or updated. Follows the style guide.
  • Changelog updated
  • Cargo.toml only contains references to git tags (not specific commits or branches)

Acceptance

  • Feature Tracker has been updated
  • Proper release label has been added
  • Links to generated (nightly) docs added
  • Release note snippet added
  • Add type/deprecation label & add to the deprecation schedule
  • Add type/experimental label & add to the experimental features tracker

@maltesander maltesander self-assigned this Oct 27, 2025
@maltesander maltesander added the release-note/action-required Denotes a PR that introduces potentially breaking changes that require user action. label Oct 27, 2025
@sbernauer sbernauer moved this to Development: Waiting for Review in Stackable Engineering Oct 27, 2025
@sbernauer sbernauer moved this from Development: Waiting for Review to Development: In Progress in Stackable Engineering Oct 27, 2025
@sbernauer sbernauer moved this from Development: In Progress to Development: Waiting for Review in Stackable Engineering Oct 27, 2025
@adwk67 adwk67 self-requested a review October 27, 2025 09:32
@adwk67 adwk67 moved this from Development: Waiting for Review to Development: In Review in Stackable Engineering Oct 27, 2025
adwk67 previously approved these changes Oct 27, 2025

@adwk67 adwk67 left a comment

LGTM!

adwk67 commented Oct 27, 2025

Do we need a changelog entry, as this is sort-of user-facing?

@adwk67 adwk67 left a comment

Thanks!

@maltesander maltesander enabled auto-merge October 27, 2025 13:18
@maltesander maltesander added this pull request to the merge queue Oct 27, 2025
@maltesander maltesander moved this from Development: In Review to Development: Done in Stackable Engineering Oct 27, 2025
Merged via the queue into main with commit e360278 Oct 27, 2025
17 checks passed
@maltesander maltesander deleted the chore/ensure-metrics-correctly-exposed branch October 27, 2025 13:21
@lfrancke lfrancke moved this from Development: Done to Acceptance: In Progress in Stackable Engineering Nov 4, 2025
@maltesander

Release snippet

This is a combined summary of this PR and the headless revert #726.

The metrics format changed (native Prometheus vs. JMX), so this is potentially breaking.

Previously, the hdfs-operator did not expose Prometheus annotations containing the http(s) scheme, the metrics path, or the metrics port. This PR adds them, which allows custom relabel configs in Prometheus to use these annotations to scrape the metrics endpoints.
The metrics service now serves native Prometheus metrics instead of JMX metrics; the JMX metrics are reachable under the `jmx-metrics` port (note: this changes the metrics output).
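
As a rough illustration of what a relabel config can derive from such annotations, the sketch below assembles a scrape URL from a service's annotations. The annotation keys (`prometheus.io/scrape`, `prometheus.io/scheme`, `prometheus.io/path`, `prometheus.io/port`) are an assumption based on the common community convention; this PR text does not list the exact keys. The host, port, and path in the example are hypothetical.

```python
from typing import Mapping, Optional

# Assumed annotation keys (common convention, not confirmed by the PR text).
SCRAPE = "prometheus.io/scrape"
SCHEME = "prometheus.io/scheme"
PATH = "prometheus.io/path"
PORT = "prometheus.io/port"


def scrape_url(host: str, annotations: Mapping[str, str]) -> Optional[str]:
    """Build the URL a scraper would target, or None if scraping is not enabled."""
    if annotations.get(SCRAPE, "false").lower() != "true":
        return None
    port = annotations.get(PORT)
    if port is None:
        return None
    # Fall back to Prometheus' usual defaults when an annotation is missing.
    scheme = annotations.get(SCHEME, "http")
    path = annotations.get(PATH, "/metrics")
    if not path.startswith("/"):
        path = "/" + path
    return f"{scheme}://{host}:{port}{path}"


if __name__ == "__main__":
    # Hypothetical service host and annotation values.
    example = {SCRAPE: "true", SCHEME: "https", PATH: "/prom", PORT: "9871"}
    print(scrape_url("example-metrics.default.svc.cluster.local", example))
```

In an actual Prometheus setup the same mapping would be expressed as relabel rules on the service discovery metadata labels; the function above only mirrors that logic for readability.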

Labels

release-note/action-required Denotes a PR that introduces potentially breaking changes that require user action.

Projects

Status: Acceptance: In Progress

4 participants