flb_metrics: change hot_reloaded_times from gauge to counter#11489

Open
eschabell wants to merge 1 commit into fluent:master from eschabell:erics_metric_fb_hot_reloaded_times_fix

Conversation

@eschabell
Contributor

@eschabell eschabell commented Feb 23, 2026

Registering hot_reloaded_times as a gauge causes incorrect behavior with PromQL
functions like rate() and increase(), which expect counters.

  • src/flb_metrics.c: replace cmt_gauge_create() with cmt_counter_create() for hot_reloaded_times in attach_hot_reload_info()
  • src/flb_metrics.c: replace struct cmt_gauge with struct cmt_counter
  • src/flb_metrics.c: replace cmt_gauge_set() with cmt_counter_set()

Fixes #11479.


Testing

  • [ N/A ] Example configuration file for the change
  • [ N/A ] Debug log output from testing the change
  • [ N/A ] Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • [ N/A ] Run local packaging test showing all targets (including any new ones) build.
  • [ N/A ] Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • [ YES ] Documentation required for this feature

Docs PR for this change: fluent/fluent-bit-docs#2383

Backporting

  • [ N/A ] Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

  • Bug Fixes
    • Changed the hot reload metric (fluentbit_hot_reloaded_times) from a gauge to a counter so that it works with counter-oriented PromQL functions such as rate() and increase().

Signed-off-by: Eric D. Schabell <eric@schabell.org>
@coderabbitai

coderabbitai bot commented Feb 23, 2026

Walkthrough

Changed the hot reload metric from a gauge to a counter in flb_metrics.c. The implementation now uses cmt_counter_create() instead of cmt_gauge_create() and updates the metric with cmt_counter_set() instead of cmt_gauge_set(), maintaining the same observable value source and hostname dimension handling.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Metric Type Implementation (src/flb_metrics.c) | Changed fluentbit_hot_reloaded_times metric from gauge to counter type. Replaced cmt_gauge_create() with cmt_counter_create(), added counter creation result check, and switched metric update from cmt_gauge_set() to cmt_counter_set(). |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Suggested reviewers

  • edsiper

Poem

🐰 A gauge became a counter bright,
Hot reloads now counted right!
No more confusion, up it goes,
Like my fluffy tail's proud pose! 🔄

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: converting the hot_reloaded_times metric from a gauge to a counter implementation. |
| Linked Issues check | ✅ Passed | The code changes directly address issue #11479 by replacing cmt_gauge with cmt_counter for the hot_reloaded_times metric [#11479]. |
| Out of Scope Changes check | ✅ Passed | All changes in src/flb_metrics.c are directly scoped to converting the hot_reloaded_times metric type as required by the linked issue. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/flb_metrics.c (1)

367-370: Pre-existing: return values of all attach_* helpers are discarded.

All four attach_* calls return -1 on metric-creation failure, but those errors are swallowed here. This is not introduced by this PR and is out of scope, but worth tracking separately since a failed cmt_counter_create for any metric goes completely unnoticed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/flb_metrics.c` around lines 367 - 370, The four helper calls
attach_uptime, attach_process_start_time_seconds, attach_build_info, and
attach_hot_reload_info can return -1 on failure but their results are ignored
here; update the caller to check each return value and propagate or handle
errors (e.g., if any helper returns -1, immediately return -1 from the
surrounding function or log and clean up as appropriate) so a failed
cmt_counter_create is not silently swallowed—ensure you reference those helper
functions by name when adding the checks and propagate the error consistently.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/flb_metrics.c`:
- Line 346: The call to cmt_counter_set in attach_hot_reload_info (and likewise
in attach_uptime) currently drops the return value, hiding failures when
allow_reset==0; update the code to check the int return from cmt_counter_set and
handle errors: if it returns -1 log a debug/error with context (function name,
hostname/metric name and the attempted val) or propagate the error upward as
appropriate so failures are observable; locate calls to cmt_counter_set in
attach_hot_reload_info and attach_uptime and add the simple return-value check
plus logging/propagation.


ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bc966ed and 61140f0.

📒 Files selected for processing (1)
  • src/flb_metrics.c

@eschabell
Contributor Author

@cosmo0920 or @lecaros: the linting errors are in the Windows build files. Can someone take a look at this change and review it?

@cosmo0920 cosmo0920 added this to the Fluent Bit v5.0 milestone Feb 24, 2026

Development

Successfully merging this pull request may close these issues.

Metric fluentbit_hot_reloaded_times is currently a gauge impl, should be a counter impl

2 participants