flb_metrics: change hot_reloaded_times from gauge to counter#11489
eschabell wants to merge 1 commit into fluent:master
Registering hot_reloaded_times as a gauge causes incorrect behavior with PromQL
functions like rate() and increase(), which expect counters.
- src/flb_metrics.c: replace cmt_gauge_create() with cmt_counter_create()
for hot_reloaded_times in attach_hot_reload_info()
- src/flb_metrics.c: replace struct cmt_gauge with struct cmt_counter
- src/flb_metrics.c: replace cmt_gauge_set() with cmt_counter_set()
Fixes fluent#11479.
Signed-off-by: Eric D. Schabell <eric@schabell.org>
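To illustrate why counter semantics matter for rate() and increase(), here is a minimal C sketch of a counter-aware increase over a series of scraped samples. It is a simplified model, not the Prometheus implementation: `sketch_increase` is a hypothetical name, and the extrapolation that the real increase() performs is left out. The assumption it encodes is that a counter only goes up, so any drop in value is read as a reset to zero.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical, simplified model of a counter-aware increase over scraped
 * samples. Assumption: the series is a counter, so a drop in value means the
 * process restarted and the counter began again from zero.
 */
double sketch_increase(const double *samples, size_t n)
{
    double total = 0.0;
    size_t i;

    assert(samples != NULL || n == 0);

    for (i = 1; i < n; i++) {
        if (samples[i] >= samples[i - 1]) {
            /* normal case: add the growth since the previous sample */
            total += samples[i] - samples[i - 1];
        }
        else {
            /* value dropped: treat it as a reset and count the new value */
            total += samples[i];
        }
    }

    return total;
}
```

For the samples 0, 1, 2, 0, 1 (two reloads, a restart, then one more reload) this sums to 3. The reset rule is only meaningful for a series that is declared and behaves as a counter, which is the motivation the commit message gives for the type change.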
📝 Walkthrough: Changed the hot reload metric from a gauge to a counter.
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/flb_metrics.c (1)
367-370: Pre-existing: return values of all attach_* helpers are discarded.
All four attach_* calls return -1 on metric-creation failure, but those errors are swallowed here. This is not introduced by this PR and is out of scope, but worth tracking separately since a failed cmt_counter_create for any metric goes completely unnoticed.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/flb_metrics.c` around lines 367 - 370, The four helper calls attach_uptime, attach_process_start_time_seconds, attach_build_info, and attach_hot_reload_info can return -1 on failure but their results are ignored here; update the caller to check each return value and propagate or handle errors (e.g., if any helper returns -1, immediately return -1 from the surrounding function or log and clean up as appropriate) so a failed cmt_counter_create is not silently swallowed—ensure you reference those helper functions by name when adding the checks and propagate the error consistently.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/flb_metrics.c`:
- Line 346: The call to cmt_counter_set in attach_hot_reload_info (and likewise
in attach_uptime) currently drops the return value, hiding failures when
allow_reset==0; update the code to check the int return from cmt_counter_set and
handle errors: if it returns -1 log a debug/error with context (function name,
hostname/metric name and the attempted val) or propagate the error upward as
appropriate so failures are observable; locate calls to cmt_counter_set in
attach_hot_reload_info and attach_uptime and add the simple return-value check
plus logging/propagation.
---
Nitpick comments:
In `@src/flb_metrics.c`:
- Around line 367-370: The four helper calls attach_uptime,
attach_process_start_time_seconds, attach_build_info, and attach_hot_reload_info
can return -1 on failure but their results are ignored here; update the caller
to check each return value and propagate or handle errors (e.g., if any helper
returns -1, immediately return -1 from the surrounding function or log and clean
up as appropriate) so a failed cmt_counter_create is not silently
swallowed—ensure you reference those helper functions by name when adding the
checks and propagate the error consistently.
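The two review comments above amount to checking a return value at the set call and propagating it through the attach helper to its caller. A minimal sketch of that pattern follows, using a toy counter rather than the cmetrics API: `toy_counter`, `toy_counter_set`, `toy_attach_hot_reload_info`, and `toy_attach_all` are hypothetical names, and the rule that a set to a lower value fails when `allow_reset` is 0 is an assumption taken from the review comment, not verified against the library.

```c
#include <assert.h>
#include <stdio.h>

/* Toy stand-in for a counter; NOT the cmetrics struct cmt_counter. */
struct toy_counter {
    double value;
    int allow_reset;
};

/* Assumed rule: setting a lower value fails unless resets are allowed. */
int toy_counter_set(struct toy_counter *c, double val)
{
    assert(c != NULL);

    if (val < c->value && c->allow_reset == 0) {
        return -1;
    }
    c->value = val;
    return 0;
}

/* Hypothetical helper: check the set result instead of dropping it. */
int toy_attach_hot_reload_info(struct toy_counter *c, double reloads)
{
    int ret;

    ret = toy_counter_set(c, reloads);
    if (ret == -1) {
        fprintf(stderr, "[metrics] could not set hot_reloaded_times to %f\n",
                reloads);
        return -1;
    }
    return 0;
}

/* Hypothetical caller: propagate the helper's failure upward. */
int toy_attach_all(struct toy_counter *c, double reloads)
{
    if (toy_attach_hot_reload_info(c, reloads) == -1) {
        return -1;
    }
    return 0;
}
```

With a counter that starts at 0 and has `allow_reset` set to 0, `toy_attach_all(&c, 2.0)` succeeds, while a later `toy_attach_all(&c, 1.0)` returns -1 and leaves the value at 2. Whether the real caller should abort or only log on failure is a design decision for the maintainers.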
@cosmo0920 or @lecaros, the linting errors are in the Windows build files. Can someone look at this change and review?
Testing
If this is a change to packaging of containers or native binaries then please confirm it works for all targets.
ok-package-test label to test for all targets (requires maintainer to do).
Documentation
Docs PR for this change: fluent/fluent-bit-docs#2383
Backporting
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.