enhancement(kubernetes_logs source): add end-to-end acknowledgement support #25325
connoryy wants to merge 9 commits into vectordotdev:master
…upport Wire acknowledgements through the kubernetes_logs source so file checkpoints only advance after downstream sinks confirm delivery. Adds SourceAcknowledgementsConfig, FinalizerEntry, and OrderedFinalizer integration. Includes mock-based tests for the ack flow.
B1: Add Source::new_test() that accepts a pre-built Client and custom
logs directory, bypassing kubeconfig/env-var resolution. Add
logs_dir_override to K8sPathsProvider and path_helpers so tests
can glob a tempdir instead of /var/log/pods.
S1: Remove dead AckingMode::Unfinalized variant and its handling code.
S3/S5: Remove SourceConfigTest trait entirely. The test now calls
Source::new_test() directly, which is simpler and avoids
duplicating SourceConfig's doc comments and method signatures.
S4: Add checkpoint_does_not_advance_without_ack test that verifies
checkpoints do NOT advance when events are rejected (not acked).
N1: Combine super::super imports into a single use statement.
N2: Remove redundant #[cfg(any(test, feature = "all-integration-tests"))]
inside the already-#[cfg(test)] module.
N3: Add Clone, Copy derives to FinalizerEntry.
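The property that the S4 test checks, that a checkpoint moves only when an event is reported as delivered, can be modelled in a few lines. This is a minimal sketch and not the code in this PR: `Status`, `Entry`, and `Checkpoints` are hypothetical stand-ins for Vector's `BatchStatus`, the PR's `FinalizerEntry`, and the checkpointer.

```rust
use std::collections::HashMap;

/// Stand-in for the delivery status a sink reports (hypothetical, not Vector's type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Status {
    Delivered,
    Rejected,
}

/// Simplified counterpart of a finalizer entry: which file, and how far it was read.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Entry {
    file_id: u64,
    offset: u64,
}

/// Hypothetical in-memory checkpoint store keyed by file id.
#[derive(Default)]
struct Checkpoints {
    positions: HashMap<u64, u64>,
}

impl Checkpoints {
    /// Apply finalized entries in emission order; only `Delivered` moves a checkpoint.
    fn apply(&mut self, finalized: &[(Entry, Status)]) {
        for (entry, status) in finalized {
            if *status == Status::Delivered {
                self.positions.insert(entry.file_id, entry.offset);
            }
        }
    }

    /// Last acknowledged offset for a file, if any.
    fn get(&self, file_id: u64) -> Option<u64> {
        self.positions.get(&file_id).copied()
    }
}

fn main() {
    let mut checkpoints = Checkpoints::default();
    let entry = Entry { file_id: 1, offset: 128 };

    // A rejected event leaves the checkpoint untouched.
    checkpoints.apply(&[(entry, Status::Rejected)]);
    assert_eq!(checkpoints.get(1), None);

    // A delivered event advances it to the entry's offset.
    checkpoints.apply(&[(entry, Status::Delivered)]);
    assert_eq!(checkpoints.get(1), Some(128));
}
```

On restart, a reader following this model would resume from the last stored offset, so the rejected data would be read again.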
Add required changelog fragment for kubernetes_logs acknowledgement feature. Fix end-to-end acknowledgements URL in CUE documentation to match the pattern used by all other source docs.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0780694bd4
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
let (batch, receiver) = BatchNotifier::new_with_receiver();
event = event.with_batch_notifier(&batch);
let entry = FinalizerEntry {
    file_id: line.file_id,
    offset: line.end_offset,
Move checkpoint finalizers after partial-line merge
When acknowledgements are enabled, this attaches a batch notifier/finalizer to every raw line before parser/merge_partial_events runs, but PartialEventMergeState::add_event keeps the first partial event and drops later fragments after folding their bytes into it. Those dropped fragments still finalize their own notifiers as Delivered, so on a merged multiline message that is ultimately rejected by a sink, the first fragment reports Rejected while later fragments report Delivered, and OrderedFinalizer will still advance the checkpoint to a later offset. With auto_partial_merge enabled by default, this can skip unacknowledged data on restart.
When auto_partial_merge is enabled (the default), partial log lines are merged into a single event. Previously, each raw line received its own BatchNotifier before the merge step. When the merger dropped fragment events after extracting their bytes, those fragments' finalizers fired as Delivered, advancing the checkpoint past unacknowledged data. If a sink later rejected the merged event, the offsets of the dropped fragments had already been checkpointed.
Fix: transfer finalizers from each fragment to the bucket event during merge, so the merged event carries all fragments' finalizers. The checkpoint advances only after the complete merged event is delivered.
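The idea behind the fix can be sketched with plain data structures. This is an illustrative model only, assuming simplified types: `Finalizer`, `Event`, `merge_fragment`, and `checkpoint_after` are hypothetical names and do not correspond to Vector's `EventFinalizers` or `PartialEventMergeState` APIs.

```rust
/// Hypothetical stand-in for a per-line checkpoint finalizer.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Finalizer {
    end_offset: u64,
}

/// Simplified event: payload bytes plus the finalizers travelling with it.
#[derive(Debug)]
struct Event {
    message: Vec<u8>,
    finalizers: Vec<Finalizer>,
}

/// Build the event for one raw line, carrying a single finalizer.
fn raw_line(text: &str, end_offset: u64) -> Event {
    Event {
        message: text.as_bytes().to_vec(),
        finalizers: vec![Finalizer { end_offset }],
    }
}

/// Fold `fragment` into `bucket`, moving its finalizers instead of dropping them.
fn merge_fragment(bucket: &mut Event, mut fragment: Event) {
    bucket.message.extend_from_slice(&fragment.message);
    bucket.finalizers.append(&mut fragment.finalizers);
}

/// Offset that may be checkpointed once the merged event has been resolved.
fn checkpoint_after(event: &Event, delivered: bool) -> Option<u64> {
    if delivered {
        event.finalizers.iter().map(|f| f.end_offset).max()
    } else {
        None
    }
}

fn main() {
    let mut bucket = raw_line("first half, ", 12);
    merge_fragment(&mut bucket, raw_line("second half", 23));

    // The merged event owns both fragments' finalizers.
    assert_eq!(bucket.finalizers.len(), 2);
    // Rejected: no fragment offset is checkpointed.
    assert_eq!(checkpoint_after(&bucket, false), None);
    // Delivered: the checkpoint may move to the end of the last fragment.
    assert_eq!(checkpoint_after(&bucket, true), Some(23));
}
```

Because no finalizer is dropped on its own, none can report Delivered ahead of the merged event, which is the failure the review describes.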
…pstream-k8s-logs-acks
… into connor/upstream-k8s-logs-acks
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 42650c5a52
// When not dealing with end-to-end acknowledgements, just
// clone the global shutdown to stop the checkpoint writer.
(None, global_shutdown.clone().map(|_| ()).boxed())
};
Use lifecycle shutdown for checkpointer in non-ack mode
In the non-ack path, shutdown_checkpointer is tied to global_shutdown instead of the lifecycle slot shutdown, so when Lifecycle::run initiates shutdown because the event-processing task ends first (e.g. downstream closes), file_server.run waits forever on checkpoint_task_handle because shutdown_checkpointer never resolves until a global shutdown happens. This can deadlock source teardown/reloads outside full process shutdown; shutdown_checkpointer should be driven by the same per-lifecycle shutdown signal that stops file_server.
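The suggested wiring can be illustrated with standard-library threads and channels. This is a rough model under stated assumptions, not Vector's async code: `spawn_checkpointer` and `teardown_with_lifecycle_signal` are hypothetical names, and the channels stand in for the global and per-lifecycle shutdown futures.

```rust
use std::sync::mpsc::{self, Receiver, RecvTimeoutError};
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical checkpoint writer: flushes periodically until its shutdown
/// channel fires or closes, then does one final flush and exits.
fn spawn_checkpointer(shutdown: Receiver<()>) -> JoinHandle<u32> {
    thread::spawn(move || {
        let mut flushes = 0;
        loop {
            match shutdown.recv_timeout(Duration::from_millis(5)) {
                // No signal yet: count a periodic flush and keep going.
                Err(RecvTimeoutError::Timeout) => flushes += 1,
                // Signal received or sender dropped: final flush, then exit.
                Ok(()) | Err(RecvTimeoutError::Disconnected) => return flushes + 1,
            }
        }
    })
}

/// Tear down one source lifecycle without any process-wide shutdown.
fn teardown_with_lifecycle_signal() -> u32 {
    // The global signal stays open for the whole function, as during a reload.
    let (_global_tx, _global_rx) = mpsc::channel::<()>();
    let (lifecycle_tx, lifecycle_rx) = mpsc::channel::<()>();

    // The checkpointer listens to the lifecycle signal, not the global one.
    let checkpointer = spawn_checkpointer(lifecycle_rx);

    // The event-processing task ended, so the lifecycle signals its slots.
    lifecycle_tx.send(()).expect("checkpointer is still running");

    // Joining returns even though the global signal never fired.
    checkpointer.join().expect("checkpointer panicked")
}

fn main() {
    let flushes = teardown_with_lifecycle_signal();
    assert!(flushes >= 1);
}
```

If the checkpointer were handed the receiver of the global channel instead, the `join` in this model would block until the process itself shut down, which mirrors the hang described above.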
Summary
Adds end-to-end acknowledgement support to the kubernetes_logs source. When acknowledgements are enabled (via a downstream sink), file checkpoints only advance after downstream sinks confirm event delivery. This prevents data loss when the source crashes or restarts: unacknowledged events are re-read from the checkpoint position.
Based on initial work by @ganelo (Orri), cleaned up and rebased onto current master.
Motivation
The kubernetes_logs source currently returns can_acknowledge() -> false, meaning it cannot participate in Vector's end-to-end acknowledgement system. When a downstream sink fails or the source crashes, events between the last checkpoint and the crash point are lost, because the checkpoint was advanced before delivery was confirmed.
The file source already supports acknowledgements using the same OrderedFinalizer + BatchNotifier pattern. This PR brings the same capability to kubernetes_logs, which shares the underlying file_source infrastructure.
Approach
The implementation mirrors the file source's acknowledgement pattern:
- can_acknowledge() returns true: enables the topology's ack propagation
- OrderedFinalizer<FinalizerEntry>: receives ack status from downstream sinks in order
- BatchNotifier per batch: attached to events before emitting; downstream sinks update status on delivery
- checkpoints.update() only called when BatchStatus::Delivered is received
Key design decisions:
- FinalizerEntry is defined locally (not imported from the file source) since sources-kubernetes_logs doesn't depend on sources-file
- Source::new_test() method added for mock-based testing with a pre-built Kubernetes client
- acknowledgements config field uses the standard SourceAcknowledgementsConfig + bool_or_struct pattern
Vector configuration
No source-level configuration is needed. Acknowledgements activate automatically when a downstream sink has acknowledgements.enabled = true, via propagate_acknowledgements().
How did you test this PR?
3 new tests + 51 existing tests pass:
- file_start_position_server_restart_with_file_rotation_no_acknowledge
- file_start_position_server_restart_with_file_rotation_acknowledged
- checkpoint_does_not_advance_without_ack
Tests use a mock Kubernetes client (Source::new_test()) with a configurable logs directory, following the same pattern as the file source tests.
All 51 existing kubernetes_logs tests pass unchanged.
Change Type
Is this a breaking change?
Does this PR include user facing changes?
no-changelog label to this PR.
References