feat(memory-sync): pause Composio periodic tick when Memory Tree is toggled off (#2719 follow-up)#2825
Conversation
…oggled off (tinyhumansai#2719 follow-up) Closes the named follow-up listed in tinyhumansai#2719's PR body: > Pause the 20-min Composio fetch loop on toggle > (`memory_sync/composio/periodic.rs` needs a `Notify` signal). Before this change, flipping the Memory Tree toggle off (via the Settings panel tinyhumansai#2719 added) only paused LLM-bound workers via the scheduler gate; the periodic Composio fetch loop kept ticking every 20 minutes, calling `list_connections` and walking every active provider regardless. That silently burned API budget and contradicted the user's "stop background work" intent. ## Approach Rather than adding a separate `Notify` channel and wiring it through `scheduler_gate::update_config`, this PR routes the pause signal through the existing `scheduler_gate::current_policy()` surface that the rest of the memory subsystem already consults: - `SchedulerGateMode::Off` → `Policy::Paused { UserDisabled }` - No live session → `Policy::Paused { SignedOut }` A new `periodic_pause_reason()` helper checks for both and is called at the top of `run_one_tick` *before* config load / auth-client build, so a paused session never even resolves the API token. Other `PauseReason` variants (battery / CPU pressure) are intentionally **not** treated as a global pause here — periodic Composio fetch is network-light enough that those signals shouldn't gate it. They already throttle LLM-bound work through the regular gate. ## Trade-off vs Notify YOMXXX's PR body suggested a `Notify` signal. The gate-check approach trades instant wake-up (which `Notify` would give) for ~20 min worst-case resume latency: when the user toggles back on, the next tick fires normally. The simplicity is worth it because: - The whole point is "stop doing work" — a tick that fires every 20 min but immediately returns `Ok(())` is functionally equivalent to "loop paused" from an API-budget / battery perspective. - The `current_policy()` surface is already the canonical "should the memory subsystem do background work?" signal used by other workers, so the periodic loop honouring it stays consistent with the rest of the system. - Faster wake-up can be layered on later (separate PR) without changing the gate-check contract here. ## Tests 8/8 pass in `memory_sync::composio::periodic::tests` (was 6, +2 new): - `periodic_pause_reason_returns_none_when_gate_not_initialised` — pins the OnceLock-uninitialised default (`Policy::Normal` → helper returns `None`), guarding against an "always pause" regression that would silently break every periodic-driven test. - `run_one_tick_does_not_short_circuit_when_not_paused` — runs the full tick under the standard `OPENHUMAN_WORKSPACE`-isolated env and confirms the new early-return arm doesn't fire spuriously. Coverage for the *paused* arm is deferred to integration tests that own `scheduler_gate::init_global(...)` — `OnceLock` makes per-test paused-state isolation brittle in unit tests, and the paused branch itself is a trivial match on `Policy::Paused`. `cargo check --lib`, `cargo fmt --check`, and `cargo test --tests --no-run` all clean.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (1)
📝 WalkthroughWalkthroughThe Composio periodic tick now queries scheduler-gate policy via ChangesScheduler-gate pause awareness in periodic sync
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related issues
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
oxoxDev
left a comment
There was a problem hiding this comment.
Author: @justinhsu1477 (
CONTRIBUTOR— external contributor)
Clean small change. Gate-check placement is correct (runs before config load + auth-client build), current_policy() fallback semantics are right (Policy::Normal when STATE uninitialised), and the rationale for excluding OnBattery/CpuPressure from the periodic gate (network-light work) is sound.
Three minor nits + two questions — none blocking:
Nitpick 1 — src/openhuman/memory_sync/composio/periodic.rs:166-176 — helper can collapse via existing Policy::pause_reason()
The two explicit Policy::Paused { reason: X } arms reproduce what Policy::pause_reason() (policy.rs:70) already returns. Same semantics, half the lines, and the allow-list of "reasons that gate the periodic tick" becomes a clean matches!:
fn periodic_pause_reason() -> Option<PauseReason> {
let reason = current_policy().pause_reason()?;
matches!(reason, PauseReason::UserDisabled | PauseReason::SignedOut).then_some(reason)
}Behaviour identical for the four enum variants today. Future PauseReason additions (e.g. OnBattery, CpuPressure per #1073) still won't gate periodic — same as today's wildcard _ => None. Tradeoff: loses the explicit Policy::Paused { reason: X } pattern which doubled as documentation. OK to keep current form if you prefer the explicitness.
Nitpick 2 — src/openhuman/memory_sync/composio/periodic.rs:444-462 — new "does-not-short-circuit" test is functionally identical to pre-existing run_one_tick_returns_ok_when_no_client
Both set up ENV_LOCK + tempdir + OPENHUMAN_WORKSPACE, call run_one_tick() under 5s timeout, assert Ok(_). With no client available, both exit at the create_composio_client early-return — neither actually proves the new gate-check didn't fire first. A regression that made the helper "always pause" would still make this test pass.
Either drop the duplicate, or strengthen — e.g. via tracing_test::traced_test + logs_contain to assert the "scheduler-gate paused — skipping tick" line was NOT emitted:
#[traced_test]
#[tokio::test]
async fn run_one_tick_does_not_short_circuit_when_not_paused() {
// existing setup…
let _ = run_one_tick().await;
assert!(!logs_contain("scheduler-gate paused"));
}Nitpick 3 — src/openhuman/memory_sync/composio/periodic.rs:155-161 — skip-tick log is debug! only
Fleet operators investigating "why is Composio not syncing?" won't see anything at info/warn levels. A single info!-once-per-policy-transition log when entering the paused arm would aid prod observability without spamming (debug fires every 20 min × every user). Optional — debug is defensible given per-LLM-call gating elsewhere is also debug.
Questions
periodic.rs:160— Should the helper also gate onPauseReason::Unknown? Documented as "safe fallback" — current code lets the tick proceed when reason isUnknown. Worth a one-line rationale comment.periodic.rs:153-161— PR body mentions "Faster wake-up can be layered on later (separate PR) without changing the gate-check contract here." Has a follow-up issue been filed for theNotifywakeup path so the ~20-min worst-case resume doesn't get forgotten? Per [[feedback_deferred_classifier_arm_trap]] — defer-without-tracking burns events.
CI required checks all green. coderabbit APPROVED. APPROVE.
Five items raised in @oxoxDev's tinyhumansai#2825 review (all marked non-blocking, APPROVE-d) — addressing the substantive ones: * **Nitpick 1 (helper redundancy)** — collapsed the two explicit `Policy::Paused { reason: X }` match arms into a single `current_policy().pause_reason()` call + `matches!` allow-list. Same semantics, half the lines, future `PauseReason` variants stay explicitly opt-in via the `matches!` allow-list rather than the wildcard `_ => None` arm. * **Nitpick 2 (test functionally identical)** — dropped `run_one_tick_does_not_short_circuit_when_not_paused`. As @oxoxDev noted, both it and the pre-existing `run_one_tick_returns_ok_when_no_client` exited at the same `create_composio_client` no-client branch — neither actually proved the new gate-check arm fired in the right direction. The helper-level `periodic_pause_reason_returns_none_when_gate_not_initialised` test already pins the wiring. Asserting log-line absence via `tracing-test` would prove the early-return logically but adds a new dev-dependency for one assertion. * **Nitpick 3 (skip log only at `debug!`)** — added transition-edge `info!` logging. New `LAST_TICK_WAS_PAUSED: AtomicBool` tracks the previous tick's pause state; the log site emits `info!` exactly once when the loop crosses the boundary in either direction, and stays at `debug!` for the already-paused / already-running steady state. Fleet operators investigating "why is Composio not syncing?" now see a single breadcrumb at default log level without `info` spam every 20 min. * **Question 1 (`PauseReason::Unknown` rationale)** — added explicit doc comment on `periodic_pause_reason()` explaining why `Unknown` is intentionally allowed through (`Unknown` is documented in `scheduler_gate::policy` as a transitional / not-yet-resolved fallback; letting the tick proceed keeps periodic sync running through brief transitions instead of pausing on stale unresolved state). * **Question 2 (Notify wakeup follow-up)** — filed a dedicated tracking issue (tinyhumansai#2831) so the ~20 min worst-case resume latency doesn't get forgotten. Per the "defer-without-tracking burns events" feedback pattern. Tests: 7/7 pass in `memory_sync::composio::periodic::tests` (was 8; -1 from dropped duplicate, net same coverage on substantive paths). `cargo check --lib`, `cargo fmt --check`, and `cargo test --tests --no-run` all clean.
4a2dfb3
|
@oxoxDev thanks for the thorough pass — all five addressed in `4a2dfb3e`:
Tests: 7/7 pass (was 8 with the duplicate dropped; same coverage on substantive paths). `cargo check --lib` + `cargo fmt --check` + `cargo test --tests --no-run` clean. |
graycyrus
left a comment
There was a problem hiding this comment.
Clean follow-up to #2719. Gate-check placement is correct — fires at the top of run_one_tick before config load and auth-client build, so a paused session never resolves the API token. The transition logging design (info once on boundary crossing, debug on steady state) is the right call for fleet observability without spam.
The AtomicBool with Relaxed ordering is sound given the singleton scheduler loop — no cross-thread visibility concern. The Unknown pause-reason passthrough rationale in the doc comment is clear and defensible (avoid pausing on transient unresolved state).
One minor thing to track: the PR body acknowledges ~20 min worst-case resume latency if someone wants faster Notify-based wakeup, but I don't see a follow-up issue filed for that. Not blocking — the gate-check approach fully satisfies the stated goal — but worth filing so the tradeoff doesn't get forgotten.
Approved.
`Policy` was imported alongside `PauseReason` but only `PauseReason` is used directly — `periodic_pause_reason()` delegates the `Policy::Paused` destructure to `current_policy().pause_reason()`. Remove the dead import to silence the `unused_import` compiler warning. Co-Authored-By: Claude <noreply@anthropic.com>
6ee23f7
|
Actionable comments posted: 0 |
sanil-23
left a comment
There was a problem hiding this comment.
Round-3 shepherd review: implementation looks good; CI signal (current or expected after re-run) clean. Approving on behalf of sanil-23.
Closes the named follow-up listed in #2719's PR body:
Before this change, flipping the Memory Tree toggle off (via the Settings panel #2719 added) only paused LLM-bound workers via the scheduler gate; the periodic Composio fetch loop kept ticking every 20 minutes, calling `list_connections` and walking every active provider regardless. That silently burned API budget and contradicted the user's "stop background work" intent.
Approach — gate-check, not Notify
Rather than adding a separate `Notify` channel and wiring it through `scheduler_gate::update_config`, this PR routes the pause signal through the existing `scheduler_gate::current_policy()` surface that the rest of the memory subsystem already consults:
A new `periodic_pause_reason()` helper checks the relevant pause reasons and is called at the top of `run_one_tick` — before config load and auth-client build — so a paused session never even resolves the API token.
Battery / CPU pause is intentionally not treated as a tick-blocker here; periodic Composio fetch is network-light enough that those signals shouldn't gate it. They already throttle LLM-bound work through the regular gate.
Trade-off vs Notify
YOMXXX's PR body suggested a `Notify` signal. The gate-check approach trades instant wake-up (which `Notify` would give) for ~20 min worst-case resume latency: when the user toggles back on, the next tick fires normally. The simplicity is worth it because:
Tests
8/8 pass in `memory_sync::composio::periodic::tests` (was 6, +2 new):
Coverage for the paused arm is deferred to integration tests that own `scheduler_gate::init_global(...)` — `OnceLock` makes per-test paused-state isolation brittle in unit tests, and the paused branch itself is a trivial match on `Policy::Paused`.
Refs #2719, #1856 (Part 1).
Summary by CodeRabbit
New Features
Tests