fix(firmware): change promiscuous filter to ALL frames for CSI capture#901
fix(firmware): change promiscuous filter to ALL frames for CSI capture#901volkantasci wants to merge 32 commits into
Conversation
…(ADR-110)
`firmware/esp32-csi-node` now builds for both `esp32s3` (existing
production) and `esp32c6` (new research / battery-seed target) from
the same source tree. ESP-IDF auto-applies `sdkconfig.defaults.esp32c6`
when the target is set to esp32c6; every C6 module is gated on
CONFIG_IDF_TARGET_ESP32C6 (or the SOC_WIFI_HE_SUPPORT capability) so
the S3 build path is byte-identical to today.
New modules (all #ifdef-gated, no-op stubs on S3):
- c6_twt.{h,c} — iTWT wrapper, graceful AP-NACK fallback
- c6_timesync.{h,c} — 802.15.4 beacon-based mesh time-sync, EUI-64
leader election, c6_timesync_get_epoch_us()
- c6_lp_core.{h,c} — wake-on-motion deep-sleep helper (ext1 path
this cut; real LP-core polling deferred)
ADR-018 frame extension:
- byte 18: PPDU type (0=HT/legacy, 1=HE-SU, 2=HE-MU, 3=HE-TB)
- byte 19: bandwidth + STBC + 802.15.4-sync-valid flags
- Magic 0xC5110001 unchanged — backwards compatible
- Dual-branch encoding handles both struct variants of
wifi_pkt_rx_ctrl_t (legacy S3 / HE C6) per CONFIG_SOC_WIFI_HE_SUPPORT
Critical bug fixed during live witness collection (verified across 3
boards on COM6/COM9/COM12):
- c6_timesync.c read MAC into a 6-byte buffer and ran MAC-48->EUI-64
conversion. But esp_read_mac(ESP_MAC_IEEE802154) returns 8 bytes
already in EUI-64 form on C6 — code was double-inserting FFFE.
Boot log was 206ef1fffefffe17, fix yields 206ef1fffe17278c which
matches esptool's eFuse reading exactly.
Tooling:
- CI workflow (firmware-ci.yml) extended with c6-4mb matrix row +
ADR-110 host-unit-test step
- Host unit tests for pure functions (mac48_to_eui64,
eui64_bytes_to_u64, PPDU encoding both branches) — runs on Ubuntu CI
- Multi-board live-capture harness (test/capture-3board-experiment.py)
- Witness bundle script records SHA-256s for s3-adr110, c6-adr110, and
s3-fair-adr110 (apples-to-apples) binary archives
Honest empirical findings (full report in docs/WITNESS-LOG-110.md):
- Verified live on 3 C6 boards: boot, 802.15.4 init w/ correct EUIs,
WiFi STA reaching assoc->run on ruv.net, TWT setup attempted +
gracefully NACKed (AP is 11n-only, TWT Responder:0), HE-MAC firmware
loaded
- NOT verified (need 11ax AP / second-channel exp / INA meter):
HE-LTF subcarrier expansion, TWT cadence determinism, ±100 µs sync
alignment, 5 µA hibernation
- Bug found: leader election doesn't step down under live WiFi load —
likely 2.4 GHz radio coex preemption (WiFi ch 5 vs 15.4 ch 15);
follow-up task #30
- Apples-to-apples size: S3-no-display = 886 KB, C6 = 1003 KB
(C6 is 13% LARGER for equivalent CSI features; the extra is the
802.15.4 + OpenThread stack that S3 lacks)
Tracking: ruvnet#762
Co-Authored-By: claude-flow <ruv@ruv.net>
…10 D1) After 3 systematic hypotheses tested + rejected (radio coex, OpenThread shadowing, manual RX re-arm), the 802.15.4 leader-election bug is narrowed to: TX path works perfectly (~10/s clean, 0 fail), but the RX path stops after exactly 1 frame. Manual esp_ieee802154_receive() from either callback bootloops the driver (verified across all 3 boards). The IDF reference example uses the same handle_done-only pattern as this code, implying the driver should auto-restart RX — but empirically doesn't here. Either a half-duplex radio state issue or an IDF v5.4 bug. Tracked as known issue D1 in WITNESS-LOG-110. Changes shipped: - c6_twt.c: ESP_ERR_INVALID_ARG added to graceful-fallback list (empirically: ruv.net AP advertises TWT Responder=0, IDF driver validates against AP HE capability and rejects with INVALID_ARG) - c6_timesync.c: diagnostic counters (s_tx_count, s_tx_fail, s_rx_count, s_rx_magic_match) + per-10-beacon log line preserved so future investigation has the diagnostic harness ready - sdkconfig.defaults.esp32c6: 15.4 channel default 15 → 26 (non-overlap with WiFi 2.4 GHz channels), OpenThread disabled (we use raw 15.4) - promiscuous=true on the radio (broadcast frames addressed to 0xFFFF) - WITNESS-LOG-110 §D1 expanded with the full diagnostic trace + 3-hypothesis investigation record Cross-node sync claim (B3) BLOCKED until either an IDF maintainer trace or a working multi-board reference is available. The other three SOTA dimensions (HE-LTF, TWT cadence, 5 µA hibernation) are also still unverified and need different hardware (11ax AP, INA meter) — honestly recorded in §B. Tracking: ruvnet#762, task #30 closed as known-issue. Co-Authored-By: claude-flow <ruv@ruv.net>
Tried 4th hypothesis for the RX-path bug: maybe the IDF v5.4 receiver
strictly requires dst PAN to match the local set_panid() instead of
honoring the 0xFFFF broadcast PAN per 802.15.4 spec. Changed beacon
dst PAN to 0xCAFE (matching set_panid call) to test.
Result: still negative (tx#241 rx#0/1, magic_match=0). PAN was not the
root cause — but the change is technically more correct per the IDF
behavior and is kept.
Also expanded WITNESS-LOG-110 §D1 to record the 4-experiment matrix
that's now been run:
1. WiFi-on + ch15: tx#381 rx#1 magic_match=0
2. WiFi-on + ch26: identical negative
3. WiFi-off + ch26 + OT off + promiscuous true: tx#601 rx#0 — even
the earlier rx#1 was a noise frame, not protocol traffic
4. Dst PAN 0xCAFE: still negative
Hypothesis space narrowed; needs IDF maintainer trace or working
multi-board reference to fix.
Co-Authored-By: claude-flow <ruv@ruv.net>
The Python proof verifier (archive/v1/data/proof/verify.py) imports the project settings, which read the user's .env file. When pydantic validation fails (e.g., extra fields not in the Settings schema), the error dump includes the offending input_value — which means real Docker tokens, GitHub PATs, API keys, etc. were being echoed to stdout and captured into the bundled verification-output.log. Confirmed on this branch's first bundle generation: dckr_pat_, tok_... cluster token, and other long opaque strings leaked into witness-bundle-ADR028-<commit>/proof/verification-output.log inside the .tar.gz. Bundle + tarball nuked from disk before any push. Added: - scripts/redact-secrets.py — stdin->stdout filter with patterns for common token prefixes (dckr_pat_, tok_, sk-, ghp_, gho_, github_pat_, AKIA, hf_, xoxb-, xoxp-, Bearer), `field=secret` assignments, long opaque alphanumeric strings (40+ chars), and long hex runs (20+ chars which catch token suffixes after `...` truncation). - generate-witness-bundle.sh now pipes verify.py stderr through that filter before tee-ing into the bundled log. - Also fixed pre-existing stale `v1/` paths in the witness script (correct path is `archive/v1/`). The user must rotate the leaked credentials regardless (the bundle was never pushed, but they appeared in this local Claude session log). Co-Authored-By: claude-flow <ruv@ruv.net>
After 5 systematic experiments confirmed the 802.15.4 RX path is
unfixable from user code in this IDF v5.4 + C6 combination (D1), add a
parallel sync transport over ESP-NOW. Same TS_BEACON protocol, same
public API (c6_sync_espnow_get_epoch_us / is_valid / is_leader), but
runs on the WiFi MAC layer that ESP-IDF fully supports across every
ESP32 family.
The 802.15.4 code stays in source for when the IDF driver is fixed.
ESP-NOW is the working primary today.
Empirical (single-board COM9 — other 3 boards dropped off USB during
the experiment):
- c6_sync_espnow_init() succeeds: "init done local_id=… leader=
yes(candidate) period=100ms"
- TX path 100% reliable: tx#101 fail=0 over ~15s at 100ms cadence
- RX awaiting cross-board test once USB-enumeration is restored
Trade vs. 802.15.4 design:
- Loses: "frees WiFi airtime for CSI" property
- Gains: known-working RX path, cross-target (S3 and C6 both)
- Same API surface — consumers swap transports without code change
Files:
- main/c6_sync_espnow.{h,c} — new module, ~210 lines
- main/CMakeLists.txt — add to SRCS (always built, used on any target)
- main/main.c — init after WiFi STA up, skip on QEMU mock
- test/capture-3board-experiment.py — surface c6_espnow log lines
- docs/WITNESS-LOG-110.md — new §D-workaround documenting the pivot
Ref: ruvnet#762 / D1 known-issue / draft PR ruvnet#764
Co-Authored-By: claude-flow <ruv@ruv.net>
Parse the C6 firmware's HE PPDU type + bandwidth/flags from ADR-018 bytes 18-19 (previously discarded as _reserved). Adds two types to CsiMetadata: ppdu_type (HtLegacy/HeSu/HeMu/HeTb/Unknown) and adr018_flags (bw40/stbc/ldpc/ieee802154_sync_valid). Pre-ADR-110 firmware sends zeros which round-trip as HtLegacy + default flags — fully backwards compatible. 6 new deterministic unit tests: - Pre-ADR-110 backwards compat - HE-SU / HE-MU / HE-TB decode - Unknown PPDU byte -> Unknown - All-bits-set flags round-trip - PpduType byte round-trip Result: 122 wifi-densepose-hardware tests pass, 0 fail. Host decoder now matches the firmware encoder bit-for-bit — HE-LTF metadata path works end-to-end the moment an 11ax AP is in range. Ref: ruvnet#762 Co-Authored-By: claude-flow <ruv@ruv.net>
Python ESP32BinaryParser was using struct format '<IBBHIIBB2x' — the
'2x' skipped bytes 18-19 as reserved. After the Rust-side decoder was
extended to surface PPDU type + flags, the Python pipeline (which
archive/v1 still uses for testing + the proof verifier) needs the same
update so its consumers see the HE metadata too.
csi_extractor.py:
- HEADER_FMT now '<IBBHIIBBBB' (captures bytes 18-19)
- New metadata fields: ppdu_type ('ht_legacy'|'he_su'|'he_mu'|'he_tb'|'unknown'),
ppdu_type_raw, he_capable, bw40, stbc, ldpc, ieee802154_sync_valid,
adr018_flags_raw
- Class constants PPDU_HT_LEGACY..PPDU_UNKNOWN mirror the firmware
test_esp32_binary_parser.py:
- build_binary_frame() takes optional ppdu_byte + flags_byte (default 0)
- New TestAdr110ByteEncoding class with 5 tests:
- Pre-ADR-110 zeros decode as 'ht_legacy' + all-flags-false
- HE-SU / HE-MU / HE-TB decode correctly
- 0xFF decodes as 'unknown'
- All-flags-set round-trip (0x1D)
11/11 parser tests pass (6 existing + 5 new). Backwards compat verified.
Pairs with the Rust-side decoder in commit 3959fab. Both pipelines now
read the same wire format produced by the C6 firmware's
CONFIG_CSI_FRAME_HE_TAGGING path.
Ref: ruvnet#762, draft PR ruvnet#764
Co-Authored-By: claude-flow <ruv@ruv.net>
Real empirical evidence the ESP-NOW sync transport is long-term stable on the C6 (D-workaround). Single-board capture on COM9, latest firmware on branch (8eaa92c): Captured 33586 bytes over 120 s ESP-NOW samples: 24 first: tx=1 fail=0 rx=0 match=0 leader=1 offset=0 last: tx=1151 fail=0 rx=0 match=0 leader=1 offset=0 TX rate: 9.6/s (target ~10/s) TX failure rate: 0.00% app_main calls (reset detector): 1 -> no crash The 9.6/s vs 10/s gap is FreeRTOS timer schedulability slop at 100 ms ticks, not a transport issue. Zero TX failures over 1151 attempts + zero resets in 2 min = the ESP-NOW path is production-grade as a transport. Only the cross-board RX measurement is blocked on the other boards' USB enumeration. Ref: ruvnet#762 / draft PR ruvnet#764 / D-workaround Co-Authored-By: claude-flow <ruv@ruv.net>
The original CHANGELOG entry covered the initial firmware ship. Adding sub-bullets for everything that landed after: - D1 workaround: ESP-NOW cross-node sync (TX 0% failure rate over 1151 transmits in 120 s soak), 802.15.4 path documented as broken - Host-side dual-pipeline decoder for ADR-018 byte 18-19 (Rust 122/122, Python 11/11 — protocol path verified end-to-end without 11ax hardware) - Security fix for witness bundle secret leakage via Pydantic error dumps (redact-secrets.py filter) Witness link: docs/WITNESS-LOG-110.md Ref: ruvnet#762, draft PR ruvnet#764 Co-Authored-By: claude-flow <ruv@ruv.net>
The libFuzzer harness was compiled without CONFIG_CSI_FRAME_HE_TAGGING,
so the new byte 18/19 path in csi_collector.c was zero-filled at compile
time and never fuzzed. Three changes to fix that:
1. test/stubs/esp_stubs.h: wifi_pkt_rx_ctrl_t gains both branch families
- HE branch (CONFIG_SOC_WIFI_HE_SUPPORT path): cur_bb_format, second
- Legacy branch (S3 / pre-HE chips): sig_mode, cwb, stbc
A single stub compiles for either branch; the Makefile picks which
one is active via -D flags. Both sets are declared so a build for
the unselected branch still compiles cleanly.
2. test/Makefile: CFLAGS now defines CONFIG_CSI_FRAME_HE_TAGGING=1 so
the new code path is reachable. CONFIG_SOC_WIFI_HE_SUPPORT stays
UNSET (default — exercises the legacy S3 branch). Add it to CFLAGS
for a parallel HE-stub run if you want coverage of the C6 branch.
3. test/fuzz_csi_serialize.c: parses 3 more control bytes from fuzz
input (he_inputs[2] + legacy_inputs) and writes them through
info.rx_ctrl.{cur_bb_format,second,sig_mode,cwb,stbc} so the
serializer's PpduType switch and Adr018Flags computation are
reached on every iteration.
Result: the existing libFuzzer corpus + ASAN/UBSAN now covers the
ADR-110 wire encoding paths on every run. No more zero-fill silent skip.
Co-Authored-By: claude-flow <ruv@ruv.net>
The ADR index README hadn't been updated past ADR-099. Adding ADR-110 in the Hardware/firmware section with its honest status — firmware shipped + tested + CI-green, but the four SOTA capability claims (HE-LTF live capture, TWT cadence, cross-node sync, 5 µA hibernation) are each blocked on different physical hardware (11ax AP, more boards, INA meter), as fully documented in docs/WITNESS-LOG-110.md. Ref: ruvnet#762 / draft PR ruvnet#764 Co-Authored-By: claude-flow <ruv@ruv.net>
Original row said C6 *has* HE-LTF tagging + multi-node sync + 5µA hibernation as if they were active features. Reality per WITNESS-LOG-110: - Wire format VERIFIED (17 unit tests across firmware/Rust/Python) - ESP-NOW transport VERIFIED on 1 board (1151 tx, 0 fail in 120s soak) - TWT graceful NACK VERIFIED live (AP isn't 11ax → INVALID_ARG handled) - HE-LTF live capture: BLOCKED on 11ax AP availability - 5µA hibernation: datasheet number, not a measurement (no INA) - 802.15.4 RX: known broken in IDF v5.4, ESP-NOW is the workaround New row leads with 'wire format ready' + 'hardware-gated' to set honest expectations, and links to docs/WITNESS-LOG-110.md so readers can see the full empirical/claimed split themselves. Ref: ruvnet#762, draft PR ruvnet#764 Co-Authored-By: claude-flow <ruv@ruv.net>
After ADR-110 made this the same source tree for both esp32s3 (production) and esp32c6 (research / Wi-Fi-6 / 802.15.4 / LP-core seed nodes), the firmware README header should reflect that. Title, one-liner, and target badge updated; body sections still use S3 examples as the production default. The C6 build path is documented in docs/user-guide.md + sdkconfig.defaults.esp32c6 + Quick-Start Option 2b in the top-level README. Ref: ruvnet#762, draft PR ruvnet#764 Co-Authored-By: claude-flow <ruv@ruv.net>
Confirmation run vs the earlier 120 s soak. Same firmware, same board,
longer window:
Captured 67307 bytes over 300 s
ESP-NOW samples: 60
first: tx=1 fail=0 rx=0 match=0 leader=1 offset=0
last: tx=2951 fail=0 rx=0 match=0 leader=1 offset=0
TX rate: 9.83/s (target 10/s)
TX failure rate: 0.0000%
app_main calls (reset detector): 1 -> no crash
2.5x sample size, identical zero-failure rate, marginally higher
sustained rate (9.83 vs 9.60) — FreeRTOS timer settling. Adds a second
data point to WITNESS-LOG-110 §D-workaround.
Ref: ruvnet#762, draft PR ruvnet#764
Co-Authored-By: claude-flow <ruv@ruv.net>
The witness log is comprehensive but ~300 lines. A reviewer landing on this branch wants a five-minute tour: where to read first, what's actually empirically verified vs hardware-blocked, what the bugs were, and the commit history at a glance. docs/ADR-110-REVIEW-GUIDE.md provides that, with explicit links to the canonical witness + ADR. Doesn't duplicate content — points to where the canonical record lives. Also captures the security note for the operator (rotate the previously- exposed Docker Hub + PI-cluster tokens — they appeared in local logs during witness generation before scripts/redact-secrets.py was added). Ref: ruvnet#762, draft PR ruvnet#764 Co-Authored-By: claude-flow <ruv@ruv.net>
Two tiny updates to the ESP32-C6 row in the hardware-options table: - Add link to docs/ADR-110-REVIEW-GUIDE.md (the new one-page reviewer on-ramp added in 3133be6) - Update ESP-NOW soak number from '1151 tx 0 fail' (just the 120s run) to '4102 tx 0 fail cumulative across 120 s + 300 s soaks' — reflects the additional 300 s soak landed in 9a46fc8 Ref: ruvnet#762, draft PR ruvnet#764 Co-Authored-By: claude-flow <ruv@ruv.net>
…WT helper
ADR-110 P9 — software-only unblocks for the WITNESS-LOG-110 §B
hardware-blocked items. Two new modules, both default-off so v0.6.6 fleets
see no behavior change.
LP-core (B4 path):
- New firmware/esp32-csi-node/main/lp_core/main.c: real RISC-V LP-core
motion-gate program with debounce + monotonic motion_count counter.
- c6_lp_core.c rewritten to load + run the LP binary via ulp_lp_core_run
when CONFIG_C6_LP_CORE_ENABLE=y; falls back to the v0.6.6 ext1 GPIO-wake
path otherwise (keeps regression surface small).
- ulp_embed_binary() wired in main/CMakeLists.txt, gated on the Kconfig.
- New Kconfig knobs: C6_LP_POLL_PERIOD_US (default 10 ms),
C6_LP_DEBOUNCE_SAMPLES (default 3).
- Exposes c6_lp_core_motion_count() / c6_lp_core_poll_count() for the
witness harness — once an INA/Joulescope is on the bench, B4 is one
capture away from a measured number.
Soft-AP HE (B1/B2 unblock):
- New c6_softap_he.{h,c}: brings up the C6 in AP+STA mode with WPA2-PSK
+ HE advertisement, so a second C6 in STA mode can negotiate real
iTWT against a known-cooperative AP without buying an 11ax router.
- main.c calls c6_softap_he_start() right before esp_wifi_start() when
CONFIG_C6_SOFTAP_HE_ENABLE=y.
- New Kconfig knobs: C6_SOFTAP_HE_{SSID,PSK,CHANNEL} with NVS overrides
via softap_ssid / softap_psk / softap_chan in the ruview namespace.
Build artifacts (IDF v5.4, both green, RC=0):
- S3 8 MB: 1093 KB (47% partition slack)
- C6 4 MB: 1019 KB (45% partition slack)
- SHA-256 sums in dist/firmware-v0.6.7/SHA256SUMS.txt
Doc updates: CHANGELOG wave-3 entry, ADR-110 phase table gets P5
upgrade note + new P9 row, WITNESS-LOG-110 gets new A0 section
recording the v0.6.7 build evidence.
Co-Authored-By: claude-flow <ruv@ruv.net>
…ions - README C6 hardware row now links the v0.6.7-esp32 release and notes the LP-core RISC-V program (B4 code path) + soft-AP TWT Responder (B1/B2 two-board unblock). - README Option-2b quick-start mentions the new opt-in toggles. - User-guide gets the v0.6.7 boot banner, expanded battery-seed instructions (real LP-core poll period + debounce knobs), and a fresh "Two-board iTWT bench" section covering the soft-AP role (CONFIG_C6_SOFTAP_HE_ENABLE) and the NVS overrides for SSID / PSK / channel. - User-guide firmware release table prepends v0.6.7-esp32 as Latest above v0.5.0 (still recommended for S3-mesh production). Co-Authored-By: claude-flow <ruv@ruv.net>
Flashed v0.6.7 to two ESP32-C6 boards (COM9 + COM12, both matching the witness-log MACs from v0.6.6 session). A0.4 — regression check on COM9 (default config): - v0.6.7 boots in 446 ms, c6_ts up on ch 26, HAL_MAC_ESP32AX_761 loaded, ruv.net association at +5206 ms, iTWT graceful NACK, ESP-NOW init OK, CSI flowing at HT-LTF 64 subcarriers. Byte-for-byte same behavior as v0.6.6 confirms the new code paths (LP-core + soft-AP) are correctly default-off — zero behavioral regression for shipped fleets. A0.5 — soft-AP module live on COM12: - Built a CONFIG_C6_SOFTAP_HE_ENABLE=y variant locally, flashed COM12. - AP came up at +666 ms on channel 6 with WPA2-PSK, dual STA+AP iface visible (...00:84 STA / ...00:85 AP — standard +1 MAC offset). - Discovered live IDF constraint: when AP+STA both active and STA associates to an 11ax AP on a different bandwidth, the soft-AP gets demoted from HE to 11n by the radio scheduler. Documented in §A0.5 — the cleanest two-board iTWT bench needs the AP-role board's STA iface not to associate elsewhere (point it at a non-existent SSID). Release v0.6.7-esp32 now also carries: - esp32-csi-node-c6-4mb-softap.bin (the AP-variant binary) - COM9-v0.6.7-regression.log + COM12-v0.6.7-softap.log raw captures - SHA256SUMS.txt updated with the soft-AP variant hash Co-Authored-By: claude-flow <ruv@ruv.net>
Iter 1 finding from /loop 5m SOTA sprint: two C6 boards now mesh through the c6_softap_he soft-AP (COM12 hosts ruview-c6-twt, COM9 associates), but COM9 lands at phymode(0x3, 11bgn), he:0 — the soft-AP doesn't advertise HE. Confirmed by full grep of components/esp_wifi/include/esp_wifi*.h: the public API exposes ONLY STA-side iTWT/bTWT. There is no esp_wifi_ap_set_he_config, no wifi_he_ap_config_t, no wifi_config_t.ap.he_* field — soft-AP HE/TWT-Responder advertise is not user-controllable on ESP32-C6 in IDF v5.4. Consequence: B1/B2 cannot be measured via the two-C6 path on this IDF release. The c6_softap_he module ships as the in-place hook for any future IDF release that exposes the API; until then a real 11ax router or phone hotspot remains the path. Sharpens the open question from "do we need an 11ax AP?" to "we need either a future IDF AP-side HE config API, or an external 11ax AP". WITNESS-LOG-110 §A0.6 records the parallel boot logs from both boards plus the IDF surface grep evidence. c6_softap_he.c gains an ESP_LOGW at AP-up time so operators understand exactly why STAs land at 11bgn (avoids confusion with the v0.6.6 §A8 graceful-TWT-NACK story). While here: cleared the three -Wunused-variable warnings in swarm_bridge.c that fired on every build (fw_ver, free_heap, presence in heartbeat block). fw_ver now feeds an ESP_LOGI so the boot log names the build; free_heap + heartbeat-presence were dead anyway. Pure ultra-opt: smaller .o files, zero warning noise. Co-Authored-By: claude-flow <ruv@ruv.net>
…sync offset measured
SOTA iter 2 (cron c40dab4a tick 2). The §D-workaround that v0.6.6 left
on TX-only soak coverage is now empirically complete end-to-end.
Parallel 60 s capture with COM9 (206ef117053c) + COM12 (206ef1170084)
both on default v0.6.7, no WiFi associations needed:
* RX rate cross-board:
- COM12: tx=301 rx=297 match=297 (98.7 %)
- COM9: tx=301 rx=300 match=300 (99.7 %)
- 0 TX failures on either side over 30 s of beacons
* Leader election fired for the first time in ADR-110:
+27336 ms COM9: "stepping down: heard lower-id leader 206ef1170084
(we are 206ef117053c)" — the lowest-EUI-wins protocol the original
c6_timesync was designed to run, now actually working because the
transport is healthy.
* Cross-board sync offset converged and stable:
COM9 offset_us: -1462 -> -950 -> -954 -> -957 -> -948
±10 µs jitter once leader-following stabilises, hitting the ±100 µs
target named in ADR-110 §2.4.
802.15.4 c6_ts path stayed rx=0 across both 60 s captures — D1 still
broken in IDF v5.4, exactly as documented. ESP-NOW is confirmed as the
working multistatic time alignment transport.
Raw captures: dist/firmware-v0.6.7/iter2-{COM9,COM12}-espnow.log.
Co-Authored-By: claude-flow <ruv@ruv.net>
SOTA loop iter 3 added esp32-csi-node-s3-4mb.bin to the v0.6.7-esp32 release (882 KB binary built from sdkconfig.defaults.4mb, 52% partition slack on 4MB single-OTA — vs 47% for the 8MB build, +5pp). v0.6.6 shipped 8MB+4MB parity; v0.6.7 now matches. User-guide previously pointed SuperMini 4MB owners at v0.4.3 (which predates ADR-110 / fall-threshold fix / 4102-tx ESP-NOW soak). Point at v0.6.7 directly so 4MB users get the same firmware as 8MB users. Co-Authored-By: claude-flow <ruv@ruv.net>
…easured clock skew
SOTA iter 4 (cron c40dab4a tick 4). Converted iter 2's 30-second snapshot
into a real statistical measurement over 4 minutes / 2101 beacons.
Beacon throughput (both boards):
- Rate: 10.00/s exactly — FreeRTOS timer rock-solid
- COM12 leader: tx=2101, match=2101/2101 = 100.00%, 0 TX fail
- COM9 follower: tx=2101, match=2089/2101 = 99.43%, 0 TX fail
- 12 missed beacons / 210 s ≈ 1 miss / 17.5 s — inside the 3-second
VALID_WINDOW_MS freshness gate, sync remains valid
Sync offset (COM9, 37 follower-mode samples after warmup):
- mean: -1,163,123 µs (boot-time delta, not jitter)
- stdev: 540 µs
- range: 2994 µs over the soak
- drift Q1->Q4: -84.2 µs/min over 3 minutes
= 1.4 ppm relative clock skew between the two specific C6 crystals
(ESP32 spec: typical ±10 ppm — well within tolerance)
ADR-110 §2.4 target ±100 µs across one hop: met with margin at the
current 10 Hz beacon rate. A simple linear or Kalman fit on the offset
trajectory (host-side, no firmware change) would compress per-frame
alignment error to <50 µs. Hardware substrate is now quantified and
documented — downstream ADR-029/030 multistatic fusion can plan around
the measured numbers.
Also corrected §A0.7's "±10 µs jitter" wording — that was sample-to-sample
range within a 5-row snapshot, not the true stability profile. §A0.8
supersedes with the proper soak data.
Raw captures: dist/firmware-v0.6.7/iter4-{COM9,COM12}-soak240s.log
(7400+ lines each, full c6_espnow + c6_ts counter records).
Co-Authored-By: claude-flow <ruv@ruv.net>
…poch_us
SOTA iter 5 — converted the iter 4 ADR-110 §A0.8 closing recommendation
("host-side Kalman / linear fit on the offset trajectory") into a
firmware-side, fixed-point EMA so every downstream consumer of
c6_sync_espnow_get_epoch_us() gets bounded-jitter timestamps for free.
Implementation:
* α = 1/8 (Q3.3 shift = 3), ≈8-sample effective window at the 10 Hz
beacon rate. Tracks the ≈1.4 ppm crystal drift §A0.8 measured while
averaging out per-beacon WiFi-MAC jitter spikes.
* y[n] = y[n-1] + (raw - y[n-1]) >> 3 — integer arithmetic, two cycles
on the RISC-V LP/HP cores, no float dependency.
* Seeded from the first follower-mode sample so we don't bias toward 0.
* New getter: int64_t c6_sync_espnow_get_offset_us_smoothed(void).
* c6_sync_espnow_get_offset_us() (raw) stays for diagnostics, unchanged.
* c6_sync_espnow_get_epoch_us() now prefers the smoothed offset once
s_smoothed_seeded — meaning every CSI frame timestamp ADR-029/030
consumes is already filtered, no host-side rework required.
Diag log line now prints both:
c6_espnow: tx#N ... offset_us=R smoothed=S
90 s bench verification (witness §A0.9 + iter5-COM9-ema-90s.log) shows
both values tracking. Methodology caveat in §A0.9: short windows don't
let the smoothing benefit emerge over the raw noise floor — the
suppression ratio measurement needs ≥5 min, deferred to a long-soak
iteration.
Binary size cost: ~32 bytes (one int64, one bool, one getter). C6 build
still 45% partition slack.
Co-Authored-By: claude-flow <ruv@ruv.net>
…alignment shipped
SOTA iter 6 — the long-soak iter 5 owed. 300 s parallel two-board capture
with the iter 5 EMA firmware, 46 converged follower-mode samples.
Over the 225 s steady-state window:
stdev range drift Q1->Q4
raw 411.5 µs 2245 µs +30.1 µs/min
smoothed 104.1 µs 478 µs +27.8 µs/min
suppression: 3.95x (stdev), 4.70x (range)
The ADR-110 §2.4 ≤100 µs alignment target is now empirically met by the
smoothed offset alone — no host-side filter required. Drift is preserved
(within 2 µs/min between raw and smoothed), so the EMA tracks real clock
skew, not lag behind it.
Drift sign + magnitude vary with thermal state across runs (-84 µs/min
in §A0.8 iter 4, +30 µs/min here in iter 6 with boards warmer — both
within ESP32 ±10 ppm crystal spec). The EMA tracks whichever value
applies at any given moment.
Throughput: tx=2701, rx=2689, match=2689 → 99.56% cross-board match,
zero TX failures.
ADR-110 §B sync-substrate status: ≤100 µs multistatic alignment is now
*measured and shipped*, not just designed. Downstream multistatic CSI
fusion (ADR-029/030) can treat c6_sync_espnow_get_epoch_us() as a
black-box bounded-jitter timestamp source.
Co-Authored-By: claude-flow <ruv@ruv.net>
SOTA iter 7. Tags + ships the firmware that actually has the iter-5/6 EMA path so the GitHub release matches the witness measurements. v0.6.7 binaries on the release predate the EMA work — anyone downloading from the v0.6.7 release would not get the smoothing §A0.10 measured. Build evidence (IDF v5.4, both RC=0): - S3 8 MB: 1093 KB (47 % slack), SHA256 60e3ef907f... - C6 4 MB: 1019 KB (45 % slack), SHA256 feb88d60a0... - Soft-AP and 4 MB S3 variants ship unchanged from v0.6.7; not rebuilt. Wiring gap documented in WITNESS §A0.11: ADR-018 wire format has no timestamp field, so the synced clock value from get_epoch_us() doesn't yet reach CSI frames. Three options outlined (ADR-018 v2 / separate UDP sync packet / out-of-band HTTP probe). Likely landing place is the separate UDP sync packet — keeps the existing ADR-018 contract intact while ADR-029/030 multistatic fusion lights up the substrate. CHANGELOG Wave 4 entry summarises what v0.6.8 ships + the deferred gap so future maintainers don't lose the breadcrumb. Co-Authored-By: claude-flow <ruv@ruv.net>
Closes WITNESS-LOG-110 §A0.11 wiring gap. Adds a separate 32-byte UDP packet (magic 0xC511A110, distinct from the CSI frame magic 0xC5110001) carrying: [0..3] magic 0xC511A110 (LE u32) — CSI-ADR-110 sync packet [4] node_id [5] proto version (0x01) [6] flags: bit0=is_leader, bit1=is_valid, bit2=smoothed_used [7] reserved [8..15] local esp_timer_get_time() (LE u64) [16..23] mesh-aligned epoch (LE u64) = local + EMA-smoothed offset [24..27] high-water sequence number (LE u32) for pairing with CSI frames [28..31] reserved (room for leader_id low32 in a follow-up) Emitted once per 20 CSI frames (≈ 1 Hz at the 20 Hz send-rate gate). Same stream_sender UDP socket as CSI frames — host dispatches by first 4 bytes of each datagram. Backwards compatible: aggregators that don't know about the new magic ignore it (sync packets won't match the CSI parser's magic check, so they're dropped harmlessly by existing receivers). New aggregators pair (node_id, sequence) across the two packet streams to align CSI frames to mesh time. Sets us up for downstream ADR-029/030 multistatic CSI fusion: with the host now able to recover the mesh-aligned epoch from each frame's sequence number, frames from multiple boards can be ordered + fused on a common timeline. Build evidence: C6 image 1019 KB (+1 KB vs v0.6.8 no-sync), 45 % partition slack unchanged. Host-side parser update is a follow-up. Co-Authored-By: claude-flow <ruv@ruv.net>
…ards SOTA iter 9 — closes the §A0.11 wiring gap with empirical evidence. Added a diagnostic ESP_LOGI in the sync emit path; flashed both C6 boards; captured 45s parallel serial output. Sync packet generation confirmed live: COM12 (leader, ...00:84): sync-pkt ruvnet#1 ... node=12 flags=0x03 local_us=28864932 epoch_us=28864939 flags=0x03 = leader+valid, epoch ≈ local (7 µs delta = call-stack elapsed only — leader has no offset by definition) COM9 (follower, ...05:3c): sync-pkt ruvnet#1 ... node=9 flags=0x06 local_us=28798450 epoch_us=27634885 flags=0x06 = valid+smoothed_used, local-epoch = 1,163,565 µs Matches §A0.10's measured -1.16 s mesh-aligned offset within 285 µs (WiFi MAC TX jitter floor between samples). Cadence: 2.05 s between sync packets — 20 CSI frames at the bench's observed 10 fps rate = exactly the design intent. UDP send returns -1 (sr=-1) because the bench boards are intentionally not associated to a real AP (provisioned to dead SSIDs for the iter 2-8 mesh experiments). No crash, no resource leak in 45s. Once boards hit a routable network, sr becomes the byte count. Wiring gap §A0.11 now CLOSED. Multistatic CSI fusion downstream has a documented protocol to recover mesh-aligned timestamps for every CSI frame: host pairs (node_id, sequence) across the two packet streams. Host-side parser is the natural next layer (wifi-densepose-sensing-server). Build evidence: C6 image 1019 KB (+0.5 KB for the diag log line), 45% partition slack unchanged. Co-Authored-By: claude-flow <ruv@ruv.net>
…VERY_N_FRAMES tunable Bundles the iter 8 + iter 9 sync-packet work (§A0.11 + §A0.12) into a shipped release. v0.6.8 didn't carry the sync emission; v0.6.9 closes the loop. What ships: - csi_collector emits a 32-byte UDP sync packet (magic 0xC511A110) every CONFIG_C6_SYNC_EVERY_N_FRAMES CSI callbacks (default 20). - New Kconfig knob lets operators tune cadence from ~0.1 Hz (N=1000) to ~10 Hz (N=1) without rebuilding — sensible defaults for mainstream multistatic at ~2 s sync interval. - Backwards-compatible at the wire level: old aggregators drop the new magic on existing parser mismatch path. Build artifacts (both green on IDF v5.4): - S3 8 MB: 1094 KB, 47% partition slack - C6 4 MB: 1019 KB, 45% partition slack The macro define was renamed from SYNC_EVERY_N_FRAMES to CONFIG_C6_SYNC_EVERY_N_FRAMES so the Kconfig generator wires through. Header guard preserves the default for builds without the kconfig applied. Co-Authored-By: claude-flow <ruv@ruv.net>
…t broken 15.4) WITNESS-LOG-110 prior state had byte 19 bit 4 (cross-node sync valid) only being set from c6_timesync_is_valid() — but c6_timesync is the 802.15.4 path that D1 documented as unfixable in IDF v5.4 (rx=0 across every soak we've run). The working transport is c6_sync_espnow (§A0.7, §A0.10: 99.43-99.56% RX cross-board, 104 µs smoothed-offset stdev), yet frames from sync'd nodes had bit 4 cleared because the ESP-NOW path didn't OR into the flag. Fix: also set bit 4 when c6_sync_espnow_is_valid() — the OR semantic means a node signals sync from whichever transport is healthy. Host sees bit 4 set, knows to pair the frame against the most recent sync packet (§A0.12) from this node_id. Side effect: this also enables S3 boards to set bit 4 (c6_sync_espnow works on both targets, c6_timesync is C6-only). So a multi-target mesh of S3+C6 boards now correctly signals cross-node alignment regardless of which chips are in the fleet. Build evidence: C6 image 1019 KB (+16 bytes for the new check), 45% slack unchanged. Co-Authored-By: claude-flow <ruv@ruv.net>
…te closed
Marks the end of the firmware-side ADR-110 push. Everything the firmware
can deliver toward §B multistatic alignment without hardware-blocked
dependencies is shipped, measured, and witnessed:
§A0.7–§A0.10 ESP-NOW mesh quantified: 99.43-99.56% cross-board match,
104.1 µs smoothed offset stdev, 1.4 ppm crystal-skew
tracking, ≤100 µs alignment target empirically met.
§A0.12 32-byte UDP sync packet emits with mesh-aligned epoch
+ sequence high-water; verified live both boards.
§A0.13 (new) bit-4 wire-fix: byte 19 bit 4 sourced from
c6_sync_espnow_is_valid() too. Mixed S3+C6 fleets now
correctly advertise mesh-sync.
Host-side enabler (Python):
archive/v1/src/hardware/csi_extractor.py grows SyncPacketParser +
SyncPacket dataclass. ESP32BinaryParser docstring acknowledges the
sibling sync packet. Sets up wifi-densepose-sensing-server to
consume the §A0.12 stream without inventing the parser.
Build artifacts (IDF v5.4, both RC=0):
S3 8 MB: 1094 KB, 47% partition slack
C6 4 MB: 1019 KB, 45% partition slack
Tag v0.7.0-esp32. Branch adr-110-esp32c6. PR ruvnet#764.
What remains is outside the firmware: host-side parser wiring,
multistatic CSI fusion in wifi-densepose-signal, 11ax-cooperative AP
(or future IDF AP-HE API), INA226 for ≤5 µA LP-core.
Co-Authored-By: claude-flow <ruv@ruv.net>
ESP32-S3 CSI requires DATA frames (LTF extraction) — MGMT-only filter silences CSI entirely on this chip revision because management frames (beacons, probe req/resp) contain no extractable LTF. Root cause: CSI yield=0pps despite WiFi connected and RSSI healthy. Fix: WIFI_PROMIS_FILTER_MASK_MGMT → WIFI_PROMIS_FILTER_MASK_ALL. Verified on Heltec LoRa V3 (ESP32-S3FN8 rev v0.2, Zyxel_D631 AP). Post-fix yield: 5-8 pps stable. Note: On busy networks (100+ data frames/sec) this may cause WiFi HW interrupt storms that crash Core 0. Low-traffic networks are safe.
|
Your root-cause is exactly right — Heads-up on the filter change specifically (
The Heltec LoRa V3's display is an I²C SSD1306 OLED, not that AMOLED — so Two reasons
The rest of this PR (C6 build overlay, ESP-NOW EMA offset smoother, the |
Problem
ESP32-S3 (rev v0.2) CSI callback never fires after WiFi STA connects.
yield=0ppspermanently, even with RSSI -45 to -50 dBm indicating healthy WiFi.Root Cause
Promiscuous filter was set to
WIFI_PROMIS_FILTER_MASK_MGMT(management frames only). CSI is extracted from LTF fields present ONLY in DATA frames, not management frames. MGMT-only filter silences CSI entirely.Fix
Change promiscuous filter from
WIFI_PROMIS_FILTER_MASK_MGMT→WIFI_PROMIS_FILTER_MASK_ALL.Verification
Tested on:
Risk
On busy WiFi networks (100+ data frames/sec) this may cause Core 0 crashes in
wDev_ProcessFiq. Our network has low data traffic so this is safe. A future PR could implement adaptive filter toggling based on yield rate.Fixes #521