With -scheduler=cores, I see an intermittent RP2040 USB CDC monitor hang that appears after:
ca584de8 machine/usb: support bidirectional endpoints by dynamic registration (#5447)
Since this is still pre-release and the regression is reproducible, I'm flagging it as a possible release blocker. The reproduction is intermittent (2/5 below), so the conclusion is "this looks like a regression from #5447," not a certainty.
I'd suggest considering a temporary revert of #5447 — or investigating it — before release, unless it can be explained quickly. I'm not trying to root-cause it here, and I'm happy to run more tests.
Method
To rule out the older RP2 CDC TX race fixed in d9d19e81 / #5391, I tested both revisions with the same usbcdc.go restored from d9d19e81:
git restore --source=d9d19e81 -- src/machine/usb/cdc/usbcdc.go
So the source difference under test is the change introduced by #5447:
594be6db + usbcdc.go@d9d19e81 (previous revision)
ca584de8 + usbcdc.go@d9d19e81 (#5447)
Environment: tinygo 0.42.0-dev-18033ebc (go1.26.1, LLVM 20.1.1). Current upstream/dev already includes both #5391 and #5447. I checked out the two historical commits above only to isolate the regression point.
Reproduction
Target: Raspberry Pi Pico / RP2040
tinygo flash -target=pico -scheduler=cores -monitor 25_min_usb_xip_atomic_use_64kb.go
The test prints many lines with println while another goroutine repeatedly reads a large const string from flash. The hang appears to require this flash/XIP pressure.
Expected final output:
test finished
On a hang, the USB CDC monitor output stops before that line.
25_min_usb_xip_atomic_use_64kb.go
//go:build tinygo && rp2040

package main

import (
	"sync/atomic"
	"time"
)

const lines = 5000 // number of println lines sent over USB CDC
const reads = 8192 // flash reads per worker iteration

const payload = "abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ--usb-xip-atomic-use-min--"

// chunk256 is 256 bytes; it is repeated to build a 64 KiB constant in flash.
const chunk256 = "" +
	"0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef" +
	"fedcba9876543210fedcba9876543210fedcba9876543210fedcba9876543210" +
	"00112233445566778899aabbccddeeffffeeddccbbaa99887766554433221100" +
	"rp2040xipcacheflashworkerrandomaccesstestdataAAAAAAAAAAAAAAAAAAA"

const block4k = chunk256 + chunk256 + chunk256 + chunk256 +
	chunk256 + chunk256 + chunk256 + chunk256 +
	chunk256 + chunk256 + chunk256 + chunk256 +
	chunk256 + chunk256 + chunk256 + chunk256

const flashData = block4k + block4k + block4k + block4k +
	block4k + block4k + block4k + block4k +
	block4k + block4k + block4k + block4k +
	block4k + block4k + block4k + block4k

var (
	workerDone uint32
	workerSum  uint32
)

// xipWorker reads pseudo-random offsets of flashData forever to keep
// pressure on flash/XIP while main prints.
func xipWorker() {
	x := uint32(0x12345678)
	s := uint32(0)
	for {
		v := atomic.LoadUint32(&workerDone)
		s ^= v // use the atomic value, but do not branch on it
		for i := 0; i < reads; i++ {
			x = x*1664525 + 1013904223 // LCG step picks the next offset
			s += uint32(flashData[int(x%uint32(len(flashData)))])
		}
		workerSum = s ^ x
	}
}

func main() {
	time.Sleep(2 * time.Second) // give the USB CDC monitor time to attach
	println("start usb xip stall atomic-use")
	println("flashData size:", len(flashData))
	println("lines:", lines)
	go xipWorker()
	time.Sleep(10 * time.Millisecond)
	for i := 0; i < lines; i++ {
		println("line:", i, payload)
	}
	workerDone = 1
	println("worker sum:", workerSum)
	println("test finished")
	for {
	}
}
Results
OK = the run reached "test finished"; NG = the monitor output hung before it.
594be6db + usbcdc.go@d9d19e81 : OK OK OK OK OK (5/5 OK)
ca584de8 + usbcdc.go@d9d19e81 : OK NG OK OK NG (3/5 OK, 2/5 NG)
With the same CDC TX fix on both, the previous revision did not reproduce the hang while #5447 did. The sample is small, so I can't rule out variance. I can run more iterations and report a reproduction rate if that helps the decision.
Request
Consider temporarily reverting #5447 before release, or investigating this regression, if it can't be explained quickly.
I'm glad to help with more runs, a narrower repro, or capturing state at the time of the hang.