host/dwc2: cleanup transfer on device close #3076

HiFiPhile · 2025-04-11T13:59:59Z

Describe the PR
Fix queued xfer of previous failed enumeration mess with next enumeration.

[1:1] Get Descriptor: 80 06 03 03 09 04 00 01 
on EP 00 with 8 bytes: OK
on EP 80 with 40 bytes: OK
[1:1] Control data:
  0000:  28 03 4D 00 61 00 73 00 73 00 20 00 53 00 74 00  |(.M.a.s.s. .S.t.|
  0010:  6F 00 72 00 61 00 67 00 65 00 20 00 44 00 65 00  |o.r.a.g.e. .D.e.|
  0020:  76 00 69 00 63 00 65 00                          |v.i.c.e.|
[1:0:0] USBH DEVICE REMOVED
[1:0:0] unplugged address = 1
Device removed, address = 1
[1:] USBH Device Attach
High Speed
[1:0] Open EP0 with Size = 8
Get 8 byte of Device Descriptor
[1:0] Get Descriptor: 80 06 00 01 00 00 08 00 
on EP 00 with 8 bytes: OK                                  <------- Missed by one
on EP 00 with 8 bytes: OK
[1:0] Control data:
  0000:  12 01 00 02 00 00 00 40                          |.......@|
on EP 80 with 8 bytes: OK

Set Address = 1
[1:0] Set Address: 00 05 01 00 00 00 00 00 
on EP 00 with 0 bytes: OK
on EP 00 with 8 bytes: OK

[1:1] Open EP0 with Size = 64
Get Device Descriptor
[1:1] Get Descriptor: 80 06 00 01 00 00 12 00 
on EP 80 with 8 bytes: OK
hcd_edpt_xfer 655: ASSERT FAILED

Signed-off-by: HiFiPhile <[email protected]>

maximevince · 2025-04-14T07:13:12Z

While I can see the issue described here on my side as well, it seems to me this is not the proper fix, but rather a workaround - with its own side-effects.
Why wasn't this xfer cleared? And what if is currently ongoing?

There seem to be some race conditions between handling some stuff in the interrupt handler, and deferring some stuff to the tuh_task(). This is probably the root of (some of) these problems...

HiFiPhile · 2025-04-14T07:56:44Z

it seems to me this is not the proper fix, but rather a workaround - with its own side-effects.

Could you detail more, what you mean sie-effects ?

hathach · 2025-04-18T10:19:20Z

my bad, I think we cannot simply clear data here. We should mark endpoint for removal then disable channel first, within the halted interrupt check the flag and clear the endpoint.

HiFiPhile · 2025-04-18T10:29:30Z

Ah yes we should close the channel properly, especially when a hub is used.

Signed-off-by: HiFiPhile <[email protected]>

HiFiPhile · 2025-04-18T15:55:15Z

I've changed it use channel_disable(), I think there is no need to wait until the channel is disabled ?

hathach · 2025-04-29T14:55:05Z

I've changed it use channel_disable(), I think there is no need to wait until the channel is disabled ?

I am on vacation, will try to check this out afterwards (next week or so).

Signed-off-by: HiFiPhile <[email protected]>

Copilot

Pull Request Overview

This PR fixes an issue where queued transfers from a failed USB device enumeration interfere with subsequent enumeration attempts by implementing a cleanup mechanism for pending transfers when a device is closed.

Key Changes:

Added a closing flag to the hcd_xfer_t structure to mark transfers being cleaned up
Modified hcd_device_close() to mark associated transfers as closing and disable their channels
Updated all four channel interrupt handlers (IN/OUT for both slave and DMA modes) to check the closing flag and deallocate channels without sending events

Copilot · 2025-11-18T03:22:34Z

src/portable/synopsys/dwc2/hcd_dwc2.c

      tu_memclr(edpt, sizeof(hcd_endpoint_t));
+      for (uint8_t ch_id = 0; ch_id < (uint8_t) DWC2_CHANNEL_COUNT_MAX; ch_id++) {
+        hcd_xfer_t* xfer = &_hcd_data.xfer[ch_id];
+        if (xfer->allocated && xfer->ep_id == i) {
+          dwc2_channel_t* channel = &dwc2->channel[ch_id];
+          dwc2_channel_split_t hcsplt = {.value = channel->hcsplt};
+          xfer->closing = 1;
+          // Channel disable must not be programmed for non-split periodic channels
+          if (!channel_is_periodic(channel->hcchar) || hcsplt.split_en) {
+            channel_disable(dwc2, channel);
+          }
+        }
+      }


The endpoint is cleared before marking associated channels as closing, which creates a race condition. If a channel interrupt occurs after the endpoint is cleared (line 464) but before the transfer is marked as closing (line 470), the interrupt handler will access the cleared endpoint data via edpt = &_hcd_data.edpt[xfer->ep_id] (used in all channel handlers).

Recommendation: Move the tu_memclr(edpt, sizeof(hcd_endpoint_t)) call to after the channel cleanup loop, or disable interrupts for this channel before clearing the endpoint.

Copilot · 2025-11-18T03:22:34Z

src/portable/synopsys/dwc2/hcd_dwc2.c

+          // Channel disable must not be programmed for non-split periodic channels
+          if (!channel_is_periodic(channel->hcchar) || hcsplt.split_en) {
+            channel_disable(dwc2, channel);
+          }


For non-split periodic channels, the closing flag is set (line 470) but channel_disable() is not called (line 472-474). The closing flag is only checked within if (hcint & HCINT_HALTED) blocks in the interrupt handlers (lines 962-966, 1027-1031, 1149-1154, 1216-1221). If a non-split periodic channel doesn't naturally halt after the device is removed, it may never be properly cleaned up.

Consider whether non-split periodic channels need additional handling to ensure they eventually halt and get deallocated, or document why they will naturally halt in this scenario.

Suggested change

// Channel disable must not be programmed for non-split periodic channels

if (!channel_is_periodic(channel->hcchar) || hcsplt.split_en) {

channel_disable(dwc2, channel);

}

// Always disable the channel to ensure proper cleanup, even for non-split periodic channels

channel_disable(dwc2, channel);

Copilot · 2025-11-18T03:22:35Z

src/portable/synopsys/dwc2/hcd_dwc2.c

+    } else if (xfer->closing) {
+      // channel is closing, de-allocate channel
+      channel_dealloc(dwc2, ch_id);
+      // don't send event
+      is_done = false;
    } else {
      // got here due to NAK or NYET
      channel_xfer_in_retry(dwc2, ch_id, hcint);


In slave mode, the closing flag check uses else if (lines 962-966), which means it won't execute if the transfer has a valid result (line 957-958) or reached max errors (line 959-961). This could cause events to be sent for closing transfers.

In contrast, the DMA mode handlers (lines 1149-1154, 1216-1221) use a separate if statement that correctly overrides is_done to false regardless of the transfer result.

The slave mode handlers should use a separate if statement (not else if) for the closing check, similar to the DMA handlers, to ensure events are not sent for closing transfers.

Suggested change

} else if (xfer->closing) {

// channel is closing, de-allocate channel

channel_dealloc(dwc2, ch_id);

// don't send event

is_done = false;

} else {

// got here due to NAK or NYET

channel_xfer_in_retry(dwc2, ch_id, hcint);

} else {

// got here due to NAK or NYET

channel_xfer_in_retry(dwc2, ch_id, hcint);

}

// Always check closing after other conditions, to override is_done if needed

if (xfer->closing) {

// channel is closing, de-allocate channel

channel_dealloc(dwc2, ch_id);

// don't send event

is_done = false;

Copilot · 2025-11-18T03:22:35Z

src/portable/synopsys/dwc2/hcd_dwc2.c

+    } else if (xfer->closing) {
+      // channel is closing, de-allocate channel
+      channel_dealloc(dwc2, ch_id);
+      // don't send event
+      is_done = false;
    } else {
      // Got here due to NAK or NYET
      TU_ASSERT(channel_xfer_start(dwc2, ch_id));


Similar to the issue in handle_channel_in_slave, the closing flag check uses else if (lines 1027-1031), which means it won't execute if the transfer has a valid result (line 1022-1023) or reached max errors (line 1024-1026). This could cause events to be sent for closing transfers.

The DMA mode handlers (lines 1149-1154, 1216-1221) correctly use a separate if statement that overrides is_done to false.

Change this to a separate if statement (not else if) to match the DMA handler behavior and ensure events are not sent for closing transfers.

Suggested change

} else if (xfer->closing) {

// channel is closing, de-allocate channel

channel_dealloc(dwc2, ch_id);

// don't send event

is_done = false;

} else {

// Got here due to NAK or NYET

TU_ASSERT(channel_xfer_start(dwc2, ch_id));

} else {

// Got here due to NAK or NYET

TU_ASSERT(channel_xfer_start(dwc2, ch_id));

}

if (xfer->closing) {

// channel is closing, de-allocate channel

channel_dealloc(dwc2, ch_id);

// don't send event

is_done = false;

HiFiPhile · 2025-11-18T09:23:18Z

Copilot sounds reasonable, I need to bring up the memory ...

hathach

I making some changes to PR as well, will take this chance to implement the hcd_edpt_close() as well

hathach · 2025-11-18T11:55:02Z

src/portable/synopsys/dwc2/hcd_dwc2.c

+          dwc2_channel_t* channel = &dwc2->channel[ch_id];
+          dwc2_channel_split_t hcsplt = {.value = channel->hcsplt};
+          xfer->closing = 1;
+          // Channel disable must not be programmed for non-split periodic channels


this is only true for DMA mode, for slave mode, we still need to disable it I guess

I think you are right.

hathach · 2025-11-18T16:42:06Z

@HiFiPhile I make more changes to PR, also add closing flag for endpoint to de-alloc() it when chanel is halted with closing flags. Also put the check into chanel_disable(). Let me know if these make sense to you.

HiFiPhile · 2025-11-18T19:49:59Z

@hathach It looks good but I don't remember some details. Tried plugging in/out a bad contact USB stick on H7RS and no assert is hit.

hathach · 2025-11-19T03:31:47Z

@HiFiPhile why skipping these examples for h7rs, I see it compile just fine with minsize.

HiFiPhile · 2025-11-19T06:52:22Z

@hathach With LOG=2 IAR debug builds needs > 70k for these examples, cdc_msc_hid_freertos needs 66k even on MinRel build.

hathach · 2025-11-19T07:35:15Z

@hathach With LOG=2 IAR debug builds needs > 70k for these examples, cdc_msc_hid_freertos needs 66k even on MinRel build.

as long as minsize with LOG=0 compile, we can leave it enabled since that is what ci build, also user can test with those example as well. LOG > 1 include lots of constant file/func/line for assert and increase the footprint qutie a bit.

HiFiPhile · 2025-11-19T07:47:49Z

I've just dropped that commit, maybe later we can enhance skip/only mechanism for debug variant.

hathach

perfect, thank you very much

github-actions bot added the Prio Higher label Apr 11, 2025

host/dwc2: cleanup transfer on device close

1809548

Signed-off-by: HiFiPhile <[email protected]>

HiFiPhile force-pushed the xfer_close branch from ebb5169 to 1809548 Compare April 12, 2025 17:07

Disable channel properly.

90cf575

Signed-off-by: HiFiPhile <[email protected]>

HiFiPhile and others added 4 commits May 15, 2025 21:34

Fix transfer failed event still queued

376c106

Signed-off-by: HiFiPhile <[email protected]>

Merge branch 'master' into xfer_close

98da642

Signed-off-by: HiFiPhile <[email protected]>

Merge branch 'master' into xfer_close

619ef71

Merge branch 'master' into fork/HiFiPhile/xfer_close

6c14093

Copilot AI review requested due to automatic review settings November 18, 2025 03:18

Copilot started reviewing on behalf of hathach November 18, 2025 03:19 View session

Copilot finished reviewing on behalf of hathach November 18, 2025 03:21

Copilot AI reviewed Nov 18, 2025

View reviewed changes

hathach reviewed Nov 18, 2025

View reviewed changes

improve closing endpoint and channel

e4a04b2

HiFiPhile force-pushed the xfer_close branch from d9ec249 to e4a04b2 Compare November 19, 2025 07:45

hathach approved these changes Nov 19, 2025

View reviewed changes

hathach merged commit 15ed920 into hathach:master Nov 19, 2025
146 of 147 checks passed

HiFiPhile deleted the xfer_close branch November 19, 2025 11:58

host/dwc2: cleanup transfer on device close #3076

host/dwc2: cleanup transfer on device close #3076

Uh oh!

Conversation

HiFiPhile commented Apr 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

maximevince commented Apr 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

HiFiPhile commented Apr 14, 2025

Uh oh!

hathach commented Apr 18, 2025

Uh oh!

HiFiPhile commented Apr 18, 2025

Uh oh!

HiFiPhile commented Apr 18, 2025

Uh oh!

hathach commented Apr 29, 2025

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Uh oh!

Copilot AI Nov 18, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 18, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 18, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 18, 2025

Choose a reason for hiding this comment

Uh oh!

HiFiPhile commented Nov 18, 2025

Uh oh!

hathach left a comment

Choose a reason for hiding this comment

Uh oh!

hathach Nov 18, 2025

Choose a reason for hiding this comment

Uh oh!

HiFiPhile Nov 18, 2025

Choose a reason for hiding this comment

Uh oh!

hathach commented Nov 18, 2025

Uh oh!

HiFiPhile commented Nov 18, 2025

Uh oh!

hathach commented Nov 19, 2025

Uh oh!

HiFiPhile commented Nov 19, 2025

Uh oh!

hathach commented Nov 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

HiFiPhile commented Nov 19, 2025

Uh oh!

hathach left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

HiFiPhile commented Apr 11, 2025 •

edited

Loading

maximevince commented Apr 14, 2025 •

edited

Loading

hathach commented Nov 19, 2025 •

edited

Loading