-
Notifications
You must be signed in to change notification settings - Fork 1.3k
host/dwc2: cleanup transfer on device close #3076
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: HiFiPhile <[email protected]>
|
While I can see the issue described here on my side as well, it seems to me this is not the proper fix, but rather a workaround - with its own side-effects. There seem to be some race conditions between handling some stuff in the interrupt handler, and deferring some stuff to the tuh_task(). This is probably the root of (some of) these problems... |
Could you detail more, what you mean sie-effects ? |
|
my bad, I think we cannot simply clear data here. We should mark endpoint for removal then disable channel first, within the halted interrupt check the flag and clear the endpoint. |
|
Ah yes we should close the channel properly, especially when a hub is used. |
Signed-off-by: HiFiPhile <[email protected]>
|
I've changed it use |
I am on vacation, will try to check this out afterwards (next week or so). |
Signed-off-by: HiFiPhile <[email protected]>
Signed-off-by: HiFiPhile <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes an issue where queued transfers from a failed USB device enumeration interfere with subsequent enumeration attempts by implementing a cleanup mechanism for pending transfers when a device is closed.
Key Changes:
- Added a
closingflag to thehcd_xfer_tstructure to mark transfers being cleaned up - Modified
hcd_device_close()to mark associated transfers as closing and disable their channels - Updated all four channel interrupt handlers (IN/OUT for both slave and DMA modes) to check the closing flag and deallocate channels without sending events
| tu_memclr(edpt, sizeof(hcd_endpoint_t)); | ||
| for (uint8_t ch_id = 0; ch_id < (uint8_t) DWC2_CHANNEL_COUNT_MAX; ch_id++) { | ||
| hcd_xfer_t* xfer = &_hcd_data.xfer[ch_id]; | ||
| if (xfer->allocated && xfer->ep_id == i) { | ||
| dwc2_channel_t* channel = &dwc2->channel[ch_id]; | ||
| dwc2_channel_split_t hcsplt = {.value = channel->hcsplt}; | ||
| xfer->closing = 1; | ||
| // Channel disable must not be programmed for non-split periodic channels | ||
| if (!channel_is_periodic(channel->hcchar) || hcsplt.split_en) { | ||
| channel_disable(dwc2, channel); | ||
| } | ||
| } | ||
| } |
Copilot
AI
Nov 18, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The endpoint is cleared before marking associated channels as closing, which creates a race condition. If a channel interrupt occurs after the endpoint is cleared (line 464) but before the transfer is marked as closing (line 470), the interrupt handler will access the cleared endpoint data via edpt = &_hcd_data.edpt[xfer->ep_id] (used in all channel handlers).
Recommendation: Move the tu_memclr(edpt, sizeof(hcd_endpoint_t)) call to after the channel cleanup loop, or disable interrupts for this channel before clearing the endpoint.
| // Channel disable must not be programmed for non-split periodic channels | ||
| if (!channel_is_periodic(channel->hcchar) || hcsplt.split_en) { | ||
| channel_disable(dwc2, channel); | ||
| } |
Copilot
AI
Nov 18, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For non-split periodic channels, the closing flag is set (line 470) but channel_disable() is not called (line 472-474). The closing flag is only checked within if (hcint & HCINT_HALTED) blocks in the interrupt handlers (lines 962-966, 1027-1031, 1149-1154, 1216-1221). If a non-split periodic channel doesn't naturally halt after the device is removed, it may never be properly cleaned up.
Consider whether non-split periodic channels need additional handling to ensure they eventually halt and get deallocated, or document why they will naturally halt in this scenario.
| // Channel disable must not be programmed for non-split periodic channels | |
| if (!channel_is_periodic(channel->hcchar) || hcsplt.split_en) { | |
| channel_disable(dwc2, channel); | |
| } | |
| // Always disable the channel to ensure proper cleanup, even for non-split periodic channels | |
| channel_disable(dwc2, channel); |
| } else if (xfer->closing) { | ||
| // channel is closing, de-allocate channel | ||
| channel_dealloc(dwc2, ch_id); | ||
| // don't send event | ||
| is_done = false; | ||
| } else { | ||
| // got here due to NAK or NYET | ||
| channel_xfer_in_retry(dwc2, ch_id, hcint); |
Copilot
AI
Nov 18, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In slave mode, the closing flag check uses else if (lines 962-966), which means it won't execute if the transfer has a valid result (line 957-958) or reached max errors (line 959-961). This could cause events to be sent for closing transfers.
In contrast, the DMA mode handlers (lines 1149-1154, 1216-1221) use a separate if statement that correctly overrides is_done to false regardless of the transfer result.
The slave mode handlers should use a separate if statement (not else if) for the closing check, similar to the DMA handlers, to ensure events are not sent for closing transfers.
| } else if (xfer->closing) { | |
| // channel is closing, de-allocate channel | |
| channel_dealloc(dwc2, ch_id); | |
| // don't send event | |
| is_done = false; | |
| } else { | |
| // got here due to NAK or NYET | |
| channel_xfer_in_retry(dwc2, ch_id, hcint); | |
| } else { | |
| // got here due to NAK or NYET | |
| channel_xfer_in_retry(dwc2, ch_id, hcint); | |
| } | |
| // Always check closing after other conditions, to override is_done if needed | |
| if (xfer->closing) { | |
| // channel is closing, de-allocate channel | |
| channel_dealloc(dwc2, ch_id); | |
| // don't send event | |
| is_done = false; |
| } else if (xfer->closing) { | ||
| // channel is closing, de-allocate channel | ||
| channel_dealloc(dwc2, ch_id); | ||
| // don't send event | ||
| is_done = false; | ||
| } else { | ||
| // Got here due to NAK or NYET | ||
| TU_ASSERT(channel_xfer_start(dwc2, ch_id)); |
Copilot
AI
Nov 18, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to the issue in handle_channel_in_slave, the closing flag check uses else if (lines 1027-1031), which means it won't execute if the transfer has a valid result (line 1022-1023) or reached max errors (line 1024-1026). This could cause events to be sent for closing transfers.
The DMA mode handlers (lines 1149-1154, 1216-1221) correctly use a separate if statement that overrides is_done to false.
Change this to a separate if statement (not else if) to match the DMA handler behavior and ensure events are not sent for closing transfers.
| } else if (xfer->closing) { | |
| // channel is closing, de-allocate channel | |
| channel_dealloc(dwc2, ch_id); | |
| // don't send event | |
| is_done = false; | |
| } else { | |
| // Got here due to NAK or NYET | |
| TU_ASSERT(channel_xfer_start(dwc2, ch_id)); | |
| } else { | |
| // Got here due to NAK or NYET | |
| TU_ASSERT(channel_xfer_start(dwc2, ch_id)); | |
| } | |
| if (xfer->closing) { | |
| // channel is closing, de-allocate channel | |
| channel_dealloc(dwc2, ch_id); | |
| // don't send event | |
| is_done = false; |
|
Copilot sounds reasonable, I need to bring up the memory ... |
hathach
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I making some changes to PR as well, will take this chance to implement the hcd_edpt_close() as well
| dwc2_channel_t* channel = &dwc2->channel[ch_id]; | ||
| dwc2_channel_split_t hcsplt = {.value = channel->hcsplt}; | ||
| xfer->closing = 1; | ||
| // Channel disable must not be programmed for non-split periodic channels |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are right.
|
@HiFiPhile I make more changes to PR, also add closing flag for endpoint to de-alloc() it when chanel is halted with closing flags. Also put the check into chanel_disable(). Let me know if these make sense to you. |
|
@hathach It looks good but I don't remember some details. Tried plugging in/out a bad contact USB stick on H7RS and no assert is hit. |
|
@HiFiPhile why skipping these examples for h7rs, I see it compile just fine with minsize. |
|
@hathach With LOG=2 IAR debug builds needs > 70k for these examples, cdc_msc_hid_freertos needs 66k even on MinRel build. |
as long as minsize with LOG=0 compile, we can leave it enabled since that is what ci build, also user can test with those example as well. LOG > 1 include lots of constant file/func/line for assert and increase the footprint qutie a bit. |
d9ec249 to
e4a04b2
Compare
|
I've just dropped that commit, maybe later we can enhance skip/only mechanism for debug variant. |
hathach
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perfect, thank you very much

Describe the PR
Fix queued xfer of previous failed enumeration mess with next enumeration.