XDMA data corruption issue (0xFFFFFFFF every other read) was fixed yet was not mentioned anywhere #317
Comments
@dmitrym1 Where can I find @alonbl's patch set? Thanks. Also, in my case the driver prints this error log periodically. Do you know what it means? Nov 30 03:00:26 ubuntu kernel: xdma:xdma_xfer_submit: xfer 0x0000000090ad39ee,4, s 0x1 timed out, ep 0xa8008080.
Hi @jason77-wang. Your log shows a failed transaction. The driver reports it as a timeout, but since you got 0xffffffff from ioread32, I would say it is a communication problem. There could be various reasons for this result. I've seen the same log and had the same issue in my application, and it turned out to be an XDMA IP bug. In my case, a few dozen restarts of my software triggered the issue quickly and reliably. Otherwise it would reproduce by itself after a few days of continuous operation. Once the system gets into that state, it stays there until I restart the whole system. I updated Vivado to 2020.1 and upgraded the IP cores, and this fixed the problem. The changelog for XDMA says the problem should be fixed since 2019.1, so you could try updating too and see if that fixes the issue. If it does not, then unfortunately I won't be able to help you any further.
Okay, got it. Thanks.
TL;DR: if you have the same issue, you should upgrade to Vivado 2019.1 or newer. Or, if you are a Xilinx/AMD employee, you should write your documentation better.
I was using Vivado 2018.2 and the corresponding XDMA IP connected to an iMX8. I had to use @alonbl's patch set, yet I still faced issues #311 and #314. But there was one more issue that I couldn't explain and couldn't find anything about. I have AXI peripherals connected to the AXI DMA port and also some peripherals connected to the AXI Lite port for register access. After my app has been working fine for some time, I start getting 0xFFFFFFFF instead of data on every other read. The kernel module gets its register data corrupted the same way, which leads to #314 and some other issues, slowing everything down and eventually crashing the kernel. Unloading the kernel module does not help. The problem persists until the system is restarted. Debugging the kernel module led me to the ioread32 function, which already returns corrupted data, so the problem lies deeper, in the XDMA IP itself. Searching the Xilinx/AMD website revealed that there are no support tickets and no design advisories, and that my IP core version (4.1) is the latest one. IDK how much time I would have spent on this bug if I had not happened to look into the XDMA changelog from Vivado 2022.1. And there it is:
So first of all, the version is not 4.1, it is 4.1.X, which is not what official publicly available documentation says.
Second, I don't know if this bug fix is for the issue that I described above, because Xilinx did not share anything about this issue. How is a developer supposed to know that there was an issue and that it was fixed? So I'm doing Xilinx's work for them, sharing as much info as I can for those who face the same issue and google for a solution.
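For anyone who wants to check whether they are seeing the same "0xFFFFFFFF every other read" symptom, here is a minimal sketch in C. It is not part of the XDMA driver. The helper names (`count_all_ones`, `alternates_with_all_ones`) are made up for illustration, and it assumes you have already collected a buffer of consecutive 32-bit reads of a register whose real value is known not to be 0xFFFFFFFF.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Value a 32-bit read returns when every bit comes back set. */
#define ALL_ONES 0xFFFFFFFFu

/* Hypothetical helper: count how many samples are all-ones. */
static size_t count_all_ones(const uint32_t *samples, size_t n)
{
    size_t hits = 0;

    for (size_t i = 0; i < n; i++) {
        if (samples[i] == ALL_ONES)
            hits++;
    }
    return hits;
}

/*
 * Hypothetical helper: return true if all-ones values strictly alternate
 * with other values across the whole buffer, i.e. "every other read".
 * Assumption: the sampled register never legitimately holds 0xFFFFFFFF.
 */
static bool alternates_with_all_ones(const uint32_t *samples, size_t n)
{
    if (n < 4)
        return false; /* too few samples to call it a pattern */

    for (size_t i = 1; i < n; i++) {
        bool prev_bad = (samples[i - 1] == ALL_ONES);
        bool cur_bad = (samples[i] == ALL_ONES);

        if (prev_bad == cur_bad)
            return false; /* two good or two bad reads in a row */
    }
    return true;
}
```

If `alternates_with_all_ones` returns true for reads of a register with a known value, that matches the symptom described here. It does not prove the cause is the XDMA IP bug, since all-ones reads can have other causes.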
Example log; take a look at the fields that have 0xffffffff in them.