[BUG] Invalidate trace cache @ step 975: expected module 1003, but got module 1003 #5033
Fixed by #7039
Commits that referenced this issue, each with the message "Make trace cache warnings configurable, and disabled by default. Fix #6985, #4081, #5033, #5006, #5662" (Signed-off-by: Olatunji Ruwase):
Yejing-Lai pushed a commit to Yejing-Lai/DeepSpeed on Feb 24, 2025.
gyou2021 pushed a commit to gyou2021/DeepSpeed on Feb 28, 2025 (also signed off by gyou2021).
tohtana pushed a commit on Feb 28, 2025 (also signed off by Masahiro Tanaka).
ys950902 pushed a commit to ys950902/DeepSpeed on Mar 6, 2025 (also signed off by yisheng).
Describe the bug
I am using PyTorch Lightning with DeepSpeed ZeRO stage 3 with offload. At every training step I receive the following warning:
"Invalidate trace cache @ step 975: expected module 1003, but got module 1003"
I saw other issues reporting this warning, but none in which the expected module and the received module are the same.
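For reference, the setup described above (ZeRO stage 3 with offload) might correspond to a DeepSpeed config along the following lines. This is only a sketch, not the reporter's actual configuration: the helper `build_zero3_offload_config` is hypothetical, CPU is assumed as the offload target, and the `log_trace_cache_warnings` key is an assumption based on the referenced commit that made these warnings configurable, so the exact name should be checked against the installed DeepSpeed version.

```python
import json


def build_zero3_offload_config(log_trace_cache_warnings: bool = False) -> dict:
    """Hypothetical helper: build a DeepSpeed-style config dict for
    ZeRO stage 3 with optimizer and parameter offload to CPU."""
    return {
        "zero_optimization": {
            "stage": 3,
            # Offload targets are assumed to be CPU; the issue only says "offload".
            "offload_optimizer": {"device": "cpu", "pin_memory": True},
            "offload_param": {"device": "cpu", "pin_memory": True},
            # Assumed key name (from the commit "Make trace cache warnings
            # configurable"); verify it exists in your DeepSpeed release.
            "log_trace_cache_warnings": log_trace_cache_warnings,
        }
    }


if __name__ == "__main__":
    # Print the config as JSON, e.g. to save it as a DeepSpeed config file.
    print(json.dumps(build_zero3_offload_config(), indent=2))
```

With PyTorch Lightning, a dict like this could be passed as `DeepSpeedStrategy(config=...)`. On DeepSpeed versions that predate the referenced fix, the extra key would presumably have no effect on the warning.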