
GPU memory usage going up every epoch #1196

Answered by rwightman
AlejandroRigau asked this question in Q&A

@AlejandroRigau With caching allocators (like PyTorch's), rising memory usage is not a reliable indicator of a leak. Memory churn can increase the total allocated without actually causing any issues: it's just more cached allocations that may get reused, and they are only released back if needed.
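
For reference, a minimal sketch of how to tell the two numbers apart using PyTorch's own CUDA counters; the `report` helper is just illustrative, not part of the training scripts:

```python
import torch

def report(tag):
    # memory_allocated(): memory held by live tensors right now
    # memory_reserved(): memory the caching allocator has claimed from the GPU
    # (what nvidia-smi reports); it can grow over time without any leak
    alloc = torch.cuda.memory_allocated() / 2**20
    reserved = torch.cuda.memory_reserved() / 2**20
    print(f"{tag}: allocated={alloc:.0f} MiB, reserved={reserved:.0f} MiB")

x = torch.randn(4096, 4096, device="cuda")
report("after alloc")

del x
report("after del")  # allocated drops, reserved usually stays the same

torch.cuda.empty_cache()  # hands cached blocks back to the driver
report("after empty_cache")
```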

I've used the scripts for training runs on the order of months, so I'm fairly certain there are no issues. If you're only using 70-something percent at the start, you should try pushing your batch size up so that you use 90-95%. Going all the way to the limit can result in an OOM when you transition between train-eval-train.
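
A rough way to check how close you are to the limit while tuning the batch size, assuming you want a per-epoch peak from PyTorch's counters (the `train_one_epoch` call is a placeholder for your own loop):

```python
import torch

torch.cuda.reset_peak_memory_stats()

# placeholder for one epoch of your own training loop
# train_one_epoch(model, loader, optimizer)

total = torch.cuda.get_device_properties(0).total_memory
peak = torch.cuda.max_memory_allocated()
print(f"peak allocated: {peak / 2**30:.1f} GiB "
      f"({100 * peak / total:.0f}% of {total / 2**30:.1f} GiB)")
```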

Answer selected by AlejandroRigau