Update ddqn_agent.py to prevent RuntimeError with newer pytorch version #3
When running the DDQN agent on PyTorch v1.5.0, I get the following RuntimeError:
RuntimeError: range.second - range.first == t.size() INTERNAL ASSERT FAILED at ..\torch\csrc\autograd\generated\Functions.cpp:57, please report a bug to PyTorch. inconsistent range for TensorList output (copy_range at ..\torch\csrc\autograd\generated\Functions.cpp:57)
(no backtrace available)
My guess is that there is a diamond-shaped dependency when running the backward method, as the `self.q_eval` network parameters affect the loss via both `q_pred` and `q_eval`.

I fixed the issue by explicitly detaching the `max_actions` tensor from the computational graph, as it is a discrete value and small changes in the `self.q_eval` network parameters should not change the `max_actions` taken. The derivative of the loss with respect to the `self.q_eval` network parameters thus comes only from the `q_pred` calculation.

I tested this change on my computer and got good performance and (more importantly) didn't get the RuntimeError.
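
For reviewers, here is a minimal standalone sketch of the idea behind the change, not the actual contents of ddqn_agent.py. The two `nn.Linear` networks, the random batch, and the sizes are hypothetical stand-ins. Computing the target under `torch.no_grad()` is an additional assumption of this sketch and is not part of the change described above.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes and networks, standing in for the agent's own.
n_states, n_actions, batch_size, gamma = 4, 2, 8, 0.99
q_eval = nn.Linear(n_states, n_actions)  # online network
q_next = nn.Linear(n_states, n_actions)  # target network

# Hypothetical batch, standing in for a replay-buffer sample.
states = torch.randn(batch_size, n_states)
next_states = torch.randn(batch_size, n_states)
actions = torch.randint(0, n_actions, (batch_size,))
rewards = torch.randn(batch_size)
dones = torch.zeros(batch_size, dtype=torch.bool)
indices = torch.arange(batch_size)

# Q-values of the actions actually taken; gradients flow through this path.
q_pred = q_eval(states)[indices, actions]

# Greedy actions chosen by the online network. They are discrete indices,
# so they are explicitly detached from the computational graph.
max_actions = torch.argmax(q_eval(next_states), dim=1).detach()

# Target values from the target network (no_grad is this sketch's assumption).
with torch.no_grad():
    q_next_vals = q_next(next_states)
    q_next_vals[dones] = 0.0  # terminal states contribute no future value
    q_target = rewards + gamma * q_next_vals[indices, max_actions]

loss = nn.functional.mse_loss(q_pred, q_target)
loss.backward()
```

After `loss.backward()`, only `q_eval` holds gradients, and they come solely from the `q_pred` path.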