Commit 5c81398

Update QR-DQN optimizer to only use q_net parameters (#252)
* Updated QR-DQN optimizer input to only include quantile_net parameters
* Fix QR-DQN paper link in docs and update changelog
1 parent dc25cc6 commit 5c81398

3 files changed: +5 -3 lines changed

docs/misc/changelog.rst (+3 -1)

@@ -16,6 +16,8 @@ New Features:
 
 Bug Fixes:
 ^^^^^^^^^^
+- Updated QR-DQN optimizer input to only include quantile_net parameters (@corentinlger)
+- Updated QR-DQN paper link in docs (@corentinlger)
 
 Deprecations:
 ^^^^^^^^^^^^^
@@ -580,4 +582,4 @@ Contributors:
 -------------
 
 @ku2482 @guyk1971 @minhlong94 @ayeright @kronion @glmcdona @cyprienc @sgillen @Gregwar @rnederstigt @qgallouedec
-@mlodel @CppMaster @burakdmb @honglu2875 @ZikangXiong @AlexPasqua @jonasreiher @icheered @Armandpl
+@mlodel @CppMaster @burakdmb @honglu2875 @ZikangXiong @AlexPasqua @jonasreiher @icheered @Armandpl @corentinlger

docs/modules/qrdqn.rst (+1 -1)

@@ -24,7 +24,7 @@ instead of predicting the mean return (DQN).
 Notes
 -----
 
-- Original paper: https://arxiv.org/abs/1710.100442
+- Original paper: https://arxiv.org/abs/1710.10044
 - Distributional RL (C51): https://arxiv.org/abs/1707.06887
 - Further reference: https://github.com/amy12xx/ml_notes_and_reports/blob/master/distributional_rl/QRDQN.pdf
 

sb3_contrib/qrdqn/policies.py (+1 -1)

@@ -171,7 +171,7 @@ def _build(self, lr_schedule: Schedule) -> None:
 
         # Setup optimizer with initial learning rate
         self.optimizer = self.optimizer_class(  # type: ignore[call-arg]
-            self.parameters(),
+            self.quantile_net.parameters(),
             lr=lr_schedule(1),
             **self.optimizer_kwargs,
         )
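
The effect of the one-line change in `policies.py` can be illustrated with a minimal sketch. It assumes the policy registers a second, target network alongside `quantile_net` (named `quantile_net_target` here); `TinyPolicy` and the layer sizes are hypothetical stand-ins, not the library's actual classes. With `self.parameters()`, the optimizer receives every parameter registered on the module; with `self.quantile_net.parameters()`, it receives only the online network's parameters.

```python
import torch
from torch import nn


class TinyPolicy(nn.Module):
    """Hypothetical stand-in for a policy holding an online and a target network."""

    def __init__(self) -> None:
        super().__init__()
        # Online network: the one that should be trained by the optimizer.
        self.quantile_net = nn.Linear(4, 2)
        # Target network (assumed name): kept in sync by copying weights, not by gradient steps.
        self.quantile_net_target = nn.Linear(4, 2)
        self.quantile_net_target.load_state_dict(self.quantile_net.state_dict())


policy = TinyPolicy()

# Before the change: all parameters registered on the policy, target network included.
all_params = list(policy.parameters())
# After the change: only the online network's parameters.
online_params = list(policy.quantile_net.parameters())

optimizer = torch.optim.Adam(online_params, lr=1e-3)

n_all = len(all_params)        # weight + bias for each of the two networks
n_online = len(online_params)  # weight + bias for the online network only
optimized_ids = {id(p) for p in optimizer.param_groups[0]["params"]}
target_ids = {id(p) for p in policy.quantile_net_target.parameters()}
```

In this sketch the optimizer built from `online_params` holds no reference to any target-network tensor, which is the property the commit enforces.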
