Commit ed13980

Delete probability normalization in play_action for actor critic.
Renamed my_main.py to main.py.

1 parent 03b5b57 commit ed13980

File tree

3 files changed (+5 −6 lines)

README.md

Lines changed: 4 additions & 4 deletions
````diff
@@ -20,23 +20,23 @@ or download the zip from github and extract it.
 
 If you want to train a simple DQN model just use "--train"
 ```commandline
-python my_main.py --train
+python main.py --train
 ```
 
 to use Prioritized experience replay:
 ```commandline
-python my_main.py --train --per
+python main.py --train --per
 ```
 
 to use actor critic instead of dqn:
 ```commandline
-python my_main.py --train --ac
+python main.py --train --ac
 ```
 
 If you want to test a model:
 
 ```commandline
-python my_main.py --test --model model_file
+python main.py --test --model model_file
 ```
````
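
This hunk only renames the script in the usage examples. For context, a minimal sketch of the command-line wiring main.py presumably implements, reconstructed purely from the flags shown above (argparse, the parser description, and the dispatch stubs are assumptions, not the repository's actual code):

```python
# Hypothetical reconstruction of main.py's CLI from the README flags alone.
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Train or test a DQN / actor-critic agent."
    )
    parser.add_argument("--train", action="store_true", help="train a model")
    parser.add_argument("--test", action="store_true", help="test a saved model")
    parser.add_argument("--per", action="store_true",
                        help="use prioritized experience replay")
    parser.add_argument("--ac", action="store_true",
                        help="use actor-critic instead of DQN")
    parser.add_argument("--model", type=str,
                        help="path to a saved model file (used with --test)")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    # Dispatch to the repository's (assumed) training/testing entry points.
    if args.train:
        pass  # e.g. run training, optionally with args.per / args.ac
    elif args.test:
        pass  # e.g. load args.model and evaluate it
```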

agents.py

Lines changed: 1 addition & 2 deletions
```diff
@@ -84,8 +84,7 @@ def __init__(self, hidden_layers_actor, hidden_layers_critic, state_spec, action
     # Playing action by following the policy (output of the actor network)
     def play_action(self, state):
         probabilities = self.actor_network(np.atleast_2d(state))
-        selection_probabilities = probabilities[0] / np.sum(probabilities[0])
-        action = np.random.choice(self.actor_network.output_shape[1], p=selection_probabilities)
+        action = np.random.choice(self.actor_network.output_shape[1], p=probabilities[0])
         return action
 
     def play_and_train(self, state, env, gamma):
```
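
The deleted division is presumably redundant because a softmax output layer already returns a valid probability distribution. A minimal numpy-only sketch of that reasoning (the real actor_network is assumed to be a model ending in a softmax; the softmax here is simulated):

```python
# Self-contained sketch: a softmax output already sums to 1, so dividing by
# the sum again (the deleted line) is a no-op for a softmax-headed actor.
import numpy as np


def softmax(logits: np.ndarray) -> np.ndarray:
    # Numerically stable softmax: non-negative outputs that sum to 1.
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / np.sum(exp)


rng = np.random.default_rng(0)
logits = rng.normal(size=4)      # stand-in for the actor's pre-softmax output
probabilities = softmax(logits)  # what a softmax output layer would produce

assert np.isclose(np.sum(probabilities), 1.0)  # already normalized

# Sample an action index directly, as the updated play_action does.
action = rng.choice(len(probabilities), p=probabilities)
print(action)
```

One caveat: numpy's choice raises a ValueError when p drifts from summing to 1 beyond a small tolerance, so dropping the renormalization relies on the network's floating-point output staying well normalized.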

my_main.py renamed to main.py

File renamed without changes.
