PPORecurrent mini batch size inconsistent #113

Description

@b-vm

I am using PPORecurrent with the RecurrentDictRolloutBuffer. In #103 it is mentioned that the mini batch size is intended to be constant. However, this does not seem to be the case.

I ran some experiments in which I print the action batch sizes returned by the _get_samples() function.
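For reference, here is a minimal sketch of how I print them. The subclass name and the wiring are mine, not part of sb3-contrib, and the arguments are forwarded blindly so the sketch does not depend on the exact `_get_samples()` signature of a given version:

```python
from sb3_contrib.common.recurrent.buffers import RecurrentDictRolloutBuffer

class DebugRolloutBuffer(RecurrentDictRolloutBuffer):
    """Hypothetical debugging subclass: prints the realized mini batch size."""

    def _get_samples(self, *args, **kwargs):
        samples = super()._get_samples(*args, **kwargs)
        # samples.actions has shape (batch_size, action_dim) after flattening
        print("action batch size:", samples.actions.shape[0])
        return samples

# Quick-and-dirty wiring after the model has been created:
# model.rollout_buffer.__class__ = DebugRolloutBuffer
```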

Exp 1: batch_size < sequence length

I initialize PPORecurrent with batch_size = 10 and sample only 2000 steps for debugging purposes.
The mean sequence length is 32.3.
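Roughly, the setup looks like this (a minimal sketch: the environment and policy name are placeholders for my actual setup, and I map the 2000 sampled steps to n_steps; note the class is named RecurrentPPO in sb3-contrib):

```python
from sb3_contrib import RecurrentPPO

# env is assumed to be a Dict-observation environment, so RecurrentPPO
# internally selects the RecurrentDictRolloutBuffer.
model = RecurrentPPO(
    "MultiInputLstmPolicy",
    env,
    n_steps=2000,   # sampling only 2000 steps for debugging
    batch_size=10,
)
model.learn(total_timesteps=2000)
```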

[screenshot: per-minibatch action batch sizes printed from _get_samples()]

It looks like whenever n_seq = 2, i.e. the mini batch is assembled from 2 different sequences, the batch size is higher than the specified 10.
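My working hypothesis (assuming the buffer keeps sequences whole and pads them to the longest fragment within each mini batch, as the padding helpers in the recurrent buffers suggest) is that the padded total then exceeds the requested size. A toy illustration with made-up fragment lengths:

```python
import numpy as np

# Made-up example: 10 sampled indices happen to span two sequence
# fragments, of lengths 8 and 2. Padding both to the longest fragment
# yields 2 * 8 = 16 samples instead of the requested 10.
fragment_lengths = np.array([8, 2])
padded_batch_size = len(fragment_lengths) * fragment_lengths.max()
print(padded_batch_size)  # -> 16
```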

Exp 2: batch_size > sequence length

I initialize PPORecurrent with batch_size = 100.
The mean sequence length this time is 31.7.

[screenshot: per-minibatch action batch sizes printed from _get_samples()]

The batch size is now always higher than the specified 100, and different every time.
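The same padding arithmetic would explain this, assuming each mini batch of 100 indices spans several fragments of uneven length (with a mean sequence length around 32, roughly four of them); padding to the longest fragment then gives a different total above 100 each time. Another made-up example:

```python
import numpy as np

# Made-up split of 100 sampled indices across four sequence fragments:
fragment_lengths = np.array([32, 32, 32, 4])
print(len(fragment_lengths) * fragment_lengths.max())  # 4 * 32 = 128 > 100
```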

Is this the intended behavior? It seems wrong to me, since the aforementioned issue states explicitly that the batch size is intended to be constant:

Actually no, the main reason is that you want to keep a mini batch size constant (otherwise you will need to adjust the learning rate for instance).

Also, while we are at it: what is the reason for implementing mini batching instead of feeding batches of whole sequences to the model?

System Info
Describe the characteristics of your environment:

  • installed with pip
  • Stable-Baselines3 1.6.0
  • sb3-contrib 1.6.0
  • Python 3.8.0
  • PyTorch 1.12.1+cu116
  • Gym 0.21.0
