Description
I am using PPORecurrent with the RecurrentDictRolloutBuffer. In #103 it is mentioned that the batch size is intended to be constant; however, this does not seem to be the case.
I ran some experiments in which I print the action batch sizes returned by the _get_samples() function.
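For reference, here is a minimal sketch of how I print those sizes. The CartPole env and MlpLstmPolicy are stand-ins for brevity (my actual env has a Dict observation space, hence the RecurrentDictRolloutBuffer), and the wrapper around _get_samples is only there for logging:

```python
import gym
from sb3_contrib import RecurrentPPO

# Stand-in env/policy; my real setup uses a Dict observation space,
# which is why the buffer is a RecurrentDictRolloutBuffer.
env = gym.make("CartPole-v1")
model = RecurrentPPO("MlpLstmPolicy", env, batch_size=10, verbose=0)

buffer = model.rollout_buffer
original_get_samples = buffer._get_samples

def logged_get_samples(*args, **kwargs):
    # Forward to the original implementation and log the resulting batch size.
    samples = original_get_samples(*args, **kwargs)
    print("action batch size:", samples.actions.shape[0])
    return samples

buffer._get_samples = logged_get_samples
model.learn(total_timesteps=2000)
```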
Exp 1: batch_size < sequence length
Initialize PPORecurrent with batch_size = 10. I am sampling only 2000 steps for debugging purposes.
The mean sequence length is 32.3. It looks like whenever n_seq = 2, i.e. two different sequences are appended, the batch size is higher than the specified 10.
Exp 2: batch_size > sequence length
Initialize PPORecurrent with batch_size = 100
The mean sequence length this time is 31.7. The batch size is now always higher than the specified 100, and it is different every time.
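For what it's worth, my reading of _get_samples (please correct me if I am wrong) is that the sampled indices are grouped into the sequences they belong to and then padded to the length of the longest one, so the effective batch size becomes n_seq * max_length rather than batch_size. A toy illustration with made-up sequence lengths (not numbers from my runs) that matches what I observe in both experiments:

```python
# Toy illustration only: the sequence lengths below are made up, not from my runs.
def padded_batch_size(seq_lengths_in_batch):
    # If each mini-batch is padded so every sequence has the length of the
    # longest one, the effective batch size is n_seq * max_length.
    return len(seq_lengths_in_batch) * max(seq_lengths_in_batch)

# Exp 1: batch_size = 10, the sampled indices happen to span two sequences (n_seq = 2)
print(padded_batch_size([4, 6]))             # 2 * 6  = 12  > 10

# Exp 2: batch_size = 100, the sampled indices span four sequences of ~32 steps
print(padded_batch_size([32, 30, 28, 10]))   # 4 * 32 = 128 > 100
```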
Is this the intended behavior? It seems wrong to me, since the aforementioned issue states explicitly that the batch size is intended to be constant:
"Actually no, the main reason is that you want to keep a mini batch size constant (otherwise you will need to adjust the learning rate for instance)."
Also, while we are at it: what is the reason for implementing mini-batching instead of feeding batches of whole sequences to the model?
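To clarify what I mean by the latter, here is a rough sketch (purely illustrative shapes and dimensions, not code from sb3-contrib): pad a handful of complete episodes to a common length and push them through the LSTM in one call.

```python
import torch
from torch.nn.utils.rnn import pad_sequence

obs_dim, hidden_dim = 8, 64

# Three whole episodes of different lengths (made-up numbers).
episodes = [torch.randn(length, obs_dim) for length in (28, 35, 30)]

padded = pad_sequence(episodes, batch_first=True)        # (n_seq, max_len, obs_dim)
lstm = torch.nn.LSTM(obs_dim, hidden_dim, batch_first=True)
output, _ = lstm(padded)                                  # (n_seq, max_len, hidden_dim)
print(output.shape)                                       # torch.Size([3, 35, 64])
```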
System Info
Describe the characteristics of your environment:
- installed with pip
- Stable-Baselines3 1.6.0
- sb3-contrib 1.6.0
- GPU models and configuration
- Python 3.8.0
- PyTorch 1.12.1+cu116
- Gym 0.21.0