add training code and pre-trained models
Showing 116 changed files with 2,221 additions and 326 deletions.
@@ -0,0 +1,72 @@
## Getting Started with AlphAction

The hyper-parameters of each experiment are controlled by
a `.yaml` config file located in the `config_files` directory.
All of these config files assume training on 8 GPUs.
Create a symbolic link at `data/output`, where the output
(logs and checkpoints) will be saved. We also recommend creating
a `models` directory to hold model weights and linking it at `data/models`.
Both can be set up with the following commands.

```shell
mkdir -p /path/to/output
ln -s /path/to/output data/output
mkdir -p /path/to/models
ln -s /path/to/models data/models
```

### Training

Download the pre-trained models from [MODEL_ZOO.md](MODEL_ZOO.md#pre-trained-models).
Then place them in the `data/models` directory with the following structure:

```
models/
|_ pretrained_models/
|  |_ SlowFast-ResNet50-4x16.pth
|  |_ SlowFast-ResNet101-8x8.pth
```

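Before starting a long training run, it can help to confirm that the weights are where the code expects them. The following is a minimal sketch, not part of the repository: the helper name `missing_pretrained` is hypothetical, and the file names are copied from the directory tree above.

```python
from pathlib import Path

# File names taken from the directory tree above.
EXPECTED_PRETRAINED = (
    "SlowFast-ResNet50-4x16.pth",
    "SlowFast-ResNet101-8x8.pth",
)


def missing_pretrained(models_dir="data/models"):
    """Return the expected pre-trained weight files that are not present.

    Hypothetical helper: it looks only in <models_dir>/pretrained_models.
    """
    pretrained_dir = Path(models_dir) / "pretrained_models"
    return [
        name
        for name in EXPECTED_PRETRAINED
        if not (pretrained_dir / name).is_file()
    ]
```

Calling `missing_pretrained()` from the repository root returns an empty list when both files are in place. If you only train one backbone, only the corresponding file matters.
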
To train on a single GPU, run the following command. The
argument `--use-tfboard` enables TensorBoard logging of the training process.
Because the config files assume 8 GPUs, the global
batch sizes `SOLVER.VIDEOS_PER_BATCH` and `TEST.VIDEOS_PER_BATCH` can
be too large for a single GPU. The command therefore reduces the
batch size and adjusts the learning rate and schedule
length according to the linear scaling rule.

```shell
python train_net.py --config-file "path/to/config/file.yaml" \
--transfer --no-head --use-tfboard \
SOLVER.BASE_LR 0.000125 \
SOLVER.STEPS '(560000, 720000)' \
SOLVER.MAX_ITER 880000 \
SOLVER.VIDEOS_PER_BATCH 2 \
TEST.VIDEOS_PER_BATCH 2
```

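The linear scaling rule used above can be sketched in a few lines: when the batch size shrinks by some factor, the learning rate shrinks by the same factor and the schedule (step milestones and total iterations) grows by it. The helper below is an illustrative sketch, not code from the repository. The 8-GPU reference values in the usage note are assumptions chosen so that the result matches the single-GPU command; check your own config file for the actual values.

```python
def scale_schedule(base_lr, steps, max_iter, ref_batch, new_batch):
    """Apply the linear scaling rule to solver settings.

    The learning rate scales proportionally to the batch size, while the
    step milestones and the total iteration count scale inversely.
    """
    ratio = new_batch / ref_batch
    return {
        "SOLVER.BASE_LR": base_lr * ratio,
        "SOLVER.STEPS": tuple(round(s / ratio) for s in steps),
        "SOLVER.MAX_ITER": round(max_iter / ratio),
        "SOLVER.VIDEOS_PER_BATCH": new_batch,
    }
```

For example, assuming a reference config with a batch size of 16, a learning rate of 0.001, steps at (70000, 90000), and 110000 iterations (hypothetical values), `scale_schedule(0.001, (70000, 90000), 110000, ref_batch=16, new_batch=2)` yields the overrides passed in the command above. `TEST.VIDEOS_PER_BATCH` is not covered by this sketch and is set separately.
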
For distributed training on multiple GPUs, we use the launch utility
`torch.distributed.launch` to start one process per GPU. Replace `GPU_NUM`
with the number of GPUs to use. Hyper-parameters in the config file
can still be overridden on the command line, in the same way as for single-GPU training.

```shell
python -m torch.distributed.launch --nproc_per_node=GPU_NUM \
train_net.py --config-file "path/to/config/file.yaml" \
--transfer --no-head --use-tfboard
```

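If you launch runs from a script, the command line can be assembled programmatically, with config overrides appended as trailing `KEY VALUE` pairs just as in the single-GPU command. The helper below is a hypothetical sketch, not part of the repository; it only builds the argument list and uses no flags beyond those shown in this guide.

```python
import sys


def build_launch_command(gpu_num, script, config_file, flags=(), overrides=None):
    """Build the argument list for a torch.distributed.launch invocation.

    `overrides` maps config keys (e.g. "SOLVER.VIDEOS_PER_BATCH") to values;
    they are appended as trailing KEY VALUE pairs.
    """
    cmd = [
        sys.executable, "-m", "torch.distributed.launch",
        f"--nproc_per_node={gpu_num}",
        script, "--config-file", config_file,
        *flags,
    ]
    for key, value in (overrides or {}).items():
        cmd += [key, str(value)]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Any override values you pass, such as a smaller `SOLVER.VIDEOS_PER_BATCH`, are your own choice and should follow the linear scaling rule described earlier.
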
### Inference

To run inference on multiple GPUs, use the following command. Note that
the code first tries to load the `last_checkpoint` in `OUTPUT_DIR`. If there
is no such file in `OUTPUT_DIR`, it then loads the model from the
path specified in `MODEL.WEIGHT`. To run inference with `MODEL.WEIGHT`,
make sure that there is no `last_checkpoint` in `OUTPUT_DIR`.
You can download the model weights from [MODEL_ZOO.md](MODEL_ZOO.md#ava-models).

```shell
python -m torch.distributed.launch --nproc_per_node=GPU_NUM \
test_net.py --config-file "path/to/config/file.yaml" \
MODEL.WEIGHT "path/to/model/weight"
```
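
The checkpoint lookup order described above can be illustrated with a short sketch. This is not the repository's implementation: the function name `resolve_weight` is hypothetical, and it assumes that `last_checkpoint` is a plain-text file containing the path of the most recently saved checkpoint.

```python
from pathlib import Path


def resolve_weight(output_dir, model_weight):
    """Return the checkpoint path that would be loaded for inference.

    Lookup order, as described in the text:
      1. the checkpoint referenced by `last_checkpoint` in OUTPUT_DIR,
      2. otherwise the path given in MODEL.WEIGHT.
    """
    marker = Path(output_dir) / "last_checkpoint"
    if marker.is_file():
        # Assumption: the marker stores the checkpoint path as plain text.
        return marker.read_text().strip()
    return model_weight
```

A check like this makes it easy to spot a stale `last_checkpoint` in `OUTPUT_DIR` that would silently take precedence over a downloaded model.
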
@@ -0,0 +1,22 @@
## AlphAction Model Zoo

### Pre-trained Models

We provide backbone models pre-trained on the Kinetics dataset, intended for further
fine-tuning on the AVA dataset. The reported accuracies are obtained by 30-view testing.

| backbone | pre-train | frame length | sample rate | top-1 | top-5 | model |
| ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |
| SlowFast-R50 | Kinetics-700 | 4 | 16 | 66.34 | 86.66 | [[link]](https://drive.google.com/file/d/1bNcF295jxY4Zbqf0mdtsw9QifpXnvOyh/view?usp=sharing) |
| SlowFast-R101 | Kinetics-700 | 8 | 8 | 69.32 | 88.84 | [[link]](https://drive.google.com/file/d/1v1FdPUXBNRj-oKfctScT4L4qk8L1k3Gg/view?usp=sharing) |

### AVA Models

| config | backbone | IA structure | mAP | mAP in paper | model |
| ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |
| [resnet50_4x16f_baseline](config_files/resnet50_4x16f_baseline.yaml) | SlowFast-R50-4x16 | w/o | 26.7 | 26.5 | [[link]](https://drive.google.com/file/d/1_yAxk6R58Dn6IjBCx-WBwCQEf_582Vuv/view?usp=sharing) |
| [resnet50_4x16f_parallel](config_files/resnet50_4x16f_parallel.yaml) | SlowFast-R50-4x16 | Parallel | 29.0 | 28.9 | [[link]](https://drive.google.com/file/d/13iDNnkxjDqo8OuEhnHFe3P-fERHTbFaD/view?usp=sharing) |
| [resnet50_4x16f_serial](config_files/resnet50_4x16f_serial.yaml) | SlowFast-R50-4x16 | Serial | 29.8 | 29.6 | [[link]](https://drive.google.com/file/d/1S6NIPQ8NoZpzOKkHjzdpFVOtsU6GjqIv/view?usp=sharing) |
| [resnet50_4x16f_denseserial](config_files/resnet50_4x16f_denseserial.yaml) | SlowFast-R50-4x16 | Dense Serial | 30.0 | 29.8 | [[link]](https://drive.google.com/file/d/1OZmlA6V6XoWEA_usyijUREOYujzYL_kP/view?usp=sharing) |
| [resnet101_8x8f_baseline](config_files/resnet101_8x8f_baseline.yaml) | SlowFast-R101-8x8 | w/o | 29.3 | 29.3 | [[link]](https://drive.google.com/file/d/1GC56oNEX00oEH8aiGYdFMKENdWf2VvAY/view?usp=sharing) |
| [resnet101_8x8f_denseserial](config_files/resnet101_8x8f_denseserial.yaml) | SlowFast-R101-8x8 | Dense Serial | 32.4 | 32.3 | [[link]](https://drive.google.com/file/d/1DKHo0XoBjrTO2fHTToxbV0mAPzgmNH3x/view?usp=sharing) |