Commit

zchoi committed Aug 27, 2022
1 parent 09487cd commit c13afe3
Showing 5 changed files with 9 additions and 12 deletions.
1 change: 0 additions & 1 deletion .gitignore
Original file line number Diff line number Diff line change
@@ -3,7 +3,6 @@
/m2_annotations
evaluation/spice/*
*.pyc
*.jar
/saved_transformer_models
/tensorboard_logs
/visualization
12 changes: 5 additions & 7 deletions README.md
Original file line number Diff line number Diff line change
@@ -36,11 +36,10 @@ python -m spacy download en_core_web_md

## Data Preparation

* **Annotation**. Download the annotation file [annotation.zip](https://drive.google.com/file/d/1Zc2P3-MIBg3JcHT1qKeYuQt9CnQcY5XJ/view?usp=sharing) [1]. Extract and put it in the project root directory.
* **Feature**. Download the processed image features [ResNeXt-101](https://stduestceducn-my.sharepoint.com/:f:/g/personal/zhn_std_uestc_edu_cn/EssZY4Xdb0JErCk0A1Yx3vUBaRbXau88scRvYw4r1ZuwPg?e=f2QFGp) and [ResNeXt-152](https://stduestceducn-my.sharepoint.com/:f:/g/personal/zhn_std_uestc_edu_cn/EssZY4Xdb0JErCk0A1Yx3vUBaRbXau88scRvYw4r1ZuwPg?e=f2QFGp) [2], and put them in the project root directory.
* **Annotation**. Download the annotation file [m2_annotations](https://drive.google.com/file/d/12EdMHuwLjHZPAMRJNrt3xSE2AMf7Tz8y/view?usp=sharing) [1]. Extract and put it in the project root directory.
* **Feature**. Download the processed image features [ResNeXt-101](https://pan.baidu.com/s/1avz9zaQ7c36XfVFK3ZZZ5w) and [ResNeXt-152](https://pan.baidu.com/s/1avz9zaQ7c36XfVFK3ZZZ5w) [2] (access code `9vtB`), and put them in the project root directory.
<!-- * **Evaluation**. Download the evaluation tools [here](https://pan.baidu.com/s/1xVZO7t8k4H_l3aEyuA-KXQ). Access code: jcj6. Extract and put it in the project root directory. -->
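Once the archives are extracted, it can help to sanity-check the layout before launching training. A minimal sketch (the `missing_data` helper is hypothetical; the expected names are inferred from this README and the default paths in `test_transformer.py`):

```python
from pathlib import Path

# Expected layout after extraction (names assumed from the README and the
# default --features_path / --annotation_folder values in test_transformer.py).
EXPECTED = [
    "m2_annotations",
    "X101-features/X101_grid_feats_coco_trainval.hdf5",
]

def missing_data(root="."):
    """Return the expected annotation/feature paths that are absent under root."""
    return [p for p in EXPECTED if not (Path(root) / p).exists()]
```

If `missing_data()` returns a non-empty list from the project root, the corresponding download or extraction step above is incomplete.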


## Training
Run `python train_transformer.py` using the following arguments:
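As an illustration, a typical invocation can be composed as below. This is a sketch only: the flag names are assumptions mirroring the parser in `test_transformer.py`, not a verified argument list; consult `python train_transformer.py --help` for the authoritative flags.

```python
import shlex

# Hypothetical training command; --batch_size, --features_path and
# --annotation_folder mirror the flags defined in test_transformer.py.
cmd = [
    "python", "train_transformer.py",
    "--batch_size", "50",
    "--features_path", "./X101-features/X101_grid_feats_coco_trainval.hdf5",
    "--annotation_folder", "./m2_annotations",
]
print(shlex.join(cmd))
```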

@@ -83,7 +82,6 @@ We provide pretrained model [here](https://drive.google.com/file/d/1Y133r4Wd9edi
| Reproduced Model (ResNext101) | 81.2 | 39.9 | 29.6 | 59.1 | 133.7 | 23.3|



### Online Evaluation
We also report the performance of our model on the online COCO test server with an ensemble of four S<sup>2</sup> models. The detailed online test code can be obtained in this [repo](https://github.com/zhangxuying1004/RSTNet).

@@ -95,8 +93,8 @@ Huang, and Rongrong Ji. Rstnet: Captioning with adaptive attention on visual and
### Citation
```
@inproceedings{S2,
author = {Pengpeng Zeng and
Haonan Zhang and
author = {Pengpeng Zeng* and
Haonan Zhang* and
Jingkuan Song and
Lianli Gao},
title = {S2 Transformer for Image Captioning},
@@ -107,4 +105,4 @@ Huang, and Rongrong Ji. Rstnet: Captioning with adaptive attention on visual and
```
## Acknowledgements
Thanks to Zhang _et al._ for releasing the visual features (ResNeXt-101 and ResNeXt-152). Our code implementation is also based on their [repo](https://github.com/zhangxuying1004/RSTNet).
Thanks also to [M<sup>2</sup> Transformer](https://github.com/aimagelab/meshed-memory-transformer) for the original annotations, and to [grid-feats-vqa](https://github.com/facebookresearch/grid-feats-vqa) for the effective visual representations.
Binary file added evaluation/meteor/meteor-1.5.jar
Binary file not shown.
Binary file added evaluation/stanford-corenlp-3.4.1.jar
Binary file not shown.
8 changes: 4 additions & 4 deletions test_transformer.py
Original file line number Diff line number Diff line change
@@ -49,12 +49,12 @@ def predict_captions(model, dataloader, text_field):
device = torch.device('cuda')

parser = argparse.ArgumentParser(description='Transformer')
parser.add_argument('--batch_size', type=int, default=10)
parser.add_argument('--workers', type=int, default=4)
parser.add_argument('--batch_size', type=int, default=50)
parser.add_argument('--workers', type=int, default=12)
parser.add_argument('--m', type=int, default=40)

parser.add_argument('--features_path', type=str, default='/home/zhanghaonan/RSTNet-master/X101-features/X101_grid_feats_coco_trainval.hdf5')
parser.add_argument('--annotation_folder', type=str, default='/home/zhanghaonan/RSTNet-master/m2_annotations')
parser.add_argument('--features_path', type=str, default='./X101-features/X101_grid_feats_coco_trainval.hdf5')
parser.add_argument('--annotation_folder', type=str, default='./m2_annotations')

# the path of tested model and vocabulary
parser.add_argument('--model_path', type=str, default='saved_transformer_models/demo_rl_v5_best_test.pth')
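The updated defaults above can be reproduced as a standalone sketch (defaults copied from the new values in this commit; the parser in the repo defines further training-specific options not shown here):

```python
import argparse

# Standalone reproduction of the argument parser from test_transformer.py,
# with the defaults introduced by this commit (batch_size 50, workers 12,
# and repo-relative data paths instead of absolute ones).
parser = argparse.ArgumentParser(description='Transformer')
parser.add_argument('--batch_size', type=int, default=50)
parser.add_argument('--workers', type=int, default=12)
parser.add_argument('--m', type=int, default=40)
parser.add_argument('--features_path', type=str,
                    default='./X101-features/X101_grid_feats_coco_trainval.hdf5')
parser.add_argument('--annotation_folder', type=str, default='./m2_annotations')
parser.add_argument('--model_path', type=str,
                    default='saved_transformer_models/demo_rl_v5_best_test.pth')

args = parser.parse_args([])  # parse empty argv to inspect the defaults
print(args.batch_size, args.features_path)
```

Switching to relative defaults means the script runs unmodified from the project root, rather than depending on one machine's absolute paths.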
