Beyond Audio and Pose: A General-Purpose Framework for Video Synchronization

PyTorch code for synchronizing two videos that are not aligned in time.

Requirements

# create conda env and install packages
conda create -y --name videosync_carl python=3.7.9
conda activate videosync_carl

conda install -y pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge
conda install -y conda-build ipython pandas scipy pip av ffmpeg -c conda-forge

pip install --upgrade pip
pip install -r requirements.txt

# The pins below work around dependency errors such as:
# AttributeError: module 'distutils' has no attribute 'version'
pip install protobuf==3.20.3
pip install setuptools==59.5.0 wandb av tensorflow-gpu==2.4.0 scikit-learn simplejson iopath easydict opencv-python matplotlib seaborn
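
After installation, it can help to confirm that the main packages import inside the new environment. The script below is a minimal sketch, not part of this repository: `check_packages` and the `PACKAGES` list are hypothetical names, and the list only mirrors a subset of the packages installed above.

```python
import importlib

# Import names for a subset of the packages installed above. "cv2" and
# "sklearn" are the import names of opencv-python and scikit-learn.
PACKAGES = ["torch", "torchvision", "torchaudio", "tensorflow",
            "av", "cv2", "sklearn", "wandb"]


def check_packages(names):
    """Map each import name to its version string, or None if it cannot be imported."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception:  # a broken install can raise more than ImportError
            report[name] = None
        else:
            # Some modules do not expose __version__; report "unknown" for those.
            report[name] = getattr(module, "__version__", "unknown")
    return report


if __name__ == "__main__":
    for name, version in check_packages(PACKAGES).items():
        print(f"{name:12s} {version if version is not None else 'MISSING'}")
```

Any package reported as MISSING points to an install step above that did not complete in the active environment.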

Preparing Data

Create a directory to store datasets:

mkdir /home/username/datasets

Download NTU dataset / pre-process

Download NTU videos from here. You need to request access to the dataset.

Download NTU-SYN annotations by following links from here.

python dataset_preparation/ntu_process.py

Download CMU Pose dataset / pre-process

sh dataset_preparation/download_cmu_script.sh
python dataset_preparation/cmu_process.py

Download CMU Multi human dataset / pre-process

sh dataset_preparation/download_cmu_multi_human_script.sh
python dataset_preparation/cmu_process.py

Download Pouring datasets / pre-process

sh dataset_preparation/download_pouring_data.sh
python dataset_preparation/tfrecords_to_videos.py

Training

See the ./configs directory for all config settings.

Training can be monitored on wandb.

Download ResNet50 pretrained with BYOL

Our ResNet50 backbone is initialized with weights trained by BYOL.

Download the pretrained weights from pretrained_models, and place them in /home/username/datasets/pretrained_models.

Pretraining on NTU

Download the NTU dataset and pre-process it using the steps above.

python -m torch.distributed.launch --nproc_per_node 1 train.py --workdir ~/datasets --cfg_file ./configs/scl_transformer_ntu_pretrain_config.yml --logdir ~/tmp/scl_transformer_ntu_pretrain_logs

Pretraining on Kinetics400

Download the K400 dataset from https://github.com/cvdfoundation/kinetics-dataset

python -m torch.distributed.launch --nproc_per_node 1 train.py --workdir ~/datasets --cfg_file ./configs/scl_transformer_k400_pretrain_config.yml --logdir ~/tmp/scl_transformer_k400_pretrain_logs

Checkpoints

We provide checkpoints trained with the CARL method at link.

Place these checkpoints at /home/username/tmp to evaluate them.

Evaluation

Run evaluation with the config that matches each dataset (NTU, CMU, Pouring):

python -m torch.distributed.launch --nproc_per_node 1 evaluate.py \
--workdir /data \
--cfg_file ./configs/scl_transformer_ntu_config.yml \
--logdir ~/tmp/scl_transformer_ntu_logs

python -m torch.distributed.launch --nproc_per_node 1 evaluate.py \
--workdir /data/ssd \
--cfg_file ./configs/scl_transformer_cmu_config.yml \
--logdir ~/tmp/scl_transformer_cmu_logs

python -m torch.distributed.launch --nproc_per_node 1 evaluate.py \
--workdir /data/ssd \
--cfg_file ./configs/scl_transformer_pouring_config.yml \
--logdir ~/tmp/scl_transformer_pouring_logs

Train and evaluate with sync offset classifier

Construct similarity matrices for each video pair.

python -m torch.distributed.launch --nproc_per_node 1 construct_softmaxed_sim_dataset.py \
--workdir /data/ssd \
--cfg_file ./configs/scl_transformer_cmu_config.yml \
--logdir ~/tmp/scl_transformer_cmu_logs \
--dataset_prefix cmu_pose_dataset_240_k400_pretrained
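
To illustrate what a softmaxed similarity matrix for a video pair can look like, here is a minimal pure-Python sketch. It assumes per-frame embeddings for each video, cosine similarity between frames, and a row-wise softmax with a temperature. These choices, and the names `cosine_similarity`, `softmax`, `softmaxed_similarity` and `temperature`, are assumptions for illustration; the exact computation in construct_softmaxed_sim_dataset.py may differ.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def softmax(row, temperature=0.1):
    """Numerically stable softmax over one row of similarities."""
    scaled = [x / temperature for x in row]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def softmaxed_similarity(emb_a, emb_b, temperature=0.1):
    """Rows index frames of video A, columns index frames of video B.

    Each row is a probability distribution over the frames of video B.
    """
    return [
        softmax([cosine_similarity(u, v) for v in emb_b], temperature)
        for u in emb_a
    ]


# Toy 2-D embeddings: video B shows the same motion as video A, one frame ahead.
emb_a = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
emb_b = [[0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
sim = softmaxed_similarity(emb_a, emb_b)
for row in sim:
    print(["%.3f" % p for p in row])
```

In this toy case the high-probability entries lie along an off-diagonal of the matrix, which is the kind of pattern a downstream offset classifier can pick up.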

Train sync offset classifier.

python train_sync_offset_detector.py --prefix cmu_pose_dataset_240_k400_pretrained --sync_methods log_reg svm mlp cnn

Evaluate sync offset classifier.

python eval_sync_offset_detector.py --model_prefix ntu --data_prefix cmu_pose_dataset_240_k400_pretrained --models mlp
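
For intuition about what the classifiers predict, the sketch below shows a simple non-learned baseline: it scans the diagonals of a similarity matrix and returns the frame offset whose diagonal has the highest mean similarity. This is not one of the models trained above (log_reg, svm, mlp, cnn); `estimate_offset` and `max_offset` are hypothetical names, and the matrix values are made up for the example.

```python
def estimate_offset(sim, max_offset):
    """Return the offset k in [-max_offset, max_offset] whose diagonal
    sim[i][i + k] has the highest mean value.

    Rows index frames of video A, columns index frames of video B, so a
    positive k means frame i of A matches frame i + k of B.
    """
    n_rows, n_cols = len(sim), len(sim[0])
    best_k, best_score = None, float("-inf")
    for k in range(-max_offset, max_offset + 1):
        # Collect the entries of diagonal k that fall inside the matrix.
        values = [sim[i][i + k] for i in range(n_rows) if 0 <= i + k < n_cols]
        if not values:
            continue
        score = sum(values) / len(values)
        if score > best_score:
            best_k, best_score = k, score
    return best_k


# Made-up 3 x 4 similarity matrix whose strongest entries lie on diagonal k = 1.
sim = [
    [0.1, 0.8, 0.1, 0.0],
    [0.0, 0.1, 0.8, 0.1],
    [0.1, 0.0, 0.1, 0.8],
]
print(estimate_offset(sim, max_offset=2))
```

A learned classifier can improve on this baseline when the similarity pattern is noisy or not a clean diagonal.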

Acknowledgment

The training setup code was modified from https://github.com/minghchen/CARL_code
