[VideoMAE V1] Reproduction Notes

VideoMAE:

[NeurIPS 2022] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training

Paper: https://arxiv.org/abs/2203.12602v3

Code: https://github.com/MCG-NJU/VideoMAE

Papers with Code: https://paperswithcode.com/paper/videomae-masked-autoencoders-are-data-1

Environment Setup

Follow INSTALL.md from the source repository:

# VideoMAE Installation

The codebase is mainly built with following libraries:

- Python 3.6 or higher

- [PyTorch](https://pytorch.org/) and [torchvision](https://github.com/pytorch/vision). <br>
  We can successfully reproduce the main results under two settings below:<br>
  Tesla **A100** (40G): CUDA 11.1 + PyTorch 1.8.0 + torchvision 0.9.0<br>
  Tesla **V100** (32G): CUDA 10.1 + PyTorch 1.6.0 + torchvision 0.7.0
  `conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=11.1 -c pytorch -c conda-forge`
- [timm==0.4.8/0.4.12](https://github.com/rwightman/pytorch-image-models)

- [deepspeed==0.5.8](https://github.com/microsoft/DeepSpeed)

  `DS_BUILD_OPS=1 pip install deepspeed`

- [TensorboardX](https://github.com/lanpa/tensorboardX)

- [decord](https://github.com/dmlc/decord)

- [einops](https://github.com/arogozhnikov/einops)

### Note:
 1. We recommend you to use **`PyTorch >= 1.8.0`**.
 2. We observed accidental interrupt in the last epoch when conducted the pre-training experiments on V100 GPUs (PyTorch 1.6.0). This interrupt is caused by the scheduler of learning rate. We naively set  `--epochs 801` to walk away from issue :)
  1. Create the environment
    Python 3.6 to 3.8 is recommended; with a newer Python version, installing torch fails.
conda create -n videomae python==3.8
conda activate videomae
  2. Install torch
    To make sure multi-GPU training works, install the same version the authors used, torch 1.8.0.
    The install commands are listed on the PyTorch previous-versions page:
# CUDA 11.1
pip install torch==1.8.0+cu111 torchvision==0.9.0+cu111 torchaudio==0.8.0 -f https://download.pytorch.org/whl/torch_stable.html

# CUDA 10.2
pip install torch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0
  3. Install timm
pip install timm==0.4.12
  4. Install deepspeed
    Running DS_BUILD_OPS=1 pip install deepspeed failed for me, so I installed deepspeed with plain pip instead; this worked in my tests.
pip install deepspeed
  5. Install the remaining libraries
pip install TensorboardX decord einops
  6. OpenCV is also required
pip install opencv-python
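
After installing everything, a quick check can confirm that the environment matches the versions pinned in the steps above. The following is only a minimal sketch: the `EXPECTED` table and the `check_environment` helper are names I made up for illustration, and packages the steps leave unpinned are only checked for presence. It relies on `importlib.metadata`, so it assumes Python 3.8 or newer.

```python
from importlib import metadata

# Versions pinned in the install steps; None means "any version is fine".
# This table is illustrative and is not part of the VideoMAE repository.
EXPECTED = {
    "torch": "1.8.0",
    "torchvision": "0.9.0",
    "timm": "0.4.12",
    "deepspeed": None,
    "tensorboardX": None,
    "decord": None,
    "einops": None,
    "opencv-python": None,
}


def check_environment(expected=EXPECTED):
    """Return {package: (installed_version_or_None, ok)} for each package."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, False)
            continue
        # Ignore local build tags such as "+cu111" when comparing versions.
        base = installed.split("+")[0]
        report[name] = (installed, wanted is None or base == wanted)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_environment().items():
        status = "OK " if ok else "BAD"
        print(f"{status} {name}: {installed or 'not installed'}")
```

Run it inside the activated `videomae` environment; any line marked `BAD` points to a missing package or a version that differs from the steps above.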

Modify pretrain.sh in the scripts directory as needed:

# Set the path to save checkpoints
OUTPUT_DIR='/data/wyyy/k400_videomae_pretrain_base_patch16_224_frame_16x4_tube_mask_ratio_0.9_e800'
# Set the path to Kinetics train set.
DATA_PATH='./train.csv'

# batch_size can be adjusted according to number of GPUs
# this script is for 64 GPUs (8 nodes x 8 GPUs)
OMP_NUM_THREADS=1 python -m torch.distributed.launch --nproc_per_node=8 \
        --master_port 12320 --nnodes=8  \
         --node_rank=$1 --master_addr=$2 \
        run_mae_pretraining.py \
        --data_path ${DATA_PATH} \
        --mask_type tube \
        --mask_ratio 0.9 \
        --model pretrain_videomae_base_patch16_224 \
        --decoder_depth 4 \
        --batch_size 32 \
        --num_frames 16 \
        --sampling_rate 4 \
        --opt adamw \
        --opt_betas 0.9 0.95 \
        --warmup_epochs 40 \
        --save_ckpt_freq 20 \
        --epochs 801 \
        --log_dir ${OUTPUT_DIR} \
        --output_dir ${OUTPUT_DIR}
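
DATA_PATH in the script points to an annotation file, train.csv. My understanding of the repository's dataset instructions is that each line holds a video path and an integer label separated by a single space, but please treat that format as an assumption and verify it against the repository. The sketch below builds such a file; the one-sub-folder-per-class layout and the function names `build_annotations` and `write_annotations` are hypothetical.

```python
from pathlib import Path


def build_annotations(video_root, suffix=".mp4"):
    """Return 'path label' lines for videos stored as <root>/<class>/<video>.

    Assumed format: one video per line, path and integer label separated by
    a single space. Labels are assigned by sorted class-folder name.
    """
    root = Path(video_root)
    classes = sorted(p.name for p in root.iterdir() if p.is_dir())
    lines = []
    for label, cls in enumerate(classes):
        for video in sorted((root / cls).glob(f"*{suffix}")):
            # A space inside the path would be confused with the delimiter.
            if " " in str(video):
                raise ValueError(f"path contains a space: {video}")
            lines.append(f"{video} {label}")
    return lines


def write_annotations(lines, out_path):
    """Write the annotation lines to out_path, one per line."""
    Path(out_path).write_text("\n".join(lines) + "\n")
```

For example, `write_annotations(build_annotations("/path/to/videos"), "train.csv")` would produce a file that DATA_PATH can point to, provided the format assumption holds.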

nnodes: number of machines
nproc_per_node: number of GPUs per machine
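
These two flags, together with --batch_size, determine the total batch size of a run. The sketch below does the arithmetic, assuming --batch_size is the per-GPU (per-process) value, which is how I read the script's comment about adjusting it to the number of GPUs. The function names are my own and do not come from the repository.

```python
def effective_batch_size(nnodes, nproc_per_node, batch_size):
    """Total samples per step, assuming batch_size is per GPU."""
    world_size = nnodes * nproc_per_node  # one process per GPU
    return world_size * batch_size


def per_gpu_batch_size(total, nnodes, nproc_per_node):
    """Per-GPU batch size needed to reach a given total batch size."""
    world_size = nnodes * nproc_per_node
    if total % world_size != 0:
        raise ValueError("total batch size must be divisible by the GPU count")
    return total // world_size


if __name__ == "__main__":
    # The script above: 8 nodes x 8 GPUs x 32 samples per GPU.
    print(effective_batch_size(8, 8, 32))   # 2048
    # Keeping the same total on a single 8-GPU machine.
    print(per_gpu_batch_size(2048, 1, 8))   # 256
```

A per-GPU batch of 256 is unlikely to fit in memory, so on a single machine the total batch size will usually end up smaller than in the original script. In that case it is worth checking how run_mae_pretraining.py derives the learning rate from the total batch size.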
