Installing SlowFast and Training on Your Own Dataset

Contents

Installing SlowFast

Training on your own dataset


 

Installing SlowFast

SlowFast homepage: https://github.com/facebookresearch/SlowFast

Follow INSTALL.md in that repository.

Installing into a virtual environment is recommended.

1. Create a virtual environment

conda create -n slowfast python=3.7

conda activate slowfast

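2. Install a specific PyTorch version. The documentation specifies PyTorch 1.3, but Detectron2, which is installed later, needs PyTorch 1.6 or newer, so install the PyTorch build that matches your CUDA version. For example: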

conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.1 -c pytorch
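Once the environment is set up, a quick check can confirm that the installed PyTorch meets the 1.6 minimum mentioned above and can see the GPU. The following is a minimal sketch; the helper names (parse_version, meets_minimum, check_torch) are made up for illustration and are not part of SlowFast or PyTorch.

```python
import re


def parse_version(version):
    """Turn a version string such as '1.6.0+cu101' into a tuple like (1, 6, 0)."""
    parts = []
    # Drop the local build suffix after '+', then keep the leading digits of each piece.
    for piece in version.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(version, minimum=(1, 6)):
    """Return True if the version is at least the given (major, minor) minimum."""
    return parse_version(version)[:len(minimum)] >= minimum


def check_torch(minimum=(1, 6)):
    """Describe the installed PyTorch, or say that it is missing."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"
    status = "ok" if meets_minimum(torch.__version__, minimum) else "older than required"
    return "torch {} ({}), CUDA available: {}".format(
        torch.__version__, status, torch.cuda.is_available()
    )


if __name__ == "__main__":
    print(check_torch())
```

If CUDA is reported as unavailable, the cudatoolkit version chosen at install time probably does not match the driver on the machine.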

3. Install the remaining libraries, following the documentation

pip install 'git+https://github.com/facebookresearch/fvcore'

pip install simplejson

conda install av -c conda-forge

conda install -c iopath iopath

pip install psutil

pip install opencv-python

conda install torchvision -c pytorch

pip install tensorboard

conda install -c conda-forge moviepy

pip install pytorchvideo

4. Install Detectron2

(1) Install Cython and pycocotools:

pip install cython 

pip install -U 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'

(2) Install Detectron2

python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'

5. Build pyslowfast

git clone https://github.com/facebookresearch/SlowFast

cd SlowFast

python setup.py build develop

Wait for the build to finish.
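After the build, it can be useful to confirm that every package from the steps above is importable from the active environment. The sketch below only looks the modules up and does not import them; the list uses import names rather than package names (for example cv2 for opencv-python), and missing_modules is a hypothetical helper name.

```python
import importlib.util

# Import names for the libraries installed in steps 2 to 5.
# "cv2" is the import name of opencv-python; "slowfast" exists only after step 5.
REQUIRED_MODULES = [
    "torch", "torchvision", "fvcore", "simplejson", "av", "iopath", "psutil",
    "cv2", "tensorboard", "moviepy", "pytorchvideo", "pycocotools",
    "detectron2", "slowfast",
]


def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Not found:", ", ".join(missing))
    else:
        print("All required modules were found.")
```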

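Training on your own dataset

1. Create the dataset

The raw data consists of video files. It must be preprocessed into the Charades dataset format, which means generating the frames and frame_lists directories.

The data directory is laid out as follows: data/charades/frames holds one folder of JPEG frames per video, and data/charades/frame_lists holds train.csv and val.csv.

(1) Extract frames from the videos

Use ffmpeg to extract frames in batch, at 24 frames per second.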

IN_DATA_DIR="/home/dataset/structuring/train5k_split_video" # directory of the original videos
OUT_DATA_DIR="/home/slowfast/data/charades/frames"         # directory for the extracted frames

if [[ ! -d "${OUT_DATA_DIR}" ]]; then
  echo "${OUT_DATA_DIR} doesn't exist. Creating it.";
  mkdir -p "${OUT_DATA_DIR}"
fi

# Quote the pattern so the shell does not expand it before find sees it
for video in $(find "${IN_DATA_DIR}/" -name "*.mp4")
#for video in $(ls -A1 -U ${IN_DATA_DIR}/*)
do
  video_name=${video##*/}

  if [[ $video_name = *".mp4" ]]; then
    video_name=${video_name::-4}
  else
    continue
  fi

  out_video_dir=${OUT_DATA_DIR}/${video_name}/
  mkdir -p "${out_video_dir}"

  out_name="${out_video_dir}/${video_name}-%06d.jpg"

  ffmpeg -i "${video}" -r 24 -q:v 1 "${out_name}"
done
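The same extraction can be driven from Python, which avoids the shell's word splitting on file names that contain spaces. This is a minimal sketch that reuses the ffmpeg options from the script above (-r 24 and -q:v 1); build_ffmpeg_command and extract_all are hypothetical helper names, and ffmpeg is assumed to be on the PATH.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(video, out_root, fps=24):
    """Return (output directory, ffmpeg argument list) for one video."""
    video = Path(video)
    out_dir = Path(out_root) / video.stem
    # Frames are named <video_name>-000001.jpg, <video_name>-000002.jpg, ...
    pattern = out_dir / "{}-%06d.jpg".format(video.stem)
    command = ["ffmpeg", "-i", str(video), "-r", str(fps), "-q:v", "1", str(pattern)]
    return out_dir, command


def extract_all(in_root, out_root, fps=24):
    """Extract frames for every .mp4 file found under in_root."""
    for video in sorted(Path(in_root).rglob("*.mp4")):
        out_dir, command = build_ffmpeg_command(video, out_root, fps)
        out_dir.mkdir(parents=True, exist_ok=True)
        subprocess.run(command, check=True)
```

Calling extract_all with the two directories used in the shell script would produce the same folder-per-video layout.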

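(2) Generate train.csv and val.csv

Convert the annotation files your dataset already has into the required format. My annotation files look like this:

They are shown only to illustrate the format. Other datasets may be organized differently, so adapt the conversion as needed.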

#train.txt
781218dfedcd54d8ea97a954177004ed	推广页,室外,中景,门口,动态,混剪,现代,喜悦,静态
3f7dd413d181f723b75e942ff3ffaf6f	推广页,家庭伦理,中景,手机电脑录屏,室内,教辅材料
11f2011efde567a96dca9f59e09ff211	多人情景剧,推广页,家,厌恶,家庭伦理,愤怒,拉近,中景

#label_id.txt
场景-其他	0
室内	1
家	2
室外	3
办公室	4
影棚幕布	5
学校	6
汽车内	7

 

import os

dataset_path = '/home/tione/notebook/slowfast/data/charades/frames'    # directory of extracted frames
label_path = '/home/tione/notebook/slowfast/data_file/train.txt'  # train.txt or val.txt
tag_id_file = '/home/tione/notebook/VideoStructuring/dataset/label_id.txt'
csv_path = '/home/tione/notebook/SlowFast/data/charades/frame_lists/train.csv'

if __name__ == '__main__':
    # Build the dictionary that maps each tag to its class id
    dict_categories = {}
    with open(tag_id_file, 'r', encoding='utf-8') as lnf:
        for line in lnf:
            tag, idx = line.strip().split('\t')
            dict_categories[tag] = int(idx)

    print(dict_categories)

    folders = []          # video names, one frame folder per video
    idx_categories = []   # list of class ids for each video
    with open(label_path, 'r', encoding='utf-8') as f:
        for line in f:
            line = line.rstrip()    # strip trailing whitespace, including the newline
            if not line:
                continue
            items = line.split('\t')
            folders.append(items[0])  # video file name
            idx_categories.append([dict_categories[tag] for tag in items[1].split(',')])

    assert len(idx_categories) == len(folders)

    with open(csv_path, 'w', encoding='utf-8') as csv_file:
        # The header is space-separated like the rows; the "vido" spelling is kept unchanged
        csv_file.write("original_vido_id video_id frame_id path labels\n")
        for j, (curFolder, curIDX) in enumerate(zip(folders, idx_categories)):
            img_dir = os.path.join(dataset_path, curFolder)
            # Sort the frames by the 6-digit index that precedes ".jpg"
            filen = sorted(os.listdir(img_dir), key=lambda x: int(x[-10:-4]))
            # Labels are written as a quoted, comma-separated list, e.g. "1,3"
            labels = '"' + ','.join(str(c) for c in curIDX) + '"'
            for k, h in enumerate(filen):
                row = [curFolder, str(j), str(k), os.path.join(curFolder, h), labels]
                csv_file.write(' '.join(row) + '\n')
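The conversion script stops with a KeyError as soon as an annotation uses a tag that is missing from label_id.txt, so it can help to check the two files against each other first. The sketch below assumes the tab-separated formats shown in the excerpts above; load_label_map, unknown_tags and num_classes are hypothetical helper names. If the class ids are contiguous and start at 0, the largest id plus one is the value to use for MODEL.NUM_CLASSES.

```python
def load_label_map(lines):
    """Parse 'tag<TAB>id' lines into a dictionary, skipping blanks and '#' comments."""
    label_map = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        tag, idx = line.split("\t")
        label_map[tag] = int(idx)
    return label_map


def unknown_tags(annotation_lines, label_map):
    """Return the sorted tags used in 'video<TAB>tag,tag,...' lines but absent from label_map."""
    unknown = set()
    for line in annotation_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        _, tags = line.split("\t")
        unknown.update(tag for tag in tags.split(",") if tag not in label_map)
    return sorted(unknown)


def num_classes(label_map):
    """Largest class id plus one, assuming ids are contiguous and start at 0."""
    return max(label_map.values()) + 1


if __name__ == "__main__":
    # Only the tags visible in the label_id.txt excerpt are used here.
    labels = load_label_map(["场景-其他\t0", "室内\t1", "家\t2", "室外\t3"])
    rows = ["781218dfedcd54d8ea97a954177004ed\t推广页,室外,中景"]
    print("classes:", num_classes(labels))
    print("tags missing from the label map:", unknown_tags(rows, labels))
```

With the full label_id.txt every tag should resolve; the example reports missing tags only because it loads a four-line excerpt.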

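In the resulting Charades-format csv file, each data row has five space-separated fields: original_vido_id, video_id, frame_id, path, and labels, where labels is a quoted, comma-separated list of class ids.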
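To make the row format concrete, the sketch below parses one row of the kind the generation script writes. It is an illustration of the format, not SlowFast's own loader; parse_frame_list_row is a hypothetical name and the label ids in the example row are placeholders.

```python
def parse_frame_list_row(line):
    """Split one frame-list row into its five fields and decode the label list."""
    fields = line.split()
    if len(fields) != 5:
        raise ValueError("expected 5 space-separated fields, got {}".format(len(fields)))
    original_vido_id, video_id, frame_id, path, labels = fields
    labels = labels.strip('"')  # labels are stored as "id,id,..." with no spaces
    return {
        "original_vido_id": original_vido_id,
        "video_id": int(video_id),
        "frame_id": int(frame_id),
        "path": path,
        "labels": [int(x) for x in labels.split(",")] if labels else [],
    }


if __name__ == "__main__":
    name = "781218dfedcd54d8ea97a954177004ed"
    row = '{0} 0 0 {0}/{0}-000001.jpg "1,3"'.format(name)
    print(parse_frame_list_row(row))
```

Because the fields are separated by whitespace, video names and label lists must not contain spaces.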

2. Modify the configuration file

Edit configs/Charades/SLOWFAST_16x8_R50.yaml as follows:

TRAIN:
  ENABLE: True   # must be True for training to run
  DATASET: charades
  BATCH_SIZE: 16
  EVAL_PERIOD: 6
  CHECKPOINT_PERIOD: 6
  AUTO_RESUME: True
  CHECKPOINT_FILE_PATH: './checkpoints/SLOWFAST_32x2_R101_50_50.pkl' # please download from the model zoo.
  CHECKPOINT_TYPE: caffe2
DATA:
  NUM_FRAMES: 64
  SAMPLING_RATE: 2
  TRAIN_JITTER_SCALES: [256, 340]
  TRAIN_CROP_SIZE: 224
  TEST_CROP_SIZE: 256
  INPUT_CHANNEL_NUM: [3, 3]
  MULTI_LABEL: True
  INV_UNIFORM_SAMPLE: True
  ENSEMBLE_METHOD: max
  REVERSE_INPUT_CHANNEL: True
  PATH_TO_DATA_DIR: '/home/SlowFast/data/charades/frame_lists'  # set this to your data path
  PATH_PREFIX: '/home/SlowFast/data/charades/frames'
SLOWFAST:
  ALPHA: 4
  BETA_INV: 8
  FUSION_CONV_CHANNEL_RATIO: 2
  FUSION_KERNEL_SZ: 7
RESNET:
  SPATIAL_STRIDES: [[1, 1], [2, 2], [2, 2], [2, 2]]
  SPATIAL_DILATIONS: [[1, 1], [1, 1], [1, 1], [1, 1]]
  ZERO_INIT_FINAL_BN: True
  WIDTH_PER_GROUP: 64
  NUM_GROUPS: 1
  DEPTH: 50
  TRANS_FUNC: bottleneck_transform
  STRIDE_1X1: False
  NUM_BLOCK_TEMP_KERNEL: [[3, 3], [4, 4], [6, 6], [3, 3]]
NONLOCAL:
  LOCATION: [[[], []], [[], []], [[], []], [[], []]]
  GROUP: [[1, 1], [1, 1], [1, 1], [1, 1]]
  INSTANTIATION: dot_product
BN:
  USE_PRECISE_STATS: True
  NUM_BATCHES_PRECISE: 200
  NORM_TYPE: sync_batchnorm
  NUM_SYNC_DEVICES: 2    # also set this to the number of GPUs
SOLVER:
  BASE_LR: 0.0375
  LR_POLICY: steps_with_relative_lrs
  LRS: [1, 0.1, 0.01, 0.001, 0.0001, 0.00001]
  STEPS: [0, 41, 49]
  MAX_EPOCH: 57
  MOMENTUM: 0.9
  WEIGHT_DECAY: 1e-4
  WARMUP_EPOCHS: 4.0
  WARMUP_START_LR: 0.0001
  OPTIMIZING_METHOD: sgd
MODEL:
  NUM_CLASSES: 82   # set to the number of classes in your dataset
  ARCH: slowfast
  LOSS_FUNC: bce_logit
  HEAD_ACT: sigmoid
  DROPOUT_RATE: 0.5
TEST:
  ENABLE: False  # set to False while training
  DATASET: charades
  BATCH_SIZE: 16
  NUM_ENSEMBLE_VIEWS: 10
  NUM_SPATIAL_CROPS: 3
DATA_LOADER:
  NUM_WORKERS: 4   # e.g. 4 with two GPUs, 8 with four GPUs
  PIN_MEMORY: True
NUM_GPUS: 2        # number of GPUs to use
NUM_SHARDS: 1
RNG_SEED: 0
OUTPUT_DIR: .
LOG_MODEL_INFO: False
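The SOLVER block is easier to adjust once the schedule it describes is spelled out. Under a step policy with relative learning rates, one reasonable reading is that the stage starting at STEPS[i] trains with BASE_LR * LRS[i]. The sketch below follows that reading only; it ignores warmup (WARMUP_EPOCHS, WARMUP_START_LR), is not SlowFast's implementation, and relative_step_lr is a hypothetical name.

```python
def relative_step_lr(epoch, base_lr, lrs, steps):
    """Learning rate at a given epoch: base_lr scaled by the factor of the current stage."""
    stage = 0
    for i, start_epoch in enumerate(steps):
        if epoch >= start_epoch:
            stage = i  # the last stage whose start epoch has been reached
    return base_lr * lrs[stage]


if __name__ == "__main__":
    # Values copied from the SOLVER block of the config above.
    base_lr = 0.0375
    lrs = [1, 0.1, 0.01, 0.001, 0.0001, 0.00001]
    steps = [0, 41, 49]
    for epoch in (0, 40, 41, 48, 49, 56):
        print(epoch, relative_step_lr(epoch, base_lr, lrs, steps))
```

Under this reading, only the first three LRS entries take effect because STEPS has three entries. If MAX_EPOCH is shortened for a small dataset, STEPS probably needs to shrink with it so that the later stages are still reached.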

3. Train

python tools/run_net.py --cfg configs/Charades/SLOWFAST_16x8_R50.yaml
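SlowFast's README also shows config keys being overridden on the command line as KEY VALUE pairs placed after --cfg, which avoids editing the YAML file for every machine. The sketch below only assembles such a command; build_train_command is a hypothetical helper and the override values are examples.

```python
import subprocess


def build_train_command(cfg_path, overrides=None):
    """Build the run_net.py argument list, appending KEY VALUE override pairs."""
    command = ["python", "tools/run_net.py", "--cfg", cfg_path]
    for key, value in (overrides or {}).items():
        command += [key, str(value)]
    return command


if __name__ == "__main__":
    command = build_train_command(
        "configs/Charades/SLOWFAST_16x8_R50.yaml",
        {"NUM_GPUS": 2, "DATA.PATH_TO_DATA_DIR": "/home/SlowFast/data/charades/frame_lists"},
    )
    print(" ".join(command))
    # To launch training from the SlowFast directory, uncomment the next line:
    # subprocess.run(command, check=True)
```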

 
