Learning Spatiotemporal Features with 3D Convolutional Networks (C3D User Guide)


1. C3D Feature Extraction

1. Install C3D (installation is the same as installing Caffe).
2. Download the pre-trained model and save it in YOUR_C3D_HOME/examples/c3d_feature_extraction.
+ Change directory to YOUR_C3D_HOME/examples/c3d_feature_extraction.
+ Run the script: sh c3d_sport1m_feature_extraction_frm.sh or sh c3d_sport1m_feature_extraction_video.sh.
If the example runs successfully, you should find the extracted features in the output folders.
If you see an "out of memory" error, reduce the batch size.
If feature extraction works with frame input but fails with video input, the likely cause is the video codecs; make sure you compiled OpenCV and FFmpeg with the shared flags turned on.
+ When using C3D as a feature extractor, make sure "shuffle: false" is set in the input layer. This preserves the correspondence between input clips and output features.
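Putting these steps together, a minimal walk-through could look like the sketch below (the model and prototxt file names are the ones used in the example command later in this guide; the grep check assumes the flag is written explicitly in the prototxt):

cd YOUR_C3D_HOME/examples/c3d_feature_extraction
ls conv3d_deepnetA_sport1m_iter_1900000                                  # the pre-trained model from step 2
grep -n "shuffle" prototxt/c3d_sport1m_feature_extractor_frm.prototxt    # expect shuffle: false
sh c3d_sport1m_feature_extraction_frm.sh                                 # frame input
# or: sh c3d_sport1m_feature_extraction_video.sh                         # video input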

Extracting C3D features for your own video files and frame files

a. Prepare your input files
C3D accepts input either as sequences of frames or as video files. In the frame case, C3D assumes each video is a folder containing frames numbered from 1 to N (the number of frames). The frame name format is "video_folder/%06d.jpg".
Note: with frames as input, numbering starts from 1; with videos as input, the starting frame is counted from 0.
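If your data is in video files and you want to use frames instead, one possible way to produce the layout above is FFmpeg, which numbers output images from 1 by default (myvideo.avi and video_folder are placeholder names, not part of C3D):

mkdir -p video_folder
ffmpeg -i myvideo.avi -q:v 2 video_folder/%06d.jpg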

b. Prepare the setting files:
There are two setting files to prepare: 1. the input list and 2. the output prefix.
In the example they are input_list_frm.txt, input_list_video.txt, and output_list_prefix.txt in YOUR_C3D_HOME/examples/c3d_feature_extraction/prototxt.
The input list file is a text file where each line describes one clip you are feeding into C3D for feature extraction. The format is:

<string_path> <starting_frame> <label>

Each line of the output prefix file corresponds to the same line of the input list file.
C3D saves features as output_prefix.[feature_name] (e.g. prefix.fc6). It is recommended to create one output folder per video and to format the prefix lines as sprintf("output_folder/%06d", starting_frame).
This means each clip is identified by its starting frame, with a file extension per feature. Remember to create the output folders yourself, because C3D will not create them.
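As a concrete illustration, a hypothetical input list and the matching output prefix file for two 16-frame clips of one video could look like this (paths, labels, and the clip stride are placeholders chosen for the example):

cat > prototxt/my_input_list_frm.txt << 'EOF'
/data/frames/video1/ 1 0
/data/frames/video1/ 17 0
EOF
cat > prototxt/my_output_prefix.txt << 'EOF'
output/video1/000001
output/video1/000017
EOF
mkdir -p output/video1    # C3D will not create the output folders for you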

c. Extract C3D features.
Assuming the setting files are ready, you then need to modify the prototxt file so that it points to your input list file. In the prototxt file, look for:
source: "prototxt/input_list_frm.txt"
To use frames as input: set use_image: true.
To use videos as input: set use_image: false.
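One way to make this edit non-interactively is a sed one-liner such as the following sketch (the prototxt name is the one from the example; my_input_list_frm.txt is a placeholder):

sed -i 's|prototxt/input_list_frm.txt|prototxt/my_input_list_frm.txt|' prototxt/c3d_sport1m_feature_extractor_frm.prototxt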
Use the extract_image_features tool to extract features. The tool takes the following arguments:

extract_image_features.bin <feature_extractor_prototxt> <c3d_pre_trained_model> <gpu_id> <mini_batch_size> <number_of_mini_batches> <output_prefix_file> <feature_name1> <feature_name2> ...

+ <feature_extractor_prototxt>: the prototxt file (provided in the example) which points to your input list file.
+ <c3d_pre_trained_model>: the C3D pre-trained model that you downloaded.
+ <gpu_id>: the GPU ID you would like to run on (starting from 0); if this is set to -1, the CPU is used instead.
+ <mini_batch_size>: your mini-batch size. The default is 50, but you can modify this number depending on your GPU memory.
+ <number_of_mini_batches>: the number of mini-batches to extract features for. For example, if you have 100 clips and use a mini-batch size of 50, this parameter should be set to 2; if you have 101 clips, it should be set to 3.
+ <output_prefix_file>: your output prefix file.
+ <feature_name1> <feature_name2> ...: you can list as many feature names as you like, as long as they are names of output blobs of the network (see the prototxt file for all layers; they look like fc6-1, fc7-1, fc8-1, pool5, conv5b, prob, ...).

The example uses the following command line:
GLOG_logtostderr=1 ../../build/tools/extract_image_features.bin prototxt/c3d_sport1m_feature_extractor_frm.prototxt conv3d_deepnetA_sport1m_iter_1900000 0 50 1 prototxt/output_list_prefix.txt fc7-1 fc6-1 prob
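When this finishes, every line of the output prefix file should have one file per requested feature, named output_prefix.[feature_name] as described above; for a hypothetical prefix output/video1/000001 you could check with:

ls output/video1/000001.fc6-1 output/video1/000001.fc7-1 output/video1/000001.prob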

Extracting C3D features with a smaller or larger batch size:

Change the batch_size: 50 parameter in the prototxt as needed,
and also pass the newly adjusted <mini_batch_size> and <number_of_mini_batches> values on the command line.
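Because the number of mini-batches must be rounded up, a small shell sketch for recomputing it could be (the list file name is the placeholder used earlier):

BATCH=25
NUM_CLIPS=$(wc -l < prototxt/my_input_list_frm.txt)
NUM_BATCHES=$(( (NUM_CLIPS + BATCH - 1) / BATCH ))    # ceil(NUM_CLIPS / BATCH)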
After extracting C3D features, you can use the provided MATLAB script (read_binary_blob.m) to read the features for further analysis.

2. Training a 3D CNN

A. Computing the mean file
This tool lets you compute the volume mean for your own dataset, which is useful when training C3D from scratch or fine-tuning it on your own data.
Usage:
GLOG_logtostderr=1 compute_volume_mean_from_list input_chunk_list length height width sampling_rate output_file [dropping rate]
Arguments:
input_chunk_list: the same as the list file used in feature extraction
length: the length of the clip used in training (e.g. 16)
height, width: size of frame e.g. 128, 171
sampling_rate: this is used to adjust the frame rate of your clips (e.g. with clip length=16 and sampling_rate=1, your clip is a 16-consecutive-frame video chunk; with clip length=16 and sampling_rate=2, your clips span 32 frames, but you sample 1 of every 2 frames).
output_file: the output mean file.
dropping_rate: in case your dataset is too large (e.g. 1M clips), you may want to compute the mean from a subset of your clips. Setting this to n means the dropping rate is 1:n, i.e. one clip out of every n is used to compute the mean.
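A hypothetical invocation following the usage line above (the list file, the 128x171 frame size, the output name, and the dropping rate of 10 are placeholders chosen for the example):

GLOG_logtostderr=1 compute_volume_mean_from_list my_train_list.lst 16 128 171 1 my_train_mean.binaryproto 10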

If you prefer to use mean_value instead of a volume mean file, you can set mean_value in the data layer. This is equivalent to a volume mean in which every value is set to mean_value.

B. Train your own network from scratch
Assuming you have your input_data_list, your train/test prototxt, and your solver prototxt, you can train the network with train_net.
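A minimal sketch of such a run, assuming the tool lives in the same build/tools directory as extract_image_features.bin and is invoked with a solver prototxt (my_solver.prototxt is a placeholder):

GLOG_logtostderr=1 ../../build/tools/train_net.bin my_solver.prototxt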

C. An example of training from scratch on UCF101
+Change directory to YOUR_C3D_HOME/examples/c3d_train_ucf101/
+run sh create_volume_mean.sh to compute the volume mean file
+run sh train_ucf101.sh to train, expect a couple days to finish
+run sh test_ucf101.sh to test, expect about 15 minutes to complete, and you should get ~45% accuracy (this is clip accuracy)
D. Fine-tune C3D
Download the C3D pre-trained model and try the fine-tuning example:
+Change directory to YOUR_C3D_HOME/examples/c3d_finetuning
+Run: sh ucf101_finetuning.sh
+When fine-tuning is done, you can test your fine-tuned model by running: sh ucf101_testing.sh
+[Added 05/10/2016] In case you don't have time to fine-tune C3D on UCF101 yourself, here we provide the C3D model fine-tuned on UCF101: https://www.dropbox.com/s/mkc9q7g4wnqnmcv/c3d_ucf101_finetune_whole_iter_20000 . Simply download this model to YOUR_C3D_HOME/examples/c3d_finetuning and run sh ucf101_testing.sh (assuming you have made sure your test_01.lst file points to your UCF101 frames). This will give an accuracy of 80.19% (clip accuracy). NOTE: this model is fine-tuned on UCF101 "train split 1", thus it is only valid to test on "test split 1".
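If you take that route, the steps might look like the following sketch (whether the ?dl=1 form of the Dropbox link allows a direct download is an assumption; you can also download the file in a browser and place it in the directory manually):

cd YOUR_C3D_HOME/examples/c3d_finetuning
wget -O c3d_ucf101_finetune_whole_iter_20000 "https://www.dropbox.com/s/mkc9q7g4wnqnmcv/c3d_ucf101_finetune_whole_iter_20000?dl=1"
sh ucf101_testing.sh    # expects test_01.lst to point to your UCF101 frames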
