Deep-Feature-Flow: Paper Notes and Training Walkthrough

Deep Feature Flow for Video Recognition
CVPR 2017
GitHub: https://github.com/msracver/Deep-Feature-Flow
Paper: https://arxiv.org/abs/1611.07715

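Reading notes

Deep Feature Flow (DFF) models every frame of a video by combining deep features (appearance information) with optical flow (motion information). The core idea is to run the expensive deep CNN only on designated key frames. For every other frame (the current frame), the method estimates the optical flow between that frame and its key frame, then uses the flow to propagate the key frame's deep features to the current frame: each location is offset by its flow vector, and the key-frame feature map is sampled at the offset position with bilinear interpolation. This cuts computation substantially, because non-key frames never pass through the feature CNN. The propagated features are then handed to a task-specific head, for example segmentation or detection.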

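Key points of the algorithm
1) Feature-map extraction on key frames

  This is the expensive step, so it runs only at intervals. The detector (Faster R-CNN or R-FCN) and the backbone (ResNet-101 or Inception) can be chosen freely.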

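2) Inter-frame propagation

  As the figure shows, the authors split the network into two parts: feature extraction N(feat), and the task network N(task) for classification and segmentation.
  The expensive feature extraction N(feat) runs only on key frames; features for non-key frames are obtained by propagation.
  F is the flow computed from the two raw frames. It is applied to the key frame's feature map to produce the current frame's feature map, which is then fed to N(task).
[Figure: N(feat) on the key frame, flow F between the two frames, and N(task) on the propagated features]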
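
The propagation step described above can be written as a warping equation. The notation here follows the DFF paper rather than this post, which does not name these symbols: $M_{i \to k}$ is the flow field from current frame $i$ to key frame $k$, $S_{i \to k}$ is the paper's scale field, $G$ is the bilinear interpolation kernel, and $c$ indexes feature channels.

```latex
% Feature propagation from key frame k to current frame i
f_i = \mathcal{W}\bigl(f_k,\, M_{i \to k},\, S_{i \to k}\bigr)

% Per-channel bilinear warp at location p, with \delta p = M_{i \to k}(p)
f_i^{c}(p) = \sum_{q} G(q,\, p + \delta p)\, f_k^{c}(q)

% Separable bilinear kernel
G(q,\, p + \delta p) = g(q_x,\, p_x + \delta p_x)\; g(q_y,\, p_y + \delta p_y),
\qquad g(a, b) = \max\bigl(0,\, 1 - |a - b|\bigr)
```

Only the four integer locations $q$ around $p + \delta p$ have a non-zero kernel value, so the sum is cheap to evaluate. In the paper the warped features are additionally multiplied element-wise by the scale field.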

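3) Feature-map warping

   Feature-map warping is the most critical part of the paper. Because high-level features differ from low-level ones, errors in the flow estimate make the warped features inaccurate. The warping results look like this:
[Figure: feature-map warping results]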
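
To make the warping step concrete, here is a minimal NumPy sketch of bilinear feature warping. It is an illustration, not the repository's implementation. It assumes the flow is given in feature-map pixel units with channel 0 holding the x offset and channel 1 the y offset, that samples falling outside the map contribute zero, and it leaves out the paper's scale field. The function name `warp_features` is hypothetical.

```python
import numpy as np


def warp_features(key_feat, flow):
    """Warp key-frame features to the current frame with bilinear sampling.

    key_feat: array of shape (C, H, W), features of the key frame.
    flow:     array of shape (2, H, W); flow[0] is the x offset and flow[1]
              the y offset from each current-frame location to the matching
              key-frame location (assumed layout, in feature-map pixels).
    """
    _, height, width = key_feat.shape
    ys, xs = np.meshgrid(np.arange(height), np.arange(width), indexing="ij")

    # Sampling position in the key frame for every current-frame location.
    sample_x = xs + flow[0]
    sample_y = ys + flow[1]

    x0 = np.floor(sample_x).astype(int)
    y0 = np.floor(sample_y).astype(int)
    wx = sample_x - x0  # fractional part along x
    wy = sample_y - y0  # fractional part along y

    out = np.zeros(key_feat.shape, dtype=np.float64)
    corners = (
        (0, 0, (1 - wy) * (1 - wx)),
        (0, 1, (1 - wy) * wx),
        (1, 0, wy * (1 - wx)),
        (1, 1, wy * wx),
    )
    for dy, dx, weight in corners:
        yy = y0 + dy
        xx = x0 + dx
        # Neighbours outside the feature map contribute nothing.
        valid = (yy >= 0) & (yy < height) & (xx >= 0) & (xx < width)
        yc = np.clip(yy, 0, height - 1)
        xc = np.clip(xx, 0, width - 1)
        out += key_feat[:, yc, xc] * (weight * valid)
    return out


if __name__ == "__main__":
    feat = np.arange(12, dtype=np.float64).reshape(1, 3, 4)
    zero_flow = np.zeros((2, 3, 4))
    # With zero flow the warp is the identity.
    print(np.allclose(warp_features(feat, zero_flow), feat))
```

Because each output value is a weighted sum of at most four key-frame values, the operation is differentiable with respect to both the features and the flow, which is what allows the flow network to be trained together with the rest of the model.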

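4) End-to-end training

  End-to-end training is essential for good results. Its advantage is that errors are balanced across the components, which avoids the situation where each part trains well on its own but the parts fail to match once combined. Most deep networks are trained end to end, so this choice is easy to understand.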

Algorithm flow

[Figure: algorithm flow]
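
The inference procedure described in the key points above can be sketched as a loop over frames. This is a minimal sketch under stated assumptions: key frames are picked at a fixed interval, and the four callables (`n_feat`, `n_task`, `flow_net`, `warp`) are hypothetical stand-ins for the feature network, the task network, the flow network, and the warping function. None of these names come from the repository.

```python
def run_dff(frames, key_interval, n_feat, n_task, flow_net, warp):
    """Run DFF-style inference over a sequence of frames.

    frames:       iterable of frames (any type the callables accept).
    key_interval: a frame is a key frame when its index is a multiple
                  of this value (fixed-interval scheduling is assumed).
    n_feat:       frame -> features (expensive, key frames only).
    n_task:       features -> task output (cheap, every frame).
    flow_net:     (key_frame, current_frame) -> flow.
    warp:         (key_features, flow) -> propagated features.
    """
    if key_interval < 1:
        raise ValueError("key_interval must be at least 1")

    results = []
    key_frame = None
    key_feat = None
    for index, frame in enumerate(frames):
        if index % key_interval == 0:
            # Key frame: run the full feature network and cache the result.
            key_frame = frame
            key_feat = n_feat(frame)
            feat = key_feat
        else:
            # Non-key frame: estimate flow and propagate the cached features.
            flow = flow_net(key_frame, frame)
            feat = warp(key_feat, flow)
        results.append(n_task(feat))
    return results


if __name__ == "__main__":
    # Toy stand-ins: a "frame" is a number and its "feature" is itself.
    outputs = run_dff(
        frames=list(range(10)),
        key_interval=5,
        n_feat=lambda frame: frame,
        n_task=lambda feat: feat,
        flow_net=lambda key, cur: cur - key,
        warp=lambda feat, flow: feat + flow,
    )
    print(outputs)
```

With an interval of 5 over 10 frames, the feature network runs twice and the flow network eight times, which is where the savings come from when the flow network is much cheaper than the feature network.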

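Training notes

The MXNet-based Deep-Feature-Flow repository is straightforward to read. The authors trained on the ImageNet competition dataset, whose 31 classes (including background) do not include people. My task requires detecting people, so I retrained on my own dataset.
The authors document the training procedure in detail, so I will not repeat it here and will mainly record the changes I made: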

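Preparation:

The authors use the ILSVRC2015 dataset, so your own dataset must be prepared in the same format.

Tip: download ILSVRC2015 first and get training running on it, then switch to your own dataset. If something breaks at that point, the problem lies in your dataset annotations.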

Format of data/ILSVRC2015/ImageSets/VID_train_15frames.txt

train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 10 300
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 30 300
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 50 300
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 70 300
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 90 300

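Column 1: the path of the video's frame directory
Column 2: positive/negative sample flag, 1 for positive and -1 for negative
Column 3: the index of the frame within the video
Column 4: the total number of frames in the video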
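
As a quick sanity check when building your own list file, the four columns can be parsed like this. This is a small sketch, not the repository's loader: the names `VidEntry`, `parse_vid_line`, and `frame_path` are hypothetical, the meaning of each column follows the description above, and the zero-padded six-digit `.JPEG` file name is an assumption based on the ILSVRC2015 VID layout.

```python
from collections import namedtuple

# Field meanings follow the column description in the text.
VidEntry = namedtuple("VidEntry", "video_dir flag frame_id num_frames")


def parse_vid_line(line):
    """Parse one whitespace-separated line of the image-set file."""
    parts = line.split()
    if len(parts) != 4:
        raise ValueError("expected 4 columns, got %d: %r" % (len(parts), line))
    video_dir, flag, frame_id, num_frames = parts
    return VidEntry(video_dir, int(flag), int(frame_id), int(num_frames))


def frame_path(entry, ext=".JPEG"):
    """Build the relative frame path (assumed naming: 000010.JPEG)."""
    return "%s/%06d%s" % (entry.video_dir, entry.frame_id, ext)


if __name__ == "__main__":
    sample = "train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 10 300"
    entry = parse_vid_line(sample)
    print(entry.frame_id, entry.num_frames)
    print(frame_path(entry))
```

Running a check like this over every line, and confirming that each frame index is smaller than the frame count and that each path exists on disk, catches most formatting mistakes before training starts.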

Note: the images must be in JPEG format.

Training

Edit the files under cfgs, taking resnet_v1_101_imagenet_vid_rfcn_end2end_ohem.yaml as an example:

MXNET_VERSION: "mxnet"
output_path: "./output/rfcn/imagenet_vid" # choose the output directory
symbol: resnet_v1_101_rfcn
gpus: '0,1,2,3' # change to the GPUs you are using
CLASS_AGNOSTIC: true
dataset: # change the entries below to match your own dataset
  NUM_CLASSES: 31
  dataset: ImageNetVID
  dataset_path: "./data/ILSVRC2015"
  image_set: DET_train_30classes+VID_train_15frames
  root_path: "./data"
  test_image_set: VID_val_frames
  proposal: rpn
TRAIN: # adjust the learning rate, epochs, and so on
  lr: 0.00025
  lr_step: '1.333'
  warmup: false
  begin_epoch: 0
  end_epoch: 2
  ...
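
The `lr_step: '1.333'` entry is a string, which is easy to misread. One plausible reading, sketched below, is that it holds a comma-separated list of (possibly fractional) epochs at which the learning rate is multiplied by a decay factor. This is an assumption about the training script, not something the post states, and the function name `lr_schedule` and the `lr_factor` and `iters_per_epoch` parameters are hypothetical. Check the repository's training script before relying on it.

```python
def lr_schedule(base_lr, lr_step, lr_factor, begin_epoch, iters_per_epoch):
    """Turn an epoch-based step string into a start rate and step iterations.

    base_lr:         initial learning rate (TRAIN.lr in the config).
    lr_step:         comma-separated epochs, e.g. '1.333' (assumed meaning).
    lr_factor:       multiplier applied at each step (hypothetical parameter).
    begin_epoch:     epoch training resumes from (TRAIN.begin_epoch).
    iters_per_epoch: number of iterations in one epoch (hypothetical).
    """
    step_epochs = [float(e) for e in lr_step.split(",")]
    # Steps that are still ahead, measured from the resume point.
    remaining = [e - begin_epoch for e in step_epochs if e > begin_epoch]
    # Steps already passed have been applied to the starting rate.
    passed = len(step_epochs) - len(remaining)
    start_lr = base_lr * (lr_factor ** passed)
    step_iters = [int(e * iters_per_epoch) for e in remaining]
    return start_lr, step_iters


if __name__ == "__main__":
    start_lr, step_iters = lr_schedule(
        base_lr=0.00025,
        lr_step="1.333",
        lr_factor=0.1,
        begin_epoch=0,
        iters_per_epoch=1000,
    )
    print(start_lr, step_iters)
```

Under this reading, with `end_epoch: 2` the rate would drop once, about two thirds of the way through training. When retraining on a smaller dataset, the step position and `end_epoch` are the values most likely to need adjusting.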
Main problems encountered during training:
1: TypeError: init_params() got an unexpected keyword argument 'allow_extra'

Fix: in mxnet/module/base_module.py, find the line allow_extra=allow_extra and delete it.
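
The error means a caller passes a keyword that the callee's signature does not declare. The toy classes below are hypothetical and contain no MXNet code. They reproduce the mechanism and show an alternative to editing the installed library: letting the overriding method accept the extra keyword. Whether that alternative fits your setup depends on which class overrides `init_params`, which the post does not say.

```python
class BaseModule(object):
    """Stand-in for a library base class whose fit() forwards a new keyword."""

    def fit(self, allow_extra=False):
        return self.init_params(allow_missing=False, allow_extra=allow_extra)


class OldStyleModule(BaseModule):
    """Override written before the keyword existed: fit() fails."""

    def init_params(self, allow_missing=False):
        return "initialized"


class PatchedModule(BaseModule):
    """Override that accepts the keyword: fit() works again."""

    def init_params(self, allow_missing=False, allow_extra=False):
        return "initialized"


if __name__ == "__main__":
    try:
        OldStyleModule().fit()
    except TypeError as err:
        print("old override fails:", err)
    print("patched override:", PatchedModule().fit())
```

Deleting the argument in the library, as the fix above does, changes the installed package for every project that uses it, while adjusting the override keeps the change local to the project. Either way, the mismatch suggests the installed MXNet version differs from the one the repository was written against.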

2: TypeError: _update_params_on_kvstore() takes exactly 4 arguments (3 given)

Fix: open the corresponding module.py and change the call so that it passes all four arguments:

_update_params_on_kvstore(self._exec_group.param_arrays,
                          self._exec_group.grad_arrays,
                          self._kvstore,
                          self._param_names)