Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?

URL: http://openaccess.thecvf.com/content_cvpr_2018/papers/Hara_Can_Spatiotemporal_3D_CVPR_2018_paper.pdf

Abstract

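Main work of the paper: earlier studies looked only at shallow 3D architectures, whereas the authors compare 3D CNN architectures ranging from relatively shallow to very deep across several datasets.

Main conclusions:

1) On UCF-101, HMDB-51, and ActivityNet, ResNet-18 overfits severely; on Kinetics, no overfitting occurs.

2) Kinetics can train very deep 3D CNNs, e.g. ResNet-152.

3) Simple 3D architectures pretrained on Kinetics outperform complex 2D architectures.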

Introduction

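In action recognition, even well-organized 3D models do not match 2D models that combine stacked flow and RGB images.

Reasons: 1) current video datasets are small, while 3D CNNs have many parameters

           2) the pretraining problem: 3D CNNs can only be trained on video datasets, whereas 2D CNNs have ImageNet pretraining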
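
To make the "many parameters" point concrete, here is a minimal sketch comparing the weight count of a single 2D convolution with its 3D counterpart. The helper names and the 64-channel layer size are illustrative assumptions, not values from the paper.

```python
def conv2d_params(c_in, c_out, k):
    """Number of weights in a k x k 2D convolution (bias omitted)."""
    return c_in * c_out * k * k


def conv3d_params(c_in, c_out, k):
    """Number of weights in a k x k x k 3D convolution (bias omitted)."""
    return c_in * c_out * k * k * k


# Illustrative layer size (an assumption, not taken from the paper).
c_in, c_out, k = 64, 64, 3
p2d = conv2d_params(c_in, c_out, k)
p3d = conv3d_params(c_in, c_out, k)
print(p2d, p3d, p3d // p2d)
```

With a kernel size of 3, every convolution layer carries three times as many weights once the temporal dimension is added, which is one reason a small video dataset struggles to constrain a 3D network.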

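This leads to the authors' central question: can 3D CNNs retrace the history of 2D CNNs and ImageNet? In other words, can 3D CNNs trained on Kinetics play a role in action recognition and other tasks similar to the one ImageNet plays for images? To answer this, Kinetics needs two properties: 1) it must be as large-scale as ImageNet, and 2) it must support training very deep architectures.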

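Main work of this paper:

        1) Examine how different 3D CNN architectures, from relatively shallow to very deep, perform on different datasets: UCF-101, HMDB-51, ActivityNet, and Kinetics. The architectures are mainly based on ResNet.

        2) Compare training from scratch with fine-tuning.

Main contribution of this paper: this is the first work to focus on the training of very deep 3D CNNs from scratch for action recognition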

Experimental configuration

The three questions investigated:

1)determine whether current video datasets have sufficient data for training of deep 3D CNNs

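This question asks whether current datasets are large enough to train complex 3D CNNs. The authors train ResNet-18, the smallest ResNet architecture, on each dataset. If ResNet-18 overfits on a dataset, that dataset is too small to train deep 3D CNNs from scratch, since ResNet-18 is already a relatively small architecture.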

2)conducted a separate experiment to determine whether the Kinetics dataset could train deeper 3D CNNs.

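This part investigates how deep a 3D CNN can be trained on Kinetics, with model depth ranging from 18 to 200. If Kinetics can reach performance comparable to what deep ResNets achieve on ImageNet, it can be used for pretraining on other action recognition datasets.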

3)examined the fine-tuning of Kinetics pretrained 3D CNNs on UCF-101 and HMDB-51

This part examines how Kinetics-pretrained weights affect the small datasets UCF-101 and HMDB-51. Architectures: ResNet (basic and bottleneck blocks), pre-activation ResNet, wide ResNet (WRN), ResNeXt, and DenseNet.

Experiment

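1. First question: ResNet-18 on different datasets

1) On UCF-101, HMDB-51, and ActivityNet, the validation loss of ResNet-18 is far higher than its training loss, which shows that ResNet-18 overfits on these datasets. The authors therefore conclude that training deep 3D CNNs from scratch on these datasets is not feasible. On Kinetics the result is different: there is no overfitting, so deep 3D CNNs can be trained on Kinetics.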
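
The overfitting check above comes down to comparing training and validation loss. A minimal sketch of that comparison follows; the function names, the 0.5 tolerance, and the loss values are all made up for illustration and are not from the paper.

```python
def generalization_gap(train_loss, val_loss):
    """Validation loss minus training loss; a large positive gap suggests overfitting."""
    return val_loss - train_loss


def looks_overfit(train_loss, val_loss, tolerance=0.5):
    """Flag a run as overfit when the gap exceeds a hypothetical tolerance."""
    return generalization_gap(train_loss, val_loss) > tolerance


# Made-up final-epoch (train_loss, val_loss) pairs, NOT results from the paper.
curves = {
    "small_dataset": (0.10, 2.40),
    "large_dataset": (1.60, 1.75),
}
for name, (train_loss, val_loss) in curves.items():
    print(name, looks_overfit(train_loss, val_loss))
```

In practice one would inspect the full curves over epochs rather than a single final value; the threshold here only makes the idea testable.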

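2. Second question: how deep a 3D network can Kinetics train?

Evaluating deeper networks on Kinetics shows that accuracy rises with depth until it saturates at ResNet-152. ResNet-200 performs about the same as ResNet-152, so it may already be starting to overfit.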
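
One way to read "saturation" off a depth-versus-accuracy table is to find the depth after which going deeper yields less than some minimum gain. The sketch below assumes a hypothetical `min_gain` threshold and uses placeholder accuracies that only mimic the shape of the trend described above; they are not the paper's numbers.

```python
def saturation_depth(acc_by_depth, min_gain=0.5):
    """Return the depth after which the next deeper model gains less than min_gain."""
    depths = sorted(acc_by_depth)
    for shallower, deeper in zip(depths, depths[1:]):
        if acc_by_depth[deeper] - acc_by_depth[shallower] < min_gain:
            return shallower
    # No step fell below the threshold: accuracy is still improving at the deepest model.
    return depths[-1]


# Placeholder accuracies (percent), NOT results from the paper.
placeholder = {18: 60.0, 34: 63.0, 50: 66.0, 101: 67.5, 152: 68.5, 200: 68.6}
print(saturation_depth(placeholder))
```

The returned depth depends on the chosen threshold, so this is a reading aid rather than a definition of saturation.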

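3. Third question: fine-tuning versus training from scratch

Kinetics supports training from scratch, but the other datasets do not, so Kinetics is used to pretrain models for them. The difference between the from-scratch and fine-tuned results is quite large.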

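Personal summary

This paper reads somewhat like a survey: it examines how a range of ResNet architectures perform on several common action recognition datasets, and reaches the following conclusions:

        1) Many existing action recognition datasets are too small to train complex 3D architectures from scratch.

        2) Kinetics is large enough, and the networks can be made very deep, e.g. ResNet-152 and ResNet-200.

        3) For action recognition, Kinetics can play the role of ImageNet and provide pretraining for other datasets.

The code on GitHub is quite complete, but the experimental results in this paper are actually not that strong. For example, on UCF-101 with Kinetics pretraining, ResNet-50 only reaches 89.3. By comparison, TSM (Temporal Shift Module for Efficient Video Understanding) appears to reach about 96. The authors also use many image augmentation tricks; when I reproduced the paper on UCF-101 without those tricks, I could not reach 89.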
