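These notes summarize a number of high-quality write-ups found online, with the aim of building a rough understanding of the common problems and methods in video analysis.
The five tasks of ActivityNet 2017: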
Task 1: Untrimmed Video Classification (ActivityNet)
Videos can contain more than one activity, and typically large stretches of the video are unrelated to any activity of interest.
Task 2: Trimmed Action Recognition (Kinetics) [New]
Videos contain a single activity, and all the clips have a standard duration of ten seconds.
Task 3: Temporal Action Proposals (ActivityNet) [New]
The goal is to produce a set of candidate temporal segments that are likely to contain a human action.
Task 4: Temporal Action Localization (ActivityNet)
This task is intended to evaluate the ability of algorithms to temporally localize activities in untrimmed video sequences. Here, videos can contain more than one activity instance, and multiple activity categories can appear in the video.
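Tasks 3 and 4 both work with temporal segments, so it helps to see how a predicted segment can be compared with an annotated one. A common measure for this is temporal intersection-over-union (tIoU). The sketch below is only an illustration: it assumes a segment is a (start, end) pair in seconds, and the function name and the example values are made up for this note rather than taken from the challenge toolkit.

```python
def temporal_iou(seg_a, seg_b):
    """Temporal IoU of two (start, end) segments, both given in seconds."""
    start_a, end_a = seg_a
    start_b, end_b = seg_b
    # Length of the overlap; zero when the segments do not intersect.
    intersection = max(0.0, min(end_a, end_b) - max(start_a, start_b))
    # Union = sum of both lengths minus the overlap counted twice.
    union = (end_a - start_a) + (end_b - start_b) - intersection
    return intersection / union if union > 0 else 0.0


# Hypothetical data: one annotated action instance and two candidate proposals.
ground_truth = (10.0, 20.0)
proposals = [(12.0, 22.0), (30.0, 35.0)]
scores = [temporal_iou(p, ground_truth) for p in proposals]
```

A proposal is then typically counted as a match when its tIoU with some annotated instance exceeds a chosen threshold.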
Task 5: Dense-Captioning Events in Videos (Activity