2021-06-01: We are organizing DeeperAction Challenge at ICCV 2021, by introducing three new benchmarks on temporal action localization, spatiotemporal action detection, and part-level action parsing.
2021-04-20: The extension of TRecgNet is accepted by IJCV.
2021-04-07: We propose a target transformer for accurate anchor-free tracking, termed as TREG (code coming soon).
2021-04-07: We present a transformer decoder for direct action proposal generation, termed as RTD-Net (code coming soon).
2021-03-01: Two papers on action recognition and point cloud segmentation are accepted by CVPR 2021.
2020-12-30: We propose a new video architecture based on temporal difference, termed as TDN, and release the code.
2020-07-03: Three papers on action detection and segmentation are accepted by ECCV 2020.
2020-06-28: Our proposed DSN, a dynamic version of TSN for efficient action recognition, is accepted by TIP.
2020-05-14: We propose a temporal adaptive module for video recognition, termed as TAM, and release the code.
2020-04-16: The code of our published papers will be made available on GitHub: MCG-NJU.
2020-04-16: We propose a fully convolutional online tracking framework, termed as FCOT, and release the code.
2020-03-10: Our proposed temporal module TEA is accepted by CVPR 2020.
2020-01-20: We propose an efficient video representation learning framework, termed as CPD, and release the code.
2020-01-15: We present an anchor-free action tubelet detector, termed as MOC-Detector and release the code.
2019-12-20: Our proposed V4D, a principled video-level representation learning framework, is accepted by ICLR 2020.
2019-11-21: Our proposed TEINet, an efficient architecture for video recognition, is accepted by AAAI 2020.
2019-07-23: Our proposed LIP, a general alternative to average or max pooling, is accepted by ICCV 2019.
2019-03-15: Two papers are accepted by CVPR 2019: one for group activity recognition and one for RGB-D transfer learning.
2018-08-19: One paper is accepted by ECCV 2018 and one by T-PAMI.
2017-11-28: We released a recent work on video architecture design for spatiotemporal feature learning. [ arXiv ] [ Code ].
2017-09-08: We have released the TSN models learned on the Kinetics dataset. These models transfer well to existing datasets for action recognition and detection [ Link ].
2017-09-01: One paper is accepted by ICCV 2017 and one by IJCV.
2017-03-28: I am co-organizing the CVPR 2017 workshop and challenge on Visual Understanding by Learning from Web Data. For more details, please see the workshop page and challenge page.
2017-02-28: Two papers are accepted by CVPR 2017.
2016-12-20: We release the code and models for SR-CNN paper [ Code ].
2016-10-05: We release the code and models for Places2 scene recognition challenge [ arXiv ] [ Code ].
2016-08-03: Code and model of Temporal Segment Networks is released [ arXiv ] [ Code ].
2016-07-15: One paper is accepted by ECCV 2016 and one by BMVC 2016.
2016-06-16: Our team secures the 1st place for untrimmed video classification at ActivityNet Challenge 2016 [ Result ].
Basically, our solution is based on our works of Temporal Segment Networks (TSN) and Trajectory-pooled Deep-convolutional Descriptors (TDD).
2016-03-01: Two papers are accepted by CVPR 2016.
2015-12-10: Our SIAT_MMLAB team secures the 2nd place for scene recognition at ILSVRC 2015 [ Result ].
2015-09-30: We rank 3rd for cultural event recognition on ChaLearn Looking at People challenge, at ICCV 2015.
2015-08-07: We release the Places205-VGGNet models [ Link ].
2015-07-22: Code of Trajectory-Pooled Deep-Convolutional Descriptors (TDD) is released [ Link ].
2015-07-15: Very deep two stream ConvNets are proposed for action recognition [ Link ].
2015-03-15: We are the 1st-place winner of both the action recognition and cultural event recognition tracks at the ChaLearn Looking at People Challenge at CVPR 2015.