Temporal-Relational CrossTransformers for Few-Shot Action Recognition
First author: Toby Perrett (homepage)
The author previously worked mainly on LSTMs and meta-learning. The code for this paper was open-sourced soon after publication (link below), and the author is helpful and patient when answering questions.
GitHub source code
Abstract
Distinct from previous few-shot works, we construct class prototypes using the CrossTransformer attention mechanism to observe relevant sub-sequences of all support videos, rather than using class averages or single best matches. Video representations are formed from ordered tuples of varying numbers of frames, which allows sub-sequences of actions at different speeds and temporal offsets to be compared.
We focus on these two sentences: the first points out how this work differs from previous few-shot methods, and the second states what problem it solves.
- It attends to the relevant sub-sequences of all support-set videos, rather than using class averages or single best matches (the approach of previous methods).
- Video representations are built from ordered tuples of varying numbers of frames, so action sub-sequences at different speeds and temporal offsets can be compared (see the sketch after this list).
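To make the tuple idea concrete, here is a minimal sketch of how temporally ordered frame tuples of different cardinalities could be enumerated and embedded. This is my own illustration, not the authors' code: the function name `tuple_representations`, the concatenate-then-project embedding, and the dimensions are all assumptions.

```python
from itertools import combinations
import torch
import torch.nn as nn

def tuple_representations(frame_feats, cardinality, proj):
    """Hypothetical sketch: embed all ordered frame tuples of one video.
    frame_feats: (T, D) per-frame features of T uniformly sampled frames."""
    T, D = frame_feats.shape
    # All temporally ordered index tuples of the given cardinality,
    # e.g. for T=4, cardinality=2: (0,1), (0,2), (0,3), (1,2), (1,3), (2,3).
    idx = list(combinations(range(T), cardinality))
    # Concatenate the frame features of each tuple and project them into a
    # common embedding space, so tuples of different lengths can be handled
    # by separate heads of the same model.
    tuples = torch.stack([frame_feats[list(i)].reshape(-1) for i in idx])
    return proj(tuples)  # (num_tuples, d_model)

T, D, d_model = 8, 2048, 1152                 # assumed sizes for illustration
frames = torch.randn(T, D)                    # e.g. ResNet features per frame
proj2 = nn.Linear(2 * D, d_model)             # embedding head for pairs
proj3 = nn.Linear(3 * D, d_model)             # embedding head for triples
pairs = tuple_representations(frames, 2, proj2)    # (28, d_model)
triples = tuple_representations(frames, 3, proj3)  # (56, d_model)
```

Because pairs and triples land in a common embedding dimension, sub-sequences of different lengths can later be compared by the same attention mechanism.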
Introduction
We propose a novel approach to few-shot action recognition, which we term Temporal-Relational CrossTransformers (TRX). A query-specific class prototype is constructed by using an attention mechanism to match each query sub-sequence against all sub-sequences in the support set, and aggregating this evidence. By performing the attention operation over temporally-ordered sub-sequences rather than individual frames, actions performed at different speeds and at different temporal offsets can be better matched.
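The query-specific prototype can be sketched as a cross-attention step: each query tuple attends over all support-set tuples of one class, and the attention-weighted sum of support values forms the prototype that the query is compared against. The single-head formulation, the weight names, and the distance-based score below are my simplification of the CrossTransformer mechanism, not the authors' exact code.

```python
import torch
import torch.nn.functional as F

def query_specific_prototype(query_tuples, support_tuples, Wq, Wk, Wv):
    """Simplified CrossTransformer-style attention (illustrative only).
    query_tuples:   (Nq, d) tuple embeddings of the query video
    support_tuples: (Ns, d) tuple embeddings of ALL support videos of a class
    Returns one prototype per query tuple: (Nq, d_v)."""
    q = query_tuples @ Wq        # (Nq, d_k)
    k = support_tuples @ Wk      # (Ns, d_k)
    v = support_tuples @ Wv      # (Ns, d_v)
    # Each query tuple attends over every support tuple of the class, so
    # evidence is aggregated across all support videos rather than using a
    # class average or a single best match.
    attn = F.softmax(q @ k.t() / k.shape[-1] ** 0.5, dim=-1)  # (Nq, Ns)
    return attn @ v              # query-specific class prototype

d, d_k, d_v = 1152, 128, 128                    # assumed dimensions
Wq, Wk, Wv = (torch.randn(d, d_k), torch.randn(d, d_k), torch.randn(d, d_v))
query = torch.randn(28, d)                      # 28 pair-tuples of the query
support = torch.randn(5 * 28, d)                # 5 support videos x 28 tuples
proto = query_specific_prototype(query, support, Wq, Wk, Wv)
# Hypothetical scoring: distance between value-embedded query and prototype.
score = -torch.norm(query @ Wv - proto, dim=-1).mean()
```

Running this per class and taking the class with the highest score yields the few-shot prediction; because the prototype is recomputed for every query, it adapts to whichever support sub-sequences best match that query.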