MixUp as Locally Linear Out-of-Manifold Regularization: Paper Reading Notes

AdaMixUp is an improvement on MixUp that addresses the "manifold intrusion" problem that mixed data can cause. It introduces two neural networks, one that generates the mixing coefficients and one that acts as an intrusion discriminator, so that manifold intrusion is avoided adaptively and model performance improves. The total loss combines the intrusion loss with the losses on the original and on the mixed data.

MixUp as Locally Linear Out-of-Manifold Regularization

Problem

MixUp mixes a pair of input images and their corresponding labels:

$$\hat{x} = \lambda x + (1-\lambda)x', \qquad \hat{y} = \lambda y + (1-\lambda)y'$$

The authors observe that the mixed sample $\hat{x}$ can sometimes be very similar to an already existing real sample, yet it carries the mixed label $\hat{y}$. This confuses the classifier and degrades its performance; the authors call this situation "manifold intrusion".
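For reference, the vanilla MixUp step above can be written as a short sketch. The helper name `mixup_batch` is hypothetical, and the Beta-distributed $\lambda$ follows the original MixUp paper rather than anything specific to this work:

```python
import torch

def mixup_batch(x, y_onehot, alpha=1.0):
    """Standard MixUp: mix a batch with a randomly shuffled copy of itself."""
    lam = torch.distributions.Beta(alpha, alpha).sample()   # scalar mixing coefficient
    perm = torch.randperm(x.size(0))                         # random partner for each sample
    x_mix = lam * x + (1.0 - lam) * x[perm]                  # mixed inputs  (x-hat)
    y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[perm]    # mixed labels  (y-hat)
    return x_mix, y_mix
```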

Solution

Two neural networks, $\pi(\cdot)$ and $\varphi(\cdot)$, are introduced: the former generates the mixing coefficient $\lambda$, and the latter judges whether a newly mixed sample causes manifold intrusion. Because the mixing coefficients are generated adaptively, the method is called AdaMixUp.

Method Details

Notation

$\chi$: the entire data space

$\Upsilon$: the entire label space

$M$: the data manifold

$g(x)$: the function mapping $x \in M$ to the label space $\Upsilon$

$D$: a subset of $M$

$P(\Upsilon)$: the distribution over labels

$F(\chi, \Upsilon)$: the set of mapping functions from $\chi$ to $\Upsilon$

$H$: a subset of $F(\chi, \Upsilon)$

$\delta_{y}$: the vector with a 1 at position $y$, i.e. the one-hot label

$\Lambda$: the mixing-policy space, $\Lambda \subseteq \mathbb{S}_{k}$

$\Omega^{(k)}$: a matrix whose $k$ columns are data points; $M^{(k)}$ and $D^{(k)}$ are defined analogously

Adaptive MixUp (AdaMixUp)

A network $\pi_{k}(\cdot)$ is defined to generate the mixing policy for $k$ inputs, written $\Lambda^{*}(X)$, where $X$ is the column matrix formed by the $k$ input samples and $\Lambda^{*}(X) \subseteq \mathbb{S}_{k}$.
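As an illustration only, a $k=2$ policy generator might look like the sketch below. The parameterization (a softmax producing an interval $[a, a+\Delta]$ from which $\lambda$ is sampled uniformly), the MLP backbone, and the name `PolicyGenerator` are assumptions on my part, not necessarily the paper's exact architecture:

```python
import torch
import torch.nn as nn

class PolicyGenerator(nn.Module):
    """Sketch of pi_2: maps a pair of inputs to a per-sample mixing coefficient lambda."""
    def __init__(self, in_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),          # three logits -> (a, delta, remainder) via softmax
        )

    def forward(self, x1, x2):
        feats = torch.cat([x1.flatten(1), x2.flatten(1)], dim=1)
        a, delta, _ = self.net(feats).softmax(dim=1).unbind(dim=1)
        eps = torch.rand_like(a)           # uniform noise in [0, 1)
        return a + eps * delta             # lambda sampled from the learned interval [a, a + delta]
```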

Another network $\varphi(\cdot)$ performs binary classification: its job is to predict whether a mixed sample causes manifold intrusion, assigning class 0 if it does and class 1 otherwise. The authors call it the intrusion discriminator.
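The intrusion discriminator is simply a binary classifier over the input space. A minimal sketch, with an assumed MLP backbone and the hypothetical name `IntrusionDiscriminator`:

```python
import torch.nn as nn

class IntrusionDiscriminator(nn.Module):
    """Sketch of phi: outputs the logit of p(1 | x) used in the intrusion loss below."""
    def __init__(self, in_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),          # single logit for the binary decision
        )

    def forward(self, x):
        return self.net(x.flatten(1)).squeeze(1)   # shape: (batch,)
```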

The discriminator is trained with an "intrusion loss":

$$L_{intr} := \frac{1}{k_{max}-1}\sum_{k=2}^{k_{max}} \mathbb{E}_{X\sim D^{(k)},\,\lambda\sim\pi_{k}(X)}\log p(1\mid X\lambda;\varphi) + \mathbb{E}_{x\sim D}\log p(0\mid x;\varphi)$$
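A sketch of how the $k=2$ part of this term could be computed in practice, assuming it is minimized as the corresponding binary cross-entropy (i.e. the negated log-likelihood terms above); the function name `intrusion_loss_k2` and the restriction to $k=2$ are mine, and `disc` stands for the discriminator $\varphi$:

```python
import torch
import torch.nn.functional as F

def intrusion_loss_k2(disc, x_real, x_mixed):
    """k = 2 intrusion term: push mixed samples toward class 1 and real samples toward class 0."""
    logit_mixed = disc(x_mixed)                        # phi's logit for a mixed sample X*lambda
    logit_real = disc(x_real)                          # phi's logit for a real sample x
    loss_mixed = F.binary_cross_entropy_with_logits(   # corresponds to -log p(1 | X*lambda; phi)
        logit_mixed, torch.ones_like(logit_mixed))
    loss_real = F.binary_cross_entropy_with_logits(    # corresponds to -log p(0 | x; phi)
        logit_real, torch.zeros_like(logit_real))
    return loss_mixed + loss_real
```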
Finally, the total loss is:

$$L_{total} := L_{D}(H) + L_{D'}(H,\{\pi_{k}\}) + L_{intr}(\{\pi_{k}\}, \varphi)$$

where $L_{D}$ and $L_{D'}$ are the cross-entropy losses on the original data and on the mixed data, respectively.
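Putting the pieces together, one training step could assemble the three terms roughly as below. This reuses the hypothetical `PolicyGenerator`, `IntrusionDiscriminator`, and `intrusion_loss_k2` sketches above; the soft-label cross-entropy helper and the equal weighting of the terms are assumptions that follow the formula as written:

```python
import torch
import torch.nn.functional as F

def soft_cross_entropy(logits, soft_targets):
    """Cross-entropy against soft (possibly mixed) labels."""
    return -(soft_targets * F.log_softmax(logits, dim=1)).sum(dim=1).mean()

def adamixup_total_loss(classifier, policy, disc, x, y_onehot):
    perm = torch.randperm(x.size(0))
    x2, y2 = x[perm], y_onehot[perm]                      # random mixing partners

    lam = policy(x, x2)                                   # per-sample lambda from pi_2, shape (batch,)
    lam_x = lam.view(-1, *([1] * (x.dim() - 1)))          # broadcast over input dimensions
    lam_y = lam.view(-1, 1)                               # broadcast over label dimension
    x_mix = lam_x * x + (1.0 - lam_x) * x2
    y_mix = lam_y * y_onehot + (1.0 - lam_y) * y2

    loss_d = soft_cross_entropy(classifier(x), y_onehot)         # L_D: original data
    loss_d_mixed = soft_cross_entropy(classifier(x_mix), y_mix)  # L_D': mixed data
    loss_intr = intrusion_loss_k2(disc, x, x_mix)                # L_intr
    return loss_d + loss_d_mixed + loss_intr
```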

Application

In practice, the authors only use the $k=2$ case. For $k>2$, the first $k-1$ columns are treated as a single (already mixed) sample and combined with the last column using the same $k=2$ operation, as sketched below.
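A small sketch of that reduction; the helper names are hypothetical, and `mix_pair` stands for whatever $k=2$ mixing routine is in use:

```python
def mix_k_samples(columns, mix_pair):
    """Fold k samples into one by repeatedly applying the k = 2 mix:
    the running mix of the first k-1 columns is treated as a single
    sample and mixed with the next column."""
    mixed = columns[0]
    for col in columns[1:]:
        mixed = mix_pair(mixed, col)
    return mixed
```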
