[Knowledge Distillation] Feature-based knowledge distillation: MGD (Masked Generative Distillation)

1. Introduction to MGD feature distillation

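Masked Generative Distillation (MGD) masks random pixels of the student's feature map and forces the student, through a simple block, to generate the teacher's full feature map. It is a general feature-based distillation method that can be applied to a variety of tasks, including image classification, object detection, semantic segmentation, and instance segmentation.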
Reference paper: Masked Generative Distillation (Yang et al., ECCV 2022).

2. Implementation steps of MGD feature distillation

(1) Align the feature channels of the student and teacher models

# ---- Align feature channels: a 1x1 conv maps student channels to teacher channels ----#
if student_channels != teacher_channels:
    self.align = nn.Conv2d(student_channels, teacher_channels, kernel_size=1, stride=1, padding=0)
else:
    self.align = None
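
As a quick shape check, the sketch below applies a 1x1 convolution to a dummy student feature map so that its channel count matches the teacher's. The sizes used here (64 and 256 channels, 8x8 maps) are made-up values for illustration only.

```python
import torch
import torch.nn as nn

# Hypothetical sizes, chosen only for illustration.
student_channels, teacher_channels = 64, 256
s_feat = torch.randn(2, student_channels, 8, 8)  # (N, C_s, H, W)

# A 1x1 convolution changes the channel count and leaves H and W untouched.
align = nn.Conv2d(student_channels, teacher_channels, kernel_size=1, stride=1, padding=0)
aligned = align(s_feat)

print(aligned.shape)  # torch.Size([2, 256, 8, 8])
```

Only the channel dimension changes, so the spatial sizes of the student and teacher features still have to match on their own.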

(2) Randomly mask the student features

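# Masked student feature: one random draw per (sample, channel), so whole channels are zeroed
mat = torch.rand((N, C, 1, 1), device=device)
mat = (mat >= self.lambda_mgd).to(s_pred.dtype)  # 0 with probability lambda_mgd, else 1
masked_fea = torch.mul(s_pred, mat)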
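
The mask in the snippet above has shape (N, C, 1, 1), so it zeroes whole channels, while the introduction speaks of masking random pixels. The sketch below contrasts the two mask shapes on a dummy tensor. The tensor sizes and the per-pixel variant with shape (N, 1, H, W) are assumptions made for illustration and are not taken from the code in this post.

```python
import torch

torch.manual_seed(0)
lambda_mgd = 0.5                      # masking ratio, same default as in this post
s_pred = torch.ones(4, 16, 8, 8)      # dummy aligned student feature (N, C, H, W)
N, C, H, W = s_pred.shape

# Channel-wise mask, as in the snippet above: one draw per (sample, channel).
channel_mask = (torch.rand(N, C, 1, 1) >= lambda_mgd).to(s_pred.dtype)
# Per-pixel mask (assumed variant): one draw per position, shared across channels.
pixel_mask = (torch.rand(N, 1, H, W) >= lambda_mgd).to(s_pred.dtype)

masked_by_channel = s_pred * channel_mask   # broadcast over H and W
masked_by_pixel = s_pred * pixel_mask       # broadcast over C

# Fraction of zeroed entries; each is close to lambda_mgd on average.
print((masked_by_channel == 0).float().mean().item())
print((masked_by_pixel == 0).float().mean().item())
```

Both variants keep the feature shape unchanged; they differ only in which axis the random draw is shared along.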

(3) Define the generation block and let it restore the teacher's full features from the masked student features

# ---- Generation block: two 3x3 convs with a ReLU in between ----#
self.generation = nn.Sequential(
    nn.Conv2d(teacher_channels, teacher_channels, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(teacher_channels, teacher_channels, kernel_size=3, padding=1))

# ---- Reconstruct the teacher's full features from the masked student features ----#
new_fea = self.generation(masked_fea)
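
To make the "generation" idea concrete, the sketch below zeroes one channel of a dummy feature map and passes it through a block built the same way as above. Because the 3x3 convolutions mix information across channels and neighbouring positions, the zeroed channel generally comes out non-zero. The channel count and tensor sizes are hypothetical.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
teacher_channels = 32  # hypothetical value for illustration
generation = nn.Sequential(
    nn.Conv2d(teacher_channels, teacher_channels, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(teacher_channels, teacher_channels, kernel_size=3, padding=1))

masked_fea = torch.randn(2, teacher_channels, 8, 8)
masked_fea[:, 0] = 0.0                 # pretend channel 0 was masked out

new_fea = generation(masked_fea)

# kernel_size=3 with padding=1 keeps the spatial size, so shapes match the input.
print(new_fea.shape)
# The masked channel is filled in from the remaining channels.
print(new_fea[:, 0].abs().sum().item() > 0)
```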

(4) Compute the loss

# Loss: sum of squared errors, averaged over the batch
L2 = nn.MSELoss(reduction='sum')
loss = L2(new_fea, t_pred) / N  # N: batch size
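
The loss above sums the squared error over every element and then divides by the batch size only. The sketch below checks this against a hand-written version on random stand-in tensors (the shapes are arbitrary).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
N = 4
new_fea = torch.randn(N, 8, 5, 5)   # stand-in for the generated student feature
t_pred = torch.randn(N, 8, 5, 5)    # stand-in for the teacher feature

L2 = nn.MSELoss(reduction='sum')
loss = L2(new_fea, t_pred) / N

# Same quantity written out by hand: summed squared error divided by the batch size.
manual = ((new_fea - t_pred) ** 2).sum() / N
print(torch.allclose(loss, manual, rtol=1e-4))
```

Since the sum is not divided by C, H, or W, the raw value grows with the size of the feature map, which is presumably why the weighting factor `alpha_mgd` used later in this post is so small.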

3. Complete MGD feature distillation implementation

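import torch
import torch.nn as nn


# ---- MGD is a general feature-based distillation method usable for image classification, object detection, semantic segmentation, and instance segmentation ----#
class MGDLoss(nn.Module):
    def __init__(self, student_channels, teacher_channels, name, alpha_mgd=0.00007, lambda_mgd=0.5):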
        super(MGDLoss, self).__init__()
        self.alpha_mgd = alpha_mgd
        self.lambda_mgd = lambda_mgd

        # ---- Align feature channels ----#
        if student_channels != teacher_channels:
            self.align = nn.Conv2d(student_channels, teacher_channels, kernel_size=1, stride=1, padding=0)
        else:
            self.align = None

        # ---- Generation block ----#
        self.generation = nn.Sequential(
            nn.Conv2d(teacher_channels, teacher_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(teacher_channels, teacher_channels, kernel_size=3, padding=1))

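    def forward(self, s_pred, t_pred):
        # ---- Check that the spatial sizes (H, W) of the two feature maps match ----#
        assert s_pred.shape[-2:] == t_pred.shape[-2:]
        # ---- Align channels only when needed; otherwise use the student feature as is ----#
        if self.align is not None:
            s_pred = self.align(s_pred)

        # ---- Compute the loss, weighted by alpha_mgd ----#
        loss = self.get_dis_loss(s_pred, t_pred) * self.alpha_mgd
        return loss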

    def get_dis_loss(self, s_pred, t_pred):
        L2 = nn.MSELoss(reduction='sum')
        N, C, H, W = t_pred.shape
        device = s_pred.device

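        # Masked student feature: one random draw per (sample, channel)
        mat = torch.rand((N, C, 1, 1), device=device)
        mat = (mat >= self.lambda_mgd).to(s_pred.dtype)
        masked_fea = torch.mul(s_pred, mat)

        # Reconstruct the teacher's full features from the masked student features
        new_fea = self.generation(masked_fea)

        # Loss: sum of squared errors, averaged over the batch
        loss = L2(new_fea, t_pred) / N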

        return loss
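
The following is a minimal sketch of how such a loss module might be plugged into a training step. It is not part of the original post: `TinyMGD` is a hypothetical, condensed stand-in for the class above (defined here only so the example runs on its own), and the single-layer "student" and "teacher" are toy placeholders for real backbones.

```python
import torch
import torch.nn as nn

class TinyMGD(nn.Module):
    """Condensed stand-in for the MGDLoss class above (hypothetical name)."""
    def __init__(self, student_channels, teacher_channels, alpha_mgd=0.00007, lambda_mgd=0.5):
        super().__init__()
        self.alpha_mgd, self.lambda_mgd = alpha_mgd, lambda_mgd
        self.align = (nn.Conv2d(student_channels, teacher_channels, kernel_size=1)
                      if student_channels != teacher_channels else nn.Identity())
        self.generation = nn.Sequential(
            nn.Conv2d(teacher_channels, teacher_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(teacher_channels, teacher_channels, kernel_size=3, padding=1))

    def forward(self, s_pred, t_pred):
        s_pred = self.align(s_pred)
        N, C, _, _ = t_pred.shape
        mask = (torch.rand(N, C, 1, 1, device=s_pred.device) >= self.lambda_mgd).to(s_pred.dtype)
        new_fea = self.generation(s_pred * mask)
        return self.alpha_mgd * nn.functional.mse_loss(new_fea, t_pred, reduction='sum') / N

# Toy single-layer "backbones"; real models would expose intermediate feature maps.
student = nn.Conv2d(3, 16, kernel_size=3, padding=1)
teacher = nn.Conv2d(3, 32, kernel_size=3, padding=1).eval()
mgd = TinyMGD(student_channels=16, teacher_channels=32)

# The align and generation layers are trainable, so they are optimized with the student.
optimizer = torch.optim.SGD(list(student.parameters()) + list(mgd.parameters()), lr=0.01)

images = torch.randn(2, 3, 8, 8)
with torch.no_grad():                 # the teacher is kept frozen
    t_feat = teacher(images)
s_feat = student(images)

distill_loss = mgd(s_feat, t_feat)    # would be added to the task loss in practice
optimizer.zero_grad()
distill_loss.backward()
optimizer.step()
print(distill_loss.item())
```

One design point worth noting: the loss module holds its own parameters, so forgetting to pass them to the optimizer would leave the align and generation layers at their random initialization.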