Knowledge Distillation!

%海棠依旧%

Last modified 2024-10-25 16:36:33


Tags: yolov8, object detection, YOLO, artificial intelligence, deep learning

First published 2024-02-25 17:59:51

Article link: https://blog.csdn.net/weixin_43175143/article/details/136285196


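Summary: knowledge distillation experiments were run within the YOLOv8 framework, comparing methods such as CWD, MGD, and BCKD. Among the Logits distillation methods, BCKD outperformed CWD on a self-made dataset, and using LD alone on the regression branch gave a notable gain.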


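09.14 Knowledge distillation under YOLOv8, progress so far: the feature-map-based methods CWD and MGD have been tested, and both improve results on a self-built dataset. With YOLOv8n as the student model and YOLOv8s as the teacher model, CWD gives a gain of 1.01% and MGD a gain of 0.34%. Distilling your own modified models is also supported.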
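To make the CWD idea concrete, here is a minimal framework-free sketch of the channel-wise loss as described in the CWD paper: each channel is turned into a distribution over spatial positions with a temperature softmax, and the student is pulled toward the teacher with a KL divergence. The function names and the nested-list layout are assumptions for illustration, not the code used in these experiments, and the student features are assumed to be already projected to the teacher's channel count.

```python
import math


def log_spatial_softmax(channel, tau):
    """Log-softmax over the flattened H*W positions of one channel."""
    scaled = [v / tau for v in channel]
    m = max(scaled)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(v - m) for v in scaled))
    return [v - log_z for v in scaled]


def cwd_loss(student, teacher, tau=1.0):
    """Channel-wise distillation loss (illustrative sketch).

    student, teacher: list of C channels, each a flat list of H*W activations.
    Returns tau^2 / C * sum over channels of KL(teacher || student).
    """
    assert len(student) == len(teacher), "channel counts must match"
    loss = 0.0
    for s_ch, t_ch in zip(student, teacher):
        log_ps = log_spatial_softmax(s_ch, tau)
        log_pt = log_spatial_softmax(t_ch, tau)
        loss += sum(math.exp(lt) * (lt - ls) for lt, ls in zip(log_pt, log_ps))
    return tau ** 2 * loss / len(student)


# Toy feature maps: 2 channels, 4 spatial positions each.
teacher_feat = [[0.1, 0.5, -0.3, 2.0], [1.0, 1.0, 1.0, 1.0]]
student_feat = [[2.0, 0.5, -0.3, 0.1], [1.0, 1.0, 1.0, 1.0]]
print(cwd_loss(student_feat, teacher_feat, tau=2.0))
```

In a real training loop the same arithmetic would be done with tensor operations on the neck feature maps; the pure-Python version is only meant to show what is being compared.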

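09.16 Major framework overhaul: Logits distillation has been added. Logits distillation and feature distillation can be run together or separately.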
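One simple way to picture "together or separately" is a pair of switches that decide which distillation terms are added to the normal detection loss. The sketch below is hypothetical: `DistillConfig`, its field names, and the weights are made up for illustration and are not the framework's actual options.

```python
from dataclasses import dataclass


@dataclass
class DistillConfig:
    # All names and defaults here are hypothetical.
    use_logits: bool = True      # enable Logits distillation
    use_feature: bool = True     # enable feature distillation
    logits_weight: float = 1.0   # weight of the Logits distillation term
    feature_weight: float = 1.0  # weight of the feature distillation term


def total_loss(det_loss, logits_kd, feature_kd, cfg):
    """Add the enabled distillation terms to the detection loss."""
    loss = det_loss
    if cfg.use_logits:
        loss += cfg.logits_weight * logits_kd
    if cfg.use_feature:
        loss += cfg.feature_weight * feature_kd
    return loss


both = total_loss(1.0, 0.5, 0.25, DistillConfig())
logits_only = total_loss(1.0, 0.5, 0.25, DistillConfig(use_feature=False))
print(both, logits_only)
```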

The following methods are currently supported:

Logits distillation: the recent BCKD (Bridging Cross-task Protocol Inconsistency for Distillation in Dense Object Detection) https://arxiv.org/pdf/2308.14286.pdf. Other Logits distillation methods will be added later.

Feature distillation: CWD (Channel-wise Knowledge Distillation for Dense Prediction) https://arxiv.org/pdf/2011.13256.pdf; MGD (Masked Generative Distillation) https://arxiv.org/abs/2205.01529; FGD (Focal and Global Knowledge Distillation for Detectors) https://arxiv.org/abs/2111.11837; FSP (A Gift from Knowledge Distillation: Fast Optimization, Network Minimization and Transfer Learning) https://openaccess.thecvf.com/content_cvpr_2017/papers/Yim_A_Gift_From_CVPR_2017_paper.pdf. Other feature distillation methods will be added later.

09.17 BCKD results: a gain of 1.63% on the self-made dataset, better than CWD, and the two can be trained together.
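For readers unfamiliar with BCKD, the sketch below illustrates only its classification part as I understand it from the paper: every class score is treated as an independent binary problem (sigmoid instead of softmax), and the student is trained with a binary cross-entropy against the teacher's probabilities. The importance weight |p_t - p_s| is my reading of the paper and should be checked against it; normalization and the IoU-based localization term are left out, and the function names are made up.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def binary_cls_distill(student_logits, teacher_logits, eps=1e-7):
    """Binary classification distillation loss (illustrative sketch).

    student_logits, teacher_logits: list of N anchors, each a list of
    K class logits. Returns the unnormalized sum over anchors and classes.
    """
    total = 0.0
    for s_row, t_row in zip(student_logits, teacher_logits):
        for s, t in zip(s_row, t_row):
            p_s = min(max(sigmoid(s), eps), 1.0 - eps)  # clamp to keep log finite
            p_t = sigmoid(t)
            bce = -(p_t * math.log(p_s) + (1.0 - p_t) * math.log(1.0 - p_s))
            total += abs(p_t - p_s) * bce  # assumed importance weight
    return total


# Toy example: 2 anchors, 3 classes.
teacher = [[2.0, -1.0, 0.0], [-3.0, 1.5, 0.5]]
student = [[0.5, -0.5, 0.0], [-1.0, 0.0, 0.5]]
print(binary_cls_distill(student, teacher))
```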

09.18 Added the distillation methods that have been debugged successfully.

The following methods are currently supported:

Logits distillation: the recent BCKD (Bridging Cross-task Protocol Inconsistency for Distillation in Dense Object Detection) https://arxiv.org/pdf/2308.14286.pdf; CrossKD (Cross-Head Knowledge Distillation for Dense Object Detection) https://arxiv.org/abs/2306.11369; NKD (From Knowledge Distillation to Self-Knowledge Distillation: A Unified Approach with Normalized Loss and Customized Soft Labels) https://arxiv.org/abs/2303.13005; DKD (Decoupled Knowledge Distillation) https://arxiv.org/pdf/2203.08679.pdf; LD (Localization Distillation for Dense Object Detection) https://arxiv.org/abs/2102.12252; WSLD (Rethinking Soft Labels for Knowledge Distillation: A Bias-Variance Tradeoff Perspective) https://arxiv.org/pdf/2102.00650.pdf; the original KD (Distilling the Knowledge in a Neural Network) https://arxiv.org/pdf/1503.02531.pdf.

Also supported: PKD (General Distillation Framework for Object Detectors via Pearson Correlation Coefficient) https://arxiv.org/abs/2207.02039.

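09.20 Results of using LD alone on the regression branch: this is the best performer so far, with a gain of 1.69%, better than BCKD, which also distills the classification branch. Possible explanation: KD on the classification branch may be interfering with the regression branch.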
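As a rough picture of what LD on the regression branch computes: the head is assumed to predict each of the four box sides as a distribution over discrete bins (16 bins by default in Ultralytics YOLOv8, as far as I know), and LD matches the student's temperature-softened bin distribution to the teacher's with a KL divergence. The sketch below is framework-free; the function names, the temperature value, and the tau^2 scaling are assumptions and not the settings used in this experiment.

```python
import math


def log_softmax(logits, tau):
    """Temperature-scaled log-softmax over the bins of one box side."""
    scaled = [z / tau for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - log_z for z in scaled]


def ld_loss(student_box, teacher_box, tau=10.0):
    """Localization distillation for one box (illustrative sketch).

    student_box, teacher_box: 4 lists (left, top, right, bottom), each
    holding the bin logits for one side. Returns
    tau^2 * sum over sides of KL(teacher || student).
    """
    loss = 0.0
    for s_side, t_side in zip(student_box, teacher_box):
        log_ps = log_softmax(s_side, tau)
        log_pt = log_softmax(t_side, tau)
        loss += sum(math.exp(lt) * (lt - ls) for lt, ls in zip(log_pt, log_ps))
    return tau ** 2 * loss


# Toy example with 4 bins per side instead of 16, to keep it short.
teacher_box = [[0.0, 2.0, 1.0, -1.0] for _ in range(4)]
student_box = [[2.0, 0.0, 1.0, -1.0] for _ in range(4)]
print(ld_loss(student_box, teacher_box))
```

Because this term only touches the box-side distributions, it can be switched on without adding any distillation signal to the classification branch, which is the setup described in the 09.20 entry.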