Noisy Student: a semi-supervised learning method

Paper: Self-training with Noisy Student improves ImageNet classification (Xie et al., 2020)

Differences from knowledge distillation:

(Noisy Student can also be called knowledge expansion.)

  1. The student network is no smaller than the teacher.

  2. Noise is injected into the student (see the sketch after this list).

    For input noise:

    data augmentation with RandAugment [18] is used (data augmentation is an important noising method).

    For model noise:

    dropout [76] and stochastic depth [37] are used.
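Below is a minimal PyTorch sketch of the two kinds of noise, using torchvision's `RandAugment` and `StochasticDepth` ops; the block name `NoisyResidualBlock` and all hyperparameter values are illustrative assumptions, not the paper's code:

```python
import torch
import torch.nn as nn
from torchvision import transforms
from torchvision.ops import StochasticDepth

# Input noise: RandAugment [18] on the student's inputs only.
# The teacher labels clean (un-augmented) images.
student_transform = transforms.Compose([
    transforms.RandAugment(num_ops=2, magnitude=9),
    transforms.ToTensor(),
])
teacher_transform = transforms.ToTensor()  # no input noise for the teacher

class NoisyResidualBlock(nn.Module):
    """Hypothetical residual block showing the two model-noise ops:
    dropout [76] and stochastic depth [37]."""

    def __init__(self, channels: int, branch_drop: float = 0.2,
                 dropout_p: float = 0.5):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Dropout(dropout_p),  # dropout noise
        )
        # Randomly drops the whole residual branch per sample in training.
        self.stochastic_depth = StochasticDepth(p=branch_drop, mode="row")

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.stochastic_depth(self.conv(x))
```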

Consistency training:

In consistency training, the teacher model generates the pseudo labels before it has converged, so it cannot produce high-accuracy labels.
Consistency training also regularizes the model toward high-entropy predictions, which prevents it from achieving good accuracy. To mitigate this, additional hyperparameters are introduced, which makes the method difficult to use at scale. (A sketch of the contrasting Noisy Student step follows below.)
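As a contrast, here is a minimal PyTorch sketch of one Noisy Student self-training step (the function name and batching are assumptions, not the paper's code): the teacher is fully trained, runs noise-free on clean images, and supplies soft pseudo labels, while the noised student fits both the labeled batch and the pseudo-labeled batch:

```python
import torch
import torch.nn.functional as F

def noisy_student_step(teacher, student, optimizer,
                       x_labeled, y_labeled,
                       x_unlabeled_clean, x_unlabeled_noised):
    # The teacher is converged and un-noised: eval mode, clean inputs.
    teacher.eval()
    with torch.no_grad():
        pseudo = F.softmax(teacher(x_unlabeled_clean), dim=-1)

    # The student is noised: train mode keeps dropout / stochastic depth
    # active, and it sees the augmented version of the unlabeled images.
    student.train()
    loss = F.cross_entropy(student(x_labeled), y_labeled)
    log_p = F.log_softmax(student(x_unlabeled_noised), dim=-1)
    loss = loss + (-(pseudo * log_p).sum(dim=-1)).mean()  # soft-label CE

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```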

The experiments rely on a Cloud TPU v3 Pod, which has 2048 cores. The largest model, EfficientNet-L2, needs to be trained for 6 days when the unlabeled batch size is 14x the labeled batch size. (A toy sketch of that batch-size ratio follows below.)
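A toy sketch of that batch composition, with dummy tensors standing in for the real labeled and unlabeled datasets (all shapes and sizes here are placeholders):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

labeled_bs = 64
unlabeled_bs = 14 * labeled_bs  # unlabeled batch is 14x the labeled batch

# Placeholder datasets, not the paper's data.
labeled = TensorDataset(torch.randn(1024, 3, 32, 32),
                        torch.randint(0, 10, (1024,)))
unlabeled = TensorDataset(torch.randn(14 * 1024, 3, 32, 32))

labeled_loader = DataLoader(labeled, batch_size=labeled_bs, shuffle=True)
unlabeled_loader = DataLoader(unlabeled, batch_size=unlabeled_bs, shuffle=True)

# Each training step consumes one batch from each loader in lockstep.
for (x_l, y_l), (x_u,) in zip(labeled_loader, unlabeled_loader):
    pass  # feed into noisy_student_step(...) sketched above
```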

To be continued…
