2020-12-29 bert state dict

小老鼠上灯台偷油吃跑下来

Article link: https://blog.csdn.net/weixin_43526074/article/details/111937271

the teacher layer ids to distill is: [0, 1, 2, 3, 4, 5]
odict_keys([
'bert.embeddings.word_embeddings.weight',
'bert.embeddings.position_embeddings.weight',
'bert.embeddings.token_type_embeddings.weight',

'bert.embeddings.LayerNorm.weight',
'bert.embeddings.LayerNorm.bias',

'bert.encoder.layer.0.attention.self.query.weight',
'bert.encoder.layer.0.attention.self.query.bias',
'bert.encoder.layer.0.attention.self.key.weight',
'bert.encoder.layer.0.attention.self.key.bias',
'bert.encoder.layer.0.attention.self.value.weight',
'bert.encoder.layer.0.attention.self.value.bias',

'bert.encoder.layer.0.attention.output.dense.weight',
'bert.encoder.layer.0.attention.output.dense.bias',
'bert.encoder.layer.0.attention.output.LayerNorm.weight',
'bert.encoder.layer.0.attention.output.LayerNorm.bias',

'bert.encoder.layer.0.intermediate.dense.weight',
'bert.encoder.layer.0.intermediate.dense.bias',

'bert.encoder.layer.0.output.dense.weight',
'bert.encoder.layer.0.output.dense.bias',
'bert.encoder.layer.0.output.LayerNorm.weight',
'bert.encoder.layer.0.output.LayerNorm.bias',

'cls.predictions.transform.dense.weight',
'cls.predictions.transform.dense.bias',
'cls.predictions.transform.LayerNorm.weight',
'cls.predictions.transform.LayerNorm.bias',

'cls.predictions.decoder.weight',
'cls.predictions.decoder.bias',

'cls.predictions.bias',

'bert.pooler.dense.weight',
'bert.pooler.dense.bias',

])
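
The log above pairs a list of teacher layer ids with the key names of a BERT state dict. The post does not show how the ids are used, so the following is only a rough sketch of one plausible use: picking the listed encoder layers out of a teacher state dict by key name and renumbering them for a smaller student. The helper name `select_teacher_layers`, the 12-layer toy dict, and the placeholder string values are all assumptions made for illustration; a real state dict would hold tensors.

```python
import re
from collections import OrderedDict

# Matches keys such as 'bert.encoder.layer.0.attention.self.query.weight'.
LAYER_KEY = re.compile(r"^bert\.encoder\.layer\.(\d+)\.(.+)$")


def select_teacher_layers(state_dict, teacher_layer_ids):
    """Hypothetical helper: keep only the listed encoder layers, renumbered 0..n-1.

    Entries outside the encoder (embeddings, pooler, cls head) are copied unchanged.
    """
    new_index = {old: new for new, old in enumerate(teacher_layer_ids)}
    selected = OrderedDict()
    for key, value in state_dict.items():
        match = LAYER_KEY.match(key)
        if match is None:
            # Not an encoder-layer key: keep as-is.
            selected[key] = value
            continue
        layer_id = int(match.group(1))
        if layer_id in new_index:
            # Rewrite the layer index to its position in the student.
            selected[f"bert.encoder.layer.{new_index[layer_id]}.{match.group(2)}"] = value
    return selected


# Toy stand-in for a teacher state dict (assumed 12 layers, placeholder values).
toy_state_dict = OrderedDict(
    [("bert.embeddings.word_embeddings.weight", "emb")]
    + [(f"bert.encoder.layer.{i}.attention.self.query.weight", f"q{i}") for i in range(12)]
    + [("bert.pooler.dense.weight", "pool")]
)

student_init = select_teacher_layers(toy_state_dict, [0, 1, 2, 3, 4, 5])
for name in student_init:
    print(name)
```

With the ids `[0, 1, 2, 3, 4, 5]` the renumbering is the identity; it only matters when the chosen teacher layers are not the first ones, for example `[1, 3, 5]`.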
