When I was training a Transformer on 12M+ source sentences and an equal number of target sentences (batch size 4096, platform: 4 × TITAN Xp …
Tensor2Tensor GPU Memory Error During Training