Paper:
[2402.15627] MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs (arxiv.org)
Chinese-language explainer:
https://zhuanlan.zhihu.com/p/684712727