[Solved] DDP multi-GPU training hangs at "All distributed processes registered. Starting with 8 processes"

Latest recommended article published on 2024-08-01 16:52:44

Ceder1c

Tags: pytorch

Article link: https://blog.csdn.net/CCCDeric/article/details/133993371

When using DDP to accelerate training across multiple GPUs, the run hangs at the following point:

----------------------------------------------------------------------------------------------------
distributed_backend=nccl
All distributed processes registered. Starting with 8 processes
----------------------------------------------------------------------------------------------------

Solution