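Problem: RuntimeError: Default process group has not been initialized, please make sure to call init_process_group.
Solution: see GitHub issue #503 (using detectron2 on Windows). The fix proposed there is to initialize the process group manually when only one GPU is visible: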
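import os
import torch.distributed as dist

# Visible GPU ids, e.g. "0" or "0,1" (raises KeyError if the variable is unset)
cuda_num = os.environ['CUDA_VISIBLE_DEVICES']
cuda_num_list = cuda_num.split(",")
if len(cuda_num_list) == 1:
    # Single GPU: create a one-process default group by hand
    dist.init_process_group(backend='nccl', init_method='tcp://localhost:23456', rank=0, world_size=1)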
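This then raises a second error: RuntimeError: Distributed package doesn't have NCCL built in
Cause: Windows does not support the NCCL backend.
Solution: pass backend='gloo' to dist.init_process_group instead of backend='nccl' (for example, set backend = 'gloo' just before the call and pass that variable in). In other words, on Windows use Gloo in place of NCCL.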
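The backend choice can also be made automatically rather than hard-coded. The following is only a minimal sketch, not the fix from the issue itself: the helper names pick_backend and init_single_process_group are hypothetical, and the address tcp://localhost:23456 is simply carried over from the snippet above. It assumes a single process (rank 0, world size 1).

```python
import sys


def pick_backend(platform_name: str, nccl_available: bool) -> str:
    """Return 'nccl' only where it can work; otherwise fall back to 'gloo'."""
    # Hypothetical helper: Windows builds of PyTorch ship without NCCL.
    if platform_name.startswith("win") or not nccl_available:
        return "gloo"
    return "nccl"


def init_single_process_group(init_method: str = "tcp://localhost:23456") -> str:
    """Initialize a one-process default group and return the backend used."""
    # torch is imported lazily so pick_backend stays usable without it.
    import torch
    import torch.distributed as dist

    if not dist.is_available():
        raise RuntimeError("this PyTorch build has no distributed support")

    nccl_ok = dist.is_nccl_available() and torch.cuda.is_available()
    backend = pick_backend(sys.platform, nccl_ok)

    # Avoid initializing twice if some other code already created the group.
    if not dist.is_initialized():
        dist.init_process_group(
            backend=backend, init_method=init_method, rank=0, world_size=1
        )
    return backend
```

Calling init_single_process_group() once, before the model is built, would select Gloo on Windows and NCCL on a Linux machine with CUDA, so the same script can run on both.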