I ran into this bug while running code with accelerate. The full error message was:
ConnectionError: Tried to launch distributed communication on port `xxxxx`, but another process is utilizing it. Please specify a different port (such as using the `--main_process_port` flag or specifying a different `main_process_port` in your config file) and rerun your script. To automatically use the next open port (on a single node), you can set this to `0`.
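In practice, setting it to 0 did not help. The fix that worked for me was changing it to the port number from the error message plus 1.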
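The "port + 1" workaround can be automated by probing ports before launching. The following is a minimal sketch, assuming a single node and assuming that a port counts as free if a TCP socket can bind to it on localhost. `is_port_free` and `next_free_port` are hypothetical helper names, not part of accelerate, and another process could still grab the port between the check and the launch.

```python
import socket


def is_port_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can bind to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use": some process holds the port.
            return False
        return True


def next_free_port(start: int, max_tries: int = 100, host: str = "127.0.0.1") -> int:
    """Scan upward from `start` and return the first port that can be bound."""
    for port in range(start, start + max_tries):
        if is_port_free(port, host):
            return port
    raise RuntimeError(f"no free port found in [{start}, {start + max_tries})")


if __name__ == "__main__":
    # 12023 is only an example starting value.
    print(next_free_port(12023))
```

The printed number can then be used as the value of main_process_port.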
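By default the config file lives at ~/.cache/huggingface/accelerate/default_config.yaml, and you can edit it directly. If you are worried that changing it will affect other code, create a new config file (for example, a copy of the default one) and add this line at the end: main_process_port: 12023 (the port number).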
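Creating the separate config file can also be scripted. The sketch below treats the YAML as plain text to avoid a YAML library: it copies an existing config, drops any top-level main_process_port line, and appends the new value. `write_config_with_port` is a hypothetical helper, and the approach assumes the key sits on a line of its own at the top level of the file.

```python
from pathlib import Path


def write_config_with_port(src: Path, dst: Path, port: int) -> None:
    """Copy the config at `src` to `dst`, setting main_process_port to `port`."""
    # Raises FileNotFoundError if `src` does not exist.
    lines = src.read_text(encoding="utf-8").splitlines()
    # Drop an existing top-level entry so the key is not duplicated.
    kept = [line for line in lines if not line.startswith("main_process_port:")]
    kept.append(f"main_process_port: {port}")
    dst.write_text("\n".join(kept) + "\n", encoding="utf-8")


# Example call (paths and port are illustrative):
# write_config_with_port(
#     Path.home() / ".cache/huggingface/accelerate/default_config.yaml",
#     Path("my_config_file.yaml"),
#     12023,
# )
```

Because the original file is left untouched, other projects that rely on the default config keep their settings.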
Then pass the config file on the command line: accelerate launch --config_file {path/to/config/my_config_file.yaml} {script_name.py} {--arg1} {--arg2} ...
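If the launch is driven from a Python wrapper, the command can be assembled as an argument list. This is a sketch only: `build_launch_command` is a hypothetical helper, and it also shows the `--main_process_port` flag named in the error message as an alternative to a separate config file. The script name and arguments are the placeholders from the command above.

```python
import shlex
from typing import List, Optional, Sequence


def build_launch_command(
    script: str,
    script_args: Sequence[str] = (),
    config_file: Optional[str] = None,
    main_process_port: Optional[int] = None,
) -> List[str]:
    """Build the argv list for `accelerate launch` without running it."""
    cmd = ["accelerate", "launch"]
    if config_file is not None:
        cmd += ["--config_file", config_file]
    if main_process_port is not None:
        cmd += ["--main_process_port", str(main_process_port)]
    # Launcher options must come before the script; script args come after it.
    cmd.append(script)
    cmd.extend(script_args)
    return cmd


if __name__ == "__main__":
    cmd = build_launch_command(
        "script_name.py",
        ["--arg1", "--arg2"],
        config_file="path/to/config/my_config_file.yaml",
    )
    print(shlex.join(cmd))
    # To actually start the run: subprocess.run(cmd, check=True)
```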
Reference: Launching your 🤗 Accelerate scripts
Asking ChatGPT about this kind of problem turned out to be useless; I still had to dig through the docs myself.