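1. When you log in to the server over ssh, do not run any computation on the login node, not even conda operations.
For example, running conda install XXX on the login node fails with an error: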
(base) [cs_cs_ycs@workstation ~]$ conda install jupyter
Collecting package metadata (current_repodata.json): - Exception ignored in thread started by: <bound method Thread._bootstrap of <Thread(ThreadPoolExecutor-0_3, started daemon 23065636779776)>>
Traceback (most recent call last):
File "/home/cs_cs_ycs/miniconda3/lib/python3.8/threading.py", line 890, in _bootstrap
File "/home/cs_cs_ycs/miniconda3/lib/python3.8/threading.py", line 934, in _bootstrap_inner
MemoryError: failed
CondaHTTPError: HTTP 000 CONNECTION FAILED for url <https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/pytorch/linux-64/current_repodata.json>
Elapsed: -
An HTTP error occurred when trying to retrieve this URL.
HTTP errors are often intermittent, and a simple retry will get you on your way.
'https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/pytorch/linux-64'
An out-of-memory error can also occur:
(base) [cs_cs_ycs@workstation ~]$ conda uninstall numpy
Collecting package metadata (repodata.json): failed
CondaMemoryError: The conda process ran out of memory. Increase system memory and/or try again.
Solution: request a compute node and run the commands there; on the compute node they work normally.
(base) [cs_cs_ycs@workstation ~]$ salloc
salloc: Granted job allocation 661773
salloc: Waiting for resource configuration
salloc: Nodes cpu11 are ready for job
(base) [cs_cs_ycs@workstation ~]$ ssh cpu11
Warning: Permanently added 'cpu11' (ED25519) to the list of known hosts.
Last login: Tue Jul 23 19:55:24 2024 from 192.168.0.1
(base) [cs_cs_ycs@cpu11 ~]$ conda install jupyter
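To avoid running conda on the login node by accident, a small guard could be added to your shell startup file. This is only a sketch: the function names on_login_node and safe_conda are hypothetical, and it assumes the login node's short host name is "workstation", as shown in the prompts above. Adjust LOGIN_NODE for your own cluster.

```shell
# Assumption: "workstation" is the login node's short host name (taken from the prompts above).
LOGIN_NODE="workstation"

# Hypothetical helper: succeeds if the given host name (default: this machine's)
# matches the login node name.
on_login_node() {
    [ "${1:-$(hostname -s)}" = "$LOGIN_NODE" ]
}

# Hypothetical wrapper: refuses to run conda on the login node,
# otherwise passes all arguments through to conda unchanged.
safe_conda() {
    if on_login_node; then
        echo "Refusing to run conda on $LOGIN_NODE: run salloc, then ssh to the allocated node." >&2
        return 1
    fi
    conda "$@"
}
```

With this in place, safe_conda install jupyter would print the reminder on the login node and behave like plain conda on a compute node such as cpu11.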