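My current Python version is 3.7.5, but 3.8+ is required, so I upgraded Python and reinstalled all the dependency libraries:
sudo apt-get install libffi-dev  # required; otherwise the build may fail with a "_ctypes module not found" error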
wget https://www.python.org/ftp/python/3.8.5/Python-3.8.5.tgz
tar -xf Python-3.8.5.tgz
cd Python-3.8.5
./configure && make && sudo make install
# Remove the old symlinks
rm -rf /usr/bin/python
rm -rf /usr/bin/python-config
# Create the new symlinks
ln -sf /usr/local/bin/python3.8 /usr/bin/python
ln -sf /usr/local/bin/python3.8-config /usr/bin/python-config
# Check the Python version
python -V
python3 -V
# Upgrade pip (required; otherwise packages get installed into the old Python version)
python3 -m pip install --upgrade pip
# If this is your first Python install, you may need to set environment variables; there are plenty of tutorials online
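After the upgrade it can help to confirm that the interpreter behind `python` really is 3.8+ and that `_ctypes` was built. The following is only a minimal sketch; the helper names `interpreter_ok` and `has_module` are made up for illustration and do not come from any library.

```python
import importlib
import sys

MIN_VERSION = (3, 8)  # the minimum version stated in this post


def interpreter_ok(version=None, minimum=MIN_VERSION):
    """Return True if the (major, minor) version meets the minimum."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= tuple(minimum)


def has_module(name):
    """Return True if this interpreter can import the named module."""
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False


if __name__ == "__main__":
    print("executable:", sys.executable)
    print("version:", sys.version.split()[0])
    print("meets 3.8+:", interpreter_ok())
    # _ctypes is the C extension that is skipped when libffi-dev is missing
    print("_ctypes available:", has_module("_ctypes"))
```

Run it with both `python` and `python3` to see whether the two commands resolve to the same interpreter.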
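To completely uninstall Python and reinstall from scratch (careful: on Debian/Ubuntu this also deletes the system Python that many OS tools depend on):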
sudo rm -rf /usr/bin/python2*
sudo rm -rf /usr/bin/python3*
sudo rm -rf /usr/lib/python2*
sudo rm -rf /usr/lib/python3*
sudo rm -rf /usr/local/lib/python2*
sudo rm -rf /usr/local/lib/python3*
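Since these `rm -rf` patterns are broad, it may be worth listing what they match before deleting anything. Here is a rough dry-run sketch that expands the same glob patterns and only prints the matches; `preview` is a hypothetical helper name, and nothing is removed.

```python
import glob

# The same patterns as the rm commands above
PATTERNS = [
    "/usr/bin/python2*",
    "/usr/bin/python3*",
    "/usr/lib/python2*",
    "/usr/lib/python3*",
    "/usr/local/lib/python2*",
    "/usr/local/lib/python3*",
]


def preview(patterns=PATTERNS):
    """Return every existing path matched by the patterns, without deleting."""
    matches = []
    for pattern in patterns:
        matches.extend(sorted(glob.glob(pattern)))
    return matches


if __name__ == "__main__":
    for path in preview():
        print(path)
```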
Install PyTorch; I'm on CUDA 11.3.
Look up previous versions on the official site and pick the build that matches your CUDA version: https://pytorch.org/get-started/previous-versions/
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
# Test
python
import torch
print(torch.__version__)
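Printing the version alone does not show whether the CUDA build is actually usable. As a slightly fuller check, the sketch below also reports the CUDA version the wheel was built for and whether a GPU is visible; `torch_report` is an illustrative name, and it returns `None` when torch is not importable by the current interpreter.

```python
def torch_report():
    """Collect basic facts about the installed torch, or None if it is missing."""
    try:
        import torch
    except ImportError:
        return None
    return {
        "torch": torch.__version__,
        # CUDA version the wheel was built against (None for CPU-only builds)
        "built_for_cuda": torch.version.cuda,
        # True only if a driver and a usable GPU are visible at runtime
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    report = torch_report()
    if report is None:
        print("torch is not importable from this interpreter")
    else:
        for key, value in report.items():
            print(key, "=", value)
```

With the install command above, the version string would be expected to carry the `+cu113` suffix. If `cuda_available` comes back `False`, the driver or the chosen wheel is the likely place to look.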
Install cuDNN and NCCL. I had already installed both, so mostly I just needed to update the symlinks.
https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html
https://developer.nvidia.com/nccl/nccl-download
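When adjusting the symlinks, it helps to see which real files the `libcudnn` and `libnccl` links currently resolve to. The sketch below is one way to do that; the directory `/usr/local/cuda/lib64` is only an assumed location (your libraries may live elsewhere), and `resolve_links` is a hypothetical helper.

```python
import glob
import os


def resolve_links(lib_dir, names=("libcudnn", "libnccl")):
    """Map each matching .so path in lib_dir to the real file it resolves to."""
    resolved = {}
    for name in names:
        for path in sorted(glob.glob(os.path.join(lib_dir, name + ".so*"))):
            resolved[path] = os.path.realpath(path)
    return resolved


if __name__ == "__main__":
    # Assumed install location; change it to wherever your libraries are
    for link, target in resolve_links("/usr/local/cuda/lib64").items():
        print(link, "->", target)
```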
Finally, install FasterTransformer by following https://github.com/NVIDIA/FasterTransformer/blob/main/docs/gpt_guide.md