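After flashing JetPack 4.5.1, the default CUDA version is 10.2, but my project requires CUDA 10.0. Everything I found online said the older CUDA 10.0 cannot be installed on this release; my first attempt failed too, and NVIDIA's site says the same. My manager suggested flashing without CUDA 10.2, cuDNN, OpenCV, and Python, then trying once more; if that also failed, it really could not be done. After reading a lot of material and several hours of trial and error with different settings, it finally worked.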
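Baidu Netdisk download for CUDA 10.0 and cuDNN 7:
Link: https://pan.baidu.com/s/1YgnVEJ9W2x_Hsweuonn7_w
Extraction code: 6qlm
The files total 5.1 GB. The JetPack 4.2 SDK in the share can be ignored; download only the files you need.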
1. Installing CUDA 10.0 on the TX2i with JetPack 4.5.1
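Tip: decide before installing whether to install as root or as a normal user. The default is fine. In my case, when I later wanted to free as much space as possible, I found that with a root install you can delete everything under home, whereas with a user install you cannot. I only learned this afterwards.
cd into the directory where you downloaded the JetPack 4.2 packages and run:
sudo dpkg -i cuda-repo-l4t-10-0-local-10.0.166_1.0-1_arm64.deb # install the CUDA 10.0 local repo package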
sudo dpkg -i graphsurgeon-tf_5.0.6-1+cuda10.0_arm64.deb
sudo dpkg -i libcudnn7_7.3.1.28-1+cuda10.0_arm64.deb
sudo dpkg -i libcudnn7-dev_7.3.1.28-1+cuda10.0_arm64.deb
Next install some dependencies. Start with the three below; install others only when they are needed.
Change into /var/cuda-repo-10-0-local-10.0.166:
sudo dpkg -i cuda-license-10-0_10.0.166-1_arm64.deb
sudo dpkg -i cuda-cublas-10-0_10.0.166-1_arm64.deb
sudo dpkg -i cuda-cublas-dev-10-0_10.0.166-1_arm64.deb
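I ran all of the commands below, but later wondered whether the python2 packages could be skipped, so you can try leaving them out.
cd into the directory that holds the downloaded packages: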
sudo dpkg -i libnvinfer5_5.0.6-1+cuda10.0_arm64.deb
sudo dpkg -i libnvinfer-dev_5.0.6-1+cuda10.0_arm64.deb
sudo dpkg -i libnvinfer-samples_5.0.6-1+cuda10.0_all.deb
sudo dpkg -i tensorrt_5.0.6.3-1+cuda10.0_arm64.deb
sudo dpkg -i uff-converter-tf_5.0.6-1+cuda10.0_arm64.deb
sudo dpkg -i python-libnvinfer_5.0.6-1+cuda10.0_arm64.deb
sudo dpkg -i python-libnvinfer-dev_5.0.6-1+cuda10.0_arm64.deb
sudo dpkg -i python3-libnvinfer_5.0.6-1+cuda10.0_arm64.deb
sudo dpkg -i python3-libnvinfer-dev_5.0.6-1+cuda10.0_arm64.deb
Then repair the dependencies:
sudo apt --fix-broken install
sudo apt-get install libtbb2
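The next three commands install OpenCV 3.3.1. It does not go into the python3 package path, although Python can still call it. I wanted to save space and use an OpenCV version I control instead, so I would have preferred not to install this one. I do not know whether skipping it causes errors; you can try, and install it later if something fails.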
sudo dpkg -i libopencv_3.3.1-2-g31ccdfe11_arm64.deb
sudo dpkg -i libopencv-dev_3.3.1-2-g31ccdfe11_arm64.deb
sudo dpkg -i libopencv-python_3.3.1-2-g31ccdfe11_arm64.deb
pip3 install pycuda # if this fails with "src/cpp/cuda.hpp:14:10: fatal error: cuda.h: No such file or directory", see step 4
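After all of the steps above, I found that CUDA was still not right: the corresponding lib and include files were missing.
Run the commands below so that apt picks up the local package repository as a source: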
sudo apt-get update
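If apt reports that a public key cannot be verified, for example
file:/var/cuda-repo-10-0-local Release: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY F60F4B3D7FA2AF80
then the CUDA repository has no trusted signing key yet. Import the key with: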
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys F60F4B3D7FA2AF80
sudo apt-get install cuda-toolkit-10-0
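The key ID differs between repositories, so it can help to pull it out of the apt output instead of retyping it. The following is a minimal sketch, not part of the original procedure: the helper names `missing_apt_keys` and `recv_key_command` are made up for illustration, and the sample text is shortened from the message shown above.

```python
import re


def missing_apt_keys(apt_output):
    """Return the key IDs that follow NO_PUBKEY in apt output, without duplicates."""
    keys = []
    for key in re.findall(r"NO_PUBKEY\s+([0-9A-Fa-f]{8,40})", apt_output):
        if key not in keys:
            keys.append(key)
    return keys


def recv_key_command(key_id, keyserver="keyserver.ubuntu.com"):
    """Build the same apt-key command used in this article for one key ID."""
    return f"sudo apt-key adv --keyserver {keyserver} --recv-keys {key_id}"


# Shortened sample of the error text from this article.
sample = "file:/var/cuda-repo-10-0-local Release: NO_PUBKEY F60F4B3D7FA2AF80"
for key in missing_apt_keys(sample):
    print(recv_key_command(key))
```

The script only prints the command; running it is still a manual step.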
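Check the CUDA version. Note that typing nvcc gives no response at this point, because nvcc is not installed (or not yet on PATH). Use this command instead:
cat /usr/local/cuda/version.txt # check the CUDA version
Check the cuDNN version:
As an alternative, you can go to the cuDNN install directory and open cudnn.h to read the version by hand.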
cat /usr/include/cudnn.h | grep CUDNN_MAJOR -A 2
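If the command prints nothing, see my other article, "Installing CUDA 10.2 and PyTorch 1.8 (1.6 also works) on JetPack 4.5.1". Its configuration steps are more complete than the ones here, so I will not repeat them.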
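Both version checks can also be done from Python, which is handy when a script has to decide which wheel to install. This is a rough sketch under two assumptions: that version.txt contains a line of the form "CUDA Version X.Y.Z", and that cudnn.h defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL (the same macros the grep command looks for). The function names are illustrative.

```python
import os
import re


def parse_cuda_version(text):
    """Extract (major, minor[, patch]) from text like 'CUDA Version 10.0.166'."""
    m = re.search(r"CUDA Version\s+(\d+)\.(\d+)(?:\.(\d+))?", text)
    if m is None:
        return None
    return tuple(int(p) for p in m.groups() if p is not None)


def parse_cudnn_version(header_text):
    """Extract (major, minor, patch) from the #define lines of cudnn.h."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        m = re.search(r"^\s*#\s*define\s+" + name + r"\s+(\d+)",
                      header_text, re.MULTILINE)
        if m is None:
            return None
        parts.append(int(m.group(1)))
    return tuple(parts)


# Paths taken from the commands in this article; skipped if they do not exist.
for path, parser in (("/usr/local/cuda/version.txt", parse_cuda_version),
                     ("/usr/include/cudnn.h", parse_cudnn_version)):
    if os.path.exists(path):
        with open(path, errors="replace") as fh:
            print(path, "->", parser(fh.read()))
    else:
        print(path, "not found")
```

A result of None means the file exists but the expected line or macros were not found in it.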
2. Add CUDA to the environment variables
export PATH=/usr/local/cuda-10.0/bin:/usr/local/bin:$PATH
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-10.0/lib64
export LIBRARY_PATH=$LIBRARY_PATH:/usr/local/cuda-10.0/lib64
export PATH=$PATH:/usr/local/cuda-10.0/lib64
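PATH and LD_LIBRARY_PATH are colon-separated lists, and a new entry should be added without dropping the existing ones; otherwise commands under /usr/bin stop resolving. The sketch below shows that idea in Python. `prepend_path` is a hypothetical helper, and the script only prints the values it would use rather than changing your shell.

```python
import os


def prepend_path(current, *new_dirs):
    """Put new_dirs in front of a colon-separated list, keeping each entry once."""
    seen = []
    for entry in list(new_dirs) + current.split(":"):
        if entry and entry not in seen:
            seen.append(entry)
    return ":".join(seen)


cuda_home = "/usr/local/cuda-10.0"  # install prefix used in this article
print("PATH=" + prepend_path(os.environ.get("PATH", ""), cuda_home + "/bin"))
print("LD_LIBRARY_PATH=" + prepend_path(os.environ.get("LD_LIBRARY_PATH", ""),
                                        cuda_home + "/lib64"))
```

Because duplicates are removed, sourcing a file that uses this logic several times does not keep growing the variable.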
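3. Setting up PyTorch 1.3
JetPack 4.5.1 calls for PyTorch 1.5 or later, but with CUDA 10.0 those versions cannot be installed. I tried 1.6 and it failed because some CUDA libraries could not be found. The PyTorch version has to match the CUDA version, so I installed PyTorch 1.3 with torchvision 0.4.2.
You can also refer to my other article, "Installing CUDA 10.2 and PyTorch 1.8 (1.6 also works) on JetPack 4.5.1".
PyTorch for Jetson (NVIDIA developer forum thread):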
https://forums.developer.nvidia.com/t/pytorch-for-jetson-version-1-8-0-now-available/72048
For an older version like this I recommend installing offline. cd into your download directory and run these three commands:
sudo apt-get install python3-pip libopenblas-base libopenmpi-dev
pip3 install Cython
pip3 install numpy torch-1.3.0-cp36-cp36m-linux_aarch64.whl
error1
File "/usr/lib/python3.6/ctypes/__init__.py", line 348, in __init__
self._handle = _dlopen(self._name, mode)
OSError: libcurand.so.10: cannot open shared object file: No such file or directory
resolve method:
cd /usr/local/cuda-10.0/lib64
sudo ln -sf libcurand.so.10.0 libcurand.so.10
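libcurand may not be the only library missing its ".so.10" name. The sketch below lists, without changing anything, which libraries in a directory have a ".so.10.0" file but no ".so.10" link, so the same ln -sf fix can be repeated for each. The function name, the suffix defaults, and the throwaway demo directory are assumptions for illustration; on the device you would pass /usr/local/cuda-10.0/lib64.

```python
import os
import tempfile


def missing_major_links(lib_dir, minor_suffix=".10.0", major_suffix=".10"):
    """Return (link_name, target) pairs for lib*.so.10.0 files lacking lib*.so.10."""
    todo = []
    for name in sorted(os.listdir(lib_dir)):
        if name.startswith("lib") and name.endswith(".so" + minor_suffix):
            link = name[: -len(minor_suffix)] + major_suffix
            # lexists also counts dangling symlinks as present
            if not os.path.lexists(os.path.join(lib_dir, link)):
                todo.append((link, name))
    return todo


# Demo on a temporary directory that stands in for the CUDA lib64 folder.
demo_dir = tempfile.mkdtemp()
for fake in ("libcurand.so.10.0", "libcublas.so.10.0"):
    open(os.path.join(demo_dir, fake), "w").close()
os.symlink("libcublas.so.10.0", os.path.join(demo_dir, "libcublas.so.10"))

for link, target in missing_major_links(demo_dir):
    print(f"sudo ln -sf {target} {link}")
```

The script only prints the ln commands; creating the links in the real directory still needs sudo, as in the fix above.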
error2
File "/usr/local/lib/python3.6/dist-packages/torch/__init__.py", line 81, in <module>
from torch._C import *
ModuleNotFoundError: No module named 'numpy.core._multiarray_umath'
resolve method:
pip3 install --upgrade numpy==1.16.0 # or, through a mirror:
pip3 install numpy==1.16.0 -i https://pypi.tuna.tsinghua.edu.cn/simple
pip3 install numpy==1.16.0 -i https://mirrors.aliyun.com/pypi/simple
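The module `numpy.core._multiarray_umath` first appeared in NumPy 1.16, so this ModuleNotFoundError usually means the installed NumPy is older than that. A small sketch for checking this before importing torch follows; the helper name is made up, and it only compares the major and minor version numbers.

```python
def numpy_has_multiarray_umath(version):
    """True if a NumPy version string is 1.16 or newer (major.minor comparison)."""
    major, minor = (int(p) for p in version.split(".")[:2])
    return (major, minor) >= (1, 16)


try:
    import numpy
    ok = numpy_has_multiarray_umath(numpy.__version__)
    print("numpy", numpy.__version__, "is new enough" if ok else "is too old")
except ImportError:
    print("numpy is not installed")
```

If it reports "too old", upgrading to numpy 1.16.0 as described in this article is the fix.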
Install the torchvision version that matches your PyTorch version (torchvision 0.4.2 for PyTorch 1.3 in my case).
torchvision:
sudo apt-get install libjpeg-dev zlib1g-dev libpython3-dev libavcodec-dev libavformat-dev libswscale-dev
git clone --branch <version> https://github.com/pytorch/vision torchvision # see below for version of torchvision to download
cd torchvision
export BUILD_VERSION=0.x.0 # where 0.x.0 is the torchvision version
python3 setup.py install --user
cd ../ # attempting to load torchvision from build dir will result in import error
pip3 install 'pillow<7' # always needed for Python 2.7, not needed torchvision v0.5.0+ with Python 3.6
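The build commands above contain two placeholders, `<version>` and `0.x.0`. As an illustration, the sketch below fills them in for torchvision 0.4.2, the version paired with PyTorch 1.3 in this article. It assumes the git tag is the version number with a leading "v", and `torchvision_build_commands` is a hypothetical helper that only prints the commands.

```python
def torchvision_build_commands(version):
    """Return the build steps from this article with the placeholders filled in."""
    return [
        # assumption: the pytorch/vision tag is named "v" + version
        f"git clone --branch v{version} https://github.com/pytorch/vision torchvision",
        "cd torchvision",
        f"export BUILD_VERSION={version}",
        "python3 setup.py install --user",
        "cd ../",
    ]


print("\n".join(torchvision_build_commands("0.4.2")))
```

Printing rather than executing keeps the steps easy to review before a long on-device build.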
Verify that the installation works:
import torch
print(torch.__version__)
print('CUDA available: ' + str(torch.cuda.is_available()))
print('cuDNN version: ' + str(torch.backends.cudnn.version()))
a = torch.cuda.FloatTensor(2).zero_()
print('Tensor a = ' + str(a))
b = torch.randn(2).cuda()
print('Tensor b = ' + str(b))
c = a + b
print('Tensor c = ' + str(c))
import torchvision
print(torchvision.__version__)
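Notes:
While installing PyTorch you may hit errors such as libcuda*.10.so not being found. In that case, check that the CUDA paths have been added to the environment variables and that the symlinks have been created.
For how to create the symlinks and change permissions, see my other article.
4. Troubleshooting
Some errors may come up along the way. I am recording them here:
error 1: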
src/cpp/cuda.hpp:14:10: fatal error: cuda.h: No such file or directory
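find /usr/local/cuda-10.0 -name cuda.h
This error means your environment variables are wrong:
sudo su # switch to the root user
gedit ~/.bashrc # open the file; if you prefer vim, use vim ~/.bashrc
Append export PATH=/usr/local/cuda-10.0/bin:/usr/local/cuda/bin:$PATH to the end of the file
source ~/.bashrc
sudo su zhihui # switch back to the normal user (zhihui is my user name)
gedit ~/.bashrc
Append export PATH=/usr/local/cuda-10.0/bin:/usr/local/cuda/bin:$PATH to the end of the file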
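Before retrying the pycuda build, it can be worth confirming that cuda.h really exists under the CUDA prefix you put on PATH. Here is a minimal sketch of that check; the default list of candidate roots is an assumption based on the paths used in this article and should be adjusted to your install.

```python
import os


def find_cuda_header(roots=("/usr/local/cuda", "/usr/local/cuda-10.0")):
    """Return the first <root>/include/cuda.h that exists, or None."""
    for root in roots:
        candidate = os.path.join(root, "include", "cuda.h")
        if os.path.isfile(candidate):
            return candidate
    return None


header = find_cuda_header()
if header is None:
    print("cuda.h not found; check that the CUDA toolkit is installed")
else:
    print("found", header)
```

If the header is found but pycuda still fails, the PATH change has probably not been picked up by the current shell yet.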
error 3
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-build-egqxmwvg/opencv-python/setup.py", line 10, in <module>
import skbuild
ModuleNotFoundError: No module named 'skbuild'
sudo apt install cmake
pip install scikit-build
error 4
BUILDING MATPLOTLIB
matplotlib: yes [3.3.4]
python: yes [3.6.9 (default, Jan 26 2021, 15:33:00) [GCC 8.4.0]]
platform: yes [linux]
sample_data: yes [installing]
tests: no [skipping due to configuration]
macosx: no [Mac OS-X only]
----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-a54sluj1/matplotlib/
pip install ipython # resolve the error
error 5
File "/home/tx2/Downloads/0607/atr3det-on-mmdetection-master/mmdet/core/evaluation/__init__.py", line 5, in <module>
from .mean_ap import average_precision, eval_map, print_map_summary
File "/home/tx2/Downloads/0607/atr3det-on-mmdetection-master/mmdet/core/evaluation/mean_ap.py", line 6, in <module>
from terminaltables import AsciiTable
ModuleNotFoundError: No module named 'terminaltables'
pip install terminaltables -i https://mirrors.aliyun.com/pypi/simple # resolve the error
error 6
File "/home/tx2/Downloads/0607/atr3det-on-mmdetection-master/mmdet/core/mask/structures.py", line 5, in <module>
import pycocotools.mask as maskUtils
ModuleNotFoundError: No module named 'pycocotools'
resolve the error
git clone https://github.com/pdollar/coco
cd coco/PythonAPI
Installing matplotlib on Ubuntu:
Installing matplotlib is a little more involved.
Its dependencies libpng and freetype have to be installed first.
sudo apt-get install libpng-dev # install libpng
cd ~/Downloads
wget http://download.savannah.gnu.org/releases/freetype/freetype-2.4.10.tar.gz # install freetype: download, then build
tar zxvf freetype-2.4.10.tar.gz
cd freetype-2.4.10/
./configure
make
sudo make install
sudo pip install matplotlib # then install matplotlib with pip; to install pip: sudo apt-get install python-pip