—————v100———cutlass—start———————————————————————
__________web site_____________________________________________
https://developer.nvidia.com/blog/cutlass-linear-algebra-cuda/
https://github.com/NVIDIA/cutlass
README.md is a highly useful reference
__________download cutlass source code___________________________
$ git clone https://github.com/NVIDIA/cutlass.git
___________run docker__________________________________________
$ sudo docker run --name cutlass_exception -it \
-v /home/xiaoming/workspace/bitbucket/cutlass_exception:/ex \
-v /dev:/dev -v /usr/src/:/usr/src -v /lib/modules/:/lib/modules --privileged --cap-add=ALL nvidia/cuda:10.2-cudnn7-devel-ubuntu18.04 /bin/bash
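Install the CUDA driver and cuDNN first. Alternatively, work in a container created from the following Docker image, which already contains a complete CUDA environment:
nvidia/cuda:10.2-cudnn7-devel-ubuntu18.04
All of the following steps are performed inside the Docker container: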
__________compile env__________________________________________
1. Start and enter the Docker container above, then change to the cutlass source directory inside it
# cd /ex/cutlass
2. Change the Ubuntu apt source and install vim:
# mv /etc/apt/sources.list /etc/apt/sources.list.backupLL \
&& echo "deb http://mirrors.163.com/ubuntu/ bionic main restricted universe multiverse" > /etc/apt/sources.list \
&& apt-get update && apt-get upgrade \
&& apt-get install vim
3. Install cmake 3.22.2 manually // the version in the Ubuntu repository is too old to meet the requirements of cutlass's CMakeLists.txt; see:
https://blog.csdn.net/eloudy/article/details/105951149
4. Install git, which cmake uses to download the googletest source code automatically (so network access is also required)
# apt-get install git
5. Install python3
# apt-get install python3
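A quick way to confirm that steps 2 to 5 took effect before compiling is a small check script. The sketch below is not part of cutlass: version_ge, cmake_version and missing_tools are hypothetical helper names, and it assumes GNU sort with -V (present in the Ubuntu 18.04 image). The minimum cmake version is passed in by the caller, since the exact value is whatever the checked-out CMakeLists.txt declares.

```shell
#!/bin/sh
# version_ge A B: succeed when version A >= version B.
# Uses GNU `sort -V` (version sort) and checks that B sorts first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# cmake_version: print the number from the first line of `cmake --version`
# ("cmake version X.Y.Z"); fail if cmake is not on PATH.
cmake_version() {
    command -v cmake >/dev/null 2>&1 || return 1
    cmake --version | head -n 1 | awk '{print $3}'
}

# missing_tools NAME...: print each NAME that is not found on PATH,
# one per line; print nothing when everything is installed.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

Inside the container, `missing_tools git python3 cmake vim` should print nothing once the steps above are done, and `version_ge "$(cmake_version)" 3.22.2` should succeed after step 3.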
__________compile_____________________________________________
# export CUDA_INSTALL_PATH=/usr/local/cuda-10.2 \
&& export CUDACXX=${CUDA_INSTALL_PATH}/bin/nvcc && mkdir build && cd build \
&& cmake .. -DCUTLASS_NVCC_ARCHS=70 -DCUTLASS_ENABLE_CUBLAS=OFF -DCUTLASS_ENABLE_CUDNN=OFF \
&& make cutlass_profiler -j12 \
&& make test_unit -j12 \
&& make test_unit_gemm_warp -j12 \
&& ./tools/profiler/cutlass_profiler --kernels=sgemm --m=4352 --n=4 --k=4
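The same sequence can also be written as discrete steps, which makes it easier to see which stage failed and to rerun with a different architecture or job count. This is only a sketch: run, build_cutlass and the DRY_RUN variable are hypothetical names, not part of cutlass, and it assumes it is called from /ex/cutlass/build with CUDACXX already exported. The arch value 70 corresponds to the V100 (compute capability 7.0).

```shell
#!/bin/sh
# run: execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# build_cutlass ARCH JOBS: configure, build the profiler and unit tests,
# then run one sgemm profile; stops at the first failing step.
# ARCH is the compute capability without the dot (70 for V100).
build_cutlass() {
    arch="$1"
    jobs="$2"
    run cmake .. "-DCUTLASS_NVCC_ARCHS=${arch}" \
        -DCUTLASS_ENABLE_CUBLAS=OFF -DCUTLASS_ENABLE_CUDNN=OFF || return 1
    run make cutlass_profiler "-j${jobs}" || return 1
    # A bounded -j keeps the number of parallel compile jobs predictable.
    run make test_unit "-j${jobs}" || return 1
    run make test_unit_gemm_warp "-j${jobs}" || return 1
    run ./tools/profiler/cutlass_profiler --kernels=sgemm --m=4352 --n=4 --k=4
}
```

Setting DRY_RUN=1 before calling `build_cutlass 70 12` prints the five commands without executing them, which is a cheap way to review the flags before starting a long build.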
—————v100———cutlass—end—————————————————————————