Full guide to installing and configuring deep learning environments on a Linux server
Quick Guide
prepare
tools
MobaXterm (for Windows)
ssh + vscode
for Windows:
drag and drop files into MobaXterm to upload them to the server
upload archives in zip format
commands
view disk
du -d 1 -h
df -h
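To see which subdirectories take the most space, the `du` output can be sorted by size. A minimal sketch, assuming GNU `sort` (needed for `-h`); the function name `largest_dirs` is made up for illustration:

```shell
# List a directory and its immediate subdirectories, largest first.
# du -d 1 limits the depth to one level; sort -rh orders sizes such as 1.5G or 20K.
largest_dirs() {
    # $1 = directory to inspect (defaults to the current directory)
    # $2 = number of entries to show (defaults to 10)
    du -d 1 -h "${1:-.}" 2>/dev/null | sort -rh | head -n "${2:-10}"
}
```

For example, `largest_dirs ~ 5` shows the five largest entries under the home directory. The first line is normally the total for the directory itself.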
gpu and cpu usage
watch -n 1 nvidia-smi
top
view and count files
wc -l data.csv
# count how many folders
ls -lR | grep '^d' | wc -l
17
# count how many jpg files
ls -lR | grep '\.jpg$' | wc -l
1360
# list the first 10 images in each folder
ls train | head
ls test | head
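Counting by parsing `ls` output works for quick checks, but `find` can do the same job without depending on how `ls` formats its lines. A minimal sketch; the function names `count_files` and `count_dirs` are hypothetical helpers, and file names containing newlines would be miscounted:

```shell
# Count files with a given extension below a directory (recursive).
count_files() {
    # $1 = directory, $2 = extension without the dot, e.g. jpg
    find "$1" -type f -name "*.$2" | wc -l | tr -d ' '
}

# Count all subdirectories below a directory; the directory itself is excluded.
count_dirs() {
    # $1 = directory
    find "$1" -mindepth 1 -type d | wc -l | tr -d ' '
}
```

For example, `count_files . jpg` and `count_dirs .` give the two counts for the current directory. Unlike `ls -lR`, `find` also includes hidden directories.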
link datasets
# link
ln -s src dest
ln -s /data_1/kezunlin/datasets/ dl4cv/datasets
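`ln -s` creates the link even when the source path does not exist, which leaves a dangling link. A minimal sketch that checks the source first and prints where the new link points; the function name `link_dataset` is hypothetical:

```shell
# Create a symbolic link only when the source directory exists,
# then print the path the new link points to.
link_dataset() {
    # $1 = existing dataset directory, $2 = link path to create
    if [ ! -d "$1" ]; then
        echo "source directory not found: $1" >&2
        return 1
    fi
    ln -s "$1" "$2" && readlink "$2"
}
```

For example, `link_dataset /data_1/kezunlin/datasets dl4cv/datasets`. An absolute source path is the safer choice, because a relative one is resolved relative to the location of the link, not the current directory.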
scp
scp -r node17:~/dl4cv ~/git/
scp -r node17:~/.keras ~/
tmux for background tasks
tmux new -s notebook
tmux ls
tmux attach -t notebook
tmux detach
wget download
# continue an interrupted download
wget -c url
# download a large file in the background
wget -b -c url
tail -f wget-log
# kill background wget
pkill -9 wget
tips for training a large model
terminal 1:
tmux new -s train
conda activate keras
time python train_alexnet.py
terminal 2:
tmux detach
tmux attach -t train
Once the training runs inside a tmux session, it is safe to close vscode. Without tmux, the training process started from the vscode terminal exits when vscode is closed.
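The same idea can be scripted: start the job in a detached tmux session, so nothing depends on the terminal that launched it. A minimal sketch; the function name `start_in_tmux` is hypothetical, and it only uses `tmux has-session` and `tmux new-session -d`:

```shell
# Start a command in a detached tmux session unless a session
# with that name already exists.
start_in_tmux() {
    # $1 = session name, $2 = command line to run inside the session
    if ! command -v tmux >/dev/null 2>&1; then
        echo "tmux is not installed" >&2
        return 127
    fi
    if tmux has-session -t "$1" 2>/dev/null; then
        echo "session $1 already exists; attach with: tmux attach -t $1"
        return 0
    fi
    # -d keeps the new session detached from the current terminal
    tmux new-session -d -s "$1" "$2"
}
```

For example, `start_in_tmux train 'python train_alexnet.py 2>&1 | tee train.log'`. The session ends when the command finishes, so writing the output to a log file keeps it available afterwards. Activating the conda environment is left out of this sketch.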
cuda driver and toolkits
The cudatoolkit version you can use depends on the installed NVIDIA driver version.
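This dependency can be checked with a small version comparison. A minimal sketch, assuming GNU `sort -V` is available; the minimum driver version 418.39 for CUDA 10.1 on Linux is an assumption taken from memory of the NVIDIA release notes and should be verified there, and the function names are made up for illustration:

```shell
# Return success when version $1 is greater than or equal to version $2.
# sort -V compares dotted version strings field by field.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumed minimum Linux driver for CUDA 10.1; verify against the release notes.
MIN_DRIVER_CUDA_10_1="418.39"

check_driver_for_cuda_10_1() {
    # $1 = installed driver version, e.g. 418.87.00
    if version_ge "$1" "$MIN_DRIVER_CUDA_10_1"; then
        echo "driver $1 is new enough for CUDA 10.1"
    else
        echo "driver $1 is too old for CUDA 10.1 (needs >= $MIN_DRIVER_CUDA_10_1)"
        return 1
    fi
}
```

For example, `check_driver_for_cuda_10_1 418.87.00` succeeds, while `check_driver_for_cuda_10_1 384.130` reports that the driver is too old.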
install nvidia-drivers
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update
sudo apt-cache search 'nvidia-*'
# nvidia-384
# nvidia-396
sudo apt-get -y install nvidia-418
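# test: the error below appears when the loaded kernel module and the installed driver libraries have different versions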
nvidia-smi
Failed to initialize NVML: Driver/library version mismatch
install cuda-toolkit (drivers)
remove all previous nvidia drivers
sudo apt-get -y purge 'nvidia-*'
download the cuda_10.1 runfile installer from NVIDIA
wget -b -c http://developer.download.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda_10.1.243_418.87.00_linux.run
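# run the installer with either of the following commands
sudo sh cuda_10.1.243_418.87.00_linux.run
chmod +x cuda_10.1.243_418.87.00_linux.run && sudo ./cuda_10.1.243_418.87.00_linux.run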
vim ~/.bashrc
# for cuda and cudnn
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
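Each new shell that sources `.bashrc` from an already configured shell prepends the same directories again, and an empty `LD_LIBRARY_PATH` leaves a trailing `:` behind. A minimal sketch of a guard against both; the helper name `prepend_once` is hypothetical:

```shell
# Prepend a directory to a colon-separated list only if it is not already in it.
prepend_once() {
    # $1 = directory to add, $2 = current value of the variable (may be empty)
    case ":$2:" in
        *":$1:"*) printf '%s\n' "$2" ;;          # already present, keep unchanged
        *) printf '%s\n' "$1${2:+:$2}" ;;        # add; no trailing ':' when $2 is empty
    esac
}

# for cuda and cudnn
export PATH="$(prepend_once /usr/local/cuda/bin "$PATH")"
export LD_LIBRARY_PATH="$(prepend_once /usr/local/cuda/lib64 "${LD_LIBRARY_PATH:-}")"
```

Sourcing the file repeatedly then leaves both variables unchanged after the first time.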
check cuda driver version
> cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 418.87.00 Thu Aug 8 15:35:46 CDT 2019
GCC version: gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.11)
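For scripts, the driver version can be pulled out of this file instead of being read by eye. A minimal sketch, assuming the `NVRM version:` line keeps the layout shown above, with the version right after the word `Module`; the function name `nvrm_version` is hypothetical:

```shell
# Extract the driver version from the text of /proc/driver/nvidia/version.
# Reads standard input and prints the field that follows the word "Module".
nvrm_version() {
    awk '/^NVRM version:/ {
        for (i = 1; i < NF; i++) if ($i == "Module") { print $(i + 1); exit }
    }'
}
```

Usage: `nvrm_version < /proc/driver/nvidia/version`. Nothing is printed when the line is missing or has a different layout.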
> nvidia-smi
Tue Aug 27 17:36:35 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.87.00 Driver Version: 418.87.00 CUDA Version: 10.1 |
|-------------------------------+----------------------+----------------------+
> nvidia-smi -L
GPU