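Deep Learning Training Tool
The Deep Learning Tool (DLT) trains deep learning models that detect targets such as knife switches, meters, and relay pressure plates.
Runtime Environment
1. Hardware
- RAM: > 16 GB
- GPU memory: > 9 GB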
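A quick way to compare a machine against these thresholds is sketched below. It assumes a Linux host that exposes /proc/meminfo; the helper name exceeds_gb is made up for this example, and the GPU query only runs if nvidia-smi is already installed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when a size in kB is strictly greater than a threshold in GB.
exceeds_gb() {
    local kb=$1 min_gb=$2
    [ "$kb" -gt $(( min_gb * 1024 * 1024 )) ]
}

# Total RAM in kB, as reported by the kernel.
if [ -r /proc/meminfo ]; then
    ram_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    if exceeds_gb "$ram_kb" 16; then
        echo "RAM: OK"
    else
        echo "RAM: not above the 16 GB requirement"
    fi
fi

# GPU name and total memory; compare the reported value against the 9 GB requirement by eye.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,memory.total --format=csv,noheader \
        || echo "nvidia-smi is installed but could not query the GPU"
fi
```

The kernel reports slightly less than the physically installed RAM, so a machine with exactly 16 GB of memory modules fails the strict comparison, which matches the "> 16 GB" wording.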
2. Software
- OS: Ubuntu 16.04 (x86_64)
- CUDA
- cuDNN
- caffe
- python2.7
- pyqt4
- sip
- java
- apache tomcat8
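Software Installation
I. Install Ubuntu 16.04 and the graphics driver
Installation guides for Ubuntu 16.04 are widely available online, so they are not repeated here.
The NVIDIA driver installation proceeds roughly as follows: switch to a text console --> stop the lightdm desktop service --> install the driver (disable nouveau, then install the NVIDIA driver) --> start the lightdm desktop service --> reboot into the BIOS and turn off Secure Boot --> reboot.
The most important step is turning off Secure Boot in the BIOS. If it is skipped, the driver will not take effect.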
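- Once Ubuntu is installed, it boots into the X desktop. Use a USB drive to copy the files downloaded in advance (driver, CUDA installer, cuDNN archive, and so on) onto the machine.
- Press Ctrl + Alt + F1 to switch to a text console. Ubuntu has a text console mode and an X desktop mode, and the driver must be installed from the text console.
- Stop the X desktop service by entering:
sudo service lightdm stop
This shuts down the desktop service, so the desktop is unavailable until the service is started again (a reboot also restarts it).
- Disable the nouveau driver. Ubuntu uses its bundled nouveau driver by default, and it must be disabled before the NVIDIA driver is installed. Adding nouveau to the blacklist in blacklist.conf stops Linux from loading it at boot.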
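Disabling nouveau must be done from the tty text console.
blacklist.conf is not writable by ordinary users, so change its permissions first.
Check the current permissions:
ll /etc/modprobe.d/blacklist.conf
Make the file writable:
sudo chmod 666 /etc/modprobe.d/blacklist.conf
Open it in vi:
sudo vi /etc/modprobe.d/blacklist.conf
Append the following two lines at the end of the file:
blacklist nouveau
options nouveau modeset=0
After saving the file, restore its permissions:
sudo chmod 644 /etc/modprobe.d/blacklist.conf
Regenerate the initramfs so that the blacklist applies at boot:
sudo update-initramfs -u
Reboot for the change to take effect.
After rebooting, use lsmod to confirm that nouveau is no longer loaded: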
lsmod | grep nouveau
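If the command prints nothing, nouveau has been disabled.
- Confirm that the NVIDIA driver version is compatible with the CUDA version you plan to install.
- Add the PPA and install the graphics driver from it. Do not download the driver from the NVIDIA website; installing from the PPA is sufficient: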
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update
sudo apt-get install nvidia-driver-415
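This machine has a GTX 1080 Ti. As of December 17, 2018, the latest stable driver for this card was 415.23, so the nvidia-415 driver is installed.
During installation, accept the prompts that appear. If a prompt asks whether to disable Secure Boot, choose OK. You will then be asked to set a password; enter one (the original setup used 12345678) and type it again to confirm. Once installation succeeds, running the install command again reports that the driver is already installed.
At this point, however, nvidia-smi still fails with a "command not found" message, as if the driver were not installed. This is expected, because Secure Boot has not yet been turned off in the BIOS, so the message can be ignored for now.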
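The nouveau blacklisting step earlier in this section can also be scripted rather than done by hand in vi. The sketch below is an alternative to the chmod-and-edit procedure, not the method used in this guide; the function name add_nouveau_blacklist is made up for the example. It appends the two lines only when they are missing, so running it twice is harmless.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append the two nouveau blacklist lines to a modprobe
# config file, skipping any line that is already present.
add_nouveau_blacklist() {
    local conf=$1 line
    for line in "blacklist nouveau" "options nouveau modeset=0"; do
        grep -qxF "$line" "$conf" 2>/dev/null || echo "$line" >> "$conf"
    done
}

# Dry run on a scratch file to see what would be written.
scratch=$(mktemp)
add_nouveau_blacklist "$scratch"
add_nouveau_blacklist "$scratch"   # the second call adds nothing
cat "$scratch"
rm -f "$scratch"

# To apply it to the real file (root required), then rebuild the initramfs:
#   sudo bash -c "$(declare -f add_nouveau_blacklist); add_nouveau_blacklist /etc/modprobe.d/blacklist.conf"
#   sudo update-initramfs -u
```

Because the function runs under sudo, the file permissions do not need to be loosened and restored around the edit.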
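II. Install CUDA
CUDA performs the computation during training. Download the runfile (.run file) for the required version from the CUDA download page and place it in the HOME directory.
I use CUDA 9.1 and install it directly from the tty text console: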
sudo sh cuda_9.1.85_387.26_linux.run
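The installer asks for several confirmations. First accept the license terms by typing accept, then answer Y when asked whether to install the CUDA toolkit, the CUDA samples, and so on. When asked whether to install the bundled driver, however, answer N. The latest driver was already installed in Part I, and the bundled driver is older and causes problems.
Wait for the installation to finish. You can test the CUDA installation right away or after a reboot. After rebooting, open a terminal and run ls /dev/nvidia*. If it lists 4 to 5 device files whose names begin with nvidia, the installation succeeded.
The graphics driver and CUDA 9.1 are now installed. Once the driver is loaded, nvidia-smi shows the driver version and other GPU information.
Finally, configure the environment variables. This step matters: without it, the shell cannot find nvcc and programs cannot locate the CUDA libraries.
Run sudo vi /etc/profile and add the following two lines at the bottom: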
export PATH=/usr/local/cuda-9.1/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-9.1/lib64:$LD_LIBRARY_PATH
Press Esc, type :wq, and press Enter to save and quit.
Finally, run source /etc/profile to apply the settings.
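Sourcing /etc/profile repeatedly prepends the same directories again each time, and an empty LD_LIBRARY_PATH leaves a trailing colon behind. A minimal sketch that avoids both is shown below; the helper prepend_path and the variable CUDA_HOME are names chosen for this example, and the install location /usr/local/cuda-9.1 is taken from the exports above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prepend a directory to a colon-separated list
# unless the list already contains it.
prepend_path() {
    local dir=$1 list=$2
    case ":$list:" in
        *":$dir:"*) printf '%s\n' "$list" ;;
        *)          printf '%s\n' "${dir}${list:+:$list}" ;;
    esac
}

CUDA_HOME=/usr/local/cuda-9.1   # assumed install prefix, as in the exports above
export PATH=$(prepend_path "$CUDA_HOME/bin" "$PATH")
export LD_LIBRARY_PATH=$(prepend_path "$CUDA_HOME/lib64" "${LD_LIBRARY_PATH:-}")
```

These lines could replace the two plain export lines in /etc/profile; the effect on a fresh login is the same.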
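III. Install cuDNN
cuDNN is NVIDIA's library for accelerating deep learning and is installed right after CUDA. Download it on another computer beforehand, copy it to the HOME directory of this machine, and extract it: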
tar -xzf cudnn-9.1-linux-x64-v7.1.tgz
cd cuda
sudo cp lib64/* /usr/local/cuda/lib64/
sudo cp include/* /usr/local/cuda/include/
cuDNN is now installed. Start the lightdm service to bring the desktop back:
sudo service lightdm start
If you have not rebooted yet, reboot now and continue with the next step: disabling Secure Boot.
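To confirm that the copy placed the expected cuDNN release under the CUDA directory, the version macros in the installed header can be read back. This is a sketch under two assumptions: the header sits at /usr/local/cuda/include/cudnn.h (the copy destination used above), and it defines CUDNN_MAJOR, CUDNN_MINOR, and CUDNN_PATCHLEVEL as cuDNN 7 headers do. The function name cudnn_version is made up for the example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print major.minor.patch from the version macros in a cudnn.h file.
cudnn_version() {
    local header=$1
    awk '$1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
         $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
         $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
         END { print major "." minor "." patch }' "$header"
}

header=/usr/local/cuda/include/cudnn.h
if [ -r "$header" ]; then
    echo "cuDNN $(cudnn_version "$header")"
else
    echo "cudnn.h not found under /usr/local/cuda/include"
fi
```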
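IV. Disable Secure Boot
An important point about Ubuntu 16.04: to use a third-party graphics driver (NVIDIA's is one), Secure Boot must be disabled in the BIOS; otherwise the driver will not load.
After finishing Part III, run sudo reboot. As the machine restarts, press F2 or DEL to enter the BIOS. The steps below use an ASUS X99-E WS motherboard as the example; for other boards, follow their own procedure for entering the BIOS.
- In the BIOS, open the Boot menu and scroll down to "Secure Boot".
- In the Secure Boot screen, move the cursor to "OS Type" and set it to "Other OS". Then open "Key Management".
- Select "Clear Secure Boot Keys" to delete the Secure Boot keys; once they are deleted, Secure Boot is disabled. Choose Yes or Confirm to confirm the deletion.
- Press F10 to save the settings and reboot.
With Secure Boot disabled, the machine boots back into the X desktop. Press Ctrl + Alt + T to open a terminal and run nvidia-smi to see the driver information.
You can also run nvcc -V to check the CUDA version.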
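The two checks above can be combined into one script that compares the reported CUDA release with the one installed in Part II. This is a sketch: the helper name cuda_release is made up, and it assumes nvcc -V prints a line of the form "Cuda compilation tools, release 9.1, V9.1.85".

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `nvcc -V` output on stdin and print the release number.
cuda_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

expected=9.1   # the CUDA version installed in Part II

if command -v nvcc >/dev/null 2>&1; then
    found=$(nvcc -V | cuda_release)
    if [ "$found" = "$expected" ]; then
        echo "CUDA $found: OK"
    else
        echo "CUDA ${found:-unknown} found, expected $expected"
    fi
else
    echo "nvcc is not on PATH; check the PATH line added to /etc/profile"
fi

if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=driver_version --format=csv,noheader \
        || echo "nvidia-smi could not reach the driver; check Secure Boot"
else
    echo "nvidia-smi not found; the driver is not installed or not on PATH"
fi
```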
V. Install opencv-3.4.1
This machine uses OpenCV 3.4.1. Download the source archive opencv-3.4.1.zip from the OpenCV website, save it in $HOME, and extract it:
unzip opencv-3.4.1.zip
Enter the opencv-3.4.1 directory:
cd opencv-3.4.1
Install the dependencies:
sudo apt-get install cmake
sudo apt-get install build-essential libgtk2.0-dev libavcodec-dev libavformat-dev
sudo apt-get install libjpeg-dev libtiff5-dev libswscale-dev libjasper-dev
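Make sure every dependency installs successfully. Then create a build directory and enter it:
mkdir release
cd release
Run cmake:
cmake -D CMAKE_BUILD_TYPE=Release -D CMAKE_INSTALL_PREFIX=/usr/local …
Note: if errors from an earlier attempt still appear even though you are building in a new directory, delete CMakeCache.txt and run cmake again.
cmake may download a file along the way; wait a moment for it to finish.
When cmake finishes, start the build: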
make -j64
When the build finishes, install:
sudo make install
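The -j64 in the build command above starts 64 parallel jobs, which suits a machine with many cores but can exhaust memory on a smaller one. A minimal sketch that derives the job count from the number of available cores follows; build_jobs is a name made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the number of online CPU cores, falling back to 1.
build_jobs() {
    local n
    n=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
    echo "$n"
}

echo "This machine would build with: make -j$(build_jobs)"
# In the release directory, the build step then becomes:
#   make -j"$(build_jobs)"
```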
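This completes the OpenCV build and installation. Next, configure the environment so that the system can find OpenCV. First add the OpenCV library directory to the linker search path:
sudo gedit /etc/ld.so.conf.d/opencv.conf
The file that opens may be empty. Add the following line at the end:
/usr/local/lib
Save and close the file, then run the following command to apply the new path:
sudo ldconfig
Next, configure the bash environment:
sudo gedit /etc/bash.bashrc
Add the following at the end of the file:
PKG_CONFIG_PATH=$PKG_CONFIG_PATH:/usr/local/lib/pkgconfig
export PKG_CONFIG_PATH
Save the file and apply the change:
source /etc/bash.bashrc
Update the locate database:
sudo updatedb
OpenCV is now fully configured. To test it, run one of the OpenCV sample programs, which opens the machine's camera.
First enter the samples/cpp/example_cmake directory of the OpenCV source tree: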
cd opencv-3.4.1/samples/cpp/example_cmake
Then run the following commands in order:
cmake .
make
./opencv_example
The camera window opens with "hello opencv" in the top-left corner, which means the test succeeded.
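On a machine without a camera, the installation can still be checked through pkg-config. The sketch below assumes the OpenCV 3.x install placed an opencv.pc file in /usr/local/lib/pkgconfig, so that the pkg-config module is named opencv; the function name opencv_status is made up for the example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether pkg-config can see the installed OpenCV.
opencv_status() {
    if pkg-config --exists opencv 2>/dev/null; then
        echo "found OpenCV $(pkg-config --modversion opencv)"
    else
        echo "opencv.pc not found; check PKG_CONFIG_PATH and that 'sudo make install' completed"
    fi
}

opencv_status
```

If the module is found, pkg-config --cflags --libs opencv prints the flags a C++ program needs to compile and link against this installation.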
VI. Install caffe
- Install the dependencies
sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev
sudo apt-get install libopencv-dev libhdf5-serial-dev protobuf-compiler
sudo apt-get install --no-install-recommends libboost-all-dev
sudo apt-get install libatlas-base-dev
sudo apt-get install libopenblas-dev
sudo apt-get install libgflags-dev libgoogle-glog-dev liblmdb-dev
sudo apt-get install python-skimage python-protobuf
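Make sure every dependency is installed before moving on. If an installation fails, search online for the error message to resolve it.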
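Checking each dependency by hand is tedious, so the sketch below asks dpkg which of the packages listed above are actually installed. The function name is_installed is made up for the example, and the check treats virtual or transitional package names as missing when dpkg has no installed entry under that exact name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when dpkg reports the package as fully installed.
is_installed() {
    dpkg-query -W -f='${Status}' "$1" 2>/dev/null | grep -q "install ok installed"
}

missing=""
for pkg in libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev \
           libhdf5-serial-dev protobuf-compiler libboost-all-dev \
           libatlas-base-dev libopenblas-dev libgflags-dev \
           libgoogle-glog-dev liblmdb-dev python-skimage python-protobuf; do
    is_installed "$pkg" || missing="$missing $pkg"
done

if [ -n "$missing" ]; then
    echo "Missing packages:$missing"
else
    echo "All listed packages are installed"
fi
```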
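- Install caffe
The overall process is: open a terminal in $HOME, download caffe --> create and edit Makefile.config --> edit the Makefile --> build caffe --> configure environment variables.
Download caffe: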
git clone https://github.com/weiliu89/caffe.git
Create Makefile.config:
cd caffe
cp Makefile.config.example Makefile.config
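I use the Python that ships with Ubuntu and build with GPU acceleration. If your requirements are the same, open Makefile.config and replace its contents with the file below.
Makefile.config: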
## Refer to http://caffe.berkeleyvision.org/installation.html
# Contributions simplifying and improving our build system are welcome!
# cuDNN acceleration switch (uncomment to build with cuDNN).
# USE_CUDNN := 1
# CPU-only switch (uncomment to build without GPU support).
# CPU_ONLY := 1
# uncomment to disable IO dependencies and corresponding data layers
# USE_OPENCV := 0
# USE_LEVELDB := 0
# USE_LMDB := 0
# uncomment to allow MDB_NOLOCK when reading LMDB files (only if necessary)
# You should not set this flag if you will be reading LMDBs with any
# possibility of simultaneous read and write
# ALLOW_LMDB_NOLOCK := 1
# Uncomment if you're using OpenCV 3
OPENCV_VERSION := 3
# To customize your choice of compiler, uncomment and set the following.
# N.B. the default for Linux is g++ and the default for OSX is clang++
# CUSTOM_CXX := g++
# CUDA directory contains bin/ and lib/ directories that we need.
CUDA_DIR := /usr/local/cuda
# On Ubuntu 14.04, if cuda tools are installed via
# "sudo apt-get install nvidia-cuda-toolkit" then use this instead:
# CUDA_DIR := /usr
# CUDA architecture setting: going with all of them.
# For CUDA < 6.0, comment the *_50 through *_61 lines for compatibility.
# For CUDA < 8.0, comment the *_60 and *_61 lines for compatibility.
CUDA_ARCH := -gencode arch=compute_30,code=sm_30 \
-gencode arch=compute_35,code=sm_35 \
-gencode arch=compute_50,code=sm_50 \
-gencode arch=compute_52,code=sm_52 \
-gencode arch=compute_60,code=sm_60 \
-gencode arch=compute_61,code=sm_61 \
-gencode arch=compute_61,code=compute_61
# BLAS choice:
# atlas for ATLAS (default)
# mkl for MKL
# open for OpenBlas
BLAS := atlas
# Custom (MKL/ATLAS/OpenBLAS) include and lib directories.
# Leave commented to accept the defaults for your choice of BLAS
# (which should work)!
# BLAS_INCLUDE := /path/to/your/blas
# BLAS_LIB := /path/to/your/blas
# Homebrew puts openblas in a directory that is not on the standard search path
# BLAS_INCLUDE := $(shell brew --prefix openblas)/include
# BLAS_LIB := $(shell brew --prefix openblas)/lib
# This is required only if you will compile the matlab interface.
# MATLAB directory should contain the mex binary in /bin.
# MATLAB_DIR := /usr/local
# MATLAB_DIR := /Applications/MATLAB_R2012b.app
# NOTE: this is required only if you will compile the python interface.
# We need to be able to find Python.h and numpy/arrayobject.h.
PYTHON_INCLUDE := /usr/include/python2.7 \
/usr/local/lib/python2.7/dist-packages/numpy/core/include
# Anaconda Python distribution is quite popular. Include path:
# Verify anaconda location, sometimes it's in root.
# ANACONDA_HOME := $(HOME)/anaconda
# PYTHON_INCLUDE := $(ANACONDA_HOME)/include \
# $(ANACONDA_HOME)/include/python2.7 \
# $(ANACONDA_HOME)/lib/python2.7/site-packages/numpy/core/include
# Uncomment to use Python 3 (default is Python 2)
# PYTHON_LIBRARIES := boost_python3 python3.5m
# PYTHON_INCLUDE := /usr/include/python3.5m \
# /usr/lib/python3.5/dist-packages/numpy/core/include
# We need to be able to find libpythonX.X.so or .dylib.
PYTHON_LIB := /usr/lib
# PYTHON_LIB := $(ANACONDA_HOME)/lib
# Homebrew installs numpy in a non standard path (keg only)
# PYTHON_INCLUDE += $(dir $(shell python -c 'import numpy.core; print(numpy.core.__file__)'))/include
# PYTHON_LIB += $(shell brew --prefix numpy)/lib
# Uncomment to support layers written in Python (will link against Python libs)
WITH_PYTHON_LAYER := 1
# Whatever else you find you need goes here.
#INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include
#LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu /usr/lib/x86_64-linux-gnu/hdf5/serial
# If Homebrew is installed at a non standard location (for example your home directory) and you use it for general dependencies
# INCLUDE_DIRS += $(shell brew --prefix)/include
# LIBRARY_DIRS += $(shell brew --prefix)/lib
# NCCL acceleration switch (uncomment to build with NCCL)
# https://github.com/NVIDIA/nccl (last tested version: v1.2.3-1+cuda8.0)
# USE_NCCL := 1
# Uncomment to use `pkg-config` to specify OpenCV library paths.
# (Usually not necessary -- OpenCV libraries are normally installed in one of the above $LIBRARY_DIRS.)
# USE_PKG_CONFIG := 1
# N.B. both build and distribute dirs are cleared on `make clean`
BUILD_DIR := build
DISTRIBUTE_DIR := distribute
# Uncomment for debugging. Does not work on OSX due to https://github.com/BVLC/caffe/issues/171
# DEBUG := 1
# The ID of the GPU that 'make runtest' will use to run unit tests.
TEST_GPUID := 0
# enable pretty build (comment to see full commands)
Q ?= @
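A mistyped directory in INCLUDE_DIRS or LIBRARY_DIRS usually surfaces only as a missing header or link error well into the build. The sketch below lists the directories named on those lines that do not exist on the machine. It is a rough check, not part of caffe: the function name missing_dirs is made up, and it only reads single-line := assignments, ignoring commented lines, += additions, and backslash continuations.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each directory on the active "VAR :=" line of a
# Makefile.config that does not exist. Make variables such as $(PYTHON_LIB) are skipped.
missing_dirs() {
    local config=$1 var=$2 dir
    grep -E "^${var}[[:space:]]*:=" "$config" | sed 's/^[^=]*=//' | tr ' ' '\n' |
    while read -r dir; do
        case $dir in
            ''|'$('*) continue ;;
        esac
        [ -d "$dir" ] || echo "$dir"
    done
}

# Run from the caffe directory after editing Makefile.config.
if [ -f Makefile.config ]; then
    missing_dirs Makefile.config INCLUDE_DIRS
    missing_dirs Makefile.config LIBRARY_DIRS
fi
```

No output means every listed directory exists. A directory that is printed is not necessarily fatal, but the hdf5 entries in particular must exist for the build to find the HDF5 headers and libraries.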
Likewise, open the Makefile and replace its contents with the Makefile below.
Makefile:
PROJECT := caffe
CONFIG_FILE := Makefile.config
# Explicitly check for the config file, otherwise make -k will proceed anyway.
ifeq ($(wildcard $(CONFIG_FILE)),)
$(error $(CONFIG_FILE) not found. See $(CONFIG_FILE).example.)
endif
include $(CONFIG_FILE)
BUILD_DIR_LINK := $(BUILD_DIR)
ifeq ($(RELEASE_BUILD_DIR),)
RELEASE_BUILD_DIR := .$(BUILD_DIR)_release
endif
ifeq ($(DEBUG_BUILD_DIR),)
DEBUG_BUILD_DIR := .$(BUILD_DIR)_debug
endif
DEBUG ?= 0
ifeq ($(DEBUG), 1)
BUILD_DIR := $(DEBUG_BUILD_DIR)
OTHER_BUILD_DIR := $(RELEASE_BUILD_DIR)
else
BUILD_DIR := $(RELEASE_BUILD_DIR)
OTHER_BUILD_DIR := $(DEBUG_BUILD_DIR)
endif
# All of the directories containing code.
SRC_DIRS := $(shell find * -type d -exec bash -c "find {} -maxdepth 1 \
\( -name '*.cpp' -o -name '*.proto' \) | grep -q ." \; -print)
# The target shared library name
LIBRARY_NAME := $(PROJECT)
LIB_BUILD_DIR := $(BUILD_DIR)/lib
STATIC_NAME := $(LIB_BUILD_DIR)/lib$(LIBRARY_NAME).a
DYNAMIC_VERSION_MAJOR := 1
DYNAMIC_VERSION_MINOR := 0
DYNAMIC_VERSION_REVISION := 0
DYNAMIC_NAME_SHORT := lib$(LIBRARY_NAME).so
#DYNAMIC_SONAME_SHORT := $(DYNAMIC_NAME_SHORT).$(DYNAMIC_VERSION_MAJOR)
DYNAMIC_VERSIONED_NAME_SHORT := $(DYNAMIC_NAME_SHORT).$(DYNAMIC_VERSION_MAJOR).$(DYNAMIC_VERSION_MINOR).$(DYNAMIC_VERSION_REVISION)
DYNAMIC_NAME := $(LIB_BUILD_DIR)/$(DYNAMIC_VERSIONED_NAME_SHORT)
COMMON_FLAGS += -DCAFFE_VERSION=$(DYNAMIC_VERSION_MAJOR).$(DYNAMIC_VERSION_MINOR).$(DYNAMIC_VERSION_REVISION)
##############################
# Get all source files
##############################
# CXX_SRCS are the source files excluding the test ones.
CXX_SRCS := $(shell find src/$(PROJECT) ! -name "test_*.cpp" -name "*.cpp")
# CU_SRCS are the cuda source files
CU_SRCS := $(shell find src/$(PROJECT) ! -name "test_*.cu" -name "*.cu")
# TEST_SRCS are the test source files
TEST_MAIN_SRC := src/$(PROJECT)/test/test_caffe_main.cpp
TEST_SRCS := $(shell find src/