Caffe + Ubuntu 15.04 + CUDA 7.5: Installing, Configuring, Uninstalling, and Reinstalling on a Server (Tested and Working)


博瓦

Article link: https://blog.csdn.net/u010925447/article/details/76832850


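This post introduces TensorFlow, Google's open-source machine learning library, and covers installing it and building it from source on UbuntuKylin.
The original official documentation is at http://www.tensorflow.org.

The machine used here is configured as follows: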

3.19.0-15-generic #15-Ubuntu x86_64 GNU/Linux
NVIDIA Corporation GK110BGL [Tesla K40c]
NVIDIA Corporation GK110GL [Quadro K5200]
Python 2.7
Cuda toolkit = 7.5 
cuDNN = v5 (for CUDA 7.5)
gcc = 4.9
g++ = 4.9  
Bazel = 0.4.4
  

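Recommended TensorFlow learning resources

TensorFlow introductory tutorial in Chinese (with video)
TensorFlow introductory video tutorial (interactive)

TensorFlow Chinese community

TensorFlow official documentation, Chinese edition

Applications of TensorFlow in image recognition

This post continues from an existing Caffe installation and adds TensorFlow on top of it. For the CUDA and cuDNN installation steps referred to below, see "Caffe + Ubuntu 15.04 + CUDA 7.5: Installing, Configuring, Uninstalling, and Reinstalling on a Server (Tested and Working)".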

Requirements for installing TensorFlow

 Python 2.7 or Python 3.3+
 Cuda toolkit >= 7.0
 cuDNN >= v3
 gcc > 4.8
 g++ > 4.8
 Bazel >= 0.4.2
  
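The check behind this list can be sketched in a few lines of Python. This is only a minimal sketch and not part of any TensorFlow tooling: the helper names parse_version and meets_minimum are hypothetical, the installed versions are the ones reported for the machine described above, and the Bazel bound is treated as inclusive ("0.4.2 or higher").

```python
import re

# Minimum versions taken from the requirements list (all treated as inclusive).
MINIMUMS = {"cuda": "7.0", "cudnn": "3", "bazel": "0.4.2"}


def parse_version(text):
    """Turn a string such as '0.4.4' or 'v5' into a tuple of ints."""
    return tuple(int(part) for part in re.findall(r"\d+", text))


def meets_minimum(installed, minimum):
    """Return True if `installed` is at least `minimum`."""
    a, b = parse_version(installed), parse_version(minimum)
    # Pad the shorter tuple with zeros so '7.5' and '7.5.0' compare equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# Versions reported for the machine described in this post.
installed = {"cuda": "7.5", "cudnn": "v5", "bazel": "0.4.4"}
for tool, minimum in sorted(MINIMUMS.items()):
    status = "ok" if meets_minimum(installed[tool], minimum) else "too old"
    print(tool, installed[tool], status)
```

Comparing tuples of integers rather than strings avoids the classic mistake of ranking "0.4.10" below "0.4.2".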

I. Installing dependencies

1. Install the packages needed for the TensorFlow Python API

sudo apt-get install python-pip python-dev
sudo apt-get install python-numpy swig python-dev
sudo apt-get install git
  

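2. Install Bazel

TensorFlow Serving requires Bazel 0.4.2 or higher; see the official Bazel site for installation instructions.

OpenJDK is the open-source, GPL-licensed implementation of the Java platform, and it has been more than six years since Sun officially released it. From that moment on, the Java community has been working to adapt to this new open-source code base. [1]
OpenJDK developed rapidly in 2013, and the IT magazine SD Times included it in the 2013 SD Times 100, ranking it ninth in the "greatest influence" category.

http://www.infoq.com/cn/news/2015/03/google-open-source-bazel
Google recently open-sourced Bazel, the build tool it uses internally.
Bazel is a Make-like tool that Google tailored to the way it develops software internally, and Google now uses it to build most of its internal software. Its main features are:
Multi-language support: Bazel supports Java, Objective-C, and C++ out of the box and can be extended to any other programming language.

High-level build description language: a project is described in a language called BUILD, a concise text format that treats the project as a set of interrelated libraries, binaries, and tests. Tools such as Make, by contrast, require you to describe how the compiler is invoked for each file.

Multi-platform support: the same tool and the same BUILD files can build software for different architectures and even different platforms. At Google, Bazel is used both for server applications running in data centers and for mobile apps running on phones.

Reproducibility: in a BUILD file, every library, test, and binary must declare its dependencies explicitly. When a source file changes, Bazel uses these dependencies to decide which parts need to be rebuilt and which tasks can run in parallel. This means every build is incremental, and the same build always produces the same result.

Scalability: Bazel can handle large projects. At Google it is common for a server binary to have 100,000 lines of code, and rebuilding such a project when nothing has changed takes about 200 milliseconds.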
  
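The way declared dependencies drive incremental rebuilds can be illustrated with a toy model in Python. This is a sketch of the idea only, not how Bazel is implemented, and the target and file names (app, libnet, libutil, tests, net.cc, and so on) are invented for the example.

```python
from collections import deque

# Hypothetical targets: each lists the targets or files it depends on directly.
DEPS = {
    "app": ["libnet", "libutil"],
    "libnet": ["libutil", "net.cc"],
    "libutil": ["util.cc"],
    "tests": ["libutil", "tests.cc"],
}


def targets_to_rebuild(deps, changed):
    """Return every target that depends, directly or indirectly, on `changed`."""
    # Invert the graph: for each node, which targets use it?
    users = {}
    for target, inputs in deps.items():
        for item in inputs:
            users.setdefault(item, []).append(target)
    # Breadth-first walk from the changed node through its users.
    dirty, queue = set(), deque([changed])
    while queue:
        node = queue.popleft()
        for user in users.get(node, []):
            if user not in dirty:
                dirty.add(user)
                queue.append(user)
    return dirty


print(sorted(targets_to_rebuild(DEPS, "net.cc")))   # ['app', 'libnet']
print(sorted(targets_to_rebuild(DEPS, "util.cc")))  # ['app', 'libnet', 'libutil', 'tests']
```

Because tests does not depend on net.cc, editing that file leaves tests untouched, which is the essence of an incremental build.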

Installing JDK 8 (required)

sudo apt-get install openjdk-8-jdk openjdk-8-source
sudo apt-get install pkg-config zip g++ zlib1g-dev unzip
sudo add-apt-repository ppa:webupd8team/java  # add the repository
sudo apt-get update   # refresh the package list
sudo apt-get install oracle-java8-installer # install JDK 8
java -version      # verify the installation
  

2.1 Installing Bazel: method 1

echo "deb http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
curl https://storage.googleapis.com/bazel-apt/doc/apt-key.pub.gpg | sudo apt-key add -
sudo apt-get update
sudo apt-get install bazel
sudo apt-get upgrade bazel
bazel version
  

2.2 Installing Bazel: method 2

Bazel download link

cd ~/Downloads
chmod +x bazel-0.4.5-installer-linux-x86_64.sh # make the installer executable
./bazel-0.4.5-installer-linux-x86_64.sh --user # run the installer
bazel version
  

Set the environment variable (the --user installer places bazel in $HOME/bin)

export PATH="$PATH:$HOME/bin"
  

Possible problems

Updating the package list may fail with the following errors:

W: Failed to fetch http://storage.googleapis.com/bazel-apt/dists/stable/InRelease Unable to find expected entry 'jdk1.8/binary-i386/Packages' in Release file (Wrong sources.list entry or malformed file)
E: Some index files failed to download. They have been ignored, or old ones used instead.
  

Solution

sudo gedit /etc/apt/sources.list.d/bazel.list

Change the line
deb http://storage.googleapis.com/bazel-apt stable jdk1.8
to
deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8
  
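The same edit can be expressed as a small text transformation, which is handy when several machines need the fix. The following is a minimal sketch; restrict_to_amd64 is a hypothetical helper name, and the function only rewrites the text of one line, leaving the reading and writing of /etc/apt/sources.list.d/bazel.list to the caller.

```python
def restrict_to_amd64(line):
    """Add [arch=amd64] to a 'deb' source line unless it already has options."""
    parts = line.split()
    if len(parts) < 2 or parts[0] != "deb":
        return line  # not a binary package source; leave it untouched
    if parts[1].startswith("["):
        return line  # the line already carries an option block
    return " ".join([parts[0], "[arch=amd64]"] + parts[1:])


before = "deb http://storage.googleapis.com/bazel-apt stable jdk1.8"
print(restrict_to_amd64(before))
```

Applying the function twice gives the same result as applying it once, so it is safe to run repeatedly.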

3. Install CUDA and cuDNN to enable GPU support on Linux

To build and run TensorFlow with GPU support, first install NVIDIA's CUDA Toolkit 7.5 and cuDNN v5 for CUDA 7.5.

TensorFlow's GPU features only support NVIDIA cards with Compute Capability >= 3.5. Supported cards include, but are not limited to:

NVIDIA Titan

NVIDIA Titan X

NVIDIA K20

NVIDIA K40
  
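The Compute Capability rule is a simple comparison of (major, minor) pairs. The sketch below assumes the capability values NVIDIA publishes for these cards as far as I know (check NVIDIA's own list before relying on them); the table, the exact product names, and the helper is_supported are illustrative and not part of TensorFlow.

```python
# (major, minor) compute capability per card; values to be verified against
# NVIDIA's published table before use.
COMPUTE_CAPABILITY = {
    "Tesla K20": (3, 5),
    "Tesla K40": (3, 5),
    "Quadro K5200": (3, 5),
    "GeForce GTX TITAN": (3, 5),
    "GeForce GTX TITAN X": (5, 2),
    "GeForce GTX 680": (3, 0),
}

# Threshold stated in this post.
MINIMUM = (3, 5)


def is_supported(card):
    """Return True if `card` is listed and meets the minimum capability."""
    capability = COMPUTE_CAPABILITY.get(card)
    # Unknown cards are reported as unsupported rather than guessed.
    return capability is not None and capability >= MINIMUM


for card in sorted(COMPUTE_CAPABILITY):
    print(card, "supported" if is_supported(card) else "not supported")
```

Storing the capability as a tuple of integers keeps the comparison exact, whereas floats would misorder values such as 3.10 and 3.5.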

See "Caffe + Ubuntu 15.04 + CUDA 7.5: Installing, Configuring, Uninstalling, and Reinstalling on a Server (Tested and Working)".

II. Installing directly on Ubuntu/Linux

# CPU-only version
$ pip install https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.0.1-cp27-none-linux_x86_64.whl
# Version with GPU support (requires the CUDA SDK to be installed already)
$ pip install https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-1.0.1-cp27-none-linux_x86_64.whl
  
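The wheel file name itself says which interpreter and platform the package targets. Under the wheel naming convention of PEP 427 (name-version[-build]-python-abi-platform.whl), it can be taken apart as in the sketch below; parse_wheel_name is a hypothetical helper written for this post, not a pip API.

```python
def parse_wheel_name(filename):
    """Split a wheel file name into its PEP 427 fields."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: %r" % filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError("unexpected number of fields in %r" % filename)
    # An optional build tag may sit between the version and the Python tag,
    # so the last three fields are read from the end.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


info = parse_wheel_name("tensorflow-1.0.1-cp27-none-linux_x86_64.whl")
print(info["python"], info["platform"])
```

Here cp27 means CPython 2.7 and linux_x86_64 means 64-bit Linux, so pip refuses this file on, for example, a Python 3 interpreter.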

III. Building from source

Official TensorFlow guide to installing from source

3.1 Clone the TensorFlow repository

git clone --recurse-submodules https://github.com/tensorflow/tensorflow   # fetch the source code
  

The --recurse-submodules flag is required; it fetches the protobuf library that TensorFlow depends on.

3.2 Configure TensorFlow's CUDA options

cd tensorflow
./configure    # configure TensorFlow

  

Running configure prompts a series of questions:

Please specify the location of python. [Default is /usr/bin/python]
Please specify optimization flags to use during compilation [Default is -march=native]
Do you wish to use jemalloc as the malloc implementation? [Y/N]
y
Do you wish to build TensorFlow with Google Cloud Platform support? [Y/N]
y
Do you wish to build TensorFlow with Hadoop File System support? [Y/N]
y
Do you wish to build TensorFlow with the XLA just-in-time compiler (experimental)? [Y/N]
y
Do you wish to build TensorFlow with OpenCL support? [Y/N]
n
Do you wish to build TensorFlow with CUDA support? [Y/N]
y

  

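If you answer y to "Do you wish to build TensorFlow with OpenCL support? [Y/N]", you need to install the OpenCL drivers and the ComputeCpp compiler first. For the steps, see

Optional: Install OpenCL (Experimental, Linux only)

tensorflow-opencl

Otherwise, configure keeps repeating the same prompt in an endless loop.

[Figure: screenshot of the looping configure prompt]

3.3 Build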

mkdir /tmp/tensorflow_pkg
  

3.3.1 CPU only, no GPU support

cd tensorflow
bazel build -c opt //tensorflow/tools/pip_package:build_pip_package

  

Problem encountered

The 'build' command is only supported from within a workspace

Solution: run bazel from inside the cloned repository, which is the workspace root.

cd tensorflow
  
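Bazel decides whether it is "within a workspace" by looking for a WORKSPACE file in the current directory or one of its parents. The lookup can be sketched as follows; find_workspace_root is a hypothetical helper that mimics this search for diagnostic purposes and is not part of Bazel.

```python
import os


def find_workspace_root(start):
    """Walk upward from `start` until a directory containing WORKSPACE is found.

    Returns the directory path, or None if the filesystem root is reached.
    """
    current = os.path.abspath(start)
    while True:
        if os.path.isfile(os.path.join(current, "WORKSPACE")):
            return current
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            return None
        current = parent


root = find_workspace_root(os.getcwd())
print(root if root else "not inside a Bazel workspace")
```

If this prints that no workspace was found, the bazel build command will fail with the error shown above, and changing into the cloned tensorflow directory fixes it.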

3.3.2 With GPU support

cd tensorflow 
bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package 

  

3.3.3 Generate the pip package

bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
  

Change to /tmp/tensorflow_pkg and locate the built .whl file

cd /tmp/tensorflow_pkg
sudo pip install tensorflow-1.0.1-cp27-none-linux_x86_64.whl
  

3.3.4 Build the example program with GPU support

bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer

bazel-bin/tensorflow/cc/tutorials_example_trainer --use_gpu
  

IV. Setting up the TensorFlow environment

cd tensorflow 
bazel build -c opt //tensorflow/tools/pip_package:build_pip_package
 # To build with GPU support:
bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
mkdir _python_build
cd _python_build
ln -s ../bazel-bin/tensorflow/tools/pip_package/build_pip_package.runfiles/org_tensorflow/* .
ln -s ../tensorflow/tools/pip_package/* .
sudo python setup.py develop
  

V. Testing TensorFlow

import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))
Hello, TensorFlow!

  

a = tf.constant(10)
b = tf.constant(32)
print(sess.run(a+b))
42
  

Painting in the style of Van Gogh with TensorFlow

1. Download neural-style: get the code from its GitHub repository

2. Download VGG-19

3. Copy imagenet-vgg-verydeep-19.mat into the root of the neural-style folder

cp -r imagenet-vgg-verydeep-19.mat /home/bids/neural-style-master/
  

4. Run the style transfer

python neural_style.py --content ./example/xxx.jpg --styles ./example/1-style.jpg --output ./example/yyy.jpg
(xxx.jpg is the image you want to use, 1-style.jpg is the file name of Van Gogh's The Starry Night in the folder, and yyy.jpg is the name of the image to generate.)

cd neural-style-master
python neural_style.py --content ./example/1-content.jpg --styles ./example/1-style.jpg --output ./example/1-output.jpg
  
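Commands copied from web pages often have the two hyphens of a long option silently turned into a single en dash, which the script then treats as a stray positional argument. A minimal sketch of guarding against that when assembling the command in Python is shown below; normalize_option and neural_style_argv are hypothetical helpers, and only the --content, --styles, and --output options used in this post are assumed.

```python
EN_DASH = u"\u2013"
EM_DASH = u"\u2014"


def normalize_option(arg):
    """Replace a leading en or em dash with the two hyphens it stood for."""
    if arg.startswith((EN_DASH, EM_DASH)):
        return "--" + arg[1:]
    return arg


def neural_style_argv(content, style, output):
    """Build the argument list for the neural_style.py call used above."""
    return [
        "python", "neural_style.py",
        "--content", content,
        "--styles", style,
        "--output", output,
    ]


pasted = [EN_DASH + "content", "./example/1-content.jpg"]
print([normalize_option(arg) for arg in pasted])
print(neural_style_argv("./example/1-content.jpg",
                        "./example/1-style.jpg",
                        "./example/1-output.jpg"))
```

The resulting list can be handed to subprocess.call from the neural-style-master directory, which avoids shell quoting issues altogether.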

VI. Problems encountered

gcc version: -fno-canonical-system-headers

When running

./configure
  

the following output appears

INFO: Found 1 target...
Slow read: a 51765952-byte read from /home/bids/.cache/bazel/_bazel_bids/5df0e0fb624204ab1c5ce0472e695b94/external/local_config_cuda/cuda/lib/libcurand.so.7.5 took 9675ms.
INFO: From Compiling external/llvm/lib/Support/Host.cpp:
external/llvm/lib/Support/Host.cpp: In function 'llvm::StringRef llvm::sys::getHostCPUName()':
external/llvm/lib/Support/Host.cpp:898:5: warning: 'Type' may be used uninitialized in this function [-Wuninitialized]
external/llvm/lib/Support/Host.cpp:964:7: warning: 'Subtype' may be used uninitialized in this function [-Wmaybe-uninitialized]
ERROR: /home/bids/.cache/bazel/_bazel_bids/5df0e0fb624204ab1c5ce0472e695b94/external/llvm/BUILD:1667:1: C++ compilation of rule '@llvm//:support' failed: gcc failed: error executing command /usr/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -B/usr/bin -B/usr/bin -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 '-D_FORTIFY_SOURCE=1' -DNDEBUG ... (remaining 43 argument(s) skipped): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 1.
In file included from external/llvm/lib/Support/DynamicLibrary.cpp:16:0:
external/llvm/include/llvm/ADT/DenseSet.h:226:16: error: 'using llvm::DenseSet<ValueT, ValueInfoT>::BaseT::BaseT' conflicts with a previous declaration
external/llvm/include/llvm/ADT/DenseSet.h:223:39: note: previous declaration 'using BaseT = class llvm::detail::DenseSetImpl<ValueT, llvm::DenseMap<ValueT, llvm::detail::DenseSetEmpty, ValueInfoT, llvm::detail::DenseSetPair<ValueT> >, ValueInfoT>'
external/llvm/include/llvm/ADT/DenseSet.h:244:16: error: 'using llvm::SmallDenseSet<ValueT, InlineBuckets, ValueInfoT>::BaseT::BaseT' conflicts with a previous declaration
external/llvm/include/llvm/ADT/DenseSet.h:241:18: note: previous declaration 'using BaseT = class llvm::detail::DenseSetImpl<ValueT, llvm::SmallDenseMap<ValueT, llvm::detail::DenseSetEmpty, InlineBuckets, ValueInfoT, llvm::detail::DenseSetPair<ValueT> >, ValueInfoT>'
Target //tensorflow/tools/pip_package:build_pip_package failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 54.671s, Critical Path: 28.01s
bids@bids-HP-Z840-Workstation:~/tensorflow$ bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
WARNING: /home/bids/tensorflow/tensorflow/contrib/learn/BUILD:15:1: in py_library rule //tensorflow/contrib/learn:learn: target '//tensorflow/contrib/learn:learn' depends on deprecated target '//tensorflow/contrib/session_bundle:exporter': Use SavedModel Builder instead.
WARNING: /home/bids/tensorflow/tensorflow/contrib/learn/BUILD:15:1: in py_library rule //tensorflow/contrib/learn:learn: target '//tensorflow/contrib/learn:learn' depends on deprecated target '//tensorflow/contrib/session_bundle:gc': Use SavedModel instead.
INFO: Found 1 target...
ERROR: /home/bids/.cache/bazel/_bazel_bids/5df0e0fb624204ab1c5ce0472e695b94/external/zlib_archive/BUILD.bazel:5:1: C++ compilation of rule '@zlib_archive//:zlib' failed: crosstool_wrapper_driver_is_not_gcc failed: error executing command external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -U_FORTIFY_SOURCE '-D_FORTIFY_SOURCE=1' -fstack-protector -fPIE -Wall -Wunused-but-set-parameter ... (remaining 37 argument(s) skipped): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 1.
gcc: error: unrecognized command line option '-fno-canonical-system-headers'
Target //tensorflow/tools/pip_package:build_pip_package failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 4.726s, Critical Path: 1.88s

  

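Solution:

This is caused by the gcc version. The earlier Caffe installation required gcc 4.7, which does not recognize the -fno-canonical-system-headers option, so switching to gcc 4.9 fixes it. See

Porting to GCC 4.7
Caffe + Ubuntu 15.04 + CUDA 7.5: Installing, Configuring, Uninstalling, and Reinstalling on a Server (Tested and Working)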

cd /usr/bin
sudo rm gcc
sudo ln -s gcc-4.9 gcc
sudo rm g++
sudo ln -s g++-4.9 g++
  
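After relinking, it is worth confirming which compiler the gcc and g++ names now point at. The sketch below reads the immediate symlink target with os.readlink; link_target and uses_compiler are hypothetical helper names, and the /usr/bin paths are the ones used in the commands above.

```python
import os


def link_target(path):
    """Return the immediate target of a symlink, or None if `path` is not one."""
    return os.readlink(path) if os.path.islink(path) else None


def uses_compiler(link_path, expected_name):
    """Return True if the symlink at `link_path` points at `expected_name`."""
    target = link_target(link_path)
    return target is not None and os.path.basename(target) == expected_name


for name, expected in (("gcc", "gcc-4.9"), ("g++", "g++-4.9")):
    path = os.path.join("/usr/bin", name)
    print(path, "->", link_target(path), uses_compiler(path, expected))
```

A result of None means the name is a regular file or does not exist, in which case the ln -s commands above have not taken effect.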

Problem: Oracle JDK 8 is NOT installed

When running

sudo apt-get install openjdk-8-jdk openjdk-8-source
  

the following error appears

download failed
Oracle JDK 8 is NOT installed.
dpkg: error processing package oracle-java8-installer (--configure):
 subprocess installed post-installation script returned error exit status 1
Errors were encountered while processing:
 oracle-java8-installer
E: Sub-process /usr/bin/dpkg returned an error code (1)
  

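Solution: this happens because oracle-java8-installer could not download the JDK, or the download was incomplete.

Download the archive manually (see the link), copy it into the installer's cache directory, and run the installer again.

sudo cp jdk-8u121-linux-x64.tar.gz /var/cache/oracle-jdk8-installer/
sudo apt-get install oracle-java8-installer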
  

Problem: TensorFlow ImportError: cannot import name pywrap_tensorflow

When starting Python inside the source directory and importing TensorFlow

cd tensorflow
python
import tensorflow as tf
  

the following error appears

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "tensorflow/__init__.py", line 23, in <module>
    from tensorflow.python import *
  File "tensorflow/python/__init__.py", line 48, in <module>
    from tensorflow.python import pywrap_tensorflow
ImportError: cannot import name pywrap_tensorflow
  

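Solution: this happens because Python mistakes the tensorflow directory inside the source tree for the module to import.

Do not run python or ipython from inside the tensorflow source directory.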
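An interactive interpreter searches the current directory before the installed packages, so a local folder named tensorflow that contains an __init__.py wins over the installed one. A quick check for this situation might look like the sketch below; shadowing_package is a hypothetical helper, not a TensorFlow or Python API.

```python
import os


def shadowing_package(module_name, search_dir):
    """Return the local package directory that would shadow `module_name`.

    Returns None when `search_dir` holds no such package.
    """
    candidate = os.path.join(search_dir, module_name)
    if os.path.isfile(os.path.join(candidate, "__init__.py")):
        return candidate
    return None


shadow = shadowing_package("tensorflow", os.getcwd())
if shadow:
    print("import tensorflow would load the source tree at", shadow)
else:
    print("no local tensorflow package in the current directory")
```

Running it from the root of the cloned repository reports the source tree, while running it from the home directory does not, which matches the advice above.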

Changing the Keras backend between TensorFlow and Theano

sudo gedit ~/.keras/keras.json
  

Theano as the backend

{
    "image_dim_ordering": "th", 
    "epsilon": 1e-07, 
    "floatx": "float32", 
    "backend": "theano"
}
  

TensorFlow as the backend

{
    "image_dim_ordering": "tf", 
    "epsilon": 1e-07, 
    "floatx": "float32", 
    "backend": "tensorflow"
}
  
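The two configurations differ only in the backend and image_dim_ordering keys, so switching can be scripted instead of edited by hand. This is a minimal sketch that assumes the keras.json layout shown above; switch_backend and rewrite_config are hypothetical helpers, not part of Keras.

```python
import json

# image_dim_ordering value that goes with each backend in the files above.
ORDERING = {"theano": "th", "tensorflow": "tf"}


def switch_backend(config, backend):
    """Return a copy of a keras.json dictionary set up for `backend`."""
    if backend not in ORDERING:
        raise ValueError("unknown backend: %r" % backend)
    updated = dict(config)
    updated["backend"] = backend
    updated["image_dim_ordering"] = ORDERING[backend]
    return updated


def rewrite_config(path, backend):
    """Load the JSON file at `path`, switch its backend, and write it back."""
    with open(path) as handle:
        config = json.load(handle)
    with open(path, "w") as handle:
        json.dump(switch_backend(config, backend), handle, indent=4)


theano_config = {
    "image_dim_ordering": "th",
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "theano",
}
print(switch_backend(theano_config, "tensorflow"))
```

To apply it to the real file, call rewrite_config with the expanded path of ~/.keras/keras.json, for example via os.path.expanduser.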

