Installing TensorFlow 0.12 from source on Ubuntu 14.04


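Last night, on a whim, I used pip to upgrade TensorFlow to 0.12. After the upgrade a problem showed up: the latest pip build of TensorFlow supports only CUDA 8.0 and cuDNN v5 by default. My machine has CUDA 7.5 and cuDNN v4, so I had to build from source. The process is described below; I hope it is of some help.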


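I. Preparation

1. Download the TensorFlow source:
$ git clone https://github.com/tensorflow/tensorflow
2. Set up the Linux build environment: install Bazel, the other dependencies, CUDA, and cuDNN. See the official TensorFlow website for the detailed steps; they are not repeated here.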
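
Before moving on, it can help to confirm that the command-line tools from the preparation step are actually on PATH. The following is only a minimal sketch: the tool names in `required` are assumptions about a typical setup (the post itself does not list them), and `nvcc` may legitimately be missing from PATH when CUDA lives under a directory such as /usr/local/cuda-7.5/bin.

```python
import shutil


def missing_tools(tools):
    """Return the subset of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


# Assumed tool names for illustration; adjust to your own setup.
required = ["git", "bazel", "nvcc"]
missing = missing_tools(required)
if missing:
    print("not found on PATH:", ", ".join(missing))
else:
    print("all tools found")
```

A tool reported as missing here is not necessarily uninstalled; it may simply need its directory added to PATH.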

II. Configure the installation

Using my machine as an example:

$ cd tensorflow
$ ./configure

Do you wish to build TensorFlow with OpenCL support? [y/N] y
OpenCL support will be enabled for TensorFlow
Do you wish to build TensorFlow with CUDA support? [y/N] y
CUDA support will be enabled for TensorFlow
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]: 
Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to use system default]: 7.5
Please specify the location where CUDA 7.5 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: /usr/local/cuda-7.5
Please specify the Cudnn version you want to use. [Leave empty to use system default]: 4
Please specify the location where cuDNN 4 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda-7.5]: 
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size.
[Default is: "3.5,5.2"]: 5.2
Please specify which C++ compiler should be used as the host C++ compiler. [Default is ]: /usr/bin/g++
Please specify which C compiler should be used as the host C compiler. [Default is ]: /usr/bin/g++
Please specify the location where ComputeCpp for SYCL 1.2 is installed. [Default is /usr/local/computecpp]:

The compute capability of your GPU can be looked up at the address below; for my GTX 980 Ti it is 5.2:

https://developer.nvidia.com/cuda-gpus

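If you answer y to OpenCL support during configuration, the last prompt may report that computecpp was not found. In that case you need to install ComputeCpp. Official site: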

https://www.codeplay.com/products/computesuite/computecpp
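
If you are unsure what to answer at the cuDNN version prompt, the version can usually be read from the `CUDNN_MAJOR`, `CUDNN_MINOR` and `CUDNN_PATCHLEVEL` macros in `cudnn.h`. The sketch below parses those macros from header text. The header location (for example /usr/local/cuda-7.5/include/cudnn.h) is an assumption about where cuDNN was unpacked, and the sample header content is illustrative, not copied from my machine.

```python
import re


def parse_cudnn_version(header_text):
    """Extract (major, minor, patch) from the text of a cudnn.h header."""
    version = []
    for macro in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(
            r"^\s*#define\s+" + macro + r"\s+(\d+)", header_text, re.MULTILINE
        )
        if match is None:
            raise ValueError("macro %s not found" % macro)
        version.append(int(match.group(1)))
    return tuple(version)


# Illustrative header fragment; a real run would read the file instead, e.g.
# open("/usr/local/cuda-7.5/include/cudnn.h").read()  (assumed path).
sample_header = """
#define CUDNN_MAJOR      4
#define CUDNN_MINOR      0
#define CUDNN_PATCHLEVEL 7
"""
print(parse_cudnn_version(sample_header))  # (4, 0, 7)
```

The first element of the tuple is the value the configure script asks for.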

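III. Hitting a pitfall

Once the settings above have been entered, the configure script automatically downloads some files. On my network the download was blocked by the firewall, and the following message appeared: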

Timeout connecting to https://cdnjs.cloudflare.com/ajax/libs/numeric/1.2.6/numeric.min.js

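How can this be fixed? Option 1: get around the firewall with a proxy or VPN. Option 2: edit the TensorFlow configuration file. Only option 2 is covered here.

Option 2: edit the TensorFlow configuration file

The site above is blocked, so the only choice is to find a substitute. First, locate the configuration file in the source directory: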

$ grep numeric.min.js *

The setting turns out to be in the WORKSPACE file. Open WORKSPACE with vim and change the value of url to the address below:

http_file(
  name = "numericjs_numeric_min_js",
  url = "http://www.numericjs.com/lib/numeric-1.2.6.min.js",
)

Run ./configure again; it should work now.
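
The same edit can be scripted instead of done by hand in vim. The following is a minimal sketch, not part of the TensorFlow tooling: `replace_http_file_url` is a hypothetical helper, and it assumes the `name` attribute comes directly before `url` inside the `http_file` rule, as in the snippet above.

```python
import re


def replace_http_file_url(workspace_text, rule_name, new_url):
    """Return workspace_text with the url of the named http_file rule replaced."""
    # Capture everything up to the opening quote of the url value, then the
    # closing quote, so that only the url itself is swapped out.
    pattern = re.compile(
        r'(http_file\(\s*name\s*=\s*"' + re.escape(rule_name)
        + r'"\s*,\s*url\s*=\s*")[^"]*(")'
    )
    new_text, count = pattern.subn(
        lambda m: m.group(1) + new_url + m.group(2), workspace_text
    )
    if count == 0:
        raise ValueError("no http_file rule named %r found" % rule_name)
    return new_text


# Sample text standing in for the relevant part of the WORKSPACE file.
sample = '''http_file(
  name = "numericjs_numeric_min_js",
  url = "https://cdnjs.cloudflare.com/ajax/libs/numeric/1.2.6/numeric.min.js",
)
'''
print(replace_http_file_url(
    sample,
    "numericjs_numeric_min_js",
    "http://www.numericjs.com/lib/numeric-1.2.6.min.js",
))
```

To apply it for real, read WORKSPACE into a string, pass it through the function, and write the result back; keeping a backup copy of the original file is a sensible precaution.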

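IV. numpy and six version problems

With the configuration done, the build and installation went through without problems, but importing tensorflow failed: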

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
RuntimeError: module compiled against API version 0xa but this version of numpy is 0x9
---------------------------------------------------------------------------
SystemError                               Traceback (most recent call last)
<ipython-input-1-a649b509054f> in <module>()
----> 1 import tensorflow

/usr/local/lib/python3.4/dist-packages/tensorflow/__init__.py in <module>()
     22 
     23 # pylint: disable=wildcard-import
---> 24 from tensorflow.python import *
     25 # pylint: enable=wildcard-import
     26 

/usr/local/lib/python3.4/dist-packages/tensorflow/python/__init__.py in <module>()
     59     _default_dlopen_flags = sys.getdlopenflags()
     60     sys.setdlopenflags(_default_dlopen_flags | ctypes.RTLD_GLOBAL)
---> 61     from tensorflow.python import pywrap_tensorflow
     62     sys.setdlopenflags(_default_dlopen_flags)
     63   else:

/usr/local/lib/python3.4/dist-packages/tensorflow/python/pywrap_tensorflow.py in <module>()
     26                 fp.close()
     27             return _mod
---> 28     _pywrap_tensorflow = swig_import_helper()
     29     del swig_import_helper
     30 else:

/usr/local/lib/python3.4/dist-packages/tensorflow/python/pywrap_tensorflow.py in swig_import_helper()
     22         if fp is not None:
     23             try:
---> 24                 _mod = imp.load_module('_pywrap_tensorflow', fp, pathname, description)
     25             finally:
     26                 fp.close()

/usr/lib/python3.4/imp.py in load_module(name, file, filename, details)
    241                 return load_dynamic(name, filename, opened_file)
    242         else:
--> 243             return load_dynamic(name, filename, file)
    244     elif type_ == PKG_DIRECTORY:
    245         return load_package(name, filename)

SystemError: initialization of _pywrap_tensorflow raised unreported exception

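I guessed this was a numpy version problem, so I uninstalled numpy and installed it again (I tried upgrading in place first, but that did not help):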

$ sudo pip3 uninstall numpy
$ sudo pip3 install numpy

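Importing tf again failed with a different error, shown below. After some searching I found that the six package was the cause, so I uninstalled and reinstalled six as well, and that fixed it!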

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-2-a649b509054f> in <module>()
----> 1 import tensorflow

/usr/local/lib/python3.4/dist-packages/tensorflow/__init__.py in <module>()
     22 
     23 # pylint: disable=wildcard-import
---> 24 from tensorflow.python import *
     25 # pylint: enable=wildcard-import
     26 

/usr/local/lib/python3.4/dist-packages/tensorflow/python/__init__.py in <module>()
    122 from tensorflow.python.platform import resource_loader
    123 from tensorflow.python.platform import sysconfig
--> 124 from tensorflow.python.platform import test
    125 
    126 from tensorflow.python.util.all_util import remove_undocumented

/usr/local/lib/python3.4/dist-packages/tensorflow/python/platform/test.py in <module>()
     67 # pylint: disable=g-bad-import-order
     68 from tensorflow.python.client import device_lib as _device_lib
---> 69 from tensorflow.python.framework import test_util as _test_util
     70 from tensorflow.python.platform import googletest as _googletest
     71 from tensorflow.python.util.all_util import remove_undocumented

/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/test_util.py in <module>()
     41 from tensorflow.python.framework import random_seed
     42 from tensorflow.python.framework import versions
---> 43 from tensorflow.python.platform import googletest
     44 from tensorflow.python.platform import tf_logging as logging
     45 from tensorflow.python.util import compat

/usr/local/lib/python3.4/dist-packages/tensorflow/python/platform/googletest.py in <module>()
     31 
     32 from tensorflow.python.platform import app
---> 33 from tensorflow.python.platform import benchmark  # pylint: disable=unused-import
     34 
     35 Benchmark = benchmark.TensorFlowBenchmark  # pylint: disable=invalid-name

/usr/local/lib/python3.4/dist-packages/tensorflow/python/platform/benchmark.py in <module>()
    115 
    116 
--> 117 class Benchmark(six.with_metaclass(_BenchmarkRegistrar, object)):
    118   """Abstract class that provides helper functions for running benchmarks.
    119 

/usr/lib/python3/dist-packages/six.py in with_metaclass(meta, *bases)
    615 def with_metaclass(meta, *bases):
    616     """Create a base class with a metaclass."""
--> 617     return meta("NewBase", bases, {})
    618 
    619 def add_metaclass(metaclass):

/usr/local/lib/python3.4/dist-packages/tensorflow/python/platform/benchmark.py in __new__(mcs, clsname, base, attrs)
    110     newclass = super(mcs, _BenchmarkRegistrar).__new__(
    111         mcs, clsname, base, attrs)
--> 112     if not newclass.is_abstract():
    113       GLOBAL_BENCHMARK_REGISTRY.add(newclass)
    114     return newclass

AttributeError: type object 'NewBase' has no attribute 'is_abstract'
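
The traceback shows six being loaded from /usr/lib/python3/dist-packages/six.py. When several copies of a module are installed, the first one found on `sys.path` is the one that gets imported, so listing every copy can help explain which one wins. The sketch below does that; `find_module_copies` is a hypothetical helper, and it only looks for plain `.py` files and packages, not other module formats.

```python
import os
import sys


def find_module_copies(name, search_paths=None):
    """List every file or package named `name` on the search paths, in order."""
    if search_paths is None:
        search_paths = sys.path
    copies = []
    for directory in search_paths:
        directory = directory or "."  # an empty entry means the current directory
        if not os.path.isdir(directory):
            continue
        module_file = os.path.join(directory, name + ".py")
        package_init = os.path.join(directory, name, "__init__.py")
        for candidate in (module_file, package_init):
            if os.path.isfile(candidate) and candidate not in copies:
                copies.append(candidate)
    return copies


# The first path printed is the copy a plain `import six` would normally use.
for path in find_module_copies("six"):
    print(path)
```

If more than one path is printed, an outdated copy earlier in the list may be shadowing the newer one.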

Importing tensorflow once more raised no errors, and programs ran normally. If problems remain, see here

In [1]: import tensorflow
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcublas.so.7.5 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcudnn.so.4 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcufft.so.7.5 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcurand.so.7.5 locally

V. Summary

After a whole morning of tinkering, TensorFlow was finally upgraded to 0.12. I hope this helps anyone who runs into the same problems.

