Ubuntu18.04 安装 CUDA10.0 + cuDNN7.6.5 + TensorFlow2.0

最新推荐文章于 2024-07-05 00:01:39 发布

breeze_csdn

最新推荐文章于 2024-07-05 00:01:39 发布

阅读量3.5k

点赞数

分类专栏： deep learning

原文链接：https://blog.csdn.net/qq_33431368/article/details/103849719

版权

deep learning 专栏收录该内容

25 篇文章 1 订阅

订阅专栏

第一次安装 CUDA 的过程简直抓狂，中间出现了很多次莫名其妙的 bug，踩了很多坑。比如装好了 CUDA 重启后进不去桌面系统了，直接黑屏、比如鼠标键盘都不 work 了、再比如装好了却安装不了 TensorFlow-GPU......看了一圈网上的安装教程，发现还是官方指南真香了~

新年第一篇，分享一下我的 Ubuntu 18.04 + CUDA 10.0 + cuDNN 7.6.5 + TensorFlow 2.0 安装笔记，希望可以帮助大家少踩坑。

整个安装流程大致是：安装显卡驱动 -> 安装 CUDA^[1] -> 安装 cuDNN^[2] -> 安装 tensorflow-gpu 并测试。

全文目录：

Ubuntu安装与更新
安装显卡驱动
安装CUDA
安装cuDNN
安装TensorFlow2.0 GPU及测试

1. Ubuntu安装和更新

先进行Ubuntu18.04系统一些基本的安装和更新，具体的操作系统安装过程省略，比较容易，大家可自行百度，有很多教程。


    
    
      
      
       
       
      
      
      
      
       
       
        
        sudo apt-get update 
        
        # 更新源
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        sudo apt-get upgrade 
        
        # 更新已安装的包
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        sudo apt-get install vim

2. 安装显卡驱动

2.1 禁用 Nouveau 驱动

注意：Linux 系统下有两种方案安装 CUDA：一种是 Package Manager Installation (.deb)，另一种是 Runfile Installation (.run)。本文采取的是第一种（也是官方推荐的方式）。如果使用deb方式安装CUDA可以忽略此步，本人测试OK。如果使用 runfile 安装CUDA需要手动禁用系统自带的 Nouveau 驱动：


    
    
      
      
       
       
      
      
      
      
       
       
        
        lsmod | grep nouveau 
        
        # 要确保这条命令无输出


    
    
      
      
       
       
      
      
      
      
       
       
        
        vim /etc/modprobe.d/blacklist-nouveau.conf
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        # 添加下面两行：
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        #######################################################
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        blacklist nouveau
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        options nouveau modeset=
        
        0
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        #######################################################
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        # 保存后重启：
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        sudo update-initramfs -u
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        sudo reboot
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        # 再次输入以下命令，无输出就表示设置成功了
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        lsmod | grep nouveau

2.2 安装合适的显卡驱动^[3]


    
    
      
      
       
       
      
      
      
      
       
       
        
        # 先清空现有的显卡驱动及依赖并重启
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        sudo apt-get remove --purge nvidia*
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        sudo apt autoremove
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        sudo reboot


    
    
      
      
       
       
      
      
      
      
       
       
        
        # 添加ppa源并安装最新的驱动
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        sudo add-apt-repository ppa:graphics-drivers/ppa
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        sudo apt update
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        ubuntu-drivers devices
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        sudo apt install nvidia-driver
        
        -440
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        # 为了防止自动更新驱动导致的兼容性问题，我们还可以锁定驱动版本:
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        sudo apt-mark hold nvidia-driver
        
        -440
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        # nvidia-driver-440 set on hold.

并在【软件和更新】菜单中的附加驱动列表中，可以找到刚刚安装的nvidia-driver-440，选定即可。输入sudo reboot重启后，输入nvidia-smi，显示下图信息，这样表示显卡驱动已经 ready：


    
    
      
      
       
       
      
      
      
      
       
       
        
        lsmod | grep nvidia 
        
        # 看到下面的输出则为安装成功，如果无输出，表示有问题

也可以手动去官网下载对应的安装程序安装显卡^[4]


    
    
      
      
       
       
      
      
      
      
       
       
        
        # 动态监测显卡使用的方式：
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        watch -n 
        
        1 nvidia-smi 
        
        # 1表示每1秒刷新一次
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        watch -n 
        
        0.01 nvidia-smi 
        
        # 也可改成0.01s刷新一次
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        # 也可以用gpustat
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        pip install gpustat
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        gpustat -i 
        
        1 -P

3. 安装 CUDA

百度百科：CUDA（Compute Unified Device Architecture），是显卡厂商NVIDIA^[5]推出的运算平台。CUDA 是一种由 NVIDIA 推出的通用并行计算^[6]架构，该架构使GPU^[7]能够解决复杂的计算问题。

Linux 系统下有两种方案安装 CUDA：一种是 Package Manager Installation (.deb)，另一种是 Runfile Installation (.run)。本文采取的是第一种（也是官方推荐的方式）。

另外，CUDA 对于系统环境有严格的依赖，比如对于 CUDA10.0 有如下的要求。其他的版本可查看对应的Online Documentation^[8]。

3.1 安装前的准备

在安装 CUDA 之前需要先确定环境是 ready 的，以免出现乱七八糟的 bug 无从下手。直接引用官网的说明：

Some actions must be taken before the CUDA Toolkit and Driver can be installed on Linux:

Verify the system has a CUDA-capable GPU.
Verify the system is running a supported version of Linux.
Verify the system has gcc installed.
Verify the system has the correct kernel headers and development packages installed.
Download the NVIDIA CUDA Toolkit.
Handle conflicting installation methods.

3.1.1 确认你有支持 CUDA 的 GPU


    
    
      
      
       
       
      
      
      
      
       
       
        
        lspci | grep -i nvidia | grep VGA

3.1.2 确认你的 linux 版本


    
    
      
      
       
       
      
      
      
      
       
       
        
        uname -m && cat /etc
        
        /*release
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        uname -a
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        # The x86_64 line indicates you are running on a 64-bit system.

3.1.3 确认 gcc 版本


    
    
      
      
       
       
      
      
      
      
       
       
        
        gcc --version
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        # gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0

3.1.4 安装对应内核版本的头文件

查看 kernel 的版本：


    
    
      
      
       
       
      
      
      
      
       
       
        
        uname -r
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        # 5.0.0-37-generic

This is the version of the kernel headers and development packages that must be installed prior to installing the CUDA Drivers.

安装对应内核版本的头文件：


    
    
      
      
       
       
      
      
      
      
       
       
        
        sudo apt-get install linux-headers-$(uname -r)

3.1.5 选择安装方式

下载对应的安装包（以官方推荐的 Deb packages 安装方式为例）^[9]

The CUDA Toolkit can be installed using either of two different installation mechanisms: distribution-specific packages (RPM and Deb packages), or a distribution-independent package (runfile packages).

(1) The distribution-independent package has the advantage of working across a wider set of Linux distributions, but does not update the distribution's native package management system.

(2) The distribution-specific packages interface with the distribution's native package management system. It is recommended to use the distribution-specific packages, where possible.

3.1.6 彻底卸载之前安装过的相关应用，避免冲突

如果是全新的 ubuntu，可忽略此部分，执行 3.2 部分即可。

如果 ubuntu 下用 RPM/Deb 安装的：


    
    
      
      
       
       
      
      
      
      
       
       
        
        sudo apt-get --purge remove <package_name>
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        sudo apt autoremove

如果是 runfile 安装的：


    
    
      
      
       
       
      
      
      
      
       
       
        
        sudo /usr/bin/nvidia-uninstall
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        sudo /usr/local/cuda-X.Y/bin/uninstall_cuda_X.Y.pl

3.2 安装

首先确保已经下载好对应的.deb 文件，然后执行：


    
    
      
      
       
       
      
      
      
      
       
       
        
        sudo dpkg -i cuda-repo-ubuntu1804
        
        -10
        
        -0-local
        
        -10.0
        
        .130
        
        -410.48_1
        
        .0
        
        -1_amd64.deb
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        sudo apt-key add /
        
        var/cuda-repo-<version>/
        
        7fa2af80.pub 
        
        # 根据执行完第一步的提示输入，比如我是：
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        # sudo apt-key add /var/cuda-repo-10-0-local-10.0.130-410.48/7fa2af80.pub
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        sudo apt-get update
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        sudo apt-get install cuda-toolkit
        
        -10
        
        -0 
        
        # 注意不是cuda，因为在第二步中装过驱动了，此过程安装cuda-toolkit-10-0即可

3.3 安装后

安装之后需要手动进行一些设置才能使 CUDA 正常的工作。


    
    
      
      
       
       
      
      
      
      
       
       
        
        export PATH=/usr/local/cuda
        
        -10.0/bin${PATH:+:${PATH}}


    
    
      
      
       
       
      
      
      
      
       
       
        
        nvcc -V 
        
        # 检查CUDA是否安装成功
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        # OUTPUT:
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        nvcc: NVIDIA (R) Cuda compiler driver
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        Copyright (c) 
        
        2005
        
        -2018 NVIDIA Corporation
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        Built on Sat_Aug_25_21:
        
        08:
        
        01_CDT_2018
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        Cuda compilation tools, release 
        
        10.0, V10
        
        .0
        
        .130

最好关闭系统的自动更新，防止安装好的环境突然 bug：


    
    
      
      
       
       
      
      
      
      
       
       
        
        sudo vi /etc/apt/apt.conf.d/
        
        10periodic
       
       
      
      

      
      
       
       
      
      
      
      
       
        
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        # 修改为：
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        APT::Periodic::Update-Package-Lists 
        
        "0";
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        APT::Periodic::Download-Upgradeable-Packages 
        
        "0";
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        APT::Periodic::AutocleanInterval 
        
        "0";

也可以通过桌面设置：System Settings => Software&Updates => updates

4. 安装 cuDNN^[10]

NVIDIA cuDNN 是用于深度神经网络的 GPU 加速库。首先需要注册下载对应 CUDA 版本号的 cuDNN 安装包: 链接^[11]。

比如对应 CUDA10.0，我下载的是：tar -zxvf cudnn-10.0-linux-x64-v7.6.5.32.tgz


    
    
      
      
       
       
      
      
      
      
       
       
        
        tar -zxvf cudnn
        
        -10.0-linux-x64-v7
        
        .6
        
        .5
        
        .32.tgz
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        sudo cp cuda/
        
        include/cudnn.h /usr/local/cuda/
        
        include
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        sudo chmod a+r /usr/local/cuda/
        
        include/cudnn.h /usr/local/cuda/lib64/libcudnn*

验证是否安装成功：


    
    
      
      
       
       
      
      
      
      
       
       
        
        cat /usr/local/cuda/
        
        include/cudnn.h | grep CUDNN_MAJOR -A 
        
        2
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        # 输出
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        ""
        
        "
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        #define CUDNN_MAJOR 7
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        #define CUDNN_MINOR 6
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        #define CUDNN_PATCHLEVEL 5
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        --
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        #define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        #include "driver_types.h
        
        "
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        "
        
        ""

更推荐使用 Debian File 去安装，因为可以通过里面的样例去验证 cuDNN 是否成功安装。首先下载下面三个文件：


    
    
      
      
       
       
      
      
      
      
       
       
        
        # 分别下载
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        sudo dpkg -i libcudnn7_7
        
        .6
        
        .5
        
        .32
        
        -1+cuda10
        
        .0_amd64.deb
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        sudo dpkg -i libcudnn7-dev_7
        
        .6
        
        .5
        
        .32
        
        -1+cuda10
        
        .0_amd64.deb
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        sudo dpkg -i libcudnn7-doc_7
        
        .6
        
        .5
        
        .32
        
        -1+cuda10
        
        .0_amd64.deb
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        # 安装完验证：
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        cp -r /usr/src/cudnn_samples_v7/ $HOME
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        cd  $HOME/cudnn_samples_v7/mnistCUDNN
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        make clean && make
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        ./mnistCUDNN
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        # Test passed!

另外也可以用 conda 来安装 cudatoolkit 和 cuDNN，但要保证驱动是 ready 的。


    
    
      
      
       
       
      
      
      
      
       
       
        
        conda install cudatoolkit=
        
        10.0
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        conda install -c anaconda cudnn

5. 安装 TensorFlow2.0 GPU及测试


    
    
      
      
       
       
      
      
      
      
       
       
        
        # 安装conda
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        wget https:
        
        //repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh && bash Miniconda3-latest-Linux-x86_64.sh
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        source ~/.bashrc
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        conda create -y -n tf2 python=
        
        3.7
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        conda activate tf2
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        pip install --upgrade pip
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        pip config set 
        
        global.index-url https:
        
        //pypi.tuna.tsinghua.edu.cn/simple
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        pip install tensorflow-gpu
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        pip install catboost

测试:


    
    
      
      
       
       
      
      
      
      
       
       
        
        import tensorflow 
        
        as tf
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        print(tf.__version__)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        print(
        
        "Num GPUs Available: ", len(tf.config.experimental.list_physical_devices(
        
        'GPU')))
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        ""
        
        "
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        2.0.0
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        Num GPUs Available:  2
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        "
        
        ""


    
    
      
      
       
       
      
      
      
      
       
       
        
        ""
        
        "
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        测试程序：
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        源链接：https://github.com/dragen1860/TensorFlow-2.x-Tutorials/blob/master/08-ResNet/main.py
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        "
        
        ""
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        import os
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        os.environ[
        
        "CUDA_VISIBLE_DEVICES"] = 
        
        "1" 
        
        # os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        import tensorflow 
        
        as tf
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        import numpy 
        
        as np
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        from tensorflow import keras
       
       
      
      

      
      
       
       
      
      
      
      
       
        
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        tf.random.set_seed(
        
        22)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        np.random.seed(
        
        22)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        os.environ[
        
        'TF_CPP_MIN_LOG_LEVEL'] = 
        
        '2'
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        assert tf.__version__.startswith(
        
        '2.')
       
       
      
      

      
      
       
       
      
      
      
      
       
        
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        (x_train, y_train), (x_test, y_test) = keras.datasets.fashion_mnist.load_data()
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        x_train, x_test = x_train.astype(np.float32) / 
        
        255., x_test.astype(
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
            np.float32) / 
        
        255.
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        # [b, 28, 28] => [b, 28, 28, 1]
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        x_train, x_test = np.expand_dims(x_train, axis=
        
        3), np.expand_dims(x_test,
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                                                                          axis=
        
        3)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        # one hot encode the labels. convert back to numpy as we cannot use a combination of numpy
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        # and tensors as input to keras
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        y_train_ohe = tf.one_hot(y_train, depth=
        
        10).numpy()
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        y_test_ohe = tf.one_hot(y_test, depth=
        
        10).numpy()
       
       
      
      

      
      
       
       
      
      
      
      
       
        
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        print(x_train.shape, y_train.shape)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        print(x_test.shape, y_test.shape)
       
       
      
      

      
      
       
       
      
      
      
      
       
        
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        # 3x3 convolution
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        def conv3x3(channels, stride=
        
        1, kernel=(
        
        3, 
        
        3)):
       
       
      
      

      
      
       
       
      
      
      
      
       
           
        
        return keras.layers.Conv2D(
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                channels,
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                kernel,
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                strides=stride,
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                padding=
        
        'same',
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                use_bias=
        
        False,
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                kernel_initializer=tf.random_normal_initializer())
       
       
      
      

      
      
       
       
      
      
      
      
       
        
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        class ResnetBlock(keras.Model):
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
            def __init__(self, channels, strides=1, residual_path=False):
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                super(ResnetBlock, self).__init__()
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                self.channels = channels
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                self.strides = strides
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                self.residual_path = residual_path
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                self.conv1 = conv3x3(channels, strides)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                self.bn1 = keras.layers.BatchNormalization()
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                self.conv2 = conv3x3(channels)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                self.bn2 = keras.layers.BatchNormalization()
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                if residual_path:
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                    self.down_conv = conv3x3(channels, strides, kernel=(1, 1))
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                    self.down_bn = tf.keras.layers.BatchNormalization()
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
            def call(self, inputs, training=None):
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                residual = inputs
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                x = self.bn1(inputs, training=training)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                x = tf.nn.relu(x)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                x = self.conv1(x)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                x = self.bn2(x, training=training)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                x = tf.nn.relu(x)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                x = self.conv2(x)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                # this module can be added into self.
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                # however, module in for can not be added.
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                if self.residual_path:
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                    residual = self.down_bn(inputs, training=training)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                    residual = tf.nn.relu(residual)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                    residual = self.down_conv(residual)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                x = x + residual
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                return x
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        class ResNet(keras.Model):
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
            def __init__(self, block_list, num_classes, initial_filters=16, **kwargs):
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                super(ResNet, self).__init__(**kwargs)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                self.num_blocks = len(block_list)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                self.block_list = block_list
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                self.in_channels = initial_filters
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                self.out_channels = initial_filters
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                self.conv_initial = conv3x3(self.out_channels)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                self.blocks = keras.models.Sequential(name='dynamic-blocks')
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                # build all the blocks
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                for block_id in range(len(block_list)):
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                    for layer_id in range(block_list[block_id]):
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                        if block_id != 0 and layer_id == 0:
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                            block = ResnetBlock(self.out_channels,
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                                                strides=2,
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                                                residual_path=True)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                        else:
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                            if self.in_channels != self.out_channels:
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                                residual_path = True
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                            else:
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                                residual_path = False
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                            block = ResnetBlock(self.out_channels,
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                                                residual_path=residual_path)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                        self.in_channels = self.out_channels
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                        self.blocks.add(block)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                    self.out_channels *= 2
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                self.final_bn = keras.layers.BatchNormalization()
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                self.avg_pool = keras.layers.GlobalAveragePooling2D()
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                self.fc = keras.layers.Dense(num_classes)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
            def call(self, inputs, training=None):
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                out = self.conv_initial(inputs)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                out = self.blocks(out, training=training)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                out = self.final_bn(out, training=training)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                out = tf.nn.relu(out)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                out = self.avg_pool(out)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                out = self.fc(out)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                return out
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        def main():
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
            num_classes = 10
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
            batch_size = 128
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
            epochs = 2
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
            # build model and optimizer
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
            model = ResNet([2, 2, 2], num_classes)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
            model.compile(optimizer=keras.optimizers.Adam(0.001),
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                          loss=keras.losses.CategoricalCrossentropy(from_logits=True),
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                          metrics=['accuracy'])
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
            model.build(input_shape=(None, 28, 28, 1))
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
            print("Number of variables in the model :", len(model.variables))
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
            model.summary()
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
            # train
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
            model.fit(x_train,
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                      y_train_ohe,
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                      batch_size=batch_size,
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                      epochs=epochs,
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                      validation_data=(x_test, y_test_ohe),
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
                      verbose=1)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
            # evaluate on test set
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
            scores = model.evaluate(x_test, y_test_ohe, batch_size, verbose=1)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
            print("Final test loss and accuracy :", scores)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        if __name__ == '__main__':
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
            main()

监测 GPU 使用：


    
    
      
      
       
       
      
      
      
      
       
       
        
        watch -n 
        
        0.01 nvidia-smi

测试 catboost 使用 CPU：


    
    
      
      
       
       
      
      
      
      
       
       
        
        from catboost.datasets import titanic
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        import numpy 
        
        as np
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        from sklearn.model_selection import train_test_split
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        from catboost import CatBoostClassifier, Pool, cv
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        from sklearn.metrics import accuracy_score
       
       
      
      

      
      
       
       
      
      
      
      
       
        
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        train_df, test_df = titanic()
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        null_value_stats = train_df.isnull().sum(axis=
        
        0)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        null_value_stats[null_value_stats != 
        
        0]
       
       
      
      

      
      
       
       
      
      
      
      
       
        
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        train_df.fillna(
        
        -999, inplace=
        
        True)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        test_df.fillna(
        
        -999, inplace=
        
        True)
       
       
      
      

      
      
       
       
      
      
      
      
       
        
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        X = train_df.drop(
        
        'Survived', axis=
        
        1)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        y = train_df.Survived
       
       
      
      

      
      
       
       
      
      
      
      
       
        
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        X_train, X_validation, y_train, y_validation = train_test_split(X, y, train_size=
        
        0.75, random_state=
        
        42)
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        X_test = test_df
       
       
      
      

      
      
       
       
      
      
      
      
       
        
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        categorical_features_indices = np.where(X.dtypes != np.float)[
        
        0]
       
       
      
      

      
      
       
       
      
      
      
      
       
        
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        model = CatBoostClassifier(
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
            task_type=
        
        "GPU",
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
            custom_metric=[
        
        'Accuracy'],
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
            random_seed=
        
        666,
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
            logging_level=
        
        'Silent'
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        )
       
       
      
      

      
      
       
       
      
      
      
      
       
        
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        model.fit(
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
            X_train, y_train,
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
            cat_features=categorical_features_indices,
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
            eval_set=(X_validation, y_validation),
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
            logging_level=
        
        'Verbose',  
        
        # you can comment this for no text output
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
            plot=
        
        True
       
       
      
      

      
      
       
       
      
      
      
      
       
       
        
        );

监测 GPU 使用：


    
    
      
      
       
       
      
      
      
      
       
       
        
        watch -n 
        
        0.01 nvidia-smi

REFERENCE

[1]

安装CUDA: https://developer.nvidia.com/cuda-toolkit-archive

[2]

安装cuDNN: https://developer.nvidia.com/rdp/cudnn-download

[3]

安装合适的显卡驱动: http://www.linuxandubuntu.com/home/how-to-install-latest-nvidia-drivers-in-linux

[4]

也可以手动去官网下载对应的安装程序安装显卡: https://www.geforce.cn/drivers

[5]

NVIDIA: https://baike.baidu.com/item/NVIDIA

[6]

并行计算: https://baike.baidu.com/item/并行计算/113443

[7]

GPU: https://baike.baidu.com/item/GPU

[8]

Online Documentation: https://developer.nvidia.com/cuda-toolkit-archive

[9]

下载对应的安装包（以官方推荐的Deb packages安装方式为例）: https://developer.nvidia.com/cuda-10.0-download-archive?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1804&target_type=deblocal

[10]

安装cuDNN: https://developer.nvidia.com/rdp/cudnn-download

[11]

链接: https://developer.nvidia.com/rdp/cudnn-download

[12]

官方-NVIDIA CUDA Installation Guide for Linux: https://docs.nvidia.com/cuda/archive/10.0/cuda-installation-guide-linux/index.html

[13]

CUDA_Quick_Start_Guide-pdf: https://developer.download.nvidia.com/compute/cuda/10.0/Prod/docs/sidebar/CUDA_Quick_Start_Guide.pdf

[14]

CUDA_Installation_Guide_Linux-pdf: https://developer.download.nvidia.com/compute/cuda/10.0/Prod/docs/sidebar/CUDA_Installation_Guide_Linux.pdf

[15]

官方-cuDNN安装: https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html#install-linux

[16]

[How To] Install Latest NVIDIA Drivers In Linux: http://www.linuxandubuntu.com/home/how-to-install-latest-nvidia-drivers-in-linux

推荐原创干货阅读：

聊聊近状，唠十块钱的

【Deep Learning】详细解读LSTM与GRU单元的各个公式和区别

【手把手AI项目】一、安装win10+linux-Ubuntu16.04的双系统（全网最详细）

【Deep Learning】为什么卷积神经网络中的“卷积”不是卷积运算?

【TOOLS】Pandas如何进行内存优化和数据加速读取（附代码详解）

【TOOLS】python3利用SMTP进行邮件Email自主发送

【手把手AI项目】七、MobileNetSSD通过Ncnn前向推理框架在PC端的使用

【时空序列预测第一篇】什么是时空序列问题？这类问题主要应用了哪些模型？主要应用在哪些领域？

breeze_csdn

关注

0
点赞
踩
8

收藏

觉得还不错? 一键收藏
1
评论
Ubuntu18.04 安装 CUDA10.0 + cuDNN7.6.5 + TensorFlow2.0

第一次安装 CUDA 的过程简直抓狂，中间出现了很多次莫名其妙的 bug，踩了很多坑。比如装好了 CUDA 重启后进不去桌面系统了，直接黑屏、比如鼠标键盘都不 work 了、再比如装好了却安装不了 TensorFlow-GPU......看了一圈网上的安装教程，发现还是官方指南真香了~新年第一篇，分享一下我的...
复制链接

扫一扫