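Installing tensorflow-gpu
Install Docker and nvidia-docker
Install Docker on Ubuntu and enable GPU support for Docker
Install TensorFlow
Install the GPU driver on the host
- Find a suitable NVIDIA driver version and install it
sudo ubuntu-drivers devices
sudo ubuntu-drivers autoinstall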
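To confirm from a script that the driver install took effect, the version can be read back through `nvidia-smi`. This is a minimal sketch, assuming `nvidia-smi` is on the PATH once the driver is installed; the helper name `driver_version` is hypothetical and not part of any library.

```python
import shutil
import subprocess


def driver_version():
    """Return the NVIDIA driver version string, or None if it cannot be read."""
    # nvidia-smi ships with the driver; if it is missing, the driver is
    # probably not installed (or not on PATH).
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0 or not result.stdout.strip():
        return None
    # One line is printed per GPU; take the first one.
    return result.stdout.strip().splitlines()[0].strip()


if __name__ == "__main__":
    print("driver:", driver_version())
```

A return value of `None` suggests the driver is missing or not loaded yet.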
Use python3.8 as the base image
- Pull the image
sudo docker pull python:3.8
- Write docker-compose.yml
version: '3'
services:
  tensorflow_gpu:
    container_name: tensorflow_gpu
    image: python:3.8
    user: "0"
    working_dir: /home
    volumes:
      - ./src:/home
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: "all"
              capabilities: [gpu]
    stdin_open: true
    tty: true
    command: /bin/bash -c "chown -R 1002:1002 . && /bin/bash"
- Create the container
sudo docker-compose up -d
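Once the container is up, it is worth checking that the GPU reservation actually reached it. The sketch below runs `nvidia-smi -L` inside the container through `docker exec`. It assumes the container name `tensorflow_gpu` from the compose file and that the current user may call `docker` without `sudo`; the function name `container_sees_gpu` is hypothetical.

```python
import shutil
import subprocess


def container_sees_gpu(container="tensorflow_gpu"):
    """Return True/False for GPU visibility, or None if docker is unavailable."""
    if shutil.which("docker") is None:
        return None
    # `docker exec` runs a command inside an already running container.
    result = subprocess.run(
        ["docker", "exec", container, "nvidia-smi", "-L"],
        capture_output=True,
        text=True,
    )
    # `nvidia-smi -L` prints one "GPU <n>: ..." line per visible device.
    return result.returncode == 0 and "GPU" in result.stdout


if __name__ == "__main__":
    print("GPU visible in container:", container_sees_gpu())
```

If this reports `False`, the usual suspects are a missing NVIDIA container runtime on the host or a compose version that ignores the `deploy.resources.reservations.devices` block.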
Install CUDA inside Docker (and only CUDA)
- Check the highest CUDA version the driver supports
nvidia-smi
- Choose the CUDA version you need and select runfile (local) as the installer type
wget https://developer.download.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda_10.1.243_418.87.00_linux.run
- Install CUDA, selecting only the CUDA Toolkit, and configure the environment variables
sh cuda_10.1.243_418.87.00_linux.run
echo 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64' >> ~/.bashrc
echo 'export CUDA_HOME=/usr/local/cuda' >> ~/.bashrc
echo 'export PATH=$PATH:$CUDA_HOME/bin' >> ~/.bashrc
source ~/.bashrc
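- If the installer reports
Failed to verify gcc version. See log at /var/log/cuda-installer.log for details.
then either pass the --override flag or install the gcc version that matches this CUDA release
- Pass the --override flag
sh cuda_10.1.243_418.87.00_linux.run --override
- Or check the Versioned Online Documentation for this CUDA release and install a suitable gcc version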
cp /etc/apt/sources.list /etc/apt/sources.list.bak
echo "deb https://mirrors.ustc.edu.cn/ubuntu/ focal main restricted universe multiverse" > /etc/apt/sources.list
echo "deb https://mirrors.ustc.edu.cn/ubuntu/ focal-security main restricted universe multiverse" >> /etc/apt/sources.list
echo "deb https://mirrors.ustc.edu.cn/ubuntu/ focal-updates main restricted universe multiverse" >> /etc/apt/sources.list
echo "deb https://mirrors.ustc.edu.cn/ubuntu/ focal-backports main restricted universe multiverse" >> /etc/apt/sources.list
# If apt update reports NO_PUBKEY, set key to the reported key ID, then import it
apt update 2>&1 | grep NO_PUBKEY
gpg --keyserver keyserver.ubuntu.com --recv-keys $key
gpg --export --armor $key | apt-key add -
apt update
apt install gcc-7 -y
apt install g++-7 -y
apt upgrade -y
apt autoremove
update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-7 90
update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-7 90
update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-10 50
update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-10 50
update-alternatives --config gcc
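Two checks in this section are easy to script: reading the maximum CUDA version from the `nvidia-smi` header, and reading the active gcc major version before running the installer. The sketch below only parses text that you pass in. The function names are hypothetical, and the sample strings in the usage part are illustrative inputs, not output captured from this setup.

```python
import re


def max_cuda_version(nvidia_smi_output):
    """Return (major, minor) from the 'CUDA Version: X.Y' header field, or None."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", nvidia_smi_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def toolkit_is_supported(toolkit, nvidia_smi_output):
    """True if a toolkit version such as "10.1" does not exceed the driver's maximum."""
    limit = max_cuda_version(nvidia_smi_output)
    if limit is None:
        return False
    major, minor = (int(part) for part in toolkit.split(".")[:2])
    return (major, minor) <= limit


def gcc_major(gcc_version_output):
    """Return the gcc major version from the first line of `gcc --version`, or None."""
    lines = gcc_version_output.strip().splitlines()
    if not lines:
        return None
    # Drop the parenthesised distro tag, which also contains version-like digits.
    cleaned = re.sub(r"\(.*?\)", "", lines[0])
    match = re.search(r"(\d+)\.\d+", cleaned)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    header = "| NVIDIA-SMI 418.87.00    Driver Version: 418.87.00    CUDA Version: 10.1     |"
    print(max_cuda_version(header), toolkit_is_supported("10.1", header))
    print(gcc_major("gcc (Ubuntu 7.5.0-6ubuntu2) 7.5.0"))
```

The highest gcc version a given CUDA release accepts is not encoded here on purpose; take that number from the Versioned Online Documentation and compare it with the result of `gcc_major`.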
Install tensorflow inside Docker
- Use pip to install the tensorflow version that matches your CUDA version
pip install tensorflow==2.3.0
- Test tensorflow; at this point the package imports, but no GPU is detected
python
import tensorflow as tf
tf.test.is_gpu_available()
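Picking the "matching" tensorflow version means matching it against the CUDA toolkit installed earlier. The sketch below encodes a few rows from TensorFlow's published tested build configurations as a lookup table. It is an illustrative subset, not a complete list, and the helper name `compatible_releases` is hypothetical.

```python
# TensorFlow release -> CUDA / cuDNN versions it was built and tested against.
# Only a handful of releases are listed; consult the official table for others.
TESTED_CONFIGS = {
    "2.1.0": {"cuda": "10.1", "cudnn": "7.6"},
    "2.2.0": {"cuda": "10.1", "cudnn": "7.6"},
    "2.3.0": {"cuda": "10.1", "cudnn": "7.6"},
    "2.4.0": {"cuda": "11.0", "cudnn": "8.0"},
}


def compatible_releases(cuda_version):
    """Return the listed TensorFlow releases tested against the given CUDA version."""
    return [
        release
        for release, config in TESTED_CONFIGS.items()
        if config["cuda"] == cuda_version
    ]


if __name__ == "__main__":
    print(compatible_releases("10.1"))
    print(TESTED_CONFIGS["2.3.0"]["cudnn"])
```

For the CUDA 10.1 toolkit installed above, this is why `tensorflow==2.3.0` is a reasonable choice, and it also shows which cuDNN series to download afterwards.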
Install cuDNN inside Docker
- Download and extract the matching cuDNN version, then copy its libraries into the CUDA directory
cp -r -d $path/lib64/* /usr/local/cuda/lib64/
- Test tensorflow again; the GPU is now detected
import tensorflow as tf
tf.config.list_physical_devices('GPU')
- Force CPU use
import os
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
These environment variables must be set before TensorFlow initializes.
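If tensorflow still reports no GPU after the cuDNN copy step, a quick look at the library directory helps narrow things down. This is a small sketch, assuming the default toolkit location `/usr/local/cuda/lib64` used above; the function name `find_cudnn_libs` is hypothetical.

```python
import glob
import os


def find_cudnn_libs(lib_dir="/usr/local/cuda/lib64"):
    """Return the sorted cuDNN shared-library files found in lib_dir."""
    # Matches libcudnn.so as well as versioned names such as libcudnn.so.7
    pattern = os.path.join(lib_dir, "libcudnn*.so*")
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    libs = find_cudnn_libs()
    if libs:
        print("cuDNN libraries found:")
        for path in libs:
            print("  ", path)
    else:
        print("No cuDNN libraries found; repeat the copy step.")
```

An empty result means the `cp` step did not land in the directory that `LD_LIBRARY_PATH` points to.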
TensorFlow examples
Function fitting
import os

import tensorflow as tf


class Linear(tf.keras.Model):
    def __init__(self):
        super().__init__()
        # A single dense unit with no activation is a plain linear model
        self.dense = tf.keras.layers.Dense(
            units=1,
            activation=None,
            kernel_initializer=tf.zeros_initializer(),
            bias_initializer=tf.zeros_initializer()
        )

    def call(self, inputs):
        output = self.dense(inputs)
        return output


def demo_func():
    X = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
    y = tf.constant([[7.0], [8.0]])
    model = Linear()
    optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)
    for i in range(1000):
        with tf.GradientTape() as tape:
            y_pred = model(X)
            loss = tf.reduce_mean(tf.square(y_pred - y))
        # model.variables gives direct access to all variables in the model
        grads = tape.gradient(loss, model.variables)
        optimizer.apply_gradients(grads_and_vars=zip(grads, model.variables))
        if i % 100 == 0:
            print(i, loss.numpy())
    print(model.variables)


if __name__ == "__main__":
    print('TensorFlow version:{}'.format(tf.__version__))
    use_gpu = True
    if use_gpu:
        print('Default to GPU')
        print('GPU Info:{}'.format(tf.config.list_physical_devices('GPU')))
    else:
        print('Set to use CPU')
        # Must run before the first call that queries the GPU
        os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
        os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
        print('GPU Info:{}'.format(tf.config.list_physical_devices('GPU')))
    demo_func()
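To see what the `GradientTape` loop is computing, the same update can be written out by hand. The sketch below is plain Python with no TensorFlow dependency. It uses the same data, zero initialization, learning rate and step count as the example above, with the mean-squared-error gradients derived analytically. The function name `fit_linear` is hypothetical, and this is an illustration of the update rule rather than a replacement for the Keras model.

```python
def fit_linear(X, y, learning_rate=0.01, steps=1000):
    """Fit y ~ X.w + b by gradient descent on the mean squared error."""
    n_samples, n_features = len(X), len(X[0])
    w = [0.0] * n_features  # zero-initialised, like the Dense layer
    b = 0.0
    losses = []
    for _ in range(steps):
        # Prediction error for every sample: (w . x + b) - target
        errors = [
            sum(wj * xj for wj, xj in zip(w, row)) + b - target
            for row, target in zip(X, y)
        ]
        losses.append(sum(e * e for e in errors) / n_samples)
        # d(loss)/dw_j = 2/N * sum_i(error_i * x_ij), d(loss)/db = 2/N * sum_i(error_i)
        grad_w = [
            2.0 / n_samples * sum(e * row[j] for e, row in zip(errors, X))
            for j in range(n_features)
        ]
        grad_b = 2.0 / n_samples * sum(errors)
        # Plain SGD step, as tf.keras.optimizers.SGD does without momentum
        w = [wj - learning_rate * g for wj, g in zip(w, grad_w)]
        b -= learning_rate * grad_b
    return w, b, losses


if __name__ == "__main__":
    w, b, losses = fit_linear([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], [7.0, 8.0])
    print("first loss:", losses[0], "last loss:", losses[-1])
    print("w:", w, "b:", b)
```

With two samples and four parameters the fit is underdetermined, so the loss falls toward zero while the particular `w` and `b` reached depend on the zero starting point.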