TensorFlow Notes (Part 1)

Lee_Wei4939

Category: Artificial Intelligence

Article link: https://blog.csdn.net/liwei1205/article/details/99327980

tf_cnn_benchmarks run commands

--device=cpu --data_format=NHWC works for me.
I would suggest that --device=cpu --data_format=NHWC should be the default when --num_gpus option is 0 or is not there.

CPU run command: python tf_cnn_benchmarks.py --batch_size=32 --model=resnet50 --variable_update=parameter_server --device=cpu --data_format=NHWC --use_fp16=True --fp16_vars=True

GPU run command: python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=32 --model=inception3 --variable_update=parameter_server
Values used for --model include resnet50, inception3, vgg16, and alexnet.
Installing Chocolatey (Windows PowerShell): Set-ExecutionPolicy Bypass -Scope Process -Force; iex ((New-Object System.Net.WebClient).DownloadString('https://chocolatey.org/install.ps1'))
Multi-GPU CIFAR-10 training: python cifar10_multi_gpu_train.py --num_gpus=2

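import tensorflow as tf  # TensorFlow 1.x API

config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # allocate GPU memory on demand
# Alternatively, allocate a fixed fraction of GPU memory
# config.gpu_options.per_process_gpu_memory_fraction = 0.4
sess = tf.Session(config=config)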

Using Google Drive in Colab

!apt-get install -y -qq software-properties-common python-software-properties module-init-tools
!add-apt-repository -y ppa:alessandro-strada/ppa 2>&1 > /dev/null
!apt-get update -qq 2>&1 > /dev/null
!apt-get -y install -qq google-drive-ocamlfuse fuse
from google.colab import auth
auth.authenticate_user()
from oauth2client.client import GoogleCredentials
creds = GoogleCredentials.get_application_default()
import getpass
!google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret} < /dev/null 2>&1 | grep URL
vcode = getpass.getpass()
!echo {vcode} | google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret}

Mounting Drive files in Colab

from google.colab import drive
drive.mount('/content/drive/')

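Differences between the GPU and CPU commands

Why did TensorFlow choose NHWC as its default format? Early development was CPU-based, and on a CPU NHWC is slightly faster than NCHW (NHWC has better locality, so the cache is used more effectively).
NCHW is the default format of Nvidia cuDNN, so NCHW is faster when a GPU is used for acceleration.
When designing a network, take both formats into account and ideally make them easy to switch: use NCHW when training on a GPU and NHWC when running prediction on a CPU.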
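
The locality argument can be made concrete with a small sketch that is not part of the original notes. For a tensor of logical shape (N, H, W, C), it computes the flat memory offset of one element under each layout. The function names offset_nhwc and offset_nchw are made up for illustration. In NHWC the channels of one pixel sit next to each other in memory, while in NCHW they are a full H*W plane apart.

```python
def offset_nhwc(n, h, w, c, H, W, C):
    """Flat index of element (n, h, w, c) when memory is laid out as NHWC."""
    return ((n * H + h) * W + w) * C + c


def offset_nchw(n, h, w, c, H, W, C):
    """Flat index of the same element when memory is laid out as NCHW."""
    return ((n * C + c) * H + h) * W + w


# Example: a 224x224 RGB image. Compare the distance in memory between
# channel 0 and channel 1 of the same pixel under each layout.
H, W, C = 224, 224, 3
stride_nhwc = offset_nhwc(0, 5, 7, 1, H, W, C) - offset_nhwc(0, 5, 7, 0, H, W, C)
stride_nchw = offset_nchw(0, 5, 7, 1, H, W, C) - offset_nchw(0, 5, 7, 0, H, W, C)
print(stride_nhwc)  # 1: the channels of a pixel are contiguous
print(stride_nchw)  # 50176 (= 224 * 224): the channels are one plane apart
```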

Hiding GPUs

CUDA_VISIBLE_DEVICES=""
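
The same effect can be sketched from Python by setting the variable before the framework is imported. This is a minimal sketch, assuming the CUDA runtime reads CUDA_VISIBLE_DEVICES when it is first initialised, so the assignment has to happen before TensorFlow or PyTorch is imported. The helper name hide_gpus is hypothetical.

```python
import os


def hide_gpus():
    """Make all CUDA devices invisible to libraries imported afterwards."""
    os.environ["CUDA_VISIBLE_DEVICES"] = ""


hide_gpus()
# import tensorflow as tf  # import the framework only after the variable is set
print(repr(os.environ["CUDA_VISIBLE_DEVICES"]))  # ''
```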

Monitoring GPUs

watch --color -n1 gpustat -cpu
nvidia-smi -l
nvidia-smi topo -m    # show the GPU topology
nvidia-smi dmon       # device monitoring: prints GPU statistics as scrolling output
Get the GPU temperature:
nvidia-smi -q | grep "GPU Current Temp" | cut -d ':' -f 2 | cut -d ' ' -f 2
nvidia-smi -q | grep "Gpu"
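
The grep/cut pipeline for the temperature can also be sketched in Python. The function name parse_gpu_temps is hypothetical, and the sample text is an illustrative excerpt in the style of nvidia-smi -q output, not a real measurement.

```python
import re


def parse_gpu_temps(nvidia_smi_q_output):
    """Return the 'GPU Current Temp' values (in degrees C) found in the text."""
    pattern = re.compile(r"GPU Current Temp\s*:\s*(\d+)\s*C")
    return [int(m.group(1)) for m in pattern.finditer(nvidia_smi_q_output)]


# Illustrative excerpt, not real measurements.
sample = """
    Temperature
        GPU Current Temp                  : 45 C
        GPU Shutdown Temp                 : 96 C
"""
print(parse_gpu_temps(sample))  # [45]
```

On a machine with the Nvidia driver installed, the input could come from subprocess.check_output(["nvidia-smi", "-q"], text=True).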

Using tf.Session

# FLAGS is assumed to be defined elsewhere (e.g. via tf.app.flags)
config = tf.ConfigProto(allow_soft_placement=True,
                        log_device_placement=FLAGS.log_device_placement)
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)

config = tf.ConfigProto()
config.allow_soft_placement = True
config.gpu_options.allow_growth=True

Finding the location of the Python interpreter on Linux

import sys
sys.executable

which python3.6
whereis python
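
Both lookups can be combined in one short script. This is a minimal sketch using only the standard library; "python3" is just an example executable name.

```python
import shutil
import sys

# Absolute path of the interpreter that is running this script.
print(sys.executable)

# First "python3" found on PATH, or None if there is none (similar to `which python3`).
print(shutil.which("python3"))
```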

tf API

tf.get_collection
tf.variable_scope
tf.train.Saver
tf.gfile.Exists
tf.keras.utils.multi_gpu_model
tf.strided_slice
tf.train.shuffle_batch
tf.train.batch
tf.train.string_input_producer
tf.nn.l2_loss
add_to_collection
variable_scope
get_shape().as_list()
compute_gradients
tf.contrib.slim.prefetch_queue.prefetch_queue
name_scope
tf.get_variable_scope().reuse_variables()
tf.group
tf.train.start_queue_runners
apply_gradients
tf.contrib.framework.local_variable
tf.nn.in_top_k
@contextlib.contextmanager
tf.variance_scaling_initializer
tf.nn.fused_batch_norm
average_pooling2d
tf.data.experimental.parallel_interleave
tf.train.slice_input_producer
tf.parallel_stack
tf.nn.embedding_lookup

API notes

tf.name_scope() mainly manages namespaces, which keeps the whole model better organized. tf.variable_scope() exists to enable variable sharing, which it implements together with tf.get_variable().
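
The difference can be sketched with the TensorFlow 1.x graph API. This is a minimal sketch, assuming a TensorFlow installation that provides tensorflow.compat.v1 and a fresh default graph. The scope and variable names ("ns", "vs", "a", "w") are arbitrary.

```python
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # graph mode, as in TensorFlow 1.x

with tf.name_scope("ns"):
    a = tf.get_variable("a", shape=[1])  # get_variable ignores the name scope
    b = tf.add(a, 1.0, name="b")         # ordinary ops pick up the name scope

with tf.variable_scope("vs"):
    w = tf.get_variable("w", shape=[1])  # creates the variable vs/w

with tf.variable_scope("vs", reuse=True):
    w_shared = tf.get_variable("w", shape=[1])  # returns the existing vs/w

print(a.name)         # a:0
print(b.name)         # ns/b:0
print(w.name)         # vs/w:0
print(w_shared is w)  # True
```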

Python

namedtuple
defaultdict(lambda: 0)
sys._getframe().f_lineno  # get the current line number
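
A short sketch of the three items above, using only the standard library. The names Point and counts are illustrative.

```python
import sys
from collections import defaultdict, namedtuple

# namedtuple: a lightweight record type with named fields.
Point = namedtuple("Point", ["x", "y"])
p = Point(x=1, y=2)
print(p.x + p.y)  # 3

# defaultdict(lambda: 0): missing keys start at 0, which is handy for counting.
counts = defaultdict(lambda: 0)
for word in ["gpu", "cpu", "gpu"]:
    counts[word] += 1
print(counts["gpu"], counts["tpu"])  # 2 0

# Line number of the statement being executed (CPython-specific helper).
print(sys._getframe().f_lineno)
```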

PyTorch

DataParallel
DistributedDataParallel

multi_gpu_lstm

class DeviceCellWrapper(tf.nn.rnn_cell.RNNCell):
  def __init__(self, cell, device):
    self._cell = cell
    self._device = device

  @property
  def state_size(self):
    return self._cell.state_size

  @property
  def output_size(self):
    return self._cell.output_size

  def __call__(self, inputs, state, scope=None):
    with tf.device(self._device):
      return self._cell(inputs, state, scope)

# n_neurons and X are assumed to be defined earlier.
cell_fw = DeviceCellWrapper(cell=tf.nn.rnn_cell.LSTMCell(num_units=n_neurons, state_is_tuple=False), device='/gpu:0')
# Place the backward cell on a second GPU so the two directions run on different devices.
cell_bw = DeviceCellWrapper(cell=tf.nn.rnn_cell.LSTMCell(num_units=n_neurons, state_is_tuple=False), device='/gpu:1')
outputs, states = tf.nn.bidirectional_dynamic_rnn(cell_fw, cell_bw, X, dtype=tf.float32)

https://stackoverflow.com/questions/47762902/how-to-speed-up-the-training-of-an-rnn-model-with-multiple-gpus-in-tensorflow?rq=1

multi_gpu_model

https://keras.io/zh/utils/

keras.utils.multi_gpu_model(model, gpus=None, cpu_merge=True, cpu_relocation=False)

import tensorflow as tf
from keras.applications import Xception
from keras.utils import multi_gpu_model
import numpy as np

num_samples = 1000
height = 224
width = 224
num_classes = 1000

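# Instantiate the base model (or "template" model).
# We recommend doing this under a CPU device scope,
# so that the model's weights are hosted in CPU memory.
# Otherwise they may end up hosted on a GPU, which would
# complicate weight sharing.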
with tf.device('/cpu:0'):
    model = Xception(weights=None,
                     input_shape=(height, width, 3),
                     classes=num_classes)

# Replicate the model on 8 GPUs.
# This assumes that your machine has 8 available GPUs.
parallel_model = multi_gpu_model(model, gpus=8)
parallel_model.compile(loss='categorical_crossentropy',
                       optimizer='rmsprop')

# Generate dummy data
x = np.random.random((num_samples, height, width, 3))
y = np.random.random((num_samples, num_classes))

# This `fit` call will be distributed over the 8 GPUs.
# Since the batch size is 256, each GPU will process 32 samples.
parallel_model.fit(x, y, epochs=20, batch_size=256)

# Save the model via the template model (which shares the same weights):
model.save('my_model.h5')
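
The per-GPU figure in the comment on the `fit` call is plain arithmetic: 256 / 8 = 32. A minimal sketch of that split follows. split_batch is a hypothetical helper, and giving any remainder to the last device is an assumption of this sketch, not a statement about Keras internals.

```python
def split_batch(batch_size, num_gpus):
    """Sizes of the per-GPU sub-batches for one global batch."""
    step = batch_size // num_gpus
    sizes = [step] * num_gpus
    # Any remainder goes to the last device (an assumption of this sketch).
    sizes[-1] += batch_size - step * num_gpus
    return sizes


print(split_batch(256, 8))  # [32, 32, 32, 32, 32, 32, 32, 32]
```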

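Quick installation of the GPU versions of TensorFlow and PyTorch

Install the GPU driver, then install Anaconda.
Create a conda virtual environment with the GPU build:
conda create -n tensorflow_env tensorflow-gpu
Activate (enter) the environment:
conda activate tensorflow_env
Install the GPU version of Keras, which also installs the GPU version of TensorFlow by default:
conda install keras-gpu
Install the GPU version of PyTorch:
conda install pytorch torchvision cudatoolkit=10.0 -c pytorch
Note: choose the cudatoolkit version to match your graphics card.
Install the CPU version of PyTorch:
conda install pytorch-cpu torchvision-cpu -c pytorch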

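Migrating a conda environment
conda env export > environment.yaml saves the packages to a YAML file. The first part of the command, conda env export, outputs the names of all packages in the environment (including the Python version).

Create a new virtual environment from the YAML file:
conda env create -n tf_gpu_new -f tf_gpu_env.yaml

Use conda env list to list all the environments you have created.

To install Jupyter Notebook in a conda environment, use conda install jupyter notebook.

Jupyter ships with a utility called nbconvert that converts notebooks to HTML, Markdown, slides, and other formats.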

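Using a remote Python interpreter with Jupyter Notebook

Server: jupyter notebook --ip=0.0.0.0 --port=1111 --allow-root (--no-browser)
Client: ssh -N -f -L localhost:1112:localhost:1111 username@serverIP
Then open http://localhost:1112 in the browser on the client.

Possible solutions when the loss becomes NaN during TensorFlow training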