Deep learning: a roundup of common networks and errors (VGG training, TensorFlow version compatibility, model saving and loading, very large loss, UnimplementedError: Graph execution error)

Article link: https://blog.csdn.net/weixin_54227557/article/details/126408974

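xavier_initializer
Replace with
tf.keras.initializers.GlorotUniform()
(tf.contrib's xavier_initializer defaults to the uniform variant; use tf.keras.initializers.GlorotNormal() if the original call passed uniform=False.)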

slim = tf.contrib.slim
Replace with
pip install --upgrade tf_slim
import tf_slim as slim

AttributeError: module 'tensorflow' has no attribute 'placeholder'
Disable eager execution, then use the compat.v1 placeholder:
tf.compat.v1.disable_eager_execution()
tf.compat.v1.placeholder

AttributeError: module 'tensorflow._api.v2.train' has no attribute 'exponential_decay'
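In TensorFlow 2, tf.train no longer exposes the TF1 API; use tf.compat.v1.train instead, for example tf.compat.v1.train.exponential_decay. The same applies to the optimizer. Original:
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize
Change to
optimizer = tf.compat.v1.train.GradientDescentOptimizer(learning_rate).minimize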

AttributeError: module 'tensorboard.summary._tf.summary' has no attribute 'merge_all'
Change to
tf.compat.v1.summary.merge_all()

The imagenet_mini dataset is available from my cloud drive on request.

module 'tensorflow' has no attribute 'Session'
Change to
tf.compat.v1.Session()

module 'tensorflow' has no attribute 'global_variables_initializer'
Replace with
tf.compat.v1.global_variables_initializer()

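The following error appears:

ValueError: Cannot feed value of shape (128, 100) for Tensor labels:0, which has shape (None, 1000)
Fix: the labels being fed have 100 classes, while the labels placeholder expects 1000. Make the two agree, for example by building the placeholder and the model output for 100 classes. (The original screenshot of the fix is not reproduced here.)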
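
The shape mismatch in the error above can be checked without TensorFlow. The following is a minimal sketch, assuming the labels start as integer class ids and the dataset really has 100 classes; `to_one_hot`, `NUM_CLASSES`, and `placeholder_shape` are hypothetical names used only for illustration.

```python
import numpy as np

NUM_CLASSES = 100  # assumption: the dataset has 100 classes, not 1000
BATCH_SIZE = 128


def to_one_hot(class_ids, num_classes):
    """Hypothetical helper: turn integer class ids into one-hot rows."""
    class_ids = np.asarray(class_ids)
    return np.eye(num_classes, dtype=np.float32)[class_ids]


rng = np.random.default_rng(0)
class_ids = rng.integers(0, NUM_CLASSES, size=BATCH_SIZE)  # made-up labels
labels = to_one_hot(class_ids, NUM_CLASSES)

# Stand-in for the shape of the labels placeholder; None means "any batch size".
placeholder_shape = (None, NUM_CLASSES)

# A feed is compatible when the ranks match and every non-None dimension agrees.
compatible = len(placeholder_shape) == labels.ndim and all(
    p is None or p == s for p, s in zip(placeholder_shape, labels.shape)
)
print(labels.shape, compatible)
```

If the placeholder were declared as (None, 1000) instead, the same check would report an incompatible feed, which is the situation the ValueError describes.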

ValueError: cannot reshape array of size 19267584 into shape (12800)
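The array size does not match the target shape. Factor the element count to find a compatible shape: here 19267584 = 128 × 224 × 224 × 3, which suggests a batch of 128 images of 224 × 224 × 3 and cannot fit into a shape with 12800 elements.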
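
To see which target shapes are possible, it helps to divide the element count by the size of one sample. This is a small sketch, assuming the data are VGG-style 224 × 224 RGB images; `image_shape` is that assumption, not something stated in the error message.

```python
import numpy as np

total = 19267584             # element count from the error message
image_shape = (224, 224, 3)  # assumption: VGG-style RGB input

per_image = int(np.prod(image_shape))
batch, remainder = divmod(total, per_image)
print(batch, remainder)      # remainder 0 means the shape fits exactly

# uint8 keeps this demo array small; -1 lets NumPy infer the batch dimension.
flat = np.zeros(total, dtype=np.uint8)
images = flat.reshape(-1, *image_shape)
print(images.shape)
```

A nonzero remainder would mean the assumed per-sample shape is wrong, or that the array was truncated or concatenated incorrectly upstream.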

module 'keras.optimizers' has no attribute 'RMSprop'
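Fix:
from tensorflow.keras import optimizers
opt = optimizers.RMSprop(learning_rate=0.0001)  # lr= is deprecated in favor of learning_rate=; the original decay=1e-6 is only accepted by older optimizer versions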

RuntimeError: tf.placeholder() is not compatible with eager execution.
Fix
tf.compat.v1.disable_eager_execution()

IOPub data rate exceeded.
The Jupyter server will temporarily stop sending output
to the client in order to avoid crashing it.
To change this limit, set the config variable
`--ServerApp.iopub_data_rate_limit`.

Current values:
ServerApp.iopub_data_rate_limit=1000000.0 (bytes/sec)
ServerApp.rate_limit_window=3.0 (secs)
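Fix

jupyter notebook --generate-config

Then raise the limit in the generated config file, or pass it on the command line as the message suggests, for example (the value here is only an example):
jupyter notebook --ServerApp.iopub_data_rate_limit=1.0e10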

Found input variables with inconsistent numbers of samples: [448, 6400]
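Fix:
Something went wrong upstream: the number of samples (448) and the number of labels (6400) differ. Check how the features and labels were built.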
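
A quick length check before splitting or fitting makes this kind of mismatch visible early. This is a minimal sketch; `check_same_length` is a hypothetical helper, and the arrays are made up to mirror the 448 and 6400 in the message.

```python
import numpy as np


def check_same_length(X, y):
    """Hypothetical helper: fail early if features and labels disagree in length."""
    n_x, n_y = len(X), len(y)
    if n_x != n_y:
        raise ValueError(f"X has {n_x} samples but y has {n_y} labels")
    return n_x


X = np.zeros((448, 10))  # made-up features: 448 samples
y = np.zeros(6400)       # made-up labels: 6400 entries

try:
    check_same_length(X, y)
except ValueError as e:
    message = str(e)
    print(message)
```

Printing X.shape and y.shape right after loading often points to the cause, such as labels that were flattened or features that were reshaped along the wrong axis.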

UnimplementedError: Graph execution error:

Detected at node 'mean_squared_error/Cast' defined at (most recent call last):

2 root error(s) found.
  (0) UNIMPLEMENTED:  Cast string to float is not supported
	 [[{{node mean_squared_error/Cast}}]]
  (1) CANCELLED:  Function was cancelled before it was started
0 successful operations.
0 derived errors ignored. [Op:__inference_train_function_690]
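Fix:
The keyword to look for is float: the labels are strings and cannot be cast inside the graph,
so a type conversion is needed beforehand: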
Y_train = Y_train.astype(float)
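
The conversion can be checked on its own with NumPy. This is a small sketch with made-up string labels; the real `Y_train` would come from your own data loading code.

```python
import numpy as np

# Made-up labels that were read as strings, e.g. from a CSV file.
Y_train = np.array(["1.5", "2", "3.25"])
print(Y_train.dtype.kind)  # "U" means a unicode string dtype

# Convert to floating point before passing the labels to model.fit.
Y_float = Y_train.astype(float)
print(Y_float.dtype, Y_float)
```

If any entry is not a valid number, astype(float) raises a ValueError, which helps locate bad rows before training starts.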

The initial loss may be very large; the activation function may be the cause.

net.load_weights("alexnet_weights.h5")
AttributeError: 'str' object has no attribute 'decode'
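Fix
pip install "h5py<3.0.0"
(This error typically appears when h5py 3.x is used with an older Keras/TensorFlow; downgrading h5py or upgrading TensorFlow should resolve it.)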

Force CPU

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"  # set before importing tensorflow; keep this line to force CPU, comment it out to use the GPU

InternalError: Failed copying input tensor from /job:localhost/replica:0/task:0/device:CPU:0 to /job:localhost/replica:0/task:0/device:GPU:0 in order to run _EagerConst: Dst tensor is not initialized.
This most likely means there is not enough memory (GPU memory, since the destination is GPU:0).

ValueError: `labels.shape` must equal `logits.shape` except for the last dimension. Received: labels.shape=(4,) and logits.shape=(1, 4)
Fix
With loss = 'sparse_categorical_crossentropy', the labels must not be one-hot encoded (use integer class ids);
once the labels are one-hot encoded, switch to loss = 'categorical_crossentropy' and it runs.
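
The relationship between the two label formats can be sketched with NumPy alone. `pick_loss` is a hypothetical helper, and it assumes integer labels are 1-D and one-hot labels are 2-D; the loss names are the Keras strings quoted above.

```python
import numpy as np


def pick_loss(labels):
    """Hypothetical helper: choose the Keras loss name from the label layout.

    Assumes 1-D arrays hold integer class ids and 2-D arrays hold one-hot rows.
    """
    labels = np.asarray(labels)
    if labels.ndim == 1:
        return "sparse_categorical_crossentropy"
    return "categorical_crossentropy"


class_ids = np.array([0, 2, 1, 3])                  # integer labels, shape (4,)
one_hot = np.eye(4, dtype=np.float32)[class_ids]    # one-hot labels, shape (4, 4)
recovered = one_hot.argmax(axis=1)                  # back to integer labels

print(pick_loss(class_ids), pick_loss(one_hot))
```

Either direction works: convert one-hot labels back with argmax and keep the sparse loss, or keep the one-hot labels and switch the loss.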

Memory growth (allocate GPU memory on demand)

import tensorflow as tf

gpu_list = tf.config.experimental.list_physical_devices('GPU')
if len(gpu_list) > 0:
    for gpu in gpu_list:
        try:
            # Enable growth on every GPU; to configure a single GPU, index into gpu_list instead of looping
            tf.config.experimental.set_memory_growth(gpu, True)
        except RuntimeError as e:
            # Raised if the GPUs were already initialized before this call
            print(e)
else:
    print("Got no GPUs")

memory_limit

import tensorflow as tf

using_gpu_index = 0  # index of the GPU to use
gpu_list = tf.config.experimental.list_physical_devices('GPU')
if len(gpu_list) > 0:
    try:
        # Cap this GPU at 2048 MB (memory_limit is given in MB)
        tf.config.experimental.set_virtual_device_configuration(
            gpu_list[using_gpu_index],
            [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=2048)]
        )
    except RuntimeError as e:
        # Raised if the GPU was already initialized before this call
        print(e)
else:
    print("Got no GPUs")