Common errors when predicting a single image with a pre-trained ResNet50 in TensorFlow

流萤小扇

Article link: https://blog.csdn.net/CandyLove102130/article/details/107953320

1. ValueError: Cannot feed value of shape (112, 112, 3) for Tensor 'input_images:0', which has shape '(?, 112, 112, 3)'

2. TypeError: The value of a feed cannot be a tf.Tensor object. Acceptable feed values include Python scalars, strings, lists, numpy ndarrays, or TensorHandles. For reference, the tensor object was Tensor("Reshape:0", shape=(3, 112, 112, 1), dtype=float32) which was passed to the feed with key Tensor("input_images:0", shape=(?, 112, 112, 3), dtype=float32).

3. ValueError: Dimension size must be evenly divisible by 12544 but is 172800 for 'Reshape' (op: 'Reshape') with input shapes: [360,480,1], [4] and with input tensors computed as partial shapes: input[1] = [?,112,112,1].

4. ValueError: Cannot feed value of shape (360, 480, 1) for Tensor 'input_images:0', which has shape '(?, 112, 112, 3)'

5. ValueError: Cannot feed value of shape (3, 112, 112, 1) for Tensor 'input_images:0', which has shape '(?, 112, 112, 3)'

6. TypeError: Expected binary or unicode string, got <tf.Tensor 'resize/Squeeze:0' shape=(112, 112, 1) dtype=float32>
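
Errors 1, 4, and 5 all come from the same check: the fed value must have the same number of dimensions as the placeholder, and every dimension must match except the ones declared as `?` (None). The sketch below mimics that check in plain Python. `is_compatible` is a hypothetical helper written for illustration, not a TensorFlow function. The last lines reproduce the arithmetic behind error 3.

```python
def is_compatible(fed_shape, placeholder_shape):
    """Return True if fed_shape could be fed to a placeholder of placeholder_shape.

    None in placeholder_shape plays the role of '?' (any size).
    """
    if len(fed_shape) != len(placeholder_shape):
        return False
    return all(p is None or f == p for f, p in zip(fed_shape, placeholder_shape))


PLACEHOLDER = (None, 112, 112, 3)

print(is_compatible((112, 112, 3), PLACEHOLDER))     # error 1: batch dimension missing
print(is_compatible((360, 480, 1), PLACEHOLDER))     # error 4: no batch dimension, wrong size, one channel
print(is_compatible((3, 112, 112, 1), PLACEHOLDER))  # error 5: last dimension is 1, not 3
print(is_compatible((1, 112, 112, 3), PLACEHOLDER))  # a batch of one image is accepted

# Error 3: a (360, 480, 1) image holds 360 * 480 * 1 values, which cannot be
# split evenly into blocks of 112 * 112 * 1 values, so Reshape refuses.
total = 360 * 480 * 1
block = 112 * 112 * 1
print(total, block, total % block)
```

Only the last shape passes, which is why every working variant in this article ends up feeding an array of shape (1, 112, 112, 3).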

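Compare your own code against the working code below to track down your problem.

Prerequisites:

(1) Download resnet_v1_50.ckpt in advance.

(2) Train on your own images, using transfer learning with ResNet50, to produce a new model.

For the training step, see:

ResNet-V1-50 convolutional neural network transfer learning for classifying different species of flowers

https://www.jianshu.com/p/388c3cf02554

Predicting one or more images: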

# Single image, method 2
image = cv2.imread(pre_path)
image = img_to_array(cv2.resize(image, (INPUT_SIZE, INPUT_SIZE)))
# Wrap the image in a list to add the batch dimension: (112, 112, 3) -> (1, 112, 112, 3)
imgs = np.array([image], dtype="float32")
prob, label = sess.run([probabilities, prediction], feed_dict={input_images: imgs})
print("Predicted probabilities (2):", prob)
print("Predicted label (2):", label)
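
The key step in the snippet above is wrapping the single image in a batch of one. The NumPy-only sketch below uses a zero array as a stand-in for the resized image (an assumption, so that it runs without OpenCV or an image file) and shows three equivalent ways to add the batch dimension.

```python
import numpy as np

INPUT_SIZE = 112

# Stand-in for img_to_array(cv2.resize(image, (INPUT_SIZE, INPUT_SIZE))).
image = np.zeros((INPUT_SIZE, INPUT_SIZE, 3), dtype="float32")

batch_a = np.array([image], dtype="float32")  # wrap in a list, as in method 2
batch_b = np.expand_dims(image, 0)            # insert a new axis at position 0
batch_c = image[np.newaxis, ...]              # the same thing written as indexing

for batch in (batch_a, batch_b, batch_c):
    print(batch.shape, batch.dtype)
```

All three produce an array whose first dimension is the batch size, which is what the `?` in the placeholder shape stands for.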

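The error message for feed_dict says it plainly: Acceptable feed values include Python scalars, strings, lists, numpy ndarrays, or TensorHandles. A tf.Tensor is not on that list, so evaluate it with sess.run first, or build the input with NumPy.

(1) The input size must match the input size used in training.

(2) The value passed to feed_dict must be one of the accepted types.

(3) Use dtype="float32" so that the array matches the dtype of input_images; do not mix float32 and float64.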
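
The three rules can be checked before calling sess.run. The following is a minimal sketch: `prepare_for_feed` is a hypothetical helper, not part of TensorFlow, and it assumes the image has already been decoded and resized into a NumPy array. Repeating a grayscale channel three times is one common choice and is also an assumption here.

```python
import numpy as np


def prepare_for_feed(image, input_size=112):
    """Turn one decoded image into a float32 batch of shape (1, size, size, 3)."""
    if not isinstance(image, np.ndarray):
        # Rule 2: feed_dict wants NumPy data, not a tf.Tensor.
        raise TypeError("expected a numpy ndarray, got %s" % type(image).__name__)
    if image.ndim == 2:
        image = image[..., np.newaxis]       # (H, W) -> (H, W, 1)
    if image.ndim != 3:
        raise ValueError("expected a single image, got shape %s" % (image.shape,))
    if image.shape[2] == 1:
        image = np.repeat(image, 3, axis=2)  # grayscale -> 3 identical channels
    expected = (input_size, input_size, 3)
    if image.shape != expected:
        # Rule 1: the size must match the training input size.
        raise ValueError("expected shape %s, got %s" % (expected, image.shape))
    # Rule 3: cast to float32, then add the batch dimension.
    return image.astype("float32")[np.newaxis, ...]


gray = np.zeros((112, 112, 1), dtype="uint8")
batch = prepare_for_feed(gray)
print(batch.shape, batch.dtype)
```

A (360, 480, 1) array, as in error 4, would be rejected by the size check, which is the cue to resize the image before feeding it.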

The complete code:

# -*- coding: UTF-8 -*-
import numpy as np
import tensorflow as tf
import tensorflow.contrib.slim as slim
# Load the resnet_v1 model defined with slim
import tensorflow.contrib.slim.python.slim.nets.resnet_v1 as resnet_v1
from keras.preprocessing.image import img_to_array
import cv2
from imutils import paths

# Where the trained model is saved
TRAIN_FILE = "D:/resource/ckpt/my_model_resnet50_test"
# The provided pre-trained model
CKPT_FILE = "D:/resource/resnet_v1_50.ckpt"

# Training parameters
LEARNING_RATE = 0.0001
STEPS = 4
BATCH = 32
N_CLASSES = 10
INPUT_SIZE = 112
# Parameters that are not loaded from the pre-trained model: the custom final fully connected layer
CHECKPOINT_EXCLUDE_SCOPES = 'Logits'
# Mark the final fully connected layer as trainable
TRAINABLE_SCOPES = 'Logits'

def main():

    # Define the input format
    input_images = tf.placeholder(tf.float32, [None, INPUT_SIZE, INPUT_SIZE, 3], name='input_images')
    labels = tf.placeholder(tf.int64, [None], name='labels')

    # Define the model. The checkpoint holds only parameters, not the model, so the structure must be declared here
    with slim.arg_scope(resnet_v1.resnet_arg_scope()):
        # num_classes=None disables the built-in output layer, so this returns the pooled features.
        # is_training=False because this script only predicts (batch norm then uses its moving averages).
        logits, _ = resnet_v1.resnet_v1_50(input_images, num_classes=None, is_training=False)

    # Custom output layer
    with tf.variable_scope("Logits"):
        # Drop the two spatial dimensions of size 1: [batch, 1, 1, 2048] becomes [batch, 2048]
        net = tf.squeeze(logits, axis=[1, 2])
        # Dropout layer; is_training=False makes it a pass-through at prediction time
        net = slim.dropout(net, keep_prob=0.5, is_training=False, scope='dropout_scope')
        # Fully connected layer with the final output size
        logits = slim.fully_connected(net, num_outputs=N_CLASSES, scope='fc')

    # Define the prediction ops
    with tf.name_scope('prediction'):
        probabilities = tf.nn.softmax(logits)
        # Index of the most probable label
        prediction = tf.argmax(logits, 1)

    saver = tf.train.Saver()
    with tf.Session() as sess:
        saver.restore(sess, tf.train.latest_checkpoint('D:/resource/ckpt'))  # load the model variables
        pre_path = 'D:/resource/test/0000001.jpg'
        # Single image, method 1
        # Read the raw bytes of the image to predict
        image_data = tf.gfile.FastGFile(pre_path, 'rb').read()
        # Decode into a height x width x 3 tensor; the file is a .jpg, so use decode_jpeg
        decode_image = tf.image.decode_jpeg(image_data, channels=3)
        # Convert to float32 (this also scales the values to [0, 1])
        image = tf.image.convert_image_dtype(decode_image, tf.float32)
        # Resize to the input size
        image = tf.image.resize_images(image, [INPUT_SIZE, INPUT_SIZE])
        # Evaluate the tensor to a NumPy array; a tf.Tensor cannot be fed directly
        image = sess.run(image)
        image1 = np.array([image], dtype="float32")
        prob, label = sess.run([probabilities, prediction], feed_dict={input_images: image1})
        print("Predicted probabilities (1):", prob)
        print("Predicted label (1):", label)

        # Single image, method 2
        # Note: cv2.imread returns BGR values in [0, 255], unlike method 1 (RGB in [0, 1]);
        # use whichever preprocessing matches the one used in training
        image = cv2.imread(pre_path)
        image = img_to_array(cv2.resize(image, (INPUT_SIZE, INPUT_SIZE)))
        imgs = np.array([image], dtype="float32")
        prob, label = sess.run([probabilities, prediction], feed_dict={input_images: imgs})
        print("Predicted probabilities (2):", prob)
        print("Predicted label (2):", label)

        # Single image, method 3
        image = cv2.imread(pre_path)
        image = img_to_array(cv2.resize(image, (INPUT_SIZE, INPUT_SIZE)))
        image = np.expand_dims(image, 0)
        prob, label = sess.run([probabilities, prediction], feed_dict={input_images: image})
        print("Predicted probabilities (3):", prob)
        print("Predicted label (3):", label)

        # Multiple images
        images = load_predict_img("D:/resource/test", INPUT_SIZE)
        prob, label = sess.run([probabilities, prediction], feed_dict={input_images: images})
        print("Predicted probabilities (4):", prob)
        print("Predicted labels (4):", label)

        # Single image taken from the loaded batch
        images = load_predict_img("D:/resource/test", INPUT_SIZE)
        image = images[0]
        image = np.expand_dims(image, 0)
        prob, label = sess.run([probabilities, prediction], feed_dict={input_images: image})
        print("Predicted probabilities (5):", prob)
        print("Predicted label (5):", label)


def load_predict_img(path, INPUT_SIZE):
    # Load every image under path, resized to INPUT_SIZE x INPUT_SIZE, as one float32 batch
    data = []
    image_paths = sorted(list(paths.list_images(path)))
    for imagePath in image_paths:
        image = cv2.imread(imagePath)
        image = img_to_array(cv2.resize(image, (INPUT_SIZE, INPUT_SIZE)))
        data.append(image)
    return np.array(data, dtype="float32")


if __name__ == '__main__':
    main()
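
The prediction ops in the code reduce to a softmax and an argmax over the class axis. The NumPy sketch below reproduces that calculation on made-up logits (the numbers are illustrative, not model output) to show that the label taken from the logits is also the index of the largest probability.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


# Made-up logits for a batch of 2 images and 3 classes.
logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 0.2, 3.0]], dtype="float32")

prob = softmax(logits)
label = np.argmax(logits, axis=1)

print(prob.sum(axis=1))         # each row sums to 1, up to rounding
print(label)                    # index of the most likely class per image
print(np.argmax(prob, axis=1))  # same indices, because softmax preserves order
```

This is why `prob` has one row per fed image and `label` has one entry per fed image, even when only a single image is predicted.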

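Results:

The model I tested was trained for only 4 iterations, so its accuracy is low. Training for more than 100 iterations should reach above 90% without trouble.

This is only a quick walkthrough, so treat it as a reference.

I am new to this, and predicting a single image left me thoroughly confused. I will write more when I have time.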
