Image Classification with tf-slim's Pretrained ResNet V1 152 and ResNet V2 152 Models

This post uses tf-slim's pretrained ResNet V1 152 and ResNet V2 152 models for image classification, and along the way examines slim's scope naming and related details.

There is not much tf-slim documentation; while implementing this, refer mostly to the official source code: https://github.com/tensorflow/models/tree/master/research/slim
Note that the preprocessing for ResNet V2 is slightly different: the input size is 299 rather than 224.
ResNet V2 152
(tf-slim: ResNet V2 models use Inception pre-processing and input image size of 299 (use --preprocessing_name inception --eval_image_size 299 when using eval_image_classifier.py). Performance numbers for ResNet V2 models are reported on the ImageNet validation set.)
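For reference, an example invocation of eval_image_classifier.py with those two flags might look like the following; the checkpoint and dataset paths are placeholders, so adjust them to your own setup:

python eval_image_classifier.py \
    --checkpoint_path=resnet_v2_152.ckpt \
    --dataset_name=imagenet \
    --dataset_split_name=validation \
    --dataset_dir=/path/to/imagenet \
    --model_name=resnet_v2_152 \
    --preprocessing_name=inception \
    --eval_image_size=299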

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Fri Sep 29 16:25:16 2017

@author: wayne


We are on TF 1.2; the latest TF 1.3 code lives at
https://github.com/tensorflow/models/tree/master/research/slim

http://geek.csdn.net/news/detail/126133
How to do image classification and segmentation with TensorFlow and TF-Slim

https://www.2cto.com/kf/201706/649266.html
[TensorFlow] Auxiliary tools: an introduction to TensorFlow Slim (TF-Slim)

https://stackoverflow.com/questions/39582703/using-pre-trained-inception-resnet-v2-with-tensorflow
The Inception networks expect the input image to have color channels scaled from [-1, 1]. As seen here.
You could either use the existing preprocessing, or in your example just scale the images yourself: im = 2*(im/255.0)-1.0 before feeding them to the network.
Without scaling, the input [0-255] is much larger than the network expects and the biases all work to very strongly predict category 918 (comic books).

Implementing ResNet in TensorFlow (timing the forward pass of the ResNet 152 architecture)
http://blog.csdn.net/superman_xxx/article/details/65452735
ResNet fundamentals and their implementation in TF-Slim
http://www.jianshu.com/p/3af06422c768
"""

import tensorflow as tf
slim = tf.contrib.slim
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
import imagenet  # note: use this file from the latest TF models repo, otherwise the download URL inside it is wrong

from inception_resnet_v2 import *
from resnet_v1 import *
from resnet_v2 import *

import inception_preprocessing
import vgg_preprocessing


'''
inception_resnet_v2
  Returns:
    tensor_out: output tensor corresponding to the final_endpoint.
    end_points: a set of activations for external use, for example summaries or
                losses.
'''
tf.reset_default_graph()

checkpoint_file = 'inception_resnet_v2_2016_08_30.ckpt'
image = tf.image.decode_jpeg(tf.read_file('dog.jpeg'), channels=3) #['dog.jpg', 'panda.jpg']

image_size = inception_resnet_v2.default_image_size #  299

'''This function performs cropping, resizing, normalization, etc.'''
processed_image = inception_preprocessing.preprocess_image(image,
                                                           image_size,
                                                           image_size,
                                                           is_training=False)
processed_images = tf.expand_dims(processed_image, 0)

'''Creates the Inception Resnet V2 model.'''
arg_scope = inception_resnet_v2_arg_scope()
with slim.arg_scope(arg_scope):
  logits, end_points = inception_resnet_v2(processed_images, is_training=False)   

probabilities = tf.nn.softmax(logits)

saver = tf.train.Saver()


with tf.Session() as sess:
    saver.restore(sess, checkpoint_file)

    #predict_values, logit_values = sess.run([end_points['Predictions'], logits])
    logits2, image2, network_inputs, probabilities2 = sess.run([logits,
                                                                image,
                                                                processed_images,
                                                                probabilities])

    print(logits2)  
    print(logits2.shape) #(1, 1001)

    print(network_inputs.shape)
    print(probabilities2.shape)
    probabilities2 = probabilities2[0,:]
    sorted_inds = [i[0] for i in sorted(enumerate(-probabilities2),
                                        key=lambda x:x[1])]    


# Display the original image
plt.figure()
plt.imshow(image2)
plt.suptitle("Original image", fontsize=14, fontweight='bold')
plt.axis('off')
plt.show()

# Display the image that is actually fed to the network
plt.imshow(network_inputs[0,:,:,:])
plt.suptitle("Resized, Cropped and Mean-Centered inputs to network",
             fontsize=14, fontweight='bold')
plt.axis('off')
plt.show()
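# Since inception preprocessing scales pixels to [-1, 1] (see the Stack
# Overflow note in the header), matplotlib clips these floats when plotting.
# A hedged alternative: rescale back to [0, 1] for a faithful display.
plt.imshow((network_inputs[0, :, :, :] + 1.0) / 2.0)
plt.suptitle("Network input rescaled to [0, 1]", fontsize=14, fontweight='bold')
plt.axis('off')
plt.show()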

names = imagenet.create_readable_names_for_imagenet_labels()
for i in range(5):
    index = sorted_inds[i]
    print(index)
    # Print the top-5 predicted classes and their probabilities. The 1001-way
    # logits already include the background class at index 0, so index maps
    # directly into names (no +1 offset is needed).
    print('Probability %0.2f => [%s]' % (probabilities2[index], names[index]))





'''https://github.com/tensorflow/models/blob/master/research/slim/train_image_classifier.py'''
def _get_variables_to_train():
    """Returns a list of variables to train.

    Returns:
      A list of variables to train by the optimizer.
    """
    # Hard-coded here; in train_image_classifier.py this comes from a FLAG.
    trainable_scopes = 'InceptionResnetV2/Logits,InceptionResnetV2/AuxLogits'

    if trainable_scopes is None:
        return tf.trainable_variables()
    else:
        scopes = [scope.strip() for scope in trainable_scopes.split(',')]

    variables_to_train = []
    for scope in scopes:
        variables = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope)
        variables_to_train.extend(variables)
    return variables_to_train
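# A minimal sketch of how this list is typically consumed when fine-tuning:
# only the selected scopes receive gradients. The loss below is a dummy
# stand-in (sum of the logits), just to make the wiring concrete.
dummy_loss = tf.reduce_sum(logits)
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001)
train_op = optimizer.minimize(dummy_loss, var_list=_get_variables_to_train())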

'''
Some tests on the inception_resnet_v2 variables; these are very useful for
understanding the model code and for transfer learning.
'''
exx = tf.trainable_variables()
print(type(exx))  # <class 'list'>
print(exx[0])     # the first trainable variable
for i in range(1, 11):
    print(exx[-i])  # the last ten trainable variables, one per line

print('###############################################################')
variables_to_train = _get_variables_to_train()
print(variables_to_train)

print('###############################################################')
exclude = ['InceptionResnetV2/Logits', 'InceptionResnetV2/AuxLogits']
variables_to_restore = slim.get_variables_to_restore(exclude=exclude)
print(variables_to_restore[0])
print(variables_to_restore[-1])

print('###############################################################')
exclude = ['InceptionResnetV2/Logits']
variables_to_restore = slim.get_variables_to_restore(exclude=exclude)
print(variables_to_restore[0])
print(variables_to_restore[-1])
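# In transfer learning the two lists above work together: restore everything
# except the new layers from the checkpoint, and initialize only the new
# (excluded) layers from scratch. A minimal sketch, reusing the graph and
# checkpoint_file from the inception_resnet_v2 section above:
restore_saver = tf.train.Saver(variables_to_restore)
with tf.Session() as sess:
    # Initialize only the excluded scopes (the layers being retrained).
    sess.run(tf.variables_initializer(
        slim.get_variables_to_restore(include=exclude)))
    restore_saver.restore(sess, checkpoint_file)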



'''
resnet_v2 152
    num_classes: Number of predicted classes for classification tasks. If None
      we return the features (2048) before the logit layer.

  Returns:
    net: A rank-4 tensor of size [batch, height_out, width_out, channels_out].
      If global_pool is False, then height_out and width_out are reduced by a
      factor of output_stride compared to the respective height_in and width_in,
      else both height_out and width_out equal one. If num_classes is None, then
      net is the output of the last ResNet block, potentially after global
      average pooling. If num_classes is not None, net contains the pre-softmax
      activations.
    end_points: A dictionary from components of the network to the corresponding
      activation.
'''
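# The num_classes=None mode described above is handy for feature extraction.
# A minimal standalone sketch (the placeholder input is hypothetical; this is
# separate from the classification flow below, and the graph is reset anyway):
feat_inputs = tf.placeholder(tf.float32, [None, 299, 299, 3])
with slim.arg_scope(resnet_arg_scope()):
    feats, _ = resnet_v2_152(feat_inputs, num_classes=None, is_training=False)
feats = tf.squeeze(feats, [1, 2])  # [batch, 2048] pre-logit features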
tf.reset_default_graph()

checkpoint_file = 'resnet_v2_152.ckpt'
image = tf.image.decode_jpeg(tf.read_file('dog.jpeg'), channels=3) #['dog.jpg', 'panda.jpg']

# ResNet V2 intentionally reuses the Inception input size (299) and Inception
# preprocessing, as noted at the top of this post.
image_size = inception_resnet_v2.default_image_size  # 299

'''This function performs cropping, resizing, normalization, etc.'''
processed_image = inception_preprocessing.preprocess_image(image,
                                                           image_size,
                                                           image_size,
                                                           is_training=False)
processed_images = tf.expand_dims(processed_image, 0)

'''Creates the Resnet V2 model.'''
arg_scope = resnet_arg_scope()
with slim.arg_scope(arg_scope):
    net, end_points = resnet_v2_152(processed_images, 1001, is_training=False)  # 1001 = background + 1000 ImageNet classes


probabilities = tf.nn.softmax(net)

saver = tf.train.Saver()


with tf.Session() as sess:
    saver.restore(sess, checkpoint_file)

    net2, image2, network_inputs, end_points2, probabilities2 = sess.run([net,
                                                                          image,
                                                                          processed_images,
                                                                          end_points,
                                                                          probabilities])
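# The top-5 readout mirrors the inception_resnet_v2 block above. A minimal
# sketch; note that in some slim versions net (and hence probabilities) is
# rank-4 with shape [1, 1, 1, 1001], so flatten before sorting.
probabilities2 = probabilities2.reshape(-1)
sorted_inds = [i[0] for i in sorted(enumerate(-probabilities2),
                                    key=lambda x: x[1])]

names = imagenet.create_readable_names_for_imagenet_labels()
for i in range(5):
    index = sorted_inds[i]
    # This checkpoint also has 1001 outputs (background + 1000 classes),
    # so index again maps directly into names.
    print('Probability %0.2f => [%s]' % (probabilities2[index], names[index]))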