Tensorflow使用slim工具(vgg16模型)实现图像分类与分割

接触tensorflow小白,网上教程很多,图像分类应该属于比较经典的一个例子啦,特别是google推了slim,但是网上的教程遗漏啦许多细节介绍会导致跑不出来,经过调试终于跑出来啦

结果还算可以,分享一下

我的环境,cuda8.0+cudnn5.1+python2.7

关于tensorflow,cuda+cudnn等安装推荐教程:

http://blog.csdn.net/xierhacker/article/details/53035989

整体思路就是通过训练好的vgg_16模型进行图像分类,

原图:




step01输出结果:

Probability 1.00 => [lion, king of beasts, Panthera leo]
5 things
Probability 0.00 => [collie]
5 things
Probability 0.00 => [ox]
5 things
Probability 0.00 => [baboon]
5 things
Probability 0.00 => [chow, chow chow]

很显然分类的比较正确

Step02图像分割显示效果:




工具:tensorflow slim opencv numpy 

Step01:下载vgg模型,

import sys
import os

os.environ["CUDA_VISIBLE_DEVICES"] = '0'
sys.path.append("/home/yc/models/slim")
from datasets import dataset_utils
import tensorflow as tf

url = "http://download.tensorflow.org/models/vgg_16_2016_08_28.tar.gz"

# 指定保存路径
checkpoints_dir = '/home/yc/models/checkpoints'

if not tf.gfile.Exists(checkpoints_dir):
    tf.gfile.MakeDirs(checkpoints_dir)

dataset_utils.download_and_uncompress_tarball(url, checkpoints_dir)

Step02:图像分类 from datasets等属于下载models/slim下的文件夹,可能代码不提示,没有关系,

import sys
import os

os.environ["CUDA_VISIBLE_DEVICES"] = '0'
sys.path.append("/home/yc/models/slim")
from matplotlib import pyplot as plt
import numpy as np
import cv2 
import tensorflow as tf
from datasets import imagenet
from nets import vgg
from preprocessing import vgg_preprocessing
checkpoints_dir = '/home/yc/models/checkpoints'
slim=tf.contrib.slim
image_size=vgg.vgg_16.default_image_size
with tf.Graph().as_default():
 
    # Open specified url and load image as a string
    
    # Decode string into matrix with intensity values
    image = cv2.imread("/home/yc/1214/tiger.jpg")
    
    image=cv2.cvtColor(image, 4)
    plt.imshow(image)
    plt.suptitle("The tiger",
                    fontsize=14, fontweight='bold')
    
    plt.axis('off')
    plt.show()    
    # Resize the input image, preserving the aspect ratio
    # and make a central crop of the resulted image.
    # The crop will be of the size of the default image size of
    # the network.
    processed_image = vgg_preprocessing.preprocess_image(image,
                                                         image_size,
                                                         image_size,
                                                         is_training=False)
    
    
    # Networks accept images in batches.
    # The first dimension usually represents the batch size.
    # In our case the batch size is one.
    processed_images  = tf.expand_dims(processed_image, 0)
    
    # Create the model, use the default arg scope to configure
    # the batch norm parameters. arg_scope is a very conveniet
    # feature of slim library -- you can define default
    # parameters for layers -- like stride, padding etc.
    with slim.arg_scope(vgg.vgg_arg_scope()):
        logits, _ = vgg.vgg_16(processed_images,
                               num_classes=1000,
                               is_training=False)
    
    # In order to get probabilities we apply softmax on the output.
    probabilities = tf.nn.softmax(logits)
    
    # Create a function that reads the network weights
    # from the checkpoint file that you downloaded.
    # We will run it in session later.
    init_fn = slim.assign_from_checkpoint_fn(
        os.path.join(checkpoints_dir, 'vgg_16.ckpt'),
        slim.get_model_variables('vgg_16'))
  
    with tf.Session() as sess:
        
        # Load weights
        init_fn(sess)
        
        # We want to get predictions, image as numpy matrix
        # and resized and cropped piece that is actually
        # being fed to the network.
        network_input, probabilities = sess.run([processed_image,probabilities])
        probabilities = probabilities[0, 0:]
        sorted_inds = [i[0] for i in sorted(enumerate(-probabilities),
                                            key=lambda x:x[1])]

    # Show the downloaded image
   
    
    # Show the image that is actually being fed to the network
    # The image was resized while preserving aspect ratio and then
    # cropped. After that, the mean pixel value was subtracted from
    # each pixel of that crop. We normalize the image to be between [-1, 1]
    # to show the image.
    plt.imshow( network_input / (network_input.max() - network_input.min()) )
    plt.suptitle("Resized, Cropped and Mean-Centered input to network",
                 fontsize=14, fontweight='bold')
    plt.axis('off')
    plt.show()

    names = imagenet.create_readable_names_for_imagenet_labels()
    for i in range(5):
        print "5 things"
        index = sorted_inds[i]
        # Now we print the top-5 predictions that the network gives us with
        # corresponding probabilities. Pay attention that the index with
        # class names is shifted by 1 -- this is because some networks
        # were trained on 1000 classes and others on 1001. VGG-16 was trained
        # on 1000 classes.
        print('Probability %0.2f => [%s]' % (probabilities[index], names[index+1]))
        
    res = slim.get_model_variables()

Step03:图像分割显示(上一步注重的是局部类别,根据概率排名,很清楚等看出分类,在复杂场景下,需要全图的类别情况,需要分割并显示)


import sys
import os

os.environ["CUDA_VISIBLE_DEVICES"] = '0'
sys.path.append("/home/yc/models/slim")
from matplotlib import pyplot as plt
import numpy as np
import cv2 
import tensorflow as tf
import urllib2
from datasets import imagenet
from nets import vgg
from preprocessing import vgg_preprocessing
checkpoints_dir = '/home/yc/models/checkpoints'
slim=tf.contrib.slim
image_size=vgg.vgg_16.default_image_size

# Load the mean pixel values and the function
# that performs the subtraction
from preprocessing.vgg_preprocessing import (_mean_image_subtraction,
                                            _R_MEAN, _G_MEAN, _B_MEAN)

# Function to nicely print segmentation results with
# colorbar showing class names
def discrete_matshow(data, labels_names=[], title=""):
    print "matshow  begin"
    
    #get discrete colormap
    cmap = plt.get_cmap('Paired', np.max(data)-np.min(data)+1)
    # set limits .5 outside true range
    mat = plt.matshow(data,
                      cmap=cmap,
                      vmin = np.min(data)-.5,
                      vmax = np.max(data)+.5)
    #tell the colorbar to tick at integers
    cax = plt.colorbar(mat,
                       ticks=np.arange(np.min(data),np.max(data)+1))
    
    # The names to be printed aside the colorbar
    if labels_names:
        cax.ax.set_yticklabels(labels_names)
    
    if title:
        plt.suptitle(title, fontsize=14, fontweight='bold')
    plt.show()
        
with tf.Graph().as_default():
    
    image01 = cv2.imread("/home/yc/1214/tiger.jpg")
    
    image=cv2.cvtColor(image01,4)
    
    # Convert image to float32 before subtracting the
    # mean pixel value
    image_float = tf.to_float(image, name='ToFloat')
    
    # Subtract the mean pixel value from each pixel
    processed_image = _mean_image_subtraction(image_float,
                                              [_R_MEAN, _G_MEAN, _B_MEAN])

    input_image = tf.expand_dims(processed_image, 0)
    
    with slim.arg_scope(vgg.vgg_arg_scope()):
        
        # spatial_squeeze option enables to use network in a fully
        # convolutional manner
        logits, _ = vgg.vgg_16(input_image,
                               num_classes=1000,
                               is_training=False,
                               spatial_squeeze=False)
    
    # For each pixel we get predictions for each class
    # out of 1000. We need to pick the one with the highest
    # probability. To be more precise, these are not probabilities,
    # because we didn't apply softmax. But if we pick a class
    # with the highest value it will be equivalent to picking
    # the highest value after applying softmax
    pred = tf.argmax(logits, dimension=3)
    
    init_fn = slim.assign_from_checkpoint_fn(
        os.path.join(checkpoints_dir, 'vgg_16.ckpt'),
        slim.get_model_variables('vgg_16'))
    
    with tf.Session() as sess:
        init_fn(sess)
        segmentation = sess.run([pred])

# Remove the first empty dimension
segmentation = np.squeeze(segmentation)

# Let's get unique predicted classes (from 0 to 1000) and
# relable the original predictions so that classes are
# numerated starting from zero
unique_classes, relabeled_image = np.unique(segmentation,
                                            return_inverse=True)


segmentation_size = segmentation.shape
print unique_classes
relabeled_image = relabeled_image.reshape(segmentation_size)

labels_names = []
names = imagenet.create_readable_names_for_imagenet_labels()
for index, current_class_number in enumerate(unique_classes):

    labels_names.append(str(index) + ' ' + names[current_class_number+1])
print labels_names

discrete_matshow(data=relabeled_image, labels_names=labels_names, title="Segmentation")

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
下面是使用TensorFlow搭建VGG16模型的示例代码: ```python import tensorflow as tf def vgg16(input_tensor, num_classes): # 定义卷积层和池化层的函数,方便重复使用 def conv_block(inputs, filters, kernel_size, strides=(1,1), padding='same', activation=tf.nn.relu): conv = tf.layers.conv2d(inputs, filters, kernel_size, strides, padding, activation=activation) return tf.layers.batch_normalization(conv) def max_pool(inputs, pool_size=(2,2), strides=(2,2), padding='same'): return tf.layers.max_pooling2d(inputs, pool_size, strides, padding) # 定义VGG16的卷积层和池化层结构 net = conv_block(input_tensor, 64, (3,3)) net = conv_block(net, 64, (3,3)) net = max_pool(net) net = conv_block(net, 128, (3,3)) net = conv_block(net, 128, (3,3)) net = max_pool(net) net = conv_block(net, 256, (3,3)) net = conv_block(net, 256, (3,3)) net = conv_block(net, 256, (3,3)) net = max_pool(net) net = conv_block(net, 512, (3,3)) net = conv_block(net, 512, (3,3)) net = conv_block(net, 512, (3,3)) net = max_pool(net) net = conv_block(net, 512, (3,3)) net = conv_block(net, 512, (3,3)) net = conv_block(net, 512, (3,3)) net = max_pool(net) # 定义全连接层结构 net = tf.layers.flatten(net) net = tf.layers.dense(net, 4096, activation=tf.nn.relu) net = tf.layers.dropout(net, rate=0.5) net = tf.layers.dense(net, 4096, activation=tf.nn.relu) net = tf.layers.dropout(net, rate=0.5) net = tf.layers.dense(net, num_classes) return net ``` 在上面的代码中,我们使用TensorFlow中的tf.layers模块,这个模块可以方便地创建各种神经网络层,包括卷积层、池化层、全连接层等。 VGG16模型的具体结构可以参考下图: ![VGG16结构图](https://miro.medium.com/max/2000/1*7S266GKv9KsK1Tehn3Km7g.png) 上面的代码实现VGG16模型的卷积层和全连接层结构。你可以通过调用这个函数来创建一个VGG16模型,并且可以通过传入不同的输入张量和输出类别数来创建不同的模型

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值