Transfer Learning for Image Recognition: A TensorFlow Implementation

gzroy

Article link: https://blog.csdn.net/gzroy/article/details/83020200

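I have recently been studying the YOLO object detection paper and wanted to implement the model it describes. The 24-layer CNN from the paper, however, will not run on my GTX 750 Ti with 2 GB of video memory, so it looks like it is time for a card with more memory. Before buying one, I wanted to find out whether there is another way to run YOLO on the card I already have. One option is transfer learning: take a convolutional network that someone else has already trained and use it directly to compute image features. That way I do not have to build and retrain all those layers myself.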

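TensorFlow Hub offers many pretrained convolutional networks, most of them trained on the 1000-class image set from the ImageNet 2012 competition. That dataset is roughly 146 GB, and according to the author of the YOLO paper, his model was also pretrained on it for about a week, which gives an idea of how much GPU capacity it would take to train on it yourself. The sensible choice is therefore to take a model from TensorFlow Hub and apply transfer learning. I picked Google's Inception V3, a well-known model with high accuracy. Before using transfer learning to build the YOLO model, I first used Inception V3 to retrain an image classifier, as practice and to get familiar with the workflow.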

I used the same Flowers dataset as the Image Retrain Guide on the TensorFlow website, but instead of following the guide's program I wrote my own, which is also a good way to learn TensorFlow properly.

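My program has two parts. The first converts the Flowers dataset to TFRecord format: each image is resized to the 299×299 pixels that Inception V3 requires, and its class label is written to the file alongside it. This makes the data easy to read during training. The code is as follows: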

import os
import cv2
import tensorflow as tf
import numpy as np

# Inception V3 expects 299x299 input images
reshape_width = 299
reshape_height = 299

# Wrap the raw image bytes and the raw label bytes in a tf.train.Example
def make_example(image, label):
    return tf.train.Example(features=tf.train.Features(feature={
        'image' : tf.train.Feature(bytes_list=tf.train.BytesList(value=[image])),
        'label' : tf.train.Feature(bytes_list=tf.train.BytesList(value=[label]))
    }))

flower_classes = {"daisy":0, "dandelion":1, "roses":2, "sunflowers":3, "tulips":4}

# Write one TFRecord file per flower class
for flower_class in flower_classes.keys():
    writer = tf.python_io.TFRecordWriter(flower_class+".tfrecord")
    folder_path = "/home/roy/flower_photos/"+flower_class
    files = os.listdir(folder_path)
    # Store the label as int64 explicitly, because the training script decodes it as tf.int64
    label = np.array([flower_classes[flower_class]], dtype=np.int64)
    for jpgfile in files:
        img = cv2.imread(os.path.join(folder_path, jpgfile), cv2.IMREAD_COLOR)
        # cv2.imread returns None for files it cannot decode; skip them
        if img is None:
            continue
        img = cv2.resize(img, (reshape_width, reshape_height))
        # OpenCV loads images in BGR order; convert to RGB
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        img = img.astype(np.uint8)
        ex = make_example(img.tobytes(), label.tobytes())
        writer.write(ex.SerializeToString())
    writer.close()

When the program finishes, it has produced five TFRecord files, one for each flower type.
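
To make the record layout concrete, here is a minimal sketch of the byte round trip for a single record, using only NumPy. The all-zero `image` array and the names `image_bytes`, `label_bytes`, `restored_image` and `restored_label` are stand-ins for illustration and are not part of the conversion script. It shows why the reading side has to decode the image as uint8 and the label as int64 and then restore the shape itself: the raw bytes carry no type or shape information.

```python
import numpy as np

# Stand-in for one resized RGB photo (hypothetical data, all zeros).
image = np.zeros((299, 299, 3), dtype=np.uint8)
# The label is forced to int64 so that it always occupies eight bytes.
label = np.array([2], dtype=np.int64)

image_bytes = image.tobytes()
label_bytes = label.tobytes()

# One byte per channel value, eight bytes for the label.
print(len(image_bytes), len(label_bytes))

# Decoding needs the dtype and the shape supplied from outside.
restored_image = np.frombuffer(image_bytes, dtype=np.uint8).reshape(299, 299, 3)
restored_label = int(np.frombuffer(label_bytes, dtype=np.int64)[0])
print(restored_image.shape, restored_label)
```

Each image therefore takes 299 × 299 × 3 = 268203 bytes in the file, regardless of how small the original JPEG was.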

Now we can write the program that reads the data and trains the model. Here is the code:

import tensorflow as tf
import tensorflow_hub as hub

imageWidth = 299
imageHeight = 299
imageDepth = 3
batch_size = 10

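# Parse one serialized record from the TFRecord files
def _parse_function(example_proto):
    features = {"image": tf.FixedLenFeature((), tf.string, default_value=""),
                "label": tf.FixedLenFeature((), tf.string, default_value="")}
    parsed_features = tf.parse_single_example(example_proto, features)
    return parsed_features["image"], parsed_features["label"]

# File list. The data is split into 3 datasets: train, validation and test.
# Train takes roughly 80% of the records, the other two roughly 10% each.
filenames = ["daisy.tfrecord", "dandelion.tfrecord", "roses.tfrecord", "sunflowers.tfrecord", "tulips.tfrecord"]
dataset = tf.data.TFRecordDataset(filenames)
# Shuffle once with a fixed seed. By default shuffle() draws a new order on every pass,
# so the take()/skip() splits below would overlap and training images would leak
# into the validation and test sets.
dataset = dataset.map(_parse_function).shuffle(3670, seed=42, reshuffle_each_iteration=False)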

# Decode one batch of raw records into float32 images and int64 labels
def _decode_batch(iterator):
    raw_images, raw_labels = iterator.get_next()
    batch_images = tf.decode_raw(raw_images, tf.uint8)
    batch_images = tf.image.convert_image_dtype(batch_images, tf.float32)
    batch_images = tf.reshape(batch_images, [batch_size, imageHeight, imageWidth, imageDepth])
    batch_labels = tf.reshape(tf.decode_raw(raw_labels, tf.int64), [batch_size])
    return batch_images, batch_labels

# Get the first 2920 records for the training dataset
train_dataset = dataset.take(2920).batch(batch_size).repeat(50)
train_iterator = train_dataset.make_initializable_iterator()
images_batch, labels_batch = _decode_batch(train_iterator)

# Get the next 370 records for the validation dataset
valid_dataset = dataset.skip(2920).take(370).batch(batch_size)
valid_iterator = valid_dataset.make_initializable_iterator()
valid_images_batch, valid_labels_batch = _decode_batch(valid_iterator)

# Get the remaining 380 records for the test dataset
test_dataset = dataset.skip(3290).batch(batch_size)
test_iterator = test_dataset.make_initializable_iterator()
test_images_batch, test_labels_batch = _decode_batch(test_iterator)

# Load the downloaded Inception module from TensorFlow Hub
module = hub.Module("/home/roy/tensorflow/models/Inception")

# Placeholders for the input images and labels
input_images = tf.placeholder(tf.float32, (batch_size, imageHeight, imageWidth, imageDepth))
input_labels = tf.placeholder(tf.int64, (batch_size,))

# The bottleneck input is the feature vector that the Hub module outputs for each image (2048 dimensions)
bottleneck_input = module(input_images)

# Add a fully connected layer that maps the bottleneck features to 5 logits, one per flower class
output_logits = tf.layers.dense(bottleneck_input, units=5, activation=None,
                                kernel_initializer=tf.initializers.truncated_normal,
                                name="output_layer", reuse=tf.AUTO_REUSE)

# Define the loss
loss = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=input_labels, logits=output_logits)
loss_mean = tf.reduce_mean(loss)

# Predicted class: the argmax over the softmax output
output_result = tf.argmax(tf.nn.softmax(output_logits), 1)

# Compute the accuracy of the batch
accuracy_batch = tf.reduce_mean(tf.cast(tf.equal(input_labels, output_result), tf.float32))

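# Training parameters: the learning rate is 0.1 for the first 30 epochs, 0.05 for epochs 30-40, ...
global_step = tf.Variable(0, trainable=False)
# Integer division, so the boundaries and the epoch check below use whole step counts
epoch_steps = 2920 // batch_size
boundaries = [epoch_steps*30, epoch_steps*40, epoch_steps*50]
values = [0.1, 0.05, 0.01, 0.005]
learning_rate = tf.train.piecewise_constant(global_step, boundaries, values)
optimizer = tf.train.GradientDescentOptimizer(learning_rate)
# Pass global_step so it is incremented on every update; otherwise the learning rate stays at 0.1
opt_op = optimizer.minimize(loss_mean, global_step=global_step)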

# Train, printing the loss every 100 steps. After each epoch, compute the
# accuracy on the validation and test datasets.
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(train_iterator.initializer)
    step = 0
    total_loss = 0.0
    epoch = 0
    while True:
        try:
            step += 1
            images_i, labels_i = sess.run([images_batch, labels_batch])
            loss_a, lr, _ = sess.run([loss_mean, learning_rate, opt_op], feed_dict={input_images: images_i, input_labels: labels_i})
            total_loss += loss_a
            if step%100==0:
                print("step %i Learning_rate: %f Loss: %f" % (step, lr, total_loss/100))
                total_loss = 0.0
            if step%epoch_steps==0:
                # Restart the validation and test iterators for this epoch
                sess.run([valid_iterator.initializer, test_iterator.initializer])
                valid_step = 0
                test_step = 0
                accuracy_valid = 0.0
                accuracy_test = 0.0
                epoch += 1
                while True:
                    try:
                        images_v, labels_v = sess.run([valid_images_batch, valid_labels_batch])
                        accuracy_valid += sess.run(accuracy_batch, feed_dict={input_images: images_v, input_labels: labels_v})
                        valid_step += 1
                    except tf.errors.OutOfRangeError:
                        print("epoch %i validation accuracy: %f" % (epoch, accuracy_valid/valid_step))
                        break
                while True:
                    try:
                        images_t, labels_t = sess.run([test_images_batch, test_labels_batch])
                        accuracy_test += sess.run(accuracy_batch, feed_dict={input_images: images_t, input_labels: labels_t})
                        test_step += 1
                    except tf.errors.OutOfRangeError:
                        print("epoch %i test accuracy: %f" % (epoch, accuracy_test/test_step))
                        break
        except tf.errors.OutOfRangeError:
            # The training iterator is exhausted after 50 epochs
            break
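
The learning rate schedule in the listing is easier to check with concrete numbers. The following is a plain-Python sketch and not TensorFlow code: `learning_rate_at` is a hypothetical helper that mimics the documented behaviour of `tf.train.piecewise_constant` as I understand it, where `values[i]` applies while the step is less than or equal to `boundaries[i]` and the last value applies beyond the final boundary.

```python
import bisect

batch_size = 10
epoch_steps = 2920 // batch_size  # 292 steps per epoch
boundaries = [epoch_steps * 30, epoch_steps * 40, epoch_steps * 50]
values = [0.1, 0.05, 0.01, 0.005]

def learning_rate_at(step):
    """Hypothetical stand-in for the schedule: values[i] holds while step <= boundaries[i]."""
    return values[bisect.bisect_left(boundaries, step)]

# Probe the first step, both sides of the first boundary, and the last training step.
for step in (1, 8760, 8761, 11681, 14600):
    print(step, learning_rate_at(step))
```

Under these assumptions the boundaries fall at steps 8760, 11680 and 14600. Because the training dataset is repeated 50 times, training ends at step 14600, so the last value of 0.005 is never actually used. The schedule also only advances if the step counter it reads is incremented by the optimizer on every update.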

Judging from the run, the results are excellent: after a little over 30 epochs of training, the accuracy on both the validation and test datasets reached 100%.
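
A perfect score is only meaningful if the three splits are disjoint. The sketch below illustrates, without TensorFlow, why splits carved out with `take()` and `skip()` are disjoint only when every pipeline sees the same shuffle order. The helpers `split` and `shuffled` and the integer record ids are hypothetical stand-ins, not part of the training program.

```python
import random

records = list(range(3670))  # stand-ins for the 3670 flower images

def split(order):
    """Mimic take(2920), skip(2920).take(370) and skip(3290) on one ordering."""
    return set(order[:2920]), set(order[2920:3290]), set(order[3290:])

def shuffled(seed):
    """Return a shuffled copy of the records for the given seed."""
    order = records[:]
    random.Random(seed).shuffle(order)
    return order

# Same order for every pipeline: the three splits never share a record.
train, valid, test = split(shuffled(seed=0))
fixed_overlap = len(train & valid) + len(train & test) + len(valid & test)

# A different order per pipeline: "validation" records also show up in "training".
train_a, _, _ = split(shuffled(seed=1))
_, valid_b, _ = split(shuffled(seed=2))
leaked = len(train_a & valid_b)

print(fixed_overlap, leaked)
```

With one shared order the overlap is zero. With independent orders, most of the validation records are expected to appear in the training split as well, since training covers about 80% of the data. If the accuracy still reaches 100% with a fixed shuffle order, the result can be taken at face value.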