Object Classification + Localization with TensorFlow 2.3 (Part 2)

The previous post implemented image localization. This post implements classification + localization in a single model.

Code implementation

Import packages

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
from lxml import etree
import glob
from matplotlib.patches import Rectangle

Read the data

images = glob.glob('./dataset/images/*.jpg')
xmls = glob.glob('./dataset/annotations/xmls/*.xml')
names = [x.split('xmls/')[-1].split('.xml')[0] for x in xmls]
print(len(names))
  • 3686

Build the training set (images that have an annotation file)

images_train = [img for img in images if (img.split('/')[-1].split('.jpg')[0]) in names]
print(len(images_train))
  • 3686

Build the test set (images without an annotation file)

images_test = [img for img in images if (img.split('/')[-1].split('.jpg')[0]) not in names]
print(len(images_test))

  • 3704

3686 + 3704 = 7390, which exactly matches the total number of images.
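
The list comprehensions above scan the whole `names` list once per image. As a minimal sketch of the same partition, assuming file names follow the `<name>.jpg` / `<name>.xml` pattern seen above, a set gives the same result with constant-time lookups. The helpers `stem` and `split_by_annotation` and the sample paths are hypothetical and not part of the original code.

```python
import os

def stem(path):
    """Return the file name without its directory or extension."""
    return os.path.splitext(os.path.basename(path))[0]

def split_by_annotation(image_paths, xml_paths):
    """Split images into (annotated, unannotated) by matching file stems."""
    annotated_stems = {stem(p) for p in xml_paths}
    with_xml = [p for p in image_paths if stem(p) in annotated_stems]
    without_xml = [p for p in image_paths if stem(p) not in annotated_stems]
    return with_xml, without_xml

# Hypothetical sample paths, for illustration only
sample_images = ['./dataset/images/a_1.jpg',
                 './dataset/images/a_2.jpg',
                 './dataset/images/b_1.jpg']
sample_xmls = ['./dataset/annotations/xmls/a_1.xml',
               './dataset/annotations/xmls/b_1.xml']

annotated, unannotated = split_by_annotation(sample_images, sample_xmls)
# The two parts always add up to the full image list
print(len(annotated), len(unannotated), len(annotated) + len(unannotated) == len(sample_images))
```

Using `os.path` instead of splitting on `'/'` also keeps the logic working where the path separator differs.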
To keep images and annotations in one-to-one correspondence, sort both lists by file name.

images_train.sort(key=lambda x: x.split('/')[-1].split('.jpg')[0])
print(images_train[:5])
xmls.sort(key=lambda x: x.split('/')[-1].split('.xml')[0])
print(xmls[:5])
  • ['./dataset/images/Abyssinian_1.jpg',
    './dataset/images/Abyssinian_10.jpg',
    './dataset/images/Abyssinian_100.jpg',
    './dataset/images/Abyssinian_101.jpg',
    './dataset/images/Abyssinian_102.jpg']
  • ['./dataset/annotations/xmls/Abyssinian_1.xml',
    './dataset/annotations/xmls/Abyssinian_10.xml',
    './dataset/annotations/xmls/Abyssinian_100.xml',
    './dataset/annotations/xmls/Abyssinian_101.xml',
    './dataset/annotations/xmls/Abyssinian_102.xml']

The output shows that the two lists line up one to one.
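
Printing the first five entries only checks the start of the lists. A minimal sketch of a full check, assuming both lists are sorted by the same key, compares the stems at every position. The helper `is_aligned` is hypothetical; the sample paths are taken from the output above.

```python
import os

def stem(path):
    """Return the file name without its directory or extension."""
    return os.path.splitext(os.path.basename(path))[0]

def is_aligned(image_paths, xml_paths):
    """True if both lists have equal length and matching stems at every position."""
    if len(image_paths) != len(xml_paths):
        return False
    return all(stem(i) == stem(x) for i, x in zip(image_paths, xml_paths))

sample_images = ['./dataset/images/Abyssinian_10.jpg',
                 './dataset/images/Abyssinian_1.jpg']
sample_xmls = ['./dataset/annotations/xmls/Abyssinian_1.xml',
               './dataset/annotations/xmls/Abyssinian_10.xml']

print(is_aligned(sample_images, sample_xmls))   # lists are in different orders
sample_images.sort(key=stem)
sample_xmls.sort(key=stem)
print(is_aligned(sample_images, sample_xmls))   # same order after sorting by stem
```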

With the labeling approach shown to be workable, we now define a helper function that wraps the annotation parsing steps.

def to_labels(path):
    # Parse one annotation file and return the bounding box
    # as fractions of the image width and height.
    with open(path) as f:
        xml = f.read()
    sel = etree.HTML(xml)
    width = int(sel.xpath('.//size/width/text()')[0])
    height = int(sel.xpath('.//size/height/text()')[0])
    xmin = int(sel.xpath('.//bndbox/xmin/text()')[0])
    xmax = int(sel.xpath('.//bndbox/xmax/text()')[0])
    ymin = int(sel.xpath('.//bndbox/ymin/text()')[0])
    ymax = int(sel.xpath('.//bndbox/ymax/text()')[0])
    return [xmin / width, ymin / height, xmax / width, ymax / height]
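
To make the normalization concrete, here is a minimal sketch that parses an in-memory annotation. It swaps in the standard library's `xml.etree.ElementTree` for `lxml` so that it runs without extra packages. The XML snippet is a hypothetical example whose values were chosen by hand; the exact layout of the real annotation files is an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation, for illustration only
SAMPLE_XML = """<annotation>
  <size><width>600</width><height>400</height><depth>3</depth></size>
  <object>
    <bndbox><xmin>333</xmin><ymin>72</ymin><xmax>425</xmax><ymax>158</ymax></bndbox>
  </object>
</annotation>"""

def to_labels_from_string(xml_text):
    """Return [xmin, ymin, xmax, ymax] as fractions of the image size."""
    root = ET.fromstring(xml_text)
    width = int(root.findtext('.//size/width'))
    height = int(root.findtext('.//size/height'))
    box = root.find('.//bndbox')
    xmin, ymin, xmax, ymax = (int(box.findtext(tag))
                              for tag in ('xmin', 'ymin', 'xmax', 'ymax'))
    # x values are divided by the width, y values by the height
    return [xmin / width, ymin / height, xmax / width, ymax / height]

print(to_labels_from_string(SAMPLE_XML))
```

Because the coordinates are stored as fractions, they stay valid after the image is resized to 224 x 224.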

Apply the helper function to every annotation file

labels = [to_labels(path) for path in xmls]
print(labels[:3])
  • [[0.555, 0.18, 0.708, 0.395],
    [0.192, 0.21, 0.768, 0.582],
    [0.3832, 0.142, 0.850, 0.534]]
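
Each label currently holds its four values in one sequence. The model takes one array per coordinate, so we unzip the list into four separate sequences.
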
out1, out2, out3, out4 = list(zip(*labels))
out1 = np.array(out1)
out2 = np.array(out2)
out3 = np.array(out3)
out4 = np.array(out4)
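
As a small illustration of the unzip step, using the three labels printed above, `zip(*labels)` turns a list of rows into one tuple per column:

```python
labels_sample = [
    [0.555, 0.18, 0.708, 0.395],
    [0.192, 0.21, 0.768, 0.582],
    [0.3832, 0.142, 0.850, 0.534],
]

# Each output collects the same coordinate from every label
xmins, ymins, xmaxs, ymaxs = zip(*labels_sample)
print(xmins)  # (0.555, 0.192, 0.3832)
print(ymaxs)  # (0.395, 0.582, 0.534)
```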

Extract the class of each image from its file name

class_labels = set(x.split('/')[-1].split('.xml')[0].split('_')[0] for x in xmls)
class_labels
  • {'Abyssinian', 'Bengal', 'Birman', 'Bombay', 'British', 'Egyptian', 'Maine', 'Persian', 'Ragdoll', 'Russian', 'Siamese', 'Sphynx', 'american', 'basset', 'beagle', 'boxer', 'chihuahua', 'english', 'german', 'great', 'havanese', 'japanese', 'keeshond', 'leonberger', 'miniature', 'newfoundland', 'pomeranian', 'pug', 'saint', 'samoyed', 'scottish', 'shiba', 'staffordshire', 'wheaten', 'yorkshire'}

Count the number of classes

class_labels_len = len(class_labels)
class_labels_len
  • 35
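
The class name is taken as the text before the first underscore, so any multi-word name is cut down to its first word. Entries such as `american`, `english`, and `great` in the set above suggest this may be happening. A hedged alternative, assuming every file name ends in `_<number>`, is to split on the last underscore instead. The helper `breed_from_path` and the multi-word sample file name are hypothetical.

```python
import os

def breed_from_path(path):
    """Return the file stem with its trailing '_<number>' removed."""
    stem = os.path.splitext(os.path.basename(path))[0]
    return stem.rsplit('_', 1)[0]

# Single-word name, as seen in the output above
print(breed_from_path('./dataset/annotations/xmls/Abyssinian_1.xml'))
# Hypothetical multi-word name, kept whole instead of being cut to 'american'
print(breed_from_path('./dataset/annotations/xmls/american_bulldog_12.xml'))
```

Note that switching to this rule would change the number of classes, so the outputs shown in the rest of this post would no longer match.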

Encode the classes as integers so they can be used as training targets.

class_label_to_index = {name: index for index, name in enumerate(class_labels)}
class_label_to_index
  • {'american': 0, 'saint': 1, 'Ragdoll': 2, 'english': 3, 'Persian': 4, 'staffordshire': 5, 'Maine': 6, 'basset': 7, 'Sphynx': 8, 'japanese': 9, 'miniature': 10, 'newfoundland': 11, 'Bombay': 12, 'pomeranian': 13, 'great': 14, 'Egyptian': 15, 'keeshond': 16, 'havanese': 17, 'shiba': 18, 'scottish': 19, 'leonberger': 20, 'Siamese': 21, 'chihuahua': 22, 'pug': 23, 'wheaten': 24, 'yorkshire': 25, 'samoyed': 26, 'beagle': 27, 'Abyssinian': 28, 'german': 29, 'boxer': 30, 'British': 31, 'Birman': 32, 'Russian': 33, 'Bengal': 34}

Replace each class name in the labels with its integer index

class_labels = [class_label_to_index[x.split('/')[-1].split('.xml')[0].split('_')[0]] for x in xmls]
len(class_labels)
  • 3686

The number of labels is unchanged and still matches the number of annotation files.

Build the inverse mapping from index to class name, used later to display predictions

# Invert the mapping explicitly instead of relying on the dict's iteration order
index_to_class_label = {index: name for name, index in class_label_to_index.items()}
print(index_to_class_label)
  • {0: 'american', 1: 'saint', 2: 'Ragdoll', 3: 'english', 4: 'Persian', 5: 'staffordshire', 6: 'Maine', 7: 'basset', 8: 'Sphynx', 9: 'japanese', 10: 'miniature', 11: 'newfoundland', 12: 'Bombay', 13: 'pomeranian', 14: 'great', 15: 'Egyptian', 16: 'keeshond', 17: 'havanese', 18: 'shiba', 19: 'scottish', 20: 'leonberger', 21: 'Siamese', 22: 'chihuahua', 23: 'pug', 24: 'wheaten', 25: 'yorkshire', 26: 'samoyed', 27: 'beagle', 28: 'Abyssinian', 29: 'german', 30: 'boxer', 31: 'British', 32: 'Birman', 33: 'Russian', 34: 'Bengal'}
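
One caveat: the indices above come from enumerating a Python set of strings, and that order can differ from one interpreter session to the next. If the saved model is reloaded in a new session, the mapping may no longer match the one used in training. A minimal sketch of a reproducible alternative sorts the names first. The helper `build_class_maps` is hypothetical, and using it would produce different indices from those printed above.

```python
def build_class_maps(class_names):
    """Build name->index and index->name maps in a reproducible (sorted) order."""
    ordered = sorted(set(class_names))
    name_to_index = {name: i for i, name in enumerate(ordered)}
    index_to_name = {i: name for name, i in name_to_index.items()}
    return name_to_index, index_to_name

name_to_index, index_to_name = build_class_maps(['pug', 'Bengal', 'beagle', 'pug'])
print(name_to_index)   # {'Bengal': 0, 'beagle': 1, 'pug': 2}
print(index_to_name)   # {0: 'Bengal', 1: 'beagle', 2: 'pug'}
```

Uppercase letters sort before lowercase ones, which is why `Bengal` comes before `beagle`.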

Create the label dataset, which holds both the localization labels and the class labels

label_dataset = tf.data.Dataset.from_tensor_slices(((out1, out2, out3, out4), class_labels))

Wrap image loading in a function

def load_image(path):
    img = tf.io.read_file(path)
    img = tf.image.decode_jpeg(img, channels=3)
    img = tf.image.resize(img, [224, 224])
    img = img / 127.5 - 1  # scale pixel values from [0, 255] to [-1, 1]
    return img
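
The scaling in `load_image` is plain arithmetic: dividing by 127.5 maps [0, 255] to [0, 2], and subtracting 1 shifts that to [-1, 1]. A small sketch with hypothetical helper names shows the mapping and its inverse on single values:

```python
def normalize(pixel):
    """Map a pixel value from [0, 255] to [-1, 1]."""
    return pixel / 127.5 - 1

def denormalize(value):
    """Map a value from [-1, 1] back to [0, 255]."""
    return (value + 1) * 127.5

print(normalize(0), normalize(127.5), normalize(255))  # -1.0 0.0 1.0
print(denormalize(-1.0), denormalize(1.0))             # 0.0 255.0
```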

Build the image dataset and map the loading function over it

image_dataset = tf.data.Dataset.from_tensor_slices(images_train)
image_dataset = image_dataset.map(load_image)

Combine the images with their labels

dataset = tf.data.Dataset.zip((image_dataset, label_dataset))
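# Shuffle once. reshuffle_each_iteration=False keeps the order fixed, so the
# skip/take split below does not mix training and validation samples between epochs.
dataset = dataset.shuffle(len(dataset), reshuffle_each_iteration=False)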

Set the split sizes: 20% of the dataset is held out as the validation set and the remaining 80% is used for training.

train_count = int(len(dataset) * 0.8)
test_count = len(dataset) - train_count
train_dataset = dataset.skip(test_count)
test_dataset = dataset.take(test_count)
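
The split arithmetic can be checked without TensorFlow. In this minimal sketch, list slicing stands in for `take` and `skip`; the helper `split_counts` is hypothetical.

```python
def split_counts(total, train_fraction=0.8):
    """Return (train_count, test_count) using the same rounding as the code above."""
    train_count = int(total * train_fraction)
    return train_count, total - train_count

items = list(range(10))
train_count, test_count = split_counts(len(items))
test_items = items[:test_count]    # like dataset.take(test_count)
train_items = items[test_count:]   # like dataset.skip(test_count)

print(train_count, test_count)     # 8 2
print(split_counts(3686))          # (2948, 738)
```

The first `test_count` elements go to validation and everything after them goes to training, so the two parts never overlap as long as the underlying order is fixed.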

Configure repeating, shuffling, and batching for the training and validation sets

train_dataset = train_dataset.repeat().shuffle(train_count).batch(32)
test_dataset = test_dataset.batch(32)

Load the pretrained model

xception = tf.keras.applications.xception.Xception(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

Model input and the localization head, which outputs the four box coordinates

inputs = tf.keras.layers.Input(shape=(224, 224, 3))
x = xception(inputs)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
x = tf.keras.layers.Dense(2048, activation='relu')(x)
x1 = tf.keras.layers.Dense(256, activation='relu')(x)
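# Regression heads: one linear unit per box coordinate.
# Each layer is named so that it matches a key of the loss dict passed to compile().
xmin = tf.keras.layers.Dense(1, name='out1')(x1)
ymin = tf.keras.layers.Dense(1, name='out2')(x1)
xmax = tf.keras.layers.Dense(1, name='out3')(x1)
ymax = tf.keras.layers.Dense(1, name='out4')(x1)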
prediction_sites = [xmin, ymin, xmax, ymax]

Classification head

x2 = tf.keras.layers.Dense(256, activation='relu')(x)
prediction_classes = tf.keras.layers.Dense(class_labels_len, activation='softmax', name='class')(x2)

Build the model

model = tf.keras.Model(inputs=inputs, outputs=[prediction_sites, prediction_classes])
model.summary()

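Model summary (output of a run in which the xmin layer had no explicit name, so it is listed as dense_2)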

Model: "functional_1"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_2 (InputLayer)            [(None, 224, 224, 3) 0                                            
__________________________________________________________________________________________________
xception (Functional)           (None, 7, 7, 2048)   20861480    input_2[0][0]                    
__________________________________________________________________________________________________
global_average_pooling2d (Globa (None, 2048)         0           xception[0][0]                   
__________________________________________________________________________________________________
dense (Dense)                   (None, 2048)         4196352     global_average_pooling2d[0][0]   
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 256)          524544      dense[0][0]                      
__________________________________________________________________________________________________
dense_3 (Dense)                 (None, 256)          524544      dense[0][0]                      
__________________________________________________________________________________________________
dense_2 (Dense)                 (None, 1)            257         dense_1[0][0]                    
__________________________________________________________________________________________________
out2 (Dense)                    (None, 1)            257         dense_1[0][0]                    
__________________________________________________________________________________________________
out3 (Dense)                    (None, 1)            257         dense_1[0][0]                    
__________________________________________________________________________________________________
out4 (Dense)                    (None, 1)            257         dense_1[0][0]                    
__________________________________________________________________________________________________
class (Dense)                   (None, 35)           8995        dense_3[0][0]                    
==================================================================================================
Total params: 26,116,943
Trainable params: 26,062,415
Non-trainable params: 54,528
__________________________________________________________________________________________________
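
The parameter counts in the summary can be reproduced by hand: a Dense layer with `n` inputs and `m` units has `n * m` weights plus `m` biases. The sketch below takes the Xception count from the summary as given and recomputes the rest; `dense_params` is a hypothetical helper.

```python
def dense_params(inputs, units):
    """Weights plus biases of a fully connected layer."""
    return inputs * units + units

backbone = 20861480  # Xception parameter count, as reported in the summary

heads = (
    dense_params(2048, 2048)         # dense: 4,196,352
    + 2 * dense_params(2048, 256)    # dense_1 and dense_3: 524,544 each
    + 4 * dense_params(256, 1)       # four coordinate outputs: 257 each
    + dense_params(256, 35)          # class: 8,995
)

total = backbone + heads
print(total)  # 26116943
```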

Compile the model

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001),
    # Mean squared error for each coordinate, cross-entropy for the class
    loss={'out1': 'mse', 'out2': 'mse', 'out3': 'mse', 'out4': 'mse',
          'class': 'sparse_categorical_crossentropy'},
    metrics=['mae', 'acc'],
)
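
With several outputs and no loss weights, Keras adds the per-output losses into one total that the optimizer minimizes. The following is a minimal per-sample sketch in plain Python, assuming equal weights and ignoring batch averaging; the function names are illustrative and are not the Keras implementations.

```python
import math

def squared_error(y_true, y_pred):
    """MSE for a single scalar output on a single sample."""
    return (y_true - y_pred) ** 2

def sparse_categorical_crossentropy(label, probs):
    """Negative log of the probability assigned to the true class index."""
    return -math.log(probs[label])

def total_loss(box_true, box_pred, label, probs):
    """Unweighted sum of the four coordinate losses and the class loss."""
    box_loss = sum(squared_error(t, p) for t, p in zip(box_true, box_pred))
    return box_loss + sparse_categorical_crossentropy(label, probs)

# Hypothetical values: every coordinate is off by 0.1, true class gets probability 0.5
loss = total_loss([0.5, 0.2, 0.7, 0.4], [0.6, 0.3, 0.8, 0.5], 1, [0.25, 0.5, 0.25])
print(round(loss, 4))
```

Since the box coordinates are fractions in [0, 1], their squared errors tend to be small next to the cross-entropy term, which is worth keeping in mind when reading the combined `loss` curve.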

Train the model

history = model.fit(train_dataset, epochs=50, steps_per_epoch=train_count//32, validation_data=test_dataset, validation_steps=test_count//32)
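
Because `train_dataset` repeats indefinitely, `fit` needs `steps_per_epoch` to know where an epoch ends. A quick sketch of the step arithmetic, using the counts that follow from the 3686 annotated images and the 80/20 split:

```python
def steps(sample_count, batch_size=32):
    """Number of full batches that fit into sample_count samples."""
    return sample_count // batch_size

train_count, test_count = 2948, 738   # int(3686 * 0.8) and the remainder

print(steps(train_count))  # 92 steps per epoch (2944 samples)
print(steps(test_count))   # 23 validation steps (736 samples)
```

Integer division drops the last partial batch, so two validation samples are left out of each evaluation pass.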

Save the model

model.save('site_and_class.h5')

Plot the training history

loss = history.history['loss']
print(history.history.keys())
class_acc = history.history['class_acc']
plt.figure()
plt.plot(history.epoch, loss, 'r', label='Training loss')
plt.plot(history.epoch, class_acc, 'b', label='Training class_acc')
plt.title('Training loss and class accuracy')
plt.xlabel('Epoch')
plt.ylabel('Loss and Class_acc Value')
plt.legend()
plt.savefig('loss_and_acc.png')
plt.show()

[Figure: training loss and class accuracy per epoch]

Structure of dataset

<ShuffleDataset shapes: ((224, 224, 3), (((), (), (), ()), ())), types: (tf.float32, ((tf.float64, tf.float64, tf.float64, tf.float64), tf.int32))>

Structure of train_dataset

<BatchDataset shapes: ((None, 224, 224, 3), (((None,), (None,), (None,), (None,)), (None,))), types: (tf.float32, ((tf.float64, tf.float64, tf.float64, tf.float64), tf.int32))>

Show the first three samples of a training batch with their ground-truth boxes (no model involved)

for imgs, labels in train_dataset.take(1):
    for i in range(3):
        # Convert the image tensor to a PIL image for display
        plt.imshow(tf.keras.preprocessing.image.array_to_img(imgs[i]))
        # labels[0] holds the four coordinate batches, labels[1] the class indices
        xmin, ymin, xmax, ymax = (np.array(labels[0][k])[i] for k in range(4))
        # Scale the normalized coordinates to the 224 x 224 image
        xmin, ymin, xmax, ymax = xmin * 224, ymin * 224, xmax * 224, ymax * 224
        # Rectangle((x, y), width, height); fill=False draws only the outline
        rect = Rectangle((xmin, ymin), xmax - xmin, ymax - ymin, fill=False, color='red')
        # Add the box to the current axes
        ax = plt.gca()
        ax.add_patch(rect)
        plt.show()

[Figures: three training images with their ground-truth bounding boxes]

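
`Rectangle` expects the top-left corner followed by a width and a height, not two corners. A minimal sketch of that conversion, using the first label printed earlier and a hypothetical helper name:

```python
def to_rectangle_args(box, size=224):
    """Convert normalized [xmin, ymin, xmax, ymax] to ((x, y), width, height) in pixels."""
    xmin, ymin, xmax, ymax = (v * size for v in box)
    return (xmin, ymin), xmax - xmin, ymax - ymin

corner, width, height = to_rectangle_args([0.555, 0.18, 0.708, 0.395])
print(corner, width, height)
```

Both axes are multiplied by 224 because the image was resized to a square, whatever its original aspect ratio.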
After training, run the model on a batch and show the first three predictions

# Swap in test_dataset here to inspect held-out images instead
for img, _ in train_dataset.take(1):
    # pre[0] is the list of four coordinate outputs, pre[1] the class probabilities
    pre = model.predict(img)
    for i in range(3):
        plt.imshow(tf.keras.preprocessing.image.array_to_img(img[i]))
        # Each coordinate output has shape (batch, 1); scale it to pixels
        xmin, ymin, xmax, ymax = (pre[0][k][i, 0] * 224 for k in range(4))
        # Rectangle((x, y), width, height); fill=False draws only the outline
        rect = Rectangle((xmin, ymin), xmax - xmin, ymax - ymin, fill=False, color='red')
        # Add the box to the current axes
        ax = plt.gca()
        ax.add_patch(rect)
        # The predicted class is the index with the highest probability
        classes_index = np.argmax(pre[1][i])
        title = index_to_class_label[classes_index]
        print(classes_index)
        plt.title(title)
        plt.show()

[Figures: images with the predicted bounding box and the predicted class as the title]
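
The class shown in each title comes from taking the index of the largest softmax probability and looking it up in the index-to-name map. A plain-Python sketch of that decoding step, with a hypothetical helper and a made-up three-class example:

```python
def decode_class(probs, index_to_name):
    """Return the name and probability of the most likely class."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return index_to_name[best], probs[best]

# Hypothetical probabilities and mapping, for illustration only
sample_map = {0: 'Bengal', 1: 'beagle', 2: 'pug'}
name, confidence = decode_class([0.1, 0.7, 0.2], sample_map)
print(name, confidence)  # beagle 0.7
```

Returning the probability along with the name makes it easy to show how confident the model is in each title.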
