A Quick Tour of TensorFlow
The main purpose of this post is to show how to train TensorFlow on your own data. It assumes you have already analyzed the data and picked a model suited to the task, with some references in hand. Here the focus is on the training workflow itself, built on TensorFlow.
TensorFlow offers a number of ways to handle data; below we introduce three common ways to load it.
TensorFlow Data Loading Methods
- feed_dict: I do not recommend loading training data this way. Feeding is driven from Python and constrained by the GIL, so compute consumption does not scale with call volume; see the sketch after this list.
- TFRecords: TensorFlow's native data format. When you only have a small dataset, learning this format is heavyweight; at the same time, TFRecord serialization is costly and deserialization is easy to get wrong. I have used it with models from the TensorFlow Model Zoo, so it is still worth knowing for the right occasions.
- tf.data.Dataset: the approach I recommend. There is not much written about it online, which is the purpose of this article: to introduce and use tf.data.Dataset.
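To make the feed_dict versus tf.data contrast concrete, here is a minimal sketch on toy data (the shapes, batch size, and prefetch depth are illustrative assumptions, not from the original post):

import numpy as np
import tensorflow as tf

features = np.random.rand(100, 4).astype(np.float32)  # toy stand-in for real data

# feed_dict: every batch is pushed from Python, throttled by the GIL
x_ph = tf.placeholder(tf.float32, shape=[None, 4])
mean_op = tf.reduce_mean(x_ph)
with tf.Session() as sess:
    for i in range(0, 100, 10):
        sess.run(mean_op, feed_dict={x_ph: features[i:i + 10]})

# tf.data.Dataset: batching and prefetching run inside the TensorFlow runtime
dataset = tf.data.Dataset.from_tensor_slices(features).batch(10).prefetch(2)
batch = dataset.make_one_shot_iterator().get_next()
batch_mean_op = tf.reduce_mean(batch)
with tf.Session() as sess:
    for _ in range(10):
        sess.run(batch_mean_op)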
A Face Recognition Application
Here we build a simple face-verification application with a small network: given two face images, it predicts whether they show the same person.
import tensorflow as tf

class Inputs(object):
    def __init__(self, img1, img2, label):
        self.img1 = img1
        self.img2 = img2
        self.label = label

class Model(object):
    def __init__(self, inputs: Inputs):
        self.inputs = inputs
        self.predictions = self.predict(inputs)
        self.loss = self.calculate_loss(inputs, self.predictions)
        self.opt_step = tf.train.AdamOptimizer(learning_rate=0.001).minimize(self.loss)

    def predict(self, inputs):
        # Subtract the two face images and classify the difference
        with tf.name_scope('image_subtraction'):
            img_diff = inputs.img1 - inputs.img2
        x = img_diff
        with tf.name_scope('conv_relu_maxpool'):
            for conv_layer_i in range(5):
                x = tf.layers.conv2d(x,
                                     filters=20 * (conv_layer_i + 1),
                                     kernel_size=3,
                                     activation=tf.nn.relu)
                x = tf.layers.max_pooling2d(x, pool_size=3, strides=2)
        with tf.name_scope('fully_connected'):
            x = tf.layers.flatten(x)
            # A single hidden layer; increase the range for a deeper head
            for fc_layer_i in range(1):
                x = tf.layers.dense(x, units=200, activation=tf.nn.relu)
        with tf.name_scope('linear_predict'):
            predict_logits = tf.layers.dense(x, 1, activation=None)
        # One logit per pair: the "same person" score
        return tf.squeeze(predict_logits)

    def calculate_loss(self, inputs, prediction_logits):
        with tf.name_scope('calculate_loss'):
            return tf.reduce_mean(
                tf.nn.sigmoid_cross_entropy_with_logits(labels=inputs.label,
                                                        logits=prediction_logits))
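Before plugging in the real input pipeline, it can help to smoke-test the graph with placeholders and random arrays (a one-off debugging use where feed_dict is fine). This is a minimal sketch using the Inputs and Model classes above; the 128x128x3 shape matches the resize target used later, everything else is an illustrative assumption:

import numpy as np

img1_ph = tf.placeholder(tf.float32, shape=[None, 128, 128, 3])
img2_ph = tf.placeholder(tf.float32, shape=[None, 128, 128, 3])
label_ph = tf.placeholder(tf.float32, shape=[None])
model = Model(Inputs(img1_ph, img2_ph, label_ph))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    logits = sess.run(model.predictions,
                      feed_dict={img1_ph: np.random.rand(2, 128, 128, 3),
                                 img2_ph: np.random.rand(2, 128, 128, 3)})
    print(logits.shape)  # (2,): one logit per image pair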
The Data (LFW as an example)
The folder structure looks like this:
/lfw
/lfw/lilei/lilei_1.jpg
/lfw/lilei/lilei_2.jpg
…
/lfw/zhuangxy/zhuangxy_1.jpg
/lfw/zhuangxy/zhuangxy_2.jpg
There are 18,984 images here in total, so we use a Python generator to produce training pairs on the fly:
import os
import glob
import random

class PairGenerator(object):
    # Dictionary keys shared with the tf.data pipeline below
    person1 = 'person1'
    person2 = 'person2'
    label = 'same_person'

    def __init__(self, lfw_path='lfw'):
        # lfw_path is the root folder laid out as shown above
        self.all_people = self.generate_all_people_dict(lfw_path)

    def generate_all_people_dict(self, lfw_path):
        # Map each person to the list of their photo paths
        all_people = {}
        for person_folder in os.listdir(lfw_path):
            person_photos = glob.glob(os.path.join(lfw_path, person_folder, '*.jpg'))
            all_people[person_folder] = person_photos
        return all_people

    def get_next_pair(self):
        while True:
            person1 = random.choice(list(self.all_people.keys()))
            # Half of the pairs show the same person, half show two different people
            same_person = random.random() > 0.5
            if same_person:
                person2 = person1
            else:
                person2 = random.choice(list(self.all_people.keys()))
            yield {self.person1: random.choice(self.all_people[person1]),
                   self.person2: random.choice(self.all_people[person2]),
                   self.label: same_person}
Reading, resizing, sampling, and augmenting images is time-consuming and compute-hungry, so I moved this work into TensorFlow itself. This is the core of the article:
import tensorflow as tf

from .pair_generator import PairGenerator
from .model import Inputs

class Dataset(object):
    img1_resized = 'img1_resized'
    img2_resized = 'img2_resized'
    label = 'same_person'

    def __init__(self, generator=PairGenerator()):
        self.next_element = self.build_iterator(generator)

    def build_iterator(self, pair_gen: PairGenerator):
        batch_size = 10
        prefetch_batch_buffer = 5

        dataset = tf.data.Dataset.from_generator(
            pair_gen.get_next_pair,
            output_types={PairGenerator.person1: tf.string,
                          PairGenerator.person2: tf.string,
                          PairGenerator.label: tf.bool})
        dataset = dataset.map(self._read_image_and_resize)
        dataset = dataset.batch(batch_size)
        # Keep a few batches ready so the GPU never waits for input
        dataset = dataset.prefetch(prefetch_batch_buffer)
        iterator = dataset.make_one_shot_iterator()
        element = iterator.get_next()
        return Inputs(element[self.img1_resized],
                      element[self.img2_resized],
                      element[PairGenerator.label])

    def _read_image_and_resize(self, pair_element):
        target_size = [128, 128]
        # Read the two image files and decode them into uint8 tensors
        img1_file = tf.read_file(pair_element[PairGenerator.person1])
        img2_file = tf.read_file(pair_element[PairGenerator.person2])
        img1 = tf.image.decode_image(img1_file)
        img2 = tf.image.decode_image(img2_file)
        # decode_image leaves the shape unknown; declare rank and channels
        img1.set_shape([None, None, 3])
        img2.set_shape([None, None, 3])
        img1_resized = tf.image.resize_images(img1, target_size)
        img2_resized = tf.image.resize_images(img2, target_size)

        pair_element[self.img1_resized] = img1_resized
        pair_element[self.img2_resized] = img2_resized
        pair_element[self.label] = tf.cast(pair_element[PairGenerator.label], tf.float32)
        return pair_element
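Since the whole point of moving preprocessing into TensorFlow is throughput, note that the map step can also process several elements in parallel. This is a one-line tweak to the pipeline above (the thread count of 4 is an arbitrary assumption, not from the original post):

# Decode and resize up to 4 elements concurrently instead of one at a time
dataset = dataset.map(self._read_image_and_resize, num_parallel_calls=4)

Together with prefetch, num_parallel_calls is the standard knob for keeping the accelerator fed.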
That is all of the core code; what remains is a brief look at model training.
Model Training
import tensorflow as tf

from recognizer.pair_generator import PairGenerator
from recognizer.tf_dataset import Dataset
from recognizer.model import Model

def main():
    generator = PairGenerator()
    # Peek at a couple of raw pairs to check the generator works
    pair_iter = generator.get_next_pair()
    for i in range(2):
        print(next(pair_iter))

    ds = Dataset(generator)
    model_input = ds.next_element
    model = Model(model_input)

    with tf.Session() as sess:
        # Pull one batch through the input pipeline as a smoke test
        (img1, img2, label) = sess.run([model_input.img1,
                                        model_input.img2,
                                        model_input.label])
        sess.run(tf.global_variables_initializer())
        for step in range(100):
            (_, current_loss) = sess.run([model.opt_step, model.loss])
            print('step %d, loss %f' % (step, current_loss))

if __name__ == '__main__':
    main()
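The loop above trains but never persists what it learns. As a minimal sketch, assuming a ./checkpoints directory (the path and save interval are illustrative, not from the original post), the session block could save weights with tf.train.Saver:

saver = tf.train.Saver()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(100):
        (_, current_loss) = sess.run([model.opt_step, model.loss])
        if step % 25 == 0:
            # Writes variables to ./checkpoints/model-<step>
            saver.save(sess, './checkpoints/model', global_step=step)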
And with that, we have taken a quick pass through the entire TensorFlow workflow; I hope it helps you in your own learning and use. This post is a study write-up based on [1]; criticism and corrections are welcome.
[1]: https://towardsdatascience.com/how-to-quickly-build-a-tensorflow-training-pipeline-15e9ae4d78a0