The queue logic in David's facenet

Mute杭盖

Published 2020-07-27 21:32:03

Category: face recognition. Tags: deep learning, machine learning, pytorch, tensorflow

Article link: https://blog.csdn.net/HeavenerWen/article/details/103708435

The `queue` in `train_softmax`

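This queue is used to index the 428268 image paths in image_list and their corresponding labels in label_list. In other words, it produces indices from 0 to 428267.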

# Create a queue that produces indices into the image_list and label_list
labels = ops.convert_to_tensor(label_list, dtype=tf.int32)
# This yields Tensor("Const:0", shape=(428268,), dtype=int32),
# i.e. a tensor made up of 428268 elements.

This converts the Python list into a tensor. The shape (428268,) shows that the tensor was built directly from the list, one element per label.

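range_size = array_ops.shape(labels)[0]
# I expected range_size to show the value 428268, but the debugger only shows a symbolic scalar tensor, hence shape=(). The value 428268 appears only once the tensor is evaluated in a session.
{Tensor}Tensor("strided_slice:0", shape=(), dtype=int32)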

Next, we actually create the queue of indices.

index_queue = tf.train.range_input_producer(range_size, num_epochs=None, shuffle=True, seed=None, capacity=32)
# This gives us the queue created for the indices.
In the debugger we then see
index_queue={FIFOQueue}<tensorflow.python.ops.data_flow_ops.FIFOQueue object at 0x7f1637f15668>
# which shows that the FIFOQueue now exists.
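
To make the behaviour of this index queue more concrete, here is a minimal pure-Python sketch of what `range_input_producer` with `shuffle=True` and `num_epochs=None` conceptually produces: a fresh random permutation of 0..range_size-1 on every pass, repeated without end. The name `shuffled_index_stream` is hypothetical, and this is only a simulation for illustration, not TensorFlow's implementation (it ignores `capacity` and the background queue runner).

```python
import itertools
import random


def shuffled_index_stream(range_size, seed=None):
    """Yield 0..range_size-1 in a new random order on every pass, forever."""
    rng = random.Random(seed)
    while True:  # num_epochs=None: the stream never ends
        order = list(range(range_size))
        rng.shuffle(order)
        yield from order


stream = shuffled_index_stream(10, seed=0)
first_pass = list(itertools.islice(stream, 10))
second_pass = list(itertools.islice(stream, 10))

# Each pass contains every index exactly once, in shuffled order.
print(first_pass)
print(second_pass)
```

The small range of 10 stands in for the 428268 entries of the post; the structure is the same at full size.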

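Then the dequeue operation is defined.

index_dequeue_op = index_queue.dequeue_many(90*1000, 'index_dequeue')
# Dequeue the indices. Does it really dequeue 90000 indices in one go?
# The number comes from one epoch having 1000 batches of 90 images each, hence 90000 indices.

This touches on how the queue actually works. Is it the case that only batch_size items are really being processed in memory at a time, while batch_size*args.epoch_size (90*1000, i.e. 90000) items are lined up waiting in the queue? How exactly are they lined up? Which ones get used and which do not?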
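
One way to picture the question above is the following sketch. It assumes the index queue behaves like an endless stream of shuffled permutations (the hypothetical `index_stream` stand-in below), and that one dequeue_many call simply takes the next 90*1000 indices from it, which the training loop then consumes 90 at a time. It is a reading of the mechanism under those assumptions, not the facenet code itself.

```python
import itertools
import random


def index_stream(n, seed=0):
    """Hypothetical stand-in for the shuffled index queue."""
    rng = random.Random(seed)
    while True:
        order = list(range(n))
        rng.shuffle(order)
        yield from order


nrof_examples = 428268
batch_size, epoch_size = 90, 1000

stream = index_stream(nrof_examples)

# Stand-in for sess.run(index_dequeue_op): take the next 90*1000 indices.
index_epoch = list(itertools.islice(stream, batch_size * epoch_size))

# The 90000 indices of one epoch, viewed as 1000 batches of 90.
batches = [index_epoch[i:i + batch_size]
           for i in range(0, len(index_epoch), batch_size)]

# How many such epochs are needed to see every image once.
epochs_per_full_pass = nrof_examples / len(index_epoch)
print(len(batches), len(batches[0]), round(epochs_per_full_pass, 2))
```

Under this reading, each 90000-index chunk covers only about a fifth of the 428268 images, so which images are used in a given epoch is decided purely by the shuffled order of the stream.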

I. How data augmentation and fixed_standardization are carried out

nrof_preprocess_threads = 4
# This is the number of preprocessing threads used.

First we create a queue that holds the data. The earlier queue only held indices.

input_queue = data_flow_ops.FIFOQueue(capacity=2000000,
                                    dtypes=[tf.string, tf.int32, tf.int32],
                                    shapes=[(1,), (1,), (1,)],
                                    shared_name=None, name=None)

The dtypes string, int32, int32 should correspond to image_path, label and control respectively.

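What is the difference and the relationship between the 2000000 here and the 90000 above? The capacity of 2000000 is only an upper bound on how many elements the queue may hold, whereas 90000 is the number of indices dequeued per epoch, so the capacity just has to be comfortably larger.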

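Now that input_queue exists, we need to define how data gets put into this data queue, i.e. the enqueue operation.

enqueue_op = input_queue.enqueue_many([image_paths_placeholder, labels_placeholder, control_placeholder], name='enqueue_op')

image_paths_placeholder, labels_placeholder and control_placeholder are enqueued component by component, matching the dtypes and shapes declared for the queue.

image_paths_placeholder holds many image paths at once, hence the plural "paths". My first guess was the paths of the 90 samples in one batch, but since 90*1000 indices are dequeued at a time, it is more likely fed a whole epoch of paths per enqueue call.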
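
As a rough illustration of why the queue declares shapes=[(1,), (1,), (1,)], here is a small pure-Python sketch of the enqueue_many idea: each input has shape (N, 1), and it is sliced along the first dimension so that N elements, each with components of shape (1,), end up in the queue. The `enqueue_many` helper, the `deque` and the file paths below are made up for the illustration; this only mimics the slicing behaviour, it is not TensorFlow's FIFOQueue.

```python
from collections import deque


def enqueue_many(q, *columns):
    """Slice every column along its first dimension and push one tuple per row."""
    if len({len(col) for col in columns}) != 1:
        raise ValueError("all components must have the same number of rows")
    for row in zip(*columns):
        q.append(row)


input_queue = deque()

image_paths_array = [["a/0.png"], ["a/1.png"], ["b/0.png"]]  # shape (3, 1), made-up paths
labels_array = [[0], [0], [1]]                                # shape (3, 1)
control_array = [[12], [12], [12]]                            # shape (3, 1)

enqueue_many(input_queue, image_paths_array, labels_array, control_array)

# A single dequeue returns one element whose components each have shape (1,).
filenames, label, control = input_queue.popleft()
print(filenames, label, control)
```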

The data preprocessing itself is set up by the following line:

image_batch, label_batch = facenet.create_input_pipeline(input_queue, image_size, nrof_preprocess_threads, batch_size_placeholder)

And this is the beginning of that function in facenet:

def create_input_pipeline(input_queue, image_size, nrof_preprocess_threads, batch_size_placeholder):
    images_and_labels_list = []
    for _ in range(nrof_preprocess_threads):
        filenames, label, control = input_queue.dequeue()

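The line filenames, label, control = input_queue.dequeue() is the key part: the value of control then decides which preprocessing steps are applied.

This is where the trick lies. There is a control_placeholder here, and, like the oriole lurking behind the mantis that stalks the cicada, a matching control_array shows up later on.

control_array is defined as follows: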

# Firstly we define control value
control_value = facenet.RANDOM_ROTATE * random_rotate + facenet.RANDOM_CROP * random_crop + facenet.RANDOM_FLIP * random_flip + facenet.FIXED_STANDARDIZATION * use_fixed_image_standardization
# Then
control_array = np.ones_like(labels_array) * control_value

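With the settings used here this evaluates to control_value = 1 * False + 2 * False + 4 * True + 8 * True = 12.

That is why we get the values shown in the following figure.

[image]
Note: the augmentation and preprocessing options actually in use here are use_fixed_image_standardization and random_flip.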
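
The control value is effectively a bit mask, one bit per option. The sketch below redoes the arithmetic in plain Python. The constants 1, 2, 4 and 8 follow the expression above; the pure-Python `get_control_flag` is my own assumption of how such a flag can be tested (integer-divide by the field, then check the lowest bit) and is not copied from facenet.

```python
RANDOM_ROTATE = 1
RANDOM_CROP = 2
RANDOM_FLIP = 4
FIXED_STANDARDIZATION = 8


def get_control_flag(control, field):
    """Return True if the bit belonging to `field` is set in `control`."""
    return (control // field) % 2 == 1


# Settings from the post: only flip and fixed standardization are enabled.
random_rotate, random_crop = False, False
random_flip, use_fixed_image_standardization = True, True

control_value = (RANDOM_ROTATE * random_rotate
                 + RANDOM_CROP * random_crop
                 + RANDOM_FLIP * random_flip
                 + FIXED_STANDARDIZATION * use_fixed_image_standardization)

enabled = {
    "rotate": get_control_flag(control_value, RANDOM_ROTATE),
    "crop": get_control_flag(control_value, RANDOM_CROP),
    "flip": get_control_flag(control_value, RANDOM_FLIP),
    "fixed_standardization": get_control_flag(control_value, FIXED_STANDARDIZATION),
}
print(control_value, enabled)
```

Because every option owns its own bit, a single integer per image is enough to carry all four switches through the queue.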

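With the code above in place, the real "oriole" finally makes its long-awaited appearance.

sess.run(enqueue_op, {image_paths_placeholder: image_paths_array, labels_placeholder: labels_array, control_placeholder: control_array})

This line shows that the actual enqueue operation enqueue_op starts here, so the elements are enqueued into input_queue together with the current control_array.

The data has been enqueued, but no preprocessing or augmentation has happened yet. That only takes place in the ops built by facenet.create_input_pipeline(input_queue, x, x, x).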
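
The split between "enqueue now, preprocess later" is an ordinary producer/consumer pattern. Below is a minimal sketch using Python's standard `queue` and `threading` modules instead of TensorFlow queues; the `preprocess` placeholder, the file names and the sentinel-based shutdown are all assumptions made for the illustration.

```python
import queue
import threading

input_queue = queue.Queue()
results = queue.Queue()
SENTINEL = None  # tells a worker thread to stop


def preprocess(path, control):
    # Placeholder for "decode + augment"; it just tags the path.
    return f"{path}|control={control}"


def worker():
    while True:
        item = input_queue.get()
        if item is SENTINEL:
            break
        path, label, control = item
        results.put((preprocess(path, control), label))


# Step 1: enqueue. Items are only stored, nothing is preprocessed yet.
for i in range(20):
    input_queue.put((f"img_{i}.png", i % 5, 12))
processed_before_workers = results.qsize()

# Step 2: four consumer threads dequeue and preprocess.
threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for _ in threads:
    input_queue.put(SENTINEL)
for t in threads:
    t.join()

processed = [results.get() for _ in range(results.qsize())]
print(processed_before_workers, len(processed))
```

The order of `processed` depends on thread scheduling, which is the same reason a multi-threaded input pipeline does not preserve enqueue order.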

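So the actual enqueueing into input_queue and the image preprocessing come from two lines of code, the enqueue_op definition shown earlier and the create_input_pipeline call below, which only take effect once they are executed through sess.run.

image_batch, label_batch = facenet.create_input_pipeline(input_queue, image_size, nrof_preprocess_threads, batch_size_placeholder)

The resulting image_batch and label_batch look like this: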

image_batch = {Tensor}Tensor("batch_join:0", shape=(?, 112, 112, 3), dtype=float32)

 label_batch = {Tensor}Tensor("batch_join:1", shape=(?,), dtype=int32)

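The queue set up as described above is passed in as an argument. Inside the function, elements are first dequeued from input_queue. Then, in each thread branch, calls such as get_control_flag(control[0], RANDOM_FLIP) serve as the condition for the lambda branches, which select how data augmentation and preprocessing (FIXED_STANDARDIZATION) are applied.

Once the queue is passed in, the input_queue.dequeue() inside create_input_pipeline is the crucial step before any preprocessing. The following lines then perform the multi-threaded, per-image preprocessing and augmentation.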

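import tensorflow as tf  # TF 1.x API

# Excerpt only; get_control_flag, random_rotate_image and RANDOM_ROTATE are defined elsewhere in facenet
def create_input_pipeline(input_queue, image_size, nrof_preprocess_threads, batch_size_placeholder):
    images_and_labels_list = []
    # One dequeue and preprocessing branch per thread
    for _ in range(nrof_preprocess_threads):
        filenames, label, control = input_queue.dequeue()
        images = []
        for filename in tf.unstack(filenames):
            file_contents = tf.read_file(filename)
            image = tf.image.decode_image(file_contents, 3)
            # Rotate only if the RANDOM_ROTATE bit is set in control
            image = tf.cond(get_control_flag(control[0], RANDOM_ROTATE),
                            lambda: tf.py_func(random_rotate_image, [image], tf.uint8),
                            lambda: tf.identity(image))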

[image]
Looking at filenames, label and control in the figure above, it is still hard to tell what the three of them concretely look like.
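
Since the three dequeued components are hard to picture, here is a small pure-Python sketch of one dequeued element and of flag-driven preprocessing. Everything in it is illustrative: the path, label and pixel values are made up, the image is a nested list rather than a tensor, and the fixed standardization formula (pixel - 127.5) / 128.0 is an assumption about what that option does, not something stated in this post. When the fixed standardization bit is off, this sketch simply leaves the pixels unchanged.

```python
import random

RANDOM_FLIP = 4
FIXED_STANDARDIZATION = 8


def get_control_flag(control, field):
    """Return True if the bit belonging to `field` is set in `control`."""
    return (control // field) % 2 == 1


def preprocess(image, control, rng):
    """image: list of rows of pixel values in 0..255."""
    if get_control_flag(control, RANDOM_FLIP) and rng.random() < 0.5:
        image = [row[::-1] for row in image]  # horizontal flip
    if get_control_flag(control, FIXED_STANDARDIZATION):
        # Assumed formula, maps 0..255 to roughly -1..1
        image = [[(p - 127.5) / 128.0 for p in row] for row in image]
    return image


# One dequeued element: every component has shape (1,).
filenames, label, control = ["some/dir/0001.png"], [7], [12]

fake_pixels = [[0, 255], [128, 64]]  # stands in for the decoded image
out = preprocess(fake_pixels, control[0], random.Random(0))
untouched = preprocess(fake_pixels, 0, random.Random(0))
print(filenames[0], label[0], control[0], out)
```

Passing control[0] mirrors the indexing in the excerpt above: the component has shape (1,), so the scalar flag value has to be pulled out before testing its bits.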

These are the prelogits produced by the network (the tensor name indicates InceptionResnetV1):

prelogits={Tensor}
Tensor("InceptionResnetV1/Bottleneck/BatchNorm/Reshape_1:0", shape=(?, 512), dtype=float32)
