TensorFlow: Building a Person Recognition System

Build your own CNN model for person recognition from scratch, to identify who is in an image. Here we use the S.H.E members Ella and Selina as an example!

This is only a simple example; the point is to understand the machine learning workflow and where the difficulties lie, for instance:
- data (sample quantity, sample quality)
- model (structure, algorithm)
- training method (weight initialization, learning rate)

Machine learning needs a large number of training samples, but collecting data at scale and labeling it item by item is not easy. The overall process is:
1. Crawl images for a given keyword (e.g. from Baidu or Google)
2. Post-process the crawled images as needed (e.g. detect and crop faces with OpenCV)
3. Screen and organize the images (filter them, resize them, etc.)
4. Prepare the label file
5. Write the model
6. Train on the data
7. Test and verify

Versions
TensorFlow 1.2 + OpenCV 2.5

Related articles:
[url=http://rensanning.iteye.com/blog/2381794]TensorFlow: Getting Started[/url]
[url=http://rensanning.iteye.com/blog/2382529]TensorFlow: Handwritten Digit Recognition with MNIST[/url]
[url=http://rensanning.iteye.com/blog/2381885]TensorFlow: Object Detection[/url]
[url=http://rensanning.iteye.com/blog/2383607]TensorFlow: Building a Person Recognition System[/url]

[b](1) File Structure[/b]

/usr/local/tensorflow/sample/tf-she-image
[quote]├ ckpt [checkpoint files]
├ data [training results (TensorBoard logs)]
├ eval_images [evaluation images]
├ face [faces extracted by OpenCV]
│ ├ ella
│ └ selina
├ original [raw images crawled from Baidu Images]
│ ├ ella
│ └ selina
├ test [images for checking model accuracy after training]
│ ├ data.txt [image paths and labels]
│ ├ ella
│ └ selina
└ train [training images]
  ├ data.txt [image paths and labels]
  ├ ella
  └ selina[/quote]

[b](2) Crawling Images[/b]

Crawl images from Baidu image search results by keyword. There are plenty of simple Python examples online for crawling Baidu Images. Since these images will be the machine's training samples, crawl as many high-quality images with clear facial features as possible.
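As a rough illustration (this is not the author's crawler): once the search results have been scraped into a list of image URLs, a minimal stdlib-only downloader could look like the sketch below. The helper names are my own; the sequential `<index>.jpg` naming is chosen to match what the face_detect.py script below expects.

```python
import os
import urllib.request

def target_path(save_dir, index):
    """Sequential file name matching face_detect.py's str(i) + '.jpg' scheme."""
    return os.path.join(save_dir, str(index) + '.jpg')

def download_all(urls, save_dir):
    """Download each URL to save_dir as 0.jpg, 1.jpg, ...; skip failures."""
    os.makedirs(save_dir, exist_ok=True)
    for i, url in enumerate(urls):
        try:
            urllib.request.urlretrieve(url, target_path(save_dir, i))
        except Exception as e:
            print('skip %s: %s' % (url, e))

# Usage (assumes `urls` was collected from the search result pages):
# download_all(urls, '/usr/local/tensorflow/sample/tf-she-image/original/ella')
```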

[b]/usr/local/tensorflow/sample/tf-she-image/original/ella[/b]
[img]http://dl2.iteye.com/upload/attachment/0125/9400/ac7642ce-12f6-3abf-a80b-ee9a3ef5adb2.png[/img]

[b]/usr/local/tensorflow/sample/tf-she-image/original/selina[/b]
[img]http://dl2.iteye.com/upload/attachment/0125/9402/6000c140-f33c-35d8-b9df-20f6a5919b8e.png[/img]

[b](3) Extracting Faces[/b]

For face recognition samples, only the facial region is needed, so the crawled images require extra processing: use OpenCV to detect the faces in each image, crop them out, and save them.

[b]/usr/local/tensorflow/sample/tf-she-image/face/ella[/b]
[img]http://dl2.iteye.com/upload/attachment/0125/9404/7dd7e154-3d3c-3066-8554-7bfa93207361.png[/img]

[b]/usr/local/tensorflow/sample/tf-she-image/face/selina[/b]
[img]http://dl2.iteye.com/upload/attachment/0125/9406/5812bd0a-203e-3917-90d3-1b2cf81437b8.png[/img]

face_detect.py
[code="python"]import cv2
import numpy as np
import os.path

input_data_path = '/usr/local/tensorflow/sample/tf-she-image/original/ella/'
save_path = '/usr/local/tensorflow/sample/tf-she-image/face/ella/'
cascade_path = '/usr/share/OpenCV/haarcascades/haarcascade_frontalface_default.xml'
faceCascade = cv2.CascadeClassifier(cascade_path)

# crawled images are named 0.jpg, 1.jpg, ...; scan up to this many
image_count = 16000

face_detect_count = 0

for i in range(image_count):
    if os.path.isfile(input_data_path + str(i) + '.jpg'):
        try:
            img = cv2.imread(input_data_path + str(i) + '.jpg', cv2.IMREAD_COLOR)
            gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
            face = faceCascade.detectMultiScale(gray, 1.1, 3)

            if len(face) > 0:
                for rect in face:
                    # rect = [x, y, width, height] of one detected face
                    x = rect[0]
                    y = rect[1]
                    w = rect[2]
                    h = rect[3]

                    # save the cropped face region
                    cv2.imwrite(save_path + 'face-' + str(face_detect_count) + '.jpg', img[y:y+h, x:x+w])
                    face_detect_count = face_detect_count + 1
            else:
                print('image' + str(i) + ': No Face')
        except Exception as e:
            print('image' + str(i) + ': Exception - ' + str(e))
    else:
        print('image' + str(i) + ': No File')
[/code]


[b](4) Organizing the Images[/b]

Because of the uneven quality of the crawled images and OpenCV's detection errors, the images need to be screened again, keeping only those with genuinely clear facial features.

[b]/usr/local/tensorflow/sample/tf-she-image/train/ella[/b]
[img]http://dl2.iteye.com/upload/attachment/0125/9408/54104c9c-bece-3f74-aa57-9d8da2ebd46e.png[/img]

[b]/usr/local/tensorflow/sample/tf-she-image/train/selina[/b]
[img]http://dl2.iteye.com/upload/attachment/0125/9410/bbaa2d14-cb16-3999-b5e8-6993b7a9f048.png[/img]

[b][color=red]This step is very time-consuming! The higher the quality of the training samples, the more accurate the recognition. In the end we kept 380 images of Ella and 350 of Selina. Special thanks to everyone who has open-sourced labeled datasets![/color][/b]

The label file for the organized images: data.txt
[quote]/usr/local/tensorflow/sample/tf-she-image/train/ella/ella-00001.jpg 0
/usr/local/tensorflow/sample/tf-she-image/train/ella/ella-00002.jpg 0
/usr/local/tensorflow/sample/tf-she-image/train/ella/ella-00003.jpg 0
/usr/local/tensorflow/sample/tf-she-image/train/ella/ella-00004.jpg 0
/usr/local/tensorflow/sample/tf-she-image/train/ella/ella-00005.jpg 0
/usr/local/tensorflow/sample/tf-she-image/train/ella/ella-00006.jpg 0
/usr/local/tensorflow/sample/tf-she-image/train/ella/ella-00007.jpg 0
/usr/local/tensorflow/sample/tf-she-image/train/ella/ella-00008.jpg 0
。。。
/usr/local/tensorflow/sample/tf-she-image/train/selina/selina-00344.jpg 1
/usr/local/tensorflow/sample/tf-she-image/train/selina/selina-00345.jpg 1
/usr/local/tensorflow/sample/tf-she-image/train/selina/selina-00346.jpg 1
/usr/local/tensorflow/sample/tf-she-image/train/selina/selina-00347.jpg 1
/usr/local/tensorflow/sample/tf-she-image/train/selina/selina-00348.jpg 1
/usr/local/tensorflow/sample/tf-she-image/train/selina/selina-00349.jpg 1
/usr/local/tensorflow/sample/tf-she-image/train/selina/selina-00350.jpg 1[/quote]

*** For the accuracy-check (test) images you can simply reuse the training images (note that the reported test accuracy then does not measure generalization).

[b](5) Writing the Model[/b]

train.py
[code="python"]import sys
import cv2
import random
import numpy as np
import tensorflow as tf
import tensorflow.python.platform

NUM_CLASSES = 2
IMAGE_SIZE = 28
IMAGE_PIXELS = IMAGE_SIZE*IMAGE_SIZE*3

flags = tf.app.flags
FLAGS = flags.FLAGS
flags.DEFINE_string('train', '/usr/local/tensorflow/sample/tf-she-image/train/data.txt', 'File name of train data')
flags.DEFINE_string('test', '/usr/local/tensorflow/sample/tf-she-image/test/data.txt', 'File name of test data')
flags.DEFINE_string('train_dir', '/usr/local/tensorflow/sample/tf-she-image/data/', 'Directory to put the training data')
flags.DEFINE_integer('max_steps', 100, 'Number of steps to run trainer.')
flags.DEFINE_integer('batch_size', 20, 'Batch size. Must divide evenly into the dataset sizes.')
flags.DEFINE_float('learning_rate', 1e-4, 'Initial learning rate.')

def inference(images_placeholder, keep_prob):
    """The CNN: two conv/pool blocks, a fully connected layer with dropout, softmax output."""
    def weight_variable(shape):
        initial = tf.truncated_normal(shape, stddev=0.1)
        return tf.Variable(initial)

    def bias_variable(shape):
        initial = tf.constant(0.1, shape=shape)
        return tf.Variable(initial)

    def conv2d(x, W):
        return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

    def max_pool_2x2(x):
        return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                              strides=[1, 2, 2, 1], padding='SAME')

    x_image = tf.reshape(images_placeholder, [-1, IMAGE_SIZE, IMAGE_SIZE, 3])

    with tf.name_scope('conv1') as scope:
        W_conv1 = weight_variable([5, 5, 3, 32])
        b_conv1 = bias_variable([32])
        h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)

    with tf.name_scope('pool1') as scope:
        h_pool1 = max_pool_2x2(h_conv1)

    with tf.name_scope('conv2') as scope:
        W_conv2 = weight_variable([5, 5, 32, 64])
        b_conv2 = bias_variable([64])
        h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)

    with tf.name_scope('pool2') as scope:
        h_pool2 = max_pool_2x2(h_conv2)

    with tf.name_scope('fc1') as scope:
        # after two 2x2 poolings the 28x28 image is down to 7x7
        W_fc1 = weight_variable([7*7*64, 1024])
        b_fc1 = bias_variable([1024])
        h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
        h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
        h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

    with tf.name_scope('fc2') as scope:
        W_fc2 = weight_variable([1024, NUM_CLASSES])
        b_fc2 = bias_variable([NUM_CLASSES])

    with tf.name_scope('softmax') as scope:
        y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)

    return y_conv

def loss(logits, labels):
    cross_entropy = -tf.reduce_sum(labels*tf.log(logits))
    tf.summary.scalar("cross_entropy", cross_entropy)
    return cross_entropy

def training(loss, learning_rate):
    train_step = tf.train.AdamOptimizer(learning_rate).minimize(loss)
    return train_step

def accuracy(logits, labels):
    correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(labels, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    tf.summary.scalar("accuracy", accuracy)
    return accuracy

if __name__ == '__main__':
    # load the training set: each line of data.txt is "<image path> <label>"
    f = open(FLAGS.train, 'r')
    train_image = []
    train_label = []
    for line in f:
        line = line.rstrip()
        l = line.split()
        img = cv2.imread(l[0])
        img = cv2.resize(img, (IMAGE_SIZE, IMAGE_SIZE))
        train_image.append(img.flatten().astype(np.float32)/255.0)
        tmp = np.zeros(NUM_CLASSES)
        tmp[int(l[1])] = 1  # one-hot label
        train_label.append(tmp)
    train_image = np.asarray(train_image)
    train_label = np.asarray(train_label)
    f.close()

    # load the test set the same way
    f = open(FLAGS.test, 'r')
    test_image = []
    test_label = []
    for line in f:
        line = line.rstrip()
        l = line.split()
        img = cv2.imread(l[0])
        img = cv2.resize(img, (IMAGE_SIZE, IMAGE_SIZE))
        test_image.append(img.flatten().astype(np.float32)/255.0)
        tmp = np.zeros(NUM_CLASSES)
        tmp[int(l[1])] = 1
        test_label.append(tmp)
    test_image = np.asarray(test_image)
    test_label = np.asarray(test_label)
    f.close()

    with tf.Graph().as_default():
        images_placeholder = tf.placeholder("float", shape=(None, IMAGE_PIXELS))
        labels_placeholder = tf.placeholder("float", shape=(None, NUM_CLASSES))
        keep_prob = tf.placeholder("float")
        logits = inference(images_placeholder, keep_prob)
        loss_value = loss(logits, labels_placeholder)
        train_op = training(loss_value, FLAGS.learning_rate)
        acc = accuracy(logits, labels_placeholder)

        saver = tf.train.Saver()
        sess = tf.Session()
        sess.run(tf.global_variables_initializer())

        summary_op = tf.summary.merge_all()
        summary_writer = tf.summary.FileWriter(FLAGS.train_dir, sess.graph)

        for step in range(FLAGS.max_steps):
            # one pass over the training set in mini-batches
            for i in range(int(len(train_image)/FLAGS.batch_size)):
                batch = FLAGS.batch_size*i
                sess.run(train_op, feed_dict={
                    images_placeholder: train_image[batch:batch+FLAGS.batch_size],
                    labels_placeholder: train_label[batch:batch+FLAGS.batch_size],
                    keep_prob: 0.5})

            train_accuracy = sess.run(acc, feed_dict={
                images_placeholder: train_image,
                labels_placeholder: train_label,
                keep_prob: 1.0})
            print("step %d, training accuracy %g" % (step, train_accuracy))

            summary_str = sess.run(summary_op, feed_dict={
                images_placeholder: train_image,
                labels_placeholder: train_label,
                keep_prob: 1.0})
            summary_writer.add_summary(summary_str, step)

        print("test accuracy %g" % sess.run(acc, feed_dict={
            images_placeholder: test_image,
            labels_placeholder: test_label,
            keep_prob: 1.0}))

        save_path = saver.save(sess, '/usr/local/tensorflow/sample/tf-she-image/ckpt/model.ckpt')
[/code]
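One caveat about loss() in train.py: taking tf.log of softmax outputs directly can hit log(0) = -inf when a probability underflows, which turns the loss into NaN. TensorFlow 1.x provides the fused tf.nn.softmax_cross_entropy_with_logits (applied to the pre-softmax logits) to avoid this; using it would be a change to the code above, not what the author wrote. The pure-Python sketch below (function names are illustrative, not from the article) shows why the fused log-sum-exp form is stable while the naive softmax-then-log form is not:

```python
import math

def naive_xent(logits, label_index):
    """Softmax then log, as train.py's loss() does: exp overflows for large logits."""
    exps = [math.exp(z) for z in logits]          # math.exp overflows near z ~ 710
    probs = [e / sum(exps) for e in exps]
    return -math.log(probs[label_index])          # log(0) fails for underflowed probs

def stable_xent(logits, label_index):
    """Fused log-softmax via the log-sum-exp trick (what the fused TF op uses)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label_index]
```

For moderate logits the two agree; for large logits only the stable form survives.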


[b](6) Training[/b]

The closer accuracy is to 1, the better the model fits.
[quote](tensorflow) # python train.py
step 0, training accuracy 0.479452
step 1, training accuracy 0.479452
step 2, training accuracy 0.480822
step 3, training accuracy 0.505479
step 4, training accuracy 0.531507
step 5, training accuracy 0.609589
step 6, training accuracy 0.630137
step 7, training accuracy 0.639726
step 8, training accuracy 0.732877
step 9, training accuracy 0.713699
。。。。。。
step 89, training accuracy 0.994521
step 90, training accuracy 0.994521
step 91, training accuracy 0.994521
step 92, training accuracy 0.994521
step 93, training accuracy 0.994521
step 94, training accuracy 0.994521
step 95, training accuracy 0.994521
step 96, training accuracy 0.994521
step 97, training accuracy 0.994521
step 98, training accuracy 0.994521
step 99, training accuracy 0.994521
test accuracy 0.994521[/quote]

After training finishes, the following files are generated in /usr/local/tensorflow/sample/tf-she-image/ckpt/:
[quote]model.ckpt.index
model.ckpt.meta
model.ckpt.data-00000-of-00001
checkpoint[/quote]

[b](7) Viewing the Training Results[/b]

[code="java"](tensorflow) # tensorboard --logdir=/usr/local/tensorflow/sample/tf-she-image/data[/code]

[img]http://dl2.iteye.com/upload/attachment/0125/9412/5dd140a9-62ae-35e2-8d4d-5423f12984e1.png[/img]
[img]http://dl2.iteye.com/upload/attachment/0125/9414/b987c699-0418-3359-ae1d-ad3adeed8dd0.png[/img]

[b](8) Testing and Verification[/b]

Prepare four images and check whether they are recognized correctly:

[b]test-ella-01.jpg[/b]
[img]http://dl2.iteye.com/upload/attachment/0125/9418/0521aa4f-f327-3d52-9bac-482851e2a294.jpg[/img]

[b]test-ella-02.jpg[/b]
[img]http://dl2.iteye.com/upload/attachment/0125/9420/3db7325f-5f4e-3bed-817e-0c9342ee151d.jpg[/img]

[b]test-selina-01.jpg[/b]
[img]http://dl2.iteye.com/upload/attachment/0125/9422/ae425953-c5ca-3d71-97b2-d82474e9bee7.jpg[/img]

[b]test-selina-02.jpg[/b]
[img]http://dl2.iteye.com/upload/attachment/0125/9424/2d89309c-8585-37f1-b50d-4454a8e8f15b.jpg[/img]

Results:
[code="java"](tensorflow) # python eval.py
/usr/local/tensorflow/sample/tf-she-image/eval_images/test-ella-01.jpg
[{'name': 'ella', 'rate': 85.299999999999997, 'label': 0}, {'name': 'selina', 'rate': 14.699999999999999, 'label': 1}]
/usr/local/tensorflow/sample/tf-she-image/eval_images/test-ella-02.jpg
[{'name': 'ella', 'rate': 99.799999999999997, 'label': 0}, {'name': 'selina', 'rate': 0.20000000000000001, 'label': 1}]
/usr/local/tensorflow/sample/tf-she-image/eval_images/test-selina-01.jpg
[{'name': 'selina', 'rate': 100.0, 'label': 1}, {'name': 'ella', 'rate': 0.0, 'label': 0}]
/usr/local/tensorflow/sample/tf-she-image/eval_images/test-selina-02.jpg
[{'name': 'selina', 'rate': 99.900000000000006, 'label': 1}, {'name': 'ella', 'rate': 0.10000000000000001, 'label': 0}][/code]

[color=blue][b]The recognition rates are 85.3%, 99.8%, 100%, and 99.9% respectively. Not bad at all![/b][/color]

eval.py
[code="python"]import sys
import numpy as np
import cv2
import tensorflow as tf
import os
import random
import train

cascade_path = '/usr/share/OpenCV/haarcascades/haarcascade_frontalface_default.xml'
faceCascade = cv2.CascadeClassifier(cascade_path)

HUMAN_NAMES = {
    0: u"ella",
    1: u"selina"
}

def evaluation(img_path, ckpt_path):
    tf.reset_default_graph()

    img = cv2.imread(img_path, cv2.IMREAD_COLOR)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    face = faceCascade.detectMultiScale(gray, 1.1, 3)

    if len(face) > 0:
        for rect in face:
            random_str = str(random.random())

            # draw the detection rectangle and save the annotated image
            cv2.rectangle(img, tuple(rect[0:2]), tuple(rect[0:2]+rect[2:4]), (0, 0, 255), thickness=2)
            face_detect_img_path = '/usr/local/tensorflow/sample/tf-she-image/eval_images/' + random_str + '.jpg'
            cv2.imwrite(face_detect_img_path, img)

            x = rect[0]
            y = rect[1]
            w = rect[2]
            h = rect[3]

            # overwrite it with the cropped face region, which is what we classify
            cv2.imwrite('/usr/local/tensorflow/sample/tf-she-image/eval_images/' + random_str + '.jpg', img[y:y+h, x:x+w])
            target_image_path = '/usr/local/tensorflow/sample/tf-she-image/eval_images/' + random_str + '.jpg'
    else:
        print('image:No Face')
        return

    # preprocess exactly like training: resize to 28x28, flatten, scale to [0, 1]
    image = []
    img = cv2.imread(target_image_path)
    img = cv2.resize(img, (28, 28))
    image.append(img.flatten().astype(np.float32)/255.0)
    image = np.asarray(image)

    logits = train.inference(image, 1.0)

    sess = tf.InteractiveSession()
    saver = tf.train.Saver()
    sess.run(tf.global_variables_initializer())

    if ckpt_path:
        saver.restore(sess, ckpt_path)

    softmax = logits.eval()
    result = softmax[0]

    rates = [round(n * 100.0, 1) for n in result]
    humans = []
    for index, rate in enumerate(rates):
        name = HUMAN_NAMES[index]
        humans.append({
            'label': index,
            'name': name,
            'rate': rate
        })

    rank = sorted(humans, key=lambda x: x['rate'], reverse=True)

    print(img_path)
    print(rank)

    return [rank, os.path.basename(img_path), random_str + '.jpg']

if __name__ == '__main__':
    TEST_IMAGE_PATHS = ['test-ella-01.jpg', 'test-ella-02.jpg', 'test-selina-01.jpg', 'test-selina-02.jpg']
    for image_path in TEST_IMAGE_PATHS:
        evaluation('/usr/local/tensorflow/sample/tf-she-image/eval_images/' + image_path,
                   '/usr/local/tensorflow/sample/tf-she-image/ckpt/model.ckpt')
[/code]


Reference:
http://qiita.com/neriai/items/bd7bc36ec42c8ef65b2e
