Table of Contents
Deep Learning Practice: Building a CNN with TensorFlow-slim
张天天
zxt235813@163.com
https://github.com/piglaker
Overview
On a small dataset, build a convolutional neural network with tensorflow-slim for clothing classification, then refine the results with an SVM.
Code on GitHub
- TensorFlow-slim: https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/slim
- Crawler: obtaining the training data
- Model building: a shallow convolutional neural network
- TensorBoard
- Training & testing, code walkthrough, loss curve
- A bad test set
- Decision tree with an SVM
- Summary
1. Prerequisites:
python 3.5 (the python 3.6 bundled with Anaconda did not support TensorFlow at the time)
opencv
tensorflow-gpu
numpy
pillow
models: https://github.com/tensorflow/models
2. My hardware:
GTX 1080
i7
3. Details
Training set: 8000 .jpg images (crawled from JD.com and Baidu)
Test set: 3000 .jpg images (crawled from )
STEPS = 20000
Learning_rate = 0.005
accuracy = 0.84
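To show how the settings above fit together, here is a minimal numpy sketch of the momentum-SGD update mentioned later in the concepts list. Only Learning_rate=0.005 and STEPS=20000 come from the text; the momentum coefficient 0.9 and the toy objective are illustrative assumptions.

```python
import numpy as np

def momentum_sgd_step(w, grad, velocity, lr=0.005, momentum=0.9):
    """One momentum-SGD update: v = m*v - lr*grad; w = w + v."""
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity

# Toy example: minimize f(w) = w^2, whose gradient is 2w.
w, v = np.array([1.0]), np.array([0.0])
for _ in range(20000):  # STEPS = 20000, as in the settings above
    w, v = momentum_sgd_step(w, 2.0 * w, v)
# After 20000 steps, w has converged very close to the minimum at 0.
```

In a real run the gradient would come from backpropagation through the network rather than a closed-form derivative.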
TensorFlow-slim
Skipped here (but important!)
See the tensorflow-slim repository linked above
Crawler
requests & regular expressions
Details omitted (too lazy to write them up)
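As a rough illustration of the requests + regex approach, the sketch below extracts image URLs from a page with a regular expression. The HTML snippet and URL pattern are invented for demonstration; the real crawler targeted JD.com and Baidu pages.

```python
import re

# A hypothetical HTML snippet standing in for a fetched product page;
# in the real crawler this text would come from requests.get(url).text.
html = '''
<img src="https://example.com/img/shirt_001.jpg" alt="shirt">
<img src="https://example.com/img/pants_002.jpg" alt="pants">
'''

# Extract every .jpg image URL via a capture group.
pattern = re.compile(r'src="(https?://[^"]+\.jpg)"')
urls = pattern.findall(html)
print(urls)  # two URLs, in document order
```

Each extracted URL would then be downloaded and saved, e.g. with `open(name, 'wb').write(requests.get(url).content)`.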
Building the CNN model
What is a CNN?
'C' stands for Convolutional
'NN' stands for Neural Network
Background reading:
cs231n: http://cs231n.github.io/understanding-cnn/
This write-up is also good: https://www.cnblogs.com/alexcai/p/5506806.html
Andrew Ng: https://blog.csdn.net/ice_actor/article/details/78648780
Image recognition basics (MATLAB): https://www.cnblogs.com/kkyyhh96/p/6505473.html
numpy basics
Understanding:
Well illustrated: 【深度学习】卷积神经网络的实现与理解
More theoretical: 深度神经网络结构以及Pre-Training的理解
Examples:
MATLAB license plate recognition
A plain neural network in TensorFlow
Production-grade examples
Concepts
filter (convolution kernel)
pooling (downsampling)
fc (fully connected layer)
bias
backpropagation
loss
optimizer
Momentum (SGD with momentum)
learning rate
relu (activation function)
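Three of the concepts above (filter/convolution, relu, max pooling) can be sketched in plain numpy. This is a naive hand-rolled illustration on a toy 4x4 input, not how TensorFlow implements these ops.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Naive 2-D 'valid' convolution (cross-correlation, as CNNs use)."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Activation function: zero out negative values."""
    return np.maximum(x, 0)

def max_pool2x2(x):
    """2x2 max pooling with stride 2 (trailing odd row/col dropped)."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    return x[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

image = np.arange(16, dtype=float).reshape(4, 4)  # toy 4x4 input
kernel = np.array([[1., 0.], [0., 1.]])           # toy 2x2 filter
feature_map = relu(conv2d_valid(image, kernel))   # conv -> relu, shape (3, 3)
pooled = max_pool2x2(feature_map)                 # -> shape (1, 1), value 15.0
```

Stacking such conv/relu/pool stages, then flattening into fully connected layers, is exactly the pipeline the model below builds with slim.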
TensorFlow architecture
One diagram to understand how a neural network model is structured in TensorFlow:
CNN implementation: (conv-pool)*n -> dropout -> flatten -> (fully connected)*n -> outputs
I think of the forward pass of a neural network as, at its core, repeated matrix multiplication (plus nonlinearities).
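That reading of the forward pass can be made concrete: a single fully connected layer is just y = relu(W @ x + b). The shapes below are toy assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(4)        # input vector with 4 features
W = rng.standard_normal((3, 4))   # weights of a 4 -> 3 fully connected layer
b = np.zeros(3)                   # bias

# Forward pass: matrix multiply, add bias, apply relu.
y = np.maximum(W @ x + b, 0)
print(y.shape)  # (3,)
```

A deeper network simply chains such steps, feeding each layer's output into the next.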
import tensorflow as tf

slim = tf.contrib.slim


class Model(object):
    def __init__(self, is_training, num_classes):
        self._num_classes = num_classes
        self._is_training = is_training

    @property
    def num_classes(self):
        return self._num_classes

    def preprocess(self, inputs):
        # Normalize pixel values from [0, 255] into [-1, 1]. On a small
        # dataset, augmentation (rotate, crop, border expansion) during
        # preprocessing would help, but it was skipped here for lack of time.
        preprocessed_inputs = tf.to_float(inputs)
        preprocessed_inputs = tf.subtract(preprocessed_inputs, 128.0)
        preprocessed_inputs = tf.div(preprocessed_inputs, 128.0)
        return preprocessed_inputs

    def predict(self, preprocessed_inputs):
        # Build the network with slim: 3 conv blocks and 2 fc layers.
        # logits has shape [batch, num_classes] and is reused later to
        # train the SVM.
        with slim.arg_scope([slim.conv2d, slim.fully_connected],
                            activation_fn=tf.nn.relu):
            net = preprocessed_inputs
            net = slim.repeat(net, 2, slim.conv2d, 32, [3, 3], scope='conv1')
            net = slim.max_pool2d(net, [2, 2], scope='pool1')
            net = slim.repeat(net, 2, slim.conv2d, 64, [3, 3], scope='conv2')
            net = slim.max_pool2d(net, [2, 2], scope='pool2')
            net = slim.repeat(net, 2, slim.conv2d, 128, [3, 3], scope='conv3')
            net = slim.flatten(net, scope='flatten')
            net = slim.dropout(net, keep_prob=0.5,
                               is_training=self._is_training)
            net = slim.fully_connected(net, 512, scope='fc1')
            net = slim.fully_connected(net, self.num_classes,
                                       activation_fn=None, scope='fc2')
            # net = slim.resnet_v1.predict()
        prediction_dict = {'logits': net}
        return prediction_dict

    def postprocess_logits(self, prediction_dict):
        # Return the raw logits; they serve later as input features for
        # the SVM.
        logits = prediction_dict['logits']
        logits = tf.cast(logits, dtype=tf.float32)
        logits_dict = {'logits': logits}
        return logits_dict

    def postprocess_classes(self, prediction_dict):
        # Softmax over the logits, then argmax to get the predicted class.
        logits = prediction_dict['logits']
        logits = tf.nn.softmax(logits)
        classes = tf.cast(tf.argmax(logits, axis=1), dtype=tf.int32)
        classes_dict = {'classes': classes}
        return classes_dict
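The preprocess and postprocess_classes steps above have direct numpy counterparts, which makes them easy to sanity-check without a TensorFlow session. The pixel values and logits below are toy data; this is a sketch, not the author's code.

```python
import numpy as np

def preprocess(images):
    """Map uint8 pixels in [0, 255] to [-1, 1], as in Model.preprocess."""
    return (images.astype(np.float32) - 128.0) / 128.0

def softmax(logits):
    """Numerically stable softmax along the class axis."""
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

images = np.array([[0, 128, 255]], dtype=np.uint8)
pre = preprocess(images)              # [[-1., 0., 0.9921875]]

logits = np.array([[2.0, 0.5, 0.1]])  # toy logits for one 3-class example
# Softmax then argmax, mirroring Model.postprocess_classes.
classes = np.argmax(softmax(logits), axis=1)
print(classes)  # [0]
```

Note that argmax of the softmax equals argmax of the raw logits, since softmax is monotonic; the softmax is only needed when calibrated probabilities are wanted.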