2018-08-09 Advanced Topics

I. Classification Learning

  1. Prepare the MNIST data
      Downloading the MNIST data may require a proxy, so it can be downloaded ahead of time and the data directory specified in the code.
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
# mnist = input_data.read_data_sets('<data dir>', one_hot=True)
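
  As a quick sanity check that the data loaded correctly (a small standalone sketch, assuming the download succeeded), the shapes reported by the loader should be:

print(mnist.train.images.shape)  # (55000, 784): flattened 28x28 images; 5,000 images are held out for validation
print(mnist.train.labels.shape)  # (55000, 10): one-hot labels because one_hot=True
print(mnist.test.images.shape)   # (10000, 784)
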
  2. Build the network
# the images are the input, each of size 28×28 = 784
xs = tf.placeholder(tf.float32, [None, 784])
# each image represents one digit, 0-9, i.e. 10 classes
ys = tf.placeholder(tf.float32, [None, 10])
# output layer: produces a 10-element vector per image
# softmax is commonly used for classification problems
prediction = tf.layers.dense(xs, 10, tf.nn.softmax)
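
  For reference, tf.layers.dense(xs, 10, tf.nn.softmax) builds roughly the following layer by hand (a sketch of the idea only; the real layer uses its own default weight initializer, so this is not byte-for-byte identical):

# hand-written equivalent of the dense softmax output layer (illustrative sketch)
W = tf.Variable(tf.random_normal([784, 10]))  # weights: 784 pixels -> 10 classes
b = tf.Variable(tf.zeros([10]))               # one bias per class
prediction_manual = tf.nn.softmax(tf.matmul(xs, W) + b)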

 

(Figure: network structure)

  The loss function is cross entropy (see the referenced article for background on cross entropy). Cross entropy measures how similar the prediction is to the true label; if the two are identical, the cross entropy is zero.
  In addition, define compute_accuracy to evaluate the accuracy on the test set.
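
  As a quick numeric illustration (a standalone sketch, separate from the training code below; the label and prediction vectors are made up), the closer the predicted distribution is to the one-hot label, the smaller the cross entropy:

import numpy as np

label = np.array([0., 0., 1.])             # hypothetical one-hot label for class 2
confident = np.array([0.01, 0.01, 0.98])   # prediction close to the label
uncertain = np.array([0.4, 0.3, 0.3])      # prediction spread over all classes

def cross_entropy(y, p):
    # -sum(y * log(p)); for a one-hot y only the true class contributes
    return -np.sum(y * np.log(p))

print(cross_entropy(label, confident))  # ~0.02, close to zero
print(cross_entropy(label, uncertain))  # ~1.20, much larger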

 

# loss: cross entropy averaged over the batch (summed over the 10 classes of each example)
cross_entropy = tf.reduce_mean(-tf.reduce_sum(ys * tf.log(prediction), reduction_indices=[1]))

# train operation
train_op = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
# compute accuracy: fraction of samples whose predicted class (argmax) matches the label
def compute_accuracy(v_xs, v_ys):
    global prediction
    global sess
    y_pre = sess.run(prediction, feed_dict={xs: v_xs})
    correct_prediction = tf.equal(tf.argmax(y_pre, 1), tf.argmax(v_ys, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    result = sess.run(accuracy, feed_dict={xs: v_xs, ys: v_ys})
    return result

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # train
    for step in range(1000):
        batch_xs, batch_ys = mnist.train.next_batch(100)
        sess.run(train_op, feed_dict={xs: batch_xs, ys: batch_ys})
        if step % 50 == 0:
            print(compute_accuracy(mnist.test.images, mnist.test.labels))

Reference: https://morvanzhou.github.io/tutorials/machine-learning/tensorflow/5-01-classifier/

II. Dropout to Solve Overfitting

  Use the handwritten digits dataset that ships with sklearn (from sklearn.datasets import load_digits). During preprocessing the labels have to be binarized, i.e. turned into one-hot vectors of 0s and 1s, which is what LabelBinarizer does; a small demonstration follows the loading code below.

import tensorflow as tf
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split # split train set and test set
from sklearn.preprocessing import LabelBinarizer # convert label to binary 0,1
# load data
digits = load_digits()
X = digits.data
y = digits.target
y = LabelBinarizer().fit_transform(y) # learn the set of label classes, then one-hot encode the labels
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
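
  For clarity, here is what LabelBinarizer does to the digit labels (a minimal standalone sketch, separate from the script above):

from sklearn.datasets import load_digits
from sklearn.preprocessing import LabelBinarizer

digits = load_digits()
print(digits.target[:3])                               # e.g. [0 1 2]
print(LabelBinarizer().fit_transform(digits.target)[:3])
# each row becomes a 10-element one-hot vector, e.g. digit 0 -> [1 0 0 0 0 0 0 0 0 0]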

  Next, define the variables we need. tf_is_training controls whether the network is in training or testing mode.

# define inputs
keep_prob = tf.placeholder(tf.float32)          # not used below: dropout is switched by tf_is_training instead
xs = tf.placeholder(tf.float32, [None, 64])     # each image is 8 * 8 = 64 pixels
ys = tf.placeholder(tf.float32, [None, 10])     # 10 label classes
tf_is_training = tf.placeholder(tf.bool, None)  # to control dropout when training and testing

  Define two networks, one without dropout and one with dropout. (Note: dropout only needs to be applied to the hidden layer.) A short sketch of how the dropout layer behaves at training vs. test time follows the code below.

# normal network: one hidden layer, no dropout
h1 = tf.layers.dense(xs, 50, tf.nn.softmax)
output = tf.layers.dense(h1, 10)
# dropout network: same hidden layer, followed by a dropout layer
dh1 = tf.layers.dense(xs, 50, tf.nn.softmax)
dh1 = tf.layers.dropout(dh1, rate=0.2, training=tf_is_training)  # drop 20% of hidden units, only while training
doutput = tf.layers.dense(dh1, 10)
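
  To make the tf_is_training flag concrete, here is a minimal standalone sketch (separate from the script above) showing that tf.layers.dropout zeroes some units and rescales the rest while training, and passes the input through untouched at test time:

import numpy as np
import tensorflow as tf

x = tf.constant(np.ones((1, 4), dtype=np.float32))
is_training = tf.placeholder(tf.bool, None)
dropped = tf.layers.dropout(x, rate=0.5, training=is_training)

with tf.Session() as sess:
    # training mode: each unit is dropped with probability 0.5, survivors are scaled by 1/(1-rate) = 2
    print(sess.run(dropped, feed_dict={is_training: True}))   # e.g. [[2. 0. 2. 0.]] (the pattern is random)
    # test mode: dropout is a no-op
    print(sess.run(dropped, feed_dict={is_training: False}))  # [[1. 1. 1. 1.]]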

  Use cross entropy as the loss. tf.losses.softmax_cross_entropy takes the one-hot labels and the raw logits (which is why the output layers above have no activation).

# loss
loss = tf.losses.softmax_cross_entropy(ys, output)
tf.summary.scalar('loss', loss)
# dropout loss
dloss = tf.losses.softmax_cross_entropy(ys, doutput)
tf.summary.scalar('dloss', dloss)

  Use the gradient descent optimizer (learning rate 0.1) for both networks.

# train operation
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)
dtrain_op = tf.train.GradientDescentOptimizer(0.1).minimize(dloss)

  Train for 200 steps, recording loss and dloss every 10 steps.

# session area
with tf.Session() as sess:
    # tensorboard
    merge_op = tf.summary.merge_all()
    test_writer = tf.summary.FileWriter('logs/test', sess.graph)
    sess.run(tf.global_variables_initializer())
    # train
    for step in range(200):
        sess.run([train_op, dtrain_op], feed_dict={xs: X_train, ys: y_train, tf_is_training: True})
        # get result
        if step % 10 == 0:
            test_result = sess.run(merge_op, feed_dict={xs: X_test, ys: y_test, tf_is_training: False})
            test_writer.add_summary(test_result, step)
            rloss, rdloss = sess.run([loss, dloss], feed_dict={xs: X_test, ys: y_test, tf_is_training: False})
            print(rloss, rdloss)
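
  To view the recorded curves, run TensorBoard on the log directory written above (tensorboard --logdir logs) and open the local URL it prints; the loss and dloss scalars registered with tf.summary.scalar show up as two separate charts.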

(Figure: TensorBoard output; left: loss, right: dloss)

Reference: https://morvanzhou.github.io/tutorials/machine-learning/tensorflow/5-02-dropout/
