Hands-On Machine Learning with Scikit-Learn and TensorFlow, Reading Notes, Chapter 10: Introduction to Artificial Neural Networks

xbs118

Article link: https://blog.csdn.net/qq_38262728/article/details/88551453

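Hands-On Machine Learning with Scikit-Learn and TensorFlow
This is an excellent introductory book on machine learning and deep learning, combining explanations of the underlying theory with practical code examples.
I am reading it closely and writing a set of notes for each chapter.
The notes quote some key statements and code from the book; if you spot any mistakes, please point them out.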

Chapter 10: Introduction to Artificial Neural Networks

1 From Biological to Artificial Neurons

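Reasons for the renewed interest in ANNs:

Huge quantities of data are now available
Computing power has increased enormously
Training algorithms have improved
Some theoretical limitations have turned out to be acceptable in practice
A virtuous circle of funding and technical progress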

1.1 Biological Neurons

......

1.2 Logical Computations with Neurons

AND, OR, and NOT gates.
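As a rough illustration of the idea, the sketch below builds AND, OR, and NOT from a single threshold neuron that fires when the weighted sum of its inputs reaches a threshold. The helper names (`neuron`, `logic_and`, `logic_or`, `logic_not`) and the particular weights and thresholds are assumptions made for this example, not taken from the book.

```python
def neuron(inputs, weights, threshold):
    """Fire (return 1) when the weighted input sum reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

def logic_and(a, b):
    # Both inputs must be on to reach a threshold of 2.
    return neuron([a, b], [1, 1], threshold=2)

def logic_or(a, b):
    # A single active input is enough to reach a threshold of 1.
    return neuron([a, b], [1, 1], threshold=1)

def logic_not(a):
    # A negative (inhibitory) weight: the neuron fires only when the input is off.
    return neuron([a], [-1], threshold=0)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "AND:", logic_and(a, b), "OR:", logic_or(a, b))
print("NOT 0:", logic_not(0), "NOT 1:", logic_not(1))
```

Other weight and threshold choices work equally well; what matters is which input combinations push the sum past the threshold.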

1.3 The Perceptron

One of the simplest ANN architectures.

Two step functions are commonly used: Heaviside and sgn.
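The two step functions, and the linear threshold unit (LTU) that applies one of them to a weighted sum, can be sketched in a few lines of plain Python. The names `heaviside_step`, `sgn_step`, and `ltu` are hypothetical, and the example weights are arbitrary; the convention that the Heaviside function returns 1 at exactly zero is an assumption that matches the code later in these notes.

```python
def heaviside_step(z):
    """Heaviside step: 0 for z < 0, 1 for z >= 0."""
    return 1 if z >= 0 else 0

def sgn_step(z):
    """Sign function: -1 for z < 0, 0 for z == 0, +1 for z > 0."""
    return (z > 0) - (z < 0)

def ltu(inputs, weights, bias, step=heaviside_step):
    """Linear threshold unit: a step function applied to w . x + b."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return step(z)

# Weighted sum here is 0.5 * 1.0 + (-1.0) * 2.0 + 1.0 = -0.5
print(ltu([1.0, 2.0], [0.5, -1.0], bias=1.0))                 # Heaviside output
print(ltu([1.0, 2.0], [0.5, -1.0], bias=1.0, step=sgn_step))  # sgn output
```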

A single-layer perceptron cannot solve the XOR classification problem, but a multi-layer perceptron can.

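import numpy as np
import matplotlib.pyplot as plt

def heaviside(z):
    # Step activation: 1 where z >= 0, else 0 (z must be a NumPy array)
    return (z >= 0).astype(z.dtype)

def sigmoid(z):
    # Logistic activation; used below but not defined in the original excerpt
    return 1 / (1 + np.exp(-z))

def mlp_xor(x1, x2, activation=heaviside):
    # Hidden units act like AND (threshold 1.5) and OR (threshold 0.5);
    # the output unit fires for "OR and not AND", which is XOR
    return activation(-activation(x1 + x2 - 1.5) + activation(x1 + x2 - 0.5) - 0.5)

# Evaluate the network on a grid covering the unit square
x1s = np.linspace(-0.2, 1.2, 100)
x2s = np.linspace(-0.2, 1.2, 100)
x1, x2 = np.meshgrid(x1s, x2s)

z1 = mlp_xor(x1, x2, activation=heaviside)
z2 = mlp_xor(x1, x2, activation=sigmoid)

plt.figure(figsize=(10,4))

# Left panel: step activation; markers show the four XOR inputs
plt.subplot(121)
plt.contourf(x1, x2, z1)
plt.plot([0, 1], [0, 1], "gs", markersize=20)
plt.plot([0, 1], [1, 0], "y^", markersize=20)
plt.title("Activation function: heaviside", fontsize=14)
plt.grid(True)

# Right panel: same network with a sigmoid activation
plt.subplot(122)
plt.contourf(x1, x2, z2)
plt.plot([0, 1], [0, 1], "gs", markersize=20)
plt.plot([0, 1], [1, 0], "y^", markersize=20)
plt.title("Activation function: sigmoid", fontsize=14)
plt.grid(True)
plt.show()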
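To check the network by hand rather than by reading the contour plot, the scalar sketch below evaluates the same formula as `mlp_xor` at the four corner inputs. `heaviside_scalar` and `mlp_xor_scalar` are hypothetical pure-Python stand-ins for the NumPy versions above, introduced only for this check.

```python
def heaviside_scalar(z):
    """Scalar step activation: 1.0 for z >= 0, else 0.0."""
    return 1.0 if z >= 0 else 0.0

def mlp_xor_scalar(x1, x2, activation=heaviside_scalar):
    h_and = activation(x1 + x2 - 1.5)  # fires only when both inputs are 1
    h_or = activation(x1 + x2 - 0.5)   # fires when at least one input is 1
    return activation(-h_and + h_or - 0.5)

# Evaluate all four input combinations
truth_table = {(a, b): mlp_xor_scalar(a, b) for a in (0, 1) for b in (0, 1)}
for inputs, output in truth_table.items():
    print(inputs, "->", output)
```

The output unit fires only when the OR unit is on and the AND unit is off, which is exactly the XOR condition.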

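1.4 Multi-Layer Perceptron and Backpropagation

An MLP consists of one input layer, one or more layers of LTUs called hidden layers, and a final layer of LTUs called the output layer.

Every layer except the output layer includes a bias neuron and is fully connected to the next layer.

An ANN with two or more hidden layers is called a deep neural network (DNN).

Backpropagation: omitted.

The step function is replaced with the logistic function so that gradients can be computed in the backward pass: the step function is flat everywhere except at the jump, so it gives gradient descent nothing to work with, whereas the logistic function has a well-defined nonzero derivative everywhere.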
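A small numerical sketch makes the difference concrete. The logistic function σ(z) = 1 / (1 + e^(-z)) has the derivative σ(z)(1 - σ(z)), which the code compares against a finite-difference estimate; the same estimate for a step function is zero away from the jump. All function names here are hypothetical helpers written for this illustration.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def logistic_grad(z):
    """Analytic derivative: sigma(z) * (1 - sigma(z))."""
    s = logistic(z)
    return s * (1.0 - s)

def step(z):
    return 1.0 if z >= 0 else 0.0

def numeric_grad(f, z, eps=1e-6):
    """Central finite-difference estimate of df/dz."""
    return (f(z + eps) - f(z - eps)) / (2 * eps)

for z in (-2.0, 0.5, 3.0):
    print(z, logistic_grad(z), numeric_grad(logistic, z), numeric_grad(step, z))
```

At each of these points the logistic gradient is positive, while the step function's estimated gradient is 0.0, so no error signal could flow back through it.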

2 Training an MLP with TensorFlow's High-Level API

Omitted.

3 Training a DNN Using Plain TensorFlow

3.1 Construction Phase

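# TensorFlow 1.x API (under TensorFlow 2.x the same names are available
# through tf.compat.v1 with v2 behavior disabled)
import numpy as np
import tensorflow as tf

def neuron_layer(X, n_neurons, name, activation=None):
    with tf.name_scope(name):
        n_inputs = int(X.get_shape()[1])
        # Truncated-normal initial weights, scaled by the number of inputs
        stddev = 2 / np.sqrt(n_inputs)
        init = tf.truncated_normal((n_inputs, n_neurons), stddev=stddev)
        W = tf.Variable(init, name="kernel")
        b = tf.Variable(tf.zeros([n_neurons]), name="bias")
        Z = tf.matmul(X, W) + b  # weighted sum plus bias
        if activation is not None:
            return activation(Z)
        else:
            return Z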

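# Layer sizes and placeholders are not shown in the original excerpt;
# the values below are assumptions (28x28 input images, 10 classes)
n_inputs, n_hidden1, n_hidden2, n_outputs = 28 * 28, 300, 100, 10
X = tf.placeholder(tf.float32, shape=(None, n_inputs), name="X")
y = tf.placeholder(tf.int32, shape=(None,), name="y")

with tf.name_scope("dnn"):
    hidden1 = neuron_layer(X, n_hidden1, name="hidden1",
                           activation=tf.nn.relu)
    hidden2 = neuron_layer(hidden1, n_hidden2, name="hidden2",
                           activation=tf.nn.relu)
    # No activation on the output layer: the loss applies softmax itself
    logits = neuron_layer(hidden2, n_outputs, name="outputs")

with tf.name_scope("loss"):
    xentropy = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y,
                                                              logits=logits)
    loss = tf.reduce_mean(xentropy, name="loss")

# The execution phase also needs the ops below, which the excerpt omits
# (assumed here: plain gradient descent, illustrative learning rate 0.01)
with tf.name_scope("train"):
    training_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

with tf.name_scope("eval"):
    correct = tf.nn.in_top_k(logits, y, 1)
    accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))

init = tf.global_variables_initializer()
saver = tf.train.Saver()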
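To make the loss less opaque, here is a plain-Python sketch of what a sparse softmax cross-entropy computes for each example: the negative log of the softmax probability assigned to the true class, with the label given as an integer class index. The function name and the toy logits are hypothetical; this illustrates the formula and is not TensorFlow's implementation.

```python
import math

def sparse_softmax_cross_entropy(logits, label):
    """Cross-entropy for one example: -log(softmax(logits)[label])."""
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(v - max_logit) for v in logits))
    return log_sum_exp - logits[label]

batch_logits = [[2.0, 1.0, 0.1],   # fairly confident in class 0
                [0.0, 0.0, 0.0]]   # uniform over three classes
batch_labels = [0, 2]

losses = [sparse_softmax_cross_entropy(l, t)
          for l, t in zip(batch_logits, batch_labels)]
mean_loss = sum(losses) / len(losses)  # the role tf.reduce_mean plays above
print(losses, mean_loss)
```

The second example has uniform logits, so its loss equals log(3); the first is lower because the true class already receives the largest logit.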

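def shuffle_batch(X, y, batch_size):
    # Visit the samples in a fresh random order on every call (one epoch)
    rnd_idx = np.random.permutation(len(X))
    # At least one batch, even if the dataset is smaller than batch_size
    n_batches = max(1, len(X) // batch_size)
    for batch_idx in np.array_split(rnd_idx, n_batches):
        X_batch, y_batch = X[batch_idx], y[batch_idx]
        yield X_batch, y_batch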
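A quick usage sketch on toy data shows how `shuffle_batch` behaves. The function is repeated so the block runs on its own, and the arrays `X_toy` and `y_toy` are made up for the illustration. Note that `np.array_split` produces nearly equal parts, so some batches can be slightly larger than `batch_size`.

```python
import numpy as np

def shuffle_batch(X, y, batch_size):
    rnd_idx = np.random.permutation(len(X))
    n_batches = len(X) // batch_size
    for batch_idx in np.array_split(rnd_idx, n_batches):
        X_batch, y_batch = X[batch_idx], y[batch_idx]
        yield X_batch, y_batch

# Toy data: row i of X_toy is [2*i, 2*i + 1] and its label is i
X_toy = np.arange(20).reshape(10, 2)
y_toy = np.arange(10)

batches = list(shuffle_batch(X_toy, y_toy, batch_size=3))
sizes = [len(y_b) for _, y_b in batches]
seen = sorted(int(v) for _, y_b in batches for v in y_b)

print("batch sizes:", sizes)   # 10 samples split into 10 // 3 = 3 batches
print("labels seen:", seen)    # every sample appears exactly once per epoch
```

Features and labels stay aligned because both are indexed with the same shuffled index array.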

3.2 Execution Phase

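n_epochs = 40    # illustrative values; the original notes do not state them
batch_size = 50

# X_train, y_train, X_valid, y_valid are assumed to be NumPy arrays loaded beforehand
with tf.Session() as sess:
    init.run()  # initialize all variables
    for epoch in range(n_epochs):
        for X_batch, y_batch in shuffle_batch(X_train, y_train, batch_size):
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
        # Accuracy on the last mini-batch of the epoch and on the validation set
        acc_batch = accuracy.eval(feed_dict={X: X_batch, y: y_batch})
        acc_val = accuracy.eval(feed_dict={X: X_valid, y: y_valid})
        print(epoch, "Batch accuracy:", acc_batch, "Val accuracy:", acc_val)

    # Save the trained parameters to a checkpoint
    save_path = saver.save(sess, "./my_model_final.ckpt")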