Theano入门——神经网络

最新推荐文章于 2019-05-15 08:53:32 发布

灰巧克力爱松露

最新推荐文章于 2019-05-15 08:53:32 发布

阅读量1.6k

点赞数

分类专栏： Deep Learning 文章标签： Theano 神经网络 CIFAR-10

本文链接：https://blog.csdn.net/shadow_guo/article/details/49030109

版权

Deep Learning 专栏收录该内容

24 篇文章 1 订阅

订阅专栏

Theano入门——神经网络

1.神经网络介绍

参考链接（1）。

2.参数设置

（1）模型分层
（2）神经网络每层权重维度
（3）块大小
（4）学习率和动量学习率

（5）训练次数

3.代码实现

（1）权重初始化
w_h1，b_h1为神经网络第1层（输入层）的权重和偏置，输入层的输入节点有32*32个，输入层的输出节点有100个。
w_h2，b_h2为神经网络第2层（隐含层）的权重和偏置，隐含层的输入节点有100个，隐含层的输出节点有100个。
w_o，b_o为神经网络第3层（输出层）的权重和偏置，输出层的输入节点有100个，输出层的输出节点有10个。
输入层的输入节点个数对应数据集每个图像样本的长度（1幅图像中像素的行数*像素的列数），输出层的输出节点个数对应数据集的标签可能出现的值的总数。
对于单隐含层的神经网络，输入层的输出连接隐含层的输入，隐含层的输出连接输出层的输入。这里的连接是指神经元的连接，神经元间的连接上有权重，偏置也是权重的一种。
神经网络通过调整神经元间的连接权重来得到所需的输出结果。
params把上述权重以列表的形式存起来。

（2）model函数

该函数负责神经网络的前向计算，即已知输入层的输入来计算输出层的输出。
x为输入层的输入，维度为（图像样本数目，图像长度32*32），w_h1的维度为（32*32，100），相乘后得到的维度为（图像样本数目，100）。
诡异的地方是h1 = T.maximum(0, T.dot(x, w_h1) + b_h1)。T.dot(x, w_h1)的维度为（图像样本数目，100），b_h1的维度为（1，100），它俩相加矩阵维度不相等；即使相加成功，得到的结果也是个矩阵，矩阵和标量0比较也不在相同维度上。
h1的维度为（图像样本数目，100），w_h2的维度为（100，100），h2的维度为（图像样本数目，100）。

h2的维度为（图像样本数目，100），w_o的维度为（100，10），p_y_given_x的维度为（图像样本数目，10），输出为模型预测的各个位的概率。

P.S. 维度的结果为推导的结果，如有错误请指出~

（3）模型预测

概率最大位的索引为y。y的维度为（图像样本数，1）。

（4）权重更新

损失cost的计算见前篇“Logistic回归”部分。权重更新用momentum函数。

（5）momentum函数

该函数有4个输入：损失cost，参数params，学习率learning_rate，动量momentum。其中参数包括所有的权重和偏置。
动量和学习率为都为固定值。动量参数mparam_i由动量*动量参数-学习率*损失梯度来更新。参数由参数+动量参数来更新。

训练和测试的上层过程不赘述了。

import theano
import theano.tensor as T
import numpy as np

import load_cifar

# 加载数据
x_train, t_train, x_test, t_test = load_cifar.cifar10(dtype=theano.config.floatX)
labels_test = np.argmax(t_test, axis=1)

# 定义标识性Theano变量
x = T.matrix()
t = T.matrix()

# 定义模型：神经网络
def floatX(x):
    return np.asarray(x, dtype=theano.config.floatX)

def init_weights(shape):
    return theano.shared(floatX(np.random.randn(*shape) * 0.1))

def momentum(cost, params, learning_rate, momentum):
    grads = theano.grad(cost, params)
    updates = []
    
    for p, g in zip(params, grads):
        mparam_i = theano.shared(np.zeros(p.get_value().shape, dtype=theano.config.floatX))
        v = momentum * mparam_i - learning_rate * g
        updates.append((mparam_i, v))
        updates.append((p, p + v))

    return updates

def model(x, w_h1, b_h1, w_h2, b_h2, w_o, b_o):
    h1 = T.maximum(0, T.dot(x, w_h1) + b_h1)
    h2 = T.maximum(0, T.dot(h1, w_h2) + b_h2)
    p_y_given_x = T.nnet.softmax(T.dot(h2, w_o) + b_o)
    return p_y_given_x

w_h1 = init_weights((32 * 32, 100))
b_h1 = init_weights((100,))
w_h2 = init_weights((100, 100))
b_h2 = init_weights((100,))
w_o = init_weights((100, 10))
b_o = init_weights((10,))

params = [w_h1, b_h1, w_h2, b_h2, w_o, b_o]

p_y_given_x = model(x, *params)
y = T.argmax(p_y_given_x, axis=1)

cost = T.mean(T.nnet.categorical_crossentropy(p_y_given_x, t))

updates = momentum(cost, params, learning_rate=0.001, momentum=0.9)


# 编译theano函数
train = theano.function([x, t], cost, updates=updates)
predict = theano.function([x], y)

# train model
batch_size = 50

for i in range(50):
    for start in range(0, len(x_train), batch_size):
        x_batch = x_train[start:start + batch_size]
        t_batch = t_train[start:start + batch_size]
        cost = train(x_batch, t_batch)

    predictions_test = predict(x_test)
    accuracy = np.mean(predictions_test == labels_test)
    print "iteration %d - accuracy: %.5f" % (i + 1, accuracy)

4.实验结果

5.参考链接

（1）神经网络介绍：略

灰巧克力爱松露

关注

0
点赞
踩
1

收藏

觉得还不错? 一键收藏
0
评论
Theano入门——神经网络

Theano入门——神经网络1.神经网络介绍参考链接（1）。2.参数设置（1）模型分层（2）神经网络每层权重维度（3）块大小（4）学习率和动量学习率（5）训练次数3.代码实现（1）权重初始化w_h1，b_h1为神经网络第1层（输入层）的权重和偏置，输入层的输入节点有32*32个，输入层的输出节点有100个。w_h2，b_h2为
复制链接

扫一扫