Chapter 4: Neural Network Learning, Section 5
Implementing the Learning Algorithm
Stochastic gradient descent (SGD) simply means that the data used to compute each gradient step is a randomly drawn mini-batch.
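In update form, each parameter matrix W is nudged against its gradient on the sampled batch, with learning rate η:

    W ← W − η ∂L/∂W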
A class for the 2-layer neural network
Python implementation:
import os, sys
sys.path.append(os.pardir)
import numpy as np
from ch01.activeFunction import *
from numerical_diff import numerical_gradient
from ch03.mini_batch import *

class TwoLayerNet:
    def __init__(self, input_size, hidden_size, output_size, weight_init_std=0.01):
        # Initialize weights with small Gaussian noise, biases with zeros
        self.params = {}
        self.params['W1'] = weight_init_std * np.random.randn(input_size, hidden_size)
        self.params['b1'] = np.zeros(hidden_size)
        self.params['W2'] = weight_init_std * np.random.randn(hidden_size, output_size)
        self.params['b2'] = np.zeros(output_size)

    def predict(self, x):
        # Forward pass: affine -> sigmoid -> affine -> softmax
        W1, W2 = self.params['W1'], self.params['W2']
        b1, b2 = self.params['b1'], self.params['b2']
        a1 = np.dot(x, W1) + b1
        z1 = sigmoid(a1)
        a2 = np.dot(z1, W2) + b2
        y = softmax(a2)
        return y

    def loss(self, x, t):
        # Cross-entropy loss of the prediction against the one-hot labels t
        y = self.predict(x)
        return cross_entropy_error(y, t)

    def accuracy(self, x, t):
        # Fraction of samples whose predicted class matches the label
        y = self.predict(x)
        y = np.argmax(y, axis=1)
        t = np.argmax(t, axis=1)
        accuracy = np.sum(y == t) / float(x.shape[0])
        return accuracy

    def numerical_gradient(self, x, t):
        # loss_W treats the loss as a function of the weights; the
        # numerical_gradient called below is the module-level function
        # imported from numerical_diff, not this method
        loss_W = lambda W: self.loss(x, t)
        grads = {}
        grads['W1'] = numerical_gradient(loss_W, self.params['W1'])
        grads['b1'] = numerical_gradient(loss_W, self.params['b1'])
        grads['W2'] = numerical_gradient(loss_W, self.params['W2'])
        grads['b2'] = numerical_gradient(loss_W, self.params['b2'])
        return grads
Walking through the two-layer network: first, __init__ takes the number of neurons in the input layer, hidden layer, and output layer and initializes the weights; second, predict computes the output signal by forward propagation; third, accuracy measures how often the predicted class matches the label; fourth, numerical_gradient computes the gradient of the loss with respect to each parameter, using the imported helper sketched below.
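The class delegates to the imported numerical_gradient(f, x), whose body isn't shown in these notes. A minimal sketch of the standard central-difference routine it presumably implements, assuming f is a scalar loss function and x a NumPy array:

import numpy as np

def numerical_gradient(f, x):
    # Central difference (f(x+h) - f(x-h)) / (2h) for each element of x
    h = 1e-4
    grad = np.zeros_like(x)
    it = np.nditer(x, flags=['multi_index'], op_flags=['readwrite'])
    while not it.finished:
        idx = it.multi_index
        tmp = x[idx]
        x[idx] = tmp + h
        fxh1 = f(x)                      # f(x + h)
        x[idx] = tmp - h
        fxh2 = f(x)                      # f(x - h)
        grad[idx] = (fxh1 - fxh2) / (2 * h)
        x[idx] = tmp                     # restore the original value
        it.iternext()
    return grad

Note that it mutates x in place and restores it; this is why passing self.params['W1'] directly works even though loss_W ignores its argument.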
Mini-batch training implementation
import numpy as np
import os, sys
sys.path.append(os.pardir)               # so dataset.mnist can be found
sys.path.append(r'H:\pythonfile\ch04')   # directory containing two_layer_net.py, not the file itself
from dataset.mnist import load_mnist
from two_layer_net import TwoLayerNet

(x_train, t_train), (x_test, t_test) = load_mnist(normalize=True, one_hot_label=True)

train_loss_list = []

# Hyperparameters (kept tiny here because numerical gradients are very slow)
iters_num = 10
train_size = x_train.shape[0]
batch_size = 5
learning_rate = 0.1

network = TwoLayerNet(input_size=784, hidden_size=50, output_size=10)

for i in range(iters_num):
    # Sample a mini-batch at random
    batch_mask = np.random.choice(train_size, batch_size)
    x_batch = x_train[batch_mask]
    t_batch = t_train[batch_mask]

    # Compute gradients numerically
    grad = network.numerical_gradient(x_batch, t_batch)

    # SGD update: step each parameter against its gradient
    for key in ('W1', 'b1', 'W2', 'b2'):
        network.params[key] -= learning_rate * grad[key]

    # Record the loss on this batch
    loss = network.loss(x_batch, t_batch)
    train_loss_list.append(loss)

print(train_loss_list)
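To see whether the loss is actually falling, plotting train_loss_list is more telling than printing it; a quick sketch, assuming matplotlib is installed:

import matplotlib.pyplot as plt

plt.plot(train_loss_list)   # one loss value recorded per update step
plt.xlabel('iteration')
plt.ylabel('loss')
plt.show()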
I debugged this code myself. At first I carelessly mistook network.numerical_gradient(x_batch, t_batch) for the numerical_gradient(f, x) function and kept wondering why the first argument wasn't a function. Stepping through in the debugger showed that it calls the method on network, and that method in turn calls the numerical_gradient(f, x) function. Debugging really is useful.
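To spell out the two same-named callables (names as in the code above):

# The method: takes data, builds the loss closure, returns a dict of gradients.
grads = network.numerical_gradient(x_batch, t_batch)

# Inside that method, the module-level function does the differentiation:
#   loss_W = lambda W: self.loss(x, t)
#   grads['W1'] = numerical_gradient(loss_W, self.params['W1'])   # (f, x) signature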
The next subsection seems to just optimize this, so I only skimmed it. I want to get to the next chapter, so I'll skip it for now. Keep it up, on to the next chapter!