A small test to check that the functions we wrote are correct.
First, the initialization:
import numpy as np

input_size = 4
hidden_size = 10
num_classes = 3
num_inputs = 5

def init_toy_model():
    # Fixed seed so the toy weights are reproducible.
    np.random.seed(0)
    return TwoLayerNet(input_size, hidden_size, num_classes, std=1e-1)  # TwoLayerNet is defined below

def init_toy_data():
    # Fixed seed so the toy inputs are reproducible.
    np.random.seed(1)
    X = 10 * np.random.randn(num_inputs, input_size)
    y = np.array([0, 1, 2, 2, 1])
    return X, y

net = init_toy_model()
X, y = init_toy_data()
print(X.shape, y.shape)
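Since both seeds are fixed, this snippet is fully reproducible; the final print should output the line below, matching the shapes discussed in the next section:

(5, 4) (5,)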
Initialization
class TwoLayerNet(object):
    def __init__(self, input_size, hidden_size, output_size, std=1e-4):
        """
        Initialize the model. Weights are initialized to small random values and
        biases are initialized to zero. Weights and biases are stored in the
        variable self.params, which is a dictionary with the following keys:

        W1: First layer weights; has shape (D, H)
        b1: First layer biases; has shape (H,)
        W2: Second layer weights; has shape (H, C)
        b2: Second layer biases; has shape (C,)

        Inputs:
        - input_size: The dimension D of the input data.
        - hidden_size: The number of neurons H in the hidden layer.
        - output_size: The number of classes C.
        """
        self.params = {}
        self.params['W1'] = std * np.random.randn(input_size, hidden_size)
        self.params['b1'] = np.zeros(hidden_size)
        self.params['W2'] = std * np.random.randn(hidden_size, output_size)
        self.params['b2'] = np.zeros(output_size)
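To confirm what __init__ produces, here is a quick sketch that prints the shape of every parameter, reusing the toy net created above:

for name, param in sorted(net.params.items()):
    print(name, param.shape)  # prints W1 (4, 10), W2 (10, 3), b1 (10,), b2 (3,)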
About W1's dimensions: each of the input features gets its own weight for every hidden neuron, so multiplying by W1 turns each sample into hidden_size scores. These scores then go through the activation function, with b1 acting as a bias, something like a threshold for that comparison (my own take), so some scores end up not contributing. The activated scores are then multiplied by W2, whose output dimension is output_size; in other words, the network outputs one score per class, and these are the final class scores. That is roughly what initializing these four parameters means.
The shapes involved:
X.shape = (5, 4)
y.shape = (5,)
net.params['W1'].shape = (4, 10)
net.params['b1'].shape = (10,)
net.params['W2'].shape = (10, 3)
net.params['b2'].shape = (3,)
Once you know these dimension relationships, it is clear that the product is X*W rather than W*X; work this out from the actual shapes when you write the code instead of memorizing it, as the sketch below verifies.
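As a sanity check, here is a minimal forward-pass sketch on the toy data, tracing the shapes through both layers (ReLU is assumed as the activation; this section has not named the nonlinearity yet):

hidden = np.maximum(0, X.dot(net.params['W1']) + net.params['b1'])  # (5, 4) x (4, 10) -> (5, 10)
scores = hidden.dot(net.params['W2']) + net.params['b2']            # (5, 10) x (10, 3) -> (5, 3)
print(scores.shape)  # (5, 3): one score per class for each of the 5 toy samples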
Computing the loss and gradients
def loss(self, X, y=None, reg=0.0):
    """
    Compute the loss and gradients for a two-layer fully connected neural network.