1. Assignment: Logistic Regression with a Neural Network Mindset
Main idea:
Getting the overall flow straight matters most. It breaks down into the five steps below, and I will write each step in its own Python file.
2. Preprocessing the Dataset
The first step is naturally to preprocess the dataset. Once we have the dataset and the code that loads it (I will put both in the comments section), we can get started. Each original image has shape (64, 64, 3), and the train_set_x_orig array returned by load_dataset has shape (209, 64, 64, 3). Since the model computes z = w^T x + b, each image must be flattened into a column vector, turning the training set into shape (64*64*3, 209) and the test set into (64*64*3, 50). The code below does exactly that; the -1 argument tells reshape to infer that dimension's size (here 64*64*3 = 12288) from the fixed leading dimension of 209.
# Training set
train_set_x_flatten = train_set_x_orig.reshape(train_set_x_orig.shape[0], -1).T
# Test set
test_set_x_flatten = test_set_x_orig.reshape(test_set_x_orig.shape[0], -1).T
Then normalize the training and test sets by dividing by 255, the maximum pixel value, so every feature lies in [0, 1].
train_x = train_set_x_flatten / 255
test_x = test_set_x_flatten / 255
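As a quick sanity check (the shape values below assume the standard cat/non-cat split of 209 training and 50 test images), you can assert the expected shapes after flattening and normalizing:

assert train_set_x_flatten.shape == (64 * 64 * 3, 209)  # (12288, 209)
assert test_set_x_flatten.shape == (64 * 64 * 3, 50)
assert train_x.min() >= 0.0 and train_x.max() <= 1.0    # now in [0, 1]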
3. Writing the Activation and Initialization Functions
First we write the sigmoid activation function, σ(z) = 1 / (1 + e^(-z)). (The loss itself, the cross-entropy, is computed later in propagate; sigmoid is the activation that squashes z into (0, 1).)
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))  # element-wise; maps any real input into (0, 1)
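A quick check that sigmoid behaves as expected (the inputs below are arbitrary illustration values):

print(sigmoid(0))                       # 0.5
print(sigmoid(np.array([-10, 0, 10])))  # ≈ [4.54e-05, 0.5, 0.99995]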
Next, initialize the parameters: w as an all-zeros matrix and b as 0. Here w should have shape (64*64*3, 1), and b is a plain integer or float 0.
def inint_weight_with_zero(dim):
    w = np.zeros(shape=(dim, 1))
    print("Initialized weight shape:", w.shape)
    b = 0
    assert w.shape == (dim, 1)
    assert isinstance(b, float) or isinstance(b, int)
    return w, b
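For example, a toy call with dim = 2 (illustration only) returns a (2, 1) zero matrix and a scalar zero:

w, b = inint_weight_with_zero(2)
# w = [[0.], [0.]], b = 0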
4. Writing the Propagation and Optimization Functions
Now write the function that does the actual computation: one forward pass to get the loss, one backward pass to get the gradients dw and db.
def propagate(w, b, x, y):
    """
    :param w: weights, shape (64*64*3, 1)
    :param b: bias, a scalar
    :param x: training data, shape (64*64*3, 209)
    :param y: labels, shape (1, 209)
    :return:
        grads: {"dw": ..., "db": ...}, the gradients
        loss: the cross-entropy cost
    """
    print("*" * 20)
    m = x.shape[1]
    # Forward propagation
    z = np.dot(w.T, x) + b
    a = sigmoid(z)
    loss = -np.sum((y * np.log(a) + (1 - y) * np.log(1 - a))) / m
    # Backward propagation
    dw = (1 / m) * np.dot(x, (a - y).T)
    db = (1 / m) * np.sum(a - y)
    assert dw.shape == w.shape
    assert db.dtype == float
    loss = np.squeeze(loss)
    assert loss.shape == ()
    grads = {
        "dw": dw,
        "db": db
    }
    return grads, loss
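Before training on the real data, it helps to sanity-check propagate on tiny hand-made inputs (the numbers below are made up purely for illustration):

# Toy check: 2 features, 3 examples (made-up values, not the cat dataset)
w = np.array([[1.0], [2.0]])
b = 2.0
x = np.array([[1.0, 2.0, -1.0],
              [3.0, 4.0, -3.2]])
y = np.array([[1, 0, 1]])
grads, loss = propagate(w, b, x, y)
print(grads["dw"].shape, grads["db"], loss)  # dw is (2, 1); db and loss are scalars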
The fourth step is optimization: run gradient descent for 2000 iterations.
def optimize(w, b, x, y, epochs, learning_rate, print_loss=False):
    """
    :param w: weights, shape (64*64*3, 1)
    :param b: bias, a scalar
    :param x: training data, shape (64*64*3, 209)
    :param y: labels, shape (1, 209)
    :param epochs: number of gradient-descent iterations
    :param learning_rate: step size of each update
    :param print_loss: if True, print the loss every 100 iterations
    :return:
        params: {"w": w, "b": b}
        grads: {"dw": dw, "db": db}
        loss_list: the loss recorded every 100 iterations
    """
    loss_list = []
    for epoch in range(epochs):
        grads, loss = propagate(w, b, x, y)
        dw = grads["dw"]
        db = grads["db"]
        # Gradient-descent update
        w = w - dw * learning_rate
        b = b - db * learning_rate
        if epoch % 100 == 0:
            loss_list.append(loss)
            if print_loss:
                print("epoch = %i, loss = %f." % (epoch, loss))
    params = {
        "w": w,
        "b": b
    }
    grads = {
        "dw": dw,
        "db": db
    }
    return params, grads, loss_list
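Reusing the toy inputs from the propagate check above, a short run should show the loss shrinking (the epoch count and learning rate here are arbitrary):

params, grads, loss_list = optimize(w, b, x, y, epochs=101, learning_rate=0.009, print_loss=True)
print(params["w"], params["b"])  # parameters after 101 update steps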
5. The predict Function and the model Module
The model module:
def model(x_train, y_train, x_test, y_test, epochs=2000, learning_rate=0.005, print_loss=False):
    w, b = inint_weight_with_zero(x_train.shape[0])
    params, grads, loss = optimize(w, b, x_train, y_train, epochs, learning_rate, print_loss)
    w, b = params["w"], params["b"]
    predict_accuracy_train = predict_all(w, b, x_train)
    predict_accuracy_test = predict_all(w, b, x_test)
    print("Train accuracy: {} %".format(100 - np.mean(np.abs(predict_accuracy_train - y_train)) * 100))
    print("Test accuracy: {} %".format(100 - np.mean(np.abs(predict_accuracy_test - y_test)) * 100))
    d = {
        "loss": loss,
        "predict_accuracy_test": predict_accuracy_test,
        "predict_accuracy_train": predict_accuracy_train,
        "w": w,
        "b": b,
        "learning_rate": learning_rate,
        "epochs": epochs
    }
    return d
The predict function:
def predict_all(w, b, x):
    m = x.shape[1]
    predict_list = np.zeros((1, m))
    w = w.reshape(x.shape[0], 1)
    z = np.dot(w.T, x) + b
    a = sigmoid(z)
    print(a.shape)
    for i in range(a.shape[1]):
        # Threshold the predicted probability at 0.5
        predict_list[0, i] = 1 if a[0, i] > 0.5 else 0
    assert predict_list.shape == (1, m)
    return predict_list
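With small hand-picked parameters (the values below are for illustration only), predict_all thresholds the sigmoid output at 0.5:

w = np.array([[0.1124579], [0.23106775]])
b = -0.3
x = np.array([[1.0, -1.1, -3.2],
              [1.2, 2.0, 0.1]])
print(predict_all(w, b, x))  # [[1. 1. 0.]]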
6. Summary
With that, a binary-classification logistic regression is complete. You can call everything from a main function, or write it all in a single file.
import numpy as np
import matplotlib.pyplot as plt
# load_dataset and model come from the files written in the steps above

if __name__ == '__main__':
    train_path = "dataset/train_catvnoncat.h5"
    test_path = "dataset/test_catvnoncat.h5"
    train_set_x_orig, train_set_y, test_set_x_orig, test_set_y, classes = load_dataset(train_path, test_path)
    print(train_set_x_orig[0].shape)
    print(train_set_x_orig.shape)
    print(train_set_y.shape)
    print(test_set_x_orig.shape)
    print(test_set_y.shape)
    train_set_x_flatten = train_set_x_orig.reshape(train_set_x_orig.shape[0], -1).T
    print(train_set_x_flatten.shape)
    test_set_x_flatten = test_set_x_orig.reshape(test_set_x_orig.shape[0], -1).T
    print(test_set_x_flatten.shape)
    # Normalize
    train_x = train_set_x_flatten / 255
    test_x = test_set_x_flatten / 255
    d = model(train_x, train_set_y, test_x, test_set_y, 2000, 0.005, True)
    losses = np.squeeze(d['loss'])
    plt.plot(losses)
    plt.ylabel('loss')
    plt.xlabel('iterations (per hundreds)')
    plt.title("Learning rate =" + str(d["learning_rate"]))
    plt.show()
The results should be identical on every run, since the weights start at zero and nothing in the pipeline is random: about 99% accuracy on the training set and 70% on the test set. Below is the plot of the loss. You can see that by iteration 2000 the loss is already quite small, and training further would not change the result much. If you print the loss at every iteration instead of every 100, the curve looks like one long falling brushstroke, with a jittery, shadowed band in the early stretch; give it a try.
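If you want to see how sensitive training is to the learning rate, one optional experiment (the rates below are chosen for illustration, not part of the write-up above) is to rerun model a few times and overlay the loss curves:

# Optional: compare several learning rates (illustrative values)
for lr in (0.01, 0.005, 0.001):
    d = model(train_x, train_set_y, test_x, test_set_y, 2000, lr, False)
    plt.plot(np.squeeze(d["loss"]), label=str(lr))
plt.ylabel('loss')
plt.xlabel('iterations (per hundreds)')
plt.legend(title="learning rate")
plt.show()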