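Problem Description
Given the following two-dimensional features, where each sample is either a positive sample (red) or a negative sample (blue), implement a binary classification model.
Single-Layer Neural Network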
$$
\left[\begin{matrix} w_1 & w_2 \end{matrix}\right] \left[\begin{matrix} x_1 \\ x_2 \end{matrix}\right] + \left[\begin{matrix} b \end{matrix}\right] = \left[\begin{matrix} y \end{matrix}\right]
$$
import torch.nn as nn


class LogisticRegression(nn.Module):
    def __init__(self):
        super(LogisticRegression, self).__init__()
        self.lr = nn.Linear(2, 1)  # y = w1*x1 + w2*x2 + b
        self.sm = nn.Sigmoid()     # squashes y into a probability in (0, 1)

    def forward(self, x):
        out = self.lr(x)
        out = self.sm(out)
        return out
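To make the forward pass concrete, the following plain-Python sketch mirrors what LogisticRegression.forward computes for a single sample: the linear combination from the equation above, followed by a sigmoid. The parameter values and the helper name logistic_forward are hypothetical and chosen only for illustration; a trained model would hold learned values instead.

```python
import math


def sigmoid(t):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-t))


def logistic_forward(x1, x2, w1, w2, b):
    """Mirror of nn.Linear(2, 1) followed by nn.Sigmoid for one sample."""
    y = w1 * x1 + w2 * x2 + b  # linear layer output
    return sigmoid(y)          # probability of the positive class


# Hypothetical parameters, not taken from any trained model.
w1, w2, b = 0.5, -0.25, 1.0

# Linear output for (x1, x2) = (2, 4): 1.0 - 1.0 + 1.0 = 1.0
p = logistic_forward(2.0, 4.0, w1, w2, b)
label = 1 if p >= 0.5 else 0  # threshold the probability at 0.5
```

The sample is classified as positive exactly when the linear output is non-negative, which is why the line where the linear output equals zero is of interest when visualizing the model.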
To inspect how well the model has trained, we need to visualize the Linear layer:
$$
w_1 x_1 + w_2 x_2 + b = y
$$
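Here $y$ is the output of the Linear layer before the sigmoid. Since the sigmoid of 0 is 0.5, the line $y = 0$ is the decision boundary. When $y = 0$, $x_1$ and $x_2$ are related as follows: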
$$
x_2 = \frac{-w_1 x_1 - b}{w_2}
$$
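As a quick numeric check of this rearrangement, the sketch below picks hypothetical parameter values, computes $x_2$ from $x_1$ with the formula, and confirms that each resulting point satisfies $y = 0$, where the sigmoid output is 0.5. The values of w1, w2, and b and the helper name boundary_x2 are assumptions made for illustration only.

```python
import math


def boundary_x2(x1, w1, w2, b):
    """Solve w1*x1 + w2*x2 + b = 0 for x2 (requires w2 != 0)."""
    return (-w1 * x1 - b) / w2


# Hypothetical parameters, not taken from a trained model.
w1, w2, b = 0.2, -0.1, -5.0

checks = []
for x1 in (30.0, 65.0, 100.0):
    x2 = boundary_x2(x1, w1, w2, b)
    y = w1 * x1 + w2 * x2 + b          # linear output at the boundary point
    prob = 1.0 / (1.0 + math.exp(-y))  # sigmoid of the linear output
    checks.append((x1, x2, y, prob))
```

Note that the formula divides by $w_2$, so it only applies when $w_2 \neq 0$; with $w_2 = 0$ the boundary would be a vertical line in the $(x_1, x_2)$ plane.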
Take $x_1$ in the range $[30, 100]$, compute the corresponding $x_2$ values from the parameters of the Linear layer, and plot the training result.
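import matplotlib.pyplot as plt
import numpy as np


def vis_one_layer(logistic_model):
    # Read the learned parameters of the Linear layer.
    w1, w2 = logistic_model.lr.weight.data.numpy()[0]
    b = logistic_model.lr.bias.data.numpy()[0]
    # Decision boundary: x2 = (-w1 * x1 - b) / w2 for x1 in [30, 100).
    plot_x = np.arange(30, 100, 0.1)
    plot_y = (-w1 * plot_x - b) / w2
    plt.plot(plot_x, plot_y)
    plt.show()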
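Multi-Layer Neural Network
If the network has more than one layer, we can no longer solve for the relationship between $x_1$ and $x_2$ by substitution. How should we visualize the network's predictions in that case?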
import torch.nn as nn


class MyLogistic(nn.Module):
    def __init__(self, input_size):
        super().__init__()
        # Three hidden layers that narrow from 64 to 16 units.
        self.hidden_1 = nn.Linear(input_size, 64)
        self.hidden_2 = nn.Linear(64, 32)
        self.hidden_3 = nn.Linear(32, 16)
        self.output = nn.Linear(16, 1)
        self.relu_1 = nn.ReLU()
        self.relu_2 = nn.ReLU()
        self.relu_3 = nn.ReLU()
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        x = self.hidden_1(x)
        x = self.relu_1(x)
        x = self.hidden_2(x)
        x = self.relu_2(x)
        x = self.hidden_3(x)
        x = self.relu_3(x)
        x = self.output(x)
        # The sigmoid turns the final output into a probability in (0, 1).
        x = self.sigmoid(x)
        return x
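The idea is still to use a set of points to determine the regions into which the network divides the plane. Because the decision boundary is no longer a straight line, we run the model on every point of a grid covering the data, classify each point as positive or negative, and show the two classes in different colors, using plt.contourf to draw filled contours.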
import matplotlib.pyplot as plt
import numpy as np
import torch


def vis_result(x_data, y_data, model):
    # Bounding box of the data, padded by 1 on each side.
    x_min, x_max = x_data[:, 0].min() - 1, x_data[:, 0].max() + 1
    y_min, y_max = x_data[:, 1].min() - 1, x_data[:, 1].max() + 1
    h = 0.1  # grid step
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
    # Predict a probability for every grid point, then restore the grid shape.
    grid = torch.from_numpy(np.c_[xx.ravel(), yy.ravel()]).float()
    z = model(grid).detach().numpy().reshape(xx.shape)
    # Threshold at 0.5 to get a 0/1 class label per grid point.
    z = (z >= 0.5).astype(int)
    plt.contourf(xx, yy, z, alpha=0.3)
    # Overlay the samples: positive in red, negative in blue.
    colors = ['r' if y == 1 else 'b' for y in y_data]
    plt.scatter(x_data[:, 0], x_data[:, 1], c=colors)
    plt.show()
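The grid bookkeeping in vis_result can be checked on its own, without a trained network. The sketch below uses only NumPy and swaps the model for a hypothetical stand-in predictor, toy_predict, whose score is at least 0.5 inside a circle of radius 1. It shows that flattening the mesh into an (N, 2) batch and reshaping the predictions back yields one label per grid cell, which is the array that plt.contourf colors.

```python
import numpy as np


def toy_predict(points):
    """Hypothetical stand-in for the model: score >= 0.5 inside the unit circle."""
    r2 = (points ** 2).sum(axis=1)         # squared distance from the origin
    return 1.0 / (1.0 + np.exp(r2 - 1.0))  # sigmoid(1 - r^2)


h = 0.5  # coarse grid step so the arrays stay small
xx, yy = np.meshgrid(np.arange(-2, 2 + h, h), np.arange(-1, 1 + h, h))

grid = np.c_[xx.ravel(), yy.ravel()]  # shape (N, 2): one row per grid point
scores = toy_predict(grid)            # shape (N,): one score per grid point
# Restore the mesh layout and threshold at 0.5 to get 0/1 labels.
z = (scores.reshape(xx.shape) >= 0.5).astype(int)

# z[i, j] is the predicted label at the point (xx[i, j], yy[i, j]).
```

A smaller step h gives a smoother boundary in the plot at the cost of more points to evaluate, since the number of grid points grows quadratically as h shrinks.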