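Study notes, 2023-10-06
Fundamentals of Artificial Intelligence
Multilayer Perceptron (MLP)
The idea is to build a model whose structure imitates the way humans think.
Mathematical expression
Assume that a layer of intermediate variables (a1, a2, a3, ...) sits between the input X (x1, x2, x3, ...) and the output y.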
$$
\begin{cases}
a_1^2 = g(\theta^1_{10}x_0+\theta^1_{11}x_1+\theta^1_{12}x_2+\theta^1_{13}x_3)\\
a_2^2 = g(\theta^1_{20}x_0+\theta^1_{21}x_1+\theta^1_{22}x_2+\theta^1_{23}x_3)\\
a_3^2 = g(\theta^1_{30}x_0+\theta^1_{31}x_1+\theta^1_{32}x_2+\theta^1_{33}x_3)
\end{cases}
$$
$$
y = g(\theta^2_{10}a_0^2+\theta^2_{11}a_1^2+\theta^2_{12}a_2^2+\theta^2_{13}a_3^2)
$$
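The two formulas above amount to one forward pass through a network with three inputs, three hidden units, and one output. The sketch below evaluates them with NumPy. It assumes that g is the sigmoid function (the activation used in the Keras code later in these notes) and that x_0 and a_0^2 are bias units fixed to 1. The weight values and the helper names `sigmoid` and `forward` are made up for illustration.

```python
import numpy as np


def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))


def forward(x, theta1, theta2):
    """One forward pass: inputs -> hidden layer a^2 -> output y.

    x      : shape (3,),   the inputs x1..x3
    theta1 : shape (3, 4), row i holds theta^1_{i0}..theta^1_{i3}
    theta2 : shape (4,),   holds theta^2_{10}..theta^2_{13}
    """
    x_with_bias = np.concatenate(([1.0], x))     # prepend x0 = 1 (assumed bias unit)
    a2 = sigmoid(theta1 @ x_with_bias)           # hidden activations a^2_1..a^2_3
    a2_with_bias = np.concatenate(([1.0], a2))   # prepend a^2_0 = 1 (assumed bias unit)
    y = sigmoid(theta2 @ a2_with_bias)           # scalar output
    return a2, y


# Illustrative weights only: all zeros, so every pre-activation is 0 and g(0) = 0.5
x = np.array([0.2, -1.0, 3.0])
theta1 = np.zeros((3, 4))
theta2 = np.zeros(4)
a2, y = forward(x, theta1, theta2)
print(a2, y)
```

Training a network means searching for values of theta1 and theta2 that make y match the labels; the Keras code that follows does that search automatically.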
Nonlinear classification with an MLP
from keras.models import Sequential
from keras.layers import Dense

# Create a Sequential model (a linear stack of layers)
model = Sequential()
# Stack layers with .add()
model.add(Dense(units=3, activation='sigmoid', input_dim=3))
# Configure the training process (loss function and optimizer) with .compile()
model.compile(loss='categorical_crossentropy', optimizer='sgd')
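As a side note on what `.add()` creates: a Dense layer connects every input to every unit and adds one bias per unit, so it holds units * (input_dim + 1) trainable parameters. The sketch below applies that formula to the layer above and to the layers used later in these notes. `dense_param_count` is a hypothetical helper written for this example, not a Keras function.

```python
def dense_param_count(input_dim, units):
    """Trainable parameters of a fully connected layer: weights plus one bias per unit."""
    return units * (input_dim + 1)


# Skeleton above: 3 inputs -> 3 units
print(dense_param_count(3, 3))     # 12

# Binary classifier in these notes: 2 inputs -> 20 hidden units -> 1 output
print(dense_param_count(2, 20))    # 60
print(dense_param_count(20, 1))    # 21

# MNIST model in these notes: 784 -> 392 -> 392 -> 10
mnist_layers = [(784, 392), (392, 392), (392, 10)]
mnist_total = sum(dense_param_count(i, u) for i, u in mnist_layers)
print(mnist_total)                 # 465706
```

These numbers can be compared with the "Param #" column that `summary()` prints for each model.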
# Nonlinear binary classification
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Dense, Activation
from sklearn.metrics import accuracy_score
data = pd.read_csv('data.csv')
print(data.head())
X = data.drop(['y'], axis=1)
y = data.loc[:, 'y']
fig1 = plt.figure()
passed = plt.scatter(X.loc[:, 'x1'][y == 1], X.loc[:, 'x2'][y == 1])
failed = plt.scatter(X.loc[:, 'x1'][y == 0], X.loc[:, 'x2'][y == 0])
plt.legend((passed, failed), ('passed', 'failed'))
plt.show()
# Split the data. Note: train_size=0.33 puts only about a third of the samples in the training set
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.33, random_state=10)
print(X_train.shape, X_test.shape, X.shape) # (65, 2) (134, 2) (199, 2)
# Create the MLP
mlp = Sequential()
mlp.add(Dense(units=20, input_dim=2, activation='sigmoid')) # hidden layer: 20 neurons, 2 input features, sigmoid (logistic) activation
mlp.add(Dense(units=1, activation='sigmoid')) # output layer: 1 neuron, sigmoid (logistic) activation
mlp.summary() # print the model structure
# Configure training: Adam optimizer; binary_crossentropy loss because this is binary classification
mlp.compile(optimizer='adam', loss='binary_crossentropy')
# Train the model
mlp.fit(X_train, y_train, epochs=3000) # 3000 epochs
# Check the accuracy on the training set
# y_train_predict = mlp.predict_classes(X_train)
y_train_predict = mlp.predict(X_train)
y_train_predict = np.round(y_train_predict).astype(int)
# Note: we want predictions of 0 or 1, but mlp.predict(X_train) only returns a probability for each point.
# mlp.predict_classes used to return 0/1 labels directly, but it has been deprecated (and removed in newer TensorFlow/Keras releases).
# The workaround: get the probability array from mlp.predict(X_train), round every value with np.round(),
# then cast the array to int with astype(int).
accuracy_train = accuracy_score(y_train, y_train_predict)
print(accuracy_train) # 0.9230769230769231
# Evaluate on the test set
y_test_predict = mlp.predict(X_test)
y_test_predict = np.round(y_test_predict).astype(int)
accuracy_test = accuracy_score(y_test, y_test_predict)
print(accuracy_test) # 0.8955223880597015
# Visualize the model's predictions
print(type(y_train_predict), y_train_predict) # an (n, 1) ndarray, which cannot be used directly as a boolean index
y_train_predict_form = pd.Series(i[0] for i in y_train_predict)
print(type(y_train_predict_form), y_train_predict_form) # pandas.core.series.Series
xx, yy = np.meshgrid(np.arange(0, 100, 1), np.arange(0, 100, 1)) # build grid coordinates: 0 to 99 on each axis, step 1
x_range = np.c_[xx.ravel(), yy.ravel()] # stack the grid into an array of (x1, x2) points
y_range_predict = mlp.predict(x_range)
y_range_predict = np.round(y_range_predict).astype(int)
y_range_predict_form = pd.Series(i[0] for i in y_range_predict)
print(type(y_range_predict_form), y_range_predict_form) # pandas.core.series.Series
fig2 = plt.figure()
passed_predict = plt.scatter(x_range[:, 0][y_range_predict_form == 1], x_range[:, 1][y_range_predict_form == 1])
failed_predict = plt.scatter(x_range[:, 0][y_range_predict_form == 0], x_range[:, 1][y_range_predict_form == 0])
passed = plt.scatter(X.loc[:, 'x1'][y == 1], X.loc[:, 'x2'][y == 1])
failed = plt.scatter(X.loc[:, 'x1'][y == 0], X.loc[:, 'x2'][y == 0])
plt.legend((passed, failed, passed_predict, failed_predict), ('passed', 'failed', 'passed_predict', 'failed_predict'))
plt.title('predicted result')
plt.show()
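Assessment: the classification result is mediocre (training accuracy about 0.92, test accuracy about 0.90). Possible improvements: train for more epochs; collect more data points; choose a better split between training and test data.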
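The conversion from probabilities to class labels used in the script above can be checked in isolation. The sketch below uses a hand-written array in place of real `mlp.predict` output (the values are invented for illustration) and shows two things: thresholding at 0.5, and flattening the (n, 1) result so that it can serve as a boolean mask.

```python
import numpy as np

# Stand-in for mlp.predict(X): one sigmoid probability per sample, shape (n, 1)
probabilities = np.array([[0.91], [0.08], [0.63], [0.37]])

# Same approach as in the notes: round to the nearest integer, then cast to int
labels_rounded = np.round(probabilities).astype(int)

# Equivalent explicit threshold (assumes no value is exactly 0.5)
labels_threshold = (probabilities > 0.5).astype(int)

# Flatten (n, 1) -> (n,) so the labels can be used as a boolean mask
labels_flat = labels_rounded.ravel()
points = np.array([[10, 20], [30, 40], [50, 60], [70, 80]])
print(labels_flat)                # [1 0 1 0]
print(points[labels_flat == 1])   # the rows for samples 0 and 2
```

Here `ravel()` plays the same role as the `pd.Series(i[0] for i in ...)` conversion in the script: it produces a one-dimensional sequence of labels.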
Multiclass classification with an MLP
Image prediction with an MLP
import numpy as np
from matplotlib import pyplot as plt
from keras.datasets import mnist
from keras.utils import to_categorical
from keras.models import Sequential
from keras.layers import Dense, Activation
from sklearn.metrics import accuracy_score
# Load the data
# keras.datasets provides a loader for the MNIST dataset
(X_train, y_train), (X_test, y_test) = mnist.load_data()
print(type(X_train), X_train.shape) # <class 'numpy.ndarray'> (60000, 28, 28)
# Display an image
img1 = X_train[0]
fig1 = plt.figure()
plt.imshow(img1)
plt.title(y_train[0])
plt.show()
# Reshape the inputs: flatten each 28x28 image into a 784-element vector
print(img1.shape) # (28, 28)
feature_size = img1.shape[0]*img1.shape[1]
X_train_format = X_train.reshape(X_train.shape[0], feature_size)
X_test_format = X_test.reshape(X_test.shape[0], feature_size)
print(X_train_format.shape) # (60000, 784)
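# Normalize the data
X_train_normal = X_train_format/255 # scale grayscale pixel values from 0-255 to 0-1
X_test_normal = X_test_format/255
print(X_train_normal[0]) # check the result of normalization
# Convert the labels to one-hot format
y_train_format = to_categorical(y_train) # turn the integer class vector into a binary 0/1 (one-hot) matrix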
y_test_format = to_categorical(y_test)
print(y_train[0], y_train_format[0])
# Build the model
mlp = Sequential()
mlp.add(Dense(units=392, activation='sigmoid', input_dim=feature_size)) # first hidden layer: 392 neurons, input dimension feature_size, sigmoid activation
mlp.add(Dense(units=392, activation='sigmoid')) # second hidden layer: 392 neurons, input is the previous layer's output, sigmoid activation
mlp.add(Dense(units=10, activation='softmax')) # output layer: 10 units, one per digit 0-9; softmax because this is multiclass classification
mlp.summary()
# Configure the model
mlp.compile(loss='categorical_crossentropy', optimizer='adam') # set the loss function and the optimizer
# Train the model
mlp.fit(X_train_normal, y_train_format, epochs=10) # the dataset is large, so use fewer epochs
# Evaluate the model
# Training set
y_train_predict = mlp.predict(X_train_normal)
y_train_predict = np.round(y_train_predict).astype(int)
print(y_train_predict)
accuracy_train = accuracy_score(y_train_format, y_train_predict)
print(accuracy_train) # 0.9963
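# Test set
y_test_predict = mlp.predict(X_test_normal)
# y_test_predict = np.round(y_test_predict).astype(int)
y_test_predict = np.argmax(y_test_predict, axis=1)
# A different method is used here. np.round(y_predict).astype(int) works for a single output:
# it rounds the probability to 0 or 1, which is exactly the 0/1 label y.
# In this image example, however, the output for each sample is a 10-dimensional vector
# that indicates which of the digits 0-9 was predicted.
# In a sense np.round(y_predict).astype(int) is still correct: a prediction of 9 comes out as
# [0 0 0 0 0 0 0 0 0 1], and that is what plt.title(y_test_predict[9]) would then display.
# That is clearly not intuitive, though; calling another function to find the position of the 1 would be one workaround.
# After some searching I found np.argmax(y_predict, axis=1), which returns the indices of the maximum values
# of y_predict: axis=0 gives the index of the maximum in each column, axis=1 the index of the maximum in each row.
# The result has one dimension fewer than y_predict.
# One detail to watch: with y_test_predict = np.round(y_test_predict).astype(int),
# use accuracy_test = accuracy_score(y_test_format, y_test_predict);
# with y_test_predict = np.argmax(y_test_predict, axis=1),
# use accuracy_test = accuracy_score(y_test, y_test_predict).
# Thinking about it further: to_categorical(y_train) earlier converts the labels into a 0/1 matrix precisely
# so that the MLP can work with a multi-dimensional output; each integer label becomes a one-hot 0/1 vector.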
accuracy_test = accuracy_score(y_test, y_test_predict)
print(accuracy_test) # 0.9786
# Pick an image to check the prediction
img2 = X_test[9]
fig2 = plt.figure()
plt.imshow(img2)
plt.title(y_test_predict[9])
plt.show()
img2 = X_test[56]
fig2 = plt.figure()
plt.imshow(img2)
plt.title(y_test_predict[56])
plt.show()
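The difference between `np.round` and `np.argmax` discussed in the comments above shows up most clearly when no class probability exceeds 0.5. The sketch below uses invented softmax-like rows rather than real model output. The one-hot step mimics the format that `to_categorical` produces, written with plain NumPy so that the example does not depend on Keras.

```python
import numpy as np

# Invented softmax-like outputs for 3 samples over 4 classes (each row sums to 1)
probs = np.array([
    [0.05, 0.90, 0.03, 0.02],   # confident: class 1
    [0.40, 0.30, 0.20, 0.10],   # unsure: class 0 wins, but with probability below 0.5
    [0.10, 0.10, 0.10, 0.70],   # confident: class 3
])
true_labels = np.array([1, 0, 3])

# One-hot encoding of the integer labels (the format to_categorical returns)
one_hot = np.eye(probs.shape[1], dtype=int)[true_labels]

# Approach 1: round each probability; the unsure row becomes all zeros
rounded = np.round(probs).astype(int)
exact_row_match = np.all(rounded == one_hot, axis=1).mean()

# Approach 2: take the index of the largest probability in each row
predicted = np.argmax(probs, axis=1)
label_match = (predicted == true_labels).mean()

print(rounded[1])        # [0 0 0 0]
print(predicted)         # [1 0 3]
print(exact_row_match)   # 2 of the 3 rows match their one-hot label
print(label_match)       # all 3 labels match
```

With rounding, a sample whose top probability is below 0.5 can never count as correct, so `argmax` is the more reliable way to turn softmax outputs into class labels.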