SS00003.Machinelearning | Arithmetic&Machine.v03 | TensorFlow: Supervised Learning Algorithms.v03

yanqi_vip

Last modified 2022-05-24 14:50:25


Column: bigdatav030: Machine Learning. Tags: tensorflow, algorithms, learning, machine learning, deep learning

First published 2022-05-23 14:09:00

Reposting not permitted

Article link: https://blog.csdn.net/yanqi_vip/article/details/124938751


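I. Logistic Regression

### --- Logistic regression (classification)

~~~     Logistic regression predicts a categorical label. The label may be binary,
~~~     for example zero or one, True or False, yes or no, cat or dog, or it may take more than two values,
~~~     for example red, blue, or green, or one, two, three, four, or five.
~~~     Each label usually has an associated probability; for example, P(cat) = 0.92 and P(dog) = 0.08.
~~~     Despite its name, logistic regression is therefore a classification method.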
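
~~~     To make the link between a label and its probability concrete, here is a minimal sketch of the binary case.
~~~     It assumes a model that outputs one real-valued score for the class "cat"; the score z_cat is a hypothetical value
~~~     chosen so that the result matches the P(cat) = 0.92 example, and the sigmoid helper is written here only for illustration.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical score for the class "cat", chosen so that P(cat) = 0.92.
z_cat = math.log(0.92 / 0.08)

p_cat = sigmoid(z_cat)
p_dog = 1.0 - p_cat  # in a two-class problem the probabilities sum to 1

# The predicted label is the class with the higher probability.
predicted = "cat" if p_cat >= 0.5 else "dog"
print(f"P(cat) = {p_cat:.2f}, P(dog) = {p_dog:.2f}, predicted: {predicted}")
```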

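### --- Classifying images with TensorFlow

~~~     We use logistic regression on the fashion_mnist dataset to predict the category of a fashion item.
~~~     The dataset contains images from 10 classes:
~~~     t-shirt, trouser, pullover, dress,
~~~     coat, sandal, shirt, sneaker, bag,
~~~     and ankle boot.

~~~     The training set has 6,000 images per class, 60,000 in total. We train the model on 50,000 of them,
~~~     validate on the remaining 10,000 (the validation set is split off from the training set), and test on a separate set of 10,000 images.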

~~~     # Import the required packages

import numpy as np
import tensorflow as tf
from tensorflow.keras.datasets import fashion_mnist
from tensorflow.keras.callbacks import ModelCheckpoint

~~~     # Data processing
### --- Define constants:

batch_size = 128
epochs = 20
n_classes = 10
learning_rate = 0.1
width = 28
height = 28

### --- Define the image classes:

~~~     # Class index: 0 1 2 3 4 5 6 7 8 9
fashion_labels = ["Tshirt/top", "Trousers", "Pullover", "Dress", "Coat",
                  "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

~~~     # Split the dataset and one-hot encode the labels:

# Load the dataset
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()
# Convert every integer pixel value to float32 and divide by 255 to normalize it to [0, 1]
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0
# Flatten each 28x28 image into a vector of 784 features
x_train = x_train.reshape((60000, width * height))
x_test = x_test.reshape((10000, width * height))
split = 50000
# Split the training data again into a training set and a validation set
(x_train, x_valid) = x_train[:split], x_train[split:]
(y_train, y_valid) = y_train[:split], y_train[split:]
# One-hot encode the labels with TensorFlow.
# tf.one_hot returns a tensor; .numpy() converts it back to a NumPy array
y_train_ohe = tf.one_hot(y_train, depth=n_classes).numpy()
y_valid_ohe = tf.one_hot(y_valid, depth=n_classes).numpy()
y_test_ohe = tf.one_hot(y_test, depth=n_classes).numpy()

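### --- Show the difference between the original label and its one-hot encoding:

i = 5
print(y_train[i])  # integer label at index i
print(tf.one_hot(y_train[i], depth=n_classes))  # depth sets the encoding length (10 classes)
print(y_train_ohe[i])  # one-hot encoding at index i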
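
~~~     The encoding itself can also be sketched in plain NumPy, which shows what one-hot encoding does without TensorFlow.
~~~     This is an illustrative equivalent, not the code used in this article; the labels array holds hypothetical class indices.

```python
import numpy as np

n_classes = 10
labels = np.array([0, 2, 9])  # hypothetical integer class labels

# Row k of the identity matrix is the one-hot vector for class k.
one_hot = np.eye(n_classes, dtype=np.float32)[labels]
print(one_hot)

# argmax along each row recovers the original integer labels.
recovered = np.argmax(one_hot, axis=1)
print(recovered)
```

~~~     The argmax step is the same operation used later to turn predicted probabilities back into a class index.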

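### --- Display 10 sample fashion images:

import matplotlib.pyplot as plt
# Jupyter notebook magic; remove this line when running as a plain script
%matplotlib inline
_, image = plt.subplots(1, 10, figsize=(8, 1))
for i in range(10):
    image[i].imshow(np.reshape(x_train[i], (width, height)), cmap="Greys")
    # Print the label of each image, separated by spaces
    print(fashion_labels[y_train[i]], end=' ')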

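### --- Building the model

~~~     The official (Google) recommendation for creating any kind of machine learning model
~~~     is to subclass tf.keras.Model.

~~~     For our logistic regression example, the subclass needs two methods.
~~~     First, a constructor that calls the parent class constructor so that the model is set up correctly.
~~~     We pass in the number of classes (10), which the constructor uses to create the model's single layer when the model is instantiated.
~~~     Second, a call method, which defines the computation performed in the forward pass of the model.

~~~     In the call method, we apply softmax to the output of the dense layer to produce the model output.
~~~     softmax takes a vector (or tensor), exponentiates each element, and divides by the sum of the exponentials.
~~~     The position holding the largest input receives the largest output, close to 1 when that input is much larger than the rest,
~~~     and the other positions receive values close to zero.
~~~     The result therefore resembles a one-hot encoding. Note that the model code forces softmax to run on the CPU
~~~     on the assumption that the op has no GPU implementation; current TensorFlow releases do provide a GPU softmax kernel, so the tf.device block is optional.

~~~     softmax can be seen as the generalization of sigmoid to multi-class problems.
~~~     It maps a K-dimensional real vector to another K-dimensional real vector whose elements each lie in (0, 1) and sum to 1.
~~~     sigmoid and softmax are both activation functions; in classification networks they are typically applied at the output layer.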
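
~~~     The properties of softmax described above can be checked with a short NumPy sketch.
~~~     The softmax and sigmoid helpers and the logits values are written here for illustration only; the model itself uses tf.nn.softmax.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax along the last axis."""
    shifted = z - np.max(z, axis=-1, keepdims=True)  # avoid overflow in exp
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

def sigmoid(z):
    """Logistic function for a single score."""
    return 1.0 / (1.0 + np.exp(-z))

logits = np.array([2.0, 1.0, 0.1])  # hypothetical scores for three classes
probs = softmax(logits)
print(probs, probs.sum())  # every entry lies in (0, 1) and the entries sum to 1

# Two-class case: softmax over [z, 0] gives sigmoid(z) for the first class.
z = 1.5
two_class = softmax(np.array([z, 0.0]))
print(two_class[0], sigmoid(z))
```

~~~     The last two printed values agree, which is the sense in which softmax generalizes sigmoid.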

~~~     # Define the model (logistic regression)

class LogisticRegression(tf.keras.Model):
    def __init__(self, num_classes):
        super().__init__()
        # A single fully connected layer with num_classes (10) output units
        self.dense = tf.keras.layers.Dense(num_classes)

    # Defines what happens in the forward pass of the model
    def call(self, inputs, training=None, mask=None):
        output = self.dense(inputs)
        # Pin the softmax op to the CPU (index 0 is the first CPU device)
        with tf.device('/cpu:0'):
            output = tf.nn.softmax(output)
        return output
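
~~~     As a rough picture of what this model computes, the forward pass can be written out in NumPy.
~~~     This sketch assumes the dense layer computes x @ W + b followed by softmax; the arrays x, W, and b are random or zero
~~~     stand-ins, not the trained Keras weights.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, n_features, n_classes = 4, 28 * 28, 10

x = rng.random((batch, n_features))                         # hypothetical flattened images
W = rng.normal(scale=0.01, size=(n_features, n_classes))    # stand-in weight matrix
b = np.zeros(n_classes)                                     # stand-in bias vector

# Dense layer: one linear score (logit) per class for every image.
logits = x @ W + b

# Softmax over the class axis turns the logits into probabilities.
exp = np.exp(logits - logits.max(axis=1, keepdims=True))
probs = exp / exp.sum(axis=1, keepdims=True)

n_params = W.size + b.size  # 784 * 10 weights plus 10 biases
print(logits.shape, probs.sum(axis=1), n_params)
```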

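### --- Training the model

# Create the model
model = LogisticRegression(n_classes)
# Optimizer (Adam with its default learning rate; the learning_rate constant defined earlier is not passed in)
optimizer = tf.keras.optimizers.Adam()
# Objective (loss) function:
# categorical_crossentropy is the standard loss for multi-class logistic regression (cross-entropy loss),
# and 'accuracy' is the usual metric for classification problems.
model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
# Call the model once on a dummy image (width * height features) so that the input size is fixed up front;
# otherwise the whole dataset would be loaded into memory during training to determine it
dummy_x = tf.zeros((1, width * height))
model(dummy_x)
# Save the best model seen during training
checkpointer = ModelCheckpoint(filepath="./model.weights.best.hdf5", verbose=2,
                               save_best_only=True, save_weights_only=True)
# Train the model
model.fit(x_train, y_train_ohe, batch_size=batch_size, epochs=epochs,
          validation_data=(x_valid, y_valid_ohe), callbacks=[checkpointer],
          verbose=2)
# Reload the best weights; ModelCheckpoint monitors val_loss by default, so "best" means lowest validation loss
model.load_weights("./model.weights.best.hdf5")
# Evaluate the model on the test set
scores = model.evaluate(x_test, y_test_ohe, batch_size=batch_size, verbose=2)
print("Final test loss and accuracy :", scores)
y_predictions = model.predict(x_test)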
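
~~~     To see what the loss and the accuracy metric measure, here is a small worked example in NumPy.
~~~     It is a sketch of the cross-entropy formula, mean of -sum(y_true * log(y_pred)), and is not Keras's own implementation;
~~~     the function name, the eps value, and the two-sample batch are assumptions made for illustration.

```python
import numpy as np

def categorical_crossentropy(y_true_ohe, y_pred, eps=1e-7):
    """Mean over the batch of -sum(y_true * log(y_pred))."""
    y_pred = np.clip(y_pred, eps, 1.0)  # guard against log(0)
    return float(np.mean(-np.sum(y_true_ohe * np.log(y_pred), axis=1)))

# Hypothetical batch of two samples and three classes.
y_true = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])
y_pred = np.array([[0.7, 0.2, 0.1],   # true class gets 0.7: small loss
                   [0.1, 0.3, 0.6]])  # true class gets only 0.3: larger loss

loss = categorical_crossentropy(y_true, y_pred)

# Accuracy: fraction of samples whose highest-probability class is the true class.
accuracy = float(np.mean(np.argmax(y_pred, axis=1) == np.argmax(y_true, axis=1)))
print(loss, accuracy)
```

~~~     Only the probability assigned to the true class enters the loss, so the second sample contributes most of it.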

### --- Checking predictions on the test set

~~~     # Compare the predictions with the true labels
index = 42
index_predicted = np.argmax(y_predictions[index])
index_true = np.argmax(y_test_ohe[index])
print("When prediction is ", index_predicted)
print("ie. predicted label is", fashion_labels[index_predicted])
print("True label is ", fashion_labels[index_true])
print("\n\nPredicted V (True) fashion labels, green is correct, red is wrong")
size = 12  # i.e. draw 12 random test samples
fig = plt.figure(figsize=(20, 6))
rows = 3
cols = 4
for i, index in enumerate(np.random.choice(x_test.shape[0], size=size, replace=False)):
    axis = fig.add_subplot(rows, cols, i + 1, xticks=[], yticks=[])
    axis.imshow(x_test[index].reshape(width, height), cmap="Greys")
    index_predicted = np.argmax(y_predictions[index])
    index_true = np.argmax(y_test_ohe[index])
    # Title shows "predicted (true)", green when they match and red otherwise
    axis.set_title("{} ({})".format(fashion_labels[index_predicted], fashion_labels[index_true]),
                   color=("green" if index_predicted == index_true else "red"))
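
~~~     The same argmax comparison can be applied to every test sample at once instead of one index at a time.
~~~     The sketch below uses small hypothetical arrays in place of y_predictions and y_test_ohe so that it runs on its own.

```python
import numpy as np

# Hypothetical stand-ins for y_predictions and y_test_ohe (4 samples, 3 classes).
y_predictions = np.array([[0.8, 0.1, 0.1],
                          [0.2, 0.5, 0.3],
                          [0.3, 0.4, 0.3],
                          [0.1, 0.2, 0.7]])
y_true_ohe = np.array([[1, 0, 0],
                       [0, 1, 0],
                       [0, 0, 1],
                       [0, 0, 1]])

pred = np.argmax(y_predictions, axis=1)  # predicted class index per sample
true = np.argmax(y_true_ohe, axis=1)     # true class index per sample

accuracy = float(np.mean(pred == true))  # fraction of correct predictions
wrong = np.flatnonzero(pred != true)     # indices of misclassified samples
print(accuracy, wrong)
```

~~~     With the real arrays, the wrong indices are a convenient starting point for plotting only the misclassified images.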
