Deep Learning, Chapter 3: Getting Started with Neural Networks

Latest recommended article published on 2023-06-14 20:38:52

逆夏11111


Category: Deep Learning

Article link: https://blog.csdn.net/weixin_43955530/article/details/89164852


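1. About neural networks

To build a neural network, first define the model: how many layers it has and how many units each layer contains. Next, configure the learning process, which is the compile step: choose an optimizer, a loss function, and the metrics to monitor. Finally, run the learning process with fit, specifying the number of passes over the data (epochs) and the number of samples processed per update (batch_size):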

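## Two ways to build a model
# The Sequential() class
from keras import models
from keras import layers

model = models.Sequential()
model.add(layers.Dense(32,activation = 'relu',input_shape = (784,)))
model.add(layers.Dense(10,activation = 'softmax'))


# The functional API
input_tensor = layers.Input(shape = (784,))
x = layers.Dense(32,activation = 'relu')(input_tensor)
output_tensor = layers.Dense(10,activation = 'softmax')(x)

model = models.Model(inputs = input_tensor,outputs = output_tensor)

## Configure the learning process
from keras import optimizers
# lr is the argument name in the Keras version used here; newer releases call it learning_rate
model.compile(optimizer = optimizers.RMSprop(lr = 0.001),
              loss = 'mse',
              metrics = ['accuracy'])  # optimizer, loss function, monitored metrics

## Training
x_train = ...  # placeholder: input data as a 2D tensor of shape (60000, 784)
y_train = ...  # placeholder: one-hot labels of shape (60000, 10)
model.fit(x_train,y_train,batch_size = 128,epochs = 10)
# Because the model uses Dense layers, the input must be reshaped into a 2D tensor of shape (60000, 784)
# y_train must be one-hot encoded, because the second layer outputs a 10-dimensional vector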
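
To make the meaning of epochs and batch_size concrete, here is a small arithmetic sketch. It assumes the 60000 samples mentioned in the comment above; the helper name updates_per_epoch is made up for this illustration and is not a Keras function.

```python
import math

def updates_per_epoch(num_samples, batch_size):
    """Number of mini-batches (weight updates) needed to see every sample once."""
    # The last batch may be smaller than batch_size, hence the ceiling.
    return math.ceil(num_samples / batch_size)

num_samples = 60000   # taken from the (60000, 784) shape mentioned above
batch_size = 128
epochs = 10

per_epoch = updates_per_epoch(num_samples, batch_size)
total_updates = per_epoch * epochs
print(per_epoch, total_updates)
```

A smaller batch_size means more (and noisier) updates per epoch; a larger one means fewer updates that each use more data.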

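2. Binary classification: classifying movie reviews

Loading the data
A Keras dataset can be loaded in two ways. 1. Call imdb.load_data() as in the book, which downloads the data from the official source and saves it in C:\Users\lenovo\.keras\datasets. On every later run the code checks that folder: if the file is already there it is loaded directly, otherwise it is downloaded. 2. If the connection is too slow, download the dataset from Baidu Cloud and save it in the directory above. You can also keep it in the project directory, in which case you have to load the file yourself.

### Download the dataset; on later runs it is not downloaded again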
from keras.datasets import imdb
(train_data,train_labels),(test_data,test_labels) = imdb.load_data(num_words = 10000)

If your connection is too slow, use my Baidu Cloud share:
Link: https://pan.baidu.com/s/1tinNqlV86FySMW5HZTLqXA
Extraction code: 4n8a
If you have already saved the dataset locally, load it like this:

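## Load the dataset
import numpy as np
# the arrays hold Python lists, so recent NumPy versions need allow_pickle = True
data_set = np.load('imdb.npz',allow_pickle = True)
print(data_set.keys()) 
x_test = data_set['x_test']
x_train = data_set['x_train']
y_train = data_set['y_train']
y_test = data_set['y_test']
# the later code refers to these raw arrays as train_data, train_labels, test_data and test_labels

Because this loads the entire dataset, some word indices will be larger than 10000. The training set and the test set each contain 25000 samples.

Decoding the data back into English reviews
This requires downloading the dictionary that maps each word to its index: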

from keras.datasets import imdb
word_index = imdb.get_word_index()  # download the word-to-index dictionary

Link: https://pan.baidu.com/s/1GEfbdnCRNNszHISVhmjAWg
Extraction code: wejw

The download is a .json file, which is also stored in C:\Users\lenovo\.keras\datasets. You can copy this file into your project folder beforehand and read it with json.load().

### Decode a review into English
#from keras.datasets import imdb
#word_index = imdb.get_word_index()  # download the word-to-index dictionary

# Read the .json file
import json
with open('imdb_word_index.json',encoding = 'utf-8') as f:
    word_index = json.load(f)
# Decode the review into English
reverse_word_index = dict([(value,key) for (key,value) in word_index.items()])
decoded_review = ' '.join([reverse_word_index.get(i - 3,'?') for i in train_data[0]])
print(decoded_review)

? this film was just brilliant casting location scenery story direction everyone’s really suited the part they played and you could just imagine being there robert ? is an amazing actor and now the same being director ? father came from the same scottish island as myself so i loved the fact there was a real connection with this film the witty remarks throughout the film were great it was just brilliant so much that i bought the film as soon as it was released for ? and would recommend it to everyone to watch and the fly fishing was amazing really cried at the end it was so sad and you know what they say if you cry at a film it must have been good and this definitely was also ? to the two little boy’s that played the ? of norman and paul they were just brilliant children are often left out of the ? list i think because the stars that play them all grown up are such a big profile for the whole film but these children are amazing and should be praised for what they have done don’t you think the whole story was so lovely because it was true and was someone’s life after all that was shared with us all

This gives the review as English text.
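
The i - 3 in the decoding step is easy to miss. In the Keras IMDB encoding, indices 0, 1 and 2 are reserved for padding, start of sequence and unknown word, so real words are shifted up by 3. The sketch below shows the effect on a tiny vocabulary; toy_word_index and the encoded review are invented for this example and are not taken from the real dataset.

```python
# Hypothetical toy vocabulary; the real one comes from imdb.get_word_index().
toy_word_index = {"the": 1, "film": 2, "was": 3, "brilliant": 4}
reverse_toy_index = {index: word for word, index in toy_word_index.items()}

def decode(encoded_review):
    # Indices 0, 1 and 2 are reserved (padding, start of sequence, unknown),
    # so every real word index is shifted back by 3 before the lookup.
    return " ".join(reverse_toy_index.get(i - 3, "?") for i in encoded_review)

# 1 marks the start of the sequence and 2 an unknown word: both decode to "?".
decoded = decode([1, 4, 5, 6, 7, 2])
print(decoded)
```

This also explains the question marks in the decoded review above: they are the start marker and the words that fell outside the vocabulary.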

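Preparing the data
Each sample contains a different number of words, so the samples have different lengths. At this point the training and test sets are lists of integer sequences of varying length, which cannot be fed into a neural network directly. The integer sequences have to be converted into a 2D tensor (a matrix in which every row has the same length).

There are two ways to do this: 1. Pad the lists to the same length and use an Embedding layer. 2. Use one-hot encoding to turn each sequence into a 10000-dimensional vector and use Dense layers.

### Prepare the data
# Turn the integer sequences into a binary matrix (2D tensor)
import numpy as np
def vectorize_sequence(sequences,dimension = 10000):
    results = np.zeros((len(sequences),dimension))
    for i,sequence in enumerate(sequences):
        results[i,sequence] = 1  # set the columns of the words that occur in sample i to 1
    return results
# This one-hot encoding produces a (25000, 10000) matrix of 0s and 1s

x_train = vectorize_sequence(train_data)
x_test = vectorize_sequence(test_data)

# Vectorize the labels
y_train = np.asarray(train_labels).astype('float32')
y_test = np.asarray(test_labels).astype('float32')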
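
To see what this encoding does to a single review, here is a pure-Python sketch of the same idea on two tiny sequences. The function name multi_hot and the toy inputs are made up for illustration; the NumPy version above is the one actually used.

```python
def multi_hot(sequences, dimension):
    """One row per sequence, with 1.0 in every column whose word index occurs."""
    rows = []
    for sequence in sequences:
        row = [0.0] * dimension
        for index in sequence:
            row[index] = 1.0   # a repeated word still gives 1.0; word order is discarded
        rows.append(row)
    return rows

# Two toy "reviews" over a 5-word vocabulary.
toy = multi_hot([[1, 3], [0, 2, 2]], dimension=5)
for row in toy:
    print(row)
```

Every row has the same length regardless of how long the review was, which is exactly what a Dense layer needs. The price is that word order and word counts are lost.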

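The data can now be used to train a neural network. Its type has changed compared with the original lists of integers [before/after screenshots omitted].

Building the neural network
The inputs are vectors and the labels are scalars (0/1), which suits a stack of Dense layers:

### Build the network model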
from keras import models
from keras import layers

model = models.Sequential()
model.add(layers.Dense(16,activation = 'relu',input_shape = (10000,)))
model.add(layers.Dense(16,activation = 'relu'))
model.add(layers.Dense(1,activation = 'sigmoid'))

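Compiling the model

### Compile the model
model.compile(optimizer = 'rmsprop',
              loss = 'binary_crossentropy',
              metrics = ['accuracy'])
# The optimizer is rmsprop, the loss is binary cross-entropy, and accuracy is monitored during training

For a binary classification problem the network outputs a probability, so the best choice of loss is binary_crossentropy (binary cross-entropy). mean_squared_error (mean squared error) can also be used, but cross-entropy is better suited to models that output probabilities.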
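
A small numeric comparison shows why cross-entropy suits probability outputs. The sketch below implements both losses in plain Python (it is not the Keras implementation, and the labels and probabilities are invented) and compares a confident correct prediction with a confident wrong one.

```python
import math

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -[t*log(p) + (1-t)*log(1-p)], with p clipped away from 0 and 1."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

def mean_squared_error(y_true, y_prob):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)

labels = [1, 0, 1]                      # hypothetical labels
confident_right = [0.9, 0.1, 0.9]       # hypothetical model outputs
confident_wrong = [0.1, 0.9, 0.1]

bce_right = binary_crossentropy(labels, confident_right)
bce_wrong = binary_crossentropy(labels, confident_wrong)
mse_right = mean_squared_error(labels, confident_right)
mse_wrong = mean_squared_error(labels, confident_wrong)
print(bce_right, bce_wrong)
print(mse_right, mse_wrong)
```

The squared error of a prediction can never exceed 1, while the cross-entropy grows without bound as a wrong prediction becomes more confident, so confident mistakes are punished much harder.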

You can also pass an optimizer object with custom parameters:

from keras import optimizers
model.compile(optimizer = optimizers.RMSprop(lr = 0.001),
              loss = 'binary_crossentropy',
              metrics = ['accuracy'])

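Setting aside a validation set and training
To avoid using the test set during development, we split the training set into a validation set and a (smaller) training set. We do not yet know whether the current model is a good one, and its hyperparameters will still need tuning. The validation set is not the same thing as the test set: the validation set is carved out of the training set.

The training set is used to fit the model parameters, the test set is used to estimate the model's generalization error, and the validation set is used to "train" the model's hyperparameters.
What are a model's parameters?
A machine learning model usually has two kinds of parameters: model parameters and hyperparameters. Hyperparameters control the behavior of the model and are not learned by the model itself. In polynomial regression, for example, the degree of the polynomial and the learning rate are hyperparameters. They cannot be learned by the model itself because the model would tend to push them too high or too low, which easily leads to overfitting. If a polynomial regression model were allowed to learn its own degree, it would choose a high-degree polynomial, because that makes the training error very small. In the extreme case, polynomial regression on N points would choose degree N - 1, which is enough to pass through every point exactly. [reference]

### Train the model
# First split the training set into a validation set and a training set
# Of the 25000 samples, the first 10000 form the validation set and the remaining 15000 the training set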
x_val = x_train[:10000]
partial_x_train = x_train[10000:]

y_val = y_train[:10000]
partial_y_train = y_train[10000:]

#fit
History = model.fit(partial_x_train,partial_y_train,
                    epochs = 20,batch_size = 512,
                    validation_data = (x_val,y_val))
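# Train on the training set in mini-batches of 512 samples, for 20 passes over the whole training data
# At the same time, monitor loss and accuracy on the 10000 validation samples

Analyzing the training results
fit returns a History object. It has a member called history, which is a dictionary containing all the data recorded during training.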

history_dict = History.history
print(history_dict.keys())

dict_keys(['val_loss', 'val_acc', 'loss', 'acc'])

Plot the training and validation loss side by side, using loss and val_loss:

### Plot and compare
import matplotlib.pyplot as plt

loss_values = history_dict['loss']
val_loss_values = history_dict['val_loss']

epochs = list(range(1,21))

plt.plot(epochs,loss_values,'bo',label = 'Training loss')
plt.plot(epochs,val_loss_values,'red',label = 'Validation loss')
plt.title('Training and Validation loss')
plt.xlabel('epochs')
plt.ylabel('Loss')
plt.legend()

plt.show()

Plot the training and validation accuracy side by side, using acc and val_acc:

import matplotlib.pyplot as plt

acc_values = history_dict['acc']
val_acc_values = history_dict['val_acc']

epochs = list(range(1,21))

plt.plot(epochs,acc_values,'bo',label = 'Training acc')
plt.plot(epochs,val_acc_values,'red',label = 'Validation acc')
plt.title('Training and Validation accuracy')
plt.xlabel('epochs')
plt.ylabel('Acc')
plt.legend()

plt.show()

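Accuracy on the training set keeps rising and loss on the training set keeps falling, but on the validation set the loss grows and the accuracy drops. The model is overfitting: it has been over-optimized for the training set and does not generalize to data outside it.

More epochs are not always better. To prevent overfitting we train for only 4 epochs, retraining the model with epochs = 4. Calling fit again on the same model object continues from the weights it already has, so rebuild and recompile the model first to start from scratch.
Note: retraining may fail with MemoryError (out of memory); restarting the kernel can help.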

#fit
History = model.fit(partial_x_train,partial_y_train,
                    epochs = 4,batch_size = 512,
                    validation_data = (x_val,y_val))
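
The number of epochs can also be read off the recorded history instead of the plot. A minimal sketch, using made-up validation losses rather than the values from the run above:

```python
# Hypothetical validation losses for six epochs (not the values from the run above).
val_loss_values = [0.52, 0.38, 0.31, 0.29, 0.30, 0.34]

# Position of the smallest validation loss; +1 because epochs are counted from 1.
best_index = min(range(len(val_loss_values)), key=lambda i: val_loss_values[i])
best_epoch = best_index + 1
print(best_epoch, val_loss_values[best_index])
```

With a real run, history_dict['val_loss'] would take the place of the hand-written list. Validation loss is noisy, so treat the result as a rough guide rather than an exact optimum.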

Evaluating on the test set
Use evaluate to check the result:

results = model.evaluate(x_test,y_test)
print(results)

25000/25000 [==============================] - 4s 153us/step
[0.30375005383491516, 0.87712]
The model reaches an accuracy of 0.877, which is only a mediocre result.

Use predict to look at the predictions:

y_pre = model.predict(x_test)
print(y_pre)

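The output is the probability that each sample belongs to class 1. For some samples the model is very confident (a probability above 0.9 means class 1, below 0.1 means class 0); for the rest it is less sure.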
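
Turning these probabilities into labels takes one more step. The sketch below uses the common 0.5 cut-off and the 0.1 / 0.9 confidence bounds mentioned above; the function names and the probability values are invented for illustration.

```python
def to_labels(probabilities, threshold=0.5):
    """Class 1 if the predicted probability exceeds the threshold, else class 0."""
    return [1 if p > threshold else 0 for p in probabilities]

def is_confident(p, low=0.1, high=0.9):
    """True when the model is very sure either way."""
    return p < low or p > high

probs = [0.97, 0.03, 0.55, 0.42]          # hypothetical model outputs
labels = to_labels(probs)
confident = [is_confident(p) for p in probs]
print(labels)
print(confident)
```

With the real model, the flattened output of model.predict(x_test) would take the place of probs.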
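Further improvements
Possible next steps are to vary the number of layers, the number of units, the loss function, and the activation function. The rmsprop optimizer can usually be left as it is, since it generally works well.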

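3. Multiclass classification: classifying newswires

The previous section covered binary classification. This section covers multiclass classification: the newswires fall into 46 topics, that is, 46 classes. (Multiclass problems come in two kinds: single-label multiclass and multilabel multiclass.)
Here each newswire belongs to exactly one class, so this is single-label multiclass classification. If a newswire could be assigned to several topics, it would be multilabel multiclass classification.
Loading the dataset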

from keras.datasets import reuters
(train_data,train_labels),(test_data,test_labels) = reuters.load_data(num_words = 10000)

print(len(train_data))
print(len(test_data))

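As before, the downloaded dataset is stored in the datasets folder on the C: drive, so it does not need to be downloaded again on the second run.
There are 8982 training samples and 2246 test samples.
train_data[10]
Out[5]:
[1, 245, 273, 207, 156, 53, 74, 160, 26, 14, 46, 296, 26, 39, 74, 2979, 3554, 14, 46, 4689, 4329, 86, 61, 3499, 4795, 14, 61, 451, 4329, 17, 12]
As with IMDB, each sample is a list of integers (word indices).
Decoding the indices back into newswire text

### View the newswire text
word_index = reuters.get_word_index()  # dictionary mapping each word in the reuters dataset to its index
reverse_word_index = dict([(value,key) for (key,value) in word_index.items()])  # invert it so that it maps index to word
decoded_newswire = ' '.join([reverse_word_index.get(i-3,'?') for i in train_data[0]])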

‘? ? ? said as a result of its december acquisition of space co it expects earnings per share in 1987 of 1 15 to 1 30 dlrs per share up from 70 cts in 1986 the company said pretax net should rise to nine to 10 mln dlrs from six mln dlrs in 1986 and rental operation revenues to 19 to 22 mln dlrs from 12 5 mln dlrs it said cash flow per share this year should be 2 50 to three dlrs reuter 3’
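Vectorizing the data
The integer sequences of varying length are one-hot encoded into a 2D tensor whose rows all have the same length. Because there are 46 classes, with integer labels from 0 to 45, the labels are one-hot encoded as well.

import numpy as np
### One-hot vectorize the data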
def vectorize_sequences(sequences,dimension = 10000):
    results = np.zeros((len(sequences),dimension))
    for i,sequence in enumerate(sequences):
        results[i,sequence] = 1
    return results
    
x_train = vectorize_sequences(train_data)
x_test = vectorize_sequences(test_data)
### One-hot vectorize the labels
def to_one_hot(labels,dimension = 46):
    results = np.zeros((len(labels),dimension))
    for i,label in enumerate(labels):
        results[i,label] = 1.
    return results

one_hot_train_labels = to_one_hot(train_labels)
one_hot_test_labels = to_one_hot(test_labels)

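The labels can also be one-hot encoded with the built-in Keras function:

### The built-in Keras function performs the same vectorization
# (newer Keras versions import it with: from keras.utils import to_categorical)
#from keras.utils.np_utils import to_categorical
#
#one_hot_train_labels = to_categorical(train_labels)
#one_hot_test_labels = to_categorical(test_labels)

Defining the model
The 16-unit layers used before are no longer adequate. The output now has 46 dimensions, and a 16-dimensional intermediate layer is too small: it would lose a lot of relevant information. So the intermediate layers here have 64 units. The first layer maps the 10000-dimensional input to 64 dimensions, the second keeps the intermediate space at 64 dimensions, and the output layer has 46 units, mapping 64 dimensions to 46.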

### Define the model
from keras import models
from keras import layers

model = models.Sequential()
model.add(layers.Dense(64,activation = 'relu',input_shape = (10000,)))  # first layer: 64 units
model.add(layers.Dense(64,activation = 'relu'))  # intermediate space: 64 dimensions
model.add(layers.Dense(46,activation = 'softmax'))  # output layer: 46 units, one per class
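
As a rough check on the size of this network, the number of trainable parameters of a Dense layer is inputs times units (the weights) plus units (the biases). The sketch below applies that formula to the three layers above; dense_params is a helper written for this example, not a Keras function.

```python
def dense_params(inputs, units):
    """Weights (inputs * units) plus one bias per unit."""
    return inputs * units + units

# Input size followed by the unit counts of the three Dense layers above.
layer_sizes = [10000, 64, 64, 46]
per_layer = [dense_params(i, u) for i, u in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)
print(per_layer, total)
```

Almost all parameters sit in the first layer, because it has to connect every one of the 10000 input dimensions to 64 units.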

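Compiling the model

### Compile the model
model.compile(optimizer = 'rmsprop',
              loss = 'categorical_crossentropy',
              metrics = ['accuracy'])

The model is compiled with categorical_crossentropy (categorical cross-entropy), which measures the distance between two probability distributions.
Setting aside a validation set

### Set aside a validation set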
x_val = x_train[:1000]
y_val = one_hot_train_labels[:1000]

practical_x_train = x_train[1000:]
practical_y_train = one_hot_train_labels[1000:]

Training the model

### Train the model
history = model.fit(practical_x_train,
                    practical_y_train,
                    epochs = 20,
                    batch_size = 512,
                    validation_data = (x_val,y_val))

Plotting the training and validation loss

### Plot the training and validation loss
import matplotlib.pyplot as plt
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(1,len(loss) + 1)

plt.plot(epochs,loss,'bo',label = 'Training loss')
plt.plot(epochs,val_loss,'r',label = 'Validation loss')
plt.title('Training and validation loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()

plt.show()

Plotting the training and validation accuracy

### Plot the training and validation accuracy
import matplotlib.pyplot as plt
acc = history.history['acc']
val_acc = history.history['val_acc']

epochs = range(1,len(acc) + 1)

plt.plot(epochs,acc,'bo',label = 'Training acc')
plt.plot(epochs,val_acc,'r',label = 'Validation acc')
plt.title('Training and Validation accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()

plt.show()

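As the number of epochs grows, the training loss keeps falling and the training accuracy keeps rising, but the validation loss and accuracy stop improving after about 9 epochs, which indicates overfitting. We therefore retrain the model for only 9 epochs (rebuilding and recompiling it first, so that training does not continue from the weights already learned):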

model.fit(practical_x_train,
          practical_y_train,
          epochs = 9,
          batch_size = 512,
          validation_data = (x_val,y_val))

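# Evaluate the model trained for 9 epochs on the test set
result = model.evaluate(x_test,one_hot_test_labels)

In [38]:result
Out[38]: [0.9764915253578926, 0.7862867320210193]
The accuracy is about 78.6%.

Performance of a random classifier
For a binary problem, a random classifier gets half of the samples right. This is a multiclass problem, so a random classifier's accuracy is much lower: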

import copy
test_labels_copy = copy.copy(test_labels)
np.random.shuffle(test_labels_copy)
hits_array = np.array(test_labels) == np.array(test_labels_copy)
print(float(np.sum(hits_array))/len(test_labels))

0.20035618878005343
The accuracy is only about 20%.
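
A baseline of about 20% may look high for 46 classes, where uniform guessing would give roughly 1/46, or about 2%. The reason is that shuffling the labels keeps the class frequencies: if class k makes up a fraction p_k of the labels, the expected accuracy of the shuffled labels is the sum of p_k squared, which equals 1/K only when all K classes are equally frequent. The sketch below checks this on invented label lists; it does not use the Reuters data.

```python
from collections import Counter

def shuffled_baseline_accuracy(labels):
    """Expected accuracy when the labels are compared with a random shuffle of themselves."""
    counts = Counter(labels)
    n = len(labels)
    # A position holding class k matches the shuffled label with probability n_k / n.
    return sum((c / n) ** 2 for c in counts.values())

imbalanced = [0] * 6 + [1] * 3 + [2] * 1     # hypothetical: 60%, 30%, 10%
balanced = [0, 1] * 5                        # hypothetical: two equal classes

baseline_imbalanced = shuffled_baseline_accuracy(imbalanced)
baseline_balanced = shuffled_baseline_accuracy(balanced)
print(baseline_imbalanced, baseline_balanced)
```

So the 20% figure suggests that a few topics account for a large share of the newswires.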

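Classifying the test set with the trained model
The predict method returns a probability distribution over the 46 topics (classes):

predictions = model.predict(x_test)

predictions[0].shape
Out[47]: (46,)
Each vector in predictions has 46 dimensions, one probability per class, and the 46 probabilities sum to 1.
Turning the probabilities into a final class

### From probabilities to class labels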
y_pre = np.zeros((len(predictions),1))
for i in range(len(predictions)):
    y_pre[i] = np.argmax(predictions[i])

This gives the predicted class for each sample.
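
The two properties used here, that softmax outputs sum to 1 and that the predicted class is the position of the largest probability, can be checked by hand. The following pure-Python sketch uses three invented raw scores instead of the 46 real outputs.

```python
import math

def softmax(scores):
    """Exponentiate and normalize so that the outputs are positive and sum to 1."""
    m = max(scores)                            # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def argmax(values):
    """Index of the largest value (the predicted class)."""
    return max(range(len(values)), key=lambda i: values[i])

probs = softmax([2.0, 1.0, 0.1])               # hypothetical raw scores for 3 classes
predicted_class = argmax(probs)
print(probs, sum(probs), predicted_class)
```

For the whole prediction matrix, np.argmax(predictions, axis=1) does the same thing as the loop above in a single call.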

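Note:
In the vectorization step, labels can be encoded in two ways: 1. one-hot encoding; 2. converting the labels to an integer tensor.
The code above uses the first method, but the second works too.

### Encode the labels as an integer tensor
#y_train = np.array(train_labels)
#y_test = np.array(test_labels)

With integer labels, the loss function in the compile step has to change from categorical_crossentropy to sparse_categorical_crossentropy. The two compute the same loss and differ only in the label format they expect.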
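
That the two losses agree can be verified on a single sample. The sketch below is a plain-Python illustration of the formulas, not the Keras implementation, and the probabilities are invented.

```python
import math

def categorical_crossentropy(one_hot, probs):
    """-sum(t * log(p)) over all classes; only the true class contributes."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))

def sparse_categorical_crossentropy(label, probs):
    """Same quantity, but the true class is given as an integer index."""
    return -math.log(probs[label])

probs = [0.7, 0.2, 0.1]        # hypothetical predicted distribution over 3 classes
label = 1                      # integer label
one_hot = [0, 1, 0]            # the same label, one-hot encoded

loss_dense = categorical_crossentropy(one_hot, probs)
loss_sparse = sparse_categorical_crossentropy(label, probs)
print(loss_dense, loss_sparse)
```

The integer form avoids building a one-hot matrix, which saves memory when there are many classes.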

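What happens if the intermediate layer is too small?
Let us modify the network: the first layer still has 64 units and the output layer still has 46, but the intermediate space is squeezed down to 4 dimensions: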

model = models.Sequential()
model.add(layers.Dense(64,activation = 'relu',input_shape = (10000,)))
model.add(layers.Dense(4,activation = 'relu'))
model.add(layers.Dense(46,activation = 'softmax'))

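After compiling and training this model as before, check its accuracy:
result = model.evaluate(x_test,one_hot_test_labels)
print(result)
[1.4833514590615793, 0.6660730187529872]
The accuracy drops to about 66%.
The reason is that the network tries to compress a large amount of information into a very small space and loses much of it along the way.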

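4. Regression: predicting house prices

Besides classification there is regression. A classification model outputs a class, whereas a regression model predicts a continuous value rather than a discrete label. (Logistic regression, despite its name, is a classification algorithm, not a regression algorithm.)
Loading the data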

from keras.datasets import boston_housing
(train_data,train_labels),(test_data,test_labels) = boston_housing.load_data()

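The dataset can be downloaded from my Baidu Cloud share:
Link: https://pan.baidu.com/s/19Sk7kcyRyXKLcKL-ZWy63Q
Extraction code: 9qgr
Preprocessing the data
The features have very different ranges and orders of magnitude, so they need to be standardized:
for each column, subtract the column mean and divide by the column standard deviation. The test data is scaled with the mean and standard deviation computed on the training data.

mean = train_data.mean(axis = 0)  # mean of each column (feature)
train_data -=mean
std = train_data.std(axis = 0)  # standard deviation of each column
train_data /=std

test_data -=mean
test_data /=std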
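
Here is the same standardization on a single made-up feature column, written with the standard library so that the numbers are easy to follow. The column values are invented; pstdev is the population standard deviation, which is also what NumPy's std computes by default.

```python
import statistics

train_column = [2.0, 4.0, 6.0, 8.0]   # hypothetical feature values in the training set
test_column = [5.0, 9.0]              # hypothetical feature values in the test set

mean = statistics.mean(train_column)
std = statistics.pstdev(train_column)

# Both sets are scaled with statistics from the training data only.
train_scaled = [(v - mean) / std for v in train_column]
test_scaled = [(v - mean) / std for v in test_column]
print(train_scaled)
print(test_scaled)
```

After scaling, the training column has mean 0 and standard deviation 1. The test column generally does not, and that is fine: no quantity computed on the test data should ever enter the workflow.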

Defining the model

## Define the model
from keras import models
from keras import layers

def build_model():
    model = models.Sequential()
    # The model is instantiated several times, so a function is used to build it
    model.add(layers.Dense(64,activation = 'relu',input_shape = (train_data.shape[1],)))
    model.add(layers.Dense(64,activation = 'relu'))
    model.add(layers.Dense(1))
    model.compile(optimizer = 'rmsprop',loss = 'mse',metrics = ['mae'])
    return model

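There are few samples this time, so we use a fairly small network: two intermediate layers of 64 units each, followed by an output layer with a single unit. This is the usual setup for regression. The output layer has no activation function; an activation such as sigmoid would restrict the output to the range 0 to 1.
Loss function and metric:
MSE: mean squared error, the mean of the squared differences between predictions and targets
MAE: mean absolute error, the mean of the absolute differences between predictions and targets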
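
Both quantities are easy to compute by hand. A minimal sketch on three invented targets and predictions (in thousands of dollars, like the dataset's targets):

```python
def mse(y_true, y_pred):
    """Mean of the squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean of the absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

targets = [20.0, 30.0, 25.0]        # hypothetical true prices
predictions = [22.0, 27.0, 25.0]    # hypothetical predicted prices

mse_value = mse(targets, predictions)   # errors 2, -3, 0 -> (4 + 9 + 0) / 3
mae_value = mae(targets, predictions)   # (2 + 3 + 0) / 3
print(mse_value, mae_value)
```

MAE is in the same unit as the target, which makes it easier to interpret, while MSE weights large errors more heavily.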
Cross-validation (K-fold validation)
This is what is usually called cross validation.

import numpy as np
k = 4  # split the training data into four folds
num_val_samples = len(train_data) //k  # integer division: size of one fold
num_epochs = 100
all_scores = []

for i in range(k):
    print('processing fold #',i)
    # Use fold i as the validation set
    val_data = train_data[i*num_val_samples:(i+1)*num_val_samples]
    val_labels = train_labels[i*num_val_samples:(i+1)*num_val_samples]
    
    # Use everything except fold i as the training set
    partial_train_data = np.concatenate([train_data[:i*num_val_samples],train_data[(i+1)*num_val_samples:]],axis = 0)
    partial_train_labels = np.concatenate([train_labels[:i*num_val_samples],train_labels[(i+1)*num_val_samples:]],axis = 0)

    ## Build and train a fresh model for each split and compute its score
    model = build_model()
    model.fit(partial_train_data,partial_train_labels,epochs = num_epochs,batch_size = 1,verbose = 0)
    # verbose = 0 means silent mode
    val_mse,val_mae = model.evaluate(val_data,val_labels,verbose = 0)
    # Evaluate on the validation fold and record the score
    all_scores.append(val_mae)

The whole process took about ten minutes.
(np.concatenate joins arrays together.)
verbose controls log output:
verbose = 0 prints no log output to stdout
verbose = 1 prints a progress bar
verbose = 2 prints one line per epoch
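
The slicing in the loop above is easier to follow on a small example. The sketch below reproduces only the index arithmetic in pure Python, for 10 samples and 4 folds; k_fold_indices is a name made up for this illustration.

```python
def k_fold_indices(num_samples, k):
    """Return (train_indices, val_indices) for each fold, using the same slicing as above."""
    fold_size = num_samples // k
    folds = []
    for i in range(k):
        start, stop = i * fold_size, (i + 1) * fold_size
        val_idx = list(range(start, stop))
        # Everything before and after the validation slice is used for training.
        train_idx = list(range(0, start)) + list(range(stop, num_samples))
        folds.append((train_idx, val_idx))
    return folds

folds = k_fold_indices(10, 4)
for train_idx, val_idx in folds:
    print(val_idx, train_idx)
```

With 10 samples and 4 folds the fold size is 2, so samples 8 and 9 are never used for validation, only for training. When the number of samples is divisible by k, nothing is left over.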

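The score of each fold is stored in all_scores; their mean is:
np.mean(all_scores)
Out[7]: 2.3953994822384113

So far each fold was trained for 100 epochs and then scored once. To see the score after every epoch we need the History object returned by fit.
The number of epochs is also raised to 300.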

import numpy as np
k = 4  # split the training data into four folds
num_val_samples = len(train_data) //k
num_epochs = 300
all_scores = []
all_mae_histories = []

for i in range(k):
    print('processing fold #',i)
    # Use fold i as the validation set
    val_data = train_data[i*num_val_samples:(i+1)*num_val_samples]
    val_labels = train_labels[i*num_val_samples:(i+1)*num_val_samples]
    
    # Use everything except fold i as the training set
    partial_train_data = np.concatenate([train_data[:i*num_val_samples],train_data[(i+1)*num_val_samples:]],axis = 0)
    partial_train_labels = np.concatenate([train_labels[:i*num_val_samples],train_labels[(i+1)*num_val_samples:]],axis = 0)

    ## Build and train a fresh model for each split
    model = build_model()
    ## Keep the History object to see the score of every epoch
    history = model.fit(partial_train_data,partial_train_labels,
                        validation_data = (val_data,val_labels),
                        epochs = num_epochs,batch_size = 1,verbose = 0)
                        
    # The validation fold is passed straight to fit here, so there is no separate evaluate call
    
    # Collect the per-epoch validation MAE of this fold into all_mae_histories
    # (the key name depends on the Keras version; newer releases use 'val_mae')
    mae_history = history.history['val_mean_absolute_error']
    all_mae_histories.append(mae_history)

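history records the score of each of the 300 epochs. This takes a very long time.
Inspect history.history by assigning it to a variable, a = history.history:
history.history is a dictionary with four keys.
The value stored under each key has 300 entries, one per training epoch, so every epoch has four scores.

all_mae_histories:
all_mae_histories holds, for each of the four splits, the 300 validation MAE values obtained over 300 epochs of training. These have to be averaged across the four splits: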

average_mae_history = [np.mean([x[i] for x in all_mae_histories]) for i in range(num_epochs)]
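
The list comprehension above averages epoch by epoch across the folds. The same operation on a tiny made-up example with two folds and three epochs:

```python
# Hypothetical per-epoch validation MAE for two folds and three epochs.
toy_histories = [
    [4.0, 3.0, 2.0],   # fold 0
    [2.0, 3.0, 4.0],   # fold 1
]

# zip(*...) groups the values of the same epoch from every fold together.
toy_average = [sum(epoch_values) / len(epoch_values) for epoch_values in zip(*toy_histories)]
print(toy_average)
```

The result has one value per epoch, not one per fold, which is what the plot of validation MAE against epochs needs.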

Plotting the validation scores

import matplotlib.pyplot as plt
plt.plot(range(1,len(average_mae_history) + 1),average_mae_history)
plt.xlabel('Epochs')
plt.ylabel('Validation MAE')
plt.show()

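The plot shows that the MAE of the first ten points is very large, which stretches the vertical axis. We therefore drop the first ten points and smooth the curve: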

def smooth_curve(points,factor = 0.9): #平滑处理
    smoothed_points = []
    for point in points:
        if smoothed_points:
            previous = smoothed_points[-1]
            smoothed_points.append(previous*factor + point*(1-factor))
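            # each smoothed point is 0.9 times the previous smoothed point plus 0.1 times the current point
        else:
            smoothed_points.append(point)
    return smoothed_points  # mind the indentation of return: it must sit outside the for loop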
    
smooth_mae_history = smooth_curve(average_mae_history[10:])

plt.plot(range(1,len(smooth_mae_history) + 1),smooth_mae_history)
plt.xlabel('Epochs')
plt.ylabel('Validation MAE')
plt.show()

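The validation MAE is lowest at around epoch 80, so we set epochs to 80, train the final model, and run it on the test set.
Training the final model
Everything so far was a search for good hyperparameters. With those fixed, we train the final model with the chosen settings.

#### Train the final model, epochs = 80
model = build_model()  # same model as before; only the training setup is now fixed
model.fit(train_data,train_labels,epochs = 80,batch_size = 16,verbose = 0)
# Evaluate on the test set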
test_mse_score,test_mae_score = model.evaluate(test_data,test_labels)

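The mean absolute error is still 3.86, which means the predicted prices are off from the true prices by about 3865 dollars (the targets are in thousands of dollars).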
