Feedforward neural networks & `tf.keras`
- Interlude 1: Handling devices with TensorFlow
- Downloading data and some preprocessing
- Interlude 2: Iterators and generators in Python
- Loading the data in TensorFlow with `tf.data`
- Defining the model using layers
- Losses and metrics in `tf.keras`
- Interlude 3: merging multiple lists with `zip`
- The main training loop
- Model training with `compile` and `fit`
A note before we start:
This notebook shows in detail how to implement a feedforward neural network. The main structure of the training code is:
- download and preprocess the data → load it with `tf.data` → define the model with layers → define the losses and metrics → train the model
Note: the model is trained in two different ways:
- by writing the training loop ourselves: computing the gradients and the loss, updating the parameters, and computing the accuracy;
- by calling `fit` directly, which is quick and simple.
# Import TF and check its version (this notebook was written with 2.3.1)
import tensorflow as tf
print(tf.__version__)
Interlude 1: Handling devices with TensorFlow
# List the available devices
# "XLA_CPU" and "XLA_GPU" refers to using the devices with the new XLA accelerator: https://www.tensorflow.org/xla
print(tf.config.list_logical_devices())
# We can check which device is currently used by inspecting a newly created tensor
x = tf.random.normal((3, 1))
print(x.device)
# We can also specify a certain device placement for an operation:
with tf.device('CPU:0'):
    x = tf.random.normal((3, 1))
print(x.device)
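If you want to check programmatically whether a GPU is visible before placing operations, here is a minimal sketch using the standard tf.config API (nothing below is specific to this notebook):
# Sketch: list the physical GPUs seen by TensorFlow
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    print(f'{len(gpus)} GPU(s) available, e.g. {gpus[0].name}')
else:
    print('No GPU found, running on CPU')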
Downloading data and some preprocessing
We use a simple dataset for diagnosing drive faults from numerical features.
Download: the Sensorless Drive Diagnosis dataset (Sensorless_drive_diagnosis.txt) from the UCI Machine Learning Repository.
import pandas as pd
sensorless = pd.read_csv('Sensorless_drive_diagnosis.txt', header=None, sep=' ')
# The first 48 columns are the numerical features, while the last column is the class (1, ..., 11)
sensorless.head(5)
# Note all the preprocessing: X must be numeric (float32), while the output are integers (int64).
# We also ensure that the output is shaped as (n, 1), and that the index for the class starts from 0.
X = sensorless.values[:, 0:-1].astype('float32')
y = sensorless.values[:, -1:].astype('int64') - 1
print(X.shape)
print(y.shape)
# Classes are perfectly balanced (as per the dataset description, check the link above)
import matplotlib.pyplot as plt
_ = plt.hist(y, bins=11, rwidth=0.9)
# train_test_split splits the dataset; by default 25% of the data goes to the test set and 75% to the training set
from sklearn import model_selection
X_tr, X_tst, y_tr, y_tst = model_selection.train_test_split(X, y, stratify=y)
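Because we pass stratify=y, the class proportions should be preserved in both splits. A quick sanity check with NumPy (a small sketch, not part of the original notebook):
import numpy as np
# Each of the 11 classes should account for roughly 1/11 of both splits
print(np.unique(y_tr, return_counts=True)[1] / len(y_tr))
print(np.unique(y_tst, return_counts=True)[1] / len(y_tst))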
Interlude 2: Iterators and generators in Python
# An iterator is any object that can be used inside a for-loop, like range
for i in range(4):
    print(i)
# Iterators can be used to generate data on-the-fly, as it is being consumed by the for-loop.
range(4)
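Under the hood, the for-loop calls iter() on the object and then repeatedly calls next() until StopIteration is raised; the snippet below makes this explicit:
# What the for-loop does behind the scenes
it = iter(range(4))
print(next(it))  # 0
print(next(it))  # 1
# ...and so on, until next(it) raises StopIteration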
# To build an iterator, you need to construct a specific class implementing __iter__ and __next__ (https://wiki.python.org/moin/Iterator).
# A simpler way is to use generators, exploiting the keyword yield, as below.
def gen_custom_numbers():
    own_list = [3.0, 2.4, -12, 8]
    for n in own_list:
        yield n
# Note: gen_custom_numbers() builds the iterator, which is then consumed by the for-loop.
for n in gen_custom_numbers():
    print(n)
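This is also how generators connect to the next section: tf.data can wrap a Python generator directly. A minimal sketch using the generator defined above (output_types is the argument available in TF 2.3):
# Sketch: build a tf.data.Dataset directly from the generator function
gen_dataset = tf.data.Dataset.from_generator(gen_custom_numbers, output_types=tf.float32)
for n in gen_dataset:
    print(n)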
Loading the data in TensorFlow with tf.data
Before going into this section, it is a good idea to check the basic tf.data guide: https://www.tensorflow.org/guide/data.
The tf.data.Dataset class makes it easy to create iterables from your data.
# Load the data inside a tf.data.Dataset object, so that inputs and labels are iterated together.
# Note: from_tensor_slices specifies that each row of the matrices is a single element of the dataset.
# Doing tf.data.Dataset.from_tensors((X_tr, y_tr)) would create a dataset with *a single* element (containing the two tensors).
train_dataset = tf.data.Dataset.from_tensor_slices((X_tr, y_tr))
# Shuffle the data and group it into batches of 32 elements
# We can create pipelines by concatenating operations on the data. It is important to understand
# that the operations are not run here, but only defined. Execution happens when the iterator is consumed.
train_dataset = train_dataset.shuffle(1000).batch(32)
for data in train_dataset:
    print(data[0].shape)
    break
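A possible extension, not in the original notebook: let tf.data prepare the next batch while the current one is being consumed. In TF 2.3 the autotuning constant lives under tf.data.experimental:
# Optional (a sketch): overlap data preparation with model execution
train_dataset = train_dataset.prefetch(tf.data.experimental.AUTOTUNE)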
# We can apply custom functions to our dataset. In this case, we need a simple function
# to only take the input elements for the Normalization layer.
def take_first_element(xb, yb):
    return xb
from tensorflow.keras.layers.experimental.preprocessing import Normalization
normalizer = Normalization()
# Note: we are using a tf.data.Dataset here, meaning that the adapt function will work on mini-batches.
normalizer.adapt(train_dataset.map(take_first_element))
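After adapt, the layer should map the training features to approximately zero mean and unit variance. A quick sanity check on the batch we inspected earlier (a sketch, assuming data still holds that batch):
# Per-batch statistics are only approximately 0 and 1 (adapt uses the whole training set)
xb_norm = normalizer(data[0])
print(tf.reduce_mean(xb_norm), tf.math.reduce_std(xb_norm))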
Defining the model using layers
from tensorflow.keras import layers
# A single Dense layer is equivalent to a linear layer (w@x + b), possibly with an activation function.
model = layers.Dense(11)
# When we run the layer for the first time, we create the internal variables.
print(model(data[0]).shape)
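The variables created by this first call are the kernel and the bias; we can inspect them to confirm the shapes (a small sketch, assuming the batch has 48 input features as above):
# Dense(11) applied to 48 features builds a (48, 11) kernel and a (11,) bias
for v in model.trainable_variables:
    print(v.name, v.shape)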
# There are multiple ways of building models from layers. Sequential is the easiest one: it simply stacks the layers one after the other.
from tensorflow.keras import Sequential
# For TF 2.2.0, you can add layers.Input((48,)) at the beginning to ensure that the shapes are correctly computed.
# We are including the preprocessing layer as part of the model architecture.
model = Sequential(layers=[
    normalizer,
    # layers.Dense(50, activation='relu'),
    layers.Dense(50, activation='relu'),
    layers.Dense(11, activation='softmax')
])
print(model(data[0]).shape)
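For reference, the same architecture can also be written with the Keras functional API, which becomes handy for models with multiple inputs or outputs. A sketch of an equivalent model, reusing the adapted normalizer (this model is not used in the rest of the notebook):
# Equivalent model built with the functional API (for illustration only)
inputs = tf.keras.Input(shape=(48,))
h = normalizer(inputs)
h = layers.Dense(50, activation='relu')(h)
outputs = layers.Dense(11, activation='softmax')(h)
functional_model = tf.keras.Model(inputs=inputs, outputs=outputs)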
# These are *all* the parameters, including parameters that are not trained.
model.count_params()
# summary() prints the whole model architecture together with its parameters.
# The Normalization has a series of internal variables that are not trained.
model.summary()
print(len(model.trainable_variables))
Losses and metrics in tf.keras
from tensorflow.keras import losses
# There are two ways of using a loss.
# Approach 1: functional version (note the sparse version, because our targets are defined as indexes and not as one-hot vectors).
y_pred = model(data[0])
tf.reduce_mean(losses.sparse_categorical_crossentropy(data[1], y_pred))
# Approach 2: object-oriented version.
cross_entropy = losses.SparseCategoricalCrossentropy()
cross_entropy(data[1], y_pred)
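A common variant worth knowing (a sketch, not used in this notebook): if the last Dense layer had no softmax and the model returned raw logits, you would build the loss with from_logits=True, which is numerically more stable:
# Only needed when the model outputs logits instead of probabilities (not the case here)
cross_entropy_logits = losses.SparseCategoricalCrossentropy(from_logits=True)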
# The optimizer: plain stochastic gradient descent (SGD)
from tensorflow.keras import optimizers
sgd = optimizers.SGD(learning_rate=1e-3)
Interlude 3: merging multiple lists with zip
a = [1.0, 2.0, 3.0]
b = ['e', 'b', 'f']
for el in zip(a, b):
    print(el)
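In the training loop below, zip is used in exactly this way to pair each gradient with the variable it belongs to, which is the format expected by the optimizer's apply_gradients. A tiny sketch with hypothetical placeholder lists:
# Hypothetical example: apply_gradients expects an iterable of (gradient, variable) pairs
grads_example = ['grad_w', 'grad_b']
vars_example = ['w', 'b']
print(list(zip(grads_example, vars_example)))  # [('grad_w', 'w'), ('grad_b', 'b')]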
The main training loop
In practice this approach is used less often, but since it requires writing out every step of model training by hand, it is very useful for understanding what happens inside a neural network.
# Compute the model's gradients and the loss, then update the parameters with SGD
# The tf.function will compile the function at the first execution, in order to considerably
# speed-up training: https://www.tensorflow.org/api_docs/python/tf/function.
# Carefully read the guide, as the compiled code has a number of important limitations.
# For example: try to return ce.numpy() instead of ce, with and without compilation.
# Can you understand why the former is not working?
@tf.function
def train_step(batch):
    xb, yb = batch
    with tf.GradientTape() as tape:
        # Get the predictions of the model
        y_predicted = model(xb)
        # Compute the average loss of the predictions
        ce = cross_entropy(yb, y_predicted)
    # Get the gradients of the parameters
    grads = tape.gradient(ce, model.trainable_variables)
    # Update the parameters using gradient descent
    sgd.apply_gradients(zip(grads, model.trainable_variables))
    return ce
# Load the test part of the dataset. In practice, we would use a separate validation
# set here. Note that we are not shuffling the dataset, as this is not needed.
test_dataset = tf.data.Dataset.from_tensor_slices((X_tst, y_tst)).batch(32)
print(y_pred[0])
# To go from probabilities to classes, we take the argmax of the predictions.
tf.argmax(y_pred, axis=1)
# Computing the accuracy is strangely complex!
print(tf.reduce_mean(tf.cast(tf.argmax(y_pred, axis=1) == data[1][:, 0], tf.float32)))
from tensorflow.keras import metrics
# Using metrics is generally simpler. Metrics are built to process multiple batches,
# hence the update_state function.
acc = metrics.SparseCategoricalAccuracy()
acc.update_state(data[1], y_pred)
print(acc.result())
ce_history = []
for epoch in range(10):
    # Compute the accuracy for this epoch
    acc = metrics.SparseCategoricalAccuracy()
    for batch in test_dataset:
        xb, yb = batch
        y_pred = model(xb)
        acc.update_state(yb, y_pred)
    print(f'Accuracy at epoch {epoch} is {acc.result().numpy()}')
    # Perform one epoch of training
    for batch in train_dataset:
        ce = train_step(batch)
        ce_history.append(ce.numpy())
import matplotlib.pyplot as plt
plt.plot(ce_history)
plt.plot(pd.Series(ce_history).ewm(halflife=15).mean(), 'r') # A smoothed version of the curve is easier to interpret.
Model training with compile and fit
fit is arguably the simplest and most common way to train a model in tf.keras.
# Compile the model: only a compiled model can be trained with fit.
# Compile writes all the previous training code for us!
model.compile(optimizer=sgd,
              loss=cross_entropy,
              metrics=[acc])
model.fit(train_dataset, epochs=5, validation_data=test_dataset)
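Once the model is compiled, the rest of the high-level API becomes available as well. As a short sketch (not in the original notebook), we can evaluate the trained model on the test set and compute predictions directly:
# evaluate returns the loss and the metrics passed to compile
print(model.evaluate(test_dataset))
# predict returns class probabilities; argmax turns them into class indexes
probs = model.predict(X_tst[:5])
print(tf.argmax(probs, axis=1))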