Feedforward neural networks & `tf.keras`
- Interlude 1: Handling devices with TensorFlow
- Downloading data and some preprocessing
- Interlude 2: Iterators and generators in Python
- Loading the data in TensorFlow with `tf.data`
- Defining the model using layers
- Losses and metrics in `tf.keras`
- Interlude 3: merging multiple lists with `zip`
- The main training loop
- Model training with `compile` and `fit`
A note before we start:
This notebook shows in detail how to implement a feedforward neural network. The main structure of the training code is:
- download and preprocess the data → load it with `tf.data` → define the model with layers → define the losses and metrics → train the model
Note: the model is trained in two different ways:
- by writing the training loop ourselves: computing the gradients and the loss, updating the parameters, and computing the accuracy;
- by calling `fit` directly, which is quick and simple.
# Import TF and check its version (this notebook was written with 2.3.1)
import tensorflow as tf
print(tf.__version__)
Interlude 1: Handling devices with TensorFlow
# List the available devices
# "XLA_CPU" and "XLA_GPU" refers to using the devices with the new XLA accelerator: https://www.tensorflow.org/xla
print(tf.config.list_logical_devices())
# We can check which device is currently used by inspecting a newly created tensor
x = tf.random.normal((3, 1))
print(x.device)
# We can also specify a certain device placement for an operation:
with tf.device('CPU:0'):
    x = tf.random.normal((3, 1))
print(x.device)
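If you want to check programmatically whether a GPU is visible before placing operations, here is a minimal sketch using the standard tf.config API (nothing below is specific to this notebook):
# Sketch: list the physical GPUs seen by TensorFlow
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    print(f'{len(gpus)} GPU(s) available, e.g. {gpus[0].name}')
else:
    print('No GPU found, running on CPU')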
Downloading data and some preprocessing
We use a simple dataset for diagnosing drive faults from numerical features.
Download: the Sensorless Drive Diagnosis dataset (Sensorless_drive_diagnosis.txt) from the UCI Machine Learning Repository.
import pandas as pd
sensorless = pd.read_csv('Sensorless_drive_diagnosis.txt', header=None, sep=' ')
# The first 48 columns are the numerical features, while the last column is the class (1, ..., 11)
sensorless.head(5)
# Note all the preprocessing: X must be numeric (float32), while the output are integers (int64).
# We also ensure that the output is shaped as (n, 1), and that the index for the class starts from 0.
X = sensorless.values[:, 0:-1].astype('float32')
y = sensorless.values[:, -1:].astype('int64') - 1
print(X.shape)
print(y.shape)
# Classes are perfectly balanced (as per the dataset description, check the link above)
import matplotlib.pyplot as plt
_ = plt.hist(y, bins=11, rwidth=0.9)
# train_test_split splits the dataset; by default 25% of the data goes to the test set and 75% to the training set
from sklearn import model_selection
X_tr, X_tst, y_tr, y_tst = model_selection.train_test_split(X, y, stratify=y)
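Because we pass stratify=y, the class proportions should be preserved in both splits. A quick sanity check with NumPy (a small sketch, not part of the original notebook):
import numpy as np
# Each of the 11 classes should account for roughly 1/11 of both splits
print(np.unique(y_tr, return_counts=True)[1] / len(y_tr))
print(np.unique(y_tst, return_counts=True)[1] / len(y_tst))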
Interlude 2: Iterators and generators in Python
# An iterator is any object that can be used inside a for-loop, like range
for i in range(4):
    print(i)
# Iterators can be used to generate data on-the-fly, as it is being consumed by the for-loop.
range(4)
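Under the hood, the for-loop calls iter() on the object and then repeatedly calls next() until StopIteration is raised; the snippet below makes this explicit:
# What the for-loop does behind the scenes
it = iter(range(4))
print(next(it))  # 0
print(next(it))  # 1
# ...and so on, until next(it) raises StopIteration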
# To build an iterator, you need to construct a specific class implementing __iter__ and __next__ (https://wiki.python.org/moin/Iterator).
# A simpler way is to use generators, exploiting the keyword yield, as below.
def gen_custom_numbers():
    own_list = [3.0, 2.4, -12, 8]
    for n in own_list:
        yield n
# Note: gen_custom_numbers() builds the iterator, which is then consumed by the for-loop.
for n in gen_custom_numbers():
    print(n)
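This is also how generators connect to the next section: tf.data can wrap a Python generator directly. A minimal sketch using the generator defined above (output_types is the argument available in TF 2.3):
# Sketch: build a tf.data.Dataset directly from the generator function
gen_dataset = tf.data.Dataset.from_generator(gen_custom_numbers, output_types=tf.float32)
for n in gen_dataset:
    print(n)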
Loading the data in TensorFlow with tf.data
Before going into this section, it is a good idea to check the basic tf.data guide: https://www.tensorflow.org/guide/data.
The tf.data.Dataset class makes it easy to create iterables from your data.
# Load the data inside a tf.data.Dataset object, so that inputs and labels are iterated together.
# Note: from_tensor_slices specifies that each row of the matrices is a single element of the dataset.
# Doing tf.data.Dataset.from_tensors((X_tr, y_tr)) would create a dataset with *a single* element (containing the two tensors).
train_dataset = tf.data.Dataset.from_tensor_slices((X_tr, y_tr))
# Shuffle the data and group it into batches of 32 elements
# We can create pipelines by concatenating operations on the data. It is important to understand
# that the operations are not run here, but only defined. Execution happens when the iterator is consumed.
train_dataset = train_dataset.shuffle(1000).batch(32)
for data in train_dataset:
    print(data[0].shape)
    break
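A possible extension, not in the original notebook: let tf.data prepare the next batch while the current one is being consumed. In TF 2.3 the autotuning constant lives under tf.data.experimental:
# Optional (a sketch): overlap data preparation with model execution
train_dataset = train_dataset.prefetch(tf.data.experimental.AUTOTUNE)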
# We can apply custom functions to our dataset. In this case, we need a simple function
# to only take the input elements for the Normalization layer.
def take_first_element(xb, yb):
    return xb
from tensorflow.keras.layers.experimental.preprocessing import Normalization
normalizer = Normalization()
# Note: we are using a tf.data.Dataset here, meaning that the adapt function will work on mini-batches.
normalizer.adapt(train_dataset.map(take_first_element))
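After adapt, the layer should map the training features to approximately zero mean and unit variance. A quick sanity check on the batch we inspected earlier (a sketch, assuming data still holds that batch):
# Per-batch statistics are only approximately 0 and 1 (adapt uses the whole training set)
xb_norm = normalizer(data[0])
print(tf.reduce_mean(xb_norm), tf.math.reduce_std(xb_norm))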
Defining the model using layers
from tensorflow.keras import layers
# A single Dense layer is equivalent to a linear layer (w@x + b), possibly with an activation function.
model = layers.Dense(11)
# When we run the layer for the first time, we create the internal variables.
print(model(data[0]).shape)
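The variables created by this first call are the kernel and the bias; we can inspect them to confirm the shapes (a small sketch, assuming the batch has 48 input features as above):
# Dense(11) applied to 48 features builds a (48, 11) kernel and a (11,) bias
for v in model.trainable_variables:
    print(v.name, v.shape)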
# There are multiple ways of building models from layers. Sequential is the easiest one: it simply stacks the layers one after the other.
from tensorflow.keras import Sequential
# For TF 2.2.0, you can add layers.Input((48,)) at the beginning to ensure that the shapes are correctly computed.
# We are including the preprocessing layer as part of the model architecture.
model = Sequential(layers=[
    normalizer,
    # layers.Dense(50, activation='relu'),
    layers.Dense(50, activation='relu'),
    layers.Dense(11, activation='softmax')
])
print(model(data[0]).shape)
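For reference, the same architecture can also be written with the Keras functional API, which becomes handy for models with multiple inputs or outputs. A sketch of an equivalent model, reusing the adapted normalizer (this model is not used in the rest of the notebook):
# Equivalent model built with the functional API (for illustration only)
inputs = tf.keras.Input(shape=(48,))
h = normalizer(inputs)
h = layers.Dense(50, activation='relu')(h)
outputs = layers.Dense(11, activation='softmax')(h)
functional_model = tf.keras.Model(inputs=inputs, outputs=outputs)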
# These are *all* the parameters, including parameters that are not trained.
model.count_params()
# summary() prints the whole model architecture together with its parameters.
# The Normalization has a series of internal variables that are not trained.
model.summary()
print(len(model.trainable_variables))
Losses and metrics in tf.keras
from tensorflow.keras import losses
# There are two ways of using a loss.
# Approach 1: functional version (note the sparse version, because our targets are defined as indexes and not as one-hot vectors).
y_pred = model(data[0])
tf.reduce_mean(losses.sparse_categorical_crossentropy(data[1], y_pred))
# Approach 2: object-oriented version.
cross_entropy = losses.SparseCategoricalCrossentropy()
cross_entropy(data[1], y_pred)
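A common variant worth knowing (a sketch, not used in this notebook): if the last Dense layer had no softmax and the model returned raw logits, you would build the loss with from_logits=True, which is numerically more stable:
# Only needed when the model outputs logits instead of probabilities (not the case here)
cross_entropy_logits = losses.SparseCategoricalCrossentropy(from_logits=True)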
# The optimizer: plain stochastic gradient descent (SGD)
from tensorflow.keras import optimizers
sgd = optimizers.SGD(learning_rate=1e-3)
Interlude 3: merging multiple lists with zip
a = [1.0, 2.0, 3.0]
b = ['e', 'b', 'f']
for el in zip(a, b):
    print(el)
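In the training loop below, zip is used in exactly this way to pair each gradient with the variable it belongs to, which is the format expected by the optimizer's apply_gradients. A tiny sketch with hypothetical placeholder lists:
# Hypothetical example: apply_gradients expects an iterable of (gradient, variable) pairs
grads_example = ['grad_w', 'grad_b']
vars_example = ['w', 'b']
print(list(zip(grads_example, vars_example)))  # [('grad_w', 'w'), ('grad_b', 'b')]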
The main training loop
In practice this approach is used less often, but since it requires writing out every step of model training by hand, it is very useful for understanding what happens inside a neural network.
# Compute the model's gradients and the loss, then update the parameters with SGD
# The tf.function will compile the function at the first execution, in order to considerably
# speed-up training: https://www.tensorflow.org/api_docs/python/tf/function.
# Carefully read the guide, as the compiled code has a number of important limitations.
# For example: try to return ce.numpy() instead of ce, with and without compilation.
# Can you understand why the former is not working?
@tf.function
def train_step(batch):
    xb, yb = batch
    with tf.GradientTape() as tape:
        # Get the predictions of the model
        y_predicted = model(xb)
        # Compute the average loss of the predictions
        ce = cross_entropy(yb, y_predicted)
    # Get the gradients of the parameters
    grads = tape.gradient(ce, model.trainable_variables)
    # Update the parameters using gradient descent
    sgd.apply_gradients(zip(grads, model.trainable_variables))
    return ce
# Load the test part of the dataset. In practice, we would use a separate validation
# set here. Note that we are not shuffling the dataset, as this is not needed.
test_dataset = tf.data.Dataset.from_tensor_slices((X_tst, y_tst)).batch(32)
print(y_pred[0])
# To go from probabilities to classes, we take the argmax of the predictions.
tf.argmax(y_pred, axis=1)
# Computing the accuracy is strangely complex!
print(tf.reduce_mean(tf.cast(tf.argmax(y_pred, axis=1) == data[1][:, 0], tf.float32)))
from tensorflow.keras import metrics
# Using metrics is generally simpler. Metrics are built to process multiple batches,
# hence the update_state function.
acc = metrics.SparseCategoricalAccuracy()
acc.update_state(data[1], y_pred)
print(acc.result())
ce_history = []
for epoch in range(10):
    # Compute the accuracy for this epoch
    acc = metrics.SparseCategoricalAccuracy()
    for batch in test_dataset:
        xb, yb = batch
        y_pred = model(xb)
        acc.update_state(yb, y_pred)
    print(f'Accuracy at epoch {epoch} is {acc.result().numpy()}')
    # Perform one epoch of training
    for batch in train_dataset:
        ce = train_step(batch)
        ce_history.append(ce.numpy())
import matplotlib.pyplot as plt
plt.plot(ce_history)
plt.plot(pd.Series(ce_history).ewm(halflife=15).mean(), 'r') # A smoothed version of the curve is easier to interpret.
Model training with compile and fit
fit is arguably the simplest and most common way to train a model in tf.keras.
# Compile the model: only a compiled model can be trained with fit.
# Compile writes all the previous training code for us!
model.compile(optimizer=sgd,
              loss=cross_entropy,
              metrics=[acc])
model.fit(train_dataset, epochs=5, validation_data=test_dataset)
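Once the model is compiled, the rest of the high-level API becomes available as well. As a short sketch (not in the original notebook), we can evaluate the trained model on the test set and compute predictions directly:
# evaluate returns the loss and the metrics passed to compile
print(model.evaluate(test_dataset))
# predict returns class probabilities; argmax turns them into class indexes
probs = model.predict(X_tst[:5])
print(tf.argmax(probs, axis=1))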