Feedforward neural networks & `tf.keras`
- Interlude 1: Handling devices with TensorFlow
- Downloading data and some preprocessing
- Interlude 2: Iterators and generators in Python
- Loading the data in TensorFlow with `tf.data`
- Defining the model using layers
- Losses and metrics in `tf.keras`
- Interlude 3: merging multiple lists with `zip`
- The main training loop
- Model training with `compile` and `fit`
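- Overall workflow: download and load the data → load it into TensorFlow with `tf.data` → define the model with layers → define the losses and metrics → train the model
- One way to train is to follow the neural-network algorithm by hand: compute the loss and the gradients, update the parameters, and compute the accuracy
- The other is to use the built-in training directly (see the last section, on `compile` and `fit`)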
# Import TF and check its version; I am using version 2.3.1
import tensorflow as tf
print(tf.__version__)
Interlude 1: Handling devices with TensorFlow
# List the devices visible to TensorFlow.
# "XLA_CPU" and "XLA_GPU" refer to using the devices with the new XLA accelerator: https://www.tensorflow.org/xla
print(tf.config.list_physical_devices())
# We can check which device is currently used by inspecting a newly created tensor
x = tf.random.normal((3, 1))
print(x.device)
# We can also specify a certain device placement for an operation:
with tf.device('CPU:0'):
    x = tf.random.normal((3, 1))
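As a small sketch of the two ideas above (inspecting and forcing device placement), the snippet below runs a matrix product inside a CPU device scope and checks where the result lives. The tensor names and shapes are illustrative assumptions, and the printed device list depends on the machine.

```python
import tensorflow as tf

# The devices visible to TensorFlow (machine dependent).
print(tf.config.list_physical_devices())

# Every operation created inside this block is placed on the first CPU.
with tf.device('CPU:0'):
    a = tf.random.normal((3, 1))
    b = tf.ones((1, 3))
    c = tf.matmul(a, b)  # (3, 1) @ (1, 3) -> (3, 3)

# The device string of the result ends with the requested device name.
print(c.device)
```

Replacing 'CPU:0' with 'GPU:0' should work in the same way on a machine where a GPU is visible.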
Downloading data and some preprocessing
import pandas as pd
sensorless = pd.read_csv('Sensorless_drive_diagnosis.txt', header=None, sep=' ')
# The first 48 columns are the numerical features, while the last column is the class (1, ..., 11)
# Note all the preprocessing: X must be numeric (float32), while the outputs are integers (int64).
# We also ensure that the output is shaped as (n, 1), and that the index for the class starts from 0.
X = sensorless.values[:, 0:-1].astype('float32')
y = sensorless.values[:, -1:].astype('int64') - 1
# Classes are perfectly balanced (as per the dataset description, check the link above)
import matplotlib.pyplot as plt
_ = plt.hist(y, bins=11, rwidth=0.9)
from sklearn import model_selection
X_tr, X_tst, y_tr, y_tst = model_selection.train_test_split(X, y, stratify=y)
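To make the slicing and type conversions above concrete, here is a minimal sketch on a hypothetical toy table (the values and the names raw, X_demo and y_demo are made up for illustration and are not part of the Sensorless dataset).

```python
import numpy as np

# Hypothetical toy table: two feature columns followed by a class column (1, 2, 3).
raw = np.array([
    [0.1, 0.2, 1],
    [0.3, 0.4, 2],
    [0.5, 0.6, 3],
    [0.7, 0.8, 1],
])

# All columns but the last are the features, cast to float32.
X_demo = raw[:, 0:-1].astype('float32')
# The slice -1: (instead of the index -1) keeps the labels as a column of shape (n, 1);
# subtracting 1 makes the class indexes start from 0.
y_demo = raw[:, -1:].astype('int64') - 1

print(X_demo.shape, X_demo.dtype)
print(y_demo.shape, y_demo.dtype)
print(y_demo.ravel())
```

Using raw[:, -1] instead would return a flat vector of shape (n,), which is why the original code uses the slice.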
Interlude 2: Iterators and generators in Python
# Any iterable object can be used inside a for-loop, like range; the loop asks it for an iterator and pulls values one at a time.
for i in range(4):
    print(i)
# Iterators can be used to generate data on-the-fly, as it is being consumed by the for-loop.
# To build an iterator, you need to construct a specific class implementing __iter__ and __next__ (https://wiki.python.org/moin/Iterator).
# A simpler way is to use generators, exploiting the keyword yield, as below.
def gen_custom_numbers():
    own_list = [3.0, 2.4, -12, 8]
    for n in own_list:
        yield n
# Note: gen_custom_numbers() builds the iterator, which is then consumed by the for-loop.
for n in gen_custom_numbers():
    print(n)
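For comparison with the generator, this is a minimal sketch of the class-based route mentioned above, implementing __iter__ and __next__ by hand. The class name CustomNumbers is a hypothetical name chosen for this example.

```python
class CustomNumbers:
    """Hand-written iterator yielding the same values as the generator version."""

    def __init__(self):
        self._values = [3.0, 2.4, -12, 8]
        self._pos = 0

    def __iter__(self):
        # An iterator returns itself when asked for an iterator.
        return self

    def __next__(self):
        # Raising StopIteration tells the for-loop that the data is exhausted.
        if self._pos >= len(self._values):
            raise StopIteration
        value = self._values[self._pos]
        self._pos += 1
        return value


from_class = list(CustomNumbers())
print(from_class)
```

The generator needs four lines for the same behaviour, because Python creates the __iter__ and __next__ methods and tracks the position for us.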
Loading the data in TensorFlow with tf.data
Before going into this section, it is a good idea to check the basic tf.data guide: https://www.tensorflow.org/guide/data.
The tf.data.Dataset API makes it easy to create an iterable from your data.
# Load the data inside a tf.data.Dataset object, so that inputs and labels are consumed together.
# Note: from_tensor_slices specifies that each row of the matrices is a single element of the dataset.
# Doing tf.data.Dataset.from_tensors((X_tr, y_tr)) would create a dataset with *a single* element (containing the two tensors).
train_dataset = tf.data.Dataset.from_tensor_slices((X_tr, y_tr))
# We can create pipelines by concatenating operations on the data. It is important to understand
# that the operations are not run here, but only defined. Execution happens when the iterator is consumed.
train_dataset = train_dataset.shuffle(1000).batch(32)
for data in train_dataset:
    break  # Keep only the first mini-batch in data, to inspect it and reuse it below
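The difference between from_tensor_slices and from_tensors, and the effect of batch, can be checked on a tiny synthetic example. This is only a sketch: the arrays X_demo and y_demo are made-up stand-ins for the real training data.

```python
import numpy as np
import tensorflow as tf

# Hypothetical data: 10 examples with 2 features each, and one label per example.
X_demo = np.arange(20, dtype='float32').reshape(10, 2)
y_demo = np.arange(10, dtype='int64').reshape(10, 1)

# One dataset element per row of the matrices.
sliced = tf.data.Dataset.from_tensor_slices((X_demo, y_demo))
# A single dataset element holding both full matrices.
whole = tf.data.Dataset.from_tensors((X_demo, y_demo))

n_sliced = sum(1 for _ in sliced)
n_whole = sum(1 for _ in whole)
print(n_sliced, n_whole)

# Nothing is computed when the pipeline is defined; batches are only
# produced while the for-loop (here, the list comprehension) consumes it.
batched = sliced.shuffle(10).batch(4)
batch_sizes = [int(xb.shape[0]) for xb, yb in batched]
print(batch_sizes)
```

With 10 examples and a batch size of 4, the last mini-batch is smaller than the others; batch accepts drop_remainder=True if equally sized batches are required.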
# We can apply custom functions to our dataset. In this case, we need a simple function
# to only take the input elements for the Normalization layer.
def take_first_element(xb, yb):
    return xb
from tensorflow.keras.layers.experimental.preprocessing import Normalization
normalizer = Normalization()
# Note: we are using a tf.data.Dataset here, meaning that the adapt function will work on mini-batches.
normalizer.adapt(train_dataset.map(take_first_element))
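To give an intuition for what adapting on mini-batches can mean, the NumPy sketch below accumulates per-feature statistics batch by batch and then standardizes the data. It illustrates the idea only, under the assumption that normalization amounts to (x - mean) / sqrt(variance) per feature; it is not the actual Keras implementation, and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 100 examples, 4 features, far from zero mean and unit variance.
X_demo = rng.normal(loc=5.0, scale=3.0, size=(100, 4))

# Accumulate the count, the sum and the sum of squares over mini-batches of 32 rows.
count = 0
total = np.zeros(4)
total_sq = np.zeros(4)
for start in range(0, len(X_demo), 32):
    xb = X_demo[start:start + 32]
    count += len(xb)
    total += xb.sum(axis=0)
    total_sq += (xb ** 2).sum(axis=0)

mean = total / count
variance = total_sq / count - mean ** 2  # E[x^2] - (E[x])^2

# Standardize every feature with the accumulated statistics.
X_normalized = (X_demo - mean) / np.sqrt(variance)
print(X_normalized.mean(axis=0), X_normalized.std(axis=0))
```

The accumulated statistics match those computed on the full matrix at once, so the data never has to fit in memory as a single array.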
Defining the model using layers
from tensorflow.keras import layers
# A single Dense layer is equivalent to a linear layer (x @ W + b), possibly with an activation function.
model = layers.Dense(11)
# When we run the layer for the first time, we create the internal variables.
_ = model(data[0])
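The claim that a Dense layer without activation is just a linear map can be verified directly. The sketch below uses a random input (x_demo, a hypothetical stand-in for a mini-batch with 48 features) and compares the layer output with the same computation written by hand.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

dense = layers.Dense(11)

# Hypothetical mini-batch: 5 examples with 48 features each.
x_demo = tf.random.normal((5, 48))

# The first call builds the variables: a kernel of shape (48, 11) and a bias of shape (11,).
out = dense(x_demo)

# The same computation written by hand.
manual = tf.matmul(x_demo, dense.kernel) + dense.bias

print(out.shape, dense.kernel.shape, dense.bias.shape)
print(np.allclose(out.numpy(), manual.numpy(), atol=1e-5))
```

The input size (48) is never passed to the constructor: it is inferred from the data at the first call, which is why the variables only exist afterwards.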
# There are multiple ways of building models from layers. Sequential is the easiest one: it stacks consecutive layers.
from tensorflow.keras import Sequential
# For TF 2.2.0, you can add layers.Input((48,)) at the beginning to ensure that the actual input shape matches the expected one.
# We are including the preprocessing layer as part of the model architecture.
model = Sequential(layers=[
    normalizer,
    # layers.Dense(50, activation='relu'),
    layers.Dense(50, activation='relu'),
    layers.Dense(11, activation='softmax')
])
# These are *all* the parameters, including parameters that are not trained.
# The Normalization has a series of internal variables that are not trained.
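As a rough check on the parameter counts, the sketch below builds the same two Dense layers without the Normalization layer (so that every parameter is trainable) and compares the count reported by Keras with the one computed by hand. The name demo_model and the zero input are assumptions made for this example.

```python
import tensorflow as tf
from tensorflow.keras import Sequential, layers

demo_model = Sequential([
    layers.Dense(50, activation='relu'),
    layers.Dense(11, activation='softmax'),
])

# Calling the model once on a dummy input with 48 features builds all the variables.
_ = demo_model(tf.zeros((1, 48)))

# Weights plus biases of each layer: (48 * 50 + 50) + (50 * 11 + 11).
expected = (48 * 50 + 50) + (50 * 11 + 11)
n_params = demo_model.count_params()
print(n_params, expected)
```

In the full model of this section the Normalization layer adds its own internal statistics on top of these, and those are the variables that are not updated by gradient descent.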
Losses and metrics in tf.keras
from tensorflow.keras import losses
# Approach 1: functional version (note the sparse version, because our targets are defined as indexes and not as one-hot vectors).
y_pred = model(data[0])
tf.reduce_mean(losses.sparse_categorical_crossentropy(data[1], y_pred))
# Approach 2: object-oriented version.
cross_entropy = losses.SparseCategoricalCrossentropy()
cross_entropy(data[1], y_pred)
from tensorflow.keras import optimizers
sgd = optimizers.SGD(learning_rate=1e-3)
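To see what the sparse cross-entropy computes, the following sketch compares both approaches with a manual computation on two hand-made predictions. The probabilities and labels are invented for the example; the manual formula (minus the log of the probability assigned to the true class) is the standard definition, up to the small numerical clipping Keras applies.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import losses

# Hypothetical predictions for 2 examples and 3 classes (each row sums to 1).
probs = tf.constant([[0.7, 0.2, 0.1],
                     [0.1, 0.8, 0.1]])
# Labels given as class indexes with shape (n, 1), not as one-hot vectors.
labels = tf.constant([[0], [1]], dtype=tf.int64)

# Approach 1: the functional version returns one loss value per example.
per_example = losses.sparse_categorical_crossentropy(labels, probs)

# Approach 2: the object-oriented version also averages over the mini-batch.
averaged = losses.SparseCategoricalCrossentropy()(labels, probs)

# Manual computation: -log of the probability assigned to the correct class.
manual = -np.log(np.array([0.7, 0.8]))

print(per_example.numpy(), manual)
print(float(averaged), manual.mean())
```

This is why Approach 1 in the text needs an explicit tf.reduce_mean, while Approach 2 does not.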
Interlude 3: merging multiple lists with zip
a = [1.0, 2.0, 3.0]
b = ['e', 'b', 'f']
for el in zip(a, b):
    print(el)
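A slightly longer sketch of zip, using the same two lists, shows the tuples it produces, the unpacking form that is typically used in loops, and what happens when the lists have different lengths (the list c is added here only for illustration).

```python
a = [1.0, 2.0, 3.0]
b = ['e', 'b', 'f']

# zip pairs the i-th element of each list into a tuple.
pairs = list(zip(a, b))
print(pairs)

# Each tuple can be unpacked directly in the for-loop.
labelled = [f'{name}={value}' for value, name in zip(a, b)]
print(labelled)

# With lists of different lengths, zip stops at the shortest one.
c = [10, 20]
short = list(zip(a, c))
print(short)
```

The same pattern pairs every gradient with the variable it belongs to before an optimizer update.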
The main training loop
# The tf.function decorator will compile the function at the first execution, in order to considerably
# speed up training: https://www.tensorflow.org/api_docs/python/tf/function.
# Carefully read the guide, as the compiled code has a number of important limitations.
# For example: try to return ce.numpy() instead of ce, with and without compilation.
# Can you understand why it fails when the function is compiled?
@tf.function
def train_step(batch):
    xb, yb = batch
    with tf.GradientTape() as tape:
        # Get the predictions of the model
        y_predicted = model(xb)
        # Compute the average loss of the predictions
        ce = cross_entropy(yb, y_predicted)
    # Get the gradients of the parameters
    grads = tape.gradient(ce, model.trainable_variables)
    # Update the parameters using gradient descent
    sgd.apply_gradients(zip(grads, model.trainable_variables))
    return ce
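The mechanics of a single training step can be followed by hand on a toy problem. In the sketch below, two scalar variables and a made-up loss replace the model and the cross-entropy, and the update is written explicitly with assign_sub, which is the plain gradient descent rule (variable minus learning rate times gradient); all names and values are assumptions for illustration.

```python
import tensorflow as tf

w = tf.Variable(3.0)
b = tf.Variable(1.0)
variables = [w, b]
learning_rate = 0.1

# Operations executed inside the tape are recorded for differentiation.
with tf.GradientTape() as tape:
    loss = w ** 2 + 2.0 * b

# d(loss)/dw = 2 * w = 6 and d(loss)/db = 2.
grads = tape.gradient(loss, variables)

# Pair each gradient with its variable and take one gradient descent step.
for grad, var in zip(grads, variables):
    var.assign_sub(learning_rate * grad)

# w becomes 3 - 0.1 * 6 = 2.4 and b becomes 1 - 0.1 * 2 = 0.8.
print(w.numpy(), b.numpy())
```

In train_step the optimizer performs this update through apply_gradients, which also makes it easy to switch to other update rules.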
# Load the test part of the dataset. In practice, we would use a separate validation
# set here. Note that we are not shuffling the dataset, as this is not needed.
test_dataset = tf.data.Dataset.from_tensor_slices((X_tst, y_tst)).batch(32)
# To go from probabilities to classes, we take the argmax of the predictions.
tf.argmax(y_pred, axis=1)
# Computing the accuracy is strangely complex!
print(tf.reduce_mean(tf.cast(tf.argmax(y_pred, axis=1) == data[1][:, 0], tf.float32)))
from tensorflow.keras import metrics
# Using metrics is generally simpler. Metrics are built to process multiple batches,
# hence the update_state function.
acc = metrics.SparseCategoricalAccuracy()
acc.update_state(data[1], y_pred)
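The point of update_state is that a metric keeps a running total across mini-batches. A minimal sketch with two hand-made batches (the labels and probabilities are invented for the example) shows the accumulation:

```python
import numpy as np
from tensorflow.keras import metrics

demo_acc = metrics.SparseCategoricalAccuracy()

# First batch: the predicted classes (argmax) are 0 and 1, both correct.
demo_acc.update_state(np.array([[0], [1]]),
                      np.array([[0.9, 0.1], [0.2, 0.8]], dtype='float32'))
after_first = float(demo_acc.result())

# Second batch: the predicted classes are 0 and 0, so only the second is correct.
demo_acc.update_state(np.array([[1], [0]]),
                      np.array([[0.6, 0.4], [0.7, 0.3]], dtype='float32'))
after_second = float(demo_acc.result())

# 2 correct out of 2, then 3 correct out of 4 overall.
print(after_first, after_second)
```

Because the state persists between calls, a fresh metric object is needed whenever the accuracy of a new pass over the data is wanted.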
ce_history = []
for epoch in range(10):
    # Compute the accuracy for this epoch
    acc = metrics.SparseCategoricalAccuracy()
    for batch in test_dataset:
        xb, yb = batch
        y_pred = model(xb)
        acc.update_state(yb, y_pred)
    print(f'Accuracy at epoch {epoch} is {acc.result().numpy()}')
    # Perform one epoch of training
    for batch in train_dataset:
        ce = train_step(batch)
        # Store the loss of each mini-batch, so that it can be plotted afterwards
        ce_history.append(ce.numpy())
import matplotlib.pyplot as plt
plt.plot(pd.Series(ce_history).ewm(halflife=15).mean(), 'r') # A smoothed version of the curve is easier to interpret.
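Model training with compile and fit
# Compile writes all the previous training code for us!
# The arguments below reuse the optimizer, the loss and the metric defined earlier.
model.compile(optimizer=sgd, loss=cross_entropy, metrics=[metrics.SparseCategoricalAccuracy()])
model.fit(train_dataset, epochs=5, validation_data=test_dataset)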
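For readers who want to try compile and fit in isolation, here is a self-contained sketch on random synthetic data with the same shapes as the problem in this post (48 features, 11 classes). The data is pure noise, so the loss values are meaningless; the names and hyperparameters are assumptions chosen only to keep the example small.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import Sequential, layers, losses, metrics, optimizers

rng = np.random.default_rng(0)
# Hypothetical data: 64 random examples with 48 features and labels in 0..10.
X_demo = rng.normal(size=(64, 48)).astype('float32')
y_demo = rng.integers(0, 11, size=(64, 1)).astype('int64')
demo_dataset = tf.data.Dataset.from_tensor_slices((X_demo, y_demo)).batch(16)

demo_model = Sequential([
    layers.Dense(50, activation='relu'),
    layers.Dense(11, activation='softmax'),
])

# compile stores the optimizer, the loss and the metrics used by fit.
demo_model.compile(
    optimizer=optimizers.SGD(learning_rate=1e-3),
    loss=losses.SparseCategoricalCrossentropy(),
    metrics=[metrics.SparseCategoricalAccuracy()],
)

# fit runs the training loop and returns the per-epoch history.
history = demo_model.fit(demo_dataset, epochs=2, verbose=0)
print(history.history['loss'])
```

The returned history plays the role of the ce_history list from the manual loop, although it stores one value per epoch rather than one per mini-batch.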