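Introduction
A Mixture Density Network (MDN) learns, from the given input data, the parameters of the output distribution: the means, variances, and mixing coefficients of a mixture model.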
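To make "learning distribution parameters" concrete, the sketch below evaluates the quantity such a network is typically trained to minimize: the negative log-likelihood of the targets under a Gaussian mixture. It is a NumPy illustration only, restricted to a 1D output. The function name `mixture_nll` and the argument names `pi`, `mu`, `sigma` are made up for this example and are not part of any library used in this post.

```python
import numpy as np


def mixture_nll(y, pi, mu, sigma):
    """Mean negative log-likelihood of scalar targets under a 1D Gaussian mixture.

    y:             (N,) targets
    pi, mu, sigma: (N, K) mixing coefficients, means, standard deviations
    (hypothetical names chosen for this sketch)
    """
    y = np.asarray(y, dtype=float)[:, None]  # (N, 1), broadcasts against (N, K)
    # Log-density of each Gaussian component at y
    log_norm = (-0.5 * np.log(2.0 * np.pi) - np.log(sigma)
                - 0.5 * ((y - mu) / sigma) ** 2)
    # Weight each component by its mixing coefficient (in log space)
    log_weighted = np.log(pi) + log_norm
    # Numerically stable log-sum-exp over the K components
    m = log_weighted.max(axis=1, keepdims=True)
    log_lik = m[:, 0] + np.log(np.exp(log_weighted - m).sum(axis=1))
    return -log_lik.mean()


# Two samples, two identical unit-variance components centred on each target
y = np.array([0.0, 1.0])
pi = np.array([[0.5, 0.5], [0.5, 0.5]])
mu = np.array([[0.0, 0.0], [1.0, 1.0]])
sigma = np.ones((2, 2))
nll = mixture_nll(y, pi, mu, sigma)
print(nll)
```

In this toy case both components coincide, so the mixture collapses to a single standard normal evaluated at its mean. A trained MDN would instead produce different `pi`, `mu`, and `sigma` for every input.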
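On to the code. All examples in this post are implemented with TensorFlow (Keras). The `mdn` module they import is provided by the keras-mdn-layer package, which in turn depends on tensorflow-probability.
Package versions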
tensorflow 2.11.0
numpy 1.23.4
matplotlib 3.4.2
Examples
1D sine curve fitting
Predicting a 1D output from a 1D input
from tensorflow import keras
import mdn
import numpy as np
import matplotlib.pyplot as plt
# Generating training data
NSAMPLE = 3000
y_data = np.float32(np.random.uniform(-10.5, 10.5, NSAMPLE))
r_data = np.random.normal(size=NSAMPLE)
x_data = np.sin(0.75 * y_data) * 7.0 + y_data * 0.5 + r_data * 1.0
x_data = x_data.reshape((NSAMPLE, 1))
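# Scatter the training data; without plt.show() here it stays on the current
# figure and is overlaid with the predicted samples plotted at the end
plt.scatter(x_data, y_data)
# plt.show()  # uncomment to view the training data in its own figure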
N_HIDDEN = 15 # number of hidden units in the Dense layer
N_MIXES = 10 # number of mixture components
OUTPUT_DIMS = 1 # number of real-values predicted by each mixture component
# Create model
model = keras.Sequential()
model.add(keras.layers.Dense(N_HIDDEN, batch_input_shape=(None, 1), activation='relu'))
model.add(keras.layers.Dense(N_HIDDEN, activation='relu'))
model.add(mdn.MDN(OUTPUT_DIMS, N_MIXES))
model.compile(loss=mdn.get_mixture_loss_func(OUTPUT_DIMS, N_MIXES), optimizer=keras.optimizers.Adam())
model.summary()
# Model train
history = model.fit(x=x_data, y=y_data, batch_size=128, epochs=500, validation_split=0.15)
# Save model, if you need
# model.save('mdn_sine.h5')
# Generating testing data
x_test = np.float32(np.arange(-15, 15, 0.01))
NTEST = x_test.size
print("Testing:", NTEST, "samples.")
x_test = x_test.reshape(NTEST, 1) # needs to be a matrix, not a vector
# Make distributions predictions from the model
y_test = model.predict(x_test)
# Sample from the predicted distributions
y_samples = np.apply_along_axis(mdn.sample_from_output, 1, y_test, OUTPUT_DIMS, N_MIXES, temp=1.0)
# Drop the trailing output dimension (size 1 here) so the samples align with x_test
y_samples = y_samples.squeeze(axis=2)
# Plot test result
plt.scatter(x_test, y_samples)
plt.show()
Result of running the code:
2D sine curve fitting
Predicting a 2D output from a 1D input
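With a 2D target, each row returned by `model.predict` is a flat vector of mixture parameters rather than a point. The sketch below shows one way to picture that vector and how a sample could be drawn from it. It assumes the layout is means, then scales, then unnormalized mixing logits, which is my understanding of keras-mdn-layer but should be checked against the installed version. The helper `sample_from_flat_params` is hypothetical and is not the library's `mdn.sample_from_output`.

```python
import numpy as np


def sample_from_flat_params(params, output_dim, n_mixes, rng):
    """Draw one sample from a flat mixture-parameter vector.

    Assumed layout (an assumption of this sketch):
    [n_mixes*output_dim means | n_mixes*output_dim scales | n_mixes logits]
    """
    params = np.asarray(params, dtype=float)
    n = n_mixes * output_dim
    mus = params[:n].reshape(n_mixes, output_dim)
    sigmas = params[n:2 * n].reshape(n_mixes, output_dim)
    logits = params[2 * n:]
    # Softmax turns the logits into mixing coefficients that sum to 1
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    # Pick a component, then sample from its diagonal Gaussian
    k = rng.choice(n_mixes, p=weights)
    return mus[k] + sigmas[k] * rng.standard_normal(output_dim)


OUTPUT_DIMS, N_MIXES = 2, 10
# Under the assumed layout, each prediction row has this many entries
expected_width = N_MIXES * (2 * OUTPUT_DIMS + 1)

rng = np.random.default_rng(0)
fake_params = np.concatenate([
    rng.normal(size=N_MIXES * OUTPUT_DIMS),   # means
    np.full(N_MIXES * OUTPUT_DIMS, 0.1),      # scales
    np.zeros(N_MIXES),                        # logits (uniform mixing)
])
sample = sample_from_flat_params(fake_params, OUTPUT_DIMS, N_MIXES, rng)
print(expected_width, sample.shape)  # 50 (2,)
```

If the assumption holds, the `y_test.shape` printed in the example below should have 50 columns for `OUTPUT_DIMS = 2` and `N_MIXES = 10`.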
from tensorflow import keras
import mdn
import numpy as np
import matplotlib.pyplot as plt
# Generating training data
NSAMPLE = 5000
z_data = np.float32(np.random.uniform(-10.5, 10.5, NSAMPLE))
r_data = np.random.normal(size=NSAMPLE)
s_data = np.random.normal(size=NSAMPLE)
x_data = np.sin(0.75 * z_data) * 7.0 + z_data * 0.5 + r_data * 1.0
y_data = np.cos(0.80 * z_data) * 6.5 + z_data * 0.5 + s_data * 1.0
x_input = z_data.reshape((NSAMPLE, 1))
y_input = np.array([x_data,y_data])
y_input = y_input.T
# # Plot the data, if you need
# fig = plt.figure()
# ax = fig.add_subplot(111, projection='3d')
# ax.scatter(x_data, y_data, z_data, alpha=0.3, c='r')
# plt.show()
N_HIDDEN = 15
N_MIXES = 10
OUTPUT_DIMS = 2
# Create model
model = keras.Sequential()
model.add(keras.layers.Dense(N_HIDDEN, batch_input_shape=(None, 1), activation='relu'))
model.add(keras.layers.Dense(N_HIDDEN, activation='relu'))
model.add(mdn.MDN(OUTPUT_DIMS, N_MIXES))
model.compile(loss=mdn.get_mixture_loss_func(OUTPUT_DIMS,N_MIXES), optimizer=keras.optimizers.Adam())
model.summary()
# Model train
history = model.fit(x=x_input, y=y_input, batch_size=128, epochs=300, validation_split=0.15, callbacks=[keras.callbacks.TerminateOnNaN()])
# Save model, if you need
# model.save('mdn_2D_sine.h5')
# Plot the loss of training process
plt.figure(figsize=(10, 5))
plt.ylim([0,9])
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.show()
# Generating testing data:
x_test = np.float32(np.arange(-15, 15, 0.1))
NTEST = x_test.size
print("Testing:", NTEST, "samples.")
x_test = x_test.reshape(NTEST, 1) # needs to be a matrix, not a vector
print(x_test.shape)
# Make predictions from the model
y_test = model.predict(x_test)
print(y_test.shape)
# Sample from the predicted distributions
y_samples = np.apply_along_axis(mdn.sample_from_output, 1, y_test, OUTPUT_DIMS, N_MIXES, temp=1.0, sigma_temp=1.0)
# Plot the predicted samples.
fig = plt.figure(figsize=(8, 8))
ax = fig.add_subplot(111, projection='3d')
ax.scatter(x_data, y_data, z_data, alpha=0.1, c='r')  # training data in red
ax.scatter(y_samples.T[0], y_samples.T[1], x_test, alpha=0.1, c='b')  # sampled predictions in blue
plt.show()
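Experimental result:
Trajectory prediction
The main goal of this post is to apply the method to trajectory prediction. We use the simplest possible test: given one position (x, y), predict the next position (x1, y1). In short, a 2D input predicts a 2D output.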
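The "current point predicts next point" setup comes down to shifting the trajectory by one step. The tiny sketch below shows that pairing on five points so the alignment can be checked by eye. The helper name `make_next_step_pairs` is made up for this illustration.

```python
import numpy as np


def make_next_step_pairs(track):
    """Split an (N, D) trajectory into (current, next) pairs of length N-1."""
    track = np.asarray(track)
    return track[:-1], track[1:]


t = np.linspace(0.0, 1.0, 5)
track = np.stack([t, np.sin(t)], axis=1)  # five (x, y) points on a sine curve
current, nxt = make_next_step_pairs(track)
print(current.shape, nxt.shape)  # (4, 2) (4, 2)
```

Row `i` of `nxt` is the point that follows row `i` of `current`, which is the same relationship the full example builds with its `data[:Train_num]` and `data[1:Train_num+1]` slices.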
from tensorflow import keras
import mdn
import numpy as np
import matplotlib.pyplot as plt
# Generating train and test data
NSAMPLE = 3001
Train_num = 2500
x_data = np.linspace(0, 100, NSAMPLE)
y_data = np.sin(x_data)
data = np.array([x_data, y_data]).T
x_train = data[:Train_num, :]
y_train = data[1:Train_num+1, :]
x_test = data[Train_num:NSAMPLE-1, :]
y_test = data[Train_num+1:, :]
# # Plot the data, if you need
# plt.scatter(x_train[:, 0], x_train[:, 1])
# plt.scatter(y_train[:, 0], y_train[:, 1])
# plt.scatter(x_test[:, 0], x_test[:, 1])
# plt.scatter(y_test[:, 0], y_test[:, 1])
# plt.show()
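# Hyperparameters: not defined in the original listing for this example;
# values carried over from the 2D sine example above (2D target, so OUTPUT_DIMS = 2)
N_HIDDEN = 15
N_MIXES = 10
OUTPUT_DIMS = 2
# Create model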
model = keras.Sequential()
model.add(keras.layers.Dense(N_HIDDEN, batch_input_shape=(None, 2), activation='relu'))
model.add(keras.layers.Dense(N_HIDDEN, activation='relu'))
model.add(mdn.MDN(OUTPUT_DIMS, N_MIXES))
model.compile(loss=mdn.get_mixture_loss_func(OUTPUT_DIMS,N_MIXES), optimizer=keras.optimizers.Adam())
model.summary()
# Model train
history = model.fit(x=x_train, y=y_train, batch_size=128, epochs=300, validation_split=0.15, callbacks=[keras.callbacks.TerminateOnNaN()])
# Model save
model.save('mdn_2_2_sine.h5')
# Make predictions from the model
y_pred = model.predict(x_test)
print(y_pred.shape)
# Sample from the predicted distributions
y_samples = np.apply_along_axis(mdn.sample_from_output, 1, y_pred, OUTPUT_DIMS, N_MIXES, temp=1.0, sigma_temp=1.0)
y_samples = y_samples.squeeze(1)
print(y_samples.shape)
plt.scatter(y_samples[:, 0], y_samples[:, 1], c='r')
plt.scatter(y_test[:, 0], y_test[:, 1], c='b')
plt.show()
Trajectory prediction result: