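This post compares three dimensionality reduction methods: PCA, ICA, and t-SNE. Each method is applied to the original data and the result is visualized. The reduced data from each method is then used to train a simple MLP neural network, which serves as a rough measure of how much useful information the reduction preserves. Finally, the prediction results of the three models are compared and displayed.
The dataset used in this post is among the resources I have uploaded, under the name mock_kaggle.csv.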
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from sklearn.decomposition import FastICA,PCA
from sklearn.manifold import TSNE
from sklearn import preprocessing
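Load the data
data = pd.read_csv('mock_kaggle.csv', encoding='gbk', parse_dates=['datetime'])
data = data.iloc[:, 1:]  # drop the first column (presumably the datetime column), keeping the three numeric columns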
data
 | 特价 (units sold) | 股票 (stock) | 价格 (price) |
---|---|---|---|
0 | 0 | 4972 | 1.29 |
1 | 70 | 4902 | 1.29 |
2 | 59 | 4843 | 1.29 |
3 | 93 | 4750 | 1.29 |
4 | 96 | 4654 | 1.29 |
... | ... | ... | ... |
932 | 98 | 3179 | 2.39 |
933 | 108 | 3071 | 2.39 |
934 | 128 | 4095 | 2.39 |
935 | 270 | 3825 | 2.39 |
936 | 183 | 3642 | 2.39 |
937 rows × 3 columns
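Standardization
data_normal = preprocessing.scale(data)  # zero mean and unit variance per column
# The label is the standardized second column (股票, stock).
train_label = data_normal[:900, 1]
# Note: this slice starts at 901, so row 900 falls in neither split; the feature slices further down use the same boundary.
test_label = data_normal[901:, 1]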
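To make the standardization step concrete, here is a minimal sketch of what `preprocessing.scale` computes. The array `toy` is made-up random data, not the real dataset, and the `StandardScaler` variant at the end is an alternative I am adding for illustration (it fits the scaling statistics on the training rows only), not what the code above does.

```python
import numpy as np
from sklearn import preprocessing
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the real data: 100 rows, 3 columns on different scales.
rng = np.random.default_rng(0)
toy = rng.normal(loc=[50.0, 4000.0, 1.5], scale=[30.0, 500.0, 0.4], size=(100, 3))

# What the post uses: column-wise standardization over ALL rows.
scaled = preprocessing.scale(toy)

# Manual equivalent: subtract the column mean, divide by the population std (ddof=0).
manual = (toy - toy.mean(axis=0)) / toy.std(axis=0)
print(np.allclose(scaled, manual))

# Alternative (my assumption, not the post's method): fit the statistics on the
# training rows only, then apply them to the held-out rows.
scaler = StandardScaler().fit(toy[:80])
train_scaled = scaler.transform(toy[:80])
test_scaled = scaler.transform(toy[80:])
print(train_scaled.shape, test_scaled.shape)
```

Scaling over all rows, as in the post, lets the test rows influence the mean and standard deviation. For a quick comparison of reduction methods that is probably acceptable, but the second variant is the usual choice when the test score needs to be an honest estimate.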
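PCA dimensionality reduction
pca = PCA(n_components=2)
pca_data = pca.fit_transform(data_normal)  # equivalent to pca.fit(X) followed by pca.transform(X)
# X_back = pca.inverse_transform(pca_data)  # maps the reduced data back to the original feature space (an approximation)
print(pca.explained_variance_ratio_)  # fraction of the total variance explained by each retained component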
[0.38909061 0.3429785 ]
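The two ratios printed above sum to roughly 0.73, so the two components keep about 73% of the total variance. As a rough sketch of where these numbers come from, the snippet below recomputes `explained_variance_ratio_` from the eigenvalues of the covariance matrix. The matrix `X` is synthetic random data that I made up for the illustration, so its values will not match the output above.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical data: 200 rows, 3 columns, with some correlation between columns 0 and 2.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X[:, 2] += 0.8 * X[:, 0]
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize, as in the post

pca = PCA(n_components=2).fit(X)

# Eigenvalues of the sample covariance matrix, sorted from largest to smallest.
eigvals = np.linalg.eigvalsh(np.cov(X, rowvar=False))[::-1]

# Each ratio is one eigenvalue divided by the sum of all eigenvalues.
manual_ratio = eigvals[:2] / eigvals.sum()

print(pca.explained_variance_ratio_)
print(manual_ratio)
print(pca.explained_variance_ratio_.sum())  # share of variance kept by the 2 components
```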
plt.figure(figsize=(12,8))
plt.title('PCA Components')
plt.scatter(pca_data[:,0], pca_data[:,1])
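The commented-out `inverse_transform` call in the PCA cell hints at another way to judge the reduction: map the 2-D points back to the original 3-D space and measure how far they land from the original rows. Below is a minimal sketch on synthetic data (the matrix `X` is hypothetical, not the real dataset).

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical standardized data: 100 rows, 3 columns.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
X = (X - X.mean(axis=0)) / X.std(axis=0)

pca = PCA(n_components=2)
Z = pca.fit_transform(X)            # reduced data, shape (100, 2)
X_back = pca.inverse_transform(Z)   # approximate reconstruction, shape (100, 3)

# Mean squared reconstruction error over all entries.
mse = np.mean((X - X_back) ** 2)

# Fraction of variance that the two retained components do NOT explain.
discarded = 1.0 - pca.explained_variance_ratio_.sum()

# For data standardized to zero mean and unit variance per column,
# these two numbers coincide.
print(mse, discarded)
```

Note that `inverse_transform` takes the reduced array (`Z` here, `pca_data` in the post), not the original matrix.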
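Train a simple neural network
pca_train_data = pca_data[:900]
pca_test_data = pca_data[901:]  # same boundary as test_label (row 900 is in neither split)
pca_model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, input_shape=(2,), activation='relu'),
    tf.keras.layers.Dense(10, activation='relu'),
    tf.keras.layers.Dense(1),
])
pca_model.summary()  # print the layer structure and parameter counts
# Adam optimizer with MSE loss. MAPE is reported on the standardized labels, many of which are close to zero, so its values can be very large.
pca_model.compile(optimizer='adam', loss='mse', metrics=['mape'])
pca_history = pca_model.fit(pca_train_data, train_label, epochs=100)  # train for 100 epochs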
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 10) 30
_________________________________________________________________
dense_1 (Dense) (None, 10) 110
_________________________________________________________________
dense_2 (Dense) (None, 1) 11
=================================================================
Total params: 151
Trainable params: 151
Non-trainable params: 0
_________________________________________________________________
Train on 900 samples
Epoch 1/100
900/900 [==============================] - 2s 2ms/sample - loss: 0.7850 - mape: 174.4778
Epoch 2/100
900/900 [==============================] - 0s 231us/sample - loss: 0.6291 - mape: 203.4696
Epoch 3/100
900/900 [==============================] - 0s 184us/sample - loss: 0.5221 - mape: 235.8750
Epoch 4/100
900/900 [==============================] - 0s 182us/sample - loss: 0.4389 - mape: 260.7603
Epoch 5/100
900/900 [==============================] - 0s 172us/sample - loss: 0.3774 - mape: 280.1786
Epoch 6/100
900/900 [==============================] - 0s 184us/sample - loss: 0.3348 - mape: 283.5222
Epoch 7/100
900/900 [==============================] - 0s 310us/sample - loss: 0.3014 - mape: 286.3506
Epoch 8/100
900/900 [==============================] - 0s 230us/sample - loss: 0.2762 - mape: 291.6097
Epoch 9/100
900/900 [==============================] - 0s 234us/sample - loss: 0.2539 - mape: 279.0452
Epoch 10/100
900/900 [==============================] - 0s 199us/sample - loss: 0.2364 - mape: 267.8730
Epoch 11/100
900/900 [==============================] - 0s 187us/sample - loss: 0.2219 - mape: 254.5653
Epoch 12/100
900/900 [==============================] - 0s 183us/sample - loss: 0.2095 - mape: 241.9442
Epoch 13/100
900/900 [==============================] - 0s 266us/sample - loss: 0.1987 - mape: 234.5619
Epoch 14/100
900/900 [==============================] - 0s 184us/sample - loss: 0.1884 - mape: 225.1507
Epoch 15/100
900/900 [==============================] - 0s 183us/sample - loss: 0.1811 - mape: 221.8909
Epoch 16/100
900/900 [==============================] - 0s 214us/sample - loss: 0.1749 - mape: 205.1877
Epoch 17/100
900/900 [==============================] - 0s 199us/sample - loss: 0.1702 - mape: 204.4365
Epoch 18/100
900/900 [==============================] - 0s 189us/sample - loss: 0.1649 - mape: 198.4753
Epoch 19/100
900/900 [==============================] - 0s 186us/sample - loss: 0.1621 - mape: 205.5184
Epoch 20/100
900/900 [==============================] - 0s 174us/sample - loss: 0.1590 - mape: 195.3820
Epoch 21/100
900/900 [==============================] - 0s 196us/sample - loss: 0.1569 - mape: 197.1003
Epoch 22/100
900/900 [==============================] - 0s 199us/sample - loss: 0.1547 - mape: 193.8486
Epoch 23/100
900/900 [==============================] - 0s 181us/sample - loss: 0.1543 - mape: 192.1595
Epoch 24/100
900/900 [==============================] - 0s 185us/sample - loss: 0.1519 - mape: 192.7433
Epoch 25/100
900/900 [==============================] - 0s 169us/sample - loss: 0.1519 - mape: 190.2121
Epoch 26/100
900/900 [==============================] - 0s 171us/sample - loss: 0.1508 - mape: 196.8424
Epoch 27/100
900/900 [==============================] - 0s 193us/sample - loss: 0.1510 - mape: 187.2915
Epoch 28/100
900/900 [==============================] - 0s 174us/sample - loss: 0.1489 - mape: 187.9957
Epoch 29/100
900/900 [==============================] - 0s 174us/sample - loss: 0.1494 - mape: 191.7432
Epoch 30/100
900/900 [==============================] - 0s 182us/sample - loss: 0.1485 - mape: 196.7090
Epoch 31/100
900/900 [==============================] - 0s 177us/sample - loss: 0.1477 - mape: 189.1148
Epoch 32/100
900/900 [==============================] - 0s 180us/sample - loss: 0.1468 - mape: 193.2476
Epoch 33/100
900/900 [==============================] - 0s 176us/sample - loss: 0.1473 - mape: 191.6133
Epoch 34/100
900/900 [==============================] - 0s 173us/sample - loss: 0.1464 - mape: 186.4275
Epoch 35/100
900/900 [==============================] - 0s 233us/sample - loss: 0.1454 - mape: 190.5075
Epoch 36/100
900/900 [==============================] - 0s 180us/sample - loss: 0.1450 - mape: 189.6160
Epoch 37/100
900/900 [==============================] - 0s 174us/sample - loss: 0.1450 - mape: 190.8441
Epoch 38/100
900/900 [==============================] - 0s 176us/sample - loss: 0.1449 - mape: 191.0887
Epoch 39/100
900/900 [==============================] - 0s 192us/sample - loss: 0.1446 - mape: 195.8095
Epoch 40/100
900/900 [==============================] - 0s 203us/sample - loss: 0.1446 - mape: 187.4518
Epoch 41/100
900/900 [==============================] - 0s 188us/sample - loss: 0.1441 - mape: 191.2747
Epoch 42/100
900/900 [==============================] - 0s 243us/sample - loss: 0.1441 - mape: 191.9978
Epoch 43/100
900/900 [==============================] - 0s 217us/sample - loss: 0.1439 - mape: 185.9973
Epoch 44/100
900/900 [==============================] - 0s 198us/sample - loss: 0.1446 - mape: 186.2656
Epoch 45/100
900/900 [==============================] - 0s 198us/sample - loss: 0.1439 - mape: 179.1912
Epoch 46/100
900/900 [==============================] - 0s 217us/sample - loss: 0.1443 - mape: 189.1231
Epoch 47/100
900/900 [==============================] - 0s 221us/sample - loss: 0.1446 - mape: 190.8205
Epoch 48/100
900/900 [==============================] - 0s 201us/sample - loss: 0.1426 - mape: 197.7570
Epoch 49/100
900/900 [==============================] - 0s 185us/sample - loss: 0.1425 - mape: 191.7054
Epoch 50/100
900/900 [==============================] - 0s 235us/sample - loss: 0.1424 - mape: 195.1246
Epoch 51/100
900/900 [==============================] - 0s 180us/sample - loss: 0.1417 - mape: 188.4568
Epoch 52/100
900/900 [==============================] - 0s 247us/sample - loss: 0.1422 - mape: 190.8196
Epoch 53/100
900/900 [==============================] - 0s 175us/sample - loss: 0