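Plotting PCA scatter plots from G25 ancestry calculator coordinates
Overview
Programming language: Python 3.x
Modules: numpy
sklearn (scikit-learn)
matplotlib
Optional: jupyter
Overall approach: reduce the 25-dimensional coordinates provided by G25 to two or three dimensions, then plot them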
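To make the idea concrete, here is a minimal sketch of what the reduction step does, written with numpy only. It is not the sklearn implementation: pca_project is a hypothetical helper, the input is random stand-in data rather than real G25 coordinates, and the signs of the resulting axes may differ from what sklearn returns.

```python
import numpy as np

def pca_project(data, n_components):
    """Project the rows of `data` onto its top principal components."""
    data = np.asarray(data, dtype=float)
    # PCA operates on mean-centered data
    centered = data - data.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    projected = centered @ vt[:n_components].T
    # Share of the total variance captured by each kept component
    variance = singular_values ** 2
    explained_ratio = variance[:n_components] / variance.sum()
    return projected, explained_ratio

# Synthetic stand-in for a G25 dataset: 13 samples x 25 dimensions (made-up numbers)
rng = np.random.default_rng(0)
fake_samples = rng.normal(size=(13, 25))
coords_2d, ratio = pca_project(fake_samples, 2)
print(coords_2d.shape)
print(ratio)
```

The second return value is a quick way to see how much of the spread in the original 25 dimensions survives in the plot.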
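2D PCA scatter plot
Steps:
- Collect your G25 coordinates into a CSV file (one sample per row, numeric columns only, since the code below parses every field as a float).
- Read the CSV file (I read it directly with numpy here; the csv module or pandas would work too).
- Reduce the dimensionality of the loaded array with sklearn (n_components is the number of dimensions after reduction; for a 2D plot it is 2).
- Draw the scatter plot with matplotlib.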
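As a sketch of the csv module alternative mentioned in the steps above, the snippet below reads a file in which each row starts with the sample name, followed by the numeric coordinates. That layout is an assumption, not the layout of the samples.csv used in this post, and read_labeled_coords is a hypothetical helper name. The sample text uses 3 made-up dimensions instead of 25 to stay short.

```python
import csv
import io

def read_labeled_coords(file_obj):
    """Return (labels, rows) from CSV text whose first column is the sample name."""
    labels, rows = [], []
    for record in csv.reader(file_obj):
        if not record:  # skip blank lines
            continue
        labels.append(record[0].strip())
        rows.append([float(value) for value in record[1:]])
    return labels, rows

# Tiny made-up example, for illustration only
sample_text = "A_scaled,0.1,0.2,0.3\nB_scaled,-0.1,0.0,0.5\n"
labels, rows = read_labeled_coords(io.StringIO(sample_text))
print(labels)
print(rows)
```

Reading the names from the file this way would avoid having to keep a hand-written label list in the same order as the rows.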
Code:
import numpy as np
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt

# Load the coordinates: one sample per row, numeric columns only
with open('samples.csv', encoding='utf-8') as f:
    data = np.loadtxt(f, dtype=float, delimiter=',')

# Reduce the 25-dimensional coordinates to 2 principal components
pca = PCA(n_components=2)
new_data = pca.fit_transform(data)

# Labels must be listed in the same order as the rows of samples.csv
label = ['Alex_scaled',
         'bao_scaled',
         'xiulan_scaled',
         'Robert_scaled',
         'Foxy_wg_scaled',
         'Haolin_scaled',
         'Leona_scaled',
         'penn_scaled',
         'shi_mf_scaled',
         'LeonaQi_scaled',
         'ENF', 'NEA', 'EEA']

# Plot each sample separately so that it gets its own legend entry
for name, point in zip(label, new_data):
    plt.scatter(point[0], point[1], s=40, label=name)
plt.legend(loc='best')
plt.show()
Result:
3D PCA scatter plot
Steps:
The steps are the same as for the 2D plot, except that n_components is changed to 3 and the matplotlib part has to draw a 3D plot.
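Code:
import numpy as np
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt
# Only needed on older matplotlib versions to register the '3d' projection
from mpl_toolkits.mplot3d import Axes3D

# Load the coordinates: one sample per row, numeric columns only
with open('samples.csv', encoding='utf-8') as f:
    data = np.loadtxt(f, dtype=float, delimiter=',')

# Reduce the 25-dimensional coordinates to 3 principal components
pca = PCA(n_components=3)
new_data = pca.fit_transform(data)

# Labels must be listed in the same order as the rows of samples.csv
label = ['Alex_scaled',
         'bao_scaled',
         'xiulan_scaled',
         'Robert_scaled',
         'Foxy_wg_scaled',
         'Haolin_scaled',
         'Leona_scaled',
         'penn_scaled',
         'shi_mf_scaled',
         'LeonaQi_scaled',
         'ENF', 'NEA', 'EEA']

fig = plt.figure()
# add_subplot attaches the 3D axes to the figure; Axes3D(fig) alone
# can leave the figure empty on recent matplotlib versions
ax = fig.add_subplot(projection='3d')
for name, point in zip(label, new_data):
    ax.scatter(point[0], point[1], point[2], s=40, label=name)
ax.legend(loc='best')
plt.show()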
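Result:
At this point some of you will probably say: "Isn't that uncomfortable to look at?"
So this is where... we need the actual coordinates of each point.
Getting the 3D PCA coordinates
Without further ado, straight to the code: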
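Code:
import numpy as np
from sklearn.decomposition import PCA

# Load the coordinates: one sample per row, numeric columns only
with open('samples.csv', encoding='utf-8') as f:
    data = np.loadtxt(f, dtype=float, delimiter=',')

# Reduce the 25-dimensional coordinates to 3 principal components
pca = PCA(n_components=3)
new_data = pca.fit_transform(data)

# Labels must be listed in the same order as the rows of samples.csv
label = ['Alex_scaled',
         'bao_scaled',
         'xiulan_scaled',
         'Robert_scaled',
         'Foxy_wg_scaled',
         'Haolin_scaled',
         'Leona_scaled',
         'penn_scaled',
         'shi_mf_scaled',
         'LeonaQi_scaled',
         'ENF', 'NEA', 'EEA']

# Print each label followed by its three PCA coordinates
for name, point in zip(label, new_data):
    print(f'{name}:{point[0]},{point[1]},{point[2]}')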
Result:
Alex_scaled:-0.011596488722119198,-0.00721039268000285,0.004327757597152235
bao_scaled:0.008957327165311457,0.03157204787328956,-0.012162365253179541
xiulan_scaled:-0.038263037685343365,-0.009423923747343946,0.009725629103724714
Robert_scaled:-0.028957259795948216,-0.024755176390934306,-0.009760572109930386
Foxy_wg_scaled:-0.05695083943398266,-0.039729613258914734,-0.014264926832824069
Haolin_scaled:-0.04331299000524389,-0.0002329777672932537,0.010328443388716371
Leona_scaled:-0.04650055185985425,0.0050266575879064045,0.003363030038096107
penn_scaled:-0.0464100888097683,-0.007910106367791488,-0.01116300525568197
shi_mf_scaled:-0.041521117170468275,-0.0019015953462750332,0.0030836026394054556
LeonaQi_scaled:-0.04471519716020188,0.0038319892067295172,0.002849597471080112
ENF:0.4295182339250529,-0.005889585418883738,0.0013563877559191943
NEA:-0.025410190741495663,0.0637904527870023,-0.0044124624225301305
EEA:-0.05483779970593805,-0.00716777647748854,0.016728883880051944
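If you would rather keep these coordinates in a file than read them off the console, one possible approach is to write them out as CSV. The sketch below is only an illustration: write_coords, the header names PC1 to PC3, and the rounding to 6 decimals are my own choices, and the two rows are copied from the output above.

```python
import csv
import io

def write_coords(file_obj, labels, coords, digits=6):
    """Write one CSV row per sample: label followed by its rounded coordinates."""
    writer = csv.writer(file_obj, lineterminator="\n")
    writer.writerow(["label", "PC1", "PC2", "PC3"])
    for name, point in zip(labels, coords):
        writer.writerow([name] + [round(float(v), digits) for v in point])

# Two rows copied from the printed result above
labels = ["ENF", "NEA"]
coords = [
    [0.4295182339250529, -0.005889585418883738, 0.0013563877559191943],
    [-0.025410190741495663, 0.0637904527870023, -0.0044124624225301305],
]

# An in-memory buffer stands in for a real file opened with open(..., 'w', newline='')
buffer = io.StringIO()
write_coords(buffer, labels, coords)
print(buffer.getvalue())
```

With the real data, passing label and new_data from the code above in place of the two sample rows should work the same way.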