Plotting a correlation circle in Python

I've been doing some Geometrical Data Analysis (GDA) such as Principal Component Analysis (PCA). I'm looking to plot a Correlation Circle... these look a bit like this:

[Image: example of a correlation circle]

Basically, it lets you measure the extent to which the eigenvalue/eigenvector of a variable is correlated with the principal components (dimensions) of a dataset.

Does anyone know whether there is a Python package that plots this kind of data visualization?

Solution

Here is a simple example using sklearn and the iris dataset. It includes a scree plot, a scatter plot of the observations on the first two components, and the variable factor map for the first two dimensions:

from sklearn.decomposition import PCA
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt

df = sns.load_dataset('iris')
features = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width']
n_components = 4

# Do the PCA.
pca = PCA(n_components=n_components)
reduced = pca.fit_transform(df[features])

# Append the principal components for each entry to the dataframe
for i in range(n_components):
    df['PC' + str(i + 1)] = reduced[:, i]
print(df.head())  # display(df.head()) also works inside a Jupyter notebook

# Do a scree plot
ind = np.arange(n_components)
(fig, ax) = plt.subplots(figsize=(8, 6))
sns.pointplot(x=ind, y=pca.explained_variance_ratio_, ax=ax)
ax.set_title('Scree plot')
ax.set_xticks(ind)
ax.set_xticklabels(ind + 1)  # label components 1..4 to match PC1..PC4
ax.set_xlabel('Component Number')
ax.set_ylabel('Explained Variance Ratio')
plt.show()

# Show the points in terms of the first two PCs
g = sns.lmplot(x='PC1', y='PC2', hue='species', data=df,
               fit_reg=False, scatter=True,
               height=7)  # 'height' replaced the older 'size' argument
plt.show()

# Plot a variable factor map for the first two dimensions.
(fig, ax) = plt.subplots(figsize=(12, 12))
for i, name in enumerate(features):  # one arrow per original variable
    ax.arrow(0, 0,  # Start the arrow at the origin
             pca.components_[0, i],  # 0 for PC1
             pca.components_[1, i],  # 1 for PC2
             head_width=0.1,
             head_length=0.1)
    ax.text(pca.components_[0, i] + 0.05,
            pca.components_[1, i] + 0.05,
            name)

an = np.linspace(0, 2 * np.pi, 100)
ax.plot(np.cos(an), np.sin(an))  # Add a unit circle for scale
ax.axis('equal')
ax.set_title('Variable factor map')
plt.show()

It'd be a good exercise to extend this to further PCs, to deal with scaling if all components are small, and to avoid plotting factors with minimal contributions.
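
One possible way to approach those three extensions is sketched below. It is only a sketch under some assumptions of mine, not part of the original answer: the helper names variable_pc_correlations and plot_correlation_circle are hypothetical, the data is loaded with sklearn's bundled load_iris rather than seaborn, and the 0.3 cutoff is an arbitrary illustration value. Instead of drawing the raw eigenvector components, it draws the Pearson correlation between each standardized variable and each component score, so every arrow stays inside the unit circle regardless of the variables' original scale.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def variable_pc_correlations(X, n_components):
    """Return an (n_features, n_components) array of Pearson correlations
    between each original variable and each principal component score."""
    X = np.asarray(X, dtype=float)
    # Standardize so that no variable dominates because of its units.
    Z = StandardScaler().fit_transform(X)
    scores = PCA(n_components=n_components).fit_transform(Z)
    n_features = X.shape[1]
    # corrcoef treats the columns of Z and scores as one set of variables;
    # the off-diagonal block holds the variable-vs-component correlations.
    full = np.corrcoef(Z, scores, rowvar=False)
    return full[:n_features, n_features:]


def plot_correlation_circle(corr, labels, pcs=(0, 1), min_length=0.0, ax=None):
    """Draw one arrow per variable on the plane of the two chosen PCs,
    skipping variables whose arrow is shorter than min_length."""
    if ax is None:
        _, ax = plt.subplots(figsize=(6, 6))
    i, j = pcs
    drawn = []
    for (x, y), label in zip(corr[:, [i, j]], labels):
        if np.hypot(x, y) < min_length:
            continue  # variable is poorly represented on this plane
        ax.arrow(0, 0, x, y, head_width=0.03, head_length=0.03,
                 length_includes_head=True)
        ax.text(x * 1.08, y * 1.08, label, ha='center', va='center')
        drawn.append(label)
    ax.add_patch(plt.Circle((0, 0), 1.0, fill=False))  # unit circle
    ax.axhline(0, color='grey', linewidth=0.5)
    ax.axvline(0, color='grey', linewidth=0.5)
    ax.set_xlim(-1.1, 1.1)
    ax.set_ylim(-1.1, 1.1)
    ax.set_aspect('equal')
    ax.set_xlabel(f'PC{i + 1}')
    ax.set_ylabel(f'PC{j + 1}')
    ax.set_title('Correlation circle')
    return ax, drawn


iris = load_iris()
corr = variable_pc_correlations(iris.data, n_components=4)
# pcs uses zero-based indices: (0, 1) is PC1 vs PC2, (1, 2) would be PC2 vs PC3.
ax, drawn = plot_correlation_circle(corr, iris.feature_names,
                                    pcs=(0, 1), min_length=0.3)
```

Call plt.show() afterwards to display the figure. Passing a different pcs pair covers the further components, and raising min_length hides variables that contribute little to the chosen plane.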
