In Python, how do I calculate the correlation and statistical significance between two arrays of data?

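This article shows how to use Python's scipy and numpy libraries to compute the correlation coefficient of two equally long arrays and its statistical significance, with attention to data that may be tightly correlated or show no obvious correlation. It covers computing the Pearson correlation coefficient and the use of the Mahalanobis distance.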

I have sets of data with two equally long arrays of data, or I can make an array of two-item entries, and I would like to calculate the correlation and statistical significance represented by the data (which may be tightly correlated, or may have no statistically significant correlation).

I am programming in Python and have scipy and numpy installed. I looked and found "Calculating Pearson correlation and significance in Python", but that seems to want the data to be manipulated so it falls into a specified range.

What is the proper way to, I assume, ask scipy or numpy to give me the correlation and statistical significance of two arrays?

Solution

If you want to calculate the Pearson correlation coefficient, scipy.stats.pearsonr is the way to go: it returns both the coefficient and a p-value, although the significance is only meaningful for larger data sets. The function does not require the data to be manipulated to fall into a specified range. It is the correlation value itself that falls in the interval [-1, 1]; perhaps that was the source of the confusion?
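As a minimal sketch of the scipy.stats.pearsonr call, the example below uses made-up data (the arrays `x`, `y`, `z` and the seed are assumptions for illustration, not data from the question). It also shows the "array of two-item entries" case by slicing the two columns apart.

```python
import numpy as np
from scipy import stats

# Hypothetical example data, not taken from the question.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 0.8 * x + rng.normal(scale=0.5, size=200)  # constructed to correlate with x
z = rng.normal(size=200)                       # generated independently of x

# pearsonr takes the raw arrays; no rescaling into a fixed range is needed.
r_xy, p_xy = stats.pearsonr(x, y)
r_xz, p_xz = stats.pearsonr(x, z)
print(f"x vs y: r = {r_xy:.3f}, p = {p_xy:.3g}")
print(f"x vs z: r = {r_xz:.3f}, p = {p_xz:.3g}")

# If the data is an array of two-item entries, split it into its two columns.
pairs = np.column_stack((x, y))
r_pairs, p_pairs = stats.pearsonr(pairs[:, 0], pairs[:, 1])
```

A small p-value suggests the observed correlation is unlikely under the assumption of no correlation; with only a handful of points it should be read with caution.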

If the significance is not terribly important, you can use numpy.corrcoef().
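For comparison, here is a small sketch of numpy.corrcoef on made-up values (the arrays are hypothetical). Note that it returns a correlation matrix rather than a single number, and no p-value.

```python
import numpy as np

# Hypothetical example data: y is roughly 2 * x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# corrcoef returns a 2x2 matrix; the diagonal holds the self-correlations
# and the off-diagonal entries hold the Pearson coefficient of x and y.
corr_matrix = np.corrcoef(x, y)
r = corr_matrix[0, 1]
print(corr_matrix)
print(f"r = {r:.4f}")
```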

The Mahalanobis distance does take into account the correlation between two arrays, but it provides a distance measure, not a correlation. (With a fixed, invertible covariance matrix it is a proper distance metric: the Euclidean distance after the data has been rescaled by its covariance.)
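To illustrate why the Mahalanobis distance is a distance and not a correlation, the sketch below uses scipy.spatial.distance.mahalanobis on made-up, strongly correlated data. The data, the seed, and the two probe points `along` and `across` are assumptions chosen for illustration.

```python
import numpy as np
from scipy.spatial import distance

# Hypothetical, strongly correlated example data.
rng = np.random.default_rng(1)
x = rng.normal(size=500)
y = 0.9 * x + rng.normal(scale=0.3, size=500)
data = np.column_stack((x, y))

# mahalanobis() expects the INVERSE of the covariance matrix.
cov = np.cov(data, rowvar=False)
inv_cov = np.linalg.inv(cov)
center = data.mean(axis=0)

# Two probe points at the same Euclidean distance from the center:
# one along the direction of the correlation, one across it.
along = center + np.array([1.0, 1.0])
across = center + np.array([1.0, -1.0])

d_along = distance.mahalanobis(along, center, inv_cov)
d_across = distance.mahalanobis(across, center, inv_cov)
print(f"along the trend:  {d_along:.2f}")
print(f"across the trend: {d_across:.2f}")
```

The point lying across the trend comes out much farther away, because the correlation between the two arrays shapes the distance; the result is still a distance for a given point, not a summary of how correlated the arrays are.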
