Correlation among several groups of data in Python: how do I compute the correlation between multiple columns (more than two variables)?

I have a pandas DataFrame like so:

id  cat1  cat2  cat3  num1  num2
1   0     WN    29    2003  98
2   1     TX    12    755   76
3   0     WY    11    845   32
4   1     IL    19    935   46

I want to find the correlation between cat1 and the columns cat3, num1 and num2,

or between cat1 and num1 and num2,

or between cat2 and cat1, cat3, num1 and num2.

When I use df.corr(), it gives the correlation between all the columns in the DataFrame, but I want to see the correlation between only the selected columns listed above.

How do I do that in Python pandas?

A thousand thanks in advance for your answers.

Solution

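I tried the following and it worked:

features1 = ['cat1', 'cat2', 'cat3']

# Column names are case-sensitive, so they must match the DataFrame exactly
features2 = ['cat1', 'cat2', 'num1', 'num2']

# Correlation matrix restricted to each subset of columns
# (cat2 holds strings; recent pandas versions may need corr(numeric_only=True) to skip it)
df[features1].corr()

df[features2].corr()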
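To try the approach above end to end, here is a minimal sketch that rebuilds the sample table from the question. The column groupings are taken from the question's wording, and the variable names `corr1` and `corr2` are only illustrative. The text column cat2 is left out of the subsets, on the assumption that a Pearson correlation is only wanted for the numeric columns.

```python
import pandas as pd

# Sample data copied from the table in the question
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "cat1": [0, 1, 0, 1],
    "cat2": ["WN", "TX", "WY", "IL"],
    "cat3": [29, 12, 11, 19],
    "num1": [2003, 755, 845, 935],
    "num2": [98, 76, 32, 46],
})

# Column subsets from the question (numeric columns only)
features1 = ["cat1", "cat3", "num1", "num2"]
features2 = ["cat1", "num1", "num2"]

# Selecting the columns first limits the correlation matrix to that subset
corr1 = df[features1].corr()
corr2 = df[features2].corr()

print(corr1)
print(corr2)
```

Each result is a square matrix whose rows and columns are exactly the selected names, so a single pair can be read with something like `corr1.loc["cat1", "num1"]`.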

This is a good way to select only the columns you need when your dataset has a very large number of variables.
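If only the correlation of one column against a few others is needed (for example cat1 against cat3, num1 and num2), a full matrix is more than required. The sketch below uses `DataFrame.corrwith` for that case. It is one possible alternative rather than the method used in the answer, and the names `target`, `others`, `one_vs_many` and `sliced` are hypothetical.

```python
import pandas as pd

# Sample data copied from the table in the question
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "cat1": [0, 1, 0, 1],
    "cat2": ["WN", "TX", "WY", "IL"],
    "cat3": [29, 12, 11, 19],
    "num1": [2003, 755, 845, 935],
    "num2": [98, 76, 32, 46],
})

target = "cat1"
others = ["cat3", "num1", "num2"]

# Pearson correlation of each column in `others` with the target column
one_vs_many = df[others].corrwith(df[target])
print(one_vs_many)

# The same numbers, read as one column of the subset's correlation matrix
sliced = df[[target] + others].corr().loc[others, target]
print(sliced)
```

Both results are Series indexed by the names in `others`; `corrwith` simply avoids computing the pairs that are not of interest.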
