Measuring diversity with the Gini–Simpson index

Simpson index

\lambda =\sum_{i=1}^{R}p_{i}^{2}

The measure equals the probability that two entities taken at random from the dataset (with replacement) represent the same type, where R is the total number of types in the dataset and p_i is the proportion of entities that belong to the i-th type.
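As a small worked example of the formula, the sketch below computes \lambda from a list of per-type counts. The function name `simpson_index` and the counts are hypothetical and chosen only for illustration.

```python
def simpson_index(counts):
    """Return lambda, the sum of squared proportions, for a list of per-type counts."""
    total = sum(counts)
    return sum((c / total) ** 2 for c in counts)


# Hypothetical dataset with R = 3 types containing 5, 3 and 2 entities.
# Proportions are 0.5, 0.3 and 0.2, so lambda = 0.25 + 0.09 + 0.04.
lam = simpson_index([5, 3, 2])
print(lam)  # approximately 0.38
```

With these counts, two entities drawn with replacement are of the same type about 38% of the time.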

 

Gini–Simpson index

The transformation 1-\lambda equals the probability that the two entities represent different types.

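The more evenly the entities are spread across types, the higher the index; the more concentrated the distribution, the lower the index.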
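To illustrate this behavior, the following sketch compares an even distribution with a concentrated one. The helper `gini_simpson` and both sets of counts are made up for the example. For R equally common types the index reaches its maximum, 1 - 1/R.

```python
def gini_simpson(counts):
    """Return 1 - sum(p_i^2) for a list of per-type counts."""
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)


# Four equally common types: every p_i is 0.25, so the index is 1 - 1/4.
even = gini_simpson([25, 25, 25, 25])

# One dominant type: p = 0.97, 0.01, 0.01, 0.01.
skewed = gini_simpson([97, 1, 1, 1])

print(even, skewed)  # approximately 0.75 and 0.0588
```

A dataset with a single type gives an index of 0, because any two draws are certain to match.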

 

Code

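import pandas as pd

def gini_calc(df2):
    """Return the Gini-Simpson index, 1 - sum(p_i^2), from the counts in df2['cnt']."""
    # Proportion of each type within this group
    props = df2['cnt'] / df2['cnt'].sum()
    # Work on a local Series so the caller's DataFrame is not modified
    return 1 - (props ** 2).sum()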


################################
# Load the data and total the counts for each (population, subpopulation, type)
df = pd.read_excel('gini.xlsx')
df = df.groupby(['population', 'subpopulation', 'type'], as_index=False)['cnt'].sum()


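################################
# Compute the index for each (population, subpopulation) group
a = []
b = []
c = []
for name, group in df.groupby(['population', 'subpopulation']):
    index = gini_calc(group)
    a.append(name[0])
    b.append(name[1])
    c.append(index)

res = {"population": a, "subpopulation": b, "gini_simpson_index": c}
data = pd.DataFrame(res)
# to_csv returns None when given a path, so its result is not assigned;
# index=False leaves the row numbers out of the file
data.to_csv('gini_result.csv', index=False)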
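The loop above can also be written without explicit lists. The sketch below is one possible alternative, not the original method. It uses a small in-memory DataFrame with invented values in place of gini.xlsx, and the helper name `gini_simpson_from_counts` is hypothetical.

```python
import pandas as pd

# Invented sample data with the same column names as the spreadsheet above
df = pd.DataFrame({
    "population": ["A", "A", "A", "B", "B"],
    "subpopulation": ["x", "x", "x", "y", "y"],
    "type": ["t1", "t2", "t1", "t1", "t2"],
    "cnt": [2, 4, 2, 9, 1],
})

# Total the counts for each (population, subpopulation, type)
counts = df.groupby(
    ["population", "subpopulation", "type"], as_index=False
)["cnt"].sum()


def gini_simpson_from_counts(cnt):
    """Return 1 - sum(p_i^2) for a Series of per-type counts."""
    p = cnt / cnt.sum()
    return 1 - (p ** 2).sum()


# One index value per (population, subpopulation) group
result = (
    counts.groupby(["population", "subpopulation"])["cnt"]
    .apply(gini_simpson_from_counts)
    .reset_index(name="gini_simpson_index")
)
print(result)
```

In this sample, group A/x has two types with 4 entities each, giving an index of 0.5, while group B/y has counts of 9 and 1, giving about 0.18.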

 
