Wilson Confidence Interval

Latest recommended article published on 2024-03-11 21:08:45

撸猫少女


Article tags: Wilson confidence interval, php

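The normal approximation interval is unreliable for small samples. In 1927 the American mathematician Edwin Bidwell Wilson therefore proposed a corrected formula, known as the "Wilson interval", which remains accurate when the sample is small.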

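For a single 0/1 observation that equals 1 with probability p, the definitions of the mean and variance of a discrete random variable give: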

$$\mu = E(X) = 0 \cdot (1-p) + 1 \cdot p = p$$

$$\sigma^2 = D(X) = (0 - E(X))^2 (1-p) + (1 - E(X))^2 p = p^2 (1-p) + (1-p)^2 p = p^2 - p^3 + p^3 - 2p^2 + p = p - p^2 = p(1-p)$$
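The derivation can be checked numerically. The following is a minimal sketch; the helper name `bernoulli_moments` is an assumption made for illustration and does not come from the original post. It applies the definitions of mean and variance directly to the two outcomes 0 and 1.

```python
def bernoulli_moments(p):
    """Mean and variance of a 0/1 variable, computed from the definitions."""
    # Outcome -> probability: 0 occurs with probability 1 - p, 1 with probability p.
    outcomes = {0: 1.0 - p, 1: p}
    mean = sum(x * prob for x, prob in outcomes.items())
    variance = sum((x - mean) ** 2 * prob for x, prob in outcomes.items())
    return mean, variance


for p in (0.1, 0.5, 0.9):
    mean, variance = bernoulli_moments(p)
    # The closed forms derived above are p and p * (1 - p).
    print(p, mean, variance, p * (1 - p))
```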

With the variance written as p(1-p), the Wilson interval formula can therefore be written in a shorter form:
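For orientation, the lower bound that the code in this post computes matches the standard Wilson score lower bound. The symbols are chosen here for illustration and are assumptions: $\hat{p}$ is the observed fraction of positives (`pos / total`), $n$ is the total count, and $z$ is the normal quantile (`p_z`).

```latex
$$
\text{lower bound} =
\frac{\hat{p} + \dfrac{z^2}{2n} - z \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n} + \dfrac{z^2}{4n^2}}}
     {1 + \dfrac{z^2}{n}}
$$
```

The term $\hat{p}(1-\hat{p})$ under the square root is the variance $p(1-p)$ derived earlier, evaluated at the observed fraction.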

Code (Python):

import numpy as np


def wilson_score(pos, total, p_z=2.):
    """
    Wilson score function (lower bound of the Wilson interval).

    Reference: https://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval

    :param pos: number of positive examples
    :param total: total number of examples (must be greater than 0)
    :param p_z: quantile of the normal distribution
    :return: Wilson score
    """
    pos_rat = pos * 1. / total  # fraction of positive examples
    score = (pos_rat + (np.square(p_z) / (2. * total))
             - ((p_z / (2. * total)) * np.sqrt(4. * total * (1. - pos_rat) * pos_rat + np.square(p_z)))) / \
            (1. + np.square(p_z) / total)
    return score
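To illustrate the small-sample behaviour described at the start, here is a minimal standard-library sketch of the same lower bound. The name `wilson_lower_bound`, the default `z=1.96`, and the guard for an empty sample are assumptions made for this example, not part of the original function.

```python
import math


def wilson_lower_bound(pos, total, z=1.96):
    """Lower bound of the Wilson interval; z=1.96 is an assumed 95% level."""
    if total <= 0:
        return 0.0  # assumption: treat "no data" as the lowest possible score
    p_hat = pos / total
    z2 = z * z
    centre = p_hat + z2 / (2 * total)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / total + z2 / (4 * total * total))
    return (centre - margin) / (1 + z2 / total)


# One positive vote out of one versus ninety positive votes out of a hundred.
print(wilson_lower_bound(1, 1))
print(wilson_lower_bound(90, 100))
```

Although 1 out of 1 is a raw ratio of 100%, its lower bound is only about 0.21, while 90 out of 100 scores about 0.83: the single vote carries too little evidence to rank first.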

SQL implementation:

-- Wilson score lower bound (95% confidence, z = 1.96)
-- The 1.0 factor avoids integer division on engines that truncate.
SELECT widget_id, ((positive + 1.9208) / (positive + negative) -
       1.96 * SQRT((positive * 1.0 * negative) / (positive + negative) + 0.9604) /
       (positive + negative)) / (1 + 3.8416 / (positive + negative))
       AS ci_lower_bound FROM widgets WHERE positive + negative > 0
       ORDER BY ci_lower_bound DESC;

-- Net positive ratings (positive minus negative)
SELECT widget_id, (positive - negative)
       AS net_positive_ratings FROM widgets ORDER BY net_positive_ratings DESC;

-- Average rating (fraction of positive ratings)
SELECT widget_id, positive * 1.0 / (positive + negative)
       AS average_rating FROM widgets WHERE positive + negative > 0
       ORDER BY average_rating DESC;
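The three queries above can order the same rows quite differently. The sketch below mirrors them in Python on made-up data; the vote counts, the `widgets` dictionary, and the helper names are all hypothetical and only serve to show the effect.

```python
import math

# Hypothetical sample data: widget_id -> (positive, negative) vote counts.
widgets = {"A": (600, 400), "B": (2, 0), "C": (90, 10)}


def wilson_lower_bound(positive, negative, z=1.96):
    """Same expression as the first query, with z = 1.96."""
    n = positive + negative
    if n == 0:
        return 0.0  # assumption: rows without votes score lowest
    p_hat = positive / n
    z2 = z * z
    centre = p_hat + z2 / (2 * n)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n + z2 / (4 * n * n))
    return (centre - margin) / (1 + z2 / n)


def ranked(score):
    """Widget ids sorted by the given score function, best first."""
    return sorted(widgets, key=lambda w: score(*widgets[w]), reverse=True)


by_net = ranked(lambda pos, neg: pos - neg)           # net positive ratings
by_average = ranked(lambda pos, neg: pos / (pos + neg))  # average rating
by_wilson = ranked(wilson_lower_bound)                # Wilson lower bound

print(by_net, by_average, by_wilson)
```

On this data, net positive ratings favour the widget with the most votes, the plain average favours the widget with only two votes, and the Wilson lower bound puts the well-supported 90% widget first.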

Excel implementation:

=IFERROR((([@[Up Votes]] + 1.9208) / ([@[Up Votes]] + [@[Down Votes]]) - 1.96 *
SQRT(([@[Up Votes]] * [@[Down Votes]]) / ([@[Up Votes]] + [@[Down Votes]]) + 0.9604) /
([@[Up Votes]] + [@[Down Votes]])) / (1 + 3.8416 / ([@[Up Votes]] + [@[Down Votes]])),0)

Ranking by star ratings

References:


Source: https://www.cnblogs.com/iupoint/p/13354631.html
