二项分布比例的置信区间计算

原创 2011年10月27日 21:43:27

之前一网友遇到类似问题,特查了相关文献,归结一下方法有二。

1.根据各方法的计算公式进行编辑公式,data 步就可以搞定。相关文献有:《A Comparison of Binomial Proportion Interval  Estimation Methods 》、《Confidence Interval Calculation for Binomial Proportions》,这两篇文章都曾讨论比较了Wald、Wilson Score、Exact Clopper Pearson几种方法,另外《A SAS Program to Calculate and Plot Widths of (1- )100% Confidence Intervals for Binomial Proportions》一文将其中的Clopper Pearson法写成了宏。

2.PROC FREQ语句也提供了计算基于二项分布置信区间的方法。《Are you computing Confidence Interval for binomial proportion using PROC FREQ? Be Careful!!》和《Confidence Intervals for the Binomial Proportion with Zero Frequency》作了相关讨论,proc freq语句官方文档也作了相关说明(http://support.sas.com/documentation/cdl/en/statug/63347/HTML/default/viewer.htm#statug_freq_a0000000660.htm

 

相关代码摘录如下。

/***********************************************************************
** Program: Propci.sas
** Author: Keith Dunnigan
**
** Purpose: Calculate the confidence interval for one proportion.
**
** Description: Calculate Wald, Wilson Score, and exact Clopper-Pearson
** confidence intervals for a variety of n, p, and alpha
** combinations for either a one or two-sided interval.
**
***********************************************************************/
%macro runprog;
data propci;
set parms;
if sided = 1 then do;
zalpha=probit(1-(alpha));
end;
else if sided = 2 then do;
zalpha=probit(1-(alpha/2));
end;
** Wald;
q = 1 - p;
stder = sqrt(p*q/n);
Wald_lcl = p - zalpha * stder;
Wald_ucl = p + zalpha * stder;
** Wilson Score;
part1 = p + ((zalpha**2)/(2*n));
part2 = sqrt( (p*q/n) + ((zalpha**2)/(4*n**2)) );
part3 = 1 + (zalpha**2)/n;
Wilson_lcl = (part1 - (zalpha * part2))/ part3;
Wilson_ucl = (part1 + (zalpha * part2))/ part3;
** Exact Clopper Pearson;
x = round (n*p,0.1);
* Calculate the lower limit.;
v1 = 2*(n-x+1);
v2 = 2*x;
if sided = 1 then do;
a = 1-(alpha);
end;
else if sided = 2 then do;
a = 1-(alpha/2);
end;
coef = (n-x+1)/x;
fscore = finv(a,v1,v2);
exact_lcl = 1/(1+coef*fscore);
* Calculate the upper limit.;
v11 = 2*(x+1);
v22 = 2*(n-x);
fscore2 = finv(a,v11,v22);
coef2 = (x+1)/(n-x);
numer = coef2*fscore2;
exact_ucl = numer/(1+numer);
run;
options nodate;
title 'Confidence Intervals for a Single Proportion';
proc print data = propci split = '_' noobs;
var n p alpha sided Wald_lcl Wald_ucl Wilson_lcl Wilson_ucl exact_lcl exact_ucl;
label wald_lcl = 'LCL_(Wald)'
wald_ucl = 'UCL_(Wald)'
wilson_lcl = 'LCL_(Wilson_Score)'
wilson_ucl = 'UCL_(Wilson_Score)'
sided = 'Sides_on_CI'
exact_lcl = 'LCL_(Exact)'
exact_ucl = 'UCL_(Exact)';
run;
%mend runprog;
data parms;
infile cards;
input n p alpha sided;
cards;
24 0.9 0.05 1
25 0.9 0.05 1
26 0.9 0.05 1
29 0.9 0.05 1
32 0.9 0.05 1
35 0.9 0.05 1
38 0.9 0.05 1
43 0.9 0.05 1
44 0.9 0.05 1
45 0.9 0.05 1
45 0.9 0.05 1
48 0.9 0.05 1
49 0.9 0.05 1
50 0.9 0.05 1
run;
%runprog;


 

**Some programs for calculating the Exact Binomial P value and Confidence Bound;

**Using SAS Proc FREQ;

data temp1;
  input batch bleeding $ count;
  datalines;
  1 have 2
  1 no 3
  2 have 3
  2 no 7
  3 have 4
  3 no 11
run;

proc freq;
  weight count;
  tables bleeding /  binomial (p=0.1) alpha=0.2 cl;  **p=1 option indicates the standard rate to compare with, 
                                                       here we assume the standard bleeding rate is 10%;
                                                     **Alpha=0.2 to obtain one side 90% lower limit;
  exact binomial;  *Obtain the exact p-value;
  by batch;
run;


* This Program Calculates EXACT Confidence Bounds
* on proportion of "successes" in a Binomial Distribution
* See Sachs, Applied Statistics, p. 333
*
* Author: Mark DeHaan   msd@inel.gov
* Date: 4/21/92
*
* Macro Variable Names & Definitions
*  nsuccess= number of Successes found in sample
*  sampsize= sample size
*  twosided= two sided confidence bounds (0=no, 1=yes)
*  conflev=  confidence level for estimates (e.g. .95)
******************************************************;
%macro p_conf(nsuccess,sampsize,twosided,conflev);
   Data null;
     phat=&nsuccess/&sampsize;
     if &twosided then do;  *compute a two-sided conf. bound;
        clevel=&conflev+((1-&conflev)/2);
        put;put;
        put "***Note: Two sided &conflev confidence bounds***";
        end;
     else do;
        clevel=&conflev;
        put;put;
        put "***Note: One sided &conflev confidence bounds***";
        end;
     f_low=finv(clevel,2*(&sampsize-&nsuccess+1),2*&nsuccess);
     f_up=finv(clevel,2*(&nsuccess+1),2*(&sampsize-&nsuccess));
     low_phat=&nsuccess/(&nsuccess+(&sampsize-&nsuccess+1)*f_low);
     up_phat=(&nsuccess+1)*f_up/(&sampsize-&nsuccess+(&nsuccess+1)*f_up);
     put "Number of successes= &nsuccess    Sample size= &sampsize";
     put phat= low_phat= up_phat=;
     put;put;
     run;
%mend;
 
%p_conf(7,20,0,.975)
%p_conf(7,20,1,.95)
%p_conf(2,5,0,.90)
%p_conf(2,5,1,.80) 
 
 
---------------------------------------------------------------
Date:         Tue, 5 Jun 90 09:14:57 GMT
Reply-To:     LESLIE DALY PhD <LDALY@IRLEARN>
Sender:       "SAS(r) Discussion" <SAS-L@OHSTVMA>
From:         LESLIE DALY PhD <LDALY@IRLEARN.BITNET>
Subject:      POISSON (AND bINOMIAL) CONFIDENCE INTERVALS
To:           Barry W Grau <U42054@UICVM.BITNET>
 
Stephen Hull asked for SAS code for a Poisson confidence interval.  I
attach below two short MACROS to get confidence intervals for both the
binomial and Poisson cases.  I have found these to be most useful.  The
easiest way to use them is to copy this mail into a file and delete
all except the MACROS.  %INCLUDE this file in the SAS program and
run the macro using %CIPOISS etc.  The MACROS are commented in full.
 
%macro cibinom(CI,n,k,P,PL,PR);
*****************************************************************
*                                                               *
*   MACRO FOR BINOMIAL CONFIDENCE INTERVALS                     *
*   (Leslie Daly   LDALY@IRLEARN.BITNET     VERSION   1.1)      *
*                                                               *
*   This SAS Macro calculates the confidence interval for a     *
*   binomial distribution.  The method is that given in the     *
*   Geigy Scientific Tables Vol 2 (Ed C Lentner) (Ciba-Geigy    *
*   Ltd, Basle, Switzerland) p221.  It is one of the most       *
*   accurate approximations available in a closed form.         *
*                                                               *
*   This reference should be consulted for an explanation of    *
*   the code below.                                             *
*                                                               *
*   USAGE:  %CIBINOM(CI,N,K,P,PL,PU);                           *
*                                                               *
*   INPUT PARAMETERS:  CI - Confidence level as a proportion    *
*                           (ie for 95% confidence interval     *
*                            CI should be 0.95)                 *
*                       N - Total number of trials              *
*                       K - Number of successes                 *
*   OUTPUT PARAMETERS;  P - Observed prob. of success (K/N)     *
*                      PL - Lower confidence limit              *
*                      PU - Upper confidence limit              *
*                                                               *
*                                                               *
*****************************************************************;
ALPHA = (1 - &CI)/2;
&P = &K/&N;
IF &K EQ 0 THEN DO;
   &PL = 0;
   &PR = 1 - 10**(LOG10(ALPHA)/&N);
END;
IF &K EQ &N THEN DO;
   &PL = 10**(LOG10(ALPHA)/&N);
   &PR = 1;
END;
IF &K GT 0 AND &K LT &N THEN DO;
calpha =  probit(1 - ALPHA);
a = (calpha**4)/18  +  (calpha**2)*(2*&N + 1)/6 + (&N + 1/3)**2  ;
AL = &K;
AR = &K + 1;
BL =  (CALPHA**4)/18  + (CALPHA**2)*(4*(&N - AL) + 3)/6
                      + 2*(AL*(3*&N + 1) - &N)/3  - 2/9;
CL =  (CALPHA**4)/18  + (AL - 1/3)**2  -  (CALPHA**2)*(2*AL - 1)/6;
&PL =  BL/(2*A) - SQRT( (BL/(2*A))**2 - CL/A );
BR =  (CALPHA**4)/18  + (CALPHA**2)*(4*(&N - AR) + 3)/6
                      + 2*(AR*(3*&N + 1) - &N)/3  - 2/9;
CR =  (CALPHA**4)/18  + (AR - 1/3)**2  -  (CALPHA**2)*(2*AR - 1)/6;
&PR =  BR/(2*A) + SQRT( (BR/(2*A))**2 - CR/A );
END;
%MEND CIBINOM;
 
 
%MACRO cipoiss(CI,K,KL,KR);
*****************************************************************
*                                                               *
*   MACRO FOR POISSON CONFIDENCE INTERVALS                      *
*   (Leslie Daly   LDALY@IRLEARN.BITNET     VERSION   1.1)      *
*                                                               *
*   This SAS Macro calculates the confidence interval for a     *
*   Poisson distribution.  The confidence lilmits are exact     *
*   and are based on the relationship between the chisquare     *
*   distribution and the Poisson.                               *
*   If CHI(df,P) represents the chisquuare value which on       *
*   df degrees of freedom cuts off an area P in the upper       *
*   tail of the distribution (ie CHI(1,0.05) = 3.84) then       *
*   for a count of x (observed) the 95% Confidence limits are:  *
*      Lower: 0.5*( CHI(2*X,0.975)                              *
*      Upper: 0.5*( CHI(2*X +2,0.025)                           *
*   with obvious adjustments for 99% limits etc.                *
*                                                               *
*   In SAS the chisquare value can be obtained using the        *
*   function GAMINV. In the notation above (watch the           *
*   probability levels)                                         *
*                                                               *
*        CHI(DF,P) = 2*GAMINV(1-P,0.5*DF)                       *
*                                                               *
*   hence the code below.                                       *
*                                                               *
*                                                               *
*   USAGE:             %CIPOISS(CI,K,KL,KR)                     *
*                                                               *
*                                                               *
*   INPUT PARAMETERS:  CI - Confidence level as a proportion    *
*                           (ie for 95% confidence interval     *
*                            CI should be 0.95)                 *
*                       K - Observed number of events           *
*   OUTPUT PARAMETERS; KL - Lower confidence limit              *
*                      KU - Upper confidence limit              *
*                                                               *
*                                                               *
*****************************************************************;
ALPHA = (1 - &CI)/2;
if &k ne 0 then do;
   &KR = gaminv (1 - ALPHA,&K+1);
   &KL = gaminv (ALPHA,&K);
END;
IF &K EQ 0 THEN DO;
   &KR = -LOG(ALPHA);
   &KL = 0;
END;
%MEND CIPOISS;


 

相关文章推荐

SPSS中八类常用非参数检验之二:二项分布(Binomial)检验

一、定义  现实生活中有很多数据的取值只有两类,如医学中的生与死、患病的有与无、性别中的男性和女性、产品的合格与不合格等。从这种二分类总体中抽取的所有可能结果,要么是对立分类中的这一类,要么是另一类,...
  • downmoon
  • downmoon
  • 2012年03月26日 11:55
  • 14678

Khan公开课 - 统计学学习笔记:(七)伯努利分布、置信区间、t分布、p-value和第一型错误

伯努利分布(Bernoulli Distribution) Bernoulli Distribution是最简单的二项式分布,只有两个选择,Y or N,以0表示N,1表示Y。在日常生活中也比较常见...

Python单例模式的4种实现方法

[python] view plaincopy   print '----------------------方法1--------------------------' ...

UCI用法说明(一种读写文件的方法)

目录: 1.UCI命令 2.UCI的文件和流程 3.UCI 的文件语法 4.UCI 命令读写配置 5.综合实例 1.UCI命令       一个众所周知的原因,在Linux下各种软件包有各种不同的...

统计学一些知识的总结:qq图, 肥尾分布, 置信区间

Q-Q图 http://bioinformatics.lofter.com/post/bffd5_23f5fe
  • rav009
  • rav009
  • 2014年07月21日 16:54
  • 7344

非参数统计--二项分布检验

  • 2017年03月11日 17:13
  • 731KB
  • 下载

二项分布代码

  • 2013年12月29日 23:01
  • 1KB
  • 下载

二项分布的概念与特征

http://www.foodmate.net/lesson/41/ 第一节  二项分布的概念与特征 一、二项分布的概念在医学领域中,有一些随机事件是只具有两种互斥结果的离散型随机事件,称为二项分类...
  • knzeus
  • knzeus
  • 2011年03月25日 14:17
  • 264
内容举报
返回顶部
收藏助手
不良信息举报
您举报文章:二项分布比例的置信区间计算
举报原因:
原因补充:

(最多只允许输入30个字)