Probability distribution functions (four kinds)

weixin_30745553

Original link: http://www.cnblogs.com/bangemantou/archive/2012/12/22/2828983.html

1. Probability density function

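Definition: for a continuous random variable X, if there is a function f(x) that satisfies the following conditions, then f(x) is called the probability density function of X:

$$f(x) \ge 0, \qquad \int_{-\infty}^{\infty} f(x)\,dx = 1, \qquad P(a < X \le b) = \int_a^b f(x)\,dx$$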

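Interpretation: the density describes how likely the random variable is to fall near a given point. It is not itself a probability; it is the instantaneous rate of increase of the cumulative distribution function F(x) = P(X ≤ x), defined as a derivative:

$$f(x) = \frac{dF(x)}{dx}$$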
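As a rough check of these properties, the sketch below uses the standard normal density from R's dnorm(). The grid, the interval endpoints a and b, the point x0, and the step h are illustrative choices rather than anything from the original article.

```r
# Density of the standard normal distribution, evaluated with dnorm().
f <- function(x) dnorm(x, mean = 0, sd = 1)

# Condition 1: the density is non-negative (checked on a grid of points).
grid <- seq(-5, 5, by = 0.1)
all_nonnegative <- all(f(grid) >= 0)

# Condition 2: the density integrates to 1 over the whole real line.
total_area <- integrate(f, lower = -Inf, upper = Inf)$value

# Condition 3: P(a < X <= b) is the area under f between a and b.
a <- -1
b <- 1
prob_ab <- integrate(f, lower = a, upper = b)$value

# The density is the derivative of the CDF: compare a central finite
# difference of pnorm() with dnorm() at one point.
x0 <- 0.5
h <- 1e-5
numeric_slope <- (pnorm(x0 + h) - pnorm(x0 - h)) / (2 * h)

print(all_nonnegative)
print(total_area)
print(prob_ab)
print(c(numeric_slope, f(x0)))
```

The two numbers in the last line should agree to several decimal places, which is the sense in which the density is the "instantaneous increase" of the distribution function.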

2. Cumulative distribution function

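Definition: for any random variable X and any given value a, the total probability that X takes a value less than or equal to a is the distribution function of X. The distribution function uniquely determines the distribution of a random variable. Defining formula:

$$F(a) = P(X \le a)$$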

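Properties: (1) boundedness: 0 ≤ F(x) ≤ 1, with F(x) tending to 0 as x tends to -∞ and to 1 as x tends to +∞; (2) monotonicity: F(x) never decreases as x increases; (3) right-continuity: F(x) equals its limit from the right at every point.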

The cumulative distribution function is commonly abbreviated as CDF.
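The three properties can be illustrated numerically. The sketch below uses pnorm() for boundedness and monotonicity, and pbinom() (a discrete distribution not discussed in the original article, chosen here only because its CDF has jumps) for right-continuity. The evaluation points and the offset eps are assumptions for illustration.

```r
# CDF of the standard normal: F(a) = P(X <= a).
a <- c(-Inf, -1.96, 0, 1.96, Inf)
F_a <- pnorm(a)

# Boundedness: every value lies in [0, 1], with limits 0 and 1 at the ends.
bounded <- all(F_a >= 0 & F_a <= 1)

# Monotonicity: F never decreases as a increases.
monotone <- all(diff(F_a) >= 0)

# Right-continuity shows up for discrete variables. The CDF of
# Binomial(3, 0.5) jumps at 1: the value at the jump matches the value
# just to the right of it, not the value just to the left.
eps <- 1e-3
at_jump <- pbinom(1, size = 3, prob = 0.5)
from_right <- pbinom(1 + eps, size = 3, prob = 0.5)
from_left <- pbinom(1 - eps, size = 3, prob = 0.5)

print(F_a)
print(c(bounded, monotone))
print(c(from_left, at_jump, from_right))
```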

3. Quantile function

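Definition: the quantile function is the inverse of the cumulative distribution function. Given a probability p, it returns the value x of the random variable for which P(X ≤ x) = p (the left-tail quantile).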

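Quantile functions in R for four commonly used distributions (N, df1, and df2 denote degrees of freedom):

Standard normal distribution: qnorm(p, mean=0, sd=1)

Student's t distribution: qt(p, df=N, ncp=0)

Chi-squared distribution: qchisq(p, df=N, ncp=0)

Fisher-Snedecor (F) distribution: qf(p, df1, df2, ncp=0)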
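A short sketch of how these four quantile functions might be called. The probabilities and the degrees of freedom (10, 5, and 3 with 5) are arbitrary values picked for illustration; ncp is left out so that the central distributions are used.

```r
p <- c(0.025, 0.5, 0.975)

# Left-tail quantiles of the four distributions listed above.
z <- qnorm(p, mean = 0, sd = 1)   # standard normal
t10 <- qt(p, df = 10)             # central Student t with 10 df
chi5 <- qchisq(p, df = 5)         # central chi-squared with 5 df
f35 <- qf(p, df1 = 3, df2 = 5)    # central F with (3, 5) df

# The quantile function inverts the CDF: applying pnorm() to the
# normal quantiles recovers the original probabilities.
round_trip <- pnorm(z)

print(z)
print(t10)
print(chi5)
print(f35)
print(round_trip)
```

The same round trip holds for the other pairs (pt with qt, pchisq with qchisq, pf with qf) as long as the same parameters are passed to both functions.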

Special case: quartiles

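Definition: quartiles are one kind of quantile used in statistics. Sort all the values in ascending order and divide them into four equal parts; the values at the three cut points are the quartiles.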

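Selection rule: let N be the sample size and y the quantile expressed as a percentage, and set

$$L = N \cdot \frac{y}{100}$$

(1) If L is an integer, take the average of the L-th and (L+1)-th values.

(2) If L is not an integer, round up to the next integer and take that value (for example, 1.2 becomes 2).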
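The selection rule can be sketched in a few lines of R. The function name quantile_by_rule and the sample data are hypothetical, and the position is assumed to be L = N * y / 100. The rule appears to coincide with type = 2 of R's built-in quantile(), which is used here only as a cross-check.

```r
# Hypothetical helper implementing the selection rule:
# x is the sample, y the percentage (strictly between 0 and 100).
quantile_by_rule <- function(x, y) {
  stopifnot(y > 0, y < 100)
  x <- sort(x)
  N <- length(x)
  L <- N * y / 100              # assumed position formula
  if (L == floor(L)) {
    mean(x[c(L, L + 1)])        # L is an integer: average L-th and (L+1)-th
  } else {
    x[ceiling(L)]               # otherwise: round up to the next integer
  }
}

x <- c(1, 3, 5, 7, 9, 11, 13, 15)   # N = 8, illustrative data
q1 <- quantile_by_rule(x, 25)       # L = 2, average of 3 and 5
q2 <- quantile_by_rule(x, 50)       # L = 4, average of 7 and 9
q3 <- quantile_by_rule(x, 75)       # L = 6, average of 11 and 13

# With N = 10 and y = 25, L = 2.5 is rounded up to 3.
q1_ten <- quantile_by_rule(1:10, 25)

# Cross-check against quantile() with type = 2.
builtin <- quantile(x, probs = c(0.25, 0.5, 0.75), type = 2, names = FALSE)

print(c(q1, q2, q3))
print(q1_ten)
print(builtin)
```

Note that quantile() defaults to type = 7, which interpolates linearly and can give different quartiles from this rule on the same data.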

4. Random number function

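Definition: a random number function draws values at random from a given distribution. Its input is the number of draws wanted, and its output is the sampled values of the random variable.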
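A minimal sketch of random number generation with rnorm(). The seed and the sample size are arbitrary. The second half shows inverse transform sampling with runif() and qnorm(), which is one common way to connect random numbers to the quantile function; it is not claimed to be how rnorm() works internally.

```r
set.seed(42)                 # fix the seed so the draws are reproducible
draws <- rnorm(10000, mean = 0, sd = 1)

# With many draws, the sample mean and standard deviation should be
# close to the distribution's parameters (0 and 1 here).
sample_mean <- mean(draws)
sample_sd <- sd(draws)

# Inverse transform sampling: passing uniform random probabilities
# through the quantile function also yields normally distributed values.
u <- runif(10000)
draws_inverse <- qnorm(u)

print(c(sample_mean, sample_sd))
print(c(mean(draws_inverse), sd(draws_inverse)))
```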

5. The four functions (density d, distribution p, quantile q, random r) for several common random variables:

（1）The Normal Distribution

Usage：

dnorm(x, mean = 0, sd = 1, log = FALSE)

pnorm(q, mean = 0, sd = 1, lower.tail = TRUE, log.p = FALSE)

qnorm(p, mean = 0, sd = 1, lower.tail = TRUE, log.p = FALSE)

rnorm(n, mean = 0, sd = 1)

Arguments：

x,q	vector of quantiles.
p	vector of probabilities.
n	number of observations. If length(n) > 1, the length is taken to be the number required.
mean	vector of means.
sd	vector of standard deviations.
log, log.p	logical; if TRUE, probabilities p are given as log(p).
lower.tail	logical; if TRUE (default), probabilities are P[X ≤ x], otherwise, P[X > x].
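The lower.tail and log.p arguments are easiest to understand by example. The sketch below uses illustrative values of q. The last part shows one reason lower.tail = FALSE is useful: far in the tail, subtracting from 1 loses all precision.

```r
q <- 1.96

# Upper-tail probability P[X > q], computed two ways.
upper_direct <- pnorm(q, lower.tail = FALSE)
upper_complement <- 1 - pnorm(q)

# log.p = TRUE returns log(p); qnorm() accepts probabilities on the
# same log scale and maps them back to the quantile.
log_p <- pnorm(q, log.p = TRUE)
back <- qnorm(log_p, log.p = TRUE)

# Far in the tail the direct form keeps a tiny positive value, while
# 1 - pnorm(10) is exactly 0 in double precision.
far_direct <- pnorm(10, lower.tail = FALSE)
far_complement <- 1 - pnorm(10)

print(c(upper_direct, upper_complement))
print(c(exp(log_p), back))
print(c(far_direct, far_complement))
```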

（2）The Chi-squared Distribution

Usage：

dchisq(x, df, ncp=0, log = FALSE)

pchisq(q, df, ncp=0, lower.tail = TRUE, log.p = FALSE)

qchisq(p, df, ncp=0, lower.tail = TRUE, log.p = FALSE)

rchisq(n, df, ncp=0)

Arguments：

x, q	vector of quantiles.
p	vector of probabilities.
n	number of observations. If length(n) > 1, the length is taken to be the number required.
df	degrees of freedom (non-negative, but can be non-integer).
ncp	non-centrality parameter (non-negative).
log, log.p	logical; if TRUE, probabilities p are given as log(p).
lower.tail	logical; if TRUE (default), probabilities are P[X ≤ x], otherwise, P[X > x].
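As a small usage sketch, the chi-squared functions can be checked against two known identities: a chi-squared variable with 1 degree of freedom is the square of a standard normal, and with 2 degrees of freedom the upper tail is exp(-x/2). The statistic 5.2 is a made-up value for illustration.

```r
# 95% critical value with 1 degree of freedom, compared with the
# squared two-sided normal critical value.
crit_chisq <- qchisq(0.95, df = 1)
crit_norm_sq <- qnorm(0.975)^2

# Upper-tail probability (a p-value) for a hypothetical statistic
# with 2 degrees of freedom; for df = 2 this equals exp(-x / 2).
stat <- 5.2
p_value <- pchisq(stat, df = 2, lower.tail = FALSE)

# A few reproducible random draws; all are non-negative.
set.seed(1)
draws <- rchisq(5, df = 3)

print(c(crit_chisq, crit_norm_sq))
print(c(p_value, exp(-stat / 2)))
print(draws)
```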

（3）The F Distribution

Usage：

df(x, df1, df2, ncp, log = FALSE)

pf(q, df1, df2, ncp, lower.tail = TRUE, log.p = FALSE)

qf(p, df1, df2, ncp, lower.tail = TRUE, log.p = FALSE)

rf(n, df1, df2, ncp)

Arguments：

x, q	vector of quantiles.
p	vector of probabilities.
n	number of observations. If length(n) > 1, the length is taken to be the number required.
df1, df2	degrees of freedom. Inf is allowed.
ncp	non-centrality parameter. If omitted the central F is assumed.
log, log.p	logical; if TRUE, probabilities p are given as log(p).
lower.tail	logical; if TRUE (default), probabilities are P[X ≤ x], otherwise, P[X > x].
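A brief sketch of the F functions, with degrees of freedom chosen only for illustration. It relies on two standard relationships: the square of a t variable with k degrees of freedom follows F(1, k), and with df2 = Inf the F quantile is a chi-squared quantile divided by df1. Note that df() here is the F density function, which shares its name with the df argument of the other distributions.

```r
p <- 0.95

# F(1, 10) critical value against the squared two-sided t critical value.
f_crit <- qf(p, df1 = 1, df2 = 10)
t_crit <- qt(1 - (1 - p) / 2, df = 10)

# df2 = Inf is allowed: the quantile reduces to qchisq(p, df1) / df1.
f_inf <- qf(p, df1 = 3, df2 = Inf)
chi_scaled <- qchisq(p, df = 3) / 3

# Upper-tail probability at the critical value recovers 1 - p.
upper_tail <- pf(f_crit, df1 = 1, df2 = 10, lower.tail = FALSE)

# Density of F(3, 5) at a single point.
dens <- df(1, df1 = 3, df2 = 5)

print(c(f_crit, t_crit^2))
print(c(f_inf, chi_scaled))
print(upper_tail)
print(dens)
```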

（4）The Student t Distribution

Usage：

dt(x, df, ncp, log = FALSE)

pt(q, df, ncp, lower.tail = TRUE, log.p = FALSE)

qt(p, df, ncp, lower.tail = TRUE, log.p = FALSE)

rt(n, df, ncp)

Arguments：

x, q	vector of quantiles.
p	vector of probabilities.
n	number of observations. If length(n) > 1, the length is taken to be the number required.
df	degrees of freedom (> 0, maybe non-integer). df = Inf is allowed.
ncp	non-centrality parameter delta; currently except for rt(), only for abs(ncp) <= 37.62. If omitted, use the central t distribution.
log, log.p	logical; if TRUE, probabilities p are given as log(p).
lower.tail	logical; if TRUE (default), probabilities are P[X ≤ x], otherwise, P[X > x].
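To illustrate the df argument, including df = Inf, the sketch below compares two-sided 95% critical values for a few degrees of freedom picked for illustration. ncp is omitted, so the central t distribution is used.

```r
# Critical values shrink toward the normal value as df grows.
dfs <- c(5, 30, Inf)
t_crit <- qt(0.975, df = dfs)
z_crit <- qnorm(0.975)

# The central t distribution is symmetric about 0, so the lower tail
# at -x equals the upper tail at x.
x <- 2
lower_at_minus_x <- pt(-x, df = 5)
upper_at_x <- pt(x, df = 5, lower.tail = FALSE)

print(t_crit)
print(z_crit)
print(c(lower_at_minus_x, upper_at_x))
```

With df = Inf the t quantile matches the standard normal quantile, which is why normal critical values are a reasonable approximation for large samples.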