1. Commonly Used Functions
1.1 options()
Invoking options() with no arguments returns a list with the current values of the options. Note that not all options are set initially. To access the value of a single option, use getOption("width") rather than options("width"); the latter returns a list of length one. getOption(x) returns the current value set for option x, or NULL if the option is unset.
When options() is called to set one or more options, it returns (invisibly) a list with the previous values of the options that were changed.
> getOption('width')
[1] 63
> options('width')
$width
[1] 63
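Because options() returns the previous values when it sets something, a common pattern is to save that return value and pass it back later to restore the old state. The lines below are a minimal sketch of that pattern; the option name "my.hypothetical.option" is invented purely for illustration and is not a real R option.

```r
# Set an option and keep the previous value so it can be restored later.
old <- options(digits = 4)   # 'old' is a list holding the previous value of digits
short_pi <- format(pi)       # format() now uses 4 significant digits: "3.142"
options(old)                 # restore the previous setting

# getOption() accepts a default that is returned when the option is unset.
# "my.hypothetical.option" is an invented name used only for this example.
fallback <- getOption("my.hypothetical.option", default = "not set")
```

Saving and restoring like this keeps a temporary change from leaking into the rest of the session.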
1.2 apply()
apply(X, MARGIN, FUN, ...)
X: an array, including a matrix.
MARGIN: a vector giving the subscripts which the function will be applied over. E.g., for a matrix 1 indicates rows, 2 indicates columns, c(1, 2) indicates rows and columns. Where X has named dimnames, it can be a character vector selecting dimension names.
FUN: the function to be applied.
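x <- cbind(x1 = 3, x2 = c(4:1, 2:5)) # Unlike x <- c(x1 = 3, x2 = c(4:1, 2:5)), which gives a named vector, cbind() gives a matrix.
dimnames(x)[[1]] <- letters[1:8]
# Note the [[1]]: dimnames(x) returns a list, where [[1]] is the row names and [[2]] is the column names.
# letters[1:8] is a b c d e f g h, because letters is a vector containing the 26 lowercase letters.
apply(x, 2, mean, trim = .2)
# Note that x has eight rows, so trim = .2 drops floor(8 * 0.2) = 1 value from each end of every column before averaging.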
apply(x, 2, sort)
# Sort the columns of a matrix. Note that each column is sorted independently: the second column is sorted, but the other column is not reordered to match it.
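To make the role of MARGIN concrete, here is a small sketch on a 2 x 3 matrix (the matrix m is made up for this example). It also shows that arguments after FUN are passed through to FUN, as trim was passed to mean above.

```r
m <- matrix(1:6, nrow = 2)     # 2 x 3 matrix, filled column by column

row_sums <- apply(m, 1, sum)   # MARGIN = 1: one value per row    -> 9 12
col_sums <- apply(m, 2, sum)   # MARGIN = 2: one value per column -> 3 7 11

# Extra arguments after FUN are handed to FUN; here probs goes to quantile().
col_medians <- apply(m, 2, quantile, probs = 0.5)   # 1.5 3.5 5.5
```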
2. Commonly Used Distributions
If you would like to know what distributions are available you can do a search using the commands
help.search("distribution") and help("Distributions")
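Each distribution has four functions, named by adding a one-letter prefix to the distribution's name:
"d" | returns the height of the probability density function |
"p" | returns the cumulative distribution function |
"q" | returns the inverse cumulative distribution function (quantiles) |
"r" | returns randomly generated numbers |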
2.1 Normal Distribution
2.1.1
dnorm(x, mean = 0, sd = 1, log = FALSE) # Density function: returns the value of the density at x; if log = TRUE, the log of the density is returned
> dnorm(0)
[1] 0.3989423
> dnorm(0)*sqrt(2*pi)
[1] 1
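The result dnorm(0)*sqrt(2*pi) = 1 follows from the normal density formula, 1/(sd*sqrt(2*pi)) * exp(-(x - mean)^2 / (2*sd^2)). As a quick check, the sketch below evaluates that formula by hand and compares it with dnorm; the values of x, mu and sigma are arbitrary choices for illustration.

```r
x <- 1.5; mu <- 1; sigma <- 2   # arbitrary illustrative values

# Normal density written out by hand.
by_hand  <- 1 / (sigma * sqrt(2 * pi)) * exp(-(x - mu)^2 / (2 * sigma^2))
built_in <- dnorm(x, mean = mu, sd = sigma)
same_density <- isTRUE(all.equal(by_hand, built_in))            # TRUE

# With log = TRUE, dnorm returns the logarithm of the density.
log_density <- dnorm(x, mean = mu, sd = sigma, log = TRUE)
same_log <- isTRUE(all.equal(log_density, log(built_in)))       # TRUE
```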
2.1.2
pnorm(q, mean = 0, sd = 1, lower.tail = TRUE, log.p = FALSE) # Cumulative distribution function
q is the given number; mean and sd determine the distribution.
If you wish to find the probability that a number is larger than the given number you can use the lower.tail option:
> pnorm(0,lower.tail=FALSE)
[1] 0.5
> pnorm(1,lower.tail=FALSE)
[1] 0.1586553
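The upper-tail probability is simply the complement of the default lower-tail value, and differences of pnorm values give interval probabilities. A short sketch using the standard normal:

```r
upper      <- pnorm(1, lower.tail = FALSE)   # P[X > 1]
complement <- 1 - pnorm(1)                   # same probability via the lower tail

# Probability of falling within one standard deviation of the mean.
within_one_sd <- pnorm(1) - pnorm(-1)        # about 0.6827
```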
2.1.3
The next function we look at is qnorm, which is the inverse of pnorm. You give qnorm a probability, and it returns the number whose cumulative distribution value matches that probability.
qnorm(p, mean = 0, sd = 1, lower.tail = TRUE, log.p = FALSE)
> qnorm(0.5)
[1] 0
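The inverse relationship between qnorm and pnorm can be checked directly. The sketch below round-trips a value through both functions and computes the familiar two-sided 95% critical value; the input 1.5 is an arbitrary choice.

```r
p <- pnorm(1.5)            # cumulative probability at 1.5
q <- qnorm(p)              # recovers 1.5, up to floating-point error

z975 <- qnorm(0.975)       # about 1.96
# The same quantile expressed through the upper tail.
z975_upper <- qnorm(0.025, lower.tail = FALSE)
```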
2.1.4
rnorm(n, mean = 0, sd = 1) # Generates n random numbers from N(mean, sd^2); N(0, 1) by default
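Since rnorm draws random numbers, its output changes on every call unless the random number generator is seeded. A minimal sketch, with an arbitrary seed and arbitrary mean and sd:

```r
set.seed(42)                              # arbitrary seed, for reproducibility
draws <- rnorm(1000, mean = 10, sd = 2)

sample_mean <- mean(draws)                # should be close to 10
sample_sd   <- sd(draws)                  # should be close to 2

# Re-seeding with the same value reproduces exactly the same draws.
set.seed(42)
again <- rnorm(1000, mean = 10, sd = 2)
reproducible <- identical(draws, again)   # TRUE
```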
2.1.5 Arguments
log, log.p: logical; if TRUE, probabilities p are given as log(p).
lower.tail: logical; if TRUE (default), probabilities are P[X ≤ x], otherwise P[X > x].
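One situation where log.p can help is far in the tails, where the probability itself is too small to represent as a double. The sketch below compares the two forms; the cutoff -40 is just an illustrative value chosen to be extreme.

```r
# For moderate arguments, log.p = TRUE agrees with taking the log afterwards.
log_p <- pnorm(1, log.p = TRUE)
agrees <- isTRUE(all.equal(log_p, log(pnorm(1))))   # TRUE

# Far in the tail the probability underflows to 0, so log() of it would be -Inf,
# while the log-scale result is still finite.
tiny     <- pnorm(-40)                  # 0 (underflow)
log_tiny <- pnorm(-40, log.p = TRUE)    # finite, roughly -804.6
```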
2.2 T-Distribution
The functions are dt, pt, qt, and rt.
help(TDist)
> x <- seq(-20,20,by=.5)
> y <- dt(x,df=10)
> plot(x,y)
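The t functions take the same kinds of arguments as the normal ones, plus the degrees of freedom df. As a rough illustration of how the t distribution with df = 10 differs from the standard normal, the sketch below compares critical values and tail probabilities.

```r
t_crit <- qt(0.975, df = 10)   # about 2.228
z_crit <- qnorm(0.975)         # about 1.960

# The t distribution has heavier tails, so more mass lies beyond -2.
t_tail <- pt(-2, df = 10)
z_tail <- pnorm(-2)
heavier_tails <- t_tail > z_tail   # TRUE
```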
2.3 The Binomial Distribution
help(Binomial)
> x <- seq(0,50,by=1)
> y <- dbinom(x,50,0.2)
> plot(x,y)
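Because the binomial distribution is discrete, dbinom returns probabilities rather than density heights, and pbinom is the running sum of those probabilities. A small sketch using the same size = 50 and prob = 0.2 as above:

```r
x <- 0:50
y <- dbinom(x, size = 50, prob = 0.2)

total <- sum(y)              # probabilities over all outcomes sum to 1
mode  <- x[which.max(y)]     # most likely number of successes: 10

# The cumulative distribution is the cumulative sum of the probabilities.
matches_cdf <- isTRUE(all.equal(cumsum(y), pbinom(x, size = 50, prob = 0.2)))
```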
2.4 The Chi-Squared Distribution
help(Chisquare)
> x <- seq(-20,20,by=.5)
> y <- dchisq(x,df=10)
> plot(x,y)
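A chi-squared variable is never negative, so dchisq returns 0 for every negative x and the left half of the plot above is flat. As a variation, the sketch below plots only the non-negative range; the upper limit of 40 is an arbitrary choice.

```r
at_negative <- dchisq(-1, df = 10)   # 0: no density below zero

x <- seq(0, 40, by = .5)
y <- dchisq(x, df = 10)
plot(x, y, type = "l")               # draw the density as a line

peak <- x[which.max(y)]              # 8, i.e. df - 2
```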
3. Random samples and permutations
3.1 Random samples
sample(x, size, replace = FALSE, prob = NULL)
replace: Should sampling be with replacement?
prob: A vector of probability weights for obtaining the elements of the vector being sampled.
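The sketch below illustrates how size, replace and prob change what sample returns. The seed, the vectors and the weights are all arbitrary illustrative choices.

```r
set.seed(1)                                   # arbitrary seed, for reproducibility

perm  <- sample(1:10)                         # no size given: a random permutation
draw  <- sample(1:10, size = 3)               # without replacement: 3 distinct values
rolls <- sample(1:6, size = 20, replace = TRUE)   # with replacement: repeats allowed

# prob gives unequal weights; here "H" is nine times as likely as "T".
biased <- sample(c("H", "T"), size = 100, replace = TRUE, prob = c(0.9, 0.1))
```

Note that with replace = FALSE, size cannot exceed the length of x.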
3.2 Combinations and permutations
In this section, we need to calculate the factorial. There are several ways:
(1) factorial(n)
(2) gamma(n+1)
(3) prod(1:n)
> system.time(replicate(1000000,factorial(170)))
   user  system elapsed
   2.33    0.00    2.33
> system.time(replicate(1000000,prod(1:170)))
   user  system elapsed
   2.38    0.02    2.39
> system.time(replicate(1000000,gamma(171)))
   user  system elapsed
   1.42    0.03    1.45
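The three factorial methods agree for positive n, and the factorial is the building block for counting permutations and combinations. The sketch below checks this and uses base R's choose and combn; the values n = 10 and k = 3 are arbitrary.

```r
n <- 10; k <- 3   # arbitrary illustrative values

same_gamma <- isTRUE(all.equal(factorial(n), gamma(n + 1)))   # TRUE
same_prod  <- isTRUE(all.equal(factorial(n), prod(1:n)))      # TRUE

# Caution: 1:0 is c(1, 0), so prod(1:n) gives 0 rather than 1 when n is 0.
zero_case <- prod(1:0)                       # 0, whereas factorial(0) is 1

n_perm <- factorial(n) / factorial(n - k)    # ordered selections: 720
n_comb <- choose(n, k)                       # unordered selections: 120

subsets <- combn(5, 2)                       # all 2-element subsets of 1:5, one per column
```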
References
http://www.cyclismo.org/tutorial/R/probability.html
https://cran.r-project.org/manuals.html