R note(1)

Latest recommended article published 2024-08-16 09:32:24


Article link: https://blog.csdn.net/qq_31095335/article/details/51660261


1. title() adds a main title to the current plot; it is equivalent to plot(main = )

2. paste():

The sep argument sets the separator placed between the arguments being joined:

> paste("1st", "2nd", "3rd", sep = ", ")
[1] "1st, 2nd, 3rd"

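The collapse argument: paste() works element-wise and returns a character vector; collapse then joins the elements of that vector into a single string, separated by the given string.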

> nth
 [1] "1st"  "2nd"  "3rd"  "4th"  "5th"  "6th"  "7th"  "8th"  "9th"  "10th" "11th" "12th"
> paste0(nth)
 [1] "1st"  "2nd"  "3rd"  "4th"  "5th"  "6th"  "7th"  "8th"  "9th"  "10th" "11th" "12th"
> paste0(nth, collapse = ", ")
[1] "1st, 2nd, 3rd, 4th, 5th, 6th, 7th, 8th, 9th, 10th, 11th, 12th"
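
To make the difference between the two arguments concrete, here is a minimal sketch; the strings are made up for illustration and are not from the original notes.

```r
# sep goes between the arguments within each element-wise paste
labels <- paste("item", 1:3, sep = "_")
print(labels)

# collapse then joins the resulting vector into one single string
joined <- paste("item", 1:3, sep = "_", collapse = " | ")
print(joined)
```

With sep alone the result still has three elements; adding collapse reduces it to a vector of length one.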

3. Use lines() to add a curve to an existing plot; calling plot() again would start a new plot

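4. legend():

The position can be given as locator(1), which lets you pick the location with a mouse click;
legend = supplies the label text;
col and pch set the color and point symbol shown in the key;
fill draws filled boxes as the key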

legend(locator(1), fill = 1:3, pch = 1:3, legend = levels(factor))
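
Notes 1, 3 and 4 can be tried together in one short script. This is only a sketch with made-up curves; it uses the keyword position "topright" instead of locator(1) so that it also runs non-interactively.

```r
# Made-up data: three curves on a common x grid
x <- seq(0, 2 * pi, length.out = 50)

plot(x, sin(x), type = "l", col = 1, ylim = c(-1, 1), ylab = "y")
lines(x, cos(x), col = 2)        # added to the existing plot
lines(x, sin(2 * x), col = 3)    # a second added curve
title(main = "Three curves")     # same effect as plot(..., main = )

# A fixed keyword position replaces locator(1) here
legend("topright", legend = c("sin(x)", "cos(x)", "sin(2x)"),
       col = 1:3, lty = 1)
```

In an interactive session, replacing "topright" with locator(1) lets you click where the legend should go.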

5. A function can return a list and give each component of the list a name

stats<- function(x,na.move = FALSE)
{
 ####
  return (list(n = n , mean = m, length = n, sd = sd, skew  = skew, kurt = kurt))
}
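
Since the body of the function above is elided, the following is a separate, self-contained sketch of the same idea. The name describe_vector and the particular skewness and kurtosis formulas are assumptions chosen for illustration, not the original author's code.

```r
# Hypothetical helper: returns several summary statistics as a named list
describe_vector <- function(x, na.rm = FALSE) {
  if (na.rm) x <- x[!is.na(x)]      # optionally drop missing values
  n <- length(x)
  m <- mean(x)
  s <- sd(x)
  skew <- sum((x - m)^3 / s^3) / n      # one common skewness estimate
  kurt <- sum((x - m)^4 / s^4) / n - 3  # excess kurtosis
  list(n = n, mean = m, sd = s, skew = skew, kurt = kurt)
}

res <- describe_vector(c(2, 4, 4, 5, 7, 9))
res$mean      # components are accessed by name
names(res)
```

Naming the components means callers can write res$sd instead of remembering positions such as res[[3]].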

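6. Both Hmisc and psych provide describe(). To call the version from the package that has been masked, prefix the function with the package name and ::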

library(psych)

library(Hmisc)

Attaching package: ‘Hmisc’

The following object is masked from ‘package:psych’:

    describe

psych::describe()
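
The same mechanism can be tried without installing either package. In this sketch a locally defined function (an assumption made purely for demonstration) masks stats::sd, and the package::function form still reaches the original.

```r
# Define a local function that masks stats::sd in the current environment
sd <- function(x) "masked version"

masked    <- sd(c(1, 2, 3))         # calls the local definition
qualified <- stats::sd(c(1, 2, 3))  # explicitly calls the one in stats

print(masked)
print(qualified)

rm(sd)                              # remove the local definition again
```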

7. Ways to extract elements from a data.frame

mtcars$mpg (returns a vector) or mtcars["mpg"] (returns a one-column data.frame)
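
A quick check of what each form returns, using the built-in mtcars data. The [[ ]] form is added here for comparison and is not in the original note.

```r
v  <- mtcars$mpg        # numeric vector
d  <- mtcars["mpg"]     # data.frame with a single column
v2 <- mtcars[["mpg"]]   # same vector as mtcars$mpg

class(v)
class(d)
identical(v, v2)
```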

8. Besides summary(), describe() can also be used to get simple descriptive statistics of the data

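9. Functions for computing statistics by group

aggregate():
- by = list(): the grouping variables (must be given as a list)
- FUN = the function to apply
  It is best suited to functions that return a single value, such as mean or sd
by():
- INDICES: the grouping factor(s)
- FUN: can be any function
melt() and cast() from the reshape package:
melt:
- id.vars: the identifier variables that define the groups
- measure.vars: the measured variables; their names are stacked into the variable column and their values into the value column
cast:
- formula: var1 + var2 + var3 ~ . The left-hand side lists columns of the melted data (variable can be one of them) that define the rows of the result; the right-hand side lists the columns to spread across, with . meaning none
- fun.aggregate: the function used to aggregate the values in each cell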
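
The two base-R functions in this note can be compared on the built-in mtcars data. This is a minimal sketch; the choice of columns and of range as the multi-value function is arbitrary.

```r
# aggregate(): one summary value per group and column
agg <- aggregate(mtcars[c("mpg", "hp")],
                 by = list(cyl = mtcars$cyl),
                 FUN = mean)
print(agg)

# by(): the function receives the whole sub-data.frame of each group,
# so it can return something richer than a single number
res <- by(mtcars[c("mpg", "hp")], INDICES = mtcars$cyl,
          FUN = function(d) sapply(d, range))
print(res)
```

aggregate() gives back a tidy data.frame, while by() returns one result object per group, which is handy when the summary is itself a vector or matrix.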

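10. Building contingency tables:

table() is the usual choice, combined with:
- margin.table(): margin = 1 gives the totals for each level of the first classifying variable, 2 for the second
- prop.table():
  with no margin argument, each cell is a proportion of the grand total
  with margin = 1 or 2, proportions are computed within each row or column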

> T<-table(Treatment,Sex)
> margin.table(T,1)
Treatment
Placebo Treated 
     43      41 
> margin.table(T,2)
Sex
Female   Male 
    59     25 
> prop.table(T)
         Sex
Treatment    Female      Male
  Placebo 0.3809524 0.1309524
  Treated 0.3214286 0.1666667
> prop.table(T,1)
         Sex
Treatment    Female      Male
  Placebo 0.7441860 0.2558140
  Treated 0.6585366 0.3414634
> addmargins(prop.table(T))
         Sex
Treatment    Female      Male       Sum
  Placebo 0.3809524 0.1309524 0.5119048
  Treated 0.3214286 0.1666667 0.4880952
  Sum     0.7023810 0.2976190 1.0000000
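
The transcript above relies on Treatment and Sex being available in the session. As a self-contained sketch, the same calls can be run on the built-in mtcars data; the choice of cyl and gear is arbitrary.

```r
tab <- table(cyl = mtcars$cyl, gear = mtcars$gear)

margin.table(tab, 1)   # totals for each level of cyl
margin.table(tab, 2)   # totals for each level of gear
prop.table(tab)        # each cell divided by the grand total
prop.table(tab, 1)     # each cell divided by its row total
addmargins(tab)        # counts with row and column sums appended
```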

The CrossTable() function from the gmodels package is an even better choice:

> CrossTable(Treatment,Sex)


   Cell Contents
|-------------------------|
|                       N |
| Chi-square contribution |
|           N / Row Total |
|           N / Col Total |
|         N / Table Total |
|-------------------------|


Total Observations in Table:  84 


             | Sex 
   Treatment |    Female |      Male | Row Total | 
-------------|-----------|-----------|-----------|
     Placebo |        32 |        11 |        43 | 
             |     0.107 |     0.253 |           | 
             |     0.744 |     0.256 |     0.512 | 
             |     0.542 |     0.440 |           | 
             |     0.381 |     0.131 |           | 
-------------|-----------|-----------|-----------|
     Treated |        27 |        14 |        41 | 
             |     0.112 |     0.265 |           | 
             |     0.659 |     0.341 |     0.488 | 
             |     0.458 |     0.560 |           | 
             |     0.321 |     0.167 |           | 
-------------|-----------|-----------|-----------|
Column Total |        59 |        25 |        84 | 
             |     0.702 |     0.298 |           | 
-------------|-----------|-----------|-----------|

In a single call it reports the counts, the chi-square contributions, and the row, column, and table proportions

11. Testing independence and association in a contingency table:

Use chisq.test() and fisher.test()

> mytable<- table(Treatment,Sex)
> chisq.test(mytable)

    Pearson's Chi-squared test with Yates' continuity correction

data:  mytable
X-squared = 0.38378, df = 1, p-value = 0.5356

> fisher.test(mytable)

    Fisher's Exact Test for Count Data

data:  mytable
p-value = 0.4763
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
 0.5320442 4.3286798
sample estimates:
odds ratio 
  1.500984

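The null hypothesis is that the two variables are independent. Both p-values here (0.5356 and 0.4763) are well above 0.05, so independence is not rejected.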
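
The tests can be reproduced without the original data by rebuilding the 2x2 table from the counts shown in the CrossTable output (32, 11, 27, 14). A minimal sketch, assuming those counts:

```r
counts <- matrix(c(32, 27, 11, 14), nrow = 2,
                 dimnames = list(Treatment = c("Placebo", "Treated"),
                                 Sex = c("Female", "Male")))

chi <- chisq.test(counts)   # Yates' correction is the default for 2x2 tables
fis <- fisher.test(counts)

chi$p.value
fis$p.value

# Conventional 5% level: reject independence only if p < 0.05
chi$p.value < 0.05
```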

12. Correlation coefficients (the example uses Spearman's rank correlation, a nonparametric measure)

states<-state.x77[,1:6]

> colnames(states)
[1] "Population" "Income"     "Illiteracy" "Life Exp"   "Murder"     "HS Grad"   
> cor(states,method = "spearman")
> pcor(c(1,5,2,3,6),cov(states))

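The method argument of cor() selects the type of correlation coefficient; the default is the ordinary (Pearson) correlation.
pcor() from the ggm package computes partial correlation coefficients.
- Its first argument is a vector of column indices: the first two are the variables of interest, and all the remaining ones are the variables whose influence is removed. Its second argument is the covariance matrix.

Partial correlation: the correlation between two variables after the influence of other variables has been removed. For example, in the state.x77 data set the relationship between population and murder rate may partly reflect income, illiteracy, and so on. To study the net relationship between the two, use the partial correlation, because the raw population and murder figures already carry the influence of those other variables.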
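
To see what a partial correlation measures, it can also be computed in base R without the ggm package. The sketch below is not the pcor() implementation; it uses two standard, equivalent routes (the inverse covariance matrix and the correlation of regression residuals) for the same variables as in pcor(c(1,5,2,3,6), cov(states)).

```r
states <- state.x77[, 1:6]
S <- cov(states)

# Variables of interest first, then the ones to control for
idx <- c("Population", "Murder", "Income", "Illiteracy", "HS Grad")

# Route 1: from the inverse of the covariance sub-matrix
P <- solve(S[idx, idx])
pc_inverse <- -P[1, 2] / sqrt(P[1, 1] * P[2, 2])

# Route 2: correlate the residuals left after regressing out the controls
controls <- states[, idx[3:5]]
r_pop <- resid(lm(states[, "Population"] ~ controls))
r_mur <- resid(lm(states[, "Murder"] ~ controls))
pc_resid <- cor(r_pop, r_mur)

c(inverse = pc_inverse, residuals = pc_resid)
```

The two numbers agree up to rounding, which illustrates the "net relationship" reading: what remains between population and murder rate once the linear effect of the controls is taken out of both.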

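13. Next, test whether each correlation coefficient differs significantly from 0 (the steps above only computed the coefficients from one sample) [1]

Use corr.test() from the psych package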

> corr.test(states)
Call:corr.test(x = states)
Correlation matrix 
           Population Income Illiteracy Life Exp Murder HS Grad
Population       1.00   0.21       0.11    -0.07   0.34   -0.10
Income           0.21   1.00      -0.44     0.34  -0.23    0.62
Illiteracy       0.11  -0.44       1.00    -0.59   0.70   -0.66
Life Exp        -0.07   0.34      -0.59     1.00  -0.78    0.58
Murder           0.34  -0.23       0.70    -0.78   1.00   -0.49
HS Grad         -0.10   0.62      -0.66     0.58  -0.49    1.00
Sample Size 
[1] 50
Probability values (Entries above the diagonal are adjusted for multiple tests.) 
           Population Income Illiteracy Life Exp Murder HS Grad
Population       0.00   0.59       1.00      1.0   0.10       1
Income           0.15   0.00       0.01      0.1   0.54       0
Illiteracy       0.46   0.00       0.00      0.0   0.00       0
Life Exp         0.64   0.02       0.00      0.0   0.00       0
Murder           0.01   0.11       0.00      0.0   0.00       0
HS Grad          0.50   0.00       0.00      0.0   0.00       0
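
For a single pair of variables, base R's cor.test() performs the same kind of test without the psych package. A minimal sketch for Population and Murder, whose p-value is unadjusted:

```r
states <- state.x77[, 1:6]

ct <- cor.test(states[, "Population"], states[, "Murder"])

ct$estimate   # sample Pearson correlation
ct$p.value    # unadjusted p-value for H0: correlation = 0
```

Unlike corr.test(), this tests one pair at a time, so no multiple-testing adjustment is applied.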

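[1] R packages contain many functions named *.test that perform significance tests. A statistic computed from a single sample is not enough to decide what the model parameters are; a significance test is needed for that, and this is where the .test functions come in.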
