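[This article is for teaching purposes]
Basic Statistical Graphs
The most basic way to examine categorical and quantitative data is with graphs, which let you:
1. visualize the distribution of a variable;
2. compare groups on an outcome variable.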
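# Bar plots
A bar plot displays the distribution (frequencies) of a categorical variable through vertical or horizontal bars. The simplest usage of barplot() is:
barplot(height)
where height is a vector or a matrix.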
## Simple bar plots (height is a vector)
library(vcd) # provides the Arthritis data set
## Loading required package: grid
counts <- table(Arthritis$Improved)
counts
##
## None Some Marked
## 42 14 28
par(mfrow=c(1,2))
barplot(counts, main = "Simple Bar Plot", xlab = "Improvement", ylab = "Frequency") # vertical bar plot
barplot(counts, main = "Horizontal Bar Plot", xlab = "Frequency", ylab = "Improvement", horiz = TRUE) # horizontal bar plot
![c2776b8269d65a9d606fa355d63421b4.png](https://i-blog.csdnimg.cn/blog_migrate/20e973e3439810ee27f26ea90b9ca2ad.png)
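## Stacked and grouped bar plots (height is a matrix)
If **beside=FALSE** (the default), each column of the matrix produces one bar in the plot, and the values in that column give the heights of the stacked "sub-bars".
If **beside=TRUE**, each column of the matrix represents a group, and the values in that column are drawn side by side rather than stacked.
library(vcd) # provides the Arthritis data set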
counts <- table(Arthritis$Improved, Arthritis$Treatment)
counts
##
## Placebo Treated
## None 29 13
## Some 7 7
## Marked 7 21
par(mfrow=c(1,2),pin = c(3,3))
barplot(counts, main = "Stacked Bar Plot", xlab = "Treatment", ylab = "Frequency", col = c("red", "yellow", "green"),legend = rownames(counts)) # stacked bar plot
barplot(counts, main = "Grouped Bar Plot", xlab = "Treatment", ylab = "Frequency", col = c("red", "yellow", "green"),legend = rownames(counts), beside = TRUE) # grouped bar plot
![f2f1040a03b5524900ebc420a1e3e0ba.png](https://i-blog.csdnimg.cn/blog_migrate/9742c1c8b2341dad282117b377d7318b.png)
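As a quick check on how `beside` changes the layout, the sketch below rebuilds the Improved-by-Treatment table by hand from the counts printed above, so it runs without vcd, and inspects the bar midpoints that barplot() returns. The names `mp_stacked` and `mp_grouped` are illustrative and not part of the original example.

```r
# Rebuild the Improved-by-Treatment table (counts copied from the output above).
counts <- matrix(c(29, 7, 7, 13, 7, 21), nrow = 3,
                 dimnames = list(Improved = c("None", "Some", "Marked"),
                                 Treatment = c("Placebo", "Treated")))

# Stacked: one bar per column, so barplot() returns one midpoint per column.
mp_stacked <- barplot(counts, main = "Stacked")

# Grouped: one bar per cell, so the midpoints come back as a matrix
# with one row per category and one column per group.
mp_grouped <- barplot(counts, beside = TRUE, main = "Grouped")

length(mp_stacked)  # number of stacked bars
dim(mp_grouped)     # bars per group, number of groups
colSums(counts)     # total height of each stacked bar
```

The returned midpoints are also what you would pass to text() if you wanted to write the counts on top of the bars.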
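Note: a spinogram rescales a stacked bar plot so that every bar has height 1 and each segment's height represents a proportion.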
library(vcd)
attach(Arthritis)
counts <- table(Treatment, Improved)
par(pin = c(3.5,3.5))
spine(counts, main = "Spinogram Example")
![9ddbcf3f18177253ac1ce246952c7070.png](https://i-blog.csdnimg.cn/blog_migrate/2aab9e4cca2386834131d1ec7d41cb48.png)
detach(Arthritis)
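The rescaling idea can be approximated in base R by converting the counts to row proportions before plotting. This is only a sketch of the concept, not how vcd implements spine(); the table is rebuilt by hand from the counts shown earlier, and `props` is an illustrative name.

```r
# Treatment-by-Improved table (counts copied from the earlier output).
counts <- matrix(c(29, 13, 7, 7, 7, 21), nrow = 2,
                 dimnames = list(Treatment = c("Placebo", "Treated"),
                                 Improved = c("None", "Some", "Marked")))

# Convert each row to proportions so every treatment group sums to 1.
props <- prop.table(counts, margin = 1)

# barplot() stacks the values within each column, so transpose to get one bar
# per treatment. Bar widths are set proportional to the group sizes.
barplot(t(props), width = rowSums(counts),
        main = "Rescaled Stacked Bar Plot",
        xlab = "Treatment", ylab = "Proportion",
        legend.text = colnames(props))

rowSums(props)  # each bar has total height 1
```

Because every bar now has the same height, differences in the share of each outcome between the two groups are easier to compare than in the raw stacked plot.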
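## Bar plots of means
Bar plots need not be based on counts or frequencies. You can also apply an aggregation function and pass the result to barplot() to create bar plots of means, medians, standard deviations, and so on.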
states <- data.frame(state.region, state.x77)
means <- aggregate(states$Illiteracy, by = list(state.region), FUN = mean)
means <- means[order(means$x), ] # sort the means in ascending order
means
## Group.1 x
## 3 North Central 0.700000
## 1 Northeast 1.000000
## 4 West 1.023077
## 2 South 1.737500
barplot(means$x, names.arg=means$Group.1) # names.arg sets the bar labels
title("Mean Illiteracy Rate")
![86561ce1e4f8601cd5cff68f94ae35e3.png](https://i-blog.csdnimg.cn/blog_migrate/6a1b62cace48e3e6adab22f441cb8063.png)
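The same pattern works for other summary statistics. As a minimal sketch, the block below swaps in tapply() and median for aggregate() and mean, using the built-in state.x77 and state.region data; the variable names `illiteracy` and `medians` are illustrative.

```r
# Illiteracy rates from the built-in state.x77 matrix.
illiteracy <- state.x77[, "Illiteracy"]

# tapply() returns one median per region, named by region; sort ascending.
medians <- sort(tapply(illiteracy, state.region, median))

# The names of the sorted result become the bar labels.
barplot(medians, main = "Median Illiteracy Rate", ylab = "Illiteracy (%)")
```

Since tapply() returns a named result, no separate names.arg is needed here; aggregate() is the better fit when you want the result as a data frame.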
## Tweaking bar plots
par(mar = c(5, 8, 4, 2))
par(las = 2) # rotate the axis labels
counts <- table(Arthritis$Improved)
barplot(counts, main = "Treatment Outcome", horiz = TRUE, cex.names = 0.8, names.arg = c("No Improvement", "Some Improvement", "Marked Improvement")) # cex.names shrinks the label font
![08a999f1a084d89294b46767f3462214.png](https://i-blog.csdnimg.cn/blog_migrate/c2b5715aa3e93d449223ee398fb1abbd.png)
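Settings changed with par() stay in effect for later plots on the same device. One common way to keep such tweaks local, sketched here with the counts copied from the earlier Arthritis$Improved output, is to save the value par() returns and restore it afterwards. The name `old_par` is illustrative.

```r
# Counts copied from the Arthritis$Improved table shown earlier.
counts <- c(None = 42, Some = 14, Marked = 28)

# par() returns the previous values of the settings it changes.
old_par <- par(mar = c(5, 8, 4, 2), las = 2)

barplot(counts, main = "Treatment Outcome", horiz = TRUE, cex.names = 0.8,
        names.arg = c("No Improvement", "Some Improvement", "Marked Improvement"))

# Restore the earlier margins and label orientation for subsequent plots.
par(old_par)
```

Without the restore step, the wide left margin and rotated labels would also apply to every plot drawn afterwards.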
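# Pie charts
A pie chart serves the same purpose as a bar plot; the angle of each slice is proportional to the corresponding frequency. The basic function is:
pie(x, labels)
where x is a non-negative numeric vector giving the area of each slice, and labels is a character vector of slice labels.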
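As a minimal sketch of this call, the block below draws a pie chart from the Arthritis$Improved counts shown earlier (copied by hand so that it runs without vcd) and adds percentages to the labels. The names `pct` and `slice_labels` are illustrative.

```r
# Counts copied from the Arthritis$Improved table shown earlier.
counts <- c(None = 42, Some = 14, Marked = 28)

# Express each count as a rounded percentage of the total.
pct <- round(100 * counts / sum(counts))

# Build labels such as "None (50%)".
slice_labels <- paste0(names(counts), " (", pct, "%)")

pie(counts, labels = slice_labels, main = "Treatment Outcome")
```

Putting the percentages in the labels helps because slice angles are harder to compare by eye than bar lengths.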
attach(mtcars)
piedata<-table(c