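# Setting axis ranges in ggplot2: Data Science 13 | Exploratory Data Analysis, Plotting with ggplot2

## 1. Plotting with ggplot

The ggplot plotting system is built on its own rigorous grammar of graphics.

The grammar states that a statistical graphic is a mapping from data to the aesthetic attributes (color, shape, size) of geometric objects (points, lines, bars).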
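As a minimal sketch of that mapping, the snippet below spells out the three parts for the `mpg` data set that ships with ggplot2. The choice of variables (`displ`, `hwy`, `drv`) is only an illustration, and the object names `p` and `mapped` are made up for this example.

```r
library(ggplot2)

# Data: the mpg data set bundled with ggplot2.
# Aesthetic mapping: displ -> x position, hwy -> y position, drv -> colour.
# Geometric object: points.
p <- ggplot(mpg, aes(x = displ, y = hwy, colour = drv)) +
  geom_point()

# The mapping is stored on the plot object and can be inspected.
mapped <- names(p$mapping)
mapped

print(p)
```

Swapping `geom_point()` for another geom changes the shape drawn while the data and the mapping stay the same.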

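➢ Basic components of a ggplot2 plot

➢ The qplot function

qplot() is similar to plot() in the base system. It is typically used for basic plots such as scatter plots and bar charts, and it exposes little of ggplot's underlying structure. qplot() is deprecated as of ggplot2 3.4.0, so new code should prefer ggplot().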

library(ggplot2)
str(mpg)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame':	234 obs. of  11 variables:
 $ manufacturer: chr  "audi" "audi" "audi" "audi" ...
 $ model       : chr  "a4" "a4" "a4" "a4" ...
 $ displ       : num  1.8 1.8 2 2 2.8 2.8 3.1 1.8 1.8 2 ...
 $ year        : int  1999 1999 2008 2008 1999 1999 2008 1999 1999 2008 ...
 $ cyl         : int  4 4 4 4 6 6 6 4 4 4 ...
 $ trans       : chr  "auto(l5)" "manual(m5)" "manual(m6)" "auto(av)" ...
 $ drv         : chr  "f" "f" "f" "f" ...
 $ cty         : int  18 21 20 21 16 18 18 18 16 20 ...
 $ hwy         : int  29 29 31 30 26 26 27 26 25 28 ...
 $ fl          : chr  "p" "p" "p" "p" ...
 $ class       : chr  "compact" "compact" "compact" "compact" ...

Convert the character variables to factors. The level labels of a factor matter for annotation, because they become the legend and panel labels.

mpg$manufacturer <- as.factor(mpg$manufacturer)
mpg$model <- as.factor(mpg$model)
mpg$trans <- as.factor(mpg$trans)
mpg$drv <- as.factor(mpg$drv)
mpg$fl <- as.factor(mpg$fl)
mpg$class <- as.factor(mpg$class)
str(mpg)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame':	234 obs. of  11 variables:
 $ manufacturer: Factor w/ 15 levels "audi","chevrolet",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ model       : Factor w/ 38 levels "4runner 4wd",..: 2 2 2 2 2 2 2 3 3 3 ...
 $ displ       : num  1.8 1.8 2 2 2.8 2.8 3.1 1.8 1.8 2 ...
 $ year        : int  1999 1999 2008 2008 1999 1999 2008 1999 1999 2008 ...
 $ cyl         : int  4 4 4 4 6 6 6 4 4 4 ...
 $ trans       : Factor w/ 10 levels "auto(av)","auto(l3)",..: 4 9 10 1 4 9 1 9 4 10 ...
 $ drv         : Factor w/ 3 levels "4","f","r": 2 2 2 2 2 2 2 1 1 1 ...
 $ cty         : int  18 21 20 21 16 18 18 18 16 20 ...
 $ hwy         : int  29 29 31 30 26 26 27 26 25 28 ...
 $ fl          : Factor w/ 5 levels "c","d","e","p",..: 4 4 4 4 4 4 4 4 4 4 ...
 $ class       : Factor w/ 7 levels "2seater","compact",..: 2 2 2 2 2 2 2 2 2 2 ...
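The character-to-factor conversion can also be done in one pass over every character column. This is only a sketch of one way to do it: the copy `mpg2` and the helper vector `is_chr` are names introduced here, and working on a copy is a choice made for the example so that the bundled `mpg` stays untouched.

```r
library(ggplot2)

mpg2 <- as.data.frame(mpg)                         # work on a plain data.frame copy
is_chr <- vapply(mpg2, is.character, logical(1))   # TRUE for each character column
mpg2[is_chr] <- lapply(mpg2[is_chr], factor)       # convert those columns to factors

str(mpg2)
levels(mpg2$drv)
```

The numeric and integer columns are left as they are, since only the columns flagged in `is_chr` are replaced.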

Scatter plots with qplot()

library(ggplot2)
qplot(displ, hwy, data = mpg)

qplot(displ, hwy, data = mpg, color = drv)

qplot(displ, hwy, data = mpg, geom = c("point", "smooth"))

Histograms with qplot()

qplot(hwy, data = mpg, fill = drv)  # fill the bars by the levels of drv

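Faceted plots (grids of panels) with qplot()

qplot(displ, hwy, data = mpg, facets = . ~ drv)
qplot(hwy, data = mpg, facets = drv ~ ., binwidth = 2)

The facets argument: with drv on the right-hand side of "~", the separate panels for each level of drv are laid out side by side in one row; with drv on the left-hand side of "~", the panels are stacked in one column.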
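The same row and column layouts can be written with `ggplot()` and `facet_grid()`. The sketch below assumes these calls are reasonable equivalents of the two `qplot()` calls; the names `p_row`, `p_col`, `layout_row`, and `layout_col` are introduced only for this example. The panel layout is read from the built plot to confirm which formula gives a row and which gives a column.

```r
library(ggplot2)

# drv on the right of ~ : one row of panels
p_row <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  facet_grid(. ~ drv)

# drv on the left of ~ : one column of panels
p_col <- ggplot(mpg, aes(hwy)) +
  geom_histogram(binwidth = 2) +
  facet_grid(drv ~ .)

# Each built plot carries a layout table with one line per panel.
layout_row <- ggplot_build(p_row)$layout$layout
layout_col <- ggplot_build(p_col)$layout$layout

layout_row[, c("PANEL", "ROW", "COL")]
layout_col[, c("PANEL", "ROW", "COL")]
```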

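➢ The ggplot function

ggplot() is the core function of the ggplot system. The usual workflow is to initialize a ggplot object first and then add layers to it one at a time.

・Start with a qplot call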

qplot(wt, mpg, data = mtcars, facets = . ~ cyl, geom = c("point", "smooth"), method = "lm")

・Then build the same plot with the more advanced ggplot framework

1. Initialize the plot

head(mtcars, n = 3)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1

g <- ggplot(mtcars, aes(wt, mpg))
summary(g)  # shows the data, the aesthetic mapping, and the faceting
data: mpg, cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb [32x11]
mapping:  x = ~wt, y = ~mpg
faceting: 
    compute_layout: function
    draw_back: function
    draw_front: function
    draw_labels: function
    draw_panels: function
    finish_data: function
    init_scales: function
    map_data: function
    params: list
    setup_data: function
    setup_params: function
    shrink: TRUE
    train_scales: function
    vars: function
    super:  

print(g)  # draws an empty panel, because no layer has been added yet

p <- g + geom_point()
print(p)
g + geom_point()  # or add the layer and print in a single step
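To make the "initialize, then add layers" idea concrete, the short sketch below counts the layers on the plot object before and after `geom_point()` is added. The variable names `n_before` and `n_after` are introduced here for illustration.

```r
library(ggplot2)

g <- ggplot(mtcars, aes(wt, mpg))   # data and mapping only
n_before <- length(g$layers)        # no layers yet, so nothing would be drawn

p <- g + geom_point()               # `+` returns a new plot with one more layer
n_after <- length(p$layers)

# g itself is unchanged, so it can be reused as the base for other plots.
c(before = n_before, after = n_after, base = length(g$layers))
```

Because `g` is not modified by `+`, the later examples can keep starting from the same `g` and attach different layers each time.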

Adding a smooth curve with geom_smooth()

g + geom_point() + geom_smooth()
g + geom_point() + geom_smooth(method = "lm")  # add a linear regression line

Faceted plots with facet_grid()

g + geom_point() + facet_grid(. ~ cyl) + geom_smooth(method = "lm")

g + geom_point(color = "steelblue", size = 4, alpha = 1/2)

g + geom_point(aes(color = cyl), size = 4, alpha = 1/2)
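The two calls above differ in where `color` is given: outside `aes()` it sets one fixed color for every point, while inside `aes()` it maps a variable to color and produces a legend. The sketch below contrasts the two and adds a third variant, `factor(cyl)`, which is not in the original and is shown only as an assumption about what a reader may want: since `cyl` is numeric in `mtcars`, mapping it directly gives a continuous gradient, and wrapping it in `factor()` gives one discrete color per cylinder count.

```r
library(ggplot2)

g <- ggplot(mtcars, aes(wt, mpg))

# Setting: a constant colour for all points, no legend.
p_const <- g + geom_point(color = "steelblue", size = 4, alpha = 1/2)

# Mapping a numeric variable: continuous colour gradient.
p_cont <- g + geom_point(aes(color = cyl), size = 4, alpha = 1/2)

# Mapping a factor (illustrative addition): discrete colours, one per level.
p_disc <- g + geom_point(aes(color = factor(cyl)), size = 4, alpha = 1/2)

# layer_data() returns the data a layer will draw, including resolved colours.
const_colours <- unique(layer_data(p_const)$colour)
disc_colours  <- unique(layer_data(p_disc)$colour)
const_colours
length(disc_colours)
```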

Adding labels with labs()

g + geom_point(aes(color = cyl)) + labs(title = "Automobile Data") +
  labs(x = "weight", y = "Miles Per Gallon")

g + geom_point(aes(color = cyl), size = 2, alpha = 1/2) +
  geom_smooth(size = 4, linetype = 3, method = "lm", se = FALSE)  # se = FALSE hides the confidence band; in ggplot2 >= 3.4.0 use linewidth instead of size for lines

Setting a black-and-white background and the font with theme_bw()

g + geom_point(aes(color = cyl)) + theme_bw(base_family = "Times")  # use the Times font

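## 2. Handling outliers

➢ Outliers in base graphics

testdat is a data frame with columns x and y in which the value at row 50, column 2 (testdat[50,2]) has been replaced by an outlier. In base graphics, ylim restricts the visible range; the line is still drawn toward the outlier and is clipped at the edge of the plot region.

plot(testdat$x, testdat$y, type = "l", ylim = c(-3,3))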

➢ Outliers in ggplot

g <- ggplot(testdat, aes(x = x, y = y))
g + geom_line()

g + geom_line() + ylim(-3, 3)

g + geom_line() + coord_cartesian(ylim = c(-3, 3))

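coord_cartesian() handles the outlier differently from ylim(). The outlier stays in the data, and coord_cartesian() only sets the visible y-axis range to -3 to 3, so the data set is not reduced to the subset that falls inside that range. With ylim(-3, 3), by contrast, values outside the limits are discarded before drawing, which leaves a gap in the line at the outlier.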
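The difference can be checked on the data each plot actually draws. The sketch below is self-contained, so it has to build its own test data: the seed, the use of `rnorm(100)`, and the outlier value of 100 are all assumptions made for this example and are not taken from the original. `layer_data()` is used to pull out the data behind the first layer of each plot.

```r
library(ggplot2)

set.seed(1)                                        # assumed seed, for reproducibility
testdat <- data.frame(x = 1:100, y = rnorm(100))   # assumed: 100 standard-normal draws
testdat[50, 2] <- 100                              # assumed outlier value

g <- ggplot(testdat, aes(x = x, y = y))

p_ylim <- g + geom_line() + ylim(-3, 3)                       # scale limits
p_zoom <- g + geom_line() + coord_cartesian(ylim = c(-3, 3))  # zoom only

# With ylim(), out-of-range y values are turned into NA before drawing.
n_dropped <- sum(is.na(layer_data(p_ylim)$y))

# With coord_cartesian(), the outlier is still present in the drawn data.
max_zoom <- max(layer_data(p_zoom)$y)

n_dropped
max_zoom
```

The same distinction matters for any layer that computes a statistic, such as `geom_smooth()`: with scale limits the fit is computed without the discarded points, while with `coord_cartesian()` it is computed on all of the data.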

## 3. Binning a continuous variable

quantile() finds the cut points, and cut() splits the data into a series of ranges at those points.

cutpoints <- quantile(mtcars$hp, seq(0, 1, length = 11))  # compute the deciles of the data
cutpoints
   0%   10%   20%   30%   40%   50%   60%   70%   80%   90%  100% 
 52.0  66.0  93.4 106.2 110.0 123.0 165.0 178.5 200.0 243.5 335.0 
mtcars$hpdec <- cut(mtcars$hp, cutpoints)  # split at the deciles to create a new factor variable
levels(mtcars$hpdec)
 [1] "(52,66]"    "(66,93.4]"  "(93.4,106]" "(106,110]"  "(110,123]"  "(123,165]"  "(165,178]" 
 [8] "(178,200]"  "(200,244]"  "(244,335]" 
g <- ggplot(mtcars, aes(wt, mpg))  # set up the plot, then add layers and set each option
g + geom_point(alpha = 1/3) +
  facet_wrap(cyl ~ hpdec, nrow = 3, ncol = 5) +
  geom_smooth(method = "lm", se = FALSE, col = "steelblue") +
  theme_bw(base_family = "Avenir", base_size = 10) +
  labs(x = "weight") +
  labs(y = "Miles Per Gallon") +
  labs(title = "Automobile Data")
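One detail of `cut()` is worth checking when the breaks come from `quantile()`: the intervals are open on the left by default, so the observation equal to the minimum falls outside the first interval and becomes `NA`. The base-R sketch below shows this on `mtcars$hp` and the effect of `include.lowest = TRUE`. The variable names `open_low` and `closed_low` are introduced here, and whether the minimum should be kept is a judgment call that depends on the analysis.

```r
hp <- mtcars$hp
cutpoints <- quantile(hp, probs = seq(0, 1, by = 0.1))   # decile cut points

open_low   <- cut(hp, cutpoints)                         # default: (a, b] intervals
closed_low <- cut(hp, cutpoints, include.lowest = TRUE)  # first interval becomes [a, b]

n_na_open   <- sum(is.na(open_low))    # observations lost at the minimum
n_na_closed <- sum(is.na(closed_low))

c(open = n_na_open, closed = n_na_closed)
table(closed_low)
```

In a faceted plot, rows with an `NA` bin end up in their own `NA` panel, so this choice also changes how many panels appear.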
