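Goal: overlay summary statistics on the raw scatterplot of age versus friend_count
1. Change the color of the raw data points so they stay readable once the summary lines are drawn on top
Load the data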
getwd()
library('ggplot2')
pf <- read.csv('pseudo_facebook.tsv',sep = '\t')
Change the color
ggplot(aes(x=age,y=friend_count),data=pf)+
geom_point(alpha=1/20,
position = position_jitter(h=0),
color='orange')+
xlim(13,90)+
coord_trans(y="sqrt")
2. Overlay the mean friend count for each age on the raw data
> ggplot(aes(x=age,y=friend_count),data=pf)+
+ geom_point(alpha=1/20,
+ position = position_jitter(h=0),
+ color='orange')+
+ xlim(13,90)+
+ coord_trans(y="sqrt")+
+ geom_line(stat = 'summary',fun.y=mean)
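To make the summary layer less of a black box, here is a minimal sketch of the number it computes: one mean of friend_count per distinct age. The toy data frame is made up for illustration and only stands in for pf. Note also that in ggplot2 3.3.0 and later the fun.y argument is deprecated in favor of fun, so newer installations will print a deprecation message for the calls in these notes.

```r
# Toy data standing in for pf (values are invented for illustration).
toy <- data.frame(
  age          = c(13, 13, 14, 14, 14, 15),
  friend_count = c(10, 30, 0, 60, 90, 200)
)

# Mean friend_count for each age: the per-x summary that
# geom_line(stat = 'summary', fun.y = mean) connects with a line.
mean_by_age <- aggregate(friend_count ~ age, data = toy, FUN = mean)
print(mean_by_age)
```

Each row of mean_by_age corresponds to one point on the black mean line in the plot.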
3. Show more detail by adding more summaries
- Add the 10th percentile
> ggplot(aes(x=age,y=friend_count),data=pf)+
+ geom_point(alpha=1/20,
+ position = position_jitter(h=0),
+ color='orange')+
+ xlim(13,90)+
+ coord_trans(y="sqrt")+
+ geom_line(stat = 'summary',fun.y=mean)+
+ geom_line(stat='summary',fun.y=quantile,probs = .1)
**Warning: Ignoring unknown parameters: probs**
Warning messages:
1: Removed 4906 rows containing non-finite values (stat_summary).
2: Removed 4906 rows containing non-finite values (stat_summary).
3: Removed 5200 rows containing missing values (geom_point).
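Note: ggplot2 2.0.0 changed how extra arguments are passed to the summary function when using stat = 'summary'. Arguments meant for the function given in fun.y must now be wrapped in the fun.args parameter, for example:
geom_line(stat = 'summary', fun.y = quantile, fun.args = list(probs = .9), ...)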
> ggplot(aes(x=age,y=friend_count),data=pf)+
+ geom_point(alpha=1/20,
+ position = position_jitter(h=0),
+ color='orange')+
+ xlim(13,90)+
+ coord_trans(y="sqrt")+
+ geom_line(stat = 'summary',fun.y=mean)+
+ geom_line(stat='summary',fun.y=quantile,fun.args=list(probs = .1))
Warning messages:
1: Removed 4906 rows containing non-finite values (stat_summary).
2: Removed 4906 rows containing non-finite values (stat_summary).
3: Removed 5182 rows containing missing values (geom_point).
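As a rough base-R analogy for what fun.args does, the sketch below applies quantile to each age group and forwards probs = .1 to it, which is the role list(probs = .1) plays in the call above. The toy data is invented for illustration; tapply is used here only to show the idea and is not how ggplot2 implements the summary internally.

```r
# Toy data: two ages with eleven friend counts each (invented values).
toy <- data.frame(
  age          = rep(c(13, 14), each = 11),
  friend_count = c(0:10, seq(0, 100, by = 10))
)

# Apply quantile() within each age group, forwarding probs = .1 to it,
# much like fun.args = list(probs = .1) forwards it inside stat = 'summary'.
p10_by_age <- tapply(toy$friend_count, toy$age, quantile, probs = .1)
print(p10_by_age)
```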
- Change the color and line type
> ggplot(aes(x=age,y=friend_count),data=pf)+
+ geom_point(alpha=1/20,
+ position = position_jitter(h=0),
+ color='orange')+
+ xlim(13,90)+
+ coord_trans(y="sqrt")+
+ geom_line(stat = 'summary',fun.y=mean)+
+ geom_line(stat='summary',fun.y=quantile,fun.args=list(probs = .1),
+ linetype=2,color='blue')
Warning messages:
1: Removed 4906 rows containing non-finite values (stat_summary).
2: Removed 4906 rows containing non-finite values (stat_summary).
3: Removed 5183 rows containing missing values (geom_point).
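A likely reading of these warnings: xlim(13, 90) turns every row whose age falls outside the limits into a missing value, and each summary layer then drops those rows before computing its statistic, which would explain why both stat_summary warnings report the same count. The geom_point count is a little higher and changes between runs (5200, 5182, 5183), which is consistent with random jitter pushing some points at the edge ages past the limits. The sketch below only illustrates the counting step on invented data; it does not reproduce the numbers from pf.

```r
# Toy ages, some of them outside the plotted range (invented values).
toy <- data.frame(age = c(12, 13, 50, 90, 95, 103, 113))

# Rows that xlim(13, 90) would treat as out of range.
outside <- toy$age < 13 | toy$age > 90
n_outside <- sum(outside)
print(n_outside)
```

Running the same test on pf$age is a quick way to check whether the count matches the number in the stat_summary warnings.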
- Add the 90th percentile
> ggplot(aes(x=age,y=friend_count),data=pf)+
+ geom_point(alpha=1/20,
+ position = position_jitter(h=0),
+ color='orange')+
+ xlim(13,90)+
+ coord_trans(y="sqrt")+
+ geom_line(stat = 'summary',fun.y=mean)+
+ geom_line(stat='summary',fun.y=quantile,fun.args=list(probs = .1),
+ linetype=2,color='blue')+
+ geom_line(stat='summary',fun.y=quantile,fun.args=list(probs = .9),
+ linetype=2,color='blue')
- Add the 50th percentile (the median)
> ggplot(aes(x=age,y=friend_count),data=pf)+
+ geom_point(alpha=1/20,
+ position = position_jitter(h=0),
+ color='orange')+
+ xlim(13,90)+
+ coord_trans(y="sqrt")+
+ geom_line(stat = 'summary',fun.y=mean)+
+ geom_line(stat='summary',fun.y=quantile,fun.args=list(probs = .1),
+ linetype=2,color='blue')+
+ geom_line(stat='summary',fun.y=quantile,fun.args=list(probs = .5),
+ color='blue')+
+ geom_line(stat='summary',fun.y=quantile,fun.args=list(probs = .9),
+ linetype=2,color='blue')
Computing a given percentile in R.
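For reference, a minimal sketch of computing percentiles directly with quantile(), the same function passed to fun.y in the plots. The vector x is invented for illustration, and the result uses quantile's default interpolation method.

```r
# The integers 0 to 100, so that each percentile equals its own percentage.
x <- 0:100

# 10th, 50th and 90th percentiles in one call.
q <- quantile(x, probs = c(.1, .5, .9))
print(q)

# The 50th percentile is the median.
print(median(x))
```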
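4. Add a coord_cartesian layer to zoom in on part of the plot
To zoom in, use a coord_cartesian(xlim = c(13, 90)) layer instead of xlim(13, 90).
If you are not familiar with the coord_cartesian() and quantile() functions, read their documentation.
Before zooming in with coord_cartesian, remove the coord_trans and xlim layers: a plot can have only one coordinate system, so coord_cartesian would replace coord_trans, and xlim drops out-of-range rows instead of just zooming.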
> ggplot(aes(x=age,y=friend_count),data=pf)+
+ geom_point(alpha=1/20,
+ position = position_jitter(h=0),
+ color='orange')+
+ coord_cartesian(xlim = c(13,70),ylim = c(0,1000))+
+ geom_line(stat = 'summary',fun.y=mean)+
+ geom_line(stat='summary',fun.y=quantile,fun.args=list(probs = .1),
+ linetype=2,color='blue')+
+ geom_line(stat='summary',fun.y=quantile,fun.args=list(probs = .5),
+ color='blue')+
+ geom_line(stat='summary',fun.y=quantile,fun.args=list(probs = .9),
+ linetype=2,color='blue')
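The difference between zooming and setting scale limits is easiest to see on the computed summaries. The sketch below is an illustration under a few assumptions: the toy data is invented, it uses a y limit to make the effect visible, and it uses the fun argument, which requires ggplot2 3.3.0 or later (the transcripts in these notes use the older fun.y). layer_data() returns the data a layer actually draws.

```r
library(ggplot2)

# Toy data with one extreme friend_count at age 20 (invented values).
toy <- data.frame(
  age          = c(20, 20, 20, 30, 30, 30),
  friend_count = c(10, 20, 3000, 40, 50, 60)
)

base <- ggplot(toy, aes(x = age, y = friend_count)) +
  geom_line(stat = 'summary', fun = mean)

# coord_cartesian only changes the visible window; all rows reach the summary.
zoomed <- layer_data(base + coord_cartesian(ylim = c(0, 1000)))

# ylim sets scale limits; the out-of-range row is dropped before the summary
# (ggplot2 warns about the removed row, silenced here).
clipped <- suppressWarnings(layer_data(base + ylim(0, 1000)))

print(zoomed$y)   # means computed from all rows
print(clipped$y)  # means computed after the row with 3000 was dropped
```

With coord_cartesian the mean at age 20 still includes the extreme value, so the line reflects the full data even though the plot is zoomed in.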