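Multi-dimensional visualisation in R (showing multiple dimensions in a 2D ggplot chart)

Reposted from http://www.edvancer.in/create-a-multi-dimensional-visualisation-in-r/

The gist: starting from a 2D plot, use different symbols, colours, sizes and so on to represent additional dimensions.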


Posted in Blog, R tips and tutorials on 10/04/2015

The aim of any visualisation is to gain insight into the data, and that insight should by no means be limited to two factors at a time, because real-life processes always involve multiple factors. The challenge is that a traditional scatter plot can be scaled to at most 3 dimensions; beyond that, it becomes impossible to add more axes. But I don’t agree that the inability to add more axes restricts the number of dimensions you can show in a scatter plot. Contrary to general belief, visualisation on a 2D plane is not restricted to just two dimensions.

Let me give you a simple example using the “mtcars” data set in R. Those who are not really interested in programming can ignore the code. The mtcars data set records mileage and ten other characteristics for 32 cars, which lets us explore how various factors relate to the mileage of a car. Here is a quick look at the data.

library(knitr)  # provides kable()
kable(head(mtcars,3))

 

               mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4     21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710    22.8   4  108  93 3.85 2.320 18.61  1  1    4    1

 

We’ll start with a simple 2D plot depicting how mileage (the mpg column) varies with the weight of the vehicle (the wt column).

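library(ggplot2)
# mtcars stores mileage as mpg and weight as wt (in 1000 lbs); label the axes to match the text
p=ggplot(mtcars,aes(y=mpg,x=wt))+labs(y="Mileage",x="Weight")
p+geom_point(size=4)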

[Figure: scatter plot of Mileage against Weight]
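As a rough numeric check of the trend in this first plot, the correlation between the two plotted columns can be computed directly. This is a minimal sketch that is not part of the original post; it uses the raw mtcars column names mpg and wt, and the variable name r is just an illustrative choice.

```r
# Pearson correlation between mileage (mpg) and weight (wt) in mtcars
r <- cor(mtcars$mpg, mtcars$wt)

# A negative value means mileage tends to fall as weight rises
print(r)
```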

As is apparent from the plot, mileage goes down as weight increases. Now let’s add another dimension and see how the type of transmission affects mileage.

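# am is coded 0 = automatic, 1 = manual in mtcars; factor() turns it into a labelled category
p=ggplot(mtcars,aes(y=mpg,x=wt,color=factor(am,labels=c("Automatic","Manual"))))
p+geom_point(size=4)+labs(y="Mileage",x="Weight",color="Transmission")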

[Figure: Mileage against Weight, coloured by Transmission]

We can see that most of the cars with manual transmission tend to have higher mileage (in mtcars, am is coded 0 for automatic and 1 for manual). One thing to note here is that most of the heavy cars have automatic transmission, so weight might be the real underlying reason for manual cars having higher mileage.
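The possible confounding between weight and transmission can be checked by averaging both measurements per transmission type. The following is a minimal sketch, not code from the original post: the data frame car_data and its column names are illustrative, and the labels assume the coding given on the mtcars help page (am: 0 = automatic, 1 = manual).

```r
# Copy the two measurements and add a readable transmission label.
# Assumption: labels follow the mtcars help page (am: 0 = automatic, 1 = manual).
car_data <- data.frame(
  Mileage = mtcars$mpg,
  Weight = mtcars$wt,
  Transmission = factor(mtcars$am, levels = c(0, 1),
                        labels = c("Automatic", "Manual"))
)

# Mean mileage and mean weight per transmission type
by_trans <- aggregate(cbind(Mileage, Weight) ~ Transmission,
                      data = car_data, FUN = mean)
print(by_trans)
```

If the heavier group is also the lower-mileage group, weight is a plausible explanation for at least part of the difference between the two transmission types.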

OK, now let’s add one more dimension to find out how the number of gears varies across these vehicles.

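# gear is numeric in mtcars; factor() makes it discrete so scale_size_discrete() applies
p=ggplot(mtcars,aes(y=mpg,x=wt,color=factor(am,labels=c("Automatic","Manual")),size=factor(gear)))
p+geom_point()+scale_size_discrete(range = c(4,6))+
  labs(y="Mileage",x="Weight",color="Transmission",size="Gears")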

[Figure: Mileage against Weight, coloured by Transmission, sized by Gears]

You can see that the number of gears shows no simple relationship with mileage, since each gear count is spread over a wide range of mileage values; the same goes for weight. But a curious thing to observe here is that cars with manual transmission have more gears than cars with automatic transmission (in mtcars, am is coded 0 for automatic and 1 for manual). In fact, there seems to be a limit on the number of gears in the automatic cars: none of them has more than 4.
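The relationship between transmission type and number of gears can also be read off a simple cross-tabulation. This is a minimal sketch rather than code from the original post; the names transmission and gear_counts are illustrative, and the labels assume the coding given on the mtcars help page (am: 0 = automatic, 1 = manual).

```r
# Cross-tabulate transmission type against number of gears.
# Assumption: labels follow the mtcars help page (am: 0 = automatic, 1 = manual).
transmission <- factor(mtcars$am, levels = c(0, 1),
                       labels = c("Automatic", "Manual"))
gear_counts <- table(Transmission = transmission, Gears = mtcars$gear)
print(gear_counts)
```

A zero cell in the table marks a combination of transmission and gear count that does not occur in the data.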

Let’s add one more dimension depicting the number of cylinders in the engine.

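# Map raw mtcars columns; factor() makes am, gear and cyl discrete (shape needs a discrete variable)
p=ggplot(mtcars,aes(y=mpg,x=wt,color=factor(am,labels=c("Automatic","Manual")),size=factor(gear),shape=factor(cyl)))
p+geom_point()+scale_size_discrete(range = c(4,6))+
  labs(y="Mileage",x="Weight",color="Transmission",size="Gears",shape="Cylinders")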

[Figure: Mileage against Weight, coloured by Transmission, sized by Gears, with point shape by Cylinders]

You can see that the number of cylinders certainly seems to have an effect on mileage. Low-mileage, heavy cars tend to have 8 cylinders, whereas high-mileage, light cars tend to have 4 cylinders.
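The pattern across cylinder counts can be summarised with group means. This is a minimal sketch, not part of the original post; it works on the raw mtcars columns mpg, wt and cyl, and by_cyl is just an illustrative name.

```r
# Mean mileage (mpg) and mean weight (wt) for each cylinder count
by_cyl <- aggregate(cbind(mpg, wt) ~ cyl, data = mtcars, FUN = mean)
print(by_cyl)
```

The rows come out ordered by cylinder count, so the mpg and wt columns can be read top to bottom to see how mileage and weight change as cylinders increase.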

If you have noticed, we now have 5 dimensions in a 2D plot. The lesson here is that visualising multiple factors [dimensions] is not really about making an n-D [impossible!] plot; it’s not feasible to add more axes to your plot. What we can do, however, is give more features to our “points”, which is exactly what we have done here. The 3 additional dimensions were introduced by adding features like shape, size and color to our points.
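The whole idea can be condensed into one self-contained sketch. The original post does not show how its readable column names were created, so the preparation step below is an assumption: the data frame car_data and its columns are hypothetical, and the transmission labels follow the coding on the mtcars help page (am: 0 = automatic, 1 = manual).

```r
library(ggplot2)

# Hypothetical prep step: give mtcars columns the readable names used in the text.
# Assumption: am is coded 0 = automatic, 1 = manual (per the mtcars help page).
car_data <- data.frame(
  Mileage = mtcars$mpg,
  Weight = mtcars$wt,
  Transmission = factor(mtcars$am, levels = c(0, 1),
                        labels = c("Automatic", "Manual")),
  Gears = factor(mtcars$gear),
  Cylinders = factor(mtcars$cyl)
)

# Two axes plus three point features give five dimensions in one 2D plot
p <- ggplot(car_data, aes(x = Weight, y = Mileage, color = Transmission,
                          size = Gears, shape = Cylinders)) +
  geom_point() +
  scale_size_discrete(range = c(4, 6))
print(p)
```

Converting gear and cyl to factors is a deliberate choice here: shape can only be mapped to a discrete variable, and scale_size_discrete() expects one too. ggplot2 may warn that mapping size to a discrete variable is not advised; the plot is still drawn.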

That’s what I wanted to convey: let your imagination [and Mr Hadley Wickham, author of ggplot2] take you beyond those dimensionality constraints! Happy plotting in R!

