Using R to Fix Data Quality: Section 2

Section 2: Visualizing Variables



Overview

In this section, we look at how to create charts and graphs that give you a quick visual summary of your data.

Dot Plots & Jitter Plots

An easy way to visualize a single variable is to create a dot plot or a jitter plot.

First, read the CSV file and inspect the data, using the approach from Section 1.



> data <- read.csv("weather.csv")

> head(data)


  Ozone Solar.R Wind Temp Month Day
1    41     190  7.4   67     5   1
2    36     118  8.0   72     5   2
3    12     149 12.6   74     5   3
4    18     313 11.5   62     5   4
5    NA      NA 14.3   56     5   5
6    28      NA 14.9   66     5   6 


We can use the $ operator to extract a single column from the table:

> data$Ozone
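
To make the column extraction concrete, here is a minimal sketch. It assumes R's built-in airquality data frame as a stand-in for weather.csv (its first rows match the output shown above, but the CSV file itself is not available here). It shows that `$` returns a plain vector and that the column contains missing values, which matters for the plots that follow.

```r
# Assumption: the built-in airquality data frame stands in for weather.csv.
data <- airquality

ozone <- data$Ozone      # `$` extracts the column as a plain vector

is.numeric(ozone)        # TRUE: the column is a numeric vector
length(ozone)            # one entry per row of the data frame
sum(is.na(ozone))        # number of missing (NA) readings
summary(ozone)           # quartiles, mean, and NA count in one call
```

The plotting functions used in this section drop NA values silently, so it is worth checking how many there are before reading anything into a chart.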

The simplest way to draw a dot plot of that column:

> stripchart(data$Ozone)

To draw a jitter plot instead:

> stripchart(data$Ozone, method="jitter")
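
The difference between the two plots comes down to the `method` argument of stripchart. The sketch below compares its three options; it again assumes the built-in airquality data frame in place of weather.csv, and the axis label and plotting symbol are illustrative choices rather than part of the original example.

```r
# Assumption: airquality stands in for weather.csv.
data <- airquality

# Default method is "overplot": identical values are drawn on top of each other.
stripchart(data$Ozone, xlab = "Ozone")

# "jitter" adds random noise perpendicular to the axis so overlapping points show.
set.seed(1)  # make the random jitter reproducible
stripchart(data$Ozone, method = "jitter", jitter = 0.2, pch = 1, xlab = "Ozone")

# "stack" piles identical values on top of each other instead of adding noise.
stripchart(data$Ozone, method = "stack", pch = 1, xlab = "Ozone")
```

Because the jitter is random, two runs produce slightly different pictures unless the seed is fixed.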


Histograms

Jitter plots work well for small data sets, but they become hard to read when there are many observations. A histogram gives a clearer view: it divides the x-axis into intervals (bins) and counts the observations that fall into each one. As a result, you can see the central tendency of the data.

To draw a histogram:

> hist(data$Ozone)

Try changing the number of breaks (a single number is treated as a suggestion, so the actual number of bins may differ):

> hist(data$Ozone,breaks=2)
> hist(data$Ozone,breaks=100)
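
To see what the `breaks` argument actually does, you can ask hist for the binning without drawing it. This is a minimal sketch, assuming the built-in airquality data frame in place of weather.csv; the variable names h_default, h_coarse and h_fine are made up for illustration.

```r
# Assumption: airquality stands in for weather.csv.
data <- airquality

# plot = FALSE returns the binning without drawing anything.
h_default <- hist(data$Ozone, plot = FALSE)
h_coarse  <- hist(data$Ozone, breaks = 2, plot = FALSE)
h_fine    <- hist(data$Ozone, breaks = 100, plot = FALSE)

h_default$breaks          # bin boundaries chosen by default
h_default$counts          # observations per bin (NA values are ignored)

# A single number for `breaks` is only a suggestion, so check what was used.
length(h_coarse$counts)
length(h_fine$counts)

# Every non-missing observation lands in exactly one bin.
sum(h_default$counts) == sum(!is.na(data$Ozone))
```

Too few bins hide the shape of the distribution, while too many leave most bins nearly empty; comparing the counts makes that trade-off visible.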


Practice Questions

1. What is the central tendency of the Ozone variable?
