Project summary
This assignment uses data from the UC Irvine Machine Learning Repository, a popular repository for machine learning datasets. In particular, we will be using the “Individual household electric power consumption Data Set” which I have made available on the course web site:
- Dataset: Electric power consumption [20Mb]
- Description: Measurements of electric power consumption in one household with a one-minute sampling rate over a period of almost 4 years. Different electrical quantities and some sub-metering values are available.
The following descriptions of the 9 variables in the dataset are taken from the UCI web site:
- Date: Date in format dd/mm/yyyy
- Time: time in format hh:mm:ss
- Global_active_power: household global minute-averaged active power (in kilowatt)
- Global_reactive_power: household global minute-averaged reactive power (in kilowatt)
- Voltage: minute-averaged voltage (in volt)
- Global_intensity: household global minute-averaged current intensity (in ampere)
- Sub_metering_1: energy sub-metering No. 1 (in watt-hour of active energy). It corresponds to the kitchen, containing mainly a dishwasher, an oven and a microwave (hot plates are not electric but gas powered).
- Sub_metering_2: energy sub-metering No. 2 (in watt-hour of active energy). It corresponds to the laundry room, containing a washing-machine, a tumble-drier, a refrigerator and a light.
- Sub_metering_3: energy sub-metering No. 3 (in watt-hour of active energy). It corresponds to an electric water-heater and an air-conditioner.
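Because Date and Time are stored as text in the dd/mm/yyyy and hh:mm:ss formats described above, they usually need to be converted before they can be used on a plot axis. The following is a minimal sketch of one way to do that in base R. The sample values are made up for illustration and are not rows copied from the dataset, the variable names (date_str, time_str, date_time, date_only) are hypothetical, and the "UTC" time zone is an assumption.

```r
# Made-up sample values in the documented formats (not taken from the dataset)
date_str <- c("16/12/2006", "01/02/2007")   # dd/mm/yyyy
time_str <- c("17:24:00", "00:00:00")       # hh:mm:ss

# Parse the date on its own into a Date object
date_only <- as.Date(date_str, format = "%d/%m/%Y")

# Combine date and time into a single date-time value.
# The time zone is an assumption; the variable descriptions do not state one.
date_time <- as.POSIXct(paste(date_str, time_str),
                        format = "%d/%m/%Y %H:%M:%S",
                        tz = "UTC")

print(date_only)
print(date_time)
```

Note that the day comes first in this format, so "01/02/2007" is 1 February 2007, not 2 January.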
Review criteria
- Was a valid GitHub URL containing a git repository submitted?
- Does the GitHub repository contain at least one commit beyond the original fork?
- Please examine the plot files in the GitHub repository. Do the plot files appear to be of the correct graphics file format?
- Does each plot appear correct?
- Does each set of R code appear to create the reference plot?
Loading the data
When loading the dataset into R, please consider the following:
- The dataset has 2,075,259 rows and 9 columns. First calculate a rough estimate of how much memory the dataset will require before reading it into R. Make sure your computer has enough memory (most modern computers should be