1. data: The data you want to visualise
2. aesthetic mappings: describe how variables in the data are mapped to aesthetic attributes
3. geometric objects (geoms for short): represent what you actually see on the plot: points, lines, polygons, etc.
4. statistical transformations (stats for short): summarise data in many useful ways
5. scales: map values in the data space to values in an aesthetic space (like color, shape, size)
6. coordinate system: describes how data coordinates are mapped to the plane of the graphic (usually Cartesian coordinates)
7. faceting: describes how to break up the data into subsets and how to display those subsets as small multiples (also known as latticing or trellising)
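A minimal sketch of how the seven components above can appear in a single ggplot() call. The 100-row sample, the seed, and the choice of geoms and scales are assumptions made for illustration only; they are not taken from these notes.

```r
library(ggplot2)

# Assumption: a small random sample of diamonds keeps the plot fast to draw.
set.seed(1)
small <- diamonds[sample(nrow(diamonds), 100), ]

p <- ggplot(data = small,                        # 1. data
            mapping = aes(x = carat, y = price)) +   # 2. aesthetic mappings
  geom_point(aes(colour = cut)) +                # 3. geom: points
  stat_smooth(method = "lm", formula = y ~ x,
              se = FALSE) +                      # 4. stat: fitted linear model
  scale_x_log10() +                              # 5. scale: log10 x positions
  coord_cartesian() +                            # 6. coordinate system
  facet_wrap(~ color)                            # 7. faceting: one panel per color

print(p)
```

qplot() fills in most of these components with defaults, which is why the calls in the rest of these notes are so short.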
quick ggplot: qplot()
qplot(x,y,data=..,)
qplot(log(x),log(y),data=..)
qplot(x,y,color=..,shape=..)
for every aesthetic attribute, there is a function, called a scale, which maps data values to valid values for that aesthetic.
2d geom:
geom="point": the default when x and y are supplied; draws a scatterplot
geom="smooth": fits a smoother to the data and displays the smooth and its standard error (to turn off the standard error, use se=FALSE)
geom="boxplot"
geom="path"
geom="line"
1d geom:
geom="histogram"
geom="density"
geom="freqpoly": a frequency polygon
geom="bar" : for discrete variable
we combine multiple geoms by supplying a vector of geom names created with c().
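with geom="smooth", the method argument chooses the model that is fitted:
method="lm": fits a linear model
method="rlm": like lm, but uses a robust fitting algorithm (in the MASS package, so load it with library(MASS))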
qplot(carat, price, data=dsmall, geom=c("point","smooth"),method="lm",formula=y~ns(x,5))
ns() is in the splines package, so load it first with library(splines)
qplot(carat, price, data=dsmall, geom=c("point","smooth"),method="lm", formula=y~poly(x,2))
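Recent ggplot2 releases deprecate qplot(), so the two smoother examples above can also be written with ggplot(). This is a sketch, not the only way to do it. It assumes dsmall is a random sample of 100 rows from diamonds; the notes never define dsmall, and the seed here is arbitrary.

```r
library(ggplot2)
library(splines)  # provides ns()

# Assumption: dsmall is a 100-row random sample of diamonds.
set.seed(1410)
dsmall <- diamonds[sample(nrow(diamonds), 100), ]

# Points plus a linear model fitted on a natural spline with 5 degrees of freedom.
p_spline <- ggplot(dsmall, aes(carat, price)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ ns(x, 5))

# Points plus a linear model fitted on a quadratic polynomial in x.
p_poly <- ggplot(dsmall, aes(carat, price)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ poly(x, 2))

print(p_spline)
print(p_poly)
```

Adding layers with + plays the same role as passing geom=c("point","smooth") to qplot(), and the method and formula arguments go only to the smoothing layer.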
one variable:
qplot(carat, data=diamonds, geom="histogram", binwidth=1,xlim=c(0,3))
qplot(carat,data=diamonds,geom="histogram",binwidth=1,color=color)
qplot(color, data=diamonds, geom="bar")
a line plot is just a path plot of the data sorted by x value
line plots usually have time on the x-axis, showing how a single variable has changed over time.
Path plots show how two variables have simultaneously changed over time, with time encoded in the way that the points are joined together.
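The difference between line and path plots can be sketched as follows. The economics data set ships with ggplot2; the three-row toy data frame is made up purely to show that a line plot equals a path plot of the data sorted by x.

```r
library(ggplot2)

# Line plot: one variable (unemployment rate) against time.
p_line <- ggplot(economics, aes(date, unemploy / pop)) +
  geom_line()

# Path plot: two variables against each other, joined in row (time) order.
p_path <- ggplot(economics, aes(unemploy / pop, uempmed)) +
  geom_path()

# Toy data (hypothetical), deliberately not sorted by x.
toy <- data.frame(x = c(3, 1, 2), y = c(30, 10, 20))

# geom_line() orders the points by x before joining them ...
line_xy <- layer_data(ggplot(toy, aes(x, y)) + geom_line())[, c("x", "y")]

# ... which matches geom_path() applied to data sorted by x beforehand.
path_xy <- layer_data(ggplot(toy[order(toy$x), ], aes(x, y)) + geom_path())[, c("x", "y")]
```

Comparing line_xy and path_xy shows the same sequence of points, while geom_path() on the unsorted toy data would join them in the order 3, 1, 2.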
faceting: aesthetics (color, shape) compare subgroups by drawing all groups on the same plot; faceting takes an alternative approach.
faceting creates tables of graphics by splitting the data into subsets and displaying the same graph for each subset in an arrangement that facilitates comparison.
syntax: facets=row_var~col_var
to facet on only one dimension, use a dot . as a placeholder, like row_var~. or .~col_var
qplot(carat, data=diamonds, facets=color~., geom="histogram", binwidth=0.1, xlim=c(0,3))
qplot(carat, ..density..,data=diamonds, facets=color~., geom="histogram", xlim=c(0,3))
..density..: a name surrounded by .. means the variable is not in the original data but is computed by the statistical transformation. By default the histogram stat maps the count of observations in each bin to the y-axis; ..density.. maps the density instead.
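In ggplot2 3.3.0 and later, the same idea is written with after_stat() instead of the double-dot notation. A sketch of the faceted density histogram in that form, assuming the same binwidth of 0.1 as the earlier histogram:

```r
library(ggplot2)

# after_stat(density) asks for the density computed by the histogram stat,
# rather than the default count, on the y-axis.
p_density <- ggplot(diamonds, aes(carat, after_stat(density))) +
  geom_histogram(binwidth = 0.1) +
  facet_grid(color ~ .) +   # one row per diamond color, a single column
  xlim(0, 3)                # carat values outside 0..3 are dropped with a warning

print(p_density)
```

Because density is computed separately within each facet, the panels become comparable even when the colors have very different numbers of diamonds.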
other options in qplot: xlim=c( ) ylim=c()
log="x" log="y" log="xy"
title: main, sub, xlab, ylab
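A sketch of how these options look in the ggplot() style, for comparison. The sample, seed, and label text are assumptions chosen for illustration.

```r
library(ggplot2)

# Assumption: a 100-row random sample of diamonds, as in the dsmall examples.
set.seed(1410)
dsmall <- diamonds[sample(nrow(diamonds), 100), ]

# Counterpart of log="xy" plus main, sub, xlab, ylab.
p_log <- ggplot(dsmall, aes(carat, price)) +
  geom_point() +
  scale_x_log10() +
  scale_y_log10() +
  labs(title = "Price against carat",
       subtitle = "100 randomly sampled diamonds",
       x = "Carat (log scale)",
       y = "Price (log scale)")

# Counterpart of xlim=c(0,3) and ylim=c(0,20000).
p_lim <- ggplot(dsmall, aes(carat, price)) +
  geom_point() +
  xlim(0, 3) +
  ylim(0, 20000)

print(p_log)
print(p_lim)
```

Note that xlim() and ylim() drop observations outside the limits before any statistics are computed, which matters once a smoother or histogram is added.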