library(ggplot2)
data(algae, package="DMwR2")
# Plot on the left (Standard), P.99
freqOcc <- table(algae$season)
barplot(freqOcc, main='Frequency of the Seasons')

# Plot on the right (ggplot2), P.99
ggplot(algae, aes(x=season)) + geom_bar() + ggtitle("Frequency of the Seasons")

# To flip the coordinates, use the following code:
ggplot(algae, aes(x=season)) + geom_bar() + ggtitle("Frequency of the Seasons") + coord_flip()

Let’s look at the distribution of the values of a continuous
variable using histograms and boxplots (pp. 99-100).
library(ggplot2)
data(iris)
# Plot on the left (standard). P.100
hist(iris$Petal.Length, xlab='Petal Length')

# Plot on the right (ggplot2). p.100
ggplot(iris, aes(x=Petal.Length)) + geom_histogram() + xlab("Petal Length")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
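
The message above appears because geom_histogram() falls back to 30 bins
when no bin specification is given. A minimal sketch of setting the bin
width explicitly follows; the value 0.25 is an illustrative assumption,
not a value taken from the book, and should be tuned to the data.

```r
library(ggplot2)
data(iris)

# binwidth = 0.25 is an assumed, illustrative value (petal length is in cm)
p <- ggplot(iris, aes(x = Petal.Length)) +
  geom_histogram(binwidth = 0.25) +
  xlab("Petal Length")
print(p)
```

Passing `bins = ` instead fixes the number of bins rather than their width.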

A different way of showing information on the distribution
of the values of a continuous variable is through the boxplot. See
below:
# Using the boxplot function on a continuous variable.
# Code: P. 100, Figure: P. 101
library(ggplot2)
data(iris)
## Plot on the left (standard). P. 101
boxplot(iris$Sepal.Width, ylab= 'Sepal Width')

## Plot on the right (ggplot2). P.101
ggplot(iris, aes(x=factor(0), y=Sepal.Width)) +
  geom_boxplot() +
  xlab("") + ylab("Sepal Width") +
  theme(axis.text.x=element_blank())

With plots for continuous variables under our belts, let’s turn to
plots that compare subgroups of a dataset.
Conditioned plots are the plots that handle this task.
In standard graphics, only the boxplot function can compare behavior
across subgroups; no other standard graphics function handles this
task.
Although conditioned plots can be awkward for comparing behavior
across subgroups, we will work with them.
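
As a sketch of the standard-graphics case, boxplot() accepts a formula
that splits a continuous variable by a factor. The example below uses
the built-in iris data rather than algae; that choice is an assumption
made only to keep the example free of extra packages, and the name bp is
introduced here for illustration.

```r
data(iris)

# One box per species: the formula reads "Sepal.Width conditioned on Species"
bp <- boxplot(Sepal.Width ~ Species, data = iris,
              xlab = "Species", ylab = "Sepal Width")

# boxplot() invisibly returns the statistics it drew, including group names
bp$names
```
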
Within the ggplot ecosystem, comparing behavior across subgroups is
usually handled by “facets.”
Facets are variations of the same plot, each obtained from a different
subset of the dataset.
The ggplot graphics system provides better conditioning through
facets. Below, we use histograms to check the distribution of the alga
“a1” for the different types of rivers (in terms of water speed and
river size). We need as many histograms as there are combinations of
river size and speed.
Below, we show these graphs:
library(ggplot2)
data(algae, package= "DMwR2")
ggplot(algae, aes(x=a1)) + geom_histogram() + facet_grid(size ~ speed)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
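
To see how the observations are spread over the panels that
facet_grid(size ~ speed) draws, one can cross-tabulate the two factors
first. This is a minimal sketch assuming the DMwR2 package is installed,
as the code above already requires; panel_counts is a name introduced
here for illustration.

```r
data(algae, package = "DMwR2")

# Rows are river sizes, columns are water speeds; each cell counts the
# observations that fall in the corresponding facet panel
panel_counts <- table(algae$size, algae$speed)
panel_counts

# Size of the grid: number of size levels times number of speed levels
prod(dim(panel_counts))
```

Panels with very few observations give sparse histograms, which is worth
keeping in mind when reading the faceted plot.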
