Histogram
Let’s use the USCancerRates dataset from the latticeExtra package -
library(lattice)
data(USCancerRates, package = "latticeExtra")
str(USCancerRates)
'data.frame': 3041 obs. of 8 variables:
$ rate.male : num 364 346 341 336 330 ...
$ LCL95.male : num 311 274 304 289 293 ...
$ UCL95.male : num 423 431 381 389 371 ...
$ rate.female : num 151 140 182 185 172 ...
$ LCL95.female: num 124 103 161 157 151 ...
$ UCL95.female: num 184 190 206 218 195 ...
$ state : Factor w/ 49 levels "Alabama","Alaska",..: 1 1 1 1 1 1 1 1 1 1 ...
$ county : 'AsIs' chr "Pickens County" "Bullock County" "Russell County" "Barbour County" ...
Make a simple histogram -
histogram(x = ~ rate.male, data = USCancerRates)
Here, the y-axis shows the percentage of observations in each bin by default.
Using base R-
hist(USCancerRates$rate.male)

The two outputs differ in the following ways -
- Visual appearance (colors, etc.) is different
- The y-axes represent different quantities
- Bin boundaries are different
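The differences in the y-axis and the bin boundaries come from defaults, so the two functions can be made to agree. The sketch below is an illustration, not part of the original analysis: it uses the built-in faithful data (an assumption, chosen so that the code runs without latticeExtra), lets hist() compute the bin boundaries, and passes them to histogram() together with type = "count".

```r
library(lattice)

x <- faithful$eruptions   # built-in data, used only for illustration

# Let base R choose the bins, without drawing anything
h <- hist(x, plot = FALSE)

# Reuse those bin boundaries in lattice and plot counts instead of percentages
p <- histogram(~ x, breaks = h$breaks, type = "count",
               xlab = "Eruption time (min)")
print(p)
```

With the same breaks and type = "count", the bars should match those drawn by hist(x); only the visual styling remains different.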
Adding title and axis labels -
histogram(x = ~ rate.male, data = USCancerRates,
main = "County-wise deaths due to cancer (1999-2003)",
xlab = "Rate among males (per 100,000)")

Specifying number of intervals -
histogram(x = ~ rate.male, data = USCancerRates,
nint = 30)

In histogram(), the optional argument type controls what is plotted on the y-axis. It can take three values:
- “percent”, the default, gives the percentage of observations in each bin (relative frequency).
- “count” gives the bin count, which is the default in hist().
- “density” gives a density histogram, in which the bar areas sum to 1.
histogram(x = ~ rate.male, data = USCancerRates,
nint = 30, type = "density")

histogram(x = ~ rate.male, data = USCancerRates,
nint = 30, type = "count")
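The three y-axis scales are simple rescalings of one another. The following sketch works through the arithmetic with base R's hist() on the built-in faithful data; both the data and the break points are assumptions chosen for illustration, not values from the analysis above.

```r
x   <- faithful$eruptions
brk <- seq(1.5, 5.5, by = 0.5)   # hypothetical equal-width bins covering the data

h <- hist(x, breaks = brk, plot = FALSE)

count   <- h$counts                            # type = "count"
percent <- 100 * count / length(x)             # type = "percent"
density <- count / (length(x) * diff(brk))     # type = "density"

data.frame(from = head(brk, -1), to = tail(brk, -1),
           count, percent, density)
```

The percentages add up to 100 and the bar areas of the density version (density times bin width) add up to 1, which is why a density histogram can be compared directly with a density curve.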

Scatterplot
Make a simple scatterplot -
xyplot(rate.female ~ rate.male, data = USCancerRates)

To add axis labels -
xyplot(rate.female ~ rate.male, data = USCancerRates,
xlab = "Rate among males (per 100,000)",
ylab = "Rate among females (per 100,000)")

Adding grid and abline -
xyplot(rate.female ~ rate.male, data = USCancerRates,
abline = c(0,1), grid = TRUE)

Adding linear regression line -
xyplot(rate.female ~ rate.male, data = USCancerRates,
panel = function(x, y) {
panel.xyplot(x, y)
panel.abline(lm(y ~ x))
})
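A custom panel function is the general mechanism, but for a plain least-squares line panel.xyplot() also accepts a type argument. The sketch below uses the built-in airquality data (an assumption, so that it runs without latticeExtra) and should produce the same kind of plot as the panel function above.

```r
library(lattice)

# "p" draws the points, "r" adds a least-squares regression line
p <- xyplot(Ozone ~ Temp, data = airquality,
            type = c("p", "r"),
            grid = TRUE)
print(p)
```

The panel function remains the better choice when the line comes from something other than a simple linear fit, since any drawing code can go inside it.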

Customizing legend -
xyplot(Ozone ~ Temp, data = airquality, groups = Month,
# Place the legend on the right, with a title and month names as labels
auto.key = list(space = "right",
title = "Month",
text = month.name[5:9]))

Conditioned scatterplot -
# Create 'state.ordered' by reordering levels
library(dplyr)
USCancerRates <-
mutate(USCancerRates,
state.ordered = reorder(state,
rate.male + rate.female,
mean, na.rm = TRUE))
# Create conditioned scatter plot
xyplot(rate.female ~ rate.male | state.ordered,
data = USCancerRates,
grid = TRUE,
panel = function(x, y) {
panel.xyplot(x, y)
panel.abline(lm(y ~ x))
})

In a conditioned lattice plot, the panels are by default drawn starting from the bottom-left position, going right and then up. This is patterned on the Cartesian coordinate system where the x-axis increases to the right and the y-axis increases from bottom to top.
Often we want to change this so that the layout is similar to a matrix or table, where rows start at the top. The layout of any conditioned lattice plot can be changed to follow this scheme by adding the optional argument as.table = TRUE.
xyplot(rate.female ~ rate.male | state.ordered,
data = USCancerRates,
grid = TRUE,
panel = function(x, y) {
panel.xyplot(x, y)
panel.abline(lm(y ~ x))
},
as.table = TRUE)

Density plot
Use the built-in airquality dataset -
data(airquality)
str(airquality)
'data.frame': 153 obs. of 6 variables:
$ Ozone : int 41 36 12 18 NA 28 23 19 8 NA ...
$ Solar.R: int 190 118 149 313 NA NA 299 99 19 194 ...
$ Wind : num 7.4 8 12.6 11.5 14.3 14.9 8.6 13.8 20.1 8.6 ...
$ Temp : int 67 72 74 62 56 66 65 59 61 69 ...
$ Month : int 5 5 5 5 5 5 5 5 5 5 ...
$ Day : int 1 2 3 4 5 6 7 8 9 10 ...
Create a density plot -
densityplot(~ Ozone, data = airquality)

A useful optional argument for densityplot() is plot.points, which can take the values -
- TRUE, to plot the data points along the x-axis in addition to the density;
- FALSE, to suppress plotting the data points; and
- “jitter”, the default in lattice, to plot the points along the x-axis but with some random jittering in the y-direction so that overlapping points are easier to see.
densityplot(~ Ozone, data = airquality,
plot.points = TRUE)

densityplot(~ Ozone, data = airquality,
plot.points = FALSE)
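The two calls above cover TRUE and FALSE. For completeness, a call with the third value, "jitter", would look like this:

```r
library(lattice)

# Points are drawn along the x-axis with a small random vertical offset
p <- densityplot(~ Ozone, data = airquality,
                 plot.points = "jitter")
print(p)
```

Because the jitter is random, the vertical positions of the points change slightly from one run to the next.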

Box and Whisker Plot
Creating a box and whisker plot -
bwplot(x = ~ rate.male, data = USCancerRates)

Creating box and whisker plots by some factor -
bwplot(state ~ rate.male, data = USCancerRates)

Reordering the states by their median rate -
bymedian <- with(USCancerRates, reorder(state, rate.male, median, na.rm = TRUE))
bwplot(bymedian ~ rate.male, data = USCancerRates)
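What reorder() does is easier to see on a tiny example. The factor and values below are made up for illustration: the levels are re-sorted by the median of the values within each level, while the data themselves stay in place.

```r
f <- factor(c("a", "a", "b", "b", "c", "c"))
v <- c(10, 12, 1, 3, 5, 7)

levels(f)                   # "a" "b" "c" (alphabetical)

r <- reorder(f, v, median)  # medians: a = 11, b = 2, c = 6
levels(r)                   # "b" "c" "a" (increasing median)
```

Since bwplot() draws the boxes in level order, reordering the levels is enough to sort the boxes.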

Changing labels -
# Create box-and-whisker plot
bwplot(state.ordered ~ rate.female + rate.male,
data = USCancerRates,
outer = TRUE,
xlab = "Rate (per 100,000)",
# Add strip labels
strip = strip.custom(factor.levels = c("Female", "Male")))

Using the plot as an object -
pl <- bwplot(state.ordered ~ rate.female + rate.male,
data = USCancerRates,
outer = TRUE,
xlab = "Rate (per 100,000)")
pl

Inspecting the object -
class(pl)
[1] "trellis"
summary(pl)
Call:
bwplot(state.ordered ~ rate.female + rate.male, data = USCancerRates,
outer = TRUE, xlab = "Rate (per 100,000)")
Number of observations:
rate.female rate.male
3041 3041
dimnames(pl)
[[1]]
[1] "rate.female" "rate.male"
Updating trellis object -
update(pl, strip = strip.custom(factor.levels = c("Women", "Men")))

Another way to change the labels -
dimnames(pl)[[1]] <- c("Female", "Male")
Subset the trellis object like matrix -
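The code for this step is not shown here, so the following is only a sketch of the idea. It uses a conditioned plot of the built-in iris data (an assumption, so that it runs without latticeExtra); the same indexing, for example pl[1], should apply to the bwplot object above.

```r
library(lattice)

p <- xyplot(Sepal.Width ~ Sepal.Length | Species, data = iris, grid = TRUE)

dim(p)        # one conditioning variable with three levels
dimnames(p)   # the level names used for the panels

first <- p[1]    # keep only the first panel
rest  <- p[2:3]  # keep the second and third panels
print(first)
```

Indexing returns another trellis object, so the result can be printed, updated, or subset again.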

Conditioning/Facetting
Conditioning scatterplot on Species -
str(iris)
'data.frame': 150 obs. of 5 variables:
$ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
$ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
$ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
$ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
$ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
xyplot(Sepal.Width ~ Sepal.Length | Species, # facet by Species
iris, grid = TRUE)

Conditioning density plot of weight on group -
str(PlantGrowth)
'data.frame': 30 obs. of 2 variables:
$ weight: num 4.17 5.58 5.18 6.11 4.5 4.61 5.17 4.53 5.33 5.14 ...
$ group : Factor w/ 3 levels "ctrl","trt1",..: 1 1 1 1 1 1 1 1 1 1 ...
densityplot( ~ weight | group, PlantGrowth)

Conditioning two different variables in one plot -
histogram( ~ rate.male + rate.female, USCancerRates,
outer = TRUE)

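Notice that rate.male and rate.female are two different variables in the dataset, which means that USCancerRates is not a tidy data frame. Unlike ggplot2, lattice lets you plot data in this wide format directly by listing several variables in the formula.
Without outer = TRUE, the variables are overlaid in a single panel -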
densityplot(~ rate.male + rate.female,
data = USCancerRates,
plot.points = FALSE) # Suppress data points

With outer=TRUE -
densityplot(~ rate.male + rate.female,
data = USCancerRates,
outer = TRUE,
plot.points = FALSE) # Suppress data points

Changing layout -
densityplot( ~ rate.male + rate.female, USCancerRates,
outer = TRUE, layout = c(1,2) # 1 column, 2 rows
)
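The formula x1 + x2 with outer = TRUE behaves much like conditioning on a variable in long-format data. The sketch below illustrates this with the built-in iris data (an assumption, chosen so that it runs without latticeExtra), using base R's stack() to build the long version.

```r
library(lattice)

# Wide format: two columns listed in the formula, one panel each
wide <- densityplot(~ Sepal.Length + Petal.Length, data = iris,
                    outer = TRUE, plot.points = FALSE)

# Long format: stack() returns columns 'values' and 'ind' (the source column)
long_df <- stack(iris[c("Sepal.Length", "Petal.Length")])
long <- densityplot(~ values | ind, data = long_df,
                    plot.points = FALSE)

print(wide)
print(long)
```

Both calls should give two panels with the same curves; the wide form simply saves the reshaping step.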

Doing some data manipulation to get summary statistics -
USCancerRates.state <- with(USCancerRates, {
rmale <- tapply(rate.male, state, median, na.rm= TRUE)
rfemale <- tapply(rate.female, state, median, na.rm= TRUE)
data.frame(
Rate = c(rmale, rfemale),
State = rep(names(rmale), 2),
Gender = rep(c("Male", "Female"), each = length(rmale))
)
})
USCancerRates.state <- dplyr::mutate(USCancerRates.state,
State = reorder(State, Rate))
head(USCancerRates.state, 10)
Rate State Gender
1 286.00 Alabama Male
2 237.95 Alaska Male
3 209.30 Arizona Male
4 284.10 Arkansas Male
5 221.30 California Male
6 204.40 Colorado Male
7 228.55 Connecticut Male
8 268.25 Delaware Male
9 250.20 Florida Male
10 280.80 Georgia Male
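The tapply() calls above compute one median per state. A small made-up example (the values and group labels are hypothetical) shows the shape of the result and the effect of na.rm = TRUE:

```r
rate  <- c(10, 20, 30, NA, 50)
state <- factor(c("A", "A", "B", "B", "B"))

# One median per level of 'state'; the NA is dropped before taking the median
med <- tapply(rate, state, median, na.rm = TRUE)
med          # A = 15, B = 40
names(med)   # the level names, reused above to build the State column
```

Without na.rm = TRUE, the median for any group containing a missing value would itself be NA.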
Conditioning by gender -
xyplot(State ~ Rate | Gender, USCancerRates.state, grid = TRUE)

Grouping by gender -
xyplot(State ~ Rate, groups = Gender, data = USCancerRates.state, grid = TRUE)

To add legend -
xyplot(State ~ Rate, groups = Gender, data = USCancerRates.state,
grid = TRUE,
auto.key = TRUE)

Positioning and formatting the legend -
xyplot(State ~ Rate, groups = Gender, data = USCancerRates.state,
grid = TRUE,
auto.key=list(space="bottom", columns = 2,
title=NULL, cex.title = 1))

Overlaying both variables in one panel, with a legend (USCancerRates now also contains the state.ordered column added earlier) -
str(USCancerRates)
'data.frame': 3041 obs. of 9 variables:
$ rate.male : num 364 346 341 336 330 ...
$ LCL95.male : num 311 274 304 289 293 ...
$ UCL95.male : num 423 431 381 389 371 ...
$ rate.female : num 151 140 182 185 172 ...
$ LCL95.female : num 124 103 161 157 151 ...
$ UCL95.female : num 184 190 206 218 195 ...
$ state : Factor w/ 49 levels "Alabama","Alaska",..: 1 1 1 1 1 1 1 1 1 1 ...
$ county : 'AsIs' chr "Pickens County" "Bullock County" "Russell County" "Barbour County" ...
$ state.ordered: Factor w/ 49 levels "Utah","Colorado",..: 40 40 40 40 40 40 40 40 40 40 ...
..- attr(*, "scores")= num [1:49(1d)] 450 428 351 457 383 ...
.. ..- attr(*, "dimnames")=List of 1
.. .. ..$ : chr [1:49] "Alabama" "Alaska" "Arizona" "Arkansas" ...
# Create a density plot
densityplot(~ rate.male + rate.female,
data = USCancerRates,
# Overlay both variables in a single panel
outer = FALSE,
# Add x-axis label
xlab = "Rate (per 100,000)",
# Add a legend
auto.key = TRUE,
plot.points = FALSE,
ref = TRUE)

