---
title: "Plotting in R using Lattice"
author: 'MD. AHSANUL ISLAM'
date: "Last updated on `r format(Sys.Date(), '%d %B, %Y')`"
output:
  rmdformats::robobook:
    self_contained: true
    thumbnails: false
    lightbox: true
    code_download: true
pkgdown:
  as_is: true
---

```{css, echo=FALSE}
body{
  font-family: "Arial";
  font-size: 10pt;
}
```

```{r, include=FALSE}
knitr::opts_chunk$set(
  comment = "", prompt = FALSE, message = FALSE, warning = FALSE
)
```


---

# Library

Install the packages used in this document -
```{r, eval = FALSE}
install.packages(c("lattice", "latticeExtra", "dplyr"))
```

Load the package - 
```{r}
library(lattice)
```

# Histogram

Let's use the `USCancerRates` dataset from the latticeExtra package - 
```{r}
data(USCancerRates, package = "latticeExtra")
str(USCancerRates)
```

Make a simple histogram - 
```{r}
histogram(x = ~ rate.male, data = USCancerRates)
```
By default, the y-axis shows the percentage of observations that fall in each bin.

Using base R -
```{r}
hist(USCancerRates$rate.male)
```

The two outputs differ in the following ways -

1. The visual appearance (colors, etc.)
2. The quantity shown on the y-axis (percent of total in `histogram()`, counts in `hist()`)
3. The bin boundaries
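
To make the two histograms directly comparable, one option is to reuse the bin boundaries chosen by base R and switch lattice to counts. This is a minimal sketch, assuming the built-in `airquality` data (used only so that the chunk is self-contained); `base_breaks` and `p_matched` are illustrative names.

```{r}
library(lattice)

# Bin boundaries chosen by base R's hist(); plot = FALSE computes them without drawing
base_breaks <- hist(airquality$Temp, plot = FALSE)$breaks

# Reuse those boundaries in lattice and show counts instead of percentages
p_matched <- histogram(~ Temp, data = airquality,
                       breaks = base_breaks, type = "count")
p_matched
```

With matching breaks and `type = "count"`, only the visual appearance should still differ from the `hist()` output.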

Adding a title and axis labels - 
```{r}
histogram(x = ~ rate.male, data = USCancerRates,
          main = "County-wise deaths due to cancer (1999-2003)",
          xlab = "Rate among males (per 100,000)")
```

Specifying the number of intervals - 
```{r}
histogram(x = ~ rate.male, data = USCancerRates,
          nint = 30)
```

The optional `type` argument of `histogram()` controls what is plotted on the y-axis. It can take three values:

1. "percent", the default, gives the percentage of observations in each bin (relative frequency).
2. "count" gives the bin count, which is the default in `hist()`.
3. "density" gives a density histogram, in which the bar areas sum to 1.

```{r}
histogram(x = ~ rate.male, data = USCancerRates,
          nint = 30, type = "density")
histogram(x = ~ rate.male, data = USCancerRates,
          nint = 30, type = "count")
```
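
The three scales are rescalings of the same bin counts. The sketch below shows that relationship using base R's `hist()` with `plot = FALSE`; the variable names are illustrative, and `airquality$Temp` is used only because it is built in and has no missing values.

```{r}
# Bin the data once, without drawing anything
h <- hist(airquality$Temp, plot = FALSE)

bin_width <- diff(h$breaks)                     # width of each bin
count     <- h$counts                           # what type = "count" shows
percent   <- 100 * count / sum(count)           # what type = "percent" shows
density   <- count / (sum(count) * bin_width)   # what type = "density" shows

sum(percent)               # the percentages add up to 100
sum(density * bin_width)   # the bar areas add up to 1
```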


# Scatterplot

Make a simple scatterplot - 
```{r}
xyplot(rate.female ~ rate.male, data = USCancerRates)
```

To add axis labels - 
```{r}
xyplot(rate.female ~ rate.male, data = USCancerRates,
       xlab = "Rate among males (per 100,000)",
       ylab = "Rate among females (per 100,000)")
```

Adding a grid and the reference line y = x (intercept 0, slope 1) - 
```{r}
xyplot(rate.female ~ rate.male, data = USCancerRates,
       abline = c(0,1), grid = TRUE)
```


Adding a linear regression line with a custom panel function - 
```{r}
xyplot(rate.female ~ rate.male, data = USCancerRates,
       panel = function(x, y) {
         panel.xyplot(x, y)
         panel.abline(lm(y ~ x))
       })
```
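
For a plain least-squares line, lattice also provides shortcuts that avoid calling `lm()` by hand. The sketch below assumes the built-in `airquality` data (so that the chunk is self-contained) and shows two of them: the `"r"` plot type and `panel.lmline()`. The object names are illustrative.

```{r}
library(lattice)

# Shortcut 1: "p" draws the points, "r" adds a least-squares regression line
p_type <- xyplot(Wind ~ Temp, data = airquality, type = c("p", "r"))

# Shortcut 2: panel.lmline() fits and draws the line inside a custom panel function
p_panel <- xyplot(Wind ~ Temp, data = airquality,
                  panel = function(x, y, ...) {
                    panel.xyplot(x, y, ...)
                    panel.lmline(x, y)
                  })
p_type
p_panel
```

A custom panel function with `panel.abline(lm(y ~ x))` remains the more flexible choice when the fitted model is something other than a simple straight line.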

Customizing the legend - 
```{r}
xyplot(Ozone ~ Temp, data = airquality, groups = Month,
       # Place the legend on the right, with a title and month names as labels
       auto.key = list(space = "right", 
                       title = "Month", 
                       text = month.name[5:9]))
```

Conditioned scatterplot - 
```{r}
# Create 'state.ordered' by reordering levels
library(dplyr)
USCancerRates <- 
  mutate(USCancerRates, 
         state.ordered = reorder(state, 
                                 rate.male + rate.female, 
                                 mean, na.rm = TRUE))

# Create conditioned scatter plot
xyplot(rate.female ~ rate.male | state.ordered,
       data = USCancerRates, 
       grid = TRUE, 
       panel = function(x, y) {
         panel.xyplot(x, y)
         panel.abline(lm(y ~ x))
       })
```

In a conditioned lattice plot, the panels are by default drawn starting from the bottom-left position, going right and then up. This is patterned on the Cartesian coordinate system where the x-axis increases to the right and the y-axis increases from bottom to top.

Often we want to change this so that the layout is similar to a matrix or table, where rows start at the top. The layout of any conditioned lattice plot can be changed to follow this scheme by adding the optional argument `as.table = TRUE`.

```{r}
xyplot(rate.female ~ rate.male | state.ordered,
       data = USCancerRates, 
       grid = TRUE, 
       panel = function(x, y) {
         panel.xyplot(x, y)
         panel.abline(lm(y ~ x))
       },
       as.table = TRUE)
```
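
The effect of `as.table` is easier to see with only a few panels. This is a small sketch, assuming the built-in `iris` data and a single-column layout; `p_default` and `p_table` are illustrative names.

```{r}
library(lattice)

# Default: panels fill from the bottom, so the first level ("setosa") is the bottom panel
p_default <- xyplot(Sepal.Width ~ Sepal.Length | Species, data = iris,
                    layout = c(1, 3))

# as.table = TRUE: panels fill from the top, so "setosa" becomes the top panel
p_table <- update(p_default, as.table = TRUE)

p_default
p_table
```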


# Density plot

Use the built-in `airquality` dataset -
```{r}
data(airquality)
str(airquality)
```

Create a density plot -
```{r}
densityplot(~ Ozone, data = airquality)
```

A useful optional argument for `densityplot()` is `plot.points`, which can take the following values -

1. TRUE, to plot the data points along the x-axis in addition to the density;
2. FALSE, to suppress plotting the data points; and
3. "jitter", the default, to plot the points along the x-axis with some random jittering in the y-direction so that overlapping points are easier to see.

```{r}
densityplot(~ Ozone, data = airquality,
    plot.points = TRUE)
densityplot(~ Ozone, data = airquality,
    plot.points = FALSE)
```
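
The string values of `plot.points` can be sketched in the same way. Besides `"jitter"`, the lattice documentation for `panel.densityplot()` also lists `"rug"`; the object names below are illustrative.

```{r}
library(lattice)

# Jitter the points vertically so that tied Ozone values do not overplot
p_jitter <- densityplot(~ Ozone, data = airquality, plot.points = "jitter")

# Draw the observations as a rug of tick marks along the x-axis
p_rug <- densityplot(~ Ozone, data = airquality, plot.points = "rug")

p_jitter
p_rug
```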

# Box and Whisker Plot

Creating a box and whisker plot - 
```{r}
bwplot(x = ~ rate.male, data = USCancerRates)
```

Creating box and whisker plots by some factor - 
```{r, fig.height = 9}
bwplot(state ~ rate.male, data = USCancerRates)
```

Reordering the states by their median rate - 
```{r, fig.height = 9}
bymedian <- with(USCancerRates, reorder(state, rate.male, median, na.rm = TRUE))
bwplot(bymedian ~ rate.male, data = USCancerRates)
```

Changing labels -
```{r}
# Create box-and-whisker plot
bwplot(state.ordered ~ rate.female + rate.male,
       data = USCancerRates, 
       outer = TRUE, 
       xlab = "Rate (per 100,000)", 
       # Add strip labels in the order of the formula terms (female, then male)
       strip = strip.custom(factor.levels = c("Female", "Male")))
```

Using the plot as an object - 
```{r}
pl <- bwplot(state.ordered ~ rate.female + rate.male,
       data = USCancerRates, 
       outer = TRUE, 
       xlab = "Rate (per 100,000)")
pl
class(pl)
summary(pl)
dimnames(pl)
```

Updating the trellis object (labels follow the panel order: female, then male) - 
```{r}
update(pl, strip = strip.custom(factor.levels = c("Women", "Men")))
```

Another way to change the labels - 
```{r}
dimnames(pl)[[1]] <- c("Female", "Male")
```

Subsetting the trellis object like a matrix - 
```{r}
pl[1,]  # first panel only (rate.female)
```
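
The same object workflow can be tried on a smaller plot. This is a minimal sketch, assuming the built-in `iris` data; the object names are illustrative.

```{r}
library(lattice)

# Assigning the result stores a "trellis" object instead of drawing it
p_iris <- xyplot(Sepal.Width ~ Sepal.Length | Species, data = iris)
class(p_iris)
dimnames(p_iris)   # one conditioning variable with three levels

# update() returns a modified copy; p_iris itself is left as it was
p_titled <- update(p_iris, main = "Sepal dimensions by species")

# Indexing keeps only the selected panels (here the second level, versicolor)
p_versicolor <- p_iris[2]
p_versicolor
```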


# Conditioning/Facetting

Conditioning scatterplot on Species - 
```{r}
str(iris)
xyplot(Sepal.Width ~ Sepal.Length | Species,   # facet by Species
       iris, grid = TRUE)
```

Conditioning the density plot of weight on group -
```{r}
str(PlantGrowth)
densityplot( ~ weight | group, PlantGrowth)
```

Plotting two different variables in separate panels of one plot -
```{r}
histogram( ~ rate.male + rate.female, USCancerRates,
           outer = TRUE)
```

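Notice that `rate.male` and `rate.female` are two separate columns, so `USCancerRates` is in wide format rather than tidy (long) format. Unlike ggplot2, which generally expects long data, lattice can plot several such columns directly through a formula of the form `~ x1 + x2`. Without `outer = TRUE`, the variables are superposed in a single panel as groups -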
```{r}
densityplot(~ rate.male + rate.female,
    data = USCancerRates, 
    plot.points = FALSE)    # Suppress data points
```

With `outer = TRUE`, each variable gets its own panel -
```{r}
densityplot(~ rate.male + rate.female,
    data = USCancerRates, 
    outer = TRUE,
    plot.points = FALSE)    # Suppress data points
```
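
The wide-format formula with `outer = TRUE` corresponds to reshaping the data to long format and conditioning on the variable name. The sketch below illustrates this with the built-in `iris` data and base R's `stack()`; `wide`, `long`, `p_wide` and `p_long` are illustrative names.

```{r}
library(lattice)

# Wide format: one column per measurement
wide <- iris[c("Sepal.Length", "Petal.Length")]

# Long format: stack() returns a 'values' column and an 'ind' factor naming the source column
long <- stack(wide)

# Both calls should produce one density panel per variable
p_wide <- densityplot(~ Sepal.Length + Petal.Length, data = wide,
                      outer = TRUE, plot.points = FALSE)
p_long <- densityplot(~ values | ind, data = long, plot.points = FALSE)

p_wide
p_long
```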

Changing layout - 
```{r}
densityplot( ~ rate.male + rate.female, USCancerRates,
             outer = TRUE, layout = c(1,2) # 1 column, 2 rows
           )
```

Computing the median rate per state for each gender, in long format - 
```{r}
USCancerRates.state <- with(USCancerRates, {
  # Median rate per state, ignoring missing values
  rmale <- tapply(rate.male, state, median, na.rm = TRUE)
  rfemale <- tapply(rate.female, state, median, na.rm = TRUE)
  # Stack the male and female medians into one long data frame
  data.frame(
    Rate = c(rmale, rfemale),
    State = rep(names(rmale), 2),
    Gender = rep(c("Male", "Female"), each = length(rmale))
  )
})
# Order the states by their average rate across the two genders
USCancerRates.state <- dplyr::mutate(USCancerRates.state,
                                     State = reorder(State, Rate))
head(USCancerRates.state, 10)
```
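
The `tapply()` and stacking steps can be checked on a toy data frame where the medians are easy to verify by hand. The values below are hypothetical and serve only as an illustration; `toy` and `toy.state` are illustrative names.

```{r}
# Toy data (hypothetical values, for illustration only)
toy <- data.frame(
  state       = c("A", "A", "A", "B", "B", "B"),
  rate.male   = c(10, 20, 30, 40, 50, NA),
  rate.female = c(1, 2, 3, 4, 5, 6)
)

# Per-state medians; na.rm = TRUE drops the missing value in state B
rmale   <- tapply(toy$rate.male,   toy$state, median, na.rm = TRUE)
rfemale <- tapply(toy$rate.female, toy$state, median, na.rm = TRUE)

# Stack both sets of medians into one long data frame
toy.state <- data.frame(
  Rate   = c(rmale, rfemale),
  State  = rep(names(rmale), 2),
  Gender = rep(c("Male", "Female"), each = length(rmale))
)
toy.state
```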

Conditioning by gender -
```{r}
xyplot(State ~ Rate | Gender, USCancerRates.state, grid = TRUE)
```

Grouping by gender -
```{r}
xyplot(State ~ Rate, groups = Gender, data = USCancerRates.state, grid = TRUE)
```

To add a legend -
```{r}
xyplot(State ~ Rate, groups = Gender, data = USCancerRates.state, 
       grid = TRUE,
       auto.key = TRUE)
```

Positioning and formatting the legend - 
```{r}
xyplot(State ~ Rate, groups = Gender, data = USCancerRates.state, 
       grid = TRUE,
       auto.key = list(space = "bottom", columns = 2,
                       title = NULL, cex.title = 1))
```
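
Other components of the `auto.key` list control what the legend shows. This is a small sketch, assuming the built-in `iris` data; `p_key` is an illustrative name, and the `points` and `lines` components are the ones documented for `simpleKey()`.

```{r}
library(lattice)

p_key <- xyplot(Sepal.Width ~ Sepal.Length, data = iris, groups = Species,
                auto.key = list(space = "top",   # legend above the panel
                                columns = 3,     # one column per species
                                points = TRUE,   # show the plotting symbols
                                lines = FALSE))  # no line samples
p_key
```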

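Superposing both densities in one panel, with a legend and a reference line - 
```{r}
# Check the data frame, which now also contains 'state.ordered'
str(USCancerRates)

# Create a density plot
densityplot(~ rate.male + rate.female,
    data = USCancerRates,
    # Superpose both variables in a single panel
    outer = FALSE,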
    # Add x-axis label
    xlab = "Rate (per 100,000)",
    # Add a legend
    auto.key = TRUE,
    plot.points = FALSE,
    ref = TRUE)
```



