In this exercise you will learn to visualize the relationships among a set of categorical variables. To this end, you will write your own notes on section 8.5, Mosaic plots, from Data Visualization with R.
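Mosaic charts can display the relationship between categorical variables using rectangles whose areas represent the proportion of cases for each combination of levels, and tile colors that can indicate the degree of relationship among the variables.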
The Titanic data set came from https://osf.io/aupb4/.
library(tidyverse)
# Import data
titanic <- read_csv(url("https://osf.io/aupb4/download"))
titanic
str(titanic)
# create a table
tbl <- xtabs(~Survived + PClass + Sex, titanic)
ftable(tbl)
In the graph below, each tile is one combination of survival, passenger class, and sex.
# create a mosaic plot from the table
library(vcd)
mosaic(tbl,
       shade = TRUE,
       legend = TRUE,
       labeling_args = list(set_varnames = c(Sex = "Gender",
                                             Survived = "Survived",
                                             PClass = "Class")),
       # set_labels is keyed by the original variable names, so PClass, not Class
       set_labels = list(Survived = c("No", "Yes"),
                         PClass = c("1st", "2nd", "3rd"),
                         Sex = c("F", "M")),
       main = "Titanic data")
No, more passengers died than survived.
The largest group that did not survive was third-class males.
The largest group that did survive was second-class females.
Third-class males have more cases than expected because theirs is the largest tile shaded blue.
First-class females have the fewest cases because theirs is the smallest tile in the chart.
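The "more cases than expected" reading can also be checked numerically. The sketch below is only an illustration under stated assumptions: it uses R's built-in Titanic table, which also contains crew, so its counts differ from the osf.io file used above, and it assumes the shading compares observed counts with those expected if the three variables were mutually independent, using Pearson residuals. The names tbl3, expected, and pearson are made up for this example.

```r
# Built-in Titanic table from the datasets package (NOT the osf.io file),
# collapsed over Age so that three variables remain: Survived, Class, Sex.
tbl3 <- margin.table(Titanic, c(4, 1, 2))
n <- sum(tbl3)

# Marginal proportions of each variable
p_surv  <- margin.table(tbl3, 1) / n
p_class <- margin.table(tbl3, 2) / n
p_sex   <- margin.table(tbl3, 3) / n

# Expected counts if the three variables were mutually independent
expected <- n * outer(outer(p_surv, p_class), p_sex)

# Pearson residuals: positive = more cases than expected,
# negative = fewer cases than expected
pearson <- (tbl3 - expected) / sqrt(expected)
round(pearson, 2)
```

A positive entry such as pearson["No", "3rd", "Male"] corresponds to a tile with more cases than independence would predict, which is what blue shading signals in the mosaic plot.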
Hint: The Arthritis data set is from the vcd package. Add the argument gp = shading_max to the mosaic() call; otherwise the residuals are too small to be colored by the default shading.
library(tidyverse)
# Import data
data(Arthritis, package = "vcd")
Arthritis
str(Arthritis)
# create a table
tbl <- xtabs(~Improved + Treatment, Arthritis)
ftable(tbl)
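The hint about gp = shading_max can be checked by hand. This is a minimal sketch, assuming the default shading only starts coloring tiles once a Pearson residual exceeds 2 in absolute value (2 and 4 are taken here to be the default cutoffs). It uses chisq.test() from base R to get the residuals; arth_tbl and res are names chosen for this example.

```r
# Load the data without attaching vcd
data(Arthritis, package = "vcd")
arth_tbl <- xtabs(~ Improved + Treatment, Arthritis)

# Pearson residuals under independence of Improved and Treatment
res <- chisq.test(arth_tbl)$residuals
round(res, 2)

# Largest residual in absolute value; a value below 2 means no tile
# would be colored under the assumed default cutoffs of 2 and 4
max(abs(res))
```

If the largest absolute residual stays under 2, a data-driven shading such as shading_max is the natural way to bring color back into the plot.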
In the graph below, each tile is one combination of improvement level and treatment.
# create a mosaic plot from the table
library(vcd)
mosaic(tbl,
       shade = TRUE,
       legend = TRUE,
       main = "Arthritis data",
       gp = shading_max)
No, more patients showed no improvement than any other outcome.
The largest group with no improvement was the placebo group.
The placebo group with marked improvement has the most patients who improved.
The placebo group with marked improvement has the most cases because it is the largest blue-shaded tile in the chart.
The fewest cases are in the placebo group with some or marked improvement, which has the smallest tiles in the chart.
Hint: Use message, echo and results in the chunk options. Refer to the RMarkdown Reference Guide.
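One way to apply this hint, sketched with illustrative values only since the exercise does not say which values to use: the three options can be set for all chunks at once with knitr::opts_chunk$set() in a setup chunk. The same option names can also go in a single chunk's header, for example {r, message = FALSE, echo = TRUE, results = "hide"}.

```r
# Illustrative values, not prescribed by the exercise
knitr::opts_chunk$set(
  message = FALSE,    # suppress package startup and other messages
  echo    = TRUE,     # show the source code in the knitted output
  results = "markup"  # print text results as usual; "hide" would drop them
)
```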