In this exercise you will learn to visualize the relationships among a set of categorical variables. To this end, you will make your own notes on Section 8.5, Mosaic plots, from Data Visualization with R.

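Mosaic charts can display the relationship between categorical variables using rectangles (tiles) whose areas are proportional to the number of cases in each combination of levels. The shading of a tile can also show how far its count departs from what independence would predict.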

The Titanic data set came from https://osf.io/aupb4/.

library(tidyverse)
# Import data
titanic <- read_csv(url("https://osf.io/aupb4/download"))
titanic
str(titanic)
# create a table
tbl <- xtabs(~Survived + PClass + Sex, titanic)
ftable(tbl)
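To see what xtabs() builds before plotting it, the following is a minimal sketch on a tiny data frame. The six rows in `toy` are hypothetical, invented only to illustrate the mechanics; they are not taken from the Titanic file. margin.table() collapses the table over the dimensions you do not name, which is a quick way to check answers such as the one to Q1.

```r
# Hypothetical passengers, invented only to illustrate the mechanics
toy <- data.frame(
  Survived = c(0, 0, 1, 1, 0, 1),
  PClass   = c("3rd", "3rd", "1st", "2nd", "1st", "1st"),
  Sex      = c("male", "male", "female", "female", "male", "female")
)

# Three-way contingency table: one cell per Survived x PClass x Sex combination
toy_tbl <- xtabs(~ Survived + PClass + Sex, toy)

# Totals over the first dimension only (Survived)
by_survival <- margin.table(toy_tbl, 1)
by_survival

# Totals over PClass x Sex, ignoring Survived
by_class_sex <- margin.table(toy_tbl, c(2, 3))
by_class_sex
```

The same two calls applied to the real `tbl` give the counts that the tile areas in the mosaic plot represent.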

In the graph below,

# create a mosaic plot from the table
library(vcd)

mosaic(tbl,
       shade = TRUE,
       legend = TRUE,
       # rename the variables shown on the axes
       labeling_args = list(set_varnames = c(Sex = "Gender",
                                             Survived = "Survived",
                                             PClass = "Class")),
       # relabel the levels; names must match the table's variables (PClass, not Class)
       set_labels = list(Survived = c("No", "Yes"),
                         PClass = c("1st", "2nd", "3rd"),
                         Sex = c("F", "M")),
       main = "Titanic data")

Q1 Did more passengers survive?

No, more passengers died than survived.

Q2 Describe the largest group that didn’t survive. Discuss by class and gender.

The largest group that did not survive was third-class males.

Q3 Describe the largest group that did survive. Discuss by class and gender.

The largest group that did survive was the female second class group.

Q4 Describe one group that has more cases than expected given independence (by chance). Discuss by class and gender.

Third-class males have more cases than expected because their tile is the largest one shaded in blue.

Q5 Describe one group that has fewer cases than expected given independence (by chance). Discuss by class and gender.

First-class females have fewer cases than expected. Their tile is the smallest in the chart, though it is the tile's shading, not its size, that indicates fewer cases than expected.
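The comparison behind Q4 and Q5 can be sketched by hand for a two-way table. The counts in `obs` are hypothetical and are not Titanic figures. Under independence, the expected count in a cell is its row total times its column total divided by the grand total, and the Pearson residual scales the gap between observed and expected.

```r
# Hypothetical 2 x 2 counts, used only to illustrate the arithmetic
obs <- matrix(c(30, 10, 20, 40), nrow = 2,
              dimnames = list(Group = c("A", "B"),
                              Outcome = c("No", "Yes")))

# Expected counts if Group and Outcome were independent:
# (row total * column total) / grand total
expected <- outer(rowSums(obs), colSums(obs)) / sum(obs)

# Pearson residuals: positive means more cases than expected,
# negative means fewer cases than expected
pearson <- (obs - expected) / sqrt(expected)

expected
round(pearson, 2)
```

A positive residual corresponds to a tile with more cases than expected and a negative residual to one with fewer, regardless of how large the tile itself is.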

Q6 Create a mosaic plot for Arthritis in the same way as above.

Hint: The Arthritis data set is from the vcd package. Add an additional argument gp = shading_max in the mosaic function. This is because the residuals are too small to have color.
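The hint can be checked numerically. This is a sketch under two assumptions: the counts are hard-coded so that it runs without vcd installed (they should match ftable(tbl) for the Arthritis table built below), and the default shading in vcd is taken to color a tile only when the absolute Pearson residual exceeds 2.

```r
# Counts hard-coded to mirror xtabs(~ Improved + Treatment, Arthritis);
# compare against your own ftable(tbl) output
arth <- matrix(c(29, 7, 7, 13, 7, 21), nrow = 3,
               dimnames = list(Improved = c("None", "Some", "Marked"),
                               Treatment = c("Placebo", "Treated")))

# chisq.test() returns the Pearson residuals, (observed - expected) / sqrt(expected)
fit <- chisq.test(arth)
res <- fit$residuals
round(res, 2)

# Largest residual in absolute value; if it stays below 2,
# the default shading leaves every tile uncolored
max_abs_res <- max(abs(res))
max_abs_res
```

If no residual reaches the assumed cutoff of 2, gp = shading_max is a reasonable alternative because it derives its cutoffs from the data instead of using fixed ones.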

library(tidyverse)
# Import data
data(Arthritis, package = "vcd")
Arthritis
str(Arthritis)
# create a table
tbl <- xtabs(~Improved + Treatment, Arthritis)
ftable(tbl)

In the graph below,

# create a mosaic plot from the table
library(vcd)

mosaic(tbl,
       shade = TRUE,
       legend = TRUE,
       main = "Arthritis data", gp = shading_max)

Q7 Repeat Q1-Q5.

No. More patients showed no improvement than showed either some or marked improvement.

The largest group that did not improve was the placebo group.

The largest group that did improve was the treated group with marked improvement.

Treated patients with marked improvement have more cases than expected: 21 observed against about 13.7 expected if improvement were independent of treatment.

Placebo patients with marked improvement have fewer cases than expected: 7 observed against about 14.3 expected under independence.

Q8 Hide the messages, the code and its results on the webpage.

Hint: Use message, echo and results in the chunk options. Refer to the RMarkdown Reference Guide.
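One way to apply the hint is to set the three options once for the whole document. This is a sketch that assumes the page is rendered with knitr and that the call is placed in the first chunk of the .Rmd file. The same option names can instead be written in an individual chunk header.

```r
# Sketch: set chunk options globally, typically from a setup chunk
knitr::opts_chunk$set(
  message = FALSE,   # suppress messages such as package start-up notes
  echo    = FALSE,   # do not show the source code
  results = "hide"   # do not show printed text output
)
```

Note that results = "hide" affects printed text output only, so the mosaic plots themselves still appear on the page.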

Q9 Display the title and your name correctly at the top of the webpage.

Q10 Use the correct slug.