In this exercise you will learn to visualize the relationships between categorical variables. To this end, you will make your own notes on Section 8.5, Mosaic plots, from Data Visualization with R.
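Mosaic charts can display the relationship between categorical variables using rectangles whose areas represent the proportion of cases for any given combination of levels, and color that can indicate the degree of relationship among the variables.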
The Titanic data set came from https://osf.io/aupb4/.
## # A tibble: 1,313 x 5
## Name PClass Age Sex Survived
## <chr> <chr> <dbl> <chr> <dbl>
## 1 Allen, Miss Elisabeth Walton 1st 29 fema… 1
## 2 Allison, Miss Helen Loraine 1st 2 fema… 0
## 3 Allison, Mr Hudson Joshua Creighton 1st 30 male 0
## 4 Allison, Mrs Hudson JC (Bessie Waldo Daniel… 1st 25 fema… 0
## 5 Allison, Master Hudson Trevor 1st 0.92 male 1
## 6 Anderson, Mr Harry 1st 47 male 1
## 7 Andrews, Miss Kornelia Theodosia 1st 63 fema… 1
## 8 Andrews, Mr Thomas, jr 1st 39 male 0
## 9 Appleton, Mrs Edward Dale (Charlotte Lamson) 1st 58 fema… 1
## 10 Artagaveytia, Mr Ramon 1st 71 male 0
## # … with 1,303 more rows
## Classes 'spec_tbl_df', 'tbl_df', 'tbl' and 'data.frame': 1313 obs. of 5 variables:
## $ Name : chr "Allen, Miss Elisabeth Walton" "Allison, Miss Helen Loraine" "Allison, Mr Hudson Joshua Creighton" "Allison, Mrs Hudson JC (Bessie Waldo Daniels)" ...
## $ PClass : chr "1st" "1st" "1st" "1st" ...
## $ Age : num 29 2 30 25 0.92 47 63 39 58 71 ...
## $ Sex : chr "female" "female" "male" "female" ...
## $ Survived: num 1 0 0 0 1 1 1 0 1 0 ...
## - attr(*, "spec")=
## .. cols(
## .. Name = col_character(),
## .. PClass = col_character(),
## .. Age = col_double(),
## .. Sex = col_character(),
## .. Survived = col_double()
## .. )
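The listing above shows the raw layout, with one row per passenger. The cross-tabulation that follows is presumably produced with xtabs() and ftable(), as in the book section; this is an assumption, because the exercise hides its code. A minimal sketch, where first10 is a hypothetical data frame holding only the ten rows printed above rather than the full file:

```r
# Hypothetical mini data set: only the ten rows printed above
# (names omitted), not the full 1,313-row file.
first10 <- data.frame(
  PClass   = rep("1st", 10),
  Age      = c(29, 2, 30, 25, 0.92, 47, 63, 39, 58, 71),
  Sex      = c("female", "female", "male", "female", "male",
               "male", "female", "male", "female", "male"),
  Survived = c(1, 0, 0, 0, 1, 1, 1, 0, 1, 0)
)

# Count passengers for every combination of the three categorical variables.
tbl10 <- xtabs(~ Survived + PClass + Sex, data = first10)

# Flatten the three-way table into a compact two-dimensional printout.
ftable(tbl10)
```

Applied to the full data set, the same two calls would be expected to give a table with all three passenger classes.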
##                 Sex female male
## Survived PClass
## 0        1st             9  120
##          2nd            13  147
##          3rd           132  441
## 1        1st           134   59
##          2nd            94   25
##          3rd            80   58
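The mosaic plot for this table can be sketched as follows. Because the exercise hides its code and the raw file is not bundled here, the sketch rebuilds the table from the counts printed above; the name titanic_tbl, the plot title and the variable order in the formula are assumptions. The plot is drawn only if the vcd package is installed.

```r
# Counts copied from the table above.
# Dimension order is Survived, PClass, Sex; the first index varies fastest.
titanic_tbl <- as.table(array(
  c(  9, 134,  13, 94, 132, 80,   # female: 1st (0, 1), 2nd (0, 1), 3rd (0, 1)
    120,  59, 147, 25, 441, 58),  # male:   same order
  dim = c(2, 3, 2),
  dimnames = list(Survived = c("0", "1"),
                  PClass   = c("1st", "2nd", "3rd"),
                  Sex      = c("female", "male"))
))

# Print in the same flattened layout as above.
ftable(titanic_tbl)

# Draw the mosaic plot only if the vcd package is installed.
if (requireNamespace("vcd", quietly = TRUE)) {
  # shade = TRUE colors the tiles by their residuals from the independence model.
  vcd::mosaic(~ PClass + Sex + Survived, data = titanic_tbl,
              shade = TRUE, main = "Titanic data")
}
```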
In the graph below,
More passengers died (862) than survived (450).
Males of the third class.
The largest groups of survivors were females of the first class (134) and of the second class (94).
Males of the third class. If class and gender didn’t matter, then fewer of them would have died.
Males of the third class.
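These answers can be checked against the counts in the table. The sketch below rebuilds the table from the printed counts (titanic_tbl is a hypothetical name), totals deaths and survivals, and compares the observed number of third-class male deaths with the number expected if survival, class and sex were mutually independent. Mutual independence is assumed here as the reference model; under it the expected count comes out at roughly 303, well below the 441 observed.

```r
# Counts copied from the table above (dimension order: Survived, PClass, Sex).
titanic_tbl <- as.table(array(
  c(  9, 134,  13, 94, 132, 80,   # female
    120,  59, 147, 25, 441, 58),  # male
  dim = c(2, 3, 2),
  dimnames = list(Survived = c("0", "1"),
                  PClass   = c("1st", "2nd", "3rd"),
                  Sex      = c("female", "male"))
))
n <- sum(titanic_tbl)

# Totals by survival status (0 = died, 1 = survived).
survival_totals <- margin.table(titanic_tbl, 1)
print(survival_totals)

# Observed deaths among third-class males.
observed <- titanic_tbl["0", "3rd", "male"]

# Expected deaths in that cell under mutual independence:
# n * P(died) * P(3rd class) * P(male).
expected <- n *
  (sum(titanic_tbl["0", , ]) / n) *
  (sum(titanic_tbl[, "3rd", ]) / n) *
  (sum(titanic_tbl[, , "male"]) / n)

c(observed = observed, expected = round(expected, 1))
```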
Hint: The Arthritis data set is from the vcd package. Add the additional argument gp = shading_max to the mosaic function, because the residuals are otherwise too small to be colored.
## ID Treatment Sex Age Improved
## 1 57 Treated Male 27 Some
## 2 46 Treated Male 29 None
## 3 77 Treated Male 30 None
## 4 17 Treated Male 32 Marked
## 5 36 Treated Male 46 Marked
## 6 23 Treated Male 58 Marked
## 7 75 Treated Male 59 None
## 8 39 Treated Male 59 Marked
## 9 33 Treated Male 63 None
## 10 55 Treated Male 63 None
## 11 30 Treated Male 64 None
## 12 5 Treated Male 64 Some
## 13 63 Treated Male 69 None
## 14 83 Treated Male 70 Marked
## 15 66 Treated Female 23 None
## 16 40 Treated Female 32 None
## 17 6 Treated Female 37 Some
## 18 7 Treated Female 41 None
## 19 72 Treated Female 41 Marked
## 20 37 Treated Female 48 None
## 21 82 Treated Female 48 Marked
## 22 53 Treated Female 55 Marked
## 23 79 Treated Female 55 Marked
## 24 26 Treated Female 56 Marked
## 25 28 Treated Female 57 Marked
## 26 60 Treated Female 57 Marked
## 27 22 Treated Female 57 Marked
## 28 27 Treated Female 58 None
## 29 2 Treated Female 59 Marked
## 30 59 Treated Female 59 Marked
## 31 62 Treated Female 60 Marked
## 32 84 Treated Female 61 Marked
## 33 64 Treated Female 62 Some
## 34 34 Treated Female 62 Marked
## 35 58 Treated Female 66 Marked
## 36 13 Treated Female 67 Marked
## 37 61 Treated Female 68 Some
## 38 65 Treated Female 68 Marked
## 39 11 Treated Female 69 None
## 40 56 Treated Female 69 Some
## 41 43 Treated Female 70 Some
## 42 9 Placebo Male 37 None
## 43 14 Placebo Male 44 None
## 44 73 Placebo Male 50 None
## 45 74 Placebo Male 51 None
## 46 25 Placebo Male 52 None
## 47 18 Placebo Male 53 None
## 48 21 Placebo Male 59 None
## 49 52 Placebo Male 59 None
## 50 45 Placebo Male 62 None
## 51 41 Placebo Male 62 None
## 52 8 Placebo Male 63 Marked
## 53 80 Placebo Female 23 None
## 54 12 Placebo Female 30 None
## 55 29 Placebo Female 30 None
## 56 50 Placebo Female 31 Some
## 57 38 Placebo Female 32 None
## 58 35 Placebo Female 33 Marked
## 59 51 Placebo Female 37 None
## 60 54 Placebo Female 44 None
## 61 76 Placebo Female 45 None
## 62 16 Placebo Female 46 None
## 63 69 Placebo Female 48 None
## 64 31 Placebo Female 49 None
## 65 20 Placebo Female 51 None
## 66 68 Placebo Female 53 None
## 67 81 Placebo Female 54 None
## 68 4 Placebo Female 54 None
## 69 78 Placebo Female 54 Marked
## 70 70 Placebo Female 55 Marked
## 71 49 Placebo Female 57 None
## 72 10 Placebo Female 57 Some
## 73 47 Placebo Female 58 Some
## 74 44 Placebo Female 59 Some
## 75 24 Placebo Female 59 Marked
## 76 48 Placebo Female 61 None
## 77 19 Placebo Female 63 Some
## 78 3 Placebo Female 64 None
## 79 67 Placebo Female 65 Marked
## 80 32 Placebo Female 66 None
## 81 42 Placebo Female 66 None
## 82 15 Placebo Female 66 Some
## 83 71 Placebo Female 68 Some
## 84 1 Placebo Female 74 Marked
## 'data.frame': 84 obs. of 5 variables:
## $ ID : int 57 46 77 17 36 23 75 39 33 55 ...
## $ Treatment: Factor w/ 2 levels "Placebo","Treated": 2 2 2 2 2 2 2 2 2 2 ...
## $ Sex : Factor w/ 2 levels "Female","Male": 2 2 2 2 2 2 2 2 2 2 ...
## $ Age : int 27 29 30 32 46 58 59 59 63 63 ...
## $ Improved : Ord.factor w/ 3 levels "None"<"Some"<..: 2 1 1 3 3 3 1 3 1 1 ...
##          Treatment Placebo Treated
## Improved
## None                    29      13
## Some                     7       7
## Marked                   7      21
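The hint about gp = shading_max can be illustrated with the counts in this table. In the sketch below, the name art and the seed value are assumptions. It first computes the Pearson residuals in base R; all of them stay below 2 in absolute value, which is the first cut-off of vcd's default shading, so the default plot would remain uncolored. It then draws the plot with shading_max, provided vcd is installed.

```r
# Counts copied from the table above (rows: Improved, columns: Treatment).
art <- as.table(matrix(
  c(29, 7, 7,    # Placebo: None, Some, Marked
    13, 7, 21),  # Treated: None, Some, Marked
  nrow = 3,
  dimnames = list(Improved  = c("None", "Some", "Marked"),
                  Treatment = c("Placebo", "Treated"))
))

# Pearson residuals under independence: (observed - expected) / sqrt(expected).
res <- chisq.test(art)$residuals
round(res, 2)
max(abs(res))  # stays below 2

# Draw the mosaic plot only if the vcd package is installed.
if (requireNamespace("vcd", quietly = TRUE)) {
  library(vcd)
  set.seed(1)  # shading_max relies on a permutation test, so fix the seed
  mosaic(art, gp = shading_max)
}
```

With the package's own data, the table could instead be built with xtabs(~ Improved + Treatment, data = Arthritis).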
In the graph below,
Hint: Use message, echo and results in the chunk options. Refer to the RMarkdown Reference Guide.
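As a minimal sketch of how these three options might be applied (which chunks should use them is an assumption, since the exercise does not say), they can be set for the whole document through knitr, or for one chunk in its header:

```r
# Document-wide defaults, typically set once in the first chunk of the .Rmd file.
if (requireNamespace("knitr", quietly = TRUE)) {
  knitr::opts_chunk$set(
    echo    = FALSE,  # do not show the R code in the output
    message = FALSE   # drop messages such as package start-up notes
  )
}

# The same options can be set for a single chunk in its header, for example:
#   {r, echo = FALSE, message = FALSE, results = "hide"}
# results = "hide" also suppresses the printed output of that chunk.
```

Setting results = "hide" per chunk rather than globally keeps the tables that should stay visible in the rendered document.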