These data were obtain from the Australian Bureau of Statistics website, under the Census section (http://www.abs.gov.au/census). A direct URL to the tabulated data cannot be provided as this area of the site requires a “Guest User” login. Details of the steps of bring these data into R for pre-processing are outlined in the appendix.
The data table used for the analysis.
census <- read_xlsx("FINALtable.xlsx", 1)
## Warning in strptime(x, format, tz = tz): unknown timezone 'zone/tz/2018c.
## 1.0/zoneinfo/Australia/Melbourne'
# Checking the data
head(census, n = 5)
## # A tibble: 5 x 24
## `SEXP Sex` `EETP Engagement in Em… `0-4 years` `5-9 years` `10-14 years`
## <chr> <chr> <dbl> <dbl> <dbl>
## 1 Male Fully engaged 0. 0. 0.
## 2 <NA> Partially engaged 0. 0. 0.
## 3 <NA> At least partially eng… 0. 0. 0.
## 4 <NA> Not Engaged 0. 0. 0.
## 5 <NA> Engagement status unde… 0. 0. 0.
## # ... with 19 more variables: `15-19 years` <dbl>, `20-24 years` <dbl>,
## # `25-29 years` <dbl>, `30-34 years` <dbl>, `35-39 years` <dbl>, `40-44
## # years` <dbl>, `45-49 years` <dbl>, `50-54 years` <dbl>, `55-59
## # years` <dbl>, `60-64 years` <dbl>, `65-69 years` <dbl>, `70-74
## # years` <dbl>, `75-79 years` <dbl>, `80-84 years` <dbl>, `85-89
## # years` <dbl>, `90-94 years` <dbl>, `95-99 years` <dbl>, `100 years and
## # over` <dbl>, Total <dbl>
# Checking the data
tail(census, n = 5)
## # A tibble: 5 x 24
## `SEXP Sex` `EETP Engagement in Em… `0-4 years` `5-9 years` `10-14 years`
## <chr> <chr> <dbl> <dbl> <dbl>
## 1 <NA> Partially engaged 0. 0. 0.
## 2 <NA> At least partially eng… 0. 0. 0.
## 3 <NA> Not Engaged 0. 0. 0.
## 4 <NA> Engagement status unde… 0. 0. 0.
## 5 <NA> Not applicable 1464782. 1502647. 1397187.
## # ... with 19 more variables: `15-19 years` <dbl>, `20-24 years` <dbl>,
## # `25-29 years` <dbl>, `30-34 years` <dbl>, `35-39 years` <dbl>, `40-44
## # years` <dbl>, `45-49 years` <dbl>, `50-54 years` <dbl>, `55-59
## # years` <dbl>, `60-64 years` <dbl>, `65-69 years` <dbl>, `70-74
## # years` <dbl>, `75-79 years` <dbl>, `80-84 years` <dbl>, `85-89
## # years` <dbl>, `90-94 years` <dbl>, `95-99 years` <dbl>, `100 years and
## # over` <dbl>, Total <dbl>
# Checking the size
dim(census)
## [1] 18 24
# Display the first attribute as a vector
c(census[1:18,1])
## $`SEXP Sex`
## [1] "Male" NA NA NA NA NA "Female"
## [8] NA NA NA NA NA "Total" NA
## [15] NA NA NA NA
# Display the first attribute as a vector
c(census[1:18,1])
## $`SEXP Sex`
## [1] "Male" "Male" "Male" "Male" "Male" "Male" "Female"
## [8] "Female" "Female" "Female" "Female" "Female" "Total" "Total"
## [15] "Total" "Total" "Total" "Total"
headerNew <- c("Gender", "Engagement", "2", "7", "12", "17", "22", "27",
"32", "37", "42", "47", "52", "57", "62", "67", "72", "77",
"82", "87", "92", "97", "100+", "Total")
colnames(census) <- headerNew
# Drop age 0 - 18 (no data recorded) & drop 100+ (different interval)
censusViz <- census[-c(3, 5, 6, 9, 11, 12, 13, 14, 15, 16, 17, 18), -c(3, 4, 5, 23, 24)]
# Checking the data
str(censusViz)
## Classes 'tbl_df', 'tbl' and 'data.frame': 6 obs. of 19 variables:
## $ Gender : chr "Male" "Male" "Male" "Female" ...
## $ Engagement: chr "Fully engaged" "Partially engaged" "Not Engaged" "Fully engaged" ...
## $ 17 : num 587338 43097 51755 564773 49503 ...
## $ 22 : num 530431 92269 95785 487796 115051 ...
## $ 27 : num 558982 83821 87244 453365 143524 ...
## $ 32 : num 587397 75855 81836 369790 202239 ...
## $ 37 : num 541873 69043 80007 299536 232233 ...
## $ 42 : num 535291 72456 89937 317693 253261 ...
## $ 47 : num 516059 75111 101870 332521 244726 ...
## $ 52 : num 473883 78123 118341 306164 221484 ...
## $ 57 : num 394833 83950 154576 240776 202529 ...
## $ 62 : num 242659 89835 232986 129180 150442 ...
## $ 67 : num 93675 67358 362931 40828 73753 ...
## $ 72 : num 27724 33984 325735 10427 26378 ...
## $ 77 : num 9221 13591 249343 3326 8077 ...
## $ 82 : num 3111 4702 169140 1228 2458 ...
## $ 87 : num 1161 1640 101497 522 837 ...
## $ 92 : num 348 363 37806 242 226 ...
## $ 97 : num 457 105 6877 272 205 ...
# Turn the Gender and engagement into factors rather than Char
censusViz$Gender <- factor(censusViz$Gender)
censusViz$Engagement <- factor(censusViz$Engagement)
# Checking the data
summary(censusViz)
## Gender Engagement 17 22
## Female:3 Fully engaged :2 Min. : 38476 Min. : 92269
## Male :3 Not Engaged :2 1st Qu.: 44698 1st Qu.: 97325
## Partially engaged:2 Median : 50629 Median :108498
## Mean :222490 Mean :237213
## 3rd Qu.:436518 3rd Qu.:394610
## Max. :587338 Max. :530431
## 27 32 37 42
## Min. : 83821 Min. : 75855 Min. : 69043 Min. : 72456
## 1st Qu.:101314 1st Qu.:107958 1st Qu.:103692 1st Qu.:108451
## Median :147530 Median :194280 Median :203491 Median :208627
## Mean :246412 Mean :250573 Mean :232907 Mean :238772
## 3rd Qu.:377908 3rd Qu.:327902 3rd Qu.:282710 3rd Qu.:301585
## Max. :558982 Max. :587397 Max. :541873 Max. :535291
## 47 52 57 62
## Min. : 75111 Min. : 78123 Min. : 83950 Min. : 89835
## 1st Qu.:116682 1st Qu.:132619 1st Qu.:166564 1st Qu.:134496
## Median :202922 Median :198468 Median :215932 Median :191714
## Mean :238567 Mean :228908 Mean :217667 Mean :194490
## 3rd Qu.:310572 3rd Qu.:284994 3rd Qu.:237916 3rd Qu.:240241
## Max. :516059 Max. :473883 Max. :394833 Max. :321841
## 67 72 77 82
## Min. : 40828 Min. : 10427 Min. : 3326 Min. : 1228
## 1st Qu.: 68957 1st Qu.: 26714 1st Qu.: 8363 1st Qu.: 2621
## Median : 83714 Median : 30854 Median : 11406 Median : 3906
## Mean :178894 Mean :132674 Mean : 95807 Mean : 65931
## 3rd Qu.:295617 3rd Qu.:252797 3rd Qu.:190405 3rd Qu.:128030
## Max. :434817 Max. :371799 Max. :291283 Max. :214948
## 87 92 97
## Min. : 522 Min. : 226.0 Min. : 105.0
## 1st Qu.: 918 1st Qu.: 268.5 1st Qu.: 221.8
## Median : 1400 Median : 355.5 Median : 364.5
## Mean : 42908 Mean :18949.2 Mean : 4477.3
## 3rd Qu.: 76533 3rd Qu.:28445.2 3rd Qu.: 5272.0
## Max. :151789 Max. :74710.0 Max. :18948.0
# Converting from Wide to Long format
censusVizNew <- censusViz %>% gather(`17`:`97`, key = "Age", value = "Number")
# Fully Engaged
fullyEngaged <- ggplot(data = censusVizNew[censusVizNew$Engagement=="Fully engaged",], aes(x = as.numeric(Age), y = Number, colour = Engagement)) +
geom_density2d(bins = 40, size = 0.4, colour = "#1f78b4", show.legend = TRUE) +
geom_point(show.legend = TRUE) +
labs(colour = "Engagement Level") +
scale_colour_manual(values=c("black")) +
scale_y_continuous(limits = c(0, 600000), labels = comma, breaks=seq(0,600000,50000)) +
scale_x_continuous(labels = comma, breaks=seq(17,97,10)) +
labs(y="Number of Individuals") +
theme(axis.title.x=element_blank(),
axis.title.y = element_text(size=11, face="bold"),
axis.ticks.length=unit(.03, "cm"),
legend.title = element_text(size=10, face="bold"),
legend.text = element_text(size=9),
panel.border = element_rect(fill = NA, colour = "grey30", size = 0.1)) +
facet_grid(.~Gender) +
theme(strip.text.x = element_text(size = 10, face = "bold"))
# Partially Engaged
partiallyEngaged <- ggplot(data = censusVizNew[censusVizNew$Engagement=="Partially engaged",], aes(x = as.numeric(Age), y = Number, colour = Engagement)) +
geom_density2d(bins = 40, size = 0.4, colour = "#4daf4a", show.legend = TRUE) +
geom_point(show.legend = TRUE) +
labs(colour = "Engagement Level") +
scale_colour_manual(values=c("black")) +
scale_y_continuous(limits = c(0, 600000), labels = comma, breaks=seq(0,600000,50000)) +
scale_x_continuous(labels = comma, breaks=seq(17,97,10)) +
labs(y="Number of Individuals") +
theme(axis.title.x=element_blank(),
axis.title.y = element_text(size=11, face="bold"),
legend.title = element_text(size=10, face="bold"),
legend.text = element_text(size=9),
plot.title = element_text(hjust=0.5),
axis.ticks.length=unit(.03, "cm"),
panel.border = element_rect(fill = NA, colour = "grey30", size = 0.1)) +
facet_grid(.~Gender) +
theme(strip.text.x = element_text(size = 10, face = "bold"))
# Not Engaged
notEngaged <- ggplot(data = censusVizNew[censusVizNew$Engagement=="Not Engaged",], aes(x = as.numeric(Age), y = Number, colour = Engagement)) +
geom_density2d(bins = 40, size = 0.4, colour = "#e41a1c", show.legend = TRUE) +
geom_point(show.legend = TRUE) +
labs(colour = "Engagement Level") +
scale_colour_manual(values=c("black")) +
scale_y_continuous(limits = c(0, 600000), labels = comma, breaks=seq(0,600000,50000)) +
scale_x_continuous(labels = comma, breaks=seq(17,97,10)) +
labs(y="Number of Individuals") +
theme(axis.title.x=element_blank(),
axis.title.y = element_text(size=11, face = "bold"),
legend.title = element_text(size=10, face="bold"),
legend.text = element_text(size=9),
axis.ticks.length=unit(.03, "cm"),
panel.border = element_rect(fill = NA, colour = "grey30", size = 0.1)) +
facet_grid(.~Gender) +
theme(strip.text.x = element_text(size = 10, face = "bold"))
title1 = textGrob("2D Density Plots (with trend points) of Engagement and Age from Census 2016", hjust = 0.567, gp=gpar(fontface="bold"))
bottom1 = textGrob("Age of Individuals (years)", hjust = 0.68, gp=gpar(fontface="bold", fontsize=10))
grid.arrange(fullyEngaged, partiallyEngaged, notEngaged,
#left="Number of Individuals",
bottom=bottom1,
top = title1)
The figure shows 2D density plots overlaid with the discrete sample number from the 2016 Australian Census (http://www.abs.gov.au/census). The columns represent gender, while each row represents engagement level (with the y-axis giving number of individuals, and the x-axis age in years). Fully engaged (for example full time work or full time tertiary study) shows a peak in male participation around child-bearing age, while females immediately begin disengaging from full engagement (participants under the age of 17 years are overwhelmingly still engaged in high school study, not included in these data). Females leaving child-bearing age (around 37 years of age) increase their engagement and then disengage at a rate equivalent to males (starting at approximately 37 years for females and approximately 32 years for males). The number of females in the partially engaged category (for example part-time work or study) gradually increases during child-bearing years, then gradually decrease. Males are relatively constant in this category. For both males and females, the category of not engaged (neither working or studying) is quite similar. In both plots an inflexion point is seen at 67 years, most likely associated with morbidity.