2. Provide step-by-step description on how the data visualisation was prepared by using ggplot2 and other related R packages. (3 marks)
Step 1
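1.1 Load the required packages (tidyverse, dplyr, ggplot2) and import the dataset using read_csv
1.2 Set options(scipen=10000) so that large population counts are printed in fixed notation (100000) rather than scientific notation (1e+05). The negative values on the population axis are handled separately in Step 4, where the axis labels are set to absolute values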
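# Install any missing packages, then load them
packages <- c('tidyverse', 'dplyr', 'ggplot2')
for (p in packages) {
  if (!require(p, character.only = TRUE)) {
    install.packages(p)
  }
  library(p, character.only = TRUE)
}

# Print large numbers in fixed rather than scientific notation
options(scipen = 10000)

# Import the dataset (read_csv comes from readr, which is loaded with tidyverse)
pop_data <- read_csv("data/respopagesextod2011to2020.csv")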
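The effect of the scipen option can be checked in isolation. The snippet below is a minimal sketch using base R only; the variable names are illustrative and not part of the original script. It formats the same number under R's default setting and under the setting used above, then restores the previous option value.

```r
# Base R only: show how the scipen option changes number formatting.
old <- options(scipen = 0)        # R's default penalty
default_label <- format(100000)   # scientific notation is shorter, so it is chosen
options(scipen = 10000)           # heavily penalise scientific notation
fixed_label <- format(100000)     # fixed notation is now chosen
options(old)                      # restore the previous setting

print(c(default_label, fixed_label))
```

This is presumably why the option is set before plotting: axis values such as 100000 stay readable instead of appearing as 1e+05.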
Step 2
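2.1 Create “males” and “females” dataframes using dplyr
2.2 For each gender’s dataframe, filter for the relevant Sex and Time == ‘2019’
2.3 For the “females” dataframe, multiply Pop by -1 so that its bars extend to the left of zero
2.4 Arrange the data in ascending order of Pop
2.5 Using mutate and factor, set the age group levels in ascending order of age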
# Age groups in ascending order of age (factor() would otherwise sort them as text)
age_levels <- c("0_to_4", "5_to_9", "10_to_14", "15_to_19", "20_to_24", "25_to_29", "30_to_34", "35_to_39", "40_to_44", "45_to_49", "50_to_54", "55_to_59", "60_to_64", "65_to_69", "70_to_74", "75_to_79", "80_to_84", "85_to_89", "90_and_over")

males <- pop_data %>%
  filter(Sex == 'Males' & Time == '2019') %>%
  arrange(Pop) %>%
  mutate(AG = factor(AG, levels = age_levels))

females <- pop_data %>%
  filter(Sex == 'Females' & Time == '2019') %>%
  mutate(Pop = Pop * -1) %>%   # negative counts plot to the left of zero
  arrange(Pop) %>%
  mutate(AG = factor(AG, levels = age_levels))
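The two ideas in this step, explicit factor levels and negated counts, can be illustrated on a toy example. This is a minimal sketch in base R; the three age groups are a subset of those above and the counts are made up purely for illustration.

```r
# Toy age-group labels (a subset of those used above).
ag <- c("5_to_9", "0_to_4", "10_to_14")

# Without explicit levels, factor() sorts the labels as text,
# so "10_to_14" lands before "5_to_9".
default_levels <- levels(factor(ag))

# With explicit levels, the order follows age.
ordered_levels <- levels(factor(ag, levels = c("0_to_4", "5_to_9", "10_to_14")))

# Negating one group's counts makes its bars extend to the left of zero.
female_pop <- c(120, 95, 110)   # made-up counts, for illustration only
mirrored <- female_pop * -1

print(default_levels)
print(ordered_levels)
print(mirrored)
```

Without the explicit levels, the age groups on the pyramid would appear in text order rather than age order.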
Step 3
3.1 Prepare the pop_data dataframe, which is passed to ggplot as the base data
3.2 Filter for Time == ‘2019’ and arrange the data in ascending order of the Pop variable
3.3 Using mutate and factor, set the age group levels in ascending order of age
pop_data <- pop_data %>%
  filter(Time == '2019') %>%
  arrange(Pop) %>%
  mutate(AG = factor(AG, levels = c("0_to_4", "5_to_9", "10_to_14", "15_to_19", "20_to_24", "25_to_29", "30_to_34", "35_to_39", "40_to_44", "45_to_49", "50_to_54", "55_to_59", "60_to_64", "65_to_69", "70_to_74", "75_to_79", "80_to_84", "85_to_89", "90_and_over")))
Step 4
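4.1 Add a geom_bar layer for females and for males respectively
4.2 Set the breaks of the continuous y-axis scale, label them with absolute values so that the female counts are not shown as negative, and flip the coordinates
4.3 Add the axis labels, plot title, legend title and data source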
ggplot(data = pop_data, aes(x = AG, fill = TOD)) +
  geom_bar(data = females, aes(y = Pop), stat = 'identity') +
  geom_bar(data = males, aes(y = Pop), stat = 'identity') +
  scale_y_continuous(breaks = seq(-300000, 300000, 100000), labels = abs(seq(-300000, 300000, 100000))) +
  coord_flip() +
  xlab('Age Group') +
  ylab('Females Males \nPopulation Count') +
  ggtitle('Population Pyramid - Age Groups vs Gender vs Type of Dwelling (2019)') +
  labs(fill = "Type of Dwelling")
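The same mirroring technique can be tried on a small, self-contained example. The sketch below is a simplification and not the plot above: it uses a single toy dataframe with made-up counts, fills by Sex instead of type of dwelling, and uses geom_col, which is ggplot2's shorthand for geom_bar(stat = 'identity'). The names toy and pyramid are illustrative.

```r
library(ggplot2)

# Made-up counts for illustration only; not taken from the SingStat file.
toy <- data.frame(
  AG  = factor(rep(c("0_to_4", "5_to_9", "10_to_14"), times = 2),
               levels = c("0_to_4", "5_to_9", "10_to_14")),
  Sex = rep(c("Females", "Males"), each = 3),
  Pop = c(90, 100, 110, 95, 105, 115)
)

# Mirror the female counts so they plot to the left of zero.
toy$Pop <- ifelse(toy$Sex == "Females", -toy$Pop, toy$Pop)

pyramid <- ggplot(toy, aes(x = AG, y = Pop, fill = Sex)) +
  geom_col() +                          # same as geom_bar(stat = "identity")
  scale_y_continuous(labels = abs) +    # show magnitudes on both sides of zero
  coord_flip() +                        # age groups on the vertical axis
  labs(x = "Age Group", y = "Population Count")

print(pyramid)
```

Passing a function such as abs to labels avoids writing the break sequence twice, at the cost of leaving the choice of breaks to ggplot2.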

Data Source: https://www.singstat.gov.sg/find-data/search-by-theme/population/geographic-distribution/latest-data
3. The final data visualization and a short description of not more than 350 words. The description must provide at least two useful information revealed by the data visualization. (4 marks)
Aim of data visualization: Illustrate the breakdown of Singapore’s demographics across 3 categories: Age Group, Gender and Type of Dwelling
Description:
- A population pyramid is used for the above-mentioned data visualization, with females on the left and males on the right.
- The age groups are arranged bottom-up, in ascending order.
- Within every age group, the bars are colour-coded to separate the different types of dwelling.
- The vertical axis (Y-Axis) represents the age groups.
- The horizontal axis (X-Axis) represents the population count per age group.
Useful Information 1 - Ageing Population
The data visualization reveals a “beehive” shape that is widest in the middle, showing that the majority of the population sits in the 30-34 to 60-64 age groups. This suggests that Singapore has an ageing population: as these large cohorts grow older, an increasing proportion of the population will fall into the senior age groups, placing pressure on a shrinking Singaporean workforce that has to support them.
Useful Information 2 - Even Distribution of Type of Dwelling between Genders for all age groups
Across all age groups, the distribution of dwelling types is generally even between males and females, with slightly more disparity in the senior age groups.
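A claim such as "the majority of the population sits in the 30-34 to 60-64 age groups" can be checked numerically. The snippet below is a minimal sketch in base R: share_in_groups is a hypothetical helper, not part of the original script, and the toy counts are made up for illustration rather than taken from the 2019 data. On the real data it would be applied to pop_data after Step 3, where Pop is still positive for both genders.

```r
# Share of the total population that falls in a given set of age groups.
# `df` needs columns AG (age group) and Pop (non-negative counts).
share_in_groups <- function(df, groups) {
  sum(df$Pop[df$AG %in% groups]) / sum(df$Pop)
}

# Made-up counts for illustration only; not the 2019 figures.
toy <- data.frame(
  AG  = c("0_to_4", "30_to_34", "60_to_64", "90_and_over"),
  Pop = c(10, 40, 30, 20)
)

middle_share <- share_in_groups(toy, c("30_to_34", "60_to_64"))
print(middle_share)
```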