In this assignment, the aim is to analyse the Singapore resident population by age structure and planning area. The dataset covers Singapore's resident population from June 2011 to 2019. The columns of the dataset are as follows:

| Column | Description |
|---|---|
| PA | Planning Area |
| SZ | Subzone |
| AG | Age Group |
| Sex | Sex |
| TOD | Type of Dwelling |
| Pop | Resident Count |
| Time | Time/Period |

We use this dataset to analyse the following questions:
- Population by age group and sex
- Population by planning area and age group
To address these questions, two visualisations are proposed, sketched below:
1. Population pyramid by age group and sex
2. Population heatmap by age group and planning area
![Alt text](/Users/User/Documents/sketches.jpg)
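Load the required packages, installing any that are missing. plyr is attached before tidyverse so that its functions do not mask the dplyr functions of the same name.
packages <- c('plyr', 'tidyverse', 'readr', 'plotly', 'ggplot2', 'tibble', 'ggthemes', 'ggpubr')
for (p in packages) {
  if (!require(p, character.only = TRUE)) {
    install.packages(p)
  }
  library(p, character.only = TRUE)
}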
data <- read_csv("respopagesextod2011to2019.csv")
We focus on the 2019 population, so we filter the data to keep only the 2019 records.
data2019 <- filter(data, Time == "2019")
For the pyramid, we aggregate the population by age group and sex, summing over planning areas, subzones and dwelling types.
population <- aggregate(Pop ~ AG + Sex, data = data2019, FUN = sum)
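To make the formula interface concrete, here is a minimal sketch on a made-up data frame. The name `toy` and all its values are purely illustrative and are not taken from the dataset. Every variable to the right of `~` becomes a grouping column, and `Pop` is summed within each group.

```r
# Toy data: hypothetical values, only to illustrate aggregate()
toy <- data.frame(
  AG  = c("0_to_4", "0_to_4", "0_to_4", "5_to_9"),
  Sex = c("Males", "Males", "Females", "Males"),
  Pop = c(10, 20, 5, 7)
)

# Sum Pop within each AG x Sex combination; the two "0_to_4"/"Males"
# rows collapse into a single row
toy_sum <- aggregate(Pop ~ AG + Sex, data = toy, FUN = sum)
print(toy_sum)
```

The total of `Pop` is unchanged by the aggregation, which makes a quick sanity check possible on the real data as well.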
To plot a population pyramid, the population values for one sex (here, males) are made negative, so that the two sets of bars extend in opposite directions from 0.
population$Pop <- ifelse(population$Sex == "Males", -1*population$Pop, population$Pop)
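One detail worth checking before plotting: if `AG` is stored as text, ggplot2 arranges the age groups alphabetically, which would place a label such as `10_to_14` before `5_to_9`. The sketch below assumes that each label starts with the lower bound of its age band (for example `0_to_4` or `90_and_over`); that format is an assumption about the file, and `order_age_groups` is a hypothetical helper, not part of any package.

```r
# Hypothetical helper: order age-group labels by their leading number.
# Assumes labels look like "0_to_4", "10_to_14", "90_and_over".
order_age_groups <- function(ag) {
  labels <- unique(as.character(ag))
  lower  <- as.numeric(sub("_.*$", "", labels))  # text before the first "_"
  factor(ag, levels = labels[order(lower)])
}

ag <- c("10_to_14", "5_to_9", "90_and_over", "0_to_4")
levels(order_age_groups(ag))
# "0_to_4" "5_to_9" "10_to_14" "90_and_over"
```

If the assumption holds, `population$AG <- order_age_groups(population$AG)` would put the bands in age order on the axis.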
Plot the population pyramid using ggplot2.
p1 <- ggplot(population, aes(x = AG, y = Pop, fill = Sex)) +
  geom_bar(data = subset(population, Sex == "Females"), stat = "identity") +
  geom_bar(data = subset(population, Sex == "Males"), stat = "identity") +
  scale_y_continuous(breaks = seq(-200000, 200000, 20000),
                     labels = paste0(as.character(c(seq(200, 0, -20), seq(20, 200, 20))), "K")) +
  scale_fill_manual(values = c("orchid2", "dodgerblue1")) +
  theme_bw() +
  theme(axis.title.x = element_blank(),
        panel.grid.major.y = element_blank(),
        panel.grid.minor.y = element_blank(),
        axis.text.x.top = element_text(size = 12),
        plot.title = element_text(size = 14, face = "bold", hjust = 0.5),
        plot.subtitle = element_text(hjust = 0.5)) +
  coord_flip() +
  labs(title = "Population Pyramid of Singapore Residents, 2019")
p1
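The hand-built label vector in the `scale_y_continuous()` call has to stay in step with the breaks. As an alternative sketch, the `labels` argument also accepts a function, so the absolute value can be formatted directly from each break. `abs_thousands` is a hypothetical helper name.

```r
# Hypothetical helper: show axis values as positive thousands,
# so that a break at -40000 is labelled "40K"
abs_thousands <- function(x) paste0(abs(x) / 1000, "K")

abs_thousands(c(-40000, 0, 20000))
# "40K" "0K" "20K"
```

It could then be passed as `scale_y_continuous(breaks = seq(-200000, 200000, 20000), labels = abs_thousands)`, which keeps the labels correct if the breaks are changed later.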
Plot the population heatmap by planning area and age group. Both sexes are summed so that each combination of planning area and age group corresponds to exactly one tile.
pop2 <- aggregate(Pop ~ AG + PA, data = data2019, FUN = sum)
pop2 <- pop2 %>%
  mutate(text = paste0("Area: ", PA, "\n", Pop, " people in Age Group ", AG))
ggplot(pop2, aes(x = PA, y = AG, fill = Pop, text = text)) +
  geom_raster() +
  scale_x_discrete(expand = c(0, 0)) +
  scale_fill_gradient(name = "Population", low = "#87CEFA", high = "#191970") +
  labs(title = "Age Structure of Singapore Population by Planning Area, 2019") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1),
        axis.text.y = element_text(size = 5))
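The `text` aesthetic has no visible effect on a static ggplot. It is used when the plot is converted with plotly's `ggplotly()`, where `tooltip = "text"` restricts the hover label to that string. Below is a minimal sketch on made-up data; the `toy` data frame and its values are illustrative only, and `geom_tile()` is used here in place of `geom_raster()` for simplicity.

```r
library(ggplot2)
library(plotly)

# Toy data: hypothetical values, only to illustrate the tooltip
toy <- data.frame(
  PA  = c("A", "A", "B", "B"),
  AG  = c("0_to_4", "5_to_9", "0_to_4", "5_to_9"),
  Pop = c(100, 150, 80, 120)
)
toy$text <- paste0("Area: ", toy$PA, "\n", toy$Pop, " people in Age Group ", toy$AG)

p <- ggplot(toy, aes(x = PA, y = AG, fill = Pop, text = text)) +
  geom_tile()

# Convert to an interactive widget; hovering over a tile shows the text string
p_interactive <- ggplotly(p, tooltip = "text")
p_interactive
```

The same conversion should apply to the full heatmap once it is stored in a variable.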