Population Trends of Singapore Residents by Age Group, Sex & Planning Area in 2019

ISSS608 Visual Analytics and Applications | Assignment 4

Author: Kou Zhigang

1. Overview

This assignment analyses the Singapore resident population by age structure and planning area. The dataset contains resident population counts for Singapore from June 2011 to 2019. The columns of the dataset are as follows:

| Column | Description      |
|--------|------------------|
| PA     | Planning Area    |
| SZ     | Subzone          |
| AG     | Age Group        |
| Sex    | Sex              |
| TOD    | Type of Dwelling |
| Pop    | Resident Count   |
| Time   | Time/Period      |

We use this dataset to examine the following:
- Population by age group and sex
- Population by planning area and age group

To address these questions, two visualisations are proposed:
1. Population Pyramid by age group & sex
2. Population heatmap by age group & planning area
![Alt text](/Users/User/Documents/sketches.jpg)

2. DataViz Step-by-Step

2.1 Install and load R packages

packages <- c('tidyverse', 'readr', 'plotly','ggplot2','tibble','plyr','ggthemes','ggpubr')

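for (p in packages){
  # Install the package only if it cannot be loaded
  if (!require(p, character.only = TRUE)){
    install.packages(p)
  }
  # Attach the package (needed after a fresh install)
  library(p, character.only = TRUE)
}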

2.2 Load Data

data <- read_csv("respopagesextod2011to2019.csv")

2.3 Preparing Data for Population Pyramid

We focus on the 2019 population, so we filter the data for 2019.

data2019 <- filter(data, Time == "2019")

We aggregate the population by year, age group, sex and planning area, summing over subzones and dwelling types.

population <-aggregate(Pop~Time+AG+Sex+PA,data=data2019,FUN=sum)
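
To make the effect of the formula in aggregate() concrete, here is a minimal sketch on a toy data frame. The subzone split and all counts are hypothetical and are not taken from the dataset; the point is only that columns left out of the formula (here SZ) are collapsed by the sum.

```r
# Toy data with hypothetical counts: two subzones within one planning area
toy <- data.frame(
  PA   = c("Bedok", "Bedok", "Bedok", "Bedok"),
  SZ   = c("Bedok North", "Bedok South", "Bedok North", "Bedok South"),
  AG   = c("0_to_4", "0_to_4", "0_to_4", "0_to_4"),
  Sex  = c("Males", "Males", "Females", "Females"),
  Pop  = c(100, 150, 90, 140),
  Time = 2019
)

# Sum Pop within each Time/AG/Sex/PA combination; SZ is not in the
# formula, so the two subzones are added together
toy_agg <- aggregate(Pop ~ Time + AG + Sex + PA, data = toy, FUN = sum)
print(toy_agg)
```

The four input rows reduce to two rows, one per sex, each holding the total over both subzones.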

To plot a population pyramid, we make the population values for males negative so that the two sides of the pyramid are centred at 0.

population$Pop <- ifelse(population$Sex == "Males", -1*population$Pop, population$Pop)
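
The sign flip can be tried on a small table first. The sketch below uses hypothetical counts and also shows one possible way to derive symmetric axis breaks from the data instead of hard-coding them; the names limit, breaks and labels are illustrative and do not appear elsewhere in this report.

```r
# Toy data with hypothetical counts for two age groups
toy <- data.frame(
  AG  = c("0_to_4", "0_to_4", "25_to_29", "25_to_29"),
  Sex = c("Males", "Females", "Males", "Females"),
  Pop = c(95000, 90000, 130000, 141000)
)

# Flip the sign for males so their bars extend to the left of 0
toy$Pop <- ifelse(toy$Sex == "Males", -1 * toy$Pop, toy$Pop)

# Longest bar on either side, rounded up to the next multiple of 20,000
limit <- ceiling(max(abs(toy$Pop)) / 20000) * 20000

# Symmetric breaks around 0, labelled with absolute values in thousands
breaks <- seq(-limit, limit, 20000)
labels <- paste0(abs(breaks) / 1000, "K")
```

Passing breaks and labels like these to scale_y_continuous() keeps the axis symmetric even if the data change.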

2.4 Plot Population Pyramid

Plot the population pyramid using ggplot2.

p1 <- ggplot(population, aes(x = AG, y = Pop, fill = Sex)) +
  geom_bar(data = subset(population, Sex == "Females"), stat = "identity") +
  geom_bar(data = subset(population, Sex == "Males"), stat = "identity") +
  # Breaks every 20,000; labels show absolute values in thousands on both sides
  scale_y_continuous(breaks = seq(-200000, 200000, 20000),
                     labels = paste0(abs(seq(-200, 200, 20)), "K")) +
  # A single fill scale: Females in orchid, Males in blue
  scale_fill_manual(values = c("orchid2", "dodgerblue1")) +
  theme_bw() +
  theme(axis.title.x = element_blank(),
        panel.grid.major.y = element_blank(),
        panel.grid.minor.y = element_blank(),
        plot.title = element_text(size = 14, face = "bold", hjust = 0.5),
        plot.subtitle = element_text(hjust = 0.5)) +
  coord_flip() +
  labs(title = "Population Pyramid of Singapore Residents 2019")
p1
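
One thing to watch for in a pyramid is the order of the age groups: if AG is stored as text, the axis is sorted alphabetically rather than by age. The sketch below is one possible way to order the levels, assuming the labels start with the lower age bound (for example "0_to_4" or "90_and_over"); the sample labels are hypothetical stand-ins for the real column.

```r
# Hypothetical age-group labels, deliberately out of order
ag <- c("10_to_14", "0_to_4", "90_and_over", "5_to_9", "85_to_89")

# Extract the leading number (lower age bound) from each label
lower <- as.numeric(sub("^([0-9]+).*$", "\\1", ag))

# Order the unique labels by that number and use them as factor levels
ag_levels <- unique(ag[order(lower)])
ag_factor <- factor(ag, levels = ag_levels)
levels(ag_factor)
```

Converting population$AG to a factor in this way before plotting would make ggplot2 draw the bars in age order, since discrete axes follow the factor levels.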

2.5 Population Heatmap

Plot the population heatmap by planning area & age group

# Sum over sex, subzone and dwelling type so each PA x AG cell has one value
pop2 <- aggregate(Pop ~ AG + PA, data = data2019, FUN = sum)
pop2 <- pop2 %>%
  mutate(text = paste0("Area:", PA, "\n", Pop, " people in Age Group ", AG))
ggplot(pop2, aes(x = PA, y = AG, fill = Pop, text = text)) +
  geom_raster() +
  scale_x_discrete(expand = c(0, 0)) +
  scale_fill_gradient(name = "Population", low = "#87CEFA", high = "#191970") +
  labs(title = "Age Structure of Singapore Population by Planning Area 2019") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1),
        axis.text.y = element_text(size = 5))
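
Each heatmap tile corresponds to one planning area and one age group. As a sanity check on the values being coloured, the same cross-tabulation can be built with base R's xtabs(), which sums Pop over any column not named in the formula (here Sex). The toy data frame and its counts are hypothetical.

```r
# Toy data with hypothetical counts: two planning areas, two age groups
toy <- data.frame(
  PA  = c("Bedok", "Bedok", "Tampines", "Tampines", "Bedok", "Tampines"),
  AG  = c("0_to_4", "0_to_4", "0_to_4", "0_to_4", "5_to_9", "5_to_9"),
  Sex = c("Males", "Females", "Males", "Females", "Males", "Males"),
  Pop = c(120, 110, 200, 190, 130, 210)
)

# Rows are age groups, columns are planning areas; Sex is summed out
cells <- xtabs(Pop ~ AG + PA, data = toy)
print(cells)
```

Comparing a few entries of such a table with the tile colours is a quick way to confirm that the heatmap shows totals for both sexes.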

3. Insights

  • From the population pyramid, females noticeably outnumber males above age 65, probably because women have a longer life expectancy than men. The majority of Singapore residents in 2019 were of working age.
  • Below age 30, each successively younger age group is smaller, which could reflect a declining birth rate. Ignoring migration, this will eventually lead to a smaller working-age population and a larger elderly population, that is, a lower old-age support ratio: if the trend continues, each working-age resident will have to support more elderly residents.
  • From the population heatmap, Bedok, Jurong West, Tampines, Woodlands, Yishun, Sengkang and Hougang have the largest resident populations. This is consistent with these areas having more dwelling units.
  • Punggol and Sengkang have more young children (under 10 years old). This could be due to the many new HDB flats launched in these areas over the last 5-10 years. Tampines and Bedok have more elderly residents.
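
The support-ratio argument in the second point can be illustrated with a small calculation. This is only a sketch: it assumes the ratio is defined as residents aged 20 to 64 per resident aged 65 and over, and the population figures are hypothetical round numbers, not values from the dataset.

```r
# Hypothetical totals by broad age band (not taken from the dataset)
working_age <- 2500000   # residents aged 20 to 64
elderly_now <- 500000    # residents aged 65 and over today

# Working-age residents per elderly resident
ratio_now <- working_age / elderly_now

# Same working-age population, but a larger elderly population later
elderly_later <- 625000
ratio_later <- working_age / elderly_later

c(now = ratio_now, later = ratio_later)
```

With these numbers the ratio falls from 5 to 4, meaning fewer working-age residents are available per elderly resident.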