This data visualization aims to draw insights on the demographic structure of Singapore’s population by age cohort and by planning area in 2019.
(Data used will be retrieved from https://www.singstat.gov.sg/find-data/search-by-theme/population/geographic-distribution/latest-data)
The data is categorized into 55 planning areas and 19 age groups. While the comprehensiveness and specificity of the data give flexibility, it is difficult to show every category distinctly without overcrowding the visualizations.
To simplify the data, the 55 planning areas will be condensed into the 5 regions (Central, East, North, North-East, West).
Some planning areas, such as Pioneer, have a population of 0.
To reduce clutter, the data for such locations will be omitted.
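The omission described above could be sketched as follows. This is a minimal illustration on a toy data frame, assuming the same `PA` and `Pop` column names as the source file; the values are hypothetical and are not the real SingStat figures.

```r
library(dplyr)

# Toy stand-in for pop_2019 (hypothetical values)
toy_pop <- data.frame(
  PA  = c("Bedok", "Bedok", "Pioneer", "Pioneer", "Tuas", "Tuas"),
  AG  = c("0_to_4", "5_to_9", "0_to_4", "5_to_9", "0_to_4", "5_to_9"),
  Pop = c(120, 150, 0, 0, 0, 10),
  stringsAsFactors = FALSE
)

# Keep only planning areas whose total population is above zero
toy_nonzero <- toy_pop %>%
  group_by(PA) %>%
  filter(sum(Pop) > 0) %>%
  ungroup()

unique(toy_nonzero$PA)
```

Because the final chart sums `Pop` by region, dropping these rows should not change the bars; it mainly trims the table that is carried through the later steps.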
As the data is sorted in alphabetical order, the age group “5_to_9” comes after “45_to_49”.
Reorder the age groups so that they are in numerical order.
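One way to make the reordering explicit is to sort the labels by their leading number. The sketch below assumes labels of the form `"0_to_4"` or `"90_and_over"` and uses a small hypothetical vector rather than the real column.

```r
# Age-group labels in an arbitrary order (hypothetical sample)
ag <- c("45_to_49", "5_to_9", "0_to_4", "90_and_over", "10_to_14")

# Extract the number before the first underscore and order the levels by it
lower_bound <- as.numeric(sub("_.*$", "", ag))
ag_levels   <- unique(ag[order(lower_bound)])
ag_factor   <- factor(ag, levels = ag_levels)

levels(ag_factor)
```

Sorting on the numeric lower bound does not depend on the row order of the input file, which may be a useful safeguard if the file is ever re-sorted.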
sketch design
# Install any missing packages, then attach them
packages = c('tidyverse')
for (p in packages) {
  if (!require(p, character.only = TRUE)) {
    install.packages(p)
  }
  library(p, character.only = TRUE)
}
## Loading required package: tidyverse
## ── Attaching packages ───────────────────────────────────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.2 ✓ purrr 0.3.4
## ✓ tibble 3.0.3 ✓ dplyr 1.0.2
## ✓ tidyr 1.1.2 ✓ stringr 1.4.0
## ✓ readr 1.3.1 ✓ forcats 0.5.0
## ── Conflicts ──────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
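# Load the resident population data and keep only the 2019 records
pop_data <- read.csv("respopagesextod2011to2020.csv")
pop_2019 <- subset(pop_data, Time == '2019')
# Set the age-group levels in order of first appearance so that plots do not sort them alphabetically
pop_2019$AG <- as.character(pop_2019$AG)
pop_2019$AG <- factor(pop_2019$AG, levels = unique(pop_2019$AG))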
central <- c('Bishan','Bukit Merah', 'Bukit Timah','Downtown Core', 'Geylang', 'Kallang', 'Marina East','Marina South','Marine Parade','Museum','Newton','Novena','Orchard','Outram','Queenstown','River Valley','Rochor','Singapore River','Southern Islands','Straits View','Tanglin','Toa Payoh')
east <- c('Bedok', 'Changi','Changi Bay','Pasir Ris','Paya Lebar', 'Tampines')
north <- c('Central Water Catchment', 'Lim Chu Kang', 'Mandai', 'Sembawang', 'Simpang', 'Sungei Kadut', 'Woodlands', 'Yishun')
northeast <- c('Ang Mo Kio', 'Hougang', 'North-Eastern Islands', 'Punggol', 'Seletar', 'Sengkang', 'Serangoon')
west <- c('Boon Lay', 'Bukit Batok', 'Bukit Panjang', 'Choa Chu Kang', 'Clementi', 'Jurong East', 'Jurong West', 'Pioneer', 'Tengah', 'Tuas', 'Western Islands', 'Western Water Catchment')
pop_region_2019 <- pop_2019 %>%
  mutate(Region = case_when(
    PA %in% central ~ "Central",
    PA %in% east ~ "East",
    PA %in% north ~ "North",
    PA %in% northeast ~ "North East",
    PA %in% west ~ "West",
    TRUE ~ ""))
DT::datatable(head(pop_region_2019), class = 'cell-border stripe')
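The `case_when()` mapping above assigns an empty string to any planning area that is not found in the five region vectors, and such rows would then appear as an unlabelled facet. A quick check along the following lines could flag them before plotting. The vectors here are shortened, hypothetical stand-ins, and `"Not Stated"` is an invented label used only to trigger the check.

```r
# Shortened, hypothetical lookup vectors and planning-area values
east <- c("Bedok", "Tampines")
west <- c("Clementi", "Tuas")
pa   <- c("Bedok", "Clementi", "Tuas", "Not Stated")

# Planning areas present in the data but absent from every region vector
unmapped <- setdiff(unique(pa), c(east, west))
unmapped
```

An empty result would indicate that every planning area in the data has been assigned a region.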
7. Create a facet grid with one panel per region. Within each panel, the population Pop is summed for every age group.
ggplot(data = pop_region_2019, aes(x = AG, y = Pop)) +
  stat_summary(fun = sum, geom = "bar", fill = "grey", size = 1) +
  facet_grid(Region ~ .) +
  labs(title = "Demographic Structure of Singapore",
       subtitle = "by age group and region",
       y = "Population Count",
       x = "Age Group") +
  theme(axis.text.x = element_text(angle = 50, hjust = 1),
        strip.background = element_rect(fill = "white"))
The data shows the demographic structure for each Singapore region over the different age groups.
Across the regions, the total population count (across all age groups) is higher in the Central, North East, and West regions than in the East and North.
There is a higher population count of young adults in the West (around 60 000 to 70 000 per age group) across all age groups from 20 to 34.
There is a higher population count of adults in the Central, North East, and West regions (approximately 60 000 to 80 000 per age group) across all age groups from 35 to 59.
While this suggests that the population density in the Central, North East, and West regions remains high, the different concentrations of age groups in each region might suggest a slow transition of younger adults moving away from the Central region to the North East.
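The per-age-group counts quoted above are read off the chart. A table of the same sums can be produced directly, which may make such figures easier to verify. The sketch below mirrors what `stat_summary(fun = sum)` plots, using a toy data frame with hypothetical values in place of `pop_region_2019`.

```r
library(dplyr)

# Toy stand-in for pop_region_2019 (hypothetical values)
toy_region <- data.frame(
  Region = c("Central", "Central", "West", "West", "West"),
  AG     = c("20_to_24", "20_to_24", "20_to_24", "25_to_29", "25_to_29"),
  Pop    = c(100, 50, 70, 30, 40),
  stringsAsFactors = FALSE
)

# Sum Pop within each Region and age group
region_totals <- toy_region %>%
  count(Region, AG, wt = Pop, name = "Total")

region_totals
```

Applied to the real data, the same call would give one row per region and age group, so the ranges cited in the observations could be checked against exact totals.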