This interactive visualization aims to reveal the demographic structure of Singapore's population by age cohort and planning area in 2019.
Type of Challenge | Description |
---|---|
Design Challenge | There are so many age groups that putting them all into one visualization would make the plot very cluttered. |
Design Challenge | Presenting both age group and planning area attributes in one map visualization. |
Data Challenge | Because the age groups are stored as text, the group “5_to_9” is sorted after group “45_to_49” instead of after “0_to_4”. |
Challenge | Solution |
---|---|
There are so many age groups that putting them all into one visualization would make the plot very cluttered. | Regroup the ages into 3 main groups (Aged, Young, Economy Active). |
Presenting both age group and planning area attributes in one map visualization. | Use the leaflet library to separate the age groups into 3 layers. With the leaflet map, users can switch between age groups by checking the checkboxes on the left side of the map. |
Because the age groups are stored as text, the group “5_to_9” is sorted after group “45_to_49” instead of after “0_to_4”. | Hard-code the column indices to specify which columns to use when summing up the population. |
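The ordering problem in the last row can be reproduced in a few lines. This is a minimal sketch using only a subset of the age-group labels; `method = "radix"` is used so that the text sort does not depend on the session locale, and ordering by the leading number is shown as one possible alternative, not the approach taken in this post.

```r
# Age-group labels as they appear in the AG column (a subset, for illustration)
ag <- c("0_to_4", "5_to_9", "10_to_14", "45_to_49")

# Text sorting compares labels character by character, so "5_to_9"
# lands after "45_to_49" ("radix" sorts in a locale-independent way)
sorted_as_text <- sort(ag, method = "radix")

# Ordering by the leading number restores the natural age order
lower_bound <- as.integer(sub("_.*$", "", ag))
sorted_by_age <- ag[order(lower_bound)]

print(sorted_as_text)
print(sorted_by_age)
```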
The code chunk below checks whether the R packages in the package list have been installed and installs any that are missing. It then loads the packages into R.
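# Packages needed for data wrangling, spatial handling and mapping
packages <- c('rgdal', 'spdep', 'tmap', 'tidyverse', 'prettydoc', 'sf', 'magick', 'plotly', 'leaflet', 'RColorBrewer')
for (p in packages) {
  # Install the package only if it cannot be loaded
  if (!require(p, character.only = TRUE)) {
    install.packages(p)
  }
  library(p, character.only = TRUE)
}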
# Aspatial data: resident population table
pop <- read_csv("../data/aspatial/respopagesextod2011to2020.csv")
# Geospatial data: subzone boundary layer
mpsz <- st_read(dsn = "../data/geospatial",
                layer = "MP14_SUBZONE_WEB_PL")
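The code chunk below keeps the REGION_N and PLN_AREA_N fields, assigns the SVY21 coordinate reference system (EPSG:3414), checks for rows with missing values, and repairs any invalid geometries.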
mpsz_pa_sf <- st_as_sf(mpsz[c("REGION_N", "PLN_AREA_N")])
mpsz_pa_sf <- st_set_crs(mpsz_pa_sf, 3414)
mpsz_pa_sf[rowSums(is.na(mpsz_pa_sf))!=0,]
## Simple feature collection with 0 features and 2 fields
## bbox: xmin: NA ymin: NA xmax: NA ymax: NA
## projected CRS: SVY21 / Singapore TM
## [1] REGION_N PLN_AREA_N geometry
## <0 rows> (or 0-length row.names)
mpsz_pa_sf <- st_make_valid(mpsz_pa_sf)
The code chunk below is used for:
- Using the `spread()` function to turn each age group into its own column, with the population counts as the values.
- Mutating new columns `YOUNG`, `ECONOMY ACTIVE`, `AGED`, and `TOTAL` by summing up the values in specific columns.
Category | Age Group |
---|---|
Young | 0 - 24 years old |
Economy Active | 25 - 64 years old |
Aged | 65 years old and above |
- Mutating a new column `DEPENDENCY` by dividing the sum of the young and aged populations by the economy active population.
- Mutating a new column `DEPENDENCY_R` by rounding the values in the `DEPENDENCY` column to 2 decimal places.
- Saving the result to `popdata2019`.
popdata2019 <- pop %>%
  filter(Time == 2019) %>%
  # Total population per planning area, subzone and age group
  group_by(PA, SZ, AG) %>%
  summarise(`POP` = sum(`Pop`)) %>%
  ungroup() %>%
  # One column per age group; columns 3 to 21 hold the age groups
  spread(AG, POP) %>%
  # "5_to_9" sits in column 12, after "45_to_49", hence the split sums
  mutate(`YOUNG` = rowSums(.[3:6]) + rowSums(.[12])) %>%
  mutate(`ECONOMY ACTIVE` = rowSums(.[7:11]) + rowSums(.[13:15])) %>%
  mutate(`AGED` = rowSums(.[16:21])) %>%
  mutate(`TOTAL` = rowSums(.[3:21])) %>%
  mutate(`DEPENDENCY` = (`YOUNG` + `AGED`) / `ECONOMY ACTIVE`) %>%
  select(`PA`, `SZ`, `YOUNG`, `ECONOMY ACTIVE`, `AGED`, `TOTAL`, `DEPENDENCY`)

# Rounded copy of the dependency ratio, used in the map popups
popdata2019 <- popdata2019 %>%
  mutate(`DEPENDENCY_R` = round(`DEPENDENCY`, digits = 2))
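The hard-coded indices above depend on the column order that `spread()` produces. As a hedged alternative, the same sums can be written against column names. The sketch below uses base R and a toy table with made-up numbers and only four age groups; the vectors `young_cols`, `active_cols` and `aged_cols` are hypothetical helpers, not part of the original workflow.

```r
# Toy wide table in the shape produced by spread(); the numbers are made up
wide <- data.frame(
  PA = c("A", "B"),
  SZ = c("A1", "B1"),
  `0_to_4`   = c(10, 20),
  `5_to_9`   = c(10, 20),
  `25_to_29` = c(50, 40),
  `65_to_69` = c(5, 30),
  check.names = FALSE
)

# Hypothetical name lists; the real data would list every age group
young_cols  <- c("0_to_4", "5_to_9")
active_cols <- c("25_to_29")
aged_cols   <- c("65_to_69")

# Sum by name, so the result does not depend on column positions
wide$YOUNG            <- rowSums(wide[young_cols])
wide$`ECONOMY ACTIVE` <- rowSums(wide[active_cols])
wide$AGED             <- rowSums(wide[aged_cols])

# Dependency ratio: (young + aged) / economy active, rounded to 2 digits
wide$DEPENDENCY   <- (wide$YOUNG + wide$AGED) / wide$`ECONOMY ACTIVE`
wide$DEPENDENCY_R <- round(wide$DEPENDENCY, digits = 2)

print(wide[c("SZ", "YOUNG", "ECONOMY ACTIVE", "AGED", "DEPENDENCY_R")])
```

For the first toy row, the ratio is (20 + 5) / 50 = 0.5.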
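The code chunk below is used for:
- Converting the values in `PA` and `SZ` to upper case.
- Keeping only subzones with an economy active population greater than 0.
- Selecting `SUBZONE_N`, `PLN_AREA_N`, and `REGION_N` from `mpsz`.
- Merging the geospatial and population data by subzone name.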
popdata2019 <- popdata2019 %>%
mutate_at(.vars = vars(PA, SZ),
.funs = list(toupper)) %>%
filter(`ECONOMY ACTIVE` > 0)
mpsz_sub <- mpsz %>%
select(SUBZONE_N, PLN_AREA_N, REGION_N)
mpszpop2019 <- merge(mpsz_sub, popdata2019,
by.x = "SUBZONE_N", by.y = "SZ")
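Because `merge()` performs an inner join by default, any subzone whose name does not match exactly is silently dropped, which is why the names are converted to upper case first. The sketch below illustrates this with two small hypothetical tables standing in for `mpsz_sub` and `popdata2019`; the population figures are made up.

```r
# Hypothetical attribute tables standing in for mpsz_sub and popdata2019
zones <- data.frame(SUBZONE_N = c("MARINA SOUTH", "PEARL'S HILL", "BOAT QUAY"))
popn  <- data.frame(SZ = c("Marina South", "Pearl's Hill"),
                    TOTAL = c(0, 100))

# Match the case used by the boundary layer before joining
popn$SZ <- toupper(popn$SZ)

# Inner join on subzone name: only names present in both tables survive
joined <- merge(zones, popn, by.x = "SUBZONE_N", by.y = "SZ")

# Subzones with no population record, worth checking after any join
unmatched <- setdiff(zones$SUBZONE_N, popn$SZ)

print(joined)
print(unmatched)
```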
Create objects that duplicate mpszpop2019. Their names will be used as the layer names in the leaflet map.
Total_Population <- mpszpop2019
Dependency_Rate <- mpszpop2019
Aged_pop <- mpszpop2019
Young_pop<- mpszpop2019
Economy_Active_pop <- mpszpop2019
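The map consists of 5 layers: Total_Population (a choropleth of total population), and Dependency_Rate, Aged_pop, Young_pop, and Economy_Active_pop (bubble layers).
The code chunk below switches tmap to interactive view mode and builds the map with one tm_shape() per layer.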
tmap_mode("view")

tm <-
  # Layer 1: choropleth of total population per subzone
  tm_shape(Total_Population) +
  tm_polygons("TOTAL",
              palette = "Blues",
              popup.vars = c(
                "Planning Area: " = "PLN_AREA_N",
                "Dependency Rate: " = "DEPENDENCY_R",
                "Young Population" = "YOUNG",
                "Economy Active Population" = "ECONOMY ACTIVE",
                "Aged Population" = "AGED"
              ),
              title = "Population",
              alpha = 1,
              n = 6) +
  # Layer 2: bubbles sized by dependency rate, colored by region
  tm_shape(Dependency_Rate) +
  tm_bubbles(size = "DEPENDENCY",
             col = "REGION_N",
             title.col = "Dependency rate",
             title.size = "Dependency rate (%)",
             scale = 2,
             border.alpha = .5,
             popup.vars = c(
               "Planning Area: " = "PLN_AREA_N",
               "Dependency Rate: " = "DEPENDENCY_R"
             )) +
  # Layer 3: bubbles sized and colored by aged population
  tm_shape(Aged_pop) +
  tm_bubbles(size = "AGED",
             col = "AGED",
             palette = "OrRd",
             title.col = "Aged Population",
             scale = 2,
             border.alpha = .5,
             popup.vars = c(
               "Planning Area: " = "PLN_AREA_N",
               "Aged Population" = "AGED"
             )) +
  # Layer 4: bubbles sized and colored by young population
  tm_shape(Young_pop) +
  tm_bubbles(size = "YOUNG",
             col = "YOUNG",
             palette = "PuRd",
             title.col = "Young Population",
             scale = 2,
             border.alpha = .5,
             popup.vars = c(
               "Planning Area: " = "PLN_AREA_N",
               "Young Population" = "YOUNG"
             )) +
  # Layer 5: bubbles sized and colored by economy active population
  tm_shape(Economy_Active_pop) +
  tm_bubbles(size = "ECONOMY ACTIVE",
             col = "ECONOMY ACTIVE",
             palette = "BuGn",
             title.col = "Economy Active Population",
             scale = 2,
             border.alpha = .5,
             popup.vars = c(
               "Planning Area: " = "PLN_AREA_N",
               "Economy Active Population" = "ECONOMY ACTIVE"
             )) +
  tm_borders(alpha = 0.5) +
  # Restrict zooming to levels 11 to 14
  tm_view(set.zoom.limits = c(11, 14))
Convert the tmap object to a leaflet map using the tmap_leaflet() function, and initially hide the Economy_Active_pop, Young_pop, and Aged_pop layers.
lf <- tmap_leaflet(tm) %>%
leaflet::hideGroup("Economy_Active_pop") %>%
leaflet::hideGroup("Young_pop") %>%
leaflet::hideGroup("Aged_pop")
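To show how group-based hiding works on its own, here is a minimal sketch built directly with the leaflet package rather than through tmap. The point coordinates are hypothetical and only two of the group names from above are reused; the real map gets its groups from the tmap layers.

```r
library(leaflet)

# Hypothetical point locations, for illustration only
pts <- data.frame(lng = c(103.82, 103.85), lat = c(1.35, 1.30))

m <- leaflet(pts) %>%
  addTiles() %>%
  # Each layer is tagged with a group name
  addCircleMarkers(lng = ~lng, lat = ~lat, group = "Young_pop") %>%
  addCircleMarkers(lng = ~lng, lat = ~lat, color = "red", group = "Aged_pop") %>%
  # The layers control renders one checkbox per overlay group
  addLayersControl(overlayGroups = c("Young_pop", "Aged_pop"),
                   options = layersControlOptions(collapsed = FALSE)) %>%
  # Hidden groups start unchecked but remain available in the control
  hideGroup("Aged_pop")
```

Printing `m` in an interactive session displays the map with the Aged_pop checkbox unticked.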
Use the interactive map below to explore the findings further.
lf
In this ternary plot, the age groups are divided into Young, Economy Active and Aged on the 3 axes, and the values of each point on the 3 axes add up to 100%. In the map, the color intensity of the choropleth layer represents the population of each subzone. The size of the bubbles represents the dependency rate in each subzone, and their colors show which of the 5 regions (Central, East, North, North-East, West) the subzone belongs to. By checking the checkboxes on the left side of the map, we can display the other layers, such as the aged population. The aged population is also shown as bubbles, whose color and size change based on the population count.
The 2 main insights gathered from the interactive map: