1 Introduction

This DataViz aims to examine and compare the age composition of the Singapore resident population across planning areas as of June 2019. The data used are from Singapore Residents by Planning Area, Subzone, Age Group, Sex and Type of Dwelling, June 2011-2019, published by the Department of Statistics, Singapore.

Singapore is divided into 5 planning regions, which are further subdivided into 55 planning areas. Each planning area has a population of about 150,000 and is served by a town centre and several neighbourhood commercial/shopping centres. The planning areas used in this dataset are based on the areas demarcated in the Urban Redevelopment Authority’s Master Plan 2014.

2 Major Data and Design Challenges

The dataset spans 9 years, from 2011 to 2019, with a total of 883,728 rows. It has 6 variables, namely planning area (55 levels), subzone (323 levels), age group in 5-year intervals (19 levels), dwelling type (8 levels), gender (2 levels) and year (9 levels), as well as the population measure. This dataviz focuses on the most recent data (2019); even with a single year of data, several data and design challenges are noted:

Given the large number of planning areas and age groups, it would be challenging to compare the age structure across the different geographical levels effectively using only static charts.
Population size varies widely across the planning areas in the dataset, from zero to ~280,000. Some form of data cleaning and engineering would be required to ensure meaningful comparison.
As the data are provided in long form, they have to be restructured into forms useful for our intent. E.g. the age group column may have to be restructured into wide format to create age-related variables.
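As a quick sanity check on the size of the dataset, the quoted row count is consistent with a complete cross of subzones, age groups, sexes, dwelling types and years. The sketch below assumes that every subzone appears with every combination of the other variables, which the source does not state explicitly.

```r
# Assumption: the long table is a full cross of the categorical levels
n_subzones       <- 323
n_age_groups     <- 19
n_sexes          <- 2
n_dwelling_types <- 8
n_years          <- length(2011:2019)  # annual snapshots, 2011 to 2019 inclusive

n_rows <- n_subzones * n_age_groups * n_sexes * n_dwelling_types * n_years
n_rows
```

Planning area does not enter the product because subzones are nested within planning areas.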

3 Suggestions to Overcome Challenges

Visualisation would be done primarily at the planning area and/or planning region level, and subzone details would be dropped. The 5-year age groups would also be aggregated into 3 broad age groups: 0-19 (the young), 20-64 (the adult population) and 65 years and above (the seniors). Each planning area would need to be mapped to its respective region. The required mapping information was obtained from https://en.wikipedia.org/wiki/Planning_Areas_of_Singapore, extracted into a separate csv file and merged into the dataset.
Comparison of age composition across planning regions/areas would be based on proportion rather than population count. Planning areas with 0 population count would be excluded in the analysis.
Appropriate data wrangling using the dplyr and tidyr functions in the tidyverse library would be performed to transform the dataset.
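The aggregation into broad age groups and the switch from counts to proportions can be sketched on a toy table. The data below are made up, and parsing the lower bound from the age label is an illustrative shortcut rather than the approach used later in this guide.

```r
library(dplyr)

# Hypothetical toy data for one planning area (values are made up)
toy <- data.frame(
  PA  = "Area A",
  AG  = c("0_to_4", "15_to_19", "20_to_24", "60_to_64", "65_to_69", "90_and_over"),
  Pop = c(10, 30, 50, 40, 15, 5)
)

broad <- toy %>%
  # Lower bound of each 5-year band, parsed from the label
  mutate(lower = as.integer(sub("_.*$", "", AG)),
         band  = case_when(lower < 20 ~ "young",
                           lower < 65 ~ "active",
                           TRUE       ~ "old")) %>%
  group_by(PA, band) %>%
  summarise(pop = sum(Pop), .groups = "drop_last") %>%
  # Shares within each planning area, so areas of different size are comparable
  mutate(percent = pop / sum(pop) * 100) %>%
  ungroup()

broad
```

Working with shares means a small planning area and a large one with the same age mix end up with the same three percentages.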

3.1 Sketch of the Proposed Design

One would first use a horizontal bar plot to get a quick sense of the population distribution across the planning areas. Thereafter, the following visualisation designs would be explored: bubble plot (static vs. interactive), ternary plot (interactive) and population pyramid (static) to visualise the age composition across the planning areas/regions before deciding on the final visualisation.

4 DataViz Step-by-step Guide

4.1 Load Required R-packages and Data

The following required packages would be installed and loaded:

tidyverse: Includes ggplot2 for data visualisation; dplyr for data manipulation; tidyr for data tidying; readr for data import; purrr for functional programming; tibble for re-imagining of data frames; stringr for strings; and forcats for factors.
plotly: Provides interactive web graphics.
geofacet: Provides geofaceting functionality for ggplot2. Geofaceting arranges a sequence of plots of data for different geographical entities into a grid that strives to preserve some of the original geographical orientation of the entities.
ggrepel: Automatically repels overlapping text labels in ggplot2.
ggpubr: Provides some easy-to-use functions for creating and customizing ggplot2-based publication-ready plots.
gridExtra: Provides a number of user-level functions to work with “grid” graphics, notably to arrange multiple grid-based plots on a page, and draw tables.

# 'grid' ships with base R; the other packages are installed on first use if missing
packages = c('tidyverse','plotly','geofacet','ggrepel','ggpubr','grid','gridExtra')
for (p in packages){
  if(!require(p, character.only = TRUE)){
    install.packages(p)
  }
  library(p, character.only = TRUE)
}

Next, the csv dataset is imported into R using read_csv() of the readr package.

#Import source data file
pop_data <- read_csv("data/respopagesextod2011to2019.csv")
#Inspect the structure of the dataset
str(pop_data)

## tibble [883,728 x 7] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ PA  : chr [1:883728] "Ang Mo Kio" "Ang Mo Kio" "Ang Mo Kio" "Ang Mo Kio" ...
##  $ SZ  : chr [1:883728] "Ang Mo Kio Town Centre" "Ang Mo Kio Town Centre" "Ang Mo Kio Town Centre" "Ang Mo Kio Town Centre" ...
##  $ AG  : chr [1:883728] "0_to_4" "0_to_4" "0_to_4" "0_to_4" ...
##  $ Sex : chr [1:883728] "Males" "Males" "Males" "Males" ...
##  $ TOD : chr [1:883728] "HDB 1- and 2-Room Flats" "HDB 3-Room Flats" "HDB 4-Room Flats" "HDB 5-Room and Executive Flats" ...
##  $ Pop : num [1:883728] 0 10 30 50 0 0 40 0 0 10 ...
##  $ Time: num [1:883728] 2011 2011 2011 2011 2011 ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   PA = col_character(),
##   ..   SZ = col_character(),
##   ..   AG = col_character(),
##   ..   Sex = col_character(),
##   ..   TOD = col_character(),
##   ..   Pop = col_double(),
##   ..   Time = col_double()
##   .. )

4.2 Data Preparation

Only the columns of interest, i.e. planning area, age group, population and year, are kept via select(). The population measure is aggregated to the level of planning area, age group and year using group_by() followed by summarize(). Next, the age groups at row level are restructured into column variables via spread(). Additional measures are then created via mutate() for each planning area with a total population count above 0: the young (0-19 years), the adult population (20-64 years), the seniors (65 years and above), the total population, and the respective proportions of young, adult and senior residents out of the total population.

pop_data_cleaned <- pop_data %>% 
  # Keep only the variables of interest
  select(PA, AG, Pop, Time) %>%
  # Total population per planning area, age group and year
  group_by(across(c(PA, AG, Time))) %>%
  summarize(Sum_Pop = sum(Pop)) %>%
  # Long to wide: one column per age group
  spread(AG, Sum_Pop) %>%
  # Move "5_to_9" next to "0_to_4" so the age columns are in age order
  select(1:3, "5_to_9", everything()) %>%
  # Broad age groups and the total for each planning area and year
  mutate(young = rowSums(across("0_to_4":"15_to_19"))) %>%
  mutate(active = rowSums(across("20_to_24":"60_to_64"))) %>%
  mutate(old = rowSums(across("65_to_69":"90_and_over"))) %>%
  mutate(total = rowSums(across("young":"old"))) %>%
  mutate(total_thou = round(total/1000,1)) %>% 
  # Drop rows with no residents, then compute the shares
  filter(total > 0) %>%
  mutate(percent_young = young/total*100) %>%
  mutate(percent_active = active/total*100) %>%
  mutate(percent_old = old/total*100) %>% 
  rename('planning_area'='PA', 'year'='Time') %>% 
  mutate(planning_area = as.factor(planning_area))
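The select(1:3, "5_to_9", everything()) step above appears to be needed because the age-group columns come out of spread() in character-sorted order, where "5_to_9" falls after all the labels starting with 1 to 4. The small base R sketch below shows the effect on a handful of labels; ordering by the numeric lower bound is shown only as an illustrative alternative.

```r
ag <- c("0_to_4", "5_to_9", "10_to_14", "15_to_19", "20_to_24")

# Character sorting compares labels left to right, so "5_to_9" lands last
# (method = "radix" gives a locale-independent order)
sorted_ag <- sort(ag, method = "radix")
sorted_ag

# Ordering by the numeric lower bound restores the natural age order
lower <- as.integer(sub("_.*$", "", ag))
natural_ag <- ag[order(lower)]
natural_ag
```

This matters here because ranges such as "0_to_4":"15_to_19" select columns by position, so the columns must be in age order before the broad age groups are summed.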

There were a total of 42 planning areas in the dataset after removing those with no population count.

#List of planning area with population count in the dataset
levels(pop_data_cleaned$planning_area)

##  [1] "Ang Mo Kio"              "Bedok"                  
##  [3] "Bishan"                  "Bukit Batok"            
##  [5] "Bukit Merah"             "Bukit Panjang"          
##  [7] "Bukit Timah"             "Changi"                 
##  [9] "Choa Chu Kang"           "Clementi"               
## [11] "Downtown Core"           "Geylang"                
## [13] "Hougang"                 "Jurong East"            
## [15] "Jurong West"             "Kallang"                
## [17] "Lim Chu Kang"            "Mandai"                 
## [19] "Marine Parade"           "Museum"                 
## [21] "Newton"                  "Novena"                 
## [23] "Orchard"                 "Outram"                 
## [25] "Pasir Ris"               "Punggol"                
## [27] "Queenstown"              "River Valley"           
## [29] "Rochor"                  "Seletar"                
## [31] "Sembawang"               "Sengkang"               
## [33] "Serangoon"               "Singapore River"        
## [35] "Southern Islands"        "Sungei Kadut"           
## [37] "Tampines"                "Tanglin"                
## [39] "Toa Payoh"               "Western Water Catchment"
## [41] "Woodlands"               "Yishun"
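To see which planning areas were excluded rather than which were kept, a set difference is enough. The sketch below uses made-up totals for hypothetical areas.

```r
# Hypothetical toy totals per planning area (values are made up)
totals <- data.frame(
  planning_area = c("Area A", "Area B", "Area C", "Area D"),
  total = c(1200, 0, 450, 0)
)

kept    <- totals$planning_area[totals$total > 0]
dropped <- setdiff(totals$planning_area, kept)
dropped
```

With the objects in this guide, the analogous call would be setdiff(unique(pop_data$PA), levels(pop_data_cleaned$planning_area)). Note that pop_data_cleaned covers all years, so this would list only areas with no residents in any year.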

#Read and merge in planning region information to the transformed dataset
pa_to_region_lookup <- read_csv("data/planningarea_to_region_mapping.csv")
pop_data_pr <- left_join(pop_data_cleaned, pa_to_region_lookup, by = c("planning_area" = "PA")) %>%
          select(1, "planning_region", everything())

#Create a subset of the resident population data for the year 2019.
pop_data2019 <- subset(pop_data_pr, year==2019)
head(pop_data2019)

## # A tibble: 6 x 30
## # Groups:   planning_area [6]
##   planning_area planning_region  year `0_to_4` `5_to_9` `10_to_14` `15_to_19`
##   <chr>         <chr>           <dbl>    <dbl>    <dbl>      <dbl>      <dbl>
## 1 Ang Mo Kio    Northeast Regi~  2019     5420     6230       7380       7930
## 2 Bedok         East Region      2019    10020    11640      13300      14640
## 3 Bishan        Central Region   2019     2850     3850       4430       4740
## 4 Bukit Batok   West Region      2019     7130     6640       7800       8800
## 5 Bukit Merah   Central Region   2019     6100     6650       6640       6380
## 6 Bukit Panjang West Region      2019     6700     7230       7680       8500
## # ... with 23 more variables: `20_to_24` <dbl>, `25_to_29` <dbl>,
## #   `30_to_34` <dbl>, `35_to_39` <dbl>, `40_to_44` <dbl>, `45_to_49` <dbl>,
## #   `50_to_54` <dbl>, `55_to_59` <dbl>, `60_to_64` <dbl>, `65_to_69` <dbl>,
## #   `70_to_74` <dbl>, `75_to_79` <dbl>, `80_to_84` <dbl>, `85_to_89` <dbl>,
## #   `90_and_over` <dbl>, young <dbl>, active <dbl>, old <dbl>, total <dbl>,
## #   total_thou <dbl>, percent_young <dbl>, percent_active <dbl>,
## #   percent_old <dbl>
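Because the region lookup was compiled by hand from a web page, it may be worth checking that every planning area found a match in the merge above. A minimal sketch using dplyr's anti_join() on hypothetical toy tables:

```r
library(dplyr)

# Hypothetical toy tables; names and regions are made up
areas <- data.frame(planning_area = c("Area A", "Area B", "Area C"),
                    total = c(100, 200, 300))
lookup <- data.frame(PA = c("Area A", "Area B"),
                     planning_region = c("Region 1", "Region 2"))

# Rows of `areas` with no partner in `lookup`; after a left_join() these
# rows would carry NA in planning_region
unmatched <- anti_join(areas, lookup, by = c("planning_area" = "PA"))
unmatched$planning_area
```

An empty result would suggest that spelling and capitalisation agree between the two sources.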

Another dataset is prepared for the plot of the resident population pyramid by planning area, 5-year age group and gender in 2019. Firstly, two temporary data frames are created. In temp1, population is aggregated at the level of planning area, age group, gender and year using group_by() followed by summarize(). In temp2, the intent is to compute the population for males and for females separately for each planning area and year. temp2 is then merged with temp1 using the left_join() function. The male population is assigned negative values so that it appears on the left of the population pyramid. Since the planning areas have different population counts, the proportion is computed and displayed in the population pyramid instead of the absolute population value.

The function recode_factor() was used for the age group variable to change the order of levels to match the order of replacements.

temp1 <- pop_data %>% 
  select(PA, AG, Sex, Pop, Time) %>%
  group_by(across(c(PA, AG, Sex, Time))) %>%
  summarize(sum_pop = sum(Pop))
  
temp2 <- pop_data %>% 
  select(PA, AG, Sex, Pop, Time) %>%
  group_by(across(c(PA, Sex, Time))) %>%
  summarize(gender_pop = sum(Pop))

pop_data_gender_2019 <- left_join(temp1, temp2, by = c("PA", "Sex", "Time")) %>% 
  filter(gender_pop > 0) %>% 
  filter(Time==2019) %>% 
  mutate(percent = sum_pop/gender_pop*100) %>% 
  mutate(percent = ifelse(Sex=="Males", -1*percent, percent)) %>% 
  mutate(AG = as.factor(AG)) %>% 
  mutate(age_group = recode_factor(AG,
                          `0_to_4` = "0-4",
                          `5_to_9` = "5-9",
                          `10_to_14` = "10-14",
                          `15_to_19` = "15-19",
                          `20_to_24` = "20-24",
                          `25_to_29` = "25-29",
                          `30_to_34` = "30-34",
                          `35_to_39` = "35-39",
                          `40_to_44` = "40-44",
                          `45_to_49` = "45-49",
                          `50_to_54` = "50-54",
                          `55_to_59` = "55-59",
                          `60_to_64` = "60-64",
                          `65_to_69` = "65-69",
                          `70_to_74` = "70-74",
                          `75_to_79` = "75-79",
                          `80_to_84` = "80-84",
                          `85_to_89` = "85-89",
                          `90_and_over` = "90&over")) %>% 
  rename(name=`PA`)
head(pop_data_gender_2019)

## # A tibble: 6 x 8
## # Groups:   name, AG, Sex [6]
##   name       AG       Sex      Time sum_pop gender_pop percent age_group
##   <chr>      <fct>    <chr>   <dbl>   <dbl>      <dbl>   <dbl> <fct>    
## 1 Ang Mo Kio 0_to_4   Females  2019    2660      85770    3.10 0-4      
## 2 Ang Mo Kio 0_to_4   Males    2019    2760      78660   -3.51 0-4      
## 3 Ang Mo Kio 10_to_14 Females  2019    3670      85770    4.28 10-14    
## 4 Ang Mo Kio 10_to_14 Males    2019    3710      78660   -4.72 10-14    
## 5 Ang Mo Kio 15_to_19 Females  2019    3890      85770    4.54 15-19    
## 6 Ang Mo Kio 15_to_19 Males    2019    4040      78660   -5.14 15-19
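The reordering of levels performed by recode_factor() above can be seen on a three-value toy factor. This is a minimal sketch and the values are illustrative only.

```r
library(dplyr)

x <- factor(c("10_to_14", "0_to_4", "5_to_9"))
levels(x)   # default levels are sorted as character strings

y <- recode_factor(x,
                   `0_to_4`   = "0-4",
                   `5_to_9`   = "5-9",
                   `10_to_14` = "10-14")
levels(y)   # levels now follow the order of the replacements
```

This ordering is what places the age bands from youngest to oldest along the axis of the pyramid, regardless of the row order in the data.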

4.3 Horizontal Bar Plot

A horizontal bar plot is created to give a quick, high-level overview of which planning areas were most populated in 2019. Colour is added to distinguish the 5 planning regions.

colors <- c('#4AC6B7', '#1972A4', '#965F8A', '#FF7070', '#C61951')

pop_data2019$planning_area <- as.factor(pop_data2019$planning_area) %>% 
  fct_reorder(pop_data2019$total_thou)

P1 <- ggplot(pop_data2019, aes(x = planning_area, y= total_thou, fill=planning_region, label=total_thou)) +
  geom_col(alpha=0.5) +
  coord_flip() +
  geom_text(hjust = 0, nudge_x = 0.05, size = 3, color="darkgrey") +
  theme_classic() +
  scale_fill_manual(values = colors) +
  labs(title="Resident Population ('000) by Planning Area, 2019") +
  theme(legend.title = element_blank(),
        legend.position = c(0.8, 0.5),
        legend.text = element_text(size = 8),
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        plot.title = element_text(hjust = 0.5, size=12))
P1
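The bars are sorted because fct_reorder() rearranges the factor levels by the population values. A small sketch with made-up numbers:

```r
library(forcats)

# Hypothetical toy values (made up)
area     <- c("Area A", "Area B", "Area C")
pop_thou <- c(120, 45, 280)

f <- fct_reorder(factor(area), pop_thou)
levels(f)   # ascending by population
```

After coord_flip(), the first level is drawn at the bottom, so ascending levels put the most populated area at the top of the chart.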

4.3 Bubble Plot

4.3.1 Static with Annotation

The code below creates a basic bubble plot with the proportion of seniors on the y-axis against the proportion of young residents on the x-axis for each planning area. The resident population of each planning area is represented by the size of the bubble, and planning region is again represented by colour.

To draw more insight from the visualisation, annotations were added to label planning areas with more than 10,000 residents that have either more than 20% seniors or more than 25% young residents (the total_thou > 10 condition in the code).

colors <- c('#4AC6B7', '#1972A4', '#965F8A', '#FF7070', '#C61951')

temp <- pop_data2019 %>% 
  mutate(
    annotation = case_when(
      percent_young > 25 & total_thou > 10 ~ "yes",
      percent_old > 20 & total_thou > 10 ~ "yes"
    )
) %>% 
arrange(desc(total_thou)) %>% 
mutate(planning_area = factor(planning_area, planning_area))

#Plot
P2 <- ggplot(temp, aes(x= percent_young, y= percent_old, size=total_thou, color = planning_region)) +
  geom_point(alpha=0.5) +
  geom_vline(xintercept=25, size=0.5, linetype="longdash")+
  geom_hline(yintercept=20, size=0.5, linetype="longdash")+
  scale_size_continuous(range = c(0.2,10), name="Population ('000)") +
  scale_x_continuous(name="% aged 0-19 years", limits=c(0, 50)) +
  scale_y_continuous(name="% aged 65 years and above", limits=c(0, 30)) +
  theme_minimal() +
  scale_color_manual(values = colors, guide = "none") +
  geom_text_repel(data=temp %>% filter(annotation=="yes"), aes(label=planning_area), size=3) +
  labs(title = "Proportion of Seniors vs Proportion of Young across Planning Area, 2019", 
         subtitle = "Planning areas with > 10,000 residents and with > 20% seniors or > 25% young residents are labelled", 
         y = "% aged 65 years and above", 
         x = "% aged 0-19 years") +
  theme(plot.title = element_text(hjust = 0.5, size=12),
        legend.text = element_text(size = 8),
        legend.title=element_text(size=8),
        legend.position = c(0.8, 0.5),
        axis.title.x = element_text(size=10),
        axis.title.y = element_text(size=10),
        plot.subtitle = element_text(hjust = 0.5, size=8))
P2
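The labelling rule relies on case_when() returning NA for rows that match neither condition, and on filter() dropping rows where the comparison is NA. A minimal sketch on hypothetical values:

```r
library(dplyr)

# Hypothetical toy data (values are made up)
toy <- data.frame(
  planning_area = c("Area A", "Area B", "Area C", "Area D"),
  percent_young = c(30, 18, 28, 15),
  percent_old   = c(8, 24, 10, 12),
  total_thou    = c(150, 60, 5, 90)
)

flagged <- toy %>%
  mutate(annotation = case_when(
    percent_young > 25 & total_thou > 10 ~ "yes",
    percent_old > 20 & total_thou > 10 ~ "yes"
  ))

# Area C is young but too small, Area D meets neither rule: both are NA
labelled <- flagged %>% filter(annotation == "yes") %>% pull(planning_area)
labelled
```

Only the labelled subset is passed to geom_text_repel(), which keeps the plot from being cluttered by names of very small planning areas.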

4.3.2 Interactive

We further explored the same bubble plot with interactivity using the plot_ly() function. A tooltip has been added to reveal the population size and the respective proportions of young and senior residents for each planning area when one hovers over the circles. This is definitely more informative than the static bubble chart above. However, the requirement of this exercise is to use static plots for the dataviz.

colors <- c('#4AC6B7', '#1972A4', '#965F8A', '#FF7070', '#C61951')

P3 <- plot_ly(
  pop_data2019, x = ~`percent_young`, y = ~`percent_old`,  
  color = ~`planning_region`, type = "scatter",
  mode = "markers", colors = colors, size = ~`total`,
  marker = list(symbol = 'circle', sizemode = 'diameter', sizeref = 2,
                line = list(width = 2, color = '#FFFFFF'), opacity=0.4),
  text = ~paste(sep='','Planning Area:', `planning_area`,
                '<br>Resident Population:',`total`,
                '<br>% young:', round(`percent_young`,1),
                '<br>% Old:', round(`percent_old`,1))
  )%>%
  layout(
        title="Percentage of Seniors aged 65 years and above vs Young aged 0-19 years",
        xaxis = list(title = '% aged 0-19 years',
                      gridcolor = 'rgb(243, 243, 243)',
                      range=c(0,35),
                      ticklen = 5,
                      gridwidth = 1),
         yaxis = list(title = '% aged 65 years and above',
                      gridcolor = 'rgb(243, 243, 243)',
                      range=c(0,30),
                      ticklen = 5,
                      gridwidth = 1)
  )
P3

4.4 Interactive Ternary Plot

An interactive ternary plot is a better way of displaying the distribution and variability of three-part compositional data. One can thus use it to visualise the proportions of young, adult and senior residents. As in the bubble plot, the size of each bubble represents the total population of the planning area, and the bubbles are coloured by planning region. The code below creates an interactive ternary plot using the plot_ly() function of Plotly R. Mouse over a bubble to view the specific values.

# reusable function for axis formatting
axis <- function(txt, min_value) {
  list(
    title = txt, min = min_value, tickformat = ".0%", tickfont = list(size = 10), 
    titlefont = list(size = 12)
  )
}

# reusable function for creating annotation object
label <- function(txt) {
  list(
    text = txt, 
    x = 0.15, y = 0.9,
    ax = 0, ay = 0,
    xref = "paper", yref = "paper", 
    align = "center",
    font = list(family = "serif", size = 15, color = "white"),
    bgcolor = "#b3b3b3", bordercolor = "black", borderwidth = 2
  )
}

ternaryAxes <- list(
  aaxis = axis("Adults (20-64 years)", 0.5), 
  baxis = axis("Young (0-19 years)", 0.0), 
  caxis = axis("Seniors (65 years and above)", 0.0)
)

colors <- c('#4AC6B7', '#1972A4', '#965F8A', '#FF7070', '#C61951')

# Initiating a plotly visualization 
P4 <- plot_ly(
  pop_data2019, 
  a = ~active, 
  b = ~young, 
  c = ~old,
  color = ~`planning_region`,
  type = "scatterternary",
  colors = colors,
  size = ~total,
  text = ~paste('Planning Area: ', `planning_area`,
                '<br>Population: ', `total`,
                '<br>% Young: ',round(`percent_young`,1),
                '<br>% Adults: ',round(`percent_active`,1), 
                '<br>% Seniors: ',round(`percent_old`,1)),
  marker = list(symbol = 'circle', opacity = 0.4, sizemode = "diameter", 
                sizeref = 2, line = list(width = 2, color = '#FFFFFF'))
) %>%
  layout(
    annotations = label("Singapore Resident Population June 2019"),
    ternary = ternaryAxes
)

P4
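A ternary plot positions each point by the relative sizes of its three parts. To our understanding of plotly's scatterternary trace, only the relative values of a, b and c matter when all three are supplied, so passing counts (as above) or shares should place the points identically. The sketch below shows the normalisation on made-up counts.

```r
# Hypothetical counts for two planning areas (values are made up)
counts <- data.frame(young = c(40, 10), active = c(90, 60), old = c(20, 30))

# Close each row to a composition: every part divided by its row total
shares <- counts / rowSums(counts)
round(shares * 100, 1)

row_totals <- rowSums(shares)
row_totals   # each row sums to 1
```

This is also why the axes can be formatted as percentages even though raw population counts are supplied.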

However, when choosing between a static ternary plot and a static bubble plot for the fairly large number of planning areas, the former with annotations might be too cluttered, whereas relatively little information or insight is lost by using a static bubble plot. Furthermore, readers unfamiliar with ternary charts may find them harder to interpret than a bubble plot, which is fairly straightforward.

An attempt was also made to animate the plot across the years 2011 to 2019, to visualise how the demographic structure of each planning area has changed, both on its own and relative to the others, over the 9 annual snapshots. This was done via the frame argument of plot_ly(), together with the animation_slider() and animation_opts() functions. Click Play to watch the bubbles move over time; the movement reflects a trend towards an ageing population in most planning areas.

# reusable function for axis formatting
axis <- function(txt, min_value) {
  list(
    title = txt, min = min_value, tickformat = ".0%", tickfont = list(size = 10), 
    titlefont = list(size = 12)
  )
}

# reusable function for creating annotation object
label <- function(txt) {
  list(
    text = txt, 
    x = 0.15, y = 0.9,
    ax = 0, ay = 0,
    xref = "paper", yref = "paper", 
    align = "center",
    font = list(family = "serif", size = 15, color = "white"),
    bgcolor = "#b3b3b3", bordercolor = "black", borderwidth = 2
  )
}

ternaryAxes <- list(
  aaxis = axis("Adults (20-64 years)", 0.5), 
  baxis = axis("Young (0-19 years)", 0.0), 
  caxis = axis("Seniors (>=65 years)", 0.0)
)

colors <- c('#4AC6B7', '#1972A4', '#965F8A', '#FF7070', '#C61951')

# Initiating a plotly visualization 
plot_ly(
  pop_data_pr,
  a = ~active, 
  b = ~young, 
  c = ~old,
  frame = ~year,
  color = ~`planning_region`,
  type = "scatterternary",
  colors = colors,
  size = ~total,
  text = ~paste('Planning Area: ', `planning_area`,
                '<br>Population: ', `total`,
                '<br>% Young: ',round(`percent_young`,1),
                '<br>% Adults: ',round(`percent_active`,1), 
                '<br>% Seniors: ',round(`percent_old`,1)),
  marker = list(symbol = 'circle', opacity = 0.4, sizemode = "diameter", 
                sizeref = 2, line = list(width = 2, color = '#FFFFFF'))
) %>%
  layout(
    annotations = label("Singapore Resident Population 2011-2019"),
    ternary = ternaryAxes
  ) %>%
  animation_slider(
    currentvalue = list(prefix = "Year ",
                        font = list(color="red"))
  ) %>%
  animation_opts(
    2000, redraw = FALSE
  )

4.5 Faceted Population Pyramid

A population pyramid, also known as an “age-gender-pyramid”, provides an illustration of the distribution of various age groups in a population. Most can be categorized into three shapes: expansive (young), constrictive (elderly and shrinking), or stationary (little or no population growth) (https://populationeducation.org/what-are-different-types-population-pyramids/).

geofacet is an R package that extends ggplot2’s faceting capabilities. Instead of laying the facets out in a uniform sequence, it places them on a grid that approximates the geographical position of each entity. The grid argument of facet_geo() selects which grid to use.

Population pyramids (in proportions) for each planning area by gender and 5-year age group are created using geom_bar() in ggplot2. The Singapore grid sg_planning_area_grid1, passed to facet_geo(), is then used to place each planning area at its approximate geographical location in Singapore.

P5 <- ggplot(pop_data_gender_2019, aes(x=age_group,fill=Sex,y=percent))+
  geom_bar(stat="identity",width=1,color="black")+
  facet_geo(~ name, grid = "sg_planning_area_grid1")+
  ylim(-20,20)+
  theme_bw() +
  theme(axis.text.x = element_text(size=8),
        axis.text.y = element_text(size=7),
        strip.text.x = element_text(size = 7),
        legend.title = element_blank()) + 
  coord_flip() +
  # labs() names aesthetics, not positions: x is still age_group after coord_flip()
  labs(title = "Resident Population Pyramid across Planning Areas, 2019", 
         subtitle = "Proportion of each age group by gender used in place of population count", 
         x = "Age Group", 
         y = "Proportion %") +
  theme(plot.title = element_text(hjust = 0.5,size=12),
        plot.subtitle = element_text(hjust = 0.5, size=8),
        legend.title = element_blank(),
        legend.position="top")
        
P5
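A geofaceted plot matches facets to grid tiles by name, so it can help to compare the facet values in the data with the names on the grid before plotting. The sketch below uses a hypothetical three-tile grid and made-up area names in place of the real grid.

```r
# Hypothetical stand-in for a geofacet grid: one row per tile, with a `name` column
toy_grid <- data.frame(
  name = c("Area A", "Area B", "Area C"),
  row  = c(1, 1, 2),
  col  = c(1, 2, 1)
)

# Hypothetical facet values taken from the plotting data
data_names <- c("Area A", "Area C", "Area D")

missing_from_grid <- setdiff(data_names, toy_grid$name)  # in the data, not on the grid
empty_tiles       <- setdiff(toy_grid$name, data_names)  # tiles with no data
missing_from_grid
empty_tiles
```

With the real objects, the same comparison would be between unique(pop_data_gender_2019$name) and geofacet::sg_planning_area_grid1$name, assuming the grid exposes a name column as geofacet grids generally do.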

5 Insights and Final Visualisation

After some exploration, the horizontal bar plot, static bubble plot and population pyramid are chosen for the final visualisation of this dataviz, noting the requirement for static plots.

Below are the insights gained and the final visualisation:

Population size varies significantly across planning areas. In 2019, the most populated planning area was Bedok, with approximately 280,000 residents, followed by Jurong West, Tampines, Woodlands and Sengkang, each with a population of at least 240,000. At the other end of the spectrum, there are planning areas with very small populations, such as Lim Chu Kang, Seletar and Museum, which are basically military camp areas and/or central business district areas (Fig. 1).

Punggol and Sengkang, both in the North-east Region, had the largest proportions of young residents aged 0-19 years (Fig. 2). Outram in the Central Region, on the other hand, had the largest proportion of seniors aged 65 years and older. Planning areas with a relatively more mature population include Bukit Merah, Rochor, Ang Mo Kio, Queenstown, Toa Payoh and Kallang, each having at least 20% of its population aged 65 years and older. The majority of these mature planning areas are in the Central Region.

The relatively younger population in the planning areas of the North-east Region and the more mature population in the Central Region are also evident from the population pyramids (Fig. 3). For instance, the broader base and the concentration in the middle age groups in the population pyramids of Punggol and Sengkang indicate a relatively young and growing population in these areas, while Outram and Bukit Merah have pyramids with a rather obvious inverted shape that tapers at the bottom.

Population pyramids, together with the bar chart and bubble plot, are thus suitable visual aids for detecting differences in age composition across the planning areas and regions in this exercise.

References

The R Graph Gallery (2018) Bubble plot: https://www.r-graph-gallery.com/bubble-chart.html
David Ten (2018) Faceted Population Pyramids: https://xang1234.github.io/visualizations
David Ten (2018) Ternary Plots: https://xang1234.github.io/ternary

How does the Population and Age Distribution Differ across Planning Areas in Singapore, 2019

ISSS608 Visual Analytics & Applications Assignment 4

Phua Hwee Pin

24 July 2020