1.0 Overview

This assignment aims to explore Singapore’s demographic patterns by age groups and by the country’s planning areas in 2019 .

The dataset used is taken from Singapore Residents by Planning Area*, Subzone, Age Group, Sex and Type of Dwelling June 2011-2019 data series published by the Singapore Department of Statistics.

*Planning areas refer to the areas demarcated in the Urban Redevelopment Authority’s Master Plan 2014. 2019 data was used for this data visualisation.

2.0 Purpose of visualisation

The census contains demographic details of Singapore’s residents (Age Groups, Sex, location and their respective Planning Area, Subzones and Type of Dwellings) and contains approximately 883,728 residents’ details (2011- 2019) in total.

For this purpose of this visualisation, we will look at 2019 dataset and explore:

  • Singapore’s population pyramid by age group
  • Popularity and Population density within planning areas

Currently the dataset is grouped according to these different categories as seen from the table below.

  • PA (Planning Area)
  • SZ (Subzone)
  • AG (Age Group)
  • Sex (Gender)
  • TOD (Type of Dwelling)
  • POP (Population count)
  • Time (Time/ Period)
PA SZ AG Sex TOD POP Time
Ang Mo Kio Ang Mo Kio Town Centre 0_to_4 Males HDB 1-and 2 Room Flats 0 2011
Ang Mo Kio Ang Mo Kio Town Centre 0_to_4 Males HDB 3-Room Flats 10 2011

2.1 Design and Data challenges faced

1. Dataset consists of data from 2011 to 2019

Filtering of data from 2019 is required for a indepth analysis on 2019 resident’s data

2. Difficulties in aggregating resident’s count according to the residents’ age group and planning areas

The data is currently segmented according to the subzones and the type of dwellings, making it difficult for data aggregation and analysis via age groups and planning areas. Further data wrangling would be required to aggregate and group resident count by Age Group, Planning areas and Gender.

3. Segmenting age groups into different generation groups

The data is currently segmented according to different age groups. However as the purpose of the visualisation is to derive insights and possible actions from population density and population of planning areas, having too granular segmentation of age-groups would hinder instead of drive insights. Hence combining of age groups into different generation groups (e.g. young, young working adults, economic active, retirees, old) is required for a broader view.

4. Conflicts between R packages

The original idea was to leverage on the ggpol facet_share function to build a butterfly chart with a shared axis to explore the relationship between planning areas, gender and generation groups. However as the ggtern package used for designing ternary plots also changes the coordinates, it overrides the ggpol package’s functionality, rendering it unusable. Hence the idea was mooted.

2.2 Sketch of Proposed Data Vizualisation Design

library('knitr')
include_graphics("/Users/jayneteo/Dropbox/SMU MITB/Term 2 2020/Visual Analytics/Assignments/Assignment 4/Untitled Notebook (4)/Sketch1.jpg")

include_graphics("/Users/jayneteo/Dropbox/SMU MITB/Term 2 2020/Visual Analytics/Assignments/Assignment 4/Untitled Notebook (4)/Sketch2.jpg")

Given that there are categorical and numeric variables at play:

  • A butterfly barplot (population pyramid) was plotted to illustrate the distribution of various age groups in Singapore. If a population is a young population, it should mirror the shape of a pyramid

  • To explore the relationship between Planning areas, Generation Groups and the corresponding Population count, a ranked stacked barplot was plotted

  • To explore if there are planning areas preferred by Singaporeans, a treemap has been plotted

  • Lastly, to explore the relationship between generation groups, a ternary plot was plotted

3.0 DataViz Step-by-Step

3.1 Install and load R packages

  • Tidyverse contains a set of essential packages for data wrangling and data visualisation

  • ggpubr" is an extension of ggplot2 and provides easy-to-use functions for creating and customizing ggplot2 based publication ready plots

  • Treemap is an extension of ggplot2 and offers great flexibility in drawing treemaps

  • kableExtra and knitr are a packages that offer flexibility in creating tables

  • ggtern is an extension of ggplot2 and provides easy to use function for creating ternary plots.

Please do not install or run the ggtern package con-currently as it overrides the ggplot coordinates. Please install / load the ggtern package only at the very last

packages = c('tidyverse','ggpubr','ggthemes', 'treemap','kableExtra','knitr')

for(p in packages){
  if(!require(p, character.only = T)){
    install.packages(p)
  }
  library(p, character.only = T)
}

3.2 Load the data

The data is imported using read_csv() function which converts the data into a tibble dataframe.

popdata <- read_csv("/Users/jayneteo/Dropbox/SMU MITB/Term 2 2020/Visual Analytics/Assignments/Assignment 4/Singapore Residents by Planning AreaSubzone Age Group Sex and Type of Dwelling June 20112019/Population2011to2019.csv")
## Parsed with column specification:
## cols(
##   PA = col_character(),
##   SZ = col_character(),
##   AG = col_character(),
##   Sex = col_character(),
##   TOD = col_character(),
##   Pop = col_double(),
##   Time = col_double()
## )

3.3 Data Wrangling

3.3.1 Understanding the dataset structure

The glimpse() and the head() functions were used to understand the characteristic of the dataset.

glimpse(popdata)
## Rows: 883,728
## Columns: 7
## $ PA   <chr> "Ang Mo Kio", "Ang Mo Kio", "Ang Mo Kio", "Ang Mo Kio", "Ang Mo …
## $ SZ   <chr> "Ang Mo Kio Town Centre", "Ang Mo Kio Town Centre", "Ang Mo Kio …
## $ AG   <chr> "0_to_4", "0_to_4", "0_to_4", "0_to_4", "0_to_4", "0_to_4", "0_t…
## $ Sex  <chr> "Males", "Males", "Males", "Males", "Males", "Males", "Males", "…
## $ TOD  <chr> "HDB 1- and 2-Room Flats", "HDB 3-Room Flats", "HDB 4-Room Flats…
## $ Pop  <dbl> 0, 10, 30, 50, 0, 0, 40, 0, 0, 10, 30, 60, 0, 0, 40, 0, 0, 10, 3…
## $ Time <dbl> 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011…
head(popdata,10)
## # A tibble: 10 x 7
##    PA        SZ               AG    Sex    TOD                         Pop  Time
##    <chr>     <chr>            <chr> <chr>  <chr>                     <dbl> <dbl>
##  1 Ang Mo K… Ang Mo Kio Town… 0_to… Males  HDB 1- and 2-Room Flats       0  2011
##  2 Ang Mo K… Ang Mo Kio Town… 0_to… Males  HDB 3-Room Flats             10  2011
##  3 Ang Mo K… Ang Mo Kio Town… 0_to… Males  HDB 4-Room Flats             30  2011
##  4 Ang Mo K… Ang Mo Kio Town… 0_to… Males  HDB 5-Room and Executive…    50  2011
##  5 Ang Mo K… Ang Mo Kio Town… 0_to… Males  HUDC Flats (excluding th…     0  2011
##  6 Ang Mo K… Ang Mo Kio Town… 0_to… Males  Landed Properties             0  2011
##  7 Ang Mo K… Ang Mo Kio Town… 0_to… Males  Condominiums and Other A…    40  2011
##  8 Ang Mo K… Ang Mo Kio Town… 0_to… Males  Others                        0  2011
##  9 Ang Mo K… Ang Mo Kio Town… 0_to… Femal… HDB 1- and 2-Room Flats       0  2011
## 10 Ang Mo K… Ang Mo Kio Town… 0_to… Femal… HDB 3-Room Flats             10  2011

3.3.2 Cleaning of data

Before any data wrangling is done, any rows with NA has been dropped with the drop_na() function.

drop_na(popdata)
## # A tibble: 883,728 x 7
##    PA        SZ               AG    Sex    TOD                         Pop  Time
##    <chr>     <chr>            <chr> <chr>  <chr>                     <dbl> <dbl>
##  1 Ang Mo K… Ang Mo Kio Town… 0_to… Males  HDB 1- and 2-Room Flats       0  2011
##  2 Ang Mo K… Ang Mo Kio Town… 0_to… Males  HDB 3-Room Flats             10  2011
##  3 Ang Mo K… Ang Mo Kio Town… 0_to… Males  HDB 4-Room Flats             30  2011
##  4 Ang Mo K… Ang Mo Kio Town… 0_to… Males  HDB 5-Room and Executive…    50  2011
##  5 Ang Mo K… Ang Mo Kio Town… 0_to… Males  HUDC Flats (excluding th…     0  2011
##  6 Ang Mo K… Ang Mo Kio Town… 0_to… Males  Landed Properties             0  2011
##  7 Ang Mo K… Ang Mo Kio Town… 0_to… Males  Condominiums and Other A…    40  2011
##  8 Ang Mo K… Ang Mo Kio Town… 0_to… Males  Others                        0  2011
##  9 Ang Mo K… Ang Mo Kio Town… 0_to… Femal… HDB 1- and 2-Room Flats       0  2011
## 10 Ang Mo K… Ang Mo Kio Town… 0_to… Femal… HDB 3-Room Flats             10  2011
## # … with 883,718 more rows

3.3.3 Preparing the data

As we would only want to show data in 2019, the filter() function was used to select 2019 data.

popdata2019 <- popdata %>% 
  filter(`Time` == 2019)
glimpse(popdata2019)
## Rows: 98,192
## Columns: 7
## $ PA   <chr> "Ang Mo Kio", "Ang Mo Kio", "Ang Mo Kio", "Ang Mo Kio", "Ang Mo …
## $ SZ   <chr> "Ang Mo Kio Town Centre", "Ang Mo Kio Town Centre", "Ang Mo Kio …
## $ AG   <chr> "0_to_4", "0_to_4", "0_to_4", "0_to_4", "0_to_4", "0_to_4", "0_t…
## $ Sex  <chr> "Males", "Males", "Males", "Males", "Males", "Males", "Males", "…
## $ TOD  <chr> "HDB 1- and 2-Room Flats", "HDB 3-Room Flats", "HDB 4-Room Flats…
## $ Pop  <dbl> 0, 10, 10, 20, 0, 0, 50, 0, 0, 10, 10, 20, 0, 0, 40, 0, 0, 10, 1…
## $ Time <dbl> 2019, 2019, 2019, 2019, 2019, 2019, 2019, 2019, 2019, 2019, 2019…

The variables - Planning Areas, Age Groups, Population Count and Sex - were selected using the select() function for further data wrangling.

The variables are then grouped using the group_by() function.

popdata2019_select_gender <- popdata2019 %>% 
  select(PA,AG,Pop,Sex) %>%
  group_by(PA,AG,Pop,Sex) %>% 
  mutate(AG = as_factor(AG),PA = as_factor(PA), Sex = as_factor(Sex))

popdata2019_select_gender
## # A tibble: 98,192 x 4
## # Groups:   PA, AG, Pop, Sex [18,873]
##    PA         AG       Pop Sex    
##    <fct>      <fct>  <dbl> <fct>  
##  1 Ang Mo Kio 0_to_4     0 Males  
##  2 Ang Mo Kio 0_to_4    10 Males  
##  3 Ang Mo Kio 0_to_4    10 Males  
##  4 Ang Mo Kio 0_to_4    20 Males  
##  5 Ang Mo Kio 0_to_4     0 Males  
##  6 Ang Mo Kio 0_to_4     0 Males  
##  7 Ang Mo Kio 0_to_4    50 Males  
##  8 Ang Mo Kio 0_to_4     0 Males  
##  9 Ang Mo Kio 0_to_4     0 Females
## 10 Ang Mo Kio 0_to_4    10 Females
## # … with 98,182 more rows

3.3.4 Creating generation groups

To create the generation groups, the data was first transformed from a thin dataframe to a wide dataframe using the pivot_wider() function and then aggregated using the summarise() function. The variable “Sex” was also renamed to Gender for ease.

popdata2019_gender <- popdata2019_select_gender %>% 
  group_by(PA,AG, Sex) %>% 
  summarise_all(sum) %>% 
  pivot_wider(names_from = AG,
              values_from = Pop) %>% 
  rename(Gender = Sex)

popdata2019_gender
## # A tibble: 110 x 21
## # Groups:   PA [55]
##    PA    Gender `0_to_4` `10_to_14` `15_to_19` `20_to_24` `25_to_29` `30_to_34`
##    <fct> <fct>     <dbl>      <dbl>      <dbl>      <dbl>      <dbl>      <dbl>
##  1 Ang … Femal…     2660       3670       3890       4390       5410       5440
##  2 Ang … Males      2760       3710       4040       4530       5210       5070
##  3 Bedok Femal…     4940       6580       7210       8080       9710       9180
##  4 Bedok Males      5080       6720       7430       8580       9820       8760
##  5 Bish… Femal…     1350       2100       2260       2680       3430       2770
##  6 Bish… Males      1500       2330       2480       2890       3660       2660
##  7 Boon… Femal…        0          0          0          0          0          0
##  8 Boon… Males         0          0          0          0          0          0
##  9 Buki… Femal…     3460       3880       4380       4930       6450       6270
## 10 Buki… Males      3670       3920       4420       4920       6060       6210
## # … with 100 more rows, and 13 more variables: `35_to_39` <dbl>,
## #   `40_to_44` <dbl>, `45_to_49` <dbl>, `5_to_9` <dbl>, `50_to_54` <dbl>,
## #   `55_to_59` <dbl>, `60_to_64` <dbl>, `65_to_69` <dbl>, `70_to_74` <dbl>,
## #   `75_to_79` <dbl>, `80_to_84` <dbl>, `85_to_89` <dbl>, `90_and_over` <dbl>

The functions mutate() and rowSum() were used to create the various generation groups.

The row numbers used are the variables’ order according to the new data frame after spreading the data frame.

The age groups are binned into 5 new generation groups:

  • Children: 0 to 19
  • Young Working Adults: 20 to 39
  • Economically Active: 40 to 59
  • Retirees: 60 to 79
  • Old: 80 and above
processeddata1 <- ungroup(popdata2019_gender) %>% 
  mutate(Children = rowSums(.[,c("0_to_4","5_to_9","10_to_14","15_to_19")]), 
         Young_Working_Adults = rowSums(.[,c("20_to_24","25_to_29","30_to_34","35_to_39")]),
         Economic_Active = rowSums(.[,c("40_to_44","45_to_49","50_to_54","55_to_59")]),
         Retirees = rowSums(.[,c("60_to_64","65_to_69","70_to_74","75_to_79")]),
         Old = rowSums(.[,c("80_to_84","85_to_89","90_and_over")])) %>%  
  select(PA, Gender, Children, Young_Working_Adults, Economic_Active, Retirees,Old) %>% 
  group_by(PA) %>% 
  pivot_longer(cols = Children:Old,
             names_to = "Age_Group",
              values_to = "Population",
              values_drop_na = TRUE) %>% 
    mutate(Age_Group = as_factor(Age_Group)) 


processeddata1
## # A tibble: 550 x 4
## # Groups:   PA [55]
##    PA         Gender  Age_Group            Population
##    <fct>      <fct>   <fct>                     <dbl>
##  1 Ang Mo Kio Females Children                  13330
##  2 Ang Mo Kio Females Young_Working_Adults      21040
##  3 Ang Mo Kio Females Economic_Active           25660
##  4 Ang Mo Kio Females Retirees                  21550
##  5 Ang Mo Kio Females Old                        4190
##  6 Ang Mo Kio Males   Children                  13630
##  7 Ang Mo Kio Males   Young_Working_Adults      19950
##  8 Ang Mo Kio Males   Economic_Active           23930
##  9 Ang Mo Kio Males   Retirees                  18350
## 10 Ang Mo Kio Males   Old                        2800
## # … with 540 more rows

3.3.5 Creating generation groups for Tenary plot

To create a ternary chart, three variables are required. Hence a secondary dataframe was created with three generation groups.

The age groups are binned into 3 new generation groups:

  • Children: 0 to 24
  • Economically Active: 25 to 59
  • Old: 60 and above
processeddata2 <- ungroup(popdata2019_gender) %>% 
  mutate(Young = rowSums(.[,c("0_to_4","5_to_9","10_to_14","15_to_19","20_to_24")]), 
         Economically_Active = rowSums(.[,c("25_to_29","30_to_34","35_to_39","40_to_44","45_to_49","50_to_54","55_to_59")]),
         Old = rowSums(.[,c("60_to_64","65_to_69","70_to_74","75_to_79","80_to_84","85_to_89","90_and_over")])) 

processeddata2
## # A tibble: 110 x 24
##    PA    Gender `0_to_4` `10_to_14` `15_to_19` `20_to_24` `25_to_29` `30_to_34`
##    <fct> <fct>     <dbl>      <dbl>      <dbl>      <dbl>      <dbl>      <dbl>
##  1 Ang … Femal…     2660       3670       3890       4390       5410       5440
##  2 Ang … Males      2760       3710       4040       4530       5210       5070
##  3 Bedok Femal…     4940       6580       7210       8080       9710       9180
##  4 Bedok Males      5080       6720       7430       8580       9820       8760
##  5 Bish… Femal…     1350       2100       2260       2680       3430       2770
##  6 Bish… Males      1500       2330       2480       2890       3660       2660
##  7 Boon… Femal…        0          0          0          0          0          0
##  8 Boon… Males         0          0          0          0          0          0
##  9 Buki… Femal…     3460       3880       4380       4930       6450       6270
## 10 Buki… Males      3670       3920       4420       4920       6060       6210
## # … with 100 more rows, and 16 more variables: `35_to_39` <dbl>,
## #   `40_to_44` <dbl>, `45_to_49` <dbl>, `5_to_9` <dbl>, `50_to_54` <dbl>,
## #   `55_to_59` <dbl>, `60_to_64` <dbl>, `65_to_69` <dbl>, `70_to_74` <dbl>,
## #   `75_to_79` <dbl>, `80_to_84` <dbl>, `85_to_89` <dbl>, `90_and_over` <dbl>,
## #   Young <dbl>, Economically_Active <dbl>, Old <dbl>

3.4 Data illustrations

3.4.1 Creating pyramid population (butterfly) graph

  • A butterfly barplot (population pyramid) was plotted to illustrate the distribution of various age groups and genders in Singapore.
  • Should a country have a demographically young population, the plot should mirror the shape of a pyramid whilst a rapidly aging population would have a top-heavy pyramid shape as evidenced in the data visualization plotted.
Butterflychart <- popdata2019_select_gender %>% 
  filter(Pop != 0) %>% 
  mutate(AG = factor(AG, levels = c("0_to_4","5_to_9","10_to_14","15_to_19","20_to_24","25_to_29","30_to_34","35_to_39","40_to_44","45_to_49","50_to_54","55_to_59","60_to_64","65_to_69","70_to_74","75_to_79","80_to_84","85_to_89","90_and_over"))) %>% 
  ggplot(aes(x = AG, y = ifelse(test = Sex == "Males",
                                yes = - Pop, no = Pop),fill = Sex))+
  geom_bar(stat = 'identity',width = 0.7)+
  
  scale_fill_manual(values=c('lightpink2','steelblue3'))+
  labs(title = "Singapore's demographic structure by age groups",
       y = "Population",
       fill = "Gender")+
  theme(
    axis.title.x = element_text(size = 9, face = 'bold'),
    axis.title.y = element_blank(),
    legend.title = element_text(size =8,face = 'bold'),
    legend.text = element_text(size =8),
    plot.title = element_text(size = 11, hjust = 0.5,face = 'bold',margin = margin(5,0,10,0))) + 
  scale_y_continuous(breaks = seq(-150000, 150000, 50000), 
                     labels = paste0(as.character(c(seq(150, 0, -50), seq(50, 150, 50))))) +
  coord_flip()
  

Butterflychart

3.4.2 Creating ranked bar chart graph by planning areas and population count

  • To explore the relationship between Planning areas, Generation Groups and the corresponding Population count, a ranked stacked barplot was plotted.
  • The dataframe has also need plotted out into table format for further visibility.
  • Planning areas with zero population count (eg. industrial areas) have been filtered out in the data visulisation.
Populationcount <- processeddata1 %>% 
  summarise(Total = sum(Population)) %>% 
  arrange(-Total) %>% 
    rename(Planning_Areas = PA)

kable(Populationcount) %>% 
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
Planning_Areas Total
Bedok 279970
Jurong West 265010
Tampines 257020
Woodlands 255000
Sengkang 244910
Hougang 227110
Yishun 220810
Choa Chu Kang 191100
Punggol 170920
Ang Mo Kio 164430
Bukit Batok 154140
Bukit Merah 152600
Pasir Ris 148210
Bukit Panjang 139700
Toa Payoh 121060
Serangoon 116490
Geylang 110520
Kallang 101940
Queenstown 96470
Sembawang 96070
Clementi 92910
Bishan 88230
Jurong East 79230
Bukit Timah 77720
Novena 49390
Marine Parade 46450
Tanglin 21710
Outram 19050
Rochor 12860
River Valley 10180
Newton 8000
Singapore River 2940
Downtown Core 2500
Mandai 2060
Southern Islands 1880
Changi 1790
Orchard 900
Sungei Kadut 700
Western Water Catchment 680
Museum 430
Seletar 260
Lim Chu Kang 70
Boon Lay 0
Central Water Catchment 0
Changi Bay 0
Marina East 0
Marina South 0
North-Eastern Islands 0
Paya Lebar 0
Pioneer 0
Simpang 0
Straits View 0
Tengah 0
Tuas 0
Western Islands 0
barchart <- processeddata1 %>% 
  mutate(PA = fct_reorder(PA,Population)) %>% 
  mutate(Age_Group = fct_rev(Age_Group)) %>% 
  mutate(Population = as.numeric(Population)) %>% 
    filter(Population != 0) %>% 
  ggplot(aes(y = reorder(PA,Population,sum),x = Population,fill = Age_Group))+
  geom_bar(stat = 'identity')+
  coord_cartesian(xlim = c(0,300000))+
  scale_fill_brewer(palette = "RdBu")+
  labs(title = "Singapore's demographics structure by age and by planning area",
       x = "Population",
       y = "Planning areas in Singapore",
       fill = "Age Group")+
  theme(
    panel.background = element_blank(),
    axis.line.y = element_line(),
    axis.ticks.y = element_blank(),
    axis.ticks.x = element_blank(),
    axis.text.y.left = element_text(size = 7),
    axis.text.x.bottom = element_text(size = 7),
    axis.title.x = element_text(size = 9, face = 'bold'),
    axis.title.y = element_text(size = 9, face = 'bold'),
    legend.title = element_text(size =8,face = 'bold'),
    legend.text = element_text(size =8),
    plot.title = element_text(size = 11, hjust = 0.5,face = 'bold',margin = margin(5,0,10,0)))+
  scale_x_continuous(breaks = seq(0,300000,50000))
  #geom_text(aes(Populationcount$Planning_Areas, Populationcount$Total, label = Populationcount$Total),size = 3, hjust = -0.2, data = Populationcount)


barchart

3.4.3 Creating a planning area tree map

  • To understand if there are planning areas preferred by Singaporeans, a treemap has been plotted using the treemap package.
tree_map <- processeddata1 %>%
  treemap(index = c("PA"),
          vSize = "Population",
          type = "index",
          fontsize.labels = 10,
          fontcolor.labels = "black",
          fontface.labels = 2,
          align.labels = list(
            c('center','center')
          ),
          inflate.labels = F,
          palette = 'Dark2',
          title = "Planning Areas by population count",
          fontsize.title = 12,
          border.col = "white",
          border.lwds = 0.5)

3.4.4 Creating a ternary graph

  • Lastly, to explore the relationship between generation groups a ternary plot was plotted with the help of the ggtern function.
  • Given Singapore’s demographic structure, the ternary plot will see more points clustered around the Active and Old areas.
packages = c("ggtern")

for(p in packages){
  if(!require(p, character.only = T)){
    install.packages(p)
  }
  library(p, character.only = T)
}
## Loading required package: ggtern
## Registered S3 methods overwritten by 'ggtern':
##   method           from   
##   grid.draw.ggplot ggplot2
##   plot.ggplot      ggplot2
##   print.ggplot     ggplot2
## --
## Remember to cite, run citation(package = 'ggtern') for further info.
## --
## 
## Attaching package: 'ggtern'
## The following objects are masked from 'package:ggplot2':
## 
##     aes, annotate, ggplot, ggplot_build, ggplot_gtable, ggplotGrob,
##     ggsave, layer_data, theme_bw, theme_classic, theme_dark,
##     theme_gray, theme_light, theme_linedraw, theme_minimal, theme_void
ternplot<- ggtern(data = processeddata2, aes(x=Young,y=Economically_Active, z=Old)) +
  geom_point()
ternplot+
  ggtitle("Ternary Plot of Population Structure, 2019")+
            xlab("Young")+
            ylab("Active")+
  zlab("Old")+
  theme_rgbw()+
  theme(plot.title = element_text(hjust=0.5))

ternplot

4.0 Wrap up

There are four graphs plotted in total:

  • A population pyramid chart depicting Singapore’s demographics structure
  • A ranked stacked barplot exploring the relationship between Planning areas, Generation Groups and the corresponding Population count
  • A treemap visualizing the population density and Singapore’s location preferences amongst planning areas
  • A ternary plot exploring the relationship between the different generation groups

4.1 Insights

  • Looking at the population pyramid, it can be observed that Singapore’s population is aging rapidly. The bulk of the population is concentrated at the 45-59 age groups with decreasing birthrates. This demographic shift places pressure on the society as a shrinking workforce is tasked to support an rapidly aging population. Additionally, it can be observed that women tend to have longer life expectancy, judging by the proportion of females in the 75 and above age brackets.

  • Looking at the stacked barplot, it can be observed that the top five most populous areas are Bedok, Jurong West, Tampines, Woodlands and Sengkang. Not only are these areas the most populous, these areas also reflect a relatively younger demographic profile compared to the other planning areas. Amongst these five most populous areas, Bedok stands out for having both a balanced demographic profile - both young and old. This will have impact on the authorities’ amenities planning and housing developments for the future.