Overview of the demographic structure of Singapore’s population

1. Introduction

The purpose of Assignment 4 is to provide a better understanding of the demographic structure of Singapore. The data used in this assignment can be obtained from the Department of Statistics: https://www.singstat.gov.sg/find-data/search-by-theme/population/geographic-distribution/latest-data

This visualization has four parts:

An age-sex pyramid is used to analyze the age-sex structure of Singapore’s total population as of 2019, as well as that of the four main racial groups (Chinese, Malays, Indians and Others).
Filled bar graphs display the proportions of the three population categories (young, economically active and aged) within the planning areas and regions of Singapore.
Bar graphs display how the young, economically active and aged populations changed year by year from 2011 to 2019, by region.
A dumbbell plot displays how the dependency ratio changed between 2011 and 2019 across regions and planning areas.
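The age-sex pyramid in the first part relies on one small trick: the counts for one sex are negated so that its bars extend to the left of zero. The sketch below illustrates this under stated assumptions. The counts are made up for illustration and the `Signed` column is a hypothetical name; only `Age_Groups` and `Gender` follow the column names used in this assignment. It is not necessarily how the plots in this assignment were built.

```r
# Made-up counts for illustration only; not taken from the SingStat data.
pyramid <- data.frame(
  Age_Groups = factor(rep(c("0-24", "25-64", "65+"), times = 2),
                      levels = c("0-24", "25-64", "65+")),
  Gender     = rep(c("Male", "Female"), each = 3),
  Population = c(500, 1100, 250, 480, 1150, 300)
)

# Negate the male counts so male bars extend left and female bars extend right.
pyramid$Signed <- ifelse(pyramid$Gender == "Male",
                         -pyramid$Population, pyramid$Population)

# Draw the pyramid only if ggplot2 is available.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(pyramid, aes(x = Age_Groups, y = Signed, fill = Gender)) +
    geom_col() +
    coord_flip() +                       # age groups on the vertical axis
    scale_y_continuous(labels = abs) +   # show magnitudes on both sides
    labs(x = "Age group", y = "Population")
  print(p)
}
```

Passing `abs` as the label function keeps the axis readable, since the negative values are only a plotting device.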

The age groups for the young, economically active and aged categories are as follows: (i) Young: 0 to 24 years old (ii) Economically Active: 25 to 64 years old (iii) Aged: 65 years old and above
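As a minimal sketch of this split, the three categories can be assigned with `cut()`. This assumes ages are available as single numbers in completed years, which may not match the banded age groups in the raw data; `categorise_age` is a hypothetical helper name.

```r
# Hypothetical helper: map ages in completed years to the three categories.
categorise_age <- function(age) {
  cut(
    age,
    breaks = c(0, 25, 65, Inf),  # intervals [0, 25), [25, 65), [65, Inf)
    labels = c("Young", "Economically Active", "Aged"),
    right  = FALSE               # left-closed, so 25 and 65 start a new group
  )
}

ages <- c(0, 24, 25, 64, 65, 90)
data.frame(Age = ages, Category = categorise_age(ages))
```

Setting `right = FALSE` places the boundary ages 25 and 65 in the older group, which matches the definitions above.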

2. Loading of Libraries

Below is the list of libraries used in this visualization assignment.

library(ggpubr)

## Loading required package: ggplot2

library(ggplot2)
library(readr)
library(ggalt)

## Registered S3 methods overwritten by 'ggalt':
##   method                  from   
##   grid.draw.absoluteGrob  ggplot2
##   grobHeight.absoluteGrob ggplot2
##   grobWidth.absoluteGrob  ggplot2
##   grobX.absoluteGrob      ggplot2
##   grobY.absoluteGrob      ggplot2

library(ggcorrplot)
library(ggthemes)
library(plotly)

## 
## Attaching package: 'plotly'

## The following object is masked from 'package:ggplot2':
## 
##     last_plot

## The following object is masked from 'package:stats':
## 
##     filter

## The following object is masked from 'package:graphics':
## 
##     layout

library(purrr)
library(readxl)
library(stringr)
library(tibble)
library(tidyr)
library(tidyverse)

## -- Attaching packages ----------------------------------------------------------------------------------------------------------------------------------------- tidyverse 1.3.0 --

## v dplyr   0.8.3     v forcats 0.4.0

## -- Conflicts -------------------------------------------------------------------------------------------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks plotly::filter(), stats::filter()
## x dplyr::lag()    masks stats::lag()

library(viridis)

## Loading required package: viridisLite

library(viridisLite)
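Before running the library calls above, it can help to check which packages are not yet installed. The following is a sketch only: `missing_packages` is a hypothetical helper, and the package names are copied from the `library()` calls in this section.

```r
# Packages loaded in this section (names taken from the library() calls above).
required <- c("ggpubr", "ggplot2", "readr", "ggalt", "ggcorrplot", "ggthemes",
              "plotly", "purrr", "readxl", "stringr", "tibble", "tidyr",
              "tidyverse", "viridis", "viridisLite")

# Hypothetical helper: return the packages that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(required)
if (length(to_install) > 0) {
  message("Not installed: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install
}
```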

3. Loading of datasets and data preparation/manipulation

The raw data was downloaded from the Department of Statistics (https://www.singstat.gov.sg/find-data/search-by-theme/population/geographic-distribution/latest-data) and transformed into five datasets. New columns such as Age_Groups_Category, Category, Category1, Percent, Dependency_Ratio_2011 and Dependency_Ratio_2019 were created either by calculation or by recoding existing columns.
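The exact formulas behind the calculated columns are not given here, so the sketch below rests on assumptions: Percent is taken as each category's share of its region's population, and the dependency ratio as the number of young and aged persons per 100 economically active persons, using this assignment's age groups. The counts, the `Count` column and the `dependency_ratio` helper are made up for illustration.

```r
# Made-up counts for illustration only; not taken from the SingStat data.
pop <- data.frame(
  Region   = c("A", "A", "A", "B", "B", "B"),
  Category = rep(c("Young", "Economically Active", "Aged"), times = 2),
  Count    = c(300, 600, 100, 200, 500, 300)
)

# Assumed definition: share of each category within its region, in percent.
pop$Percent <- 100 * pop$Count / ave(pop$Count, pop$Region, FUN = sum)

# Assumed definition: dependants (young + aged) per 100 economically active.
dependency_ratio <- function(count, category) {
  active     <- sum(count[category == "Economically Active"])
  dependants <- sum(count[category != "Economically Active"])
  100 * dependants / active
}

# One ratio per region.
ratios <- sapply(split(pop, pop$Region),
                 function(d) dependency_ratio(d$Count, d$Category))
ratios
```

With these made-up counts, region A has 400 dependants per 600 economically active persons and region B has 500 per 500, so B carries the heavier dependency burden.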

Five main datasets were used: “AgeGenderRace2”, “Datacombined3”, “Datacombined2”, “DependencyRatioByRegion” and “DependencyRatio2”.

A brief description of each dataset is as follows:

(a) AgeGenderRace2

The columns of the dataset are as follows: Age_Groups, Age_Groups_Category, Gender, Category, Category1, Chiness, Malays, Indians, Others, Tota. The data pertains to 2019.