The Demographic Structure of Singapore Population in 2019

Insights:

As of 2019, the economically active group (aged 25 to 64) is the largest part of the population, accounting for more than half of Singapore's residents. It is followed by the economically dependent young (aged 0 to 14), while the elderly (aged 65 and above) are the smallest of these three groups.
The population is unevenly distributed across planning areas: the most residents live in Bedok, Jurong West, Hougang and Choa Chu Kang, while almost nobody lives in Pioneer, Paya Lebar, North-Eastern Islands, Marina East, Changi Bay, Central Water Catchment, etc.
Neither chart 1, which shows the population by age group, nor chart 2, which shows the population by planning area, reveals a remarkable difference between the numbers of men and women. However, among people aged 80 and above, women clearly outnumber men (about 71,000 versus 44,000).
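
The shares behind the first and third insights can be checked with a few lines of base R. This is only a minimal sketch: the counts are copied from the pivot table printed in the Data Preprocessing section, and the names `ages`, `lower`, `band` and `share` are illustrative rather than part of the original analysis.

```r
# Counts by five-year age group, copied from the printed pivot table (2019).
ages <- c("0-4", "5-9", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39",
          "40-44", "45-49", "50-54", "55-59", "60-64", "65-69", "70-74",
          "75-79", "80-84", "85-89", "90-over")
females <- c(90850, 97040, 102550, 108910, 122480, 145960, 153460, 158850,
             157120, 160230, 152750, 153590, 140770, 112900, 79190, 52680,
             36230, 21430, 13730)
males <- c(94730, 101290, 105830, 113730, 127040, 142640, 140360, 142310,
           144130, 151800, 149360, 153850, 138490, 108920, 71450, 42460,
           26230, 12490, 5590)
total <- females + males

# Lower bound of each age group, e.g. "25-29" -> 25 and "90-over" -> 90.
lower <- as.numeric(sub("-.*", "", ages))

# Broad bands named in the insights; the 15-24 band is kept separate.
band <- cut(lower, breaks = c(0, 15, 25, 65, Inf), right = FALSE,
            labels = c("0-14", "15-24", "25-64", "65+"))

# Share of the population in each band, in percent.
share <- round(100 * tapply(total, band, sum) / sum(total), 1)
print(share)

# Women and men aged 80 and above.
print(c(females = sum(females[lower >= 80]), males = sum(males[lower >= 80])))
```

Keeping the 15-24 band separate makes it explicit that the comparison in the insights is among the three named groups only.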

Installing R packages

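# Packages used in this analysis.
packages <- c('tidyverse')

# Install each package if it is missing, then attach it.
for (p in packages) {
  if (!require(p, character.only = TRUE)) {
    install.packages(p)
  }
  library(p, character.only = TRUE)
}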

## Loading required package: tidyverse

## -- Attaching packages ------------------------------------------- tidyverse 1.3.0 --

## v ggplot2 3.2.1     v purrr   0.3.3
## v tibble  2.1.3     v dplyr   0.8.3
## v tidyr   1.0.0     v stringr 1.4.0
## v readr   1.3.1     v forcats 0.4.0

## -- Conflicts ---------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

Importing Data

# Read the resident population file and keep only the 2019 rows.
data_SG <- read.csv("data/respopagesextod2011to2019.csv")
library(dplyr)
data_SG <- filter(data_SG, Time == 2019)

Data Preprocessing

library("reshape2")

## 
## Attaching package: 'reshape2'

## The following object is masked from 'package:tidyr':
## 
##     smiths

require(data.table)

## Loading required package: data.table

## 
## Attaching package: 'data.table'

## The following objects are masked from 'package:reshape2':
## 
##     dcast, melt

## The following objects are masked from 'package:dplyr':
## 
##     between, first, last

## The following object is masked from 'package:purrr':
## 
##     transpose

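# Pivot to one row per sex and one column per age group, summing the population.
data_pivot <- dcast(setDT(data_SG), Sex ~ AG, fun.aggregate = sum, value.var = "Pop")

# The age-group columns arrive in alphabetical order, so "5-9" follows "45-49".
# Give the columns short labels, then put them in chronological order.
names(data_pivot) <- c("Gender", "0-4", "10-14", "15-19", "20-24", "25-29", "30-34",
                       "35-39", "40-44", "45-49", "5-9", "50-54", "55-59", "60-64",
                       "65-69", "70-74", "75-79", "80-84", "85-89", "90-over")
data_pivot <- data_pivot[, c("Gender", "0-4", "5-9", "10-14", "15-19", "20-24", "25-29",
                             "30-34", "35-39", "40-44", "45-49", "50-54", "55-59", "60-64",
                             "65-69", "70-74", "75-79", "80-84", "85-89", "90-over")]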
data_pivot

##     Gender   0-4    5-9  10-14  15-19  20-24  25-29  30-34  35-39  40-44  45-49
## 1: Females 90850  97040 102550 108910 122480 145960 153460 158850 157120 160230
## 2:   Males 94730 101290 105830 113730 127040 142640 140360 142310 144130 151800
##     50-54  55-59  60-64  65-69 70-74 75-79 80-84 85-89 90-over
## 1: 152750 153590 140770 112900 79190 52680 36230 21430   13730
## 2: 149360 153850 138490 108920 71450 42460 26230 12490    5590
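
The manual renaming and reordering above is needed because age-group labels sort alphabetically. As a hedged alternative, the order can be derived from the numeric lower bound of each label. The sketch below uses base R only, and the names `age_labels`, `lower_bound`, `ordered_labels` and `age_factor` are illustrative.

```r
# Age-group labels in the alphabetical order produced by the pivot.
age_labels <- c("0-4", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39",
                "40-44", "45-49", "5-9", "50-54", "55-59", "60-64", "65-69",
                "70-74", "75-79", "80-84", "85-89", "90-over")

# Numeric lower bound of each label ("90-over" becomes 90).
lower_bound <- as.numeric(sub("-.*", "", age_labels))

# Sort the labels by lower bound to get chronological order.
ordered_labels <- age_labels[order(lower_bound)]
print(ordered_labels)

# The same vector can serve as factor levels so that plots keep this order.
age_factor <- factor(age_labels, levels = ordered_labels)
```

Storing the order in factor levels means the plotting code does not depend on the column order of the pivot table.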

Demographic structure of Singapore population by Age Cohort

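# Reshape to long format: one row per gender and age group.
pop_class <- melt(data_pivot, id.vars = "Gender")
# Negate the male counts so their bars extend to the left of zero.
# Rows are selected with the long table's own Gender column.
is_male <- pop_class$Gender == "Males"
pop_class$value[is_male] <- -pop_class$value[is_male]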

Pop_agecohort <- ggplot(pop_class, aes(x = variable, y = value, fill = Gender)) +
  geom_bar(stat = "identity", position = "identity") +
  # After coord_flip(), x (age cohort) is drawn vertically and y (population) horizontally.
  labs(x = "Age cohort", y = "Population",
       title = "Demographic structure of population by Age Cohort in 2019") +
  coord_flip() +
  geom_vline(aes(xintercept=mean(value)),colour ="red",linetype="dashed")

options(scipen=200)

Pop_agecohort
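
To make the mirroring step concrete, here is a small base-R sketch of how the long table for a population pyramid is prepared: male counts are negated so that their bars extend to the left of zero. The toy table reuses three age groups from the figures above, and the name `toy` is illustrative.

```r
# Long-format toy table: one row per gender and age group.
toy <- data.frame(
  Gender   = rep(c("Females", "Males"), times = 3),
  variable = rep(c("0-4", "5-9", "10-14"), each = 2),
  value    = c(90850, 94730, 97040, 101290, 102550, 105830)
)

# Select male rows using the long table's own Gender column, then negate them.
is_male <- toy$Gender == "Males"
toy$value[is_male] <- -toy$value[is_male]
print(toy)

# Magnitudes for axis labels can be recovered with abs().
print(abs(range(toy$value)))
```

Indexing with a logical vector of the same length as the long table avoids relying on vector recycling and on the row order produced by the reshape.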

Data Preprocessing

require(data.table)
# Pivot to one row per sex and one column per planning area, summing the population.
pa_pivot <- dcast(setDT(data_SG), Sex ~ PA, fun.aggregate = sum, value.var = "Pop")
# Keep Gender plus the first 36 planning-area columns (alphabetical, ending at River Valley).
pa_pivot <- pa_pivot[, 1:37]
names(pa_pivot) <- c("Gender", "Ang MoKio", "Bedok", "Bishan", "Boon Lay", "Bukit Batok",
                     "Bukit Merah", "Bukit Panjang", "Bukit Timah", "Central Water Catchment",
                     "Changi", "Changi Bay", "Choa Chu Kang", "Clementi", "Downtown Core",
                     "Geylang", "Hougang", "Jurong East", "Jurong West", "Kallang",
                     "Lim Chu Kang", "Mandai", "Marina East", "Marina South", "Marine Parade",
                     "Museum", "Newton", "North-Eastern Islands", "Novena", "Orchard", "Outram",
                     "Pasir Ris", "Paya Lebar", "Pioneer", "Punggol", "Queenstown", "River Valley")

# Load the prepared planning-area table; this replaces the pivot built above.
pa_pivot <- read.csv("data/pa_pivot.csv")
pa_pivot

##    Gender  Bedok Jurong.West Hougang Choa.Chu.Kang Punggol Ang.MoKio
## 1 Females 144160      131920  115760         95860   86650     85770
## 2   Males 135810      133090  111350         95240   84270     78660
##   Bukit.Batok Bukit.Merah Pasir.Ris Bukit.Panjang Geylang Kallang Queenstown
## 1       78600       79230     75070         70840   55870   51490      51040
## 2       75540       73370     73140         68860   54650   50450      45430
##   Clementi Bishan Jurong.East Bukit.Timah Novena Marine.Parade Outram
## 1    48840  45500       39960       41710  25440         24620   9560
## 2    44070  42730       39270       36010  23950         21830   9490
##   River.Valley Newton Downtown.Core Mandai Changi Orchard Museum Lim.Chu.Kang
## 1         5490   4140          1300   1070    910     480    240           40
## 2         4690   3860          1200    990    880     420    190           30
##   Boon.Lay Central.Water.Catchment Changi.Bay Marina.East Marina.South
## 1        0                       0          0           0            0
## 2        0                       0          0           0            0
##   North.Eastern.Islands Paya.Lebar Pioneer
## 1                     0          0       0
## 2                     0          0       0
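
The ranking of planning areas mentioned in the insights can be reproduced from the printed table. The following is a minimal base-R sketch that uses only six of the areas shown above; the names `areas`, `total` and `ranked` are illustrative.

```r
# Female and male residents for a few planning areas, copied from the table.
areas   <- c("Bedok", "Jurong West", "Hougang", "Choa Chu Kang", "Museum", "Pioneer")
females <- c(144160, 131920, 115760, 95860, 240, 0)
males   <- c(135810, 133090, 111350, 95240, 190, 0)

# Total residents per area, sorted from largest to smallest.
total <- setNames(females + males, areas)
ranked <- sort(total, decreasing = TRUE)
print(ranked)

# Areas with no recorded residents.
print(names(total)[total == 0])
```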

Demographic structure of Singapore population by planning area

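# Reshape to long format: one row per gender and planning area.
pa_class <- melt(pa_pivot, id.vars = "Gender")
# Negate the male counts, selecting rows with the long table's own Gender column.
is_male <- pa_class$Gender == "Males"
pa_class$value[is_male] <- -pa_class$value[is_male]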

Pop_pa <- ggplot(pa_class, aes(x = variable, y = value, fill = Gender)) +
  geom_bar(stat = "identity", position = "identity") +
  labs(x = "Planning area", y = "Population",
       title = "Demographic structure of population by Planning Area in 2019") +
  coord_flip() +
  geom_vline(aes(xintercept=mean(value)),colour ="red")
Pop_pa

Major Advantages of Building the Data Visualization in R:

R offers more options for data preprocessing: with R code and its many packages we can clean and transform data efficiently, for example by filtering specific rows or recoding values in a column, which is more complicated in Tableau.
R is more flexible: unlike in Tableau, where we are limited to the functions the platform provides, we can customize how values are calculated, especially for particular metrics, by writing user-defined functions.
R code in RStudio records every step we take, so we can learn from our own previous work and from that of others. We can also rerun the code with different parameters, which makes our work more efficient.
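
As a rough illustration of these points, the sketch below filters rows, recodes a column and wraps both steps in a user-defined function with a parameter. It uses base R only and a made-up toy data frame. The column names mirror those used in the code above (`PA`, `Sex`, `Time`, `Pop`), while `toy_pop` and `pop_by_sex()` are hypothetical names and the numbers are invented.

```r
# Toy data in the same long layout as the population file (values are made up).
toy_pop <- data.frame(
  PA   = rep(c("Bedok", "Hougang"), each = 4),
  Sex  = rep(c("Males", "Females"), times = 4),
  Time = rep(c(2018, 2018, 2019, 2019), times = 2),
  Pop  = c(100, 110, 120, 130, 80, 90, 85, 95)
)

# User-defined function: total population by sex for a chosen year.
pop_by_sex <- function(data, year) {
  one_year <- data[data$Time == year, ]                      # filter rows
  one_year$Sex <- ifelse(one_year$Sex == "Males", "M", "F")  # recode values
  tapply(one_year$Pop, one_year$Sex, sum)                    # aggregate
}

# Changing the parameter reruns the same steps for another year.
print(pop_by_sex(toy_pop, 2019))
print(pop_by_sex(toy_pop, 2018))
```

Because every step lives in the script, the same function can be rerun on a new extract of the data simply by changing the `year` argument.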

The Demographic Structure of Singapore Population in 2019 - By Age Cohort and Planning Area

YawenSHI

3/12/2020
