1.0 Overview

This Data Visualisation (DataViz) aims to design an interactive data visualisation to reveal the Singapore’s private property sales volume in 1H2020 by Singapore’s planning areas and by property type.

The data used in this DataViz are:


2.0 Data and Design Challenges

Since the aim is to facilitate quick comparisons across the planning areas in Singapore, multiple choropleth maps is proposed to visualise the sales volume in each planning areas and by property types. This design presents a challenge to the users due to the multiple interactive views to be presented. If the users want to compare between the sales volume across the different property types in the same planning area, the DataViz should be designed to facilitae such visual comparison. It would impede visual comparison if the users have to perform additional adjustments in between comparisons of different property types. To address this, the interactive views would be synchronized. In term of data challenge, the main difficulty was to aggregate the data (which came in a long format) in the required wide format that suits the intended DataViz. To do so, R data wrangling functions such as group_by,summarise and rbind were used.


2.1 Sketch of Purposed DataViz

The proposed DataViz is illustrated in the sketch below.


3.0 DataViz Step-by Step

3.1 Install and Load R packages

Install tidyverse, ggplot2, sf and tmap R packages using the code shown below.

packages = c('tidyverse','ggplot2', 'sf', 'tmap')

for(p in packages){
  if(!require(p, character.only = T)){
    install.packages(p)
  }
  library(p,character.only = T)
}

3.2 Importing Geospatial Data

Use the codes below to import the MP14_PLNG_AREA_WEB_PL shapefile into R as a dataframe called mppa.

mppa<- st_read(dsn = 'data/geospatial', 
               layer = 'MP14_PLNG_AREA_WEB_PL')
## Reading layer `MP14_PLNG_AREA_WEB_PL' from data source `C:\Users\Adrian Zhong\Desktop\A5_Zhong_Desheng\A5_Zhong_Desheng\data\geospatial' using driver `ESRI Shapefile'
## Simple feature collection with 55 features and 12 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: 2667.538 ymin: 15748.72 xmax: 56396.44 ymax: 50256.33
## projected CRS:  SVY21

Check the content of mppa.

mppa
## Simple feature collection with 55 features and 12 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: 2667.538 ymin: 15748.72 xmax: 56396.44 ymax: 50256.33
## projected CRS:  SVY21
## First 10 features:
##    OBJECTID              PLN_AREA_N PLN_AREA_C CA_IND       REGION_N REGION_C
## 1         1                  BISHAN         BS      N CENTRAL REGION       CR
## 2         2             BUKIT BATOK         BK      N    WEST REGION       WR
## 3         3             BUKIT MERAH         BM      N CENTRAL REGION       CR
## 4         4           BUKIT PANJANG         BP      N    WEST REGION       WR
## 5         5             BUKIT TIMAH         BT      N CENTRAL REGION       CR
## 6         6 CENTRAL WATER CATCHMENT         CC      N   NORTH REGION       NR
## 7         7                  CHANGI         CH      N    EAST REGION       ER
## 8         8              CHANGI BAY         CB      N    EAST REGION       ER
## 9         9           CHOA CHU KANG         CK      N    WEST REGION       WR
## 10       10                CLEMENTI         CL      N    WEST REGION       WR
##             INC_CRC FMEL_UPD_D   X_ADDR   Y_ADDR SHAPE_Leng SHAPE_Area
## 1  BA616285F402846F 2014-12-05 28789.76 37450.89   13517.12    7618921
## 2  FB44C870B04B7F57 2014-12-05 19255.42 37527.65   15234.22   11133256
## 3  738B479882E4EE28 2014-12-05 26865.78 28662.87   29156.29   14462472
## 4  4A9C6E6BAF7BE998 2014-12-05 21287.04 38761.84   15891.85    9019940
## 5  C893AEAD20F42559 2014-12-05 23256.76 34689.00   22492.84   17526654
## 6  52D0068508B0348A 2014-12-05 24424.42 39849.05   30538.25   37147854
## 7  70EAFFC7C76EA5AD 2014-12-05 46307.48 36991.43   31342.12   40940490
## 8  ACFD43D88C72381C 2014-12-05 49502.49 34316.54   18731.56    1829822
## 9  F361929FAF5E9611 2014-12-05 18415.05 40833.42   11518.81    6117294
## 10 8A5BEDB748EAA0AA 2014-12-05 19923.36 33319.17   15024.87    9516228
##                          geometry
## 1  MULTIPOLYGON (((29772.19 38...
## 2  MULTIPOLYGON (((20294.46 39...
## 3  MULTIPOLYGON (((26228.63 30...
## 4  MULTIPOLYGON (((21448.72 41...
## 5  MULTIPOLYGON (((24031.39 36...
## 6  MULTIPOLYGON (((25073.29 43...
## 7  MULTIPOLYGON (((44586.3 417...
## 8  MULTIPOLYGON (((48860.11 34...
## 9  MULTIPOLYGON (((18349.06 43...
## 10 MULTIPOLYGON (((19680.06 31...

3.3 Load the Attribute Data

Use the codes below to load Singapore’s private property sales volume in 2020 data into R as a dataframe called df and use str to see a summary of the dataframe.

df<- read_csv('data/Transactions2020.csv')
str(df)
## tibble [7,694 x 21] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ X1                         : num [1:7694] 0 1 2 3 4 5 6 7 8 9 ...
##  $ Project Name               : chr [1:7694] "PALM GARDENS" "SERANGOON GARDEN ESTATE" "THE SANTORINI" "SEASIDE RESIDENCES" ...
##  $ Address                    : chr [1:7694] "2D Hong San Walk  #11-10" "131 Serangoon Garden Way" "21 Tampines Street 86  #05-04" "18 Siglap Link  #21-33" ...
##  $ No. of Units               : num [1:7694] 1 1 1 1 1 1 1 1 1 1 ...
##  $ Area (sqm)                 : num [1:7694] 125 389 49 47 42 54 44 64 205 139 ...
##  $ Type of Area               : chr [1:7694] "Strata" "Land" "Strata" "Strata" ...
##  $ Transacted Price ($)       : num [1:7694] 990000 4600000 615000 985800 1403700 ...
##  $ Nett Price($)              : chr [1:7694] "-" "-" "-" "-" ...
##  $ Unit Price ($ psm)         : num [1:7694] 7920 11828 12551 20974 32564 ...
##  $ Unit Price ($ psf)         : num [1:7694] 736 1099 1166 1949 3025 ...
##  $ Sale Date                  : chr [1:7694] "30-Apr-20" "30-Apr-20" "30-Apr-20" "30-Apr-20" ...
##  $ Property Type              : chr [1:7694] "Condominium" "Semi-Detached House" "Condominium" "Apartment" ...
##  $ Tenure                     : chr [1:7694] "99 Yrs From 12/08/1996" "999 Yrs From 01/01/1953" "99 Yrs From 16/10/2013" "99 Yrs From 18/04/2016" ...
##  $ Completion Date            : chr [1:7694] "2000" "Unknown" "2017" "Uncompleted" ...
##  $ Type of Sale               : chr [1:7694] "Resale" "Resale" "Resale" "New Sale" ...
##  $ Purchaser Address Indicator: chr [1:7694] "HDB" "N.A" "N.A" "HDB" ...
##  $ Postal District            : num [1:7694] 23 19 18 15 7 19 18 9 28 13 ...
##  $ Postal Sector              : num [1:7694] 68 55 52 44 18 54 52 22 80 34 ...
##  $ Postal Code                : num [1:7694] 689050 556033 528592 448873 189770 ...
##  $ Planning Region            : chr [1:7694] "West Region" "North East Region" "East Region" "East Region" ...
##  $ Planning Area              : chr [1:7694] "Choa Chu Kang" "Serangoon" "Tampines" "Bedok" ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   X1 = col_double(),
##   ..   `Project Name` = col_character(),
##   ..   Address = col_character(),
##   ..   `No. of Units` = col_double(),
##   ..   `Area (sqm)` = col_double(),
##   ..   `Type of Area` = col_character(),
##   ..   `Transacted Price ($)` = col_double(),
##   ..   `Nett Price($)` = col_character(),
##   ..   `Unit Price ($ psm)` = col_double(),
##   ..   `Unit Price ($ psf)` = col_double(),
##   ..   `Sale Date` = col_character(),
##   ..   `Property Type` = col_character(),
##   ..   Tenure = col_character(),
##   ..   `Completion Date` = col_character(),
##   ..   `Type of Sale` = col_character(),
##   ..   `Purchaser Address Indicator` = col_character(),
##   ..   `Postal District` = col_double(),
##   ..   `Postal Sector` = col_double(),
##   ..   `Postal Code` = col_double(),
##   ..   `Planning Region` = col_character(),
##   ..   `Planning Area` = col_character()
##   .. )

3.3 Data Pre-processing

The required pre-processing are:

  • Select required columns.
  • Perform aggregation of sales by Planning Areas and by Property Types, and by Planning Area for all Property Types
  • Merge the attribute data with the geospatial data

Use the following code to select the required column data in the attribute data.

#select required columns
df <- df[,c(4,12,21)]
str(df)
## tibble [7,694 x 3] (S3: tbl_df/tbl/data.frame)
##  $ No. of Units : num [1:7694] 1 1 1 1 1 1 1 1 1 1 ...
##  $ Property Type: chr [1:7694] "Condominium" "Semi-Detached House" "Condominium" "Apartment" ...
##  $ Planning Area: chr [1:7694] "Choa Chu Kang" "Serangoon" "Tampines" "Bedok" ...

Perform aggregation of sales volume by Planning Area and by Property Type.

df<- df %>%
  group_by(`Planning Area`, `Property Type`) %>%
  summarize(`Sales Volume` = sum(`No. of Units`, na.rm=TRUE))

Next, perform aggregation of sales volume by Planning Area for all Property Type and save it to a new dataframe called df2.

df2<- df %>%
  group_by(`Planning Area`) %>%
  summarize(`Sales Volume` = sum(`Sales Volume`, na.rm=TRUE))

df2$`Property Type` <- 'All Property Types'

Concatenate the 2 dataframes, df and df2.

df <- rbind(df,df2)

Join the geographical data and the attribute data according to the planning areas. Prior to that, we need to ensure the case used for the string text in the joining identifier columns (i.e.’PLN_AREA_N’, ‘Planning Area’) are the same.

df$'Planning Area'<- toupper(df$'Planning Area')

mppa_df <-left_join(mppa,df,
                    by = c('PLN_AREA_N'= 'Planning Area'))

Assign the column Planning Area as the first column in merged dataframe to be used as the header in the pop-up for the interactive views.

mppa_df$OBJECTID <- mppa_df$`Planning Area`

3.4 Creating the DataViz

The following codes are used to plot the DataViz.

tmap_mode("view")

tm_shape(mppa_df)+
  tm_polygons('Sales Volume')+
  tm_facets(by='Property Type', sync = TRUE, ncol = 2, drop.NA.facets = TRUE,
            scale.factor=1)+
  tmap_options(limits=c(facets.view=8))+
  tm_view(alpha=0.5, 
          view.legend.position = c("right", "bottom"),
          text.size.variable =TRUE)

4.0 Final Visualisation

The interactive DataViz comprises of 7 interactive views for private properties sales volume in 1H2020. The first view shows the total sales volume in each planning areas and the subsequent 6 views shows the sales volume for each property type in each planning areas. It can be observed that Tampines area has the highest sales volume and it was contributed mainly from the sales of Condominiums. Coincidentally, sales volume of Condominiums was also the highest in Tampines as compared to other planning areas.By looking at different views, one can observed that the planning areas with the highest sales volume for Apartment, Executive Condominiums and Terrace House were Downtown Core, Sembawang and Serangoon respectively.