This Data Visualisation (DataViz) aims to design an interactive data visualisation to reveal the Singapore’s private property sales volume in 1H2020 by Singapore’s planning areas and by property type.
The data used in this DataViz are:
Since the aim is to facilitate quick comparisons across the planning areas in Singapore, multiple choropleth maps is proposed to visualise the sales volume in each planning areas and by property types. This design presents a challenge to the users due to the multiple interactive views to be presented. If the users want to compare between the sales volume across the different property types in the same planning area, the DataViz should be designed to facilitae such visual comparison. It would impede visual comparison if the users have to perform additional adjustments in between comparisons of different property types. To address this, the interactive views would be synchronized. In term of data challenge, the main difficulty was to aggregate the data (which came in a long format) in the required wide format that suits the intended DataViz. To do so, R data wrangling functions such as group_by,summarise and rbind were used.
The proposed DataViz is illustrated in the sketch below.
Install tidyverse, ggplot2, sf and tmap R packages using the code shown below.
packages = c('tidyverse','ggplot2', 'sf', 'tmap')
for(p in packages){
if(!require(p, character.only = T)){
install.packages(p)
}
library(p,character.only = T)
}
Use the codes below to import the MP14_PLNG_AREA_WEB_PL shapefile into R as a dataframe called mppa.
mppa<- st_read(dsn = 'data/geospatial',
layer = 'MP14_PLNG_AREA_WEB_PL')
## Reading layer `MP14_PLNG_AREA_WEB_PL' from data source `C:\Users\Adrian Zhong\Desktop\A5_Zhong_Desheng\A5_Zhong_Desheng\data\geospatial' using driver `ESRI Shapefile'
## Simple feature collection with 55 features and 12 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: 2667.538 ymin: 15748.72 xmax: 56396.44 ymax: 50256.33
## projected CRS: SVY21
Check the content of mppa.
mppa
## Simple feature collection with 55 features and 12 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: 2667.538 ymin: 15748.72 xmax: 56396.44 ymax: 50256.33
## projected CRS: SVY21
## First 10 features:
## OBJECTID PLN_AREA_N PLN_AREA_C CA_IND REGION_N REGION_C
## 1 1 BISHAN BS N CENTRAL REGION CR
## 2 2 BUKIT BATOK BK N WEST REGION WR
## 3 3 BUKIT MERAH BM N CENTRAL REGION CR
## 4 4 BUKIT PANJANG BP N WEST REGION WR
## 5 5 BUKIT TIMAH BT N CENTRAL REGION CR
## 6 6 CENTRAL WATER CATCHMENT CC N NORTH REGION NR
## 7 7 CHANGI CH N EAST REGION ER
## 8 8 CHANGI BAY CB N EAST REGION ER
## 9 9 CHOA CHU KANG CK N WEST REGION WR
## 10 10 CLEMENTI CL N WEST REGION WR
## INC_CRC FMEL_UPD_D X_ADDR Y_ADDR SHAPE_Leng SHAPE_Area
## 1 BA616285F402846F 2014-12-05 28789.76 37450.89 13517.12 7618921
## 2 FB44C870B04B7F57 2014-12-05 19255.42 37527.65 15234.22 11133256
## 3 738B479882E4EE28 2014-12-05 26865.78 28662.87 29156.29 14462472
## 4 4A9C6E6BAF7BE998 2014-12-05 21287.04 38761.84 15891.85 9019940
## 5 C893AEAD20F42559 2014-12-05 23256.76 34689.00 22492.84 17526654
## 6 52D0068508B0348A 2014-12-05 24424.42 39849.05 30538.25 37147854
## 7 70EAFFC7C76EA5AD 2014-12-05 46307.48 36991.43 31342.12 40940490
## 8 ACFD43D88C72381C 2014-12-05 49502.49 34316.54 18731.56 1829822
## 9 F361929FAF5E9611 2014-12-05 18415.05 40833.42 11518.81 6117294
## 10 8A5BEDB748EAA0AA 2014-12-05 19923.36 33319.17 15024.87 9516228
## geometry
## 1 MULTIPOLYGON (((29772.19 38...
## 2 MULTIPOLYGON (((20294.46 39...
## 3 MULTIPOLYGON (((26228.63 30...
## 4 MULTIPOLYGON (((21448.72 41...
## 5 MULTIPOLYGON (((24031.39 36...
## 6 MULTIPOLYGON (((25073.29 43...
## 7 MULTIPOLYGON (((44586.3 417...
## 8 MULTIPOLYGON (((48860.11 34...
## 9 MULTIPOLYGON (((18349.06 43...
## 10 MULTIPOLYGON (((19680.06 31...
Use the codes below to load Singapore’s private property sales volume in 2020 data into R as a dataframe called df and use str to see a summary of the dataframe.
df<- read_csv('data/Transactions2020.csv')
str(df)
## tibble [7,694 x 21] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
## $ X1 : num [1:7694] 0 1 2 3 4 5 6 7 8 9 ...
## $ Project Name : chr [1:7694] "PALM GARDENS" "SERANGOON GARDEN ESTATE" "THE SANTORINI" "SEASIDE RESIDENCES" ...
## $ Address : chr [1:7694] "2D Hong San Walk #11-10" "131 Serangoon Garden Way" "21 Tampines Street 86 #05-04" "18 Siglap Link #21-33" ...
## $ No. of Units : num [1:7694] 1 1 1 1 1 1 1 1 1 1 ...
## $ Area (sqm) : num [1:7694] 125 389 49 47 42 54 44 64 205 139 ...
## $ Type of Area : chr [1:7694] "Strata" "Land" "Strata" "Strata" ...
## $ Transacted Price ($) : num [1:7694] 990000 4600000 615000 985800 1403700 ...
## $ Nett Price($) : chr [1:7694] "-" "-" "-" "-" ...
## $ Unit Price ($ psm) : num [1:7694] 7920 11828 12551 20974 32564 ...
## $ Unit Price ($ psf) : num [1:7694] 736 1099 1166 1949 3025 ...
## $ Sale Date : chr [1:7694] "30-Apr-20" "30-Apr-20" "30-Apr-20" "30-Apr-20" ...
## $ Property Type : chr [1:7694] "Condominium" "Semi-Detached House" "Condominium" "Apartment" ...
## $ Tenure : chr [1:7694] "99 Yrs From 12/08/1996" "999 Yrs From 01/01/1953" "99 Yrs From 16/10/2013" "99 Yrs From 18/04/2016" ...
## $ Completion Date : chr [1:7694] "2000" "Unknown" "2017" "Uncompleted" ...
## $ Type of Sale : chr [1:7694] "Resale" "Resale" "Resale" "New Sale" ...
## $ Purchaser Address Indicator: chr [1:7694] "HDB" "N.A" "N.A" "HDB" ...
## $ Postal District : num [1:7694] 23 19 18 15 7 19 18 9 28 13 ...
## $ Postal Sector : num [1:7694] 68 55 52 44 18 54 52 22 80 34 ...
## $ Postal Code : num [1:7694] 689050 556033 528592 448873 189770 ...
## $ Planning Region : chr [1:7694] "West Region" "North East Region" "East Region" "East Region" ...
## $ Planning Area : chr [1:7694] "Choa Chu Kang" "Serangoon" "Tampines" "Bedok" ...
## - attr(*, "spec")=
## .. cols(
## .. X1 = col_double(),
## .. `Project Name` = col_character(),
## .. Address = col_character(),
## .. `No. of Units` = col_double(),
## .. `Area (sqm)` = col_double(),
## .. `Type of Area` = col_character(),
## .. `Transacted Price ($)` = col_double(),
## .. `Nett Price($)` = col_character(),
## .. `Unit Price ($ psm)` = col_double(),
## .. `Unit Price ($ psf)` = col_double(),
## .. `Sale Date` = col_character(),
## .. `Property Type` = col_character(),
## .. Tenure = col_character(),
## .. `Completion Date` = col_character(),
## .. `Type of Sale` = col_character(),
## .. `Purchaser Address Indicator` = col_character(),
## .. `Postal District` = col_double(),
## .. `Postal Sector` = col_double(),
## .. `Postal Code` = col_double(),
## .. `Planning Region` = col_character(),
## .. `Planning Area` = col_character()
## .. )
The required pre-processing are:
Use the following code to select the required column data in the attribute data.
#select required columns
df <- df[,c(4,12,21)]
str(df)
## tibble [7,694 x 3] (S3: tbl_df/tbl/data.frame)
## $ No. of Units : num [1:7694] 1 1 1 1 1 1 1 1 1 1 ...
## $ Property Type: chr [1:7694] "Condominium" "Semi-Detached House" "Condominium" "Apartment" ...
## $ Planning Area: chr [1:7694] "Choa Chu Kang" "Serangoon" "Tampines" "Bedok" ...
Perform aggregation of sales volume by Planning Area and by Property Type.
df<- df %>%
group_by(`Planning Area`, `Property Type`) %>%
summarize(`Sales Volume` = sum(`No. of Units`, na.rm=TRUE))
Next, perform aggregation of sales volume by Planning Area for all Property Type and save it to a new dataframe called df2.
df2<- df %>%
group_by(`Planning Area`) %>%
summarize(`Sales Volume` = sum(`Sales Volume`, na.rm=TRUE))
df2$`Property Type` <- 'All Property Types'
Concatenate the 2 dataframes, df and df2.
df <- rbind(df,df2)
Join the geographical data and the attribute data according to the planning areas. Prior to that, we need to ensure the case used for the string text in the joining identifier columns (i.e.’PLN_AREA_N’, ‘Planning Area’) are the same.
df$'Planning Area'<- toupper(df$'Planning Area')
mppa_df <-left_join(mppa,df,
by = c('PLN_AREA_N'= 'Planning Area'))
Assign the column Planning Area as the first column in merged dataframe to be used as the header in the pop-up for the interactive views.
mppa_df$OBJECTID <- mppa_df$`Planning Area`
The following codes are used to plot the DataViz.
tmap_mode("view")
tm_shape(mppa_df)+
tm_polygons('Sales Volume')+
tm_facets(by='Property Type', sync = TRUE, ncol = 2, drop.NA.facets = TRUE,
scale.factor=1)+
tmap_options(limits=c(facets.view=8))+
tm_view(alpha=0.5,
view.legend.position = c("right", "bottom"),
text.size.variable =TRUE)
The interactive DataViz comprises of 7 interactive views for private properties sales volume in 1H2020. The first view shows the total sales volume in each planning areas and the subsequent 6 views shows the sales volume for each property type in each planning areas. It can be observed that Tampines area has the highest sales volume and it was contributed mainly from the sales of Condominiums. Coincidentally, sales volume of Condominiums was also the highest in Tampines as compared to other planning areas.By looking at different views, one can observed that the planning areas with the highest sales volume for Apartment, Executive Condominiums and Terrace House were Downtown Core, Sembawang and Serangoon respectively.