1.0 Overview

The number of dengue cases in Singapore has risen sharply in the last few months: 20,455 cases had been reported as of 27th July 2020, already well above the 15,998 cases reported in the whole of 2019.

According to the National Environment Agency, the number of dengue cases this year is likely to surpass Singapore’s peak of 22,170 cases in 2013.

This assignment aims to provide a geospatial visualization of the current dengue hot spots in Singapore for awareness and action, as well as a week-on-week case view across the years.

Similar to COVID-19, much community effort is required to significantly reduce the number of mosquito breeding habitats, slow the rise in dengue cases and safeguard public health.

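The datasets used are taken from a few sources: the dengue cluster polygons, the Master Plan 2014 subzone boundary, the resident population by planning area and subzone (2011 to 2019), and the weekly dengue time series (2012 to 2020).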

1.1 Major data and design challenges

Given the multiple data sources, the challenge lay in integrating the geospatial and attribute data, through a common variable, into a single coherent data frame that could be manipulated and plotted interactively.

1.2 Sketch of proposed data visualisation

The proposed data design will display the spatial distribution of the current dengue hotspots in Singapore and a line chart displaying the week-on-week view of dengue cases from 2012 to 2020 (till 27th July 2020).

library('knitr')

include_graphics("/Users/jayneteo/Dropbox/SMU MITB/Term 2 2020/Visual Analytics/Assignments/Assignment 5/Assignment 5 data/VA assignment 5 sketch/VA assignment 5-2.jpg")

2.0 Step-by-step data visualization

2.1 Installing and launching R packages

  • The tidyverse is a set of essential packages for data wrangling and data visualisation. Lubridate, used here for date parsing, is part of the tidyverse family of packages
  • The sf package offers a standardized way to encode spatial vector data into a tibble dataframe
  • The tmap package offers a flexible, layer-based approach towards creating thematic maps, such as choropleths and bubble maps
  • The leaflet package is an R interface to Leaflet, one of the most popular open-source JavaScript libraries for interactive maps
  • The plotly package is an R package for creating interactive web-based graphs and can be used in conjunction with ggplot
packages = c('sf', 'tmap', 'tidyverse', 'leaflet','lubridate','plotly','flexdashboard','knitr', 'kableExtra')
# Install any package that is missing, then attach it
for (p in packages){
  if(!require(p, character.only = TRUE)){
    install.packages(p)
  }
  library(p, character.only = TRUE)
}

2.2 Importing Geospatial and attributes datasets

2.2.1 Importing and wrangling the dengue cluster shapefile

The dengue cluster shapefile was imported as a simple feature (sf) data frame through the st_read function of the sf package.

dengue_spdf <- st_read (
  dsn = "/Users/jayneteo/Dropbox/SMU MITB/Term 2 2020/Visual Analytics/Assignments/Assignment 5/Assignment 5 data/mygeodatakml",
  layer = "dengue-clusters-kml-polygon"
)
## Reading layer `dengue-clusters-kml-polygon' from data source `/Users/jayneteo/Dropbox/SMU MITB/Term 2 2020/Visual Analytics/Assignments/Assignment 5/Assignment 5 data/mygeodatakml' using driver `ESRI Shapefile'
## Simple feature collection with 424 features and 12 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: 103.6283 ymin: 1.265024 xmax: 103.9685 ymax: 1.454956
## geographic CRS: WGS 84
dengue_spdf
## Simple feature collection with 424 features and 12 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: 103.6283 ymin: 1.265024 xmax: 103.9685 ymax: 1.454956
## geographic CRS: WGS 84
## First 10 features:
##              Name descriptio
## 1  Dengue_Cluster       <NA>
## 2  Dengue_Cluster       <NA>
## 3  Dengue_Cluster       <NA>
## 4  Dengue_Cluster       <NA>
## 5  Dengue_Cluster       <NA>
## 6  Dengue_Cluster       <NA>
## 7  Dengue_Cluster       <NA>
## 8  Dengue_Cluster       <NA>
## 9  Dengue_Cluster       <NA>
## 10 Dengue_Cluster       <NA>
##                                                                                                                                                                                                                                   LOCALITY
## 1              Lor 1 Toa Payoh (Blk 98, 100, 103, 104, 106, 118, 125, 126, 128) / Lor 2 Toa Payoh (Blk 99A, 99B, 99C, 101A, 101B, 116,121, 122) / Lor 3 Toa Payoh (Blk 91, 96, 97) / Lor 3 Toa Payoh (Trevista) / Lor 4 Toa Payoh (Blk 92)
## 2  Brighton Cres / Chepstow Cl / Lichfield Rd / Ripley Cres / S'goon Gdn Way / S'goon Nth Ave 1 / S'goon Nth Ave 1 (Blk 120-127, 136, 142-144, 147-149) / S'goon Nth Ave 2 (Blk 131, 135, 136, 139-141, 151) / S'goon Nth View / Walmer Dr
## 3                                                                                                                                                                         Ubi Ave 1 (Blk 301, 304, 305, 311, 313, 314, 315, 316, 318, 324)
## 4                                                                                                                                                                                                                 Jln Kelichap / Jln Lokam
## 5                                                                                                                                                                                 Geylang Rd / Lim Ah Woo Rd (The Amarelle) / Tg Katong Rd
## 6                                                                                                                        Ah Soo Gdn / Lor Ah Soo / Paya Lebar Cres / Paya Lebar Cres (Tangerine Gr)  / Paya Lebar Pl, Walk / Tai Keng Gdns
## 7                                                                                                                                                                                                  Arthur Rd / Branksome Rd / Wilkinson Rd
## 8                                                                                                                                                                                                             Flora Dr (Ferraria Pk Condo)
## 9                                                                                                                                                                                                Hacienda Gr / Jln Tua Kong (Crescendo Pk)
## 10                                                                                                                                                                                                        Carman St / Figaro St / Lakme St
##    CASE_SIZE NAME2                                                 HYPERLINK
## 1         71  <NA> https://www.nea.gov.sg/dengue-zika/dengue/dengue-clusters
## 2        178  <NA> https://www.nea.gov.sg/dengue-zika/dengue/dengue-clusters
## 3         18  <NA> https://www.nea.gov.sg/dengue-zika/dengue/dengue-clusters
## 4         15  <NA> https://www.nea.gov.sg/dengue-zika/dengue/dengue-clusters
## 5          3  <NA> https://www.nea.gov.sg/dengue-zika/dengue/dengue-clusters
## 6         35  <NA> https://www.nea.gov.sg/dengue-zika/dengue/dengue-clusters
## 7          4  <NA> https://www.nea.gov.sg/dengue-zika/dengue/dengue-clusters
## 8          2  <NA> https://www.nea.gov.sg/dengue-zika/dengue/dengue-clusters
## 9          3  <NA> https://www.nea.gov.sg/dengue-zika/dengue/dengue-clusters
## 10         6  <NA> https://www.nea.gov.sg/dengue-zika/dengue/dengue-clusters
##    HOMES            PUBLIC_PLA CONSTRUCTI          INC_CRC     FMEL_UPD_D
## 1   <NA>                  <NA>       <NA> 61D71BF3DC614685 20200723200427
## 2   <NA>                  <NA>       <NA> CDF9097DA95401B7 20200723200427
## 3   <NA> Discarded receptacles       <NA> DD1B5A844E84FE92 20200723200427
## 4   <NA>                  <NA>       <NA> 57B3DB5B8AD41556 20200723200427
## 5   <NA>                  <NA>       <NA> 71EC9FD611A32C83 20200723200427
## 6   <NA>                  <NA>       <NA> 5EE8D679C47AA2A5 20200723200427
## 7   <NA>                  <NA>       <NA> B5DB106C5E753638 20200723200427
## 8   <NA>                  <NA>       <NA> 5C4CB35BEBB62234 20200723200427
## 9   <NA>                  <NA>       <NA> 250955E9BF946BEA 20200723200427
## 10  <NA>                  <NA>       <NA> DE892C16EA235E81 20200723200427
##             NAME3                       geometry
## 1  Dengue_Cluster MULTIPOLYGON (((103.8435 1....
## 2  Dengue_Cluster MULTIPOLYGON (((103.8709 1....
## 3  Dengue_Cluster MULTIPOLYGON (((103.9023 1....
## 4  Dengue_Cluster MULTIPOLYGON (((103.8817 1....
## 5  Dengue_Cluster MULTIPOLYGON (((103.8941 1....
## 6  Dengue_Cluster MULTIPOLYGON (((103.8846 1....
## 7  Dengue_Cluster MULTIPOLYGON (((103.8879 1....
## 8  Dengue_Cluster MULTIPOLYGON (((103.9647 1....
## 9  Dengue_Cluster MULTIPOLYGON (((103.9274 1....
## 10 Dengue_Cluster MULTIPOLYGON (((103.9284 1....

2.2.1.1 Converting CASE_SIZE to numeric

The CASE_SIZE column was converted from character to numeric so that cluster sizes can be treated as numbers.

dengue_spdf_num <- dengue_spdf %>%  
  mutate(CASE_SIZE = as.numeric(CASE_SIZE))

glimpse(dengue_spdf_num)
## Rows: 424
## Columns: 13
## $ Name       <chr> "Dengue_Cluster", "Dengue_Cluster", "Dengue_Cluster", "Den…
## $ descriptio <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ LOCALITY   <chr> "Lor 1 Toa Payoh (Blk 98, 100, 103, 104, 106, 118, 125, 12…
## $ CASE_SIZE  <dbl> 71, 178, 18, 15, 3, 35, 4, 2, 3, 6, 6, 85, 21, 16, 66, 21,…
## $ NAME2      <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ HYPERLINK  <chr> "https://www.nea.gov.sg/dengue-zika/dengue/dengue-clusters…
## $ HOMES      <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ PUBLIC_PLA <chr> NA, NA, "Discarded receptacles", NA, NA, NA, NA, NA, NA, N…
## $ CONSTRUCTI <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ INC_CRC    <chr> "61D71BF3DC614685", "CDF9097DA95401B7", "DD1B5A844E84FE92"…
## $ FMEL_UPD_D <chr> "20200723200427", "20200723200427", "20200723200427", "202…
## $ NAME3      <chr> "Dengue_Cluster", "Dengue_Cluster", "Dengue_Cluster", "Den…
## $ geometry   <MULTIPOLYGON [°]> MULTIPOLYGON (((103.8435 1...., MULTIPOLYGON …
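Because as.numeric returns NA for any string it cannot parse, it can be worth checking the conversion before relying on it. The following is a minimal sketch with made-up values; the "n/a" entry is hypothetical and does not come from the dengue cluster data.

```r
# Hypothetical character values; "n/a" stands in for an unparseable entry
case_size_chr <- c("71", "178", "18", "n/a")

# as.numeric() warns and yields NA where parsing fails
case_size_num <- suppressWarnings(as.numeric(case_size_chr))

# Count values that were present as text but became NA after conversion
n_failed <- sum(is.na(case_size_num) & !is.na(case_size_chr))
n_failed
```

A count of zero on the real CASE_SIZE column would indicate that every value was converted cleanly.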

2.2.1.2 Creating initial plot of dengue clusters

The dengue dataframe was plotted using the tmap package for an initial view of the geospatial data.

tm_shape(dengue_spdf) +
  tm_polygons()

2.2.2 Importing and wrangling the Master Planning subzone shapefile

The Master Planning subzone shapefile was imported as a simple feature (sf) data frame through the st_read function of the sf package.

uraspf <- st_read (
  dsn = "/Users/jayneteo/Dropbox/SMU MITB/Term 2 2020/Visual Analytics/Assignments/Assignment 5/Assignment 5 data/master-plan-2014-subzone-boundary-web-shp",
  layer = "MP14_SUBZONE_WEB_PL"
)
## Reading layer `MP14_SUBZONE_WEB_PL' from data source `/Users/jayneteo/Dropbox/SMU MITB/Term 2 2020/Visual Analytics/Assignments/Assignment 5/Assignment 5 data/master-plan-2014-subzone-boundary-web-shp' using driver `ESRI Shapefile'
## Simple feature collection with 323 features and 15 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: 2667.538 ymin: 15748.72 xmax: 56396.44 ymax: 50256.33
## projected CRS:  SVY21
uraspf
## Simple feature collection with 323 features and 15 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: 2667.538 ymin: 15748.72 xmax: 56396.44 ymax: 50256.33
## projected CRS:  SVY21
## First 10 features:
##    OBJECTID SUBZONE_NO       SUBZONE_N SUBZONE_C CA_IND      PLN_AREA_N
## 1         1          1    MARINA SOUTH    MSSZ01      Y    MARINA SOUTH
## 2         2          1    PEARL'S HILL    OTSZ01      Y          OUTRAM
## 3         3          3       BOAT QUAY    SRSZ03      Y SINGAPORE RIVER
## 4         4          8  HENDERSON HILL    BMSZ08      N     BUKIT MERAH
## 5         5          3         REDHILL    BMSZ03      N     BUKIT MERAH
## 6         6          7  ALEXANDRA HILL    BMSZ07      N     BUKIT MERAH
## 7         7          9   BUKIT HO SWEE    BMSZ09      N     BUKIT MERAH
## 8         8          2     CLARKE QUAY    SRSZ02      Y SINGAPORE RIVER
## 9         9         13 PASIR PANJANG 1    QTSZ13      N      QUEENSTOWN
## 10       10          7       QUEENSWAY    QTSZ07      N      QUEENSTOWN
##    PLN_AREA_C       REGION_N REGION_C          INC_CRC FMEL_UPD_D   X_ADDR
## 1          MS CENTRAL REGION       CR 5ED7EB253F99252E 2014-12-05 31595.84
## 2          OT CENTRAL REGION       CR 8C7149B9EB32EEFC 2014-12-05 28679.06
## 3          SR CENTRAL REGION       CR C35FEFF02B13E0E5 2014-12-05 29654.96
## 4          BM CENTRAL REGION       CR 3775D82C5DDBEFBD 2014-12-05 26782.83
## 5          BM CENTRAL REGION       CR 85D9ABEF0A40678F 2014-12-05 26201.96
## 6          BM CENTRAL REGION       CR 9D286521EF5E3B59 2014-12-05 25358.82
## 7          BM CENTRAL REGION       CR 7839A8577144EFE2 2014-12-05 27680.06
## 8          SR CENTRAL REGION       CR 48661DC0FBA09F7A 2014-12-05 29253.21
## 9          QT CENTRAL REGION       CR 1F721290C421BFAB 2014-12-05 22077.34
## 10         QT CENTRAL REGION       CR 3580D2AFFBEE914C 2014-12-05 24168.31
##      Y_ADDR SHAPE_Leng SHAPE_Area                       geometry
## 1  29220.19   5267.381  1630379.3 MULTIPOLYGON (((31495.56 30...
## 2  29782.05   3506.107   559816.2 MULTIPOLYGON (((29092.28 30...
## 3  29974.66   1740.926   160807.5 MULTIPOLYGON (((29932.33 29...
## 4  29933.77   3313.625   595428.9 MULTIPOLYGON (((27131.28 30...
## 5  30005.70   2825.594   387429.4 MULTIPOLYGON (((26451.03 30...
## 6  29991.38   4428.913  1030378.8 MULTIPOLYGON (((25899.7 297...
## 7  30230.86   3275.312   551732.0 MULTIPOLYGON (((27746.95 30...
## 8  30222.86   2208.619   290184.7 MULTIPOLYGON (((29351.26 29...
## 9  29893.78   6571.323  1084792.3 MULTIPOLYGON (((20996.49 30...
## 10 30104.18   3454.239   631644.3 MULTIPOLYGON (((24472.11 29...
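The two layers are stored in different coordinate reference systems: the dengue clusters in WGS 84 (degrees) and the subzones in SVY21 (metres). For spatial operations that combine the two layers, both would need to share one CRS. The following is a minimal sketch of a reprojection using a single illustrative point and EPSG:3414 (SVY21 / Singapore TM); the coordinates are made up and not taken from the data.

```r
library(sf)

# One illustrative point in WGS 84 (longitude, latitude); not from the dataset
pt_wgs84 <- st_sfc(st_point(c(103.8435, 1.3330)), crs = 4326)

# Reproject to SVY21 / Singapore TM (EPSG:3414), whose units are metres
pt_svy21 <- st_transform(pt_wgs84, 3414)

st_crs(pt_svy21)$epsg
st_coordinates(pt_svy21)
```

With the layers in this document, the equivalent call would presumably be st_transform(dengue_spdf, st_crs(uraspf)).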

2.2.2.1 Creating initial choropleth map of dengue clusters

The dengue dataframe was plotted against the backdrop of Singapore using the tmap package for an initial view.

tm_shape(uraspf) +
  tm_polygons()+
  tm_shape(dengue_spdf)+
  tm_polygons(col = 'blue', alpha = 0.8)

2.2.3 Importing and wrangling of Population attribute data

The population attribute dataset was imported as a tibble dataframe through the read_csv function of the readr package, which is part of the tidyverse.

#importing attribute data 

popdata <- read_csv("/Users/jayneteo/Dropbox/SMU MITB/Term 2 2020/Visual Analytics/Assignments/Assignment 5/Assignment 5 data/Singapore Residents by Planning AreaSubzone Age Group Sex and Type of Dwelling June 20112019/respopagesextod2011to2019.csv")
## Parsed with column specification:
## cols(
##   PA = col_character(),
##   SZ = col_character(),
##   AG = col_character(),
##   Sex = col_character(),
##   TOD = col_character(),
##   Pop = col_double(),
##   Time = col_double()
## )
tail(popdata)
## # A tibble: 6 x 7
##   PA     SZ         AG         Sex    TOD                              Pop  Time
##   <chr>  <chr>      <chr>      <chr>  <chr>                          <dbl> <dbl>
## 1 Yishun Yishun We… 90_and_ov… Femal… HDB 4-Room Flats                  60  2019
## 2 Yishun Yishun We… 90_and_ov… Femal… HDB 5-Room and Executive Flats    20  2019
## 3 Yishun Yishun We… 90_and_ov… Femal… HUDC Flats (excluding those p…     0  2019
## 4 Yishun Yishun We… 90_and_ov… Femal… Landed Properties                  0  2019
## 5 Yishun Yishun We… 90_and_ov… Femal… Condominiums and Other Apartm…    10  2019
## 6 Yishun Yishun We… 90_and_ov… Femal… Others                            40  2019

2.2.3.1 Preparing the dataset

As only 2019 data is to be shown, the filter function was used to select the 2019 records.

To create the generation groups, the data was first transformed from a long dataframe to a wide dataframe using the spread function.

The functions mutate and rowSums were then used to create the generation group variables.

The column positions passed to rowSums follow the order of the variables in the data frame after spreading.

The age groups are binned into 3 new generation groups:

  • Young: 0 to 24
  • Economically Active: 25 to 59
  • Aged: 60 and above
popdata2019 <- popdata %>%
  # Keep only the 2019 records
  filter(Time == 2019) %>%
  # One column per age group
  spread(AG, Pop) %>%
  mutate(YOUNG = `0_to_4` + `5_to_9` + `10_to_14` +
           `15_to_19` + `20_to_24`) %>%
  # Column positions below refer to the data frame after spread()
  mutate(`ECONOMY ACTIVE` = rowSums(.[9:13]) +
           rowSums(.[15:17])) %>%
  mutate(AGED = rowSums(.[18:22])) %>%
  mutate(TOTAL = rowSums(.[5:22])) %>%
  mutate(DEPENDENCY = (YOUNG + AGED) / `ECONOMY ACTIVE`) %>%
  # Upper-case the names so they match SUBZONE_N in the shapefile
  mutate_at(.vars = vars(PA, SZ), toupper) %>%
  select(PA, SZ, YOUNG, `ECONOMY ACTIVE`, AGED, TOTAL, DEPENDENCY) %>%
  filter(`ECONOMY ACTIVE` > 0)

popdata2019
## # A tibble: 1,855 x 7
##    PA         SZ                   YOUNG `ECONOMY ACTIVE`  AGED TOTAL DEPENDENCY
##    <chr>      <chr>                <dbl>            <dbl> <dbl> <dbl>      <dbl>
##  1 ANG MO KIO ANG MO KIO TOWN CEN…   340              570   120  3039      0.807
##  2 ANG MO KIO ANG MO KIO TOWN CEN…    50              140    70  2279      0.857
##  3 ANG MO KIO ANG MO KIO TOWN CEN…    80              170   100  2369      1.06 
##  4 ANG MO KIO ANG MO KIO TOWN CEN…   220              480   270  2969      1.02 
##  5 ANG MO KIO ANG MO KIO TOWN CEN…   340              490   100  2929      0.898
##  6 ANG MO KIO ANG MO KIO TOWN CEN…    50              120    40  2229      0.75 
##  7 ANG MO KIO ANG MO KIO TOWN CEN…    70              130    70  2299      1.08 
##  8 ANG MO KIO ANG MO KIO TOWN CEN…   210              470   210  2869      0.894
##  9 ANG MO KIO ANG MO KIO TOWN CEN…     0               10     0  2029      0    
## 10 ANG MO KIO CHENG SAN               90              180   230  2499      1.78 
## # … with 1,845 more rows
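Positional selections such as .[9:13] depend on the exact column order after spread, and the 1,855 rows printed above show that the table still holds several rows per subzone. One possible alternative is to assign each age group to a generation by name and aggregate to one row per subzone. The sketch below does this on toy values. It assumes the age-group labels follow the 0_to_4 ... 90_and_over pattern visible in the data, and it uses pivot_wider, tidyr's successor to spread.

```r
library(dplyr)
library(tidyr)

# Toy rows in the same long layout as popdata (all counts are made up)
toy_pop <- tibble(
  PA   = "Yishun",
  SZ   = "Yishun West",
  AG   = c("0_to_4", "20_to_24", "25_to_29", "55_to_59", "60_to_64", "90_and_over"),
  Sex  = c("Males", "Females", "Males", "Females", "Males", "Females"),
  TOD  = "HDB 4-Room Flats",
  Pop  = c(10, 20, 30, 40, 50, 60),
  Time = 2019
)

# Age-group labels per generation (assumed naming pattern)
young_groups <- c("0_to_4", "5_to_9", "10_to_14", "15_to_19", "20_to_24")
aged_groups  <- c("60_to_64", "65_to_69", "70_to_74", "75_to_79",
                  "80_to_84", "85_to_89", "90_and_over")

toy_summary <- toy_pop %>%
  filter(Time == 2019) %>%
  # Label each row by generation; everything else is 25 to 59
  mutate(GROUP = case_when(
    AG %in% young_groups ~ "YOUNG",
    AG %in% aged_groups  ~ "AGED",
    TRUE                 ~ "ECONOMY ACTIVE"
  )) %>%
  # Collapse sex and dwelling type: one value per subzone and generation
  group_by(PA, SZ, GROUP) %>%
  summarise(Pop = sum(Pop), .groups = "drop") %>%
  pivot_wider(names_from = GROUP, values_from = Pop, values_fill = 0) %>%
  mutate(
    TOTAL      = YOUNG + `ECONOMY ACTIVE` + AGED,
    DEPENDENCY = (YOUNG + AGED) / `ECONOMY ACTIVE`
  ) %>%
  mutate(across(c(PA, SZ), toupper))

toy_summary
```

Summing by name keeps the Time column out of TOTAL and gives one row per subzone, which is the shape a join against the subzone layer would normally expect.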

2.2.4 Joining the attribute data and geospatial data

The left_join function was used to join the Singapore subzone shapefile and the population attribute dataset, with SUBZONE_N and SZ as the common identifiers.

#Joining the attribute data and geospatial URA data 

popdata2019_map <- left_join(uraspf, popdata2019, 
                              by = c("SUBZONE_N" = "SZ")) 
 # filter(`TOTAL` !=0)

glimpse(popdata2019_map)
## Rows: 1,945
## Columns: 22
## $ OBJECTID         <int> 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 4, 4, 4, 4, 4…
## $ SUBZONE_NO       <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 3, 8, 8, 8, 8, 8…
## $ SUBZONE_N        <chr> "MARINA SOUTH", "PEARL'S HILL", "PEARL'S HILL", "PEA…
## $ SUBZONE_C        <chr> "MSSZ01", "OTSZ01", "OTSZ01", "OTSZ01", "OTSZ01", "O…
## $ CA_IND           <chr> "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y…
## $ PLN_AREA_N       <chr> "MARINA SOUTH", "OUTRAM", "OUTRAM", "OUTRAM", "OUTRA…
## $ PLN_AREA_C       <chr> "MS", "OT", "OT", "OT", "OT", "OT", "OT", "OT", "OT"…
## $ REGION_N         <chr> "CENTRAL REGION", "CENTRAL REGION", "CENTRAL REGION"…
## $ REGION_C         <chr> "CR", "CR", "CR", "CR", "CR", "CR", "CR", "CR", "CR"…
## $ INC_CRC          <chr> "5ED7EB253F99252E", "8C7149B9EB32EEFC", "8C7149B9EB3…
## $ FMEL_UPD_D       <date> 2014-12-05, 2014-12-05, 2014-12-05, 2014-12-05, 201…
## $ X_ADDR           <dbl> 31595.84, 28679.06, 28679.06, 28679.06, 28679.06, 28…
## $ Y_ADDR           <dbl> 29220.19, 29782.05, 29782.05, 29782.05, 29782.05, 29…
## $ SHAPE_Leng       <dbl> 5267.381, 3506.107, 3506.107, 3506.107, 3506.107, 35…
## $ SHAPE_Area       <dbl> 1630379.3, 559816.2, 559816.2, 559816.2, 559816.2, 5…
## $ PA               <chr> NA, "OUTRAM", "OUTRAM", "OUTRAM", "OUTRAM", "OUTRAM"…
## $ YOUNG            <dbl> NA, 50, 380, 80, 40, 30, 20, 380, 90, 20, 10, 0, 0, …
## $ `ECONOMY ACTIVE` <dbl> NA, 130, 780, 250, 60, 70, 80, 1020, 250, 70, 60, 20…
## $ AGED             <dbl> NA, 70, 680, 220, 50, 50, 60, 1120, 160, 40, 60, 10,…
## $ TOTAL            <dbl> NA, 2259, 3799, 2589, 2159, 2159, 2189, 4559, 2509, …
## $ DEPENDENCY       <dbl> NA, 0.9230769, 1.3589744, 1.2000000, 1.5000000, 1.14…
## $ geometry         <MULTIPOLYGON [m]> MULTIPOLYGON (((31495.56 30..., MULTIPO…
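The joined table has 1,945 rows although the subzone layer has 323 features, which happens when the attribute table holds more than one row per join key. Checking the keys before joining makes this visible. The following is a minimal sketch with toy tables; all names and values in them are made up for illustration.

```r
library(dplyr)

# Toy stand-ins for the subzone layer and the population table
zones <- tibble(SUBZONE_N = c("MARINA SOUTH", "PEARL'S HILL", "BOAT QUAY"))
pop   <- tibble(SZ    = c("PEARL'S HILL", "PEARL'S HILL", "BOAT QUAY"),
                TOTAL = c(100, 200, 50))

# Subzones with no matching population row (these get NA after a left join)
unmatched <- anti_join(zones, pop, by = c("SUBZONE_N" = "SZ"))

# Keys that occur more than once in the attribute table
dup_keys <- pop %>% count(SZ) %>% filter(n > 1)

# A left join repeats the left-hand row once per matching right-hand row
joined <- left_join(zones, pop, by = c("SUBZONE_N" = "SZ"))
nrow(joined)
```

In this toy case the join returns four rows for three subzones, mirroring on a small scale the row growth seen in popdata2019_map.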

2.3 Creating an interactive choropleth map of dengue hotspots in Singapore using leaflet

  • The choropleth map showing the geographical population distribution by planning subzone was first plotted using tmap
    • Instead of the default interval binning, quantile classification with 5 classes was used for a more evenly distributed view
  • The dengue clusters were then overlaid on the choropleth map to explore whether greater population density correlates with dengue hotspots and cluster size
  • The popup.vars argument provides interactive popup labels (e.g. locality of dengue hotspots and cluster size) upon clicking either a dengue cluster or a subzone
#Popup labels when click 

tmap_mode("view")
## tmap mode set to interactive viewing
denguehotspots <- tm_shape(popdata2019_map)+
  tm_fill("TOTAL", 
          n= 5,
          style = "quantile",
          palette = "Blues",
          title = "Total population",
          popup.vars = c("Location" = "SUBZONE_N", "Population size" = "TOTAL"))+
  tm_borders(alpha = 0.5)+
  tm_shape(dengue_spdf_num) +
  tm_fill(col = "red", 
          alpha = 0.7,
          popup.vars = c("Location" = "LOCALITY", "Cluster size" = "CASE_SIZE")
          ) 

denguehotspots  
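To illustrate what quantile classification with 5 classes does, the sketch below computes quintile breaks in base R and assigns each value to a class. The population values are made up, and tmap's own break calculation may differ in detail, so this is only an approximation of the idea.

```r
# Made-up subzone populations with a strong right skew
total_pop <- c(0, 120, 450, 980, 1500, 2300, 4100, 7800, 15000, 32000)

# Five quantile classes need six break points (0%, 20%, ..., 100%)
breaks <- quantile(total_pop, probs = seq(0, 1, by = 0.2))

# Assign each value to its class; include.lowest keeps the minimum in class 1
classes <- cut(total_pop, breaks = breaks, include.lowest = TRUE, labels = FALSE)

table(classes)
```

Each class receives the same number of subzones, which is why the quantile style spreads the colours more evenly than equal-width intervals would on skewed data.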

2.4 Importing and preparing dengue timeseries dataset

The weekly dengue dataset was imported as a tibble dataframe through the read_csv function of the readr package, which is part of the tidyverse.

denguetimeseries <- read_csv("/Users/jayneteo/Dropbox/SMU MITB/Term 2 2020/Visual Analytics/Assignments/Assignment 5/Assignment 5 data/Dengue Time Series (2012 to 2020).csv")
## Parsed with column specification:
## cols(
##   Epidemiology_Wk = col_double(),
##   Start = col_character(),
##   End = col_character(),
##   Year = col_double(),
##   Dengue_Fever = col_double(),
##   Dengue_Haemorrhagic_Fever = col_double(),
##   Total = col_double()
## )
glimpse(denguetimeseries)
## Rows: 447
## Columns: 7
## $ Epidemiology_Wk           <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, …
## $ Start                     <chr> "01/01/2012", "08/01/2012", "15/01/2012", "…
## $ End                       <chr> "07/01/2012", "14/01/2012", "21/01/2012", "…
## $ Year                      <dbl> 2012, 2012, 2012, 2012, 2012, 2012, 2012, 2…
## $ Dengue_Fever              <dbl> 74, 64, 60, 50, 84, 87, 65, 50, 55, 45, 64,…
## $ Dengue_Haemorrhagic_Fever <dbl> 0, 2, 1, 2, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1…
## $ Total                     <dbl> 74, 66, 61, 52, 85, 87, 65, 51, 55, 46, 64,…

The Start and End dates were read in as character strings, so they were converted to dates using the dmy function of the lubridate package.

denguetimeseries$Start = dmy(denguetimeseries$Start)

denguetimeseries$End = dmy(denguetimeseries$End)

glimpse(denguetimeseries)
## Rows: 447
## Columns: 7
## $ Epidemiology_Wk           <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, …
## $ Start                     <date> 2012-01-01, 2012-01-08, 2012-01-15, 2012-0…
## $ End                       <date> 2012-01-07, 2012-01-14, 2012-01-21, 2012-0…
## $ Year                      <dbl> 2012, 2012, 2012, 2012, 2012, 2012, 2012, 2…
## $ Dengue_Fever              <dbl> 74, 64, 60, 50, 84, 87, 65, 50, 55, 45, 64,…
## $ Dengue_Haemorrhagic_Fever <dbl> 0, 2, 1, 2, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1…
## $ Total                     <dbl> 74, 66, 61, 52, 85, 87, 65, 51, 55, 46, 64,…
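The choice of dmy matters because the strings are in day/month/year order. The following is a minimal sketch using the first three Start values shown above, with a base R equivalent for comparison.

```r
library(lubridate)

# First three Start values from the glimpse output above
start_chr <- c("01/01/2012", "08/01/2012", "15/01/2012")

# dmy() reads the strings as day/month/year
start_date <- dmy(start_chr)

# Base R equivalent with an explicit format string
start_base <- as.Date(start_chr, format = "%d/%m/%Y")

# mdy() would read "08/01/2012" as 1 August instead of 8 January
wrong_order <- mdy("08/01/2012")

start_date
wrong_order
```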

Additionally, the Year column has been converted into a factor so that each year is treated as a discrete category when plotting.

denguetimeseries_1 <-  denguetimeseries %>% 
  mutate(Year = as_factor(Year))

glimpse(denguetimeseries_1)
## Rows: 447
## Columns: 7
## $ Epidemiology_Wk           <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, …
## $ Start                     <date> 2012-01-01, 2012-01-08, 2012-01-15, 2012-0…
## $ End                       <date> 2012-01-07, 2012-01-14, 2012-01-21, 2012-0…
## $ Year                      <fct> 2012, 2012, 2012, 2012, 2012, 2012, 2012, 2…
## $ Dengue_Fever              <dbl> 74, 64, 60, 50, 84, 87, 65, 50, 55, 45, 64,…
## $ Dengue_Haemorrhagic_Fever <dbl> 0, 2, 1, 2, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1…
## $ Total                     <dbl> 74, 66, 61, 52, 85, 87, 65, 51, 55, 46, 64,…
denguetimeseries_1
## # A tibble: 447 x 7
##    Epidemiology_Wk Start      End        Year  Dengue_Fever Dengue_Haemorrh…
##              <dbl> <date>     <date>     <fct>        <dbl>            <dbl>
##  1               1 2012-01-01 2012-01-07 2012            74                0
##  2               2 2012-01-08 2012-01-14 2012            64                2
##  3               3 2012-01-15 2012-01-21 2012            60                1
##  4               4 2012-01-22 2012-01-28 2012            50                2
##  5               5 2012-01-29 2012-02-04 2012            84                1
##  6               6 2012-02-05 2012-02-11 2012            87                0
##  7               7 2012-02-12 2012-02-18 2012            65                0
##  8               8 2012-02-19 2012-02-25 2012            50                1
##  9               9 2012-02-26 2012-03-03 2012            55                0
## 10              10 2012-03-04 2012-03-10 2012            45                1
## # … with 437 more rows, and 1 more variable: Total <dbl>
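A factor stores its values as integer level codes with labels, which affects how Year behaves in later comparisons and conversions. The following is a minimal sketch using base R's factor function for self-containment (the document itself uses as_factor); the three years are chosen only for illustration.

```r
year_fct <- factor(c(2012, 2013, 2020))

# Comparison works because the factor is compared by its labels
is_2020 <- year_fct == 2020

# as.integer() returns the internal level codes, not the years
codes <- as.integer(year_fct)

# Convert via character to recover the numeric years
years <- as.numeric(as.character(year_fct))

is_2020
codes
years
```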

2.5 Plotting a line chart for a week-on-week view of the dengue cases across the years

  • A line chart has been plotted to visualise the trend of dengue cases over the years
  • ggplotly is used to create an interactive line chart
Denguecases2020 <-  denguetimeseries_1 %>% 
  filter(Year == 2020) 

kable(Denguecases2020) %>% 
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive" ))
Epidemiology_Wk  Start       End         Year  Dengue_Fever  Dengue_Haemorrhagic_Fever  Total
              1  2019-12-29  2020-01-04  2020           302                          1    303
              2  2020-01-05  2020-01-11  2020           342                          1    343
              3  2020-01-12  2020-01-18  2020           402                          2    404
              4  2020-01-19  2020-01-25  2020           307                          2    309
              5  2020-01-26  2020-02-01  2020           370                          0    370
              6  2020-02-02  2020-02-08  2020           400                          0    400
              7  2020-02-09  2020-02-15  2020           378                          2    380
              8  2020-02-16  2020-02-22  2020           380                          1    381
              9  2020-02-23  2020-02-29  2020           373                          1    374
             10  2020-03-01  2020-03-07  2020           375                          0    375
             11  2020-03-08  2020-03-14  2020           389                          0    389
             12  2020-03-15  2020-03-21  2020           367                          1    368
             13  2020-03-22  2020-03-28  2020           377                          0    377
             14  2020-03-29  2020-04-04  2020           315                          0    315
             15  2020-04-05  2020-04-11  2020           341                          2    343
             16  2020-04-12  2020-04-18  2020           360                          0    360
             17  2020-04-19  2020-04-25  2020           399                          1    400
             18  2020-04-26  2020-05-02  2020           390                          0    390
             19  2020-05-03  2020-05-09  2020           502                          5    507
             20  2020-05-10  2020-05-16  2020           524                          2    526
             21  2020-05-17  2020-05-23  2020           618                          1    619
             22  2020-05-24  2020-05-30  2020           731                          1    732
             23  2020-05-31  2020-06-06  2020           868                          0    868
             24  2020-06-07  2020-06-13  2020          1151                          2   1153
             25  2020-06-14  2020-06-20  2020          1369                          2   1371
             26  2020-06-21  2020-06-27  2020          1460                          0   1460
             27  2020-06-28  2020-07-04  2020          1444                          4   1448
             28  2020-07-05  2020-07-11  2020          1666                          2   1668
             29  2020-07-12  2020-07-18  2020          1725                          4   1729
             30  2020-07-19  2020-07-25  2020          1791                          2   1793
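The weekly totals in this table can be accumulated into a year-to-date figure, which ties the table back to the 20,455 cases cited in the overview. The following is a minimal sketch in base R, with the values copied from the Total column above.

```r
# Weekly totals for 2020, copied from the Total column above (weeks 1 to 30)
weekly_total <- c(303, 343, 404, 309, 370, 400, 380, 381, 374, 375,
                  389, 368, 377, 315, 343, 360, 400, 390, 507, 526,
                  619, 732, 868, 1153, 1371, 1460, 1448, 1668, 1729, 1793)

# Running year-to-date total after each epidemiology week
year_to_date <- cumsum(weekly_total)

tail(year_to_date, 1)  # 20455
```

On the full dataset, the same running total per year could be obtained with group_by(Year) followed by mutate(cumsum(Total)).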
Denguechart <- denguetimeseries_1 %>% 
  filter(Year != 1900) %>% 
  ggplot(aes( x = Epidemiology_Wk, y = Total, group = Year, col= Year)) + 
  geom_line(size = 0.8) +
  labs(title = "Dengue cases week-on-week from 2012 - 2020",
       y = "Total no. of cases",
       x = "Epidemiology Week")+ 
  theme(
    axis.title.x = element_text(size = 11, face = 'bold'),
    axis.title.y = element_text(size = 11, face = 'bold'),
    legend.title = element_text(size = 10,face = 'bold'),
    legend.text = element_text(size =10),
    panel.grid.major = element_blank(),
    plot.title = element_text(size = 12, hjust = 0.5,face = 'bold',margin = margin(5,0,10,0))) +
  scale_x_continuous(breaks = seq(0,53,5))+
  scale_y_continuous(breaks = seq(0,2000,250)) 
 # facet_wrap(~Year)

Denguechartinteractive <-  ggplotly(Denguechart)
## Warning: `group_by_()` is deprecated as of dplyr 0.7.0.
## Please use `group_by()` instead.
## See vignette('programming') for more help
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_warnings()` to see where this warning was generated.
Denguechartinteractive

2.5.1 Plotting small multiples for a week-on-week view of the dengue cases by year

With all years overlaid in one chart, it can be tricky to follow a single line and understand the evolution of cases over that year.

Hence a possible workaround is to use small multiples, which highlight the trend of one year per panel against the other years in the background.

Denguechart_2 <- denguetimeseries_1 %>% 
  filter(Year != 1900) %>% 
  ggplot(aes(x = Epidemiology_Wk, y = Total)) +
    # Background layer: every year in grey. Renaming Year removes the facet
    # variable, so the layer repeats in each panel, while Year_bg still
    # keeps one line per year
    geom_line(data = denguetimeseries_1 %>% rename(Year_bg = Year),
              aes(group = Year_bg), color = "grey", size = 0.5, alpha = 0.5) +
    # Foreground layer: the year of the current panel
    geom_line(aes(group = Year), color = "#69b2a2", size = 1.2) +
    labs(title = "Dengue cases week-on-week from 2012 - 2020",
       y = "Total no. of cases",
       x = "Epidemiology Week") + 
  theme(
    axis.title.x = element_text(size = 11, face = 'bold'),
    axis.title.y = element_text(size = 11, face = 'bold'),
    legend.title = element_text(size = 10, face = 'bold'),
    legend.text = element_text(size = 10),
    panel.grid.major = element_blank(),
    plot.title = element_text(size = 12, hjust = 0.5, face = 'bold', margin = margin(5,0,10,0))) +
  facet_wrap(~Year)

Denguechart_2interactive <- ggplotly(Denguechart_2)


Denguechart_2interactive
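The small-multiples pattern relies on a background layer that is drawn in every panel. In ggplot2, a layer whose data lacks the facet variable is repeated across all facets, so the background data needs a grouping column under a different name. The following is a minimal sketch on toy data; the column name Year_bg and all values are made up for illustration.

```r
library(dplyr)
library(ggplot2)

# Toy weekly counts for three years (values are made up)
toy <- tibble(
  Year  = factor(rep(c(2018, 2019, 2020), each = 3)),
  Week  = rep(1:3, times = 3),
  Total = c(50, 60, 55, 200, 250, 300, 300, 400, 380)
)

# Background copy: same rows, but the facet variable is renamed so that
# the layer is not split by facet and appears in every panel
toy_bg <- rename(toy, Year_bg = Year)

p <- ggplot(toy, aes(x = Week, y = Total)) +
  # Grey lines, one per year, shown in all panels
  geom_line(data = toy_bg, aes(group = Year_bg), colour = "grey", alpha = 0.5) +
  # Highlighted line for the year that the panel belongs to
  geom_line(colour = "#69b2a2") +
  facet_wrap(~Year)

p
```

Grouping by the renamed column keeps the grey lines separate per year instead of joining all years into a single path.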

3.0 The Data visualisation

Three graphs have been plotted and consolidated into a flex dashboard - https://rpubs.com/jayneteo/646299

The first chart is an interactive geospatial distribution of the dengue clusters in Singapore, the second is an interactive line plot of the dengue cases week-on-week over the years, and the third is an interactive small-multiples line chart highlighting the trend of individual years.

The geospatial distribution of the dengue clusters is plotted against the backdrop of Singapore’s subzone areas with the purpose of understanding if there’s a correlation between population density and dengue hotspots as well as locality of the dengue clusters.

The line plot shows the trend of the volume of dengue cases over the years and across the epidemiology weeks while the multiples line chart is a further in-depth examination of the line chart by years.

3.1 Insights description

From the geospatial dengue clusters graph, it can be seen that the dengue clusters are currently concentrated in the northeast and east. Whilst there are some dengue clusters in high density areas, there is no overwhelming evidence that high density areas correlate with dengue clusters. Instead, the dengue clusters seem to be concentrated within particular geographic areas, hence prevention is key to stop dengue from taking root in an area. Nonetheless, more analysis of the clusters is required.

From the interactive line and multiples plot, it can be seen that the trajectory of dengue cases has risen very sharply in the last few months and will very likely soon surpass Singapore’s peak of 22,170 cases in 2013 with 4 more months to go before the end of the year.

Much community effort is required to collectively stem the spread of dengue - everyone has a part to play in preventing dengue transmission.