Dengue cases in Singapore have exceeded 20,000 this year.
A dengue cluster, according to Singapore’s National Environment Agency (NEA), indicates a locality with active transmission where intervention is needed.
A cluster is formed when two or more cases have onset within 14 days and are located within 150m of each other (based on residential and workplace addresses). The clusters are colored according to three alert statuses:
High-risk area with 10 or more cases: RED
Medium-risk area with fewer than 10 cases: YELLOW
No new cases, under surveillance for the next 21 days: GREEN
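The 14-day, 150m linkage rule described above can be illustrated with a minimal base R sketch. The case records, coordinates and dates below are made up, the coordinates are assumed to be already projected to metres, and treating both thresholds as inclusive is my assumption rather than NEA's published method.

```r
# Hypothetical cases: x/y are projected coordinates in metres (an assumption;
# real addresses would first need geocoding and projection).
cases <- data.frame(
  id    = c("a", "b", "c"),
  x     = c(0, 100, 1000),
  y     = c(0, 0, 0),
  onset = as.Date(c("2020-07-01", "2020-07-10", "2020-07-05"))
)

# Pairwise distance in metres and pairwise gap between onset dates in days.
dist_m   <- as.matrix(dist(cases[, c("x", "y")]))
gap_days <- abs(outer(as.numeric(cases$onset), as.numeric(cases$onset), "-"))

# Two cases are linked when both thresholds are met (a case is not linked to itself).
linked <- dist_m <= 150 & gap_days <= 14
diag(linked) <- FALSE
dimnames(linked) <- list(cases$id, cases$id)
print(linked)
```

Only cases a and b are linked here: they are 100m and 9 days apart, while case c is too far from both.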
While the NEA Dengue Cluster Map tells us the locality and boundary of each dengue cluster, the visualization could be improved by showing the color code and the number of cases of each cluster on the map. This would better achieve its objective of conveying the severity of dengue cases to residents staying in the vicinity.
NEA defined “Green Clusters” as those under surveillance for the next 21 days. This means that technically, these areas are currently not active and have dengue transmission under control. However, a few problems arise:
there is no data regarding the coordinates of these green zone polygons
some of the areas currently in the green zone lie within the polygon of an active dengue cluster
I therefore decided to focus on the active clusters in dengue-cluster.csv, as these are the places where residents should be more vigilant.
In light of this new understanding and the data challenges, I decided to recategorize the color codes for dengue clusters. I recommend to:
keep RED for more than 10 cases
split the remaining clusters into YELLOW for 6-10 cases and GREEN for 5 or fewer cases
This year has also seen the emergence of many ‘super-clusters’ with more than 100 cases; to show them effectively, I color coded them as BLACK.
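As a quick check of the recategorized thresholds, here is a minimal base R sketch using cut(). It uses the bin boundaries applied in the code later in this document (1-5, 6-10, 11-100, >100); the case counts are made up, and cut() is an alternative I am using for illustration, not the method used in the pipeline itself.

```r
# Hypothetical case counts, chosen to sit on both sides of each boundary.
case_size <- c(3, 5, 6, 10, 11, 100, 101, 275)

# Intervals are right-closed: (0,5], (5,10], (10,100], (100,Inf].
color_code <- cut(
  case_size,
  breaks = c(0, 5, 10, 100, Inf),
  labels = c("GREEN (1-5)", "YELLOW (6-10)", "RED (11-100)", "BLACK (>100)")
)

print(data.frame(case_size, color_code))
```

One property of this approach is that cut() returns NA for a missing or zero count, so values that failed to parse stay visible instead of being assigned a color.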
While the color code offers a refreshing view of the dengue cluster locations, it does not show how many cases there are. Within a color code there may be wide variation, so I initially thought of adding a bubble plot to the choropleth map as an extra layer, with bubble size representing the number of cases.
The problem with this approach is that bubbles may overlap and obscure one another when there are many large clusters within a small area.
I also thought of using text to show case numbers directly. However, too many numbers may clutter the map if we just want a little more detail than the color codes of the clusters.
In the end, I decided on a combination of choropleth colored zones, bubble color fill opacity and text, so that users can choose the level of detail they want to see using the interactive selectors at the left side of the map. Legends indicate what each color code means and how the bubble color fill opacity relates to the number of cases.
We need five libraries here:
sf
tmap
tidyverse
textclean
plotly
packages = c('sf', 'tmap', 'tidyverse', 'textclean', 'plotly')
for (p in packages){
  # install the package only if it cannot be loaded
  if(!require(p, character.only = TRUE)){
    install.packages(p)
  }
  library(p, character.only = TRUE)
}
Read the KML file with the st_read function from the sf package, and call a summary to see the characteristics of the resulting dataframe.
dengue <- st_read("data/geo/dengue-clusters-kml.kml")
## Reading layer `DENGUE_CLUSTER' from data source `C:\Users\User\Desktop\tmp\Assignment5\Assignment5\data\geo\dengue-clusters-kml.kml' using driver `KML'
## Simple feature collection with 437 features and 2 fields
## geometry type: POLYGON
## dimension: XYZ
## bbox: xmin: 103.6283 ymin: 1.265024 xmax: 103.9685 ymax: 1.454956
## z_range: zmin: 0 zmax: 0
## geographic CRS: WGS 84
summary(dengue)
## Name Description geometry
## Length:437 Length:437 POLYGON Z :437
## Class :character Class :character epsg:4326 : 0
## Mode :character Mode :character +proj=long...: 0
Convert dengue into a normal dataframe, dengue1.
dengue1 <- data.frame(dengue)
Take the Description column of dengue1, clean the HTML tags into a new dataframe dengue2, and rename the column as Description.
dengue2 <- data.frame(apply(dengue1["Description"], 1, FUN=replace_html))
names(dengue2)[1] <- "Description"
strsplit2 <- function(x){
  # split wherever four consecutive capital letters occur
  return(strsplit(x, split="[A-Z]{4}"))
}
Apply strsplit2 to the Description column of dengue2 to get dengue3.
dengue3 <- apply(dengue2["Description"], 1, FUN="strsplit2")
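To see what a split on four capital letters produces, here is a small sketch on a made-up description string. The field layout is an assumption for illustration, not the actual KML content. Note that an eight-letter label such as LOCALITY is consumed as two consecutive matches, which leaves empty strings in front of the first value.

```r
# Hypothetical, simplified description string (not the real KML content).
x <- "LOCALITY Ang Mo Kio Ave 3 CASE_SIZE 12"

# Split wherever four consecutive capital letters occur.
parts <- strsplit(x, split = "[A-Z]{4}")[[1]]
print(parts)

# The values of interest sit at fixed positions once the labels are removed.
locality  <- trimws(parts[3])
case_size <- as.numeric(trimws(parts[5]))
```

In this toy string the locality lands at position 3 and the case count at position 5, with the leftover "_" from CASE_SIZE at position 4. Positions in real data depend on the order and spelling of the field labels, so it is worth printing a few split results before relying on fixed indices.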
LOCALITY <- list()
CASE_SIZE <- list()
NAME <- list()
extract LOCALITY, CASE_SIZE info from dengue3
unlist and trim whitespaces
send information to the empty lists initiated earlier
for (i in 1:437) {
  NAME[i] <- paste0("kml_", i)
}
for (i in 1:437) {
  LOCALITY[i] <- paste0(trimws(unlist(dengue3[i])[3]))
}
for (i in 1:437) {
  CASE_SIZE[i] <- paste0(trimws(unlist(dengue3[i])[5]))
}
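The same extraction can also be written without pre-allocated lists or a hard-coded row count. The sketch below uses vapply on a small made-up list shaped like the elements of dengue3; the helper name get_field and the sample values are hypothetical.

```r
# Hypothetical split results, shaped like the elements of dengue3.
pieces <- list(
  c("", "", " Bedok North Ave 1 ", "_", " 7 "),
  c("", "", " Aljunied Rd ", "_", " 120 ")
)

# Pull the k-th piece of one record and trim surrounding whitespace.
get_field <- function(p, k) trimws(unlist(p)[k])

LOCALITY  <- vapply(pieces, get_field, character(1), k = 3)
CASE_SIZE <- as.numeric(vapply(pieces, get_field, character(1), k = 5))
NAME      <- paste0("kml_", seq_along(pieces))

print(data.frame(NAME, LOCALITY, CASE_SIZE))
```

Because the length comes from seq_along(pieces), this version keeps working if the number of clusters in the file changes.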
rbind filled lists into new dataframe dengue4 and rename dataframe columns appropriately
change CASE_SIZE data type for the whole column to numeric for color coding later
dengue4 <- do.call(rbind, Map(data.frame, A=NAME, B=LOCALITY, C=CASE_SIZE))
names(dengue4) <- c("NAME", "LOCALITY","CASE_SIZE")
dengue4$CASE_SIZE <- as.numeric(dengue4$CASE_SIZE)
Create color codes of BLACK (>100), RED (11-100), YELLOW (6-10), GREEN (1-5) by mutating over CASE_SIZE
Convert the color codes into a factor with 4 levels ordered BLACK (>100), RED (11-100), YELLOW (6-10), GREEN (1-5), where the numbers represent the number of cases.
dengue_clusters <- dengue4 %>%
  select(NAME, LOCALITY, CASE_SIZE) %>%
  mutate(COLOR_CODE_CASES = case_when(
    CASE_SIZE > 100 ~ "BLACK (>100)",
    CASE_SIZE > 10 ~ "RED (11-100)",
    CASE_SIZE > 5 ~ "YELLOW (6-10)",
    TRUE ~ "GREEN (1-5)")) %>%
  mutate(COLOR_CODE_CASES = factor(COLOR_CODE_CASES,
                                   levels = c("BLACK (>100)", "RED (11-100)",
                                              "YELLOW (6-10)", "GREEN (1-5)")))
extract Name and geometry from original dengue sf dataframe as dengue_sel
full_join dengue_clusters dataframe and dengue_sel
reset the dataframe as sf type by passing it to st_sf
dengue_sel <- dengue %>% select("Name","geometry")
dengue_clusters_chloropleth <- full_join(dengue_clusters, dengue_sel,
                                         by=c("NAME"="Name"))
dengue_clusters_chloropleth <- st_sf(dengue_clusters_chloropleth)
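The effect of a full join on differently named key columns can be seen in a small base R sketch. It uses merge() with all = TRUE as a stand-in for full_join, and made-up rows in which a plain text column takes the place of the real geometry.

```r
# Hypothetical attribute table and shape table; "geometry" is plain text here.
attrs  <- data.frame(NAME = c("kml_1", "kml_2"), CASE_SIZE = c(12, 3))
shapes <- data.frame(Name = c("kml_1", "kml_2", "kml_3"),
                     geometry = c("POLYGON A", "POLYGON B", "POLYGON C"))

# all = TRUE keeps unmatched rows from both sides, like a full join.
joined <- merge(attrs, shapes, by.x = "NAME", by.y = "Name", all = TRUE)
print(joined)
```

The unmatched shape kml_3 is kept with an NA case count. In the real data, a row like this would signal that the parsed attributes and the polygons are out of step.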
dengue_case_size <- dengue_clusters_chloropleth
dengue_case_size_bubbles <- dengue_clusters_chloropleth
tmap_mode("view")
## tmap mode set to interactive viewing
tm <- tm_shape(dengue_clusters_chloropleth) +
  tm_fill("COLOR_CODE_CASES",
          id = 'LOCALITY',
          palette = c("black","red2","yellow2","darkgreen"),
          alpha = 0.75,
          borders = 1,
          borders.col = 'blue') +
  tm_borders(alpha = 0.5) +
  tm_basemap("Esri.WorldTopoMap")
tm <- tm + tm_shape(dengue_case_size_bubbles) +
  tm_bubbles(col = 'CASE_SIZE',
             id = 'LOCALITY',
             palette = NULL,
             size = 0.05,
             border.col = "black",
             border.lwd = 1,
             alpha = 0.8)
tm <- tm + tm_shape(dengue_case_size) +
  tm_text(text = 'CASE_SIZE',
          fontface = "bold")
Reading the data from dengue3, I reuse the steps used to parse CASE_SIZE and LOCALITY to extract the mosquito breeding habitats reported at homes and at construction sites.
I use lapply to apply str_to_upper and trimws to the list, tabulate the entries, and then sort by Freq in descending order.
HOMES <- list()
for (i in 1:437) {
  HOMES[i] <- paste0(trimws(unlist(dengue3[i])[9]))
}
HOMES <- trimws(gsub("^S","",HOMES))
HOMES <- strsplit(HOMES, split=',')
HOMES <- lapply(HOMES, str_to_upper)
HOMES <- lapply(HOMES, trimws)
HOMES <- table(unlist(HOMES))
HOMES <- cbind.data.frame(HOMES)
HOMES <- HOMES[order(-HOMES$Freq),]
HOMES10 <- HOMES[1:10,]
SITES <- list()
for (i in 1:437) {
  SITES[i] <- paste0(trimws(unlist(dengue3[i])[11]))
}
SITES <- trimws(gsub("^ES","",SITES))
SITES <- strsplit(SITES, split=',')
SITES <- lapply(SITES, str_to_upper)
SITES <- lapply(SITES, trimws)
SITES <- table(unlist(SITES))
SITES <- cbind.data.frame(SITES)
SITES <- SITES[order(-SITES$Freq),]
SITES10 <- SITES[1:10,]
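The tabulation steps used for HOMES and SITES can be checked on a tiny made-up input. This sketch uses base R only (toupper in place of str_to_upper), and the reported strings are hypothetical.

```r
# Hypothetical comma-separated habitat reports, with inconsistent case and spacing.
reported <- c("Pail, Vase", "pail , Flower Pot", "Vase, PAIL")

# Split on commas, then normalise whitespace and case so variants count together.
items <- toupper(trimws(unlist(strsplit(reported, ","))))

# Tabulate and sort by frequency in descending order.
counts <- as.data.frame(table(items), stringsAsFactors = FALSE)
counts <- counts[order(-counts$Freq), ]
print(head(counts, 2))
```

Normalising before counting matters here: without trimws and the upper-casing step, "pail ", "Pail" and "PAIL" would be tallied as three different habitats.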
Use geom_segment and geom_point from the ggplot2 package, with coord_flip(), to create a lollipop plot, which is visually more pleasing than a simple horizontal bar plot.
lp1 <- HOMES10 %>%
  ggplot(aes(x=reorder(Var1,Freq), y=Freq)) +
  geom_segment(aes(xend=Var1, yend=0)) +
  geom_point(size=4, color="blue") +
  scale_y_continuous(breaks=c(4,8,12,16,20,24,28,32)) +
  coord_flip() +
  theme_bw() +
  xlab("") +
  ggtitle("Top 10 Home Mosquito Breeding Areas")
lp2 <- SITES10 %>%
  ggplot(aes(x=reorder(Var1,Freq), y=Freq)) +
  geom_segment(aes(xend=Var1, yend=0)) +
  geom_point(size=4, color="orange") +
  scale_y_continuous(breaks=c(2,4,6,8,10,12)) +
  coord_flip() +
  theme_bw() +
  xlab("") +
  ggtitle("Top 10 Construction Site Mosquito Breeding Areas")
The map shows a heavy concentration of cluster zones (regardless of color code) in the eastern region of Singapore in this period.
While the west is sparsely affected, there are two distinct, heavily hit clusters: Bukit Panjang (Senja Rd vicinity) with 275 cases and Hillview with 134 cases.
The largest cluster is in the Aljunied area, with many contiguous red/black cluster zones. The government should perform a Dengue Blitz to eliminate dengue in that area.
Pail, Vase and Flower Pots are the top 3 most commonly reported mosquito breeding areas in homes. Residents should take note and be wary of mosquito breeding in these areas.
Closed Perimeter Drain, Tree Hole and Scupper Drain are the top 3 most commonly reported mosquito breeding areas at construction sites. Workers and supervisors should take note and be wary of mosquito breeding in these areas.