Introduction

In this document, we crawl open data from the Data.Taipei website and draw an interactive map to show the relationship between population and air pollution in Taipei City.

Crawling Population Data

We first download the population data from data.Taipei, then add the longitude and latitude of each district using the Google Maps API.

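# jsonlite and RCurl are used later for the air pollution API; readODS reads the .ods file
library(jsonlite)
library(RCurl)
library(readODS)
peopleURL<-"http://data.taipei/opendata/datalist/datasetMeta/download?id=6a1dbb4e-e99c-4e67-ab09-f6d83852dc99&rid=f1d3a464-151b-448f-bc5d-276a2ae36af8"
# mode = "wb" keeps the binary .ods file intact on Windows
download.file(peopleURL, destfile = "106people.ods", mode = "wb")
people <- read_ods("106people.ods")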

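library(ggmap)
# Note: current ggmap releases need a Google API key, set with register_google(),
# before geocode() will return results.
# Rows 2 to 13 hold the 12 districts; geocode each one and store the result in its row.
index <- 2
for (str in people$district[2:13]) {
    str <- paste(str, "區")
    loc <- geocode(str)
    people$longitud[index] <- loc$lon
    people$latitude[index] <- loc$lat
    index <- index + 1
}

# Drop row 14, which is not used on the map.
people <- people[-c(14),]

# Row 1 gets the coordinates of Taipei City as a whole.
Taipei <- geocode("Taipei City")
people$longitud[1] <- Taipei$lon
people$latitude[1] <- Taipei$lat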
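
The loop above builds each geocoding query with paste(), which inserts a space between the district name and the suffix. The short sketch below only illustrates that difference from paste0(); whether the geocoder resolves both forms equally well was not tested here, and the three names are just a sample of the districts.

```r
districts <- c("松山", "信義", "大安")

# paste() separates its arguments with a space by default
with_space <- paste(districts, "區")

# paste0() concatenates with no separator
without_space <- paste0(districts, "區")

print(with_space)
print(without_space)
```

Both functions are vectorized, so the query strings for a whole column can be built in one call rather than inside the loop.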

We can now list the 12 districts of Taipei with their longitudes and latitudes.

people[c(-1),c("district","longitud","latitude")]
##    district longitud latitude
## 2      松山 121.5639 25.05416
## 3      信義 121.5770 25.02870
## 4      大安 121.5427 25.02616
## 5      中山 121.5427 25.07920
## 6      中正 121.5199 25.04214
## 7      大同 121.5113 25.06272
## 8      萬華 121.4970 25.02629
## 9      文山 121.5713 24.99292
## 10     南港 121.6112 25.03123
## 11     內湖 121.5909 25.06894
## 12     士林 121.5246 25.09505
## 13     北投 121.5150 25.11518

Next, we build a color palette so that the color of each marker reflects its population value.

library(viridis)
## Loading required package: viridisLite
library(leaflet)
pal <- colorFactor(viridis(5), people$population)
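
Because population is numeric, colorFactor treats every distinct value as its own category. As a possible alternative, the sketch below bins the values with leaflet's colorBin instead. The population figures are made-up placeholders, not the values in the downloaded file, and the name pal_binned is introduced only for this illustration.

```r
library(viridis)
library(leaflet)

# Placeholder values for illustration only; not real population figures
example_population <- c(100000, 150000, 200000, 250000, 300000)

# Split the value range into 4 equal-width bins, coloured along the viridis scale
pal_binned <- colorBin(viridis(4), domain = example_population,
                       bins = 4, pretty = FALSE)

# Each value maps to the colour of the bin it falls in
marker_colors <- pal_binned(example_population)
print(marker_colors)
```

A binned palette like this could be passed to addCircleMarkers and addLegend in the same way as pal, which would give the legend a handful of ranges instead of one entry per district.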

Crawling Air Pollution Data

Similarly, we download the air pollution data from data.Taipei. The data report “floating particles” in Taipei as monthly averages in μg/m3 (micrograms per cubic meter). We again add the longitude and latitude of each monitoring station using the Google Maps API.

AirData<-fromJSON(getURL("http://data.taipei/opendata/datalist/apiAccess?scope=resourceAquire&rid=9c489b31-8621-4c33-8337-10457cc1bc3d"))
AirDataFrame<-AirData$result$results
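
To show why the records sit under result and then results, here is a minimal sketch that parses a small inline JSON string with the same nesting. The field names and values in the payload are placeholders invented for the example, not real API output.

```r
library(jsonlite)

# Made-up payload that mimics the result -> results nesting;
# field names and values are placeholders, not real API output
payload <- '{"result": {"results": [
  {"station": "A", "value": "10"},
  {"station": "B", "value": "20"}
]}}'

parsed <- fromJSON(payload)

# fromJSON simplifies an array of records into a data frame, one row per record
records <- parsed$result$results

# In this placeholder the numbers arrive as strings, so convert before arithmetic
records$value <- as.numeric(records$value)
print(records)
```

fromJSON also accepts a URL directly, so the separate getURL call may not be needed, although that was not tested against this API.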

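# Append 臺北市 (Taipei City) to each row's station name, geocode it,
# and store the coordinates in that row.
newindex <- 1
for (str in AirDataFrame$監測站) {
    str <- paste(str, "臺北市")
    loc <- geocode(str)
    AirDataFrame$longitud[newindex] <- loc$lon
    AirDataFrame$latitude[newindex] <- loc$lat
    newindex <- newindex + 1
}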
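
The loop above sends one geocoding request per row, even when the same station name appears in several rows. A minimal sketch of one way to reduce the number of requests follows: geocode each distinct name once and map the results back with match(). The helper geocode_unique and the stand-in stub_geocoder are hypothetical names for this example; the stub returns dummy coordinates so the sketch runs without an API key, and a real run would pass ggmap's geocode instead.

```r
# Hypothetical helper: geocode each distinct name once, then copy the
# coordinates back onto every row that uses that name.
geocode_unique <- function(names, suffix, geocoder) {
  distinct <- unique(names)
  # geocoder is assumed to take a character vector and return a data frame
  # with lon and lat columns, as ggmap::geocode() does
  coords <- geocoder(paste(distinct, suffix))
  row <- match(names, distinct)
  data.frame(longitud = coords$lon[row], latitude = coords$lat[row])
}

# Stand-in geocoder for illustration only; it returns dummy coordinates
geocoder_calls <- 0
stub_geocoder <- function(queries) {
  geocoder_calls <<- geocoder_calls + length(queries)
  data.frame(lon = seq_along(queries), lat = -seq_along(queries))
}

# Five rows, but only three distinct station names
stations <- c("中正站", "大直站", "中正站", "大直站", "信義站")
coords <- geocode_unique(stations, "臺北市", stub_geocoder)
print(coords)
print(geocoder_calls)
```

With the real data, the two returned columns could be attached to AirDataFrame with cbind().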

We can now list the 8 monitoring stations with their longitudes and latitudes.

unique(AirDataFrame[,c("監測站","longitud","latitude")])
##   監測站 longitud latitude
## 1 中正站 121.5016 25.03796
## 2 大直站 121.5469 25.07948
## 3 信義站 121.5632 25.03306
## 4 南港站 121.6067 25.05212
## 5 內湖站 121.5944 25.08366
## 6 木柵站 121.5731 24.99824
## 7 承德站 121.5014 25.03345
## 8 中北站 121.5186 25.04879

Plotting Map and Adding Legend

Finally, we build HTML pop-ups for both data sets and then display the data on an interactive map.

library(dplyr)
# Build the HTML shown in each district's pop-up
people <- people %>%
    mutate(popup_info = paste("<b>District:</b>", district, "<br />",
                              "<b>Population:</b>", population, "<br />"))
# Build the HTML shown in each monitoring station's pop-up
AirDataFrame <- AirDataFrame %>%
    mutate(popup_info = paste("<b>Station:</b>", `監測站`, "<br />",
                              "<b>Floating Particles:</b>", `月平均值μg/m3`, "<br />"))
# Large markers coloured by population for districts; small default markers for stations
leaflet() %>%
    addTiles() %>%
    addCircleMarkers(data = people, radius = 10,
                     lng = ~ longitud, lat = ~ latitude,
                     popup = ~ popup_info, color = ~ pal(population)) %>%
    addCircleMarkers(data = AirDataFrame, radius = 5,
                     lng = ~ longitud, lat = ~ latitude,
                     popup = ~ popup_info) %>%
    addLegend(pal = pal, values = people$population)

On this map, the legend shows which color corresponds to which population value.

Summary

The map shows 12 districts and 8 monitoring stations. Comparing the population figures with the readings at nearby stations suggests a relationship between population and air pollution in Taipei: areas with more people tend to show higher levels of floating particles. With more precise data, that is, with the city partitioned into finer geographical areas, this connection could be explored further.
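
One way to make this comparison more concrete is to pair each district with its nearest monitoring station. The sketch below does this with the coordinates listed earlier in this document and a haversine (great-circle) distance. The function haversine_km and the column names are introduced only for this example, and it assumes a spherical Earth with a radius of 6371 km.

```r
# District and station coordinates copied from the listings earlier in this document
districts <- data.frame(
  district = c("松山", "信義", "大安", "中山", "中正", "大同",
               "萬華", "文山", "南港", "內湖", "士林", "北投"),
  lon = c(121.5639, 121.5770, 121.5427, 121.5427, 121.5199, 121.5113,
          121.4970, 121.5713, 121.6112, 121.5909, 121.5246, 121.5150),
  lat = c(25.05416, 25.02870, 25.02616, 25.07920, 25.04214, 25.06272,
          25.02629, 24.99292, 25.03123, 25.06894, 25.09505, 25.11518),
  stringsAsFactors = FALSE
)
stations <- data.frame(
  station = c("中正站", "大直站", "信義站", "南港站",
              "內湖站", "木柵站", "承德站", "中北站"),
  lon = c(121.5016, 121.5469, 121.5632, 121.6067,
          121.5944, 121.5731, 121.5014, 121.5186),
  lat = c(25.03796, 25.07948, 25.03306, 25.05212,
          25.08366, 24.99824, 25.03345, 25.04879),
  stringsAsFactors = FALSE
)

# Great-circle distance in km between points given in decimal degrees
haversine_km <- function(lon1, lat1, lon2, lat2) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * 6371 * asin(sqrt(pmin(1, a)))
}

# Distance from every district (rows) to every station (columns)
dist_matrix <- outer(
  seq_len(nrow(districts)), seq_len(nrow(stations)),
  function(i, j) haversine_km(districts$lon[i], districts$lat[i],
                              stations$lon[j], stations$lat[j])
)

# For each district, keep the closest station and its distance
nearest <- apply(dist_matrix, 1, which.min)
districts$nearest_station <- stations$station[nearest]
districts$distance_km <- dist_matrix[cbind(seq_len(nrow(districts)), nearest)]
print(districts[, c("district", "nearest_station", "distance_km")])
```

With this pairing, each district's population could be set beside the monthly average at its nearest station, for example with cor(), though with only 8 stations for 12 districts any such figure would be a rough indication at best.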