library(dplyr)
library(ggplot2)

In-class exercise 1.

Indicate countries you have visited so far on a world map in the style of the ebola outbreaks example.

[Note] I add the information of visit counts.

Construct the original dataset

dta <- rbind(
  read.table('../data/visited_countries.txt', header = TRUE, sep = ','),
  data.frame(Year = c(rep(1997, 12-5+1), rep(1998:2019, each=12), rep(2020, 5)),
             Month = c(5:12, rep(1:12, 2019 - 1997), 1:5),
             Country = rep('Taiwan', 277))
)
summary(dta)

      Year          Month              Country   
 Min.   :1997   Min.   : 1.000   Taiwan    :277  
 1st Qu.:2003   1st Qu.: 4.000    Japan    :  4  
 Median :2009   Median : 7.000    Thailand :  2  
 Mean   :2009   Mean   : 6.515    Australia:  1  
 3rd Qu.:2015   3rd Qu.: 9.000    Austria  :  1  
 Max.   :2020   Max.   :12.000    Germany  :  1  
                                 (Other)   :  5

Plot the map

library(rworldmap)
library(countrycode)

Countries that I have visited

df_maps <- data.frame(Country = countrycode(levels(dta$Country)[1:10],
                                            'country.name', 'iso3c'),
                      Counts = as.numeric(table(dta$Country)[1:10]))
visitedMap <- joinCountryData2Map(df_maps, joinCode = "ISO3", nameJoinColumn = "Country")

10 codes from your data successfully matched countries in the map
0 codes from your data failed to match with a country code in the map
233 codes from the map weren't represented in your data

mapCountryData(visitedMap, nameColumnToPlot = "Counts", catMethod = "categorical",
               addLegend = TRUE, missingCountryCol = gray(.9),
               colourPalette = c('deepskyblue', 'dodgerblue', 'dodgerblue4'),
               mapTitle = 'Countries that Jay Liao have visited (with visit counts label)')

Countries that I have stay (including my own country)

df_maps <- data.frame(Country = countrycode(levels(dta$Country),
                                            'country.name', 'iso3c'),
                      Counts = as.numeric(table(dta$Country)))
visitedMap <- joinCountryData2Map(df_maps, joinCode = "ISO3", nameJoinColumn = "Country")

11 codes from your data successfully matched countries in the map
0 codes from your data failed to match with a country code in the map
232 codes from the map weren't represented in your data

mapCountryData(visitedMap, nameColumnToPlot = "Counts", catMethod = "categorical",
               addLegend = TRUE, missingCountryCol = gray(.9),
               colourPalette = c('deepskyblue', 'dodgerblue', 'dodgerblue4', 'black'),
               mapTitle = 'Countries that Jay Liao have stay (with stay months label)')

In-class exercise 2.

Plot places in administratice areas of Taiwan you have visited so far.

Since I have visited every city and county in Taiwan, I group them into ‘Have lived’, ‘Have stayed’, and ‘Have visited’.

Prepare the datasets

tw <- sf::st_read('../data/TWN_adm/TWN_adm2.shp')

Reading layer `TWN_adm2' from data source `/Users/jayliao/Documents/NCKU_108_Course/dataM/data/TWN_adm/TWN_adm2.shp' using driver `ESRI Shapefile'
Simple feature collection with 21 features and 18 fields
geometry type:  MULTIPOLYGON
dimension:      XY
bbox:           xmin: 119.3138 ymin: 21.89653 xmax: 122.1085 ymax: 25.63431
CRS:            4326

tw$NAME_2

 [1] Kaohsiung City Taipei City    Changhwa       Chiayi        
 [5] Hsinchu        Hualien        Ilan           Kaohsiung     
 [9] Keelung City   Miaoli         Nantou         Penghu        
[13] Pingtung       Taichung       Taichung City  Tainan        
[17] Tainan City    Taipei         Taitung        Taoyuan       
[21] Yunlin        
21 Levels: Changhwa Chiayi Hsinchu Hualien Ilan ... Yunlin

tw$counts <- c('violet', 'violetred4', 'violet', 'violet', 'violet',
               'violet', 'violet', 'violet', 'violetred1', 'violetred1',
               'violet', 'violet', 'violetred1', 'violetred1', 'violet',
               'violetred1', 'violetred4', 'violet', 'violet', 'violetred1',
               'violetred4')

Plot the map

plot(tw, col = tw$counts, axes = FALSE, max.plot = 1, main = '')
legend("topleft", 
       legend = c('Have lived', 'Have stayed', 'Have visited', 'Have not visited'),
       fill = c('violetred4', 'violet', 'violetred1', 'grey'),
       bty = 'n', x.intersp = .6, y.intersp = .9, cex = 0.8)

In-class exercise 3.

Map an area of Tainan city to include three of your favorite places to eat as landmarks.

Construct the data set

dta <- list(cafe = list(lat = c(22.996545, 22.982398, 22.994841),
                        lng = c(120.216966, 120.193082, 120.217972),
                        loc =  c('MASA LOFT', '午營咖啡 a break cafe', 'Louisa Coffee (成大勝利店)')),
            Brunch = list(lat = c(22.994070, 22.991456, 22.984290),
                          lng = c(120.221047, 120.222033, 120.215221),
                          loc = c('濰克早午餐台南成大店', '綠木咖啡早午餐 Green Wood', '鹿耳晚晚早餐')),
            Dinner = list(lat = c(22.991668, 22.994099, 22.982516),
                          lng = c(120.191856, 120.203678, 120.204432),
                          loc = c('沙卡里巴', '阿霞飯店', '有你真好 湘菜沙龍')),
            Bar = list(lat = c(22.994679, 22.993608, 22.988610),
                       lng = c(120.195527, 120.225973, 120.197088),
                       loc = c('Bar - Wanderer', '在島之後After Island. 餐酒館', 'Bar B&B - Bitters&Barrel')))

Several maps

library(leaflet)

Great cafe’ for studying

d <- dta$cafe
m <- leaflet() %>% addTiles() %>%
  addMarkers(lat = d$lat, lng = d$lng, popup = d$loc)
m

Great brunch restaurants

d <- dta$Brunch
m <- leaflet() %>% addTiles() %>%
  addMarkers(lat = d$lat, lng = d$lng, popup = d$loc)
m

Great dinner restaurants

d <- dta$Dinner
m <- leaflet() %>% addTiles() %>%
  addMarkers(lat = d$lat, lng = d$lng, popup = d$loc)
m

Great bars

d <- dta$Bar
m <- leaflet() %>% addTiles() %>%
  addMarkers(lat = d$lat, lng = d$lng, popup = d$loc)
m

Display above venues in the single map with different icons labeling different types of venues

dta$cafe$icon <- 'https://balithisweek.com/wp-content/uploads/2016/10/btw-icon-cafe.png'
dta$Brunch$icon <- 'https://cdn1.iconfinder.com/data/icons/kitchen-flat-set/512/Main_Dish-512.png'
dta$Dinner$icon <- 'https://cdn.iconscout.com/icon/premium/png-256-thumb/dinner-85-540711.png'
dta$Bar$icon <- 'https://cdn.icon-icons.com/icons2/644/PNG/512/red_bar_icon-icons.com_59494.png'

m <- leaflet() %>% addTiles() %>%
  addMarkers(lat = dta[[1]]$lat, lng = dta[[1]]$lng, popup = dta[[1]]$loc,
             icon = icons(iconUrl = dta[[1]]$icon, iconWidth = 28, iconHeight = 28)) %>%
  addMarkers(lat = dta[[2]]$lat, lng = dta[[2]]$lng, popup = dta[[2]]$loc,
             icon = icons(iconUrl = dta[[2]]$icon, iconWidth = 42, iconHeight = 42)) %>%
  addMarkers(lat = dta[[3]]$lat, lng = dta[[3]]$lng, popup = dta[[3]]$loc,
             icon = icons(iconUrl = dta[[3]]$icon, iconWidth = 28, iconHeight = 28)) %>%
  addMarkers(lat = dta[[4]]$lat, lng = dta[[4]]$lng, popup = dta[[4]]$loc,
             icon = icons(iconUrl = dta[[4]]$icon, iconWidth = 28, iconHeight = 28))
m

HW exercise 1.

Build a thematic plot of the results of Taiwan 2020 presidential election between the DDP and the KMT. The geographical data (maps) for Taiwan can be obtained from DIVA-GIS: Geographic Information System for Biodiversity Research.

Data - Taiwan Administrative units

Source: Taiwan presidential election 2020. Wikipedia.

Prepate the dataset

tw <- sf::st_read('../data/TWN_adm/TWN_adm2.shp')

Reading layer `TWN_adm2' from data source `/Users/jayliao/Documents/NCKU_108_Course/dataM/data/TWN_adm/TWN_adm2.shp' using driver `ESRI Shapefile'
Simple feature collection with 21 features and 18 fields
geometry type:  MULTIPOLYGON
dimension:      XY
bbox:           xmin: 119.3138 ymin: 21.89653 xmax: 122.1085 ymax: 25.63431
CRS:            4326

tw$gvotes <- c(62.23, 53.65, 57.17, 64.22, 56.52, 35.91, 63.28, 62.23,
               50.82, 45.02, 50.83, 53.85, 62.16, 56.95, 56.95, 67.38,
               67.38, 56.52, 38.12, 54.78, 61.56)

Plot the map

mcol <- c(RColorBrewer::brewer.pal(3, 'Blues')[3:1],
          RColorBrewer::brewer.pal(6, 'Greens'))

library(classInt)
brks <- classIntervals(tw$gvotes, n = 9, style = 'quantile')$brks

plot(tw, col = mcol[findInterval(tw$gvotes, brks, all.inside = TRUE)],
     axes = FALSE, max.plot = 1, main = '')
legend("topleft", legend = maptools::leglabs(round(brks)),
       title = 'Percentage of votes that DPP obtained',
       cex=0.8, fill = mcol, bty = 'n', x.intersp = .6, y.intersp = .9)

HW exercise 2 (The link of the dataset is broken).

Traffic accidents on roads in Taiwan (The link is broken QQ) in 2011 is available on-line from the Department of Transportation. Plot the number of deaths per 10,000 vehicles over the administrative units.

HW exercise 3.

Download the data for age fisrt have sex across several countries to make the following plot:

Target output

Load and check

dta_sex <- read.csv('../data/Age_When_You_First_Had_Sex.csv')
str(dta_sex)

'data.frame':   42 obs. of  2 variables:
 $ Date     : Factor w/ 42 levels "Age of first sex- Australia",..: 14 18 42 19 24 37 16 31 7 22 ...
 $ X1.1.2005: num  17.3 19.8 19.6 19.1 19 18.9 18.6 18.4 18.3 18.1 ...

dta_sex <- dta_sex %>% mutate(
  Country = gsub(as.character(Date), pattern = 'Age of first sex- ', replacement = ''),
  Age = X1.1.2005) %>% dplyr::select(Country, Age) %>% arrange(-Age)
dta_sex

Visualize

Lollipop plot

dta_sex %>% dplyr::filter(Country != 'Global') %>%
  mutate(Country = factor(Country, levels = Country[nrow(dta_sex):1]),
         Group = cut(Age, breaks = quantile(Age, probs = c(0, .25, .50, .75, 1)),
                     include.lowest = TRUE)) %>%
  qplot(data =., x = Age, y = Country, color = Group) +
  geom_segment(aes(xend = 14, yend = Country, color = Group)) +
  geom_vline(xintercept = dta_sex$Age[dta_sex$Country == 'Global'], lty = 2) +
  scale_x_continuous(breaks = seq(14, 22, by = 2)) +
  labs(x = 'Average age of the first sex', y = 'Country') +
  theme_bw() +
  theme(legend.position = 'none',
        axis.text.y = element_text(size = 10))

Map

df_maps <- data.frame(Country = countrycode(dta_sex$Country,
                                            'country.name', 'iso3c'),
                      Age = dta_sex$Age)

Warning in countrycode(dta_sex$Country, "country.name", "iso3c"): Some values were not matched unambiguously: Global, Serbia & Montenegro

visitedMap <- joinCountryData2Map(df_maps, joinCode = "ISO3", nameJoinColumn = "Country")

40 codes from your data successfully matched countries in the map
2 codes from your data failed to match with a country code in the map
203 codes from the map weren't represented in your data

par(oma = c(0, 0, 0, 0))
mapCountryData(visitedMap, nameColumnToPlot = "Age", catMethod = 'pretty',
               addLegend = FALSE, missingCountryCol = gray(.9), mapTitle = '',
               colourPalette = c('skyblue1', 'skyblue2', 'dodgerblue', 'dodgerblue4', 'black'))

Warning in rwmGetColours(colourPalette, numColours): 5 colours specified
and 9 required, using interpolation to calculate colours

This map is driven by the latest version of the dataset. Thus, it is lightly different from the target output.

HW exercise 4.

Download all the files from github (click the downward triangle in the clone or download button in green) for flood in schools in Taipei to replicate the analysis with the markdown file included.

台北市國民小學與中學洪災分析

降雨頻率分析與淹水潛勢是洪災分析重要的資料來源，利用水文分析進行降雨觀測及淹水分析結果，進一步整合社會統計資料，可評估人類社會承受災害的潛在風險。

假設將24小時延時200年重現期降雨的淹水潛勢的深度達50cm以上的區域，定義為「潛在受災區」。

Load the packages and prepare the datasets

library(rgdal)
library(sp)
Flood <- readOGR(dsn = './flood-in-school', layer = "flood50", encoding = 'big5')

OGR data source with driver: ESRI Shapefile 
Source: "/Users/jayliao/Documents/NCKU_108_Course/dataM/exercise_0518/flood-in-school", layer: "flood50"
with 5103 features
It has 5 fields

Schools <- readOGR(dsn = './flood-in-school', layer = "tpecity_school", encoding = 'big5')

OGR data source with driver: ESRI Shapefile 
Source: "/Users/jayliao/Documents/NCKU_108_Course/dataM/exercise_0518/flood-in-school", layer: "tpecity_school"
with 198 features
It has 3 fields
Integer64 fields read as strings:  STUDENTS

shp_tw <- readOGR('../data/mapdata201911261001/County_MOI_1081121.shp')

OGR data source with driver: ESRI Shapefile 
Source: "/Users/jayliao/Documents/NCKU_108_Course/dataM/data/mapdata201911261001/COUNTY_MOI_1081121.shp", layer: "COUNTY_MOI_1081121"
with 22 features
It has 4 fields

TP <- fortify(shp_tw) %>% dplyr::filter(id == 8)
fortify_Flood <- fortify(Flood)
Schools_Flood <- Schools[Flood, ]
df_Schools <- as.data.frame(Schools)

Plot the map with three layers using basic R graph tools

df_Schools_Flood <- as.data.frame(Schools_Flood)
par(oma = c(0, 0, 0, 0))    # Close the outter margins
plot(Flood, col = 'cyan')
plot(Schools, col = 'gray', pch = 20, add = TRUE)
plot(Schools_Flood, col = 'red', pch = 20, add = TRUE)
text(x = df_Schools_Flood$coords.x1, y = df_Schools_Flood$coords.x2,
     label = Schools_Flood$TEXTSTRING, family = '蘋方-繁 標準體', cex = .6)

I also try to label the name of the schools which are involved in the risky area. However, the texts of school names are too small to recognize. Some of them even overlap! I then try to use ggplot2 and ggrepel to fix it.

Plot the map with four layers using `ggplot2`

To add the background layer of Taipei City with coordinate scale different from other map layers, I conduct the linear regression to transform the coordinate scale.

f_scale <- function(v1, v2, nIter) {
  n <- min(c(length(v1), length(v2)))
  s <- sapply(1:nIter, function(i) {
    v1 <- sort(sample(v1, n))
    v2 <- sort(sample(v2, n))
    coefficients(lm(v2 ~ v1))
  })
  return(rowMeans(s))
}
coeffi_long <- f_scale(TP$long, fortify_Flood$long, 5000)
coeffi_lat <- f_scale(TP$lat, fortify_Flood$lat, 5000)

Plot using obtaind linear models coefficients.

library(ggrepel)
ggplot() +
  geom_polygon(aes(x = coeffi_long[1] + coeffi_long[2]*long, 
                   y = coeffi_lat[1] + coeffi_lat[2]*lat, group = group),
               fill = 'black', data = TP) +
  geom_polygon(aes(x = long, y = lat, group = group),
               fill = 'deepskyblue2', data = fortify_Flood) +
  geom_point(data = df_Schools, color = 'grey',
             mapping = aes(x = coords.x1, y = coords.x2)) +
  geom_point(data = df_Schools_Flood, color = 'hotpink',
             mapping = aes(x = coords.x1, y = coords.x2)) +
  geom_label_repel(aes(x = coords.x1, y = coords.x2, label = TEXTSTRING),
                   data = df_Schools_Flood,
                   family = '蘋方-繁 標準體', size = 1.5,
                   segment.color = 'grey10', segment.size = .25) +
  theme_void()

Fail! QAQ

Plot the map with three layers using `ggplot2`

ggplot() +
  geom_polygon(aes(x = long, y = lat, group = group),
               fill = 'deepskyblue2', data = fortify_Flood) +
  geom_point(data = df_Schools, color = 'grey',
             mapping = aes(x = coords.x1, y = coords.x2)) +
  geom_point(data = df_Schools_Flood, color = 'hotpink',
             mapping = aes(x = coords.x1, y = coords.x2)) +
  geom_label_repel(aes(x = coords.x1, y = coords.x2, label = TEXTSTRING),
                   data = df_Schools_Flood,
                   family = '蘋方-繁 標準體', size = 2.5,
                   segment.color = 'grey10', segment.size = .25) +
  theme_void()

Compute the number of students under the risk

head(Schools_Flood@data, n = nrow(Schools_Flood@data))

Schools_Flood@data$STUDENTS %>% as.character() %>% as.numeric() %>% sum()

[1] 3722

台北市共有36個中小學在潛在受災區範圍內，同時可能影響的學生人數共為3722名

DataM: In-class and Homework Exercise 0518 (Maps)

DataM: In-class and Homework Exercise 0518 (Maps)

In-class exercise 1.

Construct the original dataset

Plot the map

Countries that I have visited

Countries that I have stay (including my own country)

In-class exercise 2.

Prepare the datasets

Plot the map

In-class exercise 3.

Construct the data set

Several maps

Great cafe’ for studying

Great brunch restaurants

Great dinner restaurants

Great bars

Display above venues in the single map with different icons labeling different types of venues

HW exercise 1.

Prepate the dataset

Plot the map

HW exercise 2 (The link of the dataset is broken).

HW exercise 3.

Load and check

Visualize

Lollipop plot

Map

HW exercise 4.

台北市國民小學與中學洪災分析

Load the packages and prepare the datasets

Plot the map with three layers using basic R graph tools

Plot the map with four layers using `ggplot2`

To add the background layer of Taipei City with coordinate scale different from other map layers, I conduct the linear regression to transform the coordinate scale.

Plot using obtaind linear models coefficients.

Plot the map with three layers using `ggplot2`

Compute the number of students under the risk

DataM: In-class and Homework Exercise 0518 (Maps)

DataM: In-class and Homework Exercise 0518 (Maps)

In-class exercise 1.

Construct the original dataset

Plot the map

Countries that I have visited

Countries that I have stay (including my own country)

In-class exercise 2.

Prepare the datasets

Plot the map

In-class exercise 3.

Construct the data set

Several maps

Great cafe’ for studying

Great brunch restaurants

Great dinner restaurants

Great bars

Display above venues in the single map with different icons labeling different types of venues

HW exercise 1.

Prepate the dataset

Plot the map

HW exercise 2 (The link of the dataset is broken).

HW exercise 3.

Load and check

Visualize

Lollipop plot

Map

HW exercise 4.

台北市國民小學與中學洪災分析

Load the packages and prepare the datasets

Plot the map with three layers using basic R graph tools

Plot the map with four layers using ggplot2

To add the background layer of Taipei City with coordinate scale different from other map layers, I conduct the linear regression to transform the coordinate scale.

Plot using obtaind linear models coefficients.

Plot the map with three layers using ggplot2

Compute the number of students under the risk

Plot the map with four layers using `ggplot2`

Plot the map with three layers using `ggplot2`