DataM: In-class and Homework Exercise 0518 (Maps)
In-class exercise 1.
Indicate countries you have visited so far on a world map in the style of the ebola outbreaks example.
[Note] I add the information of visit counts.
Construct the original dataset
dta <- rbind(
read.table('../data/visited_countries.txt', header = TRUE, sep = ','),
data.frame(Year = c(rep(1997, 12-5+1), rep(1998:2019, each=12), rep(2020, 5)),
Month = c(5:12, rep(1:12, 2019 - 1997), 1:5),
Country = rep('Taiwan', 277))
)
summary(dta) Year Month Country
Min. :1997 Min. : 1.000 Taiwan :277
1st Qu.:2003 1st Qu.: 4.000 Japan : 4
Median :2009 Median : 7.000 Thailand : 2
Mean :2009 Mean : 6.515 Australia: 1
3rd Qu.:2015 3rd Qu.: 9.000 Austria : 1
Max. :2020 Max. :12.000 Germany : 1
(Other) : 5
Plot the map
Countries that I have visited
df_maps <- data.frame(Country = countrycode(levels(dta$Country)[1:10],
'country.name', 'iso3c'),
Counts = as.numeric(table(dta$Country)[1:10]))
visitedMap <- joinCountryData2Map(df_maps, joinCode = "ISO3", nameJoinColumn = "Country")10 codes from your data successfully matched countries in the map
0 codes from your data failed to match with a country code in the map
233 codes from the map weren't represented in your data
mapCountryData(visitedMap, nameColumnToPlot = "Counts", catMethod = "categorical",
addLegend = TRUE, missingCountryCol = gray(.9),
colourPalette = c('deepskyblue', 'dodgerblue', 'dodgerblue4'),
mapTitle = 'Countries that Jay Liao have visited (with visit counts label)')Countries that I have stay (including my own country)
df_maps <- data.frame(Country = countrycode(levels(dta$Country),
'country.name', 'iso3c'),
Counts = as.numeric(table(dta$Country)))
visitedMap <- joinCountryData2Map(df_maps, joinCode = "ISO3", nameJoinColumn = "Country")11 codes from your data successfully matched countries in the map
0 codes from your data failed to match with a country code in the map
232 codes from the map weren't represented in your data
mapCountryData(visitedMap, nameColumnToPlot = "Counts", catMethod = "categorical",
addLegend = TRUE, missingCountryCol = gray(.9),
colourPalette = c('deepskyblue', 'dodgerblue', 'dodgerblue4', 'black'),
mapTitle = 'Countries that Jay Liao have stay (with stay months label)')In-class exercise 2.
Plot places in administratice areas of Taiwan you have visited so far.
Since I have visited every city and county in Taiwan, I group them into ‘Have lived’, ‘Have stayed’, and ‘Have visited’.
Prepare the datasets
Reading layer `TWN_adm2' from data source `/Users/jayliao/Documents/NCKU_108_Course/dataM/data/TWN_adm/TWN_adm2.shp' using driver `ESRI Shapefile'
Simple feature collection with 21 features and 18 fields
geometry type: MULTIPOLYGON
dimension: XY
bbox: xmin: 119.3138 ymin: 21.89653 xmax: 122.1085 ymax: 25.63431
CRS: 4326
[1] Kaohsiung City Taipei City Changhwa Chiayi
[5] Hsinchu Hualien Ilan Kaohsiung
[9] Keelung City Miaoli Nantou Penghu
[13] Pingtung Taichung Taichung City Tainan
[17] Tainan City Taipei Taitung Taoyuan
[21] Yunlin
21 Levels: Changhwa Chiayi Hsinchu Hualien Ilan ... Yunlin
In-class exercise 3.
Map an area of Tainan city to include three of your favorite places to eat as landmarks.
Construct the data set
dta <- list(cafe = list(lat = c(22.996545, 22.982398, 22.994841),
lng = c(120.216966, 120.193082, 120.217972),
loc = c('MASA LOFT', '午營咖啡 a break cafe', 'Louisa Coffee (成大勝利店)')),
Brunch = list(lat = c(22.994070, 22.991456, 22.984290),
lng = c(120.221047, 120.222033, 120.215221),
loc = c('濰克早午餐台南成大店', '綠木咖啡早午餐 Green Wood', '鹿耳晚晚早餐')),
Dinner = list(lat = c(22.991668, 22.994099, 22.982516),
lng = c(120.191856, 120.203678, 120.204432),
loc = c('沙卡里巴', '阿霞飯店', '有你真好 湘菜沙龍')),
Bar = list(lat = c(22.994679, 22.993608, 22.988610),
lng = c(120.195527, 120.225973, 120.197088),
loc = c('Bar - Wanderer', '在島之後After Island. 餐酒館', 'Bar B&B - Bitters&Barrel')))Several maps
Great cafe’ for studying
d <- dta$cafe
m <- leaflet() %>% addTiles() %>%
addMarkers(lat = d$lat, lng = d$lng, popup = d$loc)
mGreat brunch restaurants
d <- dta$Brunch
m <- leaflet() %>% addTiles() %>%
addMarkers(lat = d$lat, lng = d$lng, popup = d$loc)
mGreat dinner restaurants
d <- dta$Dinner
m <- leaflet() %>% addTiles() %>%
addMarkers(lat = d$lat, lng = d$lng, popup = d$loc)
mDisplay above venues in the single map with different icons labeling different types of venues
dta$cafe$icon <- 'https://balithisweek.com/wp-content/uploads/2016/10/btw-icon-cafe.png'
dta$Brunch$icon <- 'https://cdn1.iconfinder.com/data/icons/kitchen-flat-set/512/Main_Dish-512.png'
dta$Dinner$icon <- 'https://cdn.iconscout.com/icon/premium/png-256-thumb/dinner-85-540711.png'
dta$Bar$icon <- 'https://cdn.icon-icons.com/icons2/644/PNG/512/red_bar_icon-icons.com_59494.png'
m <- leaflet() %>% addTiles() %>%
addMarkers(lat = dta[[1]]$lat, lng = dta[[1]]$lng, popup = dta[[1]]$loc,
icon = icons(iconUrl = dta[[1]]$icon, iconWidth = 28, iconHeight = 28)) %>%
addMarkers(lat = dta[[2]]$lat, lng = dta[[2]]$lng, popup = dta[[2]]$loc,
icon = icons(iconUrl = dta[[2]]$icon, iconWidth = 42, iconHeight = 42)) %>%
addMarkers(lat = dta[[3]]$lat, lng = dta[[3]]$lng, popup = dta[[3]]$loc,
icon = icons(iconUrl = dta[[3]]$icon, iconWidth = 28, iconHeight = 28)) %>%
addMarkers(lat = dta[[4]]$lat, lng = dta[[4]]$lng, popup = dta[[4]]$loc,
icon = icons(iconUrl = dta[[4]]$icon, iconWidth = 28, iconHeight = 28))
mHW exercise 1.
Build a thematic plot of the results of Taiwan 2020 presidential election between the DDP and the KMT. The geographical data (maps) for Taiwan can be obtained from DIVA-GIS: Geographic Information System for Biodiversity Research.
Data - Taiwan Administrative units
Source: Taiwan presidential election 2020. Wikipedia.
Prepate the dataset
Reading layer `TWN_adm2' from data source `/Users/jayliao/Documents/NCKU_108_Course/dataM/data/TWN_adm/TWN_adm2.shp' using driver `ESRI Shapefile'
Simple feature collection with 21 features and 18 fields
geometry type: MULTIPOLYGON
dimension: XY
bbox: xmin: 119.3138 ymin: 21.89653 xmax: 122.1085 ymax: 25.63431
CRS: 4326
Plot the map
mcol <- c(RColorBrewer::brewer.pal(3, 'Blues')[3:1],
RColorBrewer::brewer.pal(6, 'Greens'))
library(classInt)
brks <- classIntervals(tw$gvotes, n = 9, style = 'quantile')$brks
plot(tw, col = mcol[findInterval(tw$gvotes, brks, all.inside = TRUE)],
axes = FALSE, max.plot = 1, main = '')
legend("topleft", legend = maptools::leglabs(round(brks)),
title = 'Percentage of votes that DPP obtained',
cex=0.8, fill = mcol, bty = 'n', x.intersp = .6, y.intersp = .9)HW exercise 2 (The link of the dataset is broken).
Traffic accidents on roads in Taiwan (The link is broken QQ) in 2011 is available on-line from the Department of Transportation. Plot the number of deaths per 10,000 vehicles over the administrative units.
HW exercise 3.
Download the data for age fisrt have sex across several countries to make the following plot:
Target output
Load and check
'data.frame': 42 obs. of 2 variables:
$ Date : Factor w/ 42 levels "Age of first sex- Australia",..: 14 18 42 19 24 37 16 31 7 22 ...
$ X1.1.2005: num 17.3 19.8 19.6 19.1 19 18.9 18.6 18.4 18.3 18.1 ...
dta_sex <- dta_sex %>% mutate(
Country = gsub(as.character(Date), pattern = 'Age of first sex- ', replacement = ''),
Age = X1.1.2005) %>% dplyr::select(Country, Age) %>% arrange(-Age)
dta_sexVisualize
Lollipop plot
dta_sex %>% dplyr::filter(Country != 'Global') %>%
mutate(Country = factor(Country, levels = Country[nrow(dta_sex):1]),
Group = cut(Age, breaks = quantile(Age, probs = c(0, .25, .50, .75, 1)),
include.lowest = TRUE)) %>%
qplot(data =., x = Age, y = Country, color = Group) +
geom_segment(aes(xend = 14, yend = Country, color = Group)) +
geom_vline(xintercept = dta_sex$Age[dta_sex$Country == 'Global'], lty = 2) +
scale_x_continuous(breaks = seq(14, 22, by = 2)) +
labs(x = 'Average age of the first sex', y = 'Country') +
theme_bw() +
theme(legend.position = 'none',
axis.text.y = element_text(size = 10))Map
df_maps <- data.frame(Country = countrycode(dta_sex$Country,
'country.name', 'iso3c'),
Age = dta_sex$Age)Warning in countrycode(dta_sex$Country, "country.name", "iso3c"): Some values were not matched unambiguously: Global, Serbia & Montenegro
40 codes from your data successfully matched countries in the map
2 codes from your data failed to match with a country code in the map
203 codes from the map weren't represented in your data
par(oma = c(0, 0, 0, 0))
mapCountryData(visitedMap, nameColumnToPlot = "Age", catMethod = 'pretty',
addLegend = FALSE, missingCountryCol = gray(.9), mapTitle = '',
colourPalette = c('skyblue1', 'skyblue2', 'dodgerblue', 'dodgerblue4', 'black'))Warning in rwmGetColours(colourPalette, numColours): 5 colours specified
and 9 required, using interpolation to calculate colours
This map is driven by the latest version of the dataset. Thus, it is lightly different from the target output.
HW exercise 4.
Download all the files from github (click the downward triangle in the clone or download button in green) for flood in schools in Taipei to replicate the analysis with the markdown file included.
台北市國民小學與中學洪災分析
降雨頻率分析與淹水潛勢是洪災分析重要的資料來源,利用水文分析進行降雨觀測及淹水分析結果,進一步整合社會統計資料,可評估人類社會承受災害的潛在風險。
假設將24小時延時200年重現期降雨的淹水潛勢的深度達50cm以上的區域,定義為「潛在受災區」。
Load the packages and prepare the datasets
library(rgdal)
library(sp)
Flood <- readOGR(dsn = './flood-in-school', layer = "flood50", encoding = 'big5')OGR data source with driver: ESRI Shapefile
Source: "/Users/jayliao/Documents/NCKU_108_Course/dataM/exercise_0518/flood-in-school", layer: "flood50"
with 5103 features
It has 5 fields
OGR data source with driver: ESRI Shapefile
Source: "/Users/jayliao/Documents/NCKU_108_Course/dataM/exercise_0518/flood-in-school", layer: "tpecity_school"
with 198 features
It has 3 fields
Integer64 fields read as strings: STUDENTS
OGR data source with driver: ESRI Shapefile
Source: "/Users/jayliao/Documents/NCKU_108_Course/dataM/data/mapdata201911261001/COUNTY_MOI_1081121.shp", layer: "COUNTY_MOI_1081121"
with 22 features
It has 4 fields
Plot the map with three layers using basic R graph tools
df_Schools_Flood <- as.data.frame(Schools_Flood)
par(oma = c(0, 0, 0, 0)) # Close the outter margins
plot(Flood, col = 'cyan')
plot(Schools, col = 'gray', pch = 20, add = TRUE)
plot(Schools_Flood, col = 'red', pch = 20, add = TRUE)
text(x = df_Schools_Flood$coords.x1, y = df_Schools_Flood$coords.x2,
label = Schools_Flood$TEXTSTRING, family = '蘋方-繁 標準體', cex = .6)I also try to label the name of the schools which are involved in the risky area. However, the texts of school names are too small to recognize. Some of them even overlap! I then try to use ggplot2 and ggrepel to fix it.
Plot the map with four layers using ggplot2
To add the background layer of Taipei City with coordinate scale different from other map layers, I conduct the linear regression to transform the coordinate scale.
f_scale <- function(v1, v2, nIter) {
n <- min(c(length(v1), length(v2)))
s <- sapply(1:nIter, function(i) {
v1 <- sort(sample(v1, n))
v2 <- sort(sample(v2, n))
coefficients(lm(v2 ~ v1))
})
return(rowMeans(s))
}
coeffi_long <- f_scale(TP$long, fortify_Flood$long, 5000)
coeffi_lat <- f_scale(TP$lat, fortify_Flood$lat, 5000)Plot using obtaind linear models coefficients.
library(ggrepel)
ggplot() +
geom_polygon(aes(x = coeffi_long[1] + coeffi_long[2]*long,
y = coeffi_lat[1] + coeffi_lat[2]*lat, group = group),
fill = 'black', data = TP) +
geom_polygon(aes(x = long, y = lat, group = group),
fill = 'deepskyblue2', data = fortify_Flood) +
geom_point(data = df_Schools, color = 'grey',
mapping = aes(x = coords.x1, y = coords.x2)) +
geom_point(data = df_Schools_Flood, color = 'hotpink',
mapping = aes(x = coords.x1, y = coords.x2)) +
geom_label_repel(aes(x = coords.x1, y = coords.x2, label = TEXTSTRING),
data = df_Schools_Flood,
family = '蘋方-繁 標準體', size = 1.5,
segment.color = 'grey10', segment.size = .25) +
theme_void()Fail! QAQ
Plot the map with three layers using ggplot2
ggplot() +
geom_polygon(aes(x = long, y = lat, group = group),
fill = 'deepskyblue2', data = fortify_Flood) +
geom_point(data = df_Schools, color = 'grey',
mapping = aes(x = coords.x1, y = coords.x2)) +
geom_point(data = df_Schools_Flood, color = 'hotpink',
mapping = aes(x = coords.x1, y = coords.x2)) +
geom_label_repel(aes(x = coords.x1, y = coords.x2, label = TEXTSTRING),
data = df_Schools_Flood,
family = '蘋方-繁 標準體', size = 2.5,
segment.color = 'grey10', segment.size = .25) +
theme_void()Compute the number of students under the risk
[1] 3722
台北市共有36個中小學在潛在受災區範圍內,同時可能影響的學生人數共為3722名