Case study: World Heritage Sites in Danger

- 1,121 heritage sites, such as the Pyramids in Egypt

- Which sites are threatened and where are they located?
- Are there regions in the world where sites are more endangered than in others?
- What are the reasons that put a site at risk?

https://en.wikipedia.org/wiki/List_of_World_Heritage_in_Danger

  1. Load the packages needed to scrape web content and visualize the results
# install.packages("stringr")
# install.packages("RCurl")
# install.packages("XML")
# install.packages("maps")

library(stringr)
library(RCurl)
library(XML)
library(maps)
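
The commented-out install.packages() calls above could be replaced by a check that reports which packages are missing. This is only a minimal sketch: the names `needed`, `is_installed`, and `missing_pkgs` are introduced here for illustration, and the actual install line is left commented out because it assumes a CRAN mirror is configured.

```r
# Packages used in this case study
needed <- c("stringr", "RCurl", "XML", "maps")

# requireNamespace() returns FALSE instead of stopping when a package is absent
is_installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- needed[!is_installed]

# Uncomment to install only what is missing (assumes a configured CRAN mirror)
# if (length(missing_pkgs) > 0) install.packages(missing_pkgs)
print(missing_pkgs)
```
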
  2. Import content from a web page into R
heritage_page <- getURLContent("https://en.wikipedia.org/wiki/List_of_World_Heritage_in_Danger")
class(heritage_page)
## [1] "character"
  3. Parse the content as an HTML document and extract the HTML tables
heritage_parsed <- htmlParse(heritage_page, encoding = "UTF-8")
class(heritage_parsed)
## [1] "HTMLInternalDocument" "HTMLInternalDocument" "XMLInternalDocument" 
## [4] "XMLAbstractDocument"
tables <- readHTMLTable(heritage_parsed, stringsAsFactors = FALSE)
class(tables)
## [1] "list"
tables[[2]][1:10,]
##                                     V1    V2
## 1                                 Name Image
## 2                             Abu Mena      
## 3      Air and Ténéré Natural Reserves      
## 4               Ancient City of Aleppo      
## 5                Ancient City of Bosra      
## 6             Ancient City of Damascus      
## 7   Ancient Villages of Northern Syria      
## 8        Archaeological Site of Cyrene      
## 9  Archaeological Site of Leptis Magna      
## 10     Archaeological Site of Sabratha      
##                                                                                                                           V3
## 1                                                                                                                   Location
## 2                              EgyAbusir, Egypt30°50′30″N 29°39′50″E / 30.84167°N 29.66389°E / 30.84167; 29.66389 (Abu Mena)
## 3            Niger1Arlit Department, Niger18°17′N 8°0′E / 18.283°N 8.000°E / 18.283; 8.000 (Air and Ténéré Natural Reserves)
## 4                    Aleppo Governorate,  Syria36°14′N 37°10′E / 36.233°N 37.167°E / 36.233; 37.167 (Ancient City of Aleppo)
## 5         Daraa Governorate,  Syria32°31′5″N 36°28′54″E / 32.51806°N 36.48167°E / 32.51806; 36.48167 (Ancient City of Bosra)
## 6  Damascus Governorate,  Syria33°30′41″N 36°18′23″E / 33.51139°N 36.30639°E / 33.51139; 36.30639 (Ancient City of Damascus)
## 7                Syria36°20′3″N 36°50′39″E / 36.33417°N 36.84417°E / 36.33417; 36.84417 (Ancient Villages of Northern Syria)
## 8   LibJebel Akhdar, Libya32°49′30″N 21°51′30″E / 32.82500°N 21.85833°E / 32.82500; 21.85833 (Archaeological Site of Cyrene)
## 9    LibKhoms, Libya32°38′18″N 14°17′35″E / 32.63833°N 14.29306°E / 32.63833; 14.29306 (Archaeological Site of Leptis Magna)
## 10     LibSabratha, Libya32°48′19″N 12°29′6″E / 32.80528°N 12.48500°E / 32.80528; 12.48500 (Archaeological Site of Sabratha)
##                               V4                     V5         V6         V7
## 1                       Criteria          Areaha (acre) Year (WHS) Endangered
## 2                  Cultural:(iv)              182 (450)       1979      2001–
## 3       Natural:(vii), (ix), (x) 7,736,000 (19,120,000)       1991      1992–
## 4             Cultural:(iii)(iv)              350 (860)       1986      2013–
## 5          Cultural:(i)(iii)(vi)                      —       1980      2013–
## 6  Cultural:(i)(ii)(iii)(iv)(vi)               86 (210)       1979      2013–
## 7          Cultural:(iii)(iv)(v)        12,290 (30,400)       2011      2013–
## 8     Cultural:(ii), (iii), (vi)                      —       1982      2016–
## 9      Cultural:(i), (ii), (iii)                      —       1982      2016–
## 10                Cultural:(iii)                      —       1982      2016–
##                                                                                                                                             V8
## 1                                                                                                                                       Reason
## 2                               Cave-ins in the area caused by the clay at the surface, which becomes semi-liquid when met with "excess water"
## 3  Military conflict and civil disturbance in the region as well as a reduction of wildlife population and degradation of the vegetation cover
## 4                                                  Syrian Civil War, currently held by the government. Bombings continue threatening the site.
## 5                                                                                                    Syrian Civil War, held by the government.
## 6                                Syrian Civil War, rebel gunfire and mortar shelling, mainly from adjacent Jobar suburb endangers foundations.
## 7                                                Syrian Civil War, some held by rebels. Reports of looting and demolitions by Islamist groups.
## 8                                                   Libyan Civil War, presence of armed groups, already incurred and potential further damage.
## 9                                                   Libyan Civil War, presence of armed groups, already incurred and potential further damage.
## 10                                                  Libyan Civil War, presence of armed groups, already incurred and potential further damage.
##              V9
## 1          Refs
## 2  [17][18][19]
## 3      [20][21]
## 4          [22]
## 5          [23]
## 6          [24]
## 7          [25]
## 8      [26][27]
## 9      [27][28]
## 10     [27][29]
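
The table of interest is picked by position (the second table), which may stop working if the page layout changes. One possible alternative is to look for the table whose first row holds the expected column labels. The helper `find_table()` and the toy list `toy_tables` below are hypothetical and only stand in for the list returned by readHTMLTable(); they are not part of the XML package.

```r
# Hypothetical helper: index of the first table whose first row contains
# every label in `labels`, or NA if there is none
find_table <- function(tables, labels) {
  for (i in seq_along(tables)) {
    tab <- tables[[i]]
    if (is.null(tab) || nrow(tab) == 0) next  # skip empty or missing entries
    first_row <- unlist(tab[1, ], use.names = FALSE)
    if (all(labels %in% first_row)) return(i)
  }
  NA_integer_
}

# Toy stand-in for the list of tables (header stored in the first row,
# as in the printed output above)
toy_tables <- list(
  data.frame(V1 = "Legend", V2 = "Colour", stringsAsFactors = FALSE),
  data.frame(V1 = c("Name", "Abu Mena"),
             V2 = c("Endangered", "2001"),
             stringsAsFactors = FALSE)
)

idx <- find_table(toy_tables, c("Name", "Endangered"))
print(idx)
```

With the real data, the call would presumably look like find_table(tables, c("Name", "Endangered")), assuming the header is still stored in the first row of each table.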
  4. Select the table of interest and rename the variables
danger_table <- tables[[2]]
danger_table[1:10,]
##                                     V1    V2
## 1                                 Name Image
## 2                             Abu Mena      
## 3      Air and Ténéré Natural Reserves      
## 4               Ancient City of Aleppo      
## 5                Ancient City of Bosra      
## 6             Ancient City of Damascus      
## 7   Ancient Villages of Northern Syria      
## 8        Archaeological Site of Cyrene      
## 9  Archaeological Site of Leptis Magna      
## 10     Archaeological Site of Sabratha      
##                                                                                                                           V3
## 1                                                                                                                   Location
## 2                              EgyAbusir, Egypt30°50′30″N 29°39′50″E / 30.84167°N 29.66389°E / 30.84167; 29.66389 (Abu Mena)
## 3            Niger1Arlit Department, Niger18°17′N 8°0′E / 18.283°N 8.000°E / 18.283; 8.000 (Air and Ténéré Natural Reserves)
## 4                    Aleppo Governorate,  Syria36°14′N 37°10′E / 36.233°N 37.167°E / 36.233; 37.167 (Ancient City of Aleppo)
## 5         Daraa Governorate,  Syria32°31′5″N 36°28′54″E / 32.51806°N 36.48167°E / 32.51806; 36.48167 (Ancient City of Bosra)
## 6  Damascus Governorate,  Syria33°30′41″N 36°18′23″E / 33.51139°N 36.30639°E / 33.51139; 36.30639 (Ancient City of Damascus)
## 7                Syria36°20′3″N 36°50′39″E / 36.33417°N 36.84417°E / 36.33417; 36.84417 (Ancient Villages of Northern Syria)
## 8   LibJebel Akhdar, Libya32°49′30″N 21°51′30″E / 32.82500°N 21.85833°E / 32.82500; 21.85833 (Archaeological Site of Cyrene)
## 9    LibKhoms, Libya32°38′18″N 14°17′35″E / 32.63833°N 14.29306°E / 32.63833; 14.29306 (Archaeological Site of Leptis Magna)
## 10     LibSabratha, Libya32°48′19″N 12°29′6″E / 32.80528°N 12.48500°E / 32.80528; 12.48500 (Archaeological Site of Sabratha)
##                               V4                     V5         V6         V7
## 1                       Criteria          Areaha (acre) Year (WHS) Endangered
## 2                  Cultural:(iv)              182 (450)       1979      2001–
## 3       Natural:(vii), (ix), (x) 7,736,000 (19,120,000)       1991      1992–
## 4             Cultural:(iii)(iv)              350 (860)       1986      2013–
## 5          Cultural:(i)(iii)(vi)                      —       1980      2013–
## 6  Cultural:(i)(ii)(iii)(iv)(vi)               86 (210)       1979      2013–
## 7          Cultural:(iii)(iv)(v)        12,290 (30,400)       2011      2013–
## 8     Cultural:(ii), (iii), (vi)                      —       1982      2016–
## 9      Cultural:(i), (ii), (iii)                      —       1982      2016–
## 10                Cultural:(iii)                      —       1982      2016–
##                                                                                                                                             V8
## 1                                                                                                                                       Reason
## 2                               Cave-ins in the area caused by the clay at the surface, which becomes semi-liquid when met with "excess water"
## 3  Military conflict and civil disturbance in the region as well as a reduction of wildlife population and degradation of the vegetation cover
## 4                                                  Syrian Civil War, currently held by the government. Bombings continue threatening the site.
## 5                                                                                                    Syrian Civil War, held by the government.
## 6                                Syrian Civil War, rebel gunfire and mortar shelling, mainly from adjacent Jobar suburb endangers foundations.
## 7                                                Syrian Civil War, some held by rebels. Reports of looting and demolitions by Islamist groups.
## 8                                                   Libyan Civil War, presence of armed groups, already incurred and potential further damage.
## 9                                                   Libyan Civil War, presence of armed groups, already incurred and potential further damage.
## 10                                                  Libyan Civil War, presence of armed groups, already incurred and potential further damage.
##              V9
## 1          Refs
## 2  [17][18][19]
## 3      [20][21]
## 4          [22]
## 5          [23]
## 6          [24]
## 7          [25]
## 8      [26][27]
## 9      [27][28]
## 10     [27][29]
names(danger_table)
## [1] "V1" "V2" "V3" "V4" "V5" "V6" "V7" "V8" "V9"
danger_table <- danger_table[-1, c(1,3,4,6,7)]
colnames(danger_table) <- c("name","location","criterion","year_des","year_end")
danger_table$name[1:3]
## [1] "Abu Mena"                        "Air and Ténéré Natural Reserves"
## [3] "Ancient City of Aleppo"
  5. Data cleaning
danger_table$criterion[1:3]
## [1] "Cultural:(iv)"            "Natural:(vii), (ix), (x)"
## [3] "Cultural:(iii)(iv)"
danger_table$criterion <- ifelse(str_detect(danger_table$criterion, "Natural"), "Natural", "Cultural")
danger_table$criterion[1:3]
## [1] "Cultural" "Natural"  "Cultural"
danger_table$year_des[1:3]
## [1] "1979" "1991" "1986"
danger_table$year_des <- as.numeric(danger_table$year_des)
danger_table$year_des[1:3]
## [1] 1979 1991 1986
danger_table$year_end[1:3]
## [1] "2001–" "1992–" "2013–"
# str_extract() returns one value per row (NA if no year is found), so the lengths always match
year_end_clean <- str_extract(danger_table$year_end, "^[[:digit:]]{4}")
danger_table$year_end <- as.numeric(year_end_clean)
danger_table$year_end[1:3]
## [1] 2001 1992 2013
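
When extracting the year, it helps to keep one result per row even if an entry contains no year at all, since a shorter vector cannot be assigned back to the column. The following base R sketch illustrates the idea on toy values; `year_raw` is made up for illustration (with a plain hyphen instead of the dash used on the page), and its last entry deliberately has no year.

```r
# Toy values shaped like the "Endangered" column; the last one has no year
year_raw <- c("2001-", "1992-", "n/a")

# regexpr() returns -1 where nothing matches, so positions stay aligned
m <- regexpr("^[0-9]{4}", year_raw)

# Start with NA everywhere, then fill in the rows that did match
year_clean <- rep(NA_character_, length(year_raw))
year_clean[m != -1] <- regmatches(year_raw, m)

year_num <- as.numeric(year_clean)
print(year_num)
```
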

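The location variable contains the name of the site’s location, the country, and the geographic coordinates in several formats. The decimal latitude and longitude used for plotting are the two numbers that follow the last slash, separated by a semicolon.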

danger_table$location[c(1, 3, 5)]
## [1] "EgyAbusir, Egypt30°50′30″N 29°39′50″E / 30.84167°N 29.66389°E / 30.84167; 29.66389 (Abu Mena)"                            
## [2] "Aleppo Governorate,  Syria36°14′N 37°10′E / 36.233°N 37.167°E / 36.233; 37.167 (Ancient City of Aleppo)"                  
## [3] "Damascus Governorate,  Syria33°30′41″N 36°18′23″E / 33.51139°N 36.30639°E / 33.51139; 36.30639 (Ancient City of Damascus)"
reg_y <- "[/][ -]*[[:digit:]]*[.]*[[:digit:]]*[;]"
reg_x <- "[;][ -]*[[:digit:]]*[.]*[[:digit:]]*"
y_coords <- str_extract(danger_table$location, reg_y)
y_coords <- as.numeric(str_sub(y_coords, 3, -2))
y_coords
##  [1]  30.84167  18.28300  36.23300  32.51806  33.51139  36.33417  32.82500
##  [8]  32.63833  32.80528  35.45667  -8.11111 -19.58361  11.41700  34.78167
## [15]  34.83194 -11.68306  25.31700   9.55389   4.00000  35.58806  31.52417
## [22]  39.05000  48.20000  14.20000  27.63300  -2.50000   3.05222  53.40667
## [29]   9.00000  34.39667  42.66111   7.60000   6.83972  13.00000   2.00000
## [36]  31.77667  15.35556  30.13333  13.90639  31.71972  15.92694 -14.46700
## [43]  15.74444  24.83300  -2.00000  34.20000  -9.00000  34.55417  16.77333
## [50]  16.28972   0.32917  -2.50000   0.91700
danger_table$y_coords <- y_coords
x_coords <- str_extract(danger_table$location, reg_x)
x_coords <- as.numeric(str_sub(x_coords, 3, -1))
x_coords
##  [1]   29.66389    8.00000   37.16700   36.48167   36.30639   36.84417
##  [7]   21.85833   14.29306   12.48500   43.26250  -79.07500  -65.75306
## [13]  -69.66700   36.26306   67.82667  160.18306  -80.93300  -79.65583
## [19]   29.25000   42.71833   35.10889   66.83333   16.36700   43.31700
## [25] -112.55000   28.75000   36.50361   -2.84444   21.50000   64.51611
## [31]   20.26556   -8.38300  158.33083  -12.66700   28.50000   35.23417
## [37]   44.20806    9.50000   -4.55500   35.13056   48.62667   49.70000
## [43]  -84.67500   10.33300   21.00000   43.86700   37.40000   38.26667
## [49]   -2.99944   -0.04444   32.55333  101.50000   29.16700
danger_table$x_coords <- x_coords
danger_table$location <- NULL
round(danger_table$y_coords, 2)[1:3]
## [1] 30.84 18.28 36.23
round(danger_table$x_coords, 2)[1:3]
## [1] 29.66  8.00 37.17
length(danger_table$y_coords)
## [1] 53
length(danger_table$x_coords)
## [1] 53
dim(danger_table)
## [1] 53  6
head(danger_table)
##                                 name criterion year_des year_end y_coords
## 2                           Abu Mena  Cultural     1979     2001 30.84167
## 3    Air and Ténéré Natural Reserves   Natural     1991     1992 18.28300
## 4             Ancient City of Aleppo  Cultural     1986     2013 36.23300
## 5              Ancient City of Bosra  Cultural     1980     2013 32.51806
## 6           Ancient City of Damascus  Cultural     1979     2013 33.51139
## 7 Ancient Villages of Northern Syria  Cultural     2011     2013 36.33417
##   x_coords
## 2 29.66389
## 3  8.00000
## 4 37.16700
## 5 36.48167
## 6 36.30639
## 7 36.84417
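
The coordinate extraction can also be sketched in base R with a single pattern and two capture groups, followed by a range check. This is an illustrative alternative, not the method used above. The strings in `loc` are simplified to the trailing "/ latitude; longitude (name)" part; the first is taken from the table, the second is hypothetical and only exercises negative values.

```r
# Simplified location strings (second entry is hypothetical)
loc <- c("/ 36.233; 37.167 (Ancient City of Aleppo)",
         "/ -8.5; -79.25 (Hypothetical site)")

# Slash, latitude, semicolon, longitude; each number is a capture group
pattern <- "/ *(-?[0-9]+\\.?[0-9]*); *(-?[0-9]+\\.?[0-9]*)"
m <- regmatches(loc, regexec(pattern, loc))

# Element 1 of each match is the full match, elements 2 and 3 are the groups
lat <- as.numeric(vapply(m, function(x) x[2], character(1)))
lon <- as.numeric(vapply(m, function(x) x[3], character(1)))

# Sanity check: latitudes within [-90, 90], longitudes within [-180, 180]
stopifnot(all(abs(lat) <= 90), all(abs(lon) <= 180))
print(data.frame(lat, lon))
```

Because the pattern requires a semicolon directly after the first number, earlier slashes in the full location strings (those followed by values such as 36.233°N) should not match.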
  6. Plot the locations of the sites on a map
pch <- ifelse(danger_table$criterion == "Natural", 19, 2)
map("world", col = "darkgrey", lwd = 0.5, mar = c(0.1, 0.1, 0.1, 0.1))
points(danger_table$x_coords, danger_table$y_coords, pch = pch)
box()

table(danger_table$criterion)
## 
## Cultural  Natural 
##       36       17
  7. Visualization of time trends
hist(danger_table$year_end, freq=TRUE,
     xlab = "Year when site was put on the list of endangered sites",
     main = "")

duration <- danger_table$year_end - danger_table$year_des
hist(duration, freq = TRUE,
     xlab = "Years it took to become an endangered site",
     main = "")
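
Before reading the second histogram, it may be worth checking the computed durations numerically. The sketch below uses only the six rows printed by head(danger_table) above, so the median is for those rows, not for the full table; the non-negativity check assumes that a site is designated before (or in the same year as) it is listed as endangered.

```r
# year_des and year_end for the first six rows shown by head(danger_table)
year_des <- c(1979, 1991, 1986, 1980, 1979, 2011)
year_end <- c(2001, 1992, 2013, 2013, 2013, 2013)

duration <- year_end - year_des

# Assumption: a negative duration would point to a parsing error
stopifnot(all(duration >= 0))

print(duration)
print(median(duration))
```
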

Individual Assignment

  1. Two or three research questions to be answered
  2. A hyperlink leading to a web page that provides a relevant data set
  3. Your reasoning (explanation) about how the data set can be used to answer your research questions.
  4. Length: one page; due by midnight, Sep. 23rd (next Wednesday)