A UNESCO World Heritage Site is a site inscribed on the World Heritage List maintained by the United Nations Educational, Scientific and Cultural Organization (UNESCO). The World Heritage program aims to catalogue and preserve sites of outstanding cultural or natural importance to the common heritage of humankind.
A World Heritage Site is a landmark or area with legal protection under an international convention administered by UNESCO. Sites are designated for having cultural, historical, scientific or other forms of significance, and are judged to contain “cultural and natural heritage around the world considered to be of outstanding value to humanity.” To be selected, a site must be a unique landmark that is geographically and historically identifiable and has special cultural or physical significance. For example, World Heritage Sites might be ancient ruins or historical structures, buildings, cities, deserts, forests, islands, lakes, monuments, mountains, or wilderness areas. As of June 2020, a total of 1,121 World Heritage Sites (869 cultural, 213 natural, and 39 mixed properties) exist across 167 countries; the three countries with the most sites are China and Italy (55 each) and Spain (48).
This dataset contains spatial data for the 1,121 World Heritage Sites inscribed on the World Heritage List by UNESCO. The data were collected from whc.unesco.org.
Data Exploration
Before the exploratory and explanatory data analysis, we load the libraries needed to support it.
library(lubridate)
library(scales)
library(readr)
library(dplyr)
library(ggplot2)
library(plotly)
library(tidyr)
library(glue)
library(viridis)
library(leaflet)
library(treemapify)
library(skimr)
library(DT)
library(reshape2)
After loading the libraries, we read the data, inspect it, correct the data types, and drop the columns that are not needed for the analysis.
wh <-
read_csv("C:/SyabaruddinFolder/Work/Algoritma/DATAVIZcourse/InteractivePlot/LBB/wh.csv")
whc <- wh %>%
mutate(
category = as.factor(category),
country = as.factor(states_name_en),
region = as.factor(region_en),
date_recorded = year(as.Date(as.character(date_inscribed), format = "%Y")),
name = name_en,
danger = as.factor(danger)
) %>%
select(category,
country,
region,
name,
date_recorded,
danger,
longitude,
latitude) %>%
filter(
region == "Europe and North America" |
region == "Asia and the Pacific" |
region == "Latin America and the Caribbean" |
region == "Africa" |
region == "Arab States"
)
datatable(whc)
Now let us recheck the data type of every column.
glimpse(whc)
## Rows: 1,118
## Columns: 8
## $ category <fct> Cultural, Cultural, Cultural, Cultural, Cultural, Mixed,~
## $ country <fct> "Afghanistan", "Afghanistan", "Albania", "Albania", "Alg~
## $ region <fct> "Asia and the Pacific", "Asia and the Pacific", "Europe ~
## $ name <chr> "Cultural Landscape and Archaeological Remains of the Ba~
## $ date_recorded <dbl> 2003, 2002, 2005, 1992, 1980, 1982, 1982, 1982, 1982, 19~
## $ danger <fct> 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,~
## $ longitude <dbl> 67.825250, 64.515889, 20.133333, 20.026111, 4.786840, 9.~
## $ latitude <dbl> 34.846940, 34.396417, 40.069444, 39.751111, 35.818440, 2~
Great, all the data types are now correct. Next, let us check whether there are any missing values.
colSums(is.na(whc))
## category country region name date_recorded
## 0 0 0 0 0
## danger longitude latitude
## 0 0 0
Great, there are no missing values, so the data set is ready to be analyzed.
Analysis and Visualization
In this section, the visualizations are divided into two parts: a global part and a regional part.
Global
Let us check the World Heritage Sites by Category globally
whcat <- whc %>%
group_by(category) %>%
summarise(freq = n()) %>%
mutate(label = glue("Category: {category}
Number of sites: {freq}")) %>%
ggplot(aes(
y = reorder(category, freq),
x = freq,
text = label
)) +
geom_col(aes(fill = category), show.legend = F) +
scale_fill_viridis(discrete = TRUE) +
scale_color_viridis(discrete = TRUE) +
theme_bw() +
labs(title = "World Heritage Sites by Category Globally",
x = "Number of Sites",
y = "")
ggplotly(whcat, tooltip = "text") %>% layout(showlegend = F)
The graph above shows that most World Heritage Sites are Cultural sites, with 868 sites.
Next, let us take a look at World Heritage Sites by region.
whreg <- whc %>%
group_by(region) %>%
summarise(freq = n()) %>%
mutate(label = glue("Region: {region}
Number of sites: {freq}")) %>%
ggplot(aes(
y = reorder(region, freq),
x = freq,
text = label
)) +
geom_col(aes(fill = region), show.legend = F) +
scale_fill_viridis(discrete = TRUE) +
scale_color_viridis(discrete = TRUE) +
theme_bw() +
labs(title = "World Heritage Sites by Region",
x = "Number of Sites",
y = "")
ggplotly(whreg, tooltip = "text") %>% layout(showlegend = F)
The graph above shows that Europe and North America has the most World Heritage Sites of any region, with 528 sites.
Next, let us take a look at the danger status of World Heritage Sites.
whdang <- whc %>%
group_by(danger) %>%
summarise(freq=n()) %>%
mutate(label=glue(
"Status: {danger}
1 = Danger
0 = Reserved
Number of sites: {freq}
"
)) %>%
ggplot(aes(x=reorder(danger,freq),y=freq, text=label)) +
geom_col(aes(fill=danger)) +
scale_fill_viridis(discrete = TRUE) +
scale_color_viridis(discrete = TRUE) +
theme_bw()+
labs(title = "World Heritage Sites in Danger Globally",
y= "Number of Sites",
x= "")
ggplotly(whdang,tooltip = "text") %>% layout(showlegend=F)
The graph above shows that most of the sites are in safe condition. Only 53 sites are listed as in danger and need attention.
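To put that count in proportion, the share of sites in danger can be computed directly from the `danger` column. The sketch below is a minimal illustration in base R: `danger_share()` is a hypothetical helper that is not part of the original analysis, and `toy` is made-up data standing in for `whc$danger`. The last line redoes the arithmetic with the two figures reported in this document (53 sites in danger out of 1,118 rows).

```r
# Hypothetical helper (not in the original analysis): fraction of sites
# whose danger flag equals "1". Works for factor or character input.
danger_share <- function(danger) {
  mean(as.character(danger) == "1")
}

# Made-up data standing in for whc$danger
toy <- factor(c(1, 0, 0, 0))
toy_share <- danger_share(toy)

# Same arithmetic with the figures reported in this document
reported_pct <- round(100 * 53 / 1118, 1)
```

On the real data, `danger_share(whc$danger)` should correspond to `reported_pct`, which works out to roughly 4.7% of the listed sites.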
Now let us take a look at how many sites were registered as World Heritage per year in each region.
whdate <- whc %>%
group_by(date_recorded,region) %>%
summarise(freq=n()) %>%
mutate(label=glue(
"Year Recorded: {date_recorded}
Region: {region}
Number of Sites: {freq}"
)) %>%
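# group by region so that each region is drawn as its own line
# (group = 1 would join the points of all regions into a single path)
ggplot(aes(x=date_recorded,y=freq,text=label,col=region,group=region))+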
geom_line() + geom_point() +
labs(title="World Heritage Sites Registered by Year",
x="Registration Year",
y="Number of Sites") +
theme_bw()
## `summarise()` has grouped output by 'date_recorded'. You can override using the `.groups` argument.
ggplotly(whdate,tooltip = "text") %>% layout(legend = list(
orientation = "h",
x = 0,
y = -0.3
))
The curve above shows that Europe and North America is above the other regions in the number of sites registered per year. The region with the fewest recorded sites is the Arab States.
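The `summarise()` message printed with the chart code is informational: after grouping by two columns, `summarise()` drops only the last grouping level and says so. A minimal sketch of two ways to get an ungrouped result without the message follows. The `toy` tibble is made-up data standing in for `whc`, and neither variant is what the original analysis ran.

```r
library(dplyr)

# Made-up data standing in for whc
toy <- tibble(
  date_recorded = c(1980, 1980, 1980, 1981),
  region = c("Africa", "Africa", "Arab States", "Africa")
)

# Variant 1: state the grouping of the result explicitly
by_year_region <- toy %>%
  group_by(date_recorded, region) %>%
  summarise(freq = n(), .groups = "drop")

# Variant 2: count() groups, tallies and ungroups in one step
same_counts <- toy %>%
  count(date_recorded, region, name = "freq")
```

Both variants return one row per year and region with the grouping removed, so later steps such as `mutate()` operate on the whole table rather than per year.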
Now let's take a look at the sites on the interactive map below.
Pic <- makeIcon(
iconUrl = "images (1).png",
iconWidth = 100 * 0.35,
iconHeight = 100 * 0.35
)
map <- leaflet()
map <- addTiles(map)
map <- addMarkers(
map,
lng = whc$longitude,
lat = whc$latitude,
popup = whc$name,
clusterOptions = markerClusterOptions(),
icon = Pic
)
map
Regional
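The regional subsections that follow repeat the same count-and-plot pipeline, changing only the region name. As a hedged sketch of how that repetition could be factored out, the hypothetical helper `plot_sites_by_country()` below wraps the pipeline in a function. The helper name, its arguments and the `toy` data are assumptions for illustration and do not appear in the original analysis.

```r
library(dplyr)
library(ggplot2)
library(glue)

# Hypothetical helper: count sites per country within one region and
# return a ggplot bar chart that can be passed to plotly::ggplotly().
plot_sites_by_country <- function(data, region_name) {
  data %>%
    filter(region == region_name) %>%
    count(country, name = "freq") %>%
    mutate(label = glue("Country: {country}\nNumber of sites: {freq}")) %>%
    ggplot(aes(x = freq, y = reorder(country, freq), text = label)) +
    geom_col(aes(fill = country), show.legend = FALSE) +
    theme_bw() +
    labs(
      title = paste("World Heritage Sites in", region_name),
      x = "Number of Sites",
      y = ""
    )
}

# Made-up data standing in for whc
toy <- tibble(
  region  = c("Africa", "Africa", "Africa", "Arab States"),
  country = c("Mali", "Mali", "Kenya", "Egypt")
)
p <- plot_sites_by_country(toy, "Africa")
```

With the real data, a call such as `ggplotly(plot_sites_by_country(whc, "Arab States"), tooltip = "text")` would stand in for one of the per-region chunks, at the cost of losing per-section tweaks such as the viridis palette or the minimum-count filters.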
Arab Region
Let us check the number of World Heritage Sites in Arab region countries.
whregar <- whc %>%
filter(region=="Arab States") %>%
group_by(country) %>%
summarise(freq = n()) %>%
mutate(label = glue(
"Country: {country}
Number of sites: {freq}")) %>%
ggplot(aes(
y = reorder(country, freq),
x = freq,
text = label
)) +
geom_col(aes(fill = country), show.legend = F) +
scale_fill_viridis(discrete = TRUE) +
scale_color_viridis(discrete = TRUE) +
theme_bw() +
labs(title = "World Heritage Sites by Arab Region Countries",
x = "Number of Sites",
y = "")
ggplotly(whregar, tooltip = "text") %>% layout(showlegend = F)
In the Arab region, Morocco has the most World Heritage Sites.
Now let's check the distribution of the sites in the Arab region, as shown in the map below.
Pic <- makeIcon(
iconUrl = "images (1).png",
iconWidth = 100 * 0.35,
iconHeight = 100 * 0.35
)
wharab <-whc %>%
filter(region=="Arab States")
map <- leaflet()
map <- addTiles(map)
map <- addMarkers(
map,
lng = wharab$longitude,
lat = wharab$latitude,
popup = wharab$name,
clusterOptions = markerClusterOptions(),
icon = Pic
)
map
Let us check the World Heritage Sites by category in the Arab region.
whar <- whc %>%
filter(region=="Arab States") %>%
group_by(category,country) %>%
summarise(freq = n()) %>%
mutate(label = glue(
"Category: {category}
Country: {country}
Number of sites: {freq}")) %>%
ggplot(aes(
y = reorder(country, freq),
x = freq,
text = label
)) +
geom_col(aes(fill = category), show.legend = F) +
scale_fill_viridis(discrete = TRUE) +
scale_color_viridis(discrete = TRUE) +
facet_grid(~category)+
theme_bw() +
labs(title = "World Heritage Sites by Category in Arab Region",
x = "Number of Sites",
y = "")
ggplotly(whar, tooltip = "text") %>% layout(showlegend = F)
The graph above shows that most of the sites in the Arab region are Cultural sites.
Next, let us take a look at the danger status of World Heritage Sites in the Arab region.
whdangarab <- whc %>%
filter(region=="Arab States") %>%
group_by(danger,country) %>%
summarise(freq=n()) %>%
mutate(label=glue(
"Status: {danger}
1 = Danger
0 = Reserved
Country : {country}
Number of sites: {freq}
"
)) %>%
ggplot(aes(y=reorder(country,freq),x=freq, text=label)) +
geom_col(aes(fill=danger)) +
facet_grid(~danger)+
scale_fill_viridis(discrete = TRUE) +
scale_color_viridis(discrete = TRUE) +
theme_bw()+
labs(title = "World Heritage Sites Danger Status in Arab Region",
y= "",
x= "Number of Sites")
ggplotly(whdangarab,tooltip = "text") %>% layout(showlegend=F)
Based on the graph above, some sites in the Arab region are in danger. The countries with the most sites in danger are Syria and Libya, which is unsurprising given the armed conflicts in both countries.
Africa
Let us check the number of World Heritage Sites in Africa region countries.
whregaf <- whc %>%
filter(region=="Africa") %>%
group_by(country) %>%
summarise(freq = n()) %>%
mutate(label = glue(
"Country: {country}
Number of sites: {freq}")) %>%
ggplot(aes(
y = reorder(country, freq),
x = freq,
text = label
)) +
geom_col(aes(fill = country), show.legend = F) +
theme_bw() +
labs(title = "World Heritage Sites by Africa Region Countries",
x = "Number of Sites",
y = "")
ggplotly(whregaf, tooltip = "text") %>% layout(showlegend = F)
In the Africa region, South Africa and Ethiopia have the most World Heritage Sites.
Now let's check the distribution of the sites in the Africa region, as shown in the map below.
Pic <- makeIcon(
iconUrl = "images (1).png",
iconWidth = 100 * 0.35,
iconHeight = 100 * 0.35
)
wharafrica <-whc %>%
filter(region=="Africa")
map <- leaflet()
map <- addTiles(map)
map <- addMarkers(
map,
lng = wharafrica$longitude,
lat = wharafrica$latitude,
popup = wharafrica$name,
clusterOptions = markerClusterOptions(),
icon = Pic
)
map
Let us check the World Heritage Sites by category in the Africa region.
wharcat <- whc %>%
filter(region=="Africa") %>%
group_by(category,country) %>%
summarise(freq = n()) %>%
mutate(label = glue(
"Category: {category}
Country: {country}
Number of sites: {freq}")) %>%
ggplot(aes(
y = reorder(country, freq),
x = freq,
text = label
)) +
geom_col(aes(fill = category), show.legend = F) +
facet_grid(~category)+
theme_bw() +
labs(title = "World Heritage Sites by Category in Africa Region",
x = "Number of Sites",
y = "")
ggplotly(wharcat, tooltip = "text") %>% layout(showlegend = F)
The graph above shows a balance between Cultural and Natural sites in the Africa region. Most Natural sites are located in the Democratic Republic of the Congo, while most Cultural sites are located in Ethiopia.
Next, let us take a look at the danger status of World Heritage Sites in the Africa region.
whdangarab <- whc %>%
filter(region=="Africa") %>%
group_by(danger,country) %>%
summarise(freq=n()) %>%
mutate(label=glue(
"Status: {danger}
1 = Danger
0 = Reserved
Country : {country}
Number of sites: {freq}
"
)) %>%
ggplot(aes(y=reorder(country,freq),x=freq, text=label)) +
geom_col(aes(fill=danger)) +
facet_grid(~danger)+
theme_bw()+
labs(title = "World Heritage Sites Danger Status in Africa Region",
y= "",
x= "Number of Sites")
ggplotly(whdangarab,tooltip = "text") %>% layout(showlegend=F)
Based on the graph above, most sites are in safe condition; however, some sites in the Africa region are in danger. The countries with the most sites in danger are the Democratic Republic of the Congo and Mali, which is unsurprising given the armed conflicts in both countries.
Latin America and the Caribbean
Let us check the number of World Heritage Sites in Latin America and the Caribbean region countries.
whlat <- whc %>%
filter(region=="Latin America and the Caribbean") %>%
group_by(country) %>%
summarise(freq = n()) %>%
mutate(label = glue(
"Country: {country}
Number of sites: {freq}")) %>%
ggplot(aes(
y = reorder(country, freq),
x = freq,
text = label
)) +
geom_col(aes(fill = country), show.legend = F) +
scale_fill_viridis(discrete = TRUE) +
scale_color_viridis(discrete = TRUE) +
theme_bw() +
labs(title = "World Heritage Sites by Latin America and the Caribbean Region Countries",
x = "Number of Sites",
y = "")
ggplotly(whlat, tooltip = "text") %>% layout(showlegend = F)
In the Latin America and the Caribbean region, Mexico has the most World Heritage Sites.
Now let's check the distribution of the sites in the Latin America and the Caribbean region, as shown in the map below.
Pic <- makeIcon(
iconUrl = "images (1).png",
iconWidth = 100 * 0.35,
iconHeight = 100 * 0.35
)
whlatin <-whc %>%
filter(region=="Latin America and the Caribbean")
map <- leaflet()
map <- addTiles(map)
map <- addMarkers(
map,
lng = whlatin$longitude,
lat = whlatin$latitude,
popup = whlatin$name,
clusterOptions = markerClusterOptions(),
icon = Pic
)
map
Let us check the World Heritage Sites by category in the Latin America and the Caribbean region.
whlcat <- whc %>%
filter(region=="Latin America and the Caribbean") %>%
group_by(category,country) %>%
summarise(freq = n()) %>%
mutate(label = glue(
"Category: {category}
Country: {country}
Number of sites: {freq}")) %>%
ggplot(aes(
y = reorder(country, freq),
x = freq,
text = label
)) +
geom_col(aes(fill = category), show.legend = F) +
scale_fill_viridis(discrete = TRUE) +
scale_color_viridis(discrete = TRUE) +
facet_grid(~category)+
theme_bw() +
labs(title = "World Heritage Sites by Category in Latin America and the Caribbean Region",
x = "Number of Sites",
y = "")
ggplotly(whlcat, tooltip = "text") %>% layout(showlegend = F)
The graph above shows that most sites in the Latin America and the Caribbean region are Cultural sites, although there are also many Natural sites.
Next, let us take a look at the danger status of World Heritage Sites in the Latin America and the Caribbean region.
whdanglat <- whc %>%
filter(region=="Latin America and the Caribbean") %>%
group_by(danger,country) %>%
summarise(freq=n()) %>%
mutate(label=glue(
"Status: {danger}
1 = Danger
0 = Reserved
Country : {country}
Number of sites: {freq}
"
)) %>%
ggplot(aes(y=reorder(country,freq),x=freq, text=label)) +
geom_col(aes(fill=danger)) +
facet_grid(~danger)+
scale_fill_viridis(discrete = TRUE) +
scale_color_viridis(discrete = TRUE) +
theme_bw()+
labs(title = "World Heritage Sites Danger Status in Latin America and the Caribbean Region",
y= "",
x= "Number of Sites")
ggplotly(whdanglat,tooltip = "text") %>% layout(showlegend=F)
Based on the graph above, most sites in the Latin America and the Caribbean region are in safe condition.
Asia and the Pacific
Let us check the number of World Heritage Sites in Asia and the Pacific region countries.
whas <- whc %>%
filter(region=="Asia and the Pacific") %>%
group_by(country) %>%
summarise(freq = n()) %>%
mutate(label = glue(
"Country: {country}
Number of sites: {freq}")) %>%
ggplot(aes(
y = reorder(country, freq),
x = freq,
text = label
)) +
geom_col(aes(fill = country), show.legend = F) +
theme_bw() +
labs(title = "World Heritage Sites by Asia and the Pacific Region Countries",
x = "Number of Sites",
y = "")
ggplotly(whas, tooltip = "text") %>% layout(showlegend = F)
In the Asia and the Pacific region, China and India have the most World Heritage Sites.
Now let's check the distribution of the sites in the Asia and the Pacific region, as shown in the map below.
Pic <- makeIcon(
iconUrl = "images (1).png",
iconWidth = 100 * 0.35,
iconHeight = 100 * 0.35
)
whasia <-whc %>%
filter(region=="Asia and the Pacific")
map <- leaflet()
map <- addTiles(map)
map <- addMarkers(
map,
lng = whasia$longitude,
lat = whasia$latitude,
popup = whasia$name,
clusterOptions = markerClusterOptions(),
icon = Pic,
label = whasia$name
)
map
Let us check the World Heritage Sites by category in the Asia and the Pacific region.
whapac <- whc %>%
filter(region=="Asia and the Pacific") %>%
group_by(category,country) %>%
summarise(freq = n()) %>%
mutate(label = glue(
"Category: {category}
Country: {country}
Number of sites: {freq}")) %>%
ggplot(aes(
y = reorder(country, freq),
x = freq,
text = label
)) +
geom_col(aes(fill = category), show.legend = F) +
facet_grid(~category)+
theme_bw() +
labs(title = "World Heritage Sites by Category in Asia and the Pacific Region",
x = "Number of Sites",
y = "")
ggplotly(whapac, tooltip = "text") %>% layout(showlegend = F)
The graph above shows a balance between Cultural and Natural sites in the Asia and the Pacific region. China and Australia have the most Natural sites, while China and India have the most Cultural sites.
Next, let us take a look at the danger status of World Heritage Sites in the Asia and the Pacific region.
whasiapac <- whc %>%
filter(region=="Asia and the Pacific") %>%
group_by(danger,country) %>%
summarise(freq=n()) %>%
mutate(label=glue(
"Status: {danger}
1 = Danger
0 = Reserved
Country : {country}
Number of sites: {freq}
"
)) %>%
ggplot(aes(y=reorder(country,freq),x=freq, text=label)) +
geom_col(aes(fill=danger)) +
facet_grid(~danger)+
theme_bw()+
labs(title = "World Heritage Sites Danger Status in Asia and the Pacific Region",
y= "",
x= "Number of Sites")
ggplotly(whasiapac,tooltip = "text") %>% layout(showlegend=F)
Based on the graph above, most sites are in safe condition.
Europe and North America
Let us check the number of World Heritage Sites in the countries of the Europe and North America region. The chart shows only countries with more than five sites.
whe <- whc %>%
filter(region=="Europe and North America") %>%
group_by(country) %>%
summarise(freq = n()) %>%
filter(freq>5) %>%
mutate(label = glue(
"Country: {country}
Number of sites: {freq}")) %>%
ggplot(aes(
y = reorder(country, freq),
x = freq,
text = label
)) +
geom_col(aes(fill = country), show.legend = F) +
theme_bw() +
labs(title = "World Heritage Sites by Europe and North America Region Countries",
x = "Number of Sites",
y = "")
ggplotly(whe, tooltip = "text") %>% layout(showlegend = F)
In the Europe and North America region, Italy, Spain, France and Germany have the most World Heritage Sites.
Now let's check the distribution of the sites in the Europe and North America region, as shown in the map below.
Pic <- makeIcon(
iconUrl = "images (1).png",
iconWidth = 100 * 0.35,
iconHeight = 100 * 0.35
)
wheu <-whc %>%
filter(region=="Europe and North America")
map <- leaflet()
map <- addTiles(map)
map <- addMarkers(
map,
lng = wheu$longitude,
lat = wheu$latitude,
popup = wheu$name,
clusterOptions = markerClusterOptions(),
icon = Pic
)
map
Let us check the World Heritage Sites by category in the Europe and North America region.
wheuro <- whc %>%
filter(region=="Europe and North America") %>%
group_by(category,country) %>%
summarise(freq = n()) %>%
filter(freq>3) %>%
mutate(label = glue(
"Category: {category}
Country: {country}
Number of sites: {freq}")) %>%
ggplot(aes(
y = reorder(country, freq),
x = freq,
text = label
)) +
geom_col(aes(fill = category), show.legend = F) +
facet_grid(~category)+
theme_bw() +
labs(title = "World Heritage Sites by Category in Europe and North America Region",
x = "Number of Sites",
y = "")
ggplotly(wheuro, tooltip = "text") %>% layout(showlegend = F)
The graph above shows that most sites in the Europe and North America region are Cultural sites. Italy, Spain, Germany and France have the most Cultural sites.
Next, let us take a look at the danger status of World Heritage Sites in the Europe and North America region.
wheurope <- whc %>%
filter(region=="Europe and North America") %>%
group_by(danger,country) %>%
summarise(freq=n()) %>%
filter(freq>1) %>%
mutate(label=glue(
"Status: {danger}
1 = Danger
0 = Reserved
Country : {country}
Number of sites: {freq}
"
)) %>%
ggplot(aes(y=reorder(country,freq),x=freq, text=label)) +
geom_col(aes(fill=danger)) +
facet_grid(~danger)+
theme_bw()+
labs(title = "World Heritage Sites Danger Status in Europe and North America Region",
y= "",
x= "Number of Sites")
ggplotly(wheurope,tooltip = "text") %>% layout(showlegend=F)
Based on the graph above, most sites are in safe condition.