1. Introduction

Over the past few years there have been many kinds of armed conflict around the world, especially in the Middle East. Non-profit organizations such as ACLED collect data on these conflicts and publish it for the public to download, making it possible to observe how armed conflicts change over time.

This data visualization assignment focuses on one country, Iraq. The main aim is to understand how the various kinds of armed conflict are spread across the regions of Iraq and the total fatalities that have arisen from them. In addition, the changes in total fatalities and in the frequency of armed conflicts between 2016 and 2019 are examined.

There are six sub-categories (event types) of armed conflict: Protests, Strategic developments, Riots, Battles, Violence against civilians, and Explosions/Remote violence.

2. R Packages needed to be loaded

These are the R packages that need to be loaded to produce the visualizations:

library(ggpubr)
## Loading required package: ggplot2
library(ggplot2)
library(readr)
library(ggalt)
## Registered S3 methods overwritten by 'ggalt':
##   method                  from   
##   grid.draw.absoluteGrob  ggplot2
##   grobHeight.absoluteGrob ggplot2
##   grobWidth.absoluteGrob  ggplot2
##   grobX.absoluteGrob      ggplot2
##   grobY.absoluteGrob      ggplot2
library(ggcorrplot)
library(ggthemes)
library(plotly)
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout
library(purrr)
library(readr)
library(readxl)
library(stringr)
library(tibble)
library(tidyr)
library(tidyverse)
## -- Attaching packages -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- tidyverse 1.3.0 --
## v dplyr   0.8.3     v forcats 0.4.0
## -- Conflicts ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks plotly::filter(), stats::filter()
## x dplyr::lag()    masks stats::lag()
library(viridis)
## Loading required package: viridisLite
library(viridisLite)
library(sf)
## Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1
library(XML)
library(tmap)
library(dygraphs)
library(xts)
## Loading required package: zoo
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
## 
## Attaching package: 'xts'
## The following objects are masked from 'package:dplyr':
## 
##     first, last
library(lubridate)
## 
## Attaching package: 'lubridate'
## The following objects are masked from 'package:dplyr':
## 
##     intersect, setdiff, union
## The following objects are masked from 'package:base':
## 
##     date, intersect, setdiff, union
library(RColorBrewer)

3. Datasets loaded, preparation and challenges

  1. Data extraction from ACLED (https://acleddata.com/data-export-tool/). Two sets of data were extracted and downloaded from the ACLED website. The first was the Iraq.csv file, which contains information for the year 2019 only. The second was the IRAQ5YEARS.csv file, which contains data from 2016 to 2019.

  2. The shape file for the map of Iraq was downloaded from: https://gadm.org/download_country_v3.html

However, the names of the regions in the shape file were different from those in the csv files downloaded from ACLED. Therefore, the regions in the csv files had to be mapped manually to the regions in the shape file. The mapping is as follows (regions in the csv files from ACLED versus regions in the shape file):

CSV File (ACLED)    Shape File
Wassit              Wasit
Qadissiya           Al-Qadisiyah
Sala al-Din         Sala ad-Din
Babylon             Babil
Thi-Qar             Dhi-Qar
Erbil               Arbil
Dahuk               Dihok
Najaf               An-Najaf
Anbar               Al-Anbar
Baghdad             Baghdad
Missan              Maysan
Kerbala             Karbala’
Basrah              Al-Basrah
Ninewa              Ninawa
Muthanna            Al-Muthannia
Kirkuk              At-Ta’mim
Diyala              Diyala
Sulaymaniyah        As-Sulaymaniyah

A new column called ‘connect’ was created in the csv files. It contains the matching region name from the shape file for each row, and it is used as the key to join the shape file and the csv files together.
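
The report does not show how the ‘connect’ column was filled in, so the following is only a minimal sketch of one way to do it in R with a lookup table. The column name `admin1`, the object names and the toy rows are assumptions made for illustration; only the name pairs come from the mapping table above.

```r
library(dplyr)
library(tibble)

# Lookup table: ACLED spelling -> shape file spelling
# (a few pairs copied from the mapping table; the full table would list all 18)
region_lookup <- tribble(
  ~acled_name, ~connect,
  "Wassit",    "Wasit",
  "Erbil",     "Arbil",
  "Babylon",   "Babil",
  "Baghdad",   "Baghdad"
)

# Toy stand-in for the ACLED extract; `admin1` is an assumed column name
acled_sample <- tibble(
  admin1     = c("Erbil", "Wassit", "Baghdad", "Erbil"),
  fatalities = c(3, 0, 1, 2)
)

# Attach the shape file spelling to every event row
acled_with_connect <- acled_sample %>%
  left_join(region_lookup, by = c("admin1" = "acled_name"))

acled_with_connect
```

Keeping the mapping in a small table rather than editing the csv by hand makes it easier to re-run the preparation when a new extract is downloaded.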

#Loading of datasets

#Iraq 2019 data
IRAQALL <- read_csv("C:/Users/User/Desktop/ASSIGNMENT 5/Iraq.csv")
## Parsed with column specification:
## cols(
##   .default = col_character(),
##   data_id = col_double(),
##   iso = col_double(),
##   event_id_no_cnty = col_double(),
##   year = col_double(),
##   time_precision = col_double(),
##   inter1 = col_double(),
##   inter2 = col_double(),
##   interaction = col_double(),
##   geo_precision = col_double(),
##   fatalities = col_double(),
##   timestamp = col_double()
## )
## See spec(...) for full column specifications.
#Iraq 2016 to 2019 data
IRAQ5YEARS <- read_csv("C:/Users/User/Desktop/ASSIGNMENT 5/IRAQ5YEARS.csv")
## Parsed with column specification:
## cols(
##   .default = col_character(),
##   data_id = col_double(),
##   iso = col_double(),
##   event_id_no_cnty = col_double(),
##   year = col_double(),
##   time_precision = col_double(),
##   inter1 = col_double(),
##   inter2 = col_double(),
##   interaction = col_double(),
##   latitude = col_double(),
##   longitude = col_double(),
##   geo_precision = col_double(),
##   fatalities = col_double(),
##   timestamp = col_double()
## )
## See spec(...) for full column specifications.
#Iraq Shape Map File
IRAQMAP <- st_read(dsn = "C:/Users/User/Desktop/IRAQSHAPE", layer = "gadm36_IRQ_1")
## Reading layer `gadm36_IRQ_1' from data source `C:\Users\User\Desktop\IRAQSHAPE' using driver `ESRI Shapefile'
## Simple feature collection with 18 features and 10 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: 38.79684 ymin: 28.98695 xmax: 48.56856 ymax: 37.37804
## geographic CRS: WGS 84

4. Data design and challenges

The data downloaded from ACLED was in csv/excel format, which made it hard to see how the different kinds of armed conflict and the total fatalities were distributed across Iraq and across time. Therefore, 4 visualizations have been plotted below to give viewers a better snapshot of the armed conflicts in Iraq by region and across time. The 4 visualizations are:

  1. Map of the total fatalities per region in 2019
  2. Maps of the total frequency of the various armed conflicts by region
  3. Stacked Area Graph of the total fatalities by event type from 2016 to 2019
  4. Stacked Area Graph of the total frequency by event type from 2016 to 2019

A map view makes it easier to visualize the total count of, and deaths from, armed conflicts in each region, while a stacked area graph over time helps users to determine the trend.

5. Sketched design

6. Map of the total fatalities in 2019 in Iraq by regions

6a Data Wrangling

The data needed to be manipulated so that total fatalities were summed by region and then joined with the shape file.

#Keep only the region (connect) and fatalities columns
IRAQFILTER1 <- IRAQALL %>% select(connect, fatalities)


#Group by region and sum the fatalities in each region (missing data means 0)
IRAQTOTALFATALITIES <- IRAQFILTER1 %>% group_by(connect) %>% summarise(TotalFatalities = sum(fatalities))


#Join Data and the Map Details
IRAQTOTALFATALITIESMAP <- left_join(IRAQMAP, IRAQTOTALFATALITIES,  by = c("NAME_1" = "connect"))
## Warning: Column `NAME_1`/`connect` joining factor and character vector, coercing
## into character vector
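
Because the join depends on the manually mapped names, it can help to check that every region in the summary finds a partner in the shape file. The sketch below is not part of the original analysis; it uses toy tables (object names are hypothetical, and one name is deliberately left unmapped) to show how `anti_join()` could surface mismatches.

```r
library(dplyr)
library(tibble)

# Toy stand-in for the NAME_1 column of the shape file
shape_regions <- tibble(NAME_1 = c("Arbil", "Diyala", "Maysan"))

# Toy stand-in for the summarised ACLED data;
# "Missan" is deliberately left in its ACLED spelling to simulate a missed mapping
fatalities_by_region <- tibble(
  connect         = c("Arbil", "Diyala", "Missan"),
  TotalFatalities = c(448, 474, 19)
)

# Rows of the summary that have no matching region in the shape file
unmatched <- anti_join(fatalities_by_region, shape_regions,
                       by = c("connect" = "NAME_1"))

unmatched$connect
```

An empty result would suggest that all region names line up; any name listed here would appear as a blank region on the map after the `left_join()`.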

6b Map Visualization

The map below shows total fatalities in Iraq by region in 2019. The 3 regions with the highest fatalities are Diyala (474 fatalities), Sala ad-Din (462 fatalities) and Arbil (448 fatalities), whereas regions such as Maysan (19 fatalities) and As-Sulaymaniyah (18 fatalities) have far fewer. Clicking on a region displays the number of fatalities for that region.

tmap_mode("view")
## tmap mode set to interactive viewing
tm_shape(IRAQTOTALFATALITIESMAP)+ tm_fill("TotalFatalities", palette = "Greens") +tm_borders(alpha = 0.5) + tm_text("NAME_1") + tmap_style("classic")
## tmap style set to "classic"
## other available styles are: "white", "gray", "natural", "cobalt", "col_blind", "albatross", "beaver", "bw", "watercolor"

7. Maps of the frequency of the different kinds of armed conflicts in the various regions in Iraq in 2019

7a Data Wrangling

The data needed to be manipulated so that the count/frequency of each event type was calculated by region and then joined with the shape file.

#Grouping the data by location and event type and to sum up the total frequency
IRAQ2A <- IRAQALL   %>% group_by(`connect`,`event_type`)%>% summarise (freq = n())


#Join Data and the Map Details
IRAQEVENTMAP <- left_join(IRAQMAP, IRAQ2A,  by = c("NAME_1" = "connect"))
## Warning: Column `NAME_1`/`connect` joining factor and character vector, coercing
## into character vector
#Filter data for each of the 6 events
IRAQPROTEST2019 <- IRAQEVENTMAP %>% filter (`event_type` == "Protests")
IRAQSTRATEGICDEVELOPMENTS2019 <- IRAQEVENTMAP %>% filter (`event_type` == "Strategic developments")
IRAQRIOTS2019 <- IRAQEVENTMAP %>% filter (`event_type` == "Riots")
IRAQBATTLES2019 <- IRAQEVENTMAP %>% filter (`event_type` == "Battles")
IRAQVIOLENCECIVILIANS2019 <- IRAQEVENTMAP %>% filter (`event_type` == "Violence against civilians")
IRAQEXPLOSIONS2019 <- IRAQEVENTMAP %>% filter (`event_type` == "Explosions/Remote violence")


# Creating the map view for each event type
tmap_mode("view")
## tmap mode set to interactive viewing
PROTEST <- tm_shape(IRAQPROTEST2019)+ tm_fill("freq", palette = "Greens") +tm_borders(alpha = 0.5) + tm_text("NAME_1") + tmap_style("classic")
## tmap style set to "classic"
## other available styles are: "white", "gray", "natural", "cobalt", "col_blind", "albatross", "beaver", "bw", "watercolor"
tmap_mode("view")
## tmap mode set to interactive viewing
STRATEGICDEVELOPMENT <- tm_shape(IRAQSTRATEGICDEVELOPMENTS2019)+ tm_fill("freq", palette = "Greens") +tm_borders(alpha = 0.5) + tm_text("NAME_1") + tmap_style("classic")
## tmap style set to "classic"
## other available styles are: "white", "gray", "natural", "cobalt", "col_blind", "albatross", "beaver", "bw", "watercolor"
tmap_mode("view")
## tmap mode set to interactive viewing
RIOTS <- tm_shape(IRAQRIOTS2019 )+ tm_fill("freq", palette = "Greens") +tm_borders(alpha = 0.5) + tm_text("NAME_1") + tmap_style("classic")
## tmap style set to "classic"
## other available styles are: "white", "gray", "natural", "cobalt", "col_blind", "albatross", "beaver", "bw", "watercolor"
tmap_mode("view")
## tmap mode set to interactive viewing
BATTLES <- tm_shape(IRAQBATTLES2019 )+ tm_fill("freq", palette = "Greens") +tm_borders(alpha = 0.5) + tm_text("NAME_1") + tmap_style("classic")
## tmap style set to "classic"
## other available styles are: "white", "gray", "natural", "cobalt", "col_blind", "albatross", "beaver", "bw", "watercolor"
tmap_mode("view")
## tmap mode set to interactive viewing
VIOLENCECIVILIANS <- tm_shape(IRAQVIOLENCECIVILIANS2019 )+ tm_fill("freq", palette = "Greens") +tm_borders(alpha = 0.5) + tm_text("NAME_1") + tmap_style("classic")
## tmap style set to "classic"
## other available styles are: "white", "gray", "natural", "cobalt", "col_blind", "albatross", "beaver", "bw", "watercolor"
tmap_mode("view")
## tmap mode set to interactive viewing
EXPLOSIONS <- tm_shape(IRAQEXPLOSIONS2019 )+ tm_fill("freq", palette = "Greens") +tm_borders(alpha = 0.5) + tm_text("NAME_1") + tmap_style("classic")
## tmap style set to "classic"
## other available styles are: "white", "gray", "natural", "cobalt", "col_blind", "albatross", "beaver", "bw", "watercolor"
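
The six near-identical filter steps above could also be expressed by splitting the joined data on `event_type`. The following is a minimal sketch of that idea on a plain toy table (no geometry column, made-up counts and a hypothetical object name), offered as an alternative rather than as the method used in this report.

```r
library(tibble)

# Toy stand-in for the region-by-event-type frequency table
event_counts <- tibble(
  connect    = c("Baghdad", "Baghdad", "Diyala", "Arbil"),
  event_type = c("Protests", "Riots", "Battles", "Battles"),
  freq       = c(12L, 4L, 30L, 25L)
)

# One data frame per event type, held in a named list
by_event <- split(event_counts, event_counts$event_type)

names(by_event)
by_event[["Battles"]]
```

Each list element could then be passed to the same map-building code, which avoids repeating the plotting call once per event type.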

7b Map Visualization

(i) Protest and Strategic Development

The map on the left shows protests; regions such as Al-Basrah, Al-Muthannia and Al-Qadisiyah have a higher number of protests. The map on the right shows strategic developments; regions such as Al-Anbar, Ninawa, Diyala and At-Ta’mim have a higher number of strategic developments. Clicking on a region displays its frequency count.

# Using tmap_arrange to arrange two maps of two events side by side 
tmap_arrange(PROTEST, STRATEGICDEVELOPMENT, asp=1, ncol=2)

(ii) Riots and battles

The map on the left shows riots; regions such as Al-Basrah, Dhi-Qar and Baghdad have a higher number of riots. The map on the right shows battles; regions such as Arbil and Diyala have a higher number of battles. Clicking on a region displays its frequency count.

# Using tmap_arrange to arrange two maps of two events side by side 
tmap_arrange(RIOTS, BATTLES, asp=1, ncol=2)

(iii) Violence against civilians and explosions

The map on the left shows violence against civilians; regions such as Diyala, Ninawa and Baghdad have a higher number of such events. The map on the right shows explosions; regions such as Dihok, Diyala and Arbil have a higher number of explosions. Clicking on a region displays its frequency count.

# Using tmap_arrange to arrange two maps of two events side by side 
tmap_arrange(VIOLENCECIVILIANS, EXPLOSIONS, asp=1, ncol=2)

8. Stacked Area Chart to show total fatalities by event type from 2016 to 2019

The data needed to be manipulated so that total fatalities were calculated by event type and year. The result then had to be restructured with the ‘spread’ function so that each event type becomes its own column, which is the layout used as input for the dygraph.

8a Data Wrangling

#Keep only the columns connect, fatalities, event_type and year
IRAQ5YEARSFILTER <- IRAQ5YEARS %>% select(connect, fatalities, event_type, year)

#Group by year and event type, then sum the fatalities
IRAQ5YEARSFILTERGROUPBY <- IRAQ5YEARSFILTER %>% group_by(year, event_type) %>% summarise(TotalFatalities = sum(fatalities))

#Spreading of data
IRAQ5YEARSFILTERGROUPBYSPREAD <- IRAQ5YEARSFILTERGROUPBY  %>% spread (event_type, TotalFatalities)
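
To illustrate what the spreading step does, here is a small sketch on made-up numbers (the values and object names are hypothetical). It also passes `fill = 0`, which is an assumption not used in the original code, so that year and event type combinations with no rows show 0 instead of NA.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Long format: one row per year and event type (toy values)
long_totals <- tibble(
  year            = c(2016, 2016, 2017, 2017),
  event_type      = c("Battles", "Riots", "Battles", "Protests"),
  TotalFatalities = c(100, 5, 60, 0)
)

# Wide format: one row per year, one column per event type
wide_totals <- long_totals %>%
  spread(event_type, TotalFatalities, fill = 0)

wide_totals
```

The wide table has one row per year and one column per event type, so each event type can be drawn as a separate series.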

8b Data Visualization

The stacked area graph shows that total fatalities have decreased significantly from 2016 onwards. It also shows that the greatest contributions to total fatalities come from battles and explosions/remote violence. Hovering over the graph at a given year (e.g. 2016) updates the legend at the top to show the total number of fatalities due to each event type. The range selector below the graph lets users adjust the time period that the graph reflects.

IRAQSTACKEDLINEGRAPH <- dygraph(IRAQ5YEARSFILTERGROUPBYSPREAD, main = "Armed Conflict Profile in Iraq (Total Fatalities)", ylab = "Total Fatalities", xlab = "Year", width = "750", height = "500") %>%
  dyHighlight(highlightCircleSize = 5, highlightSeriesBackgroundAlpha = 0.2, highlightSeriesOpts = list(strokeWidth = 1.5), hideOnMouseOut = FALSE) %>%
  dyLegend(width = 550) %>%
  dyOptions(stackedGraph = TRUE, colors = RColorBrewer::brewer.pal(n = 6, "Accent")) %>%
  dyRangeSelector(height = 10)

IRAQSTACKEDLINEGRAPH

9. Stacked Area Chart to show frequency of armed conflicts by event type from 2016 to 2019

9a Data Wrangling

The data needed to be manipulated so that the frequency count was calculated by event type and year. The result then had to be restructured with the ‘spread’ function so that it could be used as input for the dygraph.

#Grouping by year, event type and the frequency count
IRAQ5YEARSFILTERGROUPBY2 <- IRAQ5YEARS  %>% group_by(`year`,`event_type`)%>% summarise (freq = n())


#Spreading of data
IRAQ5YEARSFILTERGROUPBYSPREAD2 <- IRAQ5YEARSFILTERGROUPBY2  %>% spread (event_type, freq)

9b Data Visualization

The stacked area graph shows that the frequency of armed conflicts has decreased significantly from 2016 onwards. It also shows that the event types with the highest frequencies are battles and explosions/remote violence. Hovering over the graph at a given year (e.g. 2016) updates the legend at the top to show the frequency count of each event type. The range selector below the graph lets users adjust the time period that the graph reflects.

IRAQSTACKEDLINEGRAPH2 <- dygraph(IRAQ5YEARSFILTERGROUPBYSPREAD2, main = "Armed Conflict Profile in Iraq (Frequency Count)", ylab = "Frequency", xlab = "Year", width = "750", height = "500") %>%
  dyHighlight(highlightCircleSize = 5, highlightSeriesBackgroundAlpha = 0.2, highlightSeriesOpts = list(strokeWidth = 1.5), hideOnMouseOut = FALSE) %>%
  dyLegend(width = 550) %>%
  dyOptions(stackedGraph = TRUE, colors = RColorBrewer::brewer.pal(n = 6, "Accent")) %>%
  dyRangeSelector(height = 10)

IRAQSTACKEDLINEGRAPH2

10. Conclusion

In conclusion, the visualizations above show that both the frequency of armed conflicts in Iraq and the fatalities arising from them decreased between 2016 and 2019, which is a positive sign. Detailed insights for each visualization are given under the respective visualizations.