DATA 110 Project 2

Author

Iris Wu

Banksy street art in Ukraine

Introduction

For this project, I chose a dataset from the Uppsala Conflict Data Program (UCDP). This dataset compiles all violent events that occurred from 1989 to 2024. UCDP defines an “event” as an incident of violence that results in at least 1 direct death. Most of the events in the dataset belong to dyads. A dyad is the conflict in which an event occurs. The Russo-Ukrainian war would be classified as a dyad, while a drone strike on Kyiv would be an event within the dyad. Each dyad is categorized by type of violence. In state-based conflicts, one of the actors involved must be the government of a state. In non-state conflicts, both sides are organized armed groups, and in one-sided violence, a government or organized armed group perpetrates violence, typically against civilians, resulting in at least 25 deaths in one calendar year. Side A and Side B in the dataset signify the actors involved in the conflict. In a state-based conflict, for example, Side A would be the government, and Side B would be the rebel force or opposing government.

The variables I focused on in my plots were type of violence and deaths. Since type of violence was assigned numerical values in the original dataset, I used the gsub function to convert them into characters. To better measure the impact of each event, I created a new variable, total deaths, which is the sum of deaths from both sides, civilian deaths, and unknown deaths. Deaths are categorized as “unknown” if it is unclear whether the person who died is a civilian or a member of either side. I decided to focus on events from 2000 to 2024 because they feel more current and relevant to my generation. While my plots show data for each region in the world, I chose to zoom in on Europe for my map. In the 21st century, Europe is known for being relatively peaceful, compared to the Middle East or Africa, but it is not without conflict. As an international studies major, I have also been following the Russo-Ukrainian War, and I took this project as an opportunity to map the violence that has occurred in this conflict.
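As a minimal sketch of how the total deaths variable is built, the toy data frame below adds the four death columns for each event. The column names match the dataset, but the numbers are invented for illustration and are not UCDP records. The sketch uses base R rather than the tidyverse so that it runs on its own. One caveat worth keeping in mind: in R, adding a missing value (NA) to a number returns NA, so a row with a missing count in any of the four columns would have a missing total.

```r
# Toy events with invented counts (not real UCDP data)
toy <- data.frame(
  deaths_a = c(2, 0, 5),
  deaths_b = c(1, 0, 3),
  deaths_civilians = c(0, 4, 10),
  deaths_unknown = c(0, 1, 2)
)

# Same arithmetic as the project: add the four death columns per event
toy$total_deaths <- toy$deaths_a + toy$deaths_b +
  toy$deaths_civilians + toy$deaths_unknown

print(toy$total_deaths)
```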

Reading the current headlines, it is easy to grow numb to all the conflicts happening in the world. Through this project, I hope to visually highlight the impact of war, as a reminder to myself of the real human lives that are affected in every conflict.

Load the libraries and dataset

library(tidyverse)
library(leaflet)
library(highcharter)
library(knitr) 
setwd("C:/Users/rsaidi/Downloads")
#setwd("C:/Users/iwu80/OneDrive/Documents/Files/School/DATA 110 R Assignments")
conflicts <- read_csv("GEDEvent_v25_1.csv")

Clean the dataset

#create a new column for total deaths
conflicts2 <- conflicts |>
  mutate(total_deaths = deaths_a + deaths_b + deaths_civilians + deaths_unknown) |>
  #filter for year 2000 and later
  filter(year >= 2000) |>
  #keep only the necessary variables
  select(year, type_of_violence, conflict_name, dyad_name, side_a, side_b,
         source_article, latitude, longitude, country, region, where_description,
         date_start, date_end, deaths_a, deaths_b, deaths_civilians,
         deaths_unknown, total_deaths)
#label the types of violence 
conflicts2$type_of_violence <- gsub("1", "State-based conflict", conflicts2$type_of_violence) 
conflicts2$type_of_violence <- gsub("2", "Non-state conflict", conflicts2$type_of_violence)
conflicts2$type_of_violence <- gsub("3", "One-sided violence", conflicts2$type_of_violence)
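The labeling step can also be written with a named lookup vector, which maps every code to its label in a single pass. This is only a sketch of an alternative in base R, not the method used in this project; the names `codes` and `type_labels` are hypothetical, and the codes are made up. One possible advantage is that no substitution can accidentally match text produced by an earlier substitution.

```r
# Hypothetical numeric codes as they might appear in type_of_violence
codes <- c(1, 3, 2, 1)

# Named lookup vector: names are the codes, values are the labels
type_labels <- c(
  "1" = "State-based conflict",
  "2" = "Non-state conflict",
  "3" = "One-sided violence"
)

# Index the lookup by the code (as character); unname() drops the names
type_of_violence <- unname(type_labels[as.character(codes)])
print(type_of_violence)
```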

Visualizations

Plot 1

#group the dataset by region
byregion <- conflicts2 |>
  group_by(region, year) |>
  summarize(sum_totaldeaths = sum(total_deaths), .groups = "drop") |>  # <- initially I did mean(total_deaths), but the values seemed too low, so I double-checked with Google Gemini (2.5 Flash). According to Gemini, mean(total_deaths) would give me the average death toll per violent event within each region for a specific year. The correct function, sum(), would give me total deaths for all conflicts in a region per year. See bottom of document for specific prompt used. 
  arrange(desc(sum_totaldeaths))
head(byregion)
# A tibble: 6 × 3
  region       year sum_totaldeaths
  <chr>       <dbl>           <dbl>
1 Africa       2022          188826
2 Africa       2021          149325
3 Middle East  2014          100397
4 Europe       2022           92992
5 Middle East  2013           81500
6 Middle East  2015           81471
ggplot(byregion, aes(x = year, y = region, fill = sum_totaldeaths)) +
  geom_tile() +
  labs(title = "Total Conflict-Related Deaths Per Region, 2000-2024", x = "Year", y = "Region", caption = "Uppsala Conflict Data Program") +
  #I used this site to find the color hex codes: https://htmlcolorcodes.com/colors/shades-of-red/
  scale_fill_gradientn(name = "Total Deaths", colors = c("#FAA0A0", "#FF3131", "#D22B2B",  "#C04000", "#880808")) +
  scale_x_continuous(breaks = seq(2000, 2024, by = 3)) +
  theme_minimal(base_family = "serif") +
  geom_vline(xintercept = 2011, col = "black")+
  geom_vline(xintercept = 2022, col = "navyblue") +
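  #annotate() draws each label once; geom_text() with aes() would redraw it for every row of byregion
  annotate("text", x = 2008, y = "Middle East", label = "Syrian Civil War \n Begins", size = 3) +
  annotate("text", x = 2019, y = "Europe", label = "Russia invades \n Ukraine", size = 3, colour = "navyblue") +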
  theme(plot.title = element_text(hjust = .5, face = "bold")) +
  theme(plot.caption = element_text(hjust = .5, vjust = -1.5, face = "italic")) +
  theme(axis.title.x = element_text(vjust = -1))

Plot 2

bytype <- conflicts2 |>
  group_by(region, year) |>
  count(type_of_violence, name = "number of violent events")
head(bytype)
# A tibble: 6 × 4
# Groups:   region, year [2]
  region  year type_of_violence     `number of violent events`
  <chr>  <dbl> <chr>                                     <int>
1 Africa  2000 Non-state conflict                          138
2 Africa  2000 One-sided violence                          366
3 Africa  2000 State-based conflict                        750
4 Africa  2001 Non-state conflict                           96
5 Africa  2001 One-sided violence                          272
6 Africa  2001 State-based conflict                        489
ggplot(bytype, aes(x = year, y = `number of violent events`, fill = type_of_violence)) +
  facet_grid(~region, scales = "free_y") +
  geom_col(position = "stack") +
  geom_vline(xintercept = 2001, col = "white", lty = 3) +
  scale_fill_manual(name = "Type of Violence", values = c("#F88379", "#DC143C", "#8B0000")) +
  labs(title = "Violent Events per Region, 2000-2024", subtitle = "Dotted line marks start of Global War on Terrorism", x = "Year", y = "Number of Violent Events", caption = "Source: Uppsala Conflict Data Program") +
  scale_x_continuous(breaks = seq(2000, 2024, by = 4)) +
  theme_dark(base_family = "serif") +
  theme(plot.title = element_text(hjust = .5, face = "bold")) +
  theme(plot.caption = element_text(hjust = .5, face = "italic", vjust = -2)) +
  theme(axis.title.x = element_text(vjust = -1.5)) +
  theme(axis.text.x = element_text(angle = 55, vjust = .5)) +
  theme(plot.subtitle = element_text(hjust = .5, face = "italic"))

Plot 3

bywar <- conflicts2 |>
  group_by(dyad_name, region) |>
  summarize(sum_civdeaths = sum(deaths_civilians), .groups = "drop") |>
  arrange(desc(sum_civdeaths)) |>
  slice_head(n = 5)
head(bywar)
# A tibble: 5 × 3
  dyad_name                                                 region sum_civdeaths
  <chr>                                                     <chr>          <dbl>
1 Government of Syria - Syrian insurgents                   Middl…         84317
2 Government of Israel - Hamas                              Middl…         29451
3 Government of Russia (Soviet Union) - Government of Ukra… Europe         23115
4 IS - Civilians                                            Middl…         21006
5 IS - Civilians                                            Africa         19184
#shorten the names of each conflict
bywar$dyad_name <- gsub("Government of Syria - Syrian insurgents", "Syrian Civil War", bywar$dyad_name) 
bywar$dyad_name <- gsub("Government of Israel - Hamas", "Gaza War", bywar$dyad_name)  
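#the dyad is stored under its full name, so match on that; the parentheses are escaped because gsub reads the pattern as a regular expression
bywar$dyad_name <- gsub("^Government of Russia \\(Soviet Union\\) - Government of Ukra.*$", "Ukraine War", bywar$dyad_name)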
bywar$dyad_name <- gsub("IS - Civilians", "ISIS", bywar$dyad_name)
#set the palette: 
colors <- c("#800000","#C04000", "#FF3131")
highchart() |>
  hc_add_series(data = bywar, type = "column", hcaes( x = dyad_name, y = sum_civdeaths, group = region)) |>
  hc_colors(colors) |>
  hc_title(text = "Top 5 Conflicts by Civilian Deaths, 2000-2024", style = list(fontWeight = "bold")) |>
  #the x-axis categories were not showing up, so I used this answer in Stack Overflow to add "type = category": https://stackoverflow.com/questions/65475811/highcharter-r-x-axis-values-not-showing-up-wrong-text-on-y-axis 
  hc_xAxis(type = "category", title = list(text = "Name of Conflict")) |>
  hc_yAxis(title = list(text = "Civilian Deaths")) |>
  hc_caption(text = "Source: Uppsala Conflict Data Program", align = "center") |>
  hc_chart(style = list(fontFamily = "serif")) |>
  hc_tooltip(borderColor = "black", pointFormat = "{point.dyad_name} <br> Civilian Deaths: {point.sum_civdeaths}") |>
  #I used this answer in Stack Overflow to adjust the width of the bars: https://stackoverflow.com/questions/58532263/set-the-thickness-of-highcharter-column-chart-columns-in-r
  hc_plotOptions(column = list(pointWidth = 50))

Map

#filter the data for Europe
euro <- conflicts2 |>
  filter(region == "Europe") |>
  #filter for total deaths greater than or equal to 25
  filter(total_deaths >= 25) |>
  #filter out NA values in where_description 
  filter(!is.na(where_description))
#set the latitude and longitude for Ukraine 
euro_lat <- 48.3794
euro_long <- 31.1656
#create a popup 
popupeuro <- paste0("<b>Year: </b>", euro$year, "<br>", "<b>Conflict Name: </b>", euro$dyad_name, "<br>", "<b>Country: </b>", euro$country, "<br>", "<b>Type of Violence: </b>", euro$type_of_violence, "<br>", "<b>Location within Country: </b>", euro$where_description, "<br>", "<b>Start Date: </b>", euro$date_start, "<br>", "<b>End Date: </b>", euro$date_end, "<br>", "<b>Civilian Deaths: </b>", euro$deaths_civilians,"<br>", "<b>Total Deaths: </b>", euro$total_deaths, "<br>")
#set the legend palette 
colors <- colorNumeric(palette = c("#FF3131", "#D2042D", "#702963", "#770737", "#4A0404"), domain = euro$total_deaths)
#create the map 
leaflet() |>
  setView(lng = euro_long, lat = euro_lat, zoom = 2.5) |>
  addProviderTiles("CartoDB.DarkMatter") |>  # <- I used this site to find different map options: https://leaflet-extras.github.io/leaflet-providers/preview/
  addCircles(data = euro, radius = sqrt(euro$total_deaths) * 300, color = ~colors(euro$total_deaths), fillOpacity = 1, stroke = TRUE, popup = popupeuro) |>  # <- opacity ranges from 0 to 1
  addLegend(pal = colors, values = euro$total_deaths, position = "bottomleft", title = "Total Deaths")
Assuming "longitude" and "latitude" are longitude and latitude, respectively
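The circle radius in the map is based on the square root of total deaths. The sketch below, using invented death tolls, illustrates one reason for that choice: a circle's area grows with the square of its radius, so taking the square root makes the area, rather than the radius, proportional to the death toll. The multiplier of 300 is the one used in the map code, and the radius in addCircles is measured in meters; everything else here is illustrative.

```r
# Invented death tolls, the second four times the first
deaths <- c(25, 100)

# Same formula as the map: sqrt(deaths) * 300
radius <- sqrt(deaths) * 300

# Circle area is pi * r^2, so area scales linearly with deaths
area <- pi * radius^2

print(radius)
print(area[2] / area[1])
```

With these numbers, four times as many deaths gives a circle with twice the radius and four times the area.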

Essay

My first plot is a heatmap that shows the total number of conflict-related deaths per region for every year between 2000 and 2024. From the plot, it is clear that most conflict-related deaths occur in Africa and the Middle East, which is unsurprising given the number of wars, past and ongoing, in those regions. The majority of deaths from 2011 onwards in the Middle East are likely due to the Syrian Civil War. In Africa, the dark red tile in 2022 marks a particularly deadly year of the Tigray War in Ethiopia. Unsurprisingly, in Europe, there is a relatively low number of conflict-related deaths until the start of the Russo-Ukrainian War.

For my second plot, I tallied up the number of violent events that have occurred in each region and plotted them according to type of violence. Once again, the Middle East is the region that has experienced the most violence. In all regions except the Americas, state-based conflict is the dominant type of violence. Since 2000, the Americas have not seen full-scale war, but the presence of organized crime groups likely contributes to the high number of non-state violent events. Europe appears to be the most peaceful region for most of the early 2000s, but the number of violent events increases drastically in the 2020s due to the war in Ukraine.

My third plot shows the top 5 conflicts with the most civilian deaths. The results are unsurprising but horrifying to contemplate. I find it troubling that although the Russo-Ukrainian War and Gaza War are more recent than the Syrian Civil War, they already rank among the top 5 21st-century conflicts by civilian deaths. Of note, while the other conflicts are state-based, the ISIS entries in both the Middle East and Africa are one-sided violence perpetrated against civilians.

Ukraine is the focus of my map, but other violent events around Europe are also shown. In Western Europe, most violent events were terrorist attacks. For example, the map shows the 2015 ISIS attack in Paris. Besides Ukraine, another hotspot is in the Azerbaijan - Armenia region, due to the Nagorno-Karabakh conflict between the two countries.

Overall, my plots show the impact of conflict on a broad scale, but in the future, I would like to conduct a more granular analysis by examining the nature of violent events within one particular conflict.

Use of AI Full Citation

The values returned from running Chunk 4 seemed too low, so I verified with Google Gemini (2.5 Flash) whether I was using the correct summary function.

Prompt: “When I run this code "byregion <- conflicts2 |> group_by(region, year) |> summarize(avg_totaldeaths = mean(total_deaths))", the average number of deaths per region for each year seem kind of low.”

Answer: “If your original data (conflicts2) is structured at a very disaggregated (fine-grained) level, this average will be low. When you calculate mean(total_deaths) across all these highly-disaggregated units (events or dyads) the result is the average death toll per event/dyad that occurred in that region/year. If you want the total number of deaths, you need to change your summary function to sum the total_deaths instead of taking the mean.”
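To illustrate the difference described in the answer, here is a small sketch with invented events (not UCDP data). It uses base R's aggregate() instead of the group_by() and summarize() calls in the project so that it runs without extra packages; the object names `events`, `avg`, and `tot` are hypothetical.

```r
# Invented events: region, year, and deaths per event
events <- data.frame(
  region = c("Europe", "Europe", "Europe", "Africa", "Africa"),
  year = c(2022, 2022, 2022, 2022, 2022),
  total_deaths = c(1, 2, 30, 4, 6)
)

# Average deaths per event within each region-year
avg <- aggregate(total_deaths ~ region + year, data = events, FUN = mean)

# Total deaths across all events within each region-year
tot <- aggregate(total_deaths ~ region + year, data = events, FUN = sum)

print(avg)
print(tot)
```

In this toy data, Europe's three events average 11 deaths each but add up to 33, which is why the mean looks low when the goal is a regional total.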

Bibliography

Dixon & Moe. “Shades of Red.” HTML Color Codes, htmlcolorcodes.com/colors/shades-of-red/. Accessed 16 Nov. 2025.

“Set the Thickness of Highcharter Column Chart Columns in R.” Stack Overflow, 24 Oct. 2019, stackoverflow.com/questions/58532263/set-the-thickness-of-highcharter-column-chart-columns-in-r. Accessed 16 Nov. 2025. Answered by user mnist.

“Highcharter-R: X-Axis Values Not Showing Up, Wrong Text on Y-Axis.” Stack Overflow, 28 Dec. 2020, stackoverflow.com/questions/65475811/highcharter-r-x-axis-values-not-showing-up-wrong-text-on-y-axis. Accessed 16 Nov. 2025. Answered by user Aman.