Welcome to my final project for POLS3230 - Political Analysis in R. When deciding on my final project topic, I knew I wanted to work with some sort of geospatial data, especially in the international relations field. I decided to map military events in Ukraine from the start of the invasion to the present to see how the war has changed and moved. Fortunately, I came across a University of Michigan professor, Yuri Zhukov, who was collecting geo-coded military data across Ukraine from social media and Ukrainian/Russian news sources. In his database, known as Violent Incident Information from News Articles (VIINA), he recorded each event's location, time, source, event type, and initiator. The locations are as precise as longitude/latitude coordinates, and the times go down to the second. His data is free to access in .csv format and can be found on GitHub. For this particular project, I used the file "events_latest.csv".
Citation: Zhukov, Yuri (2022). “VIINA: Violent Incident Information from News Articles on the 2022 Russian Invasion of Ukraine.” Ann Arbor: University of Michigan, Center for Political Studies. (https://github.com/zhukovyuri/VIINA, accessed 11/27/2022).
library(tidyverse)
library(lubridate)
library(maps)
library(rnaturalearth)
library(rnaturalearthdata)
library(sf)
library(ggthemes)
library(gganimate)
UK <- read_csv("/Users/joshfriedman/Downloads/events_latest.csv")
I wanted to simplify the dataset down to only the variables I needed, including event number, location, time, coordinates, type of event, and initiator. In order to do so, I had to create multiple versions of the original dataset (UK). For this project, I wanted to narrow the event types to only Artillery Strikes, Airstrikes, and Firefights. I did so by keeping only those three probability columns with the select function.
UK1 <- UK |>
  select(event_id, report_id, location, tempid, source, date, url,
         time, longitude, latitude, GEO_PRECISION, GEO_API,
         YRWK, hours, minutes, hours_c, time_c, date_time,
         t_loc_pred, t_airstrike_pred, t_artillery_pred,
         t_firefight_pred, a_rus_b, a_ukr_b)
UK1 <- UK1 |> mutate(event_id = as.numeric(event_id))
UK2 <- UK1 |> select(t_airstrike_pred, t_artillery_pred, t_firefight_pred)
UK1 <- UK1 |>
  rename(Airstrike = t_airstrike_pred,
         Artillery = t_artillery_pred,
         Firefight = t_firefight_pred)
UK2 <- UK2 |>
  rename(Airstrike = t_airstrike_pred,
         Artillery = t_artillery_pred,
         Firefight = t_firefight_pred)
After narrowing my columns down and renaming them for sanity's sake, I had to create a new column that specified the event type name. In Zhukov's dataset, each event comes with a predicted probability for each event type, and the type with the highest probability is the one most likely to have occurred. To simplify this, I made a new column, max, that holds the name of the highest-probability type in each row, using the which.max function.
UK2$max <- colnames(UK2)[apply(UK2,1,which.max)]
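To make the row-wise step easier to follow, here is a minimal sketch of the same idea on a toy data frame. The probabilities are made up for illustration and are not taken from VIINA. It also shows max.col() from base R as a possible alternative to apply() with which.max.

```r
# Toy probabilities (hypothetical values, not VIINA data)
probs <- data.frame(
  Airstrike = c(0.10, 0.70, 0.20, 0.40),
  Artillery = c(0.80, 0.20, 0.30, 0.40),
  Firefight = c(0.10, 0.10, 0.50, 0.20)
)

# For each row, find the position of the largest value,
# then look up the column name at that position
probs$max <- colnames(probs)[apply(probs, 1, which.max)]

# Alternative from base R: max.col() returns the same positions;
# ties.method = "first" matches which.max, which keeps the first maximum
alt <- colnames(probs)[max.col(as.matrix(probs[, 1:3]), ties.method = "first")]

print(probs$max)
# "Artillery" "Airstrike" "Firefight" "Airstrike"
```

The last row is a tie between Airstrike and Artillery, and both approaches keep the first column that reaches the maximum.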
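After creating UK2$max, I had to create a new dataset by attaching the new column (UK2$max) to the dataset I had tidied, UK1. Because bind_cols matches rows by position, this only works while both datasets keep the same row order.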
UKtest <- UK2 |>
  select(Firefight, max)
UK3 <- bind_cols(UK1, UKtest)
## New names:
## • `Firefight` -> `Firefight...22`
## • `Firefight` -> `Firefight...25`
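The "New names" message appears because a column called Firefight exists in both inputs. As a small sketch on toy tibbles (the values and the names events and types are hypothetical), binding only the new column avoids the renaming:

```r
library(dplyr)

# Toy stand-ins for the tidied events and the type columns (hypothetical values)
events <- tibble(event_id = 1:3, Firefight = c(0.1, 0.6, 0.2))
types  <- tibble(Firefight = c(0.1, 0.6, 0.2),
                 max = c("Artillery", "Firefight", "Airstrike"))

# Binding both columns repeats Firefight, so dplyr renames the duplicates
dup <- bind_cols(events, types)
names(dup)
# "event_id" "Firefight...2" "Firefight...3" "max"

# Binding only the new column keeps the original names
clean <- bind_cols(events, select(types, max))
names(clean)
# "event_id" "Firefight" "max"
```

Either way, the rows are paired purely by position, so both inputs need to be in the same order.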
After doing so, it was time to create a complete dataset with the new variable. I selected the columns I needed and renamed one as well.
UKtidy <- UK3 |>
  select(event_id, report_id, location, tempid, source, date, url, time,
         longitude, latitude, GEO_PRECISION, GEO_API, YRWK, hours, minutes,
         hours_c, time_c, date_time, max, a_rus_b, a_ukr_b)
UKtidy <- UKtidy |>
  rename(Initiator = a_rus_b)
Now that I had tidied my data to my liking, I could begin to build my map.
I used the rnaturalearth package to create a base layer for my ggplot, using its country data. I filtered out every country but Ukraine and tested to see if the map rendered.
world <- ne_countries(scale = 'medium',
                      returnclass = 'sf')
world <- world |>
  filter(sovereignt == "Ukraine")
ggplot(data = world) +
  geom_sf()
After creating my base layer, I had to create an initial static ggplot to overlay on the "world" object I created above. The first layer, geom_sf(), draws the outline of Ukraine, and geom_point() is added as the next layer for the military events. In the map, I mapped x to longitude and y to latitude, with events distinguished by event type (color) and by the party that initiated the attack (shape). In the original dataset, the initiator is a binary variable, and I decided to keep it that way for simplicity's sake. In this case, Russia = 1. I went with theme_minimal and added the appropriate labels.
finalmap <- ggplot(data = world) +
  geom_sf() +
  geom_point(data = UKtidy,
             mapping = aes(x = longitude,
                           y = latitude,
                           color = max,
                           shape = as.factor(Initiator))) +
  theme_minimal() +
  labs(x = "Longitude",
       y = "Latitude",
       title = "Mapping the Russian Invasion of Ukraine from February 24, 2022 to Present",
       subtitle = "Data Pulled from Yuri Zhukov's VIINA Project",
       color = "Type of Military Event",
       shape = "Who Initiated? (Russia = 1)")
finalmap
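If a more readable legend were wanted, one option would be to turn the binary initiator into a labeled factor before plotting. This is only a sketch on a toy vector. The label text is my own assumption, since a 0 only means the event was not coded as Russian-initiated.

```r
# Toy initiator values (hypothetical): 1 = Russia, 0 = not coded as Russia
initiator <- c(1, 0, 1, NA)

# Map the 0/1 codes to descriptive labels; NA stays NA
initiator_label <- factor(initiator,
                          levels = c(0, 1),
                          labels = c("Not attributed to Russia", "Russia"))

print(as.character(initiator_label))
# "Russia" "Not attributed to Russia" "Russia" NA
```

A factor like this could be passed to the shape aesthetic in place of as.factor(Initiator), so the legend shows the labels instead of 0 and 1.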
To finish, I used gganimate to create a gif that steps through each week of the invasion. I also used shadow_mark to retain the events from previous weeks in order to visualize the geographical trends of the war. Keeping those earlier points on screen also shows how the conflict has escalated since the war began.
WARNING: gif will take 5-10 minutes to animate
finalanimate <- ggplot(data = world) +
  geom_sf() +
  geom_point(data = UKtidy,
             mapping = aes(x = longitude,
                           y = latitude,
                           color = max,
                           shape = as.factor(Initiator))) +
  theme_minimal() +
  labs(x = "Longitude",
       y = "Latitude",
       title = "Mapping the Russian Invasion of Ukraine from February 24, 2022 to Present",
       subtitle = "Data Pulled from Yuri Zhukov's VIINA Project",
       color = "Type of Military Event",
       shape = "Who Initiated? (Russia = 1)") +
  transition_states(YRWK,
                    transition_length = 4,
                    state_length = 4) +
  shadow_mark(past = TRUE, future = FALSE)
finalanimate
## Warning in lapply(row_vars$states, as.integer): NAs introduced by coercion
## Warning in f(..., self = self): NAs introduced by coercion
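Since rendering the full gif is slow, it can help to try the same transition on a tiny dataset and control the render settings explicitly. The sketch below uses made-up coordinates and week labels, not VIINA data, and assumes the gifski package is installed for the renderer. The file name toy_events.gif is just an example.

```r
library(ggplot2)
library(gganimate)

# Toy events: hypothetical coordinates and week labels, not VIINA data
events <- data.frame(
  longitude = c(30.5, 36.2, 37.5, 32.0, 35.0, 33.4),
  latitude  = c(50.4, 49.9, 47.1, 46.9, 48.5, 47.9),
  week      = factor(rep(c("week 1", "week 2", "week 3"), each = 2))
)

# One state per week; shadow_mark keeps points from earlier weeks visible
anim <- ggplot(events, aes(x = longitude, y = latitude)) +
  geom_point() +
  transition_states(week, transition_length = 1, state_length = 1) +
  shadow_mark(past = TRUE, future = FALSE)

# Fewer frames and a lower frame rate render faster than the defaults
rendered <- animate(anim, nframes = 30, fps = 5, renderer = gifski_renderer())

# Write the rendered animation to disk
anim_save("toy_events.gif", animation = rendered)
```

Once the small version looks right, the same animate() and anim_save() calls could be applied to finalanimate, with nframes and fps raised as needed.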