This is an analysis of state-sponsored cyber operations.
The data presented here was taken from the Cyber Operations Tracker of the Council on Foreign Relations (CFR), a database of publicly known state-sponsored incidents since 2005.
The analysis was done using R, as a part of an excellent MOOC by the Knight Center I participated in.
This is the second version of the analysis, updated with some tips from other participants in the MOOC (thanks!).
As always, a bunch of libraries were used:
library(lubridate)
library(ggplot2)
library(stringr)
library(anytime)
library(ggthemes)
library(gghighlight)
library(dplyr)
library(sf)
library(readr)
library(rnaturalearth)
library(leaflet)
library(tidyr)
library(DT)
library(plotly)
source: https://www.cfr.org/interactive/cyber-operations/export-incidents?_format=csv
Cyber_operations <- read.csv("https://www.cfr.org/interactive/cyber-operations/export-incidents?_format=csv", stringsAsFactors = FALSE)
source: rnaturalearth library
world <- st_as_sf(countries110) %>%
filter(sovereignt!="Antarctica") %>%
filter(type != "Dependency") %>%
filter(type != "Disputed") %>%
filter(type != "Indeterminate")
I created several dataframes to play around with, and to make the plotting and mapping easier later on. I changed the names of some states (nothing personal) so that they match the names used in the rnaturalearth dataframe. I also adjusted the total number of operations per state so that dual-sponsored operations count toward every state involved.
This is the main chunk of code in this document, so take a minute to enjoy it.
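#adding the number of attacks sponsored and suffered, to be plotted and mapped later
#note: Spon_List and Vic_List are not defined in this document; they are assumed to be
#single strings holding every sponsor (or victim) entry, with state names matching `admin`
world <- world %>%
mutate(attacked=ifelse(str_count(Spon_List, admin)>0,str_count(Spon_List, admin),NA)) %>%
mutate(was.attacked=ifelse(str_count(Vic_List, admin)>0,str_count(Vic_List, admin),NA))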
#creating a lean dataframe to be used as datatable
Lean_database <- Cyber_operations %>%
filter(Sponsor!="") %>%
filter(Date!="") %>%
select(Sponsor, Victims, Type, Title, Description, Date) %>%
arrange(Sponsor,Date)
#dataframe of attacks with unknown sponsors
UnknownSponsor<- Cyber_operations %>%
filter(Sponsor=="") %>%
select(Victims, Type, Description, Title, Date)
#dataframe of single-sponsored attacks
SingleSponsor <- Cyber_operations %>%
select(Sponsor, Type, Date) %>%
filter(!str_detect(Sponsor, ","),
Sponsor!="",
Type!="") %>%
mutate(Year = year(anydate(Date))) %>%
mutate(Sponsor=case_when(
Sponsor=="Iran (Islamic Republic of)" ~ "Iran",
Sponsor=="Korea (Democratic People's Republic of)" ~ "North Korea",
Sponsor=="Korea (Republic of)" ~ "South Korea",
Sponsor=="Russian Federation" ~ "Russia",
Sponsor=="United States" ~ "United States of America",
TRUE ~ Sponsor)) %>%
select(-Date) %>%
arrange(Sponsor, Type, Year)
#attacks by state per year
Tidy_by_year <- SingleSponsor %>%
group_by(Sponsor, Year) %>%
summarize(Cases=n())
#attacks with multiple known sponsors
MultipleSponsors <- Cyber_operations %>%
select(Sponsor, Type, Date) %>%
filter(str_detect(Sponsor, ","))
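The chunk above counts operations per state with `Spon_List` and `Vic_List`, but it does not show how those objects were built. The following is only a guess at one way to do it, on a toy vector with placeholder values: all sponsor entries are pasted into one long string, so an operation with two sponsors adds to the count of both. The names `sponsors`, `counts` and the use of `fixed()` (which treats state names literally rather than as regular expressions) are assumptions of this sketch.

```r
library(stringr)

# Toy stand-in for the Sponsor column of the CFR export (placeholder values).
sponsors <- c("China", "Russia", "United States of America, Israel", "", "China")

# Collapse every non-empty sponsor entry into one long string, so that a
# dual-sponsored operation contributes to the count of both states.
Spon_List <- paste(sponsors[sponsors != ""], collapse = ", ")

# str_count() is vectorised over the pattern: one count per country name.
admin <- c("China", "Israel", "France")
counts <- str_count(Spon_List, fixed(admin))

# States with no match get NA, mirroring the ifelse() in the chunk above.
attacked <- ifelse(counts > 0, counts, NA)
```

One caveat of this approach: a short name that is contained in a longer one (for example a bare "Korea") would be counted for both, which is one reason the state names need to be harmonised first.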
To begin, I filtered the database down to operations with known sponsors and dates, which were the vast majority (233 out of 262). An interactive datatable was then produced using the DT library.
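The table code itself is not shown here. A minimal sketch of such a table, assuming a dataframe with the same columns as `Lean_database`, could look like this. The rows in `lean_example` are placeholders, and the chosen options (`filter`, `pageLength`) are illustrative rather than the settings actually used.

```r
library(DT)

# Toy rows with the same columns as Lean_database (placeholder values only).
lean_example <- data.frame(
  Sponsor = c("State A", "State B"),
  Victims = c("State C", "State D"),
  Type = c("Espionage", "Sabotage"),
  Title = c("Example operation 1", "Example operation 2"),
  Description = c("Placeholder description", "Placeholder description"),
  Date = c("2015-01-01", "2016-06-01"),
  stringsAsFactors = FALSE
)

# Build an interactive table with per-column filters and ten rows per page.
operations_table <- datatable(
  lean_example,
  filter = "top",
  rownames = FALSE,
  options = list(pageLength = 10)
)

operations_table
```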
This chart represents cyber operations by state per year.
The dominance of China and Russia is clearly visible.
This chart shows clearly how each state's involvement has evolved over time.
China is the most consistent actor, while Russia has sharply increased its involvement since 2013.
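The plotting code is not part of this write-up. As a rough illustration, a per-year line chart that highlights the most active states could be drawn with ggplot2 and gghighlight along these lines. The dataframe `tidy_example` stands in for `Tidy_by_year` with placeholder counts, and the highlight threshold is an arbitrary choice for this sketch.

```r
library(ggplot2)
library(gghighlight)

# Toy stand-in for Tidy_by_year (placeholder counts, not CFR data).
tidy_example <- data.frame(
  Sponsor = rep(c("State A", "State B", "State C"), each = 3),
  Year = rep(2016:2018, times = 3),
  Cases = c(5, 6, 7, 1, 4, 9, 1, 1, 2)
)

# One line per sponsor; only sponsors that reach 5 cases in some year keep
# their colour, the rest are greyed out by gghighlight().
yearly_plot <- ggplot(tidy_example, aes(x = Year, y = Cases, colour = Sponsor)) +
  geom_line() +
  gghighlight(max(Cases) >= 5, use_direct_label = FALSE) +
  scale_x_continuous(breaks = 2016:2018) +
  labs(title = "Cyber operations by state per year", x = NULL, y = "Operations") +
  theme_minimal()

yearly_plot
```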
Another possible way of illustrating the data is by using a map.
In this case, an interactive map was utilized, using rnaturalearth geometry and the leaflet library.
As expected, China, Russia - and to a lesser extent, Iran - immediately pop out.
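For readers who want to reproduce a map like this, here is a minimal choropleth sketch using rnaturalearth geometry and leaflet. The counts in `counts_example` are placeholders rather than the CFR figures, and the palette, colours and label format are assumptions of this example, not the settings behind the map above.

```r
library(dplyr)
library(sf)
library(rnaturalearth)
library(leaflet)

# Country polygons at the coarsest scale, without Antarctica.
world_example <- ne_countries(scale = "small", returnclass = "sf") %>%
  filter(admin != "Antarctica")

# Placeholder counts keyed by the Natural Earth `admin` name (not real data).
counts_example <- data.frame(
  admin = c("China", "Russia", "Iran"),
  attacked = c(3, 2, 1)
)

world_example <- left_join(world_example, counts_example, by = "admin")

# Continuous red palette; countries without a count are drawn in light grey.
pal <- colorNumeric("Reds", domain = world_example$attacked, na.color = "#eeeeee")

sponsor_map <- leaflet(world_example) %>%
  addTiles() %>%
  addPolygons(
    fillColor = ~pal(attacked),
    fillOpacity = 0.8,
    color = "white",
    weight = 1,
    label = ~paste0(admin, ": ", attacked)
  ) %>%
  addLegend(pal = pal, values = ~attacked, title = "Operations sponsored")

sponsor_map
```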
The data reveals not only the main sponsors of cyber operations, but also the main targets.
The United States is revealed to be the biggest target of state-sponsored cyber operations.
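A ranking of targets can be sketched as follows, under the assumption that the `Victims` field lists states separated by commas. That assumption may not hold for every row of the CFR export, where victims can also be described in free text. The dataframe `victims_example` holds placeholder rows.

```r
library(dplyr)
library(tidyr)

# Toy rows (placeholder values); Victims is assumed to be comma-separated.
victims_example <- data.frame(
  Title = c("Op 1", "Op 2", "Op 3"),
  Victims = c("United States, Germany", "United States", "Ukraine"),
  stringsAsFactors = FALSE
)

# One row per (operation, victim) pair, then count operations per victim.
victim_counts <- victims_example %>%
  separate_rows(Victims, sep = ",\\s*") %>%
  filter(Victims != "") %>%
  count(Victims, sort = TRUE, name = "Operations")

victim_counts
```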
It is important to emphasize that cyber operations can mean different things. The CFR database lists six types of attack.
Out of the six, espionage is by far the most common goal.
However, it’s important to remember that different states pursue different cyber strategies.
Examining the four leading actors in the database, we can clearly see that while all of them are heavily invested in espionage, some pursue more diverse cyber goals, mainly through sabotage and DDoS.
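A comparison like this could be computed from a dataframe shaped like `SingleSponsor`. The sketch below picks the leading sponsors by number of operations and computes each attack type's share per sponsor. It uses placeholder rows and keeps only the top two sponsors so the toy data stays small; the names `leading` and `type_share` are hypothetical.

```r
library(dplyr)
library(ggplot2)

# Toy stand-in for SingleSponsor (placeholder values, not CFR data).
single_example <- data.frame(
  Sponsor = c("State A", "State A", "State A", "State B", "State B", "State C"),
  Type = c("Espionage", "Espionage", "Sabotage", "Espionage", "DDoS", "Espionage"),
  stringsAsFactors = FALSE
)

# Leading sponsors by number of operations (use n = 4 on the full data).
leading <- single_example %>%
  count(Sponsor, sort = TRUE) %>%
  slice_head(n = 2) %>%
  pull(Sponsor)

# Share of each attack type within each leading sponsor.
type_share <- single_example %>%
  filter(Sponsor %in% leading) %>%
  count(Sponsor, Type, name = "Cases") %>%
  group_by(Sponsor) %>%
  mutate(Share = Cases / sum(Cases)) %>%
  ungroup()

# One panel per sponsor makes the different mixes easy to compare.
type_plot <- ggplot(type_share, aes(x = Type, y = Share)) +
  geom_col() +
  coord_flip() +
  facet_wrap(~Sponsor) +
  labs(x = NULL, y = "Share of operations")

type_plot
```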
The database allows us to examine collaborations in the cyber realm.
The US is the clear leader in cyber cooperation with allies, namely Israel, the UK, and Taiwan. Interestingly, 2018 is the first year that features a joint Chinese-Russian cyber operation.
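Since jointly sponsored operations list their sponsors in one comma-separated field, a count of joint operations per state can be sketched by splitting that field into one row per sponsor. The rows in `multi_example` are placeholders shaped like `MultipleSponsors`, and the `Operation` and `Joint_operations` columns are names introduced for this example.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for MultipleSponsors (placeholder values, not CFR data).
multi_example <- data.frame(
  Sponsor = c("State A, State B", "State A, State C", "State B, State D"),
  Year = c(2010, 2012, 2018),
  stringsAsFactors = FALSE
)

# Tag each operation, split the sponsors into separate rows, then count how
# many joint operations each state took part in.
partners <- multi_example %>%
  mutate(Operation = row_number()) %>%
  separate_rows(Sponsor, sep = ",\\s*") %>%
  count(Sponsor, sort = TRUE, name = "Joint_operations")

partners
```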
The database also contains many cyber operations with unknown sponsors. It’s possible that a decent analysis of the victims and the types, coupled with a comparison to the known methods of each actor, will shed some statistical light on the instigators.
Perhaps in a future project.