Introduction
My research question is: do terrorist organizations admit to their own planned suicide attacks more often when the target is civilian, political, or military? My data set includes categorical and numerical information on attacks by terrorist organizations (where known): when and where each attack happened, the number of casualties, the type of weapon, who the target was and, most importantly to me, whether an organization admitted to the attack or not. Most terrorist organizations have similar goals and ideals: to instill fear in order to spread a religious message or to express how much they hate the Western world. Following this methodology, terrorists disregard morals and will gladly give up their own people’s lives for their purpose. Suicide attacks are also a big statement and are seen as a grand sacrifice. My dataset is from the Chicago Project on Security and Terrorism (CPOST) and has data from 1985 to 2019. The following link leads to a research study done with the dataset: https://journals.sagepub.com/doi/10.1177/0022343320978260.
library(lubridate)
##
## Attaching package: 'lubridate'
## The following objects are masked from 'package:base':
##
## date, intersect, setdiff, union
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.1 ✔ tibble 3.2.1
## ✔ purrr 1.0.4 ✔ tidyr 1.3.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(broom)
setwd("C:/Users/Jacob/Downloads")
attack <- read.csv("suicide_attacks.csv")
Data Analysis
I will use hypothesis testing to determine whether a terrorist organization claiming a suicide attack depends on whether the attack targeted civilians, political figures, or the military. First, though, I clean the data and perform exploratory data analysis (EDA).
attack |>
ggplot(aes(x = target.type, fill = status)) + # count of each target type, split by status (confirmed or possible suicide attack)
geom_bar() +
theme(
axis.title.x = element_blank(),
axis.title.y = element_blank(),
legend.text = element_text(size = 11, color = "black"),
legend.title = element_text(size = 12, face = "bold", color = "black"),
panel.background = element_rect(fill = "white", color = NA),
plot.background = element_rect(fill = "lightgrey", color = NA),
panel.grid = element_line(color = "darkgrey", linetype = "dashed"),
legend.position = "bottom",
plot.title = element_text(hjust = 0.5)
) +
labs(
title = "Counts of Target Types"
)
Looking at the data set, there are only 4 unknown target types, and there are many more confirmed suicide attacks than any other status, so we can safely use only confirmed suicides. I’ll also remove the unknowns to reduce the skew, but the data is still very skewed toward military targets. I also recode “Security” targets as “Military”. The claim variable takes the values “Claimed”, “Suspected”, “Denied”, and “Unclaimed”, so I’ll recode all the suspected and denied claims as “Unclaimed” because we only want to know whether an organization admitted to the attack or not.
attack_clean <- attack |>
  # recode "Security" targets as "Military"
  mutate(target.type = case_when(target.type == "Security" ~ "Military", TRUE ~ target.type)) |>
  filter(target.type != "Unknown") |> # remove unknown target types
  filter(status == "Confirmed Suicide") |> # keep only confirmed suicide attacks
  # fold "Suspected" and "Denied" claims into "Unclaimed"
  mutate(claim = case_when(claim %in% c("Suspected", "Denied") ~ "Unclaimed", TRUE ~ claim))
attack_clean |>
ggplot(aes(x = target.type, fill = claim)) +
geom_bar() +
theme(
axis.title.x = element_blank(),
axis.title.y = element_blank(),
legend.text = element_text(size = 11, color = "black"),
legend.title = element_blank(),
panel.background = element_rect(fill = "white", color = NA),
plot.background = element_rect(fill = "lightgrey", color = NA),
panel.grid = element_line(color = "darkgrey", linetype = "dashed"),
plot.caption = element_text(size = 11, face = "bold", color = "black"),
legend.position = "bottom",
plot.title = element_text(hjust = 0.5)
) +
labs(
title = "Counts of Confirmed Suicide Attack Target Types",
caption = "Done by known and unknown terrorist organizations"
)
All counts are well above 5, so we can safely use a chi-square test of independence between the two categorical variables, “Target type” and “Claim”.
Statistical Analysis
\(H_0\) : Terrorist suicide attack targets and whether the terrorist group admitted to the attack are independent
\(H_a\) : Terrorist suicide attack targets and whether the terrorist group admitted to the attack are dependent
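For reference, the statistic behind this test can be written out as follows. The symbols are my own notation rather than anything from the dataset: \(O_{ij}\) is the observed count for target type \(i\) and claim status \(j\), \(E_{ij}\) is the count expected under \(H_0\), \(R_i\) and \(C_j\) are the row and column totals, and \(n\) is the total number of attacks.

```latex
\[
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}},
\qquad
E_{ij} = \frac{R_i \, C_j}{n},
\qquad
df = (r - 1)(c - 1)
\]
```

With \(r = 3\) target types and \(c = 2\) claim statuses after cleaning, this gives \(df = (3 - 1)(2 - 1) = 2\).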
table_chi <- table(attack_clean$target.type, attack_clean$claim)
test_result <- chisq.test(table_chi)
test_result
##
## Pearson's Chi-squared test
##
## data: table_chi
## X-squared = 93.95, df = 2, p-value < 2.2e-16
test_result$expected
##
## Claimed Unclaimed
## Civilian 910.3250 1127.6750
## Military 2308.8664 2860.1336
## Political 428.8086 531.1914
A chi-square statistic of 93.95 is very large. It made me second-guess my assumption that the variables are independent and wonder whether they were actually recording the same information, but the attack target really does appear to influence the decision to admit to the attack or not. This is even more surprising because, across the three attack target types, claimed and unclaimed attacks are split almost in half. The p-value, below 2.2e-16, is also extremely strong evidence to reject the null hypothesis. After checking the expected values, they are all well above 5, so the chi-square result is surprising but valid.
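As a sanity check, the expected counts can be rebuilt by hand. The sketch below is not part of the original analysis: it assumes the row and column totals obtained by summing the printed expected table (for example, 910.3250 + 1127.6750 = 2038 civilian attacks), and the names `row_totals`, `col_totals`, and `p_value` are illustrative.

```r
# Margins recovered by summing the rows and columns of the printed expected table
row_totals <- c(Civilian = 2038, Military = 5169, Political = 960)
col_totals <- c(Claimed = 3648, Unclaimed = 4519)
n <- sum(row_totals)  # total number of attacks in the contingency table

# Expected count under independence: row total * column total / n
expected <- outer(row_totals, col_totals) / n
round(expected, 4)

# Degrees of freedom and upper-tail p-value for the reported statistic
df <- (length(row_totals) - 1) * (length(col_totals) - 1)
p_value <- pchisq(93.95, df = df, lower.tail = FALSE)
p_value
```

The rounded matrix should agree with `test_result$expected`, and the p-value falls well below the 2.2e-16 threshold that R prints.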
Now to inspect the residuals where the strong dependency is coming from:
resid <- as.data.frame(test_result$stdres) #extracting residuals
names(resid) <- c("Target", "Claimed", "Residual")
ggplot(resid, aes(x = Claimed, y = Target)) +
geom_point(aes(size = abs(Residual), color = Residual)) +
scale_color_gradient2(low = "darkgreen", mid = "white", high = "darkred", midpoint = 0, name = "Residual", guide = "none") +
scale_size_continuous(name = NULL, range = c(2, 10), guide = guide_legend(title = "Deviation level (bigger = further from expected)")) +
labs(
title = "Chi-square Residuals",
subtitle = "Red = More Attacks than Expected, Green = Fewer Attacks than Expected"
) +
theme(
axis.title.x = element_blank(),
axis.title.y = element_blank(),
legend.text = element_text(size = 8, color = "black"),
legend.title = element_text(size = 6, face = "bold", color = "black"),
panel.background = element_rect(fill = "white", color = NA),
plot.background = element_rect(fill = "lightgrey", color = NA),
panel.grid = element_line(color = "darkgrey", linetype = "dashed"),
plot.caption = element_text(hjust = 0.5, size = 7, face = "bold", color = "black"),
legend.position = "bottom",
plot.title = element_text(hjust = 0.5)
)
The graph above shows the standardized residuals from the chi-square test, that is, how far each observed count is from the value expected if the variables were independent. According to the chi-square test, if the variables were independent there would be far more claimed civilian attacks than were observed. Correspondingly, there are significantly more unclaimed civilian attacks than expected under independence. This further supports the alternative hypothesis that the categories are not independent, and it suggests that terrorist organizations decide whether or not to claim an attack based on the target. According to my data and the chi-square test, this decision is largely influenced by whether the attack targeted civilians. Organizations are much less likely to admit to an attack if it targeted civilians. For political or military targets it’s the opposite, though the effect is much less apparent than for civilian targets.
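To make clear what a standardized residual measures, here is a minimal sketch that computes one by hand and compares it with the `stdres` component that `chisq.test()` returns. The counts in `obs` are hypothetical and chosen only for illustration; they are not the CPOST data.

```r
# Hypothetical counts for illustration only; these are NOT the CPOST data
obs <- matrix(c(40, 60,
                70, 30,
                25, 25),
              nrow = 3, byrow = TRUE,
              dimnames = list(Target = c("Civilian", "Military", "Political"),
                              Claim = c("Claimed", "Unclaimed")))

n <- sum(obs)
row_p <- rowSums(obs) / n  # share of all attacks in each target type
col_p <- colSums(obs) / n  # share of all attacks in each claim status

# Expected counts under independence
expected <- outer(rowSums(obs), colSums(obs)) / n

# Standardized residual: (observed - expected) scaled by its standard error
manual_stdres <- (obs - expected) / sqrt(expected * outer(1 - row_p, 1 - col_p))

# Compare against the residuals computed by chisq.test()
builtin_stdres <- chisq.test(obs)$stdres
all.equal(manual_stdres, builtin_stdres, check.attributes = FALSE)
```

With only two claim statuses, the two residuals in each row are equal in size and opposite in sign, which is why a shortfall of claimed attacks for a target type always appears together with an excess of unclaimed ones.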
Although many factors could play into a terrorist organization’s decisions, I have found one leading factor in the admission of suicide bombings. If terrorist organizations admit to civilian attacks much less often than they would if the target were not a deciding factor, this can make the groups more liable in civilian attacks. If a civilian bombing occurs and no one admits to it, then according to my analysis it is more likely than not that a terrorist organization is choosing not to admit to it. The biggest terrorist group by far is the Islamic State, which conducts attacks across the Middle East, so it might be reasonable to assume that if an attack occurred at a civilian target and no organization has admitted to it, the Islamic State conducted the attack and chose not to admit to it. This is highly speculative, though, and further leading factors should be researched and analyzed to better understand terrorist organizations’ patterns.