Suicide Attacks

Author

Thiloni Konara

Introduction

This data set covers suicide attacks from 1982 through October 2020. The database includes information about the location of attacks, the target type, the weapon used, and systematic information on the demographic and general biographical characteristics of suicide attackers. The current CPOST-SAD release contains the universe of suicide attacks from 1982 through September 2015, a total of 4814 attacks in over 40 countries.

Variables

date.year: the year when the suicide attack occurred (numeric)

statistics.# wounded_high: the high estimate of the number of people wounded in the attack (numeric)

statistics.# killed_high: the high estimate of the number of people killed in the attack (numeric)

target.country: the country that was targeted in the attack (character)

Question

How do the number of wounded and the year of an attack affect the number of deaths in suicide attacks across different target countries?

Source

Chicago Project on Security and Terrorism (CPOST). 2020. Suicide Attack Database (October, 2020 Release).

Load the libraries

library(tidyverse)
library(ggfortify)
library(RColorBrewer)

Load the data set

suicide_attacks <- read_csv("suicide_attacks.csv")

Look at the data types and the first 6 rows

head(suicide_attacks)

# A tibble: 6 × 39
  groups           claim status statistics.sources date.year date.month date.day
  <chr>            <chr> <chr>               <dbl>     <dbl>      <dbl>    <dbl>
1 Islamic State    Susp… Confi…                  2      2015          6        2
2 Islamic State    Susp… Possi…                  3      2017          1        6
3 Islamic State    Susp… Possi…                  3      2017          1        6
4 Unknown Group    Uncl… Confi…                  4      2004         10        5
5 Taliban (IEA)    Clai… Possi…                  5      2017          7        4
6 Al-Jaysh al-Isl… Clai… Confi…                  4      2012         10        3
# ℹ 32 more variables: `statistics.# wounded_low` <dbl>,
#   `statistics.# wounded_high` <dbl>, `statistics.# killed_low` <dbl>,
#   `statistics.# killed_high` <dbl>, `statistics.# killed_low_civilian` <dbl>,
#   `statistics.# killed_high_civilian` <dbl>,
#   `statistics.# killed_low_political` <dbl>,
#   `statistics.# killed_high_political` <dbl>,
#   `statistics.# killed_low_security` <dbl>, …

str(suicide_attacks)

spc_tbl_ [10,018 × 39] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ groups                            : chr [1:10018] "Islamic State" "Islamic State" "Islamic State" "Unknown Group" ...
 $ claim                             : chr [1:10018] "Suspected" "Suspected" "Suspected" "Unclaimed" ...
 $ status                            : chr [1:10018] "Confirmed Suicide" "Possible - Too Few Sources" "Possible - Too Few Sources" "Confirmed Suicide" ...
 $ statistics.sources                : num [1:10018] 2 3 3 4 5 4 4 4 4 4 ...
 $ date.year                         : num [1:10018] 2015 2017 2017 2004 2017 ...
 $ date.month                        : num [1:10018] 6 1 1 10 7 10 10 10 10 10 ...
 $ date.day                          : num [1:10018] 2 6 6 5 4 3 3 3 3 3 ...
 $ statistics.# wounded_low          : num [1:10018] 8 0 0 10 2 100 100 100 100 100 ...
 $ statistics.# wounded_high         : num [1:10018] 8 0 0 15 2 120 120 120 120 120 ...
 $ statistics.# killed_low           : num [1:10018] 5 40 40 1 0 31 31 31 31 31 ...
 $ statistics.# killed_high          : num [1:10018] 5 40 40 10 0 40 40 40 40 40 ...
 $ statistics.# killed_low_civilian  : num [1:10018] 0 20 20 1 0 31 31 31 31 31 ...
 $ statistics.# killed_high_civilian : num [1:10018] 0 20 20 10 0 40 40 40 40 40 ...
 $ statistics.# killed_low_political : num [1:10018] 0 0 0 0 0 0 0 0 0 0 ...
 $ statistics.# killed_high_political: num [1:10018] 0 0 0 0 0 0 0 0 0 0 ...
 $ statistics.# killed_low_security  : num [1:10018] 5 20 20 0 0 0 0 0 0 0 ...
 $ statistics.# killed_high_security : num [1:10018] 5 20 20 0 0 0 0 0 0 0 ...
 $ statistics.# belt_bomb            : num [1:10018] 0 0 0 0 0 0 0 0 0 0 ...
 $ statistics.# truck_bomb           : num [1:10018] 0 0 0 0 1 0 0 0 0 0 ...
 $ statistics.# car_bomb             : num [1:10018] 1 0 0 1 0 1 1 1 1 1 ...
 $ statistics.# weapon_oth           : num [1:10018] 0 1 1 0 0 0 0 0 0 0 ...
 $ statistics.# weapon_unk           : num [1:10018] 0 0 0 0 0 0 0 0 0 0 ...
 $ target.weapon                     : chr [1:10018] "Car bomb" "Unspecified" "Unspecified" "Car bomb" ...
 $ target.region                     : chr [1:10018] "Asia" "Asia" "Asia" "Asia" ...
 $ target.subregion                  : chr [1:10018] "Western Asia" "Western Asia" "Western Asia" "Western Asia" ...
 $ target.country                    : chr [1:10018] "Syria" "Syria" "Syria" "Iraq" ...
 $ target.province                   : chr [1:10018] "Hasaka (Al Haksa)" "Deir ez-Zor" "Deir ez-Zor" "Baghdad" ...
 $ target.city                       : chr [1:10018] "Al Hasakah" "Deir ez-Zor" "Deir ez-Zor" "Baghdad" ...
 $ target.location                   : chr [1:10018] "close to a children's hospital" "Route between City & Deir ez-Zor Airport" "Route between City & Deir ez-Zor Airport" "Al Dora neighborhood, near refinery and cathedral" ...
 $ target.latitude                   : num [1:10018] 36.5 35.3 35.3 33.3 31.8 ...
 $ target.longtitude                 : num [1:10018] 40.8 40.1 40.1 44.4 64.5 ...
 $ target.desc                       : chr [1:10018] "Syrian Army checkpoint" "Syrian regime forces" "Syrian regime forces" "Iraqi Police patrol" ...
 $ target.type                       : chr [1:10018] "Security" "Security" "Security" "Security" ...
 $ target.nationality                : chr [1:10018] "Syrian" "Syrian" "Syrian" "Iraqi" ...
 $ statistics.# attackers            : num [1:10018] 1 2 2 1 1 3 3 3 3 3 ...
 $ statistics.# female_attackers     : num [1:10018] 0 0 0 0 0 0 0 0 0 0 ...
 $ statistics.# male_attackers       : num [1:10018] 0 0 0 0 0 0 0 0 0 0 ...
 $ statistics.# unknown_attackers    : num [1:10018] 1 2 2 1 1 3 3 3 3 3 ...
 $ attacker.gender                   : chr [1:10018] "Unknown" "Unknown" "Unknown" "Unknown" ...
 - attr(*, "spec")=
  .. cols(
  ..   groups = col_character(),
  ..   claim = col_character(),
  ..   status = col_character(),
  ..   statistics.sources = col_double(),
  ..   date.year = col_double(),
  ..   date.month = col_double(),
  ..   date.day = col_double(),
  ..   `statistics.# wounded_low` = col_double(),
  ..   `statistics.# wounded_high` = col_double(),
  ..   `statistics.# killed_low` = col_double(),
  ..   `statistics.# killed_high` = col_double(),
  ..   `statistics.# killed_low_civilian` = col_double(),
  ..   `statistics.# killed_high_civilian` = col_double(),
  ..   `statistics.# killed_low_political` = col_double(),
  ..   `statistics.# killed_high_political` = col_double(),
  ..   `statistics.# killed_low_security` = col_double(),
  ..   `statistics.# killed_high_security` = col_double(),
  ..   `statistics.# belt_bomb` = col_double(),
  ..   `statistics.# truck_bomb` = col_double(),
  ..   `statistics.# car_bomb` = col_double(),
  ..   `statistics.# weapon_oth` = col_double(),
  ..   `statistics.# weapon_unk` = col_double(),
  ..   target.weapon = col_character(),
  ..   target.region = col_character(),
  ..   target.subregion = col_character(),
  ..   target.country = col_character(),
  ..   target.province = col_character(),
  ..   target.city = col_character(),
  ..   target.location = col_character(),
  ..   target.latitude = col_double(),
  ..   target.longtitude = col_double(),
  ..   target.desc = col_character(),
  ..   target.type = col_character(),
  ..   target.nationality = col_character(),
  ..   `statistics.# attackers` = col_double(),
  ..   `statistics.# female_attackers` = col_double(),
  ..   `statistics.# male_attackers` = col_double(),
  ..   `statistics.# unknown_attackers` = col_double(),
  ..   attacker.gender = col_character()
  .. )
 - attr(*, "problems")=<externalptr>

Data cleaning

## Replace "#" with an underscore and remove "." from the column names
names(suicide_attacks) <- gsub("[#]", "_", names(suicide_attacks))
names(suicide_attacks) <- gsub("[.]", "", names(suicide_attacks))

head(suicide_attacks)

# A tibble: 6 × 39
  groups               claim status statisticssources dateyear datemonth dateday
  <chr>                <chr> <chr>              <dbl>    <dbl>     <dbl>   <dbl>
1 Islamic State        Susp… Confi…                 2     2015         6       2
2 Islamic State        Susp… Possi…                 3     2017         1       6
3 Islamic State        Susp… Possi…                 3     2017         1       6
4 Unknown Group        Uncl… Confi…                 4     2004        10       5
5 Taliban (IEA)        Clai… Possi…                 5     2017         7       4
6 Al-Jaysh al-Islami … Clai… Confi…                 4     2012        10       3
# ℹ 32 more variables: `statistics_ wounded_low` <dbl>,
#   `statistics_ wounded_high` <dbl>, `statistics_ killed_low` <dbl>,
#   `statistics_ killed_high` <dbl>, `statistics_ killed_low_civilian` <dbl>,
#   `statistics_ killed_high_civilian` <dbl>,
#   `statistics_ killed_low_political` <dbl>,
#   `statistics_ killed_high_political` <dbl>,
#   `statistics_ killed_low_security` <dbl>, …

To look at the exact names for columns

names(suicide_attacks)

 [1] "groups"                            "claim"                            
 [3] "status"                            "statisticssources"                
 [5] "dateyear"                          "datemonth"                        
 [7] "dateday"                           "statistics_ wounded_low"          
 [9] "statistics_ wounded_high"          "statistics_ killed_low"           
[11] "statistics_ killed_high"           "statistics_ killed_low_civilian"  
[13] "statistics_ killed_high_civilian"  "statistics_ killed_low_political" 
[15] "statistics_ killed_high_political" "statistics_ killed_low_security"  
[17] "statistics_ killed_high_security"  "statistics_ belt_bomb"            
[19] "statistics_ truck_bomb"            "statistics_ car_bomb"             
[21] "statistics_ weapon_oth"            "statistics_ weapon_unk"           
[23] "targetweapon"                      "targetregion"                     
[25] "targetsubregion"                   "targetcountry"                    
[27] "targetprovince"                    "targetcity"                       
[29] "targetlocation"                    "targetlatitude"                   
[31] "targetlongtitude"                  "targetdesc"                       
[33] "targettype"                        "targetnationality"                
[35] "statistics_ attackers"             "statistics_ female_attackers"     
[37] "statistics_ male_attackers"        "statistics_ unknown_attackers"    
[39] "attackergender"
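The cleaned names still contain a space after the underscore (for example "statistics_ killed_high"), which is why they need backticks later. A minimal sketch of an alternative, assuming it would be acceptable to collapse every run of non-alphanumeric characters into a single underscore; `raw_names` is a small hand-copied subset of the original names, and `clean_names` is a hypothetical result that the rest of this report does not use.

```r
# Toy subset of the original column names (copied from the str() output)
raw_names <- c("date.year",
               "statistics.# killed_high",
               "statistics.# wounded_high",
               "target.country")

# The same two steps used in this report: "#" -> "_" and "." removed
report_names <- gsub("[.]", "", gsub("[#]", "_", raw_names))

# Hypothetical alternative: every run of non-alphanumeric characters -> one "_"
clean_names <- gsub("[^A-Za-z0-9]+", "_", raw_names)

print(report_names)
print(clean_names)
```

With names like these, the backticks would no longer be needed, but every later chunk would have to use the new names.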

Removing rows with NAs in the columns I need

suicide_country <- suicide_attacks |>
  filter(!is.na(dateyear),
         !is.na(`statistics_ killed_high`),
         !is.na(`statistics_ wounded_high`),
         !is.na(targetcountry))
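Before filtering, it can help to see how many missing values each column actually has. A minimal sketch on a made-up data frame (the `toy` values and shortened column names are hypothetical, not taken from the CPOST data), using base R's `colSums(is.na())` and `complete.cases()`:

```r
# Hypothetical toy data with a few missing values
toy <- data.frame(
  dateyear      = c(2015, 2017, NA, 2004),
  killed_high   = c(5, 40, 3, NA),
  wounded_high  = c(8, 0, 2, 15),
  targetcountry = c("Syria", "Syria", "Iraq", NA)
)

# Number of NAs in each column
na_counts <- colSums(is.na(toy))

# Keep only rows with no NA in any of the columns
kept <- toy[complete.cases(toy), ]

print(na_counts)
print(nrow(kept))
```

Comparing `nrow()` before and after the filter shows how many attacks the cleaning step drops.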

Selecting columns I need for the research question

suicide_country <- suicide_country |>
  select(dateyear, `statistics_ killed_high`, `statistics_ wounded_high`, targetcountry) |>
  group_by(dateyear, `statistics_ killed_high`, `statistics_ wounded_high`, targetcountry)

head(suicide_country)

# A tibble: 6 × 4
# Groups:   dateyear, statistics_ killed_high, statistics_ wounded_high,
#   targetcountry [5]
  dateyear `statistics_ killed_high` `statistics_ wounded_high` targetcountry
     <dbl>                     <dbl>                      <dbl> <chr>        
1     2015                         5                          8 Syria        
2     2017                        40                          0 Syria        
3     2017                        40                          0 Syria        
4     2004                        10                         15 Iraq         
5     2017                         0                          2 Afghanistan  
6     2012                        40                        120 Syria

Linear Regression Model

fit1 <- lm(`statistics_ killed_high` ~ dateyear + `statistics_ wounded_high` + targetcountry, data = suicide_country)
autoplot(fit1, 1:4, nrow = 2, ncol = 2) ## Diagnostic plots; taken from the correlation, scatter plots, and regressions tutorial

summary(fit1)


Call:
lm(formula = `statistics_ killed_high` ~ dateyear + `statistics_ wounded_high` + 
    targetcountry, data = suicide_country)

Residuals:
     Min       1Q   Median       3Q      Max 
-1530.81    -4.12    -0.99     2.36   251.10 

Coefficients:
                                    Estimate Std. Error t value Pr(>|t|)    
(Intercept)                        2.123e+01  1.071e+02   0.198 0.842937    
dateyear                          -1.043e-02  5.324e-02  -0.196 0.844746    
`statistics_ wounded_high`         3.951e-01  1.286e-03 307.138  < 2e-16 ***
targetcountryAlgeria              -8.934e+00  4.454e+00  -2.006 0.044883 *  
targetcountryArgentina             5.537e+00  1.657e+01   0.334 0.738237    
targetcountryBangladesh           -5.190e-02  4.171e+00  -0.012 0.990072    
targetcountryBelgium              -3.432e+01  1.351e+01  -2.541 0.011082 *  
targetcountryBolivia              -2.284e+00  2.339e+01  -0.098 0.922218    
targetcountryBosnia & Herzegovina -1.954e+00  2.339e+01  -0.084 0.933413    
targetcountryBulgaria             -6.893e+00  2.338e+01  -0.295 0.768160    
targetcountryBurkina Faso         -7.921e+00  1.351e+01  -0.586 0.557708    
targetcountryCameroon             -5.739e-01  2.238e+00  -0.256 0.797608    
targetcountryChad                 -5.413e+00  4.453e+00  -1.216 0.224176    
targetcountryChina                -4.701e+00  4.450e+00  -1.056 0.290841    
targetcountryColombia             -6.045e+00  2.339e+01  -0.258 0.796039    
targetcountryDjibouti             -5.155e+00  1.654e+01  -0.312 0.755284    
targetcountryEgypt                -4.259e+00  2.637e+00  -1.615 0.106340    
targetcountryFinland              -1.851e+01  1.655e+01  -1.118 0.263456    
targetcountryFrance                1.838e+01  8.284e+00   2.219 0.026514 *  
targetcountryGeorgia              -2.071e-01  2.338e+01  -0.009 0.992933    
targetcountryGermany              -6.134e+00  2.338e+01  -0.262 0.793078    
targetcountryIndia                 3.062e+00  5.141e+00   0.596 0.551483    
targetcountryIndonesia            -1.585e+00  3.934e+00  -0.403 0.686950    
targetcountryIran                 -1.209e+01  6.507e+00  -1.858 0.063138 .  
targetcountryIraq                  1.686e+00  6.641e-01   2.539 0.011129 *  
targetcountryIsrael               -1.237e+01  2.095e+00  -5.905 3.65e-09 ***
targetcountryJordan                3.119e+00  8.852e+00   0.352 0.724569    
targetcountryKazakhstan           -1.544e-01  1.654e+01  -0.009 0.992552    
targetcountryKenya                -2.322e+02  8.338e+00 -27.856  < 2e-16 ***
targetcountryKuwait               -2.795e+01  1.354e+01  -2.064 0.039051 *  
targetcountryKyrgyzstan           -1.393e+00  2.338e+01  -0.060 0.952514    
targetcountryLebanon               5.028e+00  2.221e+00   2.263 0.023635 *  
targetcountryLibya                -7.774e-02  2.342e+00  -0.033 0.973517    
targetcountryMali                  2.108e+00  3.095e+00   0.681 0.495883    
targetcountryMauritania           -1.466e+00  2.338e+01  -0.063 0.950028    
targetcountryMontenegro           -1.863e-01  2.339e+01  -0.008 0.993644    
targetcountryMorocco              -2.815e+00  4.324e+00  -0.651 0.514994    
targetcountryNiger                 3.387e+00  3.942e+00   0.859 0.390268    
targetcountryNigeria               3.149e+00  1.145e+00   2.750 0.005964 ** 
targetcountryPakistan             -2.452e-01  9.883e-01  -0.248 0.804033    
targetcountryPalestine            -1.231e+00  2.519e+00  -0.489 0.625135    
targetcountryPanama                2.056e+01  2.340e+01   0.879 0.379598    
targetcountryPhilippines          -9.753e+00  7.819e+00  -1.247 0.212314    
targetcountryQatar                -4.063e+00  2.339e+01  -0.174 0.862062    
targetcountryRussia               -2.784e+00  2.177e+00  -1.279 0.200949    
targetcountrySaudi Arabia         -9.791e+00  3.693e+00  -2.652 0.008025 ** 
targetcountrySerbia               -6.023e-01  2.338e+01  -0.026 0.979453    
targetcountrySomalia               5.227e+00  1.486e+00   3.518 0.000436 ***
targetcountrySouth Sudan          -4.145e+00  2.338e+01  -0.177 0.859321    
targetcountrySpain                -5.176e-01  8.862e+00  -0.058 0.953424    
targetcountrySri Lanka             1.106e+01  1.568e+00   7.052 1.87e-12 ***
targetcountrySweden               -1.060e+00  2.338e+01  -0.045 0.963845    
targetcountrySyria                 5.408e+00  9.466e-01   5.713 1.14e-08 ***
targetcountryTajikistan           -5.988e+00  1.170e+01  -0.512 0.608880    
targetcountryTanzania             -1.863e+01  2.340e+01  -0.797 0.425748    
targetcountryTunisia               9.927e-01  6.773e+00   0.147 0.883489    
targetcountryTurkey               -1.263e+01  2.720e+00  -4.643 3.49e-06 ***
targetcountryUganda                5.202e+01  1.654e+01   3.145 0.001663 ** 
targetcountryUkraine              -1.031e-01  1.170e+01  -0.009 0.992968    
targetcountryUnited Kingdom       -6.721e+01  1.047e+01  -6.417 1.45e-10 ***
targetcountryUnited States         2.300e+02  6.356e+00  36.182  < 2e-16 ***
targetcountryUzbekistan            9.591e-02  9.566e+00   0.010 0.992001    
targetcountryYemen                 4.178e+00  1.635e+00   2.555 0.010628 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 23.38 on 9955 degrees of freedom
Multiple R-squared:  0.9647,    Adjusted R-squared:  0.9644 
F-statistic:  4382 on 62 and 9955 DF,  p-value: < 2.2e-16

Linear Regression Analysis

The multiple linear regression model predicts the number of people killed in suicide attacks based on the year, the number of wounded, and the target country.
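Because target country is a categorical predictor, lm() represents it with one coefficient per country, each measured against a reference country that is absorbed into the intercept. A minimal sketch of that behavior on made-up data (the `toy` data frame and the country labels "A" and "B" are hypothetical):

```r
# Hypothetical toy data: three attacks in each of two made-up countries
toy <- data.frame(
  killed  = c(1, 2, 3, 10, 11, 12),
  country = c("A", "A", "A", "B", "B", "B")
)

# lm() treats the character column as a factor; the first level ("A") is the reference
fit_toy <- lm(killed ~ country, data = toy)
coefs <- coef(fit_toy)
print(coefs)  # intercept = mean of A; countryB = mean of B minus mean of A

# Choosing another reference level changes which country the intercept describes
toy$country <- relevel(factor(toy$country), ref = "B")
coefs_b <- coef(lm(killed ~ country, data = toy))
print(coefs_b)
```

In the summary above, Afghanistan appears in the data but has no coefficient row, which suggests it is the reference country for fit1.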

Model Equation

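A simple linear regression has the form y = ax + b. Because I am using multiple regression, the equation becomes y = a + b1x1 + b2x2 + (country term). So, in my case,

statistics_ killed_high = a + b1(dateyear) + b2(statistics_ wounded_high) + (targetcountry coefficient)

a = intercept
b1 = how deaths change when the year increases by one (slope for year)
b2 = how deaths change for each additional person wounded (slope for wounded)

The targetcountry coefficient shifts the prediction up or down for each country. It is zero for the reference country, Afghanistan, which is the level left out of the coefficient table. So for the reference country the final equation is

statistics_ killed_high = 21.23 - 0.01043(dateyear) + 0.3951(statistics_ wounded_high)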
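To see what the equation implies, here is a small worked prediction using the rounded coefficients from the summary. This is only a sketch: `predict_killed()` is a hypothetical helper, and it assumes the intercept describes the reference country (presumably Afghanistan, which has no row in the coefficient table), with the Syria coefficient added as a shift.

```r
# Coefficients copied (rounded) from the summary(fit1) output
intercept <- 21.23
b_year    <- -0.01043
b_wounded <- 0.3951
b_syria   <- 5.408    # targetcountrySyria

# Hypothetical helper: predicted deaths for a single attack
predict_killed <- function(year, wounded, country_shift = 0) {
  intercept + b_year * year + b_wounded * wounded + country_shift
}

pred_ref   <- predict_killed(2015, 8)            # reference country
pred_syria <- predict_killed(2015, 8, b_syria)   # same attack in Syria

print(c(reference = pred_ref, syria = pred_syria))
```

The first row of the data is a 2015 attack in Syria with 8 wounded and 5 killed; the model predicts roughly 8.8 deaths for it. On the fitted model itself, predict(fit1, newdata = ...) would give the same kind of result with unrounded coefficients.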

P-value and Adjusted R-squared analysis

The p-value for statistics_ wounded_high (<2e-16) indicates that the number of wounded is a strong, statistically significant predictor of deaths. The variable dateyear has a high p-value (0.844746), suggesting that the year has little detectable effect on the number of deaths.

The adjusted R-squared is 0.9644, meaning the model explains 96.4% of the variation in the number of deaths. The overall model is statistically significant (F-statistic p-value < 2.2e-16).
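These numbers can also be pulled out of the summary object instead of being read off the printout. A minimal sketch using the built-in mtcars data as a stand-in (the model `fit_demo` is only an illustration, not part of this analysis); the same calls should work on summary(fit1).

```r
# Stand-in model on a built-in data set
fit_demo <- lm(mpg ~ wt + hp, data = mtcars)
s <- summary(fit_demo)

# Adjusted R-squared as a single number
adj_r2 <- s$adj.r.squared

# p-values for every coefficient, as a named vector
p_values <- coef(s)[, "Pr(>|t|)"]

# Names of the terms that are significant at the 5% level
significant <- names(p_values)[p_values < 0.05]

print(round(adj_r2, 4))
print(significant)
```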

Diagnostic plots

Residuals vs Fitted plot: the blue line is slightly curved, suggesting that the relationship may not be perfectly linear and that a few points deviate from the pattern.

Normal Q-Q plot: most points follow the diagonal line, meaning the residuals are approximately normal, but a few outliers (such as observations 367, 4527, and 8801) deviate from normality.

Scale-Location plot: the spread of the residuals increases with the fitted values, indicating mild heteroscedasticity.

Cook’s Distance: a few influential points (367, 1609, and 8801) may affect the model’s results.
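Influential points can also be listed numerically with cooks.distance(). A minimal sketch on simulated data (everything in `toy` is made up, including one deliberately extreme attack added as row 51); on the real model the same call would be cooks.distance(fit1).

```r
set.seed(1)

# Simulated attacks: deaths are roughly 0.4 times the number wounded
toy <- data.frame(wounded = rpois(50, 10))
toy$killed <- pmax(0, round(0.4 * toy$wounded + rnorm(50)))

# One hypothetical extreme attack, far outside the rest of the data
toy <- rbind(toy, data.frame(wounded = 4000, killed = 3000))

fit_toy <- lm(killed ~ wounded, data = toy)

# Cook's distance for every observation, largest first
cd <- cooks.distance(fit_toy)
top3 <- order(cd, decreasing = TRUE)[1:3]
print(top3)

# Refit without the most influential point to see how much the slope moves
fit_trim <- lm(killed ~ wounded, data = toy[-which.max(cd), ])
print(c(all_rows = coef(fit_toy)[["wounded"]],
        trimmed  = coef(fit_trim)[["wounded"]]))
```

Comparing the two slopes is one simple way to judge whether a handful of attacks is driving the fit.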

Citation

https://rpubs.com/rsaidi/950425

Grouping the data by country

suicide_country_grouped <- suicide_country |>
  group_by(targetcountry) |>
  summarize(avg_wounded = mean(`statistics_ wounded_high`),
            avg_killed = mean(`statistics_ killed_high`)) |>
  arrange(desc(avg_killed))

Top 5 by average deaths per attack

top5_countries <- suicide_country_grouped |>
  slice_max(order_by = avg_killed, n=5)

top5_countries

# A tibble: 5 × 3
  targetcountry avg_wounded avg_killed
  <chr>               <dbl>      <dbl>
1 United States      3781       1724. 
2 Argentina           200         85  
3 Uganda               60         76  
4 Kenya               686.        39  
5 France               40.9       34.8
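An average can be dominated by a country with only one or two attacks, so it may be worth reporting the number of attacks next to the mean. A minimal sketch on made-up data (the `toy` tibble, the country labels, and the column name `killed` are hypothetical), assuming the dplyr package is available:

```r
library(dplyr)

# Hypothetical toy data: country "A" has a single, very deadly attack
toy <- tibble(
  targetcountry = c("A", "B", "B", "B", "C", "C"),
  killed        = c(85, 2, 4, 6, 10, 30)
)

country_summary <- toy |>
  group_by(targetcountry) |>
  summarize(n_attacks    = n(),          # how many attacks the average is based on
            avg_killed   = mean(killed),
            total_killed = sum(killed)) |>
  arrange(desc(avg_killed))

print(country_summary)
```

Here "A" ranks first by average deaths even though it has only one attack, which is the kind of pattern the attack count makes visible.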

Filtering data for the top 5 countries

suicide_top5 <- suicide_country |>
  filter(targetcountry %in% top5_countries$targetcountry)

Plot 1

ggplot(suicide_top5, aes(x = factor(dateyear), ## factor(dateyear) makes each year a separate category on the x axis
                         y = `statistics_ killed_high`,
                         fill = targetcountry)) +
  geom_col(position = position_dodge(width = 0.9), width = 0.5) +
  labs(title = "Yearly Suicide Attack Fatalities (Top 5 Countries)",
       x = "Year",
       y = "Number of People Killed",
       fill = "Country",
       caption = "Source: CPOST Suicide Attack Database (2020)") +
  theme_minimal() +
  scale_fill_brewer(palette = "Set2")

Removing the U.S. because it is an outlier

suicide_no_us <- suicide_top5 |>
  filter(targetcountry != "United States")

Plot 2 - Final plot (for grading)

ggplot(suicide_no_us, aes(x = factor(dateyear), y = `statistics_ killed_high`, fill = targetcountry)) +
  geom_col(position = position_dodge(width = 0.5), width = 0.6) +
  labs(title = "Yearly Suicide Attack Fatalities (Top 4 Countries, excluding U.S.)",
       x = "Year",
       y = "Number of People Killed",
       fill = "Country",
       caption = "Source: CPOST Suicide Attack Database (2020)") +
  theme_light() +
  scale_fill_brewer(palette = "Set1")
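Each row of the data is one attack, so a year with several attacks in the same country contributes several bars; with position_dodge() these may be drawn on top of one another rather than added up. A minimal sketch of summing deaths per year and country before plotting, on made-up data (the `toy` tibble, the labels "A" and "B", and the column `killed` are hypothetical), assuming dplyr and ggplot2 are available:

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data: two attacks share the same year and country
toy <- tibble(
  dateyear      = c(1998, 1998, 2015, 2015, 2016),
  targetcountry = c("A", "A", "B", "B", "B"),
  killed        = c(200, 13, 90, 40, 1)
)

# One row per year and country, holding the yearly total
yearly <- toy |>
  group_by(dateyear, targetcountry) |>
  summarize(total_killed = sum(killed), .groups = "drop")

p <- ggplot(yearly, aes(x = factor(dateyear), y = total_killed, fill = targetcountry)) +
  geom_col(position = position_dodge(width = 0.9)) +
  labs(x = "Year", y = "Number of People Killed", fill = "Country")

print(yearly)
```

Printing `p` draws the chart; each bar then shows the yearly total for one country.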

Plot 3 - trivial one

ggplot(suicide_no_us,
       aes(x = factor(dateyear), y = targetcountry, fill = `statistics_ killed_high`)) +
  geom_tile() +
  scale_fill_gradient(low = "#87CEEB", high = "#36648B")  +
  labs(title = "Heatmap of Deaths by Country and Year",
       x = "Year",
       y = "Country",
       fill = "Deaths",
       caption = "Source: CPOST Suicide Attack Database (2020)") +
  theme_light()

Citation

https://r-charts.com/colors/

Essay

a. Data Cleaning Process

To prepare my dataset for analysis, I first cleaned the column names with the gsub() function, replacing “#” with an underscore and removing “.”. This made the variable names more consistent and easier to reference in R. After that, I filtered out missing values (NAs) from the main variables needed for my analysis: dateyear, statistics_ killed_high, statistics_ wounded_high, and targetcountry.

Then, I selected only these columns because they were directly related to my research question. I also checked the data structure to confirm that the numeric and categorical variables were correctly formatted. Finally, I summarized and grouped the data by country to identify which countries had the highest average fatalities. These cleaning steps allowed me to create a clear, error-free dataset that was ready for both visualization and regression analysis.

b. Visualization Interpretation

My final visualization is a bar chart showing yearly suicide attack fatalities for the top 4 countries, excluding the United States because it is an extreme outlier. Each bar represents the number of people killed in a given year, and each color corresponds to a different country. The chart makes it easy to compare how suicide attack fatalities vary across both time and location.

For example, Kenya shows a large spike in 1998, which aligns with the historical U.S. Embassy bombing in Nairobi that year. Argentina, France, and Uganda also show individual years with notable death counts, suggesting that these countries experienced fewer but highly impactful suicide attacks. The visualization highlights how these incidents are often concentrated in specific years rather than being evenly distributed over time.

Citation

https://www.fbi.gov/history/famous-cases/east-african-embassy-bombings

c. Challenges and Improvements

One challenge I faced was that many of the columns contained a lot of zeros, which likely represented years or countries where no attacks occurred. This made it harder to find strong patterns since most data points were zero. Another issue was that my first few visualizations were too cluttered: with so many countries, the plots were messy and difficult to read. I solved that by focusing only on the top countries with the highest fatalities.

I also wanted to create a visualization comparing killed vs wounded, but it didn’t work out as I planned. I tried a few different chart types, including scatter plots and box plots, but they either looked confusing or didn’t show clear relationships. I also wanted to make a box plot of the top 10 countries’ deaths by year, but it didn’t display properly in R. If I had more time, I would explore improving those visualizations and use tools like Plotly to make the graphs more interactive and insightful. Despite these challenges, my final visualization successfully illustrates the main patterns and answers my research question.
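One possible way to make a killed vs wounded scatter plot readable is to plot both counts on a log scale, using log1p() so that zeros stay defined. This is only a sketch on made-up data (the `toy` data frame and its column names are hypothetical), assuming ggplot2 is available; I have not tried it on the full dataset.

```r
library(ggplot2)

# Hypothetical toy data: mostly small attacks, a few very large ones, some zeros
toy <- data.frame(
  wounded = c(0, 2, 8, 15, 120, 686, 4000),
  killed  = c(0, 0, 5, 10, 40, 39, 1700)
)

# log1p(x) = log(1 + x), so attacks with zero casualties are kept
toy$log_wounded <- log1p(toy$wounded)
toy$log_killed  <- log1p(toy$killed)

p <- ggplot(toy, aes(x = log_wounded, y = log_killed)) +
  geom_point(alpha = 0.6) +
  labs(x = "log(1 + wounded)", y = "log(1 + killed)",
       title = "Killed vs Wounded (log scale)")

print(range(toy$log_killed))
```

Printing `p` draws the plot. On the log scale the extreme attacks no longer squash the rest of the points into one corner.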