Gunia, A. (2019, April 25). “The Birthplace of the Suicide Belt.” Sri Lanka’s Deadly History of Suicide Bombings. Time; Time. https://time.com/5575956/sri-lanka-history-suicide-bombings-birthplace-invented/

Introduction

This dataset covers suicide attacks and comes from the Chicago Project on Security and Terrorism (CPOST) at the University of Chicago. CPOST maintains a searchable database of all known suicide attacks across the globe from 1982 to October 2020. The dataset includes 10,018 observations and 39 variables. I chose this topic because I wanted to see which factors make a suicide attack more harmful. This led to my question: what characteristics of a suicide attack make it more deadly? The variables I intend to use to answer this question are the dataset's low and high estimates of the number of individuals killed, the type of weapon used by the attacker, the type of target attacked, the subregion of the attack, the number of attackers involved, the year of the attack, and the month of the attack.

Variables

statistics.# killed_low, low estimate of the number of deaths caused by the attack

statistics.# killed_high, high estimate of the number of deaths caused by the attack

target.weapon, the weapon used by the attacker

target.type, the target of the attack (civilian, political, security)

target.subregion, the subregion of the attack (e.g., Southern Asia, Western Asia)

statistics.# attackers, the number of attackers involved in the attack

date.year, the year the attack occurred

date.month, the month the attack took place

Load Libraries and Data

library(tidyverse)
library(highcharter)
library(RColorBrewer)
library(ggthemes)
library(GGally)

setwd("C:/Users/wesle/Downloads")
sads <- read_csv("suicide_attacks.csv")

Data Cleaning

colSums(is.na(sads)) # checking the dataset for any NAs
##                             groups                              claim 
##                                  0                                  0 
##                             status                 statistics.sources 
##                                  0                                  0 
##                          date.year                         date.month 
##                                  0                                  0 
##                           date.day           statistics.# wounded_low 
##                                  0                                  0 
##          statistics.# wounded_high            statistics.# killed_low 
##                                  0                                  0 
##           statistics.# killed_high   statistics.# killed_low_civilian 
##                                  0                                  0 
##  statistics.# killed_high_civilian  statistics.# killed_low_political 
##                                  0                                  0 
## statistics.# killed_high_political   statistics.# killed_low_security 
##                                  0                                  0 
##  statistics.# killed_high_security             statistics.# belt_bomb 
##                                  0                                  0 
##            statistics.# truck_bomb              statistics.# car_bomb 
##                                  0                                  0 
##            statistics.# weapon_oth            statistics.# weapon_unk 
##                                  0                                  0 
##                      target.weapon                      target.region 
##                                  0                                  0 
##                   target.subregion                     target.country 
##                                  0                                  0 
##                    target.province                        target.city 
##                                  0                                  0 
##                    target.location                    target.latitude 
##                                  0                                  0 
##                  target.longtitude                        target.desc 
##                                  0                                  0 
##                        target.type                 target.nationality 
##                                  0                                  0 
##             statistics.# attackers      statistics.# female_attackers 
##                                  0                                  0 
##        statistics.# male_attackers     statistics.# unknown_attackers 
##                                  0                                  0 
##                    attacker.gender 
##                                  0
names(sads) <- gsub("\\.","_",names(sads)) # replace the . that separated words with _ to keep names consistent throughout the dataset
names(sads) <- gsub(" ","_",names(sads)) # replace the spaces that separated parts of a variable's name with _ for the same reason
names(sads) <- gsub("#","num",names(sads)) # replace the # that stands for "number" with num to make names easier to understand at first glance

sads1 <- sads |>
  filter(statistics_num_killed_low >= 0) # several low and high estimates contain -1, which most likely marks a missing value, since an attack cannot cause a negative number of deaths

There were no NAs in the dataset; however, both the low and high death estimates used -1 as a placeholder for a missing value. I removed the observations with this placeholder (filtering on the low estimate), since an attack cannot cause a negative number of deaths. This should not have much impact on later work with the dataset, as it has 10,018 observations and the placeholders accounted for only 37 of them, leaving 9,981. I also changed the column names, mainly for consistency. The original names separated words with a space, a period, or an underscore; I changed them to use an underscore in all instances. I also replaced the # that stands for "number" in variable names such as the low and high estimates with num, so the names are easier to understand at first glance.
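As a quick check on this cleaning step, the sketch below counts the -1 placeholders in both estimate columns before any rows are dropped, mirroring the `colSums(is.na(sads))` check above. It runs on a small made-up data frame (`toy`), not the real data; only the column names follow the renamed dataset.

```r
# Hypothetical toy data; -1 stands in for a missing estimate, as in the real dataset
toy <- data.frame(
  statistics_num_killed_low  = c(3, -1, 0, 12, -1),
  statistics_num_killed_high = c(5, -1, 0, 15, 2)
)

# Count the -1 placeholders in each estimate column
placeholder_counts <- colSums(toy == -1)
print(placeholder_counts)

# Keep only the rows where both estimates are non-negative
toy_clean <- toy[toy$statistics_num_killed_low >= 0 &
                 toy$statistics_num_killed_high >= 0, ]
print(nrow(toy_clean))
```

Checking both columns this way would show whether the placeholder always appears in the low and high estimates together, or whether filtering on one column could leave a -1 behind in the other.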

Visualization(s)

1

colors <- c("red", "orange", "yellow", "green", "blue", "purple", "pink", "hotpink", "black", "maroon", "mediumseagreen", "dodgerblue", "aquamarine", "skyblue", "violet", "orangered", "firebrick") # Colors List

sadsawh <- sads1 |>
  group_by(target_weapon) |>
  summarize(highnumavg = round(mean(statistics_num_killed_high))) # grouping by weapon type and taking the average of the high estimated number killed to make it easier to create the visualization

highchart() |>
  hc_add_series(data = sadsawh, type = "column", hcaes(x = target_weapon, y = highnumavg, group = target_weapon)) |>
  hc_title(text = "Average High Estimate of the Number of Individuals Killed vs Attacker Weapon Type") |>
  hc_caption(text = "University of Chicago, Chicago Project on Security and Terrorism (CPOST)") |>
  hc_xAxis(title = list(text = "Attacker Weapon Type")) |>
  hc_yAxis(title = list(text = "Average High Estimate of the Number of Individuals Killed")) |>
  hc_colors(colors) |>
  hc_add_theme(hc_theme_darkunica())

This graph shows only the high estimates of the number of deaths, but it supports the idea that weapon type is a factor in how many deaths an attack causes.

2

sadsawl <- sads1 |>
  group_by(target_weapon) |>
  summarize(lownumavg = round(mean(statistics_num_killed_low))) # grouping by weapon type and taking the average of the low estimated number killed to make it easier to create the visualization

highchart() |>
  hc_add_series(data = sadsawl, type = "column", hcaes(x = target_weapon, y = lownumavg, group = target_weapon)) |>
  hc_title(text = "Average Low Estimate of the Number of Individuals Killed vs Attacker Weapon Type") |>
  hc_caption(text = "University of Chicago, Chicago Project on Security and Terrorism (CPOST)") |>
  hc_xAxis(title = list(text = "Attacker Weapon Type")) |>
  hc_yAxis(title = list(text = "Average Low Estimate of the Number of Individuals Killed")) |>
  hc_colors(colors) |>
  hc_add_theme(hc_theme_superheroes())

Both visualizations are similar, which is expected, as both look at the number of deaths caused by an attack: the first uses the high estimate and the second uses the low estimate. There is a slight difference between the two, as one would expect from a high and a low estimate, but the pattern across weapon types is the same. In both, animal bombs have the lowest average and airplanes the highest by a wide margin. This suggests a relationship between the number of individuals killed and the type of weapon used, which is useful in answering the question in the introduction. A possible reason for airplanes having such a high average is that the data include the 9/11 attacks in New York, which fall within the 1982 to 2020 range the data cover. A later Tableau visualization supports this, as it shows an airplane attack in the US that caused a large number of deaths.
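Because a single extreme attack can pull a group average upward, it may help to compare the mean with the median for each weapon type. The sketch below does this on a small made-up data frame (`toy`); the values are illustrative only and are not taken from the CPOST data.

```r
library(dplyr)

# Hypothetical toy data: one extreme attack dominates the "Airplane" group
toy <- data.frame(
  target_weapon = c("Airplane", "Airplane", "Airplane",
                    "Car bomb", "Car bomb", "Car bomb"),
  statistics_num_killed_high = c(1500, 20, 10, 12, 8, 10)
)

# Mean and median of the high estimate for each weapon type
by_weapon <- toy |>
  group_by(target_weapon) |>
  summarize(
    mean_killed   = mean(statistics_num_killed_high),
    median_killed = median(statistics_num_killed_high)
  )

print(by_weapon)
```

In this toy example the airplane mean is far above its median, while the car bomb mean and median agree. A similar gap in the real data would be consistent with the explanation that one attack drives the airplane average.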

Multiple Linear Regression

High

mlrmh <- lm(statistics_num_killed_high ~ target_weapon + target_type + target_subregion + statistics_num_attackers + date_year + date_month, data = sads1)

summary(mlrmh)
## 
## Call:
## lm(formula = statistics_num_killed_high ~ target_weapon + target_type + 
##     target_subregion + statistics_num_attackers + date_year + 
##     date_month, data = sads1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1797.19    -8.46    -2.36     5.45   902.51 
## 
## Coefficients:
##                                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                         2587.8266   294.5619   8.785   <2e-16 ***
## target_weaponAnimal bomb           -1837.1517    62.5406 -29.375   <2e-16 ***
## target_weaponBackpack bomb         -1827.6357    52.4744 -34.829   <2e-16 ***
## target_weaponBelt bomb             -1827.4068    51.6916 -35.352   <2e-16 ***
## target_weaponBoat bomb             -1821.4653    51.9091 -35.090   <2e-16 ***
## target_weaponCar bomb              -1825.8123    51.6800 -35.329   <2e-16 ***
## target_weaponCart bomb             -1824.2801    55.2664 -33.009   <2e-16 ***
## target_weaponMixed                 -1825.1668    52.0971 -35.034   <2e-16 ***
## target_weaponMotorcycle bomb       -1830.1434    51.8057 -35.327   <2e-16 ***
## target_weaponNon-suicide IED       -1836.9574    71.9185 -25.542   <2e-16 ***
## target_weaponOther PBIED           -1832.7142    51.6637 -35.474   <2e-16 ***
## target_weaponOther VBIED           -1823.7340    52.9828 -34.421   <2e-16 ***
## target_weaponScuba bomb            -1838.8717    53.8733 -34.133   <2e-16 ***
## target_weaponTruck bomb            -1816.2487    51.7556 -35.093   <2e-16 ***
## target_weaponTurban bomb           -1816.7367    65.8456 -27.591   <2e-16 ***
## target_weaponUnspecified           -1828.8970    51.8009 -35.306   <2e-16 ***
## target_weaponUnspecified PBIED     -1830.9245    51.8372 -35.321   <2e-16 ***
## target_typePolitical                 -33.1355     2.7221 -12.173   <2e-16 ***
## target_typeSecurity                  -18.9712     1.8517 -10.245   <2e-16 ***
## target_typeUnknown                   -19.3273    35.4881  -0.545   0.5860    
## target_subregionCentral Asia           5.7659    73.4332   0.079   0.9374    
## target_subregionEastern Africa        11.9441    70.9199   0.168   0.8663    
## target_subregionEastern Asia          -1.2571    72.0374  -0.017   0.9861    
## target_subregionEastern Europe         6.3098    71.0301   0.089   0.9292    
## target_subregionMiddle Africa         -6.0390    71.0017  -0.085   0.9322    
## target_subregionNorthern Africa        3.0322    70.9252   0.043   0.9659    
## target_subregionNorthern America     -13.7132    86.6990  -0.158   0.8743    
## target_subregionNorthern Europe       -5.1706    75.2581  -0.069   0.9452    
## target_subregionSouth-Eastern Asia     4.2347    71.5865   0.059   0.9528    
## target_subregionSouth America         39.2840    79.0576   0.497   0.6193    
## target_subregionSouthern Asia          7.8231    70.7855   0.111   0.9120    
## target_subregionSouthern Europe       -6.1683    74.1888  -0.083   0.9337    
## target_subregionWestern Africa        -0.5815    70.8253  -0.008   0.9934    
## target_subregionWestern Asia           4.5067    70.7803   0.064   0.9492    
## target_subregionWestern Europe         9.1847    73.6682   0.125   0.9008    
## statistics_num_attackers               1.6098     0.1928   8.349   <2e-16 ***
## date_year                             -0.3705     0.1426  -2.598   0.0094 ** 
## date_month                             0.1915     0.2146   0.892   0.3723    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 70.61 on 9943 degrees of freedom
## Multiple R-squared:  0.6779, Adjusted R-squared:  0.6767 
## F-statistic: 565.6 on 37 and 9943 DF,  p-value: < 2.2e-16

Low

mlrml <- lm(statistics_num_killed_low ~ target_weapon + target_type + target_subregion + statistics_num_attackers + date_year + date_month, data = sads1)

summary(mlrml)
## 
## Call:
## lm(formula = statistics_num_killed_low ~ target_weapon + target_type + 
##     target_subregion + statistics_num_attackers + date_year + 
##     date_month, data = sads1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1798.35    -7.03    -1.78     4.92   902.91 
## 
## Coefficients:
##                                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                         2560.6451   291.4103   8.787   <2e-16 ***
## target_weaponAnimal bomb           -1835.8772    61.8715 -29.672   <2e-16 ***
## target_weaponBackpack bomb         -1829.5225    51.9129 -35.242   <2e-16 ***
## target_weaponBelt bomb             -1828.3145    51.1385 -35.752   <2e-16 ***
## target_weaponBoat bomb             -1824.8424    51.3537 -35.535   <2e-16 ***
## target_weaponCar bomb              -1826.4898    51.1270 -35.725   <2e-16 ***
## target_weaponCart bomb             -1828.6661    54.6751 -33.446   <2e-16 ***
## target_weaponMixed                 -1829.4242    51.5397 -35.495   <2e-16 ***
## target_weaponMotorcycle bomb       -1829.8416    51.2514 -35.703   <2e-16 ***
## target_weaponNon-suicide IED       -1835.1306    71.1490 -25.793   <2e-16 ***
## target_weaponOther PBIED           -1831.9166    51.1109 -35.842   <2e-16 ***
## target_weaponOther VBIED           -1824.5932    52.4159 -34.810   <2e-16 ***
## target_weaponScuba bomb            -1839.3827    53.2969 -34.512   <2e-16 ***
## target_weaponTruck bomb            -1818.6847    51.2018 -35.520   <2e-16 ***
## target_weaponTurban bomb           -1815.9827    65.1411 -27.878   <2e-16 ***
## target_weaponUnspecified           -1828.9980    51.2466 -35.690   <2e-16 ***
## target_weaponUnspecified PBIED     -1831.1880    51.2826 -35.708   <2e-16 ***
## target_typePolitical                 -32.1443     2.6930 -11.936   <2e-16 ***
## target_typeSecurity                  -17.1242     1.8319  -9.348   <2e-16 ***
## target_typeUnknown                   -15.8063    35.1084  -0.450   0.6526    
## target_subregionCentral Asia           4.5491    72.6475   0.063   0.9501    
## target_subregionEastern Africa         7.1702    70.1611   0.102   0.9186    
## target_subregionEastern Asia          -3.8177    71.2666  -0.054   0.9573    
## target_subregionEastern Europe         3.2975    70.2701   0.047   0.9626    
## target_subregionMiddle Africa         -7.9775    70.2420  -0.114   0.9096    
## target_subregionNorthern Africa        0.6006    70.1664   0.009   0.9932    
## target_subregionNorthern America     -14.2349    85.7713  -0.166   0.8682    
## target_subregionNorthern Europe       -4.7834    74.4529  -0.064   0.9488    
## target_subregionSouth-Eastern Asia     1.7268    70.8205   0.024   0.9805    
## target_subregionSouth America         38.0726    78.2117   0.487   0.6264    
## target_subregionSouthern Asia          4.2577    70.0282   0.061   0.9515    
## target_subregionSouthern Europe       -7.5425    73.3950  -0.103   0.9182    
## target_subregionWestern Africa        -4.0628    70.0675  -0.058   0.9538    
## target_subregionWestern Asia           1.1861    70.0230   0.017   0.9865    
## target_subregionWestern Europe         8.8245    72.8800   0.121   0.9036    
## statistics_num_attackers               1.5925     0.1908   8.348   <2e-16 ***
## date_year                             -0.3566     0.1411  -2.527   0.0115 *  
## date_month                             0.1508     0.2123   0.710   0.4775    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 69.85 on 9943 degrees of freedom
## Multiple R-squared:  0.6829, Adjusted R-squared:  0.6817 
## F-statistic: 578.6 on 37 and 9943 DF,  p-value: < 2.2e-16

For both the high and low estimates, the overall multiple linear regression model was significant, with a p-value of < 2.2e-16, which is much lower than the default α of 0.05. Both models also had similar adjusted R-squared values. The low estimate model was slightly higher at 0.6817, meaning it explained 68.17% of the variance in the low estimate of the number of individuals killed in an attack, while the high estimate model had a value of 0.6767, meaning it explained 67.67% of the variance in the high estimate. Most of the terms were also significant with a p-value of < 2e-16: every weapon type, the political and security target types, and the number of attackers. The exceptions were the following. Year was still significant, with a p-value of 0.0094 in the high estimate model and 0.0115 in the low estimate model. Month was not significant in either model, with a p-value of 0.3723 in the high estimate model and 0.4775 in the low estimate model. The unknown target type was not significant either, with p-values of 0.5860 and 0.6526. Subregion was not significant at all, with every subregion p-value in both models greater than 0.6.
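When reading the weapon coefficients, note that the output has no row for airplanes. R treats the first level of a categorical predictor (alphabetically, by default) as the reference, so each weapon coefficient is presumably a difference from the airplane category rather than an effect on its own. The sketch below illustrates this on a made-up data frame (`toy`, with hypothetical `weapon` and `killed` columns) and shows how `relevel()` changes the reference level.

```r
# Hypothetical toy data with three weapon types
toy <- data.frame(
  weapon = factor(c("Airplane", "Airplane", "Belt bomb",
                    "Belt bomb", "Car bomb", "Car bomb")),
  killed = c(100, 120, 8, 12, 15, 25)
)

# "Airplane" is the first level, so it becomes the reference category
fit <- lm(killed ~ weapon, data = toy)
print(coef(fit))
# (Intercept) is the Airplane mean (110);
# weaponBelt bomb (-100) and weaponCar bomb (-90) are differences from it

# Re-fit with "Belt bomb" as the reference category instead
toy$weapon2 <- relevel(toy$weapon, ref = "Belt bomb")
fit2 <- lm(killed ~ weapon2, data = toy)
print(coef(fit2))
# (Intercept) is now the Belt bomb mean (10);
# weapon2Airplane (100) and weapon2Car bomb (10) are differences from it
```

Seen this way, coefficients near -1830 for every weapon type say that each type averages roughly 1,830 fewer deaths than the reference category, which fits the earlier observation that airplanes stand far above the rest.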

High

mlrmh1 <- lm(statistics_num_killed_high ~ target_weapon + target_type + statistics_num_attackers + date_year, data = sads1)

summary(mlrmh1)
## 
## Call:
## lm(formula = statistics_num_killed_high ~ target_weapon + target_type + 
##     statistics_num_attackers + date_year, data = sads1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1799.81    -8.08    -2.42     5.10   903.50 
## 
## Coefficients:
##                                  Estimate Std. Error  t value Pr(>|t|)    
## (Intercept)                     2696.2224   274.6473    9.817  < 2e-16 ***
## target_weaponAnimal bomb       -1819.5012    37.5837  -48.412  < 2e-16 ***
## target_weaponBackpack bomb     -1811.1100    15.5423 -116.528  < 2e-16 ***
## target_weaponBelt bomb         -1808.7810    12.9078 -140.131  < 2e-16 ***
## target_weaponBoat bomb         -1801.2317    13.9085 -129.505  < 2e-16 ***
## target_weaponCar bomb          -1807.0376    12.9389 -139.659  < 2e-16 ***
## target_weaponCart bomb         -1805.7215    23.4405  -77.034  < 2e-16 ***
## target_weaponMixed             -1806.2302    14.4331 -125.145  < 2e-16 ***
## target_weaponMotorcycle bomb   -1810.4299    13.3670 -135.440  < 2e-16 ***
## target_weaponNon-suicide IED   -1816.7080    51.6025  -35.206  < 2e-16 ***
## target_weaponOther PBIED       -1813.6496    13.9245 -130.249  < 2e-16 ***
## target_weaponOther VBIED       -1805.5395    17.3651 -103.975  < 2e-16 ***
## target_weaponScuba bomb        -1818.3026    20.0395  -90.736  < 2e-16 ***
## target_weaponTruck bomb        -1797.7196    13.1906 -136.288  < 2e-16 ***
## target_weaponTurban bomb       -1796.6113    42.7809  -41.996  < 2e-16 ***
## target_weaponUnspecified       -1810.9341    13.3062 -136.097  < 2e-16 ***
## target_weaponUnspecified PBIED -1813.4302    13.4250 -135.079  < 2e-16 ***
## target_typePolitical             -30.5489     2.6182  -11.668  < 2e-16 ***
## target_typeSecurity              -17.5360     1.7708   -9.903  < 2e-16 ***
## target_typeUnknown               -16.0804    35.4691   -0.453  0.65030    
## statistics_num_attackers           1.5847     0.1916    8.270  < 2e-16 ***
## date_year                         -0.4311     0.1372   -3.143  0.00168 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 70.61 on 9959 degrees of freedom
## Multiple R-squared:  0.6774, Adjusted R-squared:  0.6767 
## F-statistic: 995.7 on 21 and 9959 DF,  p-value: < 2.2e-16

Low

mlrml1 <- lm(statistics_num_killed_low ~ target_weapon + target_type + statistics_num_attackers + date_year, data = sads1)

summary(mlrml1)
## 
## Call:
## lm(formula = statistics_num_killed_low ~ target_weapon + target_type + 
##     statistics_num_attackers + date_year, data = sads1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1800.53    -6.72    -1.72     4.29   903.81 
## 
## Coefficients:
##                                  Estimate Std. Error  t value Pr(>|t|)    
## (Intercept)                     2670.0568   271.6769    9.828  < 2e-16 ***
## target_weaponAnimal bomb       -1821.0214    37.1772  -48.982  < 2e-16 ***
## target_weaponBackpack bomb     -1815.0216    15.3742 -118.056  < 2e-16 ***
## target_weaponBelt bomb         -1812.4285    12.7682 -141.949  < 2e-16 ***
## target_weaponBoat bomb         -1807.5518    13.7581 -131.381  < 2e-16 ***
## target_weaponCar bomb          -1810.4902    12.7990 -141.456  < 2e-16 ***
## target_weaponCart bomb         -1813.0651    23.1870  -78.193  < 2e-16 ***
## target_weaponMixed             -1813.3082    14.2770 -127.009  < 2e-16 ***
## target_weaponMotorcycle bomb   -1812.9353    13.2224 -137.111  < 2e-16 ***
## target_weaponNon-suicide IED   -1817.6555    51.0444  -35.609  < 2e-16 ***
## target_weaponOther PBIED       -1815.6155    13.7739 -131.816  < 2e-16 ***
## target_weaponOther VBIED       -1809.2394    17.1773 -105.327  < 2e-16 ***
## target_weaponScuba bomb        -1821.8262    19.8227  -91.906  < 2e-16 ***
## target_weaponTruck bomb        -1802.8981    13.0479 -138.175  < 2e-16 ***
## target_weaponTurban bomb       -1798.6968    42.3182  -42.504  < 2e-16 ***
## target_weaponUnspecified       -1813.7201    13.1623 -137.796  < 2e-16 ***
## target_weaponUnspecified PBIED -1816.3554    13.2798 -136.776  < 2e-16 ***
## target_typePolitical             -29.7622     2.5899  -11.492  < 2e-16 ***
## target_typeSecurity              -15.7539     1.7516   -8.994  < 2e-16 ***
## target_typeUnknown               -12.7371    35.0855   -0.363  0.71659    
## statistics_num_attackers           1.5698     0.1895    8.282  < 2e-16 ***
## date_year                         -0.4181     0.1357   -3.082  0.00206 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 69.84 on 9959 degrees of freedom
## Multiple R-squared:  0.6824, Adjusted R-squared:  0.6818 
## F-statistic:  1019 on 21 and 9959 DF,  p-value: < 2.2e-16

Removing both month and subregion had almost no effect on either the high or the low model. The only difference that appeared was an increase of 0.0001 in the low model's adjusted R-squared value (from 0.6817 to 0.6818), meaning that the model without month and subregion explains 0.01 percentage points more of the variance in the low estimate of deaths in an attack.
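The comparison above relies on adjusted R-squared. Another common way to check whether dropped terms mattered, not used in this analysis, is a partial F-test between the nested models with `anova()`. The sketch below shows the idea on simulated data; the data frame `toy` and its columns are made up, and `month` is deliberately generated to have no effect on the response.

```r
set.seed(1)

# Simulated toy data: deaths depend on the number of attackers but not on month
n <- 200
toy <- data.frame(
  attackers = rpois(n, 2) + 1,
  month     = sample(1:12, n, replace = TRUE)
)
toy$killed <- 5 + 3 * toy$attackers + rnorm(n, sd = 4)

# Full model with month, reduced model without it
full    <- lm(killed ~ attackers + month, data = toy)
reduced <- lm(killed ~ attackers, data = toy)

# Partial F-test: does adding month explain significantly more variance?
comparison <- anova(reduced, full)
print(comparison)
```

Applied to the models here, the equivalent call would be `anova(mlrmh1, mlrmh)` or `anova(mlrml1, mlrml)`. A large p-value in the second row would support dropping month and subregion.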

Tableau Visualization