Rebecca Johnston
21/10/2013
I have chosen to complete my final Exploratory Data Analysis assignment using the Global Terrorism Database (GTD), available for download here. The GTD includes over 100 000 incidents of terrorism from 1970 to 2011 (N.B. incidents from 1993 are absent from the GTD because those records were lost).
Here I will display the figures generated by my analysis pipeline using R and the graphical package ggplot2. For my full code, and access to the data used herein, please visit my github repo.
Since there are over 100 variables in the GTD, I immediately restricted my analyses to 15 variables for the purposes of this assignment. In addition, I shortened some region names for ease of graphing, so here “MENA” means Middle East and North Africa.
First, I want to explore the total number of fatalities per incident over time using a scatter plot:
N.B. I have deliberately chosen not to show extreme outliers by manually specifying the y axis limits. Here, the extreme outliers are the September 11 attacks (2001), which were counted in the GTD as two separate incidents, each with 1381.5 fatalities.
The majority of terrorist incidents have a low number of fatalities, and I have added transparency to the points (which represent incidents) to convey this.
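A minimal sketch of how such a scatter plot could be built is shown here. It is not my original code: the data frame is a made-up stand-in for the GTD subset, the column name `iyear` is an assumption (`nkill` appears in the summary output), and the y limit of 500 and the alpha value are arbitrary placeholders.

```r
library(ggplot2)

# Toy stand-in for the GTD subset. All values are made up for illustration.
# Assumed columns: iyear (year of incident), nkill (fatalities per incident).
set.seed(1)
terr <- data.frame(
  iyear = sample(1970:2011, 200, replace = TRUE),
  nkill = rpois(200, lambda = 2)
)
# One artificial extreme outlier, so that the y axis limit has a visible effect.
terr <- rbind(terr, data.frame(iyear = 2001, nkill = 1000))

# alpha makes points semi-transparent, so dense clusters of low-fatality
# incidents appear darker than isolated points.
# coord_cartesian() zooms the y axis without removing rows from the data.
p <- ggplot(terr, aes(x = iyear, y = nkill)) +
  geom_point(alpha = 0.2) +
  coord_cartesian(ylim = c(0, 500)) +
  labs(x = "Year", y = "Fatalities per incident")
print(p)
```

Limits can also be set with `ylim()` or `scale_y_continuous(limits = ...)`, but those drop the out-of-range rows (with a warning) rather than simply zooming.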
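At first glance a boxplot might seem suitable for these data, but the inter-quartile range is only 1 (the first quartile is 0 and the third quartile is 1). The box would collapse to a thin sliver and most incidents with multiple fatalities would be drawn as outliers, so a boxplot is not appropriate given the spread of the data.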
summary(terr$nkill)
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 0 0 0 2 1 1380 6907
Since there are so many incidents to convey on one plot, we could use the ggplot2 function facet_wrap to separate the fatalities by region:
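As a rough sketch of the faceting step (again on made-up toy data, with `iyear` as an assumed column name and three placeholder region labels), `facet_wrap` takes a formula naming the variable to split on:

```r
library(ggplot2)

# Toy stand-in for the GTD subset. All values are made up for illustration.
set.seed(2)
terr <- data.frame(
  iyear      = sample(1970:2011, 300, replace = TRUE),
  nkill      = rpois(300, lambda = 2),
  region_txt = sample(c("Region A", "Region B", "Region C"), 300, replace = TRUE)
)

# facet_wrap(~ region_txt) draws one panel per region.
# By default all panels share the same x and y scales.
p_region <- ggplot(terr, aes(x = iyear, y = nkill)) +
  geom_point(alpha = 0.2) +
  facet_wrap(~ region_txt) +
  labs(x = "Year", y = "Fatalities per incident")
print(p_region)
```

Keeping the shared scales makes regions directly comparable; passing `scales = "free_y"` would instead let each panel choose its own y range.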
Now we can observe that some regions have very few fatalities per terrorist incident (e.g. Oceania and Central Asia), whilst other regions show quite a spread in the number of fatalities per incident (e.g. Sub-Saharan Africa and MENA). Let's compare this result to the number of individuals wounded per incident over time by region:
The differences between the number of fatalities and the number of individuals wounded by region appear to be subtle, but let's compare the two variables properly by plotting them on the same graph. To do this, I used data aggregation to find the total number of fatalities and the total number of individuals wounded by year and region. I then reshaped the data into tall format to allow grouping in ggplot2:
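The aggregate-then-reshape idea can be sketched in base R as follows. This is an illustration under assumptions rather than my original pipeline: the data are a tiny made-up example, and the column names `iyear` and `nwound` and the labels "Killed" and "Wounded" are assumed.

```r
library(ggplot2)

# Tiny made-up example; not GTD values.
terr <- data.frame(
  iyear      = c(1980, 1980, 1980, 1981, 1981, 1981),
  region_txt = c("Region A", "Region A", "Region B",
                 "Region A", "Region B", "Region B"),
  nkill      = c(3, NA, 0, 5, 1, 2),
  nwound     = c(10, 4, 1, NA, 0, 3)
)

# Step 1: total fatalities and wounded per year and region.
# na.action = na.pass keeps rows with a missing value in one column,
# and na.rm = TRUE (passed on to sum) ignores the NA when summing.
totals <- aggregate(cbind(nkill, nwound) ~ iyear + region_txt,
                    data = terr, FUN = sum, na.rm = TRUE,
                    na.action = na.pass)

# Step 2: reshape wide to tall, one row per year, region and outcome.
keys <- totals[c("iyear", "region_txt")]
tall <- rbind(
  data.frame(keys, outcome = "Killed",  total = totals$nkill),
  data.frame(keys, outcome = "Wounded", total = totals$nwound)
)

# Step 3: the tall layout lets one aesthetic mapping distinguish the series.
p_totals <- ggplot(tall, aes(x = iyear, y = total,
                             colour = outcome, group = outcome)) +
  geom_line() +
  facet_wrap(~ region_txt) +
  labs(x = "Year", y = "Total per year", colour = NULL)
print(p_totals)
```

Packages such as reshape2 or tidyr offer one-line equivalents of step 2; the explicit `rbind` is used here only to keep the sketch dependency-free.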
So yes, for the most part, the total number of fatalities and the total number of individuals wounded follow a similar trend over time within each region. However, one striking deviation from this trend occurred during 1980-1985 in Central America, where there was a maximum of ~5000 fatalities but nowhere near as many individuals wounded.
What about the observed outliers? What was the maximum number of fatalities per region for any one incident?
| region_txt | maxKill |
|---|---|
| Oceania | 17.00 |
| Central Asia | 23.00 |
| Southeast Asia | 116.00 |
| Eastern Europe | 180.00 |
| East Asia | 184.00 |
| Western Europe | 270.00 |
| South America | 275.00 |
| Central America | 300.00 |
| Russia | 344.00 |
| MENA | 422.00 |
| South Asia | 518.00 |
| Sub-Saharan Africa | 1180.00 |
| North America | 1381.50 |
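
A table of this shape can be produced with a single grouped maximum. The sketch below uses base R on made-up data with placeholder region labels; it illustrates the technique and is not necessarily how I computed the table above.

```r
# Tiny made-up example; not GTD values.
terr <- data.frame(
  region_txt = c("Region A", "Region A", "Region A",
                 "Region B", "Region B", "Region C"),
  nkill      = c(1, 7, NA, 0, 2, 12)
)

# Maximum fatalities in any single incident, per region.
# The formula interface drops rows with a missing nkill by default.
maxKill <- aggregate(nkill ~ region_txt, data = terr, FUN = max)
names(maxKill)[2] <- "maxKill"

# Sort ascending so the deadliest region is listed last.
maxKill <- maxKill[order(maxKill$maxKill), ]
print(maxKill, row.names = FALSE)
```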
The single incident with the highest number of fatalities occurred in North America (this was 9/11). Which terrorist groups were behind the attacks with the most fatalities per region?
| Region | Max killed | Terrorist group name |
|---|---|---|
| Oceania | 17.00 | Kanak Separatists |
| Central Asia | 23.00 | Unknown |
| Southeast Asia | 116.00 | Abu Sayyaf Group (ASG) |
| Eastern Europe | 180.00 | Serbian Militants |
| East Asia | 184.00 | Unknown |
| Western Europe | 270.00 | Libyan |
| South America | 275.00 | Revolutionary Armed Forces of Colombia (FARC) |
| Central America | 300.00 | Unknown |
| Russia | 344.00 | Riyadus-Salikhin Reconnaissance and Sabotage Battalion of Chechen Martyrs |
| MENA | 422.00 | Mujahedin-e Khalq (MEK) |
| South Asia | 518.00 | Communist Party of Nepal- Maoist (CPN-M) |
| Sub-Saharan Africa | 1180.00 | Hutus |
| North America | 1381.50 | Al-Qaida |
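
One way to attach the group name to each regional maximum is to join the per-region maxima back onto the incident-level data. The sketch below is a hedged illustration on made-up data: the column name `gname` for the group is an assumption, and the region and group labels are placeholders.

```r
# Tiny made-up example; not GTD values.
terr <- data.frame(
  region_txt = c("Region A", "Region A", "Region B", "Region B", "Region C"),
  nkill      = c(1, 7, 0, 2, 12),
  gname      = c("Group X", "Group Y", "Unknown", "Group Z", "Unknown"),
  stringsAsFactors = FALSE
)

# Per-region maximum number of fatalities in a single incident.
maxKill <- aggregate(nkill ~ region_txt, data = terr, FUN = max)

# Join on both region and fatality count, so only the incidents that
# reach the regional maximum are kept. Ties would yield several rows.
worst <- merge(terr, maxKill, by = c("region_txt", "nkill"))
worst <- worst[order(worst$nkill), c("region_txt", "nkill", "gname")]
print(worst, row.names = FALSE)
```

Joining on the value rather than using `which.max` within each region keeps tied incidents visible instead of silently picking the first.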