Introduction

For my research I decided to examine whether terrorist incidents affect travel patterns. Such patterns may reflect society’s political mood and fears, and may in turn help shape diplomatic relationships and foreign policy.

Data

Sample rows from the final travel, terrorism, and combined data sets:

Travel

month outbound
2016-01-01 615470
2016-02-01 546246
2016-03-01 912603

Terrorism

DATE        COUNTRY  CITY          FATALITIES  INJURED  REGION          ATTACK.TYPE.1      victims  year  month
2016-11-20  France   Paris         NA          NA                                          NA       2016  11
2016-10-09  Russia   Gudermesskiy  8           4        Eastern Europe  Armed Assault      12       2016  10
2016-07-24  Germany  Ansbach       1           15       Western Europe  Bombing/Explosion  16       2016  7

Combined

year month outbound fatal injured attacks
2016 10 1055110 8 4 1
2016 7 1654182 11 42 2
2016 3 912603 35 270 2
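
The combined table pairs each month's outbound count with per-month attack totals. The following is a minimal Python sketch of that aggregation and join, not the code used for the analysis (the model output further down comes from R). It uses only rows shown in the tables above. Treating NA as 0 and dropping months that are missing from either source are assumptions, and the function name `combine` is hypothetical.

```python
from collections import defaultdict

# Outbound counts taken from the Combined table above.
travel = {(2016, 10): 1055110, (2016, 7): 1654182}

# (year, month, fatalities, injured) taken from the Terrorism table above.
# None stands for NA.
attacks = [
    (2016, 11, None, None),  # Paris
    (2016, 10, 8, 4),        # Gudermesskiy
    (2016, 7, 1, 15),        # Ansbach
]

def combine(travel, attacks):
    """Aggregate attacks per (year, month) and join them to travel counts."""
    totals = defaultdict(lambda: {"fatal": 0, "injured": 0, "attacks": 0})
    for year, month, fatal, injured in attacks:
        row = totals[(year, month)]
        row["fatal"] += fatal or 0      # assumption: NA is counted as 0
        row["injured"] += injured or 0  # assumption: NA is counted as 0
        row["attacks"] += 1
    # Assumption: keep only months present in both sources (inner join).
    return {
        key: {"outbound": travel[key], **agg}
        for key, agg in totals.items()
        if key in travel
    }

combined = combine(travel, attacks)
for key in sorted(combined):
    print(key, combined[key])
```

With these inputs the October 2016 entry matches the first row of the combined table (8 fatalities, 4 injured, 1 attack). The July 2016 entry does not, because only one of that month's two attacks appears in the sample rows.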

Data collection

Monthly U.S. citizen departures are collected and reported by the Tourism Industries U.S. International Air Travel Statistics (I-92 data) Program. Each month the National Travel and Tourism Office (NTTO) processes and reports outbound figures in the “U.S. International Air Passenger Statistics Report”.
A detailed description of the data collection methods for the Global Terrorism Database (GTD) is available at https://www.start.umd.edu/gtd/using-gtd/

Cases

Travel data: in the original data, each case is the monthly number of U.S. citizen outbound departures to one world region. My subset captures only travel to Europe. It covers 21 years of monthly data (252 / 12 = 21), for 252 observations in total.
Terrorism data: in my subset, each case represents a terrorist attack in Europe, for 262 observations in total.

Variables

The response variable is outbound travel of U.S. citizens to Europe, which is numerical. The explanatory variables are:
- number of attacks (numerical)
- number of victims (numerical)

Type of study

This is an observational study.

Scope of inference - generalizability

The population of interest is all U.S. citizens traveling abroad (to Europe in this study). I think the findings can be generalized to this population, because collection of this information is mandatory for all travelers and the data should therefore capture all departures.

Scope of inference - causality

Because this is an observational study, causation cannot be established.

Exploratory data analysis

General exploration of travel and terrorism data

Plotting travel data over time reveals a seasonal cycle and a general upward trend. Plotting travel by year shows the range of travel within each year and also reveals that travel in the colder months is below average.

Plotting terrorism data over time reveals outliers and shows that the most common type of attack is bombing. Geographical plotting identifies the country with the highest number of attacks, and regional boxplots show the median and range of victims in each region.

## 18 codes from your data successfully matched countries in the map
## 1 codes from your data failed to match with a country code in the map
## 225 codes from the map weren't represented in your data

Relevant summary statistics

Travel

    vars  n    mean      sd      median    trimmed   mad       min     max      range    skew       kurtosis    se
X1  1     252  973841.9  290583  930029.5  958781.7  325103.8  414958  1837000  1422042  0.4334493  -0.5284436  18305.01

Fatalities

    vars  n    mean      sd        median  trimmed   mad      min  max  range  skew      kurtosis  se
X1  1     129  24.62016  50.83709  7       12.90476  10.3782  0    354  354    4.100089  19.70772  4.475956

Attacks

    vars  n    mean      sd        median  trimmed   mad  min  max  range  skew      kurtosis  se
X1  1     129  1.984496  1.520611  1       1.647619  0    1    8    7      2.129257  4.492304  0.1338823
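
As a reading aid for the tables above, the `se` column is the standard error of the mean, that is, `sd` divided by the square root of `n`. The short Python sketch below recomputes it from the reported `n` and `sd` values. The dictionary layout and the function name `standard_error` are illustrative choices, not part of the original analysis.

```python
import math

# (n, sd, reported se) copied from the three summary tables above.
reported = {
    "travel": (252, 290583.0, 18305.01),
    "fatalities": (129, 50.83709, 4.475956),
    "attacks": (129, 1.520611, 0.1338823),
}

def standard_error(sd, n):
    """Standard error of the mean: sd / sqrt(n)."""
    return sd / math.sqrt(n)

se_check = {name: standard_error(sd, n) for name, (n, sd, _) in reported.items()}

for name, (n, sd, se) in reported.items():
    print(f"{name}: computed {se_check[name]:.7g}, reported {se}")
```

The recomputed values agree with the reported ones up to rounding of the inputs.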

Inference

Model 1 - Attacks per month and monthly outbound travel

## 
## Call:
## lm(formula = outbound ~ attacks, data = comb)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -580974 -192980   -3015  205152  653128 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   990810      40009   24.77   <2e-16 ***
## attacks         5122      16026    0.32     0.75    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 275700 on 127 degrees of freedom
## Multiple R-squared:  0.0008036,  Adjusted R-squared:  -0.007064 
## F-statistic: 0.1021 on 1 and 127 DF,  p-value: 0.7498

Model 2 - Annual attacks and average annual outbound travel

## 
## Call:
## lm(formula = outbound ~ attacks, data = comb2)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -268491  -57564   35100   61160  113655 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   979789      45062  21.743 6.93e-15 ***
## attacks         1812       3258   0.556    0.585    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 97580 on 19 degrees of freedom
## Multiple R-squared:  0.01602,    Adjusted R-squared:  -0.03577 
## F-statistic: 0.3093 on 1 and 19 DF,  p-value: 0.5846

Model 3 - Average victims (fatalities and injured combined) and average annual outbound travel

## 
## Call:
## lm(formula = outbound ~ victims, data = comb3)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -266531  -55029   38321   50041  120632 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 991896.34   31619.46  31.370   <2e-16 ***
## victims         97.73     228.32   0.428    0.673    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 97900 on 19 degrees of freedom
## Multiple R-squared:  0.009551,   Adjusted R-squared:  -0.04258 
## F-statistic: 0.1832 on 1 and 19 DF,  p-value: 0.6734
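
The statistics in each summary above are linked by standard identities for a regression with a single predictor: t is the estimate divided by its standard error, F equals t squared, R-squared equals F / (F + residual df), and adjusted R-squared equals 1 - (1 - R-squared)(n - 1)/(n - 2). The Python sketch below recomputes them from the reported slope estimates and standard errors as a consistency check. It does not refit the models, and the names `models` and `slope_stats` are illustrative.

```python
# Slope estimate, standard error, residual df, and reported statistics,
# copied from the three lm() summaries above.
models = {
    "Model 1": {"est": 5122.0, "se": 16026.0, "df": 127,
                "t": 0.32, "F": 0.1021, "r2": 0.0008036, "adj_r2": -0.007064},
    "Model 2": {"est": 1812.0, "se": 3258.0, "df": 19,
                "t": 0.556, "F": 0.3093, "r2": 0.01602, "adj_r2": -0.03577},
    "Model 3": {"est": 97.73, "se": 228.32, "df": 19,
                "t": 0.428, "F": 0.1832, "r2": 0.009551, "adj_r2": -0.04258},
}

def slope_stats(est, se, df):
    """Recompute t, F, R-squared and adjusted R-squared for one predictor."""
    t = est / se
    f = t ** 2                             # single predictor: F equals t squared
    r2 = f / (f + df)
    adj_r2 = 1 - (1 - r2) * (df + 1) / df  # n - 1 = df + 1 and n - 2 = df
    return t, f, r2, adj_r2

for name, m in models.items():
    t, f, r2, adj_r2 = slope_stats(m["est"], m["se"], m["df"])
    print(f"{name}: t={t:.3f} F={f:.4f} R2={r2:.6f} adjR2={adj_r2:.6f}")
```

The recomputed values match the printed summaries up to rounding. The negative adjusted R-squared values indicate that each predictor explains less variation than would be expected by chance.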

Conclusion

None of the three models shows a statistically significant relationship between the response and explanatory variables: the slope p-values are 0.75, 0.585, and 0.673, and R-squared does not exceed 0.02 in any model. This is most likely due in part to the data not meeting the normality, linearity, and constant-variability conditions for linear regression. Because some variables are highly skewed by outliers, rerunning the analysis with the outliers excluded is recommended. It may also make sense not to align attacks and travel in the same period, but instead to look at travel in the months following an attack to see whether there is a delayed reaction.
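
The lagged comparison suggested above could be set up as in the following Python sketch, which pairs the attack count in each month with outbound travel a fixed number of months later. This is an illustration of the idea, not part of the completed analysis. The helper names `add_months` and `lagged_pairs` are hypothetical. The outbound values come from the Travel sample rows, while the attack counts for December 2015 and January 2016 are placeholders rather than values from the data set.

```python
def add_months(year, month, k):
    """Return the (year, month) that falls k months after the given one."""
    index = year * 12 + (month - 1) + k
    return index // 12, index % 12 + 1

def lagged_pairs(attacks, outbound, lag):
    """Pair attacks in each month with outbound travel `lag` months later."""
    pairs = []
    for (year, month), count in sorted(attacks.items()):
        target = add_months(year, month, lag)
        if target in outbound:  # skip months with no travel figure
            pairs.append((count, outbound[target]))
    return pairs

# Outbound values for January to March 2016 from the Travel table.
outbound = {(2016, 1): 615470, (2016, 2): 546246, (2016, 3): 912603}

# Placeholder attack counts for illustration only, except March 2016,
# which matches the Combined table.
attacks = {(2015, 12): 1, (2016, 1): 2, (2016, 3): 2}

pairs = lagged_pairs(attacks, outbound, lag=1)
print(pairs)
```

Each resulting pair could then serve as one observation in a regression of lagged outbound travel on attacks. Because travel is strongly seasonal, such a model would probably also need to account for the month of the year.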