Introduction

As a human rights geographer, I am especially interested in global conflicts. Throughout my life, I have always been intrigued by Latin America, particularly during the Cold War era, and wanted to carry this interest into my final project. My research question revolved around the number of weapons sent by the United States to Colombia throughout the Cold War until 2023, asking specifically if “The United States was most interested in Colombia between 1984 and 1993.” This question served as my hypothesis throughout the project.

“Most interested” is defined by the number of shipments sent to Colombia each year and the total number of weapons sent to Colombia each year. If there is a statistically significant relationship between the year and the number of weapons sent, I take this as evidence that the United States was especially interested in Colombia during this period.

These years are significant moments in the history of Colombian rebel groups, which hints at the significance of United States intervention and aid. FARC, one of the strongest and most recognizable leftist rebel groups in Colombian history, was at its most powerful between 1984 and 1988. Because the United States was still engaged in the worldwide Cold War, it despised the idea of FARC gaining any additional pull in the Colombian government and began sending aid and weapons to the government. This is a common theme in American involvement in Latin America, which often meant supporting coups and death squads that vowed to fight the spread of communism. Because FARC posed a threat to democracy in American eyes, leaders were happy to send weapons to the nation.

Additionally, the Colombian cocaine market had taken off and steadily gained power throughout the 80s, reaching its peak late in the decade and into the early 90s. The United States government had launched its infamous “War on Drugs,” sponsored by President Ronald Reagan, which turned its attention to the Colombian narcotics trade. The fight against the Colombian cartels involved the Colombian military, local police forces, and intervention from the American Drug Enforcement Administration (DEA). DEA involvement was accompanied by American weapons shipments, which I expected to be reflected in the number of weapons sent in the selected years.

Data

I decided to focus on United States arms exports to Colombia from 1970 until 2022. I chose these years because I am very interested in the American War on Drugs, specifically in Colombia and Mexico. Colombia had a broader dataset than Mexico, so I decided to look at Colombia. These years make up the bulk of the time Americans were present in Colombia’s drug war, and I extended the span to 2022 because there is still ongoing conflict in Colombia, especially with FARC and the drug trade. While the conflict is not as violent as before, I wanted to see whether the United States still provided as many weapons as it had before.

I sourced this data from the Stockholm International Peace Research Institute’s (SIPRI) arms transfers database, which I originally found through the Data Is Plural website. The database is available at https://www.sipri.org/databases/armstransfers

Methods

Cleaning Data

The data was delivered in an RTF file, which was unsuitable for the project. I decided converting the RTF into an Excel spreadsheet wouldn’t clean it how I wanted, so I manually input the data into a CSV file.

I decided to include information on the class of weapon sent, the number of weapons in each transaction, whether the weapon was second-hand, whether the weapons were meant for counter-narcotics, and finally, whether the weapons were meant for use against rebel groups. I anticipated that counter-narcotics weapons would be sent in the highest quantity during the 1980s and early 1990s, since this was the peak of American involvement in the Colombian drug war, so I wanted to see if the data reflected my prediction. I also wanted to see if weaponry meant for combatting rebel groups would show any particular spikes.

Manually inputting the data was a tedious process. I included nine categories: Provider, Receiver, YearReceived, WeaponClass, NumberDelivered, Secondhand, MajorCity, RebelForces, and CounterNarcotics. These range from general organizing labels to quantities of delivered weapons and simple yes-or-no categorizations. I entered the “yes or no” categories as factor levels of 1 and 2, where 1 indicated “yes” and 2 indicated “no.” The number of weapons delivered and the corresponding year were numerical data. The remaining categories were characters.

Every weapon included in the dataset was sent from the United States to Colombia, so each field had the same input for these two categories. YearReceived was based on the year the weapons were delivered to Colombia. I opted to exclude the year the weapons were requested by Colombia, as I was more interested in observing the quantities sent by the United States, and less the weapons requested by Colombia. All 34 WeaponClass entries were entered according to the data provided; these classifications are determined by the Stockholm International Peace Research Institute (SIPRI).
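The column layout and the 1/2 coding described above can be illustrated with a short sketch. It is written in Python rather than the R used for the project, and the two sample rows, the function name load_rows, and the "NA" placeholder for MajorCity (whose coding is not described here) are hypothetical stand-ins, not values from the SIPRI data.

```python
import csv
import io

# Hypothetical rows for illustration only; these are NOT values from the SIPRI data.
SAMPLE_CSV = """Provider,Receiver,YearReceived,WeaponClass,NumberDelivered,Secondhand,MajorCity,RebelForces,CounterNarcotics
United States,Colombia,1985,Helicopter,12,2,NA,2,1
United States,Colombia,1990,Light aircraft,4,1,NA,1,2
"""

# Columns the text describes as yes/no categories coded 1 = yes, 2 = no.
YES_NO_COLUMNS = ("Secondhand", "RebelForces", "CounterNarcotics")


def load_rows(text):
    """Parse CSV text and convert each column to the type described above."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        row = dict(raw)  # remaining columns stay as character data
        row["YearReceived"] = int(raw["YearReceived"])
        row["NumberDelivered"] = int(raw["NumberDelivered"])
        for col in YES_NO_COLUMNS:
            # 1 means "yes" and 2 means "no"; any other code raises KeyError
            row[col] = {"1": True, "2": False}[raw[col]]
        rows.append(row)
    return rows


rows = load_rows(SAMPLE_CSV)
for row in rows:
    print(row["YearReceived"], row["WeaponClass"], row["NumberDelivered"])
```

Converting the 1/2 codes to explicit yes/no values at load time is one way to guard against accidentally treating them as quantities later on.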

Running Regressions

I wanted to test whether there was a statistically significant relationship between the years the United States sent weapons and the number of weapons sent to Colombia. I felt the best way to test this was to run a linear regression of the variables and then check the coefficients and R² value. I named the new model “reg_data.”
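The arithmetic behind a simple linear regression and its R² value can be sketched in a few lines. This is a minimal Python illustration, not the R code behind “reg_data”; the function name fit_line and the sample values are hypothetical, and it assumes the year enters the model as a single numeric predictor (if the year were entered as a category instead, the model would produce one coefficient per year).

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (unexplained variation / total variation)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical values, NOT taken from the SIPRI data.
years = [1980, 1985, 1990, 1995, 2000]
delivered = [4, 10, 6, 12, 8]

slope, intercept, r_squared = fit_line(years, delivered)
print(f"slope={slope:.3f} intercept={intercept:.1f} R^2={r_squared:.3f}")
```

An R² near 1 means the fitted line accounts for most of the variation in the number delivered, while an R² near 0 means the year explains very little of it.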

Confidence Interval

To analyze the regression results further, I tested their statistical significance with a confidence interval. I computed the interval from “reg_data” at a 95% confidence level.
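One common form of this check is a confidence interval for the regression slope: if the interval contains 0, the data are consistent with no linear relationship. The Python sketch below is an illustration under stated assumptions, not the R call used in the project. The function name and sample values are hypothetical, and it uses a normal critical value (about 1.96 at 95%) as a large-sample approximation, whereas a regression interval from statistical software would typically use a t critical value with n - 2 degrees of freedom and so be somewhat wider for small samples.

```python
import math
from statistics import NormalDist


def slope_confidence_interval(xs, ys, level=0.95):
    """Approximate confidence interval for the slope of a simple linear regression."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Standard error of the slope: residual variance divided by the spread of x
    std_err = math.sqrt(ss_res / (n - 2) / sxx)
    # ASSUMPTION: normal approximation instead of the exact t critical value
    critical = NormalDist().inv_cdf(0.5 + level / 2)
    return slope - critical * std_err, slope + critical * std_err


# Hypothetical values, NOT taken from the SIPRI data.
low, high = slope_confidence_interval([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(f"95% interval for the slope: ({low:.3f}, {high:.3f})")
print("includes 0" if low <= 0 <= high else "excludes 0")
```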

T-Statistic

I wasn’t entirely sure there was enough evidence to draw definite conclusions, so I decided to run a t-test to set up a full hypothesis test. My alternative hypothesis H(a) was that there is a statistically significant relationship between the number of arms delivered and the year the weapons were received by Colombia. My null hypothesis H(0) was that there is no statistically significant relationship between the number of arms delivered and the year the weapons were received by Colombia.
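A standard way to test hypotheses of this kind is a t-test on the correlation between the two variables, using t = r * sqrt(n - 2) / sqrt(1 - r²) with n - 2 degrees of freedom. The Python sketch below shows that calculation as an illustration only; it may differ from the test actually run in R for this project, and the function name and sample values are hypothetical. The resulting statistic would then be compared against a t distribution to obtain a p-value.

```python
import math


def correlation_t_statistic(xs, ys):
    """t statistic for H0: no linear relationship between xs and ys.

    Returns (t, degrees_of_freedom).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / math.sqrt(sxx * syy)  # Pearson correlation coefficient
    unexplained = 1 - r * r
    if unexplained <= 0:
        # Perfectly linear data: the statistic is unbounded
        return math.copysign(math.inf, r), n - 2
    t = r * math.sqrt(n - 2) / math.sqrt(unexplained)
    return t, n - 2


# Hypothetical values, NOT taken from the SIPRI data.
t, df = correlation_t_statistic([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(f"t = {t:.4f} on {df} degrees of freedom")
```

A large absolute t value relative to the t distribution with the given degrees of freedom is evidence against the null hypothesis; a value near 0 is not.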

Early Plotting

It was difficult to tell whether I was reading my data properly, so I wanted to visualize it using ggplot2. I also hoped to see if there were any outliers, though this was not yet the primary goal. My first plot was a scatterplot. It became clear that there were outliers in a few years, especially around the mid-1980s and then again in the late 2010s and early 2020s. Aside from these outliers, the number of shipments hovered around 10-20. This was not easy to see in the scatterplot I created, so I decided I needed a better data display.

Histogram

Because the scatterplot did not show me much, I decided to create a histogram of the YearReceived data. This way, I could see whether the outliers from the scatterplot matched the data shown in the histogram. They did: the peaks in the number of deliveries each year appeared in the same places. Most deliveries appear to have been made in the late 2010s, which is interesting even though it does not align with my original hypothesis. There were higher numbers during the mid-80s and early 90s, as predicted, but these numbers are nowhere near those of the highest years. Despite some drops followed by plateaus, the 21st century sees the highest number of deliveries.
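The counting that a histogram of YearReceived performs can be sketched directly. The Python example below is illustrative only (the plots themselves were made with ggplot2 in R); the function names and the sample years are hypothetical and do not come from the SIPRI data. It tallies deliveries per year and computes the share that falls inside the hypothesized 1984 to 1993 window.

```python
from collections import Counter


def deliveries_per_year(years):
    """Count how many delivery records fall in each year."""
    return Counter(years)


def share_in_window(years, start=1984, end=1993):
    """Fraction of delivery records with start <= year <= end."""
    inside = sum(1 for year in years if start <= year <= end)
    return inside / len(years)


# Hypothetical YearReceived values, NOT taken from the SIPRI data.
sample_years = [1985, 1985, 1990, 2018, 2018, 2018, 2021]

counts = deliveries_per_year(sample_years)
for year in sorted(counts):
    # One '#' per delivery record gives a quick text histogram
    print(year, "#" * counts[year])

print(f"share in 1984-1993: {share_in_window(sample_years):.2f}")
```

Comparing the share inside the window with the share outside it is a simple way to see whether the hypothesized years actually dominate the record.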

Findings

The regression produced coefficients spanning 51 years of data, so it was impractical to sift through the dozens of coefficients. Instead, I looked at the R² value to see how well the model fit. The R² value was 0.001163. This value is exceptionally close to 0, meaning the linear model explains almost none of the variation in the data and fits poorly.

The confidence interval on the regression was (-0.2596108, 0.3892116). Because this interval includes 0, the relationship between the number of weapons sent and the year the shipment was received is, according to this data, NOT statistically significant.

After running the regression analysis and the t-test, two conclusions can be drawn. The regression did not show a statistically significant relationship between the two variables, the number of arms sent and the year the arms were received. This leads to the conclusion that the regression model did not fit well and that it was not the best approach for testing my original hypothesis. The confidence interval on the regression data confirmed this, as it too was not statistically significant.

The p-value from the t-test was extremely small (p < 2.2e-16). Because this value is far below 0.05, the t-test result, unlike the regression, IS statistically significant. This is further supported by the accompanying confidence interval, (1982.617, 1992.633), which does not include 0. I can therefore reject the null hypothesis that there is no statistically significant relationship between the number of arms delivered and the year the weapons were received by Colombia. The t-test was much more helpful than the regression for testing my hypothesis: the test and its confidence interval produced statistically significant results and allowed me to reject the null hypothesis.

Limitations

These findings have serious limitations. War geographies involve hundreds, if not thousands, of confounding variables. These variables become all the more complicated where foreign involvement is concerned, particularly that of the United States during the Cold War and the War on Drugs. Though it is highly plausible the United States was especially interested in Colombia at the peak of FARC’s activity and during the height of the Medellín and Cali cartels, I cannot determine for sure that this is supported by the t-test and confidence interval. Though the t-test suggests there is likely a connection between the number of arms sent and the year they were received, we cannot accept this alternative hypothesis with certainty.

There could be dozens of other reasons for an increase in weapons during specific years. Additional natural resources could have been located at a certain point, making production more manageable. Production costs may have increased in a given year due to higher labor expenses, or crucial factories may have shut down. A government shutdown may have occurred, leading to frozen funds for aid shipments (in the form of weapons). Though it is easier to say that, yes, there is a significant relationship between the number of weapons sent and the year they were delivered, I cannot definitively say it is because the United States was more interested in Colombia during these peak years.

While this data is likely very accurate, arms dealing is a notoriously opaque business, with many deals occurring behind closed doors. Secrecy was infamously rampant during the Cold War, with many deals involving rebel groups, paramilitaries, or even narcotics cartels. Shipments of this nature are likely excluded from the dataset, presenting yet another limitation of my findings.

The data I excluded were the specific weapon names and the year each weapon was requested. I felt it was reasonable to exclude this information since my question did not concern individual weapons. If I were investigating what might be considered the most “valuable” weapon sent to Colombia during the same period, I would likely include it.

Conclusion

Having found a statistically significant result for the number of weapons sent and the year received, I can tentatively treat the observed shipment spikes as meaningful or intentionally higher. However, I cannot conclude that the United States was the most interested in Colombia between 1984 and 1993. The question of “interest” is far too complex to answer through a single statistically significant test.

As a concluding observation, after analyzing the t-test, the regression analysis, and the outlying data points, I cannot determine whether the United States was the most interested in Colombia between 1984 and 1993. Several limitations of my observations and of the project in general prevent me from accepting the research hypothesis outright. However, I can generally determine that a connection exists between the year and the number of weapons sent to Colombia, laying the framework for potential additional studies.

Reflection

Sometimes I am wary of relying on statistics, but I can identify the benefits of having a mathematical result that demonstrates statistical significance, at least to some degree. Quantitative research is often much more reliable in the eyes of the general population, as people are quicker to believe numbers and statistics than researcher-based observations. Additionally, a quantitative result can point a qualitative researcher toward a new perspective they may not have thought of before. Quantitative research often lays the foundation for explaining socioeconomic events that are otherwise difficult to explain. Once there is a mathematical foundation, it is often easier to begin exploring connections or potential correlations.

Despite its benefits, quantitative research has significant limitations. One of the core principles of statistics is that “correlation does not equal causation.” Just because a t-statistic hints at a statistically significant relationship does not mean we can automatically accept the tested alternative hypothesis. Additionally, there are countless variables we cannot account for in the geography field. It is nearly impossible to isolate two specific variables and then test them outside a simulated environment, yet quantitative results often assume we can.

One of the most challenging aspects of this project was determining how I wanted to test my data once it was all laid out. Yes, I had a functioning dataset properly loaded into R, but where did I want to go from that point? As someone used to exploring and discussing variables without using numbers, I faced a learning curve. I consistently questioned whether I was performing these tests correctly, since our lab data during the semester was already clean. I had to accept that perhaps my data was presenting incorrect results and that the numbers I was observing were possibly out of context; I simply had to keep working and attempt to unearth trends. There is not necessarily a correct or incorrect approach to constructing a quantitative project, so expect some trial and error.

I admit that the project was much more doable than I initially anticipated. Though I am a qualitative researcher and typically prefer to stick to what I know, this project introduced me to the benefits of quantitative research. I had difficulty reading the raw data points from the RTF file; it was incredibly tedious. However, once I had translated the data, I was very proud of my work and excited to begin the analysis. I did not think I would be keen to work on the project at any point, so that was a pleasant surprise. It was enjoyable to discover a trend and have it backed up by statistics; I already suspected there would be a correlation between the number of weapons sent and the year they were received, but it was fascinating to see confirmation.