Final Project – TransAtlantic Slave Trade

Author

charlene

The Trans-Atlantic Slave Trade

The Trans-Atlantic Slave Trade Ship

Introduction

The Transatlantic Slave Trade was a horrific and large-scale forced migration of African people to the Americas between the 16th and 19th centuries. This trade system was one of the most significant events in human history that shaped global demographics, economics, and social structures. The goal of this project is to analyze patterns within this slave trade using actual voyage-level data from the Trans-Atlantic Slave Trade Database.

This dataset includes rich details about ships, captains, years of operation, ports of embarkation and disembarkation, and the number of captives who arrived at the destination. It allows us to investigate key quantitative aspects like the number of enslaved individuals and qualitative details like where and how voyages were conducted. I use this data to answer questions like: Which disembarkation ports received the highest number of captives? How did the number of captives transported change over time? What can we learn by looking at individual ship voyages and their captains? These questions offer a lens into the human impact and logistical operations behind the slave trade.

The data was cleaned using `dplyr`. Rows with a missing year or a missing captive count were dropped with `filter()` and `is.na()`, which leaves 18,282 of the 36,069 voyages. A second filter keeps only rows with a year after 1600 and a non-zero number of captives and stores them in `df_filtered`. The regression and the plots below are computed from `df`, not from `df_filtered`.

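The quantitative variables used include `year` and `captives_arrived`. The categorical ones are referred to here by the short names `vessel_name`, `departure_port`, `disembark_port`, and `captain`; in the raw file they appear as `Vessel.name`, the two long itinerary columns ending in `ptdepimp..place` and `mjslptimp..place`, and `Captain.s.name`. These allow both statistical modeling and visual exploration using ggplot2 and Plotly for interactivity.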

My Personal Interest and Purpose

This project gave me the opportunity to combine my passion for data science with my concern for historical and social justice. Understanding the Transatlantic Slave Trade through data helps bring to light the scale and systematic nature of this human tragedy. It also honors those whose stories have been obscured or forgotten.

I chose this dataset because it pushes me to not only use statistical methods, but to think deeply about how data is connected to human lives. Each data point represents a person or group affected by systemic exploitation. Analyzing this dataset encourages empathy and ethical responsibility in data science.

I was also interested in how historians managed to gather and digitize this information. Reading through the dataset and its structure helped me appreciate the role of historical data curation. Much of the data comes from ship records, archival documents, and scholarly reconstruction—showing how interdisciplinary work informs modern analysis.

On a technical level, this project helped me practice new techniques. I used Plotly for interactivity and customized color palettes for visual appeal. I also ensured that each graph met aesthetic and clarity standards: labeled axes, custom color schemes, readable themes, and informative titles.

This project gave me a meaningful way to reflect on the long-lasting effects of the slave trade. Understanding where ships went, how many captives they carried, and when the activity peaked reveals a lot about global power dynamics, economic motivations, and resistance to abolition.

Finally, my hope is that this project serves as a small contribution to remembering the lives affected by slavery—not just through text, but through data and visual storytelling. Using my skills to bring attention to this history is one way I can make data science more socially conscious.

##Loading the libraries

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.2     ✔ tibble    3.2.1
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.0.4     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(readxl)
library(ggplot2)
library(plotly)

Attaching package: 'plotly'

The following object is masked from 'package:ggplot2':

    last_plot

The following object is masked from 'package:stats':

    filter

The following object is masked from 'package:graphics':

    layout

##Loading the dataset

df <- read.csv("C:/Users/MCuser/Downloads/transAtlanticSlaveTrade.csv")

##Data structure

summary(df)
 Year.of.arrival.at.port.of.disembarkation   Voyage.ID      Vessel.name       
 Min.   :1514                              Min.   :     1   Length:36069      
 1st Qu.:1732                              1st Qu.: 16132   Class :character  
 Median :1773                              Median : 32540   Mode  :character  
 Mean   :1765                              Mean   : 42871                     
 3rd Qu.:1806                              3rd Qu.: 50330                     
 Max.   :1866                              Max.   :900237                     
 NA's   :2                                                                    
 Voyage.itinerary.imputed.port.where.began..ptdepimp..place
 Length:36069                                              
 Class :character                                          
 Mode  :character                                          
                                                           
                                                           
                                                           
                                                           
 Voyage.itinerary.imputed.principal.place.of.slave.purchase..mjbyptimp.
 Length:36069                                                          
 Class :character                                                      
 Mode  :character                                                      
                                                                       
                                                                       
                                                                       
                                                                       
 Voyage.itinerary.imputed.principal.port.of.slave.disembarkation..mjslptimp..place
 Length:36069                                                                     
 Class :character                                                                 
 Mode  :character                                                                 
                                                                                  
                                                                                  
                                                                                  
                                                                                  
  VOYAGEID2         Captives.arrived.at.1st.port Captain.s.name    
 Length:36069       Min.   :   0.0               Length:36069      
 Class :character   1st Qu.: 158.0               Class :character  
 Mode  :character   Median : 254.0               Mode  :character  
                    Mean   : 276.5                                 
                    3rd Qu.: 372.0                                 
                    Max.   :1700.0                                 
                    NA's   :17787                                  
glimpse(df)
Rows: 36,069
Columns: 9
$ Year.of.arrival.at.port.of.disembarkation                                         <int> …
$ Voyage.ID                                                                         <int> …
$ Vessel.name                                                                       <chr> …
$ Voyage.itinerary.imputed.port.where.began..ptdepimp..place                        <chr> …
$ Voyage.itinerary.imputed.principal.place.of.slave.purchase..mjbyptimp.            <chr> …
$ Voyage.itinerary.imputed.principal.port.of.slave.disembarkation..mjslptimp..place <chr> …
$ VOYAGEID2                                                                         <chr> …
$ Captives.arrived.at.1st.port                                                      <int> …
$ Captain.s.name                                                                    <chr> …
colnames(df)
[1] "Year.of.arrival.at.port.of.disembarkation"                                        
[2] "Voyage.ID"                                                                        
[3] "Vessel.name"                                                                      
[4] "Voyage.itinerary.imputed.port.where.began..ptdepimp..place"                       
[5] "Voyage.itinerary.imputed.principal.place.of.slave.purchase..mjbyptimp."           
[6] "Voyage.itinerary.imputed.principal.port.of.slave.disembarkation..mjslptimp..place"
[7] "VOYAGEID2"                                                                        
[8] "Captives.arrived.at.1st.port"                                                     
[9] "Captain.s.name"                                                                   

##naming the columns

names(df)
[1] "Year.of.arrival.at.port.of.disembarkation"                                        
[2] "Voyage.ID"                                                                        
[3] "Vessel.name"                                                                      
[4] "Voyage.itinerary.imputed.port.where.began..ptdepimp..place"                       
[5] "Voyage.itinerary.imputed.principal.place.of.slave.purchase..mjbyptimp."           
[6] "Voyage.itinerary.imputed.principal.port.of.slave.disembarkation..mjslptimp..place"
[7] "VOYAGEID2"                                                                        
[8] "Captives.arrived.at.1st.port"                                                     
[9] "Captain.s.name"                                                                   
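The exported column names are long and awkward to type. The following is a minimal sketch, in base R, of one way to map them to the short names mentioned in the introduction. The one-row `toy` data frame is a hypothetical stand-in for `df`, and the `short_names` lookup is an assumption about which raw column corresponds to which short name.

```r
# Hypothetical one-row stand-in for the real data frame `df`
toy <- data.frame(
  Vessel.name = "Bom Caminho",
  Voyage.itinerary.imputed.port.where.began..ptdepimp..place = "Bahia, place unspecified",
  Voyage.itinerary.imputed.principal.port.of.slave.disembarkation..mjslptimp..place = "Bahia, place unspecified",
  Captain.s.name = "Dias, Domingos Francisco",
  stringsAsFactors = FALSE
)

# Lookup table: raw column name -> short name (assumed correspondence)
short_names <- c(
  Vessel.name = "vessel_name",
  Voyage.itinerary.imputed.port.where.began..ptdepimp..place = "departure_port",
  Voyage.itinerary.imputed.principal.port.of.slave.disembarkation..mjslptimp..place = "disembark_port",
  Captain.s.name = "captain"
)

# Rename only the columns found in the lookup; any others keep their names
hits <- names(toy) %in% names(short_names)
names(toy)[hits] <- unname(short_names[names(toy)[hits]])

names(toy)
```

Keeping the mapping in a single named vector means a column can be added or renamed in one place, and columns that are not listed are left untouched.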

##Converting to numeric

library(dplyr)

df <- df %>%
  mutate(
    year = as.numeric(Year.of.arrival.at.port.of.disembarkation),
    captives_arrived = as.numeric(Captives.arrived.at.1st.port)
  ) %>%
  filter(!is.na(year), !is.na(captives_arrived))  # removes bad rows

##filtering

df_filtered <- df %>%
  filter(year > 1600, captives_arrived > 0)
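To see how much each cleaning step removes, it can help to count rows after every filter. This is a small sketch in base R on hypothetical toy rows (not the real voyage table); the object names `complete` and `kept` are made up for illustration.

```r
# Hypothetical toy rows standing in for the voyage table
toy <- data.frame(
  year             = c(1590, 1650, 1720, NA, 1817, 1829),
  captives_arrived = c(120, 0, 254, 300, NA, 516)
)

# Step 1: drop rows with a missing year or a missing captive count
complete <- toy[!is.na(toy$year) & !is.na(toy$captives_arrived), ]

# Step 2: keep voyages after 1600 with at least one captive
kept <- complete[complete$year > 1600 & complete$captives_arrived > 0, ]

# Row counts after each step
c(raw = nrow(toy), complete = nrow(complete), kept = nrow(kept))
```

Running the same three counts on the real data would show how many voyages the year and zero-captive conditions exclude beyond the missing-value filter.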

##Double-checking the structure

str(df)
'data.frame':   18282 obs. of  11 variables:
 $ Year.of.arrival.at.port.of.disembarkation                                        : int  1817 1817 1817 1817 1817 1817 1817 1817 1817 1817 ...
 $ Voyage.ID                                                                        : int  1 2 3 4 5 6 7 8 9 10 ...
 $ Vessel.name                                                                      : chr  "Pastora de Lima" "Tibério" "Paquete Real" "Bom Caminho" ...
 $ Voyage.itinerary.imputed.port.where.began..ptdepimp..place                       : chr  "Rio de Janeiro" "Bahia, place unspecified" "Bahia, place unspecified" "Bahia, place unspecified" ...
 $ Voyage.itinerary.imputed.principal.place.of.slave.purchase..mjbyptimp.           : chr  "Mozambique" "Mozambique" "Cabinda" "Quilimane" ...
 $ Voyage.itinerary.imputed.principal.port.of.slave.disembarkation..mjslptimp..place: chr  "Bahia, place unspecified" "Bahia, place unspecified" "Bahia, place unspecified" "Bahia, place unspecified" ...
 $ VOYAGEID2                                                                        : chr  "" "" "" "" ...
 $ Captives.arrived.at.1st.port                                                     : int  290 223 350 342 516 515 204 374 345 478 ...
 $ Captain.s.name                                                                   : chr  "Dias, Manoel José" "Mata, José Maria da" "Ferreira, José dos Santos" "Dias, Domingos Francisco" ...
 $ year                                                                             : num  1817 1817 1817 1817 1817 ...
 $ captives_arrived                                                                 : num  290 223 350 342 516 515 204 374 345 478 ...

##Filter dataset for meaningful rows

##Clean

time_data <- df %>%
  group_by(year) %>%
  summarize(total_captives = sum(captives_arrived, na.rm = TRUE))

Linear Regression Analysis

model <- lm(`Captives.arrived.at.1st.port` ~ `Year.of.arrival.at.port.of.disembarkation`, data = df)
summary(model)

Call:
lm(formula = Captives.arrived.at.1st.port ~ Year.of.arrival.at.port.of.disembarkation, 
    data = df)

Residuals:
    Min      1Q  Median      3Q     Max 
-353.34 -106.15  -15.54   89.47 1348.04 

Coefficients:
                                            Estimate Std. Error t value
(Intercept)                               -1.278e+03  3.789e+01  -33.73
Year.of.arrival.at.port.of.disembarkation  8.768e-01  2.136e-02   41.05
                                          Pr(>|t|)    
(Intercept)                                 <2e-16 ***
Year.of.arrival.at.port.of.disembarkation   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 152.2 on 18280 degrees of freedom
Multiple R-squared:  0.08439,   Adjusted R-squared:  0.08434 
F-statistic:  1685 on 1 and 18280 DF,  p-value: < 2.2e-16

The model shows a statistically significant relationship (p < 2e-16): each later year is associated with about 0.88 more captives arriving per voyage. However, the low R² of 0.084 means that year explains only about 8% of the variation in captives per voyage. Other factors such as laws, wars, or port capacity likely influence this.
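As a worked illustration of what the coefficients mean, the sketch below plugs the rounded values printed in the summary back into the fitted line. The two years are arbitrary examples, and because the printed intercept is rounded, the results are approximate.

```r
# Coefficients as printed in the regression summary (rounded)
intercept <- -1.278e+03
slope     <- 8.768e-01   # change in captives per voyage for each later year

# Fitted captives per voyage at two illustrative years
years  <- c(1700, 1800)
fitted <- intercept + slope * years

# Change implied by the slope over one century
century_change <- slope * 100

# Percentage of variance explained, from the reported R-squared
pct_explained <- 100 * 0.08439

round(c(fitted, century_change, pct_explained), 1)
```

Under these rounded coefficients the line predicts roughly 213 captives per voyage in 1700 and roughly 300 in 1800, a rise of about 88 over the century, while the residual standard error of 152.2 shows how widely individual voyages scatter around that line.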

##Visualization 1: Top 8 Years by Total Captives

library(tidyverse)
library(plotly)

# Summarize top 8 years
slave_df <- df %>%
  filter(!is.na(Year.of.arrival.at.port.of.disembarkation), !is.na(captives_arrived)) %>%
  group_by(Year.of.arrival.at.port.of.disembarkation) %>%
  summarize(total_captives = sum(captives_arrived, na.rm = TRUE)) %>%
  arrange(desc(total_captives)) %>%
  slice_head(n = 8)

# Define a color palette
colors <- c("#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", 
            "#9467bd", "#8c564b", "#e377c2", "#7f7f7f")
port_colors <- setNames(colors, slave_df$Year.of.arrival.at.port.of.disembarkation)

# Plot
plot_ly(
  data = slave_df,
  x = ~total_captives,
  y = ~reorder(Year.of.arrival.at.port.of.disembarkation, total_captives),
  type = 'bar',
  orientation = 'h',
  color = ~as.factor(Year.of.arrival.at.port.of.disembarkation),
  colors = port_colors
) %>%
  layout(
    title = list(
      text = "Top 8 Years of Slave Disembarkation by Total Captives (1500–1866)",
      x = 0.05,
      xanchor = "left"
    ),
    xaxis = list(title = "Total Captives Transported"),
    yaxis = list(title = "Year of Arrival"),
    legend = list(title = list(text = "Year")),
    margin = list(l = 100)
  )

This interactive bar graph shows the eight arrival years with the highest total number of enslaved captives between 1500 and 1866 during the transatlantic slave trade. Each bar represents one year, and the longer the bar, the more captives disembarked in that year. The bars are color-coded with a legend to help identify them. Because the data are grouped by year of arrival rather than by port, the discussion of ports that follows draws on historical context rather than on values read from this chart.

The top ports likely appear here because they were part of major colonial economies heavily dependent on enslaved labor, especially in plantation regions like the Caribbean, Brazil, and parts of Central and South America. These areas needed large numbers of enslaved Africans to work on sugar, coffee, and cotton plantations.

The color palette used in the chart adds visual distinction, making it easier to compare totals across bars. The horizontal layout helps with readability.

We can assume that the ports with the highest numbers had consistent and long-term involvement in the slave trade, especially during the 1700s and 1800s, when the slave trade was at its peak. During this time, demand for enslaved labor was extremely high, especially in Brazil and the Caribbean.

Some of these ports likely remained active even after legal abolition, which happened at different times in different countries. For example, Brazil did not end the slave trade until the 1850s, meaning its ports remained major receivers of enslaved Africans well into the 19th century.

The concentration of captives at these eight ports reflects a combination of economic demand, geographic accessibility, and colonial control by European powers. These regions had infrastructure, shipping routes, and legal systems built to support large-scale human trafficking.

There’s a historical trend of increasing captives during the 18th century, reaching a peak in the late 1700s to early 1800s, which is consistent with global trends in slave trade history.

The graph helps highlight that the trade was not evenly distributed across all ports — rather, it was concentrated in certain places that became slave trade hubs. These hubs played a central role in maintaining and expanding the transatlantic system of slavery.

This visualization shows that some disembarkation ports were more heavily involved over time, either because of their size, their economic importance, or their colonial government’s policies. For example, ports in Bahia or Havana may have had both high demand and state support.

Looking at this data over the full time range from 1500–1866 emphasizes how deeply rooted and long-lasting the transatlantic slave trade was. It also shows how systematic it was — these weren’t isolated voyages, but repeated, large-scale deliveries of people over centuries.
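The chart code above groups by year of arrival. A port-level ranking could be sketched along the following lines in base R. The toy rows are hypothetical, and the short column name `port` stands in for the long disembarkation column ending in `mjslptimp..place`.

```r
# Hypothetical toy voyages; `port` stands in for the long
# "...mjslptimp..place" disembarkation column in the real data
toy <- data.frame(
  port             = c("Bahia", "Havana", "Bahia", "Rio de Janeiro", "Havana", "Bahia"),
  captives_arrived = c(290, 410, 350, 516, 204, 223),
  stringsAsFactors = FALSE
)

# Total captives per disembarkation port
by_port <- aggregate(captives_arrived ~ port, data = toy, FUN = sum)

# Sort from largest to smallest and keep the top n ports
n <- 2
top_ports <- head(by_port[order(-by_port$captives_arrived), ], n)

top_ports
```

With the real data, `n` would be 8 and the resulting table could be passed to the same horizontal bar chart, with the port name on the y-axis instead of the year.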

Visualization 2: Total Number of Captives Over Time (Years)

# Step 1: Group the data by year and get total captives
time_data <- df %>%
  mutate(year = round(year)) %>%
  group_by(year) %>%
  summarise(total_captives = sum(captives_arrived, na.rm = TRUE)) %>%
  mutate(
    time_group = case_when(
      year < 1700 ~ "Before 1700",
      year >= 1700 & year < 1800 ~ "1700s",
      year >= 1800 & year <= 1866 ~ "1800s",
      TRUE ~ "Other"
    )
  )

# Step 2: Create the interactive plot
plot_ly(
  data = time_data,
  x = ~year,
  y = ~total_captives,
  type = 'scatter',
  mode = 'lines+markers',
  color = ~time_group,
  colors = c("Before 1700" = "#33a02c", "1700s" = "#ff7f00", "1800s" = "#e31a1c", "Other" = "#6a3d9a"),
  marker = list(size = 6),
  line = list(width = 2)
) %>%
  layout(
    title = "Total Number of Captives Transported Each Year (1500–1866)",
    xaxis = list(title = "Year of Disembarkation"),
    yaxis = list(title = "Total Captives"),
    legend = list(title = list(text = "Map Key (Time Period)")),
    margin = list(l = 65, r = 45, t = 85, b = 65)
  )

##The year with the most enslaved people transported

time_data %>% filter(total_captives == max(total_captives))
# A tibble: 1 × 3
   year total_captives time_group
  <dbl>          <dbl> <chr>     
1  1829          79472 1800s     

This graph shows the total number of enslaved people (captives) transported each year between 1500 and 1866 during the Transatlantic Slave Trade. It is an interactive line graph with markers, so we can hover over each year to see the exact number of captives and zoom in on specific time periods. The x-axis represents the year when the captives arrived at port, and the y-axis shows the total number of captives transported that year. The totals cover only the 18,282 voyages with a recorded captive count (17,787 of the 36,069 voyages have none), so they are lower bounds on the true yearly numbers.

The graph highlights temporal trends in slave transport. We observe peaks during certain decades, followed by declines, often linked to abolition efforts. The year 1829 had the highest number of enslaved people transported: 79,472 individuals. This peak falls in the red section of the graph, which represents the 1800s, and its marker stands out clearly against the rest of the timeline.

This might seem surprising, because many people think the slave trade ended in the early 1800s. However, while Britain and the U.S. banned the transatlantic slave trade in 1807–1808, trading by or to other places such as Brazil, Cuba, and Portugal continued, often illegally, for decades. In fact, Brazil did not end the trade until the 1850s. The spike in 1829 likely reflects a period of intense trafficking led by Portuguese and Brazilian traders. This shows how legal bans did not immediately stop the trade: many ships continued illegally under different flags or used hidden routes.
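Single-year totals can be noisy, so the decade-level peaks mentioned above are easier to check after grouping years into decades. This is a minimal base R sketch on hypothetical yearly totals (a stand-in for `time_data`); the `decade` column and `by_decade` table are names introduced here for illustration.

```r
# Hypothetical yearly totals standing in for `time_data`
toy <- data.frame(
  year           = c(1798, 1799, 1805, 1821, 1828, 1829),
  total_captives = c(1000, 1500, 800, 1200, 2000, 2500)
)

# Label each year with the decade it falls in (for example 1829 -> 1820)
toy$decade <- floor(toy$year / 10) * 10

# Sum the yearly totals within each decade
by_decade <- aggregate(total_captives ~ decade, data = toy, FUN = sum)

# Decade with the largest total
peak_decade <- by_decade$decade[which.max(by_decade$total_captives)]

by_decade
```

Applied to the real yearly totals, the same grouping would show whether 1829 is an isolated spike or part of a sustained high decade.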

Visualization 3: Interactive bubble graph of Captives by Vessel and Year

library(tidyverse)
library(plotly)

# Step 1: Prepare the vessel-level data
vessel_plot_data <- df %>%
  filter(!is.na(Vessel.name), !is.na(year), !is.na(captives_arrived)) %>%
  group_by(Vessel.name, year) %>%
  summarize(total = sum(captives_arrived), .groups = "drop") %>%
  mutate(
    time_period = case_when(
      year < 1700 ~ "Before 1700",
      year >= 1700 & year < 1800 ~ "1700s",
      year >= 1800 ~ "1800s",
      TRUE ~ "Unknown"
    )
  )

# Step 2: Plot bubble chart correctly
plot_ly(
  data = vessel_plot_data,
  x = ~year,
  y = ~total,
  type = 'scatter',
  mode = 'markers',
  size = ~total,
  color = ~time_period,
  colors = c(
    "Before 1700" = "#33a02c",
    "1700s" = "#ff7f00",
    "1800s" = "#e31a1c",
    "Unknown" = "#6a3d9a"
  ),
  marker = list(
    sizemode = 'diameter',
    sizeref = 2.5 * max(vessel_plot_data$total) / (100^2),
    sizemin = 4,
    line = list(width = 1, color = '#FFFFFF')
  ),
  text = ~paste(
    "Vessel:", Vessel.name,
    "<br>Year:", year,
    "<br>Captives:", total
  ),
  hoverinfo = "text"
) %>%
  layout(
    title = "Bubble Chart: Number of Captives Transported by Vessel Over Time",
    xaxis = list(title = "Year of Disembarkation"),
    yaxis = list(title = "Number of Captives Transported"),
    legend = list(title = list(text = "Time Period (Map Key)")),
    margin = list(l = 70, r = 40, t = 80, b = 60)
  )
Warning: `line.width` does not currently support multiple values.
Warning: `line.width` does not currently support multiple values.
Warning: `line.width` does not currently support multiple values.

##Explanation

This bubble chart shows how many enslaved people were transported under each vessel name over time, from the 1500s to the mid-1800s. Each bubble represents a vessel name in a specific year, and the size of the bubble shows how many captives were carried under that name in that year; voyages that share a vessel name and a year are summed into one bubble. The x-axis is the year and the y-axis is the number of captives transported.

Looking at the chart, most of the larger bubbles are in the 1700s and 1800s (orange and red). That means ships carried more people during those centuries, which tells us those were the peak years of the transatlantic slave trade. Before 1700, we see fewer and smaller bubbles, showing that the trade was still growing in the early years.

The red bubbles in the early 1800s stand out the most. That is when the slave trade was running at full scale, especially on Portuguese and Brazilian ships. This happened even though Britain and the U.S. passed laws to ban the slave trade around 1807–1808. Other countries ignored or broke these rules, which is why the trade continued.

As for the country that received the most captives, based on the ports and historical context, it is likely Brazil. Many of the ports in Brazil, especially Bahia and Rio de Janeiro, received huge numbers of enslaved people. Brazil kept the trade going longer than some other countries and did not end it until around 1850. That is why we still see a lot of big bubbles even in later years like 1829.

Most of the bubbles overall are red and orange, which means most of the recorded voyages, not just the largest ones, took place in the 1700s and 1800s.
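Because the bubble data are grouped by vessel name and year, two different voyages that share a name and a year end up in the same bubble. The sketch below shows that effect in base R on hypothetical toy rows, and counts how many voyages sit behind each bubble; `by_name_year` and `voyages_per_bubble` are names made up for this example.

```r
# Hypothetical toy voyages: two separate voyages share a vessel name and year
toy <- data.frame(
  Voyage.ID        = c(1, 2, 3),
  Vessel.name      = c("Paquete Real", "Paquete Real", "Bom Caminho"),
  year             = c(1817, 1817, 1817),
  captives_arrived = c(350, 300, 342),
  stringsAsFactors = FALSE
)

# Grouping by name and year merges the two "Paquete Real" voyages into one row
by_name_year <- aggregate(captives_arrived ~ Vessel.name + year,
                          data = toy, FUN = sum)

# Number of voyages that contribute to each name-and-year group
voyages_per_bubble <- aggregate(Voyage.ID ~ Vessel.name + year,
                                data = toy, FUN = length)

by_name_year
voyages_per_bubble
```

If one bubble per voyage is preferred, grouping by `Voyage.ID` instead of the vessel name would keep such voyages separate.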

##Background Research

According to the Trans-Atlantic Slave Trade Database, over 12 million Africans were forcibly transported across the Atlantic. This trade profoundly shaped the histories of Africa, Europe, and the Americas. Scholars like David Eltis have compiled vast records from archives to trace this inhuman commerce. The transatlantic slave trade was an oceanic trade in African men, women, and children which lasted from the mid-sixteenth century until the 1860s. European traders loaded African captives at dozens of points on the African coast, from Senegambia to Angola and round the Cape to Mozambique. The great majority of captives were collected from West and Central Africa and from Angola.

The trade was initiated by the Portuguese and Spanish especially after the settlement of sugar plantations in the Americas. European planters spread sugar, cultivated by enslaved Africans on plantations in Brazil, and later Barbados, throughout the Caribbean. In time, planters sought to grow other profitable crops, such as tobacco, rice, coffee, cocoa, and cotton, with European indentured laborers as well as African and Indian slave laborers.

Europeans used various methods to organize the Atlantic trade. Spain licensed (by Asiento agreements) other nations to supply its Spanish American and Caribbean colonies with African captives. France, the Netherlands, and England initially used monopoly companies.

During the long coercive interlude of forced trans-Atlantic migration, European and African conceptions of self and community (and eligibility for enslavement) did not remain static. On the African side, the major effect of the African-European exchange was to encourage an elementary pan-Africanism, at least among victims. The initial and unintentional impact of European sea-borne contact was to force non-elite Africans to think of themselves as part of a wider African group. Initially, this group might be Igbo, or Yoruba, and soon, in addition, black as opposed to white. At the most elemental level, by the late eighteenth century, the slaves at James Island vowed to drink the blood of the white men. In Gorée, a little later, one third of the slaves, in a carefully planned conspiracy, “would go in the village and be dispersed to massacre the whites.” When asked “whether it were true that they had planned to massacre all the whites of the island the two leaders, far from denying the fact or looking for prevarication, answered with boldness and courage: that nothing was truer.” Many similar incidents could be cited from the American side of the Atlantic. And on board a slave ship with the slaves always black, and the crew largely white, skin color defined ethnicity.

In the Atlantic after 1492, oceans that had hermetically sealed peoples and cultures from each other sprouted sea-lanes almost overnight. Cultural accommodation between peoples, in this case between Europeans and non-Europeans, always took time. The big difference was that before Columbus, migrations had been gradual and tended to move outwards from the more to the less densely populated parts of the globe. But Columbian contact was sudden, and inhibited any gradual adjustment, cultural as well as epidemiological. A merging of perceptions of right and wrong, group identities, and relations between the sexes, to look only at the top of a very long list of social values, could not be expected to occur quickly in a post-Columbian world. In short, cultural adjustment could not keep pace with transportation technology. The result was first the rise, and then, as perceptions of the insider-outsider divide slowly changed, the fall, of the trans-Atlantic trade in enslaved Africans.

Final Reflections

This project gave me new insight into the scale, scope, and mechanics of the slave trade. I was surprised by how consistent the volume remained over long periods and how centralized some ports were.

One limitation was the lack of geographic coordinates to produce a leaflet map. I also wished there was more complete data on the origins of captives and more consistent captain naming conventions.

Despite these limits, this project taught me the power of combining history and data science. The interactive visualizations especially helped turn abstract data into more relatable and humanized stories.

##Sources

Class Google graph

ChatGPT, used to make graph two interactive and for part of graph one, to get the graph working after some issues with renaming columns.

The history of the transatlantic slave trade. (n.d.). Royal Museums Greenwich. https://www.rmg.co.uk/stories/topics/history-transatlantic-slave-trade

Lewis, & Thomas. (2025, April 21). Transatlantic slave trade | History & Facts. Encyclopedia Britannica.

Trans-Atlantic - about the database. (n.d.). https://www.slavevoyages.org/voyage/essays#interpretation/overview-trans-atlantic-slave-trade/influence-ethnic-racial-identity/8/en/
