Assignment: Analyzing Trans-Atlantic and Intra-American Slave Trade with Tidyverse

Objective

In this assignment, you will use R and the tidyverse package to load, clean, and analyze two datasets on the slave trade: Trans-Atlantic (from Africa to the Americas) and Intra-American (within the Americas). You will clean the data, add new columns (e.g., decade, estimated deaths, US indicator), combine the datasets, and answer key questions about slave imports to the US. You will then publish your results to RPubs using R Markdown.

Part 1: Data Loading and Cleaning (1 point)

Use the code below to load the required library and read in the data:

# Load required library
library(tidyverse)


col_types_spec <- cols_only(
  id = col_integer(),
  voyage_id = col_integer(),
  voyage_dates__imp_arrival_at_port_of_dis_sparsedate__year = col_double(),
  voyage_slaves_numbers__imp_total_num_slaves_disembarked = col_double(),
  voyage_slaves_numbers__imp_total_num_slaves_embarked = col_double(),
  voyage_dates__length_middle_passage_days = col_double(),
  voyage_dates__imp_length_home_to_disembark = col_double(),
  voyage_crew__crew_first_landing = col_double(),
  voyage_crew__crew_voyage_outset = col_double(),
  voyage_ship__tonnage_mod = col_double(),
  voyage_slaves_numbers__imp_jamaican_cash_price = col_double(),
  voyage_slaves_numbers__imp_mortality_ratio = col_double(),
  voyage_slaves_numbers__percentage_women_among_embarked_slaves = col_double(),
  voyage_outcome__vessel_captured_outcome__name = col_character(),
  voyage_ship__imputed_nationality__name = col_character(),
  voyage_itinerary__imp_region_voyage_begin__name = col_character(),
  voyage_ship__rig_of_vessel__name = col_character(),
  voyage_itinerary__place_voyage_ended__name = col_character(),  # Force as character
  voyage_dates__slave_purchase_began_sparsedate__month = col_double(),
  voyage_slaves_numbers__percentage_men = col_double(),
  voyage_dates__voyage_completed_sparsedate__month = col_double(),
  voyage_itinerary__region_of_return__name = col_character(),
  voyage_slaves_numbers__percentage_boy = col_double(),
  voyage_itinerary__imp_principal_region_slave_dis__name = col_character(),
  voyage_itinerary__imp_principal_region_of_slave_purchase__name = col_character(),
  voyage_dates__date_departed_africa_sparsedate__month = col_double(),
  voyage_dates__voyage_began_sparsedate__month = col_double(),
  voyage_itinerary__imp_port_voyage_begin__name = col_character(),
  voyage_dates__first_dis_of_slaves_sparsedate__month = col_double(),
  voyage_itinerary__imp_broad_region_slave_dis__name = col_character(),
  voyage_slaves_numbers__percentage_girl = col_double(),
  voyage_outcome__particular_outcome__name = col_character(),
  voyage_itinerary__imp_principal_port_slave_dis__name = col_character(),
  voyage_slaves_numbers__percentage_child = col_double(),
  voyage_slaves_numbers__percentage_women = col_double(),
  voyage_dates__departure_last_place_of_landing_sparsedate__month = col_double(),
  voyage_outcome__outcome_owner__name = col_character(),
  voyage_outcome__outcome_slaves__name = col_character(),
  voyage_itinerary__imp_principal_place_of_slave_purchase__name = col_character(),
  voyage_outcome__resistance__name = col_character(),
  voyage_slaves_numbers__percentage_male = col_double(),
  voyage_slaves_numbers__percentage_female = col_double(),
  voyage_itinerary__imp_broad_region_voyage_begin__name = col_character(),
  voyage_itinerary__imp_broad_region_of_slave_purchase__name = col_character(),
  voyage_sources = col_character(),
  enslavers = col_character()
)


# Load the datasets
trans <- read_csv("https://raw.githubusercontent.com/imowerman-prog/data-3210/refs/heads/main/Data/trans-atlantic.csv", col_types = col_types_spec)
intra <- read_csv("https://raw.githubusercontent.com/imowerman-prog/data-3210/refs/heads/main/Data/intra-american.csv", col_types = col_types_spec)

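Rename the long column names for readability (e.g., year, slaves_embarked, slaves_disembarked, plus short names such as dis_broad, dis_region, and dis_port for the disembarkation columns referenced later in this assignment).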
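
As an illustration of the renaming step, here is a minimal sketch on a one-row toy tibble rather than the real data. The pairing of each short name with a long column is an assumption inferred from the column names in the spec above, so check it against your own reading of the dataset.

```r
library(tidyverse)

# Toy stand-in for the real data: one made-up row with a few of the long column names.
toy <- tibble(
  voyage_id = 1L,
  voyage_dates__imp_arrival_at_port_of_dis_sparsedate__year = 1750,
  voyage_slaves_numbers__imp_total_num_slaves_embarked = 300,
  voyage_slaves_numbers__imp_total_num_slaves_disembarked = 250,
  voyage_itinerary__imp_broad_region_slave_dis__name = "Mainland North America"
)

# rename() takes new_name = old_name pairs; columns not listed keep their names.
# The mapping below is an assumption, not something the assignment prescribes.
toy_renamed <- toy %>%
  rename(
    year = voyage_dates__imp_arrival_at_port_of_dis_sparsedate__year,
    slaves_embarked = voyage_slaves_numbers__imp_total_num_slaves_embarked,
    slaves_disembarked = voyage_slaves_numbers__imp_total_num_slaves_disembarked,
    dis_broad = voyage_itinerary__imp_broad_region_slave_dis__name
  )

names(toy_renamed)
```

The same rename() call can be applied to both trans and intra, since they are read with the same column spec.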

Clean the data:

  • Convert year to integer, slave numbers to numeric.

  • Filter out rows where slaves_disembarked is 0 or NA (incomplete voyages).

  • Filter for successful outcomes (e.g., “Slaves disembarked”, “Voyage completed”, “Sold slaves”).

  • Add new columns: decade (e.g., floor(year / 10) * 10), estimated_deaths (slaves_embarked - slaves_disembarked), and is_us (TRUE if the disembarkation location is in the US, e.g., dis_broad == "Mainland North America" or a specific US region/port such as "New Orleans").

  • Combine the datasets with bind_rows(), adding a source_type column (“Trans-Atlantic” or “Intra-American”).
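
The cleaning steps above could be sketched roughly as follows. This is one possible approach on made-up toy rows, not the required solution. The helper name clean_voyages and the short column name outcome are hypothetical, and the outcome labels are just the examples listed above, so the labels in the real data may differ.

```r
library(tidyverse)

# Hypothetical helper. Assumes the columns were already renamed to
# year, slaves_embarked, slaves_disembarked, outcome, dis_broad, dis_port.
clean_voyages <- function(df, source_label) {
  df %>%
    mutate(
      year = as.integer(year),
      slaves_embarked = as.numeric(slaves_embarked),
      slaves_disembarked = as.numeric(slaves_disembarked)
    ) %>%
    # filter() drops rows where the condition is FALSE or NA,
    # so this removes both 0 and missing values.
    filter(slaves_disembarked > 0) %>%
    filter(outcome %in% c("Slaves disembarked", "Voyage completed", "Sold slaves")) %>%
    mutate(
      decade = floor(year / 10) * 10,
      estimated_deaths = slaves_embarked - slaves_disembarked,
      # %in% returns FALSE (not NA) when the location is missing.
      is_us = dis_broad %in% "Mainland North America" | dis_port %in% "New Orleans",
      source_type = source_label
    )
}

# Made-up rows for illustration only.
toy_trans <- tribble(
  ~year, ~slaves_embarked, ~slaves_disembarked, ~outcome, ~dis_broad, ~dis_port,
  1754, 300, 250, "Slaves disembarked", "Mainland North America", "Port A",
  1761, 200, 0,   "Voyage completed",   "Caribbean",              "Port B",
  1768, 150, NA,  "Slaves disembarked", "Caribbean",              "Port B",
  1772, 400, 360, "Shipwrecked",        "Caribbean",              "Port B"
)
toy_intra <- tribble(
  ~year, ~slaves_embarked, ~slaves_disembarked, ~outcome, ~dis_broad, ~dis_port,
  1806, 50, 48, "Sold slaves", "Mainland North America", "New Orleans"
)

combined <- bind_rows(
  clean_voyages(toy_trans, "Trans-Atlantic"),
  clean_voyages(toy_intra, "Intra-American")
)
combined
```

Wrapping the steps in a function keeps the two datasets on identical rules. An alternative is to pass a named list to bind_rows() with its .id argument, which builds the source column from the list names.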

Part 2: Analysis and Questions (5 points)

  • Total slaves imported to the US: Filter for is_us == TRUE, sum slaves_disembarked from both datasets.
  • Proportion of all slaves taken from Africa who were imported to the US: Divide the US total by the total slaves_embarked in the Trans-Atlantic dataset (which represents all slaves taken from Africa).
  • Graph slave imports by decade to the US: Filter for US, group by decade, sum slaves_disembarked, plot as a bar graph with ggplot2.
  • Imports to the US by decade and region/port/state: Filter for US, group by decade, dis_region, dis_port (approximate state from port/region, e.g., “New Orleans” -> “Louisiana”), sum slaves_disembarked. Use a table and faceted bar plot.
  • Countries participating in export from Africa, by decade: From Trans-Atlantic dataset, group by decade and voyage_ship__imputed_nationality__name (as “country”), count unique voyages or sum slaves_embarked. Display in a table.
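
The summaries requested above mostly follow a filter, group, summarise pattern. The sketch below shows that pattern on made-up rows shaped like the combined dataset. The column name country stands in for voyage_ship__imputed_nationality__name after renaming, and all values are invented for illustration.

```r
library(tidyverse)

# Made-up rows in the shape of the cleaned, combined data.
combined <- tribble(
  ~voyage_id, ~source_type,     ~decade, ~country, ~slaves_embarked, ~slaves_disembarked, ~is_us,
  1,          "Trans-Atlantic", 1750,    "A",      300,              250,                 TRUE,
  2,          "Trans-Atlantic", 1750,    "B",      400,              350,                 FALSE,
  3,          "Trans-Atlantic", 1760,    "A",      300,              270,                 FALSE,
  4,          "Intra-American", 1760,    "A",      60,               50,                  TRUE
)

# Total disembarked in the US, across both sources.
us_total <- combined %>%
  filter(is_us) %>%
  summarise(total = sum(slaves_disembarked, na.rm = TRUE)) %>%
  pull(total)

# Total embarked in Africa: Trans-Atlantic voyages only.
africa_total <- combined %>%
  filter(source_type == "Trans-Atlantic") %>%
  summarise(total = sum(slaves_embarked, na.rm = TRUE)) %>%
  pull(total)

us_share <- us_total / africa_total

# US imports by decade.
us_by_decade <- combined %>%
  filter(is_us) %>%
  group_by(decade) %>%
  summarise(imported = sum(slaves_disembarked, na.rm = TRUE), .groups = "drop")

# Participating countries by decade: unique voyages and total embarked.
country_by_decade <- combined %>%
  filter(source_type == "Trans-Atlantic") %>%
  group_by(decade, country) %>%
  summarise(
    voyages = n_distinct(voyage_id),
    embarked = sum(slaves_embarked, na.rm = TRUE),
    .groups = "drop"
  )

us_share
country_by_decade
```

Using na.rm = TRUE in each sum() keeps a single missing value from turning a whole decade's total into NA.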

Part 3: Visualizations and Publication (2 points)

  • Create at least 2 plots (e.g., bar for US imports by decade, faceted bar for US by decade/region).
  • Write a summary of what you have uncovered from this assignment.
  • Knit the Rmd file and publish to RPubs. Include a link to your RPubs page in your submission.