##### This R Markdown document is for anyone interested in working with AIS Marine Traffic Vessel Data. The R script that follows was originally created to download and organize AIS Vessel Data in order to study patterns in, and share insights into, the controversy over the Vieques, Puerto Rico ferry system. On NOAA’s website, the data is stored in CSVs, one per day, that cover all U.S. marine traffic and include information like the vessel name, size, speed, location, and a time stamp. This script will download however many days you want and write the data to new CSVs on your device, keeping only the records that fall within the latitude/longitude bounds you input. There’s lots that can be done once the data is downloaded and organized. Included in this script is one such example.

We’re going to start out by setting our working directory and creating a second object that points to the folder for our output files.

myWD = "/Volumes/BlackSams/projects/workingDirectory" ## Create your own folders on your computer and copy their file paths here.
myLD = "/Volumes/BlackSams/projects/listDirectory"
setwd(myWD)

AIS Vessel Data can be accessed through NOAA’s website on separate pages for each year. 2019, for example, can be accessed here: link. As you’ll see, each day of the year has its own URL and its own dataset that you can download by clicking on that URL. To avoid having to go through and download each file one-by-one, we’re going to use a For Loop.

But before we get there, we’ve got to do some setup. The losDias vector we’re creating below will give us a list of dates that we can use to build each individual URL. The minLon, minLat, maxLon, and maxLat objects will allow us to go through each dataset and extract only the data that falls within the area we’re interested in. The urlStart object will be combined with each date in the For Loop to complete the URL.

losDias <- format(seq(as.Date("2018/01/01"), by = "day", length.out = 2),"%Y_%m_%d")

minLon = -65.8
maxLon = -65.1
maxLat = 18.5
minLat = 17.9
x = NULL
df = NULL
urlStart <- "https://coast.noaa.gov/htdata/CMSP/AISDataHandler/2018/AIS_"
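To see what the setup produces before running anything heavy, here’s a small sketch that builds three dates and the matching URLs. The names exampleStart, exampleDias, and exampleURLs are made up for this illustration and aren’t used by the script itself.

```r
# Illustrative only: these example* names are hypothetical, not part of the script.
exampleStart <- "https://coast.noaa.gov/htdata/CMSP/AISDataHandler/2018/AIS_"

# seq() steps one day at a time; format() turns each date into YYYY_MM_DD text.
exampleDias <- format(seq(as.Date("2018-01-01"), by = "day", length.out = 3), "%Y_%m_%d")

# paste0() glues the pieces together with no separator.
exampleURLs <- paste0(exampleStart, exampleDias, ".zip")

print(exampleDias)
# [1] "2018_01_01" "2018_01_02" "2018_01_03"
print(exampleURLs[1])
# [1] "https://coast.noaa.gov/htdata/CMSP/AISDataHandler/2018/AIS_2018_01_01.zip"
```

Changing length.out is all it takes to cover more days.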

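Next, the For Loop. We’re not going to go through it line by line here, since it gets a little complicated and a little redundant. In short: for each date, it downloads and unzips that day’s file, keeps only the rows inside our bounds, writes them to a new CSV in the list directory, and deletes the downloaded files. But trust me, it works!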

options(timeout=2000)

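for(dia in losDias){
        daURL <- paste0(urlStart,dia,".zip")          ## URL of this day's zip file
        daTemp <- paste0(myWD,"/temp",".zip")         ## where the zip is saved
        daCSV <- paste0(myWD,"/AIS_",dia,".csv")      ## the CSV that comes out of the zip
        daNewCSV <- paste0(myLD,"/PR_",dia,".csv")    ## the filtered CSV we keep
        download.file(daURL, daTemp,  mode = "wb")
        unzip(daTemp, exdir = myWD)                   ## extract into the working directory
        df = read.csv(daCSV)
        ## which() drops any rows where LON or LAT is missing instead of keeping them as NA rows
        df_PR = df[which(df$LON > minLon & df$LON < maxLon & df$LAT > minLat & df$LAT < maxLat),]
        write.csv(df_PR, file = daNewCSV, row.names = FALSE)  ## no extra row-number column
        df = NULL ## Clear the data frames so nothing gets reused in the next pass of the loop
        df_PR = NULL
        file.remove(daTemp) ## Delete the downloaded zip and the unzipped CSV
        file.remove(daCSV)
}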
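The filtering step is the heart of the loop, so here’s a minimal sketch of it on a made-up data frame. The toy data and the inBox name are only for illustration; the bounds are the same ones from the setup. One detail worth knowing: if LON or LAT is missing, plain logical indexing hands back an all-NA row, while wrapping the test in which() drops it.

```r
# Toy data standing in for one day's AIS file; the values are made up.
toy <- data.frame(
  VesselName = c("A", "B", "C", "D"),
  LON = c(-65.5, -64.9, -65.3, NA),
  LAT = c(18.1, 18.1, 19.0, 18.2),
  stringsAsFactors = FALSE
)

# Same bounds as in the setup section.
minLon <- -65.8; maxLon <- -65.1
minLat <- 17.9;  maxLat <- 18.5

# TRUE when a row is inside the box, FALSE when outside, NA when a coordinate is missing.
inBox <- toy$LON > minLon & toy$LON < maxLon &
         toy$LAT > minLat & toy$LAT < maxLat

# Plain logical indexing keeps an all-NA row for vessel D.
print(nrow(toy[inBox, ]))
# [1] 2

# which() keeps only the rows that are definitely inside the box.
toy_PR <- toy[which(inBox), ]
print(toy_PR$VesselName)
# [1] "A"
```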

If you’ve just run the code up to here, the list directory you created at the beginning should be populated with however many CSV files you told the code to create when you chose the number to put in the ‘length.out’ part of the losDias vector.

The next step is to create one large CSV that combines all of the CSVs in the list folder. Run the code below and you’ll output a CSV called “AllVessels2018.csv”.

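setwd(myLD)
## Only pick up the per-day files, so an AllVessels2018.csv from an earlier run is not read back in
file_list <- list.files(myLD, pattern = "^PR_.*\\.csv$")

## Start from an empty object so that each file is read and appended exactly once
df <- NULL
for (listFile in file_list){
  temp_dataset <- read.csv(listFile, header=TRUE, sep=",")
  df <- rbind(df, temp_dataset)
  rm(temp_dataset)
}

write.csv(df, file = "AllVessels2018.csv", row.names = FALSE)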
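If you’d rather not grow a data frame inside a loop, the same merge can be sketched with lapply() and do.call(rbind, ...). The example below is self-contained: it writes a few tiny made-up CSVs to a temporary folder (demoDir, demoFiles, and demoAll are hypothetical names) and then combines only the per-day files.

```r
# Self-contained demo in a temporary folder; the file contents are made up.
demoDir <- file.path(tempdir(), "listDirectoryDemo")
dir.create(demoDir, showWarnings = FALSE)

write.csv(data.frame(VesselName = "A", LON = -65.5, LAT = 18.1),
          file.path(demoDir, "PR_2018_01_01.csv"), row.names = FALSE)
write.csv(data.frame(VesselName = c("B", "C"), LON = c(-65.4, -65.3), LAT = c(18.2, 18.3)),
          file.path(demoDir, "PR_2018_01_02.csv"), row.names = FALSE)
# A leftover combined file in the same folder, to show that it gets skipped.
write.csv(data.frame(VesselName = "old", LON = 0, LAT = 0),
          file.path(demoDir, "AllVessels2018.csv"), row.names = FALSE)

# Match only the per-day files, read each one once, then stack them.
demoFiles <- list.files(demoDir, pattern = "^PR_.*\\.csv$", full.names = TRUE)
demoAll <- do.call(rbind, lapply(demoFiles, read.csv, stringsAsFactors = FALSE))

print(nrow(demoAll))
# [1] 3
```

Using full.names = TRUE means this version doesn’t depend on the working directory being set to the list folder.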

Now that we’ve got one full dataset with all of the days we’re interested in, we can start to explore the data. One fun way to do that is by creating a map with the wonderful open source JavaScript library called Leaflet. Run the code below and you’ll get yourself an interactive map.

install.packages("leaflet",repos = "http://cran.us.r-project.org")
library(leaflet)

name <- as.character(df$VesselName)
time <- as.character(df$BaseDateTime)

popupText <- paste0("Vessel Name: ", name, " | ", "Time Stamp: ", time)

myMap <- leaflet(df, options = leafletOptions(preferCanvas = TRUE)) %>%
  addTiles() %>%
  addCircleMarkers(lng = ~LON, lat = ~LAT, radius = 1, popup = ~popupText, label = ~popupText)

myMap
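The map is only one way in. As another small starting point, here’s a sketch that counts position reports per vessel and per day, using base R only. It runs on a made-up data frame (toyAll), and it assumes the time stamps look like "2018-01-01T06:00:00" and are in UTC, so check that against your own BaseDateTime column before relying on it.

```r
# Toy stand-in for the combined data; the values are made up.
toyAll <- data.frame(
  VesselName = c("A", "A", "B", "A"),
  BaseDateTime = c("2018-01-01T06:00:00", "2018-01-01T06:30:00",
                   "2018-01-01T07:00:00", "2018-01-02T06:05:00"),
  stringsAsFactors = FALSE
)

# Assumption: ISO-style time stamps in UTC. Adjust format/tz if your files differ.
toyAll$when <- as.POSIXct(toyAll$BaseDateTime, format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")
toyAll$day <- format(toyAll$when, "%Y-%m-%d")

# How many position reports does each vessel have overall?
pingsPerVessel <- table(toyAll$VesselName)
print(pingsPerVessel[["A"]])
# [1] 3

# And how many per vessel per day?
pingsPerDay <- table(toyAll$VesselName, toyAll$day)
print(pingsPerDay["A", "2018-01-01"])
# [1] 2
```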

There is a lot more that can be done with AIS Vessel Data. Try exploring the data on your own and see what you can come up with. In the future, I’ll share how to analyze the lateness of vessels, which in the case of Vieques is a huge issue. Read more about that here: Human Rights Implications of an Inadequate Transportation System between the Islands of Culebra, Vieques, and Puerto Rico

Hope this is useful!