The purpose of this code is to reformat the environmental upcasts (bottle measurements) and downcasts (continuous CTD measurements) from the WOAC cruises so that they can be added to the zooplankton abundance data. It produces a data frame of upcast data named WOAC_upcasts and a data frame of downcast data named WOAC_downcasts.
load packages
library(openxlsx)
library(dplyr)
library(lubridate)
library(readxl)
Download data from NANOOS website: I first went to https://nvs.nanoos.org/CruiseSalish and downloaded all relevant files. I moved them to the NANOOS-files folder. I unzipped them manually. I created folders named upcasts and downcasts to add the appropriate files to.
This code produces a dataframe named WOAC_upcasts which has all of the upcast files merged vertically with common columns.
#find current directory
pwd
## /Users/hailaschultz/Dropbox/Schultz_Dissertation/Data_Analysis/Schultz_dissertation-2/code
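Move all files labeled upcast to the upcasts folder. find also matches the files that are already in upcasts, which is why mv reports them as identical in the output; those messages are harmless.
# change to the NANOOS_files directory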
cd /Users/hailaschultz/Dropbox/Schultz_Dissertation/Data_Analysis/Schultz_dissertation-2/data/NANOOS_files
find . -name '*labupcast.xlsx' -exec mv {} ../NANOOS_files/upcasts/ \;
## mv: ./upcasts/SalishCruise_September2021_labupcast.xlsx and ../NANOOS_files/upcasts/SalishCruise_September2021_labupcast.xlsx are identical
## mv: ./upcasts/SalishCruise_July2020_labupcast.xlsx and ../NANOOS_files/upcasts/SalishCruise_July2020_labupcast.xlsx are identical
## mv: ./upcasts/SalishCruise_April2015_labupcast.xlsx and ../NANOOS_files/upcasts/SalishCruise_April2015_labupcast.xlsx are identical
## mv: ./upcasts/SalishCruise_July2021_labupcast.xlsx and ../NANOOS_files/upcasts/SalishCruise_July2021_labupcast.xlsx are identical
## mv: ./upcasts/SalishCruise_September2020_labupcast.xlsx and ../NANOOS_files/upcasts/SalishCruise_September2020_labupcast.xlsx are identical
## mv: ./upcasts/SalishCruise_April2019_labupcast.xlsx and ../NANOOS_files/upcasts/SalishCruise_April2019_labupcast.xlsx are identical
## mv: ./upcasts/SalishCruise_September2022_labupcast.xlsx and ../NANOOS_files/upcasts/SalishCruise_September2022_labupcast.xlsx are identical
## mv: ./upcasts/SalishCruise_April2016_labupcast.xlsx and ../NANOOS_files/upcasts/SalishCruise_April2016_labupcast.xlsx are identical
## mv: ./upcasts/SalishCruise_April2017_labupcast.xlsx and ../NANOOS_files/upcasts/SalishCruise_April2017_labupcast.xlsx are identical
## mv: ./upcasts/SalishCruise_July2022_labupcast.xlsx and ../NANOOS_files/upcasts/SalishCruise_July2022_labupcast.xlsx are identical
## mv: ./upcasts/SalishCruise_April2018_labupcast.xlsx and ../NANOOS_files/upcasts/SalishCruise_April2018_labupcast.xlsx are identical
## mv: ./upcasts/SalishCruise_September2017_labupcast.xlsx and ../NANOOS_files/upcasts/SalishCruise_September2017_labupcast.xlsx are identical
## mv: ./upcasts/SalishCruise_July2019_labupcast.xlsx and ../NANOOS_files/upcasts/SalishCruise_July2019_labupcast.xlsx are identical
## mv: ./upcasts/SalishCruise_July2016_labupcast.xlsx and ../NANOOS_files/upcasts/SalishCruise_July2016_labupcast.xlsx are identical
## mv: ./upcasts/SalishCruise_April2023_labupcast.xlsx and ../NANOOS_files/upcasts/SalishCruise_April2023_labupcast.xlsx are identical
## mv: ./upcasts/SalishCruise_September2018_labupcast.xlsx and ../NANOOS_files/upcasts/SalishCruise_September2018_labupcast.xlsx are identical
## mv: ./upcasts/SalishCruise_September2019_labupcast.xlsx and ../NANOOS_files/upcasts/SalishCruise_September2019_labupcast.xlsx are identical
## mv: ./upcasts/SalishCruise_April2022_labupcast.xlsx and ../NANOOS_files/upcasts/SalishCruise_April2022_labupcast.xlsx are identical
## mv: ./upcasts/SalishCruise_July2017_labupcast.xlsx and ../NANOOS_files/upcasts/SalishCruise_July2017_labupcast.xlsx are identical
## mv: ./upcasts/SalishCruise_July2018_labupcast.xlsx and ../NANOOS_files/upcasts/SalishCruise_July2018_labupcast.xlsx are identical
## mv: ./upcasts/SalishCruise_September2016_labupcast.xlsx and ../NANOOS_files/upcasts/SalishCruise_September2016_labupcast.xlsx are identical
## mv: ./upcasts/SalishCruise_September2014_labupcast.xlsx and ../NANOOS_files/upcasts/SalishCruise_September2014_labupcast.xlsx are identical
## mv: ./upcasts/SalishCruise_July2015_labupcast.xlsx and ../NANOOS_files/upcasts/SalishCruise_July2015_labupcast.xlsx are identical
## mv: ./upcasts/SalishCruise_April2021_labupcast.xlsx and ../NANOOS_files/upcasts/SalishCruise_April2021_labupcast.xlsx are identical
## mv: ./upcasts/SalishCruise_July2014_labupcast.xlsx and ../NANOOS_files/upcasts/SalishCruise_July2014_labupcast.xlsx are identical
## mv: ./upcasts/SalishCruise_September2015_labupcast.xlsx and ../NANOOS_files/upcasts/SalishCruise_September2015_labupcast.xlsx are identical
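The same move can also be done from within R, skipping files that are already in the destination folder so that nothing is moved onto itself. This is a minimal sketch: `move_matching` is a hypothetical helper, not part of the original workflow.

```r
# Hypothetical helper: move files matching `pattern` from anywhere under `root` into `dest`
move_matching <- function(root, dest, pattern) {
  dir.create(dest, showWarnings = FALSE, recursive = TRUE)
  found <- list.files(root, pattern = pattern, recursive = TRUE, full.names = TRUE)
  # Skip files that already sit in the destination folder
  already <- normalizePath(dirname(found)) == normalizePath(dest)
  to_move <- found[!already]
  file.rename(to_move, file.path(dest, basename(to_move)))
  invisible(to_move)
}
```

Assuming the working directory is NANOOS_files, a call such as `move_matching(".", "upcasts", "labupcast\\.xlsx$")` would correspond to the find command.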
The 2014 and 2015 upcasts were not yet in NANOOS, so I had to download them from NCEI: https://www.ncei.noaa.gov/access/ocean-carbon-acidification-data-system/oceans/SalishCruise_DataPackage.html. After downloading, I moved them directly to the upcasts folder and manually converted them from csv to excel.
import the excel files into R and merge into one large list
#name the directory
excel_dir<-"/Users/hailaschultz/Dropbox/Schultz_Dissertation/Data_Analysis/Schultz_dissertation-2/data/NANOOS_files/upcasts"
#get a list of file names
excel_files <- list.files(path = excel_dir, pattern = "\\.xlsx$", full.names = TRUE)
# Initialize an empty list to store the data frames
dfs <- list()
# Loop through each Excel file and read it into a data frame
for (file in excel_files) {
  # Read the Excel file into a data frame
  df <- read_excel(file)
  # Store the data frame in the list
  dfs[[length(dfs) + 1]] <- df
}
example of how to access dataframes
dfs[[1]]
## # A tibble: 235 × 39
## CRUISE_ID DATE_UTC TIME_UTC DATE_LOCAL
## <chr> <dttm> <dttm> <dttm>
## 1 CAB1028 2015-04-05 00:00:00 1899-12-31 16:35:32 2015-04-05 00:00:00
## 2 CAB1028 2015-04-05 00:00:00 1899-12-31 16:36:59 2015-04-05 00:00:00
## 3 CAB1028 2015-04-05 00:00:00 1899-12-31 16:38:36 2015-04-05 00:00:00
## 4 CAB1028 2015-04-05 00:00:00 1899-12-31 16:40:28 2015-04-05 00:00:00
## 5 CAB1028 2015-04-05 00:00:00 1899-12-31 16:42:15 2015-04-05 00:00:00
## 6 CAB1028 2015-04-05 00:00:00 1899-12-31 16:43:16 2015-04-05 00:00:00
## 7 CAB1028 2015-04-05 00:00:00 1899-12-31 16:44:02 2015-04-05 00:00:00
## 8 CAB1028 2015-04-05 00:00:00 1899-12-31 16:44:47 2015-04-05 00:00:00
## 9 CAB1028 2015-04-05 00:00:00 1899-12-31 16:45:28 2015-04-05 00:00:00
## 10 CAB1028 2015-04-05 00:00:00 1899-12-31 16:46:11 2015-04-05 00:00:00
## # ℹ 225 more rows
## # ℹ 35 more variables: TIME_LOCAL <dttm>, LONGITUDE_DEC <dbl>,
## # LATITUDE_DEC <dbl>, STATION_NO <dbl>, NISKIN_NO <dbl>, CTDPRS_DBAR <dbl>,
## # CTDTMP_DEG_C_ITS90 <dbl>, CTDTMP_FLAG_W <dbl>, CTDSAL_PSS78 <dbl>,
## # CTDSAL_FLAG_W <dbl>, SIGMATHETA_KG_M3 <dbl>, CTDOXY_UMOL_KG_ADJ <dbl>,
## # CTDOXY_MG_L <dbl>, CTDOXY_FLAG_W <dbl>, OXYGEN_UMOL_KG <dbl>,
## # OXYGEN_MG_L_1 <dbl>, OXYGEN_MG_L_2 <dbl>, OXYGEN_MG_L_3 <dbl>, …
See which columns all of the dataframes have in common
# Get column names of the first data frame
common_columns <- names(dfs[[1]])
# Loop through the remaining data frames and find common column names
# seq_along(dfs)[-1] is empty when there is only one data frame, unlike 2:length(dfs)
for (i in seq_along(dfs)[-1]) {
  # Get column names of the current data frame
  current_columns <- names(dfs[[i]])
  # Find common column names with previous data frames
  common_columns <- intersect(common_columns, current_columns)
}
# 'common_columns' now contains the column names that are common across all data frames
print(common_columns)
## [1] "CRUISE_ID" "DATE_UTC" "TIME_UTC"
## [4] "DATE_LOCAL" "TIME_LOCAL" "LONGITUDE_DEC"
## [7] "LATITUDE_DEC" "STATION_NO" "NISKIN_NO"
## [10] "CTDPRS_DBAR" "CTDTMP_DEG_C_ITS90" "CTDTMP_FLAG_W"
## [13] "CTDSAL_PSS78" "CTDSAL_FLAG_W" "SIGMATHETA_KG_M3"
## [16] "CTDOXY_FLAG_W" "OXYGEN_UMOL_KG" "OXYGEN_MG_L_1"
## [19] "OXYGEN_MG_L_2" "OXYGEN_MG_L_3" "OXYGEN_FLAG_W"
## [22] "TA_UMOL_KG" "DIC_UMOL_KG" "TA_FLAG_W"
## [25] "DIC_FLAG_W" "NITRATE_UMOL_L" "NITRITE_UMOL_L"
## [28] "AMMONIUM_UMOL_L" "PHOSPHATE_UMOL_L" "SILICATE_UMOL_L"
## [31] "NUTRIENTS_FLAG_W" "CHLA (ug/l)"
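The intersect-then-subset pattern used here can also be written without an explicit loop. The sketch below uses made-up toy data frames (the `toy_` names and their columns are illustrative only) to show `Reduce()` folding `intersect()` over the column names, followed by the subsetting and row-binding steps.

```r
# Toy data frames standing in for the per-cruise files (made-up columns)
toy_dfs <- list(
  data.frame(CRUISE_ID = "A", STATION_NO = 4, EXTRA_1 = 1),
  data.frame(STATION_NO = 8, CRUISE_ID = "B"),
  data.frame(CRUISE_ID = "C", STATION_NO = 12, EXTRA_2 = 2)
)

# Fold intersect() over the column names of every data frame
toy_common <- Reduce(intersect, lapply(toy_dfs, names))
print(toy_common)

# Keep only the shared columns, in one consistent order, then stack the rows
toy_combined <- do.call(
  rbind,
  lapply(toy_dfs, function(d) d[, toy_common, drop = FALSE])
)
```

Because every data frame is indexed with the same `toy_common` vector, the columns end up in the same order before `rbind()` is called.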
convert the date and time columns to character
# Convert the date and time columns to character so that they have the same
# type in every data frame before the data frames are combined
datetime_cols <- c("DATE_LOCAL", "DATE_UTC", "TIME_UTC", "TIME_LOCAL")
# Loop through each dataframe in the list
for (i in seq_along(dfs)) {
  # Only convert the columns that this dataframe actually has
  for (col in intersect(datetime_cols, names(dfs[[i]]))) {
    dfs[[i]][[col]] <- as.character(dfs[[i]][[col]])
  }
}
subset each dataframe to the common columns
# Loop through each data frame in the list
for (i in seq_along(dfs)) {
  # Subset the data frame to only the common columns
  dfs[[i]] <- dfs[[i]][, common_columns, drop = FALSE]
}
combine the dataframes vertically
combined_df <- do.call(rbind, dfs)
convert date from character to date
combined_df$Date <- ymd(combined_df$DATE_LOCAL)
extract month and year
combined_df$Month <- month(combined_df$Date)
combined_df$Year <- year(combined_df$Date)
unique(combined_df$Month)
## [1] 4 3 NA 7 6 9 10
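recode months to the nominal cruise month (APR, JUL, or SEP), so that casts dated in March, June, or October are grouped with their cruise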
combined_df$Month <- recode_factor(combined_df$Month,
'4' = "APR",'3'="APR", '7' = "JUL",
'6' = "JUL", '9' = "SEP", '10' = "SEP")
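The month-to-cruise mapping can be checked on a small vector before it is applied to the full table. The sketch below uses a base R named lookup instead of `recode_factor()`; the `season_lookup` and `toy_` names are illustrative, and the input is just the set of month values printed above.

```r
# Named lookup: calendar month number -> nominal cruise month
season_lookup <- c("3" = "APR", "4" = "APR", "6" = "JUL", "7" = "JUL",
                   "9" = "SEP", "10" = "SEP")

# Month values as they appear in the combined upcast table, including NA
toy_months <- c(4, 3, NA, 7, 6, 9, 10)

# Index the lookup by the month written as text; NA months stay NA
toy_season <- unname(season_lookup[as.character(toy_months)])
print(toy_season)
```

Rows whose date could not be parsed keep an NA label, so they are easy to count with `sum(is.na(toy_season))`.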
Subset stations to WOAC biology stations
unique(combined_df$STATION_NO)
## [1] "28" "5" "1" "3" "4" "26" "22" "21" "20" "7" "8" "10"
## [13] "17" "15" "14" "13" "401" "12" "11" "402" "29" "30" "31" "33"
## [25] "35" "36" "38" "28b" NA "27" "19" "18" "9" "16" "32" "37"
# Define the seven named stations
named_stations <- c("4", "8", "12", "28", "38", "402", "22")
# Subset the dataframe based on the named stations
subset_df <- combined_df[combined_df$STATION_NO %in% named_stations, ]
#add P to station numbers
subset_df$STATION_NO <- paste0("P", subset_df$STATION_NO)
Rename
WOAC_upcasts<-subset_df
This code creates a dataframe named WOAC_downcasts which has all of the downcast files merged vertically with common columns. The 2014-2015 files are processed separately from the 2016-2023 files at first because they are different file types and have different column names for the same variables.
import and merge downcast excel files (2016-2023)
# name downcast folder
downcast_dir<-"/Users/hailaschultz/Dropbox/Schultz_Dissertation/Data_Analysis/Schultz_dissertation-2/data/NANOOS_files/woac_downcasts"
downcast_excel_files <- list.files(path = downcast_dir, pattern = "\\.xlsx$", full.names = TRUE)
# Initialize an empty list to store the data frames
dfs <- list()
# Loop through each Excel file and read it into a data frame
for (file in downcast_excel_files) {
  # Read the Excel file into a data frame
  df <- read_excel(file)
  # Store the data frame in the list
  dfs[[length(dfs) + 1]] <- df
}
See which columns all of the dataframes have in common
# Get column names of the first data frame
common_columns <- names(dfs[[1]])
# Loop through the remaining data frames and find common column names
# seq_along(dfs)[-1] is empty when there is only one data frame, unlike 2:length(dfs)
for (i in seq_along(dfs)[-1]) {
  # Get column names of the current data frame
  current_columns <- names(dfs[[i]])
  # Find common column names with previous data frames
  common_columns <- intersect(common_columns, current_columns)
}
# 'common_columns' now contains the column names that are common across all data frames
print(common_columns)
## [1] "Uploadtime"
## [2] "NMEAtimeUTC"
## [3] "CruiseID"
## [4] "Station"
## [5] "Waypoint"
## [6] "Cast"
## [7] "prDM: Pressure Digiquartz"
## [8] "depSM: Depth"
## [9] "Temperature"
## [10] "potemp090C: Potential Temperature"
## [11] "potemp190C: Potential Temperature 2"
## [12] "c0S/m: Conductivity"
## [13] "sal00: Salinity Practical"
## [14] "sal11: Salinity Practical 2"
## [15] "density00: Density"
## [16] "sigma-t00: Density"
## [17] "density11: Density 2"
## [18] "sigma-È11: Density 2"
## [19] "sigma-t11: Density 2"
## [20] "sbeox1V: Oxygen raw SBE 43 2"
## [21] "sbeox0ML/L: Oxygen SBE 43"
## [22] "sbeox0Mg/L: Oxygen SBE 43"
## [23] "sbeox1ML/L: Oxygen SBE 43 2"
## [24] "sbeox1Mg/L: Oxygen SBE 43 2"
## [25] "sbox0Mm/Kg: Oxygen SBE 43"
## [26] "sbox1Mm/Kg: Oxygen SBE 43 2"
## [27] "sbeox0PS: Oxygen SBE 43"
## [28] "sbeox1PS: Oxygen SBE 43 2"
## [29] "oxsolMg/L: Oxygen Saturation Garcia & Gordon"
## [30] "flECO-AFL: Fluorescence WET Labs ECO-AFL/FL"
## [31] "PAR"
## [32] "CStarTr0: Beam Transmission WET Labs C-Star"
## [33] "CStarAt0: Beam Attenuation WET Labs C-Star"
## [34] "turbWETntu0: Turbidity WET Labs ECO"
## [35] "timeS: Time Elapsed"
## [36] "scan: Scan Count"
subset to common columns
# Loop through each data frame in the list
for (i in seq_along(dfs)) {
  # Subset the data frame to only the common columns
  dfs[[i]] <- dfs[[i]][, common_columns, drop = FALSE]
}
combine datasheets vertically
combined_df <- do.call(rbind, dfs)
remove units rows
combined_df<-subset(combined_df,NMEAtimeUTC!="[]")
convert all dates to the correct format
# Define a helper function that converts the whole column of timestamps at once
convert_NMEAtime <- function(x) {
  # Purely numeric entries are Excel serial dates (days since 1899-12-30)
  is_serial <- grepl("^[0-9]+(\\.[0-9]+)?$", x)
  out <- as.POSIXct(rep(NA_real_, length(x)), origin = "1970-01-01", tz = "UTC")
  # Convert Excel serial dates to POSIXct
  out[is_serial] <- as.POSIXct(as.numeric(x[is_serial]) * 86400, origin = "1899-12-30", tz = "UTC")
  # Parse human-readable datetimes
  out[!is_serial] <- as.POSIXct(x[!is_serial], format = "%b %d %Y %H:%M:%S", tz = "UTC")
  out
}
# Standardize the column and store it as text in one format, staying in UTC throughout
combined_df <- combined_df %>%
  mutate(
    NMEAtimeUTC = format(convert_NMEAtime(NMEAtimeUTC), "%b %d %Y %H:%M:%S", tz = "UTC")
  )
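As a worked check of the Excel date arithmetic, the sketch below converts one made-up serial value by hand and then pulls the month and year out of the formatted text by character position, which is how the next steps extract them. The value 42465.5 and the variable names are illustrative only, and the month abbreviation depends on the session locale.

```r
# Made-up Excel serial value: days since 1899-12-30, time of day as a fraction of a day
serial <- 42465.5
stamp <- as.POSIXct(serial * 86400, origin = "1899-12-30", tz = "UTC")

# Render in the same layout as the human-readable timestamps
label <- format(stamp, "%b %d %Y %H:%M:%S", tz = "UTC")
print(label)

# Fixed character positions then give the month abbreviation and the year
month_abbr <- substr(label, 1, 3)
year_chr <- substr(label, 8, 11)
```

Passing `tz = "UTC"` to `format()` keeps the printed time in UTC rather than the local time zone of the machine running the code.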
extract month
combined_df$Month <- substr(combined_df$NMEAtimeUTC, 1, 3)
unique(combined_df$Month)
## [1] "Apr" "Jul" "Jun" "Sep"
combined_df$Month <- recode_factor(combined_df$Month,
                                   'Apr' = "APR", 'Jul' = "JUL", 'Jun' = "JUL",
                                   'Sep' = "SEP")
extract year
combined_df$Year <- substr(combined_df$NMEAtimeUTC, 8, 11)
unique(combined_df$Year)
## [1] "2016" "2017" "2018" "2019" "2021" "2022" "2023" "2020"
Subset stations
unique(combined_df$Station)
## [1] "P28" "P5" "P1" "P3" "P4" "P26"
## [7] "P22" "P21" "P20" "P7" "P8" "P10"
## [13] "P17" "P15" "P14" "P13" "P401" "P12"
## [19] "P11" "P402" "P29" "P30" "P31" "P33"
## [25] "P35" "P36" "P38" "RC001 08" "RC001 07" "RC001 06"
## [31] "RC001 09" "P27" "P07" "P08" "P01" "P03"
## [37] "P04" "P05" "P04b" "p1" "p27" "p28"
## [43] "p3" "p4" "p5" "p13" "p21" "p22"
## [49] "p26" "p20" "p7" "p8" "p10" "p11"
## [55] "p12" "p14" "p15" "p17" "p401" "p402"
## [61] "p29" "p30" "p31" "p33" "p35" "p36"
## [67] "p38"
# Define the seven named stations
named_stations <- c("P4", "P8", "P12", "P28", "P38", "P402", "P22")
# Subset the dataframe based on the named stations
subset_df <- combined_df[combined_df$Station %in% named_stations, ]
unique(subset_df$Station)
## [1] "P28" "P4" "P22" "P8" "P12" "P402" "P38"
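The station listing above includes lowercase and zero-padded labels (for example p4 and P04) next to the standard ones, and an exact `%in%` match does not pick those up. If casts with those labels should be kept, the labels could be normalized before subsetting. The sketch below shows one way under that assumption; `normalize_station` and the `toy_` names are hypothetical, and whether a label such as P04b belongs to P4 is left open.

```r
# Station labels taken from the listing above, including case and zero-padding variants
toy_station <- c("P4", "p4", "P04", "P04b", "P402", "p402", "RC001 08", "P5")

# Upper-case, trim whitespace, and drop zeros directly after the leading "P"
normalize_station <- function(x) {
  x <- toupper(trimws(x))
  sub("^P0+", "P", x)
}

target_stations <- c("P4", "P8", "P12", "P28", "P38", "P402", "P22")
toy_norm <- normalize_station(toy_station)
toy_keep <- toy_norm %in% target_stations
print(data.frame(original = toy_station, normalized = toy_norm, kept = toy_keep))
```

Comparing `table(combined_df$Station)` before and after normalization would show how many rows each variant contributes.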
import and merge csv files (2014-2015)
# Directory containing the CSV files
downcast_dir <- "/Users/hailaschultz/Dropbox/Schultz_Dissertation/Data_Analysis/Schultz_dissertation-2/data/NANOOS_files/CSV_files"
# List all CSV files in the directory
downcast_csv_files <- list.files(path = downcast_dir, pattern = "\\.csv$", full.names = TRUE)
# Initialize an empty list to store the data frames
dfs <- list()
# Loop through each CSV file and read it into a data frame
for (file in downcast_csv_files) {
  # Read the CSV file into a data frame
  df <- read.csv(file, stringsAsFactors = FALSE)
  # Store the data frame in the list
  dfs[[length(dfs) + 1]] <- df
}
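The read-every-file-into-a-list step can also be written with `lapply()`, and naming the list by file makes it easier to trace a row back to its source. The sketch below writes two made-up CSV files to a temporary folder so that it runs without the cruise data; all `toy_` names and file names are illustrative.

```r
# Write two small made-up CSV files to a temporary folder
toy_dir <- file.path(tempdir(), "toy_csv")
dir.create(toy_dir, showWarnings = FALSE)
write.csv(data.frame(Station = "P4", Depth = 1),
          file.path(toy_dir, "cruise_a.csv"), row.names = FALSE)
write.csv(data.frame(Station = "P8", Depth = 2),
          file.path(toy_dir, "cruise_b.csv"), row.names = FALSE)

# List the files, then read each one; lapply() returns one data frame per file
toy_files <- list.files(toy_dir, pattern = "\\.csv$", full.names = TRUE)
toy_list <- lapply(toy_files, read.csv, stringsAsFactors = FALSE)

# Name each list element after its source file
names(toy_list) <- basename(toy_files)
```

Individual tables can then be accessed by name, for example `toy_list[["cruise_b.csv"]]`.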
See which columns all of the dataframes have in common
# Get column names of the first data frame
common_columns <- names(dfs[[1]])
# Loop through the remaining data frames and find common column names
# seq_along(dfs)[-1] is empty when there is only one data frame, unlike 2:length(dfs)
for (i in seq_along(dfs)[-1]) {
  # Get column names of the current data frame
  current_columns <- names(dfs[[i]])
  # Find common column names with previous data frames
  common_columns <- intersect(common_columns, current_columns)
}
# 'common_columns' now contains the column names that are common across all data frames
print(common_columns)
## [1] "Cruise.ID" "UTC.Time"
## [3] "Latitude.DegMin" "Longitude.DegMin"
## [5] "Latitude.Deg" "Longitude.Deg"
## [7] "Station" "Pressure"
## [9] "Depth" "Temperature"
## [11] "Potential.Temperature" "Salinity"
## [13] "Sigma.t" "Sigma.theta"
## [15] "Oxygen.Concentration.MG" "Oxygen.Concentration.MOL"
## [17] "Oxygen.Saturation" "Chlorophyll.Fluorescence"
## [19] "Beam.Transmission" "Beam.Attenuation"
subset to common columns
# Loop through each data frame in the list
for (i in seq_along(dfs)) {
  # Subset the data frame to only the common columns
  dfs[[i]] <- dfs[[i]][, common_columns, drop = FALSE]
}
combine datasheets vertically
csv_combined_df <- do.call(rbind, dfs)
remove units rows
csv_combined_df<-subset(csv_combined_df,Pressure!="CTD")
csv_combined_df<-subset(csv_combined_df,Pressure!="[db]")
extract month
csv_combined_df$Month <- substr(csv_combined_df$UTC.Time, 1, 3)
unique(csv_combined_df$Month)
## [1] "Apr" "Jul" "Jun" "Oct" "Sep"
# October casts are grouped with the September cruise, as in the upcast recoding
csv_combined_df$Month <- recode_factor(csv_combined_df$Month,
                                       'Apr' = "APR", 'Jul' = "JUL", 'Jun' = "JUL",
                                       'Sep' = "SEP", 'Oct' = "SEP")
extract year
csv_combined_df$Year <- substr(csv_combined_df$UTC.Time, 8, 11)
unique(csv_combined_df$Year)
## [1] "2015" "2014"
Subset stations
unique(csv_combined_df$Station)
## [1] "P28" "P5" "P1" "P3" "P4" "P26" "P22" "P21" "P20" "P7"
## [11] "P8" "P10" "P17" "P15" "P14" "P13" "P401" "P12" "P11" "P402"
## [21] "P29" "P30" "P31" "P33" "P36" "P38" "P27" "P19" "P18" "P9"
## [31] "P16" "P32" "P35" "P37" "P381" "P122" "P128" "P132" "P136" "P6"
## [41] "P105" "P120" "P123"
# Define the seven named stations
named_stations <- c("P4", "P8", "P12", "P28", "P38", "P402", "P22")
# Subset the dataframe based on the named stations
csv_subset_df <- csv_combined_df[csv_combined_df$Station %in% named_stations, ]
unique(csv_subset_df$Station)
## [1] "P28" "P4" "P22" "P8" "P12" "P402" "P38"
colnames(subset_df)
## [1] "Uploadtime"
## [2] "NMEAtimeUTC"
## [3] "CruiseID"
## [4] "Station"
## [5] "Waypoint"
## [6] "Cast"
## [7] "prDM: Pressure Digiquartz"
## [8] "depSM: Depth"
## [9] "Temperature"
## [10] "potemp090C: Potential Temperature"
## [11] "potemp190C: Potential Temperature 2"
## [12] "c0S/m: Conductivity"
## [13] "sal00: Salinity Practical"
## [14] "sal11: Salinity Practical 2"
## [15] "density00: Density"
## [16] "sigma-t00: Density"
## [17] "density11: Density 2"
## [18] "sigma-È11: Density 2"
## [19] "sigma-t11: Density 2"
## [20] "sbeox1V: Oxygen raw SBE 43 2"
## [21] "sbeox0ML/L: Oxygen SBE 43"
## [22] "sbeox0Mg/L: Oxygen SBE 43"
## [23] "sbeox1ML/L: Oxygen SBE 43 2"
## [24] "sbeox1Mg/L: Oxygen SBE 43 2"
## [25] "sbox0Mm/Kg: Oxygen SBE 43"
## [26] "sbox1Mm/Kg: Oxygen SBE 43 2"
## [27] "sbeox0PS: Oxygen SBE 43"
## [28] "sbeox1PS: Oxygen SBE 43 2"
## [29] "oxsolMg/L: Oxygen Saturation Garcia & Gordon"
## [30] "flECO-AFL: Fluorescence WET Labs ECO-AFL/FL"
## [31] "PAR"
## [32] "CStarTr0: Beam Transmission WET Labs C-Star"
## [33] "CStarAt0: Beam Attenuation WET Labs C-Star"
## [34] "turbWETntu0: Turbidity WET Labs ECO"
## [35] "timeS: Time Elapsed"
## [36] "scan: Scan Count"
## [37] "Month"
## [38] "Year"
colnames(csv_subset_df)
## [1] "Cruise.ID" "UTC.Time"
## [3] "Latitude.DegMin" "Longitude.DegMin"
## [5] "Latitude.Deg" "Longitude.Deg"
## [7] "Station" "Pressure"
## [9] "Depth" "Temperature"
## [11] "Potential.Temperature" "Salinity"
## [13] "Sigma.t" "Sigma.theta"
## [15] "Oxygen.Concentration.MG" "Oxygen.Concentration.MOL"
## [17] "Oxygen.Saturation" "Chlorophyll.Fluorescence"
## [19] "Beam.Transmission" "Beam.Attenuation"
## [21] "Month" "Year"
rename columns from first dataset
subset_df <- subset_df %>%
  rename(Cruise.ID = CruiseID,
         Depth = "depSM: Depth",
         Potential.Temperature = "potemp090C: Potential Temperature",
         Salinity = "sal00: Salinity Practical",
         Oxygen.Concentration.MG = "sbeox0Mg/L: Oxygen SBE 43",
         Chlorophyll.Fluorescence = "flECO-AFL: Fluorescence WET Labs ECO-AFL/FL")
# Find common columns
common_columns <- intersect(names(subset_df), names(csv_subset_df))
# Subset each data frame to only the common columns
subset_df_common <- subset_df[, common_columns, drop = FALSE]
csv_subset_df_common <- csv_subset_df[, common_columns, drop = FALSE]
# Combine the data frames vertically
WOAC_downcasts <- rbind(subset_df_common, csv_subset_df_common)
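The rename, intersect, and row-bind sequence can be tried on toy tables first. The sketch below uses a base R name lookup in place of `rename()` and ends by writing the result to disk; the `toy_` objects, the `name_map` lookup, and the output file name are all placeholders, and the write step is an assumption about how the combined table might be saved.

```r
# Toy stand-ins for the two downcast tables (made-up values)
toy_xlsx <- data.frame(CruiseID = "X1", `depSM: Depth` = 1.5, Station = "P4",
                       check.names = FALSE)
toy_csv <- data.frame(Cruise.ID = "C1", Depth = 2.5, Station = "P8", Extra = 0)

# Lookup of old name -> new name for columns that hold the same variable
name_map <- c("CruiseID" = "Cruise.ID", "depSM: Depth" = "Depth")
hits <- names(toy_xlsx) %in% names(name_map)
names(toy_xlsx)[hits] <- unname(name_map[names(toy_xlsx)[hits]])

# Keep the shared columns and stack the rows
shared <- intersect(names(toy_xlsx), names(toy_csv))
toy_downcasts <- rbind(toy_xlsx[, shared, drop = FALSE],
                       toy_csv[, shared, drop = FALSE])

# Write the combined table to disk; the file name is a placeholder
out_file <- file.path(tempdir(), "toy_downcasts.csv")
write.csv(toy_downcasts, out_file, row.names = FALSE)
```

Only columns that share a name after renaming survive the intersect, so any variable left out of the lookup is dropped from the combined table.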