Set the working directory to the folder that contains the CSV file.

setwd("D:/Microplastic Data/") # change this path to the folder containing the CSV file

Load libraries

# Remove the "#" in front of an install.packages() line to install that package on first use
#install.packages("effsize")
library(effsize)
#install.packages("rstatix")
library(rstatix)
#install.packages("dplyr")
library(dplyr)
#install.packages("ggplot2")
library(ggplot2)

Read in data

fcm <- read.csv("Microplastic Samples.csv") # change the file name to match the CSV for the current study
# This reads the averaged data: counting each replicate measurement as its own row would artificially double the sample size
fcm$type <- factor(fcm$type,
                   levels = c("flowing", "stagnant"),
                   labels = c("Flowing", "Stagnant"))
# Capitalize the group labels so they look neater in the plot
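
The averaging step mentioned above is not shown in this script. A minimal sketch of how it could be done with dplyr, assuming a hypothetical raw table with one row per replicate (the `raw` data frame, its `site` and `particles_ml` columns, and all values are made up for illustration):

```r
library(dplyr)

# Hypothetical raw data: two replicate readings per site (made-up values)
raw <- data.frame(
  site         = rep(c("A", "B", "C", "D"), each = 2),
  type         = rep(c("flowing", "flowing", "stagnant", "stagnant"), each = 2),
  particles_ml = c(10, 14, 8, 12, 30, 34, 25, 27)
)

# Collapse the replicates to one row per site, so n equals the number of sites
averaged <- raw %>%
  group_by(site, type) %>%
  summarise(avg_particles_ml = mean(particles_ml), .groups = "drop")

averaged
```

Each site now contributes a single value, which is the form the tests below expect.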

ASSUMPTIONS

Use the Shapiro-Wilk test to check normality and decide whether a t-test or a Wilcoxon test is needed. This test asks: “Does this data follow a normal (bell-shaped) distribution?”

shapiro.test(fcm$avg_particles_ml[fcm$type == "Flowing"])
shapiro.test(fcm$avg_particles_ml[fcm$type == "Stagnant"])
# p > 0.05 -> no evidence against normality (treat as close enough to normal); p < 0.05 -> not normal
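
The two normality checks can also be run in one step and reduced to a single yes/no answer. This is only a sketch on simulated data; the `demo` data frame and the `both_look_normal` flag are illustrative names, not part of the study:

```r
set.seed(1)  # simulated data, for illustration only
demo <- data.frame(
  type = factor(rep(c("Flowing", "Stagnant"), each = 8)),
  avg_particles_ml = c(rnorm(8, mean = 20, sd = 4),
                       rnorm(8, mean = 30, sd = 4))
)

# Run Shapiro-Wilk within each group and keep only the p-values
shapiro_p <- tapply(demo$avg_particles_ml, demo$type,
                    function(x) shapiro.test(x)$p.value)
shapiro_p

# TRUE only when neither group shows evidence against normality at the 0.05 level
both_look_normal <- all(shapiro_p > 0.05)
both_look_normal
```

If `both_look_normal` is FALSE, the Wilcoxon test is the safer choice.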

Use the F test (var.test) to check for equal variance. This test asks: “Do both groups have similar spread?”

var.test(avg_particles_ml ~ type, data = fcm)

# p > 0.05 -> variances are not significantly different, so assume equal variance
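
To see what the F test is doing, the statistic and p-value can be reproduced by hand. A minimal sketch on simulated data (the `flowing` and `stagnant` vectors are made up):

```r
set.seed(2)  # simulated data, for illustration only
flowing  <- rnorm(8, mean = 20, sd = 4)
stagnant <- rnorm(8, mean = 30, sd = 6)

# F statistic: ratio of the two sample variances
f_stat <- var(flowing) / var(stagnant)

# Two-sided p-value from the F distribution with (n1 - 1, n2 - 1) degrees of freedom
p_lower <- pf(f_stat, df1 = length(flowing) - 1, df2 = length(stagnant) - 1)
p_value <- 2 * min(p_lower, 1 - p_lower)

# The built-in test gives the same statistic and p-value
built_in <- var.test(flowing, stagnant)
c(by_hand = f_stat, built_in = unname(built_in$statistic))
c(by_hand = p_value, built_in = built_in$p.value)
```

An F ratio near 1 means the two groups have similar spread. Note that the F test itself assumes normal data, so interpret it cautiously if the Shapiro-Wilk check failed.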

T-TESTING

A t-test determines whether the group means differ. This test asks: “Is there a significant difference between means?” It works best with larger n, where it is less sensitive to departures from normality.

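t.test(avg_particles_ml ~ type, data = fcm, var.equal = TRUE)
# var.equal = TRUE assumes the variance check found no significant difference; use var.equal = FALSE (Welch's t-test) otherwise
# the group means in the output show which group is higher; p < 0.05 -> significant difference, p > 0.05 -> not significant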
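
The choice of `var.equal` can be tied directly to the variance check rather than typed in by hand. A sketch on simulated data, where `demo` and `equal_var` are illustrative names:

```r
set.seed(3)  # simulated data, for illustration only
demo <- data.frame(
  type = factor(rep(c("Flowing", "Stagnant"), each = 8)),
  avg_particles_ml = c(rnorm(8, mean = 20, sd = 4),
                       rnorm(8, mean = 30, sd = 4))
)

# TRUE when the F test finds no significant difference in variance
equal_var <- var.test(avg_particles_ml ~ type, data = demo)$p.value > 0.05

# Student's t-test if variances look equal, Welch's t-test otherwise
result <- t.test(avg_particles_ml ~ type, data = demo, var.equal = equal_var)

result$method    # which variant was run
result$estimate  # group means
result$p.value
```

Some analysts skip the variance pre-test and use Welch's t-test (`var.equal = FALSE`, the default in `t.test`) in all cases.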

WILCOXON TESTING

The Wilcoxon rank-sum test compares ranked values instead of raw values, which makes it suitable for small sample sizes, skewed data, and data with outliers. This test asks: “Are the values in one group generally higher?”

wilcox.test(avg_particles_ml ~ type, data = fcm)
#p < 0.05 -> significant difference, p > 0.05 -> not significant
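
To show what "comparing ranked values" means, the W statistic reported by wilcox.test can be rebuilt from the ranks. A minimal sketch on simulated data (the vectors are made up):

```r
set.seed(4)  # simulated data, for illustration only
flowing  <- rnorm(8, mean = 20, sd = 4)
stagnant <- rnorm(8, mean = 30, sd = 4)

# Rank all values together, ignoring which group they came from
ranks <- rank(c(flowing, stagnant))

# W = sum of the first group's ranks minus the smallest value that sum could take
n1 <- length(flowing)
w_stat <- sum(ranks[seq_len(n1)]) - n1 * (n1 + 1) / 2

# The built-in test reports the same W
built_in <- wilcox.test(flowing, stagnant)
c(by_hand = w_stat, built_in = unname(built_in$statistic))
```

A W close to 0 means the first group holds almost all of the lowest ranks. A W close to n1 * n2 means it holds almost all of the highest.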

EFFECT SIZE

P-values only tell us whether there is a difference; we also need to know how big the difference is. Cohen's d asks: “How far apart are the two groups in standard deviation units?”

cohen.d(avg_particles_ml ~ type, data = fcm, exact = FALSE)
# interpret the absolute value of d: 0.2 = small effect, 0.5 = medium effect, 0.8 = large, 1.2 = very large, 2.0 = huge
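
"Standard deviation units" can be made concrete by computing Cohen's d by hand with a pooled standard deviation. This is a sketch in base R with made-up values. It assumes the pooled-SD form of d, which should correspond to the default settings of cohen.d:

```r
# Made-up values, for illustration only
flowing  <- c(2, 4, 6)
stagnant <- c(1, 2, 3)

n1 <- length(flowing)
n2 <- length(stagnant)

# Pooled standard deviation: weighted average of the two group variances
pooled_sd <- sqrt(((n1 - 1) * var(flowing) + (n2 - 1) * var(stagnant)) /
                    (n1 + n2 - 2))

# Cohen's d: difference in means expressed in pooled-SD units
d <- (mean(flowing) - mean(stagnant)) / pooled_sd
d
```

Here the means differ by 2 and the pooled standard deviation is sqrt(2.5), so d is about 1.26. The sign of d depends only on which group is listed first.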

VISUALIZATION

Jittered points are drawn over the boxplot so that each site is visible individually.

ggplot(fcm, aes(x = type, y = avg_particles_ml)) +
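  geom_boxplot(outlier.shape = NA) + # hide boxplot outliers so they are not drawn twice with the jittered points
  geom_jitter(width = 0.1, height = 0, size = 2) + # height = 0 keeps each point at its true y value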
  labs(
    x = "Condition",
    y = "Particles per mL"
  ) +
  theme_minimal()