Goal: To determine whether the upper units of sediment cores obtained in the Larsen A Embayment of Antarctica contain distinct diatom abundances and assemblages. The diatom suite of a sedimentary unit reveals information about the depositional setting for that unit. Many other data sets have been used to describe the depositional settings of Units B and C of these sediment cores, so this exercise is only meant to corroborate or question, not confirm or reveal.
diatom <- read.csv("/Users/telekineticturtle/Desktop/Colorado 13/Quant Methods/Data/diatoms_NBP0003_LarsenA.csv",
header = T)
names(diatom)
## [1] "Cruise" "Core" "Depth" "Unit" "Total" "perChaet"
## [7] "perFragil"
# Obtaining Subsets
diatomB <- subset(diatom, Unit == "B")
diatomA <- subset(diatom, Unit == "A")
diatomC <- subset(diatom, Unit == "C")
diatomT <- subset(diatom, Unit == "T")
The top two units are do not have enough data to perform statistical tests. Nor are they normally distributed.
# diatomT hist(diatomT$Total) shapiro.test(diatomT$Total) diatomA
# hist(diatomA$Total) shapiro.test(diatomA$Total)
Statistical comparison of these two units is possible.
hist(diatomB$Total)
shapiro.test(diatomB$Total)
##
## Shapiro-Wilk normality test
##
## data: diatomB$Total
## W = 0.9822, p-value = 0.8267
qqnorm(diatomB$Total)
qqline(diatomB$Total)
hist(diatomC$Total)
shapiro.test(diatomC$Total)
##
## Shapiro-Wilk normality test
##
## data: diatomC$Total
## W = 0.9058, p-value = 0.08497
qqnorm(diatomC$Total)
qqline(diatomC$Total)
Null Hypothesis: The means for Units B and C are the same.
t.test(diatomB$Total, diatomC$Total)
##
## Welch Two Sample t-test
##
## data: diatomB$Total and diatomC$Total
## t = 1.366, df = 38.53, p-value = 0.1799
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.4545 2.3423
## sample estimates:
## mean of x mean of y
## 6.768 5.824
The null hypothesis cannot be rejected. Unit C has some high values!
Null Hypothesis: The mean total abundance for site 20 and site 23 is the same.
# Subsetting
diatom20 <- subset(diatom, Core == "KC20")
diatom23 <- subset(diatom, Core != "KC20")
# Normality test
shapiro.test(diatom20$Total) # Normal check.
##
## Shapiro-Wilk normality test
##
## data: diatom20$Total
## W = 0.9614, p-value = 0.3367
shapiro.test(diatom23$Total) # Normal check.
##
## Shapiro-Wilk normality test
##
## data: diatom23$Total
## W = 0.9578, p-value = 0.2232
# T-test
t.test(diatom20$Total, diatom23$Total)
##
## Welch Two Sample t-test
##
## data: diatom20$Total and diatom23$Total
## t = -2.039, df = 60.91, p-value = 0.04576
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -2.80200 -0.02751
## sample estimates:
## mean of x mean of y
## 5.418 6.833
This test shows that site 23 has systematically more diatoms than site 20 (with a 95% confidence interval). This complicates any comparison of total numbers. The simplest fix is the scale up site 20 to match 23 using the estiamted means.
# Calculate means:
mTot20 <- mean(diatom20$Total)
mTot23 <- mean(diatom23$Total)
# Find conversion factor:
Convert20to23 <- mTot23/mTot20
# Make a new column with converted site 20 values: I was trying an if
# statement... if(diatom$Core == 'KC20') diatom$Totalc <-
# diatom$Total*Convert20to23
diatom$Totalc <- 1
diatom[29:59, 8] <- diatom[29:59, 5]
diatom[62:63, 8] <- diatom[62:63, 5]
diatom[1:28, 8] <- diatom[1:28, 5] * Convert20to23
diatom[60:61, 8] <- diatom[60:61, 5] * Convert20to23
# Check the work:
diatom20 <- subset(diatom, Core == "KC20")
diatom23 <- subset(diatom, Core != "KC20")
# Null Hypothesis: Total abundance for (corrected) site 20 and site 23 are
# the same.
t.test(diatom20$Totalc, diatom23$Totalc)
##
## Welch Two Sample t-test
##
## data: diatom20$Totalc and diatom23$Totalc
## t = 0, df = 58.8, p-value = 1
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -1.564 1.564
## sample estimates:
## mean of x mean of y
## 6.833 6.833
We have now controlled for sampling site, so we can more accurately compare the diatom abundance of Units B and C:
Null Hypothesis: There is no difference in mean abundance between Units B and C.
# Obtain new subsets:
diatomB <- subset(diatom, Unit == "B")
diatomA <- subset(diatom, Unit == "A")
diatomC <- subset(diatom, Unit == "C")
diatomT <- subset(diatom, Unit == "T")
# Since the data were scaled, more normality tests are not necessary.
# Move straight to the t-test:
t.test(diatomB$Totalc, diatomC$Totalc)
##
## Welch Two Sample t-test
##
## data: diatomB$Totalc and diatomC$Totalc
## t = 2.855, df = 39.97, p-value = 0.006785
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.5873 3.4332
## sample estimates:
## mean of x mean of y
## 7.834 5.824
Controlling for core site, we find that Unit B has 2.01 +/- 1.42 million valves more than Unit C. This is consistent with the interpretation that Unit B was deposited during the mid-Holocene cliamtic optimum, a warmer, more productive period.
In addition to total diatom abundance, we can also differentiate units based on assemblage. The scaling conducted between core sites to account for abundance differences is not applicable to assemblage data, which is reported a relative percentages.
# Data are roughly normally distributed.
shapiro.test(diatomB$perFragil)
##
## Shapiro-Wilk normality test
##
## data: diatomB$perFragil
## W = 0.9807, p-value = 0.7816
# hist(diatomB$perFragil,main='Unit B Percent
# Fragilariopsis',xlab='Percent Fragilariopsis')
shapiro.test(diatomC$perFragil)
##
## Shapiro-Wilk normality test
##
## data: diatomC$perFragil
## W = 0.9526, p-value = 0.4993
# hist(diatomC$perFragil,main='Unit C Percent
# Fragilariopsis',xlab='Percent Fragilariopsis')
Null Hypothesis: Fragilariopsis make up the same percentage of the diatom assemblages in Unit B and Unit C. Any difference in mean % Fragilariopsis is due to random chance.
t.test(diatomB$perFragil, diatomC$perFragil)
##
## Welch Two Sample t-test
##
## data: diatomB$perFragil and diatomC$perFragil
## t = 2.527, df = 39.74, p-value = 0.01559
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.9852 8.8652
## sample estimates:
## mean of x mean of y
## 24.35 19.42
This test shows that we can be confident that the % Fragiliaropsis is different between the two units; if they were not different, we would receive these results less than 5% of the time. This validates the claim that the two units were formed by different depositional environments.
Chaetoceros is the most abundant diatom genus in these cores. It should also be considered.
# Data are roughly normally distributed.
shapiro.test(diatomB$perChaet)
##
## Shapiro-Wilk normality test
##
## data: diatomB$perChaet
## W = 0.9765, p-value = 0.6448
# hist(diatomB$perFragil,main='Unit B Percent Chaetoceros',xlab='Percent
# Chaetoceros')
shapiro.test(diatomC$perChaet)
##
## Shapiro-Wilk normality test
##
## data: diatomC$perChaet
## W = 0.9476, p-value = 0.4191
# hist(diatomC$perFragil,main='Unit C Percent Chaetoceros',xlab='Percent
# Chaetoceros')
Null Hypothesis: Chaetoceros make up the same percentage of the diatom assemblages in Unit B and Unit C. Any difference in mean % Chaetoceros is due to random chance.
t.test(diatomB$perChaet, diatomC$perChaet)
##
## Welch Two Sample t-test
##
## data: diatomB$perChaet and diatomC$perChaet
## t = -3.155, df = 44.77, p-value = 0.00287
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -13.984 -3.085
## sample estimates:
## mean of x mean of y
## 61.02 69.56
This test shows that we can be confident that the % Chaetoceros is different between the two units; if they were not different, we would receive these results less than 5% of the time. This validates the claim that the two units were formed by different depositional environments. Furthermore, the higher relative abundance of the smaller Chaetoceros compared to Fargilariopsis in Unit C is consistent with the theory that Unit C represents a sub-ice shelf setting, whereas Unit B represents an open marine setting. Chaetoceros are more easily transported laterally underneath an ice shelf than Fragilariopsis and so are more likely to dominate a sub-ice shelf setting.