Diatom Analysis for NBP00-03 (Larsen A Embayment, Antarctica)

Goal: To determine whether the upper units of sediment cores obtained in the Larsen A Embayment of Antarctica contain distinct diatom abundances and assemblages. The diatom suite of a sedimentary unit reveals information about the depositional setting for that unit. Many other data sets have been used to describe the depositional settings of Units B and C of these sediment cores, so this exercise is only meant to corroborate or question, not confirm or reveal.

Loading Data

diatom <- read.csv("/Users/telekineticturtle/Desktop/Colorado 13/Quant Methods/Data/diatoms_NBP0003_LarsenA.csv", 
    header = T)
names(diatom)

## [1] "Cruise"    "Core"      "Depth"     "Unit"      "Total"     "perChaet" 
## [7] "perFragil"


# Obtaining Subsets
diatomB <- subset(diatom, Unit == "B")
diatomA <- subset(diatom, Unit == "A")
diatomC <- subset(diatom, Unit == "C")
diatomT <- subset(diatom, Unit == "T")

Comparing count of diatom valves

The Top two units

The top two units are do not have enough data to perform statistical tests. Nor are they normally distributed.

# diatomT hist(diatomT$Total) shapiro.test(diatomT$Total) diatomA
# hist(diatomA$Total) shapiro.test(diatomA$Total)

Units B & C

Statistical comparison of these two units is possible.

hist(diatomB$Total)

plot of chunk unnamed-chunk-3

shapiro.test(diatomB$Total)

## 
##  Shapiro-Wilk normality test
## 
## data:  diatomB$Total 
## W = 0.9822, p-value = 0.8267

qqnorm(diatomB$Total)
qqline(diatomB$Total)

plot of chunk unnamed-chunk-5

hist(diatomC$Total)

plot of chunk unnamed-chunk-6

shapiro.test(diatomC$Total)

## 
##  Shapiro-Wilk normality test
## 
## data:  diatomC$Total 
## W = 0.9058, p-value = 0.08497

qqnorm(diatomC$Total)
qqline(diatomC$Total)

plot of chunk unnamed-chunk-8

Two sample t test comparing Units B & C

Null Hypothesis: The means for Units B and C are the same.

t.test(diatomB$Total, diatomC$Total)

## 
##  Welch Two Sample t-test
## 
## data:  diatomB$Total and diatomC$Total 
## t = 1.366, df = 38.53, p-value = 0.1799
## alternative hypothesis: true difference in means is not equal to 0 
## 95 percent confidence interval:
##  -0.4545  2.3423 
## sample estimates:
## mean of x mean of y 
##     6.768     5.824

The null hypothesis cannot be rejected. Unit C has some high values!

Two sample t test comparing KC23/JGC23 and KC20.

Null Hypothesis: The mean total abundance for site 20 and site 23 is the same.

# Subsetting
diatom20 <- subset(diatom, Core == "KC20")
diatom23 <- subset(diatom, Core != "KC20")

# Normality test
shapiro.test(diatom20$Total)  # Normal check.

## 
##  Shapiro-Wilk normality test
## 
## data:  diatom20$Total 
## W = 0.9614, p-value = 0.3367

shapiro.test(diatom23$Total)  # Normal check.

## 
##  Shapiro-Wilk normality test
## 
## data:  diatom23$Total 
## W = 0.9578, p-value = 0.2232


# T-test
t.test(diatom20$Total, diatom23$Total)

## 
##  Welch Two Sample t-test
## 
## data:  diatom20$Total and diatom23$Total 
## t = -2.039, df = 60.91, p-value = 0.04576
## alternative hypothesis: true difference in means is not equal to 0 
## 95 percent confidence interval:
##  -2.80200 -0.02751 
## sample estimates:
## mean of x mean of y 
##     5.418     6.833

This test shows that site 23 has systematically more diatoms than site 20 (with a 95% confidence interval). This complicates any comparison of total numbers. The simplest fix is the scale up site 20 to match 23 using the estiamted means.

# Calculate means:
mTot20 <- mean(diatom20$Total)
mTot23 <- mean(diatom23$Total)

# Find conversion factor:
Convert20to23 <- mTot23/mTot20

# Make a new column with converted site 20 values: I was trying an if
# statement... if(diatom$Core == 'KC20') diatom$Totalc <-
# diatom$Total*Convert20to23
diatom$Totalc <- 1
diatom[29:59, 8] <- diatom[29:59, 5]
diatom[62:63, 8] <- diatom[62:63, 5]
diatom[1:28, 8] <- diatom[1:28, 5] * Convert20to23
diatom[60:61, 8] <- diatom[60:61, 5] * Convert20to23

# Check the work:
diatom20 <- subset(diatom, Core == "KC20")
diatom23 <- subset(diatom, Core != "KC20")

# Null Hypothesis: Total abundance for (corrected) site 20 and site 23 are
# the same.
t.test(diatom20$Totalc, diatom23$Totalc)

## 
##  Welch Two Sample t-test
## 
## data:  diatom20$Totalc and diatom23$Totalc 
## t = 0, df = 58.8, p-value = 1
## alternative hypothesis: true difference in means is not equal to 0 
## 95 percent confidence interval:
##  -1.564  1.564 
## sample estimates:
## mean of x mean of y 
##     6.833     6.833

We have now controlled for sampling site, so we can more accurately compare the diatom abundance of Units B and C:

Two sample t test with control for core site

Null Hypothesis: There is no difference in mean abundance between Units B and C.

# Obtain new subsets:
diatomB <- subset(diatom, Unit == "B")
diatomA <- subset(diatom, Unit == "A")
diatomC <- subset(diatom, Unit == "C")
diatomT <- subset(diatom, Unit == "T")

# Since the data were scaled, more normality tests are not necessary.
# Move straight to the t-test:
t.test(diatomB$Totalc, diatomC$Totalc)

## 
##  Welch Two Sample t-test
## 
## data:  diatomB$Totalc and diatomC$Totalc 
## t = 2.855, df = 39.97, p-value = 0.006785
## alternative hypothesis: true difference in means is not equal to 0 
## 95 percent confidence interval:
##  0.5873 3.4332 
## sample estimates:
## mean of x mean of y 
##     7.834     5.824

Controlling for core site, we find that Unit B has 2.01 +/- 1.42 million valves more than Unit C. This is consistent with the interpretation that Unit B was deposited during the mid-Holocene cliamtic optimum, a warmer, more productive period.

% Fragilariopsis

In addition to total diatom abundance, we can also differentiate units based on assemblage. The scaling conducted between core sites to account for abundance differences is not applicable to assemblage data, which is reported a relative percentages.

Test for Normality:

# Data are roughly normally distributed.
shapiro.test(diatomB$perFragil)

## 
##  Shapiro-Wilk normality test
## 
## data:  diatomB$perFragil 
## W = 0.9807, p-value = 0.7816

# hist(diatomB$perFragil,main='Unit B Percent
# Fragilariopsis',xlab='Percent Fragilariopsis')
shapiro.test(diatomC$perFragil)

## 
##  Shapiro-Wilk normality test
## 
## data:  diatomC$perFragil 
## W = 0.9526, p-value = 0.4993

# hist(diatomC$perFragil,main='Unit C Percent
# Fragilariopsis',xlab='Percent Fragilariopsis')

Two-sample t-test:

Null Hypothesis: Fragilariopsis make up the same percentage of the diatom assemblages in Unit B and Unit C. Any difference in mean % Fragilariopsis is due to random chance.

t.test(diatomB$perFragil, diatomC$perFragil)

## 
##  Welch Two Sample t-test
## 
## data:  diatomB$perFragil and diatomC$perFragil 
## t = 2.527, df = 39.74, p-value = 0.01559
## alternative hypothesis: true difference in means is not equal to 0 
## 95 percent confidence interval:
##  0.9852 8.8652 
## sample estimates:
## mean of x mean of y 
##     24.35     19.42

This test shows that we can be confident that the % Fragiliaropsis is different between the two units; if they were not different, we would receive these results less than 5% of the time. This validates the claim that the two units were formed by different depositional environments.

% Chaetoceros

Chaetoceros is the most abundant diatom genus in these cores. It should also be considered.

Test for Normality:

# Data are roughly normally distributed.
shapiro.test(diatomB$perChaet)

## 
##  Shapiro-Wilk normality test
## 
## data:  diatomB$perChaet 
## W = 0.9765, p-value = 0.6448

# hist(diatomB$perFragil,main='Unit B Percent Chaetoceros',xlab='Percent
# Chaetoceros')
shapiro.test(diatomC$perChaet)

## 
##  Shapiro-Wilk normality test
## 
## data:  diatomC$perChaet 
## W = 0.9476, p-value = 0.4191

# hist(diatomC$perFragil,main='Unit C Percent Chaetoceros',xlab='Percent
# Chaetoceros')

Two-sample t-test:

Null Hypothesis: Chaetoceros make up the same percentage of the diatom assemblages in Unit B and Unit C. Any difference in mean % Chaetoceros is due to random chance.

t.test(diatomB$perChaet, diatomC$perChaet)

## 
##  Welch Two Sample t-test
## 
## data:  diatomB$perChaet and diatomC$perChaet 
## t = -3.155, df = 44.77, p-value = 0.00287
## alternative hypothesis: true difference in means is not equal to 0 
## 95 percent confidence interval:
##  -13.984  -3.085 
## sample estimates:
## mean of x mean of y 
##     61.02     69.56

This test shows that we can be confident that the % Chaetoceros is different between the two units; if they were not different, we would receive these results less than 5% of the time. This validates the claim that the two units were formed by different depositional environments. Furthermore, the higher relative abundance of the smaller Chaetoceros compared to Fargilariopsis in Unit C is consistent with the theory that Unit C represents a sub-ice shelf setting, whereas Unit B represents an open marine setting. Chaetoceros are more easily transported laterally underneath an ice shelf than Fragilariopsis and so are more likely to dominate a sub-ice shelf setting.