The data for this project come from National Park Service (NPS) species records, compiled and shared through the TidyTuesday project. We compare species richness between Glacier National Park and Grand Canyon National Park to explore how temperature relates to the number of species. This study focuses on differences in the richness of birds and vascular plants in each park. This matters because temperature influences metabolic rates and habitat conditions, which in turn affect where different species can live, so temperature variation can significantly affect species distributions. The response variable is species richness, measured as the number of species recorded at each park. The explanatory variable is the average temperature at each park. # ChatGPT told me to add the word "richness", and I also added a little more context about where the data came from.
“Is there a significant difference in the mean species richness of vascular plants and birds between Glacier National Park and Grand Canyon National Park?” # The AI helped me by making our question more focused on our specific test.
library(dplyr)
library(ggplot2)
#park data
parkspeciesdata <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2024/2024-10-08/most_visited_nps_species_data.csv')
## Warning: One or more parsing issues, call `problems()` on your data frame for details,
## e.g.:
## dat <- vroom(...)
## problems(dat)
## Rows: 61119 Columns: 28
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (21): ParkCode, ParkName, CategoryName, Order, Family, TaxonRecordStatus...
## dbl (3): References, Observations, Vouchers
## lgl (4): Synonyms, ParkAccepted, Sensitive, ExternalLinks
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
counts <- table(parkspeciesdata$CategoryName)
counts
##
## Amphibian Bacteria Bird
## 235 376 4624
## Chromista Crab/Lobster/Shrimp Fish
## 1040 264 633
## Fungi Insect Mammal
## 5997 16235 1107
## Non-vascular Plant Other Non-vertebrates Protozoa
## 1898 1189 296
## Reptile Slug/Snail Spider/Scorpion
## 384 425 4994
## Vascular Plant
## 21422
table2 <- table(parkspeciesdata$ParkName,parkspeciesdata$CategoryName)
table2
##
## Amphibian Bacteria Bird Chromista
## Acadia National Park 15 0 364 0
## Bryce Canyon National Park 4 0 218 0
## Cuyahoga Valley National Park 24 0 246 0
## Glacier National Park 6 0 277 2
## Grand Canyon National Park 15 0 456 0
## Grand Teton National Park 6 0 266 1
## Great Smoky Mountains National Park 58 294 267 654
## Hot Springs National Park 27 10 387 24
## Indiana Dunes National Park 24 0 353 0
## Joshua Tree National Park 5 23 301 5
## Olympic National Park 16 0 310 0
## Rocky Mountain National Park 5 45 278 150
## Yellowstone National Park 9 4 330 204
## Yosemite National Park 14 0 270 0
## Zion National Park 7 0 301 0
##
## Crab/Lobster/Shrimp Fish Fungi Insect
## Acadia National Park 0 38 0 0
## Bryce Canyon National Park 0 1 0 0
## Cuyahoga Valley National Park 8 85 0 227
## Glacier National Park 6 27 276 197
## Grand Canyon National Park 2 29 0 125
## Grand Teton National Park 11 23 28 155
## Great Smoky Mountains National Park 125 110 5243 12398
## Hot Springs National Park 9 90 0 15
## Indiana Dunes National Park 0 76 69 249
## Joshua Tree National Park 0 1 38 342
## Olympic National Park 0 97 0 87
## Rocky Mountain National Park 39 12 306 676
## Yellowstone National Park 64 19 37 1764
## Yosemite National Park 0 10 0 0
## Zion National Park 0 15 0 0
##
## Mammal Non-vascular Plant
## Acadia National Park 55 0
## Bryce Canyon National Park 76 0
## Cuyahoga Valley National Park 47 0
## Glacier National Park 69 404
## Grand Canyon National Park 107 0
## Grand Teton National Park 74 0
## Great Smoky Mountains National Park 101 1039
## Hot Springs National Park 52 18
## Indiana Dunes National Park 60 0
## Joshua Tree National Park 67 6
## Olympic National Park 79 0
## Rocky Mountain National Park 75 416
## Yellowstone National Park 78 15
## Yosemite National Park 87 0
## Zion National Park 80 0
##
## Other Non-vertebrates Protozoa Reptile
## Acadia National Park 0 0 11
## Bryce Canyon National Park 0 0 13
## Cuyahoga Valley National Park 25 0 24
## Glacier National Park 2 0 4
## Grand Canyon National Park 1 0 76
## Grand Teton National Park 8 0 5
## Great Smoky Mountains National Park 993 257 47
## Hot Springs National Park 22 12 52
## Indiana Dunes National Park 0 0 30
## Joshua Tree National Park 10 0 52
## Olympic National Park 0 0 6
## Rocky Mountain National Park 48 9 3
## Yellowstone National Park 80 18 9
## Yosemite National Park 0 0 22
## Zion National Park 0 0 30
##
## Slug/Snail Spider/Scorpion Vascular Plant
## Acadia National Park 0 0 1226
## Bryce Canyon National Park 0 0 975
## Cuyahoga Valley National Park 15 2 1239
## Glacier National Park 20 0 1269
## Grand Canyon National Park 2 142 1753
## Grand Teton National Park 24 1 1645
## Great Smoky Mountains National Park 291 4630 2163
## Hot Springs National Park 2 0 1252
## Indiana Dunes National Park 0 1 1622
## Joshua Tree National Park 0 153 1314
## Olympic National Park 0 0 1352
## Rocky Mountain National Park 10 22 1119
## Yellowstone National Park 61 43 1444
## Yosemite National Park 0 0 1683
## Zion National Park 0 0 1366
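The bird and vascular plant counts used later in the analysis can be read off the table above. As a minimal sketch, the same two-park, two-category subset could also be pulled out directly with `subset()` and `table()`. The helper name `richness_subset` and the small `toy` data frame are hypothetical and exist only to keep the example self-contained; with the real data, `parkspeciesdata` would be passed in instead. Like the table above, this counts rows (species records).

```r
# Hypothetical helper: count records per park and category
# for a chosen set of parks and taxonomic groups.
richness_subset <- function(df, parks, groups) {
  keep <- subset(df, ParkName %in% parks & CategoryName %in% groups)
  # table() counts rows for each ParkName x CategoryName combination
  as.data.frame(
    table(park = keep$ParkName, group = keep$CategoryName),
    responseName = "count",
    stringsAsFactors = FALSE
  )
}

# Toy data (made up) using the same column names as parkspeciesdata
toy <- data.frame(
  ParkName = c("Glacier National Park", "Glacier National Park",
               "Grand Canyon National Park", "Grand Canyon National Park",
               "Grand Canyon National Park", "Zion National Park"),
  CategoryName = c("Bird", "Vascular Plant", "Bird", "Bird",
                   "Mammal", "Bird"),
  stringsAsFactors = FALSE
)

toy_counts <- richness_subset(
  toy,
  parks  = c("Glacier National Park", "Grand Canyon National Park"),
  groups = c("Bird", "Vascular Plant")
)
toy_counts
```

Calling `richness_subset(parkspeciesdata, ...)` with the same `parks` and `groups` arguments should give the Bird and Vascular Plant cells for Glacier and Grand Canyon shown in the table above.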
# Data: Temperature
glacier_temp <- data.frame(
  day = 1:31,
  avg_temp = c(32.7, 33.6, 38, 42.8, 39.7, 30.7, 43.6, 49, 34.1, 28, 33.3, 34,
               29.1, 26.4, 21.6, 29.3, 38.6, 43.1, 47.8, 54, 47.7, 42, 39.7, 41.9,
               46, 36.9, 34.6, 40, 39.2, 36.2, 33.7)
)
grandcanyon_temp <- data.frame(
  day = 1:31,
  avg_temp = c(48.3, 45.9, 38.4, 43.2, 35.6, 31.8, 35, 38.6, 43.4, 43.3, 43.1,
               42, 44.7, 47.6, 46.3, 43.8, 48.6, 50.8, 54.3, 55, 54, 50.9, 52.5,
               54.8, 59.4, 51, 57.5, 52.7, 51.5, 53, 52.1)
)
# Summary Statistics
combined_summary <- bind_rows(
  glacier_temp %>%
    summarise(
      Park = "Glacier",
      n = n(),
      mean_temp = mean(avg_temp, na.rm = TRUE),
      median_temp = median(avg_temp, na.rm = TRUE),
      sd_temp = sd(avg_temp, na.rm = TRUE),
      se_temp = sd_temp / sqrt(n)
    ),
  grandcanyon_temp %>%
    summarise(
      Park = "Grand Canyon",
      n = n(),
      mean_temp = mean(avg_temp, na.rm = TRUE),
      median_temp = median(avg_temp, na.rm = TRUE),
      sd_temp = sd(avg_temp, na.rm = TRUE),
      se_temp = sd_temp / sqrt(n)
    )
)
combined_summary
## Park n mean_temp median_temp sd_temp se_temp
## 1 Glacier 31 37.65484 38.0 7.297481 1.310666
## 2 Grand Canyon 31 47.39032 48.3 6.903446 1.239896
## Perplexity simplified the code by building just the combined summary, instead of making a summary for each park and then combining them.
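As a cross-check on the summary table, the same statistics can be recomputed in base R from the two temperature vectors, with the standard error taken as sd / sqrt(n). This is only a sketch: the names `glacier`, `grand_canyon`, `temps`, `summarise_park`, and `park_summary` are introduced here for illustration and are not part of the original analysis.

```r
# Daily average temperatures (°F), copied from the data frames above
glacier <- c(32.7, 33.6, 38, 42.8, 39.7, 30.7, 43.6, 49, 34.1, 28, 33.3, 34,
             29.1, 26.4, 21.6, 29.3, 38.6, 43.1, 47.8, 54, 47.7, 42, 39.7, 41.9,
             46, 36.9, 34.6, 40, 39.2, 36.2, 33.7)
grand_canyon <- c(48.3, 45.9, 38.4, 43.2, 35.6, 31.8, 35, 38.6, 43.4, 43.3, 43.1,
                  42, 44.7, 47.6, 46.3, 43.8, 48.6, 50.8, 54.3, 55, 54, 50.9, 52.5,
                  54.8, 59.4, 51, 57.5, 52.7, 51.5, 53, 52.1)

# Long format: one row per park-day
temps <- data.frame(
  park = rep(c("Glacier", "Grand Canyon"),
             times = c(length(glacier), length(grand_canyon))),
  temp = c(glacier, grand_canyon),
  stringsAsFactors = FALSE
)

# Same statistics as combined_summary; se = sd / sqrt(n)
summarise_park <- function(x) {
  c(n = length(x), mean = mean(x), median = median(x),
    sd = sd(x), se = sd(x) / sqrt(length(x)))
}

# One row per park, one column per statistic
park_summary <- t(sapply(split(temps$temp, temps$park), summarise_park))
round(park_summary, 3)
```

The rows of `park_summary` should agree with the `combined_summary` output above up to rounding.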
# Combined dataset for plots
temp_data <- data.frame(
  temp = c(glacier_temp$avg_temp, grandcanyon_temp$avg_temp),
  park = rep(c("Glacier", "Grand Canyon"), each = 31)
)
# Boxplot
ggplot(temp_data, aes(x = park, y = temp, fill = park)) +
  geom_boxplot(width = 0.6, alpha = 0.75, outlier.shape = 21, outlier.fill = "white") +
  geom_jitter(width = 0.12, alpha = 0.55, size = 1.8, color = "black") +
  scale_fill_manual(values = c("Glacier" = "#56B4E9", "Grand Canyon" = "#E69F00")) +
  theme_minimal(base_size = 13) +
  labs(
    title = "Distribution of Average Daily Temperatures",
    subtitle = "Daily temperatures across 31 days in Glacier and Grand Canyon",
    x = "Park",
    y = "Temperature (°F)"
  ) +
  theme(
    legend.position = "none",
    plot.title = element_text(face = "bold")
  )
## Perplexity changed the titles and axis labels and added the points to clean up the graph.
# Histogram
ggplot(temp_data, aes(x = temp)) +
  geom_histogram(bins = 10, fill = "steelblue", color = "black") +
  facet_wrap(~park) +
  theme_minimal() +
  labs(
    title = "Temperature Distribution by Park",
    x = "Temperature (°F)",
    y = "Frequency"
  )
# Histogram (AI-suggested version)
ggplot(temp_data, aes(x = temp)) +
  geom_histogram(binwidth = 2, fill = "steelblue", color = "white") +
  facet_wrap(~park, ncol = 1) +
  theme_minimal(base_size = 13) +
  labs(
    title = "Temperature Distribution by Park",
    subtitle = "Daily average temperatures across 31 days",
    x = "Temperature (°F)",
    y = "Number of days"
  )
## We didn't agree with this change: our histogram shows a bell curve, and the AI's version removed that.
# Biodiversity Data
biodiversity_data <- data.frame(
  park = c("Glacier", "Glacier", "Grand Canyon", "Grand Canyon"),
  group = c("Birds", "Plants", "Birds", "Plants"),
  count = c(277, 1269, 456, 1753)
)
# Bar Plot
ggplot(biodiversity_data, aes(x = park, y = count, fill = group)) +
  geom_col(position = position_dodge(width = 0.8), width = 0.7, color = "black") +
  theme_minimal(base_size = 13) +
  scale_fill_manual(values = c("Birds" = "#56B4E9", "Plants" = "#E69F00")) +
  labs(
    title = "Comparison of Birds and Vascular Plants",
    subtitle = "Species counts in Glacier and Grand Canyon",
    x = "Park",
    y = "Number of species",
    fill = "Group"
  )
## AI gave us different colors and changed the titles.
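To put numbers on the gap shown in the bar plot, the counts can be compared directly. The sketch below uses the four counts from `biodiversity_data`; the vector names `glacier_counts`, `grand_canyon_counts`, `diff_counts`, and `pct_more` are introduced here for illustration only.

```r
# Species counts taken from biodiversity_data above
glacier_counts      <- c(Birds = 277, Plants = 1269)
grand_canyon_counts <- c(Birds = 456, Plants = 1753)

# Absolute difference (Grand Canyon minus Glacier)
diff_counts <- grand_canyon_counts - glacier_counts

# Relative difference: how many percent more records Grand Canyon has
pct_more <- 100 * diff_counts / glacier_counts

data.frame(
  group    = names(diff_counts),
  diff     = as.numeric(diff_counts),
  pct_more = round(as.numeric(pct_more), 1)
)
```

Grand Canyon has 179 more bird records and 484 more vascular plant records than Glacier, which is roughly 65% and 38% more, respectively.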
Null hypothesis: Temperature does not affect species richness between Glacier and Grand Canyon.
Alternative hypothesis: Temperature does affect species richness between Glacier and Grand Canyon.
t_test_result <- t.test(glacier_temp$avg_temp,
grandcanyon_temp$avg_temp)
t_test_result
##
## Welch Two Sample t-test
##
## data: glacier_temp$avg_temp and grandcanyon_temp$avg_temp
## t = -5.396, df = 59.816, p-value = 1.228e-06
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -13.344677 -6.126291
## sample estimates:
## mean of x mean of y
## 37.65484 47.39032
shapiro.test(glacier_temp$avg_temp)
##
## Shapiro-Wilk normality test
##
## data: glacier_temp$avg_temp
## W = 0.99292, p-value = 0.9988
shapiro.test(grandcanyon_temp$avg_temp)
##
## Shapiro-Wilk normality test
##
## data: grandcanyon_temp$avg_temp
## W = 0.96593, p-value = 0.4145
Welch's two-sample t-test was used to compare temperatures between the two parks because it does not assume equal variances between groups, making it appropriate for environmental data where variability may differ between locations. The Shapiro-Wilk test was used to assess the assumption of normality for each dataset. Both parks showed non-significant results (Glacier: W = 0.993, p = 0.999; Grand Canyon: W = 0.966, p = 0.415), so there is no evidence against normality and the temperature data can be treated as approximately normally distributed, meeting the assumptions of the t-test. The results showed a statistically significant difference in mean temperature between the two parks (t = -5.396, df = 59.816, p < 0.001). Glacier had the lower mean (37.65°F versus 47.39°F at Grand Canyon), and the 95% confidence interval for the difference runs from -13.34°F to -6.13°F, which does not include 0, confirming a significant difference.
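To show where the reported t statistic, degrees of freedom, and confidence interval come from, the Welch calculation can be reproduced by hand from the summary statistics. This is a sketch rather than a replacement for `t.test()`: the inputs are the rounded means and standard deviations from `combined_summary`, and the variable names are introduced here for illustration.

```r
# Summary statistics from combined_summary above
n1 <- 31; m1 <- 37.65484; s1 <- 7.297481   # Glacier
n2 <- 31; m2 <- 47.39032; s2 <- 6.903446   # Grand Canyon

# Squared standard errors of each mean
v1 <- s1^2 / n1
v2 <- s2^2 / n2

# Standard error of the difference without pooling variances (Welch)
se_diff <- sqrt(v1 + v2)
t_stat  <- (m1 - m2) / se_diff

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))

# Two-sided p-value and 95% confidence interval for the mean difference
p_value <- 2 * pt(-abs(t_stat), df = df_welch)
ci <- (m1 - m2) + c(-1, 1) * qt(0.975, df = df_welch) * se_diff

round(c(t = t_stat, df = df_welch, lower = ci[1], upper = ci[2]), 3)
p_value
```

Up to rounding of the inputs, these values should match the `t.test()` output above (t = -5.396, df = 59.816, 95% CI from -13.34 to -6.13).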
The p-value from the Welch two-sample t-test is 1.228e-06, which means that we reject the null hypothesis. Strictly, this test compares daily temperatures, so what it establishes is a statistically significant difference in mean temperature between Grand Canyon and Glacier National Park; each park contributes only a single species count per group, so the richness difference itself is described rather than tested. The counts show that Grand Canyon has a higher number of both bird and vascular plant species than Glacier National Park. One possible ecological explanation is the difference in average temperature between the two parks, with Grand Canyon being warmer. Warmer temperatures may reduce the energy birds need for thermoregulation, potentially allowing greater species persistence. Additionally, plants can have higher metabolic and photosynthetic rates at warmer temperatures, which may increase plant diversity. # The AI told me to add a sentence at the end and to explain better why temperature has an effect.
The null hypothesis was rejected: mean temperature differs significantly between Glacier National Park and Grand Canyon National Park. Grand Canyon, the warmer park, also had a higher number of both bird and vascular plant species, which suggests an association between temperature differences and species diversity, with warmer conditions potentially supporting greater biodiversity. # Added a sentence about the effect temperature has.
Burns, C. E., et al. “Global Climate Change and Mammalian Species Diversity in U.S. National Parks.” Proceedings of the National Academy of Sciences, vol. 100, no. 20, 19 Sept. 2003, pp. 11474–11477, www.pnas.org/content/100/20/11474/, https://doi.org/10.1073/pnas.1635115100.
Fly Aviary. (2024). Swift fliers top predators [Image]. https://www.flyaviary.com/wp-content/uploads/2024/05/swift_fliers_top_predators.jpg
Grand Canyon, AZ weather conditions. Weather Underground. (n.d.). https://www.wunderground.com/weather/us/az/grand-canyon
East Glacier Park, MT weather conditions. Weather Underground. (n.d.-a). https://www.wunderground.com/weather/us/mt/east-glacier-park
We used Claude and ChatGPT to help debug our code. They were also used to help us edit our sections and align them more closely with the rubric.