Long-term studies of ecosystem functioning are essential for understanding the inner workings of nature and guiding the effective management of ecosystems. These studies have deepened our comprehension of species evolution and supported the ongoing development of ecological theories (Ernest et al., 2020). The Portal Project, initiated in 1977, spans over 40 years of ecological research, incorporating both long- and short-term studies focused on different taxa. Data collected by various researchers over multiple decades are likely to carry some experimental bias, whether intentional or not. I expect fewer rodent weight measurements to be taken during July, August, and September, as these months experience the highest precipitation (Western Regional Climate Center, 2012), which may present challenges in fieldwork. Additionally, I anticipate the most frequent measurements in May and June, when temperatures are milder and precipitation is less common. I will use RStudio to create a graph and conduct statistical analysis to understand the relationship between the month and the frequency of rodent weight measurements.
The Portal Project is located on a 20-hectare study site northeast of Portal, AZ. The project began as a study of competition among granivores in a resource-limited system. When the site was established in 1977, the habitat was identified as desert grassland; it has since shifted to mixed shrubland. Rodent and plant communities have been sampled continuously since 1977, while the ant community was monitored only until 2009 (Ernest et al., 2020). The site is managed by the Bureau of Land Management (BLM) and consists of 24 plots, each measuring 50 m by 50 m, enclosed by a barbed wire fence to keep out cattle.
Treatments are categorized into rodent access, ant access, and resource manipulations, with each treatment also having a control state. Half of the plots have been maintained with consistent treatments focusing on rodents and ants throughout the project’s duration. Additionally, a subset of plots has undergone different treatments over the years. Rodent access to the plots is controlled via 16 gates in the fence surrounding each plot. Since 1977, four rodent treatments have been implemented, including varying gate sizes (or their absence), with only specific species allowed access, while control plots have gates that allow access for all species. Monthly trapping of the necessary rodents supports the treatment groups.
Rodents are sampled monthly on weekends, as close to the new moon as possible (Ernest et al., 2020). Gates related to rodent treatments are closed during trap setting in the evening and reopened the following day. Traps are set before sunset and checked at sunrise. If precipitation or temperature is unfavorable, traps are not set, to prevent cold-induced mortality. Captured target species are recorded with details such as species, sex, reproductive condition, weight, and hind foot length, and a stake is placed where each animal was caught. Since 1991, individuals have been tagged with Passive Integrated Transponder (PIT) tags; prior to that, identification was based on one or two ear tags or toe clipping.
RStudio was used to generate graphs and analyze the data to determine whether the month affected the number of weight measurements taken. The packages used were “ratdat,” which supplies a comprehensive Portal dataset covering 1977 to 2002; “tidyverse,” for organizing and managing the data; and “ggplot2,” for creating the graphs. The data were filtered to include only rodent taxa and to exclude any NAs in the weight column. The data were then grouped by month, and the frequency of weights taken was calculated. After the graph was created, a chi-squared goodness-of-fit test was performed to determine whether the frequency of weight measurements was evenly distributed across the months. The null hypothesis states that the frequency is evenly distributed across all months, whereas the alternative hypothesis states that it is not. The assumptions of the chi-squared test were considered met. An ANOVA and a Kruskal-Wallis test were also run on the monthly counts as follow-up checks. An ANOVA assumes normally distributed populations, equal variances, and independent data; its null hypothesis states that there is no difference in the mean number of weight measurements among the months, while the alternative hypothesis states that at least one month’s mean is different. Because the summarized data contain a single count per month, neither follow-up test has within-month replication to work with, so these assumptions cannot be checked and the two tests are reported for completeness rather than as confirmation of the chi-squared result. All tests were conducted at a significance level of 0.05.
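As a rough illustration of the goodness-of-fit calculation described above, the sketch below computes the chi-squared statistic by hand and compares it with `chisq.test()`. The vector `example_counts` is made up for illustration and is not the Portal data; only base R is assumed.

```r
# Hypothetical monthly counts (NOT the Portal data), January to December
example_counts <- c(120, 95, 110, 130, 125, 140, 160, 90, 105, 115, 100, 110)

# Under the null hypothesis every month is equally likely,
# so each expected count is the total divided by 12
expected <- rep(sum(example_counts) / 12, 12)

# Chi-squared statistic: sum of (observed - expected)^2 / expected
chi_sq_manual <- sum((example_counts - expected)^2 / expected)

# Degrees of freedom: number of categories minus one
df_manual <- length(example_counts) - 1

# Upper-tail p-value from the chi-squared distribution
p_manual <- pchisq(chi_sq_manual, df = df_manual, lower.tail = FALSE)

# The built-in test should agree with the manual calculation
chi_builtin <- chisq.test(example_counts, p = rep(1 / 12, 12))

chi_sq_manual
p_manual
chi_builtin
```

The same call with the real monthly totals in place of `example_counts` is the test used in this report.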
Precipitation was measured over 41 years, from 1914 to 1955, at the Portal, AZ weather station (026706). During these years, the months with the highest monthly precipitation were August, which recorded over 3.5 inches, and July, which had over 3 inches (Figure 1). Temperature was also recorded. The highest average temperature occurred in July at 75.1°F, while the lowest average was recorded in January at 41.2°F, resulting in a yearly average of 58.2°F (Table 1). The month with the highest number of weight measurements was July, with a total of 3,265 weights, whereas August had the lowest number at 2,093 (Figure 2). The chi-squared test yielded χ² = 516.26 with 11 degrees of freedom and a p-value < 2.2e-16. The ANOVA returned a sum of squares of 1,388,880 with 11 degrees of freedom for month; because each month contributed a single count, no residual degrees of freedom remained, and no F statistic or p-value was produced. The Kruskal-Wallis test returned a chi-squared value of 11 with 11 degrees of freedom and a p-value of 0.4433.
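To see which months contribute most to a chi-squared result like the one reported above, one option is to inspect the standardized residuals that `chisq.test()` returns. The sketch below is only an illustration: `example_counts` holds made-up values, not the Portal counts, and the names `most_over` and `most_under` are introduced here for convenience.

```r
# Hypothetical monthly counts (NOT the Portal data), January to December
example_counts <- c(120, 95, 110, 130, 125, 140, 160, 90, 105, 115, 100, 110)

# Goodness-of-fit test against equal monthly proportions
fit <- chisq.test(example_counts, p = rep(1 / 12, 12))

# Expected count per month under the null hypothesis
fit$expected

# Standardized residuals: positive values mark months sampled more
# often than expected, negative values mark months sampled less often
round(fit$stdres, 2)

# Months with the largest positive and negative deviations
most_over <- month.abb[which.max(fit$stdres)]
most_under <- month.abb[which.min(fit$stdres)]
most_over
most_under
```

Applied to the real monthly totals, this would give a per-month view to set alongside the bar chart in Figure 2.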
Figure 1: Precipitation (in) throughout the year in Portal, AZ, recorded at the Portal, AZ weather station (026706) from 01/01/1914 to 03/31/1955. Monthly data were collected by the Western Regional Climate Center (WRCC).
Table 1: Summary of temperatures at the Portal, AZ weather station (026706) from 01/01/1914 to 03/31/1955. Monthly data were collected by the Western Regional Climate Center (WRCC).
# Load packages
library(ratdat) #data
library(tidyverse) #data organizing/management
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.4.3 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.0
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(ggplot2) #graphs
library(rsconnect) #make the report viewable
# Sort Data
counts <- ratdat::complete %>%
  select(month, weight, taxa) %>% # keep only the columns needed for the analysis
  filter(!is.na(weight), taxa == "Rodent") %>% # drop missing weights and non-rodent taxa
  mutate(month = factor(month, levels = 1:12, labels = month.abb)) %>% # month numbers to month names, in calendar order
  count(month, name = "n_weights") # number of weights recorded in each month
ggplot(counts, aes(x = month, y = n_weights)) + # bar chart built with ggplot2
  geom_col(fill = "lightblue") + # one bar per month showing the number of weights taken
  geom_text(aes(label = n_weights), vjust = -0.5) + # print each count above its bar
  labs(
    title = "Number of Rodent Weight Observations by Month",
    x = "Month",
    y = "Frequency of Weight Taken",
    caption = "Figure 2:
Rodent weight observations from the Portal Project in Portal, AZ, 1977 to 2002. The number of rodent weights
recorded in each month is shown above each bar. A chi-squared goodness-of-fit test indicated that the frequency
of weights taken was not evenly distributed across months (p-value < 2.2e-16), suggesting a seasonal or
experimental influence on the frequency of sampling." # detailed caption to explain the graph
  ) +
  theme_minimal() + # minimal theme
  scale_x_discrete(limits = month.abb) # keep all twelve months on the x axis, in calendar order
chi <- chisq.test(counts$n_weights, p = rep(1/12, 12)) # chi-squared goodness-of-fit test against equal monthly proportions
chi
##
## Chi-squared test for given probabilities
##
## data: counts$n_weights
## X-squared = 516.26, df = 11, p-value < 2.2e-16
aov_fit <- aov(n_weights ~ month, data = counts) # ANOVA on the monthly counts; one count per month leaves no residual degrees of freedom
aov_fit
## Call:
## aov(formula = n_weights ~ month, data = counts)
##
## Terms:
## month
## Sum of Squares 1388880
## Deg. of Freedom 11
##
## Estimated effects may be unbalanced
kruskal <- kruskal.test(n_weights ~ month, data = counts) #Kruskal-Wallis test
kruskal
##
## Kruskal-Wallis rank sum test
##
## data: n_weights by month
## Kruskal-Wallis chi-squared = 11, df = 11, p-value = 0.4433
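The Kruskal-Wallis output above is worth a closer look. When each group contains a single value, as in `counts` (one total per month), the statistic depends only on the number of groups and not on the values themselves. The sketch below illustrates this with two made-up sets of 12 distinct values, which are not the Portal data; the object names are chosen here for illustration.

```r
# Twelve groups with one value each, mirroring one count per month
groups <- factor(month.abb, levels = month.abb)

# Two very different hypothetical sets of distinct values (NOT the Portal data)
values_a <- c(120, 95, 110, 130, 125, 140, 160, 90, 105, 115, 100, 111)
values_b <- c(5, 900, 42, 7, 13, 2000, 1, 350, 77, 8, 64, 19)

kw_a <- kruskal.test(values_a, groups)
kw_b <- kruskal.test(values_b, groups)

# With one distinct value per group the ranks are always 1 to 12,
# so both tests return the same statistic and the same p-value
kw_a$statistic
kw_b$statistic
kw_a$p.value
kw_b$p.value

# The same design leaves an ANOVA with no residual degrees of freedom,
# so no F statistic or p-value can be computed
example_fit <- aov(values_a ~ groups)
df.residual(example_fit)
```

A test of differences among months would need replicated values within each month, for example one count per trapping session, which is a possible extension rather than part of this analysis.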
The chi-squared test (χ² = 516.26, df = 11, p < 2.2e-16) rejected the null hypothesis of equal distribution, indicating that the frequency of rodent weight measurements varied significantly across the months and that sampling was not uniform throughout the year. The ANOVA and Kruskal-Wallis tests did not add support for this result: with a single count per month, the ANOVA produced no F statistic or p-value, and the Kruskal-Wallis test was not significant (p = 0.4433). The conclusion therefore rests on the chi-squared test.
My hypothesis was that fewer rodent measurements would be taken during July, August, and September, as these months were expected to have higher precipitation. In fact, July had the most weight measurements despite being one of the hottest and wettest months. This suggests that the summer monsoon season did not hinder fieldwork as predicted; the variation could instead be due to commitments to maintain consistent monthly sampling regardless of weather. August, which also experiences heavy precipitation, had the fewest weight measurements. The smaller number of weights collected in August could have been due to logistical issues faced by the researchers or to regulations set by their institutions, such as not working during certain times of day to reduce heat stress.
These patterns highlight the influence of both ecological factors and practical limitations on long-term data collection. The variation in the frequency of measurements shows the importance of discussing sampling bias when working with long-term ecological datasets. If such biases are not acknowledged, they could lead to incorrect conclusions. Future research should consider how breeding cycles or resource availability may influence the likelihood of catching rodents in traps. Additionally, integrating weather records into the data package would allow for a deeper understanding of the dataset.
Thank you to Desirée Bogen for critiques on my first submitted draft, to the Portal Project for providing the data, and to the Western Regional Climate Center (WRCC) for data on the weather in Portal, AZ.
Ernest, Morgan, et al. Sharing the Long-Term Portal Project Data: 37 Years of Rodent, Plant, Ant, and Weather Data, 27 June 2020, https://doi.org/10.59350/srvbf-90g46.
Western Regional Climate Center. “Portal, Arizona.” PORTAL, ARIZONA - Climate Summary, 2012, wrcc.dri.edu/cgi-bin/cliMAIN.pl?az6706.