dataset <- read.csv("/Users/sean/Desktop/R Studio/BIOS 338/Data/Data_Package_Assignment/Lifespan_18°Cand21°C_ywRflies.csv")
View(dataset)
Simons, Mirre (2024). Data From: Dietary restriction extends lifespan across different temperatures in the fly [Dataset]. Dryad. https://doi.org/10.5061/dryad.fttdz091j
This data package explores how dietary restriction affects the lifespan of Drosophila melanogaster at different temperatures.
I chose this study because of my interest in aging and longevity research with the model organism Drosophila melanogaster. I am particularly interested in the mechanisms of aging and in how environmental factors, such as diet and temperature, can influence lifespan.
The purpose of the study was to investigate whether dietary restriction extends the lifespan of Drosophila melanogaster across different temperatures. Previous studies suggested that the lifespan extension effect of dietary restriction might not hold at lower temperatures in flies. This study aimed to test the robustness of the dietary restriction longevity response under different temperatures.
Since the data package was published on March 28, 2024, the data were likely collected prior to this date. The experiments were conducted in controlled laboratory settings, where flies were reared under specific conditions.
Photo Credit: Wikimedia Commons, User Name: Fir0002
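mean_value <- mean(dataset$Age, na.rm = TRUE) # mean of the Age column
sd_value <- sd(dataset$Age, na.rm = TRUE) # standard deviation
n_value <- sum(!is.na(dataset$Age)) # non-missing ages, to match na.rm = TRUE above
se_value <- sd_value / sqrt(n_value) # standard error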
hist(dataset$Age,
main = "Histogram of Age",
xlab = "Age (days)",
breaks = seq(floor(min(dataset$Age, na.rm = TRUE)), ceiling(max(dataset$Age, na.rm = TRUE)), by = 1), # 1-day bins covering the full range
col = "skyblue",
border = "black",
lwd = 2)
abline(v=mean_value, col = "red", lwd = 3) # Mean lifespan value in Histogram
## Lines for standard deviation (Below & Above the mean)
abline(v= mean_value + sd_value, col = "blue", lwd = 2, lty = 2)
abline(v= mean_value - sd_value, col = "blue", lwd = 2, lty = 2)
## Lines for standard error (Below & Above the mean)
abline(v= mean_value + se_value, col = "green", lwd = 2, lty = 3)
abline(v= mean_value - se_value, col = "green", lwd = 2, lty = 3)
The distribution of ages appears approximately symmetrical, without
extreme skew to the left or right. The red line at the center of the
histogram represents the mean age (in days) in the Drosophila lifespan
data. The blue dashed lines sit at +/- 1 standard deviation from the
mean, and most of the data fall within this range. The green dotted
lines show +/- 1 standard error from the mean, a much narrower interval
because the standard error is the standard deviation divided by the
square root of the sample size. There do not appear to be any major
outliers in this dataset. The tails of the distribution taper off
gradually toward roughly 0 and 120 days.
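To make the difference between the standard deviation and the standard error concrete, here is a minimal sketch on a handful of made-up ages. The values in `age` are hypothetical and are not taken from the Dryad file; the point is only that the sample size should count non-missing values when `na.rm = TRUE` is used.

```r
# Toy ages in days (hypothetical values, not from the Dryad dataset)
age <- c(10, 20, 30, 40, 50, NA)

n_obs    <- sum(!is.na(age))          # count only the non-missing observations
mean_age <- mean(age, na.rm = TRUE)   # sample mean
sd_age   <- sd(age, na.rm = TRUE)     # spread of individual values
se_age   <- sd_age / sqrt(n_obs)      # uncertainty of the mean itself

print(c(n = n_obs, mean = mean_age, sd = sd_age, se = se_age))
```

Because the standard error divides by the square root of the sample size, it will always be smaller than the standard deviation once there is more than one observation, which is why the green lines sit so close to the mean.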
plot(dataset$Age, dataset$At.Risk,
main = "Age vs At Risk",
xlab = "Age (Days)",
ylab = "At Risk",
col = "blue")
abline(lm(At.Risk ~ Age, data = dataset), col = "red") # Linear Regression Line
# lm(): fits a linear regression model to the data
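The fitted line can also be summarized numerically. The sketch below fits the same kind of model on a small toy data frame (the numbers are hypothetical, not the study data) and pulls out the slope, intercept, and R-squared. A straight line is only a rough description here, since the number of flies at risk usually does not decline linearly with age.

```r
# Toy life-table rows (hypothetical values, not from the Dryad dataset)
toy <- data.frame(
  Age     = c(0, 10, 20, 30, 40),
  At.Risk = c(100, 90, 60, 30, 20)
)

fit <- lm(At.Risk ~ Age, data = toy)            # same formula as in the plot above

slope     <- unname(coef(fit)["Age"])           # change in flies at risk per day
intercept <- unname(coef(fit)["(Intercept)"])   # predicted flies at risk at age 0
r_squared <- summary(fit)$r.squared             # share of variance explained

print(c(slope = slope, intercept = intercept, r_squared = r_squared))
```

For these toy numbers the slope is -2.2, meaning about 2.2 fewer flies at risk per additional day of age.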
I’d say the quality of the data package is “good.”
The dataset is well-documented with clear labels for each variable, like Age, At Risk, Dead, etc. It includes all the essential variables for lifespan studies, and its structure is straightforward. However, the study could benefit from more details about the exact experimental setup.
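As an illustration of how these variables could be combined, the sketch below estimates a mean age at death per treatment group by weighting each age by the number of deaths recorded at that age. It rests on several assumptions: that the columns are named `Treatment`, `Age`, and `Dead`, that `Dead` holds the deaths observed at each age rather than a running total, and that every fly's death is recorded. The data frame is a hypothetical stand-in, not the real file.

```r
# Toy life table (hypothetical groups and counts, not from the Dryad dataset)
life_table <- data.frame(
  Treatment = c("A", "A", "A", "B", "B", "B"),
  Age       = c(10, 20, 30, 10, 20, 30),
  Dead      = c(1, 2, 1, 0, 1, 3)   # assumed: deaths observed at each age
)

# Mean age at death per group, weighting each age by its death count
mean_lifespan <- sapply(
  split(life_table, life_table$Treatment),
  function(g) weighted.mean(g$Age, w = g$Dead)
)

print(mean_lifespan)
```

This differs from taking a plain `mean()` of the Age column, which would treat every row equally regardless of how many flies died at that age.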
The dataset is pretty clear because the variable names are intuitive, and the content matches the research focus. For instance, the age of the flies, the number of flies at risk, and the treatment groups are all clearly labeled. But while the variables are clear, more information about the Treatment variable would help improve understanding. For example, explaining exactly what “21C2” or “18C1” means in terms of temperature and dietary conditions would make it easier for someone new to interpret the results.
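One way to start unpacking these labels is to split them into parts. The sketch below assumes that the digits before the "C" are the temperature in degrees Celsius, which matches the 18°C and 21°C in the file name, and it leaves the trailing code uninterpreted because the package does not say what it stands for. The column names `temp_c` and `suffix` are my own.

```r
# Example labels as they appear in the Treatment variable
treatment <- c("21C2", "18C1", "21C2", "18C1")

# Assumed: leading digits before "C" give the temperature in degrees Celsius
temp_c <- as.numeric(sub("^([0-9]+)C.*$", "\\1", treatment))

# Whatever follows the "C" is kept as-is; its meaning is not documented
suffix <- sub("^[0-9]+C", "", treatment)

labels <- data.frame(treatment, temp_c, suffix)
print(labels)
```

With the labels split this way, the data could be grouped by temperature even before the meaning of the trailing code is confirmed.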
Overall, the dataset seems tidy and ready for analysis, which is great for reproducibility. It doesn’t need a lot of data cleaning.
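A few quick checks can back up the claim that the data are tidy. This is a minimal sketch on a toy data frame with assumed column names (`Age`, `At.Risk`, `Dead`, `Treatment`) and hypothetical values; the same lines could be run on the real `dataset` object.

```r
# Toy data frame with the assumed column layout (hypothetical values)
toy <- data.frame(
  Age       = c(2, 4, 6),
  At.Risk   = c(100, 98, 95),
  Dead      = c(0, 2, 3),
  Treatment = c("18C1", "18C1", "18C1")
)

str(toy)                                        # column types at a glance

missing_per_column <- colSums(is.na(toy))       # missing values in each column
n_duplicates       <- sum(duplicated(toy))      # fully duplicated rows
negative_counts    <- any(toy$At.Risk < 0 | toy$Dead < 0, na.rm = TRUE)  # impossible counts

print(missing_per_column)
print(c(duplicates = n_duplicates, negative_counts = negative_counts))
```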