Install the ISwR R package. Write the built-in dataset thuesen to a tab-separated text file. View it with a text editor.
library(ISwR)
## Warning: package 'ISwR' was built under R version 4.3.2
write.table(thuesen, file = "thuesen.txt", row.names = F, sep = "\t")
thuesen.tab.delim <- readLines("thuesen.txt")
theusen.tab.delim <- sub("NA", ".", thuesen.tab.delim)
writeLines(thuesen.tab.delim, "thuesen_period.txt")
cat(readLines("thuesen_period.txt"), sep = "\n")
## "blood.glucose" "short.velocity"
## 15.3 1.76
## 10.8 1.34
## 8.1 1.27
## 19.5 1.47
## 7.2 1.27
## 5.3 1.49
## 9.3 1.31
## 11.1 1.09
## 7.5 1.18
## 12.2 1.22
## 6.7 1.25
## 5.2 1.19
## 19 1.95
## 15.1 1.28
## 6.7 1.52
## 8.6 NA
## 4.2 1.12
## 10.3 1.37
## 12.5 1.19
## 16.1 1.05
## 13.3 1.32
## 4.9 1.03
## 8.8 1.12
## 9.5 1.7
The exponential growth of a population is described by this mathematical function: N_t=N_0 e^rt where Nt is the population size at time t, N0 is the initial population size, and r is the rate of growth or reproductive rate. Write an exponential growth function in R that also generates a plot with time on the x axis and Nt on the y axis. Using that function create plots for 20 days assuming an initial population size of 10 individuals under three growth rate scenarios: (0.5, 0.8, -0.1).
exponential_growth <- function(N0, r, days, col) {
t <- 1:days
Nt <- N0 * exp(r * t)
lines(t, Nt, type = "l", lty = 2, lwd = 2.5, col = col)
}
I orginally made three separate plots with no modifications to color, legends, etc., but saw a tutorial for generating plots like the one below as well and wanted to give it a shot…
plot(0, type = "n", xlim = c(1, 20), ylim = c(0, 1000), xlab = "Time (days)", ylab="Population Size (Number)", main="Exponential Growth")
exponential_growth(10, 0.5, 20, col="darkred")
exponential_growth(10, 0.8, 20, col="blue")
exponential_growth(10, -0.1, 20, col="darkgreen")
legend("topright", legend = c("r = 0.5", "r = 0.8", "r = -0.1"), col = c("darkred", "blue", "darkgreen"), lty = 1, lwd = 2.5)
Under highly favorable conditions, populations grow exponentially. However resources will eventually limit population growth and exponential growth cannot continue indefinitely. This phenomenon is described by the logistic growth function: N_t=(KN_0)/〖N_0+(K-N_0)e〗^(-rt) where K is the carrying capacity. Write a logistic growth function in R that also generates a plot with time on the x axis and Nt on the y axis. Using that function create plots for 20 days assuming an initial population size of 10 individuals and a carrying capacity of 1,000 individuals under three growth rate scenarios: (0.5, 0.8, 0.4).
log.growth<-function (N0, r, t, K, col) {
Nt <- K * N0/(N0 + (K - N0) * exp(1)^(-r * t))
lines(t, Nt, type = "l", lty = 2, lwd = 2.5, col = col)}
plot(0, type = "n", xlim = c(0, 20), ylim = c(0, 1000), xlab = "Time (days)", ylab = "Population (Number)", main = "Logistic Growth")
log.growth(N0 = 10, r = 0.5, t = 0:20, K = 1000, col = "darkred")
log.growth(N0 = 10, r = 0.8, t = 0:20, K = 1000, col = "blue")
log.growth(N0 = 10, r = 0.4, t = 0:20, K = 1000, col = "darkgreen")
legend("topleft", legend = c("r = 0.5", "r = 0.8", "r = 0.4"), col = c("darkred", "blue", "darkgreen"), lty = 1, lwd = 2.5)
Write a function (sum_n) that for any given value, say n, computes the sum of the integers from 1 to n (inclusive). Use the function to determine the sum of integers from 1 to 5,000.
sum_n <- function(n){
n <- 1:n
x <- n-(n-1)
sum_n_eq <- sum(c(from = 1, to = n, by = 1))}
sum_n(n = 5000)
print(sum_n(n = 5000))
## [1] 12502502
Write a function (sqrt_round) that takes a number x as input, takes the square root, rounds it to the nearest whole number and then returns the result.
sqrt_round <- function(x){
x <- x
round(sqrt(x))}
print(sqrt_round(x = 8.3))
## [1] 3
Install the R package ‘nycflights13’, and load the ‘weather’ data. a) Explore the columns names and the top part of the dataset to get a sense of the data
library(nycflights13)
data(weather)
head(weather)
## # A tibble: 6 × 15
## origin year month day hour temp dewp humid wind_dir wind_speed wind_gust
## <chr> <int> <int> <int> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 EWR 2013 1 1 1 39.0 26.1 59.4 270 10.4 NA
## 2 EWR 2013 1 1 2 39.0 27.0 61.6 250 8.06 NA
## 3 EWR 2013 1 1 3 39.0 28.0 64.4 240 11.5 NA
## 4 EWR 2013 1 1 4 39.9 28.0 62.2 250 12.7 NA
## 5 EWR 2013 1 1 5 39.0 28.0 64.4 260 12.7 NA
## 6 EWR 2013 1 1 6 37.9 28.0 67.2 240 11.5 NA
## # ℹ 4 more variables: precip <dbl>, pressure <dbl>, visib <dbl>,
## # time_hour <dttm>
weather1 <- subset(weather, weather$month == 1)
library(ggplot2)
ggplot(data = weather1, aes(x = temp)) + geom_histogram(bins = 30)
ggplot(data = weather1, aes(x = temp)) + geom_histogram(col = "white",
fill = "blue") +
labs(title = "Temperatures in Month 1",
x = "Temperature (°F)",
y = "Count")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
ggplot(weather1, aes(x = time_hour, y = temp)) + geom_line(color = "hotpink") +
labs(title = "Departure Temperatures Over Time (Month 1)",
x = "Time (hr)",
y = "Temperature (°F)")
Okay, pink is my favourite colour but I’ll select a better colour for reporting…
ggplot(weather1, aes(x = time_hour, y = temp)) + geom_line(color = "red") +
labs(title = "Departure Temperatures Over Time (Month 1)",
x = "Time (hr)",
y = "Temperature (°F)")
ggplot(weather1, aes(x = origin, y = temp, fill = origin)) + geom_boxplot() +
labs(title = "Departure Temperatures Over Time (Month 1)",
x = "Time (hr)",
y = "Temperature (°F)")
A researcher videotaped the glides of 8 tree snakes leaping from a 10-m tower. Undulation rates of the snakes measured in hertz (cycles per second) were as follows: 0.9, 1.4, 1.2, 1.2, 1.3, 2.0, 1.4, 1.6.
library(ggplot2)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.3 ✔ readr 2.1.4
## ✔ forcats 1.0.0 ✔ stringr 1.5.0
## ✔ lubridate 1.9.3 ✔ tibble 3.2.1
## ✔ purrr 1.0.2 ✔ tidyr 1.3.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
undulation_rates <- c(0.9, 1.4, 1.2, 1.2, 1.3, 2.0, 1.4, 1.6)
hist(undulation_rates, xlab = "Hertz", ylab = "Frequency", main = "Tree Snake Undulation Rates")
sur_mean <- mean(undulation_rates)
sur_mean
## [1] 1.375
sur_range <- range(undulation_rates)
sur_range
## [1] 0.9 2.0
sur_sd <- sd(undulation_rates)
sur_sd
## [1] 0.324037
coef_v <- function(sur_mean, sur_sd){sur_mean <- sur_mean
sur_sd < sur_sd
}
(sur_sd/sur_mean)*100
## [1] 23.56633
Blood pressure was measured (in units of mm Hg). Here are the measurements: 112, 128, 108, 129, 125, 153, 155, 132, 137.
bp <- c(112, 128, 108, 129, 125, 153, 155, 132, 137)
length(bp)
## [1] 9
bp_mean <- mean(bp)
bp_mean
## [1] 131
bp_variance <- var(bp)
bp_variance
## [1] 254.5
bp_sd <- sd(bp)
bp_sd
## [1] 15.95306
coef_v <- function(bp_mean, bp_sd){bp_mean <- bp_mean
bp_sd <- bp_sd
}
(bp_sd/bp_mean)*100
## [1] 12.17791
Calculate the probability of each of the following events: a) A standard normally distributed variable is larger than 3
pnorm(3, lower.tail = FALSE)
## [1] 0.001349898
pnorm(42, mean = 35, sd = 6, lower.tail = FALSE)
## [1] 0.1216725
dbinom(10, size = 10, prob = 0.8)
## [1] 0.1073742
pchisq(6.5, df = 2, lower.tail = FALSE)
## [1] 0.03877421
Imagine height is genetically determined by the combined (that is, the sum) effect of several genes (polygenic trait). Assume that each gene has an effect on height as a uniform distribution with min=1 and max=3. Simulate an stochastic model of height for 1,000 random people based on 1 gene,
single <- runif(1000, 1, 3)
hist(single, main = "Single Gene Height Determination", xlab = "Height")
2 genes,
double <- runif(1000, 1, 3) + runif(1000, 1, 3)
hist(double, main = "Double Gene Height Determination", xlab = "Height")
and 5 genes.
quintuple <- runif(1000, 1, 3) + runif(1000, 1, 3) + runif(1000, 1, 3) + runif(1000, 1, 3) + runif(1000, 1, 3)
hist(quintuple, main = "Quintuple Gene Height Determination", xlab = "Height")
As we increase the number of genes, what is the resulting height distribution?
As we increase the number of genes, the distribution becomes more centralized with relatively fewer outliers on either end of the spectrum. The curve of the histogram becomes more similar to a downwards-facing parabola at the 5 gene determination compared to the almost rectangular single-gene determination.
Normal human body temperature is 98.6 F. Researchers obtained body-temperature measurements on randomly chosen healthy people: 98.4, 98.6, 97.8, 98.8, 97.9, 99.0, 98.2, 98.8, 98.8, 99.0, 98.0, 99.2, 99.5, 99.4, 98.4, 99.1, 98.4, 97.6, 97.4, 97.5, 97.5, 98.8, 98.6, 100.0, 98.4.
bodytempsurvey <- c(98.4, 98.6, 97.8, 98.8, 97.9, 99.0, 98.2, 98.8, 98.8, 99.0, 98.0, 99.2, 99.5, 99.4, 98.4, 99.1, 98.4, 97.6, 97.4, 97.5, 97.5, 98.8, 98.6, 100.0, 98.4)
hist(bodytempsurvey)
qqnorm(bodytempsurvey)
qqline(bodytempsurvey)
shapiro.test(bodytempsurvey)
##
## Shapiro-Wilk normality test
##
## data: bodytempsurvey
## W = 0.97216, p-value = 0.7001
t.test(bodytempsurvey, mu = 98.6)
##
## One Sample t-test
##
## data: bodytempsurvey
## t = -0.56065, df = 24, p-value = 0.5802
## alternative hypothesis: true mean is not equal to 98.6
## 95 percent confidence interval:
## 98.24422 98.80378
## sample estimates:
## mean of x
## 98.524
Yes, they are normally distributed.
Yes, these measurements are consistent with a population mean body temperature of 98.6F, as it falls within the confidence interval.
Ten epileptic patients participated in a study of a new anticonvulsant drug. During the first 8-week period, half the patients received a placebo and half were given the drug, and the number of seizures were recorded. Following this, the same patients were given the opposite treatment, and the number of seizures were recorded. Assuming that the distribution of the difference between the placebo and drug meets the assumption of normality, perform an appropriate test to determine whether there were differences in the number of epileptic seizures with and without the drug.
epimedstudy <- data.frame(c(1:10), c(37, 52, 68, 4, 29, 32, 19, 52, 19, 12), c(5, 23, 40, 3, 38, 19, 9, 24, 17, 14))
names(epimedstudy) <- c("Patient Number", "Seizures: Placebo", "Seizures: Actual Med")
head(epimedstudy)
## Patient Number Seizures: Placebo Seizures: Actual Med
## 1 1 37 5
## 2 2 52 23
## 3 3 68 40
## 4 4 4 3
## 5 5 29 38
## 6 6 32 19
placebo = c(37, 52, 68, 4, 29, 32, 19, 52, 19, 12)
actual = c(5, 23, 40, 3, 38, 19, 9, 24, 17, 14)
boxplot(placebo, actual, names=c("Placebo", "Experimental"), main="Seizures by Treatment")
t.test(placebo, actual, paired = TRUE)
##
## Paired t-test
##
## data: placebo and actual
## t = 2.7661, df = 9, p-value = 0.02189
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## 2.404666 23.995334
## sample estimates:
## mean difference
## 13.2
t.test(placebo, actual, paired = TRUE)$p.value
## [1] 0.02189433
p value is less than 0.05, indicating a significant difference between control (placebo) and experimental (actual) groups.
A bee biologist is analyzing whether there was an association between bee colony number and type of forest habitat. We expect that there is no habitat preference for bee colony number.
beecolhab <- (c("Oak" = 33, "Hickory" = 30, "Maple" = 29, "Red Cedar" = 4, "Poplar" = 4))
barplot(beecolhab)
chisq.test(beecolhab)
##
## Chi-squared test for given probabilities
##
## data: beecolhab
## X-squared = 43.1, df = 4, p-value = 9.865e-09
chisq.test(beecolhab)$p.value
## [1] 9.865077e-09
Is this true based on this data? Having run a chi-squared test as a goodness-of-fit test, and having a p-value (relatively far) lower than 0.05 suggests there is evidence supporting habitat preference for bee colony numbers
Perform 10 one-sample t-tests for mu=0 on simulated standard normally distributed data of 25 observations each and get the P-value.
replicate(10, t.test(rnorm(25), mu = 0)$p.value)
## [1] 0.52416886 0.96484912 0.80712460 0.46682219 0.27731529 0.82666178
## [7] 0.58099065 0.06106349 0.78961751 0.99383502
Repeat the experiment, but instead simulate samples of 25 observations from a t-distribution with 2 degrees of freedom.
replicate (10, t.test(rt(25, df = 2), mu = 0)$p.value)
## [1] 0.8506760 0.6567203 0.2315762 0.5864703 0.2089147 0.6842468 0.7622087
## [8] 0.4138133 0.9491339 0.6274438
Find a way to automate the first experiment to do it a 1,000 times and applying a false discovery rate correction to the P-values (check ?replicate).
p.adjust(replicate(1000, t.test(rnorm(25), mu = 0)$p.value), method = "fdr")
## [1] 0.9723148 0.8249871 0.9988447 0.9982948 0.9330501 0.9330501 0.9988447
## [8] 0.9982948 0.9723148 0.9330501 0.9988447 0.9988447 0.9455095 0.9723148
## [15] 0.9988447 0.9988447 0.9988447 0.9723148 0.9982948 0.9988447 0.9723148
## [22] 0.9330501 0.9982948 0.9988447 0.9988447 0.9723148 0.9455095 0.9723148
## [29] 0.9982948 0.9723148 0.9723148 0.9723148 0.9982948 0.9982948 0.9988447
## [36] 0.9988447 0.9988447 0.9982948 0.9988447 0.9988447 0.9982948 0.9988447
## [43] 0.9723148 0.9330501 0.9723148 0.9777742 0.9988447 0.9988447 0.9982948
## [50] 0.9982948 0.9988447 0.9723148 0.9982948 0.9988447 0.9723148 0.9982948
## [57] 0.9988447 0.9988447 0.9982948 0.9988447 0.9723148 0.9988447 0.9988447
## [64] 0.9723148 0.9988447 0.9723148 0.9988447 0.9988447 0.9723148 0.9982948
## [71] 0.9988447 0.9982948 0.9723148 0.9805739 0.9988447 0.9988447 0.9993963
## [78] 0.9723148 0.9581171 0.9723148 0.9982948 0.9982948 0.9982948 0.9723148
## [85] 0.9988447 0.9723148 0.9723148 0.9723148 0.9988447 0.9988447 0.8249871
## [92] 0.9455095 0.9982948 0.9723148 0.9723148 0.9988447 0.9723148 0.9723148
## [99] 0.9988447 0.9330501 0.9330501 0.9723148 0.9330501 0.9752485 0.9455095
## [106] 0.9988447 0.9993963 0.9988447 0.9988447 0.9982948 0.9723148 0.9982948
## [113] 0.9988447 0.9723148 0.9752485 0.9982948 0.9330501 0.9982948 0.9723148
## [120] 0.9723148 0.9988447 0.9752485 0.9723148 0.9982948 0.9988447 0.9988447
## [127] 0.9723148 0.9723148 0.9988447 0.9723148 0.9752485 0.9744790 0.9455095
## [134] 0.9723148 0.9455095 0.9988447 0.9982948 0.9988447 0.9982948 0.9723148
## [141] 0.9455095 0.9455095 0.9982948 0.9752485 0.9988447 0.9988447 0.9723148
## [148] 0.9988447 0.9723148 0.9982948 0.9330501 0.9723148 0.9982948 0.9723148
## [155] 0.9723148 0.9723148 0.9723148 0.9455095 0.9330501 0.9988447 0.9982948
## [162] 0.9982948 0.8774087 0.9723148 0.9744790 0.9982948 0.9455095 0.9988447
## [169] 0.9723148 0.9723148 0.9330501 0.9455095 0.9723148 0.9988447 0.9455095
## [176] 0.9988447 0.9988447 0.9723148 0.9988447 0.9982948 0.9988447 0.9723148
## [183] 0.9723148 0.9988447 0.9723148 0.9988447 0.9988447 0.9723148 0.9330501
## [190] 0.9455095 0.9988447 0.9988447 0.9988447 0.9330501 0.9477678 0.9723148
## [197] 0.9723148 0.9982948 0.9723148 0.9988447 0.9982948 0.9988447 0.9988447
## [204] 0.9982948 0.9723148 0.9330501 0.9982948 0.9723148 0.9723148 0.9988447
## [211] 0.9723148 0.9988447 0.9723148 0.9723148 0.9988447 0.9723148 0.9330501
## [218] 0.9957783 0.9723148 0.9982948 0.9982948 0.9723148 0.9988447 0.9455095
## [225] 0.9988447 0.9982948 0.9330501 0.9723148 0.9988447 0.9723148 0.9982948
## [232] 0.9455095 0.9988447 0.9330501 0.9988447 0.9982948 0.9988447 0.9988447
## [239] 0.9744790 0.9723148 0.9988447 0.9988447 0.9988447 0.9988447 0.9330501
## [246] 0.9723148 0.9982948 0.9723148 0.9744790 0.9988447 0.9723148 0.9982948
## [253] 0.9988447 0.9723148 0.9330501 0.6823811 0.9988447 0.9723148 0.9330501
## [260] 0.9749366 0.9330501 0.9723148 0.9723148 0.9455095 0.9988447 0.9982948
## [267] 0.9988447 0.9330501 0.9779414 0.9723148 0.9982948 0.9988447 0.9988447
## [274] 0.9988447 0.9723148 0.9330501 0.9723148 0.9988447 0.9330501 0.9723148
## [281] 0.9723148 0.9330501 0.9752485 0.9982948 0.9723148 0.9455095 0.9723148
## [288] 0.9723148 0.9988447 0.9723148 0.9723148 0.9330501 0.9330501 0.9330501
## [295] 0.9723148 0.9330501 0.9455095 0.9330501 0.9723148 0.9988447 0.6492151
## [302] 0.9330501 0.9982948 0.9988447 0.9723148 0.9988447 0.9988447 0.9330501
## [309] 0.9455095 0.9988447 0.9988447 0.9723148 0.9988447 0.9988447 0.9330501
## [316] 0.9723148 0.9723148 0.9723148 0.9988447 0.9330501 0.9752485 0.9988447
## [323] 0.9455095 0.9982948 0.9455095 0.9723148 0.9723148 0.9779414 0.9330501
## [330] 0.9330501 0.9723148 0.8249871 0.9723148 0.9723148 0.9982948 0.9988447
## [337] 0.9723148 0.9330501 0.9723148 0.9982948 0.9723148 0.9330501 0.9982948
## [344] 0.9988447 0.9982948 0.9723148 0.9723148 0.9988447 0.9988447 0.9330501
## [351] 0.9988447 0.9744790 0.9982948 0.9988447 0.9988447 0.9988447 0.9723148
## [358] 0.9988447 0.9455095 0.9723148 0.7779686 0.9982948 0.9723148 0.9982948
## [365] 0.9723148 0.9988447 0.8148259 0.9982948 0.9988447 0.9982948 0.9988447
## [372] 0.9982948 0.9723148 0.9988447 0.9982948 0.9982948 0.9988447 0.9455095
## [379] 0.9988447 0.9723148 0.9988447 0.9982948 0.9988447 0.9455095 0.9982948
## [386] 0.9723148 0.9723148 0.9723148 0.8774087 0.9330501 0.9330501 0.9330501
## [393] 0.9988447 0.9982948 0.9723148 0.9330501 0.9723148 0.9723148 0.9723148
## [400] 0.9723148 0.9723148 0.9330501 0.9982948 0.9982948 0.9982948 0.9723148
## [407] 0.9723148 0.9982948 0.9723148 0.9455095 0.9988447 0.9988447 0.9723148
## [414] 0.9988447 0.9330501 0.9988447 0.9723148 0.9988447 0.9723148 0.9982948
## [421] 0.9744790 0.9988447 0.9330501 0.9723148 0.9988447 0.9988447 0.9455095
## [428] 0.9988447 0.9723148 0.9330501 0.9988447 0.9744790 0.9330501 0.9723148
## [435] 0.9988447 0.8774087 0.9455095 0.9988447 0.9723148 0.9723148 0.9988447
## [442] 0.9723148 0.9723148 0.9752485 0.9752485 0.9723148 0.9988447 0.9744790
## [449] 0.9988447 0.9982948 0.9455095 0.9988447 0.9988447 0.9988447 0.9982948
## [456] 0.9723148 0.9723148 0.9988447 0.9723148 0.9982948 0.9723148 0.9982948
## [463] 0.9723148 0.9988447 0.9988447 0.9988447 0.9982948 0.9988447 0.9455095
## [470] 0.9988447 0.9330501 0.9982948 0.9810300 0.9723148 0.9988447 0.9455095
## [477] 0.9982948 0.9723148 0.9723148 0.9723148 0.9330501 0.9723148 0.9723148
## [484] 0.9988447 0.9982948 0.8774087 0.9330501 0.9723148 0.9988447 0.9982948
## [491] 0.9723148 0.9988447 0.9723148 0.9988447 0.9723148 0.9455095 0.9455095
## [498] 0.9982948 0.9988447 0.9330501 0.9723148 0.9988447 0.9988447 0.9988447
## [505] 0.9988447 0.9988447 0.9723148 0.9723148 0.9723148 0.9988447 0.9988447
## [512] 0.9330501 0.9721226 0.9723148 0.9752485 0.9982948 0.9988447 0.9723148
## [519] 0.9723148 0.9723148 0.9988447 0.9330501 0.9988447 0.9723148 0.9723148
## [526] 0.9988447 0.9982948 0.9988447 0.9988447 0.9723148 0.9516867 0.9723148
## [533] 0.9988447 0.9723148 0.9988447 0.9988447 0.9988447 0.9988447 0.9723148
## [540] 0.8712003 0.9455095 0.9723148 0.9455095 0.9455095 0.9982948 0.9723148
## [547] 0.9723148 0.9988447 0.9982948 0.9988447 0.9744790 0.9330501 0.9988447
## [554] 0.9988447 0.9988447 0.8774087 0.9988447 0.9982948 0.9723148 0.9723148
## [561] 0.9988447 0.9988447 0.9982948 0.9988447 0.9723148 0.9988447 0.9680508
## [568] 0.9723148 0.9988447 0.9723148 0.9988447 0.9330501 0.9330501 0.9988447
## [575] 0.9455095 0.9982948 0.9330501 0.9988447 0.9982948 0.8774087 0.9988447
## [582] 0.9988447 0.9330501 0.9988447 0.9988447 0.9988447 0.9752485 0.9723148
## [589] 0.9723148 0.9988447 0.9723148 0.9723148 0.9723148 0.9988447 0.9723148
## [596] 0.9455095 0.9723148 0.8774087 0.9723148 0.9330501 0.9723148 0.9988447
## [603] 0.9455095 0.9455095 0.9752485 0.9988447 0.9330501 0.9982948 0.9988447
## [610] 0.9988447 0.9982948 0.9982948 0.9988447 0.9988447 0.2509067 0.9330501
## [617] 0.9330501 0.9988447 0.9988447 0.9988447 0.9988447 0.9752485 0.9744790
## [624] 0.8774087 0.9982948 0.9723148 0.9723148 0.9982948 0.9982948 0.9455095
## [631] 0.9988447 0.9723148 0.9723148 0.9988447 0.9330501 0.9988447 0.9988447
## [638] 0.9988447 0.9988447 0.9988447 0.9988447 0.9982948 0.9988447 0.9744790
## [645] 0.9988447 0.9805739 0.9988447 0.9723148 0.9330501 0.9988447 0.9752485
## [652] 0.9723148 0.9982948 0.9982948 0.9723148 0.9723148 0.9988447 0.9723148
## [659] 0.9723148 0.9982948 0.9455095 0.9330501 0.9723148 0.9982948 0.9988447
## [666] 0.9723148 0.9495821 0.9982948 0.9455095 0.9988447 0.9982948 0.9988447
## [673] 0.9988447 0.9330501 0.9988447 0.9723148 0.9330501 0.9330501 0.9581171
## [680] 0.9723148 0.9723148 0.9988447 0.9988447 0.9988447 0.9988447 0.9330501
## [687] 0.9723148 0.9744790 0.9988447 0.9988447 0.9455095 0.9455095 0.9805739
## [694] 0.9330501 0.9988447 0.9744790 0.9982948 0.9988447 0.9982948 0.9723148
## [701] 0.9982948 0.9988447 0.9723148 0.9988447 0.9455095 0.9723148 0.9982948
## [708] 0.9982948 0.9455095 0.9330501 0.9330501 0.9330501 0.9982948 0.9752485
## [715] 0.9988447 0.9988447 0.6823811 0.9723148 0.9982948 0.9982948 0.9988447
## [722] 0.9723148 0.9988447 0.9982948 0.9982948 0.9723148 0.9988447 0.9455095
## [729] 0.9988447 0.9988447 0.9455095 0.9723148 0.9752485 0.9988447 0.9982948
## [736] 0.9988447 0.9723148 0.9982948 0.9982948 0.9723148 0.9988447 0.9982948
## [743] 0.9982948 0.9752485 0.9723148 0.9723148 0.9988447 0.9982948 0.9723148
## [750] 0.9723148 0.9723148 0.9988447 0.9982948 0.9330501 0.9988447 0.9723148
## [757] 0.9723148 0.9988447 0.9988447 0.9723148 0.9988447 0.9982948 0.9982948
## [764] 0.9988447 0.9988447 0.9330501 0.9982948 0.9330501 0.9330501 0.9752485
## [771] 0.9988447 0.9988447 0.9330501 0.9330501 0.9988447 0.9723148 0.9330501
## [778] 0.9982948 0.9982948 0.9988447 0.9723148 0.9988447 0.9723148 0.9982948
## [785] 0.9723148 0.9723148 0.9723148 0.9455095 0.9982948 0.9988447 0.9988447
## [792] 0.9988447 0.9988447 0.9988447 0.9744790 0.9988447 0.9723148 0.9723148
## [799] 0.9982948 0.9988447 0.9723148 0.9723148 0.9988447 0.9723148 0.9988447
## [806] 0.9988447 0.9982948 0.9988447 0.9723148 0.9988447 0.9744790 0.9744790
## [813] 0.9330501 0.9988447 0.9723148 0.9982948 0.9752485 0.9988447 0.9988447
## [820] 0.9723148 0.9982948 0.9810300 0.9723148 0.9988447 0.9982948 0.9723148
## [827] 0.9723148 0.2509067 0.9982948 0.9982948 0.9723148 0.9988447 0.9723148
## [834] 0.9723148 0.9723148 0.9723148 0.9330501 0.9982948 0.9988447 0.9723148
## [841] 0.9330501 0.9982948 0.9330501 0.9988447 0.9988447 0.9988447 0.9723148
## [848] 0.9988447 0.9988447 0.9982948 0.9988447 0.9988447 0.9723148 0.9982948
## [855] 0.9723148 0.9988447 0.9988447 0.9723148 0.9982948 0.9982948 0.9330501
## [862] 0.9330501 0.9723148 0.6492151 0.9723148 0.9723148 0.9723148 0.9982948
## [869] 0.9988447 0.9723148 0.9988447 0.9723148 0.9744790 0.9330501 0.9723148
## [876] 0.9723148 0.9723148 0.9982948 0.9723148 0.9982948 0.9723148 0.9988447
## [883] 0.9988447 0.9988447 0.9988447 0.9723148 0.9988447 0.9982948 0.9723148
## [890] 0.9988447 0.9330501 0.9988447 0.9982948 0.9455095 0.9723148 0.9988447
## [897] 0.9988447 0.9982948 0.9330501 0.9723148 0.9988447 0.9723148 0.9723148
## [904] 0.9330501 0.9723148 0.9330501 0.9723148 0.9455095 0.9723148 0.9982948
## [911] 0.9988447 0.9723148 0.9988447 0.9988447 0.9988447 0.9988447 0.9988447
## [918] 0.9744790 0.9988447 0.9988447 0.9988447 0.9330501 0.9723148 0.9982948
## [925] 0.9455095 0.9982948 0.9988447 0.9723148 0.9330501 0.8148259 0.9455095
## [932] 0.9455095 0.9982948 0.9982948 0.9988447 0.9988447 0.9752485 0.9988447
## [939] 0.9330501 0.9723148 0.9988447 0.9988447 0.9988447 0.9988447 0.9805739
## [946] 0.9988447 0.9723148 0.9744790 0.9330501 0.9723148 0.9988447 0.9330501
## [953] 0.9988447 0.9752485 0.9744790 0.9723148 0.9744790 0.9723148 0.9982948
## [960] 0.9988447 0.9752485 0.9723148 0.9982948 0.9330501 0.9805739 0.9982948
## [967] 0.9988447 0.9330501 0.9988447 0.2509067 0.9723148 0.9988447 0.9330501
## [974] 0.9988447 0.8774087 0.9723148 0.9988447 0.9455095 0.9723148 0.9723148
## [981] 0.9723148 0.9723148 0.9455095 0.9723148 0.9982948 0.9988447 0.9988447
## [988] 0.9988447 0.9988447 0.9723148 0.8774087 0.9723148 0.9330501 0.9330501
## [995] 0.9988447 0.9455095 0.9988447 0.9988447 0.9330501 0.9810300
You are hired by the US Department of Agriculture to develop effective control practices for invasive plants (data_ANOVA.xlsx). In particular, you are asked to investigate pesticide controls for kudzu which is an invasive vine that grows in thick mats that smother underlying plants. Two of the most widely used pesticides for kudzu are glyphosate and triclopyr. To determine the effectiveness of a single application of pesticide, you conduct an experiment in 18 plots that each had 50% kudzu cover. In mid-summer, you applied equal amounts of 2% glyphosate to 6 plots, 2% triclopyr to 6 plots, and water without pesticides to 6 plots. Then you returned in autumn and measured the percent cover of kudzu in the plots.
library(readxl) install.packages(“reshape2”) library(reshape2) library(ggplot2)
USDA_Exp <- readxl::read_xlsx(path = "data_ANOVA.xlsx", col_names = T)
USDA_Exp
## # A tibble: 6 × 3
## Glyphosate Triclopyr Control
## <dbl> <dbl> <dbl>
## 1 20 17 34
## 2 17 14 31
## 3 24 12 29
## 4 18 16 32
## 5 16 18 27
## 6 22 13 30
KudzuCon <- reshape2::melt(USDA_Exp)
## No id variables; using all as measure variables
KudzuCon
## variable value
## 1 Glyphosate 20
## 2 Glyphosate 17
## 3 Glyphosate 24
## 4 Glyphosate 18
## 5 Glyphosate 16
## 6 Glyphosate 22
## 7 Triclopyr 17
## 8 Triclopyr 14
## 9 Triclopyr 12
## 10 Triclopyr 16
## 11 Triclopyr 18
## 12 Triclopyr 13
## 13 Control 34
## 14 Control 31
## 15 Control 29
## 16 Control 32
## 17 Control 27
## 18 Control 30
boxplot(value~variable, data=KudzuCon, main="Kudzu Response", xlab = "Treatment", ylab = "Measured Cover (%)")
anova(lm(value ~ variable, data = KudzuCon))
## Analysis of Variance Table
##
## Response: value
## Df Sum Sq Mean Sq F value Pr(>F)
## variable 2 763 381.5 54.5 1.318e-07 ***
## Residuals 15 105 7.0
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(lm(value ~ variable, data = KudzuCon))
##
## Call:
## lm(formula = value ~ variable, data = KudzuCon)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.500 -1.875 0.000 1.875 4.500
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 19.500 1.080 18.053 1.38e-11 ***
## variableTriclopyr -4.500 1.528 -2.946 0.01 *
## variableControl 11.000 1.528 7.201 3.07e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.646 on 15 degrees of freedom
## Multiple R-squared: 0.879, Adjusted R-squared: 0.8629
## F-statistic: 54.5 on 2 and 15 DF, p-value: 1.318e-07
summary(lm(value ~ variable, data = KudzuCon))$r.squared
## [1] 0.8790323
R^2 is the coefficient of determination, assessing how well a model fits the data. The R^2 value here is 0.88, so i texplains that about 88% of the variation can be explained by the data. Ideally an R^2 should be high, depending on the data collected (the data I’m working with for my research at the moment often has a value of 0.98 or 0.99), but 0.88 is by no means “bad”
Data in growth.txt come from a farm-scale trial of animal diets. There are two factors: diet and supplement. Diet is a factor with three levels: barley, oats and wheat. Supplement is a factor with four levels: agrimore, control, supergain and supersupp. The response variable is weight gain after 6 weeks.
growth <- read.table("growth.txt", sep="\t", header = T)
head(growth)
## supplement diet gain
## 1 supergain wheat 17.37125
## 2 supergain wheat 16.81489
## 3 supergain wheat 18.08184
## 4 supergain wheat 15.78175
## 5 control wheat 17.70656
## 6 control wheat 18.22717
library(ggplot2)
Weight as a function of diet
boxplot(gain ~ diet, data = growth)
diet_bp <- ggplot(growth, aes(x = diet, y = gain, fill = diet)) + geom_boxplot() +
geom_jitter(shape = 16, position = position_jitter(0.2))+
labs(x = "Diet", y = "Weight Gain (g)")
diet_bp
Weight as a function of supplement
boxplot(gain ~ supplement, data = growth)
supplement_bp <- ggplot(growth, aes(x = supplement, y = gain, fill = supplement)) +
geom_boxplot() +
geom_jitter(shape = 16, position = position_jitter(0.2)) +
labs(x = "Supplement", y = "Weight Gain (g)")
supplement_bp
Combinations FULL DISCLOSURE, I found this formatting style when double checking my own work online compared to that of others, and wanted to try learning how to format like this… My own entry before formatting like this was the first “barebones” entry
boxplot(gain ~ diet + supplement, data = growth)
combo_bp <- ggplot(growth, aes(x = supplement, y = gain, fill = supplement)) + geom_boxplot() +
geom_jitter(shape = 16, position = position_jitter(0.2))+
facet_wrap(~diet) + theme(axis.text.x = element_text(angle = 45)) + labs(x = "Diet & Supplement", y = "Weight Gain (g)")
combo_bp
anova(lm(gain ~ diet * supplement, data = growth))
## Analysis of Variance Table
##
## Response: gain
## Df Sum Sq Mean Sq F value Pr(>F)
## diet 2 287.171 143.586 83.5201 2.999e-14 ***
## supplement 3 91.881 30.627 17.8150 2.952e-07 ***
## diet:supplement 6 3.406 0.568 0.3302 0.9166
## Residuals 36 61.890 1.719
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
interaction.plot(growth$supplement, growth$diet, growth$gain, type = "b", col = c("darkgreen", "blue", "darkred"), main = "Interaction of Diet and Supplement Type on One Another", xlab = "Supplement", ylab = "Gain (g)", trace.label = "Diet")
interaction.plot(growth$diet, growth$supplement, growth$gain, type="b", col=c("darkgreen","blue", "darkred"),
main = "Interaction of Diet and Supplement Type on Gain",
xlab = "Diet", ylab = "Gain", trace.label = "Supplement")
anova(lm(gain ~ diet + supplement, data = growth))
## Analysis of Variance Table
##
## Response: gain
## Df Sum Sq Mean Sq F value Pr(>F)
## diet 2 287.171 143.586 92.358 4.199e-16 ***
## supplement 3 91.881 30.627 19.700 3.980e-08 ***
## Residuals 42 65.296 1.555
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(lm(gain ~ diet + supplement, data = growth))
##
## Call:
## lm(formula = gain ~ diet + supplement, data = growth)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.30792 -0.85929 -0.07713 0.92052 2.90615
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 26.1230 0.4408 59.258 < 2e-16 ***
## dietoats -3.0928 0.4408 -7.016 1.38e-08 ***
## dietwheat -5.9903 0.4408 -13.589 < 2e-16 ***
## supplementcontrol -2.6967 0.5090 -5.298 4.03e-06 ***
## supplementsupergain -3.3815 0.5090 -6.643 4.72e-08 ***
## supplementsupersupp -0.7274 0.5090 -1.429 0.16
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.247 on 42 degrees of freedom
## Multiple R-squared: 0.8531, Adjusted R-squared: 0.8356
## F-statistic: 48.76 on 5 and 42 DF, p-value: < 2.2e-16
0.8531, adjusted = 0.8356
TukeyHSD(aov(gain ~ diet + supplement, data = growth))
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = gain ~ diet + supplement, data = growth)
##
## $diet
## diff lwr upr p adj
## oats-barley -3.092817 -4.163817 -2.021817 0e+00
## wheat-barley -5.990298 -7.061298 -4.919297 0e+00
## wheat-oats -2.897481 -3.968481 -1.826481 2e-07
##
## $supplement
## diff lwr upr p adj
## control-agrimore -2.6967005 -4.0583332 -1.3350677 0.0000234
## supergain-agrimore -3.3814586 -4.7430914 -2.0198259 0.0000003
## supersupp-agrimore -0.7273521 -2.0889849 0.6342806 0.4888738
## supergain-control -0.6847581 -2.0463909 0.6768746 0.5400389
## supersupp-control 1.9693484 0.6077156 3.3309811 0.0020484
## supersupp-supergain 2.6541065 1.2924737 4.0157392 0.0000307
In a study of hyena laughter, a researcher investigated whether sound spectral properties of hyenas’ giggles are associated with age.
# Age(years): 2, 2, 2, 6, 9, 10, 13, 10, 14, 14, 12, 7, 11, 11, 14, 20
age <- c(2,2,2,6,9,10,13,10,14,14,12,7,11,11,14,20)
# Frequency (Hz): 840,670,580,470,540,660,510,520,500,480,400,650,460,500,580,500
freq <- c(840,670,580,470,540,660,510,520,500,480,400,650,460,500,580,500)
agelaugh <- data.frame(age = c(2,2,2,6,9,10,13,10,14,14,12,7,11,11,14,20), freq = c(840,670,580,470,540,660,510,520,500,480,400,650,460,500,580,500))
head(agelaugh)
## age freq
## 1 2 840
## 2 2 670
## 3 2 580
## 4 6 470
## 5 9 540
## 6 10 660
plot(agelaugh)
cor.test(agelaugh$age, agelaugh$freq, method = "pearson")
##
## Pearson's product-moment correlation
##
## data: agelaugh$age and agelaugh$freq
## t = -2.8194, df = 14, p-value = 0.01365
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.8453293 -0.1511968
## sample estimates:
## cor
## -0.601798
cor.test(agelaugh$age, agelaugh$freq, method = "spearman", exact = FALSE)
##
## Spearman's rank correlation rho
##
## data: agelaugh$age and agelaugh$freq
## S = 1047.1, p-value = 0.03092
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
## rho
## -0.5397807
Testosterone is known to predict aggressive behavior and is involved in face shape during puberty. Does face shape predict aggression? Researchers measured the face width-to-height ratio of 21 university hockey players with the average number of penalty minutes awarded per game for aggressive infractions. The data are available in face.txt.
library(ggplot2)
library(see)
## Warning: package 'see' was built under R version 4.3.2
testo <- read.table("face.txt", sep = "\t", header = T)
head(testo)
## Face Penalty
## 1 1.59 0.44
## 2 1.67 1.43
## 3 1.65 1.57
## 4 1.72 0.14
## 5 1.79 0.27
## 6 1.77 0.35
plota <- ggplot(testo, aes(x = Face, y = Penalty)) + geom_point()
plota
model_testo <- lm(Penalty ~ Face, data = testo)
summary(model_testo)
##
## Call:
## lm(formula = Penalty ~ Face, data = testo)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.9338 -0.3883 -0.1260 0.3381 1.0711
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -4.505 2.089 -2.157 0.0440 *
## Face 3.189 1.150 2.774 0.0121 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.6129 on 19 degrees of freedom
## Multiple R-squared: 0.2882, Adjusted R-squared: 0.2508
## F-statistic: 7.694 on 1 and 19 DF, p-value: 0.0121
plotb <- ggplot(testo, aes(x = Face, y = Penalty)) +
geom_point() +
stat_smooth(method = "lm", col = "blue", se=F)
plotb
## `geom_smooth()` using formula = 'y ~ x'
library(performance)
## Warning: package 'performance' was built under R version 4.3.2
check_model(lm(Penalty ~ Face, data = testo))
How steeply does the number of penalty minutes increase per unit increase in face ratio? Calculate the estimate of the intercept. Write the result in the form of an equation for the line. See: y = mx + b <- y = 3.189x - 4.505, where the slope is 3.189 and the intercept is -4.505
How many degrees of freedom does this analysis have? See: summary(model_testo) - 19 DF
What is the t-statistic? See: summary(model_testo) - t value = -2.157(intercept), 2.774
What is your final conclusion about the slope? The slope is a positive slope, indicating an increase in 3.189 face ratio units for each penalty minute.
What is the variation explained, R2? The R^2 value (multiple) is 0.2882, adjusted is 0.2508.
Make a scatterplot of the data and include a linear regression line See: Above code: plotb
Check the model assumptions See: Above code: check_model