HW 1 D-Dirt

ddirt <- read.csv("ddirt.csv")

dl_all <- ddirt[ddirt$litter_treatment == "DL" ,]

Look at the plot

ggplot(ddirt, aes(percentOrgC_2014, percentN_2014, color=litter_treatment))+
  geom_point()+
  geom_smooth(method="lm", se=F)

I think I will just go with organic carbon as a predictor for nitrogen using the 2014 data. I will isolate the double litter (DL) plots since they show a steeper relationship.

Grab slope and intercept

xbar <- mean(dl_all$percentOrgC_2014, na.rm = T)

ybar <- mean(dl_all$percentN_2014, na.rm=T)


#find the difference between each data point and mean

x_diff <- dl_all$percentOrgC_2014-xbar

y_diff <- dl_all$percentN_2014-ybar

#find numerator and denominator
numerator <- sum(x_diff*y_diff, na.rm = T)

denominator <- sum(x_diff^2, na.rm = T)

#calculate slope
b1 <- numerator/denominator


#find the intercept: Y = b0 + b1*X

b0 <- ybar-b1*xbar
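
As a quick sanity check, here is a minimal sketch comparing the hand calculation to R's built-in least squares fit (check_fit is just a placeholder name) -- the two should agree:

#verify the hand-calculated parameters against lm()
check_fit <- lm(percentN_2014 ~ percentOrgC_2014, data = dl_all)
coef(check_fit) #should match b0 and b1 from above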



sd(dl_all$percentN_2014, na.rm=T) #observed spread in %N -- guides the choice of sigma for the simulation below
## [1] 0.03391587

Step 1: Sample size

n <- 30

Step 2: Generate Predictor

simulated_carbon <- runif(n, min=0, max=1) #choose range for x data

Step 3: Choose Parameters

In this case, I am using the parameters calculated from the double litter data…but I could easily just choose any parameters.

b0 #intercept
## [1] 0.008990332
b1 #slope
## [1] 0.1051602

Step 4: Compute True Mean

mu <- b0 + b1*simulated_carbon

Step 5: Add some noise (error)

sigma <- .03

Step 6: Create data!

simulated_nitrogen <- rnorm(n, mean=mu, sd=sigma)

Step 7: Plot simulation

#name model for line
model_nc <- lm(simulated_nitrogen~simulated_carbon)
summary(model_nc)
## 
## Call:
## lm(formula = simulated_nitrogen ~ simulated_carbon)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.045421 -0.018955 -0.000841  0.020077  0.058650 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       0.01441    0.01093   1.319    0.198    
## simulated_carbon  0.09380    0.01832   5.119    2e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.02747 on 28 degrees of freedom
## Multiple R-squared:  0.4835, Adjusted R-squared:  0.465 
## F-statistic: 26.21 on 1 and 28 DF,  p-value: 2.002e-05
plot(simulated_nitrogen ~ simulated_carbon,
     main="Simulated double litter plots", 
     xlab = "% Organic C", 
     ylab = "% N", 
     ylim=c(0,.2), 
     xlim=c(0, 1))
abline(model_nc, col = "red", lwd = 2) #add the fitted regression line
Fig. 1: HW1 generated data for the relationship between % organic carbon and % nitrogen in double litter plots for the DDIRT project. For every 1% increase in organic carbon, percent nitrogen increased by approximately 0.11% in the simulated double litter plots.


Step 8: Assess Model

model_nc$coefficients
##      (Intercept) simulated_carbon 
##       0.01441212       0.09380411
#OG slope and int
b0
## [1] 0.008990332
b1
## [1] 0.1051602

The fitted linear model recovered the true parameters well, with an estimated slope (0.094) and intercept (0.014) close to the specified values (slope = 0.105, intercept = 0.009), indicating good model performance despite the added noise.

Step 9: Predict observed values

predicted_cn<- predict(model_nc)


fake_DDIRT <- data.frame(simulated_carbon, simulated_nitrogen, predicted_cn)

Step 10: Create function to determine R^2 from predicted values

#create a function for r^2 equation (work from the inside out!)

#ingredient 1: sum((y-y_hat)^2)    -> residual sum of squares
#ingredient 2: sum((y-mean(y))^2)  -> total sum of squares
#ingredient 3: 1 - (ingredient 1/ingredient 2)


r_squared <- function(y, y_hat) {
  ss_res <- sum((y-y_hat)^2)
  ss_tot <- sum((y-mean(y))^2)
  1-(ss_res/ss_tot)
}

#use the function on the data:
r_squared(fake_DDIRT$simulated_nitrogen, fake_DDIRT$predicted_cn)
## [1] 0.4834554
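
This matches the Multiple R-squared in the summary(model_nc) output above (0.4835). As a quick check, the built-in value can also be pulled directly:

#R's built-in R^2 for comparison -- should match the function's output
summary(model_nc)$r.squared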

HW 1 Practice

1a: Cockroaches Actual

roaches <- read.csv("cockroachneurons.csv")

Data Table

headroach <- head(roaches)

headroach |>
  gt()
temperature    rate
       11.9  155.03
       14.4  142.33
       15.3  140.11
       16.4  142.65
       17.9  135.24
       20.4  130.16

Plots

ggplot(roaches, aes(temperature, rate))+
  geom_point(color="#00B8E7")+
  geom_smooth(color="#ED68ED", method="lm", se=F)+
  labs(title="Cockroach Neural Activity ~ Temperature", x="Temperature (C)", y=expression("Neural Firing Rate (s" ^-1 ~ ")")
  )
## `geom_smooth()` using formula = 'y ~ x'

Determine slope and intercept

xbar <- mean(roaches$temperature)
ybar <- mean(roaches$rate)

#find the difference between each data point and mean

x_diff <- roaches$temperature-xbar
y_diff <- roaches$rate-ybar

#find numerator and denominator
numerator <- sum(x_diff*y_diff)

denominator <- sum(x_diff^2)

#calculate slope
b1 <- numerator/denominator

#find the intercept: Y = b0 + b1*X

b0 <- ybar-b1*xbar

#when x is 0 degrees C, the predicted firing rate equals the intercept b0

1b: Cockroaches Predicted

Step 1: Sample size

n <- 140

Step 2: Generate Predictor

predictor <- runif(n, min=10, max=35) #choose range for x data

Step 3: Choose Parameters

In this case, I am using the parameters from the roach data…but I could easily just choose any parameters.

b0 #intercept
## [1] 133.9533
b1 #slope
## [1] -2.058026

Step 4: Compute True Mean

mu <- b0 + b1*predictor

Step 5: Add some noise (error)

sigma <- 3

Step 6: Create data!

response <- rnorm(n, mean=mu, sd=sigma)

Step 7: Plot simulation

plot(response ~ predictor,
     main="Cockroach neural activity as function of temperature", 
     xlab = expression(Temperature~degree*C), 
     ylab = expression("Firing Activity (s" ^-1 ~ ")"))

Step 8: Assess Model
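
Following the same pattern as Step 8 in the D-Dirt section, a minimal sketch of the comparison (model_temp is just a placeholder name):

#fit a model to the simulated data and compare the estimates to the true parameters
model_temp <- lm(response ~ predictor)
coef(model_temp) #compare to b0 and b1 above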

2a: Lions Actual

lions <- read.csv("lionnose.csv")
plot(ageInYears ~ proportionBlack, data=lions)

#equation for a line: Y = b0 + b1*X
xBar<-mean(lions$proportionBlack)
yBar <- mean(lions$ageInYears)

#calculating slope and intercept using the least squares regression:
x_diff <- lions$proportionBlack-xBar
y_diff <- lions$ageInYears-yBar

numerator <- sum(x_diff * y_diff)

denominator <- sum(x_diff^2) #make sure square is on inside of parentheses!

#Slope pg 544
b1 <- numerator/denominator

#find intercept using b1
# Y = b0 + b1X
b0 <- yBar-b1*xBar

numerator
## [1] 13.01234
denominator
## [1] 1.222147
b1
## [1] 10.64712
xBar
## [1] 0.3221875
yBar
## [1] 4.309375
b0
## [1] 0.8790062
#the equation can be written as Age = 0.88 + 10.65(proportion black) aka Y = 0.88 + 10.65X

#now you can add the regression line to the plot using abline function: abline(a = intercept, b = slope)

plot(ageInYears ~ proportionBlack, lions, main="Lion nose pigmentation", ylab="Age (years)", xlab="Proportion black")
abline(a=b0, b=b1, col="coral")

#Ok now using this example, how do I generate my own linear data using a normal distribution? 
#If I already know slope and intercept, how do I generate data that follow that line with noise?
#You choose a slope and intercept, compute the true line, then add normally distributed error around it. 

2b: Lions Predicted

#now generate some data that follows this example using the lion nose slope and intercept. 
#STEP 1: choose sample size
n <- 40

#STEP 2: Generate predictor (X), which, btw, DOES NOT NEED TO BE NORMAL. It works the same whether the predictor comes from a uniform or a normal distribution
predictor <- runif(n, min=0, max=1)

#STEP 3: Choose the true parameters for slope and intercept(in this case use the lion calculated data)
b0
## [1] 0.8790062
b1
## [1] 10.64712
#in this case I am just using the calculated slope and intercept from the data

#### The equation looks like this: Y = b0 + b1X, where we compute a predicted Y for each X
#### Y = 0.9 + 10.6X

# STEP 4: Compute the TRUE mean (average response), or the average Y, for your data

mu <- b0 + b1*predictor #mu represents the PREDICTED values, not the actual values -- therefore, it isn't Ybar

# STEP 5: Add normally distributed error to the predicted values (since environmental data always has it!)
sigma <- 10 #this needs to be reasonable. And it depends on the data units...

# STEP 6: Create the generated data to follow the normal distribution

response <- rnorm(n, mean = mu, sd=sigma) #when mean=mu, there is no longer ONE PEAK. there are peaks around each data point that follow a normal distribution. if mean=10 in this equation, there would be one point on the graph where the y value equaled 10, but it doesn't just equal 10...we have assigned it a vector. so it is creating random points within a certain allowed space! 
#and then sigma here is giving it the amount of noise...100 isn't exactly reasonable...10 is kinda reasonable, 2-5 might be most reasonable! 

# STEP 7: plot the simulation of response given predictor
plot(response ~ predictor) #see how the noise is a bit ridiculous? 

#refine
sigma <- 4 #play around with it
response<- rnorm(n, mean=mu, sd=sigma)

plot(response~predictor)

#okay make it look better
plot(response~predictor, main="practice lions", xlab="Proportion of black noses", ylab="Age (years)")
#add the TRUE line
abline(b0, b1, col="lightblue", lwd = 2)
#add the model line (see if the model is a good fit or not)
model1 <- lm(response~predictor)
#add the model line to visualize if its a good fit
abline(model1, col="purple", lwd=2, lty = 2)

#STEP 8: Compare estimation to true or actual
coef(model1)
## (Intercept)   predictor 
##    1.036222   10.261343
# 1.036222 (estimated y-intercept)  10.261343 (estimated slope)
# actual b0 given was 0.8790062 (y-intercept)
# actual b1 given was 10.64712 (actual slope)

#slope is off by 
10.65-10.26
## [1] 0.39
#about 0.4 units...the slope in this model is more informative than the intercept because the model was EXTRAPOLATING the intercept...since none of your data were at the zero mark for x...

#if you wanted your data to look closer to the line, you could decrease sigma and increase sample size. 

##### SENTENCE: While the fitted model recovered the general magnitude of the true slope, the intercept estimate differed from the true value, reflecting sampling noise and the sensitivity of intercept estimates when extrapolating beyond the central range of the predictor.

#### TAKEAWAY: slopes are learned from patterns. Intercepts are learned from boundaries. Where does something begin? Boundaries are often weakly informed (think of humans trying to set boundaries. They can be hard to find when not explicit, and they depend on family of origin! lol.)
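
To see that concretely, a minimal sketch re-running the lion simulation with a larger sample and less noise (n2 = 400 and sd = 1 are arbitrary choices) -- the estimates should land much closer to the true parameters:

#more data + less noise = tighter estimates of slope and intercept
n2 <- 400
predictor2 <- runif(n2, min = 0, max = 1)
response2 <- rnorm(n2, mean = b0 + b1 * predictor2, sd = 1)
coef(lm(response2 ~ predictor2)) #should be close to b0 = 0.88 and b1 = 10.65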