sleep_data = read.table(url("https://gksmyth.github.io/ozdasl/general/sleep.txt"), header = TRUE)
sleep_data = sleep_data[!sleep_data$Species %in% c("Africanelephant", "Asianelephant", "Man"), ]
Answer: Independent (x) - Body weight;
Dependent (y) - Brain weight
Answer: It appears that there is a positive relationship, meaning that as x increases, y also tends to increase.
BodyWeight=sleep_data$BodyWt
BrainWeight=sleep_data$BrainWt
plot(BodyWeight,BrainWeight,main="Scatterplot between Body Weight and Brain Weight")
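Answer: Both covariance and correlation are positive, which indicates a positive trend. The covariance depends on the units of measurement, so only the correlation speaks to strength: a value of about 0.89 suggests that the linear relationship is strong.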
cov(BodyWeight,BrainWeight)
## [1] 17833.93
cor(BodyWeight,BrainWeight)
## [1] 0.8884084
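The link between the two numbers above can be checked directly: Pearson's correlation is the covariance divided by both standard deviations. The following is a minimal sketch under one assumption: it uses the built-in mtcars data (wt and mpg) as a stand-in so that it runs without downloading the sleep data.

```r
# Stand-in data (assumption): built-in mtcars, not the sleep data
x = mtcars$wt
y = mtcars$mpg

# Pearson correlation = covariance / (sd of x * sd of y)
r_by_hand = cov(x, y) / (sd(x) * sd(y))
r_builtin = cor(x, y)

print(c(by_hand = r_by_hand, builtin = r_builtin))
```

Because the rescaling removes the units, the correlation always lies between -1 and 1, whereas the covariance can be arbitrarily large.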
Answer:
cor(BodyWeight,BrainWeight,method="spearman")
## [1] 0.950011
Answer:
cor(BodyWeight,BrainWeight,method="spearman")
## [1] 0.950011
cor(BrainWeight,BodyWeight,method="spearman")
## [1] 0.950011
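The identical values above reflect two properties of Spearman's coefficient: it is Pearson's correlation computed on ranks, and it does not depend on the order of the arguments. A small sketch, again assuming the built-in mtcars data as a stand-in for the sleep data:

```r
# Stand-in data (assumption): built-in mtcars, not the sleep data
x = mtcars$wt
y = mtcars$mpg

rho_builtin = cor(x, y, method = "spearman")
rho_ranks   = cor(rank(x), rank(y))            # Pearson correlation of the ranks
rho_swapped = cor(y, x, method = "spearman")   # arguments in the other order

print(c(builtin = rho_builtin, ranks = rho_ranks, swapped = rho_swapped))
```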
Answer: My regression equation is: \(\hat{y}\) = 36.572 + 1.228x
model = lm(BrainWeight ~ BodyWeight, data = sleep_data )
model
##
## Call:
## lm(formula = BrainWeight ~ BodyWeight, data = sleep_data)
##
## Coefficients:
## (Intercept) BodyWeight
## 36.572 1.228
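The coefficients reported by lm() can be reproduced from the closed-form least-squares formulas: the slope is cov(x, y) / var(x) and the intercept is mean(y) minus the slope times mean(x). A minimal sketch, assuming the built-in mtcars data as a stand-in so the example is self-contained:

```r
# Stand-in data (assumption): built-in mtcars, not the sleep data
x = mtcars$wt
y = mtcars$mpg

# Closed-form least-squares estimates for simple linear regression
b1 = cov(x, y) / var(x)        # slope
b0 = mean(y) - b1 * mean(x)    # intercept

fit = lm(y ~ x)
print(rbind(by_hand = c(b0, b1), lm = coef(fit)))
```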
Answer: \(\widehat{\beta}_0\) = 36.572 is the estimated value of Brain Weight (y) when Body Weight (x) is equal to 0.
Answer: \(\widehat{\beta}_1\) = 1.228 is the estimated change in Brain Weight (𝑦) when Body Weight (x) changes by 1 unit.
Answer: From the output table, \(R^2\) = 0.7893. About 79% of the variability in Brain Weight (y) is explained by the model.
summary(model)
##
## Call:
## lm(formula = BrainWeight ~ BodyWeight, data = sleep_data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -184.81 -34.52 -27.16 0.67 339.35
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 36.57231 10.95089 3.34 0.00148 **
## BodyWeight 1.22847 0.08408 14.61 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 77.15 on 57 degrees of freedom
## Multiple R-squared: 0.7893, Adjusted R-squared: 0.7856
## F-statistic: 213.5 on 1 and 57 DF, p-value: < 2.2e-16
Answer:
cor(BodyWeight,BrainWeight)^2
## [1] 0.7892696
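In simple linear regression the squared correlation equals the Multiple R-squared, which is defined as 1 - SSE/SST. The sketch below checks all three quantities against each other; it assumes the built-in mtcars data as a stand-in for the sleep data.

```r
# Stand-in data (assumption): built-in mtcars, not the sleep data
fit = lm(mpg ~ wt, data = mtcars)

sse = sum(residuals(fit)^2)                    # unexplained variation
sst = sum((mtcars$mpg - mean(mtcars$mpg))^2)   # total variation

r2_by_hand = 1 - sse / sst
r2_cor     = cor(mtcars$wt, mtcars$mpg)^2
r2_summary = summary(fit)$r.squared

print(c(by_hand = r2_by_hand, cor_squared = r2_cor, summary = r2_summary))
```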
Answer:
plot(BodyWeight, BrainWeight)
abline(model, col=("purple"))
Answer:
predict(model, newdata = data.frame(BodyWeight = 50) )
## 1
## 97.99588
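The prediction is simply the fitted equation evaluated at x = 50. As a quick check, the sketch below plugs the coefficient estimates printed by summary(model) into the equation by hand; the variable names are illustrative.

```r
# Coefficient estimates copied from the summary(model) output
b0 = 36.57231
b1 = 1.22847
x_new = 50

# Fitted value: intercept + slope * x
y_hat = b0 + b1 * x_new
print(y_hat)
```

The result agrees with predict() to about four significant digits; the small remaining difference comes from the rounding of the printed coefficients.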
Answer: \(\hat{\sigma}\) = 77.15; \(\hat{\sigma}^2\) = 5952.123
77.15^2
## [1] 5952.123
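The residual standard error is the square root of SSE divided by the residual degrees of freedom (n - 2 in simple linear regression, which is the 57 reported for this model). A minimal sketch of that calculation, assuming the built-in mtcars data as a stand-in for the sleep data:

```r
# Stand-in data (assumption): built-in mtcars, not the sleep data
fit = lm(mpg ~ wt, data = mtcars)

sse = sum(residuals(fit)^2)
df  = df.residual(fit)          # n - 2 for one predictor plus an intercept

sigma_hat = sqrt(sse / df)
print(c(by_hand = sigma_hat, summary = summary(fit)$sigma))
```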
Answer: In the residuals vs. fitted plot, most of the points stay close to the trend line, and the trend line appears to be decreasing.
plot(model, which = 1)
Answer: \(H_0\): \(\beta_1\) = 0. \(H_A\): \(\beta_1\) ≠ 0.
p-value < 2e-16 < 𝛼 (= 0.05). So, we have enough evidence to reject \(H_0\).
summary(model)
##
## Call:
## lm(formula = BrainWeight ~ BodyWeight, data = sleep_data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -184.81 -34.52 -27.16 0.67 339.35
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 36.57231 10.95089 3.34 0.00148 **
## BodyWeight 1.22847 0.08408 14.61 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 77.15 on 57 degrees of freedom
## Multiple R-squared: 0.7893, Adjusted R-squared: 0.7856
## F-statistic: 213.5 on 1 and 57 DF, p-value: < 2.2e-16
Answer: From the table, the t-value is 14.61. Dividing the estimate by its standard error by hand gives the same number.
summary(model)
##
## Call:
## lm(formula = BrainWeight ~ BodyWeight, data = sleep_data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -184.81 -34.52 -27.16 0.67 339.35
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 36.57231 10.95089 3.34 0.00148 **
## BodyWeight 1.22847 0.08408 14.61 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 77.15 on 57 degrees of freedom
## Multiple R-squared: 0.7893, Adjusted R-squared: 0.7856
## F-statistic: 213.5 on 1 and 57 DF, p-value: < 2.2e-16
#Estimate/Std. Error
1.22847/0.08408
## [1] 14.61073
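The same hand calculation extends to the p-value and the F-statistic. The sketch below uses the estimate, standard error, and degrees of freedom printed by summary(model); the two-sided p-value comes from the t distribution, and in simple linear regression the F-statistic is the square of the slope's t-value.

```r
# Values copied from the summary(model) output
estimate  = 1.22847
std_error = 0.08408
df        = 57

t_value = estimate / std_error
p_value = 2 * pt(-abs(t_value), df)   # two-sided p-value
f_value = t_value^2                   # F = t^2 with a single predictor

print(c(t = t_value, p = p_value, F = f_value))
```

The squared t-value matches the reported F-statistic of 213.5 up to rounding.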
Answer: So, the 95% CI for 𝛽1 is: (1.06011, 1.396833). The value 1.4 lies just above the upper limit, so it is outside the interval; hence, we reject 𝐻0 (𝛽1 = 1.4) at the 5% level.
confint(model, level = 0.95)
## 2.5 % 97.5 %
## (Intercept) 14.64354 58.501081
## BodyWeight 1.06011 1.396833
Answer: So, the 99% CI for 𝛽1 is: (1.004416, 1.452526). The value 1.4 is inside this interval; hence, we cannot reject 𝐻0 at the 1% level.
confint(model, level = 0.99)
## 0.5 % 99.5 %
## (Intercept) 7.389616 65.755003
## BodyWeight 1.004416 1.452526
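Both intervals can be rebuilt from the summary table as estimate ± t critical value × standard error, which also makes it easy to check whether 1.4 falls inside each one. A minimal sketch using the printed values; the helper names ci and inside are illustrative, not part of any package.

```r
# Values copied from the summary(model) output
estimate  = 1.22847
std_error = 0.08408
df        = 57

# Confidence interval for the slope at a given level (hypothetical helper)
ci = function(level) {
  t_crit = qt(1 - (1 - level) / 2, df)
  c(lower = estimate - t_crit * std_error,
    upper = estimate + t_crit * std_error)
}

# Is a hypothesized value inside the interval? (hypothetical helper)
inside = function(value, interval) {
  value >= interval[["lower"]] && value <= interval[["upper"]]
}

ci95 = ci(0.95)
ci99 = ci(0.99)
in95 = inside(1.4, ci95)
in99 = inside(1.4, ci99)

print(rbind(ci95, ci99))
print(c(in_95 = in95, in_99 = in99))
```

The wider 99% interval contains 1.4 while the 95% interval does not, so the conclusion depends on the significance level.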
Answer: The hypothesis test has the form: \(H_0\): 𝛽1 = 0. \(H_A\): 𝛽1 ≠ 0. p-value < 2.2e-16 < 𝛼 (= 0.05). Reject \(H_0\).
summary(model)
##
## Call:
## lm(formula = BrainWeight ~ BodyWeight, data = sleep_data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -184.81 -34.52 -27.16 0.67 339.35
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 36.57231 10.95089 3.34 0.00148 **
## BodyWeight 1.22847 0.08408 14.61 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 77.15 on 57 degrees of freedom
## Multiple R-squared: 0.7893, Adjusted R-squared: 0.7856
## F-statistic: 213.5 on 1 and 57 DF, p-value: < 2.2e-16
Answer: There are 3 influential points.
# nrow(model) is NULL for an lm object, so n is taken from the hat values instead
n = length(hatvalues(model))
plot(hatvalues(model), type = 'h')
sum(hatvalues(model) > 2/n)
Answer: 1 (one observation has a Cook's distance greater than 1)
plot(cooks.distance(model), type = 'h')
sum(cooks.distance(model)>1)
## [1] 1
Answer: Leverage scores range between 0 and 1, i.e., \(0 \le h_i \le 1\). A high-leverage point is usually defined as one where \(h_i > 2/n\).
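The leverage scores are the diagonal entries of the hat matrix H = X (X'X)^(-1) X'. The sketch below computes them by hand and compares them with hatvalues(). It assumes the built-in mtcars data as a stand-in for the sleep data. It also counts points under the 2/n cutoff quoted above and under the 2p/n variant (p = number of coefficients), which is a common alternative and is an addition here, not something taken from the answer.

```r
# Stand-in data (assumption): built-in mtcars, not the sleep data
fit = lm(mpg ~ wt, data = mtcars)

X = model.matrix(fit)                      # design matrix with intercept column
H = X %*% solve(t(X) %*% X) %*% t(X)       # hat matrix
h = diag(H)                                # leverage scores

n = nrow(X)
p = ncol(X)                                # number of coefficients (2 here)

print(range(h))                            # all values lie in [0, 1]
print(sum(h))                              # leverages sum to p
print(c(cutoff_2_over_n  = sum(h > 2 / n),
        cutoff_2p_over_n = sum(h > 2 * p / n)))
```

Since the leverages sum to p, their average is p/n, which equals 2/n in simple linear regression; a cutoff of 2/n therefore flags every point with above-average leverage.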
Answer:
data("mtcars")
fit=lm(formula = mpg ~ wt, data = mtcars)
plot(fit, which=2, col=c("red"))
Answer: In our model, the Q-Q plot shows that the points align closely with the reference line, although a few points at the upper end are slightly offset. This deviation is likely minor, so the normality assumption for the residuals appears reasonable.