1. Determine the correlation coefficients between 𝑥 and 𝑦, 𝑥 and 𝑧, and 𝑦 and 𝑧 using Pearson’s 𝑟. Interpret your results.

data = read.csv("Exercise 1 data.csv", header = TRUE)

Correlation coefficient between x and y:

cor(data$Milk.Intake.x., data$Weight.y., method = "pearson")
## [1] 0.6136956

Since the correlation coefficient between x and y is 0.61, there is a strong positive linear association between milk intake (in liters) and weight (in kg) of a person.

Correlation coefficient between x and z:

cor(data$Milk.Intake.x., data$Age.z., method = "pearson")
## [1] 0.767911

Since the correlation coefficient between x and z is 0.77, there is a strong positive linear association between milk intake (in liters) and age (in years) of a person.

Correlation coefficient between y and z:

cor(data$Weight.y., data$Age.z., method = "pearson")
## [1] 0.509803

Since the correlation coefficient between y and z is 0.51, there is a moderate positive linear association between weight (in kg) and age (in years) of a person.
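The three pairwise calls above can also be obtained in one step by passing a data frame to `cor()`, which returns the full correlation matrix. The sketch below uses a small simulated data frame, `toy`, with hypothetical columns `x`, `y`, and `z`; it is not the exercise data set and only illustrates the call.

```r
# Hypothetical data for illustration only; not the exercise data set.
set.seed(1)
toy <- data.frame(x = rnorm(24))
toy$y <- 0.6 * toy$x + rnorm(24)
toy$z <- 0.7 * toy$x + rnorm(24)

# Pearson correlation matrix for every pair of columns.
r_mat <- cor(toy, method = "pearson")
round(r_mat, 3)

# Each off-diagonal entry equals the corresponding pairwise call.
all.equal(r_mat["x", "y"], cor(toy$x, toy$y))
```

The matrix is symmetric with ones on the diagonal, so only the three entries above the diagonal need to be interpreted.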

2. Are the associations for each pair in (1) significant at 𝛼 = 5%? Explain.

For the pair x and y:

Test the hypothesis at the 0.05 level of significance.

\(H_0: \rho = 0\)

\(H_a: \rho \neq 0\)

cor.test(data$Milk.Intake.x.,data$Weight.y.)
## 
##  Pearson's product-moment correlation
## 
## data:  data$Milk.Intake.x. and data$Weight.y.
## t = 3.6458, df = 22, p-value = 0.001425
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.2794903 0.8152634
## sample estimates:
##       cor 
## 0.6136956

Since the p-value = 0.001 < 0.05, we reject \(H_0\). Therefore, there is a significant positive correlation between milk intake and weight of a person.

For the pair x and z:

Test the hypothesis at the 0.05 level of significance.

\(H_0: \rho = 0\)

\(H_a: \rho \neq 0\)

cor.test(data$Milk.Intake.x.,data$Age.z., method = "pearson")
## 
##  Pearson's product-moment correlation
## 
## data:  data$Milk.Intake.x. and data$Age.z.
## t = 5.623, df = 22, p-value = 1.183e-05
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.5281076 0.8942831
## sample estimates:
##      cor 
## 0.767911

Since the p-value = 1.183e-05 is far below 0.05, we reject \(H_0\). Therefore, there is a significant positive correlation between milk intake and age of a person.

For the pair y and z:

Test the hypothesis at the 0.05 level of significance.

\(H_0: \rho = 0\)

\(H_a: \rho \neq 0\)

cor.test(data$Weight.y.,data$Age.z., method = "pearson")
## 
##  Pearson's product-moment correlation
## 
## data:  data$Weight.y. and data$Age.z.
## t = 2.7795, df = 22, p-value = 0.01093
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.1339544 0.7574317
## sample estimates:
##      cor 
## 0.509803

Since the p-value = 0.01 < 0.05, we reject \(H_0\). Therefore, there is a significant positive correlation between weight and age of a person.
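As a cross-check, the t statistics reported by `cor.test` can be recovered from the correlation coefficients alone, since \(t = r\sqrt{n-2}/\sqrt{1-r^2}\) with \(n-2\) degrees of freedom. The sketch below plugs in the three coefficients printed above and n = 24 (implied by df = 22); the variable names are illustrative only.

```r
# Correlation coefficients copied from the cor.test output above.
r <- c(xy = 0.6136956, xz = 0.767911, yz = 0.509803)
n <- 24  # sample size, since cor.test reports df = n - 2 = 22

# t statistic for H0: rho = 0, and the two-sided p-value.
t_stat  <- r * sqrt(n - 2) / sqrt(1 - r^2)
p_value <- 2 * pt(-abs(t_stat), df = n - 2)

data.frame(r = r, t = t_stat, p = p_value)
```

The resulting t values and p-values should agree, up to rounding, with the three `cor.test` outputs.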

3. A researcher wants to find out the correlation between milk intake and body weight while controlling for age. Perform a partial correlation analysis and test the hypothesis at 𝛼 = 0.05.

Test the hypothesis at the 0.05 level of significance.

\(H_0:\) There is no correlation between milk intake and body weight while controlling for the age of a person.

\(H_a:\) There is a correlation between milk intake and body weight while controlling for the age of a person.

# install.packages("ppcor")
library(ppcor)
## Loading required package: MASS
pcor.test(data$Milk.Intake.x., data$Weight.y., data$Age.z., method = "pearson")
##    estimate   p.value statistic  n gp  Method
## 1 0.4032415 0.0563976  2.019339 24  1 pearson

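Since the p-value = 0.0564 > 0.05, we fail to reject \(H_0\). At the 0.05 level of significance, there is insufficient evidence of a correlation between milk intake and body weight once the age of a person is controlled for, although the estimated partial correlation (0.40) is positive and the p-value is only slightly above the threshold.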
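The partial correlation can also be reproduced by hand from the three pairwise coefficients in (1), using \(r_{xy.z} = (r_{xy} - r_{xz} r_{yz}) / \sqrt{(1 - r_{xz}^2)(1 - r_{yz}^2)}\). The following is a minimal sketch with those printed values; the t statistic is assumed to have \(n - 3\) degrees of freedom because one variable is controlled for.

```r
# Pairwise Pearson coefficients copied from the output in (1).
r_xy <- 0.6136956
r_xz <- 0.767911
r_yz <- 0.509803
n    <- 24

# First-order partial correlation of x and y given z.
r_xy_z <- (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))

# t statistic and two-sided p-value on n - 3 degrees of freedom.
t_stat  <- r_xy_z * sqrt((n - 3) / (1 - r_xy_z^2))
p_value <- 2 * pt(-abs(t_stat), df = n - 3)

c(estimate = r_xy_z, statistic = t_stat, p.value = p_value)
```

Up to rounding, these values should match the `estimate`, `statistic`, and `p.value` columns returned by `pcor.test`.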

4. Furthermore, determine if weight is correlated with milk intake and age. Also compute for \(R^2\) and \(\tilde R^2\). Interpret your results.

\(H_0:\) The weight is not correlated with milk intake and age of a person.

\(H_a:\) The weight is correlated with milk intake and age of a person.

mulcor = lm(data$Weight.y ~ data$Milk.Intake.x + data$Age.z, data = data)
summary(mulcor)
## 
## Call:
## lm(formula = data$Weight.y ~ data$Milk.Intake.x + data$Age.z, 
##     data = data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -25.645  -6.748   1.635   8.329  16.252 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)  
## (Intercept)         19.2210     8.7153   2.205   0.0387 *
## data$Milk.Intake.x   1.4097     0.6981   2.019   0.0564 .
## data$Age.z           0.1282     0.3661   0.350   0.7297  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 10.63 on 21 degrees of freedom
## Multiple R-squared:  0.3802, Adjusted R-squared:  0.3212 
## F-statistic: 6.442 on 2 and 21 DF,  p-value: 0.006582

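Since the p-value of the overall F-test is 0.0066 < 0.05, we reject \(H_0\). Thus, weight is significantly correlated with the combination of milk intake and age of a person, although neither predictor is individually significant at the 5% level once the other is in the model (p = 0.0564 for milk intake and p = 0.7297 for age).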

r.squared <- summary(mulcor)$r.squared 
multiple.R <- sqrt(r.squared)
multiple.R
## [1] 0.6166378
r.squared
## [1] 0.3802422

R-squared of the model turns out to be 0.3802. This means that 38.02% of the variation in weight can be explained by the milk intake and the age of a person.
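With two predictors, the same R-squared can be obtained from the pairwise correlations in (1) through \(R^2_{y.xz} = (r_{xy}^2 + r_{yz}^2 - 2 r_{xy} r_{yz} r_{xz}) / (1 - r_{xz}^2)\). The sketch below is only a cross-check using the printed coefficients; the variable names are illustrative.

```r
# Pairwise Pearson coefficients copied from the output in (1).
r_xy <- 0.6136956  # milk intake and weight
r_xz <- 0.767911   # milk intake and age
r_yz <- 0.509803   # weight and age

# Squared multiple correlation of y on x and z.
R2_check <- (r_xy^2 + r_yz^2 - 2 * r_xy * r_yz * r_xz) / (1 - r_xz^2)

c(R.squared = R2_check, multiple.R = sqrt(R2_check))
```

Both values should agree, up to rounding, with `r.squared` and `multiple.R` from the fitted model.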

adjusted.r <- summary(mulcor)$adj.r.squared
multiple.adjust.r <- sqrt(adjusted.r)
multiple.adjust.r
## [1] 0.5667607

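The adjusted R-squared of the model is 0.3212; the value 0.5667 printed above is its square root. After penalizing for the number of predictors, about 32.12% of the variation in weight is explained by milk intake and age. Because this is noticeably lower than the unadjusted R-squared of 0.3802, the second predictor adds little explanatory value to the model.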
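The adjusted value follows from the unadjusted one through \(\tilde R^2 = 1 - (1 - R^2)(n - 1)/(n - k - 1)\). As a sketch, plugging in the printed R-squared with n = 24 observations and k = 2 predictors:

```r
R2 <- 0.3802422  # unadjusted R-squared printed above
n  <- 24         # observations
k  <- 2          # predictors: milk intake and age

# Adjusted R-squared penalizes R2 for the number of predictors.
adj_R2 <- 1 - (1 - R2) * (n - 1) / (n - k - 1)

c(adjusted.R2 = adj_R2, sqrt.adjusted.R2 = sqrt(adj_R2))
```

The first value should match the `Adjusted R-squared` line of `summary(mulcor)`, and the second the square root computed above.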

5. Test the hypothesis that

\(\rho^2_{y.xz} \neq 0\) using 5% level of significance.

\(H_0: \rho^2_{y.xz} = 0\)

\(H_a: \rho^2_{y.xz} \neq 0\)

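n <- nrow(data)                    # number of observations
k <- length(coef(mulcor)) - 1      # number of predictors (milk intake, age)
R2 <- r.squared                    # unadjusted R-squared, not the adjusted one
F.stat <- ((n - k - 1) * R2) / (k * (1 - R2))
round(F.stat, 3)
## [1] 6.442
qf(0.05, k, n - k - 1, lower.tail = FALSE)
## [1] 3.4668
signif(pf(F.stat, k, n - k - 1, lower.tail = FALSE), 4)
## [1] 0.006582

Since the p-value = 0.0066 < 0.05 (equivalently, F = 6.442 exceeds the critical value 3.4668), we reject \(H_0\). The statistic and p-value agree with the F-statistic reported by summary(mulcor). Thus, at the 0.05 level of significance, there is sufficient evidence to conclude that there is a significant association between weight and the linear combination of milk intake and age of a person. In other words, a significant share of the variation in weight can be explained by a linear combination of milk intake and age.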
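For reference, the test statistic for \(H_0: \rho^2_{y.xz} = 0\) is built from the unadjusted \(R^2\) with \(k\) predictors and \(n\) observations. Written out with the values printed in (4), rounded to four decimals, it reads as follows; this is a worked substitution, not additional output.

```latex
F = \frac{R^2 / k}{(1 - R^2) / (n - k - 1)}
  = \frac{0.3802 / 2}{(1 - 0.3802) / (24 - 2 - 1)}
  \approx 6.44,
\qquad F \sim F_{k,\; n-k-1} = F_{2,\,21} \text{ under } H_0 .
```

This matches the line `F-statistic: 6.442 on 2 and 21 DF` in the `summary(mulcor)` output.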