Abstract
Answer question part 1, plus any two other questions. The total score is 15 points.
A study was conducted with vegetarians to see whether the number of grams of protein each ate per day \((x_1)\) was related to their diastolic blood pressure \((x_2)\), systolic blood pressure \((x_3)\), and gender \((x_4)\). Below are the recorded data for a sample of ten vegetarian students:
\[\mathbf{X}=\begin{bmatrix}\mathbf{x_1}& \mathbf{x_2}& \mathbf{x_3}& \mathbf{x_4} \end{bmatrix}^{'}=\begin{bmatrix} 4.0 & 73 & 112 & 1 \\ 6.5 & 79 & 118 & 2\\ 5.0 & 83 & 123 & 1\\ 5.5 & 82 & 120 & 2 \\ 8.0 & 84 & 125 & 1\\ 10.0 & 92 & 140 & 2\\ 9.0 & 88 & 130 & 1\\ 8.2 & 86 & 126 & 2\\ 10.5 & 95 & 144 & 1 \\ 11.0 & 100 & 150 & 2 \end{bmatrix}\]
# Vegetarian Data
X <- matrix(c(4.0, 73, 112, 1,
6.5, 79, 118, 2,
5.0, 83, 123, 1,
5.5, 82, 120, 2,
8.0, 84, 125, 1,
10.0, 92, 140, 2,
9.0, 88, 130, 1,
8.2, 86, 126, 2,
10.5, 95, 144, 1,
11.0, 100, 150, 2), nrow = 10, ncol = 4, byrow = TRUE)
# Data Processing
P <- as.data.frame(X)
colnames(P) <- c("protein", "dbp", "sbp", "gender")
P$gender <- factor(P$gender, levels=c(1,2), labels=c("Female", "Male"))
attach(P)
# Use a for() loop to compute the column averages: One Way
numericvars <- NULL
for (m in names(P)) {
  # Skip non-numeric columns such as the gender factor
  if (is.numeric(P[[m]])) {
    numericvars[m] <- mean(P[[m]], na.rm = TRUE)
  }
}
Mn <- numericvars
x <- P[, 1:3]
cov(x)
numb_cols <- ncol(x)
# Create a vector to store the column averages
col_aver <- numeric(numb_cols)
# Calculate column averages using a for loop
for (i in 1:numb_cols) {
col_aver[i] <- mean(x[, i])
}
names(col_aver) <- c("protein", "dbp", "sbp")
# Calculate the deviation matrix (each column minus its column mean)
D <- matrix(0, nrow = nrow(x), ncol = ncol(x), dimnames = list(NULL, names(x)))
for (m in names(x)) {
  if (is.numeric(x[[m]])) {
    D[, m] <- x[[m]] - col_aver[m]
  }
}
# Compute the Variance-Covariance Matrix
S <- (t(D)%*%D)/(nrow(D)-1)
Note that we can calculate the sample variance-covariance matrix of our numeric variables, protein, diastolic blood pressure (dbp) and systolic blood pressure (sbp), using just the deviation matrix: the result S agrees with the output of cov(x).
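The identity used here, \(S = D^{'}D/(n-1)\), can also be checked without explicit loops. The following is a minimal sketch, assuming the same ten observations as in the data matrix; the names `x_num`, `D_vec` and `S_vec` are illustrative and do not appear in the original code.

```r
# Numeric columns copied from the data matrix above
x_num <- data.frame(
  protein = c(4.0, 6.5, 5.0, 5.5, 8.0, 10.0, 9.0, 8.2, 10.5, 11.0),
  dbp     = c(73, 79, 83, 82, 84, 92, 88, 86, 95, 100),
  sbp     = c(112, 118, 123, 120, 125, 140, 130, 126, 144, 150)
)

# Centre each column on its mean without rescaling; scale() returns a matrix
D_vec <- scale(as.matrix(x_num), center = TRUE, scale = FALSE)

# Sample variance-covariance matrix: S = D'D / (n - 1)
S_vec <- crossprod(D_vec) / (nrow(D_vec) - 1)

# Compare with the built-in estimator (TRUE when they agree)
all.equal(S_vec, cov(x_num), check.attributes = FALSE)
```

Here `crossprod(D_vec)` computes the same product as `t(D_vec) %*% D_vec`, so the sketch follows the loop-based calculation step for step.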
# Both Male and Female
# Design matrix holds dbp and sbp only (no column of ones, so no intercept)
A <- as.matrix(P[, -c(1, 4)])
a <- X[, 1, drop = FALSE]
Ba <- solve(t(A) %*% A) %*% t(A) %*% a
# lm(a ~ 0 + A[, 1] + A[, 2])  # no-intercept fit, same coefficients
# Male Only
M <- as.matrix(P[P$gender == "Male", -c(1, 4)])
m <- X[P$gender == "Male", 1, drop = FALSE]
Bm <- solve(t(M) %*% M) %*% t(M) %*% m
# lm(m ~ 0 + M[, 1] + M[, 2])
# Female Only
L <- as.matrix(P[P$gender == "Female", -c(1, 4)])
l <- X[P$gender == "Female", 1, drop = FALSE]
Bl <- solve(t(L) %*% L) %*% t(L) %*% l
# lm(l ~ 0 + L[, 1] + L[, 2])
# drop() turns each 2 x 1 matrix into a named vector so the column labels are kept
cbind(ALL = drop(Ba), Male = drop(Bm), Female = drop(Bl))
Female and Male Model:
\[\text{Protein} = -0.1716936~\text{dbp} + 0.1761890~\text{sbp}\]
Male Only Model:
\[\text{Protein} = -0.6782395~\text{dbp} + 0.5187178~\text{sbp}\]
Female Only Model:
\[\text{Protein} = 0.3290938~\text{dbp} - 0.1609671~\text{sbp}\]
The pooled model (females and males together) and the male-only model agree on the direction of the relationship between blood pressure and daily protein consumption: in both, the coefficient on diastolic blood pressure is negative and the coefficient on systolic blood pressure is positive. However, when the model is fitted to females only, both signs are reversed.
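Because the design matrices contain only dbp and sbp, with no column of ones, the coefficients above correspond to regressions through the origin. The following is a minimal sketch of how this could be checked against `lm()`, assuming the same data; the names `P2`, `A2`, `b_normal`, `fit0` and `fit1` are illustrative and not part of the original code.

```r
# Same ten observations; gender alternates Female (1), Male (2) as in the data matrix
P2 <- data.frame(
  protein = c(4.0, 6.5, 5.0, 5.5, 8.0, 10.0, 9.0, 8.2, 10.5, 11.0),
  dbp     = c(73, 79, 83, 82, 84, 92, 88, 86, 95, 100),
  sbp     = c(112, 118, 123, 120, 125, 140, 130, 126, 144, 150),
  gender  = factor(rep(c("Female", "Male"), times = 5))
)

# Normal equations with no intercept column: b = (A'A)^{-1} A'y
A2 <- as.matrix(P2[, c("dbp", "sbp")])
b_normal <- solve(crossprod(A2), crossprod(A2, P2$protein))

# The "0 +" term removes the intercept, so lm() solves the same problem
fit0 <- lm(protein ~ 0 + dbp + sbp, data = P2)
all.equal(unname(coef(fit0)), as.vector(b_normal), tolerance = 1e-6)

# With an intercept the model has three coefficients and the slopes generally differ
fit1 <- lm(protein ~ dbp + sbp, data = P2)
coef(fit1)
```

The same comparison could be repeated per gender by passing `subset = gender == "Male"` or `subset = gender == "Female"` to `lm()`. With only five observations in each group, the subgroup coefficients rest on very little data.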