The data gives the systolic blood pressure (SBP), body size (QUET), age (AGE), and smoking history (SMK = 0 if nonsmoker, SMK = 1 if a current or previous smoker) for a hypothetical sample of 32 white males over 40 years old from the town of Angina.
# Read the homework data; the first column holds the row names.
data = read.csv("week2-HW-data.csv", header = TRUE, sep = ",", row.names = 1)
attach(data)
anova(lm( SBP ~ SMK))
## Analysis of Variance Table
##
## Response: SBP
##           Df Sum Sq Mean Sq F value Pr(>F)
## SMK        1  393.1   393.1  1.9548 0.1723
## Residuals 30 6032.9   201.1
anova(lm( SBP ~ QUET))
## Analysis of Variance Table
##
## Response: SBP
##           Df Sum Sq Mean Sq F value    Pr(>F)
## QUET       1 3537.9  3537.9  36.751 1.172e-06 ***
## Residuals 30 2888.0    96.3
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(lm( QUET ~ AGE))
## Analysis of Variance Table
##
## Response: QUET
##           Df Sum Sq Mean Sq F value    Pr(>F)
## AGE        1 4.9360  4.9360  54.367 3.253e-08 ***
## Residuals 30 2.7237  0.0908
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(lm( SBP ~ AGE))
## Analysis of Variance Table
##
## Response: SBP
##           Df Sum Sq Mean Sq F value    Pr(>F)
## AGE        1 3861.6  3861.6  45.177 1.894e-07 ***
## Residuals 30 2564.3    85.5
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
For each table, we are testing whether the independent variable contributes significantly to the model, that is, whether the straight-line model fits better than the naive model (the one that predicts the mean of Y for every observation).
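As a sketch of what each table tests, the comparison can be written in symbols. The notation below (\(E\) for the error term, \(MS\) for a mean square, \(n\) for the sample size) is assumed here for illustration and is not taken from the original output; \(\beta_1\) is the slope referred to in the conclusions.

\[
H_0: \beta_1 = 0 \quad (Y = \beta_0 + E)
\qquad \text{versus} \qquad
H_1: \beta_1 \neq 0 \quad (Y = \beta_0 + \beta_1 X + E)
\]

\[
F = \frac{MS_{\text{regression}}}{MS_{\text{residual}}} \sim F_{1,\, n-2} \ \text{under } H_0, \qquad n = 32, \quad n - 2 = 30 .
\]

The 1 and 30 degrees of freedom are the values in the Df column of every table.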
If we take a significance level of 5%, for:
SBP (Y) on SMK (X) we do not have enough evidence to reject the null hypothesis (\(\beta_1 = 0\)), because the p-value is 0.1723: if the null hypothesis were true, an F value at least this large would occur 17.23% of the time, which is higher than the 5% fixed in advance.
SBP (Y) on QUET (X) we can reject the null hypothesis (\(\beta_1 = 0\)), because the p-value is 1.172e-06, far below the 5% fixed in advance.
QUET (Y) on AGE (X) we can reject the null hypothesis (\(\beta_1 = 0\)), because the p-value is 3.253e-08, far below the 5% fixed in advance.
SBP (Y) on AGE (X) we can reject the null hypothesis (\(\beta_1 = 0\)), because the p-value is 1.894e-07, far below the 5% fixed in advance.
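These decisions can be checked from the printed tables alone. The sketch below is illustrative and not part of the original analysis: it recomputes each F value as the ratio of the two mean squares and takes the p-value from the upper tail of the F distribution with 1 and 30 degrees of freedom. The data frame `tables` and its column names are assumptions introduced here, and because the mean squares are copied from the rounded output, the results agree with the tables only up to rounding.

```r
# Mean squares copied from the four ANOVA tables (predictor row and residual row).
tables <- data.frame(
  model  = c("SBP ~ SMK", "SBP ~ QUET", "QUET ~ AGE", "SBP ~ AGE"),
  ms_reg = c(393.1, 3537.9, 4.9360, 3861.6),  # Mean Sq of the predictor (1 df)
  ms_res = c(201.1, 96.3, 0.0908, 85.5),      # Mean Sq of the residuals (30 df)
  stringsAsFactors = FALSE
)

# F statistic: ratio of the regression mean square to the residual mean square.
tables$f_value <- tables$ms_reg / tables$ms_res

# p-value: probability of an F at least this large when beta_1 = 0.
tables$p_value <- pf(tables$f_value, df1 = 1, df2 = 30, lower.tail = FALSE)

# Decision at the 5% significance level.
tables$reject_h0 <- tables$p_value < 0.05

print(tables)
```

Only the SMK model fails to reach the 5% threshold; the other three p-values are several orders of magnitude below it, not exactly zero.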
If we consider a significance level of 5%, we can conclude:
- some of the variation in SBP can be explained by a linear regression on QUET or on AGE;
- there is a linear association between QUET and AGE (an association between two predictors, which is not the same thing as an interaction);
- the linear association between SBP and SMK is not significant here, but based on homework 2 we can conclude that smoking can influence SBP.
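To put a number on how much variation each straight line explains, the coefficient of determination can be computed from the sums of squares in the tables. This is a minimal sketch added for illustration, not part of the original analysis; the data frame `ss` and its column names are assumptions, and the sums of squares are copied from the rounded output above.

```r
# Sums of squares copied from the four ANOVA tables.
ss <- data.frame(
  model  = c("SBP ~ SMK", "SBP ~ QUET", "SBP ~ AGE", "QUET ~ AGE"),
  ss_reg = c(393.1, 3537.9, 3861.6, 4.9360),   # Sum Sq of the predictor
  ss_res = c(6032.9, 2888.0, 2564.3, 2.7237),  # Sum Sq of the residuals
  stringsAsFactors = FALSE
)

# R squared: share of the total variation in the response explained by the line.
ss$r_squared <- ss$ss_reg / (ss$ss_reg + ss$ss_res)

print(ss)
```

With these figures, SMK accounts for about 6% of the variation in SBP, QUET for about 55% and AGE for about 60%, while AGE accounts for about 64% of the variation in QUET.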
detach(data)