PAIR QUIZ
Question 1
An article in the Journal of Sound and Vibration [“Measurement of Noise-Evoked Blood Pressure by Means of Averaging Method: Relation between Blood Pressure Rise and PSL” (1991, Vol. 151(3), pp. 383-394)] described a study investigating the relationship between noise exposure and hypertension. The following data are representative of those reported in the article.
| y | 1 | 0 | 1 | 2 | 5 | 1 | 4 | 6 | 2 | 3 | 5 | 4 | 6 | 8 | 4 | 5 | 7 | 9 | 7 | 6 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| x | 60 | 63 | 65 | 70 | 70 | 70 | 80 | 90 | 80 | 80 | 85 | 89 | 90 | 90 | 90 | 90 | 94 | 100 | 100 | 100 |
A. Draw a scatter diagram of y (blood pressure rise in millimeters of mercury) versus x (sound pressure level in decibels). Does a simple linear regression seem reasonable in this situation?
We draw the scatter plot in R; the plotting code appears together with the fitted line in part B.
The points show the blood pressure rise (y) increasing roughly linearly with the sound pressure level (x), with no obvious curvature, and the fitted least squares line follows the trend of the points. A simple linear regression model therefore seems reasonable in this situation.
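As a numerical complement to the visual check, the sample correlation between x and y can be computed; a value near 1 is consistent with a roughly linear, increasing trend. This is only a sketch, and the correlation check is an addition here, not part of the original solution.

```r
# Data from the table in Question 1
y <- c(1, 0, 1, 2, 5, 1, 4, 6, 2, 3, 5, 4, 6, 8, 4, 5, 7, 9, 7, 6)
x <- c(60, 63, 65, 70, 70, 70, 80, 90, 80, 80, 85, 89, 90, 90, 90, 90, 94, 100, 100, 100)

# Pearson correlation between sound pressure level and blood pressure rise
r <- cor(x, y)
print(round(r, 3))
```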
B. Fit the simple linear regression model using least squares. Find an estimate of \(\sigma^2\).
Least Squares Method:
\[ n = 20\] \[\sum_{i = 1}^{20}X_i = 1,656 \hspace{20pt} \sum_{i = 1}^{20}Y_i = 86\]
\[\overline{x} = 82.8 \hspace{50pt} \overline{y} = 4.3\]
\[\sum_{i = 1}^{20}X^2_i = 140,176 \hspace{20pt} \sum_{i = 1}^{20}Y^2_i = 494\]
\[\sum_{i = 1}^{20}X_iY_i = 7,654\]
We then calculate \(S_{xx}\) and \(S_{xy}\).
Solving for \(S_{xx}\)
\[ S_{xx} = \sum_{i = 1}^{n}(X_i - \overline{x})^2 = \sum_{i=1}^{n}X_i^2 - \frac{(\sum_{i=1}^{n}X_i)^2}{n}\] \[= \sum_{i=1}^{20}X_i^2 - \frac{(\sum_{i=1}^{20}X_i)^2}{20}\] \[= 140,176 - \frac{2,742,336}{20}\] \[= 140,176 - 137,116.8\] \[= 3,059.2\]
Solving for \(S_{xy}\)
\[ S_{xy} = \sum_{i=1}^{n}Y_i(X_i - \overline{x}) = \sum_{i=1}^{n}X_iY_i - \frac{(\sum_{i=1}^{n}X_i)(\sum_{i=1}^{n}Y_i)}{n}\] \[= \sum_{i=1}^{20}X_iY_i - \frac{(\sum_{i=1}^{20}X_i)(\sum_{i=1}^{20}Y_i)}{20}\] \[= 7,654 - \frac{(1,656)(86)}{20}\] \[= 7,654 - 7,120.8\] \[= 533.2\]
From these, we obtain the least squares estimates of the slope and the intercept:
\[ \hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} \]
\[ = \frac{533.2}{3,059.2} \]
\[ = 0.1742939331 \]
\[ \hat{\beta}_0 = \overline{y} - \hat{\beta}_1 \overline{x} \]
\[ = 4.3 - (0.1742939331)(82.8) \]
\[ = -10.13153766 \]
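The hand calculation above can be cross-checked in R. The following is a minimal sketch that recomputes \(S_{xx}\), \(S_{xy}\), and the two estimates from the raw data and compares them with `lm()`; the variable names (`Sxx`, `Sxy`, `b0`, `b1`, `fit`) are chosen here for illustration.

```r
# Data from the table in Question 1
y <- c(1, 0, 1, 2, 5, 1, 4, 6, 2, 3, 5, 4, 6, 8, 4, 5, 7, 9, 7, 6)
x <- c(60, 63, 65, 70, 70, 70, 80, 90, 80, 80, 85, 89, 90, 90, 90, 90, 94, 100, 100, 100)
n <- length(x)

# Corrected sums of squares and cross products (computational formulas)
Sxx <- sum(x^2) - sum(x)^2 / n
Sxy <- sum(x * y) - sum(x) * sum(y) / n

# Least squares estimates of slope and intercept
b1 <- Sxy / Sxx
b0 <- mean(y) - b1 * mean(x)

# The same estimates from R's built-in linear model fit
fit <- lm(y ~ x)

print(c(Sxx = Sxx, Sxy = Sxy))
print(c(intercept = b0, slope = b1))
print(coef(fit))
```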
Using the least squares method, the equation of the line that best fits the data is:
\[ \hat{y} = 0.1742939331x - 10.13153766 \]
y <- c(1, 0, 1, 2, 5, 1, 4, 6, 2, 3, 5, 4, 6, 8, 4, 5, 7, 9, 7, 6)
x <- c(60, 63, 65, 70, 70, 70, 80, 90, 80, 80, 85, 89, 90, 90, 90, 90, 94, 100, 100, 100)
# Scatter plot of blood pressure rise against sound pressure level
plot(x, y, pch = 16, cex = 1, col = "dark green", main = "Sound Pressure Level vs Blood Pressure Rise", xlab = "Sound Pressure Levels (in decibels)", ylab = "Blood Pressure Rise (in mmHg)")
# Overlay the fitted least squares line (intercept first, then slope)
abline(-10.13153766, 0.1742939331)
Calculating the Variance \(\sigma^2\)
model <- lm(y ~ x)
# deviance() of an lm fit returns the error (residual) sum of squares
deviance(model)
## [1] 31.26647
Since the error sum of squares is \(SS_E = 31.26647\), we substitute it into the unbiased estimator of \(\sigma^2\):
\[ \hat{\sigma}^2 = \frac{SS_E}{n-2} \]
\[= \frac{31.26647}{20-2} \] \[ = 1.737026111\]
Therefore, the estimate of \(\sigma^2\) is approximately 1.737.
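The same estimate can be obtained directly from the residuals of the fitted model. This is a small sketch; `sse` and `sigma2_hat` are names introduced here, and `summary(fit)$sigma` is the residual standard error that R reports, so squaring it should give the same value.

```r
# Data from the table in Question 1
y <- c(1, 0, 1, 2, 5, 1, 4, 6, 2, 3, 5, 4, 6, 8, 4, 5, 7, 9, 7, 6)
x <- c(60, 63, 65, 70, 70, 70, 80, 90, 80, 80, 85, 89, 90, 90, 90, 90, 94, 100, 100, 100)

fit <- lm(y ~ x)

# Error sum of squares: sum of squared residuals
sse <- sum(resid(fit)^2)

# Unbiased estimate of sigma^2: SSE divided by the residual degrees of freedom (n - 2)
sigma2_hat <- sse / df.residual(fit)

print(sse)
print(sigma2_hat)
print(summary(fit)$sigma^2)  # squared residual standard error, for comparison
```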
C. Find the predicted mean rise in blood pressure level associated with a sound pressure level of 85 decibels.
With the fitted regression line in hand, the predicted mean rise in blood pressure follows by substituting the sound pressure level into the equation. Here we substitute \(x = 85\) decibels:
\[ \hat{y} = 0.1742939331x - 10.13153766\]
\[ = 0.1742939331(85) - 10.13153766\]
\[ = 14.81498431 - 10.13153766\]
\[= 4.68344665\]
Therefore, the predicted mean rise in blood pressure is about 4.68 mmHg when the sound pressure level is 85 decibels.
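The substitution can also be done with `predict()`, which uses the unrounded coefficients of the fitted model. A minimal sketch, with `pred` as an illustrative name:

```r
# Data from the table in Question 1
y <- c(1, 0, 1, 2, 5, 1, 4, 6, 2, 3, 5, 4, 6, 8, 4, 5, 7, 9, 7, 6)
x <- c(60, 63, 65, 70, 70, 70, 80, 90, 80, 80, 85, 89, 90, 90, 90, 90, 94, 100, 100, 100)

fit <- lm(y ~ x)

# Predicted mean blood pressure rise at a sound pressure level of 85 dB;
# the column name in newdata must match the predictor name used in the formula
pred <- predict(fit, newdata = data.frame(x = 85))
print(unname(pred))
```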
Question 2
An article in Optical Engineering [“Operating Curve Extraction of a Correlator’s Filter”(2004, Vol.43, pp. 2775-2779)] reported on the use of an optical correlator to perform an experiment by varying brightness and contrast. The resulting modulation is characterized by the useful range of gray levels. The data follow:
| Brightness (%) | 54 | 61 | 65 | 100 | 100 | 100 | 50 | 57 | 54 |
|---|---|---|---|---|---|---|---|---|---|
| Contrast (%) | 56 | 80 | 70 | 50 | 65 | 80 | 25 | 35 | 26 |
| Useful Range (ng) | 96 | 50 | 50 | 112 | 96 | 80 | 155 | 144 | 255 |
A. Fit a multiple linear regression model to these data.
First, we state the multiple linear regression model:
\[ \displaystyle y = \beta _0 + \beta _1x_1 + \beta _2x_2 + \epsilon \]
where:
\(y =\) Useful Range
\(x_1 =\) Brightness
\(x_2 =\) Contrast
The data set given above, followed by its correlation matrix, is:
## # A tibble: 9 x 3
## Brightness Contrast UR
## <dbl> <dbl> <dbl>
## 1 54 56 96
## 2 61 80 50
## 3 65 70 50
## 4 100 50 112
## 5 100 65 96
## 6 100 80 80
## 7 50 25 155
## 8 57 35 144
## 9 54 26 255
## Brightness Contrast UR
## Brightness 1.00 0.50 -0.35
## Contrast 0.50 1.00 -0.86
## UR -0.35 -0.86 1.00
Figure 1. Multiple Linear Regression Model
##
## Call:
## lm(formula = UR ~ Brightness + Contrast, data = PAIR_QUIZ)
##
## Residuals:
## Min 1Q Median 3Q Max
## -32.334 -20.090 -8.451 8.413 69.047
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 238.5569 45.2285 5.274 0.00188 **
## Brightness 0.3339 0.6763 0.494 0.63904
## Contrast -2.7167 0.6887 -3.945 0.00759 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 36.35 on 6 degrees of freedom
## Multiple R-squared: 0.7557, Adjusted R-squared: 0.6742
## F-statistic: 9.278 on 2 and 6 DF, p-value: 0.01459
According to the result above, the least squares estimates are:
\(\hat{\beta}_0 = 238.5569\)
\(\hat{\beta}_1 = 0.3339\)
\(\hat{\beta}_2 = -2.7167\)
Therefore, the fitted multiple linear regression model is:
\[ \hat{y} = 238.5569 + 0.3339x_1 - 2.7167x_2 \]
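For reference, the fit can be reproduced from the table with a few lines of R. This is a sketch under the assumption that the data frame is rebuilt by hand; the name `optical` is hypothetical (the original output uses a data frame called `PAIR_QUIZ`). The normal equations \((X'X)\hat{\beta} = X'y\) are solved alongside `lm()` to show where the estimates come from.

```r
# Data from the table in Question 2 (data frame name is illustrative)
optical <- data.frame(
  Brightness = c(54, 61, 65, 100, 100, 100, 50, 57, 54),
  Contrast   = c(56, 80, 70, 50, 65, 80, 25, 35, 26),
  UR         = c(96, 50, 50, 112, 96, 80, 155, 144, 255)
)

# Fit the multiple linear regression model with lm()
fit2 <- lm(UR ~ Brightness + Contrast, data = optical)

# The same estimates from the normal equations (X'X) beta = X'y
X <- cbind(1, optical$Brightness, optical$Contrast)
beta_hat <- solve(crossprod(X), crossprod(X, optical$UR))

print(round(coef(fit2), 4))
print(round(drop(beta_hat), 4))
```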
B. Estimate \(\sigma^2\).
According to the table below,
## Df Sum Sq Mean Sq F value Pr(>F)
## Brightness 1 3960 3960 2.997 0.13412
## Contrast 1 20558 20558 15.559 0.00759 **
## Residuals 6 7928 1321
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
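The estimate of \(\sigma^2\) is the residual mean square, \(\hat{\sigma}^2 = MS_E = \frac{SS_E}{n-p} = \frac{7928}{6} \approx 1321.3\), where \(n = 9\) observations and \(p = 3\) estimated parameters.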
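The residual mean square can be computed without reading it off the ANOVA table. A minimal sketch, assuming the data frame is rebuilt by hand under the hypothetical name `optical`:

```r
# Data from the table in Question 2 (data frame name is illustrative)
optical <- data.frame(
  Brightness = c(54, 61, 65, 100, 100, 100, 50, 57, 54),
  Contrast   = c(56, 80, 70, 50, 65, 80, 25, 35, 26),
  UR         = c(96, 50, 50, 112, 96, 80, 155, 144, 255)
)
fit2 <- lm(UR ~ Brightness + Contrast, data = optical)

# Error sum of squares and its degrees of freedom (n - p = 9 - 3)
sse2 <- deviance(fit2)
df_e <- df.residual(fit2)

# Estimate of sigma^2 is the residual mean square
sigma2_hat2 <- sse2 / df_e
print(c(SSE = sse2, df = df_e, MSE = sigma2_hat2))
```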
C. Compute the standard errors of the regression coefficients.
Reading the Std. Error column of the model summary, the standard errors of the regression coefficients are:
Std. Error for the intercept (\(\hat{\beta}_0\)) \(= 45.2285\)
Std. Error for Brightness (\(\hat{\beta}_1\)) \(= 0.6763\)
Std. Error for Contrast (\(\hat{\beta}_2\)) \(= 0.6887\)
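These standard errors are the square roots of the diagonal entries of \(\hat{\sigma}^2 (X'X)^{-1}\). The sketch below computes them that way and compares with `vcov()`; the names `optical`, `X`, `se_manual`, and `se_vcov` are introduced here for illustration.

```r
# Data from the table in Question 2 (data frame name is illustrative)
optical <- data.frame(
  Brightness = c(54, 61, 65, 100, 100, 100, 50, 57, 54),
  Contrast   = c(56, 80, 70, 50, 65, 80, 25, 35, 26),
  UR         = c(96, 50, 50, 112, 96, 80, 155, 144, 255)
)
fit2 <- lm(UR ~ Brightness + Contrast, data = optical)

# Design matrix (with intercept column) and residual mean square
X <- model.matrix(fit2)
sigma2_hat2 <- deviance(fit2) / df.residual(fit2)

# Standard errors: sqrt of the diagonal of sigma^2 * (X'X)^{-1}
se_manual <- sqrt(diag(sigma2_hat2 * solve(crossprod(X))))

# The same quantities from the fitted model's covariance matrix
se_vcov <- sqrt(diag(vcov(fit2)))

print(round(se_manual, 4))
print(round(se_vcov, 4))
```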
D. Predict the useful range when the brightness=80 and contrast=75.
Using the fitted multiple regression model from part A,
\[ \hat{y} = 238.5569 + 0.3339x_1 - 2.7167x_2 \]
we substitute brightness \(x_1 = 80\) and contrast \(x_2 = 75\):
\[ \hat{y} = 238.5569 + 0.3339(80) - 2.7167(75) \]
\[ \hat{y} = 61.5164 \]
The predicted useful range when brightness = 80 and contrast = 75 is therefore about 61.52 ng.
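The same prediction can be sketched with `predict()`. Because R uses the unrounded coefficients, the result may differ from the hand calculation in the third decimal place. The data frame name `optical` is hypothetical.

```r
# Data from the table in Question 2 (data frame name is illustrative)
optical <- data.frame(
  Brightness = c(54, 61, 65, 100, 100, 100, 50, 57, 54),
  Contrast   = c(56, 80, 70, 50, 65, 80, 25, 35, 26),
  UR         = c(96, 50, 50, 112, 96, 80, 155, 144, 255)
)
fit2 <- lm(UR ~ Brightness + Contrast, data = optical)

# Predicted useful range at brightness = 80 and contrast = 75
new_point <- data.frame(Brightness = 80, Contrast = 75)
pred2 <- predict(fit2, newdata = new_point)
print(unname(pred2))
```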
E. Test for significance of regression using \(\alpha = 0.05\). What is the P-value for this?
The hypotheses are:
\(H_0 : \beta _1 = \beta _2 = 0\)
\(H_1 : \beta _j \neq 0\) for at least one \(j\)
The formula for the test statistic is:
\(F_0 = \frac {SS_R / k}{SS_E / (n-p)} = \frac {MS_R}{MS_E}\)
Here \(k = 2\) regressors, \(n = 9\) observations, and \(p = k + 1 = 3\) parameters. The null hypothesis will be rejected if \(f_0 > f_{0.05,2,6}\).
From the model summary, the test statistic is \(f_0 = 9.278\), and the critical value with 2 and 6 degrees of freedom is \(f_{0.05,2,6} = 5.1432528\).
Since \(9.278 > 5.1432528\), we reject the null hypothesis and conclude that at least one of brightness and contrast contributes significantly to the model.
The P-value is 0.01459, which is less than \(\alpha = 0.05\) and leads to the same conclusion.
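The critical value and the P-value can be obtained from the F distribution functions `qf()` and `pf()`. A minimal sketch, again assuming a hand-built data frame with the hypothetical name `optical`:

```r
# Data from the table in Question 2 (data frame name is illustrative)
optical <- data.frame(
  Brightness = c(54, 61, 65, 100, 100, 100, 50, 57, 54),
  Contrast   = c(56, 80, 70, 50, 65, 80, 25, 35, 26),
  UR         = c(96, 50, 50, 112, 96, 80, 155, 144, 255)
)
fit2 <- lm(UR ~ Brightness + Contrast, data = optical)

# summary()$fstatistic is a named vector: value, numdf, dendf
fstat <- summary(fit2)$fstatistic
f0    <- unname(fstat["value"])
numdf <- unname(fstat["numdf"])
dendf <- unname(fstat["dendf"])

# Critical value at alpha = 0.05 and upper-tail P-value
alpha   <- 0.05
f_crit  <- qf(1 - alpha, df1 = numdf, df2 = dendf)
p_value <- pf(f0, df1 = numdf, df2 = dendf, lower.tail = FALSE)

print(c(f0 = f0, f_crit = f_crit, p_value = p_value))
print(f0 > f_crit)  # TRUE means the null hypothesis is rejected
```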
F. Construct a \(t\)-test on each regression coefficient. What conclusions can you draw about the variables in this model? Use \(\alpha=0.05\).
Each coefficient \(\beta_j\), \(j = 0, 1, 2\), is tested separately. The hypotheses are:
\(H_0 : \beta _j = 0\)
\(H_1 : \beta _j \neq 0\)
From the t value column of the model summary, the test statistics are:
\(t_{0,\beta_0} = 5.274\)
\(t_{0,\beta_1} = 0.494\)
\(t_{0,\beta_2} = -3.945\)
\(\beta_1\) : BRIGHTNESS
As \(|t_0| = 0.494 < t _{0.025,6} = 2.447\), we fail to reject \(H_0 : \beta_1 = 0\).
Therefore, brightness does not contribute significantly to the model when contrast is already included.
\(\beta_2\) : CONTRAST
As \(|t_0| = 3.945 > t _{0.025,6} = 2.447\), we reject \(H_0 : \beta_2 = 0\).
Therefore, contrast contributes significantly to the model.
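The comparisons against \(t_{0.025,6}\) can be sketched in R by pulling the t statistics from the coefficient table and computing the critical value with `qt()`. The names `optical`, `tab`, `t0`, `t_crit`, and `reject` are illustrative.

```r
# Data from the table in Question 2 (data frame name is illustrative)
optical <- data.frame(
  Brightness = c(54, 61, 65, 100, 100, 100, 50, 57, 54),
  Contrast   = c(56, 80, 70, 50, 65, 80, 25, 35, 26),
  UR         = c(96, 50, 50, 112, 96, 80, 155, 144, 255)
)
fit2 <- lm(UR ~ Brightness + Contrast, data = optical)

# Coefficient table: Estimate, Std. Error, t value, Pr(>|t|)
tab <- coef(summary(fit2))
t0  <- tab[, "t value"]

# Two-sided critical value at alpha = 0.05 with n - p = 6 degrees of freedom
alpha  <- 0.05
t_crit <- qt(1 - alpha / 2, df = df.residual(fit2))

# Reject H0: beta_j = 0 when |t0| exceeds the critical value
reject <- abs(t0) > t_crit

print(round(t_crit, 3))
print(round(t0, 3))
print(reject)
```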