Pair Quiz

Question 1

An article in the Journal of Sound and Vibration [“Measurement of Noise-Evoked Blood Pressure by Means of Averaging Method: Relation between Blood Pressure Rise and PSL” (1991, Vol. 151(3), pp. 383-394)] described a study investigating the relationship between noise exposure and hypertension. The following data are representative of those reported in the article.


y 1 0 1 2 5 1 4 6 2 3 5 4 6 8 4 5 7 9 7 6
x 60 63 65 70 70 70 80 90 80 80 85 89 90 90 90 90 94 100 100 100



A. Draw a scatter diagram of y (blood pressure rise in millimeters of mercury) versus x (sound pressure level in decibels). Does a simple linear regression model seem reasonable in this situation?


  y <- c(1, 0,  1,  2,  5,  1,  4,  6,  2,  3,  5,  4,  6,  8,  4,  5,  7,  9,  7,  6)
  x <- c(60,    63, 65, 70, 70, 70, 80, 90, 80, 80, 85, 89, 90, 90, 90, 90, 94, 100, 100, 100)

  plot(x, y, pch = 16, cex = 1, col = "blue", main = "Relation between Blood Pressure Rise and PSL", xlab = "Sound Pressure Level (dB)", ylab = "Blood Pressure Rise (mmHg)")
  lm(y ~ x)
## 
## Call:
## lm(formula = y ~ x)
## 
## Coefficients:
## (Intercept)            x  
##    -10.1315       0.1743
  abline(lm(y ~ x))
Figure A.1 Scatter Diagram of x and y



As seen in the scatter diagram in Figure A.1, a simple linear regression model seems reasonable: the points follow a roughly linear upward trend between x (sound pressure level) and y (blood pressure rise). The relationship is positive, meaning that blood pressure rise tends to increase as sound pressure level increases.
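As a quick numerical check on the visual impression, the sample correlation coefficient can be computed from the same vectors. This is a minimal sketch; using `cor()` here is an addition and was not part of the original solution.

```r
# Data from the problem statement
y <- c(1, 0, 1, 2, 5, 1, 4, 6, 2, 3, 5, 4, 6, 8, 4, 5, 7, 9, 7, 6)
x <- c(60, 63, 65, 70, 70, 70, 80, 90, 80, 80, 85, 89, 90, 90, 90, 90, 94, 100, 100, 100)

# Pearson correlation between sound pressure level and blood pressure rise
r <- cor(x, y)
r
```

The correlation comes out at roughly 0.87, which points to a fairly strong positive linear association and is consistent with the scatter diagram.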


B. Fit the simple linear regression model using least squares. Find an estimate of \(\sigma^2\).

For Simple Linear Regression Model:

Given the set of x and y values, we can fit the simple linear regression model using the least squares method, which chooses the coefficients that minimize the sum of squared residuals. The fitted model has the form:

\(\hat{y}=\hat{\beta_0}+\hat{\beta_1}x\)

wherein:
\(\hat{y}\) is the predicted rise in blood pressure measured in mmHg
\(x\) is the sound pressure level in dB
\(\hat{\beta_0}\) and \(\hat{\beta_1}\) are the least squares estimates of the regression coefficients



Note that this has the same form as the slope-intercept equation \(y=mx+b\), with \(\hat{\beta_1}\) as the slope and \(\hat{\beta_0}\) as the intercept.

To do so, we first compute the following sums and means with \(n=20\):

\[ \sum_{i = 1}^{20} x_i = 1,656 \]
\[ \bar{x}=82.8 \]
\[ \sum_{i = 1}^{20} y_i = 86 \]
\[ \bar{y}=4.3 \]
\[ \sum_{i = 1}^{20} x_i^2 = 140,176 \]
\[ \sum_{i = 1}^{20} y_i^2 = 494 \]
\[ \sum_{i = 1}^{20} x_i y_i = 7,654 \]

With these values in hand, we can solve for \(S_{xx}\) and \(S_{xy}\) as follows:

For \(S_{xx}\): \[ S_{xx} = \sum_{i = 1}^{n} x_i^2 - \frac{(\sum_{i = 1}^{n} x_i)^2}{n} \]
For \(S_{xy}\): \[ S_{xy} = \sum_{i = 1}^{n} x_i y_i - \frac{(\sum_{i = 1}^{n} x_i)(\sum_{i = 1}^{n} y_i)}{n} \]
We then plug in the values computed above:

For \(S_{xx}\):

\[ S_{xx} = 140,176 - \frac{(1,656)^2}{20} \] \[ S_{xx} = 3,059.2 \]

For \(S_{xy}\):

\[ S_{xy} = 7,654 - \frac{(1,656)(86)}{20} \] \[ S_{xy} = 533.2 \]



Now that both have been solved for, we can calculate the estimates of the regression coefficients as follows:

For \(\hat{\beta_1}\):
\[ \hat{\beta_1}= \frac{S_{xy}}{S_{xx}} \] \[ \hat{\beta_1}= \frac{533.2}{3,059.2} \] \[ \hat{\beta_1}= 0.1742939331 \]

For \(\hat{\beta_0}\):
\[ \hat{\beta_0}=\overline{y}-\hat{\beta_1} \overline{x} \] \[ \hat{\beta_0}=4.3 - (0.1742939331)(82.8) \] \[ \hat{\beta_0}=-10.13153766 \]

This leaves us with the fitted simple linear regression model (the line of best fit), which agrees with the coefficients reported by `lm(y ~ x)` in Part A:

\(\hat{y}=0.1742939331x-10.13153766\)
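The hand calculation above can be reproduced step by step in R. The following is a minimal sketch; the names `Sxx`, `Sxy`, `b0`, and `b1` are illustrative choices, not names used elsewhere in this report.

```r
# Data from the problem statement
y <- c(1, 0, 1, 2, 5, 1, 4, 6, 2, 3, 5, 4, 6, 8, 4, 5, 7, 9, 7, 6)
x <- c(60, 63, 65, 70, 70, 70, 80, 90, 80, 80, 85, 89, 90, 90, 90, 90, 94, 100, 100, 100)
n <- length(x)

# Corrected sum of squares of x and corrected sum of cross-products
Sxx <- sum(x^2) - sum(x)^2 / n
Sxy <- sum(x * y) - sum(x) * sum(y) / n

# Least squares estimates of slope and intercept
b1 <- Sxy / Sxx
b0 <- mean(y) - b1 * mean(x)

c(Sxx = Sxx, Sxy = Sxy, b0 = b0, b1 = b1)

# Cross-check against the coefficients returned by lm()
all.equal(unname(coef(lm(y ~ x))), c(b0, b1))
```

If the hand calculation is right, the final comparison should report that the two sets of coefficients agree.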



For estimating \(\sigma^2\):

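We will use the formula: \[ \hat{\sigma}^2 = \frac{SS_E}{n-2} \]

where \(SS_E\) is the error (residual) sum of squares and \(n-2\) is its degrees of freedom, since two coefficients are estimated. \(SS_E\) can be computed in R as follows: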

fit <- lm(y ~ x)   # fitted simple linear regression model

sum(resid(fit)^2)  # SS_E: sum of squared residuals
## [1] 31.26647



Since we already have the value of \(SS_E\), we can obtain \(\hat{\sigma}^2\) by plugging it into the formula above:

\[ \hat{\sigma}^2 = \frac{31.26647}{20-2} \]

\[ \hat{\sigma}^2 = 1.737026111 \]

With that, we can conclude that the estimated variance is \(\hat{\sigma}^2 \approx 1.737\).
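The variance estimate can also be cross-checked against the residual standard error that `summary()` reports. This is a minimal sketch; the object names `fit`, `SSE`, `sigma2_hat`, and `sigma2_check` are illustrative.

```r
# Data from the problem statement
y <- c(1, 0, 1, 2, 5, 1, 4, 6, 2, 3, 5, 4, 6, 8, 4, 5, 7, 9, 7, 6)
x <- c(60, 63, 65, 70, 70, 70, 80, 90, 80, 80, 85, 89, 90, 90, 90, 90, 94, 100, 100, 100)

fit <- lm(y ~ x)
n <- length(y)

SSE <- sum(resid(fit)^2)     # error sum of squares
sigma2_hat <- SSE / (n - 2)  # estimate of sigma^2 with n - 2 degrees of freedom

# summary() stores the residual standard error; squaring it gives the same estimate
sigma2_check <- summary(fit)$sigma^2

c(SSE = SSE, sigma2_hat = sigma2_hat, sigma2_check = sigma2_check)
```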


C. Find the predicted mean rise in blood pressure level associated with a sound pressure level of 85 decibels.

Given the fitted simple linear regression model, we can obtain other useful quantities, such as the predicted mean rise in blood pressure at a particular sound pressure level. Substituting \(x=85\) dB into the fitted model gives:

\(\hat{y}=0.1742939331x-10.13153766\)
\(\hat{y}=0.1742939331(85)-10.13153766\)
\(\hat{y}=4.683446654\)


The predicted value is 4.683446654, or approximately 4.68. We can therefore conclude that the estimated mean rise in blood pressure at a sound pressure level of 85 dB is about 4.68 mmHg.
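The same prediction can be obtained with `predict()` instead of substituting by hand. This is a minimal sketch; `fit` and `new_obs` are illustrative names, and the confidence interval is an optional extra that the question does not ask for.

```r
# Data from the problem statement
y <- c(1, 0, 1, 2, 5, 1, 4, 6, 2, 3, 5, 4, 6, 8, 4, 5, 7, 9, 7, 6)
x <- c(60, 63, 65, 70, 70, 70, 80, 90, 80, 80, 85, 89, 90, 90, 90, 90, 94, 100, 100, 100)

fit <- lm(y ~ x)

# The column name must match the predictor name used in the model formula
new_obs <- data.frame(x = 85)

# Point estimate of the mean blood pressure rise at 85 dB
y_hat <- predict(fit, newdata = new_obs)
y_hat

# Optional: 95% confidence interval for the mean response at 85 dB
predict(fit, newdata = new_obs, interval = "confidence", level = 0.95)
```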




Question 2

An article in Optical Engineering [“Operating Curve Extraction of a Correlator’s Filter” (2004, Vol. 43, pp. 2775-2779)] reported on the use of an optical correlator to perform an experiment by varying brightness and contrast. The resulting modulation is characterized by the useful range of gray levels. The data follow:


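| Brightness (%) | Contrast (%) | Useful Range |
|---|---|---|
| 54 | 56 | 96 |
| 61 | 80 | 50 |
| 65 | 70 | 50 |
| 100 | 50 | 112 |
| 100 | 65 | 96 |
| 100 | 80 | 80 |
| 50 | 25 | 155 |
| 57 | 35 | 144 |
| 54 | 26 | 255 |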


A. Fit a multiple linear regression model to these data.


## 
## Call:
## lm(formula = usefulrange ~ brightness + contrast)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -32.334 -20.090  -8.451   8.413  69.047 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)   
## (Intercept) 238.5569    45.2285   5.274  0.00188 **
## brightness    0.3339     0.6763   0.494  0.63904   
## contrast     -2.7167     0.6887  -3.945  0.00759 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 36.35 on 6 degrees of freedom
## Multiple R-squared:  0.7557, Adjusted R-squared:  0.6742 
## F-statistic: 9.278 on 2 and 6 DF,  p-value: 0.01459
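The output above is shown without the code that produced it. A minimal sketch that would reproduce it follows, assuming the data are entered as vectors named after the variables in the `Call` line and that the model object is named `data2`, as it is in the later parts.

```r
# Data from the problem statement
brightness  <- c(54, 61, 65, 100, 100, 100, 50, 57, 54)
contrast    <- c(56, 80, 70, 50, 65, 80, 25, 35, 26)
usefulrange <- c(96, 50, 50, 112, 96, 80, 155, 144, 255)

# Multiple linear regression of useful range on brightness and contrast
data2 <- lm(usefulrange ~ brightness + contrast)

summary(data2)  # full regression table
coef(data2)     # coefficient estimates only
```

Reading the estimates from the coefficient table, the fitted model is \(\hat{y} = 238.5569 + 0.3339x_1 - 2.7167x_2\), where \(x_1\) is brightness and \(x_2\) is contrast.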


B. Estimate \(\sigma^2\).


Using R Function:

summary(data2)$sigma^2
## [1] 1321.273

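Solving for the estimate of the variance by hand, with \(n=9\) observations and \(p=3\) estimated parameters (the intercept and two slopes), and \(SS_E \approx 7927.64\):

\[\hat{\sigma}^2 = \frac{SS_E}{n-p}\]

\[ \hat{\sigma}^2 = \frac{7927.64}{9-3} \]

\[\hat{\sigma}^2 \approx 1321.27 \]

This agrees with the value returned by R.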
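The error sum of squares and its degrees of freedom can be pulled from the fitted model rather than typed in, which avoids rounding \(SS_E\) before dividing. This is a minimal sketch, assuming the model is rebuilt from the data as `data2`; `SSE`, `df_error`, and `sigma2_hat` are illustrative names.

```r
# Data from the problem statement
brightness  <- c(54, 61, 65, 100, 100, 100, 50, 57, 54)
contrast    <- c(56, 80, 70, 50, 65, 80, 25, 35, 26)
usefulrange <- c(96, 50, 50, 112, 96, 80, 155, 144, 255)
data2 <- lm(usefulrange ~ brightness + contrast)

SSE <- deviance(data2)          # residual sum of squares of the lm fit
df_error <- df.residual(data2)  # n - p = 9 - 3 = 6
sigma2_hat <- SSE / df_error    # estimate of sigma^2

c(SSE = SSE, df_error = df_error, sigma2_hat = sigma2_hat)
```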


C. Compute the standard errors of the regression coefficients.


Using R function:

| Coefficient | Std. Error |
|---|---|
| (Intercept) | 45.2284742 |
| brightness | 0.6762945 |
| contrast | 0.6887346 |
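The code behind these standard errors is not shown. One way to obtain them is sketched below, assuming the model object `data2` is rebuilt from the data; the manual calculation uses \(se(\hat{\beta}_j) = \sqrt{\hat{\sigma}^2 C_{jj}}\), where \(C = (X'X)^{-1}\). The names `se`, `X`, `sigma2_hat`, and `se_manual` are illustrative.

```r
# Data from the problem statement
brightness  <- c(54, 61, 65, 100, 100, 100, 50, 57, 54)
contrast    <- c(56, 80, 70, 50, 65, 80, 25, 35, 26)
usefulrange <- c(96, 50, 50, 112, 96, 80, 155, 144, 255)
data2 <- lm(usefulrange ~ brightness + contrast)

# Standard errors are the square roots of the diagonal of the
# estimated covariance matrix of the coefficients
se <- sqrt(diag(vcov(data2)))
se

# The same values computed from the design matrix
X <- model.matrix(data2)
sigma2_hat <- sum(resid(data2)^2) / df.residual(data2)
se_manual <- sqrt(diag(sigma2_hat * solve(crossprod(X))))
se_manual
```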


D. Predict the useful range when brightness = 80 and contrast = 75.


Using R function:

newdata <- data.frame(brightness = 80, contrast = 75)
predict(data2, newdata)
##        1 
## 61.51477

The predicted Useful Range is \(\approx 61.5\) when brightness is \(80\)% and contrast is \(75\)%.
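The prediction is simply the fitted equation evaluated at the new point, so it can be checked by multiplying the coefficients by the vector \((1, 80, 75)\). This is a minimal sketch, assuming `data2` is rebuilt from the data; `b`, `x0`, and `y0` are illustrative names.

```r
# Data from the problem statement
brightness  <- c(54, 61, 65, 100, 100, 100, 50, 57, 54)
contrast    <- c(56, 80, 70, 50, 65, 80, 25, 35, 26)
usefulrange <- c(96, 50, 50, 112, 96, 80, 155, 144, 255)
data2 <- lm(usefulrange ~ brightness + contrast)

b  <- coef(data2)   # (Intercept), brightness, contrast
x0 <- c(1, 80, 75)  # 1 for the intercept, then brightness and contrast

# Fitted equation evaluated at the new point
y0 <- sum(b * x0)
y0
```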


E. Test for significance of regression using 𝛼=0.05. What is the P-value for this test?


## 
## Call:
## lm(formula = usefulrange ~ brightness + contrast)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -32.334 -20.090  -8.451   8.413  69.047 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)   
## (Intercept) 238.5569    45.2285   5.274  0.00188 **
## brightness    0.3339     0.6763   0.494  0.63904   
## contrast     -2.7167     0.6887  -3.945  0.00759 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 36.35 on 6 degrees of freedom
## Multiple R-squared:  0.7557, Adjusted R-squared:  0.6742 
## F-statistic: 9.278 on 2 and 6 DF,  p-value: 0.01459

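The test for significance of regression checks whether at least one of the regressors is linearly related to the response:

\[H_0 : \beta_1 = \beta_2 = 0\] \[H_1: \beta_j \neq 0 \text{ for at least one } j\]

From the summary output, the test statistic is \(F_0 = 9.278\) on 2 and 6 degrees of freedom, and the P-value is 0.01459.

\[P\text{-value} = 0.01459 < \alpha = 0.05\]

\(\therefore\) the null hypothesis is rejected: at least one of brightness and contrast contributes significantly to the model.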
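The F statistic and P-value in the last line of the summary can be rebuilt from the sums of squares, where \(F_0 = \frac{SS_R/k}{SS_E/(n-k-1)}\) with \(k=2\) regressors. This is a minimal sketch, assuming `data2` is rebuilt from the data; `SST`, `SSR`, `F0`, and `p_value` are illustrative names.

```r
# Data from the problem statement
brightness  <- c(54, 61, 65, 100, 100, 100, 50, 57, 54)
contrast    <- c(56, 80, 70, 50, 65, 80, 25, 35, 26)
usefulrange <- c(96, 50, 50, 112, 96, 80, 155, 144, 255)
data2 <- lm(usefulrange ~ brightness + contrast)

n <- length(usefulrange)
k <- 2  # number of regressors

SSE <- sum(resid(data2)^2)                       # error sum of squares
SST <- sum((usefulrange - mean(usefulrange))^2)  # total sum of squares
SSR <- SST - SSE                                 # regression sum of squares

# F statistic and its upper-tail probability
F0 <- (SSR / k) / (SSE / (n - k - 1))
p_value <- pf(F0, df1 = k, df2 = n - k - 1, lower.tail = FALSE)

c(F0 = F0, p_value = p_value)
```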


F. Construct a t-test on each regression coefficient. What conclusions can you draw about the variables in this model? Use 𝛼=0.05.


The t-test on a regression coefficient tests whether that coefficient is zero given that the other regressor is in the model. For each coefficient \(\beta_j\), the hypotheses are:

\[H_0 : \beta_j = 0\] \[H_1 : \beta_j \neq 0\]

The test statistic is \(t_0 = \hat{\beta_j} / se(\hat{\beta_j})\) with \(n-p = 6\) degrees of freedom. The estimates, standard errors, t values, and P-values are those reported in the coefficient table of the summary output in Part E.

Brightness

\[t_0 = \frac{0.3339}{0.6763} = 0.494\]

\[P\text{-value} = 0.63904 > (\alpha=0.05)\] Fail to reject \(H_0\).

\(\therefore\) there is no evidence of a linear relationship between Brightness and Useful Range when Contrast is already in the model.

Contrast

\[t_0 = \frac{-2.7167}{0.6887} = -3.945\]

\[P\text{-value} = 0.00759 < 0.05\] Reject \(H_0\).

\(\therefore\) there is a linear relationship between Contrast and Useful Range: Contrast contributes significantly to the model even with Brightness included.

The intercept is also significantly different from zero (\(t_0 = 5.274\), P-value \(= 0.00188\)).
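A t-test on each regression coefficient divides the estimate by its standard error and compares the result with a t distribution on \(n-p\) degrees of freedom. The following is a minimal sketch of that calculation, assuming `data2` is rebuilt from the data; `est`, `se`, `t0`, `p_values`, and `t_crit` are illustrative names.

```r
# Data from the problem statement
brightness  <- c(54, 61, 65, 100, 100, 100, 50, 57, 54)
contrast    <- c(56, 80, 70, 50, 65, 80, 25, 35, 26)
usefulrange <- c(96, 50, 50, 112, 96, 80, 155, 144, 255)
data2 <- lm(usefulrange ~ brightness + contrast)

alpha <- 0.05
est <- coef(data2)              # coefficient estimates
se  <- sqrt(diag(vcov(data2)))  # their standard errors
df_error <- df.residual(data2)  # n - p = 6

# Test statistic for H0: beta_j = 0 and its two-sided P-value
t0 <- est / se
p_values <- 2 * pt(abs(t0), df = df_error, lower.tail = FALSE)

# Two-sided critical value at the chosen alpha
t_crit <- qt(1 - alpha / 2, df = df_error)

data.frame(estimate = est, se = se, t0 = t0,
           p_value = p_values, reject_H0 = abs(t0) > t_crit)
```

These are the same t values and P-values that `summary(data2)` prints in its coefficient table.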


References

D. C. Montgomery and G. C. Runger, Applied statistics and probability for engineers. New York: Wiley, 2003.