Question 1
An article in the Journal of Sound and Vibration [“Measurement of Noise-Evoked Blood Pressure by Means of Averaging Method: Relation between Blood Pressure Rise and PSL” (1991, Vol. 151(3), pp. 383-394)] described a study investigating the relationship between noise exposure and hypertension. The following data are representative of those reported in the article.
A. Draw a scatter diagram of y (blood pressure rise in millimeters of mercury) versus x (sound pressure level in decibels). Does a simple linear regression model seem reasonable in this situation?
For a scatter diagram to be drawn, the given raw data must first be stored in variables as follows:

SoundPressure = c(60, 63, 65, 70, 70, 70, 80, 90, 80, 80, 85, 89, 90, 90, 90, 90, 94, 100, 100, 100)
BloodPressureRise = c(1, 0, 1, 2, 5, 1, 4, 6, 2, 3, 5, 4, 6, 8, 4, 5, 7, 9, 7, 6)

A scatter diagram can then be made by using:
plot(SoundPressure, BloodPressureRise,
main = "Scatter Diagram of Sound Pressure Level vs. Blood Pressure Rise",
xlab = "Sound Pressure (dB)",
ylab = "Blood Pressure Rise (mmHg)", las = 1)Based on the scatter diagram above, it seems that Blood Pressure Rise also increases as Sound Pressure increases. The scatter plot follow a straight incline pattern going upward. It seems reasonable to create a simple linear regression model for this diagram.
B. Fit the simple linear regression model using least squares. Find an estimate of \(\sigma^2\).
A simple linear regression model of the form \(Y=\hat\beta_0+\hat\beta_1x\) can be fitted to the data. In this problem, \(Y\) represents the blood pressure rise and \(x\) the sound pressure level.
A simple linear regression model can be fitted by using:

simreg = lm(BloodPressureRise ~ SoundPressure)
summary(simreg)

##
## Call:
## lm(formula = BloodPressureRise ~ SoundPressure)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.8120 -0.9040 -0.1333 0.5023 2.9310
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -10.13154 1.99490 -5.079 7.83e-05 ***
## SoundPressure 0.17429 0.02383 7.314 8.57e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.318 on 18 degrees of freedom
## Multiple R-squared: 0.7483, Adjusted R-squared: 0.7343
## F-statistic: 53.5 on 1 and 18 DF, p-value: 8.567e-07
From the output above, the least-squares estimates of the slope and intercept are \(\hat\beta_1 = 0.1742939\) and \(\hat\beta_0 = -10.1315377\), respectively. Therefore, the fitted model is \(\hat Y = -10.1315377 + 0.1742939x\).
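As a cross-check (a sketch using only base R, not required by the problem), the slope and intercept can be reproduced from the closed-form least-squares formulas \(\hat\beta_1 = S_{xy}/S_{xx}\) and \(\hat\beta_0 = \bar y - \hat\beta_1\bar x\):

# Closed-form least-squares estimates as a cross-check of lm()
Sxy = sum((SoundPressure - mean(SoundPressure)) * (BloodPressureRise - mean(BloodPressureRise)))
Sxx = sum((SoundPressure - mean(SoundPressure))^2)
b1 = Sxy / Sxx                                           # slope, should match 0.1742939
b0 = mean(BloodPressureRise) - b1 * mean(SoundPressure)  # intercept, should match -10.1315377
c(b0, b1)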
To estimate \(\sigma^2\), the equation below will be used:

\[ \hat\sigma^2 = \frac{\sum_{i=1}^{n} e_i^2}{n-2} \]
In R, this can be computed directly from the fitted model:

var1 = deviance(simreg) / simreg$df.residual; var1

## [1] 1.737026

Thus, the estimate of \(\sigma^2\) is 1.737026.
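Equivalently (a quick sketch of the same calculation applied by hand, not required by the problem), the residuals of the fitted model can be squared and summed directly; deviance(simreg) is simply \(\sum_{i=1}^{n} e_i^2\):

# Direct application of the formula: sum of squared residuals divided by n - 2
n = length(BloodPressureRise)
sum(residuals(simreg)^2) / (n - 2)   # should also give 1.737026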
C. Find the predicted mean rise in blood pressure level associated with a sound pressure level of 85 decibels.
To find the predicted mean rise in blood pressure, substitute \(x = 85\) into the fitted equation \(\hat Y = -10.1315377 + 0.1742939x\). Alternatively, predict() can be used:

prediction = data.frame(SoundPressure = 85)
predictbp = predict(simreg, prediction)

\(\therefore\) The predicted mean rise in blood pressure level is 4.6834467 mmHg.
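As a quick hand check (a sketch only, reusing the objects already defined above), the same value is obtained by evaluating the fitted equation at \(x = 85\) with the stored coefficients:

# Hand check: evaluate the fitted equation at x = 85 using the stored coefficients
coef(simreg)[1] + coef(simreg)[2] * 85   # matches predict(), about 4.683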
Question 2
An article in Optical Engineering [“Operating Curve Extraction of a Correlator’s Filter” (2004, Vol. 43, pp. 2775-2779)] reported on the use of an optical correlator to perform an experiment by varying brightness and contrast. The resulting modulation is characterized by the useful range of gray levels. The data follow:
A. Fit a multiple linear regression model to these data.
Input the raw data into a data frame as follows:

optics.data <- data.frame(
Brightness = c(54, 61, 65, 100, 100, 100, 50, 57, 54),
Contrast = c(56, 80, 70, 50, 65, 80, 25, 35, 26),
UsefulRange = c(96, 50, 50, 112, 96, 80, 155, 144, 255)
)

Note that there are two regressor variables, so a multiple linear regression model of the form \(Y=\hat\beta_0 + \hat\beta_1x_1 + \hat\beta_2 x_2\) can be fitted to the data. In this context, \(Y\) represents the useful range, \(x_1\) the brightness, and \(x_2\) the contrast.
Fit the multiple linear regression model by using:

multreg <- lm(UsefulRange ~ Brightness + Contrast, data = optics.data)
summary(multreg)

##
## Call:
## lm(formula = UsefulRange ~ Brightness + Contrast, data = optics.data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -32.334 -20.090 -8.451 8.413 69.047
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 238.5569 45.2285 5.274 0.00188 **
## Brightness 0.3339 0.6763 0.494 0.63904
## Contrast -2.7167 0.6887 -3.945 0.00759 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 36.35 on 6 degrees of freedom
## Multiple R-squared: 0.7557, Adjusted R-squared: 0.6742
## F-statistic: 9.278 on 2 and 6 DF, p-value: 0.01459
From the results above, the fitted model is \(\hat Y = 238.5569113 + 0.333912x_1 - 2.7167348x_2\).
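As an illustrative cross-check (a sketch of the matrix form of least squares, not required by the problem), the same coefficients can be obtained from \(\hat{\boldsymbol\beta} = (X'X)^{-1}X'y\):

# Matrix-form least squares as a cross-check of lm()
X = cbind(1, optics.data$Brightness, optics.data$Contrast)  # model matrix with an intercept column
y = optics.data$UsefulRange
beta.hat = solve(t(X) %*% X) %*% t(X) %*% y
beta.hat   # should match 238.5569, 0.3339, -2.7167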
B. Estimate \(\sigma^2\).
To estimate \(\sigma^2\), we will use the formula below, where \(p\) is the number of model parameters (here \(p = 3\)):

\[ \hat\sigma^2 = \frac{\sum_{i=1}^{n} e_i^2}{n-p} \]
This can be computed with:

var2 = deviance(multreg) / multreg$df.residual; var2

## [1] 1321.273
Thus, the estimated value of \(\sigma^2\) is 1321.2726765
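As a consistency check (a sketch assuming R >= 3.3, which provides the sigma() accessor for lm fits), this estimate is simply the square of the residual standard error reported by summary():

# The residual standard error (36.35) squared equals the sigma^2 estimate
sigma(multreg)^2   # should return 1321.273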
C. Compute the standard errors of the regression coefficients.
The standard errors of the regression coefficients can be read directly from the Std. Error column of the summary(multreg) output above. Specifically, the standard errors for each coefficient are:
- \(se(\hat{\beta_0})\) = 45.2284742
- \(se(\hat{\beta_1})\) = 0.6762945
- \(se(\hat{\beta_2})\) = 0.6887346
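These values are the square roots of the diagonal entries of the estimated covariance matrix of the coefficients, \(\hat\sigma^2 (X'X)^{-1}\). As a sketch of that check (using only the fitted object), they can be reproduced with:

# Standard errors from the coefficient covariance matrix
sqrt(diag(vcov(multreg)))   # should match 45.228, 0.676, 0.689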
D. Predict the useful range when brightness = 80 and contrast = 75.
To predict the useful range when brightness = 80 and contrast = 75, the following code chunk will be utilized:
newdata = data.frame(Brightness=80, Contrast=75);
predict(multreg, newdata)

## 1
## 61.51477
Thus, using the fitted multiple linear regression model, the predicted useful range is 61.5147687.
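As a hand check (a sketch only), the prediction is simply the fitted equation evaluated at the new point; a 95% confidence interval for the mean response can also be requested from predict():

# Hand check of the prediction, plus an optional confidence interval for the mean response
coef(multreg)[1] + coef(multreg)[2] * 80 + coef(multreg)[3] * 75   # about 61.51
predict(multreg, newdata, interval = "confidence", level = 0.95)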
E. Test for significance of regression using \(\alpha=0.05\). What is the P-value for this test?
To test the significance of regression with \(\alpha=0.05\), the hypotheses are:
- \(H_0: \beta_1 = \beta_2 = 0\)
- \(H_1: \beta_j \neq 0\) for at least one \(j\)
The test statistic is \(f_0 =\) 9.2783134. Since \(f_0 > f_{0.05,2,6} =\) 5.1432528, the null hypothesis is rejected at \(\alpha=0.05\), with a P-value of 0.0145864.
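These quantities can be extracted directly in R (a quick check, not required by the problem):

# Overall F test: statistic, critical value, and P-value
f0 = summary(multreg)$fstatistic                # value, numdf, dendf
qf(0.95, df1 = f0[2], df2 = f0[3])              # critical value f_{0.05,2,6} = 5.1433
pf(f0[1], f0[2], f0[3], lower.tail = FALSE)     # P-value = 0.0146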
In conclusion, at least one of the regressor variables, brightness or contrast (or both), contributes significantly to the fitted model.

F. Construct a t-test on each regression coefficient. What conclusions can you draw about the variables in this model? Use \(\alpha=0.05\).
The three pairs of hypotheses constructed, one for each regression coefficient \(\beta_j\) with \(j = 0, 1, 2\), are:

- \(H_0: \beta_j = 0\)
- \(H_1: \beta_j \neq 0\)
The test statistics can be read from the t value column of the summary(multreg) output above. Specifically, the test statistic for each coefficient is:
- \(t_{0,\beta_0} =\) 5.2744851
- \(t_{0,\beta_1} =\) 0.4937376
- \(t_{0,\beta_2} =\) -3.9445306
With \(n - p = 6\) degrees of freedom, the rejection region for each test statistic is \(|t_0| > t_{0.025,6} = 2.4469\) at \(\alpha=0.05\). Since \(|t_{0,\beta_0}| = 5.27\) and \(|t_{0,\beta_2}| = 3.94\) exceed 2.4469 while \(|t_{0,\beta_1}| = 0.49\) does not (its P-value is 0.639), contrast contributes significantly to the model, but brightness does not, given that contrast is already included.
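The critical value and the individual tests can be verified with (a sketch only):

# Critical value for alpha = 0.05 with n - p = 6 degrees of freedom, and the coefficient table
qt(0.975, df = 6)               # 2.4469
summary(multreg)$coefficients   # t values and P-values for each coefficient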
References
- D. C. Montgomery and G. C. Runger, Applied Statistics and Probability for Engineers. New York: Wiley, 2003.