the problem studies the effect of four factors (x1 = superficial fluid velocity of the gas (cs/s), x2= kinematic velocity, x3= mesh opening (cm), and x4= dimensionless number relating the superficial velocity of the gas to the fluid) on the pressure drop in a screen plate bubble column (y).
dat<-read.csv("data-table-B9.CSV",row.names = NULL)
dat
## x1 x2 x3 x4 y
## 1 2.14 10.00 0.34 1.000 28.9
## 2 4.14 10.00 0.34 1.000 31.0
## 3 8.15 10.00 0.34 1.000 26.4
## 4 2.14 10.00 0.34 0.246 27.2
## 5 4.14 10.00 0.34 0.379 26.1
## 6 8.15 10.00 0.34 0.474 23.2
## 7 2.14 10.00 0.34 0.141 19.7
## 8 4.14 10.00 0.34 0.234 22.1
## 9 8.15 10.00 0.34 0.311 22.8
## 10 2.14 10.00 0.34 0.076 29.2
## 11 4.14 10.00 0.34 0.132 23.6
## 12 8.15 10.00 0.34 0.184 23.6
## 13 2.14 2.63 0.34 0.679 24.2
## 14 4.14 2.63 0.34 0.804 22.1
## 15 8.15 2.63 0.34 0.890 20.9
## 16 2.14 2.63 0.34 0.514 17.6
## 17 4.14 2.63 0.34 0.672 15.7
## 18 8.15 2.63 0.34 0.801 15.8
## 19 2.14 2.63 0.34 0.346 14.0
## 20 4.14 2.63 0.34 0.506 17.1
## 21 8.15 2.63 0.34 0.669 18.3
## 22 2.14 2.63 0.34 1.000 33.8
## 23 4.14 2.63 0.34 1.000 31.7
## 24 8.15 2.63 0.34 1.000 28.1
## 25 5.60 1.25 0.34 0.848 18.1
## 26 5.60 1.25 0.34 0.737 16.5
## 27 5.60 1.25 0.34 0.651 15.4
## 28 5.60 1.25 0.34 0.554 15.0
## 29 4.30 2.63 0.34 0.748 19.1
## 30 4.30 2.63 0.34 0.682 16.2
## 31 4.30 2.63 0.34 0.524 16.3
## 32 4.30 2.63 0.34 0.472 15.8
## 33 4.30 2.63 0.34 0.398 15.4
## 34 5.60 10.10 0.25 0.789 19.2
## 35 5.60 10.10 0.25 0.677 8.4
## 36 5.60 10.10 0.25 0.590 15.0
## 37 5.60 10.10 0.25 0.523 12.0
## 38 5.60 10.10 0.34 0.789 21.9
## 39 5.60 10.10 0.34 0.677 21.3
## 40 5.60 10.10 0.34 0.590 21.6
## 41 5.60 10.10 0.34 0.523 19.8
## 42 4.30 10.10 0.34 0.741 21.6
## 43 4.30 10.10 0.34 0.617 17.3
## 44 4.30 10.10 0.34 0.524 20.0
## 45 4.30 10.10 0.34 0.457 18.6
## 46 2.40 10.10 0.34 0.615 22.1
## 47 2.40 10.10 0.34 0.473 14.7
## 48 2.40 10.10 0.34 0.381 15.8
## 49 2.40 10.10 0.34 0.320 13.2
## 50 5.60 10.10 0.55 0.789 30.8
## 51 5.60 10.10 0.55 0.677 27.5
## 52 5.60 10.10 0.55 0.590 25.2
## 53 5.60 10.10 0.55 0.523 22.8
## 54 2.14 112.00 0.34 0.680 41.7
## 55 4.14 112.00 0.34 0.803 33.7
## 56 8.15 112.00 0.34 0.889 29.7
## 57 2.14 112.00 0.34 0.514 41.8
## 58 4.14 112.00 0.34 0.672 37.1
## 59 8.15 112.00 0.34 0.801 40.1
## 60 2.14 112.00 0.34 0.306 42.7
## 61 4.14 112.00 0.34 0.506 48.6
## 62 8.15 112.00 0.34 0.668 42.4
#fitting the model
model <- lm(y~x1+x2+x3+x4,data=dat)
full_model<-lm(y~(x1+x2+x3+x4)^2, data=dat)
full_model
##
## Call:
## lm(formula = y ~ (x1 + x2 + x3 + x4)^2, data = dat)
##
## Coefficients:
## (Intercept) x1 x2 x3 x4 x1:x2
## 15.88376 0.18695 0.37921 -11.99940 -8.86442 0.01155
## x1:x3 x1:x4 x2:x3 x2:x4 x3:x4
## NA -1.11525 NA -0.38547 72.85976
summary_full<-summary(full_model)
summary_full
##
## Call:
## lm(formula = y ~ (x1 + x2 + x3 + x4)^2, data = dat)
##
## Residuals:
## Min 1Q Median 3Q Max
## -9.4804 -3.0766 -0.6635 2.9625 12.2221
##
## Coefficients: (2 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 15.88376 23.17863 0.685 0.49616
## x1 0.18696 0.78447 0.238 0.81255
## x2 0.37921 0.06332 5.989 1.89e-07 ***
## x3 -11.99940 67.31148 -0.178 0.85919
## x4 -8.86442 35.62553 -0.249 0.80446
## x1:x2 0.01155 0.00869 1.329 0.18955
## x1:x3 NA NA NA NA
## x1:x4 -1.11525 1.14847 -0.971 0.33592
## x2:x3 NA NA NA NA
## x2:x4 -0.38547 0.11962 -3.222 0.00218 **
## x3:x4 72.85976 103.15353 0.706 0.48308
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.683 on 53 degrees of freedom
## Multiple R-squared: 0.7496, Adjusted R-squared: 0.7118
## F-statistic: 19.83 on 8 and 53 DF, p-value: 1.947e-13
the dropped interactions are Answer: x1:x3 and x2:x3
R^2 value is:0.7496
F-statistics for the overall regression is:19.83
conclusion about the full regression model is:Reject H0 (at least one regression coefficient is significantly different from zero)
#fitting partail f-test
reduced_model<-lm(y~x1+x2+x3+x4, data=dat)
reduced_model
##
## Call:
## lm(formula = y ~ x1 + x2 + x3 + x4, data = dat)
##
## Coefficients:
## (Intercept) x1 x2 x3 x4
## 5.8945 -0.4779 0.1827 35.4028 5.8439
anova_test<-anova(reduced_model,full_model)
anova_test
## Analysis of Variance Table
##
## Model 1: y ~ x1 + x2 + x3 + x4
## Model 2: y ~ (x1 + x2 + x3 + x4)^2
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 57 1432.8
## 2 53 1162.4 4 270.37 3.0819 0.02352 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
the partial F-test is:3.08
P-value for the partial F-Test fo all two-factor interaction is: 0.0235
conclusion about the two-factor interactions:Reject H0 (at least one interaction term is significant and should be retained)
best_fitting_model<-step(full_model,direction = "backward")
## Start: AIC=199.73
## y ~ (x1 + x2 + x3 + x4)^2
##
##
## Step: AIC=199.73
## y ~ x1 + x2 + x3 + x4 + x1:x2 + x1:x3 + x1:x4 + x2:x4 + x3:x4
##
##
## Step: AIC=199.73
## y ~ x1 + x2 + x3 + x4 + x1:x2 + x1:x4 + x2:x4 + x3:x4
##
## Df Sum of Sq RSS AIC
## - x3:x4 1 10.942 1173.4 198.31
## - x1:x4 1 20.682 1183.1 198.82
## <none> 1162.4 199.73
## - x1:x2 1 38.737 1201.2 199.76
## - x2:x4 1 227.751 1390.2 208.82
##
## Step: AIC=198.31
## y ~ x1 + x2 + x3 + x4 + x1:x2 + x1:x4 + x2:x4
##
## Df Sum of Sq RSS AIC
## - x1:x4 1 19.837 1193.2 197.35
## <none> 1173.4 198.31
## - x1:x2 1 38.709 1212.1 198.32
## - x2:x4 1 228.394 1401.8 207.34
## - x3 1 249.320 1422.7 208.26
##
## Step: AIC=197.35
## y ~ x1 + x2 + x3 + x4 + x1:x2 + x2:x4
##
## Df Sum of Sq RSS AIC
## - x1:x2 1 32.307 1225.5 197.01
## <none> 1193.2 197.35
## - x2:x4 1 220.026 1413.2 205.84
## - x3 1 252.209 1445.4 207.24
##
## Step: AIC=197.01
## y ~ x1 + x2 + x3 + x4 + x2:x4
##
## Df Sum of Sq RSS AIC
## - x1 1 11.262 1236.8 195.57
## <none> 1225.5 197.01
## - x2:x4 1 207.286 1432.8 204.69
## - x3 1 248.430 1473.9 206.45
##
## Step: AIC=195.57
## y ~ x2 + x3 + x4 + x2:x4
##
## Df Sum of Sq RSS AIC
## <none> 1236.8 195.57
## - x3 1 243.60 1480.4 204.72
## - x2:x4 1 245.68 1482.4 204.81
summary(best_fitting_model)
##
## Call:
## lm(formula = y ~ x2 + x3 + x4 + x2:x4, data = dat)
##
## Residuals:
## Min 1Q Median 3Q Max
## -9.959 -3.358 -1.131 3.040 11.646
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.52261 4.03964 0.377 0.70763
## x2 0.38056 0.06084 6.255 5.47e-08 ***
## x3 34.51062 10.29961 3.351 0.00144 **
## x4 9.52471 2.96093 3.217 0.00214 **
## x2:x4 -0.30472 0.09056 -3.365 0.00137 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.658 on 57 degrees of freedom
## Multiple R-squared: 0.7336, Adjusted R-squared: 0.7149
## F-statistic: 39.24 on 4 and 57 DF, p-value: 9.297e-16
The best-fitting model includes x2, x3, x4, and the interaction x2:x4. Variable x1 is dropped because it is not statistically significant. \[ (y\sim x2+x3+x4+x2:x4) \]
best_model<-lm(y~x2+x3+x4+x2:x4, data=dat)
best_model
##
## Call:
## lm(formula = y ~ x2 + x3 + x4 + x2:x4, data = dat)
##
## Coefficients:
## (Intercept) x2 x3 x4 x2:x4
## 1.5226 0.3806 34.5106 9.5247 -0.3047
summary(best_model)
##
## Call:
## lm(formula = y ~ x2 + x3 + x4 + x2:x4, data = dat)
##
## Residuals:
## Min 1Q Median 3Q Max
## -9.959 -3.358 -1.131 3.040 11.646
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.52261 4.03964 0.377 0.70763
## x2 0.38056 0.06084 6.255 5.47e-08 ***
## x3 34.51062 10.29961 3.351 0.00144 **
## x4 9.52471 2.96093 3.217 0.00214 **
## x2:x4 -0.30472 0.09056 -3.365 0.00137 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.658 on 57 degrees of freedom
## Multiple R-squared: 0.7336, Adjusted R-squared: 0.7149
## F-statistic: 39.24 on 4 and 57 DF, p-value: 9.297e-16
The Adjusted R square value for the best model is 0.7149
The estimated coefficient for the x2:x4 interaction term is -0.3047
We should omit \(x1\) because it is not in the model
pt1 <- data.frame(x2 = 10, x3 = 0.5, x4 = 0.75)
pred1 <- predict(best_model, pt1, interval="confidence", level=0.95)
pred1
## fit lwr upr
## 1 27.44168 24.00618 30.87717
the predicted value of y is 27.44
The lower bound of the 95% confidence interval on the mean response at Point 1 (x2 = 10, x3 = 0.5, x4 = 0.75) is 24.01
The upper bound of the 95% confidence interval on the mean response at Point 1 is 30.88
pt2 <- data.frame(x2 = 3, x3 = 0.25, x4 = 0.85)
pred2 <- predict(best_model, pt2, interval="prediction", level=0.95)
pred2
## fit lwr upr
## 1 18.61092 8.860183 28.36165
The predicted (fitted) value of y at Point 2 (x2 = 3, x3 = 0.25, x4 = 0.85)is 18.61
The upper bound of the 95% prediction interval at Point 2 (x2 = 3, x3 = 0.25, x4 = 0.85) is 28.36
The prediction interval is always wider than the confidence interval at the same point because the prediction interval accounts for both the uncertainty in estimating the mean response and the variability of individual observations is [True]
x1 was not statistically significant as a predictor of pressure drop and its removal improved the model based on AIC