Data Description

Specifically, the problem studies the effect of four factors (x1 = superficial fluid velocity of the gas (cs/s), x2= kinematic velocity, x3= mesh opening (cm), and x4= dimensionless number relating the superficial velocity of the gas to the fluid) on the pressure drop in a screen plate bubble column (y).

First, we need to load the data-set using the link.

url <- "https://raw.githubusercontent.com/tmatis12/datafiles/refs/heads/main/data-table-B9.csv"
dat <- read.csv(url)
dat

##      x1     x2   x3    x4    y
## 1  2.14  10.00 0.34 1.000 28.9
## 2  4.14  10.00 0.34 1.000 31.0
## 3  8.15  10.00 0.34 1.000 26.4
## 4  2.14  10.00 0.34 0.246 27.2
## 5  4.14  10.00 0.34 0.379 26.1
## 6  8.15  10.00 0.34 0.474 23.2
## 7  2.14  10.00 0.34 0.141 19.7
## 8  4.14  10.00 0.34 0.234 22.1
## 9  8.15  10.00 0.34 0.311 22.8
## 10 2.14  10.00 0.34 0.076 29.2
## 11 4.14  10.00 0.34 0.132 23.6
## 12 8.15  10.00 0.34 0.184 23.6
## 13 2.14   2.63 0.34 0.679 24.2
## 14 4.14   2.63 0.34 0.804 22.1
## 15 8.15   2.63 0.34 0.890 20.9
## 16 2.14   2.63 0.34 0.514 17.6
## 17 4.14   2.63 0.34 0.672 15.7
## 18 8.15   2.63 0.34 0.801 15.8
## 19 2.14   2.63 0.34 0.346 14.0
## 20 4.14   2.63 0.34 0.506 17.1
## 21 8.15   2.63 0.34 0.669 18.3
## 22 2.14   2.63 0.34 1.000 33.8
## 23 4.14   2.63 0.34 1.000 31.7
## 24 8.15   2.63 0.34 1.000 28.1
## 25 5.60   1.25 0.34 0.848 18.1
## 26 5.60   1.25 0.34 0.737 16.5
## 27 5.60   1.25 0.34 0.651 15.4
## 28 5.60   1.25 0.34 0.554 15.0
## 29 4.30   2.63 0.34 0.748 19.1
## 30 4.30   2.63 0.34 0.682 16.2
## 31 4.30   2.63 0.34 0.524 16.3
## 32 4.30   2.63 0.34 0.472 15.8
## 33 4.30   2.63 0.34 0.398 15.4
## 34 5.60  10.10 0.25 0.789 19.2
## 35 5.60  10.10 0.25 0.677  8.4
## 36 5.60  10.10 0.25 0.590 15.0
## 37 5.60  10.10 0.25 0.523 12.0
## 38 5.60  10.10 0.34 0.789 21.9
## 39 5.60  10.10 0.34 0.677 21.3
## 40 5.60  10.10 0.34 0.590 21.6
## 41 5.60  10.10 0.34 0.523 19.8
## 42 4.30  10.10 0.34 0.741 21.6
## 43 4.30  10.10 0.34 0.617 17.3
## 44 4.30  10.10 0.34 0.524 20.0
## 45 4.30  10.10 0.34 0.457 18.6
## 46 2.40  10.10 0.34 0.615 22.1
## 47 2.40  10.10 0.34 0.473 14.7
## 48 2.40  10.10 0.34 0.381 15.8
## 49 2.40  10.10 0.34 0.320 13.2
## 50 5.60  10.10 0.55 0.789 30.8
## 51 5.60  10.10 0.55 0.677 27.5
## 52 5.60  10.10 0.55 0.590 25.2
## 53 5.60  10.10 0.55 0.523 22.8
## 54 2.14 112.00 0.34 0.680 41.7
## 55 4.14 112.00 0.34 0.803 33.7
## 56 8.15 112.00 0.34 0.889 29.7
## 57 2.14 112.00 0.34 0.514 41.8
## 58 4.14 112.00 0.34 0.672 37.1
## 59 8.15 112.00 0.34 0.801 40.1
## 60 2.14 112.00 0.34 0.306 42.7
## 61 4.14 112.00 0.34 0.506 48.6
## 62 8.15 112.00 0.34 0.668 42.4

Part 1: The Full Interaction Model

Fit the first-order model with all two-factor interactions.

full_model<-lm(y~(x1+x2+x3+x4)^2, data=dat)
full_model

## 
## Call:
## lm(formula = y ~ (x1 + x2 + x3 + x4)^2, data = dat)
## 
## Coefficients:
## (Intercept)           x1           x2           x3           x4        x1:x2  
##    15.88376      0.18695      0.37921    -11.99940     -8.86442      0.01155  
##       x1:x3        x1:x4        x2:x3        x2:x4        x3:x4  
##          NA     -1.11525           NA     -0.38547     72.85976

sum_full<-summary(full_model)
sum_full

## 
## Call:
## lm(formula = y ~ (x1 + x2 + x3 + x4)^2, data = dat)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -9.4804 -3.0766 -0.6635  2.9625 12.2221 
## 
## Coefficients: (2 not defined because of singularities)
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  15.88376   23.17863   0.685  0.49616    
## x1            0.18696    0.78447   0.238  0.81255    
## x2            0.37921    0.06332   5.989 1.89e-07 ***
## x3          -11.99940   67.31148  -0.178  0.85919    
## x4           -8.86442   35.62553  -0.249  0.80446    
## x1:x2         0.01155    0.00869   1.329  0.18955    
## x1:x3              NA         NA      NA       NA    
## x1:x4        -1.11525    1.14847  -0.971  0.33592    
## x2:x3              NA         NA      NA       NA    
## x2:x4        -0.38547    0.11962  -3.222  0.00218 ** 
## x3:x4        72.85976  103.15353   0.706  0.48308    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.683 on 53 degrees of freedom
## Multiple R-squared:  0.7496, Adjusted R-squared:  0.7118 
## F-statistic: 19.83 on 8 and 53 DF,  p-value: 1.947e-13

Q1) Which two interactions are dropped due to singularities?

Answer: x1:x3 and x2:x3

Q2) R squared value?

Answer: 0.7496

Q3) F-statistic?

Answer: 19.83

Q4) Based on the overall F-test… what do you conclude?

Answer: Reject H0 (at least one regression coefficient is significantly different from zero)

Part 2: Partial F-Test

We fit the reduced model and compare it to the full interaction model using a partial F-test to see if the interaction terms as a group are significant.

reduced_model<-lm(y~x1+x2+x3+x4, data=dat)
reduced_model

## 
## Call:
## lm(formula = y ~ x1 + x2 + x3 + x4, data = dat)
## 
## Coefficients:
## (Intercept)           x1           x2           x3           x4  
##      5.8945      -0.4779       0.1827      35.4028       5.8439

anova_test<-anova(reduced_model,full_model)
anova_test

## Analysis of Variance Table
## 
## Model 1: y ~ x1 + x2 + x3 + x4
## Model 2: y ~ (x1 + x2 + x3 + x4)^2
##   Res.Df    RSS Df Sum of Sq      F  Pr(>F)  
## 1     57 1432.8                              
## 2     53 1162.4  4    270.37 3.0819 0.02352 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Q5) Partial F-test comparing the reduced model to the full interaction model?

Answer: 3.08

Q6) P-value for the partial F-Test fo all two-factor interactions?

Answer: 0.0235

Q7) Based on the partial F-test, what do you conclude about two-factor interactions?

Answer: Reject H0 (at least one interaction term is significant and should be retained).

Part 3: Stepwise Regression

Q8) Using the Stepwise regression, which terms are included in the best-fitting model justified by significance of regression parameters?

The best-fitting model includes x2, x3, x4, and the interaction x2:x4. Variable x1 is dropped because it is not statistically significant.

best_fitting_model<-step(full_model,direction = "backward")

## Start:  AIC=199.73
## y ~ (x1 + x2 + x3 + x4)^2
## 
## 
## Step:  AIC=199.73
## y ~ x1 + x2 + x3 + x4 + x1:x2 + x1:x3 + x1:x4 + x2:x4 + x3:x4
## 
## 
## Step:  AIC=199.73
## y ~ x1 + x2 + x3 + x4 + x1:x2 + x1:x4 + x2:x4 + x3:x4
## 
##         Df Sum of Sq    RSS    AIC
## - x3:x4  1    10.942 1173.4 198.31
## - x1:x4  1    20.682 1183.1 198.82
## <none>               1162.4 199.73
## - x1:x2  1    38.737 1201.2 199.76
## - x2:x4  1   227.751 1390.2 208.82
## 
## Step:  AIC=198.31
## y ~ x1 + x2 + x3 + x4 + x1:x2 + x1:x4 + x2:x4
## 
##         Df Sum of Sq    RSS    AIC
## - x1:x4  1    19.837 1193.2 197.35
## <none>               1173.4 198.31
## - x1:x2  1    38.709 1212.1 198.32
## - x2:x4  1   228.394 1401.8 207.34
## - x3     1   249.320 1422.7 208.26
## 
## Step:  AIC=197.35
## y ~ x1 + x2 + x3 + x4 + x1:x2 + x2:x4
## 
##         Df Sum of Sq    RSS    AIC
## - x1:x2  1    32.307 1225.5 197.01
## <none>               1193.2 197.35
## - x2:x4  1   220.026 1413.2 205.84
## - x3     1   252.209 1445.4 207.24
## 
## Step:  AIC=197.01
## y ~ x1 + x2 + x3 + x4 + x2:x4
## 
##         Df Sum of Sq    RSS    AIC
## - x1     1    11.262 1236.8 195.57
## <none>               1225.5 197.01
## - x2:x4  1   207.286 1432.8 204.69
## - x3     1   248.430 1473.9 206.45
## 
## Step:  AIC=195.57
## y ~ x2 + x3 + x4 + x2:x4
## 
##         Df Sum of Sq    RSS    AIC
## <none>               1236.8 195.57
## - x3     1    243.60 1480.4 204.72
## - x2:x4  1    245.68 1482.4 204.81

summary(best_fitting_model)

## 
## Call:
## lm(formula = y ~ x2 + x3 + x4 + x2:x4, data = dat)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -9.959 -3.358 -1.131  3.040 11.646 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.52261    4.03964   0.377  0.70763    
## x2           0.38056    0.06084   6.255 5.47e-08 ***
## x3          34.51062   10.29961   3.351  0.00144 ** 
## x4           9.52471    2.96093   3.217  0.00214 ** 
## x2:x4       -0.30472    0.09056  -3.365  0.00137 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.658 on 57 degrees of freedom
## Multiple R-squared:  0.7336, Adjusted R-squared:  0.7149 
## F-statistic: 39.24 on 4 and 57 DF,  p-value: 9.297e-16

Part 4: Best Model

\[ y\sim x2+x3+x4+x2:x4 \]

best_model<-lm(y~x2+x3+x4+x2:x4, data=dat)
best_model

## 
## Call:
## lm(formula = y ~ x2 + x3 + x4 + x2:x4, data = dat)
## 
## Coefficients:
## (Intercept)           x2           x3           x4        x2:x4  
##      1.5226       0.3806      34.5106       9.5247      -0.3047

sum_best<-summary(best_model)
sum_best

## 
## Call:
## lm(formula = y ~ x2 + x3 + x4 + x2:x4, data = dat)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -9.959 -3.358 -1.131  3.040 11.646 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.52261    4.03964   0.377  0.70763    
## x2           0.38056    0.06084   6.255 5.47e-08 ***
## x3          34.51062   10.29961   3.351  0.00144 ** 
## x4           9.52471    2.96093   3.217  0.00214 ** 
## x2:x4       -0.30472    0.09056  -3.365  0.00137 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.658 on 57 degrees of freedom
## Multiple R-squared:  0.7336, Adjusted R-squared:  0.7149 
## F-statistic: 39.24 on 4 and 57 DF,  p-value: 9.297e-16

Q9) The Adjusted R square value for the best model?

Answer: 0.7149

Q10) The estimated coefficient for the x2:x4 interaction term?

Answer:-0.3047

Part 4: Predictions and Intervals

Point 1: Confidence Interval

Q11) We predict the mean response at \(x2 = 10\), \(x3 = 0.5\), and \(x4 = 0.75\).

Answer: We omit \(x1\) because it is not in the model.

pt1 <- data.frame(x2 = 10, x3 = 0.5, x4 = 0.75)
pred1 <- predict(best_model, pt1, interval="confidence", level=0.95)
pred1

##        fit      lwr      upr
## 1 27.44168 24.00618 30.87717

Q12) The predicted (fitted) value of y at Point 1 (x2 = 10, x3 = 0.5, x4 = 0.75)? (Round to 2 decimal places)

Answer: 27.44

Q13)The lower bound of the 95% confidence interval on the mean response at Point 1 (x2 = 10, x3 = 0.5, x4 = 0.75)? (Round to 2 decimal places)

Answer: 24.01

Q14) The upper bound of the 95% confidence interval on the mean response at Point 1? (Round to 2 decimal places)

Answer: 30.88

Point 2: Prediction Interval

We predict an individual response at \(x2 = 3\), \(x3 = 0.25\), and \(x4 = 0.85\).

pt2 <- data.frame(x2 = 3, x3 = 0.25, x4 = 0.85)
pred2 <- predict(best_model, pt2, interval="prediction", level=0.95)
pred2

##        fit      lwr      upr
## 1 18.61092 8.860183 28.36165

Q15) The predicted (fitted) value of y at Point 2 (x2 = 3, x3 = 0.25, x4 = 0.85)?

Answer: 18.61

Q16) The upper bound of the 95% prediction interval at Point 2 (x2 = 3, x3 = 0.25, x4 = 0.85)?

Answer: 28.36

Q17) The prediction interval is always wider than the confidence interval at the same point because the prediction interval accounts for both the uncertainty in estimating the mean response and the variability of individual observations.

Answer: True

Q18) Why was x1 (superficial fluid velocity of the gas) ultimately excluded from the best model?

Answer: x1 was not statistically significant as a predictor of pressure drop and its removal improved the model based on AIC

Part 5: R Script

url <- "https://raw.githubusercontent.com/tmatis12/datafiles/refs/heads/main/data-table-B9.csv"
dat <- read.csv(url)
dat

##      x1     x2   x3    x4    y
## 1  2.14  10.00 0.34 1.000 28.9
## 2  4.14  10.00 0.34 1.000 31.0
## 3  8.15  10.00 0.34 1.000 26.4
## 4  2.14  10.00 0.34 0.246 27.2
## 5  4.14  10.00 0.34 0.379 26.1
## 6  8.15  10.00 0.34 0.474 23.2
## 7  2.14  10.00 0.34 0.141 19.7
## 8  4.14  10.00 0.34 0.234 22.1
## 9  8.15  10.00 0.34 0.311 22.8
## 10 2.14  10.00 0.34 0.076 29.2
## 11 4.14  10.00 0.34 0.132 23.6
## 12 8.15  10.00 0.34 0.184 23.6
## 13 2.14   2.63 0.34 0.679 24.2
## 14 4.14   2.63 0.34 0.804 22.1
## 15 8.15   2.63 0.34 0.890 20.9
## 16 2.14   2.63 0.34 0.514 17.6
## 17 4.14   2.63 0.34 0.672 15.7
## 18 8.15   2.63 0.34 0.801 15.8
## 19 2.14   2.63 0.34 0.346 14.0
## 20 4.14   2.63 0.34 0.506 17.1
## 21 8.15   2.63 0.34 0.669 18.3
## 22 2.14   2.63 0.34 1.000 33.8
## 23 4.14   2.63 0.34 1.000 31.7
## 24 8.15   2.63 0.34 1.000 28.1
## 25 5.60   1.25 0.34 0.848 18.1
## 26 5.60   1.25 0.34 0.737 16.5
## 27 5.60   1.25 0.34 0.651 15.4
## 28 5.60   1.25 0.34 0.554 15.0
## 29 4.30   2.63 0.34 0.748 19.1
## 30 4.30   2.63 0.34 0.682 16.2
## 31 4.30   2.63 0.34 0.524 16.3
## 32 4.30   2.63 0.34 0.472 15.8
## 33 4.30   2.63 0.34 0.398 15.4
## 34 5.60  10.10 0.25 0.789 19.2
## 35 5.60  10.10 0.25 0.677  8.4
## 36 5.60  10.10 0.25 0.590 15.0
## 37 5.60  10.10 0.25 0.523 12.0
## 38 5.60  10.10 0.34 0.789 21.9
## 39 5.60  10.10 0.34 0.677 21.3
## 40 5.60  10.10 0.34 0.590 21.6
## 41 5.60  10.10 0.34 0.523 19.8
## 42 4.30  10.10 0.34 0.741 21.6
## 43 4.30  10.10 0.34 0.617 17.3
## 44 4.30  10.10 0.34 0.524 20.0
## 45 4.30  10.10 0.34 0.457 18.6
## 46 2.40  10.10 0.34 0.615 22.1
## 47 2.40  10.10 0.34 0.473 14.7
## 48 2.40  10.10 0.34 0.381 15.8
## 49 2.40  10.10 0.34 0.320 13.2
## 50 5.60  10.10 0.55 0.789 30.8
## 51 5.60  10.10 0.55 0.677 27.5
## 52 5.60  10.10 0.55 0.590 25.2
## 53 5.60  10.10 0.55 0.523 22.8
## 54 2.14 112.00 0.34 0.680 41.7
## 55 4.14 112.00 0.34 0.803 33.7
## 56 8.15 112.00 0.34 0.889 29.7
## 57 2.14 112.00 0.34 0.514 41.8
## 58 4.14 112.00 0.34 0.672 37.1
## 59 8.15 112.00 0.34 0.801 40.1
## 60 2.14 112.00 0.34 0.306 42.7
## 61 4.14 112.00 0.34 0.506 48.6
## 62 8.15 112.00 0.34 0.668 42.4

full_model<-lm(y~(x1+x2+x3+x4)^2, data=dat)
full_model

## 
## Call:
## lm(formula = y ~ (x1 + x2 + x3 + x4)^2, data = dat)
## 
## Coefficients:
## (Intercept)           x1           x2           x3           x4        x1:x2  
##    15.88376      0.18695      0.37921    -11.99940     -8.86442      0.01155  
##       x1:x3        x1:x4        x2:x3        x2:x4        x3:x4  
##          NA     -1.11525           NA     -0.38547     72.85976

sum_full<-summary(full_model)
sum_full

## 
## Call:
## lm(formula = y ~ (x1 + x2 + x3 + x4)^2, data = dat)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -9.4804 -3.0766 -0.6635  2.9625 12.2221 
## 
## Coefficients: (2 not defined because of singularities)
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  15.88376   23.17863   0.685  0.49616    
## x1            0.18696    0.78447   0.238  0.81255    
## x2            0.37921    0.06332   5.989 1.89e-07 ***
## x3          -11.99940   67.31148  -0.178  0.85919    
## x4           -8.86442   35.62553  -0.249  0.80446    
## x1:x2         0.01155    0.00869   1.329  0.18955    
## x1:x3              NA         NA      NA       NA    
## x1:x4        -1.11525    1.14847  -0.971  0.33592    
## x2:x3              NA         NA      NA       NA    
## x2:x4        -0.38547    0.11962  -3.222  0.00218 ** 
## x3:x4        72.85976  103.15353   0.706  0.48308    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.683 on 53 degrees of freedom
## Multiple R-squared:  0.7496, Adjusted R-squared:  0.7118 
## F-statistic: 19.83 on 8 and 53 DF,  p-value: 1.947e-13

reduced_model<-lm(y~x1+x2+x3+x4, data=dat)
reduced_model

## 
## Call:
## lm(formula = y ~ x1 + x2 + x3 + x4, data = dat)
## 
## Coefficients:
## (Intercept)           x1           x2           x3           x4  
##      5.8945      -0.4779       0.1827      35.4028       5.8439

anova_test<-anova(reduced_model,full_model)
anova_test

## Analysis of Variance Table
## 
## Model 1: y ~ x1 + x2 + x3 + x4
## Model 2: y ~ (x1 + x2 + x3 + x4)^2
##   Res.Df    RSS Df Sum of Sq      F  Pr(>F)  
## 1     57 1432.8                              
## 2     53 1162.4  4    270.37 3.0819 0.02352 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

best_fitting_model<-step(full_model,direction = "backward")

## Start:  AIC=199.73
## y ~ (x1 + x2 + x3 + x4)^2
## 
## 
## Step:  AIC=199.73
## y ~ x1 + x2 + x3 + x4 + x1:x2 + x1:x3 + x1:x4 + x2:x4 + x3:x4
## 
## 
## Step:  AIC=199.73
## y ~ x1 + x2 + x3 + x4 + x1:x2 + x1:x4 + x2:x4 + x3:x4
## 
##         Df Sum of Sq    RSS    AIC
## - x3:x4  1    10.942 1173.4 198.31
## - x1:x4  1    20.682 1183.1 198.82
## <none>               1162.4 199.73
## - x1:x2  1    38.737 1201.2 199.76
## - x2:x4  1   227.751 1390.2 208.82
## 
## Step:  AIC=198.31
## y ~ x1 + x2 + x3 + x4 + x1:x2 + x1:x4 + x2:x4
## 
##         Df Sum of Sq    RSS    AIC
## - x1:x4  1    19.837 1193.2 197.35
## <none>               1173.4 198.31
## - x1:x2  1    38.709 1212.1 198.32
## - x2:x4  1   228.394 1401.8 207.34
## - x3     1   249.320 1422.7 208.26
## 
## Step:  AIC=197.35
## y ~ x1 + x2 + x3 + x4 + x1:x2 + x2:x4
## 
##         Df Sum of Sq    RSS    AIC
## - x1:x2  1    32.307 1225.5 197.01
## <none>               1193.2 197.35
## - x2:x4  1   220.026 1413.2 205.84
## - x3     1   252.209 1445.4 207.24
## 
## Step:  AIC=197.01
## y ~ x1 + x2 + x3 + x4 + x2:x4
## 
##         Df Sum of Sq    RSS    AIC
## - x1     1    11.262 1236.8 195.57
## <none>               1225.5 197.01
## - x2:x4  1   207.286 1432.8 204.69
## - x3     1   248.430 1473.9 206.45
## 
## Step:  AIC=195.57
## y ~ x2 + x3 + x4 + x2:x4
## 
##         Df Sum of Sq    RSS    AIC
## <none>               1236.8 195.57
## - x3     1    243.60 1480.4 204.72
## - x2:x4  1    245.68 1482.4 204.81

summary(best_fitting_model)

## 
## Call:
## lm(formula = y ~ x2 + x3 + x4 + x2:x4, data = dat)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -9.959 -3.358 -1.131  3.040 11.646 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.52261    4.03964   0.377  0.70763    
## x2           0.38056    0.06084   6.255 5.47e-08 ***
## x3          34.51062   10.29961   3.351  0.00144 ** 
## x4           9.52471    2.96093   3.217  0.00214 ** 
## x2:x4       -0.30472    0.09056  -3.365  0.00137 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.658 on 57 degrees of freedom
## Multiple R-squared:  0.7336, Adjusted R-squared:  0.7149 
## F-statistic: 39.24 on 4 and 57 DF,  p-value: 9.297e-16

best_model<-lm(y~x2+x3+x4+x2:x4, data=dat)
best_model

## 
## Call:
## lm(formula = y ~ x2 + x3 + x4 + x2:x4, data = dat)
## 
## Coefficients:
## (Intercept)           x2           x3           x4        x2:x4  
##      1.5226       0.3806      34.5106       9.5247      -0.3047

sum_best<-summary(best_model)
sum_best

## 
## Call:
## lm(formula = y ~ x2 + x3 + x4 + x2:x4, data = dat)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -9.959 -3.358 -1.131  3.040 11.646 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.52261    4.03964   0.377  0.70763    
## x2           0.38056    0.06084   6.255 5.47e-08 ***
## x3          34.51062   10.29961   3.351  0.00144 ** 
## x4           9.52471    2.96093   3.217  0.00214 ** 
## x2:x4       -0.30472    0.09056  -3.365  0.00137 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.658 on 57 degrees of freedom
## Multiple R-squared:  0.7336, Adjusted R-squared:  0.7149 
## F-statistic: 39.24 on 4 and 57 DF,  p-value: 9.297e-16

pt1 <- data.frame(x2 = 10, x3 = 0.5, x4 = 0.75)
pred1 <- predict(best_model, pt1, interval="confidence", level=0.95)
pred1

##        fit      lwr      upr
## 1 27.44168 24.00618 30.87717

pt2 <- data.frame(x2 = 3, x3 = 0.25, x4 = 0.85)
pred2 <- predict(best_model, pt2, interval="prediction", level=0.95)
pred2

##        fit      lwr      upr
## 1 18.61092 8.860183 28.36165

Assignment 8 - Multiple Regression with “lm”

Parva Pate

2026-02-22