During periods of high electricity demand, especially during the hot summer months, the power output from a gas turbine engine can drop dramatically. One way to counter this drop in power is by cooling the inlet air to the gas turbine. An increasingly popular cooling method uses high-pressure inlet fogging. The performance of a sample of 67 gas turbines augmented with high-pressure inlet fogging was investigated in the Journal of Engineering for Gas Turbines and Power (January 2005). One measure of performance is heat rate (kilojoules per kilowatt per hour). Heat rates for the 67 gas turbines are saved in the gasturbine file.

Unit 3.1: Multicollinearity and Variable Screening

Step 1: Collect the Data

Check that the response variable is appropriate for regression: view a histogram of the response variable. It should be continuous and approximately unimodal and symmetric, with few outliers.

gasturbine<-read.delim("https://raw.githubusercontent.com/kvaranyak4/STAT3220/main/GASTURBINE.txt")
head(gasturbine)
names(gasturbine)
[1] "ENGINE"     "SHAFTS"     "RPM"        "CPRATIO"    "INLET.TEMP"
[6] "EXH.TEMP"   "AIRFLOW"    "POWER"      "HEATRATE"  
hist(gasturbine$HEATRATE, xlab="Heat Rate", main="Histogram of Heat Rate") 

The distribution of the response variable, heat rate, is unimodal and skewed right. Although it is not symmetric, it is continuous, so it should still be suitable for regression.

Step 2: Hypothesize Relationship (Exploratory Data Analysis)

We will explore the relationship between heat rate and each quantitative explanatory variable with scatter plots and correlations, and classify each relationship as linear, curvilinear, or none. For each qualitative explanatory variable, we will examine box plots and group means, then classify the relationship as existent or not. We will not explore interactions in this example.

#Scatter plots for quantitative variables
for (i in names(gasturbine)[3:8]) {
  plot(gasturbine[,i], gasturbine$HEATRATE,xlab=i,ylab="Heat Rate")
}

#Correlations for quantitative variables
round(cor(gasturbine[3:8],gasturbine$HEATRATE,use="complete.obs"),3)
             [,1]
RPM         0.844
CPRATIO    -0.735
INLET.TEMP -0.801
EXH.TEMP   -0.314
AIRFLOW    -0.703
POWER      -0.697
#Summary statistics for the response variable grouped by each level of the qualitative explanatory variables
tapply(gasturbine$HEATRATE,gasturbine$ENGINE,summary)
$Advanced
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   9105    9295    9669    9764    9933   11588 

$Aeroderiv
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   8714   10708   12414   12312   13697   16243 

$Traditional
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  10086   10598   11183   11544   11956   14796 
tapply(gasturbine$HEATRATE,gasturbine$SHAFTS,summary)
$`1`
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   9105    9918   10592   10930   11674   14796 

$`2`
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  10951   11223   11654   12536   13232   16243 

$`3`
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   8714    8903    9092    9092    9280    9469 
#Box plots for Qualitative
boxplot(HEATRATE~ENGINE,gasturbine, ylab="Heat Rate")

boxplot(HEATRATE~SHAFTS,gasturbine, ylab="Heat Rate")

# Summary counts for qualitative variables
table(gasturbine$ENGINE,gasturbine$SHAFTS)
             
               1  2  3
  Advanced    21  0  0
  Aeroderiv    1  4  2
  Traditional 35  4  0
  • Most of the quantitative variables are strongly correlated with heat rate (|r| of roughly 0.70 or higher). Exhaust temperature has the weakest correlation (r=-0.314).
  • Power appears to have a curvilinear, possibly negative logarithmic, relationship with heat rate.
  • Average heat rate appears to differ by engine type: the advanced group has the lowest mean (9764, versus 11544 for traditional and 12312 for aeroderivative).
  • Note: Shafts=3 has only two observations. Further, we would not be able to fit an interaction between Engine and Shafts because some level combinations do not have any observations.
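
The note about empty level combinations can be checked directly from the cross-tabulation. The following is a minimal sketch that re-enters the counts from the table above by hand (rather than reading the data file) and lists the Engine and Shafts combinations with no observations. The object names `counts` and `empty_cells` are illustrative only.

```r
# Cell counts copied from the ENGINE x SHAFTS table above
counts <- matrix(c(21, 0, 0,
                    1, 4, 2,
                   35, 4, 0),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(ENGINE = c("Advanced", "Aeroderiv", "Traditional"),
                                 SHAFTS = c("1", "2", "3")))

# Locate the level combinations with no observations
# (column 1 of the index matrix is the row position, column 2 the column position)
empty <- which(counts == 0, arr.ind = TRUE)
empty_cells <- data.frame(ENGINE = rownames(counts)[empty[, 1]],
                          SHAFTS = colnames(counts)[empty[, 2]])
empty_cells

# Number of observations at each SHAFTS level
colSums(counts)
```

Any combination listed in `empty_cells` has no data with which to estimate an interaction term, which is why the interaction is not considered here.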

Step 2 (continued): Multicollinearity

Do any of the explanatory variables have relationships with each other? We will look at pairwise correlations and VIF to evaluate multicollinearity in the quantitative explanatory variables.

#Regular correlation
gasturcor<-round(cor(gasturbine[,3:8]),4)
gasturcor
               RPM CPRATIO INLET.TEMP EXH.TEMP AIRFLOW   POWER
RPM         1.0000 -0.4903    -0.5536  -0.1715 -0.6876 -0.6169
CPRATIO    -0.4903  1.0000     0.6851   0.1139  0.3826  0.4473
INLET.TEMP -0.5536  0.6851     1.0000   0.7283  0.6808  0.7503
EXH.TEMP   -0.1715  0.1139     0.7283   1.0000  0.5665  0.6309
AIRFLOW    -0.6876  0.3826     0.6808   0.5665  1.0000  0.9776
POWER      -0.6169  0.4473     0.7503   0.6309  0.9776  1.0000
# Scatter plot matrix
plot(gasturbine[3:8])

#A new correlation function: rcorr() from the Hmisc package also reports p-values
library(Hmisc)
gasturcor2<-rcorr(as.matrix(gasturbine[,3:8]))
gasturcor2
             RPM CPRATIO INLET.TEMP EXH.TEMP AIRFLOW POWER
RPM         1.00   -0.49      -0.55    -0.17   -0.69 -0.62
CPRATIO    -0.49    1.00       0.69     0.11    0.38  0.45
INLET.TEMP -0.55    0.69       1.00     0.73    0.68  0.75
EXH.TEMP   -0.17    0.11       0.73     1.00    0.57  0.63
AIRFLOW    -0.69    0.38       0.68     0.57    1.00  0.98
POWER      -0.62    0.45       0.75     0.63    0.98  1.00

n= 67 


P
           RPM    CPRATIO INLET.TEMP EXH.TEMP AIRFLOW POWER 
RPM               0.0000  0.0000     0.1653   0.0000  0.0000
CPRATIO    0.0000         0.0000     0.3585   0.0014  0.0001
INLET.TEMP 0.0000 0.0000             0.0000   0.0000  0.0000
EXH.TEMP   0.1653 0.3585  0.0000              0.0000  0.0000
AIRFLOW    0.0000 0.0014  0.0000     0.0000           0.0000
POWER      0.0000 0.0001  0.0000     0.0000   0.0000        
#Correlation Visualization (corrplot() comes from the corrplot package)
library(corrplot)
corrplot(gasturcor)

Several pairs of explanatory variables are strongly related, which is a concern:

  • POWER & AIRFLOW (r=0.978)
  • POWER & INLET.TEMP (r=0.750)
  • INLET.TEMP & EXH.TEMP (r=0.728)
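
Rather than scanning the correlation matrix by eye, the strong pairs can be pulled out programmatically. The sketch below re-enters the correlations printed above and flags every pair whose absolute correlation exceeds a cutoff. The cutoff of 0.7 and the names `r`, `cutoff`, and `flagged` are assumptions made for illustration, not part of the original analysis.

```r
vars <- c("RPM", "CPRATIO", "INLET.TEMP", "EXH.TEMP", "AIRFLOW", "POWER")

# Correlations copied from the gasturcor output above
r <- matrix(c( 1.0000, -0.4903, -0.5536, -0.1715, -0.6876, -0.6169,
              -0.4903,  1.0000,  0.6851,  0.1139,  0.3826,  0.4473,
              -0.5536,  0.6851,  1.0000,  0.7283,  0.6808,  0.7503,
              -0.1715,  0.1139,  0.7283,  1.0000,  0.5665,  0.6309,
              -0.6876,  0.3826,  0.6808,  0.5665,  1.0000,  0.9776,
              -0.6169,  0.4473,  0.7503,  0.6309,  0.9776,  1.0000),
            nrow = 6, byrow = TRUE, dimnames = list(vars, vars))

cutoff <- 0.7  # assumed threshold for a "strong" pairwise relationship

# Look only at the upper triangle so each pair is reported once
idx <- which(abs(r) > cutoff & upper.tri(r), arr.ind = TRUE)
flagged <- data.frame(var1 = vars[idx[, 1]],
                      var2 = vars[idx[, 2]],
                      r    = r[idx])

# Strongest relationships first
flagged[order(-abs(flagged$r)), ]
```

With the data frame loaded, the hand-entered matrix could be replaced by `cor(gasturbine[,3:8])`.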
#Multicollinearity VIF
gasmod1<-lm(HEATRATE~.-ENGINE-SHAFTS,data=gasturbine)
# Syntax Note: We can use the . to indicate all the variables in the data frame
# And use the - to exclude a particular variable from the model
summary(gasmod1)

Call:
lm(formula = HEATRATE ~ . - ENGINE - SHAFTS, data = gasturbine)

Residuals:
     Min       1Q   Median       3Q      Max 
-1003.32  -307.35   -91.44   271.18  1405.52 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  1.431e+04  1.112e+03  12.869  < 2e-16 ***
RPM          8.058e-02  1.611e-02   5.002 5.25e-06 ***
CPRATIO     -6.775e+00  3.038e+01  -0.223 0.824301    
INLET.TEMP  -9.507e+00  1.529e+00  -6.217 5.33e-08 ***
EXH.TEMP     1.415e+01  3.469e+00   4.081 0.000135 ***
AIRFLOW     -2.553e+00  1.746e+00  -1.462 0.148892    
POWER        4.257e-03  4.217e-03   1.009 0.316804    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 458.8 on 60 degrees of freedom
Multiple R-squared:  0.9248,    Adjusted R-squared:  0.9173 
F-statistic:   123 on 6 and 60 DF,  p-value: < 2.2e-16
library(car) # assumption: vif() here is the one provided by the car package
gasmod1vif<-round(vif(gasmod1),3)
gasmod1vif
       RPM    CPRATIO INLET.TEMP   EXH.TEMP    AIRFLOW      POWER 
     4.015      5.213     13.852      7.351     49.136     49.765 
mean(gasmod1vif)
[1] 21.55533

Yes, there is evidence of severe multicollinearity: three VIFs exceed 10 (INLET.TEMP at 13.9, and AIRFLOW and POWER both near 49), and the average VIF (21.56) is well above 3.
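
To see where a VIF comes from, it can help to compute one by hand. For predictor j, VIF_j = 1/(1 - R_j^2), where R_j^2 is the R-squared from regressing predictor j on all of the other predictors. The sketch below uses simulated data (not the gas turbine data) with one deliberately near-collinear pair. The variable names and the simulation settings are assumptions chosen only for illustration.

```r
set.seed(1)
n  <- 100
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.2)   # built to be nearly collinear with x1
x3 <- rnorm(n)                  # unrelated to x1 and x2
X  <- data.frame(x1, x2, x3)

# VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing x_j on the others
vif_manual <- sapply(names(X), function(v) {
  others <- setdiff(names(X), v)
  r2 <- summary(lm(reformulate(others, response = v), data = X))$r.squared
  1 / (1 - r2)
})

# Equivalent shortcut: the diagonal of the inverse of the correlation matrix
vif_inverse <- diag(solve(cor(X)))

round(vif_manual, 3)
mean(vif_manual)
```

The collinear pair (x1, x2) should show inflated VIFs while x3 stays near 1, which mirrors how AIRFLOW and POWER stand out in the output above.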

Step 2 (continued): Variable Screening

With six quantitative predictors and severe multicollinearity, we need to reduce the set of variables. EDA alone does not make clear which variables should remain and which should be removed.

We will use variable selection procedures to narrow our quantitative variables down to a best set of predictors, with entry and remain significance levels of 0.15.
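
The logic of backward elimination is simple enough to sketch in base R: fit the full model, find the term with the largest p-value, remove it if that p-value exceeds the remain level, and repeat. The code below is a simplified illustration on simulated data, not a reproduction of how any package implements the procedure. The data, the names `dat`, `fit`, and `kept`, and the true coefficients are all assumptions.

```r
set.seed(2)
n   <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
dat$y <- 3 + 2 * dat$x1 - 1.5 * dat$x2 + rnorm(n)  # x3 and x4 are pure noise

prem <- 0.15   # "remain" significance level, matching the level used in the text
fit  <- lm(y ~ x1 + x2 + x3 + x4, data = dat)

repeat {
  coefs <- summary(fit)$coefficients[-1, , drop = FALSE]  # drop the intercept row
  if (nrow(coefs) == 0) break                             # nothing left to remove
  pvals <- setNames(coefs[, "Pr(>|t|)"], rownames(coefs))
  worst <- names(which.max(pvals))                        # least significant term
  if (pvals[[worst]] <= prem) break                       # every remaining term passes
  fit <- update(fit, as.formula(paste(". ~ . -", worst)))
}

kept <- attr(terms(fit), "term.labels")
kept
```

Forward selection runs the same idea in reverse, starting from the intercept-only model and adding the most significant candidate at each step while its p-value stays below the entry level.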

# backwards elimination (the ols_step_* functions come from the olsrr package)
library(olsrr)
#Default: prem = 0.3
ols_step_backward_p(gasmod1,prem=0.15,details=TRUE)
Backward Elimination Method 
---------------------------

Candidate Terms: 

1 . RPM 
2 . CPRATIO 
3 . INLET.TEMP 
4 . EXH.TEMP 
5 . AIRFLOW 
6 . POWER 

We are eliminating variables based on p value...

- CPRATIO 

Backward Elimination: Step 1 

 Variable CPRATIO Removed 

                          Model Summary                            
------------------------------------------------------------------
R                       0.962       RMSE                  455.170 
R-Squared               0.925       Coef. Var               4.113 
Adj. R-Squared          0.919       MSE                207179.318 
Pred R-Squared          0.907       MAE                   336.847 
------------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 

                                   ANOVA                                    
---------------------------------------------------------------------------
                     Sum of                                                
                    Squares        DF     Mean Square       F         Sig. 
---------------------------------------------------------------------------
Regression    155259270.080         5    31051854.016    149.879    0.0000 
Residual       12637938.368        61      207179.318                      
Total         167897208.448        66                                      
---------------------------------------------------------------------------

                                       Parameter Estimates                                         
--------------------------------------------------------------------------------------------------
      model         Beta    Std. Error    Std. Beta       t        Sig         lower        upper 
--------------------------------------------------------------------------------------------------
(Intercept)    14215.194      1011.866                  14.048    0.000    12191.843    16238.544 
        RPM        0.080         0.016        0.354      5.038    0.000        0.048        0.112 
 INLET.TEMP       -9.769         0.969       -0.842    -10.080    0.000      -11.707       -7.831 
   EXH.TEMP       14.732         2.290        0.408      6.432    0.000       10.152       19.312 
    AIRFLOW       -2.473         1.696       -0.352     -1.459    0.150       -5.864        0.917 
      POWER        0.004         0.004        0.239      0.992    0.325       -0.004        0.012 
--------------------------------------------------------------------------------------------------


- POWER 

Backward Elimination: Step 2 

 Variable POWER Removed 

                          Model Summary                            
------------------------------------------------------------------
R                       0.961       RMSE                  455.114 
R-Squared               0.924       Coef. Var               4.113 
Adj. R-Squared          0.919       MSE                207128.472 
Pred R-Squared          0.908       MAE                   340.078 
------------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 

                                   ANOVA                                    
---------------------------------------------------------------------------
                     Sum of                                                
                    Squares        DF     Mean Square       F         Sig. 
---------------------------------------------------------------------------
Regression    155055243.172         4    38763810.793    187.149    0.0000 
Residual       12841965.276        62      207128.472                      
Total         167897208.448        66                                      
---------------------------------------------------------------------------

                                       Parameter Estimates                                         
--------------------------------------------------------------------------------------------------
      model         Beta    Std. Error    Std. Beta       t        Sig         lower        upper 
--------------------------------------------------------------------------------------------------
(Intercept)    13617.924       813.306                  16.744    0.000    11992.148    15243.700 
        RPM        0.089         0.013        0.391      6.608    0.000        0.062        0.116 
 INLET.TEMP       -9.186         0.770       -0.791    -11.923    0.000      -10.726       -7.646 
   EXH.TEMP       14.363         2.260        0.397      6.356    0.000        9.846       18.880 
    AIRFLOW       -0.848         0.437       -0.120     -1.939    0.057       -1.721        0.026 
--------------------------------------------------------------------------------------------------



No more variables satisfy the condition of p value = 0.15


Variables Removed: 

- CPRATIO 
- POWER 


Final Model Output 
------------------

                          Model Summary                            
------------------------------------------------------------------
R                       0.961       RMSE                  455.114 
R-Squared               0.924       Coef. Var               4.113 
Adj. R-Squared          0.919       MSE                207128.472 
Pred R-Squared          0.908       MAE                   340.078 
------------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 

                                   ANOVA                                    
---------------------------------------------------------------------------
                     Sum of                                                
                    Squares        DF     Mean Square       F         Sig. 
---------------------------------------------------------------------------
Regression    155055243.172         4    38763810.793    187.149    0.0000 
Residual       12841965.276        62      207128.472                      
Total         167897208.448        66                                      
---------------------------------------------------------------------------

                                       Parameter Estimates                                         
--------------------------------------------------------------------------------------------------
      model         Beta    Std. Error    Std. Beta       t        Sig         lower        upper 
--------------------------------------------------------------------------------------------------
(Intercept)    13617.924       813.306                  16.744    0.000    11992.148    15243.700 
        RPM        0.089         0.013        0.391      6.608    0.000        0.062        0.116 
 INLET.TEMP       -9.186         0.770       -0.791    -11.923    0.000      -10.726       -7.646 
   EXH.TEMP       14.363         2.260        0.397      6.356    0.000        9.846       18.880 
    AIRFLOW       -0.848         0.437       -0.120     -1.939    0.057       -1.721        0.026 
--------------------------------------------------------------------------------------------------

                            Elimination Summary                             
---------------------------------------------------------------------------
        Variable                  Adj.                                         
Step    Removed     R-Square    R-Square     C(p)        AIC         RMSE      
---------------------------------------------------------------------------
   1    CPRATIO       0.9247      0.9186    5.0497    1018.0217    455.1695    
   2    POWER         0.9235      0.9186    4.0192    1017.0947    455.1137    
---------------------------------------------------------------------------
# forward selection
#default: penter = 0.3
ols_step_forward_p(gasmod1,penter=0.15,details=TRUE)
Forward Selection Method    
---------------------------

Candidate Terms: 

1. RPM 
2. CPRATIO 
3. INLET.TEMP 
4. EXH.TEMP 
5. AIRFLOW 
6. POWER 

We are selecting variables based on p value...


Forward Selection: Step 1 

- RPM 

                          Model Summary                            
------------------------------------------------------------------
R                       0.844       RMSE                  862.007 
R-Squared               0.712       Coef. Var               7.789 
Adj. R-Squared          0.708       MSE                743056.584 
Pred R-Squared          0.696       MAE                   648.175 
------------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 

                                   ANOVA                                     
----------------------------------------------------------------------------
                     Sum of                                                 
                    Squares        DF      Mean Square       F         Sig. 
----------------------------------------------------------------------------
Regression    119598530.459         1    119598530.459    160.955    0.0000 
Residual       48298677.989        65       743056.584                      
Total         167897208.448        66                                       
----------------------------------------------------------------------------

                                     Parameter Estimates                                       
----------------------------------------------------------------------------------------------
      model        Beta    Std. Error    Std. Beta      t        Sig        lower       upper 
----------------------------------------------------------------------------------------------
(Intercept)    9470.484       164.058                 57.726    0.000    9142.838    9798.131 
        RPM       0.192         0.015        0.844    12.687    0.000       0.161       0.222 
----------------------------------------------------------------------------------------------



Forward Selection: Step 2 

- INLET.TEMP 

                          Model Summary                            
------------------------------------------------------------------
R                       0.934       RMSE                  578.322 
R-Squared               0.873       Coef. Var               5.226 
Adj. R-Squared          0.869       MSE                334456.428 
Pred R-Squared          0.859       MAE                   389.791 
------------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 

                                   ANOVA                                    
---------------------------------------------------------------------------
                     Sum of                                                
                    Squares        DF     Mean Square       F         Sig. 
---------------------------------------------------------------------------
Regression    146491997.059         2    73245998.530        219    0.0000 
Residual       21405211.389        64      334456.428                      
Total         167897208.448        66                                      
---------------------------------------------------------------------------

                                       Parameter Estimates                                        
-------------------------------------------------------------------------------------------------
      model         Beta    Std. Error    Std. Beta      t        Sig         lower        upper 
-------------------------------------------------------------------------------------------------
(Intercept)    16523.288       794.181                 20.805    0.000    14936.728    18109.847 
        RPM        0.131         0.012        0.578    10.783    0.000        0.107        0.156 
 INLET.TEMP       -5.577         0.622       -0.481    -8.967    0.000       -6.820       -4.335 
-------------------------------------------------------------------------------------------------



Forward Selection: Step 3 

- EXH.TEMP 

                          Model Summary                            
------------------------------------------------------------------
R                       0.959       RMSE                  464.980 
R-Squared               0.919       Coef. Var               4.202 
Adj. R-Squared          0.915       MSE                216206.126 
Pred R-Squared          0.907       MAE                   342.429 
------------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 

                                   ANOVA                                    
---------------------------------------------------------------------------
                     Sum of                                                
                    Squares        DF     Mean Square       F         Sig. 
---------------------------------------------------------------------------
Regression    154276222.495         3    51425407.498    237.854    0.0000 
Residual       13620985.953        63      216206.126                      
Total         167897208.448        66                                      
---------------------------------------------------------------------------

                                       Parameter Estimates                                         
--------------------------------------------------------------------------------------------------
      model         Beta    Std. Error    Std. Beta       t        Sig         lower        upper 
--------------------------------------------------------------------------------------------------
(Intercept)    14359.717       733.308                  19.582    0.000    12894.318    15825.116 
        RPM        0.105         0.011        0.463      9.818    0.000        0.084        0.127 
 INLET.TEMP       -9.223         0.787       -0.795    -11.721    0.000      -10.795       -7.650 
   EXH.TEMP       12.426         2.071        0.344      6.000    0.000        8.288       16.564 
--------------------------------------------------------------------------------------------------



Forward Selection: Step 4 

- AIRFLOW 

                          Model Summary                            
------------------------------------------------------------------
R                       0.961       RMSE                  455.114 
R-Squared               0.924       Coef. Var               4.113 
Adj. R-Squared          0.919       MSE                207128.472 
Pred R-Squared          0.908       MAE                   340.078 
------------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 

                                   ANOVA                                    
---------------------------------------------------------------------------
                     Sum of                                                
                    Squares        DF     Mean Square       F         Sig. 
---------------------------------------------------------------------------
Regression    155055243.172         4    38763810.793    187.149    0.0000 
Residual       12841965.276        62      207128.472                      
Total         167897208.448        66                                      
---------------------------------------------------------------------------

                                       Parameter Estimates                                         
--------------------------------------------------------------------------------------------------
      model         Beta    Std. Error    Std. Beta       t        Sig         lower        upper 
--------------------------------------------------------------------------------------------------
(Intercept)    13617.924       813.306                  16.744    0.000    11992.148    15243.700 
        RPM        0.089         0.013        0.391      6.608    0.000        0.062        0.116 
 INLET.TEMP       -9.186         0.770       -0.791    -11.923    0.000      -10.726       -7.646 
   EXH.TEMP       14.363         2.260        0.397      6.356    0.000        9.846       18.880 
    AIRFLOW       -0.848         0.437       -0.120     -1.939    0.057       -1.721        0.026 
--------------------------------------------------------------------------------------------------



No more variables to be added.

Variables Entered: 

+ RPM 
+ INLET.TEMP 
+ EXH.TEMP 
+ AIRFLOW 


Final Model Output 
------------------

                          Model Summary                            
------------------------------------------------------------------
R                       0.961       RMSE                  455.114 
R-Squared               0.924       Coef. Var               4.113 
Adj. R-Squared          0.919       MSE                207128.472 
Pred R-Squared          0.908       MAE                   340.078 
------------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 

                                   ANOVA                                    
---------------------------------------------------------------------------
                     Sum of                                                
                    Squares        DF     Mean Square       F         Sig. 
---------------------------------------------------------------------------
Regression    155055243.172         4    38763810.793    187.149    0.0000 
Residual       12841965.276        62      207128.472                      
Total         167897208.448        66                                      
---------------------------------------------------------------------------

                                       Parameter Estimates                                         
--------------------------------------------------------------------------------------------------
      model         Beta    Std. Error    Std. Beta       t        Sig         lower        upper 
--------------------------------------------------------------------------------------------------
(Intercept)    13617.924       813.306                  16.744    0.000    11992.148    15243.700 
        RPM        0.089         0.013        0.391      6.608    0.000        0.062        0.116 
 INLET.TEMP       -9.186         0.770       -0.791    -11.923    0.000      -10.726       -7.646 
   EXH.TEMP       14.363         2.260        0.397      6.356    0.000        9.846       18.880 
    AIRFLOW       -0.848         0.437       -0.120     -1.939    0.057       -1.721        0.026 
--------------------------------------------------------------------------------------------------

                               Selection Summary                                
-------------------------------------------------------------------------------
        Variable                    Adj.                                           
Step     Entered      R-Square    R-Square      C(p)         AIC         RMSE      
-------------------------------------------------------------------------------
   1    RPM             0.7123      0.7079    166.4933    1099.8486    862.0073    
   2    INLET.TEMP      0.8725      0.8685     40.7078    1047.3261    578.3221    
   3    EXH.TEMP        0.9189      0.9150      5.7207    1019.0405    464.9797    
   4    AIRFLOW         0.9235      0.9186      4.0192    1017.0947    455.1137    
-------------------------------------------------------------------------------
# stepwise regression
#Default: pent = 0.1, prem = 0.3
ols_step_both_p(gasmod1,pent=0.15,prem=0.15,details=TRUE)
Stepwise Selection Method   
---------------------------

Candidate Terms: 

1. RPM 
2. CPRATIO 
3. INLET.TEMP 
4. EXH.TEMP 
5. AIRFLOW 
6. POWER 

We are selecting variables based on p value...


Stepwise Selection: Step 1 

- RPM added 

                          Model Summary                            
------------------------------------------------------------------
R                       0.844       RMSE                  862.007 
R-Squared               0.712       Coef. Var               7.789 
Adj. R-Squared          0.708       MSE                743056.584 
Pred R-Squared          0.696       MAE                   648.175 
------------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 

                                   ANOVA                                     
----------------------------------------------------------------------------
                     Sum of                                                 
                    Squares        DF      Mean Square       F         Sig. 
----------------------------------------------------------------------------
Regression    119598530.459         1    119598530.459    160.955    0.0000 
Residual       48298677.989        65       743056.584                      
Total         167897208.448        66                                       
----------------------------------------------------------------------------

                                     Parameter Estimates                                       
----------------------------------------------------------------------------------------------
      model        Beta    Std. Error    Std. Beta      t        Sig        lower       upper 
----------------------------------------------------------------------------------------------
(Intercept)    9470.484       164.058                 57.726    0.000    9142.838    9798.131 
        RPM       0.192         0.015        0.844    12.687    0.000       0.161       0.222 
----------------------------------------------------------------------------------------------



Stepwise Selection: Step 2 

- INLET.TEMP added 

                          Model Summary                            
------------------------------------------------------------------
R                       0.934       RMSE                  578.322 
R-Squared               0.873       Coef. Var               5.226 
Adj. R-Squared          0.869       MSE                334456.428 
Pred R-Squared          0.859       MAE                   389.791 
------------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 

                                   ANOVA                                    
---------------------------------------------------------------------------
                     Sum of                                                
                    Squares        DF     Mean Square       F         Sig. 
---------------------------------------------------------------------------
Regression    146491997.059         2    73245998.530        219    0.0000 
Residual       21405211.389        64      334456.428                      
Total         167897208.448        66                                      
---------------------------------------------------------------------------

                                       Parameter Estimates                                        
-------------------------------------------------------------------------------------------------
      model         Beta    Std. Error    Std. Beta      t        Sig         lower        upper 
-------------------------------------------------------------------------------------------------
(Intercept)    16523.288       794.181                 20.805    0.000    14936.728    18109.847 
        RPM        0.131         0.012        0.578    10.783    0.000        0.107        0.156 
 INLET.TEMP       -5.577         0.622       -0.481    -8.967    0.000       -6.820       -4.335 
-------------------------------------------------------------------------------------------------



Stepwise Selection: Step 3 

- EXH.TEMP added 

                          Model Summary                            
------------------------------------------------------------------
R                       0.959       RMSE                  464.980 
R-Squared               0.919       Coef. Var               4.202 
Adj. R-Squared          0.915       MSE                216206.126 
Pred R-Squared          0.907       MAE                   342.429 
------------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 

                                   ANOVA                                    
---------------------------------------------------------------------------
                     Sum of                                                
                    Squares        DF     Mean Square       F         Sig. 
---------------------------------------------------------------------------
Regression    154276222.495         3    51425407.498    237.854    0.0000 
Residual       13620985.953        63      216206.126                      
Total         167897208.448        66                                      
---------------------------------------------------------------------------

                                       Parameter Estimates                                         
--------------------------------------------------------------------------------------------------
      model         Beta    Std. Error    Std. Beta       t        Sig         lower        upper 
--------------------------------------------------------------------------------------------------
(Intercept)    14359.717       733.308                  19.582    0.000    12894.318    15825.116 
        RPM        0.105         0.011        0.463      9.818    0.000        0.084        0.127 
 INLET.TEMP       -9.223         0.787       -0.795    -11.721    0.000      -10.795       -7.650 
   EXH.TEMP       12.426         2.071        0.344      6.000    0.000        8.288       16.564 
--------------------------------------------------------------------------------------------------



Stepwise Selection: Step 4 

- AIRFLOW added 

                          Model Summary                            
------------------------------------------------------------------
R                       0.961       RMSE                  455.114 
R-Squared               0.924       Coef. Var               4.113 
Adj. R-Squared          0.919       MSE                207128.472 
Pred R-Squared          0.908       MAE                   340.078 
------------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 

                                   ANOVA                                    
---------------------------------------------------------------------------
                     Sum of                                                
                    Squares        DF     Mean Square       F         Sig. 
---------------------------------------------------------------------------
Regression    155055243.172         4    38763810.793    187.149    0.0000 
Residual       12841965.276        62      207128.472                      
Total         167897208.448        66                                      
---------------------------------------------------------------------------

                                       Parameter Estimates                                         
--------------------------------------------------------------------------------------------------
      model         Beta    Std. Error    Std. Beta       t        Sig         lower        upper 
--------------------------------------------------------------------------------------------------
(Intercept)    13617.924       813.306                  16.744    0.000    11992.148    15243.700 
        RPM        0.089         0.013        0.391      6.608    0.000        0.062        0.116 
 INLET.TEMP       -9.186         0.770       -0.791    -11.923    0.000      -10.726       -7.646 
   EXH.TEMP       14.363         2.260        0.397      6.356    0.000        9.846       18.880 
    AIRFLOW       -0.848         0.437       -0.120     -1.939    0.057       -1.721        0.026 
--------------------------------------------------------------------------------------------------



No more variables to be added/removed.


Final Model Output 
------------------

                          Model Summary                            
------------------------------------------------------------------
R                       0.961       RMSE                  455.114 
R-Squared               0.924       Coef. Var               4.113 
Adj. R-Squared          0.919       MSE                207128.472 
Pred R-Squared          0.908       MAE                   340.078 
------------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 

                                   ANOVA                                    
---------------------------------------------------------------------------
                     Sum of                                                
                    Squares        DF     Mean Square       F         Sig. 
---------------------------------------------------------------------------
Regression    155055243.172         4    38763810.793    187.149    0.0000 
Residual       12841965.276        62      207128.472                      
Total         167897208.448        66                                      
---------------------------------------------------------------------------

                                       Parameter Estimates                                         
--------------------------------------------------------------------------------------------------
      model         Beta    Std. Error    Std. Beta       t        Sig         lower        upper 
--------------------------------------------------------------------------------------------------
(Intercept)    13617.924       813.306                  16.744    0.000    11992.148    15243.700 
        RPM        0.089         0.013        0.391      6.608    0.000        0.062        0.116 
 INLET.TEMP       -9.186         0.770       -0.791    -11.923    0.000      -10.726       -7.646 
   EXH.TEMP       14.363         2.260        0.397      6.356    0.000        9.846       18.880 
    AIRFLOW       -0.848         0.437       -0.120     -1.939    0.057       -1.721        0.026 
--------------------------------------------------------------------------------------------------

                                Stepwise Selection Summary                                  
-------------------------------------------------------------------------------------------
                       Added/                   Adj.                                           
Step     Variable     Removed     R-Square    R-Square      C(p)         AIC         RMSE      
-------------------------------------------------------------------------------------------
   1       RPM        addition       0.712       0.708    166.4930    1099.8486    862.0073    
   2    INLET.TEMP    addition       0.873       0.869     40.7080    1047.3261    578.3221    
   3     EXH.TEMP     addition       0.919       0.915      5.7210    1019.0405    464.9797    
   4     AIRFLOW      addition       0.924       0.919      4.0190    1017.0947    455.1137    
-------------------------------------------------------------------------------------------
  • Backwards elimination: rpm, inlet.temp, exh.temp, airflow
  • Forward Selection: rpm, inlet.temp, exh.temp, airflow
  • Stepwise Regression: rpm, inlet.temp, exh.temp, airflow
  • In this instance all three procedures resulted in the same model. This may not always happen.
  • We will use this set of predictors to fit a model and determine if our problem of severe multicollinearity has been resolved.
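
The AIC column in the Stepwise Selection Summary can be reproduced from each step's residual sum of squares. The sketch below assumes the convention used by base R's AIC() for linear models, which counts the error variance as an estimated parameter. The SSE values are copied from the ANOVA tables for Steps 2 through 4, and the helper name aic_from_sse is made up for illustration.

```r
# Sketch: recompute AIC from the residual sum of squares (SSE) of each step.
# Assumed convention: AIC = n*log(SSE/n) + n*(1 + log(2*pi)) + 2*(k + 2),
# where k is the number of predictors (k + 1 coefficients plus the error variance).
aic_from_sse <- function(sse, n, k) {
  n * log(sse / n) + n * (1 + log(2 * pi)) + 2 * (k + 2)
}

n <- 67  # number of turbines
steps <- data.frame(
  step = 2:4,
  k    = 2:4,                                          # predictors in the model
  sse  = c(21405211.389, 13620985.953, 12841965.276)  # Residual SS from the ANOVA tables
)
steps$aic <- aic_from_sse(steps$sse, n, steps$k)
print(round(steps$aic, 2))
```

The recomputed values agree with the reported 1047.3261, 1019.0405, and 1017.0947 to about two decimal places. Smaller AIC is better, so adding AIRFLOW in Step 4 improves the criterion by only about 2 units, which is consistent with its borderline p-value of 0.057.
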
#updated model
gasmod2<-lm(HEATRATE~.-ENGINE-SHAFTS-POWER-CPRATIO,data=gasturbine)
# Syntax Note: We can use the . to indicate all the variables in the data frame
# And use the - to exclude a particular variable from the model
summary(gasmod2)

Call:
lm(formula = HEATRATE ~ . - ENGINE - SHAFTS - POWER - CPRATIO, 
    data = gasturbine)

Residuals:
    Min      1Q  Median      3Q     Max 
-1007.7  -290.5  -106.0   240.1  1414.8 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  1.362e+04  8.133e+02  16.744  < 2e-16 ***
RPM          8.882e-02  1.344e-02   6.608 1.02e-08 ***
INLET.TEMP  -9.186e+00  7.704e-01 -11.923  < 2e-16 ***
EXH.TEMP     1.436e+01  2.260e+00   6.356 2.76e-08 ***
AIRFLOW     -8.475e-01  4.370e-01  -1.939    0.057 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 455.1 on 62 degrees of freedom
Multiple R-squared:  0.9235,    Adjusted R-squared:  0.9186 
F-statistic: 187.1 on 4 and 62 DF,  p-value: < 2.2e-16
gasmod2vif<-round(vif(gasmod2),3)
gasmod2vif
       RPM INLET.TEMP   EXH.TEMP    AIRFLOW 
     2.840      3.572      3.170      3.128 
mean(gasmod2vif)
[1] 3.1775

The average VIF is about 3.2, and no individual VIF exceeds 4. We conclude that multicollinearity is no longer a severe issue and we can continue with our analysis. Moving forward, we might consider adding the remaining qualitative variables, interactions, and higher-order terms (or variable transformations). We would then go on to assess the model.
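
To make the VIF values less of a black box, here is a minimal sketch of how a VIF can be computed by hand: regress each predictor on the others and take 1 / (1 - R²). The data are simulated stand-ins (x1, x2, x3 are hypothetical names, not columns of the gasturbine file), so only the last two lines use numbers from this example.

```r
# Sketch: VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
# predictor j on all the other predictors.
# The data are simulated stand-ins, not the gasturbine file.
set.seed(3220)
n  <- 67
x1 <- rnorm(n)
x2 <- 0.8 * x1 + rnorm(n, sd = 0.6)   # deliberately correlated with x1
x3 <- rnorm(n)
preds <- data.frame(x1, x2, x3)

manual_vif <- sapply(names(preds), function(v) {
  others <- setdiff(names(preds), v)
  aux    <- lm(reformulate(others, response = v), data = preds)
  1 / (1 - summary(aux)$r.squared)
})
print(round(manual_vif, 3))

# Going the other way: the R_j^2 implied by the VIFs reported for gasmod2
reported_vif <- c(RPM = 2.840, INLET.TEMP = 3.572, EXH.TEMP = 3.170, AIRFLOW = 3.128)
print(round(1 - 1 / reported_vif, 3))
```

Read this way, INLET.TEMP's VIF of 3.572 means that roughly 72% of its variation is explained by the other three predictors. That is a noticeable overlap, but far from the near-perfect overlap that signals severe multicollinearity.
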

Unit 3.2: Residuals and Assumptions

Step 3: Estimate the Model Parameters

Reminder: we looked at multicollinearity and variable screening and ended up with a starting subset of predictors: RPM, INLET.TEMP, EXH.TEMP, and AIRFLOW.

#updated model
gasmod2<-lm(HEATRATE~.-ENGINE-SHAFTS-POWER-CPRATIO,data=gasturbine)
# Syntax Note: We can use the . to indicate all the variables in the data frame
# And use the - to exclude a particular variable from the model
summary(gasmod2)

Call:
lm(formula = HEATRATE ~ . - ENGINE - SHAFTS - POWER - CPRATIO, 
    data = gasturbine)

Residuals:
    Min      1Q  Median      3Q     Max 
-1007.7  -290.5  -106.0   240.1  1414.8 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  1.362e+04  8.133e+02  16.744  < 2e-16 ***
RPM          8.882e-02  1.344e-02   6.608 1.02e-08 ***
INLET.TEMP  -9.186e+00  7.704e-01 -11.923  < 2e-16 ***
EXH.TEMP     1.436e+01  2.260e+00   6.356 2.76e-08 ***
AIRFLOW     -8.475e-01  4.370e-01  -1.939    0.057 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 455.1 on 62 degrees of freedom
Multiple R-squared:  0.9235,    Adjusted R-squared:  0.9186 
F-statistic: 187.1 on 4 and 62 DF,  p-value: < 2.2e-16
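
The syntax note about . and - can be checked directly. The sketch below builds a simulated data frame with the same column names as gasturbine (the values and the name fake are invented for illustration) and confirms that the shorthand formula expands to the four predictors written out explicitly.

```r
# Sketch: the "." and "-" formula shorthand expands to an explicit predictor list.
# Simulated stand-in data with the same column names as gasturbine (values are made up).
set.seed(1)
n <- 40
fake <- data.frame(
  ENGINE     = factor(sample(c("A", "B", "C"), n, replace = TRUE)),
  SHAFTS     = sample(1:3, n, replace = TRUE),
  RPM        = rnorm(n, 10000, 3000),
  CPRATIO    = rnorm(n, 15, 4),
  INLET.TEMP = rnorm(n, 1200, 100),
  EXH.TEMP   = rnorm(n, 550, 40),
  AIRFLOW    = rnorm(n, 200, 80),
  POWER      = rnorm(n, 50000, 20000)
)
fake$HEATRATE <- 11000 + 0.1 * fake$RPM - 9 * (fake$INLET.TEMP - 1200) + rnorm(n, sd = 400)

fit_dot      <- lm(HEATRATE ~ . - ENGINE - SHAFTS - POWER - CPRATIO, data = fake)
fit_explicit <- lm(HEATRATE ~ RPM + INLET.TEMP + EXH.TEMP + AIRFLOW, data = fake)

print(attr(terms(fit_dot), "term.labels"))           # predictors the shorthand expands to
print(all.equal(coef(fit_dot), coef(fit_explicit)))  # same fitted coefficients
```

One consequence of the shorthand is that . picks up every column in the data frame, including any added later. A new column therefore has to be subtracted explicitly, or the formula written out in full.
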

Try adding the qualitative variable ENGINE.

#updated model
gasmod3<-lm(HEATRATE~.-SHAFTS-POWER-CPRATIO,data=gasturbine)
# Syntax Note: We can use the . to indicate all the variables in the data frame
# And use the - to exclude a particular variable from the model
summary(gasmod3)

Call:
lm(formula = HEATRATE ~ . - SHAFTS - POWER - CPRATIO, data = gasturbine)

Residuals:
     Min       1Q   Median       3Q      Max 
-1002.38  -257.50   -60.08   252.72  1397.12 

Coefficients:
                    Estimate Std. Error t value Pr(>|t|)    
(Intercept)        1.537e+04  1.284e+03  11.968  < 2e-16 ***
ENGINEAeroderiv   -4.245e+02  2.806e+02  -1.513   0.1356    
ENGINETraditional -3.772e+02  2.183e+02  -1.728   0.0891 .  
RPM                9.065e-02  1.459e-02   6.214 5.38e-08 ***
INLET.TEMP        -9.908e+00  9.161e-01 -10.816 1.01e-15 ***
EXH.TEMP           1.314e+01  2.377e+00   5.530 7.36e-07 ***
AIRFLOW           -8.534e-01  4.338e-01  -1.967   0.0538 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 450.8 on 60 degrees of freedom
Multiple R-squared:  0.9274,    Adjusted R-squared:  0.9201 
F-statistic: 127.7 on 6 and 60 DF,  p-value: < 2.2e-16

Perform a nested F test to determine whether engine type is significant.

redgasmod3<-lm(HEATRATE~.-ENGINE-SHAFTS-POWER-CPRATIO,data=gasturbine)
anova(redgasmod3,gasmod3)
  • Hypotheses:
    • \(H_0: \beta_1=\beta_2=0\) (the engine type does not contribute to predicting heat rate)
    • \(H_a:\) at least one of \(\beta_1, \beta_2 \neq 0\) (the engine type contributes to predicting heat rate)
  • Distribution of test statistic: F with 2 and 60 DF
  • Test Statistic: F=1.603
  • Pvalue: 0.2098
  • Decision: 0.2098>0.05 -> FAIL TO REJECT H0
  • Conclusion: Engine type is not a significant predictor of heat rate. We will remove both dummy variables from the model and not test them individually.
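
The nested F statistic reported by anova() can be approximated by hand from output already shown. In the sketch below, nested_f is a hypothetical helper name. The reduced-model SSE and DF come from the ANOVA table, while the full-model SSE is not printed anywhere, so it is rebuilt from the rounded residual standard error. That rebuilt value is an assumption of this sketch.

```r
# Sketch: nested (partial) F test computed by hand.
# F = ((SSE_reduced - SSE_full) / (df_reduced - df_full)) / (SSE_full / df_full)
nested_f <- function(sse_reduced, sse_full, df_reduced, df_full) {
  num_df <- df_reduced - df_full
  f      <- ((sse_reduced - sse_full) / num_df) / (sse_full / df_full)
  c(F = f, num_df = num_df, den_df = df_full,
    p_value = pf(f, num_df, df_full, lower.tail = FALSE))
}

# Reduced model (no ENGINE): Residual SS and DF from the ANOVA table.
sse_reduced <- 12841965.276; df_reduced <- 62
# Full model (with ENGINE): SSE is not printed, so it is approximated here from
# the rounded residual standard error (450.8 on 60 DF).
sse_full <- 450.8^2 * 60; df_full <- 60

print(round(nested_f(sse_reduced, sse_full, df_reduced, df_full), 3))

# p-value for the reported test statistic
print(round(pf(1.603, df1 = 2, df2 = 60, lower.tail = FALSE), 4))
```

The hand calculation gives F of about 1.60 rather than exactly 1.603 because the full model's SSE is rebuilt from a rounded standard error. The p-value for F = 1.603 on 2 and 60 DF matches the reported 0.2098.
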

For the sake of this example, we will not consider any further testing.

Step 4: Specify the distribution of the errors and find the estimate of the variance

Step 5: Evaluate the Utility of the model

Step 6: Check the Model Assumptions

First, we will evaluate the model assumptions.

#There are a few options to view plots for assumptions

#Residuals Plots of explanatory variables vs residuals
residualPlots(gasmod3,tests=F)

#Residual vs Fitted and QQ plot
plot(gasmod3, which=c(1,2))

#histogram of residuals
hist(residuals(gasmod3))

  • Lack of Fit Plot: Residual Plots
  • Constant Variance: Residual Plots and Residual vs Fitted
  • Normality: QQ Plot and Histogram of Residuals
  • Independence: We do not have time series data

Then we will identify outliers and influential observations.

#Cooks Distance Thresholds
plot(gasmod3,which=4)

#Leverage vs Studentized Residuals
influencePlot(gasmod3,fill=F)
#We can use various functions to store and view these statistics for all observations

dffits(gasmod3)
           1            2            3            4            5            6 
 0.760196099  0.484742791 -0.202119453 -0.458551481 -0.107150221 -0.029733726 
           7            8            9           10           11           12 
-0.022548928  0.138806983 -0.008912956 -0.026512455  0.883790595  0.462687401 
          13           14           15           16           17           18 
-0.060732559 -0.040992246  0.224449807 -0.123046747 -0.155064888 -0.183487684 
          19           20           21           22           23           24 
-0.015050769  0.178189075  0.445846820  0.107259411 -0.055098516 -0.103878755 
          25           26           27           28           29           30 
 0.018100698 -0.007710417 -0.102430426 -0.056103363 -0.193438489 -0.197493559 
          31           32           33           34           35           36 
-0.207168823 -0.794247664  0.055167417 -0.030580437 -0.293112289  1.135238201 
          37           38           39           40           41           42 
 0.085446025  0.099343488 -0.424322676 -0.458157215 -0.139615819 -0.129422848 
          43           44           45           46           47           48 
-0.215621851 -0.116659552  0.810400278  0.025727980  0.846040408  0.183644637 
          49           50           51           52           53           54 
-0.145327656  0.258925459 -0.041409259  0.495274858  0.256127721 -0.220523608 
          55           56           57           58           59           60 
 0.063572125 -0.199340056 -0.086775927 -0.336206852  0.168821930 -0.302900575 
          61           62           63           64           65           66 
 0.776485175 -0.699216883 -0.329478620  0.995300017 -0.358519481  0.545795996 
          67 
-0.249149287 
dfbetas(gasmod3)
     (Intercept) ENGINEAeroderiv ENGINETraditional           RPM    INLET.TEMP
1  -0.4535499243    0.0167439440       0.317180177  3.938248e-01  0.1965176870
2   0.3915692912   -0.2840894465      -0.235877400  1.782809e-01 -0.0740844654
3   0.0961433931    0.0074080582      -0.093668836 -1.111369e-01 -0.1170746327
4  -0.2774071543    0.2146974001       0.138460808 -4.837385e-02  0.0593733446
5   0.0731691716   -0.0180179579      -0.059564440 -5.350942e-03 -0.0323056489
6   0.0113656632   -0.0082633805      -0.018547530  3.539220e-03 -0.0145789601
7   0.0062129985   -0.0048764686      -0.013219338 -1.162223e-03 -0.0135147596
8   0.0133757319    0.0274485052       0.055812137 -7.762876e-03  0.0289248410
9   0.0009799309   -0.0033245107      -0.004462150  2.424455e-03  0.0006715006
10  0.0012932338   -0.0082886475      -0.012978012 -3.696201e-03 -0.0029625717
11 -0.0123943863   -0.2908018941       0.034210440  3.966546e-01 -0.0456822310
12 -0.0036620829   -0.0474799774      -0.002062140 -1.930113e-01 -0.2213322739
13  0.0477937569   -0.0131807399      -0.039592299 -1.734180e-02 -0.0368240576
14 -0.0149832061    0.0099935533       0.002551843  5.366802e-03 -0.0036521416
15  0.1313305676   -0.0462874245      -0.086412964 -1.280964e-01 -0.1733892418
16  0.0552535976   -0.0432585116      -0.052577665  8.330807e-02  0.0229444459
17  0.0266782395   -0.0537208621      -0.055724722  9.450724e-02  0.0634771006
18  0.0288759242   -0.0702760154      -0.074494525  7.660335e-02  0.0702299359
19  0.0062248536   -0.0075777093      -0.010396932  6.109760e-04 -0.0015021213
20 -0.0759357844    0.0008412255       0.087294404  8.111878e-02  0.1319065697
21 -0.1460586844    0.1191047370       0.133584778 -3.346838e-01 -0.0497176797
22 -0.0275126025    0.0275993461       0.054844838 -3.614848e-02  0.0271570301
23  0.0071677611   -0.0202522339      -0.027739156  1.510395e-02  0.0058105336
24  0.0374982623   -0.0480830807      -0.058146665  4.637035e-02  0.0213650092
25 -0.0044763132    0.0073417912       0.011701421 -1.015495e-03  0.0042674656
26 -0.0018006339   -0.0013472437      -0.001929857  2.257055e-04  0.0013319221
27  0.0112514145   -0.0375726776      -0.048227696  9.076031e-03  0.0152878319
28 -0.0241955852    0.0279787368       0.020532149 -2.272197e-02  0.0274493744
29 -0.1627086434    0.1161310699       0.103750410 -6.850409e-02  0.0506679177
30 -0.0025525613    0.0594362970      -0.012372040 -5.339859e-02  0.0064318234
31  0.0211130553    0.0503469195      -0.045499356 -9.955372e-02 -0.0768971635
32 -0.0448072255    0.2149531002      -0.128394688 -5.266997e-01 -0.5018550732
33  0.0290556398   -0.0196906407      -0.009561385  6.672910e-03  0.0106649412
34 -0.0054372832    0.0055156016      -0.002956651 -9.909128e-05 -0.0100234437
35 -0.0130971643    0.0573362366      -0.061774019 -1.101131e-01 -0.1642393772
36  0.3228113257   -0.0263005519      -0.258400386 -7.999446e-01 -0.9716864002
37  0.0180815706    0.0134951327       0.010758803 -4.178265e-02 -0.0445846341
38  0.0585762335   -0.0063267033      -0.012704594 -1.401378e-02 -0.0474052060
39 -0.1966438156    0.1649359223       0.066467067 -5.656278e-02 -0.0450794837
40 -0.1239033000    0.2440325603       0.279959173  5.549717e-02 -0.0318658861
41  0.0069030482    0.0360677900       0.061925112  7.299118e-02  0.0455837507
42  0.0085612808    0.0254047045       0.048642943  3.815236e-02  0.0386796632
43  0.0968070776   -0.0124275031       0.009109390  8.024291e-02  0.0353591512
44 -0.0150629668    0.0196585692       0.035845023 -9.703150e-03  0.0331577279
45  0.0910556837   -0.1393077827      -0.045523501  4.885240e-01  0.4152678357
46 -0.0093241803    0.0024221334       0.003677005  6.106168e-03  0.0056616828
47  0.1356881026   -0.3905633714      -0.481810392 -2.443774e-01 -0.0615592126
48  0.0373132314   -0.0916703593      -0.111863502 -4.483104e-02 -0.0142594672
49  0.0304264979    0.0157242805       0.037235788  4.461512e-02  0.0343471466
50 -0.0669050659   -0.0160171372       0.011162221  9.393020e-02  0.1288797539
51 -0.0086760187    0.0094867982       0.012646285 -1.169426e-02  0.0023785949
52 -0.0018688689   -0.0433994260       0.005787565  2.748199e-01  0.2030320743
53  0.1160875703   -0.1506162485      -0.170904553  4.540871e-03  0.0327828718
54  0.0264751584    0.0550668550       0.085205917  1.015311e-01  0.0384182200
55  0.0373202574   -0.0361073931      -0.046685184  5.643887e-03 -0.0082357370
56 -0.0265172134    0.0727437488       0.092023418 -4.973490e-03 -0.0127555308
57  0.0043079474    0.0216131136       0.031086648  8.759746e-03  0.0022490996
58 -0.1211446207    0.1105331955       0.206079477  8.946833e-02  0.2126689083
59 -0.0433021915   -0.0124230796       0.005613581  5.952930e-02  0.0877642966
60 -0.0934189979    0.0829915950       0.123591896 -5.807800e-02  0.0698530282
61  0.2819978706    0.0022374921      -0.264923268  3.779586e-01 -0.2417315553
62  0.0436283804   -0.2189487851       0.043492562 -2.262431e-01  0.1611606021
63  0.0167039636   -0.1913930169       0.013100275  3.795333e-02  0.0729760773
64  0.0440009760    0.4820952976       0.157861070  1.115376e-02  0.5450767970
65 -0.0761241224   -0.1848000724       0.018848460  1.239129e-01 -0.0408620003
66 -0.3050670451    0.4319674273       0.218933117 -7.325678e-02  0.1591430104
67  0.0252085113   -0.1607132597       0.002572737  1.034296e-01  0.0808525882
        EXH.TEMP      AIRFLOW
1   0.2296344844 -0.001764431
2  -0.3226764693  0.127169860
3   0.0163179443  0.022343955
4   0.2021781858  0.033626555
5  -0.0455820200  0.048050440
6   0.0011619152  0.007721298
7   0.0059494329  0.002262205
8  -0.0431702299  0.040097008
9  -0.0012787190 -0.002415159
10  0.0039932789 -0.017598497
11  0.0367257879 -0.015001787
12  0.2399491776 -0.234712343
13 -0.0134781465  0.018467025
14  0.0151607893  0.012682091
15  0.0424715587 -0.037362967
16 -0.0849660197  0.071445954
17 -0.0875277591  0.017895379
18 -0.0871565457 -0.037731980
19 -0.0035851133 -0.006460154
20 -0.0453584335 -0.033514126
21  0.2399282078 -0.335231712
22  0.0080300551 -0.036502840
23 -0.0105440758 -0.013942912
24 -0.0553371116 -0.002782808
25 -0.0003839523  0.005845356
26  0.0011207448 -0.004681146
27 -0.0168529480 -0.054946469
28  0.0011779680 -0.015070421
29  0.1170779487 -0.059057370
30 -0.0058848168  0.035683677
31  0.0496714033  0.023347699
32  0.5200453184 -0.080793285
33 -0.0360513597 -0.004608631
34  0.0125833155  0.009802877
35  0.1580002310  0.032232409
36  0.6358697215 -0.290633117
37  0.0230630042  0.010801334
38 -0.0187905258  0.051650950
39  0.2162612880  0.062062610
40  0.0854578279  0.245525735
41 -0.0693213117  0.093771200
42 -0.0547506225  0.019285151
43 -0.1452165765  0.070369879
44 -0.0132155479 -0.059338171
45 -0.5130448855  0.519556368
46  0.0031545599  0.010938454
47  0.0562696492 -0.590679228
48  0.0044587989 -0.117353276
49 -0.0733397385  0.028225244
50 -0.0527184589  0.079995026
51  0.0077204671 -0.025875603
52 -0.2164704094  0.343959391
53 -0.1140757033 -0.078097081
54 -0.0950226035  0.156908152
55 -0.0254005492  0.011933315
56  0.0227443752 -0.004984691
57 -0.0131908682  0.005460117
58 -0.0911498217 -0.049446953
59 -0.0368867380  0.045357851
60  0.0358202972 -0.181797200
61 -0.1117080138  0.349597036
62 -0.1526398283 -0.151325887
63 -0.0857270389  0.028858134
64 -0.5237985830 -0.001945503
65  0.0932263398  0.059656021
66  0.1684668241 -0.170453367
67 -0.1085547000  0.066224326
#We can subset by the observations of interest
dffits(gasmod3)[c(11,36,61,64)]
       11        36        61        64 
0.8837906 1.1352382 0.7764852 0.9953000 
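
To decide which observations deserve a closer look, the influence statistics are usually compared with rule-of-thumb cutoffs. The sketch below uses commonly cited conventions (2*sqrt(p/n) for DFFITS, 2/sqrt(n) for DFBETAS, 4/n for Cook's distance). These are assumptions of the sketch rather than thresholds stated in this example, and only a handful of DFFITS values are copied from the output above.

```r
# Sketch: rule-of-thumb cutoffs for influence statistics (conventions, not strict rules).
n <- 67   # observations
p <- 7    # coefficients in gasmod3, including the intercept

dffits_cut  <- 2 * sqrt(p / n)   # flag |DFFITS| above this
dfbetas_cut <- 2 / sqrt(n)       # flag |DFBETAS| above this
cooks_cut   <- 4 / n             # flag Cook's distance above this
print(round(c(dffits = dffits_cut, dfbetas = dfbetas_cut, cooks = cooks_cut), 3))

# A few DFFITS values copied from the output above, named by observation number
dffits_vals <- c("1" = 0.760196099, "11" = 0.883790595, "32" = -0.794247664,
                 "36" = 1.135238201, "45" = 0.810400278, "47" = 0.846040408,
                 "52" = 0.495274858, "61" = 0.776485175, "62" = -0.699216883,
                 "64" = 0.995300017, "66" = 0.545795996)
flagged <- names(dffits_vals)[abs(dffits_vals) > dffits_cut]
print(flagged)
```

With the fitted model in memory, the same screen would be which(abs(dffits(gasmod3)) > dffits_cut). In the full DFFITS output, nine observations exceed the 0.646 cutoff, which is more than the four singled out here. The cutoffs are best treated as a screening aid to be read alongside the Cook's distance and influence plots.
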

Next Steps to resolve

Variable transformation: We can directly use a function of the response variable within lm() or add a new variable to the data frame.

#Create new variable
gasturbine$sqrty<-sqrt(gasturbine$HEATRATE)

#remember to exclude the original response (HEATRATE) from the predictors
gasmodsqrt<-lm(sqrty~.-ENGINE-SHAFTS-POWER-CPRATIO-HEATRATE,data=gasturbine)
summary(gasmodsqrt)

Call:
lm(formula = sqrty ~ . - ENGINE - SHAFTS - POWER - CPRATIO - 
    HEATRATE, data = gasturbine)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.3749 -1.3484 -0.3872  1.1073  6.1302 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  1.172e+02  3.616e+00  32.409  < 2e-16 ***
RPM          3.631e-04  5.976e-05   6.076 8.28e-08 ***
INLET.TEMP  -4.344e-02  3.425e-03 -12.683  < 2e-16 ***
EXH.TEMP     6.904e-02  1.005e-02   6.872 3.58e-09 ***
AIRFLOW     -5.240e-03  1.943e-03  -2.697    0.009 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.023 on 62 degrees of freedom
Multiple R-squared:  0.9289,    Adjusted R-squared:  0.9243 
F-statistic: 202.6 on 4 and 62 DF,  p-value: < 2.2e-16
#Or directly transform in the lm() function
#remember to exclude the transformed column (sqrty) from the predictors
gasmodsqrt2<-lm(sqrt(HEATRATE)~.-ENGINE-SHAFTS-POWER-CPRATIO-sqrty,data=gasturbine)
summary(gasmodsqrt2)

Call:
lm(formula = sqrt(HEATRATE) ~ . - ENGINE - SHAFTS - POWER - CPRATIO - 
    sqrty, data = gasturbine)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.3749 -1.3484 -0.3872  1.1073  6.1302 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  1.172e+02  3.616e+00  32.409  < 2e-16 ***
RPM          3.631e-04  5.976e-05   6.076 8.28e-08 ***
INLET.TEMP  -4.344e-02  3.425e-03 -12.683  < 2e-16 ***
EXH.TEMP     6.904e-02  1.005e-02   6.872 3.58e-09 ***
AIRFLOW     -5.240e-03  1.943e-03  -2.697    0.009 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.023 on 62 degrees of freedom
Multiple R-squared:  0.9289,    Adjusted R-squared:  0.9243 
F-statistic: 202.6 on 4 and 62 DF,  p-value: < 2.2e-16
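
Because the response is now on the square-root scale, a prediction has to be squared to be read in heat-rate units. The sketch below does this by hand with the rounded coefficients printed above. The predictor values describe a hypothetical turbine chosen purely for illustration and are not a row of the data set.

```r
# Sketch: predict on the square-root scale, then square to return to heat-rate units.
# Coefficients are the rounded estimates printed for gasmodsqrt.
b <- c(intercept = 117.2, RPM = 3.631e-04, INLET.TEMP = -4.344e-02,
       EXH.TEMP = 6.904e-02, AIRFLOW = -5.240e-03)

# Hypothetical turbine (illustrative values only, not a row of the data set)
x <- c(RPM = 10000, INLET.TEMP = 1200, EXH.TEMP = 550, AIRFLOW = 200)

pred_sqrt     <- unname(b["intercept"] + sum(b[names(x)] * x))
pred_heatrate <- pred_sqrt^2

print(round(c(sqrt_scale = pred_sqrt, heat_rate = pred_heatrate), 2))
```

With the fitted object available, predict(gasmodsqrt, newdata = ...)^2 would do the same thing without rounding. Note also that the slopes and the residual standard error (2.023) are in square-root units, so they cannot be compared directly with those from the untransformed model.
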

Removing observations from the analysis by subsetting the data

#We can remove particular observations
subsetgas<-gasturbine[-c(11,36,61,64),]
gasmod2obs<-lm(HEATRATE~.-ENGINE-SHAFTS-POWER-CPRATIO-sqrty,data=subsetgas)
# Syntax Note: We can use the . to indicate all the variables in the data frame
# And use the - to exclude a particular variable from the model
summary(gasmod2obs)

Call:
lm(formula = HEATRATE ~ . - ENGINE - SHAFTS - POWER - CPRATIO - 
    sqrty, data = subsetgas)

Residuals:
    Min      1Q  Median      3Q     Max 
-957.18 -214.19  -93.73  233.18 1030.40 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  1.325e+04  7.218e+02  18.361  < 2e-16 ***
RPM          8.526e-02  1.370e-02   6.226 5.77e-08 ***
INLET.TEMP  -8.347e+00  8.048e-01 -10.372 7.88e-15 ***
EXH.TEMP     1.320e+01  2.316e+00   5.698 4.26e-07 ***
AIRFLOW     -9.429e-01  4.004e-01  -2.355   0.0219 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 391.6 on 58 degrees of freedom
Multiple R-squared:  0.9231,    Adjusted R-squared:  0.9178 
F-statistic:   174 on 4 and 58 DF,  p-value: < 2.2e-16
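
One way to judge how much the four removed observations mattered is to compare the estimates before and after. The sketch below copies the rounded slopes and residual standard errors from summary(gasmod2) and summary(gasmod2obs). The object names are hypothetical, and the percent-change measure is just one reasonable choice.

```r
# Sketch: percent change in each slope after dropping observations 11, 36, 61, and 64.
# Estimates are copied (rounded) from summary(gasmod2) and summary(gasmod2obs).
full_fit   <- c(RPM = 8.882e-02, INLET.TEMP = -9.186, EXH.TEMP = 14.36, AIRFLOW = -0.8475)
subset_fit <- c(RPM = 8.526e-02, INLET.TEMP = -8.347, EXH.TEMP = 13.20, AIRFLOW = -0.9429)

pct_change <- 100 * (subset_fit - full_fit) / abs(full_fit)
print(round(pct_change, 1))

# Residual standard error before and after
rse <- c(full = 455.1, subset = 391.6)
print(round(100 * (rse["subset"] - rse["full"]) / rse["full"], 1))
```

Each slope moves by roughly 4 to 11 percent and keeps its sign, the residual standard error falls by about 14 percent, and R-squared is essentially unchanged (0.9235 versus 0.9231). AIRFLOW shows the largest relative shift, and its p-value drops from 0.057 to 0.0219.
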