During periods of high electricity demand, especially during the hot summer months, the power output from a gas turbine engine can drop dramatically. One way to counter this drop in power is by cooling the inlet air to the gas turbine. An increasingly popular cooling method uses high-pressure inlet fogging. The performance of a sample of 67 gas turbines augmented with high-pressure inlet fogging was investigated in the Journal of Engineering for Gas Turbines and Power (January 2005). One measure of performance is heat rate (kilojoules per kilowatt per hour). Heat rates for the 67 gas turbines are saved in the gasturbine file.
Check the appropriateness of the response variable for regression: view a histogram of the response variable. It should be continuous and approximately unimodal and symmetric, with few outliers.
gasturbine<-read.delim("https://raw.githubusercontent.com/kvaranyak4/STAT3220/main/GASTURBINE.txt")
head(gasturbine)
names(gasturbine)
[1] "ENGINE" "SHAFTS" "RPM" "CPRATIO" "INLET.TEMP"
[6] "EXH.TEMP" "AIRFLOW" "POWER" "HEATRATE"
hist(gasturbine$HEATRATE, xlab="Heat Rate", main="Histogram of Heat Rate")
The distribution of the response variable, heat rate, is unimodal and skewed right. It is continuous, so it should still be suitable for regression.
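As a numeric complement to reading the histogram, the direction of skew can be summarized with a sample skewness coefficient. The sketch below is an illustration only: the `skewness()` helper and the small `toy` vector are hypothetical and are not part of the original analysis. With the data loaded, the same helper could be applied to `gasturbine$HEATRATE`.

```r
# Moment-based sample skewness: positive values indicate a longer right tail
skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

toy_symmetric <- c(1, 2, 3)          # hypothetical, perfectly symmetric values
toy_right     <- c(1, 2, 3, 4, 20)   # hypothetical values with one large observation

skewness(toy_symmetric)   # zero for a symmetric sample
skewness(toy_right)       # positive for a right-skewed sample
```

A clearly positive value agrees with a histogram whose tail extends to the right.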
We will explore the relationship between the response and each quantitative variable with scatter plots and correlations, and classify each relationship as linear, curvilinear, or none. For each qualitative explanatory variable, we will examine box plots and group means, then classify the relationship as existent or not. We will not explore interactions in this example.
#Scatter plots for quantitative variables
for (i in names(gasturbine)[3:8]) {
plot(gasturbine[,i], gasturbine$HEATRATE,xlab=i,ylab="Heat Rate")
}
#Correlations for quantitative variables
round(cor(gasturbine[3:8],gasturbine$HEATRATE,use="complete.obs"),3)
[,1]
RPM 0.844
CPRATIO -0.735
INLET.TEMP -0.801
EXH.TEMP -0.314
AIRFLOW -0.703
POWER -0.697
#Summary statistics for the response variable grouped by each level of the qualitative variable
tapply(gasturbine$HEATRATE,gasturbine$ENGINE,summary)
$Advanced
Min. 1st Qu. Median Mean 3rd Qu. Max.
9105 9295 9669 9764 9933 11588
$Aeroderiv
Min. 1st Qu. Median Mean 3rd Qu. Max.
8714 10708 12414 12312 13697 16243
$Traditional
Min. 1st Qu. Median Mean 3rd Qu. Max.
10086 10598 11183 11544 11956 14796
tapply(gasturbine$HEATRATE,gasturbine$SHAFTS,summary)
$`1`
Min. 1st Qu. Median Mean 3rd Qu. Max.
9105 9918 10592 10930 11674 14796
$`2`
Min. 1st Qu. Median Mean 3rd Qu. Max.
10951 11223 11654 12536 13232 16243
$`3`
Min. 1st Qu. Median Mean 3rd Qu. Max.
8714 8903 9092 9092 9280 9469
#Box plots for qualitative variables
boxplot(HEATRATE~ENGINE,gasturbine, ylab="Heat Rate")
boxplot(HEATRATE~SHAFTS,gasturbine, ylab="Heat Rate")
# Summary counts for qualitative variables
table(gasturbine$ENGINE,gasturbine$SHAFTS)
1 2 3
Advanced 21 0 0
Aeroderiv 1 4 2
Traditional 35 4 0
Do any of the explanatory variables have relationships with each other? We will look at pairwise correlations and VIF to evaluate multicollinearity in the quantitative explanatory variables.
#Regular correlation
gasturcor<-round(cor(gasturbine[,3:8]),4)
gasturcor
RPM CPRATIO INLET.TEMP EXH.TEMP AIRFLOW POWER
RPM 1.0000 -0.4903 -0.5536 -0.1715 -0.6876 -0.6169
CPRATIO -0.4903 1.0000 0.6851 0.1139 0.3826 0.4473
INLET.TEMP -0.5536 0.6851 1.0000 0.7283 0.6808 0.7503
EXH.TEMP -0.1715 0.1139 0.7283 1.0000 0.5665 0.6309
AIRFLOW -0.6876 0.3826 0.6808 0.5665 1.0000 0.9776
POWER -0.6169 0.4473 0.7503 0.6309 0.9776 1.0000
# Scatter plot matrix
plot(gasturbine[3:8])
#A new correlation function: rcorr() from the Hmisc package also reports p-values
library(Hmisc)
gasturcor2<-rcorr(as.matrix(gasturbine[,3:8]))
gasturcor2
RPM CPRATIO INLET.TEMP EXH.TEMP AIRFLOW POWER
RPM 1.00 -0.49 -0.55 -0.17 -0.69 -0.62
CPRATIO -0.49 1.00 0.69 0.11 0.38 0.45
INLET.TEMP -0.55 0.69 1.00 0.73 0.68 0.75
EXH.TEMP -0.17 0.11 0.73 1.00 0.57 0.63
AIRFLOW -0.69 0.38 0.68 0.57 1.00 0.98
POWER -0.62 0.45 0.75 0.63 0.98 1.00
n= 67
P
           RPM    CPRATIO INLET.TEMP EXH.TEMP AIRFLOW POWER
RPM               0.0000  0.0000     0.1653   0.0000  0.0000
CPRATIO    0.0000         0.0000     0.3585   0.0014  0.0001
INLET.TEMP 0.0000 0.0000             0.0000   0.0000  0.0000
EXH.TEMP   0.1653 0.3585  0.0000              0.0000  0.0000
AIRFLOW    0.0000 0.0014  0.0000     0.0000           0.0000
POWER      0.0000 0.0001  0.0000     0.0000   0.0000
#Correlation Visualization
library(corrplot)
corrplot(gasturcor)
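Several pairs of predictors are strongly correlated, which is a concern: AIRFLOW and POWER (r = 0.98) are nearly collinear, and INLET.TEMP is strongly related to POWER (r = 0.75) and EXH.TEMP (r = 0.73).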
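To make the screening of pairwise relationships concrete, the sketch below lists every predictor pair whose absolute correlation meets a cutoff. The matrix values are copied from the `cor()` output above; the cutoff of 0.7 and the names `r`, `cutoff`, and `high_pairs` are assumptions chosen for illustration, not a rule from the source.

```r
# Correlation matrix copied from the cor() output above
vars <- c("RPM", "CPRATIO", "INLET.TEMP", "EXH.TEMP", "AIRFLOW", "POWER")
r <- matrix(c(
   1.0000, -0.4903, -0.5536, -0.1715, -0.6876, -0.6169,
  -0.4903,  1.0000,  0.6851,  0.1139,  0.3826,  0.4473,
  -0.5536,  0.6851,  1.0000,  0.7283,  0.6808,  0.7503,
  -0.1715,  0.1139,  0.7283,  1.0000,  0.5665,  0.6309,
  -0.6876,  0.3826,  0.6808,  0.5665,  1.0000,  0.9776,
  -0.6169,  0.4473,  0.7503,  0.6309,  0.9776,  1.0000),
  nrow = 6, byrow = TRUE, dimnames = list(vars, vars))

cutoff <- 0.7  # hypothetical screening threshold, not from the source

# Look only at the upper triangle so each pair is reported once
idx <- which(upper.tri(r) & abs(r) >= cutoff, arr.ind = TRUE)
high_pairs <- data.frame(
  var1 = rownames(r)[idx[, "row"]],
  var2 = colnames(r)[idx[, "col"]],
  r    = r[idx],
  stringsAsFactors = FALSE
)

# Strongest relationships first
high_pairs <- high_pairs[order(-abs(high_pairs$r)), ]
print(high_pairs, row.names = FALSE)
```

With this cutoff, three pairs are flagged, led by AIRFLOW and POWER. Pairwise correlations can miss relationships involving three or more predictors, which is why VIFs are examined next.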
#Multicollinearity VIF
gasmod1<-lm(HEATRATE~.-ENGINE-SHAFTS,data=gasturbine)
# Syntax Note: We can use the . to indicate all the variables in the data frame
# And use the - to exclude a particular variable from the model
summary(gasmod1)
Call:
lm(formula = HEATRATE ~ . - ENGINE - SHAFTS, data = gasturbine)
Residuals:
Min 1Q Median 3Q Max
-1003.32 -307.35 -91.44 271.18 1405.52
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.431e+04 1.112e+03 12.869 < 2e-16 ***
RPM 8.058e-02 1.611e-02 5.002 5.25e-06 ***
CPRATIO -6.775e+00 3.038e+01 -0.223 0.824301
INLET.TEMP -9.507e+00 1.529e+00 -6.217 5.33e-08 ***
EXH.TEMP 1.415e+01 3.469e+00 4.081 0.000135 ***
AIRFLOW -2.553e+00 1.746e+00 -1.462 0.148892
POWER 4.257e-03 4.217e-03 1.009 0.316804
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 458.8 on 60 degrees of freedom
Multiple R-squared: 0.9248, Adjusted R-squared: 0.9173
F-statistic: 123 on 6 and 60 DF, p-value: < 2.2e-16
library(car) # provides vif()
gasmod1vif<-round(vif(gasmod1),3)
gasmod1vif
RPM CPRATIO INLET.TEMP EXH.TEMP AIRFLOW POWER
4.015 5.213 13.852 7.351 49.136 49.765
mean(gasmod1vif)
[1] 21.55533
Yes, there is evidence of severe multicollinearity: three VIFs exceed 10 (INLET.TEMP at 13.9, AIRFLOW at 49.1, and POWER at 49.8), and the average VIF of 21.6 is far greater than 3.
Because we have quite a few variables and severe multicollinearity, we need to address it. It is not clear from the EDA which variables should remain and which should be removed.
We will use variable selection procedures to narrow down our quantitative variables to a best set of predictors. We will use entry and removal significance levels of 0.15.
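For intuition about where a VIF comes from, it can be computed by hand as 1 / (1 - R_j^2), where R_j^2 is the R-squared from regressing predictor j on the remaining predictors. The sketch below uses simulated data, so the variables `x1`, `x2`, `x3` and the helper name `vif_by_hand` are hypothetical. For numeric predictors like these, the result should agree with what `vif()` reports.

```r
set.seed(1)
n  <- 100
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)   # nearly a copy of x1, so the pair is collinear
x3 <- rnorm(n)                  # unrelated to the others
X  <- data.frame(x1, x2, x3)

# VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing x_j on the other predictors
vif_by_hand <- sapply(names(X), function(v) {
  fit <- lm(reformulate(setdiff(names(X), v), response = v), data = X)
  1 / (1 - summary(fit)$r.squared)
})

round(vif_by_hand, 2)
mean(vif_by_hand)
```

The two near-duplicate predictors get very large VIFs while the unrelated one stays close to 1, mirroring how AIRFLOW and POWER stand out in the output above.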
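The idea behind backward elimination can be sketched in a few lines of base R: fit the full model, drop the predictor with the largest p-value if it exceeds the removal level, and repeat. This is a simplified illustration on simulated data, not the `olsrr` implementation; the function name `backward_p`, its `p_remove` argument, and the variables in `dat` are all hypothetical.

```r
set.seed(42)
n   <- 80
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
dat$y <- 5 + 3 * dat$x1 - 2 * dat$x2 + rnorm(n)   # x3 and x4 are pure noise

backward_p <- function(data, response, p_remove = 0.15) {
  keep <- setdiff(names(data), response)
  while (length(keep) > 0) {
    fit   <- lm(reformulate(keep, response = response), data = data)
    pvals <- coef(summary(fit))[keep, "Pr(>|t|)"]
    worst <- which.max(pvals)
    # Stop once every remaining predictor meets the removal level
    if (pvals[worst] <= p_remove) return(fit)
    keep <- keep[-worst]
  }
  # Nothing survived: fall back to the intercept-only model
  lm(reformulate("1", response = response), data = data)
}

final    <- backward_p(dat, "y")
selected <- names(coef(final))[-1]
selected
```

The two predictors that truly drive `y` are retained; a noise variable can occasionally survive as well, since a 0.15 removal level is fairly lenient.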
# backward elimination
library(olsrr) # provides the ols_step_*_p() functions
#Default: prem = 0.3
ols_step_backward_p(gasmod1,prem=0.15,details=TRUE)
Backward Elimination Method
---------------------------
Candidate Terms:
1 . RPM
2 . CPRATIO
3 . INLET.TEMP
4 . EXH.TEMP
5 . AIRFLOW
6 . POWER
We are eliminating variables based on p value...
- CPRATIO
Backward Elimination: Step 1
Variable CPRATIO Removed
Model Summary
------------------------------------------------------------------
R 0.962 RMSE 455.170
R-Squared 0.925 Coef. Var 4.113
Adj. R-Squared 0.919 MSE 207179.318
Pred R-Squared 0.907 MAE 336.847
------------------------------------------------------------------
RMSE: Root Mean Square Error
MSE: Mean Square Error
MAE: Mean Absolute Error
ANOVA
---------------------------------------------------------------------------
Sum of
Squares DF Mean Square F Sig.
---------------------------------------------------------------------------
Regression 155259270.080 5 31051854.016 149.879 0.0000
Residual 12637938.368 61 207179.318
Total 167897208.448 66
---------------------------------------------------------------------------
Parameter Estimates
--------------------------------------------------------------------------------------------------
model Beta Std. Error Std. Beta t Sig lower upper
--------------------------------------------------------------------------------------------------
(Intercept) 14215.194 1011.866 14.048 0.000 12191.843 16238.544
RPM 0.080 0.016 0.354 5.038 0.000 0.048 0.112
INLET.TEMP -9.769 0.969 -0.842 -10.080 0.000 -11.707 -7.831
EXH.TEMP 14.732 2.290 0.408 6.432 0.000 10.152 19.312
AIRFLOW -2.473 1.696 -0.352 -1.459 0.150 -5.864 0.917
POWER 0.004 0.004 0.239 0.992 0.325 -0.004 0.012
--------------------------------------------------------------------------------------------------
- POWER
Backward Elimination: Step 2
Variable POWER Removed
Model Summary
------------------------------------------------------------------
R 0.961 RMSE 455.114
R-Squared 0.924 Coef. Var 4.113
Adj. R-Squared 0.919 MSE 207128.472
Pred R-Squared 0.908 MAE 340.078
------------------------------------------------------------------
RMSE: Root Mean Square Error
MSE: Mean Square Error
MAE: Mean Absolute Error
ANOVA
---------------------------------------------------------------------------
Sum of
Squares DF Mean Square F Sig.
---------------------------------------------------------------------------
Regression 155055243.172 4 38763810.793 187.149 0.0000
Residual 12841965.276 62 207128.472
Total 167897208.448 66
---------------------------------------------------------------------------
Parameter Estimates
--------------------------------------------------------------------------------------------------
model Beta Std. Error Std. Beta t Sig lower upper
--------------------------------------------------------------------------------------------------
(Intercept) 13617.924 813.306 16.744 0.000 11992.148 15243.700
RPM 0.089 0.013 0.391 6.608 0.000 0.062 0.116
INLET.TEMP -9.186 0.770 -0.791 -11.923 0.000 -10.726 -7.646
EXH.TEMP 14.363 2.260 0.397 6.356 0.000 9.846 18.880
AIRFLOW -0.848 0.437 -0.120 -1.939 0.057 -1.721 0.026
--------------------------------------------------------------------------------------------------
No more variables satisfy the condition of p value = 0.15
Variables Removed:
- CPRATIO
- POWER
Final Model Output
------------------
Model Summary
------------------------------------------------------------------
R 0.961 RMSE 455.114
R-Squared 0.924 Coef. Var 4.113
Adj. R-Squared 0.919 MSE 207128.472
Pred R-Squared 0.908 MAE 340.078
------------------------------------------------------------------
RMSE: Root Mean Square Error
MSE: Mean Square Error
MAE: Mean Absolute Error
ANOVA
---------------------------------------------------------------------------
Sum of
Squares DF Mean Square F Sig.
---------------------------------------------------------------------------
Regression 155055243.172 4 38763810.793 187.149 0.0000
Residual 12841965.276 62 207128.472
Total 167897208.448 66
---------------------------------------------------------------------------
Parameter Estimates
--------------------------------------------------------------------------------------------------
model Beta Std. Error Std. Beta t Sig lower upper
--------------------------------------------------------------------------------------------------
(Intercept) 13617.924 813.306 16.744 0.000 11992.148 15243.700
RPM 0.089 0.013 0.391 6.608 0.000 0.062 0.116
INLET.TEMP -9.186 0.770 -0.791 -11.923 0.000 -10.726 -7.646
EXH.TEMP 14.363 2.260 0.397 6.356 0.000 9.846 18.880
AIRFLOW -0.848 0.437 -0.120 -1.939 0.057 -1.721 0.026
--------------------------------------------------------------------------------------------------
Elimination Summary
---------------------------------------------------------------------------
Variable Adj.
Step Removed R-Square R-Square C(p) AIC RMSE
---------------------------------------------------------------------------
1 CPRATIO 0.9247 0.9186 5.0497 1018.0217 455.1695
2 POWER 0.9235 0.9186 4.0192 1017.0947 455.1137
---------------------------------------------------------------------------
# forward selection
#default: penter = 0.3
ols_step_forward_p(gasmod1,penter=0.15,details=TRUE)
Forward Selection Method
---------------------------
Candidate Terms:
1. RPM
2. CPRATIO
3. INLET.TEMP
4. EXH.TEMP
5. AIRFLOW
6. POWER
We are selecting variables based on p value...
Forward Selection: Step 1
- RPM
Model Summary
------------------------------------------------------------------
R 0.844 RMSE 862.007
R-Squared 0.712 Coef. Var 7.789
Adj. R-Squared 0.708 MSE 743056.584
Pred R-Squared 0.696 MAE 648.175
------------------------------------------------------------------
RMSE: Root Mean Square Error
MSE: Mean Square Error
MAE: Mean Absolute Error
ANOVA
----------------------------------------------------------------------------
Sum of
Squares DF Mean Square F Sig.
----------------------------------------------------------------------------
Regression 119598530.459 1 119598530.459 160.955 0.0000
Residual 48298677.989 65 743056.584
Total 167897208.448 66
----------------------------------------------------------------------------
Parameter Estimates
----------------------------------------------------------------------------------------------
model Beta Std. Error Std. Beta t Sig lower upper
----------------------------------------------------------------------------------------------
(Intercept) 9470.484 164.058 57.726 0.000 9142.838 9798.131
RPM 0.192 0.015 0.844 12.687 0.000 0.161 0.222
----------------------------------------------------------------------------------------------
Forward Selection: Step 2
- INLET.TEMP
Model Summary
------------------------------------------------------------------
R 0.934 RMSE 578.322
R-Squared 0.873 Coef. Var 5.226
Adj. R-Squared 0.869 MSE 334456.428
Pred R-Squared 0.859 MAE 389.791
------------------------------------------------------------------
RMSE: Root Mean Square Error
MSE: Mean Square Error
MAE: Mean Absolute Error
ANOVA
---------------------------------------------------------------------------
Sum of
Squares DF Mean Square F Sig.
---------------------------------------------------------------------------
Regression 146491997.059 2 73245998.530 219 0.0000
Residual 21405211.389 64 334456.428
Total 167897208.448 66
---------------------------------------------------------------------------
Parameter Estimates
-------------------------------------------------------------------------------------------------
model Beta Std. Error Std. Beta t Sig lower upper
-------------------------------------------------------------------------------------------------
(Intercept) 16523.288 794.181 20.805 0.000 14936.728 18109.847
RPM 0.131 0.012 0.578 10.783 0.000 0.107 0.156
INLET.TEMP -5.577 0.622 -0.481 -8.967 0.000 -6.820 -4.335
-------------------------------------------------------------------------------------------------
Forward Selection: Step 3
- EXH.TEMP
Model Summary
------------------------------------------------------------------
R 0.959 RMSE 464.980
R-Squared 0.919 Coef. Var 4.202
Adj. R-Squared 0.915 MSE 216206.126
Pred R-Squared 0.907 MAE 342.429
------------------------------------------------------------------
RMSE: Root Mean Square Error
MSE: Mean Square Error
MAE: Mean Absolute Error
ANOVA
---------------------------------------------------------------------------
Sum of
Squares DF Mean Square F Sig.
---------------------------------------------------------------------------
Regression 154276222.495 3 51425407.498 237.854 0.0000
Residual 13620985.953 63 216206.126
Total 167897208.448 66
---------------------------------------------------------------------------
Parameter Estimates
--------------------------------------------------------------------------------------------------
model Beta Std. Error Std. Beta t Sig lower upper
--------------------------------------------------------------------------------------------------
(Intercept) 14359.717 733.308 19.582 0.000 12894.318 15825.116
RPM 0.105 0.011 0.463 9.818 0.000 0.084 0.127
INLET.TEMP -9.223 0.787 -0.795 -11.721 0.000 -10.795 -7.650
EXH.TEMP 12.426 2.071 0.344 6.000 0.000 8.288 16.564
--------------------------------------------------------------------------------------------------
Forward Selection: Step 4
- AIRFLOW
Model Summary
------------------------------------------------------------------
R 0.961 RMSE 455.114
R-Squared 0.924 Coef. Var 4.113
Adj. R-Squared 0.919 MSE 207128.472
Pred R-Squared 0.908 MAE 340.078
------------------------------------------------------------------
RMSE: Root Mean Square Error
MSE: Mean Square Error
MAE: Mean Absolute Error
ANOVA
---------------------------------------------------------------------------
Sum of
Squares DF Mean Square F Sig.
---------------------------------------------------------------------------
Regression 155055243.172 4 38763810.793 187.149 0.0000
Residual 12841965.276 62 207128.472
Total 167897208.448 66
---------------------------------------------------------------------------
Parameter Estimates
--------------------------------------------------------------------------------------------------
model Beta Std. Error Std. Beta t Sig lower upper
--------------------------------------------------------------------------------------------------
(Intercept) 13617.924 813.306 16.744 0.000 11992.148 15243.700
RPM 0.089 0.013 0.391 6.608 0.000 0.062 0.116
INLET.TEMP -9.186 0.770 -0.791 -11.923 0.000 -10.726 -7.646
EXH.TEMP 14.363 2.260 0.397 6.356 0.000 9.846 18.880
AIRFLOW -0.848 0.437 -0.120 -1.939 0.057 -1.721 0.026
--------------------------------------------------------------------------------------------------
No more variables to be added.
Variables Entered:
+ RPM
+ INLET.TEMP
+ EXH.TEMP
+ AIRFLOW
Final Model Output
------------------
Model Summary
------------------------------------------------------------------
R 0.961 RMSE 455.114
R-Squared 0.924 Coef. Var 4.113
Adj. R-Squared 0.919 MSE 207128.472
Pred R-Squared 0.908 MAE 340.078
------------------------------------------------------------------
RMSE: Root Mean Square Error
MSE: Mean Square Error
MAE: Mean Absolute Error
ANOVA
---------------------------------------------------------------------------
Sum of
Squares DF Mean Square F Sig.
---------------------------------------------------------------------------
Regression 155055243.172 4 38763810.793 187.149 0.0000
Residual 12841965.276 62 207128.472
Total 167897208.448 66
---------------------------------------------------------------------------
Parameter Estimates
--------------------------------------------------------------------------------------------------
model Beta Std. Error Std. Beta t Sig lower upper
--------------------------------------------------------------------------------------------------
(Intercept) 13617.924 813.306 16.744 0.000 11992.148 15243.700
RPM 0.089 0.013 0.391 6.608 0.000 0.062 0.116
INLET.TEMP -9.186 0.770 -0.791 -11.923 0.000 -10.726 -7.646
EXH.TEMP 14.363 2.260 0.397 6.356 0.000 9.846 18.880
AIRFLOW -0.848 0.437 -0.120 -1.939 0.057 -1.721 0.026
--------------------------------------------------------------------------------------------------
Selection Summary
-------------------------------------------------------------------------------
Variable Adj.
Step Entered R-Square R-Square C(p) AIC RMSE
-------------------------------------------------------------------------------
1 RPM 0.7123 0.7079 166.4933 1099.8486 862.0073
2 INLET.TEMP 0.8725 0.8685 40.7078 1047.3261 578.3221
3 EXH.TEMP 0.9189 0.9150 5.7207 1019.0405 464.9797
4 AIRFLOW 0.9235 0.9186 4.0192 1017.0947 455.1137
-------------------------------------------------------------------------------
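Forward selection works in the opposite direction: start with no predictors and, at each step, add the candidate with the smallest p-value as long as it is below the entry level. The sketch below is a simplified base R illustration on simulated data rather than the `olsrr` implementation; `forward_p`, `p_enter`, and the variables in `dat` are hypothetical names.

```r
set.seed(42)
n   <- 80
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
dat$y <- 5 + 3 * dat$x1 - 2 * dat$x2 + rnorm(n)   # x3 and x4 are pure noise

forward_p <- function(data, response, p_enter = 0.15) {
  chosen     <- character(0)
  candidates <- setdiff(names(data), response)
  while (length(candidates) > 0) {
    # p-value each candidate would have if added to the current model
    pvals <- sapply(candidates, function(v) {
      fit <- lm(reformulate(c(chosen, v), response = response), data = data)
      coef(summary(fit))[v, "Pr(>|t|)"]
    })
    best <- which.min(pvals)
    if (pvals[best] >= p_enter) break   # no candidate meets the entry level
    chosen     <- c(chosen, candidates[best])
    candidates <- candidates[-best]
  }
  chosen
}

chosen <- forward_p(dat, "y")
chosen
```

The returned vector lists predictors in the order they entered, which parallels the "Variables Entered" summary in the output above.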
# stepwise regression
#Default: pent = 0.1, prem = 0.3
ols_step_both_p(gasmod1,pent=0.15,prem=0.15,details=TRUE)
Stepwise Selection Method
---------------------------
Candidate Terms:
1. RPM
2. CPRATIO
3. INLET.TEMP
4. EXH.TEMP
5. AIRFLOW
6. POWER
We are selecting variables based on p value...
Stepwise Selection: Step 1
- RPM added
Model Summary
------------------------------------------------------------------
R 0.844 RMSE 862.007
R-Squared 0.712 Coef. Var 7.789
Adj. R-Squared 0.708 MSE 743056.584
Pred R-Squared 0.696 MAE 648.175
------------------------------------------------------------------
RMSE: Root Mean Square Error
MSE: Mean Square Error
MAE: Mean Absolute Error
ANOVA
----------------------------------------------------------------------------
Sum of
Squares DF Mean Square F Sig.
----------------------------------------------------------------------------
Regression 119598530.459 1 119598530.459 160.955 0.0000
Residual 48298677.989 65 743056.584
Total 167897208.448 66
----------------------------------------------------------------------------
Parameter Estimates
----------------------------------------------------------------------------------------------
model Beta Std. Error Std. Beta t Sig lower upper
----------------------------------------------------------------------------------------------
(Intercept) 9470.484 164.058 57.726 0.000 9142.838 9798.131
RPM 0.192 0.015 0.844 12.687 0.000 0.161 0.222
----------------------------------------------------------------------------------------------
Stepwise Selection: Step 2
- INLET.TEMP added
Model Summary
------------------------------------------------------------------
R 0.934 RMSE 578.322
R-Squared 0.873 Coef. Var 5.226
Adj. R-Squared 0.869 MSE 334456.428
Pred R-Squared 0.859 MAE 389.791
------------------------------------------------------------------
RMSE: Root Mean Square Error
MSE: Mean Square Error
MAE: Mean Absolute Error
ANOVA
---------------------------------------------------------------------------
Sum of
Squares DF Mean Square F Sig.
---------------------------------------------------------------------------
Regression 146491997.059 2 73245998.530 219 0.0000
Residual 21405211.389 64 334456.428
Total 167897208.448 66
---------------------------------------------------------------------------
Parameter Estimates
-------------------------------------------------------------------------------------------------
model Beta Std. Error Std. Beta t Sig lower upper
-------------------------------------------------------------------------------------------------
(Intercept) 16523.288 794.181 20.805 0.000 14936.728 18109.847
RPM 0.131 0.012 0.578 10.783 0.000 0.107 0.156
INLET.TEMP -5.577 0.622 -0.481 -8.967 0.000 -6.820 -4.335
-------------------------------------------------------------------------------------------------
Model Summary
------------------------------------------------------------------
R 0.934 RMSE 578.322
R-Squared 0.873 Coef. Var 5.226
Adj. R-Squared 0.869 MSE 334456.428
Pred R-Squared 0.859 MAE 389.791
------------------------------------------------------------------
RMSE: Root Mean Square Error
MSE: Mean Square Error
MAE: Mean Absolute Error
ANOVA
---------------------------------------------------------------------------
Sum of
Squares DF Mean Square F Sig.
---------------------------------------------------------------------------
Regression 146491997.059 2 73245998.530 219 0.0000
Residual 21405211.389 64 334456.428
Total 167897208.448 66
---------------------------------------------------------------------------
Parameter Estimates
-------------------------------------------------------------------------------------------------
model Beta Std. Error Std. Beta t Sig lower upper
-------------------------------------------------------------------------------------------------
(Intercept) 16523.288 794.181 20.805 0.000 14936.728 18109.847
RPM 0.131 0.012 0.578 10.783 0.000 0.107 0.156
INLET.TEMP -5.577 0.622 -0.481 -8.967 0.000 -6.820 -4.335
-------------------------------------------------------------------------------------------------
Stepwise Selection: Step 3
- EXH.TEMP added
Model Summary
------------------------------------------------------------------
R 0.959 RMSE 464.980
R-Squared 0.919 Coef. Var 4.202
Adj. R-Squared 0.915 MSE 216206.126
Pred R-Squared 0.907 MAE 342.429
------------------------------------------------------------------
RMSE: Root Mean Square Error
MSE: Mean Square Error
MAE: Mean Absolute Error
ANOVA
---------------------------------------------------------------------------
Sum of
Squares DF Mean Square F Sig.
---------------------------------------------------------------------------
Regression 154276222.495 3 51425407.498 237.854 0.0000
Residual 13620985.953 63 216206.126
Total 167897208.448 66
---------------------------------------------------------------------------
Parameter Estimates
--------------------------------------------------------------------------------------------------
model Beta Std. Error Std. Beta t Sig lower upper
--------------------------------------------------------------------------------------------------
(Intercept) 14359.717 733.308 19.582 0.000 12894.318 15825.116
RPM 0.105 0.011 0.463 9.818 0.000 0.084 0.127
INLET.TEMP -9.223 0.787 -0.795 -11.721 0.000 -10.795 -7.650
EXH.TEMP 12.426 2.071 0.344 6.000 0.000 8.288 16.564
--------------------------------------------------------------------------------------------------
Model Summary
------------------------------------------------------------------
R 0.959 RMSE 464.980
R-Squared 0.919 Coef. Var 4.202
Adj. R-Squared 0.915 MSE 216206.126
Pred R-Squared 0.907 MAE 342.429
------------------------------------------------------------------
RMSE: Root Mean Square Error
MSE: Mean Square Error
MAE: Mean Absolute Error
ANOVA
---------------------------------------------------------------------------
Sum of
Squares DF Mean Square F Sig.
---------------------------------------------------------------------------
Regression 154276222.495 3 51425407.498 237.854 0.0000
Residual 13620985.953 63 216206.126
Total 167897208.448 66
---------------------------------------------------------------------------
Parameter Estimates
--------------------------------------------------------------------------------------------------
model Beta Std. Error Std. Beta t Sig lower upper
--------------------------------------------------------------------------------------------------
(Intercept) 14359.717 733.308 19.582 0.000 12894.318 15825.116
RPM 0.105 0.011 0.463 9.818 0.000 0.084 0.127
INLET.TEMP -9.223 0.787 -0.795 -11.721 0.000 -10.795 -7.650
EXH.TEMP 12.426 2.071 0.344 6.000 0.000 8.288 16.564
--------------------------------------------------------------------------------------------------
Stepwise Selection: Step 4
- AIRFLOW added
Model Summary
------------------------------------------------------------------
R 0.961 RMSE 455.114
R-Squared 0.924 Coef. Var 4.113
Adj. R-Squared 0.919 MSE 207128.472
Pred R-Squared 0.908 MAE 340.078
------------------------------------------------------------------
RMSE: Root Mean Square Error
MSE: Mean Square Error
MAE: Mean Absolute Error
ANOVA
---------------------------------------------------------------------------
Sum of
Squares DF Mean Square F Sig.
---------------------------------------------------------------------------
Regression 155055243.172 4 38763810.793 187.149 0.0000
Residual 12841965.276 62 207128.472
Total 167897208.448 66
---------------------------------------------------------------------------
Parameter Estimates
--------------------------------------------------------------------------------------------------
model Beta Std. Error Std. Beta t Sig lower upper
--------------------------------------------------------------------------------------------------
(Intercept) 13617.924 813.306 16.744 0.000 11992.148 15243.700
RPM 0.089 0.013 0.391 6.608 0.000 0.062 0.116
INLET.TEMP -9.186 0.770 -0.791 -11.923 0.000 -10.726 -7.646
EXH.TEMP 14.363 2.260 0.397 6.356 0.000 9.846 18.880
AIRFLOW -0.848 0.437 -0.120 -1.939 0.057 -1.721 0.026
--------------------------------------------------------------------------------------------------
Model Summary
------------------------------------------------------------------
R 0.961 RMSE 455.114
R-Squared 0.924 Coef. Var 4.113
Adj. R-Squared 0.919 MSE 207128.472
Pred R-Squared 0.908 MAE 340.078
------------------------------------------------------------------
RMSE: Root Mean Square Error
MSE: Mean Square Error
MAE: Mean Absolute Error
ANOVA
---------------------------------------------------------------------------
Sum of
Squares DF Mean Square F Sig.
---------------------------------------------------------------------------
Regression 155055243.172 4 38763810.793 187.149 0.0000
Residual 12841965.276 62 207128.472
Total 167897208.448 66
---------------------------------------------------------------------------
Parameter Estimates
--------------------------------------------------------------------------------------------------
model Beta Std. Error Std. Beta t Sig lower upper
--------------------------------------------------------------------------------------------------
(Intercept) 13617.924 813.306 16.744 0.000 11992.148 15243.700
RPM 0.089 0.013 0.391 6.608 0.000 0.062 0.116
INLET.TEMP -9.186 0.770 -0.791 -11.923 0.000 -10.726 -7.646
EXH.TEMP 14.363 2.260 0.397 6.356 0.000 9.846 18.880
AIRFLOW -0.848 0.437 -0.120 -1.939 0.057 -1.721 0.026
--------------------------------------------------------------------------------------------------
No more variables to be added/removed.
Final Model Output
------------------
Model Summary
------------------------------------------------------------------
R 0.961 RMSE 455.114
R-Squared 0.924 Coef. Var 4.113
Adj. R-Squared 0.919 MSE 207128.472
Pred R-Squared 0.908 MAE 340.078
------------------------------------------------------------------
RMSE: Root Mean Square Error
MSE: Mean Square Error
MAE: Mean Absolute Error
ANOVA
---------------------------------------------------------------------------
Sum of
Squares DF Mean Square F Sig.
---------------------------------------------------------------------------
Regression 155055243.172 4 38763810.793 187.149 0.0000
Residual 12841965.276 62 207128.472
Total 167897208.448 66
---------------------------------------------------------------------------
Parameter Estimates
--------------------------------------------------------------------------------------------------
model Beta Std. Error Std. Beta t Sig lower upper
--------------------------------------------------------------------------------------------------
(Intercept) 13617.924 813.306 16.744 0.000 11992.148 15243.700
RPM 0.089 0.013 0.391 6.608 0.000 0.062 0.116
INLET.TEMP -9.186 0.770 -0.791 -11.923 0.000 -10.726 -7.646
EXH.TEMP 14.363 2.260 0.397 6.356 0.000 9.846 18.880
AIRFLOW -0.848 0.437 -0.120 -1.939 0.057 -1.721 0.026
--------------------------------------------------------------------------------------------------
Stepwise Selection Summary
-------------------------------------------------------------------------------------------
Added/ Adj.
Step Variable Removed R-Square R-Square C(p) AIC RMSE
-------------------------------------------------------------------------------------------
1 RPM addition 0.712 0.708 166.4930 1099.8486 862.0073
2 INLET.TEMP addition 0.873 0.869 40.7080 1047.3261 578.3221
3 EXH.TEMP addition 0.919 0.915 5.7210 1019.0405 464.9797
4 AIRFLOW addition 0.924 0.919 4.0190 1017.0947 455.1137
-------------------------------------------------------------------------------------------
#updated model
gasmod2<-lm(HEATRATE~.-ENGINE-SHAFTS-POWER-CPRATIO,data=gasturbine)
# Syntax Note: We can use the . to indicate all the variables in the data frame
# And use the - to exclude a particular variable from the model
summary(gasmod2)
Call:
lm(formula = HEATRATE ~ . - ENGINE - SHAFTS - POWER - CPRATIO,
data = gasturbine)
Residuals:
Min 1Q Median 3Q Max
-1007.7 -290.5 -106.0 240.1 1414.8
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.362e+04 8.133e+02 16.744 < 2e-16 ***
RPM 8.882e-02 1.344e-02 6.608 1.02e-08 ***
INLET.TEMP -9.186e+00 7.704e-01 -11.923 < 2e-16 ***
EXH.TEMP 1.436e+01 2.260e+00 6.356 2.76e-08 ***
AIRFLOW -8.475e-01 4.370e-01 -1.939 0.057 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 455.1 on 62 degrees of freedom
Multiple R-squared: 0.9235, Adjusted R-squared: 0.9186
F-statistic: 187.1 on 4 and 62 DF, p-value: < 2.2e-16
gasmod2vif<-round(vif(gasmod2),3)
gasmod2vif
RPM INLET.TEMP EXH.TEMP AIRFLOW
2.840 3.572 3.170 3.128
mean(gasmod2vif)
[1] 3.1775
The average VIF is only slightly greater than 3, and every individual VIF is now below 4. We conclude that multicollinearity is no longer a severe issue and we can continue with our analysis. Moving forward, we might consider adding the remaining qualitative variables, interactions, and higher-order terms (or variable transformations). We would then go on to assess the model.
Reminder: we looked at multicollinearity and variable screening and ended up with a starting subset of predictors: RPM, INLET.TEMP, EXH.TEMP, and AIRFLOW.
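As a syntax reminder for those possible next steps, the sketch below shows how a higher-order term and an interaction are written in an R model formula. It uses simulated data with hypothetical variables `x1` and `x2`; it is not a model fitted to the gas turbine data, and no such terms are tested in this example.

```r
set.seed(7)
n   <- 60
dat <- data.frame(x1 = runif(n, 0, 10), x2 = runif(n, 0, 10))
dat$y <- 2 + 1.5 * dat$x1 - 0.1 * dat$x1^2 + 0.3 * dat$x1 * dat$x2 + rnorm(n)

# I() protects arithmetic inside a formula; x1:x2 adds only the interaction term
fit <- lm(y ~ x1 + x2 + I(x1^2) + x1:x2, data = dat)
names(coef(fit))
```

Writing `x1*x2` instead would expand to `x1 + x2 + x1:x2`, so the main effects would not need to be listed separately.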
Next, try adding the qualitative variable ENGINE.
#updated model
gasmod3<-lm(HEATRATE~.-SHAFTS-POWER-CPRATIO,data=gasturbine)
# Syntax Note: We can use the . to indicate all the variables in the data frame
# And use the - to exclude a particular variable from the model
summary(gasmod3)
Call:
lm(formula = HEATRATE ~ . - SHAFTS - POWER - CPRATIO, data = gasturbine)
Residuals:
Min 1Q Median 3Q Max
-1002.38 -257.50 -60.08 252.72 1397.12
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.537e+04 1.284e+03 11.968 < 2e-16 ***
ENGINEAeroderiv -4.245e+02 2.806e+02 -1.513 0.1356
ENGINETraditional -3.772e+02 2.183e+02 -1.728 0.0891 .
RPM 9.065e-02 1.459e-02 6.214 5.38e-08 ***
INLET.TEMP -9.908e+00 9.161e-01 -10.816 1.01e-15 ***
EXH.TEMP 1.314e+01 2.377e+00 5.530 7.36e-07 ***
AIRFLOW -8.534e-01 4.338e-01 -1.967 0.0538 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 450.8 on 60 degrees of freedom
Multiple R-squared: 0.9274, Adjusted R-squared: 0.9201
F-statistic: 127.7 on 6 and 60 DF, p-value: < 2.2e-16
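The coefficient names ENGINEAeroderiv and ENGINETraditional come from R's default treatment (dummy) coding: the first factor level is absorbed into the intercept and every other level gets a 0/1 indicator column. A minimal sketch on a toy factor follows. The baseline level name "Advanced" is an assumption made for illustration, since the baseline does not appear in the output above.

```r
# Toy factor; "Advanced" is an ASSUMED name for the baseline level
toy <- data.frame(
  ENGINE = factor(c("Advanced", "Aeroderiv", "Traditional", "Aeroderiv"),
                  levels = c("Advanced", "Aeroderiv", "Traditional"))
)

# Design matrix that lm() would build for this factor:
# an intercept plus one indicator column per non-baseline level
X <- model.matrix(~ ENGINE, data = toy)
X
colnames(X)
```

Each ENGINE coefficient is therefore read as a shift in mean heat rate relative to the baseline level, holding the other predictors fixed.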
Perform a nested F test to determine whether engine type is significant.
redgasmod3<-lm(HEATRATE~.-ENGINE-SHAFTS-POWER-CPRATIO,data=gasturbine)
anova(redgasmod3,gasmod3)
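As a rough check on what anova() computes here, the nested F statistic can be rebuilt from the two summary() outputs printed earlier, using SSE = (residual standard error)^2 × residual degrees of freedom. The sketch below uses only those printed (rounded) values, so it is an approximation rather than a replacement for the anova() table.

```r
# Values copied from the two summary() outputs (rounded as printed)
rse_reduced  <- 455.1; df_reduced  <- 62   # model without ENGINE
rse_complete <- 450.8; df_complete <- 60   # model with ENGINE

# Residual sums of squares: SSE = (residual standard error)^2 * residual df
sse_reduced  <- rse_reduced^2  * df_reduced
sse_complete <- rse_complete^2 * df_complete

# Nested F statistic: drop in SSE per added parameter,
# divided by the complete model's mean squared error
num_df  <- df_reduced - df_complete
f_stat  <- ((sse_reduced - sse_complete) / num_df) / (sse_complete / df_complete)
p_value <- pf(f_stat, num_df, df_complete, lower.tail = FALSE)

c(F = f_stat, p = p_value)
```

With these rounded inputs the statistic comes out at roughly 1.6 on 2 and 60 degrees of freedom, which would not be significant at the 0.05 level. This agrees with the individual ENGINE t-tests above, though the anova() output remains the authoritative result.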
For the sake of this example, we will not consider any further tests of model terms.
First, we will evaluate the model assumptions.
#There are a few options to view plots for assumptions
#Residuals Plots of explanatory variables vs residuals
residualPlots(gasmod3,tests=F)
#Residual vs Fitted and QQ plot
plot(gasmod3, which=c(1,2))
#histogram of residuals
hist(residuals(gasmod3))
Then we will identify outliers and influential observations.
#Cooks Distance Thresholds
plot(gasmod3,which=4)
#Leverage vs Studentized Residuals
influencePlot(gasmod3,fill=F)
#We can use various functions to store and view these statistics for all observations
dffits(gasmod3)
1 2 3 4 5 6
0.760196099 0.484742791 -0.202119453 -0.458551481 -0.107150221 -0.029733726
7 8 9 10 11 12
-0.022548928 0.138806983 -0.008912956 -0.026512455 0.883790595 0.462687401
13 14 15 16 17 18
-0.060732559 -0.040992246 0.224449807 -0.123046747 -0.155064888 -0.183487684
19 20 21 22 23 24
-0.015050769 0.178189075 0.445846820 0.107259411 -0.055098516 -0.103878755
25 26 27 28 29 30
0.018100698 -0.007710417 -0.102430426 -0.056103363 -0.193438489 -0.197493559
31 32 33 34 35 36
-0.207168823 -0.794247664 0.055167417 -0.030580437 -0.293112289 1.135238201
37 38 39 40 41 42
0.085446025 0.099343488 -0.424322676 -0.458157215 -0.139615819 -0.129422848
43 44 45 46 47 48
-0.215621851 -0.116659552 0.810400278 0.025727980 0.846040408 0.183644637
49 50 51 52 53 54
-0.145327656 0.258925459 -0.041409259 0.495274858 0.256127721 -0.220523608
55 56 57 58 59 60
0.063572125 -0.199340056 -0.086775927 -0.336206852 0.168821930 -0.302900575
61 62 63 64 65 66
0.776485175 -0.699216883 -0.329478620 0.995300017 -0.358519481 0.545795996
67
-0.249149287
dfbetas(gasmod3)
(Intercept) ENGINEAeroderiv ENGINETraditional RPM INLET.TEMP
1 -0.4535499243 0.0167439440 0.317180177 3.938248e-01 0.1965176870
2 0.3915692912 -0.2840894465 -0.235877400 1.782809e-01 -0.0740844654
3 0.0961433931 0.0074080582 -0.093668836 -1.111369e-01 -0.1170746327
4 -0.2774071543 0.2146974001 0.138460808 -4.837385e-02 0.0593733446
5 0.0731691716 -0.0180179579 -0.059564440 -5.350942e-03 -0.0323056489
6 0.0113656632 -0.0082633805 -0.018547530 3.539220e-03 -0.0145789601
7 0.0062129985 -0.0048764686 -0.013219338 -1.162223e-03 -0.0135147596
8 0.0133757319 0.0274485052 0.055812137 -7.762876e-03 0.0289248410
9 0.0009799309 -0.0033245107 -0.004462150 2.424455e-03 0.0006715006
10 0.0012932338 -0.0082886475 -0.012978012 -3.696201e-03 -0.0029625717
11 -0.0123943863 -0.2908018941 0.034210440 3.966546e-01 -0.0456822310
12 -0.0036620829 -0.0474799774 -0.002062140 -1.930113e-01 -0.2213322739
13 0.0477937569 -0.0131807399 -0.039592299 -1.734180e-02 -0.0368240576
14 -0.0149832061 0.0099935533 0.002551843 5.366802e-03 -0.0036521416
15 0.1313305676 -0.0462874245 -0.086412964 -1.280964e-01 -0.1733892418
16 0.0552535976 -0.0432585116 -0.052577665 8.330807e-02 0.0229444459
17 0.0266782395 -0.0537208621 -0.055724722 9.450724e-02 0.0634771006
18 0.0288759242 -0.0702760154 -0.074494525 7.660335e-02 0.0702299359
19 0.0062248536 -0.0075777093 -0.010396932 6.109760e-04 -0.0015021213
20 -0.0759357844 0.0008412255 0.087294404 8.111878e-02 0.1319065697
21 -0.1460586844 0.1191047370 0.133584778 -3.346838e-01 -0.0497176797
22 -0.0275126025 0.0275993461 0.054844838 -3.614848e-02 0.0271570301
23 0.0071677611 -0.0202522339 -0.027739156 1.510395e-02 0.0058105336
24 0.0374982623 -0.0480830807 -0.058146665 4.637035e-02 0.0213650092
25 -0.0044763132 0.0073417912 0.011701421 -1.015495e-03 0.0042674656
26 -0.0018006339 -0.0013472437 -0.001929857 2.257055e-04 0.0013319221
27 0.0112514145 -0.0375726776 -0.048227696 9.076031e-03 0.0152878319
28 -0.0241955852 0.0279787368 0.020532149 -2.272197e-02 0.0274493744
29 -0.1627086434 0.1161310699 0.103750410 -6.850409e-02 0.0506679177
30 -0.0025525613 0.0594362970 -0.012372040 -5.339859e-02 0.0064318234
31 0.0211130553 0.0503469195 -0.045499356 -9.955372e-02 -0.0768971635
32 -0.0448072255 0.2149531002 -0.128394688 -5.266997e-01 -0.5018550732
33 0.0290556398 -0.0196906407 -0.009561385 6.672910e-03 0.0106649412
34 -0.0054372832 0.0055156016 -0.002956651 -9.909128e-05 -0.0100234437
35 -0.0130971643 0.0573362366 -0.061774019 -1.101131e-01 -0.1642393772
36 0.3228113257 -0.0263005519 -0.258400386 -7.999446e-01 -0.9716864002
37 0.0180815706 0.0134951327 0.010758803 -4.178265e-02 -0.0445846341
38 0.0585762335 -0.0063267033 -0.012704594 -1.401378e-02 -0.0474052060
39 -0.1966438156 0.1649359223 0.066467067 -5.656278e-02 -0.0450794837
40 -0.1239033000 0.2440325603 0.279959173 5.549717e-02 -0.0318658861
41 0.0069030482 0.0360677900 0.061925112 7.299118e-02 0.0455837507
42 0.0085612808 0.0254047045 0.048642943 3.815236e-02 0.0386796632
43 0.0968070776 -0.0124275031 0.009109390 8.024291e-02 0.0353591512
44 -0.0150629668 0.0196585692 0.035845023 -9.703150e-03 0.0331577279
45 0.0910556837 -0.1393077827 -0.045523501 4.885240e-01 0.4152678357
46 -0.0093241803 0.0024221334 0.003677005 6.106168e-03 0.0056616828
47 0.1356881026 -0.3905633714 -0.481810392 -2.443774e-01 -0.0615592126
48 0.0373132314 -0.0916703593 -0.111863502 -4.483104e-02 -0.0142594672
49 0.0304264979 0.0157242805 0.037235788 4.461512e-02 0.0343471466
50 -0.0669050659 -0.0160171372 0.011162221 9.393020e-02 0.1288797539
51 -0.0086760187 0.0094867982 0.012646285 -1.169426e-02 0.0023785949
52 -0.0018688689 -0.0433994260 0.005787565 2.748199e-01 0.2030320743
53 0.1160875703 -0.1506162485 -0.170904553 4.540871e-03 0.0327828718
54 0.0264751584 0.0550668550 0.085205917 1.015311e-01 0.0384182200
55 0.0373202574 -0.0361073931 -0.046685184 5.643887e-03 -0.0082357370
56 -0.0265172134 0.0727437488 0.092023418 -4.973490e-03 -0.0127555308
57 0.0043079474 0.0216131136 0.031086648 8.759746e-03 0.0022490996
58 -0.1211446207 0.1105331955 0.206079477 8.946833e-02 0.2126689083
59 -0.0433021915 -0.0124230796 0.005613581 5.952930e-02 0.0877642966
60 -0.0934189979 0.0829915950 0.123591896 -5.807800e-02 0.0698530282
61 0.2819978706 0.0022374921 -0.264923268 3.779586e-01 -0.2417315553
62 0.0436283804 -0.2189487851 0.043492562 -2.262431e-01 0.1611606021
63 0.0167039636 -0.1913930169 0.013100275 3.795333e-02 0.0729760773
64 0.0440009760 0.4820952976 0.157861070 1.115376e-02 0.5450767970
65 -0.0761241224 -0.1848000724 0.018848460 1.239129e-01 -0.0408620003
66 -0.3050670451 0.4319674273 0.218933117 -7.325678e-02 0.1591430104
67 0.0252085113 -0.1607132597 0.002572737 1.034296e-01 0.0808525882
EXH.TEMP AIRFLOW
1 0.2296344844 -0.001764431
2 -0.3226764693 0.127169860
3 0.0163179443 0.022343955
4 0.2021781858 0.033626555
5 -0.0455820200 0.048050440
6 0.0011619152 0.007721298
7 0.0059494329 0.002262205
8 -0.0431702299 0.040097008
9 -0.0012787190 -0.002415159
10 0.0039932789 -0.017598497
11 0.0367257879 -0.015001787
12 0.2399491776 -0.234712343
13 -0.0134781465 0.018467025
14 0.0151607893 0.012682091
15 0.0424715587 -0.037362967
16 -0.0849660197 0.071445954
17 -0.0875277591 0.017895379
18 -0.0871565457 -0.037731980
19 -0.0035851133 -0.006460154
20 -0.0453584335 -0.033514126
21 0.2399282078 -0.335231712
22 0.0080300551 -0.036502840
23 -0.0105440758 -0.013942912
24 -0.0553371116 -0.002782808
25 -0.0003839523 0.005845356
26 0.0011207448 -0.004681146
27 -0.0168529480 -0.054946469
28 0.0011779680 -0.015070421
29 0.1170779487 -0.059057370
30 -0.0058848168 0.035683677
31 0.0496714033 0.023347699
32 0.5200453184 -0.080793285
33 -0.0360513597 -0.004608631
34 0.0125833155 0.009802877
35 0.1580002310 0.032232409
36 0.6358697215 -0.290633117
37 0.0230630042 0.010801334
38 -0.0187905258 0.051650950
39 0.2162612880 0.062062610
40 0.0854578279 0.245525735
41 -0.0693213117 0.093771200
42 -0.0547506225 0.019285151
43 -0.1452165765 0.070369879
44 -0.0132155479 -0.059338171
45 -0.5130448855 0.519556368
46 0.0031545599 0.010938454
47 0.0562696492 -0.590679228
48 0.0044587989 -0.117353276
49 -0.0733397385 0.028225244
50 -0.0527184589 0.079995026
51 0.0077204671 -0.025875603
52 -0.2164704094 0.343959391
53 -0.1140757033 -0.078097081
54 -0.0950226035 0.156908152
55 -0.0254005492 0.011933315
56 0.0227443752 -0.004984691
57 -0.0131908682 0.005460117
58 -0.0911498217 -0.049446953
59 -0.0368867380 0.045357851
60 0.0358202972 -0.181797200
61 -0.1117080138 0.349597036
62 -0.1526398283 -0.151325887
63 -0.0857270389 0.028858134
64 -0.5237985830 -0.001945503
65 0.0932263398 0.059656021
66 0.1684668241 -0.170453367
67 -0.1085547000 0.066224326
#We can subset by the observations of interest
dffits(gasmod3)[c(11,36,61,64)]
11 36 61 64
0.8837906 1.1352382 0.7764852 0.9953000
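Rather than scanning the full printout by eye, observations can be flagged programmatically. The sketch below uses only base R functions (dffits(), cooks.distance(), hatvalues()) on a stand-in model fit to the built-in mtcars data. The cutoffs are common rules of thumb and are assumptions here; they are not necessarily the thresholds used to pick observations 11, 36, 61, and 64 above.

```r
# Stand-in model on built-in mtcars data (NOT the gas turbine model)
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)

n <- nobs(fit)          # number of observations
p <- length(coef(fit))  # number of estimated coefficients (k + 1)

# ASSUMED rule-of-thumb cutoffs; other references use different values
cut_dffits <- 2 * sqrt(p / n)
cut_cooks  <- 4 / n
cut_lev    <- 2 * p / n

flags <- data.frame(
  dffits   = dffits(fit),
  cooks    = cooks.distance(fit),
  leverage = hatvalues(fit)
)

# Count how many of the three cutoffs each observation exceeds
flags$n_flags <- (abs(flags$dffits) > cut_dffits) +
  (flags$cooks > cut_cooks) +
  (flags$leverage > cut_lev)

# Observations that exceed at least one cutoff
flagged <- flags[flags$n_flags > 0, ]
flagged
```

Flagged observations are candidates for a closer look, not automatic deletions; the row names of flagged can be passed to the subsetting step.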
Variable transformation: We can directly use a function of the response variable within lm() or add a new variable to the data frame.
#Create new variable
gasturbine$sqrty<-sqrt(gasturbine$HEATRATE)
#Remember to exclude the original HEATRATE, since . would otherwise add it as a predictor
gasmodsqrt<-lm(sqrty~.-ENGINE-SHAFTS-POWER-CPRATIO-HEATRATE,data=gasturbine)
summary(gasmodsqrt)
Call:
lm(formula = sqrty ~ . - ENGINE - SHAFTS - POWER - CPRATIO -
HEATRATE, data = gasturbine)
Residuals:
Min 1Q Median 3Q Max
-4.3749 -1.3484 -0.3872 1.1073 6.1302
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.172e+02 3.616e+00 32.409 < 2e-16 ***
RPM 3.631e-04 5.976e-05 6.076 8.28e-08 ***
INLET.TEMP -4.344e-02 3.425e-03 -12.683 < 2e-16 ***
EXH.TEMP 6.904e-02 1.005e-02 6.872 3.58e-09 ***
AIRFLOW -5.240e-03 1.943e-03 -2.697 0.009 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.023 on 62 degrees of freedom
Multiple R-squared: 0.9289, Adjusted R-squared: 0.9243
F-statistic: 202.6 on 4 and 62 DF, p-value: < 2.2e-16
#Or directly transform in the lm() function
#Remember to exclude the sqrty column created above, since . would otherwise add it as a predictor
gasmodsqrt2<-lm(sqrt(HEATRATE)~.-ENGINE-SHAFTS-POWER-CPRATIO-sqrty,data=gasturbine)
summary(gasmodsqrt2)
Call:
lm(formula = sqrt(HEATRATE) ~ . - ENGINE - SHAFTS - POWER - CPRATIO -
sqrty, data = gasturbine)
Residuals:
Min 1Q Median 3Q Max
-4.3749 -1.3484 -0.3872 1.1073 6.1302
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.172e+02 3.616e+00 32.409 < 2e-16 ***
RPM 3.631e-04 5.976e-05 6.076 8.28e-08 ***
INLET.TEMP -4.344e-02 3.425e-03 -12.683 < 2e-16 ***
EXH.TEMP 6.904e-02 1.005e-02 6.872 3.58e-09 ***
AIRFLOW -5.240e-03 1.943e-03 -2.697 0.009 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.023 on 62 degrees of freedom
Multiple R-squared: 0.9289, Adjusted R-squared: 0.9243
F-statistic: 202.6 on 4 and 62 DF, p-value: < 2.2e-16
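Once the response is modeled on the square-root scale, predictions come back on that scale too and must be squared to return to the original units. The sketch below shows this on a stand-in model fit to the built-in mtcars data. The variables and the new observation are hypothetical and chosen only for illustration.

```r
# Stand-in model on built-in mtcars data (NOT the gas turbine data);
# writing the predictors out explicitly avoids having to subtract
# helper columns from the "." shorthand
fit_sqrt <- lm(sqrt(mpg) ~ wt + hp, data = mtcars)

# Hypothetical new observation
new_obs <- data.frame(wt = 3, hp = 110)

# Prediction on the square-root scale
pred_sqrt <- predict(fit_sqrt, newdata = new_obs)

# Back-transform to the original units of the response
pred_orig <- pred_sqrt^2

c(sqrt_scale = unname(pred_sqrt), original_scale = unname(pred_orig))
```

One caveat: squaring the predicted mean of sqrt(y) is not exactly the predicted mean of y, so the back-transformed value is best read as an approximate typical value.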
Removing observations from the analysis by subsetting the data
#We can remove particular observations
subsetgas<-gasturbine[-c(11,36,61,64),]
gasmod2obs<-lm(HEATRATE~.-ENGINE-SHAFTS-POWER-CPRATIO-sqrty,data=subsetgas)
# Syntax Note: We can use the . to indicate all the variables in the data frame
# And use the - to exclude a particular variable from the model
summary(gasmod2obs)
Call:
lm(formula = HEATRATE ~ . - ENGINE - SHAFTS - POWER - CPRATIO -
sqrty, data = subsetgas)
Residuals:
Min 1Q Median 3Q Max
-957.18 -214.19 -93.73 233.18 1030.40
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.325e+04 7.218e+02 18.361 < 2e-16 ***
RPM 8.526e-02 1.370e-02 6.226 5.77e-08 ***
INLET.TEMP -8.347e+00 8.048e-01 -10.372 7.88e-15 ***
EXH.TEMP 1.320e+01 2.316e+00 5.698 4.26e-07 ***
AIRFLOW -9.429e-01 4.004e-01 -2.355 0.0219 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 391.6 on 58 degrees of freedom
Multiple R-squared: 0.9231, Adjusted R-squared: 0.9178
F-statistic: 174 on 4 and 58 DF, p-value: < 2.2e-16
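An alternative to creating a separate data frame is the subset argument of lm(), which leaves the original data untouched. The sketch below, on the built-in mtcars data with hypothetical row numbers, checks that both routes give the same fit and lines up the coefficients with and without the dropped rows.

```r
# Hypothetical rows to drop from the stand-in mtcars data
drop_rows <- c(17, 20)

# Fit on all rows, for comparison
fit_all <- lm(mpg ~ wt + hp, data = mtcars)

# Route 1: subset the data frame first (as done above)
fit_a <- lm(mpg ~ wt + hp, data = mtcars[-drop_rows, ])

# Route 2: keep the data frame intact and use lm()'s subset argument
fit_b <- lm(mpg ~ wt + hp, data = mtcars, subset = -drop_rows)

# Both routes should give identical coefficients
all.equal(coef(fit_a), coef(fit_b))

# Side-by-side view of how much the estimates move
round(cbind(all_rows = coef(fit_all), dropped = coef(fit_a)), 4)
```

Large shifts in the coefficients after dropping a handful of rows suggest those observations were influential, which is the comparison the gas turbine output above invites.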