GasTurbine=read.table(file="C:/Users/Unico/Downloads/GasTurbine19095.txt",header=TRUE)

attach(GasTurbine)

  1. Suppose that a standard gas turbine has, on average, a heat rate of 10000kJ/kWh. Perform a t-test to see if the mean heat rate for the turbines in your data file exceeds 10000kJ/kWh. What do you conclude about your set of gas turbines compared to the average gas turbine?

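My answer for this question is: t.test(HeatRate, mu=10000, alternative="greater") This tests H0: mean heat rate = 10000 kJ/kWh against Ha: mean heat rate > 10000 kJ/kWh (t.test(HeatRate) on its own would test against a mean of 0). The mean of the gas turbine data I have is 11328.25 kJ/kWh, higher than that of a standard gas turbine. My t-test returned a p-value of far less than 0.05, so the null hypothesis should be rejected: on average, the turbines in my data set have a higher heat rate than a standard gas turbine.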
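
As a minimal sketch of a one-sided test against 10000 kJ/kWh, the following runs on simulated heat rates rather than the real data file. The vector `heat_rate_sim`, its mean and its spread are made up purely for illustration.

```r
# Simulated stand-in for HeatRate; these values are NOT from GasTurbine19095.txt
set.seed(1)
heat_rate_sim <- rnorm(32, mean = 11300, sd = 1500)

# H0: mu = 10000 versus Ha: mu > 10000
tt <- t.test(heat_rate_sim, mu = 10000, alternative = "greater")

tt$statistic  # t statistic on n - 1 degrees of freedom
tt$p.value    # one-sided p-value

# Reject H0 at the 5% level when the p-value is below 0.05
reject_h0 <- tt$p.value < 0.05
reject_h0
```

Without `mu` and `alternative`, `t.test()` defaults to a two-sided test of mean 0, which is a different question.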

The mark for this question is:

  1. Using the heat rate as the response variable, construct suitable graphs to identify which of the numeric variables might prove useful as predictors in a simple linear regression of the form E(y) = β0 + β1x.

My answer for this question is: par(mfrow=c(2,4))

plot(HeatRate~ExhTemp)

plot(HeatRate~Airflow)

plot(HeatRate~CPRatio)

plot(HeatRate~Engine)

plot(HeatRate~InletTemp)

plot(HeatRate~Power)

plot(HeatRate~RPM)

plot(HeatRate~Shafts)

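From the scatterplots, CPRatio, RPM and InletTemp show the clearest linear relationship with HeatRate and so might prove useful as predictors.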

The mark for this question is:

  1. Use air flow as a candidate for the predictor in a simple linear regression model to explain the heat rate of the turbines.

My answer for this question is: Regressing HeatRate on Airflow gives the fitted formula “HeatRate=12291.077-5.453(Airflow).” For each unit increase in Airflow, the expected HeatRate decreases by 5.453 units, so the two variables have an inverse relationship. The formula also indicates a HeatRate of 12291.077 when Airflow is equal to 0.

The mark for this question is:

  1. Choose another variable as the predictor in a simple linear regression model. Is this model better or worse than using air flow as the sole predictor of heat rate?

My answer for this question is: I have chosen to use InletTemp as the sole predictor of HeatRate. The simple linear regression formula is “HeatRate=23230.324-10.345(InletTemp).” This formula indicates that for an InletTemp value of zero, HeatRate would be 23230.324, decreasing by 10.345 for each unit increase in InletTemp. This model has a greater R squared value than the Airflow model, which indicates that InletTemp is a better sole predictor of HeatRate than Airflow.
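
The comparison of two single-predictor models can be sketched as below. The data frame `sim` is simulated: its column names mirror the data file, but the values and the coefficients used to generate them are assumptions for illustration only.

```r
# Simulated stand-in data; values are NOT from GasTurbine19095.txt
set.seed(2)
n <- 32
sim <- data.frame(Airflow   = runif(n, 10, 700),
                  InletTemp = runif(n, 900, 1400))
sim$HeatRate <- 23000 - 10 * sim$InletTemp - 2 * sim$Airflow +
  rnorm(n, sd = 600)

fit_air   <- lm(HeatRate ~ Airflow,   data = sim)
fit_inlet <- lm(HeatRate ~ InletTemp, data = sim)

# Both models have exactly one predictor, so their R-squared values
# can be compared directly
r2 <- c(Airflow   = summary(fit_air)$r.squared,
        InletTemp = summary(fit_inlet)$r.squared)
r2
names(which.max(r2))  # predictor with the higher R-squared in this simulation
```

Because both models use the same response and the same number of predictors, the one with the larger R squared also has the smaller residual standard error.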

The mark for this question is:

  1. Now fit a model that has both air flow and the other variable you chose. Determine if the interaction of the two variables is required. You must therefore choose between the models E(y) = β0 + β1x1 + β2x2 and E(y) = β0 + β1x1 + β2x2 + β3x1x2.

My answer for this question is:
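
One possible way to choose between the two models in this question is to test whether the interaction coefficient is zero. The sketch below uses simulated data; `x1`, `x2`, `y` and the generating coefficients are hypothetical and do not come from the data file.

```r
# Simulated stand-in data with no true interaction built in
set.seed(3)
n <- 32
sim <- data.frame(x1 = runif(n, 10, 700), x2 = runif(n, 900, 1400))
sim$y <- 23000 - 2 * sim$x1 - 10 * sim$x2 + rnorm(n, sd = 600)

fit_main <- lm(y ~ x1 + x2, data = sim)  # E(y) = b0 + b1*x1 + b2*x2
fit_int  <- lm(y ~ x1 * x2, data = sim)  # adds the b3*x1*x2 term

# t-test of H0: b3 = 0 from the interaction model
summary(fit_int)$coefficients["x1:x2", ]

# Equivalent nested-model F-test (one extra parameter)
cmp <- anova(fit_main, fit_int)
cmp
p_interaction <- cmp[["Pr(>F)"]][2]
p_interaction
```

A small `p_interaction` would favour keeping the interaction term; otherwise the simpler main-effects model is preferred.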

The mark for this question is:

  1. Create a multiple regression model that uses all numeric variables in your data set as predictors of the heat rate. Do not include any interactions or polynomial terms. Is it better than any previous models fitted?

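My answer for this question is: This multiple regression model is statistically useful overall: its F-statistic exceeds the critical value needed for significance on 9 and 22 DF, and the corresponding p-value is very small. On its own this only shows that at least one predictor is related to HeatRate; whether the model is better than the previous ones should be judged by comparing adjusted R squared values, which account for the extra predictors.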

The mark for this question is:

  1. Give a practical interpretation of your estimates of the β’s from this model.

My answer for this question is:

The mark for this question is:

  1. Interpret the R² value from the last model.

My answer for this question is: The R squared value for this multiple regression model is 0.9251, meaning that about 92.5% of the sample variation in HeatRate is explained by the predictors in the model. This is relatively high and indicates that the model fits the data well.
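
To make the interpretation concrete, R squared and adjusted R squared can be computed by hand from the sums of squares. This is a sketch on simulated data; `sim`, `x1` to `x3` and `y` are made-up names and values.

```r
# Simulated stand-in data
set.seed(4)
n <- 32
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
sim$y <- 5 + 2 * sim$x1 - sim$x2 + rnorm(n)
fit <- lm(y ~ ., data = sim)

sse <- sum(residuals(fit)^2)          # unexplained variation
sst <- sum((sim$y - mean(sim$y))^2)   # total variation about the mean
r2  <- 1 - sse / sst                  # proportion of variation explained

k <- length(coef(fit)) - 1            # number of predictors
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - k - 1)

c(r2 = r2, adj_r2 = adj_r2)
```

Adjusted R squared penalises extra predictors, so it is the safer quantity when comparing models of different sizes.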

The mark for this question is:

  1. Is this large multiple regression model useful for predicting heat rate? Justify your answer using a hypothesis test for the utility of the entire model at a significance level of α = 0.01.

My answer for this question is:
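
A global F-test at the 1% level could be carried out along these lines. The sketch uses simulated data with three hypothetical predictors, so the degrees of freedom differ from those of the real model.

```r
# Simulated stand-in data
set.seed(5)
n <- 32
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
sim$y <- 5 + 2 * sim$x1 - sim$x2 + rnorm(n)
fit <- lm(y ~ ., data = sim)

# H0: all slope coefficients are 0; Ha: at least one is non-zero
fs    <- summary(fit)$fstatistic   # named vector: value, numdf, dendf
f_val <- unname(fs["value"])
df1   <- unname(fs["numdf"])
df2   <- unname(fs["dendf"])

alpha  <- 0.01
f_crit <- qf(1 - alpha, df1, df2)                # rejection threshold
p_val  <- pf(f_val, df1, df2, lower.tail = FALSE)

c(F = f_val, critical = f_crit, p.value = p_val)
reject_h0 <- p_val < alpha
reject_h0
```

Comparing `f_val` with `f_crit` and comparing `p_val` with `alpha` always lead to the same decision.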

The mark for this question is:

  1. Investigate the leverages of your multiple regression model. Are there any points that are having undue influence on the model?

My answer for this question is: Points 4 and 6 have relatively high leverage values, considerably higher than most other points. Point 13 also stands out somewhat, but not enough to be a major concern.
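
Leverages like those discussed above can be extracted with `hatvalues()`. The sketch below uses simulated data and flags points above 2(k+1)/n, which is a common rule of thumb rather than a strict test; the data and variable names are assumptions.

```r
# Simulated stand-in data
set.seed(6)
n <- 32
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
sim$y <- 5 + 2 * sim$x1 - sim$x2 + rnorm(n)
fit <- lm(y ~ ., data = sim)

h <- hatvalues(fit)          # leverage of each observation
p <- length(coef(fit))       # number of parameters, k + 1
cutoff <- 2 * p / n          # rule-of-thumb threshold

high_leverage <- which(h > cutoff)
high_leverage

# Index plot with the threshold marked
plot(h, type = "h", ylab = "Leverage")
abline(h = cutoff, lty = 2)
```

High leverage only means a point is unusual in its predictor values; whether it actually distorts the fit is better judged with Cook's distance.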

The mark for this question is:

  1. Now create a reduced model by appropriately removing terms from your multiple regression model.

My answer for this question is:

The mark for this question is:

  1. Compare this reduced model with the complete multiple regression model that included all predictors, using a single hypothesis test.

My answer for this question is:

The mark for this question is:

  1. Generate the residual analysis for your reduced model. Is there anything to be concerned about?

My answer for this question is:

The mark for this question is:

  1. Obtain the Cook’s distances and variance inflation factors for this model. Is there anything to worry about?

My answer for this question is:
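
Cook's distances and variance inflation factors could be obtained as sketched below, using base R only. The data are simulated, with `x2` deliberately built to correlate with `x1`; the cut-offs mentioned in the comments are common rules of thumb, not fixed standards.

```r
# Simulated stand-in data with two correlated predictors
set.seed(7)
n <- 32
sim <- data.frame(x1 = rnorm(n))
sim$x2 <- sim$x1 + rnorm(n, sd = 0.3)
sim$x3 <- rnorm(n)
sim$y  <- 5 + 2 * sim$x1 - sim$x2 + rnorm(n)
fit <- lm(y ~ x1 + x2 + x3, data = sim)

# Cook's distance: values above about 1 are often treated as influential
cd <- cooks.distance(fit)
which(cd > 1)

# VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
# predictor j on the remaining predictors
predictors <- c("x1", "x2", "x3")
vif <- sapply(predictors, function(v) {
  others <- setdiff(predictors, v)
  r2_j <- summary(lm(reformulate(others, response = v), data = sim))$r.squared
  1 / (1 - r2_j)
})
vif  # values above about 10 are often read as serious multicollinearity
```

Computing the VIFs by hand avoids relying on an add-on package; it only applies directly to numeric predictors.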

The mark for this question is:

Your total mark is:

HeEx.lm=lm(HeatRate~ExhTemp)

summary(HeEx.lm)

E[HeatRate]=14932.286-6.872(ExhTemp)

HeAi.lm=lm(HeatRate~Airflow)

summary(HeAi.lm)

E[HeatRate]=12291.077-5.453(Airflow)

HeCP.lm=lm(HeatRate~CPRatio)

summary(HeCP.lm)

E[HeatRate]=15132.14-258.27(CPRatio)

HeEn.lm=lm(HeatRate~Engine)

summary(HeEn.lm)

HeIn.lm=lm(HeatRate~InletTemp)

summary(HeIn.lm)

E[HeatRate]=23230.324-10.345(InletTemp)

HePo.lm=lm(HeatRate~Power)

summary(HePo.lm)

E[HeatRate]=1.216e+04-1.318e-02(Power)

HeRP.lm=lm(HeatRate~RPM)

summary(HeRP.lm)

E[HeatRate]=9.356e+03+1.978e-01(RPM)

HeSh.lm=lm(HeatRate~Shafts)

summary(HeSh.lm)

HeatRate.lm=lm(HeatRate~.,data=GasTurbine)

par(mfrow=c(2,2))

plot(HeatRate.lm)

summary(HeatRate.lm)

t.test(HeatRate,mu=10000,alternative="greater")