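# Read categorical columns as factors so summary() reports level counts (not the default since R 4.0)
airlines.df <- read.csv("SixAirlinesDataV2.csv", stringsAsFactors = TRUE)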
attach(airlines.df)
summary(airlines.df)
##     Airline       Aircraft    FlightDuration   TravelMonth
##  AirFrance: 74   AirBus:151   Min.   : 1.250   Aug:127
##  British  :175   Boeing:307   1st Qu.: 4.260   Jul: 75
##  Delta    : 46                Median : 7.790   Oct:127
##  Jet      : 61                Mean   : 7.578   Sep:129
##  Singapore: 40                3rd Qu.:10.620
##  Virgin   : 62                Max.   :14.660
##   IsInternational    SeatsEconomy    SeatsPremium    PitchEconomy
##  Domestic     : 40   Min.   : 78.0   Min.   : 8.00   Min.   :30.00
##  International:418   1st Qu.:133.0   1st Qu.:21.00   1st Qu.:31.00
##                      Median :185.0   Median :36.00   Median :31.00
##                      Mean   :202.3   Mean   :33.65   Mean   :31.22
##                      3rd Qu.:243.0   3rd Qu.:40.00   3rd Qu.:32.00
##                      Max.   :389.0   Max.   :66.00   Max.   :33.00
##  PitchPremium    WidthEconomy    WidthPremium    PriceEconomy
##  Min.   :34.00   Min.   :17.00   Min.   :17.00   Min.   :  65
##  1st Qu.:38.00   1st Qu.:18.00   1st Qu.:19.00   1st Qu.: 413
##  Median :38.00   Median :18.00   Median :19.00   Median :1242
##  Mean   :37.91   Mean   :17.84   Mean   :19.47   Mean   :1327
##  3rd Qu.:38.00   3rd Qu.:18.00   3rd Qu.:21.00   3rd Qu.:1909
##  Max.   :40.00   Max.   :19.00   Max.   :21.00   Max.   :3593
##   PricePremium    PriceRelative    SeatsTotal    PitchDifference
##  Min.   :  86.0   Min.   :0.0200   Min.   : 98   Min.   : 2.000
##  1st Qu.: 528.8   1st Qu.:0.1000   1st Qu.:166   1st Qu.: 6.000
##  Median :1737.0   Median :0.3650   Median :227   Median : 7.000
##  Mean   :1845.3   Mean   :0.4872   Mean   :236   Mean   : 6.688
##  3rd Qu.:2989.0   3rd Qu.:0.7400   3rd Qu.:279   3rd Qu.: 7.000
##  Max.   :7414.0   Max.   :1.8900   Max.   :441   Max.   :10.000
##  WidthDifference   PercentPremiumSeats
##  Min.   :0.000     Min.   : 4.71
##  1st Qu.:1.000     1st Qu.:12.28
##  Median :1.000     Median :13.21
##  Mean   :1.633     Mean   :14.65
##  3rd Qu.:3.000     3rd Qu.:15.36
##  Max.   :4.000     Max.   :24.69
boxplot(PriceRelative ~ PitchDifference, data = airlines.df, horizontal = TRUE,
        xlab = "Relative Price between Economy and Premium Economy",
        ylab = "Pitch Difference")
The relative price of a Premium Economy seat compared with an Economy seat reaches its highest values when the pitch difference is at its maximum (10).
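The boxplot reading can be cross-checked with per-group numbers. The following is only a sketch: `group_summary` is a hypothetical helper, and the small data frame is made up for illustration (its values are not taken from SixAirlinesDataV2.csv).

```r
# Hypothetical helper: count, median and maximum of `value` within each level of `group`.
group_summary <- function(df, value, group) {
  g <- df[[group]]
  data.frame(
    group  = sort(unique(g)),
    n      = as.vector(table(g)),
    median = as.vector(tapply(df[[value]], g, median)),
    max    = as.vector(tapply(df[[value]], g, max))
  )
}

# Made-up toy data, for illustration only.
toy <- data.frame(
  PitchDifference = c(2, 2, 6, 6, 7, 7, 10, 10),
  PriceRelative   = c(0.05, 0.10, 0.20, 0.30, 0.40, 0.60, 0.90, 1.50)
)

res <- group_summary(toy, "PriceRelative", "PitchDifference")
print(res)
```

On the real data the analogous call would be `group_summary(airlines.df, "PriceRelative", "PitchDifference")`.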
library(lattice)
histogram(~ WidthDifference, data = airlines.df, col = "grey", xlab = "Width Difference")
The difference in seat width between Premium Economy and Economy seats takes the values 0, 1, 2, 3 or 4 inches, and the most frequent width difference is 1 inch.
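The most frequent value can also be read from a frequency table instead of the histogram. A minimal sketch, using a made-up vector in place of `airlines.df$WidthDifference`:

```r
# Made-up values, for illustration only (not the airline data).
width_diff <- c(1, 1, 0, 3, 1, 4, 2, 3, 1)

# Frequency of each distinct value.
counts <- table(width_diff)
print(counts)

# Value with the highest count (the first one, if there is a tie).
mode_value <- as.numeric(names(counts)[which.max(counts)])
print(mode_value)
```

With the real data, `table(airlines.df$WidthDifference)` would give the counts behind each histogram bar.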
library(corrgram)
airlines2 <- c("PricePremium","PriceEconomy","PitchDifference","WidthDifference","SeatsTotal","PercentPremiumSeats")
colz <- colorRampPalette(c("darkkhaki","darkgreen","burlywood1"))
corrgram(airlines.df[, airlines2], order = TRUE,
         lower.panel = panel.shade, upper.panel = panel.pie, text.panel = panel.txt,
         main = "Corrgram of Selected factors", col.regions = colz)
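A corrgram shows correlations as shades and pie slices; the underlying numbers come from a plain correlation matrix. A sketch with base R's `cor()`, on simulated columns (`a`, `b`, `c` are hypothetical names, not columns of the airline data):

```r
# Simulated toy columns, for illustration only.
set.seed(7)
toy <- data.frame(a = rnorm(50))
toy$b <- toy$a + rnorm(50, sd = 0.5)  # constructed to correlate with a
toy$c <- rnorm(50)                    # generated independently of a and b

# Pairwise Pearson correlations, rounded for reading.
cors <- round(cor(toy[, c("a", "b", "c")]), 2)
print(cors)
```

For the columns plotted above, the analogous call would be `round(cor(airlines.df[, airlines2]), 2)`.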
model1 <- lm(PricePremium ~ PriceEconomy + PitchDifference + WidthDifference + PercentPremiumSeats + SeatsTotal + IsInternational + TravelMonth + FlightDuration + Aircraft,data=airlines.df)
summary(model1)
##
## Call:
## lm(formula = PricePremium ~ PriceEconomy + PitchDifference +
##     WidthDifference + PercentPremiumSeats + SeatsTotal + IsInternational +
##     TravelMonth + FlightDuration + Aircraft, data = airlines.df)
##
## Residuals:
##    Min     1Q Median     3Q    Max
## -977.2 -246.3  -47.9  135.2 3419.7
##
## Coefficients:
##                                Estimate Std. Error t value Pr(>|t|)
## (Intercept)                  -1.211e+03  1.755e+02  -6.898 1.82e-11 ***
## PriceEconomy                  1.064e+00  3.114e-02  34.175  < 2e-16 ***
## PitchDifference               8.510e+01  3.913e+01   2.175 0.030163 *
## WidthDifference               1.240e+02  3.438e+01   3.607 0.000345 ***
## PercentPremiumSeats           3.177e+01  5.250e+00   6.052 3.04e-09 ***
## SeatsTotal                    1.925e+00  3.360e-01   5.729 1.87e-08 ***
## IsInternationalInternational -7.537e+02  2.135e+02  -3.530 0.000458 ***
## TravelMonthJul               -3.441e+01  7.074e+01  -0.486 0.626904
## TravelMonthOct                2.692e+01  6.036e+01   0.446 0.655795
## TravelMonthSep               -2.097e+00  6.015e+01  -0.035 0.972203
## FlightDuration                8.455e+01  8.809e+00   9.598  < 2e-16 ***
## AircraftBoeing               -2.082e+00  5.651e+01  -0.037 0.970625
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 480.7 on 446 degrees of freedom
## Multiple R-squared:  0.8641, Adjusted R-squared:  0.8607
## F-statistic: 257.7 on 11 and 446 DF,  p-value: < 2.2e-16
The coefficient p-values show that TravelMonth (p = 0.63, 0.66, 0.97) and Aircraft (p = 0.97) are not statistically significant at the 0.05 level, so a reduced model without these two predictors is fitted next.
model2 <- lm(PricePremium ~ PriceEconomy + PitchDifference + WidthDifference + PercentPremiumSeats + SeatsTotal + FlightDuration + IsInternational,data = airlines.df)
summary(model2)
##
## Call:
## lm(formula = PricePremium ~ PriceEconomy + PitchDifference +
##     WidthDifference + PercentPremiumSeats + SeatsTotal + FlightDuration +
##     IsInternational, data = airlines.df)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -1010.0  -258.4   -49.9   133.6  3416.7
##
## Coefficients:
##                                Estimate Std. Error t value Pr(>|t|)
## (Intercept)                  -1.213e+03  1.695e+02  -7.156 3.40e-12 ***
## PriceEconomy                  1.063e+00  3.077e-02  34.537  < 2e-16 ***
## PitchDifference               8.421e+01  3.656e+01   2.303 0.021722 *
## WidthDifference               1.224e+02  3.373e+01   3.629 0.000318 ***
## PercentPremiumSeats           3.190e+01  5.220e+00   6.112 2.14e-09 ***
## SeatsTotal                    1.920e+00  3.241e-01   5.922 6.31e-09 ***
## FlightDuration                8.459e+01  8.507e+00   9.943  < 2e-16 ***
## IsInternationalInternational -7.412e+02  2.001e+02  -3.704 0.000238 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 479 on 450 degrees of freedom
## Multiple R-squared:  0.8638, Adjusted R-squared:  0.8617
## F-statistic: 407.9 on 7 and 450 DF,  p-value: < 2.2e-16
Model 2 has a slightly higher adjusted R-squared than model 1 (0.8617 vs. 0.8607) while using four fewer coefficients, so model 2 is the better fit.
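Adjusted R-squared is one way to compare the two fits. Because the smaller model is nested in the larger one, a partial F-test with `anova()` and a comparison of `AIC()` values are common complements. The sketch below is not part of the original analysis and uses simulated toy data (`x1`, `x2`, `noise`, `y` are hypothetical names), so its numbers say nothing about the airline models.

```r
# Simulated toy data, for illustration only.
set.seed(1)
n <- 200
toy <- data.frame(
  x1    = rnorm(n),
  x2    = rnorm(n),
  noise = factor(sample(c("a", "b", "c"), n, replace = TRUE))
)
toy$y <- 2 + 3 * toy$x1 - 1.5 * toy$x2 + rnorm(n)

full    <- lm(y ~ x1 + x2 + noise, data = toy)  # includes the extra factor
reduced <- lm(y ~ x1 + x2, data = toy)          # drops it

# Partial F-test: a large p-value suggests the dropped terms add little.
cmp <- anova(reduced, full)
print(cmp)

# Adjusted R-squared and AIC side by side (lower AIC is preferred).
fit_stats <- data.frame(
  model  = c("full", "reduced"),
  adj_r2 = c(summary(full)$adj.r.squared, summary(reduced)$adj.r.squared),
  aic    = c(AIC(full), AIC(reduced))
)
print(fit_stats)
```

For the models above, the corresponding calls would be `anova(model2, model1)` and `AIC(model1, model2)`.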
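According to model 2, holding the other predictors fixed, the Premium Economy airfare:
Increases as the pitch difference between Premium Economy and Economy increases (about 84 price units per unit of pitch difference).
Increases as the seat-width difference between Premium Economy and Economy increases (about 122 price units per inch).
Increases as the total number of seats increases (about 1.9 price units per seat).
Increases as the percentage of Premium Economy seats increases (about 32 price units per percentage point).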
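Each statement above reads a coefficient as the change in predicted price for a one-unit increase in that predictor, with the others held fixed. The sketch below checks that reading with `predict()` on simulated toy data; `pitch_diff`, `width_diff` and `price` are hypothetical names, and the numbers are not the airline estimates.

```r
# Simulated toy data, for illustration only.
set.seed(42)
toy <- data.frame(
  pitch_diff = sample(2:10, 100, replace = TRUE),
  width_diff = sample(0:4, 100, replace = TRUE)
)
toy$price <- 500 + 80 * toy$pitch_diff + 120 * toy$width_diff + rnorm(100, sd = 50)

fit <- lm(price ~ pitch_diff + width_diff, data = toy)

# Two otherwise identical observations that differ by one unit of pitch_diff.
base     <- data.frame(pitch_diff = 6, width_diff = 1)
plus_one <- transform(base, pitch_diff = pitch_diff + 1)

# The difference in predictions equals the fitted pitch_diff coefficient.
delta <- unname(predict(fit, plus_one) - predict(fit, base))
print(c(delta = delta, coefficient = unname(coef(fit)["pitch_diff"])))

# 95% confidence intervals show how precisely each effect is estimated.
ci <- confint(fit)
print(ci)
```

On the fitted model, `confint(model2)` would give the corresponding intervals for the airline coefficients.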