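# stringsAsFactors=TRUE keeps text columns as factors, which the summary() output below relies on (the default changed in R 4.0.0)
airline<-read.csv("SixAirlinesDataV2.csv",stringsAsFactors=TRUE)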
View(airline)
Summary Statistics of the data set.
summary(airline)
## Airline Aircraft FlightDuration TravelMonth
## AirFrance: 74 AirBus:151 Min. : 1.250 Aug:127
## British :175 Boeing:307 1st Qu.: 4.260 Jul: 75
## Delta : 46 Median : 7.790 Oct:127
## Jet : 61 Mean : 7.578 Sep:129
## Singapore: 40 3rd Qu.:10.620
## Virgin : 62 Max. :14.660
## IsInternational SeatsEconomy SeatsPremium PitchEconomy
## Domestic : 40 Min. : 78.0 Min. : 8.00 Min. :30.00
## International:418 1st Qu.:133.0 1st Qu.:21.00 1st Qu.:31.00
## Median :185.0 Median :36.00 Median :31.00
## Mean :202.3 Mean :33.65 Mean :31.22
## 3rd Qu.:243.0 3rd Qu.:40.00 3rd Qu.:32.00
## Max. :389.0 Max. :66.00 Max. :33.00
## PitchPremium WidthEconomy WidthPremium PriceEconomy
## Min. :34.00 Min. :17.00 Min. :17.00 Min. : 65
## 1st Qu.:38.00 1st Qu.:18.00 1st Qu.:19.00 1st Qu.: 413
## Median :38.00 Median :18.00 Median :19.00 Median :1242
## Mean :37.91 Mean :17.84 Mean :19.47 Mean :1327
## 3rd Qu.:38.00 3rd Qu.:18.00 3rd Qu.:21.00 3rd Qu.:1909
## Max. :40.00 Max. :19.00 Max. :21.00 Max. :3593
## PricePremium PriceRelative SeatsTotal PitchDifference
## Min. : 86.0 Min. :0.0200 Min. : 98 Min. : 2.000
## 1st Qu.: 528.8 1st Qu.:0.1000 1st Qu.:166 1st Qu.: 6.000
## Median :1737.0 Median :0.3650 Median :227 Median : 7.000
## Mean :1845.3 Mean :0.4872 Mean :236 Mean : 6.688
## 3rd Qu.:2989.0 3rd Qu.:0.7400 3rd Qu.:279 3rd Qu.: 7.000
## Max. :7414.0 Max. :1.8900 Max. :441 Max. :10.000
## WidthDifference PercentPremiumSeats
## Min. :0.000 Min. : 4.71
## 1st Qu.:1.000 1st Qu.:12.28
## Median :1.000 Median :13.21
## Mean :1.633 Mean :14.65
## 3rd Qu.:3.000 3rd Qu.:15.36
## Max. :4.000 Max. :24.69
Description of the variables in the data set.
library(psych)
describe(airline)
## vars n mean sd median trimmed mad min
## Airline* 1 458 3.01 1.65 2.00 2.89 1.48 1.00
## Aircraft* 2 458 1.67 0.47 2.00 1.71 0.00 1.00
## FlightDuration 3 458 7.58 3.54 7.79 7.57 4.81 1.25
## TravelMonth* 4 458 2.56 1.17 3.00 2.58 1.48 1.00
## IsInternational* 5 458 1.91 0.28 2.00 2.00 0.00 1.00
## SeatsEconomy 6 458 202.31 76.37 185.00 194.64 85.99 78.00
## SeatsPremium 7 458 33.65 13.26 36.00 33.35 11.86 8.00
## PitchEconomy 8 458 31.22 0.66 31.00 31.26 0.00 30.00
## PitchPremium 9 458 37.91 1.31 38.00 38.05 0.00 34.00
## WidthEconomy 10 458 17.84 0.56 18.00 17.81 0.00 17.00
## WidthPremium 11 458 19.47 1.10 19.00 19.53 0.00 17.00
## PriceEconomy 12 458 1327.08 988.27 1242.00 1244.40 1159.39 65.00
## PricePremium 13 458 1845.26 1288.14 1737.00 1799.05 1845.84 86.00
## PriceRelative 14 458 0.49 0.45 0.36 0.42 0.41 0.02
## SeatsTotal 15 458 235.96 85.29 227.00 228.73 90.44 98.00
## PitchDifference 16 458 6.69 1.76 7.00 6.76 0.00 2.00
## WidthDifference 17 458 1.63 1.19 1.00 1.53 0.00 0.00
## PercentPremiumSeats 18 458 14.65 4.84 13.21 14.31 2.68 4.71
## max range skew kurtosis se
## Airline* 6.00 5.00 0.61 -0.95 0.08
## Aircraft* 2.00 1.00 -0.72 -1.48 0.02
## FlightDuration 14.66 13.41 -0.07 -1.12 0.17
## TravelMonth* 4.00 3.00 -0.14 -1.46 0.05
## IsInternational* 2.00 1.00 -2.91 6.50 0.01
## SeatsEconomy 389.00 311.00 0.72 -0.36 3.57
## SeatsPremium 66.00 58.00 0.23 -0.46 0.62
## PitchEconomy 33.00 3.00 -0.03 -0.35 0.03
## PitchPremium 40.00 6.00 -1.51 3.52 0.06
## WidthEconomy 19.00 2.00 -0.04 -0.08 0.03
## WidthPremium 21.00 4.00 -0.08 -0.31 0.05
## PriceEconomy 3593.00 3528.00 0.51 -0.88 46.18
## PricePremium 7414.00 7328.00 0.50 0.43 60.19
## PriceRelative 1.89 1.87 1.17 0.72 0.02
## SeatsTotal 441.00 343.00 0.70 -0.53 3.99
## PitchDifference 10.00 8.00 -0.54 1.78 0.08
## WidthDifference 4.00 4.00 0.84 -0.53 0.06
## PercentPremiumSeats 24.69 19.98 0.71 0.28 0.23
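The rows marked with an asterisk (Airline*, Aircraft*, TravelMonth*, IsInternational*) are factors. describe() converts them to their integer level codes before computing statistics, so values such as the Airline* mean of 3.01 describe the codes, not the airlines. The counts from summary() confirm this: (74*1 + 175*2 + 46*3 + 61*4 + 40*5 + 62*6)/458 = 3.01. A minimal sketch of the same conversion in base R, using a made-up four-row factor (airline_toy is a hypothetical name, not part of the data set):

```r
# Toy factor with three of the airline names; levels are sorted alphabetically.
airline_toy <- factor(c("British", "Delta", "AirFrance", "British"))

levels(airline_toy)               # level order determines the integer codes
codes <- as.integer(airline_toy)  # AirFrance = 1, British = 2, Delta = 3
codes
mean(codes)                       # a mean of level codes, not a meaningful average
```

For categorical columns, the frequency counts from summary() or table() are usually the more informative description.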
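# Pass the data frame explicitly so the plots do not depend on attach(airline).
boxplot(PriceEconomy~PitchEconomy,data=airline,horizontal=TRUE,main="Plot between Economy seats Price vs Pitch",xlab="Economy Seats Price",ylab="Economy Seats Pitch",col="light blue")
boxplot(PriceEconomy~WidthEconomy,data=airline,horizontal=TRUE,main="Plot between Economy seats Price vs Width",xlab="Economy Seats Price",ylab="Economy Seats Width",col="light blue")
with(airline,hist(WidthEconomy,col="red"))
with(airline,hist(PitchEconomy,col="red"))
with(airline,hist(PriceRelative,col="light pink"))
with(airline,hist(PitchDifference,col="light pink"))
with(airline,hist(WidthDifference,col="light pink"))
library(car)  # provides scatterplotMatrix()
# Releases of car from 3.0 onward take diagonal=list(method="histogram") instead of diagonal="histogram".
scatterplotMatrix(formula = ~ PriceRelative + PitchDifference + WidthDifference , data=airline, cex=0.6, diagonal="histogram")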
y<-airline[,12:13]
x<-airline[,6:18]
cor(x,y)
## PriceEconomy PricePremium
## SeatsEconomy 0.12816722 0.17700093
## SeatsPremium 0.11364218 0.21761238
## PitchEconomy 0.36866123 0.22614179
## PitchPremium 0.05038455 0.08853915
## WidthEconomy 0.06799061 0.15054837
## WidthPremium -0.05704522 0.06402004
## PriceEconomy 1.00000000 0.90138870
## PricePremium 0.90138870 1.00000000
## PriceRelative -0.28856711 0.03184654
## SeatsTotal 0.13243313 0.19232533
## PitchDifference -0.09952511 -0.01806629
## WidthDifference -0.08449975 -0.01151218
## PercentPremiumSeats 0.06532232 0.11639097
library(corrgram)
corrgram(airline, order=TRUE, lower.panel=panel.shade,
upper.panel=panel.pie, text.panel=panel.txt,
main="Corrgram of Airline Variable intercorrelations")
Our main focus is price. The central question is: what factors explain the difference in price between an economy ticket and a premium-economy airline ticket?
The data set has three price variables, so the analysis considers each of them:
1.Premium seat price (PricePremium)
2.Economy seat price (PriceEconomy)
3.Relative price (PriceRelative)
test2<-lm(PriceEconomy~SeatsEconomy+FlightDuration+SeatsPremium+PitchEconomy+PitchPremium+WidthEconomy+WidthPremium+PricePremium+PriceRelative+SeatsTotal+PitchDifference+WidthDifference+PercentPremiumSeats, data = airline)
summary(test2)
##
## Call:
## lm(formula = PriceEconomy ~ SeatsEconomy + FlightDuration + SeatsPremium +
## PitchEconomy + PitchPremium + WidthEconomy + WidthPremium +
## PricePremium + PriceRelative + SeatsTotal + PitchDifference +
## WidthDifference + PercentPremiumSeats, data = airline)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1197.24 -91.56 10.99 110.71 601.51
##
## Coefficients: (3 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -8.885e+03 9.951e+02 -8.929 < 2e-16 ***
## SeatsEconomy 2.128e+00 4.976e-01 4.277 2.32e-05 ***
## FlightDuration 1.292e+01 4.344e+00 2.975 0.00309 **
## SeatsPremium -1.681e+01 2.953e+00 -5.691 2.28e-08 ***
## PitchEconomy 2.352e+02 2.498e+01 9.416 < 2e-16 ***
## PitchPremium 1.435e+02 1.208e+01 11.887 < 2e-16 ***
## WidthEconomy -2.371e+02 2.659e+01 -8.918 < 2e-16 ***
## WidthPremium 2.069e+01 1.547e+01 1.338 0.18165
## PricePremium 6.529e-01 1.101e-02 59.307 < 2e-16 ***
## PriceRelative -7.677e+02 2.667e+01 -28.788 < 2e-16 ***
## SeatsTotal NA NA NA NA
## PitchDifference NA NA NA NA
## WidthDifference NA NA NA NA
## PercentPremiumSeats 3.160e+01 7.037e+00 4.491 9.04e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 208.3 on 447 degrees of freedom
## Multiple R-squared: 0.9565, Adjusted R-squared: 0.9556
## F-statistic: 983.6 on 10 and 447 DF, p-value: < 2.2e-16
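The note "3 not defined because of singularities" means three predictors are exact linear combinations of others, so lm() cannot estimate separate coefficients for them and reports NA. The column means in the summary above are consistent with this: 202.31 + 33.65 = 235.96 (SeatsTotal), 37.91 - 31.22 = 6.69 (PitchDifference) and 19.47 - 17.84 = 1.63 (WidthDifference). A minimal sketch of the effect on simulated data (the toy data frame, its Price column and the coefficients used to generate it are made up for illustration):

```r
set.seed(1)

# Simulated seat counts; SeatsTotal is an exact sum of the other two columns.
toy <- data.frame(
  SeatsEconomy = sample(78:389, 50, replace = TRUE),
  SeatsPremium = sample(8:66, 50, replace = TRUE)
)
toy$SeatsTotal <- toy$SeatsEconomy + toy$SeatsPremium
toy$Price <- 500 + 2 * toy$SeatsEconomy - 5 * toy$SeatsPremium + rnorm(50, sd = 20)

fit <- lm(Price ~ SeatsEconomy + SeatsPremium + SeatsTotal, data = toy)
coef(fit)    # the SeatsTotal coefficient is NA
alias(fit)   # shows SeatsTotal as a combination of the other predictors
```

Running alias() on the fitted airline model should report the same kind of dependency. Dropping the derived columns from the formula leaves the fitted values unchanged and removes the NA rows from the coefficient table.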
test3<-lm(PriceRelative ~ SeatsEconomy+FlightDuration+SeatsPremium+PitchEconomy+PitchPremium+WidthEconomy+WidthPremium+PricePremium+SeatsTotal+PitchDifference+WidthDifference+PercentPremiumSeats, data = airline)
summary(test3)
##
## Call:
## lm(formula = PriceRelative ~ SeatsEconomy + FlightDuration +
## SeatsPremium + PitchEconomy + PitchPremium + WidthEconomy +
## WidthPremium + PricePremium + SeatsTotal + PitchDifference +
## WidthDifference + PercentPremiumSeats, data = airline)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.87453 -0.25322 -0.08333 0.13083 1.35318
##
## Coefficients: (3 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.928e+00 1.732e+00 3.999 7.43e-05 ***
## SeatsEconomy -9.463e-05 8.816e-04 -0.107 0.915
## FlightDuration 3.052e-02 7.559e-03 4.038 6.35e-05 ***
## SeatsPremium -2.239e-03 5.230e-03 -0.428 0.669
## PitchEconomy -2.585e-01 4.253e-02 -6.077 2.63e-09 ***
## PitchPremium -1.828e-02 2.138e-02 -0.855 0.393
## WidthEconomy 2.561e-03 4.710e-02 0.054 0.957
## WidthPremium 1.204e-01 2.680e-02 4.492 9.01e-06 ***
## PricePremium -6.865e-06 1.950e-05 -0.352 0.725
## SeatsTotal NA NA NA NA
## PitchDifference NA NA NA NA
## WidthDifference NA NA NA NA
## PercentPremiumSeats -1.322e-02 1.245e-02 -1.062 0.289
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3691 on 448 degrees of freedom
## Multiple R-squared: 0.3421, Adjusted R-squared: 0.3289
## F-statistic: 25.89 on 9 and 448 DF, p-value: < 2.2e-16
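1.In the PriceEconomy model (test2), SeatsEconomy, SeatsPremium, PitchEconomy, PitchPremium, WidthEconomy, PricePremium, PriceRelative and PercentPremiumSeats are highly significant independent variables (p < 0.001), and FlightDuration is significant at the 1% level (p = 0.003).
2.Economy price therefore varies strongly with the variables listed above. In the PriceRelative model (test3), only FlightDuration, PitchEconomy and WidthPremium are significant.
3.WidthPremium is the only estimated variable in test2 with a p-value above 0.05 (p = 0.18), so it is not significant there. SeatsTotal, PitchDifference and WidthDifference have no estimates in either model: R reports NA for them because of singularities, that is, they are exact linear combinations of other predictors, so their significance cannot be assessed.
4.The PriceEconomy model explains about 96% of the variance in economy price (R-squared = 0.9565), whereas the PriceRelative model explains only about 34% of the variance in relative price (R-squared = 0.3421).
5.The F-statistic checks how well all the variables taken together predict the dependent variable, compared with a model that has an intercept only. Both models have an overall p-value < 2.2e-16 (F = 983.6 for test2 and F = 25.89 for test3), so in each case the predictors are jointly highly significant.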
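The R-squared and F-statistic quoted in points 4 and 5 can be read from the summary object directly, without copying them from the printed output. A minimal sketch, using R's built-in mtcars data as a stand-in so that it runs without the airline file (the model formula here is only an illustration):

```r
# Stand-in model on a built-in data set; replace fit with test2 or test3 as needed.
fit <- lm(mpg ~ wt + hp, data = mtcars)
s <- summary(fit)

r2     <- s$r.squared       # share of variance explained
adj_r2 <- s$adj.r.squared   # adjusted for the number of predictors
f      <- s$fstatistic      # named vector: value, numdf, dendf

# Overall p-value of the F-test against the intercept-only model.
p_overall <- pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE)

c(r2 = r2, adj_r2 = adj_r2, p_overall = p_overall)
```

The printed summary shows "< 2.2e-16" whenever this p-value falls below machine precision, which is why both airline models display the same figure despite very different F-statistics.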