This R Markdown document contains an air ticket pricing analysis of six airlines.
# stringsAsFactors = TRUE reproduces the Factor columns shown below on R >= 4.0
AirlinesDATA <- read.csv("SixAirlinesDataV2.csv", stringsAsFactors = TRUE)
# Data frame structure
str(AirlinesDATA)
## 'data.frame': 458 obs. of 18 variables:
## $ Airline : Factor w/ 6 levels "AirFrance","British",..: 2 2 2 2 2 2 2 2 2 2 ...
## $ Aircraft : Factor w/ 2 levels "AirBus","Boeing": 2 2 2 2 2 2 2 2 2 2 ...
## $ FlightDuration : num 12.25 12.25 12.25 12.25 8.16 ...
## $ TravelMonth : Factor w/ 4 levels "Aug","Jul","Oct",..: 2 1 4 3 1 4 3 1 4 4 ...
## $ IsInternational : Factor w/ 2 levels "Domestic","International": 2 2 2 2 2 2 2 2 2 2 ...
## $ SeatsEconomy : int 122 122 122 122 122 122 122 122 122 122 ...
## $ SeatsPremium : int 40 40 40 40 40 40 40 40 40 40 ...
## $ PitchEconomy : int 31 31 31 31 31 31 31 31 31 31 ...
## $ PitchPremium : int 38 38 38 38 38 38 38 38 38 38 ...
## $ WidthEconomy : int 18 18 18 18 18 18 18 18 18 18 ...
## $ WidthPremium : int 19 19 19 19 19 19 19 19 19 19 ...
## $ PriceEconomy : int 2707 2707 2707 2707 1793 1793 1793 1476 1476 1705 ...
## $ PricePremium : int 3725 3725 3725 3725 2999 2999 2999 2997 2997 2989 ...
## $ PriceRelative : num 0.38 0.38 0.38 0.38 0.67 0.67 0.67 1.03 1.03 0.75 ...
## $ SeatsTotal : int 162 162 162 162 162 162 162 162 162 162 ...
## $ PitchDifference : int 7 7 7 7 7 7 7 7 7 7 ...
## $ WidthDifference : int 1 1 1 1 1 1 1 1 1 1 ...
## $ PercentPremiumSeats: num 24.7 24.7 24.7 24.7 24.7 ...
View(AirlinesDATA)
From Part 1, we have a basic picture of which variables contribute to ticket pricing.
model <- PricePremium ~ PitchDifference + WidthDifference + SeatsTotal + FlightDuration + PriceEconomy + PercentPremiumSeats + IsInternational
model2 <- PriceEconomy ~ PitchDifference + WidthDifference + SeatsTotal + FlightDuration + PricePremium + PercentPremiumSeats + IsInternational
Model <- PricePremium ~ PriceEconomy + PitchDifference + WidthDifference + SeatsTotal + FlightDuration + PercentPremiumSeats + IsInternational
fit <- lm(Model,data=AirlinesDATA)
summary(fit)
##
## Call:
## lm(formula = Model, data = AirlinesDATA)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -1010.0  -258.4   -49.9   133.6  3416.7
##
## Coefficients:
##                                Estimate Std. Error t value Pr(>|t|)
## (Intercept)                  -1.213e+03  1.695e+02  -7.156 3.40e-12 ***
## PriceEconomy                  1.063e+00  3.077e-02  34.537  < 2e-16 ***
## PitchDifference               8.421e+01  3.656e+01   2.303 0.021722 *
## WidthDifference               1.224e+02  3.373e+01   3.629 0.000318 ***
## SeatsTotal                    1.920e+00  3.241e-01   5.922 6.31e-09 ***
## FlightDuration                8.459e+01  8.507e+00   9.943  < 2e-16 ***
## PercentPremiumSeats           3.190e+01  5.220e+00   6.112 2.14e-09 ***
## IsInternationalInternational -7.412e+02  2.001e+02  -3.704 0.000238 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 479 on 450 degrees of freedom
## Multiple R-squared: 0.8638, Adjusted R-squared: 0.8617
## F-statistic: 407.9 on 7 and 450 DF, p-value: < 2.2e-16
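To make the coefficient table concrete, the following minimal sketch recomputes the fitted PricePremium for the first observation by hand. It copies the rounded estimates from the `summary(fit)` output and the first-row values shown by `str(AirlinesDATA)`, and assumes the usual treatment coding in which the dummy `IsInternationalInternational` is 1 for an international flight. Because the inputs are rounded, the result is only approximate; with the real fit, `predict(fit, newdata = AirlinesDATA[1, ])` should give the unrounded value.

```r
# Rounded estimates copied from the summary(fit) output above.
coefs <- c(
  "(Intercept)"                = -1213,
  PriceEconomy                 = 1.063,
  PitchDifference              = 84.21,
  WidthDifference              = 122.4,
  SeatsTotal                   = 1.920,
  FlightDuration               = 84.59,
  PercentPremiumSeats          = 31.90,
  IsInternationalInternational = -741.2
)

# First row as displayed by str(); the leading 1 multiplies the intercept,
# the trailing 1 is the assumed dummy value for "International".
row1 <- c(1, 2707, 7, 1, 162, 12.25, 24.7, 1)

predicted <- sum(coefs * row1)   # linear predictor: sum of estimate * value
observed  <- 3725                # PricePremium of the first row

print(predicted)                 # about 3770.4
print(observed - predicted)      # residual, about -45.4
```

The hand-computed residual of roughly -45 is small relative to the residual standard error of 479 reported above.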
Model2 <- PriceEconomy ~ PitchDifference + WidthDifference + SeatsTotal + FlightDuration + PricePremium
fit2 <- lm(Model2,data=AirlinesDATA)
summary(fit2)
##
## Call:
## lm(formula = Model2, data = AirlinesDATA)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -2164.31  -187.76    -2.55   102.65  1030.42
##
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)
## (Intercept)     441.87030  104.31163   4.236 2.76e-05 ***
## PitchDifference -26.24484   17.54055  -1.496   0.1353
## WidthDifference -39.11664   26.33624  -1.485   0.1382
## SeatsTotal       -0.49649    0.24004  -2.068   0.0392 *
## FlightDuration  -10.27514    7.41826  -1.385   0.1667
## PricePremium      0.71514    0.02026  35.290  < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 418.9 on 452 degrees of freedom
## Multiple R-squared: 0.8223, Adjusted R-squared: 0.8203
## F-statistic: 418.3 on 5 and 452 DF, p-value: < 2.2e-16
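We can now set Model2 aside: three of its five predictors (PitchDifference, WidthDifference and FlightDuration) are not significant at the 0.05 level, and its adjusted R-squared (0.8203) is lower than that of Model (0.8617). Since the two models have different response variables, the R-squared comparison is only a rough guide, but on both counts Model fits better.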
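The comparison above reads the numbers off two printed summaries; the same figures can be pulled out programmatically. The sketch below is an illustration, not part of the original analysis: `fit_overview` is a hypothetical helper, and the data frame `demo` is synthetic stand-in data (with column names borrowed from the dataset) so the code runs without `SixAirlinesDataV2.csv`. Both demo models share one response, which is the setting where comparing adjusted R-squared is most meaningful.

```r
# Hypothetical helper (not in the original analysis): one-row overview of an lm fit.
fit_overview <- function(fit, alpha = 0.05) {
  s <- summary(fit)
  p <- s$coefficients[-1, "Pr(>|t|)"]          # p-values, intercept row dropped
  data.frame(
    response      = all.vars(formula(fit))[1], # name of the response variable
    adj_r_squared = s$adj.r.squared,
    residual_se   = s$sigma,
    n_predictors  = length(p),
    n_significant = sum(p < alpha)
  )
}

# Synthetic stand-in data; the values are simulated, not taken from the airlines file.
set.seed(1)
n <- 200
demo <- data.frame(
  PriceEconomy    = runif(n, 100, 3000),
  PitchDifference = sample(2:10, n, replace = TRUE),
  noise_var       = rnorm(n)                   # unrelated to the response by construction
)
demo$PricePremium <- 200 + 1.1 * demo$PriceEconomy +
  80 * demo$PitchDifference + rnorm(n, sd = 300)

fit_a <- lm(PricePremium ~ PriceEconomy + PitchDifference, data = demo)
fit_b <- lm(PricePremium ~ PitchDifference + noise_var, data = demo)

overview <- rbind(fit_overview(fit_a), fit_overview(fit_b))
print(overview)
```

On the real data, `rbind(fit_overview(fit), fit_overview(fit2))` would presumably reproduce the adjusted R-squared values quoted above, with the caveat that `fit` and `fit2` model different responses.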
library(leaps)
leap1 <- regsubsets(Model, data = AirlinesDATA, nbest=1)
plot(leap1, scale="adjr2")
leap2 <- regsubsets(Model2, data = AirlinesDATA, nbest=1)
plot(leap2, scale="adjr2")
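The `regsubsets` plots rank variable subsets visually; the ranking can also be read from the summary object. The following is a minimal sketch under stated assumptions: the `leaps` package is installed, and the data frame `demo` with columns `x1`, `x2`, `x3`, `y` is synthetic and purely illustrative. It picks the subset size with the highest adjusted R-squared, the same criterion as `scale="adjr2"` in the plots.

```r
# Runs only if 'leaps' is available; the data are simulated for illustration.
if (requireNamespace("leaps", quietly = TRUE)) {
  set.seed(42)
  n <- 150
  demo <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
  demo$y <- 3 * demo$x1 - 2 * demo$x2 + rnorm(n)    # x3 is pure noise

  subsets <- leaps::regsubsets(y ~ x1 + x2 + x3, data = demo, nbest = 1)
  s <- summary(subsets)

  best_size <- which.max(s$adjr2)                   # row with the highest adjusted R-squared
  best_vars <- names(which(s$which[best_size, ]))   # selected terms, including "(Intercept)"

  print(s$adjr2)
  print(best_vars)
}
```

Applied to the objects in this document, `summary(leap1)$adjr2` and `summary(leap1)$which` should expose the same information that `plot(leap1, scale="adjr2")` displays.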
From the OLS regression (Model), the price of a Premium seat varies with the price of an Economy seat and with the other independent variables: PitchDifference, WidthDifference, SeatsTotal, FlightDuration, PercentPremiumSeats and IsInternational. All seven predictors are significant at the 0.05 level.