This R Markdown document contains an analysis of air ticket pricing for six airlines.
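# stringsAsFactors = TRUE keeps text columns as factors, as in the str() output below (the default changed in R 4.0)
AirlinesDATA <- read.csv("SixAirlinesDataV2.csv", stringsAsFactors = TRUE)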
#DataFrame Structure
str(AirlinesDATA)
## 'data.frame': 458 obs. of 18 variables:
## $ Airline : Factor w/ 6 levels "AirFrance","British",..: 2 2 2 2 2 2 2 2 2 2 ...
## $ Aircraft : Factor w/ 2 levels "AirBus","Boeing": 2 2 2 2 2 2 2 2 2 2 ...
## $ FlightDuration : num 12.25 12.25 12.25 12.25 8.16 ...
## $ TravelMonth : Factor w/ 4 levels "Aug","Jul","Oct",..: 2 1 4 3 1 4 3 1 4 4 ...
## $ IsInternational : Factor w/ 2 levels "Domestic","International": 2 2 2 2 2 2 2 2 2 2 ...
## $ SeatsEconomy : int 122 122 122 122 122 122 122 122 122 122 ...
## $ SeatsPremium : int 40 40 40 40 40 40 40 40 40 40 ...
## $ PitchEconomy : int 31 31 31 31 31 31 31 31 31 31 ...
## $ PitchPremium : int 38 38 38 38 38 38 38 38 38 38 ...
## $ WidthEconomy : int 18 18 18 18 18 18 18 18 18 18 ...
## $ WidthPremium : int 19 19 19 19 19 19 19 19 19 19 ...
## $ PriceEconomy : int 2707 2707 2707 2707 1793 1793 1793 1476 1476 1705 ...
## $ PricePremium : int 3725 3725 3725 3725 2999 2999 2999 2997 2997 2989 ...
## $ PriceRelative : num 0.38 0.38 0.38 0.38 0.67 0.67 0.67 1.03 1.03 0.75 ...
## $ SeatsTotal : int 162 162 162 162 162 162 162 162 162 162 ...
## $ PitchDifference : int 7 7 7 7 7 7 7 7 7 7 ...
## $ WidthDifference : int 1 1 1 1 1 1 1 1 1 1 ...
## $ PercentPremiumSeats: num 24.7 24.7 24.7 24.7 24.7 ...
View(AirlinesDATA)
#Summary Stats of Dataset
library(psych)
describe(AirlinesDATA)
## vars n mean sd median trimmed mad min
## Airline* 1 458 3.01 1.65 2.00 2.89 1.48 1.00
## Aircraft* 2 458 1.67 0.47 2.00 1.71 0.00 1.00
## FlightDuration 3 458 7.58 3.54 7.79 7.57 4.81 1.25
## TravelMonth* 4 458 2.56 1.17 3.00 2.58 1.48 1.00
## IsInternational* 5 458 1.91 0.28 2.00 2.00 0.00 1.00
## SeatsEconomy 6 458 202.31 76.37 185.00 194.64 85.99 78.00
## SeatsPremium 7 458 33.65 13.26 36.00 33.35 11.86 8.00
## PitchEconomy 8 458 31.22 0.66 31.00 31.26 0.00 30.00
## PitchPremium 9 458 37.91 1.31 38.00 38.05 0.00 34.00
## WidthEconomy 10 458 17.84 0.56 18.00 17.81 0.00 17.00
## WidthPremium 11 458 19.47 1.10 19.00 19.53 0.00 17.00
## PriceEconomy 12 458 1327.08 988.27 1242.00 1244.40 1159.39 65.00
## PricePremium 13 458 1845.26 1288.14 1737.00 1799.05 1845.84 86.00
## PriceRelative 14 458 0.49 0.45 0.36 0.42 0.41 0.02
## SeatsTotal 15 458 235.96 85.29 227.00 228.73 90.44 98.00
## PitchDifference 16 458 6.69 1.76 7.00 6.76 0.00 2.00
## WidthDifference 17 458 1.63 1.19 1.00 1.53 0.00 0.00
## PercentPremiumSeats 18 458 14.65 4.84 13.21 14.31 2.68 4.71
## max range skew kurtosis se
## Airline* 6.00 5.00 0.61 -0.95 0.08
## Aircraft* 2.00 1.00 -0.72 -1.48 0.02
## FlightDuration 14.66 13.41 -0.07 -1.12 0.17
## TravelMonth* 4.00 3.00 -0.14 -1.46 0.05
## IsInternational* 2.00 1.00 -2.91 6.50 0.01
## SeatsEconomy 389.00 311.00 0.72 -0.36 3.57
## SeatsPremium 66.00 58.00 0.23 -0.46 0.62
## PitchEconomy 33.00 3.00 -0.03 -0.35 0.03
## PitchPremium 40.00 6.00 -1.51 3.52 0.06
## WidthEconomy 19.00 2.00 -0.04 -0.08 0.03
## WidthPremium 21.00 4.00 -0.08 -0.31 0.05
## PriceEconomy 3593.00 3528.00 0.51 -0.88 46.18
## PricePremium 7414.00 7328.00 0.50 0.43 60.19
## PriceRelative 1.89 1.87 1.17 0.72 0.02
## SeatsTotal 441.00 343.00 0.70 -0.53 3.99
## PitchDifference 10.00 8.00 -0.54 1.78 0.08
## WidthDifference 4.00 4.00 0.84 -0.53 0.06
## PercentPremiumSeats 24.69 19.98 0.71 0.28 0.23
IorD <- xtabs(~AirlinesDATA$Airline+AirlinesDATA$IsInternational)
IorD
## AirlinesDATA$IsInternational
## AirlinesDATA$Airline Domestic International
## AirFrance 0 74
## British 0 175
## Delta 40 6
## Jet 0 61
## Singapore 0 40
## Virgin 0 62
Since only Delta has domestic flights in this dataset, we can exclude the International/Domestic variable from the airline ticket pricing comparison.
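The claim that a single airline accounts for all domestic records can be checked directly from the contingency table. The following is a minimal sketch on a made-up data frame (`toy`, hypothetical) that reproduces only the Delta and Jet counts from the table above; on the real data, `toy` would be replaced by `AirlinesDATA`.

```r
# Hypothetical miniature reproducing the Delta and Jet rows of the table above
toy <- data.frame(
  Airline = c(rep("Delta", 46), rep("Jet", 61)),
  IsInternational = c(rep("Domestic", 40), rep("International", 6),
                      rep("International", 61))
)

counts <- xtabs(~ Airline + IsInternational, data = toy)

# Row-wise shares: fraction of each airline's records in each category
shares <- prop.table(counts, margin = 1)
round(shares, 2)

# Airlines with at least one domestic record
domestic_airlines <- rownames(counts)[counts[, "Domestic"] > 0]
domestic_airlines
```

If `domestic_airlines` contains a single name, the International/Domestic variable carries almost the same information as that airline's identity.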
AirlineDuration <- xtabs(~AirlinesDATA$Airline+ AirlinesDATA$FlightDuration)
AirlineDuration
## AirlinesDATA$FlightDuration
## AirlinesDATA$Airline 1.25 1.33 1.57 1.75 1.8 1.81 1.83 1.91 1.95 2.01 2.06
## AirFrance 0 0 0 0 0 0 0 0 0 0 0
## British 6 3 0 0 0 0 4 0 0 0 0
## Delta 0 0 2 1 1 1 1 1 1 1 2
## Jet 0 0 0 0 0 0 0 0 0 0 0
## Singapore 0 0 0 0 0 0 0 0 0 0 0
## Virgin 0 0 0 0 0 0 0 0 0 0 0
## AirlinesDATA$FlightDuration
## AirlinesDATA$Airline 2.26 2.3 2.33 2.41 2.5 2.55 2.58 2.66 2.83 2.86 3.08
## AirFrance 0 0 0 0 0 0 0 0 0 0 0
## British 0 0 0 8 0 0 0 1 2 0 0
## Delta 1 2 2 0 1 2 0 0 0 1 0
## Jet 0 0 0 0 4 0 8 4 0 0 8
## Singapore 0 0 0 0 0 0 0 0 0 0 0
## Virgin 0 0 0 0 0 0 0 0 0 0 0
## AirlinesDATA$FlightDuration
## AirlinesDATA$Airline 3.16 3.25 3.58 3.83 4.08 4.16 4.25 4.26 4.28 4.33
## AirFrance 0 0 0 0 0 0 0 0 0 0
## British 0 3 6 8 3 0 0 0 0 0
## Delta 0 0 0 0 0 0 2 2 1 2
## Jet 4 9 0 0 2 5 0 0 0 4
## Singapore 0 0 0 4 0 0 0 0 0 0
## Virgin 0 0 0 0 0 0 0 0 0 0
## AirlinesDATA$FlightDuration
## AirlinesDATA$Airline 4.36 4.4 4.43 4.5 4.51 4.63 4.65 4.66 4.7 4.91 5.41
## AirFrance 0 0 0 0 0 0 0 0 0 0 0
## British 0 0 0 3 0 0 0 0 0 3 3
## Delta 1 1 1 0 2 2 1 1 4 0 0
## Jet 0 0 0 0 0 0 0 0 0 0 0
## Singapore 0 0 0 0 0 0 0 0 0 0 0
## Virgin 0 0 0 0 0 0 0 0 0 0 0
## AirlinesDATA$FlightDuration
## AirlinesDATA$Airline 5.66 6.08 6.16 6.5 6.58 6.66 6.75 6.83 6.91 7.08 7.25
## AirFrance 0 0 0 0 0 0 0 5 4 0 0
## British 0 3 0 3 0 3 3 4 0 4 3
## Delta 0 0 0 0 0 0 0 0 0 0 0
## Jet 6 0 0 0 0 0 0 0 0 0 0
## Singapore 0 0 4 4 0 0 0 0 0 0 0
## Virgin 0 0 0 0 4 0 0 0 4 4 0
## AirlinesDATA$FlightDuration
## AirlinesDATA$Airline 7.33 7.41 7.5 7.58 7.66 7.75 7.83 8 8.08 8.16 8.25
## AirFrance 3 0 3 0 1 3 1 0 1 0 0
## British 0 0 0 4 0 0 0 0 0 3 3
## Delta 0 0 0 0 0 0 0 0 0 0 0
## Jet 0 0 0 0 0 0 0 0 0 0 0
## Singapore 0 0 0 0 0 0 0 0 0 0 0
## Virgin 0 4 0 0 4 4 0 4 0 0 0
## AirlinesDATA$FlightDuration
## AirlinesDATA$Airline 8.33 8.41 8.5 8.58 8.66 8.75 8.83 8.91 9.16 9.18 9.25
## AirFrance 11 1 3 0 0 3 0 3 2 2 4
## British 0 0 0 3 3 3 0 6 3 0 0
## Delta 3 0 0 0 0 0 0 0 0 0 0
## Jet 0 0 0 0 0 0 0 3 0 0 0
## Singapore 0 0 0 0 0 0 0 0 0 0 0
## Virgin 0 0 0 0 0 0 4 0 0 0 0
## AirlinesDATA$FlightDuration
## AirlinesDATA$Airline 9.33 9.41 9.5 9.58 9.66 9.91 10.41 10.5 10.66 10.75
## AirFrance 0 3 7 0 0 0 0 0 3 0
## British 3 0 0 3 0 4 3 4 0 0
## Delta 0 0 3 0 0 0 0 0 0 0
## Jet 0 0 4 0 0 0 0 0 0 0
## Singapore 0 0 0 0 3 0 0 0 0 0
## Virgin 0 0 0 0 0 4 4 0 0 4
## AirlinesDATA$FlightDuration
## AirlinesDATA$Airline 10.83 11 11.08 11.16 11.25 11.33 11.41 11.5 11.58
## AirFrance 0 0 0 0 0 0 0 2 0
## British 0 3 7 7 0 0 8 4 3
## Delta 0 0 0 0 0 0 0 0 0
## Jet 0 0 0 0 0 0 0 0 0
## Singapore 3 0 0 0 0 0 0 0 0
## Virgin 4 0 0 0 4 3 0 0 0
## AirlinesDATA$FlightDuration
## AirlinesDATA$Airline 11.75 11.91 12.05 12.08 12.25 12.41 12.5 12.58 12.66
## AirFrance 2 4 0 0 0 0 0 0 0
## British 0 0 1 0 4 0 2 0 0
## Delta 0 0 0 0 0 0 0 0 0
## Jet 0 0 0 0 0 0 0 0 0
## Singapore 0 0 0 0 0 3 0 0 4
## Virgin 0 0 0 4 0 0 0 3 0
## AirlinesDATA$FlightDuration
## AirlinesDATA$Airline 12.75 13 13.08 13.33 13.5 13.83 13.91 14.66
## AirFrance 0 3 0 0 0 0 0 0
## British 3 0 3 3 3 3 0 0
## Delta 0 0 0 0 0 0 0 0
## Jet 0 0 0 0 0 0 0 0
## Singapore 4 0 0 4 0 0 4 3
## Virgin 0 0 0 0 0 0 0 0
par(mfrow=c(2,1))
boxplot(AirlinesDATA$PriceEconomy~AirlinesDATA$Aircraft,main="Price vs Aircraft",xlab="Economy class",las=1,horizontal = TRUE,col=c("red","yellow"))
boxplot(AirlinesDATA$PricePremium~AirlinesDATA$Aircraft,main="Price vs Aircraft",xlab="Premium class",las=1,horizontal = TRUE,col=c("red","yellow"))
par(mfrow=c(1,2))
boxplot(AirlinesDATA$PriceEconomy~AirlinesDATA$Airline,main="Economy class v/s Airlines",xlab="Price",las=1,horizontal=TRUE,col=c("aquamarine2","burlywood1"))
boxplot(AirlinesDATA$PricePremium~AirlinesDATA$Airline,main="Premium class v/s Airlines",xlab="Price",las=1,horizontal=TRUE,col=c("aquamarine2","burlywood1"))
par(mfrow=c(1,3))
boxplot(AirlinesDATA$SeatsEconomy~AirlinesDATA$Aircraft,main="Economy Seats vs Aircraft",xlab="Economy Class",las=1,horizontal = TRUE, col=c("red","yellow"))
boxplot(AirlinesDATA$SeatsPremium~AirlinesDATA$Aircraft,main="Premium Seats vs Aircraft",xlab="Premium Class",las=1,horizontal = TRUE, col=c("red","yellow"))
boxplot(AirlinesDATA$SeatsTotal~AirlinesDATA$Aircraft,main="Total Seats vs Aircraft",xlab="Total Seats",las=1,horizontal = TRUE,col=c("red","yellow"))
par(mfrow=c(2,1))
boxplot(AirlinesDATA$PitchEconomy~AirlinesDATA$Aircraft,main="Economy Pitch vs Aircraft",xlab="Pitch",las=1,horizontal = TRUE)
boxplot(AirlinesDATA$PitchPremium~AirlinesDATA$Aircraft,main="Premium Pitch vs Aircraft",xlab="Pitch",las=1,horizontal = TRUE)
par(mfrow=c(2,1))
boxplot(AirlinesDATA$WidthEconomy~AirlinesDATA$Aircraft,main="Economy Width vs Aircraft",xlab="Width",las=1,horizontal = TRUE)
boxplot(AirlinesDATA$WidthPremium~AirlinesDATA$Aircraft,main="Premium Width vs Aircraft",xlab="Width",las=1,horizontal = TRUE )
par(mfrow=c(1,2))
boxplot(AirlinesDATA$FlightDuration~AirlinesDATA$Aircraft,xlab="Flight Duration",main="Flight Duration v/s Aircraft",las=1,horizontal = TRUE,col=c("red","yellow"))
boxplot(AirlinesDATA$FlightDuration~AirlinesDATA$Airline,xlab="Flight Duration",main="Flight Duration v/s Airline",las=1,horizontal = TRUE,col=c("red","yellow"))
library(car)
##
## Attaching package: 'car'
## The following object is masked from 'package:psych':
##
## logit
scatterplotMatrix(~PricePremium+PriceEconomy+PitchDifference+WidthDifference,data = AirlinesDATA)
Observations:
PricePremium varies with PitchDifference and with WidthDifference.
PriceEconomy varies with PitchDifference and with WidthDifference.
library(car)
scatterplotMatrix(~PricePremium+PriceEconomy+SeatsTotal+FlightDuration,data = AirlinesDATA)
Observations:
PricePremium varies with SeatsTotal and with FlightDuration.
PriceEconomy varies with SeatsTotal and with FlightDuration.
data1 <- AirlinesDATA[,c(3,12,13,15,16,17)]
cor(data1)
## FlightDuration PriceEconomy PricePremium SeatsTotal
## FlightDuration 1.00000000 0.56664039 0.64873981 0.20023299
## PriceEconomy 0.56664039 1.00000000 0.90138870 0.13243313
## PricePremium 0.64873981 0.90138870 1.00000000 0.19232533
## SeatsTotal 0.20023299 0.13243313 0.19232533 1.00000000
## PitchDifference -0.03749288 -0.09952511 -0.01806629 0.03416915
## WidthDifference -0.11856070 -0.08449975 -0.01151218 -0.10584398
## PitchDifference WidthDifference
## FlightDuration -0.03749288 -0.11856070
## PriceEconomy -0.09952511 -0.08449975
## PricePremium -0.01806629 -0.01151218
## SeatsTotal 0.03416915 -0.10584398
## PitchDifference 1.00000000 0.76089108
## WidthDifference 0.76089108 1.00000000
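The subset `data1` is built from column positions, which silently change meaning if the CSV column order changes. As a hedged alternative, the same columns can be selected by name. The sketch below uses a made-up data frame (`toy`, hypothetical, with random values) that only mimics the column names of `AirlinesDATA`; the resulting correlations are therefore meaningless and serve only to show the mechanics.

```r
# Made-up data frame that only mimics some column names of AirlinesDATA
set.seed(1)
n <- 30
toy <- data.frame(
  Airline         = sample(c("Delta", "Jet"), n, replace = TRUE),
  FlightDuration  = runif(n, 1, 15),
  PriceEconomy    = runif(n, 65, 3600),
  PricePremium    = runif(n, 86, 7400),
  SeatsTotal      = sample(98:441, n, replace = TRUE),
  PitchDifference = sample(2:10, n, replace = TRUE),
  WidthDifference = sample(0:4, n, replace = TRUE)
)

# Select the numeric columns by name instead of by position
vars <- c("FlightDuration", "PriceEconomy", "PricePremium",
          "SeatsTotal", "PitchDifference", "WidthDifference")
data1_by_name <- toy[, vars]

# Pearson correlation matrix of the selected columns
cor_matrix <- cor(data1_by_name)
round(cor_matrix, 2)
```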
library(corrplot)
## corrplot 0.84 loaded
corrplot(corr=cor(data1),method="ellipse")
library(corrgram)
corrgram(data1 ,upper.panel =panel.pie,text.panel =panel.txt, lower.panel = panel.shade)
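H0: there is no correlation between PriceEconomy and FlightDuration. Note that t.test() below is a Welch two-sample test of the difference in means, so its p-value does not by itself address correlation.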
t.test(AirlinesDATA$PriceEconomy,AirlinesDATA$FlightDuration)
##
## Welch Two Sample t-test
##
## data: AirlinesDATA$PriceEconomy and AirlinesDATA$FlightDuration
## t = 28.573, df = 457.01, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 1228.749 1410.249
## sample estimates:
## mean of x mean of y
## 1327.076419 7.577838
H0: there is no correlation between PriceEconomy and SeatsTotal. The Welch t-test below compares the two means, not their correlation.
t.test(AirlinesDATA$PriceEconomy,AirlinesDATA$SeatsTotal)
##
## Welch Two Sample t-test
##
## data: AirlinesDATA$PriceEconomy and AirlinesDATA$SeatsTotal
## t = 23.54, df = 463.81, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 1000.032 1182.199
## sample estimates:
## mean of x mean of y
## 1327.0764 235.9607
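A hypothesis about correlation is usually tested with `cor.test()` rather than `t.test()`. The sketch below runs a Pearson correlation test on simulated data; the vectors `duration` and `price` are made up for illustration and are not taken from SixAirlinesDataV2.csv. On the real data the call would presumably be `cor.test(AirlinesDATA$PriceEconomy, AirlinesDATA$FlightDuration)`.

```r
# Simulated stand-ins for FlightDuration and PriceEconomy (hypothetical values)
set.seed(42)
n <- 100
duration <- runif(n, 1, 15)
price <- 100 * duration + rnorm(n, sd = 200)

# Pearson correlation test; H0: the true correlation is zero
res <- cor.test(price, duration)

res$estimate   # sample correlation coefficient
res$p.value    # small values are evidence against H0
res$conf.int   # 95% confidence interval for the correlation
```

Unlike the two-sample t-test, this result does not depend on the units or the means of the two variables.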
From Part 1, we now have a basic picture of which variables contribute to ticket pricing.
# Candidate formulas with the full predictor set (kept for reference; not fitted below)
model <- PricePremium ~ PitchDifference + WidthDifference + SeatsTotal + FlightDuration + PriceEconomy + PercentPremiumSeats + IsInternational
model2 <- PriceEconomy ~ PitchDifference + WidthDifference + SeatsTotal + FlightDuration + PricePremium + PercentPremiumSeats + IsInternational
# Formula actually fitted first: premium price as the response
Model <- PricePremium ~ PriceEconomy + PitchDifference + WidthDifference + SeatsTotal + FlightDuration + PercentPremiumSeats + IsInternational
fit <- lm(Model,data=AirlinesDATA)
summary(fit)
##
## Call:
## lm(formula = Model, data = AirlinesDATA)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1010.0 -258.4 -49.9 133.6 3416.7
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.213e+03 1.695e+02 -7.156 3.40e-12 ***
## PriceEconomy 1.063e+00 3.077e-02 34.537 < 2e-16 ***
## PitchDifference 8.421e+01 3.656e+01 2.303 0.021722 *
## WidthDifference 1.224e+02 3.373e+01 3.629 0.000318 ***
## SeatsTotal 1.920e+00 3.241e-01 5.922 6.31e-09 ***
## FlightDuration 8.459e+01 8.507e+00 9.943 < 2e-16 ***
## PercentPremiumSeats 3.190e+01 5.220e+00 6.112 2.14e-09 ***
## IsInternationalInternational -7.412e+02 2.001e+02 -3.704 0.000238 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 479 on 450 degrees of freedom
## Multiple R-squared: 0.8638, Adjusted R-squared: 0.8617
## F-statistic: 407.9 on 7 and 450 DF, p-value: < 2.2e-16
Model2 <- PriceEconomy ~ PitchDifference + WidthDifference + SeatsTotal + FlightDuration + PricePremium
fit2 <- lm(Model2,data=AirlinesDATA)
summary(fit2)
##
## Call:
## lm(formula = Model2, data = AirlinesDATA)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2164.31 -187.76 -2.55 102.65 1030.42
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 441.87030 104.31163 4.236 2.76e-05 ***
## PitchDifference -26.24484 17.54055 -1.496 0.1353
## WidthDifference -39.11664 26.33624 -1.485 0.1382
## SeatsTotal -0.49649 0.24004 -2.068 0.0392 *
## FlightDuration -10.27514 7.41826 -1.385 0.1667
## PricePremium 0.71514 0.02026 35.290 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 418.9 on 452 degrees of freedom
## Multiple R-squared: 0.8223, Adjusted R-squared: 0.8203
## F-statistic: 418.3 on 5 and 452 DF, p-value: < 2.2e-16
We can now set Model2 aside: three of its five predictors have p > 0.05, and its R-squared (0.8223) is lower than that of Model (0.8638). Model, the first model, fits best.
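The p-values and R-squared used in this comparison can also be read off a fitted model programmatically. The following is a minimal sketch on simulated data; `toy`, `x1`, `x2`, `y` and `toy_fit` are hypothetical names, and on the real data `toy_fit` would be replaced by `fit` or `fit2`.

```r
# Simulated data in which y depends on x1 but not on x2 (hypothetical)
set.seed(7)
n <- 200
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$y <- 5 + 3 * toy$x1 + rnorm(n)

toy_fit <- lm(y ~ x1 + x2, data = toy)

# Coefficient table: Estimate, Std. Error, t value, Pr(>|t|)
coef_table <- coef(summary(toy_fit))
p_values <- coef_table[, "Pr(>|t|)"]

# Terms significant at the 5% level
significant <- names(p_values)[p_values < 0.05]
significant

# Share of variance explained by the model
r_squared <- summary(toy_fit)$r.squared
r_squared
```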
From the OLS regression (Model), the premium-class price varies with the economy-class price and with the other independent variables: PitchDifference, WidthDifference, SeatsTotal, FlightDuration, PercentPremiumSeats and IsInternational, all of which are significant at the 5% level in the summary above.
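To see how the fitted coefficients combine, the sketch below recomputes the predicted premium price for the first record shown in the str() output (British, international, PriceEconomy 2707, PricePremium 3725). It uses the rounded estimates printed in the summary of Model and the rounded PercentPremiumSeats of 24.7, so the result is approximate; the variable names are introduced here for illustration only.

```r
# Rounded coefficient estimates as printed in summary(fit) for Model
coefs <- c(
  intercept           = -1213,
  PriceEconomy        = 1.063,
  PitchDifference     = 84.21,
  WidthDifference     = 122.4,
  SeatsTotal          = 1.920,
  FlightDuration      = 84.59,
  PercentPremiumSeats = 31.90,
  International       = -741.2
)

# First record of the dataset, values as displayed in the str() output
first_row <- c(
  intercept           = 1,      # multiplies the intercept
  PriceEconomy        = 2707,
  PitchDifference     = 7,
  WidthDifference     = 1,
  SeatsTotal          = 162,
  FlightDuration      = 12.25,
  PercentPremiumSeats = 24.7,
  International       = 1       # dummy: 1 for an international flight
)

# Linear predictor: sum of coefficient times value
predicted <- sum(coefs * first_row[names(coefs)])
observed <- 3725

predicted              # about 3770
observed - predicted   # residual of roughly -45
```

The coefficient on PriceEconomy means that, holding the other variables fixed, a one-unit increase in the economy price is associated with an increase of about 1.06 in the premium price.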