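# Read the data; stringsAsFactors = TRUE keeps text columns as factors (no longer the default in R >= 4.0)
airlines.df <- read.csv("SixAirlinesDataV2.csv", stringsAsFactors = TRUE)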
View(airlines.df)
summary(airlines.df)
## Airline Aircraft FlightDuration TravelMonth
## AirFrance: 74 AirBus:151 Min. : 1.250 Aug:127
## British :175 Boeing:307 1st Qu.: 4.260 Jul: 75
## Delta : 46 Median : 7.790 Oct:127
## Jet : 61 Mean : 7.578 Sep:129
## Singapore: 40 3rd Qu.:10.620
## Virgin : 62 Max. :14.660
## IsInternational SeatsEconomy SeatsPremium PitchEconomy
## Domestic : 40 Min. : 78.0 Min. : 8.00 Min. :30.00
## International:418 1st Qu.:133.0 1st Qu.:21.00 1st Qu.:31.00
## Median :185.0 Median :36.00 Median :31.00
## Mean :202.3 Mean :33.65 Mean :31.22
## 3rd Qu.:243.0 3rd Qu.:40.00 3rd Qu.:32.00
## Max. :389.0 Max. :66.00 Max. :33.00
## PitchPremium WidthEconomy WidthPremium PriceEconomy
## Min. :34.00 Min. :17.00 Min. :17.00 Min. : 65
## 1st Qu.:38.00 1st Qu.:18.00 1st Qu.:19.00 1st Qu.: 413
## Median :38.00 Median :18.00 Median :19.00 Median :1242
## Mean :37.91 Mean :17.84 Mean :19.47 Mean :1327
## 3rd Qu.:38.00 3rd Qu.:18.00 3rd Qu.:21.00 3rd Qu.:1909
## Max. :40.00 Max. :19.00 Max. :21.00 Max. :3593
## PricePremium PriceRelative SeatsTotal PitchDifference
## Min. : 86.0 Min. :0.0200 Min. : 98 Min. : 2.000
## 1st Qu.: 528.8 1st Qu.:0.1000 1st Qu.:166 1st Qu.: 6.000
## Median :1737.0 Median :0.3650 Median :227 Median : 7.000
## Mean :1845.3 Mean :0.4872 Mean :236 Mean : 6.688
## 3rd Qu.:2989.0 3rd Qu.:0.7400 3rd Qu.:279 3rd Qu.: 7.000
## Max. :7414.0 Max. :1.8900 Max. :441 Max. :10.000
## WidthDifference PercentPremiumSeats
## Min. :0.000 Min. : 4.71
## 1st Qu.:1.000 1st Qu.:12.28
## Median :1.000 Median :13.21
## Mean :1.633 Mean :14.65
## 3rd Qu.:3.000 3rd Qu.:15.36
## Max. :4.000 Max. :24.69
library(psych)
describe(airlines.df)
## vars n mean sd median trimmed mad min
## Airline* 1 458 3.01 1.65 2.00 2.89 1.48 1.00
## Aircraft* 2 458 1.67 0.47 2.00 1.71 0.00 1.00
## FlightDuration 3 458 7.58 3.54 7.79 7.57 4.81 1.25
## TravelMonth* 4 458 2.56 1.17 3.00 2.58 1.48 1.00
## IsInternational* 5 458 1.91 0.28 2.00 2.00 0.00 1.00
## SeatsEconomy 6 458 202.31 76.37 185.00 194.64 85.99 78.00
## SeatsPremium 7 458 33.65 13.26 36.00 33.35 11.86 8.00
## PitchEconomy 8 458 31.22 0.66 31.00 31.26 0.00 30.00
## PitchPremium 9 458 37.91 1.31 38.00 38.05 0.00 34.00
## WidthEconomy 10 458 17.84 0.56 18.00 17.81 0.00 17.00
## WidthPremium 11 458 19.47 1.10 19.00 19.53 0.00 17.00
## PriceEconomy 12 458 1327.08 988.27 1242.00 1244.40 1159.39 65.00
## PricePremium 13 458 1845.26 1288.14 1737.00 1799.05 1845.84 86.00
## PriceRelative 14 458 0.49 0.45 0.36 0.42 0.41 0.02
## SeatsTotal 15 458 235.96 85.29 227.00 228.73 90.44 98.00
## PitchDifference 16 458 6.69 1.76 7.00 6.76 0.00 2.00
## WidthDifference 17 458 1.63 1.19 1.00 1.53 0.00 0.00
## PercentPremiumSeats 18 458 14.65 4.84 13.21 14.31 2.68 4.71
## max range skew kurtosis se
## Airline* 6.00 5.00 0.61 -0.95 0.08
## Aircraft* 2.00 1.00 -0.72 -1.48 0.02
## FlightDuration 14.66 13.41 -0.07 -1.12 0.17
## TravelMonth* 4.00 3.00 -0.14 -1.46 0.05
## IsInternational* 2.00 1.00 -2.91 6.50 0.01
## SeatsEconomy 389.00 311.00 0.72 -0.36 3.57
## SeatsPremium 66.00 58.00 0.23 -0.46 0.62
## PitchEconomy 33.00 3.00 -0.03 -0.35 0.03
## PitchPremium 40.00 6.00 -1.51 3.52 0.06
## WidthEconomy 19.00 2.00 -0.04 -0.08 0.03
## WidthPremium 21.00 4.00 -0.08 -0.31 0.05
## PriceEconomy 3593.00 3528.00 0.51 -0.88 46.18
## PricePremium 7414.00 7328.00 0.50 0.43 60.19
## PriceRelative 1.89 1.87 1.17 0.72 0.02
## SeatsTotal 441.00 343.00 0.70 -0.53 3.99
## PitchDifference 10.00 8.00 -0.54 1.78 0.08
## WidthDifference 4.00 4.00 0.84 -0.53 0.06
## PercentPremiumSeats 24.69 19.98 0.71 0.28 0.23
describe(airlines.df$PriceRelative)[3:4]
## mean sd
## X1 0.49 0.45
library(car)
##
## Attaching package: 'car'
## The following object is masked from 'package:psych':
##
## logit
scatterplot(airlines.df$Aircraft,airlines.df$PriceRelative)
## [1] "212" "308" "156" "157" "158" "159" "379" "380" "381" "382"
The relative price of premium seats is lower on Airbus flights than on Boeing flights.
scatterplot(airlines.df$Airline,airlines.df$PriceRelative)
## [1] "406" "407" "212" "408" "213" "426" "427" "214" "409" "339" "367"
## [12] "368" "369" "110" "111" "240" "241" "260" "271" "272" "185" "186"
## [23] "187" "188" "189" "190"
The relative price of premium seats on Air France and Delta flights is lower than on the other airlines.
scatterplot(airlines.df$PriceRelative,airlines.df$PitchDifference)
In the majority of flights, the relative price of premium seats is higher where the pitch difference is above 6, whereas it is small for a pitch difference of 3 or less.
scatterplot(airlines.df$PriceRelative,airlines.df$WidthDifference)
The width difference and the relative price of premium seats are positively related, and most of the spread in relative price occurs at a width difference of 1.
plot(airlines.df$Airline,airlines.df$PercentPremiumSeats, xlab="Airlines", ylab="%age of premium seats")
On average, the airlines allocate 12% to 16% of the seats on each flight to the premium class.
plot(airlines.df$TravelMonth,airlines.df$PriceRelative)
On average, all months have similar relative pricing for premium seats, although more of the flights in July sit at the higher end of that pricing.
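The group comparisons read off the plots can also be tabulated numerically. The sketch below is only an illustration: `summarise_by_group` is a hypothetical helper, and `sample_rows` holds five rows copied from the sorted listings in this report (rows 439, 81, 379, 156 and 160). Those rows are the extremes of `PriceRelative`, so the resulting means do not represent the full data set; passing `airlines.df` instead would give the real figures.

```r
# Five rows copied from the sorted listings in this report.
# They are the lowest and highest PriceRelative rows, so the numbers
# produced here only demonstrate the calls, not the true group means.
sample_rows <- data.frame(
  Airline = factor(c("AirFrance", "Delta", "Jet", "Virgin", "Virgin")),
  Aircraft = factor(c("AirBus", "Boeing", "Boeing", "Boeing", "Boeing")),
  PriceRelative = c(0.02, 0.03, 1.89, 1.82, 1.73)
)

# Hypothetical helper: count, mean and median of PriceRelative
# within each level of the grouping column named by group_col.
summarise_by_group <- function(df, group_col) {
  groups <- factor(df[[group_col]])
  data.frame(
    group = levels(groups),
    n = as.vector(table(groups)),
    mean = as.vector(tapply(df$PriceRelative, groups, mean)),
    median = as.vector(tapply(df$PriceRelative, groups, median))
  )
}

by_aircraft <- summarise_by_group(sample_rows, "Aircraft")
by_airline <- summarise_by_group(sample_rows, "Airline")
print(by_aircraft)
print(by_airline)
```

The same helper should work for `"TravelMonth"` on the full data frame, which gives a numeric counterpart to the monthly plot.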
attach(airlines.df)
newdata <- airlines.df[order(PriceRelative),]
newdata[1:10,1:18]
## Airline Aircraft FlightDuration TravelMonth IsInternational
## 439 AirFrance AirBus 13.00 Jul International
## 81 Delta Boeing 2.30 Jul Domestic
## 232 AirFrance AirBus 9.18 Jul International
## 233 AirFrance AirBus 9.18 Aug International
## 234 AirFrance AirBus 9.25 Sep International
## 235 AirFrance AirBus 9.25 Oct International
## 236 AirFrance AirBus 9.16 Jul International
## 237 AirFrance AirBus 9.16 Aug International
## 238 AirFrance AirBus 9.25 Sep International
## 239 AirFrance AirBus 9.25 Oct International
## SeatsEconomy SeatsPremium PitchEconomy PitchPremium WidthEconomy
## 439 389 38 32 38 18
## 81 78 20 31 34 18
## 232 147 21 32 38 18
## 233 147 21 32 38 18
## 234 147 21 32 38 18
## 235 147 21 32 38 18
## 236 147 21 32 38 18
## 237 147 21 32 38 18
## 238 147 21 32 38 18
## 239 147 21 32 38 18
## WidthPremium PriceEconomy PricePremium PriceRelative SeatsTotal
## 439 19 3220 3289 0.02 427
## 81 18 581 596 0.03 98
## 232 19 3165 3275 0.03 168
## 233 19 3165 3275 0.03 168
## 234 19 3165 3275 0.03 168
## 235 19 3165 3275 0.03 168
## 236 19 3165 3275 0.03 168
## 237 19 3165 3275 0.03 168
## 238 19 3165 3275 0.03 168
## 239 19 3165 3275 0.03 168
## PitchDifference WidthDifference PercentPremiumSeats
## 439 6 1 8.90
## 81 3 0 20.41
## 232 6 1 12.50
## 233 6 1 12.50
## 234 6 1 12.50
## 235 6 1 12.50
## 236 6 1 12.50
## 237 6 1 12.50
## 238 6 1 12.50
## 239 6 1 12.50
Air France flights account for nine of the ten lowest relative prices for premium seats.
newdata <- airlines.df[order(-PriceRelative),]
newdata[1:10,1:18]
## Airline Aircraft FlightDuration TravelMonth IsInternational
## 379 Jet Boeing 3.25 Aug International
## 380 Jet Boeing 3.25 Sep International
## 381 Jet Boeing 3.25 Oct International
## 382 Jet Boeing 3.25 Jul International
## 156 Virgin Boeing 11.25 Jul International
## 157 Virgin Boeing 11.25 Aug International
## 158 Virgin Boeing 11.25 Sep International
## 159 Virgin Boeing 11.25 Oct International
## 160 Virgin Boeing 12.08 Aug International
## 161 Virgin Boeing 12.08 Sep International
## SeatsEconomy SeatsPremium PitchEconomy PitchPremium WidthEconomy
## 379 124 16 30 40 17
## 380 124 16 30 40 17
## 381 124 16 30 40 17
## 382 124 16 30 40 17
## 156 198 35 31 38 18
## 157 198 35 31 38 18
## 158 198 35 31 38 18
## 159 198 35 31 38 18
## 160 198 35 31 38 18
## 161 198 35 31 38 18
## WidthPremium PriceEconomy PricePremium PriceRelative SeatsTotal
## 379 21 167 483 1.89 140
## 380 21 167 483 1.89 140
## 381 21 167 483 1.89 140
## 382 21 139 398 1.87 140
## 156 21 574 1619 1.82 233
## 157 21 574 1619 1.82 233
## 158 21 574 1619 1.82 233
## 159 21 574 1619 1.82 233
## 160 21 1086 2964 1.73 233
## 161 21 1086 2964 1.73 233
## PitchDifference WidthDifference PercentPremiumSeats
## 379 10 4 11.43
## 380 10 4 11.43
## 381 10 4 11.43
## 382 10 4 11.43
## 156 7 3 15.02
## 157 7 3 15.02
## 158 7 3 15.02
## 159 7 3 15.02
## 160 7 3 15.02
## 161 7 3 15.02
Jet flights have the highest relative price for premium seats (up to 1.89), followed by Virgin flights.
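The derived columns in these listings appear to follow simple formulas. The sketch below is an assumption inferred from the printed values, not something the data source documents: it recomputes `PriceRelative`, `SeatsTotal`, `PitchDifference`, `WidthDifference` and `PercentPremiumSeats` for rows 439 and 379 from their base columns, so the results can be compared with the values shown in the listings.

```r
# Base columns for rows 439 and 379, copied from the listings in this report.
rows <- data.frame(
  SeatsEconomy = c(389, 124),
  SeatsPremium = c(38, 16),
  PitchEconomy = c(32, 30),
  PitchPremium = c(38, 40),
  WidthEconomy = c(18, 17),
  WidthPremium = c(19, 21),
  PriceEconomy = c(3220, 167),
  PricePremium = c(3289, 483),
  row.names = c("439", "379")
)

# ASSUMPTION: these formulas are inferred from the printed values only.
derived <- data.frame(
  # Premium mark-up as a fraction of the economy price
  PriceRelative = round((rows$PricePremium - rows$PriceEconomy) / rows$PriceEconomy, 2),
  SeatsTotal = rows$SeatsEconomy + rows$SeatsPremium,
  PitchDifference = rows$PitchPremium - rows$PitchEconomy,
  WidthDifference = rows$WidthPremium - rows$WidthEconomy,
  row.names = row.names(rows)
)
# Share of premium seats, in percent of all seats
derived$PercentPremiumSeats <- round(100 * rows$SeatsPremium / derived$SeatsTotal, 2)

print(derived)
```

Read this way, a `PriceRelative` of 1.89 means the premium fare is 189% above the economy fare, while 0.02 means the two fares are almost identical.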
library(corrplot)
## corrplot 0.84 loaded
corrplot.mixed(corr= cor(airlines.df[, c(7:18)], use = "complete.obs"), upper="ellipse", tl.pos = "lt")
round(cor(airlines.df$PriceRelative,airlines.df$PitchPremium),2)
## [1] 0.42
round(cor(airlines.df$PriceRelative,airlines.df$WidthPremium),2)
## [1] 0.5
cor.test(airlines.df$PriceRelative,airlines.df$PitchDifference)
##
## Pearson's product-moment correlation
##
## data: airlines.df$PriceRelative and airlines.df$PitchDifference
## t = 11.331, df = 456, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.3940262 0.5372817
## sample estimates:
## cor
## 0.4687302
cor.test(airlines.df$PriceRelative,airlines.df$WidthDifference)
##
## Pearson's product-moment correlation
##
## data: airlines.df$PriceRelative and airlines.df$WidthDifference
## t = 11.869, df = 456, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.4125388 0.5528218
## sample estimates:
## cor
## 0.4858024
cor.test(airlines.df$PriceRelative,airlines.df$PercentPremiumSeats)
##
## Pearson's product-moment correlation
##
## data: airlines.df$PriceRelative and airlines.df$PercentPremiumSeats
## t = -3.496, df = 456, p-value = 0.0005185
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.24949885 -0.07098966
## sample estimates:
## cor
## -0.1615656
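The t statistics reported by `cor.test` follow directly from the correlation and the sample size. As a minimal sketch, the hypothetical helpers `cor_t_stat` and `cor_p_value` below apply the standard formula t = r * sqrt(n - 2) / sqrt(1 - r^2) with n - 2 degrees of freedom to the three correlations printed above; the inputs are the rounded values from that output, so small rounding differences are to be expected.

```r
# t statistic for testing a Pearson correlation r against zero,
# based on n paired observations (n - 2 degrees of freedom).
cor_t_stat <- function(r, n) {
  r * sqrt(n - 2) / sqrt(1 - r^2)
}

# Two-sided p-value for the same test.
cor_p_value <- function(r, n) {
  2 * pt(-abs(cor_t_stat(r, n)), df = n - 2)
}

n_obs <- 458  # number of flights in the data set

# Correlations of PriceRelative copied from the cor.test output
t_pitch <- cor_t_stat(0.4687302, n_obs)   # with PitchDifference
t_width <- cor_t_stat(0.4858024, n_obs)   # with WidthDifference
t_pct <- cor_t_stat(-0.1615656, n_obs)    # with PercentPremiumSeats
p_pct <- cor_p_value(-0.1615656, n_obs)

print(round(c(t_pitch, t_width, t_pct), 3))
print(signif(p_pct, 4))
```

The sign of t follows the sign of r, which is why only the test against `PercentPremiumSeats` has a negative statistic.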
t.test(airlines.df$PriceRelative,airlines.df$PitchDifference)
##
## Welch Two Sample t-test
##
## data: airlines.df$PriceRelative and airlines.df$PitchDifference
## t = -72.974, df = 516.54, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -6.367495 -6.033640
## sample estimates:
## mean of x mean of y
## 0.4872052 6.6877729
t.test(airlines.df$PriceRelative,airlines.df$WidthDifference)
##
## Welch Two Sample t-test
##
## data: airlines.df$PriceRelative and airlines.df$WidthDifference
## t = -19.284, df = 585.55, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -1.262697 -1.029268
## sample estimates:
## mean of x mean of y
## 0.4872052 1.6331878
model <- lm(PriceRelative ~ SeatsPremium + PitchPremium + WidthPremium + PitchDifference + WidthDifference + PricePremium + SeatsTotal + PercentPremiumSeats, data = airlines.df)
summary(model)
##
## Call:
## lm(formula = PriceRelative ~ SeatsPremium + PitchPremium + WidthPremium +
## PitchDifference + WidthDifference + PricePremium + SeatsTotal +
## PercentPremiumSeats, data = airlines.df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.92770 -0.27230 -0.06054 0.14779 1.37985
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.903e+00 1.686e+00 2.908 0.00382 **
## SeatsPremium -4.349e-03 6.148e-03 -0.707 0.47969
## PitchPremium -2.538e-01 5.382e-02 -4.716 3.21e-06 ***
## WidthPremium 2.018e-01 4.220e-02 4.781 2.36e-06 ***
## PitchDifference 2.436e-01 4.309e-02 5.653 2.80e-08 ***
## WidthDifference -8.010e-02 4.374e-02 -1.831 0.06773 .
## PricePremium 4.225e-05 1.550e-05 2.726 0.00666 **
## SeatsTotal 5.157e-05 8.958e-04 0.058 0.95412
## PercentPremiumSeats -1.128e-02 1.265e-02 -0.892 0.37312
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3754 on 449 degrees of freedom
## Multiple R-squared: 0.3182, Adjusted R-squared: 0.306
## F-statistic: 26.19 on 8 and 449 DF, p-value: < 2.2e-16
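The intercept of the fitted model is 4.903.
Holding the other predictors fixed, the relative price of premium seats increases with each unit increase in the width of premium seats, in the pitch difference between premium and economy seats, and in the premium price; all three effects are statistically significant. The coefficient for the total number of seats on the aircraft is also positive, but it is not significant (p = 0.95).
The coefficients for the number of premium seats, the pitch of premium seats, the width difference between premium and economy seats, and the percentage of premium seats are negative, so a unit increase in any of them is associated with a lower relative price. Of these, only the pitch of premium seats is significant at the 5% level; the width difference is marginal (p = 0.068).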
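To make the "unit increase" reading concrete, the sketch below rebuilds one fitted value by hand. It assumes the rounded coefficients printed in the `summary(model)` output and the predictor values of row 379 from the sorted listing; `predict_from_coefs` is a hypothetical helper, and with the real `model` object `predict(model, newdata = ...)` would be the usual route.

```r
# Coefficients copied from the summary(model) output (rounded as printed).
coefs <- c(
  "(Intercept)" = 4.903,
  SeatsPremium = -4.349e-03,
  PitchPremium = -2.538e-01,
  WidthPremium = 2.018e-01,
  PitchDifference = 2.436e-01,
  WidthDifference = -8.010e-02,
  PricePremium = 4.225e-05,
  SeatsTotal = 5.157e-05,
  PercentPremiumSeats = -1.128e-02
)

# Predictor values for row 379 (a Jet flight with observed PriceRelative = 1.89).
row_379 <- c(
  SeatsPremium = 16, PitchPremium = 40, WidthPremium = 21,
  PitchDifference = 10, WidthDifference = 4, PricePremium = 483,
  SeatsTotal = 140, PercentPremiumSeats = 11.43
)

# Hypothetical helper: intercept plus the sum of coefficient * value,
# matching predictors to coefficients by name.
predict_from_coefs <- function(coefs, x) {
  unname(coefs["(Intercept)"] + sum(coefs[names(x)] * x))
}

fitted_379 <- predict_from_coefs(coefs, row_379)
residual_379 <- 1.89 - fitted_379

# Raise PitchDifference by one unit, keeping everything else fixed.
bumped <- row_379
bumped["PitchDifference"] <- bumped["PitchDifference"] + 1
pitch_effect <- predict_from_coefs(coefs, bumped) - fitted_379

print(round(c(fitted = fitted_379, residual = residual_379, pitch_effect = pitch_effect), 3))
```

The fitted value for this flight comes out near 0.93, well below the observed 1.89, which is consistent with a model that explains only about 32% of the variance in `PriceRelative`.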