Summary

summary(airline.df)
##       Airline      Aircraft   FlightDuration   TravelMonth
##  AirFrance: 74   AirBus:151   Min.   : 1.250   Aug:127    
##  British  :175   Boeing:307   1st Qu.: 4.260   Jul: 75    
##  Delta    : 46                Median : 7.790   Oct:127    
##  Jet      : 61                Mean   : 7.578   Sep:129    
##  Singapore: 40                3rd Qu.:10.620              
##  Virgin   : 62                Max.   :14.660              
##       IsInternational  SeatsEconomy    SeatsPremium    PitchEconomy  
##  Domestic     : 40    Min.   : 78.0   Min.   : 8.00   Min.   :30.00  
##  International:418    1st Qu.:133.0   1st Qu.:21.00   1st Qu.:31.00  
##                       Median :185.0   Median :36.00   Median :31.00  
##                       Mean   :202.3   Mean   :33.65   Mean   :31.22  
##                       3rd Qu.:243.0   3rd Qu.:40.00   3rd Qu.:32.00  
##                       Max.   :389.0   Max.   :66.00   Max.   :33.00  
##   PitchPremium    WidthEconomy    WidthPremium    PriceEconomy 
##  Min.   :34.00   Min.   :17.00   Min.   :17.00   Min.   :  65  
##  1st Qu.:38.00   1st Qu.:18.00   1st Qu.:19.00   1st Qu.: 413  
##  Median :38.00   Median :18.00   Median :19.00   Median :1242  
##  Mean   :37.91   Mean   :17.84   Mean   :19.47   Mean   :1327  
##  3rd Qu.:38.00   3rd Qu.:18.00   3rd Qu.:21.00   3rd Qu.:1909  
##  Max.   :40.00   Max.   :19.00   Max.   :21.00   Max.   :3593  
##   PricePremium    PriceRelative      SeatsTotal  PitchDifference 
##  Min.   :  86.0   Min.   :0.0200   Min.   : 98   Min.   : 2.000  
##  1st Qu.: 528.8   1st Qu.:0.1000   1st Qu.:166   1st Qu.: 6.000  
##  Median :1737.0   Median :0.3650   Median :227   Median : 7.000  
##  Mean   :1845.3   Mean   :0.4872   Mean   :236   Mean   : 6.688  
##  3rd Qu.:2989.0   3rd Qu.:0.7400   3rd Qu.:279   3rd Qu.: 7.000  
##  Max.   :7414.0   Max.   :1.8900   Max.   :441   Max.   :10.000  
##  WidthDifference PercentPremiumSeats
##  Min.   :0.000   Min.   : 4.71      
##  1st Qu.:1.000   1st Qu.:12.28      
##  Median :1.000   Median :13.21      
##  Mean   :1.633   Mean   :14.65      
##  3rd Qu.:3.000   3rd Qu.:15.36      
##  Max.   :4.000   Max.   :24.69

Creating Box Plots for price

boxplot(airline.df$PriceEconomy, horizontal = TRUE)

boxplot(airline.df$PricePremium, horizontal = TRUE)

Scatter Plots to understand how the variables are correlated pairwise

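# attach() puts the columns on the search path; the cor.test() calls further down rely on it
attach(airline.df)
pairs(~ Aircraft + Airline + PriceEconomy, data = airline.df)

pairs(~ Aircraft + Airline + PricePremium + PercentPremiumSeats, data = airline.df)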

Corrgram of airlines data

library(corrgram)
corrgram(airline.df, order=TRUE, lower.panel=panel.shade,
         upper.panel=panel.pie, text.panel=panel.txt,
         main="Corrgram of airlines data")

Variance-Covariance Matrix

round(cov(airline.df[,c(3,6:17)]),2)
##                 FlightDuration SeatsEconomy SeatsPremium PitchEconomy
## FlightDuration           12.55        52.92         7.57         0.68
## SeatsEconomy             52.92      5832.92       633.07         7.21
## SeatsPremium              7.57       633.07       175.87        -0.30
## PitchEconomy              0.68         7.21        -0.30         0.43
## PitchPremium              0.45        11.96         0.09        -0.47
## WidthEconomy              0.90        15.91         3.37         0.11
## WidthPremium              0.40         8.58        -0.04        -0.39
## PriceEconomy           1983.54      9673.79      1489.38       238.70
## PricePremium           2959.98     17413.25      3717.36       190.85
## PriceRelative             0.19         0.14        -0.58        -0.12
## SeatsTotal               60.49      6465.99       808.94         6.91
## PitchDifference          -0.23         4.75         0.38        -0.90
## WidthDifference          -0.50        -7.33        -3.41        -0.50
##                 PitchPremium WidthEconomy WidthPremium PriceEconomy
## FlightDuration          0.45         0.90         0.40      1983.54
## SeatsEconomy           11.96        15.91         8.58      9673.79
## SeatsPremium            0.09         3.37        -0.04      1489.38
## PitchEconomy           -0.47         0.11        -0.39       238.70
## PitchPremium            1.73        -0.02         1.08        65.43
## WidthEconomy           -0.02         0.31         0.05        37.46
## WidthPremium            1.08         0.05         1.20       -61.85
## PriceEconomy           65.43        37.46       -61.85    976684.06
## PricePremium          149.85       108.12        90.48   1147494.77
## PriceRelative           0.25        -0.01         0.25      -128.50
## SeatsTotal             12.05        19.28         8.54     11163.18
## PitchDifference         2.20        -0.12         1.47      -173.28
## WidthDifference         1.10        -0.26         1.15       -99.32
##                 PricePremium PriceRelative SeatsTotal PitchDifference
## FlightDuration       2959.98          0.19      60.49           -0.23
## SeatsEconomy        17413.25          0.14    6465.99            4.75
## SeatsPremium         3717.36         -0.58     808.94            0.38
## PitchEconomy          190.85         -0.12       6.91           -0.90
## PitchPremium          149.85          0.25      12.05            2.20
## WidthEconomy          108.12         -0.01      19.28           -0.12
## WidthPremium           90.48          0.25       8.54            1.47
## PriceEconomy      1147494.77       -128.50   11163.18         -173.28
## PricePremium      1659293.12         18.48   21130.62          -41.00
## PriceRelative          18.48          0.20      -0.44            0.37
## SeatsTotal          21130.62         -0.44    7274.92            5.13
## PitchDifference       -41.00          0.37       5.13            3.10
## WidthDifference       -17.64          0.26     -10.74            1.59
##                 WidthDifference
## FlightDuration            -0.50
## SeatsEconomy              -7.33
## SeatsPremium              -3.41
## PitchEconomy              -0.50
## PitchPremium               1.10
## WidthEconomy              -0.26
## WidthPremium               1.15
## PriceEconomy             -99.32
## PricePremium             -17.64
## PriceRelative              0.26
## SeatsTotal               -10.74
## PitchDifference            1.59
## WidthDifference            1.41
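
Covariances depend on the units of each variable, so the entries above are hard to compare directly (the price columns dominate simply because prices are large numbers). As a rough sketch, the block below rescales a 3 x 3 sub-matrix to correlations with `cov2cor()`. The numbers are copied from the rounded output above, so the results are only approximate; the names `vars`, `S` and `R` are introduced here for illustration.

```r
# Sub-matrix of the covariance output above (values already rounded to 2 digits)
vars <- c("PriceRelative", "PitchDifference", "WidthDifference")
S <- matrix(c(0.20, 0.37, 0.26,
              0.37, 3.10, 1.59,
              0.26, 1.59, 1.41),
            nrow = 3, byrow = TRUE, dimnames = list(vars, vars))

# cov2cor() divides each covariance by the product of the two standard deviations
R <- cov2cor(S)
round(R, 2)
```

On the full data set the same result, without the rounding error, would come from calling `cor()` on the same columns that were passed to `cov()`.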

Hypothesis

The prices of premium and economy seats increase with flight duration, width difference and pitch difference, and also depend on whether the flight is international or domestic.

The relative price of premium over economy class is positively correlated with the pitch difference and the width difference.

Correlation Tests

cor.test(PriceRelative,FlightDuration)
## 
##  Pearson's product-moment correlation
## 
## data:  PriceRelative and FlightDuration
## t = 2.6046, df = 456, p-value = 0.009498
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.02977856 0.21036806
## sample estimates:
##      cor 
## 0.121075
cor.test(PriceRelative,WidthDifference)
## 
##  Pearson's product-moment correlation
## 
## data:  PriceRelative and WidthDifference
## t = 11.869, df = 456, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.4125388 0.5528218
## sample estimates:
##       cor 
## 0.4858024
cor.test(PriceRelative,PitchDifference)
## 
##  Pearson's product-moment correlation
## 
## data:  PriceRelative and PitchDifference
## t = 11.331, df = 456, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.3940262 0.5372817
## sample estimates:
##       cor 
## 0.4687302

All three p-values are below 0.05, so in each case we reject the null hypothesis of zero correlation: the relative price of Premium over Economy is positively correlated with the pitch difference (r = 0.47), the width difference (r = 0.49) and, more weakly, the flight duration (r = 0.12).
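
The t values reported by `cor.test()` follow from the sample correlation r and the degrees of freedom through t = r * sqrt(df) / sqrt(1 - r^2). The sketch below recomputes them from the printed correlations; the variable names (`r`, `df`, `t_stat`, `p_val`) are chosen here for illustration, and the inputs are the rounded values from the output above.

```r
# Sample correlations of PriceRelative with each variable, copied from the output above
r <- c(FlightDuration  = 0.121075,
       WidthDifference = 0.4858024,
       PitchDifference = 0.4687302)
df <- 456  # n - 2, with n = 458 observations

# t statistic of the Pearson correlation test
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution
p_val <- 2 * pt(-abs(t_stat), df)

round(t_stat, 3)
signif(p_val, 3)
```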

Formulating a Regression Model using lm()

Dependent variable - PriceRelative

Independent variables - PitchDifference, WidthDifference, PercentPremiumSeats and FlightDuration

fit <- lm(formula = airline.df$PriceRelative ~ PitchDifference + WidthDifference + FlightDuration + PercentPremiumSeats, data = airline.df)
summary(fit)
## 
## Call:
## lm(formula = airline.df$PriceRelative ~ PitchDifference + WidthDifference + 
##     FlightDuration + PercentPremiumSeats, data = airline.df)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.79439 -0.29424 -0.03427  0.16197  1.13688 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         -0.179033   0.101492  -1.764  0.07840 .  
## PitchDifference      0.059311   0.015921   3.725  0.00022 ***
## WidthDifference      0.118140   0.024555   4.811 2.05e-06 ***
## FlightDuration       0.021707   0.005085   4.269 2.39e-05 ***
## PercentPremiumSeats -0.005999   0.003898  -1.539  0.12454    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.381 on 453 degrees of freedom
## Multiple R-squared:  0.2913, Adjusted R-squared:  0.285 
## F-statistic: 46.54 on 4 and 453 DF,  p-value: < 2.2e-16

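Since the p-value of the overall F-test is below 2.2e-16, it is safe to reject the null hypothesis that all slope coefficients are zero. PitchDifference, WidthDifference and FlightDuration are individually significant at the 0.001 level, while PercentPremiumSeats is not (p = 0.12). The model explains about 29% of the variance in PriceRelative.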
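
The F-statistic and the adjusted R-squared in the summary output can both be recovered from the multiple R-squared, the number of predictors and the residual degrees of freedom. The following is a small sketch using the rounded figures printed above, so the F value agrees with the output only to about two decimals; the variable names are illustrative.

```r
r2       <- 0.2913  # Multiple R-squared from the summary output
k        <- 4       # number of predictors
df_resid <- 453     # residual degrees of freedom
n        <- df_resid + k + 1  # number of observations

# Overall F-statistic: explained variance per predictor over residual variance per df
f_stat <- (r2 / k) / ((1 - r2) / df_resid)

# Adjusted R-squared penalises the number of predictors
adj_r2 <- 1 - (1 - r2) * (n - 1) / df_resid

# Upper-tail p-value of the F-test
p_value <- pf(f_stat, k, df_resid, lower.tail = FALSE)

round(c(F = f_stat, adj.R2 = adj_r2), 3)
```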

The model can be written as: PriceRelative = -0.179033 + 0.059311 PitchDifference + 0.118140 WidthDifference + 0.021707 FlightDuration - 0.005999 PercentPremiumSeats
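
To show how the fitted equation is used, the sketch below evaluates it by hand for one hypothetical flight. The coefficients, including the intercept, are taken from the summary output above; the input values are the medians from the Summary section and were picked here purely for illustration. The names `coefs`, `x` and `pred` are not part of the original analysis.

```r
# Coefficient estimates from summary(fit)
coefs <- c("(Intercept)"       = -0.179033,
           PitchDifference     =  0.059311,
           WidthDifference     =  0.118140,
           FlightDuration      =  0.021707,
           PercentPremiumSeats = -0.005999)

# Illustrative inputs (medians from the Summary section); the leading 1 multiplies the intercept
x <- c("(Intercept)"       = 1,
       PitchDifference     = 7,
       WidthDifference     = 1,
       FlightDuration      = 7.79,
       PercentPremiumSeats = 13.21)

# Linear predictor: intercept plus the sum of coefficient * value
pred <- sum(coefs * x[names(coefs)])
round(pred, 3)
```

With the fitted object available, `predict(fit, newdata = ...)` on a one-row data frame holding the same four predictor values should give the same number up to rounding of the coefficients.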

Inference

The relative price difference between premium and economy tickets depends on the differences in seat width and seat pitch between the two classes, as well as on the flight duration.

The relative price increases with the width difference, the pitch difference and the flight duration.