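Summary statistics (variables marked with * are categorical; describe() converts them to numeric codes, so their means and standard deviations are not meaningful):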

describe(airlines)
##                     vars   n    mean      sd  median trimmed     mad   min
## Airline*               1 458    3.01    1.65    2.00    2.89    1.48  1.00
## Aircraft*              2 458    1.67    0.47    2.00    1.71    0.00  1.00
## FlightDuration         3 458    7.58    3.54    7.79    7.57    4.81  1.25
## TravelMonth*           4 458    2.56    1.17    3.00    2.58    1.48  1.00
## IsInternational*       5 458    1.91    0.28    2.00    2.00    0.00  1.00
## SeatsEconomy           6 458  202.31   76.37  185.00  194.64   85.99 78.00
## SeatsPremium           7 458   33.65   13.26   36.00   33.35   11.86  8.00
## PitchEconomy           8 458   31.22    0.66   31.00   31.26    0.00 30.00
## PitchPremium           9 458   37.91    1.31   38.00   38.05    0.00 34.00
## WidthEconomy          10 458   17.84    0.56   18.00   17.81    0.00 17.00
## WidthPremium          11 458   19.47    1.10   19.00   19.53    0.00 17.00
## PriceEconomy          12 458 1327.08  988.27 1242.00 1244.40 1159.39 65.00
## PricePremium          13 458 1845.26 1288.14 1737.00 1799.05 1845.84 86.00
## PriceRelative         14 458    0.49    0.45    0.36    0.42    0.41  0.02
## SeatsTotal            15 458  235.96   85.29  227.00  228.73   90.44 98.00
## PitchDifference       16 458    6.69    1.76    7.00    6.76    0.00  2.00
## WidthDifference       17 458    1.63    1.19    1.00    1.53    0.00  0.00
## PercentPremiumSeats   18 458   14.65    4.84   13.21   14.31    2.68  4.71
##                         max   range  skew kurtosis    se
## Airline*               6.00    5.00  0.61    -0.95  0.08
## Aircraft*              2.00    1.00 -0.72    -1.48  0.02
## FlightDuration        14.66   13.41 -0.07    -1.12  0.17
## TravelMonth*           4.00    3.00 -0.14    -1.46  0.05
## IsInternational*       2.00    1.00 -2.91     6.50  0.01
## SeatsEconomy         389.00  311.00  0.72    -0.36  3.57
## SeatsPremium          66.00   58.00  0.23    -0.46  0.62
## PitchEconomy          33.00    3.00 -0.03    -0.35  0.03
## PitchPremium          40.00    6.00 -1.51     3.52  0.06
## WidthEconomy          19.00    2.00 -0.04    -0.08  0.03
## WidthPremium          21.00    4.00 -0.08    -0.31  0.05
## PriceEconomy        3593.00 3528.00  0.51    -0.88 46.18
## PricePremium        7414.00 7328.00  0.50     0.43 60.19
## PriceRelative          1.89    1.87  1.17     0.72  0.02
## SeatsTotal           441.00  343.00  0.70    -0.53  3.99
## PitchDifference       10.00    8.00 -0.54     1.78  0.08
## WidthDifference        4.00    4.00  0.84    -0.53  0.06
## PercentPremiumSeats   24.69   19.98  0.71     0.28  0.23
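The `se` column above is the standard error of the mean, sd / sqrt(n). As a quick check, here is a minimal sketch in base R that plugs in the PriceEconomy values from the table; the variable names `n`, `sd_price` and `se_price` are illustrative and not part of the original analysis.

```r
# Values copied from the PriceEconomy row of the describe() output
n        <- 458
sd_price <- 988.27

# Standard error of the mean: sd / sqrt(n)
se_price <- sd_price / sqrt(n)
round(se_price, 2)  # 46.18, matching the se column
```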

Some plots to visualize the distribution of each variable independently:

par(mfrow=c(1,2))
boxplot(airlines$PriceEconomy)
boxplot(airlines$PricePremium)

Scatter plots to see how pairs of variables are related:

par(mfrow=c(1,2))
plot(airlines$FlightDuration,airlines$PriceRelative)
plot(airlines$WidthDifference,airlines$PercentPremiumSeats)

Corrgram:

corrgram(airlines, order=TRUE, lower.panel=panel.shade,
         upper.panel=panel.pie, text.panel=panel.txt,
         diag.panel=panel.minmax,
         main="Corrgram of Premium vs Economy")
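A corrgram draws the pairwise correlations between variables. The underlying numbers can be inspected directly with `cor()`. The sketch below uses a small toy data frame with made-up values (not the airlines data); only the column names are borrowed from it.

```r
# Toy data with hypothetical values; column names mirror the airlines data
toy <- data.frame(
  FlightDuration = c(2.5, 4.0, 6.5, 9.0, 12.0),
  PriceEconomy   = c(150, 320, 610, 900, 1400),
  PricePremium   = c(210, 450, 800, 1300, 2100)
)

# Pairwise Pearson correlations, rounded for readability
cor_mat <- round(cor(toy), 2)
print(cor_mat)
```

On the real data, `cor()` would need to be restricted to the numeric columns first.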

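Variance-covariance matrix (var() and cov() return the same matrix when applied to a data frame, so the two outputs below are identical):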

# keep the numeric columns only: FlightDuration (3) and columns 6 to 18
data <- airlines[, c(3, 6:18)]
"variance"
## [1] "variance"
var(data)
##                     FlightDuration  SeatsEconomy  SeatsPremium
## FlightDuration          12.5462183    52.9194291    7.57372426
## SeatsEconomy            52.9194291  5832.9154300  633.07060954
## SeatsPremium             7.5737243   633.0706095  175.86521648
## PitchEconomy             0.6817421     7.2117665   -0.29725856
## PitchPremium             0.4477835    11.9637325    0.08508595
## WidthEconomy             0.9014224    15.9105138    3.36977440
## WidthPremium             0.4019845     8.5832800   -0.03954019
## PriceEconomy          1983.5401655  9673.7944684 1489.38359627
## PricePremium          2959.9783043 17413.2541733 3717.36428960
## PriceRelative            0.1932368     0.1361699   -0.58078765
## SeatsTotal              60.4931534  6465.9860396  808.93582602
## PitchDifference         -0.2339587     4.7519660    0.38234451
## WidthDifference         -0.4994380    -7.3272338   -3.40931459
## PercentPremiumSeats      1.0379912  -122.3914537   31.14753127
##                     PitchEconomy PitchPremium WidthEconomy WidthPremium
## FlightDuration         0.6817421   0.44778348   0.90142242   0.40198446
## SeatsEconomy           7.2117665  11.96373253  15.91051379   8.58327998
## SeatsPremium          -0.2972586   0.08508595   3.36977440  -0.03954019
## PitchEconomy           0.4292471  -0.47398546   0.10756500  -0.38766208
## PitchPremium          -0.4739855   1.72639580  -0.01739081   1.08157435
## WidthEconomy           0.1075650  -0.01739081   0.31081765   0.05010845
## WidthPremium          -0.3876621   1.08157435   0.05010845   1.20378776
## PriceEconomy         238.7031905  65.42513354  37.46095191 -61.85450011
## PricePremium         190.8517195 149.85356368 108.11611707  90.47997668
## PriceRelative         -0.1248808   0.24719874  -0.01104335   0.24928593
## SeatsTotal             6.9145079  12.04881848  19.28028819   8.54373979
## PitchDifference       -0.9032326   2.20038126  -0.12495581   1.46923643
## WidthDifference       -0.4952271   1.09896515  -0.26070920   1.15367930
## PercentPremiumSeats   -0.3261739  -1.11655834   0.61321816  -0.97393787
##                      PriceEconomy  PricePremium PriceRelative
## FlightDuration         1983.54017    2959.97830    0.19323683
## SeatsEconomy           9673.79447   17413.25417    0.13616991
## SeatsPremium           1489.38360    3717.36429   -0.58078765
## PitchEconomy            238.70319     190.85172   -0.12488080
## PitchPremium             65.42513     149.85356    0.24719874
## WidthEconomy             37.46095     108.11612   -0.01104335
## WidthPremium            -61.85450      90.47998    0.24928593
## PriceEconomy         976684.06198 1147494.76801 -128.49991725
## PricePremium        1147494.76801 1659293.11947   18.48428836
## PriceRelative          -128.49992      18.48429    0.20302893
## SeatsTotal            11163.17806   21130.61846   -0.44461774
## PitchDifference        -173.27806     -40.99816    0.37207954
## WidthDifference         -99.31545     -17.63614    0.26032928
## PercentPremiumSeats     312.61077     726.01582   -0.35252750
##                        SeatsTotal PitchDifference WidthDifference
## FlightDuration         60.4931534      -0.2339587      -0.4994380
## SeatsEconomy         6465.9860396       4.7519660      -7.3272338
## SeatsPremium          808.9358260       0.3823445      -3.4093146
## PitchEconomy            6.9145079      -0.9032326      -0.4952271
## PitchPremium           12.0488185       2.2003813       1.0989652
## WidthEconomy           19.2802882      -0.1249558      -0.2607092
## WidthPremium            8.5437398       1.4692364       1.1536793
## PriceEconomy        11163.1780647    -173.2780570     -99.3154520
## PricePremium        21130.6184629     -40.9981558     -17.6361404
## PriceRelative          -0.4446177       0.3720795       0.2603293
## SeatsTotal           7274.9218656       5.1343105     -10.7365484
## PitchDifference         5.1343105       3.1036138       1.5941922
## WidthDifference       -10.7365484       1.5941922       1.4143885
## PercentPremiumSeats   -91.2439224      -0.7903844      -1.5871560
##                     PercentPremiumSeats
## FlightDuration                1.0379912
## SeatsEconomy               -122.3914537
## SeatsPremium                 31.1475313
## PitchEconomy                 -0.3261739
## PitchPremium                 -1.1165583
## WidthEconomy                  0.6132182
## WidthPremium                 -0.9739379
## PriceEconomy                312.6107669
## PricePremium                726.0158229
## PriceRelative                -0.3525275
## SeatsTotal                  -91.2439224
## PitchDifference              -0.7903844
## WidthDifference              -1.5871560
## PercentPremiumSeats          23.4493343
"covariance"
## [1] "covariance"
cov(data)
##                     FlightDuration  SeatsEconomy  SeatsPremium
## FlightDuration          12.5462183    52.9194291    7.57372426
## SeatsEconomy            52.9194291  5832.9154300  633.07060954
## SeatsPremium             7.5737243   633.0706095  175.86521648
## PitchEconomy             0.6817421     7.2117665   -0.29725856
## PitchPremium             0.4477835    11.9637325    0.08508595
## WidthEconomy             0.9014224    15.9105138    3.36977440
## WidthPremium             0.4019845     8.5832800   -0.03954019
## PriceEconomy          1983.5401655  9673.7944684 1489.38359627
## PricePremium          2959.9783043 17413.2541733 3717.36428960
## PriceRelative            0.1932368     0.1361699   -0.58078765
## SeatsTotal              60.4931534  6465.9860396  808.93582602
## PitchDifference         -0.2339587     4.7519660    0.38234451
## WidthDifference         -0.4994380    -7.3272338   -3.40931459
## PercentPremiumSeats      1.0379912  -122.3914537   31.14753127
##                     PitchEconomy PitchPremium WidthEconomy WidthPremium
## FlightDuration         0.6817421   0.44778348   0.90142242   0.40198446
## SeatsEconomy           7.2117665  11.96373253  15.91051379   8.58327998
## SeatsPremium          -0.2972586   0.08508595   3.36977440  -0.03954019
## PitchEconomy           0.4292471  -0.47398546   0.10756500  -0.38766208
## PitchPremium          -0.4739855   1.72639580  -0.01739081   1.08157435
## WidthEconomy           0.1075650  -0.01739081   0.31081765   0.05010845
## WidthPremium          -0.3876621   1.08157435   0.05010845   1.20378776
## PriceEconomy         238.7031905  65.42513354  37.46095191 -61.85450011
## PricePremium         190.8517195 149.85356368 108.11611707  90.47997668
## PriceRelative         -0.1248808   0.24719874  -0.01104335   0.24928593
## SeatsTotal             6.9145079  12.04881848  19.28028819   8.54373979
## PitchDifference       -0.9032326   2.20038126  -0.12495581   1.46923643
## WidthDifference       -0.4952271   1.09896515  -0.26070920   1.15367930
## PercentPremiumSeats   -0.3261739  -1.11655834   0.61321816  -0.97393787
##                      PriceEconomy  PricePremium PriceRelative
## FlightDuration         1983.54017    2959.97830    0.19323683
## SeatsEconomy           9673.79447   17413.25417    0.13616991
## SeatsPremium           1489.38360    3717.36429   -0.58078765
## PitchEconomy            238.70319     190.85172   -0.12488080
## PitchPremium             65.42513     149.85356    0.24719874
## WidthEconomy             37.46095     108.11612   -0.01104335
## WidthPremium            -61.85450      90.47998    0.24928593
## PriceEconomy         976684.06198 1147494.76801 -128.49991725
## PricePremium        1147494.76801 1659293.11947   18.48428836
## PriceRelative          -128.49992      18.48429    0.20302893
## SeatsTotal            11163.17806   21130.61846   -0.44461774
## PitchDifference        -173.27806     -40.99816    0.37207954
## WidthDifference         -99.31545     -17.63614    0.26032928
## PercentPremiumSeats     312.61077     726.01582   -0.35252750
##                        SeatsTotal PitchDifference WidthDifference
## FlightDuration         60.4931534      -0.2339587      -0.4994380
## SeatsEconomy         6465.9860396       4.7519660      -7.3272338
## SeatsPremium          808.9358260       0.3823445      -3.4093146
## PitchEconomy            6.9145079      -0.9032326      -0.4952271
## PitchPremium           12.0488185       2.2003813       1.0989652
## WidthEconomy           19.2802882      -0.1249558      -0.2607092
## WidthPremium            8.5437398       1.4692364       1.1536793
## PriceEconomy        11163.1780647    -173.2780570     -99.3154520
## PricePremium        21130.6184629     -40.9981558     -17.6361404
## PriceRelative          -0.4446177       0.3720795       0.2603293
## SeatsTotal           7274.9218656       5.1343105     -10.7365484
## PitchDifference         5.1343105       3.1036138       1.5941922
## WidthDifference       -10.7365484       1.5941922       1.4143885
## PercentPremiumSeats   -91.2439224      -0.7903844      -1.5871560
##                     PercentPremiumSeats
## FlightDuration                1.0379912
## SeatsEconomy               -122.3914537
## SeatsPremium                 31.1475313
## PitchEconomy                 -0.3261739
## PitchPremium                 -1.1165583
## WidthEconomy                  0.6132182
## WidthPremium                 -0.9739379
## PriceEconomy                312.6107669
## PricePremium                726.0158229
## PriceRelative                -0.3525275
## SeatsTotal                  -91.2439224
## PitchDifference              -0.7903844
## WidthDifference              -1.5871560
## PercentPremiumSeats          23.4493343
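The two outputs above are identical because, for a data frame, `var()` returns the full variance-covariance matrix just as `cov()` does. Covariances also depend on the units of each variable, so they are hard to compare; `cov2cor()` rescales them to correlations. A minimal sketch on toy data with made-up values (the data frame `toy` is an assumption for illustration only):

```r
# Toy data with hypothetical values; column names mirror the airlines data
toy <- data.frame(
  FlightDuration = c(2.5, 4.0, 6.5, 9.0, 12.0),
  SeatsPremium   = c(20, 24, 36, 40, 48),
  PricePremium   = c(210, 450, 800, 1300, 2100)
)

# var() and cov() give the same matrix for a data frame
same_matrix <- isTRUE(all.equal(var(toy), cov(toy)))
same_matrix

# Rescale the covariance matrix to a unit-free correlation matrix
cor_from_cov <- cov2cor(cov(toy))
matches_cor  <- isTRUE(all.equal(cor_from_cov, cor(toy)))
matches_cor
```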

Consider two hypotheses:

1) Premium price is independent of flight duration.

2) Relative price of tickets is independent of flight duration.

T tests: Hypothesis 1:

t.test(airlines$PricePremium,airlines$FlightDuration)
## 
##  Welch Two Sample t-test
## 
## data:  airlines$PricePremium and airlines$FlightDuration
## t = 30.531, df = 457.01, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  1719.395 1955.965
## sample estimates:
##   mean of x   mean of y 
## 1845.257642    7.577838

Hypothesis 2:

t.test(airlines$PriceRelative,airlines$FlightDuration)
## 
##  Welch Two Sample t-test
## 
## data:  airlines$PriceRelative and airlines$FlightDuration
## t = -42.499, df = 471.79, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -7.418482 -6.762785
## sample estimates:
## mean of x mean of y 
## 0.4872052 7.5778384

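Both tests give p-values < 0.05, so the null hypothesis of equal means is rejected in each case. Note that a two-sample t-test only compares the means of the two variables, which here are measured in different units; it does not by itself test whether one variable depends on the other. The regression models that follow address that question directly.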
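One alternative, not used in the original analysis, is to test the association between two numeric variables with `cor.test()`, whose null hypothesis is that the true correlation is zero. The sketch below runs it on toy vectors with made-up values; `duration` and `price` are hypothetical stand-ins for FlightDuration and a price column.

```r
# Toy vectors with hypothetical values (not the airlines data)
duration <- c(1.5, 2.0, 3.5, 5.0, 6.5, 8.0, 9.5, 11.0, 12.5, 14.0)
price    <- c(120, 180, 260, 420, 510, 700, 760, 980, 1050, 1300)

# H0: the true Pearson correlation between the two variables is zero
ct <- cor.test(duration, price)
ct$estimate  # sample correlation
ct$p.value   # a small value is evidence against independence
```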

Regression models Hypothesis 1:

summary(lm(PricePremium ~ FlightDuration, data = airlines))
## 
## Call:
## lm(formula = PricePremium ~ FlightDuration, data = airlines)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2292.5  -664.7  -103.8   803.0  4093.7 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       57.45     108.39    0.53    0.596    
## FlightDuration   235.93      12.96   18.20   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 981.4 on 456 degrees of freedom
## Multiple R-squared:  0.4209, Adjusted R-squared:  0.4196 
## F-statistic: 331.4 on 1 and 456 DF,  p-value: < 2.2e-16

Inferences: 1) The regression coefficient of 235.93 is significantly greater than 0 (p-value << 0.001): the expected premium price increases by 235.93 price units for every additional hour of flight duration.

2) The multiple R-squared indicates that the model accounts for 42.09% of the variance in premium prices.

3) The residual standard error (981.4) is the typical size of the error made when predicting premium prices from flight duration.

Hypothesis 2:

summary(lm(PriceRelative~FlightDuration,data=airlines))
## 
## Call:
## lm(formula = PriceRelative ~ FlightDuration, data = airlines)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.5507 -0.3373 -0.1167  0.2363  1.4694 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    0.370491   0.049454   7.492 3.56e-13 ***
## FlightDuration 0.015402   0.005913   2.605   0.0095 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4478 on 456 degrees of freedom
## Multiple R-squared:  0.01466,    Adjusted R-squared:  0.0125 
## F-statistic: 6.784 on 1 and 456 DF,  p-value: 0.009498

Inferences: 1) The regression coefficient of 0.015402 is significantly greater than 0 (p-value = 0.0095): the expected relative price increases by 0.015402 for every additional hour of flight duration.

2) The multiple R-squared indicates that the model accounts for only 1.47% of the variance in relative price.

3) The residual standard error (0.4478) is the typical size of the error made when predicting relative price from flight duration.

Regression models:

1) PricePremium = 235.93 * FlightDuration + 57.45

2) PriceRelative = 0.015402 * FlightDuration + 0.370491
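To make the fitted equations concrete, the sketch below evaluates both for a 10-hour flight. The coefficients are copied from the two regression summaries; the helper functions `predict_premium` and `predict_relative` are hypothetical names introduced only for this illustration.

```r
# Coefficients copied from the two summary(lm(...)) outputs
predict_premium  <- function(hours) 57.45 + 235.93 * hours
predict_relative <- function(hours) 0.370491 + 0.015402 * hours

predict_premium(10)   # 57.45 + 2359.30 = 2416.75
predict_relative(10)  # 0.370491 + 0.154020 = 0.524511
```

With the fitted model objects at hand, `predict()` with a `newdata` data frame would give the same point predictions.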

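The low p-values indicate that the slope on FlightDuration is significantly different from zero in both models. They do not measure goodness of fit: the first model explains about 42% of the variance in its response, the second only about 1.5%.

Thus the regressions reject both null hypotheses in favor of the alternatives: premium price and relative price both vary with flight duration.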

Finding which factors affect the relative price difference between an economy ticket and a premium-economy ticket:

summary(lm(PriceRelative~SeatsEconomy+FlightDuration+SeatsPremium+PitchDifference+WidthDifference+PercentPremiumSeats,data=airlines))
## 
## Call:
## lm(formula = PriceRelative ~ SeatsEconomy + FlightDuration + 
##     SeatsPremium + PitchDifference + WidthDifference + PercentPremiumSeats, 
##     data = airlines)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.81222 -0.28724 -0.03929  0.15184  1.13902 
## 
## Coefficients:
##                       Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         -0.1070373  0.2013753  -0.532 0.595312    
## SeatsEconomy        -0.0003348  0.0008855  -0.378 0.705549    
## FlightDuration       0.0228284  0.0053223   4.289 2.20e-05 ***
## SeatsPremium         0.0004936  0.0053482   0.092 0.926512    
## PitchDifference      0.0629304  0.0163747   3.843 0.000139 ***
## WidthDifference      0.1107340  0.0259805   4.262 2.47e-05 ***
## PercentPremiumSeats -0.0088310  0.0126109  -0.700 0.484119    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3815 on 451 degrees of freedom
## Multiple R-squared:  0.2927, Adjusted R-squared:  0.2833 
## F-statistic: 31.11 on 6 and 451 DF,  p-value: < 2.2e-16

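The analysis shows that the relative price between premium-economy and economy tickets depends primarily on FlightDuration, PitchDifference, and WidthDifference, the only predictors significant at the 0.001 level. SeatsEconomy, SeatsPremium, and PercentPremiumSeats are not significant, and the model explains about 29% of the variance in PriceRelative.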
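The significant predictors can also be read off programmatically from the coefficient table instead of by eye. The sketch below does this on simulated toy data (not the airlines data), where `y` is built to depend on `x1` only; all names here are assumptions for illustration.

```r
set.seed(1)  # reproducible toy data, not the airlines data
n  <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 2 * x1 + rnorm(n)  # y depends on x1 only, by construction

fit   <- lm(y ~ x1 + x2)
coefs <- coef(summary(fit))  # columns: Estimate, Std. Error, t value, Pr(>|t|)

# Names of the terms whose p-value is below 0.05
signif_terms <- rownames(coefs)[coefs[, "Pr(>|t|)"] < 0.05]
signif_terms
```

Applied to the model above, the same indexing would return FlightDuration, PitchDifference, and WidthDifference.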