The data set used in this report compares economy class and premium economy class air-ticket prices across six airlines. It also records several related factors from the airline industry, such as aircraft type, flight duration, travel month, and seat pitch and width.
The aim of this analysis report is to find out which factors contribute to the difference in price between premium economy class and economy class air tickets.
setwd("C:/Users/Shreyas Jadhav/Downloads")
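# stringsAsFactors = TRUE is needed on R >= 4.0 so that text columns are read as factors
airlines <- read.csv("SixAirlinesDataV2.csv", stringsAsFactors = TRUE)
#View(airlines)
# Convert the factor columns to their integer level codes
airlines$TravelMonth <- as.numeric(airlines$TravelMonth)
airlines$IsInternational <- as.numeric(airlines$IsInternational)
airlines$Aircraft <- as.numeric(airlines$Aircraft)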
str(airlines)
## 'data.frame': 458 obs. of 18 variables:
## $ Airline : Factor w/ 6 levels "AirFrance","British",..: 2 2 2 2 2 2 2 2 2 2 ...
## $ Aircraft : num 2 2 2 2 2 2 2 2 2 2 ...
## $ FlightDuration : num 12.25 12.25 12.25 12.25 8.16 ...
## $ TravelMonth : num 2 1 4 3 1 4 3 1 4 4 ...
## $ IsInternational : num 2 2 2 2 2 2 2 2 2 2 ...
## $ SeatsEconomy : int 122 122 122 122 122 122 122 122 122 122 ...
## $ SeatsPremium : int 40 40 40 40 40 40 40 40 40 40 ...
## $ PitchEconomy : int 31 31 31 31 31 31 31 31 31 31 ...
## $ PitchPremium : int 38 38 38 38 38 38 38 38 38 38 ...
## $ WidthEconomy : int 18 18 18 18 18 18 18 18 18 18 ...
## $ WidthPremium : int 19 19 19 19 19 19 19 19 19 19 ...
## $ PriceEconomy : int 2707 2707 2707 2707 1793 1793 1793 1476 1476 1705 ...
## $ PricePremium : int 3725 3725 3725 3725 2999 2999 2999 2997 2997 2989 ...
## $ PriceRelative : num 0.38 0.38 0.38 0.38 0.67 0.67 0.67 1.03 1.03 0.75 ...
## $ SeatsTotal : int 162 162 162 162 162 162 162 162 162 162 ...
## $ PitchDifference : int 7 7 7 7 7 7 7 7 7 7 ...
## $ WidthDifference : int 1 1 1 1 1 1 1 1 1 1 ...
## $ PercentPremiumSeats: num 24.7 24.7 24.7 24.7 24.7 ...
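The derived columns in the listing above can be reproduced from the raw ones. The sketch below uses values copied from the str() output (rows 1, 5, 8 and 10) and assumes that PriceRelative is the premium surcharge expressed as a fraction of the economy fare. That definition is inferred from the printed numbers, not taken from any documentation of the data set. The lower-case variable names are illustrative only.

```r
# Values copied from the str() listing (rows 1, 5, 8 and 10).
price_economy <- c(2707, 1793, 1476, 1705)
price_premium <- c(3725, 2999, 2997, 2989)

# Assumed definition: surcharge as a fraction of the economy fare.
price_relative <- round((price_premium - price_economy) / price_economy, 2)
print(price_relative)

# The seat-based columns follow the same pattern (row 1 values).
seats_total      <- 122 + 40                          # SeatsEconomy + SeatsPremium
pct_premium      <- round(100 * 40 / seats_total, 1)  # PercentPremiumSeats
pitch_difference <- 38 - 31                           # PitchPremium - PitchEconomy
width_difference <- 19 - 18                           # WidthPremium - WidthEconomy
```

Under this assumption the computed relative prices agree with the 0.38, 0.67, 1.03 and 0.75 shown in the listing, and the seat-based values agree with SeatsTotal, PercentPremiumSeats, PitchDifference and WidthDifference in row 1.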
summary(airlines)
## Airline Aircraft FlightDuration TravelMonth
## AirFrance: 74 Min. :1.00 Min. : 1.250 Min. :1.000
## British :175 1st Qu.:1.00 1st Qu.: 4.260 1st Qu.:1.000
## Delta : 46 Median :2.00 Median : 7.790 Median :3.000
## Jet : 61 Mean :1.67 Mean : 7.578 Mean :2.563
## Singapore: 40 3rd Qu.:2.00 3rd Qu.:10.620 3rd Qu.:4.000
## Virgin : 62 Max. :2.00 Max. :14.660 Max. :4.000
## IsInternational SeatsEconomy SeatsPremium PitchEconomy
## Min. :1.000 Min. : 78.0 Min. : 8.00 Min. :30.00
## 1st Qu.:2.000 1st Qu.:133.0 1st Qu.:21.00 1st Qu.:31.00
## Median :2.000 Median :185.0 Median :36.00 Median :31.00
## Mean :1.913 Mean :202.3 Mean :33.65 Mean :31.22
## 3rd Qu.:2.000 3rd Qu.:243.0 3rd Qu.:40.00 3rd Qu.:32.00
## Max. :2.000 Max. :389.0 Max. :66.00 Max. :33.00
## PitchPremium WidthEconomy WidthPremium PriceEconomy
## Min. :34.00 Min. :17.00 Min. :17.00 Min. : 65
## 1st Qu.:38.00 1st Qu.:18.00 1st Qu.:19.00 1st Qu.: 413
## Median :38.00 Median :18.00 Median :19.00 Median :1242
## Mean :37.91 Mean :17.84 Mean :19.47 Mean :1327
## 3rd Qu.:38.00 3rd Qu.:18.00 3rd Qu.:21.00 3rd Qu.:1909
## Max. :40.00 Max. :19.00 Max. :21.00 Max. :3593
## PricePremium PriceRelative SeatsTotal PitchDifference
## Min. : 86.0 Min. :0.0200 Min. : 98 Min. : 2.000
## 1st Qu.: 528.8 1st Qu.:0.1000 1st Qu.:166 1st Qu.: 6.000
## Median :1737.0 Median :0.3650 Median :227 Median : 7.000
## Mean :1845.3 Mean :0.4872 Mean :236 Mean : 6.688
## 3rd Qu.:2989.0 3rd Qu.:0.7400 3rd Qu.:279 3rd Qu.: 7.000
## Max. :7414.0 Max. :1.8900 Max. :441 Max. :10.000
## WidthDifference PercentPremiumSeats
## Min. :0.000 Min. : 4.71
## 1st Qu.:1.000 1st Qu.:12.28
## Median :1.000 Median :13.21
## Mean :1.633 Mean :14.65
## 3rd Qu.:3.000 3rd Qu.:15.36
## Max. :4.000 Max. :24.69
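TravelMonth, IsInternational and Aircraft appear as numbers in the summary above because as.numeric() was applied to what appear to be factor columns, which returns the integer level codes. By default R sorts factor levels alphabetically, so the codes of a month column need not follow the calendar. The sketch below illustrates this with hypothetical month labels; the actual level names in the CSV file are not shown in this report.

```r
# Hypothetical month labels; the real level names are an assumption.
month <- factor(c("Jul", "Aug", "Sep", "Oct", "Aug"))

# Default levels are sorted alphabetically, so the codes are not chronological.
levels(month)       # "Aug" "Jul" "Oct" "Sep"
as.numeric(month)   # 2 1 4 3 1

# Declaring the level order explicitly makes the codes follow the calendar.
month_ordered <- factor(month, levels = c("Jul", "Aug", "Sep", "Oct"))
as.numeric(month_ordered)   # 1 2 3 4 2
```

If the codes are later used in a correlation or regression, the level order decides what "a higher month value" means, so it may be worth fixing the order before converting.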
library(psych)
describe(airlines)
## vars n mean sd median trimmed mad min
## Airline* 1 458 3.01 1.65 2.00 2.89 1.48 1.00
## Aircraft 2 458 1.67 0.47 2.00 1.71 0.00 1.00
## FlightDuration 3 458 7.58 3.54 7.79 7.57 4.81 1.25
## TravelMonth 4 458 2.56 1.17 3.00 2.58 1.48 1.00
## IsInternational 5 458 1.91 0.28 2.00 2.00 0.00 1.00
## SeatsEconomy 6 458 202.31 76.37 185.00 194.64 85.99 78.00
## SeatsPremium 7 458 33.65 13.26 36.00 33.35 11.86 8.00
## PitchEconomy 8 458 31.22 0.66 31.00 31.26 0.00 30.00
## PitchPremium 9 458 37.91 1.31 38.00 38.05 0.00 34.00
## WidthEconomy 10 458 17.84 0.56 18.00 17.81 0.00 17.00
## WidthPremium 11 458 19.47 1.10 19.00 19.53 0.00 17.00
## PriceEconomy 12 458 1327.08 988.27 1242.00 1244.40 1159.39 65.00
## PricePremium 13 458 1845.26 1288.14 1737.00 1799.05 1845.84 86.00
## PriceRelative 14 458 0.49 0.45 0.36 0.42 0.41 0.02
## SeatsTotal 15 458 235.96 85.29 227.00 228.73 90.44 98.00
## PitchDifference 16 458 6.69 1.76 7.00 6.76 0.00 2.00
## WidthDifference 17 458 1.63 1.19 1.00 1.53 0.00 0.00
## PercentPremiumSeats 18 458 14.65 4.84 13.21 14.31 2.68 4.71
## max range skew kurtosis se
## Airline* 6.00 5.00 0.61 -0.95 0.08
## Aircraft 2.00 1.00 -0.72 -1.48 0.02
## FlightDuration 14.66 13.41 -0.07 -1.12 0.17
## TravelMonth 4.00 3.00 -0.14 -1.46 0.05
## IsInternational 2.00 1.00 -2.91 6.50 0.01
## SeatsEconomy 389.00 311.00 0.72 -0.36 3.57
## SeatsPremium 66.00 58.00 0.23 -0.46 0.62
## PitchEconomy 33.00 3.00 -0.03 -0.35 0.03
## PitchPremium 40.00 6.00 -1.51 3.52 0.06
## WidthEconomy 19.00 2.00 -0.04 -0.08 0.03
## WidthPremium 21.00 4.00 -0.08 -0.31 0.05
## PriceEconomy 3593.00 3528.00 0.51 -0.88 46.18
## PricePremium 7414.00 7328.00 0.50 0.43 60.19
## PriceRelative 1.89 1.87 1.17 0.72 0.02
## SeatsTotal 441.00 343.00 0.70 -0.53 3.99
## PitchDifference 10.00 8.00 -0.54 1.78 0.08
## WidthDifference 4.00 4.00 0.84 -0.53 0.06
## PercentPremiumSeats 24.69 19.98 0.71 0.28 0.23
table(airlines$Airline)
##
## AirFrance British Delta Jet Singapore Virgin
## 74 175 46 61 40 62
plot(x=airlines$Airline,y=airlines$PriceEconomy)
plot(x=airlines$Airline,y=airlines$PricePremium)
library(ggplot2)
##
## Attaching package: 'ggplot2'
## The following objects are masked from 'package:psych':
##
## %+%, alpha
ggplot(airlines, aes(x = Airline, fill = Airline)) + geom_bar()
cor(airlines[, c("SeatsEconomy", "PriceEconomy")])
## SeatsEconomy PriceEconomy
## SeatsEconomy 1.0000000 0.1281672
## PriceEconomy 0.1281672 1.0000000
cor(airlines[, c("SeatsPremium", "PricePremium")])
## SeatsPremium PricePremium
## SeatsPremium 1.0000000 0.2176124
## PricePremium 0.2176124 1.0000000
mytable1<-xtabs(~airlines$Airline+airlines$PriceEconomy)
chisq.test(mytable1)
## Warning in chisq.test(mytable1): Chi-squared approximation may be incorrect
##
## Pearson's Chi-squared test
##
## data: mytable1
## X-squared = 2156.7, df = 945, p-value < 2.2e-16
mytable2<-xtabs(~airlines$Airline+airlines$PricePremium)
chisq.test(mytable2)
## Warning in chisq.test(mytable2): Chi-squared approximation may be incorrect
##
## Pearson's Chi-squared test
##
## data: mytable2
## X-squared = 2195.1, df = 850, p-value < 2.2e-16
mytable3<-xtabs(~airlines$Airline+airlines$PriceRelative)
chisq.test(mytable3)
## Warning in chisq.test(mytable3): Chi-squared approximation may be incorrect
##
## Pearson's Chi-squared test
##
## data: mytable3
## X-squared = 1402.9, df = 485, p-value < 2.2e-16
mytable4<-xtabs(~airlines$Aircraft+airlines$PriceEconomy)
chisq.test(mytable4)
## Warning in chisq.test(mytable4): Chi-squared approximation may be incorrect
##
## Pearson's Chi-squared test
##
## data: mytable4
## X-squared = 396.14, df = 189, p-value < 2.2e-16
mytable5<-xtabs(~airlines$Aircraft+airlines$PricePremium)
chisq.test(mytable5)
## Warning in chisq.test(mytable5): Chi-squared approximation may be incorrect
##
## Pearson's Chi-squared test
##
## data: mytable5
## X-squared = 399.65, df = 170, p-value < 2.2e-16
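Each chi-squared test above triggers the warning "Chi-squared approximation may be incorrect". A likely reason is that a price column has many distinct values, so the cross-tabulation is very sparse and most expected cell counts fall well below the usual rule of thumb of 5. The sketch below reproduces the effect on simulated toy data (not the airline data) and then shows kruskal.test() as one possible swapped-in alternative that compares the price distributions across groups directly. It is an illustration under these assumptions, not the method used in this report.

```r
set.seed(1)  # simulated toy data, not the airline data set

n <- 120
group <- factor(sample(c("A", "B", "C"), n, replace = TRUE))
# A numeric response with many distinct values, like a ticket price.
price <- round(rnorm(n, mean = 1000 + 200 * as.numeric(group), sd = 150))

# Cross-tabulating a factor against raw prices yields a very sparse table.
tab <- xtabs(~ group + price)
expected <- suppressWarnings(chisq.test(tab))$expected
mean(expected < 5)   # share of cells with an expected count below 5

# Swapped-in alternative: rank-based comparison of price across groups.
kw <- kruskal.test(price ~ group)
kw$p.value
```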
Dependent variable: PriceRelative
Predictor variables: TravelMonth
fit1<-lm(airlines$PriceRelative~airlines$TravelMonth)
summary(fit1)
##
## Call:
## lm(formula = airlines$PriceRelative ~ airlines$TravelMonth)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.4690 -0.3827 -0.1177 0.2534 1.4073
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.49520 0.05084 9.741 <2e-16 ***
## airlines$TravelMonth -0.00312 0.01805 -0.173 0.863
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4511 on 456 degrees of freedom
## Multiple R-squared: 6.554e-05, Adjusted R-squared: -0.002127
## F-statistic: 0.02989 on 1 and 456 DF, p-value: 0.8628
Dependent variable: PriceRelative
Predictor variables: FlightDuration
fit2<-lm(airlines$PriceRelative~airlines$FlightDuration)
summary(fit2)
##
## Call:
## lm(formula = airlines$PriceRelative ~ airlines$FlightDuration)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.5507 -0.3373 -0.1167 0.2363 1.4694
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.370491 0.049454 7.492 3.56e-13 ***
## airlines$FlightDuration 0.015402 0.005913 2.605 0.0095 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4478 on 456 degrees of freedom
## Multiple R-squared: 0.01466, Adjusted R-squared: 0.0125
## F-statistic: 6.784 on 1 and 456 DF, p-value: 0.009498
Dependent variable: PriceEconomy
Predictor variables: SeatsEconomy
fit3<-lm(airlines$PriceEconomy~airlines$SeatsEconomy)
summary(fit3)
##
## Call:
## lm(formula = airlines$PriceEconomy ~ airlines$SeatsEconomy)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1434.99 -861.30 -49.71 619.12 2312.88
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 991.545 129.941 7.631 1.38e-13 ***
## airlines$SeatsEconomy 1.659 0.601 2.760 0.00602 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 981.2 on 456 degrees of freedom
## Multiple R-squared: 0.01643, Adjusted R-squared: 0.01427
## F-statistic: 7.616 on 1 and 456 DF, p-value: 0.006019
Dependent variable: PricePremium
Predictor variables: SeatsPremium
fit4<-lm(airlines$PricePremium~airlines$SeatsPremium)
summary(fit4)
##
## Call:
## lm(formula = airlines$PricePremium ~ airlines$SeatsPremium)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2210.6 -999.1 -111.0 1082.6 5772.7
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1134.01 160.55 7.063 6.11e-12 ***
## airlines$SeatsPremium 21.14 4.44 4.761 2.59e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1259 on 456 degrees of freedom
## Multiple R-squared: 0.04736, Adjusted R-squared: 0.04527
## F-statistic: 22.67 on 1 and 456 DF, p-value: 2.591e-06
Dependent variable: PriceRelative
Predictor variables: PitchDifference
fit5<-lm(airlines$PriceRelative~airlines$PitchDifference)
summary(fit5)
##
## Call:
## lm(formula = airlines$PriceRelative ~ airlines$PitchDifference)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.7643 -0.3247 -0.1146 0.2052 1.2954
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.31456 0.07317 -4.299 2.1e-05 ***
## airlines$PitchDifference 0.11989 0.01058 11.331 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3985 on 456 degrees of freedom
## Multiple R-squared: 0.2197, Adjusted R-squared: 0.218
## F-statistic: 128.4 on 1 and 456 DF, p-value: < 2.2e-16
Dependent variable: PriceRelative
Predictor variables: WidthDifference
fit6<-lm(airlines$PriceRelative~airlines$WidthDifference)
summary(fit6)
##
## Call:
## lm(formula = airlines$PriceRelative ~ airlines$WidthDifference)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.8028 -0.2907 -0.0766 0.1852 1.1893
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.18660 0.03132 5.958 5.11e-09 ***
## airlines$WidthDifference 0.18406 0.01551 11.869 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3943 on 456 degrees of freedom
## Multiple R-squared: 0.236, Adjusted R-squared: 0.2343
## F-statistic: 140.9 on 1 and 456 DF, p-value: < 2.2e-16
Dependent variable: PriceRelative
Predictor variables: SeatsEconomy
fit7<-lm(airlines$PriceRelative~airlines$SeatsEconomy)
summary(fit7)
##
## Call:
## lm(formula = airlines$PriceRelative ~ airlines$SeatsEconomy)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.4716 -0.3863 -0.1213 0.2546 1.4046
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.825e-01 5.974e-02 8.077 5.99e-15 ***
## airlines$SeatsEconomy 2.335e-05 2.763e-04 0.084 0.933
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4511 on 456 degrees of freedom
## Multiple R-squared: 1.566e-05, Adjusted R-squared: -0.002177
## F-statistic: 0.00714 on 1 and 456 DF, p-value: 0.9327
Dependent variable: PriceRelative
Predictor variables: SeatsPremium
fit8<-lm(airlines$PriceRelative~airlines$SeatsPremium)
summary(fit8)
##
## Call:
## lm(formula = airlines$PriceRelative ~ airlines$SeatsPremium)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.5023 -0.3862 -0.1129 0.2038 1.3445
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.598328 0.057266 10.448 <2e-16 ***
## airlines$SeatsPremium -0.003302 0.001584 -2.085 0.0376 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4489 on 456 degrees of freedom
## Multiple R-squared: 0.009447, Adjusted R-squared: 0.007275
## F-statistic: 4.349 on 1 and 456 DF, p-value: 0.03759
Dependent variable: PriceRelative
Predictor variables: SeatsPremium-SeatsEconomy
fit9<-lm(airlines$PriceRelative ~ I(airlines$SeatsPremium-airlines$SeatsEconomy))
summary(fit9)
##
## Call:
## lm(formula = airlines$PriceRelative ~ I(airlines$SeatsPremium -
## airlines$SeatsEconomy))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.4948 -0.3848 -0.1150 0.2620 1.4120
##
## Coefficients:
## Estimate Std. Error
## (Intercept) 0.4617079 0.0557965
## I(airlines$SeatsPremium - airlines$SeatsEconomy) -0.0001512 0.0003063
## t value Pr(>|t|)
## (Intercept) 8.275 1.43e-15 ***
## I(airlines$SeatsPremium - airlines$SeatsEconomy) -0.494 0.622
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.451 on 456 degrees of freedom
## Multiple R-squared: 0.0005338, Adjusted R-squared: -0.001658
## F-statistic: 0.2436 on 1 and 456 DF, p-value: 0.6219
Dependent variable: PricePremium-PriceEconomy
Predictor variables: PitchDifference
fit10<-lm(I(airlines$PricePremium-airlines$PriceEconomy) ~ airlines$PitchDifference)
summary(fit10)
##
## Call:
## lm(formula = I(airlines$PricePremium - airlines$PriceEconomy) ~
## airlines$PitchDifference)
##
## Residuals:
## Min 1Q Median 3Q Max
## -600.4 -379.9 -274.4 241.0 3780.5
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 233.14 106.45 2.190 0.02902 *
## airlines$PitchDifference 42.62 15.39 2.769 0.00586 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 579.7 on 456 degrees of freedom
## Multiple R-squared: 0.01653, Adjusted R-squared: 0.01438
## F-statistic: 7.666 on 1 and 456 DF, p-value: 0.005855
Dependent variable: PricePremium-PriceEconomy
Predictor variables: WidthDifference
fit11<-lm(I(airlines$PricePremium-airlines$PriceEconomy) ~ airlines$WidthDifference)
summary(fit11)
##
## Call:
## lm(formula = I(airlines$PricePremium - airlines$PriceEconomy) ~
## airlines$WidthDifference)
##
## Residuals:
## Min 1Q Median 3Q Max
## -595.9 -408.9 -257.6 243.4 3830.4
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 423.87 46.11 9.192 <2e-16 ***
## airlines$WidthDifference 57.75 22.83 2.529 0.0118 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 580.5 on 456 degrees of freedom
## Multiple R-squared: 0.01383, Adjusted R-squared: 0.01167
## F-statistic: 6.396 on 1 and 456 DF, p-value: 0.01177
Dependent variable: PricePremium-PriceEconomy
Predictor variables: FlightDuration
fit12<-lm(I(airlines$PricePremium-airlines$PriceEconomy) ~ airlines$FlightDuration)
summary(fit12)
##
## Call:
## lm(formula = I(airlines$PricePremium - airlines$PriceEconomy) ~
## airlines$FlightDuration)
##
## Residuals:
## Min 1Q Median 3Q Max
## -891.4 -339.0 -53.9 148.4 3307.2
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -71.581 56.918 -1.258 0.209
## airlines$FlightDuration 77.827 6.806 11.435 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 515.3 on 456 degrees of freedom
## Multiple R-squared: 0.2229, Adjusted R-squared: 0.2212
## F-statistic: 130.8 on 1 and 456 DF, p-value: < 2.2e-16
cor(x=I(airlines$FlightDuration),y=I(airlines$PricePremium-airlines$PriceEconomy))
## [1] 0.4720837
cor(x=I(airlines$TravelMonth),y=I(airlines$PricePremium-airlines$PriceEconomy))
## [1] 0.007286108
cor.test(airlines$TravelMonth,I(airlines$PricePremium-airlines$PriceEconomy))
##
## Pearson's product-moment correlation
##
## data: airlines$TravelMonth and I(airlines$PricePremium - airlines$PriceEconomy)
## t = 0.15559, df = 456, p-value = 0.8764
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.08439705 0.09884693
## sample estimates:
## cor
## 0.007286108
cor.test(airlines$IsInternational,I(airlines$PricePremium-airlines$PriceEconomy))
##
## Pearson's product-moment correlation
##
## data: airlines$IsInternational and I(airlines$PricePremium - airlines$PriceEconomy)
## t = 5.7328, df = 456, p-value = 1.799e-08
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.1717354 0.3427659
## sample estimates:
## cor
## 0.2592822
cor.test(airlines$Aircraft,I(airlines$PricePremium-airlines$PriceEconomy))
##
## Pearson's product-moment correlation
##
## data: airlines$Aircraft and I(airlines$PricePremium - airlines$PriceEconomy)
## t = 0.47848, df = 456, p-value = 0.6325
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.06936787 0.11379457
## sample estimates:
## cor
## 0.02240132
t.test(I(airlines$PricePremium-airlines$PriceEconomy)~airlines$Aircraft)
##
## Welch Two Sample t-test
##
## data: I(airlines$PricePremium - airlines$PriceEconomy) by airlines$Aircraft
## t = -0.50194, df = 338.8, p-value = 0.616
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -136.72105 81.12983
## sample estimates:
## mean in group 1 mean in group 2
## 499.5497 527.3453
t.test(I(airlines$PricePremium)~airlines$Aircraft)
##
## Welch Two Sample t-test
##
## data: I(airlines$PricePremium) by airlines$Aircraft
## t = 0.28645, df = 310.38, p-value = 0.7747
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -212.2929 284.6350
## sample estimates:
## mean in group 1 mean in group 2
## 1869.503 1833.332
t.test(I(airlines$PriceEconomy)~airlines$Aircraft)
##
## Welch Two Sample t-test
##
## data: I(airlines$PriceEconomy) by airlines$Aircraft
## t = 0.64317, df = 289.45, p-value = 0.5206
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -131.7801 259.7135
## sample estimates:
## mean in group 1 mean in group 2
## 1369.954 1305.987
plot(airlines$FlightDuration,airlines$PriceEconomy, col="green",main="Price economy vs flight hours",xlab="Hours", ylab="Price")
abline(h=mean(airlines$PriceEconomy), col="black", lty="dotted")
abline(v=mean(airlines$FlightDuration), col="black", lty="dotted")
abline(lm(airlines$PriceEconomy ~ airlines$FlightDuration))
The British airline occurs most often in the data set (175 of the 458 observations). It has the largest number of entries, each with its own combination of factor values.
The boxplots of airline against premium economy ticket price and airline against economy ticket price have a similar overall shape across the airlines, which suggests that the spread between the minimum and maximum ticket price (for both classes) depends on the airline.
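The Airline factor is statistically associated with the economy class ticket price, the premium economy class ticket price and the relative price of the two classes: all three chi-squared tests give a p-value < 2.2e-16. Because each test cross-tabulates the airline against raw price values, R warns that the chi-squared approximation may be incorrect, so these results should be read with caution.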
In the linear regression models of the relative price on flight duration (p = 0.0095) and of the price difference on flight duration (p < 2e-16), the p-value is below 0.05, so we conclude that flight duration is a statistically significant predictor of the gap between economy class and premium economy class ticket prices. The relationship is much stronger for the price difference (R-squared = 0.22, r = 0.47) than for the relative price (R-squared = 0.015).
The number of seats in the economy class is statistically related to the economy class ticket price (regression p = 0.006), although the correlation is weak (r = 0.13).
The number of seats in the premium economy class is statistically related to the premium economy class ticket price (regression p = 2.59e-06), although the correlation is also weak (r = 0.22).
The difference between the number of premium economy seats and economy seats does not contribute significantly to the relative price of the two classes, since the p-value (0.622) is above 0.05 in the corresponding linear regression model.
The difference in seat pitch between the economy class and the premium economy class contributes significantly to the relative price of the two classes (p < 2e-16) as well as to the price difference (p = 0.0059), since both p-values are below 0.05 in the corresponding linear regression models.
The difference in seat width between the economy class and the premium economy class contributes significantly to the price difference between the two classes (p = 0.0118) as well as to the relative price (p < 2e-16), since both p-values are below 0.05 in the corresponding linear regression models.
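The regression-based findings above rest on two numbers from each model summary: the p-value of the slope and the R-squared. A small p-value only says that the slope is unlikely to be zero, while R-squared says how much of the variation the predictor explains, so it helps to report both. The sketch below shows one way to pull them out of a fitted model. It uses simulated toy data with hypothetical column names (duration, price_gap), not the airline data.

```r
set.seed(42)  # simulated toy data, not the airline data set

toy <- data.frame(duration = runif(200, min = 1, max = 15))
toy$price_gap <- 80 * toy$duration + rnorm(200, sd = 500)

# Using the data argument keeps the coefficient names short.
fit <- lm(price_gap ~ duration, data = toy)
s <- summary(fit)

slope_p   <- s$coefficients["duration", "Pr(>|t|)"]  # p-value of the slope
r_squared <- s$r.squared                             # share of variance explained

cat("slope p-value:", signif(slope_p, 3),
    " R-squared:", round(r_squared, 3), "\n")
```

For a regression with a single predictor, R-squared equals the squared Pearson correlation between predictor and response, which links the regression summaries to the cor() results in this report.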
The travel month is essentially uncorrelated with the difference in price between economy class and premium economy class tickets: the correlation coefficient is positive but very close to zero (r = 0.007) and not significant (p = 0.876). The travel month is also not a statistically significant predictor of the relative price in the regression model (p = 0.863).
Based on the correlation test, the IsInternational factor shows a statistically significant (p = 1.8e-08) but only weak-to-moderate positive correlation (r = 0.26) with the difference in price between economy class and premium economy class tickets.
The Aircraft factor has a very small positive correlation (r = 0.02) with the difference in price between economy class and premium economy class tickets. With a p-value above 0.05 in both the correlation test (p = 0.63) and the Welch t-test (p = 0.62), the Aircraft factor is not a significant contributor to the price difference.
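For a two-level factor coded 1/2, such as Aircraft, the Pearson correlation test and the two-sample t-test answer the same question. With the pooled-variance option (var.equal = TRUE) they give the same p-value, and the t statistics differ only in sign, because t.test() reports the mean of group 1 minus the mean of group 2. The Welch test used in this report does not assume equal variances, which would explain why its p-value is close to, but not identical with, the one from cor.test(). The sketch below checks the equivalence on simulated toy data, not the airline data.

```r
set.seed(7)  # simulated toy data, not the airline data set

group <- rep(c(1, 2), times = c(60, 90))   # two-level factor coded 1/2
y <- rnorm(150, mean = 500 + 30 * group, sd = 400)

ct <- cor.test(group, y)
tt <- t.test(y ~ group, var.equal = TRUE)  # pooled-variance version

# Same p-value; the t statistics differ only in sign because
# t.test() reports mean(group 1) - mean(group 2).
c(cor_p = ct$p.value, t_p = tt$p.value)
c(cor_t = unname(ct$statistic), t_t = unname(tt$statistic))
```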