Airlines operate in a very competitive marketplace and aim to reduce the cost per passenger seat and increase revenue per flight by carrying more passengers. One way to do this is to reduce the legroom between seats so that more passengers can be accommodated than before. With more seats, airlines can charge even lower prices to attract more passengers. Airlines also offer higher-priced tickets to passengers who want more legroom and other amenities near the seat, such as a small TV, a charging point, reading lights and a more comfortable seat. In this project, we will examine whether the factors mentioned above are significant enough for airlines to charge a higher price.
setwd("C:/Users/Abhi/Desktop/Data Analytics/Week 3 Day 5")
# stringsAsFactors = TRUE keeps text columns as factors (the default before R 4.0)
airline <- read.csv("SixAirlinesDataV2.csv", stringsAsFactors = TRUE)
head(airline)
## Airline Aircraft FlightDuration TravelMonth IsInternational SeatsEconomy
## 1 British Boeing 12.25 Jul International 122
## 2 British Boeing 12.25 Aug International 122
## 3 British Boeing 12.25 Sep International 122
## 4 British Boeing 12.25 Oct International 122
## 5 British Boeing 8.16 Aug International 122
## 6 British Boeing 8.16 Sep International 122
## SeatsPremium PitchEconomy PitchPremium WidthEconomy WidthPremium
## 1 40 31 38 18 19
## 2 40 31 38 18 19
## 3 40 31 38 18 19
## 4 40 31 38 18 19
## 5 40 31 38 18 19
## 6 40 31 38 18 19
## PriceEconomy PricePremium PriceRelative SeatsTotal PitchDifference
## 1 2707 3725 0.38 162 7
## 2 2707 3725 0.38 162 7
## 3 2707 3725 0.38 162 7
## 4 2707 3725 0.38 162 7
## 5 1793 2999 0.67 162 7
## 6 1793 2999 0.67 162 7
## WidthDifference PercentPremiumSeats
## 1 1 24.69
## 2 1 24.69
## 3 1 24.69
## 4 1 24.69
## 5 1 24.69
## 6 1 24.69
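The last five columns look like quantities derived from the raw ones. As a rough check, the sketch below recomputes them for rows 1 and 5 of the output above. The formulas are assumptions inferred from the printed values, not taken from any dataset documentation.

```r
# Rows 1 and 5 of head(airline), copied from the printed output above.
rows <- data.frame(
  SeatsEconomy = c(122, 122),
  SeatsPremium = c(40, 40),
  PitchEconomy = c(31, 31),
  PitchPremium = c(38, 38),
  WidthEconomy = c(18, 18),
  WidthPremium = c(19, 19),
  PriceEconomy = c(2707, 1793),
  PricePremium = c(3725, 2999)
)

# Assumed formulas for the derived columns (inferred, not documented).
derived <- with(rows, data.frame(
  SeatsTotal = SeatsEconomy + SeatsPremium,
  PitchDifference = PitchPremium - PitchEconomy,
  WidthDifference = WidthPremium - WidthEconomy,
  PriceRelative = round((PricePremium - PriceEconomy) / PriceEconomy, 2),
  PercentPremiumSeats = round(100 * SeatsPremium / (SeatsEconomy + SeatsPremium), 2)
))
print(derived)
```

Under these assumptions the recomputed values agree with the printed ones: SeatsTotal 162, PitchDifference 7, WidthDifference 1, PriceRelative 0.38 and 0.67, and PercentPremiumSeats 24.69.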
summary(airline)
## Airline Aircraft FlightDuration TravelMonth
## AirFrance: 74 AirBus:151 Min. : 1.250 Aug:127
## British :175 Boeing:307 1st Qu.: 4.260 Jul: 75
## Delta : 46 Median : 7.790 Oct:127
## Jet : 61 Mean : 7.578 Sep:129
## Singapore: 40 3rd Qu.:10.620
## Virgin : 62 Max. :14.660
## IsInternational SeatsEconomy SeatsPremium PitchEconomy
## Domestic : 40 Min. : 78.0 Min. : 8.00 Min. :30.00
## International:418 1st Qu.:133.0 1st Qu.:21.00 1st Qu.:31.00
## Median :185.0 Median :36.00 Median :31.00
## Mean :202.3 Mean :33.65 Mean :31.22
## 3rd Qu.:243.0 3rd Qu.:40.00 3rd Qu.:32.00
## Max. :389.0 Max. :66.00 Max. :33.00
## PitchPremium WidthEconomy WidthPremium PriceEconomy
## Min. :34.00 Min. :17.00 Min. :17.00 Min. : 65
## 1st Qu.:38.00 1st Qu.:18.00 1st Qu.:19.00 1st Qu.: 413
## Median :38.00 Median :18.00 Median :19.00 Median :1242
## Mean :37.91 Mean :17.84 Mean :19.47 Mean :1327
## 3rd Qu.:38.00 3rd Qu.:18.00 3rd Qu.:21.00 3rd Qu.:1909
## Max. :40.00 Max. :19.00 Max. :21.00 Max. :3593
## PricePremium PriceRelative SeatsTotal PitchDifference
## Min. : 86.0 Min. :0.0200 Min. : 98 Min. : 2.000
## 1st Qu.: 528.8 1st Qu.:0.1000 1st Qu.:166 1st Qu.: 6.000
## Median :1737.0 Median :0.3650 Median :227 Median : 7.000
## Mean :1845.3 Mean :0.4872 Mean :236 Mean : 6.688
## 3rd Qu.:2989.0 3rd Qu.:0.7400 3rd Qu.:279 3rd Qu.: 7.000
## Max. :7414.0 Max. :1.8900 Max. :441 Max. :10.000
## WidthDifference PercentPremiumSeats
## Min. :0.000 Min. : 4.71
## 1st Qu.:1.000 1st Qu.:12.28
## Median :1.000 Median :13.21
## Mean :1.633 Mean :14.65
## 3rd Qu.:3.000 3rd Qu.:15.36
## Max. :4.000 Max. :24.69
Understanding the summary statistics: The data cover six airlines flying aircraft from two manufacturers, Boeing (307 records) and Airbus (151 records). Most records (418 of 458) are for international routes. The number of records varies by travel month: September has the most (129), closely followed by August and October (127 each), which may reflect the late-summer and early-autumn travel season. Economy-class seats per aircraft range from 78 to 389, and premium-class seats from 8 to 66. The maximum seat width is 19 inches in economy class and 21 inches in premium class. The mean price is $1327 for economy class and $1845 for premium class (the premium median is $1737). Histograms for several variables follow.
str(airline)
## 'data.frame': 458 obs. of 18 variables:
## $ Airline : Factor w/ 6 levels "AirFrance","British",..: 2 2 2 2 2 2 2 2 2 2 ...
## $ Aircraft : Factor w/ 2 levels "AirBus","Boeing": 2 2 2 2 2 2 2 2 2 2 ...
## $ FlightDuration : num 12.25 12.25 12.25 12.25 8.16 ...
## $ TravelMonth : Factor w/ 4 levels "Aug","Jul","Oct",..: 2 1 4 3 1 4 3 1 4 4 ...
## $ IsInternational : Factor w/ 2 levels "Domestic","International": 2 2 2 2 2 2 2 2 2 2 ...
## $ SeatsEconomy : int 122 122 122 122 122 122 122 122 122 122 ...
## $ SeatsPremium : int 40 40 40 40 40 40 40 40 40 40 ...
## $ PitchEconomy : int 31 31 31 31 31 31 31 31 31 31 ...
## $ PitchPremium : int 38 38 38 38 38 38 38 38 38 38 ...
## $ WidthEconomy : int 18 18 18 18 18 18 18 18 18 18 ...
## $ WidthPremium : int 19 19 19 19 19 19 19 19 19 19 ...
## $ PriceEconomy : int 2707 2707 2707 2707 1793 1793 1793 1476 1476 1705 ...
## $ PricePremium : int 3725 3725 3725 3725 2999 2999 2999 2997 2997 2989 ...
## $ PriceRelative : num 0.38 0.38 0.38 0.38 0.67 0.67 0.67 1.03 1.03 0.75 ...
## $ SeatsTotal : int 162 162 162 162 162 162 162 162 162 162 ...
## $ PitchDifference : int 7 7 7 7 7 7 7 7 7 7 ...
## $ WidthDifference : int 1 1 1 1 1 1 1 1 1 1 ...
## $ PercentPremiumSeats: num 24.7 24.7 24.7 24.7 24.7 ...
attach(airline)
hist(FlightDuration , xlab="Flight duration", col = "blue")
The largest number of flights are on routes of about 8 hours.
hist(SeatsEconomy, xlab="Number of Economy Seats", col = "blue")
No airline offers between 250 and 300 economy seats.
hist(SeatsPremium, xlab="Number of Premium Seats", col = "blue")
The majority of airlines offer 35 to 40 premium-class seats.
hist(WidthEconomy, xlab="Width of Economy Seats", col = "blue")
Only three seat widths are available in economy class, and 18 inches is the most prevalent.
hist(WidthPremium, xlab="Width of Premium Economy Seats", col = "blue")
In premium class, a 19-inch seat width is found in the majority of aircraft.
hist(PriceEconomy, xlab="Price of Economy Seats", col = "blue")
As expected, prices below $1000 are more common in economy class.
hist(PricePremium, xlab="Price of Premium Economy Seats", col = "blue")
Finding outliers in the data.
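One common way to flag outliers is Tukey's 1.5 × IQR rule. The sketch below applies it to the two price variables using only the quartiles printed by `summary(airline)`. This is an approximation, since box plots use hinges that can differ slightly from these quartiles.

```r
# Quartiles, minima and maxima copied from the summary(airline) output.
q <- data.frame(
  variable = c("PriceEconomy", "PricePremium"),
  q1  = c(413, 528.8),
  q3  = c(1909, 2989),
  min = c(65, 86),
  max = c(3593, 7414)
)

# Tukey's rule: values beyond 1.5 * IQR from the quartiles are flagged.
q$iqr <- q$q3 - q$q1
q$lower_fence <- q$q1 - 1.5 * q$iqr
q$upper_fence <- q$q3 + 1.5 * q$iqr
q$has_high_outlier <- q$max > q$upper_fence
q$has_low_outlier  <- q$min < q$lower_fence
print(q)
```

By this rule the upper fence is $4153 for economy prices, above the maximum of $3593, and $6679.30 for premium prices, below the maximum of $7414. Premium prices therefore contain at least one high outlier and economy prices contain none. With the data loaded, `boxplot.stats(airline$PricePremium)$out` would list the flagged values.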
plot(Airline, SeatsPremium, main="Airline vs Premium Seats", ylab="Premium Seats", col="green")
library(lattice)
histogram(~PriceEconomy | IsInternational, data=airline, col="yellow", main = "Price of economy class tickets in international and domestic flights")
histogram(~PricePremium | IsInternational, data=airline, col="yellow", main = "Price of premium class tickets in international and domestic flights")
plot(Airline, SeatsPremium, col="black", main="Airline vs Premium Seats", ylab="Premium Seats")
plot(Airline, SeatsEconomy, col="black", main="Airline vs Economy Seats", ylab="Economy Seats")
library(car)
scatterplot(PriceEconomy, FlightDuration, main="Economy class pricing vs. flight duration", xlab="Price (in $)", ylab = "Duration (in hours)")
library(corrgram)
corrgram(airline, order=TRUE,lower.panel=panel.shade,upper.panel=panel.pie,text.panel=panel.txt, main="Corrgram")
Null hypothesis: the price does not differ with the width of the seat.
t.test(WidthPremium, PricePremium, var.equal = TRUE, paired=FALSE)
##
## Two Sample t-test
##
## data: WidthPremium and PricePremium
## t = -30.333, df = 914, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -1943.914 -1707.658
## sample estimates:
## mean of x mean of y
## 19.47162 1845.25764
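Here, the p-value is less than 0.05, hence we reject the null hypothesis. Note, however, that this two-sample t-test compares the mean seat width (19.47 inches) with the mean price ($1845.26), two quantities on different scales, so the rejection by itself says little about whether price depends on width. The correlation test and regression that follow address that question more directly.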
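To see what this t-test actually measures, its arithmetic can be reproduced from the printed output alone. A minimal sketch, using only the numbers shown above:

```r
# Values copied from the t.test output above.
mean_width <- 19.47162     # mean of WidthPremium (inches)
mean_price <- 1845.25764   # mean of PricePremium (dollars)
t_stat     <- -30.333
ci         <- c(-1943.914, -1707.658)

# The quantity being tested is simply mean(width) - mean(price).
mean_diff <- mean_width - mean_price
ci_mid    <- mean(ci)

# Standard error implied by the reported t statistic.
std_err <- mean_diff / t_stat

cat("difference in means:", mean_diff, "\n")
cat("CI midpoint:        ", ci_mid, "\n")
cat("implied std. error: ", std_err, "\n")
```

The difference in means (about -1825.8) coincides with the midpoint of the confidence interval, so the test is about inches minus dollars. A test that relates the two variables, for example `cor.test(WidthPremium, PricePremium)`, may be closer to the stated hypothesis. That call is a suggestion and is not run in the original analysis.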
cor.test((PricePremium-PriceEconomy),WidthDifference)
##
## Pearson's product-moment correlation
##
## data: (PricePremium - PriceEconomy) and WidthDifference
## t = 2.5291, df = 456, p-value = 0.01177
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.02627012 0.20700978
## sample estimates:
## cor
## 0.1176138
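The reported t statistic and p-value follow directly from the correlation coefficient and the degrees of freedom. The sketch below recomputes them from the printed values as a check on how to read this output:

```r
# Values copied from the cor.test output above.
r  <- 0.1176138
df <- 456            # n - 2, so n = 458 observations

# t statistic for H0: true correlation is 0.
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution.
p_value <- 2 * pt(-abs(t_stat), df)

# Share of variance in the price gap associated with width difference.
r_squared <- r^2

cat("t =", round(t_stat, 4), " p =", signif(p_value, 4),
    " r^2 =", round(r_squared, 4), "\n")
```

The correlation is statistically significant but weak: width difference accounts for only about 1.4% of the variance in the price gap.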
rmdl = lm((PricePremium-PriceEconomy) ~ PitchDifference + WidthDifference + FlightDuration)
summary(rmdl)
##
## Call:
## lm(formula = (PricePremium - PriceEconomy) ~ PitchDifference +
## WidthDifference + FlightDuration)
##
## Residuals:
## Min 1Q Median 3Q Max
## -859.4 -324.7 -62.7 150.1 3331.5
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -286.933 117.833 -2.435 0.0153 *
## PitchDifference 10.387 20.779 0.500 0.6174
## WidthDifference 74.641 30.977 2.410 0.0164 *
## FlightDuration 80.992 6.754 11.992 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 506.1 on 454 degrees of freedom
## Multiple R-squared: 0.2538, Adjusted R-squared: 0.2489
## F-statistic: 51.48 on 3 and 454 DF, p-value: < 2.2e-16
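To make the coefficients concrete, the sketch below applies the fitted equation to the first row shown by `head(airline)`. It uses the rounded coefficients from the printed summary, so the result is approximate.

```r
# Coefficients copied from the summary(rmdl) output above.
coefs <- c(intercept = -286.933, pitch = 10.387, width = 74.641, duration = 80.992)

# First row of head(airline): PitchDifference = 7, WidthDifference = 1,
# FlightDuration = 12.25, PriceEconomy = 2707, PricePremium = 3725.
x <- c(1, 7, 1, 12.25)

predicted_gap <- sum(coefs * x)
actual_gap    <- 3725 - 2707
residual      <- actual_gap - predicted_gap

cat("predicted gap:", predicted_gap, "\n")
cat("actual gap:   ", actual_gap, "\n")
cat("residual:     ", residual, "\n")
```

The predicted premium-over-economy gap is about $853 against an actual gap of $1018, a residual of about $165, which is well within the residual standard error of 506.1. Most of the prediction comes from the flight duration term (80.992 × 12.25 ≈ 992).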
Prices varied more on international flights. Pitch and width were greater on international flights. Economy prices increase with width, pitch and flight duration. Premium-class tickets cost more than economy-class tickets. Since there is no absolute measure of price, we take the difference between the two ticket prices as the regression response, with pitch difference, width difference and flight duration as predictors. In the fitted model, flight duration (p < 2e-16) and width difference (p = 0.0164) are significant, pitch difference is not (p = 0.6174), and the model explains about 25% of the variation in the price difference.