Introduction to the business problem

Business Problem: Enhancing Airline Passenger Satisfaction Through Data-Driven Insights

The airline industry is dynamic and intensely competitive. In recent decades, airline competition has received considerable attention in the research community, given its potentially extensive impact on passengers, economies, and society as a whole. One important aspect of this competition is how airlines treat their customers. Exceptional customer service and experience are key to achieving competitive advantage in the airline industry. Where customer satisfaction and loyalty are crucial, airlines must exceed customer expectations and deliver a memorable travel experience. To do so, they need to identify the key factors that drive customer dissatisfaction, so that they can allocate resources where they improve the customer experience most.

This study addresses two business questions: (1) Which attributes contribute the most to passenger satisfaction, and how can airlines prioritize improvements to enhance overall satisfaction? (2) How can we predict whether a passenger is likely to be satisfied or dissatisfied with their flight experience, enabling proactive adjustments to improve the customer experience?

To answer these questions, we analyze a data set that contains both individual satisfaction scores and operational details for each observation. Together, these variables capture the subjective and the objective aspects of the passenger experience. For the first question, we apply classification analysis, which lets us understand which factors have the strongest influence on passengers’ satisfaction or dissatisfaction. For the second question, we apply predictive modeling: a predictive model lets us estimate a passenger’s satisfaction from passenger characteristics and the circumstances of the flight.

To carry out these tasks, we use two machine learning models: a decision tree and logistic regression. Both were considered suitable for our analysis, as explained below. A decision tree gives clear insight into how individual variables contribute to overall satisfaction, enabling rule-based decision making. Logistic regression quantifies the impact of each variable on satisfaction or dissatisfaction. By combining and comparing these models, we aim to reveal the most critical points in the airline experience. By identifying patterns and predicting dissatisfaction before it occurs, airlines can implement targeted improvements that align with passengers’ needs and, in that way, strengthen their long-term competitive advantage in the airline market.

DATA PREPARATION

We start data preparation by reading the data and saving it as a data frame named ‘dta’.

dta <- read.csv("airline_passenger_satisfaction.csv") #importing data

We then install (where missing) and load the packages required for the next parts of the assignment.

# Set the CRAN mirror:
local({r <- getOption("repos")
r["CRAN"] <- "https://cran.rstudio.com/"
options(repos = r)})

# Install the packages used in this tutorial:
packages <- c("C50", "ggplot2", "gmodels", "Hmisc", "randomForest", "rsample","tidyverse")

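for (i in packages) {
    # require() returns FALSE when the package is not installed yet
    if(!require(i, character.only = TRUE)) {
        install.packages(i, dependencies = TRUE)
        library(i, character.only = TRUE)  # load the freshly installed package
    }
}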
## Loading required package: C50
## Loading required package: ggplot2
## Loading required package: gmodels
## Loading required package: Hmisc
## 
## Attaching package: 'Hmisc'
## The following objects are masked from 'package:base':
## 
##     format.pval, units
## Loading required package: randomForest
## randomForest 4.7-1.2
## Type rfNews() to see new features/changes/bug fixes.
## 
## Attaching package: 'randomForest'
## The following object is masked from 'package:ggplot2':
## 
##     margin
## Loading required package: rsample
## Loading required package: tidyverse
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ lubridate 1.9.4     ✔ tibble    3.2.1
## ✔ purrr     1.0.4     ✔ tidyr     1.3.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::combine()       masks randomForest::combine()
## ✖ dplyr::filter()        masks stats::filter()
## ✖ dplyr::lag()           masks stats::lag()
## ✖ randomForest::margin() masks ggplot2::margin()
## ✖ dplyr::src()           masks Hmisc::src()
## ✖ dplyr::summarize()     masks Hmisc::summarize()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

1. Data exploration

The data set we selected contains the results of an airline passenger satisfaction survey. We begin with data exploration and cleaning in order to decide how to handle each variable and address our problem statement. Having imported the CSV file with read.csv(), we ask R for the summary, structure and head of the data. The output shows 129,880 observations of 24 variables, both numeric and categorical. Exploring the structure and contents before cleaning tells us which transformations are needed. The summary() function gives a quick statistical overview of the numeric variables; because the categorical variables are stored as character columns, it reports only their length and class, so their distributions are tabulated separately further below.

summary(dta)
##        ID            Gender               Age        Customer.Type     
##  Min.   :     1   Length:129880      Min.   : 7.00   Length:129880     
##  1st Qu.: 32471   Class :character   1st Qu.:27.00   Class :character  
##  Median : 64940   Mode  :character   Median :40.00   Mode  :character  
##  Mean   : 64940                      Mean   :39.43                     
##  3rd Qu.: 97410                      3rd Qu.:51.00                     
##  Max.   :129880                      Max.   :85.00                     
##                                                                        
##  Type.of.Travel        Class           Flight.Distance Departure.Delay  
##  Length:129880      Length:129880      Min.   :  31    Min.   :   0.00  
##  Class :character   Class :character   1st Qu.: 414    1st Qu.:   0.00  
##  Mode  :character   Mode  :character   Median : 844    Median :   0.00  
##                                        Mean   :1190    Mean   :  14.71  
##                                        3rd Qu.:1744    3rd Qu.:  12.00  
##                                        Max.   :4983    Max.   :1592.00  
##                                                                         
##  Arrival.Delay     Departure.and.Arrival.Time.Convenience
##  Min.   :   0.00   Min.   :0.000                         
##  1st Qu.:   0.00   1st Qu.:2.000                         
##  Median :   0.00   Median :3.000                         
##  Mean   :  15.09   Mean   :3.058                         
##  3rd Qu.:  13.00   3rd Qu.:4.000                         
##  Max.   :1584.00   Max.   :5.000                         
##  NA's   :393                                             
##  Ease.of.Online.Booking Check.in.Service Online.Boarding Gate.Location  
##  Min.   :0.000          Min.   :0.000    Min.   :0.000   Min.   :0.000  
##  1st Qu.:2.000          1st Qu.:3.000    1st Qu.:2.000   1st Qu.:2.000  
##  Median :3.000          Median :3.000    Median :3.000   Median :3.000  
##  Mean   :2.757          Mean   :3.306    Mean   :3.253   Mean   :2.977  
##  3rd Qu.:4.000          3rd Qu.:4.000    3rd Qu.:4.000   3rd Qu.:4.000  
##  Max.   :5.000          Max.   :5.000    Max.   :5.000   Max.   :5.000  
##                                                                         
##  On.board.Service  Seat.Comfort   Leg.Room.Service  Cleanliness   
##  Min.   :0.000    Min.   :0.000   Min.   :0.000    Min.   :0.000  
##  1st Qu.:2.000    1st Qu.:2.000   1st Qu.:2.000    1st Qu.:2.000  
##  Median :4.000    Median :4.000   Median :4.000    Median :3.000  
##  Mean   :3.383    Mean   :3.441   Mean   :3.351    Mean   :3.286  
##  3rd Qu.:4.000    3rd Qu.:5.000   3rd Qu.:4.000    3rd Qu.:4.000  
##  Max.   :5.000    Max.   :5.000   Max.   :5.000    Max.   :5.000  
##                                                                   
##  Food.and.Drink  In.flight.Service In.flight.Wifi.Service
##  Min.   :0.000   Min.   :0.000     Min.   :0.000         
##  1st Qu.:2.000   1st Qu.:3.000     1st Qu.:2.000         
##  Median :3.000   Median :4.000     Median :3.000         
##  Mean   :3.205   Mean   :3.642     Mean   :2.729         
##  3rd Qu.:4.000   3rd Qu.:5.000     3rd Qu.:4.000         
##  Max.   :5.000   Max.   :5.000     Max.   :5.000         
##                                                          
##  In.flight.Entertainment Baggage.Handling Satisfaction      
##  Min.   :0.000           Min.   :1.000    Length:129880     
##  1st Qu.:2.000           1st Qu.:3.000    Class :character  
##  Median :4.000           Median :4.000    Mode  :character  
##  Mean   :3.358           Mean   :3.632                      
##  3rd Qu.:4.000           3rd Qu.:5.000                      
##  Max.   :5.000           Max.   :5.000                      
## 
str(dta)
## 'data.frame':    129880 obs. of  24 variables:
##  $ ID                                    : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ Gender                                : chr  "Male" "Female" "Male" "Male" ...
##  $ Age                                   : int  48 35 41 50 49 43 43 60 50 38 ...
##  $ Customer.Type                         : chr  "First-time" "Returning" "Returning" "Returning" ...
##  $ Type.of.Travel                        : chr  "Business" "Business" "Business" "Business" ...
##  $ Class                                 : chr  "Business" "Business" "Business" "Business" ...
##  $ Flight.Distance                       : int  821 821 853 1905 3470 3788 1963 853 2607 2822 ...
##  $ Departure.Delay                       : int  2 26 0 0 0 0 0 0 0 13 ...
##  $ Arrival.Delay                         : int  5 39 0 0 1 0 0 3 0 0 ...
##  $ Departure.and.Arrival.Time.Convenience: int  3 2 4 2 3 4 3 3 1 2 ...
##  $ Ease.of.Online.Booking                : int  3 2 4 2 3 4 3 4 1 5 ...
##  $ Check.in.Service                      : int  4 3 4 3 3 3 4 3 3 3 ...
##  $ Online.Boarding                       : int  3 5 5 4 5 5 4 4 2 5 ...
##  $ Gate.Location                         : int  3 2 4 2 3 4 3 4 1 2 ...
##  $ On.board.Service                      : int  3 5 3 5 3 4 5 3 4 5 ...
##  $ Seat.Comfort                          : int  5 4 5 5 4 4 5 4 3 4 ...
##  $ Leg.Room.Service                      : int  2 5 3 5 4 4 5 4 4 5 ...
##  $ Cleanliness                           : int  5 5 5 4 5 3 4 4 3 4 ...
##  $ Food.and.Drink                        : int  5 3 5 4 4 3 5 4 3 2 ...
##  $ In.flight.Service                     : int  5 5 3 5 3 4 5 3 4 5 ...
##  $ In.flight.Wifi.Service                : int  3 2 4 2 3 4 3 4 4 2 ...
##  $ In.flight.Entertainment               : int  5 5 3 5 3 4 5 3 4 5 ...
##  $ Baggage.Handling                      : int  5 5 3 5 3 4 5 3 4 5 ...
##  $ Satisfaction                          : chr  "Neutral or Dissatisfied" "Satisfied" "Satisfied" "Satisfied" ...
head(dta)
##   ID Gender Age Customer.Type Type.of.Travel    Class Flight.Distance
## 1  1   Male  48    First-time       Business Business             821
## 2  2 Female  35     Returning       Business Business             821
## 3  3   Male  41     Returning       Business Business             853
## 4  4   Male  50     Returning       Business Business            1905
## 5  5 Female  49     Returning       Business Business            3470
## 6  6   Male  43     Returning       Business Business            3788
##   Departure.Delay Arrival.Delay Departure.and.Arrival.Time.Convenience
## 1               2             5                                      3
## 2              26            39                                      2
## 3               0             0                                      4
## 4               0             0                                      2
## 5               0             1                                      3
## 6               0             0                                      4
##   Ease.of.Online.Booking Check.in.Service Online.Boarding Gate.Location
## 1                      3                4               3             3
## 2                      2                3               5             2
## 3                      4                4               5             4
## 4                      2                3               4             2
## 5                      3                3               5             3
## 6                      4                3               5             4
##   On.board.Service Seat.Comfort Leg.Room.Service Cleanliness Food.and.Drink
## 1                3            5                2           5              5
## 2                5            4                5           5              3
## 3                3            5                3           5              5
## 4                5            5                5           4              4
## 5                3            4                4           5              4
## 6                4            4                4           3              3
##   In.flight.Service In.flight.Wifi.Service In.flight.Entertainment
## 1                 5                      3                       5
## 2                 5                      2                       5
## 3                 3                      4                       3
## 4                 5                      2                       5
## 5                 3                      3                       3
## 6                 4                      4                       4
##   Baggage.Handling            Satisfaction
## 1                5 Neutral or Dissatisfied
## 2                5               Satisfied
## 3                3               Satisfied
## 4                5               Satisfied
## 5                3               Satisfied
## 6                4               Satisfied

Most of our numeric variables are on a convenient 0-5 scale, since they are individual satisfaction ratings of various airline services. Four numeric variables are not on this scale, so we explore them further: Age, Flight Distance, Departure Delay and Arrival Delay. For each one we compute the summary and the standard deviation and plot a histogram and a boxplot. The results are presented below.

summary(dta$Age) 
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    7.00   27.00   40.00   39.43   51.00   85.00
sd(dta$Age)
## [1] 15.11936

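The Age summary shows a mean (39.43) very close to the median (40), which suggests a roughly symmetric distribution. With an interquartile range of 24 (27 to 51), the 1.5 × IQR fences lie at -9 and 87, and both the minimum (7) and the maximum (85) fall inside them, so there are no outliers by this rule.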
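Comparing the mean with the median is only an informal check. A minimal sketch of a more explicit check, using Tukey's 1.5 × IQR rule, could look like the following; `iqr_fences`, `count_outliers` and the `toy` vector are hypothetical names introduced for illustration and are not part of the original analysis.

```r
# Hypothetical helper: lower and upper Tukey fences (Q1 - k*IQR, Q3 + k*IQR)
iqr_fences <- function(x, k = 1.5) {
  q <- quantile(x, c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  iqr <- q[2] - q[1]
  c(lower = q[1] - k * iqr, upper = q[2] + k * iqr)
}

# Hypothetical helper: number of values outside the fences (NAs are ignored)
count_outliers <- function(x, k = 1.5) {
  f <- iqr_fences(x, k)
  sum(x < f["lower"] | x > f["upper"], na.rm = TRUE)
}

# Toy delay-like vector (illustrative values only, not taken from the data set)
toy <- c(0, 0, 0, 0, 5, 12, 20, 300)
iqr_fences(toy)
count_outliers(toy)
```

Applied to the report's data, the same helpers could be called as `count_outliers(dta$Age)` or `count_outliers(dta$Flight.Distance)`.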

hist(dta$Age, 
     main = "Histogram of Age",  # Title
     xlab = "Age",               # X-axis label
     ylab = "Frequency")         # Y-axis label

boxplot(dta$Age, 
        main = "Boxplot of Age",  # Title
        ylab = "Age")           # Y-axis label

summary(dta$Flight.Distance)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##      31     414     844    1190    1744    4983
sd(dta$Flight.Distance)
## [1] 997.4525

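For Flight Distance, the mean (1190) is well above the median (844), which indicates a right-skewed distribution. The upper 1.5 × IQR fence is 1744 + 1.5 × 1330 = 3739, and the maximum (4983) exceeds it, so the longest flights are flagged as outliers.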

hist(dta$Flight.Distance, 
     main = "Histogram of Flight Distance",  # Title
     xlab = "Distance",             # X-axis label
     ylab = "Frequency")         # Y-axis label

boxplot(dta$Flight.Distance, 
        main = "Boxplot of Flight Distance",  # Title
        ylab = "Distance")           # Y-axis label  

Regarding the Departure Delay and Arrival Delay variables, we noticed something quite interesting. We observed that most of the values in the Departure Delay and Arrival Delay variables were either 0 or close to 0, which is a positive finding in a real-world context, indicating that most flights experienced little to no delays. However, when visualizing the data using histograms, this distribution did not provide much useful insight, as the high concentration of zeros dominated the plots.

To better represent the distribution of actual delays, we refined our histograms by restricting the Departure Delay and Arrival Delay values to those from 1 minute up to (but not including) 60 minutes. This removes the zeros and the extreme delays, so the pattern of typical delays becomes visible. All plots are presented below.

summary(dta$Departure.Delay, na.rm = TRUE)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.00    0.00    0.00   14.71   12.00 1592.00
sd(dta$Departure.Delay, na.rm = TRUE)
## [1] 38.07113

The mean (14.71) is far above the median (0), and the maximum (1592) lies far beyond the upper 1.5 × IQR fence of 12 + 1.5 × 12 = 30, so Departure Delay is strongly right-skewed with outliers. The variable has no missing values, and because the median is 0, at least half of the flights departed without any recorded delay.

hist(dta$Departure.Delay, 
     main = "Histogram of Departure Delay",  # Title
     xlab = "Delay Time",             # X-axis label
     ylab = "Frequency")         # Y-axis label

boxplot(dta$Departure.Delay, 
        main = "Boxplot of Departure Delay",  # Title
        ylab = "Delay Time")           # Y-axis label

dta_filtered_departure <- dta$Departure.Delay[dta$Departure.Delay >= 1 & dta$Departure.Delay <60]
hist(dta_filtered_departure, 
     main = "Histogram of Filtered Departure Delays (between 1 & 60 minutes)", 
     xlab = "Values", 
     ylab = "Frequency")

rm(dta_filtered_departure)
summary(dta$Arrival.Delay, na.rm = TRUE)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##    0.00    0.00    0.00   15.09   13.00 1584.00     393
sd(dta$Arrival.Delay, na.rm = TRUE)
## [1] 38.46565

The Arrival Delay variable behaves the same way: the mean (15.09) is far above the median (0), and the maximum (1584) lies far beyond the upper 1.5 × IQR fence of 13 + 1.5 × 13 = 32.5, which points to strong right skew with outliers. Because the median is 0, at least half of the flights with a recorded arrival delay arrived on time. The 393 missing values are reported separately in the summary.

hist(dta$Arrival.Delay, 
     main = "Histogram of Arrival Delay",  # Title
     xlab = "Delay Time",             # X-axis label
     ylab = "Frequency")         # Y-axis label

boxplot(dta$Arrival.Delay, 
        main = "Boxplot of Arrival Delay",  # Title
        ylab = "Delay Time")           # Y-axis label

dta_filtered_arrival <- dta$Arrival.Delay[dta$Arrival.Delay >= 1 & dta$Arrival.Delay <60]
hist(dta_filtered_arrival, 
     main = "Histogram of Filtered Arrival Delays (between 1 & 60 minutes)", 
     xlab = "Values", 
     ylab = "Frequency")

rm(dta_filtered_arrival)
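Because Arrival Delay contains missing values, a plain logical subset keeps one NA entry for every missing value. hist() drops non-finite values, so the plots above are unaffected, but other summaries of the filtered vector would not be. A minimal sketch of an NA-safe filter, using a made-up `delays` vector (illustrative values only), is:

```r
# Toy delay vector with one missing value (not taken from the data set)
delays <- c(0, 5, NA, 75, 30, 0, 59, 60)

# Plain logical subsetting keeps an NA for the missing element
with_na <- delays[delays >= 1 & delays < 60]

# which() returns only the positions where the condition is TRUE, so NAs are dropped
in_window <- delays[which(delays >= 1 & delays < 60)]

with_na
in_window

# Share of flights with zero delay among the non-missing values
zero_share <- mean(delays == 0, na.rm = TRUE)
zero_share
```

The last line gives a direct measure of how strongly the zeros dominate, which is the feature that made the unfiltered histograms hard to read.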

Next, we wanted to display the frequencies of the values in our categorical variables and verify their unique categories. To do so, we first created a table and then a barplot for each variable. Four of our categorical variables consist of 2 categories each, while the Class variable consists of 3 categories.

table(dta$Gender)
## 
## Female   Male 
##  65899  63981
table(dta$Customer.Type)
## 
## First-time  Returning 
##      23780     106100
table(dta$Type.of.Travel)
## 
## Business Personal 
##    89693    40187
table(dta$Class)
## 
##     Business      Economy Economy Plus 
##        62160        58309         9411
table(dta$Satisfaction)
## 
## Neutral or Dissatisfied               Satisfied 
##                   73452                   56428
barplot(table(dta$Gender),
        las = 1,
        main = "Distribution of Gender")

barplot(table(dta$Customer.Type),
        las = 1,
        main = "Distribution of Customer Type")

barplot(table(dta$Type.of.Travel),
        las = 1,
        main = "Distribution of Type of Travel")

barplot(table(dta$Class),
        las = 1,
        main = "Distribution of Travel Class")

barplot(table(dta$Satisfaction),
        las = 1,
        main = "Distribution of Satisfaction")
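Raw counts are easier to compare once they are turned into percentages. As a small sketch, the counts printed by table(dta$Satisfaction) above can be converted with prop.table(); the vector name `satisfaction_counts` is introduced here for illustration only.

```r
# Counts copied from the table(dta$Satisfaction) output above
satisfaction_counts <- c("Neutral or Dissatisfied" = 73452, "Satisfied" = 56428)

# prop.table() divides each count by the total; multiply by 100 for percentages
satisfaction_pct <- round(100 * prop.table(satisfaction_counts), 1)
satisfaction_pct
```

On the full data frame the equivalent call would be `round(100 * prop.table(table(dta$Satisfaction)), 1)`. The split is roughly 57% to 43%, so the two classes of the target variable are only mildly imbalanced.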

As mentioned before, we split our numerical variables into scaled ones (ratings from 0 to 5) and non-scaled ones. We extracted the scaled numerical variables (columns 10 to 23) from the original data set into a new data frame whose columns are exactly these rating variables.

dta_scaled <- data.frame(dta[,10:23])

lapply() was used to iterate over all scaled variables and draw a histogram of each, so that we could visualise the distribution of the ratings. Only the Baggage Handling variable contains no zeros; every other rating variable has some (according to our data dictionary, a zero means that the rating is missing).

invisible(lapply(names(dta_scaled), function(var) {
  clean_var <- gsub("\\.", " ", var)  # Replace dots with spaces
  hist(dta_scaled[[var]],
       main = paste("Histogram of", clean_var),  # Cleaned variable name
       xlab = paste(clean_var, "score"),  # Cleaned variable name for x-axis
       breaks = 15)
}))  # invisible() stops R from printing the returned histogram objects

The histogram bin counts give the number of responses per score for each variable:

Variable                                      0      1      2      3      4      5
Departure and Arrival Time Convenience     6681  19409  21534  22378  31880  27998
Ease of Online Booking                     5682  21886  30051  30393  24444  17424
Check in Service                              1  16108  16102  35453  36333  25883
Online Boarding                            3080  13261  21934  27117  38468  26020
Gate Location                                 1  21991  24296  35717  30466  17409
On board Service                              5  14787  18351  28542  38703  29492
Seat Comfort                                  1  15108  18529  23328  39756  33158
Leg Room Service                            598  12895  24540  25056  35886  30905
Cleanliness                                  14  16729  20113  30639  33969  28416
Food and Drink                              132  16051  27383  27794  30563  27957
In flight Service                             5   8862  14308  25316  47323  34066
In flight Wifi Service                     3916  22328  32320  32185  24775  14356
In flight Entertainment                      18  15675  21968  23884  36791  31544
Baggage Handling                              0   9028  14362  25851  46761  33878

Next, we looked for relationships between our variables. There are three possible pairings: 1) numerical and numerical, 2) non numerical and numerical, 3) non numerical and non numerical.

numerical and numerical: In order to find useful correlations between our numerical data, we examined the relationships between numerical variables by computing their correlation coefficients. Our goal was to identify strong associations that could provide meaningful insights in a real-world scenario.

We specifically chose variables that we believed could have logical connections, allowing us to confirm or challenge our expectations. Below are the results and their interpretations:

cor(dta$Departure.Delay, dta$Arrival.Delay, use = "complete.obs")
## [1] 0.9652912
cor(dta$Flight.Distance, dta$Departure.Delay, use = "complete.obs")
## [1] 0.002402006
cor(dta$Departure.and.Arrival.Time.Convenience, dta$Gate.Location, use = "complete.obs")
## [1] 0.4475099
cor(dta$Food.and.Drink, dta$In.flight.Service, use = "complete.obs")
## [1] 0.03520966

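Departure Delay and Arrival Delay have by far the strongest relationship (r = 0.97), confirming the expected assumption that a late departure usually means a late arrival. Departure and Arrival Time Convenience and Gate Location are moderately correlated (r = 0.45), while Flight Distance and Departure Delay (r = 0.002) and Food and Drink and In-flight Service (r = 0.04) are essentially uncorrelated, which suggests that these factors vary independently of each other.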
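Instead of calling cor() once per pair, several variables can be compared at once with a correlation matrix. The sketch below uses simulated data, so the column names `dep`, `arr` and `dist` and all values are assumptions for illustration, not results from the data set.

```r
set.seed(1)

# Simulated stand-ins: arrival delay tracks departure delay, distance is unrelated
toy <- data.frame(dep = rexp(200, rate = 1 / 15))
toy$arr  <- toy$dep + rnorm(200, sd = 3)
toy$dist <- runif(200, min = 31, max = 4983)
toy$arr[c(3, 50)] <- NA  # mimic the missing arrival delays

# Pairwise deletion uses every complete pair of observations for each cell
cm <- cor(toy, use = "pairwise.complete.obs")
round(cm, 2)
```

For the 0-5 ratings, which are ordinal, passing `method = "spearman"` to cor() may be a more defensible choice than the default Pearson coefficient.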

non numerical and numerical: We examine the relationship between a categorical and a numerical variable with boxplots. As an example, we again took two variables that we believed could be logically connected: the In-flight Service variable and the Class variable.

boxplot(dta_scaled$In.flight.Service ~ dta$Class,
        main = "In-flight Service Ratings by Class",  # Title of the plot
        xlab = "Class",                               # X-axis label
        ylab = "In-flight Service Rating",           # Y-axis label
        las = 1)                                     

From the graph, the Economy and Economy Plus classes show no major difference in In-flight Service satisfaction, whereas satisfaction in Business is visibly higher.
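A boxplot comparison can be backed up with per-group summary statistics. The following sketch uses tapply() on a small made-up sample; the vectors `rating` and `cls` are hypothetical and only illustrate the call.

```r
# Made-up ratings and classes (illustrative values only)
rating <- c(5, 4, 3, 2, 4, 3)
cls    <- c("Business", "Business", "Economy", "Economy",
            "Economy Plus", "Economy Plus")

# Median rating per class; na.rm = TRUE is passed through to median()
group_medians <- tapply(rating, cls, median, na.rm = TRUE)
group_medians
```

On the report's data the equivalent call would be `tapply(dta$In.flight.Service, dta$Class, median, na.rm = TRUE)`.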

non numerical and non numerical: Last but not least, we checked relationships between categorical variables to identify possible associations. While correlation is typically used for numerical data, contingency tables (or cross-tabulations) help us explore how two categorical variables interact. Again we took a logical example and created the table below. This table summarizes the distribution of Customer Type (First-time vs. Returning) across the three travel classes (Business, Economy, Economy Plus).

table(dta$Customer.Type, dta$Class)
##             
##              Business Economy Economy Plus
##   First-time     9231   13634          915
##   Returning     52929   44675         8496

If it proves useful to examine further combinations of variables, we will do so at a later stage of the report.

2. Data cleaning

Checking for duplicated IDs is not needed for this data set, since every customer has a different assigned ID. Records that repeat under different IDs would need a separate check on the remaining columns.
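Both kinds of duplication can be checked in a couple of lines. The sketch below runs on a tiny made-up data frame (`toy` is hypothetical); the same calls would work on dta.

```r
# Made-up records: rows 1, 2 and 4 are identical apart from their ID
toy <- data.frame(ID    = 1:4,
                  Age   = c(30, 30, 45, 30),
                  Class = c("Economy", "Economy", "Business", "Economy"))

# anyDuplicated() returns 0 when no ID occurs more than once
ids_unique <- anyDuplicated(toy$ID) == 0

# duplicated() flags rows that repeat an earlier row once the ID column is dropped
n_repeated <- sum(duplicated(toy[, setdiff(names(toy), "ID")]))

ids_unique
n_repeated
```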

Before proceeding with further analysis, it is crucial to verify that all variables have the correct data types and that there are no inconsistencies that could affect our results.

length(unique(sapply(dta_scaled, class))) == 1 #checking if all the scaled variables are the same type
## [1] TRUE
table(sapply(dta, class)) #checking the types of values
## 
## character   integer 
##         5        19

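Since the first check returned TRUE, all scaled variables share a single type. The second check shows that the full data set consists of 5 character and 19 integer columns, so the data set is clean in terms of types.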

Next, in our original data set we replaced all zeros in the scaled numerical variables with NAs. As mentioned earlier, in columns 11-23 (scaled numerical variables) every zero denotes a missing answer. After replacing the zeros with NAs, we counted the NAs in every variable of our data set.

dta[, 11:23][dta[, 11:23] == 0] <- NA #replacing 0 with NAs
colSums(is.na(dta)) #checking the amount of NAs
##                                     ID                                 Gender 
##                                      0                                      0 
##                                    Age                          Customer.Type 
##                                      0                                      0 
##                         Type.of.Travel                                  Class 
##                                      0                                      0 
##                        Flight.Distance                        Departure.Delay 
##                                      0                                      0 
##                          Arrival.Delay Departure.and.Arrival.Time.Convenience 
##                                    393                                      0 
##                 Ease.of.Online.Booking                       Check.in.Service 
##                                   5682                                      1 
##                        Online.Boarding                          Gate.Location 
##                                   3080                                      1 
##                       On.board.Service                           Seat.Comfort 
##                                      5                                      1 
##                       Leg.Room.Service                            Cleanliness 
##                                    598                                     14 
##                         Food.and.Drink                      In.flight.Service 
##                                    132                                      5 
##                 In.flight.Wifi.Service                In.flight.Entertainment 
##                                   3916                                     18 
##                       Baggage.Handling                           Satisfaction 
##                                      0                                      0

Most of the scaled variables have a negligible number of NA values. The variable with the most is Ease of Online Booking, where NAs account for 4.4% of all values. Therefore, we do not exclude any variables but apply imputation instead.
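
The quoted share can be reproduced from the counts in the output above. This is only a worked check; the variable names are ours.

```r
n_rows    <- 129880   # observations in the data set
n_missing <- 5682     # NAs in Ease.of.Online.Booking (from the output above)

share <- 100 * n_missing / n_rows   # percentage of missing values
round(share, 1)
```

On the real data, round(100 * colMeans(is.na(dta)), 1) would give the same percentage for every column at once.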

The variables that required imputation were all the scaled numerical ones and the Arrival.Delay variable.
For the scaled numerical ones we decided to use mean imputation. This method has limitations (for example, it assumes that the missing values follow the same distribution as the observed data), but we consider it the best approach in this case.

Our scaled variables are approximately normally distributed, which makes mean imputation a suitable method, as it preserves the overall data structure without introducing significant bias.
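
One property worth keeping in mind is that mean imputation leaves the column mean unchanged but shrinks its spread, because every filled-in value sits exactly at the centre. With only a few percent of values missing the effect is small. A toy illustration (the vector `x` is made up):

```r
x <- c(1, 2, NA, 4, 5, NA, 3)

# Replace each NA with the mean of the observed values
x_imp <- ifelse(is.na(x), mean(x, na.rm = TRUE), x)

c(before = mean(x, na.rm = TRUE), after = mean(x_imp))   # means are identical
c(before = sd(x, na.rm = TRUE),   after = sd(x_imp))     # sd is smaller after imputation
```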

Because we wanted to handle the missing values only in the scaled variables, we worked on our previously created data set dta_scaled. Through mean imputation we replaced the missing values, checked that every one was replaced, and finally wrote the columns back into the original data set. We then deleted the dta_scaled data frame.

dta_scaled[dta_scaled == 0] <- NA
dta_scaled[] <- lapply(dta_scaled, function(x) 
  ifelse(is.na(x), mean(x, na.rm = TRUE), x)
)
colSums(is.na(dta_scaled)) #checking the amount of NAs
## Departure.and.Arrival.Time.Convenience                 Ease.of.Online.Booking 
##                                      0                                      0 
##                       Check.in.Service                        Online.Boarding 
##                                      0                                      0 
##                          Gate.Location                       On.board.Service 
##                                      0                                      0 
##                           Seat.Comfort                       Leg.Room.Service 
##                                      0                                      0 
##                            Cleanliness                         Food.and.Drink 
##                                      0                                      0 
##                      In.flight.Service                 In.flight.Wifi.Service 
##                                      0                                      0 
##                In.flight.Entertainment                       Baggage.Handling 
##                                      0                                      0
dta[,10:23] <- dta_scaled
rm(dta_scaled)

Regression-based imputation works well when the variable with missing values is strongly related to other variables, so we decided to use it for the missing values in the Arrival.Delay variable. To do so, we needed to check the correlations of this variable with the other numerical variables to find out which ones should enter the regression model as predictors.

# Correlations between arrival delay and the other numerical variables,
# used to choose predictors for the regression-based imputation model
correlations <- sapply(dta[, c(3, 7, 8, 10:23)], function(x)
  cor(dta$Arrival.Delay, x, use = "complete.obs")
)

print(correlations)
##                                    Age                        Flight.Distance 
##                           -0.011247759                           -0.001934547 
##                        Departure.Delay Departure.and.Arrival.Time.Convenience 
##                            0.965291184                           -0.007869764 
##                 Ease.of.Online.Booking                       Check.in.Service 
##                           -0.013250592                           -0.021590970 
##                        Online.Boarding                          Gate.Location 
##                           -0.034293023                            0.005651015 
##                       On.board.Service                           Seat.Comfort 
##                           -0.034833670                           -0.030407466 
##                       Leg.Room.Service                            Cleanliness 
##                            0.009611069                           -0.016449644 
##                         Food.and.Drink                      In.flight.Service 
##                           -0.024508008                           -0.059910709 
##                 In.flight.Wifi.Service                In.flight.Entertainment 
##                           -0.027091727                           -0.030289356 
##                       Baggage.Handling 
##                           -0.007935105
rm(correlations)

Based on the results, the only variable strongly correlated with Arrival.Delay is Departure.Delay. Since the correlation is very strong (>0.9), we use Departure.Delay as the sole predictor in the regression model. We then used the model to impute the missing values and checked that every missing value had indeed been replaced.

model <- lm(Arrival.Delay ~ Departure.Delay, data = dta, subset = !is.na(dta$Arrival.Delay))
summary(model)
## 
## Call:
## lm(formula = Arrival.Delay ~ Departure.Delay, data = dta, subset = !is.na(dta$Arrival.Delay))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -53.510  -1.975  -0.757  -0.461 236.436 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     0.757464   0.029927   25.31   <2e-16 ***
## Departure.Delay 0.978849   0.000736 1329.95   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 10.05 on 129485 degrees of freedom
## Multiple R-squared:  0.9318, Adjusted R-squared:  0.9318 
## F-statistic: 1.769e+06 on 1 and 129485 DF,  p-value: < 2.2e-16
dta$Arrival.Delay[is.na(dta$Arrival.Delay)] <- predict(model, newdata = dta[is.na(dta$Arrival.Delay), ]) 
colSums(is.na(dta))
##                                     ID                                 Gender 
##                                      0                                      0 
##                                    Age                          Customer.Type 
##                                      0                                      0 
##                         Type.of.Travel                                  Class 
##                                      0                                      0 
##                        Flight.Distance                        Departure.Delay 
##                                      0                                      0 
##                          Arrival.Delay Departure.and.Arrival.Time.Convenience 
##                                      0                                      0 
##                 Ease.of.Online.Booking                       Check.in.Service 
##                                      0                                      0 
##                        Online.Boarding                          Gate.Location 
##                                      0                                      0 
##                       On.board.Service                           Seat.Comfort 
##                                      0                                      0 
##                       Leg.Room.Service                            Cleanliness 
##                                      0                                      0 
##                         Food.and.Drink                      In.flight.Service 
##                                      0                                      0 
##                 In.flight.Wifi.Service                In.flight.Entertainment 
##                                      0                                      0 
##                       Baggage.Handling                           Satisfaction 
##                                      0                                      0
rm(model)

At the beginning of our analysis we plotted the four non-scaled numerical variables: Age, Flight Distance, Departure Delay and Arrival Delay. The graphs indicated that the distributions of Flight Distance, Departure Delay and Arrival Delay are skewed. When dealing with skewed data, rescaling is a common technique used to make distributions more manageable and improve interpretability. While it does not necessarily make the data perfectly normal, it helps reduce the influence of extreme values and stabilizes variance, which is often an important assumption in statistical modeling. Also, rescaling does not fundamentally change the interpretation of a variable; it simply allows us to work with a data set that aligns better with model assumptions.

Initially, we attempted a square root transformation, but the results were unsatisfactory: the distribution remained highly skewed, and the transformation did not improve interpretability. Given this, we opted for a log transformation, which is the usual choice for right-skewed data such as ours, where the mean lies well above the median and a few very large values form a long right tail.

By using this transformation, we aimed to make the distribution closer to normal, which makes statistical analyses more reliable, and to reduce the impact of extreme values that could disproportionately affect models.
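
One way to compare candidate transformations numerically is to compute the skewness before and after. The sketch below uses synthetic data and a hand-written moment-based skewness helper (`skew` is our own hypothetical function, not part of base R), so it illustrates the idea rather than reproducing the results of this report.

```r
set.seed(1)
x <- rexp(1000, rate = 1 / 15)   # synthetic right-skewed values, mean about 15

# Moment-based sample skewness; positive values indicate a long right tail
skew <- function(v) mean((v - mean(v))^3) / sd(v)^3

skew_raw  <- skew(x)
skew_sqrt <- skew(sqrt(x))
skew_log  <- skew(log1p(x))      # log1p(x) computes log(1 + x)

c(raw = skew_raw, sqrt = skew_sqrt, log = skew_log)
```

A value closer to zero indicates a more symmetric distribution, which gives a simple criterion for choosing between the square root and the log.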

So, for each of the three variables we applied logarithmic rescaling and plotted the histogram and the boxplot to compare the graphs with the initial ones and visualise the difference between raw and rescaled data. The results are presented below:

dta$Flight.Distance.log <- log(1 + dta$Flight.Distance)
summary(dta$Flight.Distance)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##      31     414     844    1190    1744    4983
summary(dta$Flight.Distance.log)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   3.466   6.028   6.739   6.706   7.465   8.514
hist(dta$Flight.Distance.log, 
     main = "Histogram of Rescaled Flight Distance",  # Title
     xlab = "Distance",             # X-axis label
     ylab = "Frequency")         # Y-axis label

boxplot(dta$Flight.Distance.log, 
        main = "Boxplot of Rescaled Flight Distance",  # Title
        ylab = "Distance")           # Y-axis label 

dta$Arrival.Delay.log <- log(1 + dta$Arrival.Delay)
summary(dta$Arrival.Delay)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.00    0.00    0.00   15.16   13.00 1584.00
summary(dta$Arrival.Delay.log)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   0.000   0.000   1.267   2.639   7.368
hist(dta$Arrival.Delay.log, 
     main = "Histogram of Rescaled Arrival Delay",  # Title
     xlab = "Delay Time",             # X-axis label
     ylab = "Frequency")         # Y-axis label

boxplot(dta$Arrival.Delay.log, 
        main = "Boxplot of Rescaled Arrival Delay",  # Title
        ylab = "Delay Time")           # Y-axis label

dta$Departure.Delay.log <- log(1 + dta$Departure.Delay)
summary(dta$Departure.Delay)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.00    0.00    0.00   14.71   12.00 1592.00
summary(dta$Departure.Delay.log)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   0.000   0.000   1.234   2.565   7.373
hist(dta$Departure.Delay.log, 
     main = "Histogram of Rescaled Departure Delay",  # Title
     xlab = "Delay Time",             # X-axis label
     ylab = "Frequency")         # Y-axis label

boxplot(dta$Departure.Delay.log, 
        main = "Boxplot of Rescaled Departure Delay",  # Title
        ylab = "Delay Time")           # Y-axis label

In the two delay variables, a disproportionate number of values are equal or close to zero, so even after rescaling the distribution does not become normal.
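
The reason is that log(1 + 0) = 0, so every zero delay stays at zero and the spike at the left edge survives the transformation. A small illustration with made-up delays (the vector `delay` is hypothetical):

```r
delay <- c(0, 0, 0, 0, 5, 12, 40, 300)   # made-up delays in minutes

share_zero_raw <- mean(delay == 0)          # share of exact zeros before transforming
share_zero_log <- mean(log1p(delay) == 0)   # same share afterwards

c(raw = share_zero_raw, log = share_zero_log)
```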

Before proceeding with the next stages of the analysis, we delete the logarithmic columns from our original data set dta in order to have a cleaner data set:

dta <- dta[, -25:-27]

After completing our data exploration and preprocessing steps, we repeated the summary() and str() commands to gain a final overview of our cleaned and transformed data set. This allowed us to verify that our adjustments, such as handling missing values and addressing inconsistencies, were successfully implemented. We can see that the NA values in the Arrival Delay variable have been replaced. Also, as a result of applying mean imputation, some scores within the scaled variables now have non-integer values, which converts these variables from integer to numeric. Since the original satisfaction scores ranged from 1 to 5 and were treated as ordinal, this transformation is acceptable for our analysis.

summary(dta)
##        ID            Gender               Age        Customer.Type     
##  Min.   :     1   Length:129880      Min.   : 7.00   Length:129880     
##  1st Qu.: 32471   Class :character   1st Qu.:27.00   Class :character  
##  Median : 64940   Mode  :character   Median :40.00   Mode  :character  
##  Mean   : 64940                      Mean   :39.43                     
##  3rd Qu.: 97410                      3rd Qu.:51.00                     
##  Max.   :129880                      Max.   :85.00                     
##  Type.of.Travel        Class           Flight.Distance Departure.Delay  
##  Length:129880      Length:129880      Min.   :  31    Min.   :   0.00  
##  Class :character   Class :character   1st Qu.: 414    1st Qu.:   0.00  
##  Mode  :character   Mode  :character   Median : 844    Median :   0.00  
##                                        Mean   :1190    Mean   :  14.71  
##                                        3rd Qu.:1744    3rd Qu.:  12.00  
##                                        Max.   :4983    Max.   :1592.00  
##  Arrival.Delay     Departure.and.Arrival.Time.Convenience
##  Min.   :   0.00   Min.   :1.000                         
##  1st Qu.:   0.00   1st Qu.:2.000                         
##  Median :   0.00   Median :3.223                         
##  Mean   :  15.16   Mean   :3.223                         
##  3rd Qu.:  13.00   3rd Qu.:4.000                         
##  Max.   :1584.00   Max.   :5.000                         
##  Ease.of.Online.Booking Check.in.Service Online.Boarding Gate.Location  
##  Min.   :1.000          Min.   :1.000    Min.   :1.000   Min.   :1.000  
##  1st Qu.:2.000          1st Qu.:3.000    1st Qu.:2.000   1st Qu.:2.000  
##  Median :3.000          Median :3.000    Median :3.332   Median :3.000  
##  Mean   :2.883          Mean   :3.306    Mean   :3.332   Mean   :2.977  
##  3rd Qu.:4.000          3rd Qu.:4.000    3rd Qu.:4.000   3rd Qu.:4.000  
##  Max.   :5.000          Max.   :5.000    Max.   :5.000   Max.   :5.000  
##  On.board.Service  Seat.Comfort   Leg.Room.Service  Cleanliness   
##  Min.   :1.000    Min.   :1.000   Min.   :1.000    Min.   :1.000  
##  1st Qu.:2.000    1st Qu.:2.000   1st Qu.:2.000    1st Qu.:2.000  
##  Median :4.000    Median :4.000   Median :4.000    Median :3.000  
##  Mean   :3.383    Mean   :3.441   Mean   :3.366    Mean   :3.287  
##  3rd Qu.:4.000    3rd Qu.:5.000   3rd Qu.:4.000    3rd Qu.:4.000  
##  Max.   :5.000    Max.   :5.000   Max.   :5.000    Max.   :5.000  
##  Food.and.Drink  In.flight.Service In.flight.Wifi.Service
##  Min.   :1.000   Min.   :1.000     Min.   :1.000         
##  1st Qu.:2.000   1st Qu.:3.000     1st Qu.:2.000         
##  Median :3.000   Median :4.000     Median :3.000         
##  Mean   :3.208   Mean   :3.642     Mean   :2.814         
##  3rd Qu.:4.000   3rd Qu.:5.000     3rd Qu.:4.000         
##  Max.   :5.000   Max.   :5.000     Max.   :5.000         
##  In.flight.Entertainment Baggage.Handling Satisfaction      
##  Min.   :1.000           Min.   :1.000    Length:129880     
##  1st Qu.:2.000           1st Qu.:3.000    Class :character  
##  Median :4.000           Median :4.000    Mode  :character  
##  Mean   :3.359           Mean   :3.632                      
##  3rd Qu.:4.000           3rd Qu.:5.000                      
##  Max.   :5.000           Max.   :5.000
str(dta)
## 'data.frame':    129880 obs. of  24 variables:
##  $ ID                                    : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ Gender                                : chr  "Male" "Female" "Male" "Male" ...
##  $ Age                                   : int  48 35 41 50 49 43 43 60 50 38 ...
##  $ Customer.Type                         : chr  "First-time" "Returning" "Returning" "Returning" ...
##  $ Type.of.Travel                        : chr  "Business" "Business" "Business" "Business" ...
##  $ Class                                 : chr  "Business" "Business" "Business" "Business" ...
##  $ Flight.Distance                       : int  821 821 853 1905 3470 3788 1963 853 2607 2822 ...
##  $ Departure.Delay                       : int  2 26 0 0 0 0 0 0 0 13 ...
##  $ Arrival.Delay                         : num  5 39 0 0 1 0 0 3 0 0 ...
##  $ Departure.and.Arrival.Time.Convenience: num  3 2 4 2 3 4 3 3 1 2 ...
##  $ Ease.of.Online.Booking                : num  3 2 4 2 3 4 3 4 1 5 ...
##  $ Check.in.Service                      : num  4 3 4 3 3 3 4 3 3 3 ...
##  $ Online.Boarding                       : num  3 5 5 4 5 5 4 4 2 5 ...
##  $ Gate.Location                         : num  3 2 4 2 3 4 3 4 1 2 ...
##  $ On.board.Service                      : num  3 5 3 5 3 4 5 3 4 5 ...
##  $ Seat.Comfort                          : num  5 4 5 5 4 4 5 4 3 4 ...
##  $ Leg.Room.Service                      : num  2 5 3 5 4 4 5 4 4 5 ...
##  $ Cleanliness                           : num  5 5 5 4 5 3 4 4 3 4 ...
##  $ Food.and.Drink                        : num  5 3 5 4 4 3 5 4 3 2 ...
##  $ In.flight.Service                     : num  5 5 3 5 3 4 5 3 4 5 ...
##  $ In.flight.Wifi.Service                : num  3 2 4 2 3 4 3 4 4 2 ...
##  $ In.flight.Entertainment               : num  5 5 3 5 3 4 5 3 4 5 ...
##  $ Baggage.Handling                      : int  5 5 3 5 3 4 5 3 4 5 ...
##  $ Satisfaction                          : chr  "Neutral or Dissatisfied" "Satisfied" "Satisfied" "Satisfied" ...

3. Data transformation

The third part of data preparation is the transformation of the data. The goal is to prepare a data set that is suitable for the next phases of the assignment. To achieve this, we need to scale the numerical variables and encode the categorical variables as binary or dummy variables.

In the following chunk, we begin by creating the data frame in which all transformations will be applied, named dta_transformed. We keep the original cleaned data frame, named dta, which will be used again in the modelling part of the assignment, where encoded categorical and scaled variables are not necessarily needed.

Before we create the new data frame, we remove the first column, which is the ID variable used to number the customers. This variable is not required for the next steps, as it does not provide any meaningful information for analysis or modeling. We also convert all column names to lowercase to simplify referencing in the code.

colnames(dta) <- tolower(colnames(dta)) #decapitalizing
dta["id"] <- NULL #remove ID variable
dta_transformed <- dta  #create new data set 

In the next chunk, we encode all five categorical variables: gender, customer.type, type.of.travel, class, and satisfaction. As shown in the exploration phase, all variables except for class can be converted into binary variables, as they each contain only two categories. For the class variable, which contains three categories (Economy, Economy Plus, and Business), we create two dummy variables, leaving one category as the reference group. This avoids a redundant column while retaining all the information needed for the analysis.

#Convert 4 categorical variables into binary variables
dta_transformed$gender <- ifelse(dta_transformed$gender == "Female", 0,
                                      ifelse(dta_transformed$gender == "Male", 1,NA))

dta_transformed$satisfaction <- ifelse(dta_transformed$satisfaction == "Neutral or Dissatisfied", 0,
                                      ifelse(dta_transformed$satisfaction == "Satisfied", 1,NA))
                                               
dta_transformed$customer.type <- ifelse(dta_transformed$customer.type == "First-time", 0,
                                      ifelse(dta_transformed$customer.type == "Returning", 1,NA))
  
dta_transformed$type.of.travel <- ifelse(dta_transformed$type.of.travel == "Personal", 0,
                                      ifelse(dta_transformed$type.of.travel == "Business", 1,NA))
  
# Convert class variable into two dummy variables 
class_dummies <- model.matrix(~ class, data = dta_transformed)[, -1]  # create (k-1) dummies
dta_transformed <- cbind(dta_transformed, class_dummies)  # add the dummies back to the data set
dta_transformed$class <- NULL # remove the original class variable
rm(class_dummies) # remove class_dummies matrix

# Rename the new class dummy variables
names(dta_transformed)[names(dta_transformed) == "classEconomy"] <- "class.economy_dummie"
names(dta_transformed)[names(dta_transformed) == "classEconomy Plus"] <- "class.economy.plus_dummie"
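
To see what model.matrix() does with a three-level variable, consider a toy data frame (`toy` is made up). The first level in alphabetical order, Business, becomes the reference group and gets no column of its own, which is why only the Economy and Economy Plus dummies appear.

```r
toy <- data.frame(class = c("Business", "Economy", "Economy Plus", "Economy"))

# The design matrix has an intercept plus (k - 1) dummy columns
mm <- model.matrix(~ class, data = toy)
dummies <- mm[, -1]   # drop the intercept column, as in the chunk above

colnames(dummies)     # default names, before any renaming
dummies               # the Business row is 0 in both columns
```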

Next, we scale all numerical variables in the data frame to ensure they contribute equally to scale-sensitive techniques such as PCA and clustering. Scaling transforms the variables to have a mean of 0 and a standard deviation of 1, which prevents features with larger ranges from dominating the analysis.

Categorical variables that were previously encoded as binary or dummy variables are excluded from this scaling step, as scaling them would alter their original 0/1 meaning and reduce interpretability in the next steps. Besides, binary and dummy variables are already set to a small range (0 and 1). So, since they do not contain very large or small numbers, we assume that they will not negatively affect models sensitive to differences in scale.

# Create a vector of encoded categorical variables
encoded_categorical <- c("gender", "customer.type", "type.of.travel","class.economy_dummie", "class.economy.plus_dummie","satisfaction")
# Get the names of the columns to scale
numeric_vars <- setdiff(names(dta_transformed), encoded_categorical)
# Scale only numeric variables, keep others unchanged
dta_transformed[ , numeric_vars] <- scale(dta_transformed[ , numeric_vars], center = TRUE, scale = TRUE)
rm(encoded_categorical,numeric_vars)
# Overview of changes in the data frame  
str(dta_transformed)
## 'data.frame':    129880 obs. of  24 variables:
##  $ gender                                : num  1 0 1 1 0 1 1 0 1 0 ...
##  $ age                                   : num  0.567 -0.293 0.104 0.699 0.633 ...
##  $ customer.type                         : num  0 1 1 1 1 1 1 1 1 1 ...
##  $ type.of.travel                        : num  1 1 1 1 1 1 1 1 1 1 ...
##  $ flight.distance                       : num  -0.37 -0.37 -0.338 0.717 2.286 ...
##  $ departure.delay                       : num  -0.334 0.296 -0.386 -0.386 -0.386 ...
##  $ arrival.delay                         : num  -0.263 0.618 -0.393 -0.393 -0.367 ...
##  $ departure.and.arrival.time.convenience: num  -0.165 -0.906 0.575 -0.906 -0.165 ...
##  $ ease.of.online.booking                : num  0.092 -0.694 0.878 -0.694 0.092 ...
##  $ check.in.service                      : num  0.548 -0.242 0.548 -0.242 -0.242 ...
##  $ online.boarding                       : num  -0.265 1.333 1.333 0.534 1.333 ...
##  $ gate.location                         : num  0.018 -0.764 0.8 -0.764 0.018 ...
##  $ on.board.service                      : num  -0.298 1.256 -0.298 1.256 -0.298 ...
##  $ seat.comfort                          : num  1.181 0.423 1.181 1.181 0.423 ...
##  $ leg.room.service                      : num  -1.054 1.26 -0.283 1.26 0.489 ...
##  $ cleanliness                           : num  1.305 1.305 1.305 0.543 1.305 ...
##  $ food.and.drink                        : num  1.351 -0.157 1.351 0.597 0.597 ...
##  $ in.flight.service                     : num  1.154 1.154 -0.546 1.154 -0.546 ...
##  $ in.flight.wifi.service                : num  0.15 -0.656 0.957 -0.656 0.15 ...
##  $ in.flight.entertainment               : num  1.231 1.231 -0.269 1.231 -0.269 ...
##  $ baggage.handling                      : num  1.159 1.159 -0.536 1.159 -0.536 ...
##  $ satisfaction                          : num  0 1 1 1 1 1 1 1 0 1 ...
##  $ class.economy_dummie                  : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ class.economy.plus_dummie             : num  0 0 0 0 0 0 0 0 0 0 ...
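
A quick sanity check after standardising is that every scaled column has mean 0 and standard deviation 1. The sketch below shows the check on a made-up data frame (`toy` is hypothetical); the same two calls could be applied to the scaled columns of dta_transformed.

```r
toy <- data.frame(age      = c(20, 35, 50, 65),
                  distance = c(100, 500, 1500, 4000))

scaled <- scale(toy, center = TRUE, scale = TRUE)

col_means <- colMeans(scaled)       # expected: 0 up to rounding error
col_sds   <- apply(scaled, 2, sd)   # expected: 1

round(col_means, 10)
col_sds
```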

FEATURE ENGINEERING AND DIMENSIONALITY REDUCTION

1. Creation of new variables

As a first step in the feature engineering process, we explore the correlations between all numerical variables in the data frame. This helps us identify variables that are highly correlated and could potentially be merged into simplified variables, reducing dimensionality. To do this, we generate a correlation matrix using the apa.cor.table() function, which provides a clean APA-style layout. The complete correlation table can be inspected by clicking on cor_matrix in the environment pane. For validation, we also conducted a further correlation check using the caret package, looking for strong correlations (>0.7) between the variables.

Before the correlation analysis, we create a new data frame that will be used specifically for the creation of new variables and for some of the next phases of the assignment, including clustering and PCA analysis. In this new data frame, named dta_cls, we exclude the variable satisfaction, as it is the target variable we aim to predict in the modeling stage. We also exclude satisfaction from clustering and PCA analysis. By keeping satisfaction separate, we ensure that our groupings are based only on the input features.
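
The check for strong correlations (>0.7) can also be sketched in base R. This is an alternative to the caret-based check mentioned above, not the exact code used in this report, and the data frame `toy` is synthetic.

```r
set.seed(42)
a <- rnorm(200)
toy <- data.frame(a = a,
                  b = a + rnorm(200, sd = 0.1),   # nearly a copy of a
                  c = rnorm(200))                 # unrelated noise

cor_matrix <- cor(toy)

# Look only at the upper triangle so each pair appears once, then flag |r| > 0.7
high <- which(abs(cor_matrix) > 0.7 & upper.tri(cor_matrix), arr.ind = TRUE)

data.frame(var1 = rownames(cor_matrix)[high[, "row"]],
           var2 = colnames(cor_matrix)[high[, "col"]],
           r    = cor_matrix[high])
```

Applied to the real data, the same steps would start from cor(dta_cls) instead of the toy data frame.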

library(apaTables) # Load required package

dta_cls <- dta_transformed[ , !names(dta_transformed) %in% "satisfaction" ] #Create new data frame used for variable creation

apa.cor.table(dta_cls) # Create the correlation matrix
## 
## 
## Means, standard deviations, and correlations with confidence intervals
##  
## 
##   Variable                                  M     SD   1           
##   1. gender                                 0.49  0.50             
##                                                                    
##   2. age                                    -0.00 1.00 .01**       
##                                                        [.00, .01]  
##                                                                    
##   3. customer.type                          0.82  0.39 .03**       
##                                                        [.03, .04]  
##                                                                    
##   4. type.of.travel                         0.69  0.46 -.01**      
##                                                        [-.01, -.00]
##                                                                    
##   5. flight.distance                        0.00  1.00 .00         
##                                                        [-.00, .01] 
##                                                                    
##   6. departure.delay                        0.00  1.00 .00         
##                                                        [-.00, .01] 
##                                                                    
##   7. arrival.delay                          -0.00 1.00 .00         
##                                                        [-.00, .01] 
##                                                                    
##   8. departure.and.arrival.time.convenience 0.00  1.00 .01**       
##                                                        [.00, .01]  
##                                                                    
##   9. ease.of.online.booking                 0.00  1.00 .01*        
##                                                        [.00, .01]  
##                                                                    
##   10. check.in.service                      -0.00 1.00 .01**       
##                                                        [.00, .01]  
##                                                                    
##   11. online.boarding                       -0.00 1.00 -.04**      
##                                                        [-.05, -.03]
##                                                                    
##   12. gate.location                         0.00  1.00 -.00        
##                                                        [-.01, .00] 
##                                                                    
##   13. on.board.service                      0.00  1.00 .01*        
##                                                        [.00, .01]  
##                                                                    
##   14. seat.comfort                          0.00  1.00 -.03**      
##                                                        [-.04, -.03]
##                                                                    
##   15. leg.room.service                      -0.00 1.00 .02**       
##                                                        [.02, .03]  
##                                                                    
##   16. cleanliness                           0.00  1.00 .00         
##                                                        [-.00, .01] 
##                                                                    
##   17. food.and.drink                        0.00  1.00 .00         
##                                                        [-.00, .01] 
##                                                                    
##   18. in.flight.service                     0.00  1.00 .04**       
##                                                        [.03, .04]  
##                                                                    
##   19. in.flight.wifi.service                0.00  1.00 .01*        
##                                                        [.00, .01]  
##                                                                    
##   20. in.flight.entertainment               0.00  1.00 .00         
##                                                        [-.00, .01] 
##                                                                    
##   21. baggage.handling                      -0.00 1.00 .04**       
##                                                        [.03, .04]  
##                                                                    
##   22. class.economy_dummie                  0.45  0.50 -.00        
##                                                        [-.01, .00] 
##                                                                    
##   23. class.economy.plus_dummie             0.07  0.26 -.01**      
##                                                        [-.02, -.01]
##                                                                    
##   2            3            4            5            6            7           
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##   .28**                                                                        
##   [.28, .29]                                                                   
##                                                                                
##   .04**        -.31**                                                          
##   [.04, .05]   [-.31, -.30]                                                    
##                                                                                
##   .10**        .23**        .27**                                              
##   [.09, .10]   [.22, .23]   [.26, .27]                                         
##                                                                                
##   -.01**       -.00         .01*         .00                                   
##   [-.01, -.00] [-.01, .00]  [.00, .01]   [-.00, .01]                           
##                                                                                
##   -.01**       -.00         .01          -.00         .97**                    
##   [-.02, -.01] [-.01, .00]  [-.00, .01]  [-.01, .00]  [.97, .97]               
##                                                                                
##   -.01**       .10**        -.25**       -.07**       -.01*        -.01**      
##   [-.02, -.01] [.09, .10]   [-.26, -.25] [-.08, -.07] [-.01, -.00] [-.01, -.00]
##                                                                                
##   .01**        .02**        .13**        .05**        -.01**       -.01**      
##   [.01, .02]   [.01, .02]   [.12, .13]   [.04, .06]   [-.02, -.01] [-.02, -.01]
##                                                                                
##   .03**        .03**        -.02**       .07**        -.02**       -.02**      
##   [.03, .04]   [.03, .04]   [-.02, -.01] [.07, .08]   [-.02, -.01] [-.03, -.02]
##                                                                                
##   .18**        .18**        .21**        .19**        -.03**       -.03**      
##   [.18, .19]   [.17, .18]   [.20, .21]   [.19, .20]   [-.04, -.03] [-.04, -.03]
##                                                                                
##   -.00         -.00         .03**        .01*         .01*         .01*        
##   [-.01, .01]  [-.01, .00]  [.02, .04]   [.00, .01]   [.00, .01]   [.00, .01]  
##                                                                                
##   .06**        .05**        .06**        .11**        -.03**       -.03**      
##   [.05, .06]   [.05, .06]   [.05, .07]   [.11, .12]   [-.04, -.03] [-.04, -.03]
##                                                                                
##   .16**        .16**        .13**        .16**        -.03**       -.03**      
##   [.15, .16]   [.15, .16]   [.12, .13]   [.15, .16]   [-.03, -.02] [-.04, -.03]
##                                                                                
##   .05**        .05**        .13**        .13**        .01**        .01**       
##   [.05, .06]   [.05, .06]   [.12, .13]   [.13, .14]   [.01, .02]   [.00, .02]  
##                                                                                
##   .05**        .08**        .08**        .10**        -.01**       -.02**      
##   [.05, .06]   [.08, .09]   [.08, .09]   [.09, .10]   [-.02, -.01] [-.02, -.01]
##                                                                                
##   .02**        .06**        .07**        .06**        -.02**       -.02**      
##   [.02, .03]   [.05, .06]   [.06, .07]   [.05, .06]   [-.03, -.02] [-.03, -.02]
##                                                                                
##   -.05**       -.02**       .02**        .06**        -.05**       -.06**      
##   [-.06, -.05] [-.03, -.02] [.02, .03]   [.05, .06]   [-.06, -.05] [-.07, -.05]
##                                                                                
##   .01**        -.00         .13**        .01**        -.02**       -.03**      
##   [.00, .02]   [-.01, .00]  [.12, .13]   [.00, .02]   [-.03, -.02] [-.03, -.02]
##                                                                                
##   .07**        .11**        .15**        .13**        -.03**       -.03**      
##   [.07, .08]   [.10, .11]   [.15, .16]   [.13, .14]   [-.03, -.02] [-.04, -.02]
##                                                                                
##   -.05**       -.02**       .03**        .06**        -.00         -.01**      
##   [-.05, -.04] [-.03, -.02] [.03, .04]   [.06, .07]   [-.01, .00]  [-.01, -.00]
##                                                                                
##   -.13**       -.12**       -.50**       -.40**       .01**        .01**       
##   [-.14, -.13] [-.12, -.11] [-.51, -.50] [-.41, -.40] [.00, .01]   [.01, .02]  
##                                                                                
##   -.01**       .06**        -.10**       -.12**       .00          .01         
##   [-.02, -.01] [.06, .07]   [-.11, -.10] [-.13, -.12] [-.00, .01]  [-.00, .01] 
##                                                                                
##   8            9            10           11           12           13          
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##   .50**                                                                        
##   [.50, .50]                                                                   
##                                                                                
##   .12**        .02**                                                           
##   [.11, .12]   [.01, .02]                                                      
##                                                                                
##   .06**        .36**        .22**                                              
##   [.05, .06]   [.36, .37]   [.22, .23]                                         
##                                                                                
##   .49**        .50**        -.04**       -.00                                  
##   [.49, .50]   [.49, .50]   [-.04, -.03] [-.01, .00]                           
##                                                                                
##   .08**        .03**        .24**        .17**        -.03**                   
##   [.07, .08]   [.03, .04]   [.24, .25]   [.16, .17]   [-.03, -.02]             
##                                                                                
##   -.01**       .03**        .19**        .43**        .00          .13**       
##   [-.02, -.00] [.02, .03]   [.18, .20]   [.43, .44]   [-.00, .01]  [.13, .14]  
##                                                                                
##   -.00         .08**        .15**        .13**        -.01         .36**       
##   [-.01, .00]  [.08, .09]   [.15, .16]   [.13, .14]   [-.01, .00]  [.35, .36]  
##                                                                                
##   .00          .01**        .18**        .34**        -.01*        .12**       
##   [-.00, .01]  [.01, .02]   [.17, .18]   [.34, .35]   [-.01, -.00] [.12, .13]  
##                                                                                
##   -.01**       .03**        .08**        .25**        -.00         .06**       
##   [-.01, -.00] [.02, .03]   [.08, .09]   [.24, .25]   [-.01, .00]  [.05, .06]  
##                                                                                
##   .08**        .03**        .24**        .09**        .00          .55**       
##   [.08, .09]   [.02, .03]   [.23, .24]   [.08, .09]   [-.01, .01]  [.55, .56]  
##                                                                                
##   .37**        .68**        .06**        .46**        .35**        .12**       
##   [.37, .38]   [.67, .68]   [.06, .07]   [.45, .46]   [.35, .36]   [.12, .13]  
##                                                                                
##   -.03**       .03**        .12**        .29**        .00          .42**       
##   [-.04, -.03] [.03, .04]   [.11, .12]   [.29, .30]   [-.00, .01]  [.41, .42]  
##                                                                                
##   .08**        .03**        .23**        .09**        .00          .52**       
##   [.07, .08]   [.02, .03]   [.23, .24]   [.09, .10]   [-.00, .01]  [.52, .52]  
##                                                                                
##   .11**        -.11**       -.13**       -.29**       -.01         -.18**      
##   [.10, .12]   [-.11, -.10] [-.13, -.12] [-.29, -.28] [-.01, .00]  [-.19, -.18]
##                                                                                
##   .02**        -.02**       -.06**       -.07**       -.00         -.08**      
##   [.01, .02]   [-.02, -.01] [-.07, -.06] [-.08, -.07] [-.01, .00]  [-.08, -.07]
##                                                                                
##   14           15           16           17           18           19          
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##                                                                                
##   .11**                                                                        
##   [.10, .11]                                                                   
##                                                                                
##   .68**        .10**                                                           
##   [.68, .68]   [.09, .10]                                                      
##                                                                                
##   .58**        .03**        .66**                                              
##   [.57, .58]   [.03, .04]   [.66, .66]                                         
##                                                                                
##   .07**        .37**        .09**        .03**                                 
##   [.06, .07]   [.36, .37]   [.08, .10]   [.03, .04]                            
##                                                                                
##   .14**        .15**        .14**        .14**        .11**                    
##   [.13, .14]   [.15, .16]   [.14, .15]   [.13, .15]   [.10, .11]               
##                                                                                
##   .61**        .30**        .69**        .62**        .41**        .21**       
##   [.61, .62]   [.30, .31]   [.69, .70]   [.62, .63]   [.40, .41]   [.21, .22]  
##                                                                                
##   .07**        .37**        .10**        .04**        .63**        .12**       
##   [.07, .08]   [.37, .37]   [.09, .10]   [.03, .04]   [.63, .63]   [.11, .12]  
##                                                                                
##   -.20**       -.18**       -.12**       -.08**       -.13**       -.06**      
##   [-.21, -.20] [-.19, -.18] [-.13, -.12] [-.09, -.07] [-.14, -.13] [-.07, -.06]
##                                                                                
##   -.06**       -.06**       -.04**       -.02**       -.06**       .00         
##   [-.06, -.05] [-.07, -.06] [-.04, -.03] [-.02, -.01] [-.07, -.06] [-.00, .01] 
##                                                                                
##   20           21           22          
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##                                         
##   .38**                                 
##   [.37, .38]                            
##                                         
##   -.18**       -.14**                   
##   [-.18, -.17] [-.14, -.13]             
##                                         
##   -.05**       -.07**       -.25**      
##   [-.05, -.04] [-.07, -.06] [-.26, -.25]
##                                         
## 
## Note. M and SD are used to represent mean and standard deviation, respectively.
## Values in square brackets indicate the 95% confidence interval.
## The confidence interval is a plausible range of population correlations 
## that could have caused the sample correlation (Cumming, 2014).
##  * indicates p < .05. ** indicates p < .01.
## 
# Install caret only if it is not already available
if (!requireNamespace("caret", quietly = TRUE)) install.packages("caret")
library(caret)
## Loading required package: lattice
## 
## Attaching package: 'caret'
## The following object is masked from 'package:purrr':
## 
##     lift
cor_matrix <- cor(dta_cls)
high_corr <- findCorrelation(cor_matrix, cutoff = 0.70, names = TRUE)
print(high_corr)
## [1] "departure.delay"

After reviewing the correlation matrix and cross-checking it with findCorrelation() from the caret package, we observed a very strong correlation (r = 0.97) between the variables arrival.delay and departure.delay, which had already been noted in the data preparation phase. Given this correlation, we merged the two variables into a single one, named delay.avg, by taking their average. This reduces dimensionality while losing very little information. All other pairwise correlations were below the 0.70 cutoff.

# Create the merged delay variable
dta_cls$delay.avg <- rowMeans(dta_cls[, c("departure.delay", "arrival.delay")], na.rm = TRUE)
# Remove the original delay variables
dta_cls <- dta_cls[, !names(dta_cls) %in% c("departure.delay", "arrival.delay")]

However, even after this merge, the data frame still contained 22 variables. We initially thought that this was too many variables for effective PCA and clustering.

That is why we explored the possibility of merging variables that are conceptually related even though they are not strongly correlated. We created three new features that group variables based on their functional meaning within the passenger experience:

  1. flight.experience.score: captures in-flight comfort and service, calculated as the average of the variables on.board.service, seat.comfort, leg.room.service, and cleanliness.

  2. amenities.score: expresses entertainment and extras in the plane, based on food.and.drink, in.flight.service, in.flight.wifi.service, and in.flight.entertainment variables.

  3. ground.service.score: reflects pre- and post-flight service quality, using the variables check.in.service, gate.location, online.boarding, ease.of.online.booking, baggage.handling, and departure.and.arrival.time.convenience.
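
As a rough sketch of how composite scores of this kind could be computed (the code we actually used is in the Appendix), the example below averages each group of variables row by row. The toy data frame, the groups list, and the composite object are illustrative assumptions; only the variable and score names come from the list above.

```r
# Toy data standing in for dta_cls: standardized 1-5 satisfaction scores (assumption).
set.seed(42)
score_vars <- c("on.board.service", "seat.comfort", "leg.room.service", "cleanliness",
                "food.and.drink", "in.flight.service", "in.flight.wifi.service",
                "in.flight.entertainment", "check.in.service", "gate.location",
                "online.boarding", "ease.of.online.booking", "baggage.handling",
                "departure.and.arrival.time.convenience")
toy <- as.data.frame(
  sapply(score_vars, function(v) as.numeric(scale(sample(1:5, 100, replace = TRUE))))
)

# Groups of conceptually related variables, as described in the list above.
groups <- list(
  flight.experience.score = c("on.board.service", "seat.comfort",
                              "leg.room.service", "cleanliness"),
  amenities.score         = c("food.and.drink", "in.flight.service",
                              "in.flight.wifi.service", "in.flight.entertainment"),
  ground.service.score    = c("check.in.service", "gate.location", "online.boarding",
                              "ease.of.online.booking", "baggage.handling",
                              "departure.and.arrival.time.convenience")
)

# Each composite score is the row-wise mean of its member variables.
composite <- as.data.frame(
  lapply(groups, function(vars) rowMeans(toy[, vars], na.rm = TRUE))
)
head(composite)
```

Because every member variable gets the same weight within its group, a variable in a four-item group counts for more than one in a six-item group.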

This dimensionality reduction technique was ultimately found to be unsuitable. After further analysis and discussions during the coaching sessions, we identified several issues that made this approach less effective.

Firstly, merging variables into broader categories would reduce interpretability, which is essential for identifying the main drivers of passenger satisfaction. By aggregating conceptually related variables, we lost valuable insight into the individual impact of each variable. Additionally, this approach altered the weight of some variables, allowing certain components to dominate the aggregated scores while others became less important. Recognizing these limitations, we decided to abandon this approach and explore alternative methods to handle our large number of variables while preserving the relevance of each individual feature. The relevant code of this merging attempt is included in the Appendix.

2. PCA analysis

After reducing the number of variables to 22 in the data frame dta_cls, we proceeded with Principal Component Analysis (PCA) without merging any further variables, since the remaining ones were not strongly correlated. We ran the PCA with the goal of retaining at least 90% of the total variance in the data.

PCA_analysis <- prcomp(dta_cls)
summary(PCA_analysis)
## Importance of components:
##                           PC1    PC2    PC3     PC4     PC5     PC6     PC7
## Standard deviation     1.9742 1.5602 1.4776 1.13969 1.00180 0.98236 0.96475
## Proportion of Variance 0.2176 0.1359 0.1219 0.07252 0.05603 0.05388 0.05197
## Cumulative Proportion  0.2176 0.3535 0.4754 0.54794 0.60397 0.65785 0.70982
##                            PC8     PC9    PC10    PC11    PC12    PC13    PC14
## Standard deviation     0.94315 0.82842 0.71277 0.68655 0.66843 0.60516 0.57182
## Proportion of Variance 0.04967 0.03832 0.02837 0.02632 0.02495 0.02045 0.01826
## Cumulative Proportion  0.75948 0.79780 0.82617 0.85248 0.87743 0.89787 0.91613
##                          PC15    PC16   PC17    PC18    PC19    PC20    PC21
## Standard deviation     0.5419 0.53631 0.4972 0.47165 0.42932 0.39546 0.26406
## Proportion of Variance 0.0164 0.01606 0.0138 0.01242 0.01029 0.00873 0.00389
## Cumulative Proportion  0.9325 0.94859 0.9624 0.97481 0.98510 0.99383 0.99772
##                           PC22
## Standard deviation     0.20199
## Proportion of Variance 0.00228
## Cumulative Proportion  1.00000
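
The smallest number of components that reaches a given share of variance can also be computed directly from a prcomp object instead of being read from the summary table. The sketch below does this on the built-in mtcars data, an assumption made only to keep the example self-contained; the names pca, cum_share, target, and n_comp are illustrative.

```r
# Illustrative PCA on a built-in data set (not the airline data).
pca <- prcomp(mtcars, center = TRUE, scale. = TRUE)

# Proportion of variance per component and its running total.
var_share <- pca$sdev^2 / sum(pca$sdev^2)
cum_share <- cumsum(var_share)

# First component index at which the cumulative share reaches the target.
target <- 0.90
n_comp <- which(cum_share >= target)[1]

print(round(cum_share, 3))
print(n_comp)
```

Applied to PCA_analysis, the same two lines for var_share and cum_share would reproduce the "Cumulative Proportion" row shown above.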

After running the analysis, we found that the first 14 principal components are needed to retain at least 90% of the total variance in the data (cumulative proportion 0.916; 13 components reach only 0.898). Next, we examine the loadings of each of these 14 components to see which original variables contribute most to each one, using an absolute loading of at least 0.20 as the threshold for a meaningful contribution. If the resulting components are interpretable, meaning that they represent meaningful, business-relevant dimensions, we retain them and assign descriptive names.

# Extract the loadings (rotation matrix) of the principal components
loadings <- data.frame(PCA_analysis$rotation)
threshold <- 0.20
# For each of the first 14 components, print the variables whose
# absolute loading meets the threshold
for (i in seq_len(14)) {
  pc <- paste0("PC", i)
  indices <- abs(loadings[[pc]]) >= threshold
  print(data.frame(variable = rownames(loadings)[indices],
                   loading = loadings[[pc]][indices]))
}
##                   variable   loading
## 1          online.boarding 0.2927628
## 2         on.board.service 0.2737918
## 3             seat.comfort 0.3470097
## 4         leg.room.service 0.2203638
## 5              cleanliness 0.3518564
## 6           food.and.drink 0.3010174
## 7        in.flight.service 0.2567453
## 8   in.flight.wifi.service 0.2282310
## 9  in.flight.entertainment 0.4199382
## 10        baggage.handling 0.2536555
##                                 variable   loading
## 1 departure.and.arrival.time.convenience 0.4563580
## 2                 ease.of.online.booking 0.5288273
## 3                          gate.location 0.4484806
## 4                 in.flight.wifi.service 0.4301971
##            variable    loading
## 1  on.board.service -0.3839490
## 2      seat.comfort  0.3072385
## 3  leg.room.service -0.2965297
## 4       cleanliness  0.3037258
## 5    food.and.drink  0.3328723
## 6 in.flight.service -0.4483755
## 7  baggage.handling -0.4386949
##                                 variable    loading
## 1                                    age -0.4826436
## 2                        flight.distance -0.5306713
## 3 departure.and.arrival.time.convenience  0.2146077
## 4                        online.boarding -0.3826813
## 5                         food.and.drink  0.2617159
##           variable    loading
## 1  flight.distance  0.2684597
## 2 check.in.service -0.4410129
## 3 leg.room.service  0.2119129
## 4        delay.avg  0.7659218
##                                 variable    loading
## 1                        flight.distance  0.2375180
## 2 departure.and.arrival.time.convenience -0.2277680
## 3                       check.in.service -0.6247169
## 4                              delay.avg -0.6128239
##                                 variable    loading
## 1                                    age -0.7382128
## 2 departure.and.arrival.time.convenience -0.2250370
## 3                       check.in.service  0.2536409
## 4                        online.boarding  0.3320152
## 5                          gate.location -0.3195734
## 6                 in.flight.wifi.service  0.2259798
##                                 variable    loading
## 1                                    age -0.2723650
## 2                        flight.distance  0.6608514
## 3 departure.and.arrival.time.convenience  0.2405727
## 4                       check.in.service  0.2992949
## 5                        online.boarding -0.2719930
## 6                          gate.location  0.3231713
## 7                 in.flight.wifi.service -0.2978657
##            variable    loading
## 1  check.in.service  0.2378555
## 2  on.board.service -0.2624331
## 3  leg.room.service  0.8419852
## 4 in.flight.service -0.2070002
##                                 variable    loading
## 1                         type.of.travel -0.2125003
## 2 departure.and.arrival.time.convenience  0.5611500
## 3                       check.in.service -0.2078462
## 4                          gate.location -0.5070648
## 5                       on.board.service  0.3824244
## 6                       baggage.handling -0.2679191
##                                 variable    loading
## 1 departure.and.arrival.time.convenience -0.2919385
## 2                          gate.location  0.3281244
## 3                       on.board.service  0.6926921
## 4                      in.flight.service -0.2084671
## 5                       baggage.handling -0.4391406
##                 variable    loading
## 1       check.in.service -0.2520556
## 2        online.boarding  0.3945994
## 3           seat.comfort  0.4894481
## 4         food.and.drink -0.4936511
## 5 in.flight.wifi.service -0.3265462
##            variable    loading
## 1 in.flight.service  0.7447169
## 2  baggage.handling -0.6362383
##                 variable    loading
## 1        online.boarding  0.4016479
## 2            cleanliness -0.3528912
## 3         food.and.drink  0.6035006
## 4 in.flight.wifi.service -0.4530793
rm(loadings)

After checking the loadings of each principal component, we decided not to use principal components in our final analysis. While some components appeared interpretable, not all of them had a meaningful, business-relevant interpretation. Some components also had too many variables with loadings above the threshold, which made it hard to come up with a meaningful name. Additionally, several variables had loadings above the threshold on many components, and this overlap further complicated categorization. Given our business objective, which is identifying key drivers of customer satisfaction, we decided to keep the original, meaningful variables and scores rather than rely on abstract principal components.
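
The two interpretability problems described above can be quantified by counting, for a given threshold, how many variables load on each component and on how many components each variable loads. The following sketch illustrates the idea on the built-in mtcars data; the data set, the choice of k = 5 retained components, and the object names are assumptions for illustration only.

```r
# Illustrative PCA on a built-in data set (not the airline data).
pca <- prcomp(mtcars, center = TRUE, scale. = TRUE)

threshold <- 0.20
k <- 5  # number of retained components in this toy example

# TRUE where a variable's absolute loading on a component meets the threshold.
salient <- abs(pca$rotation[, seq_len(k)]) >= threshold

# Number of variables above the threshold per component.
vars_per_component <- colSums(salient)

# Number of components on which each variable is above the threshold.
components_per_var <- sort(rowSums(salient), decreasing = TRUE)

print(vars_per_component)
print(components_per_var)
```

High counts in either vector point to components that are hard to name or variables that cannot be assigned to a single component.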

3. Clustering

In our initial attempt, we ran the clustering analysis on all 22 variables of the data frame dta_cls. However, the resulting clusters were difficult to interpret.

To address this, we used a decision tree model built with the C5.0 algorithm from the C50 package. We ran the k-means clustering algorithm for k = 3, 4 and 5 clusters and, for each case, examined the attribute usage of each variable in a decision tree fitted on the cluster assignments. This allowed us to identify the variables that contributed the most to the clustering across all three cases.
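The procedure above can be sketched as follows. This is an illustration on synthetic data under stated assumptions, not the code used in the study: the helper `attribute_usage()` and the data frame `toy` are hypothetical, `nstart = 10` is our choice, and the C50 package is assumed to be installed.

```r
library(C50)  # assumes the C50 package is installed

# Synthetic stand-in data (assumption: not the study data)
set.seed(1)
toy <- as.data.frame(scale(matrix(rnorm(300 * 5), ncol = 5)))
names(toy) <- paste0("var_", 1:5)

# Hypothetical helper: cluster with k-means, fit a C5.0 tree on the cluster
# labels, and return the attribute usage (in percent) of every variable
attribute_usage <- function(data, k) {
  cl <- kmeans(data, centers = k, nstart = 10)
  tree <- C5.0(x = data, y = as.factor(cl$cluster))
  usage <- C5imp(tree, metric = "usage")
  setNames(usage$Overall, rownames(usage))[names(data)]
}

# One column of attribute usage per number of clusters
usage_by_k <- sapply(3:5, function(k) attribute_usage(toy, k))
colnames(usage_by_k) <- paste0("k=", 3:5)
print(round(usage_by_k, 1))
```

Variables whose usage stays high in every column are the candidates to keep for the final clustering.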

We decided to keep the 8 variables with the highest attribute usage when using 3 clusters, as these variables also showed high attribute usage with 4 and 5 clusters. These variables were “in.flight.entertainment”, “baggage.handling”, “in.flight.service”, “cleanliness”, “seat.comfort”, “food.and.drink”, “online.boarding” and “on.board.service”. They are all individual satisfaction scores, originally ranging from 1 to 5, that were standardized. We concluded that these scores had a bigger impact on the clustering than the encoded categorical variables, which had a lower attribute usage.

dta_cls <- dta_cls[, names(dta_cls) %in% c("in.flight.entertainment", "baggage.handling", "in.flight.service", "cleanliness", "seat.comfort", "food.and.drink", "online.boarding", "on.board.service")]

# Overview of the new data set
str(dta_cls)
## 'data.frame':    129880 obs. of  8 variables:
##  $ online.boarding        : num  -0.265 1.333 1.333 0.534 1.333 ...
##  $ on.board.service       : num  -0.298 1.256 -0.298 1.256 -0.298 ...
##  $ seat.comfort           : num  1.181 0.423 1.181 1.181 0.423 ...
##  $ cleanliness            : num  1.305 1.305 1.305 0.543 1.305 ...
##  $ food.and.drink         : num  1.351 -0.157 1.351 0.597 0.597 ...
##  $ in.flight.service      : num  1.154 1.154 -0.546 1.154 -0.546 ...
##  $ in.flight.entertainment: num  1.231 1.231 -0.269 1.231 -0.269 ...
##  $ baggage.handling       : num  1.159 1.159 -0.536 1.159 -0.536 ...

By keeping only these 8 variables with the highest attribute usage, we reduced the data set to a smaller, more meaningful subset. We then repeated the clustering analysis and tried to analyse and interpret the resulting clusters. We finally used 3 clusters, because with 4 clusters some clusters had overlapping characteristics, which made them hard to interpret.

set.seed(57498351) # fix the seed so that the cluster assignment is reproducible
k <- 3 # set the number of clusters
clusters <- kmeans(dta_cls, k)
dta_cls <- cbind(dta_cls, cluster = clusters$cluster) # add the assigned cluster to each observation
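The comparison between 3 and 4 clusters can be made more concrete by inspecting the cluster centers side by side. The sketch below uses synthetic data, not the study data; the names `toy`, `fits` and `center_distances` are hypothetical, and `nstart = 10` is an assumption that is not part of the call above.

```r
# Synthetic stand-in for the standardized scores (assumption: not the study data)
set.seed(1)
toy <- as.data.frame(scale(matrix(rnorm(300 * 4), ncol = 4)))
names(toy) <- paste0("score_", 1:4)

# Fit k-means for both candidate numbers of clusters
fits <- lapply(c(3, 4), function(k) kmeans(toy, centers = k, nstart = 10))
names(fits) <- c("k=3", "k=4")

# Print the cluster sizes and centers of each solution
for (label in names(fits)) {
  cat("\n", label, ": cluster sizes", fits[[label]]$size, "\n")
  print(round(fits[[label]]$centers, 2))
}

# Pairwise distances between the k = 4 centers: small values flag clusters
# with similar profiles, which tend to be hard to tell apart
center_distances <- dist(fits[["k=4"]]$centers)
print(round(center_distances, 2))
```

Two centers that lie close together in every variable describe nearly the same customer profile, which is the kind of overlap that made the 4-cluster solution hard to interpret.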

In the next chunk, we visualized the clusters in a two-dimensional plot, where the x and y axes represent the two principal components with the highest variance from a Principal Component Analysis on the entire set of variables. None of the clusters seem to overlap strongly, even though the picture maps an 8-variable data set onto a 2-dimensional plane. This picture gave us a clearer visualization than the one obtained with all 22 variables.
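A plot of this kind can be produced in base R roughly as follows. This is a minimal sketch on synthetic data, not the chunk used in the study; the names `toy`, `fit`, `coords` and `explained` are hypothetical and `nstart = 10` is an assumption.

```r
# Synthetic stand-in for the 8 standardized scores (assumption: not the study data)
set.seed(1)
toy <- as.data.frame(scale(matrix(rnorm(300 * 8), ncol = 8)))
fit <- kmeans(toy, centers = 3, nstart = 10)

# Project the observations onto the first two principal components
pca <- prcomp(toy)
coords <- pca$x[, 1:2]
explained <- summary(pca)$importance["Proportion of Variance", 1:2]

# Scatter plot of the projection, coloured by cluster
plot(coords, col = fit$cluster, pch = 19, cex = 0.6,
     xlab = sprintf("PC1 (%.1f%%)", 100 * explained[1]),
     ylab = sprintf("PC2 (%.1f%%)", 100 * explained[2]),
     main = "k-means clusters on the first two principal components")
legend("topright", legend = paste("Cluster", 1:3), col = 1:3, pch = 19)
```

The axis labels report the share of variance captured by each component, which indicates how much of the original structure the two-dimensional picture can show.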

Next, we calculated the cluster sizes to check that none of the clusters is too big or too small, and visualized them in a pie chart. Clusters 1 and 3 each represent around 27% of the observations, while cluster 2 represents around 45%.
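The cluster shares can be computed and drawn as in the following sketch. The vector `cluster_id` holds made-up assignments for illustration, not the study's clusters.

```r
# Hypothetical cluster assignments (assumption: not the study data)
cluster_id <- c(1, 2, 2, 3, 2, 1, 3, 2, 2, 1)

# Absolute sizes and percentage shares per cluster
sizes <- table(cluster_id)
shares <- as.numeric(round(100 * prop.table(sizes), 1))

# Pie chart labelled with the share of each cluster
pie(sizes,
    labels = sprintf("Cluster %s: %.1f%%", names(sizes), shares),
    main = "Share of observations per cluster")
```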

In the next chunk, we calculated the distribution of each variable within each cluster, as well as a data frame with the mean of each clustering variable per cluster. This gave us a clearer view of each cluster and helped us find meaningful, distinctive characteristics for each one.

##   cluster online.boarding on.board.service seat.comfort cleanliness
## 1       1      0.06813982      -0.90859668    0.2025568   0.1717311
## 2       2      0.39845177       0.59462253    0.5676445   0.5827241
## 3       3     -0.71896931      -0.07449592   -1.1285089  -1.1227056
##   food.and.drink in.flight.service in.flight.entertainment baggage.handling
## 1      0.2398098       -1.04798174              -0.3082042      -1.01143383
## 2      0.4820187        0.58926405               0.7700781       0.57365844
## 3     -1.0252930        0.07201908              -0.9547930       0.06142086
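A table of per-cluster means like the one above can be obtained with `aggregate()`. The sketch below uses a small made-up data frame `toy`, not the study data, so the values are illustrative only.

```r
# Hypothetical standardized scores with cluster labels (assumption: not the study data)
toy <- data.frame(
  seat.comfort = c(1.2, 0.4, -1.1, -0.9, 0.3, 0.5),
  cleanliness  = c(1.3, 0.5, -1.2, -1.0, 0.1, 0.3),
  cluster      = c(1, 1, 2, 2, 3, 3)
)

# Mean of every variable within each cluster
cluster_means <- aggregate(. ~ cluster, data = toy, FUN = mean)
print(cluster_means)
```

Each row of `cluster_means` is the average profile of one cluster, which is what the interpretation of the clusters is based on.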

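We tried to group the characteristics of each cluster. Because the scores are standardized, a positive mean indicates an above-average score and a negative mean a below-average one. We came up with the following: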

  1. Cluster 1 - Service-focused customers, critical of employee-provided services: Apart from some slightly positive scores for cleanliness, food and drink and seat comfort, customers in this cluster gave negative scores to services that involve direct interactions, such as on board service and in flight service. Baggage handling was also rated clearly below average. Although they were moderately satisfied with some pre-defined services, they were critical of the service aspects provided by the airlines’ employees before and during their flight.
  2. Cluster 2 - Highly satisfied customers, seeking entertainment and comfort: In this cluster, customers gave positive scores on all variables. They were highly satisfied with all kinds of services during their flight experience, particularly those involving entertainment, comfort and service quality.
  3. Cluster 3 - Dissatisfied and critical customers, critical of the lack of amenities: In this cluster, customers gave highly negative scores on most variables, especially those involving luxury and comfort services such as online boarding, seat comfort, cleanliness, food and drink and in flight entertainment. They are neutral when it comes to services provided by the airlines’ employees, but seek more comfort and entertainment during their flight.

Airlines can use these clusters to streamline improvements that enhance customer satisfaction. By understanding the unique characteristics of each cluster, companies can also target their marketing efforts and focus on specific areas for improvement, such as comfort, luxury options and employee-provided services. It is important, though, to further analyse the variables that were excluded when creating the clusters. Although they were omitted due to low attribute usage, they may still have some influence on customer satisfaction.

In the chunk below, we ran the C5.0 algorithm once again on the final selection of variables and clusters. The output is quite extensive and most of it is not relevant for this part of the analysis. However, the last rows show that the variables finally selected for the clustering analysis still have a high attribute usage in the model.

model <- C50::C5.0(as.factor(cluster) ~ .,
                   data = dta_cls)
summary(model)
## 
## Call:
## C5.0.formula(formula = as.factor(cluster) ~ ., data = dta_cls)
## 
## 
## C5.0 [Release 2.07 GPL Edition]      Thu Apr 17 17:40:45 2025
## -------------------------------
## 
## Class specified by attribute `outcome'
## 
## Read 129880 cases (9 attributes) from undefined.data
## 
## Decision tree:
## 
## cleanliness <= -0.979776:
## :...food.and.drink > -0.911037:
## :   :...in.flight.service <= -0.5459917:
## :   :   :...seat.comfort > -1.092578:
## :   :   :   :...baggage.handling <= -1.383119: 1 (1704/5)
## :   :   :   :   baggage.handling > -1.383119:
## :   :   :   :   :...online.boarding <= -1.063626:
## :   :   :   :       :...cleanliness <= -1.741252:
## :   :   :   :       :   :...seat.comfort <= 0:
## :   :   :   :       :   :   :...food.and.drink <= 0.5972599: 3 (34)
## :   :   :   :       :   :   :   food.and.drink > 0.5972599:
## :   :   :   :       :   :   :   :...online.boarding <= -1.86236: 3 (4)
## :   :   :   :       :   :   :       online.boarding > -1.86236: 1 (6)
## :   :   :   :       :   :   seat.comfort > 0:
## :   :   :   :       :   :   :...online.boarding > -1.86236: 1 (14/1)
## :   :   :   :       :   :       online.boarding <= -1.86236:
## :   :   :   :       :   :       :...on.board.service <= -1.074771: 1 (2)
## :   :   :   :       :   :           on.board.service > -1.074771: 3 (7/1)
## :   :   :   :       :   cleanliness > -1.741252:
## :   :   :   :       :   :...baggage.handling > 0.3117611:
## :   :   :   :       :       :...seat.comfort <= 0.4234303: 3 (3)
## :   :   :   :       :       :   seat.comfort > 0.4234303: 1 (2/1)
## :   :   :   :       :       baggage.handling <= 0.3117611:
## :   :   :   :       :       :...food.and.drink <= 0:
## :   :   :   :       :           :...seat.comfort > 0.4234303: 1 (5)
## :   :   :   :       :           :   seat.comfort <= 0.4234303:
## :   :   :   :       :           :   :...seat.comfort <= 0: 3 (6)
## :   :   :   :       :           :       seat.comfort > 0:
## :   :   :   :       :           :       :...online.boarding <= -1.86236: 3 (3)
## :   :   :   :       :           :           online.boarding > -1.86236: 1 (2)
## :   :   :   :       :           food.and.drink > 0:
## :   :   :   :       :           :...online.boarding > -1.86236: 1 (42)
## :   :   :   :       :               online.boarding <= -1.86236:
## :   :   :   :       :               :...food.and.drink > 0.5972599: 1 (8)
## :   :   :   :       :                   food.and.drink <= 0.5972599:
## :   :   :   :       :                   :...seat.comfort <= 0: 3 (3)
## :   :   :   :       :                       seat.comfort > 0: 1 (3)
## :   :   :   :       online.boarding > -1.063626:
## :   :   :   :       :...food.and.drink > 0:
## :   :   :   :           :...baggage.handling <= -0.5356789: 1 (579)
## :   :   :   :           :   baggage.handling > -0.5356789:
## :   :   :   :           :   :...baggage.handling <= 0.3117611: 1 (36)
## :   :   :   :           :       baggage.handling > 0.3117611:
## :   :   :   :           :       :...on.board.service <= -1.074771: 1 (5)
## :   :   :   :           :           on.board.service > -1.074771: 2 (6/1)
## :   :   :   :           food.and.drink <= 0:
## :   :   :   :           :...cleanliness > -1.741252: 1 (169/2)
## :   :   :   :               cleanliness <= -1.741252:
## :   :   :   :               :...seat.comfort <= 0:
## :   :   :   :                   :...on.board.service <= -1.074771: 1 (14/2)
## :   :   :   :                   :   on.board.service > -1.074771: 3 (72/1)
## :   :   :   :                   seat.comfort > 0:
## :   :   :   :                   :...baggage.handling <= -0.5356789: 1 (63)
## :   :   :   :                       baggage.handling > -0.5356789:
## :   :   :   :                       :...baggage.handling > 0.3117611: 2 (2/1)
## :   :   :   :                           baggage.handling <= 0.3117611:
## :   :   :   :                           :...on.board.service <= -1.074771: 1 (2)
## :   :   :   :                               on.board.service > -1.074771: 3 (4/1)
## :   :   :   seat.comfort <= -1.092578:
## :   :   :   :...baggage.handling <= -1.383119:
## :   :   :       :...on.board.service > -1.074771:
## :   :   :       :   :...cleanliness <= -1.741252: 3 (18/1)
## :   :   :       :   :   cleanliness > -1.741252:
## :   :   :       :   :   :...in.flight.entertainment <= -1.018808: 3 (7/1)
## :   :   :       :   :       in.flight.entertainment > -1.018808:
## :   :   :       :   :       :...on.board.service <= 0.4793173:
## :   :   :       :   :           :...seat.comfort <= -1.850582: 3 (3/1)
## :   :   :       :   :           :   seat.comfort > -1.850582: 1 (18/2)
## :   :   :       :   :           on.board.service > 0.4793173:
## :   :   :       :   :           :...in.flight.service <= -1.396005: 1 (2)
## :   :   :       :   :               in.flight.service > -1.396005: 3 (3)
## :   :   :       :   on.board.service <= -1.074771:
## :   :   :       :   :...food.and.drink <= 0:
## :   :   :       :       :...in.flight.service <= -2.246018: 1 (72/4)
## :   :   :       :       :   in.flight.service > -2.246018:
## :   :   :       :       :   :...cleanliness <= -1.741252:
## :   :   :       :       :       :...online.boarding <= -1.063626: 3 (30/1)
## :   :   :       :       :       :   online.boarding > -1.063626:
## :   :   :       :       :       :   :...seat.comfort <= -1.850582: 3 (3)
## :   :   :       :       :       :       seat.comfort > -1.850582: 1 (7)
## :   :   :       :       :       cleanliness > -1.741252:
## :   :   :       :       :       :...online.boarding <= -1.86236: 3 (4)
## :   :   :       :       :           online.boarding > -1.86236:
## :   :   :       :       :           :...in.flight.service <= -1.396005: 1 (37)
## :   :   :       :       :               in.flight.service > -1.396005: 3 (2)
## :   :   :       :       food.and.drink > 0:
## :   :   :       :       :...seat.comfort > -1.850582: 1 (185/1)
## :   :   :       :           seat.comfort <= -1.850582:
## :   :   :       :           :...on.board.service <= -1.851815: 1 (29)
## :   :   :       :               on.board.service > -1.851815:
## :   :   :       :               :...online.boarding <= -1.86236: 3 (5)
## :   :   :       :                   online.boarding > -1.86236:
## :   :   :       :                   :...cleanliness > -1.741252: 1 (14)
## :   :   :       :                       cleanliness <= -1.741252:
## :   :   :       :                       :...food.and.drink <= 0.5972599: 3 (4)
## :   :   :       :                           food.and.drink > 0.5972599: 1 (8)
## :   :   :       baggage.handling > -1.383119:
## :   :   :       :...cleanliness <= -1.741252:
## :   :   :           :...online.boarding <= 0: 3 (230/2)
## :   :   :           :   online.boarding > 0:
## :   :   :           :   :...food.and.drink <= 0.5972599: 3 (21/1)
## :   :   :           :       food.and.drink > 0.5972599:
## :   :   :           :       :...online.boarding > 0.5338414: 1 (4)
## :   :   :           :           online.boarding <= 0.5338414:
## :   :   :           :           :...seat.comfort <= -1.850582: 3 (2)
## :   :   :           :               seat.comfort > -1.850582: 1 (2)
## :   :   :           cleanliness > -1.741252:
## :   :   :           :...online.boarding > -1.063626:
## :   :   :               :...food.and.drink <= 0:
## :   :   :               :   :...in.flight.service > -1.396005: 3 (25/3)
## :   :   :               :   :   in.flight.service <= -1.396005:
## :   :   :               :   :   :...baggage.handling <= -0.5356789: 1 (5)
## :   :   :               :   :       baggage.handling > -0.5356789: 3 (2)
## :   :   :               :   food.and.drink > 0:
## :   :   :               :   :...on.board.service <= 0: 1 (50/3)
## :   :   :               :       on.board.service > 0:
## :   :   :               :       :...baggage.handling <= -0.5356789: 1 (2/1)
## :   :   :               :           baggage.handling > -0.5356789: 2 (2)
## :   :   :               online.boarding <= -1.063626:
## :   :   :               :...in.flight.service > -1.396005:
## :   :   :                   :...food.and.drink <= 0.5972599: 3 (109/1)
## :   :   :                   :   food.and.drink > 0.5972599:
## :   :   :                   :   :...seat.comfort <= -1.850582: 3 (9)
## :   :   :                   :       seat.comfort > -1.850582:
## :   :   :                   :       :...baggage.handling <= -0.5356789: 1 (10/1)
## :   :   :                   :           baggage.handling > -0.5356789: 3 (6/1)
## :   :   :                   in.flight.service <= -1.396005:
## :   :   :                   :...on.board.service > 0: 3 (25/2)
## :   :   :                       on.board.service <= 0:
## :   :   :                       :...in.flight.service <= -2.246018: 1 (21/2)
## :   :   :                           in.flight.service > -2.246018:
## :   :   :                           :...food.and.drink > 0: 1 (3)
## :   :   :                               food.and.drink <= 0: [S1]
## :   :   in.flight.service > -0.5459917:
## :   :   :...baggage.handling <= -1.383119:
## :   :       :...in.flight.entertainment > 0.4810466:
## :   :       :   :...seat.comfort > 0: 2 (14)
## :   :       :   :   seat.comfort <= 0:
## :   :       :   :   :...seat.comfort <= -1.850582: 3 (7)
## :   :       :   :       seat.comfort > -1.850582:
## :   :       :   :       :...food.and.drink <= 0: 3 (6/2)
## :   :       :   :           food.and.drink > 0: 2 (6/1)
## :   :       :   in.flight.entertainment <= 0.4810466:
## :   :       :   :...seat.comfort <= -1.092578: 3 (35/5)
## :   :       :       seat.comfort > -1.092578:
## :   :       :       :...online.boarding > 0: 1 (13/1)
## :   :       :           online.boarding <= 0:
## :   :       :           :...food.and.drink > 0.5972599: 1 (7)
## :   :       :               food.and.drink <= 0.5972599:
## :   :       :               :...baggage.handling <= -2.230559: 1 (7/1)
## :   :       :                   baggage.handling > -2.230559:
## :   :       :                   :...seat.comfort <= 0: 3 (6)
## :   :       :                       seat.comfort > 0: 1 (3/1)
## :   :       baggage.handling > -1.383119:
## :   :       :...seat.comfort > 0:
## :   :           :...food.and.drink > 0: 2 (600/5)
## :   :           :   food.and.drink <= 0:
## :   :           :   :...on.board.service > 0.4793173: 2 (145)
## :   :           :       on.board.service <= 0.4793173:
## :   :           :       :...online.boarding > 0: 2 (76)
## :   :           :           online.boarding <= 0:
## :   :           :           :...cleanliness <= -1.741252:
## :   :           :               :...seat.comfort <= 0.4234303: 3 (31/1)
## :   :           :               :   seat.comfort > 0.4234303:
## :   :           :               :   :...online.boarding <= -1.86236: 3 (6)
## :   :           :               :       online.boarding > -1.86236: 2 (7)
## :   :           :               cleanliness > -1.741252:
## :   :           :               :...seat.comfort > 0.4234303: 2 (17)
## :   :           :                   seat.comfort <= 0.4234303:
## :   :           :                   :...online.boarding <= -1.86236: 3 (3)
## :   :           :                       online.boarding > -1.86236: 2 (39/1)
## :   :           seat.comfort <= 0:
## :   :           :...in.flight.entertainment <= 0.4810466:
## :   :               :...online.boarding <= -1.063626:
## :   :               :   :...seat.comfort <= -1.092578: 3 (320/4)
## :   :               :   :   seat.comfort > -1.092578:
## :   :               :   :   :...cleanliness <= -1.741252: 3 (57)
## :   :               :   :       cleanliness > -1.741252:
## :   :               :   :       :...food.and.drink <= 0: 3 (24)
## :   :               :   :           food.and.drink > 0:
## :   :               :   :           :...baggage.handling <= -0.5356789: 3 (2)
## :   :               :   :               baggage.handling > -0.5356789:
## :   :               :   :               :...online.boarding > -1.86236: 2 (14)
## :   :               :   :                   online.boarding <= -1.86236:
## :   :               :   :                   :...food.and.drink <= 0.5972599: 3 (8)
## :   :               :   :                       food.and.drink > 0.5972599: 2 (6)
## :   :               :   online.boarding > -1.063626:
## :   :               :   :...food.and.drink > 0:
## :   :               :       :...seat.comfort <= -1.850582:
## :   :               :       :   :...online.boarding > 0.5338414:
## :   :               :       :   :   :...food.and.drink > 0.5972599: 2 (14)
## :   :               :       :   :   :   food.and.drink <= 0.5972599:
## :   :               :       :   :   :   :...cleanliness <= -1.741252: 3 (3)
## :   :               :       :   :   :       cleanliness > -1.741252: 2 (3)
## :   :               :       :   :   online.boarding <= 0.5338414:
## :   :               :       :   :   :...food.and.drink <= 0.5972599: 3 (31)
## :   :               :       :   :       food.and.drink > 0.5972599:
## :   :               :       :   :       :...cleanliness <= -1.741252: 3 (17)
## :   :               :       :   :           cleanliness > -1.741252:
## :   :               :       :   :           :...online.boarding <= 0: 3 (5/1)
## :   :               :       :   :               online.boarding > 0: 2 (6)
## :   :               :       :   seat.comfort > -1.850582:
## :   :               :       :   :...online.boarding > 0: 2 (162/5)
## :   :               :       :       online.boarding <= 0:
## :   :               :       :       :...food.and.drink <= 0.5972599:
## :   :               :       :           :...cleanliness <= -1.741252: 3 (34)
## :   :               :       :           :   cleanliness > -1.741252:
## :   :               :       :           :   :...seat.comfort <= -1.092578: 3 (13/1)
## :   :               :       :           :       seat.comfort > -1.092578: 2 (34)
## :   :               :       :           food.and.drink > 0.5972599:
## :   :               :       :           :...seat.comfort > -1.092578: 2 (54)
## :   :               :       :               seat.comfort <= -1.092578:
## :   :               :       :               :...cleanliness <= -1.741252: 3 (7)
## :   :               :       :                   cleanliness > -1.741252: 2 (13)
## :   :               :       food.and.drink <= 0:
## :   :               :       :...online.boarding <= 0: 3 (96/2)
## :   :               :           online.boarding > 0:
## :   :               :           :...seat.comfort <= -1.850582: 3 (35)
## :   :               :               seat.comfort > -1.850582:
## :   :               :               :...cleanliness <= -1.741252:
## :   :               :                   :...online.boarding <= 0.5338414: 3 (32)
## :   :               :                   :   online.boarding > 0.5338414:
## :   :               :                   :   :...seat.comfort <= -1.092578: 3 (4)
## :   :               :                   :       seat.comfort > -1.092578: 2 (5)
## :   :               :                   cleanliness > -1.741252:
## :   :               :                   :...baggage.handling <= -0.5356789: 3 (3/1)
## :   :               :                       baggage.handling > -0.5356789:
## :   :               :                       :...seat.comfort > -1.092578: 2 (24)
## :   :               :                           seat.comfort <= -1.092578: [S2]
## :   :               in.flight.entertainment > 0.4810466:
## :   :               :...on.board.service <= 0.4793173:
## :   :                   :...seat.comfort <= -1.850582: 3 (24)
## :   :                   :   seat.comfort > -1.850582:
## :   :                   :   :...food.and.drink <= 0: 3 (4)
## :   :                   :       food.and.drink > 0:
## :   :                   :       :...on.board.service <= -1.074771: 3 (3/1)
## :   :                   :           on.board.service > -1.074771: 2 (8/2)
## :   :                   on.board.service > 0.4793173:
## :   :                   :...online.boarding <= -1.86236:
## :   :                       :...food.and.drink <= 0: 3 (41/1)
## :   :                       :   food.and.drink > 0:
## :   :                       :   :...seat.comfort <= -1.850582:
## :   :                       :       :...cleanliness <= -1.741252: 3 (14)
## :   :                       :       :   cleanliness > -1.741252:
## :   :                       :       :   :...food.and.drink <= 0.5972599: 3 (3)
## :   :                       :       :       food.and.drink > 0.5972599: 2 (5)
## :   :                       :       seat.comfort > -1.850582:
## :   :                       :       :...food.and.drink > 0.5972599: 2 (27)
## :   :                       :           food.and.drink <= 0.5972599:
## :   :                       :           :...cleanliness > -1.741252: 2 (11)
## :   :                       :               cleanliness <= -1.741252:
## :   :                       :               :...seat.comfort <= -1.092578: 3 (5)
## :   :                       :                   seat.comfort > -1.092578: 2 (5/1)
## :   :                       online.boarding > -1.86236:
## :   :                       :...food.and.drink > 0:
## :   :                           :...seat.comfort > -1.850582: 2 (230)
## :   :                           :   seat.comfort <= -1.850582:
## :   :                           :   :...cleanliness > -1.741252: 2 (69)
## :   :                           :       cleanliness <= -1.741252:
## :   :                           :       :...food.and.drink > 0.5972599: 2 (22)
## :   :                           :           food.and.drink <= 0.5972599:
## :   :                           :           :...online.boarding <= 0: 3 (14)
## :   :                           :               online.boarding > 0: 2 (11)
## :   :                           food.and.drink <= 0:
## :   :                           :...seat.comfort > -1.092578: 2 (62/1)
## :   :                               seat.comfort <= -1.092578:
## :   :                               :...online.boarding <= 0:
## :   :                                   :...seat.comfort <= -1.850582: 3 (19)
## :   :                                   :   seat.comfort > -1.850582:
## :   :                                   :   :...cleanliness <= -1.741252: 3 (12)
## :   :                                   :       cleanliness > -1.741252: [S3]
## :   :                                   online.boarding > 0:
## :   :                                   :...seat.comfort > -1.850582: 2 (26)
## :   :                                       seat.comfort <= -1.850582:
## :   :                                       :...cleanliness > -1.741252: 2 (11)
## :   :                                           cleanliness <= -1.741252: [S4]
## :   food.and.drink <= -0.911037:
## :   :...in.flight.service > -1.396005:
## :       :...in.flight.entertainment > 0:
## :       :   :...seat.comfort > 0:
## :       :   :   :...on.board.service <= 0.4793173:
## :       :   :   :   :...online.boarding <= -1.063626: 3 (99/4)
## :       :   :   :   :   online.boarding > -1.063626:
## :       :   :   :   :   :...food.and.drink <= -1.665185:
## :       :   :   :   :       :...online.boarding <= 0.5338414:
## :       :   :   :   :       :   :...seat.comfort <= 0.4234303: 3 (96/1)
## :       :   :   :   :       :   :   seat.comfort > 0.4234303:
## :       :   :   :   :       :   :   :...cleanliness <= -1.741252: 3 (16)
## :       :   :   :   :       :   :       cleanliness > -1.741252: 2 (15/1)
## :       :   :   :   :       :   online.boarding > 0.5338414:
## :       :   :   :   :       :   :...cleanliness > -1.741252: 2 (12)
## :       :   :   :   :       :       cleanliness <= -1.741252:
## :       :   :   :   :       :       :...seat.comfort <= 0.4234303: 3 (8)
## :       :   :   :   :       :           seat.comfort > 0.4234303: 2 (5)
## :       :   :   :   :       food.and.drink > -1.665185:
## :       :   :   :   :       :...cleanliness > -1.741252: 2 (79/2)
## :       :   :   :   :           cleanliness <= -1.741252:
## :       :   :   :   :           :...seat.comfort > 0.4234303: 2 (23)
## :       :   :   :   :               seat.comfort <= 0.4234303:
## :       :   :   :   :               :...online.boarding <= 0.5338414: 3 (42)
## :       :   :   :   :                   online.boarding > 0.5338414: 2 (7/1)
## :       :   :   :   on.board.service > 0.4793173:
## :       :   :   :   :...online.boarding > -1.063626: 2 (186/3)
## :       :   :   :       online.boarding <= -1.063626:
## :       :   :   :       :...food.and.drink > -1.665185:
## :       :   :   :           :...cleanliness > -1.741252: 2 (20)
## :       :   :   :           :   cleanliness <= -1.741252:
## :       :   :   :           :   :...online.boarding > -1.86236: 2 (14/1)
## :       :   :   :           :       online.boarding <= -1.86236:
## :       :   :   :           :       :...seat.comfort <= 0.4234303: 3 (6)
## :       :   :   :           :           seat.comfort > 0.4234303: 2 (2)
## :       :   :   :           food.and.drink <= -1.665185:
## :       :   :   :           :...seat.comfort <= 0.4234303:
## :       :   :   :               :...cleanliness <= -1.741252: 3 (14)
## :       :   :   :               :   cleanliness > -1.741252:
## :       :   :   :               :   :...online.boarding <= -1.86236: 3 (6)
## :       :   :   :               :       online.boarding > -1.86236: 2 (4/1)
## :       :   :   :               seat.comfort > 0.4234303:
## :       :   :   :               :...cleanliness > -1.741252: 2 (9)
## :       :   :   :                   cleanliness <= -1.741252:
## :       :   :   :                   :...online.boarding <= -1.86236: 3 (5)
## :       :   :   :                       online.boarding > -1.86236:
## :       :   :   :                       :...baggage.handling <= -0.5356789: 3 (2)
## :       :   :   :                           baggage.handling > -0.5356789: 2 (5)
## :       :   :   seat.comfort <= 0:
## :       :   :   :...in.flight.service <= 0.3040217:
## :       :   :       :...online.boarding <= 0.5338414: 3 (778/2)
## :       :   :       :   online.boarding > 0.5338414:
## :       :   :       :   :...seat.comfort <= -1.092578: 3 (41)
## :       :   :       :       seat.comfort > -1.092578:
## :       :   :       :       :...food.and.drink <= -1.665185: 3 (12)
## :       :   :       :           food.and.drink > -1.665185:
## :       :   :       :           :...cleanliness <= -1.741252: 3 (4)
## :       :   :       :               cleanliness > -1.741252: 2 (9)
## :       :   :       in.flight.service > 0.3040217:
## :       :   :       :...online.boarding <= -1.063626:
## :       :   :           :...seat.comfort <= -1.092578: 3 (178)
## :       :   :           :   seat.comfort > -1.092578:
## :       :   :           :   :...food.and.drink <= -1.665185: 3 (29)
## :       :   :           :       food.and.drink > -1.665185:
## :       :   :           :       :...online.boarding <= -1.86236: 3 (14)
## :       :   :           :           online.boarding > -1.86236:
## :       :   :           :           :...cleanliness <= -1.741252: 3 (8)
## :       :   :           :               cleanliness > -1.741252:
## :       :   :           :               :...baggage.handling <= -0.5356789: 3 (2)
## :       :   :           :                   baggage.handling > -0.5356789: 2 (6)
## :       :   :           online.boarding > -1.063626:
## :       :   :           :...seat.comfort <= -1.850582: 3 (75/4)
## :       :   :               seat.comfort > -1.850582:
## :       :   :               :...food.and.drink <= -1.665185:
## :       :   :                   :...cleanliness <= -1.741252:
## :       :   :                   :   :...online.boarding <= 0.5338414: 3 (40)
## :       :   :                   :   :   online.boarding > 0.5338414:
## :       :   :                   :   :   :...seat.comfort <= -1.092578: 3 (9)
## :       :   :                   :   :       seat.comfort > -1.092578: 2 (6)
## :       :   :                   :   cleanliness > -1.741252:
## :       :   :                   :   :...baggage.handling <= 0.3117611: 3 (14/2)
## :       :   :                   :       baggage.handling > 0.3117611:
## :       :   :                   :       :...seat.comfort > -1.092578: 2 (22)
## :       :   :                   :           seat.comfort <= -1.092578:
## :       :   :                   :           :...online.boarding <= 0.5338414: 3 (7)
## :       :   :                   :               online.boarding > 0.5338414: 2 (8)
## :       :   :                   food.and.drink > -1.665185:
## :       :   :                   :...seat.comfort > -1.092578:
## :       :   :                       :...baggage.handling <= -2.230559: 3 (3)
## :       :   :                       :   baggage.handling > -2.230559: 2 (62/1)
## :       :   :                       seat.comfort <= -1.092578:
## :       :   :                       :...online.boarding > 0.5338414: 2 (8)
## :       :   :                           online.boarding <= 0.5338414:
## :       :   :                           :...cleanliness <= -1.741252: 3 (10)
## :       :   :                               cleanliness > -1.741252:
## :       :   :                               :...on.board.service <= 0.4793173: 3 (2)
## :       :   :                                   on.board.service > 0.4793173: [S5]
## :       :   in.flight.entertainment <= 0:
## :       :   :...seat.comfort <= 0:
## :       :       :...baggage.handling > -1.383119: 3 (19674/12)
## :       :       :   baggage.handling <= -1.383119:
## :       :       :   :...online.boarding <= -0.2648924:
## :       :       :       :...baggage.handling > -2.230559: 3 (1029/5)
## :       :       :       :   baggage.handling <= -2.230559:
## :       :       :       :   :...in.flight.service > -0.5459917: 3 (241)
## :       :       :       :       in.flight.service <= -0.5459917:
## :       :       :       :       :...on.board.service <= -1.851815:
## :       :       :       :           :...food.and.drink <= -1.665185: 3 (15)
## :       :       :       :           :   food.and.drink > -1.665185:
## :       :       :       :           :   :...online.boarding <= -1.86236: 3 (3)
## :       :       :       :           :       online.boarding > -1.86236: 1 (14)
## :       :       :       :           on.board.service > -1.851815:
## :       :       :       :           :...seat.comfort > -1.092578: [S6]
## :       :       :       :               seat.comfort <= -1.092578:
## :       :       :       :               :...online.boarding <= -1.063626: 3 (102)
## :       :       :       :                   online.boarding > -1.063626:
## :       :       :       :                   :...on.board.service > -1.074771: 3 (28)
## :       :       :       :                       on.board.service <= -1.074771:
## :       :       :       :                       :...seat.comfort <= -1.850582: 3 (3)
## :       :       :       :                           seat.comfort > -1.850582: 1 (4)
## :       :       :       online.boarding > -0.2648924:
## :       :       :       :...in.flight.entertainment <= -1.768735: 3 (193)
## :       :       :           in.flight.entertainment > -1.768735:
## :       :       :           :...in.flight.service > -0.5459917: 3 (144/8)
## :       :       :               in.flight.service <= -0.5459917:
## :       :       :               :...on.board.service <= -1.851815: 1 (31/1)
## :       :       :                   on.board.service > -1.851815:
## :       :       :                   :...baggage.handling <= -2.230559:
## :       :       :                       :...on.board.service <= -1.074771: 1 (7)
## :       :       :                       :   on.board.service > -1.074771:
## :       :       :                       :   :...online.boarding <= 0: 3 (5)
## :       :       :                       :       online.boarding > 0: 1 (18/7)
## :       :       :                       baggage.handling > -2.230559:
## :       :       :                       :...seat.comfort <= -1.092578: 3 (46)
## :       :       :                           seat.comfort > -1.092578:
## :       :       :                           :...food.and.drink <= -1.665185: 3 (3)
## :       :       :                               food.and.drink > -1.665185: [S7]
## :       :       seat.comfort > 0:
## :       :       :...baggage.handling > -0.5356789:
## :       :           :...on.board.service > -1.074771:
## :       :           :   :...online.boarding <= 0: 3 (558/6)
## :       :           :   :   online.boarding > 0:
## :       :           :   :   :...cleanliness <= -1.741252: 3 (67)
## :       :           :   :       cleanliness > -1.741252:
## :       :           :   :       :...seat.comfort <= 0.4234303:
## :       :           :   :           :...online.boarding <= 0.5338414: 3 (55)
## :       :           :   :           :   online.boarding > 0.5338414:
## :       :           :   :           :   :...on.board.service <= 0.4793173: 3 (16/1)
## :       :           :   :           :       on.board.service > 0.4793173: 2 (6)
## :       :           :   :           seat.comfort > 0.4234303:
## :       :           :   :           :...on.board.service > 0: 2 (32/3)
## :       :           :   :               on.board.service <= 0:
## :       :           :   :               :...online.boarding <= 0.5338414: 3 (17/1)
## :       :           :   :                   online.boarding > 0.5338414: 2 (4)
## :       :           :   on.board.service <= -1.074771:
## :       :           :   :...seat.comfort <= 0.4234303: 3 (82/3)
## :       :           :       seat.comfort > 0.4234303:
## :       :           :       :...cleanliness <= -1.741252: 3 (17)
## :       :           :           cleanliness > -1.741252:
## :       :           :           :...baggage.handling > 0.3117611: 3 (10)
## :       :           :               baggage.handling <= 0.3117611:
## :       :           :               :...online.boarding <= -1.86236: 3 (5)
## :       :           :                   online.boarding > -1.86236:
## :       :           :                   :...in.flight.service <= -0.5459917: 1 (12)
## :       :           :                       in.flight.service > -0.5459917:
## :       :           :                       :...online.boarding <= 0: 3 (5/1)
## :       :           :                           online.boarding > 0: 1 (2)
## :       :           baggage.handling <= -0.5356789:
## :       :           :...cleanliness <= -1.741252:
## :       :               :...baggage.handling <= -1.383119:
## :       :               :   :...food.and.drink > -1.665185: 1 (5)
## :       :               :   :   food.and.drink <= -1.665185:
## :       :               :   :   :...on.board.service > -1.851815: 3 (49/4)
## :       :               :   :       on.board.service <= -1.851815:
## :       :               :   :       :...online.boarding <= -1.063626: 3 (3/1)
## :       :               :   :           online.boarding > -1.063626: 1 (3)
## :       :               :   baggage.handling > -1.383119:
## :       :               :   :...seat.comfort <= 0.4234303: 3 (198/1)
## :       :               :       seat.comfort > 0.4234303:
## :       :               :       :...food.and.drink <= -1.665185: 3 (58/2)
## :       :               :           food.and.drink > -1.665185:
## :       :               :           :...online.boarding <= -1.063626: 3 (4)
## :       :               :               online.boarding > -1.063626: 1 (7)
## :       :               cleanliness > -1.741252:
## :       :               :...online.boarding <= -1.063626:
## :       :                   :...on.board.service > -1.074771: 3 (105/3)
## :       :                   :   on.board.service <= -1.074771:
## :       :                   :   :...baggage.handling <= -2.230559: 1 (4)
## :       :                   :       baggage.handling > -2.230559:
## :       :                   :       :...in.flight.service > -0.5459917: 3 (36/4)
## :       :                   :           in.flight.service <= -0.5459917:
## :       :                   :           :...seat.comfort > 0.4234303: 1 (11)
## :       :                   :               seat.comfort <= 0.4234303:
## :       :                   :               :...on.board.service > -1.851815: 3 (11/1)
## :       :                   :                   on.board.service <= -1.851815: [S8]
## :       :                   online.boarding > -1.063626:
## :       :                   :...food.and.drink <= -1.665185:
## :       :                       :...seat.comfort > 0.4234303: 1 (6)
## :       :                       :   seat.comfort <= 0.4234303:
## :       :                       :   :...baggage.handling <= -1.383119: 1 (4)
## :       :                       :       baggage.handling > -1.383119:
## :       :                       :       :...online.boarding <= 0.5338414: 3 (50)
## :       :                       :           online.boarding > 0.5338414: 1 (2)
## :       :                       food.and.drink > -1.665185:
## :       :                       :...in.flight.entertainment > -1.018808: 1 (68)
## :       :                           in.flight.entertainment <= -1.018808:
## :       :                           :...on.board.service > -1.074771:
## :       :                               :...baggage.handling > -1.383119: 3 (48/6)
## :       :                               :   baggage.handling <= -1.383119:
## :       :                               :   :...in.flight.service <= 0.3040217: 1 (19/2)
## :       :                               :       in.flight.service > 0.3040217: 3 (4/1)
## :       :                               on.board.service <= -1.074771:
## :       :                               :...in.flight.service <= -0.5459917: 1 (28)
## :       :                                   in.flight.service > -0.5459917:
## :       :                                   :...seat.comfort > 0.4234303: 1 (12)
## :       :                                       seat.comfort <= 0.4234303: [S9]
## :       in.flight.service <= -1.396005:
## :       :...seat.comfort > -1.092578:
## :           :...baggage.handling > -1.383119:
## :           :   :...cleanliness <= -1.741252:
## :           :   :   :...on.board.service > -1.851815: 3 (55/2)
## :           :   :   :   on.board.service <= -1.851815:
## :           :   :   :   :...online.boarding <= -1.063626: 3 (7)
## :           :   :   :       online.boarding > -1.063626:
## :           :   :   :       :...seat.comfort > 0: 1 (11/2)
## :           :   :   :           seat.comfort <= 0:
## :           :   :   :           :...food.and.drink <= -1.665185: 3 (4)
## :           :   :   :               food.and.drink > -1.665185:
## :           :   :   :               :...baggage.handling <= -0.5356789: 1 (2)
## :           :   :   :                   baggage.handling > -0.5356789: 3 (2)
## :           :   :   cleanliness > -1.741252:
## :           :   :   :...in.flight.entertainment <= -1.768735: 1 (10)
## :           :   :       in.flight.entertainment > -1.768735:
## :           :   :       :...seat.comfort <= 0:
## :           :   :           :...baggage.handling > -0.5356789: 3 (37/2)
## :           :   :           :   baggage.handling <= -0.5356789:
## :           :   :           :   :...online.boarding > 0: 1 (4)
## :           :   :           :       online.boarding <= 0:
## :           :   :           :       :...online.boarding <= -1.063626: 3 (10/1)
## :           :   :           :           online.boarding > -1.063626:
## :           :   :           :           :...on.board.service <= 0: 1 (6/1)
## :           :   :           :               on.board.service > 0: 3 (4)
## :           :   :           seat.comfort > 0:
## :           :   :           :...on.board.service <= 0:
## :           :   :               :...food.and.drink <= -1.665185:
## :           :   :               :   :...baggage.handling <= 0.3117611: 1 (3)
## :           :   :               :   :   baggage.handling > 0.3117611: 3 (3)
## :           :   :               :   food.and.drink > -1.665185:
## :           :   :               :   :...on.board.service <= -1.074771: 1 (36)
## :           :   :               :       on.board.service > -1.074771:
## :           :   :               :       :...seat.comfort <= 0.4234303: 3 (5/1)
## :           :   :               :           seat.comfort > 0.4234303: 1 (10)
## :           :   :               on.board.service > 0:
## :           :   :               :...online.boarding > 0: 1 (7)
## :           :   :                   online.boarding <= 0:
## :           :   :                   :...baggage.handling > -0.5356789: 3 (10)
## :           :   :                       baggage.handling <= -0.5356789:
## :           :   :                       :...in.flight.service <= -2.246018: 1 (3)
## :           :   :                           in.flight.service > -2.246018: 3 (8/2)
## :           :   baggage.handling <= -1.383119:
## :           :   :...online.boarding <= -1.063626:
## :           :       :...baggage.handling <= -2.230559:
## :           :       :   :...food.and.drink > -1.665185: 1 (40)
## :           :       :   :   food.and.drink <= -1.665185:
## :           :       :   :   :...in.flight.service <= -2.246018: 1 (19/1)
## :           :       :   :       in.flight.service > -2.246018: 3 (4)
## :           :       :   baggage.handling > -2.230559:
## :           :       :   :...food.and.drink > -1.665185:
## :           :       :       :...cleanliness > -1.741252: 1 (39/3)
## :           :       :       :   cleanliness <= -1.741252:
## :           :       :       :   :...seat.comfort <= 0: 3 (14)
## :           :       :       :       seat.comfort > 0: 1 (5)
## :           :       :       food.and.drink <= -1.665185:
## :           :       :       :...seat.comfort <= 0: 3 (41)
## :           :       :           seat.comfort > 0:
## :           :       :           :...cleanliness > -1.741252: 1 (6/1)
## :           :       :               cleanliness <= -1.741252:
## :           :       :               :...online.boarding <= -1.86236: 3 (7)
## :           :       :                   online.boarding > -1.86236:
## :           :       :                   :...seat.comfort <= 0.4234303: 3 (4)
## :           :       :                       seat.comfort > 0.4234303: 1 (4/1)
## :           :       online.boarding > -1.063626:
## :           :       :...food.and.drink > -1.665185: 1 (544/2)
## :           :           food.and.drink <= -1.665185:
## :           :           :...cleanliness > -1.741252: 1 (204)
## :           :               cleanliness <= -1.741252:
## :           :               :...in.flight.service <= -2.246018: 1 (103/1)
## :           :                   in.flight.service > -2.246018:
## :           :                   :...seat.comfort <= 0:
## :           :                       :...baggage.handling <= -2.230559: 1 (3)
## :           :                       :   baggage.handling > -2.230559:
## :           :                       :   :...online.boarding <= 0: 3 (44)
## :           :                       :       online.boarding > 0:
## :           :                       :       :...online.boarding <= 0.5338414: 3 (25)
## :           :                       :           online.boarding > 0.5338414: 1 (2)
## :           :                       seat.comfort > 0:
## :           :                       :...on.board.service <= -1.074771: 1 (69/1)
## :           :                           on.board.service > -1.074771:
## :           :                           :...seat.comfort <= 0.4234303: 3 (5/1)
## :           :                               seat.comfort > 0.4234303:
## :           :                               :...online.boarding <= 0: 3 (3/1)
## :           :                                   online.boarding > 0: 1 (3)
## :           seat.comfort <= -1.092578:
## :           :...seat.comfort <= -1.850582:
## :               :...cleanliness <= -1.741252: 3 (1449/2)
## :               :   cleanliness > -1.741252:
## :               :   :...baggage.handling > -2.230559: 3 (95/6)
## :               :       baggage.handling <= -2.230559:
## :               :       :...on.board.service <= -1.074771:
## :               :           :...food.and.drink <= -1.665185: 3 (5/1)
## :               :           :   food.and.drink > -1.665185: 1 (10/1)
## :               :           on.board.service > -1.074771:
## :               :           :...online.boarding <= 0: 3 (10)
## :               :               online.boarding > 0: 1 (2)
## :               seat.comfort > -1.850582:
## :               :...baggage.handling > -1.383119:
## :                   :...on.board.service <= -1.851815:
## :                   :   :...online.boarding > -0.2648924:
## :                   :   :   :...cleanliness <= -1.741252: 3 (2)
## :                   :   :   :   cleanliness > -1.741252:
## :                   :   :   :   :...baggage.handling <= 0.3117611: 1 (27/2)
## :                   :   :   :       baggage.handling > 0.3117611:
## :                   :   :   :       :...online.boarding <= 0.5338414: 3 (8)
## :                   :   :   :           online.boarding > 0.5338414: 1 (2)
## :                   :   :   online.boarding <= -0.2648924:
## :                   :   :   :...in.flight.service > -2.246018: 3 (101/3)
## :                   :   :       in.flight.service <= -2.246018:
## :                   :   :       :...baggage.handling > -0.5356789: 3 (31)
## :                   :   :           baggage.handling <= -0.5356789:
## :                   :   :           :...food.and.drink <= -1.665185: 3 (3)
## :                   :   :               food.and.drink > -1.665185:
## :                   :   :               :...online.boarding <= -1.86236: 3 (5/1)
## :                   :   :                   online.boarding > -1.86236: 1 (19/1)
## :                   :   on.board.service > -1.851815:
## :                   :   :...in.flight.service > -2.246018: 3 (484)
## :                   :       in.flight.service <= -2.246018:
## :                   :       :...baggage.handling > -0.5356789:
## :                   :           :...online.boarding <= 0.5338414: 3 (189)
## :                   :           :   online.boarding > 0.5338414:
## :                   :           :   :...on.board.service <= -1.074771: 1 (2)
## :                   :           :       on.board.service > -1.074771: 3 (8)
## :                   :           baggage.handling <= -0.5356789:
## :                   :           :...online.boarding <= -1.063626: 3 (74/2)
## :                   :               online.boarding > -1.063626:
## :                   :               :...on.board.service > 0: 3 (16)
## :                   :                   on.board.service <= 0:
## :                   :                   :...online.boarding > -0.2648924: 1 (12)
## :                   :                       online.boarding <= -0.2648924:
## :                   :                       :...on.board.service <= -1.074771: 1 (6/1)
## :                   :                           on.board.service > -1.074771: 3 (7)
## :                   baggage.handling <= -1.383119:
## :                   :...on.board.service <= -1.851815:
## :                       :...online.boarding > -1.86236:
## :                       :   :...cleanliness > -1.741252: 1 (182)
## :                       :   :   cleanliness <= -1.741252:
## :                       :   :   :...food.and.drink > -1.665185: 1 (17)
## :                       :   :       food.and.drink <= -1.665185:
## :                       :   :       :...online.boarding <= -1.063626: 3 (9)
## :                       :   :           online.boarding > -1.063626: 1 (7/1)
## :                       :   online.boarding <= -1.86236:
## :                       :   :...cleanliness <= -1.741252: 3 (10)
## :                       :       cleanliness > -1.741252:
## :                       :       :...food.and.drink <= -1.665185: 3 (2)
## :                       :           food.and.drink > -1.665185:
## :                       :           :...baggage.handling <= -2.230559: 1 (15)
## :                       :               baggage.handling > -2.230559:
## :                       :               :...in.flight.service <= -2.246018: 1 (5)
## :                       :                   in.flight.service > -2.246018: 3 (8)
## :                       on.board.service > -1.851815:
## :                       :...in.flight.service <= -2.246018:
## :                           :...in.flight.entertainment <= -1.768735: 3 (11/1)
## :                           :   in.flight.entertainment > -1.768735:
## :                           :   :...baggage.handling <= -2.230559:
## :                           :       :...online.boarding <= -1.86236:
## :                           :       :   :...on.board.service <= 0: 1 (6)
## :                           :       :   :   on.board.service > 0: 3 (10)
## :                           :       :   online.boarding > -1.86236:
## :                           :       :   :...on.board.service <= 0.4793173: 1 (86)
## :                           :       :       on.board.service > 0.4793173:
## :                           :       :       :...online.boarding <= -1.063626: 3 (5)
## :                           :       :           online.boarding > -1.063626: 1 (12)
## :                           :       baggage.handling > -2.230559:
## :                           :       :...on.board.service <= -1.074771:
## :                           :           :...online.boarding <= -1.86236: 3 (6)
## :                           :           :   online.boarding > -1.86236: 1 (25)
## :                           :           on.board.service > -1.074771:
## :                           :           :...online.boarding <= -1.063626: [S10]
## :                           :               online.boarding > -1.063626:
## :                           :               :...on.board.service <= 0: 1 (16)
## :                           :                   on.board.service > 0: [S11]
## :                           in.flight.service > -2.246018:
## :                           :...online.boarding <= -1.063626:
## :                               :...baggage.handling > -2.230559: 3 (265/2)
## :                               :   baggage.handling <= -2.230559:
## :                               :   :...on.board.service > -1.074771: [S12]
## :                               :       on.board.service <= -1.074771:
## :                               :       :...online.boarding <= -1.86236: 3 (8)
## :                               :           online.boarding > -1.86236:
## :                               :           :...cleanliness <= -1.741252: 3 (2)
## :                               :               cleanliness > -1.741252: 1 (24)
## :                               online.boarding > -1.063626:
## :                               :...food.and.drink <= -1.665185: 3 (28)
## :                                   food.and.drink > -1.665185:
## :                                   :...on.board.service <= -1.074771:
## :                                       :...cleanliness > -1.741252: 1 (52)
## :                                       :   cleanliness <= -1.741252:
## :                                       :   :...baggage.handling <= -2.230559: 1 (2)
## :                                       :       baggage.handling > -2.230559: 3 (17)
## :                                       on.board.service > -1.074771:
## :                                       :...baggage.handling > -2.230559:
## :                                           :...online.boarding <= 0: 3 (50)
## :                                           :   online.boarding > 0: [S13]
## :                                           baggage.handling <= -2.230559:
## :                                           :...on.board.service <= 0: 1 (8)
## :                                               on.board.service > 0: [S14]
## cleanliness > -0.979776:
## :...in.flight.service <= 0:
##     :...baggage.handling <= -0.5356789:
##     :   :...on.board.service > 0:
##     :   :   :...in.flight.entertainment <= 0:
##     :   :   :   :...seat.comfort <= -1.092578:
##     :   :   :   :   :...in.flight.service <= -1.396005:
##     :   :   :   :   :   :...online.boarding <= -1.86236: 3 (5/1)
##     :   :   :   :   :   :   online.boarding > -1.86236: 1 (34/3)
##     :   :   :   :   :   in.flight.service > -1.396005:
##     :   :   :   :   :   :...baggage.handling <= -1.383119:
##     :   :   :   :   :       :...seat.comfort <= -1.850582: 3 (9/1)
##     :   :   :   :   :       :   seat.comfort > -1.850582: 1 (8/1)
##     :   :   :   :   :       baggage.handling > -1.383119:
##     :   :   :   :   :       :...online.boarding <= 0: 3 (35)
##     :   :   :   :   :           online.boarding > 0: 1 (3/1)
##     :   :   :   :   seat.comfort > -1.092578:
##     :   :   :   :   :...online.boarding <= -1.063626:
##     :   :   :   :       :...baggage.handling <= -1.383119: 1 (169/4)
##     :   :   :   :       :   baggage.handling > -1.383119:
##     :   :   :   :       :   :...in.flight.service > -1.396005:
##     :   :   :   :       :       :...seat.comfort <= 0: 3 (97)
##     :   :   :   :       :       :   seat.comfort > 0: 1 (19/1)
##     :   :   :   :       :       in.flight.service <= -1.396005:
##     :   :   :   :       :       :...on.board.service <= 0.4793173: 1 (37)
##     :   :   :   :       :           on.board.service > 0.4793173:
##     :   :   :   :       :           :...online.boarding > -1.86236: 1 (9)
##     :   :   :   :       :               online.boarding <= -1.86236:
##     :   :   :   :       :               :...in.flight.service <= -2.246018: 1 (3)
##     :   :   :   :       :                   in.flight.service > -2.246018: 3 (4)
##     :   :   :   :       online.boarding > -1.063626:
##     :   :   :   :       :...on.board.service <= 0.4793173:
##     :   :   :   :           :...in.flight.entertainment > -1.018808: 1 (563/2)
##     :   :   :   :           :   in.flight.entertainment <= -1.018808:
##     :   :   :   :           :   :...online.boarding > 0: 1 (44)
##     :   :   :   :           :       online.boarding <= 0:
##     :   :   :   :           :       :...in.flight.service <= -1.396005: 1 (19/1)
##     :   :   :   :           :           in.flight.service > -1.396005:
##     :   :   :   :           :           :...baggage.handling <= -2.230559: 1 (2)
##     :   :   :   :           :               baggage.handling > -2.230559: 3 (11/2)
##     :   :   :   :           on.board.service > 0.4793173:
##     :   :   :   :           :...in.flight.service <= -1.396005: 1 (144/6)
##     :   :   :   :               in.flight.service > -1.396005:
##     :   :   :   :               :...baggage.handling <= -1.383119: 1 (47/2)
##     :   :   :   :                   baggage.handling > -1.383119:
##     :   :   :   :                   :...online.boarding <= 0: 3 (19/1)
##     :   :   :   :                       online.boarding > 0: [S15]
##     :   :   :   in.flight.entertainment > 0:
##     :   :   :   :...in.flight.service <= -1.396005:
##     :   :   :       :...baggage.handling > -1.383119:
##     :   :   :       :   :...seat.comfort > 0.4234303:
##     :   :   :       :   :   :...online.boarding <= -1.86236: 1 (13)
##     :   :   :       :   :   :   online.boarding > -1.86236:
##     :   :   :       :   :   :   :...in.flight.service <= -2.246018:
##     :   :   :       :   :   :       :...on.board.service <= 0.4793173: 1 (45)
##     :   :   :       :   :   :       :   on.board.service > 0.4793173: [S16]
##     :   :   :       :   :   :       in.flight.service > -2.246018:
##     :   :   :       :   :   :       :...cleanliness > 0.543176: 2 (110/1)
##     :   :   :       :   :   :           cleanliness <= 0.543176:
##     :   :   :       :   :   :           :...on.board.service <= 0.4793173: 1 (4)
##     :   :   :       :   :   :               on.board.service > 0.4793173: 2 (3/1)
##     :   :   :       :   :   seat.comfort <= 0.4234303:
##     :   :   :       :   :   :...on.board.service <= 0.4793173: 1 (177/3)
##     :   :   :       :   :       on.board.service > 0.4793173:
##     :   :   :       :   :       :...in.flight.service <= -2.246018: 1 (60)
##     :   :   :       :   :           in.flight.service > -2.246018:
##     :   :   :       :   :           :...online.boarding > -0.2648924:
##     :   :   :       :   :               :...seat.comfort <= 0: 1 (4)
##     :   :   :       :   :               :   seat.comfort > 0: 2 (33)
##     :   :   :       :   :               online.boarding <= -0.2648924: [S17]
##     :   :   :       :   baggage.handling <= -1.383119:
##     :   :   :       :   :...on.board.service <= 0.4793173: 1 (500)
##     :   :   :       :       on.board.service > 0.4793173:
##     :   :   :       :       :...cleanliness <= 0.543176: 1 (212)
##     :   :   :       :           cleanliness > 0.543176:
##     :   :   :       :           :...baggage.handling <= -2.230559: 1 (80)
##     :   :   :       :               baggage.handling > -2.230559:
##     :   :   :       :               :...in.flight.service <= -2.246018: 1 (35)
##     :   :   :       :                   in.flight.service > -2.246018:
##     :   :   :       :                   :...online.boarding > -0.2648924: 2 (25/1)
##     :   :   :       :                       online.boarding <= -0.2648924:
##     :   :   :       :                       :...online.boarding <= -1.063626: 1 (6)
##     :   :   :       :                           online.boarding > -1.063626:
##     :   :   :       :                           :...seat.comfort <= 0.4234303: 1 (3)
##     :   :   :       :                               seat.comfort > 0.4234303: 2 (2)
##     :   :   :       in.flight.service > -1.396005:
##     :   :   :       :...baggage.handling <= -2.230559:
##     :   :   :           :...on.board.service <= 0.4793173: 1 (120)
##     :   :   :           :   on.board.service > 0.4793173:
##     :   :   :           :   :...cleanliness <= 0.543176: 1 (53)
##     :   :   :           :       cleanliness > 0.543176:
##     :   :   :           :       :...online.boarding <= -1.063626: 1 (7)
##     :   :   :           :           online.boarding > -1.063626: 2 (40/1)
##     :   :   :           baggage.handling > -2.230559:
##     :   :   :           :...cleanliness > 0.543176:
##     :   :   :               :...baggage.handling > -1.383119: 2 (286)
##     :   :   :               :   baggage.handling <= -1.383119:
##     :   :   :               :   :...online.boarding <= -1.86236: 1 (10/1)
##     :   :   :               :       online.boarding > -1.86236:
##     :   :   :               :       :...seat.comfort > 0.4234303: 2 (104)
##     :   :   :               :           seat.comfort <= 0.4234303:
##     :   :   :               :           :...on.board.service > 0.4793173: 2 (4)
##     :   :   :               :               on.board.service <= 0.4793173:
##     :   :   :               :               :...seat.comfort <= -1.092578: 1 (4)
##     :   :   :               :                   seat.comfort > -1.092578: [S18]
##     :   :   :               cleanliness <= 0.543176:
##     :   :   :               :...online.boarding <= -1.063626:
##     :   :   :                   :...on.board.service <= 0.4793173: 1 (139/4)
##     :   :   :                   :   on.board.service > 0.4793173:
##     :   :   :                   :   :...baggage.handling <= -1.383119: 1 (6/1)
##     :   :   :                   :       baggage.handling > -1.383119: 2 (5)
##     :   :   :                   online.boarding > -1.063626:
##     :   :   :                   :...baggage.handling <= -1.383119:
##     :   :   :                       :...on.board.service <= 0.4793173: 1 (78)
##     :   :   :                       :   on.board.service > 0.4793173:
##     :   :   :                       :   :...seat.comfort <= 0: 1 (2)
##     :   :   :                       :       seat.comfort > 0: 2 (46)
##     :   :   :                       baggage.handling > -1.383119:
##     :   :   :                       :...seat.comfort > 0: 2 (203)
##     :   :   :                           seat.comfort <= 0:
##     :   :   :                           :...seat.comfort <= -1.092578: 1 (21/1)
##     :   :   :                               seat.comfort > -1.092578:
##     :   :   :                               :...food.and.drink <= -0.911037: 3 (3/1)
##     :   :   :                                   food.and.drink > -0.911037: [S19]
##     :   :   on.board.service <= 0:
##     :   :   :...seat.comfort <= -1.092578:
##     :   :       :...food.and.drink > -0.1568885:
##     :   :       :   :...online.boarding > -1.86236: 1 (666/6)
##     :   :       :   :   online.boarding <= -1.86236:
##     :   :       :   :   :...on.board.service <= -1.074771: 1 (105)
##     :   :       :   :       on.board.service > -1.074771:
##     :   :       :   :       :...cleanliness <= 0: 3 (12/1)
##     :   :       :   :           cleanliness > 0: 1 (44/4)
##     :   :       :   food.and.drink <= -0.1568885:
##     :   :       :   :...on.board.service <= -1.074771:
##     :   :       :       :...online.boarding <= -1.86236:
##     :   :       :       :   :...in.flight.service <= -2.246018: 1 (38/2)
##     :   :       :       :   :   in.flight.service > -2.246018:
##     :   :       :       :   :   :...food.and.drink <= -1.665185: 3 (24)
##     :   :       :       :   :       food.and.drink > -1.665185:
##     :   :       :       :   :       :...seat.comfort <= -1.850582:
##     :   :       :       :   :           :...in.flight.service > -1.396005: 3 (13)
##     :   :       :       :   :           :   in.flight.service <= -1.396005:
##     :   :       :       :   :           :   :...food.and.drink <= -0.911037: 3 (7)
##     :   :       :       :   :           :       food.and.drink > -0.911037: 1 (6/1)
##     :   :       :       :   :           seat.comfort > -1.850582:
##     :   :       :       :   :           :...food.and.drink > -0.911037: 1 (22/2)
##     :   :       :       :   :               food.and.drink <= -0.911037:
##     :   :       :       :   :               :...cleanliness <= 0: 3 (8)
##     :   :       :       :   :                   cleanliness > 0: 1 (7)
##     :   :       :       :   online.boarding > -1.86236:
##     :   :       :       :   :...food.and.drink <= -1.665185:
##     :   :       :       :       :...on.board.service <= -1.851815: 1 (38)
##     :   :       :       :       :   on.board.service > -1.851815:
##     :   :       :       :       :   :...cleanliness <= 0:
##     :   :       :       :       :       :...online.boarding <= -1.063626: 3 (16/1)
##     :   :       :       :       :       :   online.boarding > -1.063626:
##     :   :       :       :       :       :   :...seat.comfort <= -1.850582: 3 (3/1)
##     :   :       :       :       :       :       seat.comfort > -1.850582: 1 (8)
##     :   :       :       :       :       cleanliness > 0:
##     :   :       :       :       :       :...seat.comfort > -1.850582: 1 (20)
##     :   :       :       :       :           seat.comfort <= -1.850582:
##     :   :       :       :       :           :...cleanliness <= 0.543176: 3 (2)
##     :   :       :       :       :               cleanliness > 0.543176: 1 (5/1)
##     :   :       :       :       food.and.drink > -1.665185:
##     :   :       :       :       :...baggage.handling <= -1.383119: 1 (307/2)
##     :   :       :       :           baggage.handling > -1.383119:
##     :   :       :       :           :...seat.comfort > -1.850582: 1 (55/1)
##     :   :       :       :               seat.comfort <= -1.850582:
##     :   :       :       :               :...in.flight.service <= -1.396005: 1 (18)
##     :   :       :       :                   in.flight.service > -1.396005:
##     :   :       :       :                   :...online.boarding > -0.2648924: 1 (4)
##     :   :       :       :                       online.boarding <= -0.2648924: [S20]
##     :   :       :       on.board.service > -1.074771:
##     :   :       :       :...online.boarding > -1.063626:
##     :   :       :           :...food.and.drink <= -0.911037:
##     :   :       :           :   :...cleanliness <= 0: 3 (30/2)
##     :   :       :           :   :   cleanliness > 0:
##     :   :       :           :   :   :...food.and.drink <= -1.665185:
##     :   :       :           :   :       :...cleanliness <= 0.543176:
##     :   :       :           :   :       :   :...online.boarding <= 0.5338414: 3 (10)
##     :   :       :           :   :       :   :   online.boarding > 0.5338414:
##     :   :       :           :   :       :   :   :...seat.comfort <= -1.850582: 3 (2)
##     :   :       :           :   :       :   :       seat.comfort > -1.850582: 1 (4)
##     :   :       :           :   :       :   cleanliness > 0.543176:
##     :   :       :           :   :       :   :...seat.comfort > -1.850582: 1 (9)
##     :   :       :           :   :       :       seat.comfort <= -1.850582: [S21]
##     :   :       :           :   :       food.and.drink > -1.665185:
##     :   :       :           :   :       :...seat.comfort > -1.850582: 1 (16)
##     :   :       :           :   :           seat.comfort <= -1.850582:
##     :   :       :           :   :           :...online.boarding > 0: 1 (8)
##     :   :       :           :   :               online.boarding <= 0:
##     :   :       :           :   :               :...cleanliness <= 0.543176: 3 (3)
##     :   :       :           :   :                   cleanliness > 0.543176: 1 (2)
##     :   :       :           :   food.and.drink > -0.911037:
##     :   :       :           :   :...seat.comfort > -1.850582: 1 (77)
##     :   :       :           :       seat.comfort <= -1.850582:
##     :   :       :           :       :...in.flight.service <= -1.396005: 1 (13)
##     :   :       :           :           in.flight.service > -1.396005:
##     :   :       :           :           :...cleanliness > 0: 1 (5)
##     :   :       :           :               cleanliness <= 0:
##     :   :       :           :               :...online.boarding <= 0: 3 (11)
##     :   :       :           :                   online.boarding > 0: [S22]
##     :   :       :           online.boarding <= -1.063626:
##     :   :       :           :...in.flight.service <= -1.396005:
##     :   :       :               :...online.boarding > -1.86236: 1 (6)
##     :   :       :               :   online.boarding <= -1.86236:
##     :   :       :               :   :...in.flight.service <= -2.246018: 1 (2)
##     :   :       :               :       in.flight.service > -2.246018: 3 (5/1)
##     :   :       :               in.flight.service > -1.396005:
##     :   :       :               :...baggage.handling <= -2.230559:
##     :   :       :                   :...seat.comfort <= -1.850582: 3 (3)
##     :   :       :                   :   seat.comfort > -1.850582:
##     :   :       :                   :   :...food.and.drink <= -0.911037: 3 (3/1)
##     :   :       :                   :       food.and.drink > -0.911037: 1 (5)
##     :   :       :                   baggage.handling > -2.230559:
##     :   :       :                   :...cleanliness <= 0: 3 (96)
##     :   :       :                       cleanliness > 0:
##     :   :       :                       :...food.and.drink <= -0.911037:
##     :   :       :                           :...baggage.handling <= -1.383119: 1 (3/1)
##     :   :       :                           :   baggage.handling > -1.383119: 3 (53/1)
##     :   :       :                           food.and.drink > -0.911037:
##     :   :       :                           :...online.boarding <= -1.86236: 3 (7/1)
##     :   :       :                               online.boarding > -1.86236:
##     :   :       :                               :...seat.comfort > -1.850582: 1 (15)
##     :   :       :                                   seat.comfort <= -1.850582: [S23]
##     :   :       seat.comfort > -1.092578:
##     :   :       :...in.flight.entertainment > 0.4810466:
##     :   :           :...on.board.service <= -1.074771: 1 (1367)
##     :   :           :   on.board.service > -1.074771:
##     :   :           :   :...baggage.handling <= -1.383119: 1 (387)
##     :   :           :       baggage.handling > -1.383119:
##     :   :           :       :...in.flight.service <= -1.396005: 1 (124)
##     :   :           :           in.flight.service > -1.396005:
##     :   :           :           :...seat.comfort > 0.4234303: 2 (179)
##     :   :           :               seat.comfort <= 0.4234303:
##     :   :           :               :...online.boarding > -1.063626: 2 (9)
##     :   :           :                   online.boarding <= -1.063626:
##     :   :           :                   :...online.boarding <= -1.86236: 1 (4)
##     :   :           :                       online.boarding > -1.86236:
##     :   :           :                       :...seat.comfort <= 0: 1 (2)
##     :   :           :                           seat.comfort > 0: 2 (2)
##     :   :           in.flight.entertainment <= 0.4810466:
##     :   :           :...food.and.drink <= -1.665185:
##     :   :               :...baggage.handling <= -1.383119: 1 (619/2)
##     :   :               :   baggage.handling > -1.383119:
##     :   :               :   :...seat.comfort <= 0:
##     :   :               :       :...cleanliness <= 0:
##     :   :               :       :   :...in.flight.service <= -1.396005: 1 (8/1)
##     :   :               :       :   :   in.flight.service > -1.396005: 3 (73/1)
##     :   :               :       :   cleanliness > 0:
##     :   :               :       :   :...online.boarding > -1.063626: 1 (74)
##     :   :               :       :       online.boarding <= -1.063626:
##     :   :               :       :       :...on.board.service <= -1.074771: 1 (3)
##     :   :               :       :           on.board.service > -1.074771: 3 (19)
##     :   :               :       seat.comfort > 0:
##     :   :               :       :...online.boarding > -1.063626: 1 (179)
##     :   :               :           online.boarding <= -1.063626:
##     :   :               :           :...cleanliness > 0: 1 (19/1)
##     :   :               :               cleanliness <= 0:
##     :   :               :               :...seat.comfort <= 0.4234303: 3 (9)
##     :   :               :                   seat.comfort > 0.4234303:
##     :   :               :                   :...online.boarding <= -1.86236: 3 (2)
##     :   :               :                       online.boarding > -1.86236: 1 (2)
##     :   :               food.and.drink > -1.665185:
##     :   :               :...online.boarding > -1.063626: 1 (13494/5)
##     :   :                   online.boarding <= -1.063626:
##     :   :                   :...on.board.service <= -1.074771: 1 (1243)
##     :   :                       on.board.service > -1.074771:
##     :   :                       :...cleanliness > 0: 1 (373/1)
##     :   :                           cleanliness <= 0:
##     :   :                           :...food.and.drink <= -0.911037:
##     :   :                               :...seat.comfort <= 0: 3 (15)
##     :   :                               :   seat.comfort > 0:
##     :   :                               :   :...online.boarding > -1.86236: 1 (4)
##     :   :                               :       online.boarding <= -1.86236:
##     :   :                               :       :...seat.comfort <= 0.4234303: 3 (3)
##     :   :                               :           seat.comfort > 0.4234303: 1 (3)
##     :   :                               food.and.drink > -0.911037:
##     :   :                               :...online.boarding > -1.86236: 1 (209/1)
##     :   :                                   online.boarding <= -1.86236:
##     :   :                                   :...in.flight.service <= -1.396005: 1 (34)
##     :   :                                       in.flight.service > -1.396005: [S24]
##     :   baggage.handling > -0.5356789:
##     :   :...cleanliness <= 0:
##     :       :...on.board.service > 0:
##     :       :   :...in.flight.service <= -2.246018:
##     :       :   :   :...seat.comfort <= -1.092578: 3 (8/1)
##     :       :   :   :   seat.comfort > -1.092578:
##     :       :   :   :   :...online.boarding > -0.2648924: 1 (37/3)
##     :       :   :   :       online.boarding <= -0.2648924:
##     :       :   :   :       :...online.boarding <= -1.86236:
##     :       :   :   :           :...seat.comfort > 0.4234303: 1 (3)
##     :       :   :   :           :   seat.comfort <= 0.4234303:
##     :       :   :   :           :   :...on.board.service > 0.4793173: 3 (5)
##     :       :   :   :           :       on.board.service <= 0.4793173:
##     :       :   :   :           :       :...baggage.handling <= 0.3117611: 1 (4/1)
##     :       :   :   :           :           baggage.handling > 0.3117611: 3 (2)
##     :       :   :   :           online.boarding > -1.86236:
##     :       :   :   :           :...food.and.drink <= -1.665185: 3 (2)
##     :       :   :   :               food.and.drink > -1.665185:
##     :       :   :   :               :...baggage.handling <= 0.3117611: 1 (40/2)
##     :       :   :   :                   baggage.handling > 0.3117611:
##     :       :   :   :                   :...on.board.service <= 0.4793173: 1 (9/1)
##     :       :   :   :                       on.board.service > 0.4793173: 3 (7/2)
##     :       :   :   in.flight.service > -2.246018:
##     :       :   :   :...online.boarding > -0.2648924:
##     :       :   :       :...seat.comfort <= -1.092578: 3 (13)
##     :       :   :       :   seat.comfort > -1.092578:
##     :       :   :       :   :...in.flight.service > -1.396005: 2 (78/4)
##     :       :   :       :       in.flight.service <= -1.396005:
##     :       :   :       :       :...baggage.handling <= 0.3117611: 1 (18/2)
##     :       :   :       :           baggage.handling > 0.3117611:
##     :       :   :       :           :...online.boarding <= 0: 3 (4/1)
##     :       :   :       :               online.boarding > 0: 2 (13)
##     :       :   :       online.boarding <= -0.2648924:
##     :       :   :       :...in.flight.entertainment > 0:
##     :       :   :           :...food.and.drink > -0.911037: 2 (40/1)
##     :       :   :           :   food.and.drink <= -0.911037:
##     :       :   :           :   :...food.and.drink <= -1.665185: 3 (6)
##     :       :   :           :       food.and.drink > -1.665185:
##     :       :   :           :       :...in.flight.entertainment <= 0.4810466: 3 (3/1)
##     :       :   :           :           in.flight.entertainment > 0.4810466: 2 (3)
##     :       :   :           in.flight.entertainment <= 0:
##     :       :   :           :...seat.comfort > 0:
##     :       :   :               :...online.boarding <= -1.86236:
##     :       :   :               :   :...seat.comfort <= 0.4234303: 3 (5)
##     :       :   :               :   :   seat.comfort > 0.4234303: 1 (4)
##     :       :   :               :   online.boarding > -1.86236:
##     :       :   :               :   :...seat.comfort > 0.4234303: 2 (11)
##     :       :   :               :       seat.comfort <= 0.4234303:
##     :       :   :               :       :...online.boarding <= -1.063626: 3 (2)
##     :       :   :               :           online.boarding > -1.063626: 2 (4/1)
##     :       :   :               seat.comfort <= 0:
##     :       :   :               :...in.flight.service <= -1.396005:
##     :       :   :                   :...on.board.service > 0.4793173: 3 (43)
##     :       :   :                   :   on.board.service <= 0.4793173:
##     :       :   :                   :   :...baggage.handling > 0.3117611: 3 (18)
##     :       :   :                   :       baggage.handling <= 0.3117611:
##     :       :   :                   :       :...online.boarding <= -1.063626: 3 (9)
##     :       :   :                   :           online.boarding > -1.063626: [S25]
##     :       :   :                   in.flight.service > -1.396005:
##     :       :   :                   :...baggage.handling <= 0.3117611: 3 (313/2)
##     :       :   :                       baggage.handling > 0.3117611:
##     :       :   :                       :...food.and.drink > 0: 2 (2)
##     :       :   :                           food.and.drink <= 0:
##     :       :   :                           :...online.boarding <= -1.063626: 3 (21)
##     :       :   :                               online.boarding > -1.063626: [S26]
##     :       :   on.board.service <= 0:
##     :       :   :...seat.comfort <= -1.092578:
##     :       :       :...online.boarding > 0:
##     :       :       :   :...on.board.service <= -1.074771: 1 (11)
##     :       :       :   :   on.board.service > -1.074771:
##     :       :       :   :   :...online.boarding <= 0.5338414: 3 (6/1)
##     :       :       :   :       online.boarding > 0.5338414: 1 (3)
##     :       :       :   online.boarding <= 0:
##     :       :       :   :...on.board.service > -1.074771: 3 (43)
##     :       :       :       on.board.service <= -1.074771:
##     :       :       :       :...food.and.drink > 0.5972599: 1 (3)
##     :       :       :           food.and.drink <= 0.5972599:
##     :       :       :           :...seat.comfort <= -1.850582:
##     :       :       :               :...in.flight.service <= -2.246018: 1 (3/1)
##     :       :       :               :   in.flight.service > -2.246018: 3 (32)
##     :       :       :               seat.comfort > -1.850582:
##     :       :       :               :...baggage.handling > 0.3117611: 3 (6)
##     :       :       :                   baggage.handling <= 0.3117611:
##     :       :       :                   :...online.boarding <= -1.86236: 3 (3)
##     :       :       :                       online.boarding > -1.86236:
##     :       :       :                       :...on.board.service <= -1.851815: 1 (13)
##     :       :       :                           on.board.service > -1.851815: [S27]
##     :       :       seat.comfort > -1.092578:
##     :       :       :...in.flight.entertainment > 0.4810466:
##     :       :           :...in.flight.service <= -1.396005: 1 (6)
##     :       :           :   in.flight.service > -1.396005:
##     :       :           :   :...on.board.service > -1.074771: 2 (11/1)
##     :       :           :       on.board.service <= -1.074771:
##     :       :           :       :...baggage.handling <= 0.3117611: 1 (5)
##     :       :           :           baggage.handling > 0.3117611: 2 (6/1)
##     :       :           in.flight.entertainment <= 0.4810466:
##     :       :           :...online.boarding <= -1.063626:
##     :       :               :...on.board.service <= -1.074771:
##     :       :               :   :...online.boarding <= -1.86236:
##     :       :               :   :   :...on.board.service <= -1.851815: 1 (44/2)
##     :       :               :   :   :   on.board.service > -1.851815:
##     :       :               :   :   :   :...in.flight.service <= -1.396005: 1 (15/2)
##     :       :               :   :   :       in.flight.service > -1.396005:
##     :       :               :   :   :       :...seat.comfort <= 0: 3 (31)
##     :       :               :   :   :           seat.comfort > 0: 1 (2)
##     :       :               :   :   online.boarding > -1.86236:
##     :       :               :   :   :...baggage.handling <= 0.3117611: 1 (142/1)
##     :       :               :   :       baggage.handling > 0.3117611:
##     :       :               :   :       :...in.flight.service <= -1.396005: 1 (24/1)
##     :       :               :   :           in.flight.service > -1.396005:
##     :       :               :   :           :...seat.comfort <= 0.4234303: 3 (9)
##     :       :               :   :               seat.comfort > 0.4234303: 1 (2)
##     :       :               :   on.board.service > -1.074771:
##     :       :               :   :...in.flight.service > -1.396005:
##     :       :               :       :...seat.comfort <= 0: 3 (88/1)
##     :       :               :       :   seat.comfort > 0:
##     :       :               :       :   :...seat.comfort <= 0.4234303: 3 (5/2)
##     :       :               :       :       seat.comfort > 0.4234303: 1 (2)
##     :       :               :       in.flight.service <= -1.396005:
##     :       :               :       :...online.boarding <= -1.86236:
##     :       :               :           :...seat.comfort > 0: 1 (2)
##     :       :               :           :   seat.comfort <= 0: [S28]
##     :       :               :           online.boarding > -1.86236:
##     :       :               :           :...baggage.handling <= 0.3117611: 1 (11)
##     :       :               :               baggage.handling > 0.3117611:
##     :       :               :               :...in.flight.service <= -2.246018: 1 (4)
##     :       :               :                   in.flight.service > -2.246018: 3 (3)
##     :       :               online.boarding > -1.063626:
##     :       :               :...food.and.drink <= -0.911037:
##     :       :                   :...in.flight.service <= -1.396005:
##     :       :                   :   :...food.and.drink > -1.665185: 1 (25)
##     :       :                   :   :   food.and.drink <= -1.665185:
##     :       :                   :   :   :...on.board.service <= -1.851815: 1 (6)
##     :       :                   :   :       on.board.service > -1.851815: 3 (5/1)
##     :       :                   :   in.flight.service > -1.396005:
##     :       :                   :   :...online.boarding > 0.5338414: 2 (3)
##     :       :                   :       online.boarding <= 0.5338414:
##     :       :                   :       :...seat.comfort > 0:
##     :       :                   :           :...food.and.drink <= -1.665185: 3 (2)
##     :       :                   :           :   food.and.drink > -1.665185: 1 (6/1)
##     :       :                   :           seat.comfort <= 0: [S29]
##     :       :                   food.and.drink > -0.911037:
##     :       :                   :...seat.comfort > 0:
##     :       :                       :...on.board.service <= -1.074771: 1 (82)
##     :       :                       :   on.board.service > -1.074771:
##     :       :                       :   :...baggage.handling > 0.3117611:
##     :       :                       :       :...in.flight.service <= -1.396005: 1 (2)
##     :       :                       :       :   in.flight.service > -1.396005: 2 (25)
##     :       :                       :       baggage.handling <= 0.3117611:
##     :       :                       :       :...online.boarding <= 0.5338414: 1 (29)
##     :       :                       :           online.boarding > 0.5338414:
##     :       :                       :           :...seat.comfort <= 0.4234303: 1 (5)
##     :       :                       :               seat.comfort > 0.4234303: 2 (9/1)
##     :       :                       seat.comfort <= 0:
##     :       :                       :...baggage.handling <= 0.3117611: 1 (784/5)
##     :       :                           baggage.handling > 0.3117611:
##     :       :                           :...in.flight.service <= -1.396005: 1 (94)
##     :       :                               in.flight.service > -1.396005:
##     :       :                               :...on.board.service > -1.074771: [S30]
##     :       :                                   on.board.service <= -1.074771: [S31]
##     :       cleanliness > 0:
##     :       :...on.board.service <= -1.074771:
##     :           :...in.flight.entertainment <= 0.4810466:
##     :           :   :...baggage.handling <= 0.3117611:
##     :           :   :   :...food.and.drink <= 0.5972599: 1 (979/8)
##     :           :   :   :   food.and.drink > 0.5972599:
##     :           :   :   :   :...in.flight.entertainment <= 0: 1 (35)
##     :           :   :   :       in.flight.entertainment > 0:
##     :           :   :   :       :...online.boarding <= 0.5338414: 1 (2)
##     :           :   :   :           online.boarding > 0.5338414: 2 (6/1)
##     :           :   :   baggage.handling > 0.3117611:
##     :           :   :   :...in.flight.service <= -1.396005: 1 (267/1)
##     :           :   :       in.flight.service > -1.396005:
##     :           :   :       :...on.board.service <= -1.851815: 1 (52)
##     :           :   :           on.board.service > -1.851815:
##     :           :   :           :...in.flight.entertainment > 0: 2 (66)
##     :           :   :               in.flight.entertainment <= 0:
##     :           :   :               :...food.and.drink > 0.5972599: 2 (9)
##     :           :   :                   food.and.drink <= 0.5972599:
##     :           :   :                   :...online.boarding <= 0.5338414: 1 (16)
##     :           :   :                       online.boarding > 0.5338414: 2 (2)
##     :           :   in.flight.entertainment > 0.4810466:
##     :           :   :...in.flight.service <= -2.246018: 1 (194)
##     :           :       in.flight.service > -2.246018:
##     :           :       :...on.board.service <= -1.851815:
##     :           :           :...online.boarding <= 0.5338414:
##     :           :           :   :...baggage.handling <= 0.3117611: 1 (196)
##     :           :           :   :   baggage.handling > 0.3117611:
##     :           :           :   :   :...in.flight.service <= -1.396005: 1 (18)
##     :           :           :   :       in.flight.service > -1.396005:
##     :           :           :   :       :...seat.comfort <= 0: 1 (4/1)
##     :           :           :   :           seat.comfort > 0: 2 (17)
##     :           :           :   online.boarding > 0.5338414:
##     :           :           :   :...baggage.handling > 0.3117611: 2 (46)
##     :           :           :       baggage.handling <= 0.3117611:
##     :           :           :       :...in.flight.service <= -1.396005: 1 (28)
##     :           :           :           in.flight.service > -1.396005:
##     :           :           :           :...seat.comfort <= 0.4234303: 1 (3)
##     :           :           :               seat.comfort > 0.4234303: 2 (28)
##     :           :           on.board.service > -1.851815:
##     :           :           :...seat.comfort > 0:
##     :           :               :...in.flight.service <= -1.396005:
##     :           :               :   :...baggage.handling <= 0.3117611: 1 (39)
##     :           :               :   :   baggage.handling > 0.3117611: 2 (40)
##     :           :               :   in.flight.service > -1.396005:
##     :           :               :   :...seat.comfort > 0.4234303: 2 (229)
##     :           :               :       seat.comfort <= 0.4234303:
##     :           :               :       :...online.boarding <= -1.86236: 1 (3)
##     :           :               :           online.boarding > -1.86236: 2 (37/2)
##     :           :               seat.comfort <= 0:
##     :           :               :...online.boarding <= -1.063626: 1 (22)
##     :           :                   online.boarding > -1.063626:
##     :           :                   :...seat.comfort <= -1.092578:
##     :           :                       :...online.boarding <= 0: 1 (8/1)
##     :           :                       :   online.boarding > 0: 2 (3)
##     :           :                       seat.comfort > -1.092578:
##     :           :                       :...in.flight.service > -1.396005: 2 (9)
##     :           :                           in.flight.service <= -1.396005:
##     :           :                           :...baggage.handling <= 0.3117611: 1 (2)
##     :           :                               baggage.handling > 0.3117611: 2 (2)
##     :           on.board.service > -1.074771:
##     :           :...in.flight.service <= -2.246018:
##     :               :...on.board.service <= 0:
##     :               :   :...food.and.drink <= 0.5972599: 1 (108)
##     :               :   :   food.and.drink > 0.5972599:
##     :               :   :   :...baggage.handling <= 0.3117611: 1 (46)
##     :               :   :       baggage.handling > 0.3117611:
##     :               :   :       :...in.flight.entertainment <= 0: 1 (3)
##     :               :   :           in.flight.entertainment > 0:
##     :               :   :           :...seat.comfort > 0.4234303: 2 (36/1)
##     :               :   :               seat.comfort <= 0.4234303:
##     :               :   :               :...online.boarding <= -0.2648924: 1 (7)
##     :               :   :                   online.boarding > -0.2648924: 2 (3/1)
##     :               :   on.board.service > 0:
##     :               :   :...in.flight.entertainment > 0.4810466:
##     :               :       :...online.boarding > -1.063626: 2 (156)
##     :               :       :   online.boarding <= -1.063626:
##     :               :       :   :...on.board.service > 0.4793173: 2 (15)
##     :               :       :       on.board.service <= 0.4793173:
##     :               :       :       :...baggage.handling > 0.3117611: 2 (2)
##     :               :       :           baggage.handling <= 0.3117611:
##     :               :       :           :...online.boarding <= -1.86236: 1 (6)
##     :               :       :               online.boarding > -1.86236:
##     :               :       :               :...seat.comfort <= 0.4234303: 1 (4)
##     :               :       :                   seat.comfort > 0.4234303: 2 (4)
##     :               :       in.flight.entertainment <= 0.4810466:
##     :               :       :...baggage.handling > 0.3117611:
##     :               :           :...online.boarding <= -1.063626:
##     :               :           :   :...on.board.service <= 0.4793173: 1 (12/1)
##     :               :           :   :   on.board.service > 0.4793173: 2 (3)
##     :               :           :   online.boarding > -1.063626:
##     :               :           :   :...in.flight.entertainment <= 0:
##     :               :           :       :...seat.comfort <= 0.4234303: 1 (6)
##     :               :           :       :   seat.comfort > 0.4234303: 2 (2)
##     :               :           :       in.flight.entertainment > 0:
##     :               :           :       :...seat.comfort > 0: 2 (77)
##     :               :           :           seat.comfort <= 0:
##     :               :           :           :...seat.comfort <= -1.092578: 1 (2)
##     :               :           :               seat.comfort > -1.092578: 2 (2)
##     :               :           baggage.handling <= 0.3117611:
##     :               :           :...on.board.service <= 0.4793173: 1 (79)
##     :               :               on.board.service > 0.4793173:
##     :               :               :...online.boarding <= 0: 1 (22)
##     :               :                   online.boarding > 0: [S32]
##     :               in.flight.service > -2.246018:
##     :               :...in.flight.entertainment > 0.4810466: 2 (1021/2)
##     :                   in.flight.entertainment <= 0.4810466:
##     :                   :...seat.comfort <= 0:
##     :                       :...food.and.drink <= 0:
##     :                       :   :...online.boarding <= -1.063626: 3 (10)
##     :                       :   :   online.boarding > -1.063626:
##     :                       :   :   :...food.and.drink <= -0.911037: 3 (9/1)
##     :                       :   :       food.and.drink > -0.911037:
##     :                       :   :       :...baggage.handling <= 0.3117611: 1 (5)
##     :                       :   :           baggage.handling > 0.3117611: 2 (3/1)
##     :                       :   food.and.drink > 0:
##     :                       :   :...on.board.service > 0: 2 (65/5)
##     :                       :       on.board.service <= 0:
##     :                       :       :...cleanliness > 0.543176: 2 (3)
##     :                       :           cleanliness <= 0.543176:
##     :                       :           :...online.boarding <= -1.063626:
##     :                       :               :...seat.comfort <= -1.850582: 3 (3)
##     :                       :               :   seat.comfort > -1.850582: 1 (10)
##     :                       :               online.boarding > -1.063626: [S33]
##     :                       seat.comfort > 0:
##     :                       :...in.flight.entertainment <= -1.018808:
##     :                           :...on.board.service > 0.4793173:
##     :                           :   :...in.flight.service > -1.396005: 2 (17/1)
##     :                           :   :   in.flight.service <= -1.396005: [S34]
##     :                           :   on.board.service <= 0.4793173:
##     :                           :   :...in.flight.entertainment <= -1.768735: 1 (15)
##     :                           :       in.flight.entertainment > -1.768735:
##     :                           :       :...in.flight.service > -1.396005:
##     :                           :           :...food.and.drink <= 0: 1 (4/1)
##     :                           :           :   food.and.drink > 0: 2 (6)
##     :                           :           in.flight.service <= -1.396005:
##     :                           :           :...baggage.handling <= 0.3117611: 1 (14)
##     :                           :               baggage.handling > 0.3117611: [S35]
##     :                           in.flight.entertainment > -1.018808:
##     :                           :...on.board.service > 0:
##     :                               :...in.flight.service > -1.396005: 2 (544)
##     :                               :   in.flight.service <= -1.396005:
##     :                               :   :...online.boarding > -1.063626: 2 (154/2)
##     :                               :       online.boarding <= -1.063626:
##     :                               :       :...baggage.handling > 0.3117611: 2 (18)
##     :                               :           baggage.handling <= 0.3117611: [S36]
##     :                               on.board.service <= 0:
##     :                               :...online.boarding <= -1.86236:
##     :                                   :...seat.comfort > 0.4234303: 2 (4)
##     :                                   :   seat.comfort <= 0.4234303:
##     :                                   :   :...baggage.handling <= 0.3117611: 1 (31)
##     :                                   :       baggage.handling > 0.3117611: [S37]
##     :                                   online.boarding > -1.86236:
##     :                                   :...baggage.handling > 0.3117611: 2 (225/2)
##     :                                       baggage.handling <= 0.3117611:
##     :                                       :...in.flight.service <= -1.396005: 1 (56/1)
##     :                                           in.flight.service > -1.396005: [S38]
##     in.flight.service > 0:
##     :...on.board.service <= -1.074771:
##         :...cleanliness > 0:
##         :   :...baggage.handling <= -0.5356789:
##         :   :   :...in.flight.entertainment <= 0.4810466:
##         :   :   :   :...seat.comfort <= -1.850582:
##         :   :   :   :   :...online.boarding <= -1.86236: 3 (4/1)
##         :   :   :   :   :   online.boarding > -1.86236: 1 (26/1)
##         :   :   :   :   seat.comfort > -1.850582:
##         :   :   :   :   :...in.flight.service <= 0.3040217: 1 (815/1)
##         :   :   :   :       in.flight.service > 0.3040217:
##         :   :   :   :       :...baggage.handling <= -1.383119: 1 (180)
##         :   :   :   :           baggage.handling > -1.383119:
##         :   :   :   :           :...on.board.service <= -1.851815: 1 (50)
##         :   :   :   :               on.board.service > -1.851815:
##         :   :   :   :               :...in.flight.entertainment <= -1.018808: 1 (3)
##         :   :   :   :                   in.flight.entertainment > -1.018808: 2 (39)
##         :   :   :   in.flight.entertainment > 0.4810466:
##         :   :   :   :...on.board.service <= -1.851815:
##         :   :   :       :...in.flight.service <= 0.3040217:
##         :   :   :       :   :...online.boarding <= 0.5338414: 1 (208)
##         :   :   :       :   :   online.boarding > 0.5338414:
##         :   :   :       :   :   :...baggage.handling <= -1.383119: 1 (62)
##         :   :   :       :   :       baggage.handling > -1.383119:
##         :   :   :       :   :       :...seat.comfort <= 0.4234303: 1 (2)
##         :   :   :       :   :           seat.comfort > 0.4234303: 2 (26)
##         :   :   :       :   in.flight.service > 0.3040217:
##         :   :   :       :   :...baggage.handling <= -2.230559: 1 (46)
##         :   :   :       :       baggage.handling > -2.230559:
##         :   :   :       :       :...baggage.handling > -1.383119:
##         :   :   :       :           :...seat.comfort <= -1.092578: 1 (2)
##         :   :   :       :           :   seat.comfort > -1.092578: 2 (42)
##         :   :   :       :           baggage.handling <= -1.383119:
##         :   :   :       :           :...online.boarding <= 0.5338414: 1 (16)
##         :   :   :       :               online.boarding > 0.5338414:
##         :   :   :       :               :...seat.comfort <= 0.4234303: 1 (7)
##         :   :   :       :                   seat.comfort > 0.4234303: 2 (23)
##         :   :   :       on.board.service > -1.851815:
##         :   :   :       :...baggage.handling <= -2.230559:
##         :   :   :           :...in.flight.service <= 0.3040217: 1 (47)
##         :   :   :           :   in.flight.service > 0.3040217:
##         :   :   :           :   :...online.boarding <= 0.5338414: 1 (23)
##         :   :   :           :       online.boarding > 0.5338414: 2 (31/2)
##         :   :   :           baggage.handling > -2.230559:
##         :   :   :           :...seat.comfort <= 0:
##         :   :   :               :...in.flight.service > 0.3040217: 2 (8)
##         :   :   :               :   in.flight.service <= 0.3040217:
##         :   :   :               :   :...online.boarding <= -0.2648924: 1 (24)
##         :   :   :               :       online.boarding > -0.2648924:
##         :   :   :               :       :...baggage.handling <= -1.383119: 1 (2)
##         :   :   :               :           baggage.handling > -1.383119: 2 (7)
##         :   :   :               seat.comfort > 0:
##         :   :   :               :...baggage.handling > -1.383119: 2 (235)
##         :   :   :                   baggage.handling <= -1.383119:
##         :   :   :                   :...in.flight.service > 0.3040217: 2 (37)
##         :   :   :                       in.flight.service <= 0.3040217:
##         :   :   :                       :...online.boarding <= 0.5338414: 1 (16)
##         :   :   :                           online.boarding > 0.5338414: 2 (27/1)
##         :   :   baggage.handling > -0.5356789:
##         :   :   :...on.board.service <= -1.851815:
##         :   :       :...in.flight.entertainment <= 0.4810466:
##         :   :       :   :...baggage.handling <= 0.3117611:
##         :   :       :   :   :...in.flight.service <= 0.3040217: 1 (277/1)
##         :   :       :   :   :   in.flight.service > 0.3040217: 2 (46/1)
##         :   :       :   :   baggage.handling > 0.3117611:
##         :   :       :   :   :...in.flight.entertainment <= -1.018808: 1 (6/2)
##         :   :       :   :       in.flight.entertainment > -1.018808:
##         :   :       :   :       :...seat.comfort > -1.092578: 2 (91)
##         :   :       :   :           seat.comfort <= -1.092578:
##         :   :       :   :           :...online.boarding <= -1.86236: 3 (2)
##         :   :       :   :               online.boarding > -1.86236: 1 (2/1)
##         :   :       :   in.flight.entertainment > 0.4810466:
##         :   :       :   :...seat.comfort > -1.092578: 2 (362/1)
##         :   :       :       seat.comfort <= -1.092578:
##         :   :       :       :...online.boarding > -0.2648924: 2 (9)
##         :   :       :           online.boarding <= -0.2648924:
##         :   :       :           :...seat.comfort <= -1.850582: 1 (7)
##         :   :       :               seat.comfort > -1.850582:
##         :   :       :               :...online.boarding <= -1.063626: 1 (4/1)
##         :   :       :                   online.boarding > -1.063626: 2 (2)
##         :   :       on.board.service > -1.851815:
##         :   :       :...seat.comfort > 0:
##         :   :           :...food.and.drink > -0.1568885: 2 (901/6)
##         :   :           :   food.and.drink <= -0.1568885:
##         :   :           :   :...in.flight.entertainment <= -1.018808: 1 (6/2)
##         :   :           :       in.flight.entertainment > -1.018808:
##         :   :           :       :...online.boarding > -1.86236: 2 (49/2)
##         :   :           :           online.boarding <= -1.86236:
##         :   :           :           :...seat.comfort <= 0.4234303: 3 (3)
##         :   :           :               seat.comfort > 0.4234303: 2 (2)
##         :   :           seat.comfort <= 0:
##         :   :           :...cleanliness > 0.543176: 2 (37)
##         :   :               cleanliness <= 0.543176:
##         :   :               :...online.boarding > -1.063626:
##         :   :                   :...seat.comfort > -1.850582: 2 (20)
##         :   :                   :   seat.comfort <= -1.850582:
##         :   :                   :   :...online.boarding <= 0: 1 (5/1)
##         :   :                   :       online.boarding > 0: 2 (5)
##         :   :                   online.boarding <= -1.063626:
##         :   :                   :...seat.comfort <= -1.850582: 3 (6)
##         :   :                       seat.comfort > -1.850582:
##         :   :                       :...seat.comfort <= -1.092578:
##         :   :                           :...online.boarding <= -1.86236: 3 (2)
##         :   :                           :   online.boarding > -1.86236: 1 (5)
##         :   :                           seat.comfort > -1.092578:
##         :   :                           :...online.boarding <= -1.86236: 1 (6/1)
##         :   :                               online.boarding > -1.86236: 2 (3)
##         :   cleanliness <= 0:
##         :   :...food.and.drink > 0:
##         :       :...baggage.handling > 0.3117611: 2 (11)
##         :       :   baggage.handling <= 0.3117611:
##         :       :   :...in.flight.service <= 0.3040217: 1 (14/2)
##         :       :       in.flight.service > 0.3040217: 2 (4/1)
##         :       food.and.drink <= 0:
##         :       :...baggage.handling > -0.5356789:
##         :           :...online.boarding <= -0.2648924:
##         :           :   :...on.board.service > -1.851815:
##         :           :   :   :...seat.comfort > 0:
##         :           :   :   :   :...online.boarding <= -1.86236: 3 (6/2)
##         :           :   :   :   :   online.boarding > -1.86236: 1 (13/1)
##         :           :   :   :   seat.comfort <= 0:
##         :           :   :   :   :...in.flight.entertainment <= 0: 3 (300)
##         :           :   :   :       in.flight.entertainment > 0:
##         :           :   :   :       :...food.and.drink <= -0.911037: 3 (6)
##         :           :   :   :           food.and.drink > -0.911037: 2 (9/2)
##         :           :   :   on.board.service <= -1.851815:
##         :           :   :   :...online.boarding <= -1.063626:
##         :           :   :       :...seat.comfort <= 0: 3 (123)
##         :           :   :       :   seat.comfort > 0:
##         :           :   :       :   :...online.boarding <= -1.86236: 3 (3/1)
##         :           :   :       :       online.boarding > -1.86236: 1 (7)
##         :           :   :       online.boarding > -1.063626:
##         :           :   :       :...in.flight.service > 0.3040217: 3 (21/1)
##         :           :   :           in.flight.service <= 0.3040217:
##         :           :   :           :...seat.comfort <= -1.092578: 3 (16)
##         :           :   :               seat.comfort > -1.092578:
##         :           :   :               :...baggage.handling <= 0.3117611: 1 (194/1)
##         :           :   :                   baggage.handling > 0.3117611: 3 (7)
##         :           :   online.boarding > -0.2648924:
##         :           :   :...seat.comfort <= -1.092578: 3 (14/2)
##         :           :       seat.comfort > -1.092578:
##         :           :       :...in.flight.service <= 0.3040217:
##         :           :           :...baggage.handling <= 0.3117611: 1 (86/1)
##         :           :           :   baggage.handling > 0.3117611:
##         :           :           :   :...online.boarding <= 0: 3 (4)
##         :           :           :       online.boarding > 0:
##         :           :           :       :...on.board.service <= -1.851815: 1 (12)
##         :           :           :           on.board.service > -1.851815: 2 (5)
##         :           :           in.flight.service > 0.3040217:
##         :           :           :...online.boarding <= 0: 3 (3/1)
##         :           :               online.boarding > 0:
##         :           :               :...on.board.service > -1.851815: 2 (24)
##         :           :                   on.board.service <= -1.851815:
##         :           :                   :...baggage.handling <= 0.3117611: 1 (5)
##         :           :                       baggage.handling > 0.3117611: 2 (7)
##         :           baggage.handling <= -0.5356789:
##         :           :...seat.comfort <= -1.092578:
##         :               :...on.board.service > -1.851815: 3 (44/2)
##         :               :   on.board.service <= -1.851815:
##         :               :   :...online.boarding > 0: 1 (7)
##         :               :       online.boarding <= 0:
##         :               :       :...seat.comfort <= -1.850582: 3 (15)
##         :               :           seat.comfort > -1.850582:
##         :               :           :...online.boarding > -1.063626: 1 (10)
##         :               :               online.boarding <= -1.063626:
##         :               :               :...baggage.handling <= -1.383119: 1 (3/1)
##         :               :                   baggage.handling > -1.383119: 3 (5)
##         :               seat.comfort > -1.092578:
##         :               :...online.boarding <= -1.86236:
##         :                   :...on.board.service <= -1.851815:
##         :                   :   :...in.flight.service <= 0.3040217: 1 (40/1)
##         :                   :   :   in.flight.service > 0.3040217:
##         :                   :   :   :...baggage.handling <= -1.383119: 1 (4)
##         :                   :   :       baggage.handling > -1.383119: 3 (2)
##         :                   :   on.board.service > -1.851815:
##         :                   :   :...baggage.handling <= -1.383119: 1 (11/1)
##         :                   :       baggage.handling > -1.383119:
##         :                   :       :...seat.comfort <= 0: 3 (34/1)
##         :                   :           seat.comfort > 0: 1 (9/1)
##         :                   online.boarding > -1.86236:
##         :                   :...in.flight.service <= 0.3040217: 1 (648/5)
##         :                       in.flight.service > 0.3040217:
##         :                       :...baggage.handling <= -1.383119: 1 (88)
##         :                           baggage.handling > -1.383119:
##         :                           :...online.boarding > -0.2648924: 1 (27)
##         :                               online.boarding <= -0.2648924:
##         :                               :...online.boarding <= -1.063626: 3 (13/1)
##         :                                   online.boarding > -1.063626:
##         :                                   :...on.board.service <= -1.851815: 1 (10)
##         :                                       on.board.service > -1.851815: 3 (10)
##         on.board.service > -1.074771:
##         :...in.flight.entertainment <= 0:
##             :...online.boarding > -0.2648924:
##             :   :...seat.comfort <= -1.092578:
##             :   :   :...online.boarding > 0.5338414:
##             :   :   :   :...seat.comfort > -1.850582: 2 (28/3)
##             :   :   :   :   seat.comfort <= -1.850582:
##             :   :   :   :   :...baggage.handling <= 0.3117611: 3 (7)
##             :   :   :   :       baggage.handling > 0.3117611: 2 (5/1)
##             :   :   :   online.boarding <= 0.5338414:
##             :   :   :   :...on.board.service <= 0.4793173: 3 (121/7)
##             :   :   :       on.board.service > 0.4793173:
##             :   :   :       :...seat.comfort <= -1.850582: 3 (21)
##             :   :   :           seat.comfort > -1.850582:
##             :   :   :           :...online.boarding > 0: 2 (16/1)
##             :   :   :               online.boarding <= 0:
##             :   :   :               :...in.flight.service <= 0.3040217: 3 (3)
##             :   :   :                   in.flight.service > 0.3040217: 2 (4/1)
##             :   :   seat.comfort > -1.092578:
##             :   :   :...baggage.handling > -0.5356789:
##             :   :       :...online.boarding > 0: 2 (1769/8)
##             :   :       :   online.boarding <= 0:
##             :   :       :   :...on.board.service > 0: 2 (192)
##             :   :       :       on.board.service <= 0:
##             :   :       :       :...in.flight.service > 0.3040217: 2 (63)
##             :   :       :           in.flight.service <= 0.3040217:
##             :   :       :           :...baggage.handling > 0.3117611: 2 (23)
##             :   :       :               baggage.handling <= 0.3117611:
##             :   :       :               :...seat.comfort <= 0: 3 (20)
##             :   :       :                   seat.comfort > 0: 2 (5)
##             :   :       baggage.handling <= -0.5356789:
##             :   :       :...on.board.service <= 0:
##             :   :           :...in.flight.service <= 0.3040217: 1 (93)
##             :   :           :   in.flight.service > 0.3040217:
##             :   :           :   :...baggage.handling <= -1.383119: 1 (32/1)
##             :   :           :       baggage.handling > -1.383119: 2 (22/2)
##             :   :           on.board.service > 0:
##             :   :           :...online.boarding <= 0:
##             :   :               :...baggage.handling <= -2.230559:
##             :   :               :   :...on.board.service <= 0.4793173: 1 (2)
##             :   :               :   :   on.board.service > 0.4793173: 3 (3/1)
##             :   :               :   baggage.handling > -2.230559:
##             :   :               :   :...in.flight.service > 0.3040217: 2 (6)
##             :   :               :       in.flight.service <= 0.3040217:
##             :   :               :       :...on.board.service <= 0.4793173: 3 (12)
##             :   :               :           on.board.service > 0.4793173:
##             :   :               :           :...baggage.handling <= -1.383119: 3 (3)
##             :   :               :               baggage.handling > -1.383119: 2 (2)
##             :   :               online.boarding > 0:
##             :   :               :...baggage.handling > -1.383119:
##             :   :                   :...in.flight.entertainment > -1.768735: 2 (113)
##             :   :                   :   in.flight.entertainment <= -1.768735:
##             :   :                   :   :...on.board.service <= 0.4793173: 1 (4/1)
##             :   :                   :       on.board.service > 0.4793173: 2 (4/1)
##             :   :                   baggage.handling <= -1.383119:
##             :   :                   :...in.flight.service > 0.3040217:
##             :   :                       :...baggage.handling > -2.230559: 2 (20)
##             :   :                       :   baggage.handling <= -2.230559:
##             :   :                       :   :...on.board.service <= 0.4793173: 1 (8/1)
##             :   :                       :       on.board.service > 0.4793173: 2 (12/1)
##             :   :                       in.flight.service <= 0.3040217:
##             :   :                       :...on.board.service <= 0.4793173: 1 (22/1)
##             :   :                           on.board.service > 0.4793173:
##             :   :                           :...baggage.handling <= -2.230559: 1 (8/1)
##             :   :                               baggage.handling > -2.230559: [S39]
##             :   online.boarding <= -0.2648924:
##             :   :...baggage.handling > -0.5356789:
##             :       :...seat.comfort > 0:
##             :       :   :...online.boarding > -1.86236: 2 (276/4)
##             :       :   :   online.boarding <= -1.86236:
##             :       :   :   :...seat.comfort > 0.4234303: 2 (44)
##             :       :   :       seat.comfort <= 0.4234303:
##             :       :   :       :...on.board.service > 0.4793173:
##             :       :   :           :...baggage.handling > 0.3117611: 2 (7)
##             :       :   :           :   baggage.handling <= 0.3117611:
##             :       :   :           :   :...in.flight.service <= 0.3040217: 3 (2)
##             :       :   :           :       in.flight.service > 0.3040217: 2 (3)
##             :       :   :           on.board.service <= 0.4793173:
##             :       :   :           :...in.flight.service <= 0.3040217: 3 (21/1)
##             :       :   :               in.flight.service > 0.3040217:
##             :       :   :               :...baggage.handling <= 0.3117611: 3 (6)
##             :       :   :                   baggage.handling > 0.3117611:
##             :       :   :                   :...on.board.service <= 0: 3 (3/1)
##             :       :   :                       on.board.service > 0: 2 (3)
##             :       :   seat.comfort <= 0:
##             :       :   :...online.boarding <= -1.063626:
##             :       :       :...on.board.service <= 0.4793173: 3 (1315/1)
##             :       :       :   on.board.service > 0.4793173:
##             :       :       :   :...baggage.handling <= 0.3117611: 3 (299)
##             :       :       :       baggage.handling > 0.3117611:
##             :       :       :       :...in.flight.service <= 0.3040217: 3 (146)
##             :       :       :           in.flight.service > 0.3040217:
##             :       :       :           :...online.boarding <= -1.86236: 3 (57/1)
##             :       :       :               online.boarding > -1.86236:
##             :       :       :               :...seat.comfort <= -1.092578: 3 (9)
##             :       :       :                   seat.comfort > -1.092578: 2 (84)
##             :       :       online.boarding > -1.063626:
##             :       :       :...seat.comfort <= -1.092578: 3 (136)
##             :       :           seat.comfort > -1.092578:
##             :       :           :...on.board.service > 0.4793173:
##             :       :               :...food.and.drink > -0.911037: 2 (451/5)
##             :       :               :   food.and.drink <= -0.911037:
##             :       :               :   :...in.flight.service <= 0.3040217: 3 (8)
##             :       :               :       in.flight.service > 0.3040217:
##             :       :               :       :...food.and.drink <= -1.665185: 3 (4)
##             :       :               :           food.and.drink > -1.665185: 2 (2)
##             :       :               on.board.service <= 0.4793173:
##             :       :               :...in.flight.service <= 0.3040217:
##             :       :                   :...baggage.handling <= 0.3117611: 3 (497)
##             :       :                   :   baggage.handling > 0.3117611:
##             :       :                   :   :...on.board.service > 0: 2 (121/4)
##             :       :                   :       on.board.service <= 0: [S40]
##             :       :                   in.flight.service > 0.3040217:
##             :       :                   :...on.board.service > 0: 2 (230/3)
##             :       :                       on.board.service <= 0:
##             :       :                       :...baggage.handling <= 0.3117611: [S41]
##             :       :                           baggage.handling > 0.3117611: [S42]
##             :       baggage.handling <= -0.5356789:
##             :       :...seat.comfort > 0:
##             :           :...on.board.service <= 0:
##             :           :   :...in.flight.service > 0.3040217:
##             :           :   :   :...baggage.handling <= -1.383119: 1 (5)
##             :           :   :   :   baggage.handling > -1.383119: 2 (4/1)
##             :           :   :   in.flight.service <= 0.3040217:
##             :           :   :   :...online.boarding > -1.86236: 1 (18)
##             :           :   :       online.boarding <= -1.86236:
##             :           :   :       :...seat.comfort <= 0.4234303: 3 (4/1)
##             :           :   :           seat.comfort > 0.4234303: 1 (4)
##             :           :   on.board.service > 0:
##             :           :   :...food.and.drink <= -0.911037: 3 (2)
##             :           :       food.and.drink > -0.911037:
##             :           :       :...baggage.handling <= -1.383119:
##             :           :           :...in.flight.service <= 0.3040217: 1 (13/1)
##             :           :           :   in.flight.service > 0.3040217:
##             :           :           :   :...seat.comfort <= 0.4234303: 1 (4/2)
##             :           :           :       seat.comfort > 0.4234303: 2 (2)
##             :           :           baggage.handling > -1.383119:
##             :           :           :...cleanliness > 0: 1 (3/1)
##             :           :               cleanliness <= 0:
##             :           :               :...seat.comfort > 0.4234303: 2 (13)
##             :           :                   seat.comfort <= 0.4234303:
##             :           :                   :...online.boarding > -1.063626: 2 (7)
##             :           :                       online.boarding <= -1.063626: [S43]
##             :           seat.comfort <= 0:
##             :           :...on.board.service <= 0:
##             :               :...online.boarding <= -1.063626:
##             :               :   :...baggage.handling > -1.383119: 3 (123)
##             :               :   :   baggage.handling <= -1.383119:
##             :               :   :   :...in.flight.service <= 0.3040217:
##             :               :   :       :...online.boarding > -1.86236: 1 (10)
##             :               :   :       :   online.boarding <= -1.86236:
##             :               :   :       :   :...baggage.handling <= -2.230559: 1 (3/1)
##             :               :   :       :       baggage.handling > -2.230559: 3 (2)
##             :               :   :       in.flight.service > 0.3040217:
##             :               :   :       :...online.boarding <= -1.86236: 3 (7)
##             :               :   :           online.boarding > -1.86236:
##             :               :   :           :...baggage.handling <= -2.230559: 1 (4/1)
##             :               :   :               baggage.handling > -2.230559: 3 (4)
##             :               :   online.boarding > -1.063626:
##             :               :   :...seat.comfort <= -1.092578: 3 (14/1)
##             :               :       seat.comfort > -1.092578:
##             :               :       :...in.flight.entertainment <= -1.018808:
##             :               :           :...food.and.drink <= 0: 3 (9)
##             :               :           :   food.and.drink > 0: 1 (3)
##             :               :           in.flight.entertainment > -1.018808:
##             :               :           :...in.flight.service <= 0.3040217: 1 (203)
##             :               :               in.flight.service > 0.3040217:
##             :               :               :...baggage.handling <= -1.383119: 1 (16)
##             :               :                   baggage.handling > -1.383119: 3 (11)
##             :               on.board.service > 0:
##             :               :...baggage.handling > -1.383119:
##             :                   :...in.flight.service <= 0.3040217: 3 (328)
##             :                   :   in.flight.service > 0.3040217:
##             :                   :   :...on.board.service <= 0.4793173: 3 (41)
##             :                   :       on.board.service > 0.4793173:
##             :                   :       :...online.boarding <= -1.063626: 3 (16)
##             :                   :           online.boarding > -1.063626:
##             :                   :           :...seat.comfort <= -1.092578: 3 (3)
##             :                   :               seat.comfort > -1.092578: 2 (16)
##             :                   baggage.handling <= -1.383119:
##             :                   :...online.boarding <= -1.063626: 3 (63/4)
##             :                       online.boarding > -1.063626:
##             :                       :...food.and.drink <= -0.911037: 3 (3)
##             :                           food.and.drink > -0.911037:
##             :                           :...in.flight.service > 0.3040217:
##             :                               :...baggage.handling > -2.230559: 3 (12)
##             :                               :   baggage.handling <= -2.230559: [S44]
##             :                               in.flight.service <= 0.3040217:
##             :                               :...on.board.service <= 0.4793173:
##             :                                   :...seat.comfort <= -1.092578: 3 (3/1)
##             :                                   :   seat.comfort > -1.092578: 1 (15)
##             :                                   on.board.service > 0.4793173: [S45]
##             in.flight.entertainment > 0:
##             :...baggage.handling > -1.383119:
##                 :...food.and.drink <= -1.665185:
##                 :   :...seat.comfort > 0:
##                 :   :   :...online.boarding > -1.063626: 2 (324)
##                 :   :   :   online.boarding <= -1.063626:
##                 :   :   :   :...in.flight.entertainment > 0.4810466: 2 (69)
##                 :   :   :       in.flight.entertainment <= 0.4810466:
##                 :   :   :       :...cleanliness <= 0:
##                 :   :   :           :...seat.comfort <= 0.4234303: 3 (19)
##                 :   :   :           :   seat.comfort > 0.4234303:
##                 :   :   :           :   :...online.boarding <= -1.86236: 3 (9/1)
##                 :   :   :           :       online.boarding > -1.86236: 2 (4)
##                 :   :   :           cleanliness > 0:
##                 :   :   :           :...seat.comfort > 0.4234303: 2 (27)
##                 :   :   :               seat.comfort <= 0.4234303:
##                 :   :   :               :...cleanliness > 0.543176: 2 (12)
##                 :   :   :                   cleanliness <= 0.543176:
##                 :   :   :                   :...online.boarding <= -1.86236: 3 (7)
##                 :   :   :                       online.boarding > -1.86236: 2 (2)
##                 :   :   seat.comfort <= 0:
##                 :   :   :...in.flight.entertainment <= 0.4810466:
##                 :   :       :...online.boarding <= -1.063626:
##                 :   :       :   :...cleanliness <= 0.543176: 3 (107)
##                 :   :       :   :   cleanliness > 0.543176:
##                 :   :       :   :   :...seat.comfort <= -1.092578: 3 (17)
##                 :   :       :   :       seat.comfort > -1.092578:
##                 :   :       :   :       :...online.boarding <= -1.86236: 3 (8)
##                 :   :       :   :           online.boarding > -1.86236: 2 (4)
##                 :   :       :   online.boarding > -1.063626:
##                 :   :       :   :...cleanliness <= 0:
##                 :   :       :       :...online.boarding <= 0.5338414: 3 (66/1)
##                 :   :       :       :   online.boarding > 0.5338414:
##                 :   :       :       :   :...seat.comfort <= -1.092578: 3 (12)
##                 :   :       :       :       seat.comfort > -1.092578: 2 (4)
##                 :   :       :       cleanliness > 0:
##                 :   :       :       :...seat.comfort > -1.092578: 2 (83)
##                 :   :       :           seat.comfort <= -1.092578:
##                 :   :       :           :...online.boarding <= 0.5338414:
##                 :   :       :               :...seat.comfort <= -1.850582: 3 (26)
##                 :   :       :               :   seat.comfort > -1.850582:
##                 :   :       :               :   :...cleanliness <= 0.543176: 3 (12)
##                 :   :       :               :       cleanliness > 0.543176: 2 (10/1)
##                 :   :       :               online.boarding > 0.5338414:
##                 :   :       :               :...seat.comfort > -1.850582: 2 (19)
##                 :   :       :                   seat.comfort <= -1.850582:
##                 :   :       :                   :...cleanliness <= 0.543176: 3 (7)
##                 :   :       :                       cleanliness > 0.543176: 2 (5)
##                 :   :       in.flight.entertainment > 0.4810466:
##                 :   :       :...online.boarding > -1.063626:
##                 :   :           :...cleanliness > 0: 2 (125)
##                 :   :           :   cleanliness <= 0:
##                 :   :           :   :...seat.comfort > -1.850582: 2 (57/1)
##                 :   :           :       seat.comfort <= -1.850582:
##                 :   :           :       :...online.boarding <= 0.5338414: 3 (10)
##                 :   :           :           online.boarding > 0.5338414: 2 (5)
##                 :   :           online.boarding <= -1.063626:
##                 :   :           :...cleanliness <= 0:
##                 :   :               :...seat.comfort <= -1.092578: 3 (21)
##                 :   :               :   seat.comfort > -1.092578:
##                 :   :               :   :...online.boarding <= -1.86236: 3 (8)
##                 :   :               :       online.boarding > -1.86236: 2 (6/1)
##                 :   :               cleanliness > 0:
##                 :   :               :...seat.comfort > -1.092578: 2 (28)
##                 :   :                   seat.comfort <= -1.092578:
##                 :   :                   :...cleanliness > 0.543176:
##                 :   :                       :...seat.comfort > -1.850582: 2 (13)
##                 :   :                       :   seat.comfort <= -1.850582: [S46]
##                 :   :                       cleanliness <= 0.543176:
##                 :   :                       :...online.boarding <= -1.86236: 3 (13)
##                 :   :                           online.boarding > -1.86236:
##                 :   :                           :...seat.comfort <= -1.850582: 3 (5)
##                 :   :                               seat.comfort > -1.850582: [S47]
##                 :   food.and.drink > -1.665185:
##                 :   :...seat.comfort > 0: 2 (40030/35)
##                 :       seat.comfort <= 0:
##                 :       :...food.and.drink > -0.1568885:
##                 :           :...baggage.handling <= -0.5356789:
##                 :           :   :...on.board.service > 0: 2 (189/4)
##                 :           :   :   on.board.service <= 0:
##                 :           :   :   :...in.flight.entertainment > 0.4810466: 2 (38)
##                 :           :   :       in.flight.entertainment <= 0.4810466:
##                 :           :   :       :...seat.comfort <= -1.850582:
##                 :           :   :           :...online.boarding <= -0.2648924: 3 (12)
##                 :           :   :           :   online.boarding > -0.2648924: 1 (4/1)
##                 :           :   :           seat.comfort > -1.850582:
##                 :           :   :           :...in.flight.service > 0.3040217: 2 (7)
##                 :           :   :               in.flight.service <= 0.3040217: [S48]
##                 :           :   baggage.handling > -0.5356789:
##                 :           :   :...online.boarding <= -1.86236:
##                 :           :       :...in.flight.entertainment > 0.4810466: 2 (218)
##                 :           :       :   in.flight.entertainment <= 0.4810466:
##                 :           :       :   :...seat.comfort > -1.850582:
##                 :           :       :       :...cleanliness > 0: 2 (118)
##                 :           :       :       :   cleanliness <= 0:
##                 :           :       :       :   :...seat.comfort > -1.092578: 2 (17)
##                 :           :       :       :       seat.comfort <= -1.092578: [S49]
##                 :           :       :       seat.comfort <= -1.850582:
##                 :           :       :       :...cleanliness > 0.543176: 2 (9)
##                 :           :       :           cleanliness <= 0.543176:
##                 :           :       :           :...food.and.drink > 0.5972599: [S50]
##                 :           :       :               food.and.drink <= 0.5972599: [S51]
##                 :           :       online.boarding > -1.86236:
##                 :           :       :...seat.comfort > -1.850582: 2 (1691)
##                 :           :           seat.comfort <= -1.850582:
##                 :           :           :...cleanliness <= 0:
##                 :           :               :...online.boarding > 0: 2 (44)
##                 :           :               :   online.boarding <= 0: [S52]
##                 :           :               cleanliness > 0:
##                 :           :               :...on.board.service > 0: 2 (485)
##                 :           :                   on.board.service <= 0: [S53]
##                 :           food.and.drink <= -0.1568885:
##                 :           :...online.boarding <= -1.063626:
##                 :               :...in.flight.service > 0.3040217:
##                 :               :   :...seat.comfort <= -1.850582:
##                 :               :   :   :...cleanliness <= 0:
##                 :               :   :   :   :...online.boarding <= -1.86236: 3 (9)
##                 :               :   :   :   :   online.boarding > -1.86236: [S54]
##                 :               :   :   :   cleanliness > 0:
##                 :               :   :   :   :...food.and.drink > -0.911037: 2 (22)
##                 :               :   :   :       food.and.drink <= -0.911037: [S55]
##                 :               :   :   seat.comfort > -1.850582:
##                 :               :   :   :...cleanliness > 0: 2 (97)
##                 :               :   :       cleanliness <= 0:
##                 :               :   :       :...online.boarding > -1.86236: 2 (37)
##                 :               :   :           online.boarding <= -1.86236: [S56]
##                 :               :   in.flight.service <= 0.3040217:
##                 :               :   :...seat.comfort > -1.092578:
##                 :               :       :...food.and.drink > -0.911037:
##                 :               :       :   :...cleanliness > 0: 2 (36)
##                 :               :       :   :   cleanliness <= 0: [S57]
##                 :               :       :   food.and.drink <= -0.911037:
##                 :               :       :   :...cleanliness <= 0: 3 (16/1)
##                 :               :       :       cleanliness > 0: [S58]
##                 :               :       seat.comfort <= -1.092578:
##                 :               :       :...cleanliness > 0.543176:
##                 :               :           :...online.boarding <= -1.86236: [S59]
##                 :               :           :   online.boarding > -1.86236:
##                 :               :           :   :...seat.comfort > -1.850582: 2 (14)
##                 :               :           :       seat.comfort <= -1.850582: [S60]
##                 :               :           cleanliness <= 0.543176:
##                 :               :           :...food.and.drink <= -0.911037: 3 (84)
##                 :               :               food.and.drink > -0.911037:
##                 :               :               :...cleanliness <= 0: 3 (24)
##                 :               :                   cleanliness > 0: [S61]
##                 :               online.boarding > -1.063626:
##                 :               :...in.flight.service > 0.3040217: 2 (416)
##                 :                   in.flight.service <= 0.3040217:
##                 :                   :...seat.comfort > -1.092578:
##                 :                       :...on.board.service > 0: 2 (293/1)
##                 :                       :   on.board.service <= 0:
##                 :                       :   :...food.and.drink <= -0.911037: 3 (5/1)
##                 :                       :       food.and.drink > -0.911037: [S62]
##                 :                       seat.comfort <= -1.092578:
##                 :                       :...cleanliness <= 0:
##                 :                           :...online.boarding > 0.5338414: [S63]
##                 :                           :   online.boarding <= 0.5338414: [S64]
##                 :                           cleanliness > 0:
##                 :                           :...seat.comfort > -1.850582: 2 (96)
##                 :                               seat.comfort <= -1.850582:
##                 :                               :...cleanliness > 0.543176: 2 (34/1)
##                 :                                   cleanliness <= 0.543176: [S65]
##                 baggage.handling <= -1.383119:
##                 :...on.board.service > 0.4793173:
##                     :...online.boarding > -1.063626: 2 (416)
##                     :   online.boarding <= -1.063626:
##                     :   :...in.flight.entertainment > 0.4810466: 2 (44/2)
##                     :       in.flight.entertainment <= 0.4810466:
##                     :       :...baggage.handling > -2.230559: 2 (18)
##                     :           baggage.handling <= -2.230559:
##                     :           :...in.flight.service <= 0.3040217: 1 (11/1)
##                     :               in.flight.service > 0.3040217: 2 (2)
##                     on.board.service <= 0.4793173:
##                     :...in.flight.service > 0.3040217:
##                         :...on.board.service > 0: 2 (202/1)
##                         :   on.board.service <= 0:
##                         :   :...baggage.handling <= -2.230559:
##                         :       :...in.flight.entertainment <= 0.4810466: 1 (37)
##                         :       :   in.flight.entertainment > 0.4810466:
##                         :       :   :...seat.comfort <= 0: 1 (4/1)
##                         :       :       seat.comfort > 0: 2 (34)
##                         :       baggage.handling > -2.230559:
##                         :       :...online.boarding > -1.86236: 2 (91/1)
##                         :           online.boarding <= -1.86236:
##                         :           :...cleanliness <= 0.543176: 1 (4/1)
##                         :               cleanliness > 0.543176: 2 (4)
##                         in.flight.service <= 0.3040217:
##                         :...baggage.handling <= -2.230559:
##                             :...in.flight.entertainment <= 0.4810466:
##                             :   :...seat.comfort > -1.092578: 1 (197/4)
##                             :   :   seat.comfort <= -1.092578:
##                             :   :   :...food.and.drink <= -0.911037: 3 (8)
##                             :   :       food.and.drink > -0.911037: 1 (5)
##                             :   in.flight.entertainment > 0.4810466:
##                             :   :...on.board.service <= 0: 1 (44)
##                             :       on.board.service > 0:
##                             :       :...online.boarding <= -1.063626: 1 (4)
##                             :           online.boarding > -1.063626: 2 (48/1)
##                             baggage.handling > -2.230559:
##                             :...seat.comfort > 0.4234303:
##                                 :...food.and.drink > 0.5972599: 2 (93)
##                                 :   food.and.drink <= 0.5972599:
##                                 :   :...on.board.service <= 0: 1 (3)
##                                 :       on.board.service > 0: 2 (22/2)
##                                 seat.comfort <= 0.4234303:
##                                 :...on.board.service <= 0:
##                                     :...in.flight.entertainment <= 0.4810466: 1 (69)
##                                     :   in.flight.entertainment > 0.4810466:
##                                     :   :...online.boarding <= 0: 1 (7)
##                                     :       online.boarding > 0: 2 (6/1)
##                                     on.board.service > 0:
##                                     :...food.and.drink <= -1.665185: 3 (6)
##                                         food.and.drink > -1.665185:
##                                         :...seat.comfort > 0:
##                                             :...online.boarding <= -1.063626:
##                                             :   :...cleanliness <= 0.543176: 1 (11)
##                                             :   :   cleanliness > 0.543176: 2 (3)
##                                             :   online.boarding > -1.063626:
##                                             :   :...cleanliness > 0: 2 (71)
##                                             :       cleanliness <= 0: [S66]
##                                             seat.comfort <= 0: [S67]
## 
## SubTree [S1]
## 
## in.flight.entertainment <= -1.018808: 3 (6)
## in.flight.entertainment > -1.018808:
## :...baggage.handling <= -0.5356789: 1 (7/1)
##     baggage.handling > -0.5356789: 3 (4/1)
## 
## SubTree [S2]
## 
## online.boarding <= 0.5338414: 3 (6)
## online.boarding > 0.5338414: 2 (9)
## 
## SubTree [S3]
## 
## baggage.handling <= 0.3117611: 3 (5)
## baggage.handling > 0.3117611: 2 (16)
## 
## SubTree [S4]
## 
## online.boarding <= 0.5338414: 3 (7)
## online.boarding > 0.5338414: 2 (4/1)
## 
## SubTree [S5]
## 
## baggage.handling <= 0.3117611: 3 (2)
## baggage.handling > 0.3117611: 2 (10)
## 
## SubTree [S6]
## 
## in.flight.entertainment <= -1.018808: 3 (4)
## in.flight.entertainment > -1.018808:
## :...food.and.drink <= -1.665185: 3 (3)
##     food.and.drink > -1.665185: 1 (4)
## 
## SubTree [S7]
## 
## online.boarding <= 0: 3 (3/1)
## online.boarding > 0: 1 (3)
## 
## SubTree [S8]
## 
## online.boarding <= -1.86236: 3 (2)
## online.boarding > -1.86236: 1 (5)
## 
## SubTree [S9]
## 
## online.boarding > -0.2648924: 1 (9/1)
## online.boarding <= -0.2648924:
## :...baggage.handling <= -1.383119: 1 (2)
##     baggage.handling > -1.383119: 3 (6)
## 
## SubTree [S10]
## 
## in.flight.entertainment <= -1.018808: 3 (42)
## in.flight.entertainment > -1.018808:
## :...on.board.service <= 0.4793173: 1 (3)
##     on.board.service > 0.4793173: 3 (2)
## 
## SubTree [S11]
## 
## online.boarding <= 0: 3 (8)
## online.boarding > 0:
## :...on.board.service <= 0.4793173: 1 (5)
##     on.board.service > 0.4793173: 3 (3/1)
## 
## SubTree [S12]
## 
## in.flight.entertainment <= -1.018808: 3 (70)
## in.flight.entertainment > -1.018808: 1 (4/1)
## 
## SubTree [S13]
## 
## on.board.service <= 0: 1 (2)
## on.board.service > 0: 3 (12/1)
## 
## SubTree [S14]
## 
## online.boarding <= 0: 3 (18)
## online.boarding > 0:
## :...on.board.service <= 0.4793173: 1 (7)
##     on.board.service > 0.4793173:
##     :...online.boarding <= 0.5338414: 3 (3)
##         online.boarding > 0.5338414: 1 (2)
## 
## SubTree [S15]
## 
## in.flight.entertainment > -1.018808: 2 (9)
## in.flight.entertainment <= -1.018808:
## :...seat.comfort <= 0.4234303: 1 (5/1)
##     seat.comfort > 0.4234303: 2 (3/1)
## 
## SubTree [S16]
## 
## in.flight.entertainment <= 0.4810466: 1 (2)
## in.flight.entertainment > 0.4810466: 2 (26)
## 
## SubTree [S17]
## 
## in.flight.entertainment <= 0.4810466: 1 (17/2)
## in.flight.entertainment > 0.4810466:
## :...online.boarding <= -1.86236: 1 (2)
##     online.boarding > -1.86236: 2 (2)
## 
## SubTree [S18]
## 
## online.boarding <= -0.2648924: 1 (2)
## online.boarding > -0.2648924: 2 (4)
## 
## SubTree [S19]
## 
## online.boarding > 0: 2 (8)
## online.boarding <= 0:
## :...cleanliness <= 0: 2 (6/1)
##     cleanliness > 0: 1 (3)
## 
## SubTree [S20]
## 
## on.board.service > -1.851815: 3 (8)
## on.board.service <= -1.851815:
## :...online.boarding <= -1.063626: 3 (2)
##     online.boarding > -1.063626: 1 (3)
## 
## SubTree [S21]
## 
## online.boarding <= 0.5338414: 3 (2)
## online.boarding > 0.5338414: 1 (2)
## 
## SubTree [S22]
## 
## baggage.handling <= -1.383119: 1 (3)
## baggage.handling > -1.383119: 3 (3/1)
## 
## SubTree [S23]
## 
## cleanliness <= 0.543176: 3 (4)
## cleanliness > 0.543176: 1 (3)
## 
## SubTree [S24]
## 
## baggage.handling <= -1.383119: 1 (13)
## baggage.handling > -1.383119:
## :...seat.comfort > 0: 1 (10)
##     seat.comfort <= 0:
##     :...food.and.drink <= 0.5972599: 3 (31)
##         food.and.drink > 0.5972599: 1 (3)
## 
## SubTree [S25]
## 
## in.flight.entertainment <= -1.018808: 3 (7)
## in.flight.entertainment > -1.018808: 1 (14/1)
## 
## SubTree [S26]
## 
## on.board.service <= 0.4793173: 3 (10)
## on.board.service > 0.4793173: 2 (7/1)
## 
## SubTree [S27]
## 
## in.flight.service <= -1.396005: 1 (4)
## in.flight.service > -1.396005: 3 (5)
## 
## SubTree [S28]
## 
## in.flight.entertainment <= 0: 3 (10/1)
## in.flight.entertainment > 0: 1 (2)
## 
## SubTree [S29]
## 
## in.flight.entertainment <= 0: 3 (21)
## in.flight.entertainment > 0:
## :...on.board.service <= -1.074771: 1 (3)
##     on.board.service > -1.074771: 3 (3/1)
## 
## SubTree [S30]
## 
## online.boarding <= 0: 3 (13/2)
## online.boarding > 0: 2 (5)
## 
## SubTree [S31]
## 
## online.boarding > -0.2648924: 1 (11)
## online.boarding <= -0.2648924:
## :...on.board.service <= -1.851815: 1 (11/1)
##     on.board.service > -1.851815:
##     :...in.flight.entertainment <= 0: 3 (8)
##         in.flight.entertainment > 0: 1 (2)
## 
## SubTree [S32]
## 
## in.flight.entertainment <= -1.018808: 1 (4)
## in.flight.entertainment > -1.018808:
## :...seat.comfort <= 0: 1 (2)
##     seat.comfort > 0:
##     :...food.and.drink <= 0: 1 (3/1)
##         food.and.drink > 0: 2 (30/1)
## 
## SubTree [S33]
## 
## in.flight.entertainment <= 0: 1 (9)
## in.flight.entertainment > 0:
## :...seat.comfort > -1.092578: 2 (8/1)
##     seat.comfort <= -1.092578:
##     :...online.boarding <= -0.2648924: 1 (12/1)
##         online.boarding > -0.2648924:
##         :...seat.comfort <= -1.850582: 1 (6/2)
##             seat.comfort > -1.850582: 2 (4)
## 
## SubTree [S34]
## 
## in.flight.entertainment <= -1.768735: 1 (5/1)
## in.flight.entertainment > -1.768735: 2 (4)
## 
## SubTree [S35]
## 
## on.board.service <= 0: 1 (3)
## on.board.service > 0: 2 (3)
## 
## SubTree [S36]
## 
## on.board.service <= 0.4793173: 1 (10/1)
## on.board.service > 0.4793173: 2 (3)
## 
## SubTree [S37]
## 
## in.flight.service <= -1.396005: 1 (2)
## in.flight.service > -1.396005: 2 (8/1)
## 
## SubTree [S38]
## 
## in.flight.entertainment > 0: 2 (206)
## in.flight.entertainment <= 0:
## :...seat.comfort <= 0.4234303:
##     :...food.and.drink <= 0: 1 (27)
##     :   food.and.drink > 0:
##     :   :...online.boarding > 0.5338414: 2 (7)
##     :       online.boarding <= 0.5338414:
##     :       :...cleanliness <= 0.543176: 1 (20)
##     :           cleanliness > 0.543176: 2 (5)
##     seat.comfort > 0.4234303:
##     :...online.boarding <= 0: 1 (6)
##         online.boarding > 0:
##         :...food.and.drink > -0.911037: 2 (27)
##             food.and.drink <= -0.911037:
##             :...cleanliness <= 0.543176: 1 (3)
##                 cleanliness > 0.543176: 2 (4)
## 
## SubTree [S39]
## 
## in.flight.entertainment <= -1.018808: 1 (2/1)
## in.flight.entertainment > -1.018808: 2 (7)
## 
## SubTree [S40]
## 
## food.and.drink <= -0.1568885: 3 (99)
## food.and.drink > -0.1568885: 2 (4/1)
## 
## SubTree [S41]
## 
## food.and.drink <= 0: 3 (104)
## food.and.drink > 0: 2 (4)
## 
## SubTree [S42]
## 
## food.and.drink <= -0.911037: 3 (4)
## food.and.drink > -0.911037: 2 (113)
## 
## SubTree [S43]
## 
## on.board.service <= 0.4793173: 3 (5)
## on.board.service > 0.4793173: 2 (3/1)
## 
## SubTree [S44]
## 
## on.board.service <= 0.4793173: 1 (6/1)
## on.board.service > 0.4793173: 3 (5)
## 
## SubTree [S45]
## 
## baggage.handling <= -2.230559: 1 (7/1)
## baggage.handling > -2.230559:
## :...food.and.drink <= 0: 3 (12)
##     food.and.drink > 0: 1 (2)
## 
## SubTree [S46]
## 
## online.boarding <= -1.86236: 3 (7)
## online.boarding > -1.86236: 2 (6)
## 
## SubTree [S47]
## 
## baggage.handling <= -0.5356789: 3 (3)
## baggage.handling > -0.5356789: 2 (10)
## 
## SubTree [S48]
## 
## online.boarding <= -1.063626: 1 (8)
## online.boarding > -1.063626:
## :...online.boarding > -0.2648924: 2 (9)
##     online.boarding <= -0.2648924:
##     :...seat.comfort <= -1.092578: 1 (3)
##         seat.comfort > -1.092578: 2 (6/1)
## 
## SubTree [S49]
## 
## food.and.drink <= 0.5972599: 3 (8)
## food.and.drink > 0.5972599: 2 (7)
## 
## SubTree [S50]
## 
## cleanliness <= 0: 3 (2)
## cleanliness > 0: 2 (8)
## 
## SubTree [S51]
## 
## on.board.service <= 0.4793173: 3 (39/2)
## on.board.service > 0.4793173:
## :...in.flight.service <= 0.3040217: 3 (4/1)
##     in.flight.service > 0.3040217: 2 (6)
## 
## SubTree [S52]
## 
## on.board.service > 0.4793173: 2 (15)
## on.board.service <= 0.4793173:
## :...online.boarding <= -1.063626: 3 (11)
##     online.boarding > -1.063626:
##     :...food.and.drink <= 0.5972599: 3 (6)
##         food.and.drink > 0.5972599: 2 (8)
## 
## SubTree [S53]
## 
## online.boarding > -1.063626: 2 (95)
## online.boarding <= -1.063626:
## :...cleanliness > 0.543176: 2 (26)
##     cleanliness <= 0.543176:
##     :...baggage.handling > 0.3117611: 2 (13)
##         baggage.handling <= 0.3117611:
##         :...in.flight.service <= 0.3040217: 3 (7)
##             in.flight.service > 0.3040217: 2 (6)
## 
## SubTree [S54]
## 
## food.and.drink <= -0.911037: 3 (5)
## food.and.drink > -0.911037: 2 (3)
## 
## SubTree [S55]
## 
## online.boarding > -1.86236: 2 (11)
## online.boarding <= -1.86236:
## :...cleanliness <= 0.543176: 3 (7)
##     cleanliness > 0.543176: 2 (4)
## 
## SubTree [S56]
## 
## food.and.drink > -0.911037: 2 (14)
## food.and.drink <= -0.911037:
## :...seat.comfort <= -1.092578: 3 (7)
##     seat.comfort > -1.092578: 2 (6)
## 
## SubTree [S57]
## 
## online.boarding <= -1.86236: 3 (6)
## online.boarding > -1.86236: 2 (13)
## 
## SubTree [S58]
## 
## online.boarding > -1.86236: 2 (23)
## online.boarding <= -1.86236:
## :...cleanliness <= 0.543176: 3 (14)
##     cleanliness > 0.543176: 2 (2)
## 
## SubTree [S59]
## 
## food.and.drink <= -0.911037: 3 (12)
## food.and.drink > -0.911037:
## :...seat.comfort <= -1.850582: 3 (4)
##     seat.comfort > -1.850582: 2 (3)
## 
## SubTree [S60]
## 
## food.and.drink <= -0.911037: 3 (3)
## food.and.drink > -0.911037: 2 (4)
## 
## SubTree [S61]
## 
## seat.comfort <= -1.850582: 3 (10)
## seat.comfort > -1.850582:
## :...online.boarding <= -1.86236: 3 (3)
##     online.boarding > -1.86236: 2 (9)
## 
## SubTree [S62]
## 
## baggage.handling <= -0.5356789: 1 (2)
## baggage.handling > -0.5356789: 2 (7)
## 
## SubTree [S63]
## 
## food.and.drink > -0.911037: 2 (16/1)
## food.and.drink <= -0.911037:
## :...seat.comfort <= -1.850582: 3 (5)
##     seat.comfort > -1.850582: 2 (3)
## 
## SubTree [S64]
## 
## food.and.drink <= -0.911037: 3 (27)
## food.and.drink > -0.911037:
## :...seat.comfort <= -1.850582: 3 (11)
##     seat.comfort > -1.850582:
##     :...online.boarding <= 0: 3 (8/1)
##         online.boarding > 0: 2 (6)
## 
## SubTree [S65]
## 
## online.boarding <= 0: 3 (13)
## online.boarding > 0:
## :...food.and.drink > -0.911037: 2 (13)
##     food.and.drink <= -0.911037:
##     :...online.boarding > 0.5338414: 2 (7)
##         online.boarding <= 0.5338414:
##         :...baggage.handling <= 0.3117611: 3 (8)
##             baggage.handling > 0.3117611: 2 (2)
## 
## SubTree [S66]
## 
## online.boarding > 0.5338414: 2 (6)
## online.boarding <= 0.5338414:
## :...food.and.drink <= 0: 1 (7)
##     food.and.drink > 0: 2 (5/1)
## 
## SubTree [S67]
## 
## in.flight.entertainment > 0.4810466: 2 (3)
## in.flight.entertainment <= 0.4810466:
## :...seat.comfort <= -1.850582:
##     :...food.and.drink <= 0: 3 (3)
##     :   food.and.drink > 0: 1 (5/1)
##     seat.comfort > -1.850582:
##     :...cleanliness <= 0:
##         :...food.and.drink <= -0.911037: 3 (3/1)
##         :   food.and.drink > -0.911037: 1 (8)
##         cleanliness > 0:
##         :...seat.comfort <= -1.092578: 1 (6)
##             seat.comfort > -1.092578:
##             :...food.and.drink > 0.5972599: 2 (5)
##                 food.and.drink <= 0.5972599:
##                 :...online.boarding <= 0: 1 (8)
##                     online.boarding > 0:
##                     :...food.and.drink <= -0.1568885: 1 (2)
##                         food.and.drink > -0.1568885: 2 (3)
## 
## 
## Evaluation on training data (129880 cases):
## 
##      Decision Tree   
##    ----------------  
##    Size      Errors  
## 
##    1072  609( 0.5%)   <<
## 
## 
##     (a)   (b)   (c)    <-classified as
##    ----  ----  ----
##   35190    81   151    (a): class 1
##      56 58496    64    (b): class 2
##     161    96 35585    (c): class 3
## 
## 
##  Attribute usage:
## 
##  100.00% cleanliness
##  100.00% in.flight.service
##   97.28% baggage.handling
##   94.78% seat.comfort
##   89.08% in.flight.entertainment
##   83.07% food.and.drink
##   76.44% on.board.service
##   40.14% online.boarding
## 
## 
## Time: 0.3 secs
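
The error count and error rate in the evaluation above can be re-derived from the confusion matrix. The following is a minimal sketch that copies the matrix by hand from the output; the object names (conf_mat, n_errors, recall) are ours and do not appear in the original analysis. Since the evaluation is on the training data, the figure describes how well the tree reproduces the cluster labels it was fitted on, not performance on new data.

```r
# Training confusion matrix copied from the C5.0 evaluation output
# (rows = actual class, columns = class assigned by the tree)
conf_mat <- matrix(c(35190,    81,   151,
                        56, 58496,    64,
                       161,    96, 35585),
                   nrow = 3, byrow = TRUE,
                   dimnames = list(actual    = c("1", "2", "3"),
                                   predicted = c("1", "2", "3")))

n_cases    <- sum(conf_mat)                  # total training cases
n_errors   <- n_cases - sum(diag(conf_mat))  # sum of the off-diagonal cells
error_rate <- n_errors / n_cases

# Per-class recall: share of each actual class that is classified correctly
recall <- diag(conf_mat) / rowSums(conf_mat)

n_errors
round(100 * error_rate, 1)
round(recall, 3)
```

The row sums of this matrix (35422, 58616, 35842) equal the cluster sizes reported in the cross-table later in this section.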

Next, we tried to generate a silhouette diagram to assess clustering quality. Because of the size of the data set, computing silhouettes for all observations was not computationally feasible. We therefore drew random samples from the data set and calculated silhouette scores for each sample. Since we did not fix the sample with set.seed(), each rerun of the chunk below drew a different sample, which let us check the consistency of the resulting silhouette diagrams. The shape of the diagram and the silhouette coefficients of each cluster were consistent across the samples we drew, so we concluded that a diagram from a single sample is representative of the whole data set.
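
The sampling check described above can be sketched as follows. This is an illustration on simulated data, not the chunk used in the report: the toy matrix, the sample size of 300, and the helper sample_silhouette() are assumptions made for the example, and set.seed() is called only so that the illustration is reproducible.

```r
library(cluster)  # provides silhouette()

set.seed(42)  # for this illustration only; the report deliberately left the seed unset

# Simulated stand-in for the scaled clustering data: 3 groups of 500 rows, 4 columns
toy <- rbind(
  matrix(rnorm(2000, mean = -2), ncol = 4),
  matrix(rnorm(2000, mean =  0), ncol = 4),
  matrix(rnorm(2000, mean =  2), ncol = 4)
)
km <- kmeans(toy, centers = 3, nstart = 10)

# Average silhouette width computed on one random subsample of n rows
sample_silhouette <- function(x, cluster, n = 300) {
  idx <- sample(nrow(x), n)
  sil <- silhouette(cluster[idx], dist(x[idx, , drop = FALSE]))
  mean(sil[, "sil_width"])
}

# Repeat on several independent subsamples and compare the averages
widths <- replicate(5, sample_silhouette(toy, km$cluster))
round(widths, 2)
```

If the averages stay close to each other across subsamples, a silhouette diagram from any one subsample can reasonably stand in for the full data set, which is the argument made above.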

The silhouette plot shows a moderately defined clustering structure with an average silhouette width of 0.31. Cluster 2 is the most distinct with a silhouette width of 0.35, while Cluster 3 has a width of 0.31. Cluster 1, with a width of 0.22, appears less well-defined and possibly overlapping with others. There are very few negative values, suggesting most points are reasonably assigned to clusters. However, the silhouette widths are not particularly high, indicating that the separation between clusters is present but not sharp.

Finally, we checked the number of satisfied and unsatisfied customers in each cluster. We combined the satisfaction variable with the dta_cls data frame (as dta_def) in order to create a cross-table and visualize the distribution of unsatisfied customers across the clusters. This lets us identify the clusters with the most unsatisfied customers. As a reminder, a score of 0 denotes unsatisfied or neutral customers, while a score of 1 denotes satisfied customers.

The cross-table confirmed our earlier assessment of the clusters, which was based on the individual satisfaction scores. Consistent with the cluster names we assigned before, the largest proportion of unsatisfied or neutral customers (around 82%) is found in cluster 3, a smaller but still substantial proportion (around 70%) in cluster 1, while in cluster 2 around 67% of the customers are satisfied. Overall, customers in clusters 1 and 3 are more likely to be unsatisfied. However, the 33% of unsatisfied or neutral customers in cluster 2 is not negligible: it shows that cluster 2 cannot be treated as consisting only of satisfied customers.

At the end of this chunk, we remove some objects that were used in the clustering analysis and are no longer needed.

dta_def <- cbind(dta_cls, satisfaction = dta_transformed$satisfaction)

gmodels::CrossTable(dta_def$satisfaction, dta_def$cluster,
                    prop.chisq = FALSE,
                    prop.c = TRUE,
                    prop.r = FALSE,
                    prop.t = FALSE,
                    dnn = c("Satisfied (0 = no, 1 = yes, 99 = missing)", "Clusters"))
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |           N / Col Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  129880 
## 
##  
##                                           | Clusters 
## Satisfied (0 = no, 1 = yes, 99 = missing) |         1 |         2 |         3 | Row Total | 
## ------------------------------------------|-----------|-----------|-----------|-----------|
##                                         0 |     24773 |     19227 |     29452 |     73452 | 
##                                           |     0.699 |     0.328 |     0.822 |           | 
## ------------------------------------------|-----------|-----------|-----------|-----------|
##                                         1 |     10649 |     39389 |      6390 |     56428 | 
##                                           |     0.301 |     0.672 |     0.178 |           | 
## ------------------------------------------|-----------|-----------|-----------|-----------|
##                              Column Total |     35422 |     58616 |     35842 |    129880 | 
##                                           |     0.273 |     0.451 |     0.276 |           | 
## ------------------------------------------|-----------|-----------|-----------|-----------|
## 
## 
rm(clusters,model,siz)
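
The column proportions in the cross-table above can be checked with base R. The sketch below copies the cell counts by hand from the output; the object names counts and col_props are ours.

```r
# Cell counts copied from the cross-table (rows = satisfaction, columns = cluster)
counts <- matrix(c(24773, 19227, 29452,
                   10649, 39389,  6390),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(satisfaction = c("0", "1"),
                                 cluster      = c("1", "2", "3")))

# margin = 2 divides each cell by its column total, matching prop.c = TRUE
col_props <- round(prop.table(counts, margin = 2), 3)
col_props

# Share of all customers that falls in each cluster
round(colSums(counts) / sum(counts), 3)
```

The first row of col_props gives the unsatisfied or neutral shares quoted in the text: 0.699 for cluster 1, 0.328 for cluster 2, and 0.822 for cluster 3.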

MODELLING

1. Selection and justification of models

Predicting airline passenger satisfaction involves analyzing a variety of factors that influence whether a passenger feels “Satisfied” or “Dissatisfied.” To address this challenge, both decision trees and logistic regression were chosen as the primary models due to their complementary strengths in handling complex data and providing actionable insights.

Decision trees, particularly using the C5.0 algorithm, offer significant advantages for predicting airline passenger satisfaction. This method is effective for capturing complex, non-linear relationships between features and understanding how different combinations of factors contribute to the final classification, such as a passenger being “Satisfied” or “Dissatisfied.” The tree structure is highly interpretable, providing clear, rule-based insights. Additionally, decision trees support various methods for improvement, including boosting, cost-sensitivity, and Random Forests, all of which can enhance model performance, accuracy, and robustness in handling different data scenarios.

Logistic regression, on the other hand, excels at binary classification tasks like predicting passenger satisfaction. It models the relationship between input features (such as Flight Distance, delays, or service ratings) and the probability of a passenger being “Satisfied” or “Dissatisfied.” The coefficients from logistic regression offer clear, interpretable insights into how each feature influences the likelihood of satisfaction. It works well with both categorical and numerical data, making it versatile for different types of input. While logistic regression doesn’t capture complex interactions as effectively as decision trees, its simplicity and clarity make it a valuable tool for generating actionable business insights.

Together, decision trees and logistic regression complement each other. Decision trees provide a visual, rule-based structure that can model interactions between features, while logistic regression offers straightforward insights into how each individual feature impacts passenger satisfaction.
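
To make the contrast between the two model families concrete, the following sketch fits both on simulated data. Everything in it is an assumption made for illustration: the data frame toy, its column names, and the simulated effect sizes are hypothetical and are not the report's data or results. The C5.0 part runs only if the C50 package is installed.

```r
set.seed(1)  # reproducibility of this illustration only
n <- 2000
toy <- data.frame(
  flight_distance = round(runif(n, 100, 4000)),
  delay_avg       = rexp(n, rate = 1 / 15),
  class           = factor(sample(c("Eco", "Business"), n, replace = TRUE))
)

# Simulated outcome: longer delays lower, and Business class raises, the odds of satisfaction
p <- plogis(-0.2 - 0.03 * toy$delay_avg + 1.2 * (toy$class == "Business"))
toy$satisfied <- factor(rbinom(n, 1, p), levels = c(0, 1))

# Logistic regression: exp(coefficient) is the odds ratio for a one-unit change
logit_fit   <- glm(satisfied ~ flight_distance + delay_avg + class,
                   data = toy, family = binomial)
odds_ratios <- exp(coef(logit_fit))
round(odds_ratios, 3)

# Boosted C5.0 tree on the same predictors (trials = 10 boosting iterations)
if (requireNamespace("C50", quietly = TRUE)) {
  tree_fit <- C50::C5.0(x = toy[, c("flight_distance", "delay_avg", "class")],
                        y = toy$satisfied, trials = 10)
  print(tree_fit)
}
```

An odds ratio below 1 (as for delay_avg here) means the feature lowers the odds of satisfaction; the tree instead expresses its findings as split rules that can combine several features.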

In order to work with our two chosen models we took into account various assumptions that helped us interpret the results in a useful and complete manner.

Decision tree key Assumptions:

Logistic Regression key Assumptions:

2. Preparation of the data set used in modelling

For the modeling phase, we start by creating a new data frame called dta_model, which is based on the original data frame dta. This approach ensures that our modeling process is independent of the transformations applied earlier, such as encoding categorical variables or scaling numerical variables, which are unnecessary for the models we intend to use. In these models, categorical variables can be used as factors without encoding, and numerical variables do not require scaling.

First, we have to merge again the departure delay and arrival delay variables into the single variable named delay.avg, as previously done in the feature engineering part. Next, as it was previously explained, we still face the problem that our data set has a large number of variables (22 with the predicting variable satisfaction). This number of variables is still too high for efficient and meaningful modeling. After further consideration, research, and consultation with the professor during a coaching session, we determined that certain variables should be excluded from the model to avoid redundancy and potential issues with multicollinearity.

Specifically, we decided to exclude the variables that are scaled from 1 to 5, as they represent individual satisfaction scores for various aspects of the flight experience. Including these variables in the modeling process would result in several problems. Firstly, predicting overall satisfaction based on these individual satisfaction scores would be circular and redundant. Since these scores are direct measures of satisfaction with different aspects of the service, using them to predict the overall satisfaction would reduce the usefulness of the model. Additionally, including these scores could lead to multicollinearity issues, where highly correlated predictors distort the model’s interpretation and reliability.

By excluding these satisfaction-related variables, we ensure that the model is based on more objective and independent factors, such as travel-related variables and flight characteristics. While this approach may reduce the overall accuracy of the model, it provides a more realistic and meaningful assessment of customer satisfaction, free from the biases introduced by including variables that are already designed to measure satisfaction.

The base data set for modeling, dta_model, will therefore contain only the relevant non-scaled variables, excluding the individual satisfaction scores. The data set will contain 8 variables in total.

As the last part of our model improvement, we’ll examine a data set containing only the individual satisfaction scores. We acknowledge the inherent bias and the risk of circular reasoning, but we find this lens valuable. This model’s “too good to be true” accuracy rate will illustrate these concerns. However, this approach allows us to identify which specific service aspects most strongly influence passenger satisfaction. These insights, though limited in causal interpretability, can still guide airlines in prioritizing areas for service improvement based on what customers consistently rate most critically.

The following chunk implements these steps.

Note: Decision tree is not sensitive to variable scale so we are not rescaling in any way.

dta_model <- dta # Create the new data frame for modeling 

dta_model$delay.avg <- rowMeans(dta_model[, c("departure.delay", "arrival.delay")], na.rm = TRUE) # recreate the merged delay variable

dta_model <- dta_model[, !names(dta_model) %in% c("departure.delay", "arrival.delay")] # Remove the original delay variables

dta_tree <- dta_model[, -c(7:20)] # Exclude the individual satisfaction score variables

# Overview of the final data set
str(dta_tree)
## 'data.frame':    129880 obs. of  8 variables:
##  $ gender         : chr  "Male" "Female" "Male" "Male" ...
##  $ age            : int  48 35 41 50 49 43 43 60 50 38 ...
##  $ customer.type  : chr  "First-time" "Returning" "Returning" "Returning" ...
##  $ type.of.travel : chr  "Business" "Business" "Business" "Business" ...
##  $ class          : chr  "Business" "Business" "Business" "Business" ...
##  $ flight.distance: int  821 821 853 1905 3470 3788 1963 853 2607 2822 ...
##  $ satisfaction   : chr  "Neutral or Dissatisfied" "Satisfied" "Satisfied" "Satisfied" ...
##  $ delay.avg      : num  3.5 32.5 0 0 0.5 0 0 1.5 0 6.5 ...
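Two details of the chunk above are easy to miss, so here is a small sketch on a toy data frame. The toy values and the rating column seat.comfort are made up for illustration. It shows how rowMeans() with na.rm = TRUE behaves when one or both delays are missing, and how the modelling columns could be selected by name instead of by position, which is an alternative to the positional -c(7:20) used above and not what our code does.

```r
# Toy data frame; values and the rating column are hypothetical.
toy <- data.frame(
  age             = c(48L, 35L, 41L),
  seat.comfort    = c(3L, 5L, 4L),   # stands in for a 1-5 rating column
  departure.delay = c(2, NA, NA),
  arrival.delay   = c(5, 10, NA),
  satisfaction    = c("Satisfied", "Satisfied", "Neutral or Dissatisfied")
)

# Average the two delays; na.rm = TRUE uses whichever value is available
toy$delay.avg <- rowMeans(toy[, c("departure.delay", "arrival.delay")],
                          na.rm = TRUE)
# A row where both delays are missing yields NaN rather than NA
print(toy$delay.avg)

# Select the modelling columns by name rather than by position
keep <- c("age", "satisfaction", "delay.avg")
toy_tree <- toy[, keep]
str(toy_tree)
```

Selecting by name keeps working if columns are added or reordered upstream, whereas a positional range silently starts dropping different variables.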

3. Model 1: Decision Tree

Decision tree setup:

First, we transformed the characters into factors, and kept numerical and integer types as they were.

#Model preparation:

For the data split, we chose different datasets for each model, and therefore we created two different sets of training and testing data. This was done for clarity and organizational purposes, to keep the modeling processes for the decision tree and logistic regression clearly separated.

When comparing the accuracy of the models, we keep the same seed and split proportion so that both models are evaluated on a comparable foundation, which makes a direct comparison appropriate.

# Convert categorical variables to factors
dta_tree$gender <- as.factor(dta_tree$gender)
dta_tree$customer.type <- as.factor(dta_tree$customer.type)
dta_tree$type.of.travel <- as.factor(dta_tree$type.of.travel)
dta_tree$class <- as.factor(dta_tree$class)
dta_tree$satisfaction <- as.factor(dta_tree$satisfaction)

# Overview of the final data set
str(dta_tree)
## 'data.frame':    129880 obs. of  8 variables:
##  $ gender         : Factor w/ 2 levels "Female","Male": 2 1 2 2 1 2 2 1 2 1 ...
##  $ age            : int  48 35 41 50 49 43 43 60 50 38 ...
##  $ customer.type  : Factor w/ 2 levels "First-time","Returning": 1 2 2 2 2 2 2 2 2 2 ...
##  $ type.of.travel : Factor w/ 2 levels "Business","Personal": 1 1 1 1 1 1 1 1 1 1 ...
##  $ class          : Factor w/ 3 levels "Business","Economy",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ flight.distance: int  821 821 853 1905 3470 3788 1963 853 2607 2822 ...
##  $ satisfaction   : Factor w/ 2 levels "Neutral or Dissatisfied",..: 1 2 2 2 2 2 2 2 1 2 ...
##  $ delay.avg      : num  3.5 32.5 0 0 0.5 0 0 1.5 0 6.5 ...
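The five as.factor() calls above can also be written as a single step that converts every character column at once. The sketch below uses a toy data frame with made-up values; it is an equivalent alternative and not the code we ran.

```r
# Toy data frame with two character columns and one integer column
toy <- data.frame(
  gender = c("Male", "Female", "Male"),
  age    = c(48L, 35L, 41L),
  class  = c("Business", "Economy", "Business"),
  stringsAsFactors = FALSE
)

# Find the character columns and convert only those to factors
is_chr <- vapply(toy, is.character, logical(1))
toy[is_chr] <- lapply(toy[is_chr], as.factor)

str(toy)
```

Numeric and integer columns are left untouched, which matches the setup described above.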

After converting the necessary variables to factors for the decision tree model, we explored the distribution of each variable in the dataset using visualizations, alongside the numerical summary from Hmisc::describe() shown below. The plotting code loops through all columns in the dta_tree data frame and generates an appropriate plot depending on the variable type.

If the variable is numeric or integer, we use a histogram to display its distribution, grouped and colored by the satisfaction variable. This helps us visually assess whether certain satisfaction levels are associated with higher or lower values for each numeric variable.

If the variable is categorical (i.e., a factor), we use a bar plot to display the counts per category, again grouped and colored by satisfaction. This gives insight into how satisfaction is distributed across different categories (e.g., travel class, customer type, etc.).
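The loop described in the last three paragraphs could look roughly like the following. This is a minimal sketch in base graphics under stated assumptions: the helper name plot_by_target is hypothetical, the target is assumed to be a two-level factor, and the toy data frame only stands in for dta_tree. The original plots may well have been produced with a different package.

```r
# Hypothetical helper: one plot per predictor, split by the target classes.
plot_by_target <- function(df, target = "satisfaction") {
  classes <- levels(df[[target]])
  cols <- c("tomato", "steelblue")[seq_along(classes)]  # assumes two classes
  kinds <- character(0)
  for (v in setdiff(names(df), target)) {
    x <- df[[v]]
    if (is.numeric(x)) {
      # Numeric or integer: histogram counts per class on shared breaks
      breaks <- pretty(range(x, na.rm = TRUE), n = 20)
      counts <- sapply(classes, function(k)
        hist(x[df[[target]] == k], breaks = breaks, plot = FALSE)$counts)
      barplot(t(counts), beside = TRUE, col = cols, main = v,
              names.arg = head(breaks, -1), legend.text = classes)
      kinds[v] <- "histogram"
    } else if (is.factor(x)) {
      # Factor: counts per category, grouped by class
      barplot(table(df[[target]], x), beside = TRUE, col = cols,
              main = v, legend.text = classes)
      kinds[v] <- "barplot"
    }
  }
  invisible(kinds)  # record which plot type was drawn for each variable
}

# Toy stand-in for dta_tree (made-up values)
toy <- data.frame(
  age   = c(25L, 40L, 33L, 58L, 47L, 19L),
  class = factor(c("Business", "Economy", "Economy",
                   "Business", "Business", "Economy")),
  satisfaction = factor(c("Satisfied", "Neutral or Dissatisfied",
                          "Neutral or Dissatisfied", "Satisfied",
                          "Satisfied", "Neutral or Dissatisfied"))
)

grDevices::pdf(NULL)  # null device so the sketch runs non-interactively
kinds <- plot_by_target(toy)
invisible(grDevices::dev.off())
print(kinds)
```

In an interactive session the pdf(NULL) and dev.off() lines can be dropped so that the plots appear in the usual graphics window.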

Hmisc::describe(dta_tree)
## dta_tree 
## 
##  8  Variables      129880  Observations
## --------------------------------------------------------------------------------
## gender 
##        n  missing distinct 
##   129880        0        2 
##                         
## Value      Female   Male
## Frequency   65899  63981
## Proportion  0.507  0.493
## --------------------------------------------------------------------------------
## age 
##        n  missing distinct     Info     Mean  pMedian      Gmd      .05 
##   129880        0       75        1    39.43     39.5    17.33       15 
##      .10      .25      .50      .75      .90      .95 
##       20       27       40       51       59       64 
## 
## lowest :  7  8  9 10 11, highest: 77 78 79 80 85
## --------------------------------------------------------------------------------
## customer.type 
##        n  missing distinct 
##   129880        0        2 
##                                 
## Value      First-time  Returning
## Frequency       23780     106100
## Proportion      0.183      0.817
## --------------------------------------------------------------------------------
## type.of.travel 
##        n  missing distinct 
##   129880        0        2 
##                             
## Value      Business Personal
## Frequency     89693    40187
## Proportion    0.691    0.309
## --------------------------------------------------------------------------------
## class 
##        n  missing distinct 
##   129880        0        3 
##                                                  
## Value          Business      Economy Economy Plus
## Frequency         62160        58309         9411
## Proportion        0.479        0.449        0.072
## --------------------------------------------------------------------------------
## flight.distance 
##        n  missing distinct     Info     Mean  pMedian      Gmd      .05 
##   129880        0     3821        1     1190     1051     1066      177 
##      .10      .25      .50      .75      .90      .95 
##      236      414      844     1744     2751     3380 
## 
## lowest :   31   56   67   73   74, highest: 4243 4502 4817 4963 4983
## --------------------------------------------------------------------------------
## satisfaction 
##        n  missing distinct 
##   129880        0        2 
##                                                           
## Value      Neutral or Dissatisfied               Satisfied
## Frequency                    73452                   56428
## Proportion                   0.566                   0.434
## --------------------------------------------------------------------------------
## delay.avg 
##        n  missing distinct     Info     Mean  pMedian      Gmd      .05 
##   129880        0      926    0.904    14.94     4.75    24.49        0 
##      .10      .25      .50      .75      .90      .95 
##        0        0        1       12       43       77 
## 
## lowest : 0        0.378732 0.5      1        1.36816 
## highest: 974      1014     1121.5   1292.5   1588    
## --------------------------------------------------------------------------------
set.seed(46748717)

proportion <- 0.7 # desired proportion here
split_tree <- rsample::initial_split(dta_tree, prop = proportion) #split the data
training_tree <- rsample::training(split_tree) # 70% of the rows
testing_tree <- rsample::testing(split_tree) # remaining 30%
model_tree <- C50::C5.0(satisfaction ~., data = training_tree)
summary(model_tree)
## 
## Call:
## C5.0.formula(formula = satisfaction ~ ., data = training_tree)
## 
## 
## C5.0 [Release 2.07 GPL Edition]      Thu Apr 17 17:41:00 2025
## -------------------------------
## 
## Class specified by attribute `outcome'
## 
## Read 90916 cases (8 attributes) from undefined.data
## 
## Decision tree:
## 
## class = Business:
## :...type.of.travel = Personal: Neutral or Dissatisfied (1877/211)
## :   type.of.travel = Business:
## :   :...customer.type = First-time:
## :       :...age <= 24:
## :       :   :...age > 12: Satisfied (1197/183)
## :       :   :   age <= 12:
## :       :   :   :...age <= 8: Neutral or Dissatisfied (12)
## :       :   :       age > 8: Satisfied (38/14)
## :       :   age > 24:
## :       :   :...age > 30: Neutral or Dissatisfied (3208/754)
## :       :       age <= 30:
## :       :       :...flight.distance > 1613: Neutral or Dissatisfied (160/36)
## :       :           flight.distance <= 1613:
## :       :           :...flight.distance <= 1078: Neutral or Dissatisfied (1568/578)
## :       :               flight.distance > 1078:
## :       :               :...age <= 25: Satisfied (78/21)
## :       :                   age > 25: Neutral or Dissatisfied (277/126)
## :       customer.type = Returning:
## :       :...age > 38:
## :           :...age <= 60: Satisfied (22708/3485)
## :           :   age > 60:
## :           :   :...delay.avg <= 3.5: Satisfied (880/384)
## :           :       delay.avg > 3.5: Neutral or Dissatisfied (536/194)
## :           age <= 38:
## :           :...age <= 19:
## :               :...delay.avg <= 3:
## :               :   :...flight.distance <= 543: Neutral or Dissatisfied (65/27)
## :               :   :   flight.distance > 543: Satisfied (499/170)
## :               :   delay.avg > 3:
## :               :   :...flight.distance <= 489: Neutral or Dissatisfied (43/3)
## :               :       flight.distance > 489:
## :               :       :...age <= 14: Neutral or Dissatisfied (142/22)
## :               :           age > 14: [S1]
## :               age > 19:
## :               :...delay.avg <= 4: Satisfied (6087/1448)
## :                   delay.avg > 4:
## :                   :...age > 25: Satisfied (3031/1092)
## :                       age <= 25:
## :                       :...flight.distance <= 558: Neutral or Dissatisfied (95/33)
## :                           flight.distance > 558:
## :                           :...delay.avg <= 12: Satisfied (268/94)
## :                               delay.avg > 12: [S2]
## class in {Economy,Economy Plus}:
## :...type.of.travel = Personal: Neutral or Dissatisfied (26288/2613)
##     type.of.travel = Business:
##     :...customer.type = First-time: Neutral or Dissatisfied (10044/1391)
##         customer.type = Returning:
##         :...delay.avg > 3.5:
##             :...flight.distance <= 126: Satisfied (152/61)
##             :   flight.distance > 126: Neutral or Dissatisfied (4427/1605)
##             delay.avg <= 3.5:
##             :...age > 64: Neutral or Dissatisfied (363/129)
##                 age <= 64:
##                 :...flight.distance <= 279: Satisfied (1454/483)
##                     flight.distance > 279:
##                     :...age > 52:
##                         :...flight.distance > 636: Neutral or Dissatisfied (447/158)
##                         :   flight.distance <= 636: [S3]
##                         age <= 52:
##                         :...age <= 14: Neutral or Dissatisfied (75/23)
##                             age > 14:
##                             :...flight.distance <= 970:
##                                 :...class = Economy:
##                                 :   :...age <= 28: Neutral or Dissatisfied (301/142)
##                                 :   :   age > 28: Satisfied (1639/704)
##                                 :   class = Economy Plus: [S4]
##                                 flight.distance > 970:
##                                 :...flight.distance > 2297: Satisfied (149/53)
##                                     flight.distance <= 2297:
##                                     :...age <= 25: Satisfied (133/55)
##                                         age > 25: [S5]
## 
## SubTree [S1]
## 
## flight.distance <= 1752: Satisfied (98/38)
## flight.distance > 1752: Neutral or Dissatisfied (106/36)
## 
## SubTree [S2]
## 
## gender = Female: Neutral or Dissatisfied (231/109)
## gender = Male: Satisfied (236/108)
## 
## SubTree [S3]
## 
## class = Economy: Satisfied (381/181)
## class = Economy Plus: Neutral or Dissatisfied (86/39)
## 
## SubTree [S4]
## 
## delay.avg <= 2.5: Satisfied (657/265)
## delay.avg > 2.5: Neutral or Dissatisfied (33/12)
## 
## SubTree [S5]
## 
## flight.distance > 1253: Neutral or Dissatisfied (345/126)
## flight.distance <= 1253:
## :...flight.distance <= 1188: Neutral or Dissatisfied (402/178)
##     flight.distance > 1188: Satisfied (100/34)
## 
## 
## Evaluation on training data (90916 cases):
## 
##      Decision Tree   
##    ----------------  
##    Size      Errors  
## 
##      43 17418(19.2%)   <<
## 
## 
##     (a)   (b)    <-classified as
##    ----  ----
##   42586  8873    (a): class Neutral or Dissatisfied
##    8545 30912    (b): class Satisfied
## 
## 
##  Attribute usage:
## 
##  100.00% type.of.travel
##  100.00% class
##   69.02% customer.type
##   52.94% age
##   25.81% delay.avg
##   16.11% flight.distance
##    0.51% gender
## 
## 
## Time: 0.2 secs

Visualization of the model

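After building the model, we visualized a decision tree using the rpart.plot() function. Because rpart.plot() works with rpart objects, the plot shows a separate CART tree fitted with rpart() on the same training data rather than the C5.0 tree itself. It still provides an intuitive breakdown of how a tree classifies passengers as “Satisfied” or “Neutral or Dissatisfied” based on their characteristics.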

As shown in the tree, travel class is the most influential first split: passengers flying Economy or Economy Plus are more likely to be dissatisfied, while those in Business class tend to report higher satisfaction. Subsequent splits include customer type, age, and type of travel, reflecting how satisfaction levels differ between first-time and returning customers, younger vs. older travelers, and business vs. personal trips.

This visual output helps interpret the model’s logic and highlights which passenger attributes the tree considers most relevant in predicting satisfaction.

library(rpart)
library(rpart.plot)

model_rpart <- rpart(satisfaction ~ ., data = training_tree, method = "class")
rpart.plot(model_rpart, type = 2, extra = 104, fallen.leaves = TRUE)

#Evaluating the model:

We built the confusion matrix for the training and testing data. On the training data the accuracy was 80.84%, and on the testing data it was 79.98%. A slightly lower accuracy on the testing data is expected, and the small difference between the two suggests that the model generalizes well to unseen data.

While accuracy gives an overall performance metric, we also included precision, recall, and F1 score to better understand the model’s behavior in distinguishing satisfied from dissatisfied customers. Specifically, the results were: Precision: 0.77, Recall: 0.78, F1 Score: 0.77. These values can be particularly helpful in contexts where misclassification has business implications, for example wrongly assuming a dissatisfied customer is satisfied.

library(caret)

pred.test_tree <- predict(model_tree, testing_tree)


# Make sure factors have same levels
pred.test_tree <- factor(pred.test_tree, levels = levels(testing_tree$satisfaction))

# Show confusion matrix and full stats
confusion <- confusionMatrix(pred.test_tree, testing_tree$satisfaction, positive = "Satisfied")
print(confusion)
## Confusion Matrix and Statistics
## 
##                          Reference
## Prediction                Neutral or Dissatisfied Satisfied
##   Neutral or Dissatisfied                   17965      3772
##   Satisfied                                  4028     13199
##                                           
##                Accuracy : 0.7998          
##                  95% CI : (0.7958, 0.8038)
##     No Information Rate : 0.5644          
##     P-Value [Acc > NIR] : < 2.2e-16       
##                                           
##                   Kappa : 0.5936          
##                                           
##  Mcnemar's Test P-Value : 0.003886        
##                                           
##             Sensitivity : 0.7777          
##             Specificity : 0.8169          
##          Pos Pred Value : 0.7662          
##          Neg Pred Value : 0.8265          
##              Prevalence : 0.4356          
##          Detection Rate : 0.3387          
##    Detection Prevalence : 0.4421          
##       Balanced Accuracy : 0.7973          
##                                           
##        'Positive' Class : Satisfied       
## 
conf_matrix <- table(Predicted = pred.test_tree, Actual = testing_tree$satisfaction)

# Extract values (adjust levels if needed)
TP <- conf_matrix["Satisfied", "Satisfied"]
FP <- conf_matrix["Satisfied", "Neutral or Dissatisfied"]
FN <- conf_matrix["Neutral or Dissatisfied", "Satisfied"]
TN <- conf_matrix["Neutral or Dissatisfied", "Neutral or Dissatisfied"]

# Additional Metrics to evaluate
# accuracy <- (TP + TN) / sum(conf_matrix) #to test if our calculations are correct
precision <- TP / (TP + FP)
recall <- TP / (TP + FN)
f1_score <- 2 * (precision * recall) / (precision + recall)

# Print
# cat("Accuracy:", round(accuracy, 3), "\n")
cat("Precision:", round(precision, 4), "\n")
## Precision: 0.7662
cat("Recall:", round(recall, 4), "\n")
## Recall: 0.7777
cat("F1 Score:", round(f1_score, 4), "\n")
## F1 Score: 0.7719
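Since the same calculation is repeated for every model variant in this section, it could be wrapped in a small helper. The sketch below is one way to do that; the function name class_metrics is hypothetical. To keep it self-contained, the label vectors are rebuilt from the four cell counts of the test-set confusion matrix printed above instead of from the model itself.

```r
# Hypothetical helper wrapping the calculation used above.
class_metrics <- function(predicted, actual, positive = "Satisfied") {
  tp <- sum(predicted == positive & actual == positive)
  fp <- sum(predicted == positive & actual != positive)
  fn <- sum(predicted != positive & actual == positive)
  tn <- sum(predicted != positive & actual != positive)
  precision <- tp / (tp + fp)
  recall    <- tp / (tp + fn)
  c(accuracy  = (tp + tn) / (tp + fp + fn + tn),
    precision = precision,
    recall    = recall,
    f1        = 2 * precision * recall / (precision + recall))
}

# Rebuild label vectors from the confusion matrix reported above
neg <- "Neutral or Dissatisfied"
pos <- "Satisfied"
cells <- c(17965, 4028, 3772, 13199)            # TN, FP, FN, TP
predicted <- rep(c(neg, pos, neg, pos), times = cells)
actual    <- rep(c(neg, neg, pos, pos), times = cells)

m <- class_metrics(predicted, actual)
print(round(m, 4))
```

Called on the real objects, this would be class_metrics(pred.test_tree, testing_tree$satisfaction), which avoids copying the block for each new model.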

#Improvement of the model:

  1. Boosting

First, we applied boosting. The boosted model with 10 trials improved the test accuracy only marginally, from 79.98% to 80.08%. We also tried smaller and larger numbers of trials, but concluded that the best result was achieved with 10.

Interestingly, although the accuracy barely increased, recall rose from 0.78 to 0.79 and the F1 score from 0.772 to 0.776, while precision fell slightly from 0.77 to 0.76. This indicates the model became better at correctly identifying positive cases (satisfied customers), even though the total number of correct predictions remained similar. This result suggests a slightly more sensitive model.

# Add boosting
model_boost_tree <- C50::C5.0(satisfaction ~.,
                    data = training_tree,
                    trials = 10)

pred.test_tree <- predict(model_boost_tree, testing_tree)

# Make sure factors have same levels
pred.test_tree <- factor(pred.test_tree, levels = levels(testing_tree$satisfaction))

# Show confusion matrix and full stats
confusion <- confusionMatrix(pred.test_tree, testing_tree$satisfaction, positive = "Satisfied")
print(confusion)
## Confusion Matrix and Statistics
## 
##                          Reference
## Prediction                Neutral or Dissatisfied Satisfied
##   Neutral or Dissatisfied                   17735      3502
##   Satisfied                                  4258     13469
##                                           
##                Accuracy : 0.8008          
##                  95% CI : (0.7968, 0.8048)
##     No Information Rate : 0.5644          
##     P-Value [Acc > NIR] : < 2.2e-16       
##                                           
##                   Kappa : 0.597           
##                                           
##  Mcnemar's Test P-Value : < 2.2e-16       
##                                           
##             Sensitivity : 0.7936          
##             Specificity : 0.8064          
##          Pos Pred Value : 0.7598          
##          Neg Pred Value : 0.8351          
##              Prevalence : 0.4356          
##          Detection Rate : 0.3457          
##    Detection Prevalence : 0.4550          
##       Balanced Accuracy : 0.8000          
##                                           
##        'Positive' Class : Satisfied       
## 
conf_matrix <- table(Predicted = pred.test_tree, Actual = testing_tree$satisfaction)

# Extract values (adjust levels if needed)
TP <- conf_matrix["Satisfied", "Satisfied"]
FP <- conf_matrix["Satisfied", "Neutral or Dissatisfied"]
FN <- conf_matrix["Neutral or Dissatisfied", "Satisfied"]
TN <- conf_matrix["Neutral or Dissatisfied", "Neutral or Dissatisfied"]

# Additional Metrics to evaluate
# accuracy <- (TP + TN) / sum(conf_matrix) #to test if our calculations are correct
precision <- TP / (TP + FP)
recall <- TP / (TP + FN)
f1_score <- 2 * (precision * recall) / (precision + recall)

# Print
# cat("Accuracy:", round(accuracy, 3), "\n")
cat("Precision:", round(precision, 4), "\n")
## Precision: 0.7598
cat("Recall:", round(recall, 4), "\n")
## Recall: 0.7936
cat("F1 Score:", round(f1_score, 4), "\n")
## F1 Score: 0.7764
  2. Cost matrix

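Second, we assigned different costs to the two kinds of mistakes. Following the tutorial’s logic, we used a cost matrix to give false negatives a higher weight than false positives. With “Satisfied” as the positive class, as in the confusion matrices in this section, a false negative is a satisfied passenger who is predicted to be “Neutral or Dissatisfied”, so this is the error that the matrix below penalizes more heavily. In our business case, the more expensive mistake is arguably the opposite one, mistaking a dissatisfied customer for a satisfied one; penalizing that error instead would require swapping the two off-diagonal weights, and the results below should be read with this in mind.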

We tested two cases of weight assigning and calculated the accuracy, precision, recall and F1 score for each one:


1) 2:1 cost ratio (false negative : false positive):

Accuracy: 78.22%

Precision: 0.70

Recall: 0.87

F1 Score: 0.78


2) 3:1 cost ratio (false negative : false positive):

Accuracy: 77.12%

Precision: 0.67

Recall: 0.90

F1 Score: 0.77

# Specifying the cost matrix
cost.matrix <- matrix(c(NA, 1,  # actual "Neutral or Dissatisfied", predicted "Satisfied": false positive, cost 1
                        2, NA), # actual "Satisfied", predicted "Neutral or Dissatisfied": false negative, cost 2
                      nrow = 2, 
                      ncol = 2,
                      byrow = FALSE)
rownames(cost.matrix) <- colnames(cost.matrix) <- c("Neutral or Dissatisfied", "Satisfied")
# Estimating the model with the cost matrix
model.cost <- C50::C5.0(satisfaction ~., 
              data = training_tree,
              costs = cost.matrix)

pred.test_tree <- predict(model.cost, testing_tree)
# Make sure factors have same levels
pred.test_tree <- factor(pred.test_tree, levels = levels(testing_tree$satisfaction))

# Show confusion matrix and full stats
confusion <- confusionMatrix(pred.test_tree, testing_tree$satisfaction, positive = "Satisfied")
print(confusion)
## Confusion Matrix and Statistics
## 
##                          Reference
## Prediction                Neutral or Dissatisfied Satisfied
##   Neutral or Dissatisfied                   15690      2183
##   Satisfied                                  6303     14788
##                                           
##                Accuracy : 0.7822          
##                  95% CI : (0.7781, 0.7863)
##     No Information Rate : 0.5644          
##     P-Value [Acc > NIR] : < 2.2e-16       
##                                           
##                   Kappa : 0.569           
##                                           
##  Mcnemar's Test P-Value : < 2.2e-16       
##                                           
##             Sensitivity : 0.8714          
##             Specificity : 0.7134          
##          Pos Pred Value : 0.7012          
##          Neg Pred Value : 0.8779          
##              Prevalence : 0.4356          
##          Detection Rate : 0.3795          
##    Detection Prevalence : 0.5413          
##       Balanced Accuracy : 0.7924          
##                                           
##        'Positive' Class : Satisfied       
## 
conf_matrix <- table(Predicted = pred.test_tree, Actual = testing_tree$satisfaction)

# Extract values (adjust levels if needed)
TP <- conf_matrix["Satisfied", "Satisfied"]
FP <- conf_matrix["Satisfied", "Neutral or Dissatisfied"]
FN <- conf_matrix["Neutral or Dissatisfied", "Satisfied"]
TN <- conf_matrix["Neutral or Dissatisfied", "Neutral or Dissatisfied"]

# Additional Metrics to evaluate
# accuracy <- (TP + TN) / sum(conf_matrix) #to test if our calculations are correct
precision <- TP / (TP + FP)
recall <- TP / (TP + FN)
f1_score <- 2 * (precision * recall) / (precision + recall)

# Print
# cat("Accuracy:", round(accuracy, 3), "\n")
cat("Precision:", round(precision, 4), "\n")
## Precision: 0.7012
cat("Recall:", round(recall, 4), "\n")
## Recall: 0.8714
cat("F1 Score:", round(f1_score, 4), "\n")
## F1 Score: 0.777

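The trade-off in this method is a substantial decrease in the number of false negatives (from 3772 to 2183, a reduction of about 42%), at the price of a slightly lower accuracy. A false negative here is a satisfied passenger classified as “Neutral or Dissatisfied”, so the model became more sensitive to satisfied customers: recall for the “Satisfied” class rose from 0.78 to 0.87. The drop in precision (from 0.77 to 0.70) is expected, as the model now classifies more customers as satisfied, and the number of dissatisfied passengers wrongly predicted as satisfied grew from 4028 to 6303. If detecting dissatisfaction is the business priority, this side effect works against it, and the two cost weights would need to be reversed.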
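To see what the cost matrix rewards, the total misclassification cost of each model can be computed directly from the test-set confusion matrices reported above. The sketch below assumes the layout rows = predicted class, columns = actual class, which matches how the cost-sensitive model behaved in our output; the helper total_cost and the use of 0 instead of NA on the diagonal are illustrative choices.

```r
classes <- c("Neutral or Dissatisfied", "Satisfied")
dn <- list(Predicted = classes, Actual = classes)

# Costs as used above: 1 for predicting "Satisfied" when the passenger is
# dissatisfied, 2 for predicting "Neutral or Dissatisfied" when satisfied
costs <- matrix(c(0, 1,
                  2, 0), nrow = 2, dimnames = dn)

# Test-set confusion matrices reported above (filled column by column)
baseline       <- matrix(c(17965, 4028, 3772, 13199), nrow = 2, dimnames = dn)
cost_sensitive <- matrix(c(15690, 6303, 2183, 14788), nrow = 2, dimnames = dn)

# Hypothetical helper: sum of (number of errors x cost per error)
total_cost <- function(cm, costs) sum(cm * costs)

cost_used <- c(baseline       = total_cost(baseline, costs),
               cost_sensitive = total_cost(cost_sensitive, costs))

# Same comparison with the two weights swapped (missed dissatisfaction costs 2)
cost_swapped <- c(baseline       = total_cost(baseline, t(costs)),
                  cost_sensitive = total_cost(cost_sensitive, t(costs)))

print(cost_used)
print(cost_swapped)
```

Under the weights we used, the cost-sensitive tree is cheaper than the baseline (10669 versus 11572 cost units). If missed dissatisfaction were the expensive error, the same tree would be more expensive than the baseline (14789 versus 11828), which is why the direction of the weights matters.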

  3. Random forest

Lastly, we built a random forest. The parameters include:

ntree: how many trees to grow

replace: whether the cases used to grow each tree are sampled with replacement (TRUE) or without replacement (FALSE)

mtry: how many predictor variables are randomly sampled as candidates at each split

We took 4 different cases and we paid attention to how changing mtry and replace influenced results:

1) replace= FALSE

Accuracy: 77.31%

Precision: 0.75

Recall: 0.79

F1 Score: 0.77

2) mtry = 2, replace = TRUE

Accuracy: 79.86%

Precision: 0.75

Recall: 0.79

F1 Score: 0.77

3) mtry = 4, replace = TRUE

Accuracy: 79.24%

Precision: 0.75

Recall: 0.77

F1 Score: 0.76

4) mtry = 6, replace = TRUE

Accuracy: 77.21%

Precision: 0.73

Recall: 0.74

F1 Score: 0.74


At the beginning, we set replace to FALSE, which means that the cases for each tree are sampled without replacement; the run shown below uses this setting together with mtry = 2. Then we tried three forests with replace = TRUE and mtry set to 2, 4 and 6. As mtry increased, the accuracy slightly dropped, and so did precision and recall. A plausible explanation is that with a larger mtry the same strong predictors are available at almost every split, so the trees become more similar to each other and averaging them helps less. With only two candidate predictors per split, the trees are more diverse. The best trade-off was achieved with mtry = 2, where we saw the best combination of accuracy and F1 score, indicating that more diverse trees (fewer candidate variables per split) helped the forest perform better overall.

model.forest <- randomForest::randomForest(satisfaction ~., 
                                           data = training_tree,
                                           ntree = 500, # trees to be grown
                                           mtry = 2, # variables to sample at each split
                                           replace = FALSE) # sampling of cases with or without replacement

pred.test_tree <- predict(model.forest, testing_tree)
# Make sure factors have same levels
pred.test_tree <- factor(pred.test_tree, levels = levels(testing_tree$satisfaction))

# Show confusion matrix and full stats
confusion <- confusionMatrix(pred.test_tree, testing_tree$satisfaction, positive = "Satisfied")
print(confusion)
## Confusion Matrix and Statistics
## 
##                          Reference
## Prediction                Neutral or Dissatisfied Satisfied
##   Neutral or Dissatisfied                   17680      3518
##   Satisfied                                  4313     13453
##                                         
##                Accuracy : 0.799         
##                  95% CI : (0.795, 0.803)
##     No Information Rate : 0.5644        
##     P-Value [Acc > NIR] : < 2.2e-16     
##                                         
##                   Kappa : 0.5934        
##                                         
##  Mcnemar's Test P-Value : < 2.2e-16     
##                                         
##             Sensitivity : 0.7927        
##             Specificity : 0.8039        
##          Pos Pred Value : 0.7572        
##          Neg Pred Value : 0.8340        
##              Prevalence : 0.4356        
##          Detection Rate : 0.3453        
##    Detection Prevalence : 0.4560        
##       Balanced Accuracy : 0.7983        
##                                         
##        'Positive' Class : Satisfied     
## 
conf_matrix <- table(Predicted = pred.test_tree, Actual = testing_tree$satisfaction)

# Extract values (adjust levels if needed)
TP <- conf_matrix["Satisfied", "Satisfied"]
FP <- conf_matrix["Satisfied", "Neutral or Dissatisfied"]
FN <- conf_matrix["Neutral or Dissatisfied", "Satisfied"]
TN <- conf_matrix["Neutral or Dissatisfied", "Neutral or Dissatisfied"]

# Additional Metrics to evaluate
# accuracy <- (TP + TN) / sum(conf_matrix) #to test if our calculations are correct
precision <- TP / (TP + FP)
recall <- TP / (TP + FN)
f1_score <- 2 * (precision * recall) / (precision + recall)

# Print
# cat("Accuracy:", round(accuracy, 3), "\n")
cat("Precision:", round(precision, 4), "\n")
## Precision: 0.7572
cat("Recall:", round(recall, 4), "\n")
## Recall: 0.7927
cat("F1 Score:", round(f1_score, 4), "\n")
## F1 Score: 0.7746

Interpreting Variable Importance (Gini Scores)

To better understand which variables contributed most to the random forest model, we used:

This produced Gini importance scores, which reflect how much each variable reduced node impurity across all trees in the forest. Node impurity refers to how mixed the classes are in a decision tree split — lower impurity means the data is better separated, so a variable that reduces impurity more is considered more important.

A higher Gini score means the variable was more frequently used in splits and contributed more to improving the classification.

These scores help identify the most influential features in predicting customer satisfaction.

For example, if flight.distance or type.of.travel scored highly, it means they were critical in helping the model distinguish between satisfied and dissatisfied customers.

varImpPlot(model.forest)
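
As a minimal illustration of what "reducing node impurity" means, the following sketch computes the Gini impurity of a toy node and the decrease achieved by a single split. The labels, the split, and the helper name gini_impurity are invented for illustration; the random forest accumulates decreases of this kind over all splits and trees internally.

```r
# Gini impurity of a set of class labels: 1 - sum of squared class shares
gini_impurity <- function(labels) {
  shares <- table(labels) / length(labels)
  1 - sum(shares^2)
}

# Toy parent node: 4 satisfied and 6 dissatisfied passengers (invented data)
parent <- c(rep("Satisfied", 4), rep("Neutral or Dissatisfied", 6))

# A hypothetical split sends these passengers to two child nodes
left  <- c(rep("Satisfied", 4), rep("Neutral or Dissatisfied", 1))
right <- rep("Neutral or Dissatisfied", 5)

# Size-weighted impurity of the children, then the decrease credited to the split variable
child_impurity <- (length(left) * gini_impurity(left) +
                   length(right) * gini_impurity(right)) / length(parent)
gini_decrease <- gini_impurity(parent) - child_impurity

cat("Parent impurity:", gini_impurity(parent), "\n")
cat("Gini decrease:", gini_decrease, "\n")
```

A variable that produces many splits with a large decrease ends up with a high importance score in the plot.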

Of the three improvement methods we attempted, only boosting increased accuracy, but the exercise still produced valuable observations. Even when accuracy stays flat or drops slightly, improvements in recall, precision, or F1 score can signal a more balanced model. This matters especially in business settings where certain types of errors are more costly than others, which is why evaluating models on multiple metrics is crucial for making informed decisions.

Testing with individual satisfaction scores

In the final stage of our model improvement, we tested a version using only the individual satisfaction score variables to predict overall satisfaction. This dataset included scaled satisfaction ratings (from 1 to 5) for various flight-related aspects, such as seat comfort, cleanliness, food, and more.

We are fully aware that this approach introduces bias and leads to circular reasoning, as we are essentially using components of overall satisfaction to predict overall satisfaction itself. However, our intention here was not to develop a fair or generalizable model, but rather to explore which specific service elements contribute most strongly to customer satisfaction from a modeling perspective.

We built and evaluated this version using both a C5.0 decision tree and a random forest model. The C5.0 decision tree model helped us visualize the logic the model follows when making predictions based on satisfaction scores. As expected, both models produced exceptionally high accuracy, bordering on unrealistic, which reinforces the concern about circularity and redundancy described above.

In a second step, we used the random forest model and examined variable importance using Gini scores via the varImpPlot() function. These Gini importance scores reflect how much each variable contributed to reducing impurity across the trees, helping us identify which features (e.g., inflight wifi, legroom service, or seat comfort) were most influential. These insights can help airlines prioritize customer experience improvements by focusing on the most impactful service aspects.

In summary, while this model is not suitable for real-world deployment, it serves a valuable exploratory purpose: highlighting which service components passengers consistently associate with their overall satisfaction.


dta_satisfaction_tree <- dta_model[, c(7:21)]

dta_satisfaction_tree$satisfaction <- as.factor(dta_satisfaction_tree$satisfaction)
set.seed(46748717)

proportion <- 0.7 # desired proportion here
split_satisfaction_tree <- rsample::initial_split(dta_satisfaction_tree, prop = proportion) #split the data
training_st_tree <- training(split_satisfaction_tree)
testing_st_tree <- testing(split_satisfaction_tree)

model_tree <- C50::C5.0(satisfaction ~., data = training_st_tree)

pred.test_tree <- predict(model_tree, testing_st_tree)


# Make sure factors have same levels
pred.test_tree <- factor(pred.test_tree, levels = levels(testing_st_tree$satisfaction))

# Show confusion matrix and full stats
confusion <- confusionMatrix(pred.test_tree, testing_st_tree$satisfaction, positive = "Satisfied")
print(confusion)
## Confusion Matrix and Statistics
## 
##                          Reference
## Prediction                Neutral or Dissatisfied Satisfied
##   Neutral or Dissatisfied                   21151      1563
##   Satisfied                                   842     15408
##                                           
##                Accuracy : 0.9383          
##                  95% CI : (0.9358, 0.9406)
##     No Information Rate : 0.5644          
##     P-Value [Acc > NIR] : < 2.2e-16       
##                                           
##                   Kappa : 0.8739          
##                                           
##  Mcnemar's Test P-Value : < 2.2e-16       
##                                           
##             Sensitivity : 0.9079          
##             Specificity : 0.9617          
##          Pos Pred Value : 0.9482          
##          Neg Pred Value : 0.9312          
##              Prevalence : 0.4356          
##          Detection Rate : 0.3954          
##    Detection Prevalence : 0.4171          
##       Balanced Accuracy : 0.9348          
##                                           
##        'Positive' Class : Satisfied       
## 
conf_matrix <- table(Predicted = pred.test_tree, Actual = testing_st_tree$satisfaction)

# Extract values (adjust levels if needed)
TP <- conf_matrix["Satisfied", "Satisfied"]
FP <- conf_matrix["Satisfied", "Neutral or Dissatisfied"]
FN <- conf_matrix["Neutral or Dissatisfied", "Satisfied"]
TN <- conf_matrix["Neutral or Dissatisfied", "Neutral or Dissatisfied"]

# Additional Metrics to evaluate
# accuracy <- (TP + TN) / sum(conf_matrix) #to test if our calculations are correct
precision <- TP / (TP + FP)
recall <- TP / (TP + FN)
f1_score <- 2 * (precision * recall) / (precision + recall)

# Print
# cat("Accuracy:", round(accuracy, 3), "\n")
cat("Precision:", round(precision, 4), "\n")
## Precision: 0.9482
cat("Recall:", round(recall, 4), "\n")
## Recall: 0.9079
cat("F1 Score:", round(f1_score, 4), "\n")
## F1 Score: 0.9276

Visualization of satisfaction tree

To complement our results, we also visualized the decision tree model. The figure below illustrates how the model makes its predictions by sequentially splitting the data based on satisfaction-related variables like online boarding, in-flight WiFi service, and leg room service.

Each branch of the tree represents a decision rule, and each leaf shows the predicted class along with the proportion of samples it applies to. For example, customers who rated online boarding below 3.2 and in-flight WiFi service below 3.5 were overwhelmingly predicted to be dissatisfied - a pattern that intuitively aligns with what we would expect from a service experience perspective.

This visual further supports our goal of identifying the key satisfaction drivers. Even though the model itself is biased and not generalizable, the structure of the tree helps us see which service dimensions most consistently influence satisfaction. These findings can guide practical improvements, as they clearly point to which areas of the service experience passengers react to most strongly.

library(rpart)
library(rpart.plot)

model_rpart <- rpart(satisfaction ~ ., data = training_st_tree, method = "class")
rpart.plot(model_rpart, type = 2, extra = 104, fallen.leaves = TRUE)

Interpreting Variable Importance (Gini Scores)

As part of our analysis, we extracted the Gini importance scores using the varImpPlot() function from the random forest model. These scores quantify how much each variable contributed to reducing uncertainty (impurity) in the model’s predictions across all trees.

This helped us pinpoint which service dimensions mattered most in shaping customer satisfaction. For example, features like online boarding, in-flight WiFi service, and cleanliness stood out as consistently influential in the model’s decision process.

Although the model itself is not reliable for prediction due to its biased setup, the relative importance of these features is still valuable from a business strategy perspective. Airlines can use this information to focus improvement efforts on the areas that passengers care about most, as identified through consistent patterns in the data.

model.forest <- randomForest::randomForest(satisfaction ~., 
                                           data = training_st_tree,
                                           ntree = 500, # trees to be grown
                                           mtry = 2, # variables to sample at each split
                                           replace = FALSE) # sampling of cases with or without replacement
varImpPlot(model.forest)

4. Model 2: Logistic regression

For the logistic regression model, we started by setting the same seed as we used for the decision tree. This ensured that the data was split consistently, allowing for a fair comparison between the two models. We created a new copy of the dataset that we used for the decision tree (called dta_lreg) in order to keep the character variables as factors since logistic regression in R requires the predictors to be either factors or numerical when doing classification. The target variable satisfaction was converted into a factor, as logistic regression in R requires a categorical outcome for classification tasks. We then split the dataset into training (70%) and testing (30%) sets. This approach allowed the model to be trained on a subset of the data while being evaluated on unseen observations, helping us assess its generalizability. A logistic regression model was fitted using all predictor variables, with the family = binomial argument indicating that it is a binary classification problem.

training_lreg <- training_tree
testing_lreg <- testing_tree
str(training_tree)
## 'data.frame':    90916 obs. of  8 variables:
##  $ gender         : Factor w/ 2 levels "Female","Male": 2 2 2 2 2 2 1 1 2 1 ...
##  $ age            : int  37 12 62 44 10 25 51 44 56 63 ...
##  $ customer.type  : Factor w/ 2 levels "First-time","Returning": 2 2 2 2 1 2 2 2 2 2 ...
##  $ type.of.travel : Factor w/ 2 levels "Business","Personal": 1 2 1 1 1 1 2 1 2 1 ...
##  $ class          : Factor w/ 3 levels "Business","Economy",..: 2 2 1 2 2 1 2 1 3 1 ...
##  $ flight.distance: int  529 633 1218 541 979 725 1371 227 462 2015 ...
##  $ satisfaction   : Factor w/ 2 levels "Neutral or Dissatisfied",..: 2 1 2 2 1 1 1 2 2 2 ...
##  $ delay.avg      : num  128.5 137.5 40.5 8 15 ...
model_lreg <- glm(satisfaction ~ ., data = training_lreg, family = binomial) #Fit the logistic regression model 

summary(model_lreg) #View the model summary
## 
## Call:
## glm(formula = satisfaction ~ ., family = binomial, data = training_lreg)
## 
## Coefficients:
##                          Estimate Std. Error z value Pr(>|z|)    
## (Intercept)            -4.188e-01  3.193e-02 -13.118   <2e-16 ***
## genderMale              1.374e-02  1.687e-02   0.815    0.415    
## age                    -2.204e-04  6.286e-04  -0.351    0.726    
## customer.typeReturning  1.733e+00  2.434e-02  71.204   <2e-16 ***
## type.of.travelPersonal -2.306e+00  2.537e-02 -90.892   <2e-16 ***
## classEconomy           -1.257e+00  2.109e-02 -59.594   <2e-16 ***
## classEconomy Plus      -1.417e+00  3.521e-02 -40.235   <2e-16 ***
## flight.distance         6.956e-06  9.533e-06   0.730    0.466    
## delay.avg              -4.661e-03  2.369e-04 -19.675   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 124447  on 90915  degrees of freedom
## Residual deviance:  87211  on 90907  degrees of freedom
## AIC: 87229
## 
## Number of Fisher Scoring iterations: 4
pred.test_lreg <- predict(model_lreg, newdata = testing_lreg, type = "response") #Predict probabilities on the test set

pred_classes <- ifelse(pred.test_lreg > 0.5, "Satisfied", "Neutral or Dissatisfied") # Convert probabilities to predicted classes (using a threshold of 0.5)

Predictions were made on the test set in terms of probabilities, which were then converted into class labels using a 0.5 threshold. If the predicted probability was greater than 0.5, the model classified the observation as “Satisfied”; otherwise, it was classified as “Neutral or Dissatisfied.” A confusion matrix was generated to compare the predicted satisfaction levels with the actual ones, giving insight into the model’s classification performance. Finally, the model’s accuracy, precision, recall and F1 score were calculated and printed to evaluate how well it performed on the test set.

#Evaluating the model:

# Make sure both vectors are factors with the same level order
pred_classes <- factor(pred_classes, levels = c("Neutral or Dissatisfied", "Satisfied"))
actual_classes <- factor(testing_lreg$satisfaction, levels = c("Neutral or Dissatisfied", "Satisfied"))

confusion_matrix <- confusionMatrix(pred_classes, actual_classes, positive = "Satisfied")
print(confusion_matrix)
## Confusion Matrix and Statistics
## 
##                          Reference
## Prediction                Neutral or Dissatisfied Satisfied
##   Neutral or Dissatisfied                   17303      3872
##   Satisfied                                  4690     13099
##                                           
##                Accuracy : 0.7803          
##                  95% CI : (0.7761, 0.7844)
##     No Information Rate : 0.5644          
##     P-Value [Acc > NIR] : < 2.2e-16       
##                                           
##                   Kappa : 0.5555          
##                                           
##  Mcnemar's Test P-Value : < 2.2e-16       
##                                           
##             Sensitivity : 0.7718          
##             Specificity : 0.7868          
##          Pos Pred Value : 0.7364          
##          Neg Pred Value : 0.8171          
##              Prevalence : 0.4356          
##          Detection Rate : 0.3362          
##    Detection Prevalence : 0.4565          
##       Balanced Accuracy : 0.7793          
##                                           
##        'Positive' Class : Satisfied       
## 
# Confusion matrix
conf_matrix <- table(Predicted = pred_classes, Actual = actual_classes)


# Extract values (adjust levels if needed)
TP <- conf_matrix["Satisfied", "Satisfied"]
FP <- conf_matrix["Satisfied", "Neutral or Dissatisfied"]
FN <- conf_matrix["Neutral or Dissatisfied", "Satisfied"]
TN <- conf_matrix["Neutral or Dissatisfied", "Neutral or Dissatisfied"]

# Additional Metrics to evaluate
# accuracy <- (TP + TN) / sum(conf_matrix) #to test if our calculations are correct
precision <- TP / (TP + FP)
recall <- TP / (TP + FN)
f1_score <- 2 * (precision * recall) / (precision + recall)

# Print
# cat("Accuracy:", round(accuracy, 3), "\n")
cat("Precision:", round(precision, 4), "\n")
## Precision: 0.7364
cat("Recall:", round(recall, 4), "\n")
## Recall: 0.7718
cat("F1 Score:", round(f1_score, 4), "\n")
## F1 Score: 0.7537
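
Because the same metric calculations recur for every model, they could be wrapped in a small helper. The sketch below is one possible way to do this; the function name classification_metrics and the object conf_matrix_lreg are hypothetical, and the counts are copied by hand from the confusion matrix printed above so that the block runs on its own.

```r
# Hypothetical helper: accuracy, precision, recall and F1 from a 2x2 table
# with predictions in rows and actual classes in columns
classification_metrics <- function(conf_matrix, positive = "Satisfied") {
  negative <- setdiff(rownames(conf_matrix), positive)
  TP <- conf_matrix[positive, positive]
  FP <- conf_matrix[positive, negative]
  FN <- conf_matrix[negative, positive]
  TN <- conf_matrix[negative, negative]
  precision <- TP / (TP + FP)
  recall <- TP / (TP + FN)
  c(accuracy = (TP + TN) / sum(conf_matrix),
    precision = precision,
    recall = recall,
    f1 = 2 * precision * recall / (precision + recall))
}

# Counts copied from the logistic regression confusion matrix above (filled column by column)
labels <- c("Neutral or Dissatisfied", "Satisfied")
conf_matrix_lreg <- matrix(c(17303, 4690, 3872, 13099), nrow = 2,
                           dimnames = list(Predicted = labels, Actual = labels))

metrics_lreg <- classification_metrics(conf_matrix_lreg)
print(round(metrics_lreg, 4))
```

Passing the output of table(Predicted = ..., Actual = ...) instead of the hand-built matrix should work the same way, since both are indexed by the class labels.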

#Improving the model:

  1. Using optimal threshold level

First, we tried to improve the accuracy of the model by adjusting the classification threshold. Logistic regression itself outputs probabilities; by convention, and in our code above, a threshold of 0.5 is used to classify outcomes: if the predicted probability of being “Satisfied” is greater than 0.5, the observation is classified as “Satisfied”; otherwise, it is labeled as “Neutral or Dissatisfied.” However, this cutoff may not be the most effective point for maximizing model performance.

We plotted the Receiver Operating Characteristic (ROC) curve to visually assess the trade-off between sensitivity (recall) and specificity. Using the coords() function from the pROC package, we identified the “best” threshold (by default, the one that maximizes the sum of sensitivity and specificity) and found it to be around 0.47.

library(pROC)
## Type 'citation("pROC")' for a citation.
## 
## Attaching package: 'pROC'
## The following object is masked from 'package:gmodels':
## 
##     ci
## The following objects are masked from 'package:stats':
## 
##     cov, smooth, var
roc_obj <- roc(testing_lreg$satisfaction, pred.test_lreg)
## Setting levels: control = Neutral or Dissatisfied, case = Satisfied
## Setting direction: controls < cases
plot(roc_obj)

coords(roc_obj, "best", ret="threshold")
##   threshold
## 1 0.4701863
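
To make the selection rule concrete, here is a small base R sketch of the criterion that pROC applies by default for "best", Youden's J (sensitivity + specificity - 1). The probabilities, labels and candidate cutoffs are invented toy values, not taken from our data.

```r
# Toy predicted probabilities and true classes (invented; 1 = Satisfied)
probs  <- c(0.05, 0.10, 0.20, 0.30, 0.40, 0.45, 0.55, 0.60, 0.70, 0.80)
actual <- c(0,    0,    0,    0,    1,    0,    0,    1,    1,    1)

# Candidate cutoffs placed between neighbouring probabilities
thresholds <- c(0.25, 0.35, 0.425, 0.5, 0.575, 0.65, 0.75)

# Youden's J = sensitivity + specificity - 1 for each cutoff
youden_j <- sapply(thresholds, function(t) {
  predicted <- as.integer(probs > t)
  sensitivity <- sum(predicted == 1 & actual == 1) / sum(actual == 1)
  specificity <- sum(predicted == 0 & actual == 0) / sum(actual == 0)
  sensitivity + specificity - 1
})

best_threshold <- thresholds[which.max(youden_j)]
print(data.frame(threshold = thresholds, youden_j = round(youden_j, 3)))
cat("Threshold maximizing Youden's J:", best_threshold, "\n")
```

Note that this criterion weighs sensitivity and specificity equally, so the threshold it selects does not have to coincide with the threshold that maximizes accuracy.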

To further validate this, we plotted accuracy against the threshold. The plot confirmed that thresholds close to 0.5 produced the highest accuracy values, peaking around 0.47–0.51. Based on this, we set the classification threshold to 0.51, which gave the following values: Precision: 0.7455, Recall: 0.7588, F1 Score: 0.7521 and Accuracy: 0.7821.

Compared with the 0.5 threshold, accuracy rose slightly (from 0.7803 to 0.7821) and precision improved (from 0.7364 to 0.7455), while recall fell (from 0.7718 to 0.7588) and the F1 score was essentially unchanged (0.7537 versus 0.7521). The higher threshold therefore trades a small amount of recall for fewer false positives rather than improving both.

thresholds <- seq(0, 1, by = 0.01)
accuracies <- sapply(thresholds, function(t) {
  pred_classes <- ifelse(pred.test_lreg > t, "Satisfied", "Neutral or Dissatisfied")
  mean(pred_classes == testing_lreg$satisfaction)
})
plot(thresholds, accuracies, type = "l", xlab = "Threshold", ylab = "Accuracy")
abline(v = 0.5, col = "red", lty = 2)

model_lreg <- glm(satisfaction ~ ., data = training_lreg, family = binomial) #Fit the logistic regression model 


pred.test_lreg <- predict(model_lreg, newdata = testing_lreg, type = "response") #Predict probabilities on the test set

pred_classes <- ifelse(pred.test_lreg > 0.51, "Satisfied", "Neutral or Dissatisfied") # Convert probabilities to predicted classes

# Make sure both vectors are factors with the same level order
pred_classes <- factor(pred_classes, levels = c("Neutral or Dissatisfied", "Satisfied"))
actual_classes <- factor(testing_lreg$satisfaction, levels = c("Neutral or Dissatisfied", "Satisfied"))

confusion_matrix <- confusionMatrix(pred_classes, actual_classes, positive = "Satisfied")
print(confusion_matrix)
## Confusion Matrix and Statistics
## 
##                          Reference
## Prediction                Neutral or Dissatisfied Satisfied
##   Neutral or Dissatisfied                   17598      4094
##   Satisfied                                  4395     12877
##                                          
##                Accuracy : 0.7821         
##                  95% CI : (0.778, 0.7862)
##     No Information Rate : 0.5644         
##     P-Value [Acc > NIR] : < 2e-16        
##                                          
##                   Kappa : 0.5578         
##                                          
##  Mcnemar's Test P-Value : 0.00113        
##                                          
##             Sensitivity : 0.7588         
##             Specificity : 0.8002         
##          Pos Pred Value : 0.7455         
##          Neg Pred Value : 0.8113         
##              Prevalence : 0.4356         
##          Detection Rate : 0.3305         
##    Detection Prevalence : 0.4433         
##       Balanced Accuracy : 0.7795         
##                                          
##        'Positive' Class : Satisfied      
## 
# Confusion matrix
conf_matrix <- table(Predicted = pred_classes, Actual = actual_classes)


# Extract values (adjust levels if needed)
TP <- conf_matrix["Satisfied", "Satisfied"]
FP <- conf_matrix["Satisfied", "Neutral or Dissatisfied"]
FN <- conf_matrix["Neutral or Dissatisfied", "Satisfied"]
TN <- conf_matrix["Neutral or Dissatisfied", "Neutral or Dissatisfied"]

# Additional Metrics to evaluate
# accuracy <- (TP + TN) / sum(conf_matrix) #to test if our calculations are correct
precision <- TP / (TP + FP)
recall <- TP / (TP + FN)
f1_score <- 2 * (precision * recall) / (precision + recall)

# Print
# cat("Accuracy:", round(accuracy, 3), "\n")
cat("Precision:", round(precision, 4), "\n")
## Precision: 0.7455
cat("Recall:", round(recall, 4), "\n")
## Recall: 0.7588
cat("F1 Score:", round(f1_score, 4), "\n")
## F1 Score: 0.7521
  2. Scaling variables

As a second improvement method for the logistic regression model, we standardized the numeric predictor variables. This step is not strictly required: the fitted probabilities of a logistic regression without regularization do not change when predictors are linearly rescaled. It is nevertheless considered good practice, because variables with large ranges (such as flight.distance or average delay times) produce very small coefficients that are hard to compare with the others, and large differences in scale can contribute to numerical instability or slower convergence during fitting.

To address this, we standardized the key numeric variables (age, flight.distance, and delay.avg) by centering them around their mean and scaling them by their standard deviation - using statistics derived from the training set to avoid data leakage. The same transformation was then applied to the testing data using the training parameters.

# Get numeric columns
numeric_vars <- c("age", "flight.distance", "delay.avg")

# Standardize on training set
for (var in numeric_vars) {
  mean_val <- mean(training_lreg[[var]], na.rm = TRUE)
  sd_val <- sd(training_lreg[[var]], na.rm = TRUE)
  
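  # scale() returns a one-column matrix; as.numeric() keeps the columns as plain numeric vectors
  training_lreg[[var]] <- as.numeric(scale(training_lreg[[var]], center = mean_val, scale = sd_val))
  testing_lreg[[var]] <- as.numeric(scale(testing_lreg[[var]], center = mean_val, scale = sd_val))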
}
model_lreg <- glm(satisfaction ~ ., data = training_lreg, family = binomial)

pred.test_lreg <- predict(model_lreg, newdata = testing_lreg, type = "response")
pred_classes <- ifelse(pred.test_lreg > 0.5, "Satisfied", "Neutral or Dissatisfied")

pred_classes <- factor(pred_classes, levels = c("Neutral or Dissatisfied", "Satisfied"))
actual_classes <- factor(testing_lreg$satisfaction, levels = c("Neutral or Dissatisfied", "Satisfied"))

confusion_matrix <- confusionMatrix(pred_classes, actual_classes, positive = "Satisfied")
print(confusion_matrix)
## Confusion Matrix and Statistics
## 
##                          Reference
## Prediction                Neutral or Dissatisfied Satisfied
##   Neutral or Dissatisfied                   17303      3872
##   Satisfied                                  4690     13099
##                                           
##                Accuracy : 0.7803          
##                  95% CI : (0.7761, 0.7844)
##     No Information Rate : 0.5644          
##     P-Value [Acc > NIR] : < 2.2e-16       
##                                           
##                   Kappa : 0.5555          
##                                           
##  Mcnemar's Test P-Value : < 2.2e-16       
##                                           
##             Sensitivity : 0.7718          
##             Specificity : 0.7868          
##          Pos Pred Value : 0.7364          
##          Neg Pred Value : 0.8171          
##              Prevalence : 0.4356          
##          Detection Rate : 0.3362          
##    Detection Prevalence : 0.4565          
##       Balanced Accuracy : 0.7793          
##                                           
##        'Positive' Class : Satisfied       
## 
# Manual confusion matrix
conf_matrix <- table(Predicted = pred_classes, Actual = actual_classes)

# Extract values
TP <- conf_matrix["Satisfied", "Satisfied"]
FP <- conf_matrix["Satisfied", "Neutral or Dissatisfied"]
FN <- conf_matrix["Neutral or Dissatisfied", "Satisfied"]
TN <- conf_matrix["Neutral or Dissatisfied", "Neutral or Dissatisfied"]

# Precision, recall, F1
precision <- TP / (TP + FP)
recall <- TP / (TP + FN)
f1_score <- 2 * (precision * recall) / (precision + recall)

# Output
cat("Precision:", round(precision, 4), "\n")
## Precision: 0.7364
cat("Recall:", round(recall, 4), "\n")
## Recall: 0.7718
cat("F1 Score:", round(f1_score, 4), "\n")
## F1 Score: 0.7537

After scaling the variables, the logistic regression model was re-fitted and evaluated at the 0.5 threshold. The results were identical to those of the unscaled model at the same threshold: Accuracy: 0.7803, Precision: 0.7364, Recall: 0.7718 and F1 Score: 0.7537.

Relative to the threshold-tuned model, precision is slightly lower and recall slightly higher, but this difference comes from returning to the 0.5 threshold, not from scaling. The confusion matrix matches that of the original model cell for cell.

This shows that scaling did not change model performance, which is expected for a logistic regression without regularization. Its value lies in putting the coefficients of the numeric predictors on a comparable scale, which is particularly useful if we later want to examine or compare coefficients. It also ensures consistency across preprocessing steps, which is important when deploying or expanding the model.
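
The identical confusion matrices before and after scaling can be reproduced in a small simulation. The sketch below uses invented data (not the airline data) with a hypothetical predictor called distance, and compares the fitted probabilities of a logistic regression on the raw and on the standardized predictor.

```r
set.seed(1)  # arbitrary seed for this toy example

# Simulated predictor on a large scale and a binary outcome (invented data)
n <- 500
distance <- rnorm(n, mean = 1000, sd = 300)
satisfied <- rbinom(n, size = 1, prob = plogis((distance - 1000) / 300))
toy <- data.frame(distance = distance, satisfied = satisfied)

# Fit once on the raw predictor and once on the standardized predictor
fit_raw <- glm(satisfied ~ distance, data = toy, family = binomial)
toy$distance_std <- (toy$distance - mean(toy$distance)) / sd(toy$distance)
fit_std <- glm(satisfied ~ distance_std, data = toy, family = binomial)

# The coefficients differ, but the fitted probabilities agree up to numerical tolerance
print(coef(fit_raw))
print(coef(fit_std))
max_diff <- max(abs(fitted(fit_raw) - fitted(fit_std)))
cat("Largest difference in fitted probabilities:", max_diff, "\n")
```

Since the probabilities agree, any fixed threshold yields the same class predictions in both fits. This would no longer hold for a penalized model, where the penalty depends on the scale of the coefficients.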

  3. Cross Validation

As a final improvement step, we applied 10-fold cross-validation to evaluate the logistic regression model’s stability and generalizability. Cross-validation is a widely used technique to ensure that the model performs well not just on one specific train-test split, but across multiple folds of the data.

In our approach, the data was split into 10 equal parts (folds). The model was trained on 9 of them and tested on the remaining one, and this process was repeated 10 times so that each fold served as the test set once. Averaging over the folds gives a more realistic estimate of model performance and helps reveal overfitting.

We used the train() function from the caret package, specifying method = "glm" for logistic regression and trControl = trainControl(method = "cv", number = 10) to activate the 10-fold cross-validation.

The cross-validated accuracy of 0.7850 (Kappa 0.5645) is close to the 0.7803 obtained on the test set. Cross-validation therefore does not improve the model itself, but it shows that the model’s performance isn’t overly reliant on one data split. This increases confidence in its ability to generalize to unseen data.

library(caret)
train_control <- trainControl(method = "cv", number = 10)
train(satisfaction ~ ., data = training_lreg, method = "glm", family = "binomial", trControl = train_control)
## Generalized Linear Model 
## 
## 90916 samples
##     7 predictor
##     2 classes: 'Neutral or Dissatisfied', 'Satisfied' 
## 
## No pre-processing
## Resampling: Cross-Validated (10 fold) 
## Summary of sample sizes: 81825, 81825, 81824, 81824, 81824, 81824, ... 
## Resampling results:
## 
##   Accuracy   Kappa    
##   0.7850103  0.5644907
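
For readers who want to see what happens inside the cross-validation, the following base R sketch spells out the fold loop by hand on simulated data. It is an illustration under assumed toy data (the variables x and y are invented), not a reproduction of caret's internals.

```r
set.seed(1)  # arbitrary seed for this toy example

# Simulated data (invented): one numeric predictor, binary outcome
n <- 1000
toy <- data.frame(x = rnorm(n))
toy$y <- factor(ifelse(rbinom(n, 1, plogis(2 * toy$x)) == 1,
                       "Satisfied", "Neutral or Dissatisfied"),
                levels = c("Neutral or Dissatisfied", "Satisfied"))

# Assign every row to one of k folds at random
k <- 10
fold_id <- sample(rep(1:k, length.out = n))

# Train on k - 1 folds, then measure accuracy on the held-out fold
fold_accuracy <- sapply(1:k, function(i) {
  train_rows <- toy[fold_id != i, ]
  test_rows <- toy[fold_id == i, ]
  fit <- glm(y ~ x, data = train_rows, family = binomial)
  prob <- predict(fit, newdata = test_rows, type = "response")
  pred <- ifelse(prob > 0.5, "Satisfied", "Neutral or Dissatisfied")
  mean(pred == test_rows$y)
})

cat("Accuracy per fold:", round(fold_accuracy, 3), "\n")
cat("Cross-validated accuracy:", round(mean(fold_accuracy), 4), "\n")
```

The spread of the per-fold accuracies gives a rough sense of how much the estimate depends on the particular split.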

Logistic Regression over Individual Satisfaction Scores

training_lreg <- training_st_tree
testing_lreg <- testing_st_tree
str(training_st_tree)
## 'data.frame':    90916 obs. of  15 variables:
##  $ departure.and.arrival.time.convenience: num  5 4 3 2 4 3 5 4 5 1 ...
##  $ ease.of.online.booking                : num  5 3 3 2 4 3 3 4 4 2 ...
##  $ check.in.service                      : num  5 1 4 5 1 4 4 1 5 1 ...
##  $ online.boarding                       : num  5 1 5 4 4 3 5 3 4 3 ...
##  $ gate.location                         : num  5 4 3 2 4 3 1 4 3 1 ...
##  $ on.board.service                      : num  5 2 5 4 4 3 5 5 4 5 ...
##  $ seat.comfort                          : num  5 1 5 5 5 3 5 2 3 1 ...
##  $ leg.room.service                      : num  1 4 5 1 4 4 3 5 2 5 ...
##  $ cleanliness                           : num  5 1 4 5 5 3 4 4 3 2 ...
##  $ food.and.drink                        : num  5 3 4 5 5 3 2 4 3 3 ...
##  $ in.flight.service                     : num  1 4 5 4 4 3 5 5 4 5 ...
##  $ in.flight.wifi.service                : num  5 3 3 4 4 3 3 4 4 1 ...
##  $ in.flight.entertainment               : num  1 5 5 5 5 3 5 5 3 5 ...
##  $ baggage.handling                      : int  4 4 5 1 4 3 5 5 4 5 ...
##  $ satisfaction                          : Factor w/ 2 levels "Neutral or Dissatisfied",..: 2 1 2 2 1 1 1 2 2 2 ...
model_lreg <- glm(satisfaction ~ ., data = training_lreg, family = binomial) #Fit the logistic regression model 

summary(model_lreg) #View the model summary
## 
## Call:
## glm(formula = satisfaction ~ ., family = binomial, data = training_lreg)
## 
## Coefficients:
##                                         Estimate Std. Error  z value Pr(>|z|)
## (Intercept)                            -9.035311   0.070356 -128.422  < 2e-16
## departure.and.arrival.time.convenience -0.539724   0.009645  -55.960  < 2e-16
## ease.of.online.booking                  0.117194   0.012810    9.148  < 2e-16
## check.in.service                        0.256403   0.008347   30.719  < 2e-16
## online.boarding                         0.951635   0.010718   88.789  < 2e-16
## gate.location                          -0.005977   0.009354   -0.639   0.5228
## on.board.service                        0.326400   0.009919   32.908  < 2e-16
## seat.comfort                            0.161526   0.010772   14.995  < 2e-16
## leg.room.service                        0.351794   0.008346   42.152  < 2e-16
## cleanliness                             0.066286   0.011844    5.597 2.19e-08
## food.and.drink                         -0.076440   0.010476   -7.297 2.95e-13
## in.flight.service                       0.029508   0.011535    2.558   0.0105
## in.flight.wifi.service                  0.647199   0.012406   52.167  < 2e-16
## in.flight.entertainment                 0.262154   0.013468   19.465  < 2e-16
## baggage.handling                        0.054116   0.010961    4.937 7.92e-07
##                                           
## (Intercept)                            ***
## departure.and.arrival.time.convenience ***
## ease.of.online.booking                 ***
## check.in.service                       ***
## online.boarding                        ***
## gate.location                             
## on.board.service                       ***
## seat.comfort                           ***
## leg.room.service                       ***
## cleanliness                            ***
## food.and.drink                         ***
## in.flight.service                      *  
## in.flight.wifi.service                 ***
## in.flight.entertainment                ***
## baggage.handling                       ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 124447  on 90915  degrees of freedom
## Residual deviance:  70612  on 90901  degrees of freedom
## AIC: 70642
## 
## Number of Fisher Scoring iterations: 5
pred.test_lreg <- predict(model_lreg, newdata = testing_lreg, type = "response") #Predict probabilities on the test set

pred_classes <- ifelse(pred.test_lreg > 0.5, "Satisfied", "Neutral or Dissatisfied") # Convert probabilities to predicted classes (using a threshold of 0.5)
# Make sure both vectors are factors with the same level order
pred_classes <- factor(pred_classes, levels = c("Neutral or Dissatisfied", "Satisfied"))
actual_classes <- factor(testing_lreg$satisfaction, levels = c("Neutral or Dissatisfied", "Satisfied"))

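# Note: positive is not set here, so caret treats the first level ("Neutral or Dissatisfied") as the positive class.
# The manual precision, recall and F1 further below use "Satisfied" as the positive class.
confusion_matrix <- confusionMatrix(pred_classes, actual_classes)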
print(confusion_matrix)
## Confusion Matrix and Statistics
## 
##                          Reference
## Prediction                Neutral or Dissatisfied Satisfied
##   Neutral or Dissatisfied                   18648      3420
##   Satisfied                                  3345     13551
##                                                  
##                Accuracy : 0.8264                 
##                  95% CI : (0.8226, 0.8301)       
##     No Information Rate : 0.5644                 
##     P-Value [Acc > NIR] : <2e-16                 
##                                                  
##                   Kappa : 0.6467                 
##                                                  
##  Mcnemar's Test P-Value : 0.3683                 
##                                                  
##             Sensitivity : 0.8479                 
##             Specificity : 0.7985                 
##          Pos Pred Value : 0.8450                 
##          Neg Pred Value : 0.8020                 
##              Prevalence : 0.5644                 
##          Detection Rate : 0.4786                 
##    Detection Prevalence : 0.5664                 
##       Balanced Accuracy : 0.8232                 
##                                                  
##        'Positive' Class : Neutral or Dissatisfied
## 
# Confusion matrix
conf_matrix <- table(Predicted = pred_classes, Actual = actual_classes)


# Extract values (adjust levels if needed)
TP <- conf_matrix["Satisfied", "Satisfied"]
FP <- conf_matrix["Satisfied", "Neutral or Dissatisfied"]
FN <- conf_matrix["Neutral or Dissatisfied", "Satisfied"]
TN <- conf_matrix["Neutral or Dissatisfied", "Neutral or Dissatisfied"]

# Additional Metrics to evaluate
# accuracy <- (TP + TN) / sum(conf_matrix) #to test if our calculations are correct
precision <- TP / (TP + FP)
recall <- TP / (TP + FN)
f1_score <- 2 * (precision * recall) / (precision + recall)

# Print
# cat("Accuracy:", round(accuracy, 3), "\n")
cat("Precision:", round(precision, 4), "\n")
## Precision: 0.802
cat("Recall:", round(recall, 4), "\n")
## Recall: 0.7985
cat("F1 Score:", round(f1_score, 4), "\n")
## F1 Score: 0.8002
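The manual metrics above treat "Satisfied" as the positive class, while caret reports "Neutral or Dissatisfied" as positive, which is why the recall of 0.7985 matches caret's Specificity rather than its Sensitivity. The following is a minimal sketch in base R that recomputes the metrics for either class from the counts printed in the confusion matrix above. The helper name `class_metrics` is our own illustration, not part of caret.

```r
# Counts copied from the confusion matrix printed above
# (rows = predicted class, columns = actual class).
classes <- c("Neutral or Dissatisfied", "Satisfied")
cm <- matrix(c(18648, 3345,    # actual: Neutral or Dissatisfied
               3420, 13551),   # actual: Satisfied
             nrow = 2,
             dimnames = list(Predicted = classes, Actual = classes))

# Hypothetical helper: precision, recall and F1 for a chosen positive class.
class_metrics <- function(cm, positive) {
  tp <- cm[positive, positive]
  fp <- sum(cm[positive, ]) - tp   # predicted positive, actually negative
  fn <- sum(cm[, positive]) - tp   # actually positive, predicted negative
  precision <- tp / (tp + fp)
  recall <- tp / (tp + fn)
  c(precision = precision,
    recall = recall,
    f1 = 2 * precision * recall / (precision + recall))
}

m_sat <- class_metrics(cm, "Satisfied")
m_neu <- class_metrics(cm, "Neutral or Dissatisfied")
print(round(rbind(Satisfied = m_sat, `Neutral or Dissatisfied` = m_neu), 4))
```

With "Satisfied" as positive, this reproduces the precision, recall and F1 values printed above. With "Neutral or Dissatisfied" as positive, precision and recall correspond to caret's Pos Pred Value and Sensitivity.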

5. Comparison of the models

Qualitative Assessment

Interpretability and Transparency

The decision tree model provides superior interpretability because its visualization shows the decision-making process behind each satisfaction prediction. The branches make it simple to track which features play the most important role. Logistic regression provides feature coefficients that show each variable's influence on the odds of satisfaction, but interaction effects and non-linear patterns remain difficult to interpret.

Ease of Communication with Stakeholders

The visual, rule-based structure of decision trees (“If-Then” logic) gives them a slight advantage over logistic regression in ease of communication. This structure makes them well suited to presenting results to business stakeholders who lack technical backgrounds. The estimates and coefficients of logistic regression, by contrast, require additional explanation.

Robustness to Data Noise and Outliers

Regularized logistic regression is more stable against small data fluctuations. Decision trees are sensitive to minor data variations: small changes in the data can produce a different tree structure, which reduces reliability unless pruning or ensemble methods such as Random Forest are applied.

Feature Interaction Handling

The decision tree's split-based mechanism detects variable interactions naturally during modeling. Logistic regression does not detect them automatically; interaction terms must be specified manually, which restricts its ability to model complex relationships.
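To illustrate what manual specification of an interaction term looks like, here is a minimal sketch on simulated data. The variable names `seat_comfort` and `long_haul` and all numeric values are hypothetical and are not taken from the report's data set or fitted models.

```r
set.seed(42)  # make the simulated data reproducible

# Simulated (hypothetical) data: seat comfort matters more on long-haul flights.
n <- 500
seat_comfort <- sample(1:5, n, replace = TRUE)
long_haul <- rbinom(n, 1, 0.5)
log_odds <- -2 + 0.3 * seat_comfort - 1.5 * long_haul +
  0.5 * seat_comfort * long_haul
satisfied <- rbinom(n, 1, plogis(log_odds))
sim <- data.frame(satisfied, seat_comfort, long_haul)

# Main effects only: the model cannot represent the interaction.
fit_main <- glm(satisfied ~ seat_comfort + long_haul,
                family = binomial, data = sim)

# `a * b` expands to a + b + a:b, adding an explicit interaction term.
fit_int <- glm(satisfied ~ seat_comfort * long_haul,
               family = binomial, data = sim)

print(names(coef(fit_int)))
```

The second model gains a `seat_comfort:long_haul` coefficient only because the formula asks for it, whereas a tree could split on one variable inside a branch of the other without being told to.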

Overfitting Tendencies

Decision trees tend to overfit when they grow deep without proper constraints. Logistic regression is more resistant to overfitting, particularly when regularization is applied. For our dataset, the risk of overfitting a decision tree is significant unless proper controls are implemented, because some features are correlated or noisy, such as the subjective service satisfaction ratings.

Quantitative Assessment

Accuracy:

Precision:

Recall:

F1 Score:

The decision tree consistently performs better across all basic metrics, even if the differences are small. It strikes a better balance between avoiding false positives (precision) and missing satisfied customers (recall).

Kappa (κ): Cohen's kappa measures the agreement between predicted and actual classes beyond what chance alone would produce. It helps confirm that the tree's advantage reflects genuinely better learning of the data rather than luck.

Since both models are far above the No Information Rate and the class distribution is not heavily skewed, Kappa gives additional confidence that the tree is capturing real patterns rather than chance agreement.
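As a worked example of how Kappa corrects accuracy for chance agreement, the sketch below recomputes it in base R from the logistic regression confusion matrix printed earlier in this report. The variable names are our own; caret computes the same statistic internally.

```r
# Counts copied from the logistic regression confusion matrix
# (rows = predicted class, columns = actual class).
classes <- c("Neutral or Dissatisfied", "Satisfied")
cm <- matrix(c(18648, 3345,    # actual: Neutral or Dissatisfied
               3420, 13551),   # actual: Satisfied
             nrow = 2,
             dimnames = list(Predicted = classes, Actual = classes))

n <- sum(cm)

# Observed agreement: share of cases on the diagonal (this is the accuracy).
observed_agreement <- sum(diag(cm)) / n

# Expected agreement if predictions were independent of the actual class,
# given the same row and column totals.
expected_agreement <- sum(rowSums(cm) * colSums(cm)) / n^2

# Cohen's kappa: agreement achieved beyond chance, as a share of the maximum.
cohen_kappa <- (observed_agreement - expected_agreement) /
  (1 - expected_agreement)

cat("Accuracy:", round(observed_agreement, 4), "\n")
cat("Kappa:", round(cohen_kappa, 4), "\n")
```

The result agrees with the Kappa of 0.6467 in the caret output: roughly half of the agreement would be expected by chance, and the model recovers about 65% of the remaining margin.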

Decision Tree wins both technically and practically. It’s especially useful if you want interpretable, slightly more accurate, and balanced predictions. Logistic Regression isn’t far behind and could be useful as a baseline or second-opinion model.

Note about random forests: Although accuracy is almost identical between the decision tree and the random forest, the forest is likely more stable and less prone to overfitting, as its recall and F1 score suggest. Its strength lies in combining multiple decision trees to reduce variance, which helps it generalize better to unseen data. The performance gain is incremental, but the forest offers more robust and reliable results.

Note about individual satisfaction scores and understanding logistic regression results:

The estimates (coefficients) in logistic regression show the magnitude and direction of each feature's effect on the log-odds, and therefore on the probability, of passenger satisfaction. Low p-values (typically below 0.05) indicate strong evidence that the feature has a real effect on satisfaction. When interpreting our results, we focused on the variables with both large absolute estimates and significant p-values, because these are the key attributes airlines should focus on to strategically enhance passenger satisfaction.
Among the significant variables, in-flight Wi-Fi service, online boarding, legroom, and entertainment had strong positive coefficients, indicating that they are important levers for improving satisfaction.
Meanwhile, flight time convenience and personal travel reasons had negative coefficients, indicating that they are associated with lower satisfaction.
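Because the coefficients are on the log-odds scale, exponentiating them gives odds ratios, which are often easier to communicate. The sketch below uses two hypothetical coefficient values chosen only for illustration; they are not the fitted estimates from this report. For a fitted `glm` object, the same conversion would be `exp(coef(fit))`, where `fit` is a placeholder name.

```r
# Hypothetical coefficients on the log-odds scale (NOT the report's estimates).
beta <- c(example_positive = 0.5, example_negative = -0.7)

# Odds ratio: multiplicative change in the odds of satisfaction
# for a one-unit increase in the feature, other features held fixed.
odds_ratio <- exp(beta)

# Effect on probability for a passenger who starts at 50% (odds of 1):
# convert to log-odds, add the coefficient, convert back.
p_start <- 0.5
p_after <- plogis(qlogis(p_start) + beta)

print(round(odds_ratio, 3))
print(round(p_after, 3))
```

A positive coefficient yields an odds ratio above 1 and raises the probability; a negative one yields an odds ratio below 1 and lowers it. The size of the probability change depends on the starting probability, which is why odds ratios, not probability differences, are constant across passengers.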

Appendix

Appendix A: Attempt to merge conceptually related variables in the feature engineering phase

# Creating new variables based on conceptual relevance
# Create a new data frame that merges the variables into the conceptually related scores that were described
dta_cls_merged <- dta_cls

# Create the three new features
dta_cls_merged$flight.experience.score <- rowMeans(dta_cls_merged[, c("on.board.service", "seat.comfort", "leg.room.service", "cleanliness")], na.rm = TRUE)

dta_cls_merged$amenities.score <- rowMeans(dta_cls_merged[, c("food.and.drink", "in.flight.service", "in.flight.wifi.service", "in.flight.entertainment")], na.rm = TRUE)

dta_cls_merged$ground.service.score <- rowMeans(dta_cls_merged[, c("check.in.service", "gate.location", "online.boarding", "ease.of.online.booking", "baggage.handling", "departure.and.arrival.time.convenience")], na.rm = TRUE)

# Remove the original variables that we merged
dta_cls_merged <- dta_cls_merged[, !names(dta_cls_merged) %in% c("on.board.service", "seat.comfort", "leg.room.service", "cleanliness", "food.and.drink", "in.flight.service", "in.flight.wifi.service", "in.flight.entertainment", "check.in.service", "gate.location", "online.boarding", "ease.of.online.booking", "baggage.handling", "departure.and.arrival.time.convenience")]

# Overview of the new data set
str(dta_cls_merged)

# PCA analysis with the merged data set (assumes all remaining columns are numeric)
PCA_analysis_merged <- prcomp(dta_cls_merged)
summary(PCA_analysis_merged)

Conclusions and Recommendations

  1. Key Drivers of Passenger Satisfaction

The analysis revealed the attributes that most strongly affect passenger satisfaction:

Objective Factors:

Service-Related Factors:

Actionable Recommendations:

  2. Predictive Models of Passenger Satisfaction

The decision tree model outperformed logistic regression in predicting passenger satisfaction, with better accuracy, precision, recall, and F1 scores. It performs better because it detects complex nonlinear patterns in the passenger data. Its high recall means airlines could proactively identify the majority of satisfied passengers, while its high precision limits incorrect assumptions about satisfaction. Both are critical for customer loyalty programs and post-flight engagement.

Actionable Recommendations: