Assignment 7

Author

Colleen Malloy

Comparison of Aer Lingus and United Airlines Reviews

Introduction

My family only flies Aer Lingus when we travel internationally, and I wonder whether we are missing out on other airlines. This report explores the sentiment and emotions expressed in reviews of Aer Lingus and United Airlines. The data was scraped from the Skytrax website, which hosts airline reviews.

Questions:

  1. Do positive and negative sentiments differ between Aer Lingus and United reviews?
  2. Does one airline have more negative emotions within the reviews than the other?
  3. Are there more positive/negative comments in certain months of the year? Or has there been a trend over the years?

Load Data from Previous Web Scraping

  • Load Packages

  • Load Data files

  • Clean up and Combine Data files into one
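The code for these three steps is hidden in this report. A minimal sketch of what they might look like is below. The two small tibbles are made-up stand-ins for the scraped files, and the column names (`review`, `date`, `airline`, `month`) are assumptions, not the names used in the actual data.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-ins for the two scraped files; the real report would
# read them from disk (for example with readr::read_csv()) instead.
aer_lingus_raw <- tibble(
  review = c("Great service from Dublin.", "Flight was delayed two hours."),
  date   = c("2023-07-14", "2023-11-02")
)
united_raw <- tibble(
  review = c("Delayed again and rude staff.", "Comfortable seats, smooth trip."),
  date   = c("2023-08-21", "2024-01-09")
)

# Tag each set with its airline, stack them into one data frame,
# and derive the posting month as an ordered factor (January..December)
reviews <- bind_rows(
  mutate(aer_lingus_raw, airline = "Aer Lingus"),
  mutate(united_raw, airline = "United")
) %>%
  mutate(
    date  = as.Date(date),
    month = factor(month.name[as.integer(format(date, "%m"))],
                   levels = month.name)
  )
```

Storing the month as a factor with `month.name` levels keeps the months in calendar order when they are plotted later, rather than alphabetical order.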

Start Reviewing Words Within Reviews

First, I looked at the most frequently used words for each airline:

# A tibble: 1,717 × 3
# Groups:   airline [1]
   airline    word         n
   <chr>      <chr>    <int>
 1 Aer Lingus flight     127
 2 Aer Lingus aer         94
 3 Aer Lingus lingus      93
 4 Aer Lingus verified    70
 5 Aer Lingus dublin      60
 6 Aer Lingus service     55
 7 Aer Lingus trip        49
 8 Aer Lingus airline     48
 9 Aer Lingus airport     42
10 Aer Lingus told        40
# ℹ 1,707 more rows
# A tibble: 1,682 × 3
# Groups:   airline [1]
   airline word         n
   <chr>   <chr>    <int>
 1 United  flight     220
 2 United  united     126
 3 United  verified    70
 4 United  told        53
 5 United  time        52
 6 United  hours       51
 7 United  trip        51
 8 United  airline     47
 9 United  service     41
10 United  airport     39
# ℹ 1,672 more rows

The most frequent words largely overlap, which makes sense because words like “flight” and “trip” come up when talking about any airline or flight.
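The word counts above come from a tokenize-and-count step whose code is not shown. A minimal sketch of that kind of step is below, using made-up reviews. Dropping the airline names and “verified” with a custom stop-word list (`airline_words`, a hypothetical helper) is an optional extra that the tables above do not apply.

```r
library(dplyr)
library(tibble)
library(tidytext)

# Toy reviews standing in for the scraped data (hypothetical text)
reviews <- tibble(
  airline = c("Aer Lingus", "United", "United"),
  review  = c("Aer Lingus flight to Dublin, smooth flight",
              "United flight delayed for hours",
              "Delayed flight, luggage lost")
)

# Optional: words that only identify the airline or the review site
airline_words <- tibble(word = c("aer", "lingus", "united", "verified"))

top_words <- reviews %>%
  unnest_tokens(word, review) %>%          # one lowercase word per row
  anti_join(stop_words, by = "word") %>%   # drop common English stop words
  anti_join(airline_words, by = "word") %>%
  count(airline, word, sort = TRUE) %>%    # word frequency per airline
  group_by(airline) %>%
  slice_max(n, n = 10) %>%                 # top 10 per airline (ties kept)
  ungroup()
```

With the airline names filtered out, the words that remain are the ones that actually describe the trip, which makes the two lists easier to compare.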

I decided to go deeper: first find the sentiment scores for each airline, then find the negative and positive words within each airline’s reviews.

Sentiment Scores using NRC Lexicon:

# Load in the NRC lexicon
# Make sure to follow the prompts in the console!
nrc <- get_sentiments("nrc")

# Visualize emotional sentiment counts for each airline
airlines %>% 
  inner_join(nrc, by = "word", relationship = "many-to-many") %>% 
  group_by(sentiment, airline) %>% 
  summarize(n = n()) %>% 
  ggplot(aes(x = sentiment, y = n, fill = airline)) +
  geom_bar(stat = "identity", position = "dodge") +
  scale_fill_manual(values = c("lightgreen","darkblue")) +
  labs(title = "Airline Sentiment Scores",
       subtitle = "Total number of emotive words scored ",
       y = "Total Number of Words",
       x = "Emotional Sentiment",
       fill = "Airline")

There is not much to compare in this visual. However, I think it suggests the answer to question 1 is no: the positive and negative sentiment counts for the two airlines are close. I think we should dive deeper by seeing which words are used most frequently for each airline.

Positive and Negative Words:

# A tibble: 4 × 3
# Groups:   airline [2]
  airline    sentiment     n
  <chr>      <chr>     <int>
1 Aer Lingus negative    194
2 United     negative    178
3 Aer Lingus positive    106
4 United     positive     87

I chose to include words used 5 or more times because I scraped 70 reviews for each airline. The fact that “delayed” was included in 37 of these 70 reviews for United Airlines is interesting, and it makes me think my family’s choice of airline is a good one and that we will not be switching over to United anytime soon.
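The code behind the positive and negative word counts is not shown. A minimal sketch of this kind of count with a minimum-use filter is below. It assumes the Bing lexicon from `tidytext::get_sentiments("bing")`, which may not be the lexicon used in this report, and it runs on made-up reviews. The threshold variable `min_uses` is a hypothetical name.

```r
library(dplyr)
library(tibble)
library(tidytext)

# Toy reviews standing in for the scraped data (hypothetical text)
reviews <- tibble(
  airline = c("Aer Lingus", "Aer Lingus", "United", "United"),
  review  = c("Friendly crew and great food",
              "Great seats but terrible coffee",
              "Rude staff and a terrible gate agent",
              "Terrible service, rude response")
)

bing <- get_sentiments("bing")  # columns: word, sentiment

# Count each positive or negative word per airline
sentiment_words <- reviews %>%
  unnest_tokens(word, review) %>%
  inner_join(bing, by = "word") %>%
  count(airline, sentiment, word, sort = TRUE)

# Totals per airline and sentiment (same shape as the 4-row table above)
sentiment_totals <- sentiment_words %>%
  group_by(airline, sentiment) %>%
  summarise(n = sum(n), .groups = "drop")

# Keep only words at or above a minimum count. The report uses 5;
# the toy data uses 2 so that something survives the filter.
min_uses <- 2
frequent_words <- filter(sentiment_words, n >= min_uses)
```

Passing `.groups = "drop"` to `summarise()` returns an ungrouped result and avoids the grouping message that `summarise()` otherwise prints.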

Here, we can answer questions one and two…

  1. Do positive and negative sentiments differ between Aer Lingus and United reviews? Yes, as you can see, United Airlines reviews use more, and more varied, negative words than Aer Lingus reviews. I found it interesting that the word “delayed” was used 31 more times in United reviews than in Aer Lingus reviews.

  2. Does one airline have more negative emotions within the reviews than the other? Yes, I would say that United reviews have more negative emotions than Aer Lingus reviews. Let’s look into this idea a little more…

Positive and Negative Word Scores:

Here, I found positive and negative scores for each airline based on the words used in each airline’s reviews.

I also broke the scores down by month of the year, because each review records the month it was posted in, and I wanted to answer question three.

  3. Are there more positive/negative comments in certain months of the year? Or has there been a trend over the years?
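
The scoring code for the monthly plots is not shown. A minimal sketch of one way to compute positive and negative scores per airline and month is below. It again assumes the Bing lexicon and made-up reviews, and the `net` column (positive minus negative) is a hypothetical summary that does not appear in the plots.

```r
library(dplyr)
library(tibble)
library(tidyr)
library(tidytext)

# Toy reviews with posting dates (hypothetical text and dates)
reviews <- tibble(
  airline = c("Aer Lingus", "United", "United"),
  date    = as.Date(c("2023-07-14", "2023-07-20", "2023-08-03")),
  review  = c("Great crew, friendly and great food",
              "Terrible and rude",
              "Rude staff but great seats")
)

monthly_scores <- reviews %>%
  # Month as an ordered factor so plots run January..December
  mutate(month = factor(month.name[as.integer(format(date, "%m"))],
                        levels = month.name)) %>%
  unnest_tokens(word, review) %>%
  inner_join(get_sentiments("bing"), by = "word") %>%
  count(airline, month, sentiment) %>%
  # One row per airline and month, with positive and negative columns
  pivot_wider(names_from = sentiment, values_from = n, values_fill = 0) %>%
  mutate(net = positive - negative)
```

A month in which `net` is below zero has more negative than positive words, which is the comparison made between the two plots.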


These visuals are inverses of each other, but as we can see, there are noticeably more negative than positive words and emotions in these airline reviews.

United Airlines does not have a single month of the year in which its positive score is higher than its negative score.

We can see that in the summer months of July, August, and September there are many more reviews for United, and these are more negative than positive. I think this is due to the high travel period.

I think both airlines see more negative comments and emotions because most people who review airlines are complaining; not many people applaud or compliment them. As travelers, we just want to get to our destination safely and comfortably so our vacation or trip can begin. Also, many people are already crabby when they board their flight home, simply because the trip is over. They had a great time and do not want it to end, so anything going a little bit wrong during the flight can leave a bad taste in their mouth and lead them to write a negative review.

Finally, if a flight is delayed, people are stuck in an airport with nothing to do, so many have the extra time to sit down and complain about the airline. If a flight goes well, people just go on with their lives happily without expressing that happiness in a review.