Assignment #7 Matthew Stewart

Intro to American Airlines vs Delta Airlines Sentiment Analysis

For this assignment, I use data sets of American Airlines and Delta Airlines reviews to run text and sentiment analysis and see how customers feel about each airline.

Loading Data and Cleaning

First, I loaded the data from my OneDrive. To clean the data sets, I added an airline column to each one so that I could tell which airline each review was about. Next, I removed the “Trip Verified” or “Not Verified” tag at the start of every American Airlines review, using a function that strips the tag along with any extra white space. Finally, I combined the two data sets into one data frame and assigned each review an ID.
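The cleaning steps above can be sketched roughly as follows. This is a minimal Python illustration, not the code used for the assignment (the pasted output suggests the analysis itself was done in R). The sample reviews, the function names, and the "|" separator after the verification tag are all assumptions made for the example.

```python
import re

# Hypothetical sample rows; the real data sets are not shown in this report.
american_raw = [
    "Trip Verified | Flight was delayed three hours.",
    "Not Verified |   Friendly crew and a smooth flight. ",
]
delta_raw = ["Comfortable seat and great service."]

# Matches a leading "Trip Verified" or "Not Verified" tag. The optional "|"
# separator is an assumption about how the tag is delimited.
PREFIX = re.compile(r"^\s*(?:Trip Verified|Not Verified)\s*\|?\s*")


def clean_review(text: str) -> str:
    """Drop the verification tag and trim surrounding white space."""
    return PREFIX.sub("", text).strip()


def build_reviews(american, delta):
    """Label each review with its airline, combine, and assign sequential IDs."""
    # Cleaning Delta reviews too is harmless: the pattern only matches the tag.
    labeled = [("American", clean_review(t)) for t in american]
    labeled += [("Delta", clean_review(t)) for t in delta]
    return [
        {"id": i, "airline": airline, "text": text}
        for i, (airline, text) in enumerate(labeled, start=1)
    ]


reviews = build_reviews(american_raw, delta_raw)
```

Adding the airline label before combining keeps every review traceable to its source once the two data sets share one structure.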

Text Analysis

Now that the data was ready, my next step was to unnest each review into individual words so that I could analyze every word on its own. Then I paired words together to find which words appear together most often in a review. The first rows of the resulting pair counts are shown below.

# A tibble: 5,323,148 × 4
# Groups:   airline [2]
   airline  item1    item2        n
   <chr>    <chr>    <chr>    <dbl>
 1 American american flight    2800
 2 American american airlines  2760
 3 American airlines flight    2415
 4 American time     flight    1857
 5 American flight   service   1808
 6 American flight   airline   1610
 7 American flight   hours     1543
 8 American fly      flight    1503
 9 American american service   1473
10 American american time      1471
# ℹ 5,323,138 more rows
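As a rough illustration of how pair counts like these can be produced, the sketch below tokenizes each review and counts, per airline, how many reviews contain both words of a pair. It is a Python sketch with a hypothetical three-review data set, not the code behind the table above, and it assumes a pair is counted once per review regardless of how often each word repeats.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical mini data set standing in for the combined review data frame.
reviews = [
    {"id": 1, "airline": "American", "text": "Flight delayed, flight crew rude"},
    {"id": 2, "airline": "American", "text": "Delayed flight and rude service"},
    {"id": 3, "airline": "Delta", "text": "Great service on this flight"},
]


def tokenize(text):
    """Lowercase the text and split it into runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def pair_counts(rows):
    """Count, per airline, the number of reviews in which two words both appear."""
    counts = Counter()
    for row in rows:
        # A set keeps each word once per review, so a pair counts once per review.
        words = sorted(set(tokenize(row["text"])))
        for w1, w2 in combinations(words, 2):
            counts[(row["airline"], w1, w2)] += 1
    return counts


counts = pair_counts(reviews)
top_pairs = counts.most_common(3)  # most frequent (airline, word1, word2) keys
```

This version stores each pair once in alphabetical order. A table that lists both orderings of every pair would have twice as many rows.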

Sentiment Analysis

To begin my sentiment analysis, I used the Bing sentiment lexicon to find which airline had more positive or negative sentiment in its reviews. After comparing the net sentiment of the two airlines, it was clear that reviews of American Airlines carried more negative sentiment than reviews of Delta.
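One common way to compute a net sentiment with a positive/negative lexicon such as Bing is to subtract the number of negative words from the number of positive words. The Python sketch below assumes that definition, which may differ from the exact calculation used here. The lexicon entries and reviews are toy stand-ins, not the real Bing lexicon or the assignment's data.

```python
import re
from collections import Counter

# Tiny stand-in lexicon. The real Bing lexicon labels several thousand words
# as positive or negative; these entries are illustrative only.
TOY_LEXICON = {
    "great": "positive", "friendly": "positive", "comfortable": "positive",
    "rude": "negative", "delayed": "negative", "lost": "negative",
}

# Hypothetical reviews, not taken from the assignment's data.
reviews = [
    ("American", "Rude staff and a delayed flight, but a comfortable seat"),
    ("American", "They lost my bag"),
    ("Delta", "Great crew, friendly service, delayed departure"),
]


def net_sentiment(rows, lexicon):
    """Return positive-minus-negative word counts per airline."""
    tallies = {}
    for airline, text in rows:
        tally = tallies.setdefault(airline, Counter())
        for word in re.findall(r"[a-z']+", text.lower()):
            label = lexicon.get(word)
            if label:  # words missing from the lexicon are ignored
                tally[label] += 1
    return {a: t["positive"] - t["negative"] for a, t in tallies.items()}


net = net_sentiment(reviews, TOY_LEXICON)
```

With these toy inputs, American has one positive and three negative matches and Delta has two positive and one negative, so the net scores come out negative for American and positive for Delta.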

For my second graph, I used the NRC lexicon to see which emotions reviewers expressed about each airline. Even though my American Airlines data set had more reviews than the Delta one, it is fair to say that American had roughly the same amount of positive emotion as Delta and slightly more negative emotion. American also had more anticipation and trust words in its reviews.
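Because the two data sets differ in size, one option is to compare each emotion's share of all matched emotion labels instead of raw counts. The Python sketch below shows that idea under stated assumptions: the word-to-emotion map is a toy stand-in for the NRC lexicon, the tokenized reviews are hypothetical, and normalizing by share is a suggestion, not necessarily what the graph in this report plots.

```python
from collections import Counter

# Toy word-to-emotions map. In the NRC lexicon one word can carry several
# labels; these few entries are illustrative, not copied from the lexicon.
TOY_NRC = {
    "delay": ["anticipation", "negative"],
    "friendly": ["joy", "trust", "positive"],
    "lost": ["sadness", "negative"],
}

# Hypothetical tokenized reviews as (airline, words) pairs.
tokens = [
    ("American", ["delay", "lost", "friendly"]),
    ("American", ["delay"]),
    ("Delta", ["friendly", "delay"]),
]


def emotion_shares(rows, lexicon):
    """Return each emotion's share of all emotion labels matched per airline."""
    counts = {}
    for airline, words in rows:
        tally = counts.setdefault(airline, Counter())
        for word in words:
            # One word may add to several emotions at once.
            tally.update(lexicon.get(word, []))
    return {
        airline: {emo: n / sum(tally.values()) for emo, n in tally.items()}
        for airline, tally in counts.items()
    }


shares = emotion_shares(tokens, TOY_NRC)
```

Shares within an airline sum to 1, so an airline with more reviews does not look more emotional simply because it has more words.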

For the last sentiment analysis, I grouped words like service, staff, delay, time, legroom, and seat into “Comfort”, “Service”, and “Timeliness” categories to see how people felt about each airline in each type of review. I found that American Airlines had a significantly lower sentiment for service and a slightly lower sentiment for timeliness than Delta.
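A category breakdown of this kind can be sketched as below. This Python example is only one possible approach: it assumes each review already has a net sentiment score, assigns a review to every category whose keywords it mentions, and averages the scores per airline and category. The keyword-to-category assignments, the sample records, and the averaging choice are assumptions, not details taken from the assignment.

```python
from collections import defaultdict

# Keyword-to-category map built from the words named above; the exact
# assignment of words to categories is an assumption.
CATEGORY = {
    "service": "Service", "staff": "Service",
    "delay": "Timeliness", "time": "Timeliness",
    "legroom": "Comfort", "seat": "Comfort",
}

# Hypothetical per-review records: airline, tokens, and a precomputed net
# sentiment score (for example positive minus negative word count).
scored = [
    ("American", ["staff", "rude", "delay"], -2),
    ("American", ["seat", "comfortable"], 1),
    ("Delta", ["service", "great"], 2),
    ("Delta", ["delay", "time"], -1),
]


def category_sentiment(rows, category_map):
    """Average the review score over reviews that mention each category."""
    totals = defaultdict(list)
    for airline, words, score in rows:
        # A set ensures a review counts once per category it mentions.
        for cat in {category_map[w] for w in words if w in category_map}:
            totals[(airline, cat)].append(score)
    return {key: sum(v) / len(v) for key, v in totals.items()}


by_category = category_sentiment(scored, CATEGORY)
```

A review that mentions both staff and a delay contributes its score to both Service and Timeliness, which is a simplification worth keeping in mind when reading the category results.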