# A tibble: 5,323,148 × 4
# Groups:   airline [2]
   airline  item1    item2        n
   <chr>    <chr>    <chr>    <dbl>
 1 American american flight    2800
 2 American american airlines  2760
 3 American airlines flight    2415
 4 American time     flight    1857
 5 American flight   service   1808
 6 American flight   airline   1610
 7 American flight   hours     1543
 8 American fly      flight    1503
 9 American american service   1473
10 American american time      1471
# ℹ 5,323,138 more rows
Assignment #7 Matthew Stewart
Intro to American Airlines vs Delta Airlines Sentiment Analysis
For this assignment, I used data sets of American Airlines and Delta Airlines customer reviews to run text and sentiment analysis and compare how customers feel about each airline.
Loading Data and Cleaning
First, I loaded the data from my OneDrive. I then cleaned the data sets by adding an airline column to each one so that I could tell which airline each review was about. Next, I removed the “Trip Verified” or “Not Verified” label that appeared at the start of every American Airlines review, using a function that also stripped the extra white space from the review text. Finally, I combined the two data sets into one data frame and assigned each review an ID.
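The cleaning steps above can be sketched in a few lines. This is a minimal illustration in Python using only the standard library, not the code behind this report: the sample reviews are made up, the `|` separator after the verification label is an assumption about the raw format, and the names `clean_review` and `build_reviews` are hypothetical.

```python
import re

# Made-up sample reviews; the real data sets are not reproduced here.
american_raw = [
    "Trip Verified |  Flight was delayed two hours. ",
    "Not Verified | Friendly crew and a smooth flight.",
]
delta_raw = ["  Great service and an on-time departure.  "]

# Leading "Trip Verified" / "Not Verified" label, optionally followed by
# a "|" separator (the separator is an assumed detail of the raw text).
PREFIX = re.compile(r"^\s*(?:Trip Verified|Not Verified)\s*\|?\s*", re.IGNORECASE)

def clean_review(text):
    """Drop the verification label and collapse extra white space."""
    text = PREFIX.sub("", text)
    return re.sub(r"\s+", " ", text).strip()

def build_reviews(american, delta):
    """Tag each review with its airline, combine both sets, assign IDs."""
    tagged = [("American", t) for t in american] + [("Delta", t) for t in delta]
    return [
        {"id": i, "airline": airline, "text": clean_review(text)}
        for i, (airline, text) in enumerate(tagged, start=1)
    ]

reviews = build_reviews(american_raw, delta_raw)
for row in reviews:
    print(row)
```

Tagging each review before combining the sets keeps the airline label attached to every row, so later steps can group by airline without needing to know which file a review came from.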
Text Analysis
With the data ready, my next step was to unnest each review into individual words so that I could analyze every word on its own. I then paired words to find which ones appear together most often within a review.
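To make the pairing step concrete, here is a small sketch of one way to count word pairs: for each airline, count how many reviews contain both words of a pair. It is written in Python with the standard library only and is an illustration, not the code behind this report. The stop-word list is a tiny stand-in (whether and how stop words were removed is an assumption), the reviews are made up, and `tokenize` and `pair_counts` are hypothetical names.

```python
import re
from collections import Counter
from itertools import combinations

# Tiny illustrative stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "was", "is", "to", "of", "on", "my", "in"}

def tokenize(text):
    """Lowercase the text and split it into word tokens, minus stop words."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]

def pair_counts(reviews):
    """Count, per airline, how many reviews contain each pair of words."""
    counts = {}
    for airline, text in reviews:
        words = sorted(set(tokenize(text)))  # each word counted once per review
        counter = counts.setdefault(airline, Counter())
        counter.update(combinations(words, 2))  # unordered pairs, alphabetical
    return counts

# Made-up toy reviews, not taken from the real data.
toy = [
    ("American", "The flight was late and the service was slow."),
    ("American", "Flight service on American was fine."),
    ("Delta", "Delta flight was on time."),
]
pairs = pair_counts(toy)
print(pairs["American"].most_common(3))
```

Because every pair of distinct words in a review is counted, the number of pairs grows roughly with the square of the vocabulary, which is why a pair table can run to millions of rows.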
Sentiment Analysis
To begin my sentiment analysis, I used the Bing sentiment lexicon to find which airline drew more positive or negative sentiment. Comparing the net sentiment (positive words minus negative words) of the two airlines made it clear that there was more negative sentiment toward American Airlines than toward Delta.
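A net sentiment score of this kind can be sketched as the number of positive words minus the number of negative words per airline. The Python example below uses only the standard library and is an illustration under stated assumptions: `TOY_LEXICON` is a hand-made stand-in for the Bing lexicon (the real one has thousands of entries), the reviews are made up, and `net_sentiment` is a hypothetical name.

```python
import re
from collections import Counter

# Hand-made stand-in for a positive/negative lexicon such as Bing.
TOY_LEXICON = {
    "friendly": "positive", "comfortable": "positive", "great": "positive",
    "rude": "negative", "delayed": "negative", "lost": "negative",
}

def net_sentiment(reviews, lexicon=TOY_LEXICON):
    """Return {airline: positive word count minus negative word count}."""
    tallies = {}
    for airline, text in reviews:
        tally = tallies.setdefault(airline, Counter())
        for word in re.findall(r"[a-z']+", text.lower()):
            if word in lexicon:
                tally[lexicon[word]] += 1
    return {a: t["positive"] - t["negative"] for a, t in tallies.items()}

# Made-up toy reviews; the scores they produce say nothing about the real data.
toy = [
    ("American", "Rude staff, delayed flight, lost bag, but a comfortable seat."),
    ("Delta", "Friendly crew and a great flight, though it was delayed."),
]
net = net_sentiment(toy)
print(net)
```

Words that are missing from the lexicon contribute nothing to the score, so the result depends heavily on which lexicon is used.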
For my second graph, I used the NRC lexicon to see which emotions each airline evoked in reviewers. Even though my American Airlines data set had more reviews than the Delta one, American showed about the same level of positive emotion as Delta and slightly more negative emotion. American also had more anticipation and trust words in its reviews.
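Because the two data sets differ in size, one way to put the emotion counts on a common footing is to divide them by the number of reviews per airline. The sketch below shows that idea in Python with the standard library only. It is not the calculation behind the graph: `TOY_EMOTIONS` is a hand-made stand-in for the NRC lexicon, the reviews are made up, and `emotion_rates` is a hypothetical name.

```python
import re
from collections import Counter

# Hand-made stand-in for an emotion lexicon such as NRC, where one word
# can carry several labels. These entries are illustrative only.
TOY_EMOTIONS = {
    "delay": {"negative", "anticipation"},
    "helpful": {"positive", "trust"},
    "wait": {"anticipation"},
    "angry": {"negative", "anger"},
}

def emotion_rates(reviews, lexicon=TOY_EMOTIONS):
    """Return {airline: {emotion: matching words per review}}."""
    counts, n_reviews = {}, Counter()
    for airline, text in reviews:
        n_reviews[airline] += 1
        tally = counts.setdefault(airline, Counter())
        for word in re.findall(r"[a-z']+", text.lower()):
            tally.update(lexicon.get(word, ()))  # one count per emotion label
    return {
        a: {emotion: n / n_reviews[a] for emotion, n in tally.items()}
        for a, tally in counts.items()
    }

# Made-up toy reviews, not taken from the real data.
toy = [
    ("American", "Long delay and a long wait."),
    ("American", "Helpful agent after the delay."),
    ("Delta", "Helpful crew."),
]
rates = emotion_rates(toy)
print(rates)
```

A per-review rate lets an airline with more reviews be compared directly with one that has fewer, which raw counts do not allow.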
For the last sentiment analysis, I grouped words such as legroom and seat, service and staff, and delay and time into “Comfort”, “Service”, and “Timeliness” categories, respectively, to see how people felt about each airline on each topic. I found that American Airlines had a significantly lower sentiment for service and a slightly lower sentiment for timeliness than Delta.
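The category grouping can be sketched as a keyword lookup followed by an average. The Python example below uses only the standard library and rests on several assumptions that are not confirmed by the report: which keyword belongs to which category, the rule that a review counts toward a category when it mentions any of that category's keywords, and the use of an average net score. `TOY_SCORES` is a hand-made stand-in for a sentiment lexicon, the reviews are made up, and `category_sentiment` is a hypothetical name.

```python
import re
from collections import defaultdict

# Assumed keyword-to-category mapping, based on the example words above.
CATEGORIES = {
    "legroom": "Comfort", "seat": "Comfort",
    "service": "Service", "staff": "Service",
    "delay": "Timeliness", "time": "Timeliness",
}
# Hand-made stand-in for a positive/negative lexicon (illustrative only).
TOY_SCORES = {"great": 1, "friendly": 1, "rude": -1, "cramped": -1, "long": -1}

def category_sentiment(reviews):
    """Average net review score per (airline, category).

    Assumption: a review counts toward a category when it mentions any of
    that category's keywords; the original grouping rule may differ.
    """
    scores = defaultdict(list)
    for airline, text in reviews:
        words = re.findall(r"[a-z']+", text.lower())
        net = sum(TOY_SCORES.get(w, 0) for w in words)
        for category in {CATEGORIES[w] for w in words if w in CATEGORIES}:
            scores[(airline, category)].append(net)
    return {key: sum(v) / len(v) for key, v in scores.items()}

# Made-up toy reviews; the scores they produce say nothing about the real data.
toy = [
    ("American", "Rude staff and a cramped seat."),
    ("American", "Great service."),
    ("Delta", "Friendly staff, but a long delay."),
]
result = category_sentiment(toy)
print(result)
```

A review that mentions keywords from two categories contributes its score to both, so the categories are not mutually exclusive under this rule.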