This is another mini project I carried out not only to display my data mining and visualization skills but also to show my ability to gather and manipulate data from multiple sources. Because of the limits the Twitter API places on the number of tweets pulled and on the date range, I was able to extract 1340 tweets generated within the 7 days preceding the pull. I performed this analysis using the R programming language.
The goal of this report is to identify issues that American Airlines (AA) customers experience. These issues will be identified using a sentiment scoring analysis on tweets from AA customers. The sentiment scoring analysis will highlight negative and positive customer sentiments. The results are displayed using a word cloud of frequently used words/phrases and scatter plots to show customer sentiments.
Excluding the neutral sentiments (neither negative nor positive), negative AA customer sentiments accounted for ~63% of the tweets pulled for this analysis, while positive AA customer sentiments accounted for ~37%.
The most frequent negative sentiments resulted from flight delays and cancellations, baggage claim issues, and customer service/call center issues. Let’s not discount the few good sentiments, but when negative reports far outweigh the positive ones, it is imperative that improvements be made to correct the underlying causes of poor customer experience.
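As a rough illustration of how shares like these can be computed, here is a minimal base R sketch. The `scores` vector is made up for the example and is not the data from this report; it only assumes that each tweet has a numeric sentiment score where 0 means neutral.

```r
# Hypothetical sentiment scores, one per tweet (not the tweets from this report).
scores <- c(-2, -1, 0, 1, -3, 0, 2, -1)

# Drop neutral tweets (score of exactly 0) before computing shares.
non_neutral <- scores[scores != 0]

# Percentage of the remaining tweets that are negative and positive.
negative_share <- 100 * mean(non_neutral < 0)
positive_share <- 100 * mean(non_neutral > 0)

round(c(negative = negative_share, positive = positive_share), 1)
```

Because neutral tweets are removed first, the two shares always sum to 100%.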
Based on this analysis and the assumption that the AA customer experiences retrieved from the tweets pulled in this report are to an extent representative of the sentiments AA customers feel, I recommend the implementation of the following:
AA should assign more personnel who are knowledgeable about current events to attend to customer needs and concerns and to assist customers better.
AA should investigate baggage claim issues to reduce lost-baggage complaints.
Note: Due to the limit on the number of tweet pulls imposed by the Twitter API, the tweet data in this analysis is not representative of all AA customer experiences.
Shown below are the steps I took to carry out this analysis:
Let’s visualize the words for the most frequently tweeted complaints and praises.
Note: The size of the text corresponds to the number of times that text appeared in the sample of tweets pulled. I.e., the larger the text, the higher the frequency.
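The frequencies behind a word cloud can be sketched in base R as follows. This is only an illustration under assumptions: the `tweets` vector is hypothetical, and the tweets are assumed to be already cleaned and lower-cased.

```r
# Hypothetical cleaned tweets, for illustration only.
tweets <- c("flight delay again", "delay at gate", "great flight crew")

# Split each tweet on whitespace and pool all the words together.
words <- unlist(strsplit(tweets, "[[:space:]]+"))

# Count occurrences of each word and sort from most to least frequent.
word_freq <- sort(table(words), decreasing = TRUE)

head(word_freq, 2)
```

A plotting function such as `wordcloud()` from the wordcloud package can then take `names(word_freq)` as the words and the counts as the frequencies, so that more frequent words are drawn larger.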
The word cloud above doesn’t do a good job of showing what AA customers are really trying to say, so let’s dive into a sentiment analysis to get a better understanding of the tweets.
The scatter plot below clearly shows that there are more negative sentiments than positive ones.
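One common way to score sentiment is to count the positive and negative words in each tweet and take the difference. The sketch below assumes that approach; the tiny word lists and the `score_tweet()` helper are hypothetical and are not necessarily the lexicon or function used for this report.

```r
# Tiny hypothetical lexicons; a real analysis would use a full opinion lexicon.
positive_words <- c("good", "great", "love", "excellent")
negative_words <- c("worst", "delay", "cancelled", "dirty")

# Score = (number of positive words) - (number of negative words).
score_tweet <- function(tweet, pos, neg) {
  words <- unlist(strsplit(tolower(tweet), "[[:space:]]+"))
  sum(words %in% pos) - sum(words %in% neg)
}

# Hypothetical tweets: one positive, one negative, one neutral.
tweets <- c("love the great crew", "worst delay ever", "boarding now")
scores <- vapply(tweets, score_tweet, numeric(1),
                 pos = positive_words, neg = negative_words,
                 USE.NAMES = FALSE)
scores
```

With these toy inputs the scores are 2, -2 and 0. A score above 0 is treated as positive, a score below 0 as negative, and 0 as neutral.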
Displayed below are word clouds of the most frequent positive and negative sentiments.
That’s a much better representation of customer sentiments than the first word cloud visualization. Here we can see that some people actually do appreciate AA’s service.
To give you an idea of what AA customers have actually tweeted, I have created a table of the top 5 positive tweets and the top 5 negative tweets. These tables are shown below.
Top 5 positive tweets:
## text
## 1 london heathrow flagship lounge relaxing gr staff good food nice respite
## 2 lessons jetairways competitors compassion empathy basic good customer service
## 3 love a aircraft free movies the pitch seats dramatically improved
## 4 a beautiful plane i flying work continents i love people finding solutions
## 5 every time i fly i pleasant experience grateful helpfulness excellent customer service
Top 5 negative tweets:
## text
## 1 no worst airline dirty filthy dirty plane hour runway pilot didnt calculate winds disaster
## 2 unhelpful cancelled flight refuses accommodate refunding cancelled leg flight forcing lengthy reroute
## 3 bad hrs delay plane due maintenance sick upset aa
## 4 guys fucking worst hour delay denver i mechanical problems
## 5 another chaotic lgayul trip disorganized grumpy gate agents dumpy terminal tired plane thx
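Tables like the two above can be produced by sorting the scored tweets and taking the first few rows from each side. The following is a minimal sketch that assumes a data frame with `text` and `score` columns; the `scored` data frame is invented for the example.

```r
# Hypothetical scored tweets, for illustration only.
scored <- data.frame(
  text  = c("great crew", "worst delay", "ok flight", "love it", "dirty plane"),
  score = c(2, -3, 0, 1, -1),
  stringsAsFactors = FALSE
)

# Separate positive and negative tweets; neutral ones (score 0) are left out.
positive <- scored[scored$score > 0, ]
negative <- scored[scored$score < 0, ]

# Highest scores first for the positive table, lowest first for the negative one.
top_positive <- head(positive[order(positive$score, decreasing = TRUE),
                              "text", drop = FALSE], 5)
top_negative <- head(negative[order(negative$score),
                              "text", drop = FALSE], 5)

top_positive
top_negative
```

`drop = FALSE` keeps each result as a one-column data frame, which is why the printed tables carry a `text` header.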
Note: As a result of data cleaning and manipulation, the tweets have lost some of their structure but still contain the main information.
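To show why cleaning strips structure from a tweet, here is a small base R sketch of typical cleaning steps. The `clean_tweet()` helper and the raw tweet are hypothetical, and the exact steps used for this report may differ (for example, the tweets shown above also appear to have had common stop words removed, which this sketch does not do).

```r
# Hypothetical raw tweet, for illustration only.
raw <- "@AmericanAir Worst delay EVER!! 3 hrs at DFW https://t.co/abc123 #fail"

clean_tweet <- function(x) {
  x <- gsub("http[^[:space:]]*", " ", x)       # drop URLs
  x <- gsub("[@#][^[:space:]]+", " ", x)       # drop mentions and hashtags
  x <- gsub("[^[:alpha:][:space:]]", " ", x)   # drop digits and punctuation
  x <- tolower(x)                              # normalise case
  x <- gsub("[[:space:]]+", " ", x)            # collapse repeated whitespace
  trimws(x)                                    # trim leading/trailing spaces
}

clean_tweet(raw)
```

For this input the result is "worst delay ever hrs at dfw": the mention, link, hashtag, digits and punctuation are gone, but the complaint itself survives.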
email: nky@utexas.edu