Analysis of the Twitter environment of Delta Airlines

Introduction

For our project, we chose one of the most popular American airline brands: Delta Airlines. The project aims to help Delta Airlines understand what people think about its service by analyzing information from one of the most popular social media platforms, Twitter. Our analysis includes the following steps:

  • Twitter API Calling & Data preparation
  • WordClouds for analyzing the most popular words used in the tweets
  • Sentiment Analysis using Emotional scores and Positive/Negative Sentiments
  • Topic Modelling using Bigrams
  • Follower Profile Summary elaborating the geographic distribution across the globe

We collected approximately 25,000 tweets for Delta Airlines overall (including both tweets to Delta and tweets by Delta).

https://gmishiny.shinyapps.io/DeltaInsight/

We used data from January 2021 for the Delta Airlines account. On Twitter it follows 38.8K accounts and has 1.5M followers.

Getting data (tweets) via API

To connect to the Twitter API, we created an access token with the create_token function and our Twitter application credentials. Obtaining credentials requires applying for a Twitter Developer account and explaining why and how the data will be used.

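library(rtweet)  # create_token() comes from the rtweet package

# Fill in the credentials of your own Twitter application.
my_token <- create_token(
  app = "",
  consumer_key = "",
  consumer_secret = "",
  access_token = "",
  access_secret = "",
  set_renv = FALSE
)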

After that, we saved the retrieved tweets as objects so that we would not exhaust our usage limit. This produced 22 data files, which were then merged.

Because we wanted to work with more data than can be extracted with one account, we used 6 accounts.

In the end, this gave us 23,926 tweets from clients and 1,000 from Delta to analyze.
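The save-and-merge step can be sketched as follows. This is a minimal base-R illustration rather than the project's actual script: the file names, the toy data frames and the deduplication on status_id are all assumptions made for the example.

```r
# Toy stand-ins for two saved batches of tweets (hypothetical data).
batch_a <- data.frame(status_id = c("1", "2"),
                      text = c("late again", "thanks!"),
                      stringsAsFactors = FALSE)
batch_b <- data.frame(status_id = c("2", "3"),
                      text = c("thanks!", "where is my bag"),
                      stringsAsFactors = FALSE)

# Save each batch to disk so the API does not have to be queried again.
out_dir <- tempfile("tweets_")
dir.create(out_dir)
saveRDS(batch_a, file.path(out_dir, "batch_01.rds"))
saveRDS(batch_b, file.path(out_dir, "batch_02.rds"))

# Read every saved file back and stack the batches into one data frame.
files  <- list.files(out_dir, pattern = "\\.rds$", full.names = TRUE)
merged <- do.call(rbind, lapply(files, readRDS))

# Different accounts can return the same tweet, so keep one row per status_id.
merged <- merged[!duplicated(merged$status_id), ]
nrow(merged)
```

Deduplicating after the merge matters mainly when several accounts query overlapping time windows, as could happen with six accounts.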

Preparing and cleaning the data

We removed misspelled words so that they would not affect our work. This is what the cleaned table looks like.

## # A tibble: 2 x 90
##   user_id status_id created_at          screen_name text  source
##   <chr>   <chr>     <dttm>              <chr>       <chr> <chr> 
## 1 106627~ 13451575~ 2021-01-01 23:58:37 AliAkinK    "If ~ Twitt~
## 2 270589~ 13451571~ 2021-01-01 23:57:11 TomMinerCMS "   ~ Twitt~
## # ... with 84 more variables: display_text_width <dbl>,
## #   reply_to_status_id <chr>, reply_to_user_id <chr>,
## #   reply_to_screen_name <chr>, is_quote <lgl>, is_retweet <lgl>,
## #   favorite_count <int>, retweet_count <int>, quote_count <int>,
## #   reply_count <int>, hashtags <list>, symbols <list>, urls_url <list>,
## #   urls_t.co <list>, urls_expanded_url <list>, media_url <list>,
## #   media_t.co <list>, media_expanded_url <list>, media_type <list>,
## #   ext_media_url <list>, ext_media_t.co <list>, ext_media_expanded_url <list>,
## #   ext_media_type <chr>, mentions_user_id <list>, mentions_screen_name <list>,
## #   lang <chr>, quoted_status_id <chr>, quoted_text <chr>,
## #   quoted_created_at <dttm>, quoted_source <chr>, quoted_favorite_count <int>,
## #   quoted_retweet_count <int>, quoted_user_id <chr>, quoted_screen_name <chr>,
## #   quoted_name <chr>, quoted_followers_count <int>,
## #   quoted_friends_count <int>, quoted_statuses_count <int>,
## #   quoted_location <chr>, quoted_description <chr>, quoted_verified <lgl>,
## #   retweet_status_id <chr>, retweet_text <chr>, retweet_created_at <dttm>,
## #   retweet_source <chr>, retweet_favorite_count <int>,
## #   retweet_retweet_count <int>, retweet_user_id <chr>,
## #   retweet_screen_name <chr>, retweet_name <chr>,
## #   retweet_followers_count <int>, retweet_friends_count <int>,
## #   retweet_statuses_count <int>, retweet_location <chr>,
## #   retweet_description <chr>, retweet_verified <lgl>, place_url <chr>,
## #   place_name <chr>, place_full_name <chr>, place_type <chr>, country <chr>,
## #   country_code <chr>, geo_coords <list>, coords_coords <list>,
## #   bbox_coords <list>, status_url <chr>, name <chr>, location <chr>,
## #   description <chr>, url <chr>, protected <lgl>, followers_count <int>,
## #   friends_count <int>, listed_count <int>, statuses_count <int>,
## #   favourites_count <int>, account_created_at <dttm>, verified <lgl>,
## #   profile_url <chr>, profile_expanded_url <chr>, account_lang <lgl>,
## #   profile_banner_url <chr>, profile_background_url <chr>,
## #   profile_image_url <chr>

After that, we tokenized the text into words and removed stop words.
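A minimal sketch of cleaning, tokenizing and stop-word removal is shown here in base R so that it has no package dependencies. The example tweets, the tiny stop-word list and the clean_text helper are assumptions for illustration; a real analysis would use a full stop-word lexicon.

```r
# Two raw example tweets (invented for illustration).
tweets <- c("@Delta thanks for the great support! https://t.co/abc",
            "@Delta my flight was delayed and the support was awful")

# A tiny stop-word list; a real analysis would use a full lexicon.
stop_words <- c("a", "and", "for", "my", "the", "was")

# Hypothetical helper that strips noise before tokenizing.
clean_text <- function(x) {
  x <- gsub("https?://\\S+", " ", x)   # drop URLs
  x <- gsub("@\\w+", " ", x)           # drop @mentions
  x <- gsub("[^[:alpha:] ]", " ", x)   # drop digits and punctuation
  tolower(x)
}

# Split on whitespace, then discard empty strings and stop words.
tokens <- unlist(strsplit(clean_text(tweets), "\\s+"))
tokens <- tokens[tokens != "" & !(tokens %in% stop_words)]

# Word frequencies, most common first.
word_freq <- sort(table(tokens), decreasing = TRUE)
head(word_freq)
```

The resulting frequency table is the kind of input that both a bar chart of frequent words and a word cloud are built from.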

Creation of graphs, word clouds and analysis

The Most Frequent Words used for Tweets TO Delta Airlines

In our view, the first 5 words are quite obvious, as they relate to flights and to the company itself. In 6th place is the word “support”. A future analysis should look more closely at the tweets that mention it, because the word can appear in both positive and negative contexts: “great support Delta!” or “awful support, never returned money”.

The Most Frequent Words used for Tweets BY Delta Airlines

The most popular words are “confirmation” and “dm” (direct message). As shown later, many of the tweets carry negative sentiment, which might explain why Delta so often proposes a resolution via direct message.

Sentiment analysis of Replies from Delta Airlines

Negative sentiment words outnumber positive ones by more than two to one (4,781 versus 2,005).

As the bars show, the top two reasons for negative feedback are inconvenience and delays. Delays are an easily understood cause, while inconvenience should be studied more deeply to see what exactly lay behind it: delays, bad flight timing, COVID measures, etc.

The words “concern” and “concerns” have the same meaning. Merging them would double the “concern” bar and give one more topic for investigation: what are clients so concerned about?
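Merging such variants before counting could look like the sketch below. The tokens are invented, and the normalise helper uses a deliberately naive rule (strip a trailing "s"), which is an assumption for illustration only; a proper stemmer or lemmatizer handles the irregular forms this rule gets wrong.

```r
# Example tokens (invented for illustration).
tokens <- c("concern", "concerns", "delay", "delays", "concern", "inconvenience")

# Naive normalisation: strip a trailing "s" from words longer than three letters.
# Hypothetical helper; it mangles words such as "business".
normalise <- function(w) ifelse(nchar(w) > 3, sub("s$", "", w), w)

# Count the merged forms, most common first.
counts <- sort(table(normalise(tokens)), decreasing = TRUE)
counts
```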

As for positive sentiment, people are evidently happy because they feel safe; the company also issues refunds, and staff members are probably patient and sincere. The other words express degrees of happiness felt by clients.

## # A tibble: 2 x 2
##   sentiment     n
##   <chr>     <int>
## 1 negative   4781
## 2 positive   2005
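A two-row summary like the one above can be produced by joining tokens against a sentiment lexicon and counting per class. The sketch below uses base R with invented tokens and a tiny hand-made lexicon; both are assumptions, and a real analysis would rely on a published sentiment lexicon.

```r
# Tokens from replies (invented) and a tiny hand-made lexicon (an assumption).
tokens  <- data.frame(word = c("sorry", "delay", "safe", "delay",
                               "inconvenience", "trip"),
                      stringsAsFactors = FALSE)
lexicon <- data.frame(word = c("sorry", "delay", "inconvenience", "safe"),
                      sentiment = c("negative", "negative", "negative", "positive"),
                      stringsAsFactors = FALSE)

# Inner join: words missing from the lexicon (here "trip") are dropped.
scored <- merge(tokens, lexicon, by = "word")

# Count tokens per sentiment class.
sentiment_counts <- table(scored$sentiment)
sentiment_counts

# Negative-to-positive ratio for the counts reported in the table above.
ratio <- round(4781 / 2005, 2)
ratio
```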

Topic Analysis of Tweets from Top 6 Competitors

These graphs show what the clients of competitors tweet about. So far they do not offer significant insights; they show that people mostly tweet about the same topics.

Top 10 Most Liked Tweets of Delta Airlines (in January’2021)

Table with the most popular tweets.

## # A tibble: 10 x 4
##    created_at          screen_name  text                          favorite_count
##    <dttm>              <chr>        <chr>                                  <int>
##  1 2021-01-09 23:39:05 Lakers       "Next stop: H-Town \n\n#Lake~           7050
##  2 2021-01-12 05:25:49 Cleavon_MD   "KICKED OFF FLIGHT: Melody B~           4211
##  3 2021-01-31 17:59:22 AshaRangapp~ "Just a reminder that the GO~           1819
##  4 2021-01-09 19:37:15 ConservaMom~ "\U0001f6a8Fascism Takes Fli~           1421
##  5 2021-01-19 20:12:47 SilverNumbe~ "Finally @Delta changed up t~           1001
##  6 2021-01-12 19:33:23 marcusdipao~ "A militia group called for ~            834
##  7 2021-01-21 20:57:00 NYRangers    "And we’re off. ✈️\n \nThank~            602
##  8 2021-01-01 09:21:25 stewartcink  "Seems to me airlines mostly~            440
##  9 2021-01-29 02:58:20 maximum      "Cannot begin to understate ~            397
## 10 2021-01-29 02:58:20 maximum      "Cannot begin to understate ~            395
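A ranking like this can be sketched in base R by sorting on favorite_count. Rows 9 and 10 above appear to be the same tweet captured twice with different like counts, so the sketch also keeps only one row per tweet. The like counts are copied from the table; the status_id values are hypothetical.

```r
# A few rows in the shape of the table above (status_id values are hypothetical).
tweets <- data.frame(
  status_id      = c("a", "b", "c", "d", "d"),
  screen_name    = c("Lakers", "Cleavon_MD", "stewartcink", "maximum", "maximum"),
  favorite_count = c(7050, 4211, 440, 397, 395),
  stringsAsFactors = FALSE
)

# Sort by likes first, so that deduplication keeps the highest count per tweet.
tweets <- tweets[order(-tweets$favorite_count), ]
tweets <- tweets[!duplicated(tweets$status_id), ]

# Top n most liked tweets.
top_liked <- head(tweets, 3)
top_liked$screen_name
```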

Top 10 Hashtags Occurring in the Tweets for Delta Airlines (in January’2021)

Because the main news topic of January 2021 was the change of president in the USA, #Trump, the outgoing president, is the most popular hashtag.
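Counting hashtags from a list column such as hashtags can be sketched as follows. The values are invented, and lower-casing the tags so that "Trump" and "trump" are counted together is an assumption of this sketch.

```r
# List column: one character vector of hashtags per tweet
# (values invented; NA marks a tweet without hashtags).
hashtags <- list(c("Trump", "Delta"), NA, "trump", c("Travel", "Trump"))

# Flatten, ignore case and drop missing values.
tags <- tolower(unlist(hashtags))
tags <- tags[!is.na(tags)]

# Ten most frequent hashtags, most common first.
top_hashtags <- head(sort(table(tags), decreasing = TRUE), 10)
top_hashtags
```

The same pattern applies to the mentions_screen_name column when ranking mentions.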

Top 10 Mentions in the Tweets for Delta Airlines (in January’2021)

It would be interesting to understand in what context clients mention competitors: with negative or positive sentiment. Surprisingly, people also mention Coca-Cola and various telecom companies.

Analysis of “Twitter Status Frequency”

There are no strong trends in users’ tweets, but we can see that people post more on Sunday and Monday evenings.

Replies from Delta usually come 1-2 days later, but they follow the same weekly pattern.
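Posting frequency by weekday and hour can be tabulated from the created_at timestamps. The sketch below is base R with invented timestamps; treating them as UTC is an assumption, and a real analysis might convert to local time first.

```r
# Example timestamps in UTC (invented for illustration).
created_at <- as.POSIXct(c("2021-01-03 19:10:00", "2021-01-03 20:45:00",
                           "2021-01-04 18:30:00", "2021-01-06 09:00:00"),
                         tz = "UTC")

# Weekday as a number (0 = Sunday ... 6 = Saturday) to stay locale-independent.
lt <- as.POSIXlt(created_at, tz = "UTC")
day_names <- c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat")
weekday <- factor(day_names[lt$wday + 1], levels = day_names)

# Tweets per weekday, and per weekday-by-hour cell.
by_day  <- table(weekday)
by_hour <- table(weekday, hour = lt$hour)
by_day
```

Running the same tabulation on the tweets by Delta and comparing the two tables is one way to check the reply lag.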

Top 10 Twitter Users making Maximum No. of Tweets (in January’2021)

Some of the accounts are obvious: SecretFlying or GetYouRefund. They are connected to flight tickets and airlines: the first finds great deals, the second helps with refunds. Other accounts are less obvious and need further investigation: why are they in the top 10?

Top Platforms used by Users Tweeting about Delta Airlines

There is a huge gap between iPhone and Android users: Twitter for iPhone is used by 50% of the clients who tweet about Delta. If users of the website and the mobile application show the same trend, there is a strong case for developing and maintaining the iPhone app in great shape while investing somewhat less in the Android one.
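Platform shares can be computed from the source column with a frequency table. The sample values below are invented so that the iPhone share comes out at 50%; they are not the project's data.

```r
# Client names as found in the source column (sample values invented).
tweet_source <- c("Twitter for iPhone", "Twitter for iPhone", "Twitter for Android",
                  "Twitter Web App", "Twitter for iPhone", "Twitter Web App")

# Share of tweets per client, in percent, largest first.
share <- sort(round(100 * prop.table(table(tweet_source)), 1), decreasing = TRUE)
share
```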

Top Languages Spoken by Users that follow Delta Airlines

Surprisingly, the top language is not English. That information might justify adding other languages to SMM campaigns.

Top Locations of Users Tweeting about Delta Airlines

This analysis might be developed further by looking for a correlation between angry tweets and airports. For example, Los Angeles airport accumulates the most negative feedback. That might lead to negotiations with the airport to understand the cause, such as delays.

World Map showing Global Distribution of Users Tweeting about Delta Airlines

The USA has the most tweets, though there is a huge red spot in Africa and it would be interesting to investigate why. All other red dots all over the map correspond to huge airport hubs.

Sentiment Analysis (Classifying Tweets into 10 different Emotions)

Anger, Anticipation, Disgust, Fear, Joy, Sadness, Surprise, Trust, Negative and Positive.

As the graph shows, the positive and negative scores are quite close.

##                                                                                                                                                                               Delta_clean$text
## 1                                   If were not getting 2000 I expect the governmentsubsided   and  to provide us with a loaded gas card frequent flyer miles and a waived Prime subscription 
## 2                                                                      please help Luke with this Were still in the middle of a global pandemic and your flexibility would be much appreciated
## 3                                                                                                                                        Where would you go if enough people got vaccinated   
## 4                                                                                                                                                              I agree I feel very safe flying
## 5     can you cancel the refund and fix my ticket i was advice not to use the LaGuardia Airport and use the JFK airport because it closer you have my information you can step in am listening
## 6  I am force to have to cancel my flight and pick a different airport in New York all I ask for was to switch from LGA to JFK I am pissed off I am unhappy you treating basic flight this way
##   anger anticipation disgust fear joy sadness surprise trust negative positive
## 1     0            1       0    0   0       0        1     2        0        3
## 2     0            0       0    1   0       1        0     0        1        1
## 3     0            0       0    0   0       0        0     0        0        0
## 4     0            0       0    1   1       0        0     1        0        3
## 5     0            1       0    0   0       1        0     1        1        1
## 6     2            1       1    1   0       2        0     0        3        1
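To compare categories across tweets, the per-tweet scores can be summed by column. The sketch below re-enters the six score rows shown above by hand; the full analysis would sum over all tweets, so these totals describe only this sample.

```r
# The six score rows shown above, one column per emotion or sentiment.
scores <- data.frame(
  anger        = c(0, 0, 0, 0, 0, 2),
  anticipation = c(1, 0, 0, 0, 1, 1),
  disgust      = c(0, 0, 0, 0, 0, 1),
  fear         = c(0, 1, 0, 1, 0, 1),
  joy          = c(0, 0, 0, 1, 0, 0),
  sadness      = c(0, 1, 0, 0, 1, 2),
  surprise     = c(1, 0, 0, 0, 0, 0),
  trust        = c(2, 0, 0, 1, 1, 0),
  negative     = c(0, 1, 0, 0, 1, 3),
  positive     = c(3, 1, 0, 3, 1, 1)
)

# Total score per category across the sample; a bar chart would plot these.
totals <- colSums(scores)
totals
```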

Topic Analysis using Bigrams

For Tweets “TO” Delta Airlines

Data for word clouds has to be pre-processed to remove noise and meaningless words such as ‘a’, ‘the’, ‘was’, etc. Here we can see the pre-processed word cloud. An example built from unprepared data follows later.

For Tweets “BY” Delta Airlines

“Apologize for” is clearly seen as one of the leaders. No wonder most of the tweets have negative sentiment.
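Bigram counts of this kind can be sketched in base R by pairing each word with the one that follows it. The example texts and the make_bigrams helper are assumptions for illustration; this is a stand-in for a dedicated n-gram tokenizer, not the project's actual code.

```r
# Cleaned example replies (invented for illustration).
texts <- c("we apologize for the delay", "we apologize for the inconvenience")

# Hypothetical helper: build bigrams by pairing each word with its successor.
make_bigrams <- function(text) {
  words <- strsplit(text, "\\s+")[[1]]
  if (length(words) < 2) return(character(0))
  paste(head(words, -1), tail(words, -1))
}

# Bigram frequencies over all texts, most common first.
bigrams <- unlist(lapply(texts, make_bigrams))
bigram_freq <- sort(table(bigrams), decreasing = TRUE)
head(bigram_freq, 3)
```

Removing stop words before building bigrams would change the pairs, so a phrase like "apologize for" only survives if "for" is kept.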

WordCloud for “Tweets to Delta Airlines”

This is the word cloud before pre-processing. It shows little more than ‘the’, ‘a’, ‘was’, etc.

This word cloud makes sense because it was cleaned.

WordCloud for “Tweets by Delta Airlines”

Word cloud before cleaning.

Word cloud after cleaning.

Conclusion

With the tweet data we collected, we were able to summarize how users tweet about Delta Airlines. The summary included:

  • User Trend – Through our analysis, we found user-specific data like the most liked post, most popular hashtags among users, most shared links, etc

  • Twitter Status Frequency – This helped us understand the days of the month on which users are most likely to post about Delta Airlines and, in turn, when Delta Airlines tends to reply to user tweets.

  • User Profile – This included top users making maximum tweets about Delta Airlines, their most preferred platforms, top languages they spoke.

  • Sentiment Analysis – Based on the emotion scores of the tweets, together with engagement and activity by sentiment (positive / negative). We also observed that Delta Airlines replies to tweets more often when the sentiment is extremely positive or negative.

  • Topic Analysis – A detailed topic analysis using bigrams.

  • Follower Profile Summary – A world map showing the overall distribution of tweeting users across the globe and their top locations.