Introduction

In this brief document I outline what data analysis on Twitter can offer. As an example, I analyze the most recent tweets published by the official Nike account. If you hire this service, you can request the same type of analysis for any other public account, for the replies or mentions a public account receives, for a specific topic, for a location… there is a great variety of possibilities!

The report you will receive will be very similar to this one, but with the data you need!

I work in Spanish and English! Don’t hesitate to ask any questions! I am here to help you and I really enjoy my job!

Descriptive statistics

Nike has a total of about 8.27 million followers (8.26607 × 10^6) and follows 116 accounts. The account was created on 2011-11-18.

The last 3200 posted tweets were downloaded from the official Nike account. In the following graph we can see the period in which these tweets occurred and the frequency per week.

The oldest of these 3200 tweets was posted on 2018-03-19 18:30:33. The following table shows the days with the most tweets. This information can be useful if we want to track specific events that may have driven the flow of information.

day n
2018-07-27 49
2019-07-07 30
2018-06-01 27
2018-06-19 22
2018-05-14 21
2018-03-28 19
2018-03-29 19
2018-04-23 19
2018-03-26 18
2018-05-29 17
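A table like the one above can be produced by counting tweets per calendar day. The following is a minimal sketch, assuming each tweet carries a timestamp string in the format shown earlier; the sample timestamps and the function names are hypothetical and stand in for the downloaded data.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample of tweet timestamps; a real run would use the downloaded tweets.
timestamps = [
    "2018-07-27 09:15:00",
    "2018-07-27 12:40:10",
    "2018-07-27 18:05:33",
    "2018-06-01 08:00:00",
    "2018-06-01 21:30:45",
    "2018-03-19 18:30:33",
]

def tweets_per_day(stamps):
    """Count tweets for each calendar day (ISO date string -> count)."""
    days = (
        datetime.strptime(s, "%Y-%m-%d %H:%M:%S").date().isoformat()
        for s in stamps
    )
    return Counter(days)

def busiest_days(stamps, top=10):
    """Return the `top` days with the most tweets, busiest first."""
    return tweets_per_day(stamps).most_common(top)

for day, n in busiest_days(timestamps):
    print(day, n)
```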

The average number of tweets per day was 4 and the standard deviation was 4. Of these tweets, 215 were organic tweets (created by the user), 2975 were replies and 10 were retweets. From here on we work exclusively with the organic tweets.

Tweets by type
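The daily average, the standard deviation and the split by tweet type can be sketched as follows. The record fields (`day`, `is_retweet`, `reply_to`) are assumptions made for illustration and are not the real API schema. The sample standard deviation is used here, and only days with at least one tweet are counted. Both are assumptions, since the report does not say how its figures were computed.

```python
from collections import Counter
from statistics import mean, stdev

# Hypothetical records; the field names are assumptions, not the real API schema.
tweets = [
    {"day": "2018-07-27", "is_retweet": False, "reply_to": None},
    {"day": "2018-07-27", "is_retweet": False, "reply_to": "some_user"},
    {"day": "2018-07-27", "is_retweet": False, "reply_to": "other_user"},
    {"day": "2018-07-28", "is_retweet": True, "reply_to": None},
    {"day": "2018-07-29", "is_retweet": False, "reply_to": "some_user"},
    {"day": "2018-07-29", "is_retweet": False, "reply_to": None},
]

def tweet_type(tweet):
    """Classify a tweet as 'retweet', 'reply' or 'organic'."""
    if tweet["is_retweet"]:
        return "retweet"
    if tweet["reply_to"] is not None:
        return "reply"
    return "organic"

# Number of tweets on each day that has at least one tweet.
daily_counts = list(Counter(t["day"] for t in tweets).values())
type_counts = Counter(tweet_type(t) for t in tweets)

print("mean per day:", mean(daily_counts))
print("sample sd per day:", stdev(daily_counts))
print("by type:", dict(type_counts))
```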

As far as retweets and favourites are concerned, the account received a total of 2,512,431 favourites and 739,915 retweets during this time. These figures imply an average per organic tweet of about 11,685.7 favourites and 3,441.5 retweets.
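The averages can be checked with a short calculation. This sketch reads the totals above as 2,512,431 favourites and 739,915 retweets and assumes the averages are taken over the 215 organic tweets, which is consistent with the reported values.

```python
# Totals reported above; the divisor is assumed to be the 215 organic tweets.
total_favourites = 2_512_431
total_retweets = 739_915
organic_tweets = 215

avg_favourites = total_favourites / organic_tweets
avg_retweets = total_retweets / organic_tweets

print(f"favourites per tweet: {avg_favourites:.1f}")
print(f"retweets per tweet:   {avg_retweets:.1f}")
```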

The most favorited tweets were:
text favorite_count
Mamba Forever. https://t.co/wIchSUwFM2 337468
Nothing can stop what we can do together. You can’t stop sport. Because #YouCantStopUs. Join Us | https://t.co/fQUWzDVH3q https://t.co/YAig7FIL6G 319698
You can take the superhero out of her costume, but you can never take away her superpowers. #justdoit https://t.co/dDB6D9nzaD 278193
Mamba Forever. (sound on) https://t.co/B2LUIcpRCc 235289
Let’s all be part of the change. #UntilWeAllWin https://t.co/guhAG48Wbp 225786
Just a kid from Akron, building a legacy that extends far beyond the basketball court. #JustDoIt https://t.co/kTtmFQGDdi 155384
No matter what we’re up against, we are never too far down to come back. #YouCantStopUs Join Us | https://t.co/4PA8xvWFag https://t.co/VA30CC2ehg 141630
Now more than ever, we are one team. #playinside #playfortheworld https://t.co/LRLhL4FwkG 130255
What carries you can change the game. Who carries you can change the world. #JustDoIt https://t.co/1vuruYb0hj 112780
Kids from Akron don’t just dream it. They do it. #justdoit https://t.co/Cmj2caP2Mq 46100

The tweets with the most retweets are the following (they often coincide with the most favorited ones):

text retweet_count
You can take the superhero out of her costume, but you can never take away her superpowers. #justdoit https://t.co/dDB6D9nzaD 113941
Nothing can stop what we can do together. You can’t stop sport. Because #YouCantStopUs. Join Us | https://t.co/fQUWzDVH3q https://t.co/YAig7FIL6G 104534
Mamba Forever. https://t.co/wIchSUwFM2 101635
Let’s all be part of the change. #UntilWeAllWin https://t.co/guhAG48Wbp 99219
Mamba Forever. (sound on) https://t.co/B2LUIcpRCc 88607
No matter what we’re up against, we are never too far down to come back. #YouCantStopUs Join Us | https://t.co/4PA8xvWFag https://t.co/VA30CC2ehg 35081
Now more than ever, we are one team. #playinside #playfortheworld https://t.co/LRLhL4FwkG 31561
Just a kid from Akron, building a legacy that extends far beyond the basketball court. #JustDoIt https://t.co/kTtmFQGDdi 22137
Unstoppable belief. #justdoit https://t.co/9Axn3glvwz 13555
What carries you can change the game. Who carries you can change the world. #JustDoIt https://t.co/1vuruYb0hj 11968

Wordcloud

A first step toward a more sophisticated analysis is to identify the most commonly used terms. This gives us a global overview of the account we are analyzing and of its way of communicating with its followers. The chart below shows the top 10 words used by the account.

Another way to observe this information is through a word cloud. It takes the data regarding the words used and presents them graphically, where the size of the word indicates its frequency.

Wordcloud
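The word frequencies behind both the top-10 chart and the word cloud can be sketched as below. The stop-word list is a tiny illustrative one and the tokenization rule (lowercase, drop URLs, keep letter runs) is an assumption. A real analysis would use a full stop-word list.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real analysis would use a complete one.
STOP_WORDS = {"a", "the", "to", "of", "and", "you", "it", "we", "is", "on"}

def top_words(texts, n=10):
    """Return the n most frequent words across all texts, ignoring URLs and stop words."""
    words = []
    for text in texts:
        # Lowercase and remove links before splitting into words.
        cleaned = re.sub(r"https?://\S+", " ", text.lower())
        words.extend(w for w in re.findall(r"[a-z]+", cleaned) if w not in STOP_WORDS)
    return Counter(words).most_common(n)

sample = [
    "Mamba Forever. https://t.co/wIchSUwFM2",
    "Mamba Forever. (sound on) https://t.co/B2LUIcpRCc",
    "Unstoppable belief. #justdoit https://t.co/9Axn3glvwz",
]

for word, count in top_words(sample):
    print(word, count)
```

In a word cloud, each count would simply be mapped to a font size.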

Sentiment Analysis

A second step in the analysis is to account for what is being communicated through the account. Translating this into quantitative data tends to be difficult. Fortunately, sentiment analysis offers an opportunity in this regard. This strategy analyzes the words used in the tweets and assigns each of them a value in each of ten sentiment categories. This lets us evaluate, at least roughly, the general tone of the account.

Sentiment scores
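A chart of sentiment scores like this one rests on a lexicon lookup: each word is matched against a dictionary that lists the categories it is associated with, and the matches are added up per category. The sketch below illustrates the idea with a toy lexicon invented for this example. The words, the category names and their assignments are all assumptions and do not reproduce the lexicon used in the report.

```python
from collections import Counter

# Toy lexicon invented for illustration: word -> categories it is associated with.
LEXICON = {
    "dream": {"joy", "anticipation", "positive"},
    "win": {"joy", "positive"},
    "stop": {"negative"},
    "change": {"anticipation"},
}

def category_counts(words, lexicon=LEXICON):
    """Add one to every category associated with each word found in the lexicon."""
    counts = Counter()
    for word in words:
        for category in lexicon.get(word, ()):
            counts[category] += 1
    return counts

words = ["kids", "dream", "change", "win", "stop", "dream"]
scores = category_counts(words)

for category, value in sorted(scores.items()):
    print(category, value)
```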

More broadly, we can divide these feelings into positive and negative, in order to observe the most used words for each of them. This information is very useful to understand more clearly what is happening in the above chart.

Most used words by type of sentiment

Finally, it is possible to classify not just the words but the tweets themselves, according to the emotions of the words they contain. This gives a global overview of the communication that the account is carrying out. Zero indicates neutrality.

Sentiment by tweet
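One common way to score a whole tweet is to subtract the number of negative words from the number of positive words, so that zero means neutral. The sketch below assumes that scoring rule. The two word lists are invented for illustration and the report does not state which rule or lexicon it used.

```python
import re

# Illustrative word lists, invented for this example.
POSITIVE = {"win", "dream", "together"}
NEGATIVE = {"stop", "against"}

def tweet_score(text):
    """Net sentiment of a tweet: positive word count minus negative word count."""
    words = re.findall(r"[a-z]+", text.lower())
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives

examples = [
    "Kids from Akron dream and win",
    "Let us all be part of the change",
    "Nothing can stop us",
]

for text in examples:
    print(tweet_score(text), text)
```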

Topic Analysis

Another useful tool for gaining insight into how an account communicates with its followers is what is known as topic analysis. The objective of this type of analysis is to group the large amount of information we are handling into a set of recurring themes, based on the relationships between words in the tweets. Here we will use a specific tool for this: latent Dirichlet allocation (LDA). I think this will become clearer as we move into practical analysis.

First, let’s look at the 5 most used words in each topic of a 5-topic classification.

Terms by topic

Topic 1 Topic 2 Topic 3 Topic 4 Topic 5
check see justdoit brand updates
can can congratulations athletes keep
available like crazy endorse locked
style thats dream proactively time
know run just requests tuned

The words we observe for each of the topics are words that commonly appear together in the tweets made by the user. We can assign each tweet to the topic it most probably belongs to. This way we can see to what extent the account addressed each one of them.

It’s clear that this estimation is imperfect, but it helps us make more sense of the communications we are studying.

Topic Percentage
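The topic percentages can be sketched as follows: each tweet is assigned to the topic with the highest probability, and the share of tweets per topic is then computed. The probability vectors below are hypothetical values of the kind a fitted LDA model might return. They are not output from the model used in this report.

```python
from collections import Counter

# Hypothetical per-tweet topic probabilities (one row per tweet, one column per topic).
doc_topics = [
    [0.70, 0.10, 0.10, 0.05, 0.05],
    [0.10, 0.60, 0.10, 0.10, 0.10],
    [0.05, 0.05, 0.80, 0.05, 0.05],
    [0.55, 0.15, 0.10, 0.10, 0.10],
]

def dominant_topic(probs):
    """Return the 1-based index of the most probable topic for one tweet."""
    return probs.index(max(probs)) + 1

def topic_shares(rows):
    """Percentage of tweets whose most probable topic is each topic."""
    counts = Counter(dominant_topic(p) for p in rows)
    total = len(rows)
    return {topic: 100 * n / total for topic, n in sorted(counts.items())}

for topic, share in topic_shares(doc_topics).items():
    print(f"Topic {topic}: {share:.1f}%")
```

Assigning each tweet to a single topic discards the remaining probability mass, which is one reason the estimate is imperfect.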

To finish this example, we can look at the average number of favorites and retweets per topic. This can help the account learn more about the impact that its different types of communication with followers are having.

Fav and Rt by Topic
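The averages per topic amount to grouping tweets by their assigned topic and taking the mean of each engagement measure. This is a minimal sketch with hypothetical records. The field names and the numbers are assumptions for illustration only.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical tweets already labeled with a topic; values are invented.
tweets = [
    {"topic": 1, "favorites": 100, "retweets": 20},
    {"topic": 1, "favorites": 300, "retweets": 40},
    {"topic": 3, "favorites": 50, "retweets": 10},
]

def engagement_by_topic(rows):
    """Average favorites and retweets for each topic."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["topic"]].append(row)
    return {
        topic: {
            "favorites": mean(r["favorites"] for r in members),
            "retweets": mean(r["retweets"] for r in members),
        }
        for topic, members in sorted(grouped.items())
    }

for topic, averages in engagement_by_topic(tweets).items():
    print(topic, averages)
```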

Conclusion

This is a quick demonstration of the possibilities offered by Twitter data analysis. The report can be customized and the possibilities are virtually unlimited. If anything you saw here interested you, or you think that a similar analysis could be important for you, your company, brand, or whatever objective you are pursuing, do not hesitate to contact me! I will be happy to answer your questions.

Enjoy!