Introduction to the Data

The purpose of this document is to analyze tweets about two NBA playoff games: Boston Celtics vs. Milwaukee Bucks and Memphis Grizzlies vs. Golden State Warriors. Both games were played on May 1st, 2022, with Celtics vs. Bucks (referred to as Game 1) happening first and Grizzlies vs. Warriors (referred to as Game 2) directly after. I used the Twitter API to collect the 500 most recent tweets for each team later that night, once the games were over. I collected them that night to capture people’s reactions to their team winning or losing, and to avoid the ads for watching the game on TV, since those tweets do not provide much insight. With this data, I will conduct sentiment analyses to see whether there is a difference between the games.

** Data does not contain retweets

Analyses and Comparisons

Question 1: Which words did people use the most throughout the games?

The tables and graphs below show the number of times (n) that each word was used for each game. To keep the graphs readable, I only show words that were used more than 50 times. The first table shows the game between the Celtics and the Bucks. It does not come as a surprise that the most used word was ‘bucks’. The Bucks won the game, and people were talking about how they were able to win Game 1 of the Eastern Conference Semifinals on the road in Boston. People tend to talk about the winning team more, and the first table supports this for the Bucks. The second table supports it as well: the Warriors pulled off a come-from-behind win on the road against the Grizzlies, leading to ‘warriors’ as the top word. One other thing that I found interesting is that, for every team, the team nickname was used more often than the name of the city or state it plays in. The team closest to a 1:1 ratio was the Memphis Grizzlies, with ‘grizzlies’ used 643 times and ‘memphis’ used 632 times.
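The counting step described above can be sketched in Python. This is only an illustration under stated assumptions, not the code behind the tables: the tokenizer, the short stop-word list, and the function names are hypothetical, and the three sample tweets are made up.

```python
import re
from collections import Counter

# Hypothetical, very short stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "to", "of", "in", "is", "for", "on"}

def tokenize(text):
    """Lowercase a tweet and split it into word tokens (drops punctuation and '#')."""
    return re.findall(r"[a-z0-9']+", text.lower())

def word_counts(tweets, min_count=50):
    """Count words across all tweets and keep those used more than min_count times."""
    counts = Counter(
        word
        for tweet in tweets
        for word in tokenize(tweet)
        if word not in STOP_WORDS
    )
    # most_common() sorts from the most to the least frequent word.
    return [(word, n) for word, n in counts.most_common() if n > min_count]

# Made-up sample tweets; the threshold is lowered so the tiny sample returns something.
sample = [
    "Bucks win in Boston!",
    "What a game for the Bucks",
    "Celtics lose to the Bucks",
]
top_words = word_counts(sample, min_count=1)
```

With the full set of tweets for a game, leaving `min_count` at 50 would reproduce the "used more than 50 times" cutoff.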

Some other words that appeared across both games were NBA, game (referring to Game 1 of each series), playoffs, win, nbaplayoffs, and words of that nature. The only words that differed were specific to each game: for example, the names of players who were playing well, or ‘ejection’ in the case of the Grizzlies vs. Warriors game, because Draymond Green was ejected in the second quarter.

Question 2: Were there words that affected sentiment the most?

For my next question, I wanted to see whether certain words were driving a more positive or negative sentiment for each game. To do this, I used the Bing lexicon, which labels each word as either positive or negative, and counted how often each labeled word appeared.
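A minimal sketch of this kind of lexicon lookup in Python follows. It assumes a tiny stand-in dictionary rather than the real Bing lexicon; the entries are a few of the words discussed in this report, and the function name and sample tweets are hypothetical.

```python
import re
from collections import Counter

# Stand-in for the lexicon: word -> "positive" or "negative".
# Only a handful of entries, chosen from words mentioned in this report.
TOY_LEXICON = {
    "win": "positive",
    "victory": "positive",
    "free": "positive",
    "golden": "positive",
    "missed": "negative",
    "loss": "negative",
    "defensive": "negative",
}

def sentiment_contributions(tweets, lexicon):
    """Count how often each lexicon word appears, keyed by (word, label)."""
    counts = Counter()
    for tweet in tweets:
        for word in re.findall(r"[a-z']+", tweet.lower()):
            if word in lexicon:
                counts[(word, lexicon[word])] += 1
    # Sorted from the word contributing most to the one contributing least.
    return counts.most_common()

# Made-up sample tweets.
sample = [
    "Huge win for the Bucks",
    "Bucks win with great defensive play",
    "Missed free throws cost Boston",
]
contributions = sentiment_contributions(sample, TOY_LEXICON)
```

Words that are not in the lexicon (such as "bucks" or "great" here) are simply ignored, which is why only labeled words show up in this kind of chart.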

As you can see, some words affected the sentiment more than others. But do the lexicon’s labels match how these words are used in the context of basketball? The answer is yes and no. For Game 1, the words labeled correctly within the scope of basketball were words like win, victory, smart, advantage, missed, loss, and losers. Win was the top sentiment-driving word that was taken in the right context. Free and golden, on the other hand, were not. Free most likely refers to a free throw, which can be a positive in some instances but not always. Golden is counted as positive only because it is part of the Golden State Warriors’ name. It is also interesting that golden was the second most positive word for this game, because the Warriors were not even playing in it. I assume this is because some people were talking about all of the teams in the playoffs. On the negative side, there is really only one word that may not belong, and that is defensive. It does not describe someone being defensive toward another person; rather, people are talking about teams playing defense.

For Game 2, we see some of the same things that we saw in the graph for Game 1. The only major difference is that golden played a heavier role in the positive sentiment. This is because the Warriors were playing in this game, and more people talked about them as it went on. Like Game 1, some words surprised me by being labeled positive or negative: defeated/defeat as positives, and bane, steal, and defensive as negatives. The only way defeated/defeat would be positive is if it referred to Golden State winning the game and defeating the Grizzlies, in which case it is positive for one team and negative for the other. As for the negative words, Bane is one that some people might not recognize: it is the last name of one of the Grizzlies’ players and probably should not be in this analysis. Defensive is, again, similar to what happened in the Game 1 graph. It refers to the defense that the teams are playing and not necessarily to a negative sentiment.
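One way to handle words like these would be to drop them from the lexicon before scoring. The sketch below illustrates the idea in Python; the exclusion list is an assumption based on the words flagged in this section, the lexicon is a small stand-in, and this step was not part of the analysis reported here.

```python
# Stand-in lexicon with a few of the words discussed in this section.
TOY_LEXICON = {
    "win": "positive",
    "golden": "positive",
    "free": "positive",
    "loss": "negative",
    "bane": "negative",
    "steal": "negative",
    "defensive": "negative",
}

# Hypothetical list of words whose everyday meaning does not match
# their basketball meaning (team names, player names, game terms).
BASKETBALL_EXCLUSIONS = {"golden", "free", "bane", "steal", "defensive"}

def filter_lexicon(lexicon, exclusions):
    """Return a copy of the lexicon without the excluded words."""
    return {word: label for word, label in lexicon.items() if word not in exclusions}

basketball_lexicon = filter_lexicon(TOY_LEXICON, BASKETBALL_EXCLUSIONS)
```

Scoring with the filtered lexicon would keep words like win and loss while ignoring names and game terms. Ambiguous cases such as defeated/defeat would still need a judgment call.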

Question 3: Where in the tweet did these words occur?

My final question dealt with where in the tweet each word occurred (the beginning, middle, or end), the sentiment score for each of those positions, and whether this varied between the two games.

There was a clear difference between the two games. The tweets from Game 1 had higher sentiment scores at the end of the tweet, whereas the tweets from Game 2 had higher scores toward the middle. For Game 1, Boston scored very low at the beginning and middle of the tweet but then shot up toward the end. Milwaukee followed a similar pattern, but their scores were always higher and rose further into the tweet. Game 2, as I said before, is a different story. We see more positive scores in the middle of the tweet, with the end being lower than the middle but higher than the beginning. I found these two graphs interesting because the two teams within each game were similar to each other, but the two games were very different from one another.
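To make the idea of scoring by position concrete, here is a minimal Python sketch. It rests on several assumptions that may differ from how the graphs were produced: each tweet is split into thirds by word position, positive words score +1 and negative words score -1, and the lexicon is a tiny stand-in. The function names and the sample tweet are hypothetical.

```python
import re

# Stand-in lexicon: +1 for positive words, -1 for negative words (assumed scoring).
TOY_SCORES = {"win": 1, "great": 1, "loss": -1, "missed": -1}

POSITIONS = ("beginning", "middle", "end")

def position_bucket(index, n_tokens):
    """Map a word's index to the first, second, or last third of the tweet."""
    return POSITIONS[index * 3 // n_tokens]

def sentiment_by_position(tweets, scores):
    """Sum the word scores separately for each third of the tweets."""
    totals = {position: 0 for position in POSITIONS}
    for tweet in tweets:
        tokens = re.findall(r"[a-z']+", tweet.lower())
        for i, word in enumerate(tokens):
            # Words missing from the lexicon contribute 0.
            totals[position_bucket(i, len(tokens))] += scores.get(word, 0)
    return totals

# Made-up sample tweet: negative word early, positive words late.
sample = ["Missed shots early but what a great win"]
position_scores = sentiment_by_position(sample, TOY_SCORES)
```

Running this per team would give one beginning, middle, and end total for each team, which could then be compared across the two games.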