Synopsis:

I analysed the Twitter activity of five top global pharma companies (by sales): Novartis, Pfizer, Roche, Sanofi and Merck.

Given below are the findings.


Approach:

Data Collection: Collected tweets related to these major pharma companies using the Twitter API, searching for tweets that mention handles such as @merck and @pfizer. The Twitter search API by default returns tweets from the last 7 days.

Engagement Profile: Use the tweet count as a measure of the relative engagement profile of these companies for the period under consideration.

Wordcloud: Create a wordcloud of the most frequent words for each of these pharma companies and investigate whether any interesting trends emerge.

Dendrogram: Create a dendrogram of the words frequently tweeted together and investigate whether any interesting trends emerge.

Sentiment Analysis: Perform sentiment analysis of the tweets using a simple word sentiment score methodology, where each word is scored based on whether it is a positive or a negative word, and the entire tweet is scored as the sum of the individual word scores in the tweet.
Tweets with a score of 0 are treated as neutral tweets. Tweets with negative scores are treated as negative tweets, and tweets with positive scores are treated as positive tweets.
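
A minimal sketch of the word-score approach described above, in Python. The two word lists here are tiny made-up samples (an assumption for illustration); a real run would load a full positive/negative opinion lexicon, and the function names are hypothetical.

```python
import re

# Tiny illustrative lexicons (assumed); replace with a full opinion lexicon.
POSITIVE_WORDS = {"good", "great", "congrats", "bravo", "winner", "innovation"}
NEGATIVE_WORDS = {"bad", "risk", "failure", "recall", "lawsuit"}


def score_tweet(text):
    """Return (number of positive words) minus (number of negative words)."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum((w in POSITIVE_WORDS) - (w in NEGATIVE_WORDS) for w in words)


def label_tweet(text):
    """Map the summed word score to positive / negative / neutral."""
    score = score_tweet(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for tweet in ["Congrats to the winner, bravo!",
                  "Great result but bad risk",
                  "Trial results announced today"]:
        print(label_tweet(tweet), "|", tweet)
```

Because every word counts equally, a tweet with one positive and two negative words is labelled negative even if the overall message is upbeat.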


Relative social engagement profile:

Tweet volume may vary with external factors such as news items, but I plotted the tweet count for each company to understand the relative social engagement profile of these pharma majors.
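
Counting tweets per company can be sketched as follows, assuming the collected tweets are available as a plain list of strings. Only @merck and @pfizer are named in the data collection step; the other handles and the function name are assumptions for illustration.

```python
import re
from collections import Counter

# @merck and @pfizer come from the text; the other handles are assumed.
HANDLES = ["@novartis", "@pfizer", "@roche", "@sanofi", "@merck"]


def count_mentions(tweets, handles=HANDLES):
    """Count how many tweets mention each handle (case-insensitive)."""
    # The lookarounds stop "@merck" from matching inside "@merckgroup".
    patterns = {h: re.compile(r"(?<!\w)" + re.escape(h) + r"(?!\w)")
                for h in handles}
    counts = Counter({h: 0 for h in handles})
    for text in tweets:
        lowered = text.lower()
        for handle, pattern in patterns.items():
            if pattern.search(lowered):
                counts[handle] += 1
    return counts


if __name__ == "__main__":
    sample = ["@Merck launches experiment",
              "Thanks @pfizer and @merck",
              "no mention here",
              "@merckgroup news"]
    for handle, n in count_mentions(sample).most_common():
        print(handle, n)
```

A tweet that mentions two companies is counted once for each, which seems the natural choice when the counts are compared across companies.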

Merck appears to be the most active company on Twitter, followed by Pfizer. Roche appears to be the least engaged, followed by Sanofi.

Let us take a closer look at the tweets related to each of these companies.
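
The wordclouds in the sections that follow are driven by word frequencies. A minimal sketch of that counting step is shown here; the stopword list, the cleaning rules and the function name are assumptions, and the resulting counts would be handed to whichever wordcloud plotting tool is in use.

```python
import re
from collections import Counter

# Small assumed stopword list; a real run would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "for", "on",
             "is", "rt", "via", "with", "at"}


def word_frequencies(tweets, top_n=50):
    """Return the top_n (word, count) pairs across all tweets."""
    counts = Counter()
    for text in tweets:
        # Drop URLs and @handles so they do not dominate the cloud.
        cleaned = re.sub(r"https?://\S+|@\w+", " ", text.lower())
        counts.update(w for w in re.findall(r"[a-z]+", cleaned)
                      if w not in STOPWORDS and len(w) > 2)
    return counts.most_common(top_n)


if __name__ == "__main__":
    sample = ["RT @novartis: New cancer trial https://t.co/abc",
              "Cancer research and the #cancer patients"]
    print(word_frequencies(sample, top_n=5))
```

Hashtag text is kept (only the `#` is dropped by the tokenizer), since hashtags often carry the topic of a tweet.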


Novartis Tweets Analysis:

Given below is the wordcloud created from tweets related to Novartis.

Observations

This is in line with what one would expect of tweets about an oncology-focused pharma major; no surprises here.

Given below is the dendrogram created from tweets related to Novartis.

This dendrogram is very interesting to observe.
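
A dendrogram is a drawing of the merge order produced by hierarchical clustering. The sketch below shows one way words that are tweeted together could be grouped, using Jaccard distance on the sets of tweets each word appears in and average linkage. The distance measure, the linkage and the toy tweets are all assumptions for illustration, not necessarily what produced the plots in this post.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 - |A and B| / |A or B| for two sets of tweet indices."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 1.0


def cluster_words(tweets, vocabulary):
    """Average-linkage agglomerative clustering of words.

    Returns the merge steps in order as (cluster_1, cluster_2, distance);
    these are the joins a dendrogram would draw from bottom to top.
    """
    # Map each word to the set of tweets in which it occurs.
    occurs = {w: {i for i, t in enumerate(tweets) if w in t.lower().split()}
              for w in vocabulary}

    def dist(c1, c2):
        pairs = [(x, y) for x in c1 for y in c2]
        total = sum(jaccard_distance(occurs[x], occurs[y]) for x, y in pairs)
        return total / len(pairs)

    clusters = [(w,) for w in vocabulary]
    merges = []
    while len(clusters) > 1:
        # Merge the two closest clusters.
        c1, c2 = min(combinations(clusters, 2), key=lambda p: dist(*p))
        merges.append((c1, c2, round(dist(c1, c2), 3)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


if __name__ == "__main__":
    toy_tweets = ["cancer trial data", "cancer trial results",
                  "heart failure drug", "heart failure study", "heart news"]
    for step in cluster_words(toy_tweets,
                              ["cancer", "trial", "heart", "failure", "news"]):
        print(step)
```

Words that always appear in the same tweets merge first at distance 0, so tight pairs sit at the bottom of the tree and loosely related topics join near the top.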


Pfizer Tweets Analysis:

Given below is the wordcloud created from tweets related to Pfizer.

Observations

The first thing I notice in this wordcloud, compared to the Novartis wordcloud, is that the focus on diseases is missing. Two key themes emerge.

Given below is the dendrogram created from tweets related to Pfizer.

Along with the observations above, the dendrogram brings in an additional perspective involving CEO, Read, GSK and Merck. When I looked up why, it seems analysts expect that Pfizer’s CEO, Mr. Read, might make a bid for GSK, and that he recently made a 2.9B deal with Merck on cancer therapies.


Roche Tweets Analysis:

Given below is the wordcloud created from tweets related to Roche.

Observations

These are the typical tweets you would expect for a pharma company: focused on disease and patients, with less engagement on other topics. This also explains why Roche is one of the least engaged pharma companies on Twitter.

Given below is the dendrogram created from tweets related to Roche.

The dendrogram confirms the finding from the wordcloud: tweets are entirely focused on diseases and trials. Roche seems to be minimally engaged in other activities, at least as viewed on Twitter.


Sanofi Tweets Analysis:

Given below is the wordcloud created from tweets related to Sanofi.

As expected, the tweets are dominated by the 2015 Patents for Humanity awards: sunpower, novartis, bravo, congrats, winner, etc. Terms related to Sanofi’s core business (amp, malaria, health, innovation, etc.) are also frequently tweeted.

Let's take a look at the dendrogram.


Merck Tweets Analysis:

Given below is the wordcloud created from tweets related to Merck.

I was totally surprised by Merck’s tweet wordcloud: spacex, NASA, payloads, spacestation, launch, iss, cargo, etc. It looked more like that of a space company than that of a pharma major, so I had to google it. The explanation: on April 17, Merck sent a protein crystal growth experiment to the International Space Station.

Oncology, cancer, etc. are also mentioned, on a relatively smaller scale.

Let's take a look at the dendrogram.


Sentiment Analysis:

Now that we have analysed the tweets for each of these pharma majors individually, let us look at the sentiment analysis plot for each of them.


It seems most of the tweets are neutral, so this plot doesn’t provide a good understanding of the tweet sentiment.

Let us calculate the tweet sentiment excluding neutral tweets, taking the ratio of positive tweets to all non-neutral tweets.
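
As a minimal sketch, assuming each tweet already has an integer sentiment score (0 meaning neutral), the share of positive tweets among the non-neutral ones could be computed like this. The function name is hypothetical.

```python
def positive_ratio(scores):
    """Share of positive tweets among non-neutral tweets.

    Returns None when every tweet is neutral, since the ratio is undefined.
    """
    positive = sum(1 for s in scores if s > 0)
    negative = sum(1 for s in scores if s < 0)
    non_neutral = positive + negative
    return positive / non_neutral if non_neutral else None


if __name__ == "__main__":
    # Three positive, one negative, three neutral tweets.
    print(positive_ratio([2, 0, 0, -1, 1, 0, 3]))
```

With very few non-neutral tweets the ratio can swing widely, so it helps to read it alongside the raw counts.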


Using the simple word sentiment approach, we see Sanofi leading the positive sentiment score, followed by Pfizer and Merck. Sanofi and Merck are helped by the congratulatory messages for their Patents for Humanity win. Merck seems to have the least positive sentiment among its top 5 pharma peers.


Conclusion

We observed a few interesting trends using a simple visual approach to the tweets, and unearthed a few news items that explain curious observations in the wordclouds and dendrograms above.

We should take the sentiment score results with a bit of caution, as word-based sentiment analysis can be misleading, especially in an industry that deals with diseases, trials and therapies.

As a future improvement to this project, I am considering using more advanced natural language processing methods to calculate the sentiment scores.