Drug Switching Presentation

author: Sam Weiss date: 4/4/2016 autosize: true

Introduction

The goal is to figure out which drugs people switch from, which drugs they switch to, and why.

In this presentation I lay out one formulation of the problem:

The data sent to me was very messy: some of the text was about drug companies' financials and some was in other languages. I decided instead to scrape reviews from WebMD for each drug provided, which gave a standard dataset to work with (data enclosed in the email). This captured ~2,700 reviews/posts.

Labelling with spaCy

spaCy uses NLP to build a tree of dependencies between the words in a sentence. With it I can correctly classify a sentence like:

FALSE [1] " doctor stopped byetta was put on januvia and glimepiride  mgs"
FALSE        relation       first                     second sentence
FALSE 11480     nsubj      doctor                    stopped      916
FALSE 11481      ROOT     stopped               ROOT*stopped      916
FALSE 11482 nsubjpass      byetta                put-stopped      916
FALSE 11483   auxpass         was                put-stopped      916
FALSE 11484     ccomp         put                    stopped      916
FALSE 11485      prep          on                put-stopped      916
FALSE 11486      pobj     januvia             on-put-stopped      916
FALSE 11487        cc         and     januvia-on-put-stopped      916
FALSE 11488  compound glimepiride mgs-januvia-on-put-stopped      916
FALSE 11489                       mgs-januvia-on-put-stopped      916
FALSE 11490      conj         mgs     januvia-on-put-stopped      916

From this you can see that the chain of heads for Byetta is only “put-stopped”, while for Januvia it is “on-put-stopped”. This difference allows me to correctly label the transition from one drug to another. This is very powerful, and I think a lot more can be done to determine the structure of a sentence. For now I’ll just use it for labelling purposes.
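As a rough illustration of this labelling idea, here is a minimal Python sketch. It does not call spaCy; the `HEADS` mapping is copied by hand from the parse shown above, and `head_path`, `label_drug`, and the "on means start, stopped means stop" rule are hypothetical simplifications, not the actual labelling code. It also assumes each word appears only once in the sentence.

```python
# Hand-copied from the parse above: token -> its head token (None marks the root).
# Assumption: every word is unique within the sentence.
HEADS = {
    "doctor": "stopped",
    "stopped": None,
    "byetta": "put",
    "was": "put",
    "put": "stopped",
    "on": "put",
    "januvia": "on",
    "and": "januvia",
    "glimepiride": "mgs",
    "mgs": "januvia",
}


def head_path(token, heads):
    """Walk from a token up to the root and join the heads with '-'."""
    path = []
    head = heads[token]
    while head is not None:
        path.append(head)
        head = heads[head]
    return "-".join(path)


def label_drug(token, heads):
    """Hypothetical rule: 'on' in the path means start, otherwise 'stopped' means stop."""
    path = head_path(token, heads).split("-")
    if "on" in path:
        return "start"
    if "stopped" in path:
        return "stop"
    return "unknown"


for drug in ("byetta", "januvia", "glimepiride"):
    print(drug, head_path(drug, HEADS), label_drug(drug, HEADS))
```

A real version would need more cue words than "on" and "stopped", and would read the heads from the parser output rather than a hand-built dictionary.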

Sankey Diagrams - Different Question

The first question was which drugs people switch from and to. Below is a Sankey diagram of this, where you can see which drugs a consumer stops (on the left) and starts (on the right).

However, only about 90 posts mentioned switching to a different drug by name. That is too few observations to do much statistics on, so I can’t answer the question directly. Instead, I thought I’d try to answer a different question: Why do people start and stop specific drugs?
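For the original switching question, the data behind a Sankey diagram is just a count of (stopped drug, started drug) pairs. The sketch below shows one way to tally them in Python. The `switches` list and its counts are made up for illustration, and `sankey_links` is a hypothetical helper name; the source/target/value layout is only one common way to feed a plotting library.

```python
from collections import Counter

# Hypothetical labelled switches: (drug stopped, drug started), one per post.
switches = [
    ("byetta", "januvia"),
    ("byetta", "victoza"),
    ("januvia", "actos"),
    ("byetta", "victoza"),
]


def sankey_links(pairs):
    """Count each (from, to) pair and return one link record per distinct pair."""
    counts = Counter(pairs)
    return [
        {"source": src, "target": dst, "value": n}
        for (src, dst), n in sorted(counts.items())
    ]


links = sankey_links(switches)
for link in links:
    print(link)
```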

Extract Features - Topic Modelling

The methodology is to extract features and correlate these features with whether a person starts / stops taking a certain drug. I used a method called “Latent Dirichlet Allocation”, or “Topic Models”, to extract features. This assumes each sentence is a mixture of different “topics”. Below are the most important words for each topic. We have to interpret the results ourselves, but it seems to do an OK job for a first try.
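To make the "mixture of topics" idea concrete, here is a small, self-contained Python sketch of Latent Dirichlet Allocation fitted with collapsed Gibbs sampling. It is not the implementation used for these slides. The toy `docs`, the function name `lda_gibbs`, and the values of `alpha`, `beta`, and `n_iter` are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampler for LDA; docs is a list of token lists."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]         # tokens per (doc, topic)
    topic_word = [Counter() for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics                       # tokens per topic
    assign = []
    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(n_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    # Smoothed topic proportions per document: these are the features.
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic)
    ]
    top_words = [
        [w for w, c in tw.most_common() if c > 0][:3] for tw in topic_word
    ]
    return theta, top_words


# Toy tokenized sentences (hypothetical, loosely modelled on the reviews).
docs = [
    ["gain", "weight", "swell", "leg"],
    ["weight", "gain", "lbs", "swell"],
    ["cost", "insurance", "generic"],
    ["cost", "insurance", "high", "cost"],
]
theta, top_words = lda_gibbs(docs, n_topics=2)
print(top_words)
print(theta)
```

Each row of `theta` is one sentence's topic mixture, which is the kind of feature vector a later regression step can use.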

Regression Analysis

Next we regress whether a person stops or starts a drug on these features. The model coefficients are then used to determine what is associated with an increase in the probability of stopping or starting a drug. Below, the topic is “gain_also_weight_swell_leg_pain_lbs”, which covers gaining weight, fluid retention, and swelling. We can see that Actos is more associated with this feature for BOTH stopping and starting the drug. This is because people often discuss the changes they notice when they start a new drug, not why they started it. For example:
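The regression step can be sketched as a plain logistic regression of a stop label on topic weights. The Python below is only an illustration under stated assumptions: the feature rows in `X`, the labels in `y`, the function name `fit_logistic`, and the learning rate are all made up, and the model behind these slides may have been specified differently (for example with drug-by-topic interactions).

```python
import math


def fit_logistic(X, y, lr=0.5, n_iter=2000):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    n = len(X)
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(n_iter):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of stopping
            err = p - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Hypothetical topic weights per post: [weight_gain_topic, cost_topic].
X = [
    [0.9, 0.1], [0.8, 0.2], [0.7, 0.1], [0.2, 0.8],
    [0.1, 0.9], [0.3, 0.6], [0.6, 0.3], [0.2, 0.7],
]
# Hypothetical labels: 1 if the post mentions stopping the drug, else 0.
y = [1, 1, 1, 0, 0, 0, 0, 1]

w, b = fit_logistic(X, y)
print("coefficients:", w, "intercept:", b)
```

A larger coefficient means that topic is more strongly associated with stopping in this toy data. As the Actos example shows, association is not the same as the reason for the decision.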

FALSE [1] "i am so glad i read these reviews after only  weeks of taking actos i have leg swellingpain and severe abdominal problems"                                
FALSE [2] "ive been taking actos for about  months the med works well but i have gained about  lbs and i have swelling in my legs and some slight breathing problems"

Other Topics: Cost

FALSE [1] "i was started on januvia  yrs ago or so and was doing well and blood sugars were under control and then had to stop and take actos because of high cost of januvia"
FALSE [2] "why is this medicine  so costly why do you all not have a generic to actos"
FALSE [1] "insurance forced me from byetta to victoza due to the cost"
FALSE [2] "  all im not too happy with is the cost of victoza"

Other Topics: Nausea

FALSE [1] "bydureon definetly has helped me nausea was horrible in the begining but has gotten better"                                                               
FALSE [2] "  i have had two bouts of extreme nausea and vomiting which i first attributed to the flu but now have realized it was probably a side effect of bydureon"
FALSE [1] "  i didnt get the nausea except when i injected byetta into my arm"         
FALSE [2] " i was nauseated the first time i took byetta and one other time after that"

Conclusion

I haven’t been able to directly answer the question you asked. However, I do think these methods show there is some value in identifying which drugs are associated with which topics. This will at least give you a better idea of what to look for.