If you attended my workshop on neural networks in R using TensorFlow via Keras, then parts of the following may seem familiar!
Now that that’s out of the way, let’s get to it!
R/Pharma 2020 Conference for practitioners of R in the Pharmaceutical Industry • Oct. 15th 2020
Identifying drug candidates is expensive!
Performing biochemical screening assays in the laboratory is time-consuming
When the number of potential candidates is large, predictive modeling can help prioritise candidates for screening
This greatly reduces the search space and thus the costs
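As a minimal sketch of what prioritisation could look like in practice: rank the candidates by a model's predicted score and send only the top few to the lab. Everything here is an assumption for illustration: the `candidates` data frame, its `predicted_score` column, the `prioritise()` helper, the cut-off of 50, and the convention that a higher score means a more promising candidate. The scores are random numbers, not real predictions.

```r
set.seed(42)  # make the made-up data reproducible

# Hypothetical candidates with made-up predicted scores (higher = more promising)
candidates <- data.frame(
  id = sprintf("cand_%04d", 1:1000),
  predicted_score = runif(1000)
)

# Hypothetical helper: keep the n_top highest-scoring candidates
prioritise <- function(df, n_top) {
  ranked <- df[order(df$predicted_score, decreasing = TRUE), ]
  head(ranked, n_top)
}

shortlist <- prioritise(candidates, n_top = 50)

# Fraction of the original search space that is sent to the lab
fraction_screened <- nrow(shortlist) / nrow(candidates)
print(fraction_screened)
```

The shortlist is only as good as the scores behind it, so the value of this step depends entirely on how well the model's predictions reflect true activity.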
Source: Original file on Wikipedia | Author | CC BY-SA 4.0
You work as a data scientist in a pharmaceutical company
You have taken delivery of a predictive model for candidate prioritisation
The documentation states that the final model was created by expanding an initial simple model
The documentation also includes visualisations of performance metrics for the final model
Source: Original file on kissclipart
Metrics:
mse = mean squared error (low = good)
pcc = Pearson’s correlation coefficient (high = good)
scc = Spearman’s correlation coefficient (high = good)
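For reference, the three metrics can be computed in base R as sketched below. The `compute_metrics()` helper and the toy vectors are assumptions made up for illustration, not taken from the delivered model. The toy prediction is deliberately a monotone but non-linear function of the true values, to show that the metrics can disagree: Spearman's coefficient only looks at ranks, Pearson's coefficient measures linear association, and the mse measures the actual size of the errors.

```r
# Hypothetical helper: the three metrics for observed vs. predicted values
compute_metrics <- function(y_true, y_pred) {
  stopifnot(length(y_true) == length(y_pred))
  c(
    mse = mean((y_true - y_pred)^2),                # mean squared error
    pcc = cor(y_true, y_pred, method = "pearson"),  # linear association
    scc = cor(y_true, y_pred, method = "spearman")  # rank (monotone) association
  )
}

# Made-up toy values: predictions preserve the ranking but not the scale
y_true <- c(1, 2, 3, 4, 5)
y_pred <- y_true^2

metrics <- compute_metrics(y_true, y_pred)
print(round(metrics, 3))
```

Here the ranking is reproduced perfectly while the squared errors are large, so a model can look good on one metric and poor on another.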
Conclusion:
Evidently, the complex model captures the more subtle information better
Therefore, it is decided to put the complex model into production
You create a Shiny app wrapping the model predictions and continue with other tasks
However…
Time goes by…
People using the model for prioritisation in the wet lab start complaining: despite prioritising targets with the model, very few of the prioritised candidates turn out to be relevant downstream