The aim of this analysis is to identify any connection between opioid prescriptions and educational outcomes in West Virginia. Using opioid prescription data from the WV Board of Pharmacy, as well as the educational achievement outcomes from across the state, I was able to identify a negative correlation between opioid prescriptions and student proficiency in math, science, and reading.
Looking at the correlations between all of my variables, the ones that stand out to me are those between proficiency in each subject (first three columns) and opioid doses per 1000 residents (fourth row from bottom). Each of these correlations is negative.
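As a rough illustration of how such correlations can be computed in R, the sketch below applies `cor()` to the five cluster-level means reported in the k-means summary table in this report. This is not the county-level correlation matrix discussed above, since the county data are not reproduced here. The object names `cluster_means`, `corr_matrix`, and `dose_corr` are assumptions made for the example.

```r
# Cluster-level means copied from the k-means summary table in this report.
# These are five aggregated rows, not the 55 county-level observations.
cluster_means <- data.frame(
  math    = c(31.9, 25.6, 32.7, 26.9, 32.4),
  reading = c(39.1, 34.8, 41.6, 37.3, 40.1),
  science = c(26.0, 20.1, 29.5, 22.8, 25.7),
  median_income = c(49929, 48192, 57485, 46593, 54174),
  opioid_doses_per_1000_residents = c(38341, 69335, 27536, 52947, 34704)
)

# Full Pearson correlation matrix across all columns.
corr_matrix <- cor(cluster_means)

# Pull out the entries of interest: each subject against opioid doses.
subjects <- c("math", "reading", "science")
dose_corr <- corr_matrix[subjects, "opioid_doses_per_1000_residents"]
print(round(dose_corr, 2))
```

Even at this aggregated level, all three subject-versus-dose correlations come out negative and all three subject-versus-income correlations come out positive, which matches the direction described in the text.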
I fit a linear model for each of the three subjects, using opioid doses per 1000 residents and median income as predictors. Although the R-squared values were lower than ideal, both predictors were statistically significant in every model, and the models suggested a positive relationship between proficiency and income and a negative relationship between proficiency and opioid doses.
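A minimal sketch of this kind of model is shown below. The county data are not included in this report, so the example fits `lm()` to synthetic data generated to have the same direction of effects. The data frame `counties`, the simulated values, and the effect sizes are all assumptions. Only the column names come from the output in this report.

```r
set.seed(42)  # fixed seed so the synthetic data are reproducible

n <- 55  # West Virginia has 55 counties
counties <- data.frame(
  median_income = runif(n, 35000, 60000),
  opioid_doses_per_1000_residents = runif(n, 20000, 80000)
)

# Synthetic outcome (assumption): rises with income, falls with doses, plus noise.
counties$math <- 30 +
  2e-4 * (counties$median_income - 45000) -
  1e-4 * (counties$opioid_doses_per_1000_residents - 45000) +
  rnorm(n, sd = 1)

# One model per subject; only math is shown here.
fit <- lm(math ~ opioid_doses_per_1000_residents + median_income,
          data = counties)

# Coefficient table: estimate, standard error, t value, p-value.
coefs <- coef(summary(fit))
print(coefs)

# Share of variance explained by the two predictors.
print(summary(fit)$r.squared)
```

The sign of each estimate gives the direction of the relationship, and the `Pr(>|t|)` column is where the significance of each predictor would be read off.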
I used k-means to cluster the counties into five groups based on the four opioid variables. I then looked at the mean proficiency in each subject for each cluster and found that, in the table below, cluster 3 scored the highest, followed by 5, then 1, then 4, then 2, with the small exception that cluster 1 scored slightly higher than cluster 5 in science. (k-means assigns cluster numbers arbitrarily, so the labels can differ from run to run.) Median income follows nearly the same trend, with cluster 3 having the highest income and clusters 2 and 4 the lowest, although cluster 4 falls slightly below cluster 2. The number of opioid doses prescribed per 1000 residents follows the inverse of the proficiency pattern, with cluster 2 having the most doses and cluster 3 the fewest. This trend suggests a negative association between opioid prescriptions and educational outcomes.
## # A tibble: 5 × 6
##   kmean$...1  math reading science median_income opioid_doses_per_1000_residents
##   <fct>      <dbl>   <dbl>   <dbl>         <dbl>                           <dbl>
## 1 1           31.9    39.1    26.0        49929.                          38341.
## 2 2           25.6    34.8    20.1        48192.                          69335
## 3 3           32.7    41.6    29.5        57485.                          27536.
## 4 4           26.9    37.3    22.8        46593.                          52947
## 5 5           32.4    40.1    25.7        54174.                          34704.
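The clustering step described above can be sketched as follows. This is an illustration on synthetic data, not the code behind the table: the data frame `counties`, the three `opioid_var_*` columns, and the simulated values are hypothetical, and standardizing the variables with `scale()` is an assumption about preprocessing that the report does not state.

```r
set.seed(1)  # k-means starts from random centers; a fixed seed keeps labels stable

n <- 55  # West Virginia has 55 counties

# Synthetic stand-in data. Every column name except
# opioid_doses_per_1000_residents and math is hypothetical.
counties <- data.frame(
  opioid_doses_per_1000_residents = runif(n, 20000, 80000),
  opioid_var_2 = runif(n),
  opioid_var_3 = runif(n),
  opioid_var_4 = runif(n),
  math = runif(n, 20, 40)
)

opioid_cols <- c("opioid_doses_per_1000_residents",
                 "opioid_var_2", "opioid_var_3", "opioid_var_4")

# Standardize so no variable dominates the Euclidean distance by scale alone,
# then cluster into five groups using several random starts.
km <- kmeans(scale(counties[, opioid_cols]), centers = 5, nstart = 25)
counties$cluster <- factor(km$cluster)

# Mean outcome and mean dose level per cluster.
cluster_summary <- aggregate(
  cbind(math, opioid_doses_per_1000_residents) ~ cluster,
  data = counties, FUN = mean
)
print(cluster_summary)
```

Because cluster numbers depend on the random starting centers, calling `set.seed()` before `kmeans()` is what keeps the numbering in the prose and in the printed table in agreement across runs.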
The final analysis I performed for this project is a set of decision trees that use the opioid data to predict whether or not a county is above the national average in each of the three subjects. The three decision trees performed very well: the math tree reached 94.5% accuracy, and the other two reached 98.2% and 96.4% (see the output below). The fact that the decision trees were able to identify proficiency with such accuracy using only opioid data leads me to believe that there is a concrete relationship between opioid usage and educational outcomes.
## [1] 0.9454545
## [1] 0.9818182
## [1] 0.9636364
##
##       No Yes
##   NO  40   2
##   YES  1  12
##
##       No Yes
##   NO  41   0
##   YES  1  13
##
##       No Yes
##   NO  52   2
##   YES  0   1
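The accuracies printed above can be recomputed directly from the three confusion matrices, and it can be useful to set them beside the share of counties in the largest class. The sketch below does both. The matrices are copied from the output in the order printed. The report does not state which axis holds predictions and which holds actual labels, so the majority-class share is computed along both axes. The names `make_cm`, `cms`, `baseline_rows`, and `baseline_cols` are assumptions made for the example.

```r
# Helper (hypothetical name) that rebuilds a 2x2 confusion matrix row by row.
make_cm <- function(values) {
  matrix(values, nrow = 2, byrow = TRUE,
         dimnames = list(c("NO", "YES"), c("No", "Yes")))
}

# Confusion matrices copied from the output above, in the order printed.
cms <- list(
  first  = make_cm(c(40, 2, 1, 12)),
  second = make_cm(c(41, 0, 1, 13)),
  third  = make_cm(c(52, 2, 0, 1))
)

# Accuracy = correctly classified counties (diagonal) / all counties.
accuracy <- sapply(cms, function(cm) sum(diag(cm)) / sum(cm))

# Share of the largest class along each axis. Whichever axis holds the actual
# labels, this is the accuracy of always predicting the majority class.
baseline_rows <- sapply(cms, function(cm) max(rowSums(cm)) / sum(cm))
baseline_cols <- sapply(cms, function(cm) max(colSums(cm)) / sum(cm))

print(round(rbind(accuracy, baseline_rows, baseline_cols), 3))
```

For the first two matrices the largest class covers roughly 75% of the 55 counties, so the trees' accuracies sit well above that share. For the third matrix the largest class already covers 94.5% or 98.2% of counties, depending on the axis, so the 96.4% accuracy there says less about the tree's predictive power.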
There are certainly limitations to the analyses I performed, chiefly the lack of historical data. Without educational data from the 1980s and 90s, before opioid usage became so widespread, I am unable to determine whether the proficiency scores are caused by opioid usage or whether both are simply effects of some larger socioeconomic cause. Additionally, it is entirely possible that the scores are not influenced by opioid usage at all, and that opioid usage is instead influenced by educational outcomes, for example through pharmaceutical companies specifically targeting poorly educated areas.
Opioid data obtained from the WV Board of Pharmacy.
ChatGPT was used to read the data in PDFs and convert it to .xlsx files.