R Markdown

The Drug Use by Age dataset was taken from FiveThirtyEight’s GitHub repository: https://github.com/fivethirtyeight/data/tree/master/drug-use-by-age

The dataset covers 17 age groups and 13 drugs. For each age group it records the percentage of respondents who used each drug and the median frequency of use.

The purpose of this project is to see which age groups use illegal soft drugs, hard drugs, legal pharmaceuticals, or alcohol. I also want to see whether there is an overall trend, for example whether younger age groups start with one group of drugs and gravitate to another with age.

I grouped hallucinogens and marijuana as “soft drugs” and the more traditional hard drugs (cocaine, crack, heroin, and meth) as “hard drugs”. The pharmaceutical drugs deserved two categories, since pain relievers and OxyContin alleviate pain while the other pharmaceuticals (tranquilizers, stimulants, and sedatives) serve different purposes. I left inhalants and alcohol in their own categories: alcohol is its own intoxicant, and I did not know how to categorize inhalants appropriately.
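The grouping described above can also be written down as a lookup from category to dataset columns, which makes it easy to check that all 13 drugs are assigned exactly once. This is only a sketch in base R: the `categories` list and the `toy` data frame (filled with ones) are hypothetical and not part of the analysis below, although the column names match the dataset.

```r
# Hypothetical lookup: category name -> dataset columns that feed it
categories <- list(
  soft_drugs        = c("marijuana.use", "hallucinogen.use"),
  hard_drugs        = c("cocaine.use", "crack.use", "heroin.use", "meth.use"),
  pharma_pain_drugs = c("pain.releiver.use", "oxycontin.use"),
  other_pharma      = c("tranquilizer.use", "stimulant.use", "sedative.use"),
  inhalant          = "inhalant.use",
  alcohol           = "alcohol.use"
)

all_cols <- unlist(categories, use.names = FALSE)
n_drugs  <- length(all_cols)        # number of drug columns covered
no_dupes <- !any(duplicated(all_cols))

# Toy data: two rows, every drug column set to 1 (made-up values)
toy <- as.data.frame(matrix(1, nrow = 2, ncol = n_drugs,
                            dimnames = list(NULL, all_cols)))

# Sum the member columns of each category, row by row
condensed <- sapply(categories,
                    function(cols) rowSums(toy[, cols, drop = FALSE]))
condensed
```

With every toy value set to 1, each condensed column simply counts how many drugs belong to that category.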

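#Load packages used below (dplyr for %>%, select, mutate; data.table for setnames)
library(dplyr)
library(data.table)

#Retrieve Original Data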
DrugUse <- "https://raw.github.com/geeman1209/MSDATA2020/master/Winter Bridge - R/Data/drug-use-by-age.csv"

DU <- read.csv(DrugUse)

#Print summary of original data
summary(DU)
##       age           n         alcohol.use    alcohol.frequency marijuana.use  
##  12     : 1   Min.   :2223   Min.   : 3.90   Min.   : 3.00     Min.   : 1.10  
##  13     : 1   1st Qu.:2469   1st Qu.:40.10   1st Qu.:10.00     1st Qu.: 8.70  
##  14     : 1   Median :2798   Median :64.60   Median :48.00     Median :20.80  
##  15     : 1   Mean   :3251   Mean   :55.43   Mean   :33.35     Mean   :18.92  
##  16     : 1   3rd Qu.:3058   3rd Qu.:77.50   3rd Qu.:52.00     3rd Qu.:28.40  
##  17     : 1   Max.   :7391   Max.   :84.20   Max.   :52.00     Max.   :34.00  
##  (Other):11                                                                   
##  marijuana.frequency  cocaine.use    cocaine.frequency   crack.use     
##  Min.   : 4.00       Min.   :0.000   5.0    :6         Min.   :0.0000  
##  1st Qu.:30.00       1st Qu.:0.500   5.5    :2         1st Qu.:0.0000  
##  Median :52.00       Median :2.000   8.0    :2         Median :0.4000  
##  Mean   :42.94       Mean   :2.176   -      :1         Mean   :0.2941  
##  3rd Qu.:52.00       3rd Qu.:4.000   1.0    :1         3rd Qu.:0.5000  
##  Max.   :72.00       Max.   :4.900   15.0   :1         Max.   :0.6000  
##                                      (Other):4                         
##  crack.frequency   heroin.use     heroin.frequency hallucinogen.use
##  -      :3       Min.   :0.0000   -      : 1       Min.   :0.100   
##  5.0    :2       1st Qu.:0.1000   1.0    : 1       1st Qu.:0.600   
##  6.0    :2       Median :0.2000   120.0  : 1       Median :3.200   
##  1.0    :1       Mean   :0.3529   180.0  : 1       Mean   :3.394   
##  10.0   :1       3rd Qu.:0.6000   2.0    : 1       3rd Qu.:5.200   
##  15.0   :1       Max.   :1.1000   280.0  : 1       Max.   :8.600   
##  (Other):7                        (Other):11                       
##  hallucinogen.frequency  inhalant.use   inhalant.frequency pain.releiver.use
##  Min.   : 2.000         Min.   :0.000   4.0    :5          Min.   : 0.600   
##  1st Qu.: 3.000         1st Qu.:0.600   2.0    :2          1st Qu.: 3.900   
##  Median : 3.000         Median :1.400   3.0    :2          Median : 6.200   
##  Mean   : 8.412         Mean   :1.388   -      :1          Mean   : 6.271   
##  3rd Qu.: 4.000         3rd Qu.:2.000   10.0   :1          3rd Qu.: 9.000   
##  Max.   :52.000         Max.   :3.000   12.0   :1          Max.   :10.000   
##                                         (Other):5                           
##  pain.releiver.frequency oxycontin.use    oxycontin.frequency tranquilizer.use
##  Min.   : 7.00           Min.   :0.0000   12.0   :2           Min.   :0.200   
##  1st Qu.:12.00           1st Qu.:0.4000   13.5   :2           1st Qu.:1.400   
##  Median :12.00           Median :1.1000   -      :1           Median :3.500   
##  Mean   :14.71           Mean   :0.9353   17.5   :1           Mean   :2.806   
##  3rd Qu.:15.00           3rd Qu.:1.4000   20.0   :1           3rd Qu.:4.200   
##  Max.   :36.00           Max.   :1.7000   24.5   :1           Max.   :5.400   
##                                           (Other):9                           
##  tranquilizer.frequency stimulant.use   stimulant.frequency    meth.use     
##  Min.   : 4.50          Min.   :0.000   Min.   :  2.00      Min.   :0.0000  
##  1st Qu.: 6.00          1st Qu.:0.600   1st Qu.:  7.00      1st Qu.:0.2000  
##  Median :10.00          Median :1.800   Median : 10.00      Median :0.4000  
##  Mean   :11.74          Mean   :1.918   Mean   : 31.15      Mean   :0.3824  
##  3rd Qu.:11.00          3rd Qu.:3.000   3rd Qu.: 12.00      3rd Qu.:0.6000  
##  Max.   :52.00          Max.   :4.100   Max.   :364.00      Max.   :0.9000  
##                                                                             
##  meth.frequency  sedative.use    sedative.frequency
##  -      :2      Min.   :0.0000   Min.   :  3.00    
##  12.0   :2      1st Qu.:0.2000   1st Qu.:  6.50    
##  30.0   :2      Median :0.3000   Median : 10.00    
##  10.5   :1      Mean   :0.2824   Mean   : 19.38    
##  104.0  :1      3rd Qu.:0.4000   3rd Qu.: 17.50    
##  105.0  :1      Max.   :0.5000   Max.   :104.00    
##  (Other):8
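The summary above lists several frequency columns (for example `cocaine.frequency` and `crack.frequency`) as categories rather than numbers, apparently because missing values are written as `-` in the file. One possible way around this, sketched here on a three-row extract copied from the frequency table shown later, is to tell `read.csv` to treat `-` as missing. The names `csv_text` and `du_demo` are only for illustration.

```r
# Three rows in the same shape as the real file (values taken from the
# frequency table printed later in this report)
csv_text <- "age,cocaine.frequency,crack.frequency
12,5.0,-
50-64,36.0,62.0
65+,-,-"

# na.strings = "-" turns a field that is exactly "-" into NA,
# so the frequency columns are parsed as numeric
du_demo <- read.csv(text = csv_text, na.strings = "-",
                    stringsAsFactors = FALSE)

str(du_demo)
mean(du_demo$cocaine.frequency, na.rm = TRUE)
```

Only fields that consist of `-` alone are affected, so age labels such as `50-64` stay intact.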
#Get total number of participants in the study
TotalPeople <- sum(DU$n)

#create subset of data I want to examine
#convert percentages to actual number of reported users
DU2 <- DU %>% select(n, ends_with('use'))

inc_cols <- DU2 %>% select(ends_with('use'))

DU3 <- ceiling((DU2$n * inc_cols)/100)

Final <- cbind(DU$age, DU2$n, DU3)

setnames(Final, old = c("DU$age", "DU2$n"), new = c("Age Groups", "Group Total"))
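To make the percentage-to-count conversion above concrete, here is a small worked example. The group size and percentages are illustrative values chosen from the ranges in the summary, not a specific row of the data.

```r
# Illustrative inputs (assumed, not a particular row of the dataset)
group_size <- 2798
pct_use    <- c(alcohol.use = 3.9, marijuana.use = 1.1)

# Same formula as above: percentage of the group, rounded up to whole people
users <- ceiling(group_size * pct_use / 100)
users
```

Because `ceiling` always rounds up, any nonzero percentage yields at least one user; `round` would be the alternative if the nearest whole number were preferred.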


#condense the several individual drug columns to more generalized categorical columns in a new data frame
df <- mutate(Final,
             hard_drugs = cocaine.use + crack.use + heroin.use + meth.use,
             soft_drugs = marijuana.use + hallucinogen.use,
             pharma_pain_drugs = pain.releiver.use + oxycontin.use,
             other_pharma = sedative.use + tranquilizer.use + stimulant.use)

#select subset of condensed data frame
new_du <- data.frame(df$`Age Groups`, df$`Group Total`, df$alcohol.use, df$hard_drugs, df$soft_drugs, df$pharma_pain_drugs, df$other_pharma, df$inhalant.use)

##Rename columns
setnames(new_du, old = c("df..Age.Groups.", "df..Group.Total.", "df.alcohol.use", "df.hard_drugs", "df.soft_drugs", "df.pharma_pain_drugs", "df.other_pharma", "df.inhalant.use"), new = c("Age Groups", "Group_Total", "alcohol_users", "hard.drug.users", "soft.drug.users", "pharma.pain.drug.users", "other.pharma.users", "inhalent.users" ))

##Summarise new data frame values
summary(new_du)
##    Age Groups  Group_Total   alcohol_users  hard.drug.users soft.drug.users 
##  12     : 1   Min.   :2223   Min.   : 110   Min.   :  0.0   Min.   :  33.0  
##  13     : 1   1st Qu.:2469   1st Qu.:1207   1st Qu.: 33.0   1st Qu.: 299.0  
##  14     : 1   Median :2798   Median :1498   Median :100.0   Median : 793.0  
##  15     : 1   Mean   :3251   Mean   :1905   Mean   :105.9   Mean   : 691.8  
##  16     : 1   3rd Qu.:3058   3rd Qu.:2220   3rd Qu.:155.0   3rd Qu.: 942.0  
##  17     : 1   Max.   :7391   Max.   :5544   Max.   :317.0   Max.   :1582.0  
##  (Other):11                                                                 
##  pharma.pain.drug.users other.pharma.users inhalent.users 
##  Min.   : 15.0          Min.   :  5.0      Min.   : 0.00  
##  1st Qu.:121.0          1st Qu.: 75.0      1st Qu.:23.00  
##  Median :243.0          Median :175.0      Median :37.00  
##  Mean   :230.7          Mean   :159.2      Mean   :41.35  
##  3rd Qu.:270.0          3rd Qu.:209.0      3rd Qu.:61.00  
##  Max.   :552.0          Max.   :388.0      Max.   :92.00  
## 
A_mean <- mean(new_du$alcohol_users)
Hard_mean <- mean(new_du$hard.drug.users)
Soft_mean <- mean(new_du$soft.drug.users)
Pharm_pain_mean <- mean(new_du$pharma.pain.drug.users)
other_pharm <- mean(new_du$other.pharma.users)
inhalent_mean <- mean(new_du$inhalent.users)

x1 <- c("Avg Alcohol Users", "Avg Hard Drug Users", "Avg Soft Drug Users", "Avg Pharma Pain Users", "Avg Other Pharma Users", "Inhalent Users" )

x <- c(A_mean, Hard_mean, Soft_mean, Pharm_pain_mean, other_pharm, inhalent_mean)

comparison <- data.frame("Column Names" = x1 ,"Averages" = x)

comparison
##             Column.Names   Averages
## 1      Avg Alcohol Users 1904.58824
## 2    Avg Hard Drug Users  105.94118
## 3    Avg Soft Drug Users  691.76471
## 4  Avg Pharma Pain Users  230.70588
## 5 Avg Other Pharma Users  159.17647
## 6         Inhalent Users   41.35294
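The same table of averages could probably be built in one step with `colMeans`, instead of one `mean` call per column. The sketch below uses a toy data frame (`demo`, with made-up rows) rather than `new_du`, so the numbers differ from the output above.

```r
# Toy stand-in for new_du with two of its numeric columns (made-up rows)
demo <- data.frame(
  alcohol_users   = c(110, 1498, 5544),
  hard.drug.users = c(0, 100, 317)
)

# Average of every column at once
averages <- colMeans(demo)

comparison_demo <- data.frame(
  Category = names(averages),
  Average  = unname(averages)
)
comparison_demo
```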
###Gather Frequency Data

Freq <- DU %>% select(age, ends_with('frequency'))
Freq
##      age alcohol.frequency marijuana.frequency cocaine.frequency
## 1     12                 3                   4               5.0
## 2     13                 6                  15               1.0
## 3     14                 5                  24               5.5
## 4     15                 6                  25               4.0
## 5     16                10                  30               7.0
## 6     17                13                  36               5.0
## 7     18                24                  52               5.0
## 8     19                36                  60               5.5
## 9     20                48                  60               8.0
## 10    21                52                  52               5.0
## 11 22-23                52                  52               5.0
## 12 24-25                52                  60               6.0
## 13 26-29                52                  52               5.0
## 14 30-34                52                  72               8.0
## 15 35-49                52                  48              15.0
## 16 50-64                52                  52              36.0
## 17   65+                52                  36                 -
##    crack.frequency heroin.frequency hallucinogen.frequency inhalant.frequency
## 1                -             35.5                     52               19.0
## 2              3.0                -                      6               12.0
## 3                -              2.0                      3                5.0
## 4              9.5              1.0                      4                5.5
## 5              1.0             66.5                      3                3.0
## 6             21.0             64.0                      3                4.0
## 7             10.0             46.0                      4                4.0
## 8              2.0            180.0                      3                3.0
## 9              5.0             45.0                      2                4.0
## 10            17.0             30.0                      4                2.0
## 11             5.0             57.5                      3                4.0
## 12             6.0             88.0                      2                2.0
## 13             6.0             50.0                      3                4.0
## 14            15.0             66.0                      2                3.5
## 15            48.0            280.0                      3               10.0
## 16            62.0             41.0                     44               13.5
## 17               -            120.0                      2                  -
##    pain.releiver.frequency oxycontin.frequency tranquilizer.frequency
## 1                       36                24.5                   52.0
## 2                       14                41.0                   25.5
## 3                       12                 4.5                    5.0
## 4                       10                 3.0                    4.5
## 5                        7                 4.0                   11.0
## 6                        9                 6.0                    7.0
## 7                       12                 7.0                   12.0
## 8                       12                 7.5                    4.5
## 9                       10                12.0                   10.0
## 10                      15                13.5                    7.0
## 11                      15                17.5                   12.0
## 12                      15                20.0                   10.0
## 13                      13                13.5                   10.0
## 14                      22                46.0                    8.0
## 15                      12                12.0                    6.0
## 16                      12                 5.0                   10.0
## 17                      24                   -                    5.0
##    stimulant.frequency meth.frequency sedative.frequency
## 1                  2.0              -               13.0
## 2                  4.0            5.0               19.0
## 3                 12.0           24.0               16.5
## 4                  6.0           10.5               30.0
## 5                  9.5           36.0                3.0
## 6                  9.0           48.0                6.5
## 7                  8.0           12.0               10.0
## 8                  6.0          105.0                6.0
## 9                 12.0           12.0                4.0
## 10                10.0            2.0                9.0
## 11                10.0           46.0               52.0
## 12                10.0           21.0               17.5
## 13                 7.0           30.0                4.0
## 14                12.0           54.0               10.0
## 15                24.0          104.0               10.0
## 16                24.0           30.0              104.0
## 17               364.0              -               15.0

Including Graphics

new_du1 <- new_du[ , c(3:8)]
data2 <- data.frame(new_du1)
#data2$Age.Groups <- as.numeric(as.character(data2$Age.Groups))

barplot(as.matrix(data2), main="Total Users By Drug/Intoxicant",ylab="Users", beside=TRUE,col=rainbow (17), cex.names = 0.5)
legend ("topright",c("12yr","13yr","14yr","15yr","16yr", "17yr", "18yr", "19yr", "20yr", "21yr", "22-23yr","24-25yr", "26-29yr", "30-34yr", "35-49yr", "50-64yr", "65yr+" ),cex=.75,bty="n",fill=rainbow (17))

hist(Freq$alcohol.frequency)

hist(Freq$marijuana.frequency)

plot(new_du$alcohol_users ~ new_du$`Age Groups`)

plot(Freq$alcohol.frequency ~ Freq$age)

boxplot(Freq$alcohol.frequency, Freq$marijuana.frequency, Freq$cocaine.frequency, Freq$crack.frequency, Freq$heroin.frequency, Freq$pain.releiver.frequency, Freq$oxycontin.frequency, main="Median number of times a user in age groups used drug/alcohol", xlab="Drugs", names=c("alcohol", "marijuana", "cocaine", "crack", "heroin", "pain", "oxycontin"))
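One caveat for the frequency boxplot: if columns such as `cocaine.frequency` were read as factors (as the summary of the original data suggests), numeric conversions of them can silently use the internal level codes instead of the recorded values. The sketch below shows the difference on a small factor; `freq_factor` is a hypothetical stand-in for one of those columns.

```r
# Stand-in for a frequency column that was read as a factor
freq_factor <- factor(c("5.0", "36.0", "-"))

# Direct conversion returns the internal level codes, not the values
codes <- as.numeric(freq_factor)

# Going through character first recovers the real numbers;
# "-" is replaced by NA before converting
chr <- as.character(freq_factor)
chr[chr == "-"] <- NA
values <- as.numeric(chr)

codes
values
```

Converting the affected columns this way before plotting should make the boxes reflect the actual frequencies.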

boxplot(new_du1$alcohol_users, new_du1$hard.drug.users, new_du1$soft.drug.users, new_du1$pharma.pain.drug.users, new_du1$other.pharma.users, new_du1$inhalent.users, main="Boxplot of Users for condensed drug categories", xlab="Drugs", names = c("alcohol", "hard drugs", "soft drugs", "pharma pain", "other pharma", "inhalants"))

The data show that alcohol is by far the most widely used intoxicant in all age groups. However, soft drugs such as marijuana are almost as popular as alcohol among the younger age groups. Another interesting finding is that the median frequency of marijuana use is higher than that of alcohol in the younger age groups and comparable to or higher than it in most older groups. Harder drugs like cocaine, while not as popular among the age groups, are used as frequently as or more frequently than marijuana and alcohol. The outliers are heroin and meth with regard to frequency of use. For the 19-year-old age group, meth and heroin had very high frequencies of use (105 and 180, respectively), attesting to their highly addictive nature.
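The marijuana versus alcohol frequency comparison can be checked directly against the frequency table printed earlier. The vectors below are transcribed from the `alcohol.frequency` and `marijuana.frequency` columns of that output; the variable names are only for this sketch.

```r
age <- c("12", "13", "14", "15", "16", "17", "18", "19", "20", "21",
         "22-23", "24-25", "26-29", "30-34", "35-49", "50-64", "65+")

# Median frequency of use per age group, copied from the Freq output
alcohol   <- c(3, 6, 5, 6, 10, 13, 24, 36, 48, 52, 52, 52, 52, 52, 52, 52, 52)
marijuana <- c(4, 15, 24, 25, 30, 36, 52, 60, 60, 52, 52, 60, 52, 72, 48, 52, 36)

higher <- age[marijuana > alcohol]   # marijuana used more often
equal  <- age[marijuana == alcohol]  # same median frequency
lower  <- age[marijuana < alcohol]   # alcohol used more often

higher
equal
lower
```

Marijuana frequency exceeds alcohol frequency in every group from 12 through 20, and only the 35-49 and 65+ groups show the reverse.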

Overall, the data show that while alcohol is popular, the soft drugs can truly be seen as “gateway” drugs. They are used among the early age groups and have a high frequency of use, more so than alcohol. Alcohol is probably an environmental, social activity, while marijuana or hallucinogen use is habitual.