The Drug Use by Age dataset comes from FiveThirtyEight's GitHub repository: https://github.com/fivethirtyeight/data/tree/master/drug-use-by-age
The dataset covers 17 age groups and 13 drugs. For each drug it reports the percentage of the age group that used it (the `.use` columns) and how often users used it (the `.frequency` columns).
The purpose of my project is to see which age groups use illegal soft drugs, hard drugs, legal pharmaceuticals, or alcohol. I also want to see whether there is an overall trend, for example whether younger age groups start with one group of drugs and gravitate to another with age.
I grouped hallucinogens and marijuana as “soft drugs” and the more traditional hard drugs (cocaine, crack, heroin, and meth) as “hard drugs”. The pharmaceutical drugs are split into two categories, since pain relievers and OxyContin alleviate pain while the other pharmaceuticals (tranquilizers, stimulants, and sedatives) serve different purposes. I left inhalants and alcohol in their own categories: alcohol is its own intoxicant, and I did not know how to categorize inhalants appropriately.
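The grouping described above can also be written down as a lookup, which keeps the category definitions in one place. This is only a minimal sketch, not the code used in this report: the names `drug_groups`, `sum_group`, and `toy` are hypothetical, and the counts in `toy` are made up for illustration.

```r
# Hypothetical lookup: category name -> the original "*.use" columns it combines
drug_groups <- list(
  hard_drugs        = c("cocaine.use", "crack.use", "heroin.use", "meth.use"),
  soft_drugs        = c("marijuana.use", "hallucinogen.use"),
  pharma_pain_drugs = c("pain.releiver.use", "oxycontin.use"),
  other_pharma      = c("sedative.use", "tranquilizer.use", "stimulant.use")
)

# Hypothetical helper: row-wise sum of the columns belonging to one category
sum_group <- function(data, cols) rowSums(data[, cols, drop = FALSE])

# Made-up user counts for two age groups (illustration only)
toy <- data.frame(
  cocaine.use = c(3, 10), crack.use = c(0, 2),
  heroin.use = c(1, 4), meth.use = c(0, 5),
  marijuana.use = c(30, 200), hallucinogen.use = c(5, 40),
  pain.releiver.use = c(20, 90), oxycontin.use = c(2, 15),
  sedative.use = c(1, 3), tranquilizer.use = c(4, 30), stimulant.use = c(2, 25)
)

# One column per category, one row per age group
condensed <- as.data.frame(lapply(drug_groups, sum_group, data = toy))
condensed
```

Changing a category then only requires editing the lookup rather than a long arithmetic expression.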
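#Load the packages used below: dplyr (select, mutate, %>%) and data.table (setnames)
library(dplyr)
library(data.table)
#Retrieve Original Data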
DrugUse <- "https://raw.github.com/geeman1209/MSDATA2020/master/Winter Bridge - R/Data/drug-use-by-age.csv"
DU <- read.csv(DrugUse)
#Print summary of original data
summary(DU)
## age n alcohol.use alcohol.frequency marijuana.use
## 12 : 1 Min. :2223 Min. : 3.90 Min. : 3.00 Min. : 1.10
## 13 : 1 1st Qu.:2469 1st Qu.:40.10 1st Qu.:10.00 1st Qu.: 8.70
## 14 : 1 Median :2798 Median :64.60 Median :48.00 Median :20.80
## 15 : 1 Mean :3251 Mean :55.43 Mean :33.35 Mean :18.92
## 16 : 1 3rd Qu.:3058 3rd Qu.:77.50 3rd Qu.:52.00 3rd Qu.:28.40
## 17 : 1 Max. :7391 Max. :84.20 Max. :52.00 Max. :34.00
## (Other):11
## marijuana.frequency cocaine.use cocaine.frequency crack.use
## Min. : 4.00 Min. :0.000 5.0 :6 Min. :0.0000
## 1st Qu.:30.00 1st Qu.:0.500 5.5 :2 1st Qu.:0.0000
## Median :52.00 Median :2.000 8.0 :2 Median :0.4000
## Mean :42.94 Mean :2.176 - :1 Mean :0.2941
## 3rd Qu.:52.00 3rd Qu.:4.000 1.0 :1 3rd Qu.:0.5000
## Max. :72.00 Max. :4.900 15.0 :1 Max. :0.6000
## (Other):4
## crack.frequency heroin.use heroin.frequency hallucinogen.use
## - :3 Min. :0.0000 - : 1 Min. :0.100
## 5.0 :2 1st Qu.:0.1000 1.0 : 1 1st Qu.:0.600
## 6.0 :2 Median :0.2000 120.0 : 1 Median :3.200
## 1.0 :1 Mean :0.3529 180.0 : 1 Mean :3.394
## 10.0 :1 3rd Qu.:0.6000 2.0 : 1 3rd Qu.:5.200
## 15.0 :1 Max. :1.1000 280.0 : 1 Max. :8.600
## (Other):7 (Other):11
## hallucinogen.frequency inhalant.use inhalant.frequency pain.releiver.use
## Min. : 2.000 Min. :0.000 4.0 :5 Min. : 0.600
## 1st Qu.: 3.000 1st Qu.:0.600 2.0 :2 1st Qu.: 3.900
## Median : 3.000 Median :1.400 3.0 :2 Median : 6.200
## Mean : 8.412 Mean :1.388 - :1 Mean : 6.271
## 3rd Qu.: 4.000 3rd Qu.:2.000 10.0 :1 3rd Qu.: 9.000
## Max. :52.000 Max. :3.000 12.0 :1 Max. :10.000
## (Other):5
## pain.releiver.frequency oxycontin.use oxycontin.frequency tranquilizer.use
## Min. : 7.00 Min. :0.0000 12.0 :2 Min. :0.200
## 1st Qu.:12.00 1st Qu.:0.4000 13.5 :2 1st Qu.:1.400
## Median :12.00 Median :1.1000 - :1 Median :3.500
## Mean :14.71 Mean :0.9353 17.5 :1 Mean :2.806
## 3rd Qu.:15.00 3rd Qu.:1.4000 20.0 :1 3rd Qu.:4.200
## Max. :36.00 Max. :1.7000 24.5 :1 Max. :5.400
## (Other):9
## tranquilizer.frequency stimulant.use stimulant.frequency meth.use
## Min. : 4.50 Min. :0.000 Min. : 2.00 Min. :0.0000
## 1st Qu.: 6.00 1st Qu.:0.600 1st Qu.: 7.00 1st Qu.:0.2000
## Median :10.00 Median :1.800 Median : 10.00 Median :0.4000
## Mean :11.74 Mean :1.918 Mean : 31.15 Mean :0.3824
## 3rd Qu.:11.00 3rd Qu.:3.000 3rd Qu.: 12.00 3rd Qu.:0.6000
## Max. :52.00 Max. :4.100 Max. :364.00 Max. :0.9000
##
## meth.frequency sedative.use sedative.frequency
## - :2 Min. :0.0000 Min. : 3.00
## 12.0 :2 1st Qu.:0.2000 1st Qu.: 6.50
## 30.0 :2 Median :0.3000 Median : 10.00
## 10.5 :1 Mean :0.2824 Mean : 19.38
## 104.0 :1 3rd Qu.:0.4000 3rd Qu.: 17.50
## 105.0 :1 Max. :0.5000 Max. :104.00
## (Other):8
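Several frequency columns in the summary above (for example `cocaine.frequency` and `heroin.frequency`) are listed as categories rather than numbers because they contain a "-" placeholder. The following is a minimal sketch of one way to make such a column numeric, assuming "-" marks a value that was not reported; the helper name `to_numeric` is hypothetical and the vector is a toy example.

```r
# Hypothetical helper: turn a factor/character column containing "-" into numeric
to_numeric <- function(x) {
  x <- as.character(x)   # go through character so factor codes are not used
  x[x == "-"] <- NA      # assumption: "-" means no value was reported
  as.numeric(x)
}

# Toy column mimicking how such a column may be stored after read.csv
raw <- factor(c("5.0", "1.0", "-", "36.0"))
clean <- to_numeric(raw)
clean                        # 5 1 NA 36
median(clean, na.rm = TRUE)  # 5
```

Passing `na.strings = "-"` to `read.csv` should have a similar effect at load time.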
#Get total number of participants in the study
TotalPeople <- sum(DU$n)
#create subset of data I want to examine
#convert percentages to actual number of reported users
DU2 <- DU %>% select(n, ends_with('use'))
inc_cols <- DU2 %>% select(ends_with('use'))
DU3 <- ceiling((DU2$n * inc_cols)/100)
Final <- cbind(DU$age, DU2$n, DU3)
setnames(Final, old = c("DU$age", "DU2$n"), new = c("Age Groups", "Group Total"))
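The conversion above multiplies each group's sample size by the reported percentage and rounds up. A small worked sketch of that arithmetic follows; the function name `pct_to_count` is hypothetical and the inputs are illustrative values, not rows quoted from the data.

```r
# Hypothetical helper: percentage of a group of size n -> number of users, rounded up
pct_to_count <- function(n, pct) ceiling(n * pct / 100)

# 3.9% of 2500 people is 97.5, rounded up to 98; 0.5% of 3000 is exactly 15
pct_to_count(n = c(2500, 3000), pct = c(3.9, 0.5))
```

Rounding up never undercounts, but it can overstate a cell by up to one person; `round()` would be an alternative if that matters.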
#condense the several individual drug columns to more generalized categorical columns in a new data frame
df <- mutate(Final, hard_drugs = cocaine.use + crack.use + heroin.use + meth.use, soft_drugs = marijuana.use + hallucinogen.use, pharma_pain_drugs = pain.releiver.use + oxycontin.use, other_pharma = sedative.use + tranquilizer.use + stimulant.use)
#select subset of condensed data frame
new_du <- data.frame(df$`Age Groups`, df$`Group Total`, df$alcohol.use, df$hard_drugs, df$soft_drugs, df$pharma_pain_drugs, df$other_pharma, df$inhalant.use)
##Rename columns
setnames(new_du, old = c("df..Age.Groups.", "df..Group.Total.", "df.alcohol.use", "df.hard_drugs", "df.soft_drugs", "df.pharma_pain_drugs", "df.other_pharma", "df.inhalant.use"), new = c("Age Groups", "Group_Total", "alcohol_users", "hard.drug.users", "soft.drug.users", "pharma.pain.drug.users", "other.pharma.users", "inhalent.users" ))
##Summarise new data frame values
summary(new_du)
## Age Groups Group_Total alcohol_users hard.drug.users soft.drug.users
## 12 : 1 Min. :2223 Min. : 110 Min. : 0.0 Min. : 33.0
## 13 : 1 1st Qu.:2469 1st Qu.:1207 1st Qu.: 33.0 1st Qu.: 299.0
## 14 : 1 Median :2798 Median :1498 Median :100.0 Median : 793.0
## 15 : 1 Mean :3251 Mean :1905 Mean :105.9 Mean : 691.8
## 16 : 1 3rd Qu.:3058 3rd Qu.:2220 3rd Qu.:155.0 3rd Qu.: 942.0
## 17 : 1 Max. :7391 Max. :5544 Max. :317.0 Max. :1582.0
## (Other):11
## pharma.pain.drug.users other.pharma.users inhalent.users
## Min. : 15.0 Min. : 5.0 Min. : 0.00
## 1st Qu.:121.0 1st Qu.: 75.0 1st Qu.:23.00
## Median :243.0 Median :175.0 Median :37.00
## Mean :230.7 Mean :159.2 Mean :41.35
## 3rd Qu.:270.0 3rd Qu.:209.0 3rd Qu.:61.00
## Max. :552.0 Max. :388.0 Max. :92.00
##
A_mean <- mean(new_du$alcohol_users)
Hard_mean <- mean(new_du$hard.drug.users)
Soft_mean <- mean(new_du$soft.drug.users)
Pharm_pain_mean <- mean(new_du$pharma.pain.drug.users)
other_pharm <- mean(new_du$other.pharma.users)
inhalent_mean <- mean(new_du$inhalent.users)
x1 <- c("Avg Alcohol Users", "Avg Hard Drug Users", "Avg Soft Drug Users", "Avg Pharma Pain Users", "Avg Other Pharma Users", "Inhalent Users" )
x <- c(A_mean, Hard_mean, Soft_mean, Pharm_pain_mean, other_pharm, inhalent_mean)
comparison <- data.frame("Column Names" = x1 ,"Averages" = x)
comparison
## Column.Names Averages
## 1 Avg Alcohol Users 1904.58824
## 2 Avg Hard Drug Users 105.94118
## 3 Avg Soft Drug Users 691.76471
## 4 Avg Pharma Pain Users 230.70588
## 5 Avg Other Pharma Users 159.17647
## 6 Inhalent Users 41.35294
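The six averages above can also be computed in one step with `colMeans`. This is a minimal sketch on a toy data frame: the values are made up, and `toy_users` is a hypothetical stand-in for the user-count columns of `new_du`.

```r
# Made-up user counts for three age groups (illustration only)
toy_users <- data.frame(
  alcohol_users   = c(100, 300, 500),
  hard.drug.users = c(0, 10, 20),
  soft.drug.users = c(30, 60, 90)
)

# Column-wise means, reshaped into a two-column comparison table
avgs <- colMeans(toy_users)
comparison_toy <- data.frame(Category = names(avgs), Average = unname(avgs))
comparison_toy
```

Because the age groups differ in size, averages of raw counts give more weight to the larger groups.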
###Gather Frequency Data
Freq <- DU %>% select(age, ends_with('frequency'))
Freq
## age alcohol.frequency marijuana.frequency cocaine.frequency
## 1 12 3 4 5.0
## 2 13 6 15 1.0
## 3 14 5 24 5.5
## 4 15 6 25 4.0
## 5 16 10 30 7.0
## 6 17 13 36 5.0
## 7 18 24 52 5.0
## 8 19 36 60 5.5
## 9 20 48 60 8.0
## 10 21 52 52 5.0
## 11 22-23 52 52 5.0
## 12 24-25 52 60 6.0
## 13 26-29 52 52 5.0
## 14 30-34 52 72 8.0
## 15 35-49 52 48 15.0
## 16 50-64 52 52 36.0
## 17 65+ 52 36 -
## crack.frequency heroin.frequency hallucinogen.frequency inhalant.frequency
## 1 - 35.5 52 19.0
## 2 3.0 - 6 12.0
## 3 - 2.0 3 5.0
## 4 9.5 1.0 4 5.5
## 5 1.0 66.5 3 3.0
## 6 21.0 64.0 3 4.0
## 7 10.0 46.0 4 4.0
## 8 2.0 180.0 3 3.0
## 9 5.0 45.0 2 4.0
## 10 17.0 30.0 4 2.0
## 11 5.0 57.5 3 4.0
## 12 6.0 88.0 2 2.0
## 13 6.0 50.0 3 4.0
## 14 15.0 66.0 2 3.5
## 15 48.0 280.0 3 10.0
## 16 62.0 41.0 44 13.5
## 17 - 120.0 2 -
## pain.releiver.frequency oxycontin.frequency tranquilizer.frequency
## 1 36 24.5 52.0
## 2 14 41.0 25.5
## 3 12 4.5 5.0
## 4 10 3.0 4.5
## 5 7 4.0 11.0
## 6 9 6.0 7.0
## 7 12 7.0 12.0
## 8 12 7.5 4.5
## 9 10 12.0 10.0
## 10 15 13.5 7.0
## 11 15 17.5 12.0
## 12 15 20.0 10.0
## 13 13 13.5 10.0
## 14 22 46.0 8.0
## 15 12 12.0 6.0
## 16 12 5.0 10.0
## 17 24 - 5.0
## stimulant.frequency meth.frequency sedative.frequency
## 1 2.0 - 13.0
## 2 4.0 5.0 19.0
## 3 12.0 24.0 16.5
## 4 6.0 10.5 30.0
## 5 9.5 36.0 3.0
## 6 9.0 48.0 6.5
## 7 8.0 12.0 10.0
## 8 6.0 105.0 6.0
## 9 12.0 12.0 4.0
## 10 10.0 2.0 9.0
## 11 10.0 46.0 52.0
## 12 10.0 21.0 17.5
## 13 7.0 30.0 4.0
## 14 12.0 54.0 10.0
## 15 24.0 104.0 10.0
## 16 24.0 30.0 104.0
## 17 364.0 - 15.0
new_du1 <- new_du[ , c(3:8)]
data2 <- data.frame(new_du1)
#data2$Age.Groups <- as.numeric(as.character(data2$Age.Groups))
barplot(as.matrix(data2), main="Total Users By Drug/Intoxicant",ylab="Users", beside=TRUE,col=rainbow (17), cex.names = 0.5)
legend ("topright",c("12yr","13yr","14yr","15yr","16yr", "17yr", "18yr", "19yr", "20yr", "21yr", "22-23yr","24-25yr", "26-29yr", "30-34yr", "35-49yr", "50-64yr", "65yr+" ),cex=.75,bty="n",fill=rainbow (17))
hist(Freq$alcohol.frequency)
hist(Freq$marijuana.frequency)
plot(new_du$alcohol_users ~ new_du$`Age Groups`)
plot(Freq$alcohol.frequency ~ Freq$age)
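#Frequency columns containing "-" are not numeric; convert them ("-" becomes NA) before plotting
num <- function(x) suppressWarnings(as.numeric(as.character(x)))
boxplot(Freq$alcohol.frequency, Freq$marijuana.frequency, num(Freq$cocaine.frequency), num(Freq$crack.frequency), num(Freq$heroin.frequency), Freq$pain.releiver.frequency, num(Freq$oxycontin.frequency), main="Median number of times a user in age groups used drug/alcohol", xlab="Drugs", names=c("alcohol", "marijuana", "cocaine", "crack", "heroin", "pain", "oxycontin"))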
boxplot(new_du1$alcohol_users, new_du1$hard.drug.users, new_du1$soft.drug.users, new_du1$pharma.pain.drug.users, new_du1$other.pharma.users, new_du1$inhalent.users, main="Boxplot of Users for condensed drug categories", xlab="Drugs", names = c("alcohol", "hard drugs", "soft drugs", "pharma pain", "other pharma", "inhalants"))
The data show that alcohol is by far the most widely used intoxicant in all age groups. However, marijuana and the other “soft drugs” are almost as popular as alcohol among the younger age groups. Another interesting finding is that marijuana's frequency of use equals or exceeds alcohol's in every age group through 30-34; only in the 35-49 and 65+ groups is alcohol used more often. Harder drugs are far less popular, but those who use them can use them very often: cocaine frequency stays low (8 or less through age 34), while heroin frequency exceeds that of both marijuana and alcohol in several age groups. Heroin and meth are the outliers in frequency of use. In the 19-year-old group, for example, heroin and meth had very high frequencies of use (180 and 105, respectively), which is consistent with their highly addictive nature.
Overall, the data suggest that while alcohol is popular, the soft drugs can be seen as “gateway” drugs: they are used in the early age groups and have a high frequency of use, more so than alcohol. Alcohol use is probably a social activity shaped by environment, while marijuana or hallucinogen use may be more habitual.
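The frequency comparison between marijuana and alcohol can be checked directly against the `Freq` table printed earlier. This is a minimal sketch that re-enters those two columns by hand (values copied from that output); the object names `age`, `alcohol`, and `marijuana` are hypothetical.

```r
# Age groups and frequencies copied from the Freq output
age <- c("12", "13", "14", "15", "16", "17", "18", "19", "20", "21",
         "22-23", "24-25", "26-29", "30-34", "35-49", "50-64", "65+")
alcohol   <- c(3, 6, 5, 6, 10, 13, 24, 36, 48, 52, 52, 52, 52, 52, 52, 52, 52)
marijuana <- c(4, 15, 24, 25, 30, 36, 52, 60, 60, 52, 52, 60, 52, 72, 48, 52, 36)

# Age groups where marijuana is used less often than alcohol
age[marijuana < alcohol]   # "35-49" "65+"
```

In every other age group the marijuana frequency is equal to or higher than the alcohol frequency.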