This project analyzes the Drug Consumption (Quantified) dataset to explore relationships between personality traits, age, and cannabis use. The dataset includes 1,885 participants with demographic variables, Big Five personality scores (Neuroticism, Extraversion, Openness, Agreeableness, Conscientiousness), impulsiveness, sensation seeking, and drug usage frequency levels across 18 substances.
A preliminary look at the data shows:
Age: Encoded scores range from -0.95 (the 18–24 band) to 2.59 (65+), with most participants falling near the mean (mean = 0.03, median ≈ 0).
Gender: Fairly balanced with numeric coding centered around 0.
Education, Country, Ethnicity: Categorical variables encoded as standardized numeric scores, allowing comparisons across groups.
Personality and behavioral traits: Neuroticism, Extraversion, Openness, Agreeableness, Conscientiousness, Impulsiveness, and Sensation Seeking are all standardized scores, centered near zero with moderate variation. For example, Openness ranges from -3.27 to 2.90, while Impulsiveness ranges from -2.55 to 2.90, indicating substantial individual differences.
Drug use: Most drug variables are ordinal categories, encoded from “CL0” (never used) to “CL6” (used in the last day). Cannabis, alcohol, and caffeine show the widest variability, while illicit substances like heroin and crack show almost no use in the sample.
These descriptive statistics suggest that the dataset contains meaningful variability for exploring:
- How age and personality traits relate to cannabis use and risk-taking behaviors.
- Differences in cannabis consumption patterns across demographic groups.
- How impulsiveness and sensation seeking interact with substance use severity.
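The ranges, means, and level counts quoted above could be reproduced roughly as follows. This is a minimal sketch in base R, assuming columns named Oscore, Impulsive, and Cannabis as in the analysis code; the data frame toy is a made-up stand-in for the real data, so the printed numbers are not the study's values.

```r
# Toy stand-in for the real data frame; values are simulated for illustration only
set.seed(1)
toy <- data.frame(
  Oscore    = rnorm(50),
  Impulsive = rnorm(50),
  Cannabis  = sample(paste0("CL", 0:6), 50, replace = TRUE),
  stringsAsFactors = FALSE
)

# Range, mean, and median for each numeric trait column (one row per trait)
numeric_cols <- c("Oscore", "Impulsive")
trait_summary <- t(sapply(toy[numeric_cols], function(x) {
  c(min = min(x), max = max(x), mean = mean(x), median = median(x))
}))
print(round(trait_summary, 2))

# Count of respondents at each usage level, keeping empty levels visible
cannabis_counts <- table(factor(toy$Cannabis, levels = paste0("CL", 0:6)))
print(cannabis_counts)
```

Fixing the factor levels before tabulating keeps rarely used levels in the output with a count of zero, which matters for substances such as heroin where most respondents sit at CL0.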
For this project, the main focus was on cannabis use patterns and how they relate to impulsiveness and personality traits.
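# Load plotting and data-wrangling packages
library(ggplot2)
library(dplyr)
library(scales)
library(tidyr)

# Set the working directory to the folder that holds the data file
setwd("C:\\Users\\jason\\Desktop\\IS470 sub directory")

# Read the comma-separated file, keeping text columns as character
drug_data <- read.table("drug_consumption_with_headers.txt",
                        header = TRUE,
                        sep = ",",
                        stringsAsFactors = FALSE)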
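# Convert encoded variables
drug_data <- drug_data %>%
  mutate(
    # near() avoids relying on exact floating-point equality for the encoded ages
    age_group = case_when(
      near(Age, -0.95197, tol = 1e-4) ~ "18-24",
      near(Age, -0.07854, tol = 1e-4) ~ "25-34",
      near(Age,  0.49788, tol = 1e-4) ~ "35-44",
      near(Age,  1.09449, tol = 1e-4) ~ "45-54",
      near(Age,  1.82213, tol = 1e-4) ~ "55-64",
      near(Age,  2.59171, tol = 1e-4) ~ "65+",
      TRUE ~ "Unknown"
    ),
    # Ordered usage labels, from never used (CL0) to used in the last day (CL6)
    cannabis_frequency = factor(
      Cannabis,
      levels = c("CL0","CL1","CL2","CL3","CL4","CL5","CL6"),
      labels = c("Never","10yr+","Decade","Year","Month","Week","Day")
    ),
    # "User" means use within the last year (CL3 or higher)
    cannabis_user_binary = ifelse(
      Cannabis %in% c("CL3","CL4","CL5","CL6"),
      "User", "Non-User"
    )
  )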
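# Map usage levels to a 0-6 numeric scale via a named lookup vector
severity_scale <- c("CL0"=0,"CL1"=1,"CL2"=2,"CL3"=3,"CL4"=4,"CL5"=5,"CL6"=6)
drug_data <- drug_data %>%
  mutate(
    # unname() drops the "CLx" names carried over from the lookup vector
    cannabis_severity_score = unname(severity_scale[Cannabis]),
    alcohol_severity_score = unname(severity_scale[Alcohol]),
    cocaine_severity_score = unname(severity_scale[Coke]),
    heroin_severity_score = unname(severity_scale[Heroin])
  )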
The chart shows how cannabis usage varies across age groups:
18–24 group: Highest proportion reporting daily use. This suggests younger adults are more likely to be current, frequent users, possibly due to changing social norms or cannabis's legal status.
25–44 group: Shows a balanced mix of past and current use. These individuals are more likely to use cannabis occasionally or weekly, reflecting an age where experimentation may have evolved into regular use for recreational or stress-relief purposes.
55+ group: Very few report daily or weekly use, indicating that older adults may have experimented in the past but significantly reduced usage.
This pattern suggests generational differences in cannabis adoption and emphasizes the importance of age when studying substance use trends.
ggplot(drug_data, aes(x = age_group, fill = cannabis_frequency)) +
  geom_bar(position = "fill") +  # scale each age group's bar to 100%
  scale_y_continuous(labels = percent) +
  scale_fill_viridis_d(option = "viridis") +
  labs(
    title = "Cannabis Use Frequency by Age Group",
    x = "Age Group",
    y = "Percentage"
  ) +
  theme_minimal()
Figure: Cannabis usage distribution across age groups
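The within-group shares that the stacked chart displays can also be tabulated directly, which makes statements such as "highest proportion reporting daily use" checkable. The following is a minimal sketch in base R; toy is a made-up stand-in that only assumes the age_group and cannabis_frequency columns created earlier.

```r
# Toy stand-in for the recoded columns; values are made up for illustration
toy <- data.frame(
  age_group = c("18-24", "18-24", "18-24", "25-34", "25-34",
                "55-64", "55-64", "55-64"),
  cannabis_frequency = factor(
    c("Day", "Week", "Never", "Year", "Never", "Never", "10yr+", "Never"),
    levels = c("Never", "10yr+", "Decade", "Year", "Month", "Week", "Day")
  )
)

# Cross-tabulate, then convert each row (age group) to proportions
counts <- table(toy$age_group, toy$cannabis_frequency)
row_props <- prop.table(counts, margin = 1)
print(round(row_props, 2))

# Share of each age group reporting use in the last day
daily_share <- row_props[, "Day"]
print(round(daily_share, 2))
```

Using margin = 1 normalizes within each age group, which is the same comparison the chart makes when bars are filled to 100%.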
This scatter plot demonstrates the relationship between Impulsiveness and Sensation Seeking across different age groups:
Both traits decline with age, with 18–24-year-olds showing the highest impulsiveness and sensation-seeking scores.
Younger adults appear more likely to engage in risk-taking behaviors, while older adults show more stability and caution.
The positive correlation suggests that individuals who are highly impulsive are also likely to seek new and exciting experiences, which may partly explain higher substance use in younger age groups.
The visual highlights age as a key factor influencing psychological traits relevant to behavioral health.
ggplot(drug_data, aes(x = Impulsive, y = SS, color = age_group)) +
  geom_jitter(alpha = 0.6) +  # jitter separates overlapping points
  geom_smooth(method = "lm", se = FALSE, color = "black") +  # linear trend line, no confidence band
  labs(
    title = "Impulsiveness vs Sensation Seeking",
    x = "Impulsiveness",
    y = "Sensation Seeking"
  ) +
  theme_minimal()
Figure: Impulsiveness vs Sensation Seeking
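The positive correlation and the age-related decline described above can be put into numbers. The sketch below is one possible way to do so in base R, assuming the Impulsive, SS, and age_group columns; the data are simulated with a built-in positive relationship, so the output illustrates the method and not the study's result.

```r
set.seed(42)
n <- 200

# Simulated, hypothetical data: a shared component induces a positive correlation
shared <- rnorm(n)
toy <- data.frame(
  age_group = sample(c("18-24", "25-34", "35-44"), n, replace = TRUE),
  Impulsive = shared + rnorm(n, sd = 0.8),
  SS        = shared + rnorm(n, sd = 0.8)
)

# Pearson correlation with its 95% confidence interval
ct <- cor.test(toy$Impulsive, toy$SS)
print(ct$estimate)
print(ct$conf.int)

# Mean of each trait within each age group
age_means <- aggregate(cbind(Impulsive, SS) ~ age_group, data = toy, FUN = mean)
print(age_means)
```

On the real data, a steady drop in both columns of age_means from the youngest to the oldest group would support the claim that both traits decline with age.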
This graph compares Big Five personality traits between cannabis users and non-users. It shows:
Openness: Slightly higher in users, suggesting a preference for novelty and new experiences.
Conscientiousness: Slightly lower in users, potentially reflecting a tendency toward less structured behavior.
Other traits (Neuroticism, Extraversion, Agreeableness) show minimal differences, indicating personality impacts are selective rather than global.
This aligns with research indicating that specific personality traits may predispose individuals to experiment with substances, while others are not strongly affected.
# Reshape the five trait columns into long format: one row per person and trait
personality_long_data <- drug_data %>%
  select(cannabis_user_binary, Nscore, Escore, Oscore, Ascore, Cscore) %>%
  pivot_longer(
    cols = Nscore:Cscore,
    names_to = "trait",
    values_to = "trait_score"
  )

# One boxplot panel per trait, comparing users with non-users
ggplot(personality_long_data,
       aes(x = cannabis_user_binary, y = trait_score, fill = cannabis_user_binary)) +
  geom_boxplot() +
  facet_wrap(~trait, scales = "free") +
  labs(
    title = "Personality Traits: Cannabis Users vs Non-Users",
    x = "",
    y = "Trait Score"
  ) +
  theme_minimal() +
  theme(legend.position = "none")
Figure: Personality trait comparison
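Words like "slightly higher" are easier to judge next to an effect size. The following sketch shows one way to compare a single trait between the two groups using a Welch t-test and Cohen's d; it assumes the Oscore and cannabis_user_binary columns, and the simulated 0.4 shift between groups is made up purely for illustration.

```r
set.seed(7)

# Simulated, hypothetical scores; the 0.4 shift between groups is made up
toy <- data.frame(
  cannabis_user_binary = rep(c("User", "Non-User"), each = 100),
  Oscore = c(rnorm(100, mean = 0.4), rnorm(100, mean = 0))
)

# Group means and standard deviations
group_means <- tapply(toy$Oscore, toy$cannabis_user_binary, mean)
group_sds   <- tapply(toy$Oscore, toy$cannabis_user_binary, sd)

# Welch two-sample t-test (does not assume equal variances)
welch <- t.test(Oscore ~ cannabis_user_binary, data = toy)

# Cohen's d with the pooled standard deviation (valid here because groups are equal in size)
pooled_sd <- sqrt(mean(group_sds^2))
cohens_d  <- unname((group_means["User"] - group_means["Non-User"]) / pooled_sd)

print(group_means)
print(welch$p.value)
print(cohens_d)
```

The real groups are unlikely to be equal in size, in which case the pooled standard deviation would need to be weighted by each group's degrees of freedom.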
This visualization shows how impulsiveness varies across cannabis use levels. It illustrates a clear trend:
Impulsiveness increases as cannabis use frequency increases, with daily users exhibiting the highest scores.
Lower usage levels (0–2) show little difference in impulsiveness, suggesting people who used long ago score similarly to those who never used.
The pattern may reflect both a predisposition effect (more impulsive individuals use more) and a behavioral effect (frequent use increases impulsive tendencies).
It emphasizes the behavioral link between personality traits and substance use severity.
ggplot(drug_data,
       aes(x = factor(cannabis_severity_score),
           y = Impulsive,
           fill = factor(cannabis_severity_score))) +
  geom_boxplot() +
  labs(
    title = "Impulsiveness by Cannabis Use Level",
    x = "Cannabis Level (0 = Never, 6 = Daily)",
    y = "Impulsiveness"
  ) +
  theme_minimal() +
  theme(legend.position = "none")
Figure: Impulsiveness across cannabis severity levels
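Because the 0–6 usage scale is ordinal, a rank-based measure is a reasonable way to summarize the trend in the boxplots. Below is a minimal sketch using group medians and a Spearman correlation; it assumes the cannabis_severity_score and Impulsive columns, and the simulated slope of 0.15 per level is a made-up value for illustration.

```r
set.seed(3)
n <- 300

# Simulated, hypothetical data: impulsiveness rises slightly with usage level
level <- sample(0:6, n, replace = TRUE)
toy <- data.frame(
  cannabis_severity_score = level,
  Impulsive = 0.15 * level + rnorm(n)
)

# Median impulsiveness at each usage level
level_medians <- tapply(toy$Impulsive, toy$cannabis_severity_score, median)
print(round(level_medians, 2))

# Spearman rank correlation suits an ordinal scale; exact = FALSE because of ties
rho_test <- cor.test(toy$cannabis_severity_score, toy$Impulsive,
                     method = "spearman", exact = FALSE)
print(rho_test$estimate)
print(rho_test$p.value)
```

A correlation of this kind describes association only; it cannot distinguish the predisposition explanation from the behavioral one.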
This bar chart compares the average severity level for selected substances. It shows that:
Alcohol has the highest average use, followed by cannabis. This likely reflects alcohol's legal accessibility and the cultural acceptance of both substances.
Cocaine and heroin are rarely used, consistent with societal and legal constraints.
The chart highlights a large gap between widely used substances (alcohol, cannabis) and rarely used ones (cocaine, heroin), indicating public health interventions might need to focus on common substances rather than only the rarest illicit ones.
# Mean 0-6 usage level for each selected substance
drug_severity_summary <- data.frame(
  drug_name = c("Alcohol","Cannabis","Cocaine","Heroin"),
  mean_severity = c(
    mean(drug_data$alcohol_severity_score, na.rm = TRUE),
    mean(drug_data$cannabis_severity_score, na.rm = TRUE),
    mean(drug_data$cocaine_severity_score, na.rm = TRUE),
    mean(drug_data$heroin_severity_score, na.rm = TRUE)
  )
)

# Horizontal bars ordered by mean level, each labeled with its value
ggplot(drug_severity_summary,
       aes(x = reorder(drug_name, mean_severity),
           y = mean_severity,
           fill = mean_severity)) +
  geom_col() +
  geom_text(aes(label = round(mean_severity, 2)), hjust = -0.2) +
  coord_flip() +
  labs(
    title = "Average Drug Use Severity by Substance",
    x = "Drug",
    y = "Mean Usage Level"
  ) +
  theme_minimal()
Figure: Average severity level by substance
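Averaging the 0–6 codes treats an ordinal scale as if its steps were evenly spaced, so it can help to look at a second summary alongside the mean. The sketch below computes both the mean code and the share of respondents reporting use within the last year (CL3 or higher) for each substance. The toy values are made up, and the column names follow the raw usage columns used earlier.

```r
# Toy stand-in for the raw usage columns; values are made up for illustration
toy <- data.frame(
  Alcohol  = c("CL5", "CL6", "CL4", "CL5"),
  Cannabis = c("CL0", "CL3", "CL6", "CL2"),
  Heroin   = c("CL0", "CL0", "CL1", "CL0"),
  stringsAsFactors = FALSE
)
severity_scale <- c(CL0 = 0, CL1 = 1, CL2 = 2, CL3 = 3, CL4 = 4, CL5 = 5, CL6 = 6)

# Mean of the 0-6 codes per substance (treats the ordinal scale as numeric)
mean_severity <- sapply(toy, function(x) mean(severity_scale[x]))

# Share of respondents reporting use within the last year (CL3 or higher)
recent_share <- sapply(toy, function(x) mean(severity_scale[x] >= 3))

print(mean_severity)
print(recent_share)
```

If the ranking of substances is the same under both summaries, the conclusion drawn from the bar chart does not depend on the choice of averaging.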
The analyses reveal distinct patterns in substance use, personality, and behavior across age groups. Cannabis use varies by age: younger adults (18–24) are more likely to use it daily, middle-aged adults use moderately to frequently, and older adults have typically tried it but seldom use it currently.
Impulsiveness and sensation seeking decline with age, reflecting developmental and social influences on risk-taking. Cannabis users tend to score higher in openness and slightly lower in conscientiousness, suggesting certain traits may predispose experimentation. Greater impulsiveness also correlates with more severe cannabis use, highlighting the link between behavior and consumption.
Across substances, widely used drugs like alcohol and cannabis are far more prevalent than cocaine and heroin, emphasizing the influence of accessibility, legality, and cultural norms. These findings matter because they show that substance use is not random; it is shaped by age, personality, and behavior, providing a roadmap for targeting prevention, tailoring interventions, and shaping policies that resonate with the populations most at risk.