Case Study 2: Reproducing a study using NSDUH data

Case Study 2 analyzes 2018 NSDUH data on heavy alcohol intake and the characteristics of alcohol use among adolescents aged 12-17. We use the article “Characteristics of drinking events associated with heavy episodic drinking among adolescents in the United States” (Rossheim et al., 2017) as our reference.

Load the packages and dataset

library(broom)
library(MASS)
library(lmtest)
library(survey)
library(tidyverse)

load("NSDUH2018.RData")

Exploration of the data

The dataset is stored under the name “NSDUH2018”, which is used in the code that follows. The summary function gives an overview of every variable, and the table function shows how respondents are distributed across the values of the age variable “AGE2”.

summary(NSDUH2018)

table(NSDUH2018$AGE2)

## 
##     1     2     3     4     5     6     7     8     9    10    11    12    13 
##  2032  2263  2233  2215  2321  2223  1845  1659  1621  1658  3412  3442  3868 
##    14    15    16    17 
##  4926 11688  4938  3969

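“AGE2” takes values from 1 to 17. These are recoded age categories rather than ages in years: in the NSDUH codebook, codes 1-6 correspond to ages 12-17, and higher codes correspond to older ages and age groups. Category 15 contains the largest number of participants (11,688).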
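The coding of “AGE2” can be made explicit with a small sketch. The vector below is made-up data, and the mapping of codes 1-6 to ages 12-17 is an assumption taken from the NSDUH codebook that should be checked against the codebook for the file in use.

```r
# Hypothetical AGE2 codes for eight respondents (not real NSDUH records)
age2 <- c(1, 4, 6, 7, 12, 15, 17, 3)

# Assumption (verify in the codebook): codes 1-6 are single years of age 12-17.
# Higher codes are left as NA here because they denote older ages or age groups.
age_years <- ifelse(age2 <= 6, age2 + 11, NA)

# Flag the adolescent categories
is_adolescent <- age2 %in% 1:6

table(is_adolescent)
```

Tabulating a flag like this before subsetting is a quick way to confirm that the filter picks up the intended age group.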

Selecting our target population and creating new variables

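To select our target group of 12-17-year-olds who consumed alcohol in the last 30 days, we first define the survey design on the full dataset and then subset the design object. Because “AGE2” is a recoded category (codes 1-6 correspond to ages 12-17 in the NSDUH codebook), the filter uses category codes rather than ages in years:

desg2018 <- svydesign(id = ~verep, strata = ~vestr, weights = ~ANALWT_C, nest = TRUE, data = NSDUH2018)

# Subset the design created above (desg2018), keeping AGE2 codes 1-6 (ages 12-17)
desg2018 <- subset(desg2018, AGE2 <= 6 & alcrec == 1)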
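Subsetting the design object, rather than filtering the raw data before calling svydesign, keeps the sampling structure available for variance estimation. A minimal sketch on made-up data, assuming the survey package is installed and that AGE2 codes 1-6 denote ages 12-17 (the toy data frame and its values are hypothetical):

```r
library(survey)

# Toy data: 2 strata x 2 PSUs x 3 respondents; all values are made up
toy <- data.frame(
  vestr    = rep(c(1, 2), each = 6),
  verep    = rep(rep(c(1, 2), each = 3), times = 2),
  ANALWT_C = c(100, 120, 90, 110, 95, 105, 130, 80, 115, 100, 90, 125),
  AGE2     = c(1, 3, 8, 2, 6, 15, 4, 5, 12, 1, 7, 6),
  alcrec   = c(1, 2, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1)
)

# Define the design on the full data first
toy_design <- svydesign(id = ~verep, strata = ~vestr, weights = ~ANALWT_C,
                        nest = TRUE, data = toy)

# Then subset the design object, not the raw data frame
toy_sub <- subset(toy_design, AGE2 <= 6 & alcrec == 1)

# Respondents kept and the population total they represent
n_kept    <- nrow(toy_sub$variables)
pop_total <- sum(weights(toy_sub))
```

Here `n_kept` counts the respondents in the subpopulation and `pop_total` is the sum of their analysis weights.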

To create a new variable for heavy episodic drinking (HED) among adolescents, we use the following code.

The first step is to separate males and females using the “irsex” variable:

males <- filter(NSDUH2018, irsex == 1)
females <- filter(NSDUH2018, irsex == 2)

The second step is to create the HED variable for males and females from the variable “cadrlast”, which records answers to the question “Please think about the last time you drank any alcoholic beverage. How many drinks did you have that time?” HED is defined as 5 or more drinks for males and 4 or more drinks for females:

# HED = "1" at or above the sex-specific threshold, "0" strictly below it
males$HED[males$cadrlast >= 5] <- "1"
males$HED[males$cadrlast < 5] <- "0"

females$HED[females$cadrlast >= 4] <- "1"
females$HED[females$cadrlast < 4] <- "0"
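The same sex-specific rule can also be written without splitting the data, as a sketch using a per-row threshold. The toy data frame below is made up. Note also that survey files often store skipped or missing answers as large numeric codes, so it is worth checking the codebook for “cadrlast” before applying a `>=` comparison to real data.

```r
# Toy data standing in for NSDUH2018; values are made up for illustration
toy <- data.frame(
  irsex    = c(1, 1, 1, 2, 2, 2),   # 1 = male, 2 = female
  cadrlast = c(3, 5, 8, 3, 4, 6)    # drinks on the last drinking occasion
)

# Sex-specific threshold: 5 drinks for males, 4 for females
threshold <- ifelse(toy$irsex == 1, 5, 4)

# HED is "1" at or above the threshold and "0" below it
toy$HED <- ifelse(toy$cadrlast >= threshold, "1", "0")

table(toy$HED, toy$irsex)
```

A respondent who reports exactly the threshold number of drinks (5 for males, 4 for females) is classified as HED, which matches the "or more" wording of the definition.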

The data frames “males” and “females” contain 29,358 and 26,955 observations, respectively.

The last step is to combine the two data frames:

NSDUH2018b <- rbind(males, females)
NSDUH2018b

table(NSDUH2018b$HED)
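After combining, it helps to confirm that no rows were lost and to see how many respondents have no HED value. A small sketch on made-up data (the toy data frames are hypothetical stand-ins for “males” and “females”):

```r
# Toy stand-ins for the males and females data frames; values are made up
males_toy   <- data.frame(irsex = 1, cadrlast = c(2, 5, NA))
females_toy <- data.frame(irsex = 2, cadrlast = c(4, 1))

males_toy$HED   <- ifelse(males_toy$cadrlast >= 5, "1", "0")
females_toy$HED <- ifelse(females_toy$cadrlast >= 4, "1", "0")

combined <- rbind(males_toy, females_toy)

# The combined frame should have as many rows as the two parts together
stopifnot(nrow(combined) == nrow(males_toy) + nrow(females_toy))

# useNA = "ifany" adds a column for respondents with a missing HED value
hed_counts <- table(combined$HED, useNA = "ifany")
hed_counts
```

By default, table drops missing values, so the `useNA` argument is what makes unclassified respondents visible.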

Jayati Atahar

9/21/2020

Analysis of the sources of alcohol