Introduction

Substance abuse in the United States has been growing in recent years. The Substance Abuse and Mental Health Services Administration (SAMHSA) is the primary source of information on the prevalence, patterns, and consequences of alcohol, tobacco, and illegal drug use and abuse, and of mental disorders, in the U.S. civilian, non-institutionalized population aged 12 and older.

The Treatment Episode Data Set–Admissions (TEDS–A) is a national census data system of annual admissions to public and private substance abuse treatment facilities.

Variables

TEDS–A is built from data that treatment programs report to state substance abuse agencies, which in turn report them to SAMHSA. The TEDS variables that are required to be reported are called the Minimum Data Set. These variables range from basic demographic features of the respondents to the age at first substance use and whether a psychiatric problem is present.

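# Packages: Zelig (modelling), DescTools (Desc), stargazer (tables),
# dplyr (select, summarise), foreign (read.dta)
library(Zelig)
library(DescTools)
library(stargazer)
library(dplyr)
library(foreign)

# Folder that holds the data file
setwd("C:/Users/Xiomara/Desktop/DS0001")

# Read the Stata file into a data frame
dataset <- read.dta(file = "sampledata.dta")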

TEDS–A began in 1992; this analysis uses the data collected in 2012. The 2012 TEDS–A file contains 1,749,767 observations and 63 variables. Those variables are as follows:

names(dataset)
##  [1] "CASEID"   "YEAR"     "AGE"      "GENDER"   "RACE"     "ETHNIC"  
##  [7] "MARSTAT"  "EDUC"     "EMPLOY"   "DETNLF"   "PREG"     "VET"     
## [13] "LIVARAG"  "PRIMINC"  "ARRESTS"  "STFIPS"   "CBSA"     "PMSA"    
## [19] "REGION"   "DIVISION" "SERVSETA" "METHUSE"  "DAYWAIT"  "PSOURCE" 
## [25] "DETCRIM"  "NOPRIOR"  "SUB1"     "ROUTE1"   "FREQ1"    "FRSTUSE1"
## [31] "SUB2"     "ROUTE2"   "FREQ2"    "FRSTUSE2" "SUB3"     "ROUTE3"  
## [37] "FREQ3"    "FRSTUSE3" "NUMSUBS"  "IDU"      "ALCFLG"   "COKEFLG" 
## [43] "MARFLG"   "HERFLG"   "METHFLG"  "OPSYNFLG" "PCPFLG"   "HALLFLG" 
## [49] "MTHAMFLG" "AMPHFLG"  "STIMFLG"  "BENZFLG"  "TRNQFLG"  "BARBFLG" 
## [55] "SEDHPFLG" "INHFLG"   "OTCFLG"   "OTHERFLG" "ALCDRUG"  "DSMCRIT" 
## [61] "PSYPROB"  "HLTHINS"  "PRIMPAY"

For this analysis, the variables that are going to be used are as follows:

dataset2 <- select(dataset, AGE, GENDER, RACE, EDUC, SUB1, FREQ1, FRSTUSE1)

names(dataset2)
## [1] "AGE"      "GENDER"   "RACE"     "EDUC"     "SUB1"     "FREQ1"   
## [7] "FRSTUSE1"

AGE: The client's age at admission, calculated from the date of birth and the date of admission. Clients aged 11 and younger are not included in TEDS. The variable is grouped into 11 categories, coded 2 through 12:

Value Label
2 12-14
3 15-17
4 18-20
5 21-24
6 25-29
7 30-34
8 35-39
9 40-44
10 45-49
11 50-54
12 55 and over
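
The AGE coding above can be attached to raw codes as an ordered factor. The following is a minimal sketch, not the author's code: the vector `raw_age` is hypothetical and invented for illustration, while the codes and labels are copied from the table above.

```r
# Value codes and labels for AGE, as listed in the table above.
age_codes  <- 2:12
age_labels <- c("12-14", "15-17", "18-20", "21-24", "25-29", "30-34",
                "35-39", "40-44", "45-49", "50-54", "55 and over")

# Hypothetical raw codes, for illustration only (not taken from the data).
raw_age <- c(6, 2, 12, 7, 6)

# Map each code to its label; ordered = TRUE keeps the age groups in sequence.
age_factor <- factor(raw_age, levels = age_codes, labels = age_labels,
                     ordered = TRUE)

print(age_factor)
print(table(age_factor))
```

Keeping the factor ordered lets comparisons such as `age_factor < "30-34"` work as expected.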

GENDER: Identifies the client's sex: male or female

RACE: Specifies the client's race

  1. ALASKA NATIVE (ALEUT, ESKIMO, INDIAN)
  2. AMERICAN INDIAN (OTHER THAN ALASKA NATIVE)
  3. ASIAN OR PACIFIC ISLANDER
  4. BLACK OR AFRICAN AMERICAN
  5. WHITE
  6. ASIAN
  7. OTHER SINGLE RACE
  8. TWO OR MORE RACES
  9. NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER

EDUC: Specifies the highest school grade (number of school years) completed by the client

Value Label
1 8 Years or Less
2 9-11
3 12
4 13-15
5 16 or more

SUB1: This variable identifies the client's primary substance problem

  1. NONE
  2. ALCOHOL
  3. COCAINE/CRACK
  4. MARIJUANA/HASHISH
  5. HEROIN
  6. NON-PRESCRIPTION METHADONE
  7. OTHER OPIATES AND SYNTHETICS
  8. PCP
  9. OTHER HALLUCINOGENS
  10. METHAMPHETAMINE
  11. OTHER AMPHETAMINES
  12. OTHER STIMULANTS
  13. BENZODIAZEPINES
  14. OTHER NON-BENZODIAZEPINE TRANQUILIZERS
  15. BARBITURATES
  16. OTHER NON-BARBITURATE SEDATIVES OR HYPNOTICS
  17. INHALANTS
  18. OVER-THE-COUNTER MEDICATIONS
  19. OTHER

FREQ1: Specifies the frequency of use of the primary substance.

  1. No use in the past month
  2. 1-3 times in the past month
  3. 1-2 times in the past week
  4. 3-6 times in the past week
  5. Daily

FRSTUSE1: For drugs other than alcohol, this variable identifies the age at which the client first used the substance identified as the primary substance

Value Label
1 11 & under
2 12-14
3 15-17
4 18-20
5 21-24
6 25-29
7 30-34
8 35-39
9 40-44
10 45-49
11 50-54
12 55 and over

Theories

For this research I want to see whether basic demographic characteristics of the respondents affect which substance they use, how frequently they use it, and how old they were when they first started using (abusing) it. I want to see whether women or men are more likely to abuse each substance, whether age and education are related to which substance is abused, and other things of that nature.
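
One way to examine a question such as whether women or men are more likely to abuse each substance is a cross-tabulation with row proportions. The sketch below uses base R on a toy data frame; the six rows of `toy` are invented for illustration and are not TEDS–A records, and this is only one possible approach.

```r
# Toy data, invented for illustration; these are not TEDS-A records.
toy <- data.frame(
  GENDER = c("MALE", "MALE", "FEMALE", "FEMALE", "MALE", "FEMALE"),
  SUB1   = c("ALCOHOL", "HEROIN", "ALCOHOL", "ALCOHOL", "HEROIN", "HEROIN")
)

# Counts of primary substance within each gender.
counts <- table(toy$GENDER, toy$SUB1)

# Row proportions: the share of each substance among males and among females.
row_share <- prop.table(counts, margin = 1)

print(counts)
print(round(row_share, 3))
```

On the real data, the same two calls could be applied to `dataset2$GENDER` and `dataset2$SUB1`; comparing the rows of `row_share` then shows how the mix of substances differs by gender.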

Descriptive Statistics

# Means are taken over the numeric category codes, not over years
summarise(dataset2,
          mean_age      = mean(AGE),
          mean_educ     = mean(EDUC),
          mean_frstuse1 = mean(FRSTUSE1),
          mean_sub1     = mean(SUB1))
##   mean_age mean_educ mean_frstuse1 mean_sub1
## 1 7.418692  3.835766      4.577659  5.231846

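This output reports the mean category code of the variables AGE, EDUC, FRSTUSE1, and SUB1. The means are taken over the factor level positions, in which the missing category comes first, so for EDUC and FRSTUSE1 each position is one higher than the value code listed above. The mean AGE code of 7.42 falls between the 30-34 and 35-39 groups. The mean EDUC position of 3.84 lies just below position 4, which is 12 years of school, so the typical respondent was close to being a high school graduate. The mean FRSTUSE1 position of 4.58 lies between the 15-17 and 18-20 groups. SUB1 is a nominal variable, so its mean of 5.23 does not identify an average substance; the frequency table below shows that alcohol is the most common primary substance (38.9%), followed by marijuana/hashish (17.5%) and heroin (16.3%).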
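
The reported mean for EDUC can be checked against the frequency table from `Desc(dataset2$EDUC)`. The sketch below assumes that the missing category is the first factor level, so positions run from 1 (missing) to 6 (16 or more); under that assumption the weighted mean of the positions comes out at about 3.8358, in line with the reported `mean_educ`. The counts are copied from the Desc output in this report, and the shortened name `MISSING` is a label chosen here for brevity.

```r
# EDUC frequencies copied from the Desc(dataset2$EDUC) output in this report.
educ_freq <- c("MISSING" = 32136, "8 YEARS OR LESS" = 129186,
               "9-11" = 447439, "12" = 716618,
               "13-15" = 333928, "16 OR MORE" = 90460)

# Assumption: the missing category is factor level 1, so positions are 1..6.
positions <- seq_along(educ_freq)
mean_position <- sum(positions * educ_freq) / sum(educ_freq)
print(round(mean_position, 4))

# Mean of the value codes 1-5 once the missing category is excluded.
valid <- educ_freq[-1]
mean_code <- sum(seq_along(valid) * valid) / sum(valid)
print(round(mean_code, 3))

# For a categorical variable the modal category is often easier to interpret.
modal_level <- names(educ_freq)[which.max(educ_freq)]
print(modal_level)
```

The gap between the two means shows how much the missing category and the shifted positions move the summary, which is why the modal category or the full frequency table is usually a safer description of variables like these.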

Desc(dataset2$AGE)
## ------------------------------------------------------------------------- 
## dataset2$AGE (factor)
## 
##      length         n       NAs    levels    unique     dupes
##   1.750e+06 1.750e+06         0        12        11         y
## 
##                                    level   freq  perc cumfreq cumperc
## 1                                  25-29 275926  .158  275926    .158
## 2                                  30-34 238520  .136  514446    .294
## 3                                  21-24 214528  .123  728974    .417
## 4                                  45-49 181398  .104  910372    .520
## 5                                  40-44 178881  .102 1089253    .623
## 6                                  35-39 170844  .098 1260097    .720
## 7                                  50-54 143828  .082 1403925    .802
## 8                            55 AND OVER 121015  .069 1524940    .872
## 9                                  18-20 104588  .060 1629528    .931
## 10                                 15-17  99368  .057 1728896    .988
## 11                                 12-14  20871  .012 1749767   1.000
## 12 MISSING/UNKNOWN/NOT COLLECTED/INVALID      0  .000 1749767   1.000
Desc(dataset2$GENDER)
## ------------------------------------------------------------------------- 
## dataset2$GENDER (factor)
## 
##      length         n       NAs    levels    unique     dupes
##   1.750e+06 1.750e+06         0         3         3         y
## 
##                                   level    freq  perc cumfreq cumperc
## 1                                  MALE 1163017  .665 1163017    .665
## 2                                FEMALE  583400  .333 1746417    .998
## 3 MISSING/UNKNOWN/NOT COLLECTED/INVALID    3350  .002 1749767   1.000
Desc(dataset2$RACE)
## ------------------------------------------------------------------------- 
## dataset2$RACE (factor)
## 
##      length         n       NAs    levels    unique     dupes
##   1.750e+06 1.750e+06         0        10        10         y
## 
##                                         level    freq  perc cumfreq
## 1                                       WHITE 1158404  .662 1158404
## 2                   BLACK OR AFRICAN AMERICAN  348798  .199 1507202
## 3                           OTHER SINGLE RACE  137759  .079 1644961
## 4  AMERICAN INDIAN (OTHER THAN ALASKA NATIVE)   40119  .023 1685080
## 5                           TWO OR MORE RACES   22612  .013 1707692
## 6       MISSING/UNKNOWN/NOT COLLECTED/INVALID   18445  .011 1726137
## 7                                       ASIAN   11000  .006 1737137
## 8   NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER    7517  .004 1744654
## 9       ALASKA NATIVE (ALEUT, ESKIMO, INDIAN)    3457  .002 1748111
## 10                  ASIAN OR PACIFIC ISLANDER    1656  .001 1749767
##    cumperc
## 1    0.662
## 2    0.861
## 3    0.940
## 4    0.963
## 5    0.976
## 6    0.986
## 7    0.993
## 8    0.997
## 9    0.999
## 10   1.000
Desc(dataset2$EDUC)
## ------------------------------------------------------------------------- 
## dataset2$EDUC (factor)
## 
##      length         n       NAs    levels    unique     dupes
##   1.750e+06 1.750e+06         0         6         6         y
## 
##                                   level   freq  perc cumfreq cumperc
## 1                                    12 716618  .410  716618    .410
## 2                                  9-11 447439  .256 1164057    .665
## 3                                 13-15 333928  .191 1497985    .856
## 4                       8 YEARS OR LESS 129186  .074 1627171    .930
## 5                            16 OR MORE  90460  .052 1717631    .982
## 6 MISSING/UNKNOWN/NOT COLLECTED/INVALID  32136  .018 1749767   1.000
Desc(dataset2$SUB1)
## ------------------------------------------------------------------------- 
## dataset2$SUB1 (factor)
## 
##      length         n       NAs    levels    unique     dupes
##   1.750e+06 1.750e+06         0        20        20         y
## 
##                                    level   freq  perc cumfreq cumperc
## 1                                ALCOHOL 681374  .389  681374    .389
## 2                      MARIJUANA/HASHISH 305560  .175  986934    .564
## 3                                 HEROIN 285451  .163 1272385    .727
## 4           OTHER OPIATES AND SYNTHETICS 164158  .094 1436543    .821
## 5                          COCAINE/CRACK 121065  .069 1557608    .890
## 6                        METHAMPHETAMINE 116090  .066 1673698    .957
## 7                        BENZODIAZEPINES  17019  .010 1690717    .966
## 8                                   NONE  14917  .009 1705634    .975
## 9                                  OTHER   9897  .006 1715531    .980
## 10                    OTHER AMPHETAMINES   8137  .005 1723668    .985
## 11 MISSING/UNKNOWN/NOT COLLECTED/INVALID   5772  .003 1729440    .988
## 12                                   PCP   5732  .003 1735172    .992
## ... etc.
##  [list output truncated]
Desc(dataset2$FREQ1)
## ------------------------------------------------------------------------- 
## dataset2$FREQ1 (factor)
## 
##      length         n       NAs    levels    unique     dupes
##   1.750e+06 1.750e+06         0         6         6         y
## 
##                                   level   freq  perc cumfreq cumperc
## 1                                 DAILY 680067  .389  680067    .389
## 2              NO USE IN THE PAST MONTH 480923  .275 1160990    .664
## 3           1-3 TIMES IN THE PAST MONTH 209313  .120 1370303    .783
## 4            3-6 TIMES IN THE PAST WEEK 196142  .112 1566445    .895
## 5            1-2 TIMES IN THE PAST WEEK 152290  .087 1718735    .982
## 6 MISSING/UNKNOWN/NOT COLLECTED/INVALID  31032  .018 1749767   1.000
Desc(dataset2$FRSTUSE1)
## ------------------------------------------------------------------------- 
## dataset2$FRSTUSE1 (factor)
## 
##      length         n       NAs    levels    unique     dupes
##   1.750e+06 1.750e+06         0        13        13         y
## 
##                                    level   freq  perc cumfreq cumperc
## 1                                  15-17 490295  .280  490295    .280
## 2                                  12-14 380580  .218  870875    .498
## 3                                  18-20 305557  .175 1176432    .672
## 4                                  21-24 177435  .101 1353867    .774
## 5                                  25-29 118429  .068 1472296    .841
## 6                           11 AND UNDER 114141  .065 1586437    .907
## 7                                  30-34  61075  .035 1647512    .942
## 8                                  35-39  34053  .019 1681565    .961
## 9  MISSING/UNKNOWN/NOT COLLECTED/INVALID  28043  .016 1709608    .977
## 10                                 40-44  19836  .011 1729444    .988
## 11                                 45-49  11402  .007 1740846    .995
## 12                                 50-54   5639  .003 1746485    .998
## ... etc.
##  [list output truncated]
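
Several of the tables above include a MISSING/UNKNOWN/NOT COLLECTED/INVALID level. Before computing summaries, it may help to convert that level to NA so that it is not counted as a real category. The following is a minimal sketch on a toy factor; the values in `freq` are invented for illustration, and only the level name is taken from the output above.

```r
# Toy factor, invented for illustration; the level name matches the report.
missing_label <- "MISSING/UNKNOWN/NOT COLLECTED/INVALID"
freq <- factor(c("DAILY", missing_label, "NO USE IN THE PAST MONTH", "DAILY"))

# Turn the missing label into NA, then drop the now-empty level.
freq_clean <- freq
freq_clean[freq_clean == missing_label] <- NA
freq_clean <- droplevels(freq_clean)

# useNA = "ifany" keeps the NA count visible in the table.
print(table(freq_clean, useNA = "ifany"))
```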

Conclusion

The gender of the respondent has an effect on the kind of substance used: males were more likely to abuse heroin, while women were more likely to abuse alcohol. Age also has an effect on which substance the respondents were more likely to use and abuse. The substance of choice changes with age: respondents aged 12 to about 24 were more likely to abuse marijuana, while for respondents in their late 20s and older the most used substance is alcohol. The effects vary with age, amount of education, gender, and race. Basic demographics have a strong correlation with the substance of choice, the frequency of use, and the age at which the substance was first used.