Introduction

According to the Centers for Disease Control and Prevention (CDC), cigarette smoking is the leading cause of preventable disease and death in the United States. Smoking accounts for more than 480,000 deaths every year, or about 1 in 5 (“State Tobacco Activities Tracking and Evaluation”, 2017). In 2016, approximately 15.5 percent of US adults aged 18 or older smoked. That amounts to nearly 37.8 million smokers, and more than 16 million of those smokers reported living with a tobacco-related disease. This study examines the effectiveness of state cigarette excise taxes per pack, or TPP, in motivating current smokers to quit, since the majority of state legislatures maintain that increasing the excise tax on a pack of cigarettes contributes to declines in the overall smoking population.

For example, an article in The Economist, “Time to Quit Smoking”, defends this notion by discussing the impact smoking has on less developed countries (“Time to Quit”, 2015). In America and other “rich” countries such as Australia, Britain, Canada, and Italy, one in five or fewer people smoke and smoking-related deaths are in decline. By contrast, in poorer parts of the world, such as Africa, more and more people are becoming addicted. The article goes on to assert that the way to deter people from smoking and decrease the number of smoking-related deaths is to increase taxes, ban smoking indoors, and publicize the health risks linked to smoking. Countries that implemented higher cigarette taxes and new smoking regulations, such as Uruguay and Turkey, experienced considerable drops in smoking rates of approximately 10 percentage points. Furthermore, the article states that “a 10 percent price increase cuts consumption by 4-5%, half of which is among smokers who quit; the effect is two to three times as large among young people, who have less money” (“Time to Quit”, 2015). Moreover, tobacco taxes tend to be regressive, since a higher percentage of manual workers smoke relative to professionals and managers; however, poor smokers are said to be more likely to quit when taxes raise prices.

On the other hand, some maintain the opposite point of view regarding the impact raising taxes can have on smokers. An article in The Washington Post by Keith Humphreys, a Professor and Director of Mental Health Policy at the Stanford University Department of Psychiatry, “Why the Wealthy Stopped Smoking, but the Poor Didn’t”, focuses on the disturbing finding that people of lower socioeconomic status smoke more than those of higher status (Humphreys, 2015). In a Gallup poll of over 75,000 Americans, the rate of smoking among people who make $24,000 or less was more than double that of people who make $90,000 or more. A study discussed in the article emphasizes that once the health risks of smoking became more widely known, the wealthy were more likely to give up smoking, as “high-income families decreased their smoking by 62 percent from 1965 to 1999, versus only 9 percent for low-income families.” Poor smokers are said to have trouble quitting for three main reasons: they have a stronger addiction due to more time spent smoking; people tend to be segregated by income, so poor smokers tend to make quit attempts near other smokers while wealthier smokers do not; and lower-income people are likely to lag behind middle-class people in their access to tobacco cessation programs. Professor Humphreys maintains that the concentration of smoking among the poor exacerbates income inequality in the US. Ever-increasing tobacco taxes can eventually become regressive for addicted low-income smokers who do not stop smoking. Essentially, raising the excise tax per pack has a more significant negative impact on the poor.

Lastly, regardless of income or socioeconomic status, some research maintains that the focus of encouraging smokers to quit should be non-tax policies. A study, “What Cigarette Price is Required for Smokers to Attempt to Quit Smoking?”, conducted by Eun-Ja Park et al., evaluates the price at which the most smokers will try to stop smoking along with factors, such as non-tax tobacco control policies, that are potentially related to the required price (Park et al., 2015). A cross-sectional analysis was conducted using data on 1,257 male smokers, and information was collected on what cigarette price per pack would make them attempt to quit. A regression on log-transformed price and a logistic regression on non-quitting were carried out to isolate associated factors. Results revealed that younger age, higher education, lack of concern about the health effects of smoking, lack of attempts to stop smoking, and more cigarettes consumed per day correlated with a higher price required for a quit attempt. On the other hand, exposure to various non-tax policies was associated with a sizably lower cigarette price needed to prompt a subject to try to quit. Overall, those who require a higher price per pack, particularly heavy smokers, would find more success in quitting if the price increase were accompanied by a non-tax policy. Improving non-tax policies appears to increase the effectiveness of tax policies.

The goal of this study is to determine whether or not states with larger cigarette excise tax rates have higher rates of attempts to quit smoking than states with lower tax rates, using nationally representative survey data. Specifically, the study asks whether or not the magnitude of the tobacco tax shares a positive relationship with attempts to quit smoking. By comparing whether or not smoking respondents attempted to quit with the excise tax per pack of cigarettes in each respondent’s current state of residence, I interpret the effect of the tax on attempts to quit. Additionally, I integrated other variables, such as income, the presence of a state Medicaid-funded tobacco cessation program, overall state cigarette use, and age, to analyze the potential impact these factors could have on attempts to stop smoking. I focus on more experienced smokers because quitting becomes more difficult as time spent smoking increases.

Methods

Data

#----------------------------------------------------------------------------------------------#
#Project: Data Challenge 1                                                                      
#Author: Emma Brauer
#Program Name: 1_dataprep 
#Data Used: BRFSS16_DCL.dta
#Created: 1/31/2018
#Last Revised & Notes: 

#Reminders: Set working directory.

#Contents: This file converts original data from Stata to R format and recodes variables for analysis
#----------------------------------------------------------------------------------------------#

#----------------------------------------------------------------------------------------------#
#Set Up and load data. Remember that after the first run you can comment out install.packages lines
#----------------------------------------------------------------------------------------------#
#Set working directory
setwd("U:/ECO 307/DC1")

#Install packages
#install.packages("foreign")
#install.packages("tidyverse")

#Library packages
library(foreign)
library(tidyverse)

#Import Data from Stata format to R (the foreign package is already loaded above)
BRFSS16 <- read.dta("BRFSS2016_DC1.dta")

#Look at your data
#View(BRFSS16)

#-----------------------------------------------------------------------------------------------#
#Rename variables that start with _ because R hates them. My example here with _state will work for any other
#variables that start with _ too. 
#-----------------------------------------------------------------------------------------------#
names(BRFSS16)[names(BRFSS16) == "_state"] <- "fips"
names(BRFSS16)[names(BRFSS16) == "_ageg5yr"] <- "ageg5yr"
names(BRFSS16)[names(BRFSS16) == "_finalwt"] <- "finalwt"
#-----------------------------------------------------------------------------------------------#
#Select the Variables Relevant for your Analysis. Don't worry, you can always add more later.
#I chose to put mine in a new dataframe called "perpacktaxdata".
#-----------------------------------------------------------------------------------------------#
perpacktaxdata <- select(BRFSS16, fips, ageg5yr, stopsmk2, income2, finalwt)
#View(perpacktaxdata)

#-----------------------------------------------------------------------------------------------#
#Merge in your state policy data
#-----------------------------------------------------------------------------------------------#
#First open it
statepol <- read.csv("PolicyData_Brauer.csv")

#Look at it (Does it look right?)
#View(statepol)

#We need to name the variable that contains the fips code the same as in our BRFSS file
names(statepol)[names(statepol)== "X_state"] <- "fips"

#Merge it with the analytic data set (perpacktaxdata), keeping every survey respondent
perpacktaxdata_statepol <- left_join(perpacktaxdata, statepol, by="fips")

#Remove the other dataframes from your workspace to conserve memory
rm(BRFSS16, perpacktaxdata, statepol)

#-----------------------------------------------------------------------------------------------#
#Recode variables and label values for categorical variables
#When we looked at the data we could see there were lots of weird values. For example 88's and 99's
#Use your BRFSS codebook to decide what values those should be recoded to.Then adapt the code below.
#For variables where numbers are used to represent categories, we can prepare the variables as factors.
#-----------------------------------------------------------------------------------------------#
#stopsmk2: 2 (no) is recoded to 0; 7 (don't know) and 9 (refused) become missing
perpacktaxdata_statepol$stopsmk2 <- as.numeric(recode(as.character(perpacktaxdata_statepol$stopsmk2), "7" = NA_character_, "9" = NA_character_, "2" = "0"))
#income2: 77 (don't know) and 99 (refused) become missing, leaving categories 1 through 8
perpacktaxdata_statepol$income2 <- as.numeric(recode(as.character(perpacktaxdata_statepol$income2), "77" = NA_character_, "99" = NA_character_))
perpacktaxdata_statepol$income2 <- as.factor(perpacktaxdata_statepol$income2)
#One label for each of the eight remaining income categories
levels(perpacktaxdata_statepol$income2) <- c("Less than $10K",
                                             "$10K to <$15K",
                                             "$15K to <$20K",
                                             "$20K to <$25K",
                                             "$25K to <$35K",
                                             "$35K to <$50K",
                                             "$50K to <$75K",
                                             "$75K or more")
#table(perpacktaxdata_statepol$stopsmk2)

#-----------------------------------------------------------------------------------------------#
#Apply your sample selection criteria using the filter verb from DataCamp
#-----------------------------------------------------------------------------------------------#
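#Note: in the BRFSS codebook ageg5yr 1 = ages 18 to 24 and 2 = ages 25 to 29, so >= 2 keeps ages 25 and older
perpacktaxdata_statepol <- perpacktaxdata_statepol %>% filter(ageg5yr >= 2)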
#perpacktaxdata_statepol<-perpacktaxdata_statepol %>% filter(income2 == "Less than $10K"|income2 == "$10K to <$15K"|income2 == "$15K to $20K"|income2 == "$20K to $25K")

#Drop observations with missing values on any of your variables by replacing my dataframe name with yours
#in the code below.
#Pay attention - if a ton of observations are dropped, you may have a problem here
perpacktaxdata_statepol <- perpacktaxdata_statepol %>% filter(complete.cases(.))

##################################################################################################
#Analytic sample is prepared. Proceed to 2_eda
##################################################################################################
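
Before relying on the merged file, it can help to confirm that every state code in the survey found a match in the policy file and to see how many rows the complete-case filter will drop. The following is a minimal sketch on made-up data, using base R's merge() as a stand-in for left_join(); the data frames survey and policy and all of their values are hypothetical.

```r
# Toy stand-ins for the survey extract and the state policy file (values are made up)
survey <- data.frame(fips = c(1, 1, 2, 4), stopsmk2 = c(1, 0, NA, 1))
policy <- data.frame(fips = c(1, 2), TPP = c(0.675, 2.00))

# State codes that appear in the survey but not in the policy file
unmatched_fips <- setdiff(unique(survey$fips), policy$fips)
print(unmatched_fips)

# Base-R analogue of a left join: keep every survey row
merged <- merge(survey, policy, by = "fips", all.x = TRUE)

# Share of missing values per column, to see what complete.cases() will remove
missing_share <- colMeans(is.na(merged))
print(missing_share)

# Number of rows that survive the complete-case filter
n_complete <- sum(complete.cases(merged))
print(n_complete)
```

A non-empty unmatched_fips would mean that some respondents receive missing policy values and are silently dropped by the complete-case filter.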
Figure 1 Excise Tax Per Cigarette Package By State

The data used in this analysis came from the 2016 Behavioral Risk Factor Surveillance System (BRFSS), and the tax per pack (TPP) data were retrieved from the Sales Tax Handbook (“Map of Cigarette Excise Taxes & Price Per Pack”, n.d.). The BRFSS contains the widest range of health-related surveys in the United States. For this study, STOPSMK2 is the key outcome variable; it records whether or not current smokers stopped smoking for one day or longer within the past 12 months because they were trying to quit. Current smokers are typically defined as persons who report having smoked at least 100 cigarettes during their lifetime and who report smoking every day or most days. STOPSMK2 is a binary variable: an attempt to quit smoking is represented by a 1 while no attempt is represented by a 0. TPP is the key independent variable, as it measures the magnitude of the state cigarette excise tax per pack. TPP is a continuous variable since most states’ excise taxes differ; a larger value indicates a larger tax levied on each pack of cigarettes.

In addition to these primary outcome and independent measures, the data file also includes a categorical measure, INCOME2, which records annual household income from all sources in brackets. In the analysis, I estimate models that include all income categories to assess the differences between each category of income, with the effect of TPP captured by the quadratic functional form equation. The BRFSS file also includes a measure of respondent age in five-year groups, AGEG5YR. I chose to exclude respondents aged 29 and below to focus on more experienced smokers, who are more dependent on cigarettes.

I also incorporated two variables that were not present in the BRFSS file: MCP and CIGUSE. MCP stands for Medicaid Cessation Program, while CIGUSE denotes the percentage of smokers within each state. First, the presence of a statewide tobacco cessation program could alleviate some of the difficulty associated with attempting to stop smoking. Some programs were even created to cut state costs related to caring for people, including state employees, with tobacco-caused diseases and conditions. I therefore controlled for this variable in the multiple regression equation, using information on Medicaid-sponsored cessation programs by state released by the Centers for Disease Control and Prevention (“Tobacco Cessation: State and Federal Efforts to Help”). This variable is binary: each state either had a program that covered some over-the-counter products, some Rx products, or therapy/counseling/social support, or it did not. Second, CIGUSE is a continuous variable based on data that were also collected by the CDC (“Smoking & Tobacco Use”, 2018).

After applying the sample restriction and removing responses with missing data, the total number of observations is 55,834. Table 2 provides weighted summary statistics for all variables involved in this analysis, computed with the BRFSS sampling weights (finalwt).
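
Weighted summary statistics of this kind can be sketched in a few lines of base R. The example below is illustrative only: the data frame toy and its values are made up, the helper names wmean and wsd are hypothetical, and the weighted standard deviation uses a simple population-style formula rather than a survey-design variance.

```r
# Toy data: quit-attempt indicator, tax per pack, and a sampling weight (all made up)
toy <- data.frame(
  stopsmk2 = c(1, 0, 1, 1),
  TPP      = c(0.50, 0.50, 2.00, 3.00),
  finalwt  = c(100, 300, 200, 400)
)

# Weighted mean: each respondent counts in proportion to the sampling weight
wmean <- function(x, w) sum(w * x) / sum(w)

# Weighted standard deviation around the weighted mean (population-style formula)
wsd <- function(x, w) sqrt(sum(w * (x - wmean(x, w))^2) / sum(w))

summary_stats <- data.frame(
  variable = c("stopsmk2", "TPP"),
  mean     = c(wmean(toy$stopsmk2, toy$finalwt), wmean(toy$TPP, toy$finalwt)),
  sd       = c(wsd(toy$stopsmk2, toy$finalwt), wsd(toy$TPP, toy$finalwt))
)
print(summary_stats)
```

For a binary variable such as stopsmk2, the weighted mean is the weighted share of respondents who attempted to quit.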

Empirical Strategy

To infer the relationship between excise tax per pack and attempts to quit, I compare whether or not currently smoking respondents have attempted to quit with the TPP in each respondent’s current state of residence, as recorded in the BRFSS survey data. I begin with Model 1, a simple regression that employs weighted least squares to address the complex sampling methods used by the BRFSS.

Model 1 Simple Regression

\[stopsmk2_{i} = \beta_{0} + \beta_{1}TPP_{i}+\epsilon_{i}\]

In Model 1, \(TPP_i\) is the cigarette excise tax per pack in respondent \(i\)’s state, the binary stopsmk2 variable is the Y variable, and \(\beta_1\) is the change in the probability of a quit attempt associated with a one-dollar increase in the tax. A positive estimate for \(\beta_1\) would indicate that a higher TPP is correlated with an increase in the likelihood of attempting to quit. This model attributes any differences in attempts to stop smoking to TPP even though states can differ in other ways that may affect attempts to quit. For example, Medicaid-funded tobacco cessation programs can alleviate a great deal of the struggle associated with trying to quit. This could ease the minds of smokers who are thinking about quitting and help motivate them to end their habit. If this holds true, and if states with higher taxes are also more likely to offer such programs, the policy estimate in my simple regression may be positively biased: without a control for Medicaid programs, their effect on quit attempts is attributed to TPP.
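
The omitted-variable concern described above can be illustrated with a small simulation. This is only a sketch under invented assumptions: the variables tpp, mcp, and quit are simulated, the true tax effect is set to zero, and the cessation program is made more common where the tax is high. None of the numbers come from the BRFSS data.

```r
set.seed(307)  # arbitrary seed so the sketch is reproducible
n <- 20000

# Simulated tax per pack and a cessation program that is more likely in high-tax states
tpp <- runif(n, min = 0, max = 4)
mcp <- rbinom(n, size = 1, prob = plogis(tpp - 2))

# Assumed true model: the tax has no effect, the program raises quit attempts
quit_prob <- 0.4 + 0.0 * tpp + 0.2 * mcp
quit <- rbinom(n, size = 1, prob = quit_prob)

short_fit <- lm(quit ~ tpp)        # omits the program, like Model 1
long_fit  <- lm(quit ~ tpp + mcp)  # controls for the program

coef_short <- unname(coef(short_fit)["tpp"])
coef_long  <- unname(coef(long_fit)["tpp"])
print(c(short = coef_short, long = coef_long))
```

In this setup the short regression credits the tax with part of the program's effect, so its tax coefficient sits above the one from the regression that controls for the program.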

model1 <- lm(stopsmk2 ~ TPP, data = perpacktaxdata_statepol, weights = finalwt) 
library(stargazer)
stargazer(model1, title = "Table 1 Results for Model 1", header=FALSE, covariate.labels = c("Tax Per Pack"), column.labels = c("Model 1"), type = "html")
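Table 1 Results for Model 1
==================================================
                         Dependent variable:
                         stopsmk2
                         Model 1
--------------------------------------------------
Tax Per Pack             -0.001
                         (0.002)
Constant                 0.587***
                         (0.004)
--------------------------------------------------
Observations             55,834
R2                       0.00000
Adjusted R2              -0.00002
Residual Std. Error      11.535 (df = 55832)
F Statistic              0.143 (df = 1; 55832)
==================================================
Note: *p<0.1; **p<0.05; ***p<0.01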

Model 2 Multiple Regression

\[stopsmk2_{i} = \beta_{0} + \beta_{1}TPP_{i}+ \beta_{2}income2_{i} + \beta_{3}ageg5yr_{i} + \beta_{4}MCP_{i} + \beta_{5}CIGUSE_{i}+ \epsilon_{i}\]

To control for this bias, I included whether or not each state had some form of Medicaid tobacco cessation program (MCP). I also included income, age, and cigarette use by state in Model 2. Income and age enter the model as sets of category indicators, while cigarette use is a continuous variable. The omitted categories are income below $10,000 and the youngest age group remaining in the sample (the group just below ages 30 to 34).

Model 3 Multiple Regression: Functional Form

\[stopsmk2_{i} = \beta_{0} + \beta_{1}TPP_{i}+ \beta_{2}TPP_{i}^{2}+ \beta_{3}income2_{i} + \beta_{4}ageg5yr_{i} + \beta_{5}MCP_{i} + \beta_{6}CIGUSE_{i}+ \epsilon_{i}\]

Increases in TPP are meant to deter smokers of all ages and incomes from smoking; however, research shows that this does not always hold true. As a result, I included all categories of income and all ages 30 and older to determine whether or not there exists a category of income or age at which smokers’ demand for cigarettes becomes inelastic. To do this, I estimated a quadratic functional form model.
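
Because income and age enter Models 2 and 3 as factors, R builds one indicator column per category and leaves out the first level as the omitted (reference) category. The sketch below shows this on a made-up factor that reuses three of the income labels from the data preparation script; the vector income and its values are hypothetical.

```r
# Toy income factor with three of the labels used in the data preparation script
income <- factor(
  c("Less than $10K", "$10K to <$15K", "$75K or more", "Less than $10K"),
  levels = c("Less than $10K", "$10K to <$15K", "$75K or more")
)

# model.matrix() shows the indicator columns that lm() would build
design <- model.matrix(~ income)
print(colnames(design))

# Changing the reference level changes which category is omitted
income_ref <- relevel(income, ref = "$75K or more")
design_ref <- model.matrix(~ income_ref)
print(colnames(design_ref))
```

Each reported coefficient on an income or age category is therefore a difference relative to the omitted category, not an absolute level.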

Results

In Model 1, a one-dollar increase in the excise tax per pack is associated with a 0.1 percentage point decrease in the probability that a smoker will attempt to stop smoking. However, the TPP coefficient of -0.001 is small and not statistically significant. Essentially, the simple regression reveals little to no relationship between attempts to stop smoking and tax per pack.

After controlling for income, age, whether or not a state Medicaid-funded program exists, and the state cigarette use percentage in Model 2, a one-dollar increase in the excise tax per pack is associated with a 0.52533 percentage point decrease in the probability that a smoker will try to quit, a larger decrease than the 0.1 percentage point estimated by the simple regression. Therefore, omitting one or more of these variables biased the simple regression estimate upward, which made the negative relationship between TPP and quit attempts appear smaller than it is once the controls are included.

Table 2 contains the results of estimating Model 3, where I(TPP^2) is tax per pack squared, since this model uses a quadratic functional form. The coefficient on TPP is negative (1.6 percentage points) and the coefficient on TPP squared is positive (0.25 percentage points), so the probability of trying to quit decreases as the excise tax per pack increases, but by less at higher tax levels. The squared term is not statistically significant.
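
In a quadratic model the effect of a one-dollar tax increase depends on the tax level: it equals the TPP coefficient plus twice the squared-term coefficient times TPP. The sketch below works through that arithmetic using the rounded coefficients printed in Table 2; the names b_tpp, b_tpp2, and marginal_effect are hypothetical, the tax levels are chosen arbitrarily, and because the squared term is imprecisely estimated the results are illustrative only.

```r
# Rounded coefficients as printed in Table 2 for the quadratic model
b_tpp  <- -0.016  # coefficient on TPP
b_tpp2 <-  0.003  # coefficient on TPP squared

# Marginal effect of a one-dollar increase at a given tax level
marginal_effect <- function(tpp) b_tpp + 2 * b_tpp2 * tpp

# Evaluate at a few illustrative tax levels (in dollars, chosen arbitrarily)
tax_levels <- c(0.5, 1, 2, 4)
effects <- data.frame(tpp = tax_levels, marginal_effect = marginal_effect(tax_levels))
print(effects)

# Tax level at which the fitted curve stops falling
turning_point <- -b_tpp / (2 * b_tpp2)
print(turning_point)
```

With these rounded values the fitted relationship is negative at low tax levels and flattens as the tax rises; unrounded coefficients would shift the turning point.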

Because older smokers are less likely to try to quit, I predicted that attempts to quit would decrease as age increased. This appears to be fairly accurate since, according to Table 2, attempts to quit become less and less likely as age increases. In terms of income, smokers of higher socioeconomic status are less likely to be affected by a small tax burden than smokers of lower socioeconomic status. Table 2 also displays a decrease in the probability of attempting to quit smoking as income increases until smokers reach an income of $50,000. Smokers with incomes of $50,000 to $75,000 and of $75,000 or more show a higher likelihood of attempting to quit than smokers who earn $35,000 to $50,000. This may indicate that there is in fact an income at which demand for cigarettes is reasonably inelastic. However, none of the differences in the estimated coefficients are very large, and the standard errors are too large to conclude that any differences in the estimates are statistically significant. All in all, this model suggests that tax per pack may not have a meaningful impact on attempts to quit smoking overall.

#Model 2: multiple regression with income, Medicaid cessation program, cigarette use, and age controls
model2 <- lm(stopsmk2 ~ TPP + income2 + MCP + CIGUSE + as.factor(ageg5yr), data = perpacktaxdata_statepol, weights = finalwt)

#Model 3: adds the squared tax term (quadratic functional form)
model3 <- lm(stopsmk2 ~ TPP + income2 + MCP + CIGUSE + as.factor(ageg5yr) + I(TPP^2), data = perpacktaxdata_statepol, weights = finalwt)

library(stargazer)
stargazer(model2, model3, title = "Table 2 Results for Models 2 and 3", header=FALSE, covariate.labels = c("Tax Per Pack", "Income 10K to less than 15K", "$15K to less than $20K", "$20K to less than $25K", "$25K to less than $35K", "$35K to less than $50K", "$50K to less than $75K", "$75K or more", "Medicaid Cessation Program", "Cigarette Use Percentage", "Age 30 to 34", "Age 35 to 39", "Age 40 to 44", "Age 45 to 49", "Age 50 to 54", "Age 55 to 59", "Age 60 to 64", "Age 65 to 69", "Age 70 to 74", "Age 75 to 79", "Age 80 or Older", "Don't Know"), column.labels = c("Model 2", "Model 3"), type = "html")
Table 2 Results for Models 2 and 3
==========================================================================================
                              Dependent variable: stopsmk2
                              Model 2                       Model 3
                              (1)                           (2)
------------------------------------------------------------------------------------------
Tax Per Pack                  -0.005**                      -0.016**
                              (0.002)                       (0.008)
Income 10K to less than 15K   -0.004                        -0.004
                              (0.010)                       (0.010)
15K to less than 20K          -0.009                        -0.009
                              (0.009)                       (0.009)
20K to less than 25K          -0.025***                     -0.025***
                              (0.009)                       (0.009)
25K to less than 35K          -0.026***                     -0.026***
                              (0.009)                       (0.009)
35K to less than 50K          -0.072***                     -0.072***
                              (0.009)                       (0.009)
50K to less than 75K          -0.061***                     -0.061***
                              (0.009)                       (0.009)
75K or more                   -0.066***                     -0.066***
                              (0.008)                       (0.008)
Medicaid Cessation Program    -0.001                        0.004
                              (0.007)                       (0.007)
Cigarette Use Percentage      -0.003***                     -0.003***
                              (0.001)                       (0.001)
Age 30 to 34                  -0.020**                      -0.020**
                              (0.009)                       (0.009)
Age 35 to 39                  -0.057***                     -0.057***
                              (0.009)                       (0.009)
Age 40 to 44                  -0.087***                     -0.087***
                              (0.009)                       (0.009)
Age 45 to 49                  -0.098***                     -0.098***
                              (0.009)                       (0.009)
Age 50 to 54                  -0.099***                     -0.099***
                              (0.009)                       (0.009)
Age 55 to 59                  -0.099***                     -0.099***
                              (0.009)                       (0.009)
Age 60 to 64                  -0.107***                     -0.107***
                              (0.009)                       (0.009)
Age 65 to 69                  -0.142***                     -0.142***
                              (0.011)                       (0.011)
Age 70 to 74                  -0.145***                     -0.144***
                              (0.013)                       (0.013)
Age 75 to 79                  -0.168***                     -0.168***
                              (0.017)                       (0.017)
Age 80 or Older               -0.197***                     -0.197***
                              (0.024)                       (0.024)
Don’t Know                    0.017                         0.016
                              (0.031)                       (0.031)
I(TPP^2)                                                    0.003
                                                            (0.002)
Constant                      0.765***                      0.769***
                              (0.019)                       (0.019)
------------------------------------------------------------------------------------------
Observations                  55,834                        55,834
R2                            0.011                         0.011
Adjusted R2                   0.011                         0.011
Residual Std. Error           11.471 (df = 55811)           11.471 (df = 55810)
F Statistic                   29.099*** (df = 22; 55811)    27.929*** (df = 23; 55810)
==========================================================================================
Note: *p<0.1; **p<0.05; ***p<0.01
#Plot
plot_data2 <- perpacktaxdata_statepol %>%
  group_by(TPP) %>%
  summarize(meanstops=weighted.mean(stopsmk2, finalwt)) 

ggplot(plot_data2, aes(x=TPP, y=meanstops)) + geom_point() + geom_smooth() +  labs(title = "Current US Smokers' Attempts to Quit Based on Tax Per Pack", x = "Tax Per Pack", y = "Average Attempts to Quit")

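This graph depicts the relationship between TPP and stopsmk2: each point is the weighted share of current smokers at a given TPP who attempted to quit, with a smoothed trend line. The plotted means are not adjusted for the additional variables included in Model 2.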

Conclusion

This study analyzed the potential relationship between tax per pack (TPP) and attempts to stop smoking among current smokers in the United States. Prior research on this topic varies: some researchers assert that taxing cigarettes has a profound negative effect on the poor because fewer wealthy people smoke regularly, while others believe the poor have a more elastic demand and would be prompted to quit by small changes in the price of cigarettes. Using cross-sectional data from the BRFSS, I was unable to find a clear relationship between TPP and the stopsmk2 variable. When I examined the relationship between income, age, TPP, and attempts to quit more thoroughly with Model 3, I was also unable to uncover a significant relationship between the targeted variables. However, this study has limitations that could have significantly affected the results. The most important limitation is that only cross-sectional data were used; therefore, any differences in whether or not smoking respondents from a certain state tried to quit could be misattributed to the TPP. In addition, the BRFSS only includes current smokers in the stopsmk2 variable. As a result, the data did not include respondents who may have successfully quit smoking within the past 12 months and may no longer consider themselves smokers. This could account for why the models show a small negative relationship between TPP and attempts to stop. Also, the stopsmk2 variable is binary and therefore does not capture as much of the potential impact of excise taxes on quitting as a continuous variable that counts attempts and successful quitters would. Because those additional attempts and successful quitters are missing from the data, the estimates are likely negatively biased: the current models show a more negative relationship between TPP and attempts to quit than they would if those observations were included.
Lastly, the binary variable stopsmk2 includes both casual and regular smokers. This could bias the estimates because casual smokers may find it less arduous to quit smoking while heavy smokers struggle to a much greater extent. Eliminating casual smokers from the data would positively bias the estimated models.

Works Cited

Humphreys, K. (2015, January 14). Why the wealthy stopped smoking, but the poor didn’t. The Washington Post. Retrieved from https://www.washingtonpost.com/news/wonk/wp/2015/01/14/why-the-wealthy-stopped-smoking-but-the-poor-didnt/?utm_term=.68ff4a2acabb

Map of Cigarette Excise Taxes & Price Per Pack. (n.d.). Retrieved from https://www.salestaxhandbook.com/cigarette-tax-map

Park, E., Park, S., Cho, S., Kim, Y., Seo, H. G., Driezen, P., . . . Fong, G. T. (2015, July 01). What cigarette price is required for smokers to attempt to quit smoking? Findings from the ITC Korea Waves 2 and 3 Survey. Retrieved from http://tobaccocontrol.bmj.com/content/24/Suppl_3/iii48?utm_source=TrendMD&utm_medium=cpc&utm_campaign=TC_TrendMD-0

Smoking & Tobacco Use. (2018, February 15). Retrieved from https://www.cdc.gov/tobacco/data_statistics/fact_sheets/adult_data/cig_smoking/index.htm

State Tobacco Activities Tracking and Evaluation (STATE) System. (2017, September 19). Retrieved from https://www.cdc.gov/statesystem/cigaretteuseadult.html

Time to quit. (2015, July 11). The Economist. Retrieved from https://www.economist.com/news/international/21657383-even-though-it-clear-how-get-people-stop-smoking-rates-are-still-rising-many