1 Introduction

The role of policy analysis is to connect research with policy. Because of heavy time constraints, policy analyses are typically ambiguous regarding the details of how the analysis was carried out. This creates three problems: (i) it is hard to understand the connection between research and policy, (ii) it allows policy makers to cherry-pick policy reports, and (iii) it hinders systematic improvement and/or automation of parts of the analysis. In this document we demonstrate the use of a reproducible workflow to reduce the ambiguity in policy analysis.

Here we attempt to contribute to the policy discussion of the minimum wage. The minimum wage is a contentious policy issue in the US. Increasing it has positive and negative effects that different policymakers value differently. We aim to add clarity on what those effects are, how much we know about them, and how those effects vary when elements of the analysis change. We select the most up-to-date, non-partisan policy analysis of the effects of raising the minimum wage, and build an open-source reproducible analysis on top of it.

In 2014 the Congressional Budget Office (CBO) published the report titled “The Effects of a Minimum-Wage Increase on Employment and Family Income”. The report received wide attention from key stakeholders and has been used extensively as an input in the debate around the minimum wage. To date we consider the CBO report to be the best non-partisan estimation of the effects of raising the minimum wage at the federal level. Although there was disagreement among experts around some technical issues, this disagreement has been mainly circumscribed to one of the many inputs used in the analysis, and we can fit the opposing positions into our framework.

Our purposes are twofold: first, to promote the technical discussion around a recurrent policy issue (the minimum wage) by making explicit and visible all the components and key assumptions of its most up-to-date official policy analysis; second, to demonstrate how new scientific practices of transparency and reproducibility (T & R) can be applied to policy analysis. We encourage the reader to collaborate on this document and help develop an ever-improving version of the important policy estimates[^3] (re)produced here.

To achieve our goal we reviewed the CBO report and extracted the key components of its analysis. We adapt new guidelines proposed by the scientific community (TOP) to policy analysis. Under these guidelines, an analysis achieves the highest standards of transparency and reproducibility (T & R) when the data, methods, and workflow are completely reproducible, and every part of the analysis, including its assumptions, is easily readable. We also benefit from hindsight and structure this document around the costs and benefits most discussed in the policy debate.

CBO’s report, in its original form, already represents a significant improvement in T & R relative to the standard practice of policy analysis. The report contains most of the components required for a full reproduction. We add the missing components, make assumptions explicit when needed, and complement the narrative explanations with mathematical formulae, visualizations, and the analytical code used throughout the replication.

Important Note:

Although our aim is to translate practices of T & R from science to policy analysis, we need to highlight an important difference regarding reproducibility between the two. A scientific report takes the form of a peer-reviewed publication that represents several months or years of research, followed by a review process that can be as lengthy as the research itself. For this reason, when a scientific publication is subjected to replication, it is expected to succeed. Policy analysis is usually performed under tight deadlines, and it is not unusual for it to rely on arbitrary assumptions and/or irreproducible calculations. For these reasons we do not attempt to replicate the CBO report as a way of testing the veracity of the analysis. We use reproducibility, paired with full transparency, to generate a living document that represents the best policy analysis to date. Our expectation is that this living document will serve as a building block to discuss and incorporate incremental improvements to the policy analysis used to inform the debate around the minimum wage.

The CBO report describes three policy estimates: the effect of raising the minimum wage on the income of families with members who receive a raise, the effect on the income of families with members who lose their jobs, and the distribution of the losses in the economy used to pay for the raise in the minimum wage. All the policy estimates to replicate are presented in the following tables.

Note on the code languages (R and Stata): The analysis can be replicated using either language, but only R provides the one-click workflow. For Stata the reader has to copy and paste the scripts sequentially or execute this do file.

Policy estimates in CBO report: Overall effects

                                  Estimate
Wage gains (billions of $)        31
Wage losses (billions of $)       ~5
Balance losses (billions of $)    ~24
Net effect (billions of $)        2
Wage gainers (millions)           16.5
Wage losers (millions)            0.5

Policy estimates in CBO report: Distributional effects across poverty lines (PL)

                                  <1PL    [1PL, 3PL)   [3PL, 6PL)   >6PL
Balance losses (billions of $)    ~0.3    ~3.4         ~3.4         ~17
Net effect (billions of $)        5       12           2            -17

In this companion we attempt to reproduce all the policy estimates of tables 1 and 2, and walk the reader through all the details behind them. At a high level, two key components need to be computed: employment effects and distributional effects.

2 Employment effects

At a general level, the effects on employment (\(\widehat{\Delta E}\)) will be calculated using a more detailed version of the following equation:

\[ \begin{equation} \widehat{\Delta E} = N \times \eta \times \% \Delta w + \text{Other factors} \label{eq:1} \tag{1} \end{equation} \]

Where \(N\) represents the relevant population, \(\eta\) the elasticity of labor demand, \(\% \Delta w\) the relevant percentage variation in wages, and the Other factors term encapsulates effects on employment that operate through an increase in aggregate demand.
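As a preview, equation \(\eqref{eq:1}\) can be sketched as a one-line function. The values in the example are illustrative placeholders only; each input is estimated in the sections that follow.

# Sketch of equation (1). All inputs are placeholders, not CBO estimates:
# N (millions of workers), eta, and the percentage wage variation are
# computed in sections 2.2 and 2.3; other factors in section 2.4.
delta.e.sketch.f <- function(N, eta, pct.delta.w, other.factors = 0) {
  N * eta * pct.delta.w + other.factors
}
# Illustrative call (made-up inputs, result in millions of jobs):
# delta.e.sketch.f(N = 19, eta = -0.3, pct.delta.w = 0.15, other.factors = 0.04)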

We first describe some general methodological considerations and then describe how to compute each component from equation \(\eqref{eq:1}\).

2.1 Data, wages, and forecast

We describe the data used, the chosen wage variable, and the procedure used to forecast the wage and population distribution of 2016 using data from 2013.

2.1.1 Data

The Current Population Survey (CPS) was used to compute the effects on employment. From the analysis described in the section on distributional effects of the original report, we can deduce that the data correspond to the Outgoing Rotation Group (ORG). The CPS is a monthly cross-sectional survey. The same individual is interviewed eight times over a 16-month period: interviews take place in the first and last four months of that period. In the 4th and 8th interviews (the outgoing rotations), individuals are asked detailed questions about earnings. The CPS ORG file contains the information from these interviews for a given year. We analyze the data for 2013.

Currently three versions of these data sets can be found online: the CPS raw files, the NBER ORG, and the CEPR ORG. The analysis will be performed using the CEPR ORG database.

The weights used in our analysis will be orgwgt/12.

2.1.1.1 Code to load the data

call.cps.org.data <- function(){
  data_use <- "CPER_ORG"
  # Using CEPR ORG data
  if (data_use == "CPER_ORG") {
  # Checking if working directory contains data, download if not.
    if ( !("cepr_org_2013.dta" %in% dir(path = "./rawdata/") ) ) {
        # create name of file to store data
        tf <- "cepr_org_2013.zip"

        # download the CPS repwgts zipped file to the local computer
        download.file(url =  "http://ceprdata.org/wp-content/cps/data/cepr_org_2013.zip", tf , mode = "wb" )

        # unzip the file's contents and store the file name within the temporary directory
        fn <- unzip(exdir = "./rawdata/", zipfile = tf , overwrite = T )
    }
    df <- haven::read_dta("./rawdata/cepr_org_2013.dta")
  }

  # Using NBER ORG data
  if (data_use == "NBER_ORG") {
    # Checking if working directory contains data, download if not.
    if ( !("morg13.dta" %in% dir()) ) {
      # Downloading data 53mb
      df <- haven::read_dta("http://www.nber.org/morg/annual/morg13.dta")
    }
    df <- haven::read_dta("morg13.dta")
  }

  df <- tbl_df(df)

  # There are 1,293 cases with missing values for the weights. I delete them from the data.
  df <- df %>% filter(!is.na(orgwgt))
  df$final_weights <- df$orgwgt/12
  df$lfstat <- as.character(as_factor(df$lfstat))
  return(df)
}

df <- call.cps.org.data()

2.1.2 Wage variable

We assume no further adjustments like imputation for top coding, trimming, excluding overtime/commissions, or imputation of usual hours for “hours vary” respondents. The CEPR ORG data includes several wage variables (described here). The wage variable that best matches the description above is wage3. This variable measures earnings for hourly workers (excluding overtime, tips, commissions, and bonuses, or “otc”) and non-hourly workers (including otc). According to CEPR, it “…attempts to match the NBER’s recommendation for the most consistent hourly wage series from 1979 to the present”.

2.1.3 Note on CBO’s Wage Adjustment

In the original report from CBO, an adjustment was made to the wage of all the workers who did not report an hourly wage (for them, wage3 is estimated as usual salary per self-reported pay period over usual hours per pay period). The goal was to reduce the measurement error in those wages, following the methodology proposed in this paper: the adjusted wage is computed as a weighted average of the original wage and the average wage of workers with similar characteristics.

\[ \begin{equation} w_{ig} = \alpha w^{raw}_{ig} + (1 - \alpha) \overline{w^{raw}_{g}} \\ \text{with: } \quad \overline{w^{raw}_{g}} = \frac{\sum_{i \in g} w^{raw}_{ig} }{N_{g}} \label{eq:2} \tag{2} \end{equation} \]

To implement such an adjustment in this dynamic document, we require information from CBO on \(\alpha\) and on the definition of the groups \(g\) in \(\eqref{eq:2}\).
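If CBO were to provide those inputs, the adjustment could be implemented along the following lines. This is a minimal sketch: the value \(\alpha = 0.5\) and the use of state as the grouping variable are hypothetical placeholders, not CBO's actual choices.

# Hypothetical implementation of equation (2).
# ASSUMPTIONS: alpha = 0.5 and grouping by state are placeholders until CBO
# provides the actual values; in CBO's version the adjustment applies only
# to workers who did not report an hourly wage.
adjust.wage.eq2.f <- function(df.temp = df, alpha = 0.5, group.var = "state") {
  df.temp %>%
    group_by(.data[[group.var]]) %>%
    mutate("wage3.adj" = alpha * wage3 +
             (1 - alpha) * weighted.mean(wage3, w = final_weights, na.rm = TRUE)) %>%
    ungroup()
}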

2.1.4 Wage forecast

To simulate the policy effects we need the distribution of wages and employment under the status quo. From the perspective of 2013, this implies forecasting the employment and wage data available in 2013 out to 2016.

We forecast the wage distribution from 2013 to 2016 in the following way:

2.1.4.1 Growth adjustments

We assume that the growth forecasts were taken from CBO's 10-Year Economic Projections (this website). Annualized growth rates for the number of workers, \(g_{workers}\), and for the nominal wage per worker, \(g_{wages}\), were computed as follows:

\[ \begin{aligned} \widehat{ g_{workers} } &= \left[ \frac{\widehat{ N_{workers}^{2016} } }{N_{workers}^{2013}} \right]^{1/3}- 1 \\ \widehat{ g_{wages} } &= \left[ \frac{\widehat{ Wages^{2016} } / \widehat{ N_{workers}^{2016} } }{Wages^{2013} / N_{workers}^{2013}} \right]^{1/3} - 1 \end{aligned} \]

The original report assumes higher wage growth for high wages than for low wages. To create different rates of growth in wages, we compute a different wage growth rate for each decile of the wage distribution. The increments across deciles were constant and were set so that, after the minimum wage adjustment, the lowest decile has an annualized growth rate of 2.9%.

The adjustment to the number of workers was made through the weight variable final_weights (multiplying it by the corresponding growth factor), whereas the wage3 variable was multiplied by the forecast growth factor of per-worker wages.
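Schematically, and setting aside the decile-specific rates for a moment, the forecast step amounts to the sketch below; the actual implementation (section 2.1.4.4) applies decile-specific wage growth and the 2016 state minimum wage floor.

# Simplified sketch of the forecast step described above, using a single
# uniform growth rate for all workers. g.workers and g.wages are the
# annualized growth rates computed below; the exponent 3 covers 2013-2016.
forecast.sketch.f <- function(df.temp, g.workers, g.wages) {
  df.temp %>%
    mutate("final_weights" = final_weights * (1 + g.workers)^3,
           "wages.2016"    = wage3 * (1 + g.wages)^3)
}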

2.1.4.1.1 Code to get economic growth forecasts
dwld_and_write_d1 <- function()  {
    # Check if csv file is in local machine. Download if not.
    if ( !("CBO_forec_in2013.csv" %in% dir(path = "./rawdata/") ) ) {
    # download data and read from xls (originally downloaded from CBO, then stored in dataverse for stability)
    download.file(url = "https://www.cbo.gov/sites/default/files/51135-2013-02-EconomicProjections.xls",
                      destfile = "forec_in2013.xls", mode="wb")

    # import from xls to data frame
    trends.df <- rio::import( "forec_in2013.xls" , sheet= "2. Calendar Year")

    # Save as csv in raw data.
    write_csv(trends.df, path = "rawdata/CBO_forec_in2013.csv")
  }
 return(read_csv(file = "rawdata/CBO_forec_in2013.csv"))
}

select_vals <- function(trends.df=cbo_forecast) {
  # Get columns with the projections for years 2012 through 2019
  sel.col <- which(trends.df==2012, arr.ind = TRUE)[2]
  sel.col <- sel.col:(sel.col+7)
  # Get row with all projections for wages and salaries in billions of (nominal) dollars
  # Note: the excel file always contains two rows with the words "wage[s]" and "salar[ies|y]",
  # we are looking for the second one (corresponding to Wages and Salaries under Income)
  sel.row1 <- unique(
                      apply(trends.df,
                            2, function(x)  grep("Wage.*Salar.*", x ) )
                      )[[2]]
  sel.row1 <- sel.row1[2]

  # Get row with all projections for number people employed (in millions)
  sel.row2 <- which(trends.df=="Employment, Civilian, 16 Years or Older (Household Survey)", arr.ind = TRUE)[1]
  # Get Price Index. CBO uses Personal Consumption Expenditures (PCE)
  sel.row3 <- unique(
                      apply(trends.df,
                            2, function(x)  grep("Price Index, Personal Consumption", x ) )
                    )[[2]]

  #Keep only rows and columns identified above
  clean.df <- trends.df[c(sel.row1, sel.row2, sel.row3) , sel.col]
  return(clean.df)
 }  

compute_rates <- function(var.df) {
  # Compute growth rates.

  #Labeling and formatting
  colnames(var.df) <- 2012:(2012+7)
  var.df <- apply(var.df, 2, as.numeric)
  row.names(var.df) <- c( "wages(total)", "workers",  "Personal CPI")

  #Generate wage and non-wage income per worker
  var.df <- rbind( var.df ,  
                      (var.df["wages(total)", ] * 1e9 ) / ( var.df["workers", ] * 1e6) )

  # "Price Index, Personal Consumption" = "Personal CPI"
  row.names(var.df) <- c("wages(total)", "workers",  "Personal CPI",
                         "wages per worker")
  # Transpose the data
  var.df <- t(var.df)

  # Define a new data set with the annual growth rate of each variable over time
  return( var.df/lag(var.df,1) - 1 )
  }

# Growth rate over several years. This function will be called several times down below.
gr.factor <- function(var1, init.year, last.year) {
# Compute the compounded growth factor for a given variable in a time interval
# For example growth factor between years 1,2 and 3 will be:
# (1+growth_rate_yr1) * (1+growth_rate_yr2) * (1+growth_rate_yr3)
    if (init.year == 2012) {init.year <- 2013}
    if (is.null(dim( growth.df[as.character(init.year:last.year),] ) )) {
      rates <- growth.df[as.character(init.year:last.year),] + 1
    } else {
      rates <- apply( growth.df[as.character(init.year:last.year),] + 1, 2, prod )
    }
    return(rates[var1])
}             

trends.df <- dwld_and_write_d1()    
clean.df <- select_vals(trends.df)
growth.df <- compute_rates(clean.df)

#gr.factor("wages per worker", 2014, 2016)

2.1.4.2 ACA adjustments

In forecasting the wages, the CBO report describes that some wages will be reduced in response to the individual mandate from the Affordable Care Act. Given that no details of the adjustments are described in the report, we do not incorporate them in this dynamic document.

2.1.4.3 State level minimum wage adjustments

CBO had to predict the future changes in the state level minimum wages. We use the actual values implemented by each state. The data comes from the Department of Labor (here).

Whenever the predicted wages were below the 2016 federal minimum wage they were replace by it.

Important assumption: when imputing state level min wages, we assume that no effects on employment where incorporated.

2.1.4.3.1 Code to get minimum wage values by state
# Minimum wage by state: download, write and read data
dwld_and_write_d2 <- function() {
  # Check if data is in local machine and download if not.
  if ( !("state_mw.csv" %in% dir(path = "./rawdata/") ) ) {
    fileURL <- "https://www.dol.gov/whd/state/stateMinWageHis.htm"
    #load webpage
    webpage <- read_html(fileURL)
    #identifying all tables in webpage
    tbls <- html_nodes(webpage, "table")
    #extracting tables
    aux1 <- html_table(tbls, fill = TRUE)
    #table 4 and 5 contain the relevant years
    df1 <- aux1[[4]]
    df2 <- aux1[[5]][,1:5]
    df3 <- left_join(df1, df2, "State or other\r\n      jurisdiction")
    colnames(df3) <- c("State", paste("y", 2007:2017, sep = ""))
    write_csv(df3, path = "rawdata/state_mw.csv")
  }
  return(read_csv(file = "rawdata/state_mw.csv"))
}   

# Clean data
clean_d2 <- function(df1 = state_mw_raw) {
  # Check if raw data is newer than clean data, or if clean data doesn't exist
  a <- file.info("./rawdata/state_mw.csv")$ctime
  b <- file.info("./data/state_mw_clean.csv")$ctime
  if (a > b | is.na(b))  {
    # remove states long name and add codes
    # IMPROVE: This section should be replaced with a merge.
    df1 <- df1 %>% filter(State != "Guam" & State != "Puerto \r\n      Rico" & State != "U.S. \r\n      Virgin Islands" )
    df1$states <- c("Federal","AL","AK","AZ","AR","CA","CO","CT",
                                "DE","FL","GA","HI","ID","IL","IN","IA","KS","KY",
                                "LA","ME","MD","MA","MI","MN","MS","MO",
                                "MT","NE","NV","NH","NJ","NM","NY","NC",
                                "ND","OH","OK","OR","PA","RI","SC","SD",
                                "TN","TX","UT","VT","VA","WA","WV","WI",
                                "WY","DC")

    df1$states_codes_cps <- c(NA, 63, 94, 86, 71, 93, 84, 16, 51, 59, 58, 95, 82,
                              33, 32, 42, 47, 61, 72, 11, 52, 14, 34, 41, 64, 43,
                              81, 46, 88, 12, 22, 85, 21, 56, 44, 31, 73, 92, 23,
                              15, 57, 45, 62, 74, 87, 13, 54, 91, 55, 35, 83, 53)                                         
    df1 <- df1[,-which(colnames(df1)=="State")]
    df2 <- df1 %>% select(-c("states", "states_codes_cps"))

    # removing text from numeric values of MW
    for (year in paste("y", 2007:2017, sep = "")) {
      aux2 <- sapply(df2[, year], function(x) gsub("[^\\d |^\\. |-]+", "", x , perl = TRUE))
      df2[, year] <- unlist(lapply(str_split(aux2, "-"), function(x) max(as.numeric(x))))
      # Some states report a min wage below the federal one, so we replace
      fed_min <- as.numeric(df2[1, year])
      df2[, year] <- replace(df2[, year],
                             df2[, year] < fed_min | is.na(df2[, year]),
                             fed_min)    
      }

    df2 <- cbind("state" = df1$states,"state_co" = df1$states_codes_cps, df2)
    write_csv(df2, path = "data/state_mw_clean.csv")
  }
  return(read_csv(file = "data/state_mw_clean.csv"))
}

# Get raw data on MW
state_mw_raw <- dwld_and_write_d2()
# Get clean data on MW
state_min_w <- clean_d2(df1 = state_mw_raw)
# Export MW data to Stata
write.dta(state_min_w, "state_min_w.dta")

2.1.4.4 Code to forecast wages and workers

#GENERAL NOTE FOR THIS SECTION: the analysis performed here is the same as the one with CPS ASEC, so it would be better to wrap it in a function that is called twice. This way I can make sure that the sensitivity analysis works everywhere.

#Wage adjustment

#CBO mentions that the lowest decile gets a 2.9% growth in annual wage.
#I compute the annualized growth rate of wages and create 10 bins of wage growth
#starting at 2.4%, then adjust by minimum wages of 2016 and get an annualized
#growth of 2.9% for the lowest decile.

#THESE TWO FUNCTIONS ARE DIFFERENT BETWEEN ASEC AND ORG
### Compute annual growth rate of wages from 2013 to 2016                         
wage.gr.f <- function(SA.wage.gr = param.wage.gr) {
  ( ( gr.factor("wages per worker", 2014, 2016) )^(1/3) - 1 ) * SA.wage.gr
}
### Compute annual growth rate of workers from 2013 to 2016
workers.gr.f <- function(SA.worker.gr = param.worker.gr) {
  ( ( gr.factor("workers", 2014, 2016) )^(1/3) - 1 ) * SA.worker.gr
}

# SAME AS ASEC
# Compute the gap between average wage growth and the growth of the lowest decile (assumed by CBO)
half.gap.f <- function(SA.wage.gr = param.wage.gr,
                       SA.base.growth = param.base.growth) {
  wage.gr.f(SA.wage.gr) - SA.base.growth
}

# 10 wage growth bins starting from lowest assumed value to 2 times the gap computed above
wage.gr.bins.f <- function( SA.base.growth = param.base.growth,
                            SA.wage.gr = param.wage.gr) {
  bins <- seq(SA.base.growth, wage.gr.f(SA.wage.gr) +
        half.gap.f(SA.wage.gr,SA.base.growth), length.out = 10)
  return(bins)
}

# CAUTION: DO NOT apply 'ntile()' fn from dplyr as it will split ties differently than 'cut()' and results will not
# be comparable to Stata.

# Here we adjust min wages
# SAME
### Apply differential growth rate to wages (by deciles)
wages.final.cps.org.f <- function(df.temp = df,
                                  wage.gr.bins.temp = wage.gr.bins,
                                  SA.states.raise = param.states.raise,
                                  SA.wages = param.wages) {
    # Compute deciles of wages
    aux.var  <- wtd.quantile(x = df.temp$wage3,
                             probs = 1:9/10, weights = df.temp$final_weights)
    # Get state min wages for 2016 and 2013
    st.minws <- state_min_w %>%
      select( c( "state_co","y2013" ,"y2016", "state") )%>%
      rename(state_la = state, state = state_co)
    # prepare state key for merge
    df.temp <- df.temp %>%
      mutate(state_la = as.character(as_factor(state)),
             state = as.numeric(state))  
    # merge with state MW
    df.temp <- left_join(df.temp, st.minws, "state")
    # Create decile categories, apply wage growth, and raise the wage if the new wage is below the 2016 state minimum wage
    df.temp <- df.temp %>%
        mutate( "w3.deciles" = cut(wage3, c(0, aux.var, Inf),
                                  right = TRUE, include.lowest = TRUE) ,  
                "w3.adj1" =  wage3 * ( 1 + wage.gr.bins.temp[w3.deciles] )^3,
                "wages.final" = ifelse(w3.adj1> y2016 * SA.states.raise,
                                  w3.adj1,
                                  y2016 * SA.states.raise) * SA.wages  )
    return(df.temp)
}

wage.gr <- wage.gr.f()
workers.gr <- workers.gr.f()  
half.gap <- half.gap.f()
wage.gr.bins <- wage.gr.bins.f()
df <- wages.final.cps.org.f()

2.2 Get the \(N\)

2.2.1 Identify the relevant universe

According to the CPS data, the population of working age in 2013 was 245.7 million*. Of those, 143.9 million were working, 11.5 million were unemployed, and 90.3 million were not in the labor force (NILF).

Among those employed, 129.2 million workers receive a salary (not self-employed or self-incorporated). A small number of salary workers (0.2 million) did not report any wages and were excluded from the sample. Of the employed salary workers, 53.2 million did not report an hourly wage, and their wage was computed as reported earnings per pay period divided by reported hours in that pay period. However, 3 million workers from this group reported having varying hours; their wages could not be calculated and they were also excluded from the sample. As a result, the final number of workers for whom a rise in the minimum wage can have a direct effect is 126 million (= 129.2 - 0.2 - 3); this is our universe of interest. Figure 1 presents a visual representation of all these populations.

We now compute some descriptive statistics of the labor force in 2013, and the distribution of wages of the universe of interest both in 2013 and as predicted for 2016.

Define variable that tags population of interest

2.2.1.1 Statistics and code behind figure 1

# Tag population of interest:
# Employed &
#   (not selfincorporated | not selfemployed) &
#   ( (not paid by the hour & (hours do not vary or missing in hours vary)) | paid by the hour ) &
#   wages are not zero and not missing
get.pop.int <- function(df.temp=df) {
  df.temp <- df.temp %>% mutate("pop_of_int" = (empl == 1 &
               (selfinc == 0 & selfemp == 0) &
               ( (paidhre == 0 & ( hrsvary != 1 | is.na(hrsvary) )  )  |
                   paidhre == 1 )  &
               !(wage3 == 0 | is.na(wage3) ) ) )
  return(df.temp)
}

# Tables 1 - 4 were constructed to look at the data. Only table 4 is shown in the final output
# Table 1: Describing population of interest (weighted and unweighted)
# To compute the new total of workers we multiply the original weights by the growth factor.
f_table1 <- function(var_name, title) {
  var1 <- dplyr::enquo(var_name)
  table_1 <- df  %>%
     summarise("(1) Total" =
                 sum(!!var1, na.rm = TRUE),
               "(2) Employed" =
                 sum( !!var1 * (empl == 1), na.rm = TRUE),
               "(3) Salary (among employed)" =
                 sum(!!var1 * (empl == 1 &      #Salary worker if
                 (selfinc == 0 & selfemp == 0))        #not self employed or     
                  , na.rm = TRUE),                     #self incorp.

               "(4) Not Paid hourly (among salary)" =
                 sum(!!var1 * (empl == 1 &       # Not paid hourly if salary and
                 (selfinc == 0 & selfemp == 0) &        # not paid hourly
                 (paidhre == 0 | is.na(paidhre) )), na.rm = TRUE),

               "(5) Hours Vary (among not paid hourly)" =
                 sum(!!var1 * (empl == 1 &       #Hours vary if not paid hourly and
                 (selfinc == 0 & selfemp == 0) &        #hours vary
                 (paidhre == 0 | is.na(paidhre) ) & hrsvary == 1), na.rm = TRUE),

               "(6) No wage (in (3) but not in (5))" =
                 sum(!!var1 * ( (empl == 1 & selfinc == 0 & selfemp == 0) &
                     (paidhre == 1 | hrsvary != 1 | is.na(hrsvary) ) &
                     (wage3==0 | is.na(wage3) ) ) , na.rm = TRUE),

               "Population of Interest = (3) - (5) - (6)" =
                sum(!!var1 * (empl == 1 & (selfinc ==0 & selfemp == 0) &
                      ( (paidhre == 0 & hrsvary != 1) | paidhre ==1 )  &
                      wage3 != 0) , na.rm = TRUE)
               )

  table_1 <- t(table_1)
  table_1 <- format(table_1, big.mark = ",", digits = 0, scientific = FALSE)
  colnames(table_1) <- title
  return((table_1))
}

# Table 2: summary statistics of wage variable (levels)
#Summary stats
sum.stas1 <- function(x, wt) {
   c( "mean" = weighted.mean(x,w = wt, na.rm = TRUE),
      "sd" = sqrt( wtd.var(x, weights = wt) ) ,
      "median" = wtd.quantile( x, weights = wt, prob = c(.5)) ,
                wtd.quantile( x, weights = wt, prob = c(.1, .9) ) )
}
f_table2 <- function() {
  table_2 <- df %>%
  filter(pop_of_int == 1 & !is.na(wage3)) %>%
    with(sum.stas1(wage3, final_weights))

  table_2 <- cbind(table_2)
  colnames(table_2) <- "Wage"
  return(table_2)
}

# Table 3: share of the population below selected wage levels
f_table3 <- function() {
  table_3 <- df %>%
    filter(pop_of_int == 1 & !is.na(wage3)) %>%
      summarise("< $7.5" = weighted.mean(wage3<7.5,w = final_weights),
                "< $9" = weighted.mean(wage3<9,w = final_weights),
                "< $10.10" = weighted.mean(wage3<10.10,w = final_weights),
                "< $13" = weighted.mean(wage3<13,w = final_weights),
                "< $15" = weighted.mean(wage3<15,w = final_weights)
                )

  table_3 <- t(table_3)
  colnames(table_3) <- "Perc"
  return(table_3)
}

# Table 4: combination of tables 1 - 3
f_table4 <- function() {
  table_4 <- matrix(NA, 7, 2)
  colnames(table_4)  <- c("2013", "2016: status quo")
  rownames(table_4) <- c("Salary workers",
                         "Median wage",
                         "< 7.5","< 9",
                         "< 10.10", "< 13",
                         "< 15" )

  table_4[1,1] <- table_1[7,1]
  table_4[1,2] <- format(sum(df$final_weights[df$pop_of_int==1] *
                              (1 + workers.gr)^3,
                            na.rm = TRUE), big.mark=",")
  table_4[2,1] <- table_2[3]

  table_4[2,2] <- round( with(df[df$pop_of_int == 1 & !is.na(df$wages.final), ],
                              wtd.quantile( wages.final, weights = final_weights *
                              (1 + workers.gr)^3, prob = c(.5) ) ), digits = 2 )

  table_4[3:7,1]  <- round(as.matrix(table_3), digits = 2)

  aux.1 <- df %>%
    filter(pop_of_int == 1 & !is.na(wages.final)) %>%
      summarise("< 7.5" = weighted.mean(wages.final<7.5,
                                          w = final_weights * (1 + workers.gr)^3),
                "< 9" = weighted.mean(wages.final<9,
                                       w = final_weights * (1 + workers.gr)^3),
                "< 10.10" = weighted.mean(wages.final<10.10,
                                           w = final_weights * (1 + workers.gr)^3),
                "< 13" = weighted.mean(wages.final<13,
                                        w = final_weights * (1 + workers.gr)^3),
                "< 15" = weighted.mean(wages.final<15,
                                        w = final_weights * (1 + workers.gr)^3)
                )

  table_4[3:7,2] <- round( as.matrix(aux.1), digits = 2 )
  return(table_4)
}

### Build first treemap
#if (!(length(dev.list()) == 0)) { dev.off() }
#x11()
### Get data for treemap 1
data_for_treemap1 <- function() {
  universe.1 <- df %>%
          mutate("teen" = ifelse(age<20, "teen", "adult"),
                 "selfemp_inc" = 1 * (selfemp == 1 | selfinc == 1),
                 "pop_of_int" = 1 * pop_of_int ) %>%  
          group_by(lfstat, selfemp_inc, teen, pop_of_int)  %>%
          summarise("total" = sum(final_weights, na.rm = TRUE))

  universe.1$selfemp_inc[universe.1$lfstat!="Employed"] = NA
  universe.1[universe.1$lfstat!="Employed" | universe.1$selfemp_inc==1, c("teen", "pop_of_int")] = NA
  universe.1$selfemp_inc[universe.1$selfemp_inc==0]  <- "salary"
  universe.1$selfemp_inc[universe.1$selfemp_inc==1]  <- "self employed or self incorporated"
  #universe.1$pop_of_int <- with(universe.1, ifelse(pop_of_int==1,"included", "excluded"))
  return(universe.1)
}

### Plot treemap with distribution of labor force in 2013
treemap.1 <- function(){
  invisible(
  treemap(universe.1,
    index=c("lfstat", "selfemp_inc", "teen"),
    vSize=c("total"),
    range = c(7, 15),
    type="index",
    algorithm="pivotSize",
    fontsize.labels = c(12:8),
    border.col = c("#FFFFFF", "#000000","#000000"),
    aspRatio= 1.5,
    palette = c("#D3D3D3"),
    title.legend="number of employees",
    fontface.labels  = c(3,2,1),
    align.labels=list(c("left", "top"), c("right", "top"), c("right", "bottom") ),
    bg.labels = 1,
    title = "Figure 1: Distribution of population of working age in 2013"
  ))
}

df <- get.pop.int()
table_1 <- cbind(f_table1(final_weights, "N"),
                 f_table1(!is.na(final_weights), "Unweighted"))
table_2 <- f_table2()
table_3 <- f_table3()
table_4 <- f_table4()

For the universe of interest (employed, salaried, with hourly wages or non-varying hours; N = 126 million), we describe the distribution of hourly wages in 2013 and the forecast values for 2016 (Figure 2).

2.2.1.2 Statistics and code behind figure 2

# Plot with distribution of wages under the status quo in 2013 and 2016
two_dist <- function() {
  df  %>%
    filter(pop_of_int==1 & wage3<=200)  %>%
    select(wage3, wages.final, final_weights) %>%
    melt(measure.vars = c("wage3", "wages.final"), id.vars = "final_weights") %>%
    mutate("final_weights" = ifelse(variable=="wages.final",
                                    final_weights * (1 + workers.gr)^3,
                                    final_weights) ) %>%
    ggplot() +
      geom_density(aes(x = value,
                       fill=variable,
                       weight = final_weights,
                       alpha = 1/2,
                       colour=variable), bw=1, kernel = "gau") +
      geom_vline(xintercept = c(7.25, 10.10, 11.5), col="blue") +
      coord_cartesian(xlim = c(0,20)) +   
      guides(alpha = "none", colour="none") +
      scale_x_discrete(limits = c(0,7.5, 10.10, 11.5, 20)) +
      labs(y = NULL,
           x = "Wage" ,
           title = "Figure 2: Distribution of wages in 2013 and 2016(forecast)")+
      theme(axis.ticks = element_blank(), axis.text.y = element_blank()) +
      theme(legend.justification=c(0,1),
            legend.position=c(0,1),
            legend.background = element_rect(fill = "transparent",
                                             colour = "transparent") )+
      scale_fill_discrete(name=NULL,
                           labels=c("2013", "2016 (Forecast)"))
}
p <- two_dist()

Table 1 below presents more detailed statistics of the wage distribution for 2013 and for the forecast wages of 2016.

Comparison of 2013 and 2016 under the status quo

                 2013          2016: status quo
Salary workers   125,992,501   131,946,284
Median wage      16.25         18.44
< 7.5            0.04          0.01
< 9              0.13          0.08
< 10.10          0.23          0.16
< 13             0.36          0.30
< 15             0.43          0.38

Among the population of interest, the employment effects of the minimum wage will be computed separately for adults (\(age\geq 20\)) and teenagers (\(16 \leq age < 20\)). For this purpose we present the wage distribution for both groups in Figure 3.

2.2.1.3 Statistics and code behind figure 3

#####################
#if (!(length(dev.list()) == 0)) { dev.off() }
#x11
### Get data for treemap 2
data_for_treemap2 <- function() {
  universe.1 <- df %>%
          mutate("teen" = ifelse(age<20, "teen", "adult"),
                 "wage_c" = ifelse(wage3<10.10 & !is.na(wage3), "w < 10.10", "w >= 10.10"),
                 "selfemp_inc" = 1 * (selfemp == 1 | selfinc == 1) ) %>%
            group_by(lfstat, selfemp, selfemp_inc, teen, wage_c)  %>%
              summarise("total" = sum(final_weights, na.rm = TRUE))

  universe.1$selfemp_inc[universe.1$lfstat!="Employed"] = NA
  universe.1$teen[universe.1$lfstat!="Employed"] = NA
  universe.1$wage_c[universe.1$selfemp_inc==1 | universe.1$lfstat!="Employed"] = NA

  universe.1$wage_n <- as.numeric(as.factor(universe.1$wage_c))

  universe.1$selfemp_inc[universe.1$selfemp_inc==0]  <- "salary"
  universe.1$selfemp_inc[universe.1$selfemp_inc==1]  <- "self employed or self incorporated"

  universe.1$color <-c("#d95f0e", "#fff7bc")[as.factor(universe.1$wage_c)]
  universe.1$color[is.na(universe.1$color)] <- "#D3D3D3"
  return(universe.1)
}
### Treemap with distribution of labor for and wages
treemap.2 <- function() {  
  invisible(
    treemap(universe.1,
       index=c("lfstat", "selfemp_inc", "teen", "wage_c"),
       vSize=c("total"),
       vColor = c("color"),
       range = c(7, 15),
       type="color",
       aspRatio= 1.5,
       algorithm="pivotSize",
       border.col = c("#FFFFFF", "#000000","#000000","#000000"),
       sortID="-wage_n",
       fontsize.labels = c(12:8),
       #aspRatio= c(4,3),
       title.legend="number of employees",
       align.labels=list(c("left", "top"), c("right", "top"),
                         c("right", "bottom"), c("center", "center")),
       fontface.labels  = c(3,2,1,1),
       bg.labels = 1,
       title = "Figure 3: Figure 1 + proportion of salary workers earning more/less than 10.10",
       lowerbound.cex.labels = 1
    )
  )
}

#WISH LIST: display number of workers below red bars in reactive fashion.
#Maybe not even a plot: only a slider with min wage and a
#reactive box with the number of workers that would be below it.

2.2.2 Identify relevant population

Given our universe, the next step is to identify the population that would actually be affected if a raise in the minimum wage takes place. The two relevant populations to define now are the number of low wage workers (\(\widehat{ N_{lowwage} }\)) and the number of workers who would earn less than the new minimum wage (\(\widehat{ N_{w\leq MW^{1}} }\)). CBO defines a low wage as one below $11.50 per hour in 2016, and the proposed value for the new minimum wage is $10.10 per hour. From now on we separate each of these groups into adults and teenagers, following CBO’s convention.

Three adjustments are applied to the population of workers with wages below the new minimum wage:
1 - Workers whose earnings are mainly from tips are tagged, and a different minimum wage ($2.13) is applied to them.
2 - A fraction \(\alpha_{ 1 }\) of \(\widehat{ N_{ w\leq MW^{1} } }\) is deleted to account for non compliers.
3 - A fraction \(\alpha_{ 2 }\) of \(\widehat{ N_{ w\leq MW^{1} } }\) is deleted to account for workers not subject to the Fair Labor Standards Act.

After these three adjustments, performed over the relevant population forecast for the year 2016, we obtain the final population.

2.2.2.1 Tipped workers

Additional information needed from CBO: which occupations were used to identify tipped workers, and clarification of the conceptual need to adjust for this population (as the lower minimum wage only applies if they make more than $7.50 an hour).

A different minimum wage is applied to workers who receive more than $30 in tips. This was applied to 11 occupations (such as waiter, bartender, and hairdresser; ~10% of low wage workers).

Given that we do not know which 11 categories the report refers to, or which variable defines those categories, we will use the variable peernuot to identify tipped workers. This variable overestimates the number of tipped workers (13% as opposed to the 10% mentioned in the report) because it also includes workers paid overtime or commissions.

Tipped workers with wages below $7.25 are 1% of all tipped workers. Non-tipped workers with wages below $7.25 are 1.6% of the total.
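As a rough check on the figures above, the share of low wage workers flagged by peernuot can be computed directly; under the coding noted in the next subsection (1 = receives overtime/tips/commissions, 2 = does not), this sketch should come close to the 13% just mentioned.

# Share of workers flagged by peernuot among low wage workers (wage <= 11.50).
# Note that peernuot also captures overtime and commissions, not only tips.
f_tipped_share <- function(df.temp = df) {
  df.temp %>%
    filter(pop_of_int == 1 & wage3 <= 11.50) %>%
    summarise("share flagged by peernuot" =
                wtd.mean(peernuot == 1, weights = final_weights))
}
# f_tipped_share()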

2.2.2.2 Non compliance

We estimate the proportion of low wage workers (wage less than $11.50) who earned less than their state's minimum wage in 2013 as a proxy for non compliers under the new minimum wage in 2016. The original report mentions that this comes up to 12% of the low wage population.

Following the original report (footnote 25), we will consider salary workers as non compliers only if their wage is lower than the minimum wage by more than 25 cents for non-tipped workers, or by more than 13 cents for tipped workers.

2.2.2.2.1 Code to compute percentage of non compliers
# Percentage of total workers in 2013 that earn less than their state's minimum wage.
# Universe is defined by "low wage" workers, with wages below $11.50 an hour.
# The variable peernuot seems to be the most appropriate indicator of whether or not a worker receives tips
# 1=YES; 2=NO
# DESCRIBE
f_non_comp_stats <- function(SA.low.wage = param.low.wage){
  df %>%
  select(wage3, state, final_weights, peernuot, pop_of_int, y2013, y2016)  %>%
    filter(pop_of_int == 1 & wage3 <= SA.low.wage * 11.50)  %>%
      summarise(
        "% of non compliers w/o adj" =
          wtd.mean(wage3 < y2013,
                          weights = final_weights),
        "% of non compliers with adj" =
          wtd.mean(wage3 + 0.25 * (peernuot == 2) +
                          0.13 * (peernuot == 1)  < y2013,
                          weights = final_weights)
        )
}

non.comp.stats <- f_non_comp_stats()

2.2.2.3 Not covered that might benefit

Additional information needed from CBO: how they identified non-FLSA eligibility.

  • Include workers not covered by the FLSA but expected to be affected: employees of small firms, workers in occupations generally exempt from the FLSA, and teenagers in their first 90 days of employment.

The estimated percentage of non-compliance is 10.6% of the target population in 2013 (N = 131,946,284).

2.2.3 Summary

Define \(\hat{g(2016|2013)}\) as the growth factor for the population from the year 2013 to 2016 (\(\hat{g(2016|2013)} = (1 + \hat{g})^3\), where \(\hat{g}\) is the annual growth rate of the population). Then:
\[ \begin{aligned} \widehat{ N^{teen}_{final} } &= \hat{N^{teen}_{\hat{w} \leq MW^{1} } } (2016|2013) \times (1 - \hat{ \alpha^{teen}_{1} } - \hat{ \alpha^{teen}_{2} })\\ &= \hat{ g(2016|2013) } \times \hat{ N^{teen}_{employed} } (2013) \times P(\hat{w} \leq MW^{1}|teen) \times (1 - \hat{ \alpha^{teen}_{1} } - \hat{ \alpha^{teen}_{2} }) \end{aligned} \]

Analogously for the adult population:

\[ \begin{aligned} \widehat{ N^{adult}_{final} } &= \hat{ g(2016|2013) } \times \hat{ N^{adult}_{employed} } (2013) \times P(\hat{w} \leq MW^{1}|adult) \times (1 - \hat{ \alpha^{adult}_{1} } - \hat{ \alpha^{adult}_{2} }) \end{aligned} \]

The table below presents the 2013 estimates for each component.

### Table 4 in DD (elements to compute N final)
# TO DO: write this function in a more compact way
N.final.f <- function(df.temp = df, workers.gr.temp = workers.gr,
                      SA.low.wage = param.low.wage){
  aux.1 <-  bind_rows(
    df.temp  %>%
    filter(pop_of_int == 1) %>%
    mutate("adult" = ifelse(age>=20, "Adult", "Teen") )  %>%
    group_by(adult)  %>%
    summarise( "Salary workers ($\\hat{ N_{employed} }$) (millions)" =
                 sum(final_weights)/1e6,
               "Low wage workers ($w \\leq 11.5 p/h$) (millions)" =
                 sum(final_weights * (wages.final <= 11.50 * SA.low.wage))/1e6,
               "% Salary below new MW ($P(\\hat{w} \\leq MW^{1})$)" =
                 wtd.mean( 1 * (wages.final <= 10.10),
                           na.rm = TRUE, weights = final_weights) * 100)  ,   
    df.temp %>%
    filter(pop_of_int == 1) %>%
    summarise( "Salary workers ($\\hat{ N_{employed} }$) (millions)" =
                 sum(final_weights)/1e6,
                "Low wage workers ($w \\leq 11.5 p/h$) (millions)" =
                 sum(final_weights * (wages.final <= 11.50 * SA.low.wage))/1e6,
                "% Salary below new MW ($P(\\hat{w} \\leq MW^{1})$)" =
                 wtd.mean( 1*(wages.final <=10.10),
                           na.rm = TRUE, weights = final_weights) * 100) %>%
    mutate(adult = "Total" )
  )

  #Non compliance (starting from a different denominator: 'pop_of_int == 1 & wage3 < 11.5')
  aux.2 <- bind_rows(
    df.temp %>%
    select(wage3, age, state, final_weights, peernuot, pop_of_int, y2013)  %>%
    filter(pop_of_int == 1 & wage3 <= 11.5 * SA.low.wage) %>%
    mutate("adult" = ifelse(age>=20, "Adult", "Teen") )  %>%
    group_by(adult)  %>%
        summarise(
            "% of non compliers ($\\alpha_{1}$)" =
            wtd.mean(1 * (wage3 + 0.25 * (peernuot == 2) +
                            0.13 * (peernuot == 1)  < y2013 ),
                            weights = final_weights)*100 ),  
    df.temp %>%
    select(wage3, age, state, final_weights, peernuot, pop_of_int, y2013)  %>%
    filter(pop_of_int == 1 & wage3 <= 11.5 * SA.low.wage) %>%
        summarise(
            "% of non compliers ($\\alpha_{1}$)" =
            wtd.mean(1 * (wage3 + 0.25 * (peernuot == 2) +
                            0.13 * (peernuot == 1)  < y2013 ),
                            weights = final_weights)*100 ) %>%
            mutate("adult" = "Total")
  )
  stats2 <- rbind(t(aux.1[,-1 ]), t(aux.2[, -1]) )
  colnames(stats2) <- t(aux.1[,1])
  stats2 <- rbind(stats2, "$\\hat{ g(2016|2013) }$" = (1 + workers.gr.temp)^3 )
  aux.total <- apply(stats2, 2, function(x) x[5] * x[1] * (x[3]/100) * (1 - x[4]/100)  )
  return(rbind(stats2, "$\\widehat{ N_{final} }$ (millions)" = aux.total))
}

#RENAME stats2 with table.n.final
table.n.final <- N.final.f()

# FH: There is a small difference between the overall total and the sum of Teens and Adults:
# table.n.final[6,3] - sum( (table.n.final[6, 1:2])  )
# I have pinned down the source of the problem to differences in the way % is calculated (for groups relative to the total)
Characteristics of target population

                                                        Adult    Teen    Total
Salary workers (\(\hat{ N_{employed} }\)) (millions)    121.69   4.30    125.99
Low wage workers (\(w \leq 11.5 p/h\)) (millions)       26.85    3.73    30.58
% Salary below new MW (\(P(\hat{w} \leq MW^{1})\))      14.41    74.31   16.46
% of non compliers (\(\alpha_{1}\))                     10.31    12.75   10.57
\(\hat{ g(2016|2013) }\)                                1.05     1.05    1.05
\(\widehat{ N_{final} }\) (millions)                    16.47    2.92    19.42

2.3 Get the \(\eta \times \Delta w\)

In order to obtain the elasticity of labor demand that best fits the context analyzed here, two steps were required: (i) identify the best estimate available in the literature; (ii) extrapolate that estimate to fit the specific context of this policy analysis.

2.3.1 Getting the best estimates from the literature

The precise mechanism used by CBO to choose the estimates for the labor demand elasticity is unclear.

We take their estimate as given: -0.1 for the teenage population, with a “likely range” from 0 to -0.2 (“likely range” is used throughout the report to express the expert judgment that the elasticity will be in that range 2/3 of the time)1. The reasons provided in the report for choosing this figure can be summarized in the following three points:
- More weight was given to studies that exploit across-state variation (as opposed to over-time, country-level variation).
- The final estimate takes into account publication bias towards highly negative estimates.
- The magnitude of the increase (39%) and the fact that it would be indexed to inflation going forward make it an unusually large increase in the minimum wage.
With these elements we can write the chosen elasticity from the literature (\(\eta_{lit}\)) as the product of the original assessment of the literature (\(\eta_{lit}^{0}\)), a reduction factor for publication bias (\(F_{pub.bias}<1\)), and an amplification factor for a larger variation in the minimum wage (\(F_{large.variation}>1\)):

\[ \begin{aligned} \eta^{teen}_{lit} &= \eta_{lit}^{0} \times F_{pub.bias} \times F_{large.variation} = -0.1 \end{aligned} \]

CBO provides two additional caveats that could be added to the analysis:
- Ripple effects on employment were assumed to be null as the result of two opposing effects: (i) “ripple wages” would increase unemployment, but (ii) the substitution of marginally more productive workers for laid-off workers below the new minimum wage would decrease unemployment. CBO assumes these effects roughly cancel each other.
- CBO acknowledges that effects could be larger (in absolute value) during recessions, but does not predict a recession for 2016. No estimate is provided of how much larger those effects would be in case of a recession.

2.3.2 Extrapolating research estimate to current context

Three adjustments are proposed: (i) extrapolate elasticity estimates for teenagers to adults; (ii) re-scale the elasticity to the population affected by the new minimum wage; (iii) adjust the elasticity to reflect the average wage variation from an increase in the minimum wage.

2.3.2.1 Extrapolate from teenagers to adult population

Most of the estimates from the literature are for the teenage population. CBO proposes to extrapolate these estimates to the adult population in the following fashion: \[ \begin{aligned} \eta^{adults}_{lit} = \eta^{teens}_{lit} \times F_{extrapolation} \end{aligned} \]

Where a value \(F_{extrapolation} = 1/3\) is chosen to reflect that the demand for adult labor is suspected to be more inelastic than the demand for teen labor.

2.3.2.2 Re-scale elasticities to the population affected by the minimum wage

The literature reports the estimated effect for a given population, \(\eta_{lit}\). This estimate can be seen as the weighted average of the demand elasticities for the directly affected population (\(\eta_{w\leq MW}\)), with wages below the new minimum, and the non-affected population (\(\eta_{w > MW}\)), with wages above the new minimum:

\[ \begin{aligned} \eta^{g}_{lit} =& p^{g}_{w\leq MW} \eta^{g}_{w\leq MW} + (1 - p^{g}_{w\leq MW})\eta^{g}_{w > MW} \hspace{4em} g = \{teens, \, adults \} \end{aligned} \]

The underlying assumption is that \(\eta_{w > MW} = 0\). With this, the proposed adjustment becomes:

\[ \begin{aligned} \eta^{g}_{w\leq MW} =& \frac{\eta^{g}_{lit}}{p^{g}_{w\leq MW}} \hspace{4em} g = \{teens, \, adults \} \end{aligned} \]

The fraction of the population with a forecast wage below the new minimum wage (\(p^{g}_{w\leq MW}\)) is 74.3% for teenagers and 14.4% for adults.
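Plugging these shares and the elasticities from section 2.3.1 into the formula above gives a quick check against the “Components of Elasticities” table reported further below.

# Check of the re-scaling step using the shares reported above:
# eta_lit is -0.1 for teens and -0.1 * 1/3 for adults (section 2.3.2.1).
eta.rescaled.check <- c("Adult" = (-0.1 * 1/3) / 0.144,
                        "Teen"  = -0.1 / 0.743)
# round(eta.rescaled.check, 2)  # -0.23 and -0.13, as in the components table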

2.3.2.3 Adjust elasticities to average wage variation

Given that the percentage variation from the old wage to the new minimum wage differs across wage levels, the total effect should be computed as \(\sum_{b} \% \Delta w_{b} \times \eta^{g}_{w\leq MW} \times N_{b}\) for \(b = 1 \dots B\) wage brackets.

The report approximates this calculation by computing the employment effect for the average wage variation across the total population in age group \(g\) affected by the minimum wage: \(\overline{\%\Delta w^{g}} \times \eta^{g}_{w\leq MW} \times \widehat{ N^{final}_{g} }\).

Finally, CBO argues that as the size of the wage variation changes, the elasticity should be re-scaled to reflect it, with the elasticity resulting in:

\[ \begin{aligned} \widetilde{ \eta^{g}_{w\leq MW} } &= \frac{\eta^{g}_{lit}}{p^{g}_{w\leq MW}} \times \frac{\%\Delta MW}{\overline{\%\Delta w^{g}}} = \eta^{g}_{lit} \times F^{g}_{adjs} \hspace{4em} g = \{teens, \, adults \} \end{aligned} \]

Looking at historical trends in the CPS, CBO estimates that \(F^{g}_{adjs}\) is 4.5 for both populations2. The table “Components of Elasticities” further below summarizes all the elements required to compute \(\widetilde{ \eta^{g}_{w\leq MW} }\).
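With \(F^{g}_{adjs} = 4.5\), the final elasticities can be checked in one line; these values match the last row of the components table.

# Check of the final elasticities using CBO's F_adj = 4.5 for both groups.
F.adj.check <- 4.5
eta.final.check <- c("Adult" = (-0.1 * 1/3) * F.adj.check,   # -0.15
                     "Teen"  = -0.1 * F.adj.check)           # -0.45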

2.4 Other factors

CBO reasons that a rise in the minimum wage would have effects on aggregate consumption, and these in turn would have effects on employment. The overall effect is estimated as an increase in employment of between 30,000 and 50,000 jobs (“a few tens of thousands of jobs”). A narrative argument is provided for the mechanisms behind this effect.

The effects on consumption are separated into direct and indirect.

2.4.1 Direct effects on consumption

  • Job losses \(\Rightarrow\) reduction in consumption.
  • Increased wages \(\Rightarrow\) increase in consumption.
  • Lower profits for business owners and shareholders \(\Rightarrow\) reduction in consumption.
  • Increased prices \(\Rightarrow\) reduction in consumption.

Overall, the direct effect on consumption is estimated[?] to be positive due to the higher marginal propensity to consume of low wage individuals relative to high income ones.

2.4.2 Indirect effects on consumption

  • Increase in consumption \(\Rightarrow\) increase in investment in the future \(\Rightarrow\) increase in consumption in the future.
  • Increase in prices of low-wage-intensive items \(\Rightarrow\) increase in demand for other items \(\Rightarrow\) bottlenecks in those items until firms adjust.

Overall the indirect effect on consumption is estimated to be negative.

2.4.3 Overall effect on consumption and its effect on employment

CBO estimates [?] that the net effect on consumption would be positive and that its effect on employment would be between 30,000 and 50,000 additional jobs in 2016. These effects are estimated for the short run only. The methodology is mentioned to be similar to the one used to assess the American Recovery and Reinvestment Act (found here).

2.4.4 Prevent double counting

The estimated elasticities in the literature already account for approximately 10% of the effects through consumption, so the final effect of consumption here is multiplied by 0.9 to prevent double counting.

\[ \begin{aligned} \widehat{OF} &= 40,000 \times 0.9 \end{aligned} \]

2.5 Computing effects on employment

Putting all these elements together we get: \[ \begin{aligned} \widehat{ \Delta E } &= \sum_{g\in\{A,T\}} \left( \widehat{ N^{final}_{g} } \times \widetilde{ \eta^{g}_{w\leq MW} }\times \overline{\%\Delta w^{g}} \right) + \widehat{OF} \end{aligned} \]

2.5.1 Code to compute each component

#Elasticity from the literature         
eta.lit.f <- function(SA.eta.lit = param.eta.lit) - 0.1 * SA.eta.lit

#Extrapolation factor used by CBO
factor.extrap.f <- function(SA.factor.extrap = param.factor.extrap) 1/3 * SA.factor.extrap

#All remaining components required to compute effect on employment
final.other.comp <- function(df.temp = df) {
  eta.lit <- eta.lit.f()
  factor.extrap <- factor.extrap.f()
  rem.comp <-  df.temp  %>%
    filter(pop_of_int == 1) %>%
    mutate("adult" = ifelse(age>=20, "Adult", "Teen") )  %>%
    group_by(adult)  %>%
    summarise("$\\overline{\\%\\Delta w}$" = wtd.mean( ifelse(wages.final <=10.10,
                                              (10.10 - wages.final)/wages.final, NA) ,
                                       na.rm = TRUE, weights = final_weights) * 100 )

  rem.comp <- rbind("$\\eta_{lit}$" = c( eta.lit * factor.extrap , eta.lit ),
                  "$\\eta_{w \\leq MW'}$" = c( eta.lit * factor.extrap , eta.lit ) /
                    (table.n.final["% Salary below new MW ($P(\\hat{w} \\leq MW^{1})$)",1:2]/100),
                  "$F_{adj}$" = c( 4.5, 4.5 ),
                  t(rem.comp[,-1]) )

  aux.1 <- apply(rem.comp, 2, function(x) x[1] * x[3])
  rem.comp <- rbind(rem.comp, "$\\widetilde{\\eta_{w\\leq MW}}$"= aux.1)

  colnames(rem.comp) <- c("Adult", "Teen")  
  return(rem.comp)
}


# N final = gr * N_employed * % below min wage * compliers
# Adj Fact = 1/(% below min wage) * nominal variation/average variation
# Elasticity = elasticities * Extrap * Adj Fact
# Employment effect = N final * Elasticity * Avg variation

# N final = gr * N_employed * % below min wage * compliers
f_n_final <- function (SA.N = param.N,
                       SA.fract.minwage = param.fract.minwage,
                       SA.noncomp = param.noncomp) {
  (1 + workers.gr)^3 *
  table.n.final["Salary workers ($\\hat{ N_{employed} }$) (millions)",1:2] *  SA.N *
  table.n.final["% Salary below new MW ($P(\\hat{w} \\leq MW^{1})$)",1:2]/100 * SA.fract.minwage *
  ( 1 - SA.noncomp * table.n.final["% of non compliers ($\\alpha_{1}$)",1:2]/100 - 0)
}

# Adj Fact = 1/(% below min wage) * nominal variation/average variation
# adj_fact takes literal version from CBO report (last line: "Teen")
f_adj_fact <- function(SA.fract.minwage = param.fract.minwage,
                       SA.av.wage.var = param.av.wage.var,
                       frac.below.mw = table.n.final["% Salary below new MW ($P(\\hat{w} \\leq MW^{1})$)",1:2]/100,
                       avg.wg.inc = rem.comp["$\\overline{\\%\\Delta w}$",1:2]/100 ) {
  new.mw <- 10.10
  nom.wg.inc <- new.mw/7.25 - 1
  res1 <- ( 1/(frac.below.mw * SA.fract.minwage) )*
      ( nom.wg.inc / ( avg.wg.inc * SA.av.wage.var ) )
  return(res1)
}

# FUNCTION FOR ELASTICITY FINAL
f_elast <- function(SA.eta.lit = param.eta.lit,
                    SA.factor.extrap = param.factor.extrap,
                    SA.F.adj = param.F.adj,
                    var_adj_fact = adj_fact) {
   c( eta.lit.f(SA.eta.lit) * factor.extrap.f(SA.factor.extrap) ,
     eta.lit.f(SA.eta.lit) ) *
    var_adj_fact * SA.F.adj
}

f_delta_e <- function(SA.av.wage.var = param.av.wage.var,
                      SA.other.factors = param.other.factors) {
  # Other factors: jobs added through aggregate demand (in millions of jobs),
  # scaled by 0.9 to prevent double counting (see section 2.4.4)
  other.factors <- 0.05 * 0.9 * SA.other.factors
  sum( n_final * elas_final *
        (rem.comp["$\\overline{\\%\\Delta w}$",1:2]/100) *
        SA.av.wage.var ) + other.factors
}

rem.comp <- final.other.comp()
n_final <- f_n_final()
n_below_mw_teen <- table.n.final["% Salary below new MW ($P(\\hat{w} \\leq MW^{1})$)","Teen"]/100
adj_fact <- f_adj_fact(SA.fract.minwage = param.fract.minwage,
                       SA.av.wage.var = param.av.wage.var,
                       frac.below.mw = n_below_mw_teen,
                       avg.wg.inc = rem.comp["$\\overline{\\%\\Delta w}$","Teen"]/100 ) #1:2 |"Teen"
n_below_mw_both <- table.n.final["% Salary below new MW ($P(\\hat{w} \\leq MW^{1})$)",1:2]/100
adj_fact2 <- f_adj_fact(SA.fract.minwage = param.fract.minwage,
                       SA.av.wage.var = param.av.wage.var,
                       frac.below.mw = n_below_mw_both,
                       avg.wg.inc = rem.comp["$\\overline{\\%\\Delta w}$",1:2]/100 ) #1:2 |"Teen"

#Elasticity = elasticities * Extrap * Adj Fact
elas_final <- f_elast(var_adj_fact = 4.5 * param.F.adj) # adj_fact | 1, 4.5
# Employment effect = N final * Elasticity * Avg variation + other factors
delta.e1 <- f_delta_e()
# -0.5032739

#Following methodology literally
elas_final <- f_elast(var_adj_fact = adj_fact) # adj_fact | 1, 4.5
delta.e2 <- f_delta_e()

#Following methodology in spirit
elas_final <- f_elast(var_adj_fact = adj_fact2) # adj_fact | 1, 4.5
delta.e3 <- f_delta_e()

#I am not convinced that the adjustment is needed in the first place
elas_final <- f_elast(var_adj_fact = 1) # adj_fact | 1, 4.5
delta.e4 <- f_delta_e()

# "Arbitrary" Sens. Anal to get lower unemp (0.9 * wage in c, 0.37 to 0.5, 1/3 to 1/4)
# delta.e3 <- sum( rem.comp$N * rem.comp$`% Empl Below MW`/100 * ( 1 - rem.comp$`% of non compliers with adj`/100 - 0) *
#        0.9*rem.comp$`% Mean Wage Inc`/100 * 0.1/(rem.comp$`% Empl Below MW`/100) *
#        ((0.9*rem.comp$`% Mean Wage Inc`/100)/0.5) * c(1/4, 1)
# ) - 0.05 * 0.9


#knitr::kable(rem.comp, caption="Components of Elasticities", digits = 2)
Components of Elasticities

|                                  | Adult |  Teen |
|----------------------------------|------:|------:|
| \(\eta_{lit}\)                   | -0.03 | -0.10 |
| \(\eta_{w \leq MW'}\)            | -0.23 | -0.13 |
| \(F_{adj}\)                      |  4.50 |  4.50 |
| \(\overline{\%\Delta w}\)        | 13.36 | 16.59 |
| \(\widetilde{\eta_{w\leq MW}}\)  | -0.15 | -0.45 |

Using all the components described above we get \(\widehat{ \Delta^{-} E } =\) -503 thousand jobs. The report, however, computes \(F^{g}_{adj}\) in a different fashion and gets a value of 4.5 (when computing the values of \(F^{g}_{adj}\) from the table above, as opposed to using historical values, we get \(\widehat{ \Delta^{-} E } =\) -343 thousand jobs).
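As a consistency check on the components table above, the teen elasticity follows directly from its entries (for adults, the literature elasticity is additionally scaled by the extrapolation factor inside f_elast):

\[ \widetilde{\eta}_{teen} = \eta_{lit} \times F_{adj} = -0.10 \times 4.5 = -0.45 \]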

3 Distributional effects

In the first step towards obtaining the policy estimates presented in the introduction, we concluded with a figure of \(\widehat{ \Delta^{-} E } =\) -503 thousand jobs lost. We now turn to two additional key quantities: the wage gain among those who get a raise due to the new minimum wage, and the distribution of the losses that pay for that raise. The effect of both quantities is estimated at the level of family income.

3.1 Computing family income

As the unit of interest is now the family and detailed income information is needed, CBO performs the distributional analysis using a different data set drawn from the Current Population Survey. Instead of the ORG, the following analysis uses the CPS Annual Social and Economic Supplement (ASEC) of March 2013. This data set contains income information for the year 2012.

3.1.0.1 Code to load the data and merge state minimum wages

To the CPS ASEC data we also merge the data on state minimum wages described in section 2.1.4.3.

# List all non-function objects that I need from here forward

# Call data from CPS ASEC March 2013
call.cps.asec.data <-  function() {
  data_use <- "CEPR_ASEC"

  # Using CEPR ASEC data
  if (data_use == "CEPR_ASEC") {
  # Checking if working directory contains data, download if not.
    if ( !("cepr_march_2013.dta" %in% dir(path = "./rawdata/")) ) {
        # create name of file to store data
        tf <- "cepr_march_2013.zip"

        # download the CPS repwgts zipped file to the local computer
        download.file(url =  "http://ceprdata.org/wp-content/cps/data/cepr_march_2013.zip", tf , mode = "wb" )

        # unzip the file's contents and store the file name within the temporary directory
        fn <- unzip(exdir = "./rawdata/", zipfile = tf ,"cepr_march_2013.dta",  overwrite = T )
    }
    df <- haven::read_dta("./rawdata/cepr_march_2013.dta")
  }

  df <- tbl_df(df)
  return(df)
}

# Add sensitivity analysis parameters to hours, weeks, weights
add.base.vars <- function(SA.hours = param.hours,
                          SA.weeks = param.weeks,
                          SA.N = param.N) {
  df %>% mutate("hrslyr" = hrslyr * SA.hours,
                "wkslyr" = wkslyr * SA.weeks,
                "hhwgt" = hhwgt * SA.N)
}

# Merge state min wage info
add_minw <- function(df.temp = df) {
  # Get state min wages for 2016 and 2013
  st.minws <- state_min_w %>%
    select( c( "state_co","y2013" ,"y2016", "state") )%>%
    rename(state_la = state, state = state_co)
  # prepare state key for merge
  df <- df %>%
    mutate(state_la = as.character(as_factor(state)),
           state = as.numeric(state))  
  # merge with state MW
  df <- left_join(df, st.minws, "state")
  return(df)
}

df <- call.cps.asec.data()
df <- add.base.vars()
df <- add_minw(df)

3.1.1 Computing wages in CPS ASEC 2013

The hourly wage (\(w\)) was computed as the ratio of yearly earnings (\(y\)) to the product of the usual number of hours worked in a week (\(Hours.per.Week\)) and the number of weeks worked in a year (\(Weeks.per.Year\)). The CPS ASEC data set contains three variables for yearly earnings: incp_all, incp_ern, and incp_wag, corresponding to all income, earnings, and wages, respectively. We choose incp_wag.

For this data set, the weights used in our analysis will be hhwgt.

\[ \begin{aligned} \hat{w} = \frac{\hat{y}}{\widehat{Hours \, per \, Week} \times \widehat{Weeks\,per\,Year}} \end{aligned} \]
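A minimal numeric check of this formula, using made-up values rather than actual CPS records:

# Hypothetical worker: $30,000 in yearly wage income, usually working
# 40 hours per week for 50 weeks in the year
y.demo     <- 30000
hours.demo <- 40
weeks.demo <- 50
y.demo / (hours.demo * weeks.demo)   # hourly wage: 30000/2000 = $15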

3.1.1.1 Code for computing wages and descriptive statistics

# Tag population of interest FH: (1) Need to deal with 598 NA's and (2) clarify
# that restrictions here are not exactly the same as CPS ORG.  
# Pop of interest: employed & (not self employed or self incorp) & (wage variable not zero and not missing)
f_pop_of_int <- function() {
  df <- df %>% mutate("pop_of_int" =  1 * (empl == 1 &
                (selfinc == 0 & selfemp == 0) &
                !(incp_wag == 0 | is.na(incp_wag) ) )
                )
  return(df)
}

# For CPS ASEC I am using the weights hhwgt
# Population size, employed and salaried
f_table_5 <- function() {
  table_5  <- df  %>%
     summarise("(1) Total" =
                 sum(hhwgt, na.rm = TRUE),
               "(2) Employed" =
                 sum( hhwgt * (empl == 1), na.rm = TRUE),
               "(3) Salary (among employed)" =
                 sum(hhwgt * (empl == 1 &             #Salary worker if
                 (selfinc == 0 & selfemp == 0))       #not self employed or     
                  , na.rm = TRUE)                     #self incorp.
              )

  table_5 <- t(table_5)
  colnames(table_5) <- "N"
  table_5 <- format(table_5, big.mark = ",", digits = 0, scientific = FALSE)
  return(table_5)
}

# Summary stats of wage
sum.stas1 <- function(x, wt) {
   c( "mean" = weighted.mean(x,w = wt, na.rm = TRUE),
      "sd" = sqrt( wtd.var(x, weights = wt) ) ,
      "median" = wtd.quantile( x, weights = wt, prob = c(.5)) ,
                 wtd.quantile( x, weights = wt, prob = c(.1, .9) ) )
}

# Descriptive stats of earnings, hours, and weeks.
f_table_6 <- function() {
  table_6 <- df %>%
    filter(pop_of_int == 1 & !is.na(hhwgt)) %>%
      select(incp_wag, hrslyr, wkslyr, hhwgt)

  table_6 <- sapply(table_6[,c("incp_wag", "hrslyr", "wkslyr")],
                    function(x) sum.stas1(x, table_6$hhwgt) )
  table_6 <- cbind(table_6)
  colnames(table_6) <- c("Earnings", "Hours", "Weeks")
  return(table_6)
}

# Compute hourly wages, replace negative values with NA
add.wage.var <- function(df) {
  df$wage <- with(df, incp_wag/(hrslyr * wkslyr) )
  df$wage[df$wage<0]  <- NA

  df <- df %>%
      mutate("hhwgt.2013" = gr.factor("workers", 2012, 2013) *  hhwgt ,
           "wage.2013" = gr.factor("wages per worker", 2012, 2013)  *  wage)
  return(df)
}

# Summary stats for year 2013
f_table_7 <- function() {  
  table_7 <- df %>%
    filter(pop_of_int == 1 & !is.na(wage)) %>%
      summarise("N" = sum(hhwgt.2013),
                "> $7.5" = weighted.mean(wage.2013<7.5,w = hhwgt.2013),
                "> $9" = weighted.mean(wage.2013<9,w = hhwgt.2013),
                "> $10.10" = weighted.mean(wage.2013<10.10,w = hhwgt.2013),
                "> $13" = weighted.mean(wage.2013<13,w = hhwgt.2013),
                "> $15" = weighted.mean(wage.2013<15,w = hhwgt.2013)
                )

  table_7 <- t(table_7)
  colnames(table_7) <- "Perc (2013)"  
  return(table_7)
}

df <- f_pop_of_int()
table_5 <- f_table_5()
table_6 <- f_table_6()
df <- add.wage.var(df)
table_7 <- f_table_7()

3.1.1.2 Adjusting wages 1

As with the CPS ORG, an adjustment for wages is applied. Unlike the previous modification, where the adjustments applied to a fraction of the population (those who did not report an hourly wage), our understanding is that here CBO adjusts the wages of the entire population.

The formulas for the adjustments are the same as those proposed in equation \(\eqref{eq:2}\).

3.1.1.3 Forecasting wage

The wage forecast uses the same methodology as in section 2, applied to a different data set (CPS ASEC) and over one additional year (forecasting from 2012 to 2016) relative to the CPS ORG data.

3.1.1.3.1 Code to forecast wages, workers
#Wage adjustment
#CBO mentions that the lowest decile gets a 2.9% growth in annual wage.
#I compute the annualized growth rate of wages and create 10 bins of wage growth
#starting at 2.4%, then adjust by minimum wages of 2016 and get an annualized
#growth of 2.9% for the lowest decile.
#THESE TWO LINES OF CODE ARE DIFFERENT BETWEEN ASEC AND ORG

### Computing annualized growth rate for wage
wage.gr.asec.f <- function(SA.wage.gr = param.wage.gr) {
  ( ( gr.factor("wages per worker", 2013, 2016) )^(1/4) - 1 ) * SA.wage.gr
}

### For workers
workers.gr.asec.f <- function(SA.worker.gr = param.worker.gr) {
  ( ( gr.factor("workers", 2013, 2016) )^(1/4) - 1 ) * SA.worker.gr
}

### Half of the growth gap between 1st and 10th decile.
#SAME but calling above functions
half.gap.asec.f <- function(SA.wage.gr = param.wage.gr,
                     SA.base.growth = param.base.growth) {
  wage.gr.asec.f(SA.wage.gr) - SA.base.growth
}

### Compute 10 rates of wage growth
wage.gr.bins.asec.f <- function(SA.base.growth = param.base.growth,
                         SA.wage.gr = param.wage.gr) {
  seq(SA.base.growth, wage.gr.asec.f(SA.wage.gr) +
        half.gap.asec.f(SA.wage.gr,SA.base.growth), length.out = 10)
}

# Assign population to deciles according to 'wage' variable
# CAUTION: DO NOT apply 'ntile()' fn from dplyr as it will split ties differently than 'cut()'
# and results will not be comparable to STATA.
# NOT THE SAME (power of 4 instead of 3)
add.wages.1 <- function(df.temp = df) {
  aux.var  <- wtd.quantile(x = df$wage, probs = 1:9/10,weights = df$hhwgt)
  df.temp %>%
        mutate( wage.deciles = cut(wage, c(0, aux.var, Inf) ,
                                right = TRUE, include.lowest = TRUE) ,  
                wage.adj1 =  wage * ( 1 + wage.gr.bins[wage.deciles] )^4)
}

# Here we adjust min wages to 2016 levels
# SAME
wages.final.asec.org.f <- function(SA.states.raise = param.states.raise,
                                SA.wages = param.wages) {
  with( df, ifelse(wage.adj1> y2016 * SA.states.raise,
                   wage.adj1,
                   y2016 * SA.states.raise) ) * SA.wages
}

wage.gr <- wage.gr.asec.f()
workers.gr <- workers.gr.asec.f()  
half.gap <- half.gap.asec.f()
wage.gr.bins <- wage.gr.bins.asec.f()
df <- add.wages.1()
df$wages.final <- wages.final.asec.org.f()
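# Illustration of the bin construction above (standalone, with an assumed
# overall annualized growth rate of 2.65% that is NOT the CBO value): the ten
# bins are symmetric around the overall rate, so their mean recovers it.
g.demo    <- 0.0265
bins.demo <- seq(0.024, g.demo + (g.demo - 0.024), length.out = 10)
mean(bins.demo)   # equals g.demo
bins.demo[1]      # 2.4% for the lowest decile, before the 2016 state MW adjustment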
3.1.1.3.2 Statistics and code behind figure 4
# Descriptives of wage and population size (analogous to table 4 in CPS ORG)
f_table_8 <- function() {
  table_8 <- matrix(NA, 7, 2)
  colnames(table_8)  <- c("2013", "2016: status quo")
  rownames(table_8) <- c("Salary workers",
                         "Median wage",
                         "< 7.5","< 9",
                         "< 10.10", "< 13",
                         "< 15" )
  # Total in 2013
  table_8[1,1] <- table_5[3]
  # projected total in 2016
  # To compute the new total of workers we multiply the original weights by the growth rate.
  table_8[1,2] <-  format(sum(df$hhwgt[df$pop_of_int==1] *
                              (1 + workers.gr)^4,
                            na.rm = TRUE), big.mark=",")

  # Median wage before projections
  table_8[2,1] <- df %>%
    filter( (pop_of_int == 1) & !is.na(wage) ) %>%
    with(wtd.quantile( wage, weights = hhwgt, prob = c(.5) ) ) %>%
    round(digits = 2)

  # Median wage after projections
  table_8[2,2] <- df %>%
    filter( (pop_of_int == 1) & !is.na(wage) ) %>%
    with(wtd.quantile( wages.final, weights =  hhwgt *
                         (1 + workers.gr)^4, prob = c(.5) ) ) %>%
    round(digits = 2)

  # Wage distribution in 2013
  table_8[3:7,1]  <- round(as.matrix(table_7[-1]), digits = 2)

  aux.1 <- df %>%
    filter(pop_of_int == 1 & !is.na(wages.final)) %>%
      summarise("< 7.5" = weighted.mean(wages.final<7.5,w = hhwgt * (1 + workers.gr)^4),
                "< 9" = weighted.mean(wages.final<9,w = hhwgt * (1 + workers.gr)^4),
                "< 10.10" = weighted.mean(wages.final<10.10,w = hhwgt * (1 + workers.gr)^4),
                "< 13" = weighted.mean(wages.final<13,w = hhwgt * (1 + workers.gr)^4),
                "< 15" = weighted.mean(wages.final<15,w = hhwgt * (1 + workers.gr)^4)
                )

  table_8[3:7,2] <- round( as.matrix(aux.1), digits = 2 )
  return(table_8)
}

# Density plot of wages for 2013 and 2016 (display truncated at $20)
two_hist_asec <- function(){
  p2 <-    df  %>%
  filter(pop_of_int==1 & wage<=200)  %>%
  select(wage.2013, wages.final, hhwgt, hhwgt.2013) %>%
  gather(key = variable, value = value, -c(hhwgt.2013, hhwgt) ) %>%
  mutate("hhwgt" = ifelse(variable=="wages.final",
                                  hhwgt * (1 + workers.gr)^4 ,
                                  hhwgt.2013 ) ) %>%
  ggplot() +
    geom_density(aes(x = value,
                     fill=variable,
                     weight = hhwgt,
                     alpha = 1/2,
                     colour=variable), bw=1, kernel = "gau") +
    geom_vline(xintercept = c(7.25, 10.10, 11.5), col="blue") +
    coord_cartesian(xlim = c(0,20)) +
    scale_x_discrete(limits = c(0,7.5, 10.10, 11.5, 20))  +   
    guides(alpha = "none", colour="none") +
    labs(y = NULL,
         x = "Wage" ,
         title = "Figure 4: Distribution of wages in 2013 and 2016(forecast)")+
    theme(axis.ticks = element_blank(), axis.text.y = element_blank()) +
    theme(legend.justification=c(0,1),
          legend.position=c(0,1),
          legend.background = element_rect(fill = "transparent",
                                           colour = "transparent") )+
    scale_fill_discrete(name=NULL,
                         labels=c("2013 (Forecast)", "2016 (Forecast)"))
  return(p2)
}

table_8 <- f_table_8()
p2 <- two_hist_asec()
# print(p2)
# knitr::kable(table_8, caption="Comparison of 2013 and 2016 under the status quo", digits = 1)

Comparison of 2013 and 2016 under the status quo

|                | 2013        | 2016: status quo |
|----------------|------------:|-----------------:|
| Salary workers | 122,593,557 | 129,545,571      |
| Median wage    | 17.79       | 20.56            |
| < 7.5          | 0.09        | 0.03             |
| < 9            | 0.15        | 0.09             |
| < 10.10        | 0.20        | 0.15             |
| < 13           | 0.32        | 0.25             |
| < 15           | 0.40        | 0.33             |

3.1.1.4 Adjusting wages 2

CBO mentions that “it found far fewer workers who would be directly affected by the change in the minimum wage than it had in its analysis of employment”. Using the CPS ASEC we get 15% of workers below the $10.10 threshold in 2016, while using the CPS ORG we get 16%. Given that the differences are not large, we do not perform this adjustment.

3.2 Imputing policy effects

3.2.1 Imputing wage gains

If the wage is below the proposed new minimum ($10.10), we increase that wage up to $10.10 for the entire eligible population.

3.2.1.1 Ripple effects

CBO applies an additional wage increase to wages in a neighborhood of the new minimum of up to 50% of the maximum increase (\(\pm 0.5(10.10 - 7.25) \approx \pm\$1.4\)). Thus, the final imputed wage is:

\[ \tilde{w} = \begin{cases} MW' + 0.5(w - 7.25) & \text{if } w \in [8.7, 10.10) \\ w + 0.5(11.5 - w) & \text{if } w \in [10.10, 11.5) \\ MW' & \text{if } w < 8.7 \\ w & \text{if } w \geq 11.5 \end{cases} \]
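Two worked evaluations of this rule, for hypothetical wages of $9.50 and $10.50:

\[ \begin{aligned} w &= 9.50: & \tilde{w} &= 10.10 + 0.5(9.50 - 7.25) = 11.225 \\ w &= 10.50: & \tilde{w} &= 10.50 + 0.5(11.5 - 10.50) = 11.00 \end{aligned} \]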

3.2.1.1.1 Code for new min wage with ripple effects
# Increase population size (apply workers growth rate to all pop)

# Create new wage (after inc in min wage) & apply ripple effects
wage.ripple.f <- function(SA.ripple = param.ripple) {
  df %>%
  mutate( "hhwgt.2016" = hhwgt * (1 + workers.gr)^4 ,
          "below_min" = ifelse(wages.final <= 10.10 & pop_of_int == 1,
                            1,
                            0),
          "below_min" = ifelse(is.na(below_min),
                            0,
                            below_min),
          "new.wage"  = ifelse(wages.final<10.10 & pop_of_int %in% 1,
                            10.10,
                            wages.final),
          "new.wage"  = ifelse(wages.final>10.10 &
                                 wages.final<SA.ripple["scope_above"] &
                                 pop_of_int==1,
                               wages.final + SA.ripple["intensity"] *
                                 (SA.ripple["scope_above"] - wages.final),
                               ifelse(wages.final>SA.ripple["scope_below"] &
                                        wages.final<=10.10 &
                                        pop_of_int==1,
                                      10.10 + SA.ripple["intensity"] *
                                        (wages.final - 7.25),
                                      new.wage
                          ) )
  )
}

# Get the number of workers whose wage would be below $10.10 in the status quo (in millions)
n_benes_f <- function() {
  df %>%
  filter(wages.final <= 10.10 & pop_of_int==1) %>%
  summarise(sum(hhwgt.2016, na.rm = TRUE)/1e6)
}

# Compute total wage increase (yearly, in billions) -without ripple effects and before destroying jobs-
wage_inc_f <- function() {
  df %>%
  filter(below_min == 1 & pop_of_int == 1 ) %>%
  transmute("gains" = (10.10 - wages.final) * hhwgt.2016 * hrslyr * wkslyr) %>%
  summarise(sum(gains, na.rm = TRUE))/1e9
}

# Total gain with ripple effects but without destroying any jobs  
wage_inc_with_ripple_f <- function() {
  df %>%
  transmute("all_jobs_gains" = (new.wage - wages.final) * hhwgt.2016 *
              hrslyr * wkslyr) %>%
  summarize( sum(all_jobs_gains, na.rm = TRUE) ) / 1e9
}

df <- wage.ripple.f()
n_benes <- n_benes_f()
wage_inc <- wage_inc_f()
wage_inc_with_ripple <- wage_inc_with_ripple_f()

3.2.1.2 Subtracting non-compliers

In section 2.2 we estimated that 10.6% of workers eligible for a raise would not receive that benefit. To account for this fraction of non-compliers, we replace the same fraction of new wages with what those workers would have received under the status quo.

3.2.1.2.1 Code to subtract non-compliers
#Assign old wages to a % of the population (non-compliers)
add.nocomp <- function(df.temp=df, alpha.1.temp = alpha.1, SA.low.wage = param.low.wage) {
  #randomly assign no compliance to alpha.1.temp% of the population
  df.temp %>% mutate("no.comply" = ifelse(pop_of_int==1 & wages.final<11.5 * SA.low.wage,   
                                    ifelse(runif( length(new.wage) )< alpha.1.temp,
                                           "Not comply", "comply"),
                                    NA),
  #Assign old wages to all the non-compliers
                "new.wage.nocomp" = ifelse(no.comply %in% "Not comply" ,
                                    wages.final,
                                    new.wage)
                                    )
}

# Total gain with ripple effects but without destroying any jobs and accounting for non-compliance
wage.inc.with.ripple.non.comp_f <- function(){
  df %>% transmute("aux1" = (new.wage.nocomp - wages.final) *  hhwgt.2016 * hrslyr * wkslyr) %>%
    #filter(below_min == 0)  %>%
    summarise(sum(aux1, na.rm = TRUE) ) / 1e9
}

# Number of workers with wages below new min, that are eligible to receive wage inc (before job losses)
N_benes_compliance_below_min_f <- function() {
  df %>% filter((wages.final != new.wage.nocomp) & df$below_min==1) %>%
    summarise(sum(hhwgt.2016, na.rm = TRUE))/1e6 %>% as.double()
}

# Number of workers with wages above the new min, that receive a wage increase due to ripples.
N_benes_compliance_above_min_f <- function() {
  res1 <- df %>%
    filter((wages.final != new.wage.nocomp) & below_min==0) %>%
    summarise(sum(hhwgt.2016, na.rm = TRUE))/1e6
  return(as.numeric(res1))
}

pt <- function(x) round(x, 1)

# Non-compliance parameter
alpha.1 <- table.n.final["% of non compliers ($\\alpha_{1}$)", "Total"] * param.noncomp /100
set.seed(123)
df <- add.nocomp()
wage.inc.with.ripple.non.comp <- wage.inc.with.ripple.non.comp_f()
N_benes_compliance_below_min <- N_benes_compliance_below_min_f()
N_benes_compliance_above_min <- N_benes_compliance_above_min_f()
# Total number of beneficiaries of the wage raise
N_benes_compliance <- sum(df$hhwgt.2016[(df$wages.final !=df$new.wage.nocomp)], na.rm = TRUE)/1e6
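# Standalone sanity check of the Monte Carlo assignment in add.nocomp():
# with a large population the realized share of non-compliers converges
# to the target rate alpha.1 (about 10.6% here).
set.seed(123)
mean(runif(1e6) < 0.106)   # ~0.106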

After accounting for non-compliance, the total number of workers potentially eligible for a raise is 24.2 million. Of that number, 17.7 million would have had a wage below the new minimum, and 6.6 million would have had a wage above $10.10 but receive a wage increase through ripple effects. We now impute job losses to obtain the final number of workers who gain and lose from a raise in the minimum wage.

3.2.2 Imputing job losses

The imputation above is so far applied to all workers below the new minimum wage (plus the ripple effects). Now we need to remove the \(\widehat{\Delta E} = -0.5\) million jobs by imputing a wage of 0 to the affected workers. Instead of moving wages all the way to 0, CBO chose to cut them in half and apply that imputation to \(2 \times \widehat{\Delta E}\) workers, which is equivalent in aggregate (see below). When destroying jobs, CBO argued that the effect would fall more heavily on teenagers and low-wage adults.
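The aggregate equivalence is immediate: halving the wage of \(2|\widehat{\Delta E}|\) workers removes the same total wage bill as fully removing \(|\widehat{\Delta E}|\) jobs:

\[ 2\,|\widehat{\Delta E}| \times \frac{w}{2} = |\widehat{\Delta E}| \times w \]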

3.2.2.1 Code to impute job losses

# Assign half of old wages to a % of the population (2* delta.e1)
job.killer <- function(df.temp= df, SA.jobcut = param.jobcut) {
  #Determine which fraction of the relevant population represents 2 * delta.e1
  job.cut.factor <- 2
  frac_to_destroy <- df %>%
    filter(pop_of_int == 1 & wages.final <10.10) %>%
    summarise(job.cut.factor * (- delta.e1) * 1e6 / sum(hhwgt.2016))

  #randomly assign job destruction to frac_to_destroy% of the population
  set.seed(123)
  df.temp %>%
    mutate("wage_cut" = ifelse(pop_of_int==1 & wages.final<10.10,
                               ifelse(runif(length(new.wage) )< as.numeric(frac_to_destroy),
                                           "half wage", "same wage"),
                                    NA),
  # Compute variables for each scenario
  # cut jobs, no wage raise
                "cut.wage" = ifelse(wage_cut %in% "half wage",
                                 wages.final/job.cut.factor,
                                 wages.final) ,
  # cut jobs relative to sq, raise the other wages
                "new.wage.final" = ifelse(wage_cut %in% "half wage",
                                       wages.final/job.cut.factor,
                                       new.wage.nocomp),  
  # cut jobs relative to new wages, raise the other wages
  "new.wage.cut" = ifelse(wage_cut %in% "half wage",
                                     new.wage.nocomp/job.cut.factor,
                                     new.wage.nocomp)
  )  

}

# Compute total wage increase after ripple effects (yearly in billions)
wage.gain.total_f <- function(){
  df %>%
      summarise( "Total wage gain" =  sum( (new.wage.final - wages.final) *
                   hhwgt.2016 * hrslyr * wkslyr , na.rm = TRUE) / 1e9 ,
                "Total wage loss" = sum( (wages.final - cut.wage) *
                   hhwgt.2016 * hrslyr * wkslyr , na.rm = TRUE) / 1e9,
                "Total gain before JD" =  sum( (new.wage.final - cut.wage) *
                   hhwgt.2016 * hrslyr * wkslyr , na.rm = TRUE) / 1e9
                )  
}

#with(df, hist(new.wage.final - wages.final, xlim = c(-10,10) , ylim=c(0,3e3) , breaks = 50) )
#df %>% mutate("wls" = ifelse(new.wage.final > wages.final, "winner", ifelse(new.wage.final == wages.final, "nothing", "loser"))) %>% with(wtd.table(wls, weights = hhwgt.2016))

df <- job.killer()
wage.gain.total <- wage.gain.total_f()

So far we get a total wage gain (accounting for job losses) of 52.2 billion dollars (in 2016) and a total wage loss of 7.1 billion due to workers losing their jobs. The funds that cover the wage gains have to come from either lower business profits or higher consumer prices. In the next section we review the distribution of wage gains, wage losses, and balance losses across the income distribution.

3.3 Computing family income under status quo and minimum wage increase

The family income forecast was computed as the sum of forecast wage and non-wage income: \[ \begin{aligned} \widehat{Y_{h}(2016|2013)} &= \sum_{i \in h} \left( g_{w}\hat{w}_{i} + \sum_{l} g_{nw_{l}}\hat{nw}_{il} \right) \end{aligned} \]

The other components of family income were forecast as follows: when a growth rate was available for the subcomponent (\(l\)) it was applied (the only one mentioned is interest and dividends); otherwise the growth rate was set equal to the change in the price index for personal consumption.

Additional information needed from CBO: how they decompose income.

3.3.1 Growth of non-working population

Forecasts of growth for the working population were the same as before. For the non-working population, the growth rate was matched to CBO forecasts for that group.

3.3.1.1 Code to construct the final income variables

#Annualized growth rate for non-wage income
non.wage.gr.f <- function(SA.nonwage.gr = param.nonwage.gr) {
  ( ( gr.factor( "Personal CPI", 2013, 2016) )^(1/4) - 1 ) *
               SA.nonwage.gr
}

#Adjust non-wage income, compute per capita incomes, and compute amount won/lost per person
all.income.f <- function(df.temp = df, non.wage.gr.temp = non.wage.gr) {
  #Separate HH income into wage and non-wage and apply growth rate
  df.temp$non.wage <- (df.temp$incp_ern - df.temp$incp_wag)  
  df.temp$non.wage.2016 <- df.temp$non.wage * ( 1 + non.wage.gr.temp)^4

  # Generate per cap income based on new wages (after policy) and old wages
  df.temp <- df.temp %>% mutate("year.wage.1" = new.wage.final * (hrslyr * wkslyr) ,
                                "year.wage.2" = wages.final * (hrslyr * wkslyr) ) %>%
        group_by(hhseq) %>% mutate("hhld.wage1" = sum(year.wage.1, na.rm = TRUE),
                                   "hhld.wage2" = sum(year.wage.2, na.rm = TRUE),
                                   "hhld.non.wage" = sum(non.wage.2016, na.rm = TRUE),
                                   "hhld.income1" = hhld.wage1 + hhld.non.wage,
                                   "hhld.income2" = hhld.wage2 + hhld.non.wage,
                                   "N.fam" = n(),
                                   "new.inc.pc" = hhld.income1/N.fam,
                                   "sq.inc.pc" = hhld.income2/N.fam) %>%
                                    select(-(hhld.wage1:N.fam) ) %>% ungroup() %>%
    # compute amount won/loss per person
    mutate("winners" = ifelse(new.inc.pc > sq.inc.pc, new.inc.pc - sq.inc.pc, 0),
           "losers" =  - ifelse(new.inc.pc < sq.inc.pc, new.inc.pc - sq.inc.pc, 0) )
  return(df.temp)
}

# Computing differences in income
# df %>%
#  with(sum( (new.inc.pc - sq.inc.pc) * hhwgt.2016, na.rm = TRUE) ) /1e9
# Per capita aggregate income has slightly smaller gains, probably due to households with both winners and losers.
# wage.gain.total["Total wage gain"]

#Computing statistics of income variation
inc_ch_stats <- function(SA.factor.1 = param.factor.1,
                         SA.net.benef = param.net.benef,
                         SA.dist.loss = param.dist.loss) {
  res1 <- list()
  # losses = all the wage gains - subst. factor * (all the losses from job destruction) - net benefits
  res1[["losses"]] <- df %>%
    summarise(sum(winners * hhwgt.2016) -
              SA.factor.1 * sum(losers * hhwgt.2016) -
              SA.net.benef)  
  aux1 <- df %>% mutate("poverty_bins" =
                        findInterval(x = sq.inc.pc,
                                     vec = c(-Inf,11740, 6*11740, Inf))) %>%
  with(wtd.table(poverty_bins, weights = hhwgt.2016))

  res1[["pop.dist"]] <- aux1$sum.of.weights

  res1[["losses.pc"]]  <- as.numeric(res1[["losses"]] ) * param.dist.loss / res1[["pop.dist"]]
  return(res1)
}

# Compute balance losses and classify income by PL groups
win.loss.f <- function(SA.net.benef = param.net.benef, losses.pc.temp = losses.pc,
                       df.temp = df) {
  # Assign per capita balance losses to each individual.
  df.temp$balance.loss <- as.numeric(losses.pc.temp[with(df.temp, findInterval(x = sq.inc.pc,
                                                 vec = c(-Inf,11740, 6*11740, Inf) )) ])
  # Build quintiles of income based on status quo income
  bins <- with(df.temp, wtd.quantile(x = sq.inc.pc, probs = 1:4/5, weights = hhwgt.2016))
  df.temp$income.group <- with(df.temp, findInterval(x = sq.inc.pc, vec =  c(-Inf,bins) ))

  #df$income.group <- with(df, findInterval(x = sq.inc.pc, vec =  bins ))
  #classify income into 1, 3 and 6 poverty lines
  df.temp$income.group.1 <- with(df.temp, findInterval(x = sq.inc.pc, vec =  c(-Inf,11740*c(1,3,6),Inf)  ) )
  return(df.temp)
}

#Compute quintiles based on per capita income (status quo)
add.quintiles <- function(df.temp = df) {
  aux.var  <- df.temp %>% with(wtd.quantile(x = sq.inc.pc, probs = 1:4/5, weights = hhwgt.2016))
  df.temp <- df.temp %>%
        mutate( "income.group" = cut(sq.inc.pc, c(-Inf, aux.var, Inf) ,
                                right = TRUE, include.lowest = TRUE, labels = FALSE) )
  return(df.temp)
}

non.wage.gr <- non.wage.gr.f()
df <- all.income.f()
aux1 <- inc_ch_stats();
losses.pc <- aux1[["losses.pc"]]; losses <- aux1[["losses"]]; pop.dist <- aux1[["pop.dist"]]
df <- win.loss.f()
df <- add.quintiles()

3.3.2 Other income losses

Income losses from a reduction in profits (\(\Delta^{-}\pi\)) and an increase in aggregate prices (\(\Delta^{+}P\)) are estimated to be distributed as follows: 1% of the losses for those below the poverty line (PL), 29% for those between 1 and 6 PLs, and 70% for those above 6 PLs.
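To fix magnitudes, these shares can be applied to the replication's total balance loss of roughly $47.8 billion reported in section 4.1; the split below is illustrative arithmetic only:

shares     <- c(0.01, 0.29, 0.70)  # <1PL, 1-6PL, >6PL
total.loss <- 47.8                 # total balance losses in billions (replication)
round(shares * total.loss, 1)      # 0.5, 13.9 and 33.5 billion, respectively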

3.4 Other considerations

3.4.1 Economywide income effect

CBO argues that the overall effect on the economy is positive, at $2 billion for 2016.

3.4.2 Quantifying losses

  • Mix gains and losses.
  • Output lost - increase in aggregate demand (!)
  • Net gain of 2 billion dollars.

3.4.3 Distributional effects

  • Only results, no methodology at all! This is probably the most important (and overlooked) part of the report.

3.4.4 Interaction with other programs

No interactions with other programs (such as SNAP or EITC) are assumed.

3.4.5 What is in the 2/3

  • CBO acknowledges uncertainty in the estimates of elasticity due to possible technological changes in the future.

4 Results

# Compute variation by hhld - plot all the effects
final_fig1_f <- function(){
  df %>% select(new.inc.pc,sq.inc.pc, hhwgt.2016, balance.loss)  %>%
  mutate( "variation" = new.inc.pc - sq.inc.pc,
          "sixthtile" = findInterval(x = sq.inc.pc,
                                     vec = c(-Inf,11740, 11740*1:6, Inf) )) %>%
  ggplot(aes(sq.inc.pc, variation))  +
  geom_hline(color = "gray", alpha = 0.5, yintercept = 0, size = 1) +

  geom_jitter( colour = "black", alpha = 1/30, size = 1/10,
               position = position_jitter(height = 30, width = 15) ) +
  coord_cartesian(xlim = c(0,2e5),
                    ylim = c(-5e3,5e3) )+
  theme_minimal() +
  labs(y = "Change relative to no raise in min wage",
       x = "Per capita income" ,
       title = "Distribution of gains and losses across per capita income") +
  geom_vline(xintercept = 11740*1:6,
             col="red", size = 1/3) +
  scale_y_discrete(limits = c(-50e2, -25e2, 0,25e2, 50e2),
                   labels = c("-5K", "-2.5K", "0", "2.5K", "5K")) +
  scale_x_discrete(limits = c(5e3, 50e3, 100e3, 150e3),
                   labels = c("5K","50K", "100K", "150K")) +
  geom_abline(color = "red", alpha = 0.2, slope=-1/2, intercept=0,
  na.rm = FALSE, show.legend = NA) +
  geom_abline(color = "blue", alpha = 0.2, slope=(10.10/7.25 - 1), intercept=0,
  na.rm = FALSE, show.legend = NA) +
  geom_jitter(aes( sq.inc.pc, - balance.loss - 120 ),
              colour = "blue", alpha = 1/30,
              size = 1/10,
              position = position_jitter(height = 30, width = 15) )  +
  geom_text(x = 15e3, y = -4e3,angle = 0,
            label = c("1PL"),
            size = 3, colour = "red", data = data.frame()) +
  geom_text(x = 12e3*6+3e3, y = -4e3,angle = 0,
            label = c("6PL"),
            size = 3, colour = "red",  data = data.frame()) +
  theme(plot.title = element_text(hjust = 0.5))
}  

# Compute variation by hhld - plot all the effects in same units (average per group)
final_fig2_f <- function(){
  df %>%
  select(new.inc.pc,sq.inc.pc, hhwgt.2016, balance.loss, winners, losers)  %>%   
  mutate( "variation" = new.inc.pc - sq.inc.pc,
          "sixthtile" = findInterval(x = sq.inc.pc,
                                     vec = c(-Inf,11740*1:6, Inf) ) ) %>%   
  filter(sixthtile>=0) %>%
  group_by(sixthtile) %>%
  summarise("mean" = wtd.mean(variation, weights = hhwgt.2016),
            "mean (win)" = wtd.mean(winners, weights = hhwgt.2016),
            "mean (lose)" = wtd.mean(losers, weights = hhwgt.2016),
            "mean (bal lose)" = wtd.mean(balance.loss, weights = hhwgt.2016),
            "N" = sum( hhwgt.2016)   
            ) %>%
  select(`mean (win)`, `mean (lose)`, `mean (bal lose)`, sixthtile) %>%
  melt(id = "sixthtile") %>%  # measure columns default to the three "mean" variables
  ggplot(aes(as.factor(sixthtile), value, fill = as.factor(variable))) +
  geom_bar(color = "gray", alpha = 0.5,
           stat = "summary", fun.y = "mean",
           position = "dodge") +
  coord_cartesian(ylim = c(0,300) ) +
  geom_text(x = 6.8, y = 265,angle = 90, label = - round(losses.pc[3]) , size = 3, colour = "#6699FF", alpha = 0.5) +
  geom_segment(aes(x = 7, y = 200, xend = 7, yend = 310),
               colour = "#6699FF", arrow = arrow(length = unit(0.2, "cm"))) +
  theme(legend.justification=c(0, 0),
                legend.position=c(0, .7),
                legend.background = element_rect(colour = 'transparent', fill = 'transparent')
                ) +
  scale_fill_discrete(name=NULL,
                      labels=c("Wage +", "Wage -", "Balance -")) +
  labs(y = "$/year",
       x = "Poverty Lines" ,
       title = "Distribution of gains and losses across poverty lines") +
  theme(plot.title = element_text(hjust = 0.5))
 #print(final_fig2)
 #ggsave("final_fig2.png")
}  

# Compute variation by hhld - plot all the effects in same units (average per group) with quintiles instead of poverty lines.
final_fig3_f <- function(){
final_fig3 <- df %>%
  select(new.inc.pc,sq.inc.pc, hhwgt.2016, balance.loss, winners, losers, income.group)  %>%   
  mutate( "variation" = new.inc.pc - sq.inc.pc,
           "inc_gain" = ifelse(variation>0,
                              "gain",
                              ifelse(variation < 0, "loss", "same" ) ) ) %>% group_by(income.group) %>%
  summarise("mean" = wtd.mean(variation, weights = hhwgt.2016),
            "mean (win)" = wtd.mean(winners, weights = hhwgt.2016),
            "mean (lose)" = wtd.mean(losers, weights = hhwgt.2016),
            "mean (bal lose)" = wtd.mean(balance.loss, weights = hhwgt.2016),
            "N" = sum( hhwgt.2016)   
            )

high.loss <-  as.numeric(final_fig3[max(final_fig3$income.group), "mean (bal lose)"])

#THIS IS THE PLOT THAT I WANT TO OUTPUT TO THE SHINY APP
final_fig3 <- final_fig3 %>%
  select(`mean (win)`, `mean (lose)`, `mean (bal lose)`, income.group) %>%
  melt(id = "income.group") %>%  # measure columns default to the three "mean" variables
  ggplot(aes(as.factor(income.group), value, fill = as.factor(variable))) +
  geom_bar(color = "gray", alpha = 0.5,
           stat = "summary", fun.y = "mean",
           position = "dodge") +
  coord_cartesian(ylim = c(0,400) ) +
  geom_text(x = 4.8, y = 365,angle = 90,
            label = - round(high.loss) ,
            size = 3, colour = "#6699FF", alpha = 0.5) +
  geom_segment(aes(x = 5, y = 300, xend = 5, yend = 410),
               colour = "#6699FF", arrow = arrow(length = unit(0.2, "cm"))) +
  theme(legend.justification=c(0, 0),
                legend.position=c(0, .7),
                legend.background = element_rect(colour = 'transparent', fill = 'transparent')
                ) +
  scale_fill_discrete(name=NULL,
                      labels=c("Wage +", "Wage -", "Balance -")) +
  labs(y = "$/year",
       x = "Quintiles of per capita income" ,
       title = "Distribution of gains and losses across quintiles") +
  theme(plot.title = element_text(hjust = 0.5))   
  return(final_fig3)
}  

final_fig1 <- final_fig1_f()
final_fig2 <- final_fig2_f()
final_fig3 <- final_fig3_f()
#print(final_fig1)
#ggsave("alt_pe1.png")
# THIS IS THE PLOT THAT I OUTPUT FOR THE PAPER (AND THE INITIAL SA)
# NEED TO MAKE SURE I AM EXPORTING IN SAME DIMENSIONS
# print(final_fig3)  
# ggsave("policy_est.png")
# 5.89 x 3.69 in image
#NEED TO CHECK WHY RIPPLE EFFECTS AFFECT WAGE LOSSES
# Table proposal for gains and losses by income group and income variation
table_9_f <- function() {
  table_9 <- matrix(NA, ncol = 4, nrow = 5)
  colnames(table_9)  <- c("<1PL", "[1PL, 3PL)", "[3PL, 6PL)", ">6PL")
  rownames(table_9)  <- c("wage gains", "wage losses", "other losses",
                          "aggr effect", "N_i")

  table_9["N_i", ] <- wtd.table(with(df, income.group.1) , weights = df$hhwgt.2016)[[2]]/1e6

  aux.1 <- df %>% group_by(income.group.1)  %>% summarise(sum((winners) * hhwgt.2016, na.rm = TRUE)/1e9)
  table_9[ "wage gains", ] <- t(as.data.frame(aux.1[,2]) )

  #Wage losses: compared to SQ
  aux.1 <- df %>% group_by(income.group.1)  %>% summarise(sum((losers) * hhwgt.2016, na.rm = TRUE)/1e9)
  table_9[ "wage losses", ] <- t(as.data.frame(aux.1[,2]) )

  #Imputing the balance of the losses
  table_9[ "other losses", ] <- as.numeric(losses) * c(0.01, rep(.29/2, 2), 0.70)/1e9

  aux.1 <- df %>%
    group_by(income.group.1)  %>%
      summarise(sum((winners - losers - balance.loss) *
                      hhwgt.2016, na.rm = TRUE)/1e9)
  table_9[ "aggr effect", ] <- t(as.data.frame(aux.1[,2]) )
  return(table_9)
}

table_9 <- table_9_f()
#FH: questions:
#    - what is the precise way to define family in CPS?
#    - what is the standard way to choose one obs per family?

4.1 Original format

Policy estimates in CBO report and Replication Results
| Effects                      | Policy Estimates | Replication |
|------------------------------|-----------------:|------------:|
| wage gains (billions of $)   | 31               | 58.6        |
| wage losses (bns of $)       | ~5               | 8.8         |
| Balance losses (bns of $)    | ~24              | 47.8        |
| Net effect (bns of $)        | 2                | 2           |
| # of Wage gainers (millions) | 16.5             | 24.2/17.7   |
| # of Wage losers (millions)  | 0.5              | 0.5         |

|            | Balance losses (bns of $) | Replication | Net effect (bns of $) | Replication NE |
|------------|--------------------------:|------------:|----------------------:|---------------:|
| <1PL       | ~0.3                      | 0.5         | 5                     | 18.8           |
| [1PL, 3PL) | ~3.4                      | 6.9         | 12                    | 16.2           |
| [3PL, 6PL) | ~3.4                      | 6.9         | 2                     | -0.2           |
| >6PL       | ~17                       | 33.5        | -17                   | -32.9          |

4.2 New format

Our first proposal is to combine all gains and losses in one figure. Figure 5 plots the variation in per capita income for the whole population, sorted by initial income. The black dots represent income variation due to wage gains and wage losses from the new minimum wage. The blue dots represent the average income loss borne by each person to pay for the increase in the minimum wage. The red lines represent poverty lines.

As we can see, the direct effect is largely progressive (a larger proportion of the benefits goes to people in the lower part of the income distribution). This figure also highlights how little is known about who pays for a raise in the minimum wage. Lacking evidence at the time, CBO had to assume (“guess work”) that the distribution was as depicted in figure 5: people under one poverty line (PL) pay 1% of the losses, people between 1 and 6 PLs pay 29%, and people above 6 PLs pay 70% of the losses.

Figure 5: Distribution of gains and losses across per capita income

Although figure 5 illustrates the last point well, it is not a good figure to present as the final output to policy makers, as it compares effective wage variations due to the raise in the minimum wage (black dots) with average income losses (blue dots). A second version of a final output is presented in figure 6. Here the population has been grouped by which poverty-line bin their income falls into. Now we are comparing averages with averages, and we still use the same categories used by CBO (poverty lines). However, each group contains a different number of individuals (for example, there are many more people under 1 PL than above 6 PLs). This makes it harder for policy makers to separate normative from positive judgments: a policy maker who weights each bin equally is implicitly giving more weight to an individual whose income is above 6 PLs than to one below 1 PL. The stylized example below makes this concrete.

Figure 6: Distribution of gains and losses across poverty lines
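A stylized example of the weighting issue (hypothetical group sizes and average effects, chosen only for illustration):

# Hypothetical: 40 million people below 1 PL gain $200 each on average,
# while 5 million people above 6 PLs lose $400 each on average.
pop.demo    <- c(40e6, 5e6)
effect.demo <- c(200, -400)
mean(effect.demo)                          # equal weight per bin: -$100
weighted.mean(effect.demo, w = pop.demo)   # weight per person: about +$133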

For our third and last iteration, we propose that the output that policy makers should take from this analysis is the following.

Final replication output

Figure 7: Distribution of gains and losses across quintiles

In figure 7 we summarize all the estimated effects associated with raising the minimum wage. The red bars represent the average gain in per capita income due to wage raises. The green bars represent the average loss in per capita income due to job losses. And the blue bars represent the average loss in per capita income required to pay for a raise in the minimum wage (“balance losses”). All of these estimates are presented for five equally sized groups of the population, ordered by income (income quintiles).


  1. The report presents smaller estimates for the $9.00 option (-0.075). The rationale is that a smaller increase (smaller in magnitude, not indexed to inflation, and implemented later than the $10.10 option) would allow firms to adjust other margins before reducing employment.

  2. CBO calculated the fraction of teenagers with earnings below the minimum wage from 1979 to 2009, which came to about a third. It then looked at the average change in earnings for teenagers subject to the minimum wage over the same period and compared it to the nominal change in each variation of the minimum wage; this ratio came to about 1.5. With this, the final estimate of the elasticity for teenagers came to be 4.5 (\(1.5/(1/3)\)) times higher than what is estimated in the literature.