1.0 Introduction

In this document I attempt to replicate the well-known Card and Krueger (1994) study of the minimum wage.

The impact of minimum wage increases on employment has long been a contentious issue in labor economics. This seminal study by Card & Krueger examined New Jersey’s wage hike from $4.25 to $5.05 per hour, surveying 410 fast-food restaurants in New Jersey and eastern Pennsylvania. The study found no significant evidence that this wage increase led to job losses, challenging prevailing economic theories.

This replication uses the same dataset as the original research to explore the impact of the minimum wage increase. Besides checking whether the original findings hold, the exercise lets me practice data analysis and improve my R coding in a real-world context.

Note: This is an ongoing side project and is not yet finished.

1.1 Load Data

The data sets for this project come from David Card's website. The replication folder contains several files, but we only need two of them: the data file and the codebook. The data come without column names; luckily, the codebook tells us what each column means. Because the variable names in the codebook follow a consistent pattern, we can use a regular expression to extract them and then replace the data frame's column names with the resulting vector.

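library(tidyverse)   # read_lines(), mutate(), across()
library(knitr)       # kable()
library(kableExtra)  # kable_classic()
df <- read.table("~/Documents/Rpubs_Projects/njmin/public.dat")
codebook <- read_lines("~/Documents/Rpubs_Projects/njmin/codebook")
kable(df[1:8,])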
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19 V20 V21 V22 V23 V24 V25 V26 V27 V28 V29 V30 V31 V32 V33 V34 V35 V36 V37 V38 V39 V40 V41 V42 V43 V44 V45 V46
46 1 0 0 0 0 0 1 0 0 0 30.00 15.00 3.00 . 19.0 . 1 . 2 6.5 16.5 1.03 1.03 0.52 3 3 1 1 111792 1 3.50 35.00 3.00 4.30 26.0 0.08 1 2 6.50 16.50 1.03 . 0.94 4 4
49 2 0 0 0 0 0 1 0 0 0 6.50 6.50 4.00 . 26.0 . 0 . 2 10.0 13.0 1.01 0.90 2.35 4 3 1 1 111292 . 0.00 15.00 4.00 4.45 13.0 0.05 0 2 10.00 13.00 1.01 0.89 2.35 4 4
506 2 1 0 0 0 0 1 0 0 0 3.00 7.00 2.00 . 13.0 0.37 0 30.0 2 11.0 10.0 0.95 0.74 2.33 3 3 1 1 111292 . 3.00 7.00 4.00 5.00 19.0 0.25 . 1 11.00 11.00 0.95 0.74 2.33 4 3
56 4 1 0 0 0 0 1 0 0 0 20.00 20.00 4.00 5.00 26.0 0.10 1 0.0 2 10.0 12.0 0.87 0.82 1.79 2 2 1 1 111492 . 0.00 36.00 2.00 5.25 26.0 0.15 0 2 10.00 12.00 0.92 0.79 0.87 2 2
61 4 1 0 0 0 0 1 0 0 0 6.00 26.00 5.00 5.50 52.0 0.15 1 0.0 3 10.0 12.0 0.87 0.77 1.65 2 2 1 1 111492 . 28.00 3.00 6.00 4.75 13.0 0.15 0 2 10.00 12.00 1.01 0.84 0.95 2 2
62 4 1 0 0 0 0 1 0 0 2 0.00 31.00 5.00 5.00 26.0 0.07 0 45.0 2 10.0 12.0 0.87 0.77 0.95 2 2 1 1 111492 . . . . . 26.0 . 0 2 10.00 12.00 . 0.84 1.79 3 3
445 1 0 0 0 0 0 0 1 0 0 50.00 35.00 3.00 5.00 26.0 0.10 0 0.0 2 6.0 18.0 1.04 0.88 0.94 3 3 1 1 110792 . 15.00 18.00 5.00 4.75 26.0 0.15 0 2 6.00 18.00 1.04 0.86 0.94 3 3
451 1 0 0 0 0 0 0 1 0 0 10.00 17.00 5.00 5.00 52.0 0.25 0 0.0 2 0.0 24.0 1.05 0.84 0.96 6 4 1 1 111792 2 26.00 9.00 6.00 5.00 26.0 0.20 0 2 0.00 24.00 1.11 0.84 0.94 6 3
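
The `.` entries in the preview above mark missing values, which is why several columns are read as character rather than numeric. As a minimal sketch of one alternative to converting them afterwards, `read.table()` can be told to treat `.` as `NA` while parsing. The temporary file and the two shortened rows below are made up for illustration and are not the real `public.dat`.

```r
# Write two shortened sample rows to a temporary file (illustrative only)
tmp <- tempfile(fileext = ".dat")
writeLines(c("46 1 30.00 . 19.0", "56 4 20.00 5.00 26.0"), tmp)

# na.strings = "." converts the dots to NA during parsing,
# so the affected column is numeric from the start
toy <- read.table(tmp, na.strings = ".")
str(toy)

# Clean up the temporary file
unlink(tmp)
```

With this option the later character-to-numeric conversion would not be needed, but the approach used in this document gives the same result.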
print(codebook)
##  [1] "              Code Book for New Jersey-Pennsylvania Data Set"                                                                                                                                                                                                                                                
##  [2] ""                                                                                                                                                                                                                                                                                                            
##  [3] "Note: there are 410 observations in the data set"                                                                                                                                                                                                                                                            
##  [4] ""                                                                                                                                                                                                                                                                                                            
##  [5] "            Column Location"                                                                                                                                                                                                                                                                                 
##  [6] " Name:       Start    End    Format     Explanation"                                                                                                                                                                                                                                                         
##  [7] "───────────────────────────────────────────────────────────────────────────"
##  [8] "SHEET           1        3     3.0   sheet number (unique store id)"                                                                                                                                                                                                                                         
##  [9] "CHAIN           5        5     1.0   chain 1=bk; 2=kfc; 3=roys; 4=wendys"                                                                                                                                                                                                                                    
## [10] "CO_OWNED        7        7     1.0   1 if company owned"                                                                                                                                                                                                                                                     
## [11] "STATE           9        9     1.0   1 if NJ; 0 if Pa                      "                                                                                                                                                                                                                                 
## [12] ""                                                                                                                                                                                                                                                                                                            
## [13] "Dummies for location:"                                                                                                                                                                                                                                                                                       
## [14] "SOUTHJ         11       11     1.0   1 if in southern NJ"                                                                                                                                                                                                                                                    
## [15] "CENTRALJ       13       13     1.0   1 if in central NJ"                                                                                                                                                                                                                                                     
## [16] "NORTHJ         15       15     1.0   1 if in northern NJ"                                                                                                                                                                                                                                                    
## [17] "PA1            17       17     1.0   1 if in PA, northeast suburbs of Phila"                                                                                                                                                                                                                                 
## [18] "PA2            19       19     1.0   1 if in PA, Easton etc"                                                                                                                                                                                                                                                 
## [19] "SHORE          21       21     1.0   1 if on NJ shore"                                                                                                                                                                                                                                                       
## [20] ""                                                                                                                                                                                                                                                                                                            
## [21] "First Interview"                                                                                                                                                                                                                                                                                             
## [22] "NCALLS         23       24     2.0   number of call-backs*"                                                                                                                                                                                                                                                  
## [23] "EMPFT          26       30     5.2   # full-time employees"                                                                                                                                                                                                                                                  
## [24] "EMPPT          32       36     5.2   # part-time employees"                                                                                                                                                                                                                                                  
## [25] "NMGRS          38       42     5.2   # managers/ass't managers"                                                                                                                                                                                                                                              
## [26] "WAGE_ST        44       48     5.2   starting wage ($/hr)"                                                                                                                                                                                                                                                   
## [27] "INCTIME        50       54     5.1   months to usual first raise"                                                                                                                                                                                                                                            
## [28] "FIRSTINC       56       60     5.2   usual amount of first raise ($/hr)"                                                                                                                                                                                                                                     
## [29] "BONUS          62       62     1.0   1 if cash bounty for new workers"                                                                                                                                                                                                                                       
## [30] "PCTAFF         64       68     5.1   % employees affected by new minimum"                                                                                                                                                                                                                                    
## [31] "MEALS          70       70     1.0   free/reduced price code (See below)"                                                                                                                                                                                                                                    
## [32] "OPEN           72       76     5.2   hour of opening"                                                                                                                                                                                                                                                        
## [33] "HRSOPEN        78       82     5.2   number hrs open per day"                                                                                                                                                                                                                                                
## [34] "PSODA          84       88     5.2   price of medium soda, including tax"                                                                                                                                                                                                                                    
## [35] "PFRY           90       94     5.2   price of small fries, including tax"                                                                                                                                                                                                                                    
## [36] "PENTREE        96      100     5.2   price of entree, including tax"                                                                                                                                                                                                                                         
## [37] "NREGS         102      103     2.0   number of cash registers in store"                                                                                                                                                                                                                                      
## [38] "NREGS11       105      106     2.0   number of registers open at 11:00 am"                                                                                                                                                                                                                                   
## [39] ""                                                                                                                                                                                                                                                                                                            
## [40] "Second Interview"                                                                                                                                                                                                                                                                                            
## [41] "TYPE2         108      108     1.0   type 2nd interview 1=phone; 2=personal"                                                                                                                                                                                                                                 
## [42] "STATUS2       110      110     1.0   status of second interview: see below"                                                                                                                                                                                                                                  
## [43] "DATE2         112      117     6.0   date of second interview MMDDYY format"                                                                                                                                                                                                                                 
## [44] "NCALLS2       119      120     2.0   number of call-backs*"                                                                                                                                                                                                                                                  
## [45] "EMPFT2        122      126     5.2   # full-time employees"                                                                                                                                                                                                                                                  
## [46] "EMPPT2        128      132     5.2   # part-time employees"                                                                                                                                                                                                                                                  
## [47] "NMGRS2        134      138     5.2   # managers/ass't managers"                                                                                                                                                                                                                                              
## [48] "WAGE_ST2      140      144     5.2   starting wage ($/hr)"                                                                                                                                                                                                                                                   
## [49] "INCTIME2      146      150     5.1   months to usual first raise"                                                                                                                                                                                                                                            
## [50] "FIRSTIN2      152      156     5.2   usual amount of first raise ($/hr)"                                                                                                                                                                                                                                     
## [51] "SPECIAL2      158      158     1.0   1 if special program for new workers"                                                                                                                                                                                                                                   
## [52] "MEALS2        160      160     1.0   free/reduced price code (See below)"                                                                                                                                                                                                                                    
## [53] "OPEN2R        162      166     5.2   hour of opening"                                                                                                                                                                                                                                                        
## [54] "HRSOPEN2      168      172     5.2   number hrs open per day"                                                                                                                                                                                                                                                
## [55] "PSODA2        174      178     5.2   price of medium soda, including tax"                                                                                                                                                                                                                                    
## [56] "PFRY2         180      184     5.2   price of small fries, including tax"                                                                                                                                                                                                                                    
## [57] "PENTREE2      186      190     5.2   price of entree, including tax"                                                                                                                                                                                                                                         
## [58] "NREGS2        192      193     2.0   number of cash registers in store"                                                                                                                                                                                                                                      
## [59] "NREGS112      195      196     2.0   number of registers open at 11:00 am"                                                                                                                                                                                                                                   
## [60] ""                                                                                                                                                                                                                                                                                                            
## [61] "Codes:"                                                                                                                                                                                                                                                                                                      
## [62] ""                                                                                                                                                                                                                                                                                                            
## [63] "Free/reduced Meal Variable:"                                                                                                                                                                                                                                                                                 
## [64] "0 = none"                                                                                                                                                                                                                                                                                                    
## [65] "1 = free meals"                                                                                                                                                                                                                                                                                              
## [66] "2 = reduced price meals"                                                                                                                                                                                                                                                                                     
## [67] "3 = both free and reduced price meals"                                                                                                                                                                                                                                                                       
## [68] ""                                                                                                                                                                                                                                                                                                            
## [69] ""                                                                                                                                                                                                                                                                                                            
## [70] "Second Interview Status"                                                                                                                                                                                                                                                                                     
## [71] "0 = refused second interview (count = 1)"                                                                                                                                                                                                                                                                    
## [72] "1 = answered 2nd interview (count = 399)"                                                                                                                                                                                                                                                                    
## [73] "2 = closed for renovations (count = 2)"                                                                                                                                                                                                                                                                      
## [74] "3 = closed \"permanently\" (count = 6)"                                                                                                                                                                                                                                                                      
## [75] "4 = closed for highway construction (count = 1)"                                                                                                                                                                                                                                                             
## [76] "5 = closed due to Mall fire (count = 1)"                                                                                                                                                                                                                                                                     
## [77] ""                                                                                                                                                                                                                                                                                                            
## [78] ""                                                                                                                                                                                                                                                                                                            
## [79] "*Note: number of call-backs = 0 if contacted on first call"                                                                                                                                                                                                                                                  
## [80] ""
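
In the codebook above, each variable name starts at the beginning of its line and is followed by a run of spaces, while headings and code descriptions are not. The sketch below shows how a lookahead pattern can exploit that layout. It uses four lines copied from the codebook output; `sample_lines` and `hits` are names chosen here for illustration.

```r
# Four lines copied from the codebook output above
sample_lines <- c(
  "SHEET           1        3     3.0   sheet number (unique store id)",
  "WAGE_ST        44       48     5.2   starting wage ($/hr)",
  "Dummies for location:",
  "0 = none"
)

# ^\S+     : a run of non-whitespace characters at the start of the line
# (?=  )   : lookahead requiring two spaces next, without consuming them
pattern <- "^\\S+(?=  )"

# regexpr() finds the first match per line; regmatches() keeps only the matches
hits <- regmatches(sample_lines, regexpr(pattern, sample_lines, perl = TRUE))
print(hits)
```

Only the two variable lines match. The heading and the code description are skipped because their first word is followed by a single space.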

1.2 Data Cleaning

# Variable names sit at the start of a codebook line and are followed by
# at least two spaces; the lookahead (?=  ) requires those spaces without
# including them in the match
pattern <- "^\\S+(?=  )"

# Locate the match in each codebook line (perl = TRUE enables the lookahead)
matches <- gregexpr(pattern, codebook, perl = TRUE)

# Extract the matched text; lines without a match yield character(0)
matched_words <- regmatches(codebook, matches)

# Flatten the list into a character vector
matched_words <- unlist(matched_words)

# Drop any blank or whitespace-only elements as a safeguard
new_column_names <- matched_words[matched_words != "" & !grepl("^\\s*$", matched_words)]

# Display the extracted column names
print(new_column_names)
##  [1] "SHEET"    "CHAIN"    "CO_OWNED" "STATE"    "SOUTHJ"   "CENTRALJ"
##  [7] "NORTHJ"   "PA1"      "PA2"      "SHORE"    "NCALLS"   "EMPFT"   
## [13] "EMPPT"    "NMGRS"    "WAGE_ST"  "INCTIME"  "FIRSTINC" "BONUS"   
## [19] "PCTAFF"   "MEALS"    "OPEN"     "HRSOPEN"  "PSODA"    "PFRY"    
## [25] "PENTREE"  "NREGS"    "NREGS11"  "TYPE2"    "STATUS2"  "DATE2"   
## [31] "NCALLS2"  "EMPFT2"   "EMPPT2"   "NMGRS2"   "WAGE_ST2" "INCTIME2"
## [37] "FIRSTIN2" "SPECIAL2" "MEALS2"   "OPEN2R"   "HRSOPEN2" "PSODA2"  
## [43] "PFRY2"    "PENTREE2" "NREGS2"   "NREGS112"
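
Before assigning the extracted names, it can help to confirm that there is exactly one unique name per column. The following is a small sketch of such a guard on toy data. `toy_df` and `toy_names` are hypothetical stand-ins for `df` and `new_column_names`.

```r
# Toy stand-ins (hypothetical) for df and new_column_names
toy_df <- data.frame(V1 = c(46, 49), V2 = c(1, 2), V3 = c(0, 0))
toy_names <- c("SHEET", "CHAIN", "CO_OWNED")

# Stop early if the name count differs from the column count
# or if any name appears more than once
stopifnot(
  length(toy_names) == ncol(toy_df),
  !anyDuplicated(toy_names)
)

# Safe to assign once the checks pass
colnames(toy_df) <- toy_names
print(colnames(toy_df))
```
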
# replace column names 
colnames(df) <- new_column_names
head(df)
##   SHEET CHAIN CO_OWNED STATE SOUTHJ CENTRALJ NORTHJ PA1 PA2 SHORE NCALLS EMPFT
## 1    46     1        0     0      0        0      0   1   0     0      0 30.00
## 2    49     2        0     0      0        0      0   1   0     0      0  6.50
## 3   506     2        1     0      0        0      0   1   0     0      0  3.00
## 4    56     4        1     0      0        0      0   1   0     0      0 20.00
## 5    61     4        1     0      0        0      0   1   0     0      0  6.00
## 6    62     4        1     0      0        0      0   1   0     0      2  0.00
##   EMPPT NMGRS WAGE_ST INCTIME FIRSTINC BONUS PCTAFF MEALS OPEN HRSOPEN PSODA
## 1 15.00  3.00       .    19.0        .     1      .     2  6.5    16.5  1.03
## 2  6.50  4.00       .    26.0        .     0      .     2 10.0    13.0  1.01
## 3  7.00  2.00       .    13.0     0.37     0   30.0     2 11.0    10.0  0.95
## 4 20.00  4.00    5.00    26.0     0.10     1    0.0     2 10.0    12.0  0.87
## 5 26.00  5.00    5.50    52.0     0.15     1    0.0     3 10.0    12.0  0.87
## 6 31.00  5.00    5.00    26.0     0.07     0   45.0     2 10.0    12.0  0.87
##   PFRY PENTREE NREGS NREGS11 TYPE2 STATUS2  DATE2 NCALLS2 EMPFT2 EMPPT2 NMGRS2
## 1 1.03    0.52     3       3     1       1 111792       1   3.50  35.00   3.00
## 2 0.90    2.35     4       3     1       1 111292       .   0.00  15.00   4.00
## 3 0.74    2.33     3       3     1       1 111292       .   3.00   7.00   4.00
## 4 0.82    1.79     2       2     1       1 111492       .   0.00  36.00   2.00
## 5 0.77    1.65     2       2     1       1 111492       .  28.00   3.00   6.00
## 6 0.77    0.95     2       2     1       1 111492       .      .      .      .
##   WAGE_ST2 INCTIME2 FIRSTIN2 SPECIAL2 MEALS2 OPEN2R HRSOPEN2 PSODA2 PFRY2
## 1     4.30     26.0     0.08        1      2   6.50    16.50   1.03     .
## 2     4.45     13.0     0.05        0      2  10.00    13.00   1.01  0.89
## 3     5.00     19.0     0.25        .      1  11.00    11.00   0.95  0.74
## 4     5.25     26.0     0.15        0      2  10.00    12.00   0.92  0.79
## 5     4.75     13.0     0.15        0      2  10.00    12.00   1.01  0.84
## 6        .     26.0        .        0      2  10.00    12.00      .  0.84
##   PENTREE2 NREGS2 NREGS112
## 1     0.94      4        4
## 2     2.35      4        4
## 3     2.33      4        3
## 4     0.87      2        2
## 5     0.95      2        2
## 6     1.79      3        3
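
The codebook describes `DATE2` as a date in MMDDYY format, stored here as a plain number. As a minimal sketch of how such a value maps to a date, the example below parses the first `DATE2` value shown above. The zero-padding step is an extra precaution assumed here, in case a leading zero is lost when the value is read as a number.

```r
# First DATE2 value from the preview above
date_code <- 111792L

# Pad to six digits so a month below 10 would keep its leading zero
date_text <- sprintf("%06d", date_code)

# %m = month, %d = day, %y = two-digit year
parsed <- as.Date(date_text, format = "%m%d%y")
print(parsed)
```
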
# find all character columns in df, recode "." as NA, and convert to numeric
df <- df |>
  mutate(across(where(is.character), ~ as.numeric(replace(., . == ".", NA))))
# parse the second-interview date (MMDDYY) into a Date
df$DATE2 <- as.Date(strptime(df$DATE2, format = "%m%d%y"))
df |>
  psych::describe() |>
  kable(caption = "Table 0: Summary Statistics") |>
  kable_classic()
Table 0: Summary Statistics
variable vars n mean sd median trimmed mad min max range skew kurtosis se
SHEET 1 410 246.5073171 148.2317989 237.50 244.1646341 186.807600 1.00 522.00 521.00 0.1079612 -1.1698454 7.3206467
CHAIN 2 410 2.1170732 1.1104972 2.00 2.0213415 1.482600 1.00 4.00 3.00 0.4099750 -1.2668755 0.0548435
CO_OWNED 3 410 0.3439024 0.4755893 0.00 0.3048780 0.000000 0.00 1.00 1.00 0.6548374 -1.5750117 0.0234877
STATE 4 410 0.8073171 0.3948880 1.00 0.8841463 0.000000 0.00 1.00 1.00 -1.5526808 0.4118399 0.0195021
SOUTHJ 5 410 0.2268293 0.4192929 0.00 0.1585366 0.000000 0.00 1.00 1.00 1.2998286 -0.3111868 0.0207074
CENTRALJ 6 410 0.1536585 0.3610617 0.00 0.0670732 0.000000 0.00 1.00 1.00 1.9137822 1.6666450 0.0178316
NORTHJ 7 410 0.4268293 0.4952214 0.00 0.4085366 0.000000 0.00 1.00 1.00 0.2947864 -1.9177606 0.0244572
PA1 8 410 0.0878049 0.2833567 0.00 0.0000000 0.000000 0.00 1.00 1.00 2.9022768 6.4389330 0.0139940
PA2 9 410 0.1048780 0.3067706 0.00 0.0060976 0.000000 0.00 1.00 1.00 2.5697266 4.6147684 0.0151503
SHORE 10 410 0.0853659 0.2797667 0.00 0.0000000 0.000000 0.00 1.00 1.00 2.9569123 6.7598353 0.0138167
NCALLS 11 410 1.1414634 1.3860950 0.00 0.9573171 0.000000 0.00 8.00 8.00 1.2248531 2.1240241 0.0684544
EMPFT 12 404 8.2029703 8.6195226 6.00 6.7916667 5.930400 0.00 60.00 60.00 1.8746681 5.1452723 0.4288373
EMPPT 13 406 18.8312808 10.0817269 17.00 17.9723926 10.378200 0.00 60.00 60.00 0.9252540 1.0792019 0.5003477
NMGRS 14 404 3.4202970 1.0184082 3.00 3.3512346 1.482600 1.00 10.00 9.00 1.5797983 6.9777038 0.0506677
WAGE_ST 15 390 4.6156410 0.3470151 4.50 4.5816026 0.370650 4.25 5.75 1.50 0.6591875 -0.3515297 0.0175718
INCTIME 16 379 18.3100264 11.2046053 13.00 17.3393443 13.343400 2.00 52.00 50.00 1.0781286 1.4317377 0.5755419
FIRSTINC 17 367 0.2252861 0.1079813 0.22 0.2121695 0.074130 0.05 0.75 0.70 1.6422167 4.3241097 0.0056366
BONUS 18 410 0.2463415 0.4314062 0.00 0.1829268 0.000000 0.00 1.00 1.00 1.1730934 -0.6253593 0.0213056
PCTAFF 19 366 48.8696721 35.1159517 50.00 48.5928571 44.478000 0.00 100.00 100.00 0.0412889 -1.3665141 1.8355402
MEALS 20 410 1.9073171 0.5336548 2.00 1.8902439 0.000000 0.00 3.00 3.00 -0.1811396 0.6623298 0.0263553
OPEN 21 410 8.0414634 2.1608854 7.00 8.0259146 1.482600 0.00 11.50 11.50 -0.2181176 0.3990436 0.1067185
HRSOPEN 22 410 14.4390244 2.8099873 15.50 14.4771341 2.223900 7.00 24.00 17.00 0.0761353 0.0591687 0.1387754
PSODA 23 402 1.0448756 0.0886873 1.06 1.0426398 0.074130 0.73 1.49 0.76 0.3476042 1.5595291 0.0044233
PFRY 24 393 0.9219847 0.1058813 0.93 0.9214603 0.118608 0.67 1.27 0.60 0.2517357 0.3211310 0.0053410
PENTREE 25 398 1.3221859 0.6430845 1.02 1.2297187 0.118608 0.49 3.95 3.46 1.2931530 0.3478130 0.0322349
NREGS 26 404 3.5940594 1.2576572 3.00 3.4907407 1.482600 1.00 8.00 7.00 0.7581589 0.5461304 0.0625708
NREGS11 27 398 2.7160804 0.8878087 3.00 2.6718750 1.482600 1.00 6.00 5.00 0.4547008 0.3926864 0.0445018
TYPE2 28 410 1.0951220 0.2937417 1.00 1.0000000 0.000000 1.00 2.00 1.00 2.7499703 5.5759543 0.0145069
STATUS2 29 410 1.0487805 0.3532053 1.00 1.0000000 0.000000 0.00 5.00 5.00 7.3333510 61.1614907 0.0174436
DATE2 30 410 NaN NA NA NaN NA Inf -Inf -Inf NA NA NA
NCALLS2 31 161 2.2236025 1.7641399 2.00 1.8372093 1.482600 1.00 9.00 8.00 2.1858518 4.9389853 0.1390337
EMPFT2 32 398 8.2751256 7.9707627 6.00 7.1421875 7.413000 0.00 40.00 40.00 1.1262055 0.8229380 0.3995382
EMPPT2 33 400 18.6775000 10.6996355 17.00 17.9796875 10.378200 0.00 60.00 60.00 0.6387888 0.1363367 0.5349818
NMGRS2 34 404 3.4839109 1.1398984 3.00 3.4336420 1.482600 0.00 8.00 8.00 0.4569002 1.7412189 0.0567121
WAGE_ST2 35 389 4.9962725 0.2531904 5.05 5.0329393 0.000000 4.25 6.25 2.00 -1.1329012 4.4272551 0.0128373
INCTIME2 36 344 22.0319767 12.8292950 26.00 20.6123188 16.308600 2.00 52.00 50.00 0.8325025 0.4946323 0.6917092
FIRSTIN2 37 330 0.2170909 0.1057957 0.20 0.2029167 0.074130 0.00 1.00 1.00 2.2566788 10.0619748 0.0058239
SPECIAL2 38 392 0.2091837 0.4072456 0.00 0.1369427 0.000000 0.00 1.00 1.00 1.4245678 0.0294880 0.0205690
MEALS2 39 399 1.7769424 0.5471440 2.00 1.7788162 0.000000 0.00 3.00 3.00 -0.2576902 0.1158165 0.0273915
OPEN2R 40 399 8.1109023 2.1579612 7.00 8.1253894 1.482600 0.00 11.50 11.50 -0.3889423 0.9599420 0.1080332
HRSOPEN2 41 399 14.4655388 2.7524948 15.00 14.4306854 2.965200 8.00 24.00 16.00 0.2890415 0.4151100 0.1377971
PSODA2 42 388 1.0449485 0.0935670 1.05 1.0394231 0.074130 0.41 1.40 0.99 0.0387418 6.9057394 0.0047501
PFRY2 43 382 0.9412304 0.1093037 0.94 0.9422549 0.111195 0.69 1.37 0.68 0.1734571 0.3395168 0.0055925
PENTREE2 44 386 1.3540674 0.6497019 1.04 1.2697097 0.185325 0.41 2.85 2.44 1.0409515 -0.6428443 0.0330690
NREGS2 45 388 3.6082474 1.2435400 3.00 3.5128205 1.482600 1.00 8.00 7.00 0.6415350 0.0090107 0.0631312
NREGS112 46 383 2.6605744 0.8860092 3.00 2.6026059 1.482600 0.00 7.00 7.00 0.6469254 1.6504314 0.0452730
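
To make the cleaning step concrete, here is a minimal sketch on made-up values rather than the real file. The toy data frame is an assumption; only the "." missing-value marker and the month-day-year layout of DATE2 come from the data above. It also shows why the DATE2 row of Table 0 is uninformative: a Date column is better summarised by its range than by a mean.

```r
# Toy data: "." marks a missing value, as in public.dat (values are made up)
toy <- data.frame(
  EMPFT2 = c("3.50", ".", "28.00"),
  DATE2  = c(111792, 111292, 111492),
  stringsAsFactors = FALSE
)

# Convert every character column: "." becomes NA, the rest become numbers
is_chr <- vapply(toy, is.character, logical(1))
toy[is_chr] <- lapply(toy[is_chr],
                      function(x) as.numeric(replace(x, x == ".", NA)))

# Parse the six-digit interview date (month, day, two-digit year)
toy$DATE2 <- as.Date(as.character(toy$DATE2), format = "%m%d%y")

print(toy)
range(toy$DATE2)  # earliest and latest interview date
```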

2.0 Sample Tabulations:

# Create Tabulations for all data
table1_col1 <- df |>
  summarise(Num_stores = length(SHEET),
            Refused_Second_Interview = sum(STATUS2 == 0),
            Num_Closed = sum(STATUS2 == 3),
            Num_Renovations = sum(STATUS2 == 2),
            Num_Interviewed = sum(STATUS2 == 1),
            Num_temp_closed = sum(STATUS2 == 4),
            Num_Close_due_to_fire = sum(STATUS2 == 5)) |> 
  select(Num_stores,
         Num_Interviewed,
         Num_Renovations,
         Num_Closed,
         Num_temp_closed,
         Num_Close_due_to_fire,
         Refused_Second_Interview
         )

# Tabulate for each state
table1_col2 <- df |>
  group_by(STATE)|>
  summarise(Num_stores = length(SHEET),
            Refused_Second_Interview = sum(STATUS2 == 0),
            Num_Closed = sum(STATUS2 == 3),
            Num_Renovations = sum(STATUS2 == 2),
            Num_Interviewed = sum(STATUS2 == 1),
            Num_temp_closed = sum(STATUS2 == 4),
            Num_Close_due_to_fire = sum(STATUS2 == 5))|>
  select(Num_stores,
         Num_Interviewed,
         Num_Renovations,
         Num_Closed,
         Num_temp_closed,
         Num_Close_due_to_fire,
         Refused_Second_Interview
         )
  
# combine data, then transpose for correct orientation
table1 <- as.data.frame(t(rbind(table1_col1,table1_col2)))

# rename columns 
table1 <- table1 |>
  rename(All = V1,
         NJ = V3,
         PA = V2)

# use Kable to make pretty table 
kable(table1, caption = "Table 1 - Sample Design and Response Rates")|>
  kable_classic(full_width = FALSE , html_font = "Cambria")
Table 1 - Sample Design and Response Rates
All PA NJ
Num_stores 410 79 331
Num_Interviewed 399 78 321
Num_Renovations 2 0 2
Num_Closed 6 1 5
Num_temp_closed 1 0 1
Num_Close_due_to_fire 1 0 1
Refused_Second_Interview 1 0 1

The public data set no longer contains the stores that dropped out after the first wave, which is why this table looks different from the one in the original paper.
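
The same counts can be cross-checked with a plain cross-tabulation. This is only a sketch on made-up rows, not the survey data; the status labels simply mirror the names I used in the summarise() calls above.

```r
# Toy data (made up): STATE 1 = NJ, 0 = PA; STATUS2 coded as in the code above
toy <- data.frame(
  STATE   = c(1, 1, 1, 1, 0, 0),
  STATUS2 = c(1, 1, 3, 2, 1, 3)
)

# Label the codes so the table is readable
toy$state  <- factor(toy$STATE, levels = c(0, 1), labels = c("PA", "NJ"))
toy$status <- factor(toy$STATUS2, levels = 0:5,
                     labels = c("Refused", "Interviewed", "Renovations",
                                "Closed", "Temp closed", "Fire"))

# Counts by status and state, plus an "All" column
tab <- table(toy$status, toy$state)
tab <- cbind(All = rowSums(tab), tab)
print(tab)
```

Because the factor levels are fixed in advance, statuses that never occur still appear as rows of zeros, which keeps the layout identical across data sets.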

# Calculations for table 2 
table2 <-  df|>
  group_by(STATE)|>
  summarise(`a. Burger King` = sum(CHAIN == 1),
             `b. KFC` = sum(CHAIN == 2),
             `c. Roy Rogers` = sum(CHAIN == 3),
             `d. Wendys` = sum(CHAIN == 4),
             `e. CompanyOwned` = sum(CO_OWNED == 1),
             `f. FTE Employment` = mean(EMPFT + NMGRS + .5 * EMPPT , na.rm = TRUE), 
             `g. Percentage full-time employees` = mean((EMPFT + NMGRS) / (EMPFT + EMPPT + NMGRS) ,na.rm = TRUE ),
             `h. Starting Wage` = mean(WAGE_ST, na.rm = TRUE),
             `i. Wage = $4.25 (percentage)` = mean(PCTAFF, na.rm = TRUE),  
             `j. Price of full meal` = mean(PENTREE + PFRY + PSODA, na.rm = TRUE),
             `k. Hours open on week days` = mean(HRSOPEN, na.rm = TRUE),
             `l. FTE Employment` = mean(EMPFT2 + NMGRS2 + .5 * EMPPT2 , na.rm = TRUE), 
             `m. Percentage full-time employees` = mean((EMPFT2 + NMGRS2) / (EMPFT2 + EMPPT2 + NMGRS2),na.rm = TRUE),
             `n. Starting Wage` = mean(WAGE_ST2, na.rm = TRUE),
             `o. Price of full meal` = mean(PENTREE2 + PFRY2 + PSODA2, na.rm = TRUE),
             `p. Hours open on week days` = mean(HRSOPEN2, na.rm = TRUE)
              ) |>
  mutate(across(everything(), ~ round(., 2)))
             
# transpose for correct orientation
table2 <- as.data.frame(t(table2))

# Rename columns for table 
table2 <- table2 |>
  rename(PA = V1,
         NJ = V2)

# Kable for pretty table 
kable(table2, caption = "Table 2 - Means of Key Variables") |>
  kable_classic(full_width = F, html_font = "Cambria") |>
  pack_rows("1. Distribution of Store Types", 2, 6) |>
  pack_rows("2. Means in Wave 1:", 7, 12) |>
  pack_rows("3. Means in Wave 2:", 13, 17)
Table 2 - Means of Key Variables
                                         PA       NJ
STATE                                  0.00     1.00
1. Distribution of Store Types
  a. Burger King                      35.00   136.00
  b. KFC                              12.00    68.00
  c. Roy Rogers                       17.00    82.00
  d. Wendys                           15.00    45.00
  e. CompanyOwned                     28.00   113.00
2. Means in Wave 1:
  f. FTE Employment                   23.33    20.44
  g. Percentage full-time employees    0.39     0.38
  h. Starting Wage                     4.63     4.61
  i. Wage = $4.25 (percentage)        45.91    49.62
  j. Price of full meal                3.04     3.35
  k. Hours open on week days          14.53    14.42
3. Means in Wave 2:
  l. FTE Employment                   21.17    21.03
  m. Percentage full-time employees    0.36     0.41
  n. Starting Wage                     4.62     5.08
  o. Price of full meal                3.03     3.41
  p. Hours open on week days          14.65    14.42
# Plot the percentage of stores at each starting wage, computed within each state
# (density * width is a bin's share of its own group, so each state sums to 100)
df |>
  mutate(state = factor(STATE,
                        labels = c("Pennsylvania", "New Jersey"))) |>
  ggplot(aes(x = WAGE_ST, fill = state)) +
  geom_histogram(aes(y = after_stat(density * width * 100)),
                 position = "dodge", bins = 14) +
  labs(title = "February 1992", x = "Wage Range", y = "Percentage of Stores") +
  theme_classic()

df |>
  mutate(state = factor(STATE,
                        labels = c("Pennsylvania", "New Jersey"))) |>
  ggplot(aes(x = WAGE_ST2, fill = state)) +
  geom_histogram(aes(y = after_stat(density * width * 100)),
                 position = "dodge", bins = 14) +
  labs(title = "November 1992", x = "Wage Range", y = "Percentage of Stores") +
  theme_classic()
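
If the goal is the percentage of stores within each state at each wage, the bar heights can be checked numerically. A minimal sketch on made-up wages (the toy data frame is an assumption, not the survey data):

```r
# Toy starting wages (made up); STATE 1 = NJ, 0 = PA
toy <- data.frame(
  STATE   = c(0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1),
  WAGE_ST = c(4.25, 4.25, 4.50, 5.00,
              4.25, 4.25, 4.25, 4.50, 4.50, 4.75, 5.00, 5.00)
)
toy$state <- factor(toy$STATE, levels = c(0, 1),
                    labels = c("Pennsylvania", "New Jersey"))

# Count stores per state and wage, then turn each state's row into percentages
counts <- table(toy$state, toy$WAGE_ST)
pct    <- prop.table(counts, margin = 1) * 100
print(round(pct, 1))

rowSums(pct)  # each state sums to 100
```

Dividing by the state total rather than the overall total matters here because the two samples are very different in size, so shares of all stores would make the smaller state's bars look short at every wage.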

3.0 Difference-in-Differences Estimation

3.1 Simple Comparison

# Column i, ii, & iii 
stores_by_state <- df |>
  group_by(STATE)|>
  summarise(FTE_EMP1 = mean(EMPFT + NMGRS + .5 * EMPPT, na.rm = TRUE),
         FTE_EMP2 = mean(EMPFT2 + NMGRS2 + .5 * EMPPT2, na.rm = TRUE),
         Change_in_mean_FTE = FTE_EMP2 - FTE_EMP1)|> 
  select(FTE_EMP1,FTE_EMP2,Change_in_mean_FTE) |>
  mutate(across(everything(), ~ round(., 2)))

# transpose data 
stores_by_state <- as.data.frame(t(stores_by_state))

# rename columns for table
stores_by_state <- stores_by_state |>
  rename(`PA (i)` = V1,
         `NJ (ii)` = V2) |>
  mutate(`Diffrence in NJ - PA (iii)` = `NJ (ii)` - `PA (i)`)
# Column iv 
stores_NJ_iv <- df |>
  filter(STATE == 1 & WAGE_ST == 4.25 ) |>
  summarise(FTE_EMP1 = mean(EMPFT + NMGRS + .5 * EMPPT, na.rm = TRUE),
         FTE_EMP2 = mean(EMPFT2 + NMGRS2 + .5 * EMPPT2, na.rm = TRUE),
         Change_in_mean_FTE = FTE_EMP2 - FTE_EMP1)|> 
  select(FTE_EMP1,FTE_EMP2,Change_in_mean_FTE) |>
  mutate(across(everything(), ~ round(., 2)))

# Column V 
stores_NJ_v <- df |>
  filter(STATE == 1 & between(WAGE_ST, 4.26,4.99)) |>
  summarise(FTE_EMP1 = mean(EMPFT + NMGRS + .5 * EMPPT, na.rm = TRUE),
         FTE_EMP2 = mean(EMPFT2 + NMGRS2 + .5 * EMPPT2, na.rm = TRUE),
         Change_in_mean_FTE = FTE_EMP2 - FTE_EMP1)|> 
  select(FTE_EMP1,FTE_EMP2,Change_in_mean_FTE) |>
  mutate(across(everything(), ~ round(., 2)))

# Column vi
stores_NJ_vi <- df |>
  filter(STATE == 1 & WAGE_ST >= 5) |>
  summarise(FTE_EMP1 = mean(EMPFT + NMGRS + .5 * EMPPT, na.rm = TRUE),
         FTE_EMP2 = mean(EMPFT2 + NMGRS2 + .5 * EMPPT2, na.rm = TRUE),
         Change_in_mean_FTE = FTE_EMP2 - FTE_EMP1)|> 
  select(FTE_EMP1,FTE_EMP2,Change_in_mean_FTE) |>
  mutate(across(everything(), ~ round(., 2)))

# transpose Data 
stores_NJ <- as.data.frame(t(rbind(stores_NJ_iv, stores_NJ_v, stores_NJ_vi)))

# rename columns for table 
stores_NJ <- stores_NJ |>
  rename(`Wage = $4.25 (iv)` = V1,
         `Wage = $4.26-$4.99 (v)`= V2,
         `Wage = $5.00 (vi)` = V3)
# Column vii
columnVII <- df |>
  filter(STATE == 1) |>
  summarise(FTEMP1_low = mean(ifelse(WAGE_ST == 4.25, EMPFT + NMGRS + .5 *
                                          EMPPT, NA_real_), na.rm = TRUE),
            FTEMP1_high = mean(ifelse(WAGE_ST >= 5 , EMPFT + NMGRS + .5 *
                                          EMPPT, NA_real_), na.rm = TRUE),
            `low-high FTE Before` = FTEMP1_low - FTEMP1_high,
            FTEMP2_low = mean(ifelse(WAGE_ST == 4.25, EMPFT2 + NMGRS2 + .5 *
                                          EMPPT2, NA_real_), na.rm = TRUE),
            FTEMP2_high = mean(ifelse(WAGE_ST >= 5 , EMPFT2 + NMGRS2 + .5 *
                                          EMPPT2, NA_real_), na.rm = TRUE),
            `low-high FTE After` = FTEMP2_low - FTEMP2_high,
            `Change in mean FTE` = `low-high FTE After` -`low-high FTE Before` 
            ) |>
  select(`low-high FTE Before`, `low-high FTE After`, `Change in mean FTE`) |>
    rename( FTE_EMP1 = `low-high FTE Before`,
          FTE_EMP2 = `low-high FTE After`, 
          Change_in_mean_FTE =  `Change in mean FTE` )


  
# Column viii
columnVIII <- df |>
  filter(STATE == 1) |>
  summarise(FTEMP1_mid = mean(ifelse(between(WAGE_ST, 4.26,4.99), 
                                     EMPFT + NMGRS + .5 *EMPPT,
                                     NA_real_), na.rm = TRUE),
            FTEMP1_high = mean(ifelse(WAGE_ST >= 5 , EMPFT + NMGRS + .5 *
                                          EMPPT, NA_real_), na.rm = TRUE),
            `mid-high FTE Before` = FTEMP1_mid - FTEMP1_high,
            FTEMP2_mid = mean(ifelse(between(WAGE_ST, 4.26,4.99), 
                                     EMPFT2 + NMGRS2 + .5 *EMPPT2,
                                     NA_real_), na.rm = TRUE),
            FTEMP2_high = mean(ifelse(WAGE_ST >= 5 ,
                                      EMPFT2 + NMGRS2 + .5 *EMPPT2,
                                      NA_real_), na.rm = TRUE),
            `mid-high FTE After` = FTEMP2_mid - FTEMP2_high,
            `Change in mean FTE` = `mid-high FTE After` -`mid-high FTE Before` 
            ) |>
  select(`mid-high FTE Before`, `mid-high FTE After`, `Change in mean FTE`) |>
  rename( FTE_EMP1 = `mid-high FTE Before`,
          FTE_EMP2 = `mid-high FTE After`, 
          Change_in_mean_FTE =  `Change in mean FTE` )
  

Diffrence_NJ <- as.data.frame(t(rbind(columnVII, columnVIII)))

Diffrence_NJ <- Diffrence_NJ |>
  rename(`Low-High (vii)` = V1,
         `Mid-High (viii)` = V2)
# combine data frames for table 
table3 <- cbind(stores_by_state, stores_NJ, Diffrence_NJ )

# use Kable for making pretty table 
kable(table3, caption = "Table 3 - AVERAGE EMPLOYMENT PER STORE BEFORE AND 
AFTER THE RISE IN NEW JERSEY MINIMUM WAGE") |>
  kable_classic(full_width = F, html_font = "Cambria") |>
  add_header_above(c(" ","Stores by State" = 3,"Stores in New Jersey" = 3, 
                     "Differences within NJ" = 2))
Table 3 - AVERAGE EMPLOYMENT PER STORE BEFORE AND AFTER THE RISE IN NEW JERSEY MINIMUM WAGE
                      Stores by State                                Stores in New Jersey                                          Differences within NJ
                      PA (i)  NJ (ii)  Diffrence in NJ - PA (iii)  Wage = $4.25 (iv)  Wage = $4.26-$4.99 (v)  Wage = $5.00 (vi)  Low-High (vii)  Mid-High (viii)
FTE_EMP1               23.33    20.44                       -2.89              19.56                   20.08              22.25      -2.6932990       -2.1693431
FTE_EMP2               20.89    21.31                        0.42              21.18                   21.32              20.29       0.6637829        0.7417874
Change_in_mean_FTE     -2.44     0.87                        3.31               1.63                    1.24              -1.96       3.3570819        2.9111305
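
The logic behind column (iii) is the difference-in-differences estimate: the change in the treated state minus the change in the control state. A minimal sketch of that arithmetic, where did_estimate is a made-up helper and the four means are round illustrative numbers, not survey results:

```r
# Difference-in-differences from four group means
# (did_estimate is a hypothetical helper, not part of the analysis above)
did_estimate <- function(treat_before, treat_after,
                         control_before, control_after) {
  (treat_after - treat_before) - (control_after - control_before)
}

# Illustrative means only: treated group rises by 1, control group falls by 2
did <- did_estimate(treat_before = 20, treat_after = 21,
                    control_before = 23, control_after = 21)
print(did)  # 3
```

The control group's change stands in for what would have happened to the treated group without the policy, so a fall in the control group pushes the estimate up even when the treated group barely moves.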

3.2 Regression Analysis

# Create variables for regression Analysis
final_df <- df |>
  mutate(FTE_EMP1 = EMPFT + NMGRS + .5 * EMPPT,
         FTE_EMP2 = EMPFT2 + NMGRS2 + .5 * EMPPT2,
         delta_emp = FTE_EMP2 - FTE_EMP1, 
         GAP = ifelse(STATE == 1 & WAGE_ST <= 5.05,((5.05-WAGE_ST)/WAGE_ST),0),
         CHAIN1 = ifelse(CHAIN == 1,1,0),
         CHAIN2 = ifelse(CHAIN == 2,1,0),
         CHAIN3 = ifelse(CHAIN == 3,1,0)) |>
  select(FTE_EMP1,FTE_EMP2,delta_emp,GAP,CHAIN1,CHAIN2,CHAIN3, STATE, CO_OWNED,
         SOUTHJ, CENTRALJ, NORTHJ, PA1, PA2) |>
  drop_na()
# Check the mean of the non-zero GAP values
final_df|>
  filter(GAP > 0) |>
  summarise(GAP_MEAN = mean(GAP)) -> test

print(test)
##    GAP_MEAN
## 1 0.1152347
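
GAP is the proportional raise a store would need to bring its wave 1 starting wage up to the new $5.05 minimum; it is zero for Pennsylvania stores and for New Jersey stores already above the new minimum. A store paying $4.25, for example, has GAP = (5.05 - 4.25) / 4.25, roughly 0.19. The sketch below wraps the same ifelse() in a made-up helper, wage_gap, and applies it to toy rows:

```r
# Proportional wage increase needed to reach the new minimum
# (wage_gap is a hypothetical helper; 5.05 is the new NJ minimum wage)
wage_gap <- function(state, wage, new_min = 5.05) {
  ifelse(state == 1 & wage <= new_min, (new_min - wage) / wage, 0)
}

# Toy rows (made up): three NJ stores, two PA stores, one wage missing
toy <- data.frame(
  STATE   = c(1, 1, 1, 0, 0),
  WAGE_ST = c(4.25, 4.75, 5.25, 4.25, NA)
)
toy$GAP <- wage_gap(toy$STATE, toy$WAGE_ST)
print(toy)
```

Note the missing-value behaviour: a Pennsylvania store with an unknown wage still gets GAP = 0 because FALSE & NA is FALSE, whereas a New Jersey store with an unknown wage would get NA and be removed by drop_na().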
# Regressions 
reg1 <- lm(delta_emp ~ STATE, data = final_df)

reg2 <- lm(delta_emp ~ STATE + CHAIN1 + CHAIN2 + CHAIN3 + CO_OWNED, data = final_df)

reg3 <- lm(delta_emp ~ GAP, data = final_df)

reg4 <- lm(delta_emp ~ GAP + CHAIN1 + CHAIN2 + CHAIN3 + CO_OWNED, data = final_df)

reg5 <- lm(delta_emp ~ GAP + SOUTHJ + CENTRALJ + NORTHJ + PA1 + PA2 ,data = final_df)
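
With a single 0/1 regressor, the slope in a regression of the employment change on the state dummy is exactly the difference in mean changes between the two states, which is why reg1 is a regression version of the simple comparison. A minimal sketch on simulated data (all values here are made up):

```r
set.seed(1)  # simulated toy data, not the survey
toy <- data.frame(STATE = rep(c(0, 1), times = c(40, 160)))
toy$delta_emp <- rnorm(nrow(toy),
                       mean = ifelse(toy$STATE == 1, 0.5, -2), sd = 8)

# Regression of the employment change on the treatment dummy
fit <- lm(delta_emp ~ STATE, data = toy)

# Difference in mean changes: treated minus control
diff_means <- mean(toy$delta_emp[toy$STATE == 1]) -
  mean(toy$delta_emp[toy$STATE == 0])

coef(fit)["STATE"]
diff_means
all.equal(unname(coef(fit)["STATE"]), diff_means)  # TRUE
```

The intercept plays the matching role for the control group: it equals the mean change among the STATE == 0 rows.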
# present HTML table of regression results
htmltools::includeHTML("table3.htm")
Table 4: Reduced Form models
Dependent variable: delta_emp
                     (1)                     (2)                     (3)                     (4)                     (5)
STATE                2.496**                 2.497**
                     (1.127)                 (1.130)
CHAIN1                                       0.189                                           -0.355
                                             (1.444)                                         (1.462)
CHAIN2                                       0.839                                           0.453
                                             (1.625)                                         (1.636)
CHAIN3                                       -1.984                                          -2.199
                                             (1.636)                                         (1.637)
CO_OWNED                                     0.333                                           0.431
                                             (1.072)                                         (1.071)
GAP                                                                  16.359***               15.877***               13.005*
                                                                     (5.924)                 (6.044)                 (7.156)
SOUTHJ                                                                                                               0.054
                                                                                                                     (1.844)
CENTRALJ                                                                                                             -1.346
                                                                                                                     (1.938)
NORTHJ                                                                                                               0.093
                                                                                                                     (1.688)
PA1                                                                                                                  -2.898
                                                                                                                     (2.013)
PA2
Constant             -2.283**                -2.167                  -1.669**                -1.187                  -0.970
                     (1.006)                 (1.521)                 (0.672)                 (1.313)                 (1.355)
Observations         368                     368                     368                     368                     368
R2                   0.013                   0.026                   0.020                   0.032                   0.031
Adjusted R2          0.011                   0.013                   0.018                   0.018                   0.018
Residual Std. Error  8.709 (df = 366)        8.699 (df = 362)        8.678 (df = 366)        8.675 (df = 362)        8.678 (df = 362)
F Statistic          4.906** (df = 1; 366)   1.958* (df = 5; 362)    7.626*** (df = 1; 366)  2.368** (df = 5; 362)   2.318** (df = 5; 362)
Note: *p<0.1; **p<0.05; ***p<0.01
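
The empty PA2 row in model (5) is consistent with perfect collinearity: the five region dummies cover every store, so together they add up to the intercept column and lm() cannot estimate all of them. A minimal sketch of the same situation on simulated data, with three made-up regions RA, RB and RC:

```r
set.seed(2)  # simulated toy data, not the survey
region <- rep(c("A", "B", "C"), each = 30)
toy <- data.frame(
  y  = rnorm(90),
  RA = as.numeric(region == "A"),
  RB = as.numeric(region == "B"),
  RC = as.numeric(region == "C")
)

# RA + RB + RC equals 1 in every row, the same as the intercept column
fit_all <- lm(y ~ RA + RB + RC, data = toy)
coef(fit_all)  # the last dummy comes back as NA

# Dropping one dummy makes that region the reference category
fit_ref <- lm(y ~ RA + RB, data = toy)
coef(fit_ref)
```

Both fits give identical predictions; leaving one dummy out just makes the choice of reference region explicit instead of letting lm() drop the last one silently.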

4.0 Specification Tests

Conclusion