Set-up and background

The assignment is worth 100 points. There are 27 questions. You should have the following packages installed:

library(tidyverse)
library(patchwork)
library(fixest)

In this problem set you will summarize the paper “Do Workers Work More if Wages Are High? Evidence from a Randomized Field Experiment” (Fehr and Goette, AER 2007) and recreate some of its findings.

1 Big picture

[Q1] What is the main question asked in this paper?

This paper studies how workers respond to transitory wage changes. Past studies are insufficient because they do not correct for temporal confounds such as seasonal variation in labor supply or in workers' needs. In this study of Swiss bicycle messengers, the behavioral foundations of labor supply can be examined directly, since the workers are free to choose both how many shifts they work and how much effort they put in on each shift.

[Q2] Recall the taxi cab studies where reference dependence is studied using observational data. What can an experimental study do that an observational study can’t?

The researchers were able to run two different experiments: one designed around the neoclassical model and the other around reference dependence. This gave them the opportunity to measure reference dependence as a personal trait in one setting and test whether it predicted behavior in the other, and so to determine whether loss aversion affected labor supply and to adjust the model for that behavior. An observational study cannot do this: it does not interfere, it only observes, whereas an experimental study sets the parameters, so the wage change is exogenous by construction.

[Q3] Summarize the field experiment design.

Messengers who agreed to participate were randomly assigned to group A or group B. Participants did not know the purpose of the study, but were required to fill out questionnaires during the study in order to benefit from it; anyone who did not complete them was removed from the study. The data therefore consist only of those who followed the guidelines, so the analyzed sample has no attrition. In different four-week blocks, group A and then group B received an increased commission rate, so each group was treated at some point and served as the control while the other was treated. Earnings were not paid out until the end of the study to ensure that interim income did not affect behavior.

[Q4] Summarize the laboratory experiment design. Why was it included with the study?

The laboratory experiment was included to examine whether one could predict which messengers would work more shifts yet put in less effort on those shifts. It was included because earlier studies had been unable to make this prediction and unable to eliminate temporal effects. In the taxi studies, drivers had income targets they worked toward that varied from person to person; this part of the study tried to measure and account for that heterogeneity to get more accurate predictions. Notably, the messengers kept and turned in receipts for all of their rides, so they were keenly aware of their revenues relative to any target.

[Q5] Summarize the main results of the field experiment.

First, groups A and B had nearly identical revenues before the experiment, so differences that emerge during the treatment blocks can be attributed to the wage increase itself. Second, messengers worked more shifts when the commission rate was higher. The wage elasticity of total revenue equals the elasticity of shifts plus the elasticity of revenue per shift; the fact that the elasticity of shifts exceeds the elasticity of total revenue is therefore a first indication that the elasticity of effort per shift is negative. Lastly, the study found that the negative effect of the wage increase on revenue per shift is associated with the messengers' degree of loss aversion, so loss aversion did shape the results.
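This decomposition is just an identity: total revenue in a block equals the number of shifts times average revenue per shift, so in logs the wage elasticities add up:

\[ R = S \cdot r \quad\Rightarrow\quad \ln R = \ln S + \ln r \quad\Rightarrow\quad \varepsilon_{R,w} = \varepsilon_{S,w} + \varepsilon_{r,w} \]

Hence if the elasticity of shifts \(\varepsilon_{S,w}\) exceeds the elasticity of total revenue \(\varepsilon_{R,w}\), the elasticity of revenue (effort) per shift \(\varepsilon_{r,w}\) must be negative.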

[Q6] Summarize the main results of the laboratory experiment.

Workers increase their working time once they receive a higher wage, although the wage increase does cause a decrease in effort per shift. The increase in the number of shifts, however, dominates the negative effect on effort per shift by a large margin, so overall labor supply strongly increases.

[Q7] Why are these results valuable? What have we learned? Motivate your discussion with a real-world example.

The TLDR here is that if you raise wages, more people are willing to work more hours. Effort may dip a bit, since workers can reach their previous earnings targets with less effort, but this is not a bad thing. I am in the midst of our planning cycle at a consulting firm, and one of our strongest indicators is employee utilization. We aim for utilization of around 60 percent, not 100 percent, because anything much above 60 percent leads to burnout and turnover, and hiring new employees is expensive with a steep learning curve. The lesson from this study is similar: if you pay well and don't overwork people, employees tend to stay, are generally happier, and you draw from a larger candidate pool. Maybe McDonald's and others should consider raising wages a bit as employees leave en masse.

2 Replication

Use theme_classic() for all plots.

2.1 Correlations in revenues across firms

For this section please use dailycorrs.csv.

dailycorrs = read_csv("~/Documents/BC Applied Economics/Behavioral Economics/dailycorrs.csv")

[Q8] The authors show that earnings at Veloblitz and Flash are correlated. Show this with a scatter plot with a regression line and no confidence interval. Title your axes and the plot appropriately. Do not print the plot but assign it to an object called p1.

p1 <- dailycorrs %>% 
  ggplot(aes(x = logv, y = logf)) + 
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +  # regression line, no confidence band
  xlab("Log Veloblitz Earnings") + 
  ylab("Log Flash Earnings") +
  labs(title = "Earnings Correlation Across Firms") +
  theme_classic()

[Q9] Next plot the kernel density estimates of revenues for both companies. Overlay the distributions and make the densities transparent so they are easily seen. Title your axes and the plot appropriately. Do not print the plot but assign it to an object called p2.

# Stack the two earnings columns into long format for a single density plot
melted_v <- data.frame(Earnings = dailycorrs$logv, Company = "Veloblitz")
melted_f <- data.frame(Earnings = dailycorrs$logf, Company = "Flash")
melted   <- rbind(melted_v, melted_f)

p2 <- melted %>% 
  ggplot(aes(x = Earnings, fill = Company)) + 
  geom_density(alpha = 0.4) +  # transparency so the overlapping densities show
  labs(title = "Earnings Density by Company", x = "Log Earnings") +
  theme_classic()

[Q10] Now combine both plots using library(patchwork) and label the plots with letters.

library(patchwork)
p_combined <- p1 + p2 + plot_annotation(tag_levels = "A")  # tag panels A and B

p_combined

2.2 Tables 2 and 3

For this section please use tables1to4.csv.

tables1to4 = read_csv("~/Documents/BC Applied Economics/Behavioral Economics/tables1to4.csv")

2.2.1 Table 2

On page 307 the authors write:

“Table 2 controls for individual fixed effects by showing how, on average, the messengers’ revenues deviate from their person-specific mean revenues. Thus, a positive number here indicates a positive deviation from the person-specific mean; a negative number indicates a negative deviation.”

[Q11] Fixed effects are a way to control for heterogeneity across individuals that is time invariant. Why would we want to control for fixed effects? Give a reason how bike messengers could be different from each other, and how these differences might not vary over time.

We control for fixed effects so that we can isolate the variation in the variables we actually want to study over time. Fixed effects, as the name suggests, do not change within an individual, so they exert the same influence throughout the study period. For instance, some bike messengers may be afraid of the dark or hate the rain; regardless of the commission rate on offer, they likely won't ride at night or in bad weather. We want to control for these traits because they are present in every period and would otherwise skew our estimates.
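To make this concrete, here is a minimal sketch with simulated data (the numbers and the rain_averse trait are hypothetical, not from the paper): a time-invariant trait shifts every one of a messenger's observations by the same amount, so demeaning within messenger removes it entirely.

# Hypothetical illustration: a fixed trait (rain aversion) lowers revenue in
# every block for messenger 2; within-messenger demeaning absorbs that shift
set.seed(1)
toy <- tibble(
  fahrer      = rep(1:3, each = 3),          # three messengers, three blocks
  rain_averse = rep(c(0, 1, 0), each = 3),   # trait is constant over time
  revenue     = 2000 - 500 * rain_averse + rnorm(9, sd = 100)
)

toy %>% 
  group_by(fahrer) %>% 
  mutate(revenue_fe = revenue - mean(revenue)) %>%  # deviation from own mean
  ungroup()
# revenue_fe varies only within messenger: the -500 trait effect is gone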

[Q12] Create a variable called totrev_fe and add it to the dataframe. This requires you to “average out” each individual’s revenue for a block from their average revenue: \(x_{it}^{fe} = x_{it} - \bar{x}_i\) where \(x_{it}^{fe}\) is the fixed-effect revenue for \(i\) in block \(t\).

total_rev_data <- tables1to4 %>%
  group_by(fahrer) %>% 
  mutate(mean_total_rev = mean(totrev),                 # person-specific mean revenue
         totrev_fe      = totrev - mean_total_rev) %>%  # deviation from own mean
  ungroup()

[Q13] Use summarise() to recreate the findings in Table 2 for “Participating Messengers” using your new variable totrev_fe. (You do not have to calculate the differences in means.)

In addition to calculating the fixed-effect controlled means, calculate the standard errors. Recall the standard error is \(\frac{s_{jt}}{\sqrt{n_{jt}}}\) where \(s_{jt}\) is the standard deviation for treatment \(j\) in block \(t\) and \(n_{jt}\) is the corresponding number of observations.

(Hint: use n() to count observations.) Each calculation should be named to a new variable. Assign the resulting dataframe to a new dataframe called df_avg_revenue.

df_avg_revenue <- total_rev_data %>% 
  drop_na(odd) %>%                 # keep only messengers assigned to a treatment group
  group_by(odd, block) %>% 
  summarise(n                 = n(),                      # observations per cell
            mean_total_rev_fe = mean(totrev_fe),          # fixed-effect controlled mean
            se_total_rev      = sd(totrev_fe) / sqrt(n))  # standard error of the mean

df_avg_revenue
## # A tibble: 6 × 5
## # Groups:   odd [2]
##     odd block     n mean_total_rev_fe se_total_rev
##   <dbl> <dbl> <int>             <dbl>        <dbl>
## 1     0     1    19            -120.          303.
## 2     0     2    20            -278.          241.
## 3     0     3    20             392.          251.
## 4     1     1    21             -48.9         361.
## 5     1     2    22             722.          193.
## 6     1     3    22            -675.          289.

[Q14] Plot df_avg_revenue. Use points for the means and error bars for standard errors of the means.

To dodge the points and size them appropriately, use

geom_point(position=position_dodge(width=0.5), size=4)

To place the error bars use

geom_errorbar(aes(
  x = block, 
  ymin = mean_total_rev_fe - se_total_rev, ymax = mean_total_rev_fe + se_total_rev),
  width = .1,
  position=position_dodge(width=0.5))

You will need to replace [MEAN] with whatever you named your average revenues and [SE] with whatever you named your standard errors.

p3 <- df_avg_revenue %>% 
  ggplot(aes(x = block, y = mean_total_rev_fe, colour = factor(odd))) +
  geom_point(position = position_dodge(width = 0.5), size = 4) +
  geom_errorbar(aes(
    x = block, 
    ymin = mean_total_rev_fe - se_total_rev, ymax = mean_total_rev_fe + se_total_rev),
    width = 0.1, 
    position = position_dodge(width = 0.5)) +
  theme_classic()

p3

[Q15] Interpret the plot.

The pattern is fairly clear: in the blocks where a group received the higher commission, its revenues deviate positively from the person-specific means, while the group not receiving the increase shows negative deviations.

2.2.2 Table 3

[Q16] Recreate the point estimates in Model (1) in Table 3 by hand (you don’t need to worry about the standard errors). Assign it to object m1. Recreating this model requires you to control for individual fixed effects and estimate the following equation where \(\text{H}\) is the variable high, \(\text{B2}\) is the second block (block == 2) and \(\text{B3}\) is the third block (block == 3):

\[ y_{ijt} - \bar{y}_{ij} = \beta_1 (\text{H}_{ijt} - \bar{\text{H}}_{ij}) + \beta_2 (\text{B2}_{ijt} - \bar{\text{B2}}_{ij}) + \beta_3 (\text{B3}_{ijt} - \bar{\text{B3}}_{ij}) + (\varepsilon_{ijt} - \bar{\varepsilon}_{ij}) \]

# Demean each regressor within messenger, using only participating messengers
fahrer_group_data <- total_rev_data %>% 
  filter(maxhigh == 1) %>% 
  group_by(fahrer) %>% 
  mutate(high_fe = high - mean(high), 
         b2_fe   = block2 - mean(block2), 
         b3_fe   = block3 - mean(block3))

# Regress demeaned revenue on the demeaned regressors
m1 <- lm(totrev_fe ~ high_fe + b2_fe + b3_fe, data = fahrer_group_data)

summary(m1)
## 
## Call:
## lm(formula = totrev_fe ~ high_fe + b2_fe + b3_fe, data = fahrer_group_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4376.3  -656.5    46.9   831.3  3355.6 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -2.332e-13  1.123e+02   0.000 1.000000    
## high_fe      1.034e+03  2.732e+02   3.783 0.000243 ***
## b2_fe       -2.110e+02  3.126e+02  -0.675 0.501078    
## b3_fe       -5.747e+02  3.069e+02  -1.873 0.063545 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1251 on 120 degrees of freedom
## Multiple R-squared:  0.1232, Adjusted R-squared:  0.1013 
## F-statistic: 5.622 on 3 and 120 DF,  p-value: 0.00122

[Q17] Now recreate the same point estimates using lm and assign it to object m2. You are estimating the model below, where \(\text{F}_i\) is the dummy variable for each messenger (fahrer). Make sure to cluster the standard errors at the messenger level. (Use lmtest and sandwich for this.)

\[ y_{ijt} = \beta_0 + \beta_1 \text{H}_{ijt} + \beta_2 \text{B2}_{ijt} + \beta_3 \text{B3}_{ijt} + \sum_{i=1}^{n} \alpha_i \text{F}_i + \varepsilon_{ijt} \]

# Same sample as m1: participating messengers only, with messenger dummies
m2pre <- lm(totrev ~ high + block2 + block3 + factor(fahrer), 
            data = total_rev_data %>% filter(maxhigh == 1))

library(lmtest)
library(sandwich)

# Recompute the tests using a variance matrix clustered by messenger
m2 <- coeftest(m2pre, vcov = vcovCL, cluster = ~fahrer)

m2
## 
## t test of coefficients:
## 
##                     Estimate  Std. Error     t value  Pr(>|t|)    
## (Intercept)       1.6117e+03  2.9297e+02  5.5012e+00 4.509e-07 ***
## high              1.0336e+03  3.2685e+02  3.1621e+00  0.002222 ** 
## block2           -2.1097e+02  4.9725e+02 -4.2430e-01  0.672516    
## block3           -5.7471e+02  5.4568e+02 -1.0532e+00  0.295454    
## factor(fahrer)2   4.3670e+02  6.1699e-12  7.0779e+13 < 2.2e-16 ***
## factor(fahrer)3   7.4720e+02  5.1180e-12  1.4599e+14 < 2.2e-16 ***
## factor(fahrer)5  -1.2704e+03  1.4648e+02 -8.6727e+00 4.240e-13 ***
## factor(fahrer)6   1.5375e+03  5.7540e-12  2.6721e+14 < 2.2e-16 ***
## factor(fahrer)7   7.5017e+02  5.2566e-12  1.4271e+14 < 2.2e-16 ***
## factor(fahrer)8  -1.0697e+02  5.1039e-12 -2.0958e+13 < 2.2e-16 ***
## factor(fahrer)9   3.9122e+03  5.2902e-12  7.3952e+14 < 2.2e-16 ***
## factor(fahrer)14  8.8400e+02  5.3399e-12  1.6555e+14 < 2.2e-16 ***
## factor(fahrer)15 -3.9967e+01  5.2990e-12 -7.5424e+12 < 2.2e-16 ***
## factor(fahrer)18  1.3274e+03  5.1262e-12  2.5894e+14 < 2.2e-16 ***
## factor(fahrer)19  2.6590e+03  1.1817e-11  2.2502e+14 < 2.2e-16 ***
## factor(fahrer)21 -5.1413e+02  6.2731e-12 -8.1958e+13 < 2.2e-16 ***
## factor(fahrer)22  2.9556e+03  5.1183e-12  5.7745e+14 < 2.2e-16 ***
## factor(fahrer)23  3.3302e+03  5.1318e-12  6.4894e+14 < 2.2e-16 ***
## factor(fahrer)24  1.7592e+03  5.6287e-12  3.1253e+14 < 2.2e-16 ***
## factor(fahrer)25  6.1167e+01  5.6503e-12  1.0825e+13 < 2.2e-16 ***
## factor(fahrer)28 -1.5017e+02  5.1918e-12 -2.8924e+13 < 2.2e-16 ***
## factor(fahrer)30  1.5575e+03  5.1325e-12  3.0347e+14 < 2.2e-16 ***
## factor(fahrer)31 -7.3797e+02  5.1419e-12 -1.4352e+14 < 2.2e-16 ***
## factor(fahrer)32 -5.2610e+02  5.1484e-12 -1.0219e+14 < 2.2e-16 ***
## factor(fahrer)33  2.1732e+03  5.1624e-12  4.2097e+14 < 2.2e-16 ***
## factor(fahrer)34  9.1070e+03  4.9957e-12  1.8230e+15 < 2.2e-16 ***
## factor(fahrer)35 -6.4000e+01  5.1438e-12 -1.2442e+13 < 2.2e-16 ***
## factor(fahrer)36  2.0549e+03  5.2160e-12  3.9396e+14 < 2.2e-16 ***
## factor(fahrer)37  2.1940e+03  5.1583e-12  4.2533e+14 < 2.2e-16 ***
## factor(fahrer)38  1.8369e+03  5.1612e-12  3.5591e+14 < 2.2e-16 ***
## factor(fahrer)42  3.9380e+03  5.4985e-12  7.1620e+14 < 2.2e-16 ***
## factor(fahrer)44  5.2187e+02  5.1491e-12  1.0135e+14 < 2.2e-16 ***
## factor(fahrer)45  5.6977e+02  5.3733e-12  1.0604e+14 < 2.2e-16 ***
## factor(fahrer)49  1.0442e+03  6.2645e-12  1.6669e+14 < 2.2e-16 ***
## factor(fahrer)50  6.5907e+03  5.1539e-12  1.2788e+15 < 2.2e-16 ***
## factor(fahrer)51  5.2890e+03  5.1441e-12  1.0282e+15 < 2.2e-16 ***
## factor(fahrer)52  4.2783e+03  5.1499e-12  8.3075e+14 < 2.2e-16 ***
## factor(fahrer)53  3.4330e+03  5.7537e-12  5.9667e+14 < 2.2e-16 ***
## factor(fahrer)55  9.7627e+02  6.1927e-12  1.5765e+14 < 2.2e-16 ***
## factor(fahrer)56  1.7618e+03  5.6151e-12  3.1376e+14 < 2.2e-16 ***
## factor(fahrer)57  2.4715e+03  5.1982e-12  4.7546e+14 < 2.2e-16 ***
## factor(fahrer)58  1.3532e+03  5.1264e-12  2.6397e+14 < 2.2e-16 ***
## factor(fahrer)60  9.3103e+02  5.1815e-12  1.7968e+14 < 2.2e-16 ***
## factor(fahrer)61  6.6470e+02  5.1447e-12  1.2920e+14 < 2.2e-16 ***
## factor(fahrer)63 -2.6111e+02  1.4648e+02 -1.7825e+00  0.078504 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

[Q18] Now use feols to recreate Model (1), including the standard errors. Assign your estimates to the object m3. You are estimating the model below, where \(\alpha_i\) is the individual intercept (i.e., the individual fixed effect):

\[ y_{ijt} = \alpha_i + \beta_1 \text{H}_{ijt} + \beta_2 \text{B2}_{ijt} + \beta_3 \text{B3}_{ijt} + \varepsilon_{ijt} \]

# Again restrict to participating messengers so m3 matches m1 and m2
participants <- total_rev_data %>% 
  filter(maxhigh == 1)
participants
## # A tibble: 124 × 17
## # Groups:   fahrer [42]
##    vebli block totrev shifts  high firstblock block1 block2 block3 maxhigh other
##    <dbl> <dbl>  <dbl>  <dbl> <dbl>      <dbl>  <dbl>  <dbl>  <dbl>   <dbl> <dbl>
##  1     1     1  3091.     13     0      14836      1      0      0       1     0
##  2     1     2  1210       5     1      14864      0      1      0       1     0
##  3     1     3   782       3     0      14913      0      0      1       1     0
##  4     1     1  2308.      8     0      14836      1      0      0       1     0
##  5     1     2  2148       6     1      14864      0      1      0       1     0
##  6     1     3  1936.      6     0      14913      0      0      1       1     0
##  7     1     1  3552.      9     0      14836      1      0      0       1     0
##  8     1     2  1889       4     0      14864      0      1      0       1     0
##  9     1     3  1883       5     1      14913      0      0      1       1     0
## 10     1     2   930.      4     1      14864      0      1      0       1     0
## # … with 114 more rows, and 6 more variables: group <chr>, experiment <dbl>,
## #   odd <dbl>, fahrer <dbl>, mean_total_rev <dbl>, totrev_fe <dbl>

# feols absorbs the messenger fixed effects given after "|" and, by default,
# clusters the standard errors on the fixed-effect variable (fahrer)
m3 <- feols(totrev ~ high + block2 + block3 | fahrer, data = participants)

summary(m3)

[Q19] Compare the estimates in m1, m2 and m3. What is the same? What is different? What would you say is the main advantage of using feols()?

The point estimates are identical across the three approaches (for example, the coefficient on the wage increase is about 1,034 in each): demeaning by hand (m1), including a full set of messenger dummies (m2), and absorbing the fixed effects in feols (m3) are algebraically equivalent ways of controlling for individual fixed effects. What differs is the standard errors. m1 ignores the within-messenger dependence entirely, m2 clusters by messenger through vcovCL, and m3 clusters by messenger automatically, though fixest applies a slightly different small-sample correction, so its clustered standard errors need not match m2's exactly. The main advantage of feols() is convenience: you don't have to demean by hand or carry around dozens of nuisance dummy coefficients, and the standard errors are clustered on the fixed-effect variable by default.
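To check the point-estimate claim directly, the slope coefficients can be lined up side by side (a quick sketch using the objects estimated above; the names differ because m1 uses the demeaned regressors):

# Pull the matching slope coefficients from each model into one table
cbind(
  m1 = coef(m1)[c("high_fe", "b2_fe", "b3_fe")],
  m2 = coef(m2pre)[c("high", "block2", "block3")],
  m3 = coef(m3)[c("high", "block2", "block3")]
)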

Take it easy on me; this was a difficult one, and we just kicked off planning season at work =)

[Q20] Explain why you need to cluster the standard errors.

We cluster the standard errors because observations for the same messenger are not independent: each messenger appears in multiple blocks, and the unobserved shocks to their revenues are correlated across those blocks. Treating the observations as independent and identically distributed would understate the standard errors and overstate the precision of the estimates. Clustering at the messenger level allows arbitrary correlation of the errors within each messenger while assuming independence across messengers.
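As a quick sketch of why this matters, we can compare the conventional and clustered standard errors on the treatment dummy from m2pre (vcovCL is the clustered estimator from sandwich used above):

# Conventional (iid) vs. messenger-clustered standard error for `high`
se_iid       <- sqrt(diag(vcov(m2pre)))["high"]
se_clustered <- sqrt(diag(vcovCL(m2pre, cluster = ~fahrer)))["high"]
c(iid = unname(se_iid), clustered = unname(se_clustered))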