Discrimination

Author

David Kane

Background Information

Bertrand, Marianne and Sendhil Mullainathan. 2004. “Are Emily and Greg More Employable Than Lakisha and Jamal? A Field Experiment on Labor Market Discrimination.” American Economic Review, 94 (4): 991–1013. Data cleaned and discussed in “Quantitative Social Science: An Introduction” by Kosuke Imai.

“We study race in the labor market by sending fictitious resumes to help-wanted ads in Boston and Chicago newspapers. To manipulate perceived race, resumes are randomly assigned African-American- or White-sounding names. White names receive 50 percent more callbacks for interviews. Callbacks are also more respon- sive to resume quality for White names than for African-American ones. The racial gap is uniform across occupation, industry, and employer size. We also find little evidence that employers are inferring social class from the names. Differential treatment by race still appears to still be prominent in the U.S. labor market.”

Discussion

Subject: Discrimination

Broad question: Race and employment in the United States

Specific question: Effect of racially-coded names on call-backs in 2020’s for adults in Boston and Chicago applying for jobs.

Preceptor Table:

Units: individuals (or resumes ?)
(Potential) Outcomes: call-back if black name, call-back if white name
Causal/predictive: a causal model
Covariates: sex, zip code (not shown in table)
Treatment: black name on resume versus white name on resume

Example Preceptor Table

Preceptor Table
ID	Potential Outcomes		Covariate
ID	Callback if White Name	Callback if Black Name	Treatment
1	Yes*	Yes	Black
2	No	Yes*	White
…	…	…	…
10	No*	No	Black
11	Yes*	Yes	Black
…	…	…	…
N	Yes	No*	White

Questions for Second Class

What are the units, precisely?

All the people sending out resumes for entry-level positions in Boston and Chicago during 2020 – 2030 who have names which are associated with a race. So, N is (?) in the thousands.

What is the moment in time, precisely?

2020 to 2030

What is one reason why validity might not hold?

There is less of a strong connection between names and race today then there was in 2001. The column “resume with black name” in the data does not mean the same thing as the column “resume with black name” means in the Preceptor Table. Just because two columns have the same name does not mean that they are really the same thing.

Describe the Population Table in words

Every time a person with a racially-coded name sends out a resume in Boston/Chicago between 2020 and 2030. That is, there will be multiple rows for the same person for two reasons. First, they will often apply for more than one job. Second, because they might apply for jobs in both 2022 and in, separately (and with a new resume), 2026.

What is one reason why stability might not hold?

Stability means that the relationship between the columns in the Population Table is the same for three categories of rows: the data, the Preceptor Table, and the larger population from which both are drawn. If racism has decrease significantly since 2001, then the relationship between the resume-with-black-name column and the callback outcome will not be as strong now as it was then.

What is one reason why representativeness might not hold?

Representativeness, or the lack thereof, concerns two relationship, among the rows in the Population Table. The first is between the Preceptor Table and the other rows. The second is between our data and the other rows. The original data from 2001 was not even real. They did not use real people. They were not sampling from the set of all job seekers. There is no reason to believe that the sorts of made-up-resumes they used are representative of the sorts of people/resumes applying for jobs now.

What is one reason why unconfoundedness might not hold?

Unconfoundedness means that the treatment assignment is independent of the potential outcomes, when we condition on pre-treatment covariates. A model is confounded if this is not true. Given that they randomly assigned names to resumes, unconfoundedness should be true. But what if the assignment process was not really random? They provide no details on the process. Is it that hard to believe that poorly-paid research assistants might not have bothered to really randomize the assignment of names to resumes?

Describe in words what the Preceptor Table would look like if we were trying to, instead, predict whether or not someone would get a callback?

The Preceptor Table would look like this. When we predict, we only have one outcome column. We don’t care about the counterfactual. We don’t care what would have happened if, for example, a white name had been used in resume 1. All we care about is a model which can predict, given information about treatment and other variables, what will happen to a given resume.

Preceptor Table
ID	Outcome	Covariate
ID	Callback	Treatment
1	Yes	Black
2	No	White
…	…	…
10	No	Black
11	Yes	Black
…	…	…
N	Yes	White

Trying to compile a simple C file

Running /Library/Frameworks/R.framework/Resources/bin/R CMD SHLIB foo.c
using C compiler: ‘Apple clang version 15.0.0 (clang-1500.3.9.4)’
using SDK: ‘’
clang -arch arm64 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I"/Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library/Rcpp/include/"  -I"/Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library/RcppEigen/include/"  -I"/Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library/RcppEigen/include/unsupported"  -I"/Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library/BH/include" -I"/Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library/StanHeaders/include/src/"  -I"/Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library/StanHeaders/include/"  -I"/Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library/RcppParallel/include/"  -I"/Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library/rstan/include" -DEIGEN_NO_DEBUG  -DBOOST_DISABLE_ASSERTS  -DBOOST_PENDING_INTEGER_LOG2_HPP  -DSTAN_THREADS  -DUSE_STANC3 -DSTRICT_R_HEADERS  -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION  -D_HAS_AUTO_PTR_ETC=0  -include '/Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp'  -D_REENTRANT -DRCPP_PARALLEL_USE_TBB=1   -I/opt/R/arm64/include    -fPIC  -falign-functions=64 -Wall -g -O2  -c foo.c -o foo.o
In file included from <built-in>:1:
In file included from /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
In file included from /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library/RcppEigen/include/Eigen/Dense:1:
In file included from /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library/RcppEigen/include/Eigen/Core:19:
/Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:679:10: fatal error: 'cmath' file not found
#include <cmath>
         ^~~~~~~
1 error generated.
make: *** [foo.o] Error 1