Lecture:
- Housekeeping
- Types of quantitative data
- Data quality standards
Lab:
- Download R & RStudio
- Basics of RStudio
- What is an object?
- What is a function?
- What is a package?
January 14, 2025
Source: Donaldson, D., & Storeygard, A. (2016). The view from above: Applications of satellite data in economics. Journal of Economic Perspectives, 30(4), 171–198. https://doi.org/10.1257/jep.30.4.171
When we collect data or deal with off-the-shelf data, we can use the following criteria to evaluate data quality:
Data validity: do the scores of a variable accurately capture what that variable is said to represent or indicate?
“describes the extent to which data depicts the measures they claim to represent” (Brancati 2016, p.235)
e.g. prison overcrowding, voter turnout
Completeness: a dataset is complete if it (1) includes values for the whole universe of relevant cases and (2) includes observations for all of the relevant measures or variables in the data.
e.g. (1) Includes the universe of relevant cases: a survey of provincial voters shouldn’t exclude voters located in Manitoba (unless there is some theoretically driven reason to do so).
e.g. (2) Includes observations for all relevant measures or variables in the data: if the survey of provincial voters includes a variable that indicates respondents’ ages, it shouldn’t be missing the ages of the respondents in Manitoba while including the ages of all other voters.
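The two completeness checks can be sketched in base R. This is a minimal illustration, not a prescribed workflow: the data frame `survey`, its columns, the list `expected_provinces`, and every value in it are made up for the example.

```r
# Hypothetical survey data; all names and values are invented for illustration.
survey <- data.frame(
  respondent_id = 1:6,
  province = c("Ontario", "Quebec", "Manitoba", "Manitoba", "Alberta", "Ontario"),
  age = c(34, 51, NA, NA, 29, 62)
)

# Check (1): are any expected cases absent from the data altogether?
# expected_provinces is an assumed (and deliberately short) list of relevant cases.
expected_provinces <- c("Ontario", "Quebec", "Manitoba", "Alberta", "Saskatchewan")
missing_provinces <- setdiff(expected_provinces, unique(survey$province))

# Check (2): how many values are missing for each variable?
missing_per_variable <- colSums(is.na(survey))

# Share of missing ages within each province, to see whether
# missingness is concentrated in particular cases.
missing_age_by_province <- tapply(is.na(survey$age), survey$province, mean)

print(missing_provinces)
print(missing_per_variable)
print(missing_age_by_province)
```

In this made-up example, one expected province never appears in the data (a failure of check 1), and every missing age belongs to a respondent in Manitoba (a failure of check 2).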
Consistency: data consistency “refers to the absence of contradictions in the data” (Brancati 2016, p.238). For example, data are consistent when all cases are coded according to the same rules and collected using the same types of sources.
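A rough way to look for inconsistencies in base R is sketched here, under stated assumptions: the data frame `voters`, its columns, and its values are hypothetical, and the rule that a non-voter should have no recorded party choice is an assumed coding rule chosen only for illustration.

```r
# Hypothetical data; all names and values are invented for illustration.
voters <- data.frame(
  respondent_id = 1:5,
  province = c("Manitoba", "MB", "manitoba", "Ontario", "ON"),
  voted = c(TRUE, FALSE, TRUE, FALSE, TRUE),
  party_choice = c("A", "B", NA, NA, "C")
)

# Coding rules: list the distinct labels in use to spot the same
# category coded in more than one way (e.g. "Manitoba" vs "MB").
province_labels <- sort(unique(voters$province))

# Contradictions: respondents recorded as not voting who nonetheless
# have a party choice recorded (assumed rule, see above).
contradictions <- voters[!voters$voted & !is.na(voters$party_choice), ]

print(province_labels)
print(contradictions)
```

Here the two provinces appear under five different labels, and one respondent is recorded as a non-voter while also having a party choice; both would need to be resolved against the codebook before analysis.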
15-minute break and attendance sign-in
Return for the hands-on component of the class