State your research question, a description of the variables you’ll use, and your data sources (please include website links if possible).
Our research question is: “How did Alabama voter support for Donald Trump in the 2016 presidential election correlate with and inform voter support for Roy Moore in the 2017 Senate election, and how did that relationship differ between Black Belt and non-Black Belt counties?” Our data come from research cited in the Atlantic article “African American Voters Made Doug Jones a US Senator in Alabama”. Our outcome variable is the percentage of voters in each county who voted for Roy Moore in the 2017 Senate election. Our numerical explanatory variable is the percentage of voters in each county who voted for Donald Trump in the 2016 presidential election. Our categorical explanatory variable is whether a county is part of the “Black Belt”. The Black Belt is a historically Black region of the South, and a county's membership in it will serve as a proxy variable for race. We used the map from the Center for Business and Economic Research at the University of Alabama to determine which counties are traditionally considered part of the Black Belt.
Use the clean_names() function from the janitor package, then select() only the variables you are going to use. Example:
| Region | County | Candidate | Percent Votes Received |
|---|---|---|---|
| NotBB | Autauga | Trump | 0.7277 |
| NotBB | Autauga | Moore | 0.5990 |
| NotBB | Baldwin | Trump | 0.7655 |
| NotBB | Baldwin | Moore | 0.6170 |
| BB | Barbour | Trump | 0.5210 |
| BB | Barbour | Moore | 0.4200 |
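
The cleaning step can be sketched in R as follows. This is a minimal illustration, not our full pipeline: the object names `votes_raw` and `votes` are hypothetical, and the data frame is typed in by hand from the six example rows above rather than read from the real source file. Note that the values are stored as proportions (0 to 1), not percentages.

```r
library(dplyr)    # select(), %>%, and tibble() (re-exported by dplyr)
library(janitor)  # clean_names()

# Assumption: a hand-typed copy of the six example rows shown above,
# using the same column headers as the table.
votes_raw <- tibble(
  Region = c("NotBB", "NotBB", "NotBB", "NotBB", "BB", "BB"),
  County = c("Autauga", "Autauga", "Baldwin", "Baldwin", "Barbour", "Barbour"),
  Candidate = c("Trump", "Moore", "Trump", "Moore", "Trump", "Moore"),
  `Percent Votes Received` = c(0.7277, 0.5990, 0.7655, 0.6170, 0.5210, 0.4200)
)

# clean_names() converts the headers to snake_case,
# e.g. "Percent Votes Received" becomes "percent_votes_received".
# select() then keeps only the variables we plan to use.
votes <- votes_raw %>%
  clean_names() %>%
  select(region, county, candidate, percent_votes_received)
```

With the real data, `votes_raw` would instead come from reading the source file, and `select()` would drop any extra columns that file contains.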
Create “exploratory data analysis” visualizations of your data. These are preliminary and can change before the submission; the only requirement is that your visualizations use each of the measurement variables in your dataset, so you can test whether they work.
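
One possible preliminary visualization is sketched below. It assumes the column names have already been cleaned to snake_case and uses only the six example rows from the table above, typed in by hand. The names `votes_wide`, `trump_2016`, `moore_2017`, and `eda_plot` are hypothetical. Because the example table stores one row per county and candidate, the sketch first reshapes it to one row per county so that Trump's 2016 share and Moore's 2017 share can go on separate axes.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Assumption: a hand-typed copy of the six example rows, already cleaned.
votes <- tibble(
  region = c("NotBB", "NotBB", "NotBB", "NotBB", "BB", "BB"),
  county = c("Autauga", "Autauga", "Baldwin", "Baldwin", "Barbour", "Barbour"),
  candidate = c("Trump", "Moore", "Trump", "Moore", "Trump", "Moore"),
  percent_votes_received = c(0.7277, 0.5990, 0.7655, 0.6170, 0.5210, 0.4200)
)

# Reshape from long (one row per county and candidate)
# to wide (one row per county, one column per candidate).
votes_wide <- votes %>%
  pivot_wider(names_from = candidate, values_from = percent_votes_received) %>%
  rename(trump_2016 = Trump, moore_2017 = Moore)

# Scatterplot using all three measurement variables:
# numerical explanatory (x), outcome (y), categorical explanatory (color).
eda_plot <- ggplot(votes_wide,
                   aes(x = trump_2016, y = moore_2017, color = region)) +
  geom_point() +
  labs(
    x = "Share of county vote for Trump (2016)",
    y = "Share of county vote for Moore (2017)",
    color = "Region"
  )

eda_plot
```

Coloring the points by region lets a single plot show whether the Trump-to-Moore relationship looks different in Black Belt and non-Black Belt counties. With only three example counties the plot is a check that the variables work, not evidence of a pattern.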