Download the data in the additional data folder and save it in your part1/data/raw folder.
Who You Calling Hispanic
Use the read_excel() function to import Excel files:
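As a minimal sketch of what a read_excel() call can look like: the workbook below is the sample file that ships with the readxl package, used here only so the example runs anywhere. The object name mtcars_xl is made up for illustration, and for the assignment you would point the path at your own file in data/raw instead.

```r
library(readxl)

# readxl_example() returns the path to a sample workbook bundled with readxl;
# replace it with the path to your own Excel file
path <- readxl_example("datasets.xlsx")

# list the sheet names in the workbook
excel_sheets(path)

# read one sheet by name; sheet can also be a position, such as sheet = 2
mtcars_xl <- read_excel(path, sheet = "mtcars")

# check the column names and types that were imported
str(mtcars_xl)
```

If the data does not start on the first row of the sheet, read_excel() also has a skip argument for the number of rows to skip before reading.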
We’ll use the Import Dataset user interface to learn about read_excel().
In the Files pane, right-click the data you want to read in:
Sometimes it is useful to concatenate strings with the paste() function.
You can use paste() inside mutate(), for instance to add the word "County" after each county name in your asthma data.
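A small sketch of paste() inside mutate(), using a toy tibble in place of the real asthma data. The column name County and the three county names are assumptions made for illustration.

```r
library(dplyr)

# paste() joins strings with a space by default; paste0() uses no separator
paste("Albany", "County")    # "Albany County"
paste0("Albany", " County")  # "Albany County"

# toy stand-in for the asthma data; the column name County is an assumption
asthma <- tibble(County = c("Albany", "Bronx", "Erie"))

# overwrite the County column with the longer name
asthma_named <- asthma |>
  mutate(County = paste(County, "County"))
```

Either function works here; paste0() just makes the space explicit in the string you add.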
We’ll add 3 more datasets to our New York county poverty dataframe: ATM locations, lottery retailers, and asthma hospitalizations.
Use the filter() function to remove rows where Numerator = “s” before you add up the asthma hospitalizations for each county.
Open your part1 project and create a new script called ny_county_dataset.
If you haven’t already, download the data in the additional data folder and save it in your part1/data/raw folder.
Read in your processed county poverty dataset
For each of the three datasets:
Use summarise() and group_by() to aggregate the data to county level
Use left_join() to join each dataset to your New York poverty dataframe

# add atm, lottery and asthma data to ny county poverty dataframe
# load packages
library(tidyverse)

# import our county dataset with population and poverty
county_pov <- read_csv("data/processed/county_pov_rate_2019.csv")

# import atm location data
raw_atms <- read_csv("data/raw/Bank-Owned_ATM_Locations_in_New_York_State.csv")

# aggregate to county-level and create a common join key
atms_by_county <- raw_atms |>
  group_by(County) |>
  summarise(atms = n()) |>
  mutate(County = paste0(County, " County"))

# join atm data to county
county_pov_atms <- county_pov |>
  left_join(atms_by_county, by = c("COUNTY" = "County")) |>
  mutate(banks_per10000 = atms/county_pop*10000)
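The asthma data needs one extra step before aggregating, because suppressed values are stored as "s". The following is a rough sketch on a toy tibble, not the real file: only the Numerator column is named in the exercise, so the County column, the asthma_hosp name, and all values are assumptions.

```r
library(dplyr)

# toy stand-in for the raw asthma data; only Numerator is named in the
# exercise, the County column and the values are made up for illustration
raw_asthma <- tibble(
  County    = c("Albany", "Albany", "Bronx", "Bronx"),
  Numerator = c("12", "s", "40", "25")
)

asthma_by_county <- raw_asthma |>
  # drop suppressed rows so the remaining values can be converted to numbers
  filter(Numerator != "s") |>
  mutate(Numerator = as.numeric(Numerator)) |>
  # add up hospitalizations within each county
  group_by(County) |>
  summarise(asthma_hosp = sum(Numerator)) |>
  # match the "<name> County" format used as the join key for the ATM data
  mutate(County = paste0(County, " County"))
```

Filtering first matters because a column that contains "s" is read as text, and sum() does not work on text.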
Create one county-level dataframe with:
County name, County ID, population, poverty rate, ATMs, ATMs per 10k people, lottery retailers, lottery retailers per 10k people, asthma hospitalizations, asthma hospitalizations per 10k people
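One possible way to build the single dataframe is to chain several left_join() calls and compute the rates in one mutate(). This sketch uses toy tables with made-up values; COUNTY, county_pop, and atms follow the script above, while lottery_by_county, lottery, and the per10k column names are hypothetical.

```r
library(dplyr)

# toy county table; COUNTY and county_pop follow the script above, values made up
county_pov <- tibble(
  COUNTY     = c("Albany County", "Bronx County"),
  county_pop = c(300000, 1400000)
)

# toy aggregated tables, one row per county (names and values are assumptions)
atms_by_county    <- tibble(County = c("Albany County", "Bronx County"), atms = c(150, 280))
lottery_by_county <- tibble(County = c("Albany County", "Bronx County"), lottery = c(90, 700))

# join each aggregated table to the county table, then compute rates per 10k people
ny_county <- county_pov |>
  left_join(atms_by_county, by = c("COUNTY" = "County")) |>
  left_join(lottery_by_county, by = c("COUNTY" = "County")) |>
  mutate(
    atms_per10k    = atms / county_pop * 10000,
    lottery_per10k = lottery / county_pop * 10000
  )

# any rows returned here are counties with no match in the ATM table
unmatched <- anti_join(county_pov, atms_by_county, by = c("COUNTY" = "County"))
```

The anti_join() check is optional, but it is a quick way to spot join keys that do not line up, for example a county name spelled differently in one source.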
Explore your new dataframe. Follow your interest! Some things you could look at:
Upload your script for assignment 4a, with the answers to these 3 questions (or other questions) at the bottom of your script.
Next week we will use visualization to explore more.
If you finish exploring those datasets in class, add more data and keep exploring health in NY counties.
Go to the County Health Rankings and Roadmaps website to download data for New York.
Sum of Us