In this lab, we’ll take a look at household movement within and into the five boroughs of New York City between 2013 and 2014. As always, we’ll start by loading our libraries.
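The library-loading chunk is not reproduced in this handout. The sketch below is one way to do it, assuming the three packages that the code later in this lab calls (acs, stringr, and dplyr). It checks which of them are installed before attaching them, so it reports a missing package instead of stopping with an error.

```r
# Packages used later in this lab (assumed from the function calls below).
pkgs <- c("acs", "stringr", "dplyr")

# TRUE/FALSE for each package: is it installed on this machine?
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Attach the packages that are available.
for (p in pkgs[installed]) {
  library(p, character.only = TRUE)
}

# Report anything that still needs install.packages().
if (any(!installed)) {
  message("Install before continuing: ", paste(pkgs[!installed], collapse = ", "))
}

# Assumption: acs.fetch() also needs a Census API key, installed once with
# api.key.install(key = "YOUR_KEY_HERE")  # placeholder, not a real key
```

If any package is reported as missing, install it with install.packages() and run the chunk again.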
For this lab, we’ll use American Community Survey (ACS) data. In the 1990s, the Census Bureau found that longer questionnaires yielded lower response rates. To help keep Census response rates near 100%, it decided to shorten the Census and to retire the Long Form Census, which had been administered to a portion of the population. In its place, the Bureau created a new survey, the American Community Survey. The ACS is administered to 1% of the US population each year and asks a much longer and more detailed series of questions than the Census, which lets the Census Bureau keep the Census itself very brief. Here, we will look at the 2014 ACS results for the questions about household movement within and into the five boroughs of New York City.
The code below builds the geographies for our ACS pull and then imports table B07003, a table already tabulated by the Census Bureau. Each borough is also a county, so we define the boroughs by their county names (Manhattan is New York County, Brooklyn is Kings County, and Staten Island is Richmond County) and add them together into a single geography. Enter 2014 for endyear to tell acs.fetch() which year’s data we want. Because no span is given, acs.fetch() falls back on its default 5-year span, so the estimates cover the five years ending in 2014.
bronx <- geo.make(county = "Bronx County", state = "NY")
manhattan <- geo.make(county = "New York County", state = "NY")
queens <- geo.make(county = "Queens County", state = "NY")
brooklyn <- geo.make(county = "Kings County", state = "NY")
statenisland <- geo.make(county = "Richmond County", state = "NY")
nyc_counties <- bronx + manhattan + queens + brooklyn + statenisland
migration <- acs.fetch(geography = nyc_counties, table.number = "B07003", endyear = 2014, col.names = "pretty")
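The column names in the table are extremely long because each one begins with the same lengthy prefix, so this code shortens them by keeping only the text from character 82 onward. Then, we transpose the table with t(), swapping its rows and columns, to make it easier to read.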
migration.names <- acs.colnames(migration)
new.names <- stringr::str_sub(migration.names, start = 82)
acs.colnames(migration) <- new.names
estimates <- as.data.frame(estimate(migration))
estimatest <- t(estimates)
estimatest
Category | Bronx County, New York | New York County, New York | Queens County, New York | Kings County, New York | Richmond County, New York |
---|---|---|---|---|---|
Total: | 1393802 | 1600370 | 2253840 | 2534064 | 466317 |
Male | 654198 | 753692 | 1091228 | 1197613 | 225809 |
Female | 739604 | 846678 | 1162612 | 1336451 | 240508 |
Same house 1 year ago: | 1233672 | 1338628 | 2034932 | 2294179 | 439584 |
Same house 1 year ago: Male | 572558 | 629956 | 983286 | 1081083 | 212229 |
Same house 1 year ago: Female | 661114 | 708672 | 1051646 | 1213096 | 227355 |
Moved within same county: | 96316 | 122220 | 131218 | 147184 | 16226 |
Moved within same county: Male | 47355 | 59340 | 63692 | 70764 | 8026 |
Moved within same county: Female | 48961 | 62880 | 67526 | 76420 | 8200 |
Moved from different county within same state: | 36972 | 44328 | 43493 | 40441 | 6358 |
Moved from different county within same state: Male | 21018 | 21334 | 22668 | 19955 | 3317 |
Moved from different county within same state: Female | 15954 | 22994 | 20825 | 20486 | 3041 |
Moved from different state: | 12143 | 63513 | 19414 | 28869 | 2846 |
Moved from different state: Male | 6236 | 28193 | 9674 | 14424 | 1510 |
Moved from different state: Female | 5907 | 35320 | 9740 | 14445 | 1336 |
Moved from abroad: | 14699 | 31681 | 24783 | 23391 | 1303 |
Moved from abroad: Male | 7031 | 14869 | 11908 | 11387 | 727 |
Moved from abroad: Female | 7668 | 16812 | 12875 | 12004 | 576 |
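To see what the shortening step does to a single name, here is a small sketch. The column name is made up: an 81-character stand-in prefix followed by one of the labels from the table above. It uses base R’s substring() in place of stringr::str_sub() so that it runs without any packages; both keep everything from the given start position onward.

```r
# Hypothetical long column name: 81 filler characters stand in for the
# prefix that comes before each label in the real column names.
prefix <- strrep("x", 81)
long.name <- paste0(prefix, " Same house 1 year ago: Male")

# Keep everything from character 82 to the end of the string.
short.name <- substring(long.name, 82)
short.name
# " Same house 1 year ago: Male"
```

In this made-up example the shortened name keeps a leading space, which is the kind of leftover that trimming whitespace removes.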
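Next, since we’ll want to work with the data, we change how the name of each variable is formatted. The code trims extra whitespace, replaces spaces with periods, and removes colons, so that each column can be referred to with the $ operator.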
new.names <- str_replace_all(str_trim(new.names), " ", ".")
new.names <- str_replace_all(str_trim(new.names), ":", "")
colnames(estimates) <- new.names
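As a quick check on what the renaming does, the sketch below applies the same three steps to a few of the labels from the counts table. It is an illustration only: it uses base R’s trimws() and gsub() as stand-ins for str_trim() and str_replace_all(), and the leading spaces in raw.names are an assumption about what the shortened names look like.

```r
# A few labels as they might look after shortening (leading space assumed).
raw.names <- c(" Total:", " Same house 1 year ago: Male", " Moved from abroad: Female")

# Trim whitespace, turn spaces into periods, then drop the colons.
clean.names <- gsub(" ", ".", trimws(raw.names), fixed = TRUE)
clean.names <- gsub(":", "", clean.names, fixed = TRUE)
clean.names
# "Total" "Same.house.1.year.ago.Male" "Moved.from.abroad.Female"
```

These cleaned names match the ones used after estimates$ in the percentage calculations.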
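The ACS provides us with counts, that is, the estimated number of people who fall within each category. We want to look at percentages so we can compare relative proportions across boroughs of very different sizes. The code below calculates two kinds of percentages: each mobility category as a share of the borough’s total, and the male and female shares within each category. Note that the male and female shares are divided by the category’s own count, not by the borough total. Each result is rounded to the nearest whole percent.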
estimates$percent_total <- round(100*estimates$Total/estimates$Total)
estimates$percent_male <- round(100*estimates$Male/estimates$Total)
estimates$percent_female <- round(100*estimates$Female/estimates$Total)
estimates$percent_same_house <- round(100*estimates$Same.house.1.year.ago/estimates$Total)
estimates$percent_same_house_male <- round(100*estimates$Same.house.1.year.ago.Male/estimates$Same.house.1.year.ago)
estimates$percent_same_house_female <- round(100*estimates$Same.house.1.year.ago.Female/estimates$Same.house.1.year.ago)
estimates$percent_moved_within_same_county <- round(100*estimates$Moved.within.same.county/estimates$Total)
estimates$percent_moved_within_same_county_male <- round(100*estimates$Moved.within.same.county.Male/estimates$Moved.within.same.county)
estimates$percent_moved_within_same_county_female <- round(100*estimates$Moved.within.same.county.Female/estimates$Moved.within.same.county)
estimates$percent_moved_from_different_county_within_same_state <- round(100*estimates$Moved.from.different.county.within.same.state/estimates$Total)
estimates$percent_moved_from_different_county_within_same_state_male <- round(100*estimates$Moved.from.different.county.within.same.state.Male/estimates$Moved.from.different.county.within.same.state)
estimates$percent_moved_from_different_county_within_same_state_female <- round(100*estimates$Moved.from.different.county.within.same.state.Female/estimates$Moved.from.different.county.within.same.state)
estimates$percent_moved_from_different_state <- round(100*estimates$Moved.from.different.state/estimates$Total)
estimates$percent_moved_from_different_state_male <- round(100*estimates$Moved.from.different.state.Male/estimates$Moved.from.different.state)
estimates$percent_moved_from_different_state_female <- round(100*estimates$Moved.from.different.state.Female/estimates$Moved.from.different.state)
estimates$percent_moved_from_abroad <- round(100*estimates$Moved.from.abroad/estimates$Total)
estimates$percent_moved_from_abroad_male <- round(100*estimates$Moved.from.abroad.Male/estimates$Moved.from.abroad)
estimates$percent_moved_from_abroad_female <- round(100*estimates$Moved.from.abroad.Female/estimates$Moved.from.abroad)
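Every line above follows the same pattern, round(100 * part / whole), and only the denominator changes. The sketch below shows the pattern on three columns copied from the counts table, using a small helper called pct() that is our own illustration and not part of the lab template. The shortened borough names are also just for display.

```r
# Counts copied from the table above (three columns only).
counts <- data.frame(
  Total = c(1393802, 1600370, 2253840, 2534064, 466317),
  Same.house.1.year.ago = c(1233672, 1338628, 2034932, 2294179, 439584),
  Same.house.1.year.ago.Male = c(572558, 629956, 983286, 1081083, 212229),
  row.names = c("Bronx", "Manhattan", "Queens", "Brooklyn", "Staten Island")
)

# Hypothetical helper: percentage of `whole` made up by `part`, rounded.
pct <- function(part, whole) round(100 * part / whole)

# Category share: divide by the borough total.
counts$percent_same_house <- pct(counts$Same.house.1.year.ago, counts$Total)

# Sex share within the category: divide by the category count.
counts$percent_same_house_male <- pct(counts$Same.house.1.year.ago.Male,
                                      counts$Same.house.1.year.ago)

counts[, c("percent_same_house", "percent_same_house_male")]
# percent_same_house:      89 84 90 91 94
# percent_same_house_male: 46 47 48 47 48
```

Because each value is rounded separately, the category percentages for a borough will not always add up to exactly 100.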
Now, we want to create a separate table with just the percentages. The code below separates the percentage variables from the original count variables. After selecting the variables we want in our new table, we’ll need to flip the rows and columns again using the t() command. Go ahead and name the datasets: you’ll need one name for the dataset that holds just the variables in percent format, and a second name for the transposed version.
percenttable <- dplyr::select(
  estimates,
  percent_total, percent_male, percent_female,
  percent_same_house, percent_same_house_male, percent_same_house_female,
  percent_moved_within_same_county, percent_moved_within_same_county_male, percent_moved_within_same_county_female,
  percent_moved_from_different_county_within_same_state, percent_moved_from_different_county_within_same_state_male, percent_moved_from_different_county_within_same_state_female,
  percent_moved_from_different_state, percent_moved_from_different_state_male, percent_moved_from_different_state_female,
  percent_moved_from_abroad, percent_moved_from_abroad_male, percent_moved_from_abroad_female
)
tpercenttable <- t(percenttable)
tpercenttable
Variable | Bronx County, New York | New York County, New York | Queens County, New York | Kings County, New York | Richmond County, New York |
---|---|---|---|---|---|
percent_total | 100 | 100 | 100 | 100 | 100 |
percent_male | 47 | 47 | 48 | 47 | 48 |
percent_female | 53 | 53 | 52 | 53 | 52 |
percent_same_house | 89 | 84 | 90 | 91 | 94 |
percent_same_house_male | 46 | 47 | 48 | 47 | 48 |
percent_same_house_female | 54 | 53 | 52 | 53 | 52 |
percent_moved_within_same_county | 7 | 8 | 6 | 6 | 3 |
percent_moved_within_same_county_male | 49 | 49 | 49 | 48 | 49 |
percent_moved_within_same_county_female | 51 | 51 | 51 | 52 | 51 |
percent_moved_from_different_county_within_same_state | 3 | 3 | 2 | 2 | 1 |
percent_moved_from_different_county_within_same_state_male | 57 | 48 | 52 | 49 | 52 |
percent_moved_from_different_county_within_same_state_female | 43 | 52 | 48 | 51 | 48 |
percent_moved_from_different_state | 1 | 4 | 1 | 1 | 1 |
percent_moved_from_different_state_male | 51 | 44 | 50 | 50 | 53 |
percent_moved_from_different_state_female | 49 | 56 | 50 | 50 | 47 |
percent_moved_from_abroad | 1 | 2 | 1 | 1 | 0 |
percent_moved_from_abroad_male | 48 | 47 | 48 | 49 | 56 |
percent_moved_from_abroad_female | 52 | 53 | 52 | 51 | 44 |
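Since every percentage column was given a name starting with percent_, the selection can also be done by prefix instead of by listing each column. The sketch below is one possible shortcut, not the method the lab template uses. It works on a two-borough toy data frame built from values in the counts table and uses only base R.

```r
# Toy data: two boroughs, two count columns copied from the counts table.
toy <- data.frame(
  Total = c(1393802, 1600370),
  Male = c(654198, 753692),
  row.names = c("Bronx County, New York", "New York County, New York")
)
toy$percent_total <- round(100 * toy$Total / toy$Total)
toy$percent_male <- round(100 * toy$Male / toy$Total)

# Keep only the columns whose names begin with "percent_".
percent.cols <- grepl("^percent_", names(toy))
toy.percent <- toy[, percent.cols, drop = FALSE]

# Transpose so boroughs become columns and variables become rows.
toy.transposed <- t(toy.percent)
toy.transposed
```

With dplyr loaded, dplyr::select(estimates, starts_with("percent_")) should pick out the same columns from the full dataset.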
Go ahead and knit this template to PDF. The main results you’ll see are two tables, the first with counts, and the second with percentages. We will interpret these tables in the lab in Blackboard.