In this lab, we’ll look at household movement within and into the five boroughs of New York City between 2013 and 2014. As always, we’ll start by loading our libraries.
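One way the library-loading step might look is sketched below. It assumes the acs, stringr, and dplyr packages are already installed, since those are the packages whose functions appear in this lab’s code. The API key shown in the comment is a placeholder, not a real key.

```r
# Packages used in this lab (assumed to be installed already).
library(acs)      # geo.make(), acs.fetch(), estimate(), acs.colnames()
library(stringr)  # str_sub(), str_trim(), str_replace_all()
library(dplyr)    # select()

# acs.fetch() needs a Census API key. Replace the placeholder with your own
# key and run this line once.
# api.key.install(key = "YOUR_CENSUS_API_KEY")
```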

For this lab, we’ll use American Community Survey (ACS) data. In the 1990s, to help keep Census response rates near 100%, the Census Bureau decided to shorten the Census and drop the Long Form that was administered to a portion of the population, having found that longer questionnaires yielded lower response rates. In its place, the Bureau created a new survey, the American Community Survey. The ACS is administered to 1% of the US population each year and asks a much more detailed and lengthy series of questions than the Census, which allows the Census itself to stay very brief. Here, we will look at responses to the 2014 ACS questions about household movement within the five boroughs of New York City.

The code below creates geographies for our pull from the ACS, and then imports data from a specific table already created by the Census Bureau. Enter 2014 for endyear to tell acs.fetch() which year’s data we want.

bronx <- geo.make(county = "Bronx County", state = "NY")
manhattan <- geo.make(county = "New York County", state = "NY")
queens <- geo.make(county = "Queens County", state = "NY")
brooklyn <- geo.make(county = "Kings County", state = "NY")
statenisland <- geo.make(county = "Richmond County", state = "NY")
nyc_counties <- bronx + manhattan + queens + brooklyn + statenisland

migration <- acs.fetch(geography = nyc_counties, table.number = "B07003", endyear = 2014, col.names = "pretty")

The column names in the table are extremely long, so this code shortens them by dropping the first 81 characters of each name. Then we flip the rows and columns of the table to make it easier to read.

migration.names <- acs.colnames(migration)
new.names <- stringr::str_sub(migration.names, start = 82)
acs.colnames(migration) <- new.names

estimates <- as.data.frame(estimate(migration))

estimatest <- t(estimates)

estimatest
                                                       Bronx County, New York  New York County, New York  Queens County, New York  Kings County, New York  Richmond County, New York
Total:                                                                1393802                    1600370                  2253840                 2534064                     466317
Male                                                                   654198                     753692                  1091228                 1197613                     225809
Female                                                                 739604                     846678                  1162612                 1336451                     240508
Same house 1 year ago:                                                1233672                    1338628                  2034932                 2294179                     439584
Same house 1 year ago: Male                                            572558                     629956                   983286                 1081083                     212229
Same house 1 year ago: Female                                          661114                     708672                  1051646                 1213096                     227355
Moved within same county:                                               96316                     122220                   131218                  147184                      16226
Moved within same county: Male                                          47355                      59340                    63692                   70764                       8026
Moved within same county: Female                                        48961                      62880                    67526                   76420                       8200
Moved from different county within same state:                          36972                      44328                    43493                   40441                       6358
Moved from different county within same state: Male                     21018                      21334                    22668                   19955                       3317
Moved from different county within same state: Female                   15954                      22994                    20825                   20486                       3041
Moved from different state:                                             12143                      63513                    19414                   28869                       2846
Moved from different state: Male                                         6236                      28193                     9674                   14424                       1510
Moved from different state: Female                                       5907                      35320                     9740                   14445                       1336
Moved from abroad:                                                      14699                      31681                    24783                   23391                       1303
Moved from abroad: Male                                                  7031                      14869                    11908                   11387                        727
Moved from abroad: Female                                                7668                      16812                    12875                   12004                        576
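To see what the shortening and flipping steps do, here is a small sketch on made-up inputs. The label and the toy counts are hypothetical; real ACS column names are far longer, which is why the lab’s code uses start = 82 rather than the start = 9 used here.

```r
library(stringr)

# Hypothetical label: an 8-character prefix followed by the part we want.
long_label <- "Prefix: Moved from abroad:"

# Keep everything from character 9 onward, dropping the first 8 characters.
short_label <- str_sub(long_label, start = 9)

# A toy data frame with counties as rows, the same orientation that
# estimate() gives us.
toy <- data.frame(
  Total = c(10, 20),
  Male = c(4, 9),
  row.names = c("County A", "County B")
)

# t() swaps rows and columns: categories become rows, counties become columns.
toy_t <- t(toy)
toy_t
```

Note that t() returns a matrix rather than a data frame, which is fine for printing a table.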

Next, since we’ll want to work with the data, we change how the name of each variable is formatted: spaces become periods and colons are removed, so the names are easy to type after the $ operator.

# Trim whitespace and replace spaces with periods, then drop the colons.
new.names <- stringr::str_replace_all(stringr::str_trim(new.names), " ", ".")
new.names <- stringr::str_replace_all(new.names, ":", "")
colnames(estimates) <- new.names
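As a quick check of what this cleanup does, the sketch below applies the same two replacements to two labels copied from the count table. The object names labels and clean are our own and are not part of the lab template.

```r
library(stringr)

# Two labels as they appear in the count table.
labels <- c("Total:", "Same house 1 year ago: Male")

# Trim stray whitespace, turn spaces into periods, then drop the colons.
clean <- str_replace_all(str_trim(labels), " ", ".")
clean <- str_replace_all(clean, ":", "")

clean
```

The cleaned names are Total and Same.house.1.year.ago.Male, which is the form used after estimates$ in the rest of the lab.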

The ACS provides us with counts, or the estimated number of people that fall within each category. We want to look at percentages so we can view the relative proportions within each category. The code below calculates them: each category total is expressed as a percentage of Total, and each male or female count is expressed as a percentage of its own category.

estimates$percent_total <- round(100*estimates$Total/estimates$Total)

estimates$percent_male <- round(100*estimates$Male/estimates$Total)
estimates$percent_female <- round(100*estimates$Female/estimates$Total)

estimates$percent_same_house <- round(100*estimates$Same.house.1.year.ago/estimates$Total)
estimates$percent_same_house_male <- round(100*estimates$Same.house.1.year.ago.Male/estimates$Same.house.1.year.ago)
estimates$percent_same_house_female <- round(100*estimates$Same.house.1.year.ago.Female/estimates$Same.house.1.year.ago)

estimates$percent_moved_within_same_county <- round(100*estimates$Moved.within.same.county/estimates$Total)
estimates$percent_moved_within_same_county_male <- round(100*estimates$Moved.within.same.county.Male/estimates$Moved.within.same.county)
estimates$percent_moved_within_same_county_female <- round(100*estimates$Moved.within.same.county.Female/estimates$Moved.within.same.county)

estimates$percent_moved_from_different_county_within_same_state <- round(100*estimates$Moved.from.different.county.within.same.state/estimates$Total)
estimates$percent_moved_from_different_county_within_same_state_male <- round(100*estimates$Moved.from.different.county.within.same.state.Male/estimates$Moved.from.different.county.within.same.state)
estimates$percent_moved_from_different_county_within_same_state_female <- round(100*estimates$Moved.from.different.county.within.same.state.Female/estimates$Moved.from.different.county.within.same.state)

estimates$percent_moved_from_different_state <- round(100*estimates$Moved.from.different.state/estimates$Total)
estimates$percent_moved_from_different_state_male <- round(100*estimates$Moved.from.different.state.Male/estimates$Moved.from.different.state)
estimates$percent_moved_from_different_state_female <- round(100*estimates$Moved.from.different.state.Female/estimates$Moved.from.different.state)

estimates$percent_moved_from_abroad <- round(100*estimates$Moved.from.abroad/estimates$Total)
estimates$percent_moved_from_abroad_male <- round(100*estimates$Moved.from.abroad.Male/estimates$Moved.from.abroad)
estimates$percent_moved_from_abroad_female <- round(100*estimates$Moved.from.abroad.Female/estimates$Moved.from.abroad)
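Every line above follows the same pattern: 100 times a count, divided by a denominator, then rounded. A minimal sketch of that pattern as a small helper is shown here, using the Bronx counts from the count table. The helper name pct and the bronx_ variables are our own and are not part of the lab template.

```r
# Hypothetical helper: `count` as a percentage of `denominator`, rounded to
# the nearest whole number.
pct <- function(count, denominator) {
  round(100 * count / denominator)
}

# Bronx counts copied from the count table.
bronx_total <- 1393802
bronx_moved_within <- 96316
bronx_moved_within_male <- 47355

# Category total as a percentage of Total.
pct(bronx_moved_within, bronx_total)

# Male count as a percentage of its own category.
pct(bronx_moved_within_male, bronx_moved_within)
```

These two calls give 7 and 49. The difference in denominators matters when reading the results: the first is a share of the whole Total, while the second is a share of just the people who moved within the county.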

Now we want to create a separate table with just the percentages. The code below separates the percentage variables from the original count variables. After that, we’ll need to flip the rows and columns again using the t() command. Go ahead and name the datasets: you’ll need one name for the dataset that selects just the variables in percent format, and a second name for the transposed version.

percenttable <- dplyr::select(estimates, percent_total, percent_male, percent_female, percent_same_house, percent_same_house_male, percent_same_house_female, percent_moved_within_same_county, percent_moved_within_same_county_male, percent_moved_within_same_county_female, percent_moved_from_different_county_within_same_state, percent_moved_from_different_county_within_same_state_male, percent_moved_from_different_county_within_same_state_female, percent_moved_from_different_state, percent_moved_from_different_state_male, percent_moved_from_different_state_female, percent_moved_from_abroad, percent_moved_from_abroad_male, percent_moved_from_abroad_female)

tpercenttable <- t(percenttable)

tpercenttable
                                                              Bronx County, New York  New York County, New York  Queens County, New York  Kings County, New York  Richmond County, New York
percent_total                                                                    100                        100                      100                     100                        100
percent_male                                                                      47                         47                       48                      47                         48
percent_female                                                                    53                         53                       52                      53                         52
percent_same_house                                                                89                         84                       90                      91                         94
percent_same_house_male                                                           46                         47                       48                      47                         48
percent_same_house_female                                                         54                         53                       52                      53                         52
percent_moved_within_same_county                                                   7                          8                        6                       6                          3
percent_moved_within_same_county_male                                             49                         49                       49                      48                         49
percent_moved_within_same_county_female                                           51                         51                       51                      52                         51
percent_moved_from_different_county_within_same_state                              3                          3                        2                       2                          1
percent_moved_from_different_county_within_same_state_male                        57                         48                       52                      49                         52
percent_moved_from_different_county_within_same_state_female                      43                         52                       48                      51                         48
percent_moved_from_different_state                                                 1                          4                        1                       1                          1
percent_moved_from_different_state_male                                           51                         44                       50                      50                         53
percent_moved_from_different_state_female                                         49                         56                       50                      50                         47
percent_moved_from_abroad                                                          1                          2                        1                       1                          0
percent_moved_from_abroad_male                                                    48                         47                       48                      49                         56
percent_moved_from_abroad_female                                                  52                         53                       52                      51                         44
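Because every percentage variable starts with percent_, the long select() call could probably be written more compactly with a dplyr selection helper. The sketch below shows the idea on a small made-up data frame; the toy object and its numbers are illustrative and are not ACS values.

```r
library(dplyr)

# Toy stand-in for `estimates`: two count columns and two percent columns.
toy <- data.frame(
  Total = c(200, 400),
  Male = c(90, 200),
  percent_total = c(100, 100),
  percent_male = c(45, 50),
  row.names = c("County A", "County B")
)

# starts_with() keeps every column whose name begins with "percent_".
toy_percent <- select(toy, starts_with("percent_"))

# Transpose so the percent variables become rows.
t(toy_percent)
```

Listing each variable by name, as the lab does, is longer but makes the order of the rows explicit, so either approach is reasonable.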

Go ahead and knit this template to PDF. The main results you’ll see are two tables, the first with counts, and the second with percentages. We will interpret these tables in the lab in Blackboard.