Internal Migration Lab

In this lab, we’ll take a look at household movement within and into the five boroughs of New York City between 2013 and 2014. As always, we’ll start by loading our libraries.
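The library chunk itself is not reproduced in this write-up. The sketch below shows one way it might look, assuming the three packages that the later code calls (acs, stringr, and dplyr). The check-before-attach loop is an illustrative choice, not necessarily what the template does.

```r
# Packages used later in this lab: acs (geo.make, acs.fetch),
# stringr (str_sub, str_trim, str_replace_all), and dplyr (select).
needed <- c("acs", "stringr", "dplyr")

# Check which packages are installed before trying to attach them.
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)

if (any(!installed)) {
  message("Install these first with install.packages(): ",
          paste(needed[!installed], collapse = ", "))
}

# Attach whatever is available.
for (pkg in needed[installed]) {
  library(pkg, character.only = TRUE)
}

# acs.fetch() also needs a Census API key, installed once per machine:
# api.key.install(key = "YOUR_KEY_HERE")   # placeholder, not a real key
```

If any package is reported as missing, install it and rerun the chunk before continuing.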

For this lab, we’ll use American Community Survey (ACS) data. In the 1990s, the Census Bureau found that longer questionnaires yielded lower response rates. To help keep response rates for the Census near 100%, it decided to shorten the Census and to retire the Long Form Census, which had been administered to only a portion of the population. In its place, the Census Bureau created a new survey, the American Community Survey. The ACS is administered to roughly 1% of the US population each year and asks a much longer and more detailed series of questions than the Census does, which allows the Census itself to stay very brief. Here, we will look at the responses to the 2014 ACS questions about household movement into and within the five boroughs of New York City.

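The code below creates the geographies for our pull from the ACS, one per borough (each borough is its own county), and then imports data from table B07003, a table already created by the Census Bureau. Enter 2014 for endyear to tell acs.fetch() which year’s data we want. Because no span is given, acs.fetch() uses its default of span = 5, so the call returns the five-year estimates ending in 2014.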

bronx <- geo.make(county = "Bronx County", state = "NY")
manhattan <- geo.make(county = "New York County", state = "NY")
queens <- geo.make(county = "Queens County", state = "NY")
brooklyn <- geo.make(county = "Kings County", state = "NY")
statenisland <- geo.make(county = "Richmond County", state = "NY")
nyc_counties <- bronx + manhattan + queens + brooklyn + statenisland

migration <- acs.fetch(geography = nyc_counties, table.number = "B07003", endyear = 2014, col.names = "pretty")

The column names in the table are extremely long, so this code shortens them by keeping only the text from character 82 onward, which drops the long prefix repeated in every name. Then, we transpose the table (swap its rows and columns) to make it easier to read.

migration.names <- acs.colnames(migration)
new.names <- stringr::str_sub(migration.names, start = 82)
acs.colnames(migration) <- new.names

estimates <- as.data.frame(estimate(migration))

estimatest <- t(estimates)

estimatest

	Bronx County, New York	New York County, New York	Queens County, New York	Kings County, New York	Richmond County, New York
Total:	1393802	1600370	2253840	2534064	466317
Male	654198	753692	1091228	1197613	225809
Female	739604	846678	1162612	1336451	240508
Same house 1 year ago:	1233672	1338628	2034932	2294179	439584
Same house 1 year ago: Male	572558	629956	983286	1081083	212229
Same house 1 year ago: Female	661114	708672	1051646	1213096	227355
Moved within same county:	96316	122220	131218	147184	16226
Moved within same county: Male	47355	59340	63692	70764	8026
Moved within same county: Female	48961	62880	67526	76420	8200
Moved from different county within same state:	36972	44328	43493	40441	6358
Moved from different county within same state: Male	21018	21334	22668	19955	3317
Moved from different county within same state: Female	15954	22994	20825	20486	3041
Moved from different state:	12143	63513	19414	28869	2846
Moved from different state: Male	6236	28193	9674	14424	1510
Moved from different state: Female	5907	35320	9740	14445	1336
Moved from abroad:	14699	31681	24783	23391	1303
Moved from abroad: Male	7031	14869	11908	11387	727
Moved from abroad: Female	7668	16812	12875	12004	576
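Before moving on, it can help to confirm that the table is internally consistent. The sketch below copies the Bronx County counts from the table above and checks that the sex breakdown and the five mobility categories each add back up to the total. The variable names are made up for this illustration.

```r
# Counts for Bronx County, copied from the table above.
total  <- 1393802
male   <- 654198
female <- 739604

# The five mobility categories (both sexes combined).
mobility <- c(same_house       = 1233672,
              same_county      = 96316,
              different_county = 36972,
              different_state  = 12143,
              abroad           = 14699)

# Each breakdown should add back up to the total.
male + female == total   # TRUE
sum(mobility) == total   # TRUE
```

The same check can be repeated for any other borough by swapping in its column of counts.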

Next, since we’ll want to work with the data, we make some changes to how the name of each variable is formatted: spaces become periods and colons are removed, so the names are easy to type after the $ operator.

new.names <- str_replace_all(str_trim(new.names), " ", ".")
new.names <- str_replace_all(str_trim(new.names), ":", "")
colnames(estimates) <- new.names
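To see what the two renaming steps do without calling the Census API, here is a small sketch on two labels taken from the count table. It rests on two assumptions. The prefix is invented, since the real one is much longer. Base R functions (substring, trimws, gsub) stand in for the stringr calls so that the example runs without extra packages.

```r
# Two labels as they appear in the count table, each behind a
# made-up prefix standing in for the real (much longer) one.
prefix <- "Hypothetical Table Title: "
long.names <- paste0(prefix, c("Total:", "Same house 1 year ago: Male"))

# Step 1: keep everything after the prefix
# (the lab does this with str_sub(start = 82)).
short <- substring(long.names, nchar(prefix) + 1)

# Step 2: trim whitespace, replace spaces with periods, then drop colons
# (the lab does this with str_trim() and str_replace_all()).
clean <- gsub(":", "", gsub(" ", ".", trimws(short), fixed = TRUE), fixed = TRUE)

clean
# the two values are "Total" and "Same.house.1.year.ago.Male"
```

The cleaned names match the column names used in the percentage calculations, such as estimates$Same.house.1.year.ago.Male.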

The ACS provides us with counts, or the estimated number of people that fall within each category. We want to look at percentages so we can compare the relative proportions within each category. The code below calculates them: each mobility category is expressed as a percentage of the borough’s total, and the male and female counts are expressed as a percentage of their own category.

estimates$percent_total <- round(100*estimates$Total/estimates$Total)

estimates$percent_male <- round(100*estimates$Male/estimates$Total)
estimates$percent_female <- round(100*estimates$Female/estimates$Total)

estimates$percent_same_house <- round(100*estimates$Same.house.1.year.ago/estimates$Total)
estimates$percent_same_house_male <- round(100*estimates$Same.house.1.year.ago.Male/estimates$Same.house.1.year.ago)
estimates$percent_same_house_female <- round(100*estimates$Same.house.1.year.ago.Female/estimates$Same.house.1.year.ago)

estimates$percent_moved_within_same_county <- round(100*estimates$Moved.within.same.county/estimates$Total)
estimates$percent_moved_within_same_county_male <- round(100*estimates$Moved.within.same.county.Male/estimates$Moved.within.same.county)
estimates$percent_moved_within_same_county_female <- round(100*estimates$Moved.within.same.county.Female/estimates$Moved.within.same.county)

estimates$percent_moved_from_different_county_within_same_state <- round(100*estimates$Moved.from.different.county.within.same.state/estimates$Total)
estimates$percent_moved_from_different_county_within_same_state_male <- round(100*estimates$Moved.from.different.county.within.same.state.Male/estimates$Moved.from.different.county.within.same.state)
estimates$percent_moved_from_different_county_within_same_state_female <- round(100*estimates$Moved.from.different.county.within.same.state.Female/estimates$Moved.from.different.county.within.same.state)

estimates$percent_moved_from_different_state <- round(100*estimates$Moved.from.different.state/estimates$Total)
estimates$percent_moved_from_different_state_male <- round(100*estimates$Moved.from.different.state.Male/estimates$Moved.from.different.state)
estimates$percent_moved_from_different_state_female <- round(100*estimates$Moved.from.different.state.Female/estimates$Moved.from.different.state)

estimates$percent_moved_from_abroad <- round(100*estimates$Moved.from.abroad/estimates$Total)
estimates$percent_moved_from_abroad_male <- round(100*estimates$Moved.from.abroad.Male/estimates$Moved.from.abroad)
estimates$percent_moved_from_abroad_female <- round(100*estimates$Moved.from.abroad.Female/estimates$Moved.from.abroad)
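Every line above follows the same pattern: divide one column by another, multiply by 100, and round. The sketch below shows that pattern on a toy data frame built from the Bronx and Manhattan counts in the first table. The helper pct() and the data frame toy are hypothetical names introduced only for this illustration and are not part of the lab template.

```r
# Hypothetical helper: rounded percentage of `part` out of `whole`.
pct <- function(part, whole) round(100 * part / whole)

# Toy data frame holding three columns of counts for two boroughs.
toy <- data.frame(
  Total                      = c(1393802, 1600370),
  Same.house.1.year.ago      = c(1233672, 1338628),
  Same.house.1.year.ago.Male = c(572558, 629956),
  row.names = c("Bronx", "Manhattan")
)

# Share of all residents who stayed in the same house.
toy$percent_same_house <- pct(toy$Same.house.1.year.ago, toy$Total)

# Share of those stayers who are male (note the different denominator).
toy$percent_same_house_male <- pct(toy$Same.house.1.year.ago.Male,
                                   toy$Same.house.1.year.ago)

toy[, c("percent_same_house", "percent_same_house_male")]
```

The key design point is the denominator: the category totals are divided by Total, while the male and female counts are divided by their own category.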

Now, we want to create a separate table with just the percentages. The code below separates the percentage variables from the original count variables. After selecting the variables we want in our new table, we’ll need to flip the rows and columns again using the t() command. Go ahead and name the datasets. You’ll need one name for the dataset that selects just the variables in percent format, and a second name for the transposed version.

percenttable <- dplyr::select(estimates,
  percent_total, percent_male, percent_female,
  percent_same_house, percent_same_house_male, percent_same_house_female,
  percent_moved_within_same_county, percent_moved_within_same_county_male, percent_moved_within_same_county_female,
  percent_moved_from_different_county_within_same_state, percent_moved_from_different_county_within_same_state_male, percent_moved_from_different_county_within_same_state_female,
  percent_moved_from_different_state, percent_moved_from_different_state_male, percent_moved_from_different_state_female,
  percent_moved_from_abroad, percent_moved_from_abroad_male, percent_moved_from_abroad_female)

tpercenttable <- t(percenttable)

tpercenttable

	Bronx County, New York	New York County, New York	Queens County, New York	Kings County, New York	Richmond County, New York
percent_total	100	100	100	100	100
percent_male	47	47	48	47	48
percent_female	53	53	52	53	52
percent_same_house	89	84	90	91	94
percent_same_house_male	46	47	48	47	48
percent_same_house_female	54	53	52	53	52
percent_moved_within_same_county	7	8	6	6	3
percent_moved_within_same_county_male	49	49	49	48	49
percent_moved_within_same_county_female	51	51	51	52	51
percent_moved_from_different_county_within_same_state	3	3	2	2	1
percent_moved_from_different_county_within_same_state_male	57	48	52	49	52
percent_moved_from_different_county_within_same_state_female	43	52	48	51	48
percent_moved_from_different_state	1	4	1	1	1
percent_moved_from_different_state_male	51	44	50	50	53
percent_moved_from_different_state_female	49	56	50	50	47
percent_moved_from_abroad	1	2	1	1	0
percent_moved_from_abroad_male	48	47	48	49	56
percent_moved_from_abroad_female	52	53	52	51	44
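One thing to keep in mind when reading this table is that every value was rounded to a whole number, so the five mobility percentages in a column do not have to sum to exactly 100. The sketch below recomputes the Bronx column from the counts in the first table to show the effect. The variable names are made up for this illustration.

```r
# Bronx County counts from the first table.
total <- 1393802
mobility <- c(same_house       = 1233672,
              same_county      = 96316,
              different_county = 36972,
              different_state  = 12143,
              abroad           = 14699)

exact   <- 100 * mobility / total   # unrounded percentages
rounded <- round(exact)             # what the lab code stores

sum(exact)     # 100, up to floating-point error
rounded        # 89 7 3 1 1, matching the Bronx column above
sum(rounded)   # 101
```

The extra point comes from rounding alone. The underlying counts still add up to the borough total.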

Go ahead and knit this template to PDF. The main results you’ll see are two tables: the first with counts and the second with percentages. We will interpret these tables in the lab on Blackboard.