library(tidycensus)
library(tidyverse)
Workshop 4
Objectives for the assignment
This assignment was designed to practice my skills in Quarto & R Markdown (which can also be used with languages other than R, such as Python & C++). For this assignment, I will be formatting my previous assignment, Workshop 3.
In Workshop 3, I tried to identify whether the types of moves people make differ, or follow distinct patterns, between urban and rural populations.
Method used in the assignment
To start the analysis for this or any assignment, it is first important to load the necessary libraries, which are packages within R. This is done with the library() calls at the top of this document. The remaining steps are:
- Load the variable list for the 2016 American Community Survey (ACS) 5-year estimates and assign it to the variable ‘v16’.
v16 <- load_variables(2016, "acs5", cache = TRUE)
- Pull the total population of each North Carolina county for 2016.
popNC_total_2016 <- get_acs(geography = "county",
                            state = "NC",
                            variables = "B03001_001",
                            year = 2016) |>
  select(GEOID, estimate, NAME) |>
  rename(pop_total_2016 = estimate)
- Categorize the types of movers: moved within the same county, moved from a different county, moved from a different state, moved from abroad.
# moved within the same county (the label for this variable code can be checked in v16)
NCmoved_samecounty_2016 <- get_acs(geography = "county",
                                   state = "NC",
                                   variables = "B07001_003",
                                   year = 2016) |>
  select(GEOID, estimate, NAME) |>
  rename(NCmoved_samecounty_2016 = estimate)
The sample code above for moves within the same county was duplicated, with minor changes, to obtain the same information for the other three types of relocation.
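The four per-type tables then need to be brought together before they can be compared. The code below is a minimal sketch of one way to do that, not the exact code from Workshop 3. It assumes each table keeps a `GEOID` column to join on, and it uses small made-up tables in place of the real `get_acs()` results so it can run without a Census API key. The object name `movers_2016` and the table names `same_county`, `diff_county`, `diff_state` and `abroad` are hypothetical.

```r
library(dplyr)
library(purrr)

# Toy stand-ins for the four get_acs() results.
# All GEOIDs, names and counts are made up for illustration.
same_county <- tibble(GEOID = c("00001", "00002"),
                      NAME = c("County A", "County B"),
                      NCmoved_samecounty_2016 = c(900, 120))
diff_county <- tibble(GEOID = c("00001", "00002"),
                      NCmoved_diffcounty_2016 = c(400, 60))
diff_state  <- tibble(GEOID = c("00001", "00002"),
                      NCmoved_state_2016 = c(300, 15))
abroad      <- tibble(GEOID = c("00001", "00002"),
                      NCmoved_abroad_2016 = c(50, 5))

# Join the four tables one after another on GEOID,
# then add up the four move types into a total per county.
movers_2016 <- list(same_county, diff_county, diff_state, abroad) |>
  reduce(left_join, by = "GEOID") |>
  mutate(total_moved_2016 = NCmoved_samecounty_2016 +
           NCmoved_diffcounty_2016 +
           NCmoved_state_2016 +
           NCmoved_abroad_2016)

print(movers_2016)
```

Joining on `GEOID` rather than `NAME` avoids mismatches if county names are formatted differently between tables.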
Once the mover-type data had been identified and structured for all 100 counties in North Carolina, I categorized the counties for analysis: the 10 counties with the highest population counts were treated as metro, and the 10 counties with the lowest population counts were treated as nonmetro.
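The selection of the most and least populated counties can be sketched as follows. This is an illustration under assumptions, not the original Workshop 3 code: the table `counties` holds made-up values, `n_keep` is set to 2 so the toy example stays small (the assignment uses 10), and the names `movers_top`, `movers_bottom` and `county_groups` are hypothetical.

```r
library(dplyr)

# Toy county table with made-up populations; the real table has 100 rows.
counties <- tibble(
  NAME = paste("County", LETTERS[1:6]),
  pop_total_2016 = c(1000, 250000, 4000, 90000, 600000, 12000)
)

n_keep <- 2  # the assignment uses 10

# Most populated counties (treated as metro in this assignment)
movers_top <- counties |>
  slice_max(pop_total_2016, n = n_keep)

# Least populated counties (treated as nonmetro in this assignment)
movers_bottom <- counties |>
  slice_min(pop_total_2016, n = n_keep)

# Keep both groups in one table with a label for later comparison
county_groups <- bind_rows(
  mutate(movers_top, group = "metro"),
  mutate(movers_bottom, group = "nonmetro")
)

print(county_groups)
```

By default `slice_max()` and `slice_min()` keep tied rows, so a group can contain more than `n_keep` counties if populations are equal at the cutoff.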
Then, I calculated each move type as a percentage of the county's total population, so that counties of different sizes could be compared on the same scale. My assumption was that the mix of move types would differ from county to county.
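A minimal sketch of this scaling step is shown below. It assumes one row per county with the population and the four mover counts in columns named as in the code elsewhere in this document. The values are made up, and the `pct_` prefix on the new columns is a hypothetical naming choice.

```r
library(dplyr)

# Toy table with made-up counts, one row per county.
movers <- tibble(
  NAME = c("County A", "County B"),
  pop_total_2016 = c(10000, 2000),
  NCmoved_samecounty_2016 = c(900, 120),
  NCmoved_diffcounty_2016 = c(400, 60),
  NCmoved_state_2016 = c(300, 15),
  NCmoved_abroad_2016 = c(50, 5)
)

# For every NCmoved_* column, add a pct_* column holding
# that move type as a percentage of the county's population.
movers_pct <- movers |>
  mutate(across(starts_with("NCmoved"),
                ~ .x / pop_total_2016 * 100,
                .names = "pct_{.col}"))

print(select(movers_pct, NAME, starts_with("pct_")))
```

Using `across()` keeps the calculation in one place, so the same formula is applied to all four move types.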
To highlight the differences in percentages among the mover types, I created a bar chart for each county showing how the move types are distributed within that county.
### PROPORTION BY COUNTY
y <- c(countyname="Buncombe County, North Carolina")
buncombe <- movers_top10_2016[movers_top10_2016$NAME == "Buncombe County, North Carolina" ,]
bper_movedsamecounty <- buncombe$NCmoved_samecounty_2016 / buncombe$total_moved_2016
bper_moveddiffcounty <- buncombe$NCmoved_diffcounty_2016 / buncombe$total_moved_2016
bper_movedstate <- buncombe$NCmoved_state_2016 / buncombe$total_moved_2016
bper_movedabroad <- buncombe$NCmoved_abroad_2016 / buncombe$total_moved_2016
buncombe_long <- buncombe %>%
  pivot_longer(
    cols = starts_with("NCmoved"),
    names_to = "move_type",
    values_to = "count"
  )
#plot
ggplot(buncombe_long, aes(x = NAME, y = count, fill = move_type)) +
  geom_bar(stat = "identity", position = "fill") +
  scale_y_continuous(labels = scales::percent) +
  labs(
    title = "Proportion of Movers by Type (Buncombe County)",
    x = "County",
    y = "Percent of Movers",
    fill = "Move Type"
  ) +
  theme_minimal()
- Then I completed the same analysis steps 1-6, including the charts, for the nonmetro counties (the 10 least populated counties).
### PROPORTION BY COUNTY LOWER END
f <- c(countyname="Tyrrell County, North Carolina")
tyrell <- movers_bottom10_2016[movers_bottom10_2016$NAME == "Tyrrell County, North Carolina" ,]
tper_movedsamecounty <- tyrell$NCmoved_samecounty_2016 / tyrell$total_moved_2016
tper_moveddiffcounty <- tyrell$NCmoved_diffcounty_2016 / tyrell$total_moved_2016
tper_movedstate <- tyrell$NCmoved_state_2016 / tyrell$total_moved_2016
tper_movedabroad <- tyrell$NCmoved_abroad_2016 / tyrell$total_moved_2016
tyrell_long <- tyrell %>%
  pivot_longer(
    cols = starts_with("NCmoved"),
    names_to = "move_type",
    values_to = "count"
  )
#plot
ggplot(tyrell_long, aes(x = NAME, y = count, fill = move_type)) +
  geom_bar(stat = "identity", position = "fill") +
  scale_y_continuous(labels = scales::percent) +
  labs(
    title = "Proportion of Movers by Type (Tyrrell County)",
    x = "County",
    y = "Percent of Movers",
    fill = "Move Type"
  ) +
  theme_minimal()
After analyzing the results of this assignment, I found no major differences between metro and nonmetro counties, and no common trends in the ratios of mover types relative to the population.
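One way to check a comparison like this in numbers rather than by reading individual charts is to pool the counts within each group and compare the resulting shares. The sketch below is only an illustration under assumptions: the table `movers_all`, its `group` column and all counts are made up, and it is not the code or data used in Workshop 3.

```r
library(dplyr)
library(tidyr)

# Toy combined table (made-up counts), one row per county with a group label.
movers_all <- tibble(
  NAME = paste("County", LETTERS[1:4]),
  group = c("metro", "metro", "nonmetro", "nonmetro"),
  NCmoved_samecounty_2016 = c(900, 700, 60, 40),
  NCmoved_diffcounty_2016 = c(400, 300, 30, 20),
  NCmoved_state_2016 = c(300, 200, 8, 12),
  NCmoved_abroad_2016 = c(50, 50, 2, 8)
)

group_shares <- movers_all |>
  # One row per county and move type
  pivot_longer(starts_with("NCmoved"),
               names_to = "move_type",
               values_to = "count") |>
  # Pool the counts of all counties in the same group
  group_by(group, move_type) |>
  summarise(count = sum(count), .groups = "drop_last") |>
  # Share of each move type among all movers in the group
  mutate(share = count / sum(count)) |>
  ungroup()

print(group_shares)
```

Pooling the counts gives larger counties more weight within their group. Averaging the county-level percentages instead would weight every county equally, which may be preferable when the groups contain counties of very different sizes.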
POST ASSIGNMENT NOTES:
Nonmetro Counties are Commonly Used to Depict Rural and Small-Town Trends
“USDA, Economic Research Service (ERS) researchers and others who analyze conditions in “rural” America most often use data on nonmetropolitan (nonmetro) areas, defined by the U.S. Office of Management and Budget (OMB) on the basis of counties or county-equivalent units (e.g., parishes, boroughs). Counties are a standard unit for publishing economic data and for conducting research to track and explain regional population and economic trends. Estimates of population, employment, and income are available for counties annually. Counties also are frequently used as basic building blocks for areas of economic and social integration, such as labor-market areas.”