3/1/24 Midterm Exam

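```
# Install tidyverse if it is missing, then load it (provides left_join(), mutate(), and the %>% pipe used below)
if (!require("tidyverse")) install.packages("tidyverse")
library(tidyverse)

# Read the 2022 income data from its URL into a data frame
Inc2022 <- read.csv("https://drkblake.com/wp-content/uploads/2024/02/Inc2022.csv")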

# Read the 2017 income data from its URL into a data frame
Inc2017 <- read.csv("https://drkblake.com/wp-content/uploads/2024/02/Inc2017.csv")

The first half of this code installs and loads the tidyverse package, which provides the functions needed for the rest of the analysis. The second half reads the data into R by downloading each CSV file from its link. That sets up my two initial data frames, Inc2022 and Inc2017.

Income <- left_join(Inc2022, Inc2017, by = join_by(GEOID == GEOID))

The left_join function joins my Inc2022 and Inc2017 dataframes into one by matching rows on their common GEOID column. The result is a new dataframe called Income.
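# A minimal sketch of the same kind of join on made-up data, assuming dplyr 1.1.0
# or later for join_by(). The toy data frames, GEOIDs, and income values here are
# hypothetical and are not taken from the exam files.
library(dplyr)

toy2022 <- data.frame(GEOID = c("A", "B", "C"), HHInc2022 = c(120000, 85000, 60000))
toy2017 <- data.frame(GEOID = c("A", "B"), HHInc2017 = c(100000, 80000))

# left_join keeps every row of the first data frame. GEOID "C" has no match in
# toy2017, so its HHInc2017 value comes back as NA instead of the row being dropped.
toy_income <- left_join(toy2022, toy2017, by = join_by(GEOID))
toy_income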

Income <- Income %>%
  mutate(Change = HHInc2022 - HHInc2017)
head(Income, 10)

For my third bit of code, I used the mutate function to add a new column, Change, to the Income dataframe. Change is HHInc2022 minus HHInc2017, so it shows how much each district's income figure changed over the five years between the two datasets.
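# A small illustration of the subtraction, using hypothetical values rather than
# the real data. The toy_income data frame is made up for this example.
library(dplyr)

toy_income <- data.frame(
  GEOID = c("A", "B"),
  HHInc2022 = c(120000, 85000),
  HHInc2017 = c(100000, 90000)
)

# mutate adds Change as a new column and leaves the existing columns as they were.
toy_income <- toy_income %>%
  mutate(Change = HHInc2022 - HHInc2017)
toy_income$Change  # 20000 for "A" (income rose), -5000 for "B" (income fell)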

Income <- Income %>%
  mutate(Level = case_when(
    HHInc2022 > 99999 ~ "$100K+",
    HHInc2022 < 100000 ~ "<$100K",
    .default = "Error"
  ))

To add the Level variable to the Income dataframe, I used mutate again, this time together with the case_when function. case_when labels each district "$100K+" when HHInc2022 is above 99999 and "<$100K" when it is below 100000, which lets me see which districts fall on each side of $100K. Any row that matches neither condition, such as one with a missing HHInc2022 value, gets the default label "Error".
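# A sketch of how case_when assigns the labels, using hypothetical incomes chosen
# to sit on both sides of the cutoff, plus one missing value. Assumes dplyr 1.1.0
# or later for the .default argument.
library(dplyr)

toy_income <- data.frame(HHInc2022 = c(150000, 100000, 99999, 42000, NA))

toy_income <- toy_income %>%
  mutate(Level = case_when(
    HHInc2022 > 99999 ~ "$100K+",
    HHInc2022 < 100000 ~ "<$100K",
    .default = "Error"
  ))

# Conditions are checked in order. The NA row matches neither comparison,
# so it falls through to .default and is labeled "Error".
toy_income$Level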

LevelByCounty <- Income %>%
  group_by(County, Level) %>%
  summarize(Count = n()) %>%
  pivot_wider(names_from = Level, values_from = Count)
head(LevelByCounty, 10)

This was a longer chain of functions, but it got the job done and produced a new dataframe, LevelByCounty. group_by groups the rows by County and Level, and summarize counts the districts in each group. pivot_wider then spreads the Level values into their own columns, taking the column names from Level and the cell values from Count, so each county ends up on one row. head(LevelByCounty, 10) lets me see the first 10 rows of the dataframe.
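# A minimal sketch of the count-then-widen pattern on made-up data. The counties
# and levels are hypothetical. .groups = "drop" is an addition here that only
# silences the grouping message from summarize.
library(dplyr)
library(tidyr)

toy_income <- data.frame(
  County = c("X", "X", "X", "Y", "Y"),
  Level = c("$100K+", "<$100K", "<$100K", "<$100K", "<$100K")
)

# Long form: one row per County and Level pair, with the number of districts in each.
toy_counts <- toy_income %>%
  group_by(County, Level) %>%
  summarize(Count = n(), .groups = "drop")

# Wide form: one row per county, one column per Level value.
toy_wide <- toy_counts %>%
  pivot_wider(names_from = Level, values_from = Count)

# County "Y" has no "$100K+" districts, so that cell is NA in the wide table.
toy_wide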

RichDistricts <- Income %>%
  filter(Level == "$100K+")
head(RichDistricts, 10)

This chunk of code creates the new dataframe RichDistricts by filtering Income on Level, keeping only the districts labeled "$100K+". To do that, Income goes on the right side of the assignment in the first line, and filter follows the pipe on the next line to keep only the matching rows.

RichDistricts <- RichDistricts %>%
  arrange(desc(HHInc2022))
head(RichDistricts, 10)

The final chunk of code sorts RichDistricts with the arrange function. Putting the HHInc2022 column inside desc() orders the rows from the highest 2022 income to the lowest. RichDistricts goes on the first line to set which dataframe is being sorted, and the sorted result is saved back to RichDistricts.
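# A short sketch of filter followed by arrange(desc()) on hypothetical rows,
# chaining the two steps that the exam code runs separately.
library(dplyr)

toy_income <- data.frame(
  GEOID = c("A", "B", "C", "D"),
  HHInc2022 = c(110000, 85000, 150000, 125000),
  Level = c("$100K+", "<$100K", "$100K+", "$100K+")
)

toy_rich <- toy_income %>%
  filter(Level == "$100K+") %>%  # drops "B", the only "<$100K" row
  arrange(desc(HHInc2022))       # highest 2022 income first

toy_rich$GEOID  # "C" "D" "A"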
One pattern I noticed is that the biggest total changes are happening in Wilson County. The three districts with the biggest changes were all within Wilson County. This suggests that, of all the counties in the mid-state, Wilson County is becoming one of the top places to live in the Nashville area.
```

Dylan Simmons

2024-03-01