Population Age Project

Question

This project asks which age had the largest population in each state in 2019. The data set has the columns state, state_name, age, population (the number of people of that age in the state), and state_total_population. The columns used to answer the question are state, age, and population. The data set used is Population Age 2019 Data, and the source is the Centers for Disease Control and Prevention. https://www.openintro.org/data/index.php?data=pop_age_2019

Data Analysis, plot

The first step was loading the data set. Next, the columns needed to answer the question were selected. The data were then grouped by state, rows with the age "85+" were removed because that category covers more than one age, and the data were filtered to keep only the row with the maximum population in each state, since that row is the age with the largest population. Lastly, a visualization of the results was created, and a count showed how frequently each age appeared.

Loading

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.1     ✔ stringr   1.5.2
## ✔ ggplot2   4.0.0     ✔ tibble    3.3.0
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.1.0     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
pa19 <- read_csv("pop_age_2019.csv") #loading
## Rows: 4386 Columns: 5
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (3): state, state_name, age
## dbl (2): population, state_total_population
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Cleaning

pa19s <- pa19 |>
  select(state, age, population) #selecting the columns that are going to be used to answer the question
pa19sf <- pa19s |>
  group_by(state) |> #grouping the data by state
  filter(age != "85+") |> #removing the 85+ rows because that category includes more than one age
  filter(population == max(population)) |> #keeping only the maximum population for each state, as that will be the age with the largest population
  arrange(desc(age)) #wanted to see the oldest and youngest 
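
The cleaning step can also be tried on a small made-up table. This is a minimal sketch, not the analysis itself: the tibble `toy` and all of its values are invented for illustration. It assumes that age is read as text (as the column specification above reports) and converts it to an integer after the "85+" rows are dropped, so that sorting by age is numeric rather than alphabetical. It also uses `slice_max()` in place of `filter(population == max(population))`; with `with_ties = TRUE` both approaches keep tied rows.

```r
library(dplyr)

# Hypothetical toy data: two states, three age rows each (values are invented)
toy <- tibble(
  state = c("AA", "AA", "AA", "BB", "BB", "BB"),
  age = c("9", "28", "85+", "9", "28", "85+"),
  population = c(100, 250, 900, 300, 120, 800)
)

toy_max <- toy |>
  filter(age != "85+") |>            # drop the open-ended category first
  mutate(age = as.integer(age)) |>   # numeric ages sort correctly (9 < 28)
  group_by(state) |>
  slice_max(population, n = 1, with_ties = TRUE) |>  # largest age group per state
  ungroup() |>
  arrange(desc(age))                 # oldest winning age first

toy_max
```

Without the conversion to integer, "9" would sort after "28" because text is compared character by character. That does not affect the results in this report, since every winning age has two digits.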

Visualization

#from data110
pa19sfg2 <- ggplot(pa19sf, aes(x = state, y = age, fill = population)) +
  geom_col(position = position_dodge(width = 5), width = .4) +
  scale_fill_gradient(low = "yellow4", high = "red", name = "Number of People That Are That Age in the State") +
  labs(title = "Age With Largest Population by State",
       x = "State",
       y = "Age",
       caption = "Source: Centers for Disease Control and Prevention") +
  theme_minimal(base_size = 5) #storing the plot so it can be printed below

pa19sfg2 #printing the plot

pa19sfc <- pa19sf |>
  group_by(age) |> #grouping by age
  count(age) #wanted to see which ages appeared most frequently

head(pa19sfc,n=12) # to show all the ages and their count 
## # A tibble: 12 × 2
## # Groups:   age [12]
##    age       n
##    <chr> <int>
##  1 12        2
##  2 19        7
##  3 20        1
##  4 22        1
##  5 23        1
##  6 28       21
##  7 29        4
##  8 55        4
##  9 58        2
## 10 59        4
## 11 61        3
## 12 62        1
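
The counts in the table above add up to 51, which would be consistent with one winning age for each of the 50 states plus the District of Columbia and no ties. The sketch below shows one way to run that check and to list the most frequent age first. The tibble `winners` and its values are invented for illustration, and `n_states` is a column name chosen here, not one from the data set.

```r
library(dplyr)

# Hypothetical per-state results (invented values, one row per state)
winners <- tibble(
  state = c("AA", "BB", "CC", "DD", "EE"),
  age = c(28L, 28L, 19L, 28L, 55L)
)

age_counts <- winners |>
  count(age, sort = TRUE, name = "n_states")  # most frequent age first

# If no state has a tie, the counts add up to the number of states
stopifnot(sum(age_counts$n_states) == n_distinct(winners$state))

age_counts
```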

Conclusion

The visualization shows which age had the largest population in each state in 2019. The oldest age with the largest population in a state was 62, in Montana (MT). The youngest age with the largest population in a state was 12, in Idaho (ID) and Mississippi (MS). Age 28 had the largest population in 21 states, more than any other age. In the future it would be better to have a nicer visualization, maybe arranging it in descending order, and to answer a different question.
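
One possible way to arrange the bars in descending order is sketched below. It is only an illustration under stated assumptions: the data frame `winners` and its values are invented, and age is stored as a number so the bar heights are meaningful. `reorder()` sorts the states by age, and the minus sign reverses the order.

```r
library(ggplot2)

# Hypothetical per-state results (invented values)
winners <- data.frame(
  state = c("AA", "BB", "CC", "DD"),
  age = c(28, 62, 12, 19),
  population = c(5000, 1200, 3000, 2500)
)

# reorder() sorts the states by age; -age puts the oldest first
p <- ggplot(winners, aes(x = reorder(state, -age), y = age, fill = population)) +
  geom_col() +
  labs(title = "Age With Largest Population by State (toy data)",
       x = "State",
       y = "Age",
       fill = "Population")

p
```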

Population Age 2019 Data (“pop_age_2019.csv”) source: Centers for Disease Control and Prevention. https://www.openintro.org/data/index.php?data=pop_age_2019