Start with Nat0718.
Look at the data for one combination of Race, Year, and Region. Just look at Age and Rate. Select Whites in the Northeast in 2018.
onecell = Nat0718 %>%
  filter(Race == "White", Region == "NE", Year == 2018) %>%
  select(Age, Rate)
onecell
## # A tibble: 6 x 2
##   Age      Rate
##   <fct>   <dbl>
## 1 15-19 0.00998
## 2 20-24 0.0468
## 3 25-29 0.0825
## 4 30-34 0.110
## 5 35-39 0.0639
## 6 40-44 0.0144
These are average births per woman per year. Assuming that the behavior of these women is constant for all their child-bearing years, how would you compute the average number of births per woman during a lifetime?
Each age group spans five years, so a woman spends five years at each rate. Multiply the rates by 5 and add them up.
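As a quick check of this rule, the sketch below re-enters the six rates printed above by hand and applies it. Because the printed rates are rounded to three significant figures, the result is only approximate. The names `printed_rates` and `tfr_onecell` are introduced here for illustration and are not part of the original analysis.

```r
# Rates copied by hand from the printed tibble above (rounded values).
printed_rates = data.frame(
  Age  = c("15-19", "20-24", "25-29", "30-34", "35-39", "40-44"),
  Rate = c(0.00998, 0.0468, 0.0825, 0.110, 0.0639, 0.0144)
)

# Each age group covers 5 years, so each annual rate is counted 5 times.
tfr_onecell = 5 * sum(printed_rates$Rate)
tfr_onecell
```

With these rounded rates the total comes to roughly 1.64 births per woman over a lifetime.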
Get the national rate by Race, Year and Age
This step combines the regions into a single national figure for each Race, Year, and Age.
Put the results in a dataframe RaceYrAge. Note that you will have to compute the rates at this level by adding up the births and the Fpop because these are not the lowest level cells in the table. These rates will be per woman per year.
RaceYrAge = Nat0718 %>%
  group_by(Race, yr, Age) %>%
  summarize(Births = sum(Births),
            Fpop = sum(Fpop)) %>%
  mutate(Rate = Births / Fpop)
## `summarise()` has grouped output by 'Race', 'yr'. You can override using the `.groups` argument.
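To see why the births and Fpop are summed before dividing, rather than averaging the regional rates, here is a small sketch with made-up counts for two hypothetical regions. None of these numbers come from Nat0718, and the names `toy`, `pooled_rate`, and `mean_of_rates` are illustrative only.

```r
# Made-up counts for two hypothetical regions (not taken from Nat0718).
toy = data.frame(
  Region = c("A", "B"),
  Births = c(10, 90),
  Fpop   = c(100, 1800)
)
toy$Rate = toy$Births / toy$Fpop   # regional rates: 0.10 and 0.05

# Pooled rate: total births divided by total women.
pooled_rate = sum(toy$Births) / sum(toy$Fpop)   # 100 / 1900

# Naive alternative: unweighted average of the two regional rates.
mean_of_rates = mean(toy$Rate)                  # (0.10 + 0.05) / 2

c(pooled = pooled_rate, naive = mean_of_rates)
```

The two answers differ because the regions have very different populations. The pooled rate weights each region by its number of women, which is what a national rate per woman per year requires.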
Combine the data to get the national value of TFR by race and year.
## `summarise()` has grouped output by 'Race'. You can override using the `.groups` argument.
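The code for this step is not shown above. One way it might look is sketched here on a small stand-in table, so that the block runs on its own. The stand-in `toy_RaceYrAge` is hypothetical: its 2018 rates are borrowed from the Northeast cell printed earlier and its 2017 rates are made up, so neither row is a real national figure. The `.groups = "drop"` argument is a choice made for this sketch; it returns an ungrouped result and suppresses the message.

```r
library(dplyr)

# Stand-in for RaceYrAge. The 2018 rates are borrowed from the Northeast cell
# printed earlier; the 2017 rates are made up. Neither is a real national rate.
toy_RaceYrAge = tibble(
  Race = "White",
  yr   = rep(c(2017, 2018), each = 6),
  Age  = rep(c("15-19", "20-24", "25-29", "30-34", "35-39", "40-44"), times = 2),
  Rate = c(rep(0.02, 6),
           0.00998, 0.0468, 0.0825, 0.110, 0.0639, 0.0144)
)

# One TFR per Race and year: 5 years per age group times the sum of the rates.
toy_TFR = toy_RaceYrAge %>%
  group_by(Race, yr) %>%
  summarize(TFR = 5 * sum(Rate), .groups = "drop")
toy_TFR
```

Applied to the real RaceYrAge, the same pipeline would give one row per combination of Race and yr.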
Plot the data as we have it now. Do a scatterplot of TFR by year. Map color to race.
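A minimal sketch of such a plot with ggplot2 follows. It assumes the TFR table has columns named `Race`, `yr`, and `TFR`. The data frame `toy_plot_data` holds placeholder values for two invented groups, used only to show the shape of the call; they are not real fertility rates.

```r
library(ggplot2)

# Placeholder values only, to show the shape of the call. Not real TFRs.
toy_plot_data = data.frame(
  Race = rep(c("Group A", "Group B"), each = 3),
  yr   = rep(c(2016, 2017, 2018), times = 2),
  TFR  = c(1.9, 1.8, 1.7, 2.1, 2.0, 1.9)
)

# Scatterplot of TFR against year, with color mapped to Race.
p = ggplot(toy_plot_data, aes(x = yr, y = TFR, color = Race)) +
  geom_point()
p
```

With the real table, replace `toy_plot_data` with the data frame that holds the national TFR values.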