Harold Nelson
02/27/2022
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5 ✓ purrr 0.3.4
## ✓ tibble 3.1.5 ✓ dplyr 1.0.7
## ✓ tidyr 1.1.4 ✓ stringr 1.4.0
## ✓ readr 2.0.2 ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
We’re going to get birth data by state from 2003 through 2020.
The following video will show you how to get the basic data from CDC Wonder. Point your browser to https://wonder.cdc.gov/. Then follow along with the video.
Video Link: https://www.youtube.com/watch?v=Oiw7bm4GjvQ
Download the data for 2003-2006 from CDC Wonder, following the directions in the video, to obtain by_state_year_0306. Import the data into your R environment using the “Import Dataset” control.
Copy the code created by the control into the chunk below. Run glimpse() on the dataframe to verify that the process worked.
# Place your code here.
by_state_year_0306 <- read_delim("~/Downloads/Natality, 2003-2006.txt",
                                 delim = "\t", escape_double = FALSE,
                                 trim_ws = TRUE)
## Rows: 1672 Columns: 10
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: "\t"
## chr (7): Notes, State, State Code, Age of Mother 9, Age of Mother 9 Code, Fe...
## dbl (3): Year, Year Code, Births
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
glimpse(by_state_year_0306)
## Rows: 1,672
## Columns: 10
## Warning: One or more parsing issues, see `problems()` for details
## $ Notes <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ State <chr> "Alabama", "Alabama", "Alabama", "Alabama", "Al…
## $ `State Code` <chr> "01", "01", "01", "01", "01", "01", "01", "01",…
## $ `Age of Mother 9` <chr> "Under 15 years", "Under 15 years", "Under 15 y…
## $ `Age of Mother 9 Code` <chr> "15", "15", "15", "15", "15-19", "15-19", "15-1…
## $ Year <dbl> 2003, 2004, 2005, 2006, 2003, 2004, 2005, 2006,…
## $ `Year Code` <dbl> 2003, 2004, 2005, 2006, 2003, 2004, 2005, 2006,…
## $ Births <dbl> 172, 162, 150, 163, 8095, 8126, 7771, 8537, 186…
## $ `Female Population` <chr> "Not Available", "Not Available", "Not Availabl…
## $ `Fertility Rate` <chr> "Not Available", "Not Available", "Not Availabl…
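The warning above points to `problems()`. A minimal sketch of how the flagged rows could be inspected, assuming `by_state_year_0306` was read with `read_delim()` as shown earlier. The flagged rows may well be the explanatory notes at the end of the CDC Wonder export, but that is an assumption to check against the output.

```r
library(readr)

# Assumes by_state_year_0306 was created by read_delim() above.
# problems() returns a tibble describing each value readr could not
# parse as expected (row, column, expected type, actual value).
parse_issues <- problems(by_state_year_0306)
parse_issues
```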
Repeat the process above for the data from 2007-2020. The video refers to 2007-2018; select the years 2007 through 2020 instead.
by_state_year_0720 <- read_delim("~/Downloads/Natality, 2007-2020.txt",
                                 delim = "\t", escape_double = FALSE,
                                 trim_ws = TRUE)
## Rows: 5826 Columns: 10
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: "\t"
## chr (7): Notes, State, State Code, Age of Mother 9, Age of Mother 9 Code, Fe...
## dbl (3): Year, Year Code, Births
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Combine the two data frames into by_state_0320 using rbind().
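One way this step might look, assuming both data frames were imported as above and have identical column names and types (the column specifications printed earlier suggest they do):

```r
# Assumes by_state_year_0306 and by_state_year_0720 exist from the import steps.
# rbind() stacks the rows of the second data frame under those of the first.
by_state_0320 <- rbind(by_state_year_0306, by_state_year_0720)

# Quick check: the combined row count should be the sum of the two inputs.
nrow(by_state_0320)
```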
## Warning: One or more parsing issues, see `problems()` for details
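Edit the dataframe using dplyr. It should have the following variables: State, Year, Age (from Age of Mother 9 Code), Fpop (from Female Population), Births, and Rate (Fertility Rate divided by 1000).
Use summary() to check your work.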
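by_state_0320 <- by_state_0320 %>%
  # Drop the District of Columbia
  filter(State != "District of Columbia") %>%
  # Keep the needed columns, renaming the long ones
  select(State, Year, Age = `Age of Mother 9 Code`, Fpop = `Female Population`, Births, Rate = `Fertility Rate`) %>%
  # Text such as "Not Available" becomes NA here, which triggers the coercion warnings
  mutate(Fpop = as.numeric(Fpop),
         Births = as.numeric(Births),
         Rate = as.numeric(Rate) / 1000) %>%
  # Remove rows with any missing value
  drop_na()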
## Warning in mask$eval_all_mutate(quo): NAs introduced by coercion
## Warning in mask$eval_all_mutate(quo): NAs introduced by coercion
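The coercion warnings are expected here. A small illustration, using a made-up character vector rather than the real data, of what `as.numeric()` does with an entry like "Not Available":

```r
# Hypothetical values standing in for a column such as `Female Population`
x <- c("1234", "Not Available", "56.7")

# Text that is not a number becomes NA, with the warning
# "NAs introduced by coercion"
converted <- as.numeric(x)

converted
is.na(converted)
```

Because those entries become NA, the later `drop_na()` call removes the corresponding rows.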
Let’s look at the time-series of Rate for the state of Washington. Map color to Age. Use plotly.
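A possible sketch of this plot, assuming `by_state_0320` was built as in the previous step. The line geometry and the name `wa_plot` are choices made for illustration.

```r
library(tidyverse)
library(plotly)

# Assumes by_state_0320 exists with columns State, Year, Age, and Rate.
wa_plot <- by_state_0320 %>%
  filter(State == "Washington") %>%
  ggplot(aes(x = Year, y = Rate, color = Age)) +
  geom_line()

# Convert the ggplot object to an interactive plotly graphic
wa_plotly <- ggplotly(wa_plot)
wa_plotly
```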
Compare the states Connecticut, Washington, and Utah for birth rates in the 25-29 group. Map color to State and use plotly.
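A sketch of the comparison, again assuming `by_state_0320` from the earlier step. It also assumes the 25-29 group is coded as "25-29" in Age, following the pattern of the codes shown in the glimpse() output.

```r
library(tidyverse)
library(plotly)

# Assumes by_state_0320 exists and that Age uses the code "25-29".
compare_plot <- by_state_0320 %>%
  filter(State %in% c("Connecticut", "Washington", "Utah"),
         Age == "25-29") %>%
  ggplot(aes(x = Year, y = Rate, color = State)) +
  geom_line()

# Convert the ggplot object to an interactive plotly graphic
compare_plotly <- ggplotly(compare_plot)
compare_plotly
```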