Assignment

Choose any three of the “wide” datasets identified in the Week 6 Discussion items. (You may use your own dataset; please don’t use my Sample Post dataset, since that was used in your Week 6 assignment!)

For each of the three chosen datasets:

  1. Create a .CSV file (or optionally, a MySQL database!) that includes all of the information included in the dataset. You’re encouraged to use a “wide” structure similar to how the information appears in the discussion item, so that you can practice tidying and transformations as described below.

  2. Read the information from your .CSV file into R, and use tidyr and dplyr as needed to tidy and transform your data. [Most of your grade will be based on this step!]

  3. Perform the analysis requested in the discussion item.

  4. Your code should be in an R Markdown file, posted to rpubs.com, and should include narrative descriptions of your data cleanup work, analysis, and conclusions.

Solution

Data

Data Source

Michele Bradley posted the following data set about marriage rates from FiveThirtyEight’s GitHub: https://github.com/fivethirtyeight/data/blob/master/marriage/both_sexes.csv

Note: the values represent the share of the relevant population that has never been married.

Read CSV from URL

theURL <- "https://raw.githubusercontent.com/fivethirtyeight/data/master/marriage/both_sexes.csv"
marriage <- read.csv(theURL, header = TRUE, sep = ",")
head(marriage)
##   X year       date  all_2534   HS_2534   SC_2534  BAp_2534  BAo_2534
## 1 1 1960 1960-01-01 0.1233145 0.1095332 0.1522818 0.2389952 0.2389952
## 2 2 1970 1970-01-01 0.1269715 0.1094000 0.1495096 0.2187031 0.2187031
## 3 3 1980 1980-01-01 0.1991767 0.1617313 0.2236916 0.2881646 0.2881646
## 4 4 1990 1990-01-01 0.2968306 0.2777491 0.2780912 0.3612968 0.3656655
## 5 5 2000 2000-01-01 0.3450087 0.3316545 0.3249205 0.3874906 0.3939579
## 6 6 2001 2001-01-01 0.3527767 0.3446069 0.3341101 0.3835686 0.3925148
##     GD_2534 White_2534 Black_2534 Hisp_2534   NE_2534   MA_2534
## 1        NA  0.1164848  0.1621855 0.1393736 0.1504184 0.1628934
## 2        NA  0.1179043  0.1855163 0.1298769 0.1517231 0.1640680
## 3        NA  0.1824126  0.3137500 0.1885440 0.2414327 0.2505925
## 4 0.3474505  0.2639256  0.4838556 0.2962372 0.3500384 0.3623321
## 5 0.3691740  0.3127149  0.5144994 0.3180681 0.4091852 0.4175565
## 6 0.3590304  0.3183506  0.5437985 0.3321214 0.4200581 0.4294281
##   Midwest_2534 South_2534 Mountain_2534 Pacific_2534 poor_2534   mid_2534
## 1    0.1121467  0.1090562    0.09152117    0.1198758 0.1371597 0.07514929
## 2    0.1153741  0.1126220    0.10293602    0.1374964 0.1717202 0.08159207
## 3    0.1828339  0.1688435    0.17434230    0.2334279 0.3100591 0.14825303
## 4    0.2755046  0.2639794    0.25264326    0.3319579 0.4199108 0.24320008
## 5    0.3308022  0.3099712    0.30621032    0.3753061 0.5033676 0.30202036
## 6    0.3344332  0.3182688    0.30980779    0.3844799 0.5178771 0.31716118
##   rich_2534   all_3544    HS_3544    SC_3544  BAp_3544  BAo_3544   GD_3544
## 1 0.2066776 0.07058157 0.06860309 0.06663695 0.1326265 0.1326265        NA
## 2 0.1724093 0.06732520 0.06511964 0.06271724 0.1116899 0.1116899        NA
## 3 0.1851082 0.06883378 0.06429102 0.06531333 0.1056102 0.1056102        NA
## 4 0.2783226 0.11191800 0.11210043 0.09699372 0.1285172 0.1258567 0.1328018
## 5 0.2717386 0.15605881 0.16993703 0.13800404 0.1541238 0.1536299 0.1550970
## 6 0.2532041 0.15642529 0.16870156 0.13986044 0.1548151 0.1524923 0.1595169
##   White_3544 Black_3544  Hisp_3544    NE_3544    MA_3544 Midwest_3544
## 1 0.06825586 0.08836728 0.07307651 0.09194322 0.09347468   0.06863360
## 2 0.06250372 0.10290904 0.07070500 0.08570110 0.09040725   0.06156272
## 3 0.05966739 0.13140081 0.08110790 0.07997323 0.09744428   0.06070641
## 4 0.09611312 0.22010298 0.12194206 0.12785915 0.14354989   0.10157576
## 5 0.13207032 0.30239381 0.15469520 0.17327422 0.18819256   0.14539201
## 6 0.13287455 0.30857796 0.14953050 0.16653497 0.18315109   0.14794407
##   South_3544 Mountain_3544 Pacific_3544 poor_3544   mid_3544  rich_3544
## 1 0.06026353    0.04739747   0.05822486 0.1019749 0.04717272 0.08553870
## 2 0.05966057    0.04651163   0.06347796 0.1117548 0.04566838 0.06499159
## 3 0.05914089    0.04880077   0.07552538 0.1291426 0.05050321 0.04445951
## 4 0.09637035    0.09189904   0.13134638 0.2012208 0.09024739 0.06573916
## 5 0.14230600    0.13584194   0.17480047 0.2813137 0.12815751 0.08622046
## 6 0.14312592    0.13943820   0.17694864 0.2919112 0.13267625 0.06803283
##     all_4554    HS_4554    SC_4554   BAp_4554   BAo_4554    GD_4554
## 1 0.07254649 0.06840792 0.07903755 0.15360889 0.15360889         NA
## 2 0.05968794 0.05833439 0.05443478 0.10466047 0.10466047         NA
## 3 0.05250871 0.05036563 0.04816180 0.08623774 0.08623774         NA
## 4 0.05947824 0.05988244 0.04654087 0.07301884 0.06416529 0.08394886
## 5 0.08804394 0.09442809 0.07558786 0.09208417 0.09097472 0.09362802
## 6 0.08823342 0.09189007 0.07795481 0.09333365 0.09313480 0.09362876
##   White_4554 Black_4554  Hisp_4554    NE_4554    MA_4554 Midwest_4554
## 1 0.07246692 0.06913249 0.06636058 0.10236412 0.09264788   0.07285321
## 2 0.05754799 0.07899168 0.05810740 0.08028082 0.07860635   0.05791163
## 3 0.04765354 0.08624602 0.06522951 0.06930253 0.07508466   0.04807290
## 4 0.05092552 0.11617699 0.07613556 0.07047502 0.08373134   0.05398391
## 5 0.07578174 0.17587334 0.09418009 0.10232170 0.11269659   0.08302437
## 6 0.07516912 0.18154531 0.09409896 0.09868408 0.10953635   0.08207629
##   South_4554 Mountain_4554 Pacific_4554 poor_4554   mid_4554  rich_4554
## 1 0.05977295    0.04754183   0.05996993 0.1030055 0.05364421 0.07908591
## 2 0.05174462    0.03970134   0.04826312 0.1016489 0.04221637 0.05142867
## 3 0.04485348    0.03374438   0.04958992 0.1003011 0.03830266 0.03311296
## 4 0.05043636    0.04459411   0.06461875 0.1148335 0.04562332 0.03136386
## 5 0.07631858    0.07637774   0.09896832 0.1718976 0.07055672 0.03897342
## 6 0.07886513    0.07405971   0.10119511 0.1759369 0.07407508 0.02857320
##   nokids_all_2534 kids_all_2534 nokids_HS_2534 nokids_SC_2534
## 1       0.4640564   0.002820625      0.4430148      0.5000402
## 2       0.4309043   0.009868596      0.4246779      0.4333479
## 3       0.4464304   0.025285667      0.4319342      0.4505900
## 4       0.5425242   0.060277451      0.5464881      0.5238446
## 5       0.5714531   0.099472713      0.5711395      0.5700042
## 6       0.5852213   0.110178467      0.6045475      0.5810912
##   nokids_BAp_2534 nokids_BAo_2534 nokids_GD_2534 kids_HS_2534 kids_SC_2534
## 1       0.5619099       0.5619099             NA  0.003318886  0.001150824
## 2       0.4554766       0.4554766             NA  0.012465915  0.003699982
## 3       0.4719700       0.4719700             NA  0.031930752  0.018135401
## 4       0.5560765       0.5633301      0.5332628  0.078470444  0.052032702
## 5       0.5729677       0.5862213      0.5367160  0.127193577  0.097625310
## 6       0.5698644       0.5864967      0.5258800  0.141395652  0.110030662
##   kids_BAp_2534 kids_BAo_2534 kids_GD_2534 nokids_poor_2534
## 1  0.0005751073  0.0005751073           NA        0.4933061
## 2  0.0014683425  0.0014683425           NA        0.5097742
## 3  0.0062544364  0.0062544364           NA        0.5740402
## 4  0.0171241042  0.0181766027   0.01374234        0.6546908
## 5  0.0370024452  0.0401009875   0.02761467        0.7055451
## 6  0.0399801447  0.0445838012   0.02645041        0.7147334
##   nokids_mid_2534 nokids_rich_2534 kids_poor_2534 kids_mid_2534
## 1       0.4100080        0.4921184    0.008722711  0.0007532065
## 2       0.3764538        0.4288948    0.029974945  0.0033771145
## 3       0.3998250        0.3848089    0.077926214  0.0102368871
## 4       0.5186604        0.4750156    0.170763774  0.0274655254
## 5       0.5690228        0.4458023    0.256281918  0.0597845173
## 6       0.5864741        0.4461111    0.280146488  0.0677954572
##   kids_rich_2534
## 1   0.0008027331
## 2   0.0030435661
## 3   0.0068317224
## 4   0.0182329127
## 5   0.0295644698
## 6   0.0336540502

Write CSV file

write.csv(marriage, file = "marriage.csv", row.names = FALSE)
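
The assignment asks for the data to be read from the .CSV file, so a quick round trip can confirm that the local copy matches what was downloaded. The following is a minimal sketch under stated assumptions: `marriage_demo`, `csv_path`, and `marriage_local` are hypothetical names, the rows are a small subset copied from the output above, and a temporary file stands in for "marriage.csv".

```r
# Small stand-in for the marriage data (values copied from the head() output above).
marriage_demo <- data.frame(
  year     = c(1960L, 1970L, 1980L, 1990L),
  all_2534 = c(0.1233145, 0.1269715, 0.1991767, 0.2968306),
  GD_2534  = c(NA, NA, NA, 0.3474505)
)

# Write to a temporary path so the sketch does not touch the working directory.
csv_path <- tempfile(fileext = ".csv")
write.csv(marriage_demo, file = csv_path, row.names = FALSE)

# Read the local copy back; missing values are written as NA and parsed back as NA.
marriage_local <- read.csv(csv_path, header = TRUE)

# Compare the round-tripped data with the original.
all.equal(marriage_demo, marriage_local)
```

With the real data, the same pattern would read "marriage.csv" back in place of the temporary file.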

Tidy Data

library(dplyr)
## Warning: package 'dplyr' was built under R version 3.4.2
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(tidyr)
1. Subset demographic data to include marriage rate data for ages 25-34:
marriage2534 <- marriage[c(2,4:21)]
head(marriage2534)
##   year  all_2534   HS_2534   SC_2534  BAp_2534  BAo_2534   GD_2534
## 1 1960 0.1233145 0.1095332 0.1522818 0.2389952 0.2389952        NA
## 2 1970 0.1269715 0.1094000 0.1495096 0.2187031 0.2187031        NA
## 3 1980 0.1991767 0.1617313 0.2236916 0.2881646 0.2881646        NA
## 4 1990 0.2968306 0.2777491 0.2780912 0.3612968 0.3656655 0.3474505
## 5 2000 0.3450087 0.3316545 0.3249205 0.3874906 0.3939579 0.3691740
## 6 2001 0.3527767 0.3446069 0.3341101 0.3835686 0.3925148 0.3590304
##   White_2534 Black_2534 Hisp_2534   NE_2534   MA_2534 Midwest_2534
## 1  0.1164848  0.1621855 0.1393736 0.1504184 0.1628934    0.1121467
## 2  0.1179043  0.1855163 0.1298769 0.1517231 0.1640680    0.1153741
## 3  0.1824126  0.3137500 0.1885440 0.2414327 0.2505925    0.1828339
## 4  0.2639256  0.4838556 0.2962372 0.3500384 0.3623321    0.2755046
## 5  0.3127149  0.5144994 0.3180681 0.4091852 0.4175565    0.3308022
## 6  0.3183506  0.5437985 0.3321214 0.4200581 0.4294281    0.3344332
##   South_2534 Mountain_2534 Pacific_2534 poor_2534   mid_2534 rich_2534
## 1  0.1090562    0.09152117    0.1198758 0.1371597 0.07514929 0.2066776
## 2  0.1126220    0.10293602    0.1374964 0.1717202 0.08159207 0.1724093
## 3  0.1688435    0.17434230    0.2334279 0.3100591 0.14825303 0.1851082
## 4  0.2639794    0.25264326    0.3319579 0.4199108 0.24320008 0.2783226
## 5  0.3099712    0.30621032    0.3753061 0.5033676 0.30202036 0.2717386
## 6  0.3182688    0.30980779    0.3844799 0.5178771 0.31716118 0.2532041
colnames(marriage2534)
##  [1] "year"          "all_2534"      "HS_2534"       "SC_2534"      
##  [5] "BAp_2534"      "BAo_2534"      "GD_2534"       "White_2534"   
##  [9] "Black_2534"    "Hisp_2534"     "NE_2534"       "MA_2534"      
## [13] "Midwest_2534"  "South_2534"    "Mountain_2534" "Pacific_2534" 
## [17] "poor_2534"     "mid_2534"      "rich_2534"
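
Selecting by position (`c(2, 4:21)`) works, but it silently breaks if the column order ever changes. A name-based selection is one alternative. This is a sketch on a toy data frame, not the original author's method: `demo` and `demo_2534` are hypothetical names and the values are rounded from the output above. Note that the `kids_` and `nokids_` columns also end in `_2534`, so they are dropped explicitly.

```r
library(dplyr)

# Toy frame that mimics the naming pattern of the real data.
demo <- data.frame(
  X = 1:2,
  year = c(1960L, 1970L),
  date = c("1960-01-01", "1970-01-01"),
  all_2534 = c(0.12, 0.13),
  HS_2534 = c(0.11, 0.11),
  all_3544 = c(0.07, 0.07),
  nokids_all_2534 = c(0.46, 0.43),
  kids_all_2534 = c(0.003, 0.010)
)

# Keep year plus the 25-34 columns, then drop the kids/nokids breakdowns.
demo_2534 <- demo %>%
  select(year, ends_with("_2534")) %>%
  select(-starts_with("kids_"), -starts_with("nokids_"))

colnames(demo_2534)
```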
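2. Change column names to be more readable. (In the output above, BAp_2534 equals BAo_2534 wherever GD_2534 is NA and otherwise falls between BAo_2534 and GD_2534, which suggests that BAp covers a bachelor's degree or higher and BAo a bachelor's degree only; the labels below are approximate.)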
colnames(marriage2534)<- c("Year", "All", "High School Graduate", "Some College", "Bachelor's Degree", "Bachelor's and Some Graduate", "Graduate Degree", "White", "Black", "Hispanic", "New England", "Mid-Atlantic", "Midwest", "South", "Mountain West", "Pacific", "Low-Income", "Middle-Income", "Upper-Income")
colnames(marriage2534)
##  [1] "Year"                         "All"                         
##  [3] "High School Graduate"         "Some College"                
##  [5] "Bachelor's Degree"            "Bachelor's and Some Graduate"
##  [7] "Graduate Degree"              "White"                       
##  [9] "Black"                        "Hispanic"                    
## [11] "New England"                  "Mid-Atlantic"                
## [13] "Midwest"                      "South"                       
## [15] "Mountain West"                "Pacific"                     
## [17] "Low-Income"                   "Middle-Income"               
## [19] "Upper-Income"
head(marriage2534)
##   Year       All High School Graduate Some College Bachelor's Degree
## 1 1960 0.1233145            0.1095332    0.1522818         0.2389952
## 2 1970 0.1269715            0.1094000    0.1495096         0.2187031
## 3 1980 0.1991767            0.1617313    0.2236916         0.2881646
## 4 1990 0.2968306            0.2777491    0.2780912         0.3612968
## 5 2000 0.3450087            0.3316545    0.3249205         0.3874906
## 6 2001 0.3527767            0.3446069    0.3341101         0.3835686
##   Bachelor's and Some Graduate Graduate Degree     White     Black
## 1                    0.2389952              NA 0.1164848 0.1621855
## 2                    0.2187031              NA 0.1179043 0.1855163
## 3                    0.2881646              NA 0.1824126 0.3137500
## 4                    0.3656655       0.3474505 0.2639256 0.4838556
## 5                    0.3939579       0.3691740 0.3127149 0.5144994
## 6                    0.3925148       0.3590304 0.3183506 0.5437985
##    Hispanic New England Mid-Atlantic   Midwest     South Mountain West
## 1 0.1393736   0.1504184    0.1628934 0.1121467 0.1090562    0.09152117
## 2 0.1298769   0.1517231    0.1640680 0.1153741 0.1126220    0.10293602
## 3 0.1885440   0.2414327    0.2505925 0.1828339 0.1688435    0.17434230
## 4 0.2962372   0.3500384    0.3623321 0.2755046 0.2639794    0.25264326
## 5 0.3180681   0.4091852    0.4175565 0.3308022 0.3099712    0.30621032
## 6 0.3321214   0.4200581    0.4294281 0.3344332 0.3182688    0.30980779
##     Pacific Low-Income Middle-Income Upper-Income
## 1 0.1198758  0.1371597    0.07514929    0.2066776
## 2 0.1374964  0.1717202    0.08159207    0.1724093
## 3 0.2334279  0.3100591    0.14825303    0.1851082
## 4 0.3319579  0.4199108    0.24320008    0.2783226
## 5 0.3753061  0.5033676    0.30202036    0.2717386
## 6 0.3844799  0.5178771    0.31716118    0.2532041
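
Assigning a vector of 19 names by position relies on the labels being typed in exactly the same order as the columns. One way to guard against a mismatch is a named lookup vector keyed by the original column names. This is a sketch with a hypothetical `labels` vector covering only four of the columns.

```r
# Hypothetical lookup: original column name -> readable label.
labels <- c(
  year     = "Year",
  all_2534 = "All",
  HS_2534  = "High School Graduate",
  SC_2534  = "Some College"
)

# One-row toy frame using values from the output above.
demo <- data.frame(
  year = 1960L,
  all_2534 = 0.1233145,
  HS_2534 = 0.1095332,
  SC_2534 = 0.1522818
)

# Fail early if any column has no label, then rename by name rather than position.
stopifnot(all(colnames(demo) %in% names(labels)))
colnames(demo) <- unname(labels[colnames(demo)])

colnames(demo)
```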
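3. Reshape the data from wide to long with the tidyr ‘gather’ function, so that each demographic column becomes a set of rows (one observation per year and demographic):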
ncol(marriage2534)
## [1] 19
marriage2534 <- gather(marriage2534, "Demographics", "n", 2:19)
head(marriage2534, 30)
##    Year         Demographics         n
## 1  1960                  All 0.1233145
## 2  1970                  All 0.1269715
## 3  1980                  All 0.1991767
## 4  1990                  All 0.2968306
## 5  2000                  All 0.3450087
## 6  2001                  All 0.3527767
## 7  2002                  All 0.3535249
## 8  2003                  All 0.3620345
## 9  2004                  All 0.3673247
## 10 2005                  All 0.3793451
## 11 2006                  All 0.4147656
## 12 2007                  All 0.4269222
## 13 2008                  All 0.4394414
## 14 2009                  All 0.4625638
## 15 2010                  All 0.4697332
## 16 2011                  All 0.4833335
## 17 2012                  All 0.4943453
## 18 1960 High School Graduate 0.1095332
## 19 1970 High School Graduate 0.1094000
## 20 1980 High School Graduate 0.1617313
## 21 1990 High School Graduate 0.2777491
## 22 2000 High School Graduate 0.3316545
## 23 2001 High School Graduate 0.3446069
## 24 2002 High School Graduate 0.3490367
## 25 2003 High School Graduate 0.3581877
## 26 2004 High School Graduate 0.3708102
## 27 2005 High School Graduate 0.3870680
## 28 2006 High School Graduate 0.4312162
## 29 2007 High School Graduate 0.4441386
## 30 2008 High School Graduate 0.4599162
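
In tidyr 1.0.0 and later, `pivot_longer()` is the recommended replacement for `gather()`. The sketch below shows the equivalent reshape on a two-row toy frame; `wide` and `long` are hypothetical names and the values are copied from the output above. One difference to expect: `pivot_longer()` returns rows ordered by year first, whereas `gather()` stacks one whole column after another.

```r
library(tidyr)

# Toy wide data; check.names = FALSE keeps the space-separated column name intact.
wide <- data.frame(
  Year = c(1960L, 1970L),
  All = c(0.1233145, 0.1269715),
  `High School Graduate` = c(0.1095332, 0.1094000),
  check.names = FALSE
)

# Every column except Year becomes a (Demographics, n) pair.
long <- pivot_longer(
  wide,
  cols = -Year,
  names_to = "Demographics",
  values_to = "n"
)

long
```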
4. Add a column that is the complement of “n” (1 - n) and name it “Marriage_Rate”:
marriage_tidy <- marriage2534 %>% 
  mutate(Marriage_Rate = 1 - n)
5. Rename “n” to “Single_Rate”:

The figures in “n” represent the share of the relevant population that has never been married, so “Single_Rate” seems a better descriptor.

marriage_tidy <- dplyr::rename(marriage_tidy, Single_Rate = n)
head(marriage_tidy, 20)
##    Year         Demographics Single_Rate Marriage_Rate
## 1  1960                  All   0.1233145     0.8766855
## 2  1970                  All   0.1269715     0.8730285
## 3  1980                  All   0.1991767     0.8008233
## 4  1990                  All   0.2968306     0.7031694
## 5  2000                  All   0.3450087     0.6549913
## 6  2001                  All   0.3527767     0.6472233
## 7  2002                  All   0.3535249     0.6464751
## 8  2003                  All   0.3620345     0.6379655
## 9  2004                  All   0.3673247     0.6326753
## 10 2005                  All   0.3793451     0.6206549
## 11 2006                  All   0.4147656     0.5852344
## 12 2007                  All   0.4269222     0.5730778
## 13 2008                  All   0.4394414     0.5605586
## 14 2009                  All   0.4625638     0.5374362
## 15 2010                  All   0.4697332     0.5302668
## 16 2011                  All   0.4833335     0.5166665
## 17 2012                  All   0.4943453     0.5056547
## 18 1960 High School Graduate   0.1095332     0.8904668
## 19 1970 High School Graduate   0.1094000     0.8906000
## 20 1980 High School Graduate   0.1617313     0.8382687
6. Make sure “Marriage_Rate” and “Single_Rate” are numeric (both are already numeric at this point, so this conversion is only a safeguard):
marriage_tidy$Marriage_Rate <- as.numeric(as.character(marriage_tidy$Marriage_Rate))
marriage_tidy$Single_Rate <- as.numeric(as.character(marriage_tidy$Single_Rate))
head(marriage_tidy, 20)
##    Year         Demographics Single_Rate Marriage_Rate
## 1  1960                  All   0.1233145     0.8766855
## 2  1970                  All   0.1269715     0.8730285
## 3  1980                  All   0.1991767     0.8008233
## 4  1990                  All   0.2968306     0.7031694
## 5  2000                  All   0.3450087     0.6549913
## 6  2001                  All   0.3527767     0.6472233
## 7  2002                  All   0.3535249     0.6464751
## 8  2003                  All   0.3620345     0.6379655
## 9  2004                  All   0.3673247     0.6326753
## 10 2005                  All   0.3793451     0.6206549
## 11 2006                  All   0.4147656     0.5852344
## 12 2007                  All   0.4269222     0.5730778
## 13 2008                  All   0.4394414     0.5605586
## 14 2009                  All   0.4625638     0.5374362
## 15 2010                  All   0.4697332     0.5302668
## 16 2011                  All   0.4833335     0.5166665
## 17 2012                  All   0.4943453     0.5056547
## 18 1960 High School Graduate   0.1095332     0.8904668
## 19 1970 High School Graduate   0.1094000     0.8906000
## 20 1980 High School Graduate   0.1617313     0.8382687
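7. Round “Single_Rate” and “Marriage_Rate” to two decimal places. Note that format() returns character strings, which is why both columns appear as <chr> below and are converted back to numeric before plotting: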
marriage_tidy[, "Marriage_Rate" ] = format(round(marriage_tidy[, "Marriage_Rate" ], 2), nsmall = 2)
marriage_tidy[, "Single_Rate" ] = format(round(marriage_tidy[, "Single_Rate" ], 2), nsmall = 2)
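
The difference between `round()` and `format()` is easy to miss, so here is a small base R sketch using three values from the “All” column above (the variable names are hypothetical). `round()` alone keeps the column numeric; wrapping it in `format()` pads to two decimals for display but turns the values into text.

```r
rates <- c(0.1991767, 0.2968306, 0.3450087)

rounded   <- round(rates, 2)                       # stays numeric
formatted <- format(round(rates, 2), nsmall = 2)   # becomes character

class(rounded)
class(formatted)
formatted
```

If the rounded values are needed for plotting or arithmetic, keeping the `round()` result and formatting only at display time avoids the later `as.numeric(as.character(...))` conversions.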
8. Convert to a tibble (local data frame):
marriage_tidy <- as_tibble(marriage_tidy)  # tbl_df() is deprecated in current versions of dplyr
marriage_tidy
## # A tibble: 306 x 4
##     Year Demographics Single_Rate Marriage_Rate
##    <int>        <chr>       <chr>         <chr>
##  1  1960          All        0.12          0.88
##  2  1970          All        0.13          0.87
##  3  1980          All        0.20          0.80
##  4  1990          All        0.30          0.70
##  5  2000          All        0.35          0.65
##  6  2001          All        0.35          0.65
##  7  2002          All        0.35          0.65
##  8  2003          All        0.36          0.64
##  9  2004          All        0.37          0.63
## 10  2005          All        0.38          0.62
## # ... with 296 more rows

Pretty Data Table

library(DT)
datatable(marriage_tidy)

Reference: http://rpubs.com/jillenergy/313578

Analysis

This analysis explores the declining marriage rates for people ages 25-34 in the US from 2000 to 2012, broken down by race, education, income level, and region.

library(ggplot2)

All People Ages 25-34 from 2000 - 2012

All <- filter(marriage_tidy, Demographics == "All", Year >= 2000)
All
## # A tibble: 13 x 4
##     Year Demographics Single_Rate Marriage_Rate
##    <int>        <chr>       <chr>         <chr>
##  1  2000          All        0.35          0.65
##  2  2001          All        0.35          0.65
##  3  2002          All        0.35          0.65
##  4  2003          All        0.36          0.64
##  5  2004          All        0.37          0.63
##  6  2005          All        0.38          0.62
##  7  2006          All        0.41          0.59
##  8  2007          All        0.43          0.57
##  9  2008          All        0.44          0.56
## 10  2009          All        0.46          0.54
## 11  2010          All        0.47          0.53
## 12  2011          All        0.48          0.52
## 13  2012          All        0.49          0.51
All$Marriage_Rate <- as.numeric(as.character(All$Marriage_Rate))
ggplot(All, aes(Year, Marriage_Rate, group=1)) +
  geom_line() +
  geom_point() +
  expand_limits(y=.5) +
  scale_x_continuous(limits = c(2000, 2013)) +
  theme_linedraw() +
  ggtitle("Declining Marriage Rates in All People Ages 25-34") +
  ylab("Marriage Rate") +
  theme(plot.title = element_text(lineheight = .8, face = "bold"))

People Ages 25-34 by Race from 2000 - 2012

Race <- filter(marriage_tidy, Demographics %in% c("White", "Black", "Hispanic"), Year >= 2000)
Race
## # A tibble: 39 x 4
##     Year Demographics Single_Rate Marriage_Rate
##    <int>        <chr>       <chr>         <chr>
##  1  2000        White        0.31          0.69
##  2  2001        White        0.32          0.68
##  3  2002        White        0.32          0.68
##  4  2003        White        0.33          0.67
##  5  2004        White        0.33          0.67
##  6  2005        White        0.34          0.66
##  7  2006        White        0.38          0.62
##  8  2007        White        0.39          0.61
##  9  2008        White        0.40          0.60
## 10  2009        White        0.42          0.58
## # ... with 29 more rows
Race$Marriage_Rate <- as.numeric(as.character(Race$Marriage_Rate))
ggplot(Race, aes(x = Year, y = Marriage_Rate, group = Demographics, colour = Demographics)) +
  geom_line() +
  geom_point() +
  scale_y_continuous() +
  scale_x_continuous(limits = c(2000, 2013)) +
  theme_linedraw() +
  ggtitle("Declining Marriage Rates by Race in People Ages 25-34") +
  ylab("Marriage Rate") +
  theme(plot.title = element_text(lineheight = .8, face = "bold"))

People Ages 25-34 by Education Level from 2000 - 2012

Education <- filter(marriage_tidy, Demographics %in% c("Graduate Degree", "Bachelor's and Some Graduate", "Bachelor's Degree", "Some College", "High School Graduate"), Year >= 2000)
Education
## # A tibble: 65 x 4
##     Year         Demographics Single_Rate Marriage_Rate
##    <int>                <chr>       <chr>         <chr>
##  1  2000 High School Graduate        0.33          0.67
##  2  2001 High School Graduate        0.34          0.66
##  3  2002 High School Graduate        0.35          0.65
##  4  2003 High School Graduate        0.36          0.64
##  5  2004 High School Graduate        0.37          0.63
##  6  2005 High School Graduate        0.39          0.61
##  7  2006 High School Graduate        0.43          0.57
##  8  2007 High School Graduate        0.44          0.56
##  9  2008 High School Graduate        0.46          0.54
## 10  2009 High School Graduate        0.48          0.52
## # ... with 55 more rows
Education$Marriage_Rate <- as.numeric(as.character(Education$Marriage_Rate))
ggplot(Education, aes(x = Year, y = Marriage_Rate, group = Demographics, colour = Demographics)) +
  geom_line() +
  geom_point() +
  scale_y_continuous() +
  scale_x_continuous(limits = c(2000, 2013)) + 
  ggtitle("Declining Marriage Rates by Education Level in People Ages 25-34") +
  theme_classic() +
  ylab("Marriage Rate") +
  theme(plot.title = element_text(lineheight = .8, face = "bold"))

People Ages 25-34 by Income Level from 2000 - 2012

Income <- filter(marriage_tidy, Demographics %in% c("Low-Income", "Middle-Income", "Upper-Income"), Year >= 2000)
Income
## # A tibble: 39 x 4
##     Year Demographics Single_Rate Marriage_Rate
##    <int>        <chr>       <chr>         <chr>
##  1  2000   Low-Income        0.50          0.50
##  2  2001   Low-Income        0.52          0.48
##  3  2002   Low-Income        0.52          0.48
##  4  2003   Low-Income        0.53          0.47
##  5  2004   Low-Income        0.54          0.46
##  6  2005   Low-Income        0.55          0.45
##  7  2006   Low-Income        0.57          0.43
##  8  2007   Low-Income        0.59          0.41
##  9  2008   Low-Income        0.61          0.39
## 10  2009   Low-Income        0.62          0.38
## # ... with 29 more rows
Income$Marriage_Rate <- as.numeric(as.character(Income$Marriage_Rate))
ggplot(Income, aes(x = Year, y = Marriage_Rate, group = Demographics, colour = Demographics)) +
  geom_line() +
  geom_point() +
  scale_y_continuous() +
  scale_x_continuous(limits = c(2000, 2013)) + 
  ggtitle("Declining Marriage Rates by Income Level in People Ages 25-34") +
  theme_classic() +
  ylab("Marriage Rate") +
  theme(plot.title = element_text(lineheight = .8, face = "bold"))

People Ages 25-34 by Region from 2000 - 2012

Region <- filter(marriage_tidy, Demographics %in% c("New England", "Mid-Atlantic", "Midwest", "South", "Mountain West", "Pacific"), Year >= 2000)
Region
## # A tibble: 78 x 4
##     Year Demographics Single_Rate Marriage_Rate
##    <int>        <chr>       <chr>         <chr>
##  1  2000  New England        0.41          0.59
##  2  2001  New England        0.42          0.58
##  3  2002  New England        0.41          0.59
##  4  2003  New England        0.43          0.57
##  5  2004  New England        0.45          0.55
##  6  2005  New England        0.45          0.55
##  7  2006  New England        0.48          0.52
##  8  2007  New England        0.49          0.51
##  9  2008  New England        0.51          0.49
## 10  2009  New England        0.53          0.47
## # ... with 68 more rows
Region$Marriage_Rate <- as.numeric(as.character(Region$Marriage_Rate))
ggplot(Region, aes(x = Year, y = Marriage_Rate, group = Demographics, colour = Demographics)) +
  geom_line() +
  geom_point() +
  scale_y_continuous() +
  scale_x_continuous(limits = c(2000, 2013)) + 
  ggtitle("Declining Marriage Rates by Region in People Ages 25-34") +
  theme_classic() +
  ylab("Marriage Rate") +
  theme(plot.title = element_text(lineheight = .8, face = "bold"))
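
The four breakdown plots repeat the same ggplot code with a different filter each time. As an alternative, a category column plus `facet_wrap()` can draw all breakdowns in one call. This is a sketch only: `category_of` is a hypothetical lookup that covers just six of the labels, and the toy rates are the rounded values shown in the tables above.

```r
library(ggplot2)

# Hypothetical lookup from each Demographics label to its category.
category_of <- c(
  "White" = "Race", "Black" = "Race", "Hispanic" = "Race",
  "Low-Income" = "Income", "Middle-Income" = "Income", "Upper-Income" = "Income"
)

# Toy long-format data (2000 and 2001 for two groups).
demo <- data.frame(
  Year = rep(c(2000L, 2001L), times = 2),
  Demographics = rep(c("White", "Low-Income"), each = 2),
  Marriage_Rate = c(0.69, 0.68, 0.50, 0.48),
  stringsAsFactors = FALSE
)

# Look up the category for every row by label.
demo$Category <- unname(category_of[demo$Demographics])

# One panel per category; print(p) renders the plot.
p <- ggplot(demo, aes(x = Year, y = Marriage_Rate, colour = Demographics)) +
  geom_line() +
  geom_point() +
  facet_wrap(~ Category) +
  ylab("Marriage Rate") +
  theme_classic()
```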

Conclusion

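The marriage rate for people ages 25-34 (measured here as 1 minus the share who have never been married) has been steadily declining since 2000 regardless of race, region, income level, and education; for all people in this age group it fell from 0.65 in 2000 to 0.51 in 2012. The steepest decline looks to have been from 2005 to 2007 (0.62 to 0.57 overall), when the economy was still booming.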
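
The "steepest decline" claim can be checked numerically rather than by eye. The sketch below uses the unrounded “All” values for 2000-2012 from the gathered output above; `change` and `steepest` are hypothetical names. It computes the year-over-year change in the marriage rate and picks the most negative one.

```r
years <- 2000:2012

# Never-married share for "All", ages 25-34, copied from the gathered output above.
single_rate <- c(0.3450087, 0.3527767, 0.3535249, 0.3620345, 0.3673247,
                 0.3793451, 0.4147656, 0.4269222, 0.4394414, 0.4625638,
                 0.4697332, 0.4833335, 0.4943453)
marriage_rate <- 1 - single_rate

# Year-over-year change in the marriage rate.
change <- data.frame(
  from  = head(years, -1),
  to    = tail(years, -1),
  delta = diff(marriage_rate)
)

# The most negative change is the steepest single-year decline.
steepest <- change[which.min(change$delta), ]
steepest
```

On these values the largest single-year drop is from 2005 to 2006, which falls inside the 2005-2007 window noted above.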