Week 1 Assignment on RPubs

Rmd on GitHub

Introduction

The article I selected is “Marriage Isn’t Dead — Yet” by Ben Casselman. The article reports on the decline of marriage in America, using marriage rates broken down by variables such as gender, education, children, and race.

require(plyr)
## Loading required package: plyr
# Reads the data from GitHub
marriageData <- read.csv("https://raw.githubusercontent.com/logicalschema/DATA607/master/week1/data/both_sexes.csv")

head(marriageData)


Variable Names

The data set has 75 variables. Their names (the column names of the data set) are:

names(marriageData)
##  [1] "X"                "year"             "date"             "all_2534"        
##  [5] "HS_2534"          "SC_2534"          "BAp_2534"         "BAo_2534"        
##  [9] "GD_2534"          "White_2534"       "Black_2534"       "Hisp_2534"       
## [13] "NE_2534"          "MA_2534"          "Midwest_2534"     "South_2534"      
## [17] "Mountain_2534"    "Pacific_2534"     "poor_2534"        "mid_2534"        
## [21] "rich_2534"        "all_3544"         "HS_3544"          "SC_3544"         
## [25] "BAp_3544"         "BAo_3544"         "GD_3544"          "White_3544"      
## [29] "Black_3544"       "Hisp_3544"        "NE_3544"          "MA_3544"         
## [33] "Midwest_3544"     "South_3544"       "Mountain_3544"    "Pacific_3544"    
## [37] "poor_3544"        "mid_3544"         "rich_3544"        "all_4554"        
## [41] "HS_4554"          "SC_4554"          "BAp_4554"         "BAo_4554"        
## [45] "GD_4554"          "White_4554"       "Black_4554"       "Hisp_4554"       
## [49] "NE_4554"          "MA_4554"          "Midwest_4554"     "South_4554"      
## [53] "Mountain_4554"    "Pacific_4554"     "poor_4554"        "mid_4554"        
## [57] "rich_4554"        "nokids_all_2534"  "kids_all_2534"    "nokids_HS_2534"  
## [61] "nokids_SC_2534"   "nokids_BAp_2534"  "nokids_BAo_2534"  "nokids_GD_2534"  
## [65] "kids_HS_2534"     "kids_SC_2534"     "kids_BAp_2534"    "kids_BAo_2534"   
## [69] "kids_GD_2534"     "nokids_poor_2534" "nokids_mid_2534"  "nokids_rich_2534"
## [73] "kids_poor_2534"   "kids_mid_2534"    "kids_rich_2534"

Note: “Variable names are as follows. Number in variable names are age ranges, so all_2534 is the marriage rate for everyone ages 25 to 34.”
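
The naming convention in the note can be illustrated with a small sketch. It assumes every name ends in an underscore followed by a four-digit age code, as the names listed above do; `parse_var` is a hypothetical helper written for this illustration, not part of the data set or of any package.

```r
# Hypothetical helper: split a column name such as "nokids_HS_2534"
# into its group label and its age range (assumes a trailing "_NNNN" code)
parse_var <- function(name) {
  ages  <- sub("^.*_([0-9]{4})$", "\\1", name)  # trailing four digits
  group <- sub("_[0-9]{4}$", "", name)          # everything before them
  data.frame(
    variable = name,
    group    = group,
    age_min  = as.integer(substr(ages, 1, 2)),
    age_max  = as.integer(substr(ages, 3, 4)),
    stringsAsFactors = FALSE
  )
}

parsed <- parse_var(c("all_2534", "HS_3544", "nokids_poor_2534"))
print(parsed)
```

Names without the age suffix, such as "year" or "date", would not be parsed meaningfully by this helper and should be excluded first.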


Subset of Data

I do not need all of the columns, so I will create a subset of the data and give the selected columns more readable names. I am interested in examining the rates of marriage in each of the U.S. regions (New England, Mid-Atlantic, Midwest, South, Mountain, and Pacific) for the age range of 25 to 34. In addition, the values represent the “share of the relevant population that has never been married”, so we have to reverse this by subtracting each value from 1 to get the share that has been married.

#Creates a subset of the data
subset_marriageData <- subset(marriageData, select=c(year, NE_2534, MA_2534, Midwest_2534, South_2534, Mountain_2534, Pacific_2534))

#Renames the columns
subset_marriageData <- rename(subset_marriageData, c("year" = "Year", "NE_2534" = "New_England", "MA_2534" = "Mid-Atlantic", "Midwest_2534" = "Midwest", "South_2534" = "South", "Mountain_2534" = "Mountain_West", "Pacific_2534" = "Pacific" ))


# Converts never-married shares to ever-married shares (1 - share), leaving Year untouched
subset_marriageData[-1] <- 1 - subset_marriageData[-1]

head(subset_marriageData)
subset_marriageData[-1]
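
As a quick check of the conversion from never-married to ever-married shares, here is a minimal sketch on a toy data frame. The numbers are invented for illustration and are not taken from the data set.

```r
# Toy never-married shares, invented for illustration only
never_married <- data.frame(
  Year        = c(1960, 2012),
  New_England = c(0.15, 0.55),
  South       = c(0.10, 0.45)
)

# Ever-married share = 1 - never-married share;
# the negative index skips column 1 (Year) so it is left unchanged
ever_married <- never_married
ever_married[-1] <- 1 - never_married[-1]

print(ever_married)
```

Indexing with `[-1]` keeps the result a data frame, so the assignment replaces every column except Year in one step.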


Plotting the graphs

The graph shows, by region, the share of people in the 25-34 age bracket who have ever been married.

require(ggplot2)
## Loading required package: ggplot2
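# Draws the first series; ylim spans all regions so that no later line is clipped
plot(subset_marriageData[,c("Year", "New_England")], type = "l", col = "red", ylim = range(subset_marriageData[-1], na.rm = TRUE), ylab = "Share ever married")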
lines(subset_marriageData[,c("Year", "Mid-Atlantic")], col="orange")
lines(subset_marriageData[,c("Year", "Midwest")], col="yellow")
lines(subset_marriageData[,c("Year", "South")], col="green")
lines(subset_marriageData[,c("Year", "Mountain_West")], col="blue")
lines(subset_marriageData[,c("Year", "Pacific")], col="violet")

legend("topleft",
c("New_England","Mid-Atlantic", "Midwest", "South", "Mountain_West", "Pacific" ),
fill=c("red", "orange", "yellow", "green", "blue", "violet"))
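
One possible alternative to adding each region with a separate lines() call is base R's matplot(), which draws every column in a single call on a shared y-axis. The sketch below uses a toy data frame with invented values and hypothetical region names; it is not the plot produced above.

```r
# Toy ever-married shares, invented for illustration only
toy <- data.frame(
  Year     = c(1960, 1980, 2000, 2012),
  Region_A = c(0.88, 0.75, 0.60, 0.48),
  Region_B = c(0.92, 0.80, 0.66, 0.55)
)

series_cols <- c("red", "blue")
y_range <- range(toy[-1])  # common y-axis limits across all series

# One line per column of the matrix, all on the same axes
matplot(toy$Year, as.matrix(toy[-1]), type = "l", lty = 1, col = series_cols,
        ylim = y_range, xlab = "Year", ylab = "Share ever married")
legend("topright", legend = names(toy)[-1], col = series_cols, lty = 1)
```

Because the y-axis limits come from all series at once, no line can fall outside the plotting region.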


Conclusions

There has been a decline in marriage for the 25-34 age bracket across the country. Possible next steps would be to obtain data from the decades before 1960 to see whether the decline began earlier, or to compare these rates with those of other countries.
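
The size of the decline could be quantified as the change in the ever-married share between the first and last year in the data. The following is a minimal sketch under that assumption, again on a toy data frame with invented values and hypothetical region names.

```r
# Toy ever-married shares, invented for illustration only
toy <- data.frame(
  Year     = c(1960, 1980, 2000, 2012),
  Region_A = c(0.88, 0.75, 0.60, 0.48),
  Region_B = c(0.92, 0.80, 0.66, 0.55)
)

# Rows for the earliest and latest year, without the Year column
first <- toy[which.min(toy$Year), -1]
last  <- toy[which.max(toy$Year), -1]

# Change in share per region; negative values indicate a decline
change <- unlist(last - first)
print(round(change, 2))
```

Applied to the real subset, the same steps would give one number per region, which makes the regional declines easier to compare than reading them off the graph.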