Trends in Divorce Rates and Margarine Consumption Over Time
The divorce_margarine dataset from the dslabs package provides data on the divorce rate in Maine (measured per 1,000 people), per capita margarine consumption (in pounds), and the corresponding year. This dataset offers an opportunity to explore the concept of spurious correlation: a situation in which two unrelated variables appear to be correlated purely by coincidence. The goal of this analysis is to examine the apparent correlation between divorce rates in Maine and per capita margarine consumption over time.
# Load the package that provides the data sets
library(dslabs)

# Load the dataset
data("divorce_margarine", package = "dslabs")

# View the first few rows and a summary of the data
head(divorce_margarine)
summary(divorce_margarine)
 divorce_rate_maine margarine_consumption_per_capita      year
 Min.   :4.10       Min.   :3.700                    Min.   :2000
 1st Qu.:4.20       1st Qu.:4.275                    1st Qu.:2002
 Median :4.25       Median :4.900                    Median :2004
 Mean   :4.38       Mean   :5.320                    Mean   :2004
 3rd Qu.:4.55       3rd Qu.:6.200                    3rd Qu.:2007
 Max.   :5.00       Max.   :8.200                    Max.   :2009
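The apparent correlation described in the introduction can also be put into a single number. The following is a minimal sketch, assuming the dslabs package is installed; the variable names `r` and `r_squared` are illustrative and not part of the original analysis. It computes the Pearson correlation coefficient between the two series with base R's `cor()`.

```r
# Assumption: the dslabs package is installed and provides divorce_margarine
library(dslabs)
data("divorce_margarine", package = "dslabs")

# Pearson correlation between the divorce rate and margarine consumption
# (cor() uses method = "pearson" by default and returns a value in [-1, 1])
r <- cor(divorce_margarine$divorce_rate_maine,
         divorce_margarine$margarine_consumption_per_capita)
print(r)

# Squared correlation: the share of variance in one series that a
# straight-line fit on the other would account for
r_squared <- r^2
print(r_squared)
```

A coefficient close to 1 or -1 would only show that the two series move together over these years; it says nothing about whether one influences the other, which is the point of a spurious correlation.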
# Load the necessary package
library(ggplot2)
library(dslabs)

# Load the dataset
data("divorce_margarine", package = "dslabs")

# Enhanced scatterplot with smoothers
ggplot(divorce_margarine, aes(x = year)) +
  geom_point(aes(y = divorce_rate_maine, color = "Divorce Rate")) +
  geom_smooth(aes(y = divorce_rate_maine, color = "Divorce Rate"),
              method = "loess", se = FALSE) +
  geom_point(aes(y = margarine_consumption_per_capita, color = "Margarine Consumption")) +
  geom_smooth(aes(y = margarine_consumption_per_capita, color = "Margarine Consumption"),
              method = "loess", se = FALSE) +
  labs(
    title = "Trends in Divorce Rates and Margarine Consumption Over Time",
    x = "Year",
    y = "Rate / Consumption",
    color = "Variable",
    caption = "Data Source: dslabs package, divorce_margarine dataset"
  ) +
  scale_color_manual(values = c("Divorce Rate" = "#e31a1c",
                                "Margarine Consumption" = "#1f78b4")) +
  theme_minimal()
`geom_smooth()` using formula = 'y ~ x'
`geom_smooth()` using formula = 'y ~ x'