Data 110 DS Labs HW Assignment

Author

Shadeja Fuentes

Margarine vs. Divorce

Is this controversial spread wreaking havoc on the American public once again? Could margarine be a contributing factor in divorce rates in Maine?

Load libraries and data

library(dslabs)
library(tidyverse)
data("divorce_margarine")
head(divorce_margarine)
  divorce_rate_maine margarine_consumption_per_capita year
1                5.0                              8.2 2000
2                4.7                              7.0 2001
3                4.6                              6.5 2002
4                4.4                              5.3 2003
5                4.3                              5.2 2004
6                4.1                              4.0 2005

Create a scatterplot

# Map color to year on the points only; year is continuous, so it cannot
# split the trend line into groups and would be dropped by geom_smooth()
ggplot(divorce_margarine, aes(x = margarine_consumption_per_capita, y = divorce_rate_maine)) +
  geom_point(aes(color = year), shape = 15, size = 3) +
  # One linear trend line with its confidence band across all years
  geom_smooth(method = "lm", se = TRUE) +
  labs(x = "Margarine Consumption (lbs per capita)", y = "Divorce Rate (per 1000 population)",
       title = "Margarine Consumption and Divorce Rate by Year") +
  # Upper x limit is 9 so the 2000 observation (8.2 lbs) is not removed from the plot
  scale_x_continuous(limits = c(2, 9), breaks = seq(2, 8, 2)) +
  scale_y_continuous(limits = c(2, 8), breaks = seq(2, 8, 2)) +
  annotate("text", x = 3, y = 7, label = "positive\nassociation", size = 6, color = "skyblue", fontface = "bold") +
  # Continuous viridis palette for year, with whole-year legend breaks
  scale_color_viridis_c(name = "Year", breaks = seq(2000, 2008, 2)) +
  theme_bw()

I worked with the divorce_margarine dataset, which is very small: it contains only 10 observations of 3 variables (year, divorce rate, and margarine consumption per capita). I produced a scatterplot to visualize the relationship between margarine consumption and the divorce rate in Maine. Each data point is colored by year, from 2000 to 2009, and I included a trend line to show the linear relationship between the two variables. I then added an annotation to highlight the positive association. I found it more challenging to create plots with such a small amount of information, and the process was puzzling. The resulting scatterplot suggests that there may be a positive relationship between margarine consumption and divorce rates in Maine.
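
As a rough numeric check on the trend line, the sketch below computes a Pearson correlation and fits a simple linear regression. To stay runnable without any packages, it uses only the six rows printed by head() above, stored in a data frame named first_six (a name chosen here for illustration); the same calls should work on the full divorce_margarine data frame.

```r
# First six rows of divorce_margarine, copied from the head() output above
first_six <- data.frame(
  divorce_rate_maine = c(5.0, 4.7, 4.6, 4.4, 4.3, 4.1),
  margarine_consumption_per_capita = c(8.2, 7.0, 6.5, 5.3, 5.2, 4.0),
  year = 2000:2005
)

# Pearson correlation between margarine consumption and divorce rate
r <- cor(first_six$margarine_consumption_per_capita, first_six$divorce_rate_maine)

# Simple linear regression, the same kind of model geom_smooth(method = "lm") draws
fit <- lm(divorce_rate_maine ~ margarine_consumption_per_capita, data = first_six)
slope <- unname(coef(fit)["margarine_consumption_per_capita"])

print(r)      # strength and direction of the linear association
print(slope)  # change in divorce rate per additional pound of margarine
```

A correlation close to 1 together with a positive slope would be consistent with the positive association annotated on the scatterplot.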