Team info

  • Group name: Anemone
  • Group members: Emily Ma, Emily Kwon, Eileen Lee

Purpose

State your research question, a description of the variables you’ll use, and your data sources.

Do chocolate bars with certain cacao percentages and bean types have higher ratings than other chocolate bars?

  • One numerical outcome variable: rating

  • One categorical explanatory/predictor variable: bean type

  • One numerical explanatory/predictor variable: cacao percentage

  • ID variable: row number, 1 to 1795

Link to dataset source

Load packages and data

  1. Load all necessary packages
  2. Load the dataset, run the clean_names() function from the janitor package, then select() only the variables you are going to use.
cocoa_percent  rating  bean_type_new  ID
           63    3.75  other           1
           70    2.75  other           2
           70    3.00  other           3
           70    3.50  other           4
           70    3.50  other           5
           70    2.75  Criollo         6
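
A preview like the one above could come from a pipeline along the following lines. This is only a sketch under stated assumptions, not the group's actual code: the raw column names (`Cocoa Percent`, `Rating`, `Bean Type`), the percent-sign format, the missing bean types, and the rule that recodes everything except Criollo to "other" are all hypothetical. The six inline rows stand in for the real file, which would normally be read with something like read_csv().

```r
library(dplyr)
library(janitor)

# Hypothetical stand-in for the raw file; only the cleaned values are
# taken from the preview above. The raw column names, the "%" suffix,
# and the NA bean types are assumptions.
raw <- tibble(
  `Cocoa Percent` = c("63%", "70%", "70%", "70%", "70%", "70%"),
  `Rating`        = c(3.75, 2.75, 3.00, 3.50, 3.50, 2.75),
  `Bean Type`     = c(NA, NA, NA, NA, NA, "Criollo")
)

chocolate <- raw %>%
  clean_names() %>%   # "Cocoa Percent" becomes cocoa_percent, and so on
  mutate(
    # Strip the percent sign so cacao percentage is numeric
    cocoa_percent = as.numeric(sub("%", "", cocoa_percent)),
    # Assumed recode: keep Criollo, lump everything else into "other"
    bean_type_new = if_else(is.na(bean_type) | bean_type != "Criollo",
                            "other", bean_type),
    # Row number as the ID variable
    ID = row_number()
  ) %>%
  select(cocoa_percent, rating, bean_type_new, ID)

head(chocolate)
```

Creating bean_type_new inside mutate() before select() keeps the original bean_type column available for checking the recode until the final step.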

Create EDA visualizations

Create “exploratory data analysis” visualizations of your data. These are preliminary and can change before the final submission; the only requirement is that your visualizations use each of the measurement variables in your dataset, so you can test whether they work.
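
As one possible starting point, the sketch below draws a scatterplot and a boxplot with ggplot2 that together use all three measurement variables. It assumes a data frame named chocolate with the columns shown in the preview; the six rows here are copied from that preview so the example runs on its own, and the plot choices and axis labels are illustrative rather than required.

```r
library(ggplot2)

# The six rows from the data preview; the full dataset would be used instead.
chocolate <- data.frame(
  cocoa_percent = c(63, 70, 70, 70, 70, 70),
  rating        = c(3.75, 2.75, 3.00, 3.50, 3.50, 2.75),
  bean_type_new = c("other", "other", "other", "other", "other", "Criollo"),
  ID            = 1:6
)

# Rating against cacao percentage, coloured by bean type
scatter <- ggplot(chocolate,
                  aes(x = cocoa_percent, y = rating, color = bean_type_new)) +
  geom_point(alpha = 0.6) +
  labs(x = "Cacao percentage", y = "Rating", color = "Bean type")

# Distribution of ratings within each bean type
boxes <- ggplot(chocolate, aes(x = bean_type_new, y = rating)) +
  geom_boxplot() +
  labs(x = "Bean type", y = "Rating")

print(scatter)
print(boxes)
```

With the full dataset, many points are likely to overlap at common percentages such as 70, so swapping geom_point() for geom_jitter() may make the scatterplot easier to read.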