Introduction


A nutritionist at the Food and Drug Administration is studying the effects of cereal marketing on family meal choices. In particular, she would like to understand how cereal manufacturers market their products in grocery stores. She became interested in doing this study after noticing how the cereal was being restocked one day in her local grocery store. The store personnel were restocking the cereal shelves based on a reference sheet that told them where everything was to be placed. The placement of each cereal brand seemed very deliberate.

To gather data for her study, the nutritionist goes to the local grocery store and records data about cereal nutritional claims and shelf location for 77 cereals.

library(readxl)
library(mosaic)

Import the data set Cereal_Data.xlsx from Canvas.
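
A minimal sketch of the import step. It assumes the workbook has been downloaded from Canvas into the working directory under the name Cereal_Data.xlsx and that the data sit on the first sheet; neither detail is stated in the assignment. The object name Cereal_Data matches the one used in the later commands.

```r
# Assumed location of the downloaded workbook; adjust to wherever the file was saved.
path <- "Cereal_Data.xlsx"

# Read the file only if it is present and readxl is installed.
if (file.exists(path) && requireNamespace("readxl", quietly = TRUE)) {
  Cereal_Data <- readxl::read_excel(path)  # reads the first sheet by default
  str(Cereal_Data)                         # lists each variable with its storage type
} else {
  message("Workbook or readxl package not found; nothing was imported.")
}
```

The str() listing is a quick way to see which columns are stored as text and which as numbers before classifying them as qualitative or quantitative.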


1. Identify the population and the sample in this scenario.

The population is the set of all cereals in this scenario.

The sample is the 77 cereals recorded at the nutritionist's local grocery store.


2. Consider the variables in the data set. Identify the variables that are qualitative and those that are quantitative.

  • Shelf: Qualitative
  • Name: Qualitative
  • Manufacturer: Qualitative
  • Type: Qualitative
  • Calories: Quantitative
  • Protein: Quantitative
  • Fat: Quantitative
  • Sodium: Quantitative
  • Fiber: Quantitative
  • Carbohydrates: Quantitative
  • Sugars: Quantitative
  • Potassium: Quantitative
  • Vitamins: Quantitative
  • Weight (of One Serving Cup): Quantitative
  • Cups in Serving: Quantitative


3. Consider the variable Shelf. This variable is the shelf position of the cereal (bottom, middle, top) starting from the floor up.

To see whether the shelf position is associated with one measure of nutritive value, the amount of sugar, look at the data for the variable Sugars. Compare the sugar content of cereals on each shelf by making a separate histogram for the sugar content of the cereals on each shelf: a total of three histograms. Use the sugar content values as they are - do not factor in the serving size. (The data for one of the cereals, Quaker Oatmeal, is missing. Just continue with what is available. That’s the way it is in real life - values are missing, files are incomplete, etc.)

  • Use the same scales for your histograms so you can compare the data easily.
  • Title each histogram and label the axes.

hist(Cereal_Data$Sugars[Cereal_Data$Shelf == "Top"])

hist(Cereal_Data$Sugars[Cereal_Data$Shelf == "Bottom"])

hist(Cereal_Data$Sugars[Cereal_Data$Shelf == "Middle"])
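
The two requirements above (same scales, titled and labeled plots) can be met in one pass. The following is a sketch only: the small cereal data frame is made-up stand-in data, not the real Cereal_Data, and the bin width of 2 is an arbitrary choice. With the real data, replace cereal with Cereal_Data.

```r
# Made-up stand-in for Cereal_Data (illustrative values only, not the real data).
cereal <- data.frame(
  Shelf  = c("Bottom", "Bottom", "Bottom", "Middle", "Middle", "Middle",
             "Top", "Top", "Top"),
  Sugars = c(1, 3, 11, 9, 12, 13, 3, 6, 8)
)
shelves <- c("Bottom", "Middle", "Top")

# Shared bin edges built from all non-missing sugar values, so every
# histogram uses the same x-axis. A bin width of 2 is an assumption.
sugar_range <- range(cereal$Sugars, na.rm = TRUE)
breaks <- seq(floor(sugar_range[1]), sugar_range[2] + 2, by = 2)

# Count cereals per bin for each shelf without plotting, to find a shared y-axis limit.
counts <- lapply(shelves, function(s) {
  hist(cereal$Sugars[cereal$Shelf == s], breaks = breaks, plot = FALSE)$counts
})
y_max <- max(unlist(counts))

# Draw the three histograms side by side with identical axes, titles, and labels.
op <- par(mfrow = c(1, 3))
for (s in shelves) {
  hist(cereal$Sugars[cereal$Shelf == s],
       breaks = breaks, ylim = c(0, y_max),
       main = paste("Sugars:", s, "shelf"),
       xlab = "Sugars", ylab = "Number of cereals")
}
par(op)  # restore the previous plotting layout
```

Because hist() drops missing values, a cereal with no recorded sugar value is simply left out of its shelf's histogram.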


4. Briefly describe the distribution in each histogram with respect to shape. Based on your histograms, which shelf position has cereals with the most sugar?

The histogram for the top shelf is bell-shaped and roughly symmetric.

The histogram for the middle shelf is left-skewed.

The histogram for the bottom shelf is right-skewed.

Based on how often high sugar values occur in each histogram, the top shelf appears to hold the cereals with the most sugar.


6. Find the five-number summary, mean, and standard deviation of the variable “Fiber”.

Five-Number Summary (minimum, Q1, median, Q3, maximum): 0, 0.5, 2, 3, 14

Mean: 2.152

Standard Deviation: 2.383
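
One way such summaries can be computed in base R is sketched below. The fiber vector is hypothetical and serves only to make the example runnable; it is not the real Fiber column and does not reproduce the values reported above. With the real data, use Cereal_Data$Fiber instead.

```r
# Hypothetical fiber values for illustration only (not the real data set).
fiber <- c(0, 1, 2, 3, 4, 5, 10, NA)

# fivenum() returns minimum, lower hinge, median, upper hinge, maximum.
five_num <- fivenum(fiber, na.rm = TRUE)

# Mean and sample standard deviation, ignoring the missing value.
fiber_mean <- mean(fiber, na.rm = TRUE)
fiber_sd   <- sd(fiber, na.rm = TRUE)

print(five_num)
print(round(c(mean = fiber_mean, sd = fiber_sd), 3))
```

The hinges from fivenum() can differ slightly from the quartiles reported by quantile() or summary(), so small discrepancies between tools are expected. The mosaic package loaded at the top also provides favstats(), which reports a similar summary in a single call.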


7. Draw side-by-side boxplots for the fiber content for each of the three groups: cereals on the top shelf, cereals on the middle shelf, and cereals on the bottom shelf. Can you tell from the boxplots which group has the largest standard deviation?
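
A sketch of how the side-by-side boxplots might be drawn, using made-up stand-in data rather than the real Cereal_Data. A boxplot displays quartiles, range, and outliers, not the standard deviation itself, so the sketch also computes the standard deviation and IQR per shelf as a numeric check on the visual impression.

```r
# Made-up stand-in for Cereal_Data (illustrative values only, not the real data).
cereal <- data.frame(
  Shelf = factor(rep(c("Bottom", "Middle", "Top"), each = 4),
                 levels = c("Bottom", "Middle", "Top")),
  Fiber = c(0, 1, 2, 3,   0, 0, 1, 1,   1, 3, 5, 14)
)

# One box per shelf on a shared y-axis.
boxplot(Fiber ~ Shelf, data = cereal,
        main = "Fiber content by shelf position",
        xlab = "Shelf", ylab = "Fiber")

# Spread of each group: standard deviation and interquartile range.
sd_by_shelf  <- tapply(cereal$Fiber, cereal$Shelf, sd,  na.rm = TRUE)
iqr_by_shelf <- tapply(cereal$Fiber, cereal$Shelf, IQR, na.rm = TRUE)

print(round(sd_by_shelf, 3))
print(iqr_by_shelf)
```

A long box, long whiskers, or distant outliers suggest a large standard deviation, but only the computed values confirm which group is largest.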
