First, load the packages.
Next, import the data set Cereal_Data.xlsx from Canvas and display the first 6 rows of the data set.
load("~/Cereal_Data_1_.RData")
head(Cereal_Data_1_)
## # A tibble: 6 x 15
## Shelf Name Manufacturer Type Calories Protein Fat Sodium Fiber
## <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Top 100%… N C 70 4 1 130 10
## 2 Top 100%… Q C 120 3 5 15 2
## 3 Top All-… K C 70 4 1 260 9
## 4 Top All-… K C 50 4 0 140 14
## 5 Top Almo… R C 110 2 2 200 1
## 6 Bott… Appl… G C 110 2 2 180 1.5
## # … with 6 more variables: Carbohydrates <dbl>, Sugars <dbl>,
## # Potassium <dbl>, Vitamins <dbl>, `Weight (of One Serving Cup)` <dbl>,
## # `Cups in Serving` <dbl>
str(Cereal_Data_1_)
## Classes 'tbl_df', 'tbl' and 'data.frame': 77 obs. of 15 variables:
## $ Shelf : chr "Top" "Top" "Top" "Top" ...
## $ Name : chr "100%_Bran" "100%_Natural_Bran" "All-Bran" "All-Bran_with_Extra_Fiber" ...
## $ Manufacturer : chr "N" "Q" "K" "K" ...
## $ Type : chr "C" "C" "C" "C" ...
## $ Calories : num 70 120 70 50 110 110 110 130 90 90 ...
## $ Protein : num 4 3 4 4 2 2 2 3 2 3 ...
## $ Fat : num 1 5 1 0 2 2 0 2 1 0 ...
## $ Sodium : num 130 15 260 140 200 180 125 210 200 210 ...
## $ Fiber : num 10 2 9 14 1 1.5 1 2 4 5 ...
## $ Carbohydrates : num 5 8 7 8 14 10.5 11 18 15 13 ...
## $ Sugars : num 6 8 5 0 8 10 14 8 6 5 ...
## $ Potassium : num 280 135 320 330 NA 70 30 100 125 190 ...
## $ Vitamins : num 25 0 25 25 25 25 25 25 25 25 ...
## $ Weight (of One Serving Cup): num 1 1 1 1 1 1 1 1.33 1 1 ...
## $ Cups in Serving : num 0.33 1 0.33 0.5 0.75 0.75 1 0.75 0.67 0.67 ...
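The column classes in the `str()` output above can also be listed programmatically, which helps when sorting variables into qualitative and quantitative groups. This is a minimal sketch, assuming a small stand-in data frame (`cereal_sample`, a hypothetical name) built from three rows and four columns of the output shown above rather than the full data set. Note that a numeric class only suggests a quantitative variable; a numerically coded category would still need to be judged by hand.

```r
# Stand-in data: three rows and four columns copied from the output above.
# The name cereal_sample is hypothetical and is not part of the original analysis.
cereal_sample <- data.frame(
  Shelf = c("Top", "Top", "Top"),
  Name = c("100%_Bran", "100%_Natural_Bran", "All-Bran"),
  Calories = c(70, 120, 70),
  Sugars = c(6, 8, 5),
  stringsAsFactors = FALSE
)

# TRUE for numeric columns, FALSE for character columns
is_quantitative <- vapply(cereal_sample, is.numeric, logical(1))

qualitative_vars <- names(cereal_sample)[!is_quantitative]
quantitative_vars <- names(cereal_sample)[is_quantitative]

print(qualitative_vars)
print(quantitative_vars)
```

The same two lines with `vapply()` and `names()` should work unchanged on `Cereal_Data_1_`.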
Qualitative variables are Shelf, Name, Manufacturer, and Type. Quantitative variables are Calories, Protein, Fat, Sodium, Fiber, Carbohydrates, Sugars, Potassium, Vitamins, Weight (of One Serving Cup), and Cups in Serving.

---

#### 3. Consider the variable Shelf. This variable is the shelf position of the cereal (bottom, middle, top) starting from the floor up. To see whether the shelf position is associated with one measure of nutritive value, the amount of sugar, look at the data for the variable Sugars. Compare the sugar content of cereals on each shelf by making a separate histogram for the sugar content of the cereals on each shelf: a total of three histograms. Use the sugar content values as they are - do not factor in the serving size. (The data for one of the cereals, Quaker Oatmeal, is missing. Just continue with what is available. That’s the way it is in real life - values are missing, files are incomplete, etc.)
topshelf <- subset(Cereal_Data_1_, Shelf == "Top")
middleshelf <- subset(Cereal_Data_1_, Shelf == "Middle")
bottomshelf <- subset(Cereal_Data_1_, Shelf == "Bottom")
# breaks sets the bin boundaries explicitly and overrides binwidth, so only breaks is supplied
gf_histogram(~Sugars, title = "A Histogram for the Sugar Quantity on the Top Shelf", ylab = "Cereals", data = topshelf, breaks = seq(0, 16, by = 2), color = "blue", fill = "green")
gf_histogram(~Sugars, title = "A Histogram for the Sugar Quantity on the Middle Shelf", ylab = "Cereals", data = middleshelf, breaks = seq(0, 16, by = 2), color = "pink", fill = "blue")
gf_histogram(~Sugars, title = "A Histogram for the Sugar Quantity on the Bottom Shelf", ylab = "Cereals", data = bottomshelf, breaks = seq(0, 16, by = 2), color = "blue", fill = "purple")
## Warning: Removed 1 rows containing non-finite values (stat_bin).
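The bin counts behind histograms like these, and the missing value the warning refers to, can be checked without plotting. This is a rough sketch in base R, assuming a stand-in data frame (`cereal_head`, a hypothetical name) that holds only the first six Shelf and Sugars values shown earlier. The sixth shelf label is displayed truncated in the `head()` output and is taken here to be "Bottom". `cut()` is used as a stand-in for the histogram binning, with the same boundaries `seq(0, 16, by = 2)`.

```r
# Stand-in data: first six rows of Shelf and Sugars from the output above.
# cereal_head is a hypothetical name; the full data set has 77 rows.
cereal_head <- data.frame(
  Shelf = c("Top", "Top", "Top", "Top", "Top", "Bottom"),
  Sugars = c(6, 8, 5, 0, 8, 10)
)

# Assign each sugar value to a bin with the same boundaries as the histograms
sugar_bins <- cut(cereal_head$Sugars,
                  breaks = seq(0, 16, by = 2),
                  include.lowest = TRUE)

# Cross-tabulate shelf against sugar bin (missing values are left out by table())
bin_counts <- table(cereal_head$Shelf, sugar_bins)

# Count missing Sugars values per shelf; a nonzero count explains the warning
missing_by_shelf <- tapply(is.na(cereal_head$Sugars), cereal_head$Shelf, sum)

print(bin_counts)
print(missing_by_shelf)
```

Run on `Cereal_Data_1_`, `missing_by_shelf` would show which shelf holds the cereal with the missing sugar value.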
favstats(Cereal_Data_1_$Fiber)
## min Q1 median Q3 max mean sd n missing
## 0 1 2 3 14 2.151948 2.383364 77 0
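To make clear what each column of the `favstats()` output reports, the same quantities can be assembled from base R functions. This is a sketch only, assuming a vector (`fiber_first10`, a hypothetical name) that holds the first ten Fiber values from the `str()` output, so its results differ from the full-data summary above. The quartiles use the default method of `quantile()`, which is assumed here to match what `favstats()` uses.

```r
# First ten Fiber values from the str() output; not the full 77 observations.
# fiber_first10 is a hypothetical name used only for this illustration.
fiber_first10 <- c(10, 2, 9, 14, 1, 1.5, 1, 2, 4, 5)

fiber_summary <- c(
  min = min(fiber_first10, na.rm = TRUE),
  Q1 = unname(quantile(fiber_first10, 0.25, na.rm = TRUE)),
  median = median(fiber_first10, na.rm = TRUE),
  Q3 = unname(quantile(fiber_first10, 0.75, na.rm = TRUE)),
  max = max(fiber_first10, na.rm = TRUE),
  mean = mean(fiber_first10, na.rm = TRUE),
  sd = sd(fiber_first10, na.rm = TRUE),
  n = sum(!is.na(fiber_first10)),       # number of non-missing values
  missing = sum(is.na(fiber_first10))   # number of missing values
)

print(fiber_summary)
```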