The two variables from the eggs dataset that I’ll clean are large_dozen and extra_large_dozen. I’ll clean both variables for the years 2004 and 2005, then present descriptive statistics and a visualization that compare the two years and show how one egg size competes with the other, to inform the buyer’s decision.
The variables I chose to look at are the large dozen and the extra large dozen of eggs. I collected the data I needed with select(), picking out only the columns I wanted to compare and contrast: the month, the year, and the two variables I plan to work with.
select(eggs_tidy, month, year, large_dozen, extra_large_dozen)
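The selection above can be extended to the 2004 and 2005 comparison described earlier. The following is only a minimal sketch: eggs_example is a hypothetical stand-in for eggs_tidy, its prices are made-up placeholders rather than real values, and the mean is just one possible descriptive statistic.

```r
library(dplyr)

# Hypothetical stand-in for eggs_tidy; these prices are placeholders, not real data.
eggs_example <- data.frame(
  month = rep(c("January", "February"), times = 3),
  year = rep(c(2004, 2005, 2006), each = 2),
  large_dozen = c(100, 110, 120, 130, 140, 150),
  extra_large_dozen = c(105, 115, 125, 135, 145, 155)
)

# Keep the four columns of interest, restrict to 2004 and 2005,
# and compute the mean price of each egg size per year.
yearly_means <- eggs_example %>%
  select(month, year, large_dozen, extra_large_dozen) %>%
  filter(year %in% c(2004, 2005)) %>%
  group_by(year) %>%
  summarise(
    mean_large = mean(large_dozen),
    mean_extra_large = mean(extra_large_dozen)
  )

print(yearly_means)
```

With the real data, eggs_tidy would take the place of eggs_example, and the result would have one row per year.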
I visualized the data with the geom_smooth function. geom_smooth draws a smoothed trend line for each variable, which let me put the two on the same plot as comparable lines and arrive at my summary. I also used xlab and ylab to rename my x and y axes.
After cleaning the data, we can see that when the customer is faced with the choice of buying one dozen Large vs. Extra Large eggs, more customers are willing to pay for the Extra Large eggs.
ggplot(data = eggs_tidy) + geom_smooth(mapping = aes(x = year, y = large_dozen, color = 'Large_Dozen')) + geom_smooth(mapping = aes(x = year, y = extra_large_dozen, color = 'XL_Dozen')) + xlab("Year") + ylab("Eggs")
Because the preceding code wasn’t able to knit properly, I have provided the code used to produce my visualization.
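An alternative way to draw the same comparison is to reshape the two price columns into long format, so that a single geom_smooth layer colors the lines by egg size. This is only a sketch under stated assumptions: eggs_example is a hypothetical stand-in for eggs_tidy with placeholder prices, the size and price column names are my own, and method = "lm" is used here only because the toy data is too small for the default smoother.

```r
library(ggplot2)
library(tidyr)

# Hypothetical stand-in for eggs_tidy; these prices are placeholders, not real data.
eggs_example <- data.frame(
  month = rep(c("January", "February"), times = 3),
  year = rep(c(2004, 2005, 2006), each = 2),
  large_dozen = c(100, 110, 120, 130, 140, 150),
  extra_large_dozen = c(105, 115, 125, 135, 145, 155)
)

# Stack the two price columns: one row per month, year, and egg size.
eggs_long <- pivot_longer(
  eggs_example,
  cols = c(large_dozen, extra_large_dozen),
  names_to = "size",    # assumed column name
  values_to = "price"   # assumed column name
)

# One smoothing layer; color separates the two egg sizes.
egg_plot <- ggplot(eggs_long, aes(x = year, y = price, color = size)) +
  geom_smooth(method = "lm", se = FALSE) +
  xlab("Year") +
  ylab("Eggs")
```

Printing egg_plot draws the figure. With the full eggs_tidy data, the method argument could be dropped to get the default smoother used in the original plot.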