Objective

The objective of this applied project is to explain the price of avocados using basic descriptive analysis. Producers, retailers, and grocers can use this analysis to make decisions about pricing, advertising, and supply chain strategies, among others. Additional analysis will follow in later episodes. Your feedback is highly appreciated.

Dataset - weekly avocado sales and price data from the Hass Avocado Board website

This data was downloaded from the Hass Avocado Board website in May 2018 and compiled into a single CSV. Here’s how the Hass Avocado Board describes the data on their website: “The table below represents weekly retail scan data for National retail volume (units) and price. Retail scan data comes directly from retailers’ cash registers based on actual retail sales of Hass avocados. Starting in 2013, the table below reflects an expanded, multi-outlet retail data set. Multi-outlet reporting includes an aggregation of the following channels: grocery, mass, club, drug, dollar and military. The Average Price (of avocados) in the table reflects a per unit (per avocado) cost, even when multiple units (avocados) are sold in bags. The Product Lookup codes (PLU’s) in the table are only for Hass avocados. Other varieties of avocados (e.g. greenskins) are not included in this table.”

# Load the compiled weekly data (one row per week, geography, and type)
data <- read.csv("avocado.csv")

# Let's build a simple histogram of the average price
hist(data$average_price,
     main = "Histogram of average_price",
     xlab = "Price in USD (US Dollar)")

library(ggplot2)
# Overlay the price distributions of the two avocado types
ggplot(data, aes(x = average_price, fill = type)) + 
  geom_histogram(bins = 30, col = "purple4") + 
  scale_fill_manual(values = c("pink", "orange")) +
  ggtitle("Frequency of Average Price - Organic vs. Conventional")

# Simple EDA with ggplot: total volume by geography, filled by year
ggplot() + 
  geom_col(data = data, mapping = aes(x = reorder(geography, total_volume), 
                                      y = total_volume, fill = factor(year))) +
  xlab("geography") +
  ylab("total_volume") +
  labs(fill = "year") +
  theme(axis.text.x = element_text(angle = 90, size = 7))

# Sample response for year 2017 - The plot shows that Los Angeles has the highest sales volume in 2017.

#1. Perform an exploratory analysis of the variable AveragePrice using the ggplot2 package. How do the prices of organic and conventional avocados compare? Any other findings?

The histograms show a roughly bell-shaped distribution for conventional avocado prices, whereas organic avocado prices are spread over a wider range. Organic prices also tend to be higher.
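The visual comparison can be backed with per-type summary statistics. The following is a minimal sketch, assuming the `type` and `average_price` column names used in the script above; the `price_summary` helper and the `toy` rows are hypothetical and serve only to make the example runnable without the CSV.

```r
# Illustrative rows only (NOT taken from avocado.csv);
# column names follow the script above.
toy <- data.frame(
  type = c("conventional", "conventional", "conventional",
           "organic", "organic", "organic"),
  average_price = c(1.00, 1.10, 1.20, 1.30, 1.70, 2.10)
)

# Hypothetical helper: mean, median, and standard deviation of price per type.
price_summary <- function(df) {
  out <- aggregate(average_price ~ type, data = df,
                   FUN = function(x) c(mean = mean(x),
                                       median = median(x),
                                       sd = sd(x)))
  # aggregate() returns the statistics as a matrix column;
  # flatten it into ordinary columns named mean, median, sd.
  data.frame(type = out$type, out$average_price)
}

price_summary(toy)
```

On the real data, `price_summary(data)` would give one row per type. A larger `sd` for organic would support the "more spread out" reading, and a larger `mean` or `median` would support the "higher price" reading.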

#2. Which city has the largest sales, in dollar amount, of organic and of conventional Hass avocados in 2017 and in 2018? What could be the possible reasons for Hass avocado’s popularity in these cities?

Los Angeles has the largest sales in both years. A possible reason is its large population. In addition, the city is in southern California near Mexico, which is a major supplier of avocados.
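Because the question asks for sales in dollars rather than units, one way to check the answer is to approximate revenue as `average_price * total_volume` and rank geographies within each year and type. This is only a sketch under stated assumptions: the column names are those used in the script above, the `top_city_by_revenue` helper and its `exclude` argument are hypothetical, and the `toy` rows are made up so the block runs without the CSV.

```r
# Illustrative rows only (NOT real figures); "City A"/"City B" are placeholders.
toy <- data.frame(
  geography = c("City A", "City A", "City B", "City B",
                "City A", "City A", "City B", "City B"),
  type = rep(c("conventional", "organic"), times = 4),
  year = rep(c(2017, 2018), each = 4),
  average_price = c(1.0, 1.5, 0.9, 1.6, 1.1, 1.6, 1.0, 1.7),
  total_volume = c(300, 40, 200, 30, 320, 50, 210, 60)
)

# Hypothetical helper: top geography by approximate dollar sales
# for every (year, type) combination.
top_city_by_revenue <- function(df, exclude = character(0)) {
  # Drop any non-city rows the caller names (assumption: such rows may exist).
  df <- df[!(df$geography %in% exclude), ]
  # Approximate dollar sales as price per unit times units sold.
  df$revenue <- df$average_price * df$total_volume
  totals <- aggregate(revenue ~ geography + type + year, data = df, FUN = sum)
  # Within each (year, type) group keep the geography with the largest total.
  groups <- split(totals, list(totals$year, totals$type), drop = TRUE)
  top <- do.call(rbind, lapply(groups, function(g) g[which.max(g$revenue), ]))
  rownames(top) <- NULL
  top[order(top$year, top$type), ]
}

top_city_by_revenue(toy)
```

If the `geography` column also contains aggregate rows (for example a national or regional total), they would need to be passed through `exclude`, otherwise they would outrank every individual city.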

#3. Visit The Hass Avocado Board (https://hassavocadoboard.com/) and list at least three reasons different stakeholders could benefit from their marketing research.

  1. Farmers and avocado suppliers could use marketing research to determine how many organic and conventional avocados they need to supply to meet market demand.

  2. Company executives can determine which type of avocado consumers prefer.

  3. Executives can also observe shifts in demand for each type of avocado, for example by season or by location.
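The seasonal question in the last point can be explored by averaging weekly volume per calendar month and type. The sketch below assumes the file has a `date` column in ISO format (`YYYY-MM-DD`), which does not appear in the script above. The `monthly_volume` helper is hypothetical and the `toy` rows are made up for illustration.

```r
# Illustrative rows only (NOT real figures).
# The `date` column name and format are assumptions about the CSV.
toy <- data.frame(
  date = c("2017-01-08", "2017-01-15", "2017-07-09", "2017-07-16",
           "2017-01-08", "2017-01-15", "2017-07-09", "2017-07-16"),
  type = rep(c("conventional", "organic"), each = 4),
  total_volume = c(100, 120, 150, 170, 10, 14, 20, 24)
)

# Hypothetical helper: mean weekly volume per calendar month and type.
monthly_volume <- function(df) {
  # Extract the calendar month (1-12) from the ISO-formatted date string.
  df$month <- as.integer(format(as.Date(df$date), "%m"))
  aggregate(total_volume ~ month + type, data = df, FUN = mean)
}

monthly_volume(toy)
```

Plotting the result with `month` on the x-axis and one line per `type` would make any recurring peaks easy to spot. Grouping by `geography` instead of `month` would address the location question in the same way.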