R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

Note: this analysis was performed using the open-source software R and RStudio.

library(readr)
data <- read_csv('conventional.csv')
## Rows: 6314 Columns: 7
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (3): date, type, geography
## dbl (4): average_price, total_volume, year, Mileage
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
plot(total_volume ~ average_price, data = data)

Answers

library(ggplot2)
head(data)
## # A tibble: 6 × 7
##   date    average_price total_volume type          year geography        Mileage
##   <chr>           <dbl>        <dbl> <chr>        <dbl> <chr>              <dbl>
## 1 12/3/17          1.39       139970 conventional  2017 Albany              2832
## 2 12/3/17          1.07       504933 conventional  2017 Atlanta             2199
## 3 12/3/17          1.43       658939 conventional  2017 Baltimore/Washi…    2679
## 4 12/3/17          1.14        86646 conventional  2017 Boise                827
## 5 12/3/17          1.4        488588 conventional  2017 Boston              2998
## 6 12/3/17          1.13       153282 conventional  2017 Buffalo/Rochest…    2552
ggplot(data = data, aes(x = average_price)) +
  geom_histogram()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

For Organic Avocados

library(readr)
data <- read_csv('organic.csv')
## Rows: 6314 Columns: 7
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (3): date, type, geography
## dbl (4): average_price, total_volume, year, Mileage
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
plot(total_volume ~ average_price, data = data)

library(ggplot2)
head(data)
## # A tibble: 6 × 7
##   date    average_price total_volume type     year geography            Mileage
##   <chr>           <dbl>        <dbl> <chr>   <dbl> <chr>                  <dbl>
## 1 12/3/17          1.44         3577 organic  2017 Albany                  2832
## 2 12/3/17          1.62        10609 organic  2017 Atlanta                 2199
## 3 12/3/17          1.58        38754 organic  2017 Baltimore/Washington    2679
## 4 12/3/17          1.77         1829 organic  2017 Boise                    827
## 5 12/3/17          1.88        21338 organic  2017 Boston                  2998
## 6 12/3/17          1.18         7575 organic  2017 Buffalo/Rochester       2552
ggplot(data = data, aes(x = average_price)) +
  geom_histogram()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Questions

  1. Perform an exploratory analysis of the variable AveragePrice using the ggplot2 package. How do the prices of organic and conventional avocados compare? Any other findings?

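Organic avocados are priced higher than conventional ones. In the six markets shown above for 12/3/17, the organic average price is higher in every one (for example, 1.88 vs. 1.40 in Boston), while organic volumes are far smaller (for example, 21,338 vs. 488,588 in Boston). This fits the pattern that the lower the price, the more volume people buy.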
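
One way to make this comparison concrete is to stack the two types in one data frame and summarise `average_price` by `type`. The sketch below is only illustrative: so that it runs on its own, it uses just the six rows per type printed by `head()` above rather than the full files, and the names `sample_prices`, `price_summary`, and `price_plot` are made up for this example.

```r
library(ggplot2)

# The six rows per type shown by head() above (12/3/17 only),
# not the full 6314-row files.
sample_prices <- data.frame(
  geography = rep(c("Albany", "Atlanta", "Baltimore/Washington",
                    "Boise", "Boston", "Buffalo/Rochester"), times = 2),
  type = rep(c("conventional", "organic"), each = 6),
  average_price = c(1.39, 1.07, 1.43, 1.14, 1.40, 1.13,
                    1.44, 1.62, 1.58, 1.77, 1.88, 1.18),
  total_volume = c(139970, 504933, 658939, 86646, 488588, 153282,
                   3577, 10609, 38754, 1829, 21338, 7575)
)

# Mean price and mean volume for each type
price_summary <- aggregate(cbind(average_price, total_volume) ~ type,
                           data = sample_prices, FUN = mean)
print(price_summary)

# Side-by-side box plots of price by type
price_plot <- ggplot(sample_prices, aes(x = type, y = average_price)) +
  geom_boxplot()
print(price_plot)
```

With the full data, the same summary and plot could presumably be applied after combining the two `read_csv()` results into one data frame. Six rows per type are too few to support firm conclusions on their own.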

  2. Draw a regression plot using the variable AveragePrice and Total Volume.

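Done. See the plots of total_volume against average_price above, one for conventional and one for organic avocados.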
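
The `plot()` calls above draw the points without a fitted line. A minimal sketch of adding a linear regression line with ggplot2 follows. It uses only the six conventional rows printed by `head()` so that it runs on its own, and `conventional_sample`, `fit`, and `regression_plot` are hypothetical names. Treating `total_volume` as the response mirrors the `plot(total_volume ~ average_price, ...)` call.

```r
library(ggplot2)

# Six conventional rows from the head() output above, not the full file.
conventional_sample <- data.frame(
  average_price = c(1.39, 1.07, 1.43, 1.14, 1.40, 1.13),
  total_volume  = c(139970, 504933, 658939, 86646, 488588, 153282)
)

# Ordinary least squares fit of volume on price
fit <- lm(total_volume ~ average_price, data = conventional_sample)
print(coef(fit))

# Scatter plot with the fitted straight line drawn on top
regression_plot <- ggplot(conventional_sample,
                          aes(x = average_price, y = total_volume)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)
print(regression_plot)
```

The same layers could be applied to the tibble returned by `read_csv('conventional.csv')` or `read_csv('organic.csv')`. Six points are far too few to read anything into the slope.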

  3. Visit The Hass Avocado Board (https://hassavocadoboard.com/) and list at least three reasons different stakeholders could benefit from their marketing research.

Different stakeholders could benefit from their marketing research in three ways: by using graphs of price and quantity like the ones above, by using tables, and by setting up data analysis reports that show their sales research and where they sell the most avocados by demographic, class, etc.