library(readr)
conventional <- read_csv('conventional_avocado.csv')
## New names:
## Rows: 6314 Columns: 9
## ── Column specification
## ──────────────────────────────────────────────────────── Delimiter: "," chr
## (3): type, geography, date dbl (4): average_price, total_volume, year, Mileage
## lgl (2): ...4, ...9
## ℹ Use `spec()` to retrieve the full column specification for this data. ℹ
## Specify the column types or set `show_col_types = FALSE` to quiet this message.
## • `` -> `...4`
## • `` -> `...9`
plot(average_price ~ total_volume, data = conventional)

library(ggplot2)
head(conventional)
## # A tibble: 6 × 9
##   type      average_price total_volume ...4   year geography Mileage date  ...9 
##   <chr>             <dbl>        <dbl> <lgl> <dbl> <chr>       <dbl> <chr> <lgl>
## 1 conventi…          1.39       139970 NA     2017 Albany       2832 12/3… NA   
## 2 conventi…          1.07       504933 NA     2017 Atlanta      2199 12/3… NA   
## 3 conventi…          1.43       658939 NA     2017 Baltimor…    2679 12/3… NA   
## 4 conventi…          1.14        86646 NA     2017 Boise         827 12/3… NA   
## 5 conventi…          1.4        488588 NA     2017 Boston       2998 12/3… NA   
## 6 conventi…          1.13       153282 NA     2017 Buffalo/…    2552 12/3… NA
ggplot(data = conventional, aes(x = average_price)) + geom_histogram()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

library(readr)
organic <- read_csv('organic_avocado.csv')
## New names:
## Rows: 6314 Columns: 8
## ── Column specification
## ──────────────────────────────────────────────────────── Delimiter: "," chr
## (3): type, geography, date dbl (4): average_price, total_volume, year, Mileage
## lgl (1): ...4
## ℹ Use `spec()` to retrieve the full column specification for this data. ℹ
## Specify the column types or set `show_col_types = FALSE` to quiet this message.
## • `` -> `...4`
plot(average_price ~ total_volume, data = organic)

library(ggplot2)
head(organic)
## # A tibble: 6 × 8
##   type    average_price total_volume ...4   year geography         Mileage date 
##   <chr>           <dbl>        <dbl> <lgl> <dbl> <chr>               <dbl> <chr>
## 1 organic          1.44         3577 NA     2017 Albany               2832 12/3…
## 2 organic          1.62        10609 NA     2017 Atlanta              2199 12/3…
## 3 organic          1.58        38754 NA     2017 Baltimore/Washin…    2679 12/3…
## 4 organic          1.77         1829 NA     2017 Boise                 827 12/3…
## 5 organic          1.88        21338 NA     2017 Boston               2998 12/3…
## 6 organic          1.18         7575 NA     2017 Buffalo/Rochester    2552 12/3…
ggplot(data = organic, aes(x = average_price)) + geom_histogram()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

#1. Perform an exploratory analysis of the variable AveragePrice using the ggplot2 package. How do the prices of organic and conventional avocados compare? Any other findings?

The histograms show a roughly bell-shaped distribution for conventional avocado prices, whereas organic avocado prices are more spread out. Organic prices also tend to be higher than conventional prices.
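
One way to make the price comparison more direct is to put both types on the same plot and compute a summary per type. The block below is a minimal sketch, not the full analysis: to stay runnable without the CSV files it uses only the first six rows of each data set, copied from the `head()` output, and the names `avocado_head`, `price_means`, and `price_plot` are made up for illustration. Six rows per type are far too few to draw conclusions from; with the full data, the same calls would be applied to the `conventional` and `organic` tibbles stacked together.

```r
library(ggplot2)

# First six rows of each data set, copied from the head() output.
# Stand-in for the full data: stack conventional and organic instead.
avocado_head <- data.frame(
  type = rep(c("conventional", "organic"), each = 6),
  average_price = c(1.39, 1.07, 1.43, 1.14, 1.40, 1.13,
                    1.44, 1.62, 1.58, 1.77, 1.88, 1.18)
)

# Mean price per avocado type
price_means <- tapply(avocado_head$average_price, avocado_head$type, mean)
print(price_means)

# Side-by-side boxplots show center and spread for each type on one scale
price_plot <- ggplot(avocado_head, aes(x = type, y = average_price)) +
  geom_boxplot()
print(price_plot)
```

Boxplots are used here instead of two separate histograms because they share one price axis, which makes differences in level and spread easier to read.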

#2. Draw a regression plot using the variables AveragePrice and Total Volume.
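
A regression plot in ggplot2 is usually a scatter plot with a fitted line added by `geom_smooth(method = "lm")`. The block below is a sketch under assumptions: it regresses price on volume (the same direction as the `plot()` calls earlier), and to run without the CSV file it uses only the first six conventional rows from the `head()` output. The names `conventional_head`, `regression_plot`, and `fit` are illustrative; with the full data, `conventional` or `organic` would be passed in their place.

```r
library(ggplot2)

# First six conventional rows, copied from the head() output.
# Stand-in for the full data: pass `conventional` or `organic` instead.
conventional_head <- data.frame(
  average_price = c(1.39, 1.07, 1.43, 1.14, 1.40, 1.13),
  total_volume  = c(139970, 504933, 658939, 86646, 488588, 153282)
)

# Scatter plot with a least-squares line; se = FALSE hides the confidence band
regression_plot <- ggplot(conventional_head,
                          aes(x = total_volume, y = average_price)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)
print(regression_plot)

# The same linear model fitted directly, to inspect intercept and slope
fit <- lm(average_price ~ total_volume, data = conventional_head)
print(coef(fit))
```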

#3. Visit the Hass Avocado Board (https://hassavocadoboard.com/) and list at least three reasons different stakeholders could benefit from their marketing research.

  1. Farmers and avocado suppliers could use marketing research to determine how many organic and conventional avocados they need to supply in order to meet market demand.

  2. Company executives can determine which type of avocado consumers prefer.

  3. Executives can also observe shifts in demand for each type of avocado. Does demand shift with the seasons? With location?