library(readr)
conventional <- read_csv('conventional_avocado.csv')
## New names:
## • `` -> `...4`
## • `` -> `...9`
## Rows: 6314 Columns: 9
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (3): type, geography, date
## dbl (4): average_price, total_volume, year, Mileage
## lgl (2): ...4, ...9
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
plot(average_price ~ total_volume, data = conventional)
library(ggplot2)
head(conventional)
## # A tibble: 6 × 9
##   type      average_price total_volume ...4   year geography Mileage date  ...9
##   <chr>             <dbl>        <dbl> <lgl> <dbl> <chr>       <dbl> <chr> <lgl>
## 1 conventi…          1.39       139970 NA     2017 Albany       2832 12/3… NA
## 2 conventi…          1.07       504933 NA     2017 Atlanta      2199 12/3… NA
## 3 conventi…          1.43       658939 NA     2017 Baltimor…    2679 12/3… NA
## 4 conventi…          1.14        86646 NA     2017 Boise         827 12/3… NA
## 5 conventi…          1.4        488588 NA     2017 Boston       2998 12/3… NA
## 6 conventi…          1.13       153282 NA     2017 Buffalo/…    2552 12/3… NA
ggplot(data = conventional, aes(x = average_price)) + geom_histogram()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
library(readr)
organic <- read_csv('organic_avocado.csv')
## New names:
## • `` -> `...4`
## Rows: 6314 Columns: 8
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (3): type, geography, date
## dbl (4): average_price, total_volume, year, Mileage
## lgl (1): ...4
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
plot(average_price ~ total_volume, data = organic)
library(ggplot2)
head(organic)
## # A tibble: 6 × 8
##   type    average_price total_volume ...4   year geography         Mileage date
##   <chr>           <dbl>        <dbl> <lgl> <dbl> <chr>               <dbl> <chr>
## 1 organic          1.44         3577 NA     2017 Albany               2832 12/3…
## 2 organic          1.62        10609 NA     2017 Atlanta              2199 12/3…
## 3 organic          1.58        38754 NA     2017 Baltimore/Washin…    2679 12/3…
## 4 organic          1.77         1829 NA     2017 Boise                 827 12/3…
## 5 organic          1.88        21338 NA     2017 Boston               2998 12/3…
## 6 organic          1.18         7575 NA     2017 Buffalo/Rochester    2552 12/3…
ggplot(data = organic, aes(x = average_price)) + geom_histogram()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
#1. Perform an exploratory analysis of the variable AveragePrice using the ggplot2 package. How do the prices of organic and conventional avocados compare? Any other findings?
The histogram of conventional avocado prices is roughly symmetric and bell-shaped, whereas organic avocado prices are spread over a wider range. Organic prices also tend to be higher than conventional prices.
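One way to check this comparison is to stack the two data sets and look at both types on a shared price axis, alongside a small numerical summary. The sketch below assumes the `conventional` and `organic` tibbles loaded above. The helper names `stack_types()`, `price_summary()`, and `price_histogram()` are hypothetical, and the default bin width of 0.05 is an arbitrary choice rather than a recommended value.

```r
library(ggplot2)

# Hypothetical helper: keep the columns both files share and stack them.
stack_types <- function(conventional, organic) {
  cols <- c("type", "average_price", "total_volume")
  rbind(as.data.frame(conventional[, cols]), as.data.frame(organic[, cols]))
}

# Hypothetical helper: centre and spread of average_price for each type.
price_summary <- function(avocados) {
  per_type <- lapply(split(avocados, avocados$type), function(d) {
    data.frame(
      type   = d$type[1],
      mean   = mean(d$average_price, na.rm = TRUE),
      median = median(d$average_price, na.rm = TRUE),
      sd     = sd(d$average_price, na.rm = TRUE)
    )
  })
  do.call(rbind, per_type)
}

# Overlaid histograms with an explicit bin width (an assumed value),
# which also avoids the `stat_bin()` message about `bins = 30`.
price_histogram <- function(avocados, binwidth = 0.05) {
  ggplot(avocados, aes(x = average_price, fill = type)) +
    geom_histogram(binwidth = binwidth, alpha = 0.5, position = "identity") +
    labs(x = "Average price", y = "Count", fill = "Type")
}
```

With the data loaded, `price_summary(stack_types(conventional, organic))` returns one row per type, and `price_histogram(stack_types(conventional, organic))` draws both distributions in one panel. A larger `sd` for organic would support the wider spread described above.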
#2. Draw a regression plot using the variables AveragePrice and Total Volume.
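The base `plot()` calls above draw the scatter of price against volume but no fitted line. A minimal ggplot2 sketch of a regression plot follows, assuming a data frame with the `average_price` and `total_volume` columns shown in the `head()` output. The function names `regression_plot()` and `price_volume_fit()` are hypothetical, and a straight-line fit is an assumption about what the question intends.

```r
library(ggplot2)

# Scatter plot of price against volume with a least-squares line
# and its confidence band.
regression_plot <- function(data, title = NULL) {
  ggplot(data, aes(x = total_volume, y = average_price)) +
    geom_point(alpha = 0.3) +
    geom_smooth(method = "lm", formula = y ~ x, se = TRUE) +
    labs(x = "Total volume", y = "Average price", title = title)
}

# The same linear model fitted directly, so the intercept and slope
# can be read with coef() or summary().
price_volume_fit <- function(data) {
  lm(average_price ~ total_volume, data = data)
}
```

For example, `regression_plot(conventional, "Conventional")` and `regression_plot(organic, "Organic")` give one plot per type, and `coef(price_volume_fit(organic))` returns the fitted intercept and slope. If the volumes turn out to be strongly skewed, adding `scale_x_log10()` to the plot may make the pattern easier to see.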
#3. Visit The Hass Avocado Board (https://hassavocadoboard.com/) and list at least three reasons different stakeholders could benefit from their marketing research.
1. Farmers and avocado suppliers could use marketing research to determine how many organic and conventional avocados they need to supply to meet market demand.
2. Company executives can determine which type of avocado consumers prefer.
3. Executives can also observe shifts in demand for each type of avocado, for example by season or by location.
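The location question in the last point can be explored with the data already loaded. The sketch below totals `total_volume` for each level of a grouping column such as `geography` or `year`. The helper name `volume_by()` is hypothetical, and using summed volume as a stand-in for demand is an assumption.

```r
# Hypothetical helper: total volume per level of a grouping column,
# sorted from largest to smallest.
volume_by <- function(data, group_col) {
  totals <- aggregate(data$total_volume,
                      by = list(group = data[[group_col]]),
                      FUN = sum, na.rm = TRUE)
  names(totals) <- c(group_col, "total_volume")
  totals[order(-totals$total_volume), ]
}
```

Calling `volume_by(organic, "geography")` lists regions by total organic volume, and `volume_by(organic, "year")` does the same by year. A seasonal breakdown would first need the `date` column, which is read as text, to be parsed into dates; its full format is not visible in the truncated output above, so that step is left out here.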