This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
Note: this analysis was performed using the open-source software R and RStudio.
library(readr)
avocado <- read_csv('avocado.csv')
## Rows: 12628 Columns: 7
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (3): date, type, geography
## dbl (4): average_price, total_volume, year, Mileage
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
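The message above names two ways to quiet the column report. Here is a minimal sketch of both. It assumes a two-row inline stand-in for `avocado.csv` (rows taken from this dataset) so that it runs on its own. The column types simply mirror what readr guessed, and the names `sample_csv`, `avocado_typed`, and `avocado_quiet` are illustrative only.

```r
library(readr)

# Inline stand-in for avocado.csv; I() tells readr this is literal data, not a path.
sample_csv <- I(paste(
  "date,average_price,total_volume,type,year,geography,Mileage",
  "2017/12/3,1.39,139970,conventional,2017,Albany,2832",
  "2017/12/3,1.44,3577,organic,2017,Albany,2832",
  sep = "\n"
))

# Option 1: state the column types explicitly (same types readr guessed above).
avocado_typed <- read_csv(
  sample_csv,
  col_types = cols(
    date = col_character(),
    average_price = col_double(),
    total_volume = col_double(),
    type = col_character(),
    year = col_double(),
    geography = col_character(),
    Mileage = col_double()
  )
)

# Option 2: keep the guessing but suppress the message.
avocado_quiet <- read_csv(sample_csv, show_col_types = FALSE)
```

Stating the types has the side benefit that a malformed file is flagged as a parsing problem instead of silently changing a column's type.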
summary(avocado)
## date average_price total_volume type
## Length:12628 Min. :0.500 Min. : 253 Length:12628
## Class :character 1st Qu.:1.100 1st Qu.: 15733 Class :character
## Mode :character Median :1.320 Median : 94806 Mode :character
## Mean :1.359 Mean : 325259
## 3rd Qu.:1.570 3rd Qu.: 430222
## Max. :2.780 Max. :5660216
## year geography Mileage
## Min. :2017 Length:12628 Min. : 111
## 1st Qu.:2018 Class :character 1st Qu.:1097
## Median :2019 Mode :character Median :2193
## Mean :2019 Mean :1911
## 3rd Qu.:2020 3rd Qu.:2632
## Max. :2020 Max. :2998
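Because `date`, `type`, and `geography` are stored as character, `summary()` reports only their length and class. One possible way to get more informative output is to convert them first. The sketch below uses a small stand-in data frame (`avocado_sample`, a hypothetical name, built from rows of this dataset) and assumes the dates follow the year/month/day pattern seen in the data.

```r
# Small stand-in built from rows of this dataset; the real analysis would use `avocado`.
avocado_sample <- data.frame(
  date = c("2017/12/3", "2017/12/3", "2017/12/3", "2017/12/3"),
  average_price = c(1.39, 1.44, 1.07, 1.62),
  type = c("conventional", "organic", "conventional", "organic"),
  geography = c("Albany", "Albany", "Atlanta", "Atlanta"),
  stringsAsFactors = FALSE
)

# Parse the year/month/day strings into Date values.
avocado_sample$date <- as.Date(avocado_sample$date, format = "%Y/%m/%d")

# Treat the categorical columns as factors so summary() counts each level.
avocado_sample$type <- factor(avocado_sample$type)
avocado_sample$geography <- factor(avocado_sample$geography)

summary(avocado_sample)
```

With these conversions, `summary()` would show a date range for `date` and per-level counts for `type` and `geography` instead of `Length`/`Class`/`Mode`.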
Part 2
library(ggplot2)
head(avocado)
## # A tibble: 6 × 7
## date average_price total_volume type year geography Mileage
## <chr> <dbl> <dbl> <chr> <dbl> <chr> <dbl>
## 1 2017/12/3 1.39 139970 conventional 2017 Albany 2832
## 2 2017/12/3 1.44 3577 organic 2017 Albany 2832
## 3 2017/12/3 1.07 504933 conventional 2017 Atlanta 2199
## 4 2017/12/3 1.62 10609 organic 2017 Atlanta 2199
## 5 2017/12/3 1.43 658939 conventional 2017 Baltimore/Was… 2679
## 6 2017/12/3 1.58 38754 organic 2017 Baltimore/Was… 2679
ggplot(data = avocado, aes(x = average_price)) +
  geom_histogram()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
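The message above asks for an explicit bin width. Here is a minimal sketch of one way to supply it. The value `0.1` is an assumption, not a choice made in the original analysis: the summary shows prices ranging from 0.50 to 2.78, so it would give roughly 23 bins. `price_sample` is a hypothetical stand-in built from the six prices shown by `head()`.

```r
library(ggplot2)

# Stand-in for the `average_price` column, taken from the head() output.
price_sample <- data.frame(
  average_price = c(1.39, 1.44, 1.07, 1.62, 1.43, 1.58)
)

# Same plot as above, but with an explicit (assumed) bin width of 0.1 price units.
p <- ggplot(data = price_sample, aes(x = average_price)) +
  geom_histogram(binwidth = 0.1)

p
```

Setting `binwidth` (or, alternatively, `bins`) removes the `stat_bin()` message and makes the histogram's resolution a deliberate choice.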