R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

Note: this analysis was performed using the open-source software R and RStudio.

Objective

The objective of this basic project is to explain the price of avocados using descriptive analysis. Producers, retailers, and grocers can use this analysis to inform decisions about pricing, advertising, and supply chain strategies, among others. Additional analysis will follow in a later installment. Your feedback is highly appreciated.

Dataset - weekly avocado sales and price data from the Hass Avocado Board website

This data was downloaded from the Hass Avocado Board website in May of 2018 and compiled into a single CSV. Here’s how the Hass Avocado Board describes the data on their website:

“The table below represents weekly retail scan data for National retail volume (units) and price. Retail scan data comes directly from retailers’ cash registers based on actual retail sales of Hass avocados. Starting in 2013, the table below reflects an expanded, multi-outlet retail data set. Multi-outlet reporting includes an aggregation of the following channels: grocery, mass, club, drug, dollar and military. The Average Price (of avocados) in the table reflects a per unit (per avocado) cost, even when multiple units (avocados) are sold in bags. The Product Lookup codes (PLU’s) in the table are only for Hass avocados. Other varieties of avocados (e.g. greenskins) are not included in this table.”

data <- read.csv("avocado.csv")
head(data)
##   X       Date AveragePrice Total.Volume   X4046     X4225  X4770 Total.Bags
## 1 0 12/27/2015         1.33     64236.62 1036.74  54454.85  48.16    8696.87
## 2 1 12/20/2015         1.35     54876.98  674.28  44638.81  58.33    9505.56
## 3 2 12/13/2015         0.93    118220.22  794.70 109149.67 130.50    8145.35
## 4 3  12/6/2015         1.08     78992.15 1132.00  71976.41  72.58    5811.16
## 5 4 11/29/2015         1.28     51039.60  941.48  43838.39  75.78    6183.95
## 6 5 11/22/2015         1.26     55979.78 1184.27  48067.99  43.61    6683.91
##   Small.Bags Large.Bags XLarge.Bags         type year region
## 1    8603.62      93.25           0 conventional 2015 Albany
## 2    9408.07      97.49           0 conventional 2015 Albany
## 3    8042.21     103.14           0 conventional 2015 Albany
## 4    5677.40     133.76           0 conventional 2015 Albany
## 5    5986.26     197.69           0 conventional 2015 Albany
## 6    6556.47     127.44           0 conventional 2015 Albany
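The Hass Avocado Board description does not explain the column names. Judging only from the rows printed by head(), Total.Volume appears to equal the three PLU columns (X4046, X4225, X4770) plus Total.Bags, and Total.Bags appears to equal Small.Bags + Large.Bags + XLarge.Bags. The sketch below checks that reading on the first three printed rows. The helper name check_totals and the 0.01 tolerance are assumptions made for illustration and are not part of the original analysis.

```r
# First three rows copied from the head(data) output above.
sample_rows <- data.frame(
  Total.Volume = c(64236.62, 54876.98, 118220.22),
  X4046        = c(1036.74, 674.28, 794.70),
  X4225        = c(54454.85, 44638.81, 109149.67),
  X4770        = c(48.16, 58.33, 130.50),
  Total.Bags   = c(8696.87, 9505.56, 8145.35),
  Small.Bags   = c(8603.62, 9408.07, 8042.21),
  Large.Bags   = c(93.25, 97.49, 103.14),
  XLarge.Bags  = c(0, 0, 0)
)

# Hypothetical helper: TRUE for each row whose totals match the sum of their parts.
# The tolerance absorbs rounding to two decimals and floating-point error.
check_totals <- function(df, tol = 0.01) {
  volume_parts <- df$X4046 + df$X4225 + df$X4770 + df$Total.Bags
  bag_parts    <- df$Small.Bags + df$Large.Bags + df$XLarge.Bags
  volume_ok <- abs(volume_parts - df$Total.Volume) < tol
  bags_ok   <- abs(bag_parts - df$Total.Bags) < tol
  volume_ok & bags_ok
}

check_totals(sample_rows)
```

Calling table(check_totals(data)) on the full data frame would show whether the relationship holds beyond these three rows.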
#install.packages('plyr')
library(plyr)
count(data, 'region')
##                 region freq
## 1               Albany  338
## 2              Atlanta  338
## 3  BaltimoreWashington  338
## 4                Boise  338
## 5               Boston  338
## 6     BuffaloRochester  338
## 7           California  338
## 8            Charlotte  338
## 9              Chicago  338
## 10    CincinnatiDayton  338
## 11            Columbus  338
## 12       DallasFtWorth  338
## 13              Denver  338
## 14             Detroit  338
## 15         GrandRapids  338
## 16          GreatLakes  338
## 17  HarrisburgScranton  338
## 18 HartfordSpringfield  338
## 19             Houston  338
## 20        Indianapolis  338
## 21        Jacksonville  338
## 22            LasVegas  338
## 23          LosAngeles  338
## 24          Louisville  338
## 25   MiamiFtLauderdale  338
## 26            Midsouth  338
## 27           Nashville  338
## 28    NewOrleansMobile  338
## 29             NewYork  338
## 30           Northeast  338
## 31  NorthernNewEngland  338
## 32             Orlando  338
## 33        Philadelphia  338
## 34       PhoenixTucson  338
## 35          Pittsburgh  338
## 36              Plains  338
## 37            Portland  338
## 38   RaleighGreensboro  338
## 39     RichmondNorfolk  338
## 40             Roanoke  338
## 41          Sacramento  338
## 42            SanDiego  338
## 43        SanFrancisco  338
## 44             Seattle  338
## 45       SouthCarolina  338
## 46        SouthCentral  338
## 47           Southeast  338
## 48             Spokane  338
## 49             StLouis  338
## 50            Syracuse  338
## 51               Tampa  338
## 52             TotalUS  338
## 53                West  338
## 54    WestTexNewMexico  335
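Every region has 338 rows except WestTexNewMexico, which has 335. With 54 regions that gap is easy to miss by eye, so it can help to flag short regions programmatically. The sketch below does this in base R, without plyr, on a small made-up region vector. The vector contents and the names n_rows and short_regions are illustrative assumptions.

```r
# Toy stand-in for data$region (illustrative values only, not the real counts).
region <- c(rep("Albany", 4), rep("Boise", 4), rep("WestTexNewMexico", 3))

# Count rows per region: split() groups the values, length() counts each group.
n_rows <- vapply(split(region, region), length, integer(1))

# Regions with fewer rows than the best-covered region.
short_regions <- names(n_rows)[n_rows < max(n_rows)]
short_regions
```

Replacing the toy vector with data$region should return the regions that are missing weeks, which matters later if regions are compared over time.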
count(data, 'AveragePrice')
##     AveragePrice freq
## 1           0.44    1
## 2           0.46    1
## 3           0.48    1
## 4           0.49    2
## 5           0.51    5
## 6           0.52    3
## 7           0.53    6
## 8           0.54    7
## 9           0.55    3
## 10          0.56   12
## 11          0.57    9
## 12          0.58   14
## 13          0.59    5
## 14          0.60   12
## 15          0.61   11
## 16          0.62   10
## 17          0.63   10
## 18          0.64   17
## 19          0.65   23
## 20          0.66   10
## 21          0.67   23
## 22          0.68   26
## 23          0.69   18
## 24          0.70   44
## 25          0.71   28
## 26          0.72   30
## 27          0.73   38
## 28          0.74   50
## 29          0.75   43
## 30          0.76   61
## 31          0.77   65
## 32          0.78   53
## 33          0.79   67
## 34          0.80   60
## 35          0.81   60
## 36          0.82   66
## 37          0.83   81
## 38          0.84   64
## 39          0.85   79
## 40          0.86   73
## 41          0.87   77
## 42          0.88   88
## 43          0.89   98
## 44          0.90   94
## 45          0.91   92
## 46          0.92  101
## 47          0.93  141
## 48          0.94  123
## 49          0.95  127
## 50          0.96  143
## 51          0.97  147
## 52          0.98  189
## 53          0.99  185
## 54          1.00  167
## 55          1.01  159
## 56          1.02  160
## 57          1.03  179
## 58          1.04  174
## 59          1.05  178
## 60          1.06  170
## 61          1.07  168
## 62          1.08  194
## 63          1.09  167
## 64          1.10  161
## 65          1.11  169
## 66          1.12  158
## 67          1.13  192
## 68          1.14  180
## 69          1.15  202
## 70          1.16  168
## 71          1.17  174
## 72          1.18  199
## 73          1.19  188
## 74          1.20  155
## 75          1.21  151
## 76          1.22  167
## 77          1.23  181
## 78          1.24  165
## 79          1.25  170
## 80          1.26  193
## 81          1.27  155
## 82          1.28  147
## 83          1.29  149
## 84          1.30  140
## 85          1.31  139
## 86          1.32  137
## 87          1.33  159
## 88          1.34  164
## 89          1.35  163
## 90          1.36  187
## 91          1.37  159
## 92          1.38  155
## 93          1.39  148
## 94          1.40  175
## 95          1.41  167
## 96          1.42  149
## 97          1.43  185
## 98          1.44  172
## 99          1.45  157
## 100         1.46  150
## 101         1.47  160
## 102         1.48  185
## 103         1.49  180
## 104         1.50  170
## 105         1.51  148
## 106         1.52  161
## 107         1.53  160
## 108         1.54  173
## 109         1.55  163
## 110         1.56  151
## 111         1.57  134
## 112         1.58  146
## 113         1.59  186
## 114         1.60  159
## 115         1.61  125
## 116         1.62  139
## 117         1.63  136
## 118         1.64  133
## 119         1.65  123
## 120         1.66  141
## 121         1.67  129
## 122         1.68  138
## 123         1.69  127
## 124         1.70  115
## 125         1.71   85
## 126         1.72  117
## 127         1.73   96
## 128         1.74  106
## 129         1.75  105
## 130         1.76  115
## 131         1.77   85
## 132         1.78   97
## 133         1.79  112
## 134         1.80  116
## 135         1.81  119
## 136         1.82  125
## 137         1.83  115
## 138         1.84   88
## 139         1.85  102
## 140         1.86   90
## 141         1.87   84
## 142         1.88   94
## 143         1.89   80
## 144         1.90   83
## 145         1.91   73
## 146         1.92   81
## 147         1.93   79
## 148         1.94   63
## 149         1.95   59
## 150         1.96   59
## 151         1.97   54
## 152         1.98   50
## 153         1.99   49
## 154         2.00   59
## 155         2.01   59
## 156         2.02   54
## 157         2.03   39
## 158         2.04   42
## 159         2.05   38
## 160         2.06   59
## 161         2.07   48
## 162         2.08   38
## 163         2.09   47
## 164         2.10   26
## 165         2.11   37
## 166         2.12   26
## 167         2.13   35
## 168         2.14   26
## 169         2.15   33
## 170         2.16   30
## 171         2.17   29
## 172         2.18   32
## 173         2.19   26
## 174         2.20   21
## 175         2.21   22
## 176         2.22   20
## 177         2.23   19
## 178         2.24   22
## 179         2.25   22
## 180         2.26   15
## 181         2.27   20
## 182         2.28   13
## 183         2.29   16
## 184         2.30   20
## 185         2.31   24
## 186         2.32   20
## 187         2.33   24
## 188         2.34   19
## 189         2.35   14
## 190         2.36   18
## 191         2.37   18
## 192         2.38   13
## 193         2.39   14
## 194         2.40   13
## 195         2.41    7
## 196         2.42    6
## 197         2.43   10
## 198         2.44   10
## 199         2.45    8
## 200         2.46    7
## 201         2.47    4
## 202         2.48    8
## 203         2.49    5
## 204         2.50    6
## 205         2.51    6
## 206         2.52    4
## 207         2.53    2
## 208         2.54    9
## 209         2.55    9
## 210         2.56    5
## 211         2.57   10
## 212         2.58    9
## 213         2.59   10
## 214         2.60    2
## 215         2.61    6
## 216         2.62    8
## 217         2.63    2
## 218         2.64    5
## 219         2.65    8
## 220         2.66    3
## 221         2.67    7
## 222         2.68    1
## 223         2.69    2
## 224         2.70    3
## 225         2.71    4
## 226         2.72    3
## 227         2.73    8
## 228         2.74    3
## 229         2.75    3
## 230         2.76    5
## 231         2.77    3
## 232         2.78    1
## 233         2.79    4
## 234         2.80    2
## 235         2.81    3
## 236         2.82    3
## 237         2.83    6
## 238         2.84    4
## 239         2.85    4
## 240         2.86    4
## 241         2.87    3
## 242         2.88    3
## 243         2.89    3
## 244         2.90    1
## 245         2.91    1
## 246         2.92    2
## 247         2.93    4
## 248         2.94    2
## 249         2.95    1
## 250         2.96    1
## 251         2.97    1
## 252         2.99    2
## 253         3.00    2
## 254         3.03    1
## 255         3.04    1
## 256         3.05    1
## 257         3.12    1
## 258         3.17    1
## 259         3.25    1
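Listing every distinct price produces 259 rows, which is hard to scan. A coarser view groups prices into fixed-width bins with cut() and counts each bin with table(). The sketch below uses the six prices printed by head(data). The 0.5 bin width and the 0 to 3.5 range are assumptions chosen to cover the observed range of 0.44 to 3.25.

```r
# Prices copied from the head(data) output above.
prices <- c(1.33, 1.35, 0.93, 1.08, 1.28, 1.26)

# Bin edges: 0, 0.5, 1, ..., 3.5 (seven right-closed bins).
breaks <- seq(0, 3.5, by = 0.5)

# cut() assigns each price to a bin; table() counts prices per bin.
price_bins <- table(cut(prices, breaks = breaks))
price_bins
```

With the full column, table(cut(data$AveragePrice, breaks = breaks)) condenses the same information into seven rows.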
mean(data$AveragePrice)
## [1] 1.405978
median(data$AveragePrice)
## [1] 1.37
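#Now let's find the mode of the variable AveragePrice.
#mode(data$AveragePrice) does not work: mode() reports the storage mode ("numeric"), not the most frequent value.
#Base R has no built-in function for the statistical mode, so we tabulate the frequencies instead.
#See reference: https://stackoverflow.com/questions/2547402/is-there-a-built-in-function-for-finding-the-mode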
freq <- tapply(data$AveragePrice,data$AveragePrice,length)
as.numeric(names(freq)[which.max(freq)])
## [1] 1.15
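One caveat with which.max() is that it returns only the first maximum, so a tie between two equally frequent prices would go unnoticed. A minimal sketch of a reusable helper that returns every tied value follows. The name stat_mode is hypothetical, and the input vector is made up to force a tie.

```r
# Hypothetical helper: return every value tied for the highest frequency.
stat_mode <- function(x) {
  freq <- table(x)                         # frequency of each distinct value
  is_top <- as.vector(freq) == max(freq)   # TRUE for all values sharing the top count
  as.numeric(names(freq)[is_top])
}

# Made-up input in which 0.98 and 1.15 each occur twice.
stat_mode(c(1.15, 1.15, 0.98, 0.98, 1.37))
```

Applied to data$AveragePrice, a single returned value would confirm that the mode is unique.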
hist(data$AveragePrice,xlab="Average Price",main ="Frequency of Average Price")

library(ggplot2)

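#Date is read as text (e.g. "12/27/2015"), so convert it to a Date to get a chronological axis
ggplot(data = data, aes(x = as.Date(Date, format = "%m/%d/%Y"), y = AveragePrice)) +
  geom_point() +
  labs(x = "Date")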
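The head() output shows dates stored as month/day/year text. Text values sort alphabetically rather than chronologically, which scrambles a time axis. The sketch below shows the difference using four dates copied from that output. The variable names are illustrative.

```r
# Dates copied from the head(data) output above, still in text form.
date_text <- c("12/27/2015", "12/20/2015", "12/13/2015", "12/6/2015")

# Parse month/day/year text into Date values.
parsed <- as.Date(date_text, format = "%m/%d/%Y")

# Sorted as text, "12/6/2015" lands after "12/27/2015".
sort(date_text)

# Sorted as dates, December 6 comes first.
sort(parsed)
```

Storing the parsed values once, for example in a new column, avoids repeating the conversion in every plot.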

#Let's measure the linear (Pearson) correlation between price and sales volume of avocados
cor(data$Total.Volume,data$AveragePrice)
## [1] -0.1927524
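A Pearson correlation of about -0.19 indicates a weak negative linear association. Pearson correlation is sensitive to skew, and Total.Volume is likely to be heavily skewed because the rows mix single cities with what look like aggregate regions such as TotalUS. Two common alternatives are to correlate log volume with price, or to use the rank-based Spearman correlation. The sketch below only demonstrates the calls on the six rows printed by head(data). Six rows from one region say nothing about the full dataset, so no conclusion should be drawn from its output.

```r
# Price and volume copied from the six rows of the head(data) output.
price  <- c(1.33, 1.35, 0.93, 1.08, 1.28, 1.26)
volume <- c(64236.62, 54876.98, 118220.22, 78992.15, 51039.60, 55979.78)

# Pearson on raw volume (what cor() computes by default).
pearson <- cor(volume, price)

# Pearson on log volume dampens the influence of very large values.
pearson_log <- cor(log(volume), price)

# Spearman uses ranks, so it captures any monotonic relationship.
spearman <- cor(volume, price, method = "spearman")

round(c(pearson = pearson, pearson_log = pearson_log, spearman = spearman), 3)
```

On the full data the same calls would be cor(log(data$Total.Volume), data$AveragePrice) and cor(data$Total.Volume, data$AveragePrice, method = "spearman"). The log version assumes Total.Volume is strictly positive.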