This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

Note: this analysis was performed using the open source software R and RStudio.

## Objective

Your explanation here

## Descriptive statistics

Your explanation here
data <- read.csv("Avocado Hackathon 2017.csv")
head(data)
## date average_price total_volume type year geography
## 1 12/3/2017 1.39 139970 conventional 2017 Albany
## 2 12/3/2017 1.44 3577 organic 2017 Albany
## 3 12/3/2017 1.07 504933 conventional 2017 Atlanta
## 4 12/3/2017 1.62 10609 organic 2017 Atlanta
## 5 12/3/2017 1.43 658939 conventional 2017 Baltimore/Washington
## 6 12/3/2017 1.58 38754 organic 2017 Baltimore/Washington
## Mileage Total.sales.volume
## 1 2832 194558.30
## 2 2832 5150.88
## 3 2199 540278.31
## 4 2199 17186.58
## 5 2679 942282.77
## 6 2679 61231.32
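In the six rows printed above, `Total.sales.volume` appears to equal `average_price` multiplied by `total_volume`. The sketch below checks this for those six rows only, with the values retyped by hand; the name `first_rows` is illustrative, and whether the relationship holds for the rest of the file is an assumption that would need to be checked against the full data.

```r
# First six rows as printed by head(data); retyped by hand for illustration.
first_rows <- data.frame(
  average_price      = c(1.39, 1.44, 1.07, 1.62, 1.43, 1.58),
  total_volume       = c(139970, 3577, 504933, 10609, 658939, 38754),
  Total.sales.volume = c(194558.30, 5150.88, 540278.31, 17186.58, 942282.77, 61231.32)
)

# Revenue implied by price times volume, rounded to cents.
implied <- round(first_rows$average_price * first_rows$total_volume, 2)

# TRUE if the implied revenue matches the recorded column within a cent.
matches <- all(abs(implied - first_rows$Total.sales.volume) < 0.01)
matches
```

If the same check passes on the full data frame, `Total.sales.volume` can be read as revenue in dollars rather than as a second volume measure.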
# Install plyr only if it is not already available.
if (!requireNamespace("plyr", quietly = TRUE)) install.packages("plyr")
library(plyr)
count(data, 'geography')
## geography freq
## 1 Albany 10
## 2 Atlanta 10
## 3 Baltimore/Washington 10
## 4 Boise 10
## 5 Boston 10
## 6 Buffalo/Rochester 10
## 7 Charlotte 10
## 8 Chicago 10
## 9 Cincinnati/Dayton 10
## 10 Columbus 10
## 11 Dallas/Ft. Worth 10
## 12 Denver 10
## 13 Detroit 10
## 14 Grand Rapids 10
## 15 Harrisburg/Scranton 10
## 16 Hartford/Springfield 10
## 17 Houston 10
## 18 Indianapolis 10
## 19 Jacksonville 10
## 20 Las Vegas 10
## 21 Los Angeles 10
## 22 Louisville 10
## 23 Miami/Ft. Lauderdale 10
## 24 Nashville 10
## 25 New Orleans/Mobile 10
## 26 New York 10
## 27 Orlando 10
## 28 Philadelphia 10
## 29 Phoenix/Tucson 10
## 30 Pittsburgh 10
## 31 Portland 10
## 32 Raleigh/Greensboro 10
## 33 Richmond/Norfolk 10
## 34 Sacramento 10
## 35 San Diego 10
## 36 San Francisco 10
## 37 Seattle 10
## 38 Spokane 10
## 39 St. Louis 10
## 40 Syracuse 10
## 41 Tampa 10
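The table above shows 41 regions with 10 rows each, which implies 410 observations in total. As a sketch of how the same kind of frequency table could be built without `plyr`, the example below uses base R `table()` on a small made-up vector; the vector `geography` and its counts are hypothetical stand-ins, not the avocado data.

```r
# Toy vector standing in for data$geography (three regions, hypothetical counts).
geography <- c("Albany", "Atlanta", "Albany", "Boise", "Atlanta", "Albany")

# Base R frequency table; as.data.frame() gives a two-column layout.
freq <- as.data.frame(table(geography), stringsAsFactors = FALSE)
names(freq) <- c("geography", "freq")
freq

# With 41 regions at 10 rows each, the full data set should have 410 rows.
n_rows <- 41 * 10
n_rows
```

On the real data, `nrow(data)` would be the direct way to confirm the row count.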
count(data, 'average_price')
## average_price freq
## 1 0.64 1
## 2 0.70 1
## 3 0.72 1
## 4 0.73 1
## 5 0.75 3
## 6 0.77 1
## 7 0.78 1
## 8 0.79 1
## 9 0.80 1
## 10 0.81 1
## 11 0.83 2
## 12 0.84 1
## 13 0.85 1
## 14 0.86 1
## 15 0.87 1
## 16 0.88 2
## 17 0.89 2
## 18 0.90 2
## 19 0.91 1
## 20 0.92 2
## 21 0.93 2
## 22 0.94 2
## 23 0.95 1
## 24 0.96 1
## 25 0.97 3
## 26 0.98 2
## 27 0.99 6
## 28 1.00 2
## 29 1.01 9
## 30 1.02 4
## 31 1.03 7
## 32 1.04 3
## 33 1.05 3
## 34 1.06 3
## 35 1.07 5
## 36 1.08 3
## 37 1.09 1
## 38 1.10 4
## 39 1.11 2
## 40 1.12 1
## 41 1.13 10
## 42 1.14 9
## 43 1.15 5
## 44 1.16 5
## 45 1.17 4
## 46 1.18 7
## 47 1.19 3
## 48 1.20 3
## 49 1.21 1
## 50 1.23 2
## 51 1.24 4
## 52 1.25 6
## 53 1.26 7
## 54 1.27 5
## 55 1.28 6
## 56 1.29 5
## 57 1.30 4
## 58 1.31 4
## 59 1.32 1
## 60 1.33 3
## 61 1.34 3
## 62 1.35 3
## 63 1.36 5
## 64 1.37 8
## 65 1.38 3
## 66 1.39 9
## 67 1.40 8
## 68 1.41 7
## 69 1.42 4
## 70 1.43 5
## 71 1.44 7
## 72 1.45 8
## 73 1.46 7
## 74 1.47 1
## 75 1.48 4
## 76 1.49 2
## 77 1.50 3
## 78 1.51 3
## 79 1.52 4
## 80 1.53 3
## 81 1.54 6
## 82 1.55 3
## 83 1.56 7
## 84 1.57 1
## 85 1.58 6
## 86 1.59 10
## 87 1.60 4
## 88 1.61 6
## 89 1.62 2
## 90 1.63 3
## 91 1.64 2
## 92 1.65 2
## 93 1.66 1
## 94 1.67 3
## 95 1.68 2
## 96 1.69 2
## 97 1.70 1
## 98 1.71 2
## 99 1.72 4
## 100 1.73 2
## 101 1.74 2
## 102 1.75 3
## 103 1.76 1
## 104 1.77 3
## 105 1.78 1
## 106 1.79 6
## 107 1.80 3
## 108 1.81 4
## 109 1.82 4
## 110 1.83 5
## 111 1.84 1
## 112 1.85 1
## 113 1.86 2
## 114 1.87 1
## 115 1.88 1
## 116 1.89 1
## 117 1.90 2
## 118 1.91 1
## 119 1.92 1
## 120 1.93 1
## 121 1.94 1
## 122 1.99 1
## 123 2.00 1
## 124 2.01 1
## 125 2.03 1
## 126 2.05 1
## 127 2.06 2
## 128 2.07 1
## 129 2.10 1
## 130 2.11 1
## 131 2.12 2
## 132 2.14 1
## 133 2.27 1
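Counting exact values of a continuous variable produces a long table (133 distinct prices here). Grouping prices into intervals is often easier to read. The sketch below bins the six prices from the `head(data)` output into 25-cent intervals with `cut()`; the bin width and the `breaks` vector are arbitrary choices for illustration, and the full data (0.64 to 2.27) would need a wider range of breaks.

```r
# Six prices from the head(data) output, used here as a small stand-in sample.
prices <- c(1.39, 1.44, 1.07, 1.62, 1.43, 1.58)

# Bin into 25-cent intervals, closed on the left: [1.00,1.25), [1.25,1.50), ...
breaks <- seq(1.00, 1.75, by = 0.25)
binned <- table(cut(prices, breaks = breaks, right = FALSE))
binned
```

A histogram via `hist(data$average_price)` would show the same information graphically.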
mean(data$average_price)
## [1] 1.37339
median(data$average_price)
## [1] 1.38
cor(data$total_volume,data$average_price)
## [1] -0.5198722
To calculate the price elasticity of demand we use the formula PE = (ΔQ/ΔP) * (P/Q) (Iacobacci, 2015, p. 134-135). The term (ΔQ/ΔP) is estimated by the slope coefficient of the regression below, which models total_volume (Q) as a linear function of average_price (P). Beta denotes this coefficient: the change in the dependent variable y per unit change in x, i.e. Δy/Δx = ΔQ/ΔP. For (P/Q) we use the average price and the average total volume, i.e. the means of average_price and total_volume (Salem, 2014).
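To make the formula concrete, here is a minimal sketch on a made-up linear demand curve. The helper `point_elasticity()` and the numbers Q = 100 - 20 * P are hypothetical and are not taken from the avocado data; they only illustrate how the slope and the P/Q ratio combine.

```r
# Hypothetical helper: point elasticity from a slope (dQ/dP) and a price/quantity pair.
point_elasticity <- function(slope, price, quantity) {
  slope * price / quantity
}

# Toy linear demand curve Q = 100 - 20 * P (made-up numbers, not the avocado data).
slope <- -20
price <- 2
quantity <- 100 + slope * price   # 60 units at a price of 2

e <- point_elasticity(slope, price, quantity)
e   # about -0.67: a 1% price rise lowers quantity by roughly 0.67%
```

Because the slope is constant on a linear demand curve while P/Q changes along it, the elasticity depends on the point at which it is evaluated.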
plot(total_volume ~ average_price, data)
regr <- lm(total_volume ~ average_price, data)
abline(regr, col='red')
summary(regr)
##
## Call:
## lm(formula = total_volume ~ average_price, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -494290 -209254 -83751 102581 2607183
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1234469 81941 15.06 <2e-16 ***
## average_price -715216 58182 -12.29 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 367400 on 408 degrees of freedom
## Multiple R-squared: 0.2703, Adjusted R-squared: 0.2685
## F-statistic: 151.1 on 1 and 408 DF, p-value: < 2.2e-16
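Two quick consistency checks tie this summary to the earlier output. In a regression with a single predictor, the squared correlation equals the Multiple R-squared, and each t value is the estimate divided by its standard error. The sketch below recomputes both from the printed (rounded) numbers, so small rounding differences are expected.

```r
# Values copied from the cor() and summary(regr) output above.
r        <- -0.5198722   # correlation between total_volume and average_price
estimate <- -715216      # slope estimate
std_err  <- 58182        # standard error of the slope

r_squared <- r^2                  # about 0.2703, the reported Multiple R-squared
t_value   <- estimate / std_err   # about -12.29, the reported t value

round(r_squared, 4)
round(t_value, 2)
```

An R-squared of about 0.27 means price alone accounts for roughly a quarter of the variation in volume, so much of the spread in the scatter plot is left unexplained by this model.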
coefficients(regr)
## (Intercept) average_price
## 1234469.1 -715216.4
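# Slope of the fitted line: estimated change in volume per unit change in price (ΔQ/ΔP)
Beta <- regr$coefficients[["average_price"]]
# Evaluate the elasticity at the sample means of price and volume
P <- mean(data$average_price)
Q <- mean(data$total_volume)
elasticity <- Beta * P / Q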
elasticity
## [1] -3.894844
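The result can be checked by hand from the numbers already printed. An ordinary least squares line with an intercept passes through the sample means, so the mean volume can be recovered from the fitted line at the mean price. The sketch below uses the rounded coefficients and mean from the output above, so it agrees with the reported value only to a few decimal places.

```r
# Values copied from coefficients(regr) and mean(data$average_price) above.
intercept <- 1234469.1
beta      <- -715216.4
p_mean    <- 1.37339

# An OLS line with an intercept passes through the sample means, so the mean
# volume can be recovered from the fitted line (up to rounding of the inputs).
q_mean <- intercept + beta * p_mean

elasticity_check <- beta * p_mean / q_mean
elasticity_check   # close to the reported -3.894844
```

Under this linear model, an elasticity near -3.9 at the mean would correspond to roughly a 3.9% fall in volume for a 1% rise in price.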
Your conclusions here:
References:

Salem, 2014. Price Elasticity with R. http://www.salemmarafi.com/code/price-elasticity-with-r/

365datascience. https://365datascience.com/trending/price-elasticity/