PREDICTING HASS AVOCADO AVERAGE PRICE AND SALES VOLUME (2015-2020)

Javier Calviño Tilves

17/2/2021

A-INTRODUCTION.

This dataset tabulates sales volume and average price of Hass avocados in the USA from 2015 to 2020 at three levels: nationwide ("Total US"), by region (eight large regions: California, West, South Central, Great Lakes, Midsouth, Southeast, Northeast, Plains), and by city-level market.

B-PURPOSE OF THE STUDY.

The study comprises an exploratory analysis and its conclusions, a Machine Learning model to predict the average avocado price, and finally forecasts of both average price and sales volume by type (organic and conventional), obtained with ARIMA time series models over a horizon of 96 weeks.

1. EXPLORATORY ANALYSIS.

library(tidyverse)
library(tseries)
library(lubridate)
library(scales)
library(zoo)
library(caret)
library(forecast)
avoc <- read.csv("avocado-2020.csv")
str(avoc)
## 'data.frame':    30021 obs. of  13 variables:
##  $ date         : chr  "2015-01-04" "2015-01-04" "2015-01-04" "2015-01-04" ...
##  $ average_price: num  1.22 1.79 1 1.76 1.08 1.29 1.01 1.64 1.02 1.83 ...
##  $ total_volume : num  40873 1374 435021 3847 788025 ...
##  $ X4046        : num  2819.5 57.4 364302.4 1500.2 53987.3 ...
##  $ X4225        : num  28287 154 23821 938 552906 ...
##  $ X4770        : num  49.9 0 82.2 0 39995 ...
##  $ total_bags   : num  9716 1163 46816 1408 141137 ...
##  $ small_bags   : num  9187 1163 16707 1071 137146 ...
##  $ large_bags   : num  530 0 30109 337 3991 ...
##  $ xlarge_bags  : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ type         : chr  "conventional" "organic" "conventional" "organic" ...
##  $ year         : int  2015 2015 2015 2015 2015 2015 2015 2015 2015 2015 ...
##  $ geography    : chr  "Albany" "Albany" "Atlanta" "Atlanta" ...
summary(avoc)
##      date           average_price    total_volume          X4046         
##  Length:30021       Min.   :0.440   Min.   :      85   Min.   :       0  
##  Class :character   1st Qu.:1.110   1st Qu.:   14299   1st Qu.:     783  
##  Mode  :character   Median :1.350   Median :  124205   Median :   10523  
##                     Mean   :1.391   Mean   :  939255   Mean   :  299107  
##                     3rd Qu.:1.630   3rd Qu.:  489803   3rd Qu.:  115156  
##                     Max.   :3.250   Max.   :63716144   Max.   :22743616  
##      X4225              X4770             total_bags         small_bags      
##  Min.   :       0   Min.   :      0.0   Min.   :       0   Min.   :       0  
##  1st Qu.:    2814   1st Qu.:      0.0   1st Qu.:    8374   1st Qu.:    5956  
##  Median :   24567   Median :    186.8   Median :   50391   Median :   34255  
##  Mean   :  284901   Mean   :  21629.4   Mean   :  333534   Mean   :  232126  
##  3rd Qu.:  140947   3rd Qu.:   5424.2   3rd Qu.:  159174   3rd Qu.:  112938  
##  Max.   :20470573   Max.   :2546439.1   Max.   :31689189   Max.   :20550407  
##    large_bags        xlarge_bags          type                year     
##  Min.   :       0   Min.   :      0   Length:30021       Min.   :2015  
##  1st Qu.:     352   1st Qu.:      0   Class :character   1st Qu.:2016  
##  Median :    5171   Median :      0   Mode  :character   Median :2017  
##  Mean   :   95185   Mean   :   6223                      Mean   :2017  
##  3rd Qu.:   36068   3rd Qu.:    560                      3rd Qu.:2019  
##  Max.   :13327601   Max.   :1022565                      Max.   :2020  
##   geography        
##  Length:30021      
##  Class :character  
##  Mode  :character  
##                    
##                    
## 

We convert the date variable to Date type, which we will use later in the analysis.

avoc$date <- as.Date(avoc$date)

We check whether the dataset contains any NA values:

colSums(is.na(avoc))
##          date average_price  total_volume         X4046         X4225 
##             0             0             0             0             0 
##         X4770    total_bags    small_bags    large_bags   xlarge_bags 
##             0             0             0             0             0 
##          type          year     geography 
##             0             0             0

There are no NA values, and the structure shows no anomalous values.
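As a small illustration of the check above, the following sketch runs the same colSums(is.na(...)) call on a made-up three-row data frame, so that a non-zero count is visible. The toy values are hypothetical and are not part of the avocado data.

```r
# Toy data frame (hypothetical values) with one missing price
toy <- data.frame(
  date = as.Date(c("2015-01-04", "2015-01-11", "2015-01-18")),
  average_price = c(1.22, NA, 1.10),
  type = c("conventional", "organic", "organic")
)

# is.na() returns a logical matrix; colSums() counts the TRUE values per column
na_counts <- colSums(is.na(toy))
print(na_counts)
```

A column with a non-zero count would need imputation or row removal before the time series steps, since ts() expects a complete, regularly spaced sequence.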

For the analysis we split the data into three separate blocks that coexist in the dataset: regions, cities, and Total U.S.

Data by region.

avoc_region <- avoc %>%
  filter(geography %in% c("California", "West", "South Central", "Great Lakes",
                          "Midsouth", "Southeast", "Northeast", "Plains"))

Data by city.

avoc_city <- avoc %>%
  filter(!(geography %in% c("California", "West", "South Central", "Great Lakes",
                            "Midsouth", "Southeast", "Northeast", "Plains",
                            "Total U.S.")))

Total U.S. data.

avoc_total_us <- avoc %>% filter(geography == "Total U.S.")
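The three filters above are meant to partition the rows, so that each row lands in exactly one block. A minimal sketch of that check follows, written in base R so it runs without extra packages. The toy data frame and the helper name regions are introduced here for illustration only.

```r
# The eight region labels used in the filters
regions <- c("California", "West", "South Central", "Great Lakes",
             "Midsouth", "Southeast", "Northeast", "Plains")

# Toy rows (hypothetical volumes), one or two per block
toy <- data.frame(
  geography = c("Albany", "West", "Total U.S.", "Plains", "Atlanta"),
  total_volume = c(10, 200, 1000, 150, 20)
)

toy_region   <- toy[toy$geography %in% regions, ]
toy_total_us <- toy[toy$geography == "Total U.S.", ]
toy_city     <- toy[!(toy$geography %in% c(regions, "Total U.S.")), ]

# Every row should fall in exactly one of the three blocks
partition_ok <- nrow(toy_region) + nrow(toy_total_us) + nrow(toy_city) == nrow(toy)
print(partition_ok)
```

Running the same row-count comparison on avoc_region, avoc_city and avoc_total_us would catch a misspelled geography label, which would otherwise be silently classified as a city.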

We now group the "Total U.S." data to compute the average avocado price for each year and month, by type.

avocado2 <- avoc_total_us %>%
  mutate(month = month(date)) %>%          # month number extracted with lubridate
  group_by(year, month, type) %>%
  summarise(totalaverage = mean(average_price))

Next we merge the year and month columns of that data frame into a single variable, to simplify the data.

avocado3 <- unite(avocado2,Fecha,c(1:2), sep="-")

Here we convert that variable from character to a year-month date format.

avocado3$Periodo <- as.yearmon(as.character(avocado3$Fecha), "%Y-%m")

Now that the new Periodo variable holds the date in the proper format, we can drop the Fecha variable, which is no longer needed.

avocado3$Fecha <- NULL
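The steps above (extract the month, average per year and month, build a year-month label) can be sketched in base R on toy data. Here format(date, "%Y-%m") stands in for the month() + unite() + as.yearmon() sequence: it produces a plain character label rather than a zoo yearmon object, and the prices are hypothetical.

```r
# Toy weekly prices (hypothetical values) covering two months
toy <- data.frame(
  date = as.Date(c("2015-01-04", "2015-01-11", "2015-02-01", "2015-02-08")),
  average_price = c(1.00, 1.20, 0.90, 1.10)
)

# Year-month label built directly from the Date column
toy$period <- format(toy$date, "%Y-%m")

# Mean price per period
monthly <- aggregate(average_price ~ period, data = toy, FUN = mean)
print(monthly)
```

The yearmon class used in the report has the advantage of sorting and plotting as a date, whereas the character label here only sorts correctly because the year comes first and the month is zero-padded.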

We now plot the average avocado price by year, month, and type at the national level.

ggplot(avocado3, aes(factor(Periodo), totalaverage, group = type, col = type)) +
  geom_line() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  labs(x = 'year/month', y = 'Average price') +
  ggtitle('AVERAGE PRICE AVOCADOS TOTAL US BY TYPE 2015-2020')

To see the price differences between organic and conventional avocados for "Total US", we draw a boxplot that compares the distribution (median and quartiles) of their average prices.

ggplot(avoc_total_us, aes(x = type, y = average_price, fill = type)) +
  geom_boxplot() +
  scale_y_continuous(limits = c(0, 2.5), breaks = seq(0.0, 2.5, by = 0.2)) +
  labs(x = "Avocado type", y = "Average price") +
  ggtitle("AVERAGE PRICE DIFFERENCES BETWEEN CONVENTIONAL AND ORGANIC AVOCADO") +
  theme_minimal()

We also compare regions and cities in terms of average price by avocado type, with the corresponding plots.

avocado_region_pr <-avoc_region %>% 
  group_by(year,type,geography) %>%
  summarise(totalaverage = mean(`average_price`))%>%
  group_by(geography,type)%>%
  summarise(totalavg = mean(`totalaverage`))
avocado_city_pr <-avoc_city %>% 
  group_by(year,type,geography) %>%
  summarise(totalaverage = mean(`average_price`))%>%
  group_by(geography,type)%>%
  summarise(totalavg = mean(`totalaverage`))
ggplot(avocado_region_pr,aes(geography,totalavg,fill=type))+
  geom_bar(position = "dodge",
           stat="identity") + scale_y_continuous(labels=scales::comma)+
  labs(x='Regions', y='Averageprice')+
  ggtitle('AVERAGE PRICE USA AVOCADO BY REGION AND TYPE 2015-2020') 

ggplot(avocado_city_pr, aes(geography, totalavg, fill = type)) +
  geom_bar(width = 0.8, position = position_dodge(0.5), stat = "identity") +
  scale_y_continuous(labels = scales::comma) +
  coord_flip() +
  labs(x = 'Cities', y = 'Average price') +
  ggtitle('AVERAGE PRICE USA AVOCADO BY CITY AND TYPE 2015-2020')
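The region and city summaries above average in two stages: first the mean per year, then the mean of those yearly means. This weights every year equally regardless of how many weeks it contains, which matters here because 2020 is only partly covered (the forecasts further on start at index 2020.346). The sketch below, on hypothetical toy prices, shows how the two-stage mean can differ from the pooled mean.

```r
# Toy data: three observations in 2019, only one in 2020 (hypothetical values)
toy <- data.frame(
  year = c(2019, 2019, 2019, 2020),
  average_price = c(1.0, 1.0, 1.0, 2.0)
)

# Stage 1: mean per year; stage 2: mean of the yearly means
yearly <- aggregate(average_price ~ year, data = toy, FUN = mean)
two_stage <- mean(yearly$average_price)

# Pooled mean over all rows, for comparison
pooled <- mean(toy$average_price)

print(c(two_stage = two_stage, pooled = pooled))
```

Neither choice is wrong; the two-stage version answers "what is the typical yearly average", while the pooled version answers "what is the typical weekly price".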

We now analyze annual avocado sales volume by year and type.

avocado5 <- avoc_total_us %>% 
  group_by(year,type) %>%
  summarise(totalVolume = sum(`total_volume`)) %>% 
  mutate(Percentage = totalVolume/sum(totalVolume),
         percentLabel = scales::percent(Percentage, 
                                        accuracy = 0.01,
                                        decimal.mark = "."))
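In the pipeline above, summarise() drops only the last grouping level, so the result is still grouped by year and sum(totalVolume) inside mutate() is the yearly total. The share of each type is therefore computed within its year. A base R sketch of the same calculation on hypothetical volumes, using ave() for the per-year total:

```r
# Toy yearly volumes by type (hypothetical values)
toy <- data.frame(
  year = c(2019, 2019, 2020, 2020),
  type = c("conventional", "organic", "conventional", "organic"),
  totalVolume = c(960, 40, 485, 15)
)

# ave() applies sum() within each year and returns a vector aligned with the rows
toy$Percentage <- toy$totalVolume / ave(toy$totalVolume, toy$year, FUN = sum)

# Label with two decimals, similar in spirit to scales::percent(accuracy = 0.01)
toy$percentLabel <- sprintf("%.2f%%", 100 * toy$Percentage)
print(toy)
```

If the grouping were removed before mutate(), the denominators would become the grand total over all years and the labels would no longer add up to 100% per year.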

We then plot avocado sales volume by type and year at the national level, labelling each bar with its percentage share.

ggplot(data = avocado5, aes(x = year, y = totalVolume, fill = type)) +
  geom_bar(position = "dodge2", stat = "identity") +
  coord_flip() +
  scale_y_continuous(labels = scales::comma) +
  geom_text(aes(label = percentLabel), size = 3,
            position = position_dodge2(width = 1)) +
  labs(x = 'Year', y = 'Sales volume') +
  ggtitle('ANNUAL VOLUME OF SALES AVOCADO USA 2015-2020')

We use the regional and city breakdowns to visualize avocado sales volume by size.

avocado_region_vol <-avoc_region %>% 
  group_by(geography) %>%
  mutate(rest=total_volume-X4046-X4225-X4770) %>%
  summarise(Hass_small = sum(`X4046`), Hass_large= sum(`X4225`), Hass_extralarge= sum(`X4770`),rest= sum(`rest`)) %>%
  gather(type_size,volume_size,c(Hass_small, Hass_large, Hass_extralarge,rest)) %>%
  group_by(geography,type_size) %>%
  summarise(volume_t=sum(`volume_size`)) %>%
  mutate(Percentage = volume_t/sum(volume_t),
         percentLabel = scales::percent(Percentage, 
                                        accuracy = 0.01,
                                        decimal.mark = "."))
avocado_city_vol <-avoc_city %>% 
  group_by(geography) %>%
  mutate(rest=total_volume-X4046-X4225-X4770) %>%
  summarise(Hass_small = sum(`X4046`), Hass_large= sum(`X4225`), Hass_extralarge= sum(`X4770`),rest= sum(`rest`)) %>%
  gather(type_size,volume_size,c(Hass_small, Hass_large, Hass_extralarge,rest)) %>%
  group_by(geography,type_size) %>%
  summarise(volume_t=sum(`volume_size`)) %>%
  mutate(Percentage = volume_t/sum(volume_t),
         percentLabel = scales::percent(Percentage, 
                                        accuracy = 0.01,
                                        decimal.mark = "."))
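The two pipelines above follow the same recipe: compute rest as the volume not covered by the three size columns, sum each column per geography, and turn the sums into shares within each geography. A base R sketch on hypothetical toy volumes is given here. As a side observation, in the summary() output the mean of total_volume minus the three size columns (about 333,618) is close to the mean of total_bags (333,534), which suggests that rest largely corresponds to bagged sales; this is an inference from the summary, not something the dataset states.

```r
# Toy rows (hypothetical volumes): two weeks for West, one for Plains
toy <- data.frame(
  geography = c("West", "West", "Plains"),
  total_volume = c(100, 200, 50),
  X4046 = c(40, 80, 10),
  X4225 = c(30, 60, 20),
  X4770 = c(10, 20, 5)
)

# Volume not attributed to any of the three size codes
toy$rest <- toy$total_volume - toy$X4046 - toy$X4225 - toy$X4770

# Sum every size column per geography
sums <- aggregate(cbind(X4046, X4225, X4770, rest) ~ geography,
                  data = toy, FUN = sum)

# Shares within each geography (each row sums to 1)
size_cols <- c("X4046", "X4225", "X4770", "rest")
shares <- sums[size_cols] / rowSums(sums[size_cols])
rownames(shares) <- sums$geography
print(round(shares, 2))
```

The report keeps the data in long format (one row per geography and size) because that is what ggplot2 needs for stacked bars; the wide table here is only easier to read.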

We plot the sales volume breakdown by region for the different avocado sizes.

ggplot(data = avocado_region_vol,
       aes(reorder(geography,-volume_t),volume_t,fill=type_size))+
  geom_bar(position = "stack",
           stat="identity") + scale_y_continuous(labels=scales::comma)+
  geom_text(aes(label = percentLabel), 
            size = 2.5,
            position =position_stack(vjust = 0.7))+labs(x='Regions', y='Sales Volume')+
  ggtitle('US AVOCADO SALES VOLUME BY REGION AND SIZE 2015-2020') 

We plot the same breakdown, this time by city.

ggplot(data = avocado_city_vol,
       aes(reorder(geography,volume_t),volume_t,fill=type_size))+
  geom_bar(position = "stack",
           stat="identity") +coord_flip()+ scale_y_continuous(labels=scales::comma)+
  labs(x='Cities', y='Sales Volume')+
  ggtitle('US AVOCADO SALES VOLUME BY CITY AND SIZE 2015-2020')

CONCLUSIONS

Overall, US sales volume by avocado type (organic and conventional) is far higher for conventional: organic accounts for roughly 4% of annual volume, against approximately 96% for conventional.

The average price of organic avocados in the USA is approximately 50 cents higher than that of conventional avocados.

By region, prices are highest in California and Northeast, especially for organic avocados, whose average peaks at between 1.75 and 1.80 dollars.

By city, average prices stand out in San Francisco and Hartford/Springfield, above all for organic avocados, which comfortably exceed 2 dollars. These cities are also among those with the highest conventional prices, at approximately 1.40 dollars.

As for annual US sales volume, 2018 and 2019 were the strongest years: around 2,000 million in sales in 2018, and comfortably above that figure in 2019.

By region, West, South Central, and California stand out above the rest, each exceeding 1,750 million units sold.

By city, Los Angeles, New York, Dallas/FtWorth, Houston, and Phoenix/Tucson stand out, with Los Angeles far ahead of the rest at close to 900 million in sales. In general, the small Hass is the size that sells most in the city markets.

2. ARIMA MODEL FOR FORECASTING AVERAGE PRICE AND SALES VOLUME BY AVOCADO TYPE.

We now fit ARIMA models to forecast for the two avocado types in our data, organic and conventional:

avoc_arima <- avoc %>% select(date,average_price,total_volume,type)
organic <- avoc_arima %>%
  filter(type == "organic") %>%
  group_by(date) %>%
  summarise(avgpr_org= mean(average_price), avgvol_org=mean(total_volume))
conventional <- avoc_arima %>%
  filter(type == "conventional") %>%
  group_by(date) %>%
  summarise(avgpr_conv= mean(average_price), avgvol_conv=mean(total_volume))

We build weekly time series from these summaries, in which each value is the mean over all geography rows (cities, regions, and Total U.S.) for that date and type:

organic_avgpr <- ts(organic$avgpr_org, start=c(2015,1), frequency=52)
organic_avgvol <- ts(organic$avgvol_org,start=c(2015,1), frequency=52)
conventional_avgpr <- ts(conventional$avgpr_conv, start=c(2015,1), frequency=52)
conventional_avgvol <- ts(conventional$avgvol_conv, start=c(2015,1), frequency=52)
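With start = c(2015, 1) and frequency = 52, the time index of each series is a nominal week counter expressed as a decimal year, 2015 + (k - 1)/52 for the k-th observation. The sketch below reproduces the index of the first forecast step, 2020.346, under the assumption that each series has 278 weekly observations; that length is inferred from the forecast index and is not printed anywhere in the report.

```r
# Assumption: 278 weekly observations, inferred from the first forecast index
n_weeks <- 278

# Placeholder values; only the time index matters here
x <- ts(seq_len(n_weeks), start = c(2015, 1), frequency = 52)

# tsp() returns start, end and frequency of the series
last_obs <- tsp(x)[2]                 # 2015 + 277/52
first_forecast <- last_obs + 1 / 52   # 2015 + 278/52

print(round(c(last_obs = last_obs, first_forecast = first_forecast), 3))
```

Because 52 weeks cover 364 days, this index drifts slightly against the calendar over several years, so the decimal labels are best read as week positions rather than exact dates.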

We then forecast each series with the best ARIMA model found, over a horizon of 96 weeks with 95% prediction intervals.
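To illustrate how a 95% interval behaves over a long horizon, the sketch below fits a seasonal ARIMA with base R's arima() on the built-in monthly AirPassengers series. It is a stand-in, not the report's method: the model order is fixed by hand instead of being searched by auto.arima(), the data are not the avocado data, and the seasonal period is 12 rather than 52.

```r
# Built-in monthly series; the log scale stabilizes its variance
y <- log(AirPassengers)

# Seasonal ARIMA(0,1,1)(0,1,1)[12] with the order fixed by hand
fit <- arima(y, order = c(0, 1, 1),
             seasonal = list(order = c(0, 1, 1), period = 12))

# 24-step forecast: predict() returns point forecasts and standard errors
p <- predict(fit, n.ahead = 24)

# Two-sided 95% interval: point forecast +/- z * standard error
z <- qnorm(0.975)
lower <- p$pred - z * p$se
upper <- p$pred + z * p$se

# Interval width at horizons 1, 12 and 24
width <- as.numeric(upper - lower)
print(round(width[c(1, 12, 24)], 3))
```

With differencing in the model, the forecast standard error grows with the horizon, so the interval keeps widening. The same effect explains why the intervals in the 96-week tables of this section become very wide towards the end.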

2.1 AVERAGE PRICE FORECAST FOR ORGANIC AVOCADOS

arima_avg_org <- auto.arima(organic_avgpr, d = 1, D = 1, trace = TRUE, stepwise = FALSE, approximation = FALSE)
forecast_mod <- forecast(arima_avg_org, h = 96, level = 95)

Forecast plot

plot(forecast_mod)

Forecast plot with the fitted values overlaid.

autoplot(forecast_mod) + autolayer(fitted(forecast_mod), series = "Fit")

Summary of results

print(summary(forecast_mod))
## 
## Forecast method: ARIMA(1,1,3)(1,1,0)[52]
## 
## Model Information:
## Series: organic_avgpr 
## ARIMA(1,1,3)(1,1,0)[52] 
## 
## Coefficients:
##           ar1     ma1      ma2      ma3     sar1
##       -0.7566  0.8566  -0.1082  -0.3570  -0.5424
## s.e.   0.0895  0.1023   0.0866   0.0669   0.0588
## 
## sigma^2 = 0.004938:  log likelihood = 271.37
## AIC=-530.75   AICc=-530.36   BIC=-510.25
## 
## Error measures:
##                        ME       RMSE        MAE        MPE     MAPE      MASE
## Training set 0.0003346908 0.06251377 0.04371864 -0.0210524 2.698541 0.2486772
##                     ACF1
## Training set -0.01280563
## 
## Forecasts:
##          Point Forecast       Lo 95    Hi 95
## 2020.346       1.577947 1.440214655 1.715679
## 2020.365       1.496456 1.291695903 1.701216
## 2020.385       1.488244 1.247726105 1.728763
## 2020.404       1.522809 1.263772246 1.781846
## 2020.423       1.548885 1.263872071 1.833899
## 2020.442       1.570498 1.267880783 1.873115
## 2020.462       1.636735 1.313109589 1.960360
## 2020.481       1.623589 1.283396830 1.963782
## 2020.500       1.641168 1.282917420 1.999420
## 2020.519       1.675402 1.301604052 2.049199
## 2020.538       1.672261 1.282348881 2.062174
## 2020.558       1.644330 1.239812446 2.048848
## 2020.577       1.579082 1.159834919 1.998330
## 2020.596       1.589457 1.156444783 2.022469
## 2020.615       1.596490 1.149796675 2.043183
## 2020.635       1.594098 1.134381421 2.053815
## 2020.654       1.557532 1.084965998 2.030097
## 2020.673       1.575469 1.090530508 2.060408
## 2020.692       1.601257 1.104153618 2.098361
## 2020.712       1.559131 1.050226882 2.068035
## 2020.731       1.543228 1.022736082 2.063719
## 2020.750       1.524066 0.992279902 2.055853
## 2020.769       1.480038 0.937162145 2.022914
## 2020.788       1.457599 0.903877762 2.011321
## 2020.808       1.484905 0.920530151 2.049280
## 2020.827       1.451998 0.877178961 2.026817
## 2020.846       1.449425 0.864339695 2.034511
## 2020.865       1.431333 0.836164730 2.026502
## 2020.885       1.424450 0.819361481 2.029538
## 2020.904       1.481809 0.866964993 2.096654
## 2020.923       1.531282 0.906830887 2.155732
## 2020.942       1.442919 0.809009138 2.076828
## 2020.962       1.376115 0.732884255 2.019346
## 2020.981       1.382902 0.730484521 2.035320
## 2021.000       1.375033 0.713555539 2.036511
## 2021.019       1.375736 0.705320513 2.046151
## 2021.038       1.313228 0.633992418 1.992463
## 2021.058       1.319250 0.631307952 2.007191
## 2021.077       1.359920 0.663379960 2.056460
## 2021.096       1.341913 0.636880002 2.046946
## 2021.115       1.360175 0.646749772 2.073600
## 2021.135       1.386827 0.665108002 2.108547
## 2021.154       1.406709 0.676789603 2.136629
## 2021.173       1.378643 0.640614868 2.116672
## 2021.192       1.424179 0.678129833 2.170229
## 2021.212       1.446175 0.692190034 2.200160
## 2021.231       1.451034 0.689195513 2.212872
## 2021.250       1.580730 0.811118718 2.350341
## 2021.269       1.647860 0.870554387 2.425166
## 2021.288       1.608278 0.823352169 2.393204
## 2021.308       1.565159 0.772687095 2.357631
## 2021.327       1.598186 0.798238454 2.398133
## 2021.346       1.593576 0.775318396 2.411833
## 2021.365       1.480450 0.642962839 2.317937
## 2021.385       1.478099 0.624153457 2.332044
## 2021.404       1.555287 0.687722835 2.422851
## 2021.423       1.606895 0.724058013 2.489731
## 2021.442       1.636773 0.740320949 2.533225
## 2021.462       1.707361 0.796461617 2.618261
## 2021.481       1.721435 0.797088350 2.645781
## 2021.500       1.738720 0.800543288 2.676896
## 2021.519       1.776382 0.825007236 2.727757
## 2021.538       1.762791 0.798077771 2.727505
## 2021.558       1.715558 0.737927199 2.693188
## 2021.577       1.605345 0.614786950 2.595903
## 2021.596       1.616621 0.613436120 2.619806
## 2021.615       1.614817 0.599062084 2.630573
## 2021.635       1.586503 0.558405656 2.614599
## 2021.654       1.513521 0.473172977 2.553869
## 2021.673       1.530468 0.478053033 2.582882
## 2021.692       1.552313 0.487937022 2.616689
## 2021.712       1.479298 0.403117452 2.555479
## 2021.731       1.461173 0.373297919 2.549048
## 2021.750       1.434526 0.335093730 2.533957
## 2021.769       1.380830 0.269951747 2.491708
## 2021.788       1.363531 0.241330265 2.485731
## 2021.808       1.448245 0.314830154 2.581660
## 2021.827       1.416111 0.271595532 2.560627
## 2021.846       1.426987 0.271474447 2.582500
## 2021.865       1.394300 0.227896366 2.560705
## 2021.885       1.378696 0.201499316 2.555892
## 2021.904       1.443413 0.255524470 2.631302
## 2021.923       1.480516 0.282028702 2.679003
## 2021.942       1.365351 0.156358523 2.574343
## 2021.962       1.317505 0.098097904 2.536912
## 2021.981       1.321515 0.091781581 2.551248
## 2022.000       1.295013 0.055039093 2.534986
## 2022.019       1.329384 0.079254568 2.579514
## 2022.038       1.279788 0.019583265 2.539992
## 2022.058       1.290278 0.020078556 2.560477
## 2022.077       1.319234 0.039118283 2.599350
## 2022.096       1.297032 0.007076133 2.586989
## 2022.115       1.347274 0.047552007 2.646997
## 2022.135       1.388800 0.079384878 2.698216
## 2022.154       1.434560 0.115523040 2.753598
## 2022.173       1.394999 0.066409815 2.723589
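As a consistency check on the table above, the one-step-ahead 95% interval of an ARIMA model is the point forecast plus or minus qnorm(0.975) times the square root of the innovation variance. The sketch below recomputes the first row from the rounded sigma^2 and point forecast printed in the summary; small differences in the last digits come from that rounding.

```r
# Values copied from the printed summary of the organic price model
sigma2 <- 0.004938   # innovation variance (sigma^2)
point  <- 1.577947   # first point forecast, index 2020.346

# Half-width of the two-sided 95% interval at horizon 1
half_width <- qnorm(0.975) * sqrt(sigma2)

interval <- c(lower = point - half_width, upper = point + half_width)
print(round(interval, 4))
```

The result agrees with the printed Lo 95 and Hi 95 of the first row (1.440215 and 1.715679) to about four decimals. For later rows the standard error also accumulates the effect of earlier forecast errors, so this shortcut applies only to the first step.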

Forecast data frame

pronostico <- as.data.frame(forecast_mod)
head(pronostico,20)
##          Point Forecast    Lo 95    Hi 95
## 2020.346       1.577947 1.440215 1.715679
## 2020.365       1.496456 1.291696 1.701216
## 2020.385       1.488244 1.247726 1.728763
## 2020.404       1.522809 1.263772 1.781846
## 2020.423       1.548885 1.263872 1.833899
## 2020.442       1.570498 1.267881 1.873115
## 2020.462       1.636735 1.313110 1.960360
## 2020.481       1.623589 1.283397 1.963782
## 2020.500       1.641168 1.282917 1.999420
## 2020.519       1.675402 1.301604 2.049199
## 2020.538       1.672261 1.282349 2.062174
## 2020.558       1.644330 1.239812 2.048848
## 2020.577       1.579082 1.159835 1.998330
## 2020.596       1.589457 1.156445 2.022469
## 2020.615       1.596490 1.149797 2.043183
## 2020.635       1.594098 1.134381 2.053815
## 2020.654       1.557532 1.084966 2.030097
## 2020.673       1.575469 1.090531 2.060408
## 2020.692       1.601257 1.104154 2.098361
## 2020.712       1.559131 1.050227 2.068035

2.2 SALES VOLUME FORECAST FOR ORGANIC AVOCADOS

arima_vol_org <- auto.arima(organic_avgvol, d = 1, D = 1, trace = TRUE, stepwise = FALSE, approximation = FALSE)
forecast_mod <- forecast(arima_vol_org, h = 96, level = 95)

Forecast plot

plot(forecast_mod)

Forecast plot with the fitted values overlaid.

autoplot(forecast_mod) + autolayer(fitted(forecast_mod), series = "Fit")

Summary of results

print(summary(forecast_mod))
## 
## Forecast method: ARIMA(0,1,1)(1,1,1)[52]
## 
## Model Information:
## Series: organic_avgvol 
## ARIMA(0,1,1)(1,1,1)[52] 
## 
## Coefficients:
##           ma1     sar1     sma1
##       -0.5292  -0.3664  -0.3371
## s.e.   0.0636   0.1434   0.1770
## 
## sigma^2 = 58544490:  log likelihood = -2342.95
## AIC=4693.9   AICc=4694.08   BIC=4707.56
## 
## Error measures:
##                    ME     RMSE      MAE        MPE     MAPE      MASE
## Training set 215.0373 6837.496 4508.321 -0.2774855 6.764423 0.2921328
##                    ACF1
## Training set 0.03536086
## 
## Forecasts:
##          Point Forecast     Lo 95    Hi 95
## 2020.346      115025.07 100028.09 130022.0
## 2020.365      112699.48  96123.72 129275.2
## 2020.385      112337.76  94321.03 130354.5
## 2020.404      112449.65  93098.96 131800.3
## 2020.423      121154.55 100556.11 141753.0
## 2020.442      124543.78 102768.97 146318.6
## 2020.462      114196.84  91306.03 137087.6
## 2020.481      116308.21  92353.34 140263.1
## 2020.500      107389.09  82415.46 132362.7
## 2020.519      105249.82  79297.39 131202.3
## 2020.538      107287.20  80391.56 134182.8
## 2020.558      107132.65  79325.78 134939.5
## 2020.577      107821.80  79132.62 136511.0
## 2020.596      105180.53  75635.39 134725.7
## 2020.615      106167.87  75790.87 136544.9
## 2020.635      105200.47  74013.79 136387.1
## 2020.654      106320.44  74344.58 138296.3
## 2020.673      108536.49  75790.47 141282.5
## 2020.692      110827.99  77329.51 144326.5
## 2020.712      108929.88  74695.47 143164.3
## 2020.731      107019.88  72065.03 141974.7
## 2020.750      105477.32  69816.58 141138.1
## 2020.769      112311.43  75958.51 148664.4
## 2020.788      111745.92  74713.75 148778.1
## 2020.808      107624.32  69925.14 145323.5
## 2020.827      106603.78  68249.19 144958.4
## 2020.846      103478.55  64479.55 142477.5
## 2020.865      102137.59  62504.67 141770.5
## 2020.885      104336.42  64079.56 144593.3
## 2020.904      105333.66  64462.38 146204.9
## 2020.923       98466.75  56990.15 139943.3
## 2020.942      102899.53  60826.32 144972.7
## 2020.962      115621.50  72960.02 158283.0
## 2020.981      118343.07  75101.33 161584.8
## 2021.000      121081.35  77267.06 164895.6
## 2021.019      117307.54  72928.09 161687.0
## 2021.038      120073.82  75136.31 165011.3
## 2021.058      119385.87  73897.15 164874.6
## 2021.077      111065.88  65032.55 157099.2
## 2021.096      118171.73  71600.15 164743.3
## 2021.115      125502.45  78398.78 172606.1
## 2021.135      119657.45  72027.63 167287.3
## 2021.154      131635.80  83485.58 179786.0
## 2021.173      122227.96  73562.90 170893.0
## 2021.192      121214.36  72039.86 170388.9
## 2021.212      126582.00  76903.27 176260.7
## 2021.231      134261.87  84083.99 184439.8
## 2021.250      122761.09  72088.96 173433.2
## 2021.269      118648.99  67487.40 169810.6
## 2021.288      119488.23  67841.81 171134.6
## 2021.308      125282.19  73155.46 177408.9
## 2021.327      121216.65  68613.99 173819.3
## 2021.346      125762.87  71916.14 179609.6
## 2021.365      126129.06  71509.79 180748.3
## 2021.385      124326.33  68945.30 179707.4
## 2021.404      122745.78  66613.31 178878.2
## 2021.423      126588.76  69714.79 183462.7
## 2021.442      132213.38  74607.44 189819.3
## 2021.462      124053.70  65724.99 182382.4
## 2021.481      122473.74  63431.11 181516.4
## 2021.500      115941.33  56193.30 175689.4
## 2021.519      113863.88  53418.68 174309.1
## 2021.538      117112.54  55978.12 178247.0
## 2021.558      117207.86  55391.91 179023.8
## 2021.577      121582.89  59092.84 184072.9
## 2021.596      118205.10  55048.14 181362.1
## 2021.615      117640.90  53824.01 181457.8
## 2021.635      119456.20  54986.13 183926.3
## 2021.654      121591.52  56474.81 186708.2
## 2021.673      124122.70  58365.72 189879.7
## 2021.692      125067.17  58676.10 191458.2
## 2021.712      123540.55  56521.38 190559.7
## 2021.731      121643.90  54002.46 189285.3
## 2021.750      119514.82  51256.79 187772.9
## 2021.769      128837.06  59967.96 197706.2
## 2021.788      126801.06  57326.26 196275.9
## 2021.808      120064.51  49989.25 190139.8
## 2021.827      119892.37  49221.74 190563.0
## 2021.846      113914.24  42653.22 185175.2
## 2021.865      113455.85  41609.30 185302.4
## 2021.885      115547.57  43120.22 187974.9
## 2021.904      114428.80  41425.26 187432.3
## 2021.923      113571.89  39996.69 187147.1
## 2021.942      118545.90  44403.43 192688.4
## 2021.962      129928.04  55222.62 204633.5
## 2021.981      137077.77  61813.60 212341.9
## 2022.000      140667.54  64848.73 216486.4
## 2022.019      131667.40  55298.03 208036.8
## 2022.038      136982.68  60066.67 213898.7
## 2022.058      133274.87  55816.09 210733.7
## 2022.077      128157.47  50159.69 206155.2
## 2022.096      135131.41  56598.34 213664.5
## 2022.115      142197.13  63132.38 221261.9
## 2022.135      134395.30  54802.43 213988.2
## 2022.154      151216.18  71098.67 231333.7
## 2022.173      137871.98  57233.24 218510.7

Forecast data frame

pronostico <- as.data.frame(forecast_mod)
head(pronostico,20)
##          Point Forecast     Lo 95    Hi 95
## 2020.346       115025.1 100028.09 130022.0
## 2020.365       112699.5  96123.72 129275.2
## 2020.385       112337.8  94321.03 130354.5
## 2020.404       112449.6  93098.96 131800.3
## 2020.423       121154.5 100556.11 141753.0
## 2020.442       124543.8 102768.97 146318.6
## 2020.462       114196.8  91306.03 137087.6
## 2020.481       116308.2  92353.34 140263.1
## 2020.500       107389.1  82415.46 132362.7
## 2020.519       105249.8  79297.39 131202.3
## 2020.538       107287.2  80391.56 134182.8
## 2020.558       107132.7  79325.78 134939.5
## 2020.577       107821.8  79132.62 136511.0
## 2020.596       105180.5  75635.39 134725.7
## 2020.615       106167.9  75790.87 136544.9
## 2020.635       105200.5  74013.79 136387.1
## 2020.654       106320.4  74344.58 138296.3
## 2020.673       108536.5  75790.47 141282.5
## 2020.692       110828.0  77329.51 144326.5
## 2020.712       108929.9  74695.47 143164.3

2.3 AVERAGE PRICE FORECAST FOR CONVENTIONAL AVOCADOS

arima_avg_con <- auto.arima(conventional_avgpr, d = 1, D = 1, trace = TRUE, stepwise = FALSE, approximation = FALSE)
forecast_mod <- forecast(arima_avg_con, h = 96, level = 95)

Forecast plot

plot(forecast_mod)

Forecast plot with the fitted values overlaid.

autoplot(forecast_mod) + autolayer(fitted(forecast_mod), series = "Fit")

Summary of results

print(summary(forecast_mod))
## 
## Forecast method: ARIMA(2,1,0)(1,1,0)[52]
## 
## Model Information:
## Series: conventional_avgpr 
## ARIMA(2,1,0)(1,1,0)[52] 
## 
## Coefficients:
##           ar1      ar2     sar1
##       -0.0637  -0.1299  -0.3890
## s.e.   0.0660   0.0657   0.0621
## 
## sigma^2 = 0.007142:  log likelihood = 233.92
## AIC=-459.84   AICc=-459.66   BIC=-446.18
## 
## Error measures:
##                        ME       RMSE        MAE        MPE     MAPE      MASE
## Training set 0.0001134446 0.07551935 0.05096262 -0.1561508 4.490172 0.2863638
##                     ACF1
## Training set 0.002419695
## 
## Forecasts:
##          Point Forecast       Lo 95    Hi 95
## 2020.346      1.1980365  1.03240146 1.363672
## 2020.365      1.0674377  0.84053532 1.294340
## 2020.385      1.1176644  0.85403148 1.381297
## 2020.404      1.1946086  0.89753600 1.491681
## 2020.423      1.2277776  0.89959596 1.555959
## 2020.442      1.2506181  0.89423126 1.607005
## 2020.462      1.2677212  0.88531103 1.650131
## 2020.481      1.2954900  0.88868887 1.702291
## 2020.500      1.3032944  0.87347351 1.733115
## 2020.519      1.2840478  0.83238294 1.735713
## 2020.538      1.2563611  0.78386205 1.728860
## 2020.558      1.2187499  0.72629674 1.711203
## 2020.577      1.1996607  0.68803098 1.711290
## 2020.596      1.1823876  0.65227462 1.712501
## 2020.615      1.1842049  0.63623182 1.732178
## 2020.635      1.1652413  0.59997214 1.730511
## 2020.654      1.1707584  0.58870685 1.752810
## 2020.673      1.2069861  0.60862267 1.805350
## 2020.692      1.1959111  0.58166884 1.810153
## 2020.712      1.1619179  0.53219709 1.791639
## 2020.731      1.1500515  0.50522356 1.794879
## 2020.750      1.1194155  0.45982635 1.779005
## 2020.769      1.0634763  0.38944918 1.737504
## 2020.788      1.0641115  0.37594915 1.752274
## 2020.808      1.0412312  0.33921823 1.743244
## 2020.827      1.0524395  0.33684396 1.768035
## 2020.846      1.0258273  0.29690228 1.754752
## 2020.865      1.0271673  0.28515220 1.769182
## 2020.885      0.9830441  0.22816581 1.737922
## 2020.904      1.0531850  0.28565914 1.820711
## 2020.923      1.1052834  0.32531499 1.885252
## 2020.942      1.0295779  0.23736234 1.821794
## 2020.962      0.9607918  0.15651556 1.765068
## 2020.981      0.9525726  0.13641386 1.768731
## 2021.000      0.9266939  0.09882324 1.754565
## 2021.019      1.0010743  0.16165508 1.840493
## 2021.038      0.8243621 -0.02644888 1.675173
## 2021.058      0.9261230  0.06407072 1.788175
## 2021.077      0.9954530  0.12230415 1.868602
## 2021.096      0.9953129  0.11120671 1.879419
## 2021.115      1.0474140  0.15248475 1.942343
## 2021.135      1.0491784  0.14355532 1.954802
## 2021.154      1.0682203  0.15202822 1.984412
## 2021.173      1.0855720  0.15893143 2.012213
## 2021.192      1.0852966  0.14832411 2.022269
## 2021.212      1.0534307  0.10623888 2.000622
## 2021.231      1.1352091  0.17790712 2.092511
## 2021.250      1.1974735  0.23016701 2.164780
## 2021.269      1.2260772  0.24886863 2.203286
## 2021.288      1.0841772  0.09716590 2.071189
## 2021.308      1.1304718  0.13375410 2.127189
## 2021.327      1.1844791  0.17814874 2.190810
## 2021.346      1.1991873  0.16464281 2.233732
## 2021.365      1.0833016  0.02273096 2.143872
## 2021.385      1.1130526  0.02974701 2.196358
## 2021.404      1.2017755  0.09587188 2.307679
## 2021.423      1.2388267  0.11047083 2.367182
## 2021.442      1.2618589  0.11154993 2.412168
## 2021.462      1.2937766  0.12196041 2.465593
## 2021.481      1.3152094  0.12226373 2.508155
## 2021.500      1.3351783  0.12146734 2.548889
## 2021.519      1.3125409  0.07841538 2.546666
## 2021.538      1.2753814  0.02117399 2.529589
## 2021.558      1.2237295 -0.05024354 2.497702
## 2021.577      1.1983066 -0.09513009 2.491743
## 2021.596      1.1685902 -0.14402150 2.481202
## 2021.615      1.1627846 -0.16872598 2.494295
## 2021.635      1.1468758 -0.20326917 2.497021
## 2021.654      1.1493101 -0.21921556 2.517836
## 2021.673      1.1697875 -0.21687520 2.556450
## 2021.692      1.1408324 -0.26373325 2.545398
## 2021.712      1.1084646 -0.31377850 2.530708
## 2021.731      1.1015027 -0.33820097 2.541206
## 2021.750      1.0687367 -0.38821818 2.525692
## 2021.769      1.0161887 -0.45781565 2.490193
## 2021.788      1.0199626 -0.47089609 2.510821
## 2021.808      0.9966900 -0.51083476 2.504215
## 2021.827      1.0026015 -0.52140699 2.526610
## 2021.846      0.9961395 -0.54417629 2.536455
## 2021.865      0.9945089 -0.56194345 2.550961
## 2021.885      0.9359246 -0.63649869 2.508348
## 2021.904      0.9977980 -0.59043557 2.586032
## 2021.923      1.0384901 -0.56539799 2.642378
## 2021.942      0.9459133 -0.67347791 2.565305
## 2021.962      0.9127474 -0.72199995 2.547495
## 2021.981      0.9152899 -0.73467068 2.565251
## 2022.000      0.8824049 -0.78262994 2.547440
## 2022.019      0.9594757 -0.72049811 2.639450
## 2022.038      0.7877521 -0.90702906 2.482533
## 2022.058      0.8994900 -0.80997021 2.608950
## 2022.077      0.9703053 -0.75370895 2.694320
## 2022.096      0.9656811 -0.77276538 2.704128
## 2022.115      1.0268345 -0.72592540 2.779594
## 2022.135      1.0345403 -0.73241710 2.801498
## 2022.154      1.0669222 -0.71411942 2.847964
## 2022.173      1.0773797 -0.71763576 2.872395

Forecast data frame

pronostico <- as.data.frame(forecast_mod)
head(pronostico,20)
##          Point Forecast     Lo 95    Hi 95
## 2020.346       1.198037 1.0324015 1.363672
## 2020.365       1.067438 0.8405353 1.294340
## 2020.385       1.117664 0.8540315 1.381297
## 2020.404       1.194609 0.8975360 1.491681
## 2020.423       1.227778 0.8995960 1.555959
## 2020.442       1.250618 0.8942313 1.607005
## 2020.462       1.267721 0.8853110 1.650131
## 2020.481       1.295490 0.8886889 1.702291
## 2020.500       1.303294 0.8734735 1.733115
## 2020.519       1.284048 0.8323829 1.735713
## 2020.538       1.256361 0.7838621 1.728860
## 2020.558       1.218750 0.7262967 1.711203
## 2020.577       1.199661 0.6880310 1.711290
## 2020.596       1.182388 0.6522746 1.712501
## 2020.615       1.184205 0.6362318 1.732178
## 2020.635       1.165241 0.5999721 1.730511
## 2020.654       1.170758 0.5887068 1.752810
## 2020.673       1.206986 0.6086227 1.805350
## 2020.692       1.195911 0.5816688 1.810153
## 2020.712       1.161918 0.5321971 1.791639
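
The row names of the forecast data frame are decimal years, which are awkward to read or join against dated data. The sketch below shows one way to turn them into calendar dates using base R only. `decimal_to_date` and `pronostico_demo` are hypothetical names introduced for illustration, the three rows are copied from the output above, and the resulting dates are approximate because the printed index is rounded to three decimals.

```r
# Hypothetical helper: convert a decimal-year index (e.g. 2020.346) to a Date.
decimal_to_date <- function(x) {
  year  <- floor(x)
  start <- as.Date(paste0(year, "-01-01"))
  end   <- as.Date(paste0(year + 1, "-01-01"))
  # Fraction of the year times the number of days in that year (365 or 366)
  start + round((x - year) * as.numeric(end - start))
}

# Small stand-in for `pronostico`, built from the first three rows printed above
pronostico_demo <- data.frame(
  `Point Forecast` = c(1.198037, 1.067438, 1.117664),
  `Lo 95`          = c(1.0324015, 0.8405353, 0.8540315),
  `Hi 95`          = c(1.363672, 1.294340, 1.381297),
  row.names        = c("2020.346", "2020.365", "2020.385"),
  check.names      = FALSE
)

# Assumption: row names are decimal years, as in the printed output
pronostico_demo$date <- decimal_to_date(as.numeric(rownames(pronostico_demo)))
pronostico_demo
```

With the real object, the same call would be applied to `rownames(pronostico)`.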

2.4 SALES VOLUME FORECAST FOR CONVENTIONAL AVOCADOS

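# Seasonal ARIMA on conventional volume: one regular and one seasonal difference, exhaustive search
arima_vol_con <- auto.arima(conventional_avgvol, d = 1, D = 1, trace = TRUE, stepwise = FALSE, approximation = FALSE)
# Forecast 96 weeks ahead with a 95% prediction interval
forecast_mod <- forecast(arima_vol_con, h = 96, level = 95)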
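
Setting `d = 1` and `D = 1` asks the model to work on a series that has been differenced once at lag 1 and once at the seasonal lag (52 weeks here). As a rough illustration of what that transformation does, the sketch below applies both differences by hand to a synthetic weekly series; the series `x` is an assumption standing in for `conventional_avgvol`, not the real data.

```r
# Synthetic weekly series with trend and yearly seasonality
# (assumption: a stand-in for conventional_avgvol)
set.seed(1)
n    <- 52 * 4
week <- seq_len(n)
x <- ts(1000 + 5 * week + 200 * sin(2 * pi * week / 52) + rnorm(n, sd = 20),
        frequency = 52, start = c(2015, 1))

# D = 1: seasonal difference at lag 52 removes the repeating yearly pattern
x_seasonal <- diff(x, lag = 52)

# d = 1: first difference of the result removes the remaining level/trend
x_both <- diff(x_seasonal, lag = 1)

# 52 + 1 observations are lost to differencing
length(x_both)
mean(x_seasonal)  # close to the yearly trend increment, 5 * 52
mean(x_both)      # close to zero
```

The MA and seasonal MA terms reported by `auto.arima` are then fitted to a series of this doubly differenced form.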

Plot of the forecast

plot(forecast_mod)

Plot of the forecast with the fitted values overlaid.

autoplot(forecast_mod) + autolayer(fitted(forecast_mod), series = "Fitted")

Summary of results

print(summary(forecast_mod))
## 
## Forecast method: ARIMA(0,1,2)(0,1,1)[52]
## 
## Model Information:
## Series: conventional_avgvol 
## ARIMA(0,1,2)(0,1,1)[52] 
## 
## Coefficients:
##           ma1      ma2     sma1
##       -0.5902  -0.2181  -0.5794
## s.e.   0.0680   0.0726   0.0948
## 
## sigma^2 = 6.397e+10:  log likelihood = -3127.96
## AIC=6263.92   AICc=6264.11   BIC=6277.59
## 
## Error measures:
##                    ME   RMSE      MAE        MPE    MAPE      MASE        ACF1
## Training set 69.39602 226020 136484.4 -0.8917261 7.11636 0.4895279 -0.01003706
## 
## Forecasts:
##          Point Forecast   Lo 95   Hi 95
## 2020.346        2617513 2121082 3113943
## 2020.365        2797144 2260644 3333644
## 2020.385        2567407 2022532 3112283
## 2020.404        2492944 1939819 3046069
## 2020.423        2521168 1959914 3082421
## 2020.442        2457490 1888225 3026756
## 2020.462        2506964 1929797 3084130
## 2020.481        2378913 1793953 2963873
## 2020.500        2404246 1811594 2996898
## 2020.519        2397956 1797711 2998201
## 2020.538        2382148 1774404 2989891
## 2020.558        2352795 1737645 2967945
## 2020.577        2327991 1705522 2950460
## 2020.596        2382526 1752824 3012229
## 2020.615        2385551 1748697 3022405
## 2020.635        2342212 1698286 2986138
## 2020.654        2330846 1679924 2981767
## 2020.673        2265688 1607846 2923531
## 2020.692        2273406 1608715 2938097
## 2020.712        2271148 1599678 2942618
## 2020.731        2215754 1537573 2893935
## 2020.750        2202944 1518117 2887770
## 2020.769        2245096 1553688 2936504
## 2020.788        2212872 1514944 2910799
## 2020.808        2222133 1517746 2926519
## 2020.827        2187875 1477087 2898662
## 2020.846        2200046 1482915 2917176
## 2020.865        2123964 1400545 2847382
## 2020.885        2272571 1542919 3002223
## 2020.904        2018768 1282935 2754601
## 2020.923        2034015 1292053 2775977
## 2020.942        2207413 1459372 2955453
## 2020.962        2516294 1762223 3270365
## 2020.981        2556209 1796166 3316253
## 2021.000        2735472 1969674 3501270
## 2021.019        2581068 1809390 3352747
## 2021.038        3200354 2422840 3977868
## 2021.058        2706158 1922852 3489464
## 2021.077        2613395 1824339 3402451
## 2021.096        2770138 1975374 3564903
## 2021.115        2765127 1964695 3565558
## 2021.135        2628823 1822764 3434882
## 2021.154        2797897 1986249 3609544
## 2021.173        2720630 1903432 3537829
## 2021.192        2647883 1825172 3470594
## 2021.212        2779358 1951171 3607545
## 2021.231        2694513 1860886 3528141
## 2021.250        2672920 1833888 3511953
## 2021.269        2662879 1818477 3507282
## 2021.288        3060352 2210612 3910091
## 2021.308        2943675 2088633 3798718
## 2021.327        2766813 1906500 3627126
## 2021.346        2869787 1957088 3782486
## 2021.365        3042671 2112192 3973149
## 2021.385        2812934 1872672 3753197
## 2021.404        2738471 1788525 3688418
## 2021.423        2766695 1807162 3726227
## 2021.442        2703017 1733994 3672040
## 2021.462        2752491 1774068 3730913
## 2021.481        2624440 1636708 3612171
## 2021.500        2649773 1652819 3646727
## 2021.519        2643483 1637391 3649575
## 2021.538        2627675 1612527 3642823
## 2021.558        2598322 1574198 3622446
## 2021.577        2573518 1540497 3606540
## 2021.596        2628053 1586210 3669896
## 2021.615        2631078 1580487 3681669
## 2021.635        2587739 1528473 3647006
## 2021.654        2576373 1508501 3644244
## 2021.673        2511215 1434808 3587623
## 2021.692        2518933 1434057 3603810
## 2021.712        2516675 1423395 3609955
## 2021.731        2461281 1359662 3562900
## 2021.750        2448471 1338575 3558367
## 2021.769        2490623 1372512 3608734
## 2021.788        2458399 1332132 3584665
## 2021.808        2467660 1333296 3602023
## 2021.827        2433402 1290999 3575805
## 2021.846        2445573 1295186 3595959
## 2021.865        2369491 1211176 3527805
## 2021.885        2518098 1351909 3684286
## 2021.904        2264295 1090285 3438305
## 2021.923        2279542 1097762 3461322
## 2021.942        2452939 1263440 3642439
## 2021.962        2761821 1564652 3958989
## 2021.981        2801736 1596937 4006536
## 2022.000        2980999 1768677 4193321
## 2022.019        2826595 1606768 4046422
## 2022.038        3445881 2218596 4673167
## 2022.058        2951685 1716985 4186384
## 2022.077        2858922 1616853 4100991
## 2022.096        3015665 1766270 4265061
## 2022.115        3010654 1753975 4267332
## 2022.135        2874350 1610430 4138270
## 2022.154        3043424 1772304 4314544
## 2022.173        2966157 1687878 4244437
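
As a sanity check on the table above, the first 95% interval can be approximated from the reported `sigma^2`: for a one-step-ahead forecast with Gaussian errors, the half-width is roughly 1.96 times the innovation standard deviation. The sketch below uses only numbers copied from the summary; the variable names are introduced for illustration, and a small gap is expected because `sigma^2` is printed rounded and the package may compute the forecast variance slightly differently.

```r
# Values copied from the summary output above
sigma2      <- 6.397e10   # printed (rounded) innovation variance
point_h1    <- 2617513    # point forecast for 2020.346
lo_reported <- 2121082
hi_reported <- 3113943

# Assumption: Gaussian errors, so the 95% half-width is z * sqrt(variance)
z          <- qnorm(0.975)
half_width <- z * sqrt(sigma2)

approx_interval <- c(lo = point_h1 - half_width, hi = point_h1 + half_width)
reported        <- c(lo = lo_reported, hi = hi_reported)

# Relative gap between the back-of-envelope bounds and the reported ones
rel_gap <- abs(approx_interval - reported) / reported
rel_gap
```

Intervals at longer horizons are wider than this because the forecast variance accumulates with each step ahead.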

Forecast data frame

pronostico <- as.data.frame(forecast_mod)
head(pronostico,20)
##          Point Forecast   Lo 95   Hi 95
## 2020.346        2617513 2121082 3113943
## 2020.365        2797144 2260644 3333644
## 2020.385        2567407 2022532 3112283
## 2020.404        2492944 1939819 3046069
## 2020.423        2521168 1959914 3082421
## 2020.442        2457490 1888225 3026756
## 2020.462        2506964 1929797 3084130
## 2020.481        2378913 1793953 2963873
## 2020.500        2404246 1811594 2996898
## 2020.519        2397956 1797711 2998201
## 2020.538        2382148 1774404 2989891
## 2020.558        2352795 1737645 2967945
## 2020.577        2327991 1705522 2950460
## 2020.596        2382526 1752824 3012229
## 2020.615        2385551 1748697 3022405
## 2020.635        2342212 1698286 2986138
## 2020.654        2330846 1679924 2981767
## 2020.673        2265688 1607846 2923531
## 2020.692        2273406 1608715 2938097
## 2020.712        2271148 1599678 2942618