Modelo Café Arábica - Mínimos Quadrados Ordinários

 UNIVERSIDADE FEDERAL DA PARAÍBA
 PROFESSOR: Sinézio Fernandes Maia

Autor

Josué de Meneses Lopes

Data de Publicação

30 de abril de 2024

Atividade em Mínimos Quadrados Ordinários (MQO)

A atividade abaixo se refere a busca de um modelo robusto e não tendencioso em MQO, para encontrar a quantidade ótima de contratos de hedge para o mercado futuro de café arábica.

Configurações iniciais

Para evitar notações científicas:

rm(list = ls())
options(scipen = 9999)
options(max.print = 100000)

Pacotes útilizados:

library(urca)
library(readxl)
library(tidyverse)
library(deflateBR)
library(dynlm)
library(ggplot2)
library(fBasics)
library(lmtest)
library(whitestrap)
library(FinTS)

Base de dados

dados <- read_excel("/Users/josue/Documents/R/Derivativos/modelo mqo/ICFFUT_(Mensal).xlsx")
dados1 <- read_excel("/Users/josue/Documents/R/Derivativos/modelo mqo/Dados Históricos - Café Arábica 4_5 Futuros.xlsx")

Plotagem dos dados

ggplot(dados, aes(x = Data)) +
  geom_line(aes(y = Abertura_real, color = "Spot")) +
  geom_line(aes(y = Fechamento_real, color = "Futuro")) +
  labs(title = "CAFÉ ARÁBICA - Valores mensais (01.2003 - 03.2024)",
       x = "Data",
       y = "Preço R$",
       color = "Variáveis") +
  scale_color_manual(values = c("Spot" = "blue", "Futuro" = "red")) +
  theme_minimal()

CAFÉ ARÁBICA - VALORES MENSAIS

Transformando em Séries Temporais

# Transformando em séries temporais
spot <- ts(dados$Abertura_real, start= c(2003,01), end=c(2024,03), frequency = 12)
futuro <- ts(dados$Fechamento_real, start= c(2003,01), end=c(2024,03), frequency = 12)

# Definindo o tempo
times <- seq(as.Date("2003/1/2"), by = "month", length.out = 255)

Plotando os Gráficos de Preço Spot e Futuro nominais

par(mfrow = c(1,2))

plot(spot, main = "CAFÉ ARÁBICA - Spot mensal"); plot(futuro, main = "CAFÉ ARÁBICA - Futuro mensal")

Transformando em logaritmo

O objetivo é corrigir a heterocedásticidade

lspot <- log(spot)
lfuturo <- log(futuro)

plot(lspot, main = "CAFÉ ARÁBICA - log SPOT mensal"); plot(lfuturo, main = "CAFÉ ARÁBICA - log FUTURO mensal")

Fazendo a regressão em diferença com valores nominais

Para corrigir os problemas causados pela autocorrelação nós fazemos a regressão em diferença

lag <- stats::lag
Mensal <- dynlm(diff(lspot)~diff(lag(lfuturo,11)+diff(lag(lspot,-5))))
summary(Mensal)

Time series regression with "ts" data:
Start = 2003(8), End = 2023(4)

Call:
dynlm(formula = diff(lspot) ~ diff(lag(lfuturo, 11) + diff(lag(lspot, 
    -5))))

Residuals:
     Min       1Q   Median       3Q      Max 
-0.29371 -0.06474  0.00020  0.06025  0.32658 

Coefficients:
                                               Estimate Std. Error t value
(Intercept)                                   -0.001616   0.006090  -0.265
diff(lag(lfuturo, 11) + diff(lag(lspot, -5))) -0.066683   0.034028  -1.960
                                              Pr(>|t|)  
(Intercept)                                     0.7909  
diff(lag(lfuturo, 11) + diff(lag(lspot, -5)))   0.0512 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.09375 on 235 degrees of freedom
Multiple R-squared:  0.01608,   Adjusted R-squared:  0.01189 
F-statistic:  3.84 on 1 and 235 DF,  p-value: 0.05122
Mensal

Time series regression with "ts" data:
Start = 2003(8), End = 2023(4)

Call:
dynlm(formula = diff(lspot) ~ diff(lag(lfuturo, 11) + diff(lag(lspot, 
    -5))))

Coefficients:
                                  (Intercept)  
                                    -0.001616  
diff(lag(lfuturo, 11) + diff(lag(lspot, -5)))  
                                    -0.066683  

Testes para resíduos

par(mfrow = c(1,1))
ResiduosMensal <- residuals(Mensal); head(ResiduosMensal)
   ago 2003    set 2003    out 2003    nov 2003    dez 2003    jan 2004 
 0.02293875 -0.08623458  0.04459704 -0.01508888 -0.03777107  0.05073407 
plot(ResiduosMensal, type="l", col="red")
abline(h=0, col="blue", lw=3)

par(mfrow=c(1,2))
hist(ResiduosMensal, main="", col="cadetblue", prob=T, xlab = names(ResiduosMensal)[1], breaks = 30)
curve(expr=dnorm(x,mean=mean(ResiduosMensal),sd=sd(ResiduosMensal)),col="red",add= TRUE, lwd=2)
qqnorm(ResiduosMensal, col="blue")
qqline(ResiduosMensal, col="red")

Teste para a presença de normalidade

Teste de Jarque Bera:

jarqueberaTest(ResiduosMensal)

Title:
 Jarque - Bera Normalality Test

Test Results:
  STATISTIC:
    X-squared: 6.3367
  P VALUE:
    Asymptotic p Value: 0.04207 

Teste de Shapiro Wilk:

shapiro.test(ResiduosMensal)

    Shapiro-Wilk normality test

data:  ResiduosMensal
W = 0.98975, p-value = 0.09186

Testes para presença de heteroscedasticidade

Teste de Goldfeld-Quandt:

length(ResiduosMensal); length(ResiduosMensal)*0.15
[1] 237
[1] 35.55
gqtest(Mensal, fraction = 35.55, alternative = "greater")

    Goldfeld-Quandt test

data:  Mensal
GQ = 0.8693, df1 = 99, df2 = 98, p-value = 0.756
alternative hypothesis: variance increases from segment 1 to 2

Teste de Breusch-Pagan-Godfrey:

  bptest(Mensal)

    studentized Breusch-Pagan test

data:  Mensal
BP = 0.17421, df = 1, p-value = 0.6764

Teste de White:

library(whitestrap)
white_test(Mensal)
White's test results

Null hypothesis: Homoskedasticity of the residuals
Alternative hypothesis: Heteroskedasticity of the residuals
Test Statistic: 3.01
P-value: 0.221973

Testes de autocorrelação

Teste de Durbin Watson:

dwtest(Mensal)

    Durbin-Watson test

data:  Mensal
DW = 2.37, p-value = 0.9981
alternative hypothesis: true autocorrelation is greater than 0

Teste de Breusch-Godfrey (1978):

bgtest(Mensal, order = 4)

    Breusch-Godfrey test for serial correlation of order up to 4

data:  Mensal
LM test = 8.9602, df = 4, p-value = 0.0621

Teste de Arch:

ArchTest(ResiduosMensal, lags = 4)

    ARCH LM-test; Null hypothesis: no ARCH effects

data:  ResiduosMensal
Chi-squared = 7.8367, df = 4, p-value = 0.09775

Teste de Dickey-Fuller

dfuller <- ur.df(ResiduosMensal, lags = 0, type="drift")
summary(dfuller)

############################################### 
# Augmented Dickey-Fuller Test Unit Root Test # 
############################################### 

Test regression drift 


Call:
lm(formula = z.diff ~ z.lag.1 + 1)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.23272 -0.06617 -0.00210  0.05698  0.33013 

Coefficients:
               Estimate  Std. Error t value            Pr(>|t|)    
(Intercept) -0.00002122  0.00600599  -0.004               0.997    
z.lag.1     -1.18815378  0.06434337 -18.466 <0.0000000000000002 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.09226 on 234 degrees of freedom
Multiple R-squared:  0.593, Adjusted R-squared:  0.5913 
F-statistic:   341 on 1 and 234 DF,  p-value: < 0.00000000000000022


Value of test-statistic is: -18.4658 170.4969 

Critical values for test statistics: 
      1pct  5pct 10pct
tau2 -3.46 -2.88 -2.57
phi1  6.52  4.63  3.81

Número ótimo de contratos - Valores mensais

ContratosMensalNominal <-  (10000*0.066683)/450
ContratosMensalNominal
[1] 1.481844

VALORES DEFLACIONADOS MENSAIS

Transformando em Séries Temporais e Deflacionando pelo IGP-M pelo pacote ‘deflateBR’

# Transformando em séries temporais
spot <- ts(dados$Abertura_real, start= c(2003,01), end=c(2024,03), frequency = 12)
futuro <- ts(dados$Fechamento_real, start= c(2003,01), end=c(2024,03), frequency = 12)

# Definindo o tempo
times <- seq(as.Date("2003/1/2"), by = "month", length.out = 255)

# Deflacionando
spotR <- deflate(spot, nominal_dates = times, real_date = "03/2024", index = "igpm")
futuroR <- deflate(futuro, nominal_dates = times, real_date = "03/2024", index = "igpm")

Plotando os Gráficos de Preço Spot e Futuro com valores reais

par(mfrow = c(1,2))

plot(spotR, main = "CAFÉ ARÁBICA - Spot mensal"); plot(futuroR, main = "CAFÉ ARÁBICA - Futuro mensal")

Transformando em logaritmo

O objetivo é corrigir a heterocedásticidade

lspotR <- log(spotR)
lfuturoR <- log(futuroR)

plot(lspotR, main = "CAFÉ ARÁBICA - log SPOT mensal"); plot(lfuturoR, main = "CAFÉ ARÁBICA - log FUTURO mensal")

Fazendo a regressão em diferença com valores reais

Para corrigir os problemas causados pela autocorrelação nós fazemos a regressão em diferença

lag <- stats::lag
MensalR <- dynlm(diff(lspotR)~diff(lag(lfuturoR,1)))
summary(MensalR)

Time series regression with "ts" data:
Start = 2003(2), End = 2024(2)

Call:
dynlm(formula = diff(lspotR) ~ diff(lag(lfuturoR, 1)))

Residuals:
     Min       1Q   Median       3Q      Max 
-0.30289 -0.06805  0.00161  0.06226  0.32401 

Coefficients:
                        Estimate Std. Error t value Pr(>|t|)  
(Intercept)            -0.008874   0.005989  -1.482   0.1396  
diff(lag(lfuturoR, 1)) -0.176736   0.070737  -2.498   0.0131 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.09486 on 251 degrees of freedom
Multiple R-squared:  0.02427,   Adjusted R-squared:  0.02038 
F-statistic: 6.242 on 1 and 251 DF,  p-value: 0.01311
MensalR

Time series regression with "ts" data:
Start = 2003(2), End = 2024(2)

Call:
dynlm(formula = diff(lspotR) ~ diff(lag(lfuturoR, 1)))

Coefficients:
           (Intercept)  diff(lag(lfuturoR, 1))  
             -0.008874               -0.176736  

Testes para resíduos

par(mfrow = c(1,1))
ResiduosMensalR <- residuals(MensalR); head(ResiduosMensalR)
   fev 2003    mar 2003    abr 2003    mai 2003    jun 2003    jul 2003 
 0.03922653 -0.29499242 -0.15622198  0.12233798 -0.13981296  0.07187261 
plot(ResiduosMensalR, type="l", col="red")
abline(h=0, col="blue", lw=3)

par(mfrow=c(1,2))
hist(ResiduosMensalR, main="", col="cadetblue", prob=T, xlab = names(ResiduosMensalR)[1], breaks = 30)
curve(expr=dnorm(x,mean=mean(ResiduosMensalR),sd=sd(ResiduosMensalR)),col="red",add= TRUE, lwd=2)
qqnorm(ResiduosMensalR, col="blue")
qqline(ResiduosMensalR, col="red")

Teste para a presença de normalidade

Teste de Jarque Bera:

jarqueberaTest(ResiduosMensalR)

Title:
 Jarque - Bera Normalality Test

Test Results:
  STATISTIC:
    X-squared: 2.6384
  P VALUE:
    Asymptotic p Value: 0.2674 

Teste de Shapiro Wilk:

shapiro.test(ResiduosMensalR)

    Shapiro-Wilk normality test

data:  ResiduosMensalR
W = 0.98968, p-value = 0.06906

Testes para presença de heteroscedasticidade

Teste de Goldfeld-Quandt:

length(ResiduosMensalR); length(ResiduosMensalR)*0.15
[1] 253
[1] 37.95
gqtest(MensalR, fraction = 37.95, alternative = "greater")

    Goldfeld-Quandt test

data:  MensalR
GQ = 0.77546, df1 = 106, df2 = 105, p-value = 0.9035
alternative hypothesis: variance increases from segment 1 to 2

Teste de Breusch-Pagan-Godfrey:

  bptest(MensalR)

    studentized Breusch-Pagan test

data:  MensalR
BP = 2.7295, df = 1, p-value = 0.09851

Teste de White:

library(whitestrap)
white_test(MensalR)
White's test results

Null hypothesis: Homoskedasticity of the residuals
Alternative hypothesis: Heteroskedasticity of the residuals
Test Statistic: 5.44
P-value: 0.065831

Testes de autocorrelação

Teste de Durbin Watson:

dwtest(MensalR)

    Durbin-Watson test

data:  MensalR
DW = 2.2195, p-value = 0.9603
alternative hypothesis: true autocorrelation is greater than 0

Teste de Breusch-Godfrey (1978):

bgtest(MensalR, order = 4)

    Breusch-Godfrey test for serial correlation of order up to 4

data:  MensalR
LM test = 3.5011, df = 4, p-value = 0.4777

Teste de Arch:

ArchTest(ResiduosMensalR, lags = 4)

    ARCH LM-test; Null hypothesis: no ARCH effects

data:  ResiduosMensalR
Chi-squared = 7.0143, df = 4, p-value = 0.1351

Teste de Dickey-Fuller

dfuller <- ur.df(ResiduosMensalR, lags = 0, type="drift")
summary(dfuller)

############################################### 
# Augmented Dickey-Fuller Test Unit Root Test # 
############################################### 

Test regression drift 


Call:
lm(formula = z.diff ~ z.lag.1 + 1)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.29052 -0.07334  0.00108  0.06437  0.32849 

Coefficients:
              Estimate Std. Error t value            Pr(>|t|)    
(Intercept) -0.0001555  0.0059490  -0.026               0.979    
z.lag.1     -1.1100922  0.0628393 -17.666 <0.0000000000000002 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.09444 on 250 degrees of freedom
Multiple R-squared:  0.5552,    Adjusted R-squared:  0.5534 
F-statistic: 312.1 on 1 and 250 DF,  p-value: < 0.00000000000000022


Value of test-statistic is: -17.6656 156.0363 

Critical values for test statistics: 
      1pct  5pct 10pct
tau2 -3.44 -2.87 -2.57
phi1  6.47  4.61  3.79

Número ótimo de contratos - Valores mensais

ContratosMensalReal <-  (10000*0.176736)/450
ContratosMensalReal
[1] 3.927467

CAFÉ ARÁBICA - VALORES DIÁRIOS

Plotagem dos dados

ggplot(dados1, aes(x = Data)) +
  geom_line(aes(y = Spot_real, color = "Spot")) +
  geom_line(aes(y = Futuro_real, color = "Futuro")) +
  labs(title = "CAFÉ ARÁBICA - Valores diários (07.12.2023 - 18.04.2024)",
       x = "Data",
       y = "Preço R$",
       color = "Variáveis") +
  scale_color_manual(values = c("Spot" = "blue", "Futuro" = "red")) +
  theme_minimal()

Transformando em Séries Temporais

# Transformando em séries temporais
spotD <- ts(dados1$Spot_real, start= c(2023,12), end=c(2024,04), frequency = 365)
futuroD <- ts(dados1$Futuro_real, start= c(2023,12), end=c(2024,04), frequency = 365)

# Definindo o tempo
times <- seq(as.Date("2023/12/07"), by = "day", length.out = 89)

Plotando os Gráficos de Preço Spot e Futuro em valores nominais

par(mfrow = c(1,2))

plot(spotD, main = "CAFÉ ARÁBICA - Spot diário nominal"); plot(futuroD, main = "CAFÉ ARÁBICA - Futuro diário nominal")

Transformando em logaritmo

O objetivo é corrigir a heterocedásticidade

lspotD <- log(spotD)
lfuturoD <- log(futuroD)

plot(lspotD, main = "CAFÉ ARÁBICA - log SPOT diário nominal"); plot(lfuturoD, main = "CAFÉ ARÁBICA - log FUTURO diário nominal")

Fazendo a regressão em diferença em valores nominais

Para corrigir os problemas causados pela autocorrelação nós fazemos a regressão em diferença

lag <- stats::lag
Diario <- dynlm(diff(lspotD)~diff(lag(lfuturoD,-1)+diff(lag(lfuturoD,2)+diff(lag(lspotD,-1)+diff(lag(lspotD,-2))))))
summary(Diario)

Time series regression with "ts" data:
Start = 2023(18), End = 2024(2)

Call:
dynlm(formula = diff(lspotD) ~ diff(lag(lfuturoD, -1) + diff(lag(lfuturoD, 
    2) + diff(lag(lspotD, -1) + diff(lag(lspotD, -2))))))

Residuals:
      Min        1Q    Median        3Q       Max 
-0.236144 -0.010010  0.002443  0.015400  0.057733 

Coefficients:
                                                                                                  Estimate
(Intercept)                                                                                      0.0006907
diff(lag(lfuturoD, -1) + diff(lag(lfuturoD, 2) + diff(lag(lspotD, -1) + diff(lag(lspotD, -2))))) 0.0533340
                                                                                                 Std. Error
(Intercept)                                                                                       0.0015976
diff(lag(lfuturoD, -1) + diff(lag(lfuturoD, 2) + diff(lag(lspotD, -1) + diff(lag(lspotD, -2)))))  0.0112564
                                                                                                 t value
(Intercept)                                                                                        0.432
diff(lag(lfuturoD, -1) + diff(lag(lfuturoD, 2) + diff(lag(lspotD, -1) + diff(lag(lspotD, -2)))))   4.738
                                                                                                   Pr(>|t|)
(Intercept)                                                                                           0.666
diff(lag(lfuturoD, -1) + diff(lag(lfuturoD, 2) + diff(lag(lspotD, -1) + diff(lag(lspotD, -2))))) 0.00000315
                                                                                                    
(Intercept)                                                                                         
diff(lag(lfuturoD, -1) + diff(lag(lfuturoD, 2) + diff(lag(lspotD, -1) + diff(lag(lspotD, -2))))) ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.02989 on 348 degrees of freedom
Multiple R-squared:  0.0606,    Adjusted R-squared:  0.0579 
F-statistic: 22.45 on 1 and 348 DF,  p-value: 0.000003148
Diario

Time series regression with "ts" data:
Start = 2023(18), End = 2024(2)

Call:
dynlm(formula = diff(lspotD) ~ diff(lag(lfuturoD, -1) + diff(lag(lfuturoD, 
    2) + diff(lag(lspotD, -1) + diff(lag(lspotD, -2))))))

Coefficients:
                                                                                     (Intercept)  
                                                                                       0.0006907  
diff(lag(lfuturoD, -1) + diff(lag(lfuturoD, 2) + diff(lag(lspotD, -1) + diff(lag(lspotD, -2)))))  
                                                                                       0.0533340  

Testes para resíduos

par(mfrow = c(1,1))
ResiduosDiario <- residuals(Diario); head(ResiduosDiario)
    2023(18)     2023(19)     2023(20)     2023(21)     2023(22)     2023(23) 
 0.034670573  0.026973437  0.040881636 -0.067149606 -0.023471835 -0.006055146 
plot(ResiduosDiario, type="l", col="red")
abline(h=0, col="blue", lw=3)

par(mfrow=c(1,2))
hist(ResiduosDiario, main="", col="cadetblue", prob=T, xlab = names(ResiduosDiario)[1], breaks = 30)
curve(expr=dnorm(x,mean=mean(ResiduosDiario),sd=sd(ResiduosDiario)),col="red",add= TRUE, lwd=2)
qqnorm(ResiduosDiario, col="blue")
qqline(ResiduosDiario, col="red")

Teste para a presença de normalidade

Teste de Jarque Bera:

jarqueberaTest(ResiduosDiario)

Title:
 Jarque - Bera Normalality Test

Test Results:
  STATISTIC:
    X-squared: 15678.9415
  P VALUE:
    Asymptotic p Value: < 0.00000000000000022 

Teste de Shapiro Wilk:

shapiro.test(ResiduosDiario)

    Shapiro-Wilk normality test

data:  ResiduosDiario
W = 0.68922, p-value < 0.00000000000000022

Testes para presença de heteroscedasticidade

Teste de Goldfeld-Quandt:

length(ResiduosDiario); length(ResiduosDiario)*0.15
[1] 350
[1] 52.5
gqtest(Diario, fraction = 52.5, alternative = "greater")

    Goldfeld-Quandt test

data:  Diario
GQ = 0.98918, df1 = 147, df2 = 146, p-value = 0.5263
alternative hypothesis: variance increases from segment 1 to 2

Teste de Breusch-Pagan-Godfrey:

  bptest(Diario)

    studentized Breusch-Pagan test

data:  Diario
BP = 1.9447, df = 1, p-value = 0.1632

Teste de White:

library(whitestrap)
white_test(Diario)
White's test results

Null hypothesis: Homoskedasticity of the residuals
Alternative hypothesis: Heteroskedasticity of the residuals
Test Statistic: 1.99
P-value: 0.369154

Testes de autocorrelação

Teste de Durbin Watson:

dwtest(Diario)

    Durbin-Watson test

data:  Diario
DW = 1.8137, p-value = 0.04211
alternative hypothesis: true autocorrelation is greater than 0

Teste de Breusch-Godfrey (1978):

bgtest(Diario, order = 4)

    Breusch-Godfrey test for serial correlation of order up to 4

data:  Diario
LM test = 6.9452, df = 4, p-value = 0.1388

Teste de Arch:

ArchTest(ResiduosDiario, lags = 4)

    ARCH LM-test; Null hypothesis: no ARCH effects

data:  ResiduosDiario
Chi-squared = 0.13013, df = 4, p-value = 0.998

Teste de Dickey-Fuller

dfuller <- ur.df(ResiduosDiario, lags = 0, type="drift")
summary(dfuller)

############################################### 
# Augmented Dickey-Fuller Test Unit Root Test # 
############################################### 

Test regression drift 


Call:
lm(formula = z.diff ~ z.lag.1 + 1)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.236355 -0.009950  0.001389  0.014737  0.058398 

Coefficients:
               Estimate  Std. Error t value            Pr(>|t|)    
(Intercept) -0.00009846  0.00159238  -0.062               0.951    
z.lag.1     -0.90878611  0.05335540 -17.033 <0.0000000000000002 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.02975 on 347 degrees of freedom
Multiple R-squared:  0.4554,    Adjusted R-squared:  0.4538 
F-statistic: 290.1 on 1 and 347 DF,  p-value: < 0.00000000000000022


Value of test-statistic is: -17.0327 145.0579 

Critical values for test statistics: 
      1pct  5pct 10pct
tau2 -3.44 -2.87 -2.57
phi1  6.47  4.61  3.79

Número ótimo de contratos - Valores diários

ContratosDiarioNominais <-  (10000*0.0533340)/450
ContratosDiarioNominais
[1] 1.1852

MODELO DIÁRIO DEFLACIONADO

Transformando em Séries Temporais e Deflacionando pelo IGP-M pelo pacote ‘deflateBR’

# Transformando em séries temporais
spotD <- ts(dados1$Spot_real, start= c(2023,12), end=c(2024,04), frequency = 365)
futuroD <- ts(dados1$Futuro_real, start= c(2023,12), end=c(2024,04), frequency = 365)

# Definindo o tempo
times <- seq(as.Date("2023/12/07"), by = "day", length.out = 89)

# Deflacionando
spotDR <- deflate(spotD, nominal_dates = times, real_date = "04/2024", index = "igpm")
futuroDR <- deflate(futuroD, nominal_dates = times, real_date = "04/2024", index = "igpm")

Plotando os Gráficos de Preço Spot e Futuro

par(mfrow = c(1,2))

plot(spotDR, main = "CAFÉ ARÁBICA - Spot diário real"); plot(futuroDR, main = "CAFÉ ARÁBICA - Futuro diário real")

Transformando em logaritmo

O objetivo é corrigir a heterocedásticidade

lspotDR <- log(spotDR)
lfuturoDR <- log(futuroDR)

plot(lspotDR, main = "CAFÉ ARÁBICA - log SPOT diário real"); plot(lfuturoDR, main = "CAFÉ ARÁBICA - log FUTURO diário real")

Fazendo a regressão em diferença

Para corrigir os problemas causados pela autocorrelação nós fazemos a regressão em diferença

lag <- stats::lag
DiarioR <- dynlm(diff(lspotDR)~diff(lag(lspotDR,-3)+diff(lag(lfuturoDR,-1)+diff(lag(lfuturoDR,-2)))))
summary(DiarioR)

Time series regression with "ts" data:
Start = 2023(17), End = 2024(4)

Call:
dynlm(formula = diff(lspotDR) ~ diff(lag(lspotDR, -3) + diff(lag(lfuturoDR, 
    -1) + diff(lag(lfuturoDR, -2)))))

Residuals:
      Min        1Q    Median        3Q       Max 
-0.230306 -0.010256  0.005394  0.016782  0.054517 

Coefficients:
                                                                                Estimate
(Intercept)                                                                  -0.00006136
diff(lag(lspotDR, -3) + diff(lag(lfuturoDR, -1) + diff(lag(lfuturoDR, -2))))  0.09517042
                                                                              Std. Error
(Intercept)                                                                   0.00171423
diff(lag(lspotDR, -3) + diff(lag(lfuturoDR, -1) + diff(lag(lfuturoDR, -2))))  0.02006381
                                                                             t value
(Intercept)                                                                   -0.036
diff(lag(lspotDR, -3) + diff(lag(lfuturoDR, -1) + diff(lag(lfuturoDR, -2))))   4.743
                                                                               Pr(>|t|)
(Intercept)                                                                       0.971
diff(lag(lspotDR, -3) + diff(lag(lfuturoDR, -1) + diff(lag(lfuturoDR, -2)))) 0.00000306
                                                                                
(Intercept)                                                                     
diff(lag(lspotDR, -3) + diff(lag(lfuturoDR, -1) + diff(lag(lfuturoDR, -2)))) ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.03221 on 351 degrees of freedom
Multiple R-squared:  0.06024,   Adjusted R-squared:  0.05756 
F-statistic:  22.5 on 1 and 351 DF,  p-value: 0.000003062
DiarioR

Time series regression with "ts" data:
Start = 2023(17), End = 2024(4)

Call:
dynlm(formula = diff(lspotDR) ~ diff(lag(lspotDR, -3) + diff(lag(lfuturoDR, 
    -1) + diff(lag(lfuturoDR, -2)))))

Coefficients:
                                                                 (Intercept)  
                                                                 -0.00006136  
diff(lag(lspotDR, -3) + diff(lag(lfuturoDR, -1) + diff(lag(lfuturoDR, -2))))  
                                                                  0.09517042  

Testes para resíduos

par(mfrow = c(1,1))
ResiduosDiarioR <- residuals(DiarioR); head(ResiduosDiarioR)
   2023(17)    2023(18)    2023(19)    2023(20)    2023(21)    2023(22) 
 0.02068819  0.02314753  0.03614094  0.04179508 -0.07135063 -0.02294401 
plot(ResiduosDiarioR, type="l", col="red")
abline(h=0, col="blue", lw=3)

par(mfrow=c(1,2))
hist(ResiduosDiarioR, main="", col="cadetblue", prob=T, xlab = names(ResiduosDiarioR)[1], breaks = 30)
curve(expr=dnorm(x,mean=mean(ResiduosDiarioR),sd=sd(ResiduosDiarioR)),col="red",add= TRUE, lwd=2)
qqnorm(ResiduosDiarioR, col="blue")
qqline(ResiduosDiarioR, col="red")

Teste para a presença de normalidade

Teste de Jarque Bera:

jarqueberaTest(ResiduosDiarioR)

Title:
 Jarque - Bera Normalality Test

Test Results:
  STATISTIC:
    X-squared: 12275.0198
  P VALUE:
    Asymptotic p Value: < 0.00000000000000022 

Teste de Shapiro Wilk:

shapiro.test(ResiduosDiarioR)

    Shapiro-Wilk normality test

data:  ResiduosDiarioR
W = 0.6717, p-value < 0.00000000000000022

Testes para presença de heteroscedasticidade

Teste de Goldfeld-Quandt:

length(ResiduosDiarioR); length(ResiduosDiarioR)*0.15
[1] 353
[1] 52.95
gqtest(DiarioR, fraction = 52.95, alternative = "greater")

    Goldfeld-Quandt test

data:  DiarioR
GQ = 1.4491, df1 = 149, df2 = 148, p-value = 0.01224
alternative hypothesis: variance increases from segment 1 to 2

Teste de Breusch-Pagan-Godfrey:

  bptest(DiarioR)

    studentized Breusch-Pagan test

data:  DiarioR
BP = 3.1541, df = 1, p-value = 0.07573

Teste de White:

library(whitestrap)
white_test(DiarioR)
White's test results

Null hypothesis: Homoskedasticity of the residuals
Alternative hypothesis: Heteroskedasticity of the residuals
Test Statistic: 3.43
P-value: 0.180093

Testes de autocorrelação

Teste de Durbin Watson:

dwtest(DiarioR)

    Durbin-Watson test

data:  DiarioR
DW = 1.9161, p-value = 0.2187
alternative hypothesis: true autocorrelation is greater than 0

Teste de Breusch-Godfrey (1978):

bgtest(DiarioR, order = 4)

    Breusch-Godfrey test for serial correlation of order up to 4

data:  DiarioR
LM test = 6.2428, df = 4, p-value = 0.1817

Teste de Arch:

ArchTest(ResiduosDiarioR, lags = 4)

    ARCH LM-test; Null hypothesis: no ARCH effects

data:  ResiduosDiarioR
Chi-squared = 0.2213, df = 4, p-value = 0.9943

Teste de Dickey-Fuller

dfuller <- ur.df(ResiduosDiarioR, lags = 0, type="drift")
summary(dfuller)

############################################### 
# Augmented Dickey-Fuller Test Unit Root Test # 
############################################### 

Test regression drift 


Call:
lm(formula = z.diff ~ z.lag.1 + 1)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.230994 -0.009894  0.004438  0.016328  0.054917 

Coefficients:
               Estimate  Std. Error t value            Pr(>|t|)    
(Intercept) -0.00006282  0.00171673  -0.037               0.971    
z.lag.1     -0.96025205  0.05347325 -17.958 <0.0000000000000002 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.03221 on 350 degrees of freedom
Multiple R-squared:  0.4795,    Adjusted R-squared:  0.478 
F-statistic: 322.5 on 1 and 350 DF,  p-value: < 0.00000000000000022


Value of test-statistic is: -17.9576 161.2423 

Critical values for test statistics: 
      1pct  5pct 10pct
tau2 -3.44 -2.87 -2.57
phi1  6.47  4.61  3.79

Número ótimo de contratos - Valores diários

ContratosDiarioReal <-  (10000*0.09517042)/450
ContratosDiarioReal
[1] 2.114898

Conclusão

A procura de corrigir ou minimizar os problemas de heterocedásticidade, autocorrelação, mantendo a distribuição normal e confirmando que a série é estacionária, podemos afirmar na tabela seguinte:

TIPOS DE CONTRATOS QUANTIDADE ÓTIMA DE CONTRATOS QUANTIDADE QUE O PRODUTOR DEVE SE COMPROMETER RISCO MINIMIZADO
MENSAL NOMINAL 1,4818 6,66% 1,18%
MENSAL REAL 3,9274 17,67% 2,03%
DIÁRIO NOMINAL 1,1852 5,33% 5,79%
DIÁRIO REAL 2,114898 9,51% 5,75%.

Para os valores mensais, o deflator fez grande diferença na quantidade em que o produtor deve se comprometer e a quantidade ótima de contratos para de fazer hedge, chegando a ser 2,65 vezes maior que em valores nominais em ambas as situações, porém o risco minimizado não cresceu na mesma proporção, sendo apenas 1,72 vezes maior do que em valores nominais. Ou seja, é nítido que a influência da inflação impacta na forma de praticar os contratos de hedge, nesses contratos mensais de longo prazo.

Para os contratos diários a situação é diferente. A quantidade ótima de contratos e a quantidade que o produtor deve se comprometer aumentou em 1,78 vezes, entretanto o risco não foi minimizado, mas sim aumentou em 0,01%, o que difere dos valores mensais, que apesar de não crescerem proporcionalmente, eles ainda conseguiam amenizar mais o risco.