Inicialmente, será carregado o banco de dados referente a Infecção Respiratoria Aguda em crianças, ao longo de 16 anos na cidade de Rondonopolis - MT, os dados foram coletados entre janeiro de 1999 até dezembro de 2014, este estudo contém 5 variáveis, sendo elas IRA (Número de pessoas com infecção respiratória aguda), Precip, Temp, Umid.
setwd("C:\\Users\\Mateus\\Desktop\\P7\\Consultoria\\Atividade 3")
dados<-read.table("Dados3.txt",h=T)
head(dados)
## Mes.Ano IRA Precip Temp Umid
## 1 jan/99 170 413.8 25.39 86.15
## 2 fev/99 134 81.0 25.96 84.09
## 3 mar/99 241 86.4 25.49 88.12
## 4 abr/99 293 26.5 24.61 75.22
## 5 mai/99 402 3.9 22.17 69.27
## 6 jun/99 324 1.4 22.76 68.59
attach(dados)
#install.packages("hnp")
library(hnp)
## Warning: package 'hnp' was built under R version 3.5.3
## Loading required package: MASS
library(psych)
## Warning: package 'psych' was built under R version 3.5.3
library(nortest)
require(fBasics)
## Loading required package: fBasics
## Warning: package 'fBasics' was built under R version 3.5.3
## Loading required package: timeDate
## Loading required package: timeSeries
##
## Attaching package: 'timeSeries'
## The following object is masked from 'package:psych':
##
## outlier
##
## Attaching package: 'fBasics'
## The following object is masked from 'package:psych':
##
## tr
require(TSA)
## Loading required package: TSA
## Warning: package 'TSA' was built under R version 3.5.3
##
## Attaching package: 'TSA'
## The following objects are masked from 'package:timeDate':
##
## kurtosis, skewness
## The following objects are masked from 'package:stats':
##
## acf, arima
## The following object is masked from 'package:utils':
##
## tar
library(tseries)
## Warning: package 'tseries' was built under R version 3.5.3
library(forecast)
## Warning: package 'forecast' was built under R version 3.5.3
describe(dados)
## vars n mean sd median trimmed mad min max range
## Mes.Ano* 1 192 96.50 55.57 96.50 96.50 71.16 1 192.00 191.00
## IRA 2 192 431.05 215.66 426.50 421.09 263.90 103 904.00 801.00
## Precip 3 192 104.43 108.59 68.65 89.89 100.22 0 572.20 572.20
## Temp 4 192 25.17 1.61 25.59 25.25 1.22 21 28.74 7.74
## Umid 5 192 76.73 12.55 80.73 77.93 11.64 42 95.29 53.29
## skew kurtosis se
## Mes.Ano* 0.00 -1.22 4.01
## IRA 0.24 -0.98 15.56
## Precip 1.10 0.95 7.84
## Temp -0.52 -0.40 0.12
## Umid -0.74 -0.44 0.91
A unica variavel que tem a curtose positiva é Precip, ou seja possui caudas pesadas. As demais variáveis possui a curtose negativa, então a funçao de distribuição é mais achatada do que a distribuição normal
par(mfrow=c(2,2))
histPlot(as.timeSeries(dados$IRA))
histPlot(as.timeSeries(dados$Precip))
histPlot(as.timeSeries(dados$Temp))
histPlot(as.timeSeries(dados$Umid))
ad.test(dados$IRA)
##
## Anderson-Darling normality test
##
## data: dados$IRA
## A = 1.9645, p-value = 5.049e-05
ad.test(dados$Precip)
##
## Anderson-Darling normality test
##
## data: dados$Precip
## A = 8.0584, p-value < 2.2e-16
ad.test(dados$Temp)
##
## Anderson-Darling normality test
##
## data: dados$Temp
## A = 4.3088, p-value = 9.714e-11
ad.test(dados$Umid)
##
## Anderson-Darling normality test
##
## data: dados$Umid
## A = 4.8197, p-value = 5.67e-12
Conforme o tesde de normalidade, todas foram significativas, logo as variáveis IRA, Precip, Temp, Umid não segue normalidade.
cor.test(dados$IRA,dados$Precip, method = "kendall")
##
## Kendall's rank correlation tau
##
## data: dados$IRA and dados$Precip
## z = -4.5062, p-value = 6.599e-06
## alternative hypothesis: true tau is not equal to 0
## sample estimates:
## tau
## -0.2209575
cor.test(dados$IRA,dados$Temp, method = "kendall")
##
## Kendall's rank correlation tau
##
## data: dados$IRA and dados$Temp
## z = -3.9666, p-value = 7.29e-05
## alternative hypothesis: true tau is not equal to 0
## sample estimates:
## tau
## -0.19303
cor.test(dados$IRA,dados$Umid, method = "kendall")
##
## Kendall's rank correlation tau
##
## data: dados$IRA and dados$Umid
## z = -6.7018, p-value = 2.059e-11
## alternative hypothesis: true tau is not equal to 0
## sample estimates:
## tau
## -0.3257263
As variaveis IRA, Precip, Temp e Umid, tem correlação negativa. Ou seja nesse caso à medida que Precip, Temp e Umid cresce, a IRA decresce (em média), são inversamente proporcionais.
par(mfrow=c(2,2))
densityPlot(as.timeSeries(dados$IRA))
densityPlot(as.timeSeries(dados$Precip))
densityPlot(as.timeSeries(dados$Temp))
densityPlot(as.timeSeries(dados$Umid))
library(car)
## Loading required package: carData
##
## Attaching package: 'car'
## The following object is masked from 'package:fBasics':
##
## densityPlot
## The following object is masked from 'package:psych':
##
## logit
qqPlot(dados$IRA)
## [1] 19 45
mod<- glm(IRA ~ Precip + Temp + Umid , family = poisson)
summary(mod)
##
## Call:
## glm(formula = IRA ~ Precip + Temp + Umid, family = poisson)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -20.5705 -8.1575 0.1349 6.8616 17.9029
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 9.346e+00 6.158e-02 151.779 < 2e-16 ***
## Precip 2.348e-04 4.671e-05 5.026 5.01e-07 ***
## Temp -8.277e-02 2.208e-03 -37.479 < 2e-16 ***
## Umid -1.631e-02 3.426e-04 -47.611 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 21614 on 191 degrees of freedom
## Residual deviance: 16300 on 188 degrees of freedom
## AIC: 17796
##
## Number of Fisher Scoring iterations: 4
1-pchisq(16300,188)
## [1] 0
set.seed(1)
hnp(mod, print=TRUE,verb=TRUE,xlab="Quantis teóricos")
## Poisson model
## Simulation 1 out of 99
## Simulation 2 out of 99
## Simulation 3 out of 99
## Simulation 4 out of 99
## Simulation 5 out of 99
## Simulation 6 out of 99
## Simulation 7 out of 99
## Simulation 8 out of 99
## Simulation 9 out of 99
## Simulation 10 out of 99
## Simulation 11 out of 99
## Simulation 12 out of 99
## Simulation 13 out of 99
## Simulation 14 out of 99
## Simulation 15 out of 99
## Simulation 16 out of 99
## Simulation 17 out of 99
## Simulation 18 out of 99
## Simulation 19 out of 99
## Simulation 20 out of 99
## Simulation 21 out of 99
## Simulation 22 out of 99
## Simulation 23 out of 99
## Simulation 24 out of 99
## Simulation 25 out of 99
## Simulation 26 out of 99
## Simulation 27 out of 99
## Simulation 28 out of 99
## Simulation 29 out of 99
## Simulation 30 out of 99
## Simulation 31 out of 99
## Simulation 32 out of 99
## Simulation 33 out of 99
## Simulation 34 out of 99
## Simulation 35 out of 99
## Simulation 36 out of 99
## Simulation 37 out of 99
## Simulation 38 out of 99
## Simulation 39 out of 99
## Simulation 40 out of 99
## Simulation 41 out of 99
## Simulation 42 out of 99
## Simulation 43 out of 99
## Simulation 44 out of 99
## Simulation 45 out of 99
## Simulation 46 out of 99
## Simulation 47 out of 99
## Simulation 48 out of 99
## Simulation 49 out of 99
## Simulation 50 out of 99
## Simulation 51 out of 99
## Simulation 52 out of 99
## Simulation 53 out of 99
## Simulation 54 out of 99
## Simulation 55 out of 99
## Simulation 56 out of 99
## Simulation 57 out of 99
## Simulation 58 out of 99
## Simulation 59 out of 99
## Simulation 60 out of 99
## Simulation 61 out of 99
## Simulation 62 out of 99
## Simulation 63 out of 99
## Simulation 64 out of 99
## Simulation 65 out of 99
## Simulation 66 out of 99
## Simulation 67 out of 99
## Simulation 68 out of 99
## Simulation 69 out of 99
## Simulation 70 out of 99
## Simulation 71 out of 99
## Simulation 72 out of 99
## Simulation 73 out of 99
## Simulation 74 out of 99
## Simulation 75 out of 99
## Simulation 76 out of 99
## Simulation 77 out of 99
## Simulation 78 out of 99
## Simulation 79 out of 99
## Simulation 80 out of 99
## Simulation 81 out of 99
## Simulation 82 out of 99
## Simulation 83 out of 99
## Simulation 84 out of 99
## Simulation 85 out of 99
## Simulation 86 out of 99
## Simulation 87 out of 99
## Simulation 88 out of 99
## Simulation 89 out of 99
## Simulation 90 out of 99
## Simulation 91 out of 99
## Simulation 92 out of 99
## Simulation 93 out of 99
## Simulation 94 out of 99
## Simulation 95 out of 99
## Simulation 96 out of 99
## Simulation 97 out of 99
## Simulation 98 out of 99
## Simulation 99 out of 99
mod.2<- glm.nb(IRA ~ Precip + Temp + Umid , data = dados)
summary(mod.2)
##
## Call:
## glm.nb(formula = IRA ~ Precip + Temp + Umid, data = dados, init.theta = 4.439088697,
## link = log)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.25860 -0.90984 -0.00822 0.61173 1.82142
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 9.6310305 0.6520314 14.771 < 2e-16 ***
## Precip 0.0003234 0.0004354 0.743 0.457649
## Temp -0.0886262 0.0230982 -3.837 0.000125 ***
## Umid -0.0182615 0.0035407 -5.158 2.5e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for Negative Binomial(4.4391) family taken to be 1)
##
## Null deviance: 255.05 on 191 degrees of freedom
## Residual deviance: 199.24 on 188 degrees of freedom
## AIC: 2557.1
##
## Number of Fisher Scoring iterations: 1
##
##
## Theta: 4.439
## Std. Err.: 0.443
##
## 2 x log-likelihood: -2547.086
1-pchisq(199.24/4.439,188)
## [1] 1
set.seed(22)
hnp(mod.2, print=TRUE,verb=TRUE,xlab="Quantis teóricos")
## Negative binomial model (using MASS package)
## Simulation 1 out of 99
## Simulation 2 out of 99
## Simulation 3 out of 99
## Simulation 4 out of 99
## Simulation 5 out of 99
## Simulation 6 out of 99
## Simulation 7 out of 99
## Simulation 8 out of 99
## Simulation 9 out of 99
## Simulation 10 out of 99
## Simulation 11 out of 99
## Simulation 12 out of 99
## Simulation 13 out of 99
## Simulation 14 out of 99
## Simulation 15 out of 99
## Simulation 16 out of 99
## Simulation 17 out of 99
## Simulation 18 out of 99
## Simulation 19 out of 99
## Simulation 20 out of 99
## Simulation 21 out of 99
## Simulation 22 out of 99
## Simulation 23 out of 99
## Simulation 24 out of 99
## Simulation 25 out of 99
## Simulation 26 out of 99
## Simulation 27 out of 99
## Simulation 28 out of 99
## Simulation 29 out of 99
## Simulation 30 out of 99
## Simulation 31 out of 99
## Simulation 32 out of 99
## Simulation 33 out of 99
## Simulation 34 out of 99
## Simulation 35 out of 99
## Simulation 36 out of 99
## Simulation 37 out of 99
## Simulation 38 out of 99
## Simulation 39 out of 99
## Simulation 40 out of 99
## Simulation 41 out of 99
## Simulation 42 out of 99
## Simulation 43 out of 99
## Simulation 44 out of 99
## Simulation 45 out of 99
## Simulation 46 out of 99
## Simulation 47 out of 99
## Simulation 48 out of 99
## Simulation 49 out of 99
## Simulation 50 out of 99
## Simulation 51 out of 99
## Simulation 52 out of 99
## Simulation 53 out of 99
## Simulation 54 out of 99
## Simulation 55 out of 99
## Simulation 56 out of 99
## Simulation 57 out of 99
## Simulation 58 out of 99
## Simulation 59 out of 99
## Simulation 60 out of 99
## Simulation 61 out of 99
## Simulation 62 out of 99
## Simulation 63 out of 99
## Simulation 64 out of 99
## Simulation 65 out of 99
## Simulation 66 out of 99
## Simulation 67 out of 99
## Simulation 68 out of 99
## Simulation 69 out of 99
## Simulation 70 out of 99
## Simulation 71 out of 99
## Simulation 72 out of 99
## Simulation 73 out of 99
## Simulation 74 out of 99
## Simulation 75 out of 99
## Simulation 76 out of 99
## Simulation 77 out of 99
## Simulation 78 out of 99
## Simulation 79 out of 99
## Simulation 80 out of 99
## Simulation 81 out of 99
## Simulation 82 out of 99
## Simulation 83 out of 99
## Simulation 84 out of 99
## Simulation 85 out of 99
## Simulation 86 out of 99
## Simulation 87 out of 99
## Simulation 88 out of 99
## Simulation 89 out of 99
## Simulation 90 out of 99
## Simulation 91 out of 99
## Simulation 92 out of 99
## Simulation 93 out of 99
## Simulation 94 out of 99
## Simulation 95 out of 99
## Simulation 96 out of 99
## Simulation 97 out of 99
## Simulation 98 out of 99
## Simulation 99 out of 99
mod.3<- glm.nb(IRA ~ Temp + Umid , data = dados)
summary(mod.3)
##
## Call:
## glm.nb(formula = IRA ~ Temp + Umid, data = dados, init.theta = 4.428413664,
## link = log)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.26621 -0.86707 -0.00378 0.61908 1.79729
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 9.371591 0.559763 16.742 < 2e-16 ***
## Temp -0.082011 0.021540 -3.807 0.00014 ***
## Umid -0.016606 0.002771 -5.993 2.06e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for Negative Binomial(4.4284) family taken to be 1)
##
## Null deviance: 254.44 on 191 degrees of freedom
## Residual deviance: 199.26 on 189 degrees of freedom
## AIC: 2555.6
##
## Number of Fisher Scoring iterations: 1
##
##
## Theta: 4.428
## Std. Err.: 0.442
##
## 2 x log-likelihood: -2547.577
1-pchisq(199.26/4.428,189)
## [1] 1
set.seed(10)
hnp(mod.3, print=TRUE,verb=TRUE,xlab="Quantis teóricos")
## Negative binomial model (using MASS package)
## Simulation 1 out of 99
## Simulation 2 out of 99
## Simulation 3 out of 99
## Simulation 4 out of 99
## Simulation 5 out of 99
## Simulation 6 out of 99
## Simulation 7 out of 99
## Simulation 8 out of 99
## Simulation 9 out of 99
## Simulation 10 out of 99
## Simulation 11 out of 99
## Simulation 12 out of 99
## Simulation 13 out of 99
## Simulation 14 out of 99
## Simulation 15 out of 99
## Simulation 16 out of 99
## Simulation 17 out of 99
## Simulation 18 out of 99
## Simulation 19 out of 99
## Simulation 20 out of 99
## Simulation 21 out of 99
## Simulation 22 out of 99
## Simulation 23 out of 99
## Simulation 24 out of 99
## Simulation 25 out of 99
## Simulation 26 out of 99
## Simulation 27 out of 99
## Simulation 28 out of 99
## Simulation 29 out of 99
## Simulation 30 out of 99
## Simulation 31 out of 99
## Simulation 32 out of 99
## Simulation 33 out of 99
## Simulation 34 out of 99
## Simulation 35 out of 99
## Simulation 36 out of 99
## Simulation 37 out of 99
## Simulation 38 out of 99
## Simulation 39 out of 99
## Simulation 40 out of 99
## Simulation 41 out of 99
## Simulation 42 out of 99
## Simulation 43 out of 99
## Simulation 44 out of 99
## Simulation 45 out of 99
## Simulation 46 out of 99
## Simulation 47 out of 99
## Simulation 48 out of 99
## Simulation 49 out of 99
## Simulation 50 out of 99
## Simulation 51 out of 99
## Simulation 52 out of 99
## Simulation 53 out of 99
## Simulation 54 out of 99
## Simulation 55 out of 99
## Simulation 56 out of 99
## Simulation 57 out of 99
## Simulation 58 out of 99
## Simulation 59 out of 99
## Simulation 60 out of 99
## Simulation 61 out of 99
## Simulation 62 out of 99
## Simulation 63 out of 99
## Simulation 64 out of 99
## Simulation 65 out of 99
## Simulation 66 out of 99
## Simulation 67 out of 99
## Simulation 68 out of 99
## Simulation 69 out of 99
## Simulation 70 out of 99
## Simulation 71 out of 99
## Simulation 72 out of 99
## Simulation 73 out of 99
## Simulation 74 out of 99
## Simulation 75 out of 99
## Simulation 76 out of 99
## Simulation 77 out of 99
## Simulation 78 out of 99
## Simulation 79 out of 99
## Simulation 80 out of 99
## Simulation 81 out of 99
## Simulation 82 out of 99
## Simulation 83 out of 99
## Simulation 84 out of 99
## Simulation 85 out of 99
## Simulation 86 out of 99
## Simulation 87 out of 99
## Simulation 88 out of 99
## Simulation 89 out of 99
## Simulation 90 out of 99
## Simulation 91 out of 99
## Simulation 92 out of 99
## Simulation 93 out of 99
## Simulation 94 out of 99
## Simulation 95 out of 99
## Simulation 96 out of 99
## Simulation 97 out of 99
## Simulation 98 out of 99
## Simulation 99 out of 99
influenceIndexPlot(mod.3, vars = c('Studentized','Cook','Hat'), id.n=10)
Gerando um outro modelo sem os pontos (29,141,153,175,187).
mod.4<-glm.nb(IRA~Temp + Umid, data = dados[-c(29,141,153,175,187),])
summary(mod.4)
##
## Call:
## glm.nb(formula = IRA ~ Temp + Umid, data = dados[-c(29, 141,
## 153, 175, 187), ], init.theta = 4.681453477, link = log)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.17624 -0.86633 -0.02689 0.64024 1.83226
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 9.339802 0.560124 16.675 < 2e-16 ***
## Temp -0.076033 0.022114 -3.438 0.000585 ***
## Umid -0.018048 0.002801 -6.444 1.16e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for Negative Binomial(4.6815) family taken to be 1)
##
## Null deviance: 255.53 on 186 degrees of freedom
## Residual deviance: 193.75 on 184 degrees of freedom
## AIC: 2481.9
##
## Number of Fisher Scoring iterations: 1
##
##
## Theta: 4.681
## Std. Err.: 0.475
##
## 2 x log-likelihood: -2473.856
summary(mod.3)
##
## Call:
## glm.nb(formula = IRA ~ Temp + Umid, data = dados, init.theta = 4.428413664,
## link = log)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.26621 -0.86707 -0.00378 0.61908 1.79729
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 9.371591 0.559763 16.742 < 2e-16 ***
## Temp -0.082011 0.021540 -3.807 0.00014 ***
## Umid -0.016606 0.002771 -5.993 2.06e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for Negative Binomial(4.4284) family taken to be 1)
##
## Null deviance: 254.44 on 191 degrees of freedom
## Residual deviance: 199.26 on 189 degrees of freedom
## AIC: 2555.6
##
## Number of Fisher Scoring iterations: 1
##
##
## Theta: 4.428
## Std. Err.: 0.442
##
## 2 x log-likelihood: -2547.577
difinter<- 9.371591-9.339802
porcent<- difinter*100/9.371591
porcent
## [1] 0.339206
diftemp<- 0.082011-0.076033
porcent2<- diftemp*100/0.082011
porcent2
## [1] 7.289266
difumid<- (0.016606-0.018048)*-1
porcent3<- difumid*100/0.0182615
porcent3
## [1] 7.896394
Como houve uma variação maior que 5% em pelo menos uma das variáveis do modelo (Temp e Umid) não se pode reitrar estes pontos (29,141,153,175,187) do MLG.
par(mar = c(4,4,1,4))
plot(IRA, type = "l", col = "Green", lwd = 1,
ylab= "Infecção respiratória aguda",
bty = "n",
xlab="Meses", axes = FALSE)
axis(1,c(0,12,24,36,48,60,72,84,96,108,120,132,144,156,168,180,192,204), las=2)
axis(2,c(0,100,200,300,400,500,600,700,800,900,1000))
par(new=T)
#axis(2,col="black", col.axis = "black")
plot(Precip, type='l', frame=T, ann=F, axes = F, lty=2, lwd=1, bty="n", col = "Black")
axis(4,lty=2, col.axis = "black", c(0,100,200,300,400,500,600))
#axis(1)
mtext("Precipitação Pluviométrica média (mm)",side=4 , line = 3)
title(main="IRA X Precip")
par(mar = c(4,4,1,4))
plot(IRA, type = "l", col = "Green", lwd = 1,
ylab= "Infecção respiratória aguda",
bty = "n",
xlab="Meses", axes = FALSE)
axis(1,c(0,12,24,36,48,60,72,84,96,108,120,132,144,156,168,180,192), las = 2)
axis(2,c(0,100,200,300,400,500,600,700,800,900,1000))
par(new=T)
#axis(2,col="black", col.axis = "black")
plot(Temp, type='l', frame=T, ann=F, axes = F, lty=2, lwd=1, bty="n", col = "Black")
axis(4,lty=2, col.axis = "black", c(20,21,22,23,24,25,26,27,28,29,30))
#axis(1)
mtext("Temperatura média (ºC)",side=4 , line = 3)
title(main="IRA X Temp ")
par(mar = c(4,4,1,4))
plot(IRA, type = "l", col = "Green", lwd = 1,
ylab= "Infecção respiratória aguda",
bty = "n",
xlab="Meses", axes = FALSE)
axis(1,c(0,12,24,36,48,60,72,84,96,108,120,132,144,156,168,180,192), las = 2)
axis(2,c(0,100,200,300,400,500,600,700,800,900,1000))
par(new=T)
#axis(2,col="black", col.axis = "black")
plot(Umid, type='l', frame=T, ann=F, axes = F, lty=2, lwd=1, bty="n", col = "Black")
axis(4,lty=2, col.axis = "black", c(40,50,60,70,80,90,100))
#axis(1)
mtext("Umidade (%)",side=4 , line = 3)
title(main="IRA X Umid ")
Foi ajusatado para os dados uma serie com transformação logaritima para suavizar a variabilidade da serie, mas para a variavel precip quando foi aplicado a transformação logaritima, retornou valores infinitos, por esse motivo não se aplicou a transformação logaritima para a variavel precip.
dadosst<-dados[,-c(1)]
st1 <- ts(IRA,start = 1999, frequency=12)
st2 <- ts(Precip,start = 1999, frequency=12)
st3 <- ts(Temp,start = 1999, frequency=12)
st4 <- ts(Umid,start = 1999, frequency=12)
log.st1<- log(st1)
log.st3<- log(st3)
log.st4<- log(st4)
Se iniciam em janeiro de 1999 com frequência 12, ou seja, são dados mensais.
Em seguida será gerado o plot das variavéis.
par(mfrow=c(2,2))
plot(log.st1,type = "l", col='blue',main='IRA',xlab='Anos')
plot(st2,type = "l", col='Red',main='Precip',xlab='Anos')
plot(log.st3,type = "l", col='Yellow',main='Temp',xlab='Anos')
plot(log.st4,type = "l", col='Green',main='Umid',xlab='Anos')
par(mfrow=c(2,2))
histPlot(as.timeSeries(st1))
histPlot(as.timeSeries(st2))
histPlot(as.timeSeries(st3))
histPlot(as.timeSeries(st4))
adf.test(log.st1)
##
## Augmented Dickey-Fuller Test
##
## data: log.st1
## Dickey-Fuller = -4.8978, Lag order = 5, p-value = 0.01
## alternative hypothesis: stationary
adf.test(st2)
##
## Augmented Dickey-Fuller Test
##
## data: st2
## Dickey-Fuller = -10.955, Lag order = 5, p-value = 0.01
## alternative hypothesis: stationary
adf.test(log.st3)
##
## Augmented Dickey-Fuller Test
##
## data: log.st3
## Dickey-Fuller = -7.2784, Lag order = 5, p-value = 0.01
## alternative hypothesis: stationary
adf.test(log.st4)
##
## Augmented Dickey-Fuller Test
##
## data: log.st4
## Dickey-Fuller = -10.102, Lag order = 5, p-value = 0.01
## alternative hypothesis: stationary
Usando adf.test, ele gera um valor p muito pequeno, por essa razão o software no retorna o valor p de 0.01 que significa que os dados são estacionários.
a<-decompose(log.st1, type=c("additive", "multiplicative"), filter=NULL)
plot(a)
b<-decompose(st2, type=c("additive", "multiplicative"), filter=NULL)
plot(b)
c<-decompose(log.st3, type=c("additive", "multiplicative"), filter=NULL)
plot(c)
d<-decompose(log.st4, type=c("additive", "multiplicative"), filter=NULL)
plot(d)
Na figura acima observa-se que todas as quatros séries aparentemente apresemtam sazonalidade e a variavel IRA apresenta uma tendência é decrescente.
par(mfrow=c(2,2))
acf(log.st1,lag.max=150, main="Correlograma da serie IRA")
acf(st2,lag.max=150, main="Correlograma da serie Precip")
acf(log.st3,lag.max=150, main="Correlograma da serie Temp")
acf(log.st4,lag.max=150, main="Correlograma da serie Umid")
A acf amostral comprova que as séries são não-estacionária, pois apresentam um decaimento pra zero lento.
par(mfrow=c(2,2))
pacf(log.st1,lag.max=100, main = "Correlograma Parcial IRA")
pacf(st2,lag.max=100, main = "Correlograma Parcial Precip")
pacf(log.st3,lag.max=100, main = "Correlograma Parcial Temp")
pacf(log.st4,lag.max=100, main = "Correlograma Parcial Umid")
Os picos na PACF representam as fortes correlações entre os meses da série.
Será modelada a série IRA (Infecção Respiratoria aguda) com valores que abrangem o período de Janeiro/1999 a Dezembro/2014. Consideraremos previsões de 1,3,6 e 12 passos à frente, a fim de avaliar as capacidades preditivas dos diferentes métodos em diferentes horizontes de previsão.
Usaremos inicialmente os algoritmos de suavização aditivo e multiplicativo de Holt-Winters para produzir previsões das variavéis.
observado.st1<-log.st1[length(log.st1)]
observado.st2<-st2[length(st2)]
observado.st3<-log.st3[length(log.st3)]
observado.st4<-log.st4[length(log.st4)]
####IRA
## Previsão 6 passo à frente (Sem última observação)
IRA6p <- log.st1[-((length(log.st1)-5):length(log.st1))]
st16p <- ts(IRA6p, start = c(1999,7), frequency = 12)
## Previsão 12 passo à frente (Sem última observação)
IRA12p <- log.st1[-((length(log.st1)-11):length(log.st1))]
st112p <- ts(IRA12p, start = c(1994,7), frequency = 12)
####Precipitação
## Previsão 6 passo à frente (Sem última observação)
Precip6p <- st2[-((length(st2)-5):length(st2))]
st26p <- ts(Precip6p, start = c(1999,7), frequency = 12)
## Previsão 12 passo à frente (Sem última observação)
Precip12p <- st2[-((length(st2)-11):length(st2))]
st212p <- ts(Precip12p, start = c(1994,7), frequency = 12)
####Temperatura
## Previsão 6 passo à frente (Sem última observação)
Temp6p <- log.st3[-((length(log.st3)-5):length(log.st3))]
st36p <- ts(Temp6p, start = c(1999,7), frequency = 12)
## Previsão 12 passo à frente (Sem última observação)
Precip12p <- log.st3[-((length(log.st3)-11):length(log.st3))]
st312p <- ts(Precip12p, start = c(1994,7), frequency = 12)
####Umidade
Umid6p <- log.st4[-((length(log.st4)-5):length(log.st4))]
st46p <- ts(Umid6p, start = c(1999,7), frequency = 12)
## Previsão 12 passo à frente (Sem última observação)
Umid12p <- log.st4[-((length(log.st4)-11):length(log.st4))]
st412p <- ts(Umid12p, start = c(1994,7), frequency = 12)
hwa6p.st1 <- HoltWinters(st16p, seasonal = "additive")
hwa12p.st1 <- HoltWinters(st112p, seasonal = "additive")
hwa6p.st2<-HoltWinters(st26p, seasonal="additive")
hwa12p.st2<-HoltWinters(st212p, seasonal="additive")
hwa6p.st3<-HoltWinters(st36p, seasonal="additive")
hwa12p.st3<-HoltWinters(st312p, seasonal="additive")
hwa6p.st4<-HoltWinters(st46p, seasonal="additive")
hwa12p.st4<-HoltWinters(st412p, seasonal="additive")
seguindo com as previsões para as 4 séries em 6 passos a frente:
prev.hwa6p.st1<-predict(hwa6p.st1,6)[6]
prev.hwa6p.st2<-predict(hwa6p.st2,6)[6]
prev.hwa6p.st3<-predict(hwa6p.st3,6)[6]
prev.hwa6p.st4<-predict(hwa6p.st4,6)[6]
e o erro relativo de previsão:
(observado.st1-prev.hwa6p.st1)/observado.st1*100
## [1] 5.145289
(observado.st2-prev.hwa6p.st2)/observado.st2*100
## [1] 32.90784
(observado.st3-prev.hwa6p.st3)/observado.st3*100
## [1] -0.8407157
(observado.st4-prev.hwa6p.st4)/observado.st4*100
## [1] 0.6260008
Agora, as previsões para as 4 séries em 12 passos a frente:
prev.hwa12p.st1<-predict(hwa12p.st1,12)[12]
prev.hwa12p.st2<-predict(hwa12p.st2,12)[12]
prev.hwa12p.st3<-predict(hwa12p.st3,12)[12]
prev.hwa12p.st4<-predict(hwa12p.st4,12)[12]
Finalmente, o erro relativo de previsão para 12 passos a frente:
(observado.st1-prev.hwa12p.st1)/observado.st1*100
## [1] -1.786785
(observado.st2-prev.hwa12p.st2)/observado.st2*100
## [1] 24.49074
(observado.st3-prev.hwa12p.st3)/observado.st3*100
## [1] -0.8249554
(observado.st4-prev.hwa12p.st4)/observado.st4*100
## [1] -0.2383521
O erro relativo de previsão para a variavel precip foi muito elevado a 6 passos a frente em relação as demais, as outras tiveram um pequeno erro relativo de previsão.
par(mfrow = c(2,2))
plot(hwa6pp.ira,xlab="Tempo",ylab="Percentual",main="IRA")
plot(hwa6pp.precip,xlab="Tempo",ylab="Percentual",main="Precip")
plot(hwa6pp.umid,xlab="Tempo",ylab="Percentual",main="Umid")
plot(hwa6pp.temp,xlab="Tempo",ylab="Percentual",main="Temp")
O gráfico abaixo compara os valores ajustados e observados, ao qual a linha vermelha é o valor ajustado e a linha preta é o valor observado.
par(mfrow = c(2,2))
plot(hwa6p.st1,xlab="Tempo",ylab="Observado/Ajustado",main="IRA")
plot(hwa6p.st2,xlab="Tempo",ylab="Observado/Ajustado",main="Precip")
plot(hwa6p.st3,xlab="Tempo",ylab="Observado/Ajustado",main="Umid")
plot(hwa6p.st4,xlab="Tempo",ylab="Observado/Ajustado",main="Temp")
Interpretação: De acordo com o gráfico acima, o modelo não ajustou bem o início das séries acima. Por sua vez, abaixo apresenta os gráficos para 12 passos a frente.
par(mfrow = c(2,2))
plot(hwa12pp.ira,xlab="Tempo",ylab="Percentual",main="IRA")
plot(hwa12pp.precip,xlab="Tempo",ylab="Percentual",main="Precip")
plot(hwa12pp.umid,xlab="Tempo",ylab="Percentual",main="Umid")
plot(hwa12pp.temp,xlab="Tempo",ylab="Percentual",main="Temp")
É necessário realizar o gráfico dos valores ajustados versus observado, para 12 passos a frente, como foi feito para 6 passos:
par(mfrow = c(2,2))
plot(hwa12p.st1,xlab="Tempo",ylab="Observado/Ajustado",main="IRA")
plot(hwa12p.st2,xlab="Tempo",ylab="Observado/Ajustado",main="Precip")
plot(hwa12p.st3,xlab="Tempo",ylab="Observado/Ajustado",main="Umid")
plot(hwa12p.st4,xlab="Tempo",ylab="Observado/Ajustado",main="Temp")