You can run this code directly at RStudio Cloud after a free registration and importing it into your own space. On the right panel named FILES, click on TransiCoding.Rmd to open it. Now you can tweak the code and run it by yourself but be patient the first time since a number of components will need to be installed. The perron.csv data source is included.
# install.packages("tseries")
# install.packages("forecast")
# install.packages("FitAR")
# install.packages("strucchange")
# install.packages("readxl")
First, we load libraries and call the dataset “transi.csv” containing Log-Income and Ecology score in basis points (both scaled 10^5 to generate whole numbers). We artifically declare it as a time series (as if income was a deterministic function of time). To test later for trend breaks, we shall use a function developped for the Bai-Perron test implementation at https://rdrr.io/cran/strucchange/man/RealInt.html.
library(tseries)
library(forecast)
library(FitAR)
library(strucchange)
vcov.ri <- function(x, ...) kernHAC(x, kernel = "Quadratic Spectral",
prewhite = 1, approx = "AR(1)", ...)
library(readxl)
raw <- read.csv("transi.csv", header=T)
summary(raw)
## Income Ecology
## Min. :3.475 Min. : 0.000
## 1st Qu.:4.191 1st Qu.: 1.657
## Median :4.384 Median : 3.392
## Mean :4.351 Mean : 4.071
## 3rd Qu.:4.547 3rd Qu.: 5.728
## Max. :4.994 Max. :16.688
DAT <- ts(raw, frequency=1)
We test for a break in the Y (ecology) trend using the first difference and applying a test for structural change. The Bayesian Information Criteria (BIC) recommended by Bai & Perron reaches its minimum for m = 2 or 3; since information criteria are often downward-biased, these authors argue in favor of the presence of an additional break, we thus choose m = 3 at observations 212, 433, 568 (18086, 29710, 41345€). The confidence intervals for the breakpoints are well separated. The means over the 4 intervals are distinct at 2.1, 3.5, 5.7 and 7.1 (basis points for ecology mention in manifestos); coefficients have high significance (very low p-values).
U = DAT[,"Ecology"]
bp.ri <- breakpoints(U ~ 1)
summary(bp.ri)
##
## Optimal (m+1)-segment partition:
##
## Call:
## breakpoints.formula(formula = U ~ 1)
##
## Breakpoints at observation number:
##
## m = 1 373
## m = 2 335 463
## m = 3 212 433 568
## m = 4 212 363 463 568
## m = 5 106 212 363 463 568
##
## Corresponding to breakdates:
##
## m = 1 373
## m = 2 335 463
## m = 3 212 433 568
## m = 4 212 363 463 568
## m = 5 106 212 363 463 568
##
## Fit:
##
## m 0 1 2 3 4 5
## RSS 6920 4965 4740 4629 4570 4553
## BIC 3491 3281 3263 3260 3264 3275
plot(bp.ri)
confid <- confint(bp.ri)
plot(U)
lines(bp.ri)
lines(confid)
coef(bp.ri, breaks = 3)
## (Intercept)
## 1 - 212 2.053588
## 213 - 433 3.527106
## 434 - 568 5.752113
## 569 - 673 7.127397
sapply(vcov(bp.ri, breaks = 3, vcov = vcov.ri), sqrt)
## 1 - 212 213 - 433 434 - 568 569 - 673
## 0.1193592 0.1716740 0.3168263 0.3120848
confint(bp.ri, breaks = 3, vcov = vcov.ri)
##
## Confidence intervals for breakpoints
## of optimal 4-segment partition:
##
## Call:
## confint.breakpointsfull(object = bp.ri, breaks = 3, vcov. = vcov.ri)
##
## Breakpoints at observation number:
## 2.5 % breakpoints 97.5 %
## 1 177 212 226
## 2 401 433 446
## 3 510 568 650
##
## Corresponding to breakdates:
## 2.5 % breakpoints 97.5 %
## 1 177 212 226
## 2 401 433 446
## 3 510 568 650
fac.ri <- breakfactor(bp.ri, breaks = 3, label = "Ecology")
fm.ri <- lm(U ~ 0 + fac.ri)
plot(U)
lines(fitted(bp.ri, breaks = 3), col = 2)
summary(fm.ri)
##
## Call:
## lm(formula = U ~ 0 + fac.ri)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.7020 -1.7523 -0.3696 1.2274 13.1609
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## fac.riEcology1 2.0536 0.1807 11.37 <2e-16 ***
## fac.riEcology2 3.5271 0.1770 19.93 <2e-16 ***
## fac.riEcology3 5.7521 0.2264 25.41 <2e-16 ***
## fac.riEcology4 7.1274 0.2567 27.76 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.631 on 669 degrees of freedom
## Multiple R-squared: 0.7439, Adjusted R-squared: 0.7423
## F-statistic: 485.7 on 4 and 669 DF, p-value: < 2.2e-16