GDPUSA TIME SERIES
library(tseries)
# 1. Load data
GDP_data <- scan("GDPUSA.dat")
# 2. Convert to Time Series object
ts_GDP <- ts(GDP_data, start = c(1947, 1), frequency = 4)
1) Plotting the Data
plot(ts_GDP, main= "GDP USA")
The plot shows a clear upward trend, so the series is non-stationary in levels: GDP rises persistently and the slope increases over time.
As the level rises, fluctuations grow in absolute size, which suggests that the variance is not constant. To confirm this, we will examine log differences (growth rates).
A few slowdowns and dips are visible (e.g. 2008–09, 2020), yet the series returns to its growth path.
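The link between log differences and growth rates can be illustrated with a short sketch. It uses a simulated series (the name `x` and its parameters are assumptions for illustration, not the GDPUSA data) and compares the exact quarter-on-quarter growth rate with the log difference.

```r
# Hypothetical quarterly series growing about 1% per quarter (not the GDPUSA data)
set.seed(1)
x <- ts(100 * cumprod(1 + rnorm(40, mean = 0.01, sd = 0.005)),
        start = c(1947, 1), frequency = 4)

# Exact quarter-on-quarter growth rate: x_t / x_{t-1} - 1
growth_exact <- x[-1] / x[-length(x)] - 1

# Log difference: log(x_t) - log(x_{t-1})
growth_log <- as.numeric(diff(log(x)))

# The two measures stay close while growth rates are small
max_gap <- max(abs(growth_exact - growth_log))
print(max_gap)
```

The gap between the two measures is of the order of the squared growth rate, which is why the approximation works well for quarterly GDP but would be less accurate for series with large period-to-period changes.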
2) Decomposing the Time Series and Applying a Log Transformation
# Create a grouping variable. Each group will have 8 observations (2 years of quarterly data)
n <- length(ts_GDP)
groups <- gl(n = ceiling(n/8), k = 8, length = n)
# 1. Calculate Mean and Variance for each group
group_means <- tapply(ts_GDP, groups, mean)
group_vars <- tapply(ts_GDP, groups, var)
# 2. Plot Mean vs Variance
plot(group_means, group_vars,
     main = "Mean-Variance Relationship",
     xlab = "Mean", ylab = "Variance", pch = 19)
# 3. Boxplots grouped by period
boxplot(ts_GDP ~ groups,
        main = "Boxplots (Every 2 Years)",
        xlab = "2-Year Groups", ylab = "GDP")
The plot confirms a positive relation between mean and variance, meaning
that the upward trend comes with greater variability in absolute terms.
The points start low on the left and rise towards the right,
so the variance tends to increase as the GDP mean grows. Therefore the
series is not stationary in variance.
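A common rule of thumb for choosing a variance-stabilising transformation is to regress the log of the group standard deviation on the log of the group mean: a slope near 1 points to a log transformation, a slope near 0 to none. The sketch below applies this to a simulated series (the names `sim`, `grp`, and all parameters are assumptions for illustration, not the GDPUSA data).

```r
# Hypothetical series whose fluctuations scale with its level (not the GDPUSA data)
set.seed(42)
n_sim <- 320
level <- 100 * exp(0.01 * seq_len(n_sim))           # smooth exponential trend
sim <- ts(level * exp(rnorm(n_sim, sd = 0.05)),     # multiplicative noise
          start = c(1947, 1), frequency = 4)

# Same grouping idea as above: blocks of 8 quarters
grp <- gl(n = ceiling(n_sim / 8), k = 8, length = n_sim)
grp_mean <- tapply(sim, grp, mean)
grp_sd   <- tapply(sim, grp, sd)

# Slope of log(sd) on log(mean): near 1 points to a log transformation,
# near 0 points to no transformation
fit <- lm(log(grp_sd) ~ log(grp_mean))
slope <- unname(coef(fit)[2])
print(round(slope, 2))
```

Running the same regression on `group_means` and `sqrt(group_vars)` from the real data would give a rough numerical counterpart to the scatter plot, keeping in mind that within-group spread also picks up the trend inside each 2-year block.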
We correct for this in the next step by applying a log transformation:
# 1. Apply Log transformation
ln_gdp <- log(ts_GDP)
# 2. Plot the new transformed series
plot(ln_gdp, main = "Log-Transformed GDP", ylab = "Log(GDP)")
3) Spotting Seasonal Patterns
Despite the log transformation, the series still shows a clear upward trend. Taking the log stabilized the variance (the size of the fluctuations), but the mean still rises over time. Since stationarity requires a constant mean, the log series is still non-stationary.
Now, let's identify and, if necessary, remove any seasonal pattern in our log-transformed series.
# 1. Seasonal Sub-series Plot (Monthplot)
# This shows the average value for each quarter (Q1, Q2, Q3, Q4)
monthplot(ln_gdp, main = "Seasonal Sub-series Plot", ylab = "Log(GDP)")
# 2. Autocorrelation Function (ACF)
# lag.max = 20 allows us to see several years back (4 quarters x 5 years)
acf(ln_gdp, lag.max = 20, main = "ACF of Log(GDP)")
- When identifying seasonality, the question we are trying to answer is: does the series behave differently by season (quarter), after accounting for the overall trend?
- In the monthplot, the horizontal line in each panel is the average for that quarter.
- Because these averages are almost the same across quarters, seasonal differences in log GDP are close to zero.
- If seasonality were strong, we’d see quarter means at clearly different heights (and/or different levels across panels).
Conclusion: The dominant feature is the long-run trend (GDP growing), not seasonality. This is also confirmed by the ACF plot: the bars (autocorrelations at each lag) start high and decay very slowly, meaning the data has a non-stationary mean.
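The visual comparison of quarter means can be complemented by a rough numerical check. The sketch below is one possible approach, not part of the original analysis: it removes the trend by differencing and then tests whether the mean growth rate differs by quarter. The simulated series `y` and its parameters are assumptions for illustration, and the F-test assumes roughly uncorrelated errors, so it should be read as indicative only.

```r
# Hypothetical trending quarterly series without seasonality (not the GDPUSA data)
set.seed(7)
y <- ts(5 + 0.008 * seq_len(200) + cumsum(rnorm(200, sd = 0.01)),
        start = c(1947, 1), frequency = 4)

# Work with differences so the trend does not mask quarter effects
dy <- as.numeric(diff(y))
quarter <- factor(as.integer(cycle(diff(y))))  # 1..4 = position within the year

# Compare a constant-mean model with one that allows a different mean per quarter
fit_const   <- lm(dy ~ 1)
fit_quarter <- lm(dy ~ quarter)
f_test <- anova(fit_const, fit_quarter)
print(f_test)
```

A large p-value in the second row of the table is consistent with the flat quarter means seen in the monthplot; a small one would suggest quarter-specific behaviour worth modelling.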
4) Applying Regular Differences
# 1. Take the first difference to remove the trend
diff_ln_gdp <- diff(ln_gdp)
# 2. Plot the result
plot(diff_ln_gdp, main = "First Difference of Log(GDP)", ylab = "Growth Rate")
Differencing log GDP gives (approximate) growth rates. The graph shows that we
have removed the trend, i.e. achieved a roughly constant mean. However, it seems
we have not been entirely successful in stabilizing the variance, since
the volatility of the series looks much higher in the earlier years
and falls from 1980 onward. Variance stability is
questionable and should be checked formally.
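One simple way to check variance stability formally is to split the growth series at the suspected break and compare the two sample variances. The sketch below uses `window()` and `var.test()` from base R on a simulated series; the series `g`, the 1980 split, and all parameters are assumptions chosen to mimic the pattern described above, not results from the GDPUSA data.

```r
# Hypothetical growth series: more volatile before 1980 than after (not the GDPUSA data)
set.seed(3)
g <- ts(c(rnorm(132, mean = 0.008, sd = 0.012),   # 1947 Q1 to 1979 Q4
          rnorm(160, mean = 0.007, sd = 0.006)),  # 1980 Q1 to 2019 Q4
        start = c(1947, 1), frequency = 4)

# Split the series at the break suggested by the plot
early <- window(g, end = c(1979, 4))
late  <- window(g, start = c(1980, 1))

# F-test for equal variances (assumes roughly normal, independent observations)
vt <- var.test(as.numeric(early), as.numeric(late))
print(vt$estimate)   # ratio of sample variances, early / late
print(vt$p.value)
```

On the real data the same two `window()` calls would be applied to `diff_ln_gdp`. Because growth rates are mildly autocorrelated and the break date was picked by eye, the resulting p-value is best treated as a rough guide.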
# Check ACF of the differenced series (Trend is now gone!)
acf(diff_ln_gdp, lag.max = 20, main = "ACF of Differenced Log(GDP)")
The absence of significant spikes at lags 4 or 8 (shown as 1.0 and 2.0 on the plot, since R labels the lag axis in years for quarterly data) indicates that no
seasonal differencing is needed, distinguishing simple momentum (lags
1-3) from actual seasonality.
Differencing log GDP turns the highly persistent level series into a near-stationary growth series: autocorrelation is concentrated at lag 1 (short-run persistence), with little structure beyond the first few lags.
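The reading of the ACF plot can also be done numerically. The sketch below extracts the autocorrelations, compares them with the approximate 95% white-noise band, and adds a Ljung-Box test. It runs on a simulated AR(1) growth series; the name `g` and the AR coefficient of 0.4 are assumptions for illustration, not estimates from the GDPUSA data.

```r
# Hypothetical growth series with short-run persistence only (not the GDPUSA data)
set.seed(11)
g <- ts(as.numeric(arima.sim(model = list(ar = 0.4), n = 300, sd = 0.01)) + 0.008,
        start = c(1947, 2), frequency = 4)

# Autocorrelations without plotting; element 1 is lag 0, so lag k is element k + 1
rho <- as.numeric(acf(g, lag.max = 20, plot = FALSE)$acf)

# Approximate 95% band under the white-noise null (the dashed lines on the ACF plot)
bound <- 1.96 / sqrt(length(g))

# Which quarterly lags fall outside the band? Lags 4, 8, ... would hint at seasonality
signif_lags <- which(abs(rho[-1]) > bound)
print(signif_lags)

# Ljung-Box test of joint autocorrelation up to lag 8
lb <- Box.test(g, lag = 8, type = "Ljung-Box")
print(lb$p.value)
```

With 20 lags inspected, about one spurious exceedance of the band is expected by chance, so an isolated marginal spike is weak evidence on its own.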
Conclusion: The series diff_ln_gdp is the most appropriate stationary transformation for modeling.
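Stationarity can also be checked with formal tests rather than by eye. The sketch below uses `adf.test()` and `kpss.test()` from the `tseries` package loaded at the start, applied to a simulated random walk with drift; the series `ln_sim` and its parameters are assumptions for illustration, not the GDPUSA data.

```r
library(tseries)  # already loaded at the top of this analysis

# Hypothetical log-level series: random walk with drift (not the GDPUSA data)
set.seed(5)
ln_sim <- ts(7 + cumsum(rnorm(300, mean = 0.008, sd = 0.01)),
             start = c(1947, 1), frequency = 4)
d_sim <- diff(ln_sim)

# ADF: null hypothesis = unit root (non-stationary)
# suppressWarnings() hides the note issued when p falls outside the tabulated range
adf_level <- suppressWarnings(adf.test(ln_sim))
adf_diff  <- suppressWarnings(adf.test(d_sim))

# KPSS: null hypothesis = level stationarity, so it complements the ADF test
kpss_diff <- suppressWarnings(kpss.test(d_sim, null = "Level"))

print(c(adf_level = adf_level$p.value,
        adf_diff  = adf_diff$p.value,
        kpss_diff = kpss_diff$p.value))
```

For the real data, the same calls would be applied to `ln_gdp` and `diff_ln_gdp`. A small ADF p-value together with a large KPSS p-value on the differenced series would support treating it as stationary in mean, though neither test addresses the change in volatility.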