GDPUSA TIME SERIES

library(tseries)
# 1. Load data
GDP_data <- scan("GDPUSA.dat")

# 2. Convert to Time Series object
ts_GDP <- ts(GDP_data, start = c(1947, 1), frequency = 4)

1) Plotting the Data

plot(ts_GDP, main = "GDP USA")

2) Decomposing the Time Series and Applying a Log Transformation

# Create a grouping variable. Each group will have 8 observations (2 years of quarterly data)

n <- length(ts_GDP)
groups <- gl(n = ceiling(n/8), k = 8, length = n)

# 1. Calculate Mean and Variance for each group
group_means <- tapply(ts_GDP, groups, mean)
group_vars <- tapply(ts_GDP, groups, var)

# 2. Plot Mean vs Variance
plot(group_means, group_vars, 
     main = "Mean-Variance Relationship",
     xlab = "Mean", ylab = "Variance", pch = 19)

# 3. Boxplots grouped by period
boxplot(ts_GDP ~ groups, 
        main = "Boxplots (Every 2 Years)",
        xlab = "2-Year Groups", ylab = "GDP")

The plot confirms a positive relationship between mean and variance: the points start low on the left and rise toward the right, so the variance tends to increase as the mean level of GDP grows. In other words, the upward trend brings greater variability in absolute terms. Since neither the mean nor the variance is constant over time, the series is not stationary.
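The visual impression can be backed by a number. The sketch below regresses the log of the group standard deviation on the log of the group mean; a slope close to 1 means the standard deviation is roughly proportional to the level, which is the situation where a log transformation is the usual choice. It runs on a simulated stand-in series (an assumption, not the real GDPUSA data), and the names `sim_gdp`, `sim_groups`, `sim_means` and `sim_sds` are hypothetical.

```r
# Simulated stand-in for the GDP series (assumption: not the real data):
# exponential growth with noise proportional to the level.
set.seed(1)
n <- 240
log_growth <- rnorm(n, mean = 0.015, sd = 0.01)
sim_gdp <- ts(100 * exp(cumsum(log_growth)), start = c(1947, 1), frequency = 4)

# Same grouping as in the analysis: blocks of 8 quarters (2 years)
sim_groups <- gl(n = ceiling(n / 8), k = 8, length = n)
sim_means <- tapply(as.numeric(sim_gdp), sim_groups, mean)
sim_sds <- tapply(as.numeric(sim_gdp), sim_groups, sd)

# Regress log(sd) on log(mean); the slope summarises the relationship
fit <- lm(log(sim_sds) ~ log(sim_means))
slope <- unname(coef(fit)[2])
print(round(slope, 2))
```

Replacing `sim_gdp` with `ts_GDP` would apply the same check to the real series.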

We try to correct that in the following step by applying a Log transformation:

# 1. Apply Log transformation
ln_gdp <- log(ts_GDP)

# 2. Plot the new transformed series
plot(ln_gdp, main = "Log-Transformed GDP", ylab = "Log(GDP)")

3) Spotting Seasonal Patterns

Despite the log transformation, the series still shows a clear upward trend. Taking logs stabilized the variance (the size of the fluctuations no longer grows with the level), but the mean still rises over time. Because stationarity requires, among other things, a constant mean, the log series is still not stationary.
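One way to check that the log transformation really did stabilize the variance is to compare how much the group standard deviations differ before and after taking logs. The sketch below does this on a simulated stand-in series (an assumption, not the real data); the helper `sd_spread` is hypothetical and simply returns the largest group standard deviation divided by the smallest.

```r
# Simulated stand-in for the GDP series (assumption: not the real data)
set.seed(1)
n <- 240
sim_gdp <- 100 * exp(cumsum(rnorm(n, mean = 0.015, sd = 0.01)))
sim_groups <- gl(n = ceiling(n / 8), k = 8, length = n)

# Spread of the group standard deviations: largest divided by smallest
sd_spread <- function(x) {
  s <- tapply(x, sim_groups, sd)
  max(s) / min(s)
}

spread_raw <- sd_spread(sim_gdp)       # original scale
spread_log <- sd_spread(log(sim_gdp))  # log scale
print(c(raw = spread_raw, log = spread_log))
```

A much smaller ratio on the log scale indicates that the variability no longer depends on the level.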

Next, we check whether the log-transformed series contains a seasonal pattern that would need to be removed.

# 1. Seasonal Sub-series Plot (Monthplot)
# This shows the average value for each quarter (Q1, Q2, Q3, Q4)
monthplot(ln_gdp, main = "Seasonal Sub-series Plot", ylab = "Log(GDP)")

# 2. Autocorrelation Function (ACF)
# lag.max = 20 allows us to see several years back (4 quarters x 5 years)
acf(ln_gdp, lag.max = 20, main = "ACF of Log(GDP)")

- When identifying seasonality, the question is: does the series behave differently by season (quarter) after accounting for the overall trend?
- In the monthplot, the horizontal line in each panel is the average for that quarter.
- Because these averages are almost the same across quarters, seasonal differences in log GDP are close to zero.
- If seasonality were strong, we would see the quarter means at clearly different heights.

Conclusion: The dominant feature is the long-run trend (GDP growing), not seasonality. This is also confirmed by the ACF plot: the bars (the autocorrelation at each lag) start high and decrease very slowly, which indicates a non-stationary mean.
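The comparison of quarter averages can also be done numerically. The sketch below removes the trend by differencing, computes the mean growth for each quarter with `cycle()`, and runs an F-test on quarter dummies. It uses a simulated non-seasonal series (an assumption, not the real data), and the names `sim_ln`, `sim_growth` and `quarter` are hypothetical.

```r
# Simulated log series with a trend but no seasonality (assumption)
set.seed(2)
n <- 240
sim_ln <- ts(cumsum(rnorm(n, mean = 0.015, sd = 0.01)),
             start = c(1947, 1), frequency = 4)

# Remove the trend by differencing, then label each observation by quarter
sim_growth <- diff(sim_ln)
quarter <- factor(as.integer(cycle(sim_growth)), labels = paste0("Q", 1:4))

# Average growth per quarter: similar values suggest no seasonal pattern
quarter_means <- tapply(as.numeric(sim_growth), quarter, mean)
print(round(quarter_means, 4))

# F-test of equal quarter means (regression on quarter dummies)
f_test <- anova(lm(as.numeric(sim_growth) ~ quarter))
p_value <- f_test[["Pr(>F)"]][1]
print(p_value)
```

A large p-value would be consistent with the absence of seasonality; the test treats the observations as independent, so it is only a rough guide for autocorrelated data.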

4) Applying Regular Differences

# 1. Take the first difference to remove the trend
diff_ln_gdp <- diff(ln_gdp)

# 2. Plot the result
plot(diff_ln_gdp, main = "First Difference of Log(GDP)", ylab = "Growth Rate")

Differencing log GDP gives (approximate) growth rates. The graph shows that we have removed the trend, i.e. achieved a roughly constant mean. However, we do not seem to have fully stabilized the variance: the volatility of the series looks much higher in the earlier years and falls from about 1980 onward. Variance stability is therefore questionable and should be checked formally.
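A simple formal check, sketched here under stated assumptions, is to split the growth series at 1980 and compare the two variances with an F-test (`var.test` from base R). The data are simulated so that volatility halves from 1980 onward (an assumption, not the real data), the 1980 split date is taken from the visual impression above, and the names `pre` and `post` are hypothetical. The F-test assumes independent, normally distributed observations, which holds only approximately for growth rates.

```r
# Simulated growth series (assumption: not the real data) whose
# volatility halves from 1980 Q1 onward
set.seed(3)
early <- rnorm(131, mean = 0.008, sd = 0.012)  # 1947 Q2 to 1979 Q4
late <- rnorm(100, mean = 0.008, sd = 0.006)   # 1980 Q1 to 2004 Q4
sim_growth <- ts(c(early, late), start = c(1947, 2), frequency = 4)

# Split the series at the suspected break date
pre <- window(sim_growth, end = c(1979, 4))
post <- window(sim_growth, start = c(1980, 1))

# F-test for equality of the two variances
vt <- var.test(as.numeric(pre), as.numeric(post))
print(c(sd_pre = sd(pre), sd_post = sd(post), p_value = vt$p.value))
```

Applying the same split to `diff_ln_gdp` would show whether the drop in volatility seen in the plot is statistically significant.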

# Check ACF of the differenced series (Trend is now gone!)
acf(diff_ln_gdp, lag.max = 20, main = "ACF of Differenced Log(GDP)")

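The absence of significant spikes at lags 4 and 8 (shown as 1 and 2 on the plot's lag axis, which is measured in years for quarterly data) suggests that no seasonal differencing is needed; the autocorrelation at lags 1-3 reflects short-run momentum rather than seasonality.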

Differencing log GDP turns the highly persistent level series into a near-stationary growth series: autocorrelation is concentrated at lag 1 (short-run persistence), with little structure beyond the first few lags.
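Which lags count as significant can be read off numerically rather than from the plot. The sketch below extracts the autocorrelations with `acf(..., plot = FALSE)` and compares them with the approximate 95% bound that `acf` draws by default, 1.96 divided by the square root of the sample size. The series is a simulated AR(1) process with coefficient 0.4 (an assumption chosen to mimic short-run persistence, not the real data).

```r
# Simulated growth series with short-run persistence (assumption)
set.seed(4)
sim_growth <- ts(as.numeric(arima.sim(model = list(ar = 0.4), n = 240, sd = 0.01)),
                 start = c(1947, 2), frequency = 4)

# Autocorrelations at lags 1 to 20 (lag 0 is dropped)
a <- acf(sim_growth, lag.max = 20, plot = FALSE)
r <- as.numeric(a$acf)[-1]

# Approximate 95% bound under the white-noise hypothesis
bound <- qnorm(0.975) / sqrt(length(sim_growth))
significant_lags <- which(abs(r) > bound)
print(significant_lags)

# Seasonal lags for quarterly data
print(round(r[c(4, 8)], 3))
```

For the real series, replacing `sim_growth` with `diff_ln_gdp` lists the lags that fall outside the bound, including whether lags 4 and 8 are among them.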

Conclusion: The series diff_ln_gdp is the most appropriate stationary transformation for modeling.
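The conclusion rests on visual inspection, so a unit-root test is a natural complement. As a minimal sketch, the code below applies the Phillips-Perron test from base R (`PP.test`) to a simulated log series and to its first difference; this test is a stand-in chosen here, not something used in the analysis above. The `tseries` package loaded at the top offers `adf.test` and `kpss.test`, which could be applied in a similar way. The data are simulated (an assumption, not the real data).

```r
# Simulated log series: random walk with drift (assumption)
set.seed(5)
sim_ln <- cumsum(rnorm(240, mean = 0.015, sd = 0.01))

# Phillips-Perron test; the null hypothesis is a unit root (non-stationarity)
pp_level <- PP.test(sim_ln)
pp_diff <- PP.test(diff(sim_ln))

print(c(level = pp_level$p.value, differenced = pp_diff$p.value))
```

A small p-value for the differenced series, together with a large one for the level, would support working with `diff_ln_gdp`.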