ACF vs PACF in Time Series Analysis

Author

Arvind Sharma

1 Core Concepts

1.1 What is Autocorrelation?

Autocorrelation measures how strongly a time series is correlated with lagged (delayed) copies of itself. Think of it as asking: “How similar is today’s value to yesterday’s, last week’s, or last month’s?”

1.2 ACF (Autocorrelation Function)

Definition: Measures the linear relationship between a time series and its lagged values, capturing both direct and indirect relationships.

Key Characteristics:

  • Shows the total correlation at each lag

  • Includes indirect effects transmitted through intermediate lags

  • Gradually decays for stationary AR processes

  • Cuts off sharply for MA processes

Mathematical Formula: \[\rho(k) = \frac{\text{Cov}(Y_t, Y_{t-k})}{\sqrt{\text{Var}(Y_t) \times \text{Var}(Y_{t-k})}}\]
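To make the formula concrete, here is a minimal sketch (using a small simulated series; an assumption for illustration, not one of this document’s later examples) showing that the sample version of \(\rho(k)\), with autocovariances divided by \(n\) as R’s acf() does, matches the built-in value:

# Hand-computing the lag-k autocorrelation and comparing with acf()
set.seed(1)
y <- arima.sim(model = list(ar = 0.6), n = 200)
n <- length(y); k <- 1; y_bar <- mean(y)

# Sample autocovariance at lag k and at lag 0 (both divided by n, as acf() does)
gamma_k <- sum((y[(k + 1):n] - y_bar) * (y[1:(n - k)] - y_bar)) / n
gamma_0 <- sum((y - y_bar)^2) / n

gamma_k / gamma_0                 # hand-computed rho(k)
acf(y, plot = FALSE)$acf[k + 1]   # built-in value (element 1 is lag 0)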

1.3 PACF (Partial Autocorrelation Function)

Definition: Measures the direct linear relationship between a time series and its lagged values, after removing the influence of all intermediate lags.

Key Characteristics:

  • Shows only the direct correlation at each lag

  • Controls for the effects of all shorter lags

  • Cuts off sharply for AR processes after lag p

  • Gradually decays for MA processes
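As a quick numerical illustration (a sketch on a simulated AR(1) series, so the exact values will vary from run to run), the sample ACF decays geometrically across lags while the sample PACF is essentially zero beyond lag 1:

# ACF decays, PACF cuts off: illustrated on a simulated AR(1) with phi = 0.7
set.seed(42)
x <- arima.sim(model = list(ar = 0.7), n = 500)

round(acf(x, plot = FALSE, lag.max = 4)$acf[2:5], 2)   # lags 1-4: roughly 0.7, 0.5, 0.35, ...
round(pacf(x, plot = FALSE, lag.max = 4)$acf[1:4], 2)  # lags 1-4: only lag 1 is clearly non-zero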

The Intuitive Difference: ACF vs PACF

Think of autocorrelation like gossip spreading through a network:

  • ACF (Autocorrelation Function): Measures total influence - both direct gossip AND gossip that traveled through other people first
  • PACF (Partial Autocorrelation Function): Measures direct influence only - what’s the correlation if we ignore all the middlemen?

2 Identification Rules (The Golden Rules)

Warning

ACF (Autocorrelation Function) and PACF (Partial Autocorrelation Function) analysis should ideally be performed on stationary data to properly identify the lag structure (AR and MA terms) for modeling. While you can plot them on non-stationary data, the results are often misleading or meaningless.

2.1 For AR(p) Processes

  • ACF: Gradually decays (exponential or sinusoidal pattern)

  • PACF: Cuts off sharply after lag p

Note

An \(AR(2)\) model assumes the current value depends on the two previous values (lags).

The Equation:

\(y_t = c + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \varepsilon_t\)

  • \(y_t\): The value at time \(t\).

  • \(\phi_1, \phi_2\): The coefficients (parameters) for the lags.

  • \(\varepsilon_t\): White noise (error).

Typical \(p\) lags: In practice, \(p\) usually ranges from 0 to 5. For seasonal data, you might see much higher lags (e.g., \(p=12\) for monthly data), but for non-seasonal data, values higher than 2 or 3 often indicate the data needs differencing.
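As a sketch of what this equation does in code (the coefficients 0.6 and -0.3 and the choice \(c = 0\) are illustrative assumptions, matching the simulation used later in this document), the recursion can be written out directly:

# Direct implementation of y_t = phi1*y_{t-1} + phi2*y_{t-2} + e_t (c = 0 assumed)
set.seed(123)
n <- 500
phi1 <- 0.6; phi2 <- -0.3
eps <- rnorm(n)          # white-noise errors
y <- numeric(n)
for (t in 3:n) {
  y[t] <- phi1 * y[t - 1] + phi2 * y[t - 2] + eps[t]
}
# After a short burn-in this behaves like arima.sim(model = list(ar = c(0.6, -0.3)), n = n)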

2.2 For MA(q) Processes

  • ACF: Cuts off sharply after lag q

  • PACF: Gradually decays (exponential or sinusoidal pattern)

Note

An \(MA(2)\) model assumes the current value depends on the current and past “shocks” or error terms.

The Equation:

\(y_t = c + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2}\)

  • \(\theta_1, \theta_2\): The coefficients for the forecast errors.

  • \(\varepsilon_{t-1}, \varepsilon_{t-2}\): The errors from previous time steps.

Typical \(q\) terms: Similar to AR, \(q\) is typically 0 to 2. It is rare to see very high \(q\) values in non-seasonal models because the influence of random shocks usually dissipates quickly.
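A similar sketch for the MA(2) equation (the coefficients 0.7 and 0.4 and \(c = 0\) are illustrative assumptions, matching the later simulation): each value is just a weighted combination of the current and two previous shocks:

# Direct implementation of y_t = e_t + theta1*e_{t-1} + theta2*e_{t-2} (c = 0 assumed)
set.seed(123)
n <- 500
theta1 <- 0.7; theta2 <- 0.4
eps <- rnorm(n)          # the shocks
y <- numeric(n)
for (t in 3:n) {
  y[t] <- eps[t] + theta1 * eps[t - 1] + theta2 * eps[t - 2]
}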

2.3 For ARMA(p,q) Processes

  • Both ACF and PACF: Gradually decay after their respective cutoff points

Note

ARMA(2,1) Sales Model: An Analysis via COVID-19

An ARMA(2,1) sales model explains today’s sales as the result of three distinct forces: customer momentum, adjustment to recent disruptions, and new unexpected events. The COVID-19 pandemic provides a clear lens through which to view how these forces interact.

2.3.1 1. Autoregressive AR(2): Sales Momentum

The AR(2) component states that today’s sales depend on sales from the previous two periods. Even during the volatility of the pandemic, consumer behavior showed significant persistence.

  • Example: When lockdowns began, demand shifts were not just one-day events. Grocery stores saw sustained surges due to stockpiling, while travel experienced persistent declines.

  • Business Logic: If grocery sales surged for two consecutive weeks, that momentum likely continued into the third. The AR(2) component captures this “behavioral stickiness,” allowing the sustained sales levels of the COVID era to influence the current forecast.

2.3.2 2. Moving Average MA(1): Adjustment to Shocks

The MA(1) component captures how a system adjusts after an unexpected disruption. COVID-19 created massive forecasting errors because firms could not initially predict lockdown policies or supply chain failures.

  • Example: Suppose a retailer expected moderate demand but faced a massive spike due to panic buying. That prediction mistake is a “shock.”

  • Business Logic: In the following period, the business reacts to that shock by adjusting inventory or pricing. Importantly, the MA term does not average sales levels; it reflects how the model “corrects” itself based on the most recent forecasting surprise.

2.3.3 3. The Error Term: New Unpredictable Developments

Even after accounting for momentum and past adjustments, new events continued to emerge. These represent the Error Term (\(\varepsilon_t\)), or pure “White Noise.”

  • Examples: Sudden government policy shifts, the emergence of a new virus variant, or a surprise vaccine rollout announcement.

  • The “Catch-all”: These events are unpredictable. In this model, external factors like a sudden discount or a new lockdown are not modeled separately; they are swallowed by this error term as a “random shock” occurring in the current moment.


2.3.4 Economic Interpretation

Using the pandemic as a case study, the ARMA(2,1) model provides a realistic economic narrative:

  1. Persistence (AR): Customer demand patterns during the pandemic showed strong momentum.

  2. Correction (MA): Firms continuously updated their expectations based on recent disruptions.

  3. Uncertainty (Error Term): New developments created constant, unpredictable shocks.

Together, these components allow the model to track how sales evolved within a highly uncertain and rapidly changing marketplace.

3 Setup and Data Preparation

# Load required packages
library(forecast)
library(tseries)
library(ggplot2)
library(gridExtra)
library(knitr)
# library(DT)

# Set seed for reproducibility
set.seed(123)

3.1 AR Process Examples

3.1.1 AR(2) Process Simulation

Let’s simulate and analyze an AR(2) process: \(y_t = 0.6y_{t-1} - 0.3y_{t-2} + \epsilon_t\)

# Simulate AR(2) process
ar2_series <- arima.sim(model = list(ar = c(0.6, -0.3)), n = 500)

# Create comprehensive plots
par(mfrow = c(2, 2))
plot(ar2_series, main = "AR(2) Time Series", ylab = "Value", xlab = "Time")
acf(ar2_series, main = "ACF of AR(2)", lag.max = 20)
pacf(ar2_series, main = "PACF of AR(2)", lag.max = 20)

# Add theoretical ACF for comparison
ar2_theoretical_acf <- ARMAacf(ar = c(0.6, -0.3), lag.max = 20)
plot(0:20, ar2_theoretical_acf, type = "h", main = "Theoretical ACF of AR(2)", 
     xlab = "Lag", ylab = "ACF")

Observation: Notice how the PACF cuts off after lag 2, while the ACF gradually decays. This is the signature pattern of an AR(2) process.

3.1.2 AR(1) Process for Comparison

# Simulate AR(1) process
ar1_series <- arima.sim(model = list(ar = c(0.7)), n = 500)

par(mfrow = c(2, 2))
plot(ar1_series, main = "AR(1) Time Series")
acf(ar1_series, main = "ACF of AR(1)")
pacf(ar1_series, main = "PACF of AR(1)")

3.2 MA Process Examples

3.2.1 MA(2) Process Simulation

Let’s simulate and analyze an MA(2) process: \(y_t = \epsilon_t + 0.7\epsilon_{t-1} + 0.4\epsilon_{t-2}\)

# Simulate MA(2) process
ma2_series <- arima.sim(model = list(ma = c(0.7, 0.4)), n = 500)

par(mfrow = c(2, 2))
plot(ma2_series, main = "MA(2) Time Series")
acf(ma2_series, main = "ACF of MA(2)")
pacf(ma2_series, main = "PACF of MA(2)")

# Theoretical ACF for MA(2)
ma2_theoretical_acf <- ARMAacf(ma = c(0.7, 0.4), lag.max = 20)
plot(0:20, ma2_theoretical_acf, type = "h", main = "Theoretical ACF of MA(2)",
     xlab = "Lag", ylab = "ACF")

Observation: The ACF cuts off after lag 2, while the PACF gradually decays - the opposite pattern from AR processes.

3.3 ARMA Process Example

3.3.1 ARMA(1,1) Process

# Simulate ARMA(1,1) process
arma11_series <- arima.sim(model = list(ar = c(0.5), ma = c(0.3)), n = 500)

par(mfrow = c(2, 2))
plot(arma11_series, main = "ARMA(1,1) Time Series")
acf(arma11_series, main = "ACF of ARMA(1,1)")
pacf(arma11_series, main = "PACF of ARMA(1,1)")

3.4 Model Identification Function

# Function for systematic model identification
identify_model <- function(ts_data, series_name = "Time Series") {
  cat("=== Model Identification for", series_name, "===\n\n")
  
  # 1. Stationarity test
  adf_result <- adf.test(ts_data)
  cat("ADF Test p-value:", round(adf_result$p.value, 4), "\n")
  cat("Series is", ifelse(adf_result$p.value < 0.05, "stationary", "non-stationary"), "\n\n")
  
  # 2. Generate plots
  par(mfrow = c(2, 2))
  plot(ts_data, main = paste(series_name, "- Time Plot"))
  acf(ts_data, main = paste(series_name, "- ACF"))
  pacf(ts_data, main = paste(series_name, "- PACF"))
  
  # 3. Model suggestions
  cat("Model Identification Guidelines:\n")
  cat("- If PACF cuts off at lag p and ACF decays: AR(p)\n")
  cat("- If ACF cuts off at lag q and PACF decays: MA(q)\n")
  cat("- If both decay gradually: ARMA(p,q)\n\n")
}

3.4.1 Practical Example: Applying the Function

# Test the identification function on our simulated series
identify_model(ar2_series, "AR(2) Series")
=== Model Identification for AR(2) Series ===
ADF Test p-value: 0.01 
Series is stationary 
Model Identification Guidelines:
- If PACF cuts off at lag p and ACF decays: AR(p)
- If ACF cuts off at lag q and PACF decays: MA(q)
- If both decay gradually: ARMA(p,q)

4 Advanced Diagnostics

4.1 Information Criteria Comparison

Here we return to the AR(2) series simulated earlier and compare several candidate models using AIC and BIC. Since the data were generated by an AR(2) process, we expect that model to perform best.

# Compare different models for AR(2) series
models_to_test <- list(
  "AR(1)" = c(1, 0, 0),
  "AR(2)" = c(2, 0, 0),
  "AR(3)" = c(3, 0, 0),
  "MA(1)" = c(0, 0, 1),
  "MA(2)" = c(0, 0, 2),
  "ARMA(1,1)" = c(1, 0, 1),
  "ARMA(2,1)" = c(2, 0, 1)
)

# Calculate AIC and BIC for each model
results <- data.frame(
  Model = names(models_to_test),
  AIC = NA,
  BIC = NA
)

for (i in 1:length(models_to_test)) {
  fit <- arima(ar2_series, order = models_to_test[[i]])
  results$AIC[i] <- AIC(fit)
  results$BIC[i] <- BIC(fit)
}

# Display results
kable(results, digits = 2, caption = "Model Comparison for AR(2) Series")
Model Comparison for AR(2) Series

Model       AIC      BIC
AR(1)       1446.05  1458.69
AR(2)       1397.93  1414.79
AR(3)       1399.81  1420.88
MA(1)       1406.87  1419.51
MA(2)       1408.16  1425.02
ARMA(1,1)   1408.39  1425.25
ARMA(2,1)   1399.68  1420.76
# Find best models
best_aic <- results$Model[which.min(results$AIC)]
best_bic <- results$Model[which.min(results$BIC)]

cat("\nBest model by AIC:", best_aic)

Best model by AIC: AR(2)
cat("\nBest model by BIC:", best_bic)

Best model by BIC: AR(2)
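As a cross-check (optional; it uses auto.arima() from the already-loaded forecast package), automatic order selection should usually agree with the manual AIC/BIC comparison above, although the exact choice can vary with the simulated sample:

# Automatic order selection as a sanity check
auto_fit <- auto.arima(ar2_series, d = 0, seasonal = FALSE)
auto_fit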

4.2 Residual Analysis

Expect the residuals of the best-fitting model to behave like white noise.

# Fit the best model and check residuals
best_model <- arima(ar2_series, order = c(2, 0, 0))

# Residual diagnostics
par(mfrow = c(2, 2))
plot(residuals(best_model), main = "Residuals")
acf(residuals(best_model), main = "ACF of Residuals")
pacf(residuals(best_model), main = "PACF of Residuals")
qqnorm(residuals(best_model))
qqline(residuals(best_model))

# Ljung-Box test
ljung_test <- Box.test(residuals(best_model), lag = 10, type = "Ljung-Box")
cat("\nLjung-Box test p-value:", round(ljung_test$p.value, 4))

Ljung-Box test p-value: 0.9051
cat("\nResiduals are", ifelse(ljung_test$p.value > 0.05, "white noise (good!)", "not white noise"))

Residuals are white noise (good!)
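For a more compact diagnostic, checkresiduals() from the forecast package (loaded earlier) combines a residual plot, the residual ACF, a histogram, and a Ljung-Box test in a single call; it is a convenient alternative to the manual plots above:

# One-call residual diagnostics (forecast package)
checkresiduals(best_model)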

5 Summary Table: Pattern Recognition

ACF/PACF Pattern Recognition Guide

Process       ACF Pattern                  PACF Pattern                 Key Insight
AR(p)         Gradual decay                Cuts off after lag p         PACF tells AR order
MA(q)         Cuts off after lag q         Gradual decay                ACF tells MA order
ARMA(p,q)     Gradual decay after lag q    Gradual decay after lag p    Both patterns present
Random Walk   Very slow decay              Large spike at lag 1         Non-stationary signal
White Noise   No significant lags          No significant lags          No autocorrelation

6 Real-world Application Tips

6.1 1. Dealing with Non-stationarity

ADF test and differencing:

# Simulate a random walk (non-stationary)
rw_series <- cumsum(rnorm(200))

par(mfrow = c(2, 2))
plot(rw_series, main = "Random Walk (Non-stationary)", type="l")
acf(rw_series, main = "ACF of Random Walk")

# Difference to make stationary
diff_rw <- diff(rw_series)
plot(diff_rw, main = "Differenced Series (Stationary)", type="l")
acf(diff_rw, main = "ACF of Differenced Series")

cat("ADF test on random walk:", round(adf.test(rw_series)$p.value, 4))
ADF test on random walk: 0.9082
cat("\nADF test on differenced series:", round(adf.test(diff_rw)$p.value, 4))

ADF test on differenced series: 0.01
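A quick shortcut worth knowing is ndiffs() from the forecast package (loaded earlier), which estimates how many differences are needed to reach stationarity; for this example it should typically suggest one difference for the random walk and none for the differenced series:

# Estimated number of differences required for stationarity
ndiffs(rw_series)   # typically 1 for the random walk
ndiffs(diff_rw)     # typically 0 for the differenced series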

6.2 2. Seasonal Patterns

Seasonal patterns can complicate ACF/PACF interpretation:

# Create series with seasonal pattern
t <- 1:100
seasonal_series <- 2 * sin(2 * pi * t / 12) + rnorm(100, 0, 0.5)

par(mfrow = c(2, 2))
plot(seasonal_series, main = "Seasonal Time Series", type="l")
acf(seasonal_series, lag.max = 36, main = "ACF (Extended for Seasonality)")
pacf(seasonal_series, lag.max = 36, main = "PACF (Extended for Seasonality)")
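One common next step (a sketch; it assumes we treat the simulated series as monthly data with frequency 12) is to apply a seasonal difference at the seasonal lag and re-examine the ACF:

# Seasonal differencing at lag 12 and re-checking the ACF
seasonal_ts <- ts(seasonal_series, frequency = 12)
acf(diff(seasonal_ts, lag = 12), lag.max = 36,
    main = "ACF after Seasonal Differencing (lag 12)")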

7 Key Takeaways

  1. ACF captures total correlation (direct + indirect effects)
  2. PACF isolates direct correlation (controlling for intermediate lags)
  3. Use both together for comprehensive model identification
  4. Always check stationarity before interpreting patterns
  5. Validate with residual analysis after model fitting
  6. Consider information criteria for model selection
  7. Look for seasonal patterns in extended ACF/PACF plots

7.1 Exercise for Practice

Try identifying the following simulated series:

# Mystery series - can you identify the process?
mystery_series <- arima.sim(model = list(ar = c(0.8), ma = c(-0.4)), n = 300)

par(mfrow = c(1, 3))
plot(mystery_series, main = "Mystery Series - What process is this?")
acf(mystery_series, main = "ACF of Mystery Series")
pacf(mystery_series, main = "PACF of Mystery Series")

cat("Hint: Look at the patterns in both ACF and PACF!")
Hint: Look at the patterns in both ACF and PACF!
cat("\nAnswer: This is an ARMA(1,1) process with φ=0.8 and θ=-0.4")

Answer: This is an ARMA(1,1) process with φ=0.8 and θ=-0.4

8 Appendix

8.1 Understanding the Spike at Lag 0

The spike at lag 0 is ALWAYS 1.0 - this is not random! Here’s why:

Lag 0 = Correlation with itself

  • At lag 0, we’re asking: “How correlated is today’s value with… today’s value?”

  • The answer is always perfectly correlated = 1.0

Why is this important?

  • It’s your reference point - all other correlations are relative to this

  • If lag 0 weren’t 1.0, something would be seriously wrong with your data!

  • It helps you gauge the magnitude of other correlations
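You can confirm this directly in R: the first value that acf() stores is the lag-0 correlation, and it is exactly 1.

# The first stored ACF value is the lag-0 autocorrelation
acf(ar2_series, plot = FALSE)$acf[1]   # always exactly 1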

8.2 Intuitive Explanation: Why PACF “Regresses Out” Intermediate Lags

To identify the order of an Autoregressive (AR) process, we must distinguish between cumulative correlation and direct impact. This note uses two analogies to explain why the Autocorrelation Function (ACF) lingers while the Partial Autocorrelation Function (PACF) “shuts off.”

Case Study: Imagine a sequence of events where each day’s misery is a direct result of the previous day’s misfortune. (In this story, Day 1 is today, the most recent observation, and higher day numbers lie further in the past.)

The Series (\(X_t\))

  • \(X_t\) (Day 1): No GPA, no Job, “Totally Screwed.”
  • \(X_{t-1}\) (Day 2): You missed a major exam.
  • \(X_{t-2}\) (Day 3): You received a massive hospital bill.
  • \(X_{t-3}\) (Day 4): You were in a traffic accident.
  • \(X_{t-4}\) (Day 5): You “Felt Low” (the initial shock).

8.2.1 The ACF: The “Snowball” Effect / Domino Chain

The ACF at lag \(k\) measures the total correlation between \(X_t\) and \(X_{t-k}\).

  • Logic: It tracks the “memory” of the series. Even if “Felt Low” (\(X_{t-4}\)) isn’t the direct reason you are unemployed today, it initiated the chain reaction (Accident \(\rightarrow\) Bill \(\rightarrow\) Exam \(\rightarrow\) Screwed).
  • Result: The ACF at lag 4 will be high. It captures the cumulative linear dependence moving through the entire domino chain.
  • “Your bad mood 4 days ago still correlates with today because the whole disaster chain propagated forward.”

Day 5        Day 4        Day 3        Day 2        Day 1
(Felt Low) → (Accident) → (Bill) → (Exam) → (Screwed)
     \____________________________________________/
                  TOTAL ASSOCIATION

8.2.2 The PACF: The “Direct Laser”

The PACF at lag \(k\) measures the correlation between \(X_t\) and \(X_{t-k}\) after removing the influence of all intervening lags (\(1, 2, \dots, k-1\)).

  • Logic: It asks: “Given that I already know you missed your exam, have a hospital bill, and had an accident, does knowing you ‘Felt Low’ four days ago (Day 5 in this story) provide any NEW information about why you are screwed today?”
    • “After accounting for accident, bill, and exam…, did your Day-5 mood still independently wreck your Day-1 life?”

Day 5        Day 4        Day 3        Day 2        Day 1
(Felt Low) → [BLOCKED] → [BLOCKED] → [BLOCKED] → (Screwed)
     \____________________________________________/
                 DIRECT EFFECT ONLY

\[\text{Corr}(X_t, X_{t-4} \mid X_{t-1}, X_{t-2}, X_{t-3})\]

  • Mathematical Intuition: In an \(AR(1)\) process: \[X_t = \phi X_{t-1} + \epsilon_t\] The PACF at lag 2, 3, and 4 will be zero because \(X_{t-1}\) already contains all the information from the past that is relevant to the present.
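This “regressing out” idea can be checked numerically. The sketch below (illustrative only; it simulates an AR(1) series) computes the lag-2 partial autocorrelation by regressing both \(X_t\) and \(X_{t-2}\) on the intermediate lag \(X_{t-1}\) and correlating the residuals, which comes out close to the value pacf() reports, and close to zero:

# PACF at lag 2 via regression: remove the lag-1 influence, then correlate what's left
set.seed(7)
x <- as.numeric(arima.sim(model = list(ar = 0.7), n = 500))
n <- length(x)
x_t  <- x[3:n]        # current values
x_l1 <- x[2:(n - 1)]  # lag 1 (the intermediate lag)
x_l2 <- x[1:(n - 2)]  # lag 2

r_t  <- residuals(lm(x_t  ~ x_l1))   # part of x_t not explained by lag 1
r_l2 <- residuals(lm(x_l2 ~ x_l1))   # part of lag 2 not explained by lag 1
cor(r_t, r_l2)                             # "direct" lag-2 correlation, near 0 for an AR(1)
pacf(x, plot = FALSE, lag.max = 2)$acf[2]  # built-in PACF at lag 2 (should be similar)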

8.2.3 The Academic Alternative: The Telephone Game

The Telephone Game is the standard formal alternative for identifying AR processes.

  • The Series: A message passed from Person \(A \rightarrow B \rightarrow C \rightarrow D\).

  • ACF (Lag 3): Measures how much Person D’s message resembles Person A’s. Correlation is high because the words evolved step-by-step.

  • PACF (Lag 3): Measures how much Person D “copied” Person A directly, after accounting for what Persons B and C said.

  • Interpretation: In a perfect \(AR(1)\), Person D only hears Person C. The PACF for lags 2 and 3 is zero because D has no direct contact with A or B.

8.2.4 Summary for Model Identification

Tip

  • ACF = “Does the past echo?”; PACF = “Which past day actually caused today?”

  • ACF tells you how memory flows; PACF tells you where memory originates

  • ACF measures persistence; PACF identifies structure

  1. AR(p) Identification: The PACF is the “hero.” It will show significant spikes only for the actual number of direct steps (\(p\)) and then “shut off” to zero.
  2. MA(q) Identification: The ACF is the “hero.” It will show significant spikes only for the number of “shocks” (\(q\)) currently affecting the system, then cut off.

The Golden Rule of Identification: If the PACF hits a wall, it’s an AR model (count the spikes for \(p\)). If the ACF hits a wall, it’s an MA model (count the spikes for \(q\)). If both linger and decay, it’s time to start testing ARMA combinations.