Introduction

This report analyzes time series data on KP_PREV clients overall, by typology, and by county. The dataset contains the yearly count of clients for each typology and county. The primary objective is to extract and plot the trend component of each series using the Hodrick-Prescott (HP) filter, and to assess whether that trend is statistically significant.

Methods

Data Description

The dataset comprises five typologies: FSW, MSM, PWID, TG, and People_in_Prison, each with semiannual (twice-yearly) client counts over 9 years across 47 counties. The People_in_Prison typology was omitted, and the semiannual counts were aggregated into annual counts because the counts were mutually exclusive in reporting. Only PEPFAR-supported counties were included, as they reported consistently by typology over the years. FY 2024 was excluded because its data had not been fully validated at the time of this study.
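The data preparation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the county name, the period labels, and every count are hypothetical, and the sketch assumes that aggregation means summing the two semiannual counts within a fiscal year.

```python
from collections import defaultdict

# Hypothetical records: (county, typology, fiscal_year, period, clients).
# Every value here is made up for illustration; none comes from the dataset.
records = [
    ("County A", "FSW", 2022, "semiannual_1", 120),
    ("County A", "FSW", 2022, "semiannual_2", 80),
    ("County A", "MSM", 2022, "semiannual_1", 40),
    ("County A", "People_in_Prison", 2022, "semiannual_1", 15),
    ("County A", "FSW", 2024, "semiannual_1", 60),
]

EXCLUDED_TYPOLOGIES = {"People_in_Prison"}  # omitted typology
EXCLUDED_YEARS = {2024}                     # FY 2024 not yet validated


def aggregate_annual(rows):
    """Sum semiannual counts into one annual count per
    (county, typology, fiscal_year), skipping excluded rows.
    Summation is an assumption about how the counts were combined."""
    totals = defaultdict(int)
    for county, typology, year, _period, clients in rows:
        if typology in EXCLUDED_TYPOLOGIES or year in EXCLUDED_YEARS:
            continue
        totals[(county, typology, year)] += clients
    return dict(totals)


annual = aggregate_annual(records)
```

Keying the totals by county, typology, and year keeps the same structure usable for the overall, by-typology, and county-wise views.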

Analysis Approach

Exploratory data analysis involved plotting counts by year to check the data for accuracy, consistency, errors, and missing values. Identified outliers and anomalies were addressed. Seasonal patterns could not be assessed because the data were aggregated annually: annual totals summarize the whole year and lack the granularity needed to detect within-year variation.

The Hodrick-Prescott filter (HP filter) was applied to the time series data using the hpfilter function from the mFilter package to extract the trend component. This filter decomposes the time series into a trend component and a cyclical component, isolating long-term trends from short-term fluctuations or cycles within the data.

The HP filter chooses the trend values \(\tau_{t}\) that minimize the following objective function:

\[ \min \sum_{t=1}^{T} (y_{t} - \tau_{t})^{2} + \lambda \sum_{t=2}^{T-1} ((\tau_{t+1} - \tau_{t}) - (\tau_{t} - \tau_{t-1}))^{2} \]

Where:

\({T}\) is the total number of observations.

\(y_{t}\) is the observed value of the time series at time \(t\).

\(\tau_{t}\) is the trend component at time \(t\).

\(\lambda\) is the smoothing parameter.

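The first term penalizes deviations of the trend from the observed data, and the second term penalizes changes in the slope of the trend (its second differences). The smoothing parameter \(\lambda\) controls the trade-off between the two: with \(\lambda = 0\) the trend reproduces the data exactly, and as \(\lambda\) grows the trend approaches a straight line.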
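For reference, this minimization has a closed-form solution. The matrix notation below is not part of the original text and is introduced only for illustration: \(y\) and \(\tau\) are the \(T \times 1\) vectors of observations and trend values, \(I\) is the \(T \times T\) identity matrix, and \(D\) is the \((T-2) \times T\) second-difference matrix whose rows contain \((1, -2, 1)\), shifted one column per row. In this notation the objective is

\[ \min_{\tau} \; (y - \tau)^{\top}(y - \tau) + \lambda \, \tau^{\top} D^{\top} D \, \tau \]

Setting the gradient with respect to \(\tau\) to zero gives

\[ -2(y - \tau) + 2 \lambda D^{\top} D \tau = 0 \quad \Longrightarrow \quad \hat{\tau} = (I + \lambda D^{\top} D)^{-1} y \]

so the trend is a linear function of the data, obtained by solving a single \(T \times T\) linear system.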

The HP filter produced a trend component \(\tau_{t}\) and a cyclical component \((y_{t} - \tau_{t})\). The trend component represented the long-term behavior of the time series, while the cyclical component captured short-term fluctuations around the trend.
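The decomposition can be illustrated with a minimal, standard-library Python sketch. It is not the mFilter implementation used for this report. It assumes the standard result that the HP objective is minimized by solving the linear system \((I + \lambda D^{\top} D)\tau = y\), where \(D\) is the second-difference matrix. The function name, the example counts, and the value of \(\lambda\) are all hypothetical; 6.25 is a value often suggested for annual data, and the report does not state the value actually used.

```python
def hp_filter(y, lam):
    """Return (trend, cycle) for series y with smoothing parameter lam.

    Solves (I + lam * D'D) * trend = y, where D is the (n-2) x n
    second-difference matrix with rows (1, -2, 1).
    """
    n = len(y)
    if n < 3:
        raise ValueError("the HP filter needs at least 3 observations")

    # Start from the identity matrix, then add lam * D'D row by row of D.
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for r in range(n - 2):
        coeffs = ((r, 1.0), (r + 1, -2.0), (r + 2, 1.0))
        for i, ci in coeffs:
            for j, cj in coeffs:
                a[i][j] += lam * ci * cj

    # Gaussian elimination with partial pivoting.
    b = [float(v) for v in y]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            if f != 0.0:
                for c in range(col, n):
                    a[r][c] -= f * a[col][c]
                b[r] -= f * b[col]

    # Back substitution.
    trend = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * trend[j] for j in range(i + 1, n))
        trend[i] = (b[i] - s) / a[i][i]

    cycle = [yi - ti for yi, ti in zip(y, trend)]
    return trend, cycle


# Hypothetical annual counts, for illustration only.
counts = [10, 12, 9, 15, 20, 18, 25, 30]
trend, cycle = hp_filter(counts, lam=6.25)
```

Two properties make handy sanity checks: the cyclical component sums to zero, and a perfectly linear series is returned unchanged as its own trend.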

The Mann-Kendall test, a non-parametric trend test, was used because the data were not assumed to be normally distributed. The statistical significance of the trend was assessed from the resulting p-values.

We plotted the trend component and annotated each plot with the p-value from the Mann-Kendall (Kendall's tau) test, along with a label indicating whether the trend was significant. A trend was considered statistically significant when the p-value was below the 0.05 significance level, meaning that a trend of the observed strength would be unlikely to arise from random chance alone.

Analysis

fig 1: Description of plot 1

fig 2: Description of plot 2

fig 3: Description of plot 3

fig 4: Description of plot 4

fig 5: Description of plot 5

fig 6: Description of plot 6

fig 7: Description of plot 7

fig 8: Description of plot 8

fig 9: Description of plot 9

fig 10: Description of plot 10

fig 11: Description of plot 11

fig 12: Description of plot 12

fig 13: Description of plot 13

fig 14: Description of plot 14
