Data 624 HW 1

1. Exploring Time Series Data

Use the help function to explore what the series gold, woolyrnq, and gas represent.

Load Required Libraries

library(forecast) # LOADING THE LIBRARY (PROVIDES `autoplot` AND THE `gold`, `woolyrnq`, AND `gas` SERIES)

## Warning: package 'forecast' was built under R version 4.4.2

## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo

autoplot(gold) # PLOTTING THE TIME SERIES DATA ON `gold` WITH FUNCTION `autoplot`

autoplot(woolyrnq) # PLOTTING THE TIME SERIES DATA ON `woolyrnq` WITH FUNCTION `autoplot`

autoplot(gas) # PLOTTING THE TIME SERIES DATA ON `gas` WITH FUNCTION `autoplot`

# OBTAINING THE FREQUENCY OF THE DATA SERIES
frequency_gold <- frequency(gold)
frequency_woolyrnq <- frequency(woolyrnq)
frequency_gas <- frequency(gas)
print(frequency_gold)

## [1] 1

print(frequency_woolyrnq)

## [1] 4

print(frequency_gas)

## [1] 12

# OBTAINING THE INDEX (POSITION) OF THE OBSERVATION WHICH HAS THE MAXIMUM VALUE IN THE SERIES
max_gold_index <- which.max(gold)
print(max_gold_index)

## [1] 770

CONCLUSION:

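Here, frequency() returns the number of observations per seasonal period stored in each ts object. A frequency of 4 represents quarterly data and 12 represents monthly data, so the woolyrnq series is quarterly and the gas series is monthly. A frequency of 1 means only that no seasonal period is set; it does not by itself imply yearly data. The help pages (?gold, ?woolyrnq, ?gas) describe gold as daily morning gold prices, woolyrnq as quarterly production of woollen yarn in Australia, and gas as Australian monthly gas production, so gold is a daily series stored with frequency 1. Also, the which.max() function returns the position of the observation with the maximum value: observation 770 is the largest value in the gold series.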
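As a minimal sketch of what frequency() and which.max() report, the example below uses series that ship with base R instead of the forecast data: lynx, UKgas, and AirPassengers stand in for annual, quarterly, and monthly series, and daily_values is a made-up vector introduced only for illustration.

```r
# Stand-in series from the base 'datasets' package (not the homework data)
frequency(lynx)           # annual series: 1 observation per period
frequency(UKgas)          # quarterly series: 4
frequency(AirPassengers)  # monthly series: 12

# A hypothetical daily series stored without a seasonal period also has
# frequency 1, so the value 1 alone does not identify yearly data.
set.seed(1)
daily_values <- ts(cumsum(rnorm(100)))
frequency(daily_values)

# which.max() gives the position of the largest value; time() converts that
# position to the corresponding point on the series' time axis.
peak_index <- which.max(AirPassengers)
peak_time  <- time(AirPassengers)[peak_index]
peak_value <- AirPassengers[peak_index]
c(index = peak_index, time = peak_time, value = peak_value)
```

The same pattern, time(gold)[which.max(gold)], would translate index 770 into a position on the gold series' own time axis.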

2. Exploring Retail Data

Load the file tute1.csv from the given path and review its contents.

tute1 <- read.csv("C:/Users/Admin/Downloads/tute1.csv", header=TRUE)
View(tute1)

Convert the data to a time series.

mytimeseries <- ts(tute1[,-1], start=1981, frequency=4) # Removing the first column (quarters)
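Because tute1.csv is not reproduced here, the following sketch builds a small made-up data frame with the same layout (a label column followed by three numeric columns) to show what the ts() call does. The names example_df and example_ts and all values are assumptions for illustration only.

```r
# Hypothetical stand-in for tute1: a label column plus three numeric columns
set.seed(42)
quarters <- paste0(rep(1981:1983, each = 4), "-Q", rep(1:4, times = 3))
example_df <- data.frame(
  Quarter  = quarters,
  Sales    = round(runif(12, 900, 1100), 1),
  AdBudget = round(runif(12, 450, 650), 1),
  GDP      = round(runif(12, 260, 300), 1)
)

# Drop the label column: ts() needs numeric data, and the time index is
# rebuilt from 'start' and 'frequency' instead of being read from the file.
example_ts <- ts(example_df[, -1], start = c(1981, 1), frequency = 4)

colnames(example_ts)   # the three remaining series
start(example_ts)      # first period: year 1981, quarter 1
end(example_ts)        # last period: year 1983, quarter 4
time(example_ts)       # time axis in fractional years
```

Writing start = 1981 is equivalent to start = c(1981, 1); the two-element form makes the starting quarter explicit.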

Construct time series plots of each of the three series.

autoplot(mytimeseries, facets=TRUE) # Plot with facets

Observations:
- The three series (Sales, AdBudget, and GDP) exhibit clear seasonal patterns, with periodic peaks and troughs over time.
- Sales and AdBudget show a positive trend, indicating growth, while GDP has more irregular movements.
- The fluctuations in Sales and AdBudget are more pronounced compared to GDP, suggesting higher volatility.
- Using facets=TRUE allows for a clearer visualization of each series independently.

Check what happens when you don’t include facets=TRUE.

autoplot(mytimeseries) # Without facets

Observations:
- Without facets=TRUE, all three series (Sales, AdBudget, and GDP) are plotted on the same graph.
- This makes it harder to distinguish between them, especially since they have different scales.
- The overlapping patterns might obscure details, particularly for the GDP series, which has lower values compared to Sales and AdBudget.
- Using facets is beneficial when analyzing multiple time series with different magnitudes to prevent overlap and improve readability.
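The effect of a shared axis can be sketched with base R graphics, where plot.type = "single" and plot.type = "multiple" play roughly the role of plotting without and with facets. The two series below are made up for illustration, and the names large_scale, small_scale, and both are hypothetical.

```r
# Two hypothetical quarterly series on very different scales
set.seed(7)
large_scale <- 1000 + cumsum(rnorm(40, sd = 20))
small_scale <- 5 + rnorm(40, sd = 0.5)
both <- ts(cbind(large_scale, small_scale), start = c(1981, 1), frequency = 4)

# One shared axis: the y-range is set by the large series, so the small
# series is squeezed into a thin band near the bottom of the plot.
plot(both, plot.type = "single", col = c("black", "red"))

# Separate panels, each with its own y-axis.
plot(both, plot.type = "multiple")

# Share of the common y-range that the small series actually occupies
shared_range <- diff(range(both))
small_share  <- diff(range(both[, "small_scale"])) / shared_range
small_share
```

A small value of small_share means the variation in that series is nearly invisible on the shared axis, which is the situation described for GDP above.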

3. Select a Time Series

Load the file retail.xlsx from the given path and review its contents.

retaildata <- readxl::read_excel("C:/Users/Admin/Downloads/retail.xlsx", skip=1)

Select one of the time series and convert it into a time series object in R.

myts <- ts(retaildata[,"A3349873A"], frequency=12, start=c(1982,4))

Explore your chosen retail time series using various functions.

autoplot(myts)

ggseasonplot(myts)

ggsubseriesplot(myts)

gglagplot(myts)

ggAcf(myts)

Observations:
- The time series shows a strong upward trend, indicating consistent growth in retail sales over time.
- The ggseasonplot() highlights clear seasonality, with peaks in November and December, likely due to holiday sales.
- The ggsubseriesplot() confirms that sales increase towards the end of each year, reinforcing the seasonal pattern.
- The gglagplot() reveals strong autocorrelation, with high dependencies in values at 12-month intervals, confirming yearly seasonality.
- The ggAcf() plot shows high autocorrelation at lags of 12 months, indicating a strong seasonal component in the data.
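What a lag plot and an ACF plot measure can be sketched in base R. The example uses nottem (monthly air temperatures at Nottingham, shipped with R) as a stand-in for a strongly seasonal monthly series; it is not the retail series, and the helper lag_correlation is a hypothetical name introduced here.

```r
# Stand-in monthly series from base R: average air temperatures at Nottingham
x <- nottem

# Correlation between the series and a copy of itself shifted by k months,
# which is the relationship each panel of a lag plot shows as a scatter.
lag_correlation <- function(series, k) {
  n <- length(series)
  cor(series[(k + 1):n], series[1:(n - k)])
}

lag_correlation(x, 12)  # same month, one year apart
lag_correlation(x, 6)   # opposite seasons

# The sample ACF summarises the same idea for many lags at once
acf_values <- acf(x, lag.max = 24, plot = FALSE)$acf[, 1, 1]
names(acf_values) <- 0:24
round(acf_values[c("6", "12", "18", "24")], 2)
```

For a seasonal monthly series, the values at lags 12 and 24 stand out as positive, which is the pattern the observations above attribute to yearly seasonality.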

4. Creating Time Plots

Create time plots of the following time series: bicoal, chicken, dole, usdeaths, lynx, goog, writing, fancy, a10, h02.

# Load required libraries
# fpp2 (assumed to be installed) attaches forecast and ggplot2 and, through
# its dependencies fma and expsmooth, provides the series plotted below.
# lynx ships with base R.
library(fpp2)

# List of datasets
data_list <- c("bicoal", "chicken", "dole", "usdeaths", "lynx", "goog", "writing", "fancy", "a10", "h02")

# Record any series that cannot be found on the search path
missing_data <- data_list[!vapply(data_list, exists, logical(1))]

# Display warning if any dataset is missing
if (length(missing_data) > 0) {
  warning(paste("The following datasets are missing and will not be plotted:", paste(missing_data, collapse=", ")))
}

# Function to safely plot a dataset
safe_autoplot <- function(dataset_name) {
  if (exists(dataset_name)) {
    print(autoplot(get(dataset_name)))
  } else {
    warning(paste("Skipping:", dataset_name, "- Dataset not found."))
  }
}

# Generate plots safely
safe_autoplot("bicoal")

safe_autoplot("chicken")

safe_autoplot("dole")

safe_autoplot("usdeaths")

safe_autoplot("lynx")

# Modify Goog plot separately
if (exists("goog")) {
  goog_plot <- autoplot(goog) + ggtitle("Google Stock Prices") + xlab("Year") + ylab("Price")
  print(goog_plot)
} else {
  warning("Skipping: goog - Dataset not found.")
}

safe_autoplot("writing")

safe_autoplot("fancy")

safe_autoplot("a10")

safe_autoplot("h02")
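When data sets are loaded by name from a character vector, it helps to know that data() does not evaluate an unquoted argument: data(dataset) looks for a data set literally called "dataset". It also signals a warning rather than an error when nothing is found, so an error handler in tryCatch() is never triggered. The sketch below illustrates this with lynx from the base datasets package; env and dataset are illustrative names.

```r
# Target environment so the example does not touch the global workspace
env <- new.env()
dataset <- "lynx"   # name held in a variable, as in a loop over names

# Unquoted use: data() looks for a data set literally called "dataset"
# and warns that it was not found; no error is raised.
result <- tryCatch(
  data(dataset, package = "datasets", envir = env),
  warning = function(w) conditionMessage(w)
)
result

# Passing the name through 'list =' uses the value of the variable
data(list = dataset, package = "datasets", envir = env)
exists("lynx", envir = env, inherits = FALSE)
```

Most of the series in this section are not part of the datasets package at all; attaching the package that provides them (for example with library(fpp2), if it is installed) makes them available without any call to data().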

5. Exploring Seasonal Patterns

Use the ggseasonplot() and ggsubseriesplot() functions to explore the seasonal patterns in the following time series: writing, fancy, a10, h02.

# Load necessary libraries
# fpp2 (assumed to be installed) attaches forecast and ggplot2 and makes
# writing, fancy, a10, and h02 available.
library(fpp2)

# List of datasets
data_list <- c("writing", "fancy", "a10", "h02")

# Record any series that cannot be found on the search path
missing_data <- data_list[!vapply(data_list, exists, logical(1))]

# Display warning if any dataset is missing
if (length(missing_data) > 0) {
  warning(paste("The following datasets are missing and will not be plotted:", paste(missing_data, collapse=", ")))
}

# Function to safely plot seasonal data
safe_ggseasonplot <- function(dataset_name) {
  if (exists(dataset_name)) {
    print(ggseasonplot(get(dataset_name)))
  } else {
    warning(paste("Skipping:", dataset_name, "- Dataset not found."))
  }
}

safe_ggseasonplot("writing")

safe_ggseasonplot("fancy")

safe_ggseasonplot("a10")

safe_ggseasonplot("h02")

# Function to safely plot subseries data
safe_ggsubseriesplot <- function(dataset_name) {
  if (exists(dataset_name)) {
    print(ggsubseriesplot(get(dataset_name)))
  } else {
    warning(paste("Skipping:", dataset_name, "- Dataset not found."))
  }
}

safe_ggsubseriesplot("writing")

safe_ggsubseriesplot("fancy")

safe_ggsubseriesplot("a10")

safe_ggsubseriesplot("h02")

Observations:
- The writing series shows a sharp drop in August, indicating a seasonal decline during that month. Peaks are observed before and after this drop.
- The fancy series demonstrates significant increases towards the end of the year, particularly in November and December, suggesting strong holiday-driven seasonality.
- The a10 series rises steadily over the years, and its seasonal swings recur at the same points of each year, most visibly as a jump in January.
- The h02 series also grows over the years. Sales dip sharply early in the year and climb towards the end of the year, so the series combines an upward trend with strong seasonality.
- Some unusual years can be observed in the writing and fancy datasets where deviations from typical seasonal patterns occur, possibly due to external events or data collection anomalies.
- By using ggsubseriesplot(), we can further confirm distinct seasonal patterns and identify months where deviations occur more prominently.
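The quantity a subseries plot summarises, the average level of each calendar month, can be computed directly in base R. The sketch below uses AirPassengers as a stand-in monthly series (it is not one of the four series above), and monthly_mean is a name introduced here for illustration.

```r
# Stand-in monthly series from base R (not one of the four homework series)
x <- AirPassengers

# cycle() gives the position of each observation within the year (1 = Jan)
monthly_mean <- setNames(as.numeric(tapply(x, cycle(x), mean)), month.abb)
round(monthly_mean, 1)

# Months with the highest and lowest average level
names(which.max(monthly_mean))
names(which.min(monthly_mean))

# Base R counterpart of a subseries plot: one mini time series per month,
# with a horizontal line at that month's mean
monthplot(x, ylab = "Passengers (thousands)")
```

Comparing the monthly means identifies the seasonal peak and trough, while the mini series within each month show whether that month's level changes over the years.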

8. Matching Time Plots with ACF Plots

Question: The following time plots and ACF plots correspond to four different time series. Your task is to match each time plot in the first row with one of the ACF plots in the second row.

knitr::include_graphics("C:/Users/Admin/Desktop/question 2.8.png")

# Load necessary libraries
library(ggplot2)
library(forecast)

# Explanation:
# Matching the time series plots with the corresponding ACF plots

# 1. Daily temperature of cow -> B (ACF shows weak autocorrelation, typical of daily data)
# 2. Monthly accidental deaths -> A (ACF shows periodic waves and a moderate trend)
# 3. Monthly air passengers -> D (ACF shows a strong seasonal pattern with high persistence)
# 4. Annual mink trappings -> C (ACF exhibits cyclical patterns seen in ecological data)

# Display solution
solution <- data.frame(
  "Time Series" = c("Daily temperature of cow", "Monthly accidental deaths", 
                    "Monthly air passengers", "Annual mink trappings"),
  "Matched ACF Plot" = c("B", "A", "D", "C")
)

print(solution)

##                 Time.Series Matched.ACF.Plot
## 1  Daily temperature of cow                B
## 2 Monthly accidental deaths                A
## 3    Monthly air passengers                D
## 4     Annual mink trappings                C
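The ACF shapes used in this matching can be checked numerically on series that ship with base R. This is only a sketch under assumptions: AirPassengers and USAccDeaths are used as stand-ins for the monthly air passenger and accidental death series, lynx is a stand-in for a cyclic annual series (it is not the mink data), and acf_at is a hypothetical helper.

```r
# Return the sample autocorrelations of 'series' at the requested lags
acf_at <- function(series, lags) {
  values <- acf(series, lag.max = max(lags), plot = FALSE)$acf[, 1, 1]
  setNames(round(values[lags + 1], 2), paste0("lag", lags))
}

# Trend plus seasonality: autocorrelations stay positive and decay slowly,
# with local bumps at multiples of 12
acf_at(AirPassengers, c(1, 6, 12, 18, 24))

# Seasonality without much trend: the ACF swings negative around lag 6
# and back to positive around lag 12
acf_at(USAccDeaths, c(1, 6, 12, 18, 24))

# Multi-year cycle in annual data: a wave whose period is the cycle
# length in years rather than a calendar period
acf_at(lynx, 1:12)
```

Reading the sign and size of the values at lags 6 and 12 is often enough to tell a trending seasonal series from a purely seasonal one, and a wave with a non-calendar period points to a cyclic series.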

Data 624 HW 1

Shamecca Marshall

2/1/25
