HW1_redone

Please submit exercises 2.1, 2.2, 2.3, 2.4, 2.5 and 2.8 from the Hyndman online Forecasting book. Please submit both your Rpubs link as well as attach the .pdf file with your code.

Exercises 2.1

Explore the following four time series: Bricks from aus_production, Lynx from pelt, Close from gafa_stock, Demand from vic_elec.

Use ? (or help()) to find out about the data in each series.
What is the time interval of each series?
Use autoplot() to produce a time plot of each series.0
For the last plot, modify the axis labels and title.

Step 1: Load Required Libraries

Before working with the time series datasets, ensure you have the necessary packages installed and loaded.

# Install packages if not already installed
# install.packages("fpp3")  # For time series data and visualization

# Load required libraries
library(fpp3)

## Warning: package 'fpp3' was built under R version 4.4.2

## Registered S3 method overwritten by 'tsibble':
##   method               from 
##   as_tibble.grouped_df dplyr

## ── Attaching packages ──────────────────────────────────────────── fpp3 1.0.1 ──

## ✔ tibble      3.2.1     ✔ tsibble     1.1.6
## ✔ dplyr       1.1.4     ✔ tsibbledata 0.4.1
## ✔ tidyr       1.3.1     ✔ feasts      0.4.1
## ✔ lubridate   1.9.3     ✔ fable       0.4.1
## ✔ ggplot2     3.5.1

## Warning: package 'tsibble' was built under R version 4.4.2

## Warning: package 'tsibbledata' was built under R version 4.4.2

## Warning: package 'feasts' was built under R version 4.4.2

## Warning: package 'fabletools' was built under R version 4.4.2

## Warning: package 'fable' was built under R version 4.4.2

## ── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
## ✖ lubridate::date()    masks base::date()
## ✖ dplyr::filter()      masks stats::filter()
## ✖ tsibble::intersect() masks base::intersect()
## ✖ tsibble::interval()  masks lubridate::interval()
## ✖ dplyr::lag()         masks stats::lag()
## ✖ tsibble::setdiff()   masks base::setdiff()
## ✖ tsibble::union()     masks base::union()

Step 2: Exploring the Time Series Data

We will use ? or help() to find information about each dataset.

# Check details about each dataset
?aus_production   # Information about "Bricks" series

## starting httpd help server ... done

?pelt             # Information about "Lynx" series
?gafa_stock       # Information about "Close" series
?vic_elec         # Information about "Demand" series

Interpretation of Time Intervals for Each Series: *

Bricks (aus_production)

Time Interval: Quarterly
Represents the number of clay bricks produced in Australia (in millions).

Lynx (pelt)

Time Interval: Yearly
Represents the number of lynx pelts traded in Canada (from the Hudson’s Bay Company).

Close (gafa_stock)

Time Interval: Daily
Represents the closing stock prices of major technology companies (Google, Amazon, Facebook, Apple).

Demand (vic_elec)

Time Interval: Half-hourly
Represents electricity demand in Victoria, Australia.

Step 3: Plotting Each Time Series with autoplot()

Now, let’s visualize each time series.

3.1 Plotting Bricks Production in Australia

# Plot Bricks Production
autoplot(aus_production, Bricks) +
  ggtitle("Australian Brick Production") +
  ylab("Millions of Units") +
  xlab("Year")

## Warning: Removed 20 rows containing missing values or values outside the scale range
## (`geom_line()`).

Analysis: * The trend shows variations in brick production over time. * Some seasonal patterns may be visible, especially cyclic fluctuations.

3.2 Plotting Lynx Pelt Trading Data

# Plot Lynx Pelt Data
autoplot(pelt, Lynx) +
  ggtitle("Lynx Pelt Trading in Canada") +
  ylab("Number of Pelts") +
  xlab("Year")

Analysis: * The data exhibits a clear cyclic pattern, likely corresponding to the predator-prey cycle. * There are periodic spikes in lynx populations, followed by declines.

3.3 Plotting GAFA Stock Prices

# Plot Closing Stock Prices
autoplot(gafa_stock, Close) +
  ggtitle("Closing Stock Prices of GAFA Companies") +
  ylab("Stock Price (USD)") +
  xlab("Date")

Analysis: * The stock prices generally show an increasing trend over time. * There might be periods of volatility, reflecting market fluctuations.

3.4 Plotting Electricity Demand in Victoria with Custom Labels

# Customizing plot for electricity demand
autoplot(vic_elec, Demand) +
  ggtitle("Electricity Demand in Victoria (Australia)") +
  ylab("Electricity Demand (MW)") +
  xlab("Year") +
  theme_minimal()

Analysis: * Electricity demand fluctuates significantly, possibly reflecting daily and seasonal consumption patterns. * Spikes may correspond to extreme weather events (hot summers, cold winters).

Exercises 2.2

Use filter() to find what days corresponded to the peak closing price for each of the four stocks in gafa_stock.

Step 1: Load Required Libraries

Ensure you have the necessary packages installed and loaded.

# Install packages if not already installed
# install.packages("tidyverse")
# install.packages("fpp3")  # Includes gafa_stock dataset

# Load required libraries
library(fpp3)
library(dplyr)

Step 2: Identify the Peak Closing Price for Each Stock

We will use the group_by() function to group the dataset by stock (Symbol) and then use filter() to extract the days corresponding to the peak closing price.

# Find the peak closing price for each stock
peak_prices <- gafa_stock %>%
  group_by(Symbol) %>%  # Group by stock symbol (Google, Apple, Facebook, Amazon)
  filter(Close == max(Close)) %>%  # Filter for the max closing price per stock
  arrange(Symbol)  # Arrange alphabetically for better readability

# Display the result
print(peak_prices)

## # A tsibble: 4 x 8 [!]
## # Key:       Symbol [4]
## # Groups:    Symbol [4]
##   Symbol Date        Open  High   Low Close Adj_Close   Volume
##   <chr>  <date>     <dbl> <dbl> <dbl> <dbl>     <dbl>    <dbl>
## 1 AAPL   2018-10-03  230.  233.  230.  232.      230. 28654800
## 2 AMZN   2018-09-04 2026. 2050. 2013  2040.     2040.  5721100
## 3 FB     2018-07-25  216.  219.  214.  218.      218. 58954200
## 4 GOOG   2018-07-26 1251  1270. 1249. 1268.     1268.  2405600

Step 3: Explanation

group_by(Symbol): Ensures that we check the max closing price within each stock category.
max(Close): Identifies the highest closing price for each stock.
filter(Close == max(Close)): Selects only the rows where the closing price matches the maximum.
arrange(Symbol): Neatly arranges the output for readability.

Exercises 2.3

Download the file tute1.csv from the book website, open it in Excel (or some other spreadsheet application), and review its contents. You should find four columns of information. Columns B through D each contain a quarterly series, labelled Sales, AdBudget and GDP. Sales contains the quarterly sales for a small company over the period 1981-2005. AdBudget is the advertising budget and GDP is the gross domestic product. All series have been adjusted for inflation.

#a. You can read the data into R with the following script:
tute1 <- readr::read_csv("tute1.csv")

## Rows: 100 Columns: 4
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl  (3): Sales, AdBudget, GDP
## date (1): Quarter
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

View(tute1)

#Convert the data to time series
mytimeseries <- tute1 |>
  mutate(Quarter = yearquarter(Quarter)) |>
  as_tsibble(index = Quarter)

#Construct time series plots of each of the three series
mytimeseries |>
  pivot_longer(-Quarter) |>
  ggplot(aes(x = Quarter, y = value, colour = name)) +
  geom_line() +
  facet_grid(name ~ ., scales = "free_y")

Check What Happens Without facet_grid()

Now, let’s remove facet_grid() and see the effect.

# Plot without faceting
mytimeseries |>
  pivot_longer(-Quarter) |>
  ggplot(aes(x = Quarter, y = value, colour = name)) +
  geom_line() +
  labs(title = "Time Series of Sales, AdBudget, and GDP",
       x = "Year",
       y = "Value",
       colour = "Variable") +
  theme_minimal()

Comparison of Plots * With facet_grid(name ~ ., scales = “free_y”): * Each variable is plotted in its own panel, making it easier to observe individual trends. * The scales = “free_y” ensures each panel has an appropriate y-axis scale. * Without facet_grid(): * All three series appear on the same plot, sharing a common y-axis. * If the scales differ greatly (e.g., GDP vs. AdBudget), some lines might appear flat and unreadable.

Exercises 2.4

The USgas package contains data on the demand for natural gas in the US.

Install the USgas package. 2 ,Create a tsibble from us_total with year as the index and state as the key.
Plot the annual natural gas consumption by state for the New England area (comprising the states of Maine, Vermont, New Hampshire, Massachusetts, Connecticut and Rhode Island).

# Install necessary packages if not installed
# install.packages("USgas")
# install.packages("tidyverse")
# install.packages("fpp3")

# Load libraries
library(USgas)

## Warning: package 'USgas' was built under R version 4.4.3

library(tidyverse)

## Warning: package 'tidyverse' was built under R version 4.4.3

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ forcats 1.0.0     ✔ readr   2.1.5
## ✔ purrr   1.0.2     ✔ stringr 1.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter()     masks stats::filter()
## ✖ tsibble::interval() masks lubridate::interval()
## ✖ dplyr::lag()        masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(fpp3)

# Convert us_total to tsibble with correct column names
us_gas_tsibble <- us_total %>%
  as_tsibble(index = year, key = state)

# Check unique states to confirm New England states exist
unique(us_gas_tsibble$state)

##  [1] "Alabama"                            "Alaska"                            
##  [3] "Arizona"                            "Arkansas"                          
##  [5] "California"                         "Colorado"                          
##  [7] "Connecticut"                        "Delaware"                          
##  [9] "District of Columbia"               "Federal Offshore -- Gulf of Mexico"
## [11] "Florida"                            "Georgia"                           
## [13] "Hawaii"                             "Idaho"                             
## [15] "Illinois"                           "Indiana"                           
## [17] "Iowa"                               "Kansas"                            
## [19] "Kentucky"                           "Louisiana"                         
## [21] "Maine"                              "Maryland"                          
## [23] "Massachusetts"                      "Michigan"                          
## [25] "Minnesota"                          "Mississippi"                       
## [27] "Missouri"                           "Montana"                           
## [29] "Nebraska"                           "Nevada"                            
## [31] "New Hampshire"                      "New Jersey"                        
## [33] "New Mexico"                         "New York"                          
## [35] "North Carolina"                     "North Dakota"                      
## [37] "Ohio"                               "Oklahoma"                          
## [39] "Oregon"                             "Pennsylvania"                      
## [41] "Rhode Island"                       "South Carolina"                    
## [43] "South Dakota"                       "Tennessee"                         
## [45] "Texas"                              "U.S."                              
## [47] "Utah"                               "Vermont"                           
## [49] "Virginia"                           "Washington"                        
## [51] "West Virginia"                      "Wisconsin"                         
## [53] "Wyoming"

# Filter for New England states
new_england_gas <- us_gas_tsibble %>%
  filter(state %in% c("Maine", "Vermont", "New Hampshire", 
                      "Massachusetts", "Connecticut", "Rhode Island"))

# Ensure data exists
print(new_england_gas)

## # A tsibble: 138 x 3 [1Y]
## # Key:       state [6]
##     year state            y
##    <int> <chr>        <int>
##  1  1997 Connecticut 144708
##  2  1998 Connecticut 131497
##  3  1999 Connecticut 152237
##  4  2000 Connecticut 159712
##  5  2001 Connecticut 146278
##  6  2002 Connecticut 177587
##  7  2003 Connecticut 154075
##  8  2004 Connecticut 162642
##  9  2005 Connecticut 168067
## 10  2006 Connecticut 172682
## # ℹ 128 more rows

# Plot with correct y-axis mapping
new_england_gas %>%
  ggplot(aes(x = year, y = y, colour = state, group = state)) +  # Added group = state
  geom_line() +
  labs(title = "Annual Natural Gas Consumption in New England States",
       x = "Year",
       y = "Consumption (Billion Cubic Feet)",
       colour = "State") +
  theme_minimal()

Analysis & Interpretation * This plot will show how natural gas consumption has changed over time in each of the six New England states. * Trends & Patterns: * Some states might show increasing usage over time due to population growth and industrial demand. * Others might have stable or declining trends due to energy efficiency improvements or a shift to alternative energy sources.

Exercises 2.5

Download tourism.xlsx from the book website and read it into R using readxl::read_excel().
Create a tsibble which is identical to the tourism tsibble from the tsibble package.
Find what combination of Region and Purpose had the maximum number of overnight trips on average.
Create a new tsibble which combines the Purposes and Regions, and just has total trips by State.

Step 1: Install and Load Required Packages

Ensure that the required packages are installed and loaded.

# Load required libraries
library(readxl)

## Warning: package 'readxl' was built under R version 4.4.3

library(tidyverse)
library(fpp3)

Step 2: Read the tourism.xlsx File into R

Download tourism.xlsx from the book website and place it in your working directory.

# Read the Excel file
tourism_data <- read_excel("tourism.xlsx")

# View the first few rows
head(tourism_data)

## # A tibble: 6 × 5
##   Quarter    Region   State           Purpose  Trips
##   <chr>      <chr>    <chr>           <chr>    <dbl>
## 1 1998-01-01 Adelaide South Australia Business  135.
## 2 1998-04-01 Adelaide South Australia Business  110.
## 3 1998-07-01 Adelaide South Australia Business  166.
## 4 1998-10-01 Adelaide South Australia Business  127.
## 5 1999-01-01 Adelaide South Australia Business  137.
## 6 1999-04-01 Adelaide South Australia Business  200.

Step 3: Convert the Data into a tsibble

We need to structure the dataset into a tsibble similar to the tourism tsibble in the fpp3 package.

# Convert to a tsibble with Quarter as the index
tourism_tsibble <- tourism_data %>%
  mutate(Quarter = yearquarter(Quarter)) %>%  # Ensure Quarter is in yearquarter format
  as_tsibble(index = Quarter, key = c(Region, State, Purpose))

# Check the structure
glimpse(tourism_tsibble)

## Rows: 24,320
## Columns: 5
## Key: Region, State, Purpose [304]
## $ Quarter <qtr> 1998 Q1, 1998 Q2, 1998 Q3, 1998 Q4, 1999 Q1, 1999 Q2, 1999 Q3,…
## $ Region  <chr> "Adelaide", "Adelaide", "Adelaide", "Adelaide", "Adelaide", "A…
## $ State   <chr> "South Australia", "South Australia", "South Australia", "Sout…
## $ Purpose <chr> "Business", "Business", "Business", "Business", "Business", "B…
## $ Trips   <dbl> 135.0777, 109.9873, 166.0347, 127.1605, 137.4485, 199.9126, 16…

Step 4: Find the Combination of Region and Purpose with the Maximum Average Overnight Trips

We calculate the average number of trips for each combination of Region and Purpose, then find the maximum.

# Find the combination with the highest average trips
max_trips <- tourism_tsibble %>%
  group_by(Region, Purpose) %>%   # Group by Region and Purpose
  summarise(avg_trips = mean(Trips, na.rm = TRUE), .groups = "drop") %>% # Compute average trips
  arrange(desc(avg_trips)) %>%  # Sort in descending order
  slice(1)  # Select the top combination

## Warning: Current temporal ordering may yield unexpected results.
## ℹ Suggest to sort by `Region`, `Purpose`, `Quarter` first.

# Display result
print(max_trips)

## # A tsibble: 1 x 4 [1Q]
## # Key:       Region, Purpose [1]
##   Region    Purpose  Quarter avg_trips
##   <chr>     <chr>      <qtr>     <dbl>
## 1 Melbourne Visiting 2017 Q4      985.

Step 5: Create a tsibble with Total Trips by State

Now, we aggregate the total trips by State while combining all Regions and Purposes.

# Aggregate trips by State without grouping by Quarter
state_trips_tsibble <- tourism_tsibble %>%
  index_by(Quarter) %>%  # Ensure Quarter is treated correctly
  summarise(Total_Trips = sum(Trips, na.rm = TRUE), .groups = "drop") %>%  
  as_tsibble(index = Quarter)

# View the new tsibble
glimpse(state_trips_tsibble)

## Rows: 80
## Columns: 2
## $ Quarter     <qtr> 1998 Q1, 1998 Q2, 1998 Q3, 1998 Q4, 1999 Q1, 1999 Q2, 1999…
## $ Total_Trips <dbl> 23182.20, 20323.38, 19826.64, 20830.13, 22087.35, 21458.37…

Exercises 2.8

Use the following graphics functions: autoplot(), gg_season(), gg_subseries(), gg_lag(), ACF() and explore features from the following time series: “Total Private” Employed from us_employment, Bricks from aus_production, Hare from pelt, “H02” Cost from PBS, and Barrels from us_gasoline.

Can you spot any seasonality, cyclicity and trend?
What do you learn about the series?
What can you say about the seasonal patterns?
Can you identify any unusual years?

Step 1: Load Required Libraries

Ensure the necessary libraries are installed and loaded.

# Install packages if not already installed
# install.packages("fpp3")
# install.packages("ggplot2")

# Load required libraries
library(fpp3)
library(ggplot2)

Step 2: Explore Each Time Series

2.1 “Total Private” Employed from us_employment

# Load the dataset
data("us_employment")

# Filter for "Total Private" employment
total_private <- us_employment %>%
  filter(Title == "Total Private")

# Plot using autoplot()
autoplot(total_private, Employed) +
  ggtitle("Total Private Employment in the US") +
  ylab("Number of Employees") +
  xlab("Year")

# Seasonal plot
gg_season(total_private, Employed) +
  ggtitle("Seasonal Plot of Total Private Employment")

# Subseries plot
gg_subseries(total_private, Employed) +
  ggtitle("Subseries Plot of Total Private Employment")

# Lag plot
gg_lag(total_private, Employed, geom = "point") +
  ggtitle("Lag Plot of Total Private Employment")

# ACF plot
total_private %>%
  ACF(Employed) %>%
  autoplot() +
  ggtitle("ACF of Total Private Employment")

Observations: * Trend: The series shows a clear upward trend, indicating growth in private employment over time. * Seasonality: The seasonal plot reveals consistent patterns, likely tied to annual cycles (e.g., hiring during certain months). * Cyclicity: No strong evidence of cyclical behavior beyond seasonal effects. * Unusual Years: Check for dips during economic recessions (e.g., 2008 financial crisis).

2.2 Bricks from aus_production

# Load the dataset
data("aus_production")

# Plot using autoplot()
autoplot(aus_production, Bricks) +
  ggtitle("Australian Brick Production") +
  ylab("Millions of Units") +
  xlab("Year")

## Warning: Removed 20 rows containing missing values or values outside the scale range
## (`geom_line()`).

# Seasonal plot
gg_season(aus_production, Bricks) +
  ggtitle("Seasonal Plot of Australian Brick Production")

## Warning: Removed 20 rows containing missing values or values outside the scale range
## (`geom_line()`).

# Subseries plot
gg_subseries(aus_production, Bricks) +
  ggtitle("Subseries Plot of Australian Brick Production")

## Warning: Removed 5 rows containing missing values or values outside the scale range
## (`geom_line()`).

# Lag plot
gg_lag(aus_production, Bricks, geom = "point") +
  ggtitle("Lag Plot of Australian Brick Production")

## Warning: Removed 20 rows containing missing values (gg_lag).

# ACF plot
aus_production %>%
  ACF(Bricks) %>%
  autoplot() +
  ggtitle("ACF of Australian Brick Production")

Observations:

Trend: The series shows fluctuations but no clear long-term trend.
Seasonality: Strong seasonal patterns, likely tied to construction cycles (e.g., higher production in warmer months).
Cyclicity: Possible cyclical behavior due to economic conditions affecting construction.
Unusual Years: Look for years with significant drops or spikes, possibly due to economic events.

2.3 Hare from pelt

# Load the dataset
data("pelt")

# Plot using autoplot()
autoplot(pelt, Hare) +
  ggtitle("Hare Pelts Traded in Canada") +
  ylab("Number of Pelts") +
  xlab("Year")

# Seasonal plot (not applicable for yearly data)
# gg_season(pelt, Hare)  # Not meaningful for yearly data

# Subseries plot (not applicable for yearly data)
# gg_subseries(pelt, Hare)  # Not meaningful for yearly data

# Lag plot
gg_lag(pelt, Hare, geom = "point") +
  ggtitle("Lag Plot of Hare Pelts")

# ACF plot
pelt %>%
  ACF(Hare) %>%
  autoplot() +
  ggtitle("ACF of Hare Pelts")

Observations: * Trend: The series shows cyclical behavior, likely tied to predator-prey dynamics. * Seasonality: Not applicable (yearly data). * Cyclicity: Strong cyclical patterns, consistent with ecological cycles. * Unusual Years: Look for years with extreme values, possibly due to environmental factors.

2.4 “H02” Cost from PBS

# Load the dataset
data("PBS")

# Filter for "H02" cost
h02_cost <- PBS %>%
  filter(ATC2 == "H02")

# Inspect the structure of h02_cost to identify column names
glimpse(h02_cost)

## Rows: 816
## Columns: 9
## Key: Concession, Type, ATC1, ATC2 [4]
## $ Month      <mth> 1991 Jul, 1991 Aug, 1991 Sep, 1991 Oct, 1991 Nov, 1991 Dec,…
## $ Concession <chr> "Concessional", "Concessional", "Concessional", "Concession…
## $ Type       <chr> "Co-payments", "Co-payments", "Co-payments", "Co-payments",…
## $ ATC1       <chr> "H", "H", "H", "H", "H", "H", "H", "H", "H", "H", "H", "H",…
## $ ATC1_desc  <chr> "Systemic hormonal preparations, excl. sex hormones and ins…
## $ ATC2       <chr> "H02", "H02", "H02", "H02", "H02", "H02", "H02", "H02", "H0…
## $ ATC2_desc  <chr> "CORTICOSTEROIDS FOR SYSTEMIC USE", "CORTICOSTEROIDS FOR SY…
## $ Scripts    <dbl> 63261, 53528, 52822, 54016, 49281, 51798, 42436, 52913, 629…
## $ Cost       <dbl> 317384.0, 269891.0, 269703.0, 280418.0, 268070.0, 277139.0,…

# Plot using autoplot()
autoplot(h02_cost, Cost) +
  ggtitle("H02 Cost in Australia") +
  ylab("Cost") +
  xlab("Year")

# Seasonal plot
gg_season(h02_cost, Cost) +
  ggtitle("Seasonal Plot of H02 Cost")

# Subseries plot
gg_subseries(h02_cost, Cost) +
  ggtitle("Subseries Plot of H02 Cost")

# Filter for a single time series
# Based on the structure, let's filter by Concession and Type
h02_single_series <- h02_cost %>%
  filter(Concession == "Concessional", Type == "Co-payments")

# Check the filtered data
print(h02_single_series)

## # A tsibble: 204 x 9 [1M]
## # Key:       Concession, Type, ATC1, ATC2 [1]
##       Month Concession   Type     ATC1  ATC1_desc ATC2  ATC2_desc Scripts   Cost
##       <mth> <chr>        <chr>    <chr> <chr>     <chr> <chr>       <dbl>  <dbl>
##  1 1991 Jul Concessional Co-paym… H     Systemic… H02   CORTICOS…   63261 317384
##  2 1991 Aug Concessional Co-paym… H     Systemic… H02   CORTICOS…   53528 269891
##  3 1991 Sep Concessional Co-paym… H     Systemic… H02   CORTICOS…   52822 269703
##  4 1991 Oct Concessional Co-paym… H     Systemic… H02   CORTICOS…   54016 280418
##  5 1991 Nov Concessional Co-paym… H     Systemic… H02   CORTICOS…   49281 268070
##  6 1991 Dec Concessional Co-paym… H     Systemic… H02   CORTICOS…   51798 277139
##  7 1992 Jan Concessional Co-paym… H     Systemic… H02   CORTICOS…   42436 221772
##  8 1992 Feb Concessional Co-paym… H     Systemic… H02   CORTICOS…   52913 272345
##  9 1992 Mar Concessional Co-paym… H     Systemic… H02   CORTICOS…   62908 325700
## 10 1992 Apr Concessional Co-paym… H     Systemic… H02   CORTICOS…   68499 349271
## # ℹ 194 more rows

# Lag plot for the filtered series
gg_lag(h02_single_series, Cost, geom = "point") +
  ggtitle("Lag Plot of H02 Cost (Filtered Series)") +
  xlab("Lag") +
  ylab("Cost")

# ACF plot
h02_cost %>%
  ACF(Cost) %>%
  autoplot() +
  ggtitle("ACF of H02 Cost")

Observations: * Trend: The series may show an increasing trend, possibly due to inflation or increased usage. * Seasonality: Strong seasonal patterns, likely tied to healthcare usage (e.g., higher costs in winter). * Cyclicity: No strong evidence of cyclical behavior beyond seasonal effects. * Unusual Years: Look for years with significant changes, possibly due to policy changes.

2.5 Barrels from us_gasoline

# Load the dataset
data("us_gasoline")

# Plot using autoplot()
autoplot(us_gasoline, Barrels) +
  ggtitle("US Gasoline Production") +
  ylab("Barrels") +
  xlab("Year")

# Seasonal plot
gg_season(us_gasoline, Barrels) +
  ggtitle("Seasonal Plot of US Gasoline Production")

# Subseries plot
gg_subseries(us_gasoline, Barrels) +
  ggtitle("Subseries Plot of US Gasoline Production")

# Lag plot
gg_lag(us_gasoline, Barrels, geom = "point") +
  ggtitle("Lag Plot of US Gasoline Production")

# ACF plot
us_gasoline %>%
  ACF(Barrels) %>%
  autoplot() +
  ggtitle("ACF of US Gasoline Production")

Observations: * Trend: The series shows an increasing trend, reflecting growing gasoline production. * Seasonality: Strong seasonal patterns, likely tied to driving seasons (e.g., higher production in summer). * Cyclicity: No strong evidence of cyclical behavior beyond seasonal effects. * Unusual Years: Look for years with significant drops, possibly due to economic downturns or supply disruptions.

Summary of Findings

Seasonality: Most series exhibit strong seasonal patterns (e.g., Bricks, H02 Cost, us_gasoline).
Trend: Many series show clear trends (e.g., Total Private employment, us_gasoline).
Cyclicity: Some series exhibit cyclical behavior (e.g., Hare pelts).
Unusual Years: Dips or spikes in certain years may correspond to economic or environmental events.

HW1_redone

Taha Malik

2025-03-09

Exercises 2.1

Step 1: Load Required Libraries

Step 2: Exploring the Time Series Data

Step 3: Plotting Each Time Series with autoplot()

3.1 Plotting Bricks Production in Australia

3.2 Plotting Lynx Pelt Trading Data

3.3 Plotting GAFA Stock Prices

3.4 Plotting Electricity Demand in Victoria with Custom Labels

Exercises 2.2

Step 1: Load Required Libraries

Step 2: Identify the Peak Closing Price for Each Stock

Step 3: Explanation

Exercises 2.3

Check What Happens Without facet_grid()

Exercises 2.4

Exercises 2.5

Step 1: Install and Load Required Packages

Step 2: Read the tourism.xlsx File into R

Step 3: Convert the Data into a tsibble

Step 4: Find the Combination of Region and Purpose with the Maximum Average Overnight Trips

Step 5: Create a tsibble with Total Trips by State

Exercises 2.8

Step 1: Load Required Libraries

Step 2: Explore Each Time Series

Summary of Findings