Explore the following four time series:
Answer:
Loading the library:
library(fpp3)
## Registered S3 method overwritten by 'tsibble':
## method from
## as_tibble.grouped_df dplyr
## ── Attaching packages ──────────────────────────────────────────── fpp3 1.0.1 ──
## ✔ tibble 3.2.1 ✔ tsibble 1.1.6
## ✔ dplyr 1.1.4 ✔ tsibbledata 0.4.1
## ✔ tidyr 1.3.1 ✔ feasts 0.4.1
## ✔ lubridate 1.9.2 ✔ fable 0.4.1
## ✔ ggplot2 3.5.1
## Warning: package 'tibble' was built under R version 4.2.3
## Warning: package 'dplyr' was built under R version 4.2.3
## Warning: package 'tidyr' was built under R version 4.2.3
## Warning: package 'lubridate' was built under R version 4.2.3
## Warning: package 'ggplot2' was built under R version 4.2.3
## Warning: package 'tsibbledata' was built under R version 4.2.3
## ── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
## ✖ lubridate::date() masks base::date()
## ✖ dplyr::filter() masks stats::filter()
## ✖ tsibble::intersect() masks base::intersect()
## ✖ tsibble::interval() masks lubridate::interval()
## ✖ dplyr::lag() masks stats::lag()
## ✖ tsibble::setdiff() masks base::setdiff()
## ✖ tsibble::union() masks base::union()
?aus_production
## starting httpd help server ... done
aus_production: A time series dataset of quarterly estimates of selected indicators of manufacturing production in Australia. It is a quarterly tsibble with six measured variables. ‘Bricks’ is one of them and records clay brick production in millions of bricks.
?pelt
pelt: The pelt dataset contains Hudson Bay Company trading records for Snowshoe Hare and Canadian Lynx furs from 1845 to 1935. It includes trade records from all areas of the company. This dataset is an annual tsibble with two variables: ‘Hare’ and ‘Lynx’. The ‘Lynx’ variable represents the number of Canadian Lynx pelts traded.
?gafa_stock
gafa_stock: The gafa_stock dataset contains historical stock prices for Google, Amazon, Facebook, and Apple from 2014 to 2018. All prices are in USD ($). It is a tsibble with data recorded on irregular trading days, including the following variables: Open, High, Low, Close, Adj_Close, and Volume. The ‘Close’ variable represents the closing price of the stock.
?vic_elec
vic_elec: The vic_elec dataset contains half-hourly electricity demand data for Victoria, Australia. It is a half-hourly tsibble with three variables: Demand, Temperature, and Holiday. The ‘Demand’ variable represents the total electricity demand in megawatt-hours (MWh).
We can use ‘interval()’ to find the time interval of each series:
interval(aus_production)
## <interval[1]>
## [1] 1Q
interval(pelt)
## <interval[1]>
## [1] 1Y
interval(gafa_stock)
## <interval[1]>
## [1] !
interval(vic_elec)
## <interval[1]>
## [1] 30m
The output above shows the time interval of each series:
aus_production: Quarterly
pelt: Yearly
gafa_stock: Irregular intervals (based on trading days)
vic_elec: Half-hourly
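The difference between a regular and an irregular interval can be reproduced on toy data. The sketch below is only an illustration: the `daily` and `irregular` objects and their values are made up, and it assumes that an irregular tsibble such as gafa_stock is declared with `regular = FALSE` when it is built.

```r
library(tsibble)

# Hypothetical daily series: one observation per calendar day.
daily <- tsibble(
  date  = as.Date("2024-01-01") + 0:4,
  value = 1:5,
  index = date
)

# Hypothetical series with gaps (e.g. no weekend observations),
# explicitly declared as irregular.
irregular <- tsibble(
  date    = as.Date("2024-01-01") + c(0, 1, 4, 5, 7),
  value   = 1:5,
  index   = date,
  regular = FALSE
)

interval(daily)        # a regular daily interval
interval(irregular)    # printed as "!" for irregular data
is_regular(daily)      # TRUE
is_regular(irregular)  # FALSE
```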
Use autoplot() to produce a time plot of each series.
We will use autoplot() to produce a time plot of each series for the following variables: ‘Bricks’ from aus_production, ‘Lynx’ from pelt, ‘Close’ from gafa_stock, and ‘Demand’ from vic_elec.
autoplot(aus_production, Bricks) +
ggtitle("Quarterly clay brick production in millions of bricks") + xlab("Year") +
ylab("Bricks Production (Millions)")
## Warning: Removed 20 rows containing missing values or values outside the scale range
## (`geom_line()`).
The plot shows quarterly clay brick production in Australia. We can see a long-term increasing trend from the 1950s to the 1980s, followed by fluctuations and a decline after 1990. This may indicate economic or industry shifts.
autoplot(pelt, Lynx) +
ggtitle("Canadian Lynx Pelts Trading Record")
The plot shows a cyclical pattern in Canadian Lynx pelt trading records: each cycle is a period of rapid growth followed by a sharp decline.
colnames(gafa_stock)
## [1] "Symbol" "Date" "Open" "High" "Low" "Close"
## [7] "Adj_Close" "Volume"
autoplot(gafa_stock, Close) +
ggtitle("Stock Closing Prices (2014-2018)")
The plot shows that Amazon and Google saw significant growth in their stock prices from 2014 to 2018, with sharp rises followed by declines. We can also see that Apple and Facebook had lower prices and more stable trends.
autoplot(vic_elec, Demand)
The plot shows half-hourly electricity demand in Victoria, Australia, from 2012 through 2014, with strong seasonal patterns and high variability.
autoplot(vic_elec, Demand) +
ggtitle("Half-Hourly Electricity Demand in Victoria (2012-2014)") +
xlab("Year") +
ylab("Electricity Demand (MWh)")
I added the title “Half-Hourly Electricity Demand in Victoria (2012-2014)” and labelled the x-axis “Year” and the y-axis “Electricity Demand (MWh)”.
Use filter() to find what days corresponded to the peak closing price for each of the four stocks in gafa_stock.
Answer:
In this exercise, we must include the group_by() function to group the data by “Symbol” (stock ticker) to ensure that we find the peak closing price for each individual stock. If we only use filter() without group_by(), the result would return just one date—specifically, the single day when any one of the four stocks had the highest closing price in the entire dataset, rather than showing the peak for each stock separately.
library(knitr)
gafa_stock %>%
group_by(Symbol) %>%
filter(Close == max(Close)) %>%
select(Symbol, Date, Close) %>%
kable(col.names = c("Stock", "Peak Date", "Closing Price ($)"),
caption = "Peak Closing Prices for GAFA Stocks (2014-2018)")
| Stock | Peak Date | Closing Price ($) |
|---|---|---|
| AAPL | 2018-10-03 | 232.07 |
| AMZN | 2018-09-04 | 2039.51 |
| FB | 2018-07-25 | 217.50 |
| GOOG | 2018-07-26 | 1268.33 |
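The effect of group_by() described above can be checked on a small made-up table. This is a minimal sketch, not the gafa_stock data: the `prices` data frame and its values are hypothetical.

```r
library(dplyr)

# Hypothetical closing prices for two made-up tickers.
prices <- data.frame(
  Symbol = c("A", "A", "B", "B"),
  Date   = as.Date("2024-01-01") + c(0, 1, 0, 1),
  Close  = c(10, 12, 100, 90)
)

# Without grouping: max(Close) is taken over the whole table,
# so only the single overall peak is returned.
ungrouped <- prices |>
  filter(Close == max(Close))

# With grouping: max(Close) is evaluated within each Symbol,
# so one peak row per ticker is returned.
grouped <- prices |>
  group_by(Symbol) |>
  filter(Close == max(Close)) |>
  ungroup()

ungrouped
grouped
```

Here `ungrouped` holds only the row for ticker B at 100, while `grouped` holds one peak row for each ticker.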
Download the file “tute1.csv” from the book website, open it in Excel (or some other spreadsheet application), and review its contents. You should find four columns of information. Columns B through D each contain a quarterly series, labelled Sales, AdBudget and GDP. Sales contains the quarterly sales for a small company over the period 1981-2005. AdBudget is the advertising budget and GDP is the gross domestic product. All series have been adjusted for inflation.
a. You can read the data into R with the following script:
Answer:
tute1 <- readr::read_csv("https://raw.githubusercontent.com/FarhanaAkther23/DATA624/refs/heads/main/DATA624%20-%20HW1/tute1.csv")
## Rows: 100 Columns: 4
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): Quarter
## dbl (3): Sales, AdBudget, GDP
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
View(tute1)
b. Convert the data to time series
mytimeseries <- tute1 |>
mutate(Quarter = yearquarter(Quarter)) |>
as_tsibble(index = Quarter)
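To see what the conversion step does, here is a small sketch on a made-up data frame. The `raw` object and its values are hypothetical, and it assumes the Quarter column arrives as ISO date strings, as the `chr` column type in the read_csv() message suggests.

```r
library(tsibble)
library(dplyr)

# Hypothetical rows shaped like the CSV: Quarter is read as text.
raw <- data.frame(
  Quarter = c("1981-03-01", "1981-06-01", "1981-09-01"),
  Sales   = c(1020.2, 889.2, 795.0)
)

# yearquarter() turns each date string into a quarterly time index,
# which as_tsibble() can then use as the index of the series.
toy_ts <- raw |>
  mutate(Quarter = yearquarter(Quarter)) |>
  as_tsibble(index = Quarter)

toy_ts
interval(toy_ts)  # quarterly once the index is a yearquarter
```

If the column were left as text, as_tsibble() would have no time index to work with, which is why the mutate() step comes first.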
c. Construct time series plots of each of the three series
mytimeseries |>
pivot_longer(-Quarter) |>
ggplot(aes(x = Quarter, y = value, colour = name)) +
geom_line() +
facet_grid(name ~ ., scales = "free_y")
Check what happens when you don’t include facet_grid().
mytimeseries |>
pivot_longer(-Quarter) |>
ggplot(aes(x = Quarter, y = value, colour = name)) +
geom_line()
Without ‘facet_grid()’, all three time series are plotted on the same panel and share the same y-axis scale, which makes their individual variations harder to distinguish. As we can see in the first plot (with ‘facet_grid()’ and free y-axis scales), all three series show similar seasonal patterns. When they are plotted together on one axis, the differences in scale flatten the lower-valued series and obscure their variation, which can lead to misinterpretation of the relationships between the variables.
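The scale effect can be put into numbers. The sketch below uses made-up values (not the actual tute1 data) for three series on different scales and computes how much of a shared y-axis each series would occupy.

```r
# Hypothetical values, chosen only to mimic series on different scales.
toy <- data.frame(
  Sales    = c(1020, 890, 795, 1004),
  AdBudget = c(659, 590, 500, 694),
  GDP      = c(252, 291, 291, 292)
)

# Vertical extent of a shared axis: from the overall minimum to the overall maximum.
overall <- diff(range(unlist(toy)))

# Share of that extent covered by each series' own range.
share <- sapply(toy, function(x) diff(range(x)) / overall)
round(share, 2)
```

In this toy case the GDP-like series covers only about 5% of the shared axis, so its movements look almost flat, whereas a free y-axis per facet lets each series fill its own panel.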
Answer:
The “USgas” package contains data on the demand for natural gas in the US.
a. Install the USgas package.
library(USgas)
## Warning: package 'USgas' was built under R version 4.2.3
b. Create a tsibble from us_total with year as the index and state as the key.
I used the examples from the book to construct the tsibble from us_total, with year as the index and state as the key.
us_gas_tsibble <- us_total |>
as_tsibble(index = year, key = state)
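The role of the key can be illustrated with a toy table. This is a sketch under assumptions: the `toy` data frame and the state names "X" and "Y" are made up. Because each year appears once per state, the year alone does not identify a row, and the key is what separates the individual series.

```r
library(tsibble)

# Hypothetical annual data for two made-up states.
toy <- data.frame(
  year  = rep(2001:2003, times = 2),
  state = rep(c("X", "Y"), each = 3),
  y     = c(10, 11, 12, 20, 21, 22)
)

# With the key, each (state, year) pair identifies exactly one row.
toy_ts <- as_tsibble(toy, index = year, key = state)
n_keys(toy_ts)  # number of separate series in the tsibble

# Without the key, the same year occurs twice, so construction fails.
no_key <- try(as_tsibble(toy, index = year), silent = TRUE)
inherits(no_key, "try-error")
```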
c. Plot the annual natural gas consumption by state for the New England area (comprising the states of Maine, Vermont, New Hampshire, Massachusetts, Connecticut and Rhode Island).
We can use dplyr functions such as “mutate()”, “filter()”, “select()” and “summarise()” to work with tsibble objects. To filter for the New England states, let’s first list the state names with “unique()”, then filter the dataset to include only these six states.
unique(us_gas_tsibble$state)
## [1] "Alabama" "Alaska"
## [3] "Arizona" "Arkansas"
## [5] "California" "Colorado"
## [7] "Connecticut" "Delaware"
## [9] "District of Columbia" "Federal Offshore -- Gulf of Mexico"
## [11] "Florida" "Georgia"
## [13] "Hawaii" "Idaho"
## [15] "Illinois" "Indiana"
## [17] "Iowa" "Kansas"
## [19] "Kentucky" "Louisiana"
## [21] "Maine" "Maryland"
## [23] "Massachusetts" "Michigan"
## [25] "Minnesota" "Mississippi"
## [27] "Missouri" "Montana"
## [29] "Nebraska" "Nevada"
## [31] "New Hampshire" "New Jersey"
## [33] "New Mexico" "New York"
## [35] "North Carolina" "North Dakota"
## [37] "Ohio" "Oklahoma"
## [39] "Oregon" "Pennsylvania"
## [41] "Rhode Island" "South Carolina"
## [43] "South Dakota" "Tennessee"
## [45] "Texas" "U.S."
## [47] "Utah" "Vermont"
## [49] "Virginia" "Washington"
## [51] "West Virginia" "Wisconsin"
## [53] "Wyoming"
new_england_gas <- us_gas_tsibble |>
filter(state %in% c("Maine", "Vermont", "New Hampshire",
"Massachusetts", "Connecticut", "Rhode Island"))
glimpse(new_england_gas)
## Rows: 138
## Columns: 3
## Key: state [6]
## $ year <int> 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007…
## $ state <chr> "Connecticut", "Connecticut", "Connecticut", "Connecticut", "Con…
## $ y <int> 144708, 131497, 152237, 159712, 146278, 177587, 154075, 162642, …
new_england_gas |>
ggplot(aes(x = year, y = y, colour = state)) +
geom_line() +
facet_wrap(~state, scales = "free_y") +
ggtitle("Annual Natural Gas Consumption in New England (1997-2019)") +
xlab("Year") +
ylab("Natural Gas Consumption") +
theme_minimal()
From the graph, we can see that natural gas consumption has been steadily increasing in Connecticut, Massachusetts, and Vermont, while Maine, Rhode Island, and New Hampshire have shown more fluctuations, with Maine experiencing a sharp peak before declining.
Answer:
a. Download tourism.xlsx from the book website and read it into R using readxl::read_excel().
We use “?tourism” to explore the “tourism” tsibble. Then, after loading the Excel file, I followed the book’s examples to create tsibbles, identifying the “index” column and the “key” columns.
?tourism
library(readxl)
## Warning: package 'readxl' was built under R version 4.2.3
tourism <- readxl::read_excel("DATA624 - HW1/tourism.xlsx")
glimpse(tourism)
## Rows: 24,320
## Columns: 5
## $ Quarter <chr> "1998-01-01", "1998-04-01", "1998-07-01", "1998-10-01", "1999-…
## $ Region <chr> "Adelaide", "Adelaide", "Adelaide", "Adelaide", "Adelaide", "A…
## $ State <chr> "South Australia", "South Australia", "South Australia", "Sout…
## $ Purpose <chr> "Business", "Business", "Business", "Business", "Business", "B…
## $ Trips <dbl> 135.0777, 109.9873, 166.0347, 127.1605, 137.4485, 199.9126, 16…
b. Create a tsibble which is identical to the “tourism” tsibble from the “tsibble” package.
We need to match the format of the built-in ‘tourism’ dataset:
tourism_tsibble <- tourism |>
mutate(Quarter = yearquarter(Quarter)) |>
as_tsibble(index = Quarter, key = c(Region, State, Purpose))
glimpse(tourism_tsibble) # check the result
## Rows: 24,320
## Columns: 5
## Key: Region, State, Purpose [304]
## $ Quarter <qtr> 1998 Q1, 1998 Q2, 1998 Q3, 1998 Q4, 1999 Q1, 1999 Q2, 1999 Q3,…
## $ Region <chr> "Adelaide", "Adelaide", "Adelaide", "Adelaide", "Adelaide", "A…
## $ State <chr> "South Australia", "South Australia", "South Australia", "Sout…
## $ Purpose <chr> "Business", "Business", "Business", "Business", "Business", "B…
## $ Trips <dbl> 135.0777, 109.9873, 166.0347, 127.1605, 137.4485, 199.9126, 16…
c. Find what combination of ‘Region’ and ‘Purpose’ had the maximum number of overnight trips on average.
The average number of trips for each Region - Purpose combination:
tourism_tsibble |>
as_tibble() |>
group_by(Region, Purpose) |>
summarise(Avg_Trips = mean(Trips, na.rm = TRUE), .groups = "drop") |>
arrange(desc(Avg_Trips)) |>
head()
## # A tibble: 6 × 3
## Region Purpose Avg_Trips
## <chr> <chr> <dbl>
## 1 Sydney Visiting 747.
## 2 Melbourne Visiting 619.
## 3 Sydney Business 602.
## 4 North Coast NSW Holiday 588.
## 5 Sydney Holiday 550.
## 6 Gold Coast Holiday 528.
The maximum number of overnight trips on average:
max_avg_trips <- tourism |>
group_by(Region, Purpose) |>
summarise(Avg_Trips = mean(Trips, na.rm = TRUE), .groups = "drop") |>
arrange(desc(Avg_Trips)) |>
slice(1)
max_avg_trips
## # A tibble: 1 × 3
## Region Purpose Avg_Trips
## <chr> <chr> <dbl>
## 1 Sydney Visiting 747.
The combination with the maximum number of overnight trips on average is Sydney with the purpose Visiting, at about 747.27 trips.
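As an alternative to arrange() followed by slice(1), dplyr's slice_max() picks the top row directly. The sketch below runs on a made-up summary table (the `avg_trips` data frame and its values are hypothetical) and is just another way to express the same step.

```r
library(dplyr)

# Hypothetical averages per Region and Purpose combination.
avg_trips <- data.frame(
  Region    = c("A", "B", "C"),
  Purpose   = c("Holiday", "Business", "Visiting"),
  Avg_Trips = c(120.5, 310.2, 95.0)
)

# Keep the single row with the largest Avg_Trips.
# Note: by default slice_max() keeps all tied rows if there is a tie.
top <- avg_trips |>
  slice_max(Avg_Trips, n = 1)

top
```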
d. Create a new ‘tsibble’ which combines the Purposes and Regions, and just has total trips by State.
In this exercise, we need to compute the total number of trips by State while combining all Purposes and Regions. This means we should group by State and Quarter and sum up all trips. To achieve this, we used the group_by() function with State and Quarter to aggregate trips across all regions and purposes, followed by the summarise() function to compute the total trips per State per Quarter. Finally, we converted the resulting dataset into a tsibble to preserve the time series structure.
state_trips_tsibble <- tourism |>
mutate(Quarter = yearquarter(Quarter)) |> # Ensure Quarter is in correct format
group_by(Quarter, State) |>
summarise(Total_Trips = sum(Trips, na.rm = TRUE), .groups = "drop") |> # Sum up trips
as_tsibble(index = Quarter, key = State)
state_trips_tsibble
## # A tsibble: 640 x 3 [1Q]
## # Key: State [8]
## Quarter State Total_Trips
## <qtr> <chr> <dbl>
## 1 1998 Q1 ACT 551.
## 2 1998 Q2 ACT 416.
## 3 1998 Q3 ACT 436.
## 4 1998 Q4 ACT 450.
## 5 1999 Q1 ACT 379.
## 6 1999 Q2 ACT 558.
## 7 1999 Q3 ACT 449.
## 8 1999 Q4 ACT 595.
## 9 2000 Q1 ACT 600.
## 10 2000 Q2 ACT 557.
## # ℹ 630 more rows
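The same aggregation can also be done directly on a tsibble. When summarise() is applied to a tsibble, the time index is kept implicitly, so grouping by State alone is enough. The sketch below shows this on a made-up tsibble; the `toy` object, its region and state names, and its values are hypothetical.

```r
library(tsibble)
library(dplyr)

# Hypothetical tsibble: two states, two regions each, two quarters.
toy <- tsibble(
  Quarter = rep(yearquarter(c("2020 Q1", "2020 Q2")), times = 4),
  Region  = rep(c("R1", "R2", "R3", "R4"), each = 2),
  State   = rep(c("S1", "S1", "S2", "S2"), each = 2),
  Trips   = c(1, 2, 3, 4, 5, 6, 7, 8),
  index   = Quarter,
  key     = c(Region, State)
)

# Group by State only; the Quarter index is retained automatically,
# giving one total per State per Quarter.
by_state <- toy |>
  group_by(State) |>
  summarise(Total_Trips = sum(Trips))

by_state
key_vars(by_state)
```

The result is already a tsibble keyed by State, so no separate as_tsibble() call is needed in this variant.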
Use the following graphics functions: autoplot(), gg_season(), gg_subseries(), gg_lag(), ACF() and explore features from the following time series: “Total Private” Employed from us_employment, Bricks from aus_production, Hare from pelt, “H02” Cost from PBS, and Barrels from us_gasoline.
- Can you spot any seasonality, cyclicity and trend?
- What do you learn about the series?
- What can you say about the seasonal patterns?
- Can you identify any unusual years?
Let’s understand the time series data.
Let’s first review what each of these graphics functions does (autoplot(), gg_season(), gg_subseries(), gg_lag(), ACF()), and then check the structure of each dataset to confirm the available variables and how they are formatted.
autoplot(): First step in time series analysis to get a general sense of the data. It creates a basic time series plot and displays the trend, seasonality, and fluctuations over time as well as helps us to identify long-term growth, declines, or sudden changes.
gg_season(): Compares and helps spot seasonal patterns across years. It shows how the time series behaves within each season (e.g., months, quarters). If seasonality exists, we will see consistent peaks and troughs at the same points each year.
gg_subseries(): Helps compare the relative level of each season by splitting the data into a separate panel for each season (month/quarter). Each panel shows that season’s values across the years, with a horizontal line marking the season’s mean. If strong seasonality exists, the panels will show consistent differences (e.g., Dec always being high, Jan always being low).
gg_lag(): Provides a visual check for patterns & seasonality. It plots current values vs. past values at different lags. If points form a diagonal line, the series is highly correlated (trend-dominant). If circular or repeating patterns appear, it suggests seasonality.
ACF(): Quantifies how strongly current values are linearly correlated with past values (lags). Strong peaks at regular lags suggest seasonality. If autocorrelation declines gradually as the lag increases, it indicates a trend rather than seasonality.
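The statement about peaks at regular lags can be checked on a synthetic series. The sketch below uses base R's acf() rather than the feasts ACF() function, and the series `x` is made up: a pure sine wave with a period of 12, standing in for monthly data with an annual pattern.

```r
# Hypothetical "monthly" series: ten years of a pure 12-period cycle.
t <- 1:120
x <- sin(2 * pi * t / 12)

# Sample autocorrelations up to lag 12 (element 1 of $acf is lag 0).
r <- acf(x, lag.max = 12, plot = FALSE)$acf

r_lag6  <- r[6 + 1]   # half a cycle apart: peaks line up with troughs
r_lag12 <- r[12 + 1]  # a full cycle apart: peaks line up with peaks

round(c(lag6 = r_lag6, lag12 = r_lag12), 2)
```

The lag-12 autocorrelation is strongly positive and the lag-6 autocorrelation is strongly negative, which is the signature of a repeating 12-period pattern that the ACF plots look for.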
Now let’s explore features from the following time series:
us_employment:
glimpse(us_employment)
## Rows: 143,412
## Columns: 4
## Key: Series_ID [148]
## $ Month <mth> 1939 Jan, 1939 Feb, 1939 Mar, 1939 Apr, 1939 May, 1939 Jun, …
## $ Series_ID <chr> "CEU0500000001", "CEU0500000001", "CEU0500000001", "CEU05000…
## $ Title <chr> "Total Private", "Total Private", "Total Private", "Total Pr…
## $ Employed <dbl> 25338, 25447, 25833, 25801, 26113, 26485, 26481, 26848, 2746…
head(us_employment)
## # A tsibble: 6 x 4 [1M]
## # Key: Series_ID [1]
## Month Series_ID Title Employed
## <mth> <chr> <chr> <dbl>
## 1 1939 Jan CEU0500000001 Total Private 25338
## 2 1939 Feb CEU0500000001 Total Private 25447
## 3 1939 Mar CEU0500000001 Total Private 25833
## 4 1939 Apr CEU0500000001 Total Private 25801
## 5 1939 May CEU0500000001 Total Private 26113
## 6 1939 Jun CEU0500000001 Total Private 26485
autoplot():
us_employment_private <- us_employment |>
filter(Title == "Total Private")
autoplot(us_employment_private, Employed) +
ggtitle("Total Private Employment in the U.S.")
gg_season():
us_employment_private |>
gg_season(Employed) +
ggtitle("Seasonal Patterns in Total Private Employment")
gg_subseries():
us_employment_private |>
gg_subseries(Employed) +
ggtitle("Subseries Plot: Total Private Employment")
gg_lag():
us_employment_private |>
gg_lag(Employed, geom = "point") +
ggtitle("Lag Plot: Total Private Employment")
ACF():
us_employment_private |>
ACF(Employed) |>
autoplot() +
ggtitle("Autocorrelation of Total Private Employment")
Overall Analysis: The Total Private Employment data shows a steady increase over time, reflecting long-term economic growth in the U.S. While there are some ups and downs, especially during economic downturns like the 2008 financial crisis, these cycles don’t follow a regular pattern. Seasonal effects are minimal, as employment levels don’t show consistent fluctuations within a year. This was confirmed by the seasonal plots (gg_season() and gg_subseries()) and the autocorrelation function (ACF()). The lag plot (gg_lag()) and ACF plot indicate strong correlation between past and future values, meaning employment tends to follow a predictable growth trend.
aus_production:
glimpse(aus_production)
## Rows: 218
## Columns: 7
## $ Quarter <qtr> 1956 Q1, 1956 Q2, 1956 Q3, 1956 Q4, 1957 Q1, 1957 Q2, 1957…
## $ Beer <dbl> 284, 213, 227, 308, 262, 228, 236, 320, 272, 233, 237, 313…
## $ Tobacco <dbl> 5225, 5178, 5297, 5681, 5577, 5651, 5317, 6152, 5758, 5641…
## $ Bricks <dbl> 189, 204, 208, 197, 187, 214, 227, 222, 199, 229, 249, 234…
## $ Cement <dbl> 465, 532, 561, 570, 529, 604, 603, 582, 554, 620, 646, 637…
## $ Electricity <dbl> 3923, 4436, 4806, 4418, 4339, 4811, 5259, 4735, 4608, 5196…
## $ Gas <dbl> 5, 6, 7, 6, 5, 7, 7, 6, 5, 7, 8, 6, 5, 7, 8, 6, 6, 8, 8, 7…
head(aus_production)
## # A tsibble: 6 x 7 [1Q]
## Quarter Beer Tobacco Bricks Cement Electricity Gas
## <qtr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1956 Q1 284 5225 189 465 3923 5
## 2 1956 Q2 213 5178 204 532 4436 6
## 3 1956 Q3 227 5297 208 561 4806 7
## 4 1956 Q4 308 5681 197 570 4418 6
## 5 1957 Q1 262 5577 187 529 4339 5
## 6 1957 Q2 228 5651 214 604 4811 7
autoplot():
aus_production |>
autoplot(Bricks) +
ggtitle("Quarterly Brick Production in Australia")
## Warning: Removed 20 rows containing missing values or values outside the scale range
## (`geom_line()`).
gg_season():
aus_production |>
gg_season(Bricks) +
ggtitle("Seasonal Patterns in Brick Production")
## Warning: Removed 20 rows containing missing values or values outside the scale range
## (`geom_line()`).
gg_subseries():
aus_production |>
gg_subseries(Bricks) +
ggtitle("Subseries Plot: Brick Production by Quarter") +
xlab("Year") +
ylab("Bricks Produced (Millions)")
## Warning: Removed 5 rows containing missing values or values outside the scale range
## (`geom_line()`).
gg_lag():
aus_production |>
gg_lag(Bricks, geom = "point") +
ggtitle("Lag Plot: Brick Production")
## Warning: Removed 20 rows containing missing values (gg_lag).
ACF():
aus_production |>
ACF(Bricks) |>
autoplot() +
ggtitle("Autocorrelation of Brick Production")
Overall Analysis: The brick production time series shows a clear upward trend from the 1950s to the 1980s, followed by a gradual decline over the next 25 years. Seasonality is present, as seen in the gg_season() plot, with production consistently peaking in Q3 and dropping in Q4. The lag plot (gg_lag()) and autocorrelation function (ACF()) confirm a strong quarterly seasonal cycle, with peaks at lags of 4, 8, and 12 quarters. Additionally, cyclical behavior is evident, as the long-term trend shifts direction around the mid-point of the dataset, suggesting economic or construction industry cycles influencing brick production.
pelt:
?pelt
glimpse(pelt)
## Rows: 91
## Columns: 3
## $ Year <dbl> 1845, 1846, 1847, 1848, 1849, 1850, 1851, 1852, 1853, 1854, 1855,…
## $ Hare <dbl> 19580, 19600, 19610, 11990, 28040, 58000, 74600, 75090, 88480, 61…
## $ Lynx <dbl> 30090, 45150, 49150, 39520, 21230, 8420, 5560, 5080, 10170, 19600…
head(pelt)
## # A tsibble: 6 x 3 [1Y]
## Year Hare Lynx
## <dbl> <dbl> <dbl>
## 1 1845 19580 30090
## 2 1846 19600 45150
## 3 1847 19610 49150
## 4 1848 11990 39520
## 5 1849 28040 21230
## 6 1850 58000 8420
autoplot():
pelt |>
autoplot(Hare) +
ggtitle("Annual Hare Pelt Trading Records")
gg_season():
The pelt dataset is annual, so each year has only one observation and there is no seasonal breakdown to compare across months or quarters. Therefore, gg_season() will not work for this dataset. Moving on to gg_subseries().
gg_subseries():
pelt |>
gg_subseries(Hare) +
ggtitle("Subseries Plot: Snowshoe Hare Pelts")
gg_lag():
pelt |>
gg_lag(Hare, geom = "point") +
ggtitle("Lag Plot: Hare Pelts")
ACF():
pelt |>
ACF(Hare) |>
autoplot() +
ggtitle("Autocorrelation of Hare Pelts")
Overall Analysis: The hare population goes through regular boom-and-bust cycles, peaking about every 10 years. After a peak, the population drops sharply and takes several years to recover. This pattern is likely due to predator-prey dynamics: when hare numbers increase, predator numbers (such as lynx) also grow, leading to a decline in hares before the cycle starts again. The autocorrelation plot (ACF()) is consistent with this cycle, showing strong positive correlation at lags of about 10 years and negative correlation at about 5 years, meaning a high hare population today is usually followed by a low population about five years later.
source:
https://www2.nau.edu/lrm22/lessons/predator_prey/predator_prey.html#:~:text=This%20can%20lead%20to%20cyclical%20patterns%20of,prey%20numbers%20and%20then%20decrease%20as%20well.&text=While%20this%20is%20an%20indirect%20measure%20of,of%20hare%20and%20lynx%20in%20the%20wild.
https://www.youtube.com/watch?v=bgsZy2HAZVM
PBS:
?PBS # Monthly Medicare Australia prescription data
head(PBS)
## # A tsibble: 6 x 9 [1M]
## # Key: Concession, Type, ATC1, ATC2 [1]
## Month Concession Type ATC1 ATC1_desc ATC2 ATC2_desc Scripts Cost
## <mth> <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl>
## 1 1991 Jul Concessional Co-paymen… A Alimenta… A01 STOMATOL… 18228 67877
## 2 1991 Aug Concessional Co-paymen… A Alimenta… A01 STOMATOL… 15327 57011
## 3 1991 Sep Concessional Co-paymen… A Alimenta… A01 STOMATOL… 14775 55020
## 4 1991 Oct Concessional Co-paymen… A Alimenta… A01 STOMATOL… 15380 57222
## 5 1991 Nov Concessional Co-paymen… A Alimenta… A01 STOMATOL… 14371 52120
## 6 1991 Dec Concessional Co-paymen… A Alimenta… A01 STOMATOL… 15028 54299
glimpse(PBS)
## Rows: 67,596
## Columns: 9
## Key: Concession, Type, ATC1, ATC2 [336]
## $ Month <mth> 1991 Jul, 1991 Aug, 1991 Sep, 1991 Oct, 1991 Nov, 1991 Dec,…
## $ Concession <chr> "Concessional", "Concessional", "Concessional", "Concession…
## $ Type <chr> "Co-payments", "Co-payments", "Co-payments", "Co-payments",…
## $ ATC1 <chr> "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A",…
## $ ATC1_desc <chr> "Alimentary tract and metabolism", "Alimentary tract and me…
## $ ATC2 <chr> "A01", "A01", "A01", "A01", "A01", "A01", "A01", "A01", "A0…
## $ ATC2_desc <chr> "STOMATOLOGICAL PREPARATIONS", "STOMATOLOGICAL PREPARATIONS…
## $ Scripts <dbl> 18228, 15327, 14775, 15380, 14371, 15028, 11040, 15165, 168…
## $ Cost <dbl> 67877.00, 57011.00, 55020.00, 57222.00, 52120.00, 54299.00,…
view(PBS)
PBS |>
distinct(ATC2, ATC2_desc)
## # A tibble: 107 × 2
## ATC2 ATC2_desc
## <chr> <chr>
## 1 A01 STOMATOLOGICAL PREPARATIONS
## 2 A02 DRUGS FOR ACID RELATED DISORDERS
## 3 A03 DRUGS FOR FUNCTIONAL GASTROINTESTINAL DISORDERS
## 4 A04 ANTIEMETICS AND ANTINAUSEANTS
## 5 A05 BILE AND LIVER THERAPY
## 6 A06 LAXATIVES
## 7 A07 ANTIDIARR ,INTEST ANTIINFL /ANTIINFECT AGENTS
## 8 A09 DIGESTIVES, INCL ENZYMES
## 9 A10 ANTIDIABETIC THERAPY
## 10 A11 VITAMINS
## # ℹ 97 more rows
PBS |>
filter(ATC2 == "H02") |>
distinct(ATC2_desc)
## # A tibble: 1 × 1
## ATC2_desc
## <chr>
## 1 CORTICOSTEROIDS FOR SYSTEMIC USE
PBS_H02 <- PBS |>
filter(ATC2 == "H02") # we only select H02 (Corticosteroids) data
autoplot():
PBS_H02 |>
autoplot(Cost) +
ggtitle("Monthly Cost of H02 (Corticosteroids) Prescriptions")
gg_season():
PBS_H02 |>
gg_season(Cost) +
ggtitle("Seasonal Patterns in H02 (Corticosteroids) Prescription Cost")
gg_subseries():
PBS_H02 |>
gg_subseries(Cost) +
ggtitle("Subseries Plot: H02 (Corticosteroids) Prescription Cost")
gg_lag():
Since gg_lag() requires a single time series, we need to filter to one category before plotting. Let’s select one specific category, “Concessional Co-payments”.
PBS_H02_filtered <- PBS_H02 |>
filter(Concession == "Concessional", Type == "Co-payments") # we selected one category
PBS_H02_filtered |>
gg_lag(Cost, geom = "point")
ACF():
PBS_H02_filtered |>
ACF(Cost) |>
autoplot() +
ggtitle("Autocorrelation of H02 (Corticosteroids) Prescription Cost")
Overall Analysis: The cost of H02 (Corticosteroids) prescriptions has been steadily increasing over time, with clear seasonal patterns. Every year, costs tend to start lower in January and rise towards the end of the year, peaking around October to December. This pattern is evident in the gg_season() and gg_subseries() plots. The lag plot (gg_lag()) shows that costs from 12 months earlier are strongly related to current costs, and the ACF plot supports this, with significant spikes at multiples of 12 months that confirm annual seasonality. Overall, the data suggests that prescription costs consistently increase towards year-end, possibly due to insurance policies, medical demand, or prescription renewals.
us_gasoline:
?us_gasoline
glimpse(us_gasoline)
## Rows: 1,355
## Columns: 2
## $ Week <week> 1991 W06, 1991 W07, 1991 W08, 1991 W09, 1991 W10, 1991 W11, 1…
## $ Barrels <dbl> 6.621, 6.433, 6.582, 7.224, 6.875, 6.947, 7.328, 6.777, 7.503,…
head(us_gasoline)
## # A tsibble: 6 x 2 [1W]
## Week Barrels
## <week> <dbl>
## 1 1991 W06 6.62
## 2 1991 W07 6.43
## 3 1991 W08 6.58
## 4 1991 W09 7.22
## 5 1991 W10 6.88
## 6 1991 W11 6.95
view(us_gasoline)
autoplot():
us_gasoline |>
autoplot(Barrels) +
ggtitle("Weekly Gasoline Consumption in the US (1991-2017)")
gg_season():
us_gasoline |>
gg_season(Barrels)+
ggtitle("Seasonal Patterns in US Weekly Gasoline Consumption")
gg_subseries():
us_gasoline |>
gg_subseries(Barrels) +
ggtitle("Breakdown of Weekly Gasoline Consumption Trends")
gg_lag():
us_gasoline |>
gg_lag(Barrels, geom = "point") +
ggtitle("Lag Plot of Weekly Gasoline Consumption")
ACF():
us_gasoline |>
ACF(Barrels) |>
autoplot() +
ggtitle("Autocorrelation of Weekly Gasoline Consumption")
Overall Analysis: There is a steady upward trend in weekly gasoline consumption from 1991 to around 2007, which indicates growing fuel demand. Around 2008-2009, however, we can see a noticeable drop and more fluctuation. This may be due to the global financial crisis, which may have reduced fuel demand. Seasonal patterns show higher consumption during the summer months (June-August) and lower usage in winter (January-February). The autocorrelation analysis shows strong persistence, with past values closely related to later consumption. Overall, the data suggests a long-term increasing trend, with seasonal cycles and economic factors driving fluctuations in gasoline consumption over time.