Explore the following four time series: Bricks from aus_production, Lynx from pelt, Close from gafa_stock, Demand from vic_elec.
- Use ? (or help()) to find out about the data in each series.
- What is the time interval of each series?
- Use autoplot() to produce a time plot of each series.
- For the last plot, modify the axis labels and title.
library(fpp3)
## Registered S3 method overwritten by 'tsibble':
## method from
## as_tibble.grouped_df dplyr
## ── Attaching packages ──────────────────────────────────────────── fpp3 1.0.1 ──
## ✔ tibble 3.2.1 ✔ tsibble 1.1.6
## ✔ dplyr 1.1.4 ✔ tsibbledata 0.4.1
## ✔ tidyr 1.3.1 ✔ feasts 0.4.1
## ✔ lubridate 1.9.4 ✔ fable 0.4.1
## ✔ ggplot2 3.5.1
## ── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
## ✖ lubridate::date() masks base::date()
## ✖ dplyr::filter() masks stats::filter()
## ✖ tsibble::intersect() masks base::intersect()
## ✖ tsibble::interval() masks lubridate::interval()
## ✖ dplyr::lag() masks stats::lag()
## ✖ tsibble::setdiff() masks base::setdiff()
## ✖ tsibble::union() masks base::union()
library(tsibble)
?aus_production
## starting httpd help server ...
## done
?pelt
?gafa_stock
?vic_elec
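Now we look at the intervals. In the output below, 1Q means quarterly, 1Y means annual, ! marks an irregular interval (gafa_stock only has observations on trading days), and 30m means half-hourly.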
interval(aus_production)
## <interval[1]>
## [1] 1Q
interval(pelt)
## <interval[1]>
## [1] 1Y
interval(gafa_stock)
## <interval[1]>
## [1] !
interval(vic_elec)
## <interval[1]>
## [1] 30m
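The exercise also asks for a time plot of each series, with custom labels on the last one. A minimal sketch of how that could look is below; the title and axis label wording is my own choice and not taken from the exercise, and `p_demand` is just an illustrative variable name.

```r
library(fpp3)

# Time plots of the first three series with the default labels
aus_production |> autoplot(Bricks)
pelt |> autoplot(Lynx)
gafa_stock |> autoplot(Close)

# Last plot: override the title and axis labels with labs()
# (label text is illustrative; see ?vic_elec for the variable description)
p_demand <- vic_elec |>
  autoplot(Demand) +
  labs(
    title = "Half-hourly electricity demand in Victoria",
    x = "Time",
    y = "Demand (MWh)"
  )
p_demand
```

Storing the plot in a variable before printing it makes it easy to add further layers or labels later.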
Use filter() to find what days corresponded to the peak closing price for each of the four stocks in gafa_stock.
gafa_max_close <- gafa_stock |>
  group_by(Symbol) |>
  filter(Close == max(Close)) |>
  select(Symbol, Date)
print(gafa_max_close)
## # A tsibble: 4 x 2 [!]
## # Key: Symbol [4]
## # Groups: Symbol [4]
## Symbol Date
## <chr> <date>
## 1 AAPL 2018-10-03
## 2 AMZN 2018-09-04
## 3 FB 2018-07-25
## 4 GOOG 2018-07-26
Download the file tute1.csv from the book website, open it in Excel (or some other spreadsheet application), and review its contents. You should find four columns of information. Columns B through D each contain a quarterly series, labelled Sales, AdBudget and GDP. Sales contains the quarterly sales for a small company over the period 1981-2005. AdBudget is the advertising budget and GDP is the gross domestic product. All series have been adjusted for inflation. a. You can read the data into R with the following script:
tute1 <- readr::read_csv("tute1.csv")
## Rows: 100 Columns: 4
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (3): Sales, AdBudget, GDP
## date (1): Quarter
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
View(tute1)
mytimeseries <- tute1 |>
  mutate(Quarter = yearquarter(Quarter)) |>
  as_tsibble(index = Quarter)
view(mytimeseries)
mytimeseries |>
  pivot_longer(-Quarter) |>
  ggplot(aes(x = Quarter, y = value, colour = name)) +
  geom_line() +
  facet_grid(name ~ ., scales = "free_y")
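Without facet_grid(), the three series are drawn in a single panel with a shared y-axis instead of in separate panels. That works here, but it can make the series harder to read when they are on very different scales.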
mytimeseries |>
  pivot_longer(-Quarter) |>
  ggplot(aes(x = Quarter, y = value, colour = name)) +
  geom_line()
library(USgas)
us_total <- us_total |>
  as_tsibble(index = year, key = state)
us_total |>
  filter(state %in% c("Maine", "Vermont", "New Hampshire",
                      "Massachusetts", "Connecticut", "Rhode Island")) |>
  ggplot(aes(x = year, y = y)) +
  geom_line() +
  facet_grid(state ~ ., scales = "free_y") +
  labs(title = "Annual Natural Gas Consumption by States",
       y = "Gas Consumption")
library(readxl)
tourism <- read_excel("tourism.xlsx")
view(tourism)
tourism_tsibble <- tourism |>
  mutate(Quarter = yearquarter(Quarter)) |>
  as_tsibble(key = c(Region, State, Purpose), index = Quarter)
max_trips <- tourism |>
  group_by(Region, Purpose) |>
  summarise(avg_trips = mean(Trips), .groups = 'drop') |>
  filter(avg_trips == max(avg_trips)) |>
  select(Region, Purpose)
print(max_trips)
## # A tibble: 1 × 2
## Region Purpose
## <chr> <chr>
## 1 Sydney Visiting
new_tib <- tourism |>
  group_by(State) |>
  summarise(total_trips = sum(Trips))
print(new_tib)
## # A tibble: 8 × 2
## State total_trips
## <chr> <dbl>
## 1 ACT 41007.
## 2 New South Wales 557367.
## 3 Northern Territory 28614.
## 4 Queensland 386643.
## 5 South Australia 118151.
## 6 Tasmania 54137.
## 7 Victoria 390463.
## 8 Western Australia 147820.
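The table above is an ordinary tibble with one grand total per state, because it was built from the raw spreadsheet data. If the goal is a tsibble of total trips by State that keeps the time index, one possible sketch is below. It assumes the `tourism` data bundled with the tsibble package has the same structure as tourism.xlsx, and `state_trips` is a name I made up for illustration.

```r
library(fpp3)

# Assumption: tsibble::tourism mirrors the contents of tourism.xlsx
# (quarterly Trips keyed by Region, State and Purpose).
# On a tsibble, group_by() + summarise() keeps the Quarter index, so the
# result has one total per State per quarter rather than one per State.
state_trips <- tsibble::tourism |>
  group_by(State) |>
  summarise(total_trips = sum(Trips))

state_trips
```

The same pipeline should work on tourism_tsibble from the earlier step, since it has the same key and index.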
Use the following graphics functions: autoplot(), gg_season(), gg_subseries(), gg_lag(), ACF() and explore features from the following time series: “Total Private” Employed from us_employment, Bricks from aus_production, Hare from pelt, “H02” Cost from PBS, and Barrels from us_gasoline.
us_employment |>
  filter(Title == "Total Private") |>
  autoplot(Employed)
us_employment |>
  filter(Title == "Total Private") |>
  gg_season(Employed)
us_employment |>
  filter(Title == "Total Private") |>
  gg_subseries(Employed)
us_employment |>
  filter(Title == "Total Private") |>
  gg_lag(Employed)
us_employment |>
  filter(Title == "Total Private") |>
  ACF(Employed) |>
  autoplot()
Can you spot any seasonality, cyclicity and trend?
Employment seems to increase over time, decade after decade, although there is a small fall in employment roughly every 10 years. Seasonally, employment rises by a very consistent amount within each year. The data look seasonal because of this month-by-month pattern.
What do you learn about the series?
From the series, I learned that the pattern repeats from year to year and that the number of people employed seems to increase as the years go on.
What can you say about the seasonal patterns?
From what I have seen, the increase in the number of employees from month to month is consistent, although I am a bit confused about why the plot always starts from around 25000 employees instead of from the total at the end of the previous month.
Can you identify any unusual years?
There are a lot of small decreases in employment over the years, but I notice a large decrease around 2010.
aus_production |> autoplot(Bricks)
## Warning: Removed 20 rows containing missing values or values outside the scale range
## (`geom_line()`).
aus_production |> gg_season(Bricks)
## Warning: Removed 20 rows containing missing values or values outside the scale range
## (`geom_line()`).
aus_production |> gg_subseries(Bricks)
## Warning: Removed 5 rows containing missing values or values outside the scale range
## (`geom_line()`).
aus_production |> gg_lag(Bricks)
## Warning: Removed 20 rows containing missing values (gg_lag).
aus_production |>
  ACF(Bricks) |>
  autoplot()
Can you spot any seasonality, cyclicity and trend?
I see a lot of ups and downs, but overall it increases until about 1980 and then gradually decreases. It looks more like a trend than seasonality or cyclicity.
What do you learn about the series?
I learned that the pattern is quite repetitive here as well, with slight differences in the average amount of bricks.
What can you say about the seasonal patterns?
Seasonally, the pattern is like a hill: it increases, reaches a peak, then gradually decreases.
Can you identify any unusual years?
I notice a huge decrease in the number of bricks around 1980 Q1.
pelt |> autoplot(Hare)
pelt |> gg_subseries(Hare)
pelt |> gg_lag(Hare)
pelt |>
  ACF(Hare) |>
  autoplot()
Can you spot any seasonality, cyclicity and trend?
I was not able to produce a seasonal plot because the data are annual. The trend itself is a bit inconsistent, with constant ups and downs.
What do you learn about the series?
I learned that the series is very inconsistent and does not show any trend.
What can you say about the seasonal patterns?
The seasonal behaviour is very inconsistent. There isn’t any pattern, just a lot of fluctuations.
Can you identify any unusual years?
The largest increase was around the 1870s, which could be unusual.
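PBS |>
  filter(ATC2 == "H02") |>
  autoplot(Cost)
PBS |>
  filter(ATC2 == "H02") |>
  gg_season(Cost)
PBS |>
  filter(ATC2 == "H02") |>
  gg_subseries(Cost)
# Does not work, most likely because the filter returns four series
# (one per Concession/Type combination) and gg_lag() expects a single one
#PBS |> filter(ATC2 == "H02") |>
# gg_lag(Cost)
PBS |>
  filter(ATC2 == "H02") |>
  ACF(Cost)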
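The commented-out gg_lag() call could probably be made to work by first collapsing the four H02 series into one. The sketch below sums Cost across the keys for each month; `h02_total` is a name I introduced for illustration, and whether the total is the series the exercise intends is an assumption on my part.

```r
library(fpp3)

# summarise() on a tsibble aggregates over the keys for each value of the
# index, so this leaves a single monthly series of total H02 cost
h02_total <- PBS |>
  filter(ATC2 == "H02") |>
  summarise(Cost = sum(Cost))

# With a single series, the lag plot can be drawn
h02_total |> gg_lag(Cost, geom = "point")
```

Using geom = "point" is optional; the default draws connected paths, which can be harder to read for monthly data.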
Can you spot any seasonality, cyclicity and trend?
All of the series seem consistent overall, with ups and downs. The data look more like a trend than seasonality or cyclicity, judging from the clear changes over time.
What do you learn about the series?
I learned that some series are increasing while others stay steady at about the same level.
What can you say about the seasonal patterns?
The seasonal pattern is consistent and repetitive within each concession type, but all four series have different patterns.
Can you identify any unusual years?
Before 1995, the General/Safety net series seems to fluctuate more in cost than the rest of the data.
us_gasoline |> autoplot(Barrels)
us_gasoline |> gg_season(Barrels)
us_gasoline |> gg_subseries(Barrels)
us_gasoline |> gg_lag(Barrels)
us_gasoline |>
  ACF(Barrels) |>
  autoplot()
Can you spot any seasonality, cyclicity and trend?
It looks seasonal, judging from how it changes across each season in the graph.
What do you learn about the series?
I learned that gasoline consumption became more and more important as the years went on.
What can you say about the seasonal patterns?
I don’t really see a clear seasonal pattern; the values seem to be all over the place. As the years go on, though, the overall number of barrels increases.
Can you identify any unusual years?
In 2009, there is a small decrease in the number of barrels, which is unusual since gasoline becomes more important over time in this trend.