library(fpp3)
Bricks from aus_production, Lynx from pelt, Close from gafa_stock, Demand from vic_elec.
?aus_production
Quarterly production of selected commodities in Australia.
Description: Quarterly estimates of selected indicators of manufacturing production in Australia.
Format: Time series of class tsibble.
Details: aus_production is a quarterly tsibble with six values:
Beer: Beer production in megalitres.
Tobacco: Tobacco and cigarette production in tonnes.
Bricks: Clay brick production in millions of bricks.
Cement: Portland cement production in thousands of tonnes.
Electricity: Electricity production in gigawatt hours.
Gas: Gas production in petajoules.
aus_production |> select(Bricks) |> head()
aus_production |> select(Bricks) |> autoplot()
## Plot variable not specified, automatically selected `.vars = Bricks`
## Warning: Removed 20 rows containing missing values or values outside the scale range
## (`geom_line()`).
The time series has a quarterly index.
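The index can also be checked in code rather than read off the printed header. The following is a minimal sketch using tsibble helper functions; it assumes the tsibble and tsibbledata packages (both attached by fpp3) are installed.

```r
library(tsibble)
library(tsibbledata)

# Name of the column that holds the time index
idx <- index_var(aus_production)

# TRUE when the observations are equally spaced in time
regular <- is_regular(aus_production)

# Interval object describing the spacing between observations
step <- interval(aus_production)

print(idx)
print(regular)
print(step)
```

The same three calls work for pelt, gafa_stock and vic_elec, which makes it easy to compare the indexes of all four series.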
?pelt
Description: Hudson Bay Company trading records for Snowshoe Hare and Canadian Lynx furs from 1845 to 1935. This data contains trade records for all areas of the company.
Format: Time series of class tsibble.
Details: pelt is an annual tsibble with two values:
Hare: The number of Snowshoe Hare pelts traded.
Lynx: The number of Canadian Lynx pelts traded.
pelt |>
  select(Lynx) |>
  head()
The index is yearly.
pelt |> select(Lynx) |> autoplot()
## Plot variable not specified, automatically selected `.vars = Lynx`
?gafa_stock
GAFA stock prices
Description: Historical stock prices from 2014-2018 for Google, Amazon, Facebook and Apple. All prices are in $USD.
Format: Time series of class tsibble.
Details: gafa_stock is a tsibble containing data on irregular trading days:
Open: The opening price for the stock.
High: The stock’s highest trading price.
Low: The stock’s lowest trading price.
Close: The closing price for the stock.
Adj_Close: The adjusted closing price for the stock.
Volume: The amount of stock traded.
Each stock is uniquely identified by one key:
Symbol: The ticker symbol for the stock.
Source: Yahoo Finance historical data.
gafa_stock|> filter(Symbol=="AAPL")|>
select(Close) |>
head()
The index of the time series is daily, but only trading days are recorded, so the interval is irregular.
gafa_stock|> filter(Symbol=="AAPL")|>
select(Close) |>
autoplot()
## Plot variable not specified, automatically selected `.vars = Close`
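Because the help page describes gafa_stock as data on irregular trading days, it may help to see what that means for the index. The sketch below checks regularity and then re-indexes the Apple series by trading-day number, a common workaround; the names aapl, aapl_regular and trading_day are illustrative choices, not part of the dataset.

```r
library(dplyr)
library(tsibble)
library(tsibbledata)

aapl <- gafa_stock |>
  filter(Symbol == "AAPL")

# The calendar-date index has gaps on weekends and holidays
is_regular(aapl)

# Re-index by trading-day number so observations are equally spaced
aapl_regular <- aapl |>
  mutate(trading_day = row_number()) |>
  update_tsibble(index = trading_day, regular = TRUE)

is_regular(aapl_regular)
```

A regular index is mainly needed for tools that rely on lags, such as ACF plots; for a simple time plot the calendar-date index is fine.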
?vic_elec
vic_elec is a half-hourly tsibble with three values:
Demand: Total electricity demand in MWh.
Temperature: Temperature of Melbourne (BOM site 086071).
Holiday: Indicator for if that day is a public holiday.
Format: Time series of class tsibble.
Details This data is for operational demand, which is the demand met by local scheduled generating units, semi-scheduled generating units, and non-scheduled intermittent generating units of aggregate capacity larger than 30 MWh, and by generation imports to the region. The operational demand excludes the demand met by non-scheduled non-intermittent generating units, non-scheduled intermittent generating units of aggregate capacity smaller than 30 MWh, exempt generation (e.g. rooftop solar, gas tri-generation, very small wind farms, etc), and demand of local scheduled loads. It also excludes some very large industrial users (such as mines or smelters).
Source Australian Energy Market Operator.
vic_elec |> select(Demand)|> head()
The index of this time series is half-hourly (one observation every 30 minutes).
vic_elec |> select(Demand) |>
  autoplot() +
  labs(title = "Half-Hourly Electricity Demand in Victoria",
       x = "Time", y = "Demand (MWh)") +
  theme_classic()
## Plot variable not specified, automatically selected `.vars = Demand`
Use filter() to find what days corresponded to the peak closing price for each of the four stocks in gafa_stock.
peak_closing_price_dates <- gafa_stock |>
  group_by(Symbol) |>
  filter(Close == max(Close, na.rm = TRUE)) |>
  select(Symbol, Date, Close)
peak_closing_price_dates
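As a cross-check on the grouped filter, the same question can be answered with dplyr's slice_max(). This is a sketch of an alternative, not a replacement; peak_check is a name chosen here for illustration, and the data are first converted to a plain tibble so that only ordinary dplyr verbs are involved.

```r
library(dplyr)
library(tsibble)
library(tsibbledata)

peak_check <- gafa_stock |>
  as_tibble() |>              # drop the tsibble structure; plain rows suffice here
  group_by(Symbol) |>
  slice_max(Close, n = 1) |>  # keep the row(s) with the highest Close per stock
  ungroup() |>
  select(Symbol, Date, Close)

peak_check
```

Both approaches return more than one row for a stock if its peak closing price occurs on several days.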
Download the file tute1.csv
from the book website, open it in Excel
(or some other spreadsheet application), and review its contents. You
should find four columns of information. Columns B through D each
contain a quarterly series, labelled Sales, AdBudget and GDP. Sales
contains the quarterly sales for a small company over the period
1981-2005. AdBudget is the advertising budget and GDP is the gross
domestic product. All series have been adjusted for inflation.
tute1 <- readr::read_csv("tute1.csv")
## Rows: 100 Columns: 4
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (3): Sales, AdBudget, GDP
## date (1): Quarter
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
View(tute1)
2. Convert the data to time series
mytimeseries <- tute1 |>
mutate(Quarter = yearquarter(Quarter)) |>
as_tsibble(index = Quarter)
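The conversion step can be tried without the CSV file. The sketch below uses a few made-up rows (the values are placeholders, not the contents of tute1.csv) to show how yearquarter() and as_tsibble() turn a date column into a quarterly index.

```r
library(dplyr)
library(tsibble)

# Made-up values standing in for the first rows of a quarterly CSV
toy <- tibble(
  Quarter = as.Date(c("1981-03-01", "1981-06-01", "1981-09-01", "1981-12-01")),
  Sales   = c(1020, 889, 795, 1004)
)

toy_ts <- toy |>
  mutate(Quarter = yearquarter(Quarter)) |>  # Date -> yearquarter
  as_tsibble(index = Quarter)                # declare the time index

toy_ts
```

Without the yearquarter() step the index would stay a Date, and the gaps between the dates would be measured in days rather than in quarters.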
3. Construct time series plots of each of the three series
mytimeseries |>
  pivot_longer(-Quarter) |>
  ggplot(aes(x = Quarter, y = value, colour = name)) +
  geom_line() +
  facet_grid(name ~ ., scales = "free_y")
Check what happens when you don’t include facet_grid().
mytimeseries |> pivot_longer(-Quarter) |> ggplot(aes(x = Quarter, y = value, colour = name)) + geom_line()
Without facet_grid(), all three series are drawn in a single panel on a shared y-axis.
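Both plots rely on pivot_longer(-Quarter) to stack the three series into name and value columns, which is what colour = name and facet_grid(name ~ .) refer to. A small sketch with made-up numbers (not the tute1 data) shows the reshaping on its own:

```r
library(tibble)
library(tidyr)

# Two made-up quarters in the same wide layout as the CSV
wide <- tibble(
  Quarter  = c("1981 Q1", "1981 Q2"),
  Sales    = c(1020, 889),
  AdBudget = c(659, 589),
  GDP      = c(251, 290)
)

# Every column except Quarter is stacked into name/value pairs
long <- wide |>
  pivot_longer(-Quarter)

long
```

Each original row becomes three rows, one per series, so the long table has six rows here.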
The USgas package contains data on the demand for natural gas in the US.
Install the USgas package.
library(USgas)
## Warning: package 'USgas' was built under R version 4.4.1
2. Create a tsibble from us_total with year as the index and state as the key.
us_total_tsibble <- us_total |>
  as_tsibble(index = year, key = state)
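The key is needed because every year appears once per state. The following sketch uses hypothetical figures (not values from us_total) to show that as_tsibble() rejects the data without a key and accepts it once state is declared as the key.

```r
library(tibble)
library(tsibble)

# Hypothetical consumption figures for two states over two years
gas <- tibble(
  year  = c(2000, 2001, 2000, 2001),
  state = c("Maine", "Maine", "Vermont", "Vermont"),
  y     = c(10, 12, 5, 6)
)

# Without a key each year occurs twice, so the conversion fails
no_key <- tryCatch(
  as_tsibble(gas, index = year),
  error = function(e) "failed: duplicated index values"
)

# With state as the key, each (state, year) pair is unique
with_key <- as_tsibble(gas, index = year, key = state)

no_key
with_key
```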
3. Plot the annual natural gas consumption by state for the New England area (comprising the states of Maine, Vermont, New Hampshire, Massachusetts, Connecticut and Rhode Island).
us_total_tsibble |>
filter(state %in% c("Connecticut", "Maine", "Massachusetts", "New Hampshire", "Rhode Island", "Vermont")) |>
ggplot(aes(x = year, y = y, colour = state)) +
geom_line() +
facet_wrap(state ~ ., scales = "free_y",ncol=2) +
labs(title = "Time Series of US Gas Consumption for Selected States",
x = "Year",
y = "Gas Consumption") +
theme_minimal()
Download tourism.xlsx from the book website and read it into R using readxl::read_excel().
readxl::read_excel("tourism.xlsx")
2. Create a tsibble which is identical to the tourism tsibble from the tsibble package.
tourism_data <- readxl::read_excel("tourism.xlsx")
# Trips is the measured variable, so it must not be part of the key
tourism_tsibble <- tourism_data |>
  mutate(Quarter = yearquarter(Quarter)) |>
  as_tsibble(key = c(Region, State, Purpose), index = Quarter)
tourism_tsibble
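The built-in tourism tsibble shows the structure the converted data should match. A short sketch that inspects its key and index, using only tsibble helpers:

```r
library(tsibble)

# Key and index of the reference dataset shipped with tsibble
ref_keys  <- key_vars(tourism)
ref_index <- index_var(tourism)
ref_n     <- n_keys(tourism)   # number of distinct series

ref_keys
ref_index
ref_n
```

Running the same three calls on tourism_tsibble should give matching results if the conversion is correct; Trips is a measured value and should not appear among the keys.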
3. Find what combination of `Region` and `Purpose` had the maximum number of overnight trips on average.
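# as_tibble() is needed: summarise() on a tsibble keeps the Quarter index,
# which would give one value per quarter instead of an average over time
max_mean_trip_RP <- tourism_tsibble |>
  as_tibble() |>
  group_by(Region, Purpose) |>
  summarise(mean_trips = mean(Trips, na.rm = TRUE), .groups = "drop") |>
  filter(mean_trips == max(mean_trips))
max_mean_trip_RP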
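One point worth checking here is how summarise() behaves on a tsibble: it keeps the time index, so a grouped summary still has one row per quarter. The sketch below uses the tourism tsibble shipped with the tsibble package (assumed to have the same content as the Excel file) to compare the two behaviours; by_quarter and overall are illustrative names.

```r
library(dplyr)
library(tsibble)

# On a tsibble, the Quarter index is retained: one row per group per quarter
by_quarter <- tourism |>
  group_by(Region, Purpose) |>
  summarise(mean_trips = mean(Trips))

# On a plain tibble, time is collapsed: one row per group
overall <- tourism |>
  as_tibble() |>
  group_by(Region, Purpose) |>
  summarise(mean_trips = mean(Trips), .groups = "drop")

nrow(by_quarter)
nrow(overall)
```

Only the second form averages over time, which is what "on average" in the question calls for.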
4. Create a new tsibble which combines the Purposes and Regions, and just has total trips by State.
state_trips_tsibble <- tourism_tsibble |>
group_by(State) |>
summarise(state_trips = sum(Trips, na.rm = TRUE), .groups = 'drop') |>
as_tsibble(index = Quarter, key = State)
head(state_trips_tsibble)
8. The Employment data for Total Private from us_employment shows a generally upward trend. However, when zoomed in, seasonal patterns become apparent. Notable dips are observed in the mid-1970s, early 1980s, and after 2007, which correspond to real-world economic downturns in the US economy.
total_private<- us_employment|>
filter(Title =="Total Private")
autoplot(total_private,Employed)
gg_season(total_private,Employed)
gg_subseries(total_private,Employed)
gg_lag(total_private,Employed,geom = "point", lags = 1:12)
ACF(total_private,Employed)|>
autoplot()
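The remark about zooming in can be reproduced by restricting the series to a few years before plotting. This is a sketch; the 2005 to 2010 window is an arbitrary choice made here to cover the post-2007 dip, and recent is an illustrative name.

```r
library(fpp3)

# Keep six years of monthly data around the post-2007 downturn
recent <- us_employment |>
  filter(Title == "Total Private",
         year(Month) >= 2005, year(Month) <= 2010)

# With a shorter window the within-year pattern is easier to see
autoplot(recent, Employed)
```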
Bricks from aus_production: There seems to be a seasonal pattern that repeats every 4 quarters, which makes sense for quarterly data. I base this on the ACF plot.
autoplot(aus_production,Bricks)
## Warning: Removed 20 rows containing missing values or values outside the scale range
## (`geom_line()`).
gg_season(aus_production,Bricks)
## Warning: Removed 20 rows containing missing values or values outside the scale range
## (`geom_line()`).
gg_subseries(aus_production,Bricks)
## Warning: Removed 5 rows containing missing values or values outside the scale range
## (`geom_line()`).
gg_lag(aus_production,Bricks,geom = "point", lags = 1:4)
## Warning: Removed 20 rows containing missing values (gg_lag).
ACF(aus_production,Bricks)|>
autoplot()
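To illustrate how a repeating 4-period pattern shows up in an ACF, the sketch below uses a synthetic series (not the Bricks data) and base R's acf(). The autocorrelation is high at lags 4 and 8 and lower in between, which is the signature to look for in quarterly data.

```r
# Synthetic series: a fixed 4-period pattern repeated 20 times
x <- rep(c(1, 2, 3, 10), times = 20)

# acf() returns lags 0, 1, 2, ...; element k + 1 holds lag k
r <- acf(x, lag.max = 8, plot = FALSE)$acf[, 1, 1]

# Autocorrelation at lags 2, 4 and 8
round(r[c(3, 5, 9)], 2)
```

Real series such as Bricks also contain trend and noise, so the seasonal peaks are less clean than in this idealised case.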
Hare from pelt: The data show a cyclical pattern, with the number of pelts traded spiking every 8 years or so; there are other, smaller recurring patterns about every two years. I suspect population growth and shipping schedules factor into this pattern.
autoplot(pelt, Hare)
# gg_season() is skipped because annual data has no seasonal period to plot
# gg_season(pelt, Hare)
gg_subseries(pelt, Hare)
gg_lag(pelt, Hare,geom = "point")
ACF(pelt, Hare)|>
autoplot()
Barrels from us_gasoline:
There do seem to be some patterns in the data. Because the series is weekly, it might be easier to read after converting it to a monthly or yearly time series.
autoplot(us_gasoline, Barrels)
gg_season(us_gasoline, Barrels)
gg_subseries(us_gasoline, Barrels)
gg_lag(us_gasoline, Barrels, lags = 1:4, geom = "point")
ACF(us_gasoline, Barrels)|>
autoplot()
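The suggested conversion to a monthly series could look like the sketch below. It assumes that averaging the weekly values within each month is an acceptable summary and that each week is assigned to the month of its starting date; monthly_gas is an illustrative name.

```r
library(fpp3)

# Group weeks by the month in which they start, then average within each month
monthly_gas <- us_gasoline |>
  index_by(Month = yearmonth(as.Date(Week))) |>
  summarise(Barrels = mean(Barrels))

autoplot(monthly_gas, Barrels)
```

With a monthly index, gg_season() and gg_subseries() show twelve seasonal positions instead of about fifty-two, which usually makes the seasonal shape easier to read.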
H02 cost from PBS: The ACF shows strong seasonality for every series except General co-payments.
# PBS is already a monthly tsibble keyed by Concession, Type, ATC1 and ATC2,
# so filtering is enough and no conversion with as_tsibble() is required
h02ts <- PBS |>
  filter(ATC2 == "H02")
autoplot(h02ts,Cost)
gg_season(h02ts,Cost)
gg_subseries(h02ts,Cost)
ACF(h02ts,Cost)|>
autoplot()
</div>