getwd()
## [1] "/Users/carmenphan/Desktop/R markdown"
Sys.time()
## [1] "2023-09-18 19:04:13 EDT"
Use the help function to explore what the series gafa_stock, PBS, vic_elec and pelt represent.
- Use autoplot() to plot some of the series in these data sets.
- What is the time interval of each series?
# fpp3 attaches tsibble, tsibbledata, feasts, dplyr, tidyr and ggplot2
library(fpp3)
autoplot(gafa_stock, Close)
Prices for these technology stocks rose for most of the series, until mid-to-late 2018.
gafa_stock
Interval is daily. Looking closer at the data, we can see that the index is a Date variable. It also appears that observations occur only on trading days, creating lots of implicit missing values.
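The idea of implicit missing values can be checked directly. The following is a minimal sketch on a small hand-made tsibble rather than on gafa_stock itself: the object name trading and its prices are made up for illustration, and the dates simply skip a weekend. It uses has_gaps(), count_gaps() and fill_gaps() from the tsibble package.

```r
library(tsibble)

# Hypothetical data: four trading days spanning a weekend.
# The prices are illustrative values only, not taken from gafa_stock.
trading <- tsibble(
  Date = as.Date(c("2018-01-04", "2018-01-05", "2018-01-08", "2018-01-09")),
  Close = c(100, 101, 103, 102),
  index = Date
)

has_gaps(trading)    # .gaps is TRUE when dates are implicitly missing
count_gaps(trading)  # start, end and length of each run of missing dates
fill_gaps(trading)   # turns the implicit gaps into explicit NA rows
```

These functions expect a regularly spaced tsibble, so a series stored with an irregular interval may need to be re-indexed before they can be applied.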
PBS contains too many series to plot at once. Let’s focus on aggregate A10 expenditure.
a10 <- PBS %>%
  filter(ATC2 == "A10") %>%
  summarise(Cost = sum(Cost))
a10 %>%
  autoplot(Cost)
The series appears to have an upward trend (perhaps exponential), and seasonality whose size varies in proportion to the level of the series.
a10
Observations are made once every month.
vic_elec %>%
  autoplot(Demand)
Appears to have an annual seasonal pattern, where demand is higher during summer and winter. Can’t see much detail, so let’s zoom in.
vic_elec %>%
  filter(yearmonth(Time) == yearmonth("2012 June")) %>%
  autoplot(Demand)
Appears to have a daily pattern, where less electricity is used overnight. Also appears to have a working day effect (less demand on weekends and holidays).
vic_elec
Data is available at 30 minute intervals.
pelt %>% autoplot(Lynx)
Canadian lynx trappings appear to be cyclic, as the extent of peak trappings is unpredictable, and the spacing between the peaks is irregular.
pelt %>% autoplot(Hare)
The same can be said for snowshoe hare trappings, although this series appears more erratic.
pelt
pelt1 <- pelt %>%
  pivot_longer(Hare:Lynx, names_to = "Animal", values_to = "Trappings")
pelt1
pelt2 <- pelt1 %>%
  pivot_wider(names_from = Animal, values_from = Trappings)
pelt2
pelt1 %>%
  autoplot(Trappings)
pelt %>%
  pivot_longer(Hare:Lynx, names_to = "Animal", values_to = "Trappings") %>%
  autoplot(Trappings)
Plotting both Lynx and Hare trappings, it appears that the peaks in Canadian Lynx trappings occur shortly after peaks in Snowshoe Hare trappings. This relationship is due to the Canadian Lynx being specialised hunters of the Snowshoe Hare, resulting in a strong predator-prey relationship.
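The apparent lag between the two series can also be examined numerically. As a rough sketch, the cross-correlation function CCF() from the feasts package (attached by fpp3) gives the correlation between Hare and Lynx at a range of lags. The object name pelt_ccf and the choice of lag_max = 10 are assumptions made here for illustration.

```r
library(fpp3)

# Cross-correlations between hare and lynx trappings at lags up to
# 10 years either side of zero (lag_max = 10 is an arbitrary choice).
pelt_ccf <- pelt %>%
  CCF(Hare, Lynx, lag_max = 10)

pelt_ccf %>%
  autoplot()
```

A peak away from lag zero would be consistent with one series leading the other. The sign convention for the lag is worth confirming in the CCF() help page before reading a direction into the plot.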
interval(pelt)
## <interval[1]>
## [1] 1Y
Observations are made once per year.
Use filter() to find what days corresponded to the peak closing price for each of the four stocks in gafa_stock.
gafa_stock %>%
  group_by(Symbol) %>%
  filter(Close == max(Close)) %>%
  ungroup() %>%
  select(Symbol, Date, Close)
Download the file tute1.csv from the book website, open it in Excel (or some other spreadsheet application), and review its contents. You should find four columns of information. Columns B through D each contain a quarterly series, labelled Sales, AdBudget and GDP. Sales contains the quarterly sales for a small company over the period 1981-2005. AdBudget is the advertising budget and GDP is the gross domestic product. All series have been adjusted for inflation.
tute1 <- readr::read_csv("data/tute1.csv")
## Rows: 100 Columns: 4
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (3): Sales, AdBudget, GDP
## date (1): Quarter
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#View(tute1)
mytimeseries <- tute1 %>%
  mutate(Quarter = yearquarter(Quarter)) %>%
  as_tsibble(index = Quarter)
mytimeseries %>%
  pivot_longer(-Quarter, names_to = "Key", values_to = "Value") %>%
  ggplot(aes(x = Quarter, y = Value, colour = Key)) +
  geom_line() +
  facet_grid(vars(Key), scales = "free_y")
# Without faceting:
mytimeseries %>%
  pivot_longer(-Quarter, names_to = "Key", values_to = "Value") %>%
  ggplot(aes(x = Quarter, y = Value, colour = Key)) +
  geom_line()
The USgas package contains data on the demand for natural gas in the US.
- Install the USgas package.
- Create a tsibble from us_total with year as the index and state as the key.
- Plot the annual natural gas consumption by state for the New England area (comprising the states of Maine, Vermont, New Hampshire, Massachusetts, Connecticut and Rhode Island).
#install.packages("USgas")
library(USgas)
us_tsibble <- us_total %>%
  as_tsibble(index = year, key = state)
# For each state
us_tsibble %>%
  filter(state %in% c("Maine", "Vermont", "New Hampshire", "Massachusetts",
                      "Connecticut", "Rhode Island")) %>%
  autoplot(y / 1e3) +
  labs(y = "billion cubic feet")
- Download tourism.xlsx from the book website and read it into R using read_excel() from the readxl package.
- Create a tsibble which is identical to the tourism tsibble from the tsibble package.
- Find what combination of Region and Purpose had the maximum number of overnight trips on average.
- Create a new tsibble which combines the Purposes and Regions, and just has total trips by State.
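tourism_file <- tempfile(fileext = ".xlsx")
# mode = "wb" keeps the binary .xlsx file intact on Windows
download.file("http://OTexts.com/fpp3/extrafiles/tourism.xlsx",
  tourism_file, mode = "wb")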
my_tourism <- readxl::read_excel(tourism_file) %>%
  mutate(Quarter = yearquarter(Quarter)) %>%
  as_tsibble(
    index = Quarter,
    key = c(Region, State, Purpose)
  )
my_tourism
tourism
my_tourism %>%
  as_tibble() %>%
  group_by(Region, Purpose) %>%
  summarise(Trips = mean(Trips)) %>%
  ungroup() %>%
  filter(Trips == max(Trips))
## `summarise()` has grouped output by 'Region'. You can override using the
## `.groups` argument.
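The message above comes from summarise() leaving the result grouped by Region. One way to avoid it, sketched here on the built-in tourism tsibble instead of my_tourism so that the block runs on its own, is to pass .groups = "drop" and then pick the largest average with slice_max(). The object name top_combo is made up for this illustration.

```r
library(fpp3)

# Average trips for each Region/Purpose pair, dropping the grouping
# in the same step so no message is printed and ungroup() is not needed.
top_combo <- tourism %>%
  as_tibble() %>%
  group_by(Region, Purpose) %>%
  summarise(Trips = mean(Trips), .groups = "drop") %>%
  slice_max(Trips, n = 1)

top_combo
```

Converting to a tibble first matters here: on a tsibble, summarise() would keep the Quarter index and average within each quarter, not across the whole period.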
state_tourism <- my_tourism %>%
  group_by(State) %>%
  summarise(Trips = sum(Trips)) %>%
  ungroup()
state_tourism