Data 624 assignment 2 KDD

library(fpp3)

## Warning: package 'fpp3' was built under R version 4.3.3

## Registered S3 method overwritten by 'tsibble':
##   method               from 
##   as_tibble.grouped_df dplyr

## ── Attaching packages ──────────────────────────────────────────── fpp3 1.0.0 ──

## ✔ tibble      3.2.1     ✔ tsibble     1.1.5
## ✔ dplyr       1.1.4     ✔ tsibbledata 0.4.1
## ✔ tidyr       1.3.1     ✔ feasts      0.3.2
## ✔ lubridate   1.9.3     ✔ fable       0.3.4
## ✔ ggplot2     3.5.1     ✔ fabletools  0.4.2

## Warning: package 'tidyr' was built under R version 4.3.3

## Warning: package 'ggplot2' was built under R version 4.3.3

## Warning: package 'tsibble' was built under R version 4.3.3

## Warning: package 'tsibbledata' was built under R version 4.3.3

## Warning: package 'feasts' was built under R version 4.3.3

## Warning: package 'fabletools' was built under R version 4.3.3

## Warning: package 'fable' was built under R version 4.3.3

## ── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
## ✖ lubridate::date()    masks base::date()
## ✖ dplyr::filter()      masks stats::filter()
## ✖ tsibble::intersect() masks base::intersect()
## ✖ tsibble::interval()  masks lubridate::interval()
## ✖ dplyr::lag()         masks stats::lag()
## ✖ tsibble::setdiff()   masks base::setdiff()
## ✖ tsibble::union()     masks base::union()

library(tidyverse)

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ forcats 1.0.0     ✔ readr   2.1.5
## ✔ purrr   1.0.2     ✔ stringr 1.5.1

## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter()     masks stats::filter()
## ✖ tsibble::interval() masks lubridate::interval()
## ✖ dplyr::lag()        masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(readr)
library(latex2exp)

## Warning: package 'latex2exp' was built under R version 4.3.3

library(seasonal)

## Warning: package 'seasonal' was built under R version 4.3.3

## 
## Attaching package: 'seasonal'
## 
## The following object is masked from 'package:tibble':
## 
##     view

#loaded relevant libraries

Do exercises 3.1, 3.2, 3.3, 3.4, 3.5, 3.7, 3.8 and 3.9 from the online Hyndman book. Please include your Rpubs link along with.pdf file of your run code

Q1.Consider the GDP information in global_economy. Plot the GDP per capita for each country over time. Which country has the highest GDP per capita? How has this changed over time?

#very large figure needed for this many countries
#unique(global_economy$Country) #263 countries in global_economy
head(global_economy) #to display column names

## # A tsibble: 6 x 9 [1Y]
## # Key:       Country [1]
##   Country     Code   Year         GDP Growth   CPI Imports Exports Population
##   <fct>       <fct> <dbl>       <dbl>  <dbl> <dbl>   <dbl>   <dbl>      <dbl>
## 1 Afghanistan AFG    1960  537777811.     NA    NA    7.02    4.13    8996351
## 2 Afghanistan AFG    1961  548888896.     NA    NA    8.10    4.45    9166764
## 3 Afghanistan AFG    1962  546666678.     NA    NA    9.35    4.88    9345868
## 4 Afghanistan AFG    1963  751111191.     NA    NA   16.9     9.17    9533954
## 5 Afghanistan AFG    1964  800000044.     NA    NA   18.1     8.89    9731361
## 6 Afghanistan AFG    1965 1006666638.     NA    NA   21.4    11.3     9938414

global_economy |>
ggplot(aes(x = Year, y = GDP/Population, color = Country)) + #plot the GDP per capita by country
  geom_line(na.rm = TRUE)+ #ignore missing values
  labs(title= "GDP per capita", y = "$US") + # providing the axis labels
  guides(color = guide_legend(ncol = 3)) #making the legend not have a large width

global_gdp_capita <- global_economy |> #adding GDP per capita to global economy and putting it in global_gdp_capita dataframe
  mutate(gdp_capita = GDP / Population) |> # adding column of gdp per capita
  arrange(desc(gdp_capita)) # place the highest GDP per capita at top of dataframe

## Warning: Current temporal ordering may yield unexpected results.
## ℹ Suggest to sort by `Country`, `Year` first.

head(global_gdp_capita) #display the highest GDP per capita.

## Warning: Current temporal ordering may yield unexpected results.
## ℹ Suggest to sort by `Country`, `Year` first.

## # A tsibble: 6 x 10 [1Y]
## # Key:       Country [2]
##   Country  Code   Year    GDP Growth   CPI Imports Exports Population gdp_capita
##   <fct>    <fct> <dbl>  <dbl>  <dbl> <dbl>   <dbl>   <dbl>      <dbl>      <dbl>
## 1 Monaco   MCO    2014 7.06e9  7.18     NA      NA      NA      38132    185153.
## 2 Monaco   MCO    2008 6.48e9  0.732    NA      NA      NA      35853    180640.
## 3 Liechte… LIE    2014 6.66e9 NA        NA      NA      NA      37127    179308.
## 4 Liechte… LIE    2013 6.39e9 NA        NA      NA      NA      36834    173528.
## 5 Monaco   MCO    2013 6.55e9  9.57     NA      NA      NA      37971    172589.
## 6 Monaco   MCO    2016 6.47e9  3.21     NA      NA      NA      38499    168011.

print(global_gdp_capita)

## # A tsibble: 15,150 x 10 [1Y]
## # Key:       Country [263]
##    Country Code   Year    GDP Growth   CPI Imports Exports Population gdp_capita
##    <fct>   <fct> <dbl>  <dbl>  <dbl> <dbl>   <dbl>   <dbl>      <dbl>      <dbl>
##  1 Monaco  MCO    2014 7.06e9  7.18     NA      NA      NA      38132    185153.
##  2 Monaco  MCO    2008 6.48e9  0.732    NA      NA      NA      35853    180640.
##  3 Liecht… LIE    2014 6.66e9 NA        NA      NA      NA      37127    179308.
##  4 Liecht… LIE    2013 6.39e9 NA        NA      NA      NA      36834    173528.
##  5 Monaco  MCO    2013 6.55e9  9.57     NA      NA      NA      37971    172589.
##  6 Monaco  MCO    2016 6.47e9  3.21     NA      NA      NA      38499    168011.
##  7 Liecht… LIE    2015 6.27e9 NA        NA      NA      NA      37403    167591.
##  8 Monaco  MCO    2007 5.87e9 14.4      NA      NA      NA      35111    167125.
##  9 Liecht… LIE    2016 6.21e9 NA        NA      NA      NA      37666    164993.
## 10 Monaco  MCO    2015 6.26e9  4.94     NA      NA      NA      38307    163369.
## # ℹ 15,140 more rows

In most recent years Monoco has had the highest GDP per capita. Liechtenstein seems to be number 2 and sometimes crosses Monoco. Other countries with high GDP per capita are Norway, Luxembourg, Bermuda. The highest GDP per capita has exchanged several times throughout the years.

Q2. For each of the following series, make a graph of the data. If transforming seems appropriate, do so and describe the effect.

United States GDP from global_economy. Slaughter of Victorian “Bulls, bullocks and steers” in aus_livestock. Victorian Electricity Demand from vic_elec. Gas production from aus_production.

global_economy |> 
  filter(Country == "United States") |> 
ggplot(aes(x = Year, y = GDP)) + #plot the GDP of the US
  geom_line(na.rm = TRUE)+ #ignore missing values
  labs(title= "GDP for US", y = "$US")

global_economy |> 
  filter(Country == "United States") |> 
ggplot(aes(x = Year, y = GDP/Population, color = Country)) + #plot the GDP per capita of the US
  geom_line(na.rm = TRUE)+ #ignore missing values
  labs(title= "GDP per capita for US", y = "$US")

Changing the GDP to GDP per capita for the united states yields similar results to the GDP graph. dividing by the population is a better indicator of the wealth per person in a country.

head(aus_livestock)

## # A tsibble: 6 x 4 [1M]
## # Key:       Animal, State [1]
##      Month Animal                     State                        Count
##      <mth> <fct>                      <fct>                        <dbl>
## 1 1976 Jul Bulls, bullocks and steers Australian Capital Territory  2300
## 2 1976 Aug Bulls, bullocks and steers Australian Capital Territory  2100
## 3 1976 Sep Bulls, bullocks and steers Australian Capital Territory  2100
## 4 1976 Oct Bulls, bullocks and steers Australian Capital Territory  1900
## 5 1976 Nov Bulls, bullocks and steers Australian Capital Territory  2100
## 6 1976 Dec Bulls, bullocks and steers Australian Capital Territory  1800

unique(aus_livestock$State) #Victoria is Victorian

## [1] Australian Capital Territory New South Wales             
## [3] Northern Territory           Queensland                  
## [5] South Australia              Tasmania                    
## [7] Victoria                     Western Australia           
## 8 Levels: Australian Capital Territory New South Wales ... Western Australia

aus_livestock |> 
  filter(Animal == "Bulls, bullocks and steers", State =="Victoria") |> 
ggplot(aes(x = Month, y = Count)) + #plot the count of bulls
  geom_line(na.rm = TRUE) #ignore missing values

bulls_victoria <- aus_livestock |> # create a new data frame with the bulls already filtered
  filter(Animal == "Bulls, bullocks and steers", State =="Victoria")
dcmp <- bulls_victoria |>
  model(stl = STL(Count))
components(dcmp) |> autoplot() #decompose using STL and graph

  aus_livestock |> 
  filter(Animal == "Bulls, bullocks and steers", State =="Victoria") |>  gg_season() #decomposition of bull slaughter

## Plot variable not specified, automatically selected `y = Count`

  aus_livestock |> 
  filter(Animal == "Bulls, bullocks and steers", State =="Victoria") |>  gg_subseries() # stacks slaughter count of each year on top of each other

## Plot variable not specified, automatically selected `y = Count`

  aus_livestock |> 
  filter(Animal == "Bulls, bullocks and steers", State =="Victoria") |> 
  mutate(days_month = days_in_month(Month),  # Get number of days in the month
         daily_average = Count / days_month) |>
    ggplot(aes(x = Month, y = daily_average)) + #plot the average number per day slaughtered 
  geom_line(na.rm = TRUE) #ignore missing values

Breaking the livestock up into seasons shows the overall decreasing trend of the count slaughtered as the years increase.The subseries shows the overall decrease per for the same month over years as well. Transforming the data by dividing the count by the number of days in the month helps to avoid having less count due to a month having fewer days. Overall the data looks the same.

 head(vic_elec)

## # A tsibble: 6 x 5 [30m] <Australia/Melbourne>
##   Time                Demand Temperature Date       Holiday
##   <dttm>               <dbl>       <dbl> <date>     <lgl>  
## 1 2012-01-01 00:00:00  4383.        21.4 2012-01-01 TRUE   
## 2 2012-01-01 00:30:00  4263.        21.0 2012-01-01 TRUE   
## 3 2012-01-01 01:00:00  4049.        20.7 2012-01-01 TRUE   
## 4 2012-01-01 01:30:00  3878.        20.6 2012-01-01 TRUE   
## 5 2012-01-01 02:00:00  4036.        20.4 2012-01-01 TRUE   
## 6 2012-01-01 02:30:00  3866.        20.2 2012-01-01 TRUE

vic_elec |>  
ggplot(aes(x = ymd(Date), y = Demand)) + #plot the Demand 
  geom_line(na.rm = TRUE)  #ignore missing values

vic_elec  |>  gg_season(Demand, period = "week") # week by week

vic_elec |> gg_season(Demand, period = "year")

  vic_elec |> gg_season(Demand, period = "day")

  vic_elec |>  #attempting a decomposition
  model(
    classical_decomposition(Demand, type = "additive")
  ) |>
  components() |>
  autoplot() +
  labs(title = "Classical additive decomposition of victoria electricity demand")

## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_line()`).

  avg_vic_elec <- vic_elec  |>
  mutate(
    `48-MA` = slider::slide_dbl(Demand, mean,
                .before = 24, .after = 23, .complete = TRUE)
  )
  avg_vic_elec  |> ggplot() +
  geom_line(aes(y = `48-MA`, x = Date), colour = "#D55E00") +
  labs(y = "average Demand",
       title = "electricy Demand moving average")

## Warning: Removed 47 rows containing missing values or values outside the scale range
## (`geom_line()`).

showing the demand each hour per day can help with day to day electric plant operations. Seeing the overall trend for a year helps with seasonal operations. Decomposition simply shows how noisy the data is. Utilizing a moving average can show the electricity demand where the average is roughly a 24 hour period. This helps clean up the data significantly.

aus_production |>  
ggplot(aes(x = Quarter, y = Gas)) + #plot the Gas production
  geom_line(na.rm = TRUE)

lambda <- 10 #try a "large" lambda
aus_production |>
  autoplot(box_cox(Gas, lambda)) +
  labs(y = "",
       title = latex2exp::TeX(paste0(
         "Transformed gas production with $\\lambda$ = ",
         round(lambda,2))))

lambda2 <- -10 # tried a negative lambda
aus_production |>
  autoplot(box_cox(Gas, lambda2)) +
  labs(y = "",
       title = latex2exp::TeX(paste0(
         "Transformed gas production with $\\lambda$ = ",
         round(lambda2,2))))

lambda3 <- 0.1
aus_production |>
  autoplot(box_cox(Gas, lambda3)) +
  labs(y = "",
       title = latex2exp::TeX(paste0(
         "Transformed gas production with $\\lambda$ = ",
         round(lambda3,2))))

The example provided in the text book uses box cox transformations to make the seasonal variation uniform. It normalizes the data for statistical analysis or regression. at the extreme lambda values (-10, 10) the transformation can make the seasonal variation worse. Around lambda = 0 the seasonal variation becomes more uniform.

Q3.Why is a Box-Cox transformation unhelpful for the canadian_gas data?

head(canadian_gas)

## # A tsibble: 6 x 2 [1M]
##      Month Volume
##      <mth>  <dbl>
## 1 1960 Jan   1.43
## 2 1960 Feb   1.31
## 3 1960 Mar   1.40
## 4 1960 Apr   1.17
## 5 1960 May   1.12
## 6 1960 Jun   1.01

canadian_gas |>  
ggplot(aes(x = Month, y = Volume)) + #plot the Gas production
  geom_line(na.rm = TRUE)

lambda <- canadian_gas |>  #test to see if any lambda would be helpful
  features(Volume, features = guerrero) |>
  pull(lambda_guerrero)
canadian_gas |>
  autoplot(box_cox(Volume, lambda)) +
  labs(y = "",
       title = latex2exp::TeX(paste0(
         "Transformed canadian gas production with $\\lambda$ = ",
         round(lambda,2))))

For the canadian gas data, even when applying the guerrero method to find an appropriate lambda for the box cox method, the seasonal variation does not become uniform. The 80s seasonal variation stays large and the post 90s and pre 80s seasonal variation remains smaller.

Q4. What Box-Cox transformation would you select for your retail data (from Exercise 7 in Section 2.10)?

set.seed(987654321)
myseries <- aus_retail |>
  filter(`Series ID` == sample(aus_retail$`Series ID`,1))
head(myseries)

## # A tsibble: 6 x 5 [1M]
## # Key:       State, Industry [1]
##   State    Industry                           `Series ID`    Month Turnover
##   <chr>    <chr>                              <chr>          <mth>    <dbl>
## 1 Victoria Other recreational goods retailing A3349415T   1982 Apr     15.6
## 2 Victoria Other recreational goods retailing A3349415T   1982 May     15.8
## 3 Victoria Other recreational goods retailing A3349415T   1982 Jun     15.2
## 4 Victoria Other recreational goods retailing A3349415T   1982 Jul     15.2
## 5 Victoria Other recreational goods retailing A3349415T   1982 Aug     14.5
## 6 Victoria Other recreational goods retailing A3349415T   1982 Sep     15.1

myseries |>  
ggplot(aes(x = Month, y = Turnover)) + #plot the retail turnover
  geom_line(na.rm = TRUE)

lambda <- myseries |>  #test to see if any lambda would be helpful
  features(Turnover, features = guerrero) |>
  pull(lambda_guerrero)
myseries |>
  autoplot(box_cox(Turnover, lambda)) +
  labs(y = "",
       title = latex2exp::TeX(paste0(
         "Transformed canadian gas production with $\\lambda$ = ",
         round(lambda,2))))

The guerrero method chooses -0.27 for a lambda value for the box cox transformation. Seasonal variation is more uniform than in the original plot.

Q5. For the following series, find an appropriate Box-Cox transformation in order to stabilise the variance. Tobacco from aus_production, Economy class passengers between Melbourne and Sydney from ansett, and Pedestrian counts at Southern Cross Station from pedestrian.

aus_production |>  
autoplot(Tobacco) + #plot the tobacco production
  geom_line(na.rm = TRUE)

## Warning: Removed 24 rows containing missing values or values outside the scale range
## (`geom_line()`).

lambda <- aus_production |>  #test to see if any lambda would be helpful
  features(Tobacco, features = guerrero) |>
  pull(lambda_guerrero)
aus_production |>
  autoplot(box_cox(Tobacco, lambda)) +
  labs(y = "",
       title = latex2exp::TeX(paste0(
         "Transformed Tobacco production with $\\lambda$ = ",
         round(lambda,2))))

## Warning: Removed 24 rows containing missing values or values outside the scale range
## (`geom_line()`).

Lambda for Tobacco production is close to 1 (1 being no transformation) which indicates a box cox transformation does not change the seasonal variation much.

?ansett

## starting httpd help server ... done

head(ansett)

## # A tsibble: 6 x 4 [1W]
## # Key:       Airports, Class [1]
##       Week Airports Class    Passengers
##     <week> <chr>    <chr>         <dbl>
## 1 1989 W28 ADL-PER  Business        193
## 2 1989 W29 ADL-PER  Business        254
## 3 1989 W30 ADL-PER  Business        185
## 4 1989 W31 ADL-PER  Business        254
## 5 1989 W32 ADL-PER  Business        191
## 6 1989 W33 ADL-PER  Business        136

unique(ansett$Airports) #find how melborn and sydney airports are labeled picked MEL-SYD

##  [1] "ADL-PER" "MEL-ADL" "MEL-BNE" "MEL-OOL" "MEL-PER" "MEL-SYD" "SYD-ADL"
##  [8] "SYD-BNE" "SYD-OOL" "SYD-PER"

ansett |>  filter(Airports == "MEL-SYD", Class == "Economy") |> 
autoplot(Passengers) + #plot the economy passengers over time
  geom_line(na.rm = TRUE)

mel_syd_eco <- ansett |>  filter(Airports == "MEL-SYD", Class == "Economy")
lambda <- mel_syd_eco |>  #test to see which lambda is selected by guerrero method
  features(Passengers, features = guerrero) |>
  pull(lambda_guerrero)
mel_syd_eco |>
  autoplot(box_cox(Passengers, lambda)) +
  labs(y = "",
       title = latex2exp::TeX(paste0(
         "Transformed Passengers (MEL-SYD economy) with $\\lambda$ = ",
         round(lambda,2))))

due to the extreme lack of flights during one point in time the flight data needs a larger lambda for it’s box cox transformation. It skews the data. the variation during this period decreases.

head(pedestrian)

## # A tsibble: 6 x 5 [1h] <Australia/Melbourne>
## # Key:       Sensor [1]
##   Sensor         Date_Time           Date        Time Count
##   <chr>          <dttm>              <date>     <int> <int>
## 1 Birrarung Marr 2015-01-01 00:00:00 2015-01-01     0  1630
## 2 Birrarung Marr 2015-01-01 01:00:00 2015-01-01     1   826
## 3 Birrarung Marr 2015-01-01 02:00:00 2015-01-01     2   567
## 4 Birrarung Marr 2015-01-01 03:00:00 2015-01-01     3   264
## 5 Birrarung Marr 2015-01-01 04:00:00 2015-01-01     4   139
## 6 Birrarung Marr 2015-01-01 05:00:00 2015-01-01     5    77

south_ped <- pedestrian |> filter(Sensor == "Southern Cross Station")

south_ped |> autoplot(Count) + #plot the southern cross station pedestrian counts over time
  geom_line(na.rm = TRUE)

lambda <- south_ped |>  #test to see which lambda is selected by guerrero method
  features(Count, features = guerrero) |>
  pull(lambda_guerrero)
south_ped |>
  autoplot(box_cox(Count, lambda)) +
  labs(y = "",
       title = latex2exp::TeX(paste0(
         "Transformed Southern cross station count with $\\lambda$ = ",
         round(lambda,2))))

The extremely noisy data becomes more noisy. though the seasonal variation after the transformation is much more uniform.

Consider the last five years of the Gas data from aus_production.

Q7. gas <- tail(aus_production, 5*4) |> select(Gas) Plot the time series. Can you identify seasonal fluctuations and/or a trend-cycle? Use classical_decomposition with type=multiplicative to calculate the trend-cycle and seasonal indices. Do the results support the graphical interpretation from part a? Compute and plot the seasonally adjusted data. Change one observation to be an outlier (e.g., add 300 to one observation), and recompute the seasonally adjusted data. What is the effect of the outlier? Does it make any difference if the outlier is near the end rather than in the middle of the time series?

gas <- tail(aus_production, 5*4) |> select(Gas) 

gas |>   autoplot()

## Plot variable not specified, automatically selected `.vars = Gas`

every year the overall gas production keeps increasing. The general seasonal fluctuations are as follows: The gas production is lowest at the beginning of quarter 1 each year. the production raises to its maximum in the middle of the year and lowers for end of quarter 4.

gas |>
  model(
    classical_decomposition(Gas, type = "multiplicative")
  ) |>
  components() |>
  autoplot() +
  labs(title = "Classical multiplitive decomposition of Gas")

## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_line()`).

the decomposition supports the statements made.

dcmp <- gas |>
  model(stl = STL(Gas)) # STL decomposition used
components(dcmp) |>
  as_tsibble() |> 
  autoplot(Gas, colour = "gray") +
  geom_line(aes(y=season_adjust), colour = "#0072B2") +
  labs(y = "Gas",
       title = "Total Gas")

gas1 <- gas
gas1$Gas[1] <- gas$Gas[1] + 300 # add 300 to the first value of gas and update the dataframe of gas1(created new dataframe for modifications)
head(gas1)

## # A tsibble: 6 x 2 [1Q]
##     Gas Quarter
##   <dbl>   <qtr>
## 1   521 2005 Q3
## 2   180 2005 Q4
## 3   171 2006 Q1
## 4   224 2006 Q2
## 5   233 2006 Q3
## 6   192 2006 Q4

gas10 <- gas
gas10$Gas[10] <- gas$Gas[10] + 300

dcmp <- gas1 |>
  model(stl = STL(Gas)) # STL decomposition used
components(dcmp) |>
  as_tsibble() |>
  autoplot(Gas, colour = "gray") +
  geom_line(aes(y=season_adjust), colour = "#0072B2") +
  labs(y = "Gas",
       title = "Total Gas with 1st point outlier")

dcmp <- gas10 |>
  model(stl = STL(Gas)) # STL decomposition used
components(dcmp) |>
  as_tsibble() |>
  autoplot(Gas, colour = "gray") +
  geom_line(aes(y=season_adjust), colour = "#0072B2") +
  labs(y = "Gas",
       title = "Total Gas with 10th point outlier")

adding an outlier vastly changes the seasonally adjusted data. the seasonally adjusted data is heavy biased or follows the outlier. putting the outlier at the end vs the middle does not seem to make a difference. the decomposition follows the outlier whether it is at the end or the middle.

Recall your retail time series data (from Exercise 7 in Section 2.10). Decompose the series using X-11. Does it reveal any outliers, or unusual features that you had not noticed previously?

myseries |> autoplot(Turnover)

x11_dcmp <- myseries |> #applying a x11 decomp to my retail series "myseries"
  model(x11 = X_13ARIMA_SEATS(Turnover ~ x11())) |>
  components()
autoplot(x11_dcmp) +
  labs(title =
    "Decomposition of Retail using X-11.")

some of the highest irregularities occur after january 2010this is possibly due to the trend switching from a positive to a negative slope. it also shows an increasing seasonal variation, but towards 2020 the seasonal variation remains consistant.

Q9. Figures 3.19 and 3.20 show the result of decomposing the number of persons in the civilian labour force in Australia each month from February 1978 to August 1995.

a.Write about 3–5 sentences describing the results of the decomposition. Pay particular attention to the scales of the graphs in making your interpretation.

The decomposition shows that the trend is increasing. There is a seasonality to the data, however the variation is much smaller than the general trend. The seasonality shows the data rises, lowers, rises, lower, rises. “///” As the years progress the end of the year had progressively higher values in the seasonal plot.

Is the recession of 1991/1992 visible in the estimated components?

The recession of 1991/1992 is visible in the estimated components. with the exception of a few months such as january and februrary the 1991/1992 had lower values than the previous year. it also shows slower growth in january and februrary and september. the slope decreased (but was not negative) during those months.

Data 624 assignment 2 KDD

Keith DeNivo

2024-09-23