This report summarizes a lesson by Harrison Brown on DataCamp.
AirPassengers
## Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
## 1949 112 118 132 129 121 135 148 148 136 119 104 118
## 1950 115 126 141 135 125 149 170 170 158 133 114 140
## 1951 145 150 178 163 172 178 199 199 184 162 146 166
## 1952 171 180 193 181 183 218 230 242 209 191 172 194
## 1953 196 196 236 235 229 243 264 272 237 211 180 201
## 1954 204 188 235 227 234 264 302 293 259 229 203 229
## 1955 242 233 267 269 270 315 364 347 312 274 237 278
## 1956 284 277 317 313 318 374 413 405 355 306 271 306
## 1957 315 301 356 348 355 422 465 467 404 347 305 336
## 1958 340 318 362 348 363 435 491 505 404 359 310 337
## 1959 360 342 406 396 420 472 548 559 463 407 362 405
## 1960 417 391 419 461 472 535 622 606 508 461 390 432
summary(AirPassengers)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 104.0 180.0 265.5 280.3 360.5 622.0
* autoplot()

autoplot(zoo(maunaloa))
Use the class() function to check the class of an object:
* numeric: 19223
* character: "2022-08-09", "August 9, 2022"
* Date: "2022-08-09" (lubridate::as_date())
* POSIXct: "2022-08-09 20:17:00 UTC" (as.POSIXct(), is.POSIXct())

The order of time elements can differ by country and region:
* U.S.: 12/20/2022
* U.K.: 20/12/2022
* Ambiguous most of the year:
  * 6/4/2010: June 4th or April 6th?
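The classes and the ambiguity described above can be checked with a few lines of base R. This is a minimal sketch: the variable names (`d`, `p`, `us`, `uk`) are illustrative and not from the lesson, and it uses as.Date() rather than lubridate::as_date() so that it runs without extra packages.

```r
# Base R only; the date strings are the ones used in the notes above.
d <- as.Date("2022-08-09")
class(d)                 # "Date"
class(as.numeric(d))     # "numeric": days since 1970-01-01
class("2022-08-09")      # "character"

p <- as.POSIXct("2022-08-09 20:17:00", tz = "UTC")
class(p)                 # "POSIXct" "POSIXt"

# The same string read under the two conventions gives two different dates
us <- as.Date("6/4/2010", format = "%m/%d/%Y")  # month first: June 4th
uk <- as.Date("6/4/2010", format = "%d/%m/%Y")  # day first: April 6th
format(us, "%Y-%m-%d")   # "2010-06-04"
format(uk, "%Y-%m-%d")   # "2010-04-06"
```

Stating the format explicitly is what removes the ambiguity; nothing in the string itself says which reading is intended.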
* 2022-06-04
* 2022-06-04 = June 4th, 2022
* Hyphen (-) between date elements: 2022-06-04 vs. 20220604
* lubridate::parse_date_time()
dates_vector <- c("12/20/2022", "2022-12-21", "December 22, 2022")
dates_vector
## [1] "12/20/2022" "2022-12-21" "December 22, 2022"
parse_date_time(dates_vector,
orders = c("%m/%d/%Y",
"%Y-%m-%d",
"%B %d, %Y"))
## [1] "2022-12-20 UTC" "2022-12-21 UTC" "2022-12-22 UTC"
* start()
* end()
* frequency()
* deltat()
* time()
* cycle()

start(AirPassengers)
## [1] 1949 1
# Decimal date
end(ftse)
## [1] 1998.646
end(ftse) %>%
lubridate::date_decimal()
## [1] "1998-08-24 20:18:27 UTC"
frequency(ftse)
## [1] 260
# weekly
frequency(maunaloa)
## [1] 52.17855
# delta t
deltat(ftse)
## [1] 0.003846154
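The accessors listed above can all be tried on the built-in AirPassengers series. This sketch uses only base R; the point it is meant to show is that deltat() is the reciprocal of frequency(), which also matches the ftse output above (1/260 is about 0.003846).

```r
# AirPassengers ships with base R (datasets package): monthly data, 1949 to 1960.
start(AirPassengers)       # 1949 1
end(AirPassengers)         # 1960 12
frequency(AirPassengers)   # 12 observations per year
deltat(AirPassengers)      # 1/12 of a year between observations

# time() gives the decimal-year index, cycle() the position within each year
head(time(AirPassengers), 3)
head(cycle(AirPassengers), 3)

# deltat is the reciprocal of frequency
all.equal(deltat(AirPassengers), 1 / frequency(AirPassengers))
```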
Date/POSIXct data

# Save the start point of maunaloa: maunaloa_start
maunaloa_start <- start(maunaloa)
# Assign the formatted date to start_iso
start_iso <- date_decimal(maunaloa_start)
# Convert to Date class
as_date(start_iso)
## [1] "1974-05-17"
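To make the decimal-date conversion less of a black box, here is a rough base R sketch of the idea: the fractional part of the year is scaled by the length of that year. The helper `decimal_to_date()` is hypothetical, written for illustration only, and is not how lubridate::date_decimal() is implemented.

```r
# Hypothetical helper: convert a decimal year (e.g. 2022.5) to a Date.
decimal_to_date <- function(x) {
  year <- floor(x)
  year_start <- as.POSIXct(paste0(year, "-01-01"), tz = "UTC")
  year_end <- as.POSIXct(paste0(year + 1, "-01-01"), tz = "UTC")
  # Length of this particular year in seconds (handles leap years)
  secs_in_year <- as.numeric(difftime(year_end, year_start, units = "secs"))
  as.Date(year_start + (x - year) * secs_in_year)
}

decimal_to_date(2022)     # first day of 2022
decimal_to_date(2022.5)   # halfway through 2022
```

In practice date_decimal() followed by as_date(), as in the lesson code above, is the more convenient route.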
* zoo(x = ..., order.by = ...)
* as.zoo(): converting to zoo from ts
* index()
* coredata()
* c(): when joining

# # Determine the overlapping indexes
# overlapping_index <-
# index(coffee_overlap) %in% index(coffee)
#
# # Create a subset of the elements which do not overlap
# coffee_subset <- coffee_overlap[!overlapping_index]
#
# # Combine the coffee time series and the new subset
# coffee_combined <- c(coffee, coffee_subset)
#
# autoplot(coffee_combined)
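Since the coffee data is not included in these notes, the same steps can be sketched on toy data. The series `a` and `b` below are made up for illustration; the sketch assumes the zoo package is installed.

```r
library(zoo)

# Two toy series; b overlaps a on 2022-01-03
a <- zoo(c(1, 2, 3), order.by = as.Date("2022-01-01") + 0:2)
b <- zoo(c(30, 40, 50), order.by = as.Date("2022-01-03") + 0:2)

# Flag the elements of b whose index already exists in a
overlapping_index <- index(b) %in% index(a)

# Keep only the non-overlapping part of b, then join
b_subset <- b[!overlapping_index]
combined <- c(a, b_subset)

index(combined)      # the five dates from 2022-01-01 to 2022-01-05
coredata(combined)   # 1 2 3 40 50
```

Dropping the overlap first matters because joining zoo objects with c() is not meant for series that share index values.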
* fortify.zoo(): from zoo to data frame

Windows:
Purpose of windows:
View a specified range of data
Focus in on years/events of interest
Ignore observations at the “edge” of the data
stats::window(x = ..., start = ..., end = ...)
# Create a window from ftse
ftse_window <- window(ftse, start = "1995-01-01", end = "1997-01-01")
# Create an autoplot from the original ftse
autoplot(ftse) +
labs(y = "Price (USD)")
# Create an autoplot from ftse_window
autoplot(ftse_window) +
labs(y = "Price (USD)")
# Build the logical index for 1990 through 2010
subset <- index(maunaloa) >= "1990" &
index(maunaloa) <= "2010"
# Extract the subset of maunaloa
maunaloa_subset <- maunaloa[subset]
# Autoplot the subsetted maunaloa dataset
autoplot(zoo(maunaloa_subset))
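For a self-contained illustration of stats::window(), the built-in AirPassengers series works without any extra data. For a ts object, start and end can be given as c(year, period) pairs; the object name `ap_window` is illustrative.

```r
# Keep only January 1955 through December 1956 of the monthly series
ap_window <- window(AirPassengers, start = c(1955, 1), end = c(1956, 12))

length(ap_window)   # 24 monthly observations
start(ap_window)    # 1955 1
end(ap_window)      # 1956 12
```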
Aggregation:
e.g., monthly data: which date should represent the month?
* 2003-01-01?
* 2003-01-31?
* 2003-01-15?
* 2003-01?

* zoo::as.yearmon()
* zoo::as.yearqtr()

laborday_2022 <- as_date("2022-09-05")
as.yearmon(laborday_2022)
## [1] "9 2022"
as.yearqtr(laborday_2022)
## [1] "2022 Q3"
as.yearqtr(2018.639)
## [1] "2018 Q3"
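The question of which date should stand for a month goes away once every date in that month maps to the same label. A short sketch, assuming the zoo package is installed (the vector `dates` reuses the candidate dates listed above):

```r
library(zoo)

dates <- as.Date(c("2003-01-01", "2003-01-15", "2003-01-31"))

# All three dates collapse to a single year-month value
ym <- as.yearmon(dates)
length(unique(ym))   # 1

# Internally a yearmon is a number: year + (month - 1) / 12
as.numeric(ym[1])    # 2003

# The same holds at quarterly resolution
yq <- as.yearqtr(dates)
length(unique(yq))   # 1
```

How the label prints (for example "9 2022" above) depends on the session's locale, so the sketch avoids relying on the printed form.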
Frequency:
Temporal resolution:
* Aggregate with mean, sum, or max to the chosen interval
* sum of daily data
* mean of hourly values

xts:
eXtensible Time Series
Extends the zoo package and the zoo class of objects
apply.*(x = ..., FUN = ...) functions, e.g. apply.monthly()
endpoints(x = ..., on = ..., k = ...)
* on: “weeks”, “months”, “days”, …
* k: an integer, counted in the period unit set by on
* period.apply()
zoo_maunaloa <- zoo(maunaloa)
index(zoo_maunaloa) <- date_decimal(index(zoo_maunaloa))
autoplot(zoo_maunaloa)
# Aggregate to the monthly max and autoplot
monthly_max <- apply.monthly(zoo_maunaloa, FUN = max)
autoplot(monthly_max)
# Create the index from every third month
three_month_index <- endpoints(x = zoo_maunaloa,
on = "months",
k = 3)
# Apply the maximum to the time series using the index
three_month_max <- period.apply(x = zoo_maunaloa,
INDEX = three_month_index,
FUN = max)
# Autoplot the three-month maxima
autoplot(three_month_max)
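The same aggregation pattern can be tried on a small synthetic series, which makes the results easy to verify by hand. This is a sketch assuming the xts package is installed; the daily series `x` (values 1, 2, 3, ...) is made up for illustration.

```r
library(xts)

# Six months of daily data whose values are simply 1, 2, 3, ...
dates <- seq(as.Date("2022-01-01"), as.Date("2022-06-30"), by = "day")
x <- xts(seq_along(dates), order.by = dates)

# One value per month: the sum of that month's daily values
monthly_sum <- apply.monthly(x, FUN = sum)
nrow(monthly_sum)   # 6

# Row positions that close each block of three months; the first entry is 0
ep <- endpoints(x, on = "months", k = 3)

# Apply max() to the rows between consecutive endpoints
block_max <- period.apply(x, INDEX = ep, FUN = max)
```

endpoints() only computes where the intervals end; period.apply() is what applies the summary function between those positions.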
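na.* functions from zoo:
* na.fill(object = ..., fill = ...): replaces missing values with the value given in the fill argument
* na.locf(): replaces missing values with the most recent observation
* na.approx(): replaces missing values using linear interpolation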
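The three strategies are easiest to compare on one small series with a gap in the middle. A minimal sketch, assuming the zoo package is installed; the series `z` is toy data.

```r
library(zoo)

# Toy series with two missing values between 1 and 4
z <- zoo(c(1, NA, NA, 4), order.by = 1:4)

filled_const <- na.fill(z, fill = 0)   # 1 0 0 4: constant replacement
filled_locf <- na.locf(z)              # 1 1 1 4: last observation carried forward
filled_lin <- na.approx(z)             # 1 2 3 4: linear interpolation

coredata(filled_const)
coredata(filled_locf)
coredata(filled_lin)
```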
Measure of how statistics change as the data moves in time
zoo::rollmean(x = ..., k = ..., align = ..., fill = ...)
* k: size of window
* align: alignment of window; “right”, “left”, “center”
* fill: values to fill in outside of window

zoo::rollsum()
zoo::rollmax()
zoo::rollapply(data = ..., width = ..., FUN = ..., align = ..., fill = ...): accepts user-defined functions
* data: time series object
* width: width of window (k)
* FUN: summary function

Use base::seq_along() to build the width argument of rollapply().
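The rolling functions above can be sketched on a five-point toy series. This assumes the zoo package is installed; the series `z` and the range function passed to rollapply() are illustrative choices, not taken from the lesson.

```r
library(zoo)

z <- zoo(c(1, 2, 3, 4, 5), order.by = 1:5)

# Right-aligned 3-point rolling mean; positions without a full window get NA
roll_mean <- rollmean(z, k = 3, align = "right", fill = NA)   # NA NA 2 3 4

# A user-defined summary: the range (max minus min) within each window
roll_range <- rollapply(z, width = 3, FUN = function(w) max(w) - min(w),
                        align = "right", fill = NA)           # NA NA 2 2 2

# width = seq_along(z) gives window sizes 1, 2, 3, ..., i.e. an expanding window
expanding_mean <- rollapply(z, width = seq_along(z), FUN = mean,
                            align = "right")                  # 1 1.5 2 2.5 3
```

Passing a vector as width lets every position have its own window size, which is why seq_along() turns a rolling statistic into a cumulative one.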