DATA 608 - PROJECT 1
Introduction
This project consists of 3 parts - two required and one bonus and is worth 15% of your grade.
Part A – ATM Forecast, ATM624Data.xlsx
In Part A, I want you to forecast how much cash is taken out of 4 different ATM machines for May 2010. The data is given in a single file. The variable ‘Cash’ is provided in hundreds of dollars; other than that, it is straightforward. I am being somewhat ambiguous on purpose to make this have a little more business feeling. Explain and demonstrate your process, techniques used and not used, and your actual forecast. I am giving you data via an Excel file; please provide your written report on your findings, visuals, discussion and your R code via an RPubs link along with the actual .rmd file. Also, please submit the forecast, which you will put in an Excel-readable file.
Part B – Forecasting Power, ResidentialCustomerForecastLoad-624.xlsx
Part B consists of a simple dataset of residential power usage for January 1998 until December 2013. Your assignment is to model these data and produce a monthly forecast for 2014. The data is given in a single file. The variable ‘KWH’ is power consumption in kilowatt hours; the rest is straightforward. Add this to your existing files above.
Part C – BONUS, optional (part or all), Waterflow_Pipe1.xlsx and Waterflow_Pipe2.xlsx
Part C consists of two data sets. These are simple two-column sets; however, they have different time stamps. Your optional assignment is to time-base sequence the data and aggregate based on hour (an example of what this looks like follows). Note that for multiple recordings within an hour, take the mean. Then determine whether the data is stationary and whether it can be forecast. If so, provide a week-forward forecast and present results via RPubs and .rmd, and the forecast in an Excel-readable file.
Package
The following R packages are used in this project.
Part A
Data Exploration
Data Summary
atm <- rio::import('https://raw.githubusercontent.com/oggyluky11/DATA624-SPRING-2021/main/PROJECT_1/ATM624Data.xlsx',
col_types = c('date', 'text', 'numeric')) %>%
mutate(ATM = as.factor(ATM))
summary(atm)
## DATE ATM Cash
## Min. :2009-05-01 00:00:00 ATM1:365 Min. : 0.0
## 1st Qu.:2009-08-01 00:00:00 ATM2:365 1st Qu.: 0.5
## Median :2009-11-01 00:00:00 ATM3:365 Median : 73.0
## Mean :2009-10-31 19:11:48 ATM4:365 Mean : 155.6
## 3rd Qu.:2010-02-01 00:00:00 NA's: 14 3rd Qu.: 114.0
## Max. :2010-05-14 00:00:00 Max. :10919.8
## NA's :19
Missing-value Check
There are 5 missing [Cash] values in series ATM1 and ATM2 before May 2010, and every [Cash] value from 1 May 2010 onward is missing. Because the task is to forecast the cash withdrawn in May 2010, the existing May 2010 rows are removed.
## DATE ATM Cash
## Min. :2009-05-01 ATM1:365 Min. : 0.0
## 1st Qu.:2009-07-31 ATM2:365 1st Qu.: 0.5
## Median :2009-10-30 ATM3:365 Median : 73.0
## Mean :2009-10-30 ATM4:365 Mean : 155.6
## 3rd Qu.:2010-01-29 3rd Qu.: 114.0
## Max. :2010-04-30 Max. :10919.8
## NA's :5
Timeline Check
Check the continuity of the daily series: no days are missing between the first and last dates.
full_date <- seq(min(atm_mod$DATE), max(atm_mod$DATE), by = 'days')
data.frame(full_date) %>% filter(!full_date %in% atm_mod$DATE)
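The gap check above compares the observed dates with a complete calendar sequence. A minimal base-R sketch of the same idea on a toy vector follows; the dates are made up for illustration and are not taken from the ATM file.

```r
# Toy daily dates with one day (2009-05-03) deliberately left out.
observed <- as.Date(c('2009-05-01', '2009-05-02', '2009-05-04', '2009-05-05'))

# Every calendar day between the first and last observation.
expected <- seq(min(observed), max(observed), by = 'day')

# Days that should be present but are not; length 0 means no gaps.
missing_days <- expected[!expected %in% observed]
missing_days
```

An empty result corresponds to the "no daily gaps" finding reported for the ATM data.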
Outlier Check
The boxplots show a significant outlier in the ATM4 series.
atm_mod %>%
ggplot(aes(x=Cash)) +
geom_boxplot(na.rm = TRUE) +
facet_grid(cols = vars(ATM), scales = 'free') +
ggtitle('Boxplot: ATM')
Data Manipulation
Imputing Missing Values
Use the na_interpolation function from the imputeTS package to impute the missing values in series ATM1 and ATM2.
atm_ts <- atm_mod %>%
spread(key = 'ATM', value = 'Cash') %>%
select(-DATE) %>%
ts(frequency = 7)
atm_ts_imp <- na_interpolation(atm_ts)
ggarrange(
ggplot_na_imputations(atm_ts[,1], atm_ts_imp[,1]) +
theme_igray() +
ggtitle('Imputed Values - ATM 1'),
ggplot_na_imputations(atm_ts[,2], atm_ts_imp[,2]) +
theme_igray() +
ggtitle('Imputed Values - ATM 2'),
nrow = 2)
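By default na_interpolation fills gaps by linear interpolation. As a rough sketch of what that means, the snippet below does the same thing with base R's stats::approx on a made-up vector; it illustrates the idea and is not the package's internal code.

```r
# Toy series with interior gaps (values are made up).
x <- c(10, 12, NA, 16, NA, NA, 22)

idx   <- seq_along(x)
known <- !is.na(x)

# approx() draws straight lines between the known points;
# rule = 2 would carry the nearest known value to a leading or trailing NA.
x_filled <- approx(idx[known], x[known], xout = idx, rule = 2)$y
x_filled
```

Each missing value is placed on the straight line joining its nearest observed neighbours, which is reasonable for short gaps such as the few missing days in ATM1 and ATM2.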
Handling Outliers
Suppress the outlier in ATM4 using the tsclean function.
Data Visualization
The outliers are suppressed in the final data set.
ATM 1
Observation on Raw Data
Significant weekly seasonality exists;
No sign of steady trend but small fluctuation over time;
ACF shows a decreasing trend in the seasonal lags, and PACF drops off after the first seasonal lag.
In both ACF and PACF, the non-seasonal lags are either within the critical limit or slightly above it.
Based on the above observations, the time series atm_1 is non-stationary with significant seasonality and little trend. Seasonal differencing is required to transform atm_1 into a stationary series.
Time Series Transformation
Perform seasonal differencing with lag = 7;
Check with the KPSS unit root test: the test statistic (0.0153) is well below the 5% critical value (0.463), so the null hypothesis of stationarity is not rejected and the transformed series can be treated as stationary.
atm_1_mod <- atm_1 %>%
BoxCox(BoxCox.lambda(atm_1)) %>%
diff(lag = 7)
atm_1_mod %>%
ur.kpss() %>%
summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 5 lags.
##
## Value of test-statistic is: 0.0153
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
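The ur.kpss() summary reports a test statistic and critical values rather than a p-value, and the KPSS null hypothesis is that the series is stationary. The short sketch below, with the numbers copied from the output above, shows how the decision is read off.

```r
# Values copied from the ur.kpss() summary above.
test_stat <- 0.0153
crit <- c('10pct' = 0.347, '5pct' = 0.463, '2.5pct' = 0.574, '1pct' = 0.739)

# KPSS null hypothesis: the series is (level) stationary.
# The null is rejected only when the statistic EXCEEDS the critical value.
reject_stationarity <- test_stat > crit
reject_stationarity
```

Every entry is FALSE, so stationarity is not rejected at any of the listed significance levels.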
Observation on Transformed Data
The seasonal effect is eliminated after differencing; the transformed data shows no significant seasonality or trend.
As the data set becomes stationary after seasonal differencing, no further differencing is needed.
As the atm_1 data set is non-stationary with seasonality and becomes stationary after seasonal differencing, an ARIMA model with seasonal difference D = 1 is appropriate. Because no further differencing is needed, the trend difference d = 0.
The PACF shows a decreasing trend in the seasonal lags and the ACF drops off after the first seasonal lag, therefore the seasonal AR order P = 0 and the seasonal MA order Q = 1.
The PACF shows a decreasing trend in the non-seasonal lags with multiple lags above the critical limit, and the ACF drops off after the first non-seasonal lag, therefore the AR order p = 0 and the MA order q >= 1.
Therefore, from the analysis above, the suggested ARIMA models are ARIMA(0,0,>=1)(0,1,1)[7].
Build ARIMA Model
Use the auto.arima function to find the model with the lowest AICc. The result is consistent with the suggested models ARIMA(0,0,>=1)(0,1,1)[7]; the value of q selected by auto.arima is 2.
The final model is ARIMA(0,0,2)(0,1,1)[7].
The p-value of the Ljung-Box test (0.5428) is greater than 0.05, so there is no evidence of remaining autocorrelation in the model residuals.
## Series: atm_1
## ARIMA(0,0,2)(0,1,1)[7]
## Box Cox transformation: lambda= 0.2615708
##
## Coefficients:
## ma1 ma2 sma1
## 0.1126 -0.1094 -0.6418
## s.e. 0.0524 0.0520 0.0432
##
## sigma^2 estimated as 1.764: log likelihood=-609.99
## AIC=1227.98 AICc=1228.09 BIC=1243.5
##
## Ljung-Box test
##
## data: Residuals from ARIMA(0,0,2)(0,1,1)[7]
## Q* = 9.8626, df = 11, p-value = 0.5428
##
## Model df: 3. Total lags used: 14
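For readers who want to reproduce the residual check without the forecast package, here is a hedged sketch using only base R (stats::arima and Box.test). The series is simulated with made-up coefficients, so the numbers will not match the ATM1 results; the point is the mechanics, in particular fitdf = 3 for the three estimated coefficients, which gives df = 14 - 3 = 11 as in the output above.

```r
set.seed(624)

# Simulate one year of daily data with a weekly seasonal structure
# (seasonal MA(1) noise, seasonally integrated; coefficients are made up).
n <- 52 * 7
e <- rnorm(n + 7)
w <- e[8:(n + 7)] - 0.6 * e[1:n]
y <- numeric(n)
for (t in seq_len(n)) {
  prev <- if (t > 7) y[t - 7] else 50
  y[t] <- prev + w[t]
}
y_ts <- ts(y, frequency = 7)

# Same model form as the ATM1 fit: ARIMA(0,0,2)(0,1,1)[7].
fit <- arima(y_ts, order = c(0, 0, 2),
             seasonal = list(order = c(0, 1, 1), period = 7))

# Ljung-Box test on the residuals; fitdf = number of estimated coefficients.
lb <- Box.test(residuals(fit), lag = 14, type = 'Ljung-Box', fitdf = 3)
lb
```

A p-value above 0.05 means the test finds no evidence of autocorrelation left in the residuals.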
Forecast
To forecast the cash withdrawal in May 2010, we set h = 31
atm_1_ARIMA %>%
forecast(h=31) %>%
autoplot() +
labs(title = 'Forecast for ATM Machines 1 May 2010 Cash Withdrawal',
x = 'Day',
y = 'Amount (in hundreds of dollars)')
ATM 2
Observation on Raw Data
Significant weekly seasonality exists;
Slightly decreasing trend over time;
ACF shows a positive, decreasing trend in the seasonal lags, and PACF drops off after the first two seasonal lags.
ACF shows a slightly decreasing trend in the non-seasonal lags, and PACF drops off after the first two lags.
Based on the above observations, the time series atm_2 is non-stationary with significant seasonality and a slightly decreasing trend. Seasonal differencing is required to transform atm_2 into a stationary series.
Time Series Transformation
Perform seasonal differencing with lag = 7;
Check with the KPSS unit root test: the test statistic (0.0162) is well below the 5% critical value (0.463), so the null hypothesis of stationarity is not rejected and the transformed series can be treated as stationary.
atm_2_mod <- atm_2 %>%
BoxCox(BoxCox.lambda(atm_2)) %>%
diff(lag = 7)
atm_2_mod %>%
ur.kpss() %>%
summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 5 lags.
##
## Value of test-statistic is: 0.0162
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
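The transformation pipeline used here (Box-Cox followed by a lag-7 difference) can be written out in a few lines of base R. The sketch below uses the standard Box-Cox formula for positive data, a toy series, and an arbitrary lambda of 0.5; the helper names box_cox and inv_box_cox are illustrative and are not functions from the forecast package.

```r
# Standard Box-Cox transform and its inverse for positive data.
box_cox <- function(x, lambda) {
  if (lambda == 0) log(x) else (x^lambda - 1) / lambda
}
inv_box_cox <- function(y, lambda) {
  if (lambda == 0) exp(y) else (lambda * y + 1)^(1 / lambda)
}

# Toy positive series: three similar "weeks" with a small weekly step.
x <- rep(c(80, 90, 100, 95, 85, 40, 60), times = 3) + rep(0:2, each = 7)

lambda  <- 0.5                 # illustrative value, not an estimate
z       <- box_cox(x, lambda)
z_sdiff <- diff(z, lag = 7)    # z[t] - z[t-7]; the first 7 observations are lost

length(x)
length(z_sdiff)
```

Forecasts produced on the transformed scale have to be passed through the inverse transform to return to hundreds of dollars; when lambda is supplied to the model, as in the fits in this report, the forecast package does this back-transformation itself.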
Observation on Transformed Data
The seasonal effect is eliminated after differencing; the transformed data shows no significant seasonality or trend.
As the data set becomes stationary after seasonal differencing, no further trend differencing is needed.
As the atm_2 data set is non-stationary with seasonality and becomes stationary after seasonal differencing, an ARIMA model with seasonal difference D = 1 is appropriate. Because no further differencing is needed, the trend difference d = 0.
The PACF shows a decreasing trend in the seasonal lags and the ACF drops off after the first seasonal lag, therefore the seasonal AR order P = 0 and the seasonal MA order Q = 1.
Both ACF and PACF show stable variations within or slightly above the critical limits, so neither the AR nor the MA terms can be omitted: the AR order p >= 1 and the MA order q >= 1.
Therefore, from the analysis above, the suggested ARIMA models are ARIMA(>=1,0,>=1)(0,1,1)[7].
Build ARIMA Model
Use the auto.arima function to find the model with the lowest AICc. The result is consistent with the suggested models ARIMA(>=1,0,>=1)(0,1,1)[7]; the values of p and q selected by auto.arima are both 3.
The final model is ARIMA(3,0,3)(0,1,1)[7] with drift.
The p-value of the Ljung-Box test (0.1768) is greater than 0.05, so there is no evidence of remaining autocorrelation in the model residuals.
## Series: atm_2
## ARIMA(3,0,3)(0,1,1)[7] with drift
## Box Cox transformation: lambda= 0.7242585
##
## Coefficients:
## ar1 ar2 ar3 ma1 ma2 ma3 sma1 drift
## 0.4902 -0.4948 0.8326 -0.4823 0.3203 -0.7837 -0.7153 -0.0203
## s.e. 0.0863 0.0743 0.0614 0.1060 0.0941 0.0621 0.0453 0.0072
##
## sigma^2 estimated as 67.52: log likelihood=-1260.59
## AIC=2539.18 AICc=2539.69 BIC=2574.1
##
## Ljung-Box test
##
## data: Residuals from ARIMA(3,0,3)(0,1,1)[7] with drift
## Q* = 8.944, df = 6, p-value = 0.1768
##
## Model df: 8. Total lags used: 14
Forecast
To forecast the cash withdrawal in May 2010, we set h = 31
atm_2_ARIMA %>%
forecast(h=31) %>%
autoplot() +
labs(title = 'Forecast for ATM Machines 2 May 2010 Cash Withdrawal',
x = 'Day',
y = 'Amount (in hundreds of dollars)')
ATM 3
Observation on Raw Data
Only 3 valid data points exist in the time series.
There is not enough information to infer trend or seasonality, so developing an advanced forecast model is not possible.
Instead, the average method is used as the forecasting model.
Build Forecasting Model with Average Method
atm_3 <- atm_ts_imp[,3]
autoplot(atm_3,
main = 'ATM 3 Cash Time Series ggtsdisplay') +
autolayer(meanf(atm_3 %>% window(start=52.7), h = 31)) +
labs(title = 'Forecast for ATM Machines 3 May 2010 Cash Withdrawal',
x = 'Day',
y = 'Amount (in hundreds of dollars)')
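The average method simply repeats the mean of the available observations for every future period. A minimal sketch with placeholder values (not the actual ATM3 readings) shows what the point forecast amounts to.

```r
# Placeholder values standing in for the three valid ATM3 observations.
y <- c(90, 80, 85)

h <- 31                        # one forecast per day of May 2010
mean_fc <- rep(mean(y), h)     # the same value for every horizon

head(mean_fc)
```

The point forecast is flat because, with only three observations, there is nothing to estimate beyond a level.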
ATM 4
Observation on Raw Data
No stable seasonality over time;
No stable trend over time;
The fluctuation over time appears to be random;
Neither ACF nor PACF shows a significant spike at the seasonal lags.
Both ACF and PACF show stable variation within the critical limit, except for a few spikes at the beginning.
Based on the above observations, the time series atm_4 is stationary with no seasonality and no stable trend. Differencing is not required to transform atm_4.
Time Series Transformation
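No differencing is performed because there is no seasonality; however, a Box-Cox transformation is applied to stabilize the variance to some degree.
Check with the KPSS unit root test: the test statistic (0.0797) is below the 5% critical value (0.463), so the null hypothesis of stationarity is not rejected.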
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 5 lags.
##
## Value of test-statistic is: 0.0797
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
Observation on Transformed Data
As the atm_4 data set is somewhat stationary, an ARIMA model with difference orders D = 0 and d = 0 is appropriate.
Both ACF and PACF show a decreasing trend in the seasonal lags; however, the PACF decreases more sharply than the ACF and drops off after lag 21, therefore the seasonal AR order P >= 1 and the seasonal MA order Q >= 0.
Both ACF and PACF show stable variations within or slightly above the critical limits, and the PACF shows multiple spikes above the critical limit, therefore the AR order p >= 0 and the MA order q >= 0.
Therefore, from the analysis above, the suggested ARIMA models are ARIMA(>=0,0,>=0)(>=1,0,>=0)[7].
Build ARIMA Model
Use the auto.arima function to find the model with the lowest AICc. The result is consistent with the suggested models ARIMA(>=0,0,>=0)(>=1,0,>=0)[7]; the values of p, q, P and Q selected by auto.arima are 1, 0, 2 and 0 respectively.
The final model is ARIMA(1,0,0)(2,0,0)[7] with non-zero mean.
The p-value of the Ljung-Box test (0.07681) is greater than 0.05, so there is no evidence of remaining autocorrelation in the model residuals.
## Series: atm_4
## ARIMA(1,0,0)(2,0,0)[7] with non-zero mean
## Box Cox transformation: lambda= 0.4492823
##
## Coefficients:
## ar1 sar1 sar2 mean
## 0.0801 0.2076 0.2031 28.5695
## s.e. 0.0526 0.0516 0.0524 1.2477
##
## sigma^2 estimated as 175.6: log likelihood=-1459.6
## AIC=2929.2 AICc=2929.37 BIC=2948.7
##
## Ljung-Box test
##
## data: Residuals from ARIMA(1,0,0)(2,0,0)[7] with non-zero mean
## Q* = 16.891, df = 10, p-value = 0.07681
##
## Model df: 4. Total lags used: 14
Forecast
To forecast the cash withdrawal in May 2010, we set h = 31
atm_4_ARIMA %>%
forecast(h=31) %>%
autoplot() +
labs(title = 'Forecast for ATM Machines 4 May 2010 Cash Withdrawal',
x = 'Day',
y = 'Amount (in hundreds of dollars)')
Export Forecast to CSV
ATM_forecast <- cbind(ATM_1 = atm_1_ARIMA %>% forecast(h=31) %>% .$mean,
ATM_2 = atm_2_ARIMA %>% forecast(h=31) %>% .$mean,
ATM_3 = meanf(atm_3 %>% window(start=52.7), h = 31) %>% .$mean,
ATM_4 = atm_4_ARIMA %>% forecast(h=31) %>% .$mean) %>%
data.frame() %>%
mutate(DATE = seq(as.Date('2010-5-1'), as.Date('2010-5-31'), by = 'days')) %>%
select(DATE, everything())
ATM_forecast
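The assembled data frame still has to be written to disk to produce the Excel-readable file. One possible way, sketched with base R's write.csv, is shown below; the stand-in data frame, its values, and the file name are assumptions for illustration only.

```r
# Stand-in for ATM_forecast with made-up values (hundreds of dollars).
fc <- data.frame(
  DATE  = seq(as.Date('2010-05-01'), as.Date('2010-05-31'), by = 'day'),
  ATM_1 = rep(85, 31)
)

# Hypothetical output path; a temporary directory keeps the sketch side-effect free.
out_file <- file.path(tempdir(), 'ATM_forecast.csv')
write.csv(fc, out_file, row.names = FALSE)

# Read the file back to confirm the layout Excel will see.
check <- read.csv(out_file)
dim(check)
```

The values remain in hundreds of dollars, so the column headers or an accompanying note should say so.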
Part B
Data Exploration
Data Summary
res<- rio::import('https://raw.githubusercontent.com/oggyluky11/DATA624-SPRING-2021/main/PROJECT_1/ResidentialCustomerForecastLoad-624.xlsx')
summary(res)
## CaseSequence YYYY-MMM KWH
## Min. :733.0 Length:192 Min. : 770523
## 1st Qu.:780.8 Class :character 1st Qu.: 5429912
## Median :828.5 Mode :character Median : 6283324
## Mean :828.5 Mean : 6502475
## 3rd Qu.:876.2 3rd Qu.: 7620524
## Max. :924.0 Max. :10655730
## NA's :1
Missing-value Check
It’s observed that there is only one missing value in Sep 2008.
Timeline Check
Check the continuity of the monthly series. There are no monthly gaps in the time series: each calendar month appears 16 times, for a total of 16 years (192 months) of monthly data.
res %>%
mutate(Month = str_extract(`YYYY-MMM`, '[[:alpha:]]+')) %>%
group_by(Month) %>%
tally(n='Count')
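Counting rows per month name confirms that every month occurs equally often, but it does not by itself prove the sequence is consecutive. A complementary check, sketched below in base R on toy labels in the same 'YYYY-MMM' format, converts each label to a running month index and looks for steps other than 1.

```r
# Toy labels in the file's 'YYYY-MMM' format; March 1998 is deliberately left out.
labels <- c('1998-Jan', '1998-Feb', '1998-Apr', '1998-May')

yr  <- as.integer(substr(labels, 1, 4))
mon <- match(substr(labels, 6, 8), month.abb)   # English month abbreviations

# Running month index; consecutive months differ by exactly 1.
month_index <- yr * 12 + mon
gaps <- which(diff(month_index) != 1)
gaps
```

Each reported position marks a label followed by a jump; an empty result means the series has no monthly gaps.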
Outlier Check
There is one outlier at case sequence 883.
Data Manipulation
Imputing Missing Values & Handling Outliers
Impute the missing value and suppress the outlier using the tsclean function.
Data Visualization
The outlier is suppressed in the final data set.
Observation on Raw Data
Significant yearly seasonality exists;
No sign of steady trend but small fluctuation over time;
ACF shows a decreasing trend in the seasonal lags, and PACF drops off after the first seasonal lag.
In both ACF and PACF, the non-seasonal lags are either within the critical limit or slightly above it.
Based on the above observations, the time series res_ts is non-stationary with significant seasonality and little trend. Seasonal differencing is required to transform res_ts into a stationary series.
Time Series Transformation
Perform seasonal differencing with lag = 12;
Check with the KPSS unit root test: the test statistic (0.1049) is below the 5% critical value (0.463), so the null hypothesis of stationarity is not rejected and the transformed series can be treated as stationary.
res_ts_mod <- res_ts_imp %>%
BoxCox(BoxCox.lambda(res_ts_imp)) %>%
diff(lag = 12)
res_ts_mod %>%
ur.kpss() %>%
summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 4 lags.
##
## Value of test-statistic is: 0.1049
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
Observation on Transformed Data
The seasonal effect is eliminated after differencing; the transformed data shows no significant seasonality or trend.
As the data set becomes stationary after seasonal differencing, no further differencing is needed.
As this data set is non-stationary with seasonality and becomes stationary after seasonal differencing, an ARIMA model with seasonal difference D = 1 is appropriate. Because no further differencing is needed, the trend difference d = 0.
The PACF shows a decreasing trend in the seasonal lags with two spikes above the critical limit, and the ACF drops off after the first seasonal lag, therefore the seasonal AR order P = 0 and the seasonal MA order Q >= 1.
The PACF shows a decreasing trend in the non-seasonal lags with multiple lags above the critical limit, and the ACF shows stable variation within the critical limit after the first non-seasonal lag, therefore the AR order p >= 1 and the MA order q = 0.
Therefore, from the analysis above, the suggested ARIMA models are ARIMA(>=1,0,0)(0,1,1)[12].
Build ARIMA Model
Use the auto.arima function to find the model with the lowest AICc. The result is consistent with the suggested models ARIMA(>=1,0,0)(0,1,1)[12]; the value of p selected by auto.arima is 1.
The final model is ARIMA(1,0,0)(0,1,1)[12] with drift.
The p-value of the Ljung-Box test (0.2263) is greater than 0.05, so there is no evidence of remaining autocorrelation in the model residuals.
res_ARIMA <- auto.arima(res_ts %>% tsclean(), lambda = BoxCox.lambda(res_ts %>% tsclean()), approximation = FALSE)
res_ARIMA
## Series: res_ts %>% tsclean()
## ARIMA(1,0,0)(0,1,1)[12] with drift
## Box Cox transformation: lambda= -0.1442665
##
## Coefficients:
## ar1 sma1 drift
## 0.2903 -0.7349 1e-04
## s.e. 0.0724 0.0698 1e-04
##
## sigma^2 estimated as 8.731e-05: log likelihood=585.27
## AIC=-1162.55 AICc=-1162.32 BIC=-1149.78
##
## Ljung-Box test
##
## data: Residuals from ARIMA(1,0,0)(0,1,1)[12] with drift
## Q* = 25.496, df = 21, p-value = 0.2263
##
## Model df: 3. Total lags used: 24
Forecast
To forecast monthly power usage for 2014, we set h = 12.
res_ARIMA %>%
forecast(h=12) %>%
autoplot() +
labs(title = 'Forecast for Residential power usage in year 2014',
x = 'Year-Month',
y = 'KWH')
## [1] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"
Add Forecast to Existing File
res_forecast <- data.frame(seq(max(res$CaseSequence)+1,max(res$CaseSequence)+12),
paste('2014',month.abb, sep = '-'),
res_ARIMA %>% forecast(h=12) %>% .$mean,
stringsAsFactors = FALSE)
names(res_forecast) <- c('CaseSequence', 'YYYY-MMM', 'KWH')
write.xlsx(rbind(res, res_forecast),
'ResidentialCustomerForecastLoad-624_with_Forecast.xlsx',
'ResidentialCustomerForecastLoad',
row.names = FALSE)
Part C
Load Data
wf_p1 <- import('https://raw.githubusercontent.com/oggyluky11/DATA624-SPRING-2021/main/PROJECT_1/Waterflow_Pipe1.xlsx',
col_types = c('date', 'numeric'))
wf_p2 <- import('https://raw.githubusercontent.com/oggyluky11/DATA624-SPRING-2021/main/PROJECT_1/Waterflow_Pipe2.xlsx',
col_types = c('date', 'numeric'))
Aggregate wf_p1 Based on Hour
wf_p1_mod <- wf_p1 %>%
mutate(`Date Time` = round(`Date Time`, 'hours') %>% as.POSIXct()) %>%
group_by(`Date Time`) %>%
summarise(WaterFlow = mean(WaterFlow))
wf_p1_mod
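Note that round(x, 'hours') assigns each reading to the nearest hour, whereas trunc(x, 'hours') assigns it to the hour in which it was recorded. Which one matches "recordings within an hour" is a judgment call; the base-R sketch below, on made-up timestamps and flows, shows how the two choices group the same readings differently.

```r
# Made-up readings; the timestamps are illustrative only.
stamps <- as.POSIXct(c('2015-10-23 00:24:06',
                       '2015-10-23 00:40:02',
                       '2015-10-23 01:10:00'), tz = 'UTC')
flow <- c(10, 20, 30)

hour_round <- as.POSIXct(round(stamps, 'hours'))   # nearest hour
hour_floor <- as.POSIXct(trunc(stamps, 'hours'))   # start of the containing hour

# Mean flow per hour under each convention.
by_round <- aggregate(flow, by = list(hour = hour_round), FUN = mean)
by_floor <- aggregate(flow, by = list(hour = hour_floor), FUN = mean)

by_round
by_floor
```

With rounding, the 00:40 reading is averaged with the 01:10 reading; with truncation, it is averaged with the 00:24 reading. The choice shifts the hourly series by up to half an hour, so it is worth stating explicitly in the report.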
Sum Up wf_p1 and wf_p2 WaterFlow Readings
wf_p <- wf_p1_mod %>%
rbind(wf_p2) %>%
group_by(`Date Time`) %>%
summarise(WaterFlow = sum(WaterFlow))
wf_p %>% arrange(desc(`Date Time`))
Observation on Data
A slightly decreasing trend is observed; the slowly decaying ACF, which stays above the critical limit, supports the presence of a trend.
No obvious seasonality is present according to the ACF and PACF;
No significant outliers are observed;
The data is non-stationary, so differencing is needed in the next step.
wf_ts <- ts(wf_p %>% select(WaterFlow), frequency = 24)
ggtsdisplay(wf_ts,
main = 'ggtsdisplay: Water Flow Readings from Oct 23rd to Dec 3rd of Year 2015')
Data Transformation
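A Box-Cox transformation is applied to stabilize the variance.
First-order differencing is performed.
The KPSS test statistic (0.0098) is well below the 5% critical value (0.463), so the null hypothesis of stationarity is not rejected, indicating that the transformed series is stationary.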
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 7 lags.
##
## Value of test-statistic is: 0.0098
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
Observation on Transformed Data
Trend effect is eliminated after differencing; Modeling with ARIMA is applicable with difference factor d = 1 and seasonal difference factor D = 0;
Decreasing non-seasonal lags in PACF and stable non-seasonal lags within the critical limit in ACF hint at AR order p = 0 and MA order q >= 1;
Multiple spikes at seasonal lags in ACF and stable seasonal lags within the critical limit in PACF hint at seasonal AR order P >= 1 and seasonal MA order Q = 0;
Suggested model: ARIMA(0,1,>=1)(>=1,0,0)[24].
ggtsdisplay(wf_ts_mod,
main = 'ggtsdisplay: Water Flow Readings from Oct 23rd to Dec 3rd of Year 2015 (Transformed)')
Build ARIMA Model
The auto.arima function is consistent with the suggested models ARIMA(0,1,>=1)(>=1,0,0)[24]. Both q and P are estimated to be 1.
Final Model: ARIMA(0,1,1)(1,0,0)[24].
## Series: wf_ts
## ARIMA(0,1,1)(1,0,0)[24]
## Box Cox transformation: lambda= 0.877383
##
## Coefficients:
## ma1 sar1
## -0.9577 0.0771
## s.e. 0.0104 0.0322
##
## sigma^2 estimated as 109.2: log likelihood=-3765.62
## AIC=7537.24 AICc=7537.27 BIC=7551.97
Forecast
One-week forecast
wf_ARIMA %>%
forecast(h=24*7) %>%
autoplot() +
labs(title = "Forecast for one weeks' Flow Readings",
x = 'Hours',
y = 'WaterFlow Reading')
Export Forecast to xlsx
wf_forecast <- wf_ARIMA %>% forecast(h=24*7) %>% .$mean %>%
data.frame() %>%
cbind(seq(max(wf_p$`Date Time`) + 60*60,max(wf_p$`Date Time`) + 60*60*24*7,by = 60*60))
names(wf_forecast) <- c('WaterFlow', 'Date Time')
wf_forecast <- wf_forecast %>% select('Date Time', 'WaterFlow')
wf_forecast
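The timestamp column attached to the forecast is an hourly sequence starting one hour after the last observation and covering 24 * 7 = 168 steps. A small base-R sketch of that construction follows; the last_obs value is an assumed placeholder, not the actual final timestamp in the data.

```r
# Assumed final observation time (placeholder for max(wf_p$`Date Time`)).
last_obs <- as.POSIXct('2015-12-03 16:00:00', tz = 'UTC')

# One timestamp per forecast step: the next 168 hours.
fc_times <- seq(last_obs + 3600, by = 'hour', length.out = 24 * 7)

range(fc_times)
length(fc_times)
```

Using length.out ties the number of timestamps directly to the forecast horizon, which guards against an off-by-one mismatch when the two are bound together.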