Introduction

As a restaurant owner, it is very important to make sure that every customer leaves with a high level of satisfaction. The food served is not the only important variable in the F&B business: variables such as the amount of ingredients to stock and the number of seats are also important for business continuity. The owner is aware of these issues and understands that problems may arise if they are not handled properly. One way to get ahead of these problems is to predict how many customers will come to the restaurant.

We will help the restaurant owner solve this problem! Using roughly the last two and a half months of transaction data, we will try to predict the number of customers for the next 7 days. We will also look for any seasonality that may help the owner's business in the future.

To get this output, we will create time series models that forecast the number of customers for the next 7 days. We will build several models and compare them based on their Mean Absolute Error.

Import Data

First, we will import the libraries that we need for the analysis.

# Data Aggregation
library(tidyr)
library(readr)
library(dplyr)
library(lubridate)
library(purrr)
library(zoo)
library(recipes)
library(imputeTS)

# Data Set
library(fpp2)

# model and evaluation
library(forecast)
library(TTR)
library(tseries)

# visualization
library(ggplot2)
library(TSstudio)
library(plotly)

options(scipen = 999)

Read Data

We will read the data that have been provided and store them in objects. We have been given two data sets: the train data contain the information we will use as the base of our model, and the test data will be used to evaluate our model.

# Read data train
fnb_clean <- read.csv("data_input/data-train.csv")

# Read data test
fnb_test <- read.csv("data_input/data-test.csv")

Based on the data set above, the transaction_date variable records the timestamp of each transaction created in the restaurant. Each transaction is recorded under a receipt_number that consists of one or more item_id values. Each receipt_number identifies a single customer who ordered food or drinks in the restaurant.
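Before preprocessing, it can help to glance at the structure of the raw train data. The sketch below assumes the columns described above (transaction_date, receipt_number, item_id) are present; glimpse() and n_distinct() come from dplyr, which is already loaded.

# Inspect the raw structure of the train data (column names and types)
glimpse(fnb_clean)

# Total number of distinct receipts, i.e. individual customers, in the raw data
n_distinct(fnb_clean$receipt_number)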

Data Preprocessing

Since our output is a prediction of the number of customers based on a time series model, our data must fulfill several requirements before the model can be created.

We will need our data to be:

  1. Arranged

  2. Aggregated

  3. No missing values

  4. Padded

  1. Arrange data

A time series model analyzes and predicts the outcome of a sequence over a specific range of time, which requires our data to be ordered from beginning to end.

# Arrange data
fnb_clean <- fnb_clean %>%
   arrange(transaction_date)
  2. Aggregate Data

To create a model, our data need to have the same interval from one observation to the next. For these data, we will create an hourly interval using the floor_date() function. We will also subset the crucial variables, transaction_date and receipt_number. The last thing we are going to do is match our train data with our test data.

This is our test data.

fnb_test

We will aggregate our data by grouping the train data by transaction_date and summarising it as the number of customers per hourly interval.

# Data Aggregation

fnb_clean <- fnb_clean %>%
   #filter(sales_type != "delivery") %>%
   select(transaction_date, receipt_number) %>%
   mutate(transaction_date = ymd_hms(transaction_date),
          transaction_date = floor_date(transaction_date, unit = "hour")) %>%
   group_by(transaction_date) %>%
   summarise(visitor = n_distinct(receipt_number)) %>%
      ungroup()
## `summarise()` ungrouping output (override with `.groups` argument)
fnb_clean

The owner of the restaurant informed us that the restaurant is open from 10:00 AM to 10:00 PM. Meanwhile, in the data frame, the recorded operational hours run from 10:00 AM to 01:00 AM, and on some days they start late, at 01:00 PM.

We need to fix the interval by padding the data and matching it with the information we got from the owner.
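As a quick sanity check before padding (a sketch using the aggregated data above), we can count how many observations fall on each hour of the day:

# Count observations per hour of the day to see which operational hours
# actually appear in the aggregated data
fnb_clean %>%
   mutate(operational_hour = hour(transaction_date)) %>%
   count(operational_hour)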

  3. Pad Data and check for NA values

By padding the data, we include every time point needed for our analysis. We will use the pad() function to pad our train data. The goal of this phase is to make sure that our data have the same interval from one point to another.

We also filter our data so that the records only cover the operational hours of the restaurant, which are 10:00 AM to 10:00 PM.

# Padding

library(padr)

set.seed(100)
fnb_clean <- fnb_clean %>%
   pad(start_val = ymd_hms("2017-12-01 00:00:00"),
       end_val = ymd_hms("2018-02-18 23:00:00")) %>%
   #na_interpolation(option = "stine") %>%
   #na_ma(k = 4, weighting = "exponential") %>%
   mutate(visitor = replace_na(visitor, 0)) %>%
   mutate(operational_hour = hour(transaction_date)) %>% 
   filter(operational_hour %in% c(10:22)) %>%
   select(-operational_hour)
## pad applied on the interval: hour
fnb_clean

Aside from padding, we will transform any NA values generated by pad() into another value, because some models cannot handle missing values.

For this case, we will change the NA values to 0.
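Replacing NA with 0 assumes that an hour with no receipts truly had zero visitors. If the gaps were instead believed to be missing records, interpolation with the already loaded imputeTS package would be an alternative; a minimal illustration on a toy vector:

# Illustration only (not applied to fnb_clean): interpolate gaps instead of zero-filling
na_interpolation(c(10, NA, NA, 16), option = "stine")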

The last thing to do in the data aggregation is to check whether we have any NA values left. We will use the is.na() function and check each of our columns.

# Check NA 

colSums(is.na(fnb_clean))
## transaction_date          visitor 
##                0                0
fnb_clean$visitor
##    [1]  0  0  0 16 38 27 29 44 50 66 70 63 63 10 17 18 32 21 40 36 36 41 68 61
##   [25] 62 54  7 13 20 35 23 30 31 44 55 66 47 49 54  5  9 10 13 12 15 19 27 20
##   [49] 39 54 57 52  7 14 13  9 12 16 22 23 35 71 57 54 35  4 14 28 22 29 25 26
##   [73] 30 35 47 48 58 33  2 10  9 14 16 21 20 34 40 60 57 53 41  0  0  0 13 19
##   [97] 24 29 26 45 43 46 54 44 14 21 32 30 30 26 31 32 27 41 74 65 66  8 10 18
##  [121] 16 22 21 25 37 41 50 62 56 46 14 13 13 18 14 18 16 17 26 27 51 48 43  9
##  [145] 12 20 14 13 15 26 27 36 46 52 48 40 12 11 26 25 20 20 21 26 34 58 61 41
##  [169] 40  8  6 18 21 14 23 21 23 30 47 42 62 37  0  0  0 11 22 17 26 21 44 57
##  [193] 40 51 49 13 14 18 27 30 23 27 20 26 51 64 59 43  4 15 23 22 21 36 33 42
##  [217] 39 48 56 58 40  8 11 17 18 11 22 23 28 37 54 61 56 42  7 15 17 13 17 15
##  [241] 21 29 30 42 54 49 39  3 10 16 16 25 17 17 22 33 35 48 49 44 11 11 26 12
##  [265] 18 17 24 26 26 52 32 43 32  0  0  0  4 24 28 17 21 27 39 60 38 52 13 16
##  [289] 16 18 19 25 24 29 27 55 47 60 63  5 11 18 21 17 19 16 11 15 41 68 70 37
##  [313]  5 10 20 32 21 21 22 28 33 31 64 51 41  3 16 12 18 26 18 27 28 33 53 40
##  [337] 42 40  9  5  3 12 10 20 23 19 42 39 47 31 39  2 10 16 17 17 23 15 26 31
##  [361] 34 56 58 34  0  0  0  6 22 16 18 24 32 57 61 52 46  4  8 30 26 21 21 29
##  [385] 38 34 37 66 61 41  5 11 17 24 21 14 21 28 19 38 27 51 63  4 10 31 29 18
##  [409] 23 33 39 37 64 56 58 48 11 18 18 21 24 25 27 31 49 57 54 38 42 12  7 16
##  [433] 25 28 26 13 18 30 61 46 47 36  5  6 19 10 24 17  9 14 26 47 46 51 37  0
##  [457]  0  0  8 21 26 17 25 33 46 65 42 44 11 18 28 26 35 18 38 36 54 53 59 67
##  [481] 51 10 19 17 30 28 37 39 34 29 52 56 47 41  9 12  8 12 21 17 25 24 28 38
##  [505] 51 44 35  9  5  9 17 21 20 18 20 22 45 43 46 44  8 12 12 16 31 22 21 19
##  [529] 32 52 48 40 41  9  9 21 21 21 18 15 10 21 40 44 50 37  0  0  0 13 12 16
##  [553] 14 30 33 53 63 41 37  9 11 22 29 26 24 27 32 44 53 63 61 41 13 11 28 19
##  [577] 26 41 29 43 29 36 43 48 40  6  7 19 17  9 25 24 20 15 34 50 42 39  4 18
##  [601] 21 12 23 24 23 27 28 37 47 41 51  6 14 20  9 18 17 28 28 21 44 49 47 37
##  [625] 13  7 13 16 14 28 10 19 28 51 52 38 32  0  0  0  8 12 15 20 14 30 42 54
##  [649] 60 48  5 15 29 29 20 22 26 27 37 57 66 58 58  7 19 28 42 31 33 41 45 37
##  [673] 52 61 59 42  6 10 15 17 13 17 14 25 28 48 43 37 36  9 10 15 18 19 24 21
##  [697] 35 27 45 30 46 34 12 12 19 10 19 16 24 27 20 31 43 52 35  5  7 18 22 17
##  [721] 18 14 29 33 62 61 60 44  0  0  0  7 20 21 16 16 37 43 65 56 58  5 12 22
##  [745] 25 35 29 33 39 34 54 80 61 50  8 16 21 30 32 32 33 36 41 60 57 50 47 14
##  [769] 11 22 15 20 16 17 25 33 42 44 46 33  5  7 11 19 18 21 19 24 26 39 49 48
##  [793] 42  7 19 20 16 23 20 14 27 31 43 64 46 40  9 10 24 11 14 19 20 19 34 50
##  [817] 58 53 43  0  0  0  3 19 22 20 24 38 46 69 57 49 16 22 28 27 30 31 32 38
##  [841] 38 47 76 61 48  6 19 23 29 38 41 48 36 47 59 83 65 44  5  5 15 10 17 22
##  [865]  8 18 31 24 40 44 37  5 10 18 15  9 15 16 21 27 36 37 45 43  9 16 18 13
##  [889] 14 16 30 28 31 39 45 36 45  9 20 18 15 19 23 28 18 23 43 57 61 47  0  0
##  [913]  0  2 15 19 14 19 34 56 61 64 50  7 23 26 22 28 35 26 34 43 56 67 74 59
##  [937]  6 21 29 26 29 33 25 45 47 60 70 57 53  6 13 20 21 14 22 25 17 24 42 52
##  [961] 49 42  7  9 13 19 17 25 28 25 30 38 45 46 36  6  8 15 11  8 11 17 20 18
##  [985] 33 49 62 49  6  4  6 13 14 14  7 15 19 30 35 40 42  0  0  0  3 21 24 34
## [1009] 29 28 61 70 59 52  2 11 10 24 13 33 31 23 32 55 65 61 67 15 16 21 25 25
## [1033] 27 34 26 31 56 67 49 41

The result above shows that there are no NA values left in our train data, so it can be used for the next steps of the analysis.

Cross Validation

Before creating the model, we will split our train data into two data sets: train data that the model will be trained on, and validation data that provide an unbiased evaluation of the model fit on the training data set.

source: http://tarangshah.com/blog/2017-12-03/train-validation-and-test-sets/

We will split our train data into two data sets named train and validation. Our validation data will consist of the last seven days of the restaurant's operational hours.

# Cross Validation

fnb_val <- tail(fnb_clean, 13*7)

fnb_val$visitor <- round(fnb_val$visitor)

fnb_train <- head(fnb_clean, nrow(fnb_clean) - 13*7)

fnb_train$visitor <- round(fnb_train$visitor)

# Plot data Train & Validation 

Plot1 <- fnb_train %>%
   ggplot(aes(x = transaction_date, y = visitor)) +
   geom_line() +
   scale_x_datetime(name = "Transaction Date", 
                    date_breaks = "2 weeks", 
                    expand = expansion(mult = 0.05, add = 0.05)) +
   scale_y_continuous(breaks = seq(0, 400, 50), expand = expansion(mult = 0.05, add = 0.05)) +
   geom_line(data = fnb_val, 
             aes(color = "red"), 
             show.legend = F)
  
ggplotly(Plot1)

The graphic above shows us the composition of our data:

  • train data run from 2017-12-01 to 2018-02-11
  • validation data run from 2018-02-12 to 2018-02-18

Create Time Series Object

To create a time series model, we need to create a time series object from our train data. The time series object will be based on the visitor column, as it is the variable we are going to predict, and we set the frequency to 13 since that is the number of operational hours of the restaurant per day.

# Create TS Object

fnb_ts <- ts(data = fnb_train$visitor,
            frequency = 13) # 13 operational hours per day (daily seasonality)

Next, we will decompose our time series object and inspect the trend and seasonality by plotting it with autoplot().

# Decompose TS Object

fnb_decomp <- fnb_ts %>%
   decompose()

fnb_decomp %>% autoplot()

As we can see from the plot, the trend still shows a pattern similar to the seasonal component, which indicates that there is another seasonality pattern that has not been captured. To overcome this, we will create a Multi-Seasonal Time Series (MSTS) object.

# Create MSTS Object based on train data

fnb_multi <- msts(fnb_train$visitor, 
                  seasonal.periods = c(13, # daily seasonality (13 operational hours)
                                       13*7)) # weekly seasonality
           
# Decompose

fnb_multi_decomp <- fnb_multi %>%
   mstl()

fnb_multi_decomp %>%
   tail(13*7*4) %>%
   autoplot()

From the MSTS plot, we can see that the trend is now smooth and the seasonality patterns are visible.

They consist of the daily and weekly patterns of the restaurant's operational hours.

Seasonality Analysis

df.fnb_multi <- as.data.frame(fnb_multi_decomp)

From the plots below, we can see that there are several seasonality patterns.

df.fnb_multi$Seasonal13 %>%
   tail(13*7) %>%
   msts(seasonal.periods = c(13)) %>%
   ts_plot()
  1. The daily seasonality pattern shows that, within the daily operational hours, the later in the day, the more visitors the restaurant receives.
df.fnb_multi$Seasonal91 %>%
   tail(13*7) %>%
   msts(seasonal.periods = c(13*7)) %>%
   ts_plot()
  2. The weekly seasonality pattern shows that the number of customers dips in the middle of the week, then rises continuously and surpasses the early days of the week by the weekend; a quick numeric check of this pattern is shown below.
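The sketch below puts rough numbers on the weekly pattern by averaging the hourly visitor counts of the train data per day of the week, independently of the decomposition:

# Average visitors per operational hour, grouped by day of week
# (a rough numeric check of the weekly seasonality)
fnb_train %>%
   mutate(day = wday(transaction_date, label = TRUE, abbr = FALSE)) %>%
   group_by(day) %>%
   summarise(avg_visitor = mean(visitor)) %>%
   arrange(day)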

Modelling

After successfully creating the MSTS object, we will fit it with several models. The models we will try are:

  • stlf() model
  • Multi-Seasonal ARIMA model
  • Multi-Seasonal Holt-Winter
  • TBATS Model
  1. Create a model with the stlf() function
# Create Model

stlf_fnb <- stlf(y = fnb_multi, h = 13*7)

# Forecast

f_stlf <- forecast(stlf_fnb, h = 13*7)

# Plot

fnb_multi %>%
   tail(13*7*4) %>%
   autoplot() +
   autolayer(f_stlf, series = "STLF Method", color = "orange")

  2. Create a model with Multi-Seasonal ARIMA

# Create Model

fnb_arima <- stlm(fnb_multi, method = "arima")

# Forecast

f_arima <- forecast(fnb_arima, h = 13*7)

# Plot

fnb_multi %>%
    tail(13*7*4) %>%
    autoplot() +
    autolayer(f_arima, series = "Multi-Seasonal ARIMA", color = "red")

  3. Create a model with Multi-Seasonal Holt-Winter

# Create Model

fnb_hw <- stlm(fnb_multi, method = "ets")

# Forecast

f_hw <- forecast(fnb_hw, h = 13*7)

# Plot

fnb_multi %>%
    tail(13*7*4) %>%
    autoplot() +
    autolayer(f_hw, series = "Multi-Seasonal Holt-Winter", color = "green")

  4. Create a model with TBATS

# Create Model

 tbats_mod <- fnb_multi %>%
   tbats(use.box.cox = F,
         use.trend = T,
         use.damped.trend = T)

# Forecast

 f_tbats <- forecast(tbats_mod, h = 13*7)

# Plot
 
 fnb_multi %>%
    tail(13*7*4) %>%
    autoplot() +
    autolayer(f_tbats, series = "TBATS Model", color = "blue")

# Compare accuracy

# stlf

accuracy(stlf_fnb, fnb_val$visitor)
##                       ME     RMSE      MAE  MPE MAPE      MASE      ACF1
## Training set -0.01769208 5.331640 4.058445  NaN  Inf 0.4076074 0.1382671
## Test set     -6.06338970 9.317976 7.664924 -Inf  Inf 0.7698218        NA
# Multi-Seasonal ARIMA

accuracy(f_arima, fnb_val$visitor)
##                      ME     RMSE      MAE MPE MAPE      MASE         ACF1
## Training set -0.0162468 5.186077 3.928382 NaN  Inf 0.3945446 0.0007599934
## Test set     -2.0839295 7.320859 5.647336 NaN  Inf 0.5671866           NA
# Multi-Seasonal Holt-Winter

accuracy(f_hw, fnb_val$visitor)
##                       ME     RMSE      MAE  MPE MAPE      MASE      ACF1
## Training set -0.01769208 5.331640 4.058445  NaN  Inf 0.4076074 0.1382671
## Test set     -6.06338970 9.317976 7.664924 -Inf  Inf 0.7698218        NA
# TBATS

accuracy(f_tbats, fnb_val$visitor)
##                      ME    RMSE      MAE  MPE MAPE      MASE        ACF1
## Training set -0.1608187 6.56482 5.037088  NaN  Inf 0.5058967 -0.00079954
## Test set     -6.1462594 9.43372 7.635236 -Inf  Inf 0.7668401          NA

To evaluate our models, we will judge them based on their Mean Absolute Error (MAE). We use MAE because it is commonly used in time series analysis and is easy to interpret: MAE measures the average magnitude of the errors in a set of predictions.

source : https://medium.com/human-in-a-machine-world/mae-and-rmse-which-metric-is-better-e60ac3bde13d
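As an illustration of what the MAE column of accuracy() reports, the same number can be computed by hand as the mean absolute difference between the forecast and the validation values; a sketch for the Multi-Seasonal ARIMA forecast:

# MAE computed by hand: average absolute error of the ARIMA forecast
# against the validation data (should match the Test set MAE above)
mean(abs(as.numeric(f_arima$mean) - fnb_val$visitor))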

From the MAE values, we can conclude that our Multi-Seasonal ARIMA model has the lowest error among the models, with a test set MAE of 5.647336.

Therefore, we will use the Multi-Seasonal ARIMA model as our prediction model and check the assumptions on its residuals.

Assumption

There are two kinds of assumptions that will be checked on our model residuals:

  1. Autocorrelation
  2. Normality

We will use the Ljung-Box test for autocorrelation and the Shapiro-Wilk test for normality. We can conclude that we have a good and optimal model if the residuals fulfill these assumptions.
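As an optional shortcut, the forecast package can run the Ljung-Box test and plot the residual ACF and histogram in a single call with checkresiduals(); we still run the individual tests below.

# One-call residual diagnostics: Ljung-Box test plus ACF and histogram plots
checkresiduals(fnb_arima$residuals, lag = 2*13*7)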

  1. Autocorrelation Test
# Residual Autocorrelation

Box.test(fnb_arima$residuals, type = "Ljung-Box",
         lag = 2*13*7)
## 
##  Box-Ljung test
## 
## data:  fnb_arima$residuals
## X-squared = 347.4, df = 182, p-value = 0.000000000001927

p-value > 0.05 : No autocorrelation in the residuals

p-value < 0.05 : There is autocorrelation in the residuals

Based on the result, our p-value is lower than 0.05, which means there is autocorrelation in our model residuals.

  2. Normality Test
# Normality

shapiro.test(fnb_arima$residuals)
## 
##  Shapiro-Wilk normality test
## 
## data:  fnb_arima$residuals
## W = 0.99028, p-value = 0.00000643

p-value > 0.05 : Residuals are normally distributed

p-value < 0.05 : Residuals are not normally distributed

Based on the result, our p-value is lower than 0.05, which means our residuals are not normally distributed.

Model Tuning

The assumption checks show that our model's residuals failed to fulfill the assumptions, which means that our model is not yet optimal and can be improved.

To better fulfill these assumptions, we will try to transform our data. Using the recipes package, we will take the square root of the target and then center and scale it.

# Data transformation

rec <- recipe(visitor ~., data = fnb_train) %>%
   step_sqrt(all_outcomes()) %>% 
   step_center(all_outcomes()) %>%
   step_scale(all_outcomes()) %>%
   prep()

# Transform the train data

fnb_train_rec <- juice(rec)


# Transform the validation data

fnb_val_rec <- bake(rec, fnb_val)

We can see below that our visitor column has been transformed into its square-root value and scaled. This data will be used as our new train data.

tail(fnb_train_rec, 3)

We are also going to create a revert function, which will revert our forecast values so they can be correctly compared with our validation data.

# Revert Function

rec_rev <- function(x, rec) {
   
   # Retrieve the centering mean (step 2) and scaling sd (step 3) learned by the recipe
   means <- rec$steps[[2]]$means[["visitor"]]
   sds <- rec$steps[[3]]$sds[["visitor"]]
   
   # Undo scaling and centering, then square to undo the square-root transform
   x <- (x * sds + means)^2
   x <- round(x)
   # Clip negative predictions to zero, since visitor counts cannot be negative
   x <- ifelse(x < 0, 0, x)
   
   #x <- exp(x) if a log transform were used instead of the square root
   
   x
   
}
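As a quick sanity check of the revert function (a sketch), applying it to the transformed training values should reproduce the original visitor counts, so the expression below is expected to return TRUE:

# Reverting the transformed train data should give back the original counts (expected TRUE)
all(rec_rev(fnb_train_rec$visitor, rec = rec) == fnb_train$visitor)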

Create Time Series Object based on Transformed Data

We will create a new MSTS object based on the train_rec data. This time we will set daily, weekly, and monthly (4-week) seasonal periods.

# Create MSTS object based on data train and val rec

fnb_multi_rec <- msts(fnb_train_rec$visitor, 
                  seasonal.periods = c(13, # daily seasonality (13 operational hours)
                                       13*7, # weekly seasonality
                                       13*7*4)) # monthly (4-week) seasonality
                                                         
# Decompose

fnb_multi_decomp_rec <- fnb_multi_rec %>%
   mstl()

fnb_multi_decomp_rec %>%
   tail(13*7*4) %>%
   autoplot()

Modelling based on Transformed Data

As we have already chosen Multi-Seasonal ARIMA as our prediction model, we will create another Multi-Seasonal ARIMA model, this time based on our new MSTS object.

# Create a Model

fnb_arima_rec <- stlm(fnb_multi_rec, method = "arima")

# Forecast

f_arima_rec <- forecast(fnb_arima_rec, h = 13*7)

# Revert the transformed value

rec_rev_arima <- rec_rev(f_arima_rec$mean, rec = rec)

# Check the accuracy

accuracy(rec_rev_arima, fnb_val$visitor)
##                ME     RMSE      MAE       MPE     MAPE
## Test set -1.10989 7.113677 5.417582 -18.16312 34.27183
# Plot

fnb_multi_rec %>%
   tail(13*7*4) %>%
   autoplot() +
   autolayer(f_arima_rec, series = "Multi-Seasonal ARIMA", color = "red")

Based on the model evaluation, the MAE value of this model is 5.417582, which is lower than our previous model's. We will now check the assumptions on this model's residuals.

Assumption

We will check the autocorrelation and normality of the new model's residuals with the same tests we used previously.

  1. Autocorrelation test
# Residual Autocorrelation
   
Box.test(fnb_arima_rec$residuals, type = "Ljung-Box",
         lag = 2*13*7)
## 
##  Box-Ljung test
## 
## data:  fnb_arima_rec$residuals
## X-squared = 478.5, df = 182, p-value < 0.00000000000000022

p-value > 0.05 : No autocorrelation in the residuals

p-value < 0.05 : There is autocorrelation in the residuals

Based on the result, our p-value is lower than 0.05, which means there is autocorrelation in our model residuals.

  2. Normality Test
# Normality

shapiro.test(fnb_arima_rec$residuals)
## 
##  Shapiro-Wilk normality test
## 
## data:  fnb_arima_rec$residuals
## W = 0.99859, p-value = 0.6576

p-value > 0.05 : Residuals are normally distributed

p-value < 0.05 : Residuals are not normally distributed

Based on the result, our p-value is higher than 0.05, which means our residuals are normally distributed.

Since the autocorrelation assumption is still not fulfilled, the model can be tuned further to become more optimal. This suggests that there are other variables that could be used to build the model, and that there is dependency between period t and period t-1.
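As a starting point for further tuning (a sketch, not used below), we can inspect the residual ACF to see which lags still carry autocorrelation:

# Inspect which lags still show autocorrelation in the tuned model's residuals
acf(as.numeric(fnb_arima_rec$residuals), lag.max = 13*7,
    main = "Residual ACF - Multi-Seasonal ARIMA (transformed)")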

Predict on Test Data

# Forecast the target using the chosen model
# (the reverted values rec_rev_arima computed above are used for the submission)
forecast_mod <- forecast(fnb_arima_rec, h = 13*7)

# Create submission data
submission <- fnb_test %>% 
  mutate(visitor = rec_rev_arima)

# save data
write.csv(submission, "submission-arif.csv", row.names = F)

# check first 3 data
head(submission, 3)
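As a final sanity check (a sketch), the submission should contain exactly one non-negative forecast value per row of the test data, so both expressions below are expected to return TRUE:

# One forecast value per test row, and no negative visitor counts
length(rec_rev_arima) == nrow(fnb_test)
all(submission$visitor >= 0)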

Daily Interval

submission_weekday <- submission %>% 
  mutate(datetime = ymd_hms(datetime),   # parse the datetime before extracting the date
         Date = date(datetime)) %>% 
  group_by(Date) %>% 
  summarise(Visitor = sum(visitor)) %>% 
  mutate(Days = wday(Date, label = T, abbr = F)) 
## `summarise()` ungrouping output (override with `.groups` argument)
plot_weekday <- submission_weekday %>% 
  ggplot(aes(x = Days,
             y = Visitor)) +
   geom_point()+
   theme_minimal()+
  scale_y_continuous(limits = c(100, 450))

ggplotly(plot_weekday)

Hourly Interval

submission_hour <- submission %>% 
   mutate(datetime = ymd_hms(datetime)) %>% 
   mutate(hour = hour(datetime)) %>% 
   select(-datetime) %>% 
   group_by(hour) %>% 
   summarise(visitor = sum(visitor)) %>%   # total forecast visitors per hour across the 7 days
   arrange(hour) 
## `summarise()` ungrouping output (override with `.groups` argument)
plot_hour <- submission_hour %>% 
   ggplot(mapping = aes(x = hour, y = visitor))+
   geom_line(aes(color = "red"), show.legend = F)+
   theme_minimal()+
   labs(title = "Visitor per Hour", 
        x = "Hour", 
        y = "visitor")+
   scale_x_continuous(limits = c(10,22), breaks = seq(0,22))

ggplotly(plot_hour)

According to the plot above, the highest number of visitors is expected in the afternoon and evening, from 15:00 to 22:00.