Hi !! Welcome to my Rmd for data visualization. This time, I am looking dataset from external source (kaggle). Please enjoy it.
Love Bike is the middle level company, which working at rental bike bussiness. The company give two kind service, such as : - hour rental bike - daily rental bike
The company hope to develop their business. So the management want to analize data : - What time the company get the highest visitors / customers (hour)? - What weekdays the company get the highest visitors / customers (day)? - Is the company need to focus adding the employee on time above ? - Is the company need do other improvement?
This is set up library
library(lubridate)
library(ggplot2)
library(tidyverse)
library(ggpubr)
library(scales)
library(glue)
library(plotly)rent_day <- read.csv("data_input/day.csv")
rent_hour <- read.csv("data_input/hour.csv")#check data
#head(rent_day)
tail(rent_day)#check data
#head(rent_hour)
tail(rent_hour)# check data
dim(rent_hour)#> [1] 17379 17
names(rent_hour)#> [1] "instant" "dteday" "season" "yr" "mnth"
#> [6] "hr" "holiday" "weekday" "workingday" "weathersit"
#> [11] "temp" "atemp" "hum" "windspeed" "casual"
#> [16] "registered" "cnt"
dim(rent_day)#> [1] 731 16
names(rent_day)#> [1] "instant" "dteday" "season" "yr" "mnth"
#> [6] "holiday" "weekday" "workingday" "weathersit" "temp"
#> [11] "atemp" "hum" "windspeed" "casual" "registered"
#> [16] "cnt"
========================================= Dataset characteristics =========================================
Both hour.csv and day.csv have the following fields, except hr which is not available in day.csv
- instant : record index
- dteday : date
- season : season (1:springer, 2:summer, 3:fall, 4:winter)
- yr : year (0: 2011, 1:2012)
- mnth : month ( 1 to 12)
- hr : hour (0 to 23)
- holiday : weather day is holiday or not (extracted from http://dchr.dc.gov/page/holiday-schedule)
- weekday : day of the week
- workingday : if day is neither weekend nor holiday is 1, otherwise is 0.
- weathersit :
+ 1: Clear, Few clouds, Partly cloudy, Partly cloudy
+ 2: Mist + Cloudy, Mist + Broken clouds, Mist + Few clouds, Mist
+ 3: Light Snow, Light Rain + Thunderstorm + Scattered clouds, Light Rain + Scattered clouds
+ 4: Heavy Rain + Ice Pallets + Thunderstorm + Mist, Snow + Fog
- temp : Normalized temperature in Celsius. The values are divided to 41 (max)
- atemp : Normalized feeling temperature in Celsius. The values are divided to 50 (max)
- hum : Normalized humidity. The values are divided to 100 (max)
- windspeed : Normalized wind speed. The values are divided to 67 (max)
- casual : count of casual users
- registered : count of registered users
- cnt : count of total rental bikes including both casual and registered
Check data type for each column using the str() function.
The str() function will do a sanity check on the structure and show sample data for each variable.
rent_hour$dteday <- ymd(rent_hour$dteday)
rent_day$dteday <- ymd(rent_day$dteday)str(rent_hour)#> 'data.frame': 17379 obs. of 17 variables:
#> $ instant : int 1 2 3 4 5 6 7 8 9 10 ...
#> $ dteday : Date, format: "2011-01-01" "2011-01-01" ...
#> $ season : int 1 1 1 1 1 1 1 1 1 1 ...
#> $ yr : int 0 0 0 0 0 0 0 0 0 0 ...
#> $ mnth : int 1 1 1 1 1 1 1 1 1 1 ...
#> $ hr : int 0 1 2 3 4 5 6 7 8 9 ...
#> $ holiday : int 0 0 0 0 0 0 0 0 0 0 ...
#> $ weekday : int 6 6 6 6 6 6 6 6 6 6 ...
#> $ workingday: int 0 0 0 0 0 0 0 0 0 0 ...
#> $ weathersit: int 1 1 1 1 1 2 1 1 1 1 ...
#> $ temp : num 0.24 0.22 0.22 0.24 0.24 0.24 0.22 0.2 0.24 0.32 ...
#> $ atemp : num 0.288 0.273 0.273 0.288 0.288 ...
#> $ hum : num 0.81 0.8 0.8 0.75 0.75 0.75 0.8 0.86 0.75 0.76 ...
#> $ windspeed : num 0 0 0 0 0 0.0896 0 0 0 0 ...
#> $ casual : int 3 8 5 3 0 0 2 1 1 8 ...
#> $ registered: int 13 32 27 10 1 1 0 2 7 6 ...
#> $ cnt : int 16 40 32 13 1 1 2 3 8 14 ...
rent_hour$yr <- as.factor(rent_hour$yr)
rent_hour$holiday <- as.factor(rent_hour$holiday)
rent_hour$workingday <- as.factor(rent_hour$workingday)
str(rent_hour)#> 'data.frame': 17379 obs. of 17 variables:
#> $ instant : int 1 2 3 4 5 6 7 8 9 10 ...
#> $ dteday : Date, format: "2011-01-01" "2011-01-01" ...
#> $ season : int 1 1 1 1 1 1 1 1 1 1 ...
#> $ yr : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
#> $ mnth : int 1 1 1 1 1 1 1 1 1 1 ...
#> $ hr : int 0 1 2 3 4 5 6 7 8 9 ...
#> $ holiday : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
#> $ weekday : int 6 6 6 6 6 6 6 6 6 6 ...
#> $ workingday: Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
#> $ weathersit: int 1 1 1 1 1 2 1 1 1 1 ...
#> $ temp : num 0.24 0.22 0.22 0.24 0.24 0.24 0.22 0.2 0.24 0.32 ...
#> $ atemp : num 0.288 0.273 0.273 0.288 0.288 ...
#> $ hum : num 0.81 0.8 0.8 0.75 0.75 0.75 0.8 0.86 0.75 0.76 ...
#> $ windspeed : num 0 0 0 0 0 0.0896 0 0 0 0 ...
#> $ casual : int 3 8 5 3 0 0 2 1 1 8 ...
#> $ registered: int 13 32 27 10 1 1 0 2 7 6 ...
#> $ cnt : int 16 40 32 13 1 1 2 3 8 14 ...
str(rent_day)#> 'data.frame': 731 obs. of 16 variables:
#> $ instant : int 1 2 3 4 5 6 7 8 9 10 ...
#> $ dteday : Date, format: "2011-01-01" "2011-01-02" ...
#> $ season : int 1 1 1 1 1 1 1 1 1 1 ...
#> $ yr : int 0 0 0 0 0 0 0 0 0 0 ...
#> $ mnth : int 1 1 1 1 1 1 1 1 1 1 ...
#> $ holiday : int 0 0 0 0 0 0 0 0 0 0 ...
#> $ weekday : int 6 0 1 2 3 4 5 6 0 1 ...
#> $ workingday: int 0 0 1 1 1 1 1 0 0 1 ...
#> $ weathersit: int 2 2 1 1 1 1 2 2 1 1 ...
#> $ temp : num 0.344 0.363 0.196 0.2 0.227 ...
#> $ atemp : num 0.364 0.354 0.189 0.212 0.229 ...
#> $ hum : num 0.806 0.696 0.437 0.59 0.437 ...
#> $ windspeed : num 0.16 0.249 0.248 0.16 0.187 ...
#> $ casual : int 331 131 120 108 82 88 148 68 54 41 ...
#> $ registered: int 654 670 1229 1454 1518 1518 1362 891 768 1280 ...
#> $ cnt : int 985 801 1349 1562 1600 1606 1510 959 822 1321 ...
rent_day$yr <- as.factor(rent_day$yr)
rent_day$holiday <- as.factor(rent_day$holiday)
rent_day$workingday <- as.factor(rent_day$workingday)
str(rent_day)#> 'data.frame': 731 obs. of 16 variables:
#> $ instant : int 1 2 3 4 5 6 7 8 9 10 ...
#> $ dteday : Date, format: "2011-01-01" "2011-01-02" ...
#> $ season : int 1 1 1 1 1 1 1 1 1 1 ...
#> $ yr : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
#> $ mnth : int 1 1 1 1 1 1 1 1 1 1 ...
#> $ holiday : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
#> $ weekday : int 6 0 1 2 3 4 5 6 0 1 ...
#> $ workingday: Factor w/ 2 levels "0","1": 1 1 2 2 2 2 2 1 1 2 ...
#> $ weathersit: int 2 2 1 1 1 1 2 2 1 1 ...
#> $ temp : num 0.344 0.363 0.196 0.2 0.227 ...
#> $ atemp : num 0.364 0.354 0.189 0.212 0.229 ...
#> $ hum : num 0.806 0.696 0.437 0.59 0.437 ...
#> $ windspeed : num 0.16 0.249 0.248 0.16 0.187 ...
#> $ casual : int 331 131 120 108 82 88 148 68 54 41 ...
#> $ registered: int 654 670 1229 1454 1518 1518 1362 891 768 1280 ...
#> $ cnt : int 985 801 1349 1562 1600 1606 1510 959 822 1321 ...
We can use colSums() and anyNA() function to check if there is missing value in our data.
colSums(is.na(rent_hour))#> instant dteday season yr mnth hr holiday
#> 0 0 0 0 0 0 0
#> weekday workingday weathersit temp atemp hum windspeed
#> 0 0 0 0 0 0 0
#> casual registered cnt
#> 0 0 0
colSums(is.na(rent_day))#> instant dteday season yr mnth holiday weekday
#> 0 0 0 0 0 0 0
#> workingday weathersit temp atemp hum windspeed casual
#> 0 0 0 0 0 0 0
#> registered cnt
#> 0 0
colSums(is.na()) is for check how many missing value in our data. If the result is zero (0), so our data have no missing value.
anyNA(rent_hour)#> [1] FALSE
anyNA(rent_day)#> [1] FALSE
anyNA() is for check missing value in our data. If the result is “FALSE”, so our data have no missing value / NA.
Now, rent_hour and rent_day dataset is ready to be processed and analyzed.
The summary() function will do a sanity check on the structure and show IQR and class data for each variable.
summary(rent_hour)#> instant dteday season yr mnth
#> Min. : 1 Min. :2011-01-01 Min. :1.000 0:8645 Min. : 1.000
#> 1st Qu.: 4346 1st Qu.:2011-07-04 1st Qu.:2.000 1:8734 1st Qu.: 4.000
#> Median : 8690 Median :2012-01-02 Median :3.000 Median : 7.000
#> Mean : 8690 Mean :2012-01-02 Mean :2.502 Mean : 6.538
#> 3rd Qu.:13034 3rd Qu.:2012-07-02 3rd Qu.:3.000 3rd Qu.:10.000
#> Max. :17379 Max. :2012-12-31 Max. :4.000 Max. :12.000
#> hr holiday weekday workingday weathersit
#> Min. : 0.00 0:16879 Min. :0.000 0: 5514 Min. :1.000
#> 1st Qu.: 6.00 1: 500 1st Qu.:1.000 1:11865 1st Qu.:1.000
#> Median :12.00 Median :3.000 Median :1.000
#> Mean :11.55 Mean :3.004 Mean :1.425
#> 3rd Qu.:18.00 3rd Qu.:5.000 3rd Qu.:2.000
#> Max. :23.00 Max. :6.000 Max. :4.000
#> temp atemp hum windspeed
#> Min. :0.020 Min. :0.0000 Min. :0.0000 Min. :0.0000
#> 1st Qu.:0.340 1st Qu.:0.3333 1st Qu.:0.4800 1st Qu.:0.1045
#> Median :0.500 Median :0.4848 Median :0.6300 Median :0.1940
#> Mean :0.497 Mean :0.4758 Mean :0.6272 Mean :0.1901
#> 3rd Qu.:0.660 3rd Qu.:0.6212 3rd Qu.:0.7800 3rd Qu.:0.2537
#> Max. :1.000 Max. :1.0000 Max. :1.0000 Max. :0.8507
#> casual registered cnt
#> Min. : 0.00 Min. : 0.0 Min. : 1.0
#> 1st Qu.: 4.00 1st Qu.: 34.0 1st Qu.: 40.0
#> Median : 17.00 Median :115.0 Median :142.0
#> Mean : 35.68 Mean :153.8 Mean :189.5
#> 3rd Qu.: 48.00 3rd Qu.:220.0 3rd Qu.:281.0
#> Max. :367.00 Max. :886.0 Max. :977.0
summary(rent_day)#> instant dteday season yr mnth
#> Min. : 1.0 Min. :2011-01-01 Min. :1.000 0:365 Min. : 1.00
#> 1st Qu.:183.5 1st Qu.:2011-07-02 1st Qu.:2.000 1:366 1st Qu.: 4.00
#> Median :366.0 Median :2012-01-01 Median :3.000 Median : 7.00
#> Mean :366.0 Mean :2012-01-01 Mean :2.497 Mean : 6.52
#> 3rd Qu.:548.5 3rd Qu.:2012-07-01 3rd Qu.:3.000 3rd Qu.:10.00
#> Max. :731.0 Max. :2012-12-31 Max. :4.000 Max. :12.00
#> holiday weekday workingday weathersit temp
#> 0:710 Min. :0.000 0:231 Min. :1.000 Min. :0.05913
#> 1: 21 1st Qu.:1.000 1:500 1st Qu.:1.000 1st Qu.:0.33708
#> Median :3.000 Median :1.000 Median :0.49833
#> Mean :2.997 Mean :1.395 Mean :0.49538
#> 3rd Qu.:5.000 3rd Qu.:2.000 3rd Qu.:0.65542
#> Max. :6.000 Max. :3.000 Max. :0.86167
#> atemp hum windspeed casual
#> Min. :0.07907 Min. :0.0000 Min. :0.02239 Min. : 2.0
#> 1st Qu.:0.33784 1st Qu.:0.5200 1st Qu.:0.13495 1st Qu.: 315.5
#> Median :0.48673 Median :0.6267 Median :0.18097 Median : 713.0
#> Mean :0.47435 Mean :0.6279 Mean :0.19049 Mean : 848.2
#> 3rd Qu.:0.60860 3rd Qu.:0.7302 3rd Qu.:0.23321 3rd Qu.:1096.0
#> Max. :0.84090 Max. :0.9725 Max. :0.50746 Max. :3410.0
#> registered cnt
#> Min. : 20 Min. : 22
#> 1st Qu.:2497 1st Qu.:3152
#> Median :3662 Median :4548
#> Mean :3656 Mean :4504
#> 3rd Qu.:4776 3rd Qu.:5956
#> Max. :6946 Max. :8714
It works well, but a more complete function is the skim() function from the “skimr” package. It breaks down the variables by type with relevant summary information, PLUS a small histogram for each numeric variable.
library(skimr)
skim(rent_hour)| Name | rent_hour |
| Number of rows | 17379 |
| Number of columns | 17 |
| _______________________ | |
| Column type frequency: | |
| Date | 1 |
| factor | 3 |
| numeric | 13 |
| ________________________ | |
| Group variables | None |
Variable type: Date
| skim_variable | n_missing | complete_rate | min | max | median | n_unique |
|---|---|---|---|---|---|---|
| dteday | 0 | 1 | 2011-01-01 | 2012-12-31 | 2012-01-02 | 731 |
Variable type: factor
| skim_variable | n_missing | complete_rate | ordered | n_unique | top_counts |
|---|---|---|---|---|---|
| yr | 0 | 1 | FALSE | 2 | 1: 8734, 0: 8645 |
| holiday | 0 | 1 | FALSE | 2 | 0: 16879, 1: 500 |
| workingday | 0 | 1 | FALSE | 2 | 1: 11865, 0: 5514 |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| instant | 0 | 1 | 8690.00 | 5017.03 | 1.00 | 4345.50 | 8690.00 | 13034.50 | 17379.00 | ▇▇▇▇▇ |
| season | 0 | 1 | 2.50 | 1.11 | 1.00 | 2.00 | 3.00 | 3.00 | 4.00 | ▇▇▁▇▇ |
| mnth | 0 | 1 | 6.54 | 3.44 | 1.00 | 4.00 | 7.00 | 10.00 | 12.00 | ▇▆▆▅▇ |
| hr | 0 | 1 | 11.55 | 6.91 | 0.00 | 6.00 | 12.00 | 18.00 | 23.00 | ▇▇▆▇▇ |
| weekday | 0 | 1 | 3.00 | 2.01 | 0.00 | 1.00 | 3.00 | 5.00 | 6.00 | ▇▃▃▃▇ |
| weathersit | 0 | 1 | 1.43 | 0.64 | 1.00 | 1.00 | 1.00 | 2.00 | 4.00 | ▇▃▁▁▁ |
| temp | 0 | 1 | 0.50 | 0.19 | 0.02 | 0.34 | 0.50 | 0.66 | 1.00 | ▂▇▇▇▁ |
| atemp | 0 | 1 | 0.48 | 0.17 | 0.00 | 0.33 | 0.48 | 0.62 | 1.00 | ▁▆▇▆▁ |
| hum | 0 | 1 | 0.63 | 0.19 | 0.00 | 0.48 | 0.63 | 0.78 | 1.00 | ▁▃▇▇▆ |
| windspeed | 0 | 1 | 0.19 | 0.12 | 0.00 | 0.10 | 0.19 | 0.25 | 0.85 | ▇▆▂▁▁ |
| casual | 0 | 1 | 35.68 | 49.31 | 0.00 | 4.00 | 17.00 | 48.00 | 367.00 | ▇▁▁▁▁ |
| registered | 0 | 1 | 153.79 | 151.36 | 0.00 | 34.00 | 115.00 | 220.00 | 886.00 | ▇▃▁▁▁ |
| cnt | 0 | 1 | 189.46 | 181.39 | 1.00 | 40.00 | 142.00 | 281.00 | 977.00 | ▇▃▁▁▁ |
skim(rent_day)| Name | rent_day |
| Number of rows | 731 |
| Number of columns | 16 |
| _______________________ | |
| Column type frequency: | |
| Date | 1 |
| factor | 3 |
| numeric | 12 |
| ________________________ | |
| Group variables | None |
Variable type: Date
| skim_variable | n_missing | complete_rate | min | max | median | n_unique |
|---|---|---|---|---|---|---|
| dteday | 0 | 1 | 2011-01-01 | 2012-12-31 | 2012-01-01 | 731 |
Variable type: factor
| skim_variable | n_missing | complete_rate | ordered | n_unique | top_counts |
|---|---|---|---|---|---|
| yr | 0 | 1 | FALSE | 2 | 1: 366, 0: 365 |
| holiday | 0 | 1 | FALSE | 2 | 0: 710, 1: 21 |
| workingday | 0 | 1 | FALSE | 2 | 1: 500, 0: 231 |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| instant | 0 | 1 | 366.00 | 211.17 | 1.00 | 183.50 | 366.00 | 548.50 | 731.00 | ▇▇▇▇▇ |
| season | 0 | 1 | 2.50 | 1.11 | 1.00 | 2.00 | 3.00 | 3.00 | 4.00 | ▇▇▁▇▇ |
| mnth | 0 | 1 | 6.52 | 3.45 | 1.00 | 4.00 | 7.00 | 10.00 | 12.00 | ▇▅▅▅▇ |
| weekday | 0 | 1 | 3.00 | 2.00 | 0.00 | 1.00 | 3.00 | 5.00 | 6.00 | ▇▃▃▃▇ |
| weathersit | 0 | 1 | 1.40 | 0.54 | 1.00 | 1.00 | 1.00 | 2.00 | 3.00 | ▇▁▅▁▁ |
| temp | 0 | 1 | 0.50 | 0.18 | 0.06 | 0.34 | 0.50 | 0.66 | 0.86 | ▂▇▇▇▅ |
| atemp | 0 | 1 | 0.47 | 0.16 | 0.08 | 0.34 | 0.49 | 0.61 | 0.84 | ▂▇▆▇▂ |
| hum | 0 | 1 | 0.63 | 0.14 | 0.00 | 0.52 | 0.63 | 0.73 | 0.97 | ▁▁▆▇▂ |
| windspeed | 0 | 1 | 0.19 | 0.08 | 0.02 | 0.13 | 0.18 | 0.23 | 0.51 | ▃▇▅▁▁ |
| casual | 0 | 1 | 848.18 | 686.62 | 2.00 | 315.50 | 713.00 | 1096.00 | 3410.00 | ▇▆▂▁▁ |
| registered | 0 | 1 | 3656.17 | 1560.26 | 20.00 | 2497.00 | 3662.00 | 4776.50 | 6946.00 | ▂▅▇▅▃ |
| cnt | 0 | 1 | 4504.35 | 1937.21 | 22.00 | 3152.00 | 4548.00 | 5956.00 | 8714.00 | ▂▅▇▅▃ |
Summary : A. data :rent_hour 1. casual => sd value = 49.30 and mean = 35.70 2. registered => sd value = 151.00 and mean = 154.00 3. cnt => sd value = 181.00 and mean = 189.00
B. data :rent_day 1. casual => sd value = 687.00 and mean = 848.00 2. registered => sd value = 1560.00 and mean = 3656.00 3. cnt => sd value = 1937.00 and mean = 4504.00
boxplot(rent_hour$casual, rent_hour$registered, rent_hour$cnt , main="Boxplot for Hour Rental Bike (continuos var)")boxplot(rent_day$casual, rent_day$registered, rent_day$cnt , main="Boxplot for Daily Rental Bike (continuos var)")From result above, we find posibilities for the outliers, but from our calculation, sd value in cnt is not far above mean value (in my oppinion its still be tolerated), so the process may continue.
rent_hour and rent_day!rent_hour$year <- year(rent_hour$dteday)
rent_hour$weekdays <- wday(rent_hour$dteday, # data
label = T, # nama (T)/digit (F)
abbr = F, # disingkat/tidak
week_start = 1, # minggu dimulai di hari ke berapa (Senin = 1)
locale = "IND") # penggantian bahasa
head(rent_hour)rent_day$year <- year(rent_day$dteday)
rent_day$weekdays <- wday(rent_day$dteday, # data
label = T, # nama (T)/digit (F)
abbr = F, # disingkat/tidak
week_start = 1, # minggu dimulai di hari ke berapa (Senin = 1)
locale = "IND") # penggantian bahasa
head(rent_day)wday <- aggregate(cnt ~ year + weekdays,
data = rent_day,
FUN = mean)
wday#Plotting
g1 <- ggplot(data = wday, mapping = aes(x = cnt, y =reorder(weekdays,cnt) ),
text = glue("Total Rent Bike by {weekdays}: {(cnt)}")) +
geom_col(aes(fill = year), position = "dodge") +
labs(x = "Weekdays", y = NULL,
fill = NULL,
title = "Top Visitors Rental Bike by Weekdays") +
facet_wrap(~year,
scales = "free_y")+
theme_minimal()+
theme(axis.text.x = element_text(angle = 90,
hjust = 1))
g1ggplotly(g1, tooltip = "text")Insight : - On the year 2011, “Love Bike” got highest visitors on Jumat (3500.115), Selasa (3468.038) and Senin (3465.788) - On the year 2012, “Love Bike” got highest visitors on Kamis (5977.750), Jumat (5880.462) dan Rabu (5843.827)
hhour <- aggregate(cnt ~ year + hr,
data = rent_hour,
FUN = mean)
hhour# Plotting
g2 <- ggplot(data = hhour, mapping = aes(x = hr, y = cnt)) +
geom_line(aes(group = year,
col = year)) +
geom_point(aes(col = year)) +
labs(x = "Hours", y = NULL,
fill = NULL,
title = "Top Visitors Rental Bike by Hours") +
scale_x_continuous(breaks = seq(0,23,1)) +
theme_minimal()+
theme(axis.text.x = element_text(angle = 90,
hjust = 1))
g2Insight : - On the year 2011, “Love Bike” got highest visitors at hour : 17:00 (349.676712), 18:00 (322.272727) and 08:00 (262.994490)
Based on the plotting on above, we can recommend : 1) “Love Bike” can increase the number of employees to working on day : Kamis, Jumat and Rabu at hour : 08:00, 17:00 and 18:00.
We hope that the analize data can help the improvement of both income and service at “Love Bike” company.
I get data rent_day and rent_hour from https://www.kaggle.com/lakshmi25npathi/bike-rental-count-prediction-using-python/data?select=hour.csv