27/10/2024- 19-55 pm

Case Study 1

Case Study: How Does a Bike-Share Navigate Speedy Success?

Introduction

Welcome to the Cyclistic bike-share analysis case study! In this case study, you will perform many real-world tasks of a junior data analyst. You will work for a fictional company, Cyclistic, and meet different characters and team members. In order to answer the key business questions, you will follow the steps of the data analysis process: ask, prepare, process, analyze, share, and act. Along the way, the Case Study Roadmap tables — including guiding questions and key tasks — will help you stay on the right path. By the end of this lesson, you will have a portfolio-ready case study. Download the packet and reference the details of this case study anytime. Then, when you begin your job hunt, your case study will be a tangible way to demonstrate your knowledge and skills to potential employers.

Scenario

You are a junior data analyst working in the marketing analyst team at Cyclistic, a bike-share company in Chicago. The director of marketing believes the company’s future success depends on maximizing the number of annual memberships. Therefore, your team wants to understand how casual riders and annual members use Cyclistic bikes differently. From these insights, your team will design a new marketing strategy to convert casual riders into annual members. But first, Cyclistic executives must approve your recommendations, so they must be backed up with compelling data insights and professional data visualizations.

Characters and teams

● Cyclistic: A bike-share program that features more than 5,800 bicycles and 600 docking stations. Cyclistic sets itself apart by also offering reclining bikes, hand tricycles, and cargo bikes, making bike-share more inclusive to people with disabilities and riders who can’t use a standard two-wheeled bike. The majority of riders opt for traditional bikes; about 8% of riders use the assistive options. Cyclistic users are more likely to ride for leisure, but about 30% use them to commute to work each day.

● Lily Moreno: The director of marketing and your manager. Moreno is responsible for the development of campaigns and initiatives to promote the bike-share program. These may include email, social media, and other channels.

● Cyclistic marketing analytics team: A team of data analysts who are responsible for collecting, analyzing, and reporting data that helps guide Cyclistic marketing strategy. You joined this team six months ago and have been busy learning about Cyclistic’s mission and business goals, as well as how you, as a junior data analyst, can help Cyclistic achieve them.

● Cyclistic executive team: The notoriously detail-oriented executive team will decide whether to approve the recommended marketing program.

About the company

In 2016, Cyclistic launched a successful bike-share offering. Since then, the program has grown to a fleet of 5,824 bicycles that are geotracked and locked into a network of 692 stations across Chicago. The bikes can be unlocked from one station and returned to any other station in the system anytime. Until now, Cyclistic’s marketing strategy relied on building general awareness and appealing to broad consumer segments. One approach that helped make these things possible was the flexibility of its pricing plans: single-ride passes, full-day passes, and annual memberships. Customers who purchase single-ride or full-day passes are referred to as casual riders. Customers who purchase annual memberships are Cyclistic members. Cyclistic’s finance analysts have concluded that annual members are much more profitable than casual riders. Although the pricing flexibility helps Cyclistic attract more customers, Moreno believes that maximizing the number of annual members will be key to future growth. Rather than creating a marketing campaign that targets all-new customers, Moreno believes there is a very good chance to convert casual riders into members. She notes that casual riders are already aware of the Cyclistic program and have chosen Cyclistic for their mobility needs.

Moreno has set a clear goal: Design marketing strategies aimed at converting casual riders into annual members. In order to do that, however, the marketing analyst team needs to better understand how annual members and casual riders differ, why casual riders would buy a membership, and how digital media could affect their marketing tactics. Moreno and her team are interested in analyzing the Cyclistic historical bike trip data to identify trends.

S———————————————————————————————————————S

Setting up my Environment

Notes: setting up my R environment by loading the appropriate packages

https://rmarkdown.rstudio.com/lesson-3.html

Beginning of the Process: Installing packages and loading libraries

1.-

1.- Installing packages and loading libraries

1.1.-

library(tidyverse)
library(readr)
library(lubridate)
library(tidyr)
library(ggplot2)
library(data.table)

1.2.-

library(tidytext)
library(dplyr)
library(gt)

End of installing packages and loading libraries

S———————————————————————————————————————S

Beginning of the Data Gathering Phase.

2.-

2.- Read a csv file into a tibble

https://readr.tidyverse.org/reference/read_delim.html

tripdata_202310 <- read_csv("/Users/user/Desktop/1CaseStudy1v1/202310-divvy-tripdata.csv")
## Rows: 537113 Columns: 13
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (7): ride_id, rideable_type, start_station_name, start_station_id, end_...
## dbl  (4): start_lat, start_lng, end_lat, end_lng
## dttm (2): started_at, ended_at
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#View(tripdata_202310)
print(tripdata_202310)
## # A tibble: 537,113 × 13
##    ride_id          rideable_type started_at          ended_at           
##    <chr>            <chr>         <dttm>              <dttm>             
##  1 4449097279F8BBE7 classic_bike  2023-10-08 10:36:26 2023-10-08 10:49:19
##  2 9CF060543CA7B439 electric_bike 2023-10-11 17:23:59 2023-10-11 17:36:08
##  3 667F21F4D6BDE69C electric_bike 2023-10-12 07:02:33 2023-10-12 07:06:53
##  4 F92714CC6B019B96 classic_bike  2023-10-24 19:13:03 2023-10-24 19:18:29
##  5 5E34BA5DE945A9CC classic_bike  2023-10-09 18:19:26 2023-10-09 18:30:56
##  6 F7D7420AFAC53CD9 electric_bike 2023-10-04 17:10:59 2023-10-04 17:25:21
##  7 870B2D4CD112D7B7 electric_bike 2023-10-31 17:32:20 2023-10-31 17:44:20
##  8 D9179D36E32D456C classic_bike  2023-10-02 18:51:51 2023-10-02 18:57:09
##  9 F8E131281F722FEF classic_bike  2023-10-17 08:28:18 2023-10-17 08:50:03
## 10 91938B71748FA405 classic_bike  2023-10-17 19:17:38 2023-10-17 19:32:23
## # ℹ 537,103 more rows
## # ℹ 9 more variables: start_station_name <chr>, start_station_id <chr>,
## #   end_station_name <chr>, end_station_id <chr>, start_lat <dbl>,
## #   start_lng <dbl>, end_lat <dbl>, end_lng <dbl>, member_casual <chr>

2.2.-

tripdata_202311 <- read_csv("/Users/user/Desktop/1CaseStudy1v1/202311-divvy-tripdata.csv")
#View(tripdata_202311)

tripdata_202312 <- read_csv("/Users/user/Desktop/1CaseStudy1v1/202312-divvy-tripdata.csv")
#View(tripdata_202312)

tripdata_202401 <- read_csv("/Users/user/Desktop/1CaseStudy1v1/202401-divvy-tripdata.csv")
#View(tripdata_202401)

tripdata_202402 <- read_csv("/Users/user/Desktop/1CaseStudy1v1/202402-divvy-tripdata.csv")
#View(tripdata_202402)

tripdata_202403 <- read_csv("/Users/user/Desktop/1CaseStudy1v1/202403-divvy-tripdata.csv")
#View(tripdata_202403)

tripdata_202404 <- read_csv("/Users/user/Desktop/1CaseStudy1v1/202404-divvy-tripdata.csv")
#View(tripdata_202404)

tripdata_202405 <- read_csv("/Users/user/Desktop/1CaseStudy1v1/202405-divvy-tripdata.csv")
#View(tripdata_202405)

tripdata_202406 <- read_csv("/Users/user/Desktop/1CaseStudy1v1/202406-divvy-tripdata.csv")
#View(tripdata_202406)

tripdata_202407 <- read_csv("/Users/user/Desktop/1CaseStudy1v1/202407-divvy-tripdata.csv")
#View(tripdata_202407)

tripdata_202408 <- read_csv("/Users/user/Desktop/1CaseStudy1v1/202408-divvy-tripdata.csv")
#View(tripdata_202408)

2.3.-

tripdata_202409 <- read_csv("/Users/user/Desktop/1CaseStudy1v1/202409-divvy-tripdata.csv")
## Rows: 821276 Columns: 13
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (7): ride_id, rideable_type, start_station_name, start_station_id, end_...
## dbl  (4): start_lat, start_lng, end_lat, end_lng
## dttm (2): started_at, ended_at
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#View(tripdata_202409)
print(tripdata_202409)
## # A tibble: 821,276 × 13
##    ride_id          rideable_type started_at          ended_at           
##    <chr>            <chr>         <dttm>              <dttm>             
##  1 31D38723D5A8665A electric_bike 2024-09-26 15:30:58 2024-09-26 15:30:59
##  2 67CB39987F4E895B electric_bike 2024-09-26 15:31:32 2024-09-26 15:53:13
##  3 DA61204FD26EC681 electric_bike 2024-09-26 15:00:33 2024-09-26 15:02:25
##  4 06F160D46AF235DD electric_bike 2024-09-26 18:19:06 2024-09-26 18:38:53
##  5 6FCA41D4317601EB electric_bike 2024-09-03 19:49:57 2024-09-03 20:07:08
##  6 9F291E82895C45E5 electric_bike 2024-09-04 01:45:18 2024-09-04 02:01:38
##  7 625D2EA831E1F8AC electric_bike 2024-09-04 16:22:16 2024-09-04 16:26:20
##  8 A21DCB6834BCAD0D electric_bike 2024-09-04 16:31:58 2024-09-04 16:38:52
##  9 0EEB8A4CF63DA7AE electric_bike 2024-09-28 20:30:28 2024-09-28 20:33:20
## 10 6CE10020F5D0D7B8 electric_bike 2024-09-28 20:10:48 2024-09-28 20:24:32
## # ℹ 821,266 more rows
## # ℹ 9 more variables: start_station_name <chr>, start_station_id <chr>,
## #   end_station_name <chr>, end_station_id <chr>, start_lat <dbl>,
## #   start_lng <dbl>, end_lat <dbl>, end_lng <dbl>, member_casual <chr>

2.3.1.- tripdata_202409 tibble: 821,276 rows, 13 columns
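
The twelve read_csv() calls above differ only in the year-month part of the file name, so the same step could also be written as a loop. The following is a minimal sketch, assuming the files follow the `YYYYMM-divvy-tripdata.csv` naming seen above. Base R's read.csv() is swapped in for readr's read_csv() so the sketch runs without extra packages, and the two tiny files written to a temporary folder (`data_dir`, `demo`, `trip_list` are hypothetical names) are made-up stand-ins for the real data.

```r
# Hypothetical stand-in data: two tiny monthly files in a temporary folder.
data_dir <- file.path(tempdir(), "divvy_demo")
dir.create(data_dir, showWarnings = FALSE)
for (m in c("202310", "202311")) {
  demo <- data.frame(
    ride_id = paste0(m, "_", 1:3),
    member_casual = c("member", "casual", "member")
  )
  write.csv(demo, file.path(data_dir, paste0(m, "-divvy-tripdata.csv")),
            row.names = FALSE)
}

# Find every monthly file, read each one, and name the list by year-month.
files <- list.files(data_dir, pattern = "-divvy-tripdata\\.csv$", full.names = TRUE)
trip_list <- lapply(files, read.csv, stringsAsFactors = FALSE)
names(trip_list) <- sub("-divvy-tripdata\\.csv$", "", basename(files))

# Row counts per month, comparable to the "Rows:" lines printed by read_csv().
sapply(trip_list, nrow)
```

Keeping the monthly tables in one named list also makes later checks (column names, data types) a single call over the list instead of twelve separate ones.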

End of the Data Gathering Phase.

S———————————————————————————————————————S

Beginning of the Data Processing Phase in its Initial Part

3.-

3.- Determining the column name and data type in each column of the dataset

https://www.geeksforgeeks.org/check-data-type-of-each-dataframe-column-in-r/

sapply(tripdata_202310,class)
## $ride_id
## [1] "character"
## 
## $rideable_type
## [1] "character"
## 
## $started_at
## [1] "POSIXct" "POSIXt" 
## 
## $ended_at
## [1] "POSIXct" "POSIXt" 
## 
## $start_station_name
## [1] "character"
## 
## $start_station_id
## [1] "character"
## 
## $end_station_name
## [1] "character"
## 
## $end_station_id
## [1] "character"
## 
## $start_lat
## [1] "numeric"
## 
## $start_lng
## [1] "numeric"
## 
## $end_lat
## [1] "numeric"
## 
## $end_lng
## [1] "numeric"
## 
## $member_casual
## [1] "character"
#sapply(tripdata_202311,class)

#sapply(tripdata_202312,class)

#sapply(tripdata_202401,class)

#sapply(tripdata_202402,class)

#sapply(tripdata_202403,class)

#sapply(tripdata_202404,class)

#sapply(tripdata_202405,class)

#sapply(tripdata_202406,class)

#sapply(tripdata_202407,class)

#sapply(tripdata_202408,class)

sapply(tripdata_202409,class)
## $ride_id
## [1] "character"
## 
## $rideable_type
## [1] "character"
## 
## $started_at
## [1] "POSIXct" "POSIXt" 
## 
## $ended_at
## [1] "POSIXct" "POSIXt" 
## 
## $start_station_name
## [1] "character"
## 
## $start_station_id
## [1] "character"
## 
## $end_station_name
## [1] "character"
## 
## $end_station_id
## [1] "character"
## 
## $start_lat
## [1] "numeric"
## 
## $start_lng
## [1] "numeric"
## 
## $end_lat
## [1] "numeric"
## 
## $end_lng
## [1] "numeric"
## 
## $member_casual
## [1] "character"

3.1.- The 12 datasets share the same column names, and each column has the same data type in all 12 datasets
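
Comparing twelve sapply() printouts by eye is easy to get wrong, so the same conclusion can be checked programmatically. This is a minimal sketch on made-up data: `trip_list` and `schema_of` are hypothetical names, and the three small data frames stand in for the monthly tibbles.

```r
# Hypothetical stand-ins for the monthly tibbles (the real ones have 13 columns).
trip_list <- list(
  "202310" = data.frame(ride_id = "A1", start_lat = 41.9, stringsAsFactors = FALSE),
  "202311" = data.frame(ride_id = "B2", start_lat = 41.8, stringsAsFactors = FALSE),
  "202312" = data.frame(ride_id = "C3", start_lat = 41.7, stringsAsFactors = FALSE)
)

# Describe a dataset's schema as "column:class" strings, in column order.
schema_of <- function(df) {
  classes <- vapply(df, function(col) paste(class(col), collapse = "/"), character(1))
  paste(names(df), classes, sep = ":")
}
schemas <- lapply(trip_list, schema_of)

# TRUE for each dataset whose schema is identical to the first one.
same_schema <- vapply(schemas, function(s) identical(s, schemas[[1]]), logical(1))
print(same_schema)
all(same_schema)
```

If any entry of `same_schema` were FALSE, that month's column names or types would need to be aligned before the datasets are combined.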

4.-

4.- Combining the 12 monthly datasets, from October 2023 to September 2024, by stacking their rows

Definition of: oneyear_trips tibble

https://dplyr.tidyverse.org/reference/bind_rows.html

oneyear_trips <- bind_rows(tripdata_202310, tripdata_202311, tripdata_202312, tripdata_202401,tripdata_202402, tripdata_202403,tripdata_202404, tripdata_202405, tripdata_202406, tripdata_202407, tripdata_202408, tripdata_202409)
#View(oneyear_trips)
print(oneyear_trips) 
## # A tibble: 5,860,374 × 13
##    ride_id          rideable_type started_at          ended_at           
##    <chr>            <chr>         <dttm>              <dttm>             
##  1 4449097279F8BBE7 classic_bike  2023-10-08 10:36:26 2023-10-08 10:49:19
##  2 9CF060543CA7B439 electric_bike 2023-10-11 17:23:59 2023-10-11 17:36:08
##  3 667F21F4D6BDE69C electric_bike 2023-10-12 07:02:33 2023-10-12 07:06:53
##  4 F92714CC6B019B96 classic_bike  2023-10-24 19:13:03 2023-10-24 19:18:29
##  5 5E34BA5DE945A9CC classic_bike  2023-10-09 18:19:26 2023-10-09 18:30:56
##  6 F7D7420AFAC53CD9 electric_bike 2023-10-04 17:10:59 2023-10-04 17:25:21
##  7 870B2D4CD112D7B7 electric_bike 2023-10-31 17:32:20 2023-10-31 17:44:20
##  8 D9179D36E32D456C classic_bike  2023-10-02 18:51:51 2023-10-02 18:57:09
##  9 F8E131281F722FEF classic_bike  2023-10-17 08:28:18 2023-10-17 08:50:03
## 10 91938B71748FA405 classic_bike  2023-10-17 19:17:38 2023-10-17 19:32:23
## # ℹ 5,860,364 more rows
## # ℹ 9 more variables: start_station_name <chr>, start_station_id <chr>,
## #   end_station_name <chr>, end_station_id <chr>, start_lat <dbl>,
## #   start_lng <dbl>, end_lat <dbl>, end_lng <dbl>, member_casual <chr>

4.1.- oneyear_trips tibble: 5,860,374 rows, 13 columns

4.2.-

4.2.- Create a formatted table from the oneyear_trips tibble

https://www.geeksforgeeks.org/kable-method-in-r/

library(knitr)
kable(oneyear_trips[1:5, ], caption= "oneyear_trips" )
oneyear_trips

| ride_id | rideable_type | started_at | ended_at | start_station_name | start_station_id | end_station_name | end_station_id | start_lat | start_lng | end_lat | end_lng | member_casual |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4449097279F8BBE7 | classic_bike | 2023-10-08 10:36:26 | 2023-10-08 10:49:19 | Orleans St & Chestnut St (NEXT Apts) | 620 | Sheffield Ave & Webster Ave | TA1309000033 | 41.89820 | -87.63754 | 41.92154 | -87.65382 | member |
| 9CF060543CA7B439 | electric_bike | 2023-10-11 17:23:59 | 2023-10-11 17:36:08 | Desplaines St & Kinzie St | TA1306000003 | Sheffield Ave & Webster Ave | TA1309000033 | 41.88864 | -87.64441 | 41.92154 | -87.65382 | member |
| 667F21F4D6BDE69C | electric_bike | 2023-10-12 07:02:33 | 2023-10-12 07:06:53 | Orleans St & Chestnut St (NEXT Apts) | 620 | Franklin St & Lake St | TA1307000111 | 41.89807 | -87.63751 | 41.88584 | -87.63550 | member |
| F92714CC6B019B96 | classic_bike | 2023-10-24 19:13:03 | 2023-10-24 19:18:29 | Desplaines St & Kinzie St | TA1306000003 | Franklin St & Lake St | TA1307000111 | 41.88872 | -87.64445 | 41.88584 | -87.63550 | member |
| 5E34BA5DE945A9CC | classic_bike | 2023-10-09 18:19:26 | 2023-10-09 18:30:56 | Desplaines St & Kinzie St | TA1306000003 | Franklin St & Lake St | TA1307000111 | 41.88872 | -87.64445 | 41.88584 | -87.63550 | member |

End of the Data Processing Phase in its Initial Part

S———————————————————————————————————————S

Beginning of the Cleaning Phase

5.-

5.- Start cleaning

Finding the number of NA values in each column

https://www.statology.org/is-na/

https://www.statology.org/colsums-function-in-r/

na_counts_oneyear_trips <- colSums(is.na(oneyear_trips))
#View(na_counts_oneyear_trips)
print(na_counts_oneyear_trips)
##            ride_id      rideable_type         started_at           ended_at 
##                  0                  0                  0                  0 
## start_station_name   start_station_id   end_station_name     end_station_id 
##            1056882            1056882            1092195            1092195 
##          start_lat          start_lng            end_lat            end_lng 
##                  0                  0               7458               7458 
##      member_casual 
##                  0

6.-

6.- Delete rows that contain an NA value in any field.

Verify that no rows with NA values remain.

Create a new tibble

Definition of: oneyear_trips_clean tibble

oneyear_trips_clean <- drop_na(oneyear_trips)
print(oneyear_trips_clean) 
## # A tibble: 4,233,605 × 13
##    ride_id          rideable_type started_at          ended_at           
##    <chr>            <chr>         <dttm>              <dttm>             
##  1 4449097279F8BBE7 classic_bike  2023-10-08 10:36:26 2023-10-08 10:49:19
##  2 9CF060543CA7B439 electric_bike 2023-10-11 17:23:59 2023-10-11 17:36:08
##  3 667F21F4D6BDE69C electric_bike 2023-10-12 07:02:33 2023-10-12 07:06:53
##  4 F92714CC6B019B96 classic_bike  2023-10-24 19:13:03 2023-10-24 19:18:29
##  5 5E34BA5DE945A9CC classic_bike  2023-10-09 18:19:26 2023-10-09 18:30:56
##  6 F7D7420AFAC53CD9 electric_bike 2023-10-04 17:10:59 2023-10-04 17:25:21
##  7 870B2D4CD112D7B7 electric_bike 2023-10-31 17:32:20 2023-10-31 17:44:20
##  8 D9179D36E32D456C classic_bike  2023-10-02 18:51:51 2023-10-02 18:57:09
##  9 F8E131281F722FEF classic_bike  2023-10-17 08:28:18 2023-10-17 08:50:03
## 10 91938B71748FA405 classic_bike  2023-10-17 19:17:38 2023-10-17 19:32:23
## # ℹ 4,233,595 more rows
## # ℹ 9 more variables: start_station_name <chr>, start_station_id <chr>,
## #   end_station_name <chr>, end_station_id <chr>, start_lat <dbl>,
## #   start_lng <dbl>, end_lat <dbl>, end_lng <dbl>, member_casual <chr>
na_counts_oneyear_trips_clean <- colSums(is.na(oneyear_trips_clean))
#View(oneyear_trips_clean)
print(na_counts_oneyear_trips_clean)
##            ride_id      rideable_type         started_at           ended_at 
##                  0                  0                  0                  0 
## start_station_name   start_station_id   end_station_name     end_station_id 
##                  0                  0                  0                  0 
##          start_lat          start_lng            end_lat            end_lng 
##                  0                  0                  0                  0 
##      member_casual 
##                  0

6.1.- oneyear_trips_clean tibble: 4,233,605 rows, 13 columns (1,626,769 of the 5,860,374 original rows, about 27.8%, were removed)
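
Called with no column arguments, drop_na() removes a row when any of its columns is NA, so a ride with complete timestamps is still discarded if a station field is missing. The sketch below shows that behavior on made-up data and measures how much is lost. It uses base R's complete.cases(), which keeps the same rows for a plain data frame like this one, so that it runs without extra packages; `toy_trips` and its station names are hypothetical.

```r
# Hypothetical toy data: rides B2 and D4 each have one missing station field.
toy_trips <- data.frame(
  ride_id = c("A1", "B2", "C3", "D4"),
  start_station_name = c("Station A", NA, "Station B", "Station C"),
  end_station_name = c("Station C", "Station A", "Station B", NA),
  stringsAsFactors = FALSE
)

# NA count per column, as in step 5.
colSums(is.na(toy_trips))

# Keep only rows with no NA in any column.
toy_clean <- toy_trips[complete.cases(toy_trips), ]

# How many rows were removed, and what share of the original that is.
rows_removed <- nrow(toy_trips) - nrow(toy_clean)
pct_removed <- 100 * rows_removed / nrow(toy_trips)
cat(rows_removed, "rows removed,", pct_removed, "percent of the original\n")
```

Reporting the removed share alongside the new row count makes it easier to judge whether the remaining data still represents both rider types well.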

7.-

7.- Deleting columns that are not needed to solve the business task.

Syntax: mydata2 <- subset(mydata, select = -c(x,z) )

https://www.listendata.com/2015/06/r-keep-drop-columns-from-data-frame.html

Create a new tibble

Definition of: oneyear_trips_clean2 tibble

oneyear_trips_clean2 <- subset(oneyear_trips_clean, select = -c(start_lat, start_lng, end_lat, end_lng))
#View(oneyear_trips_clean2)
print(oneyear_trips_clean2)
## # A tibble: 4,233,605 × 9
##    ride_id          rideable_type started_at          ended_at           
##    <chr>            <chr>         <dttm>              <dttm>             
##  1 4449097279F8BBE7 classic_bike  2023-10-08 10:36:26 2023-10-08 10:49:19
##  2 9CF060543CA7B439 electric_bike 2023-10-11 17:23:59 2023-10-11 17:36:08
##  3 667F21F4D6BDE69C electric_bike 2023-10-12 07:02:33 2023-10-12 07:06:53
##  4 F92714CC6B019B96 classic_bike  2023-10-24 19:13:03 2023-10-24 19:18:29
##  5 5E34BA5DE945A9CC classic_bike  2023-10-09 18:19:26 2023-10-09 18:30:56
##  6 F7D7420AFAC53CD9 electric_bike 2023-10-04 17:10:59 2023-10-04 17:25:21
##  7 870B2D4CD112D7B7 electric_bike 2023-10-31 17:32:20 2023-10-31 17:44:20
##  8 D9179D36E32D456C classic_bike  2023-10-02 18:51:51 2023-10-02 18:57:09
##  9 F8E131281F722FEF classic_bike  2023-10-17 08:28:18 2023-10-17 08:50:03
## 10 91938B71748FA405 classic_bike  2023-10-17 19:17:38 2023-10-17 19:32:23
## # ℹ 4,233,595 more rows
## # ℹ 5 more variables: start_station_name <chr>, start_station_id <chr>,
## #   end_station_name <chr>, end_station_id <chr>, member_casual <chr>

7.1.- oneyear_trips_clean2 tibble: 4,233,605 rows, 9 columns


8.-

8.- Checking for repeated values in the ride_id column, since each ride_id should identify exactly one ride

Syntax: sum(duplicated(df$col))

nrow(oneyear_trips_clean2) 
## [1] 4233605
sum(duplicated(oneyear_trips_clean2$ride_id))
## [1] 137
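
The check above reports 137 repeated ride_id values, and this step only counts them. As a minimal sketch on made-up data, repeated ids could be inspected and, if they turn out to be true repeats of the same ride, reduced to their first occurrence with base R's duplicated(). Whether removal is appropriate for the real data is an assumption that would need checking first; `toy_trips` and `toy_dedup` are hypothetical names.

```r
# Hypothetical toy data: ride_id "A1" appears twice.
toy_trips <- data.frame(
  ride_id = c("A1", "B2", "A1", "C3"),
  member_casual = c("member", "casual", "member", "casual"),
  stringsAsFactors = FALSE
)

# Count repeated ids (the same check as above).
n_dupes <- sum(duplicated(toy_trips$ride_id))

# Show every row that shares its ride_id with another row, for inspection.
dupe_ids <- unique(toy_trips$ride_id[duplicated(toy_trips$ride_id)])
toy_trips[toy_trips$ride_id %in% dupe_ids, ]

# Keep only the first occurrence of each ride_id.
toy_dedup <- toy_trips[!duplicated(toy_trips$ride_id), ]
nrow(toy_dedup)
```

Looking at the full set of rows behind each repeated id before dropping anything shows whether the copies are identical or differ in some column.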

9.-

9.- Checking the data type of the started_at and ended_at columns

Syntax: class(df$dttm)

class(oneyear_trips_clean2$started_at)  
## [1] "POSIXct" "POSIXt"
class(oneyear_trips_clean2$ended_at)
## [1] "POSIXct" "POSIXt"

End of the Cleaning Phase.

S———————————————————————————————————————S

Beginning of the Data Processing Phase in its Final Part

10.-

10.- Adding columns (_addcol)

New dataset generated from the cleaned dataset “oneyear_trips_clean2”, with new columns added so that the data can be broken down by day of week, month, day of month, and year

Create a new tibble

Definition of: oneyear_trips_addcol tibble

oneyear_trips_addcol <- oneyear_trips_clean2
#View(oneyear_trips_addcol)
print(oneyear_trips_addcol)
## # A tibble: 4,233,605 × 9
##    ride_id          rideable_type started_at          ended_at           
##    <chr>            <chr>         <dttm>              <dttm>             
##  1 4449097279F8BBE7 classic_bike  2023-10-08 10:36:26 2023-10-08 10:49:19
##  2 9CF060543CA7B439 electric_bike 2023-10-11 17:23:59 2023-10-11 17:36:08
##  3 667F21F4D6BDE69C electric_bike 2023-10-12 07:02:33 2023-10-12 07:06:53
##  4 F92714CC6B019B96 classic_bike  2023-10-24 19:13:03 2023-10-24 19:18:29
##  5 5E34BA5DE945A9CC classic_bike  2023-10-09 18:19:26 2023-10-09 18:30:56
##  6 F7D7420AFAC53CD9 electric_bike 2023-10-04 17:10:59 2023-10-04 17:25:21
##  7 870B2D4CD112D7B7 electric_bike 2023-10-31 17:32:20 2023-10-31 17:44:20
##  8 D9179D36E32D456C classic_bike  2023-10-02 18:51:51 2023-10-02 18:57:09
##  9 F8E131281F722FEF classic_bike  2023-10-17 08:28:18 2023-10-17 08:50:03
## 10 91938B71748FA405 classic_bike  2023-10-17 19:17:38 2023-10-17 19:32:23
## # ℹ 4,233,595 more rows
## # ℹ 5 more variables: start_station_name <chr>, start_station_id <chr>,
## #   end_station_name <chr>, end_station_id <chr>, member_casual <chr>

10.1.- oneyear_trips_addcol tibble: 4,233,605 rows, 9 columns

11.-

11.- Converting between 2 data types and creating a new column

Convert the “started_at” column from the “datetime” (dttm) data type to the “Date” data type and store the result in a new column (“date”) in the dataset.

Syntax: df$date <- as.Date(df$started_at)

oneyear_trips_addcol$date <- as.Date(oneyear_trips_addcol$started_at)
#View(oneyear_trips_addcol)
print(oneyear_trips_addcol$date[1:9])
## [1] "2023-10-08" "2023-10-11" "2023-10-12" "2023-10-24" "2023-10-09"
## [6] "2023-10-04" "2023-10-31" "2023-10-02" "2023-10-17"

12.-

12.- Partial extraction of data (day of the week) from a column, used to create a new column (name_dayofweek)

Extract the abbreviated name of the day of the week from the “date” column and create a new column “name_dayofweek” in the dataset:

oneyear_trips_addcol$name_dayofweek <- format(as.Date(oneyear_trips_addcol$date), "%a")
#View(oneyear_trips_addcol)
print(oneyear_trips_addcol$name_dayofweek[1:9])
## [1] "Sun" "Wed" "Thu" "Tue" "Mon" "Wed" "Tue" "Mon" "Tue"
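
format(..., "%a") returns plain character values in the language of the session locale, and character values sort alphabetically (Fri, Mon, Sat, ...) rather than in calendar order. A minimal sketch of one way to get a fixed calendar order is shown below. It derives the weekday from as.POSIXlt()$wday, which does not depend on the locale, and stores it as a factor; putting Monday first is an assumption, and `day_chr` and `day_fct` are hypothetical names.

```r
# The first four dates printed in step 11.
dates <- as.Date(c("2023-10-08", "2023-10-11", "2023-10-12", "2023-10-24"))

# as.POSIXlt()$wday is 0 for Sunday through 6 for Saturday, whatever the locale.
day_labels <- c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat")
day_chr <- day_labels[as.POSIXlt(dates)$wday + 1]

# Factor whose levels follow the calendar week (Monday first is an assumption).
day_fct <- factor(day_chr, levels = c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"))

sort(day_chr)   # alphabetical order
sort(day_fct)   # calendar order
table(day_fct)  # counts per weekday, including days with zero rides
```

Grouped summaries and ggplot2 axes follow factor levels, so a column stored this way keeps the weekdays in calendar order without extra sorting.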

13.-

13.- Partial extraction of data, (month), from a column and use that data to create a new column (month[month-year in abbreviated form])

Extract from the column “date”, the part corresponding to the “month” and create a new column “month”(month-year in abbreviated form) in the dataset.

oneyear_trips_addcol$month <- format(as.Date(oneyear_trips_addcol$date), "%b_%y")
#View(oneyear_trips_addcol)
print(oneyear_trips_addcol$month[1:9])  
## [1] "Oct_23" "Oct_23" "Oct_23" "Oct_23" "Oct_23" "Oct_23" "Oct_23" "Oct_23"
## [9] "Oct_23"
#print(oneyear_trips_addcol$month[4203000:4203006])

14.-

14.- Partial extraction of data (day of the month) from a column, used to create a new column (No._day)

Extract the two-digit day of the month from the “date” column and create a new column “No._day” in the dataset.

oneyear_trips_addcol$No._day <- format(as.Date(oneyear_trips_addcol$date), "%d")
#View(oneyear_trips_addcol)
print(oneyear_trips_addcol$No._day[1:9])
## [1] "08" "11" "12" "24" "09" "04" "31" "02" "17"

15.-

15.- Partial extraction of data, (year), from a column and use that data to create a new column (year)

Extract from the column “date”, the part corresponding to the “year” and create a new column “year” (with 4 digits) in the dataset.

oneyear_trips_addcol$year <- format(as.Date(oneyear_trips_addcol$date), "%Y")
#View(oneyear_trips_addcol)
print(oneyear_trips_addcol$year[1:9])
## [1] "2023" "2023" "2023" "2023" "2023" "2023" "2023" "2023" "2023"

16.-

16.- Create a new column from the difference between 2 columns and convert the data type of the result

Calculate the trip duration by subtracting the values in the “started_at” column from the values in the “ended_at” column with the “difftime” function, convert the result to the “numeric” (“double”) data type, and store it in a new column (“ride_duration”) in the dataset

oneyear_trips_addcol$ride_duration <- as.double(difftime(oneyear_trips_addcol$ended_at, oneyear_trips_addcol$started_at))
#View(oneyear_trips_addcol)
print(oneyear_trips_addcol)
## # A tibble: 4,233,605 × 15
##    ride_id          rideable_type started_at          ended_at           
##    <chr>            <chr>         <dttm>              <dttm>             
##  1 4449097279F8BBE7 classic_bike  2023-10-08 10:36:26 2023-10-08 10:49:19
##  2 9CF060543CA7B439 electric_bike 2023-10-11 17:23:59 2023-10-11 17:36:08
##  3 667F21F4D6BDE69C electric_bike 2023-10-12 07:02:33 2023-10-12 07:06:53
##  4 F92714CC6B019B96 classic_bike  2023-10-24 19:13:03 2023-10-24 19:18:29
##  5 5E34BA5DE945A9CC classic_bike  2023-10-09 18:19:26 2023-10-09 18:30:56
##  6 F7D7420AFAC53CD9 electric_bike 2023-10-04 17:10:59 2023-10-04 17:25:21
##  7 870B2D4CD112D7B7 electric_bike 2023-10-31 17:32:20 2023-10-31 17:44:20
##  8 D9179D36E32D456C classic_bike  2023-10-02 18:51:51 2023-10-02 18:57:09
##  9 F8E131281F722FEF classic_bike  2023-10-17 08:28:18 2023-10-17 08:50:03
## 10 91938B71748FA405 classic_bike  2023-10-17 19:17:38 2023-10-17 19:32:23
## # ℹ 4,233,595 more rows
## # ℹ 11 more variables: start_station_name <chr>, start_station_id <chr>,
## #   end_station_name <chr>, end_station_id <chr>, member_casual <chr>,
## #   date <date>, name_dayofweek <chr>, month <chr>, No._day <chr>, year <chr>,
## #   ride_duration <dbl>
class(oneyear_trips_addcol$ride_duration)
## [1] "numeric"
nrow(subset(oneyear_trips_addcol,ride_duration < 0))
## [1] 103
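
When difftime() is called without a `units` argument it picks the unit itself from the data, so the numbers left after as.double() mean seconds, minutes, hours, or days depending on the rides involved. A minimal sketch of making the unit explicit follows, using the timestamps of rows 1 and 3 printed above; `ride_duration_secs` and `ride_duration_mins` are hypothetical names.

```r
# Timestamps of rows 1 and 3 from the printed tibble.
started_at <- as.POSIXct(c("2023-10-08 10:36:26", "2023-10-12 07:02:33"), tz = "UTC")
ended_at   <- as.POSIXct(c("2023-10-08 10:49:19", "2023-10-12 07:06:53"), tz = "UTC")

# Default units = "auto": R chooses a unit from the data; here it picks minutes.
auto_diff <- difftime(ended_at, started_at)
units(auto_diff)

# Explicit units: the numeric result always means the same thing.
ride_duration_secs <- as.double(difftime(ended_at, started_at, units = "secs"))
ride_duration_mins <- as.double(difftime(ended_at, started_at, units = "mins"))

print(ride_duration_secs)  # 773 and 260 seconds
```

For these two rides the automatic choice is minutes, while a dataset containing very short rides would come out in seconds. Stating the unit in the call, and ideally in the column name, removes that ambiguity.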

16.1.- oneyear_trips_addcol tibble: 4,233,605 rows, 15 columns

17.-

17.- Delete rows from the oneyear_trips_addcol tibble

Delete rows from the oneyear_trips_addcol tibble that have negative values in the “ride_duration” column

oneyear_trips_addcolv2 <- subset(oneyear_trips_addcol, ride_duration >=0)
#View(oneyear_trips_addcolv2)
print(oneyear_trips_addcolv2)
## # A tibble: 4,233,502 × 15
##    ride_id          rideable_type started_at          ended_at           
##    <chr>            <chr>         <dttm>              <dttm>             
##  1 4449097279F8BBE7 classic_bike  2023-10-08 10:36:26 2023-10-08 10:49:19
##  2 9CF060543CA7B439 electric_bike 2023-10-11 17:23:59 2023-10-11 17:36:08
##  3 667F21F4D6BDE69C electric_bike 2023-10-12 07:02:33 2023-10-12 07:06:53
##  4 F92714CC6B019B96 classic_bike  2023-10-24 19:13:03 2023-10-24 19:18:29
##  5 5E34BA5DE945A9CC classic_bike  2023-10-09 18:19:26 2023-10-09 18:30:56
##  6 F7D7420AFAC53CD9 electric_bike 2023-10-04 17:10:59 2023-10-04 17:25:21
##  7 870B2D4CD112D7B7 electric_bike 2023-10-31 17:32:20 2023-10-31 17:44:20
##  8 D9179D36E32D456C classic_bike  2023-10-02 18:51:51 2023-10-02 18:57:09
##  9 F8E131281F722FEF classic_bike  2023-10-17 08:28:18 2023-10-17 08:50:03
## 10 91938B71748FA405 classic_bike  2023-10-17 19:17:38 2023-10-17 19:32:23
## # ℹ 4,233,492 more rows
## # ℹ 11 more variables: start_station_name <chr>, start_station_id <chr>,
## #   end_station_name <chr>, end_station_id <chr>, member_casual <chr>,
## #   date <date>, name_dayofweek <chr>, month <chr>, No._day <chr>, year <chr>,
## #   ride_duration <dbl>
nrow(subset(oneyear_trips_addcolv2,ride_duration < 0))
## [1] 0

17.1.- oneyear_trips_addcolv2 tibble: 4,233,502 rows, 15 columns

18.-

18.- Finding the distinct values in a column

Syntax: unique(df$col)

unique(oneyear_trips_addcolv2$rideable_type)
## [1] "classic_bike"     "electric_bike"    "electric_scooter"
unique(oneyear_trips_addcolv2$member_casual)
## [1] "member" "casual"

19.-

19.- Determining structure and data types in the dataset

sapply(oneyear_trips_addcolv2,class)
## $ride_id
## [1] "character"
## 
## $rideable_type
## [1] "character"
## 
## $started_at
## [1] "POSIXct" "POSIXt" 
## 
## $ended_at
## [1] "POSIXct" "POSIXt" 
## 
## $start_station_name
## [1] "character"
## 
## $start_station_id
## [1] "character"
## 
## $end_station_name
## [1] "character"
## 
## $end_station_id
## [1] "character"
## 
## $member_casual
## [1] "character"
## 
## $date
## [1] "Date"
## 
## $name_dayofweek
## [1] "character"
## 
## $month
## [1] "character"
## 
## $No._day
## [1] "character"
## 
## $year
## [1] "character"
## 
## $ride_duration
## [1] "numeric"

20.-

20.- Checking for rows that correspond to operational tests and/or quality-control checks carried out on the bicycles at some stations

https://forum.posit.co/t/find-values-starting-with-a-specific-word/63438

filter(oneyear_trips_addcolv2,grepl("TEST",start_station_name))  
## # A tibble: 0 × 15
## # ℹ 15 variables: ride_id <chr>, rideable_type <chr>, started_at <dttm>,
## #   ended_at <dttm>, start_station_name <chr>, start_station_id <chr>,
## #   end_station_name <chr>, end_station_id <chr>, member_casual <chr>,
## #   date <date>, name_dayofweek <chr>, month <chr>, No._day <chr>, year <chr>,
## #   ride_duration <dbl>
filter(oneyear_trips_addcolv2,grepl("test",start_station_name))  
## # A tibble: 0 × 15
## # ℹ 15 variables: ride_id <chr>, rideable_type <chr>, started_at <dttm>,
## #   ended_at <dttm>, start_station_name <chr>, start_station_id <chr>,
## #   end_station_name <chr>, end_station_id <chr>, member_casual <chr>,
## #   date <date>, name_dayofweek <chr>, month <chr>, No._day <chr>, year <chr>,
## #   ride_duration <dbl>
filter(oneyear_trips_addcolv2,grepl("Test",start_station_name))  
## # A tibble: 0 × 15
## # ℹ 15 variables: ride_id <chr>, rideable_type <chr>, started_at <dttm>,
## #   ended_at <dttm>, start_station_name <chr>, start_station_id <chr>,
## #   end_station_name <chr>, end_station_id <chr>, member_casual <chr>,
## #   date <date>, name_dayofweek <chr>, month <chr>, No._day <chr>, year <chr>,
## #   ride_duration <dbl>
filter(oneyear_trips_addcolv2,grepl("HQ",start_station_name))
## # A tibble: 0 × 15
## # ℹ 15 variables: ride_id <chr>, rideable_type <chr>, started_at <dttm>,
## #   ended_at <dttm>, start_station_name <chr>, start_station_id <chr>,
## #   end_station_name <chr>, end_station_id <chr>, member_casual <chr>,
## #   date <date>, name_dayofweek <chr>, month <chr>, No._day <chr>, year <chr>,
## #   ride_duration <dbl>
filter(oneyear_trips_addcolv2,grepl("HQ QR",start_station_name))
## # A tibble: 0 × 15
## # ℹ 15 variables: ride_id <chr>, rideable_type <chr>, started_at <dttm>,
## #   ended_at <dttm>, start_station_name <chr>, start_station_id <chr>,
## #   end_station_name <chr>, end_station_id <chr>, member_casual <chr>,
## #   date <date>, name_dayofweek <chr>, month <chr>, No._day <chr>, year <chr>,
## #   ride_duration <dbl>
nrow(oneyear_trips_addcolv2)
## [1] 4233502

20.1.- oneyear_trips_addcolv2 tibble: 4,233,502 rows, 15 columns

No start_station_name value matches any of the test-related patterns (each filter returns 0 rows × 15 columns), so the number of rows remains 4,233,502, as noted in section 20.1.-
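
The five filters above search only start_station_name and repeat the same pattern in different letter cases. As a minimal sketch on made-up data, a single case-insensitive pattern could cover those variants and be applied to both station-name columns. Extending the check to end_station_name is an assumption beyond what the step above does, the station names are invented, and short patterns such as "hq" may also match ordinary names, so any hits should be reviewed by eye.

```r
# Hypothetical toy data: one test-like start station, one test-like end station.
toy_trips <- data.frame(
  ride_id = c("A1", "B2", "C3"),
  start_station_name = c("Station A", "HQ QR", "Station B"),
  end_station_name = c("Station C", "Station A", "Demo station (Test)"),
  stringsAsFactors = FALSE
)

# One pattern, matched without regard to letter case.
pattern <- "test|hq"
is_test_row <- grepl(pattern, toy_trips$start_station_name, ignore.case = TRUE) |
  grepl(pattern, toy_trips$end_station_name, ignore.case = TRUE)

# Rows flagged for review, and how many there are.
toy_trips[is_test_row, ]
sum(is_test_row)
```

A count of zero from a check like this on the real data would support the same conclusion as the separate filters, for both the start and end stations.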

Final definition of: “oneyear_trips_addcolv2”

Dataset with 4,233,502 rows and 15 columns, including the following variables: name of the day of the week (name_dayofweek) and month (month), both of “character” type, and ride duration (ride_duration), of “numeric” (“double”) type.

21.-

21.- Summarize the characteristics and values of each column in the oneyear_trips_addcolv2 dataset

summary(oneyear_trips_addcolv2)
##    ride_id          rideable_type        started_at                    
##  Length:4233502     Length:4233502     Min.   :2023-10-01 00:00:05.00  
##  Class :character   Class :character   1st Qu.:2024-02-17 14:50:08.50  
##  Mode  :character   Mode  :character   Median :2024-06-01 09:50:22.13  
##                                        Mean   :2024-05-04 05:18:28.32  
##                                        3rd Qu.:2024-08-02 09:45:49.52  
##                                        Max.   :2024-09-30 23:52:58.17  
##     ended_at                      start_station_name start_station_id  
##  Min.   :2023-10-01 00:02:02.00   Length:4233502     Length:4233502    
##  1st Qu.:2024-02-17 15:02:46.75   Class :character   Class :character  
##  Median :2024-06-01 10:04:56.27   Mode  :character   Mode  :character  
##  Mean   :2024-05-04 05:34:59.60                                        
##  3rd Qu.:2024-08-02 10:02:49.55                                        
##  Max.   :2024-09-30 23:59:52.55                                        
##  end_station_name   end_station_id     member_casual           date           
##  Length:4233502     Length:4233502     Length:4233502     Min.   :2023-10-01  
##  Class :character   Class :character   Class :character   1st Qu.:2024-02-17  
##  Mode  :character   Mode  :character   Mode  :character   Median :2024-06-01  
##                                                           Mean   :2024-05-03  
##                                                           3rd Qu.:2024-08-02  
##                                                           Max.   :2024-09-30  
##  name_dayofweek        month             No._day              year          
##  Length:4233502     Length:4233502     Length:4233502     Length:4233502    
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##  ride_duration    
##  Min.   :    0.0  
##  1st Qu.:  347.0  
##  Median :  603.5  
##  Mean   :  991.3  
##  3rd Qu.: 1083.4  
##  Max.   :90562.0

22.-

22.- Summarize the characteristics and values of the ride_duration column (“oneyear_trips_addcolv2$ride_duration”) in the dataset

class(oneyear_trips_addcolv2$ride_duration)
## [1] "numeric"
summary(oneyear_trips_addcolv2$ride_duration)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     0.0   347.0   603.5   991.3  1083.4 90562.0
minimum_value <- oneyear_trips_addcolv2 %>%
  slice_min(ride_duration) %>%
  select(ride_duration) 
minimum_value
## # A tibble: 316 × 1
##    ride_duration
##            <dbl>
##  1             0
##  2             0
##  3             0
##  4             0
##  5             0
##  6             0
##  7             0
##  8             0
##  9             0
## 10             0
## # ℹ 306 more rows
maximum_value <- oneyear_trips_addcolv2 %>%
  slice_max(ride_duration) %>%
  select(ride_duration) 
maximum_value
## # A tibble: 1 × 1
##   ride_duration
##           <dbl>
## 1         90562
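slice_min() and slice_max() keep ties by default, which is why the minimum returns 316 rows (all with a duration of 0) while the maximum returns a single row. As a hedged sketch, the extremes can also be counted directly. The durations below are toy values, and the 24-hour threshold is an assumption chosen for illustration, not a rule stated in this case study.

```r
library(dplyr)

# Toy durations in seconds (hypothetical values)
toy_durations <- tibble(ride_duration = c(0, 0, 347, 603.5, 90562))

# Count the rides at each extreme in one pass
extremes <- toy_durations %>%
  summarise(
    zero_rides = sum(ride_duration == 0),              # rides with no duration
    over_one_day = sum(ride_duration > 24 * 60 * 60)   # rides longer than 24 hours
  )

extremes
```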

23.-

23.- Create a new tibble: oneyeartrips_ok tibble

Definition of: oneyeartrips_ok tibble

oneyeartrips_ok <- oneyear_trips_addcolv2
#View(oneyeartrips_ok)
print(oneyeartrips_ok)
## # A tibble: 4,233,502 × 15
##    ride_id          rideable_type started_at          ended_at           
##    <chr>            <chr>         <dttm>              <dttm>             
##  1 4449097279F8BBE7 classic_bike  2023-10-08 10:36:26 2023-10-08 10:49:19
##  2 9CF060543CA7B439 electric_bike 2023-10-11 17:23:59 2023-10-11 17:36:08
##  3 667F21F4D6BDE69C electric_bike 2023-10-12 07:02:33 2023-10-12 07:06:53
##  4 F92714CC6B019B96 classic_bike  2023-10-24 19:13:03 2023-10-24 19:18:29
##  5 5E34BA5DE945A9CC classic_bike  2023-10-09 18:19:26 2023-10-09 18:30:56
##  6 F7D7420AFAC53CD9 electric_bike 2023-10-04 17:10:59 2023-10-04 17:25:21
##  7 870B2D4CD112D7B7 electric_bike 2023-10-31 17:32:20 2023-10-31 17:44:20
##  8 D9179D36E32D456C classic_bike  2023-10-02 18:51:51 2023-10-02 18:57:09
##  9 F8E131281F722FEF classic_bike  2023-10-17 08:28:18 2023-10-17 08:50:03
## 10 91938B71748FA405 classic_bike  2023-10-17 19:17:38 2023-10-17 19:32:23
## # ℹ 4,233,492 more rows
## # ℹ 11 more variables: start_station_name <chr>, start_station_id <chr>,
## #   end_station_name <chr>, end_station_id <chr>, member_casual <chr>,
## #   date <date>, name_dayofweek <chr>, month <chr>, No._day <chr>, year <chr>,
## #   ride_duration <dbl>

23.1.- Note the data type <chr> of the columns name_dayofweek and month

23.2.- oneyeartrips_ok tibble: 4,233,502 rows, 15 columns

24.-

24.- Summary statistics (minimum, maximum, median and mean) of the trip duration (ride_duration, in seconds) by User Type

oneyeartrips_ok %>%
  group_by(member_casual) %>%
  summarise(
    min_ride_duration = min(ride_duration),
    max_ride_duration = max(ride_duration),
    median_ride_duration = median(ride_duration),
    mean_ride_duration = mean(ride_duration)
  )
## # A tibble: 2 × 5
##   member_casual min_ride_duration max_ride_duration median_ride_duration
##   <chr>                     <dbl>             <dbl>                <dbl>
## 1 casual                        0             90562                 799.
## 2 member                        0             89859                 527 
## # ℹ 1 more variable: mean_ride_duration <dbl>
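The printed tibble hides the mean_ride_duration column ("1 more variable") because the console is too narrow. A minimal sketch, using toy values rather than the real trips, of how the tibble print method can be asked to show every column:

```r
library(dplyr)

# Toy data (hypothetical durations in seconds)
toy_trips <- tibble(
  member_casual = c("casual", "casual", "member", "member"),
  ride_duration = c(600, 1200, 300, 500)
)

duration_summary <- toy_trips %>%
  group_by(member_casual) %>%
  summarise(
    min_ride_duration = min(ride_duration),
    max_ride_duration = max(ride_duration),
    median_ride_duration = median(ride_duration),
    mean_ride_duration = mean(ride_duration)
  )

# width = Inf asks the tibble print method to show all columns
print(duration_summary, width = Inf)
```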

25.-

25.- Convert two (2) columns to ordered factors so that their values follow a defined order of levels

In the following paragraph, the columns “name_dayofweek” and “month” of the tibble “oneyeartrips_ok” are written as “-col-name_dayofweek” and “-col-month”, because the dollar symbol is difficult to render in this text ($ = -col-).

Preparation of the columns “oneyeartrips_ok-col-name_dayofweek” and “oneyeartrips_ok-col-month” with their values in ordered form, so that the graphs of the Rides’ Number (“rides_number”) and the Average Rides’ Time (“avg_rides_duration_seg”) can be presented as a function of the values in these two columns.

oneyeartrips_ok$name_dayofweek <- ordered(oneyeartrips_ok$name_dayofweek, levels= c('Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'))
levels(oneyeartrips_ok$name_dayofweek)
## [1] "Mon" "Tue" "Wed" "Thu" "Fri" "Sat" "Sun"
class(oneyeartrips_ok$name_dayofweek)
## [1] "ordered" "factor"
oneyeartrips_ok$month <- ordered(oneyeartrips_ok$month,levels=c('Oct_23','Nov_23','Dec_23','Jan_24','Feb_24','Mar_24','Apr_24','May_24','Jun_24','Jul_24','Aug_24','Sep_24'))
levels(oneyeartrips_ok$month)
##  [1] "Oct_23" "Nov_23" "Dec_23" "Jan_24" "Feb_24" "Mar_24" "Apr_24" "May_24"
##  [9] "Jun_24" "Jul_24" "Aug_24" "Sep_24"
class(oneyeartrips_ok$month)
## [1] "ordered" "factor"
#View(oneyeartrips_ok)
print(oneyeartrips_ok)
## # A tibble: 4,233,502 × 15
##    ride_id          rideable_type started_at          ended_at           
##    <chr>            <chr>         <dttm>              <dttm>             
##  1 4449097279F8BBE7 classic_bike  2023-10-08 10:36:26 2023-10-08 10:49:19
##  2 9CF060543CA7B439 electric_bike 2023-10-11 17:23:59 2023-10-11 17:36:08
##  3 667F21F4D6BDE69C electric_bike 2023-10-12 07:02:33 2023-10-12 07:06:53
##  4 F92714CC6B019B96 classic_bike  2023-10-24 19:13:03 2023-10-24 19:18:29
##  5 5E34BA5DE945A9CC classic_bike  2023-10-09 18:19:26 2023-10-09 18:30:56
##  6 F7D7420AFAC53CD9 electric_bike 2023-10-04 17:10:59 2023-10-04 17:25:21
##  7 870B2D4CD112D7B7 electric_bike 2023-10-31 17:32:20 2023-10-31 17:44:20
##  8 D9179D36E32D456C classic_bike  2023-10-02 18:51:51 2023-10-02 18:57:09
##  9 F8E131281F722FEF classic_bike  2023-10-17 08:28:18 2023-10-17 08:50:03
## 10 91938B71748FA405 classic_bike  2023-10-17 19:17:38 2023-10-17 19:32:23
## # ℹ 4,233,492 more rows
## # ℹ 11 more variables: start_station_name <chr>, start_station_id <chr>,
## #   end_station_name <chr>, end_station_id <chr>, member_casual <chr>,
## #   date <date>, name_dayofweek <ord>, month <ord>, No._day <chr>, year <chr>,
## #   ride_duration <dbl>
sapply(oneyeartrips_ok,class)
## $ride_id
## [1] "character"
## 
## $rideable_type
## [1] "character"
## 
## $started_at
## [1] "POSIXct" "POSIXt" 
## 
## $ended_at
## [1] "POSIXct" "POSIXt" 
## 
## $start_station_name
## [1] "character"
## 
## $start_station_id
## [1] "character"
## 
## $end_station_name
## [1] "character"
## 
## $end_station_id
## [1] "character"
## 
## $member_casual
## [1] "character"
## 
## $date
## [1] "Date"
## 
## $name_dayofweek
## [1] "ordered" "factor" 
## 
## $month
## [1] "ordered" "factor" 
## 
## $No._day
## [1] "character"
## 
## $year
## [1] "character"
## 
## $ride_duration
## [1] "numeric"

25.1.- oneyeartrips_ok tibble: 4,233,502 rows, 15 columns

Final Definition of: oneyeartrips_ok tibble

Tibble with 4,233,502 rows and 15 columns. The variables Name of the Day of the Week (name_dayofweek) and Month (month) are stored as ordered factors (“ordered levels”), and Rides’ Duration (ride_duration) is stored in “numeric” (“double”) format.
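The reason for the conversion can be sketched with base R alone: a character vector sorts alphabetically, while an ordered factor sorts by its levels. The vector below is a toy example, not data from the tibble.

```r
# Character values sort alphabetically, which scrambles the weekdays
days_chr <- c("Sun", "Mon", "Fri", "Wed")
sort(days_chr)

# An ordered factor sorts by the level order given in `levels`
days_ord <- ordered(
  days_chr,
  levels = c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
)
sorted_days <- as.character(sort(days_ord))
sorted_days

# Ordered factors also support comparisons between levels
days_ord[1] > days_ord[2]   # is "Sun" later in the week than "Mon"?
```

ggplot2 uses the same level order when it places a factor on an axis, which is why the plots in the next phase show the days from Monday to Sunday and the months from Oct_23 to Sep_24.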

End of the Final Part of the Data Processing Phase

S———————————————————————————————————————S

Beginning of the Graphical Analysis and Visualization Phase

26.-

26.- Package and library needed to present some graphs in faceted form, with the values on the “Y” axis in descending order

This code was removed from here and moved to the install.packages section at the beginning.

27.-

27.- Start of the graphical analysis to find a solution and implementation to the business task

oneyeartrips_ok %>%
  group_by(member_casual, name_dayofweek) %>%
  summarise(rides_number = n(), average_duration_seg = mean(ride_duration)) %>%
  arrange(member_casual, desc(rides_number))
## `summarise()` has grouped output by 'member_casual'. You can override using the
## `.groups` argument.
## # A tibble: 14 × 4
## # Groups:   member_casual [2]
##    member_casual name_dayofweek rides_number average_duration_seg
##    <chr>         <ord>                 <int>                <dbl>
##  1 casual        Sat                  306363                1622.
##  2 casual        Sun                  264784                1658.
##  3 casual        Fri                  214951                1380.
##  4 casual        Wed                  188730                1288.
##  5 casual        Thu                  185003                1246.
##  6 casual        Mon                  182059                1388.
##  7 casual        Tue                  164597                1244.
##  8 member        Wed                  453234                 728.
##  9 member        Thu                  428973                 711.
## 10 member        Tue                  426248                 715.
## 11 member        Mon                  407476                 712.
## 12 member        Fri                  375985                 721.
## 13 member        Sat                  332298                 836.
## 14 member        Sun                  302801                 838.

27.1.- Tibble with 14 rows and 4 columns
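The message printed by summarise() above is informational: after summarising by two grouping columns, the result is still grouped by the first one. A minimal sketch on toy data of how the `.groups` argument mentioned in that message can return an ungrouped tibble and silence the message:

```r
library(dplyr)

# Toy data (hypothetical rides, durations in seconds)
toy_trips <- tibble(
  member_casual = c("casual", "casual", "member", "member", "member"),
  name_dayofweek = c("Sat", "Sat", "Mon", "Mon", "Tue"),
  ride_duration = c(1600, 1700, 700, 720, 710)
)

day_summary <- toy_trips %>%
  group_by(member_casual, name_dayofweek) %>%
  summarise(
    rides_number = n(),
    average_duration_seg = mean(ride_duration),
    .groups = "drop"   # return an ungrouped tibble, no message
  ) %>%
  arrange(member_casual, desc(rides_number))

day_summary
```

Whether to drop the grouping is a design choice; the summaries in this report keep the default grouping, which does not change the counts or averages.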

S———————————————————————————————————————S

SECTION: RIDES’ NUMBER - DAY of WEEK - USER TYPE

A.-

CALCULATION AND GRAPHS OF DAY OF WEEK (MONTH) VS RIDES_NUMBER, PARAMETRIZED VIA THE 12 MONTHS OF THE YEAR (7 DAYS OF THE WEEK), FACETING BY USER TYPE (MEMBER_CASUAL)

28.-

28.- A.1.- Create a tibble: summary1ridesnumber_days_ol, based on oneyeartrips_ok dataset

This section begins the graphical analysis. The “Y” axis shows the “Number of Trips” (“Rides’ Number”) made by the two types of users of this transport system, “member” and “casual”. The “X” axis shows the names of the 7 days of the week, either by User Type or parametrized by the 12 months of the year and faceted by User Type.

User Type = member_casual

Summary tibble of 3 columns needed by the following point to draw the graph, in which the “X” axis shows the days of the week (“name_dayofweek”) in order from “Monday” to “Sunday”, and the “Y” axis shows the Rides’ Number (“rides_number”), by User Type.

Tibble based on oneyeartrips_ok dataset

Create a new tibble

Definition of: summary1ridesnumber_days_ol tibble

summary1ridesnumber_days_ol   <- oneyeartrips_ok %>%
     group_by(member_casual, name_dayofweek) %>%
     summarise(rides_number = n())  %>%
     arrange(member_casual, name_dayofweek)
## `summarise()` has grouped output by 'member_casual'. You can override using the
## `.groups` argument.
print(summary1ridesnumber_days_ol)
## # A tibble: 14 × 3
## # Groups:   member_casual [2]
##    member_casual name_dayofweek rides_number
##    <chr>         <ord>                 <int>
##  1 casual        Mon                  182059
##  2 casual        Tue                  164597
##  3 casual        Wed                  188730
##  4 casual        Thu                  185003
##  5 casual        Fri                  214951
##  6 casual        Sat                  306363
##  7 casual        Sun                  264784
##  8 member        Mon                  407476
##  9 member        Tue                  426248
## 10 member        Wed                  453234
## 11 member        Thu                  428973
## 12 member        Fri                  375985
## 13 member        Sat                  332298
## 14 member        Sun                  302801
sapply(summary1ridesnumber_days_ol,class)
## $member_casual
## [1] "character"
## 
## $name_dayofweek
## [1] "ordered" "factor" 
## 
## $rides_number
## [1] "integer"

28.1.- summary1ridesnumber_days_ol tibble with 14 rows and 3 columns. ol = ordered levels

29.-

29.- A.1.1.- Plot: “Rides’ Number by each Day of Week by User Type”

The following code presents on the “X” axis the Days of the Week (“name_dayofweek”) in order from “Monday” to “Sunday”, and on the “Y” axis the values of the Rides’ Number (“rides_number”), for each User Type.

“Rides’ Number by each Day of Week by User Type” Plot

User Type = member_casual

https://ggplot2.tidyverse.org/reference/element.html

summary1ridesnumber_days_ol   %>%
  ggplot(aes(x = name_dayofweek, y = rides_number/1000, fill = member_casual)) +
  geom_col(position = "dodge") +
  theme(legend.position="bottom") +
  theme(legend.title = element_text(colour= "black", size= 11, face="bold")) +
  theme(legend.text = element_text(colour= "black", size= 10, face= "bold")) +
  labs(title = 'Rides’ Number by each Day of Week by User Type') +
  theme(plot.title = element_text(size = 11, face= "bold")) +
  xlab('Name of Days of Week') +
  ylab('Rides’ Number (in thousands)') +    
  theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
  theme(axis.title.x = element_text(face = "bold")) +
  theme(axis.text.y = element_text(face = "bold")) +
  theme(axis.title.y = element_text(face = "bold"))

#ggsave("Rides’ Number by each Day of Week by User Type.png")
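The plot above adds one theme() call per setting. As a sketch of an equivalent, more compact style, the same settings can be passed to a single theme() call, with the axis labels set in labs(). The data frame below is a two-day excerpt of the counts shown earlier, and the object names (`toy_counts`, `p_compact`) are illustrative only.

```r
library(ggplot2)

# Two-day excerpt of the counts printed in 28.-, for illustration
toy_counts <- data.frame(
  name_dayofweek = factor(c("Mon", "Tue", "Mon", "Tue"), levels = c("Mon", "Tue")),
  member_casual = c("casual", "casual", "member", "member"),
  rides_number = c(182059, 164597, 407476, 426248)
)

p_compact <- ggplot(
  toy_counts,
  aes(x = name_dayofweek, y = rides_number / 1000, fill = member_casual)
) +
  geom_col(position = "dodge") +
  labs(
    title = "Rides’ Number by each Day of Week by User Type",
    x = "Name of Days of Week",
    y = "Rides’ Number (in thousands)"
  ) +
  # All theme settings gathered in one call
  theme(
    legend.position = "bottom",
    legend.title = element_text(colour = "black", size = 11, face = "bold"),
    legend.text = element_text(colour = "black", size = 10, face = "bold"),
    plot.title = element_text(size = 11, face = "bold"),
    axis.text.x = element_text(angle = 30, face = "bold"),
    axis.text.y = element_text(face = "bold"),
    axis.title = element_text(face = "bold")   # applies to both axis titles
  )

p_compact
```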

29.1.-

29.1.- Plot1 = “Rides’ Number by each Day of Week by User Type”

https://patchwork.data-imaginist.com/

library(patchwork)
p1 <- summary1ridesnumber_days_ol   %>%
  ggplot(aes(x = name_dayofweek, y = rides_number/1000, fill = member_casual)) +
  geom_col(position = "dodge") +
  theme(legend.position="bottom") +
  theme(legend.title = element_text(colour= "black", size= 11, face="bold")) +
  theme(legend.text = element_text(colour= "black", size= 10, face= "bold")) +
  labs(title = 'Rides’ Number by each Day of Week by User Type') +
  theme(plot.title = element_text(size = 11, face= "bold")) +
  xlab('Name of Days of Week') +
  ylab('Rides’ Number (in thousands)') +    
  theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
  theme(axis.title.x = element_text(face = "bold")) +
  theme(axis.text.y = element_text(face = "bold")) +
  theme(axis.title.y = element_text(face = "bold"))
p1

30.-

30.- A.2.- Create a tibble: summary2ridesnumber_days_char, based on oneyear_trips_addcolv2 dataset

This summary tibble generates 3 columns arranged in the following order: member_casual-name_dayofweek-rides_number, and it is necessary for the generation of the graphics with descending values on the “Y” axis mentioned in the following point.

Tibble based on oneyear_trips_addcolv2 dataset

User Type = member_casual

Create a new tibble

Definition of: summary2ridesnumber_days_char tibble (the code names this object sumary2ridesnumber_days_char)

sumary2ridesnumber_days_char <-  oneyear_trips_addcolv2 %>%
   group_by(member_casual, name_dayofweek) %>%
   summarise(rides_number = n())  %>%
   arrange(member_casual, name_dayofweek)
## `summarise()` has grouped output by 'member_casual'. You can override using the
## `.groups` argument.
print(sumary2ridesnumber_days_char)
## # A tibble: 14 × 3
## # Groups:   member_casual [2]
##    member_casual name_dayofweek rides_number
##    <chr>         <chr>                 <int>
##  1 casual        Fri                  214951
##  2 casual        Mon                  182059
##  3 casual        Sat                  306363
##  4 casual        Sun                  264784
##  5 casual        Thu                  185003
##  6 casual        Tue                  164597
##  7 casual        Wed                  188730
##  8 member        Fri                  375985
##  9 member        Mon                  407476
## 10 member        Sat                  332298
## 11 member        Sun                  302801
## 12 member        Thu                  428973
## 13 member        Tue                  426248
## 14 member        Wed                  453234
sapply(sumary2ridesnumber_days_char,class)
##  member_casual name_dayofweek   rides_number 
##    "character"    "character"      "integer"

30.1.- summary2ridesnumber_days_char tibble with 14 rows and 3 columns. char = character

31.-

31.- A.2.1.- Plot: ”Descending Rides’ Number by each Day of Week, Faceting by User Type”

The following code presents the values of “name_dayofweek” sorted in descending order of the Rides’ Number (“rides_number”), faceting by User Type.

User Type = member_casual

The plot is presented with the axes rotated 90 degrees (coord_flip()), so the days appear on the vertical axis and the Rides’ Number on the horizontal axis.

”Descending Rides’ Number by each Day of Week, Faceting by User Type” Plot

https://juliasilge.com/blog/reorder-within/

sumary2ridesnumber_days_char %>%
  ggplot() + 
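  geom_col(aes(reorder_within(name_dayofweek, rides_number, member_casual), rides_number/1000, fill = member_casual), position = "dodge") +
  # scale_x_reordered() (tidytext) strips the "___<user type>" suffix that reorder_within() appends to the axis labels
  scale_x_reordered() +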
  facet_wrap(~member_casual, scales = "free_y") + 
  theme(legend.position = "bottom") +
  theme(legend.title = element_text(colour = "black", size =11, face ="bold")) +
  theme(legend.text = element_text(colour = "black", size =10, face ="bold")) +
  coord_flip() +
  labs(title = 'Descending Rides’ Number by each Day, Faceting by User Type') +
  theme(plot.title = element_text(size = 11,face="bold")) +    
  xlab('Name of Days of Week') +
  ylab('Descending Rides’ Number (in thousands)') +    
  theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
  theme(axis.title.x = element_text(face = "bold")) +
  theme(axis.text.y = element_text(face = "bold")) +
  theme(axis.title.y = element_text(face = "bold"))

#ggsave("Descending Rides’ Number by each Day of Week.png")
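To see what reorder_within() does to the day names, the following sketch inspects its result on four counts taken from the table in 30.-. It assumes the function comes from the tidytext package, as in the blog post linked above.

```r
library(tidytext)

# Four counts from the summary table (Sat and Sun, both user types)
days <- c("Sat", "Sun", "Sat", "Sun")
user <- c("casual", "casual", "member", "member")
rides <- c(306363, 264784, 332298, 302801)

# Each day name is pasted to its facet value with a "___" separator and
# the resulting factor is ordered by the ride counts
reordered <- reorder_within(days, rides, user)
levels(reordered)
```

Because the facet value is embedded in each level, the same day can take a different position in each facet. tidytext also provides scale_x_reordered(), which removes the suffix from the axis labels when the plot is drawn.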

31.1.-

31.1.- Plot2 = “Descending Rides’ Number by each Day of Week” and plotting p1/p2

#{r Plot2, width= 10, height= 15}
#p2 <- sumary2ridesnumber_days_char %>%
#  ggplot() +
#  geom_col(aes(reorder_within(name_dayofweek, rides_number, member_casual), rides_number/1000, fill = member_casual), position = "dodge") +
#  facet_wrap(~member_casual, scales = "free_y") +
#  theme(legend.position = "bottom") +
#  theme(legend.title = element_text(colour = "black", size =11, face ="bold")) +
#  theme(legend.text = element_text(colour = "black", size =10, face ="bold")) +
#  coord_flip() +
#  labs(title = 'Descending Rides’ Number by each Day, Faceting by User Type') +
#  theme(plot.title = element_text(size = 11,face="bold")) +
#  xlab('Name of Days of Week') +
#  ylab('Descending Rides’ Number (in thousands)') +
#  theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
#  theme(axis.title.x = element_text(face = "bold")) +
#  theme(axis.text.y = element_text(face = "bold")) +
#  theme(axis.title.y = element_text(face = "bold"))
#p1/p2
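The `p1/p2` expression in the disabled chunk relies on the patchwork package, which overloads arithmetic operators for ggplot objects. A minimal sketch with two toy plots (the names `p_top` and `p_bottom` are illustrative and are not the report's p1 and p2):

```r
library(ggplot2)
library(patchwork)

toy <- data.frame(x = c("a", "b"), y = c(1, 2))

p_top <- ggplot(toy, aes(x, y)) + geom_col()
p_bottom <- ggplot(toy, aes(x, y)) + geom_point()

# "/" stacks the plots vertically; "|" would place them side by side
stacked <- p_top / p_bottom
stacked
```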

32.-

32.- A.3.- Create a tibble: summary3ridesnumber_month_ol, based on oneyeartrips_ok dataset

This summary tibble generates 3 columns arranged in the following order: member_casual-month-rides_number. It is required for the graph in the following point, where the “X” axis shows the 12 months of the year (“month”) in order (“Oct_23” to “Sep_24”) and the “Y” axis shows the corresponding values of the Rides’ Number (“rides_number”).

Tibble based on oneyeartrips_ok dataset

User Type = member_casual

Create a new tibble

Definition of: summary3ridesnumber_month_ol tibble (the code names this object sumary3ridesnumber_month_ol)

sumary3ridesnumber_month_ol  <- oneyeartrips_ok %>%
   group_by(member_casual, month) %>%
   summarise(rides_number = n())  %>%
   arrange(member_casual, month)
## `summarise()` has grouped output by 'member_casual'. You can override using the
## `.groups` argument.
print(sumary3ridesnumber_month_ol)
## # A tibble: 24 × 3
## # Groups:   member_casual [2]
##    member_casual month  rides_number
##    <chr>         <ord>         <int>
##  1 casual        Oct_23       130297
##  2 casual        Nov_23        72077
##  3 casual        Dec_23        36686
##  4 casual        Jan_24        17713
##  5 casual        Feb_24        38170
##  6 casual        Mar_24        62818
##  7 casual        Apr_24        93943
##  8 casual        May_24       167481
##  9 casual        Jun_24       210679
## 10 casual        Jul_24       231970
## # ℹ 14 more rows
sapply(sumary3ridesnumber_month_ol,class)
## $member_casual
## [1] "character"
## 
## $month
## [1] "ordered" "factor" 
## 
## $rides_number
## [1] "integer"

32.1.- summary3ridesnumber_month_ol. ol = ordered levels

33.-

33.- A.3.1.- Plot: “Rides’ Number by each Month by User Type”

“Rides’ Number by each Month by User Type” Plot sumary3ridesnumber_month_ol

sumary3ridesnumber_month_ol  %>%
   ggplot(aes(x = month, y = rides_number/1000, fill = member_casual)) +
   geom_col(position = "dodge") +
   theme(legend.position="bottom") +
   theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
   theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
   labs(title = 'Rides’ Number by each Month by User Type') +
   theme(plot.title = element_text(size = 11,face="bold")) +
   xlab('Month') +
   ylab('Rides’ Number by each Month (in thousands)') +    
   theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
   theme(axis.title.x = element_text(face = "bold")) +
   theme(axis.text.y = element_text(face = "bold")) +
   theme(axis.title.y = element_text(face = "bold"))

#ggsave("Rides’ Number by each Month by User Type.png")

34.-

34.- A.4.- Create a tibble: summary4ridesnumber_month_char, based on oneyear_trips_addcolv2 dataset

This summary tibble generates 3 columns arranged in the following order: member_casual-month-rides_number, and it is necessary for generating the graphs with descending values on the “Y” axis mentioned in the following point.

Tibble based on oneyear_trips_addcolv2 dataset

User Type = member_casual

Create a new tibble

Definition of: summary4ridesnumber_month_char

summary4ridesnumber_month_char  <-  oneyear_trips_addcolv2 %>%
   group_by(member_casual, month) %>%
   summarise(rides_number = n())  %>%
   arrange(member_casual, month)
## `summarise()` has grouped output by 'member_casual'. You can override using the
## `.groups` argument.
print(summary4ridesnumber_month_char)
## # A tibble: 24 × 3
## # Groups:   member_casual [2]
##    member_casual month  rides_number
##    <chr>         <chr>         <int>
##  1 casual        Apr_24        93943
##  2 casual        Aug_24       228518
##  3 casual        Dec_23        36686
##  4 casual        Feb_24        38170
##  5 casual        Jan_24        17713
##  6 casual        Jul_24       231970
##  7 casual        Jun_24       210679
##  8 casual        Mar_24        62818
##  9 casual        May_24       167481
## 10 casual        Nov_23        72077
## # ℹ 14 more rows
sapply(summary4ridesnumber_month_char, class)
## member_casual         month  rides_number 
##   "character"   "character"     "integer"

34.1.- summary4ridesnumber_month_char. char = character

35.-

35.- A.4.1.- Plot: ”Descending Rides’ Number by each Month, Faceting by User Type”

The following code presents the values of “month” sorted in descending order of the Rides’ Number (“rides_number”), faceting by User Type.

User Type = member_casual

The plot is presented with the axes rotated 90 degrees (coord_flip()), so the months appear on the vertical axis and the Rides’ Number on the horizontal axis.

”Descending Rides’ Number by each Month, Faceting by User Type” Plot

https://juliasilge.com/blog/reorder-within/

https://ggplot2.tidyverse.org/reference/coord_flip.html

summary4ridesnumber_month_char  %>%
    ggplot() + 
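    geom_col(aes(reorder_within(month, rides_number, member_casual), rides_number/1000, fill = member_casual), position = "dodge") +
    # scale_x_reordered() (tidytext) strips the "___<user type>" suffix that reorder_within() appends to the axis labels
    scale_x_reordered() +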
    facet_wrap(~member_casual, scales = "free_y") + 
    theme(legend.position = "bottom") +
    theme(legend.title = element_text(colour = "black", size =11, face ="bold")) +
    theme(legend.text = element_text(colour = "black", size =10, face ="bold")) +
    coord_flip() +
    labs(title = 'Descending Rides’ Number by each Month, Faceting by User Type') +
    theme(plot.title = element_text(size = 11, face="bold")) +    
    xlab('Month') +
    ylab('Descending Rides’ Number (in thousands)') +    
    theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
    theme(axis.title.x = element_text(face = "bold")) +
    theme(axis.text.y = element_text(face = "bold")) +
    theme(axis.title.y = element_text(face = "bold"))

#ggsave("Descending Rides’ Number by each Month, Faceting by User Type.png")

AB.-

CALCULATION AND GRAPHS OF DAY OF WEEK (MONTH) VS RIDES_NUMBER, PARAMETRIZED VIA THE 12 MONTHS OF THE YEAR (7 DAYS OF THE WEEK), FACETING BY USER TYPE (MEMBER_CASUAL)

36.-

36.- A.5.- Create a tibble: summary5_usertype-daysofweek-month-ridesnumber_ol, based on oneyeartrips_ok dataset

This tibble summarizes the calculation behind the graphs of “name_dayofweek” vs the Rides’ Number (“rides_number”), parametrized via the 12 months of the year, faceting by User Type.

User Type = member_casual

Tibble based on oneyeartrips_ok dataset

Definition of: summary5columns4_ol = summary5_usertype-daysofweek-month-ridesnumber_ol

summary5columns4_ol <- oneyeartrips_ok %>%
   group_by(member_casual, name_dayofweek, month) %>%
   summarise(rides_number = n())  %>%
   arrange(member_casual, name_dayofweek, month)
## `summarise()` has grouped output by 'member_casual', 'name_dayofweek'. You can
## override using the `.groups` argument.
print(summary5columns4_ol)
## # A tibble: 168 × 4
## # Groups:   member_casual, name_dayofweek [14]
##    member_casual name_dayofweek month  rides_number
##    <chr>         <ord>          <ord>         <int>
##  1 casual        Mon            Oct_23        18015
##  2 casual        Mon            Nov_23         7961
##  3 casual        Mon            Dec_23         3537
##  4 casual        Mon            Jan_24         2862
##  5 casual        Mon            Feb_24         5160
##  6 casual        Mon            Mar_24         8024
##  7 casual        Mon            Apr_24        14304
##  8 casual        Mon            May_24        18031
##  9 casual        Mon            Jun_24        21493
## 10 casual        Mon            Jul_24        27425
## # ℹ 158 more rows
sapply(summary5columns4_ol,class)
## $member_casual
## [1] "character"
## 
## $name_dayofweek
## [1] "ordered" "factor" 
## 
## $month
## [1] "ordered" "factor" 
## 
## $rides_number
## [1] "integer"
#View(summary5columns4_ol)

36.1.- summary5columns4_ol. ol = ordered levels
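The group_by() + summarise(n()) pattern used for these count tables has a shorthand in dplyr, count(). A hedged sketch on toy data showing that both forms give the same counts (the rows below are invented for illustration):

```r
library(dplyr)

# Toy rides (hypothetical)
toy_trips <- tibble(
  member_casual = c("casual", "casual", "casual", "member", "member"),
  name_dayofweek = c("Mon", "Mon", "Tue", "Mon", "Mon"),
  month = c("Oct_23", "Oct_23", "Oct_23", "Nov_23", "Nov_23")
)

# Shorthand: count() groups, counts and ungroups in one step
via_count <- toy_trips %>%
  count(member_casual, name_dayofweek, month, name = "rides_number")

# Long form, as used in this report
via_summarise <- toy_trips %>%
  group_by(member_casual, name_dayofweek, month) %>%
  summarise(rides_number = n(), .groups = "drop")

via_count
```

The long form remains necessary when other statistics, such as the mean duration, are computed in the same step.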

37.-

37.- A.5.1- Plot: “Rides’ Number by each Day of Week in each Month, Faceting by User Type”

“Rides’ Number by each Day of Week in each Month, Faceting by User Type” Plot

https://www.youtube.com/watch?v=h14MWrYZjL0&t=58s

summary5columns4_ol %>%
    ggplot(aes(x = name_dayofweek, y = rides_number/1000, group = month, colour = month, shape = month )) +
    geom_point() +
    scale_shape_manual(values=seq(0,11)) +
    scale_color_manual(values=c("#FF66FF","#FF0000","#0000CC","#00994C","#9933FF","#FF3399","#A0A0A0","#000000","#99FFFF","#66FF66","#FFFF00","#FFCC99")) +
    geom_line(linewidth = 1.2 ) +
    facet_grid(member_casual ~ .) +
    theme(legend.position="bottom") +
    theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
    theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
    labs(title = 'Rides’ Number by each Day of Week in each Month, Faceting by User Type') +
    theme(plot.title = element_text(size = 11, face="bold")) +
    xlab('Name of Day of Week') +
    ylab('Rides’ Number (in thousands)') +    
    theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
    theme(axis.title.x = element_text(face = "bold")) +
    theme(axis.text.y = element_text(face = "bold")) +
    theme(axis.title.y = element_text(face = "bold"))

#ggsave("Rides’ Number by each Day of Week in each Month, Faceting by User Type.png")

38.-

38.- A.5.2- Plot: “Rides’ Number by each Month in each Day of Week, Faceting by User Type”

User Type = member_casual

“Rides’ Number by each Month in each Day of Week, Faceting by User Type” Plot

summary5columns4_ol %>%
    ggplot(aes(x = month, y = rides_number/1000, group = name_dayofweek, colour = name_dayofweek, shape = name_dayofweek )) +
    geom_point() +
    scale_shape_manual(values=seq(0,6)) +
    scale_color_manual(values=c("#FF66FF","#FF0000","#0000CC","#00994C","#9933FF","#000000","#99FFFF")) +
    geom_line(linewidth = 1.2) +
    facet_grid(member_casual ~ .) +
    theme(legend.position="bottom") +
    theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
    theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
    labs(title = 'Rides’ Number by each Month in each Day of Week, Faceting by User Type') +
    theme(plot.title = element_text(size = 11, face="bold")) +
    xlab('Month') +
    ylab('Rides’ Number (in thousands)') +    
    theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
    theme(axis.title.x = element_text(face = "bold")) +
    theme(axis.text.y = element_text(face = "bold")) +
    theme(axis.title.y = element_text(face = "bold"))

#ggsave("Rides’ Number by each Month in each Day of Week, Faceting by User Type.png")

S———————————————————————————————————————S

SECTION: AVG RIDES’ TIME - DAY of WEEK - USER TYPE

B.-

CALCULATION AND GRAPHS OF DAY_OF WEEK (MONTH) VS AVERAGE RIDE_DURATION, PARAMETRIZED VIA THE 12 MONTHS OF THE YEAR(7 DAYS OF THE WEEK), BY USER TYPE OR FACETING BY USER TYPE (MEMBER_CASUAL)

39.-

39.- B.1.- Create a tibble:Summary6columns4_ol tibble, based on oneyeartrips_ok dataset

In this section begins the graphical analysis where are presented on the axis of the “Y” values of “Average duration Time of Trips” (“Average Rides’ Time”) made by the two types of users of this transport system: “member” and “casual”, and on the axis of the “X”, the names of the 7 days of the week, as the names of the 12 months of the year, facetting by “member and casual”

This summary table generates 4 columns arranged in the following order: member_casual-month-name_dayofweek-avg_rides_duration_seg, and is necessary for the generation of the graphs shown in the following points.

Tibble based on oneyeartrips_ok dataset

User Type = member-casual

Create a new tibble

Definition of: Summary6columns4_ol tibble =

Summary6columns4_ol <- oneyeartrips_ok %>% 
   group_by(member_casual, month, name_dayofweek) %>%
   summarise(avg_rides_duration_seg = mean(ride_duration)) %>%
   arrange(member_casual, month, name_dayofweek)
## `summarise()` has grouped output by 'member_casual', 'month'. You can override
## using the `.groups` argument.
print(Summary6columns4_ol)
## # A tibble: 168 × 4
## # Groups:   member_casual, month [24]
##    member_casual month  name_dayofweek avg_rides_duration_seg
##    <chr>         <ord>  <ord>                           <dbl>
##  1 casual        Oct_23 Mon                             1225.
##  2 casual        Oct_23 Tue                             1164.
##  3 casual        Oct_23 Wed                             1170.
##  4 casual        Oct_23 Thu                             1040.
##  5 casual        Oct_23 Fri                             1164.
##  6 casual        Oct_23 Sat                             1366.
##  7 casual        Oct_23 Sun                             1603.
##  8 casual        Nov_23 Mon                              970.
##  9 casual        Nov_23 Tue                              861.
## 10 casual        Nov_23 Wed                              917.
## # ℹ 158 more rows
sapply(Summary6columns4_ol, class)
## $member_casual
## [1] "character"
## 
## $month
## [1] "ordered" "factor" 
## 
## $name_dayofweek
## [1] "ordered" "factor" 
## 
## $avg_rides_duration_seg
## [1] "numeric"
#View(Summary6columns4_ol)

39.1.- Summary6columns4_ol. ol = ordered levels
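
The `summarise()` message printed above is informational: by default the result stays grouped by every grouping variable except the last one. The following is a minimal sketch on made-up data (the `toy_trips` tibble and its values are invented for illustration and are not taken from the Cyclistic data). It shows how `.groups = "drop"` returns an ungrouped tibble and silences the message.

```r
library(dplyr)

# Hypothetical toy data standing in for oneyeartrips_ok (durations in seconds)
toy_trips <- tibble(
  member_casual  = c("casual", "casual", "member", "member"),
  month          = c("Oct_23", "Oct_23", "Oct_23", "Oct_23"),
  name_dayofweek = c("Mon", "Mon", "Mon", "Tue"),
  ride_duration  = c(600, 1200, 300, 900)
)

# Same pipeline as above, but dropping the grouping after summarising
toy_summary <- toy_trips %>%
  group_by(member_casual, month, name_dayofweek) %>%
  summarise(avg_rides_duration_seg = mean(ride_duration), .groups = "drop") %>%
  arrange(member_casual, month, name_dayofweek)

print(toy_summary)
```

Whether to drop the groups is a design choice: the grouped result used in this report works for plotting, while an ungrouped tibble avoids surprises in later `summarise()` or `mutate()` calls.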

40.-

40.- B.1.1.- Plot: “Average Rides’ Time by each Day of Week by User Type”

This code plots the days of the week (“name_dayofweek”) on the “X” axis, ordered from Monday to Sunday, and the Average Rides’ Time (“avg_rides_duration_seg”, divided by 60 to give minutes) on the “Y” axis, for each User Type.

User Type = member_casual

Average Rides’ Time by each Day of Week by User Type Plot

Summary6columns4_ol  %>%
   ggplot(aes(x = name_dayofweek, y = avg_rides_duration_seg /60, fill = member_casual)) +
   geom_col(position = "dodge") +
   theme(legend.position="bottom") +
   theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
   theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
   labs(title = 'Average Rides’ Time by each Day of Week by User Type ') +
   theme(plot.title = element_text(size = 11, face="bold")) +
   xlab('Name of Days of Week') +
   ylab('Average Rides’ Time (in min.)') +    
   theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
   theme(axis.title.x = element_text(face = "bold")) +
   theme(axis.text.y = element_text(face = "bold")) +
   theme(axis.title.y = element_text(face = "bold"))

#ggsave("Average Rides’ Time by each Day of Week by User Type.png")
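
Summary6columns4_ol has one row per user type, month and day of week (168 rows), so each bar in the plot above is built from 12 monthly rows. With `position = "dodge"`, `geom_col()` draws rows that share the same x value and fill on top of one another, so the visible bar height is the largest of the 12 monthly averages rather than a yearly average. The sketch below, on invented toy data, shows one way to get exactly one value per bar by summarising the trip-level data by user type and day only. The names `toy_trips` and `toy_by_day` are hypothetical.

```r
library(dplyr)
library(ggplot2)

# Hypothetical trip-level data (durations in seconds)
toy_trips <- tibble(
  member_casual  = rep(c("casual", "member"), each = 4),
  name_dayofweek = factor(rep(c("Mon", "Mon", "Tue", "Tue"), 2),
                          levels = c("Mon", "Tue"), ordered = TRUE),
  ride_duration  = c(1200, 1800, 900, 1500, 600, 720, 540, 660)
)

# One row per user type and day, so each bar has exactly one height
toy_by_day <- toy_trips %>%
  group_by(member_casual, name_dayofweek) %>%
  summarise(avg_rides_duration_seg = mean(ride_duration), .groups = "drop")

p <- ggplot(toy_by_day,
            aes(x = name_dayofweek, y = avg_rides_duration_seg / 60,
                fill = member_casual)) +
  geom_col(position = "dodge")
print(p)
```

In the real analysis the same idea would mean grouping oneyeartrips_ok by `member_casual` and `name_dayofweek` before plotting. Averaging the 12 monthly means instead would weight every month equally regardless of how many rides it had.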

41.-

41.- B.2.- Plot: “Descending Average Rides’ Time by each Day of Week, Faceting by User Type”

This code plots the values of “name_dayofweek” on the “X” axis, sorted in descending order of Average Rides’ Time (“avg_rides_duration_seg”) on the “Y” axis, faceting by User Type. The ordering within each facet uses reorder_within() from the tidytext package (see the link below).

The plot is drawn with the axes flipped 90 degrees (coord_flip()).

“Descending Average Rides’ Time by each Day of Week, Faceting by User Type” Plot

https://juliasilge.com/blog/reorder-within/

Summary6columns4_ol %>% 
   ggplot() + 
   geom_col(aes(reorder_within(name_dayofweek, avg_rides_duration_seg, member_casual), avg_rides_duration_seg /60, fill = member_casual), position = "dodge") +
   facet_wrap(~member_casual, scales = "free_y") + 
   theme(legend.position = "bottom") +
   theme(legend.title = element_text(colour = "black", size =11, face ="bold")) +
   theme(legend.text = element_text(colour = "black", size =10, face ="bold")) +
   coord_flip() +
   # Strip the "___<user type>" suffix that reorder_within() adds to the labels
   scale_x_reordered() +
   labs(title = 'Descending Average Rides’ Time by each Day of Week, Faceting by User Type') +
   theme(plot.title = element_text(size = 11, face="bold")) +    
   xlab('Name of Days of Week') +
   ylab('Descending Average Rides’ Time (in min.)') +
   theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
   theme(axis.title.x = element_text(face = "bold")) +
   theme(axis.text.y = element_text(face = "bold")) +
   theme(axis.title.y = element_text(face = "bold"))

#ggsave("Descending  Average Rides’ Time by each Day of Week, Faceting by User Type.png")
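
For reference, `reorder_within()` (from the tidytext package, as described in the linked post) orders a variable separately inside each facet by pasting the facet value onto each label. The sketch below uses a few illustrative values to show the factor it returns. The vectors `days`, `users` and `avg_s` are hypothetical names.

```r
library(tidytext)   # provides reorder_within() and scale_x_reordered()

# Illustrative values only: two days for each user type, durations in seconds
days  <- c("Mon", "Sun", "Mon", "Sun")
users <- c("casual", "casual", "member", "member")
avg_s <- c(1225, 1603, 659, 785)

# Each label gets the facet value appended, then levels are sorted by avg_s
ordered_days <- reorder_within(days, avg_s, users)
levels(ordered_days)
```

Because the levels carry the appended facet value, the axis labels show that suffix unless `scale_x_reordered()` (or `scale_y_reordered()`) is added to the plot.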

42.-

42.- B.3.- Plot: “Average Rides’ Time by each Month by User Type”

This code plots the variable “month” on the “X” axis, ordered from “Oct_23” to “Sep_24”, and the corresponding values of “avg_rides_duration_seg” (in minutes) on the “Y” axis, by User Type.

Average Rides’ Time by each Month by User Type Plot

Summary6columns4_ol %>%
   ggplot(aes(x = month, y = avg_rides_duration_seg /60, fill = member_casual)) +
   geom_col(position = "dodge") +
   theme(legend.position="bottom") +
   theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
   theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
   labs(title = 'Average Rides’ Time by each Month by User Type') +
   theme(plot.title = element_text(size = 11, face="bold")) +
   xlab('Month') +
   ylab('Average Rides’ Time (in min.)') +    
   theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
   theme(axis.title.x = element_text(face = "bold")) +
   theme(axis.text.y = element_text(face = "bold")) +
   theme(axis.title.y = element_text(face = "bold"))

#ggsave("Average Rides’ Time by each Month by User Type.png")

43.-

43.- B.4.- Create a tibble: summary7cols3_char, based on “oneyear_trips_addcolv2” dataset

This summary tibble has 3 columns arranged in the following order: User Type (member_casual)-Month (month)-Average Rides’ Duration (avg_rides_duration_seg). It is needed to generate the graph with descending values on the “Y” axis described in the following point.

Tibble based on “oneyear_trips_addcolv2” dataset.

Definition of : summary7cols3_char tibble =

summary7_usertype-month-avg_rides_duration_seg_char

summary7cols3_char <- oneyear_trips_addcolv2  %>%
   group_by(member_casual, month)  %>%
   summarise(avg_rides_duration_seg = mean(ride_duration)) %>%
   arrange(member_casual, month, avg_rides_duration_seg)
## `summarise()` has grouped output by 'member_casual'. You can override using the
## `.groups` argument.
print(summary7cols3_char)
## # A tibble: 24 × 3
## # Groups:   member_casual [2]
##    member_casual month  avg_rides_duration_seg
##    <chr>         <chr>                   <dbl>
##  1 casual        Apr_24                  1486.
##  2 casual        Aug_24                  1489.
##  3 casual        Dec_23                   992.
##  4 casual        Feb_24                  1189.
##  5 casual        Jan_24                   932.
##  6 casual        Jul_24                  1589.
##  7 casual        Jun_24                  1577.
##  8 casual        Mar_24                  1322.
##  9 casual        May_24                  1612.
## 10 casual        Nov_23                  1073.
## # ℹ 14 more rows
sapply(summary7cols3_char,class)
##          member_casual                  month avg_rides_duration_seg 
##            "character"            "character"              "numeric"

43.1.- summary7cols3_char. char = character. Tibble with 24 rows and 3 columns
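
Because “month” is a character column in this tibble, arrange() sorts it alphabetically (Apr_24, Aug_24, Dec_23, ...), unlike the ordered factor in Summary6columns4_ol. That does not matter for the next plot, which reorders the months by value. If chronological order is needed, the conversion can be sketched as follows; the `month_levels` vector is an assumption about how the ordered factor was built, using the month labels that appear in the printed outputs.

```r
# Assumed chronological order of the month labels used in this report
month_levels <- c("Oct_23", "Nov_23", "Dec_23", "Jan_24", "Feb_24", "Mar_24",
                  "Apr_24", "May_24", "Jun_24", "Jul_24", "Aug_24", "Sep_24")

# A few labels as they appear in the character column
month_chr <- c("Apr_24", "Aug_24", "Dec_23", "Oct_23")

# Convert to an ordered factor so that sorting follows the calendar
month_ol <- factor(month_chr, levels = month_levels, ordered = TRUE)

sort(month_chr)   # alphabetical order
sort(month_ol)    # chronological order
```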

44.-

44.- B.4.1.- Plot: “Descending Average Rides’ Time by each Month, Faceting by User Type”

This code plots the values of “month” on the “X” axis, sorted in descending order of Average Rides’ Time (“avg_rides_duration_seg”) on the “Y” axis, faceting by User Type.

User Type = member_casual

The plot is drawn with the axes flipped 90 degrees (coord_flip()).

“Descending Average Rides’ Time by each Month, Faceting by User Type” Plot

https://juliasilge.com/blog/reorder-within/

summary7cols3_char %>%
   ggplot() + 
  geom_col(aes(reorder_within(month, avg_rides_duration_seg, member_casual), avg_rides_duration_seg /60, fill = member_casual), position = "dodge") +
  facet_wrap(~member_casual, scales = "free_y") + 
  theme(legend.position = "bottom") +
  theme(legend.title = element_text(colour = "black", size =11, face ="bold")) +
  theme(legend.text = element_text(colour = "black", size =10, face ="bold")) +
  coord_flip() +
  # Strip the "___<user type>" suffix that reorder_within() adds to the labels
  scale_x_reordered() +
  labs(title = 'Descending Average Rides’ Time by each Month, Faceting by User Type') +
  theme(plot.title = element_text(size = 11, face="bold")) +    
  xlab('Month') +
  ylab('Descending Average Rides’ Time (in min.)') +    
  theme(axis.text = element_text(face = "bold")) +
  theme(axis.title.y = element_text(face = "bold")) +
  theme(axis.title.x = element_text(face = "bold"))

#ggsave("Descending Average Rides’ Time by each Month, Faceting by User Type.png")

45.-

45.- B.5.1.- Plot: “Avg. Rides’ Time by each Day of Week in each Month, Faceting by User Type”

This code summarises and plots the variable “name_dayofweek” against the Average Rides’ Time (“avg_rides_duration_seg”), with one line per month (12 months of the year), faceting by User Type.

User Type = member_casual

Avg. Rides’ Time by each Day of Week in each Month, Faceting by User Type Plot

Summary6columns4_ol %>%
    ggplot(aes(x = name_dayofweek, y = avg_rides_duration_seg/60, group = month, colour = month, shape = month)) +
   geom_point() +
   scale_shape_manual(values=seq(0,11)) +
   scale_color_manual(values=c("#FF66FF","#FF0000","#0000CC","#00994C","#9933FF","#FF3399","#A0A0A0","#000000","#99FFFF","#66FF66","#FFFF00","#FFCC99")) +
    geom_line(linewidth = 1.2) +
    facet_grid(member_casual ~ .) +
    theme(legend.position="bottom") +
    theme(legend.title = element_text(colour= "black", size = 11, face="bold")) +
    theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
    labs(title = 'Avg. Rides’ Time by each Day of Week in each Month, Faceting by User Type') +
   theme(plot.title = element_text(size = 11,face="bold")) +
   xlab('Name of Day of Week') +
   ylab(' Average Rides’ Time (in min.)') +    
   theme(axis.text.x = element_text(angle = 30, face = "bold", size = 11 )) + 
   theme(axis.title.x = element_text(face = "bold", size = 11)) +
   theme(axis.text.y = element_text(face = "bold", size = 11)) +
   theme(axis.title.y = element_text(face = "bold", size = 11))

#ggsave("Avg. Rides’ Time by each Day of Week in each Month, Faceting by User Type.png")
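
The twelve colours and shapes in the plot above are matched to the months by position. The small sketch below names each value so the mapping is explicit, assuming the month levels run from Oct_23 to Sep_24 as in the printed outputs. `month_colours` and `month_shapes` are hypothetical names.

```r
# Assumed chronological order of the month levels
month_levels <- c("Oct_23", "Nov_23", "Dec_23", "Jan_24", "Feb_24", "Mar_24",
                  "Apr_24", "May_24", "Jun_24", "Jul_24", "Aug_24", "Sep_24")

# Same colours as in the plot above, now keyed by month
month_colours <- setNames(
  c("#FF66FF", "#FF0000", "#0000CC", "#00994C", "#9933FF", "#FF3399",
    "#A0A0A0", "#000000", "#99FFFF", "#66FF66", "#FFFF00", "#FFCC99"),
  month_levels
)

# Same shape codes (0 to 11), keyed by month
month_shapes <- setNames(seq(0, 11), month_levels)

month_colours["Jan_24"]
```

The named vectors can then be passed as `scale_color_manual(values = month_colours)` and `scale_shape_manual(values = month_shapes)`; ggplot2 matches named values to factor levels by name.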

46.-

46.- B.5.2.- Plot: “Avg. Rides’ Time by each Month in each Day of Week, Faceting by User Type”

This code summarises and plots the variable “month” against the Average Rides’ Time (“avg_rides_duration_seg”), with one line per day of the week (7 days), faceting by User Type.

User Type = member_casual

“Avg. Rides’ Time by each Month in each Day of Week, Faceting by User Type” Plot

Summary6columns4_ol %>%
   ggplot(aes(x = month, y = avg_rides_duration_seg/60, group = name_dayofweek, colour = name_dayofweek, shape = name_dayofweek)) +
   geom_point() +
   scale_shape_manual(values=seq(0,6)) +
   scale_color_manual(values=c("#FF66FF","#FF0000","#0000CC","#00994C","#9933FF","#000000","#99FFFF")) +
   geom_line(linewidth = 1.2) +
   facet_grid(member_casual ~ .) +
   theme(legend.position="bottom") +
   theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
   theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
   labs(title = 'Avg. Rides’ Time by each Month in each Day of Week, Faceting by User Type') +
  theme(plot.title = element_text(size = 11, face = "bold")) +
   xlab('Month') +
   ylab('Average Rides’ Time (in min.)') +    
   theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
   theme(axis.title.x = element_text(face = "bold")) +
   theme(axis.text.y = element_text(face = "bold")) +
   theme(axis.title.y = element_text(face = "bold"))

#ggsave("Avg. Rides’ Time by each Month in each Day of Week, Faceting by User Type.png")

47.-

47.- B.5.2.1.- Create a tibble: Expanded_view_member_section46, based on Summary6columns4_ol

Summary tibble for Expanded view of the “member” section, point 46

https://www.statology.org/dplyr-filter-multiple-conditions/

Expanded_view_member_section46 <- Summary6columns4_ol %>%
  filter(member_casual == 'member')
print(Expanded_view_member_section46)
## # A tibble: 84 × 4
## # Groups:   member_casual, month [12]
##    member_casual month  name_dayofweek avg_rides_duration_seg
##    <chr>         <ord>  <ord>                           <dbl>
##  1 member        Oct_23 Mon                              659.
##  2 member        Oct_23 Tue                              694.
##  3 member        Oct_23 Wed                              691.
##  4 member        Oct_23 Thu                              675.
##  5 member        Oct_23 Fri                              678.
##  6 member        Oct_23 Sat                              738.
##  7 member        Oct_23 Sun                              785.
##  8 member        Nov_23 Mon                              643.
##  9 member        Nov_23 Tue                              621.
## 10 member        Nov_23 Wed                              643.
## # ℹ 74 more rows
#View(Expanded_view_member_section46)
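
As the “Groups:” line in the output shows, filter() keeps the grouping of the tibble it receives. The sketch below, on a small made-up tibble (`toy_summary` is a hypothetical name), shows the grouping surviving the filter and how ungroup() removes it when a plain tibble is preferred.

```r
library(dplyr)

# Hypothetical grouped summary with illustrative values (seconds)
toy_summary <- tibble(
  member_casual          = c("casual", "casual", "member", "member"),
  month                  = c("Oct_23", "Nov_23", "Oct_23", "Nov_23"),
  avg_rides_duration_seg = c(1225, 970, 659, 643)
) %>%
  group_by(member_casual, month)

# Filtering keeps the grouping variables
toy_member <- toy_summary %>%
  filter(member_casual == "member")
group_vars(toy_member)

# ungroup() returns a plain tibble with the same rows
toy_member_flat <- toy_member %>% ungroup()
group_vars(toy_member_flat)
```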

48.-

48.- B.5.2.1m.- Plot: “Expanded View Avg. Rides’ Time by each Month in each Day of Week, by Member User Type”

Expanded view of the “member” section in graph (“Avg. Rides’ Time by each Month in each Day of Week, by User Type” Plot) at point 46

https://www.statology.org/dplyr-filter-multiple-conditions/

Expanded_view_member_section46 %>%    
   ggplot(aes(x = month, y = avg_rides_duration_seg/60, group = name_dayofweek, colour = name_dayofweek, shape = name_dayofweek)) +
   geom_point() +
   scale_shape_manual(values=seq(0,6)) +
   scale_color_manual(values=c("#FF66FF","#FF0000","#0000CC","#00994C","#9933FF","#000000","#99FFFF")) +
   geom_line(linewidth = 1.2) +
   theme(legend.position="bottom") +
   theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
   theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
   labs(title = 'Expanded View Avg. Rides’ Time by each Month in each Day of Week, by Member User Type') +
   theme(plot.title = element_text(size = 11, face = "bold")) +
   xlab('Month') +
   ylab('Average Rides’ Time (in min.)') +    
   theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
   theme(axis.title.x = element_text(face = "bold")) +
   theme(axis.text.y = element_text(face = "bold")) +
   theme(axis.title.y = element_text(face = "bold"))

#ggsave("Expanded View Avg. Rides’ Time by each Month in each Day of Week, by Member User Type.png")

49.-

49.- B.5.2.1c.- Create a tibble: Expanded_view_casual_section46, based on Summary6columns4_ol

Summary tibble for Expanded view of the “casual” section, point 46

https://www.statology.org/dplyr-filter-multiple-conditions/

Expanded_view_casual_section46 <- Summary6columns4_ol %>%
  filter(member_casual == 'casual')
print(Expanded_view_casual_section46)
## # A tibble: 84 × 4
## # Groups:   member_casual, month [12]
##    member_casual month  name_dayofweek avg_rides_duration_seg
##    <chr>         <ord>  <ord>                           <dbl>
##  1 casual        Oct_23 Mon                             1225.
##  2 casual        Oct_23 Tue                             1164.
##  3 casual        Oct_23 Wed                             1170.
##  4 casual        Oct_23 Thu                             1040.
##  5 casual        Oct_23 Fri                             1164.
##  6 casual        Oct_23 Sat                             1366.
##  7 casual        Oct_23 Sun                             1603.
##  8 casual        Nov_23 Mon                              970.
##  9 casual        Nov_23 Tue                              861.
## 10 casual        Nov_23 Wed                              917.
## # ℹ 74 more rows
#View(Expanded_view_casual_section46)

50.-

50.- B.5.2.1ev- Plot: “Expanded View Avg. Rides’ Time by each Month in each Day of Week, by Casual User Type”

Expanded view of the “casual” section in graph (“Avg. Rides’ Time by each Month in each Day of Week, by User Type” Plot) at point 46

Expanded_view_casual_section46 %>%    
   ggplot(aes(x = month, y = avg_rides_duration_seg/60, group = name_dayofweek, colour = name_dayofweek, shape = name_dayofweek)) +
   geom_point() +
   scale_shape_manual(values=seq(0,6)) +
   scale_color_manual(values=c("#FF66FF","#FF0000","#0000CC","#00994C","#9933FF","#000000","#99FFFF")) +
   geom_line(linewidth = 1.2) +
   theme(legend.position="bottom") +
   theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
   theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
   labs(title = 'Expanded View Avg. Rides’ Time by each Month in each Day of Week, by Casual User Type') +
   theme(plot.title = element_text(size = 11, face = "bold")) +
   xlab('Month') +
   ylab('Average Rides’ Time (in min.)') +    
   theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
   theme(axis.title.x = element_text(face = "bold")) +
   theme(axis.text.y = element_text(face = "bold")) +
   theme(axis.title.y = element_text(face = "bold"))

#ggsave("Expanded View Avg. Rides’ Time by each Month in each Day of Week, by Casual User Type.png")
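
The two expanded-view plots (points 48 and 50) differ only in the user type and the title. The sketch below shows how a small helper function could produce both from the same summary tibble. `plot_expanded_view()` is a hypothetical helper, not part of the original analysis, and `toy_summary` holds a few illustrative values only.

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper: filters one user type and draws the expanded view
plot_expanded_view <- function(summary_tbl, user_type) {
  summary_tbl %>%
    filter(member_casual == !!user_type) %>%
    ggplot(aes(x = month, y = avg_rides_duration_seg / 60,
               group = name_dayofweek, colour = name_dayofweek,
               shape = name_dayofweek)) +
    geom_point() +
    scale_shape_manual(values = seq(0, 6)) +
    geom_line(linewidth = 1.2) +
    theme(legend.position = "bottom") +
    labs(title = paste("Expanded View Avg. Rides' Time by each Month in each Day of Week, by",
                       tools::toTitleCase(user_type), "User Type"),
         x = "Month", y = "Average Rides' Time (in min.)")
}

# Illustrative stand-in for Summary6columns4_ol (seconds)
toy_summary <- tibble(
  member_casual  = rep(c("casual", "member"), each = 4),
  month          = factor(rep(c("Oct_23", "Nov_23"), times = 4),
                          levels = c("Oct_23", "Nov_23"), ordered = TRUE),
  name_dayofweek = factor(rep(c("Mon", "Mon", "Tue", "Tue"), 2),
                          levels = c("Mon", "Tue"), ordered = TRUE),
  avg_rides_duration_seg = c(1225, 970, 1164, 861, 659, 643, 694, 621)
)

p_member <- plot_expanded_view(toy_summary, "member")
p_casual <- plot_expanded_view(toy_summary, "casual")
```

Printing `p_member` or `p_casual` displays the plot. The colour scale and bold text themes of the original plots are left out here to keep the sketch short.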

S———————————————————————————————————————S

SECTION: RIDES’ NUMBER – DAY of WEEK – RIDEABLE TYPE

C.-

CALCULATION AND GRAPHS OF DAY_OF WEEK (MONTH) VS RIDES’ NUMBER (OR AVERAGE RIDES’ TIME), PARAMETRIZED VIA THE 12 MONTHS OF THE YEAR (7 DAYS OF THE WEEK), BY RIDEABLE TYPE OR FACETING BY RIDEABLE TYPE

51.-

51.- C.1.- Create three tibbles: summary8columns6_ol, summary8columns5day_char and summary8columns5month_char

The tibble summary8columns6_ol is based on the “oneyeartrips_ok” dataset; summary8columns5day_char and summary8columns5month_char are based on the “oneyear_trips_addcolv2” dataset.

This section begins the graphical analysis where the “Y” axis shows the “Rides’ Number” or the “Average Ride_Duration” (“Avg. Rides’ Time”), and the “X” axis shows the names of the 12 months of the year, parametrized by the names of the 7 days of the week (or the reverse), for the three Rideable Types: “classic_bike”, “docked_bike” and “electric_bike” (by Rideable Type or faceting by Rideable Type).

These summary tibbles have 6 or 5 columns. The 6-column tibble is arranged in the following order: User Type (member_casual)-rideable_type-Month (month)-Day of Week (name_dayofweek)-rides_number-Average Rides’ Duration (avg_rides_duration_seg); the 5-column tibbles omit either the month or the day of the week. They are needed to generate the graphs of Month (or Day of Week) vs Rides’ Number (or Avg. Rides’ Time), via the 12 months of the year (or the 7 days of the week), by Rideable Type or faceting by Rideable Type, described in the following points.

Tibbles based on “oneyeartrips_ok” dataset and oneyear_trips_addcolv2 dataset.

Definition of: summary8columns6_ol, summary8columns5day_char and summary8columns5month_char =

summary8_usertypecasual-rideable_type-month-name_dayofweek-rides_number-avg_rides_duration_seg_ol

summary8columns6_ol <- oneyeartrips_ok %>%
   group_by(member_casual, rideable_type, month, name_dayofweek) %>%
   summarise(rides_number = n(), avg_rides_duration_seg = mean(ride_duration)) %>%
   arrange(member_casual,rideable_type, month, name_dayofweek)
## `summarise()` has grouped output by 'member_casual', 'rideable_type', 'month'.
## You can override using the `.groups` argument.
print(summary8columns6_ol)
## # A tibble: 352 × 6
## # Groups:   member_casual, rideable_type, month [52]
##    member_casual rideable_type month  name_dayofweek rides_number
##    <chr>         <chr>         <ord>  <ord>                 <int>
##  1 casual        classic_bike  Oct_23 Mon                   11017
##  2 casual        classic_bike  Oct_23 Tue                   12187
##  3 casual        classic_bike  Oct_23 Wed                    9902
##  4 casual        classic_bike  Oct_23 Thu                    8300
##  5 casual        classic_bike  Oct_23 Fri                    9143
##  6 casual        classic_bike  Oct_23 Sat                   12586
##  7 casual        classic_bike  Oct_23 Sun                   19550
##  8 casual        classic_bike  Nov_23 Mon                    4469
##  9 casual        classic_bike  Nov_23 Tue                    4133
## 10 casual        classic_bike  Nov_23 Wed                    5254
## # ℹ 342 more rows
## # ℹ 1 more variable: avg_rides_duration_seg <dbl>
#View(summary8columns6_ol)

summary8columns5day_char <- oneyear_trips_addcolv2 %>%
   #filter(member_casual == 'member') %>%
   group_by(member_casual, rideable_type, name_dayofweek) %>%
   summarise(rides_number = n(), avg_rides_duration_seg = mean(ride_duration)) %>%
   arrange(member_casual,rideable_type, name_dayofweek)
## `summarise()` has grouped output by 'member_casual', 'rideable_type'. You can
## override using the `.groups` argument.
print(summary8columns5day_char)
## # A tibble: 42 × 5
## # Groups:   member_casual, rideable_type [6]
##    member_casual rideable_type name_dayofweek rides_number
##    <chr>         <chr>         <chr>                 <int>
##  1 casual        classic_bike  Fri                  133667
##  2 casual        classic_bike  Mon                  112942
##  3 casual        classic_bike  Sat                  210811
##  4 casual        classic_bike  Sun                  181235
##  5 casual        classic_bike  Thu                  112217
##  6 casual        classic_bike  Tue                  100216
##  7 casual        classic_bike  Wed                  115705
##  8 casual        electric_bike Fri                   77527
##  9 casual        electric_bike Mon                   65050
## 10 casual        electric_bike Sat                   92154
## # ℹ 32 more rows
## # ℹ 1 more variable: avg_rides_duration_seg <dbl>
#View(summary8columns5day_char)

summary8columns5month_char <- oneyear_trips_addcolv2 %>%
   group_by(member_casual, rideable_type, month) %>%
   summarise(rides_number = n(), avg_rides_duration_seg = mean(ride_duration)) %>%
   arrange(member_casual,rideable_type, month)
## `summarise()` has grouped output by 'member_casual', 'rideable_type'. You can
## override using the `.groups` argument.
print(summary8columns5month_char)
## # A tibble: 52 × 5
## # Groups:   member_casual, rideable_type [6]
##    member_casual rideable_type month  rides_number avg_rides_duration_seg
##    <chr>         <chr>         <chr>         <int>                  <dbl>
##  1 casual        classic_bike  Apr_24        57421                  1847.
##  2 casual        classic_bike  Aug_24       148030                  1762.
##  3 casual        classic_bike  Dec_23        20280                  1296.
##  4 casual        classic_bike  Feb_24        27591                  1368.
##  5 casual        classic_bike  Jan_24        10328                  1204.
##  6 casual        classic_bike  Jul_24       159027                  1840.
##  7 casual        classic_bike  Jun_24       143499                  1837.
##  8 casual        classic_bike  Mar_24        39320                  1631.
##  9 casual        classic_bike  May_24       115974                  1879.
## 10 casual        classic_bike  Nov_23        42244                  1347.
## # ℹ 42 more rows
#View(summary8columns5month_char)

Definition of: summary8columns6_ol, summary8columns5day_char and summary8columns5month_char

52.-

52.- Create a tibble: summary8columns5month_ol, based on “oneyeartrips_ok” dataset

summary8columns5month_ol <- oneyeartrips_ok %>%
   #filter(member_casual == 'casual') %>%
   group_by(member_casual, rideable_type, month) %>%
   summarise(rides_number = n(), avg_rides_duration_seg = mean(ride_duration)) %>%
   arrange(member_casual,rideable_type, month)
## `summarise()` has grouped output by 'member_casual', 'rideable_type'. You can
## override using the `.groups` argument.
print(summary8columns5month_ol)
## # A tibble: 52 × 5
## # Groups:   member_casual, rideable_type [6]
##    member_casual rideable_type month  rides_number avg_rides_duration_seg
##    <chr>         <chr>         <ord>         <int>                  <dbl>
##  1 casual        classic_bike  Oct_23        82685                  1561.
##  2 casual        classic_bike  Nov_23        42244                  1347.
##  3 casual        classic_bike  Dec_23        20280                  1296.
##  4 casual        classic_bike  Jan_24        10328                  1204.
##  5 casual        classic_bike  Feb_24        27591                  1368.
##  6 casual        classic_bike  Mar_24        39320                  1631.
##  7 casual        classic_bike  Apr_24        57421                  1847.
##  8 casual        classic_bike  May_24       115974                  1879.
##  9 casual        classic_bike  Jun_24       143499                  1837.
## 10 casual        classic_bike  Jul_24       159027                  1840.
## # ℹ 42 more rows
#View(summary8columns5month_ol)
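
The output above has 52 rows, while 6 user-type and rideable-type groups over 12 months would give 72, so some combinations have no rides. The sketch below, on made-up counts, shows one way to list the absent combinations. `toy_counts`, `all_combos` and `missing_combos` are hypothetical names.

```r
library(dplyr)

# Hypothetical summary in which one combination is absent
toy_counts <- tibble(
  member_casual = c("casual", "casual", "member"),
  rideable_type = c("classic_bike", "electric_bike", "classic_bike"),
  month         = c("Oct_23", "Oct_23", "Oct_23"),
  rides_number  = c(10L, 5L, 20L)
)

# Every combination of the values that occur in the summary
all_combos <- expand.grid(
  member_casual = unique(toy_counts$member_casual),
  rideable_type = unique(toy_counts$rideable_type),
  month         = unique(toy_counts$month),
  stringsAsFactors = FALSE
)

# Combinations with no row in the summary
missing_combos <- anti_join(all_combos, toy_counts,
                            by = c("member_casual", "rideable_type", "month"))
print(missing_combos)
```

Knowing which combinations are absent helps when reading the faceted line plots that follow, where a missing month or day appears as a gap rather than as a zero.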

53.-

53.- Create three tibbles: summary8columns5day_ol, summary8columns5daycasual_ol and summary8columns5daymember_ol, based on oneyeartrips_ok dataset

summary8columns5day_ol <- oneyeartrips_ok %>%
   group_by(member_casual, rideable_type, name_dayofweek) %>%
   summarise(rides_number = n(), avg_rides_duration_seg = mean(ride_duration)) %>%
   arrange(member_casual,rideable_type, name_dayofweek)
## `summarise()` has grouped output by 'member_casual', 'rideable_type'. You can
## override using the `.groups` argument.
print(summary8columns5day_ol)
## # A tibble: 42 × 5
## # Groups:   member_casual, rideable_type [6]
##    member_casual rideable_type name_dayofweek rides_number
##    <chr>         <chr>         <ord>                 <int>
##  1 casual        classic_bike  Mon                  112942
##  2 casual        classic_bike  Tue                  100216
##  3 casual        classic_bike  Wed                  115705
##  4 casual        classic_bike  Thu                  112217
##  5 casual        classic_bike  Fri                  133667
##  6 casual        classic_bike  Sat                  210811
##  7 casual        classic_bike  Sun                  181235
##  8 casual        electric_bike Mon                   65050
##  9 casual        electric_bike Tue                   61154
## 10 casual        electric_bike Wed                   69263
## # ℹ 32 more rows
## # ℹ 1 more variable: avg_rides_duration_seg <dbl>
#View(summary8columns5day_ol)

summary8columns5daycasual_ol <- oneyeartrips_ok %>%
   filter(member_casual == 'casual') %>%
   group_by(member_casual, rideable_type, name_dayofweek) %>%
   summarise(rides_number = n(), avg_rides_duration_seg = mean(ride_duration)) %>%
   arrange(member_casual,rideable_type, name_dayofweek)
## `summarise()` has grouped output by 'member_casual', 'rideable_type'. You can
## override using the `.groups` argument.
print(summary8columns5daycasual_ol)
## # A tibble: 21 × 5
## # Groups:   member_casual, rideable_type [3]
##    member_casual rideable_type name_dayofweek rides_number
##    <chr>         <chr>         <ord>                 <int>
##  1 casual        classic_bike  Mon                  112942
##  2 casual        classic_bike  Tue                  100216
##  3 casual        classic_bike  Wed                  115705
##  4 casual        classic_bike  Thu                  112217
##  5 casual        classic_bike  Fri                  133667
##  6 casual        classic_bike  Sat                  210811
##  7 casual        classic_bike  Sun                  181235
##  8 casual        electric_bike Mon                   65050
##  9 casual        electric_bike Tue                   61154
## 10 casual        electric_bike Wed                   69263
## # ℹ 11 more rows
## # ℹ 1 more variable: avg_rides_duration_seg <dbl>
#View(summary8columns5daycasual_ol)

summary8columns5daymember_ol <- oneyeartrips_ok %>%
   filter(member_casual == 'member') %>%
   group_by(member_casual, rideable_type, name_dayofweek) %>%
   summarise(rides_number = n(), avg_rides_duration_seg = mean(ride_duration)) %>%
   arrange(member_casual,rideable_type, name_dayofweek)
## `summarise()` has grouped output by 'member_casual', 'rideable_type'. You can
## override using the `.groups` argument.
print(summary8columns5daymember_ol)
## # A tibble: 21 × 5
## # Groups:   member_casual, rideable_type [3]
##    member_casual rideable_type name_dayofweek rides_number
##    <chr>         <chr>         <ord>                 <int>
##  1 member        classic_bike  Mon                  270694
##  2 member        classic_bike  Tue                  281868
##  3 member        classic_bike  Wed                  298813
##  4 member        classic_bike  Thu                  282238
##  5 member        classic_bike  Fri                  244981
##  6 member        classic_bike  Sat                  227747
##  7 member        classic_bike  Sun                  209724
##  8 member        electric_bike Mon                  133194
##  9 member        electric_bike Tue                  140919
## 10 member        electric_bike Wed                  150581
## # ℹ 11 more rows
## # ℹ 1 more variable: avg_rides_duration_seg <dbl>
#View(summary8columns5daymember_ol)

53.1.-

53.1.- C.1.1.- Plot: “Rides’ Number, by each Day of Week by Rideable Type”

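summary8columns5day_ol %>%
   # Sum member and casual counts so that each bar covers all users
   group_by(rideable_type, name_dayofweek) %>%
   summarise(rides_number = sum(rides_number), .groups = "drop") %>%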
   ggplot(aes(x = name_dayofweek, y = rides_number/1000, fill = rideable_type)) +
   geom_col(position = "dodge") +
   theme(legend.position="bottom") +
   theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
   theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
   labs(title = 'Rides’ Number, by each Day of Week by Rideable Type') +
   theme(plot.title = element_text(size = 11, face="bold")) +
   xlab('Day of Week') +
   ylab('Rides’ Number (in thousands)') +    
   theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
   theme(axis.title.x = element_text(face = "bold")) +
   theme(axis.text.y = element_text(face = "bold")) +
   theme(axis.title.y = element_text(face = "bold"))

#ggsave("Rides’ Number, by each Day of Week by Rideable Type.png")

54.-

54.-C.1.1m.- Plot: “Rides’ Number, Only Member, by each Day of Week by Rideable Type”

summary8columns5daymember_ol %>%
   filter(member_casual == 'member') %>% 
   ggplot(aes(x = name_dayofweek, y = rides_number/1000, fill = rideable_type)) +
   geom_col(position = "dodge") +
   theme(legend.position="bottom") +
   theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
   theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
   labs(title = 'Rides’ Number, Only Member, by each Day of Week by Rideable Type') +
   theme(plot.title = element_text(size = 11, face="bold")) +
   xlab('Day of Week') +
   ylab('Rides’ Number (in thousands)') +    
   theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
   theme(axis.title.x = element_text(face = "bold")) +
   theme(axis.text.y = element_text(face = "bold")) +
   theme(axis.title.y = element_text(face = "bold"))

#ggsave("Rides’ Number, Only Member, by each Day of Week by Rideable Type.png")

55.-

55.-C.1.1c.- Plot: “Rides’ Number, Only Casual, by each Day of Week by Rideable Type”

summary8columns5daycasual_ol %>%
   filter(member_casual == 'casual') %>% 
   ggplot(aes(x = name_dayofweek, y = rides_number/1000, fill = rideable_type)) +
   geom_col(position = "dodge") +
   theme(legend.position="bottom") +
   theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
   theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
   labs(title = 'Rides’ Number, Only Casual, by each Day of Week by Rideable Type') +
   theme(plot.title = element_text(size = 11, face="bold")) +
   xlab('Day of Week') +
   ylab('Rides’ Number (in thousands)') +    
   theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
   theme(axis.title.x = element_text(face = "bold")) +
   theme(axis.text.y = element_text(face = "bold")) +
   theme(axis.title.y = element_text(face = "bold"))

#ggsave("Rides’ Number,Only Casual, by each Day of Week by Rideable Type.png")

56.-

56.-C.2.1m.- Plot: “Rides’ Number, Only Member, by each Day of Week in each Month, Faceting by Rideable Type”

summary8columns6_ol %>%
   filter(member_casual == 'member') %>% 
   ggplot(aes(x = name_dayofweek, y = rides_number/1000, group = month, colour = month, shape = month)) +
   geom_point() +
   scale_shape_manual(values=seq(0,11)) +
   scale_color_manual(values=c("#FF66FF","#FF0000","#0000CC","#00994C","#9933FF","#FF3399","#A0A0A0","#000000","#99FFFF","#66FF66","#FFFF00","#FFCC99")) +
   geom_line(linewidth = 1.2) +
   facet_grid(rideable_type ~ .) +
   theme(legend.position="bottom") +
   theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
   theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
   labs(title = 'Rides’ Number, Only Member, by each Day of Week in each Month, Faceting by Rideable Type') +
   theme(plot.title = element_text(size = 11, face = "bold")) +
   xlab('Day of Week') +
   ylab('Rides’ Number (in thousands)') +    
   theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
   theme(axis.title.x = element_text(face = "bold")) +
   theme(axis.text.y = element_text(face = "bold")) +
   theme(axis.title.y = element_text(face = "bold"))

#ggsave("Rides’ Number,Only Member, by each Day of Week in each Month, Faceting by Rideable Type.png")

57.-

57.-C.2.1c.- Plot: “Rides’ Number, Only Casual, by each Day of Week in each Month, Faceting by Rideable Type”

summary8columns6_ol %>%
   filter(member_casual == 'casual') %>% 
   ggplot(aes(x = name_dayofweek, y = rides_number/1000, group = month, colour = month, shape = month)) +
   geom_point() +
   scale_shape_manual(values=seq(0,11)) +
   scale_color_manual(values=c("#FF66FF","#FF0000","#0000CC","#00994C","#9933FF","#FF3399","#A0A0A0","#000000","#99FFFF","#66FF66","#FFFF00","#FFCC99")) +
   geom_line(linewidth = 1.2) +
   facet_grid(rideable_type ~ .) +
   theme(legend.position="bottom") +
   theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
   theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
   labs(title = 'Rides’ Number, Only Casual, by each Day of Week in each Month, Faceting by Rideable Type') +
   theme(plot.title = element_text(size = 11, face = "bold")) +
   xlab('Day of Week') +
   ylab('Rides’ Number (in thousands)') +    
   theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
   theme(axis.title.x = element_text(face = "bold")) +
   theme(axis.text.y = element_text(face = "bold")) +
   theme(axis.title.y = element_text(face = "bold"))

#ggsave("Rides’ Number, Only Casual, by each Day of Week in each Month, Faceting by Rideable Type.png")

58.-

58.-C.3.1.- Plot: “Descending Rides’ Number by Day of Week, faceting Rideable Type”

summary8columns5day_char %>%
  ggplot() + 
  geom_col(aes(reorder_within(name_dayofweek, rides_number, member_casual), rides_number/1000, fill = rideable_type), position = "dodge") +
  facet_wrap(~rideable_type, scales = "free_y") + 
  theme(legend.position = "bottom") +
  theme(legend.title = element_text(colour = "black", size =11, face ="bold")) +
  theme(legend.text = element_text(colour = "black", size =10, face ="bold")) +
  coord_flip() +
  labs(title = 'Descending Rides’ Number by each Day of Week, Faceting by Rideable Type') +
  theme(plot.title = element_text(size = 11,face="bold")) +    
  xlab('Name of Days of Week') +
  ylab('Descending Rides’ Number (in thousands)') +    
  theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
  theme(axis.title.x = element_text(face = "bold")) +
  theme(axis.text.y = element_text(face = "bold")) +
  theme(axis.title.y = element_text(face = "bold"))

#ggsave("Descending Rides’ Number by each Day of Week, Faceting by Rideable Type.png")

59.-

59.-C.3.1m.- Plot: “Descending Rides’ Number, Only Member, by Day of Week, faceting Rideable Type”

summary8columns5day_char %>%
  filter(member_casual == 'member') %>%
  ggplot() + 
  # Order the days within each facet (rideable_type) so every panel is sorted on its own
  geom_col(aes(reorder_within(name_dayofweek, rides_number, rideable_type), rides_number/1000, fill = rideable_type), position = "dodge") +
  facet_wrap(~rideable_type, scales = "free_y") + 
  theme(legend.position = "bottom") +
  theme(legend.title = element_text(colour = "black", size =11, face ="bold")) +
  theme(legend.text = element_text(colour = "black", size =10, face ="bold")) +
  coord_flip() +
  labs(title = 'Descending Rides’ Number, Only Member, by each Day of Week, Faceting by Rideable Type') +
  theme(plot.title = element_text(size = 11,face="bold")) +    
  xlab('Name of Days of Week') +
  ylab('Descending Rides’ Number (in thousands)') +    
  theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
  theme(axis.title.x = element_text(face = "bold")) +
  theme(axis.text.y = element_text(face = "bold")) +
  theme(axis.title.y = element_text(face = "bold"))

#ggsave("Descending Rides’ Number, Only Member, by each Day of Week, Faceting by Rideable Type.png")

60.-

60.-C.3.1c.- Plot: “Descending Rides’ Number, Only Casual, by Day of Week, faceting Rideable Type”

summary8columns5day_char %>%
  filter(member_casual == 'casual') %>%
  ggplot() + 
  # Order the days within each facet (rideable_type) so every panel is sorted on its own
  geom_col(aes(reorder_within(name_dayofweek, rides_number, rideable_type), rides_number/1000, fill = rideable_type), position = "dodge") +
  facet_wrap(~rideable_type, scales = "free_y") + 
  theme(legend.position = "bottom") +
  theme(legend.title = element_text(colour = "black", size =11, face ="bold")) +
  theme(legend.text = element_text(colour = "black", size =10, face ="bold")) +
  coord_flip() +
  labs(title = 'Descending Rides’ Number, Only Casual, by each Day of Week, Faceting by Rideable Type') +
  theme(plot.title = element_text(size = 11,face="bold")) +    
  xlab('Name of Days of Week') +
  ylab('Descending Rides’ Number (in thousands)') +    
  theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
  theme(axis.title.x = element_text(face = "bold")) +
  theme(axis.text.y = element_text(face = "bold")) +
  theme(axis.title.y = element_text(face = "bold"))

#ggsave("Descending Rides’ Number, Only Casual, by each Day of Week, Faceting by Rideable Type.png")
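
The “Descending” plots above depend on reorder_within(), which appears to come from the tidytext package. As a rough sketch of the idea only, and not the package's actual source, the ordering can be imitated in base R: a key is built that is unique per day and facet, the key is ordered by the value, and the suffix is stripped again for labelling. As far as we understand, that last step is the one scale_x_reordered() performs in tidytext. The toy numbers and the names key, day_in_type and strip_suffix are made up for illustration.

```r
# Toy summary: rides per day for two rideable types (made-up numbers).
toy <- data.frame(
  name_dayofweek = rep(c("Monday", "Saturday", "Sunday"), times = 2),
  rideable_type  = rep(c("classic_bike", "electric_bike"), each = 3),
  rides_number   = c(30, 50, 40, 60, 20, 45)
)

# Step 1: build a key that is unique per (day, rideable type) and order its
# levels by rides_number. Each facet then gets its own ordering.
key <- paste(toy$name_dayofweek, toy$rideable_type, sep = "___")
toy$day_in_type <- reorder(key, toy$rides_number)

# Step 2: strip the "___<rideable type>" suffix again when labelling the axis.
strip_suffix <- function(x) sub("___.+$", "", x)

levels(toy$day_in_type)                 # keys, in ascending order of rides_number
strip_suffix(levels(toy$day_in_type))   # the same keys, shown as plain day names
```

If the axis labels in the plots show the “___” suffix, adding scale_x_reordered() after coord_flip() is the usual way to remove it.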

S———————————————————————————————————————S

SECTION: RIDES’ NUMBER – MONTH – RIDEABLE TYPE

61.-

61.-C.4.1m.- Plot: “Rides’ Number, Only Member, by each Month by Rideable Type”

summary8columns5month_ol %>%
   filter(member_casual == 'member') %>% 
   ggplot(aes(x = month, y = rides_number/1000, fill = rideable_type)) +
   geom_col(position = "dodge") +
   theme(legend.position="bottom") +
   theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
   theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
   labs(title = 'Rides’ Number, Only Member, by each Month by Rideable Type') +
   theme(plot.title = element_text(size = 11, face="bold")) +
   xlab('Month') +
   ylab('Rides’ Number (in thousands)') +    
   theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
   theme(axis.title.x = element_text(face = "bold")) +
   theme(axis.text.y = element_text(face = "bold")) +
   theme(axis.title.y = element_text(face = "bold"))

#ggsave("Rides’ Number,Only Member, by each Month by Rideable Type.png")

62.-

62.-C.4.1c.- Plot: “Rides’ Number, Only Casual, by each Month by Rideable Type”

summary8columns5month_ol %>%
   filter(member_casual == 'casual') %>% 
   ggplot(aes(x = month, y = rides_number/1000, fill = rideable_type)) +
   geom_col(position = "dodge") +
   theme(legend.position="bottom") +
   theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
   theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
   labs(title = 'Rides’ Number, Only Casual, by each Month by Rideable Type') +
   theme(plot.title = element_text(size = 11, face="bold")) +
   xlab('Month') +
   ylab('Rides’ Number (in thousands)') +    
   theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
   theme(axis.title.x = element_text(face = "bold")) +
   theme(axis.text.y = element_text(face = "bold")) +
   theme(axis.title.y = element_text(face = "bold"))

#ggsave("Rides’ Number, Only Casual, by each Month by Rideable Type.png")

63.-

63.-C.5.1m.- Plot: “Descending Rides’ Number, Only Member, by Month, Faceting by Rideable Type”

summary8columns5month_char %>%
  filter(member_casual == 'member') %>%
  ggplot() + 
  geom_col(aes(reorder_within(month, rides_number, rideable_type), rides_number/1000, fill = rideable_type), position = "dodge") +
  facet_wrap(~rideable_type, scales = "free_y") + 
  theme(legend.position = "bottom") +
  theme(legend.title = element_text(colour = "black", size =11, face ="bold")) +
  theme(legend.text = element_text(colour = "black", size =10, face ="bold")) +
  coord_flip() +
  #scale_x_reordered() +
  labs(title = 'Descending Rides’ Number, Only Member, by Month, Faceting by Rideable Type') +
  theme(plot.title = element_text(size = 11,face="bold")) +
  xlab('Month') +
  ylab('Descending Rides’ Number (in thousands)') +    
  theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
  theme(axis.title.x = element_text(face = "bold")) +
  theme(axis.text.y = element_text(face = "bold")) +
  theme(axis.title.y = element_text(face = "bold"))

#ggsave("Descending Rides’ Number,Only Member, by each Month, Faceting By Rideable Type.png")

64.-

64.-C.5.1c.- Plot: “Descending Rides’ Number, Only Casual, by each Month, Faceting by Rideable Type”

summary8columns5month_char %>%
  filter(member_casual == 'casual') %>%
  ggplot() + 
  geom_col(aes(reorder_within(month, rides_number, rideable_type), rides_number/1000, fill = rideable_type), position = "dodge") +
  facet_wrap(~rideable_type, scales = "free_y") + 
  theme(legend.position = "bottom") +
  theme(legend.title = element_text(colour = "black", size =11, face ="bold")) +
  theme(legend.text = element_text(colour = "black", size =10, face ="bold")) +
  coord_flip() +
  labs(title = 'Descending Rides’ Number, Only Casual, by Month, Faceting by Rideable Type') +
  theme(plot.title = element_text(size = 11,face="bold")) +    
  xlab('Month') +
  ylab('Descending Rides’ Number (in thousands)') +    
  theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
  theme(axis.title.x = element_text(face = "bold")) +
  theme(axis.text.y = element_text(face = "bold")) +
  theme(axis.title.y = element_text(face = "bold"))

#ggsave("Descending Rides’ Number,Only Casual, by each Month, Faceting by Rideable Type.png")

65.-

65.- C.6.2m.- Plot: “Rides’ Number, Only Member, by each Month in each Day of Week, Faceting by Rideable”

“Rides’ Number, Only Member, by each Month in each Day of Week (parametrized by the 7 Days of Week), Faceting by Rideable Type” Plot

summary8columns6_ol %>%
   filter(member_casual == 'member') %>% 
   ggplot(aes(x = month, y = rides_number/1000, group = name_dayofweek, colour = name_dayofweek, shape = name_dayofweek)) +
   geom_point() +
   scale_shape_manual(values=seq(0,6)) +
   scale_color_manual(values=c("#FF66FF","#FF0000","#0000CC","#00994C","#9933FF","#000000","#99FFFF")) +
   geom_line(linewidth = 1.2) +
   facet_grid(rideable_type ~ .) +
   theme(legend.position="bottom") +
   theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
   theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
   labs(title = 'Rides’ Number, Only Member, by each Month in each Day of Week, Faceting by Rideable') +
   theme(plot.title = element_text(size = 11, face = "bold")) +
   xlab('Month') +
   ylab('Rides’ Number (in thousands)') +    
   theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
   theme(axis.title.x = element_text(face = "bold")) +
   theme(axis.text.y = element_text(face = "bold")) +
   theme(axis.title.y = element_text(face = "bold"))

#ggsave("Rides’ Number, Only Member, by each Month in each Day of Week, Faceting by Rideable.png")

66.-

66.- C.6.2c.- Plot: “Rides’ Number, Only Casual, by each Month in each Day of Week, Faceting by Rideable”

“Rides’ Number, Only Casual, by each Month in each Day of Week (parametrized by the 7 Days of Week), Faceting by Rideable Type” Plot

summary8columns6_ol %>%
   filter(member_casual == 'casual') %>% 
   ggplot(aes(x = month, y = rides_number/1000, group = name_dayofweek, colour = name_dayofweek, shape = name_dayofweek)) +
   geom_point() +
   scale_shape_manual(values=seq(0,6)) +
   scale_color_manual(values=c("#FF66FF","#FF0000","#0000CC","#00994C","#9933FF","#000000","#99FFFF")) +
   geom_line(linewidth = 1.2) +
   facet_grid(rideable_type ~ .) +
   theme(legend.position="bottom") +
   theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
   theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
   labs(title = 'Rides’ Number, Only Casual, by each Month in each Day of Week, Faceting by Rideable') +
   theme(plot.title = element_text(size = 11, face = "bold")) +
   xlab('Month') +
   ylab('Rides’ Number (in thousands)') +    
   theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
   theme(axis.title.x = element_text(face = "bold")) +
   theme(axis.text.y = element_text(face = "bold")) +
   theme(axis.title.y = element_text(face = "bold"))

#ggsave("Rides’ Number, Only Casual, by each Month in each Day of Week, Faceting by Rideable.png")

S———————————————————————————————————————S

SECTION: AVERAGE (AVG) RIDES’ TIME – MONTH – RIDEABLE TYPE

70.-

70.- C.7.1.- Plot: “Average(Avg) Rides’ Time by each Month by Rideable Type”

This code block plots the variable “month” on the X axis, ordered from “Oct_23” to “Sep_24”, and the corresponding “avg_rides_duration_seg” values (converted to minutes) on the Y axis, filled by Rideable Type.
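
The “Oct_23” to “Sep_24” ordering on the X axis depends on “month” being a factor whose levels follow the calendar rather than the alphabet. The following is a minimal base R sketch of that idea. It assumes the month labels have the “Mon_yy” form mentioned above; the toy values and the name month_levels are made up for illustration and are not taken from the original data preparation.

```r
# Assumed calendar order of the twelve month labels (Oct 2023 to Sep 2024).
month_levels <- c("Oct_23", "Nov_23", "Dec_23", "Jan_24", "Feb_24", "Mar_24",
                  "Apr_24", "May_24", "Jun_24", "Jul_24", "Aug_24", "Sep_24")

# Toy data in arbitrary row order (values are made up).
toy <- data.frame(
  month = c("Jan_24", "Oct_23", "Sep_24", "Dec_23"),
  avg_rides_duration_seg = c(600, 900, 1200, 660)
)

# As plain character strings the months sort alphabetically,
# which is also the order a discrete axis would use by default.
alphabetical <- sort(unique(toy$month))

# As a factor with explicit levels they follow the calendar order instead.
toy$month <- factor(toy$month, levels = month_levels)
calendar <- levels(droplevels(toy$month))

alphabetical  # "Dec_23" "Jan_24" "Oct_23" "Sep_24"
calendar      # "Oct_23" "Dec_23" "Jan_24" "Sep_24"
```

ggplot2 places discrete axis values in factor-level order, so a column prepared this way yields the ordered X axis without any further scale settings.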

Average Rides’ Time by each Month by Rideable Type

summary8columns6_ol %>%
   ggplot(aes(x = month, y = avg_rides_duration_seg /60, fill = rideable_type)) +
   geom_col(position = "dodge") +
   theme(legend.position="bottom") +
   theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
   theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
   labs(title = 'Average Rides’ Time by each Month by Rideable Type') +
   theme(plot.title = element_text(size = 11, face="bold")) +
   xlab('Month') +
   ylab('Average Rides’ Time (in min.)') +    
   theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
   theme(axis.title.x = element_text(face = "bold")) +
   theme(axis.text.y = element_text(face = "bold")) +
   theme(axis.title.y = element_text(face = "bold"))

#ggsave("Average Rides’ Time by each Month by Rideable Type.png")

71.-

71.-C.7.1m.- Plot: “Avg Rides’ Time, Only Member, by each Month, by Rideable Type”

summary8columns6_ol %>%
  filter(member_casual == 'member') %>%
   ggplot(aes(x = month, y = avg_rides_duration_seg /60, fill = rideable_type)) +
   geom_col(position = "dodge") +
   theme(legend.position="bottom") +
   theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
   theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
   labs(title = 'Avg Rides’ Time, Only Member, by each Month, by Rideable Type') +
   theme(plot.title = element_text(size = 11, face="bold")) +
   xlab('Month') +
   ylab('Average Rides’ Time (in min.)') +    
   theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
   theme(axis.title.x = element_text(face = "bold")) +
   theme(axis.text.y = element_text(face = "bold")) +
   theme(axis.title.y = element_text(face = "bold"))

#ggsave("Average Rides’ Time, Only Member, by each Month by Rideable Type.png")

72.-

72.-C.7.1c.- Plot: “Avg Rides’ Time, Only Casual, by each Month, by Rideable Type”

summary8columns6_ol %>%
  filter(member_casual == 'casual') %>%
   ggplot(aes(x = month, y = avg_rides_duration_seg /60, fill = rideable_type)) +
   geom_col(position = "dodge") +
   theme(legend.position="bottom") +
   theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
   theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
   labs(title = 'Avg Rides’ Time, Only Casual, by each Month, by Rideable Type') +
   theme(plot.title = element_text(size = 11, face="bold")) +
   xlab('Month') +
   ylab('Average Rides’ Time (in min.)') +    
   theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
   theme(axis.title.x = element_text(face = "bold")) +
   theme(axis.text.y = element_text(face = "bold")) +
   theme(axis.title.y = element_text(face = "bold"))

#ggsave("Average Rides’ Time, Only Casual, by each Month by Rideable Type.png")

73.-

73.-C.8.1m.- Plot: “Descending Avg Rides’ Time, Only Member, by Month, Faceting by Rideable Type”

summary8columns5month_char %>%
  filter(member_casual == 'member') %>%
  ggplot() + 
  geom_col(aes(reorder_within(month, avg_rides_duration_seg, rideable_type), avg_rides_duration_seg/60, fill = rideable_type), position = "dodge") +
  facet_wrap(~rideable_type, scales = "free_y") + 
  theme(legend.position = "bottom") +
  theme(legend.title = element_text(colour = "black", size =11, face ="bold")) +
  theme(legend.text = element_text(colour = "black", size =10, face ="bold")) +
  coord_flip() +
  labs(title = 'Descending Avg Rides’ Time, Only Member, by Month, Faceting by Rideable Type') +
  theme(plot.title = element_text(size = 11,face="bold")) +    
  xlab('Month') +
  ylab('Descending Average Rides’ Time (in min.)') +
  theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
  theme(axis.title.x = element_text(face = "bold")) +
  theme(axis.text.y = element_text(face = "bold")) +
  theme(axis.title.y = element_text(face = "bold"))

#ggsave("Descending Avg Rides’ Time, Only Member, by each Month, Faceting by Rideable Type.png")

74.-

74.-C.8.1c.- Plot: “Descending Avg Rides’ Time, Only Casual, by Month, Faceting by Rideable Type”

Descending Avg Rides’ Time, Only Casual, by each Month, Faceting by Rideable Type

summary8columns5month_char %>%
  filter(member_casual == 'casual') %>%
  ggplot() + 
  geom_col(aes(reorder_within(month, avg_rides_duration_seg, rideable_type), avg_rides_duration_seg/60, fill = rideable_type), position = "dodge") +
  facet_wrap(~rideable_type, scales = "free_y") + 
  theme(legend.position = "bottom") +
  theme(legend.title = element_text(colour = "black", size =11, face ="bold")) +
  theme(legend.text = element_text(colour = "black", size =10, face ="bold")) +
  coord_flip() +
  labs(title = 'Descending Avg Rides’ Time, Only Casual, by Month, Faceting by Rideable Type') +
  theme(plot.title = element_text(size = 11,face="bold")) +    
  xlab('Month') +
  ylab('Descending Average Rides’ Time (in min.)') +
  theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
  theme(axis.title.x = element_text(face = "bold")) +
  theme(axis.text.y = element_text(face = "bold")) +
  theme(axis.title.y = element_text(face = "bold"))

#ggsave("Descending Avg Rides’ Time, Only Casual, by each Month, Faceting by Rideable Type.png")

75.-

75.- C.9.1m.- Plot: “Avg. Rides’ Time, Only Member, by each Month in each Day of Week, Faceting by Rideable”

“Avg. Rides’ Time, Only Member, by each Month in each Day of Week (parametrized by the 7 Days of Week), Faceting by Rideable Type” Plot

summary8columns6_ol %>%
   filter(member_casual == 'member') %>% 
   ggplot(aes(x = month, y = avg_rides_duration_seg/60, group = name_dayofweek, colour = name_dayofweek, shape = name_dayofweek)) +
   geom_point() +
   scale_shape_manual(values=seq(0,6)) +
   scale_color_manual(values=c("#FF66FF","#FF0000","#0000CC","#00994C","#9933FF","#000000","#99FFFF")) +
   geom_line(linewidth = 1.2) +
   facet_grid(rideable_type ~ .) +
   theme(legend.position="bottom") +
   theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
   theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
   labs(title = 'Avg. Rides’ Time, Only Member, by each Month in each Day of Week, Faceting by Rideable Type') +
   theme(plot.title = element_text(size = 11, face = "bold")) +
   xlab('Month') +
   ylab('Average Rides’ Time (in min.)') +    
   theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
   theme(axis.title.x = element_text(face = "bold")) +
   theme(axis.text.y = element_text(face = "bold")) +
   theme(axis.title.y = element_text(face = "bold"))

#ggsave("Avg. Rides’ Time,Only Member, by each Month in each Day of Week, Faceting by Rideable Type.png")

76.-

76.- C.9.2c.- Plot: “Avg. Rides’ Time, Only Casual, by each Month in each Day of Week, Faceting by Rideable Type”

“Avg. Rides’ Time, Only Casual, by each Month in each Day of Week (parametrized by 7 Days of Week), Faceting by Rideable Type” Plot

summary8columns6_ol %>%
   filter(member_casual == 'casual') %>%
   ggplot(aes(x = month, y = avg_rides_duration_seg/60, group = name_dayofweek, colour = name_dayofweek, shape = name_dayofweek)) +
   geom_point() +
   scale_shape_manual(values=seq(0,6)) +
   scale_color_manual(values=c("#FF66FF","#FF0000","#0000CC","#00994C","#9933FF","#000000","#99FFFF")) +
   geom_line(linewidth = 1.2) +
   facet_grid(rideable_type ~ .) +
   theme(legend.position="bottom") +
   theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
   theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
   labs(title = 'Avg. Rides’ Time, Only Casual, by each Month in each Day of Week, Faceting by Rideable Type') +
   theme(plot.title = element_text(size = 11, face = "bold")) +
   xlab('Month') +
   ylab('Average Rides’ Time (in min.)') +    
   theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
   theme(axis.title.x = element_text(face = "bold")) +
   theme(axis.text.y = element_text(face = "bold")) +
   theme(axis.title.y = element_text(face = "bold"))

#ggsave("Avg. Rides’ Time, Only Casual, by each Month in each Day of Week, Faceting by Rideable Type.png")

S———————————————————————————————————————S

SECTION: AVERAGE(AVG) RIDES’ TIME – DAY OF WEEK – RIDEABLE TYPE

80.-

80.- C.1.1.- Plot: “Average Rides’ Time by each Day of Week by Rideable Type”

This code block plots the Day of Week variable (“name_dayofweek”) on the X axis, ordered from “Monday” to “Sunday”, and the corresponding “avg_rides_duration_seg” values (converted to minutes) on the Y axis, filled by Rideable Type.

Average Rides’ Time by each Day of Week by Rideable Type

summary8columns6_ol %>%
   ggplot(aes(x = name_dayofweek, y = avg_rides_duration_seg /60, fill = rideable_type)) +
   geom_col(position = "dodge") +
   theme(legend.position="bottom") +
   theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
   theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
   labs(title = 'Average Rides’ Time by each Day of Week by Rideable Type') +
   theme(plot.title = element_text(size = 11, face="bold")) +
   xlab('Day of Week') +
   ylab('Average Rides’ Time (in min.)') +    
   theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
   theme(axis.title.x = element_text(face = "bold")) +
   theme(axis.text.y = element_text(face = "bold")) +
   theme(axis.title.y = element_text(face = "bold"))

#ggsave("Average Rides’ Time by each Day of Week by Rideable Type.png")
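
One point to check for the bar plot above: geom_col() draws one bar per input row. If summary8columns6_ol holds one row per month, day of week, user type and rideable type (as its use in the line plots suggests, although this is an assumption), bars for the same day and rideable type may be drawn on top of each other. The following is a minimal base R sketch of collapsing such a table to one row per day and rideable type first, weighting each average by its number of rides. The toy values and the names total_seg and per_day are made up for illustration.

```r
# Toy stand-in for a table with several rows per day and rideable type
# (one per month); all numbers are made up.
toy <- data.frame(
  name_dayofweek = c("Monday", "Monday", "Sunday", "Sunday"),
  rideable_type  = "classic_bike",
  month          = c("Oct_23", "Nov_23", "Oct_23", "Nov_23"),
  rides_number   = c(100, 300, 200, 200),
  avg_rides_duration_seg = c(600, 1000, 900, 1500)
)

# Total riding time per row, so that averages can be combined correctly.
toy$total_seg <- toy$rides_number * toy$avg_rides_duration_seg

# Sum time and rides per (day, rideable type), then divide:
# this is a rides-weighted mean, not a mean of means.
per_day <- aggregate(cbind(total_seg, rides_number) ~ name_dayofweek + rideable_type,
                     data = toy, FUN = sum)
per_day$avg_rides_duration_seg <- per_day$total_seg / per_day$rides_number

per_day
```

In this toy case Monday averages 900 seconds, whereas a plain mean of the two Monday rows would give 800. If the table already has exactly one row per bar, this step is unnecessary.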

81.-

81.-C.1.1m.- Plot: “Avg Rides’ Time, Only Member, by each Day of Week, by Rideable Type”

summary8columns6_ol %>%
  filter(member_casual == 'member') %>%
   ggplot(aes(x = name_dayofweek, y = avg_rides_duration_seg /60, fill = rideable_type)) +
   geom_col(position = "dodge") +
   theme(legend.position="bottom") +
   theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
   theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
   labs(title = 'Avg Rides’ Time, Only Member, by each Day of Week, by Rideable Type') +
   theme(plot.title = element_text(size = 11, face="bold")) +
   xlab('Day of Week') +
   ylab('Average Rides’ Time (in min.)') +    
   theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
   theme(axis.title.x = element_text(face = "bold")) +
   theme(axis.text.y = element_text(face = "bold")) +
   theme(axis.title.y = element_text(face = "bold"))

#ggsave("Average Rides’ Time, Only Member, by each Day of Week by Rideable Type.png")

82.-

82.-C.1.1c.- Plot: “Avg Rides’ Time, Only Casual, by each Day of Week, by Rideable Type”

summary8columns6_ol %>%
  filter(member_casual == 'casual') %>%
   ggplot(aes(x = name_dayofweek, y = avg_rides_duration_seg /60, fill = rideable_type)) +
   geom_col(position = "dodge") +
   theme(legend.position="bottom") +
   theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
   theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
   labs(title = 'Avg Rides’ Time, Only Casual, by each Day of Week, by Rideable Type') +
   theme(plot.title = element_text(size = 11, face="bold")) +
   xlab('Day of Week') +
   ylab('Average Rides’ Time (in min.)') +    
   theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
   theme(axis.title.x = element_text(face = "bold")) +
   theme(axis.text.y = element_text(face = "bold")) +
   theme(axis.title.y = element_text(face = "bold"))

#ggsave("Average Rides’ Time, Only Casual, by each Day of Week by Rideable Type.png")

83.-

83.-C.2.1m.- Plot: “Descending Avg Rides’ Time, Only Member, by Day of Week, Faceting by Rideable Type”

summary8columns5day_char %>%
  filter(member_casual == 'member') %>%
  ggplot() + 
  geom_col(aes(reorder_within(name_dayofweek, avg_rides_duration_seg, rideable_type), avg_rides_duration_seg/60, fill = rideable_type), position = "dodge") +
  facet_wrap(~rideable_type, scales = "free_y") + 
  theme(legend.position = "bottom") +
  theme(legend.title = element_text(colour = "black", size =11, face ="bold")) +
  theme(legend.text = element_text(colour = "black", size =10, face ="bold")) +
  coord_flip() +
  labs(title = 'Descending Avg Rides’ Time, Only Member, by Day of Week, Faceting by Rideable Type') +
  theme(plot.title = element_text(size = 11,face="bold")) +    
  xlab('Day of Week') +
  ylab('Descending Average Rides’ Time (in min.)') +
  theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
  theme(axis.title.x = element_text(face = "bold")) +
  theme(axis.text.y = element_text(face = "bold")) +
  theme(axis.title.y = element_text(face = "bold"))

#ggsave("Descending Avg Rides’ Time, Only Member, by each Day of Week, Faceting by Rideable Type.png")

84.-

84.-C.2.1c.- Plot: “Descending Avg Rides’ Time, Only Casual, by Day of Week, Faceting by Rideable Type”

Descending Avg Rides’ Time, Only Casual, by each Day of Week, Faceting by Rideable Type

summary8columns5day_char %>%
  filter(member_casual == 'casual') %>%
  ggplot() + 
  geom_col(aes(reorder_within(name_dayofweek, avg_rides_duration_seg, rideable_type), avg_rides_duration_seg/60, fill = rideable_type), position = "dodge") +
  facet_wrap(~rideable_type, scales = "free_y") + 
  theme(legend.position = "bottom") +
  theme(legend.title = element_text(colour = "black", size =11, face ="bold")) +
  theme(legend.text = element_text(colour = "black", size =10, face ="bold")) +
  coord_flip() +
  labs(title = 'Descending Avg Rides’ Time, Only Casual, by Day of Week, Faceting by Rideable Type') +
  theme(plot.title = element_text(size = 11,face="bold")) +    
  xlab('Day of Week') +
  ylab('Descending Average Rides’ Time (in min.)') +
  theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
  theme(axis.title.x = element_text(face = "bold")) +
  theme(axis.text.y = element_text(face = "bold")) +
  theme(axis.title.y = element_text(face = "bold"))

#ggsave("Descending Avg Rides’ Time, Only Casual, by each Day of Week, Faceting by Rideable Type.png")

85.-

85.- C.3.1m.- Plot: “Avg. Rides’ Time, Only Member, by each Day of Week in each Month, Faceting by Rideable”

“Avg. Rides’ Time, Only Member, by each Day of Week in each Month (parametrized by the 12 Months of the Year), Faceting by Rideable Type” Plot

summary8columns6_ol %>%
   filter(member_casual == 'member') %>% 
   ggplot(aes(x = name_dayofweek, y = avg_rides_duration_seg/60, group = month, colour = month, shape = month)) +
   geom_point() +
   scale_shape_manual(values=seq(0,11)) +
   scale_color_manual(values=c("#FF66FF","#FF0000","#0000CC","#00994C","#9933FF","#FF3399","#A0A0A0","#000000","#99FFFF","#66FF66","#FFFF00","#FFCC99")) +
   geom_line(linewidth = 1.2) +
   facet_grid(rideable_type ~ .) +
   theme(legend.position="bottom") +
   theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
   theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
   labs(title = 'Avg. Rides’ Time, Only Member, by Day of Week in each Month of Year, Faceting by Rideable') +
   theme(plot.title = element_text(size = 11, face = "bold")) +
   xlab('Day of Week') +
   ylab('Average Rides’ Time (in min.)') +    
   theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
   theme(axis.title.x = element_text(face = "bold")) +
   theme(axis.text.y = element_text(face = "bold")) +
   theme(axis.title.y = element_text(face = "bold"))

#ggsave("Avg. Rides’ Time,Only Member, by Day of Week in each Month of Year, Faceting by Rideable.png")

86.-

86.- C.3.1c.- Plot: “Avg. Rides’ Time, Only Casual, by each Day of Week in each Month, Faceting by Rideable”

“Avg. Rides’ Time, Only Casual, by each Day of Week in each Month (parametrized by the 12 Months of the Year), Faceting by Rideable Type” Plot

summary8columns6_ol %>%
   filter(member_casual == 'casual') %>% 
   ggplot(aes(x = name_dayofweek, y = avg_rides_duration_seg/60, group = month, colour = month, shape = month)) +
   geom_point() +
   scale_shape_manual(values=seq(0,11)) +
   scale_color_manual(values=c("#FF66FF","#FF0000","#0000CC","#00994C","#9933FF","#FF3399","#A0A0A0","#000000","#99FFFF","#66FF66","#FFFF00","#FFCC99")) +
   geom_line(linewidth = 1.2) +
   facet_grid(rideable_type ~ .) +
   theme(legend.position="bottom") +
   theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
   theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
   labs(title = 'Avg. Rides’ Time, Only Casual, by Day of Week in each Month of Year, Faceting by Rideable') +
   theme(plot.title = element_text(size = 11, face = "bold")) +
   xlab('Day of Week') +
   ylab('Average Rides’ Time (in min.)') +    
   theme(axis.text.x = element_text(angle = 30, face = "bold" )) + 
   theme(axis.title.x = element_text(face = "bold")) +
   theme(axis.text.y = element_text(face = "bold")) +
   theme(axis.title.y = element_text(face = "bold"))

#ggsave("Avg. Rides’ Time,Only Casual, by Day of Week in each Month of Year, Faceting by Rideable.png")

S———————————————————————————————————————S

SECTION: VISUALIZATION OF THE MOST IMPORTANT RESULTS LEADING TO SOLVING THE BUSINESS TASK

131-

131.- Plot131: “Rides’ Number by Day of Week, Filling by User Type and Rides’ Number by Day of Week in each Month , Faceting by User Type”

knitr::include_graphics("/Users/user/Desktop/1CaseStudy1v1/1.-Rides' Number by Day of Week, Filling by User Type and Rides' Number by Day of Week in each Month , Faceting by User Type.png")

132-

132.- Plot132: “Rides’ Number by Month, Filling by User Type and Rides’ Number by Month in each Day of Week , Faceting by User Type”

knitr::include_graphics("/Users/user/Desktop/1CaseStudy1v1/2.-Rides' Number by Month, Filling by User Type and Rides' Number by Month in each Day of Week , Faceting by User Type.png")

133-

133.- Plot133: “-Only Member-Rides’ Number by Day of Week, Filling by Rideable Type and Rides’ Number by Day of Week in each Month , Faceting by Rideable Type”

knitr::include_graphics("/Users/user/Desktop/1CaseStudy1v1/3.-Only Member-Rides' Number by Day of Week, Filling by Rideable Type and Rides' Number by Day of Week in each Month , Faceting by Rideable Type.png")

134-

134.- Plot134: “-Only Casual-Rides’ Number by Day of Week, Filling by Rideable Type and Rides’ Number by Day of Week in each Month , Faceting by Rideable Type”

knitr::include_graphics("/Users/user/Desktop/1CaseStudy1v1/4.-Only Casual-Rides' Number by Day of Week, Filling by Rideable Type and Rides' Number by Day of Week in each Month , Faceting by Rideable Type.png")

135-

135.- Plot135:“-Only-Member-Rides’ Number by Month, Filling by Rideable Type and Rides’ Number by Month in each Day of Week , Faceting by Rideable Type”

knitr::include_graphics("/Users/user/Desktop/1CaseStudy1v1/5.-Only Member-Rides' Number by Month, Filling by Rideable Type and Rides' Number by Month in each Day of Week, Faceting by Rideable Type.png")

136-

136.- Plot136: “-Only Casual-Rides’ Number by Month, Filling by Rideable Type and Rides’ Number by Month in each Day of Week, Faceting by Rideable Type”

knitr::include_graphics("/Users/user/Desktop/1CaseStudy1v1/6.-Only Casual-Rides' Number by Month, Filling by Rideable Type and Rides' Number by Month in each Day of Week, Faceting by Rideable Type.png")

141-

141.- Plot141: “Avg Rides’ Time by Day of Week, Filling by User Type and Avg Rides’ Time by Day of Week in each Month , Faceting by User Type”

knitr::include_graphics("/Users/user/Desktop/1CaseStudy1v1/1.-Avg Rides' Time by Day of Week, Filling by User Type and Avg Rides' Time by Day of Week in each Month , Faceting by User Type.png")

142-

142.- Plot142: “Avg Rides’ Time by Month, Filling by User Type and Rides’ Number by Month in each Day of Week , Faceting by User Type”

knitr::include_graphics("/Users/user/Desktop/1CaseStudy1v1/2.-Avg Rides' Time by Month, Filling by User Type and Rides' Number by Month in each Day of Week , Faceting by User Type.png")

143-

143.- Plot143: “-Only Member-Avg Rides’ Time by Day of Week, Filling by Rideable Type and Avg Rides’ Time by Day of Week in each Month , Faceting by Rideable Type”

knitr::include_graphics("/Users/user/Desktop/1CaseStudy1v1/3.-Only Member-Avg Rides' Time by Day of Week, Filling by Rideable Type and Avg Rides' Time by Day of Week in each Month , Faceting by Rideable Type.png")

144-

144.- Plot144: “-Only Casual-Avg Rides’ Time by Day of Week, Filling by Rideable Type and Avg Rides’ Time by Day of Week in each Month , Faceting by Rideable Type”

knitr::include_graphics("/Users/user/Desktop/1CaseStudy1v1/4.-Only Casual-Avg Rides' Time by Day of Week, Filling by Rideable Type and Avg Rides' Time by Day of Week in each Month , Faceting by Rideable Type.png")

145-

145.- Plot145: “-Only-Member-Avg Rides’ Time by Month, Filling by Rideable Type and Avg Rides’ Time by Month in each Day of Week , Faceting by Rideable Type”

knitr::include_graphics("/Users/user/Desktop/1CaseStudy1v1/5.-Only Member-Avg Rides' Time by Month, Filling by Rideable Type and Avg Rides' Time by Month in each Day of Week, Faceting by Rideable Type.png")

146-

146.- Plot146: “-Only-Casual-Avg Rides’ Time by Month, Filling by Rideable Type and Avg Rides’ Time by Month in each Day of Week , Faceting by Rideable Type”

knitr::include_graphics("/Users/user/Desktop/1CaseStudy1v1/6.-Only Casual-Avg Rides' Time by Month, Filling by Rideable Type and Avg Rides' Time by Month in each Day of Week, Faceting by Rideable Type.png")

S———————————————————————————————————————S

PLOTS: FOR CONCLUSIONS

150.-

150.- Plot: 2 plots in one frame: 1.- “Rides’ Number in the Year by User Type, based in each Day of Week”. 2.- “Avg. Rides’ Time in the Year by User Type, based in each Day of Week”

dataA= matrix(c(35.58, 64.42), ncol=2, byrow=TRUE)
colnames(dataA) = c('casual','member')
finalA=as.table(dataA)
#print(finalA)
lblsA <- paste(colnames(dataA), "\n", finalA,"%", sep="")
#print(lblsA)

dataB= matrix(c(62.65, 37.35), ncol=2, byrow=TRUE)
colnames(dataB) = c('casual','member')
finalB=as.table(dataB)
#print(finalB)
lblsB <- paste(colnames(dataB), "\n", finalB,"%", sep="")
#print(lblsB)

par(mfrow=c(1,2), cex.main = 1.0, cex.axis= 0.2, mar = c(3, 7, 2, 1))

pie(finalA, labels = lblsA, col=rainbow(length(lblsA)), main= "   \n Rides' Number in the Year\n by User Type,\n based in each Day of Week")
pie(finalB, labels = lblsB, col=rainbow(length(lblsB)), main= "   \n Avg. Rides' Time in the Year\n by User Type,\n based in each Day of Week")

150.1.- Start of Data Analysis Results

The first graph shows that casual riders made roughly half as many rides as members: 35.58% versus 64.42% of the year’s rides, a Casual/Member ratio of about 0.55.

The second graph shows that the Average Rides’ Time of casual riders is well above that of members: 62.65% versus 37.35%, a Casual/Member ratio of about 1.68, so casual rides last nearly twice as long.

The procedure for converting casual riders into members should begin with a promotion period. The promotion should target whichever of the two analyzed variables (Rides’ Number and/or Average Rides’ Time) is lower for casual riders than for members, so that casual riders match or exceed the members’ values. They can then be offered the member benefits and convert to that class naturally.

From these results, the first recommendation of this data analysis is to implement a strategy that increases the Rides’ Number of casual riders to at least twice this year’s value. The Average Rides’ Time of casual riders does not need to be promoted, because it is already well above the value for members.
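The Casual/Member comparison behind these two statements can be checked with a few lines of base R. This is only a sketch: the shares are copied from the two pie charts above, and the helper name `casual_member_ratio` is a hypothetical name introduced here for illustration.

```r
# Shares (%) copied from the two pie charts in section 150.
rides_share <- c(casual = 35.58, member = 64.42)  # share of Rides' Number
time_share  <- c(casual = 62.65, member = 37.35)  # share of Avg. Rides' Time

# Hypothetical helper: Casual/Member ratio for one pair of shares.
casual_member_ratio <- function(share) {
  unname(share["casual"] / share["member"])
}

ratios <- c(
  rides_number   = casual_member_ratio(rides_share),
  avg_rides_time = casual_member_ratio(time_share)
)

# A ratio below 1 means casual riders are behind members on that variable.
print(round(ratios, 2))
```

A ratio below 1 marks the variable a promotion period would need to raise, and a ratio above 1 marks one that already exceeds the members’ value.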

151.-

151.- Plot: “Ratio Casual/Member in Rides’ Number in each Day of Week, based on User Type”

# fig.width = 12.29, fig.height= 8.00
dataPct5 <- list(Day = c('Mon','Tue','Wed','Thu','Fri','Sat','Sun'), ratio= c(0.45, 0.39, 0.42, 0.43, 0.57, 0.92, 0.87)) 
finalPct5 <- as.data.frame(dataPct5)
gt_finalPct5 <- gt(finalPct5)

# values in table <= 0.57 , in bold
gt_finalPct5 <- gt_finalPct5 %>%
tab_style(style = cell_text(weight = "bold"), 
  locations = cells_body(columns = ratio,          
  rows = ratio <= 0.57))

colors <- c("#FFFFFF", "#F5FCC2","#FF0000", "#7FFF00", "#B3C732", "#A0522D", "#000000")

pie5 <- ggplot(data = finalPct5, aes(x="", y = ratio, fill = reorder(Day,-ratio ))) +
       geom_col(color = "black") +
       coord_polar("y", start = 0) +
       geom_text(aes(x = 1.6, label = ratio), position = position_stack(vjust = 0.5), fontface = "bold") +  
       theme(panel.background = element_blank(),
       axis.line = element_blank(),
       axis.text = element_blank(),
       axis.ticks = element_blank(),
       axis.title = element_blank(),
       plot.title = element_text(hjust = 0.5, size = 20, face="bold")) +
ggtitle(" Ratio Casual/Member in Rides' Number \n in each Day of Week, based on User Type") +
scale_fill_manual(values = colors)

pie5 + gt_finalPct5

151.1.- First Result of the Data Analysis

The graphic (pie chart plus table) shows the Casual/Member ratio of Rides’ Number for each day of the week during the one-year period from October 2023 to September 2024.

The ratio values shown in bold in the table indicate that from Monday to Friday casual riders make on average 0.45 times (roughly 50%) as many rides as members. On weekends, however, the ratio is about 0.9 (close to 1.00, or 100%), which indicates that both types of users make practically the same number of rides.

These results are practically repeated when the same analysis is based on Rideable_Type, which is an important indication to take into account when designing the strategy to convert casual riders into annual members.
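The weekday and weekend averages quoted for this ratio can be reproduced as follows. This is a minimal sketch in base R that assumes the seven ratio values from the table above; the variable names are illustrative and not part of the original analysis.

```r
# Casual/Member ratio of Rides' Number per day, copied from the table in section 151.
ratio_by_day <- c(Mon = 0.45, Tue = 0.39, Wed = 0.42, Thu = 0.43,
                  Fri = 0.57, Sat = 0.92, Sun = 0.87)

# Split the week into working days and weekend days.
is_weekend <- names(ratio_by_day) %in% c("Sat", "Sun")

weekday_mean <- mean(ratio_by_day[!is_weekend])  # Monday to Friday
weekend_mean <- mean(ratio_by_day[is_weekend])   # Saturday and Sunday

print(round(c(weekday = weekday_mean, weekend = weekend_mean), 2))
```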

152.-

152.- Plot: 2 plots in one frame: (1.-) “Rides’ Number in the Year by Rideable Type, based on each Day of Week”. (2.-) “Avg. Rides’ Time in the Year by Rideable Type, based on each Day of Week”

dataC= matrix(c(65.73, 1.13, 33.14), ncol=3, byrow=TRUE)
colnames(dataC) = c('classic_bike','electric_scooter','electric_bike')
finalC=as.table(dataC)
#print(finalC)
lblsC <- paste(colnames(dataC), "\n", finalC,"%", sep="")
#print(lblsC)

dataD= matrix(c(59.82, 3.10, 37.08), ncol=3, byrow=TRUE)
colnames(dataD) = c('classic_bike','electric_scooter','electric_bike')
finalD=as.table(dataD)
#print(finalD)
lblsD <- paste(colnames(dataD), "\n", finalD,"%", sep="")
#print(lblsD)

par(mfrow=c(1,2), cex.main = 1.0, cex.axis= 0.2, mar = c(3, 7, 2, 1))

pie(finalC, labels = lblsC, col=rainbow(length(lblsC)), main= "   \n Rides' Number in the Year\n by Rideable Type,\n based on each Day of Week")
pie(finalD, labels = lblsD, col=rainbow(length(lblsD)), main= "   \n Avg. Rides' Time in the Year\n by Rideable Type,\n based on each Day of Week")

152.1.- Second Result of the Data Analysis

The two pie charts present the Rides’ Number and the Average Rides’ Time for the three Rideable_Type values (classic_bike, electric_scooter, electric_bike) during the one-year period from October 2023 to September 2024, with both user types (member_casual) taken together.

The two graphs show a similar pattern for both variables: classic_bike accounts for roughly twice the Rides’ Number of electric_bike (65.73% versus 33.14%) and about 1.6 times its Average Rides’ Time (59.82% versus 37.08%), while the use of electric_scooter is practically nil compared with classic_bike and electric_bike.

153.-

153.- Plot: “Rides’ Number: % Used by Only Member in each Day of Week in the Year, based on Rideable Type”

# fig.width = 12.29, fig.height= 8.00
dataPct1 <- list(Day=c('Mon-Classic_Bike','Tue-Classic_Bike','Wed-Classic_Bike','Thu-Classic_Bike','Fri-Classic_Bike','Sat-Classic_Bike','Sun-Classic_Bike','Week-ES','Mon-Electric_Bike','Tue-Electric_Bike','Wed-Electric_Bike','Thu-Electric_Bike','Fri-Electric_Bike','Sat-Electric_Bike','Sun-Electric_Bike'), percentage=c(6.39, 6.66, 7.06, 6.67, 5.79, 5.38, 4.95, 0.52, 3.15, 3.33, 3.56, 3.38, 3.02, 2.42, 2.15)) 

finalPct1 <- as.data.frame(dataPct1)
gt_finalPct1 <- gt(finalPct1)

# values in table >= 4.95 , in bold
gt_finalPct1 <- gt_finalPct1 %>%
tab_style(style = cell_text(weight = "bold"), 
  locations = cells_body(columns = percentage,          
  rows = percentage >= 4.95))

colors <- c("#FFFFFF", "#F5FCC2","#FF0000", "#7FFF00", "#B3C732", "#A0522D", "#000000", "#FFFF00", "#EE7600", "#00994C", "#9933FF", "#FF3399", "#A0A0A0", "#718200", "#99FFFF")

pie1 <- ggplot(data = finalPct1, aes(x="", y = percentage, fill = reorder(Day,-percentage ))) +
       geom_col(color = "black") +
       coord_polar("y", start = 0) +
       geom_text(aes(x = 1.6, label = paste0(percentage,"\n", "%")), position = position_stack(vjust = 0.5), fontface = "bold") +
       theme(panel.background = element_blank(),
       axis.line = element_blank(),
       axis.text = element_blank(),
       axis.ticks = element_blank(),
       axis.title = element_blank(),
       plot.title = element_text(hjust = 0.5, size = 20, face="bold")) +
ggtitle("Rides' Number:% Used by Only Member\n in each Day of Week in the Year,\n based on Rideable Type") +
scale_fill_manual(values = colors)

pie1 + gt_finalPct1

153.1.- Third Result of the Data Analysis

Nomenclature: Week-ES = Summary of values (%) of the Electric_Scooter in the full week

The pie chart and the table that generated it show, for members, the Rides’ Number of the three Rideable_Type values (classic_bike, electric_scooter, electric_bike) on each day of the week. Each value is a percentage (%) of the total Rides’ Number made by all three Rideable_Type values and both user types together during the one-year period from October 2023 to September 2024.

The chart and the table to its right show that, on each day of the week during the year under study, members made approximately twice as many rides on classic_bike as on electric_bike.

The summary value (%) of Electric_Scooter for the full week (Week-ES) is negligible compared with the rest of the table values.

154.-

154.- Plot: “Rides’ Number: % Used by Only Casual in each Day of Week in the Year, based on Rideable Type”

#fig.width = 12.29, fig.height= 8.00
dataPct2 <- list(Day=c('Mon-C_Bike','Tue-C_Bike','Wed-C_Bike','Thu-C_Bike','Fri-C_Bike','Sat-C_Bike','Sun-C_Bike','Week-ES','Mon-E_Bike','Tue-E_Bike','Wed-E_Bike','Thu-E_Bike','Fri-E_Bike','Sat-E_Bike','Sun-E_Bike'), percentage=c(2.67, 2.37, 2.73, 2.65, 3.16, 4.98, 4.28, 0.61, 1.54, 1.44,  1.64, 1.63, 1.83, 2.18, 1.89)) 

finalPct2 <- as.data.frame(dataPct2)
gt_finalPct2 <- gt(finalPct2)

# values in table >= 2.37 , in bold
gt_finalPct2 <- gt_finalPct2 %>%
tab_style(style = cell_text(weight = "bold"), 
  locations = cells_body(columns = percentage,          
  rows = percentage >= 2.37))

colors <- c("#FFFFFF", "#F5FCC2","#FF0000", "#7FFF00", "#B3C732", "#A0522D", "#000000", "#FFFF00", "#EE7600", "#00994C", "#9933FF", "#FF3399", "#A0A0A0", "#718200", "#99FFFF")

pie2 <- ggplot(data = finalPct2, aes(x="", y = percentage, fill = reorder(Day,-percentage ))) +
       geom_col(color = "black") +
       coord_polar("y", start = 0) +
       geom_text(aes(x = 1.6, label = paste0(percentage,"\n", "%")), position = position_stack(vjust = 0.5), fontface = "bold") +
       theme(panel.background = element_blank(),
       axis.line = element_blank(),
       axis.text = element_blank(),
       axis.ticks = element_blank(),
       axis.title = element_blank(),
       plot.title = element_text(hjust = 0.5, size = 20, face="bold")) +
ggtitle(" Rides' Number: % Used\n by Only Casual in each Day of Week\n in the Year, based on Rideable Type") +
scale_fill_manual(values = colors)

pie2 + gt_finalPct2

154.1.- Fourth Result of the Data Analysis

The pie chart and the table that generated it show, for casual riders, the Rides’ Number of the three Rideable_Type values (classic_bike, electric_scooter, electric_bike) on each day of the week. Each value is a percentage (%) of the total Rides’ Number made by all three Rideable_Type values and both user types together during the one-year period from October 2023 to September 2024.

The chart and the table to its right show that, on each day of the week during the year under study, casual riders made approximately twice as many rides on classic_bike as on electric_bike.
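One way to check the "approximately double" reading is to add up the per-day percentages by rideable type. The sketch below uses base R only and assumes the labels and values of the casual-rider table above; the data frame name `casual_pct` and the derived `type` column are illustrative additions.

```r
# Per-day percentages for casual riders, copied from the table in section 154.
casual_pct <- data.frame(
  Day = c("Mon-C_Bike", "Tue-C_Bike", "Wed-C_Bike", "Thu-C_Bike",
          "Fri-C_Bike", "Sat-C_Bike", "Sun-C_Bike", "Week-ES",
          "Mon-E_Bike", "Tue-E_Bike", "Wed-E_Bike", "Thu-E_Bike",
          "Fri-E_Bike", "Sat-E_Bike", "Sun-E_Bike"),
  percentage = c(2.67, 2.37, 2.73, 2.65, 3.16, 4.98, 4.28, 0.61,
                 1.54, 1.44, 1.64, 1.63, 1.83, 2.18, 1.89)
)

# The rideable type is the part of the label after the first "-".
casual_pct$type <- sub("^[^-]*-", "", casual_pct$Day)

# Yearly share per rideable type, and the classic/electric ratio.
totals <- tapply(casual_pct$percentage, casual_pct$type, sum)
classic_to_electric <- totals[["C_Bike"]] / totals[["E_Bike"]]

print(round(totals, 2))
print(round(classic_to_electric, 2))
```

The same steps apply to the members table in section 153 after swapping in its labels and values.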

155.-

155.- Plot: “Rides’ Number: % Used by Rideable_Type in one Year, by User Type”

#fig.width = 12.29, fig.height= 8.00
dataPct3 <- list(Rideable= c('classic_bike-M','electric_scooter-M','electric_bike-M','classic_bike-C','electric_scooter-C','electric_bike-C'), 
percentage= c(42.90, 0.52, 21.00, 22.84, 0.61, 12.14))
finalPct3 <- as.data.frame(dataPct3)

gt_finalPct3 <- gt(finalPct3)

colors <- c("#FFFFFF", "#F5FCC2","#FF0000", "#7FFF00", "#B3C732", "#A0522D")

pie3 <- ggplot(data = finalPct3, aes(x="", y = percentage, fill = reorder(Rideable,-percentage))) +  
       geom_col(color = "black") +
       coord_polar("y", start = 0) +
       geom_text(aes(x = 1.6, label = paste0(percentage,"\n", "%")), position = position_stack(vjust = 0.5), fontface = "bold") +
       theme(panel.background = element_blank(),
       axis.line = element_blank(),
       axis.text = element_blank(),
       axis.ticks = element_blank(),
       axis.title = element_blank(),
       plot.title = element_text(hjust = 0.5, size = 20, face="bold")) +
ggtitle(" Rides' Number: % Used\n by Rideable_Type\n in one Year, by User Type") +
scale_fill_manual(values = colors)

pie3 + gt_finalPct3

155.1.- Fifth Result of the Data Analysis.

Nomenclature: Member = M, Casual = C

The pie chart and the table that generated it show the relation between the Rides’ Number of the two main Rideable_Type values (classic_bike, electric_bike). The classic_bike-M/electric_bike-M ratio is 2.04 and the classic_bike-C/electric_bike-C ratio is 1.88. The classic_bike-C/classic_bike-M ratio is 0.53 and the electric_bike-C/electric_bike-M ratio is 0.58. The use of electric_scooter is practically nil compared with classic_bike and electric_bike for both user types.

These ratios lead to two observations. Within each user type (Member/Member and Casual/Casual), classic_bike accounts for practically double the Rides’ Number of electric_bike. Across user types (Casual/Member), for both classic_bike and electric_bike, casual riders made practically half the Rides’ Number of members.
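The four ratios discussed in this section follow directly from the percentages in the table. A minimal base R sketch, assuming those six values; the vector `pct` and the helper `ratio` are hypothetical names used only for this illustration:

```r
# Share (%) of Rides' Number per Rideable_Type and user type, from section 155.
pct <- c("classic_bike-M" = 42.90, "electric_scooter-M" = 0.52,
         "electric_bike-M" = 21.00, "classic_bike-C" = 22.84,
         "electric_scooter-C" = 0.61, "electric_bike-C" = 12.14)

# Hypothetical helper: ratio between two named entries of pct.
ratio <- function(num, den) unname(pct[num] / pct[den])

ratios <- c(
  classic_vs_electric_member = ratio("classic_bike-M", "electric_bike-M"),
  classic_vs_electric_casual = ratio("classic_bike-C", "electric_bike-C"),
  casual_vs_member_classic   = ratio("classic_bike-C", "classic_bike-M"),
  casual_vs_member_electric  = ratio("electric_bike-C", "electric_bike-M")
)

print(round(ratios, 2))
```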

156.-

156.- Plot: “Avg Rides’ Time: % Used by Rideable_Type in one Year, by User Type”

#fig.width = 12.29, fig.height= 8.00
dataPct4 <- list(Rideable= c('classic_bike-M','electric_scooter-M','electric_bike-M','classic_bike-C','electric_scooter-C','electric_bike-C'), 
percentage= c(20.03, 1.32, 15.99, 39.78, 1.78, 21.09)) 
finalPct4 <- as.data.frame(dataPct4)
gt_finalPct4 <- gt(finalPct4)

colors <- c("#FFFFFF", "#F5FCC2","#FF0000", "#7FFF00", "#B3C732", "#A0522D")

pie4 <- ggplot(data = finalPct4, aes(x="", y = percentage, fill = reorder(Rideable,-percentage ))) +
       geom_col(color = "black") +
       coord_polar("y", start = 0) +
       geom_text(aes(x = 1.6, label = paste0(percentage,"\n", "%")), position = position_stack(vjust = 0.5), fontface = "bold") +
       theme(panel.background = element_blank(),
       axis.line = element_blank(),
       axis.text = element_blank(),
       axis.ticks = element_blank(),
       axis.title = element_blank(),
       plot.title = element_text(hjust = 0.5, size = 20, face="bold")) +
ggtitle(" Avg Rides' Time: % Used\n by Rideable_Type in one Year,\n by User Type") +
scale_fill_manual(values = colors)

pie4 + gt_finalPct4

156.1.- Sixth Result of the Data Analysis

Nomenclature: Member = M, Casual = C

The graph (pie chart plus table) presents the Average Rides’ Time for the three Rideable_Type values (classic_bike, electric_scooter, electric_bike) during the one-year period from October 2023 to September 2024, broken down by the contribution of the two user types (Member (M), Casual (C)).

The pie chart and the table that generated it show the relationship between the Average Rides’ Time of the three Rideable_Type values (classic_bike, electric_scooter, electric_bike). First, the classic_bike-M/electric_bike-M ratio is 1.25 (20.03/15.99). Second, the classic_bike-C/electric_bike-C ratio is about 1.89 (39.78/21.09). Third, the classic_bike-C/classic_bike-M ratio is about 1.99 (39.78/20.03). Fourth, the electric_bike-C/electric_bike-M ratio is 1.32 (21.09/15.99). Fifth, the use of electric_scooter is practically nil compared with classic_bike and electric_bike.

From the paragraph above, it does not seem necessary to include Average Rides’ Time in the strategy for converting casual riders into annual members, because casual riders already have higher Average Rides’ Time values than annual members.

157.-

157.- Plot: “Ratio: Casual/Member Avg Rides’ Time, in each Day of Week, in one Year, based on User Type”

# fig.width = 12.29, fig.height= 8.00
dataPct6 <- list(Day = c('Mon','Tue','Wed','Thu','Fri','Sat','Sun'), ratio= c(1.77, 1.60, 1.60, 1.57, 1.70, 1.70, 1.79)) 
finalPct6 <- as.data.frame(dataPct6)
gt_finalPct6 <- gt(finalPct6)

# values in table >= 1.57 , in bold
gt_finalPct6 <- gt_finalPct6 %>%
tab_style(style = cell_text(weight = "bold"), 
  locations = cells_body(columns = ratio,          
  rows = ratio >= 1.57))

colors <- c("#FFFFFF", "#F5FCC2","#FF0000", "#7FFF00", "#B3C732", "#A0522D", "#000000")

pie6 <- ggplot(data = finalPct6, aes(x="", y = ratio, fill = reorder(Day,-ratio ))) +
       geom_col(color = "black") +
       coord_polar("y", start = 0) +
       geom_text(aes(x = 1.6, label = ratio), position = position_stack(vjust = 0.5), fontface = "bold") +
       theme(panel.background = element_blank(),
       axis.line = element_blank(),
       axis.text = element_blank(),
       axis.ticks = element_blank(),
       axis.title = element_blank(),
       plot.title = element_text(hjust = 0.5, size = 20, face="bold")) +
ggtitle(" Ratio Casual/Member in\n Avg Rides' Time, in each Day of Week \n in one Year, based on User Type") +
scale_fill_manual(values = colors)

pie6 + gt_finalPct6

157.1.- Seventh Result of the Data Analysis

The graphic (pie chart plus table) presents the Casual/Member ratio of Average Rides’ Time for each day of the week during the one-year period from October 2023 to September 2024.

Over the whole week the Casual/Member ratio of Average Rides’ Time averages 1.68, which means that the average ride time of casual riders is about 68% longer than that of members. The strategy to convert casual riders into annual members therefore does not need to address Average Rides’ Time, because casual riders already exceed the members’ values.
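The weekly average of this ratio, and its reading as a percentage difference, can be verified with the short base R sketch below. It assumes the seven daily ratios from the table above; the variable names are illustrative.

```r
# Casual/Member ratio of Avg Rides' Time per day, copied from section 157.
time_ratio <- c(Mon = 1.77, Tue = 1.60, Wed = 1.60, Thu = 1.57,
                Fri = 1.70, Sat = 1.70, Sun = 1.79)

# Average ratio over the full week.
mean_ratio <- mean(time_ratio)

# A ratio r means the casual value is (r - 1) * 100 percent above the member value.
pct_longer <- (mean_ratio - 1) * 100

print(round(mean_ratio, 2))
print(round(pct_longer))
```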

158.-

158.- Plot: “Ratio: Casual/Member Classic_Bike,in Rides’ Number in each Day of Week,in one Year, based on Rideable_Type”

#fig.width = 12.29, fig.height= 8.00
dataPct7 <- list(Day = c('Mon','Tue','Wed','Thu','Fri','Sat','Sun'), ratio= c(0.42, 0.36, 0.39, 0.40, 0.55, 0.93, 0.86)) 
finalPct7 <- as.data.frame(dataPct7)
gt_finalPct7 <- gt(finalPct7)

# values in table <= 0.55 , in bold
gt_finalPct7 <- gt_finalPct7 %>%
tab_style(style = cell_text(weight = "bold"), 
  locations = cells_body(columns = ratio,          
  rows = ratio <= 0.55))

colors <- c("#FFFFFF", "#F5FCC2","#FF0000", "#7FFF00", "#B3C732", "#A0522D", "#000000")

pie7 <- ggplot(data = finalPct7, aes(x="", y = ratio, fill = reorder(Day,-ratio ))) +
       geom_col(color = "black") +
       coord_polar("y", start = 0) +
       geom_text(aes(x = 1.6, label = ratio), position = position_stack(vjust = 0.5), fontface = "bold") +
       theme(panel.background = element_blank(),
       axis.line = element_blank(),
       axis.text = element_blank(),
       axis.ticks = element_blank(),
       axis.title = element_blank(),
       plot.title = element_text(hjust = 0.5, size = 20, face="bold")) +
ggtitle(" Ratio: Casual/Member Classic_Bike,\n in Rides' Number in each Day of Week,\n  in one Year, based on Rideable_Type") +
scale_fill_manual(values = colors)

pie7 + gt_finalPct7

158.1.- Eighth Result of the Data Analysis

The graph (pie chart plus table) presents the Casual/Member ratio of Rides’ Number for the Classic_Bike Rideable_Type on each day of the week during the one-year period from October 2023 to September 2024.

The ratio values shown in bold indicate that from Monday to Friday casual riders make on average 0.42 times (a little under 50%) as many Classic_Bike rides as members. On weekends the ratio is about 0.9 (close to 1.00, or 100%), which indicates that both types of users make basically the same number of rides.

These results are practically repeated in the following section (159.-), which uses the Electric_Bike Rideable_Type as its base, and in the same analysis based on User_Type (see section 151.-). This is an important indication for the strategy for converting casual riders into members: the way to increase the Rides’ Number of casual riders is to promote the use of classic_bike and electric_bike from Monday to Friday, aiming at twice the current value.
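To make the "twice the current value" target concrete, the weekday ratios can be inverted to show how much casual rides would need to grow to match members. This is a sketch under a stated assumption: parity with members (a ratio of 1.00) is taken as the target, which the analysis itself does not fix. The ratio values are copied from the tables in sections 158 and 159, and the object names are illustrative.

```r
# Weekday Casual/Member ratios of Rides' Number, from sections 158 and 159.
classic_ratio  <- c(Mon = 0.42, Tue = 0.36, Wed = 0.39, Thu = 0.40, Fri = 0.55)
electric_ratio <- c(Mon = 0.49, Tue = 0.43, Wed = 0.46, Thu = 0.48, Fri = 0.61)

# Assumed target: casual rides equal member rides (ratio = 1.00).
# The required growth factor is then the inverse of the current ratio.
growth_needed <- rbind(
  classic_bike  = 1 / classic_ratio,
  electric_bike = 1 / electric_ratio
)

print(round(growth_needed, 2))
print(round(rowMeans(growth_needed), 2))
```

Under this assumption, a factor near 2 corresponds to the doubling proposed above, while the lowest ratios imply somewhat more than a doubling.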

159.-

159.- Plot: “Ratio: Casual/Member Electric_Bike, in Rides’ Number in each Day of Week, in one Year, based on Rideable_Type”

# fig.width = 12.29, fig.height= 8.00
dataPct8 <- list(Day = c('Mon','Tue','Wed','Thu','Fri','Sat','Sun'), ratio= c(0.49, 0.43, 0.46, 0.48, 0.61, 0.90, 0.88)) 
finalPct8 <- as.data.frame(dataPct8)
gt_finalPct8 <- gt(finalPct8)

# values in table <= 0.61 , in bold
gt_finalPct8 <- gt_finalPct8 %>%
tab_style(style = cell_text(weight = "bold"), 
  locations = cells_body(columns = ratio,          
  rows = ratio <= 0.61))

colors <- c("#FFFFFF", "#F5FCC2","#FF0000", "#7FFF00", "#B3C732", "#A0522D", "#000000")

pie8 <- ggplot(data = finalPct8, aes(x="", y = ratio, fill = reorder(Day,-ratio))) +
       geom_col(color = "black") +
       coord_polar("y", start = 0) +
       geom_text(aes(x = 1.6, label = ratio), position = position_stack(vjust = 0.5), fontface = "bold") +
       theme(panel.background = element_blank(),
       axis.line = element_blank(),
       axis.text = element_blank(),
       axis.ticks = element_blank(),
       axis.title = element_blank(),
       plot.title = element_text(hjust = 0.5, size = 20, face="bold")) +
ggtitle(" Ratio: Casual/Member Electric_Bike,\n in Rides' Number in each Day of Week,\n in one Year, based on Rideable_Type") +
scale_fill_manual(values = colors)

pie8 + gt_finalPct8

S———————————————————————————————————————S

CONCLUSIONS

160.-

160.- CONCLUSIONS

Introduction:

The procedure or strategy for converting casual riders into members should begin with a promotion period. The promotion should target whichever of the two analyzed variables (Rides’ Number and/or Average Rides’ Time) is lower for casual riders than for members, so that casual riders match or exceed the members’ values. They can then be offered the member benefits and convert to that class naturally.

The “Details” section below illustrates the difference between the cycling behaviour of casual riders and annual members.

Details:

1.- The first conclusion of this data analysis is to implement a strategy that increases the Rides’ Number of casual riders to at least twice this year’s value; this point is explained in the following conclusion. The Average Rides’ Time of casual riders does not need to be promoted in the strategy, because it is already well above the value for members, as indicated in section No. 156 (Avg Rides’ Time: % Used by Rideable_Type in one Year, by User Type).

2.- The Casual/Member ratio values shown in bold in the table in section No. 151 (Ratio Casual/Member in Rides’ Number in each Day of Week, based on User Type) indicate that from Monday to Friday casual riders make on average 0.45 times (roughly 50%) as many rides as members. On weekends, however, the ratio is about 0.9 (close to 1.00, or 100%), which indicates that both types of users make practically the same number of rides. These results are practically repeated when the same analysis is based on Rideable_Type, in sections No. 158 and No. 159 (Ratio: Casual/Member Classic_Bike, in Rides’ Number in each Day of Week, in one Year, based on Rideable_Type and Ratio: Casual/Member Electric_Bike, in Rides’ Number in each Day of Week, in one Year, based on Rideable_Type), which is an important indication to take into account when designing the strategy to convert casual riders into annual members.

3.- To implement the proposed strategy of increasing the Rides’ Number from Monday to Friday effectively, I would propose using social networks, emphasizing greater use of the electric_bike. For this, Divvy can run an education campaign on how to use the electric_bike and a promotion campaign for it named “Don’t disguise as a cyclist”, showing that casual riders can use the electric_bike wearing their everyday office clothes, as no physical effort is made.

END

End Note: For a less detailed view of this project, please visit: https://www.kaggle.com/code/bernardomelendez/divvy-bike-sharing-oct-23-sep-24-case-study