Welcome to the Cyclistic bike-share analysis case study! In this case study, you will perform many real-world tasks of a junior data analyst. You will work for a fictional company, Cyclistic, and meet different characters and team members. In order to answer the key business questions, you will follow the steps of the data analysis process: ask, prepare, process, analyze, share, and act. Along the way, the Case Study Roadmap tables — including guiding questions and key tasks — will help you stay on the right path. By the end of this lesson, you will have a portfolio-ready case study. Download the packet and reference the details of this case study anytime. Then, when you begin your job hunt, your case study will be a tangible way to demonstrate your knowledge and skills to potential employers.
You are a junior data analyst working in the marketing analyst team at Cyclistic, a bike-share company in Chicago. The director of marketing believes the company’s future success depends on maximizing the number of annual memberships. Therefore, your team wants to understand how casual riders and annual members use Cyclistic bikes differently. From these insights, your team will design a new marketing strategy to convert casual riders into annual members. But first, Cyclistic executives must approve your recommendations, so they must be backed up with compelling data insights and professional data visualizations.
● Cyclistic: A bike-share program that features more than 5,800 bicycles and 600 docking stations. Cyclistic sets itself apart by also offering reclining bikes, hand tricycles, and cargo bikes, making bike-share more inclusive to people with disabilities and riders who can’t use a standard two-wheeled bike. The majority of riders opt for traditional bikes; about 8% of riders use the assistive options. Cyclistic users are more likely to ride for leisure, but about 30% use them to commute to work each day.
● Lily Moreno: The director of marketing and your manager. Moreno is responsible for the development of campaigns and initiatives to promote the bike-share program. These may include email, social media, and other channels.
● Cyclistic marketing analytics team: A team of data analysts who are responsible for collecting, analyzing, and reporting data that helps guide Cyclistic marketing strategy. You joined this team six months ago and have been busy learning about Cyclistic’s mission and business goals, as well as how you, as a junior data analyst, can help Cyclistic achieve them.
● Cyclistic executive team: The notoriously detail-oriented executive team will decide whether to approve the recommended marketing program.
In 2016, Cyclistic launched a successful bike-share offering. Since then, the program has grown to a fleet of 5,824 bicycles that are geotracked and locked into a network of 692 stations across Chicago. The bikes can be unlocked from one station and returned to any other station in the system anytime. Until now, Cyclistic’s marketing strategy relied on building general awareness and appealing to broad consumer segments. One approach that helped make these things possible was the flexibility of its pricing plans: single-ride passes, full-day passes, and annual memberships. Customers who purchase single-ride or full-day passes are referred to as casual riders. Customers who purchase annual memberships are Cyclistic members. Cyclistic’s finance analysts have concluded that annual members are much more profitable than casual riders. Although the pricing flexibility helps Cyclistic attract more customers, Moreno believes that maximizing the number of annual members will be key to future growth. Rather than creating a marketing campaign that targets all-new customers, Moreno believes there is a very good chance to convert casual riders into members. She notes that casual riders are already aware of the Cyclistic program and have chosen Cyclistic for their mobility needs.
Moreno has set a clear goal: Design marketing strategies aimed at converting casual riders into annual members. In order to do that, however, the marketing analyst team needs to better understand how annual members and casual riders differ, why casual riders would buy a membership, and how digital media could affect their marketing tactics. Moreno and her team are interested in analyzing the Cyclistic historical bike trip data to identify trends.
Notes: setting up my R environment by loading the appropriate packages
1.-
1.1.-
library(tidyverse)
library(readr)
library(lubridate)
library(tidyr)
library(ggplot2)
library(data.table)
1.2.-
library(tidytext)
library(dplyr)
library(gt)
2.-
https://readr.tidyverse.org/reference/read_delim.html
tripdata_202310 <- read_csv("/Users/user/Desktop/1CaseStudy1v1/202310-divvy-tripdata.csv")
## Rows: 537113 Columns: 13
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (7): ride_id, rideable_type, start_station_name, start_station_id, end_...
## dbl (4): start_lat, start_lng, end_lat, end_lng
## dttm (2): started_at, ended_at
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#View(tripdata_202310)
print(tripdata_202310)
## # A tibble: 537,113 × 13
## ride_id rideable_type started_at ended_at
## <chr> <chr> <dttm> <dttm>
## 1 4449097279F8BBE7 classic_bike 2023-10-08 10:36:26 2023-10-08 10:49:19
## 2 9CF060543CA7B439 electric_bike 2023-10-11 17:23:59 2023-10-11 17:36:08
## 3 667F21F4D6BDE69C electric_bike 2023-10-12 07:02:33 2023-10-12 07:06:53
## 4 F92714CC6B019B96 classic_bike 2023-10-24 19:13:03 2023-10-24 19:18:29
## 5 5E34BA5DE945A9CC classic_bike 2023-10-09 18:19:26 2023-10-09 18:30:56
## 6 F7D7420AFAC53CD9 electric_bike 2023-10-04 17:10:59 2023-10-04 17:25:21
## 7 870B2D4CD112D7B7 electric_bike 2023-10-31 17:32:20 2023-10-31 17:44:20
## 8 D9179D36E32D456C classic_bike 2023-10-02 18:51:51 2023-10-02 18:57:09
## 9 F8E131281F722FEF classic_bike 2023-10-17 08:28:18 2023-10-17 08:50:03
## 10 91938B71748FA405 classic_bike 2023-10-17 19:17:38 2023-10-17 19:32:23
## # ℹ 537,103 more rows
## # ℹ 9 more variables: start_station_name <chr>, start_station_id <chr>,
## # end_station_name <chr>, end_station_id <chr>, start_lat <dbl>,
## # start_lng <dbl>, end_lat <dbl>, end_lng <dbl>, member_casual <chr>
2.2.-
tripdata_202311 <- read_csv("/Users/user/Desktop/1CaseStudy1v1/202311-divvy-tripdata.csv")
#View(tripdata_202311)
tripdata_202312 <- read_csv("/Users/user/Desktop/1CaseStudy1v1/202312-divvy-tripdata.csv")
#View(tripdata_202312)
tripdata_202401 <- read_csv("/Users/user/Desktop/1CaseStudy1v1/202401-divvy-tripdata.csv")
#View(tripdata_202401)
tripdata_202402 <- read_csv("/Users/user/Desktop/1CaseStudy1v1/202402-divvy-tripdata.csv")
#View(tripdata_202402)
tripdata_202403 <- read_csv("/Users/user/Desktop/1CaseStudy1v1/202403-divvy-tripdata.csv")
#View(tripdata_202403)
tripdata_202404 <- read_csv("/Users/user/Desktop/1CaseStudy1v1/202404-divvy-tripdata.csv")
#View(tripdata_202404)
tripdata_202405 <- read_csv("/Users/user/Desktop/1CaseStudy1v1/202405-divvy-tripdata.csv")
#View(tripdata_202405)
tripdata_202406 <- read_csv("/Users/user/Desktop/1CaseStudy1v1/202406-divvy-tripdata.csv")
#View(tripdata_202406)
tripdata_202407 <- read_csv("/Users/user/Desktop/1CaseStudy1v1/202407-divvy-tripdata.csv")
#View(tripdata_202407)
tripdata_202408 <- read_csv("/Users/user/Desktop/1CaseStudy1v1/202408-divvy-tripdata.csv")
#View(tripdata_202408)
2.3.-
tripdata_202409 <- read_csv("/Users/user/Desktop/1CaseStudy1v1/202409-divvy-tripdata.csv")
## Rows: 821276 Columns: 13
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (7): ride_id, rideable_type, start_station_name, start_station_id, end_...
## dbl (4): start_lat, start_lng, end_lat, end_lng
## dttm (2): started_at, ended_at
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#View(tripdata_202409)
print(tripdata_202409)
## # A tibble: 821,276 × 13
## ride_id rideable_type started_at ended_at
## <chr> <chr> <dttm> <dttm>
## 1 31D38723D5A8665A electric_bike 2024-09-26 15:30:58 2024-09-26 15:30:59
## 2 67CB39987F4E895B electric_bike 2024-09-26 15:31:32 2024-09-26 15:53:13
## 3 DA61204FD26EC681 electric_bike 2024-09-26 15:00:33 2024-09-26 15:02:25
## 4 06F160D46AF235DD electric_bike 2024-09-26 18:19:06 2024-09-26 18:38:53
## 5 6FCA41D4317601EB electric_bike 2024-09-03 19:49:57 2024-09-03 20:07:08
## 6 9F291E82895C45E5 electric_bike 2024-09-04 01:45:18 2024-09-04 02:01:38
## 7 625D2EA831E1F8AC electric_bike 2024-09-04 16:22:16 2024-09-04 16:26:20
## 8 A21DCB6834BCAD0D electric_bike 2024-09-04 16:31:58 2024-09-04 16:38:52
## 9 0EEB8A4CF63DA7AE electric_bike 2024-09-28 20:30:28 2024-09-28 20:33:20
## 10 6CE10020F5D0D7B8 electric_bike 2024-09-28 20:10:48 2024-09-28 20:24:32
## # ℹ 821,266 more rows
## # ℹ 9 more variables: start_station_name <chr>, start_station_id <chr>,
## # end_station_name <chr>, end_station_id <chr>, start_lat <dbl>,
## # start_lng <dbl>, end_lat <dbl>, end_lng <dbl>, member_casual <chr>
2.3.1.- tripdata_202409 tibble: 821,276 rows, 13 columns
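The twelve read_csv() calls above differ only in the month in the file name, so the same step can also be written as a loop over the files. The following is a minimal sketch under stated assumptions, not the code used in this analysis: the folder, the two tiny stand-in files and the names data_dir, monthly and all_trips are hypothetical, and base R's read.csv() and rbind() stand in for read_csv() and bind_rows() so that the example runs without extra packages.

```r
# Hypothetical sketch: read every monthly CSV in a folder and stack them.
# A temporary folder with two tiny stand-in files replaces the real data.
data_dir <- file.path(tempdir(), "tripdata_demo")
dir.create(data_dir, showWarnings = FALSE)

# Stand-in files that follow the "YYYYMM-divvy-tripdata.csv" naming pattern
write.csv(data.frame(ride_id = c("A1", "A2"),
                     member_casual = c("member", "casual")),
          file.path(data_dir, "202310-divvy-tripdata.csv"), row.names = FALSE)
write.csv(data.frame(ride_id = "B1", member_casual = "member"),
          file.path(data_dir, "202311-divvy-tripdata.csv"), row.names = FALSE)

# Find the files, read each one, and name the list elements after the files
csv_files <- sort(list.files(data_dir, pattern = "-divvy-tripdata\\.csv$",
                             full.names = TRUE))
monthly <- lapply(csv_files, read.csv, stringsAsFactors = FALSE)
names(monthly) <- basename(csv_files)

# Stack all months into one data frame
all_trips <- do.call(rbind, monthly)
print(nrow(all_trips))
```

Keeping the monthly data in a named list also makes it easy to run the same check on every month before combining them.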
3.-
https://www.geeksforgeeks.org/check-data-type-of-each-dataframe-column-in-r/
sapply(tripdata_202310,class)
## $ride_id
## [1] "character"
##
## $rideable_type
## [1] "character"
##
## $started_at
## [1] "POSIXct" "POSIXt"
##
## $ended_at
## [1] "POSIXct" "POSIXt"
##
## $start_station_name
## [1] "character"
##
## $start_station_id
## [1] "character"
##
## $end_station_name
## [1] "character"
##
## $end_station_id
## [1] "character"
##
## $start_lat
## [1] "numeric"
##
## $start_lng
## [1] "numeric"
##
## $end_lat
## [1] "numeric"
##
## $end_lng
## [1] "numeric"
##
## $member_casual
## [1] "character"
#sapply(tripdata_202311,class)
#sapply(tripdata_202312,class)
#sapply(tripdata_202401,class)
#sapply(tripdata_202402,class)
#sapply(tripdata_202403,class)
#sapply(tripdata_202404,class)
#sapply(tripdata_202405,class)
#sapply(tripdata_202406,class)
#sapply(tripdata_202407,class)
#sapply(tripdata_202408,class)
sapply(tripdata_202409,class)
## $ride_id
## [1] "character"
##
## $rideable_type
## [1] "character"
##
## $started_at
## [1] "POSIXct" "POSIXt"
##
## $ended_at
## [1] "POSIXct" "POSIXt"
##
## $start_station_name
## [1] "character"
##
## $start_station_id
## [1] "character"
##
## $end_station_name
## [1] "character"
##
## $end_station_id
## [1] "character"
##
## $start_lat
## [1] "numeric"
##
## $start_lng
## [1] "numeric"
##
## $end_lat
## [1] "numeric"
##
## $end_lng
## [1] "numeric"
##
## $member_casual
## [1] "character"
3.1.- All 12 datasets have the same column names, and each column has the same data type in every dataset.
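The comparison of column names and data types can also be done programmatically instead of by reading the sapply() output for each month. A minimal sketch, assuming small toy data frames in place of the monthly tibbles; the helper same_schema() and the objects m1, m2 and m3 are hypothetical names introduced only for this illustration.

```r
# Toy stand-ins for the monthly tibbles (the real ones have 13 columns)
m1 <- data.frame(ride_id = "A1", start_lat = 41.9, stringsAsFactors = FALSE)
m2 <- data.frame(ride_id = "B1", start_lat = 41.8, stringsAsFactors = FALSE)
# Third data frame stores start_lat as text on purpose, to show a mismatch
m3 <- data.frame(ride_id = "C1", start_lat = "41.7", stringsAsFactors = FALSE)

# Hypothetical helper: TRUE when every data frame has the same column names
# and the same column classes as the first one in the list
same_schema <- function(dfs) {
  reference <- lapply(dfs[[1]], class)
  all(vapply(dfs,
             function(d) identical(lapply(d, class), reference),
             logical(1)))
}

print(same_schema(list(m1, m2)))      # matching names and classes
print(same_schema(list(m1, m2, m3)))  # m3 breaks the match
```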
4.-
Definition of: oneyear_trips tibble
https://dplyr.tidyverse.org/reference/bind_rows.html
oneyear_trips <- bind_rows(tripdata_202310, tripdata_202311, tripdata_202312, tripdata_202401,tripdata_202402, tripdata_202403,tripdata_202404, tripdata_202405, tripdata_202406, tripdata_202407, tripdata_202408, tripdata_202409)
#View(oneyear_trips)
print(oneyear_trips)
## # A tibble: 5,860,374 × 13
## ride_id rideable_type started_at ended_at
## <chr> <chr> <dttm> <dttm>
## 1 4449097279F8BBE7 classic_bike 2023-10-08 10:36:26 2023-10-08 10:49:19
## 2 9CF060543CA7B439 electric_bike 2023-10-11 17:23:59 2023-10-11 17:36:08
## 3 667F21F4D6BDE69C electric_bike 2023-10-12 07:02:33 2023-10-12 07:06:53
## 4 F92714CC6B019B96 classic_bike 2023-10-24 19:13:03 2023-10-24 19:18:29
## 5 5E34BA5DE945A9CC classic_bike 2023-10-09 18:19:26 2023-10-09 18:30:56
## 6 F7D7420AFAC53CD9 electric_bike 2023-10-04 17:10:59 2023-10-04 17:25:21
## 7 870B2D4CD112D7B7 electric_bike 2023-10-31 17:32:20 2023-10-31 17:44:20
## 8 D9179D36E32D456C classic_bike 2023-10-02 18:51:51 2023-10-02 18:57:09
## 9 F8E131281F722FEF classic_bike 2023-10-17 08:28:18 2023-10-17 08:50:03
## 10 91938B71748FA405 classic_bike 2023-10-17 19:17:38 2023-10-17 19:32:23
## # ℹ 5,860,364 more rows
## # ℹ 9 more variables: start_station_name <chr>, start_station_id <chr>,
## # end_station_name <chr>, end_station_id <chr>, start_lat <dbl>,
## # start_lng <dbl>, end_lat <dbl>, end_lng <dbl>, member_casual <chr>
4.1.- oneyear_trips tibble: 5,860,374 rows, 13 columns
4.2.-
https://www.geeksforgeeks.org/kable-method-in-r/
library(knitr)
kable(oneyear_trips[1:5, ], caption= "oneyear_trips" )
| ride_id | rideable_type | started_at | ended_at | start_station_name | start_station_id | end_station_name | end_station_id | start_lat | start_lng | end_lat | end_lng | member_casual |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4449097279F8BBE7 | classic_bike | 2023-10-08 10:36:26 | 2023-10-08 10:49:19 | Orleans St & Chestnut St (NEXT Apts) | 620 | Sheffield Ave & Webster Ave | TA1309000033 | 41.89820 | -87.63754 | 41.92154 | -87.65382 | member |
| 9CF060543CA7B439 | electric_bike | 2023-10-11 17:23:59 | 2023-10-11 17:36:08 | Desplaines St & Kinzie St | TA1306000003 | Sheffield Ave & Webster Ave | TA1309000033 | 41.88864 | -87.64441 | 41.92154 | -87.65382 | member |
| 667F21F4D6BDE69C | electric_bike | 2023-10-12 07:02:33 | 2023-10-12 07:06:53 | Orleans St & Chestnut St (NEXT Apts) | 620 | Franklin St & Lake St | TA1307000111 | 41.89807 | -87.63751 | 41.88584 | -87.63550 | member |
| F92714CC6B019B96 | classic_bike | 2023-10-24 19:13:03 | 2023-10-24 19:18:29 | Desplaines St & Kinzie St | TA1306000003 | Franklin St & Lake St | TA1307000111 | 41.88872 | -87.64445 | 41.88584 | -87.63550 | member |
| 5E34BA5DE945A9CC | classic_bike | 2023-10-09 18:19:26 | 2023-10-09 18:30:56 | Desplaines St & Kinzie St | TA1306000003 | Franklin St & Lake St | TA1307000111 | 41.88872 | -87.64445 | 41.88584 | -87.63550 | member |
5.-
Find the number of NA values in each column
https://www.statology.org/is-na/
https://www.statology.org/colsums-function-in-r/
na_counts_oneyear_trips <- colSums(is.na(oneyear_trips))
#View(na_counts_oneyear_trips)
print(na_counts_oneyear_trips)
## ride_id rideable_type started_at ended_at
## 0 0 0 0
## start_station_name start_station_id end_station_name end_station_id
## 1056882 1056882 1092195 1092195
## start_lat start_lng end_lat end_lng
## 0 0 7458 7458
## member_casual
## 0
6.-
Remove the rows that have an NA in any field, then verify that no NA values remain.
Create a new tibble
Definition of: oneyear_trips_clean tibble
oneyear_trips_clean <- drop_na(oneyear_trips)
print(oneyear_trips_clean)
## # A tibble: 4,233,605 × 13
## ride_id rideable_type started_at ended_at
## <chr> <chr> <dttm> <dttm>
## 1 4449097279F8BBE7 classic_bike 2023-10-08 10:36:26 2023-10-08 10:49:19
## 2 9CF060543CA7B439 electric_bike 2023-10-11 17:23:59 2023-10-11 17:36:08
## 3 667F21F4D6BDE69C electric_bike 2023-10-12 07:02:33 2023-10-12 07:06:53
## 4 F92714CC6B019B96 classic_bike 2023-10-24 19:13:03 2023-10-24 19:18:29
## 5 5E34BA5DE945A9CC classic_bike 2023-10-09 18:19:26 2023-10-09 18:30:56
## 6 F7D7420AFAC53CD9 electric_bike 2023-10-04 17:10:59 2023-10-04 17:25:21
## 7 870B2D4CD112D7B7 electric_bike 2023-10-31 17:32:20 2023-10-31 17:44:20
## 8 D9179D36E32D456C classic_bike 2023-10-02 18:51:51 2023-10-02 18:57:09
## 9 F8E131281F722FEF classic_bike 2023-10-17 08:28:18 2023-10-17 08:50:03
## 10 91938B71748FA405 classic_bike 2023-10-17 19:17:38 2023-10-17 19:32:23
## # ℹ 4,233,595 more rows
## # ℹ 9 more variables: start_station_name <chr>, start_station_id <chr>,
## # end_station_name <chr>, end_station_id <chr>, start_lat <dbl>,
## # start_lng <dbl>, end_lat <dbl>, end_lng <dbl>, member_casual <chr>
na_counts_oneyear_trips_clean <- colSums(is.na(oneyear_trips_clean))
#View(oneyear_trips_clean)
print(na_counts_oneyear_trips_clean)
## ride_id rideable_type started_at ended_at
## 0 0 0 0
## start_station_name start_station_id end_station_name end_station_id
## 0 0 0 0
## start_lat start_lng end_lat end_lng
## 0 0 0 0
## member_casual
## 0
6.1.- oneyear_trips_clean tibble: 4,233,605 rows, 13 columns
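Comparing the row counts printed above, drop_na() removed 5,860,374 − 4,233,605 = 1,626,769 rows, roughly 28% of the trips. That number is smaller than the sum of the per-column NA counts because one row can have NA in several columns at once. The toy example below sketches the difference between counting NA values per column and counting rows with at least one NA; the data frame toy and its values are made up, and base R's complete.cases() is used here in place of drop_na().

```r
# Made-up data: row B misses the start station, row C the end station,
# and row D misses both
toy <- data.frame(
  ride_id            = c("A", "B", "C", "D"),
  start_station_name = c("S1", NA, "S3", NA),
  end_station_name   = c("E1", "E2", NA, NA),
  stringsAsFactors   = FALSE
)

# NA values counted column by column (row D is counted twice in total)
na_per_column <- colSums(is.na(toy))
print(na_per_column)

# Rows that contain at least one NA: these are the rows a drop would remove
rows_with_na <- sum(!complete.cases(toy))
print(rows_with_na)

# Keep only the rows without any NA
toy_clean <- toy[complete.cases(toy), ]
print(nrow(toy_clean))
```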
7.-
Syntax: mydata2 <- subset(mydata, select = -c(x,z) ) https://www.listendata.com/2015/06/r-keep-drop-columns-from-data-frame.html
Create a new tibble
Definition of: oneyear_trips_clean2 tibble
oneyear_trips_clean2 <- subset(oneyear_trips_clean, select = -c(start_lat, start_lng, end_lat, end_lng))
#View(oneyear_trips_clean2)
print(oneyear_trips_clean2)
## # A tibble: 4,233,605 × 9
## ride_id rideable_type started_at ended_at
## <chr> <chr> <dttm> <dttm>
## 1 4449097279F8BBE7 classic_bike 2023-10-08 10:36:26 2023-10-08 10:49:19
## 2 9CF060543CA7B439 electric_bike 2023-10-11 17:23:59 2023-10-11 17:36:08
## 3 667F21F4D6BDE69C electric_bike 2023-10-12 07:02:33 2023-10-12 07:06:53
## 4 F92714CC6B019B96 classic_bike 2023-10-24 19:13:03 2023-10-24 19:18:29
## 5 5E34BA5DE945A9CC classic_bike 2023-10-09 18:19:26 2023-10-09 18:30:56
## 6 F7D7420AFAC53CD9 electric_bike 2023-10-04 17:10:59 2023-10-04 17:25:21
## 7 870B2D4CD112D7B7 electric_bike 2023-10-31 17:32:20 2023-10-31 17:44:20
## 8 D9179D36E32D456C classic_bike 2023-10-02 18:51:51 2023-10-02 18:57:09
## 9 F8E131281F722FEF classic_bike 2023-10-17 08:28:18 2023-10-17 08:50:03
## 10 91938B71748FA405 classic_bike 2023-10-17 19:17:38 2023-10-17 19:32:23
## # ℹ 4,233,595 more rows
## # ℹ 5 more variables: start_station_name <chr>, start_station_id <chr>,
## # end_station_name <chr>, end_station_id <chr>, member_casual <chr>
7.1.- oneyear_trips_clean2 tibble: 4,233,605 rows, 9 columns
8.-
Syntax: sum(duplicated(df$col))
nrow(oneyear_trips_clean2)
## [1] 4233605
sum(duplicated(oneyear_trips_clean2$ride_id))
## [1] 137
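The count above shows 137 ride_id values that repeat an earlier one. The sketch below shows one way such rows could be inspected and, if they turn out to be true duplicates, removed. It is an illustration on made-up data, not a step of this analysis: the objects toy, dup_rows and toy_unique are hypothetical, and keeping the first occurrence of each ride_id is an assumption that would need checking against the real rows.

```r
# Made-up data with two repeated ride_id values ("A" and "B")
toy <- data.frame(
  ride_id       = c("A", "B", "A", "C", "B"),
  ride_duration = c(10, 20, 10, 30, 25),
  stringsAsFactors = FALSE
)

# Number of rows whose ride_id already appeared in an earlier row
n_dup <- sum(duplicated(toy$ride_id))
print(n_dup)

# All rows that share a repeated ride_id, for side-by-side inspection
repeated_ids <- toy$ride_id[duplicated(toy$ride_id)]
dup_rows <- toy[toy$ride_id %in% repeated_ids, ]
print(dup_rows)

# Keep only the first occurrence of each ride_id
toy_unique <- toy[!duplicated(toy$ride_id), ]
print(nrow(toy_unique))
```

In the toy data the two "B" rows have different durations, which is the kind of case where inspecting the rows before dropping them matters.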
9.-
Syntax: class(df$dttm)
End Cleaning
class(oneyear_trips_clean2$started_at)
## [1] "POSIXct" "POSIXt"
class(oneyear_trips_clean2$ended_at)
## [1] "POSIXct" "POSIXt"
10.-
New dataset generated from the cleaned dataset “oneyear_trips_clean2”, to which new columns will be added so that the data can be broken down by day of week, month, day of month, and year.
Create a new tibble
Definition of: oneyear_trips_addcol tibble
oneyear_trips_addcol <- oneyear_trips_clean2
#View(oneyear_trips_addcol)
print(oneyear_trips_addcol)
## # A tibble: 4,233,605 × 9
## ride_id rideable_type started_at ended_at
## <chr> <chr> <dttm> <dttm>
## 1 4449097279F8BBE7 classic_bike 2023-10-08 10:36:26 2023-10-08 10:49:19
## 2 9CF060543CA7B439 electric_bike 2023-10-11 17:23:59 2023-10-11 17:36:08
## 3 667F21F4D6BDE69C electric_bike 2023-10-12 07:02:33 2023-10-12 07:06:53
## 4 F92714CC6B019B96 classic_bike 2023-10-24 19:13:03 2023-10-24 19:18:29
## 5 5E34BA5DE945A9CC classic_bike 2023-10-09 18:19:26 2023-10-09 18:30:56
## 6 F7D7420AFAC53CD9 electric_bike 2023-10-04 17:10:59 2023-10-04 17:25:21
## 7 870B2D4CD112D7B7 electric_bike 2023-10-31 17:32:20 2023-10-31 17:44:20
## 8 D9179D36E32D456C classic_bike 2023-10-02 18:51:51 2023-10-02 18:57:09
## 9 F8E131281F722FEF classic_bike 2023-10-17 08:28:18 2023-10-17 08:50:03
## 10 91938B71748FA405 classic_bike 2023-10-17 19:17:38 2023-10-17 19:32:23
## # ℹ 4,233,595 more rows
## # ℹ 5 more variables: start_station_name <chr>, start_station_id <chr>,
## # end_station_name <chr>, end_station_id <chr>, member_casual <chr>
10.1.- oneyear_trips_addcol tibble: 4,233,605 rows, 9 columns
11.-
Convert the “started_at” column from the “datetime” (dttm) data type to the “Date” data type and store the result in a new column (“date”) in the dataset.
Syntax: df$date <- as.Date(df$started_at)
oneyear_trips_addcol$date <- as.Date(oneyear_trips_addcol$started_at)
#View(oneyear_trips_addcol)
print(oneyear_trips_addcol$date[1:9])
## [1] "2023-10-08" "2023-10-11" "2023-10-12" "2023-10-24" "2023-10-09"
## [6] "2023-10-04" "2023-10-31" "2023-10-02" "2023-10-17"
12.-
Extract from the column “date” the abbreviated name of the day of the week and create a new column “name_dayofweek” in the dataset:
oneyear_trips_addcol$name_dayofweek <- format(as.Date(oneyear_trips_addcol$date), "%a")
#View(oneyear_trips_addcol)
print(oneyear_trips_addcol$name_dayofweek[1:9])
## [1] "Sun" "Wed" "Thu" "Tue" "Mon" "Wed" "Tue" "Mon" "Tue"
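The abbreviations returned by format(..., "%a") depend on the locale of the R session, so on a non-English system they may not be "Mon", "Tue" and so on. As a hedged alternative, the sketch below derives the same English abbreviations from the weekday number, which does not depend on the locale. The vectors dates and day_names are hypothetical names for this illustration; the three dates are the first three values of the “date” column printed earlier.

```r
# First three dates from the "date" column shown earlier
dates <- as.Date(c("2023-10-08", "2023-10-11", "2023-10-12"))

# English abbreviations in the order used by as.POSIXlt()$wday (Sunday = 0)
day_names <- c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat")

# wday starts at 0 for Sunday, so add 1 to index the vector
name_dayofweek <- day_names[as.POSIXlt(dates)$wday + 1]
print(name_dayofweek)
```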
13.-
Extract from the column “date” the month and create a new column “month” (month and year in abbreviated form, for example “Oct_23”) in the dataset.
oneyear_trips_addcol$month <- format(as.Date(oneyear_trips_addcol$date), "%b_%y")
#View(oneyear_trips_addcol)
print(oneyear_trips_addcol$month[1:9])
## [1] "Oct_23" "Oct_23" "Oct_23" "Oct_23" "Oct_23" "Oct_23" "Oct_23" "Oct_23"
## [9] "Oct_23"
#print(oneyear_trips_addcol$month[4203000:4203006])
14.-
Extract from the column “date” the day of the month and create a new column “No._day” in the dataset.
oneyear_trips_addcol$No._day <- format(as.Date(oneyear_trips_addcol$date), "%d")
#View(oneyear_trips_addcol)
print(oneyear_trips_addcol$No._day[1:9])
## [1] "08" "11" "12" "24" "09" "04" "31" "02" "17"
15.-
Extract from the column “date”, the part corresponding to the “year” and create a new column “year” (with 4 digits) in the dataset.
oneyear_trips_addcol$year <- format(as.Date(oneyear_trips_addcol$date), "%Y")
#View(oneyear_trips_addcol)
print(oneyear_trips_addcol$year[1:9])
## [1] "2023" "2023" "2023" "2023" "2023" "2023" "2023" "2023" "2023"
16.-
Calculate the trip duration by subtracting the values in the “started_at” column from the values in the “ended_at” column with the “difftime” function, convert the result to the “numeric” (double) data type, and store it in a new column (“ride_duration”) in the dataset.
oneyear_trips_addcol$ride_duration <- as.double(difftime(oneyear_trips_addcol$ended_at, oneyear_trips_addcol$started_at))
#View(oneyear_trips_addcol)
print(oneyear_trips_addcol)
## # A tibble: 4,233,605 × 15
## ride_id rideable_type started_at ended_at
## <chr> <chr> <dttm> <dttm>
## 1 4449097279F8BBE7 classic_bike 2023-10-08 10:36:26 2023-10-08 10:49:19
## 2 9CF060543CA7B439 electric_bike 2023-10-11 17:23:59 2023-10-11 17:36:08
## 3 667F21F4D6BDE69C electric_bike 2023-10-12 07:02:33 2023-10-12 07:06:53
## 4 F92714CC6B019B96 classic_bike 2023-10-24 19:13:03 2023-10-24 19:18:29
## 5 5E34BA5DE945A9CC classic_bike 2023-10-09 18:19:26 2023-10-09 18:30:56
## 6 F7D7420AFAC53CD9 electric_bike 2023-10-04 17:10:59 2023-10-04 17:25:21
## 7 870B2D4CD112D7B7 electric_bike 2023-10-31 17:32:20 2023-10-31 17:44:20
## 8 D9179D36E32D456C classic_bike 2023-10-02 18:51:51 2023-10-02 18:57:09
## 9 F8E131281F722FEF classic_bike 2023-10-17 08:28:18 2023-10-17 08:50:03
## 10 91938B71748FA405 classic_bike 2023-10-17 19:17:38 2023-10-17 19:32:23
## # ℹ 4,233,595 more rows
## # ℹ 11 more variables: start_station_name <chr>, start_station_id <chr>,
## # end_station_name <chr>, end_station_id <chr>, member_casual <chr>,
## # date <date>, name_dayofweek <chr>, month <chr>, No._day <chr>, year <chr>,
## # ride_duration <dbl>
class(oneyear_trips_addcol$ride_duration)
## [1] "numeric"
nrow(subset(oneyear_trips_addcol,ride_duration < 0))
## [1] 103
16.1.- oneyear_trips_addcol tibble: 4,233,605 rows, 15 columns
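When the units argument of difftime() is left at its default, R chooses the units from the data, so the meaning of the numbers in “ride_duration” is not visible in the code. A minimal sketch that fixes the units explicitly, using the start and end times of the first two rides printed above; the variable names and the time zone "UTC" are assumptions for this illustration.

```r
# Start and end times of the first two rides shown in the tibble above
started_at <- as.POSIXct(c("2023-10-08 10:36:26", "2023-10-11 17:23:59"),
                         tz = "UTC")
ended_at   <- as.POSIXct(c("2023-10-08 10:49:19", "2023-10-11 17:36:08"),
                         tz = "UTC")

# Duration in seconds, with the units stated instead of chosen automatically
ride_duration <- as.double(difftime(ended_at, started_at, units = "secs"))
print(ride_duration)

# The same durations expressed in minutes
ride_duration_min <- ride_duration / 60
print(round(ride_duration_min, 2))
```

Stating the units also makes later summaries easier to read, since a value such as 603.5 is then known to be seconds rather than minutes.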
17.-
Delete rows from the oneyear_trips_addcol tibble that have negative values in the “ride_duration” column
oneyear_trips_addcolv2 <- subset(oneyear_trips_addcol, ride_duration >=0)
#View(oneyear_trips_addcolv2)
print(oneyear_trips_addcolv2)
## # A tibble: 4,233,502 × 15
## ride_id rideable_type started_at ended_at
## <chr> <chr> <dttm> <dttm>
## 1 4449097279F8BBE7 classic_bike 2023-10-08 10:36:26 2023-10-08 10:49:19
## 2 9CF060543CA7B439 electric_bike 2023-10-11 17:23:59 2023-10-11 17:36:08
## 3 667F21F4D6BDE69C electric_bike 2023-10-12 07:02:33 2023-10-12 07:06:53
## 4 F92714CC6B019B96 classic_bike 2023-10-24 19:13:03 2023-10-24 19:18:29
## 5 5E34BA5DE945A9CC classic_bike 2023-10-09 18:19:26 2023-10-09 18:30:56
## 6 F7D7420AFAC53CD9 electric_bike 2023-10-04 17:10:59 2023-10-04 17:25:21
## 7 870B2D4CD112D7B7 electric_bike 2023-10-31 17:32:20 2023-10-31 17:44:20
## 8 D9179D36E32D456C classic_bike 2023-10-02 18:51:51 2023-10-02 18:57:09
## 9 F8E131281F722FEF classic_bike 2023-10-17 08:28:18 2023-10-17 08:50:03
## 10 91938B71748FA405 classic_bike 2023-10-17 19:17:38 2023-10-17 19:32:23
## # ℹ 4,233,492 more rows
## # ℹ 11 more variables: start_station_name <chr>, start_station_id <chr>,
## # end_station_name <chr>, end_station_id <chr>, member_casual <chr>,
## # date <date>, name_dayofweek <chr>, month <chr>, No._day <chr>, year <chr>,
## # ride_duration <dbl>
nrow(subset(oneyear_trips_addcolv2,ride_duration < 0))
## [1] 0
17.1.- oneyear_trips_addcolv2 tibble: 4,233,502 rows, 15 columns
18.-
Syntax: unique(df$col)
unique(oneyear_trips_addcolv2$rideable_type)
## [1] "classic_bike" "electric_bike" "electric_scooter"
unique(oneyear_trips_addcolv2$member_casual)
## [1] "member" "casual"
19.-
sapply(oneyear_trips_addcolv2,class)
## $ride_id
## [1] "character"
##
## $rideable_type
## [1] "character"
##
## $started_at
## [1] "POSIXct" "POSIXt"
##
## $ended_at
## [1] "POSIXct" "POSIXt"
##
## $start_station_name
## [1] "character"
##
## $start_station_id
## [1] "character"
##
## $end_station_name
## [1] "character"
##
## $end_station_id
## [1] "character"
##
## $member_casual
## [1] "character"
##
## $date
## [1] "Date"
##
## $name_dayofweek
## [1] "character"
##
## $month
## [1] "character"
##
## $No._day
## [1] "character"
##
## $year
## [1] "character"
##
## $ride_duration
## [1] "numeric"
20.-
https://forum.posit.co/t/find-values-starting-with-a-specific-word/63438
filter(oneyear_trips_addcolv2,grepl("TEST",start_station_name))
## # A tibble: 0 × 15
## # ℹ 15 variables: ride_id <chr>, rideable_type <chr>, started_at <dttm>,
## # ended_at <dttm>, start_station_name <chr>, start_station_id <chr>,
## # end_station_name <chr>, end_station_id <chr>, member_casual <chr>,
## # date <date>, name_dayofweek <chr>, month <chr>, No._day <chr>, year <chr>,
## # ride_duration <dbl>
filter(oneyear_trips_addcolv2,grepl("test",start_station_name))
## # A tibble: 0 × 15
## # ℹ 15 variables: ride_id <chr>, rideable_type <chr>, started_at <dttm>,
## # ended_at <dttm>, start_station_name <chr>, start_station_id <chr>,
## # end_station_name <chr>, end_station_id <chr>, member_casual <chr>,
## # date <date>, name_dayofweek <chr>, month <chr>, No._day <chr>, year <chr>,
## # ride_duration <dbl>
filter(oneyear_trips_addcolv2,grepl("Test",start_station_name))
## # A tibble: 0 × 15
## # ℹ 15 variables: ride_id <chr>, rideable_type <chr>, started_at <dttm>,
## # ended_at <dttm>, start_station_name <chr>, start_station_id <chr>,
## # end_station_name <chr>, end_station_id <chr>, member_casual <chr>,
## # date <date>, name_dayofweek <chr>, month <chr>, No._day <chr>, year <chr>,
## # ride_duration <dbl>
filter(oneyear_trips_addcolv2,grepl("HQ",start_station_name))
## # A tibble: 0 × 15
## # ℹ 15 variables: ride_id <chr>, rideable_type <chr>, started_at <dttm>,
## # ended_at <dttm>, start_station_name <chr>, start_station_id <chr>,
## # end_station_name <chr>, end_station_id <chr>, member_casual <chr>,
## # date <date>, name_dayofweek <chr>, month <chr>, No._day <chr>, year <chr>,
## # ride_duration <dbl>
filter(oneyear_trips_addcolv2,grepl("HQ QR",start_station_name))
## # A tibble: 0 × 15
## # ℹ 15 variables: ride_id <chr>, rideable_type <chr>, started_at <dttm>,
## # ended_at <dttm>, start_station_name <chr>, start_station_id <chr>,
## # end_station_name <chr>, end_station_id <chr>, member_casual <chr>,
## # date <date>, name_dayofweek <chr>, month <chr>, No._day <chr>, year <chr>,
## # ride_duration <dbl>
nrow(oneyear_trips_addcolv2)
## [1] 4233502
20.1.- oneyear_trips_addcolv2 tibble: 4,233,502 rows, 15 columns
No rows have a start_station_name that matches any of the test-related patterns (each filter returns 0 rows × 15 columns), so the number of rows remains 4,233,502.
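The separate searches for "TEST", "test" and "Test" can be combined into one case-insensitive search with the ignore.case argument of grepl(). The sketch below runs on made-up station names; the vector start_station_name and its values are hypothetical and only illustrate the pattern matching.

```r
# Made-up station names; the last three are meant to look like test entries
start_station_name <- c("Clark St & Lake St",
                        "HQ QR",
                        "Hubbard Bike-checking (LBS-WH-TEST)",
                        "test station")

# One pattern for both keywords, matched regardless of upper or lower case
is_test <- grepl("test|hq", start_station_name, ignore.case = TRUE)
print(is_test)

# Number of matching names
print(sum(is_test))
```

A case-insensitive pattern is broader than the original searches, so any matches should be reviewed before rows are removed, because a legitimate station name could contain the same letters.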
Final definition of: “oneyear_trips_addcolv2”
Dataset with 4,233,502 rows and 15 columns. Of the variables used in the analysis, the name of the day of the week (name_dayofweek) and the month (month) are of type “character”, and the ride duration (ride_duration) is of type “numeric” (double).
21.-
summary(oneyear_trips_addcolv2)
## ride_id rideable_type started_at
## Length:4233502 Length:4233502 Min. :2023-10-01 00:00:05.00
## Class :character Class :character 1st Qu.:2024-02-17 14:50:08.50
## Mode :character Mode :character Median :2024-06-01 09:50:22.13
## Mean :2024-05-04 05:18:28.32
## 3rd Qu.:2024-08-02 09:45:49.52
## Max. :2024-09-30 23:52:58.17
## ended_at start_station_name start_station_id
## Min. :2023-10-01 00:02:02.00 Length:4233502 Length:4233502
## 1st Qu.:2024-02-17 15:02:46.75 Class :character Class :character
## Median :2024-06-01 10:04:56.27 Mode :character Mode :character
## Mean :2024-05-04 05:34:59.60
## 3rd Qu.:2024-08-02 10:02:49.55
## Max. :2024-09-30 23:59:52.55
## end_station_name end_station_id member_casual date
## Length:4233502 Length:4233502 Length:4233502 Min. :2023-10-01
## Class :character Class :character Class :character 1st Qu.:2024-02-17
## Mode :character Mode :character Mode :character Median :2024-06-01
## Mean :2024-05-03
## 3rd Qu.:2024-08-02
## Max. :2024-09-30
## name_dayofweek month No._day year
## Length:4233502 Length:4233502 Length:4233502 Length:4233502
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
## ride_duration
## Min. : 0.0
## 1st Qu.: 347.0
## Median : 603.5
## Mean : 991.3
## 3rd Qu.: 1083.4
## Max. :90562.0
22.-
class(oneyear_trips_addcolv2$ride_duration)
## [1] "numeric"
summary(oneyear_trips_addcolv2$ride_duration)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0 347.0 603.5 991.3 1083.4 90562.0
minimum_value <- oneyear_trips_addcolv2 %>%
slice_min(ride_duration) %>%
select(ride_duration)
minimum_value
## # A tibble: 316 × 1
## ride_duration
## <dbl>
## 1 0
## 2 0
## 3 0
## 4 0
## 5 0
## 6 0
## 7 0
## 8 0
## 9 0
## 10 0
## # ℹ 306 more rows
maximum_value <- oneyear_trips_addcolv2 %>%
slice_max(ride_duration) %>%
select(ride_duration)
maximum_value
## # A tibble: 1 × 1
## ride_duration
## <dbl>
## 1 90562
23.-
Definition of: oneyeartrips_ok tibble
oneyeartrips_ok <- oneyear_trips_addcolv2
#View(oneyeartrips_ok)
print(oneyeartrips_ok)
## # A tibble: 4,233,502 × 15
## ride_id rideable_type started_at ended_at
## <chr> <chr> <dttm> <dttm>
## 1 4449097279F8BBE7 classic_bike 2023-10-08 10:36:26 2023-10-08 10:49:19
## 2 9CF060543CA7B439 electric_bike 2023-10-11 17:23:59 2023-10-11 17:36:08
## 3 667F21F4D6BDE69C electric_bike 2023-10-12 07:02:33 2023-10-12 07:06:53
## 4 F92714CC6B019B96 classic_bike 2023-10-24 19:13:03 2023-10-24 19:18:29
## 5 5E34BA5DE945A9CC classic_bike 2023-10-09 18:19:26 2023-10-09 18:30:56
## 6 F7D7420AFAC53CD9 electric_bike 2023-10-04 17:10:59 2023-10-04 17:25:21
## 7 870B2D4CD112D7B7 electric_bike 2023-10-31 17:32:20 2023-10-31 17:44:20
## 8 D9179D36E32D456C classic_bike 2023-10-02 18:51:51 2023-10-02 18:57:09
## 9 F8E131281F722FEF classic_bike 2023-10-17 08:28:18 2023-10-17 08:50:03
## 10 91938B71748FA405 classic_bike 2023-10-17 19:17:38 2023-10-17 19:32:23
## # ℹ 4,233,492 more rows
## # ℹ 11 more variables: start_station_name <chr>, start_station_id <chr>,
## # end_station_name <chr>, end_station_id <chr>, member_casual <chr>,
## # date <date>, name_dayofweek <chr>, month <chr>, No._day <chr>, year <chr>,
## # ride_duration <dbl>
23.1.- Note that the columns name_dayofweek and month have the “character” (chr) data type.
23.2.- oneyeartrips_ok tibble: 4,233,502 rows, 15 columns
24.-
oneyeartrips_ok %>%
group_by(member_casual) %>%
summarise(min_ride_duration = min(ride_duration), max_ride_duration = max(ride_duration), median_ride_duration = median(ride_duration), mean_ride_duration = mean(ride_duration))
## # A tibble: 2 × 5
## member_casual min_ride_duration max_ride_duration median_ride_duration
## <chr> <dbl> <dbl> <dbl>
## 1 casual 0 90562 799.
## 2 member 0 89859 527
## # ℹ 1 more variable: mean_ride_duration <dbl>
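The printed tibble above hides the mean_ride_duration column because it does not fit the console width; with a tibble, print(..., width = Inf) shows every column. As a self-contained illustration of the same per-group statistics, the sketch below computes them with base R on made-up durations. The objects toy and stats and the values in them are hypothetical, not results from the trip data.

```r
# Made-up ride durations in seconds for the two rider types
toy <- data.frame(
  member_casual = c("casual", "casual", "member", "member", "member"),
  ride_duration = c(600, 1200, 300, 500, 700),
  stringsAsFactors = FALSE
)

# Split the durations by rider type and compute the four statistics per group
stats <- do.call(rbind, lapply(
  split(toy$ride_duration, toy$member_casual),
  function(x) {
    data.frame(min = min(x), max = max(x),
               median = median(x), mean = mean(x))
  }
))

# One row per rider type, all four columns visible
print(stats)
```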
25.-
In the following paragraph, the columns “name_dayofweek” and “month” of the tibble “oneyeartrips_ok” are written as “oneyeartrips_ok-col-name_dayofweek” and “oneyeartrips_ok-col-month”, because the dollar symbol causes difficulties in this text ($ = -col-).
Preparation of the columns “oneyeartrips_ok-col-name_dayofweek” and “oneyeartrips_ok-col-month” as ordered factors, so that the graphs of the Rides’ Number (“rides_number”) and the Average Rides’ Time (“avg_rides_duration_seg”) show the days of the week and the months in chronological order (Monday to Sunday, Oct_23 to Sep_24).
oneyeartrips_ok$name_dayofweek <- ordered(oneyeartrips_ok$name_dayofweek, levels= c('Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'))
levels(oneyeartrips_ok$name_dayofweek)
## [1] "Mon" "Tue" "Wed" "Thu" "Fri" "Sat" "Sun"
class(oneyeartrips_ok$name_dayofweek)
## [1] "ordered" "factor"
oneyeartrips_ok$month <- ordered(oneyeartrips_ok$month,levels=c('Oct_23','Nov_23','Dec_23','Jan_24','Feb_24','Mar_24','Apr_24','May_24','Jun_24','Jul_24','Aug_24','Sep_24'))
levels(oneyeartrips_ok$month)
## [1] "Oct_23" "Nov_23" "Dec_23" "Jan_24" "Feb_24" "Mar_24" "Apr_24" "May_24"
## [9] "Jun_24" "Jul_24" "Aug_24" "Sep_24"
class(oneyeartrips_ok$month)
## [1] "ordered" "factor"
#View(oneyeartrips_ok)
print(oneyeartrips_ok)
## # A tibble: 4,233,502 × 15
## ride_id rideable_type started_at ended_at
## <chr> <chr> <dttm> <dttm>
## 1 4449097279F8BBE7 classic_bike 2023-10-08 10:36:26 2023-10-08 10:49:19
## 2 9CF060543CA7B439 electric_bike 2023-10-11 17:23:59 2023-10-11 17:36:08
## 3 667F21F4D6BDE69C electric_bike 2023-10-12 07:02:33 2023-10-12 07:06:53
## 4 F92714CC6B019B96 classic_bike 2023-10-24 19:13:03 2023-10-24 19:18:29
## 5 5E34BA5DE945A9CC classic_bike 2023-10-09 18:19:26 2023-10-09 18:30:56
## 6 F7D7420AFAC53CD9 electric_bike 2023-10-04 17:10:59 2023-10-04 17:25:21
## 7 870B2D4CD112D7B7 electric_bike 2023-10-31 17:32:20 2023-10-31 17:44:20
## 8 D9179D36E32D456C classic_bike 2023-10-02 18:51:51 2023-10-02 18:57:09
## 9 F8E131281F722FEF classic_bike 2023-10-17 08:28:18 2023-10-17 08:50:03
## 10 91938B71748FA405 classic_bike 2023-10-17 19:17:38 2023-10-17 19:32:23
## # ℹ 4,233,492 more rows
## # ℹ 11 more variables: start_station_name <chr>, start_station_id <chr>,
## # end_station_name <chr>, end_station_id <chr>, member_casual <chr>,
## # date <date>, name_dayofweek <ord>, month <ord>, No._day <chr>, year <chr>,
## # ride_duration <dbl>
sapply(oneyeartrips_ok,class)
## $ride_id
## [1] "character"
##
## $rideable_type
## [1] "character"
##
## $started_at
## [1] "POSIXct" "POSIXt"
##
## $ended_at
## [1] "POSIXct" "POSIXt"
##
## $start_station_name
## [1] "character"
##
## $start_station_id
## [1] "character"
##
## $end_station_name
## [1] "character"
##
## $end_station_id
## [1] "character"
##
## $member_casual
## [1] "character"
##
## $date
## [1] "Date"
##
## $name_dayofweek
## [1] "ordered" "factor"
##
## $month
## [1] "ordered" "factor"
##
## $No._day
## [1] "character"
##
## $year
## [1] "character"
##
## $ride_duration
## [1] "numeric"
25.1.- oneyeartrips_ok tibble: 4,233,502 rows, 15 columns
Final Definition of: oneyeartrips_ok tibble
Tibble with 4,233,502 rows and 15 columns. The variables Name of the Days of Week (name_dayofweek) and Month (month) are stored as ordered factors (“ordered levels”), and Rides’ Duration (ride_duration) is stored as “numeric” (“double”).
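The reason for converting the two columns becomes visible when sorting: character values sort alphabetically, while an ordered factor sorts by the position of each value in its levels. A minimal sketch with an illustrative vector (the object names are hypothetical):

```r
# Illustrative vector; the abbreviations match those used in name_dayofweek
days_chr <- c("Sun", "Mon", "Fri", "Wed")

# Character values sort alphabetically
sorted_chr <- sort(days_chr)

# An ordered factor sorts by the position of each value in `levels`
days_ord <- ordered(days_chr,
                    levels = c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"))
sorted_ord <- as.character(sort(days_ord))

print(sorted_chr)
print(sorted_ord)
```

The same mechanism decides the order of the bars on a discrete ggplot2 axis, which is why the plots in the following points use the ordered columns.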
26.-
Deleted: this code was moved to the install.packages section at the beginning.
27.-
oneyeartrips_ok %>%
group_by(member_casual, name_dayofweek) %>%
summarise(rides_number = n(), average_duration_seg = mean(ride_duration)) %>%
arrange(member_casual, desc(rides_number))
## `summarise()` has grouped output by 'member_casual'. You can override using the
## `.groups` argument.
## # A tibble: 14 × 4
## # Groups: member_casual [2]
## member_casual name_dayofweek rides_number average_duration_seg
## <chr> <ord> <int> <dbl>
## 1 casual Sat 306363 1622.
## 2 casual Sun 264784 1658.
## 3 casual Fri 214951 1380.
## 4 casual Wed 188730 1288.
## 5 casual Thu 185003 1246.
## 6 casual Mon 182059 1388.
## 7 casual Tue 164597 1244.
## 8 member Wed 453234 728.
## 9 member Thu 428973 711.
## 10 member Tue 426248 715.
## 11 member Mon 407476 712.
## 12 member Fri 375985 721.
## 13 member Sat 332298 836.
## 14 member Sun 302801 838.
27.1.- Tibble with 14 rows and 4 columns
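The message “`summarise()` has grouped output by 'member_casual'” appears because the result is still grouped by the first variable. A minimal sketch, on invented data, of how the `.groups` argument mentioned in that message returns an ungrouped tibble (the name `toy_trips` is hypothetical):

```r
library(dplyr)

# Invented rides, only to make the sketch runnable
toy_trips <- tibble(
  member_casual  = c("casual", "casual", "member", "member", "member"),
  name_dayofweek = c("Sat", "Sat", "Mon", "Mon", "Tue"),
  ride_duration  = c(1200, 1800, 600, 900, 300)   # seconds
)

toy_by_day <- toy_trips %>%
  group_by(member_casual, name_dayofweek) %>%
  summarise(
    rides_number         = n(),
    average_duration_seg = mean(ride_duration),
    .groups = "drop"     # return an ungrouped tibble and suppress the message
  ) %>%
  arrange(member_casual, desc(rides_number))

print(toy_by_day)
```

Whether to drop the grouping is a design choice: the plots below work with either form, but an ungrouped result avoids surprises in later `filter()` or `summarise()` calls.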
A.-
CALCULATION AND GRAPHS OF DAY OF WEEK (MONTH) VS RIDES’ NUMBER, BY USER TYPE OR FACETING BY USER TYPE (MEMBER_CASUAL)
28.-
This section begins the graphical analysis. The “Y” axis shows the “Number of Trips” (“Rides’ Number”) made by the two types of users of this transport system, “member” and “casual”. The “X” axis shows the names of the 7 days of the week, either by User Type or parametrized by the 12 months of the year and faceted by User Type.
User Type = member_casual
Summary tibble of 3 columns needed for the graph in the following point, in which the “X” axis shows the days of the week (“name_dayofweek”) ordered from Monday to Sunday and the “Y” axis shows the Rides’ Number (“rides_number”), by User Type.
Tibble based on oneyeartrips_ok dataset
Create a new tibble
Definition of: summary1ridesnumber_days_ol tibble
summary1ridesnumber_days_ol <- oneyeartrips_ok %>%
group_by(member_casual, name_dayofweek) %>%
summarise(rides_number = n()) %>%
arrange(member_casual, name_dayofweek)
## `summarise()` has grouped output by 'member_casual'. You can override using the
## `.groups` argument.
print(summary1ridesnumber_days_ol)
## # A tibble: 14 × 3
## # Groups: member_casual [2]
## member_casual name_dayofweek rides_number
## <chr> <ord> <int>
## 1 casual Mon 182059
## 2 casual Tue 164597
## 3 casual Wed 188730
## 4 casual Thu 185003
## 5 casual Fri 214951
## 6 casual Sat 306363
## 7 casual Sun 264784
## 8 member Mon 407476
## 9 member Tue 426248
## 10 member Wed 453234
## 11 member Thu 428973
## 12 member Fri 375985
## 13 member Sat 332298
## 14 member Sun 302801
sapply(summary1ridesnumber_days_ol,class)
## $member_casual
## [1] "character"
##
## $name_dayofweek
## [1] "ordered" "factor"
##
## $rides_number
## [1] "integer"
28.1.- summary1ridesnumber_days_ol tibble with 14 rows and 3 columns. ol = ordered levels
29.-
These lines of code plot the Days of the Week (“name_dayofweek”) on the “X” axis, ordered from Monday to Sunday, and the Rides’ Number (“rides_number”) on the “Y” axis, for each User Type.
“Rides’ Number by each Day of Week by User Type” Plot
User Type = member_casual
https://ggplot2.tidyverse.org/reference/element.html
summary1ridesnumber_days_ol %>%
ggplot(aes(x = name_dayofweek, y = rides_number/1000, fill = member_casual)) +
geom_col(position = "dodge") +
theme(legend.position="bottom") +
theme(legend.title = element_text(colour= "black", size= 11, face="bold")) +
theme(legend.text = element_text(colour= "black", size= 10, face= "bold")) +
labs(title = 'Rides’ Number by each Day of Week by User Type') +
theme(plot.title = element_text(size = 11, face= "bold")) +
xlab('Name of Days of Week') +
ylab('Rides’ Number (in thousands)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Rides’ Number by each Day of Week by User Type.png")
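The same block of `theme()` calls is repeated in every plot of this report. One possible simplification, sketched here under the assumption that all plots share these settings, is to store them once in a theme object and add that object to each plot. The name `theme_bold_labels` and the toy data are hypothetical.

```r
library(ggplot2)

# Hypothetical helper collecting the bold text settings used in the plot above
theme_bold_labels <- theme(
  legend.position = "bottom",
  legend.title = element_text(colour = "black", size = 11, face = "bold"),
  legend.text  = element_text(colour = "black", size = 10, face = "bold"),
  plot.title   = element_text(size = 11, face = "bold"),
  axis.text.x  = element_text(angle = 30, face = "bold"),
  axis.text.y  = element_text(face = "bold"),
  axis.title.x = element_text(face = "bold"),
  axis.title.y = element_text(face = "bold")
)

# Invented values, only to make the sketch runnable
toy_summary <- data.frame(
  member_casual  = c("casual", "member", "casual", "member"),
  name_dayofweek = factor(c("Mon", "Mon", "Tue", "Tue"), levels = c("Mon", "Tue")),
  rides_number   = c(182000, 407000, 165000, 426000)
)

p_toy <- ggplot(toy_summary,
                aes(x = name_dayofweek, y = rides_number / 1000, fill = member_casual)) +
  geom_col(position = "dodge") +
  labs(title = "Rides’ Number by each Day of Week by User Type",
       x = "Name of Days of Week",
       y = "Rides’ Number (in thousands)") +
  theme_bold_labels   # one object instead of eight separate theme() calls

p_toy
```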
29.1.-
https://patchwork.data-imaginist.com/
library(patchwork)
p1 <- summary1ridesnumber_days_ol %>%
ggplot(aes(x = name_dayofweek, y = rides_number/1000, fill = member_casual)) +
geom_col(position = "dodge") +
theme(legend.position="bottom") +
theme(legend.title = element_text(colour= "black", size= 11, face="bold")) +
theme(legend.text = element_text(colour= "black", size= 10, face= "bold")) +
labs(title = 'Rides’ Number by each Day of Week by User Type') +
theme(plot.title = element_text(size = 11, face= "bold")) +
xlab('Name of Days of Week') +
ylab('Rides’ Number (in thousands)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
p1
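Storing the plot in `p1` makes it available for composition with patchwork, which combines ggplot objects with operators: `/` stacks plots vertically and `|` places them side by side. A minimal sketch with two invented plots (all object names are hypothetical):

```r
library(ggplot2)
library(patchwork)

# Invented values, only to make the sketch runnable
toy <- data.frame(day = c("Mon", "Tue"), rides = c(10, 12), minutes = c(14, 11))

p_rides   <- ggplot(toy, aes(day, rides)) + geom_col()
p_minutes <- ggplot(toy, aes(day, minutes)) + geom_col()

stacked      <- p_rides / p_minutes   # one plot above the other
side_by_side <- p_rides | p_minutes   # plots next to each other

stacked
```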
30.-
This summary tibble has 3 columns in the following order: member_casual, name_dayofweek, rides_number. It is needed for the graph in the following point, which sorts the days by descending Rides’ Number.
Tibble based on oneyear_trips_addcolv2 dataset
User Type = member_casual
Create a new tibble
Definition of: sumary2ridesnumber_days_char tibble
sumary2ridesnumber_days_char <- oneyear_trips_addcolv2 %>%
group_by(member_casual, name_dayofweek) %>%
summarise(rides_number = n()) %>%
arrange(member_casual, name_dayofweek)
## `summarise()` has grouped output by 'member_casual'. You can override using the
## `.groups` argument.
print(sumary2ridesnumber_days_char)
## # A tibble: 14 × 3
## # Groups: member_casual [2]
## member_casual name_dayofweek rides_number
## <chr> <chr> <int>
## 1 casual Fri 214951
## 2 casual Mon 182059
## 3 casual Sat 306363
## 4 casual Sun 264784
## 5 casual Thu 185003
## 6 casual Tue 164597
## 7 casual Wed 188730
## 8 member Fri 375985
## 9 member Mon 407476
## 10 member Sat 332298
## 11 member Sun 302801
## 12 member Thu 428973
## 13 member Tue 426248
## 14 member Wed 453234
sapply(sumary2ridesnumber_days_char,class)
## member_casual name_dayofweek rides_number
## "character" "character" "integer"
30.1.- sumary2ridesnumber_days_char tibble with 14 rows and 3 columns. char = character
31.-
These lines of code plot the values of “name_dayofweek” on the “X” axis in descending order of the Rides’ Number (“rides_number”) on the “Y” axis, faceting by User Type.
User Type = member_casual
The plot is presented with the axes rotated 90 degrees.
“Descending Rides’ Number by each Day of Week, Faceting by User Type” Plot
https://juliasilge.com/blog/reorder-within/
sumary2ridesnumber_days_char %>%
ggplot() +
geom_col(aes(reorder_within(name_dayofweek, rides_number, member_casual), rides_number/1000, fill = member_casual), position = "dodge") +
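facet_wrap(~member_casual, scales = "free_y") +
scale_x_reordered() + # strip the "___<user type>" suffix that reorder_within() appends to the labels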
theme(legend.position = "bottom") +
theme(legend.title = element_text(colour = "black", size =11, face ="bold")) +
theme(legend.text = element_text(colour = "black", size =10, face ="bold")) +
coord_flip() +
labs(title = 'Descending Rides’ Number by each Day, Faceting by User Type') +
theme(plot.title = element_text(size = 11,face="bold")) +
xlab('Name of Days of Week') +
ylab('Descending Rides’ Number (in thousands)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Descending Rides’ Number by each Day of Week.png")
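To sort the bars separately inside each facet, `reorder_within()` from tidytext builds a factor whose levels combine the category with the facet value, ordered by the sorting variable. Its companion `scale_x_reordered()` removes the appended facet value from the axis labels when it is added to the plot. A minimal sketch on invented values shows what the helper produces:

```r
library(tidytext)

# Invented values, only to make the sketch runnable
toy <- data.frame(
  member_casual  = c("casual", "casual", "member", "member"),
  name_dayofweek = c("Sat", "Mon", "Sat", "Mon"),
  rides_number   = c(300, 180, 330, 400)
)

reordered <- reorder_within(toy$name_dayofweek, toy$rides_number, toy$member_casual)

# Each level joins day and user type with "___"; levels are sorted by rides_number
print(levels(reordered))
```

Because “Sat___casual” and “Sat___member” are distinct levels, each facet can place Saturday at a different rank.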
31.1.-
#{r Plot2, width= 10, height= 15}
#p2 <- sumary2ridesnumber_days_char %>%
#ggplot() +
#geom_col(aes(reorder_within(name_dayofweek, rides_number, member_casual), rides_number/1000, fill = member_casual), position = "dodge") +
#facet_wrap(~member_casual, scales = "free_y") +
#theme(legend.position = "bottom") +
#theme(legend.title = element_text(colour = "black", size =11, face ="bold")) +
#theme(legend.text = element_text(colour = "black", size =10, face ="bold")) +
#coord_flip() +
#labs(title = 'Descending Rides’ Number by each Day, Faceting by User Type') +
#theme(plot.title = element_text(size = 11,face="bold")) +
#xlab('Name of Days of Week') +
#ylab('Descending Rides’ Number (in thousands)') +
#theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
#theme(axis.title.x = element_text(face = "bold")) +
#theme(axis.text.y = element_text(face = "bold")) +
#theme(axis.title.y = element_text(face = "bold"))
#p1/p2
32.-
This summary tibble has 3 columns in the following order: member_casual, month, rides_number. It is needed for the graph in the following point, where the “X” axis shows the 12 months (“month”) in order from “Oct_23” to “Sep_24” and the “Y” axis shows the corresponding Rides’ Number (“rides_number”).
Tibble based on oneyeartrips_ok dataset
User Type = member_casual
Create a new tibble
Definition of: sumary3ridesnumber_month_ol tibble
sumary3ridesnumber_month_ol <- oneyeartrips_ok %>%
group_by(member_casual, month) %>%
summarise(rides_number = n()) %>%
arrange(member_casual, month)
## `summarise()` has grouped output by 'member_casual'. You can override using the
## `.groups` argument.
print(sumary3ridesnumber_month_ol)
## # A tibble: 24 × 3
## # Groups: member_casual [2]
## member_casual month rides_number
## <chr> <ord> <int>
## 1 casual Oct_23 130297
## 2 casual Nov_23 72077
## 3 casual Dec_23 36686
## 4 casual Jan_24 17713
## 5 casual Feb_24 38170
## 6 casual Mar_24 62818
## 7 casual Apr_24 93943
## 8 casual May_24 167481
## 9 casual Jun_24 210679
## 10 casual Jul_24 231970
## # ℹ 14 more rows
sapply(sumary3ridesnumber_month_ol,class)
## $member_casual
## [1] "character"
##
## $month
## [1] "ordered" "factor"
##
## $rides_number
## [1] "integer"
32.1.- sumary3ridesnumber_month_ol. ol = ordered levels
33.-
“Rides’ Number by each Month by User Type” Plot, based on sumary3ridesnumber_month_ol
sumary3ridesnumber_month_ol %>%
ggplot(aes(x = month, y = rides_number/1000, fill = member_casual)) +
geom_col(position = "dodge") +
theme(legend.position="bottom") +
theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
labs(title = 'Rides’ Number by each Month by User Type') +
theme(plot.title = element_text(size = 11,face="bold")) +
xlab('Month') +
ylab('Rides’ Number by each Month (in thousands)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Rides’ Number by each Month by User Type.png")
34.-
This summary tibble has 3 columns in the following order: member_casual, month, rides_number. It is needed for the graph in the following point, which sorts the months by descending Rides’ Number.
Tibble based on oneyear_trips_addcolv2 dataset
User Type = member_casual
Create a new tibble
Definition of: summary4ridesnumber_month_char
summary4ridesnumber_month_char <- oneyear_trips_addcolv2 %>%
group_by(member_casual, month) %>%
summarise(rides_number = n()) %>%
arrange(member_casual, month)
## `summarise()` has grouped output by 'member_casual'. You can override using the
## `.groups` argument.
print(summary4ridesnumber_month_char)
## # A tibble: 24 × 3
## # Groups: member_casual [2]
## member_casual month rides_number
## <chr> <chr> <int>
## 1 casual Apr_24 93943
## 2 casual Aug_24 228518
## 3 casual Dec_23 36686
## 4 casual Feb_24 38170
## 5 casual Jan_24 17713
## 6 casual Jul_24 231970
## 7 casual Jun_24 210679
## 8 casual Mar_24 62818
## 9 casual May_24 167481
## 10 casual Nov_23 72077
## # ℹ 14 more rows
sapply(summary4ridesnumber_month_char, class)
## member_casual month rides_number
## "character" "character" "integer"
34.1.- summary4ridesnumber_month_char. char = character
35.-
These lines of code plot the values of “month” on the “X” axis in descending order of the Rides’ Number (“rides_number”) on the “Y” axis, faceting by User Type.
User Type = member_casual
The plot is presented with the axes rotated 90 degrees.
“Descending Rides’ Number by each Month, Faceting by User Type” Plot
https://juliasilge.com/blog/reorder-within/
https://ggplot2.tidyverse.org/reference/coord_flip.html
summary4ridesnumber_month_char %>%
ggplot() +
geom_col(aes(reorder_within(month, rides_number, member_casual), rides_number/1000, fill = member_casual), position = "dodge") +
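facet_wrap(~member_casual, scales = "free_y") +
scale_x_reordered() + # strip the "___<user type>" suffix that reorder_within() appends to the labels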
theme(legend.position = "bottom") +
theme(legend.title = element_text(colour = "black", size =11, face ="bold")) +
theme(legend.text = element_text(colour = "black", size =10, face ="bold")) +
coord_flip() +
labs(title = 'Descending Rides’ Number by each Month, Faceting by User Type') +
theme(plot.title = element_text(size = 11, face="bold")) +
xlab('Month') +
ylab('Descending Rides’ Number (in thousands)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Descending Rides’ Number by each Month, Faceting by User Type.png")
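The commented `ggsave()` calls in this report use the plot titles as file names, including typographic apostrophes and commas, and rely on the last plot displayed. A minimal sketch of a more explicit call, assuming an ASCII file name and fixed dimensions are acceptable (the file name, the size and the toy plot are hypothetical choices):

```r
library(ggplot2)

# Invented values, only to make the sketch runnable
toy <- data.frame(month = c("Oct_23", "Nov_23"), rides_number = c(130, 72))
p_toy <- ggplot(toy, aes(month, rides_number)) + geom_col()

# Hypothetical file name; tempdir() keeps the sketch from writing into the project folder
out_file <- file.path(tempdir(), "rides_number_by_month_by_user_type.png")

# Passing the plot explicitly avoids depending on which plot was drawn last
ggsave(out_file, plot = p_toy, width = 8, height = 5, dpi = 300)
```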
AB.-
CALCULATION AND GRAPHS OF DAY OF WEEK (MONTH) VS RIDES’ NUMBER, PARAMETRIZED VIA THE 12 MONTHS OF THE YEAR (7 DAYS OF THE WEEK), FACETING BY USER TYPE (MEMBER_CASUAL)
36.-
This step summarises the calculation and graphs of “name_dayofweek” vs the Rides’ Number (“rides_number”), parametrized via the 12 months of the year, faceting by User Type.
User Type = member_casual
Tibble based on oneyeartrips_ok dataset
Definition of: summary5columns4_ol = summary5_usertype-daysofweek-month-ridesnumber_ol
summary5columns4_ol <- oneyeartrips_ok %>%
group_by(member_casual, name_dayofweek, month) %>%
summarise(rides_number = n()) %>%
arrange(member_casual, name_dayofweek, month)
## `summarise()` has grouped output by 'member_casual', 'name_dayofweek'. You can
## override using the `.groups` argument.
print(summary5columns4_ol)
## # A tibble: 168 × 4
## # Groups: member_casual, name_dayofweek [14]
## member_casual name_dayofweek month rides_number
## <chr> <ord> <ord> <int>
## 1 casual Mon Oct_23 18015
## 2 casual Mon Nov_23 7961
## 3 casual Mon Dec_23 3537
## 4 casual Mon Jan_24 2862
## 5 casual Mon Feb_24 5160
## 6 casual Mon Mar_24 8024
## 7 casual Mon Apr_24 14304
## 8 casual Mon May_24 18031
## 9 casual Mon Jun_24 21493
## 10 casual Mon Jul_24 27425
## # ℹ 158 more rows
sapply(summary5columns4_ol,class)
## $member_casual
## [1] "character"
##
## $name_dayofweek
## [1] "ordered" "factor"
##
## $month
## [1] "ordered" "factor"
##
## $rides_number
## [1] "integer"
#View(summary5columns4_ol)
36.1.- summary5columns4_ol. ol = ordered levels
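The 168 rows match the expected number of combinations: 2 user types × 7 days × 12 months. A second consistency check is that the counts add up to the number of rides in the source tibble. A minimal sketch on invented data, using `count()` as a shorthand for `group_by()` plus `summarise(n())` (the name `toy_trips` is hypothetical):

```r
library(dplyr)

# Invented rides, only to make the sketch runnable
toy_trips <- tibble(
  member_casual  = c("casual", "casual", "member", "member", "member"),
  name_dayofweek = c("Mon", "Mon", "Mon", "Tue", "Tue"),
  month          = c("Oct_23", "Oct_23", "Oct_23", "Oct_23", "Nov_23")
)

toy_counts <- toy_trips %>%
  count(member_casual, name_dayofweek, month, name = "rides_number")

# Every ride falls into exactly one cell, so the counts must add up to the row count
stopifnot(sum(toy_counts$rides_number) == nrow(toy_trips))

print(toy_counts)
```

Applied to the real data, the same comparison would be between `sum(summary5columns4_ol$rides_number)` and `nrow(oneyeartrips_ok)`.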
37.-
“Rides’ Number by each Day of Week in each Month, Faceting by User Type” Plot
https://www.youtube.com/watch?v=h14MWrYZjL0&t=58s
summary5columns4_ol %>%
ggplot(aes(x = name_dayofweek, y = rides_number/1000, group = month, colour = month, shape = month )) +
geom_point() +
scale_shape_manual(values=seq(0,11)) +
scale_color_manual(values=c("#FF66FF","#FF0000","#0000CC","#00994C","#9933FF","#FF3399","#A0A0A0","#000000","#99FFFF","#66FF66","#FFFF00","#FFCC99")) +
geom_line(linewidth = 1.2 ) +
facet_grid(member_casual ~ .) +
theme(legend.position="bottom") +
theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
labs(title = 'Rides’ Number by each Day of Week in each Month, Faceting by User Type') +
theme(plot.title = element_text(size = 11, face="bold")) +
xlab('Name of Day of Week') +
ylab('Rides’ Number (in thousands)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("'Rides’ Number by each Day of Week in each Month, Faceting by User Type.png")
38.-
User Type = member_casual
“Rides’ Number by each Month in each Day of Week, Faceting by User Type” Plot
summary5columns4_ol %>%
ggplot(aes(x = month, y = rides_number/1000, group = name_dayofweek, colour = name_dayofweek, shape = name_dayofweek )) +
geom_point() +
scale_shape_manual(values=seq(0,6)) +
scale_color_manual(values=c("#FF66FF","#FF0000","#0000CC","#00994C","#9933FF","#000000","#99FFFF")) +
geom_line(linewidth = 1.2) +
facet_grid(member_casual ~ .) +
theme(legend.position="bottom") +
theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
labs(title = 'Rides’ Number by each Month in each Day of Week, Faceting by User Type') +
theme(plot.title = element_text(size = 11, face="bold")) +
xlab('Month') +
ylab('Rides’ Number (in thousands)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Rides’ Number by each Month in each Day of Week, Faceting by User Type.png")
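The hex colour lists and shape sequences are typed out again in each line plot. A minimal sketch, assuming the goal is to keep the same colour and shape for each day across plots, that names the values by factor level and reuses them (the object names and toy values are hypothetical; the seven hex codes are those of the plot above):

```r
library(ggplot2)

day_levels <- c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")

# Named vectors: each day keeps its colour and shape in every plot that uses them
day_colours <- setNames(
  c("#FF66FF", "#FF0000", "#0000CC", "#00994C", "#9933FF", "#000000", "#99FFFF"),
  day_levels
)
day_shapes <- setNames(0:6, day_levels)

# Invented values, only to make the sketch runnable
toy <- data.frame(
  month          = factor(rep(c("Oct_23", "Nov_23"), each = 7),
                          levels = c("Oct_23", "Nov_23")),
  name_dayofweek = factor(rep(day_levels, times = 2), levels = day_levels),
  rides_number   = c(18, 17, 19, 18, 21, 30, 26, 8, 7, 9, 8, 10, 14, 12)
)

p_toy <- ggplot(toy, aes(month, rides_number, group = name_dayofweek,
                         colour = name_dayofweek, shape = name_dayofweek)) +
  geom_point() +
  geom_line() +
  scale_colour_manual(values = day_colours) +
  scale_shape_manual(values = day_shapes)

p_toy
```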
B.-
CALCULATION AND GRAPHS OF DAY_OF WEEK (MONTH) VS AVERAGE RIDE_DURATION, PARAMETRIZED VIA THE 12 MONTHS OF THE YEAR(7 DAYS OF THE WEEK), BY USER TYPE OR FACETING BY USER TYPE (MEMBER_CASUAL)
39.-
This section begins the graphical analysis of the “Average Duration Time of Trips” (“Average Rides’ Time”). The “Y” axis shows the Average Rides’ Time of the two types of users of this transport system, “member” and “casual”. The “X” axis shows the names of the 7 days of the week or the names of the 12 months of the year, faceting by User Type.
This summary tibble has 4 columns in the following order: member_casual, month, name_dayofweek, avg_rides_duration_seg. It is needed for the graphs shown in the following points.
Tibble based on oneyeartrips_ok dataset
User Type = member_casual
Create a new tibble
Definition of: Summary6columns4_ol tibble
Summary6columns4_ol <- oneyeartrips_ok %>%
group_by(member_casual, month, name_dayofweek) %>%
summarise(avg_rides_duration_seg = mean(ride_duration)) %>%
arrange(member_casual, month, name_dayofweek)
## `summarise()` has grouped output by 'member_casual', 'month'. You can override
## using the `.groups` argument.
print(Summary6columns4_ol)
## # A tibble: 168 × 4
## # Groups: member_casual, month [24]
## member_casual month name_dayofweek avg_rides_duration_seg
## <chr> <ord> <ord> <dbl>
## 1 casual Oct_23 Mon 1225.
## 2 casual Oct_23 Tue 1164.
## 3 casual Oct_23 Wed 1170.
## 4 casual Oct_23 Thu 1040.
## 5 casual Oct_23 Fri 1164.
## 6 casual Oct_23 Sat 1366.
## 7 casual Oct_23 Sun 1603.
## 8 casual Nov_23 Mon 970.
## 9 casual Nov_23 Tue 861.
## 10 casual Nov_23 Wed 917.
## # ℹ 158 more rows
sapply(Summary6columns4_ol, class)
## $member_casual
## [1] "character"
##
## $month
## [1] "ordered" "factor"
##
## $name_dayofweek
## [1] "ordered" "factor"
##
## $avg_rides_duration_seg
## [1] "numeric"
#View(Summary6columns4_ol)
39.1.- Summary6columns4_ol. ol = ordered levels
40.-
These lines of code plot the Days of the Week (“name_dayofweek”) on the “X” axis, ordered from Monday to Sunday, and the Average Rides’ Time (“avg_rides_duration_seg”) on the “Y” axis, for each User Type.
User Type = member_casual
“Average Rides’ Time by each Day of Week by User Type” Plot
Summary6columns4_ol %>%
ggplot(aes(x = name_dayofweek, y = avg_rides_duration_seg /60, fill = member_casual)) +
geom_col(position = "dodge") +
theme(legend.position="bottom") +
theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
labs(title = 'Average Rides’ Time by each Day of Week by User Type') +
theme(plot.title = element_text(size = 11, face="bold")) +
xlab('Name of Days of Week') +
ylab('Average Rides’ Time (in min.)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Average Rides’ Time by each Day of Week by User Type.png")
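One point to keep in mind when reading this plot: Summary6columns4_ol holds one row per user type, month and day, so each day position receives 12 monthly averages for each user type. With `geom_col(position = "dodge")`, bars that share the same fill group appear to be drawn at the same position, so the visible height would be the largest monthly average rather than the average over the year. An average of monthly averages also differs from the overall average when months have different ride counts. The following sketch, on invented rides, compares the two calculations (all names and values are hypothetical):

```r
library(dplyr)

# Invented rides: a busy month with long rides and a quiet month with a short one
toy_trips <- tibble(
  member_casual  = "casual",
  name_dayofweek = "Sat",
  month          = c("Jul_24", "Jul_24", "Jul_24", "Jan_24"),
  ride_duration  = c(1800, 1800, 1800, 600)   # seconds
)

# Overall average per user type and day, computed from the rides themselves
overall_avg <- toy_trips %>%
  group_by(member_casual, name_dayofweek) %>%
  summarise(avg_rides_duration_seg = mean(ride_duration), .groups = "drop")

# Average of the monthly averages, which weights every month equally
mean_of_monthly <- toy_trips %>%
  group_by(member_casual, name_dayofweek, month) %>%
  summarise(avg_month = mean(ride_duration), .groups = "drop") %>%
  group_by(member_casual, name_dayofweek) %>%
  summarise(avg_of_avgs = mean(avg_month), .groups = "drop")

print(overall_avg)
print(mean_of_monthly)
```

The table in point 27 is an overall average of the first kind, grouped only by user type and day, so a tibble built that way is one possible input for a plot of the yearly average by day.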
41.-
These lines of code plot the values of “name_dayofweek” on the “X” axis in descending order of the Average Rides’ Time (“avg_rides_duration_seg”) on the “Y” axis, faceting by User Type.
The plot is presented with the axes rotated 90 degrees.
“Descending Average Rides’ Time by each Day of Week, Faceting by User Type” Plot
https://juliasilge.com/blog/reorder-within/
Summary6columns4_ol %>%
ggplot() +
geom_col(aes(reorder_within(name_dayofweek, avg_rides_duration_seg, member_casual), avg_rides_duration_seg /60, fill = member_casual), position = "dodge") +
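facet_wrap(~member_casual, scales = "free_y") +
scale_x_reordered() + # strip the "___<user type>" suffix that reorder_within() appends to the labels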
theme(legend.position = "bottom") +
theme(legend.title = element_text(colour = "black", size =11, face ="bold")) +
theme(legend.text = element_text(colour = "black", size =10, face ="bold")) +
coord_flip() +
labs(title = 'Descending Average Rides’ Time by each Day of Week, Faceting by User Type') +
theme(plot.title = element_text(size = 11, face="bold")) +
xlab('Name of Days of Week') +
ylab('Descending Average Rides’ Time (in min.)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Descending Average Rides’ Time by each Day of Week, Faceting by User Type.png")
42.-
These lines of code plot the variable “month” on the “X” axis, ordered from “Oct_23” to “Sep_24”, and the corresponding values of “avg_rides_duration_seg” on the “Y” axis, by User Type.
“Average Rides’ Time by each Month by User Type” Plot
Summary6columns4_ol %>%
ggplot(aes(x = month, y = avg_rides_duration_seg /60, fill = member_casual)) +
geom_col(position = "dodge") +
theme(legend.position="bottom") +
theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
labs(title = 'Average Rides’ Time by each Month by User Type') +
theme(plot.title = element_text(size = 11, face="bold")) +
xlab('Month') +
ylab('Average Rides’ Time (in min.)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Average Rides’ Time by each Month by User Type.png")
43.-
This summary tibble has 3 columns in the following order: User Type (member_casual), Month (month), Average Rides’ Duration (avg_rides_duration_seg). It is needed for the graph in the following point, which sorts the months by descending Average Rides’ Time.
Tibble based on “oneyear_trips_addcolv2” dataset.
Definition of: summary7cols3_char tibble = summary7_usertype-month-avg_rides_duration_seg_char
summary7cols3_char <- oneyear_trips_addcolv2 %>%
group_by(member_casual, month) %>%
summarise(avg_rides_duration_seg = mean(ride_duration)) %>%
arrange(member_casual, month, avg_rides_duration_seg)
## `summarise()` has grouped output by 'member_casual'. You can override using the
## `.groups` argument.
print(summary7cols3_char)
## # A tibble: 24 × 3
## # Groups: member_casual [2]
## member_casual month avg_rides_duration_seg
## <chr> <chr> <dbl>
## 1 casual Apr_24 1486.
## 2 casual Aug_24 1489.
## 3 casual Dec_23 992.
## 4 casual Feb_24 1189.
## 5 casual Jan_24 932.
## 6 casual Jul_24 1589.
## 7 casual Jun_24 1577.
## 8 casual Mar_24 1322.
## 9 casual May_24 1612.
## 10 casual Nov_23 1073.
## # ℹ 14 more rows
sapply(summary7cols3_char,class)
## member_casual month avg_rides_duration_seg
## "character" "character" "numeric"
43.1.- summary7cols3_char. char = character. Tibble with 24 rows and 3 columns
44.-
These lines of code plot the values of “month” on the “X” axis in descending order of the Average Rides’ Time (“avg_rides_duration_seg”) on the “Y” axis, faceting by User Type.
User Type = member_casual
The plot is presented with the axes rotated 90 degrees.
“Descending Average Rides’ Time by each Month, Faceting by User Type” Plot
https://juliasilge.com/blog/reorder-within/
summary7cols3_char %>%
ggplot() +
geom_col(aes(reorder_within(month, avg_rides_duration_seg, member_casual), avg_rides_duration_seg /60, fill = member_casual), position = "dodge") +
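facet_wrap(~member_casual, scales = "free_y") +
scale_x_reordered() + # strip the "___<user type>" suffix that reorder_within() appends to the labels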
theme(legend.position = "bottom") +
theme(legend.title = element_text(colour = "black", size =11, face ="bold")) +
theme(legend.text = element_text(colour = "black", size =10, face ="bold")) +
coord_flip() +
labs(title = 'Descending Average Rides’ Time by each Month, Faceting by User Type') +
theme(plot.title = element_text(size = 11, face="bold")) +
xlab('Month') +
ylab('Descending Average Rides’ Time (in min.)') +
theme(axis.text = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold")) +
theme(axis.title.x = element_text(face = "bold"))
#ggsave("Descending Average Rides’ Time by each Month, Faceting by User Type.png")
45.-
This step summarises the calculation and graphs of the variable “name_dayofweek” vs the Average Rides’ Time (“avg_rides_duration_seg”), parameterized via the 12 months of the year, faceting by User Type.
User Type = member_casual
“Avg. Rides’ Time by each Day of Week in each Month, Faceting by User Type” Plot
Summary6columns4_ol %>%
ggplot(aes(x = name_dayofweek, y = avg_rides_duration_seg/60, group = month, colour = month, shape = month)) +
geom_point() +
scale_shape_manual(values=seq(0,11)) +
scale_color_manual(values=c("#FF66FF","#FF0000","#0000CC","#00994C","#9933FF","#FF3399","#A0A0A0","#000000","#99FFFF","#66FF66","#FFFF00","#FFCC99")) +
geom_line(linewidth = 1.2) +
facet_grid(member_casual ~ .) +
theme(legend.position="bottom") +
theme(legend.title = element_text(colour= "black", size = 11, face="bold")) +
theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
labs(title = 'Avg. Rides’ Time by each Day of Week in each Month, Faceting by User Type') +
theme(plot.title = element_text(size = 11,face="bold")) +
xlab('Name of Day of Week') +
ylab('Average Rides’ Time (in min.)') +
theme(axis.text.x = element_text(angle = 30, face = "bold", size = 11 )) +
theme(axis.title.x = element_text(face = "bold", size = 11)) +
theme(axis.text.y = element_text(face = "bold", size = 11)) +
theme(axis.title.y = element_text(face = "bold", size = 11))
#ggsave("Avg. Rides’ Time by each Day of Week in each Month, Faceting by User Type.png")
46.-
This step summarises the calculation and graphs of the variable “month” vs the Average Rides’ Time (“avg_rides_duration_seg”), parameterized via the 7 days of the week, faceting by User Type.
User Type = member_casual
“Avg. Rides’ Time by each Month in each Day of Week, Faceting by User Type” Plot
Summary6columns4_ol %>%
ggplot(aes(x = month, y = avg_rides_duration_seg/60, group = name_dayofweek, colour = name_dayofweek, shape = name_dayofweek)) +
geom_point() +
scale_shape_manual(values=seq(0,6)) +
scale_color_manual(values=c("#FF66FF","#FF0000","#0000CC","#00994C","#9933FF","#000000","#99FFFF")) +
geom_line(linewidth = 1.2) +
facet_grid(member_casual ~ .) +
theme(legend.position="bottom") +
theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
labs(title = 'Avg. Rides’ Time by each Month in each Day of Week, Faceting by User Type') +
theme(plot.title = element_text(size = 11, face = "bold")) +
xlab('Month') +
ylab('Average Rides’ Time (in min.)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Avg. Rides’ Time by each Month in each Day of Week, Faceting by User Type.png")
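The plots in this document repeat the same block of theme() calls. As a possible simplification, the sketch below collects those settings into a single theme object that can be added to any plot. The name theme_bold_labels and the toy data frame are hypothetical and are not part of the original script; the toy values are the first member rows shown in point 47.

```r
library(ggplot2)

# Hypothetical helper (not in the original script): bundles the repeated
# bold legend/title/axis settings into one reusable theme object.
theme_bold_labels <- theme(
  legend.position = "bottom",
  legend.title = element_text(colour = "black", size = 11, face = "bold"),
  legend.text  = element_text(colour = "black", size = 10, face = "bold"),
  plot.title   = element_text(size = 11, face = "bold"),
  axis.text.x  = element_text(angle = 30, face = "bold"),
  axis.title.x = element_text(face = "bold"),
  axis.text.y  = element_text(face = "bold"),
  axis.title.y = element_text(face = "bold")
)

# Toy stand-in for Summary6columns4_ol (illustrative subset only).
toy <- data.frame(
  month = factor(c("Oct_23", "Nov_23", "Oct_23", "Nov_23"),
                 levels = c("Oct_23", "Nov_23")),
  name_dayofweek = c("Mon", "Mon", "Tue", "Tue"),
  avg_rides_duration_seg = c(659, 643, 694, 621)
)

# Same layers as the plots above, with the theme added in one step.
p <- ggplot(toy, aes(x = month, y = avg_rides_duration_seg / 60,
                     group = name_dayofweek, colour = name_dayofweek)) +
  geom_point() +
  geom_line(linewidth = 1.2) +
  labs(title = "Toy example", x = "Month", y = "Average ride time (min)") +
  theme_bold_labels
```

Adding theme_bold_labels at the end of a plot gives the same appearance as the individual theme() lines, so a change to the font size or face only has to be made in one place.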
47.-
Summary tibble for Expanded view of the “member” section, point 46
https://www.statology.org/dplyr-filter-multiple-conditions/
Expanded_view_member_section46 <- Summary6columns4_ol %>%
filter(member_casual == 'member')
print(Expanded_view_member_section46)
## # A tibble: 84 × 4
## # Groups: member_casual, month [12]
## member_casual month name_dayofweek avg_rides_duration_seg
## <chr> <ord> <ord> <dbl>
## 1 member Oct_23 Mon 659.
## 2 member Oct_23 Tue 694.
## 3 member Oct_23 Wed 691.
## 4 member Oct_23 Thu 675.
## 5 member Oct_23 Fri 678.
## 6 member Oct_23 Sat 738.
## 7 member Oct_23 Sun 785.
## 8 member Nov_23 Mon 643.
## 9 member Nov_23 Tue 621.
## 10 member Nov_23 Wed 643.
## # ℹ 74 more rows
#View(Expanded_view_member_section46)
48.-
Expanded view of the “member” section in graph (“Avg. Rides’ Time by each Month in each Day of Week, by User Type” Plot) at point 46
https://www.statology.org/dplyr-filter-multiple-conditions/
Expanded_view_member_section46 %>%
ggplot(aes(x = month, y = avg_rides_duration_seg/60, group = name_dayofweek, colour = name_dayofweek, shape = name_dayofweek)) +
geom_point() +
scale_shape_manual(values=seq(0,6)) +
scale_color_manual(values=c("#FF66FF","#FF0000","#0000CC","#00994C","#9933FF","#000000","#99FFFF")) +
geom_line(linewidth = 1.2) +
theme(legend.position="bottom") +
theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
labs(title = 'Expanded View Avg. Rides’ Time by each Month in each Day of Week, by Member User Type') +
theme(plot.title = element_text(size = 11, face = "bold")) +
xlab('Month') +
ylab('Average Rides’ Time (in min.)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Expanded View Avg. Rides’ Time by each Month in each Day of Week, by Member User Type.png")
49.-
Summary tibble for Expanded view of the “casual” section, point 46
https://www.statology.org/dplyr-filter-multiple-conditions/
Expanded_view_casual_section46 <- Summary6columns4_ol %>%
filter(member_casual == 'casual')
print(Expanded_view_casual_section46)
## # A tibble: 84 × 4
## # Groups: member_casual, month [12]
## member_casual month name_dayofweek avg_rides_duration_seg
## <chr> <ord> <ord> <dbl>
## 1 casual Oct_23 Mon 1225.
## 2 casual Oct_23 Tue 1164.
## 3 casual Oct_23 Wed 1170.
## 4 casual Oct_23 Thu 1040.
## 5 casual Oct_23 Fri 1164.
## 6 casual Oct_23 Sat 1366.
## 7 casual Oct_23 Sun 1603.
## 8 casual Nov_23 Mon 970.
## 9 casual Nov_23 Tue 861.
## 10 casual Nov_23 Wed 917.
## # ℹ 74 more rows
#View(Expanded_view_casual_section46)
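The summary tibbles store avg_rides_duration_seg in seconds, while the plots divide by 60 to show minutes. The sketch below makes that conversion explicit for the Oct_23 rows printed in points 47 and 49. The values are copied as displayed (rounded to the nearest second), so the minutes are approximate, and the column name avg_rides_duration_min is hypothetical.

```r
library(dplyr)

# Oct_23 rows copied from the member and casual printouts (seconds, as displayed).
oct_23 <- tibble(
  member_casual = rep(c("member", "casual"), each = 7),
  name_dayofweek = rep(c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"), times = 2),
  avg_rides_duration_seg = c(659, 694, 691, 675, 678, 738, 785,
                             1225, 1164, 1170, 1040, 1164, 1366, 1603)
)

# Convert seconds to minutes, rounded to one decimal place.
oct_23_min <- oct_23 %>%
  mutate(avg_rides_duration_min = round(avg_rides_duration_seg / 60, 1))

print(oct_23_min)
```

In this subset the member averages are about 11 to 13 minutes, and the casual averages are about 17 to 27 minutes.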
50.-
Expanded view of the “casual” section in graph (“Avg. Rides’ Time by each Month in each Day of Week, by User Type” Plot) at point 46
Expanded_view_casual_section46 %>%
ggplot(aes(x = month, y = avg_rides_duration_seg/60, group = name_dayofweek, colour = name_dayofweek, shape = name_dayofweek)) +
geom_point() +
scale_shape_manual(values=seq(0,6)) +
scale_color_manual(values=c("#FF66FF","#FF0000","#0000CC","#00994C","#9933FF","#000000","#99FFFF")) +
geom_line(linewidth = 1.2) +
theme(legend.position="bottom") +
theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
labs(title = 'Expanded View Avg. Rides’ Time by each Month in each Day of Week, by Casual User Type') +
theme(plot.title = element_text(size = 11, face = "bold")) +
xlab('Month') +
ylab('Average Rides’ Time (in min.)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Expanded View Avg. Rides’ Time by each Month in each Day of Week, by Casual User Type.png")
C.-
CALCULATION AND GRAPHS OF DAY OF WEEK (MONTH) VS AVERAGE RIDES’ TIME, PARAMETRIZED VIA THE 12 MONTHS OF THE YEAR (7 DAYS OF THE WEEK), BY RIDEABLE TYPE OR FACETING BY RIDEABLE TYPE
51.-
Create three tibbles: summary8columns6_ol, based on the “oneyeartrips_ok” dataset, and summary8columns5day_char and summary8columns5month_char, based on the “oneyear_trips_addcolv2” dataset
This section begins the graphical analysis in which the “Y” axis shows the values of “Rides’ Time” or “Average Ride_Duration” (“Avg. Rides’ Time”) and the “X” axis shows the names of the 12 months of the year, parametrized by the names of the 7 days of the week, for the three Rideable Types: “classic_bike”, “docked_bike” and “electric_bike” (by Rideable Type or faceting by Rideable Type)
These summary tibbles have 6 (or 5) columns arranged in the following order: User Type (member_casual), rideable_type, Month (month), Day of Week (name_dayofweek), rides_number, Average Rides’ Duration (avg_rides_duration_seg). They are needed to generate the graphs of Month (or Day of Week) vs Rides’ Number (or Avg. Rides’ Time), via the 12 Months of the year (or the 7 Days of the Week), by Rideable Type or faceting by Rideable Type, in the following points
Definition of summary8columns6_ol, summary8columns5day_char and summary8columns5month_char =
summary8_usertypecasual-rideable_type-month-name_dayofweek-rides_number-avg_rides_duration_seg_ol
summary8columns6_ol <- oneyeartrips_ok %>%
group_by(member_casual, rideable_type, month, name_dayofweek) %>%
summarise(rides_number = n(), avg_rides_duration_seg = mean(ride_duration)) %>%
arrange(member_casual,rideable_type, month, name_dayofweek)
## `summarise()` has grouped output by 'member_casual', 'rideable_type', 'month'.
## You can override using the `.groups` argument.
print(summary8columns6_ol)
## # A tibble: 352 × 6
## # Groups: member_casual, rideable_type, month [52]
## member_casual rideable_type month name_dayofweek rides_number
## <chr> <chr> <ord> <ord> <int>
## 1 casual classic_bike Oct_23 Mon 11017
## 2 casual classic_bike Oct_23 Tue 12187
## 3 casual classic_bike Oct_23 Wed 9902
## 4 casual classic_bike Oct_23 Thu 8300
## 5 casual classic_bike Oct_23 Fri 9143
## 6 casual classic_bike Oct_23 Sat 12586
## 7 casual classic_bike Oct_23 Sun 19550
## 8 casual classic_bike Nov_23 Mon 4469
## 9 casual classic_bike Nov_23 Tue 4133
## 10 casual classic_bike Nov_23 Wed 5254
## # ℹ 342 more rows
## # ℹ 1 more variable: avg_rides_duration_seg <dbl>
#View(summary8columns6_ol)
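The message “`summarise()` has grouped output by ...” appears because summarise() drops only the last grouping variable, so the result is still a grouped tibble. As the message itself suggests, the `.groups` argument controls this. The sketch below uses a small toy table (hypothetical values standing in for oneyeartrips_ok) to show `.groups = "drop"`, which returns an ungrouped tibble and silences the message.

```r
library(dplyr)

# Toy stand-in for oneyeartrips_ok (hypothetical values, durations in seconds).
toy_trips <- tibble(
  member_casual = c("casual", "casual", "member", "member"),
  rideable_type = c("classic_bike", "classic_bike", "electric_bike", "electric_bike"),
  ride_duration = c(600, 1200, 300, 900)
)

# Same pattern as the summaries above, plus .groups = "drop".
toy_summary <- toy_trips %>%
  group_by(member_casual, rideable_type) %>%
  summarise(rides_number = n(),
            avg_rides_duration_seg = mean(ride_duration),
            .groups = "drop") %>%
  arrange(member_casual, rideable_type)

print(toy_summary)
```

Whether to drop the groups is a design choice: the grouped output used in this document works for the plots, and an ungrouped tibble is simply easier to reuse in later filter() or arrange() steps.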
summary8columns5day_char <- oneyear_trips_addcolv2 %>%
#filter(member_casual == 'member') %>%
group_by(member_casual, rideable_type, name_dayofweek) %>%
summarise(rides_number = n(), avg_rides_duration_seg = mean(ride_duration)) %>%
arrange(member_casual,rideable_type, name_dayofweek)
## `summarise()` has grouped output by 'member_casual', 'rideable_type'. You can
## override using the `.groups` argument.
print(summary8columns5day_char)
## # A tibble: 42 × 5
## # Groups: member_casual, rideable_type [6]
## member_casual rideable_type name_dayofweek rides_number
## <chr> <chr> <chr> <int>
## 1 casual classic_bike Fri 133667
## 2 casual classic_bike Mon 112942
## 3 casual classic_bike Sat 210811
## 4 casual classic_bike Sun 181235
## 5 casual classic_bike Thu 112217
## 6 casual classic_bike Tue 100216
## 7 casual classic_bike Wed 115705
## 8 casual electric_bike Fri 77527
## 9 casual electric_bike Mon 65050
## 10 casual electric_bike Sat 92154
## # ℹ 32 more rows
## # ℹ 1 more variable: avg_rides_duration_seg <dbl>
#View(summary8columns5day_char)
summary8columns5month_char <- oneyear_trips_addcolv2 %>%
group_by(member_casual, rideable_type, month) %>%
summarise(rides_number = n(), avg_rides_duration_seg = mean(ride_duration)) %>%
arrange(member_casual,rideable_type, month)
## `summarise()` has grouped output by 'member_casual', 'rideable_type'. You can
## override using the `.groups` argument.
print(summary8columns5month_char)
## # A tibble: 52 × 5
## # Groups: member_casual, rideable_type [6]
## member_casual rideable_type month rides_number avg_rides_duration_seg
## <chr> <chr> <chr> <int> <dbl>
## 1 casual classic_bike Apr_24 57421 1847.
## 2 casual classic_bike Aug_24 148030 1762.
## 3 casual classic_bike Dec_23 20280 1296.
## 4 casual classic_bike Feb_24 27591 1368.
## 5 casual classic_bike Jan_24 10328 1204.
## 6 casual classic_bike Jul_24 159027 1840.
## 7 casual classic_bike Jun_24 143499 1837.
## 8 casual classic_bike Mar_24 39320 1631.
## 9 casual classic_bike May_24 115974 1879.
## 10 casual classic_bike Nov_23 42244 1347.
## # ℹ 42 more rows
#View(summary8columns5month_char)
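The “_char” tibbles are sorted alphabetically (Fri, Mon, Sat, ... and Apr_24, Aug_24, Dec_23, ...) because their day and month columns are character (<chr>), while summary8columns6_ol keeps calendar order because its columns are ordered factors (<ord>). The sketch below shows the difference for the casual / classic_bike rows printed above. The way the original script builds its ordered factors is not shown in this section, so the factor() call here is an assumption.

```r
library(dplyr)

# Calendar order for the days of the week, starting on Monday as in the printouts.
day_levels <- c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")

# casual / classic_bike rows copied from summary8columns5day_char (character column).
by_day_char <- tibble(
  name_dayofweek = c("Fri", "Mon", "Sat", "Sun", "Thu", "Tue", "Wed"),
  rides_number = c(133667L, 112942L, 210811L, 181235L, 112217L, 100216L, 115705L)
)

# Converting to an ordered factor makes arrange() follow the calendar order.
by_day_ord <- by_day_char %>%
  mutate(name_dayofweek = factor(name_dayofweek, levels = day_levels, ordered = TRUE)) %>%
  arrange(name_dayofweek)

print(by_day_ord)
```

The same idea applies to the month column, with the levels running from Oct_23 to Sep_24.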
Definition of: summary8columns6_ol and summary8columns5day_char – summary8columns5month_char
52.-
summary8columns5month_ol <- oneyeartrips_ok %>%
#filter(member_casual == 'casual') %>%
group_by(member_casual, rideable_type, month) %>%
summarise(rides_number = n(), avg_rides_duration_seg = mean(ride_duration)) %>%
arrange(member_casual,rideable_type, month)
## `summarise()` has grouped output by 'member_casual', 'rideable_type'. You can
## override using the `.groups` argument.
print(summary8columns5month_ol)
## # A tibble: 52 × 5
## # Groups: member_casual, rideable_type [6]
## member_casual rideable_type month rides_number avg_rides_duration_seg
## <chr> <chr> <ord> <int> <dbl>
## 1 casual classic_bike Oct_23 82685 1561.
## 2 casual classic_bike Nov_23 42244 1347.
## 3 casual classic_bike Dec_23 20280 1296.
## 4 casual classic_bike Jan_24 10328 1204.
## 5 casual classic_bike Feb_24 27591 1368.
## 6 casual classic_bike Mar_24 39320 1631.
## 7 casual classic_bike Apr_24 57421 1847.
## 8 casual classic_bike May_24 115974 1879.
## 9 casual classic_bike Jun_24 143499 1837.
## 10 casual classic_bike Jul_24 159027 1840.
## # ℹ 42 more rows
#View(summary8columns5month_ol)
53.-
summary8columns5day_ol <- oneyeartrips_ok %>%
group_by(member_casual, rideable_type, name_dayofweek) %>%
summarise(rides_number = n(), avg_rides_duration_seg = mean(ride_duration)) %>%
arrange(member_casual,rideable_type, name_dayofweek)
## `summarise()` has grouped output by 'member_casual', 'rideable_type'. You can
## override using the `.groups` argument.
print(summary8columns5day_ol)
## # A tibble: 42 × 5
## # Groups: member_casual, rideable_type [6]
## member_casual rideable_type name_dayofweek rides_number
## <chr> <chr> <ord> <int>
## 1 casual classic_bike Mon 112942
## 2 casual classic_bike Tue 100216
## 3 casual classic_bike Wed 115705
## 4 casual classic_bike Thu 112217
## 5 casual classic_bike Fri 133667
## 6 casual classic_bike Sat 210811
## 7 casual classic_bike Sun 181235
## 8 casual electric_bike Mon 65050
## 9 casual electric_bike Tue 61154
## 10 casual electric_bike Wed 69263
## # ℹ 32 more rows
## # ℹ 1 more variable: avg_rides_duration_seg <dbl>
#View(summary8columns5day_ol)
summary8columns5daycasual_ol <- oneyeartrips_ok %>%
filter(member_casual == 'casual') %>%
group_by(member_casual, rideable_type, name_dayofweek) %>%
summarise(rides_number = n(), avg_rides_duration_seg = mean(ride_duration)) %>%
arrange(member_casual,rideable_type, name_dayofweek)
## `summarise()` has grouped output by 'member_casual', 'rideable_type'. You can
## override using the `.groups` argument.
print(summary8columns5daycasual_ol)
## # A tibble: 21 × 5
## # Groups: member_casual, rideable_type [3]
## member_casual rideable_type name_dayofweek rides_number
## <chr> <chr> <ord> <int>
## 1 casual classic_bike Mon 112942
## 2 casual classic_bike Tue 100216
## 3 casual classic_bike Wed 115705
## 4 casual classic_bike Thu 112217
## 5 casual classic_bike Fri 133667
## 6 casual classic_bike Sat 210811
## 7 casual classic_bike Sun 181235
## 8 casual electric_bike Mon 65050
## 9 casual electric_bike Tue 61154
## 10 casual electric_bike Wed 69263
## # ℹ 11 more rows
## # ℹ 1 more variable: avg_rides_duration_seg <dbl>
#View(summary8columns5daycasual_ol)
summary8columns5daymember_ol <- oneyeartrips_ok %>%
filter(member_casual == 'member') %>%
group_by(member_casual, rideable_type, name_dayofweek) %>%
summarise(rides_number = n(), avg_rides_duration_seg = mean(ride_duration)) %>%
arrange(member_casual,rideable_type, name_dayofweek)
## `summarise()` has grouped output by 'member_casual', 'rideable_type'. You can
## override using the `.groups` argument.
print(summary8columns5daymember_ol)
## # A tibble: 21 × 5
## # Groups: member_casual, rideable_type [3]
## member_casual rideable_type name_dayofweek rides_number
## <chr> <chr> <ord> <int>
## 1 member classic_bike Mon 270694
## 2 member classic_bike Tue 281868
## 3 member classic_bike Wed 298813
## 4 member classic_bike Thu 282238
## 5 member classic_bike Fri 244981
## 6 member classic_bike Sat 227747
## 7 member classic_bike Sun 209724
## 8 member electric_bike Mon 133194
## 9 member electric_bike Tue 140919
## 10 member electric_bike Wed 150581
## # ℹ 11 more rows
## # ℹ 1 more variable: avg_rides_duration_seg <dbl>
#View(summary8columns5daymember_ol)
53.1.-
summary8columns5daymember_ol %>%
ggplot(aes(x = name_dayofweek, y = rides_number/1000, fill = rideable_type)) +
geom_col(position = "dodge") +
theme(legend.position="bottom") +
theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
labs(title = 'Rides’ Number, by each Day of Week by Rideable Type') +
theme(plot.title = element_text(size = 11, face="bold")) +
xlab('Day of Week') +
ylab('Rides’ Number (in thousands)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Rides’ Number, by each Day of Week by Rideable Type.png")
54.-
summary8columns5daymember_ol %>%
filter(member_casual == 'member') %>%
ggplot(aes(x = name_dayofweek, y = rides_number/1000, fill = rideable_type)) +
geom_col(position = "dodge") +
theme(legend.position="bottom") +
theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
labs(title = 'Rides’ Number, Only Member, by each Day of Week by Rideable Type') +
theme(plot.title = element_text(size = 11, face="bold")) +
xlab('Day of Week') +
ylab('Rides’ Number (in thousands)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Rides’ Number, Only Member, by each Day of Week by Rideable Type.png")
55.-
summary8columns5daycasual_ol %>%
filter(member_casual == 'casual') %>%
ggplot(aes(x = name_dayofweek, y = rides_number/1000, fill = rideable_type)) +
geom_col(position = "dodge") +
theme(legend.position="bottom") +
theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
labs(title = 'Rides’ Number, Only Casual, by each Day of Week by Rideable Type') +
theme(plot.title = element_text(size = 11, face="bold")) +
xlab('Day of Week') +
ylab('Rides’ Number (in thousands)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Rides’ Number,Only Casual, by each Day of Week by Rideable Type.png")
56.-
summary8columns6_ol %>%
filter(member_casual == 'member') %>%
ggplot(aes(x = name_dayofweek, y = rides_number/1000, group = month, colour = month, shape = month)) +
geom_point() +
scale_shape_manual(values=seq(0,11)) +
scale_color_manual(values=c("#FF66FF","#FF0000","#0000CC","#00994C","#9933FF","#FF3399","#A0A0A0","#000000","#99FFFF","#66FF66","#FFFF00","#FFCC99")) +
geom_line(linewidth = 1.2) +
facet_grid(rideable_type ~ .) +
theme(legend.position="bottom") +
theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
labs(title = 'Rides’ Number, Only Member, by each Day of Week in each Month, Faceting by Rideable Type') +
theme(plot.title = element_text(size = 11, face = "bold")) +
xlab('Day of Week') +
ylab('Rides’ Number (in thousands)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Rides’ Number,Only Member, by each Day of Week in each Month, Faceting by Rideable Type.png")
57.-
summary8columns6_ol %>%
filter(member_casual == 'casual') %>%
ggplot(aes(x = name_dayofweek, y = rides_number/1000, group = month, colour = month, shape = month)) +
geom_point() +
scale_shape_manual(values=seq(0,11)) +
scale_color_manual(values=c("#FF66FF","#FF0000","#0000CC","#00994C","#9933FF","#FF3399","#A0A0A0","#000000","#99FFFF","#66FF66","#FFFF00","#FFCC99")) +
geom_line(linewidth = 1.2) +
facet_grid(rideable_type ~ .) +
theme(legend.position="bottom") +
theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
labs(title = 'Rides’ Number, Only Casual, by each Day of Week in each Month, Faceting by Rideable Type') +
theme(plot.title = element_text(size = 11, face = "bold")) +
xlab('Day of Week') +
ylab('Rides’ Number (in thousands)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Rides’ Number, Only Casual, by each Day of Week in each Month, Faceting by Rideable Type.png")
58.-
summary8columns5day_char %>%
ggplot() +
geom_col(aes(reorder_within(name_dayofweek, rides_number, member_casual), rides_number/1000, fill = rideable_type), position = "dodge") +
facet_wrap(~rideable_type, scales = "free_y") +
theme(legend.position = "bottom") +
theme(legend.title = element_text(colour = "black", size =11, face ="bold")) +
theme(legend.text = element_text(colour = "black", size =10, face ="bold")) +
coord_flip() +
labs(title = 'Descending Rides’ Number by each Day of Week, Faceting by Rideable Type') +
theme(plot.title = element_text(size = 11,face="bold")) +
xlab('Name of Days of Week') +
ylab('Descending Rides’ Number (in thousands)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Descending Rides’ Number by each Day of Week, Faceting by Rideable Type.png")
59.-
summary8columns5day_char %>%
filter(member_casual == 'member') %>%
ggplot() +
geom_col(aes(reorder_within(name_dayofweek, rides_number, member_casual), rides_number/1000, fill = rideable_type), position = "dodge") +
facet_wrap(~rideable_type, scales = "free_y") +
theme(legend.position = "bottom") +
theme(legend.title = element_text(colour = "black", size =11, face ="bold")) +
theme(legend.text = element_text(colour = "black", size =10, face ="bold")) +
coord_flip() +
labs(title = 'Descending Rides’ Number, Only Member, by each Day of Week, Faceting by Rideable Type') +
theme(plot.title = element_text(size = 11,face="bold")) +
xlab('Name of Days of Week') +
ylab('Descending Rides’ Number (in thousands)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Descending Rides’ Number, Only Member, by each Day of Week, Faceting by Rideable Type.png")
60.-
summary8columns5day_char %>%
filter(member_casual == 'casual') %>%
ggplot() +
geom_col(aes(reorder_within(name_dayofweek, rides_number, member_casual), rides_number/1000, fill = rideable_type), position = "dodge") +
facet_wrap(~rideable_type, scales = "free_y") +
theme(legend.position = "bottom") +
theme(legend.title = element_text(colour = "black", size =11, face ="bold")) +
theme(legend.text = element_text(colour = "black", size =10, face ="bold")) +
coord_flip() +
labs(title = 'Descending Rides’ Number, Only Casual, by each Day of Week, Faceting by Rideable Type') +
theme(plot.title = element_text(size = 11,face="bold")) +
xlab('Name of Days of Week') +
ylab('Descending Rides’ Number (in thousands)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Descending Rides’ Number, Only Casual, by each Day of Week, Faceting by Rideable Type.png")
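The “Descending” plots above rely on reorder_within(), which, as far as this section shows, comes from the tidytext package. It orders a category separately inside each facet by appending the facet name to each label. The sketch below is a minimal version of that idea with rows copied from the casual printout of summary8columns5day_char. It differs from the code above in three ways, all of them assumptions for illustration: the category is mapped directly to the “Y” axis instead of using coord_flip(), the `within` argument is the faceting variable (rideable_type), and scale_y_reordered() is added to strip the appended suffix from the axis labels.

```r
library(dplyr)
library(ggplot2)
library(tidytext)

# Rows copied from the casual section of summary8columns5day_char.
toy_days <- tibble(
  rideable_type = rep(c("classic_bike", "electric_bike"), each = 3),
  name_dayofweek = rep(c("Fri", "Mon", "Sat"), times = 2),
  rides_number = c(133667, 112942, 210811, 77527, 65050, 92154)
)

# reorder_within() pastes day and bike type together ("Mon___classic_bike")
# and orders the resulting levels by rides_number.
reordered <- reorder_within(toy_days$name_dayofweek,
                            toy_days$rides_number,
                            toy_days$rideable_type)
print(levels(reordered))

# Horizontal bars, ordered within each facet; scale_y_reordered() removes
# the "___<rideable_type>" suffix from the axis labels.
p <- toy_days %>%
  ggplot(aes(x = rides_number / 1000,
             y = reorder_within(name_dayofweek, rides_number, rideable_type),
             fill = rideable_type)) +
  geom_col() +
  scale_y_reordered() +
  facet_wrap(~rideable_type, scales = "free_y") +
  labs(x = "Rides’ Number (in thousands)", y = "Day of Week")
```

Without scale_x_reordered() or scale_y_reordered(), the axis labels keep the appended suffix, which is worth checking in the rendered plots.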
61.-
summary8columns5month_ol %>%
filter(member_casual == 'member') %>%
ggplot(aes(x = month, y = rides_number/1000, fill = rideable_type)) +
geom_col(position = "dodge") +
theme(legend.position="bottom") +
theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
labs(title = 'Rides’ Number, Only Member, by each Month by Rideable Type') +
theme(plot.title = element_text(size = 11, face="bold")) +
xlab('Month') +
ylab('Rides’ Number (in thousands)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Rides’ Number,Only Member, by each Month by Rideable Type.png")
62.-
summary8columns5month_ol %>%
filter(member_casual == 'casual') %>%
ggplot(aes(x = month, y = rides_number/1000, fill = rideable_type)) +
geom_col(position = "dodge") +
theme(legend.position="bottom") +
theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
labs(title = 'Rides’ Number, Only Casual, by each Month by Rideable Type') +
theme(plot.title = element_text(size = 11, face="bold")) +
xlab('Month') +
ylab('Rides’ Number (in thousands)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Rides’ Number, Only Casual, by each Month by Rideable Type.png")
63.-
summary8columns5month_char %>%
filter(member_casual == 'member') %>%
ggplot() +
geom_col(aes(reorder_within(month, rides_number, rideable_type), rides_number/1000, fill = rideable_type), position = "dodge") +
facet_wrap(~rideable_type, scales = "free_y") +
theme(legend.position = "bottom") +
theme(legend.title = element_text(colour = "black", size =11, face ="bold")) +
theme(legend.text = element_text(colour = "black", size =10, face ="bold")) +
coord_flip() +
#scale_x_reordered() +
labs(title = 'Descending Rides’ Number, Only Member, by Month, Faceting by Rideable Type') +
theme(plot.title = element_text(size = 11,face="bold")) +
xlab('Month') +
ylab('Descending Rides’ Number (in thousands)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Descending Rides’ Number,Only Member, by each Month, Faceting By Rideable Type.png")
64.-
summary8columns5month_char %>%
filter(member_casual == 'casual') %>%
ggplot() +
geom_col(aes(reorder_within(month, rides_number, rideable_type), rides_number/1000, fill = rideable_type), position = "dodge") +
facet_wrap(~rideable_type, scales = "free_y") +
theme(legend.position = "bottom") +
theme(legend.title = element_text(colour = "black", size =11, face ="bold")) +
theme(legend.text = element_text(colour = "black", size =10, face ="bold")) +
coord_flip() +
labs(title = 'Descending Rides’ Number, Only Casual, by Month, Faceting by Rideable Type') +
theme(plot.title = element_text(size = 11,face="bold")) +
xlab('Month') +
ylab('Descending Rides’ Number (in thousands)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Descending Rides’ Number,Only Casual, by each Month, Faceting by Rideable Type.png")
65.-
“Rides’ Number by each Month parametrized by the 7 Days of Week, Only Member, Faceting by Rideable Type” Plot
summary8columns6_ol %>%
filter(member_casual == 'member') %>%
ggplot(aes(x = month, y = rides_number/1000, group = name_dayofweek, colour = name_dayofweek, shape = name_dayofweek)) +
geom_point() +
scale_shape_manual(values=seq(0,6)) +
scale_color_manual(values=c("#FF66FF","#FF0000","#0000CC","#00994C","#9933FF","#000000","#99FFFF")) +
geom_line(linewidth = 1.2) +
facet_grid(rideable_type ~ .) +
theme(legend.position="bottom") +
theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
labs(title = 'Rides’ Number, Only Member, by each Month in each Day of Week, Faceting by Rideable Type') +
theme(plot.title = element_text(size = 11, face = "bold")) +
xlab('Month') +
ylab('Rides’ Number (in thousands)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Rides’ Number, Only Member, by each Month in each Day of Week, Faceting by Rideable.png")
66.-
“Rides’ Number, Only Casual, by each Month in each Day of Week (parametrized by the 7 Days of Week), Faceting by Rideable Type” Plot
summary8columns6_ol %>%
filter(member_casual == 'casual') %>%
ggplot(aes(x = month, y = rides_number/1000, group = name_dayofweek, colour = name_dayofweek, shape = name_dayofweek)) +
geom_point() +
scale_shape_manual(values=seq(0,6)) +
scale_color_manual(values=c("#FF66FF","#FF0000","#0000CC","#00994C","#9933FF","#000000","#99FFFF")) +
geom_line(linewidth = 1.2) +
facet_grid(rideable_type ~ .) +
theme(legend.position="bottom") +
theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
labs(title = 'Rides’ Number, Only Casual, by each Month in each Day of Week, Faceting by Rideable Type') +
theme(plot.title = element_text(size = 11, face = "bold")) +
xlab('Month') +
ylab('Rides’ Number (in thousands)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Rides’ Number, Only Casual, by each Month in each Day of Week, Faceting by Rideable.png")
70.-
This block of code presents on the “X” axis the variable “month”, ordered from “Oct_23” to “Sep_24”, and on the “Y” axis the corresponding values of “avg_rides_duration_seg”, by Rideable Type
Average Rides’ Time by each Month by Rideable Type
summary8columns6_ol %>%
ggplot(aes(x = month, y = avg_rides_duration_seg /60, fill = rideable_type)) +
geom_col(position = "dodge") +
theme(legend.position="bottom") +
theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
labs(title = 'Average Rides’ Time by each Month by Rideable Type') +
theme(plot.title = element_text(size = 11, face="bold")) +
xlab('Month') +
ylab('Average Rides’ Time (in min.)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Average Rides’ Time by each Month by Rideable Type.png")
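The “X” axis runs from “Oct_23” to “Sep_24” only because “month” is an ordered factor; plain text labels would be sorted alphabetically. The sketch below shows one way such a factor could be built from ride dates in base R. The dates are hypothetical and the original script’s own construction of “month” is not shown in this section, so this is an illustration of the idea rather than the method actually used.

```r
# Hypothetical ride start dates (not taken from the dataset).
started <- as.Date(c("2024-03-15", "2023-10-02", "2024-09-30", "2023-12-25"))

# Build labels such as "Oct_23"; month.abb avoids locale-dependent month names.
make_label <- function(d) {
  paste0(month.abb[as.integer(format(d, "%m"))], "_", format(d, "%y"))
}
month_label <- make_label(started)

# Chronological levels for the 12 months from Oct_23 to Sep_24.
month_starts <- seq(as.Date("2023-10-01"), by = "month", length.out = 12)
month_levels <- make_label(month_starts)

# Ordered factor: sorting and plotting now follow the calendar.
month_ord <- factor(month_label, levels = month_levels, ordered = TRUE)

print(sort(month_label))  # alphabetical order of the text labels
print(sort(month_ord))    # chronological order of the ordered factor
```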
71.-
summary8columns6_ol %>%
filter(member_casual == 'member') %>%
ggplot(aes(x = month, y = avg_rides_duration_seg /60, fill = rideable_type)) +
geom_col(position = "dodge") +
theme(legend.position="bottom") +
theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
labs(title = 'Avg Rides’ Time, Only Member, by each Month, by Rideable Type') +
theme(plot.title = element_text(size = 11, face="bold")) +
xlab('Month') +
ylab('Average Rides’ Time (in min.)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Average Rides’ Time, Only Member, by each Month by Rideable Type.png")
72.-
summary8columns6_ol %>%
filter(member_casual == 'casual') %>%
ggplot(aes(x = month, y = avg_rides_duration_seg /60, fill = rideable_type)) +
geom_col(position = "dodge") +
theme(legend.position="bottom") +
theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
labs(title = 'Avg Rides’ Time, Only Casual, by each Month, by Rideable Type') +
theme(plot.title = element_text(size = 11, face="bold")) +
xlab('Month') +
ylab('Average Rides’ Time (in min.)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Average Rides’ Time, Only Casual, by each Month by Rideable Type.png")
73.-
summary8columns5month_char %>%
filter(member_casual == 'member') %>%
ggplot() +
geom_col(aes(reorder_within(month, avg_rides_duration_seg, rideable_type), avg_rides_duration_seg/60, fill = rideable_type), position = "dodge") +
facet_wrap(~rideable_type, scales = "free_y") +
theme(legend.position = "bottom") +
theme(legend.title = element_text(colour = "black", size =11, face ="bold")) +
theme(legend.text = element_text(colour = "black", size =10, face ="bold")) +
coord_flip() +
labs(title = 'Descending Avg Rides’ Time, Only Member, by Month, Faceting by Rideable Type') +
theme(plot.title = element_text(size = 11,face="bold")) +
xlab('Month') +
ylab('Descending Avg Rides’ Time (in min.)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Descending Avg Rides’ Time, Only Member, by each Month, Faceting by Rideable Type.png")
74.-
Descending Avg Rides’ Time, Only Casual, by each Month, Faceting by Rideable Type
summary8columns5month_char %>%
filter(member_casual == 'casual') %>%
ggplot() +
geom_col(aes(reorder_within(month, avg_rides_duration_seg, rideable_type), avg_rides_duration_seg/60, fill = rideable_type), position = "dodge") +
facet_wrap(~rideable_type, scales = "free_y") +
theme(legend.position = "bottom") +
theme(legend.title = element_text(colour = "black", size =11, face ="bold")) +
theme(legend.text = element_text(colour = "black", size =10, face ="bold")) +
coord_flip() +
labs(title = 'Descending Avg Rides’ Time, Only Casual, by Month, Faceting by Rideable Type') +
theme(plot.title = element_text(size = 11,face="bold")) +
xlab('Month') +
ylab('Descending Avg Rides’ Time (in min.)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Descending Avg Rides’ Time, Only Casual, by each Month, Faceting by Rideable Type.png")
75.-
“Avg. Rides’ Time, Only Member, by each Month, in each Day of Week (parametrized by 7 Days of Week), Faceting by Rideable Type” Plot
summary8columns6_ol %>%
filter(member_casual == 'member') %>%
ggplot(aes(x = month, y = avg_rides_duration_seg/60, group = name_dayofweek, colour = name_dayofweek, shape = name_dayofweek)) +
geom_point() +
scale_shape_manual(values=seq(0,6)) +
scale_color_manual(values=c("#FF66FF","#FF0000","#0000CC","#00994C","#9933FF","#000000","#99FFFF")) +
geom_line(linewidth = 1.2) +
facet_grid(rideable_type ~ .) +
theme(legend.position="bottom") +
theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
labs(title = 'Avg. Rides’ Time,Only Member, by each Month in each Day of Week, Faceting by Rideable Type') +
theme(plot.title = element_text(size = 11, face = "bold")) +
xlab('Month') +
ylab('Average Rides’ Time (in min.)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Avg. Rides’ Time,Only Member, by each Month in each Day of Week, Faceting by Rideable Type.png")
76.-
“Avg. Rides’ Time, Only Casual, by each Month in each Day of Week (parametrized by 7 Days of Week), Faceting by Rideable Type” Plot
summary8columns6_ol %>%
filter(member_casual == 'casual') %>%
ggplot(aes(x = month, y = avg_rides_duration_seg/60, group = name_dayofweek, colour = name_dayofweek, shape = name_dayofweek)) +
geom_point() +
scale_shape_manual(values=seq(0,6)) +
scale_color_manual(values=c("#FF66FF","#FF0000","#0000CC","#00994C","#9933FF","#000000","#99FFFF")) +
geom_line(linewidth = 1.2) +
facet_grid(rideable_type ~ .) +
theme(legend.position="bottom") +
theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
labs(title = 'Avg. Rides’ Time, Only Casual, by each Month in each Day of Week, Faceting by Rideable Type') +
theme(plot.title = element_text(size = 11, face = "bold")) +
xlab('Month') +
ylab('Average Rides’ Time (in min.)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Avg. Rides’ Time, Only Casual, by each Month in each Day of Week, Faceting by Rideable Type.png")
80.-
The following code places the Day of Week variable (“name_dayofweek”), ordered from “Monday” to “Sunday”, on the “X” axis and the corresponding “avg_rides_duration_seg” values (converted to minutes) on the “Y” axis, filled by Rideable Type.
Average Rides’ Time by each Day of Week by Rideable Type
summary8columns6_ol %>%
ggplot(aes(x = name_dayofweek, y = avg_rides_duration_seg /60, fill = rideable_type)) +
geom_col(position = "dodge") +
theme(legend.position="bottom") +
theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
labs(title = 'Average Rides’ Time by each Day of Week by Rideable Type') +
theme(plot.title = element_text(size = 11, face="bold")) +
xlab('Day of Week') +
ylab('Average Rides’ Time (in min.)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Average Rides’ Time by each Day of Week by Rideable Type.png")
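One way to obtain the Monday-to-Sunday ordering described at the start of section 80 is to store the day names as a factor with explicit levels. The sketch below is only an illustration: the `days` vector is hypothetical sample input, and it is an assumption (not confirmed by the code shown here) that `name_dayofweek` in `summary8columns6_ol` was ordered in a similar way.

```r
# Hypothetical sample of day names in arbitrary order (not taken from the case-study data).
days <- c("Sunday", "Wednesday", "Monday", "Saturday", "Friday", "Tuesday", "Thursday")

# Explicit level order: the week starts on Monday and ends on Sunday.
week_levels <- c("Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday", "Sunday")

# An ordered factor keeps this order when sorting and when plotting.
days_ordered <- factor(days, levels = week_levels, ordered = TRUE)

levels(days_ordered)  # level order used for the axis
sort(days_ordered)    # values sorted Monday first, Sunday last
```

ggplot2 draws a discrete axis in the order of the factor levels, so a column prepared this way appears from Monday to Sunday without any further reordering in the plotting code.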
81.-
summary8columns6_ol %>%
filter(member_casual == 'member') %>%
ggplot(aes(x = name_dayofweek, y = avg_rides_duration_seg /60, fill = rideable_type)) +
geom_col(position = "dodge") +
theme(legend.position="bottom") +
theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
labs(title = 'Avg Rides’ Time, Only Member, by each Day of Week, by Rideable Type') +
theme(plot.title = element_text(size = 11, face="bold")) +
xlab('Day of Week') +
ylab('Average Rides’ Time (in min.)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Average Rides’ Time, Only Member, by each Day of Week by Rideable Type.png")
82.-
summary8columns6_ol %>%
filter(member_casual == 'casual') %>%
ggplot(aes(x = name_dayofweek, y = avg_rides_duration_seg /60, fill = rideable_type)) +
geom_col(position = "dodge") +
theme(legend.position="bottom") +
theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
labs(title = 'Avg Rides’ Time, Only Casual, by each Day of Week, by Rideable Type') +
theme(plot.title = element_text(size = 11, face="bold")) +
xlab('Day of Week') +
ylab('Average Rides’ Time (in min.)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Average Rides’ Time, Only Casual, by each Day of Week by Rideable Type.png")
83.-
summary8columns5day_char %>%
filter(member_casual == 'member') %>%
ggplot() +
geom_col(aes(reorder_within(name_dayofweek, avg_rides_duration_seg, rideable_type), avg_rides_duration_seg/60, fill = rideable_type), position = "dodge") +
facet_wrap(~rideable_type, scales = "free_y") +
theme(legend.position = "bottom") +
theme(legend.title = element_text(colour = "black", size =11, face ="bold")) +
theme(legend.text = element_text(colour = "black", size =10, face ="bold")) +
coord_flip() +
labs(title = 'Descending Avg Rides’ Time, Only Member, by Day of Week, Faceting by Rideable Type') +
theme(plot.title = element_text(size = 11,face="bold")) +
xlab('Day of Week') +
ylab('Average Rides’ Time (in min.)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Descending Avg Rides’ Time, Only Member, by each Day of Week, Faceting by Rideable Type.png")
84.-
Descending Avg Rides’ Time, Only Casual, by each Day of Week, Faceting by Rideable Type
summary8columns5day_char %>%
filter(member_casual == 'casual') %>%
ggplot() +
geom_col(aes(reorder_within(name_dayofweek, avg_rides_duration_seg, rideable_type), avg_rides_duration_seg/60, fill = rideable_type), position = "dodge") +
facet_wrap(~rideable_type, scales = "free_y") +
theme(legend.position = "bottom") +
theme(legend.title = element_text(colour = "black", size =11, face ="bold")) +
theme(legend.text = element_text(colour = "black", size =10, face ="bold")) +
coord_flip() +
labs(title = 'Descending Avg Rides’ Time, Only Casual, by Day of Week, Faceting by Rideable Type') +
theme(plot.title = element_text(size = 11,face="bold")) +
xlab('Day of Week') +
ylab('Average Rides’ Time (in min.)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Descending Avg Rides’ Time, Only Casual, by each Day of Week, Faceting by Rideable Type.png")
85.-
“Avg. Rides’ Time, Only Member, by each Day of Week, in each Month (parametrized by 12 Months in the Year), Faceting by Rideable Type” Plot
summary8columns6_ol %>%
filter(member_casual == 'member') %>%
ggplot(aes(x = name_dayofweek, y = avg_rides_duration_seg/60, group = month, colour = month, shape = month)) +
geom_point() +
scale_shape_manual(values=seq(0,11)) +
scale_color_manual(values=c("#FF66FF","#FF0000","#0000CC","#00994C","#9933FF","#FF3399","#A0A0A0","#000000","#99FFFF","#66FF66","#FFFF00","#FFCC99")) +
geom_line(linewidth = 1.2) +
facet_grid(rideable_type ~ .) +
theme(legend.position="bottom") +
theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
labs(title = 'Avg. Rides’ Time,Only Member, by Day of Week in each Month of Year, Faceting by Rideable') +
theme(plot.title = element_text(size = 11, face = "bold")) +
xlab('Day of Week') +
ylab('Average Rides’ Time (in min.)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Avg. Rides’ Time,Only Member, by Day of Week in each Month of Year, Faceting by Rideable.png")
86.-
“Avg. Rides’ Time, Only Casual, by each Day of Week, in each Month (parametrized by 12 Months in the Year), Faceting by Rideable Type” Plot
summary8columns6_ol %>%
filter(member_casual == 'casual') %>%
ggplot(aes(x = name_dayofweek, y = avg_rides_duration_seg/60, group = month, colour = month, shape = month)) +
geom_point() +
scale_shape_manual(values=seq(0,11)) +
scale_color_manual(values=c("#FF66FF","#FF0000","#0000CC","#00994C","#9933FF","#FF3399","#A0A0A0","#000000","#99FFFF","#66FF66","#FFFF00","#FFCC99")) +
geom_line(linewidth = 1.2) +
facet_grid(rideable_type ~ .) +
theme(legend.position="bottom") +
theme(legend.title = element_text(colour= "black", size=11, face="bold")) +
theme(legend.text = element_text(colour= "black", size=10, face="bold")) +
labs(title = 'Avg. Rides’ Time,Only Casual, by Day of Week in each Month of Year, Faceting by Rideable') +
theme(plot.title = element_text(size = 11, face = "bold")) +
xlab('Day of Week') +
ylab('Average Rides’ Time (in min.)') +
theme(axis.text.x = element_text(angle = 30, face = "bold" )) +
theme(axis.title.x = element_text(face = "bold")) +
theme(axis.text.y = element_text(face = "bold")) +
theme(axis.title.y = element_text(face = "bold"))
#ggsave("Avg. Rides’ Time,Only Casual, by Day of Week in each Month of Year, Faceting by Rideable.png")
131-
knitr::include_graphics("/Users/user/Desktop/1CaseStudy1v1/1.-Rides' Number by Day of Week, Filling by User Type and Rides' Number by Day of Week in each Month , Faceting by User Type.png")
132-
knitr::include_graphics("/Users/user/Desktop/1CaseStudy1v1/2.-Rides' Number by Month, Filling by User Type and Rides' Number by Month in each Day of Week , Faceting by User Type.png")
133-
knitr::include_graphics("/Users/user/Desktop/1CaseStudy1v1/3.-Only Member-Rides' Number by Day of Week, Filling by Rideable Type and Rides' Number by Day of Week in each Month , Faceting by Rideable Type.png")
134-
knitr::include_graphics("/Users/user/Desktop/1CaseStudy1v1/4.-Only Casual-Rides' Number by Day of Week, Filling by Rideable Type and Rides' Number by Day of Week in each Month , Faceting by Rideable Type.png")
135-
knitr::include_graphics("/Users/user/Desktop/1CaseStudy1v1/5.-Only Member-Rides' Number by Month, Filling by Rideable Type and Rides' Number by Month in each Day of Week, Faceting by Rideable Type.png")
136-
knitr::include_graphics("/Users/user/Desktop/1CaseStudy1v1/6.-Only Casual-Rides' Number by Month, Filling by Rideable Type and Rides' Number by Month in each Day of Week, Faceting by Rideable Type.png")
141-
knitr::include_graphics("/Users/user/Desktop/1CaseStudy1v1/1.-Avg Rides' Time by Day of Week, Filling by User Type and Avg Rides' Time by Day of Week in each Month , Faceting by User Type.png")
142-
knitr::include_graphics("/Users/user/Desktop/1CaseStudy1v1/2.-Avg Rides' Time by Month, Filling by User Type and Rides' Number by Month in each Day of Week , Faceting by User Type.png")
143-
knitr::include_graphics("/Users/user/Desktop/1CaseStudy1v1/3.-Only Member-Avg Rides' Time by Day of Week, Filling by Rideable Type and Avg Rides' Time by Day of Week in each Month , Faceting by Rideable Type.png")
144-
knitr::include_graphics("/Users/user/Desktop/1CaseStudy1v1/4.-Only Casual-Avg Rides' Time by Day of Week, Filling by Rideable Type and Avg Rides' Time by Day of Week in each Month , Faceting by Rideable Type.png")
145-
knitr::include_graphics("/Users/user/Desktop/1CaseStudy1v1/5.-Only Member-Avg Rides' Time by Month, Filling by Rideable Type and Avg Rides' Time by Month in each Day of Week, Faceting by Rideable Type.png")
146-
knitr::include_graphics("/Users/user/Desktop/1CaseStudy1v1/6.-Only Casual-Avg Rides' Time by Month, Filling by Rideable Type and Avg Rides' Time by Month in each Day of Week, Faceting by Rideable Type.png")
150.-
dataA= matrix(c(35.58, 64.42), ncol=2, byrow=TRUE)
colnames(dataA) = c('casual','member')
finalA=as.table(dataA)
#print(finalA)
lblsA <- paste(colnames(dataA), "\n", finalA,"%", sep="")
#print(lblsA)
dataB= matrix(c(62.65, 37.35), ncol=2, byrow=TRUE)
colnames(dataB) = c('casual','member')
finalB=as.table(dataB)
#print(finalB)
lblsB <- paste(colnames(dataB), "\n", finalB,"%", sep="")
#print(lblsB)
par(mfrow=c(1,2), cex.main = 1.0, cex.axis= 0.2, mar = c(3, 7, 2, 1))
pie(finalA, labels = lblsA, col=rainbow(length(lblsA)), main= " \n Rides' Number in the Year\n by User Type,\n based on each Day of Week")
pie(finalB, labels = lblsB, col=rainbow(length(lblsB)), main= " \n Avg. Rides' Time in the Year\n by User Type,\n based on each Day of Week")
150.1.- Start of Data Analysis Results
The first graph shows that users in the “Casuals” class made roughly half as many rides as users in the “Members” class (35.58% versus 64.42% of all rides).
The second graph shows that the Average Rides’ Time of users in the “Casuals” class is roughly 1.7 times that of users in the “Members” class (62.65% versus 37.35%).
The procedure for converting “Casuals” users into “Members” users should begin with a promotion period that targets whichever of the two variables analyzed (Rides’ Number and/or Average Rides’ Time) is lower for the “Casuals” class than for the “Members” class. Once “Casuals” users match or exceed the “Members” results on that variable, they can be offered the “Members” benefits and be converted to that class in a natural way.
From these results, the first recommendation of this data analysis is to implement a strategy that increases the Rides’ Number of users in the “Casuals” class to at least twice its level in the year under study. The Average Rides’ Time of the “Casuals” class does not need to be promoted, because it is already well above the value for the “Members” class.
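The two comparisons above can be checked directly from the percentages used in the pie charts of section 150. This is a minimal sketch; the variable names are illustrative and are not part of the original analysis.

```r
# Shares (%) copied from dataA (Rides' Number) and dataB (Avg. Rides' Time).
rides_share <- c(casual = 35.58, member = 64.42)
time_share  <- c(casual = 62.65, member = 37.35)

# Casual/Member ratio for each variable.
ratio_rides <- rides_share[["casual"]] / rides_share[["member"]]
ratio_time  <- time_share[["casual"]] / time_share[["member"]]

# Roughly 0.55 for rides and 1.68 for average ride time.
round(c(rides = ratio_rides, avg_time = ratio_time), 2)
```

A ratio below 1 marks the variable on which casual riders trail members (Rides’ Number); a ratio above 1 marks the variable on which they already lead (Average Rides’ Time).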
151.-
# fig.width = 12.29, fig.height= 8.00
dataPct5 <- list(Day = c('Mon','Tue','Wed','Thu','Fri','Sat','Sun'), ratio= c(0.45, 0.39, 0.42, 0.43, 0.57, 0.92, 0.87))
finalPct5 <- as.data.frame(dataPct5)
gt_finalPct5 <- gt(finalPct5)
# values in table <= 0.57 , in bold
gt_finalPct5 <- gt_finalPct5 %>%
tab_style(style = cell_text(weight = "bold"),
locations = cells_body(columns = ratio,
rows = ratio <= 0.57))
colors <- c("#FFFFFF", "#F5FCC2","#FF0000", "#7FFF00", "#B3C732", "#A0522D", "#000000")
pie5 <- ggplot(data = finalPct5, aes(x="", y = ratio, fill = reorder(Day,-ratio ))) +
geom_col(color = "black") +
coord_polar("y", start = 0) +
geom_text(aes(x = 1.6, label = ratio), position = position_stack(vjust = 0.5), fontface = "bold") +
theme(panel.background = element_blank(),
axis.line = element_blank(),
axis.text = element_blank(),
axis.ticks = element_blank(),
axis.title = element_blank(),
plot.title = element_text(hjust = 0.5, size = 20, face="bold")) +
ggtitle(" Ratio Casual/Member in Rides' Number \n in each Day of Week, based on User Type") +
scale_fill_manual(values = colors)
pie5 + gt_finalPct5
151.1.- First Result of the Data Analysis
The graphic (pie chart plus table) presents the Casual/Member ratio of Rides’ Number for the two user types on each Day of the Week during the one-year period between October 2023 and September 2024.
The ratio values shown in bold in the table indicate that, from Monday to Friday, users in the “Casuals” class make on average about 0.45 times as many rides as users in the “Members” class (roughly half). On weekends the ratio rises to about 0.9 (0.92 on Saturday and 0.87 on Sunday), which indicates that both user types make nearly the same Rides’ Number.
These results are practically repeated when the same analysis is made by Rideable_Type, which is an important indication to take into account when designing the strategy to convert casual riders into annual members.
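The weekday and weekend averages quoted above follow from the ratio column of the table in section 151. The sketch below recomputes them; the names `work_days` and `weekend_days` are illustrative.

```r
# Casual/Member ratio of Rides' Number per day, copied from dataPct5.
ratio <- c(Mon = 0.45, Tue = 0.39, Wed = 0.42, Thu = 0.43,
           Fri = 0.57, Sat = 0.92, Sun = 0.87)

work_days    <- c("Mon", "Tue", "Wed", "Thu", "Fri")
weekend_days <- c("Sat", "Sun")

# Mean ratio for each part of the week.
weekday_mean <- mean(ratio[work_days])     # about 0.45
weekend_mean <- mean(ratio[weekend_days])  # about 0.90

round(c(weekday = weekday_mean, weekend = weekend_mean), 2)
```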
152.-
dataC= matrix(c(65.73, 1.13, 33.14), ncol=3, byrow=TRUE)
colnames(dataC) = c('classic_bike','electric_scooter','electric_bike')
finalC=as.table(dataC)
#print(finalC)
lblsC <- paste(colnames(dataC), "\n", finalC,"%", sep="")
#print(lblsC)
dataD= matrix(c(59.82, 3.10, 37.08), ncol=3, byrow=TRUE)
colnames(dataD) = c('classic_bike','electric_scooter','electric_bike')
finalD=as.table(dataD)
#print(finalD)
lblsD <- paste(colnames(dataD), "\n", finalD,"%", sep="")
#print(lblsD)
par(mfrow=c(1,2), cex.main = 1.0, cex.axis= 0.2, mar = c(3, 7, 2, 1))
pie(finalC, labels = lblsC, col=rainbow(length(lblsC)), main= " \n Rides' Number in the Year\n by Rideable Type,\n based on each Day of Week")
pie(finalD, labels = lblsD, col=rainbow(length(lblsD)), main= " \n Avg. Rides' Time in the Year\n by Rideable Type,\n based on each Day of Week")
152.1.- Second Result of the Data Analysis
The two pie charts present the Rides’ Number and the Average Rides’ Time for the three Rideable Types (classic_bike, electric_scooter, electric_bike) during the one-year period between October 2023 and September 2024, with the two user types (member_casual) taken together.
In both charts classic_bike dominates electric_bike: about twice as much in Rides’ Number (65.73% versus 33.14%) and about 1.6 times as much in Average Rides’ Time (59.82% versus 37.08%). Use of electric_scooter is marginal compared with classic_bike and electric_bike (1.13% and 3.10%, respectively).
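The classic_bike versus electric_bike comparison can be reproduced from the percentages passed to the two pie charts in section 152. This is a small sketch with illustrative variable names.

```r
# Shares (%) by Rideable Type, copied from dataC (Rides' Number) and dataD (Avg. Rides' Time).
rides_share <- c(classic_bike = 65.73, electric_scooter = 1.13, electric_bike = 33.14)
time_share  <- c(classic_bike = 59.82, electric_scooter = 3.10, electric_bike = 37.08)

# classic_bike / electric_bike ratio for each variable.
ratio_rides <- rides_share[["classic_bike"]] / rides_share[["electric_bike"]]
ratio_time  <- time_share[["classic_bike"]] / time_share[["electric_bike"]]

# About 1.98 for Rides' Number and 1.61 for Average Rides' Time.
round(c(rides = ratio_rides, avg_time = ratio_time), 2)
```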
153.-
# fig.width = 12.29, fig.height= 8.00
dataPct1 <- list(Day=c('Mon-Classic_Bike','Tue-Classic_Bike','Wed-Classic_Bike','Thu-Classic_Bike','Fri-Classic_Bike','Sat-Classic_Bike','Sun-Classic_Bike','Week-ES','Mon-Electric_Bike','Tue-Electric_Bike','Wed-Electric_Bike','Thu-Electric_Bike','Fri-Electric_Bike','Sat-Electric_Bike','Sun-Electric_Bike'), percentage=c(6.39, 6.66, 7.06, 6.67, 5.79, 5.38, 4.95, 0.52, 3.15, 3.33, 3.56, 3.38, 3.02, 2.42, 2.15))
finalPct1 <- as.data.frame(dataPct1)
gt_finalPct1 <- gt(finalPct1)
# values in table >= 4.95 , in bold
gt_finalPct1 <- gt_finalPct1 %>%
tab_style(style = cell_text(weight = "bold"),
locations = cells_body(columns = percentage,
rows = percentage >= 4.95))
colors <- c("#FFFFFF", "#F5FCC2","#FF0000", "#7FFF00", "#B3C732", "#A0522D", "#000000", "#FFFF00", "#EE7600", "#00994C", "#9933FF", "#FF3399", "#A0A0A0", "#718200", "#99FFFF")
pie1 <- ggplot(data = finalPct1, aes(x="", y = percentage, fill = reorder(Day,-percentage ))) +
geom_col(color = "black") +
coord_polar("y", start = 0) +
geom_text(aes(x = 1.6, label = paste0(percentage,"\n", "%")), position = position_stack(vjust = 0.5), fontface = "bold") +
theme(panel.background = element_blank(),
axis.line = element_blank(),
axis.text = element_blank(),
axis.ticks = element_blank(),
axis.title = element_blank(),
plot.title = element_text(hjust = 0.5, size = 20, face="bold")) +
ggtitle("Rides' Number:% Used by Only Member\n in each Day of Week in the Year,\n based on Rideable Type") +
scale_fill_manual(values = colors)
pie1 + gt_finalPct1
153.1.- Third Result of the Data Analysis
Nomenclature: Week-ES = summary value (%) of electric_scooter over the full week.
The pie chart and the table that generated it present, for users in the “Members” class, the Rides’ Number of the three Rideable Types (classic_bike, electric_scooter, electric_bike) on each Day of the Week, expressed as a percentage of the total Rides’ Number made with the three Rideable Types by both user types during the one-year period between October 2023 and September 2024.
The pie chart and the table to its right show that, on each Day of the Week during the year under study, users in the “Members” class made approximately twice as many rides on classic bikes as on electric bikes.
The summary value (%) of electric_scooter over the full week (Week-ES) is negligible compared with the other values in the table.
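Adding up the daily member percentages from the table in section 153 gives the yearly share of each Rideable Type and the overall classic/electric ratio. The sketch below uses illustrative vector names; the values are copied from dataPct1.

```r
# Member shares (%) of total rides per day, copied from dataPct1.
classic  <- c(Mon = 6.39, Tue = 6.66, Wed = 7.06, Thu = 6.67,
              Fri = 5.79, Sat = 5.38, Sun = 4.95)
electric <- c(Mon = 3.15, Tue = 3.33, Wed = 3.56, Thu = 3.38,
              Fri = 3.02, Sat = 2.42, Sun = 2.15)

# Yearly member share per Rideable Type (sum over the seven days).
totals <- c(classic_bike = sum(classic), electric_bike = sum(electric))
round(totals, 2)

# Overall classic/electric ratio for members, and the same ratio per day.
overall_ratio <- totals[["classic_bike"]] / totals[["electric_bike"]]
round(overall_ratio, 2)
round(classic / electric, 2)
```

The electric_bike total computed from the rounded daily values (21.01) differs by 0.01 from the 21.00 listed for electric_bike-M in section 155, which is consistent with rounding in the published table.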
154.-
#fig.width = 12.29, fig.height= 8.00
dataPct2 <- list(Day=c('Mon-C_Bike','Tue-C_Bike','Wed-C_Bike','Thu-C_Bike','Fri-C_Bike','Sat-C_Bike','Sun-C_Bike','Week-ES','Mon-E_Bike','Tue-E_Bike','Wed-E_Bike','Thu-E_Bike','Fri-E_Bike','Sat-E_Bike','Sun-E_Bike'), percentage=c(2.67, 2.37, 2.73, 2.65, 3.16, 4.98, 4.28, 0.61, 1.54, 1.44, 1.64, 1.63, 1.83, 2.18, 1.89))
finalPct2 <- as.data.frame(dataPct2)
gt_finalPct2 <- gt(finalPct2)
# values in table >= 2.37 , in bold
gt_finalPct2 <- gt_finalPct2 %>%
tab_style(style = cell_text(weight = "bold"),
locations = cells_body(columns = percentage,
rows = percentage >= 2.37))
colors <- c("#FFFFFF", "#F5FCC2","#FF0000", "#7FFF00", "#B3C732", "#A0522D", "#000000", "#FFFF00", "#EE7600", "#00994C", "#9933FF", "#FF3399", "#A0A0A0", "#718200", "#99FFFF")
pie2 <- ggplot(data = finalPct2, aes(x="", y = percentage, fill = reorder(Day,-percentage ))) +
geom_col(color = "black") +
coord_polar("y", start = 0) +
geom_text(aes(x = 1.6, label = paste0(percentage,"\n", "%")), position = position_stack(vjust = 0.5), fontface = "bold") +
theme(panel.background = element_blank(),
axis.line = element_blank(),
axis.text = element_blank(),
axis.ticks = element_blank(),
axis.title = element_blank(),
plot.title = element_text(hjust = 0.5, size = 20, face="bold")) +
ggtitle(" Rides' Number: % Used\n by Only Casual in each Day of Week\n in the Year, based on Rideable Type") +
scale_fill_manual(values = colors)
pie2 + gt_finalPct2
154.1.- Fourth Result of the Data Analysis
The pie chart and the table that generated it present, for users in the “Casuals” class, the Rides’ Number of the three Rideable Types (classic_bike, electric_scooter, electric_bike) on each Day of the Week, expressed as a percentage of the total Rides’ Number made with the three Rideable Types by both user types during the one-year period between October 2023 and September 2024.
The pie chart and the table to its right show that, on each Day of the Week during the year under study, users in the “Casuals” class made approximately twice as many rides on classic bikes as on electric bikes.
155.-
#fig.width = 12.29, fig.height= 8.00
dataPct3 <- list(Rideable= c('classic_bike-M','electric_scooter-M','electric_bike-M','classic_bike-C','electric_scooter-C','electric_bike-C'),
percentage= c(42.90, 0.52, 21.00, 22.84, 0.61, 12.14))
finalPct3 <- as.data.frame(dataPct3)
gt_finalPct3 <- gt(finalPct3)
colors <- c("#FFFFFF", "#F5FCC2","#FF0000", "#7FFF00", "#B3C732", "#A0522D")
pie3 <- ggplot(data = finalPct3, aes(x="", y = percentage, fill = reorder(Rideable,-percentage))) +
geom_col(color = "black") +
coord_polar("y", start = 0) +
geom_text(aes(x = 1.6, label = paste0(percentage,"\n", "%")), position = position_stack(vjust = 0.5), fontface = "bold") +
theme(panel.background = element_blank(),
axis.line = element_blank(),
axis.text = element_blank(),
axis.ticks = element_blank(),
axis.title = element_blank(),
plot.title = element_text(hjust = 0.5, size = 20, face="bold")) +
ggtitle(" Rides' Number: % Used\n by Rideable_Type\n in one Year, by User Type") +
scale_fill_manual(values = colors)
pie3 + gt_finalPct3
155.1.- Fifth Result of the Data Analysis.
The pie chart and the table that generated it show the relationship between the Rides’ Number made with the two main Rideable Types (classic_bike, electric_bike). The classic_bike-M/electric_bike-M ratio is 2.04 and the classic_bike-C/electric_bike-C ratio is 1.88. The classic_bike-C/classic_bike-M ratio is 0.53 and the electric_bike-C/electric_bike-M ratio is 0.58. Use of electric_scooter is practically nil compared with classic_bike and electric_bike for both user types.
The ratios within each user class (classic_bike-M/electric_bike-M and classic_bike-C/electric_bike-C) indicate that classic bikes accounted for practically double the Rides’ Number of electric bikes. The Casual/Member ratios (classic_bike-C/classic_bike-M and electric_bike-C/electric_bike-M), by contrast, indicate that users in the “Casuals” class made practically half the Rides’ Number of users in the “Members” class.
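The four ratios discussed in this result can be recomputed from the percentages in dataPct3. The sketch below stores them in a named vector so each ratio reads like the labels used in the text; the `ratios` object is illustrative.

```r
# Shares (%) of Rides' Number, copied from dataPct3 (M = Member, C = Casual).
pct <- c("classic_bike-M" = 42.90, "electric_scooter-M" = 0.52, "electric_bike-M" = 21.00,
         "classic_bike-C" = 22.84, "electric_scooter-C" = 0.61, "electric_bike-C" = 12.14)

ratios <- c(
  "classic-M/electric-M" = pct[["classic_bike-M"]]  / pct[["electric_bike-M"]],
  "classic-C/electric-C" = pct[["classic_bike-C"]]  / pct[["electric_bike-C"]],
  "classic-C/classic-M"  = pct[["classic_bike-C"]]  / pct[["classic_bike-M"]],
  "electric-C/electric-M" = pct[["electric_bike-C"]] / pct[["electric_bike-M"]]
)

# Expected to round to 2.04, 1.88, 0.53 and 0.58.
round(ratios, 2)
```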
156.-
#fig.width = 12.29, fig.height= 8.00
dataPct4 <- list(Rideable= c('classic_bike-M','electric_scooter-M','electric_bike-M','classic_bike-C','electric_scooter-C','electric_bike-C'),
percentage= c(20.03, 1.32, 15.99, 39.78, 1.78, 21.09))
finalPct4 <- as.data.frame(dataPct4)
gt_finalPct4 <- gt(finalPct4)
colors <- c("#FFFFFF", "#F5FCC2","#FF0000", "#7FFF00", "#B3C732", "#A0522D")
pie4 <- ggplot(data = finalPct4, aes(x="", y = percentage, fill = reorder(Rideable,-percentage ))) +
geom_col(color = "black") +
coord_polar("y", start = 0) +
geom_text(aes(x = 1.6, label = paste0(percentage,"\n", "%")), position = position_stack(vjust = 0.5), fontface = "bold") +
theme(panel.background = element_blank(),
axis.line = element_blank(),
axis.text = element_blank(),
axis.ticks = element_blank(),
axis.title = element_blank(),
plot.title = element_text(hjust = 0.5, size = 20, face="bold")) +
ggtitle(" Avg Rides' Time: % Used\n by Rideable_Type in one Year,\n by User Type") +
scale_fill_manual(values = colors)
pie4 + gt_finalPct4
156.1.- Sixth Result of the Data Analysis
Nomenclature: Member = M, Casual = C.
The graph (pie chart plus table) presents the Average Rides’ Time for the three Rideable Types (classic_bike, electric_scooter, electric_bike) during the one-year period between October 2023 and September 2024, based on the total contribution of the two user types **(Member (M), Casual (C))**.
The pie chart and the table that generated it show the relationship between the Average Rides’ Time for the three Rideable Types (classic_bike, electric_scooter, electric_bike). First, the classic_bike-M/electric_bike-M ratio is 1.25. Second, the classic_bike-C/electric_bike-C ratio is 1.89. Third, the classic_bike-C/classic_bike-M ratio is 1.99. Fourth, the electric_bike-C/electric_bike-M ratio is 1.32. Fifth, the contribution of electric_scooter is practically nil compared with classic_bike and electric_bike.
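Arranging the dataPct4 percentages as a user type by Rideable Type matrix makes both families of ratios one-line operations. This is a sketch with illustrative names; the ratios are recomputed from the rounded table entries, so the last digit may differ from figures derived from unrounded data.

```r
# Shares (%) of Avg. Rides' Time, copied from dataPct4.
shares <- matrix(
  c(20.03, 1.32, 15.99,   # member
    39.78, 1.78, 21.09),  # casual
  nrow = 2, byrow = TRUE,
  dimnames = list(c("member", "casual"),
                  c("classic_bike", "electric_scooter", "electric_bike"))
)

# classic_bike / electric_bike within each user type.
classic_vs_electric <- shares[, "classic_bike"] / shares[, "electric_bike"]

# Casual / Member for each Rideable Type.
casual_vs_member <- shares["casual", ] / shares["member", ]

round(classic_vs_electric, 2)
round(casual_vs_member, 2)

# Row totals give each user type's overall share of Avg. Rides' Time.
rowSums(shares)
```

The row totals (37.34 for member and 62.65 for casual) agree, up to rounding, with the user-type shares plotted in section 150.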
157.-
# fig.width = 12.29, fig.height= 8.00
dataPct6 <- list(Day = c('Mon','Tue','Wed','Thu','Fri','Sat','Sun'), ratio= c(1.77, 1.60, 1.60, 1.57, 1.70, 1.70, 1.79))
finalPct6 <- as.data.frame(dataPct6)
gt_finalPct6 <- gt(finalPct6)
# values in table >= 1.57 , in bold
gt_finalPct6 <- gt_finalPct6 %>%
tab_style(style = cell_text(weight = "bold"),
locations = cells_body(columns = ratio,
rows = ratio >= 1.57))
colors <- c("#FFFFFF", "#F5FCC2","#FF0000", "#7FFF00", "#B3C732", "#A0522D", "#000000")
pie6 <- ggplot(data = finalPct6, aes(x="", y = ratio, fill = reorder(Day,-ratio ))) +
geom_col(color = "black") +
coord_polar("y", start = 0) +
geom_text(aes(x = 1.6, label = ratio), position = position_stack(vjust = 0.5), fontface = "bold") +
theme(panel.background = element_blank(),
axis.line = element_blank(),
axis.text = element_blank(),
axis.ticks = element_blank(),
axis.title = element_blank(),
plot.title = element_text(hjust = 0.5, size = 20, face="bold")) +
ggtitle(" Ratio Casual/Member in\n Avg Rides' Time, in each Day of Week \n in one Year, based on User Type") +
scale_fill_manual(values = colors)
pie6 + gt_finalPct6
157.1.- Seventh Result of the Data Analysis
The graphic (pie chart plus table) presents the ratios of Average Rides’ Time obtained by comparing users in the “Casuals” class with users in the “Members” class on each Day of the Week during the one-year period between October 2023 and September 2024.
Over the full week the Casual/Member ratio of Average Rides’ Time averages 1.68, which indicates that users in the “Casuals” class rode, on average, 68% longer than users in the “Members” class. The strategy to convert casual riders into annual members therefore does not need to target Average Rides’ Time, because casual riders already exceed annual members on this variable.
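The weekly average of 1.68 and the 68% figure can be obtained from the daily ratios in dataPct6, as in the following sketch (variable names are illustrative).

```r
# Casual/Member ratio of Avg. Rides' Time per day, copied from dataPct6.
ratio <- c(Mon = 1.77, Tue = 1.60, Wed = 1.60, Thu = 1.57,
           Fri = 1.70, Sat = 1.70, Sun = 1.79)

# Simple (unweighted) mean over the seven days.
mean_ratio <- mean(ratio)

# Percentage by which the casual average exceeds the member average.
pct_longer <- (mean_ratio - 1) * 100

round(mean_ratio, 2)  # about 1.68
round(pct_longer)     # about 68
```

This is an unweighted mean of the seven daily ratios, so each day counts equally regardless of how many rides it had.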
158.-
#fig.width = 12.29, fig.height= 8.00
dataPct7 <- list(Day = c('Mon','Tue','Wed','Thu','Fri','Sat','Sun'), ratio= c(0.42, 0.36, 0.39, 0.40, 0.55, 0.93, 0.86))
finalPct7 <- as.data.frame(dataPct7)
gt_finalPct7 <- gt(finalPct7)
# values in table <= 0.55 , in bold
gt_finalPct7 <- gt_finalPct7 %>%
tab_style(style = cell_text(weight = "bold"),
locations = cells_body(columns = ratio,
rows = ratio <= 0.55))
colors <- c("#FFFFFF", "#F5FCC2","#FF0000", "#7FFF00", "#B3C732", "#A0522D", "#000000")
pie7 <- ggplot(data = finalPct7, aes(x="", y = ratio, fill = reorder(Day,-ratio ))) +
geom_col(color = "black") +
coord_polar("y", start = 0) +
geom_text(aes(x = 1.6, label = ratio), position = position_stack(vjust = 0.5), fontface = "bold") +
theme(panel.background = element_blank(),
axis.line = element_blank(),
axis.text = element_blank(),
axis.ticks = element_blank(),
axis.title = element_blank(),
plot.title = element_text(hjust = 0.5, size = 20, face="bold")) +
ggtitle(" Ratio: Casual/Member Classic_Bike,\n in Rides' Number in each Day of Week,\n in one Year, based on Rideable_Type") +
scale_fill_manual(values = colors)
pie7 + gt_finalPct7
158.1.- Eighth Result of the Data Analysis
The graph (pie chart plus table) presents the Casual/Member ratio of the number of rides on Classic_Bike for each day of the week, over the one-year period from October 2023 to September 2024.
The values shown in bold indicate that from Monday to Friday the ratio averages 0.42, so “Casual” users take fewer than half as many Classic_Bike rides as “Member” users. On weekends the ratio is close to 1.00 (0.93 on Saturday and 0.86 on Sunday), which indicates that both user types take roughly the same number of rides.
These results are closely repeated in the following section (159.-), which uses Electric_Bike as the Rideable_Type, and in the same analysis based on User_Type (see section 151.-). This is an important point for the strategy to convert “Casual” users into “Member” users: it indicates that the way to raise the number of rides by “Casual” users is to promote classic_bike and electric_bike use from Monday to Friday, aiming for roughly twice the current level.
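The weekday average and the doubling target mentioned above can be checked with simple arithmetic on the ratios from section 158. The sketch below uses only base R; the `day_type` column and the `growth_needed` variable are illustrative names that do not appear in the original code.

```r
# Casual/Member ratios for classic_bike rides, copied from section 158.
ratios <- data.frame(
  Day   = c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"),
  ratio = c(0.42, 0.36, 0.39, 0.40, 0.55, 0.93, 0.86)
)

# Label each day as weekday or weekend (illustrative helper column).
ratios$day_type <- ifelse(ratios$Day %in% c("Sat", "Sun"), "weekend", "weekday")

# Mean Casual/Member ratio for weekdays and for weekends.
mean_by_type <- tapply(ratios$ratio, ratios$day_type, mean)

# Factor by which weekday casual rides would have to grow to equal member rides.
growth_needed <- 1 / mean_by_type[["weekday"]]

print(mean_by_type)
print(growth_needed)
```

The weekday mean comes out at about 0.42, so casual weekday rides would need to grow by a factor of about 2.4 to match members, which is consistent with a target of at least doubling them.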
159.-
# fig.width = 12.29, fig.height= 8.00
dataPct8 <- list(Day = c('Mon','Tue','Wed','Thu','Fri','Sat','Sun'), ratio= c(0.49, 0.43, 0.46, 0.48, 0.61, 0.90, 0.88))
finalPct8 <- as.data.frame(dataPct8)
gt_finalPct8 <- gt(finalPct8)
# values in table <= 0.61 , in bold
gt_finalPct8 <- gt_finalPct8 %>%
tab_style(style = cell_text(weight = "bold"),
locations = cells_body(columns = ratio,
rows = ratio <= 0.61))
colors <- c("#FFFFFF", "#F5FCC2","#FF0000", "#7FFF00", "#B3C732", "#A0522D", "#000000")
pie8 <- ggplot(data = finalPct8, aes(x="", y = ratio, fill = reorder(Day,-ratio))) +
geom_col(color = "black") +
coord_polar("y", start = 0) +
geom_text(aes(x = 1.6, label = ratio), position = position_stack(vjust = 0.5), fontface = "bold") +
theme(panel.background = element_blank(),
axis.line = element_blank(),
axis.text = element_blank(),
axis.ticks = element_blank(),
axis.title = element_blank(),
plot.title = element_text(hjust = 0.5, size = 20, face="bold")) +
ggtitle(" Ratio: Casual/Member Electric_Bike,\n in Rides' Number in each Day of Week,\n in one Year, based on Rideable_Type") +
scale_fill_manual(values = colors)
pie8 + gt_finalPct8
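To check how closely the electric_bike ratios track the classic_bike ones, the two sets of values can be averaged over weekdays and weekends. This is a rough sketch in base R; the `comparison` data frame and the `is_weekend` column are illustrative names, and the values are copied from sections 158 and 159.

```r
# Ratios copied from sections 158 (classic_bike) and 159 (electric_bike).
comparison <- data.frame(
  Day      = c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"),
  classic  = c(0.42, 0.36, 0.39, 0.40, 0.55, 0.93, 0.86),
  electric = c(0.49, 0.43, 0.46, 0.48, 0.61, 0.90, 0.88)
)

# Flag weekend days (illustrative helper column).
comparison$is_weekend <- comparison$Day %in% c("Sat", "Sun")

# Mean ratio per bike type, split into weekdays and weekends.
summary_tbl <- aggregate(cbind(classic, electric) ~ is_weekend,
                         data = comparison, FUN = mean)
print(summary_tbl)
```

Both bike types show weekday averages below 0.5 and weekend averages near 0.9, so the weekday gap between casual riders and members appears for both rideable types.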
160.-
The procedure or strategy to convert “Casual” users into “Member” users should begin with a promotion period. It should target the variables analyzed (Rides’ Number and/or Average Ride’s Time) on which “Casual” users score lower than “Member” users, so that casual riders match or exceed the members’ results. They can then be offered the “Member” benefits and convert to that class naturally.
In the “Details” section, the difference between the cycling behaviour of casual riders and annual members is clearly illustrated.
1.- The first conclusion of this data analysis is to implement a strategy that at least doubles the number of rides by “Casual” users relative to the year studied; the next conclusion explains why. The average ride time of “Casual” users does not need to be promoted, because it is already well above that of “Member” users, as indicated in section No. 156 (Avg Rides’ Time: % Used by Rideable_Type in one Year, by User Type).
2.- The Casual/Member ratio values shown in bold in the table in section No. 151 (Ratio Casual/Member in Rides’ Number in each Day of Week, based on User Type) indicate that from Monday to Friday the ratio averages 0.45, so “Casual” users take roughly half as many rides as “Member” users. On weekends, however, the ratio is roughly 1.00, which indicates that both user types take practically the same number of rides. These results are closely repeated when the same analysis is based on Rideable_Type, in sections No. 158 and No. 159 (Ratio: Casual/Member Classic_Bike, in Rides’ Number in each Day of Week, in one Year, based on Rideable_Type and Ratio: Casual/Member Electric_Bike, in Rides’ Number in each Day of Week, in one Year, based on Rideable_Type), which is an important point to consider when designing the strategy to convert casual riders into annual members.
3.- To implement the proposed strategy of increasing the number of rides from Monday to Friday effectively, I would propose using social networks and emphasizing greater use of the electric_bike. For this, Divvy can run an education campaign on how to use the electric_bike and a promotion campaign for it named “Don’t disguise as a cyclist”, showing that a casual rider can use one while wearing everyday office clothes, since no physical effort is required.