INTRODUCTION

This is my capstone project for the Google Data Analytics Professional Certificate program. In this case study, I will perform real-world tasks as a junior data analyst for a fictional company, Cyclistic, and meet different characters and team members. To answer the key business questions, I will follow the steps of the data analysis process: ask, prepare, process, analyze, share, and act.

ABOUT THE COMPANY: CYCLISTIC

In 2016, Cyclistic launched a successful bike-share offering. Since then, the program has grown to a fleet of 5,824 bicycles that are geotracked and locked into a network of 692 stations across Chicago. The bikes can be unlocked from one station and returned to any other station in the system at any time.

Until now, Cyclistic’s marketing strategy relied on building general awareness and appealing to broad consumer segments. One approach that helped make these things possible was the flexibility of its pricing plans: single-ride passes, full-day passes, and annual memberships. Customers who purchase single-ride or full-day passes are referred to as Casual Riders. Customers who purchase annual memberships are Cyclistic Members.

Cyclistic’s finance analysts have concluded that Cyclistic Members are much more profitable than Casual Riders. Although the pricing flexibility helps Cyclistic attract more customers, Cyclistic believes that maximizing the number of annual members will be key to future growth. Rather than creating a marketing campaign that targets all-new customers, Cyclistic believes there is a very good chance to convert Casual Riders into members. The company notes that Casual Riders are already aware of the Cyclistic program and have chosen Cyclistic for their mobility needs.

GOAL

Design marketing strategies aimed at converting Casual Riders into Cyclistic Members. To do that, however, the marketing analyst team needs to better understand how Cyclistic Members and Casual Riders differ, why Casual Riders would buy a membership, and how digital media could affect their marketing tactics. The Cyclistic marketing analytics team is interested in analyzing Cyclistic’s historical bike trip data to identify trends.

SCENARIO

I am a junior data analyst working in the marketing analyst team at Cyclistic, a bike-share company in Chicago. The director of marketing believes the company’s future success depends on maximizing the number of annual memberships. Therefore, our team wants to understand how Casual Riders and Cyclistic Members use Cyclistic bikes differently. From these insights, our team will design a new marketing strategy to convert Casual Riders into Cyclistic Members. But first, Cyclistic executives must approve our team’s recommendations, so they must be backed up with compelling data insights and professional data visualizations.

CHARACTERS AND TEAMS

Cyclistic: A bike-share program that features more than 5,800 bicycles and 600 docking stations. Cyclistic sets itself apart by also offering reclining bikes, hand tricycles, and cargo bikes, making bike-share more inclusive to people with disabilities and riders who can’t use a standard two-wheeled bike. The majority of riders opt for traditional bikes; about 8% of riders use assistive options. Cyclistic users are more likely to ride for leisure, but about 30% use them to commute to work each day.

Lily Moreno: The director of marketing and my manager. Moreno is responsible for developing campaigns and initiatives to promote the bike-share program. These may include email, social media, and other channels.

Cyclistic Marketing Analytics Team: A team of data analysts who are responsible for collecting, analyzing, and reporting data that helps guide Cyclistic marketing strategy. I joined this team six months ago and have been busy learning about Cyclistic’s mission and business goals, as well as how I, as a junior data analyst, can help Cyclistic achieve them.

Cyclistic Executive Team: The notoriously detail-oriented executive team will decide whether to approve the recommended marketing program.

ASK PHASE

Three questions will guide the future marketing program:

1. How do Cyclistic Members and Casual Riders use Cyclistic bikes differently?

2. Why would Casual Riders buy Cyclistic annual memberships?

3. How can Cyclistic use digital media to influence Casual Riders to become members?

For this project, I’m tasked with analyzing the following question.

How do Cyclistic Members and Casual Riders use Cyclistic bikes differently?

PREPARE PHASE

I will use Cyclistic’s historical trip data to analyze and identify trends. I downloaded 12 months of Cyclistic trip data, from January to December 2021, and loaded it into RStudio Desktop.

Link:

https://divvy-tripdata.s3.amazonaws.com/index.html

Note:

The datasets have a different name because Cyclistic is a fictional company. For the purposes of this case study, the datasets are appropriate and will enable me to answer the business questions. The data has been made available by Motivate International Inc. under this license. This is public data that I can use to explore how different customer types are using Cyclistic bikes. But note that data-privacy issues prohibit me from using riders’ personally identifiable information. This means that I won’t be able to connect pass purchases to credit card numbers to determine if Casual Riders live in the Cyclistic service area or if they have purchased multiple single passes.

PROCESS PHASE

I chose RStudio Desktop to process the twelve monthly data files for 2021.

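Before importing the data, the required packages were loaded. Note that plyr is loaded after dplyr here, which masks several dplyr functions (see the warning in the output below); loading plyr first, or dropping it if it is not needed, avoids this.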

library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.2.2
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.4.0      ✔ purrr   0.3.5 
## ✔ tibble  3.1.8      ✔ dplyr   1.0.10
## ✔ tidyr   1.2.1      ✔ stringr 1.4.1 
## ✔ readr   2.1.3      ✔ forcats 0.5.2
## Warning: package 'ggplot2' was built under R version 4.2.2
## Warning: package 'tibble' was built under R version 4.2.2
## Warning: package 'dplyr' was built under R version 4.2.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(lubridate)
## Warning: package 'lubridate' was built under R version 4.2.2
## 
## Attaching package: 'lubridate'
## The following objects are masked from 'package:base':
## 
##     date, intersect, setdiff, union
library(plyr)
## ------------------------------------------------------------------------------
## You have loaded plyr after dplyr - this is likely to cause problems.
## If you need functions from both plyr and dplyr, please load plyr first, then dplyr:
## library(plyr); library(dplyr)
## ------------------------------------------------------------------------------
## 
## Attaching package: 'plyr'
## The following objects are masked from 'package:dplyr':
## 
##     arrange, count, desc, failwith, id, mutate, rename, summarise,
##     summarize
## The following object is masked from 'package:purrr':
## 
##     compact
library(rmarkdown)
library(knitr)  

Importing the twelve monthly files into the environment:

jan <- read.csv("C:/R/202101-divvy-tripdata.csv")
feb <- read.csv("C:/R/202102-divvy-tripdata.csv")
mar <- read.csv("C:/R/202103-divvy-tripdata.csv")
apr <- read.csv("C:/R/202104-divvy-tripdata.csv")
may <- read.csv("C:/R/202105-divvy-tripdata.csv")
jun <- read.csv("C:/R/202106-divvy-tripdata.csv")
jul <- read.csv("C:/R/202107-divvy-tripdata.csv")
aug <- read.csv("C:/R/202108-divvy-tripdata.csv")
sep <- read.csv("C:/R/202109-divvy-tripdata.csv")
oct <- read.csv("C:/R/202110-divvy-tripdata.csv")
nov <- read.csv("C:/R/202111-divvy-tripdata.csv")
dec <- read.csv("C:/R/202112-divvy-tripdata.csv")

Merging all twelve data frames into one data frame, trip_data:

trip_data <- rbind(jan,feb,mar,apr,may,jun,jul,aug,sep,oct,nov,dec)
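The twelve read.csv() calls and the rbind() step can also be written as a loop. This is a minimal sketch, assuming all monthly files sit in one folder and follow the YYYYMM-divvy-tripdata.csv naming pattern; read_trip_files and data_dir are hypothetical names introduced here, not part of the original analysis.

```r
# Hypothetical helper: read every monthly trip file in a folder and stack them.
# Assumes all files share the same columns, as the 2021 files used here do.
read_trip_files <- function(data_dir) {
  files <- list.files(data_dir, pattern = "-divvy-tripdata\\.csv$", full.names = TRUE)
  files <- sort(files)                  # the YYYYMM prefix sorts chronologically
  monthly <- lapply(files, read.csv)    # one data frame per month
  do.call(rbind, monthly)               # stack them, like rbind(jan, feb, ...)
}

# Usage (the folder path is an assumption):
# trip_data <- read_trip_files("C:/R")
```

A loop like this avoids twelve near-identical lines and picks up new monthly files without further edits.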

Column names:

colnames(trip_data)
##  [1] "ride_id"            "rideable_type"      "started_at"        
##  [4] "ended_at"           "start_station_name" "start_station_id"  
##  [7] "end_station_name"   "end_station_id"     "start_lat"         
## [10] "start_lng"          "end_lat"            "end_lng"           
## [13] "member_casual"

Structure of trip_data:

str(trip_data)
## 'data.frame':    5595063 obs. of  13 variables:
##  $ ride_id           : chr  "E19E6F1B8D4C42ED" "DC88F20C2C55F27F" "EC45C94683FE3F27" "4FA453A75AE377DB" ...
##  $ rideable_type     : chr  "electric_bike" "electric_bike" "electric_bike" "electric_bike" ...
##  $ started_at        : chr  "2021-01-23 16:14:19" "2021-01-27 18:43:08" "2021-01-21 22:35:54" "2021-01-07 13:31:13" ...
##  $ ended_at          : chr  "2021-01-23 16:24:44" "2021-01-27 18:47:12" "2021-01-21 22:37:14" "2021-01-07 13:42:55" ...
##  $ start_station_name: chr  "California Ave & Cortez St" "California Ave & Cortez St" "California Ave & Cortez St" "California Ave & Cortez St" ...
##  $ start_station_id  : chr  "17660" "17660" "17660" "17660" ...
##  $ end_station_name  : chr  "" "" "" "" ...
##  $ end_station_id    : chr  "" "" "" "" ...
##  $ start_lat         : num  41.9 41.9 41.9 41.9 41.9 ...
##  $ start_lng         : num  -87.7 -87.7 -87.7 -87.7 -87.7 ...
##  $ end_lat           : num  41.9 41.9 41.9 41.9 41.9 ...
##  $ end_lng           : num  -87.7 -87.7 -87.7 -87.7 -87.7 ...
##  $ member_casual     : chr  "member" "member" "member" "member" ...

Glimpse for trip_data:

glimpse(trip_data)
## Rows: 5,595,063
## Columns: 13
## $ ride_id            <chr> "E19E6F1B8D4C42ED", "DC88F20C2C55F27F", "EC45C94683…
## $ rideable_type      <chr> "electric_bike", "electric_bike", "electric_bike", …
## $ started_at         <chr> "2021-01-23 16:14:19", "2021-01-27 18:43:08", "2021…
## $ ended_at           <chr> "2021-01-23 16:24:44", "2021-01-27 18:47:12", "2021…
## $ start_station_name <chr> "California Ave & Cortez St", "California Ave & Cor…
## $ start_station_id   <chr> "17660", "17660", "17660", "17660", "17660", "17660…
## $ end_station_name   <chr> "", "", "", "", "", "", "", "", "", "Wood St & Augu…
## $ end_station_id     <chr> "", "", "", "", "", "", "", "", "", "657", "13258",…
## $ start_lat          <dbl> 41.90034, 41.90033, 41.90031, 41.90040, 41.90033, 4…
## $ start_lng          <dbl> -87.69674, -87.69671, -87.69664, -87.69666, -87.696…
## $ end_lat            <dbl> 41.89000, 41.90000, 41.90000, 41.92000, 41.90000, 4…
## $ end_lng            <dbl> -87.72000, -87.69000, -87.70000, -87.69000, -87.700…
## $ member_casual      <chr> "member", "member", "member", "member", "casual", "…

Summary for trip_data:

summary(trip_data)
##    ride_id          rideable_type       started_at          ended_at        
##  Length:5595063     Length:5595063     Length:5595063     Length:5595063    
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##  start_station_name start_station_id   end_station_name   end_station_id    
##  Length:5595063     Length:5595063     Length:5595063     Length:5595063    
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##    start_lat       start_lng         end_lat         end_lng      
##  Min.   :41.64   Min.   :-87.84   Min.   :41.39   Min.   :-88.97  
##  1st Qu.:41.88   1st Qu.:-87.66   1st Qu.:41.88   1st Qu.:-87.66  
##  Median :41.90   Median :-87.64   Median :41.90   Median :-87.64  
##  Mean   :41.90   Mean   :-87.65   Mean   :41.90   Mean   :-87.65  
##  3rd Qu.:41.93   3rd Qu.:-87.63   3rd Qu.:41.93   3rd Qu.:-87.63  
##  Max.   :42.07   Max.   :-87.52   Max.   :42.17   Max.   :-87.49  
##                                   NA's   :4771    NA's   :4771    
##  member_casual     
##  Length:5595063    
##  Class :character  
##  Mode  :character  
##                    
##                    
##                    
## 

Dimensions for trip_data:

dim(trip_data)
## [1] 5595063      13

Formatting

Removing rows with missing (NA) values. Note that na.omit() drops only NA values, so rows with blank station names are kept:

trip_data <- na.omit(trip_data)
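If rides with blank station names should be excluded as well, one option is to convert empty strings to NA before calling na.omit(). The sketch below uses a toy data frame with made-up values; whether such rides should be dropped is an analysis decision that this case study does not make.

```r
# Toy data standing in for trip_data (values are made up for illustration)
toy <- data.frame(
  ride_id          = c("A1", "B2", "C3"),
  end_station_name = c("", "Station A", "Station B"),
  end_lat          = c(41.89, NA, 41.90),
  stringsAsFactors = FALSE
)

# Turn empty strings into NA in every character column
is_chr <- vapply(toy, is.character, logical(1))
toy[is_chr] <- lapply(toy[is_chr], function(x) {
  x[x %in% ""] <- NA   # %in% never returns NA, so existing NAs are left alone
  x
})

# na.omit() now removes rows with either kind of missing value
toy_clean <- na.omit(toy)
toy_clean
```

Only the third toy row survives: the first has a blank station name and the second has a missing latitude.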

Removing duplicates:

trip_data <- trip_data %>%
  distinct()

date, month, day, year, and day_of_week columns are added to trip_data:

trip_data$day_of_week <- format(as.Date(trip_data$started_at), "%A")  # weekday name, e.g. "Saturday"
trip_data$date <- as.Date(trip_data$started_at)                       # calendar date of the ride start
trip_data$month <- format(trip_data$date, "%m")                       # two-digit month
trip_data$day <- format(trip_data$date, "%d")                         # two-digit day of the month
trip_data$year <- format(trip_data$date, "%Y")                        # four-digit year

Adding a ride_length column (ride duration in seconds):

trip_data$ride_length <- difftime(trip_data$ended_at, trip_data$started_at, units = "secs")

Converting ride_length to a numeric column:

trip_data$ride_length <- as.numeric(as.character(trip_data$ride_length))
is.numeric(trip_data$ride_length)
## [1] TRUE
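Unless units is given, difftime() chooses its units from the data, and character timestamps are parsed in the machine's local time zone. The sketch below makes both choices explicit for two timestamps taken from the sample rows; using tz = "UTC" is an assumption made here for reproducibility, not something the source data specifies.

```r
# Two start/end pairs in the same "YYYY-MM-DD HH:MM:SS" layout as started_at and ended_at
started <- c("2021-01-23 16:14:19", "2021-01-27 18:43:08")
ended   <- c("2021-01-23 16:24:44", "2021-01-27 18:47:12")

# Parse explicitly; tz = "UTC" is an assumption so results do not depend on
# the local time zone of the machine running the code
started_ct <- as.POSIXct(started, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
ended_ct   <- as.POSIXct(ended,   format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Request seconds explicitly, then convert to a plain numeric vector
ride_length <- as.numeric(difftime(ended_ct, started_ct, units = "secs"))
ride_length
```

The two results, 625 and 244 seconds, match the first two ride_length values shown for trip_data.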

Removing rides with a zero or negative ride_length:

trip_data <- trip_data[!(trip_data$ride_length <= 0),]
sum(trip_data$ride_length <= 0)
## [1] 0

Column names after formatting:

colnames(trip_data)
##  [1] "ride_id"            "rideable_type"      "started_at"        
##  [4] "ended_at"           "start_station_name" "start_station_id"  
##  [7] "end_station_name"   "end_station_id"     "start_lat"         
## [10] "start_lng"          "end_lat"            "end_lng"           
## [13] "member_casual"      "day_of_week"        "date"              
## [16] "month"              "day"                "year"              
## [19] "ride_length"

Dimensions for trip_data:

dim(trip_data)
## [1] 5589640      19

Structure of trip_data:

str(trip_data)
## 'data.frame':    5589640 obs. of  19 variables:
##  $ ride_id           : chr  "E19E6F1B8D4C42ED" "DC88F20C2C55F27F" "EC45C94683FE3F27" "4FA453A75AE377DB" ...
##  $ rideable_type     : chr  "electric_bike" "electric_bike" "electric_bike" "electric_bike" ...
##  $ started_at        : chr  "2021-01-23 16:14:19" "2021-01-27 18:43:08" "2021-01-21 22:35:54" "2021-01-07 13:31:13" ...
##  $ ended_at          : chr  "2021-01-23 16:24:44" "2021-01-27 18:47:12" "2021-01-21 22:37:14" "2021-01-07 13:42:55" ...
##  $ start_station_name: chr  "California Ave & Cortez St" "California Ave & Cortez St" "California Ave & Cortez St" "California Ave & Cortez St" ...
##  $ start_station_id  : chr  "17660" "17660" "17660" "17660" ...
##  $ end_station_name  : chr  "" "" "" "" ...
##  $ end_station_id    : chr  "" "" "" "" ...
##  $ start_lat         : num  41.9 41.9 41.9 41.9 41.9 ...
##  $ start_lng         : num  -87.7 -87.7 -87.7 -87.7 -87.7 ...
##  $ end_lat           : num  41.9 41.9 41.9 41.9 41.9 ...
##  $ end_lng           : num  -87.7 -87.7 -87.7 -87.7 -87.7 ...
##  $ member_casual     : chr  "member" "member" "member" "member" ...
##  $ day_of_week       : chr  "Saturday" "Wednesday" "Thursday" "Thursday" ...
##  $ date              : Date, format: "2021-01-23" "2021-01-27" ...
##  $ month             : chr  "01" "01" "01" "01" ...
##  $ day               : chr  "23" "27" "21" "07" ...
##  $ year              : chr  "2021" "2021" "2021" "2021" ...
##  $ ride_length       : num  625 244 80 702 43 ...

Glimpse for trip_data:

glimpse(trip_data)
## Rows: 5,589,640
## Columns: 19
## $ ride_id            <chr> "E19E6F1B8D4C42ED", "DC88F20C2C55F27F", "EC45C94683…
## $ rideable_type      <chr> "electric_bike", "electric_bike", "electric_bike", …
## $ started_at         <chr> "2021-01-23 16:14:19", "2021-01-27 18:43:08", "2021…
## $ ended_at           <chr> "2021-01-23 16:24:44", "2021-01-27 18:47:12", "2021…
## $ start_station_name <chr> "California Ave & Cortez St", "California Ave & Cor…
## $ start_station_id   <chr> "17660", "17660", "17660", "17660", "17660", "17660…
## $ end_station_name   <chr> "", "", "", "", "", "", "", "", "", "Wood St & Augu…
## $ end_station_id     <chr> "", "", "", "", "", "", "", "", "", "657", "13258",…
## $ start_lat          <dbl> 41.90034, 41.90033, 41.90031, 41.90040, 41.90033, 4…
## $ start_lng          <dbl> -87.69674, -87.69671, -87.69664, -87.69666, -87.696…
## $ end_lat            <dbl> 41.89000, 41.90000, 41.90000, 41.92000, 41.90000, 4…
## $ end_lng            <dbl> -87.72000, -87.69000, -87.70000, -87.69000, -87.700…
## $ member_casual      <chr> "member", "member", "member", "member", "casual", "…
## $ day_of_week        <chr> "Saturday", "Wednesday", "Thursday", "Thursday", "S…
## $ date               <date> 2021-01-23, 2021-01-27, 2021-01-21, 2021-01-07, 20…
## $ month              <chr> "01", "01", "01", "01", "01", "01", "01", "01", "01…
## $ day                <chr> "23", "27", "21", "07", "23", "09", "04", "14", "09…
## $ year               <chr> "2021", "2021", "2021", "2021", "2021", "2021", "20…
## $ ride_length        <dbl> 625, 244, 80, 702, 43, 3227, 335, 400, 151, 433, 27…

Summary for trip_data:

summary(trip_data)
##    ride_id          rideable_type       started_at          ended_at        
##  Length:5589640     Length:5589640     Length:5589640     Length:5589640    
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##  start_station_name start_station_id   end_station_name   end_station_id    
##  Length:5589640     Length:5589640     Length:5589640     Length:5589640    
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##    start_lat       start_lng         end_lat         end_lng      
##  Min.   :41.64   Min.   :-87.84   Min.   :41.39   Min.   :-88.97  
##  1st Qu.:41.88   1st Qu.:-87.66   1st Qu.:41.88   1st Qu.:-87.66  
##  Median :41.90   Median :-87.64   Median :41.90   Median :-87.64  
##  Mean   :41.90   Mean   :-87.65   Mean   :41.90   Mean   :-87.65  
##  3rd Qu.:41.93   3rd Qu.:-87.63   3rd Qu.:41.93   3rd Qu.:-87.63  
##  Max.   :42.07   Max.   :-87.52   Max.   :42.17   Max.   :-87.49  
##  member_casual      day_of_week             date               month          
##  Length:5589640     Length:5589640     Min.   :2021-01-01   Length:5589640    
##  Class :character   Class :character   1st Qu.:2021-06-07   Class :character  
##  Mode  :character   Mode  :character   Median :2021-08-01   Mode  :character  
##                                        Mean   :2021-07-28                     
##                                        3rd Qu.:2021-09-24                     
##                                        Max.   :2021-12-31                     
##      day                year            ride_length     
##  Length:5589640     Length:5589640     Min.   :      1  
##  Class :character   Class :character   1st Qu.:    405  
##  Mode  :character   Mode  :character   Median :    719  
##                                        Mean   :   1259  
##                                        3rd Qu.:   1304  
##                                        Max.   :3356649

Mean for ride_length:

mean(trip_data$ride_length)
## [1] 1259.074

Median for ride_length:

median(trip_data$ride_length)
## [1] 719

Max for ride_length:

max(trip_data$ride_length)
## [1] 3356649

Minimum for ride_length:

min(trip_data$ride_length)
## [1] 1

Summary for ride_length:

summary(trip_data$ride_length)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##       1     405     719    1259    1304 3356649
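Because ride_length is stored in seconds, the summary can be easier to judge in minutes or days. A small sketch using the values printed above; the variable names are introduced here for illustration only.

```r
# Summary values copied from the output above (seconds)
ride_summary <- c(min = 1, q1 = 405, median = 719, mean = 1259, q3 = 1304, max = 3356649)

# Convert everything to minutes, and the maximum to days
ride_minutes <- ride_summary / 60
max_days     <- ride_summary[["max"]] / (60 * 60 * 24)

round(ride_minutes, 1)
round(max_days, 1)
```

The maximum works out to more than a month, which suggests that a small number of very long rides pull the mean well above the median.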

ANALYZE PHASE

Ordering day_of_week from Monday to Sunday:

trip_data$day_of_week <- ordered(trip_data$day_of_week, levels=c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"))

Aggregate ride_length by mean:

aggregate(trip_data$ride_length ~ trip_data$member_casual, FUN = mean)
##   trip_data$member_casual trip_data$ride_length
## 1                  casual             1814.5072
## 2                  member              801.3866
aggregate(trip_data$ride_length ~ trip_data$member_casual + trip_data$day_of_week, FUN = mean)
##    trip_data$member_casual trip_data$day_of_week trip_data$ride_length
## 1                   casual                Monday             1817.3043
## 2                   member                Monday              776.5425
## 3                   casual               Tuesday             1602.9681
## 4                   member               Tuesday              754.3039
## 5                   casual             Wednesday             1573.8958
## 6                   member             Wednesday              757.6898
## 7                   casual              Thursday             1563.1390
## 8                   member              Thursday              751.6961
## 9                   casual                Friday             1716.7117
## 10                  member                Friday              783.1931
## 11                  casual              Saturday             1966.3623
## 12                  member              Saturday              895.4791
## 13                  casual                Sunday             2120.4624
## 14                  member                Sunday              915.5270

Aggregate ride_length by median:

aggregate(trip_data$ride_length ~ trip_data$member_casual, FUN = median)
##   trip_data$member_casual trip_data$ride_length
## 1                  casual                   957
## 2                  member                   576
aggregate(trip_data$ride_length ~ trip_data$member_casual + trip_data$day_of_week, FUN = median)
##    trip_data$member_casual trip_data$day_of_week trip_data$ride_length
## 1                   casual                Monday                   956
## 2                   member                Monday                   552
## 3                   casual               Tuesday                   856
## 4                   member               Tuesday                   548
## 5                   casual             Wednesday                   837
## 6                   member             Wednesday                   553
## 7                   casual              Thursday                   826
## 8                   member              Thursday                   548
## 9                   casual                Friday                   897
## 10                  member                Friday                   566
## 11                  casual              Saturday                  1068
## 12                  member              Saturday                   649
## 13                  casual                Sunday                  1122
## 14                  member                Sunday                   652

Aggregate ride_length by maximum:

aggregate(trip_data$ride_length ~ trip_data$member_casual, FUN = max)
##   trip_data$member_casual trip_data$ride_length
## 1                  casual               3356649
## 2                  member                 89996
aggregate(trip_data$ride_length ~ trip_data$member_casual + trip_data$day_of_week, FUN = max)
##    trip_data$member_casual trip_data$day_of_week trip_data$ride_length
## 1                   casual                Monday               1897299
## 2                   member                Monday                 89994
## 3                   casual               Tuesday               2335375
## 4                   member               Tuesday                 88345
## 5                   casual             Wednesday               2337785
## 6                   member             Wednesday                 89990
## 7                   casual              Thursday               2946429
## 8                   member              Thursday                 89996
## 9                   casual                Friday               3341501
## 10                  member                Friday                 89996
## 11                  casual              Saturday               3356649
## 12                  member              Saturday                 89993
## 13                  casual                Sunday               3235296
## 14                  member                Sunday                 89991

Aggregate ride_length by minimum:

aggregate(trip_data$ride_length ~ trip_data$member_casual, FUN = min)
##   trip_data$member_casual trip_data$ride_length
## 1                  casual                     1
## 2                  member                     1
aggregate(trip_data$ride_length ~ trip_data$member_casual + trip_data$day_of_week, FUN = min)
##    trip_data$member_casual trip_data$day_of_week trip_data$ride_length
## 1                   casual                Monday                     1
## 2                   member                Monday                     1
## 3                   casual               Tuesday                     1
## 4                   member               Tuesday                     1
## 5                   casual             Wednesday                     1
## 6                   member             Wednesday                     1
## 7                   casual              Thursday                     1
## 8                   member              Thursday                     1
## 9                   casual                Friday                     1
## 10                  member                Friday                     1
## 11                  casual              Saturday                     1
## 12                  member              Saturday                     1
## 13                  casual                Sunday                     1
## 14                  member                Sunday                     1

Average, median, maximum, and minimum ride_length for Casual Riders and Cyclistic Members:

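The four statistics can be collected into a single table with one aggregate() call. This is a minimal sketch on a toy data frame with made-up values; it is one possible way to build such a table, not necessarily the code used for the original output.

```r
# Toy data standing in for trip_data; ride_length is in seconds (values made up)
toy <- data.frame(
  member_casual = c("casual", "casual", "casual", "member", "member", "member"),
  ride_length   = c(600, 1200, 3000, 300, 500, 700),
  stringsAsFactors = FALSE
)

# One row per rider type, with all four statistics computed at once
stats <- aggregate(
  ride_length ~ member_casual,
  data = toy,
  FUN  = function(x) c(mean = mean(x), median = median(x), max = max(x), min = min(x))
)

# aggregate() returns the statistics as a matrix column; flatten it to plain columns
stats <- do.call(data.frame, stats)
stats
```

Passing data = toy keeps the column names short (ride_length.mean and so on) instead of repeating the data frame name in every header.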

Calculating the total number of rides for Casual Riders and Cyclistic Members:

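Counting rides per rider type needs only base R. The sketch below uses made-up data; base functions are used so the result does not depend on whether a plyr or dplyr version of count() or summarise() is the one in scope.

```r
# Toy data standing in for trip_data (one row per ride, values made up)
toy <- data.frame(
  member_casual = c("casual", "member", "member", "casual", "member"),
  stringsAsFactors = FALSE
)

# table() counts rows per rider type; as.data.frame() gives a tidy two-column result
rides <- as.data.frame(
  table(member_casual = toy$member_casual),
  responseName = "number_of_rides"
)
rides
```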

Graph: total number of rides for Casual Riders and Cyclistic Members

Calculating number_of_rides for each day_of_the_week:

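A two-way count by rider type and weekday can be built the same way. This is a sketch on made-up data; converting day_of_week to an ordered factor first keeps the rows in Monday-to-Sunday order and includes days with zero rides.

```r
# Toy data standing in for trip_data (values made up)
toy <- data.frame(
  member_casual = c("casual", "casual", "member", "member", "member", "casual"),
  day_of_week   = c("Saturday", "Sunday", "Monday", "Wednesday", "Wednesday", "Saturday"),
  stringsAsFactors = FALSE
)

# Fix the weekday order explicitly; otherwise days sort alphabetically
day_levels <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday")
toy$day_of_week <- factor(toy$day_of_week, levels = day_levels, ordered = TRUE)

# One row per rider type and weekday combination
rides_by_day <- as.data.frame(
  table(member_casual = toy$member_casual, day_of_week = toy$day_of_week),
  responseName = "number_of_rides"
)
rides_by_day
```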

Graph: number_of_rides per day_of_week for Casual Riders and Cyclistic Members
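
A grouped bar chart is one way to compare ride counts by weekday. The sketch below uses made-up counts and base graphics, chosen here only to keep the example dependency-free; it is not a reconstruction of the original graph.

```r
# Toy counts standing in for the real number_of_rides table (values made up)
rides <- matrix(
  c(5, 9, 4, 10, 4, 11, 5, 10, 7, 9, 12, 8, 11, 7),
  nrow = 2,
  dimnames = list(
    member_casual = c("casual", "member"),
    day_of_week   = c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
  )
)

# Grouped bars: one pair of bars per day, one bar per rider type
bar_mids <- barplot(
  rides,
  beside      = TRUE,
  legend.text = TRUE,
  xlab        = "Day of week",
  ylab        = "Number of rides",
  main        = "Rides by rider type and day of week (toy data)"
)
```

barplot() returns the x positions of the bars, which can be reused to add labels above each bar with text().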

Formatting Month to be in correct order:

trip_data$month <- ordered(trip_data$month, levels = c("01", "02", "03", "04", "05", "06", "07", "08", "09", "10", "11", "12"))

Calculating the number_of_rides for each month:


Graph: number_of_rides per month for Casual Riders and Cyclistic Members

Calculating average_ride_length per day_of_week:


Graph: average_ride_length per day_of_week for Casual Riders and Cyclistic Members

Calculating average_ride_length per month:


Graph: average_ride_length per month for Casual Riders and Cyclistic Members

Calculating ride_percentage of ride_length:

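The share of rides per rider type follows directly from the ride counts. A minimal sketch using the two totals reported in the conclusion of this case study:

```r
# Ride counts reported in the conclusion of this case study
number_of_rides <- c(casual = 2525174, member = 3064466)

# prop.table() turns counts into shares that sum to 1; scale to percent and round
ride_percentage <- round(100 * prop.table(number_of_rides), 1)
ride_percentage
```

This reproduces the 45.2% and 54.8% shares quoted in the conclusion, and the two counts add up to the 5,589,640 rows left after cleaning.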

Calculating number_of_rides by rideable_type for Casual Riders and Cyclistic Members:


Graph: number_of_rides by rideable_type for Casual Riders and Cyclistic Members

Retrieving the most popular station for Casual Riders:


Retrieving the most popular station for Cyclistic Members:

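The most popular station for each rider type can be found by counting station names and taking the largest count. The sketch below uses made-up data; top_station is a hypothetical helper, and counting by start_station_name and ignoring blank names are both assumptions, since the original code is not shown.

```r
# Toy data standing in for trip_data (station names are made up)
toy <- data.frame(
  member_casual      = c("casual", "casual", "casual", "member", "member", "member", "member"),
  start_station_name = c("Station A", "Station A", "", "Station B", "Station B", "Station C", ""),
  stringsAsFactors = FALSE
)

# Hypothetical helper: most frequent non-blank start station for one rider type
top_station <- function(data, rider_type) {
  stations <- data$start_station_name[data$member_casual == rider_type]
  stations <- stations[!is.na(stations) & stations != ""]  # skip rides with no station recorded
  counts <- sort(table(stations), decreasing = TRUE)        # most frequent first
  names(counts)[1]
}

top_station(toy, "casual")
top_station(toy, "member")
```

Without the blank-name filter, the empty string could come out as the "most popular station", because many rides have no station recorded.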

CONCLUSION

Cyclistic Members and Casual Riders differ most clearly in ride length. Casual Riders average about 1,815 seconds (roughly 30 minutes) per ride, compared with about 801 seconds (roughly 13 minutes) for Cyclistic Members. The medians show the same pattern: 957 seconds (about 16 minutes) for Casual Riders and 576 seconds (about 10 minutes) for Cyclistic Members. By ride count, Casual Riders took 2,525,174 rides (45.2% of the total), fewer than the 3,064,466 rides (54.8%) taken by Cyclistic Members. Casual Riders use the service most on Saturday, while Cyclistic Members use it most on Wednesday. In July, Casual Riders have a higher number of rides than Cyclistic Members; Cyclistic Members have a higher number of rides in August and September. Both groups use classic bikes more than electric bikes, but Casual Riders use docked bikes more than Cyclistic Members do. The most popular station for Casual Riders is Streeter Dr & Grand Ave; for Cyclistic Members it is Clark St & Elm St.

RECOMMENDATIONS

1. Cyclistic should highlight historical or local attractions to encourage Cyclistic Members and new users to use the service on weekends.
2. Marketing programs should run during the summer months to attract more new Cyclistic Members.
3. Promotions and discounts should be offered to Casual Riders and Cyclistic Members to encourage the use of other types of bikes.