Cyclistic is a fictional business. This report is based on a fictional scenario presented by Google in its Data Analytics Certificate Program. The stakeholders of this fictional firm are convinced that, in order to improve business, they must incentivize their casual riders to become annual members. Google did offer a script for use in this project; however, that script is not the one used in this report. The script and report involved in this project are the author’s own original work. This report was created to fulfill the specific task of determining key differences in activity between Cyclistic’s casual riders and annual members.
The task is to understand the key and relevant differences in how annual members and casual riders have used Cyclistic’s bikes within the last 12 months. Insight derived from this analysis will enable Cyclistic to make data-driven decisions that encourage more casual riders to become annual members.
The data used for this project has been provided by Motivate International Inc. https://divvy-tripdata.s3.amazonaws.com/index.html. In this scenario the data has been collected by Cyclistic’s employees and represents their riders’ activity with Cyclistic’s bikes. The data is organized by month, and the last 12 months were used. The rows of those 12 months were combined to create a single data frame of Cyclistic’s last 12 months of trip activity. This data frame consisted of 13 variables describing each ride’s id, bike type, user type, and the start and end of each trip’s time and location. It contained about 5.9 million observations. Given this large quantity of data, it seemed most efficient to use RStudio for this particular project.
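The combining step can be sketched in base R as follows. This is a minimal illustration on two toy monthly extracts, not the script used for this report; the column names (`ride_id`, `rideable_type`, `member_casual`) are assumptions, and only three of the 13 variables are shown.

```r
# Toy stand-ins for two monthly extracts (column names are assumed).
month_1 <- data.frame(
  ride_id = c("A1", "A2"),
  rideable_type = c("classic", "electric"),
  member_casual = c("member", "casual"),
  stringsAsFactors = FALSE
)
month_2 <- data.frame(
  ride_id = "B1",
  rideable_type = "docked",
  member_casual = "casual",
  stringsAsFactors = FALSE
)

# With real files, this list could instead be built by reading each
# monthly CSV, for example with lapply(file_paths, read.csv).
monthly <- list(month_1, month_2)

# Stack the monthly data frames row-wise; every data frame must have
# the same column names for rbind to succeed.
all_trips <- do.call(rbind, monthly)

nrow(all_trips)
```

Using `do.call(rbind, ...)` on a list keeps the code the same whether there are 2 or 12 monthly data frames.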
First, each trip’s start and end date and time were converted from their original character data type to a datetime data type. From this, 646 observations were found where the trip’s ending date and time was either before or the same as the trip’s starting date and time. Since communication with the data collectors is not possible (as they are fictional in this scenario), these 646 observations were deemed unreliable and were omitted. Furthermore, count functions were used to search for nulls in the relevant variables, and any observations containing nulls were omitted for the same reason.
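A minimal base R sketch of these cleaning steps is shown below, using four toy trips rather than the real data. The column names (`started_at`, `ended_at`) and the `UTC` time zone are assumptions made for the illustration.

```r
# Toy trips with character timestamps (column names are assumed).
trips <- data.frame(
  ride_id = c("A1", "A2", "A3", "A4"),
  started_at = c("2022-01-01 10:00:00", "2022-01-01 11:00:00",
                 "2022-01-02 09:30:00", "2022-01-02 12:00:00"),
  ended_at = c("2022-01-01 10:20:00", "2022-01-01 10:55:00",
               "2022-01-02 09:30:00", NA),
  stringsAsFactors = FALSE
)

# Convert character columns to datetimes (time zone is an assumption).
fmt <- "%Y-%m-%d %H:%M:%S"
trips$started_at <- as.POSIXct(trips$started_at, format = fmt, tz = "UTC")
trips$ended_at <- as.POSIXct(trips$ended_at, format = fmt, tz = "UTC")

# Count nulls per column.
na_counts <- colSums(is.na(trips))

# Flag trips that end before, or at the same moment as, they start.
has_na <- rowSums(is.na(trips)) > 0
bad_time <- !has_na & trips$ended_at <= trips$started_at
sum(bad_time)

# Keep only complete rows with a strictly positive duration.
clean <- trips[!has_na & !bad_time, ]
```

In this toy data, trips A2 and A3 are flagged for their timestamps and A4 is dropped for its missing end time, leaving only A1.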
After the data was checked for credibility and reliability, a new metric
was developed for each trip’s duration using the difference (measured in
hours) between the date and time of each trip’s beginning and end. This
new metric was used in a new data frame that gave the average trip
duration of each day by user type. The data frame was grouped
primarily by day, then secondarily by user type. This data frame was
designed to analyze the difference in trip duration between
members and casual riders.
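The duration metric and the daily averages can be sketched in base R as below. The toy trips, the column names, and the choice to assign each trip to the day on which it started are assumptions for the illustration, not details confirmed by this report.

```r
# Toy trips with datetime columns (column names are assumed).
trips <- data.frame(
  member_casual = c("member", "member", "casual", "casual", "casual"),
  started_at = as.POSIXct(c("2022-01-01 10:00:00", "2022-01-01 12:00:00",
                            "2022-01-01 09:00:00", "2022-01-01 15:00:00",
                            "2022-01-02 08:00:00"), tz = "UTC"),
  ended_at = as.POSIXct(c("2022-01-01 10:15:00", "2022-01-01 12:45:00",
                          "2022-01-01 10:00:00", "2022-01-01 17:00:00",
                          "2022-01-02 08:30:00"), tz = "UTC")
)

# Trip duration in hours: end time minus start time.
trips$duration_hours <- as.numeric(
  difftime(trips$ended_at, trips$started_at, units = "hours")
)

# Calendar day of the trip, taken from the start time (an assumption).
trips$day <- as.Date(trips$started_at, tz = "UTC")

# Average duration grouped by day, then by user type.
daily_avg <- aggregate(duration_hours ~ day + member_casual,
                       data = trips, FUN = mean)
daily_avg <- daily_avg[order(daily_avg$day, daily_avg$member_casual), ]
daily_avg
```

Passing `units = "hours"` explicitly matters here: without it, `difftime` picks a unit automatically, so values from different data sets might not be comparable.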
In addition, another data frame was created that included each trip’s
bike type and user type for the last 12 months. This data frame was
designed to analyze the difference in bike type use between
members and casual riders. The following table is the result of counting
each bike type’s use by each rider type.
## bike_type member_use casual_use
## 1 classic 1970959 1218178
## 2 docked 0 253364
## 3 electric 1370893 1086345
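A count of this kind, and the shares that follow from it, can be sketched in base R as below. The first part cross-tabulates toy rows with assumed column names; the second part uses the counts from the table above to compute each bike type’s member share. It is an illustration rather than the script used for this report.

```r
# Part 1: cross-tabulate bike type against rider type on toy rows
# (column names are assumed).
trips <- data.frame(
  rideable_type = c("classic", "classic", "electric", "docked", "electric"),
  member_casual = c("member", "casual", "member", "casual", "casual")
)
toy_counts <- table(trips$rideable_type, trips$member_casual)

# Part 2: member share per bike type, from the counts in the table above.
usage <- data.frame(
  bike_type = c("classic", "docked", "electric"),
  member_use = c(1970959, 0, 1370893),
  casual_use = c(1218178, 253364, 1086345)
)
usage$total <- usage$member_use + usage$casual_use
usage$member_pct <- round(100 * usage$member_use / usage$total, 1)
usage$casual_pct <- round(100 - usage$member_pct, 1)

# Overall member share across all bike types.
overall_member_pct <- round(100 * sum(usage$member_use) / sum(usage$total), 1)

usage
overall_member_pct
```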
The analysis revealed key differences between Cyclistic’s annual members and casual riders, beginning with their use of bike types. In the last 12 months, docked bikes were used only by casual riders; no member used one of Cyclistic’s docked bikes. Overall, annual members account for about 57% of trips within the last 12 months and casual riders for about 43%. Going by the counts in the table above, electric bikes follow nearly the same split, at 56:44 (annual member:casual rider). Classic bikes skew further toward members, at about 62% of classic bike rides by annual members and 38% by casual riders. Thus classic bikes are used relatively more by annual members than by casual riders. The following is a visual of Cyclistic’s bike use by annual members and casual riders, by bike type.
Another key difference in the use of Cyclistic’s bikes between annual members and casual riders is the average trip duration on any given day. The following visual shows that the average trip duration of annual members is relatively consistent and never exceeds 30 minutes, while the daily average for casual riders varied more throughout the last 12 months, with some exceptionally high averages around New Year’s Day. It also shows that casual riders usually rent Cyclistic’s bikes for longer periods of time than annual members, especially around the month of January.
As the pricing system for annual members and casual riders, by trip duration and bike type, has not been provided with this scenario, recommendations may be vague. However, the pricing system could be changed so that longer rides are cheaper for members than for casual riders. Special sign-up rewards could also be offered during the holidays, as the longest trips take place around the beginning of January. Furthermore, there seems to be a lack of incentive for members to ride docked bikes. It is therefore recommended that a pricing system for docked bikes be put in place that favors members over casual riders. These recommendations are expected to increase annual memberships among casual riders, as they are informed decisions driven by the available data.