Background

Cyclistic is a popular bike-sharing program in Chicago that started in 2016. They have a large fleet of bikes and many stations across the city. Cyclistic offers different pricing options, including single-ride passes, full-day passes, and annual memberships. The company wants to understand how their annual members and casual riders use the bikes differently and why casual riders might choose to become annual members. They also want to explore how digital media can help convince casual riders to become long-term members. To answer these questions, they will analyze their historical bike trip data to find patterns and insights that will guide their marketing strategies and attract more members.

For this case study, provided as part of the Google Data Analytics certification, I challenged myself to develop my own process and analysis. The goal was to explore how I would approach the task with my own capabilities and to learn my limits, so that I can grow as a learner. The case study already provides an R script, but it differs from the recent data in ways that make it hard to apply directly.

Preparation

In preparation for the case study, the first step involved obtaining the necessary data from Cyclistic. A dataset spanning the previous 12 months was provided, from May of 2022 to April of 2023, and it was accessible through the link: https://divvy-tripdata.s3.amazonaws.com/. The dataset consisted of CSV files containing valuable information about the bike sharing program.

To streamline the data analysis process, the CSV files were checked and merged in RStudio, a popular integrated development environment for R programming. By merging the individual CSV files, a comprehensive master dataset was created, ensuring that all relevant data was consolidated for further processing and analysis. This master dataset would serve as the foundation for examining the usage patterns of annual members and casual riders in the Cyclistic bike sharing program.
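The merge step described above can be sketched in base R as follows. This is a minimal illustration, not the script actually used: the folder, the file names, and the two tiny demo files are hypothetical stand-ins for the monthly CSV downloads.

```r
# Hypothetical demo folder with two small monthly files standing in for the
# real CSV downloads (file names and contents are made up for illustration).
data_dir <- file.path(tempdir(), "cyclistic_demo")
dir.create(data_dir, showWarnings = FALSE)
write.csv(data.frame(ride_id = c("A1", "A2"),
                     member_casual = c("member", "casual")),
          file.path(data_dir, "202205-tripdata.csv"), row.names = FALSE)
write.csv(data.frame(ride_id = "B1", member_casual = "member"),
          file.path(data_dir, "202206-tripdata.csv"), row.names = FALSE)

# Read every CSV in the folder into a list of data frames.
csv_files <- list.files(data_dir, pattern = "\\.csv$", full.names = TRUE)
monthly <- lapply(csv_files, read.csv, stringsAsFactors = FALSE)

# Check that all files share the same columns before stacking them.
same_cols <- sapply(monthly, function(d) identical(names(d), names(monthly[[1]])))
stopifnot(all(same_cols))

# Stack the monthly data frames into one master data frame.
trips <- do.call(rbind, monthly)
nrow(trips)
```

The `X` and `X.1` columns in the summaries look like row indices written out by an earlier `write.csv()` call; passing `row.names = FALSE` when saving the merged file would avoid them.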

## Summary statistics for Original Cyclistic Data (2022/05-2023/04)
##        X             ride_id          rideable_type       started_at       
##  Min.   :      1   Length:5859061     Length:5859061     Length:5859061    
##  1st Qu.:1464766   Class :character   Class :character   Class :character  
##  Median :2929531   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :2929531                                                           
##  3rd Qu.:4394296                                                           
##  Max.   :5859061                                                           
##                                                                            
##    ended_at         start_station_name start_station_id   end_station_name  
##  Length:5859061     Length:5859061     Length:5859061     Length:5859061    
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##  end_station_id       start_lat       start_lng         end_lat     
##  Length:5859061     Min.   :41.64   Min.   :-87.84   Min.   : 0.00  
##  Class :character   1st Qu.:41.88   1st Qu.:-87.66   1st Qu.:41.88  
##  Mode  :character   Median :41.90   Median :-87.64   Median :41.90  
##                     Mean   :41.90   Mean   :-87.65   Mean   :41.90  
##                     3rd Qu.:41.93   3rd Qu.:-87.63   3rd Qu.:41.93  
##                     Max.   :42.07   Max.   :-87.52   Max.   :42.37  
##                                                      NA's   :5973   
##     end_lng       member_casual     
##  Min.   :-88.14   Length:5859061    
##  1st Qu.:-87.66   Class :character  
##  Median :-87.64   Mode  :character  
##  Mean   :-87.65                     
##  3rd Qu.:-87.63                     
##  Max.   :  0.00                     
##  NA's   :5973

Cleaning

In analyzing the historical bike trip data, one crucial step is cleaning the dataset to ensure accurate and reliable results. The first task involves replacing any missing values, denoted as NA, within each column of the dataset. However, it’s important to acknowledge that this replacement may itself introduce inaccuracies, particularly in variables like longitude and latitude, because the filled-in values are estimates rather than recorded positions.

To address the missing values, an appropriate approach would be to use methods such as mean imputation, median imputation, or interpolation, depending on the nature of the data. These techniques help fill in the missing values with estimated values based on the existing data points. However, it’s crucial to note that this process may introduce some degree of error, especially when dealing with location-based variables like longitude and latitude.
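As one illustration of the techniques named above, here is a minimal sketch of median imputation in base R. The coordinates are made-up toy values, and `impute_median` is a hypothetical helper; the text does not say which method was actually applied to each column.

```r
# Toy end coordinates with missing values (made-up numbers).
trips <- data.frame(
  end_lat = c(41.88, NA, 41.90, 41.93),
  end_lng = c(-87.66, -87.64, NA, -87.63)
)

# Hypothetical helper: replace NA in a numeric vector with the median
# of the observed (non-missing) values.
impute_median <- function(x) {
  x[is.na(x)] <- median(x, na.rm = TRUE)
  x
}

trips$end_lat <- impute_median(trips$end_lat)
trips$end_lng <- impute_median(trips$end_lng)

# Count remaining NA values per column (should all be zero).
colSums(is.na(trips))
```

The median is less sensitive than the mean to outliers such as the 0.00 coordinates visible in the original summary, but either choice places the imputed trip end at a location that was never recorded.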

Furthermore, during the data cleaning process, it’s important to identify and remove any duplicate records present in the dataset. Duplicates can skew the analysis and lead to misleading conclusions. By eliminating duplicates, the dataset becomes more representative of the actual bike trip data, ensuring accuracy in subsequent analyses.
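A minimal sketch of the duplicate check, assuming `ride_id` is meant to identify each trip uniquely (the toy rows below are made up):

```r
# Toy trips where ride "A2" appears twice (made-up data).
trips <- data.frame(
  ride_id = c("A1", "A2", "A2", "A3"),
  member_casual = c("member", "casual", "casual", "member"),
  stringsAsFactors = FALSE
)

# Count repeated ride IDs, then keep only the first occurrence of each.
n_dupes <- sum(duplicated(trips$ride_id))
trips_unique <- trips[!duplicated(trips$ride_id), ]

n_dupes
nrow(trips_unique)
```

Reporting `n_dupes` alongside the cleaned row count makes it easy to see how many records the step removed.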

Although the data was cleaned and missing values were replaced, it’s crucial to be aware that the process may have introduced inaccuracies, particularly in the variables related to geographic information. To limit their impact on the accuracy of the results, the cleaned dataset was validated and cross-checked, though it still needs further investigation.

## Summary Statistics for Cleaned Cyclistic Data (2022/05-2023/04)
##       X.1                X             ride_id          rideable_type     
##  Min.   :      1   Min.   :      1   Length:5859061     Length:5859061    
##  1st Qu.:1464766   1st Qu.:1464766   Class :character   Class :character  
##  Median :2929531   Median :2929531   Mode  :character   Mode  :character  
##  Mean   :2929531   Mean   :2929531                                        
##  3rd Qu.:4394296   3rd Qu.:4394296                                        
##  Max.   :5859061   Max.   :5859061                                        
##   started_at          ended_at           start_lat       start_lng     
##  Length:5859061     Length:5859061     Min.   :41.64   Min.   :-87.84  
##  Class :character   Class :character   1st Qu.:41.88   1st Qu.:-87.66  
##  Mode  :character   Mode  :character   Median :41.90   Median :-87.64  
##                                        Mean   :41.90   Mean   :-87.65  
##                                        3rd Qu.:41.93   3rd Qu.:-87.63  
##                                        Max.   :42.07   Max.   :-87.52  
##     end_lat         end_lng       member_casual      start_station_name
##  Min.   :41.65   Min.   :-88.11   Length:5859061     Length:5859061    
##  1st Qu.:41.87   1st Qu.:-87.68   Class :character   Class :character  
##  Median :41.89   Median :-87.65   Mode  :character   Mode  :character  
##  Mean   :41.87   Mean   :-87.72                                        
##  3rd Qu.:41.92   3rd Qu.:-87.63                                        
##  Max.   :42.06   Max.   :-87.53                                        
##  end_station_name   start_station_id   end_station_id    
##  Length:5859061     Length:5859061     Length:5859061    
##  Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character  
##                                                          
##                                                          
## 

Analysis

In this analysis, we will examine the differences in bike usage patterns between annual members and casual riders in Cyclistic. Additionally, we aim to understand the factors that may influence casual riders to purchase annual memberships and explore the potential of digital media in converting casual riders into long-term members.

The sections that follow summarize the key findings derived from the analysis.

By analyzing the provided data, we will gain insights into the usage patterns, preferences, and ride characteristics of annual members and casual riders, ultimately addressing the above business questions. The analysis will help inform decision-making processes and provide actionable recommendations to optimize marketing strategies and enhance member conversion rates.

Number of Members

Upon analyzing the data, a noteworthy finding emerged: more than half of the rides in the dataset were taken by annual members, indicating a higher share of usage by members than by casual riders. Because each row of the data is a ride rather than a rider, this measures share of rides, not headcount. The observation still suggests a positive trend towards long-term commitment and loyalty to the bike-sharing program.

The bar graph shows that members account for more than half of the program’s rides.

To visually represent this insight, a bar graph was created comparing the number of rides by annual members and by casual riders. The graph clearly shows that the members’ count exceeds that of casual riders, with members accounting for more than 50% of the total.

This finding highlights the significance of annual memberships in Cyclistic’s business model, as they contribute to a substantial portion of the program’s user base. Leveraging this strong membership base presents an opportunity for Cyclistic to focus on retaining existing members and converting casual riders into annual members, thereby fostering continued growth and financial stability.
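The comparison behind the bar graph can be sketched as a simple count and percentage by rider type. This is a minimal illustration on made-up values of the `member_casual` column; note that it counts rides, since each row of the data is one trip.

```r
# Toy rider-type column (made-up values; the real column has millions of rows).
member_casual <- c("member", "member", "casual", "member", "casual")

# Number of rides per rider type and each type's share in percent.
counts <- table(member_casual)
shares <- round(100 * prop.table(counts), 1)

counts
shares
```

Passing `counts` to `barplot()` would give a basic version of the bar graph described above.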

The subsequent sections of the analysis will delve deeper into the differences in usage patterns, ride lengths, preferred bike types, and other factors between annual members and casual riders. These insights will help address the remaining guiding questions and inform Cyclistic’s marketing strategies to maximize member conversion and improve overall customer satisfaction.

Bike Type Usage

The analysis of bike type usage in Cyclistic reveals interesting insights about the preferences of riders. Three bike types were considered: Classic, Docked, and Electric.

The graph above shows that Docked bikes are unappealing to riders. Is electric-based transportation the cool future?

The data indicates that the Docked bikes are not popular among riders, with usage limited primarily to casual riders. The graph illustrates that the number of rides taken on Docked bikes by casual riders is below 200,000, suggesting a relatively lower demand for this bike type.

On the other hand, among annual members, there is an almost equal split in bike type preference. Approximately 50% of member rides are on Classic bikes, while the remaining 50% are on Electric bikes. The analysis reveals that each of these bike types was used between 1.75 million and 2 million times by members, indicating balanced utilization.

Interestingly, casual riders show a stronger inclination towards Electric bikes compared to Classic bikes. This finding suggests that casual riders find the Electric bikes more appealing and suitable for their needs, potentially due to factors such as ease of use, convenience, or preference for a smoother riding experience.

Overall, Electric bikes emerge as the most popular bike type among Cyclistic users, followed closely by Classic bikes. Docked bikes, on the other hand, demonstrate limited popularity, primarily among casual riders.

Understanding the bike type preferences of members and casual riders allows Cyclistic to make informed decisions about bike fleet management and potential expansions. By focusing on the popular Electric and Classic bikes, Cyclistic can ensure an adequate supply of the preferred bike types to meet user demand and enhance customer satisfaction.
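The bike type comparison amounts to a two-way count of `rideable_type` against `member_casual`. A minimal sketch on made-up rows follows; the bike type labels used here are assumptions about how the column is coded.

```r
# Toy trips (made-up rows; the bike type labels are assumed codings).
trips <- data.frame(
  rideable_type = c("classic_bike", "electric_bike", "docked_bike",
                    "electric_bike", "classic_bike", "electric_bike"),
  member_casual = c("member", "member", "casual",
                    "casual", "member", "casual")
)

# Rows are bike types, columns are rider types, cells are ride counts.
usage <- table(trips$rideable_type, trips$member_casual)
usage

# Share of each bike type within each rider type (columns sum to 1).
prop.table(usage, margin = 2)
```

Normalizing within each rider type, as in the last line, makes the member and casual preferences comparable even though the two groups take different numbers of rides.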

Seasonal Ride Frequency

The analysis of seasonal ride frequency provides valuable insights into the riding patterns of Cyclistic users across different seasons. The data reveals the popularity of bike rides during specific seasons and sheds light on the preferences of both member types: annual members and casual riders.

Warm summers with a cool breeze really are the best.

Summer emerges as the most popular season for bike rides among both annual members and casual riders. This finding suggests a higher demand for cycling during the warm summer months, potentially driven by pleasant weather conditions and increased outdoor activities.

Following Summer, Autumn stands out as the next preferred season for bike rides, indicating sustained rider engagement as the weather transitions. Spring also shows a considerable number of rides, indicating a continued interest in cycling as the weather improves.

Winter, on the other hand, exhibits relatively lower ride frequencies compared to the other seasons. The data indicates that the number of rides during winter is significantly lower, particularly among casual riders. This observation suggests that the cold weather and potential weather-related challenges during winter may deter casual riders from utilizing Cyclistic bikes.

The analysis further highlights that rides in Autumn, Spring, and Winter are taken primarily by annual members, indicating their higher participation during these seasons. Summer, on the other hand, shows a more balanced distribution between annual members and casual riders, with casual riders showing a preference for riding during the warmer months.

Understanding the seasonal ride frequency enables Cyclistic to adapt its operations and marketing strategies accordingly. By allocating resources, such as bikes and station capacity, to meet the increased demand during the popular summer and autumn seasons, Cyclistic can enhance the user experience and ensure a smooth riding experience for both annual members and casual riders.
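The text does not say how trips were assigned to seasons. One plausible approach, assumed here, is to map the month of `started_at` to meteorological seasons (December to February as Winter, and so on). `season_of` is a hypothetical helper and the timestamps are made up.

```r
# Hypothetical helper: map a timestamp to a meteorological season.
# Assumes timestamps look like "YYYY-MM-DD HH:MM:SS".
season_of <- function(timestamps) {
  month <- as.integer(format(as.POSIXct(timestamps, tz = "UTC"), "%m"))
  # Lookup table indexed by month number (1 = January, ..., 12 = December).
  season <- c("Winter", "Winter", "Spring", "Spring", "Spring", "Summer",
              "Summer", "Summer", "Autumn", "Autumn", "Autumn", "Winter")
  season[month]
}

started_at <- c("2022-07-04 10:15:00", "2022-10-12 08:00:00",
                "2023-01-20 17:30:00", "2023-04-02 12:00:00")
seasons <- season_of(started_at)
seasons
```

With a season column in place, `table(seasons, member_casual)` would give the per-season ride counts by rider type that the discussion above relies on.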

Ride Length Analysis

The findings from the Ride Length Analysis provide interesting insights into the duration of rides taken by Cyclistic users, highlighting differences between casual riders and annual members.

Who in their right mind would use a shared bike for more than 500 hours?

The mean ride length for casual riders is nearly 30 minutes, which suggests that they tend to use the bikes for longer durations compared to annual members. In contrast, the mean ride length for members is less than half of that, indicating shorter rides on average. This disparity may be attributed to the different usage patterns and motivations of casual riders and annual members.

One intriguing observation is the significant gap between the maximum ride lengths of casual riders and members. It appears that at least one casual rider has kept a bike for more than 500 hours, and extreme values like this pull up the average ride length for casual riders. On the other hand, the maximum ride length for members is likely below a day of use, suggesting that members generally use the bikes for shorter durations.
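The ride length statistics can be sketched as the difference between `ended_at` and `started_at`, summarized by rider type. This is a minimal illustration on made-up timestamps; `ride_minutes` is a hypothetical column name.

```r
# Toy trips with start and end timestamps (made-up values).
trips <- data.frame(
  started_at = c("2022-07-04 10:00:00", "2022-07-04 11:00:00",
                 "2022-07-05 09:00:00", "2022-07-05 18:00:00"),
  ended_at   = c("2022-07-04 10:12:00", "2022-07-04 11:40:00",
                 "2022-07-05 09:08:00", "2022-07-05 18:20:00"),
  member_casual = c("member", "casual", "member", "casual")
)

# Parse the timestamps and compute each ride's length in minutes.
start <- as.POSIXct(trips$started_at, tz = "UTC")
end   <- as.POSIXct(trips$ended_at, tz = "UTC")
trips$ride_minutes <- as.numeric(difftime(end, start, units = "mins"))

# Mean and maximum ride length per rider type.
mean_minutes <- tapply(trips$ride_minutes, trips$member_casual, mean)
max_minutes  <- tapply(trips$ride_minutes, trips$member_casual, max)
mean_minutes
max_minutes
```

Because a few very long rides can dominate the mean, comparing it with `median` in the same `tapply()` call is a quick way to see how much the outliers matter.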

If I were to use one, I’d use it for short commutes.

The histogram depicting the frequency of ride durations further supports the findings. In the 0-100 minute range, where most rides fall, there is a peak between 2 and 20 minutes, aligning with the mean ride length for members. However, in the 0-1000 minute range, a different pattern emerges, with a higher frequency observed between 220 and 380 minutes. This indicates that there are occasional rides among both casual riders and members that extend to longer durations.
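The counts behind such a histogram can be reproduced by binning ride lengths with `cut()`. The sketch below uses made-up durations and 20-minute bins over the 0-100 minute range; the bin width is an assumption, not the one used for the figures.

```r
# Toy ride lengths in minutes (made-up values).
ride_minutes <- c(3, 7, 12, 18, 25, 45, 95, 240, 300)

# Restrict to the 0-100 minute range and bin into 20-minute intervals.
short_rides <- ride_minutes[ride_minutes <= 100]
breaks <- c(0, 20, 40, 60, 80, 100)
bins <- cut(short_rides, breaks = breaks, include.lowest = TRUE)

# Number of rides falling into each interval.
bin_counts <- table(bins)
bin_counts
```

Calling `hist(short_rides, breaks = breaks)` would draw the same counts as a plot.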

Madlads. Are they always doing a cycling marathon?

These findings imply that casual riders may have a different use case for Cyclistic bikes, potentially utilizing them for longer recreational rides or specific commuting needs. On the other hand, members tend to have shorter, more frequent rides, likely for daily commuting or shorter trips within the city.

Understanding the ride length patterns is crucial for Cyclistic to optimize its bike availability, maintenance schedules, and pricing plans. It allows them to cater to the different needs and preferences of casual riders and members and ensure a seamless riding experience for both user segments.

Identifying High-Traffic Stations

The analysis of high-traffic bike stations provides valuable insights into the stations that experience the highest demand among Cyclistic users. By examining the bar graph comparing the top 5 most used stations versus the top 5 least used stations, several key findings can be observed.

What can we say about the busy part of the city, right?

The station with the highest traffic is “Stony Island Ave & 63rd St.” This station attracts a significant number of users, with nearly 500,000 rides recorded by members and approximately 350,000 rides by casual riders. This indicates that the location is popular among both member types and experiences a high level of demand.

“Streeter Dr. & Grand Ave.” emerges as the next station with significant traffic, primarily used by casual riders. This suggests that the location may be situated in an area that appeals to casual riders, potentially due to nearby attractions, recreational areas, or commuting patterns.

Similarly, “DuSable Lake Shore Dr. & Monroe St.” also exhibits a high volume of casual rider traffic, indicating its popularity among this user segment. The station may be located in a prime area that attracts casual riders, such as near tourist destinations, parks, or residential neighborhoods.

On the other hand, “Kingsbury St. & Kinzie St.” stands out as a station mostly used by members. This suggests that the location may be situated in an area frequented by annual members, potentially reflecting its proximity to residential or business areas where members frequently commute or access their destinations.

Analyzing the top 5 least used stations reveals that these stations experience lower traffic, with frequencies below 40,000 rides. While these stations may have lower demand compared to the top 5 most used stations, they still contribute to the overall network and provide access to riders in specific areas with lesser usage.

These findings have significant implications for Cyclistic in terms of resource allocation, station management, and expansion strategies. Understanding the popularity of specific stations enables Cyclistic to ensure sufficient bike availability, optimize docking capacities, and prioritize maintenance efforts at high-traffic stations. Additionally, it can guide decisions on potential station expansion or relocation to areas with higher demand or untapped market potential.

Overall, this analysis provides valuable insights into the popularity of different bike stations, highlighting user preferences, and guiding operational decisions to enhance the overall user experience within the Cyclistic bike-sharing program.
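Ranking stations by traffic comes down to counting rides per station name and sorting. A minimal sketch on made-up rows follows; the last station name is a hypothetical placeholder.

```r
# Toy start stations (made-up rows; the last name is a placeholder).
start_station_name <- c("Streeter Dr. & Grand Ave.", "Kingsbury St. & Kinzie St.",
                        "Streeter Dr. & Grand Ave.", "Streeter Dr. & Grand Ave.",
                        "Kingsbury St. & Kinzie St.", "Hypothetical St. & Example Ave.")

# Count rides per station and sort from busiest to quietest.
station_counts <- sort(table(start_station_name), decreasing = TRUE)

# Busiest two and quietest one station in this toy example.
top_stations <- station_counts[seq_len(2)]
least_station <- station_counts[length(station_counts)]
top_stations
least_station
```

Splitting the counts by `member_casual` before sorting would show which stations are dominated by members and which by casual riders.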

Recommendations

Based on the analysis conducted, the following recommendations can be made to address the business questions posed earlier and leverage the insights gained from the data analysis:

  1. Differentiating Marketing Strategies: Given the differences observed in how annual members and casual riders use Cyclistic bikes, it is essential for Cyclistic to tailor its marketing strategies accordingly. To convert casual riders into annual members, targeted marketing campaigns can be designed to highlight the benefits and cost savings of becoming a member, such as exclusive discounts, priority access to bikes, or additional perks. By emphasizing the advantages specific to each user segment, Cyclistic can effectively influence casual riders to consider purchasing annual memberships.

    • Cyclistic can create targeted marketing strategies to convert casual riders into annual members. For example, they can launch personalized email campaigns that highlight the benefits of membership, such as unlimited rides, access to premium bikes, and exclusive discounts on partner services. They can also offer limited-time promotions, such as a discounted annual membership rate or a free trial period, to incentivize casual riders to try out the membership. Additionally, Cyclistic can collaborate with local businesses and organizations to offer special perks for members, such as discounts at popular cafes, fitness centers, or bike accessory shops. By tailoring marketing messages and offers to address the specific needs and interests of casual riders, Cyclistic can effectively persuade them to make the switch to annual membership.
  2. Enhancing Digital Media Presence: To leverage digital media in influencing casual riders to become members, Cyclistic should focus on strengthening its online presence and engaging with potential customers through various digital channels. This can include social media campaigns, targeted online advertisements, and interactive content that showcases the convenience, flexibility, and community aspects of being a Cyclistic member. By strategically utilizing digital media, Cyclistic can effectively reach and connect with its target audience, encouraging them to take the step towards becoming annual members.

    • To leverage digital media in influencing casual riders to become members, Cyclistic can implement various strategies. For instance, they can develop engaging content for their social media platforms, showcasing stories and testimonials from satisfied members who highlight the value they have gained from being part of Cyclistic. They can also collaborate with influential bloggers or social media influencers who have a strong following among the target audience of casual riders. By partnering with these influencers, Cyclistic can reach a wider audience and generate interest in annual memberships. Additionally, Cyclistic can optimize their website and mobile app to provide a seamless user experience, making it easy for casual riders to navigate through membership options, benefits, and the registration process. By actively engaging with the target audience through digital media channels, Cyclistic can increase awareness, build brand loyalty, and ultimately influence casual riders to become members.
  3. Optimizing Bike Availability and Station Expansion: The analysis of popular bike stations provides valuable insights into areas of high demand and user preferences. Cyclistic should consider optimizing bike availability at the busiest stations by ensuring an adequate supply of bikes and docking spaces. Additionally, expansion efforts can be focused on areas where there is a high concentration of casual riders or untapped potential for new members. By strategically expanding the station network, Cyclistic can cater to the needs of different user segments and enhance accessibility for both casual riders and annual members.

    • To optimize bike availability and station expansion, Cyclistic can utilize the data gathered from popular bike stations analysis. For example, they can strategically deploy additional bikes and docking spaces at high-traffic stations like “Stony Island Ave & 63rd St” to ensure a sufficient supply for both casual riders and members. This can involve closely monitoring the demand patterns at these stations and implementing dynamic redistribution strategies to maintain optimal bike availability throughout the day. Additionally, Cyclistic can identify areas with a high concentration of casual riders, such as neighborhoods with popular tourist attractions or recreational areas, and consider expanding their station network in those locations. By increasing the number of stations and bikes in areas of high demand, Cyclistic can improve accessibility for riders and provide a seamless experience, thereby encouraging more casual riders to consider becoming annual members.
  4. Enhancing User Experience: To further encourage casual riders to become annual members, Cyclistic should prioritize enhancing the overall user experience. This can be achieved through initiatives such as improving bike maintenance, providing user-friendly mobile applications for easy bike rental and return, and offering personalized recommendations based on user preferences. By continually focusing on improving the user experience, Cyclistic can create a positive and seamless journey for its riders, fostering loyalty and increasing the likelihood of casual riders transitioning into annual members.

    • To enhance the user experience and promote membership conversion, Cyclistic can focus on various initiatives. For instance, they can invest in regular bike maintenance and repairs to ensure that the bikes are in good working condition for riders. They can also develop a user-friendly mobile application that offers features such as real-time bike availability, GPS navigation to the nearest stations, and personalized recommendations based on the user’s riding history. Moreover, Cyclistic can gather feedback from riders and use it to implement improvements in their services, such as optimizing the bike rental and return process or introducing additional bike types based on user preferences. By continuously enhancing the user experience and addressing any pain points, Cyclistic can build trust, satisfaction, and loyalty among both casual riders and members, ultimately increasing the likelihood of casual riders transitioning into annual members.
  5. Ongoing Analysis and Iterative Approach: The analysis conducted in this case study provides a snapshot of the current state of Cyclistic’s user base. To continuously address the business questions and refine strategies, it is important for Cyclistic to adopt an iterative approach. Regular analysis of user behavior, feedback, and market trends should be conducted to identify evolving patterns and make data-driven decisions. By staying agile and adaptive, Cyclistic can continuously optimize its operations, marketing tactics, and services to meet the evolving needs and preferences of its riders.

    • To maintain a competitive edge and address evolving market dynamics, Cyclistic should adopt an iterative approach to their strategies. They can continue analyzing user behavior, market trends, and feedback from riders to identify emerging patterns and preferences. For instance, they can monitor changes in the ride length distribution and adjust their pricing plans or incentives accordingly. They can also track the effectiveness of their marketing campaigns and digital media presence through metrics such as conversion rates, engagement levels, and membership sign-ups. This ongoing analysis will enable Cyclistic to make data-driven decisions and adapt their strategies to meet the evolving needs and preferences of their target audience. By staying agile and proactive, Cyclistic can continually optimize their operations, marketing tactics, and services to drive membership growth and ensure a seamless and enjoyable experience for all users.

By implementing these recommendations, Cyclistic can effectively address the business questions of how annual members and casual riders use Cyclistic bikes differently, why casual riders would buy annual memberships, and how digital media can influence casual riders to become members. These steps will help Cyclistic maximize the number of annual members, drive future growth, and ensure a positive and engaging experience for all users of the Cyclistic bike-sharing program.

Disclaimer:

This document is not an official company document and is solely created for the purpose of the Google Data Analytics Capstone project submission. The information and analysis presented in this document are based on simulated data and are not related to any real company or organization.

The Cyclistic data used in this project has been made available under license from Motivate International Inc. However, it is important to note that the data used for analysis and interpretation in this document is simulated and does not reflect actual data from Cyclistic or Motivate International Inc.

The findings, recommendations, and insights presented in this document are for educational purposes only and should not be considered as professional advice or endorsed by any company or organization. It is essential to conduct further research and analysis using real data and consult with relevant stakeholders before making any business decisions.