Hello everyone. I have been working on the Google Data Analytics Professional Certificate through Coursera for a few months now. Along the way I have picked up a lot of interesting, insightful and, most importantly, useful information about the tools included in the bundle, such as Tableau, R programming, SQL and spreadsheets.
The curriculum also introduced me to standardized practices and gave me a universal framework to follow in every project, along with key data analyst terminology and processes. Below is a brief walkthrough of my thought process and the understanding I gained over time while completing the case study included with the course, using various tools, methods and strategies.
2 BACKGROUND INFORMATION
You are working for Cyclistic, a bike-sharing company. Bikes can be unlocked from one station and returned to any other station in the system anytime.
Cyclistic has flexible pricing plans: single-ride passes, full-day passes, and annual memberships. Customers who purchase single-ride or full-day passes are referred to as casual riders. Customers who purchase annual memberships are Cyclistic members.
The director of marketing believes the company’s future success depends on maximizing the number of annual memberships, as finance analysts have concluded that annual memberships are much more profitable than casual riders. She also believes that there is a good chance of converting casual riders to members as they are already aware of a Cyclistic program and have chosen it for their mobility needs.
3 ASK PHASE
The following questions will guide the direction of the future marketing program:
How do casual riders and annual members use Cyclistic services differently?
What is the estimated number of people who choose Cyclistic over other daily commute services?
Does Cyclistic offer its riders any unique value proposition over competitors in the same segment?
What has the past experience been with different marketing channels, such as digital, influencer or traditional marketing?
3.1 Key Takeaways
Identify business task.
The main objective is to design marketing strategies for converting casual riders to annual members by understanding how the two groups differ.
The differences will be assessed on certain parameters, such as preferred weekday, rides per week, duration spent weekly and monthly, most visited routes, etc.
Consider key Stakeholders.
Director of Marketing (Lily Moreno).
Marketing Analytics team.
Executive team.
3.2 Deliverable
Identify every pattern in which the two rider types differ.
Identify all related factors that keep casual riders from opting for the annual membership program.
4 PREPARE PHASE
Here, for this analysis, I will be using a public dataset that is made available on this page. The data has been made available by Motivate International Inc. under this license.
4.1 Key Task
Load each of the datasets month wise to maintain the consecutive order.
Download the datasets from a given online repository and then save them in a separate folder.
Identify the file format and check that the files are readable and writable.
Determine the credibility of the data by inspecting each dataset for vague or unwanted rows, and then sort accordingly.
Check the number of columns and their names in each dataset so that the datasets can be concatenated successfully.
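The loading and column checks listed above could be sketched roughly as follows. This is only a minimal illustration: the folder name, the file names and the two toy columns are assumptions made so the example runs on its own, not the layout of the real downloads.

```r
# Hypothetical setup: write two tiny monthly files into a temporary folder
# so the sketch runs without the real downloads.
data_dir <- file.path(tempdir(), "cyclistic_monthly")
dir.create(data_dir, showWarnings = FALSE)
write.csv(data.frame(ride_id = c("a1", "a2"), member_casual = c("member", "casual")),
          file.path(data_dir, "202101-tripdata.csv"), row.names = FALSE)
write.csv(data.frame(ride_id = "b1", member_casual = "casual"),
          file.path(data_dir, "202102-tripdata.csv"), row.names = FALSE)

# List the files in name order; a YYYYMM prefix keeps the months consecutive.
csv_files <- sort(list.files(data_dir, pattern = "\\.csv$", full.names = TRUE))

# Confirm every file is readable before loading it.
stopifnot(all(file.access(csv_files, mode = 4) == 0))

monthly <- lapply(csv_files, read.csv, stringsAsFactors = FALSE)

# Column names must match in every file for rbind() to succeed.
same_columns <- all(vapply(
  monthly,
  function(d) identical(names(d), names(monthly[[1]])),
  logical(1)
))
stopifnot(same_columns)

# Stack the monthly data frames into one dataset.
combined <- do.call(rbind, monthly)
nrow(combined)
```

Checking the column names before binding means a renamed or missing column in one month is caught early instead of surfacing as a confusing error during concatenation.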
4.2 Deliverable
Documenting the entire procedure involved in this phase.
A short brief for each operation performed for ease of understanding.
In this phase, I analyse the cleaned and processed dataset to answer questions that will help Cyclistic’s stakeholders steer their marketing campaign in a specific direction, which in turn should help retain existing members and convert other upcoming users to a subscription programme.
6.1 Key Task
Several comparisons were made to dive deep into the dataset and understand customers’ behaviour and preferences.
A series of analyses was performed to gather thorough details and to build a convincing basis for answering current and upcoming questions.
Aggregate several columns to explore various aspects of the dataset and what they reveal about how people perceive Cyclistic services.
Examine the dataset thoroughly using built-in R functions to complete the profiling of both customer categories.
Identify trends and relationships for each member type and their usage.
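The derived columns used throughout this analysis (total_mins_spent, day_of_journey and month) could be built from ride timestamps roughly as shown below. This is a sketch on toy data; the timestamp column names started_at and ended_at are assumptions, since the preparation code is not shown here.

```r
# Toy rides; started_at and ended_at are assumed column names.
rides <- data.frame(
  started_at = as.POSIXct(c("2021-07-03 17:05:00", "2021-07-05 08:10:00"), tz = "UTC"),
  ended_at   = as.POSIXct(c("2021-07-03 17:35:00", "2021-07-05 08:22:00"), tz = "UTC")
)

# Ride length stored as a difftime in hours, matching the outputs in this write-up.
rides$total_mins_spent <- difftime(rides$ended_at, rides$started_at, units = "hours")

# Weekday and month names taken from the departure time (locale dependent).
rides$day_of_journey <- weekdays(rides$started_at)
rides$month <- months(rides$started_at)

rides
```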
6.2 Deliverable
Established a summary using several useful functions such as ‘head()’, ‘filter()’, ‘count()’, ‘glimpse()’, etc.
The computations performed give a complete brief on how each member type uses the service, in terms of their choices and preferences.
Some statistical operations were also performed to summarise crucial factors that influence riders’ behaviour.
6.3 CODE CHUNK
Summary of total_mins_spent column in terms of mean, median, max and min.
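The summary described above could be computed along these lines. The values are toy durations standing in for the real total_mins_spent column, so the numbers are illustrative only.

```r
# Toy durations in hours standing in for the total_mins_spent column.
total_mins_spent <- as.difftime(c(0.10, 0.25, 0.50, 1.75, 23.99), units = "hours")

# Convert the difftime to plain numbers before summarising.
duration_hours <- as.numeric(total_mins_spent, units = "hours")

duration_summary <- c(
  mean   = mean(duration_hours),
  median = median(duration_hours),
  max    = max(duration_hours),
  min    = min(duration_hours)
)
duration_summary
```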
Storing the top ten routes for the member type in descending order of rides using head(), along with the average duration spent.
Code
popular_ride_route_member_top10 <- head(arrange(popular_ride_route_member, desc(number_of_rides)), n = 10)
Storing the top ten routes for the member type in descending order of rides using head(), along with the average distance travelled.
Code
popular_ride_distance_member_top10 <- head(arrange(popular_distance_travelled_member, desc(number_of_rides)), n = 10)
A glance at the newly obtained data.
Code
popular_ride_route_member_top10
# A tibble: 10 × 3
route number_of_rides average_dur…¹
<chr> <int> <drtn>
1 Ellis Ave & 60th St To Ellis Ave & 55th St 4082 0.10550989 h…
2 Ellis Ave & 55th St To Ellis Ave & 60th St 3652 0.11763394 h…
3 Ellis Ave & 60th St To University Ave & 57th St 3109 0.13157196 h…
4 University Ave & 57th St To Ellis Ave & 60th St 3010 0.12021290 h…
5 Calumet Ave & 33rd St To State St & 33rd St 1989 0.06755684 h…
6 State St & 33rd St To Calumet Ave & 33rd St 1954 0.09848971 h…
7 Loomis St & Lexington St To Morgan St & Polk St 1860 0.09136798 h…
8 Morgan St & Polk St To Loomis St & Lexington St 1653 0.11731717 h…
9 MLK Jr Dr & 29th St To State St & 33rd St 1422 0.21594116 h…
10 State St & 33rd St To MLK Jr Dr & 29th St 1392 0.21096384 h…
# … with abbreviated variable name ¹average_duration_minutes
Code
popular_ride_distance_member_top10
# A tibble: 10 × 3
route number_of_rides average_dis…¹
<chr> <int> <dbl>
1 Ellis Ave & 60th St To Ellis Ave & 55th St 4082 1.02
2 Ellis Ave & 55th St To Ellis Ave & 60th St 3652 1.02
3 Ellis Ave & 60th St To University Ave & 57th St 3109 0.716
4 University Ave & 57th St To Ellis Ave & 60th St 3010 0.716
5 Calumet Ave & 33rd St To State St & 33rd St 1989 0.654
6 State St & 33rd St To Calumet Ave & 33rd St 1954 0.653
7 Loomis St & Lexington St To Morgan St & Polk St 1860 0.868
8 Morgan St & Polk St To Loomis St & Lexington St 1653 0.867
9 MLK Jr Dr & 29th St To State St & 33rd St 1422 1.09
10 State St & 33rd St To MLK Jr Dr & 29th St 1392 1.09
# … with abbreviated variable name ¹average_distance
Extracting the top 10 most visited routes for the casual type through a set of processing steps.
Filtering the default dataset and creating a new one that contains only the casual member type.
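The filtering and aggregation step is not shown in this write-up, so the following is only a sketch of how a per-route table for casual riders might be built with dplyr. The toy rows and the name popular_ride_route_casual_sketch are assumptions for illustration; the route and column names mirror the outputs shown in this section.

```r
library(dplyr)

# Toy rides standing in for all_datasets_2021; durations are in hours.
rides <- tibble(
  member_casual      = c("casual", "casual", "casual", "member"),
  start_station_name = c("Shedd Aquarium", "Shedd Aquarium", "Millennium Park", "Clark St & Elm St"),
  end_station_name   = c("Millennium Park", "Millennium Park", "Shedd Aquarium", "Wells St & Elm St"),
  total_mins_spent   = c(0.50, 0.40, 0.70, 0.10)
)

# Keep casual riders, label each ride with its route, then summarise per route.
popular_ride_route_casual_sketch <- rides %>%
  filter(member_casual == "casual") %>%
  mutate(route = paste(start_station_name, "To", end_station_name)) %>%
  group_by(route) %>%
  summarise(
    number_of_rides = n(),
    average_duration_minutes = mean(total_mins_spent),
    .groups = "drop"
  ) %>%
  arrange(desc(number_of_rides))

popular_ride_route_casual_sketch
```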
Storing the top ten routes for the casual member type in descending order of rides using head(), along with the average duration spent.
Code
popular_ride_route_casual_top10 <- head(arrange(popular_ride_route_casual, desc(number_of_rides)), n = 10)
Storing the top ten routes for the casual member type in descending order of rides using head(), along with the average distance travelled.
Code
popular_ride_distance_casual_top10 <- head(arrange(popular_distance_travelled_casual, desc(number_of_rides)), n = 10)
A glance at the newly obtained data.
Code
popular_ride_route_casual_top10
# A tibble: 10 × 3
route number…¹ avera…²
<chr> <int> <drtn>
1 Streeter Dr & Grand Ave To Millennium Park 3309 0.6745…
2 Millennium Park To Streeter Dr & Grand Ave 2927 0.7638…
3 Shedd Aquarium To Streeter Dr & Grand Ave 2822 0.5835…
4 Lake Shore Dr & Monroe St To Streeter Dr & Grand Ave 2811 0.5459…
5 DuSable Lake Shore Dr & Monroe St To Streeter Dr & Grand Ave 2736 0.4910…
6 Streeter Dr & Grand Ave To Michigan Ave & Oak St 2478 0.4997…
7 Dusable Harbor To Streeter Dr & Grand Ave 2280 0.4592…
8 Michigan Ave & Oak St To Streeter Dr & Grand Ave 2008 0.5707…
9 Streeter Dr & Grand Ave To Theater on the Lake 1951 0.5780…
10 Shedd Aquarium To Millennium Park 1818 0.4921…
# … with abbreviated variable names ¹number_of_rides, ²average_duration_minutes
Code
popular_ride_distance_casual_top10
# A tibble: 10 × 3
route number…¹ avera…²
<chr> <int> <dbl>
1 Streeter Dr & Grand Ave To Millennium Park 3309 1.60
2 Millennium Park To Streeter Dr & Grand Ave 2927 1.60
3 Shedd Aquarium To Streeter Dr & Grand Ave 2822 2.80
4 Lake Shore Dr & Monroe St To Streeter Dr & Grand Ave 2811 1.32
5 DuSable Lake Shore Dr & Monroe St To Streeter Dr & Grand Ave 2736 1.32
6 Streeter Dr & Grand Ave To Michigan Ave & Oak St 2478 1.37
7 Dusable Harbor To Streeter Dr & Grand Ave 2280 0.593
8 Michigan Ave & Oak St To Streeter Dr & Grand Ave 2008 1.37
9 Streeter Dr & Grand Ave To Theater on the Lake 1951 4.09
10 Shedd Aquarium To Millennium Park 1818 1.70
# … with abbreviated variable names ¹number_of_rides, ²average_distance
Removing the intermediate datasets that were created only to fetch specific data.
Taking a glance at start_station_name by member_casual in descending order, limited to twenty entries.
Code
head(count(all_datasets_2021, start_station_name, member_casual, sort = TRUE), n = 20)
start_station_name member_casual n
1 Streeter Dr & Grand Ave casual 54225
2 Millennium Park casual 26847
3 Michigan Ave & Oak St casual 23614
4 Clark St & Elm St member 23200
5 Kingsbury St & Kinzie St member 22277
6 Wells St & Concord Ln member 22245
7 Shedd Aquarium casual 20070
8 Wells St & Elm St member 19663
9 Dearborn St & Erie St member 18259
10 Wells St & Concord Ln casual 18201
11 Wells St & Huron St member 17845
12 St. Clair St & Erie St member 17760
13 Theater on the Lake casual 17706
14 Broadway & Barry Ave member 16232
15 Clinton St & Madison St member 16058
16 Desplaines St & Kinzie St member 15775
17 Clark St & Armitage Ave member 15478
18 Wabash Ave & Grand Ave member 15432
19 Clark St & Lincoln Ave member 15312
20 Clark St & Lincoln Ave casual 15250
Taking a glance at end_station_name by member_casual in descending order, limited to twenty entries.
Code
head(count(all_datasets_2021, end_station_name, member_casual, sort = TRUE), n = 20)
end_station_name member_casual n
1 Streeter Dr & Grand Ave casual 57303
2 Millennium Park casual 28406
3 Michigan Ave & Oak St casual 25317
4 Clark St & Elm St member 23273
5 Wells St & Concord Ln member 22892
6 Kingsbury St & Kinzie St member 22462
7 Wells St & Elm St member 20218
8 Theater on the Lake casual 19393
9 Dearborn St & Erie St member 18918
10 Shedd Aquarium casual 18684
11 Wells St & Concord Ln casual 17939
12 St. Clair St & Erie St member 17587
13 Wells St & Huron St member 17508
14 Broadway & Barry Ave member 16815
15 Clinton St & Madison St member 16412
16 Green St & Madison St member 15949
17 Clark St & Lincoln Ave casual 15504
18 Lake Shore Dr & North Blvd casual 15474
19 DuSable Lake Shore Dr & North Blvd casual 15420
20 Clark St & Lincoln Ave member 15051
Taking a glance at the total count of rides that start and end at the same station, by filtering on start_station_name == end_station_name and counting by member_casual in descending order.
Code
head(count(filter(all_datasets_2021, start_station_name == end_station_name), member_casual, sort = TRUE), n = 20)
member_casual n
1 casual 38301
2 member 22798
Looking for further insight by applying the same filter on start_station_name == end_station_name and counting by start_station_name and member_casual in descending order, limited to twenty entries.
Code
head(count(filter(all_datasets_2021, start_station_name == end_station_name), start_station_name, member_casual, sort = TRUE), n = 20)
start_station_name member_casual n
1 Streeter Dr & Grand Ave casual 1458
2 Michigan Ave & Oak St casual 852
3 Millennium Park casual 773
4 Indiana Ave & Roosevelt Rd casual 552
5 Buckingham Fountain casual 476
6 Lake Shore Dr & Monroe St casual 474
7 Dearborn St & Erie St member 458
8 Shedd Aquarium casual 456
9 DuSable Lake Shore Dr & Monroe St casual 397
10 Dusable Harbor casual 392
11 Montrose Harbor casual 377
12 Michigan Ave & 8th St casual 375
13 New St & Illinois St casual 339
14 Columbus Dr & Randolph St casual 321
15 Theater on the Lake casual 305
16 Wabash Ave & 9th St casual 282
17 Adler Planetarium casual 271
18 Lakefront Trail & Bryn Mawr Ave casual 267
19 Fairbanks Ct & Grand Ave casual 258
20 Michigan Ave & Lake St casual 253
Getting a preview by counting start_station_name by day_of_journey, sorted in descending order and limited to twenty entries.
start_station_name day_of_journey n
1 Streeter Dr & Grand Ave Saturday 17218
2 Streeter Dr & Grand Ave Sunday 14957
3 Streeter Dr & Grand Ave Friday 9319
4 Wells St & Concord Ln Saturday 8560
5 Michigan Ave & Oak St Saturday 8508
6 Streeter Dr & Grand Ave Monday 8322
7 Millennium Park Saturday 7981
8 Clark St & Lincoln Ave Saturday 7956
9 Michigan Ave & Oak St Sunday 7688
10 Theater on the Lake Saturday 7449
11 Theater on the Lake Sunday 7380
12 Millennium Park Sunday 7180
13 Clark St & Elm St Saturday 6994
14 Clark St & Armitage Ave Saturday 6799
15 Wells St & Elm St Saturday 6645
16 Streeter Dr & Grand Ave Wednesday 6596
17 Wells St & Concord Ln Sunday 6591
18 Streeter Dr & Grand Ave Tuesday 6272
19 Streeter Dr & Grand Ave Thursday 6236
20 Wells St & Concord Ln Friday 5982
Taking a glance at the dataset by counting start_station_name and day_of_journey by month, sorted in descending order and limited to twenty entries.
start_station_name day_of_journey month n
1 Streeter Dr & Grand Ave Saturday July 3762
2 Streeter Dr & Grand Ave Sunday August 2685
3 Streeter Dr & Grand Ave Sunday June 2613
4 Streeter Dr & Grand Ave Saturday August 2572
5 Streeter Dr & Grand Ave Saturday September 2560
6 Streeter Dr & Grand Ave Sunday July 2459
7 Streeter Dr & Grand Ave Saturday June 2433
8 Streeter Dr & Grand Ave Saturday May 2392
9 Streeter Dr & Grand Ave Sunday May 2276
10 Streeter Dr & Grand Ave Sunday September 2196
11 Streeter Dr & Grand Ave Friday July 2136
12 DuSable Lake Shore Dr & North Blvd Saturday August 1882
13 DuSable Lake Shore Dr & North Blvd Sunday August 1844
14 Streeter Dr & Grand Ave Monday July 1819
15 Michigan Ave & Oak St Saturday July 1664
16 Streeter Dr & Grand Ave Friday August 1654
17 Theater on the Lake Sunday August 1636
18 Streeter Dr & Grand Ave Friday June 1632
19 Lake Shore Dr & North Blvd Sunday June 1588
20 Lake Shore Dr & North Blvd Saturday June 1578
Previewing and pulling out the peak hours from departure time along with weekdays, limited to twenty entries.
April August December February January July June March
273783 635053 170679 40206 79809 647972 565822 189425
May November October September
415840 247404 457316 588006
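The peak-hour extraction mentioned above is not shown, so here is one possible sketch in base R on toy data. The column name started_at is an assumption, and the weekday is kept as an ISO number so the example does not depend on the locale.

```r
# Toy departure times; started_at is an assumed column name.
started_at <- as.POSIXct(
  c("2021-07-03 17:05:00", "2021-07-03 17:40:00",
    "2021-07-03 18:15:00", "2021-07-05 08:10:00"),
  tz = "UTC"
)

# Hour of departure (0 to 23).
departure_hour <- as.integer(format(started_at, "%H"))
# ISO weekday number (1 = Monday, 7 = Sunday), independent of locale.
weekday_number <- as.integer(format(started_at, "%u"))

# Rides per weekday and hour, busiest combinations first.
hour_counts <- as.data.frame(table(weekday = weekday_number, hour = departure_hour))
hour_counts <- hour_counts[order(-hour_counts$Freq), ]
head(hour_counts, n = 20)
```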
Checking the split of rideable types by member type to get a sense of the distribution.
Code
head(count(all_datasets_2021, rideable_type, member_casual, sort = TRUE), n = 20)
rideable_type member_casual n
1 classic_bike member 1892943
2 classic_bike casual 1139876
3 electric_bike member 558484
4 electric_bike casual 473011
5 docked_bike casual 247000
6 docked_bike member 1
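To read these counts as a distribution, they can be converted to shares within each member type. The sketch below re-enters the six counts from the output above by hand, so it runs without the full dataset; the object names are illustrative.

```r
# Counts copied from the rideable_type by member_casual output above.
rideable_counts <- data.frame(
  rideable_type = c("classic_bike", "classic_bike", "electric_bike",
                    "electric_bike", "docked_bike", "docked_bike"),
  member_casual = c("member", "casual", "member", "casual", "casual", "member"),
  n = c(1892943, 1139876, 558484, 473011, 247000, 1)
)

# Cross-tabulate, then convert each member_casual column to shares.
counts_matrix <- xtabs(n ~ rideable_type + member_casual, data = rideable_counts)
share_by_member <- prop.table(counts_matrix, margin = 2)

# Percentage of each rideable type within casual riders and within members.
round(100 * share_by_member, 2)
```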
Glimpse of the maximum distance travelled by a rider using the max function.
Glimpse of the dataset by counting total_distance and month, left unsorted by count (sort = FALSE) and limited to twenty entries.
Code
head(count(all_datasets_2021, total_distance, month, sort = FALSE), n = 20)
total_distance month n
1 2.313061e-05 December 1
2 3.329401e-05 December 1
3 3.334390e-05 August 1
4 3.710650e-05 May 1
5 3.959635e-05 August 1
6 4.011380e-05 December 1
7 4.142217e-05 September 1
8 4.537711e-05 June 1
9 4.628751e-05 July 1
10 5.562038e-05 November 1
11 6.215622e-05 November 1
12 6.216103e-05 October 1
13 6.651749e-05 July 1
14 6.938482e-05 July 1
15 6.941745e-05 November 1
16 7.148174e-05 September 1
17 7.548542e-05 July 1
18 7.834864e-05 August 1
19 7.837490e-05 November 1
20 7.842346e-05 October 1
Code
head(count(all_datasets_2021, max(total_distance), month, sort = FALSE), n = 20)
max(total_distance) month n
1 33.83804 April 273783
2 33.83804 August 635053
3 33.83804 December 170679
4 33.83804 February 40206
5 33.83804 January 79809
6 33.83804 July 647972
7 33.83804 June 565822
8 33.83804 March 189425
9 33.83804 May 415840
10 33.83804 November 247404
11 33.83804 October 457316
12 33.83804 September 588006
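Every month shows the same value here because max(total_distance) inside count() is evaluated over the whole column before the rows are grouped. If a per-month maximum is wanted, one option is to group first and then summarise, as in this sketch on toy data (the tibble and its values are assumptions for illustration).

```r
library(dplyr)

# Toy data standing in for all_datasets_2021.
rides <- tibble(
  month = c("April", "April", "May", "May", "May"),
  total_distance = c(1.2, 3.4, 0.8, 7.9, 2.1)
)

# Grouping first makes max() run once per month instead of once overall.
max_distance_by_month <- rides %>%
  group_by(month) %>%
  summarise(
    max_distance = max(total_distance),
    n = n(),
    .groups = "drop"
  )

max_distance_by_month
```

The same pattern applies to max(total_mins_spent) by month and member type.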
Glimpse of the dataset based on the total_mins_spent column and month, followed by membership type, in descending order and limited to twenty entries.
Code
head(count(all_datasets_2021, total_mins_spent, month, sort = TRUE), n = 20)
total_mins_spent month n
1 0.12055556 hours August 643
2 0.10694444 hours August 641
3 0.11222222 hours August 636
4 0.14472222 hours August 636
5 0.10861111 hours August 633
6 0.11444444 hours August 628
7 0.10583333 hours August 627
8 0.09361111 hours August 622
9 0.12388889 hours August 620
10 0.11000000 hours August 619
11 0.09694444 hours August 618
12 0.11611111 hours August 618
13 0.09527778 hours September 617
14 0.11361111 hours August 617
15 0.10027778 hours August 616
16 0.11250000 hours September 616
17 0.12777778 hours August 616
18 0.11833333 hours August 615
19 0.11527778 hours July 614
20 0.12361111 hours August 614
Code
head(count(all_datasets_2021, max(total_mins_spent), month, member_casual, sort = TRUE), n = 20)
max(total_mins_spent) month member_casual n
1 23.99 hours July casual 336693
2 23.99 hours August member 321251
3 23.99 hours September member 317155
4 23.99 hours August casual 313802
5 23.99 hours July member 311279
6 23.99 hours June member 293113
7 23.99 hours October member 280003
8 23.99 hours June casual 272709
9 23.99 hours September casual 270851
10 23.99 hours May member 225037
11 23.99 hours May casual 190803
12 23.99 hours November member 181237
13 23.99 hours October casual 177313
14 23.99 hours April member 170269
15 23.99 hours December member 128318
16 23.99 hours March member 124581
17 23.99 hours April casual 103514
18 23.99 hours January member 66550
19 23.99 hours November casual 66167
20 23.99 hours March casual 64844
Checking Quantile values and performing winsorization on the dataset.
member_casual day_of_journey max(total_mins_spent) n
1 member Wednesday 23.9825 hours 385290
2 member Tuesday 23.9825 hours 375621
3 member Thursday 23.9825 hours 361814
4 member Friday 23.9825 hours 353772
5 member Saturday 23.9825 hours 343383
6 member Monday 23.9825 hours 333892
7 member Sunday 23.9825 hours 297656
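The quantile check and winsorization mentioned above are not shown in code, so the following is only a minimal base R sketch of the general technique. The toy durations and the 5% and 95% cut-offs are assumptions; the limits actually used in this analysis are not stated.

```r
# Toy durations in hours; the last value is an extreme ride.
duration_hours <- c(0.01, 0.10, 0.12, 0.15, 0.20, 0.25, 0.30, 0.45, 0.60, 23.99)

# Assumed cut-offs: the 5th and 95th percentiles.
limits <- quantile(duration_hours, probs = c(0.05, 0.95), names = FALSE)

# Clamp values outside the limits instead of dropping the rows.
winsorized_hours <- pmin(pmax(duration_hours, limits[1]), limits[2])

range(duration_hours)
range(winsorized_hours)
```

Winsorizing keeps every ride in the dataset while limiting the pull of extreme durations on means and plots.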
7 SHARE PHASE
7.1 Key Task
Establish the best way to share visualizations using R and Tableau.
Illustrate every detail and back it with an explanation.
Choose an adequate graph type to present the findings, with legends, labels and headings to improve readability and interpretation.
Ensure work is easily accessible.
7.2 Deliverable
Convey findings accompanied with illustration and explanation.
Put a short description of every visualization added under this phase.
7.3 VISUALIZATION
Here we look at the overall distribution of member types based on total duration spent on each weekday.
Code
ggplot(data = all_datasets_2021) +
  aes(x = day_of_journey, y = as.numeric(total_mins_spent), fill = member_casual) +
  geom_bar(stat = "identity", width = 0.5, position = "stack") +
  scale_fill_manual(values = c("red", "blue")) +
  labs(title = "Cyclistic Data: Week Day Vs Total Duration Spent",
       x = "Weekday of Journey", y = "Total Duration", fill = "Member Type") +
  # A complete theme resets earlier theme() tweaks, so it is applied first
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 60, hjust = 1))
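With stat = "identity", the plot above draws one stacked segment per ride, which can be slow on millions of rows. One possible alternative, sketched here on toy data, is to sum the durations per weekday and member type first and pass the smaller table to the same ggplot() call. The toy values and the name duration_by_day are assumptions.

```r
# Toy rides standing in for all_datasets_2021; durations are in hours.
rides <- data.frame(
  day_of_journey   = c("Saturday", "Saturday", "Saturday", "Monday", "Monday"),
  member_casual    = c("casual", "casual", "member", "member", "member"),
  total_mins_spent = c(0.50, 0.70, 0.20, 0.10, 0.15)
)

# One row per weekday and member type, holding the summed duration.
duration_by_day <- aggregate(
  total_mins_spent ~ day_of_journey + member_casual,
  data = rides,
  FUN = sum
)

duration_by_day
```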
Here we get a glimpse of rideable type usage by member type, distributed across weekdays.
Code
ggplot(data = all_datasets_2021) +
  aes(x = member_casual, fill = rideable_type) +
  # alpha is an opacity between 0 and 1
  geom_bar(width = 0.5, alpha = 1) +
  facet_wrap(~day_of_journey) +
  scale_fill_manual(values = c("black", "yellow", "green")) +
  labs(title = "Cyclistic Data: Member Type Preference With Rideable Type",
       x = "Member Type", y = "No. of Count", fill = "Rideable Type") +
  # A complete theme resets earlier theme() tweaks, so it is applied first
  theme_classic() +
  theme(axis.text.x = element_text(angle = 60, hjust = 1))
Glimpse of the member types based on total duration spent and total distance travelled.
Code
ggplot(data = all_datasets_2021) +
  aes(x = hms::as_hms(total_mins_spent), y = total_distance,
      shape = member_casual, color = member_casual) +
  scale_color_manual(name = "Member Type", values = c("black", "purple")) +
  scale_shape_manual(name = "Member Type", values = c(19, 17)) +
  geom_point(size = 3, alpha = 0.5, stroke = 1) +
  labs(title = "Cyclistic Data: Total Minutes spent with Total Distance Travelled",
       caption = "Comparing the Difference by Member Type",
       x = "Total Minutes Spent", y = "Total Distance Travelled") +
  theme_minimal()
Taking a look at the top 10 routes based on average duration spent by the casual member type.
Code
ggplot(data = popular_ride_route_casual_top10) +
  aes(x = hms::as_hms(average_duration_minutes), y = route, group = 1) +
  geom_line(color = "red") +
  geom_point(shape = 21, color = "black", fill = "blue", size = 6) +
  labs(title = "Top 10 Most Visited Stations", subtitle = "For Casual Riders",
       x = "Duration Spent (In Hours)", y = "Station Name",
       caption = "Popularity of stations is determined by no of rides & average distance travelled") +
  theme_minimal()
Taking a look at the top 10 routes based on average duration spent by the subscribed member type.
Code
ggplot(data = popular_ride_route_member_top10) +
  aes(x = hms::as_hms(average_duration_minutes), y = route, group = 1) +
  geom_line(color = "green") +
  geom_point(shape = 21, color = "black", fill = "brown", size = 6) +
  labs(title = "Top 10 Most Visited Stations", subtitle = "For Membership Riders",
       x = "Duration Spent (In Hours)", y = "Station Name",
       caption = "Popularity of stations is determined by no of rides & average distance travelled") +
  theme_minimal()
Taking a look at the top 10 routes based on average distance travelled by the casual member type.
Code
ggplot(data = popular_ride_distance_casual_top10) +
  aes(x = average_distance, y = route, group = 1) +
  geom_line(color = "orange") +
  geom_point(shape = 21, color = "black", fill = "darkgreen", size = 6) +
  labs(title = "Top 10 Most Visited Stations", subtitle = "For Casual Riders",
       x = "Average Distance Travelled", y = "Station Name",
       caption = "Popularity of stations is determined by no of rides & average time spent") +
  theme_light()
Taking a look at the top 10 routes based on average distance travelled by the subscribed member type.
Code
ggplot(data = popular_ride_distance_member_top10) +
  aes(x = average_distance, y = route, group = 1) +
  geom_line(color = "orange") +
  geom_point(shape = 21, color = "black", fill = "darkgreen", size = 6) +
  labs(title = "Top 10 Most Visited Stations", subtitle = "For Membership Riders",
       x = "Average Distance Travelled", y = "Station Name",
       caption = "Popularity of stations is determined by no of rides & average time spent") +
  theme_ipsum()
Overview of the distribution of member types across the weekdays of every month.
Code
ggplot(data = all_datasets_2021) +
  geom_col(aes(x = day_of_journey, y = month, fill = member_casual), position = "identity") +
  scale_fill_manual(values = c("blue", "red")) +
  labs(title = "Busiest WeekDay of The Month", subtitle = "Followed By Member Type",
       x = "Day of Journey", y = "Month", fill = "Member Type") +
  theme_bw()
Getting a snapshot of the total duration spent in every month, broken down by weekday.
Code
ggplot(data = all_datasets_2021) +
  geom_col(mapping = aes(x = month, y = as.numeric(as.difftime(total_mins_spent)), fill = day_of_journey)) +
  scale_fill_manual(values = c("blue", "orange", "darkgreen", "violet", "red", "brown", "black")) +
  labs(title = "Monthly time duration spent", subtitle = "Followed By Weekday",
       x = "Month", y = "Total Time Duration", fill = "Day of Journey") +
  theme(axis.text.x = element_text(angle = 40, vjust = 0.5, hjust = 1))
Overview of total distance travelled by each member type in every month.
Code
ggplot(data = all_datasets_2021) +
  aes(x = month, y = total_distance, fill = member_casual) +
  geom_bar(stat = "identity", position = "dodge") +
  scale_fill_manual(values = c("blue", "orange")) +
  labs(title = "Distance Travelled by Month", subtitle = "Followed by Member Type",
       x = "Month", y = "Total Distance", fill = "Member Type") +
  theme(axis.text.x = element_text(angle = 40, vjust = 0.5, hjust = 1),
        plot.background = element_rect(fill = "#D2B48C"))
Taking a glance at the rideable type distribution on each weekday of every month.
Code
ggplot(data = all_datasets_2021) +
  aes(x = day_of_journey, y = month) +
  geom_col(aes(fill = rideable_type), width = 0.5, position = "identity") +
  scale_fill_manual(values = c("red", "blue", "darkgreen")) +
  labs(title = "Number of Rides Per Week", subtitle = "Followed By Rideable Type",
       x = "Day Of Journey", y = "Month", fill = "Rideable Type") +
  # A complete theme resets earlier theme() tweaks, so it is applied first
  theme_classic() +
  theme(axis.text.x = element_text(angle = 50, vjust = 0.5, hjust = 1))
Overview of the monthly usage of each rideable type.
Code
ggplot(data = all_datasets_2021) +
  aes(x = month, fill = rideable_type) +
  stat_count(width = 0.5) +
  scale_fill_manual(values = c("darkblue", "maroon", "purple")) +
  labs(title = "Rideable type Popularity By Months",
       x = "Months", y = "Count", fill = "Rideable Type") +
  theme(axis.text.x = element_text(angle = 40, vjust = 0.5, hjust = 1))
8 MAP VISUALS.
Complete Map
Start Station Name
End Station Name
9 ACT PHASE
This crucial phase will be carried out by the executive team, Director of Marketing (Lily Moreno) and the Marketing Analytics team based on the above analysis made.
10 CONCLUSION
Cyclistic has more subscribed members than casual riders; based on the overall analysis, the two groups differ by only 13.76%.
Casual riders mostly prefer to ride with Cyclistic around the weekend: Friday, Saturday and Sunday.
Subscribed members generally ride on weekdays, Monday to Thursday, as well as on the weekend.
Across the twelve months of data, casual riders took longer rides than subscribed members, and July recorded the highest time spent.
The distribution of rideable types is heavily unbalanced: docked bikes were used almost exclusively by casual riders, with only one docked-bike ride recorded for a subscribed member in the entire year.
The month-wise distribution shows that Saturdays and Sundays attract more riders than the other weekdays, and this holds for every month.
Peak hours are mostly from 5 o’clock to 7 o’clock in the evening, for both departure and arrival and for both member types.
Of the three rideable options, the classic bike is the most used, followed by the electric bike and then the docked bike, which sees almost no use by subscribed members.
11 DELIVERABLE
The detailed analysis shows that we need to cater to two different sets of casual riders: first, those who use Cyclistic services regularly and purposefully, and second, those who use them only in an emergency or once in a while.
To convert the maximum number of casual riders, personalized ads and campaigns are essential.
It is also important to present the service with multiple unique value propositions, tailored to the type of customer it is promoted to.
It is better to promote benefits that matter to most people today, such as health, and then break health down into specific themes, such as protection from cancer, arthritis, stroke, etc., for specific groups of casual riders.
First, introduce a one-month trial plan with all the yearly perks and benefits, and then start charging on a monthly or annual basis. Also, use the oldest trick in the book: collect payment information in advance and enable auto-payment on a monthly, quarterly or yearly basis.
Remove the one-day subscription plan and introduce a minimum monthly plan with all the benefits that an annual member gets; price it higher than the annual plan and apply discounts in a descending manner.
To enable the auto-debit feature, assure users that their data is absolutely safe with Cyclistic, and add a tagline such as “We are never hard on commitments”.
The overall distribution of bikes is unbalanced: the docked bike is hardly considered by members and is mostly used by casual riders due to unavailability. The company should consider converting all rideable types to electric bikes with different specifications and variants, as the future is electric.
To make the subscription plan even more attractive, Cyclistic could introduce additional gear under its own brand or partner with a renowned brand.
Try hosting or sponsoring big events that promote Cyclistic and its services, and introduce flash sales on special events or occasions, as this may help attract more customers.
12 RESOURCES
Stack Overflow.
RStudio and Kaggle community.
Dataset was made available by Motivate International Inc.