In 2016, Cyclistic launched a successful bike-share offering. Since then, the program has grown to a fleet of 5,824 bicycles that are geotracked and locked into a network of 692 stations across Chicago. The bikes can be unlocked from one station and returned to any other station in the system anytime.
Cyclistic sets itself apart by offering reclining bikes, hand tricycles, and cargo bikes, making bike-share more inclusive to people with disabilities and riders who can’t use a standard two-wheeled bike. The majority of riders opt for traditional bikes; about 8% of riders use the assistive options. Cyclistic users are more likely to ride for leisure, but about 30% use the bikes to commute to work each day.
Cyclistic has classified its customers broadly into 2 categories: Casual Riders and Members. Customers who purchase single-ride or full-day passes are referred to as Casual Riders. Customers who purchase annual memberships are Cyclistic Members.
Cyclistic’s finance analysts have concluded that annual members are much more profitable than casual riders.
The current position is of a junior data analyst working on the marketing analyst team at Cyclistic, a bike-share company in Chicago. The director of marketing believes the company’s future success depends on maximizing the number of annual memberships. Therefore, the team wants to understand how casual riders and annual members use Cyclistic bikes differently.
Data Analysis comprises of six phases: Ask, Prepare, Process, Analyse, Share and Act.
In this phase, in order to get a clear picture of the business task that needs to be executed, we need to ask questions to the Stakeholders and get a clear understanding of their interests and expectations from the Project. The primary task for this project is as given earlier:To understand the difference between Casual Riders and Members
First we load the libraries as per below:
library(tidyverse)
library(readr)
library(dplyr)
library(lubridate)
library(shiny)
library(shinydashboard)
library(fontawesome)
library(plotly)
library(DT)
library(janitor)
library(devtools)
library(urbnmapr)
library(tibble)
library(sf)
library(rvest)
library(viridis)
library(ggrepel)
library(ggthemes)
library(hms)
library(RColorBrewer)
library(classInt)
library(dots)
library(cartography)
library(mapsf)
library(scales)
library(ggpubr)
library(Hmisc)
library(gridExtra)
library("grid")
library("ggplotify")
library(effects)
library(ggplot2)
library(ggtext)
library(maps)
library(shinycssloaders)
library(plotly)
library(rlang)
library(ggcorrplot)
library(gghighlight)
library(webr)
library(grid)
library(reshape)
library(conflicted)
knitr::opts_chunk$set(echo = TRUE)
Now conflicts occur because of the presence of same package within multiple libraries. Hence for avoiding conflicts, we write the below code:
conflicts_prefer(DT::dataTableOutput)
conflicts_prefer(dplyr::filter)
conflicts_prefer(plotly::layout)
conflicts_prefer(dplyr::rename)
Now we first load the raw data. The raw data is released on a monthly basis and is made available under a shareable public license. For this case-study, we will work on the latest possible data of 1 year i.e. from May 2023-April 2024. Hence, in order to work on this data, we need to load 12 raw files as below :
May <- read_csv("C:/Users/nevil/Desktop/Google Case_Studies/Case-Study 1/202305-divvy-tripdata/202305-divvy-tripdata.csv")
To get a view of this data, we need to execute the code as given below:
tibble(May)
## # A tibble: 604,827 × 13
## ride_id rideable_type started_at ended_at
## <chr> <chr> <dttm> <dttm>
## 1 0D9FA920C3062031 electric_bike 2023-05-07 19:53:48 2023-05-07 19:58:32
## 2 92485E5FB5888ACD electric_bike 2023-05-06 18:54:08 2023-05-06 19:03:35
## 3 FB144B3FC8300187 electric_bike 2023-05-21 00:40:21 2023-05-21 00:44:36
## 4 DDEB93BC2CE9AA77 classic_bike 2023-05-10 16:47:01 2023-05-10 16:59:52
## 5 C07B70172FC92F59 classic_bike 2023-05-09 18:30:34 2023-05-09 18:39:28
## 6 2BA66385DF8F815A classic_bike 2023-05-30 15:01:21 2023-05-30 15:17:00
## 7 31EFCCB05F12D8EF docked_bike 2023-05-09 14:13:40 2023-05-09 14:47:20
## 8 71DFF834E1D3CE0B classic_bike 2023-05-06 16:47:22 2023-05-06 16:52:13
## 9 2117485899B4CEA4 classic_bike 2023-05-15 12:47:26 2023-05-15 13:00:05
## 10 811149F69AAE82DD electric_bike 2023-05-19 05:44:26 2023-05-19 05:47:24
## # ℹ 604,817 more rows
## # ℹ 9 more variables: start_station_name <chr>, start_station_id <chr>,
## # end_station_name <chr>, end_station_id <chr>, start_lat <dbl>,
## # start_lng <dbl>, end_lat <dbl>, end_lng <dbl>, member_casual <chr>
Similarly, we can load the other files as well
June <- read_csv("C:/Users/nevil/Desktop/Google Case_Studies/Case-Study 1/202306-divvy-tripdata/202306-divvy-tripdata.csv")
July <- read_csv("C:/Users/nevil/Desktop/Google Case_Studies/Case-Study 1/202307-divvy-tripdata/202307-divvy-tripdata.csv")
Aug <- read_csv("C:/Users/nevil/Desktop/Google Case_Studies/Case-Study 1/202308-divvy-tripdata/202308-divvy-tripdata.csv")
Sep <- read_csv("C:/Users/nevil/Desktop/Google Case_Studies/Case-Study 1/202309-divvy-tripdata/202309-divvy-tripdata.csv")
Oct <- read_csv("C:/Users/nevil/Desktop/Google Case_Studies/Case-Study 1/202310-divvy-tripdata/202310-divvy-tripdata.csv")
Nov <- read_csv("C:/Users/nevil/Desktop/Google Case_Studies/Case-Study 1/202311-divvy-tripdata/202311-divvy-tripdata.csv")
Dec <- read_csv("C:/Users/nevil/Desktop/Google Case_Studies/Case-Study 1/202312-divvy-tripdata/202312-divvy-tripdata.csv")
Jan <- read_csv("C:/Users/nevil/Desktop/Google Case_Studies/Case-Study 1/202401-divvy-tripdata/202401-divvy-tripdata.csv")
Feb <- read_csv("C:/Users/nevil/Desktop/Google Case_Studies/Case-Study 1/202402-divvy-tripdata/202402-divvy-tripdata.csv")
March <- read_csv("C:/Users/nevil/Desktop/Google Case_Studies/Case-Study 1/202403-divvy-tripdata/202403-divvy-tripdata.csv")
April <- read_csv("C:/Users/nevil/Desktop/Google Case_Studies/Case-Study 1/202404-divvy-tripdata/202404-divvy-tripdata.csv")
Now we need to aggregate the Data. For aggregating the data we take the help of rbind() function
rawdata_complete <- rbind(May, June, July, Aug, Sep, Oct, Nov, Dec, Jan, Feb, March, April)
Now we have the complete data to work with so we proceed with the Process Phase
In this phase, we need to clean the raw data so as to remove any irregularities by selecting and filtering out unwanted or missing data. We can also rename certain elements so as to make it look more meaningful and concise.
rawdata_complete <- rawdata_complete %>% rename(
`Bike Type` = rideable_type,
start = started_at,
end = ended_at,
`Rider Type` = member_casual
)
We take the help of recode() function to change the names of some elements within the columns
rawdata_complete <- rawdata_complete %>% mutate(`Rider Type` = recode(`Rider Type`,"member" = "Member",
"casual" = "Casual")) %>%
mutate(`Bike Type` = recode(`Bike Type`,
"classic_bike" = "Classic",
"electric_bike" = "Electric",
"docked_bike" = "Docked"))
Now after we have cleaned the data, we need to start with our analysis.
In this phase, we create new variables which can be useful for our Analysis. mutate() function can be used for creation of such new variables.
First, we will try to calculate the Station Distance between 2 stations. Now for calculating the geographical distance between 2 coordinates, we can use the Haversine formula. In this, we first need to calculate Haversine Angle from which we can get the distance in kms. Reference: https://www.youtube.com/watch?v=nsVsdHeTXIE
rawdata <- rawdata_complete %>% mutate(`Haversine Angle` = ((1 - cos(end_lat - start_lat))/2) + cos(start_lat) * cos(end_lat) * ((1 - cos(end_lng - start_lng))/2))
rawdata <- rawdata %>% mutate(`Station Distance` = (acos(1 - 2 * `Haversine Angle`) * pi/180) * 6371)
Now we select those variables which are important for our analysis and remove the remaining ones
raw_mod <- rawdata %>% select(`Bike Type`, start, end, start_station_name, end_station_name, `Rider Type`, `Station Distance`)
We add Ride-Length by taking a difference of starting and ending time
raw_mod <- raw_mod %>% mutate(ride_length = as_hms(difftime(end, start)))
For our analysis, it would be better to convert it into absolute value
raw_mod <- raw_mod %>% mutate(`Ride Length` = as.numeric(abs(ride_length)/60)) %>%
mutate_if(is.numeric, round, digits = 3)
We can modify the start variable (which gives us the starting time of the ride) to further create new variables which can really help us extract a lot of information pertaining to the time of the ride.
raw_mod <- raw_mod %>% mutate(date = as.Date(start))
raw_mod <- raw_mod %>% mutate(Month = month(ymd(date)),
Weekday = weekdays(ymd(date)),
`Day of Year` = yday(date),
`Day of Month` = mday(date),
`Start Time (in Hrs)` = hour(start))
We can also further calculate week of year from date. For a better outlook in the UI, it is better to give names to months in place of indicating them just in the form of numbers.
raw_mod <- raw_mod %>% mutate(`Week of Year` = week(date)) %>%
mutate(`Month Name` = case_when(
Month %in% c('1') ~ 'Jan 2024',
Month %in% c('2') ~ 'Feb 2024',
Month %in% c('3') ~ 'March 2024',
Month %in% c('4') ~ 'April 2024',
Month %in% c('5') ~ 'May 2023',
Month %in% c('6') ~ 'June 2023',
Month %in% c('7') ~ 'July 2023',
Month %in% c('8') ~ 'Aug 2023',
Month %in% c('9') ~ 'Sep 2023',
Month %in% c('10') ~ 'Oct 2023',
Month %in% c('11') ~ 'Nov 2023',
Month %in% c('12') ~ 'Dec 2023'))
Months can be further converted to Seasons for identifying seasonal trends. Since, the firm is situated in Chicago, US, there are primarily 4 seasons according to which we need to carry out further analysis. Those are Winter, Spring, Summer and Fall.
raw_mod <- raw_mod %>% mutate(Season = case_when(
Month %in% c('12', '1', '2') ~ 'Winter',
Month %in% c('3', '4', '5') ~ 'Spring',
Month %in% c('6', '7', '8') ~ 'Summer',
Month %in% c('9', '10', '11') ~ 'Fall'
))
We will filter rows wherein the ride length is less than 1 minute. Since, rides with ride length less than 1 minute cannot be counted as rides as some times such rides get recorded in the system even when the rider is just testing the locking and unlocking of the system.
yeardata <- raw_mod %>% filter(`Ride Length` > 1.000 | `Ride Length` == 1.000)
yeardata <- yeardata %>% rename(
Station = start_station_name
)
Now we create another variable called as “Route” wherein we associate a number for every unique route connecting 2 bike stations in Chicago
routes <- yeardata %>% select(Station, end_station_name)
Now we check if there is any na present in the columns
colSums(is.na(routes))
## Station end_station_name
## 843175 883058
Now since, na is present in the columns we filter those na values
routes <- routes %>% filter(!is.na(Station))
routes <- routes %>% filter(!is.na(end_station_name))
Again we check for na values
colSums(is.na(routes))
## Station end_station_name
## 0 0
Now since there are no na values present, we now extract the unique rows using distinct() function.
routes <- routes %>% distinct()
After separating out the distinct rows, we order the rows w.r.t Station Names and then create a separate column and populate it with row numbers so that it becomes easy to identify them.
routes <- routes[order(routes$Station),]
routes <- routes %>% mutate(Route = paste0("R", row_number()))
A view of the routes table is as given below
head(routes)
## # A tibble: 6 × 3
## Station end_station_name Route
## <chr> <chr> <chr>
## 1 2112 W Peterson Ave Western Ave & Gunnison St R1
## 2 2112 W Peterson Ave Western Ave & Granville Ave R2
## 3 2112 W Peterson Ave Hoyne Ave & Balmoral Ave R3
## 4 2112 W Peterson Ave Clark St & Schreiber Ave R4
## 5 2112 W Peterson Ave Clark St & Bryn Mawr Ave R5
## 6 2112 W Peterson Ave Oakley Ave & Touhy Ave R6
Now we need to create a similar “Route” variable in our main dataset i.e. yeardata. For creating such variable, we can take the help of merge() function as follows:
yeardata2 <- yeardata %>% filter(!is.na(Station))
yeardata2 <- yeardata2 %>% filter(!is.na(end_station_name))
yearroute <- merge(yeardata2, routes, by = c("Station","end_station_name"))
Now our main dataset after creation of new variables looks like this:
head(yearroute)
## Station end_station_name Bike Type start
## 1 2112 W Peterson Ave 2112 W Peterson Ave Electric 2024-04-21 14:11:46
## 2 2112 W Peterson Ave 2112 W Peterson Ave Docked 2023-08-18 09:13:58
## 3 2112 W Peterson Ave 2112 W Peterson Ave Classic 2023-08-10 16:38:51
## 4 2112 W Peterson Ave 2112 W Peterson Ave Electric 2023-06-14 19:03:41
## 5 2112 W Peterson Ave 2112 W Peterson Ave Electric 2023-06-14 19:06:01
## 6 2112 W Peterson Ave 2112 W Peterson Ave Electric 2023-07-23 05:13:33
## end Rider Type Station Distance ride_length Ride Length
## 1 2024-04-21 14:12:58 Casual 0.003 00:01:12 1.200
## 2 2023-08-18 15:57:25 Casual 0.000 06:43:27 403.450
## 3 2023-08-10 17:25:11 Casual 0.000 00:46:20 46.333
## 4 2023-06-14 19:04:59 Member 0.002 00:01:18 1.300
## 5 2023-06-14 19:07:22 Member 0.003 00:01:21 1.350
## 6 2023-07-23 05:14:33 Casual 0.003 00:01:00 1.000
## date Month Weekday Day of Year Day of Month Start Time (in Hrs)
## 1 2024-04-21 4 Sunday 112 21 14
## 2 2023-08-18 8 Friday 230 18 9
## 3 2023-08-10 8 Thursday 222 10 16
## 4 2023-06-14 6 Wednesday 165 14 19
## 5 2023-06-14 6 Wednesday 165 14 19
## 6 2023-07-23 7 Sunday 204 23 5
## Week of Year Month Name Season Route
## 1 16 April 2024 Spring R17
## 2 33 Aug 2023 Summer R17
## 3 32 Aug 2023 Summer R17
## 4 24 June 2023 Summer R17
## 5 24 June 2023 Summer R17
## 6 30 July 2023 Summer R17
We will now remove certain columns which are not useful for our work
yearroute <- yearroute %>% select(-start, -end, -ride_length, -date, -Month)
Since the dataset is very huge, we would be selecting only a 1/3 sample of the data so that it does not create any memory issues at the time of project deployment on Shinyapps.io
yearmod <- yearroute %>% sample_frac(0.33)
Now we need to prepare some data for generating Choropleth maps. For this we first need to download the shape files of Chicago. We can get those files here: https://data.cityofchicago.org/Transportation/Divvy-Bicycle-Stations/bbyy-e7gq/about_data.The files they keep on updating regularly.
From this website, we can extract the spatial data in the form of Shape file in csv format.
Data will look as given below:
shape_data <- read_csv("C:/Users/nevil/Desktop/Google Case_Studies/Case-Study 1/Datasets/other files/Divvy_Bicycle_Stations_20240605.csv")
## Rows: 948 Columns: 9
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (5): ID, Station Name, Short Name, Status, Location
## dbl (4): Total Docks, Docks in Service, Latitude, Longitude
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(shape_data)
## # A tibble: 6 × 9
## ID `Station Name` `Short Name` `Total Docks` `Docks in Service` Status
## <chr> <chr> <chr> <dbl> <dbl> <chr>
## 1 a3ad5c90-… Dorchester Av… KA1503000069 15 15 In Se…
## 2 157110506… Narragansett … <NA> 9 9 In Se…
## 3 a3b2af02-… MLK Jr Dr & 8… 586 11 11 In Se…
## 4 a3a57527-… Sedgwick St &… TA1307000062 27 27 In Se…
## 5 159404640… California & … <NA> 9 9 In Se…
## 6 a3b13836-… Kenton Ave & … 537 11 11 In Se…
## # ℹ 3 more variables: Latitude <dbl>, Longitude <dbl>, Location <chr>
We can also download shape file of Community Areas of Chicago. These will provide a community area map of Chicago in the background which can help us identify the areas where high density of riders are located.
The shape file of community areas can be downloaded from here: https://data.cityofchicago.org/Facilities-Geographic-Boundaries/Boundaries-Community-Areas-current-/cauq-8yn6
boundaries <- read_sf("C:/Users/nevil/Desktop/Cyclistic(Reduced Data)/Boundaries - Community Areas (current)")
head(boundaries)
## Simple feature collection with 6 features and 9 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -87.7069 ymin: 41.79448 xmax: -87.58001 ymax: 41.99076
## Geodetic CRS: WGS 84
## # A tibble: 6 × 10
## area area_num_1 area_numbe comarea comarea_id community perimeter shape_area
## <dbl> <chr> <chr> <dbl> <dbl> <chr> <dbl> <dbl>
## 1 0 35 35 0 0 DOUGLAS 0 46004621.
## 2 0 36 36 0 0 OAKLAND 0 16913961.
## 3 0 37 37 0 0 FULLER PA… 0 19916705.
## 4 0 38 38 0 0 GRAND BOU… 0 48492503.
## 5 0 39 39 0 0 KENWOOD 0 29071742.
## 6 0 4 4 0 0 LINCOLN S… 0 71352328.
## # ℹ 2 more variables: shape_len <dbl>, geometry <MULTIPOLYGON [°]>
Now we will first filter out Station, Bike and Rider Type columns from yearroute dataset and rename them so as to match with the Shape file data.
shape_mod <- yearmod %>% select(Station, `Rider Type`, `Bike Type`)
shape_mod <- yearmod %>%
rename(
`Station Name` = Station,
`Type of Rider` = `Rider Type` # Rider Type modified to Type of Rider so as to match with the UI text
)
shape_mod <- shape_mod %>%
group_by(`Type of Rider`, `Bike Type`,`Station Name`) %>%
count(`Station Name`)
head(shape_mod)
## # A tibble: 6 × 4
## # Groups: Type of Rider, Bike Type, Station Name [6]
## `Type of Rider` `Bike Type` `Station Name` n
## <chr> <chr> <chr> <int>
## 1 Casual Classic 2112 W Peterson Ave 21
## 2 Casual Classic 63rd St Beach 135
## 3 Casual Classic 900 W Harrison St 581
## 4 Casual Classic Aberdeen St & Jackson Blvd 698
## 5 Casual Classic Aberdeen St & Monroe St 505
## 6 Casual Classic Aberdeen St & Randolph St 546
colSums(is.na(shape_mod))
## Type of Rider Bike Type Station Name n
## 0 0 0 0
Since, we dont have any na values present so we can now merge the shape_mod data with shape_data. We need to do this so that we can get the necessary spatial info from shape_data for every station present in shape_mod.
mytable = merge(shape_data, shape_mod, by = 'Station Name', all=F)
head(mytable)
## Station Name ID Short Name
## 1 2112 W Peterson Ave a3af2216-a135-11e9-9cda-0a87ae2ba916 KA1504000155
## 2 2112 W Peterson Ave a3af2216-a135-11e9-9cda-0a87ae2ba916 KA1504000155
## 3 2112 W Peterson Ave a3af2216-a135-11e9-9cda-0a87ae2ba916 KA1504000155
## 4 2112 W Peterson Ave a3af2216-a135-11e9-9cda-0a87ae2ba916 KA1504000155
## 5 2112 W Peterson Ave a3af2216-a135-11e9-9cda-0a87ae2ba916 KA1504000155
## 6 21st St & Pulaski Rd 1931696364106218226 21331
## Total Docks Docks in Service Status Latitude Longitude
## 1 11 11 In Service 41.99118 -87.68359
## 2 11 11 In Service 41.99118 -87.68359
## 3 11 11 In Service 41.99118 -87.68359
## 4 11 11 In Service 41.99118 -87.68359
## 5 11 11 In Service 41.99118 -87.68359
## 6 15 15 In Service 41.85344 -87.72487
## Location Type of Rider Bike Type n
## 1 POINT (-87.683593 41.991178) Casual Classic 21
## 2 POINT (-87.683593 41.991178) Casual Electric 36
## 3 POINT (-87.683593 41.991178) Member Electric 47
## 4 POINT (-87.683593 41.991178) Member Classic 69
## 5 POINT (-87.683593 41.991178) Casual Docked 4
## 6 POINT (-87.724874 41.853441) Member Electric 1
Now for every station we have spatial data as well as bike type and rider type in one single table. Now we need to make different subsets of table based on rider-bike combination so that we can get different choropleth maps for every combination.
mytable_casual_classic <- mytable %>% filter(`Type of Rider` == "Casual") %>% filter(`Bike Type` == "Classic") %>% mutate(Perc_n = (n/sum(n))*100)
head(mytable_casual_classic)
## Station Name ID Short Name
## 1 2112 W Peterson Ave a3af2216-a135-11e9-9cda-0a87ae2ba916 KA1504000155
## 2 63rd St Beach a3a547b8-a135-11e9-9cda-0a87ae2ba916 15491
## 3 900 W Harrison St a3a56ae5-a135-11e9-9cda-0a87ae2ba916 13028
## 4 Aberdeen St & Jackson Blvd a3a3cbac-a135-11e9-9cda-0a87ae2ba916 13157
## 5 Aberdeen St & Monroe St a3a4e4e3-a135-11e9-9cda-0a87ae2ba916 13156
## 6 Aberdeen St & Randolph St a3b3634c-a135-11e9-9cda-0a87ae2ba916 18062
## Total Docks Docks in Service Status Latitude Longitude
## 1 11 11 In Service 41.99118 -87.68359
## 2 15 15 In Service 41.78091 -87.57632
## 3 19 19 In Service 41.87475 -87.64981
## 4 23 23 In Service 41.87773 -87.65479
## 5 19 19 In Service 41.88042 -87.65552
## 6 15 15 In Service 41.88411 -87.65426
## Location Type of Rider Bike Type n
## 1 POINT (-87.683593 41.991178) Casual Classic 21
## 2 POINT (-87.576323747635 41.780910964248) Casual Classic 135
## 3 POINT (-87.649807 41.874754) Casual Classic 581
## 4 POINT (-87.65478743 41.87772613) Casual Classic 698
## 5 POINT (-87.655519 41.880419) Casual Classic 505
## 6 POINT (-87.654264 41.884114) Casual Classic 546
## Perc_n
## 1 0.007163544
## 2 0.046051352
## 3 0.198191376
## 4 0.238102548
## 5 0.172266170
## 6 0.186252136
As we can see in the above table, we created a Perc_n column which represents the density of riders. Similarly we create such tables for other combinations as well.
mytable_casual_electric <- mytable %>% filter(`Type of Rider` == "Casual") %>% filter(`Bike Type` == "Electric") %>% mutate(Perc_n = (n/sum(n))*100)
mytable_member_classic <- mytable %>% filter(`Type of Rider` == "Member") %>% filter(`Bike Type` == "Classic") %>% mutate(Perc_n = (n/sum(n))*100)
mytable_member_electric <- mytable %>% filter(`Type of Rider` == "Member") %>% filter(`Bike Type` == "Electric") %>% mutate(Perc_n = (n/sum(n))*100)
We now save these subsets of data in csv format and convert them into shape-files with the help of an online available csv-shape file conversion software. One of the software used for this purpose is as follows: https://mygeodata.cloud/converter/csv-to-shp.
After converting these csv files, we are now ready for generating different visualizations for bar charts, pie charts, heatmaps and choropleth maps.
The Act phase of data analysis is a crucial step in the data analysis lifecycle, focusing on taking actionable insights derived from data and implementing them in a practical context.
Identify Actions: Based on the insights gained from the analysis, determine specific actions to be taken. These could be strategic decisions, operational changes, or new initiatives.
Prioritize Actions: Not all actions will have the same impact. Prioritize them based on factors such as potential benefit, feasibility, cost, and alignment with organizational goals.
The insights from the case-study analysis have been mentioned in “Findings and Recommendations” section of the report.
Plan: Develop a detailed plan for implementing the chosen actions. This includes defining objectives, setting timelines, allocating resources, and identifying responsible parties.
Execution: Put the plan into action. This might involve deploying new processes, launching marketing campaigns, modifying existing systems, or introducing new products.
Track Progress: Continuously monitor the implementation of actions to ensure they are being executed as planned.
Adjust: Be ready to make adjustments based on feedback and initial results. This could involve modifying strategies, reallocating resources, or revising timelines.
UI defines the layout and the appearance of the app
The framework of UI is as given below
ui <- shinyUI(
fluidPage( # Fluidpage brings in the required fluidity for web-application to adjust the display as per the size of the screen.
dashboardPage(
dashboardHeader(), # This is code for Header Panel of Shiny Web-Application
dashboardSidebar(), # This is code for Side Panel of Web Application where multiple tabs are located
dashboardBody() # This is code for Main Panel of Web Application where multiple sub-tabs are located
)
)
)
dashboardHeader contains the code for displaying the information on the header strip of web-application. In the below code, some part of HTML is used for modifying the layout and including images in the header panel.
For integrating HTML into Shiny Web-Application, Shiny provides a list of functions named tags. The list can be seen here: https://shiny.posit.co/r/articles/build/tag-glossary/. Each function in the list creates an HTML tag that we can use to modify the layout of our Shiny App.
Provided below is a description of the tags used in dashboardHeader:
img: creates an image li : creates a list a : creates a link to a webpage
dashboardHeader(title = "Cyclistic Data Analysis", titleWidth = 250, tags$li(class = "dropdown", tags$img(src="Nevil.jpg", height = "50", width = "50", style="border-radius: 55%; border: 3px solid; color:indigo")), tags$li(class = "dropdown", tags$a(href = "https://rb.gy/br2sa3", icon("linkedin"), "My Profile")))
As seen in the Side-bar, 3 tabs have been provided. On clicking on any of the tabs, the sub-tabs placed in the central part of the dashboard i.e. the body vary w.r.t the tab selected in the Side panel.
For the Home tab, we have multiple sub-tabs as follows:
About: This section provides some basic information regarding the company and the service it provides on which the case-study analysis has been done
Business task: This section provides in detail information regarding the task that we need to perform.
Raw-Data: This section provides a view of the sample of the data that we will be working with.
Structure: This section provides structure of the raw data. Structure generally consists information regarding the Data type of variables. It is also called Meta-Data which is data about data.
p: Creates a paragraph br: Creates a line-break
# 1st tab item
tabItem(tabName = "home", tabBox(id = "t1", width = 12,
tabPanel(fluidRow(column(width = 12, div(style = "display: flex; align-items: center;", tags$img(src = "about.png", style = "width: 28px; height: 28px; margin-left: 7px;"), tags$span(style = "padding-left: 8px;", "About")))),
fluidRow(column(width = 8, tags$img(src = "https://rb.gy/c7rkrn", width = 300, height = 300), tags$br(), tags$a("Source: Google Case-Studies"), align = "center"), column(width = 8, tags$br(), tags$p(HTML("In 2016, <b>Cyclistic</b> launched a successful bike-share offering. Since then, the program has grown to a fleet of 5,824 bicycles that are geotracked and locked into a network of 692 stations across Chicago. The bikes can be unlocked from one station and returned to any other station in the system anytime.")),
tags$p(HTML("Until now, Cyclistic’s marketing strategy relied on building general awareness and appealing to broad consumer segments. One approach that helped make these things possible was the
flexibility of its pricing plans: single-ride passes, full-day passes, and annual memberships. Customers who purchase <b><i>single-ride or full-day passes are referred to as casual riders</i></b>. Customers who purchase <b><i>annual memberships are Cyclistic members</i></b>.")), align = "justify"))),
#------------------------------------------------------------
tabPanel(fluidRow(column(width = 12, div(style = "display: flex; align-items: center;", tags$img(src = "task.png", style = "width: 30px; height: 30px; margin-left: 7px;"), tags$span(style = "padding-left: 8px;", "Business Task")))),
fluidRow(column(width = 8, tags$img(src = "https://rb.gy/0tkbmm", width = 300, height = 300), tags$br(), tags$a("Source: https://teknovatecrm.in/task-management"), align = "center"), column(width = 8, tags$br(), tags$p(HTML("Cyclistic’s finance analysts have concluded that annual members are much more profitable than casual riders. Although the pricing flexibility helps Cyclistic attract more customers,
Lily Moreno, the director of marketing, believes that maximizing the number of annual members will be key to future growth. Rather than creating a marketing campaign that targets all-new customers, Moreno, believes
<b><i>there is a solid opportunity to convert casual riders into members</i></b>. She notes that casual riders
are already aware of the Cyclistic program and have chosen Cyclistic for their mobility needs.")),
tags$p(HTML("<b>Task:</b> You are a junior data analyst working on the marketing analyst team at Cyclistic, a bike-share company in Chicago. The director of marketing believes the company’s future success
depends on maximizing the number of annual memberships. Therefore, <b>your team wants to
understand how casual riders and annual members use Cyclistic bikes differently.</b>")), align = "justify"))),
#--------------------------------------------------------------------------------------------------------------
tabPanel(fluidRow(column(width = 12, div(style = "display: flex; align-items: center;", tags$img(src = "data.png", style = "width: 28px; height: 28px; margin-left: 7px;"), tags$span(style = "padding-left: 8px;", "Raw Data")))),
fluidRow(column(width = 8, tags$br(), tags$p(HTML("This is just a sample of the original data.Original Raw Data consists of 5.6 million observations, 13 variables and occupies a size of 1 GB.Uploading such a huge data would lead to low speed and memory issues.")))), dataTableOutput("dataT")),
#--------------------------------------------------------------------------------------------------------------
tabPanel(fluidRow(column(width = 12, div(style = "display: flex; align-items: center;", tags$img(src = "structure.png", style = "width: 28px; height: 28px; margin-left: 7px;"), tags$span(style = "padding-left: 8px;", "Structure")))), verbatimTextOutput("structure"))))
About sub-tab in the Home tab can be seen as given below:
Business Task looks as given below:
A snippet of Raw-Data is as given below:
The Structure (Meta Data) is as given below:
This tab primarily contains 2 sub-tabs: Periodic Analysis and Choropleth Maps. In Periodic Analysis, various set of controls are provided for the User. By giving input to these controls, a customised graph can be easily generated within no time.
These inputs are designed using selectInputs, checkboxInputs and radiobuttons of Shiny Application.
With the controls given in the Dashboard tab, user has enough flexibility t oselect any type of analysis from Monthly, Seasonal and Yearly. For every analysis, user has further control over the choice of X-Axis and Y-Axis inputs. If the user is interested in viewing the difference between Casual Riders and Mebers w.r.t type of Bike used, he can further partition the graph with a simple click on Radio Button.
For analysis wherein the chosen X-axis input is among Day of Year, Weeks or even Months, it is possible that the graph may appear as congested for the user. For such graphs, if user wants to filter out the maximum values, he can even do that with a simple checkboxInput or if the user wants to identify the trend of the graph, then by clicking on the trend line, user can get a trend line generated automatically.
For Stakeholders and users who would be interested in knowing the data responsibe for the generated graphs, a separate data table viewing option is also available as well.
Besides this, if the User clicks on the bars of the bar-chart, every individual bar generates a separate Pie chart or Heatmap (depending on the choice of Y-Axis input) which gives further information to the user for that particular bar. These linked charts prove to be helpful in providing detail indepth analysis without clustering up the space of Main panel in the Shiny Application.
With the given controls, a user can easily generate any one among 12800 different graphs.
The variety of charts produced are as follows:
Thus making a total of 12800 graphs.
# 2nd tab item
#--------------Code for Periodic Analysis sub-tab--------------------------------------------------------------
tabItem(tabName = "dash",tabBox(id = "t2", width = 12,
tabPanel(fluidPage(fluidRow(
column(width = 12,div(style = "display: flex; align-items: center;",
tags$img(src = "time.png", style = "width: 28px; height: 28px; margin-left: 7px;"),
tags$span(style = "padding-left: 8px;", "Periodic Analysis"))))),
fluidPage(
fluidRow(column(width = 3, selectInput(inputId = "SelAna", label = "Select the type of Analysis", choices = c("Monthly","Seasonal","Yearly"), selected = "Monthly")),
column(width = 3,selectInput(inputId = "Seldata",label = "Select the Dataset",choices = " "))
),
fluidRow(column(width = 3, selectInput(inputId = "SelXaxis2", label = "Select the X Axis", choices = " ")),
column(width = 3, selectInput(inputId = "SelYaxis2", label = "Select the Y Axis", choices = " "))),
fluidRow(column(width = 3, radioButtons(inputId = "Part2", label = "Partition the Graph?", choices = c("No", "Yes(by Bike-Type)"), selected = "No")),
column(width = 3, selectInput(inputId = "Data2", label = "Show Data Table?", choices = c("No", "Yes"), selected = "No"))),
fluidRow(column(width = 3,checkboxInput(inputId = "Check2", label = HTML("<b>Show Maximum Values</b>"), value = FALSE)),
column(width = 3, checkboxInput(inputId = "Trend2", label = HTML("<b>Show Trend Line</b>"), value = FALSE))),
fluidRow(column(width = 12, withSpinner(plotlyOutput("chart2")))),
fluidRow(column(width = 12, withSpinner(dataTableOutput("data3")))))),
#----------------------------------Code for Choropleth Maps sub-tab-------------------------------------------
tabPanel(fluidPage(div(style = "display: flex; align-items: center;", tags$img(src = "map.png", style = "width: 28px; height: 28px; margin-left: 7px;"), tags$span(style = "padding-left: 8px;", "Choropleth Maps"))),
fluidPage(div(style = "display:inline-block; vertical-align: top; padding-top: 5px; width: 500px",selectInput(inputId = "SelRid" , label = "Select the type of Rider", choices = c("casual", "member"), width = "30%", selected = "casual")),
div(style = "display:inline-block; vertical-align: top; padding-left: 5px; padding-top: 5px; width: 500px",selectInput(inputId = "SelBik" , label = "Select the type of Bike", choices = c("classic_bike", "electric_bike"), width = "30%", selected = "classic_bike")),
div(style = "display:inline-block; vertical-align: top; padding-left: 5px; padding-top: 5px; width: 500px",radioButtons(inputId = "Data4" , label = "Show Data Table?", choices = c("Yes", "No"), width = "30%", selected = "No"))),
withSpinner(plotOutput("map_plot")), width = 12, withSpinner(dataTableOutput("dataM")))))
Periodic Analysis subtab can be seen as given below:
Choropleth subtab is as given below:
This tab also comprises of 2 sub-tabs: Findings and Recommendations.
Findings provide a tabulated summary of important points which could be extracted from the data. Based on the points given in the findings, top three recommendations have been provided in the consequent tab.
#3rd tab item
tabItem(tabName = "insight",tabBox(id = "t3", width = 12,
tabPanel(fluidRow(column(width = 12, div(style = "display: flex; align-items: center;", tags$img(src = "finding.png", style = "width: 28px; height: 28px; margin-left: 7px;"), tags$span(style = "padding-left: 8px;", "Findings")))), tags$iframe(style = "height:700px; width:100%; scrolling=yes", src="Cyclistic-Results.pdf") ),
tabPanel(fluidRow(column(width = 12, div(style = "display: flex; align-items: center;", tags$img(src = "recommend.png", style = "width: 28px; height: 28px; margin-left: 7px;"), tags$span(style = "padding-left: 8px;", "Recommendations")))), fluidRow(column(width = 8, tags$p(HTML("<b>As per the findings, I would like to recommend the following: </b>")),
tags$p(HTML("1) People who would purchase Cyclistic Membership could be given some free rides annually to places like Dusable Lake Shore Drive and Streeter Dr where multiple parks are located. ")),
tags$p(HTML("2) Small groups of Riders can be formed by Cyclistic which would prefer going for early morning rides and also in the evening (around 17.00 hrs) in Spring and Summer covering several places in Chicago. Advertisements of such groups should be done focussed towards the residents in the region of Streeter Dr and Grand Ave and other top 25 such stations which can attract a lot of people who may join the groups to expand their Networks. Marketing team can provide Seasonal memberships at discounted prices to these groups.")),
tags$p(HTML("3) Since, the Casual riders are more during Weekends, Marathons could be organised on Weekends. Particiants could be provided with discounted memberships.")), align = "justify")))))
Findings tab can be seen as given below:
Recommendations can be seen as given below:
Server contains the logic that runs behind the scenes, processing the inputs and generating outputs.
In Shiny, the shinyServer function uses a function with arguments Input, Output, and Session to define the server-side logic of the app. Here’s a breakdown of these arguments:
This represents the inputs from the user interface (UI) of the Shiny app. It contains reactive values that reflect the current state of user inputs such as text fields, sliders, and buttons.
This represents the outputs that will be sent to the user interface. It contains slots for reactive output objects such as plots, tables, and text. We define these outputs in the server function, and they are rendered in the UI based on their associated IDs. For example, we can create a plot output with output$myPlot <- renderPlot({…}) which will then be displayed in the UI element with outputId = “myPlot”.
Session provides information and control over the user session. It can be used to manage user sessions, send custom messages, and handle session-specific events.
server <- shinyServer(
function(Session, Input, Output){
}
)
The controls in the UI are designed in such a way that when the user selects the first control i.e. the type of Analysis, then on the basis of input given to 1st control, the 2nd control gets populated with the available choices. Then on the basis of input given to the 2nd control i.e selection of dataset, then the X-Axis control gets populated with some list of choices which again determines the Y-input.
This is done so as to prevent any mismatch of inputs w.r.t each other. For instance, when Yearly analysis is selected, we have a X-Axis variable called “Seasons” which should not appear in Monthly or Seasonal analysis.
Thus for preventing any such possible mismatch, a csv file is separately maintained with all the matching options which is read automatically at the start of every Session.
The csv file is as shown below:
input_list <- read.csv("Inputs.csv")
head(input_list)
## Analysis Dataset X.Axis Y.Axis
## 1 Monthly May 2023 Weekday No.of Rides
## 2 Monthly May 2023 Weekday Avg.Ride-Duration
## 3 Monthly May 2023 Weekday Avg.Ride-Distance
## 4 Monthly May 2023 Start Time (in Hrs) No.of Rides
## 5 Monthly May 2023 Start Time (in Hrs) Avg.Ride-Duration
## 6 Monthly May 2023 Start Time (in Hrs) Avg.Ride-Distance
Inputs for the 1st control are always loaded. Based on the selection of 1st Input, subsequent controls get loaded consequently as per the code given below:
observe({
print(input$SelAna)
Inp1 <- input_list %>% filter(Analysis == input$SelAna) %>% select(Dataset)
updateSelectInput(session, inputId = "Seldata" , label = "Select the Dataset", choices = unique(Inp1))
})
observe({
Inp2 <- input_list %>% filter(Dataset == input$Seldata) %>% select(X.Axis)
updateSelectInput(session, inputId = "SelXaxis2" , label = "Select the X Axis", choices = unique(Inp2))
})
observe({
Inp3 <- input_list %>% filter(X.Axis == input$SelXaxis2) %>% select(Y.Axis)
updateSelectInput(session, inputId = "SelYaxis2" , label = "Select the Y Axis", choices = unique(Inp3))
})
This code snippet creates a highly customized DataTable using the DT package, which includes filtering columns, scrolling, and various layout and display options. This DataTable will be rendered in the Shiny app where dataT is referenced in the UI.
output$dataT <- DT::renderDataTable({
DT::datatable(
rawdata_sample,
filter = 'top', extensions = c('Buttons', 'Scroller'),
options = list(scrollY = 375,
scrollX = 500,
deferRender = TRUE,
scroller = TRUE,
# paging = TRUE,
# pageLength = 25,
buttons = list(list(extend = 'colvis', targets = 0, visible = FALSE)),
dom = 'lBfrtip',
fixedColumns = TRUE),
rownames = FALSE)
})
The str() function of base R package provides a concise summary of the data frame, including the data type of each column and the first few entries.
# Structure
output$structure <- renderPrint(
my_data <- rawdata_sample %>% select(-...1) %>% str()
)
Below provided is a short overview of how the code looks. The code is conditionaised at various points using the if-else structure.
Primarily the code can be broken down into 2 parts. First part of the code delineates an algorithm for analyzing the datset in case of number of rides. When the chosen Y-axis is either “Ride Duration” or “Ride Distance”, then the 2nd part of the code gets activated and a plot is rendered accordingly.
Bike type analysis is integrated into the code with the help of if-else condition.
output$chart2 <- renderPlotly({
yearroute$Weekday <- factor(yearroute$Weekday, levels = c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"))
yearroute$`Month Name` <- factor(yearroute$`Month Name`, levels = c("May 2023", "June 2023", "July 2023", "Aug 2023", "Sep 2023", "Oct 2023", "Nov 2023", "Dec 2023", "Jan 2024", "Feb 2024", "March 2024", "April 2024"))
# Filtering Data as per the selection of type of Analysis
if(input$SelAna == "Yearly"){
yd <- yearroute
} else if(input$SelAna == "Seasonal"){
yd <- yearroute %>%
filter(Season == paste(input$Seldata))
} else {
yd <- yearroute %>%
filter(`Month Name` == paste(input$Seldata))
}
# When Y axis selected is no.of rides
if(Y-Axis is No.of Rides){
}
# When Y axis selected is Ride Duration or Ride Distance
if(Y-Axis is Ride Duration or Ride Distance){
}
if(input$SelYaxis2 == "No.of Rides"){
common <- {if(input$Part2 == "Yes(by Bike-Type)"){ yd %>%
select(`Rider Type`, `Bike Type`, !!sym(input$SelXaxis2)) %>%
group_by(!!sym(input$SelXaxis2), `Rider Type`, `Bike Type`) %>%
count(`Bike Type`) %>% ungroup() }
else {yd %>% select(`Rider Type`, !!sym(input$SelXaxis2)) %>%
group_by(!!sym(input$SelXaxis2), `Rider Type`) %>%
count(`Rider Type`) %>% ungroup() }}
if(input$SelXaxis2 == "Route"){
q = common %>%
group_by(!!sym(input$SelXaxis2)) %>%
mutate(total_n = sum(n), percentage = (n / total_n) * 100) %>%
ungroup()
s = q[order(-q$total_n), ]
r = s %>% select(Route) %>% distinct() %>% slice(1:25)
p = semi_join(s,r, by=("Route"))
p$Route <- factor(p$Route, levels = unique(r$Route)) ## This will keep the bars in predefined decreasing order in graph
routedata <- yearroute %>% select(Station, end_station_name, Route) %>% distinct()
routeshow = merge(p, routedata, by = "Route") %>% rename(`Start Station` = Station,`End Station` = end_station_name) ## This code is for modifying the table so as to display it in output along with station names
routeshow <- routeshow[order(-routeshow$total_n),]
if(input$Part2 == "Yes(by Bike-Type)"){
routeshow <- routeshow %>% select(Route, `Start Station`, `End Station`, `Rider Type`, `Bike Type`, n, total_n, percentage)
## Now we will write code for showing max values in data table
pmax <- p %>% group_by(`Rider Type`, `Bike Type`) %>% filter(n == max(n))
routeshowmax = merge(pmax, routedata, by = "Route") %>% rename(`Start Station` = Station,`End Station` = end_station_name) ## This code is for modifying the table so as to display it in output along with station names
routeshowmax <- routeshowmax[order(-routeshowmax$total_n),]
routeshowmax <- routeshowmax %>% select(Route, `Start Station`, `End Station`, `Rider Type`, `Bike Type`, n, total_n, percentage) ## reordering the columns
} else {
routeshow <- routeshow %>% select(Route, `Start Station`, `End Station`, `Rider Type`, n, total_n, percentage)
pmax <- p %>% group_by(`Rider Type`) %>% filter(n == max(n))
routeshowmax = merge(pmax, routedata, by = "Route") %>% rename(`Start Station` = Station,`End Station` = end_station_name) ## This code is for modifying the table so as to display it in output along with station names
routeshowmax <- routeshowmax[order(-routeshowmax$total_n),]
routeshowmax <- routeshowmax %>% select(Route, `Start Station`, `End Station`, `Rider Type`, n, total_n, percentage) ## reordering the columns
}
} else {
p = common %>%
mutate(n = n/1000000) %>% mutate_if(is.numeric, round, digits = 4) %>%
group_by(!!sym(input$SelXaxis2)) %>%
mutate(total_n = sum(n), percentage = (n / total_n) * 100) %>%
mutate_if(is.numeric, round, digits = 3) %>%
ungroup()
}
if(input$Check2){
if(input$Data2 == "Yes"){
output$data3 <- DT::renderDataTable({
DT::datatable(
if(input$SelXaxis2 == "Route"){
routeshowmax %>% mutate_if(is.numeric, round, digits = 5)
} else{
p %>% {if(input$Part2 == "Yes(by Bike-Type)"){group_by(. , `Rider Type`, `Bike Type`)} else {group_by(. , `Rider Type`)}} %>% filter(n == max(n)) %>%
mutate_if(is.numeric, round, digits = 5)
},
filter = 'top', extensions = c('Buttons', 'Scroller'),
options = list(scrollY = 375,
scrollX = 500,
deferRender = TRUE,
scroller = TRUE,
# paging = TRUE,
# pageLength = 25,
buttons = list(list(extend = 'colvis', targets = 0, visible = FALSE)),
dom = 'lBfrtip',
fixedColumns = TRUE),
rownames = FALSE)
})
}else{
output$data3 <- DT::renderDataTable({
DT::datatable(NULL)
})
}
if(input$Part2 == "Yes(by Bike-Type)"){
h = p %>% group_by(`Rider Type`, `Bike Type`) %>% filter(n == max(n)) %>% mutate_if(is.numeric, round, digits = 5)
d = p %>% mutate(n = 0)
combined_d <- merge(d, h, by = c(paste(input$SelXaxis2), "Rider Type", "Bike Type"), all = TRUE) %>%
mutate(n = ifelse(is.na(n.x), 0, n.x) + ifelse(is.na(n.y), 0, n.y)) %>% select(!!sym(input$SelXaxis2), `Rider Type`, `Bike Type`, n)
} else {
h = p %>% group_by(`Rider Type`) %>% filter(n == max(n)) %>% mutate_if(is.numeric, round, digits = 5)
d = p %>% mutate(n = 0)
combined_d <- merge(d, h, by = c(paste(input$SelXaxis2), "Rider Type"), all = TRUE) %>%
mutate(n = ifelse(is.na(n.x), 0, n.x) + ifelse(is.na(n.y), 0, n.y)) %>% select(!!sym(input$SelXaxis2), `Rider Type`, n)
}
q = p %>% ggplot() + geom_bar(data = p, mapping = aes(x = get(input$SelXaxis2), y = n, fill = `Rider Type`, text = paste(input$SelXaxis2, ":", get(input$SelXaxis2), "<br>No.of Rides:", n, "<br>Type of Rider:", `Rider Type`)),
stat = "identity", position = position_dodge(0.6), width = 0.5, alpha = 0.2) +
{if(input$Part2 == "Yes(by Bike-Type)"){facet_wrap(~`Bike Type`)} else {}} +
geom_bar(data = combined_d,
mapping = aes(x = get(input$SelXaxis2), y = n, fill = `Rider Type`), color = "black", linewidth = 0.3,
stat = "identity", position = position_dodge(0.6), width = 0.5, alpha = 0.8) +
{if(input$Part2 == "Yes(by Bike-Type)"){facet_wrap(~`Bike Type`)} else {}} +
geom_text(data = h, aes(label = paste0("<span style='color:black;font-weight:bold;'>", n, "</span>"), x = get(input$SelXaxis2), y = n), size = 2, position = position_nudge(x = -0.1, y = 0.001)) +
expand_limits(y = c(0, max(p$n) * 1.2)) +
scale_y_continuous(expand = c(0,0)) +
{
if(input$SelXaxis2 == "Route"){
labs(x = paste(input$SelXaxis2),
y = "Number of Rides")
} else{labs(x = paste(input$SelXaxis2),
y = "Number of Rides (in millions)")}
} +
theme_bw() +
{if(input$SelXaxis2 == "Weekday" | input$SelXaxis2 == "Month Name"){
theme(axis.text.x = element_text(angle = 45, vjust = 0.5, hjust=1))
} else if(input$SelXaxis2 == "Route"){
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))
}} +
{
if(input$Trend2){
geom_smooth(data = p, mapping = aes(x = as.numeric(get(input$SelXaxis2)), y = n, col=`Rider Type`), method=loess, se=FALSE, linewidth = 0.5, show.legend = FALSE)
}
}
} else{
if(input$Data2 == "Yes"){
output$data3 <- DT::renderDataTable({
DT::datatable(
if(input$SelXaxis2 == "Route"){
routeshow %>% mutate_if(is.numeric, round, digits = 5)
} else{
p %>% mutate_if(is.numeric, round, digits = 5)
},
filter = 'top', extensions = c('Buttons', 'Scroller'),
options = list(scrollY = 375,
scrollX = 500,
deferRender = TRUE,
scroller = TRUE,
# paging = TRUE,
# pageLength = 25,
buttons = list(list(extend = 'colvis', targets = 0, visible = FALSE)),
dom = 'lBfrtip',
fixedColumns = TRUE),
rownames = FALSE)
})
}else{
output$data3 <- DT::renderDataTable({
DT::datatable(NULL)
})
}
q = p %>% ggplot() + geom_bar(data = p, mapping = aes(x = get(input$SelXaxis2), y = n, fill = `Rider Type`, text = paste(input$SelXaxis2, ":", get(input$SelXaxis2), "<br>No.of Rides:", n, "<br>Type of Rider:", `Rider Type`)),
stat = "identity", position = position_dodge(0.6), width = 0.5) +
{if(input$Part2 == "Yes(by Bike-Type)"){facet_wrap(~`Bike Type`)}} +
expand_limits(y = c(0, max(p$n) * 1.2)) +
scale_y_continuous(expand = c(0,0)) +
{
if(input$SelXaxis2 == "Route"){
labs(x = paste(input$SelXaxis2),
y = "Number of Rides")
} else{labs(x = paste(input$SelXaxis2),
y = "Number of Rides (in millions)")}
} +
theme_bw() +
{if(input$SelXaxis2 == "Weekday" | input$SelXaxis2 == "Month Name"){
theme(axis.text.x = element_text(angle = 45, vjust = 0.5, hjust=1))
} else if(input$SelXaxis2 == "Route"){
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))
}} +
{
if(input$Trend2){
geom_smooth(data = p, mapping = aes(x = as.numeric(get(input$SelXaxis2)), y = n, col=`Rider Type`), method=loess, se=FALSE, linewidth = 0.5, show.legend = FALSE)
}
}
}
p_plotly <- ggplotly(q, tooltip = "text", source = "chart2")
p_plotly <- p_plotly %>%
layout(
title = list(
text = paste0("<b>","Number of Rides by ", input$SelXaxis2, "</b>","<br><span style='font-size:14pt; display:block; padding-top:10px;'>", "<sup>","Dataset:", input$Seldata, "</sup>", "</span></br>"),
x = 0.5, # Center alignment
xanchor = "center"),
margin = list(t = 100, r = 50, b = 50) # Increase top margin to add space between title and plot
)
p_plotly <- event_register(p_plotly, "plotly_click")
# Return plotly object
p_plotly
}
else if(input$SelYaxis2 != "No.of Rides"){
if(input$SelYaxis2 == "Avg.Ride-Duration"){
p1 = yd %>%
{if(input$Part2 == "Yes(by Bike-Type)"){group_by(. , !!sym(input$SelXaxis2), `Rider Type`, `Bike Type`)}else{group_by(. , !!sym(input$SelXaxis2), `Rider Type`)}} %>%
summarise(mean = mean(`Ride Length`)) %>% ungroup() %>% mutate_if(is.numeric, round, digits = 4)
} else if(input$SelYaxis2 == "Avg.Ride-Distance") {
p1 = yd %>%
filter(!is.na(`Station Distance`)) %>%
{if(input$Part2 == "Yes(by Bike-Type)"){group_by(. , !!sym(input$SelXaxis2), `Rider Type`, `Bike Type`)}else{group_by(. , !!sym(input$SelXaxis2), `Rider Type`)}} %>%
summarise(mean = mean(`Station Distance`)) %>% ungroup() %>% mutate_if(is.numeric, round, digits = 4)
}
if(input$Check2){
if(input$Data2 == "Yes"){
output$data3 <- DT::renderDataTable({
DT::datatable(
p1 %>% {if(input$Part2 == "Yes(by Bike-Type)"){group_by(. , `Rider Type`, `Bike Type`)}else{group_by(. , `Rider Type`)}} %>% filter(mean == max(mean)) %>%
mutate_if(is.numeric, round, digits = 4),
filter = 'top', extensions = c('Buttons', 'Scroller'),
options = list(scrollY = 375,
scrollX = 500,
deferRender = TRUE,
scroller = TRUE,
# paging = TRUE,
# pageLength = 25,
buttons = list(list(extend = 'colvis', targets = 0, visible = FALSE)),
dom = 'lBfrtip',
fixedColumns = TRUE),
rownames = FALSE)
})
}else{
output$data3 <- DT::renderDataTable({
DT::datatable(NULL)
})
}
h1 = p1 %>% {if(input$Part2 == "Yes(by Bike-Type)"){group_by(. , `Rider Type`, `Bike Type`)}else{group_by(. , `Rider Type`)}} %>% filter(mean == max(mean)) %>% mutate_if(is.numeric, round, digits = 4)
d1 = p1 %>% mutate(mean = 0)
if(input$Part2 == "Yes(by Bike-Type)"){
combined_d1 <- merge(d1, h1, by = c(paste(input$SelXaxis2), "Rider Type", "Bike Type"), all = TRUE) %>%
mutate(mean = ifelse(is.na(mean.x), 0, mean.x) + ifelse(is.na(mean.y), 0, mean.y)) %>%
select(!!sym(input$SelXaxis2), `Rider Type`,`Bike Type`, mean) %>% mutate_if(is.numeric, round, digits = 4)
} else {
combined_d1 <- merge(d1, h1, by = c(paste(input$SelXaxis2), "Rider Type"), all = TRUE) %>%
mutate(mean = ifelse(is.na(mean.x), 0, mean.x) + ifelse(is.na(mean.y), 0, mean.y)) %>%
select(!!sym(input$SelXaxis2), `Rider Type`, mean) %>% mutate_if(is.numeric, round, digits = 4)
}
q1 = p1 %>% ggplot() + geom_bar(data = p1, mapping = aes(x = get(input$SelXaxis2), y = mean, fill = `Rider Type`, text = paste(input$SelXaxis2, ":", get(input$SelXaxis2), "<br>", input$SelYaxis2, ":", mean , "<br>Type of Rider:", `Rider Type`)),
stat = "identity", position = position_dodge(0.6), width = 0.5, alpha = 0.2) +
{if(input$Part2 == "Yes(by Bike-Type)"){facet_wrap(~`Bike Type`)}} +
geom_bar(data = combined_d1,
mapping = aes(x = get(input$SelXaxis2), y = mean, fill = `Rider Type`), color = "black", linewidth = 0.3,
stat = "identity", position = position_dodge(0.6), width = 0.5, alpha = 0.8) +
{if(input$Part2 == "Yes(by Bike-Type)"){facet_wrap(~`Bike Type`)}} +
geom_text(data = h1, aes(label = paste0("<span style='color:black;font-weight:bold;'>", mean, "</span>"), x = get(input$SelXaxis2), y = mean), size = 2, position = position_nudge(x = -0.1, y = 0.001)) +
expand_limits(y = c(0, max(p1$mean) * 1.2)) +
scale_y_continuous(expand = c(0,0)) +
labs(x = paste(input$SelXaxis2),
y = paste(input$SelYaxis2),
caption = "Source: https://divvybikes.com/system-data") +
theme_bw() +
{if(input$SelXaxis2 == "Weekday" | input$SelXaxis2 == "Month Name"){
theme(axis.text.x = element_text(angle = 45, vjust = 0.5, hjust=1))
}} +
{
if(input$Trend2){
geom_smooth(data = p1, mapping = aes(x = as.numeric(get(input$SelXaxis2)), y = mean, col=`Rider Type`), method=loess, se=FALSE, linewidth = 0.5, show.legend = FALSE)
}
}
} else{
if(input$Data2 == "Yes"){
output$data3 <- DT::renderDataTable({
DT::datatable(
p1 %>% mutate_if(is.numeric, round, digits = 4) ,
filter = 'top', extensions = c('Buttons', 'Scroller'),
options = list(scrollY = 375,
scrollX = 500,
deferRender = TRUE,
scroller = TRUE,
# paging = TRUE,
# pageLength = 25,
buttons = list(list(extend = 'colvis', targets = 0, visible = FALSE)),
dom = 'lBfrtip',
fixedColumns = TRUE),
rownames = FALSE)
})
}else{
output$data3 <- DT::renderDataTable({
DT::datatable(NULL)
})
}
q1 = p1 %>% ggplot() + geom_bar(data = p1, mapping = aes(x = get(input$SelXaxis2), y = mean, fill = `Rider Type`, text = paste(input$SelXaxis2, ":", get(input$SelXaxis2), "<br>", input$SelYaxis2, ":", mean , "<br>Type of Rider:", `Rider Type`)),
stat = "identity", position = position_dodge(0.6), width = 0.5) +
{if(input$Part2 == "Yes(by Bike-Type)"){facet_wrap(~`Bike Type`)}} +
expand_limits(y = c(0, max(p1$mean) * 1.2)) +
scale_y_continuous(expand = c(0,0)) +
labs(x = paste(input$SelXaxis2),
y = paste(input$SelYaxis2),
caption = "Source: https://divvybikes.com/system-data") +
theme_bw() +
{if(input$SelXaxis2 == "Weekday" | input$SelXaxis2 == "Month Name"){
theme(axis.text.x = element_text(angle = 45, vjust = 0.5, hjust=1))
}} +
{
if(input$Trend2){
geom_smooth(data = p1, mapping = aes(x = as.numeric(get(input$SelXaxis2)), y = mean, col=`Rider Type`), method=loess, se=FALSE, linewidth = 0.5, show.legend = FALSE)
}
}
}
p_plotly1 <- ggplotly(q1, tooltip = "text", source = "chart2")
p_plotly1 <- p_plotly1 %>%
layout(
title = list(
text = paste0("<b>", input$SelYaxis2, " by ", input$SelXaxis2, "</b>","<br><span style='font-size:14pt; display:block; padding-top:10px;'>", "<sup>","Dataset:", input$Seldata, "</sup>", "</span></br>"),
x = 0.5, # Center alignment
xanchor = "center"),
margin = list(t = 100, r = 50, b = 50) # Increase top margin to add space between title and plot
)
p_plotly1 <- event_register(p_plotly1, "plotly_click")
# Return plotly object
p_plotly1
}
observeEvent(): This function sets up a reactive observer in Shiny. It watches for a specified event and triggers the code inside its block whenever that event occurs.
event_data(“plotly_click”, source = “chart2”): This function from the plotly package captures data from a click event on a Plotly chart. The source = “chart2” specifies that the click event is being listened to on a Plotly chart with the source identifier “chart2”.
The if conditional statement checks if click_data2 is not NULL, ensuring that the subsequent code only executes if a click event has occurred and valid data is available.
observeEvent(event_data("plotly_click", source = "chart2"), {
click_data2 <- event_data("plotly_click", source = "chart2")
if (!is.null(click_data2)) {
xInp <- click_data2[["pointNumber"]]
xInp2 <- round(as.integer(click_data2[["x"]]), digits = 0)
cInp <- click_data2[["curveNumber"]]
if(input$SelAna == "Yearly"){
dp <- yearroute
} else if(input$SelAna == "Seasonal"){
dp <- yearroute %>%
filter(Season == paste(input$Seldata))
} else {
dp <- yearroute %>%
filter(`Month Name` == paste(input$Seldata))
}
if(input$SelYaxis2 == "No.of Rides"){
}
if(input$SelYaxis2 != "No.of Rides"){
}
The reason behind generating Pie charts on the click of a bar is because the Y-axis variable is no.of rides. Rides can be shown as distribution in the form of Pie chart.
When the partition is not checked in, a Pie chart will be shown showing distribution of rides among Casual and Member Riders. When Partition box is checked, a Pie-Donut chart will be shown where a Donut chart will surround the Pie chart. In a Pie-Donut chart, the Pie chart shows the distribution of rides among Casual and Member whereas the Donut part further classifies the rides into Bike type: Classic, Docked and Electric for every rider.
if(input$SelYaxis2 == "No.of Rides"){
if(input$Part2 == "Yes(by Bike-Type)"){
common = dp %>%
select(`Rider Type`, `Bike Type`, !!sym(input$SelXaxis2)) %>%
group_by(!!sym(input$SelXaxis2), `Rider Type`, `Bike Type`) %>%
count(`Bike Type`) %>% ungroup()
} else {
common = dp %>%
select(`Rider Type`, !!sym(input$SelXaxis2)) %>%
group_by(!!sym(input$SelXaxis2), `Rider Type`) %>%
count(`Rider Type`) %>% ungroup()
}
if(input$SelXaxis2 == "Route"){
q1 = common %>%
group_by(Route) %>%
mutate(total_n = sum(n), percentage = (n / total_n) * 100) %>%
ungroup()
s1 = q1[order(-q1$total_n), ]
r1 = s1 %>% select(Route) %>% distinct() %>% slice(1:25)
p = semi_join(s1,r1, by=("Route"))
r1$Route <- factor(r1$Route, levels = unique(r1$Route)) ## This will keep the bars in predefined decreasing order in graph
xInp1 <- r1$Route[xInp + 1]
print(xInp + 1)
print(xInp1)
print(class(xInp1))
} else {
p = common %>%
mutate(n = n/1000000) %>% mutate_if(is.numeric, round, digits = 5)
if(input$SelXaxis2 == "Weekday"){
weekday <- c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday")
print(xInp)
xInp1 = weekday[xInp + 1]
} else if((input$SelXaxis2 == "Start Time (in Hrs)")){
hours <- c("0","1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23")
print(xInp)
xInp1 = hours[xInp + 1]
} else if(input$SelXaxis2 == "Day of Month"){
days <- c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", "25", "26", "27", "28", "29", "30", "31")
print(xInp)
xInp1 = days[xInp + 1]
} else if(input$SelXaxis2 == "Week of Year"){
if(input$Part2 == "Yes(by Bike-Type)"){
if(cInp == "1"){
Week <- dp %>% filter(`Bike Type` == "Docked") %>% select(`Week of Year`) %>% distinct()
} else {
Week <- dp %>% select(`Week of Year`) %>% distinct()
}
} else {
Week <- dp %>% select(`Week of Year`) %>% distinct()
}
Week <- Week[order(Week$`Week of Year`),]
Week$`Week of Year` <- factor(Week$`Week of Year`, levels = unique(Week$`Week of Year`))
print(xInp)
xInp1 = Week$`Week of Year`[xInp + 1]
} else if(input$SelXaxis2 == "Day of Year"){
if(input$Part2 == "Yes(by Bike-Type)"){
if(cInp == "2" | cInp == "4"){
Yearday <- dp %>% filter(`Bike Type` == "Electric") %>% select(`Day of Year`) %>% distinct()
} else {
Yearday <- dp %>% select(`Day of Year`) %>% distinct()
}
} else {
Yearday <- dp %>% select(`Day of Year`) %>% distinct()
}
Yearday <- Yearday[order(Yearday$`Day of Year`),]
Yearday$`Day of Year` <- factor(Yearday$`Day of Year`, levels = unique(Yearday$`Day of Year`))
xInp1 = Yearday$`Day of Year`[xInp + 1]
} else if(input$SelXaxis2 == "Month Name"){
Months <- dp %>% select(`Month Name`) %>% distinct()
if(input$Seldata == "Entire Year"){
Months$`Month Name` <- factor(Months$`Month Name`, levels = c("May 2023", "June 2023", "July 2023", "Aug 2023", "Sep 2023", "Oct 2023", "Nov 2023", "Dec 2023", "Jan 2024", "Feb 2024", "March 2024", "April 2024"))
} else if(input$Seldata == "Spring"){
Months$`Month Name` <- factor(Months$`Month Name`, levels = c("May 2023", "March 2024", "April 2024"))
} else if(input$Seldata == "Summer"){
Months$`Month Name` <- factor(Months$`Month Name`, levels = c("June 2023", "July 2023", "Aug 2023"))
} else if(input$Seldata == "Fall"){
Months$`Month Name` <- factor(Months$`Month Name`, levels = c("Sep 2023", "Oct 2023", "Nov 2023"))
} else if(input$Seldata == "Winter"){
Months$`Month Name` <- factor(Months$`Month Name`, levels = c("Dec 2023", "Jan 2024", "Feb 2024"))
}
Months <- Months[order(Months$`Month Name`),]
xInp1 <- Months$`Month Name`[xInp + 1]
} else if(input$SelXaxis2 == "Season"){
if(input$Part2 == "Yes(by Bike-Type)"){
if(cInp == "1"){
Seasons <- dp %>% filter(`Bike Type` == "Docked") %>% select(`Season`) %>% distinct()
} else {
Seasons <- dp %>% select(`Season`) %>% distinct()
}
} else {
Seasons <- dp %>% select(`Season`) %>% distinct()
}
Seasons$Season <- factor(Seasons$Season, levels = c("Fall", "Spring", "Summer", "Winter"))
Seasons <- Seasons[order(Seasons$Season),]
print(xInp)
xInp1 <- Seasons$Season[xInp + 1]
print(xInp1)
print(as.character(xInp1))
}
}
m <- p %>% filter(!!sym(input$SelXaxis2) == as.character(xInp1)) %>% rename(Rider = `Rider Type`) %>% { if(input$Part2 == "Yes(by Bike-Type)") {rename(. , Bike = `Bike Type`)} else {.}}
if(input$Part2 == "Yes(by Bike-Type)"){
ch <- reactive({ PieDonut(m, aes(Rider, Bike, count = n), title = paste("Distribution of Rides for :", xInp1)) })
showModal(modalDialog(
plotOutput("Pie"),
title = paste("Pie-Donut Chart" )
))
output$Pie <- renderPlot({ch()})
} else {
custom_color <- c("#d45646", "#088F8F")
ch <- reactive({ plot_ly(data = m, labels = ~Rider, values = ~n, type = "pie", marker = list(colors = custom_color, line = list(color = "white", width = 1))) %>%
layout(title = paste("Distribution of Rides for :", xInp1))
})
showModal(modalDialog(
plotlyOutput("Pie"),
title = paste("Pie Chart" )
))
output$Pie <- renderPlotly({ch()})
}
}
else if(input$SelYaxis2 != "No.of Rides"){
if(input$SelXaxis2 == "Weekday"){
if(input$Part2 == "Yes(by Bike-Type)"){
if(input$SelYaxis2 == "Avg.Ride-Duration"){
q = dp %>%
group_by(Weekday, `Rider Type`, `Bike Type`, `Start Time (in Hrs)`) %>%
summarise(mean = mean(`Ride Length`)) %>% ungroup()
} else if(input$SelYaxis2 == "Avg.Ride-Distance"){
q = dp %>%
filter(!is.na(`Station Distance`)) %>%
group_by(Weekday, `Rider Type`, `Bike Type`, `Start Time (in Hrs)`) %>%
summarise(mean = mean(`Station Distance`)) %>% ungroup()
}
} else {
if(input$SelYaxis2 == "Avg.Ride-Duration"){
q = dp %>%
group_by(Weekday, `Rider Type`, `Start Time (in Hrs)`) %>%
summarise(mean = mean(`Ride Length`)) %>% ungroup()
} else if(input$SelYaxis2 == "Avg.Ride-Distance"){
q = dp %>%
filter(!is.na(`Station Distance`)) %>%
group_by(Weekday, `Rider Type`, `Start Time (in Hrs)`) %>%
summarise(mean = mean(`Station Distance`)) %>% ungroup()
}
}
weekdays <- c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday")
xInp1 <- weekdays[xInp + 1]
q_cas_dock <- q %>% filter(Weekday == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Docked")} else {.}}
q_cas_clas <- q %>% filter(Weekday == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Classic")} else {.}}
q_cas_elec <- q %>% filter(Weekday == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Electric")} else {.}}
q_mem_clas <- q %>% filter(Weekday == as.character(xInp1)) %>% filter(`Rider Type` == "Member") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Classic")} else {.}}
q_mem_elec <- q %>% filter(Weekday == as.character(xInp1)) %>% filter(`Rider Type` == "Member") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Electric")} else {.}}
filterdata1 <- right_join(q_cas_dock, starttime, by='Start Time (in Hrs)')
filterdata2 <- right_join(q_cas_clas, starttime, by='Start Time (in Hrs)')
filterdata3 <- right_join(q_cas_elec, starttime, by='Start Time (in Hrs)')
filterdata4 <- right_join(q_mem_clas, starttime, by='Start Time (in Hrs)')
filterdata5 <- right_join(q_mem_elec, starttime, by='Start Time (in Hrs)')
if(!any(is.na(filterdata1)) && !any(is.na(filterdata2)) && !any(is.na(filterdata3)) && !any(is.na(filterdata4)) && !any(is.na(filterdata5))){
filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
} else {
if(any(is.na(filterdata1))){
filterdata1$mean <- ifelse(is.na(filterdata1$mean), 0, filterdata1$mean)
filterdata1$Weekday <- ifelse(is.na(filterdata1$Weekday), xInp1, filterdata1$Weekday)
filterdata1$`Rider Type` <- ifelse(is.na(filterdata1$`Rider Type`), "Casual", filterdata1$`Rider Type`)
if(input$Part2 == "Yes(by Bike-Type)") {
filterdata1$`Bike Type` <- ifelse(is.na(filterdata1$`Bike Type`), "Docked", filterdata1$`Bike Type`)
}
}
if(any(is.na(filterdata2))){
filterdata2$mean <- ifelse(is.na(filterdata2$mean), 0, filterdata2$mean)
filterdata2$Weekday <- ifelse(is.na(filterdata2$Weekday), xInp1, filterdata2$Weekday)
filterdata2$`Rider Type` <- ifelse(is.na(filterdata2$`Rider Type`), "Casual", filterdata2$`Rider Type`)
if(input$Part2 == "Yes(by Bike-Type)") {
filterdata2$`Bike Type` <- ifelse(is.na(filterdata2$`Bike Type`), "Classic", filterdata2$`Bike Type`)
}
}
if(any(is.na(filterdata3))){
filterdata3$mean <- ifelse(is.na(filterdata3$mean), 0, filterdata3$mean)
filterdata3$Weekday <- ifelse(is.na(filterdata3$Weekday), xInp1, filterdata3$Weekday)
filterdata3$`Rider Type` <- ifelse(is.na(filterdata3$`Rider Type`), "Casual", filterdata3$`Rider Type`)
if(input$Part2 == "Yes(by Bike-Type)") {
filterdata3$`Bike Type` <- ifelse(is.na(filterdata3$`Bike Type`), "Electric", filterdata3$`Bike Type`)
}
}
if(any(is.na(filterdata4))){
filterdata4$mean <- ifelse(is.na(filterdata4$mean), 0, filterdata4$mean)
filterdata4$Weekday <- ifelse(is.na(filterdata4$Weekday), xInp1, filterdata4$Weekday)
filterdata4$`Rider Type` <- ifelse(is.na(filterdata4$`Rider Type`), "Member", filterdata4$`Rider Type`)
if(input$Part2 == "Yes(by Bike-Type)") {
filterdata4$`Bike Type` <- ifelse(is.na(filterdata4$`Bike Type`), "Classic", filterdata4$`Bike Type`)
}
}
if(any(is.na(filterdata5))){
filterdata5$mean <- ifelse(is.na(filterdata5$mean), 0, filterdata5$mean)
filterdata5$Weekday <- ifelse(is.na(filterdata5$Weekday), xInp1, filterdata5$Weekday)
filterdata5$`Rider Type` <- ifelse(is.na(filterdata5$`Rider Type`), "Member", filterdata5$`Rider Type`)
if(input$Part2 == "Yes(by Bike-Type)") {
filterdata5$`Bike Type` <- ifelse(is.na(filterdata5$`Bike Type`), "Electric", filterdata5$`Bike Type`)
}
}
filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
}
filterdata$`Start Time (in Hrs)` <- factor(filterdata$`Start Time (in Hrs)`, levels = c("0","1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23"))
ch <- reactive({
{if(input$Part2 == "Yes(by Bike-Type)") {
ggplot(filterdata, aes(x = `Start Time (in Hrs)`, y = interaction(`Rider Type`, `Bike Type`), fill = mean))
} else {
ggplot(filterdata, aes(x = `Start Time (in Hrs)`, y = `Rider Type`, fill = mean))
}} +
geom_tile(color = "white") +
scale_fill_gradient(low = "lightblue", high = "darkblue") +
{if(input$Part2 == "Yes(by Bike-Type)") {
labs(x = "Time of the Day", y = "Rider-Bike Combination", fill = "mean")
} else {
labs(x = "Time of the Day", y = "Rider Type", fill = "mean")
}} +
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
labs(title = paste(input$SelYaxis2, " distribution across Time of the Day for", xInp1))
})
showModal(modalDialog(
plotOutput("HeatMap"),
title = paste("Heat Map" )
))
output$HeatMap <- renderPlot({ch()})
} else if((input$SelXaxis2 == "Start Time (in Hrs)")){
if(input$Part2 == "Yes(by Bike-Type)") {
if(input$SelYaxis2 == "Avg.Ride-Duration"){
q = dp %>%
group_by(`Start Time (in Hrs)`, `Rider Type`, `Bike Type`, Station ) %>%
summarise(mean = mean(`Ride Length`)) %>% ungroup()
} else if(input$SelYaxis2 == "Avg.Ride-Distance"){
q = dp %>%
filter(!is.na(`Station Distance`)) %>%
group_by(`Start Time (in Hrs)`, `Rider Type`, `Bike Type`, Station ) %>%
summarise(mean = mean(`Station Distance`)) %>% ungroup()
}
} else {
if(input$SelYaxis2 == "Avg.Ride-Duration"){
q = dp %>%
group_by(`Start Time (in Hrs)`, `Rider Type`, Station ) %>%
summarise(mean = mean(`Ride Length`)) %>% ungroup()
} else if(input$SelYaxis2 == "Avg.Ride-Distance"){
q = dp %>%
filter(!is.na(`Station Distance`)) %>%
group_by(`Start Time (in Hrs)`, `Rider Type`, Station ) %>%
summarise(mean = mean(`Station Distance`)) %>% ungroup()
}
}
hours <- c("0","1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23")
xInp1 = hours[xInp + 1]
q_cas_dock <- q %>% filter(`Start Time (in Hrs)` == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Docked")} else {.}}
q_cas_clas <- q %>% filter(`Start Time (in Hrs)` == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Classic")} else {.}}
q_cas_elec <- q %>% filter(`Start Time (in Hrs)` == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Electric")} else {.}}
q_mem_clas <- q %>% filter(`Start Time (in Hrs)` == as.character(xInp1)) %>% filter(`Rider Type` == "Member") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Classic")} else {.}}
q_mem_elec <- q %>% filter(`Start Time (in Hrs)` == as.character(xInp1)) %>% filter(`Rider Type` == "Member") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Electric")} else {.}}
filterdata1 <- right_join(q_cas_dock, Station, by='Station')
filterdata2 <- right_join(q_cas_clas, Station, by='Station')
filterdata3 <- right_join(q_cas_elec, Station, by='Station')
filterdata4 <- right_join(q_mem_clas, Station, by='Station')
filterdata5 <- right_join(q_mem_elec, Station, by='Station')
if(!any(is.na(filterdata1)) && !any(is.na(filterdata2)) && !any(is.na(filterdata3)) && !any(is.na(filterdata4)) && !any(is.na(filterdata5))){
filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
} else {
if(any(is.na(filterdata1))){
filterdata1$mean <- ifelse(is.na(filterdata1$mean), 0, filterdata1$mean)
filterdata1$`Start Time (in Hrs)` <- ifelse(is.na(filterdata1$`Start Time (in Hrs)`), xInp1, filterdata1$`Start Time (in Hrs)`)
filterdata1$`Rider Type` <- ifelse(is.na(filterdata1$`Rider Type`), "Casual", filterdata1$`Rider Type`)
if(input$Part2 == "Yes(by Bike-Type)") {
filterdata1$`Bike Type` <- ifelse(is.na(filterdata1$`Bike Type`), "Docked", filterdata1$`Bike Type`)
}
}
if(any(is.na(filterdata2))){
filterdata2$mean <- ifelse(is.na(filterdata2$mean), 0, filterdata2$mean)
filterdata2$`Start Time (in Hrs)` <- ifelse(is.na(filterdata2$`Start Time (in Hrs)`), xInp1, filterdata2$`Start Time (in Hrs)`)
filterdata2$`Rider Type` <- ifelse(is.na(filterdata2$`Rider Type`), "Casual", filterdata2$`Rider Type`)
if(input$Part2 == "Yes(by Bike-Type)") {
filterdata2$`Bike Type` <- ifelse(is.na(filterdata2$`Bike Type`), "Classic", filterdata2$`Bike Type`)
}
}
if(any(is.na(filterdata3))){
filterdata3$mean <- ifelse(is.na(filterdata3$mean), 0, filterdata3$mean)
filterdata3$`Start Time (in Hrs)` <- ifelse(is.na(filterdata3$`Start Time (in Hrs)`), xInp1, filterdata3$`Start Time (in Hrs)`)
filterdata3$`Rider Type` <- ifelse(is.na(filterdata3$`Rider Type`), "Casual", filterdata3$`Rider Type`)
if(input$Part2 == "Yes(by Bike-Type)") {
filterdata3$`Bike Type` <- ifelse(is.na(filterdata3$`Bike Type`), "Electric", filterdata3$`Bike Type`)
}
}
if(any(is.na(filterdata4))){
filterdata4$mean <- ifelse(is.na(filterdata4$mean), 0, filterdata4$mean)
filterdata4$`Start Time (in Hrs)` <- ifelse(is.na(filterdata4$`Start Time (in Hrs)`), xInp1, filterdata4$`Start Time (in Hrs)`)
filterdata4$`Rider Type` <- ifelse(is.na(filterdata4$`Rider Type`), "Member", filterdata4$`Rider Type`)
if(input$Part2 == "Yes(by Bike-Type)") {
filterdata4$`Bike Type` <- ifelse(is.na(filterdata4$`Bike Type`), "Classic", filterdata4$`Bike Type`)
}
}
if(any(is.na(filterdata5))){
filterdata5$mean <- ifelse(is.na(filterdata5$mean), 0, filterdata5$mean)
filterdata5$`Start Time (in Hrs)` <- ifelse(is.na(filterdata5$`Start Time (in Hrs)`), xInp1, filterdata5$`Start Time (in Hrs)`)
filterdata5$`Rider Type` <- ifelse(is.na(filterdata5$`Rider Type`), "Member", filterdata5$`Rider Type`)
if(input$Part2 == "Yes(by Bike-Type)") {
filterdata5$`Bike Type` <- ifelse(is.na(filterdata5$`Bike Type`), "Electric", filterdata5$`Bike Type`)
}
}
filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
}
filterdata$Station <- factor(filterdata$Station, levels = unique(Station$Station))
ch <- reactive({
{if(input$Part2 == "Yes(by Bike-Type)") {
ggplot(filterdata, aes(x = Station, y = interaction(`Rider Type`, `Bike Type`), fill = mean))
} else {
ggplot(filterdata, aes(x = Station, y = `Rider Type`, fill = mean))
}} +
geom_tile(color = "white") +
scale_fill_gradient(low = "yellow", high = "red") +
{if(input$Part2 == "Yes(by Bike-Type)") {
labs(x = "Top 20 Stations", y = "Rider-Bike Combination", fill = "mean")
} else {
labs(x = "Top 20 Stations", y = "Rider Type", fill = "mean")
}} +
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
labs(title = paste(input$SelYaxis2, " distribution across Stations of Chicago for the Hour:", xInp1))
})
showModal(modalDialog(
plotOutput("HeatMap"),
title = paste("Heat Map" )
))
output$HeatMap <- renderPlot({ch()})
} else if(input$SelXaxis2 == "Day of Month"){
if(input$Part2 == "Yes(by Bike-Type)") {
if(input$SelYaxis2 == "Avg.Ride-Duration"){
q = dp %>%
group_by(`Day of Month`, `Rider Type`, `Bike Type`, `Start Time (in Hrs)`) %>%
summarise(mean = mean(`Ride Length`)) %>% ungroup()
} else if(input$SelYaxis2 == "Avg.Ride-Distance"){
q = dp %>%
filter(!is.na(`Station Distance`)) %>%
group_by(`Day of Month`, `Rider Type`, `Bike Type`, `Start Time (in Hrs)`) %>%
summarise(mean = mean(`Station Distance`)) %>% ungroup()
}
} else {
if(input$SelYaxis2 == "Avg.Ride-Duration"){
q = dp %>%
group_by(`Day of Month`, `Rider Type`,`Start Time (in Hrs)`) %>%
summarise(mean = mean(`Ride Length`)) %>% ungroup()
} else if(input$SelYaxis2 == "Avg.Ride-Distance"){
q = dp %>%
filter(!is.na(`Station Distance`)) %>%
group_by(`Day of Month`, `Rider Type`, `Start Time (in Hrs)`) %>%
summarise(mean = mean(`Station Distance`)) %>% ungroup()
}
}
days <- c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", "25", "26", "27", "28", "29", "30", "31")
xInp1 = days[xInp + 1]
q_cas_dock <- q %>% filter(`Day of Month` == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Docked")} else {.}}
q_cas_clas <- q %>% filter(`Day of Month` == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Classic")} else {.}}
q_cas_elec <- q %>% filter(`Day of Month` == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Electric")} else {.}}
q_mem_clas <- q %>% filter(`Day of Month` == as.character(xInp1)) %>% filter(`Rider Type` == "Member") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Classic")} else {.}}
q_mem_elec <- q %>% filter(`Day of Month` == as.character(xInp1)) %>% filter(`Rider Type` == "Member") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Electric")} else {.}}
filterdata1 <- right_join(q_cas_dock, starttime, by='Start Time (in Hrs)')
filterdata2 <- right_join(q_cas_clas, starttime, by='Start Time (in Hrs)')
filterdata3 <- right_join(q_cas_elec, starttime, by='Start Time (in Hrs)')
filterdata4 <- right_join(q_mem_clas, starttime, by='Start Time (in Hrs)')
filterdata5 <- right_join(q_mem_elec, starttime, by='Start Time (in Hrs)')
if(!any(is.na(filterdata1)) && !any(is.na(filterdata2)) && !any(is.na(filterdata3)) && !any(is.na(filterdata4)) && !any(is.na(filterdata5))){
filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
} else {
if(any(is.na(filterdata1))){
filterdata1$mean <- ifelse(is.na(filterdata1$mean), 0, filterdata1$mean)
filterdata1$`Day of Month` <- ifelse(is.na(filterdata1$`Day of Month`), xInp1, filterdata1$`Day of Month`)
filterdata1$`Rider Type` <- ifelse(is.na(filterdata1$`Rider Type`), "Casual", filterdata1$`Rider Type`)
if(input$Part2 == "Yes(by Bike-Type)"){
filterdata1$`Bike Type` <- ifelse(is.na(filterdata1$`Bike Type`), "Docked", filterdata1$`Bike Type`)
}
}
if(any(is.na(filterdata2))){
filterdata2$mean <- ifelse(is.na(filterdata2$mean), 0, filterdata2$mean)
filterdata2$`Day of Month` <- ifelse(is.na(filterdata2$`Day of Month`), xInp1, filterdata2$`Day of Month`)
filterdata2$`Rider Type` <- ifelse(is.na(filterdata2$`Rider Type`), "Casual", filterdata2$`Rider Type`)
if(input$Part2 == "Yes(by Bike-Type)"){
filterdata2$`Bike Type` <- ifelse(is.na(filterdata2$`Bike Type`), "Classic", filterdata2$`Bike Type`)
}
}
if(any(is.na(filterdata3))){
filterdata3$mean <- ifelse(is.na(filterdata3$mean), 0, filterdata3$mean)
filterdata3$`Day of Month` <- ifelse(is.na(filterdata3$`Day of Month`), xInp1, filterdata3$`Day of Month`)
filterdata3$`Rider Type` <- ifelse(is.na(filterdata3$`Rider Type`), "Casual", filterdata3$`Rider Type`)
if(input$Part2 == "Yes(by Bike-Type)"){
filterdata3$`Bike Type` <- ifelse(is.na(filterdata3$`Bike Type`), "Electric", filterdata3$`Bike Type`)
}
}
if(any(is.na(filterdata4))){
filterdata4$mean <- ifelse(is.na(filterdata4$mean), 0, filterdata4$mean)
filterdata4$`Day of Month` <- ifelse(is.na(filterdata4$`Day of Month`), xInp1, filterdata4$`Day of Month`)
filterdata4$`Rider Type` <- ifelse(is.na(filterdata4$`Rider Type`), "Member", filterdata4$`Rider Type`)
if(input$Part2 == "Yes(by Bike-Type)"){
filterdata4$`Bike Type` <- ifelse(is.na(filterdata4$`Bike Type`), "Classic", filterdata4$`Bike Type`)
}
}
if(any(is.na(filterdata5))){
filterdata5$mean <- ifelse(is.na(filterdata5$mean), 0, filterdata5$mean)
filterdata5$`Day of Month` <- ifelse(is.na(filterdata5$`Day of Month`), xInp1, filterdata5$`Day of Month`)
filterdata5$`Rider Type` <- ifelse(is.na(filterdata5$`Rider Type`), "Member", filterdata5$`Rider Type`)
if(input$Part2 == "Yes(by Bike-Type)"){
filterdata5$`Bike Type` <- ifelse(is.na(filterdata5$`Bike Type`), "Electric", filterdata5$`Bike Type`)
}
}
filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
}
filterdata$`Start Time (in Hrs)` <- factor(filterdata$`Start Time (in Hrs)`, levels = c("0","1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23"))
ch <- reactive({
{if(input$Part2 == "Yes(by Bike-Type)"){
ggplot(filterdata, aes(x = `Start Time (in Hrs)`, y = interaction(`Rider Type`, `Bike Type`), fill = mean))
} else {
ggplot(filterdata, aes(x = `Start Time (in Hrs)`, y = `Rider Type`, fill = mean))
} } +
geom_tile(color = "white") +
scale_fill_gradient(low = "pink", high = "maroon") +
{if(input$Part2 == "Yes(by Bike-Type)"){
labs(x = "Time of the Day", y = "Rider-Bike Combination", fill = "mean")
} else {
labs(x = "Time of the Day", y = "Rider Type", fill = "mean")
} } +
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
labs(title = paste(input$SelYaxis2, " distribution across different time-period for Day:", xInp1))
})
showModal(modalDialog(
plotOutput("HeatMap"),
title = paste("Heat Map" )
))
output$HeatMap <- renderPlot({ch()})
} else if(input$SelXaxis2 == "Week of Year"){
if(input$Part2 == "Yes(by Bike-Type)"){
if(input$SelYaxis2 == "Avg.Ride-Duration"){
q = dp %>% { if(cInp == "1") {filter(. , `Bike Type` == "Docked")} else {.}} %>%
group_by(`Week of Year`, `Rider Type`, `Bike Type`, Weekday) %>%
summarise(mean = mean(`Ride Length`)) %>% ungroup()
} else if(input$SelYaxis2 == "Avg.Ride-Distance"){
q = dp %>% { if(cInp == "1") {filter(. , `Bike Type` == "Docked")} else {.}} %>%
filter(!is.na(`Station Distance`)) %>%
group_by(`Week of Year`, `Rider Type`, `Bike Type`, Weekday) %>%
summarise(mean = mean(`Station Distance`)) %>% ungroup()
} } else{
if(input$SelYaxis2 == "Avg.Ride-Duration"){
q = dp %>%
group_by(`Week of Year`, `Rider Type`,Weekday) %>%
summarise(mean = mean(`Ride Length`)) %>% ungroup()
} else if(input$SelYaxis2 == "Avg.Ride-Distance"){
q = dp %>%
filter(!is.na(`Station Distance`)) %>%
group_by(`Week of Year`, `Rider Type`, Weekday) %>%
summarise(mean = mean(`Station Distance`)) %>% ungroup()
} }
Yearlyweek <- q %>% select(`Week of Year`) %>% distinct()
Yearlyweek <- Yearlyweek[order(Yearlyweek$`Week of Year`), ]
weeklevel = unique(Yearlyweek$`Week of Year`)
xInp1 <- weeklevel[xInp + 1]
Weekday <- yearroute %>% select(Weekday) %>% distinct()
Weekday$Weekday <- factor(Weekday$Weekday, levels = c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"))
Weekday <- Weekday[order(Weekday$Weekday), ]
# weekdays <- c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday")
q_cas_dock <- q %>% filter(`Week of Year` == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Docked")} else {.}}
q_cas_clas <- q %>% filter(`Week of Year` == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Classic")} else {.}}
q_cas_elec <- q %>% filter(`Week of Year` == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Electric")} else {.}}
q_mem_clas <- q %>% filter(`Week of Year` == as.character(xInp1)) %>% filter(`Rider Type` == "Member") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Classic")} else {.}}
q_mem_elec <- q %>% filter(`Week of Year` == as.character(xInp1)) %>% filter(`Rider Type` == "Member") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Electric")} else {.}}
filterdata1 <- right_join(q_cas_dock, Weekday, by='Weekday')
filterdata2 <- right_join(q_cas_clas, Weekday, by='Weekday')
filterdata3 <- right_join(q_cas_elec, Weekday, by='Weekday')
filterdata4 <- right_join(q_mem_clas, Weekday, by='Weekday')
filterdata5 <- right_join(q_mem_elec, Weekday, by='Weekday')
if(!any(is.na(filterdata1)) && !any(is.na(filterdata2)) && !any(is.na(filterdata3)) && !any(is.na(filterdata4)) && !any(is.na(filterdata5))){
filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
} else {
if(any(is.na(filterdata1))){
filterdata1$mean <- ifelse(is.na(filterdata1$mean), 0, filterdata1$mean)
filterdata1$`Week of Year` <- ifelse(is.na(filterdata1$`Week of Year`), xInp1, filterdata1$`Week of Year`)
filterdata1$`Rider Type` <- ifelse(is.na(filterdata1$`Rider Type`), "Casual", filterdata1$`Rider Type`)
if(input$Part2 == "Yes(by Bike-Type)") {
filterdata1$`Bike Type` <- ifelse(is.na(filterdata1$`Bike Type`), "Docked", filterdata1$`Bike Type`)
}
}
if(any(is.na(filterdata2))){
filterdata2$mean <- ifelse(is.na(filterdata2$mean), 0, filterdata2$mean)
filterdata2$`Week of Year` <- ifelse(is.na(filterdata2$`Week of Year`), xInp1, filterdata2$`Week of Year`)
filterdata2$`Rider Type` <- ifelse(is.na(filterdata2$`Rider Type`), "Casual", filterdata2$`Rider Type`)
if(input$Part2 == "Yes(by Bike-Type)") {
filterdata2$`Bike Type` <- ifelse(is.na(filterdata2$`Bike Type`), "Classic", filterdata2$`Bike Type`)
}
}
if(any(is.na(filterdata3))){
filterdata3$mean <- ifelse(is.na(filterdata3$mean), 0, filterdata3$mean)
filterdata3$`Week of Year` <- ifelse(is.na(filterdata3$`Week of Year`), xInp1, filterdata3$`Week of Year`)
filterdata3$`Rider Type` <- ifelse(is.na(filterdata3$`Rider Type`), "Casual", filterdata3$`Rider Type`)
if(input$Part2 == "Yes(by Bike-Type)") {
filterdata3$`Bike Type` <- ifelse(is.na(filterdata3$`Bike Type`), "Electric", filterdata3$`Bike Type`)
}
}
if(any(is.na(filterdata4))){
filterdata4$mean <- ifelse(is.na(filterdata4$mean), 0, filterdata4$mean)
filterdata4$`Week of Year` <- ifelse(is.na(filterdata4$`Week of Year`), xInp1, filterdata4$`Week of Year`)
filterdata4$`Rider Type` <- ifelse(is.na(filterdata4$`Rider Type`), "Member", filterdata4$`Rider Type`)
if(input$Part2 == "Yes(by Bike-Type)") {
filterdata4$`Bike Type` <- ifelse(is.na(filterdata4$`Bike Type`), "Classic", filterdata4$`Bike Type`)
}
}
if(any(is.na(filterdata5))){
filterdata5$mean <- ifelse(is.na(filterdata5$mean), 0, filterdata5$mean)
filterdata5$`Week of Year` <- ifelse(is.na(filterdata5$`Week of Year`), xInp1, filterdata5$`Week of Year`)
filterdata5$`Rider Type` <- ifelse(is.na(filterdata5$`Rider Type`), "Member", filterdata5$`Rider Type`)
if(input$Part2 == "Yes(by Bike-Type)") {
filterdata5$`Bike Type` <- ifelse(is.na(filterdata5$`Bike Type`), "Electric", filterdata5$`Bike Type`)
}
}
filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
}
filterdata$Weekday <- factor(filterdata$Weekday, levels = unique(Weekday$Weekday))
ch <- reactive({
{if(input$Part2 == "Yes(by Bike-Type)") {
ggplot(filterdata, aes(x = Weekday, y = interaction(`Rider Type`, `Bike Type`), fill = mean))
} else {
ggplot(filterdata, aes(x = Weekday, y = `Rider Type`, fill = mean))
}} +
geom_tile(color = "white") +
scale_fill_gradient(low = "lightgreen", high = "darkgreen") +
{if(input$Part2 == "Yes(by Bike-Type)") {
labs(x = "Weekday", y = "Rider-Bike Combination", fill = "mean")
} else {
labs(x = "Weekday", y = "Rider Type", fill = "mean")
}} +
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
labs(title = paste(input$SelYaxis2, " distribution across Weekday for the Week:", xInp1))
})
showModal(modalDialog(
plotOutput("HeatMap"),
title = paste("Heat Map" )
))
output$HeatMap <- renderPlot({ch()})
} else if(input$SelXaxis2 == "Month Name"){
if(input$Part2 == "Yes(by Bike-Type)") {
if(input$SelYaxis2 == "Avg.Ride-Duration"){
q = dp %>%
group_by(`Month Name`, `Rider Type`, `Bike Type`, `Week of Year`) %>%
summarise(mean = mean(`Ride Length`)) %>% ungroup()
} else if(input$SelYaxis2 == "Avg.Ride-Distance"){
q = dp %>%
filter(!is.na(`Station Distance`)) %>%
group_by(`Month Name`, `Rider Type`, `Bike Type`, `Week of Year`) %>%
summarise(mean = mean(`Station Distance`)) %>% ungroup()
}
} else {
if(input$SelYaxis2 == "Avg.Ride-Duration"){
q = dp %>%
group_by(`Month Name`, `Rider Type`, `Week of Year`) %>%
summarise(mean = mean(`Ride Length`)) %>% ungroup()
} else if(input$SelYaxis2 == "Avg.Ride-Distance"){
q = dp %>%
filter(!is.na(`Station Distance`)) %>%
group_by(`Month Name`, `Rider Type`, `Week of Year`) %>%
summarise(mean = mean(`Station Distance`)) %>% ungroup()
} }
Months <- dp %>% select(`Month Name`) %>% distinct()
if(input$Seldata == "Entire Year"){
Months$`Month Name` <- factor(Months$`Month Name`, levels = c("May 2023", "June 2023", "July 2023", "Aug 2023", "Sep 2023", "Oct 2023", "Nov 2023", "Dec 2023", "Jan 2024", "Feb 2024", "March 2024", "April 2024"))
} else if(input$Seldata == "Spring"){
Months$`Month Name` <- factor(Months$`Month Name`, levels = c("May 2023", "March 2024", "April 2024"))
} else if(input$Seldata == "Summer"){
Months$`Month Name` <- factor(Months$`Month Name`, levels = c("June 2023", "July 2023", "Aug 2023"))
} else if(input$Seldata == "Fall"){
Months$`Month Name` <- factor(Months$`Month Name`, levels = c("Sep 2023", "Oct 2023", "Nov 2023"))
} else if(input$Seldata == "Winter"){
Months$`Month Name` <- factor(Months$`Month Name`, levels = c("Dec 2023", "Jan 2024", "Feb 2024"))
}
Months <- Months[order(Months$`Month Name`),]
xInp1 <- Months$`Month Name`[xInp + 1]
q_cas_dock <- q %>% filter(`Month Name` == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Docked")} else {.}}
q_cas_clas <- q %>% filter(`Month Name` == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Classic")} else {.}}
q_cas_elec <- q %>% filter(`Month Name` == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Electric")} else {.}}
q_mem_clas <- q %>% filter(`Month Name` == as.character(xInp1)) %>% filter(`Rider Type` == "Member") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Classic")} else {.}}
q_mem_elec <- q %>% filter(`Month Name` == as.character(xInp1)) %>% filter(`Rider Type` == "Member") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Electric")} else {.}}
Yearlyweek <- dp %>% filter(`Month Name` == as.character(xInp1)) %>% select(`Week of Year`) %>% distinct()
Yearlyweek <- Yearlyweek[order(Yearlyweek$`Week of Year`), ]
filterdata1 <- right_join(q_cas_dock, Yearlyweek, by='Week of Year')
filterdata2 <- right_join(q_cas_clas, Yearlyweek, by='Week of Year')
filterdata3 <- right_join(q_cas_elec, Yearlyweek, by='Week of Year')
filterdata4 <- right_join(q_mem_clas, Yearlyweek, by='Week of Year')
filterdata5 <- right_join(q_mem_elec, Yearlyweek, by='Week of Year')
if(!any(is.na(filterdata1)) && !any(is.na(filterdata2)) && !any(is.na(filterdata3)) && !any(is.na(filterdata4)) && !any(is.na(filterdata5))){
filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
} else {
if(any(is.na(filterdata1))){
filterdata1$mean <- ifelse(is.na(filterdata1$mean), 0, filterdata1$mean)
filterdata1$`Month Name` <- ifelse(is.na(filterdata1$`Month Name`), xInp1, filterdata1$`Month Name`)
filterdata1$`Rider Type` <- ifelse(is.na(filterdata1$`Rider Type`), "Casual", filterdata1$`Rider Type`)
if(input$Part2 == "Yes(by Bike-Type)") {
filterdata1$`Bike Type` <- ifelse(is.na(filterdata1$`Bike Type`), "Docked", filterdata1$`Bike Type`)
}
}
if(any(is.na(filterdata2))){
filterdata2$mean <- ifelse(is.na(filterdata2$mean), 0, filterdata2$mean)
filterdata2$`Month Name` <- ifelse(is.na(filterdata2$`Month Name`), xInp1, filterdata2$`Month Name`)
filterdata2$`Rider Type` <- ifelse(is.na(filterdata2$`Rider Type`), "Casual", filterdata2$`Rider Type`)
if(input$Part2 == "Yes(by Bike-Type)") {
filterdata2$`Bike Type` <- ifelse(is.na(filterdata2$`Bike Type`), "Classic", filterdata2$`Bike Type`)
}
}
if(any(is.na(filterdata3))){
filterdata3$mean <- ifelse(is.na(filterdata3$mean), 0, filterdata3$mean)
filterdata3$`Month Name` <- ifelse(is.na(filterdata3$`Month Name`), xInp1, filterdata3$`Month Name`)
filterdata3$`Rider Type` <- ifelse(is.na(filterdata3$`Rider Type`), "Casual", filterdata3$`Rider Type`)
if(input$Part2 == "Yes(by Bike-Type)") {
filterdata3$`Bike Type` <- ifelse(is.na(filterdata3$`Bike Type`), "Electric", filterdata3$`Bike Type`)
}
}
if(any(is.na(filterdata4))){
filterdata4$mean <- ifelse(is.na(filterdata4$mean), 0, filterdata4$mean)
filterdata4$`Month Name` <- ifelse(is.na(filterdata4$`Month Name`), xInp1, filterdata4$`Month Name`)
filterdata4$`Rider Type` <- ifelse(is.na(filterdata4$`Rider Type`), "Member", filterdata4$`Rider Type`)
if(input$Part2 == "Yes(by Bike-Type)") {
filterdata4$`Bike Type` <- ifelse(is.na(filterdata4$`Bike Type`), "Classic", filterdata4$`Bike Type`)
}
}
if(any(is.na(filterdata5))){
filterdata5$mean <- ifelse(is.na(filterdata5$mean), 0, filterdata5$mean)
filterdata5$`Month Name` <- ifelse(is.na(filterdata5$`Month Name`), xInp1, filterdata5$`Month Name`)
filterdata5$`Rider Type` <- ifelse(is.na(filterdata5$`Rider Type`), "Member", filterdata5$`Rider Type`)
if(input$Part2 == "Yes(by Bike-Type)") {
filterdata5$`Bike Type` <- ifelse(is.na(filterdata5$`Bike Type`), "Electric", filterdata5$`Bike Type`)
}
}
filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
}
filterdata$`Week of Year` <- factor(filterdata$`Week of Year`, levels = unique(Yearlyweek$`Week of Year`))
ch <- reactive({
{if(input$Part2 == "Yes(by Bike-Type)") {
ggplot(filterdata, aes(x = `Week of Year`, y = interaction(`Rider Type`, `Bike Type`), fill = mean))
} else {
ggplot(filterdata, aes(x = `Week of Year`, y = `Rider Type`, fill = mean))
}} +
geom_tile(color = "white") +
scale_fill_gradient(low = "#F9CE87", high = "#F28B1E") +
{if(input$Part2 == "Yes(by Bike-Type)") {
labs(x = "Week of Year", y = "Rider-Bike Combination", fill = "mean")
} else {
labs(x = "Week of Year", y = "Rider Type", fill = "mean")
}} +
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
labs(title = paste(input$SelYaxis2, " distribution across Week of the Year for the Month:", xInp1))
})
showModal(modalDialog(
plotOutput("HeatMap"),
title = paste("Heat Map" )
))
output$HeatMap <- renderPlot({ch()})
} else if(input$SelXaxis2 == "Day of Year"){
if(input$Part2 == "Yes(by Bike-Type)"){
if(input$SelYaxis2 == "Avg.Ride-Duration"){
q = dp %>% { if(cInp == "2" | cInp == "4") {filter(. , `Bike Type` == "Electric")} else if(cInp == "1") {filter(. , `Bike Type` == "Docked")} else {.}} %>%
group_by(`Day of Year`, `Rider Type`, `Bike Type`, `Start Time (in Hrs)`) %>%
summarise(mean = mean(`Ride Length`)) %>% ungroup()
} else if(input$SelYaxis2 == "Avg.Ride-Distance"){
q = dp %>% { if(cInp == "2" | cInp == "4") {filter(. , `Bike Type` == "Electric")} else if(cInp == "1") {filter(. , `Bike Type` == "Docked")} else {.}} %>%
filter(!is.na(`Station Distance`)) %>%
group_by(`Day of Year`, `Rider Type`, `Bike Type`, `Start Time (in Hrs)`) %>%
summarise(mean = mean(`Station Distance`)) %>% ungroup()
} } else {
if(input$SelYaxis2 == "Avg.Ride-Duration"){
q = dp %>%
group_by(`Day of Year`, `Rider Type`, `Start Time (in Hrs)`) %>%
summarise(mean = mean(`Ride Length`)) %>% ungroup()
} else if(input$SelYaxis2 == "Avg.Ride-Distance"){
q = dp %>%
filter(!is.na(`Station Distance`)) %>%
group_by(`Day of Year`, `Rider Type`, `Start Time (in Hrs)`) %>%
summarise(mean = mean(`Station Distance`)) %>% ungroup()
} }
days = q %>% select(`Day of Year`) %>% distinct()
days = days[order(days$`Day of Year`), ]
days$`Day of Year` = factor(days$`Day of Year`, levels = unique(days$`Day of Year`))
xInp1 = days$`Day of Year`[xInp + 1]
q_cas_dock <- q %>% filter(`Day of Year` == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Docked")} else {.}}
q_cas_clas <- q %>% filter(`Day of Year` == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Classic")} else {.}}
q_cas_elec <- q %>% filter(`Day of Year` == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Electric")} else {.}}
q_mem_clas <- q %>% filter(`Day of Year` == as.character(xInp1)) %>% filter(`Rider Type` == "Member") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Classic")} else {.}}
q_mem_elec <- q %>% filter(`Day of Year` == as.character(xInp1)) %>% filter(`Rider Type` == "Member") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Electric")} else {.}}
filterdata1 <- right_join(q_cas_dock, starttime, by='Start Time (in Hrs)')
filterdata2 <- right_join(q_cas_clas, starttime, by='Start Time (in Hrs)')
filterdata3 <- right_join(q_cas_elec, starttime, by='Start Time (in Hrs)')
filterdata4 <- right_join(q_mem_clas, starttime, by='Start Time (in Hrs)')
filterdata5 <- right_join(q_mem_elec, starttime, by='Start Time (in Hrs)')
if(!any(is.na(filterdata1)) && !any(is.na(filterdata2)) && !any(is.na(filterdata3)) && !any(is.na(filterdata4)) && !any(is.na(filterdata5))){
filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
} else {
if(any(is.na(filterdata1))){
filterdata1$mean <- ifelse(is.na(filterdata1$mean), 0, filterdata1$mean)
filterdata1$`Day of Year` <- ifelse(is.na(filterdata1$`Day of Year`), xInp1, filterdata1$`Day of Year`)
filterdata1$`Rider Type` <- ifelse(is.na(filterdata1$`Rider Type`), "Casual", filterdata1$`Rider Type`)
if(input$Part2 == "Yes(by Bike-Type)") {
filterdata1$`Bike Type` <- ifelse(is.na(filterdata1$`Bike Type`), "Docked", filterdata1$`Bike Type`)
}
}
if(any(is.na(filterdata2))){
filterdata2$mean <- ifelse(is.na(filterdata2$mean), 0, filterdata2$mean)
filterdata2$`Day of Year` <- ifelse(is.na(filterdata2$`Day of Year`), xInp1, filterdata2$`Day of Year`)
filterdata2$`Rider Type` <- ifelse(is.na(filterdata2$`Rider Type`), "Casual", filterdata2$`Rider Type`)
if(input$Part2 == "Yes(by Bike-Type)") {
filterdata2$`Bike Type` <- ifelse(is.na(filterdata2$`Bike Type`), "Classic", filterdata2$`Bike Type`)
}
}
if(any(is.na(filterdata3))){
filterdata3$mean <- ifelse(is.na(filterdata3$mean), 0, filterdata3$mean)
filterdata3$`Day of Year` <- ifelse(is.na(filterdata3$`Day of Year`), xInp1, filterdata3$`Day of Year`)
filterdata3$`Rider Type` <- ifelse(is.na(filterdata3$`Rider Type`), "Casual", filterdata3$`Rider Type`)
if(input$Part2 == "Yes(by Bike-Type)") {
filterdata3$`Bike Type` <- ifelse(is.na(filterdata3$`Bike Type`), "Electric", filterdata3$`Bike Type`)
}
}
if(any(is.na(filterdata4))){
filterdata4$mean <- ifelse(is.na(filterdata4$mean), 0, filterdata4$mean)
filterdata4$`Day of Year` <- ifelse(is.na(filterdata4$`Day of Year`), xInp1, filterdata4$`Day of Year`)
filterdata4$`Rider Type` <- ifelse(is.na(filterdata4$`Rider Type`), "Member", filterdata4$`Rider Type`)
if(input$Part2 == "Yes(by Bike-Type)") {
filterdata4$`Bike Type` <- ifelse(is.na(filterdata4$`Bike Type`), "Classic", filterdata4$`Bike Type`)
}
}
if(any(is.na(filterdata5))){
filterdata5$mean <- ifelse(is.na(filterdata5$mean), 0, filterdata5$mean)
filterdata5$`Day of Year` <- ifelse(is.na(filterdata5$`Day of Year`), xInp1, filterdata5$`Day of Year`)
filterdata5$`Rider Type` <- ifelse(is.na(filterdata5$`Rider Type`), "Member", filterdata5$`Rider Type`)
if(input$Part2 == "Yes(by Bike-Type)") {
filterdata5$`Bike Type` <- ifelse(is.na(filterdata5$`Bike Type`), "Electric", filterdata5$`Bike Type`)
}
}
filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
}
filterdata$`Start Time (in Hrs)` <- factor(filterdata$`Start Time (in Hrs)`, levels = c("0","1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23"))
ch <- reactive({
{if(input$Part2 == "Yes(by Bike-Type)") {
ggplot(filterdata, aes(x = `Start Time (in Hrs)`, y = interaction(`Rider Type`, `Bike Type`), fill = mean))
} else {
ggplot(filterdata, aes(x = `Start Time (in Hrs)`, y = `Rider Type`, fill = mean))
}} +
geom_tile(color = "white") +
scale_fill_gradient(low = "#6EF0F7", high = "#09B4BD") +
{if(input$Part2 == "Yes(by Bike-Type)") {
labs(x = "Time of the Day", y = "Rider-Bike Combination", fill = "mean")
} else {
labs(x = "Time of the Day", y = "Rider Type", fill = "mean")
}} +
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
labs(title = paste(input$SelYaxis2, " distribution across different time-period for Day:", xInp1))
})
showModal(modalDialog(
plotOutput("HeatMap"),
title = paste("Heat Map" )
))
output$HeatMap <- renderPlot({ch()})
} else if(input$SelXaxis2 == "Season"){
if(input$Part2 == "Yes(by Bike-Type)") {
if(input$SelYaxis2 == "Avg.Ride-Duration"){
q = dp %>% { if(cInp == "1") {filter(. , `Bike Type` == "Docked")} else {.}} %>%
group_by(Season, `Rider Type`, `Bike Type`, `Month Name`) %>%
summarise(mean = mean(`Ride Length`)) %>% ungroup()
} else if(input$SelYaxis2 == "Avg.Ride-Distance"){
q = dp %>% { if(cInp == "1") {filter(. , `Bike Type` == "Docked")} else {.}} %>%
filter(!is.na(`Station Distance`)) %>%
group_by(Season, `Rider Type`, `Bike Type`, `Month Name`) %>%
summarise(mean = mean(`Station Distance`)) %>% ungroup()
}
} else{
if(input$SelYaxis2 == "Avg.Ride-Duration"){
q = dp %>%
group_by(Season, `Rider Type`, `Month Name`) %>%
summarise(mean = mean(`Ride Length`)) %>% ungroup()
} else if(input$SelYaxis2 == "Avg.Ride-Distance"){
q = dp %>%
filter(!is.na(`Station Distance`)) %>%
group_by(Season, `Rider Type`, `Month Name`) %>%
summarise(mean = mean(`Station Distance`)) %>% ungroup()
} }
Seasons <- q %>% select(Season) %>% distinct()
Seasons$Season <- factor(Seasons$Season, levels = c("Fall", "Spring", "Summer", "Winter"))
Seasons <- Seasons[order(Seasons$Season),]
xInp1 <- Seasons$Season[xInp + 1]
q_cas_dock <- q %>% filter(Season == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Docked")} else {.}}
q_cas_clas <- q %>% filter(Season == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Classic")} else {.}}
q_cas_elec <- q %>% filter(Season == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Electric")} else {.}}
q_mem_clas <- q %>% filter(Season == as.character(xInp1)) %>% filter(`Rider Type` == "Member") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Classic")} else {.}}
q_mem_elec <- q %>% filter(Season == as.character(xInp1)) %>% filter(`Rider Type` == "Member") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Electric")} else {.}}
Months <- dp %>% filter(Season == as.character(xInp1)) %>% select(`Month Name`) %>% distinct()
if(xInp1 == "Spring"){
Months$`Month Name` <- factor(Months$`Month Name`, levels = c("May 2023", "March 2024", "April 2024"))
} else if(xInp1 == "Summer"){
Months$`Month Name` <- factor(Months$`Month Name`, levels = c("June 2023", "July 2023", "Aug 2023"))
} else if(xInp1 == "Fall"){
Months$`Month Name` <- factor(Months$`Month Name`, levels = c("Sep 2023", "Oct 2023", "Nov 2023"))
} else if(xInp1 == "Winter"){
Months$`Month Name` <- factor(Months$`Month Name`, levels = c("Dec 2023", "Jan 2024", "Feb 2024"))
}
Months <- Months[order(Months$`Month Name`),]
filterdata1 <- right_join(q_cas_dock, Months, by='Month Name')
filterdata2 <- right_join(q_cas_clas, Months, by='Month Name')
filterdata3 <- right_join(q_cas_elec, Months, by='Month Name')
filterdata4 <- right_join(q_mem_clas, Months, by='Month Name')
filterdata5 <- right_join(q_mem_elec, Months, by='Month Name')
if(!any(is.na(filterdata1)) && !any(is.na(filterdata2)) && !any(is.na(filterdata3)) && !any(is.na(filterdata4)) && !any(is.na(filterdata5))){
filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
} else {
if(any(is.na(filterdata1))){
filterdata1$mean <- ifelse(is.na(filterdata1$mean), 0, filterdata1$mean)
filterdata1$Season <- ifelse(is.na(filterdata1$Season), xInp1, filterdata1$Season)
filterdata1$`Rider Type` <- ifelse(is.na(filterdata1$`Rider Type`), "Casual", filterdata1$`Rider Type`)
if(input$Part2 == "Yes(by Bike-Type)") {
filterdata1$`Bike Type` <- ifelse(is.na(filterdata1$`Bike Type`), "Docked", filterdata1$`Bike Type`)
}
}
if(any(is.na(filterdata2))){
filterdata2$mean <- ifelse(is.na(filterdata2$mean), 0, filterdata2$mean)
filterdata2$Season <- ifelse(is.na(filterdata2$Season), xInp1, filterdata2$Season)
filterdata2$`Rider Type` <- ifelse(is.na(filterdata2$`Rider Type`), "Casual", filterdata2$`Rider Type`)
if(input$Part2 == "Yes(by Bike-Type)") {
filterdata2$`Bike Type` <- ifelse(is.na(filterdata2$`Bike Type`), "Classic", filterdata2$`Bike Type`)
}
}
if(any(is.na(filterdata3))){
filterdata3$mean <- ifelse(is.na(filterdata3$mean), 0, filterdata3$mean)
filterdata3$Season <- ifelse(is.na(filterdata3$Season), xInp1, filterdata3$Season)
filterdata3$`Rider Type` <- ifelse(is.na(filterdata3$`Rider Type`), "Casual", filterdata3$`Rider Type`)
if(input$Part2 == "Yes(by Bike-Type)") {
filterdata3$`Bike Type` <- ifelse(is.na(filterdata3$`Bike Type`), "Electric", filterdata3$`Bike Type`)
}
}
if(any(is.na(filterdata4))){
filterdata4$mean <- ifelse(is.na(filterdata4$mean), 0, filterdata4$mean)
filterdata4$Season <- ifelse(is.na(filterdata4$Season), xInp1, filterdata4$Season)
filterdata4$`Rider Type` <- ifelse(is.na(filterdata4$`Rider Type`), "Member", filterdata4$`Rider Type`)
if(input$Part2 == "Yes(by Bike-Type)") {
filterdata4$`Bike Type` <- ifelse(is.na(filterdata4$`Bike Type`), "Classic", filterdata4$`Bike Type`)
}
}
if(any(is.na(filterdata5))){
filterdata5$mean <- ifelse(is.na(filterdata5$mean), 0, filterdata5$mean)
filterdata5$Season <- ifelse(is.na(filterdata5$Season), xInp1, filterdata5$Season)
filterdata5$`Rider Type` <- ifelse(is.na(filterdata5$`Rider Type`), "Member", filterdata5$`Rider Type`)
if(input$Part2 == "Yes(by Bike-Type)") {
filterdata5$`Bike Type` <- ifelse(is.na(filterdata5$`Bike Type`), "Electric", filterdata5$`Bike Type`)
}
}
filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
}
filterdata$`Month Name` <- factor(filterdata$`Month Name`, levels = unique(Months$`Month Name`))
ch <- reactive({
{if(input$Part2 == "Yes(by Bike-Type)") {
ggplot(filterdata, aes(x = `Month Name`, y = interaction(`Rider Type`, `Bike Type`), fill = mean))
} else {
ggplot(filterdata, aes(x = `Month Name`, y = `Rider Type`, fill = mean))
}} +
geom_tile(color = "white") +
scale_fill_gradient(low = "#BD8DEC", high = "#7906EC") +
{if(input$Part2 == "Yes(by Bike-Type)") {
labs(x = "Months", y = "Rider-Bike Combination", fill = "mean")
} else {
labs(x = "Months", y = "Rider Type", fill = "mean")
}} +
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
labs(title = paste(input$SelYaxis2, " distribution across Months for the Season:", xInp1))
})
showModal(modalDialog(
plotOutput("HeatMap"),
title = paste("Heat Map" )
))
output$HeatMap <- renderPlot({ch()})
}
}
Choropleth Maps are useful in identifying the regional density distribution. For current Case-study, a dot density map has been plotted on the top of the map of Chicago thus displaying the region where the density is the highest.
output$map_plot <- renderPlot({
if(input$Data4 == "Yes"){
output$dataM <- DT::renderDataTable({
DT::datatable(
mytable %>% filter(`Type of Rider` == paste(input$SelRid)) %>%
filter(`Bike Type` == paste(input$SelBik)) %>%
mutate(Perc_n = (n/sum(n))*100) %>% filter(Perc_n > "0.15") %>%
mutate_if(is.numeric, round, digits = 3),
filter = 'top', extensions = c('Buttons', 'Scroller'),
options = list(scrollY = 375,
scrollX = 500,
deferRender = TRUE,
scroller = TRUE,
# paging = TRUE,
# pageLength = 25,
buttons = list(list(extend = 'colvis', targets = 0, visible = FALSE)),
dom = 'lBfrtip',
fixedColumns = TRUE),
rownames = FALSE)
})
} else {
output$dataM <- DT::renderDataTable({
DT::datatable(
NULL
)
})
}
shp_1 <- read_sf("Boundaries - Community Areas (current)")
if(input$SelRid == "casual" & input$SelBik == "classic_bike"){
shp <- read_sf("mytable_casual_classic")
interval <- c(0, 0.005, 0.01, 0.015, 0.02, 0.025, 0.03, 0.035, 0.04, 0.045, 0.05, 0.055, 0.06, 0.065, 0.07, 0.075, 0.08, 0.085, 0.09, 0.095, 0.1, 0.105, 0.110, 0.115, 0.12, 0.125, 0.13, 0.135, 0.14, 0.145, 0.15, 0.155, 0.16, 0.165, 0.17, Inf )
l <- c("0.000-0.010", "0.010-0.020", "0.020-0.030",
"0.030-0.040", "0.040-0.050", "0.050-0.060",
"0.060-0.070", "0.070-0.080", "0.080-0.090",
"0.090-0.100", "0.100-0.110", "0.110-0.120",
"0.120-0.130", "0.130-0.140", "0.140-0.150",
"0.150-0.160", "0.160-0.170", "> 0.170" )
m = 0.17
} else if(input$SelRid == "casual" & input$SelBik == "electric_bike"){
shp <- read_sf("mytable_casual_electric")
interval <- c(0, 0.005, 0.01, 0.015, 0.02, 0.025, 0.03, 0.035, 0.04, 0.045, 0.05, 0.055, 0.06, 0.065, 0.07, 0.075, 0.08, 0.085, 0.09, 0.095, 0.1, 0.105, 0.110, 0.115, 0.12, 0.125, 0.13, 0.135, 0.14, 0.145, 0.15, Inf )
l <- c("0.000-0.010", "0.010-0.020", "0.020-0.030",
"0.030-0.040", "0.040-0.050", "0.050-0.060",
"0.060-0.070", "0.070-0.080", "0.080-0.090",
"0.090-0.100", "0.100-0.110", "0.110-0.120",
"0.120-0.130", "0.130-0.140", "0.140-0.150","> 0.150")
m = 0.15
} else if(input$SelRid == "member" & input$SelBik == "classic_bike"){
shp <- read_sf("mytable_member_classic")
interval <- c(0, 0.005, 0.01, 0.015, 0.02, 0.025, 0.03, 0.035, 0.04, 0.045, 0.05, 0.055, 0.06, 0.065, 0.07, 0.075, 0.08, 0.085, 0.09, 0.095, 0.1, 0.105, 0.110, 0.115, 0.12, 0.125, 0.13, 0.135, 0.14, 0.145, 0.15, 0.155, 0.16, 0.165, 0.17, 0.175, 0.18, 0.185, 0.19, 0.195, 0.200, 0.205, 0.210, Inf )
l <- c("0.000-0.010", "0.010-0.020", "0.020-0.030",
"0.030-0.040", "0.040-0.050", "0.050-0.060",
"0.060-0.070", "0.070-0.080", "0.080-0.090",
"0.090-0.100", "0.100-0.110", "0.110-0.120",
"0.120-0.130", "0.130-0.140", "0.140-0.150",
"0.150-0.160", "0.160-0.170", "0.170-0.180",
"0.180-0.190", "0.190-0.200", "0.200-0.210", "> 0.210" )
m = 0.21
} else {
shp <- read_sf("mytable_member_electric")
interval <- c(0, 0.005, 0.01, 0.015, 0.02, 0.025, 0.03, 0.035, 0.04, 0.045, 0.05, 0.055, 0.06, 0.065, 0.07, 0.075, 0.08, 0.085, 0.09, 0.095, 0.1, 0.105, 0.110, 0.115, 0.12, 0.125, 0.13, 0.135, 0.14, 0.145, 0.15, 0.155, 0.16, 0.165, 0.17, Inf )
l <- c("0.000-0.010", "0.010-0.020", "0.020-0.030",
"0.030-0.040", "0.040-0.050", "0.050-0.060",
"0.060-0.070", "0.070-0.080", "0.080-0.090",
"0.090-0.100", "0.100-0.110", "0.110-0.120",
"0.120-0.130", "0.130-0.140", "0.140-0.150",
"0.150-0.160", "0.160-0.170", "> 0.170" )
m = 0.17
}
d <- mytable %>% filter(`Type of Rider` == paste(input$SelRid)) %>% filter(`Bike Type` == paste(input$SelBik)) %>% mutate(Perc_n = (n/sum(n))*100)
n_interval <- length(interval) - 1
incidence_interval_1 <- classIntervals(d$Perc_n,
n = n_interval,
style = "fixed",
fixedBreaks = interval,
yearrouterecision = 3)
colors <- colorRampPalette(brewer.pal(9, "Oranges"))(n_interval)
colcod <- findColours(incidence_interval_1, colors)
p <- ggplot(shp_1) +
geom_sf(color = "black", fill = "khaki3") +
geom_sf(shp, mapping = aes(fill = Perc_n), color = colcod) +
coord_sf(xlim = c(-88, -87.52), ylim = c(41.6, 42.07)) +
ggthemes::theme_map() +
theme(legend.position = "none") +
theme(panel.background = element_rect(fill = "aliceblue", color = "white"),
axis.ticks = element_blank(),
axis.text = element_blank()) +
# panel.grid = element_line(color = "white", linewidth = 0.8)) +
labs(title = "Chicago City Map",
subtitle = paste("Heatmap showing percen. distribution of", input$SelRid,"-", input$SelBik,"riders"))
p
# Function to plot color bar
color.bar <- function(lut, min = 0, max = m, nticks = n_interval, ticks = seq(min, max, len = nticks/2), title='Dist. % per Dot') {
scale = (max-min)/(length(lut)-1)
plot(c(0,5), c(min,max), type='n', bty='n', xaxt='n', xlab='', yaxt='n', ylab='', main=title , adj = 0, line = -0.02, cex.main = 0.75)
axis(2, at = ticks, labels = l, cex.axis = 0.6, las=1)
for (i in 0:(length(lut)-1)) {
y = i*scale + min
rect(0,y,1.5,(length(lut)-1)*scale , col=lut[i], border= "black")
}
}
color.bar(colorRampPalette(brewer.pal(9, "Oranges"))(n_interval))
# Combining the ggplot with colorbar using gridExtra from Hmisc package.
p2 <- as.grob(p)
r <- as.grob(function() color.bar(colorRampPalette(brewer.pal(9, "Oranges"))(n_interval)))
grid.newpage()
grid.draw(p2)
vp = viewport(x=0.8, y=.45, width=.3, height=1.2)
pushViewport(vp)
grid.draw(r)
upViewport()
})
For deploying the web-app, first we need to combine the UI and server code in one file called “app.R”.
Include the below code before UI
library(tidyverse)
library(readr)
library(dplyr)
library(lubridate)
library(shiny)
library(shinydashboard)
library(fontawesome)
library(plotly)
library(DT)
library(janitor)
library(devtools)
library(urbnmapr)
library(tibble)
library(sf)
library(rvest)
library(viridis)
library(ggrepel)
library(ggthemes)
library(hms)
library(RColorBrewer)
library(classInt)
library(dots)
library(cartography)
library(mapsf)
library(scales)
library(ggpubr)
library(Hmisc)
library(gridExtra)
library("grid")
library("ggplotify")
library(effects)
library(ggplot2)
library(ggtext)
library(maps)
library(shinycssloaders)
library(plotly)
library(rlang)
library(ggcorrplot)
library(gghighlight)
library(webr)
library(grid)
library(reshape)
library(conflicted)
conflicts_prefer(DT::dataTableOutput)
conflicts_prefer(dplyr::filter)
conflicts_prefer(plotly::layout)
conflicts_prefer(dplyr::rename)
yearroute <- read_csv("yearroute.txt")
Station <- read_csv("Station.txt")
starttime <- read_csv("starttime.txt")
rawdata_sample <- read_csv("rawdata_sample.txt")
mytable <- read_csv("mytable.txt")
input_list <- read.csv("Inputs.csv")
c <- c("May 2023", "June 2023", "July 2023", "Aug 2023", "Sep 2023", "Oct 2023", "Nov 2023", "Dec 2023", "Jan 2024", "Feb 2024", "March 2024", "April 2024")
Include the below code after the Server
shinyApp(ui = ui, server = server)
Some points to be noted:
Include all your media files in a “www” folder and keep that folder inside the folder where the file app.R is present.
If there is a reference to any external file other than app.R, then give the path of that file relative to app.R file and not the absolute path (i.e. C:/Users/…..).
If there is OOM (Out of Memory) error given during deployment then we can do the following:
Some features of my Case-Study analysis are as follows:
Customised Analysis: With the various UI Controls available, we just need to input the parameters and a graph customised as per our requirement gets ready in no time!
Automation of Analysis: This is the main advantage of using R shiny i.e. for automating the analytical process. With the help of automation in Shiny, I am able to create 240 barcharts and 5022 linked charts for monthly analysis, 216 barcharts and 7318 linked charts for yearly analysis and 4 graphs for choropleth maps which amounts to 12800 graphs. Making 12800 graphs manually would really take a lot of time and effort. But with Shiny, it really becomes a lot more easier, time-saving and also it does not cluster up the space with graphs.
Complying with Stakeholder’s Needs: Many times, Stakeholders do prefer referring to the Data-table for which a separate option for viewing Data is provided as well.
Linked Charts: Linking of Pie Charts and Heatmaps to bars for indepth visualization.
Highlights: Spotlighting important numbers for ease of identification.
Attractive UI: Use of various icons for an attractive UI.
The findings and recommendations are as given below:
People who would purchase Cyclistic Membership could be given some free rides annually to places like Dusable Lake Shore Drive and Streeter Dr where multiple parks are located.
Small groups of Riders can be formed by Cyclistic which would prefer going for early morning rides and also in the evening (around 17.00 hrs) in Spring and Summer covering several places in Chicago. Advertisements of such groups should be done focussed towards the residents in the region of Streeter Dr and Grand Ave and other top 25 such stations which can attract a lot of people who may join the groups to expand their Networks. Marketing team can provide Seasonal memberships at discounted prices to these groups.
Since, the Casual riders are more during Weekends, Marathons could be organised on Weekends. Particiants could be provided with discounted memberships.
With this case-study, I really got to develop a lot of skills pertaining to programming in R studio and getting comfortable with Shiny webapp as well. Throughout this Case-Study analysis, I was able to get my hands on various tidyverse packages such as tidyr, dplyr, stringr, ggplot2 and also beyond that such as lubridate, shinydashboard, viridis, DT, Hmisc, gridextra, urbanmapr and many more!
I would love to thank the entire team at Google for such a wonderful course. Besides this, I would also like to give credit to the youtube playlist of Abhinav Agrawal Sir, Data Science Tutorials by Abhishek Agarwal Sir and the entire community on Stack-overflow for helping me out with their informative articles!