Introduction

About the Company

In 2016, Cyclistic launched a successful bike-share offering. Since then, the program has grown to a fleet of 5,824 bicycles that are geotracked and locked into a network of 692 stations across Chicago. The bikes can be unlocked from one station and returned to any other station in the system anytime.

Cyclistic sets itself apart by offering reclining bikes, hand tricycles, and cargo bikes, making bike-share more inclusive to people with disabilities and riders who can’t use a standard two-wheeled bike. The majority of riders opt for traditional bikes; about 8% of riders use the assistive options. Cyclistic users are more likely to ride for leisure, but about 30% use the bikes to commute to work each day.

Cyclistic has classified its customers broadly into 2 categories: Casual Riders and Members. Customers who purchase single-ride or full-day passes are referred to as Casual Riders. Customers who purchase annual memberships are Cyclistic Members.

Cyclistic’s finance analysts have concluded that annual members are much more profitable than casual riders.

Business Task

The current position is of a junior data analyst working on the marketing analyst team at Cyclistic, a bike-share company in Chicago. The director of marketing believes the company’s future success depends on maximizing the number of annual memberships. Therefore, the team wants to understand how casual riders and annual members use Cyclistic bikes differently.

Methodology

Data Analysis comprises of six phases: Ask, Prepare, Process, Analyse, Share and Act.

Ask

In this phase, in order to get a clear picture of the business task that needs to be executed, we need to ask questions to the Stakeholders and get a clear understanding of their interests and expectations from the Project. The primary task for this project is as given earlier:To understand the difference between Casual Riders and Members

Prepare

First we load the libraries as per below:

library(tidyverse)
library(readr)
library(dplyr)
library(lubridate)
library(shiny)
library(shinydashboard)
library(fontawesome)
library(plotly)
library(DT)
library(janitor)
library(devtools)
library(urbnmapr)
library(tibble)
library(sf)
library(rvest)
library(viridis)
library(ggrepel)
library(ggthemes)
library(hms)
library(RColorBrewer)
library(classInt)
library(dots)
library(cartography)
library(mapsf)
library(scales)
library(ggpubr)
library(Hmisc)
library(gridExtra)
library("grid")
library("ggplotify")
library(effects)
library(ggplot2)
library(ggtext)
library(maps)
library(shinycssloaders)
library(plotly)
library(rlang)
library(ggcorrplot)
library(gghighlight)
library(webr)
library(grid)
library(reshape)
library(conflicted)
knitr::opts_chunk$set(echo = TRUE)

Now conflicts occur because of the presence of same package within multiple libraries. Hence for avoiding conflicts, we write the below code:

conflicts_prefer(DT::dataTableOutput)
conflicts_prefer(dplyr::filter)
conflicts_prefer(plotly::layout)
conflicts_prefer(dplyr::rename)

Now we first load the raw data. The raw data is released on a monthly basis and is made available under a shareable public license. For this case-study, we will work on the latest possible data of 1 year i.e. from May 2023-April 2024. Hence, in order to work on this data, we need to load 12 raw files as below :

May <- read_csv("C:/Users/nevil/Desktop/Google Case_Studies/Case-Study 1/202305-divvy-tripdata/202305-divvy-tripdata.csv")

To get a view of this data, we need to execute the code as given below:

tibble(May)
## # A tibble: 604,827 × 13
##    ride_id          rideable_type started_at          ended_at           
##    <chr>            <chr>         <dttm>              <dttm>             
##  1 0D9FA920C3062031 electric_bike 2023-05-07 19:53:48 2023-05-07 19:58:32
##  2 92485E5FB5888ACD electric_bike 2023-05-06 18:54:08 2023-05-06 19:03:35
##  3 FB144B3FC8300187 electric_bike 2023-05-21 00:40:21 2023-05-21 00:44:36
##  4 DDEB93BC2CE9AA77 classic_bike  2023-05-10 16:47:01 2023-05-10 16:59:52
##  5 C07B70172FC92F59 classic_bike  2023-05-09 18:30:34 2023-05-09 18:39:28
##  6 2BA66385DF8F815A classic_bike  2023-05-30 15:01:21 2023-05-30 15:17:00
##  7 31EFCCB05F12D8EF docked_bike   2023-05-09 14:13:40 2023-05-09 14:47:20
##  8 71DFF834E1D3CE0B classic_bike  2023-05-06 16:47:22 2023-05-06 16:52:13
##  9 2117485899B4CEA4 classic_bike  2023-05-15 12:47:26 2023-05-15 13:00:05
## 10 811149F69AAE82DD electric_bike 2023-05-19 05:44:26 2023-05-19 05:47:24
## # ℹ 604,817 more rows
## # ℹ 9 more variables: start_station_name <chr>, start_station_id <chr>,
## #   end_station_name <chr>, end_station_id <chr>, start_lat <dbl>,
## #   start_lng <dbl>, end_lat <dbl>, end_lng <dbl>, member_casual <chr>

Similarly, we can load the other files as well

June <- read_csv("C:/Users/nevil/Desktop/Google Case_Studies/Case-Study 1/202306-divvy-tripdata/202306-divvy-tripdata.csv")
July <- read_csv("C:/Users/nevil/Desktop/Google Case_Studies/Case-Study 1/202307-divvy-tripdata/202307-divvy-tripdata.csv")
Aug <- read_csv("C:/Users/nevil/Desktop/Google Case_Studies/Case-Study 1/202308-divvy-tripdata/202308-divvy-tripdata.csv")
Sep <- read_csv("C:/Users/nevil/Desktop/Google Case_Studies/Case-Study 1/202309-divvy-tripdata/202309-divvy-tripdata.csv")
Oct <- read_csv("C:/Users/nevil/Desktop/Google Case_Studies/Case-Study 1/202310-divvy-tripdata/202310-divvy-tripdata.csv")
Nov <- read_csv("C:/Users/nevil/Desktop/Google Case_Studies/Case-Study 1/202311-divvy-tripdata/202311-divvy-tripdata.csv")
Dec <- read_csv("C:/Users/nevil/Desktop/Google Case_Studies/Case-Study 1/202312-divvy-tripdata/202312-divvy-tripdata.csv")
Jan <- read_csv("C:/Users/nevil/Desktop/Google Case_Studies/Case-Study 1/202401-divvy-tripdata/202401-divvy-tripdata.csv")
Feb <- read_csv("C:/Users/nevil/Desktop/Google Case_Studies/Case-Study 1/202402-divvy-tripdata/202402-divvy-tripdata.csv")
March <- read_csv("C:/Users/nevil/Desktop/Google Case_Studies/Case-Study 1/202403-divvy-tripdata/202403-divvy-tripdata.csv")
April <- read_csv("C:/Users/nevil/Desktop/Google Case_Studies/Case-Study 1/202404-divvy-tripdata/202404-divvy-tripdata.csv")

Now we need to aggregate the Data. For aggregating the data we take the help of rbind() function

rawdata_complete <- rbind(May, June, July, Aug, Sep, Oct, Nov, Dec, Jan, Feb, March, April)

Now we have the complete data to work with so we proceed with the Process Phase

Process

In this phase, we need to clean the raw data so as to remove any irregularities by selecting and filtering out unwanted or missing data. We can also rename certain elements so as to make it look more meaningful and concise.

rawdata_complete <- rawdata_complete %>%  rename(
  `Bike Type` = rideable_type,
  start = started_at,
  end = ended_at,
  `Rider Type` = member_casual
)

We take the help of recode() function to change the names of some elements within the columns

rawdata_complete <- rawdata_complete %>% mutate(`Rider Type` = recode(`Rider Type`,"member" = "Member",
                  "casual" = "Casual")) %>%
   mutate(`Bike Type` = recode(`Bike Type`,
                               "classic_bike" = "Classic",
                               "electric_bike" = "Electric",
                               "docked_bike" = "Docked"))

Now after we have cleaned the data, we need to start with our analysis.

Analyze

In this phase, we create new variables which can be useful for our Analysis. mutate() function can be used for creation of such new variables.

First, we will try to calculate the Station Distance between 2 stations. Now for calculating the geographical distance between 2 coordinates, we can use the Haversine formula. In this, we first need to calculate Haversine Angle from which we can get the distance in kms. Reference: https://www.youtube.com/watch?v=nsVsdHeTXIE

rawdata <- rawdata_complete %>% mutate(`Haversine Angle` = ((1 - cos(end_lat - start_lat))/2) + cos(start_lat) * cos(end_lat) * ((1 - cos(end_lng - start_lng))/2))

rawdata <- rawdata %>% mutate(`Station Distance` = (acos(1 - 2 * `Haversine Angle`) * pi/180) * 6371)

Now we select those variables which are important for our analysis and remove the remaining ones

raw_mod <- rawdata %>% select(`Bike Type`, start, end, start_station_name, end_station_name, `Rider Type`, `Station Distance`)

We add Ride-Length by taking a difference of starting and ending time

raw_mod <- raw_mod %>% mutate(ride_length = as_hms(difftime(end, start)))

For our analysis, it would be better to convert it into absolute value

raw_mod <- raw_mod %>% mutate(`Ride Length` = as.numeric(abs(ride_length)/60)) %>% 
  mutate_if(is.numeric, round, digits = 3)

We can modify the start variable (which gives us the starting time of the ride) to further create new variables which can really help us extract a lot of information pertaining to the time of the ride.

raw_mod <- raw_mod %>% mutate(date = as.Date(start))
raw_mod <- raw_mod %>% mutate(Month = month(ymd(date)),
                              Weekday = weekdays(ymd(date)),
                              `Day of Year` = yday(date),
                               `Day of Month` = mday(date),
                        `Start Time (in Hrs)` = hour(start))

We can also further calculate week of year from date. For a better outlook in the UI, it is better to give names to months in place of indicating them just in the form of numbers.

raw_mod <- raw_mod %>% mutate(`Week of Year` = week(date)) %>% 
  mutate(`Month Name` = case_when(
    Month %in% c('1') ~ 'Jan 2024',
    Month %in% c('2') ~ 'Feb 2024',
    Month %in% c('3') ~ 'March 2024',
    Month %in% c('4') ~ 'April 2024',
    Month %in% c('5') ~ 'May 2023',
    Month %in% c('6') ~ 'June 2023',
    Month %in% c('7') ~ 'July 2023',
    Month %in% c('8') ~ 'Aug 2023',
    Month %in% c('9') ~ 'Sep 2023',
    Month %in% c('10') ~ 'Oct 2023',
    Month %in% c('11') ~ 'Nov 2023',
    Month %in% c('12') ~ 'Dec 2023'))

Months can be further converted to Seasons for identifying seasonal trends. Since, the firm is situated in Chicago, US, there are primarily 4 seasons according to which we need to carry out further analysis. Those are Winter, Spring, Summer and Fall.

raw_mod <- raw_mod %>% mutate(Season = case_when(
  Month %in% c('12', '1', '2') ~ 'Winter',
  Month %in% c('3', '4', '5') ~ 'Spring',
  Month %in% c('6', '7', '8') ~ 'Summer',
  Month %in% c('9', '10', '11') ~ 'Fall'
))

We will filter rows wherein the ride length is less than 1 minute. Since, rides with ride length less than 1 minute cannot be counted as rides as some times such rides get recorded in the system even when the rider is just testing the locking and unlocking of the system.

yeardata <- raw_mod %>% filter(`Ride Length` > 1.000 | `Ride Length` == 1.000)

yeardata <- yeardata %>%  rename(
  Station = start_station_name
)

Now we create another variable called as “Route” wherein we associate a number for every unique route connecting 2 bike stations in Chicago

routes <- yeardata %>% select(Station, end_station_name) 

Now we check if there is any na present in the columns

colSums(is.na(routes))
##          Station end_station_name 
##           843175           883058

Now since, na is present in the columns we filter those na values

routes <- routes %>% filter(!is.na(Station))
routes <- routes %>% filter(!is.na(end_station_name))

Again we check for na values

colSums(is.na(routes))
##          Station end_station_name 
##                0                0

Now since there are no na values present, we now extract the unique rows using distinct() function.

routes <- routes %>% distinct()

After separating out the distinct rows, we order the rows w.r.t Station Names and then create a separate column and populate it with row numbers so that it becomes easy to identify them.

routes <- routes[order(routes$Station),]

routes <- routes %>% mutate(Route = paste0("R", row_number()))

A view of the routes table is as given below

head(routes)
## # A tibble: 6 × 3
##   Station             end_station_name            Route
##   <chr>               <chr>                       <chr>
## 1 2112 W Peterson Ave Western Ave & Gunnison St   R1   
## 2 2112 W Peterson Ave Western Ave & Granville Ave R2   
## 3 2112 W Peterson Ave Hoyne Ave & Balmoral Ave    R3   
## 4 2112 W Peterson Ave Clark St & Schreiber Ave    R4   
## 5 2112 W Peterson Ave Clark St & Bryn Mawr Ave    R5   
## 6 2112 W Peterson Ave Oakley Ave & Touhy Ave      R6

Now we need to create a similar “Route” variable in our main dataset i.e. yeardata. For creating such variable, we can take the help of merge() function as follows:

yeardata2 <- yeardata %>% filter(!is.na(Station))

yeardata2 <- yeardata2 %>% filter(!is.na(end_station_name))

yearroute <- merge(yeardata2, routes, by = c("Station","end_station_name"))

Now our main dataset after creation of new variables looks like this:

head(yearroute)
##               Station    end_station_name Bike Type               start
## 1 2112 W Peterson Ave 2112 W Peterson Ave  Electric 2024-04-21 14:11:46
## 2 2112 W Peterson Ave 2112 W Peterson Ave    Docked 2023-08-18 09:13:58
## 3 2112 W Peterson Ave 2112 W Peterson Ave   Classic 2023-08-10 16:38:51
## 4 2112 W Peterson Ave 2112 W Peterson Ave  Electric 2023-06-14 19:03:41
## 5 2112 W Peterson Ave 2112 W Peterson Ave  Electric 2023-06-14 19:06:01
## 6 2112 W Peterson Ave 2112 W Peterson Ave  Electric 2023-07-23 05:13:33
##                   end Rider Type Station Distance ride_length Ride Length
## 1 2024-04-21 14:12:58     Casual            0.003    00:01:12       1.200
## 2 2023-08-18 15:57:25     Casual            0.000    06:43:27     403.450
## 3 2023-08-10 17:25:11     Casual            0.000    00:46:20      46.333
## 4 2023-06-14 19:04:59     Member            0.002    00:01:18       1.300
## 5 2023-06-14 19:07:22     Member            0.003    00:01:21       1.350
## 6 2023-07-23 05:14:33     Casual            0.003    00:01:00       1.000
##         date Month   Weekday Day of Year Day of Month Start Time (in Hrs)
## 1 2024-04-21     4    Sunday         112           21                  14
## 2 2023-08-18     8    Friday         230           18                   9
## 3 2023-08-10     8  Thursday         222           10                  16
## 4 2023-06-14     6 Wednesday         165           14                  19
## 5 2023-06-14     6 Wednesday         165           14                  19
## 6 2023-07-23     7    Sunday         204           23                   5
##   Week of Year Month Name Season Route
## 1           16 April 2024 Spring   R17
## 2           33   Aug 2023 Summer   R17
## 3           32   Aug 2023 Summer   R17
## 4           24  June 2023 Summer   R17
## 5           24  June 2023 Summer   R17
## 6           30  July 2023 Summer   R17

We will now remove certain columns which are not useful for our work

yearroute <- yearroute %>% select(-start, -end, -ride_length, -date, -Month)

Since the dataset is very huge, we would be selecting only a 1/3 sample of the data so that it does not create any memory issues at the time of project deployment on Shinyapps.io

yearmod <- yearroute %>% sample_frac(0.33)

Now we need to prepare some data for generating Choropleth maps. For this we first need to download the shape files of Chicago. We can get those files here: https://data.cityofchicago.org/Transportation/Divvy-Bicycle-Stations/bbyy-e7gq/about_data.The files they keep on updating regularly.

From this website, we can extract the spatial data in the form of Shape file in csv format.

Data will look as given below:

shape_data <- read_csv("C:/Users/nevil/Desktop/Google Case_Studies/Case-Study 1/Datasets/other files/Divvy_Bicycle_Stations_20240605.csv")
## Rows: 948 Columns: 9
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (5): ID, Station Name, Short Name, Status, Location
## dbl (4): Total Docks, Docks in Service, Latitude, Longitude
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(shape_data)
## # A tibble: 6 × 9
##   ID         `Station Name` `Short Name` `Total Docks` `Docks in Service` Status
##   <chr>      <chr>          <chr>                <dbl>              <dbl> <chr> 
## 1 a3ad5c90-… Dorchester Av… KA1503000069            15                 15 In Se…
## 2 157110506… Narragansett … <NA>                     9                  9 In Se…
## 3 a3b2af02-… MLK Jr Dr & 8… 586                     11                 11 In Se…
## 4 a3a57527-… Sedgwick St &… TA1307000062            27                 27 In Se…
## 5 159404640… California & … <NA>                     9                  9 In Se…
## 6 a3b13836-… Kenton Ave & … 537                     11                 11 In Se…
## # ℹ 3 more variables: Latitude <dbl>, Longitude <dbl>, Location <chr>

We can also download shape file of Community Areas of Chicago. These will provide a community area map of Chicago in the background which can help us identify the areas where high density of riders are located.

The shape file of community areas can be downloaded from here: https://data.cityofchicago.org/Facilities-Geographic-Boundaries/Boundaries-Community-Areas-current-/cauq-8yn6

boundaries <- read_sf("C:/Users/nevil/Desktop/Cyclistic(Reduced Data)/Boundaries - Community Areas (current)")

head(boundaries)
## Simple feature collection with 6 features and 9 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -87.7069 ymin: 41.79448 xmax: -87.58001 ymax: 41.99076
## Geodetic CRS:  WGS 84
## # A tibble: 6 × 10
##    area area_num_1 area_numbe comarea comarea_id community  perimeter shape_area
##   <dbl> <chr>      <chr>        <dbl>      <dbl> <chr>          <dbl>      <dbl>
## 1     0 35         35               0          0 DOUGLAS            0  46004621.
## 2     0 36         36               0          0 OAKLAND            0  16913961.
## 3     0 37         37               0          0 FULLER PA…         0  19916705.
## 4     0 38         38               0          0 GRAND BOU…         0  48492503.
## 5     0 39         39               0          0 KENWOOD            0  29071742.
## 6     0 4          4                0          0 LINCOLN S…         0  71352328.
## # ℹ 2 more variables: shape_len <dbl>, geometry <MULTIPOLYGON [°]>

Now we will first filter out Station, Bike and Rider Type columns from yearroute dataset and rename them so as to match with the Shape file data.

shape_mod <- yearmod %>% select(Station, `Rider Type`, `Bike Type`)

shape_mod <- yearmod %>% 
  rename(
    `Station Name` = Station,
    `Type of Rider` = `Rider Type`   # Rider Type modified to Type of Rider so as to match with the UI text
  )

shape_mod <- shape_mod %>% 
  group_by(`Type of Rider`, `Bike Type`,`Station Name`) %>% 
  count(`Station Name`)
head(shape_mod)
## # A tibble: 6 × 4
## # Groups:   Type of Rider, Bike Type, Station Name [6]
##   `Type of Rider` `Bike Type` `Station Name`                 n
##   <chr>           <chr>       <chr>                      <int>
## 1 Casual          Classic     2112 W Peterson Ave           21
## 2 Casual          Classic     63rd St Beach                135
## 3 Casual          Classic     900 W Harrison St            581
## 4 Casual          Classic     Aberdeen St & Jackson Blvd   698
## 5 Casual          Classic     Aberdeen St & Monroe St      505
## 6 Casual          Classic     Aberdeen St & Randolph St    546
colSums(is.na(shape_mod))
## Type of Rider     Bike Type  Station Name             n 
##             0             0             0             0

Since, we dont have any na values present so we can now merge the shape_mod data with shape_data. We need to do this so that we can get the necessary spatial info from shape_data for every station present in shape_mod.

mytable = merge(shape_data, shape_mod, by = 'Station Name', all=F)

head(mytable)
##           Station Name                                   ID   Short Name
## 1  2112 W Peterson Ave a3af2216-a135-11e9-9cda-0a87ae2ba916 KA1504000155
## 2  2112 W Peterson Ave a3af2216-a135-11e9-9cda-0a87ae2ba916 KA1504000155
## 3  2112 W Peterson Ave a3af2216-a135-11e9-9cda-0a87ae2ba916 KA1504000155
## 4  2112 W Peterson Ave a3af2216-a135-11e9-9cda-0a87ae2ba916 KA1504000155
## 5  2112 W Peterson Ave a3af2216-a135-11e9-9cda-0a87ae2ba916 KA1504000155
## 6 21st St & Pulaski Rd                  1931696364106218226        21331
##   Total Docks Docks in Service     Status Latitude Longitude
## 1          11               11 In Service 41.99118 -87.68359
## 2          11               11 In Service 41.99118 -87.68359
## 3          11               11 In Service 41.99118 -87.68359
## 4          11               11 In Service 41.99118 -87.68359
## 5          11               11 In Service 41.99118 -87.68359
## 6          15               15 In Service 41.85344 -87.72487
##                       Location Type of Rider Bike Type  n
## 1 POINT (-87.683593 41.991178)        Casual   Classic 21
## 2 POINT (-87.683593 41.991178)        Casual  Electric 36
## 3 POINT (-87.683593 41.991178)        Member  Electric 47
## 4 POINT (-87.683593 41.991178)        Member   Classic 69
## 5 POINT (-87.683593 41.991178)        Casual    Docked  4
## 6 POINT (-87.724874 41.853441)        Member  Electric  1

Now for every station we have spatial data as well as bike type and rider type in one single table. Now we need to make different subsets of table based on rider-bike combination so that we can get different choropleth maps for every combination.

mytable_casual_classic <- mytable %>% filter(`Type of Rider` == "Casual") %>% filter(`Bike Type` == "Classic") %>% mutate(Perc_n = (n/sum(n))*100)

head(mytable_casual_classic)
##                 Station Name                                   ID   Short Name
## 1        2112 W Peterson Ave a3af2216-a135-11e9-9cda-0a87ae2ba916 KA1504000155
## 2              63rd St Beach a3a547b8-a135-11e9-9cda-0a87ae2ba916        15491
## 3          900 W Harrison St a3a56ae5-a135-11e9-9cda-0a87ae2ba916        13028
## 4 Aberdeen St & Jackson Blvd a3a3cbac-a135-11e9-9cda-0a87ae2ba916        13157
## 5    Aberdeen St & Monroe St a3a4e4e3-a135-11e9-9cda-0a87ae2ba916        13156
## 6  Aberdeen St & Randolph St a3b3634c-a135-11e9-9cda-0a87ae2ba916        18062
##   Total Docks Docks in Service     Status Latitude Longitude
## 1          11               11 In Service 41.99118 -87.68359
## 2          15               15 In Service 41.78091 -87.57632
## 3          19               19 In Service 41.87475 -87.64981
## 4          23               23 In Service 41.87773 -87.65479
## 5          19               19 In Service 41.88042 -87.65552
## 6          15               15 In Service 41.88411 -87.65426
##                                   Location Type of Rider Bike Type   n
## 1             POINT (-87.683593 41.991178)        Casual   Classic  21
## 2 POINT (-87.576323747635 41.780910964248)        Casual   Classic 135
## 3             POINT (-87.649807 41.874754)        Casual   Classic 581
## 4         POINT (-87.65478743 41.87772613)        Casual   Classic 698
## 5             POINT (-87.655519 41.880419)        Casual   Classic 505
## 6             POINT (-87.654264 41.884114)        Casual   Classic 546
##        Perc_n
## 1 0.007163544
## 2 0.046051352
## 3 0.198191376
## 4 0.238102548
## 5 0.172266170
## 6 0.186252136

As we can see in the above table, we created a Perc_n column which represents the density of riders. Similarly we create such tables for other combinations as well.

mytable_casual_electric <- mytable %>% filter(`Type of Rider` == "Casual") %>% filter(`Bike Type` == "Electric") %>% mutate(Perc_n = (n/sum(n))*100)

mytable_member_classic <- mytable %>% filter(`Type of Rider` == "Member") %>% filter(`Bike Type` == "Classic") %>% mutate(Perc_n = (n/sum(n))*100)

mytable_member_electric <- mytable %>% filter(`Type of Rider` == "Member") %>% filter(`Bike Type` == "Electric") %>% mutate(Perc_n = (n/sum(n))*100)

We now save these subsets of data in csv format and convert them into shape-files with the help of an online available csv-shape file conversion software. One of the software used for this purpose is as follows: https://mygeodata.cloud/converter/csv-to-shp.

After converting these csv files, we are now ready for generating different visualizations for bar charts, pie charts, heatmaps and choropleth maps.

Share

After the Analysis Phase, we now have the data ready for creating various visualizations. Some of the important visualizations have been given here.

1.Bar Charts

Bar Charts have been generated using ggplot2 package. The package is based on Grammar of Graphics, a plotting framework that, like in language, views that graphics should also consist of basic grammatical elements. For instance, a simple sentence would consist of subject + verb + object, while a graph is built upon layers of data + aesthetics + geometry. The package has been designed around this theory, and plotting in ggplot2 includes adding layers of parameters (summary, metadata, and annotation). The syntax used is very user-friendly and intuitive, more so than base graphics or other visualization packages.

Different types of Charts generated using ggplot2 are as shown below:

Chart:1

p <- ggplot(data = yearroute) + 
  geom_bar(mapping = aes(x = `Rider Type`, y = `Ride Length`, fill = `Rider Type`),
             stat = 'summary', fun = 'mean', width = 0.3) +
  labs(x = "Type of Rider", 
       y = "Ride Length (in mins)",
       title = "Average Duration of Rides", 
       subtitle = "Entire Year", 
       caption = "Source: https://divvybikes.com/system-data") +
  theme_light() +
  theme(legend.box.background = element_rect(color="black", linewidth = 1),
        legend.key = element_rect(colour = "black"))

ggplotly(p)

Chart:2

p <- ggplot(data = yearroute) + 
  geom_bar(mapping = aes(x = `Rider Type`, y = `Ride Length`, fill = `Bike Type`),
           stat = 'summary', fun = 'mean', position = position_dodge(0.6), width = 0.3) +
  labs(x = "Type of Rider", 
       y = "Ride Length (in mins)",
       title = "Average Duration of Rides", 
       subtitle = "Entire Year", 
       caption = "Source: https://divvybikes.com/system-data") +
  theme_light() +
  theme(legend.box.background = element_rect(color="black", linewidth = 1),
        legend.key = element_rect(colour = "black"))

ggplotly(p)

Chart:3

yearroute$Weekday <- factor(yearroute$Weekday, levels = c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"))

p <- ggplot(data = yearroute) + 
  geom_bar(mapping = aes(x = Weekday, y = `Ride Length`, fill = `Rider Type`),
           stat = 'summary', fun = 'mean', position = 'dodge') +
  labs(x = "Weekdays", 
       y = "Ride Length (in mins)",
       title = "Average Duration of Rides by Weekdays", 
       subtitle = "Entire Year", 
       caption = "Source: https://divvybikes.com/system-data",
       fill = "Type of Rider") +
  theme_light() +
  theme(axis.text.x = element_text(angle=45),
        legend.box.background = element_rect(color="black", linewidth = 1),
        legend.key = element_rect(colour = "black"))

ggplotly(p)

Chart:4

yearroute$`Month Name` <- factor(yearroute$`Month Name`, levels = c("May 2023", "June 2023", "July 2023", "Aug 2023", "Sep 2023", "Oct 2023", "Nov 2023", "Dec 2023", "Jan 2024", "Feb 2024", "March 2024", "April 2024"))

p <- ggplot(data = yearroute) + 
  geom_bar(mapping = aes(x = `Month Name`, y = `Ride Length`, fill = `Rider Type`),
           stat = 'summary', fun = 'mean', position = 'dodge') +
  labs(x = "Months", 
       y = "Ride Length (in mins)",
       title = "Average Duration of Rides by Months", 
       subtitle = "Entire Year", 
       caption = "Source: https://divvybikes.com/system-data",
       fill = "Type of Rider") +
  theme_light() +
  theme(axis.text.x = element_text(angle=90),
        legend.box.background = element_rect(color="black", linewidth = 1),
        legend.key = element_rect(colour = "black"))

ggplotly(p)

Chart:5

ggplot(data = yearroute) + 
  geom_bar(mapping = aes(x = `Start Time (in Hrs)`, y = `Ride Length`, fill = `Rider Type`),
           stat = 'summary', fun = 'mean', position = 'dodge') +
  labs(x = "Start Time (in Hrs)", 
       y = "Ride Length (in mins)",
       title = "Average Duration of Rides by Time of Day", 
       subtitle = "Entire Year", 
       caption = "Source: https://divvybikes.com/system-data",
       fill = "Type of Rider") +
  theme_light() +
  theme(axis.text.x = element_text(angle=90),
        legend.box.background = element_rect(color="black", linewidth = 1),
        legend.key = element_rect(colour = "black"))

Chart:6

ggplot(data = yearroute) + 
  geom_bar(mapping = aes(x = `Day of Year`, y = `Ride Length`, fill = `Rider Type`),
           stat = 'summary', fun = 'mean', position = 'dodge') +
  labs(x = "Day of Year", 
       y = "Ride Length (in mins)",
       title = "Average Duration of Rides by Day of Year", 
       subtitle = "Entire Year", 
       caption = "Source: https://divvybikes.com/system-data",
       fill = "Type of Rider") +
  theme_light() +
  theme(axis.text.x = element_text(angle=90),
        legend.box.background = element_rect(color="black", linewidth = 1),
        legend.key = element_rect(colour = "black"))

Chart:7

yearroute$Season <- factor(yearroute$Season, levels = c("Spring", "Summer", "Fall", "Winter"))

p <- ggplot(data = yearroute) + 
  geom_bar(mapping = aes(x = Season, y = `Ride Length`, fill = `Rider Type`),
           stat = 'summary', fun = 'mean', position = 'dodge') +
  labs(x = "Seasons", 
       y = "Ride Length (in mins)",
       title = "Average Duration of Rides by Seasons", 
       subtitle = "Entire Year", 
       caption = "Source: https://divvybikes.com/system-data",
       fill = "Type of Rider") +
  theme_light() +
  theme(axis.text.x = element_text(angle=90),
        legend.box.background = element_rect(color="black", linewidth = 1),
        legend.key = element_rect(colour = "black"))

ggplotly(p)

Chart:8

p <- ggplot(data = yearroute) + 
  geom_bar(mapping = aes(x = `Rider Type`, fill = `Rider Type`), width = 0.3) +
  labs(x = "Type of Rider", 
       y = "Number of Rides",
       title = "Number of Rides by Rider Type", 
       subtitle = "Entire Year", 
       caption = "Source: https://divvybikes.com/system-data") +
  theme_light() + 
  theme(axis.text.x = element_text(angle=90),
        legend.box.background = element_rect(color="black", linewidth = 1),
        legend.key = element_rect(colour = "black"))

ggplotly(p)

Chart:9

p <- ggplot(data = yearroute) + 
  geom_bar(mapping = aes(x = `Rider Type`, fill = `Bike Type`), position = 'dodge', width = 0.3) +
  # coord_flip() +
  labs(x = "Type of Rider", 
       y = "Number of Rides",
       title = "Number of Rides by Rider Type", 
       subtitle = "Entire Year", 
       caption = "Source: https://divvybikes.com/system-data") +
  theme_light() +
  theme(axis.text.x = element_text(angle=0),
        legend.box.background = element_rect(color="black", linewidth = 1),
        legend.key = element_rect(colour = "black"))

ggplotly(p)

Chart:10

yearroute$Weekday <- factor(yearroute$Weekday, levels = c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"))

p <- ggplot(data = yearroute) + 
  geom_bar(mapping = aes(x = Weekday, fill = `Rider Type`), position = 'dodge', width = 0.3) +
  # coord_flip() +
  labs(x = "Weekdays", 
       y = "Number of Rides",
       title = "Number of Rides by Weekdays", 
       subtitle = "Entire Year", 
       caption = "Source: https://divvybikes.com/system-data") +
  theme_light() + 
  theme(axis.text.x = element_text(angle=90),
        legend.box.background = element_rect(color="black", linewidth = 1),
        legend.key = element_rect(colour = "black"))

ggplotly(p)

Chart:11

yearroute$`Month Name` <- factor(yearroute$`Month Name`, levels = c("May 2023", "June 2023", "July 2023", "Aug 2023", "Sep 2023", "Oct 2023", "Nov 2023", "Dec 2023", "Jan 2024", "Feb 2024", "March 2024", "April 2024"))

p <- ggplot(data = yearroute) + 
  geom_bar(mapping = aes(x = `Month Name`, fill = `Rider Type`), position = 'dodge', width = 0.3) +
  # coord_flip() +
  labs(x = "Months", 
       y = "Number of Rides",
       title = "Number of Rides by Months", 
       subtitle = "Entire Year", 
       caption = "Source: https://divvybikes.com/system-data") +
  theme_light() +
  theme(axis.text.x = element_text(angle=90),
        legend.box.background = element_rect(color="black", linewidth = 1),
        legend.key = element_rect(colour = "black"))

ggplotly(p)

Chart:12

ggplot(data = yearroute) + 
  geom_bar(mapping = aes(x = `Start Time (in Hrs)`, fill = `Rider Type`), position = 'dodge', width = 0.3) +
  # coord_flip() +
  labs(x = "Start Time (in Hrs)", 
       y = "Number of Rides",
       title = "Number of Rides by Time of Day", 
       subtitle = "Entire Year", 
       caption = "Source: https://divvybikes.com/system-data") +
  theme_light() +
  theme(legend.box.background = element_rect(color="black", linewidth = 1),
        legend.key = element_rect(colour = "black"))

Chart:13

`Day of Year` <- yearroute %>% select(`Day of Year`) %>% distinct()

yearroute$`Day of Year` <- factor(yearroute$`Day of Year`, levels = unique(`Day of Year`$`Day of Year`))

`Day of Year` <- `Day of Year`[order(`Day of Year`$`Day of Year`), ]

 ggplot(data = yearroute) + 
  geom_bar(mapping = aes(x = `Day of Year`, fill = `Rider Type`), position = 'dodge', width = 0.3) +
  # coord_flip() +
  labs(x = "Day of Year", 
       y = "Number of Rides",
       title = "Number of Rides by Days of Year", 
       subtitle = "Entire Year", 
       caption = "Source: https://divvybikes.com/system-data") +
  theme_light() +
  theme(legend.box.background = element_rect(color="black", linewidth = 1),
        legend.key = element_rect(colour = "black"))

Chart:14

`Week of Year` <- yearroute %>% select(`Week of Year`) %>% distinct()

yearroute$`Week of Year` <- factor(yearroute$`Week of Year`, levels = unique(`Week of Year`$`Week of Year`))
              
`Week of Year` <- `Week of Year`[order(`Week of Year`$`Week of Year`), ]

 ggplot(data = yearroute) + 
  geom_bar(mapping = aes(x = `Week of Year`, fill = `Rider Type`), position = 'dodge', width = 0.3) +
  # coord_flip() +
  labs(x = "Week of Year", 
       y = "Number of Rides",
       title = "Number of Rides by Week of Year", 
       subtitle = "Entire Year", 
       caption = "Source: https://divvybikes.com/system-data") +
  theme_light() +
  theme(legend.box.background = element_rect(color="black", linewidth = 1),
        legend.key = element_rect(colour = "black"))

Chart:15

yearroute$Season <- factor(yearroute$Season, levels = c("Spring", "Summer", "Fall", "Winter"))

g <- ggplot(data = yearroute) + 
  geom_bar(mapping = aes(x = Season, fill = `Rider Type`), position = 'dodge', width = 0.3) +
  # coord_flip() +
  labs(x = "Season", 
       y = "Number of Rides",
       title = "Number of Rides by Season", 
       subtitle = "Entire Year", 
       caption = "Source: https://divvybikes.com/system-data") +
  theme_light() +
  theme(legend.box.background = element_rect(color="black", linewidth = 1),
        legend.key = element_rect(colour = "black"))

ggplotly(g)

Chart:16

          q = yearroute %>% 
              select(`Rider Type`, Route) %>% 
              group_by(Route, `Rider Type`) %>% 
              count(`Rider Type`) %>% ungroup() %>% 
              group_by(Route) %>% 
              mutate(total_n = sum(n), percentage = (n / total_n) * 100) %>% 
              ungroup()
            
            s = q[order(-q$total_n), ]
            
            r = s %>% select(Route) %>% distinct() %>% slice(1:25)
            
            p = semi_join(s,r, by=("Route"))
            
            p$Route <- factor(p$Route, levels = unique(r$Route)) ## This will keep the bars in predefined decreasing order in graph
            
            g <- p %>% ggplot() + geom_bar(mapping = aes(x = Route, y = n, fill = `Rider Type`), 
                                          stat = "identity", position = position_dodge(0.6), width = 0.5) +
              expand_limits(y = c(0, max(p$n) * 1.2)) +
              scale_y_continuous(expand = c(0,0)) +
              labs(x = "Route (Top 25)",
                  y = "Number of Rides",
                title = "Number of Rides by Route", 
                subtitle = "Entire Year", 
               caption = "Source: https://divvybikes.com/system-data") +
              theme_bw() +
              theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
                    legend.box.background = element_rect(color="black", linewidth = 1),
        legend.key = element_rect(colour = "black"))
            
            ggplotly(g)

Chart:17

g <- ggplot(data = yearroute) + 
  geom_bar(mapping = aes(x = `Rider Type`, y = `Station Distance`, fill = `Rider Type`),
             stat = 'summary', fun = 'mean', width = 0.3) +
  labs(x = "Type of Rider", 
       y = "Avg. Ride-Distance (in Kms)",
       title = "Average Distance of Rides by Rider Type", 
       subtitle = "Entire Year", 
       caption = "Source: https://divvybikes.com/system-data") +
  theme_light() + 
  theme(legend.box.background = element_rect(color="black", linewidth = 1),
        legend.key = element_rect(colour = "black"))

ggplotly(g)
## Warning: Removed 116 rows containing non-finite outside the scale range
## (`stat_summary()`).

Chart:18

g <- ggplot(data = yearroute) + 
  geom_bar(mapping = aes(x = `Rider Type`, y = `Station Distance`, fill = `Bike Type`),
             stat = 'summary', fun = 'mean', position = "dodge", width = 0.3) +
  labs(x = "Type of Rider", 
       y = "Avg. Ride-Distance (in Kms)",
       title = "Average Distance of Rides by Rider Type", 
       subtitle = "Entire Year", 
       caption = "Source: https://divvybikes.com/system-data") +
  theme_light() + 
  theme(legend.box.background = element_rect(color="black", linewidth = 1),
        legend.key = element_rect(colour = "black"))

ggplotly(g)
## Warning: Removed 116 rows containing non-finite outside the scale range
## (`stat_summary()`).

Chart:19

yearroute$Weekday <- factor(yearroute$Weekday, levels = c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"))

g <- ggplot(data = yearroute) + 
  geom_bar(mapping = aes(x = Weekday, y = `Station Distance`, fill = `Rider Type`),
             stat = 'summary', fun = 'mean', position = "dodge", width = 0.3) +
  labs(x = "Weekday", 
       y = "Avg. Ride-Distance (in Kms)",
       title = "Average Distance of Rides by Weekday", 
       subtitle = "Entire Year", 
       caption = "Source: https://divvybikes.com/system-data") +
  theme_light() + 
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),legend.box.background = element_rect(color="black", linewidth = 1),
        legend.key = element_rect(colour = "black"))

ggplotly(g)
## Warning: Removed 116 rows containing non-finite outside the scale range
## (`stat_summary()`).

Chart:20

yearroute$`Month Name` <- factor(yearroute$`Month Name`, levels = c("May 2023", "June 2023", "July 2023", "Aug 2023", "Sep 2023", "Oct 2023", "Nov 2023", "Dec 2023", "Jan 2024", "Feb 2024", "March 2024", "April 2024"))

g <- ggplot(data = yearroute) + 
  geom_bar(mapping = aes(x = `Month Name`, y = `Station Distance`, fill = `Rider Type`),
             stat = 'summary', fun = 'mean', position = "dodge", width = 0.3) +
  labs(x = "Month", 
       y = "Avg. Ride-Distance (in Kms)",
       title = "Average Distance of Rides by Month", 
       subtitle = "Entire Year", 
       caption = "Source: https://divvybikes.com/system-data") +
  theme_light() + 
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),legend.box.background = element_rect(color="black", linewidth = 1),
        legend.key = element_rect(colour = "black"))

ggplotly(g)
## Warning: Removed 116 rows containing non-finite outside the scale range
## (`stat_summary()`).

Chart:21

ggplot(data = yearroute) + 
  geom_bar(mapping = aes(x = `Start Time (in Hrs)`, y = `Station Distance`, fill = `Rider Type`),
             stat = 'summary', fun = 'mean', position = "dodge", width = 0.3) +
  labs(x = "Start Time (in Hrs)", 
       y = "Avg. Ride-Distance (in Kms)",
       title = "Average Distance of Rides by Time of Day", 
       subtitle = "Entire Year", 
       caption = "Source: https://divvybikes.com/system-data") +
  theme_light() + 
  theme(legend.box.background = element_rect(color="black", linewidth = 1),
        legend.key = element_rect(colour = "black"))
## Warning: Removed 116 rows containing non-finite outside the scale range
## (`stat_summary()`).

Chart:22

yearroute$Season <- factor(yearroute$Season, levels = c("Spring", "Summer", "Fall", "Winter"))

g <- ggplot(data = yearroute) + 
  geom_bar(mapping = aes(x = Season, y = `Station Distance`, fill = `Rider Type`),
             stat = 'summary', fun = 'mean', position = "dodge", width = 0.3) +
  labs(x = "Season", 
       y = "Avg. Ride-Distance (in Kms)",
       title = "Average Distance of Rides by Season", 
       subtitle = "Entire Year", 
       caption = "Source: https://divvybikes.com/system-data") +
  theme_light() + 
  theme(legend.box.background = element_rect(color="black", linewidth = 1),
        legend.key = element_rect(colour = "black"))

ggplotly(g)
## Warning: Removed 116 rows containing non-finite outside the scale range
## (`stat_summary()`).

2.Pie-Donut Charts

Chart:1

         p = yearroute %>% 
              select(`Rider Type`, `Bike Type`) %>% 
              group_by(`Rider Type`, `Bike Type`) %>% 
              count(`Bike Type`) %>% ungroup() 

           m <- p %>% rename(Rider = `Rider Type`, Bike = `Bike Type`)
           
        PieDonut(m, aes(Rider, Bike, count = n), title = "Distribution of Rides for entire year")

Chart:2

          p = yearroute %>% 
              select(`Rider Type`, Weekday) %>% 
              group_by(`Rider Type`, Weekday) %>% 
              count(Weekday) %>% ungroup()

           m <- p %>% rename(Rider = `Rider Type`)
           
        PieDonut(m, aes(Rider, Weekday, count = n), title = "Distribution of Rides by Weekdays")

Chart:3

          p = yearroute %>% 
              select(`Rider Type`, Season) %>% 
              group_by(`Rider Type`, Season) %>% 
              count(Season) %>% ungroup()

           m <- p %>% rename(Rider = `Rider Type`)
           
        PieDonut(m, aes(Rider, Season, count = n), title = "Distribution of Rides by Seasons")

3.Heatmaps

Chart:1

            q = yearroute %>% 
                group_by(`Rider Type`, `Bike Type`, Weekday) %>% 
                summarise(mean = mean(`Ride Length`)) %>% ungroup()
## `summarise()` has grouped output by 'Rider Type', 'Bike Type'. You can override
## using the `.groups` argument.
yearroute$Weekday <- factor(yearroute$Weekday, levels = c("Sunday","Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"))
            
 g <- ggplot(q, aes(x = Weekday, y = interaction(`Rider Type`,`Bike Type`), fill = mean)) +
                geom_tile(color = "white") +
                scale_fill_gradient(low = "orange", high = "red") +
                labs(x = "Weekday", y = "Rider-Bike Combination", fill = "mean") +
                theme_minimal() +
                theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
                labs(title = " Distribution of Ride-Duration across Weekday ")
 
 ggplotly(g)

Chart:2

             q = yearroute %>% 
                group_by(`Rider Type`, `Bike Type`, Weekday) %>% 
                summarise(mean = mean(`Station Distance`)) %>% ungroup()
## `summarise()` has grouped output by 'Rider Type', 'Bike Type'. You can override
## using the `.groups` argument.
 filterdata1 <- q %>% filter(`Rider Type` == "Casual") %>% filter(`Bike Type` == "Docked")
 filterdata2 <- q %>% filter(`Rider Type` == "Casual") %>% filter(`Bike Type` == "Classic")
 filterdata3 <- q %>% filter(`Rider Type` == "Casual") %>% filter(`Bike Type` == "Electric")
 filterdata4 <- q %>% filter(`Rider Type` == "Member") %>% filter(`Bike Type` == "Classic")
 filterdata5 <- q %>% filter(`Rider Type` == "Member") %>% filter(`Bike Type` == "Electric")
 
            if(!any(is.na(filterdata1)) && !any(is.na(filterdata2)) && !any(is.na(filterdata3)) && !any(is.na(filterdata4)) && !any(is.na(filterdata5))){
              
              filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
              
          } else {
              
              
    if(any(is.na(filterdata1))){
                
        filterdata1$mean <- ifelse(is.na(filterdata1$mean), 0, filterdata1$mean)
        
    } 
    if(any(is.na(filterdata2))){
                
        filterdata2$mean <- ifelse(is.na(filterdata2$mean), 0, filterdata2$mean)
     } 
              
    if(any(is.na(filterdata3))){
                
        filterdata3$mean <- ifelse(is.na(filterdata3$mean), 0, filterdata3$mean)
        } 
              
    if(any(is.na(filterdata4))){
                
       filterdata4$mean <- ifelse(is.na(filterdata4$mean), 0, filterdata4$mean)
      } 
            
    if(any(is.na(filterdata5))){
                
       filterdata5$mean <- ifelse(is.na(filterdata5$mean), 0, filterdata5$mean)
      } 
              
      filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
          }
            
  filterdata$Weekday <- factor(filterdata$Weekday, levels = c("Sunday","Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"))
            
 g <- ggplot(filterdata, aes(x = Weekday, y = interaction(`Rider Type`,`Bike Type`), fill = mean)) +
                geom_tile(color = "white") +
                scale_fill_gradient(low = "orange", high = "red") +
                labs(x = "Weekday", y = "Rider-Bike Combination", fill = "mean") +
                theme_minimal() +
                theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
                labs(title = " Distribution of Ride-Distance across Weekday ")
 
 ggplotly(g)

Chart:3

q = yearroute %>% 
                group_by(`Rider Type`, `Bike Type`, `Start Time (in Hrs)`) %>% 
                summarise(mean = mean(`Ride Length`)) %>% ungroup()
## `summarise()` has grouped output by 'Rider Type', 'Bike Type'. You can override
## using the `.groups` argument.
 g <- ggplot(q, aes(x = `Start Time (in Hrs)`, y = interaction(`Rider Type`,`Bike Type`), fill = mean)) +
                geom_tile(color = "white") +
                scale_fill_gradient(low = "lightblue", high = "blue") +
                labs(x = "Start Time (in Hrs)", y = "Rider-Bike Combination", fill = "mean") +
                theme_minimal() +
                theme(axis.text.x = element_text(angle = 0, hjust = 1)) +
                labs(title = " Distribution of Ride-Duration across Time of Day ")
 
 ggplotly(g)

Chart:4

 q = yearroute %>% 
                group_by(`Rider Type`, `Bike Type`, `Start Time (in Hrs)`) %>% 
                summarise(mean = mean(`Station Distance`)) %>% ungroup()
## `summarise()` has grouped output by 'Rider Type', 'Bike Type'. You can override
## using the `.groups` argument.
 filterdata1 <- q %>% filter(`Rider Type` == "Casual") %>% filter(`Bike Type` == "Docked")
 filterdata2 <- q %>% filter(`Rider Type` == "Casual") %>% filter(`Bike Type` == "Classic")
 filterdata3 <- q %>% filter(`Rider Type` == "Casual") %>% filter(`Bike Type` == "Electric")
 filterdata4 <- q %>% filter(`Rider Type` == "Member") %>% filter(`Bike Type` == "Classic")
 filterdata5 <- q %>% filter(`Rider Type` == "Member") %>% filter(`Bike Type` == "Electric")
 
            if(!any(is.na(filterdata1)) && !any(is.na(filterdata2)) && !any(is.na(filterdata3)) && !any(is.na(filterdata4)) && !any(is.na(filterdata5))){
              
              filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
              
          } else {
              
              
    if(any(is.na(filterdata1))){
                
        filterdata1$mean <- ifelse(is.na(filterdata1$mean), 0, filterdata1$mean)
        
    } 
    if(any(is.na(filterdata2))){
                
        filterdata2$mean <- ifelse(is.na(filterdata2$mean), 0, filterdata2$mean)
     } 
              
    if(any(is.na(filterdata3))){
                
        filterdata3$mean <- ifelse(is.na(filterdata3$mean), 0, filterdata3$mean)
        } 
              
    if(any(is.na(filterdata4))){
                
       filterdata4$mean <- ifelse(is.na(filterdata4$mean), 0, filterdata4$mean)
      } 
            
    if(any(is.na(filterdata5))){
                
       filterdata5$mean <- ifelse(is.na(filterdata5$mean), 0, filterdata5$mean)
      } 
              
      filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
          }
           
g <- ggplot(filterdata, aes(x = `Start Time (in Hrs)`, y = interaction(`Rider Type`,`Bike Type`), fill = mean)) +
                geom_tile(color = "white") +
                scale_fill_gradient(low = "lightblue", high = "blue") +
                labs(x = "Start Time (in Hrs)", y = "Rider-Bike Combination", fill = "mean") +
                theme_minimal() +
                theme(axis.text.x = element_text(angle = 0, hjust = 1)) +
                labs(title = " Distribution of Ride-Distance across Time of the Day ")

ggplotly(g)

Chart:5

            q = yearroute %>% 
                group_by(`Rider Type`, `Bike Type`, `Month Name`) %>% 
                summarise(mean = mean(`Ride Length`)) %>% ungroup()
## `summarise()` has grouped output by 'Rider Type', 'Bike Type'. You can override
## using the `.groups` argument.
 q_cas_dock <- q %>% filter(`Rider Type` == "Casual") %>% filter(`Bike Type` == "Docked")
 q_cas_clas <- q %>% filter(`Rider Type` == "Casual") %>% filter(`Bike Type` == "Classic")
 q_cas_elec <- q %>% filter(`Rider Type` == "Casual") %>% filter(`Bike Type` == "Electric")
 q_mem_clas <- q %>% filter(`Rider Type` == "Member") %>% filter(`Bike Type` == "Classic")
 q_mem_elec <- q %>% filter(`Rider Type` == "Member") %>% filter(`Bike Type` == "Electric")
 
 Month = q %>% select(`Month Name`) %>% distinct()
 
            filterdata1 <- right_join(q_cas_dock, Month,   by='Month Name')
            filterdata2 <- right_join(q_cas_clas, Month,   by='Month Name')
            filterdata3 <- right_join(q_cas_elec, Month,   by='Month Name')
            filterdata4 <- right_join(q_mem_clas, Month,   by='Month Name')
            filterdata5 <- right_join(q_mem_elec, Month,   by='Month Name')
 
            if(!any(is.na(filterdata1)) && !any(is.na(filterdata2)) && !any(is.na(filterdata3)) && !any(is.na(filterdata4)) && !any(is.na(filterdata5))){
              
              filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
              
          } else {
              
              
    if(any(is.na(filterdata1))){
                
        filterdata1$mean <- ifelse(is.na(filterdata1$mean), 0, filterdata1$mean)
        filterdata1$`Rider Type` <- ifelse(is.na(filterdata1$`Rider Type`), "Casual", "Casual")
       filterdata1$`Bike Type` <- ifelse(is.na(filterdata1$`Bike Type`), "Docked", "Docked")
        
    } 
    if(any(is.na(filterdata2))){
                
        filterdata2$mean <- ifelse(is.na(filterdata2$mean), 0, filterdata2$mean)
     } 
              
    if(any(is.na(filterdata3))){
                
        filterdata3$mean <- ifelse(is.na(filterdata3$mean), 0, filterdata3$mean)
        } 
              
    if(any(is.na(filterdata4))){
                
       filterdata4$mean <- ifelse(is.na(filterdata4$mean), 0, filterdata4$mean)
      } 
            
    if(any(is.na(filterdata5))){
                
       filterdata5$mean <- ifelse(is.na(filterdata5$mean), 0, filterdata5$mean)
      } 
              
      filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
          }
 
 filterdata$`Month Name` <- factor(filterdata$`Month Name`, levels = c("May 2023", "June 2023", "July 2023", "Aug 2023", "Sep 2023", "Oct 2023", "Nov 2023", "Dec 2023", "Jan 2024", "Feb 2024", "March 2024", "April 2024"))
            
 g <- ggplot(filterdata, aes(x = `Month Name`, y = interaction(`Rider Type`,`Bike Type`), fill = mean)) +
                geom_tile(color = "white") +
                scale_fill_gradient(low = "lightgreen", high = "darkgreen") +
                labs(x = "Months", y = "Rider-Bike Combination", fill = "mean") +
                theme_minimal() +
                theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
                labs(title = " Distribution of Ride-Duration across Months ")
 ggplotly(g)

Chart:6

 q = yearroute %>% 
                group_by(`Rider Type`, `Bike Type`, `Month Name`) %>% 
                summarise(mean = mean(`Station Distance`)) %>% ungroup()
## `summarise()` has grouped output by 'Rider Type', 'Bike Type'. You can override
## using the `.groups` argument.
 q_cas_dock <- q %>% filter(`Rider Type` == "Casual") %>% filter(`Bike Type` == "Docked")
 q_cas_clas <- q %>% filter(`Rider Type` == "Casual") %>% filter(`Bike Type` == "Classic")
 q_cas_elec <- q %>% filter(`Rider Type` == "Casual") %>% filter(`Bike Type` == "Electric")
 q_mem_clas <- q %>% filter(`Rider Type` == "Member") %>% filter(`Bike Type` == "Classic")
 q_mem_elec <- q %>% filter(`Rider Type` == "Member") %>% filter(`Bike Type` == "Electric")
 
 Month = q %>% select(`Month Name`) %>% distinct()
 
            filterdata1 <- right_join(q_cas_dock, Month,   by='Month Name')
            filterdata2 <- right_join(q_cas_clas, Month,   by='Month Name')
            filterdata3 <- right_join(q_cas_elec, Month,   by='Month Name')
            filterdata4 <- right_join(q_mem_clas, Month,   by='Month Name')
            filterdata5 <- right_join(q_mem_elec, Month,   by='Month Name')
 
            if(!any(is.na(filterdata1)) && !any(is.na(filterdata2)) && !any(is.na(filterdata3)) && !any(is.na(filterdata4)) && !any(is.na(filterdata5))){
              
              filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
              
          } else {
              
              
    if(any(is.na(filterdata1))){
                
        filterdata1$mean <- ifelse(is.na(filterdata1$mean), 0, filterdata1$mean)
        filterdata1$`Rider Type` <- ifelse(is.na(filterdata1$`Rider Type`), "Casual", "Casual")
       filterdata1$`Bike Type` <- ifelse(is.na(filterdata1$`Bike Type`), "Docked", "Docked")
        
    } 
    if(any(is.na(filterdata2))){
                
        filterdata2$mean <- ifelse(is.na(filterdata2$mean), 0, filterdata2$mean)
     } 
              
    if(any(is.na(filterdata3))){
                
        filterdata3$mean <- ifelse(is.na(filterdata3$mean), 0, filterdata3$mean)
        } 
              
    if(any(is.na(filterdata4))){
                
       filterdata4$mean <- ifelse(is.na(filterdata4$mean), 0, filterdata4$mean)
      } 
            
    if(any(is.na(filterdata5))){
                
       filterdata5$mean <- ifelse(is.na(filterdata5$mean), 0, filterdata5$mean)
      } 
              
      filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
          }
 
 filterdata$`Month Name` <- factor(filterdata$`Month Name`, levels = c("May 2023", "June 2023", "July 2023", "Aug 2023", "Sep 2023", "Oct 2023", "Nov 2023", "Dec 2023", "Jan 2024", "Feb 2024", "March 2024", "April 2024"))
            
 g <- ggplot(filterdata, aes(x = `Month Name`, y = interaction(`Rider Type`,`Bike Type`), fill = mean)) +
                geom_tile(color = "white") +
                scale_fill_gradient(low = "lightgreen", high = "darkgreen") +
                labs(x = "Months", y = "Rider-Bike Combination", fill = "mean") +
                theme_minimal() +
                theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
                labs(title = " Distribution of Ride-Distance across Months ")
 
 ggplotly(g)

4.Choropleth Maps

A choropleth map is a thematic map in which areas are shaded or patterned in proportion to a statistical variable that represents an aggregate summary of a geographic characteristic within each area. These maps are particularly effective for visualizing regional data and illustrating variations across a spatial area.

Different shades or colors represent different ranges of the variable being visualized. Typically, darker shades represent higher values, while lighter shades represent lower values. These maps are well-suited for representing data like population density, per capita income, election results, or any other region-specific statistical data.

Below given map shows the distribution of Casual-Classic Riders. Similar Maps can be shown for other combinations as well. The data points for the interval are the density values taken from the Data-table.

        shp_1 <- read_sf("Boundaries - Community Areas (current)")
        shp <- read_sf("mytable_casual_classic")
        mytable <- read.csv("mytable.txt")
        
        interval <- c(0, 0.005, 0.01, 0.015, 0.02, 0.025, 0.03, 0.035, 0.04, 0.045, 0.05, 0.055, 0.06, 0.065, 0.07, 0.075, 0.08, 0.085, 0.09, 0.095, 0.1, 0.105, 0.110, 0.115, 0.12, 0.125, 0.13, 0.135, 0.14, 0.145, 0.15, 0.155, 0.16, 0.165, 0.17, Inf )
        
        l <- c("0.000-0.010", "0.010-0.020", "0.020-0.030", 
               "0.030-0.040", "0.040-0.050", "0.050-0.060", 
               "0.060-0.070", "0.070-0.080", "0.080-0.090", 
               "0.090-0.100", "0.100-0.110", "0.110-0.120", 
               "0.120-0.130", "0.130-0.140", "0.140-0.150", 
               "0.150-0.160", "0.160-0.170", "> 0.170" )
        
        m = 0.17
 
      d <- mytable %>% filter(`Type.of.Rider` == "casual") %>% filter(`Bike.Type` == "classic_bike") %>% mutate(Perc_n = (n/sum(n))*100)
      
      n_interval <- length(interval) - 1
 
      incidence_interval_1 <- classIntervals(d$Perc_n,
                                             n = n_interval,
                                             style = "fixed",
                                             fixedBreaks = interval,
                                             yearrouterecision = 3)
      
      colors <- colorRampPalette(brewer.pal(9, "Oranges"))(n_interval)
      
      colcod <- findColours(incidence_interval_1, colors)
      
      p <- ggplot(shp_1) +
        geom_sf(color = "black", fill = "khaki3") + 
        geom_sf(shp, mapping = aes(fill = Perc_n), color = colcod) +
        coord_sf(xlim = c(-88, -87.52), ylim = c(41.6, 42.07)) +
        ggthemes::theme_map() +
        theme(legend.position = "none") +
        theme(panel.background = element_rect(fill = "aliceblue", color = "white"),
              axis.ticks = element_blank(),
              axis.text = element_blank()) +
        # panel.grid = element_line(color = "white", linewidth = 0.8)) +
        labs(title = "Chicago City Map", 
             subtitle = paste("Heatmap showing percen. distribution of Casual-Classic riders")) 
      p

      # Function to plot color bar
      color.bar <- function(lut, min = 0, max = m, nticks = n_interval, ticks = seq(min, max, len = nticks/2), title='Dist. % per Dot') {
        scale = (max-min)/(length(lut)-1)
        plot(c(0,5), c(min,max), type='n', bty='n', xaxt='n', xlab='', yaxt='n', ylab='', main=title , adj = 0, line = -0.02, cex.main = 0.75)
        axis(2, at = ticks, labels = l, cex.axis = 0.6, las=1)
        
        for (i in 0:(length(lut)-1)) {
          y = i*scale + min
          rect(0,y,1.5,(length(lut)-1)*scale , col=lut[i], border= "black")
        }
      } 
      
      color.bar(colorRampPalette(brewer.pal(9, "Oranges"))(n_interval))

      # Combining the ggplot with colorbar using gridExtra from Hmisc package.

      p2 <- as.grob(p)
      r <- as.grob(function() color.bar(colorRampPalette(brewer.pal(9, "Oranges"))(n_interval)))
      
      grid.newpage()
      grid.draw(p2)
      vp = viewport(x=1, y=.45, width=.3, height=1.2)
      pushViewport(vp)
      grid.draw(r)
      upViewport()

Act

The Act phase of data analysis is a crucial step in the data analysis lifecycle, focusing on taking actionable insights derived from data and implementing them in a practical context.

1.Decision Making

Identify Actions: Based on the insights gained from the analysis, determine specific actions to be taken. These could be strategic decisions, operational changes, or new initiatives.

Prioritize Actions: Not all actions will have the same impact. Prioritize them based on factors such as potential benefit, feasibility, cost, and alignment with organizational goals.

The insights from the case-study analysis have been mentioned in “Findings and Recommendations” section of the report.

2.Implementation

Plan: Develop a detailed plan for implementing the chosen actions. This includes defining objectives, setting timelines, allocating resources, and identifying responsible parties.

Execution: Put the plan into action. This might involve deploying new processes, launching marketing campaigns, modifying existing systems, or introducing new products.

3.Monitoring

Track Progress: Continuously monitor the implementation of actions to ensure they are being executed as planned.

Adjust: Be ready to make adjustments based on feedback and initial results. This could involve modifying strategies, reallocating resources, or revising timelines.

Building the Dashboard

Setting up User Interface

UI defines the layout and the appearance of the app

1.Framework of UI

The framework of UI is as given below

ui <- shinyUI(
  fluidPage(  # Fluidpage brings in the required fluidity for web-application to adjust the display as per the size of the screen.
    
    dashboardPage(
      dashboardHeader(),  # This is code for Header Panel of Shiny Web-Application
      dashboardSidebar(), # This is code for Side Panel of Web Application where multiple tabs are located
      dashboardBody() # This is code for Main Panel of Web Application where multiple sub-tabs are located
    )
  )
)

2.Header Panel

dashboardHeader contains the code for displaying the information on the header strip of web-application. In the below code, some part of HTML is used for modifying the layout and including images in the header panel.

For integrating HTML into Shiny Web-Application, Shiny provides a list of functions named tags. The list can be seen here: https://shiny.posit.co/r/articles/build/tag-glossary/. Each function in the list creates an HTML tag that we can use to modify the layout of our Shiny App.

Provided below is a description of the tags used in dashboardHeader:

img: creates an image li : creates a list a : creates a link to a webpage

dashboardHeader(title = "Cyclistic Data Analysis", titleWidth = 250, tags$li(class = "dropdown", tags$img(src="Nevil.jpg", height = "50", width = "50", style="border-radius: 55%; border: 3px solid; color:indigo")), tags$li(class = "dropdown", tags$a(href = "https://rb.gy/br2sa3", icon("linkedin"), "My Profile")))

4.Body

A.Home Tab

As seen in the Side-bar, 3 tabs have been provided. On clicking on any of the tabs, the sub-tabs placed in the central part of the dashboard i.e. the body vary w.r.t the tab selected in the Side panel.

For the Home tab, we have multiple sub-tabs as follows:

  1. About: This section provides some basic information regarding the company and the service it provides on which the case-study analysis has been done

  2. Business task: This section provides in detail information regarding the task that we need to perform.

  3. Raw-Data: This section provides a view of the sample of the data that we will be working with.

  4. Structure: This section provides structure of the raw data. Structure generally consists information regarding the Data type of variables. It is also called Meta-Data which is data about data.

p: Creates a paragraph br: Creates a line-break

            # 1st tab item
tabItem(tabName = "home", tabBox(id = "t1", width = 12, 
      tabPanel(fluidRow(column(width = 12, div(style = "display: flex; align-items: center;", tags$img(src = "about.png", style = "width: 28px; height: 28px; margin-left: 7px;"), tags$span(style = "padding-left: 8px;", "About")))), 
               fluidRow(column(width = 8, tags$img(src = "https://rb.gy/c7rkrn", width = 300, height = 300), tags$br(), tags$a("Source: Google Case-Studies"), align = "center"), column(width = 8, tags$br(), tags$p(HTML("In 2016, <b>Cyclistic</b> launched a successful bike-share offering. Since then, the program has grown to a fleet of 5,824 bicycles that are geotracked and locked into a network of 692 stations across Chicago. The bikes can be unlocked from one station and returned to any other station in the system anytime.")), 
               tags$p(HTML("Until now, Cyclistic’s marketing strategy relied on building general awareness and appealing to broad consumer segments. One approach that helped make these things possible was the
flexibility of its pricing plans: single-ride passes, full-day passes, and annual memberships. Customers who purchase <b><i>single-ride or full-day passes are referred to as casual riders</i></b>. Customers who purchase <b><i>annual memberships are Cyclistic members</i></b>.")), align = "justify"))),
                           
#------------------------------------------------------------             
   tabPanel(fluidRow(column(width = 12, div(style = "display: flex; align-items: center;", tags$img(src = "task.png", style = "width: 30px; height: 30px; margin-left: 7px;"), tags$span(style = "padding-left: 8px;", "Business Task")))), 
          fluidRow(column(width = 8, tags$img(src = "https://rb.gy/0tkbmm", width = 300, height = 300), tags$br(), tags$a("Source: https://teknovatecrm.in/task-management"), align = "center"), column(width = 8, tags$br(), tags$p(HTML("Cyclistic’s finance analysts have concluded that annual members are much more profitable than casual riders. Although the pricing flexibility helps Cyclistic attract more customers,
Lily Moreno, the director of marketing, believes that maximizing the number of annual members will be key to future growth. Rather than creating a marketing campaign that targets all-new customers, Moreno, believes
<b><i>there is a solid opportunity to convert casual riders into members</i></b>. She notes that casual riders
are already aware of the Cyclistic program and have chosen Cyclistic for their mobility needs.")),

tags$p(HTML("<b>Task:</b> You are a junior data analyst working on the marketing analyst team at Cyclistic, a bike-share company in Chicago. The director of marketing believes the company’s future success
depends on maximizing the number of annual memberships. Therefore, <b>your team wants to
understand how casual riders and annual members use Cyclistic bikes differently.</b>")), align = "justify"))),
                           
#-------------------------------------------------------------------------------------------------------------- 
tabPanel(fluidRow(column(width = 12, div(style = "display: flex; align-items: center;", tags$img(src = "data.png", style = "width: 28px; height: 28px; margin-left: 7px;"), tags$span(style = "padding-left: 8px;", "Raw Data")))), 

         fluidRow(column(width = 8, tags$br(), tags$p(HTML("This is just a sample of the original data.Original Raw Data consists of 5.6 million observations, 13 variables and occupies a size of 1 GB.Uploading such a huge data would lead to low speed and memory issues.")))), dataTableOutput("dataT")),

#--------------------------------------------------------------------------------------------------------------                           
tabPanel(fluidRow(column(width = 12, div(style = "display: flex; align-items: center;", tags$img(src = "structure.png", style = "width: 28px; height: 28px; margin-left: 7px;"), tags$span(style = "padding-left: 8px;", "Structure")))), verbatimTextOutput("structure"))))

About sub-tab in the Home tab can be seen as given below:

Business Task looks as given below:

A snippet of Raw-Data is as given below:

The Structure (Meta Data) is as given below:

B. Dashboard Tab

This tab primarily contains 2 sub-tabs: Periodic Analysis and Choropleth Maps. In Periodic Analysis, various set of controls are provided for the User. By giving input to these controls, a customised graph can be easily generated within no time.

These inputs are designed using selectInputs, checkboxInputs and radiobuttons of Shiny Application.

With the controls given in the Dashboard tab, user has enough flexibility t oselect any type of analysis from Monthly, Seasonal and Yearly. For every analysis, user has further control over the choice of X-Axis and Y-Axis inputs. If the user is interested in viewing the difference between Casual Riders and Mebers w.r.t type of Bike used, he can further partition the graph with a simple click on Radio Button.

For analysis wherein the chosen X-axis input is among Day of Year, Weeks or even Months, it is possible that the graph may appear as congested for the user. For such graphs, if user wants to filter out the maximum values, he can even do that with a simple checkboxInput or if the user wants to identify the trend of the graph, then by clicking on the trend line, user can get a trend line generated automatically.

For Stakeholders and users who would be interested in knowing the data responsibe for the generated graphs, a separate data table viewing option is also available as well.

Besides this, if the User clicks on the bars of the bar-chart, every individual bar generates a separate Pie chart or Heatmap (depending on the choice of Y-Axis input) which gives further information to the user for that particular bar. These linked charts prove to be helpful in providing detail indepth analysis without clustering up the space of Main panel in the Shiny Application.

With the given controls, a user can easily generate any one among 12800 different graphs.

The variety of charts produced are as follows:

  1. Monthly Analysis: 240 barcharts, 5022 linked charts
  2. Seasonal and Yearly Analysis : 216 barcharts, 7318 linked charts
  3. Choropleth Maps: 4

Thus making a total of 12800 graphs.

          # 2nd tab item

#--------------Code for Periodic Analysis sub-tab--------------------------------------------------------------

tabItem(tabName = "dash",tabBox(id = "t2", width = 12, 
  tabPanel(fluidPage(fluidRow(
    column(width = 12,div(style = "display: flex; align-items: center;",
    tags$img(src = "time.png", style = "width: 28px; height: 28px; margin-left: 7px;"),
    tags$span(style = "padding-left: 8px;", "Periodic Analysis"))))),
    
fluidPage(
fluidRow(column(width = 3, selectInput(inputId = "SelAna", label = "Select the type of Analysis", choices = c("Monthly","Seasonal","Yearly"), selected = "Monthly")),
column(width = 3,selectInput(inputId = "Seldata",label = "Select the Dataset",choices = " "))
),
fluidRow(column(width = 3, selectInput(inputId = "SelXaxis2", label = "Select the X Axis", choices = " ")),
column(width = 3, selectInput(inputId = "SelYaxis2", label = "Select the Y Axis", choices = " "))),
fluidRow(column(width = 3, radioButtons(inputId = "Part2", label = "Partition the Graph?", choices = c("No", "Yes(by Bike-Type)"), selected = "No")),
column(width = 3, selectInput(inputId = "Data2", label = "Show Data Table?", choices = c("No", "Yes"), selected = "No"))),
fluidRow(column(width = 3,checkboxInput(inputId = "Check2", label = HTML("<b>Show Maximum Values</b>"), value = FALSE)),
column(width = 3, checkboxInput(inputId = "Trend2", label = HTML("<b>Show Trend Line</b>"), value = FALSE))),
fluidRow(column(width = 12, withSpinner(plotlyOutput("chart2")))),
fluidRow(column(width = 12, withSpinner(dataTableOutput("data3")))))),

#----------------------------------Code for Choropleth Maps sub-tab-------------------------------------------
                           
tabPanel(fluidPage(div(style = "display: flex; align-items: center;", tags$img(src = "map.png", style = "width: 28px; height: 28px; margin-left: 7px;"), tags$span(style = "padding-left: 8px;", "Choropleth Maps"))), 
         
fluidPage(div(style = "display:inline-block; vertical-align: top; padding-top: 5px; width: 500px",selectInput(inputId = "SelRid" , label = "Select the type of Rider", choices = c("casual", "member"), width = "30%", selected = "casual")),

div(style = "display:inline-block; vertical-align: top; padding-left: 5px; padding-top: 5px; width: 500px",selectInput(inputId = "SelBik" , label = "Select the type of Bike", choices = c("classic_bike", "electric_bike"), width = "30%", selected = "classic_bike")),

div(style = "display:inline-block; vertical-align: top; padding-left: 5px; padding-top: 5px; width: 500px",radioButtons(inputId = "Data4" , label = "Show Data Table?", choices = c("Yes", "No"), width = "30%", selected = "No"))),

withSpinner(plotOutput("map_plot")), width = 12, withSpinner(dataTableOutput("dataM")))))

Periodic Analysis subtab can be seen as given below:

Choropleth subtab is as given below:

C. Insights Tab

This tab also comprises of 2 sub-tabs: Findings and Recommendations.

Findings provide a tabulated summary of important points which could be extracted from the data. Based on the points given in the findings, top three recommendations have been provided in the consequent tab.

            #3rd tab item
tabItem(tabName = "insight",tabBox(id = "t3", width = 12, 
                                   
tabPanel(fluidRow(column(width = 12, div(style = "display: flex; align-items: center;", tags$img(src = "finding.png", style = "width: 28px; height: 28px; margin-left: 7px;"), tags$span(style = "padding-left: 8px;", "Findings")))), tags$iframe(style = "height:700px; width:100%; scrolling=yes", src="Cyclistic-Results.pdf") ),

tabPanel(fluidRow(column(width = 12, div(style = "display: flex; align-items: center;", tags$img(src = "recommend.png", style = "width: 28px; height: 28px; margin-left: 7px;"), tags$span(style = "padding-left: 8px;", "Recommendations")))), fluidRow(column(width = 8, tags$p(HTML("<b>As per the findings, I would like to recommend the following: </b>")), 
                       
tags$p(HTML("1) People who would purchase Cyclistic Membership could be given some free rides annually to places like Dusable Lake Shore Drive and Streeter Dr where multiple parks are located. ")), 

tags$p(HTML("2) Small groups of Riders can be formed by Cyclistic which would prefer going for early morning rides  and also in the evening (around 17.00 hrs) in Spring and Summer covering several places in Chicago. Advertisements of such groups should be done focussed towards the residents in the region of Streeter Dr and Grand Ave and other top 25 such stations which can attract a lot of people who may join the groups to expand their Networks. Marketing team can provide Seasonal memberships at discounted prices to these groups.")),

tags$p(HTML("3) Since, the Casual riders are more during Weekends, Marathons could be organised on Weekends. Particiants could be provided with discounted memberships.")), align = "justify")))))

Findings tab can be seen as given below:

Recommendations can be seen as given below:

Programming of Server

Server contains the logic that runs behind the scenes, processing the inputs and generating outputs.

1.Framework of Server

In Shiny, the shinyServer function uses a function with arguments Input, Output, and Session to define the server-side logic of the app. Here’s a breakdown of these arguments:

A.Input:

This represents the inputs from the user interface (UI) of the Shiny app. It contains reactive values that reflect the current state of user inputs such as text fields, sliders, and buttons.

B.Output:

This represents the outputs that will be sent to the user interface. It contains slots for reactive output objects such as plots, tables, and text. We define these outputs in the server function, and they are rendered in the UI based on their associated IDs. For example, we can create a plot output with output$myPlot <- renderPlot({…}) which will then be displayed in the UI element with outputId = “myPlot”.

C.Session:

Session provides information and control over the user session. It can be used to manage user sessions, send custom messages, and handle session-specific events.

server <- shinyServer(
  function(Session, Input, Output){
    
  }
)

2.Automating the Select Inputs.

The controls in the UI are designed in such a way that when the user selects the first control i.e. the type of Analysis, then on the basis of input given to 1st control, the 2nd control gets populated with the available choices. Then on the basis of input given to the 2nd control i.e selection of dataset, then the X-Axis control gets populated with some list of choices which again determines the Y-input.

This is done so as to prevent any mismatch of inputs w.r.t each other. For instance, when Yearly analysis is selected, we have a X-Axis variable called “Seasons” which should not appear in Monthly or Seasonal analysis.

Thus for preventing any such possible mismatch, a csv file is separately maintained with all the matching options which is read automatically at the start of every Session.

The csv file is as shown below:

input_list <- read.csv("Inputs.csv")

head(input_list)
##   Analysis  Dataset              X.Axis            Y.Axis
## 1  Monthly May 2023             Weekday       No.of Rides
## 2  Monthly May 2023             Weekday Avg.Ride-Duration
## 3  Monthly May 2023             Weekday Avg.Ride-Distance
## 4  Monthly May 2023 Start Time (in Hrs)       No.of Rides
## 5  Monthly May 2023 Start Time (in Hrs) Avg.Ride-Duration
## 6  Monthly May 2023 Start Time (in Hrs) Avg.Ride-Distance

Inputs for the 1st control are always loaded. Based on the selection of 1st Input, subsequent controls get loaded consequently as per the code given below:

    observe({
      print(input$SelAna)
      Inp1 <- input_list %>% filter(Analysis == input$SelAna) %>% select(Dataset)
      updateSelectInput(session, inputId = "Seldata" , label = "Select the Dataset", choices = unique(Inp1))
    })
    
    observe({
      Inp2 <- input_list %>% filter(Dataset == input$Seldata) %>% select(X.Axis)
      updateSelectInput(session, inputId = "SelXaxis2" , label = "Select the X Axis", choices = unique(Inp2))
    })
    
    observe({
      Inp3 <- input_list %>% filter(X.Axis == input$SelXaxis2) %>% select(Y.Axis)
      updateSelectInput(session, inputId = "SelYaxis2" , label = "Select the Y Axis", choices = unique(Inp3))
    })

3.Loading Data-table for viewing Raw Data

This code snippet creates a highly customized DataTable using the DT package, which includes filtering columns, scrolling, and various layout and display options. This DataTable will be rendered in the Shiny app where dataT is referenced in the UI.

output$dataT <- DT::renderDataTable({
      
      DT::datatable(
        rawdata_sample,
        filter = 'top', extensions = c('Buttons', 'Scroller'),
        options = list(scrollY = 375,
                       scrollX = 500,
                       deferRender = TRUE,
                       scroller = TRUE,
                       # paging = TRUE,
                       # pageLength = 25,
                       buttons = list(list(extend = 'colvis', targets = 0, visible = FALSE)),
                       dom = 'lBfrtip',
                       fixedColumns = TRUE), 
        rownames = FALSE)
    })  

4.Structure of Data

The str() function of base R package provides a concise summary of the data frame, including the data type of each column and the first few entries.

 # Structure
    output$structure <- renderPrint(
      
      my_data <- rawdata_sample %>% select(-...1) %>% str()
    )

5.Generating Bar Charts

A.Overview of the Code Structure

Below provided is a short overview of how the code looks. The code is conditionaised at various points using the if-else structure.

Primarily the code can be broken down into 2 parts. First part of the code delineates an algorithm for analyzing the datset in case of number of rides. When the chosen Y-axis is either “Ride Duration” or “Ride Distance”, then the 2nd part of the code gets activated and a plot is rendered accordingly.

Bike type analysis is integrated into the code with the help of if-else condition.

output$chart2 <- renderPlotly({
      
      yearroute$Weekday <- factor(yearroute$Weekday, levels = c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"))
      yearroute$`Month Name` <- factor(yearroute$`Month Name`, levels = c("May 2023", "June 2023", "July 2023", "Aug 2023", "Sep 2023", "Oct 2023", "Nov 2023", "Dec 2023", "Jan 2024", "Feb 2024", "March 2024", "April 2024"))
      
# Filtering Data as per the selection of type of Analysis
      
      if(input$SelAna == "Yearly"){
        
        yd <- yearroute
        
      } else if(input$SelAna == "Seasonal"){
        
        yd <- yearroute %>% 
          filter(Season == paste(input$Seldata))
        
      } else {

        yd <- yearroute %>% 
          filter(`Month Name` == paste(input$Seldata))
      }
      
# When Y axis selected is no.of rides
      
    if(Y-Axis is No.of Rides){
     
    }
      
# When Y axis selected is Ride Duration or Ride Distance
      
    if(Y-Axis is Ride Duration or Ride Distance){
     
    }
B.When Y-Axis is Number of Rides
if(input$SelYaxis2 == "No.of Rides"){
        
     common <- {if(input$Part2 == "Yes(by Bike-Type)"){ yd %>% 
            select(`Rider Type`, `Bike Type`, !!sym(input$SelXaxis2)) %>% 
              group_by(!!sym(input$SelXaxis2), `Rider Type`, `Bike Type`) %>% 
              count(`Bike Type`) %>% ungroup() } 
            else {yd %>% select(`Rider Type`, !!sym(input$SelXaxis2)) %>% 
                group_by(!!sym(input$SelXaxis2), `Rider Type`) %>% 
                count(`Rider Type`) %>% ungroup() }}
         
          if(input$SelXaxis2 == "Route"){
            
            q = common %>% 
              group_by(!!sym(input$SelXaxis2)) %>% 
              mutate(total_n = sum(n), percentage = (n / total_n) * 100) %>% 
              ungroup()
            
            s = q[order(-q$total_n), ]
            r = s %>% select(Route) %>% distinct() %>% slice(1:25)
            p = semi_join(s,r, by=("Route"))
            p$Route <- factor(p$Route, levels = unique(r$Route)) ## This will keep the bars in predefined decreasing order in graph
            routedata <- yearroute %>% select(Station, end_station_name, Route) %>% distinct()
            routeshow = merge(p, routedata, by = "Route") %>% rename(`Start Station` = Station,`End Station` = end_station_name) ## This code is for modifying the table so as to display it in output along with station names
            routeshow <- routeshow[order(-routeshow$total_n),]
            
            if(input$Part2 == "Yes(by Bike-Type)"){
            routeshow <- routeshow %>% select(Route, `Start Station`, `End Station`, `Rider Type`, `Bike Type`, n, total_n, percentage)
            ## Now we will write code for showing max values in data table
            pmax <- p %>% group_by(`Rider Type`, `Bike Type`) %>% filter(n == max(n))
            routeshowmax = merge(pmax, routedata, by = "Route") %>% rename(`Start Station` = Station,`End Station` = end_station_name) ## This code is for modifying the table so as to display it in output along with station names
            routeshowmax <- routeshowmax[order(-routeshowmax$total_n),]
            routeshowmax <- routeshowmax %>% select(Route, `Start Station`, `End Station`, `Rider Type`, `Bike Type`, n, total_n, percentage) ## reordering the columns
            } else {
              routeshow <- routeshow %>% select(Route, `Start Station`, `End Station`, `Rider Type`, n, total_n, percentage) 
              pmax <- p %>% group_by(`Rider Type`) %>% filter(n == max(n))
              routeshowmax = merge(pmax, routedata, by = "Route") %>% rename(`Start Station` = Station,`End Station` = end_station_name) ## This code is for modifying the table so as to display it in output along with station names
              routeshowmax <- routeshowmax[order(-routeshowmax$total_n),]
              routeshowmax <- routeshowmax %>% select(Route, `Start Station`, `End Station`, `Rider Type`, n, total_n, percentage) ## reordering the columns
            }
            
         } else {
            
            p = common %>% 
              mutate(n = n/1000000) %>% mutate_if(is.numeric, round, digits = 4) %>% 
              group_by(!!sym(input$SelXaxis2)) %>% 
              mutate(total_n = sum(n), percentage = (n / total_n) * 100) %>% 
              mutate_if(is.numeric, round, digits = 3) %>% 
              ungroup()
          }
          
          if(input$Check2){
            
            if(input$Data2 == "Yes"){
              output$data3 <- DT::renderDataTable({
                
                DT::datatable(
                  
                  if(input$SelXaxis2 == "Route"){
                    
                    routeshowmax %>% mutate_if(is.numeric, round, digits = 5)
                    
                  } else{
                    
                    p %>% {if(input$Part2 == "Yes(by Bike-Type)"){group_by(. , `Rider Type`, `Bike Type`)} else {group_by(. , `Rider Type`)}} %>% filter(n == max(n)) %>% 
                      mutate_if(is.numeric, round, digits = 5)
                  },
                  filter = 'top', extensions = c('Buttons', 'Scroller'),
                  options = list(scrollY = 375,
                                 scrollX = 500,
                                 deferRender = TRUE,
                                 scroller = TRUE,
                                 # paging = TRUE,
                                 # pageLength = 25,
                                 buttons = list(list(extend = 'colvis', targets = 0, visible = FALSE)),
                                 dom = 'lBfrtip',
                                 fixedColumns = TRUE), 
                  rownames = FALSE)
              })  
              
            }else{ 
              
              output$data3 <- DT::renderDataTable({
                DT::datatable(NULL)
              })  
            }
            
          if(input$Part2 == "Yes(by Bike-Type)"){    
            h = p %>% group_by(`Rider Type`, `Bike Type`) %>% filter(n == max(n)) %>% mutate_if(is.numeric, round, digits = 5)
            d = p %>% mutate(n = 0)
            combined_d <- merge(d, h, by = c(paste(input$SelXaxis2), "Rider Type", "Bike Type"), all = TRUE) %>%
              mutate(n = ifelse(is.na(n.x), 0, n.x) + ifelse(is.na(n.y), 0, n.y)) %>% select(!!sym(input$SelXaxis2), `Rider Type`, `Bike Type`, n)
          } else {
            h = p %>% group_by(`Rider Type`) %>% filter(n == max(n)) %>% mutate_if(is.numeric, round, digits = 5)
            d = p %>% mutate(n = 0)
            combined_d <- merge(d, h, by = c(paste(input$SelXaxis2), "Rider Type"), all = TRUE) %>%
              mutate(n = ifelse(is.na(n.x), 0, n.x) + ifelse(is.na(n.y), 0, n.y)) %>% select(!!sym(input$SelXaxis2), `Rider Type`, n)
          }
            
            q = p %>% ggplot() + geom_bar(data = p, mapping = aes(x = get(input$SelXaxis2), y = n, fill = `Rider Type`, text = paste(input$SelXaxis2, ":", get(input$SelXaxis2), "<br>No.of Rides:", n, "<br>Type of Rider:", `Rider Type`)), 
                                          stat = "identity", position = position_dodge(0.6), width = 0.5, alpha = 0.2) + 
              {if(input$Part2 == "Yes(by Bike-Type)"){facet_wrap(~`Bike Type`)} else {}} + 
              geom_bar(data = combined_d, 
                       mapping = aes(x = get(input$SelXaxis2), y = n, fill = `Rider Type`), color = "black", linewidth = 0.3,
                       stat = "identity",  position = position_dodge(0.6), width = 0.5, alpha = 0.8) +
              {if(input$Part2 == "Yes(by Bike-Type)"){facet_wrap(~`Bike Type`)} else {}} + 
              geom_text(data = h, aes(label = paste0("<span style='color:black;font-weight:bold;'>", n, "</span>"), x = get(input$SelXaxis2), y = n), size = 2, position = position_nudge(x = -0.1, y = 0.001)) +
              expand_limits(y = c(0, max(p$n) * 1.2)) +
              scale_y_continuous(expand = c(0,0)) +
              
              { 
                if(input$SelXaxis2 == "Route"){
                  labs(x = paste(input$SelXaxis2),
                       y = "Number of Rides")
                } else{labs(x = paste(input$SelXaxis2),
                            y = "Number of Rides (in millions)")}
              } +
              
              theme_bw() +
              
              {if(input$SelXaxis2 == "Weekday" | input$SelXaxis2 == "Month Name"){
                
                theme(axis.text.x = element_text(angle = 45, vjust = 0.5, hjust=1))
                
              } else if(input$SelXaxis2 == "Route"){
                
                theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))
                
              }} + 
              
              {
                if(input$Trend2){
                  geom_smooth(data = p, mapping = aes(x = as.numeric(get(input$SelXaxis2)), y = n, col=`Rider Type`), method=loess, se=FALSE, linewidth = 0.5, show.legend = FALSE)
                }
              }
            
          } else{
            
            if(input$Data2 == "Yes"){
              output$data3 <- DT::renderDataTable({
                
                DT::datatable(
                  
                  if(input$SelXaxis2 == "Route"){
                    
                    routeshow %>% mutate_if(is.numeric, round, digits = 5)
                    
                  } else{
                    
                    p %>% mutate_if(is.numeric, round, digits = 5) 
                  },
                  filter = 'top', extensions = c('Buttons', 'Scroller'),
                  options = list(scrollY = 375,
                                 scrollX = 500,
                                 deferRender = TRUE,
                                 scroller = TRUE,
                                 # paging = TRUE,
                                 # pageLength = 25,
                                 buttons = list(list(extend = 'colvis', targets = 0, visible = FALSE)),
                                 dom = 'lBfrtip',
                                 fixedColumns = TRUE), 
                  rownames = FALSE)
              })  
              
            }else{ 
              
              output$data3 <- DT::renderDataTable({
                DT::datatable(NULL)
              })  
            }
            
            q = p %>% ggplot() + geom_bar(data = p, mapping = aes(x = get(input$SelXaxis2), y = n, fill = `Rider Type`, text = paste(input$SelXaxis2, ":", get(input$SelXaxis2), "<br>No.of Rides:", n, "<br>Type of Rider:", `Rider Type`)), 
                                          stat = "identity", position = position_dodge(0.6), width = 0.5) + 
              {if(input$Part2 == "Yes(by Bike-Type)"){facet_wrap(~`Bike Type`)}} +
              expand_limits(y = c(0, max(p$n) * 1.2)) +
              scale_y_continuous(expand = c(0,0)) +
              
              { 
                if(input$SelXaxis2 == "Route"){
                  labs(x = paste(input$SelXaxis2),
                       y = "Number of Rides")
                } else{labs(x = paste(input$SelXaxis2),
                            y = "Number of Rides (in millions)")}
              } +
              
              theme_bw() +
              
              {if(input$SelXaxis2 == "Weekday" | input$SelXaxis2 == "Month Name"){
                
                theme(axis.text.x = element_text(angle = 45, vjust = 0.5, hjust=1))
                
              } else if(input$SelXaxis2 == "Route"){
                
                theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))
                
              }} +
              
              {
                if(input$Trend2){
                  geom_smooth(data = p, mapping = aes(x = as.numeric(get(input$SelXaxis2)), y = n, col=`Rider Type`), method=loess, se=FALSE, linewidth = 0.5, show.legend = FALSE)
                }
              }
          }
          
       p_plotly <- ggplotly(q, tooltip = "text", source = "chart2")
          
          p_plotly <- p_plotly %>% 
            
            layout(
              
              title = list(
                text = paste0("<b>","Number of Rides by ", input$SelXaxis2, "</b>","<br><span style='font-size:14pt; display:block; padding-top:10px;'>", "<sup>","Dataset:", input$Seldata, "</sup>", "</span></br>"),
                x = 0.5, # Center alignment
                xanchor = "center"),
              margin = list(t = 100, r = 50, b = 50) # Increase top margin to add space between title and plot
            )
          
          p_plotly <- event_register(p_plotly, "plotly_click")
          # Return plotly object
          p_plotly
    
      }
C.When Y-Axis is Ride Duration or Ride Distance
else if(input$SelYaxis2 != "No.of Rides"){
 
      if(input$SelYaxis2 == "Avg.Ride-Duration"){
            
            p1 = yd %>% 
              {if(input$Part2 == "Yes(by Bike-Type)"){group_by(. , !!sym(input$SelXaxis2), `Rider Type`, `Bike Type`)}else{group_by(. , !!sym(input$SelXaxis2), `Rider Type`)}} %>% 
              summarise(mean = mean(`Ride Length`)) %>% ungroup() %>% mutate_if(is.numeric, round, digits = 4)
            
          } else if(input$SelYaxis2 == "Avg.Ride-Distance") {
            
            p1 = yd %>% 
              filter(!is.na(`Station Distance`)) %>% 
            {if(input$Part2 == "Yes(by Bike-Type)"){group_by(. , !!sym(input$SelXaxis2), `Rider Type`, `Bike Type`)}else{group_by(. , !!sym(input$SelXaxis2), `Rider Type`)}} %>% 
              summarise(mean = mean(`Station Distance`)) %>% ungroup() %>% mutate_if(is.numeric, round, digits = 4)
          }
          
          if(input$Check2){
            
            if(input$Data2 == "Yes"){
              output$data3 <- DT::renderDataTable({
                
                DT::datatable(
                  p1 %>% {if(input$Part2 == "Yes(by Bike-Type)"){group_by(. , `Rider Type`, `Bike Type`)}else{group_by(. , `Rider Type`)}} %>% filter(mean == max(mean)) %>% 
                    mutate_if(is.numeric, round, digits = 4),
                  filter = 'top', extensions = c('Buttons', 'Scroller'),
                  options = list(scrollY = 375,
                                 scrollX = 500,
                                 deferRender = TRUE,
                                 scroller = TRUE,
                                 # paging = TRUE,
                                 # pageLength = 25,
                                 buttons = list(list(extend = 'colvis', targets = 0, visible = FALSE)),
                                 dom = 'lBfrtip',
                                 fixedColumns = TRUE), 
                  rownames = FALSE)
              })  
              
            }else{ 
              
              output$data3 <- DT::renderDataTable({
                DT::datatable(NULL)
              })  
            }
            
            h1 = p1 %>% {if(input$Part2 == "Yes(by Bike-Type)"){group_by(. , `Rider Type`, `Bike Type`)}else{group_by(. , `Rider Type`)}} %>% filter(mean == max(mean)) %>% mutate_if(is.numeric, round, digits = 4)
            
            d1 = p1 %>% mutate(mean = 0)
            
          if(input$Part2 == "Yes(by Bike-Type)"){ 
            combined_d1 <- merge(d1, h1, by = c(paste(input$SelXaxis2), "Rider Type", "Bike Type"), all = TRUE) %>%
              mutate(mean = ifelse(is.na(mean.x), 0, mean.x) + ifelse(is.na(mean.y), 0, mean.y)) %>%
              select(!!sym(input$SelXaxis2), `Rider Type`,`Bike Type`, mean) %>% mutate_if(is.numeric, round, digits = 4)
          } else {
            combined_d1 <- merge(d1, h1, by = c(paste(input$SelXaxis2), "Rider Type"), all = TRUE) %>%
              mutate(mean = ifelse(is.na(mean.x), 0, mean.x) + ifelse(is.na(mean.y), 0, mean.y)) %>%
              select(!!sym(input$SelXaxis2), `Rider Type`, mean) %>% mutate_if(is.numeric, round, digits = 4)
          }
            
            q1 = p1 %>% ggplot() + geom_bar(data = p1, mapping = aes(x = get(input$SelXaxis2), y = mean, fill = `Rider Type`, text = paste(input$SelXaxis2, ":", get(input$SelXaxis2), "<br>", input$SelYaxis2, ":", mean , "<br>Type of Rider:", `Rider Type`)), 
                                            stat = "identity", position = position_dodge(0.6), width = 0.5, alpha = 0.2) + 
              {if(input$Part2 == "Yes(by Bike-Type)"){facet_wrap(~`Bike Type`)}} +
              geom_bar(data = combined_d1, 
                       mapping = aes(x = get(input$SelXaxis2), y = mean, fill = `Rider Type`), color = "black", linewidth = 0.3,
                       stat = "identity",  position = position_dodge(0.6), width = 0.5, alpha = 0.8) +
              {if(input$Part2 == "Yes(by Bike-Type)"){facet_wrap(~`Bike Type`)}} +
              geom_text(data = h1, aes(label = paste0("<span style='color:black;font-weight:bold;'>", mean, "</span>"), x = get(input$SelXaxis2), y = mean), size = 2, position = position_nudge(x = -0.1, y = 0.001)) +
              expand_limits(y = c(0, max(p1$mean) * 1.2)) +
              scale_y_continuous(expand = c(0,0)) +
              labs(x = paste(input$SelXaxis2),
                   y = paste(input$SelYaxis2),
                   caption = "Source: https://divvybikes.com/system-data") +
              theme_bw() +
              
              {if(input$SelXaxis2 == "Weekday" | input$SelXaxis2 == "Month Name"){
                theme(axis.text.x = element_text(angle = 45, vjust = 0.5, hjust=1))
              }} +
              
              {
                if(input$Trend2){
                  geom_smooth(data = p1, mapping = aes(x = as.numeric(get(input$SelXaxis2)), y = mean, col=`Rider Type`), method=loess, se=FALSE, linewidth = 0.5, show.legend = FALSE)
                }
              }
            
          } else{
            
            if(input$Data2 == "Yes"){
              output$data3 <- DT::renderDataTable({
                
                DT::datatable(
                  p1 %>% mutate_if(is.numeric, round, digits = 4) ,
                  filter = 'top', extensions = c('Buttons', 'Scroller'),
                  options = list(scrollY = 375,
                                 scrollX = 500,
                                 deferRender = TRUE,
                                 scroller = TRUE,
                                 # paging = TRUE,
                                 # pageLength = 25,
                                 buttons = list(list(extend = 'colvis', targets = 0, visible = FALSE)),
                                 dom = 'lBfrtip',
                                 fixedColumns = TRUE), 
                  rownames = FALSE)
              })  
              
            }else{ 
              
              output$data3 <- DT::renderDataTable({
                DT::datatable(NULL)
              })  
            }
            
            q1 = p1 %>% ggplot() + geom_bar(data = p1, mapping = aes(x = get(input$SelXaxis2), y = mean, fill = `Rider Type`, text = paste(input$SelXaxis2, ":", get(input$SelXaxis2), "<br>", input$SelYaxis2, ":", mean , "<br>Type of Rider:", `Rider Type`)), 
                                            stat = "identity", position = position_dodge(0.6), width = 0.5) +
              {if(input$Part2 == "Yes(by Bike-Type)"){facet_wrap(~`Bike Type`)}} +
              expand_limits(y = c(0, max(p1$mean) * 1.2)) +
              scale_y_continuous(expand = c(0,0)) +
              labs(x = paste(input$SelXaxis2),
                   y = paste(input$SelYaxis2),
                   caption = "Source: https://divvybikes.com/system-data") +
              theme_bw() +
              
              {if(input$SelXaxis2 == "Weekday" | input$SelXaxis2 == "Month Name"){
                theme(axis.text.x = element_text(angle = 45, vjust = 0.5, hjust=1))
              }} +
              
              {
                if(input$Trend2){
                  geom_smooth(data = p1, mapping = aes(x = as.numeric(get(input$SelXaxis2)), y = mean, col=`Rider Type`), method=loess, se=FALSE, linewidth = 0.5, show.legend = FALSE)
                }
              }
          }
         
          p_plotly1 <- ggplotly(q1, tooltip = "text", source = "chart2")
          
          p_plotly1 <- p_plotly1 %>% 
            
            layout(
              title = list(
                text = paste0("<b>", input$SelYaxis2, " by ", input$SelXaxis2, "</b>","<br><span style='font-size:14pt; display:block; padding-top:10px;'>", "<sup>","Dataset:", input$Seldata, "</sup>", "</span></br>"),
                x = 0.5, # Center alignment
                xanchor = "center"),
              margin = list(t = 100, r = 50, b = 50) # Increase top margin to add space between title and plot
            )
          
          p_plotly1 <- event_register(p_plotly1, "plotly_click")
          
          # Return plotly object
          p_plotly1
        } 

6.Generating Pie Charts and Heatmaps (Linked Charts)

A.Overview of the code:
  1. observeEvent(): This function sets up a reactive observer in Shiny. It watches for a specified event and triggers the code inside its block whenever that event occurs.

  2. event_data(“plotly_click”, source = “chart2”): This function from the plotly package captures data from a click event on a Plotly chart. The source = “chart2” specifies that the click event is being listened to on a Plotly chart with the source identifier “chart2”.

The if conditional statement checks if click_data2 is not NULL, ensuring that the subsequent code only executes if a click event has occurred and valid data is available.

      observeEvent(event_data("plotly_click", source = "chart2"), {
      click_data2 <- event_data("plotly_click", source = "chart2")
      
      if (!is.null(click_data2)) {
        
        xInp <- click_data2[["pointNumber"]]
        xInp2 <- round(as.integer(click_data2[["x"]]), digits = 0)
        cInp <- click_data2[["curveNumber"]]
       
        if(input$SelAna == "Yearly"){
          
          dp <- yearroute
          
        } else if(input$SelAna == "Seasonal"){
          
          dp <- yearroute %>% 
            filter(Season == paste(input$Seldata))
          
        } else {
          
          dp <- yearroute %>% 
            filter(`Month Name` == paste(input$Seldata))
        }
   
        if(input$SelYaxis2 == "No.of Rides"){
        }
        
        if(input$SelYaxis2 != "No.of Rides"){
        }
B.Generating Pie Charts

The reason behind generating Pie charts on the click of a bar is because the Y-axis variable is no.of rides. Rides can be shown as distribution in the form of Pie chart.

When the partition is not checked in, a Pie chart will be shown showing distribution of rides among Casual and Member Riders. When Partition box is checked, a Pie-Donut chart will be shown where a Donut chart will surround the Pie chart. In a Pie-Donut chart, the Pie chart shows the distribution of rides among Casual and Member whereas the Donut part further classifies the rides into Bike type: Classic, Docked and Electric for every rider.

if(input$SelYaxis2 == "No.of Rides"){
          
          if(input$Part2 == "Yes(by Bike-Type)"){
            
          common = dp %>% 
            select(`Rider Type`, `Bike Type`, !!sym(input$SelXaxis2)) %>% 
            group_by(!!sym(input$SelXaxis2), `Rider Type`, `Bike Type`) %>% 
            count(`Bike Type`) %>% ungroup()
          
          } else {
          
          common = dp %>% 
            select(`Rider Type`, !!sym(input$SelXaxis2)) %>% 
            group_by(!!sym(input$SelXaxis2), `Rider Type`) %>% 
            count(`Rider Type`) %>% ungroup()
          }
          
          if(input$SelXaxis2 == "Route"){
            
            q1 = common %>% 
              group_by(Route) %>% 
              mutate(total_n = sum(n), percentage = (n / total_n) * 100) %>% 
              ungroup()
            
            s1 = q1[order(-q1$total_n), ]
            r1 = s1 %>% select(Route) %>% distinct() %>% slice(1:25)
            p = semi_join(s1,r1, by=("Route"))
            r1$Route <- factor(r1$Route, levels = unique(r1$Route)) ## This will keep the bars in predefined decreasing order in graph
            xInp1 <- r1$Route[xInp + 1]
            print(xInp + 1)
            print(xInp1)
            print(class(xInp1))
            
          } else {
            
            p = common %>% 
              mutate(n = n/1000000) %>% mutate_if(is.numeric, round, digits = 5)
            
            if(input$SelXaxis2 == "Weekday"){
              
              weekday <- c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday")
              print(xInp)
              xInp1 = weekday[xInp + 1]
          
            } else if((input$SelXaxis2 == "Start Time (in Hrs)")){
             
              hours <- c("0","1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23")
              print(xInp)
              xInp1 = hours[xInp + 1]
         
            } else if(input$SelXaxis2 == "Day of Month"){
              
              days <- c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", "25", "26", "27", "28", "29", "30", "31")
              print(xInp)
              xInp1 = days[xInp + 1]
           
            } else if(input$SelXaxis2 == "Week of Year"){
              
              if(input$Part2 == "Yes(by Bike-Type)"){
              if(cInp == "1"){
                Week <- dp %>% filter(`Bike Type` == "Docked") %>% select(`Week of Year`) %>% distinct()
              } else {
                Week <- dp %>% select(`Week of Year`) %>% distinct()
              }
            } else {
              Week <- dp %>% select(`Week of Year`) %>% distinct() 
            }
              Week <- Week[order(Week$`Week of Year`),]
              Week$`Week of Year` <- factor(Week$`Week of Year`, levels = unique(Week$`Week of Year`))
              print(xInp)
              xInp1 = Week$`Week of Year`[xInp + 1]
              
            } else if(input$SelXaxis2 == "Day of Year"){
              
              if(input$Part2 == "Yes(by Bike-Type)"){
                if(cInp == "2" | cInp == "4"){
                  Yearday <- dp %>% filter(`Bike Type` == "Electric") %>% select(`Day of Year`) %>% distinct()
                } else {
                  Yearday <- dp %>% select(`Day of Year`) %>% distinct()
                }
              } else {
                Yearday <- dp %>% select(`Day of Year`) %>% distinct() 
              }
                Yearday <- Yearday[order(Yearday$`Day of Year`),]
                Yearday$`Day of Year` <- factor(Yearday$`Day of Year`, levels = unique(Yearday$`Day of Year`))
                xInp1 = Yearday$`Day of Year`[xInp + 1]
                
            } else if(input$SelXaxis2 == "Month Name"){
              
              Months <- dp %>% select(`Month Name`) %>% distinct() 
              
              if(input$Seldata == "Entire Year"){
                
                Months$`Month Name` <- factor(Months$`Month Name`, levels = c("May 2023", "June 2023", "July 2023", "Aug 2023", "Sep 2023", "Oct 2023", "Nov 2023", "Dec 2023", "Jan 2024", "Feb 2024", "March 2024", "April 2024"))
              
              } else if(input$Seldata == "Spring"){
                
                Months$`Month Name` <- factor(Months$`Month Name`, levels = c("May 2023", "March 2024", "April 2024"))
               
              } else if(input$Seldata == "Summer"){
                
                Months$`Month Name` <- factor(Months$`Month Name`, levels = c("June 2023", "July 2023", "Aug 2023"))
                
              } else if(input$Seldata == "Fall"){
                
                Months$`Month Name` <- factor(Months$`Month Name`, levels = c("Sep 2023", "Oct 2023", "Nov 2023"))
               
             } else if(input$Seldata == "Winter"){
                
                Months$`Month Name` <- factor(Months$`Month Name`, levels = c("Dec 2023", "Jan 2024", "Feb 2024"))
              }
              
              Months <- Months[order(Months$`Month Name`),]
              xInp1 <- Months$`Month Name`[xInp + 1]
              
            } else if(input$SelXaxis2 == "Season"){
              
              if(input$Part2 == "Yes(by Bike-Type)"){
                if(cInp == "1"){
                Seasons <- dp %>% filter(`Bike Type` == "Docked") %>% select(`Season`) %>% distinct()
                } else {
                  Seasons <- dp %>% select(`Season`) %>% distinct()
                }
              } else {
                Seasons <- dp %>% select(`Season`) %>% distinct() 
              }
                Seasons$Season <- factor(Seasons$Season, levels = c("Fall", "Spring", "Summer", "Winter"))
                Seasons <- Seasons[order(Seasons$Season),]  
                print(xInp)
                xInp1 <- Seasons$Season[xInp + 1]
                print(xInp1)
                print(as.character(xInp1))
            }
          }
              
              m <- p %>% filter(!!sym(input$SelXaxis2) == as.character(xInp1)) %>% rename(Rider = `Rider Type`) %>% { if(input$Part2 == "Yes(by Bike-Type)") {rename(. , Bike = `Bike Type`)} else {.}}
              
        if(input$Part2 == "Yes(by Bike-Type)"){
                
              ch <- reactive({ PieDonut(m, aes(Rider, Bike, count = n), title = paste("Distribution of Rides for :", xInp1)) })
              
              showModal(modalDialog(
                plotOutput("Pie"),
                title = paste("Pie-Donut Chart" )
              ))
              output$Pie <- renderPlot({ch()})
              
        } else {
            
              custom_color <- c("#d45646", "#088F8F")
              ch <- reactive({ plot_ly(data = m, labels = ~Rider, values = ~n, type = "pie", marker = list(colors = custom_color, line = list(color = "white", width = 1))) %>% 
                  layout(title = paste("Distribution of Rides for :", xInp1)) 
              })
              
              showModal(modalDialog(
                plotlyOutput("Pie"),
                title = paste("Pie Chart" )
              ))
              
              output$Pie <- renderPlotly({ch()})
        }
              
}            
C.Generation of Heatmaps
else if(input$SelYaxis2 != "No.of Rides"){
        
          if(input$SelXaxis2 == "Weekday"){
            
            if(input$Part2 == "Yes(by Bike-Type)"){
           
            if(input$SelYaxis2 == "Avg.Ride-Duration"){
              
              q = dp %>% 
                group_by(Weekday, `Rider Type`, `Bike Type`, `Start Time (in Hrs)`) %>% 
                summarise(mean = mean(`Ride Length`)) %>% ungroup()
              
            } else if(input$SelYaxis2 == "Avg.Ride-Distance"){
              
              q = dp %>% 
                filter(!is.na(`Station Distance`)) %>% 
                group_by(Weekday, `Rider Type`, `Bike Type`, `Start Time (in Hrs)`) %>% 
                summarise(mean = mean(`Station Distance`)) %>% ungroup()
            } 
          } else {
              
            if(input$SelYaxis2 == "Avg.Ride-Duration"){
              
              q = dp %>% 
                group_by(Weekday, `Rider Type`, `Start Time (in Hrs)`) %>% 
                summarise(mean = mean(`Ride Length`)) %>% ungroup()
              
            } else if(input$SelYaxis2 == "Avg.Ride-Distance"){
              
              q = dp %>% 
                filter(!is.na(`Station Distance`)) %>% 
                group_by(Weekday, `Rider Type`, `Start Time (in Hrs)`) %>% 
                summarise(mean = mean(`Station Distance`)) %>% ungroup()
            } 
          }
            
            weekdays <- c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday")
            
            xInp1 <- weekdays[xInp + 1]
            
            q_cas_dock <- q %>% filter(Weekday == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Docked")} else {.}}
            q_cas_clas <- q %>% filter(Weekday == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Classic")} else {.}}
            q_cas_elec <- q %>% filter(Weekday == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Electric")} else {.}}
            q_mem_clas <- q %>% filter(Weekday == as.character(xInp1)) %>% filter(`Rider Type` == "Member") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Classic")} else {.}}
            q_mem_elec <- q %>% filter(Weekday == as.character(xInp1)) %>% filter(`Rider Type` == "Member") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Electric")} else {.}}
            
            filterdata1 <- right_join(q_cas_dock, starttime,   by='Start Time (in Hrs)')
            filterdata2 <- right_join(q_cas_clas, starttime,   by='Start Time (in Hrs)')
            filterdata3 <- right_join(q_cas_elec, starttime,   by='Start Time (in Hrs)')
            filterdata4 <- right_join(q_mem_clas, starttime,   by='Start Time (in Hrs)')
            filterdata5 <- right_join(q_mem_elec, starttime,   by='Start Time (in Hrs)')
            
            if(!any(is.na(filterdata1)) && !any(is.na(filterdata2)) && !any(is.na(filterdata3)) && !any(is.na(filterdata4)) && !any(is.na(filterdata5))){
              
              filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
              
            } else {
              
              
            if(any(is.na(filterdata1))){
                
                filterdata1$mean <- ifelse(is.na(filterdata1$mean), 0, filterdata1$mean)
                filterdata1$Weekday <- ifelse(is.na(filterdata1$Weekday), xInp1, filterdata1$Weekday)
                filterdata1$`Rider Type` <- ifelse(is.na(filterdata1$`Rider Type`), "Casual", filterdata1$`Rider Type`)
                if(input$Part2 == "Yes(by Bike-Type)") { 
                filterdata1$`Bike Type` <- ifelse(is.na(filterdata1$`Bike Type`), "Docked", filterdata1$`Bike Type`)
                }
              } 
              
              if(any(is.na(filterdata2))){
                
                filterdata2$mean <- ifelse(is.na(filterdata2$mean), 0, filterdata2$mean)
                filterdata2$Weekday <- ifelse(is.na(filterdata2$Weekday), xInp1, filterdata2$Weekday)
                filterdata2$`Rider Type` <- ifelse(is.na(filterdata2$`Rider Type`), "Casual", filterdata2$`Rider Type`)
                if(input$Part2 == "Yes(by Bike-Type)") { 
                  filterdata2$`Bike Type` <- ifelse(is.na(filterdata2$`Bike Type`), "Classic", filterdata2$`Bike Type`)
                }
              } 
              
              if(any(is.na(filterdata3))){
                
                filterdata3$mean <- ifelse(is.na(filterdata3$mean), 0, filterdata3$mean)
                filterdata3$Weekday <- ifelse(is.na(filterdata3$Weekday), xInp1, filterdata3$Weekday)
                filterdata3$`Rider Type` <- ifelse(is.na(filterdata3$`Rider Type`), "Casual", filterdata3$`Rider Type`)
                if(input$Part2 == "Yes(by Bike-Type)") { 
                  filterdata3$`Bike Type` <- ifelse(is.na(filterdata3$`Bike Type`), "Electric", filterdata3$`Bike Type`)
                }
              } 
              
              if(any(is.na(filterdata4))){
                
                filterdata4$mean <- ifelse(is.na(filterdata4$mean), 0, filterdata4$mean)
                filterdata4$Weekday <- ifelse(is.na(filterdata4$Weekday), xInp1, filterdata4$Weekday)
                filterdata4$`Rider Type` <- ifelse(is.na(filterdata4$`Rider Type`), "Member", filterdata4$`Rider Type`)
                if(input$Part2 == "Yes(by Bike-Type)") { 
                  filterdata4$`Bike Type` <- ifelse(is.na(filterdata4$`Bike Type`), "Classic", filterdata4$`Bike Type`)
                }
              } 
              
              if(any(is.na(filterdata5))){
                
                filterdata5$mean <- ifelse(is.na(filterdata5$mean), 0, filterdata5$mean)
                filterdata5$Weekday <- ifelse(is.na(filterdata5$Weekday), xInp1, filterdata5$Weekday)
                filterdata5$`Rider Type` <- ifelse(is.na(filterdata5$`Rider Type`), "Member", filterdata5$`Rider Type`)
                if(input$Part2 == "Yes(by Bike-Type)") { 
                  filterdata5$`Bike Type` <- ifelse(is.na(filterdata5$`Bike Type`), "Electric", filterdata5$`Bike Type`)
                }
              }
              
              filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
              
            }
            
            filterdata$`Start Time (in Hrs)` <- factor(filterdata$`Start Time (in Hrs)`, levels = c("0","1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23"))
            
            ch <- reactive({ 
                
              {if(input$Part2 == "Yes(by Bike-Type)") { 
                ggplot(filterdata, aes(x = `Start Time (in Hrs)`, y = interaction(`Rider Type`, `Bike Type`), fill = mean))
                } else {
                  ggplot(filterdata, aes(x = `Start Time (in Hrs)`, y = `Rider Type`, fill = mean))
                }} +
                geom_tile(color = "white") +
                scale_fill_gradient(low = "lightblue", high = "darkblue") +
                
                {if(input$Part2 == "Yes(by Bike-Type)") { 
                  labs(x = "Time of the Day", y = "Rider-Bike Combination", fill = "mean") 
                } else {
                  labs(x = "Time of the Day", y = "Rider Type", fill = "mean") 
                }} +
                theme_minimal() +
                theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
                labs(title = paste(input$SelYaxis2, " distribution across Time of the Day for", xInp1))
            }) 
            
            showModal(modalDialog(
              plotOutput("HeatMap"),
              title = paste("Heat Map" )
            ))
            
            output$HeatMap <- renderPlot({ch()})
            
          } else if((input$SelXaxis2 == "Start Time (in Hrs)")){
            
            if(input$Part2 == "Yes(by Bike-Type)") {   
            
            if(input$SelYaxis2 == "Avg.Ride-Duration"){
              
              q = dp %>%
                group_by(`Start Time (in Hrs)`, `Rider Type`, `Bike Type`, Station ) %>%
                summarise(mean = mean(`Ride Length`)) %>% ungroup()
              
            } else if(input$SelYaxis2 == "Avg.Ride-Distance"){
              
              q = dp %>% 
                filter(!is.na(`Station Distance`)) %>% 
                group_by(`Start Time (in Hrs)`, `Rider Type`, `Bike Type`, Station ) %>%
                summarise(mean = mean(`Station Distance`)) %>% ungroup()
            } 
            } else {
              
              if(input$SelYaxis2 == "Avg.Ride-Duration"){
                
                q = dp %>%
                  group_by(`Start Time (in Hrs)`, `Rider Type`, Station ) %>%
                  summarise(mean = mean(`Ride Length`)) %>% ungroup()
                
              } else if(input$SelYaxis2 == "Avg.Ride-Distance"){
                
                q = dp %>% 
                  filter(!is.na(`Station Distance`)) %>% 
                  group_by(`Start Time (in Hrs)`, `Rider Type`, Station ) %>%
                  summarise(mean = mean(`Station Distance`)) %>% ungroup()
              }
              
            }
            
            hours <- c("0","1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23")
            
            xInp1 = hours[xInp + 1]
            
            q_cas_dock <- q %>% filter(`Start Time (in Hrs)` == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Docked")} else {.}}
            q_cas_clas <- q %>% filter(`Start Time (in Hrs)` == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Classic")} else {.}}
            q_cas_elec <- q %>% filter(`Start Time (in Hrs)` == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Electric")} else {.}}
            q_mem_clas <- q %>% filter(`Start Time (in Hrs)` == as.character(xInp1)) %>% filter(`Rider Type` == "Member") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Classic")} else {.}}
            q_mem_elec <- q %>% filter(`Start Time (in Hrs)` == as.character(xInp1)) %>% filter(`Rider Type` == "Member") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Electric")} else {.}}
            
            
            filterdata1 <- right_join(q_cas_dock, Station,   by='Station')
            filterdata2 <- right_join(q_cas_clas, Station,   by='Station')
            filterdata3 <- right_join(q_cas_elec, Station,   by='Station')
            filterdata4 <- right_join(q_mem_clas, Station,   by='Station')
            filterdata5 <- right_join(q_mem_elec, Station,   by='Station')
            
            if(!any(is.na(filterdata1)) && !any(is.na(filterdata2)) && !any(is.na(filterdata3)) && !any(is.na(filterdata4)) && !any(is.na(filterdata5))){
              
              filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
              
            } else {
              
              
              if(any(is.na(filterdata1))){
                
                filterdata1$mean <- ifelse(is.na(filterdata1$mean), 0, filterdata1$mean)
                filterdata1$`Start Time (in Hrs)` <- ifelse(is.na(filterdata1$`Start Time (in Hrs)`), xInp1, filterdata1$`Start Time (in Hrs)`)
                filterdata1$`Rider Type` <- ifelse(is.na(filterdata1$`Rider Type`), "Casual", filterdata1$`Rider Type`)
                if(input$Part2 == "Yes(by Bike-Type)") { 
                filterdata1$`Bike Type` <- ifelse(is.na(filterdata1$`Bike Type`), "Docked", filterdata1$`Bike Type`)
                }
              } 
              
              if(any(is.na(filterdata2))){
                
                filterdata2$mean <- ifelse(is.na(filterdata2$mean), 0, filterdata2$mean)
                filterdata2$`Start Time (in Hrs)` <- ifelse(is.na(filterdata2$`Start Time (in Hrs)`), xInp1, filterdata2$`Start Time (in Hrs)`)
                filterdata2$`Rider Type` <- ifelse(is.na(filterdata2$`Rider Type`), "Casual", filterdata2$`Rider Type`)
                if(input$Part2 == "Yes(by Bike-Type)") { 
                  filterdata2$`Bike Type` <- ifelse(is.na(filterdata2$`Bike Type`), "Classic", filterdata2$`Bike Type`)
                }
              } 
              
              if(any(is.na(filterdata3))){
                
                filterdata3$mean <- ifelse(is.na(filterdata3$mean), 0, filterdata3$mean)
                filterdata3$`Start Time (in Hrs)` <- ifelse(is.na(filterdata3$`Start Time (in Hrs)`), xInp1, filterdata3$`Start Time (in Hrs)`)
                filterdata3$`Rider Type` <- ifelse(is.na(filterdata3$`Rider Type`), "Casual", filterdata3$`Rider Type`)
                if(input$Part2 == "Yes(by Bike-Type)") { 
                  filterdata3$`Bike Type` <- ifelse(is.na(filterdata3$`Bike Type`), "Electric", filterdata3$`Bike Type`)
                }
              } 
              
              if(any(is.na(filterdata4))){
                
                filterdata4$mean <- ifelse(is.na(filterdata4$mean), 0, filterdata4$mean)
                filterdata4$`Start Time (in Hrs)` <- ifelse(is.na(filterdata4$`Start Time (in Hrs)`), xInp1, filterdata4$`Start Time (in Hrs)`)
                filterdata4$`Rider Type` <- ifelse(is.na(filterdata4$`Rider Type`), "Member", filterdata4$`Rider Type`)
                if(input$Part2 == "Yes(by Bike-Type)") { 
                  filterdata4$`Bike Type` <- ifelse(is.na(filterdata4$`Bike Type`), "Classic", filterdata4$`Bike Type`)
                }
              } 
              
              if(any(is.na(filterdata5))){
                
                filterdata5$mean <- ifelse(is.na(filterdata5$mean), 0, filterdata5$mean)
                filterdata5$`Start Time (in Hrs)` <- ifelse(is.na(filterdata5$`Start Time (in Hrs)`), xInp1, filterdata5$`Start Time (in Hrs)`)
                filterdata5$`Rider Type` <- ifelse(is.na(filterdata5$`Rider Type`), "Member", filterdata5$`Rider Type`)
                if(input$Part2 == "Yes(by Bike-Type)") { 
                  filterdata5$`Bike Type` <- ifelse(is.na(filterdata5$`Bike Type`), "Electric", filterdata5$`Bike Type`)
                }
              }
              
              filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
            }
            
            filterdata$Station <- factor(filterdata$Station, levels = unique(Station$Station))
            
            ch <- reactive({ 
              
              {if(input$Part2 == "Yes(by Bike-Type)") { 
                ggplot(filterdata, aes(x = Station, y = interaction(`Rider Type`, `Bike Type`), fill = mean))
              } else {
                ggplot(filterdata, aes(x = Station, y = `Rider Type`, fill = mean))
              }} +
                geom_tile(color = "white") +
                scale_fill_gradient(low = "yellow", high = "red") +
                {if(input$Part2 == "Yes(by Bike-Type)") { 
                  labs(x = "Top 20 Stations", y = "Rider-Bike Combination", fill = "mean")
                } else {
                  labs(x = "Top 20 Stations", y = "Rider Type", fill = "mean")
                }} +
                theme_minimal() +
                theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
                labs(title = paste(input$SelYaxis2, " distribution across Stations of Chicago for the Hour:", xInp1))
            }) 
            
            showModal(modalDialog(
              plotOutput("HeatMap"),
              title = paste("Heat Map" )
            ))
            
            output$HeatMap <- renderPlot({ch()})
            
          } else if(input$SelXaxis2 == "Day of Month"){
            
          if(input$Part2 == "Yes(by Bike-Type)") { 
            
            if(input$SelYaxis2 == "Avg.Ride-Duration"){
              
              q = dp %>%
                group_by(`Day of Month`, `Rider Type`, `Bike Type`, `Start Time (in Hrs)`) %>% 
                summarise(mean = mean(`Ride Length`)) %>% ungroup()
              
            } else if(input$SelYaxis2 == "Avg.Ride-Distance"){
              
              q = dp %>% 
                filter(!is.na(`Station Distance`)) %>% 
                group_by(`Day of Month`, `Rider Type`, `Bike Type`, `Start Time (in Hrs)`) %>% 
                summarise(mean = mean(`Station Distance`)) %>% ungroup()
            } 
          } else {
              
            if(input$SelYaxis2 == "Avg.Ride-Duration"){
              
              q = dp %>%
                group_by(`Day of Month`, `Rider Type`,`Start Time (in Hrs)`) %>% 
                summarise(mean = mean(`Ride Length`)) %>% ungroup()
              
            } else if(input$SelYaxis2 == "Avg.Ride-Distance"){
              
              q = dp %>% 
                filter(!is.na(`Station Distance`)) %>% 
                group_by(`Day of Month`, `Rider Type`, `Start Time (in Hrs)`) %>% 
                summarise(mean = mean(`Station Distance`)) %>% ungroup()
            }
            
            }
            
            days <- c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", "25", "26", "27", "28", "29", "30", "31")
            
            xInp1 = days[xInp + 1]
            
            q_cas_dock <- q %>% filter(`Day of Month` == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Docked")} else {.}}
            q_cas_clas <- q %>% filter(`Day of Month` == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Classic")} else {.}}
            q_cas_elec <- q %>% filter(`Day of Month` == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Electric")} else {.}}
            q_mem_clas <- q %>% filter(`Day of Month` == as.character(xInp1)) %>% filter(`Rider Type` == "Member") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Classic")} else {.}}
            q_mem_elec <- q %>% filter(`Day of Month` == as.character(xInp1)) %>% filter(`Rider Type` == "Member") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Electric")} else {.}}
            
            filterdata1 <- right_join(q_cas_dock, starttime,   by='Start Time (in Hrs)')
            filterdata2 <- right_join(q_cas_clas, starttime,   by='Start Time (in Hrs)')
            filterdata3 <- right_join(q_cas_elec, starttime,   by='Start Time (in Hrs)')
            filterdata4 <- right_join(q_mem_clas, starttime,   by='Start Time (in Hrs)')
            filterdata5 <- right_join(q_mem_elec, starttime,   by='Start Time (in Hrs)')
            
            if(!any(is.na(filterdata1)) && !any(is.na(filterdata2)) && !any(is.na(filterdata3)) && !any(is.na(filterdata4)) && !any(is.na(filterdata5))){
              
              filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
              
            } else {
              
              
              if(any(is.na(filterdata1))){
                
                filterdata1$mean <- ifelse(is.na(filterdata1$mean), 0, filterdata1$mean)
                filterdata1$`Day of Month` <- ifelse(is.na(filterdata1$`Day of Month`), xInp1, filterdata1$`Day of Month`)
                filterdata1$`Rider Type` <- ifelse(is.na(filterdata1$`Rider Type`), "Casual", filterdata1$`Rider Type`)
                if(input$Part2 == "Yes(by Bike-Type)"){
                filterdata1$`Bike Type` <- ifelse(is.na(filterdata1$`Bike Type`), "Docked", filterdata1$`Bike Type`)
                }
              } 
              
              if(any(is.na(filterdata2))){
                
                filterdata2$mean <- ifelse(is.na(filterdata2$mean), 0, filterdata2$mean)
                filterdata2$`Day of Month` <- ifelse(is.na(filterdata2$`Day of Month`), xInp1, filterdata2$`Day of Month`)
                filterdata2$`Rider Type` <- ifelse(is.na(filterdata2$`Rider Type`), "Casual", filterdata2$`Rider Type`)
                if(input$Part2 == "Yes(by Bike-Type)"){
                  filterdata2$`Bike Type` <- ifelse(is.na(filterdata2$`Bike Type`), "Classic", filterdata2$`Bike Type`)
                }
              } 
              
              if(any(is.na(filterdata3))){
                
                filterdata3$mean <- ifelse(is.na(filterdata3$mean), 0, filterdata3$mean)
                filterdata3$`Day of Month` <- ifelse(is.na(filterdata3$`Day of Month`), xInp1, filterdata3$`Day of Month`)
                filterdata3$`Rider Type` <- ifelse(is.na(filterdata3$`Rider Type`), "Casual", filterdata3$`Rider Type`)
                if(input$Part2 == "Yes(by Bike-Type)"){
                  filterdata3$`Bike Type` <- ifelse(is.na(filterdata3$`Bike Type`), "Electric", filterdata3$`Bike Type`)
                }
              } 
              
              if(any(is.na(filterdata4))){
                
                filterdata4$mean <- ifelse(is.na(filterdata4$mean), 0, filterdata4$mean)
                filterdata4$`Day of Month` <- ifelse(is.na(filterdata4$`Day of Month`), xInp1, filterdata4$`Day of Month`)
                filterdata4$`Rider Type` <- ifelse(is.na(filterdata4$`Rider Type`), "Member", filterdata4$`Rider Type`)
                if(input$Part2 == "Yes(by Bike-Type)"){
                  filterdata4$`Bike Type` <- ifelse(is.na(filterdata4$`Bike Type`), "Classic", filterdata4$`Bike Type`)
                }
              } 
              
              if(any(is.na(filterdata5))){
                
                filterdata5$mean <- ifelse(is.na(filterdata5$mean), 0, filterdata5$mean)
                filterdata5$`Day of Month` <- ifelse(is.na(filterdata5$`Day of Month`), xInp1, filterdata5$`Day of Month`)
                filterdata5$`Rider Type` <- ifelse(is.na(filterdata5$`Rider Type`), "Member", filterdata5$`Rider Type`)
                if(input$Part2 == "Yes(by Bike-Type)"){
                  filterdata5$`Bike Type` <- ifelse(is.na(filterdata5$`Bike Type`), "Electric", filterdata5$`Bike Type`)
                }
              }
              
              filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
              
            }
            
            filterdata$`Start Time (in Hrs)` <- factor(filterdata$`Start Time (in Hrs)`, levels = c("0","1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23"))
            
            ch <- reactive({ 
                
                {if(input$Part2 == "Yes(by Bike-Type)"){
                ggplot(filterdata, aes(x = `Start Time (in Hrs)`, y = interaction(`Rider Type`, `Bike Type`), fill = mean))
                } else {
                  ggplot(filterdata, aes(x = `Start Time (in Hrs)`, y = `Rider Type`, fill = mean)) 
                  } } +
                geom_tile(color = "white") +
                scale_fill_gradient(low = "pink", high = "maroon") +
                {if(input$Part2 == "Yes(by Bike-Type)"){
                  labs(x = "Time of the Day", y = "Rider-Bike Combination", fill = "mean")
                } else {
                  labs(x = "Time of the Day", y = "Rider Type", fill = "mean")
                } }  +
                theme_minimal() +
                theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
                labs(title = paste(input$SelYaxis2, " distribution across different time-period for Day:", xInp1))
            }) 
            
            showModal(modalDialog(
              plotOutput("HeatMap"),
              title = paste("Heat Map" )
            ))
            
            output$HeatMap <- renderPlot({ch()})
            
          } else if(input$SelXaxis2 == "Week of Year"){
            
            if(input$Part2 == "Yes(by Bike-Type)"){ 
           
              if(input$SelYaxis2 == "Avg.Ride-Duration"){
                
                q = dp %>% { if(cInp == "1") {filter(. , `Bike Type` == "Docked")} else {.}} %>% 
                  group_by(`Week of Year`, `Rider Type`, `Bike Type`, Weekday) %>% 
                  summarise(mean = mean(`Ride Length`)) %>% ungroup()
                
              } else if(input$SelYaxis2 == "Avg.Ride-Distance"){
                
                q = dp %>% { if(cInp == "1") {filter(. , `Bike Type` == "Docked")} else {.}} %>% 
                  filter(!is.na(`Station Distance`)) %>% 
                  group_by(`Week of Year`, `Rider Type`, `Bike Type`, Weekday) %>% 
                  summarise(mean = mean(`Station Distance`)) %>% ungroup()
                
              } } else{
              
              if(input$SelYaxis2 == "Avg.Ride-Duration"){
                
                q = dp %>%
                  group_by(`Week of Year`, `Rider Type`,Weekday) %>% 
                  summarise(mean = mean(`Ride Length`)) %>% ungroup()
                
              } else if(input$SelYaxis2 == "Avg.Ride-Distance"){
                
                q = dp %>% 
                  filter(!is.na(`Station Distance`)) %>% 
                  group_by(`Week of Year`, `Rider Type`, Weekday) %>% 
                  summarise(mean = mean(`Station Distance`)) %>% ungroup()
                
              } }
              
              Yearlyweek <- q %>% select(`Week of Year`) %>% distinct()
              
              Yearlyweek <- Yearlyweek[order(Yearlyweek$`Week of Year`), ]
              
              weeklevel = unique(Yearlyweek$`Week of Year`)
              
              xInp1 <- weeklevel[xInp + 1]
            
            Weekday <- yearroute %>% select(Weekday) %>% distinct() 
            
            Weekday$Weekday <- factor(Weekday$Weekday, levels = c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"))
            
            Weekday <- Weekday[order(Weekday$Weekday), ]
            
            # weekdays <- c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday")
           
            q_cas_dock <- q %>% filter(`Week of Year` == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Docked")} else {.}}
            q_cas_clas <- q %>% filter(`Week of Year` == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Classic")} else {.}}
            q_cas_elec <- q %>% filter(`Week of Year` == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Electric")} else {.}}
            q_mem_clas <- q %>% filter(`Week of Year` == as.character(xInp1)) %>% filter(`Rider Type` == "Member") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Classic")} else {.}}
            q_mem_elec <- q %>% filter(`Week of Year` == as.character(xInp1)) %>% filter(`Rider Type` == "Member") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Electric")} else {.}}
            
            filterdata1 <- right_join(q_cas_dock, Weekday,   by='Weekday')
            filterdata2 <- right_join(q_cas_clas, Weekday,   by='Weekday')
            filterdata3 <- right_join(q_cas_elec, Weekday,   by='Weekday')
            filterdata4 <- right_join(q_mem_clas, Weekday,   by='Weekday')
            filterdata5 <- right_join(q_mem_elec, Weekday,   by='Weekday')
            
            if(!any(is.na(filterdata1)) && !any(is.na(filterdata2)) && !any(is.na(filterdata3)) && !any(is.na(filterdata4)) && !any(is.na(filterdata5))){
              
              filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
              
            } else {
              
              if(any(is.na(filterdata1))){
                
                filterdata1$mean <- ifelse(is.na(filterdata1$mean), 0, filterdata1$mean)
                filterdata1$`Week of Year` <- ifelse(is.na(filterdata1$`Week of Year`), xInp1, filterdata1$`Week of Year`)
                filterdata1$`Rider Type` <- ifelse(is.na(filterdata1$`Rider Type`), "Casual", filterdata1$`Rider Type`)
                if(input$Part2 == "Yes(by Bike-Type)") {
                filterdata1$`Bike Type` <- ifelse(is.na(filterdata1$`Bike Type`), "Docked", filterdata1$`Bike Type`)
                }
              } 
              
              if(any(is.na(filterdata2))){
                
                filterdata2$mean <- ifelse(is.na(filterdata2$mean), 0, filterdata2$mean)
                filterdata2$`Week of Year` <- ifelse(is.na(filterdata2$`Week of Year`), xInp1, filterdata2$`Week of Year`)
                filterdata2$`Rider Type` <- ifelse(is.na(filterdata2$`Rider Type`), "Casual", filterdata2$`Rider Type`)
                if(input$Part2 == "Yes(by Bike-Type)") {
                  filterdata2$`Bike Type` <- ifelse(is.na(filterdata2$`Bike Type`), "Classic", filterdata2$`Bike Type`)
                }
              } 
              
              if(any(is.na(filterdata3))){
                
                filterdata3$mean <- ifelse(is.na(filterdata3$mean), 0, filterdata3$mean)
                filterdata3$`Week of Year` <- ifelse(is.na(filterdata3$`Week of Year`), xInp1, filterdata3$`Week of Year`)
                filterdata3$`Rider Type` <- ifelse(is.na(filterdata3$`Rider Type`), "Casual", filterdata3$`Rider Type`)
                if(input$Part2 == "Yes(by Bike-Type)") {
                  filterdata3$`Bike Type` <- ifelse(is.na(filterdata3$`Bike Type`), "Electric", filterdata3$`Bike Type`)
                }
              } 
              
              if(any(is.na(filterdata4))){
                
                filterdata4$mean <- ifelse(is.na(filterdata4$mean), 0, filterdata4$mean)
                filterdata4$`Week of Year` <- ifelse(is.na(filterdata4$`Week of Year`), xInp1, filterdata4$`Week of Year`)
                filterdata4$`Rider Type` <- ifelse(is.na(filterdata4$`Rider Type`), "Member", filterdata4$`Rider Type`)
                if(input$Part2 == "Yes(by Bike-Type)") {
                  filterdata4$`Bike Type` <- ifelse(is.na(filterdata4$`Bike Type`), "Classic", filterdata4$`Bike Type`)
                }
              } 
              
              if(any(is.na(filterdata5))){
                
                filterdata5$mean <- ifelse(is.na(filterdata5$mean), 0, filterdata5$mean)
                filterdata5$`Week of Year` <- ifelse(is.na(filterdata5$`Week of Year`), xInp1, filterdata5$`Week of Year`)
                filterdata5$`Rider Type` <- ifelse(is.na(filterdata5$`Rider Type`), "Member", filterdata5$`Rider Type`)
                if(input$Part2 == "Yes(by Bike-Type)") {
                  filterdata5$`Bike Type` <- ifelse(is.na(filterdata5$`Bike Type`), "Electric", filterdata5$`Bike Type`)
                }
              }
              
              filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
            }
            
            filterdata$Weekday <- factor(filterdata$Weekday, levels = unique(Weekday$Weekday))
            
            ch <- reactive({ 
              
                {if(input$Part2 == "Yes(by Bike-Type)") {
                ggplot(filterdata, aes(x = Weekday, y = interaction(`Rider Type`, `Bike Type`), fill = mean))
                } else {
                  ggplot(filterdata, aes(x = Weekday, y = `Rider Type`, fill = mean))
                  }} +
                geom_tile(color = "white") +
                scale_fill_gradient(low = "lightgreen", high = "darkgreen") +
                
                {if(input$Part2 == "Yes(by Bike-Type)") {
                  labs(x = "Weekday", y = "Rider-Bike Combination", fill = "mean")
                } else {
                  labs(x = "Weekday", y = "Rider Type", fill = "mean")
                }} +
                theme_minimal() +
                theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
                labs(title = paste(input$SelYaxis2, " distribution across Weekday for the Week:", xInp1))
            }) 
            
            showModal(modalDialog(
              plotOutput("HeatMap"),
              title = paste("Heat Map" )
            ))
            
            output$HeatMap <- renderPlot({ch()})
            
            
          } else if(input$SelXaxis2 == "Month Name"){
            
          if(input$Part2 == "Yes(by Bike-Type)") {
            
            if(input$SelYaxis2 == "Avg.Ride-Duration"){
              
              q = dp %>%
                group_by(`Month Name`, `Rider Type`, `Bike Type`, `Week of Year`) %>% 
                summarise(mean = mean(`Ride Length`)) %>% ungroup()
              
            } else if(input$SelYaxis2 == "Avg.Ride-Distance"){
              
              q = dp %>% 
                filter(!is.na(`Station Distance`)) %>% 
                group_by(`Month Name`, `Rider Type`, `Bike Type`, `Week of Year`) %>% 
                summarise(mean = mean(`Station Distance`)) %>% ungroup()
            }
          } else {
            
            if(input$SelYaxis2 == "Avg.Ride-Duration"){
              
              q = dp %>%
                group_by(`Month Name`, `Rider Type`, `Week of Year`) %>% 
                summarise(mean = mean(`Ride Length`)) %>% ungroup()
              
            } else if(input$SelYaxis2 == "Avg.Ride-Distance"){
              
              q = dp %>% 
                filter(!is.na(`Station Distance`)) %>% 
                group_by(`Month Name`, `Rider Type`, `Week of Year`) %>% 
                summarise(mean = mean(`Station Distance`)) %>% ungroup()
            } }
            
            Months <- dp %>% select(`Month Name`) %>% distinct() 
            
            if(input$Seldata == "Entire Year"){
              
              Months$`Month Name` <- factor(Months$`Month Name`, levels = c("May 2023", "June 2023", "July 2023", "Aug 2023", "Sep 2023", "Oct 2023", "Nov 2023", "Dec 2023", "Jan 2024", "Feb 2024", "March 2024", "April 2024"))
              
            } else if(input$Seldata == "Spring"){
              
              Months$`Month Name` <- factor(Months$`Month Name`, levels = c("May 2023", "March 2024", "April 2024"))
              
            } else if(input$Seldata == "Summer"){
              
              Months$`Month Name` <- factor(Months$`Month Name`, levels = c("June 2023", "July 2023", "Aug 2023"))
              
            } else if(input$Seldata == "Fall"){
              
              Months$`Month Name` <- factor(Months$`Month Name`, levels = c("Sep 2023", "Oct 2023", "Nov 2023"))
              
            } else if(input$Seldata == "Winter"){
              
              Months$`Month Name` <- factor(Months$`Month Name`, levels = c("Dec 2023", "Jan 2024", "Feb 2024"))
            }
            
            Months <- Months[order(Months$`Month Name`),]
            xInp1 <- Months$`Month Name`[xInp + 1]
            
            q_cas_dock <- q %>% filter(`Month Name` == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Docked")} else {.}}
            q_cas_clas <- q %>% filter(`Month Name` == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Classic")} else {.}}
            q_cas_elec <- q %>% filter(`Month Name` == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Electric")} else {.}}
            q_mem_clas <- q %>% filter(`Month Name` == as.character(xInp1)) %>% filter(`Rider Type` == "Member") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Classic")} else {.}}
            q_mem_elec <- q %>% filter(`Month Name` == as.character(xInp1)) %>% filter(`Rider Type` == "Member") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Electric")} else {.}}
            
            Yearlyweek <- dp %>% filter(`Month Name` == as.character(xInp1)) %>% select(`Week of Year`) %>% distinct()
            
            Yearlyweek <- Yearlyweek[order(Yearlyweek$`Week of Year`), ] 
            
            filterdata1 <- right_join(q_cas_dock, Yearlyweek,   by='Week of Year')
            filterdata2 <- right_join(q_cas_clas, Yearlyweek,   by='Week of Year')
            filterdata3 <- right_join(q_cas_elec, Yearlyweek,   by='Week of Year')
            filterdata4 <- right_join(q_mem_clas, Yearlyweek,   by='Week of Year')
            filterdata5 <- right_join(q_mem_elec, Yearlyweek,   by='Week of Year')
            
            if(!any(is.na(filterdata1)) && !any(is.na(filterdata2)) && !any(is.na(filterdata3)) && !any(is.na(filterdata4)) && !any(is.na(filterdata5))){
              
              filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
              
            } else {
              
              if(any(is.na(filterdata1))){
                
                filterdata1$mean <- ifelse(is.na(filterdata1$mean), 0, filterdata1$mean)
                filterdata1$`Month Name` <- ifelse(is.na(filterdata1$`Month Name`), xInp1, filterdata1$`Month Name`)
                filterdata1$`Rider Type` <- ifelse(is.na(filterdata1$`Rider Type`), "Casual", filterdata1$`Rider Type`)
                if(input$Part2 == "Yes(by Bike-Type)") {
                  filterdata1$`Bike Type` <- ifelse(is.na(filterdata1$`Bike Type`), "Docked", filterdata1$`Bike Type`)
                }
              } 
              
              if(any(is.na(filterdata2))){
                
                filterdata2$mean <- ifelse(is.na(filterdata2$mean), 0, filterdata2$mean)
                filterdata2$`Month Name` <- ifelse(is.na(filterdata2$`Month Name`), xInp1, filterdata2$`Month Name`)
                filterdata2$`Rider Type` <- ifelse(is.na(filterdata2$`Rider Type`), "Casual", filterdata2$`Rider Type`)
                if(input$Part2 == "Yes(by Bike-Type)") {
                  filterdata2$`Bike Type` <- ifelse(is.na(filterdata2$`Bike Type`), "Classic", filterdata2$`Bike Type`)
                }
              } 
              
              if(any(is.na(filterdata3))){
                
                filterdata3$mean <- ifelse(is.na(filterdata3$mean), 0, filterdata3$mean)
                filterdata3$`Month Name` <- ifelse(is.na(filterdata3$`Month Name`), xInp1, filterdata3$`Month Name`)
                filterdata3$`Rider Type` <- ifelse(is.na(filterdata3$`Rider Type`), "Casual", filterdata3$`Rider Type`)
                if(input$Part2 == "Yes(by Bike-Type)") {
                  filterdata3$`Bike Type` <- ifelse(is.na(filterdata3$`Bike Type`), "Electric", filterdata3$`Bike Type`)
                }
              } 
              
              if(any(is.na(filterdata4))){
                
                filterdata4$mean <- ifelse(is.na(filterdata4$mean), 0, filterdata4$mean)
                filterdata4$`Month Name` <- ifelse(is.na(filterdata4$`Month Name`), xInp1, filterdata4$`Month Name`)
                filterdata4$`Rider Type` <- ifelse(is.na(filterdata4$`Rider Type`), "Member", filterdata4$`Rider Type`)
                if(input$Part2 == "Yes(by Bike-Type)") {
                  filterdata4$`Bike Type` <- ifelse(is.na(filterdata4$`Bike Type`), "Classic", filterdata4$`Bike Type`)
                }
              } 
              
              if(any(is.na(filterdata5))){
                
                filterdata5$mean <- ifelse(is.na(filterdata5$mean), 0, filterdata5$mean)
                filterdata5$`Month Name` <- ifelse(is.na(filterdata5$`Month Name`), xInp1, filterdata5$`Month Name`)
                filterdata5$`Rider Type` <- ifelse(is.na(filterdata5$`Rider Type`), "Member", filterdata5$`Rider Type`)
                if(input$Part2 == "Yes(by Bike-Type)") {
                  filterdata5$`Bike Type` <- ifelse(is.na(filterdata5$`Bike Type`), "Electric", filterdata5$`Bike Type`)
                }
              }
              
              filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
            }
            
            filterdata$`Week of Year` <- factor(filterdata$`Week of Year`, levels = unique(Yearlyweek$`Week of Year`))
            
            ch <- reactive({ 
              {if(input$Part2 == "Yes(by Bike-Type)") {
                ggplot(filterdata, aes(x = `Week of Year`, y = interaction(`Rider Type`, `Bike Type`), fill = mean))
              } else {
                ggplot(filterdata, aes(x = `Week of Year`, y = `Rider Type`, fill = mean))
              }} +
                geom_tile(color = "white") +
                scale_fill_gradient(low = "#F9CE87", high = "#F28B1E") +
                {if(input$Part2 == "Yes(by Bike-Type)") {
                  labs(x = "Week of Year", y = "Rider-Bike Combination", fill = "mean")
                } else {
                  labs(x = "Week of Year", y = "Rider Type", fill = "mean")
                }}  +
                theme_minimal() +
                theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
                labs(title = paste(input$SelYaxis2, " distribution across Week of the Year for the Month:", xInp1))
            }) 
            
            showModal(modalDialog(
              plotOutput("HeatMap"),
              title = paste("Heat Map" )
            ))
            
            output$HeatMap <- renderPlot({ch()})
            
            
          } else if(input$SelXaxis2 == "Day of Year"){
            
            if(input$Part2 == "Yes(by Bike-Type)"){
              
              if(input$SelYaxis2 == "Avg.Ride-Duration"){
                
                q = dp %>% { if(cInp == "2" | cInp == "4") {filter(. , `Bike Type` == "Electric")} else if(cInp == "1") {filter(. , `Bike Type` == "Docked")} else {.}} %>% 
                  group_by(`Day of Year`, `Rider Type`, `Bike Type`, `Start Time (in Hrs)`) %>% 
                  summarise(mean = mean(`Ride Length`)) %>% ungroup()
                
              } else if(input$SelYaxis2 == "Avg.Ride-Distance"){
                
                q = dp %>% { if(cInp == "2" | cInp == "4") {filter(. , `Bike Type` == "Electric")} else if(cInp == "1") {filter(. , `Bike Type` == "Docked")} else {.}} %>% 
                  filter(!is.na(`Station Distance`)) %>% 
                  group_by(`Day of Year`, `Rider Type`, `Bike Type`, `Start Time (in Hrs)`) %>% 
                  summarise(mean = mean(`Station Distance`)) %>% ungroup()
                
              } } else {
              
              if(input$SelYaxis2 == "Avg.Ride-Duration"){
                
                q = dp %>%
                  group_by(`Day of Year`, `Rider Type`, `Start Time (in Hrs)`) %>% 
                  summarise(mean = mean(`Ride Length`)) %>% ungroup()
                
              } else if(input$SelYaxis2 == "Avg.Ride-Distance"){
                
                q = dp %>% 
                  filter(!is.na(`Station Distance`)) %>% 
                  group_by(`Day of Year`, `Rider Type`, `Start Time (in Hrs)`) %>% 
                  summarise(mean = mean(`Station Distance`)) %>% ungroup()
              } }
              
              days = q %>% select(`Day of Year`) %>% distinct()
              
              days = days[order(days$`Day of Year`), ]
              
              days$`Day of Year` = factor(days$`Day of Year`, levels = unique(days$`Day of Year`))
              
              xInp1 = days$`Day of Year`[xInp + 1]
          
            q_cas_dock <- q %>% filter(`Day of Year` == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Docked")} else {.}}
            q_cas_clas <- q %>% filter(`Day of Year` == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Classic")} else {.}}
            q_cas_elec <- q %>% filter(`Day of Year` == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Electric")} else {.}}
            q_mem_clas <- q %>% filter(`Day of Year` == as.character(xInp1)) %>% filter(`Rider Type` == "Member") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Classic")} else {.}}
            q_mem_elec <- q %>% filter(`Day of Year` == as.character(xInp1)) %>% filter(`Rider Type` == "Member") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Electric")} else {.}}
            
            filterdata1 <- right_join(q_cas_dock, starttime,   by='Start Time (in Hrs)')
            filterdata2 <- right_join(q_cas_clas, starttime,   by='Start Time (in Hrs)')
            filterdata3 <- right_join(q_cas_elec, starttime,   by='Start Time (in Hrs)')
            filterdata4 <- right_join(q_mem_clas, starttime,   by='Start Time (in Hrs)')
            filterdata5 <- right_join(q_mem_elec, starttime,   by='Start Time (in Hrs)')
            
            if(!any(is.na(filterdata1)) && !any(is.na(filterdata2)) && !any(is.na(filterdata3)) && !any(is.na(filterdata4)) && !any(is.na(filterdata5))){
              
              filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
              
            } else {
             
              if(any(is.na(filterdata1))){
                
                filterdata1$mean <- ifelse(is.na(filterdata1$mean), 0, filterdata1$mean)
                filterdata1$`Day of Year` <- ifelse(is.na(filterdata1$`Day of Year`), xInp1, filterdata1$`Day of Year`)
                filterdata1$`Rider Type` <- ifelse(is.na(filterdata1$`Rider Type`), "Casual", filterdata1$`Rider Type`)
                if(input$Part2 == "Yes(by Bike-Type)") {
                filterdata1$`Bike Type` <- ifelse(is.na(filterdata1$`Bike Type`), "Docked", filterdata1$`Bike Type`)
                }
              } 
              
              if(any(is.na(filterdata2))){
                
                filterdata2$mean <- ifelse(is.na(filterdata2$mean), 0, filterdata2$mean)
                filterdata2$`Day of Year` <- ifelse(is.na(filterdata2$`Day of Year`), xInp1, filterdata2$`Day of Year`)
                filterdata2$`Rider Type` <- ifelse(is.na(filterdata2$`Rider Type`), "Casual", filterdata2$`Rider Type`)
                if(input$Part2 == "Yes(by Bike-Type)") {
                  filterdata2$`Bike Type` <- ifelse(is.na(filterdata2$`Bike Type`), "Classic", filterdata2$`Bike Type`)
                }
              } 
              
              if(any(is.na(filterdata3))){
                
                filterdata3$mean <- ifelse(is.na(filterdata3$mean), 0, filterdata3$mean)
                filterdata3$`Day of Year` <- ifelse(is.na(filterdata3$`Day of Year`), xInp1, filterdata3$`Day of Year`)
                filterdata3$`Rider Type` <- ifelse(is.na(filterdata3$`Rider Type`), "Casual", filterdata3$`Rider Type`)
                if(input$Part2 == "Yes(by Bike-Type)") {
                  filterdata3$`Bike Type` <- ifelse(is.na(filterdata3$`Bike Type`), "Electric", filterdata3$`Bike Type`)
                }
              } 
              
              if(any(is.na(filterdata4))){
                
                filterdata4$mean <- ifelse(is.na(filterdata4$mean), 0, filterdata4$mean)
                filterdata4$`Day of Year` <- ifelse(is.na(filterdata4$`Day of Year`), xInp1, filterdata4$`Day of Year`)
                filterdata4$`Rider Type` <- ifelse(is.na(filterdata4$`Rider Type`), "Member", filterdata4$`Rider Type`)
                if(input$Part2 == "Yes(by Bike-Type)") {
                  filterdata4$`Bike Type` <- ifelse(is.na(filterdata4$`Bike Type`), "Classic", filterdata4$`Bike Type`)
                }
              } 
              
              if(any(is.na(filterdata5))){
                
                filterdata5$mean <- ifelse(is.na(filterdata5$mean), 0, filterdata5$mean)
                filterdata5$`Day of Year` <- ifelse(is.na(filterdata5$`Day of Year`), xInp1, filterdata5$`Day of Year`)
                filterdata5$`Rider Type` <- ifelse(is.na(filterdata5$`Rider Type`), "Member", filterdata5$`Rider Type`)
                if(input$Part2 == "Yes(by Bike-Type)") {
                  filterdata5$`Bike Type` <- ifelse(is.na(filterdata5$`Bike Type`), "Electric", filterdata5$`Bike Type`)
                }
              }
              
              filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
            }
            
            filterdata$`Start Time (in Hrs)` <- factor(filterdata$`Start Time (in Hrs)`, levels = c("0","1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23"))
            
            ch <- reactive({ 
                
              {if(input$Part2 == "Yes(by Bike-Type)") {
                ggplot(filterdata, aes(x = `Start Time (in Hrs)`, y = interaction(`Rider Type`, `Bike Type`), fill = mean))
                } else {
                  ggplot(filterdata, aes(x = `Start Time (in Hrs)`, y = `Rider Type`, fill = mean))
                  }} +
                geom_tile(color = "white") +
                scale_fill_gradient(low = "#6EF0F7", high = "#09B4BD") +
                {if(input$Part2 == "Yes(by Bike-Type)") {
                  labs(x = "Time of the Day", y = "Rider-Bike Combination", fill = "mean")
                } else {
                  labs(x = "Time of the Day", y = "Rider Type", fill = "mean")
                }} +
                theme_minimal() +
                theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
                labs(title = paste(input$SelYaxis2, " distribution across different time-period for Day:", xInp1))
            }) 
            
            showModal(modalDialog(
              plotOutput("HeatMap"),
              title = paste("Heat Map" )
            ))
            
            output$HeatMap <- renderPlot({ch()})
            
          } else if(input$SelXaxis2 == "Season"){
            
            if(input$Part2 == "Yes(by Bike-Type)") {
              
              if(input$SelYaxis2 == "Avg.Ride-Duration"){
                
                q = dp %>% { if(cInp == "1") {filter(. , `Bike Type` == "Docked")} else {.}} %>% 
                  group_by(Season, `Rider Type`, `Bike Type`, `Month Name`) %>%  
                  summarise(mean = mean(`Ride Length`)) %>% ungroup()
                
              } else if(input$SelYaxis2 == "Avg.Ride-Distance"){
                
                q = dp %>% { if(cInp == "1") {filter(. , `Bike Type` == "Docked")} else {.}} %>% 
                  filter(!is.na(`Station Distance`)) %>% 
                  group_by(Season, `Rider Type`, `Bike Type`, `Month Name`) %>%  
                  summarise(mean = mean(`Station Distance`)) %>% ungroup()
              }
            } else{
              
              if(input$SelYaxis2 == "Avg.Ride-Duration"){
                
                q = dp %>% 
                  group_by(Season, `Rider Type`, `Month Name`) %>%  
                  summarise(mean = mean(`Ride Length`)) %>% ungroup()
                
              } else if(input$SelYaxis2 == "Avg.Ride-Distance"){
                
                q = dp %>% 
                  filter(!is.na(`Station Distance`)) %>% 
                  group_by(Season, `Rider Type`, `Month Name`) %>%  
                  summarise(mean = mean(`Station Distance`)) %>% ungroup()
              } }
              
              Seasons <- q %>% select(Season) %>% distinct() 
              Seasons$Season <- factor(Seasons$Season, levels = c("Fall", "Spring", "Summer", "Winter"))
              Seasons <- Seasons[order(Seasons$Season),]
              xInp1 <- Seasons$Season[xInp + 1]
          
            q_cas_dock <- q %>% filter(Season == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Docked")} else {.}}
            q_cas_clas <- q %>% filter(Season == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Classic")} else {.}}
            q_cas_elec <- q %>% filter(Season == as.character(xInp1)) %>% filter(`Rider Type` == "Casual") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Electric")} else {.}}
            q_mem_clas <- q %>% filter(Season == as.character(xInp1)) %>% filter(`Rider Type` == "Member") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Classic")} else {.}}
            q_mem_elec <- q %>% filter(Season == as.character(xInp1)) %>% filter(`Rider Type` == "Member") %>% { if(input$Part2 == "Yes(by Bike-Type)") {filter(. , `Bike Type` == "Electric")} else {.}}
            
            Months <- dp %>% filter(Season == as.character(xInp1)) %>% select(`Month Name`) %>% distinct() 
            
            if(xInp1 == "Spring"){
              
              Months$`Month Name` <- factor(Months$`Month Name`, levels = c("May 2023", "March 2024", "April 2024"))
              
            } else if(xInp1 == "Summer"){
              
              Months$`Month Name` <- factor(Months$`Month Name`, levels = c("June 2023", "July 2023", "Aug 2023"))
              
            } else if(xInp1 == "Fall"){
              
              Months$`Month Name` <- factor(Months$`Month Name`, levels = c("Sep 2023", "Oct 2023", "Nov 2023"))
              
            } else if(xInp1 == "Winter"){
              
              Months$`Month Name` <- factor(Months$`Month Name`, levels = c("Dec 2023", "Jan 2024", "Feb 2024"))
              
            }
            
            Months <- Months[order(Months$`Month Name`),]
           
            filterdata1 <- right_join(q_cas_dock, Months,   by='Month Name')
            filterdata2 <- right_join(q_cas_clas, Months,   by='Month Name')
            filterdata3 <- right_join(q_cas_elec, Months,   by='Month Name')
            filterdata4 <- right_join(q_mem_clas, Months,   by='Month Name')
            filterdata5 <- right_join(q_mem_elec, Months,   by='Month Name')
            
            if(!any(is.na(filterdata1)) && !any(is.na(filterdata2)) && !any(is.na(filterdata3)) && !any(is.na(filterdata4)) && !any(is.na(filterdata5))){
              
              filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
              
            } else {
              
              
              if(any(is.na(filterdata1))){
                
                filterdata1$mean <- ifelse(is.na(filterdata1$mean), 0, filterdata1$mean)
                filterdata1$Season <- ifelse(is.na(filterdata1$Season), xInp1, filterdata1$Season)
                filterdata1$`Rider Type` <- ifelse(is.na(filterdata1$`Rider Type`), "Casual", filterdata1$`Rider Type`)
                if(input$Part2 == "Yes(by Bike-Type)") {
                filterdata1$`Bike Type` <- ifelse(is.na(filterdata1$`Bike Type`), "Docked", filterdata1$`Bike Type`)
                }
              } 
              
              if(any(is.na(filterdata2))){
                
                filterdata2$mean <- ifelse(is.na(filterdata2$mean), 0, filterdata2$mean)
                filterdata2$Season <- ifelse(is.na(filterdata2$Season), xInp1, filterdata2$Season)
                filterdata2$`Rider Type` <- ifelse(is.na(filterdata2$`Rider Type`), "Casual", filterdata2$`Rider Type`)
                if(input$Part2 == "Yes(by Bike-Type)") {
                  filterdata2$`Bike Type` <- ifelse(is.na(filterdata2$`Bike Type`), "Classic", filterdata2$`Bike Type`)
                }
              } 
              
              if(any(is.na(filterdata3))){
                
                filterdata3$mean <- ifelse(is.na(filterdata3$mean), 0, filterdata3$mean)
                filterdata3$Season <- ifelse(is.na(filterdata3$Season), xInp1, filterdata3$Season)
                filterdata3$`Rider Type` <- ifelse(is.na(filterdata3$`Rider Type`), "Casual", filterdata3$`Rider Type`)
                if(input$Part2 == "Yes(by Bike-Type)") {
                  filterdata3$`Bike Type` <- ifelse(is.na(filterdata3$`Bike Type`), "Electric", filterdata3$`Bike Type`)
                }
              } 
              
              if(any(is.na(filterdata4))){
                
                filterdata4$mean <- ifelse(is.na(filterdata4$mean), 0, filterdata4$mean)
                filterdata4$Season <- ifelse(is.na(filterdata4$Season), xInp1, filterdata4$Season)
                filterdata4$`Rider Type` <- ifelse(is.na(filterdata4$`Rider Type`), "Member", filterdata4$`Rider Type`)
                if(input$Part2 == "Yes(by Bike-Type)") {
                  filterdata4$`Bike Type` <- ifelse(is.na(filterdata4$`Bike Type`), "Classic", filterdata4$`Bike Type`)
                }
              } 
              
              if(any(is.na(filterdata5))){
                
                filterdata5$mean <- ifelse(is.na(filterdata5$mean), 0, filterdata5$mean)
                filterdata5$Season <- ifelse(is.na(filterdata5$Season), xInp1, filterdata5$Season)
                filterdata5$`Rider Type` <- ifelse(is.na(filterdata5$`Rider Type`), "Member", filterdata5$`Rider Type`)
                if(input$Part2 == "Yes(by Bike-Type)") {
                  filterdata5$`Bike Type` <- ifelse(is.na(filterdata5$`Bike Type`), "Electric", filterdata5$`Bike Type`)
                }
              }
              
              filterdata <- rbind(filterdata1, filterdata2, filterdata3, filterdata4, filterdata5)
            }
            
            filterdata$`Month Name` <- factor(filterdata$`Month Name`, levels = unique(Months$`Month Name`))
            
            ch <- reactive({ 
              
                {if(input$Part2 == "Yes(by Bike-Type)") {
                ggplot(filterdata, aes(x = `Month Name`, y = interaction(`Rider Type`, `Bike Type`), fill = mean))
                } else {
                  ggplot(filterdata, aes(x = `Month Name`, y = `Rider Type`, fill = mean)) 
                  }} +
                geom_tile(color = "white") +
                scale_fill_gradient(low = "#BD8DEC", high = "#7906EC") +
                {if(input$Part2 == "Yes(by Bike-Type)") {
                  labs(x = "Months", y = "Rider-Bike Combination", fill = "mean") 
                } else {
                  labs(x = "Months", y = "Rider Type", fill = "mean")  
                }} +
                theme_minimal() +
                theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
                labs(title = paste(input$SelYaxis2, " distribution across Months for the Season:", xInp1))
            }) 
            
            showModal(modalDialog(
              plotOutput("HeatMap"),
              title = paste("Heat Map" )
            ))
            
            output$HeatMap <- renderPlot({ch()})
          }
        } 

7.Generation of Choropleth Maps

Choropleth Maps are useful in identifying the regional density distribution. For current Case-study, a dot density map has been plotted on the top of the map of Chicago thus displaying the region where the density is the highest.

output$map_plot <- renderPlot({
      
      if(input$Data4 == "Yes"){
        
        output$dataM <- DT::renderDataTable({
          
          DT::datatable(
            
            mytable %>% filter(`Type of Rider` == paste(input$SelRid)) %>% 
              filter(`Bike Type` == paste(input$SelBik)) %>% 
              mutate(Perc_n = (n/sum(n))*100) %>% filter(Perc_n > "0.15") %>% 
              mutate_if(is.numeric, round, digits = 3),
            
            filter = 'top', extensions = c('Buttons', 'Scroller'),
            options = list(scrollY = 375,
                           scrollX = 500,
                           deferRender = TRUE,
                           scroller = TRUE,
                           # paging = TRUE,
                           # pageLength = 25,
                           buttons = list(list(extend = 'colvis', targets = 0, visible = FALSE)),
                           dom = 'lBfrtip',
                           fixedColumns = TRUE), 
            rownames = FALSE)
        })  
        
      } else {
        
        output$dataM <- DT::renderDataTable({
          
          DT::datatable(
            NULL
          )
        })
      }
   
      shp_1 <- read_sf("Boundaries - Community Areas (current)")
 
      if(input$SelRid == "casual" & input$SelBik == "classic_bike"){
        
        shp <- read_sf("mytable_casual_classic")
        
        interval <- c(0, 0.005, 0.01, 0.015, 0.02, 0.025, 0.03, 0.035, 0.04, 0.045, 0.05, 0.055, 0.06, 0.065, 0.07, 0.075, 0.08, 0.085, 0.09, 0.095, 0.1, 0.105, 0.110, 0.115, 0.12, 0.125, 0.13, 0.135, 0.14, 0.145, 0.15, 0.155, 0.16, 0.165, 0.17, Inf )
        
        l <- c("0.000-0.010", "0.010-0.020", "0.020-0.030", 
               "0.030-0.040", "0.040-0.050", "0.050-0.060", 
               "0.060-0.070", "0.070-0.080", "0.080-0.090", 
               "0.090-0.100", "0.100-0.110", "0.110-0.120", 
               "0.120-0.130", "0.130-0.140", "0.140-0.150", 
               "0.150-0.160", "0.160-0.170", "> 0.170" )
        
        m = 0.17
 
      } else if(input$SelRid == "casual" & input$SelBik == "electric_bike"){
  
        shp <- read_sf("mytable_casual_electric")
        
        interval <- c(0, 0.005, 0.01, 0.015, 0.02, 0.025, 0.03, 0.035, 0.04, 0.045, 0.05, 0.055, 0.06, 0.065, 0.07, 0.075, 0.08, 0.085, 0.09, 0.095, 0.1, 0.105, 0.110, 0.115, 0.12, 0.125, 0.13, 0.135, 0.14, 0.145, 0.15, Inf )
        
        l <- c("0.000-0.010", "0.010-0.020", "0.020-0.030", 
               "0.030-0.040", "0.040-0.050", "0.050-0.060", 
               "0.060-0.070", "0.070-0.080", "0.080-0.090", 
               "0.090-0.100", "0.100-0.110", "0.110-0.120", 
               "0.120-0.130", "0.130-0.140", "0.140-0.150","> 0.150")
        
        m = 0.15
     
      } else if(input$SelRid == "member" & input$SelBik == "classic_bike"){
 
        shp <- read_sf("mytable_member_classic")
        
        interval <- c(0, 0.005, 0.01, 0.015, 0.02, 0.025, 0.03, 0.035, 0.04, 0.045, 0.05, 0.055, 0.06, 0.065, 0.07, 0.075, 0.08, 0.085, 0.09, 0.095, 0.1, 0.105, 0.110, 0.115, 0.12, 0.125, 0.13, 0.135, 0.14, 0.145, 0.15, 0.155, 0.16, 0.165, 0.17, 0.175, 0.18, 0.185, 0.19, 0.195, 0.200, 0.205, 0.210, Inf )
        
        l <- c("0.000-0.010", "0.010-0.020", "0.020-0.030", 
               "0.030-0.040", "0.040-0.050", "0.050-0.060", 
               "0.060-0.070", "0.070-0.080", "0.080-0.090", 
               "0.090-0.100", "0.100-0.110", "0.110-0.120", 
               "0.120-0.130", "0.130-0.140", "0.140-0.150", 
               "0.150-0.160", "0.160-0.170", "0.170-0.180", 
               "0.180-0.190", "0.190-0.200", "0.200-0.210", "> 0.210" )
        
        m = 0.21
        
      } else {
        
        shp <- read_sf("mytable_member_electric")
        
        interval <- c(0, 0.005, 0.01, 0.015, 0.02, 0.025, 0.03, 0.035, 0.04, 0.045, 0.05, 0.055, 0.06, 0.065, 0.07, 0.075, 0.08, 0.085, 0.09, 0.095, 0.1, 0.105, 0.110, 0.115, 0.12, 0.125, 0.13, 0.135, 0.14, 0.145, 0.15, 0.155, 0.16, 0.165, 0.17, Inf )
        
        l <- c("0.000-0.010", "0.010-0.020", "0.020-0.030", 
               "0.030-0.040", "0.040-0.050", "0.050-0.060", 
               "0.060-0.070", "0.070-0.080", "0.080-0.090", 
               "0.090-0.100", "0.100-0.110", "0.110-0.120", 
               "0.120-0.130", "0.130-0.140", "0.140-0.150", 
               "0.150-0.160", "0.160-0.170", "> 0.170" )
        m = 0.17
      }
  
      d <- mytable %>% filter(`Type of Rider` == paste(input$SelRid)) %>% filter(`Bike Type` == paste(input$SelBik)) %>% mutate(Perc_n = (n/sum(n))*100)
      
      n_interval <- length(interval) - 1
 
      incidence_interval_1 <- classIntervals(d$Perc_n,
                                             n = n_interval,
                                             style = "fixed",
                                             fixedBreaks = interval,
                                             yearrouterecision = 3)
      
      colors <- colorRampPalette(brewer.pal(9, "Oranges"))(n_interval)
      
      colcod <- findColours(incidence_interval_1, colors)
      
      p <- ggplot(shp_1) +
        geom_sf(color = "black", fill = "khaki3") + 
        geom_sf(shp, mapping = aes(fill = Perc_n), color = colcod) +
        coord_sf(xlim = c(-88, -87.52), ylim = c(41.6, 42.07)) +
        ggthemes::theme_map() +
        theme(legend.position = "none") +
        theme(panel.background = element_rect(fill = "aliceblue", color = "white"),
              axis.ticks = element_blank(),
              axis.text = element_blank()) +
        # panel.grid = element_line(color = "white", linewidth = 0.8)) +
        labs(title = "Chicago City Map", 
             subtitle = paste("Heatmap showing percen. distribution of", input$SelRid,"-", input$SelBik,"riders")) 
      p
      
      # Function to plot color bar
      color.bar <- function(lut, min = 0, max = m, nticks = n_interval, ticks = seq(min, max, len = nticks/2), title='Dist. % per Dot') {
        scale = (max-min)/(length(lut)-1)
        plot(c(0,5), c(min,max), type='n', bty='n', xaxt='n', xlab='', yaxt='n', ylab='', main=title , adj = 0, line = -0.02, cex.main = 0.75)
        axis(2, at = ticks, labels = l, cex.axis = 0.6, las=1)
        
        for (i in 0:(length(lut)-1)) {
          y = i*scale + min
          rect(0,y,1.5,(length(lut)-1)*scale , col=lut[i], border= "black")
        }
      } 
      
      color.bar(colorRampPalette(brewer.pal(9, "Oranges"))(n_interval))
      
      # Combining the ggplot with colorbar using gridExtra from Hmisc package.

      p2 <- as.grob(p)
      r <- as.grob(function() color.bar(colorRampPalette(brewer.pal(9, "Oranges"))(n_interval)))
      
      grid.newpage()
      grid.draw(p2)
      vp = viewport(x=0.8, y=.45, width=.3, height=1.2)
      pushViewport(vp)
      grid.draw(r)
      upViewport()
    })

Deploying the Dashboard on ShinyApps.io

For deploying the web-app, first we need to combine the UI and server code in one file called “app.R”.

Include the below code before UI

library(tidyverse)
library(readr)
library(dplyr)
library(lubridate)
library(shiny)
library(shinydashboard)
library(fontawesome)
library(plotly)
library(DT)
library(janitor)
library(devtools)
library(urbnmapr)
library(tibble)
library(sf)
library(rvest)
library(viridis)
library(ggrepel)
library(ggthemes)
library(hms)
library(RColorBrewer)
library(classInt)
library(dots)
library(cartography)
library(mapsf)
library(scales)
library(ggpubr)
library(Hmisc)
library(gridExtra)
library("grid")
library("ggplotify")
library(effects)
library(ggplot2)
library(ggtext)
library(maps)
library(shinycssloaders)
library(plotly)
library(rlang)
library(ggcorrplot)
library(gghighlight)
library(webr)
library(grid)
library(reshape)
library(conflicted)

conflicts_prefer(DT::dataTableOutput)
conflicts_prefer(dplyr::filter)
conflicts_prefer(plotly::layout)
conflicts_prefer(dplyr::rename)

yearroute <- read_csv("yearroute.txt")
Station <- read_csv("Station.txt")
starttime <- read_csv("starttime.txt")
rawdata_sample <- read_csv("rawdata_sample.txt")
mytable <- read_csv("mytable.txt")
input_list <- read.csv("Inputs.csv")

c <- c("May 2023", "June 2023", "July 2023", "Aug 2023", "Sep 2023", "Oct 2023", "Nov 2023", "Dec 2023", "Jan 2024", "Feb 2024", "March 2024", "April 2024")

Include the below code after the Server

shinyApp(ui = ui, server = server)

Some points to be noted:

  1. Include all your media files in a “www” folder and keep that folder inside the folder where the file app.R is present.

  2. If there is a reference to any external file other than app.R, then give the path of that file relative to app.R file and not the absolute path (i.e. C:/Users/…..).

  3. If there is OOM (Out of Memory) error given during deployment then we can do the following:

  • Increase the Instance size to maximum in the settings option
  • Try to reduce your dataset by sampling or compressing it.
  • We can even break our dataset into small chunks so that only those chunk get loaded when they are required.

Some features of my Case-Study analysis are as follows:

  • Customised Analysis: With the various UI Controls available, we just need to input the parameters and a graph customised as per our requirement gets ready in no time!

  • Automation of Analysis: This is the main advantage of using R shiny i.e. for automating the analytical process. With the help of automation in Shiny, I am able to create 240 barcharts and 5022 linked charts for monthly analysis, 216 barcharts and 7318 linked charts for yearly analysis and 4 graphs for choropleth maps which amounts to 12800 graphs. Making 12800 graphs manually would really take a lot of time and effort. But with Shiny, it really becomes a lot more easier, time-saving and also it does not cluster up the space with graphs.

  • Complying with Stakeholder’s Needs: Many times, Stakeholders do prefer referring to the Data-table for which a separate option for viewing Data is provided as well.

  • Linked Charts: Linking of Pie Charts and Heatmaps to bars for indepth visualization.

  • Highlights: Spotlighting important numbers for ease of identification.

  • Attractive UI: Use of various icons for an attractive UI.

Findings and Recommendations

The findings and recommendations are as given below:

Findings

Recommendations

  1. People who would purchase Cyclistic Membership could be given some free rides annually to places like Dusable Lake Shore Drive and Streeter Dr where multiple parks are located.

  2. Small groups of Riders can be formed by Cyclistic which would prefer going for early morning rides and also in the evening (around 17.00 hrs) in Spring and Summer covering several places in Chicago. Advertisements of such groups should be done focussed towards the residents in the region of Streeter Dr and Grand Ave and other top 25 such stations which can attract a lot of people who may join the groups to expand their Networks. Marketing team can provide Seasonal memberships at discounted prices to these groups.

  3. Since, the Casual riders are more during Weekends, Marathons could be organised on Weekends. Particiants could be provided with discounted memberships.

Conclusion

With this case-study, I really got to develop a lot of skills pertaining to programming in R studio and getting comfortable with Shiny webapp as well. Throughout this Case-Study analysis, I was able to get my hands on various tidyverse packages such as tidyr, dplyr, stringr, ggplot2 and also beyond that such as lubridate, shinydashboard, viridis, DT, Hmisc, gridextra, urbanmapr and many more!

I would love to thank the entire team at Google for such a wonderful course. Besides this, I would also like to give credit to the youtube playlist of Abhinav Agrawal Sir, Data Science Tutorials by Abhishek Agarwal Sir and the entire community on Stack-overflow for helping me out with their informative articles!

References

  1. Abhinav Agrawal Sir’s Youtube Videos
  2. Abhishek Agrawal Sir’s Youtube Videos
  3. Linked Charts on Clicking
  4. Tags used in Shiny
  5. Shiny and Beyond: Mastering Interactive Web Applications with R and Appsilon Packages
  6. Choropleth Maps in R
  7. Customised Colorbar for Maps in R
  8. Group_by in Shiny
  9. Conditional Evaluation using pipe operator
  10. Pie-Donut Chart in R
  11. Pie Charts in R
  12. Heatmaps in R
  13. Deploying on Shiny Apps
  14. R Markdown for Data Analysis