Google Data Analytics Professional Certificate capstone case study track one Cyclistic bike-share analysis

Prepare and Process

Import data

  1. Made copies of all data files
  2. Imported data into PostgreSQL. (The BigQuery Sandbox account does not accept large files).
    1. First check files in a text file, to quickly find column names and detect correct data type to import correctly, and to change file names to more easily reflect contents (e.g, 202110-divvy-tripdata to Oct_2021).
    1. Created database: Cyclistics_project, schema: fiscal_year_2021_2022, and tables: Oct_2021, Nov_2021, Dec_2021, Jan_2022, Feb_2022, Mar_2022, Apr_2022, May_2022, June_2022, July_2022, Aug_2022, and Sept_2021.
CREATE TABLE fiscal_year_2021_2022."Oct_2021"
(ride_id VARCHAR,
    rideable_type VARCHAR,
    started_at timestamp without time zone,
    ended_at timestamp without time zone,
    start_station_name VARCHAR,
    start_station_id VARCHAR,
    end_station_name VARCHAR,
    end_station_id VARCHAR,
    start_lat double precision,
    start_lng double precision,
    end_lat double precision,
    end_lng double precision,
    member_casual VARCHAR);
    1. Checked each file for proper import by column names and column and row count (not including header):
      Oct_2021 631226 13 Nov_2021 359978 13 Dec_2021 247540 13 Jan_2022 103770 13 Feb_2022 115609 13 Mar_2022 284042 13 Apr_2022 371249 13 May_2022 634858 13 Jun_2022 769204 13 July_2022 823488 13 Aug_2022 785932 13 Sept_2022 701339 13
  1. Examined the content and layout of Oct_2021 to get a feel for the data
SELECT * FROM fiscal_year_2021_2022."Oct_2021"
LIMIT 100
    1. Looked at variables for ‘rideable_type’ and ‘member_casual’
SELECT
    DISTINCT rideable_type
FROM fiscal_year_2021_2022."Oct_2021";
“rideable_type”, “classic_bike”,“docked_bike” “electric_bike”
And
“member_casual”, “casual”, “member”
    1. Looked at number and names of stations in ‘end_station_name’
    1. Return is 791 names, some of which are maintenance, temp, and vaccination sites

Process (wrangling, cleaning, and transformation)

  1. Combined files into quarters and then into fiscal year file named: cyclistic_2021_2022
    1. E.g., Oct_2021, Nov_2021, and Dec_2021 into 1Q
INSERT INTO fiscal_year_2021_2022."1Q"
SELECT * FROM fiscal_year_2021_2022."Oct_2021";
INSERT INTO fiscal_year_2021_2022."1Q"
SELECT * FROM fiscal_year_2021_2022."Nov_2021";
INSERT INTO fiscal_year_2021_2022."1Q"
SELECT * FROM fiscal_year_2021_2022."Dec_2021"
ON CONFLICT DO NOTHING;
      1. fiscal_year_2021_2022.”1Q” Total rows: 1238744
      1. fiscal_year_2021_2022.”2Q” Total rows: 503421
      1. fiscal_year_2021_2022.”3Q” Total rows: 1775311
      1. fiscal_year_2021_2022.”4Q” Total rows: 2310759
    1. Created fiscal year file named: total_rides
CREATE TABLE fiscal_year_2021_2022."total_rides" AS
(SELECT * FROM fiscal_year_2021_2022."1Q"
UNION ALL
SELECT * FROM fiscal_year_2021_2022."2Q"
UNION ALL
SELECT * FROM fiscal_year_2021_2022."3Q"
UNION ALL
SELECT * FROM fiscal_year_2021_2022."4Q");
      1. fiscal_year_2021_2022.”total_tides” 5828235
      1. Checked row and column count: 1238744 + 503421 + 1775311 + 2310759 = 5828235 and 13 columns
  1. Checked for NULL values

    Example:

SELECT ride_id, rideable_type, started_at, ended_at, start_station_name, start_station_id, end_station_name, end_station_id, start_lat, start_lng, end_lat, end_lng, member_casual
FROM fiscal_year_2021_2022.total_rides
WHERE member_casual IS NULL;
    1. Return:
    ride_id IS NULL
    rideable_type IS NULL
    started_at IS NULL
    ended_at  IS NULL
    start_station_name = 895032
    start_station_id = 895032
    end_station_name  = 958227
    end_station_id  = 958227
    start_lat IS NULL
    start_lng IS NULL
    end_lat is 5844
    end_lng is 5844
    member_casual IS NULL
    1. (895032 x 2) + (958227 x 2) + (5844 x 2)= ((895032 * 2) + (958227 * 2)) + (5844 * 2) = 3718206
  1. Removed null values (and created new draft)
CREATE TABLE fiscal_year_2021_2022."total_rides_V2" AS
(SELECT *
FROM fiscal_year_2021_2022."total_rides"
WHERE start_station_name NOT LIKE '%NULL%'
AND start_station_id NOT LIKE '%NULL%'
AND end_station_name NOT LIKE '%NULL%'
AND end_station_id NOT LIKE '%NULL%'
AND end_lat NOT LIKE '%NULL%'
AND end_lng NOT LIKE '%NULL%');
    1. 4474141 columns remain; 5828235 - 4474141 = 1354094 were removed.
  1. Checked for duplicate ride_id values: 0
SELECT ride_id FROM fiscal_year_2021_2022."total_rides_V2"
    1. 4474141 variables, so no duplicates
  1. Checked for leading or trailing spaces from start_station_name and end_station_name

    Example:

SELECT *
FROM fiscal_year_2021_2022."total_rides_V2"
WHERE start_station_name LIKE ' %' or start_station_name LIKE '% ';
    1. start_station_name return: 71
    1. start_station_name return: 84
  1. Updated start_station_name and end_station_name Example:
UPDATE fiscal_year_2021_2022."total_rides_V2"
SET  start_station_name = TRIM(start_station_name);
    1. Check = 0
    1. Check = 0
  1. Checked for uniformity of character length in ride_id
    1. Check length of string:
SELECT LENGTH(ride_id)
FROM fiscal_year_2021_2022."total_rides_V2";
    1. Return = 16 characters long
    1. Check uniformity:
SELECT ride_id
FROM fiscal_year_2021_2022."total_rides_V2"
WHERE LENGTH(ride_id) <> 16
    1. Return = 0
  1. Found and removed stations warehouse, repair, and charging from columns start_station_name and end_station_name:
    1. Warehouse
      1. Base - 2132 W Hubbard = 890/127
      1. Base - 2132 W Hubbard Warehouse = 317/134
      1. Hastings WH 2 = 2/2
    1. Mobile stations
      1. DIVVY CASSETTE REPAIR MOBILE STATION = 0/6
      1. Throop/Hastings Mobile Station = 1/1
  1. Charging stations
      1. Bissell St & Armitage Ave - Charging = 18/20
      1. Lincoln Ave & Roscoe St - Charging = 3/3
      1. Pawel Bialowas - Test- PBSC charging station = 1/1
      1. Wilton Ave & Diversey Pkwy - Charging = 17/1
CREATE TABLE fiscal_year_2021_2022."total_rides_V3" AS
(SELECT *
FROM fiscal_year_2021_2022."total_rides_V2"
WHERE start_station_name NOT LIKE '%Base - 2132 W Hubbard%'
AND start_station_name NOT LIKE '%Base - 2132 W Hubbard Warehouse%'
AND start_station_name NOT LIKE '%Hastings WH 2%'
AND start_station_name NOT LIKE '%DIVVY CASSETTE REPAIR MOBILE STATION%'
AND start_station_name NOT LIKE '%Throop/Hastings Mobile Station%'
AND start_station_name NOT LIKE '%Bissell St & Armitage Ave - Charging%'
AND start_station_name NOT LIKE '%Lincoln Ave & Roscoe St - Charging%'
AND start_station_name NOT LIKE '%Pawel Bialowas - Test- PBSC charging station%'
AND start_station_name NOT LIKE '%Wilton Ave & Diversey Pkwy - Charging%');
    1. Check: row count: 4472892
    1. Removed same stations from column end_station_name:
UPDATE fiscal_year_2021_2022."total_rides_V3"
SET end_station_name =
WHERE end_station_name NOT LIKE '%Base - 2132 W Hubbard%'
AND end_station_name NOT LIKE '%Base - 2132 W Hubbard Warehouse%'
AND end_station_name NOT LIKE '%Hastings WH 2%'
AND end_station_name NOT LIKE '%DIVVY CASSETTE REPAIR MOBILE STATION%'
AND end_station_name NOT LIKE '%Throop/Hastings Mobile Station%'
AND end_station_name NOT LIKE '%Bissell St & Armitage Ave - Charging%'
AND end_station_name NOT LIKE '%Lincoln Ave & Roscoe St - Charging%'
AND end_station_name NOT LIKE '%Pawel Bialowas - Test- PBSC charging station%'
AND end_station_name NOT LIKE '%Wilton Ave & Diversey Pkwy - Charging%';
    1. Check: row count: 4472599
    1. Investigated how many stations with Temp in the name would need to be deleted:
SELECT COUNT(*) AS num_of_rows_to_delete
 FROM fiscal_year_2021_2022."total_rides_V4"
 WHERE start_station_name LIKE '%Temp%';
      1. Return: 30446 rows
      1. I did not remove these.
  1. Standardized column data-type and labels:
    1. Did not need to retype/cast data (see above)
    1. Relabelled columns and check:

Example:

ALTER TABLE fiscal_year_2021_2022."total_rides_V3"
RENAME COLUMN ride_id to trip_id 
      1. ride_id > trip_id
      1. rideable_type > bike_type
      1. member_casual > user_type
  1. Created new columns: start_date, start_time: Example:
ALTER TABLE fiscal_year_2021_2022."total_rides_V4" ADD COLUMN start_date date;
ALTER TABLE fiscal_year_2021_2022."total_rides_V4" ADD COLUMN start_time time;
UPDATE fiscal_year_2021_2022."total_rides_V4"
SET start_date = started_at :: date,
    start_time = started_at :: time;
    1. Check that columns were created
  1. Created new columns: end_date, end_time:
    1. Check that columns were created
  1. Created columns: month and day
ALTER TABLE fiscal_year_2021_2022."total_rides_V5" ADD COLUMN month VARCHAR;
UPDATE fiscal_year_2021_2022."total_rides_V4"
SET month = TO_CHAR (start_date, 'Month');
ALTER TABLE fiscal_year_2021_2022."total_rides_V4" ADD COLUMN day VARCHAR;
UPDATE fiscal_year_2021_2022."total_rides_V4"
SET day = TO_CHAR (start_date, 'Day');
    1. Check that columns were created
  1. Created new column trip_duration for trip in seconds:
ALTER TABLE fiscal_year_2021_2022."total_rides_V4" ADD COLUMN trip_duration INTEGER;
UPDATE fiscal_year_2021_2022."total_rides_V4"
SET trip_duration = EXTRACT(EPOCH FROM (ended_at - started_at));
  1. Checked trip_duration column for outliers
SELECT
MIN(trip_duration),
MAX(trip_duration)
FROM fiscal_year_2021_2022."total_rides_V4";
    1. Noted that trip_duration has trips under 60 seconds long (-7621 seconds).
    1. Noted that trip_duration has trips over 86400 seconds long (over 24hrs [2442301 seconds]) as they are likely stolen.
  1. Checked how many rows have outliers
SELECT *
FROM fiscal_year_2021_2022."total_rides_V4"
WHERE trip_duration <= 60 or trip_duration >= 86400;
    1. Return: 74712 rows
  1. Removed trip_duration outliers (create new draft of data frame):
CREATE TABLE fiscal_year_2021_2022."total_rides_CLEAN" AS
SELECT *
FROM fiscal_year_2021_2022."total_rides_V4"
WHERE trip_duration > 60 AND trip_duration < 86400
ORDER BY trip_duration DESC;
    1. Return: 4397887 (4472599-74712= 4397887)
  1. Rechecked for outliers:
    1. Return: 0
  1. Exported clean CSV file: total_rides_2021-2022_CLEAN

Aggregate and Analyze

Counted rides by user type and percentage of total

  1. Divided total rides by user type:
SELECT user_type, COUNT(*) AS number_of_rides
FROM fiscal_year_2021_2022."total_rides_CLEAN"
GROUP BY user_type
ORDER BY user_type DESC
    1. Return:
    user_type   total
    "member"        2618743
    "casual"        1779144
  1. Calculated percentage of rides by user:
SELECT user_type,
     COUNT(user_type) AS total,
     ROUND (COUNT(user_type) * 100.0 /
(SELECT COUNT(*)
     FROM fiscal_year_2021_2022."total_rides_CLEAN")) AS percent
FROM fiscal_year_2021_2022."total_rides_CLEAN"
GROUP BY user_type
    1. Return:
    user_type   total       percent
    "casual"    1779144     40
    "member"        2618743     60
  1. Totaled monthly ride count and percentage of monthly rides by user type:
SELECT month, user_type,
        COUNT(*) AS total,
ROUND(COUNT(*) * 100.0 / (SELECT COUNT(*) 
FROM fiscal_year_2021_2022."total_rides_CLEAN")) AS percent
FROM fiscal_year_2021_2022."total_rides_CLEAN"
GROUP BY month, user_type
ORDER BY month
    1. Return: months are out of order
  1. Example:

    "month"     "user_type"    "total"    "percent"
    "April"       "casual"      90747       2
    "April"     "member"        177666      4
    "August"        "casual"        265563      6
    "August"        "member"        328365      7
    "December"      "casual"        44644       1
    "December"      "member"        129282      3
    1. Put months in order
SELECT month, user_type,
     COUNT(*) AS total,
ROUND(COUNT(*) * 100.0 / (SELECT COUNT(*)
FROM fiscal_year_2021_2022."total_rides_CLEAN")) AS percent
FROM fiscal_year_2021_2022."total_rides_CLEAN"
GROUP BY month, user_type
ORDER BY CASE WHEN month='January' THEN 1
           WHEN month='February' THEN 2
           WHEN month='March' THEN 3
            WHEN month='April' THEN 4
           WHEN month='May' THEN 5
           WHEN month='June' THEN 6
            WHEN month='July' THEN 7
           WHEN month='August' THEN 8
            WHEN month='September' THEN 9
           WHEN month='October' THEN 10
           WHEN month='November' THEN 11
            ELSE 12
            END;
      1. Return: this did nothing
    1. Checked length of string for month and day:
SELECT LENGTH (month)
FROM fiscal_year_2021_2022."total_rides_CLEAN"
GROUP BY month;
      1. Return: all months have 9 characters
    1. Trimmed newly created month and day columns
UPDATE fiscal_year_2021_2022."total_rides_CLEAN"
    SET month = TRIM(month), day = TRIM (day);
      1. Return for month (days also returned trimmed):
  "length"
    5
    6
    8
    8
    7
    4
    4
    5
    3
    8
    7
    9
    1. Did the same for day column
    1. Reran original query (3b):
      1. Returned with months in order:
        "month"       "user_type"      "total"  "percent"
        "January"       "member"        66554       2
        "January"       "casual"        12461       0
        "February"      "casual"        14950       0
        "February"      "member"        72656       2
        "March"         "casual"        66329       2
        "March"         "member"        146390      3
        "April"         "casual"        90747       2
        "April"         "member"        177666      4
        "May"           "casual"        216860      5
        "May"           "member"        277063      6
        "June"          "casual"        287406      7
        "June"          "member"        322107      7
        "July"          "casual"        306378      7
        "July"          "member"        324096      7
        "August"        "member"        328365      7
        "August"        "casual"        265563      6
        "September"     "member"        307658      7
        "September"     "casual"        217375      5
        "October"       "casual"        187206      4
        "October"       "member"        284038      6
        "November"      "member"        182868      4
        "November"      "casual"        69225       2
        "December"      "member"        129282      3
        "December"      "casual"        44644       1
    1. Put days in order using the same script.
  1. Totaled ride count and percentage of rides by user type by day of the week:
SELECT day, user_type,
      COUNT(*) AS total,
ROUND(COUNT(*) * 100.0 / (SELECT COUNT(*)
FROM fiscal_year_2021_2022."total_rides_CLEAN")) AS percent
FROM fiscal_year_2021_2022."total_rides_CLEAN"
GROUP BY day, user_type
ORDER BY CASE WHEN day = 'Sunday' THEN 1
        WHEN day = 'Monday' THEN 2
        WHEN day = 'Tuesday' THEN 3
        WHEN day = 'Wednesday' THEN 4
        WHEN day = 'Thursday' THEN 5
        WHEN day = 'Friday' THEN 6
        ELSE 7
        END;
    1. Return:
    "day"           "user_type"      "total"    "percent"
    "Sunday"        "casual"        309192    7
    "Sunday"        "member"        296518    7
    "Monday"        "casual"     206822   5
    "Monday"        "member"        366622    8
    "Tuesday"       "member"        420848    10
    "Tuesday"       "casual"        199917    5
    "Wednesday"    "member"     415557    9
    "Wednesday"     "casual"        203274    5
    "Thursday"      "casual"        221681    5
    "Thursday"      "member"        404778    9
    "Friday"        "member"        370376    8
    "Friday"        "casual"        255838    6
    "Saturday"      "casual"        382420    9
    "Saturday"      "member"        344044    8

Aggregated trip durations by user type and bike type

  1. Aggregated column trip duration:
SELECT ROUND(AVG(trip_duration/60)) AS average_ride_duration
FROM fiscal_year_2021_2022."total_rides_CLEAN"
    1. Return:
“average_ride_duration”
17
  1. Aggregated trip_duration by user type. Note that the average duration of a casual user’s ride is ~twice as long as a member’s ride.
SELECT user_type,
    ROUND(AVG(trip_duration/60)) AS average_ride_duration,
    MIN(trip_duration/60) AS MIN_ride_duration,
    MAX(trip_duration/60) AS MAX_ride_duration
FROM fiscal_year_2021_2022."total_rides_CLEAN"
GROUP BY user_type
    1. Return:
    "user_type"    "average_ride_duration"    "min_ride_duration"    "max_ride_duration"
    "casual"                    24                      1               1439
    "member"                    12                      1               1435
  1. Compared number of rides, trip duration, and user type by month:
SELECT month, user_type,
      COUNT(*) AS total,
    ROUND(AVG(trip_duration/60)) AS average_ride_duration
FROM fiscal_year_2021_2022."total_rides_CLEAN"
GROUP BY month, user_type
ORDER BY CASE WHEN month='January' THEN 1
        WHEN month='February' THEN 2
        WHEN month='March' THEN 3
             WHEN month='April' THEN 4
        WHEN month='May' THEN 5
        WHEN month='June' THEN 6
             WHEN month='July' THEN 7
        WHEN month='August' THEN 8
             WHEN month='September' THEN 9
        WHEN month='October' THEN 10
        WHEN month='November' THEN 11
             ELSE 12
             END;
    1. Return:
    "month"    "user_type"    "total"    "average_ride_duration"
    "January"    "member"    66554      10
    "January"    "casual"    12461          18
    "February"    "casual"    14950         21
    "February"    "member"    72656     10
    "March"    "casual"    66329            26
    "March"    "member"    146390       11
    "April"    "casual"    90747            25
    "April"    "member"    177666           11
    "May"    "casual"    216860         27
    "May"    "member"    277063         13
    "June"    "casual"    287406            25
    "June"    "member"    322107            13
    "July"    "casual"    306378            25
    "July"    "member"    324096            13
    "August"    "member"    328365      13
    "August"    "casual"    265563          23
    "September"    "member"    307658       12
    "September"    "casual"    217375       22
    "October"    "casual"    187206     24
    "October"    "member"    284038     12
    "November"    "member"    182868        11
    "November"    "casual"    69225     20
    "December"    "member"    129282        10
    "December"    "casual"    44644     20
  1. Compared number of rides, trip duration, and user type by day of the week:
SELECT day, user_type,
      COUNT(*) AS total,
    ROUND(AVG(trip_duration/60)) AS average_ride_duration
FROM fiscal_year_2021_2022."total_rides_CLEAN"
GROUP BY day, user_type
ORDER BY CASE WHEN day = 'Sunday' THEN 1
        WHEN day = 'Monday' THEN 2
        WHEN day = 'Tuesday' THEN 3
        WHEN day = 'Wednesday' THEN 4
        WHEN day = 'Thursday' THEN 5
        WHEN day = 'Friday' THEN 6
        ELSE 7
        END;
    1. Return:
    "day"    "user_type"    "total"    "average_ride_duration"
    "Sunday"    "casual"    309192          28
    "Sunday"    "member"    296518      14
    "Monday"    "casual"    206822      25
    "Monday"    "member"    366622      12
    "Tuesday"    "member"    420848     12
    "Tuesday"    "casual"    199917     22
    "Wednesday"    "member"    415557       12
    "Wednesday"    "casual"    203274       21
    "Thursday"    "casual"    221681        21
    "Thursday"    "member"    404778        12
    "Friday"    "member"    370376      12
    "Friday"    "casual"    255838          22
    "Saturday"    "casual"    382420        27
    "Saturday"    "member"    344044        14
  1. Checked percentage of use by bike type by total rides and average duration:
SELECT bike_type,
    COUNT(bike_type) AS total_rides,
    ROUND(AVG(trip_duration/60)) AS average_ride_duration,
    ROUND (COUNT(bike_type) * 100.0 /
           (SELECT COUNT(*)
    FROM fiscal_year_2021_2022."total_rides_CLEAN")) AS percent
FROM fiscal_year_2021_2022."total_rides_CLEAN"
GROUP BY bike_type
    1. Return:
    "bike_type"     "total_rides"    "average_ride_duration"    "percent"
    "classic_bike"        2695565           17              61
    "docked_bike"     188124            48              4
    "electric_bike"     1514198         14              34
  1. Checked percentage of use of bike types by user
SELECT user_type, bike_type,
     COUNT(user_type) AS total,
    ROUND(AVG(trip_duration/60)) AS average_ride_duration,
     ROUND(COUNT(user_type) * 100.0 / (SELECT COUNT(*) FROM fiscal_year_2021_2022."total_rides_CLEAN")) AS percent
FROM fiscal_year_2021_2022."total_rides_CLEAN"
GROUP BY user_type, bike_type
    1. Return:
    "user_type"    "bike_type"       "total"    "average_ride_duration"    "percent"
    "casual"    "classic_bike"    925549        24              21
    "casual"        "docked_bike"   188124      48              4
    "casual"        "electric_bike"    665471       17              15

    "member"        "classic_bike"    1770016       13              40
    "member"        "electric_bike"   848727        11              19

Investigated types of trips taken by user type

  1. Compared number of round trips and their average duration of each user type by bike type (used start_station_id and end_station_id columns since they are numbers, they are more reliable than names - which could be alternatively typed.)
SELECT user_type, COUNT (*) AS number_of_round_trips
FROM fiscal_year_2021_2022."total_rides_CLEAN"
WHERE start_station_id = end_station_id
GROUP BY user_type
    1. Return:
    "user_type"    "number_of_round_trips"
    "casual"        154171
    "member"        72341
  1. Calculated most used stations of user types:
SELECT  start_station_id AS most_used_station,
  COUNT(*) AS num_trips,
  ROUND(AVG(trip_duration)/60) AS duration_in_mins
FROM fiscal_year_2021_2022."total_rides_CLEAN"
GROUP BY
start_station_id
ORDER BY
num_trips DESC
LIMIT 6
    1. Return:
    "most_used_station"    "num_trips"    "duration_in_mins "
    "13022"                     70719       33
    "13300"                     39310       34
    "LF-005"                      37616     25
    "13042"                     37111       31
    "TA1308000050"              35990       14
    "13008"                     33958       35
    1. Checked names of most used stations:
SELECT start_station_id, start_station_name
    FROM fiscal_year_2021_2022."total_rides_CLEAN"
    WHERE start_station_id IN ('13022', '13300', 'LF-005', '13042', 'TA1308000050', '13008')
GROUP BY start_station_id, start_station_name;
      1. Return:
    "start_station_id"      "start_station_name"
    "13008"         "Millennium Park"
    "13022"         "Streeter Dr & Grand Ave"
    "13042"         "Michigan Ave & Oak St"
    "13300"         "DuSable Lake Shore Dr & Monroe St"
    "LF-005"            "DuSable Lake Shore Dr & North Blvd"
    "TA1308000050"      "Wells St & Concord Ln"
  1. Calculated most used stations of casual:
SELECT user_type, start_station_id AS most_used_station, start_station_name,
  COUNT(start_station_id) AS num_trips,
  ROUND(AVG(trip_duration)/60) AS duration_in_mins
FROM fiscal_year_2021_2022."total_rides_CLEAN"
WHERE user_type = 'casual'
GROUP BY
user_type, start_station_id, start_station_name
ORDER BY
num_trips DESC
LIMIT 6
    1. Return:
    "user_type"    "most_used_station"    "start_station_name"      "num_trips"    "duration_in_mins"
    "casual"        "13022"         "Streeter Dr & Grand Ave"             54792       37
    "casual"        "13300"         "DuSable Lake Shore Dr & Monroe St"    30270        37
    "casual"        "13008"         "Millennium Park"                       25080       41
    "casual"        "13042"         "Michigan Ave & Oak St"                 23659       37
    "casual"        "LF-005"        "DuSable Lake Shore Dr & North Blvd"    22130       30
    "casual"        "15544"         "Shedd Aquarium"                        19293       31
  1. Calculated most used stations of member:
SELECT user_type, start_station_id AS most_used_station, start_station_name,
  COUNT(start_station_id) AS num_trips,
  ROUND(AVG(trip_duration)/60) AS duration_in_mins
FROM fiscal_year_2021_2022."total_rides_CLEAN"
WHERE user_type = 'member'
GROUP BY
user_type, start_station_id, start_station_name
ORDER BY
num_trips DESC
LIMIT 6
    1. Return:
    "user_type"    "most_used_station"    "start_station_name"    "num_trips"    "duration_in_mins"
    "member"    "KA1503000043"    "Kingsbury St & Kinzie St"        24567           9
    "member"    "TA1307000039"    "Clark St & Elm St"               21451           12
    "member"    "TA1308000050"    "Wells St & Concord Ln"           20645           12
    "member"    "WL-012"        "Clinton St & Washington Blvd"      18654           11
    "member"    "TA1305000032"    "Clinton St & Madison St"         18483           11
    "member"    "KA1504000135"    "Wells St & Elm St"               18242           11
  1. Calculated least used stations of user types:
SELECT start_station_id AS most_used_station, start_station_name,
  COUNT(start_station_id) AS num_trips,
  ROUND(AVG(trip_duration)/60) AS duration_in_mins
FROM fiscal_year_2021_2022."total_rides_CLEAN"
GROUP BY
start_station_id, start_station_name
ORDER BY
num_trips ASC
LIMIT 6
    1. Return:
    "most_used_station"    "start_station_name"     "num_trips"    "duration_in_mins"
    "1032"    "Public Rack - Kedvale Ave & 63rd St"         1           5
    "1033"    "Public Rack - Pulaski Rd &amp; 65th St"      1           11
    "1018"    "Public Rack - Kostner Ave & Wrightwood Ave"  1         102
    "1030"    "Public Rack - Lawndale & 63rd St"            1           37
    "1015"    "Public Rack - Peterson Ave & Drake Ave"      1           25
    "1034"    "Public Rack - Kenneth Ave & 63rd St E"       1           8
  1. Calculated least used stations of casual:
    1. Return:
    "user_type"    "most_used_station"    "start_station_name"    "num_trips"    "duration_in_mins"
    "casual"    "1036"    "Public Rack - Kedzie Ave & 60th St"          1           3
    "casual"    "1038"    "Public Rack - Kedzie Ave &amp; 62nd Pl"      1           6
    "casual"    "1032"    "Public Rack - Kedvale Ave & 63rd St"         1           5
    "casual"    "1030"    "Public Rack - Lawndale & 63rd St"            1           37
    "casual"    "1018"    "Public Rack - Kostner Ave & Wrightwood Ave"    1         102
    "casual"    "1040"    "Public Rack - Talman Ave & Pershing Rd"      1           40
  1. Calculated least used stations of member:
    1. Return:
    "user_type"    "most_used_station"    "start_station_name"    "num_trips"    "duration_in_mins"     
    "member"    "1034"    "Public Rack - Kenneth Ave & 63rd St E"       1           8
    "member"    "1036"    "Public Rack - Kedzie Ave & 60th St"          1           6
    "member"    "1033"    "Public Rack - Pulaski Rd &amp; 65th St"        1         11
    "member"    "1015"    "Public Rack - Peterson Ave & Drake Ave"        1         25
    "member"    "1016"    "Public Rack - Peterson Ave & Bernard Ave"      1         9
    "member"    "1039"    "Public Rack - Kedzie Ave & 61st Pl W"        1           4

Visualize, Analyze, and Share

See Tableau