Executive Summary: New York Taxi Ride Map

The dataset is derived from the 2016 NYC Yellow Cab trip record data made available in BigQuery on Google Cloud Platform. The data was originally published by the NYC Taxi and Limousine Commission (TLC). This report shows the popular pickup spots during two time frames of the day: morning and evening.

Exploratory Data Analysis

The training dataset is downloaded from the Kaggle website and unzipped. It contains about 1.46 million rows (1,458,644) and 11 columns.

# train.zip is downloaded from the Kaggle competition page:
# https://www.kaggle.com/c/nyc-taxi-trip-duration/data/
unzip("train.zip")
# read.csv (rather than read.csv2) treats "." as the decimal mark,
# so the coordinate columns are parsed as numeric
trainDS <- read.csv("train.csv", header = TRUE)
dim(trainDS)
## [1] 1458644      11

The available columns are listed below. We use three of them (pickup_datetime, pickup_longitude, pickup_latitude) to show the popular taxi pickup locations in the morning (7 am to 9 am) and in the evening (5 pm to 7 pm).

names(trainDS)
##  [1] "id"                 "vendor_id"          "pickup_datetime"   
##  [4] "dropoff_datetime"   "passenger_count"    "pickup_longitude"  
##  [7] "pickup_latitude"    "dropoff_longitude"  "dropoff_latitude"  
## [10] "store_and_fwd_flag" "trip_duration"
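
The two time frames can be derived from pickup_datetime roughly as follows. This is only a minimal sketch, not necessarily the method used for the maps: it runs on a small made-up data frame (`toy`) instead of trainDS, the names `toy`, `pickup_hour` and `time_frame` are hypothetical, and it assumes each window includes its start hour and excludes its end hour (so 9:00 am and 7:00 pm fall outside). The timestamps are parsed with tz = "UTC" purely to keep the clock time as written.

```r
# Toy rows standing in for trainDS; the values are made up for illustration.
toy <- data.frame(
  pickup_datetime  = c("2016-03-14 07:24:55", "2016-06-12 17:43:35",
                       "2016-01-19 11:35:24", "2016-04-06 08:59:59",
                       "2016-03-26 19:00:00"),
  pickup_longitude = c(-73.982, -73.980, -73.979, -74.010, -73.973),
  pickup_latitude  = c(40.768, 40.739, 40.764, 40.720, 40.793),
  stringsAsFactors = FALSE
)

# Parse the timestamp and extract the hour of day (0-23).
pickup_time <- as.POSIXct(toy$pickup_datetime,
                          format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
toy$pickup_hour <- as.integer(format(pickup_time, "%H"))

# Assumption: hours 7-8 cover 7 am to 9 am, hours 17-18 cover 5 pm to 7 pm.
toy$time_frame <- ifelse(toy$pickup_hour %in% 7:8, "morning",
                  ifelse(toy$pickup_hour %in% 17:18, "evening", NA))

# which() drops the rows where time_frame is NA.
morning <- toy[which(toy$time_frame == "morning"), ]
evening <- toy[which(toy$time_frame == "evening"), ]
```

The same steps should carry over to trainDS by replacing `toy` with `trainDS`; the resulting `morning` and `evening` subsets keep the longitude and latitude columns needed for plotting.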