---
title: "Anly 512_Final Project"
author: "Kirti"
date: "6/21/2019"
output: 
  flexdashboard::flex_dashboard:
    storyboard: true
    social: menu
    source: embed
    orientation: rows
    vertical_layout: fill
    
    
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE)

library(flexdashboard)
library(dygraphs)
library(plyr)       # loaded before tidyverse so that dplyr's verbs are not masked by plyr
library(tidyverse)
library(tidyquant)
library(quantmod)
library(ggplot2)
library(ggthemes)
library(RColorBrewer)
library(ggmap)
library(sp)
library(leaflet)
library(htmltools)
library(lubridate)
#install.packages("plotly")
library(plotly)
#install.packages("xts")
library(xts)
```

### PROJECT OVERVIEW {data-commentary-width=550}

![Uber Data](/Users/katie/R Data/ANLY 512 - Data Viz/Uber Data/images.jpeg)

***
A project for ANLY 512: 'The Quantified Self' Data Visualization

The Quantified Self (QS) is a movement motivated to leverage the synergy of wearables, analytics, and “Big Data”. This movement exploits the ease and convenience of data acquisition through the internet of things (IoT) to feed the growing obsession of personal informatics and quotidian data.

In this project I conduct a 'Quantified Self' analysis of my Uber rides data and develop visualizations that exhibit different patterns and behaviors in my Uber rides so far. The motivation behind this project is the simple fact that I use Uber a lot, as I don't have a car. I am planning to buy a new car and have been contemplating that for a while now, so I wanted to obtain insights into what my Uber ride history looks like.
I will be focusing on topics such as how much money I have been spending on Uber, the locations where I have mostly used Uber for my commute, and the longest distance I have travelled using Uber.

Data Collection:
The Uber rider data was fetched from my account on the Uber website. The data includes information that was used to get me to my destination, such as:

- Times and locations at which a trip was requested, started, and ended
- Distance traveled
- Trip prices and currency

Tools Used:
I mainly used R for data processing. Various R packages such as ggplot2, plotly, lubridate, ggthemes, and ggmap were used for developing the visualizations.

Questions:

1) What are the different cities I used Uber in?
    1a. What do my different Uber trip statuses say about my rides in the respective cities?
2) What were the areas in a particular city where I took most of my Uber rides? (Spatial data)
3) My travel timeline through the years 2015-2019
4) What is the highest amount I have spent on an Uber ride, and on what date and at which location? Note: spending can be currency specific.
5) Distance travelled in miles on different dates. What is the maximum distance travelled, and when and where?



### My Travel Timeline {data-commentary-width=350}
```{r results='hide'}
# Load the ride history exported from my Uber account
RiderData <- read.csv("/Users/katie/R Data/ANLY 512 - Data Viz/Uber Data/UberRidesData.csv", stringsAsFactors = FALSE)

# Parse the request dates; despite its name, Year holds the first day of each ride's month
RiderData$Request.Date <- as.Date(RiderData$Request.Date, "%Y-%m-%d")
RiderData$Year <- as.Date(cut(RiderData$Request.Date, breaks = "month"))

# Refer to columns by name inside aes() rather than through RiderData$...
G1 <- ggplot(RiderData, aes(Year, City)) +
  geom_line(color = "#00AFBB", size = 2) +
  labs(title = "Travel Timeline", x = "Year", y = "Cities")
G1
```

***
I moved to New York, USA in 2015 from India, and that's when my Uber journey started.
I have lived in 3 cities so far (New York, Minneapolis-St. Paul, and Philadelphia), and my timeline visualization clearly depicts this journey.

I moved out of New York in Nov 2015, but I have been travelling to the New York area consistently from 2015 to 2019, which also explains my Uber usage in the Jersey area.
I lived in Minneapolis-St. Paul from Dec 2015 until Jan 2018.
After that I moved to the Philadelphia area, where I have lived from 2018 to date.

All other locations are places I have travelled to; the number of rides taken there is so low that they do not show up on the graph.


### What does my Uber rides usage look like in different cities? What do different Uber Trip Statuses suggest about my Rides? 

```{r results='hide'}

# Re-read the raw export; this resets Request.Date to the character values stored in the CSV
RiderData <- read.csv("/Users/katie/R Data/ANLY 512 - Data Viz/Uber Data/UberRidesData.csv", stringsAsFactors = FALSE)

#View(RiderData)
#str(RiderData)

# Stacked bar chart of trip counts per city, split by trip status
G1 <- ggplot(RiderData, aes(City, fill = Trip.Status)) +
  scale_fill_discrete(name = "Trip Status") +
  labs(title = "My Uber Trips Status", subtitle = "By Cities", x = "City", y = "Count")
G1 + geom_bar() + theme(axis.text.x = element_text(angle = 45, hjust = 1))

```

***

I have used Uber in different cities, as we can see in the visualization.
Note: travel in the suburbs is aggregated at the city level.

My maximum use of Uber seems to be in the Philadelphia area, followed by Minneapolis-St. Paul, followed by New York City.

My commute rate via Uber has gone up considerably in the Philadelphia area, since I live in the Philly suburbs and public transport is quite scarce in this area. This visualization clearly explains why it was only after moving to PA that I felt the need to buy a car.
I have the maximum usage of Uber in the Philadelphia area, in the least amount of time.

Surprisingly, New York is 3rd on the list. I was under the impression that I hadn't used Uber much in New York City, as I have always travelled by subway, but that doesn't seem to be the case. However, I do feel that the ride count might also be affected by the time I have spent in each of these cities. For example, I lived in NYC for more than 2.5 years, which makes me think that even though I didn't take Uber as often there, the overall count has added up over time.
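
The raw counts per city do not account for how long I was actually riding in each place. A minimal sketch of one way to normalise them, assuming the `City` and `Request.Date` columns used elsewhere in this dashboard; the helper name `rides_per_active_month` and the toy rows are hypothetical and only illustrate the idea:

```r
# Hypothetical helper: rides per "active" month (a month with at least one ride), by city.
rides_per_active_month <- function(rides) {
  month <- format(as.Date(rides$Request.Date), "%Y-%m")
  total_rides   <- tapply(rides$Request.Date, rides$City, length)
  active_months <- tapply(month, rides$City, function(m) length(unique(m)))
  data.frame(
    City          = names(total_rides),
    Rides         = as.integer(total_rides),
    ActiveMonths  = as.integer(active_months),
    RidesPerMonth = as.numeric(total_rides) / as.numeric(active_months),
    stringsAsFactors = FALSE
  )
}

# Toy rows for illustration only, not my real trip data
toy_rides <- data.frame(
  City = c("Philadelphia", "Philadelphia", "Philadelphia", "New York City", "New York City"),
  Request.Date = c("2018-03-01", "2018-03-15", "2018-04-02", "2015-09-10", "2017-06-05"),
  stringsAsFactors = FALSE
)
rides_per_active_month(toy_rides)
```

A city with many rides packed into few months gets a high `RidesPerMonth`, while a city visited occasionally over several years gets a low one.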


### What are the locations where I frequently commute or take Uber rides from?

```{r}
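#2. Rides taken in the Philadelphia area - Map
# The API key is read from an environment variable instead of being hard-coded;
# the variable name GOOGLE_MAPS_API_KEY is an assumption, use whichever name you have set.
register_google(key = Sys.getenv("GOOGLE_MAPS_API_KEY"))
Philadelphia <- subset(RiderData, City == "Philadelphia")
# zoom = 10 was originally passed to geom_point(), where it has no effect; it belongs to get_map()
ggmap(get_map(location = "Philadelphia", zoom = 10, maptype = "terrain")) +
  geom_point(aes(Begin.Trip.Lng, Begin.Trip.Lat), data = Philadelphia, color = "red", size = 2)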


#qmplot(Begin.Trip.Lng, Begin.Trip.Lat, data=Philadelphia, color = I('Red'), size = I(2), zoom=12, darken = .2, extent = "panel", main = "Cities", xlab = "Longitude", ylab = "Latitude")

```

***
Since the maximum number of my Uber rides are in Philadelphia, which also includes the suburbs, I wanted to take a closer look at the areas in Philly where I have used Uber. There are dense red dots in the Malvern area, which is where I live, and in Philadelphia city. This shows that most of the times I travel to the city, I take an Uber.
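
The dense clusters on the map can also be counted rather than judged by eye. A rough sketch, assuming the `Begin.Trip.Lat` and `Begin.Trip.Lng` columns used in the map; the helper name `pickup_hotspots` and the coordinates below are made up for illustration:

```r
# Hypothetical helper: count pickups per grid cell by rounding the coordinates.
# With digits = 2 a cell is roughly one kilometre across.
pickup_hotspots <- function(trips, digits = 2) {
  cell <- paste(round(trips$Begin.Trip.Lat, digits),
                round(trips$Begin.Trip.Lng, digits),
                sep = ", ")
  counts <- sort(table(cell), decreasing = TRUE)
  data.frame(Cell = names(counts), Pickups = as.integer(counts), stringsAsFactors = FALSE)
}

# Made-up pickup coordinates, not my real trips
toy_trips <- data.frame(
  Begin.Trip.Lat = c(40.036, 40.038, 40.041, 39.952, 39.949, 39.903),
  Begin.Trip.Lng = c(-75.514, -75.512, -75.509, -75.163, -75.161, -75.302)
)
pickup_hotspots(toy_trips)
```

The cells at the top of the result correspond to the densest groups of dots on the map.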


### How much fare have I paid for Uber rides in different areas from 2015 to 2019?

```{r echo=FALSE, fig.keep='all', results='hide'}
Uber<- ggplot(RiderData, aes(x=Dropoff.Date, y=Fare.Amount, Fare.Currency = Fare.Currency)) +
  geom_point(aes(col=Fare.Amount, size=Fare.Amount)) + labs(title = "My Uber Rides", x = "Dates", y = "Fare")

Uber1 <- ggplotly(Uber, tooltip = c("Dropoff.Date", "y", "Fare.Currency"))
Uber1 

```

***
This plot shows the amount spent on Uber over the period from 2015 to 2018. The data shows a spike around 14th to 16th Jan 2017, but hovering over those points shows that the fares are in INR, which in dollars amounts to no more than \$10.
The highest amount paid for an Uber ride was on 21st September 2018.
Most of my Uber rides are clustered between \$5 and \$30, which suggests that I am unlikely to take an Uber if the price is above \$30.
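
Because the fares are recorded in more than one currency, the highest fare is only meaningful within a single currency. A small sketch of how that lookup could be done, assuming the `Fare.Amount` and `Fare.Currency` columns used in the plot; the helper name `max_fare_by_currency` and the toy rows are hypothetical:

```r
# Hypothetical helper: return the row with the highest fare within each currency.
max_fare_by_currency <- function(rides) {
  groups <- split(rides, rides$Fare.Currency)
  top <- lapply(groups, function(g) g[which.max(g$Fare.Amount), , drop = FALSE])
  out <- do.call(rbind, top)
  rownames(out) <- NULL
  out
}

# Made-up fares for illustration only
toy_fares <- data.frame(
  Request.Date  = c("2017-01-01", "2017-01-02", "2018-01-01", "2018-01-02"),
  City          = c("City A", "City A", "City B", "City C"),
  Fare.Amount   = c(100, 250, 20, 45),
  Fare.Currency = c("INR", "INR", "USD", "USD"),
  stringsAsFactors = FALSE
)
max_fare_by_currency(toy_fares)
```

Comparing the maximum per currency avoids mistaking a modest INR fare for an expensive USD ride.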

### 2017: What is the highest fare I paid in 2017, and where was that?


```{r fig.keep='all', message=FALSE}
# Keep only the rides requested in 2017
RiderData2017 <- RiderData %>%
  filter(Request.Date >= as.Date("2017-01-01") & Request.Date <= as.Date("2017-12-31"))

# First and last request dates in 2017
min(RiderData2017$Request.Date)
max(RiderData2017$Request.Date)

# Fare.Currency and City are extra aesthetics, passed only so that ggplotly can show them in the tooltip
G2017 <- ggplot(RiderData2017, aes(Request.Date, Fare.Amount, Fare.Currency = Fare.Currency, City = City)) +
  geom_bar(stat = 'identity', fill = 'darkorchid4') +
  labs(title = "Uber Rides in 2017", subtitle = "Fare Paid", x = "Request Date", y = "Fare Amount")

G2017 <- ggplotly(G2017, tooltip = c("Request.Date", "Fare.Amount", "Fare.Currency", "City"))
G2017

```

***
Note: Interactive plot

In 2017, the maximum fare I paid was 261.64 INR, in Pune City. The thing to note here is that comparing this amount directly with my USD fares is not a correct analysis, since 261 INR amounts to only ~\$3. I purposely didn't do the INR to USD conversion during my data setup because I wanted to highlight this difference.

In Jan 2017 I was travelling in India, so most of the expenses in that period look high, but they are all in INR.
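
For a like-for-like comparison, the fares could be converted to a single currency before plotting. A minimal sketch, not part of my data setup: the helper name `to_usd` is hypothetical, and the INR rate below is a placeholder chosen for illustration, not a quoted exchange rate.

```r
# Hypothetical helper: convert fares to USD with a named vector of rates,
# where each rate is the USD value of one unit of that currency.
to_usd <- function(amount, currency, usd_per_unit) {
  unknown <- setdiff(unique(currency), names(usd_per_unit))
  if (length(unknown) > 0) {
    stop("No rate supplied for: ", paste(unknown, collapse = ", "))
  }
  amount * unname(usd_per_unit[currency])
}

# Placeholder rates for illustration only; look up the actual rate for the trip date
rates <- c(USD = 1, INR = 0.0125)

to_usd(c(261.64, 20), c("INR", "USD"), rates)
```

In the dashboard this could be applied as `to_usd(RiderData$Fare.Amount, RiderData$Fare.Currency, rates)`, provided a rate is supplied for every currency that appears in the data.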



### 2018: What is the highest fare I paid in 2018, and where was that?

```{r fig.keep='all'}

# Parse the request dates and keep only the rides requested in 2018
RiderData$Request.Date <- as.Date(RiderData$Request.Date, format = "%Y-%m-%d")
RiderData2018 <- RiderData %>%
  filter(Request.Date >= as.Date("2018-01-01") & Request.Date <= as.Date("2018-12-31"))

# Fare.Currency and City are extra aesthetics, passed only so that ggplotly can show them in the tooltip
G5 <- ggplot(RiderData2018, aes(Request.Date, Fare.Amount, Fare.Currency = Fare.Currency, City = City)) +
  geom_bar(stat = 'identity', fill = 'darkorchid4') +
  labs(title = "Uber Rides in 2018", subtitle = "Fare Paid", x = "Request Date", y = "Fare Amount")

G5 <- ggplotly(G5, tooltip = c("Request.Date", "Fare.Amount", "Fare.Currency", "City"))
G5

```

***
Note: Interactive plot

In 2018, the maximum fare I paid was on 21st Sept, in New York City.

Overall, my Uber usage in 2018 is noticeably higher than in 2017.

### What is the maximum distance I have travelled using Uber, and when?
```{r echo=FALSE, fig.keep='all'}

#weekdays(RiderData$Request.Date)
RDD <- read.csv("/Users/katie/R Data/ANLY 512 - Data Viz/Uber Data/Uber Data 2/Rider/trips_data.csv")
#str(RDD)
RDD$Request.Time <- ymd_hms(RDD$Request.Time)
RDD$Dropoff.Time <- ymd_hms(RDD$Dropoff.Time)
# Build a time series of trip distance, indexed by request time
Ride <- xts(x = RDD$Distance..miles., order.by = RDD$Request.Time)


dygraph(Ride, main = "Distance Travelled from 2015 - 2019") %>%
  dyOptions(drawPoints = TRUE, pointSize = 5) %>%
  dyRangeSelector() %>%
  dyAxis("y", label= "Distance") %>%
  dyHighlight(highlightCircleSize = 0.5,
              highlightSeriesBackgroundAlpha = 1) 
```

***
Distance Travelled Graph Summary:

The maximum distance that I travelled using Uber was 36.78 miles on 11/26/2015.
The ride was taken in the morning, around 10:15 AM.
This is followed by a couple of rides of 33.88 miles and 32.37 miles.
Overall, the distance graph suggests that I not only take Uber for longer commutes but have also quite frequently used it for shorter distances.
The lowest distance travelled is 0.22 miles on 11/27/2015 at 12:30 PM.

Fun Observation: The maximum distance travelled is immediately followed by the minimum distance travelled, the very next day.
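
The longest and shortest rides can also be pulled out programmatically instead of being read off the graph. A short sketch, assuming the `Request.Time` and `Distance..miles.` columns used in the chart; the helper name `distance_extremes` is hypothetical, the first two toy rows mirror the rides described above, the third is made up, and the UTC time zone is an assumption:

```r
# Hypothetical helper: find the longest and shortest rides and the number of days between them.
distance_extremes <- function(trips) {
  longest  <- trips[which.max(trips$Distance..miles.), ]
  shortest <- trips[which.min(trips$Distance..miles.), ]
  list(
    longest    = longest,
    shortest   = shortest,
    days_apart = as.numeric(as.Date(shortest$Request.Time) - as.Date(longest$Request.Time))
  )
}

# Toy rows: the first two mirror the rides in the summary, the third is made up
toy_distances <- data.frame(
  Request.Time = as.POSIXct(
    c("2015-11-26 10:15:00", "2015-11-27 12:30:00", "2016-03-05 18:00:00"),
    tz = "UTC"  # assumed time zone
  ),
  Distance..miles. = c(36.78, 0.22, 5.40)
)

extremes <- distance_extremes(toy_distances)
extremes$days_apart
```

A `days_apart` of 1 corresponds to the shortest ride falling on the day after the longest one.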