Founded in 2009, Uber is a leading technology platform that primarily offers its end users the ability to connect with independent providers of ride services and food delivery service providers.Uber also leverages this same network to connect shippers with carriers in the freight industry.
In less than a decade, Uber has grown to to become a prominent sharing economy service provider. As of 2019, Uber has about 110 million users (monthly) in the US and has generated around 14.1 billion USD in net revenue (Statista, 2020).
In this analysis, we do a deep dive of Uber’s adjusted EBITDA (USD)in millions broken down by its five major segments - rides, eats, freight, other bets (multiple investment stage offerings) and ATG (development and commercialization of autonomous vehicles). Through our analysis, we seek to understand what are the most and least profitable segments for Uber.
The dataset titled ‘Uber Adjusted EBITDA by segment, USD Millions’ was posted in the discussion board by Diego Correa. Segment Adjusted EBITDA is defined as revenue less specific expenses (Uber Annual Report, 2020).
The data ranges from Q1 2018-Q1 2020.
It is a wide dataset with 9 rows:
While columns 2-3 contain the earning information by segment, column 9 reflect the costs or expenses to be subtracted from the total earnings from the previous columns.
Unreported data is denoted by ‘X’, however, there is missing data in the table as well (such as Ride and Eats does not have any data for Q1 2018 and Q2 2018 as the earnings are reflected under Core Platform).
The original dataset is from Mansoor Iqbal’s 2020 report titled ‘Uber Revenue and Usage Statistics’ published by Business of Apps. link
I recreated the data set in a SQL DB implemented in Azure. I used the create table statement to create the table and the insert into statement to insert the values into the table.
There’s an additional ID column in the table. I used 0 to denote blank data. Moreover, because the data type for the segment columns is integer, instead of X for unreported data I used 2.
library(odbc)
library(DBI)
library (tidyverse)
## -- Attaching packages ---------------- tidyverse 1.3.0 --
## v ggplot2 3.3.2 v purrr 0.3.4
## v tibble 3.0.3 v dplyr 1.0.2
## v tidyr 1.1.2 v stringr 1.4.0
## v readr 1.3.1 v forcats 0.5.0
## -- Conflicts ------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(ggplot2)
Code is hidden in Rpubs since it contains access information
Data <- dbGetQuery(my_connection,"Select * from Uber")
knitr::kable(Data,caption='Uber Stats')
| ID | Quarter | Rides | Eats | Core_Platform | Other_Inc_Freight | Other | ATG | Corporate_GA_Platform_RD | Freight |
|---|---|---|---|---|---|---|---|---|---|
| 2 | Q1 2018 | 0 | 0 | 427 | -20 | 0 | 2 | 2 | 0 |
| 3 | Q2 2018 | 0 | 0 | 369 | -28 | 0 | 2 | 2 | 0 |
| 4 | Q3 2018 | 416 | -189 | 0 | 0 | -12 | -132 | -501 | -31 |
| 5 | Q4 2018 | 195 | -278 | 0 | 0 | -38 | -105 | -568 | -23 |
| 6 | Q1 2019 | 192 | -309 | 0 | 0 | -42 | -113 | -568 | -29 |
| 7 | Q2 2019 | 506 | -286 | 0 | -122 | 0 | 2 | 2 | 0 |
| 8 | Q3 2019 | 631 | -316 | 0 | 0 | -72 | -124 | -623 | -81 |
| 9 | Q4 2019 | 742 | -461 | 0 | 0 | -67 | -130 | -644 | -55 |
| 10 | Q1 2020 | 581 | -313 | 0 | 0 | -63 | -108 | -645 | -64 |
Data_Tibble <-as_tibble(Data)
Data_Tibble
One of the analysis requested by Diego, is to look at the adjusted EBITDA by year.However, the dataset in its current format, contains the information for both the quarter and year in one column - quarter.
Data_Tibble <- Data_Tibble <- separate(Data_Tibble,Quarter,c('Quarter','Year'), ' ' )
The core platform column reflects the sum of earnings from rides and eats prior to Q3 2018. Since we are looking at the adjusted EBITDA broken down by the segments as of today, it would be more accurate to take the earning for Q1 and Q2 2018 under Core Platform and divide them accordingly between rides and eats. However, we do not have the information to determine how to divide the sum of earnings between the two segments.
What we could do is determine the % of total earnings from rides and eats , from Q3 2018 on ward and then apply the average % to the sum of earnings in Core Platforms to get the numbers for each of the segments prior to Q3 2018.
RidePercent <-(Data_Tibble$Rides/(Data_Tibble$Eats+Data_Tibble$Rides))
mean(RidePercent,na.rm=TRUE)
## [1] 0.9934043
Rides accounts for 0.99 or 99% of sum of earning from rides+eats.
EatsPercent <- (Data_Tibble$Eats/(Data_Tibble$Eats+Data_Tibble$Rides))
mean(EatsPercent,na.rm=TRUE)
## [1] 0.006595667
Eats accounts for less than 1% of total earnings. We can now apply these percents to earnings reported in core_platform prior to Q1 and Q2 in 2018. For our analysis well round this mean to 0.01.
Data_Tibble$Rides <-ifelse(Data_Tibble$Quarter=='Q1'&Data_Tibble$Year==2018,0.99*Data_Tibble$Core_Platform,Data_Tibble$Rides)
Data_Tibble$Rides <-ifelse(Data_Tibble$Quarter=='Q2'&Data_Tibble$Year==2018,0.99*Data_Tibble$Core_Platform,Data_Tibble$Rides)
Data_Tibble$Rides
## [1] 422.73 365.31 416.00 195.00 192.00 506.00 631.00 742.00 581.00
Data_Tibble$Eats <-ifelse(Data_Tibble$Quarter=='Q1'&Data_Tibble$Year==2018,0.01*Data_Tibble$Core_Platform,Data_Tibble$Eats)
Data_Tibble$Eats <-ifelse(Data_Tibble$Quarter=='Q2'&Data_Tibble$Year==2018,0.01*Data_Tibble$Core_Platform,Data_Tibble$Eats)
Data_Tibble$Eats
## [1] 4.27 3.69 -189.00 -278.00 -309.00 -286.00 -316.00 -461.00 -313.00
Data_Tibble <-select(Data_Tibble,-Core_Platform)
We’ll do the same with the Other_Inc_Freight column, which is a combination of earning from Freight and Other bets prior to Q3 2018 and also in Q2 2019.
FreightPercent <-(Data_Tibble$Freight/(Data_Tibble$Freight+Data_Tibble$Other))
mean(FreightPercent,na.rm=TRUE)
## [1] 0.4984331
OtherPercent <-(Data_Tibble$Other/(Data_Tibble$Freight+Data_Tibble$Other))
mean(OtherPercent,na.rm=TRUE)
## [1] 0.5015669
Data_Tibble$Other <-ifelse(Data_Tibble$Quarter=='Q1'&Data_Tibble$Year==2018,0.5*Data_Tibble$Other_Inc_Freight,Data_Tibble$Other)
Data_Tibble$Other <-ifelse(Data_Tibble$Quarter=='Q2'&Data_Tibble$Year==2018,0.5*Data_Tibble$Other_Inc_Freight,Data_Tibble$Other)
Data_Tibble$Other <-ifelse(Data_Tibble$Quarter=='Q2'&Data_Tibble$Year==2019,0.5*Data_Tibble$Other_Inc_Freight,Data_Tibble$Other)
Data_Tibble$Freight <-ifelse(Data_Tibble$Quarter=='Q1'&Data_Tibble$Year==2018,0.49*Data_Tibble$Other_Inc_Freight,Data_Tibble$Freight)
Data_Tibble$Freight <-ifelse(Data_Tibble$Quarter=='Q2'&Data_Tibble$Year==2018,0.49*Data_Tibble$Other_Inc_Freight,Data_Tibble$Freight)
Data_Tibble$Freight <-ifelse(Data_Tibble$Quarter=='Q2'&Data_Tibble$Year==2019,0.49*Data_Tibble$Other_Inc_Freight,Data_Tibble$Freight)
Data_Tibble$Freight
## [1] -9.80 -13.72 -31.00 -23.00 -29.00 -59.78 -81.00 -55.00 -64.00
Data_Tibble$Other
## [1] -10 -14 -12 -38 -42 -61 -72 -67 -63
Data_Tibble <-select(Data_Tibble,-Other_Inc_Freight)
Data_Tibble
There are blank values (indicated by a 0) and unreported values (indicated by 2) in the dataset.However, it seems like by getting rid of the Core Platform and Other Inc/Freight columns and assigning the numbers in these columns to other respective columns, have removed all 0s from the dataset. We only have unreported values or 2s in the ATG and Corporate G&A and platform R&D columns which we will convert to NAs:
Data_Tibble$ATG <- ifelse(Data_Tibble$ATG == 2, NA,Data_Tibble$ATG)
Data_Tibble$Corporate_GA_Platform_RD<- ifelse(Data_Tibble$Corporate_GA_Platform_RD == 2, NA,Data_Tibble$Corporate_GA_Platform_RD)
Data_Tibble
We will create a column called segments where we will list the 5 segments and another column to house their values called earnings.
Data_Tibble <- Data_Tibble %>%
gather(Segment,Earnings,
Rides,
Eats,
Other,
ATG,
Freight)
Since Corporate_GA_Platform_RD reflects the expenses we will keep this column separate but just rename it to Expense.
Data_Tibble <- Data_Tibble %>%
rename(Expense=Corporate_GA_Platform_RD)
For improved readability, we’ll reorganize the columns.
Data_Tibble <- Data_Tibble %>% relocate(Expense, .after=Earnings)
For the sake of this analysis, we will assume that Adjusted EBITDA is earnings-expense. However, expenses are listed in negatives in this dataset. Does this mean that there were negative expenses for Uber during the time frame? Most likely not, therefore I am assuming that because the table reflects earnings, costs are listed as negative earnings in this table. So the calculation for our new column is earnings+expense.
Data_Tibble["Adjusted_EBITDA"]<-Data_Tibble$Earnings+Data_Tibble$Expense
Our data is now ready for analysis
knitr::kable(Data_Tibble, caption="Uber Adjusted EBITDA")
| ID | Quarter | Year | Segment | Earnings | Expense | Adjusted_EBITDA |
|---|---|---|---|---|---|---|
| 2 | Q1 | 2018 | Rides | 422.73 | NA | NA |
| 3 | Q2 | 2018 | Rides | 365.31 | NA | NA |
| 4 | Q3 | 2018 | Rides | 416.00 | -501 | -85 |
| 5 | Q4 | 2018 | Rides | 195.00 | -568 | -373 |
| 6 | Q1 | 2019 | Rides | 192.00 | -568 | -376 |
| 7 | Q2 | 2019 | Rides | 506.00 | NA | NA |
| 8 | Q3 | 2019 | Rides | 631.00 | -623 | 8 |
| 9 | Q4 | 2019 | Rides | 742.00 | -644 | 98 |
| 10 | Q1 | 2020 | Rides | 581.00 | -645 | -64 |
| 2 | Q1 | 2018 | Eats | 4.27 | NA | NA |
| 3 | Q2 | 2018 | Eats | 3.69 | NA | NA |
| 4 | Q3 | 2018 | Eats | -189.00 | -501 | -690 |
| 5 | Q4 | 2018 | Eats | -278.00 | -568 | -846 |
| 6 | Q1 | 2019 | Eats | -309.00 | -568 | -877 |
| 7 | Q2 | 2019 | Eats | -286.00 | NA | NA |
| 8 | Q3 | 2019 | Eats | -316.00 | -623 | -939 |
| 9 | Q4 | 2019 | Eats | -461.00 | -644 | -1105 |
| 10 | Q1 | 2020 | Eats | -313.00 | -645 | -958 |
| 2 | Q1 | 2018 | Other | -10.00 | NA | NA |
| 3 | Q2 | 2018 | Other | -14.00 | NA | NA |
| 4 | Q3 | 2018 | Other | -12.00 | -501 | -513 |
| 5 | Q4 | 2018 | Other | -38.00 | -568 | -606 |
| 6 | Q1 | 2019 | Other | -42.00 | -568 | -610 |
| 7 | Q2 | 2019 | Other | -61.00 | NA | NA |
| 8 | Q3 | 2019 | Other | -72.00 | -623 | -695 |
| 9 | Q4 | 2019 | Other | -67.00 | -644 | -711 |
| 10 | Q1 | 2020 | Other | -63.00 | -645 | -708 |
| 2 | Q1 | 2018 | ATG | NA | NA | NA |
| 3 | Q2 | 2018 | ATG | NA | NA | NA |
| 4 | Q3 | 2018 | ATG | -132.00 | -501 | -633 |
| 5 | Q4 | 2018 | ATG | -105.00 | -568 | -673 |
| 6 | Q1 | 2019 | ATG | -113.00 | -568 | -681 |
| 7 | Q2 | 2019 | ATG | NA | NA | NA |
| 8 | Q3 | 2019 | ATG | -124.00 | -623 | -747 |
| 9 | Q4 | 2019 | ATG | -130.00 | -644 | -774 |
| 10 | Q1 | 2020 | ATG | -108.00 | -645 | -753 |
| 2 | Q1 | 2018 | Freight | -9.80 | NA | NA |
| 3 | Q2 | 2018 | Freight | -13.72 | NA | NA |
| 4 | Q3 | 2018 | Freight | -31.00 | -501 | -532 |
| 5 | Q4 | 2018 | Freight | -23.00 | -568 | -591 |
| 6 | Q1 | 2019 | Freight | -29.00 | -568 | -597 |
| 7 | Q2 | 2019 | Freight | -59.78 | NA | NA |
| 8 | Q3 | 2019 | Freight | -81.00 | -623 | -704 |
| 9 | Q4 | 2019 | Freight | -55.00 | -644 | -699 |
| 10 | Q1 | 2020 | Freight | -64.00 | -645 | -709 |
First I would like to see which, which segments have been high and low performers within our time frame
I will sum up the adjusted EBITDA by segment first:
do <- Data_Tibble %>%
group_by(Segment) %>%
summarize(sum_ebitda=sum(Adjusted_EBITDA, na.rm=TRUE))
## `summarise()` ungrouping output (override with `.groups` argument)
do
Visualizing the data in the table above:
ggplot(data=do, aes(x=Segment,y=sum_ebitda, fill=Segment))+geom_col()+geom_text(label=(do$sum_ebitda))+
ggtitle("Adjusted EBITDA by Segment")
Looks like all of the segments incurred a negative EBITDA (loss). This is rather surprising because we were expecting Uber Rides to incur a positive EBITDA. Unsurprisingly, Uber Rides is seems to be performing better than the other segments in terms of EBITDA, while Eats the worst. The latter is also surprising to me because Eats is probably Uber’s second most well-known platform.
However, it is also important to consider that the sample size for each segment is most likely not the same i.e. there maybe a significantly higher amount of bookings for eats than for freights, therefore the adjusted EBITDA should be relative. Unfortunately, we could not find the required data to compare the earnings in relative terms.
I would also like to see what the year over year trend has been for Uber’s EBITDA:
yr <- Data_Tibble %>%
group_by(Year) %>%
summarize(sum_ebitda=sum(Adjusted_EBITDA, na.rm=TRUE))
## `summarise()` ungrouping output (override with `.groups` argument)
yr
ggplot(data=yr, aes(x=Year,y=sum_ebitda))+geom_col()+ geom_text(label=(yr$sum_ebitda))+
ggtitle("Adjusted EBITDA by Year")
We will ignore 2020 for now as it only reported earnings for Q1. However, it seems like losses are increasing(earnings are falling) from 2018 to 2019. Let’s see how the earnings were broken down by quarter:
dg <- Data_Tibble %>%
group_by(Year,Quarter) %>%
summarize(sum_ebitda=sum(Adjusted_EBITDA, na.rm=TRUE))
## `summarise()` regrouping output by 'Year' (override with `.groups` argument)
dg
ggplot(data=dg, aes(x=Year,y=sum_ebitda,fill=Quarter))+geom_col()+
ggtitle("Adjusted EBITDA by Year")
It looks like the losses are increasing progressively by quarter-however Q2 of 2019 is an anomaly with the earnings suddenly going up to 0, and then goes back down in Q3 2019.
The original analysis proposed by Diego was to see if there is a yearly trend in adjusted EBITDA for Uber broken down by segment:
dt <- Data_Tibble %>%
group_by(Segment,Year) %>%
summarize(sum_ebitda=sum(Adjusted_EBITDA, na.rm=TRUE))
## `summarise()` regrouping output by 'Segment' (override with `.groups` argument)
dt
ggplot(dt,aes(x=Year,y=sum_ebitda,fill=Segment))+
geom_col()+
facet_wrap(~Segment)+labs(title="Adjusted EBITDA by Segment by Year")+
geom_text(label=(dt$sum_ebitda),size=2)
It looks like the yearly trend we observed before(increasing losses from 2018 to 2019) still holds for all segments except for Rides, where losses seem to be falling (earnings increasing). Once again we will have to look at the sample size or the number the transactions within this time frame for each sector to fully determining the accuracy of our insights.
According to our analysis, Uber’s adjusted EBITDA appears to be falling continuously (incurring losses) across all segments, except for Rides, from 2018 to 2019. However, although rides while earnings for rides seem to have increased from 2018-19, as of Q1 2020, Rides is still incurring losses.
Given that Uber is reputed to have significant market share of the ride-hailing industry, the findings of our exploratory analyses indicate that Uber has not been profitable since 2018. However, in this regard, it is also important to note that our analysis was limited to just one dataset - we think if we had the transactions data for each segment for the same time line, the results could have differed. Also we treated corporate G&A and platform R&D as an expense for the sake of our analysis as we could not find any literature explaining its inclusion in the dataset ( it is not one of the identified segments).
Moreover, it is also noteworthy that segment adjusted EBITDA excludes any non-cash items, transactions that are not indicative of ongoing segment operating performance or items that are not reflective of ongoing core operations.
As next steps, we recommend doing a deeper dive into why the segments are incurring losses with additional data sets from Uber that could further enhance the analysis.
Uber Technologies. (2020, September 23). Statista.
Uber Revenue and Usage Statistics (2020). (2020, August 29). Business of Apps. https://www.businessofapps.com/data/uber-statistics/
Uber Technologies. (2020, June 1). Uber 2019 Annual Report.