In spite of global efforts to transition towards sustainable energy solutions, crude oil retains its significance as a primary energy source in numerous nations and sectors.
Through refinement processes, crude oil yields a range of products including gasoline, diesel, kerosene, and heavy fuel oil, which are essential for powering transportation vehicles, aircraft, maritime vessels, machinery, and providing heating in select regions.
Notably, the United States occupies a prominent position as both the largest producer and consumer of crude oil on the global stage. This report will undertake an examination of crude oil imports specifically within the context of the United States.
The United States recorded its highest volume of oil imports in 2010, amounting to 24,395,476,000 barrels.
Over the period spanning from 2009 to 2024, there was a general decline in the quantity of oil imported into the United States.
Across the years from 2009 to 2023, January exhibited the highest average oil import volume, standing at 2,452,890 barrels, whereas February displayed the lowest average import volume at 2,214,200 barrels.
Throughout 2009 to 2023, the primary regions from which the United States imports oil include Canada, the Other Americas, and the Middle East.
While Africa, Canada, and the Other Americas were the top three sources of oil imports for the United States in 2009 and 2010, the Middle East supplanted Africa from 2011 onward, consistently appearing among the top three suppliers.
Notably, oil exports from the Other Americas to the United States experienced a general downward trajectory over the years, whereas exports from Canada exhibited an upward trend.
The leading destinations for oil within the United States are the United States itself, PADD3 (Gulf Coast), PADD2 (Midwest), Texas, and PADD5 (West Coast).
Predominantly, the United States imports oil of the Heavy Sour grade, with the grades imported ranked in descending order based on quantity: Heavy Sour, Medium, Light Sweet, Light Sour, and Heavy Sweet.
oildf <- read.csv("~/Case Studies/US Oil Imports Case Study/US Oil Imports/data.csv")
library('stringr')
library('dplyr')
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library('ggplot2')
library('patchwork')
# create a new data frame containing only rows where origintype name is 'region'
regiondf <- subset(oildf, originTypeName == "Region")
str(regiondf)
## 'data.frame': 124589 obs. of 8 variables:
## $ year : int 2009 2009 2009 2009 2009 2009 2009 2009 2009 2009 ...
## $ month : int 1 1 1 1 1 1 1 1 1 1 ...
## $ originName : chr "Africa" "Africa" "Africa" "Africa" ...
## $ originTypeName : chr "Region" "Region" "Region" "Region" ...
## $ destinationName : chr "SHELL OIL PRODUCTS US / ST ROSE / LA" "SHELL OIL PRODUCTS US / ST ROSE / LA" "OIL TANKING PL INC / HOUSTON (GULF) / TX" "EXXONMOBIL REFINING & SPLY CO / BEAUMONT / TX" ...
## $ destinationTypeName: chr "Refinery" "Refinery" "Refinery" "Refinery" ...
## $ gradeName : chr "Light Sweet" "Medium" "Light Sweet" "Light Sweet" ...
## $ quantity : int 665 2417 381 1501 474 1024 4249 1617 1076 274 ...
# insert '_' to column names
newcolnames <- gsub("(?<=[a-z])(?=[A-Z])", "_", colnames(regiondf), perl = TRUE)
colnames(regiondf) <- newcolnames
# change all column names to lower case
colnames(regiondf) <- tolower(colnames(regiondf))
# capitalize only the first letter of each word in destination name
regiondf$destination_name <- str_to_title(regiondf$destination_name)
# remove trailing spaces
regiondf <- regiondf %>%
mutate_at(vars(origin_name, origin_type_name, destination_name, destination_type_name, grade_name), trimws)
# convert month column from numeric month values to character representations
regiondf$month <- month.name[regiondf$month]
sum(duplicated(regiondf))
## [1] 0
sum(is.na(regiondf))
## [1] 0
yearly <- aggregate(quantity ~ year, regiondf, FUN = sum)
ggplot(yearly, aes(x = year, y = quantity)) +
geom_line(color = "grey25") +
geom_point(aes(color = factor(year)), show.legend = FALSE) +
labs(x = "Year", y = "Quantity ('000 barrels)") +
theme(plot.background = element_rect(fill = "ivory"),
panel.background = element_rect(fill = "ivory2"),
text = element_text(family = "Helvetica", size = 10),
axis.text = element_text(family = "Helvetica", size = 10),
axis.title = element_text(family = "Helvetica", size = 10))
# drop 2024 as data is incomplete
regiondf2 <- subset(regiondf, year != 2024)
yeardf <- split(regiondf2$quantity, regiondf2$year)
lapply(yeardf, summary)
## $`2009`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2 423 878 2360 1994 92737
##
## $`2010`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1 435 920 2496 2140 85796
##
## $`2011`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1 400 873 2385 1983 81939
##
## $`2012`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1 301 663 2254 1730 76845
##
## $`2013`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1 276 639 2162 1691 71053
##
## $`2014`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1 277 648 2231 1762 69005
##
## $`2015`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1 309 709 2262 1835 69488
##
## $`2016`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1 348 813 2427 2056 71114
##
## $`2017`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1 347 866 2511 2184 76280
##
## $`2018`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.0 346.5 796.0 2432.3 2049.0 89584.0
##
## $`2019`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1 349 822 2255 2001 87337
##
## $`2020`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1 348 723 2189 1768 82885
##
## $`2021`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1 399 804 2259 1874 92380
##
## $`2022`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2 397 862 2464 2064 92573
##
## $`2023`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1 391 907 2584 2248 93508
monthorder <- c("January", "February", "March", "April", "May", "June", "July",
"August", "September", "October", "November", "December")
regiondf2$month <- factor(regiondf2$month, levels = monthorder)
monthly <- aggregate(quantity ~ month, regiondf2, function(x) round(mean(x), 2))
ggplot(monthly, aes(x = month, y = quantity)) +
geom_line(group = 1, color = "grey25") +
geom_point(aes(color = factor(month)), show.legend = FALSE) +
labs(x = "Month", y = "Average Quantity ('000 barrels)") +
theme(plot.background = element_rect(fill = "ivory"),
panel.background = element_rect(fill = "ivory2"),
text = element_text(family = "Helvetica", size = 10),
axis.text = element_text(family = "Helvetica", size = 10),
axis.text.x = element_text(angle = 45, hjust = 1),
axis.title = element_text(family = "Helvetica", size = 10))
# top 3 oil export regions to the US of all time
t3rat <- regiondf2 %>%
group_by(origin_name) %>%
summarise(total_quantity = sum(quantity)) %>%
top_n(3, total_quantity)
origincolour <- c("thistle2", "cadetblue1", "hotpink1")
p1 <- ggplot(t3rat, aes(x = origin_name, y = total_quantity, fill = origincolour)) +
geom_bar(stat = "identity", colour = "grey23", size = 0.2) +
scale_fill_manual(values = origincolour) +
labs(x = "Origin", y = "Quantity ('000 barrels)", fill = "Origin Name") +
theme(plot.background = element_rect(fill = "ivory"),
panel.background = element_rect(fill = "ivory2"),
legend.background = element_rect(fill = "ivory"),
text = element_text("Helvetica", size = 10),
axis.text = element_text("Helvetica", size = 10),
axis.title = element_text(family = "Helvetica", size = 10))
# top 3 oil export regions to the US each year
t3ry <- regiondf2 %>%
group_by(year, origin_name) %>%
summarise(total_quantity = sum(quantity)) %>%
top_n(3, total_quantity) %>%
arrange(year, desc(total_quantity))
origincolour2 <- c("lemonchiffon","hotpink1","thistle2", "cadetblue1")
p2 <- ggplot(t3ry, aes(x = year, y = total_quantity, fill = origin_name)) +
scale_fill_manual(values = origincolour2) +
geom_bar(stat = "identity", color = "grey23", size = 0.2) +
labs(x = "Year", y = "Quantity ('000 barrels)", fill = "Origin Name") +
theme(plot.background = element_rect(fill = "ivory"),
panel.background = element_rect(fill = "ivory2"),
legend.background = element_rect(fill = "ivory"),
text = element_text(family = "Helvetica", size = 10),
axis.text = element_text(family = "Helvetica", size = 10),
axis.title = element_text(family = "Helvetica", size = 10))
combined1 <- p1 + p2 + plot_layout(ncol = 2)
combined1
# top 5 oil destination in the US of all time
t5dat <- regiondf2 %>%
group_by(destination_name) %>%
summarise(total_quantity = sum(quantity)) %>%
top_n(5, total_quantity) %>%
arrange(desc(total_quantity))
destinationcolours <- c("violetred2", "lavenderblush1", "darkseagreen1", "darkslategray2", "orchid")
p3 <- ggplot(t5dat, aes(x = destination_name, y = total_quantity, fill = destination_name)) +
scale_fill_manual(values = destinationcolours) +
geom_bar(stat = "identity", color = "grey23", size = 0.2) +
labs(x = "Destination Name",y = "Quantity ('000 barrels)", fill = "Destination Name") +
theme(plot.background = element_rect(fill = "ivory"),
panel.background = element_rect(fill = "ivory2"),
legend.background = element_rect(fill = "ivory"),
text = element_text(family = "Helvetica", size = 10),
axis.text = element_text(family = "Helvetica", size = 10),
axis.text.x = element_text(angle = 25, hjust = 1),
axis.title = element_text(family = "Helvetica", size = 10))
# top 5 oil destination in the US each year
t5dy <- regiondf2 %>%
group_by(year, destination_name) %>%
summarise(total_quantity = sum(quantity)) %>%
top_n(5, total_quantity) %>%
arrange(year, desc(total_quantity))
destinationcolours2 <- c("lightgoldenrod","lightcoral","violetred2", "lavenderblush1", "darkseagreen1", "darkslategray2", "orchid")
p4 <- ggplot(t5dy, aes(x = year, y = total_quantity, fill = destination_name)) +
geom_bar(stat = "identity", color = "grey23", size = 0.2) +
scale_fill_manual(values = destinationcolours2) +
labs(x = "Year", y = "Quantity ('000 barrels)", fill = "Destination Name") +
theme(plot.background = element_rect(fill = "ivory"),
panel.background = element_rect(fill = "ivory2"),
legend.background = element_rect(fill = "ivory"),
text = element_text(family = "Helvetica", size = 10),
axis.text = element_text(family = "Helvetica", size = 10),
axis.title = element_text(family = "Helvetica", size = 10))
combined2 <- p3 + p4 + plot_layout(ncol = 2)
combined2
# distribution of import quantities by oil grade
graderank <- regiondf2 %>%
group_by(grade_name) %>%
summarise(total_quantity = sum(quantity)) %>%
arrange(desc(total_quantity))
gradecolours <- c("cyan2", "aquamarine1", "slategray2","thistle1", "palevioletred")
ggplot(graderank, aes(x = grade_name, y = total_quantity, fill = grade_name)) +
geom_bar(stat = "identity", color = "grey23", size = 0.2, show.legend = FALSE) +
scale_fill_manual(values = gradecolours) +
labs(x = "Grade Name", y = "Quantity ('000 barrels)") +
theme(plot.background = element_rect(fill = "ivory"),
panel.background = element_rect(fill = "ivory2"),
text = element_text(family = "Helvetica", size = 10),
axis.title = element_text(family = "Helvetica", size = 10),
axis.text = element_text(family = "Helvetica", size = 10))
t5dtgat <- subset(regiondf2, destination_name %in% c("United States", "Padd3 (Gulf Coast)",
"Padd2 (Midwest)", "Texas", "Padd5 (West Coast)"))
t5dtgat <- t5dtgat %>%
group_by(destination_name, grade_name) %>%
summarise(total_quantity = sum(quantity)) %>%
arrange(desc(total_quantity))
ggplot(t5dtgat, aes(x = destination_name, y = total_quantity, fill = grade_name)) +
geom_bar(stat = "identity", color = "grey23", size = 0.2) +
scale_fill_manual(values = gradecolours) +
labs(x = "Destination Name", y = "Quantity ('000 barrels)", fill = "Grade") +
theme(plot.background = element_rect(fill = "ivory"),
panel.background = element_rect(fill = "ivory2"),
legend.background = element_rect(fill = 'ivory'),
text = element_text(family = "Helvetica", size = 10),
axis.title = element_text(family = "Helvetica", size = 10),
axis.text = element_text(family = "Helvetica", size = 10),
axis.text.x = element_text(angle = 25, hjust = 1))
In conclusion, the analysis elucidates the dynamics of the United States’ oil trade and consumption patterns. While crude oil remains a fundamental energy source, there is a discernible decline in imports, reflecting global shifts towards greener energy alternatives such as wind, solar, bioenergy, and hydroelectric power. The consistent seasonal fluctuation in import volumes underscores the necessity of considering temporal factors in energy planning and market analysis. Understanding the preferences for specific oil grades and distribution patterns within the United States offers critical insights for stakeholders to adapt to evolving market conditions and make informed decisions regarding energy security and sustainability. These findings provide a solid foundation for policymakers, industry participants, and investors to effectively navigate the intricate landscape of the global energy transition.