Problem statement
Due to all the lockdowns, travel limits, and less industrial activity, the COVID-19 pandemic has had a big impact on the world economy. As a result, there has been a sharp drop in the demand for oil and gas. Because of this drop in demand, market volatility, and problems in the supply chain, oil prices have been all over the place. Consumers, companies, and governments all over the world have been affected by these changes, which have led to financial stress and economic uncertainty.
Solutions
- Economic Impact: Fluctuating oil prices directly affect household spending, business activity, and inflation, so it is important to find solutions to stabilize prices and minimize economic risks.
- Supply Chain Disruption: The pandemic has exposed vulnerabilities in global supply chains. Analyzing oil prices can provide insights to improve the resilience of energy supply chains in future crises.
- Energy Policy: Governments need data-driven strategies to manage energy resources, establish policies that ensure price stability, support economic recovery, and promote long-term sustainability.
Execution, Interpretation and Results
Load the required libraries
library(readxl)
library(tidyr)
library(car)
## Warning: package 'car' was built under R version 4.4.2
## Loading required package: carData
## Warning: package 'carData' was built under R version 4.4.2
library(corrplot)
## Warning: package 'corrplot' was built under R version 4.4.2
## corrplot 0.95 loaded
Import the data
data <- read_excel("10326_retail_fuel_prices_05-16-24.xlsx")
## New names:
## • `` -> `...2`
## • `` -> `...3`
## • `` -> `...4`
## • `` -> `...5`
## • `` -> `...6`
## • `` -> `...7`
## • `` -> `...8`
## • `` -> `...9`
data
## # A tibble: 101 × 9
## Average Retail Fuel Prices …¹ ...2 ...3 ...4 ...5 ...6 ...7 ...8 ...9
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 Report Date Gaso… E85 CNG LNG Prop… Dies… B20 B99/…
## 2 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 3 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 4 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 5 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 6 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 7 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 8 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 9 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 10 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## # ℹ 91 more rows
## # ℹ abbreviated name:
## # ¹`Average Retail Fuel Prices in the United States (in Gasoline Gallon Equivalents, GGEs)`
Select columns 2 to 9
df <- data[, 2:9]
df
## # A tibble: 101 × 8
## ...2 ...3 ...4 ...5 ...6 ...7 ...8 ...9
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 Gasoline E85 CNG LNG Propane* Diesel B20 B99/B100
## 2 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 3 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 4 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 5 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 6 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 7 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 8 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 9 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 10 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## # ℹ 91 more rows
Convert from character to numeric
df <- as.data.frame(lapply(df, function(x) as.numeric(x)))
## Warning in FUN(X[[i]], ...): NAs introduced by coercion
## Warning in FUN(X[[i]], ...): NAs introduced by coercion
## Warning in FUN(X[[i]], ...): NAs introduced by coercion
## Warning in FUN(X[[i]], ...): NAs introduced by coercion
## Warning in FUN(X[[i]], ...): NAs introduced by coercion
## Warning in FUN(X[[i]], ...): NAs introduced by coercion
## Warning in FUN(X[[i]], ...): NAs introduced by coercion
## Warning in FUN(X[[i]], ...): NAs introduced by coercion
df
## ...2 ...3 ...4 ...5 ...6 ...7 ...8 ...9
## 1 NA NA NA NA NA NA NA NA
## 2 NA NA NA NA NA NA NA NA
## 3 NA NA NA NA NA NA NA NA
## 4 NA NA NA NA NA NA NA NA
## 5 NA NA NA NA NA NA NA NA
## 6 NA NA NA NA NA NA NA NA
## 7 NA NA NA NA NA NA NA NA
## 8 NA NA NA NA NA NA NA NA
## 9 NA NA NA NA NA NA NA NA
## 10 NA NA NA NA NA NA NA NA
## 11 NA NA NA NA NA NA NA NA
## 12 NA NA NA NA NA NA NA NA
## 13 NA NA NA NA NA NA NA NA
## 14 NA NA NA NA NA NA NA NA
## 15 NA NA NA NA NA NA NA NA
## 16 NA NA NA NA NA NA NA NA
## 17 NA NA NA NA NA NA NA NA
## 18 NA NA NA NA NA NA NA NA
## 19 NA NA NA NA NA NA NA NA
## 20 NA NA NA NA NA NA NA NA
## 21 NA NA NA NA NA NA NA NA
## 22 NA NA NA NA NA NA NA NA
## 23 NA NA NA NA NA NA NA NA
## 24 NA NA NA NA NA NA NA NA
## 25 NA NA NA NA NA NA NA NA
## 26 NA NA NA NA NA NA NA NA
## 27 NA NA NA NA NA NA NA NA
## 28 NA NA NA NA NA NA NA NA
## 29 NA NA NA NA NA NA NA NA
## 30 NA NA NA NA NA NA NA NA
## 31 NA NA NA NA NA NA NA NA
## 32 NA NA NA NA NA NA NA NA
## 33 NA NA NA NA NA NA NA NA
## 34 NA NA NA NA NA NA NA NA
## 35 NA NA NA NA NA NA NA NA
## 36 NA NA NA NA NA NA NA NA
## 37 NA NA NA NA NA NA NA NA
## 38 NA NA NA NA NA NA NA NA
## 39 NA NA NA NA NA NA NA NA
## 40 NA NA NA NA NA NA NA NA
## 41 NA NA NA NA NA NA NA NA
## 42 NA NA NA NA NA NA NA NA
## 43 NA NA NA NA NA NA NA NA
## 44 NA NA NA NA NA NA NA NA
## 45 NA NA NA NA NA NA NA NA
## 46 NA NA NA NA NA NA NA NA
## 47 NA NA NA NA NA NA NA NA
## 48 NA NA NA NA NA NA NA NA
## 49 NA NA NA NA NA NA NA NA
## 50 NA NA NA NA NA NA NA NA
## 51 NA NA NA NA NA NA NA NA
## 52 NA NA NA NA NA NA NA NA
## 53 NA NA NA NA NA NA NA NA
## 54 NA NA NA NA NA NA NA NA
## 55 NA NA NA NA NA NA NA NA
## 56 NA NA NA NA NA NA NA NA
## 57 NA NA NA NA NA NA NA NA
## 58 NA NA NA NA NA NA NA NA
## 59 NA NA NA NA NA NA NA NA
## 60 NA NA NA NA NA NA NA NA
## 61 NA NA NA NA NA NA NA NA
## 62 NA NA NA NA NA NA NA NA
## 63 2.50 2.68 2.17 2.66 3.88 2.63 2.55 3.41
## 64 2.67 2.87 2.18 2.57 3.87 2.70 2.59 3.39
## 65 2.88 3.05 2.22 2.60 3.85 2.89 2.75 3.48
## 66 2.91 3.10 2.19 2.64 3.93 2.99 2.78 3.57
## 67 2.27 2.59 2.19 2.71 3.99 2.65 2.52 3.50
## 68 2.76 3.00 2.22 2.38 3.97 2.75 2.59 3.44
## 69 2.76 3.06 2.21 2.46 3.87 2.71 2.58 3.55
## 70 2.68 2.97 2.20 2.40 3.79 2.74 2.58 3.65
## 71 2.59 2.96 2.18 2.47 3.82 2.71 2.60 3.65
## 72 1.91 2.28 2.19 2.43 3.74 2.33 2.13 3.44
## 73 2.22 2.58 2.15 2.40 3.75 2.20 2.11 3.08
## 74 2.18 2.54 2.18 2.42 3.74 2.13 2.06 3.26
## 75 2.32 2.65 2.19 2.45 3.90 2.35 2.18 3.11
## 76 2.82 3.12 2.19 2.39 4.01 2.77 2.53 3.49
## 77 3.09 3.40 2.22 2.54 4.08 2.90 2.74 3.56
## 78 3.25 3.55 2.33 2.45 4.34 3.10 2.96 3.73
## 79 3.28 3.87 2.49 2.69 4.69 3.22 3.08 3.88
## 80 4.13 4.60 2.59 2.82 4.83 4.50 4.16 4.96
## 81 4.70 5.10 2.76 3.15 5.19 5.02 4.80 5.48
## 82 4.05 4.13 2.88 3.23 4.86 4.60 4.40 5.15
## 83 3.31 3.60 3.25 4.23 5.02 4.08 4.01 5.11
## 84 3.69 3.88 2.99 4.02 4.98 3.78 3.62 4.85
## 85 3.59 3.84 2.86 3.38 4.46 3.45 3.40 4.44
## 86 3.72 3.96 2.85 3.49 4.51 4.02 3.98 4.76
## 87 3.06 3.32 2.95 3.34 4.78 3.51 3.45 4.60
## 88 3.65 3.85 2.90 3.43 4.72 3.62 3.55 4.48
## 89 NA NA NA NA NA NA NA NA
## 90 NA NA NA NA NA NA NA NA
## 91 NA NA NA NA NA NA NA NA
## 92 NA NA NA NA NA NA NA NA
## 93 NA NA NA NA NA NA NA NA
## 94 NA NA NA NA NA NA NA NA
## 95 NA NA NA NA NA NA NA NA
## 96 NA NA NA NA NA NA NA NA
## 97 NA NA NA NA NA NA NA NA
## 98 NA NA NA NA NA NA NA NA
## 99 NA NA NA NA NA NA NA NA
## 100 NA NA NA NA NA NA NA NA
## 101 NA NA NA NA NA NA NA NA
Clean the data
df_clean <- na.omit(df)
df_clean
## ...2 ...3 ...4 ...5 ...6 ...7 ...8 ...9
## 63 2.50 2.68 2.17 2.66 3.88 2.63 2.55 3.41
## 64 2.67 2.87 2.18 2.57 3.87 2.70 2.59 3.39
## 65 2.88 3.05 2.22 2.60 3.85 2.89 2.75 3.48
## 66 2.91 3.10 2.19 2.64 3.93 2.99 2.78 3.57
## 67 2.27 2.59 2.19 2.71 3.99 2.65 2.52 3.50
## 68 2.76 3.00 2.22 2.38 3.97 2.75 2.59 3.44
## 69 2.76 3.06 2.21 2.46 3.87 2.71 2.58 3.55
## 70 2.68 2.97 2.20 2.40 3.79 2.74 2.58 3.65
## 71 2.59 2.96 2.18 2.47 3.82 2.71 2.60 3.65
## 72 1.91 2.28 2.19 2.43 3.74 2.33 2.13 3.44
## 73 2.22 2.58 2.15 2.40 3.75 2.20 2.11 3.08
## 74 2.18 2.54 2.18 2.42 3.74 2.13 2.06 3.26
## 75 2.32 2.65 2.19 2.45 3.90 2.35 2.18 3.11
## 76 2.82 3.12 2.19 2.39 4.01 2.77 2.53 3.49
## 77 3.09 3.40 2.22 2.54 4.08 2.90 2.74 3.56
## 78 3.25 3.55 2.33 2.45 4.34 3.10 2.96 3.73
## 79 3.28 3.87 2.49 2.69 4.69 3.22 3.08 3.88
## 80 4.13 4.60 2.59 2.82 4.83 4.50 4.16 4.96
## 81 4.70 5.10 2.76 3.15 5.19 5.02 4.80 5.48
## 82 4.05 4.13 2.88 3.23 4.86 4.60 4.40 5.15
## 83 3.31 3.60 3.25 4.23 5.02 4.08 4.01 5.11
## 84 3.69 3.88 2.99 4.02 4.98 3.78 3.62 4.85
## 85 3.59 3.84 2.86 3.38 4.46 3.45 3.40 4.44
## 86 3.72 3.96 2.85 3.49 4.51 4.02 3.98 4.76
## 87 3.06 3.32 2.95 3.34 4.78 3.51 3.45 4.60
## 88 3.65 3.85 2.90 3.43 4.72 3.62 3.55 4.48
Rename the columns
colnames(df_clean) <- c("GASOLINE", "E85", "CNG", "LNG", "PROPANE","DIESEL", "B20", "B99/B100")
df_clean
## GASOLINE E85 CNG LNG PROPANE DIESEL B20 B99/B100
## 63 2.50 2.68 2.17 2.66 3.88 2.63 2.55 3.41
## 64 2.67 2.87 2.18 2.57 3.87 2.70 2.59 3.39
## 65 2.88 3.05 2.22 2.60 3.85 2.89 2.75 3.48
## 66 2.91 3.10 2.19 2.64 3.93 2.99 2.78 3.57
## 67 2.27 2.59 2.19 2.71 3.99 2.65 2.52 3.50
## 68 2.76 3.00 2.22 2.38 3.97 2.75 2.59 3.44
## 69 2.76 3.06 2.21 2.46 3.87 2.71 2.58 3.55
## 70 2.68 2.97 2.20 2.40 3.79 2.74 2.58 3.65
## 71 2.59 2.96 2.18 2.47 3.82 2.71 2.60 3.65
## 72 1.91 2.28 2.19 2.43 3.74 2.33 2.13 3.44
## 73 2.22 2.58 2.15 2.40 3.75 2.20 2.11 3.08
## 74 2.18 2.54 2.18 2.42 3.74 2.13 2.06 3.26
## 75 2.32 2.65 2.19 2.45 3.90 2.35 2.18 3.11
## 76 2.82 3.12 2.19 2.39 4.01 2.77 2.53 3.49
## 77 3.09 3.40 2.22 2.54 4.08 2.90 2.74 3.56
## 78 3.25 3.55 2.33 2.45 4.34 3.10 2.96 3.73
## 79 3.28 3.87 2.49 2.69 4.69 3.22 3.08 3.88
## 80 4.13 4.60 2.59 2.82 4.83 4.50 4.16 4.96
## 81 4.70 5.10 2.76 3.15 5.19 5.02 4.80 5.48
## 82 4.05 4.13 2.88 3.23 4.86 4.60 4.40 5.15
## 83 3.31 3.60 3.25 4.23 5.02 4.08 4.01 5.11
## 84 3.69 3.88 2.99 4.02 4.98 3.78 3.62 4.85
## 85 3.59 3.84 2.86 3.38 4.46 3.45 3.40 4.44
## 86 3.72 3.96 2.85 3.49 4.51 4.02 3.98 4.76
## 87 3.06 3.32 2.95 3.34 4.78 3.51 3.45 4.60
## 88 3.65 3.85 2.90 3.43 4.72 3.62 3.55 4.48
Summarize the data
summary(df_clean)
## GASOLINE E85 CNG LNG
## Min. :1.910 Min. :2.280 Min. :2.150 Min. :2.380
## 1st Qu.:2.610 1st Qu.:2.893 1st Qu.:2.190 1st Qu.:2.450
## Median :2.895 Median :3.110 Median :2.220 Median :2.620
## Mean :3.038 Mean :3.329 Mean :2.451 Mean :2.837
## 3rd Qu.:3.520 3rd Qu.:3.848 3rd Qu.:2.828 3rd Qu.:3.210
## Max. :4.700 Max. :5.100 Max. :3.250 Max. :4.230
## PROPANE DIESEL B20 B99/B100
## Min. :3.740 Min. :2.130 Min. :2.060 Min. :3.080
## 1st Qu.:3.870 1st Qu.:2.703 1st Qu.:2.558 1st Qu.:3.450
## Median :4.000 Median :2.895 Median :2.745 Median :3.610
## Mean :4.253 Mean :3.167 Mean :3.027 Mean :3.962
## 3rd Qu.:4.713 3rd Qu.:3.592 3rd Qu.:3.525 3rd Qu.:4.570
## Max. :5.190 Max. :5.020 Max. :4.800 Max. :5.480
Data visualization: boxplot
boxplot(df_clean,
main = "Fuel types",
ylab = "price (USD)",
col = "orange",
las = 2)

Compute ANOVA test statistic
anova <- aov(Price_in_USD ~ Fuel_types, data = pl_data)
summary(anova)
## Df Sum Sq Mean Sq F value Pr(>F)
## Fuel_types 7 63.16 9.022 22.28 <2e-16 ***
## Residuals 200 80.99 0.405
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Interpretation: p_value (0.00) < alpha(0.05), we reject H0 and conclude that there is a statistically significant difference in the mean prices of the different fuel types.
Pairwise comparison:
pairwise.t.test(pl_data$Price_in_USD, pl_data$Fuel_types, p.adj = "bonferroni")
##
## Pairwise comparisons using t tests with pooled SD
##
## data: pl_data$Price_in_USD and pl_data$Fuel_types
##
## B20 B99/B100 CNG DIESEL E85 GASOLINE LNG
## B99/B100 8.5e-06 - - - - - -
## CNG 0.03639 8.2e-14 - - - - -
## DIESEL 1.00000 0.00032 0.00199 - - - -
## E85 1.00000 0.01168 4.0e-05 1.00000 - - -
## GASOLINE 1.00000 1.2e-05 0.02940 1.00000 1.00000 - -
## LNG 1.00000 3.4e-08 0.84443 1.00000 0.16216 1.00000 -
## PROPANE 1.5e-09 1.00000 < 2e-16 1.2e-07 1.2e-05 2.1e-09 2.4e-12
##
## P value adjustment method: bonferroni
Interpretation:
- B20 vs B99/B100: p-value (0.00), which is less than alpha (0.05);
- B20 vs CNG: p-value (0.04), which is less than alpha (0.05);
- B20 vs DIESEL: p-value (1.00), which is greater than alpha (0.05);
- B99/B100 vs CNG: p-value ( 0.00), which is less than alpha (0.05);
- DIESEL vs E85: p-value (1.00), which is greater than alpha (0.05);
There are significant differences in the prices of certain fuel types. Specifically, there is a statistically significant difference between B20 and B99/B100, between B20 and CNG, and between B99/B100 vs CNG. However, no significant difference was found between B20 and DIESEL or between DIESEL and E85.
Compute the correlation matrix
cor_matrix <- cor(df_clean)
cor_matrix
## GASOLINE E85 CNG LNG PROPANE DIESEL B20
## GASOLINE 1.0000000 0.9881192 0.7286583 0.6140873 0.8753738 0.9526024 0.9480207
## E85 0.9881192 1.0000000 0.6997831 0.5694837 0.8786178 0.9384997 0.9302512
## CNG 0.7286583 0.6997831 1.0000000 0.9494823 0.9026420 0.8043824 0.8378527
## LNG 0.6140873 0.5694837 0.9494823 1.0000000 0.8088951 0.7041987 0.7426770
## PROPANE 0.8753738 0.8786178 0.9026420 0.8088951 1.0000000 0.9153619 0.9235126
## DIESEL 0.9526024 0.9384997 0.8043824 0.7041987 0.9153619 1.0000000 0.9961120
## B20 0.9480207 0.9302512 0.8378527 0.7426770 0.9235126 0.9961120 1.0000000
## B99/B100 0.8916978 0.8762823 0.9035662 0.8205770 0.9373394 0.9668667 0.9751682
## B99/B100
## GASOLINE 0.8916978
## E85 0.8762823
## CNG 0.9035662
## LNG 0.8205770
## PROPANE 0.9373394
## DIESEL 0.9668667
## B20 0.9751682
## B99/B100 1.0000000
Correlation analysis: The correlation values are relatively high across fuel types, indicating that prices for different types of fuel are not fully independent. Which suggests that fuel prices are influenced by similar external factors, leading to correlated price changes across types of fuel.
Conclusions
Our analysis demonstrated that there are significant differences in the average gas prices between certain fuel types across pre, during and post COVID periods.
The ANOVA test results confirm that gas prices have shifted significantly highlighting the influence of unprecedented global events on fuel markets.
STATISTICALLY SIGNIFICANT DIFFERENCES:
- Price disparities are notable between: B20 vs. B99/B100, B99/B100 vs. CNG and B20 vs. CNG.
- B20, B99/B100, and CNG, showed greater volatility in response to market changes. This suggests these fuels are more sensitive to external factors like supply chain issues, demand fluctuations, and economic policies.
NON-SIGNIFICANT DIFFERENCES:
- No significant price difference was found between B20 and Diesel or between Diesel and E85.
- Fuel types such as Diesel and E85 displayed more stable prices with fewer significant differences across time periods. This could indicate these conventional fuels are less influenced by the same market conditions, possibly due to established infrastructure or higher demand stability.
Recommendation
CONTINUOUS MONITORING AND FACTOR ANALYSIS:
Set up ongoing data tracking for fuel prices across all types to capture shifts that may be driven by external economic changes. Focus on the impact on global supply chain disruptions, geopolitical events, and regulatory changes on each specific fuel type to gain targeted insights.
POLICY INTERVENTIONS:
Advocate for regulatory measures to stabilize prices, especially for more volatile fuel types (e.g., B20, CNG). Possible interventions could include subsidies, incentives for alternative fuels, or strategic reserves.
CROSS-INDUSTRY COLLABORATION:
Engage with industry stakeholders to share findings and support industry-wide strategies aimed at mitigating price volatility, particularly for transportation and logistics sectors heavily reliant on specific fuels.