import price data

spotPrice <- read.csv("D:/OilMarketAnalysis/spotPrice.csv")
cannot open file 'D:/OilMarketAnalysis/spotPrice.csv': No such file or directory
Error in file(file, "rt") : cannot open the connection
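The import fails because the file is not at that path on this machine. A minimal sketch of guarding the read, assuming a hypothetical helper `read_csv_checked` (not part of the original notebook) and a temporary file standing in for spotPrice.csv:

```r
# Hypothetical helper: stop with a clear message when the CSV is missing.
read_csv_checked <- function(path) {
  if (!file.exists(path)) {
    stop("File not found: ", path, " (working directory: ", getwd(), ")")
  }
  read.csv(path, stringsAsFactors = FALSE)
}

# Illustration only: a temporary file with made-up values.
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(Day = c("01/15/2020", "02/15/2020"), Price = c(57.5, 50.5)),
          tmp, row.names = FALSE)
demoPrice <- read_csv_checked(tmp)
```

Printing the working directory in the error makes it easier to tell whether a relative path or a missing drive is the problem.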

put the price data into a data frame and convert the date column to the Date type

#simplify date column 
priceData <- edit(priceData)
class discarded from column ‘as.Date.spotPrice.Day....m..d..Y..’
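The message above comes from edit(), and it suggests the Date class on that column may be lost after editing. A sketch of an alternative that names the columns directly, so no manual renaming is needed. The toy frame `spotDemo` and its values are made up; the "%m/%d/%Y" format is inferred from the generated column name and is an assumption:

```r
# Toy stand-in for the imported price file; names and values are illustrative.
spotDemo <- data.frame(Day = c("01/15/2020", "02/15/2020", "03/15/2020"),
                       Price = c(57.52, 50.54, 29.21))

# Build the data frame with short, explicit column names up front.
priceDemo <- data.frame(Date = as.Date(spotDemo$Day, format = "%m/%d/%Y"),
                        Price = spotDemo$Price)

# Renaming an existing column with names() keeps the Date class intact.
names(priceDemo)[names(priceDemo) == "Price"] <- "SpotPrice"
```

The same pattern would apply to the production data frame further down.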

plot price data

import production data

oilProduction <- read.csv("~/oilProduction.csv")

convert date to proper data type and save in a dataframe and clean data

#I need to make the date variable column name simpler for later use
productionData <- edit(productionData)
class discarded from column ‘as.Date.oilProduction.Date....m..d..Y...1.467.’

plot production data

this data looks odd. I want to check for missing or infinite values

any(is.na(productionData$oilProduction.Oklahoma.Field.Production.of.Crude.Oil..Thousand.Barrels..1.467.) | is.infinite(productionData$oilProduction.Oklahoma.Field.Production.of.Crude.Oil..Thousand.Barrels..1.467.))
[1] FALSE

there appear to be no missing or infinite values
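A shorter way to run the same check, sketched on a made-up vector (`productionDemo` is illustrative, not the real series):

```r
# Illustrative production values (thousand barrels); not taken from the real data.
productionDemo <- c(14500, 15210, NA, 14980, Inf, 15800)

# is.finite() is FALSE for NA, NaN, Inf and -Inf, so one call covers both checks.
hasBadValues <- any(!is.finite(productionDemo))
badPositions <- which(!is.finite(productionDemo))

# summary() gives a quick look at the range and quartiles for spotting outliers.
summary(productionDemo[is.finite(productionDemo)])
```

`which()` reports where the bad values are, which helps when the answer is not FALSE.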

I want to match prices with output

head(priceData)
head(productionData)

#i will join my data frames on the date column
library(dplyr)


oilDataBase <- inner_join(productionData, priceData)
Joining, by = "Date"
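The join key was picked automatically here. With dplyr it can be stated explicitly as `inner_join(productionData, priceData, by = "Date")`. The sketch below shows what an inner join keeps, using base R `merge()` as a stand-in so it runs without extra packages; the toy frames and values are made up:

```r
# Toy frames; the dates and numbers are made up for illustration.
productionDemo <- data.frame(Date = as.Date(c("2020-01-15", "2020-02-15", "2020-03-15")),
                             Production = c(17800, 16900, 18100))
priceDemo <- data.frame(Date = as.Date(c("2020-02-15", "2020-03-15", "2020-04-15")),
                        Price = c(50.54, 29.21, 16.55))

# Inner join: keep only the dates that appear in both frames.
joinedDemo <- merge(productionDemo, priceDemo, by = "Date")
```

Only February and March survive, because January has no price and April has no production in the toy data.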
#just price and quantity
marketOverview <- oilDataBase[order(oilDataBase$oilProduction.Oklahoma.Field.Production.of.Crude.Oil..Thousand.Barrels..1.467.),]

View(marketOverview)
plot(marketOverview$oilProduction.Oklahoma.Field.Production.of.Crude.Oil..Thousand.Barrels..1.467.[1:284], marketOverview$spotPrice.Cushing.OK.WTI.Spot.Price.FOB..Dollars.per.Barrel[1:284], type = "l", xlab = "Production (Thousands of barrels)", ylab = "Price (Dollars)", main = "Quantity Supplied and Price of Oil in Oklahoma", col = "blue")



View(oilDataBase)

model <- lm(formula = oilDataBase$spotPrice.Cushing.OK.WTI.Spot.Price.FOB..Dollars.per.Barrel ~ oilDataBase$oilProduction.Oklahoma.Field.Production.of.Crude.Oil..Thousand.Barrels..1.467.)

plot(oilDataBase$oilProduction.Oklahoma.Field.Production.of.Crude.Oil..Thousand.Barrels..1.467., oilDataBase$spotPrice.Cushing.OK.WTI.Spot.Price.FOB..Dollars.per.Barrel)
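The long column names make the model call and its output hard to read. A sketch of the same kind of fit with short names and the `data =` argument; `oilDemo` and its values are synthetic and only illustrate the call:

```r
set.seed(1)
# Synthetic data with short column names; values are illustrative only.
oilDemo <- data.frame(Production = seq(5000, 15000, length.out = 50))
oilDemo$Price <- 40 + 0.0004 * oilDemo$Production + rnorm(50, sd = 5)

# Passing data = keeps the formula short and the coefficient labels readable.
demoModel <- lm(Price ~ Production, data = oilDemo)
coef(demoModel)
```

With this form the coefficient table is labelled `Production` instead of the full `oilDataBase$...` expression.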


model

Call:
lm(formula = oilDataBase$spotPrice.Cushing.OK.WTI.Spot.Price.FOB..Dollars.per.Barrel ~ 
    oilDataBase$oilProduction.Oklahoma.Field.Production.of.Crude.Oil..Thousand.Barrels..1.467.)

Coefficients:
                                                                               (Intercept)  
                                                                                 4.085e+01  
oilDataBase$oilProduction.Oklahoma.Field.Production.of.Crude.Oil..Thousand.Barrels..1.467.  
                                                                                 3.644e-04  
plot(model)


bfl <- (3.644e-04 * oilDataBase$oilProduction.Oklahoma.Field.Production.of.Crude.Oil..Thousand.Barrels..1.467.) + 4.085e+01

demand <- (((1/3.644e-04) * oilDataBase$oilProduction.Oklahoma.Field.Production.of.Crude.Oil..Thousand.Barrels..1.467.) + 4.085e+01)


plot(oilDataBase$oilProduction.Oklahoma.Field.Production.of.Crude.Oil..Thousand.Barrels..1.467., bfl, type = "l", xlab = "Production (Thousands of barrels)", main = "Linear Model of Oil Supply in Oklahoma", ylab = "Price (Dollars)", col = "red") 
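The best-fit line above is built from coefficients typed in by hand. A sketch of pulling them from the fitted model instead, so the line stays in sync if the data change; `oilDemo` and `demoModel` are synthetic stand-ins:

```r
set.seed(2)
# Synthetic stand-in data; values are illustrative only.
oilDemo <- data.frame(Production = seq(5000, 15000, length.out = 40))
oilDemo$Price <- 40 + 0.0004 * oilDemo$Production + rnorm(40, sd = 5)
demoModel <- lm(Price ~ Production, data = oilDemo)

# Fitted values computed from the stored coefficients, not retyped numbers.
b <- coef(demoModel)
lineByHand <- b[["(Intercept)"]] + b[["Production"]] * oilDemo$Production
lineFromPredict <- predict(demoModel, newdata = oilDemo)

# Scatter plot with the fitted line drawn straight from the model object.
plot(oilDemo$Production, oilDemo$Price,
     xlab = "Production (Thousands of barrels)", ylab = "Price (Dollars)")
abline(demoModel, col = "red")
```

`predict()` and the hand-built line give the same values; `abline()` saves building the line at all.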

The fitted slope is positive, so higher prices appear to go with higher oil production in general, but recent technological change makes it hard to pin down a realistic supply curve: production is rising while prices are falling. The market currently appears oversupplied.

summary(model)

Call:
lm(formula = oilDataBase$spotPrice.Cushing.OK.WTI.Spot.Price.FOB..Dollars.per.Barrel ~ 
    oilDataBase$oilProduction.Oklahoma.Field.Production.of.Crude.Oil..Thousand.Barrels..1.467.)

Residuals:
   Min     1Q Median     3Q    Max 
-33.51 -24.05 -12.20  17.41  95.75 

Coefficients:
                                                                                            Estimate
(Intercept)                                                                                4.085e+01
oilDataBase$oilProduction.Oklahoma.Field.Production.of.Crude.Oil..Thousand.Barrels..1.467. 3.644e-04
                                                                                           Std. Error
(Intercept)                                                                                 4.661e+00
oilDataBase$oilProduction.Oklahoma.Field.Production.of.Crude.Oil..Thousand.Barrels..1.467.  5.015e-04
                                                                                           t value
(Intercept)                                                                                  8.765
oilDataBase$oilProduction.Oklahoma.Field.Production.of.Crude.Oil..Thousand.Barrels..1.467.   0.726
                                                                                           Pr(>|t|)
(Intercept)                                                                                  <2e-16
oilDataBase$oilProduction.Oklahoma.Field.Production.of.Crude.Oil..Thousand.Barrels..1.467.    0.468
                                                                                              
(Intercept)                                                                                ***
oilDataBase$oilProduction.Oklahoma.Field.Production.of.Crude.Oil..Thousand.Barrels..1.467.    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 29.01 on 282 degrees of freedom
Multiple R-squared:  0.001868,  Adjusted R-squared:  -0.001671 
F-statistic: 0.5278 on 1 and 282 DF,  p-value: 0.4681

the model is not significant at the 5% level (p-value 0.4681, R-squared 0.001868).
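The numbers in the summary can be checked by hand. A small sketch using only the estimate, standard error and degrees of freedom printed above:

```r
# Values copied from the summary(model) output above.
estimate <- 3.644e-04
stdError <- 5.015e-04
dfResidual <- 282

# t statistic is the estimate divided by its standard error.
tValue <- estimate / stdError

# Two-sided p-value from the t distribution.
pValue <- 2 * pt(abs(tValue), df = dfResidual, lower.tail = FALSE)

# With a single predictor the F statistic equals the squared t statistic.
fStat <- tValue^2

significantAt5 <- pValue < 0.05
```

The t value, p-value and F statistic all agree with the printed table to the precision shown, and `significantAt5` is FALSE.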

I want to isolate more recent market data from 2015 - 2020…
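The code for this step is not shown here. One way to pull a date range out of a data frame, sketched on a toy monthly frame (`oilDemo`, its dates and values are placeholders, and the exact cut-off dates are an assumption):

```r
# Toy monthly data; dates and values are placeholders, not the real series.
oilDemo <- data.frame(Date = seq(as.Date("2013-01-15"), by = "month", length.out = 96))
oilDemo$Production <- seq(9000, 18500, length.out = 96)

# Keep rows from 1 January 2015 through 31 December 2020.
recentDemo <- subset(oilDemo,
                     Date >= as.Date("2015-01-01") & Date <= as.Date("2020-12-31"))
```

This only works if the date column is still of class Date, so the comparison is done on dates rather than on text.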

summary(recentModel)

Call:
lm(formula = orderedRecentData$Production ~ orderedRecentData$Price)

Residuals:
   Min     1Q Median     3Q    Max 
 -3218  -1161   -286   1122   3383 

Coefficients:
                        Estimate Std. Error t value Pr(>|t|)    
(Intercept)              7320.02    1479.08   4.949 1.47e-05 ***
orderedRecentData$Price   145.06      27.66   5.244 5.78e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1603 on 39 degrees of freedom
Multiple R-squared:  0.4135,    Adjusted R-squared:  0.3985 
F-statistic:  27.5 on 1 and 39 DF,  p-value: 5.783e-06

I want to deseasonalize the data…
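The deseasonalizing code is not shown here, so the following is not the method used in this notebook. It is a sketch of one common base R approach, `stl()`, on a synthetic monthly series (`rawDemo` is made up):

```r
set.seed(3)
# Synthetic monthly series: trend plus a 12-month cycle plus noise (illustrative only).
idx <- 1:72
rawDemo <- ts(50 + 0.2 * idx + 5 * sin(2 * pi * idx / 12) + rnorm(72, sd = 1),
              start = c(2015, 1), frequency = 12)

# stl() splits the series into seasonal, trend and remainder components.
fit <- stl(rawDemo, s.window = "periodic")
seasonalPart <- fit$time.series[, "seasonal"]

# Subtracting the seasonal component leaves the deseasonalized series.
adjustedDemo <- rawDemo - seasonalPart
```

The series has to be a `ts` object with `frequency = 12` for monthly data, otherwise `stl()` has no period to work with.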

now aggregate, order, and plot the deseasonalized data

deseasonalizedData <- data.frame(deseasonalizedPrice$dspar, deseasonalizedProduction$z) 
row names were found from a short variable and have been discarded
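That message usually indicates that the pieces passed to data.frame() do not have the same number of rows, so the shorter one is recycled. A sketch of checking the lengths first and naming the columns; the vectors below are placeholders, not the real deseasonalized series:

```r
# Placeholder vectors standing in for the two deseasonalized series.
priceAdj <- c(48.2, 51.7, 45.9, 39.4)
productionAdj <- c(16200, 16850, 17010, 16540)

# Fail early if the pieces do not line up row for row.
stopifnot(length(priceAdj) == length(productionAdj))

# Name the columns explicitly so later code does not depend on generated names.
deseasonalizedDemo <- data.frame(Price = priceAdj, Production = productionAdj)
```

If the lengths differ, it is worth checking with `str()` that each component really is the adjusted series.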

now i want to make a cleaner plot of the deseasonalized relationship

okay, that should do it for supply. Now I want to do the same for demand… import the data

USCrudeSupplied <- read.csv("~/USCrudeSupplied.csv")

get the data types set up

I can't find Oklahoma-specific data, so I'm only going to make marginal use of this data. Moving on…

I want to compare price and quantity from before and after horizontal drilling. Supposedly technical change should shift the supply line to the right. Is this correct?

grab the data

oldData <- oilDataBase[35:85]
Error in `[.data.frame`(oilDataBase, 35:85) : undefined columns selected
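The error comes from the indexing: with a single index, a data frame selects columns, and there are no columns 35 to 85. A sketch of the row selection that was probably intended, on a toy frame (`oilDemo` and its values are made up):

```r
# Toy frame with more rows than columns, like the joined oil data.
oilDemo <- data.frame(Date = seq(as.Date("2000-01-15"), by = "month", length.out = 100),
                      Production = seq(6000, 8000, length.out = 100),
                      Price = seq(25, 75, length.out = 100))

# oilDemo[35:85] would ask for columns 35 to 85 and fail on a 3-column frame.
# The trailing comma makes it a row selection that keeps every column.
oldDemo <- oilDemo[35:85, ]
```

Applied to the real data, the corresponding call would be `oilDataBase[35:85, ]`.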

plot the data

now order the data for price and quantity

plot and compare price and quantity

try to put both lines on one graph…
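A sketch of drawing two series on one base R plot: `plot()` for the first line, `lines()` for the second, with shared axis limits. The two toy frames and the period labels are illustrative assumptions:

```r
# Illustrative price/quantity pairs for an earlier and a later period.
oldDemo <- data.frame(Production = c(5000, 5500, 6000, 6500),
                      Price = c(20, 28, 35, 41))
newDemo <- data.frame(Production = c(12000, 13500, 15000, 16500),
                      Price = c(30, 42, 55, 63))

# Shared axis limits so neither line is clipped.
xRange <- range(oldDemo$Production, newDemo$Production)
yRange <- range(oldDemo$Price, newDemo$Price)

plot(oldDemo$Production, oldDemo$Price, type = "l", col = "blue",
     xlim = xRange, ylim = yRange,
     xlab = "Production (Thousands of barrels)", ylab = "Price (Dollars)")
lines(newDemo$Production, newDemo$Price, col = "red")  # add the second line
legend("topleft", legend = c("Earlier period", "Later period"),
       col = c("blue", "red"), lty = 1)
```

Setting `xlim` and `ylim` from both data sets matters because `plot()` otherwise sizes the axes for the first line only.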

I need to show the change in demand. Get the data

now plot the data

