| #” Modelling and Predicting the permeability using multiple linear regression |
| in multiple scenarios” |
| # “Ali Abdulhussein Abdulazeez ,BUOG” |
| # date: “2024-11-28” |
In this homework we will conduct a MLR to model and predict the permeability in different scenarios and visualize the results in each scenario to compare between the measured and predicted permeabilities.
####Load the necessary libraries:
library("xtable")
library("Metrics")
library("MASS")
library("rcompanion")
library("CTT")
library("olsrr")
###Scenario 1-MLR of permeability given all well logs without facies.
####first we need to load the dataset
karpur <- read.csv("C:/Users/del/Documents/Karpur.csv")
summary(karpur)
####then we have to build a linear model using multiple linear regression.
model1 = lm(k.core ~ depth + caliper + ind.deep + gamma + phi.N + R.deep + R.med + SP + density.corr + density , data = karpur)
summary(model1)
Call: lm(formula = k.core ~ depth + caliper + ind.deep + gamma + phi.N + R.deep + R.med + SP + density.corr + density, data = karpur) Residuals: Min 1Q Median 3Q Max -5597.2 -850.3 -229.7 648.9 12357.3 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 97915.8644 16814.8331 5.823 8.33e-09 *** depth -8.2176 1.5300 -5.371 1.02e-07 *** caliper -6185.1337 1085.3765 -5.699 1.69e-08 *** ind.deep 2.1811 0.4717 4.624 4.38e-06 *** gamma -72.2617 5.7132 -12.648 < 2e-16 *** phi.N 312.8772 1308.5886 0.239 0.81109 R.deep -22.7203 7.3258 -3.101 0.00199 ** R.med 59.9218 10.3089 5.813 8.85e-09 *** SP -8.2996 3.6668 -2.263 0.02387 * density.corr 938.8590 5594.5019 0.168 0.86677 density 3879.7925 902.6578 4.298 1.93e-05 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 1531 on 808 degrees of freedom Multiple R-squared: 0.5367, Adjusted R-squared: 0.531 F-statistic: 93.61 on 10 and 808 DF, p-value: < 2.2e-16
####We got R2 and Adjusted R2 around 0.53 for both.
####Now, to make prediction based on our model:
predicted.k = predict(model1, newdata = karpur)
AdjR.sq2 = 1 - sum((predicted.k - karpur$k.core)^2) / sum((karpur$k.core - mean(karpur$k.core))^2) RMSE = sqrt(mean((predicted.k - karpur$k.core)^2))
We got AdjR.sq2=-0.603437 and RMSE=2829.154.
par(mfrow=c(1,1)) plot(y = karpur$depth, x = karpur$k.core, type="l", col="blue", lwd = 2, lty = 1, xlab = "Permeability", ylab = "Depth (m)", xlim = c(0, 16000), ylim = rev(range(karpur$depth))) par(new = TRUE) plot(y = karpur$depth, x = predicted.k, type="l", col="red", lwd = 2, lty = 2, xlab = "Permeability", ylab = "", xlim = c(0, 16000), ylim = rev(range(karpur$depth))) legend("topright", legend = c("k.core", "Predicted k without facies"), col = c("blue", "red"), lty = c(1,2), lwd =2,cex =0.7 )
In this Scenario we will remove the unnecessary parameters from our model using the stepwise elimination. To do a Stepwise regression analysis we need to load the package (StepReg).
library("StepReg")
model2 = step(model1 , direction = "backward")
summary(model2)
Call: lm(formula = k.core ~ depth + caliper + ind.deep + gamma + R.deep + R.med + SP + density, data = karpur) Residuals: Min 1Q Median 3Q Max -5604.3 -854.0 -226.1 656.3 12356.3 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 96473.0715 16269.0921 5.930 4.49e-09 *** depth -8.0351 1.4276 -5.628 2.51e-08 *** caliper -6177.8440 1083.2186 -5.703 1.65e-08 *** ind.deep 2.1679 0.4695 4.618 4.51e-06 *** gamma -71.6221 5.2783 -13.569 < 2e-16 *** R.deep -22.8541 7.3071 -3.128 0.00183 ** R.med 60.3201 10.2293 5.897 5.44e-09 *** SP -8.1112 3.6219 -2.239 0.02540 * density 4041.0242 730.9125 5.529 4.35e-08 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 1529 on 810 degrees of freedom Multiple R-squared: 0.5367, Adjusted R-squared: 0.5321 F-statistic: 117.3 on 8 and 810 DF, p-value: < 2.2e-16
#### if you noticed from the summary that two predictors have been removed ( ‘phi.N’ and ‘density.corr’) which they have pr value larger than 0.05, and we still have R2 and Adjusted R2 around 0.53 for both, which means we removed the unnecessary parameters.
predicted.k2 = predict(model2, newdata = karpur)
AdjR.sq2 = 1 - sum((predicted.k2 - karpur$k.core)^2) / sum((karpur$k.core - mean(karpur$k.core))^2) RMSE = sqrt(mean((predicted.k2 - karpur$k.core)^2))
after removing the unwanted parameters the AdjR.sq2 has increased and RMSE has decreased which make sense.
we’ll visualize the permeabilities again to aprove that the removed parameters doesn’t effect our prediction.
par(mfrow=c(1,1)) plot(y = karpur$depth, x = karpur$k.core, type="l", col="blue", lwd = 2, lty = 1, xlab = "Permeability", ylab = "Depth (m)", xlim = c(0, 16000), ylim = rev(range(karpur$depth))) par(new = TRUE) plot(y = karpur$depth, x = predicted.k2, type="l", col="red", lwd = 2, lty = 2, xlab = "Permeability", ylab = "", xlim = c(0, 16000), ylim = rev(range(karpur$depth))) legend("topright", legend = c("k.core", "Predicted k without facies after elimination"), col = c("blue", "red"), lty = c(1,2), lwd =2,cex =0.5 )
model3 = lm(k.core ~ depth + caliper + ind.deep + gamma + phi.N + R.deep + R.med + SP + density.corr + density + Facies-1, data = karpur)
summary(model3)
Call:
lm(formula = k.core ~ depth + caliper + ind.deep + gamma + phi.N +
R.deep + R.med + SP + density.corr + density + Facies - 1,
data = karpur)
Residuals:
Min 1Q Median 3Q Max
-5598.9 -600.6 -19.7 522.4 8968.3
Coefficients:
Estimate Std. Error t value Pr(>|t|)
depth 8.621 1.754 4.916 1.07e-06 ***
caliper -241.581 1036.088 -0.233 0.81569
ind.deep 1.364 0.434 3.142 0.00174 **
gamma -48.915 6.219 -7.865 1.19e-14 ***
phi.N 699.544 1497.681 0.467 0.64057
R.deep -28.594 6.501 -4.399 1.24e-05 ***
R.med 65.304 9.333 6.997 5.52e-12 ***
SP -7.763 3.249 -2.389 0.01711 *
density.corr -5842.405 4929.615 -1.185 0.23630
density -1375.439 1133.737 -1.213 0.22541
FaciesF1 -41138.384 17471.026 -2.355 0.01878 *
FaciesF10 -41094.372 17537.843 -2.343 0.01936 *
FaciesF2 -41708.035 17507.367 -2.382 0.01744 *
FaciesF3 -41273.486 17539.331 -2.353 0.01885 *
FaciesF5 -41058.931 17534.421 -2.342 0.01944 *
FaciesF7 -42402.676 17504.041 -2.422 0.01564 *
FaciesF8 -42725.944 17541.930 -2.436 0.01508 *
FaciesF9 -44925.743 17646.639 -2.546 0.01109 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1315 on 801 degrees of freedom
Multiple R-squared: 0.832, Adjusted R-squared: 0.8282
F-statistic: 220.3 on 18 and 801 DF, p-value: < 2.2e-16
We got R2=0.832 and Adjusted R2= 0.8282
Excellent ! with facies the relationship tends to be more linear with higher R2 and Adjusted R2 values.
predicted.k3 = predict(model3, newdata = data)
AdjR.sq2 = 1 - sum((predicted.k3 - karpur$k.core)^2) / sum((karpur$k.core - mean(karpur$k.core))^2) RMSE = sqrt(mean((predicted.k3 - karpur$k.core)^2))
We got AdjR.sq2=-0.67779 and RMSE=2894.006.
par(mfrow=c(1,1)) plot(y = karpur$depth, x = karpur$k.core, type="l", col="blue", lwd = 2, lty = 1, xlab = "Permeability", ylab = "Depth (m)", xlim = c(0, 16000), ylim = rev(range(karpur$depth))) par(new = TRUE) plot(y = karpur$depth, x = predicted.k3 , type="l", col="red", lwd = 2, lty = 2, xlab = "Permeability", ylab = "", xlim = c(0, 16000), ylim = rev(range(karpur$depth))) legend("topright", legend = c("k.core", "Predicted k with facies"), col = c("blue", "red"), lty = c(1,2), lwd =2,cex =0.7 )
lets remove the unnecessary parameters and see what will happen.
library("StepReg")
model4 = step(model3 , direction = "backward")
summary(model4)
predicted.k4 = predict(model4, newdata = karpur)
AdjR.sq2 = 1 - sum((predicted.k4 - karpur$k.core)^2) / sum((karpur$k.core - mean(karpur$k.core))^2) RMSE = sqrt(mean((predicted.k4 - karpur$k.core)^2))
AdjR.sq2 = 0.6606003 , RMSE= 1301.626
after removing the unwanted parameters the AdjR.sq2 has increased further and RMSE has decreased further which make sense.
par(mfrow=c(1,1)) plot(y = karpur$depth, x = karpur$k.core, type="l", col="blue", lwd = 2, lty = 1, xlab = "Permeability", ylab = "Depth (m)", xlim = c(0, 16000), ylim = rev(range(karpur$depth))) par(new = TRUE) plot(y = karpur$depth, x = predicted.k4, type="l", col="red", lwd = 2, lty = 2, xlab = "Permeability", ylab = "", xlim = c(0, 16000), ylim = rev(range(karpur$depth))) legend("topright", legend = c("k.core", "Predicted k with facies after elimination"), col = c("blue", "red"), lty = c(1,2), lwd =2,cex =0.5 )
klog = log(data$k.core)
model5 = lm(klog ~ depth + caliper + ind.deep + gamma + phi.N + R.deep + R.med + SP + density.corr + density + Facies-1, data = karpur)
summary(model5)
Call:
lm(formula = klog ~ depth + caliper + ind.deep + gamma + phi.N +
R.deep + R.med + SP + density.corr + density + Facies - 1,
data = karpur)
Residuals:
Min 1Q Median 3Q Max
-3.8699 -0.1431 0.0238 0.1702 0.6322
Coefficients:
Estimate Std. Error t value Pr(>|t|)
depth -1.996e-02 4.261e-04 -46.840 < 2e-16 ***
caliper -4.982e+00 2.517e-01 -19.789 < 2e-16 ***
ind.deep 7.148e-04 1.054e-04 6.778 2.36e-11 ***
gamma 2.204e-03 1.511e-03 1.458 0.145139
phi.N 6.444e-01 3.639e-01 1.771 0.076972 .
R.deep 7.626e-03 1.579e-03 4.828 1.65e-06 ***
R.med -7.855e-03 2.268e-03 -3.464 0.000561 ***
SP 1.127e-03 7.894e-04 1.427 0.153854
density.corr -1.026e+00 1.198e+00 -0.857 0.391727
density 3.957e-01 2.755e-01 1.437 0.151208
FaciesF1 1.647e+02 4.245e+00 38.803 < 2e-16 ***
FaciesF10 1.656e+02 4.261e+00 38.874 < 2e-16 ***
FaciesF2 1.653e+02 4.254e+00 38.868 < 2e-16 ***
FaciesF3 1.656e+02 4.261e+00 38.865 < 2e-16 ***
FaciesF5 1.657e+02 4.260e+00 38.884 < 2e-16 ***
FaciesF7 1.662e+02 4.253e+00 39.069 < 2e-16 ***
FaciesF8 1.665e+02 4.262e+00 39.061 < 2e-16 ***
FaciesF9 1.671e+02 4.288e+00 38.966 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.3195 on 801 degrees of freedom
Multiple R-squared: 0.9981, Adjusted R-squared: 0.9981
F-statistic: 2.341e+04 on 18 and 801 DF, p-value: < 2.2e-16
We got R2=0.9981 and Adjusted R2= 0.9981
Excellent ! with facies and log transformation the relationship tends to be more linear with higher R2 and Adjusted R2 values.
predicted.k5 = exp(predict(model5, newdata = data))
AdjR.sq2 = 1 - sum((predicted.k5 - karpur$k.core)^2) / sum((karpur$k.core - mean(karpur$k.core))^2) RMSE = sqrt(mean((predicted.k5 - karpur$k.core)^2))
We got AdjR.sq2=-1.167243 and RMSE=3289.155.
par(mfrow=c(1,1)) plot(y = karpur$depth, x = karpur$k.core, type="l", col="blue", lwd = 2, lty = 1, xlab = "Permeability", ylab = "Depth (m)", xlim = c(0, 16000), ylim = rev(range(karpur$depth))) par(new = TRUE) plot(y = karpur$depth, x = predicted.k5 , type="l", col="red", lwd = 2, lty = 2, xlab = "Permeability", ylab = "", xlim = c(0, 16000), ylim = rev(range(karpur$depth))) legend("topright", legend = c("k.core", "Predicted log k with facies"), col = c("blue", "red"), lty = c(1,2), lwd =2,cex =0.7 )
library("StepReg")
model6 = step(model5 , direction = "backward")
summary(model6)
predicted.k6 = exp(predict(model6, newdata = karpur))
AdjR.sq2 = 1 - sum((predicted.k6 - karpur$k.core)^2) / sum((karpur$k.core - mean(karpur$k.core))^2) RMSE = sqrt(mean((predicted.k6 - karpur$k.core)^2))
AdjR.sq2 = -1.35074 , RMSE= 3425.57
par(mfrow=c(1,1)) plot(y = karpur$depth, x = karpur$k.core, type="l", col="blue", lwd = 2, lty = 1, xlab = "Permeability", ylab = "Depth (m)", xlim = c(0, 16000), ylim = rev(range(karpur$depth))) par(new = TRUE) plot(y = karpur$depth, x = predicted.k6, type="l", col="red", lwd = 2, lty = 2, xlab = "Permeability", ylab = "", xlim = c(0, 16000), ylim = rev(range(karpur$depth))) legend("topright", legend = c("k.core", "Predicted k with facies after elimination"), col = c("blue", "red"), lty = c(1,2), lwd =2,cex =0.5 )
the predicted k should be perfectly match the observed k after log transformation , which means I have an issue in my code!