#” Modelling and Predicting the permeability using multiple linear regression
in multiple scenarios”
# “Ali Abdulhussein Abdulazeez ,BUOG”
# date: “2024-11-28”

In this homework we will conduct a MLR to model and predict the permeability in different scenarios and visualize the results in each scenario to compare between the measured and predicted permeabilities.

####Load the necessary libraries:

library("xtable")

library("Metrics")

library("MASS")

library("rcompanion")

library("CTT")

library("olsrr")

###Scenario 1-MLR of permeability given all well logs without facies.

####first we need to load the dataset

karpur <- read.csv("C:/Users/del/Documents/Karpur.csv")

summary(karpur)

####then we have to build a linear model using multiple linear regression.

model1 = lm(k.core ~ depth + caliper + ind.deep + gamma + phi.N + R.deep + R.med + SP + density.corr + density , data = karpur)

summary(model1)

Call:
lm(formula = k.core ~ depth + caliper + ind.deep + gamma + phi.N + 
    R.deep + R.med + SP + density.corr + density, data = karpur)

Residuals:
    Min      1Q  Median      3Q     Max 
-5597.2  -850.3  -229.7   648.9 12357.3 

Coefficients:
               Estimate Std. Error t value Pr(>|t|)    
(Intercept)  97915.8644 16814.8331   5.823 8.33e-09 ***
depth           -8.2176     1.5300  -5.371 1.02e-07 ***
caliper      -6185.1337  1085.3765  -5.699 1.69e-08 ***
ind.deep         2.1811     0.4717   4.624 4.38e-06 ***
gamma          -72.2617     5.7132 -12.648  < 2e-16 ***
phi.N          312.8772  1308.5886   0.239  0.81109    
R.deep         -22.7203     7.3258  -3.101  0.00199 ** 
R.med           59.9218    10.3089   5.813 8.85e-09 ***
SP              -8.2996     3.6668  -2.263  0.02387 *  
density.corr   938.8590  5594.5019   0.168  0.86677    
density       3879.7925   902.6578   4.298 1.93e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1531 on 808 degrees of freedom
Multiple R-squared:  0.5367,  Adjusted R-squared:  0.531 
F-statistic: 93.61 on 10 and 808 DF,  p-value: < 2.2e-16

####We got R2 and Adjusted R2 around 0.53 for both.

####Now, to make prediction based on our model:

predicted.k = predict(model1, newdata = karpur)

AdjR.sq2 = 1 - sum((predicted.k - karpur$k.core)^2) / sum((karpur$k.core - mean(karpur$k.core))^2) RMSE = sqrt(mean((predicted.k - karpur$k.core)^2))

We got AdjR.sq2=-0.603437 and RMSE=2829.154.

Now lets visualize the observed and predicted permeabilities:

par(mfrow=c(1,1)) plot(y = karpur$depth, x = karpur$k.core, type="l", col="blue", lwd = 2, lty = 1, xlab = "Permeability", ylab = "Depth (m)", xlim = c(0, 16000), ylim = rev(range(karpur$depth))) par(new = TRUE) plot(y = karpur$depth, x = predicted.k, type="l", col="red", lwd = 2, lty = 2, xlab = "Permeability", ylab = "", xlim = c(0, 16000), ylim = rev(range(karpur$depth))) legend("topright", legend = c("k.core", "Predicted k without facies"), col = c("blue", "red"), lty = c(1,2), lwd =2,cex =0.7 )

Scenario 2- Apply Stepwise Elimination on the previous model.

In this Scenario we will remove the unnecessary parameters from our model using the stepwise elimination. To do a Stepwise regression analysis we need to load the package (StepReg).

library("StepReg")

model2 = step(model1 , direction = "backward")

summary(model2)

Call:
lm(formula = k.core ~ depth + caliper + ind.deep + gamma + R.deep + 
    R.med + SP + density, data = karpur)

Residuals:
    Min      1Q  Median      3Q     Max 
-5604.3  -854.0  -226.1   656.3 12356.3 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept) 96473.0715 16269.0921   5.930 4.49e-09 ***
depth          -8.0351     1.4276  -5.628 2.51e-08 ***
caliper     -6177.8440  1083.2186  -5.703 1.65e-08 ***
ind.deep        2.1679     0.4695   4.618 4.51e-06 ***
gamma         -71.6221     5.2783 -13.569  < 2e-16 ***
R.deep        -22.8541     7.3071  -3.128  0.00183 ** 
R.med          60.3201    10.2293   5.897 5.44e-09 ***
SP             -8.1112     3.6219  -2.239  0.02540 *  
density      4041.0242   730.9125   5.529 4.35e-08 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1529 on 810 degrees of freedom
Multiple R-squared:  0.5367,  Adjusted R-squared:  0.5321 
F-statistic: 117.3 on 8 and 810 DF,  p-value: < 2.2e-16

#### if you noticed from the summary that two predictors have been removed ( ‘phi.N’ and ‘density.corr’) which they have pr value larger than 0.05, and we still have R2 and Adjusted R2 around 0.53 for both, which means we removed the unnecessary parameters.

predicted.k2 = predict(model2, newdata = karpur)

AdjR.sq2 = 1 - sum((predicted.k2 - karpur$k.core)^2) / sum((karpur$k.core - mean(karpur$k.core))^2) RMSE = sqrt(mean((predicted.k2 - karpur$k.core)^2))

AdjR.sq2 = 0.5366501 , RMSE= 1520.846

after removing the unwanted parameters the AdjR.sq2 has increased and RMSE has decreased which make sense.

we’ll visualize the permeabilities again to aprove that the removed parameters doesn’t effect our prediction.

par(mfrow=c(1,1)) plot(y = karpur$depth, x = karpur$k.core, type="l", col="blue", lwd = 2, lty = 1, xlab = "Permeability", ylab = "Depth (m)", xlim = c(0, 16000), ylim = rev(range(karpur$depth))) par(new = TRUE) plot(y = karpur$depth, x = predicted.k2, type="l", col="red", lwd = 2, lty = 2, xlab = "Permeability", ylab = "", xlim = c(0, 16000), ylim = rev(range(karpur$depth))) legend("topright", legend = c("k.core", "Predicted k without facies after elimination"), col = c("blue", "red"), lty = c(1,2), lwd =2,cex =0.5 )

###scenario 3- MLR of Permeability given all well logs with Facies.

model3 = lm(k.core ~ depth + caliper + ind.deep + gamma + phi.N + R.deep + R.med + SP + density.corr + density + Facies-1, data = karpur)

summary(model3)

Call:
lm(formula = k.core ~ depth + caliper + ind.deep + gamma + phi.N + 
    R.deep + R.med + SP + density.corr + density + Facies - 1, 
    data = karpur)

Residuals:
    Min     1Q  Median      3Q     Max 
-5598.9  -600.6   -19.7   522.4  8968.3 

Coefficients:
               Estimate Std. Error t value Pr(>|t|)    
depth             8.621      1.754   4.916 1.07e-06 ***
caliper        -241.581   1036.088  -0.233  0.81569    
ind.deep          1.364      0.434   3.142  0.00174 ** 
gamma           -48.915      6.219  -7.865 1.19e-14 ***
phi.N           699.544   1497.681   0.467  0.64057    
R.deep          -28.594      6.501  -4.399 1.24e-05 ***
R.med            65.304      9.333   6.997 5.52e-12 ***
SP               -7.763      3.249  -2.389  0.01711 *  
density.corr  -5842.405   4929.615  -1.185  0.23630    
density       -1375.439   1133.737  -1.213  0.22541    
FaciesF1     -41138.384  17471.026  -2.355  0.01878 *  
FaciesF10    -41094.372  17537.843  -2.343  0.01936 *  
FaciesF2     -41708.035  17507.367  -2.382  0.01744 *  
FaciesF3     -41273.486  17539.331  -2.353  0.01885 *  
FaciesF5     -41058.931  17534.421  -2.342  0.01944 *  
FaciesF7     -42402.676  17504.041  -2.422  0.01564 *  
FaciesF8     -42725.944  17541.930  -2.436  0.01508 *  
FaciesF9     -44925.743  17646.639  -2.546  0.01109 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1315 on 801 degrees of freedom
Multiple R-squared:  0.832, Adjusted R-squared:  0.8282 
F-statistic: 220.3 on 18 and 801 DF,  p-value: < 2.2e-16

We got R2=0.832 and Adjusted R2= 0.8282

Excellent ! with facies the relationship tends to be more linear with higher R2 and Adjusted R2 values.

predicted.k3 = predict(model3, newdata = data)

AdjR.sq2 = 1 - sum((predicted.k3 - karpur$k.core)^2) / sum((karpur$k.core - mean(karpur$k.core))^2) RMSE = sqrt(mean((predicted.k3 - karpur$k.core)^2))

We got AdjR.sq2=-0.67779 and RMSE=2894.006.

Now lets visualize the observed and predicted permeabilities:

par(mfrow=c(1,1)) plot(y = karpur$depth, x = karpur$k.core, type="l", col="blue", lwd = 2, lty = 1, xlab = "Permeability", ylab = "Depth (m)", xlim = c(0, 16000), ylim = rev(range(karpur$depth))) par(new = TRUE) plot(y = karpur$depth, x = predicted.k3 , type="l", col="red", lwd = 2, lty = 2, xlab = "Permeability", ylab = "", xlim = c(0, 16000), ylim = rev(range(karpur$depth))) legend("topright", legend = c("k.core", "Predicted k with facies"), col = c("blue", "red"), lty = c(1,2), lwd =2,cex =0.7 )

lets remove the unnecessary parameters and see what will happen.

4- Apply Stepwise Elimination on the previous model.

In this Scenario we will remove the unnecessary parameters from our model using the stepwise elimination. To do a Stepwise regression analysis we need to load the package (StepReg).

library("StepReg")

model4 = step(model3 , direction = "backward")

summary(model4)

predicted.k4 = predict(model4, newdata = karpur)

AdjR.sq2 = 1 - sum((predicted.k4 - karpur$k.core)^2) / sum((karpur$k.core - mean(karpur$k.core))^2) RMSE = sqrt(mean((predicted.k4 - karpur$k.core)^2))

AdjR.sq2 = 0.6606003 , RMSE= 1301.626

after removing the unwanted parameters the AdjR.sq2 has increased further and RMSE has decreased further which make sense.

we’ll visualize the permeabilities again and see the difference:

par(mfrow=c(1,1)) plot(y = karpur$depth, x = karpur$k.core, type="l", col="blue", lwd = 2, lty = 1, xlab = "Permeability", ylab = "Depth (m)", xlim = c(0, 16000), ylim = rev(range(karpur$depth))) par(new = TRUE) plot(y = karpur$depth, x = predicted.k4, type="l", col="red", lwd = 2, lty = 2, xlab = "Permeability", ylab = "", xlim = c(0, 16000), ylim = rev(range(karpur$depth))) legend("topright", legend = c("k.core", "Predicted k with facies after elimination"), col = c("blue", "red"), lty = c(1,2), lwd =2,cex =0.5 )

good match !!!

Scenario 5 -MLR of Log10(Permeability) given all well logs with Facies.

we’ll do a Logarithmic Transformation for the k.core.

klog = log(data$k.core)

model5 = lm(klog ~ depth + caliper + ind.deep + gamma + phi.N + R.deep + R.med + SP + density.corr + density + Facies-1, data = karpur)

summary(model5)

Call:
lm(formula = klog ~ depth + caliper + ind.deep + gamma + phi.N + 
    R.deep + R.med + SP + density.corr + density + Facies - 1, 
    data = karpur)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.8699 -0.1431  0.0238  0.1702  0.6322 

Coefficients:
               Estimate Std. Error t value Pr(>|t|)    
depth        -1.996e-02  4.261e-04 -46.840  < 2e-16 ***
caliper      -4.982e+00  2.517e-01 -19.789  < 2e-16 ***
ind.deep      7.148e-04  1.054e-04   6.778 2.36e-11 ***
gamma         2.204e-03  1.511e-03   1.458 0.145139    
phi.N         6.444e-01  3.639e-01   1.771 0.076972 .  
R.deep        7.626e-03  1.579e-03   4.828 1.65e-06 ***
R.med        -7.855e-03  2.268e-03  -3.464 0.000561 ***
SP            1.127e-03  7.894e-04   1.427 0.153854    
density.corr -1.026e+00  1.198e+00  -0.857 0.391727    
density       3.957e-01  2.755e-01   1.437 0.151208    
FaciesF1      1.647e+02  4.245e+00  38.803  < 2e-16 ***
FaciesF10     1.656e+02  4.261e+00  38.874  < 2e-16 ***
FaciesF2      1.653e+02  4.254e+00  38.868  < 2e-16 ***
FaciesF3      1.656e+02  4.261e+00  38.865  < 2e-16 ***
FaciesF5      1.657e+02  4.260e+00  38.884  < 2e-16 ***
FaciesF7      1.662e+02  4.253e+00  39.069  < 2e-16 ***
FaciesF8      1.665e+02  4.262e+00  39.061  < 2e-16 ***
FaciesF9      1.671e+02  4.288e+00  38.966  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.3195 on 801 degrees of freedom
Multiple R-squared:  0.9981,    Adjusted R-squared:  0.9981 
F-statistic: 2.341e+04 on 18 and 801 DF,  p-value: < 2.2e-16

We got R2=0.9981 and Adjusted R2= 0.9981

Excellent ! with facies and log transformation the relationship tends to be more linear with higher R2 and Adjusted R2 values.

predicted.k5 = exp(predict(model5, newdata = data))

AdjR.sq2 = 1 - sum((predicted.k5 - karpur$k.core)^2) / sum((karpur$k.core - mean(karpur$k.core))^2) RMSE = sqrt(mean((predicted.k5 - karpur$k.core)^2))

We got AdjR.sq2=-1.167243 and RMSE=3289.155.

Now lets visualize the observed and predicted permeabilities:

par(mfrow=c(1,1)) plot(y = karpur$depth, x = karpur$k.core, type="l", col="blue", lwd = 2, lty = 1, xlab = "Permeability", ylab = "Depth (m)", xlim = c(0, 16000), ylim = rev(range(karpur$depth))) par(new = TRUE) plot(y = karpur$depth, x = predicted.k5 , type="l", col="red", lwd = 2, lty = 2, xlab = "Permeability", ylab = "", xlim = c(0, 16000), ylim = rev(range(karpur$depth))) legend("topright", legend = c("k.core", "Predicted log k with facies"), col = c("blue", "red"), lty = c(1,2), lwd =2,cex =0.7 )

Scenario 6 - Apply Stepwise Elimination on the previous model.

library("StepReg")

model6 = step(model5 , direction = "backward")

summary(model6)

predicted.k6 = exp(predict(model6, newdata = karpur))

AdjR.sq2 = 1 - sum((predicted.k6 - karpur$k.core)^2) / sum((karpur$k.core - mean(karpur$k.core))^2) RMSE = sqrt(mean((predicted.k6 - karpur$k.core)^2))

AdjR.sq2 = -1.35074 , RMSE= 3425.57

we’ll visualize the permeabilities again and see the difference:

par(mfrow=c(1,1)) plot(y = karpur$depth, x = karpur$k.core, type="l", col="blue", lwd = 2, lty = 1, xlab = "Permeability", ylab = "Depth (m)", xlim = c(0, 16000), ylim = rev(range(karpur$depth))) par(new = TRUE) plot(y = karpur$depth, x = predicted.k6, type="l", col="red", lwd = 2, lty = 2, xlab = "Permeability", ylab = "", xlim = c(0, 16000), ylim = rev(range(karpur$depth))) legend("topright", legend = c("k.core", "Predicted k with facies after elimination"), col = c("blue", "red"), lty = c(1,2), lwd =2,cex =0.5 )

the predicted k should be perfectly match the observed k after log transformation , which means I have an issue in my code!