First, we load the packages we need: rio, whose import() reads the data file, and psych, whose describe() produces detailed summary statistics.
pacman::p_load(rio,psych)
data.k <- import("C:/Users/hp ZBook/OneDrive/Desktop/karpur.csv")
Next, we view the first few rows of the data.
head(data.k)
##    depth caliper ind.deep ind.med  gamma phi.N R.deep  R.med      SP
## 1 5667.0   8.685  618.005 569.781 98.823 0.410  1.618  1.755 -56.587
## 2 5667.5   8.686  497.547 419.494 90.640 0.307  2.010  2.384 -61.916
## 3 5668.0   8.686  384.935 300.155 78.087 0.203  2.598  3.332 -55.861
## 4 5668.5   8.686  278.324 205.224 66.232 0.119  3.593  4.873 -41.860
## 5 5669.0   8.686  183.743 131.155 59.807 0.069  5.442  7.625 -34.934
## 6 5669.5   8.686  109.512  75.633 57.109 0.048  9.131 13.222 -39.769
##   density.corr density phi.core   k.core Facies
## 1       -0.033   2.205 0.339000 2442.590     F1
## 2       -0.067   2.040 0.334131 3006.989     F1
## 3       -0.064   1.888 0.331000 3370.000     F1
## 4       -0.053   1.794 0.349000 2270.000     F1
## 5       -0.054   1.758 0.350644 2530.758     F1
## 6       -0.058   1.759 0.353152 2928.314     F1
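We can obtain summary statistics for each variable (mean, standard deviation, median, minimum, maximum, and so on). The asterisk in Facies* marks a categorical variable that describe() converted to numeric codes, so its statistics are not meaningful.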
describe(data.k)
##              vars   n    mean      sd  median trimmed     mad     min      max
## depth           1 819 5873.06  120.09 5871.50 5872.76  154.19 5667.00  6083.00
## caliper         2 819    8.62    0.11    8.59    8.61    0.11    8.49     8.89
## ind.deep        3 819  275.36  254.30  217.85  257.13  292.66    6.53   769.48
## ind.med         4 819  273.36  243.34  254.38  256.28  341.75    9.39   746.03
## gamma           5 819   53.42   18.94   51.37   52.17   15.86   16.74   112.40
## phi.N           6 819    0.22    0.07    0.24    0.23    0.04    0.01     0.41
## R.deep          7 819   24.50   35.16    4.59   16.38    4.59    1.30   153.08
## R.med           8 819   21.20   27.54    3.93   15.62    3.61    1.34   106.54
## SP              9 819  -30.98   16.65  -32.25  -31.43   16.50  -73.95    25.13
## density.corr   10 819   -0.01    0.02   -0.01   -0.01    0.01   -0.07     0.09
## density        11 819    2.10    0.13    2.10    2.10    0.11    1.76     2.39
## phi.core       12 819    0.27    0.05    0.28    0.27    0.05    0.16     0.36
## k.core         13 819 2251.90 2235.61 1591.22 1869.26 1639.31    0.42 15600.00
## Facies*        14 819    4.84    2.63    5.00    4.92    4.45    1.00     8.00
##                 range  skew kurtosis    se
## depth          416.00  0.02    -1.21  4.20
## caliper          0.40  0.98     0.19  0.00
## ind.deep       762.95  0.42    -1.40  8.89
## ind.med        736.64  0.36    -1.40  8.50
## gamma           95.66  0.69     0.40  0.66
## phi.N            0.39 -0.96     0.50  0.00
## R.deep         151.78  1.88     2.93  1.23
## R.med          105.20  1.45     1.19  0.96
## SP              99.08  0.29    -0.03  0.58
## density.corr     0.16 -0.26     2.18  0.00
## density          0.63  0.02    -0.50  0.00
## phi.core         0.21 -0.51    -0.71  0.00
## k.core       15599.58  1.93     5.11 78.12
## Facies*          7.00 -0.19    -1.57  0.09
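Because the facies are used as predictors later on, it may also help to summarize the data per facies. The following is a minimal sketch using base R's aggregate(); the toy data frame and its values are made up for illustration and merely stand in for data.k.

```r
# Toy stand-in for data.k; the values are illustrative, not the real data.
toy <- data.frame(
  phi.core = c(0.34, 0.33, 0.35, 0.21, 0.20, 0.22),
  k.core   = c(2400, 3000, 2300, 150, 90, 210),
  Facies   = factor(c("F1", "F1", "F1", "F10", "F10", "F10"))
)

# Mean core porosity and mean core permeability within each facies.
facies.means <- aggregate(cbind(phi.core, k.core) ~ Facies,
                          data = toy, FUN = mean)
print(facies.means)
```

With the real data, the same call with data = data.k would give one row per facies.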
We can plot the data to see how core permeability relates to core porosity and to log porosity.
par(mfrow=c(2,1))
plot(data.k$phi.core,data.k$k.core,main="Core porosity v. Core permeability",
xlab="Core porosity",ylab="Core permeability",col="blue",pch=19)
plot(data.k$phi.N,data.k$k.core,main="Log porosity v. Core permeability",
xlab="Log porosity",ylab="Core permeability",col="green",pch=19)
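The summary statistics show that k.core is strongly right-skewed (skew 1.93, range 0.42 to 15600), so a logarithmic permeability axis may make the trend easier to see. The sketch below uses synthetic data with an assumed exponential porosity-permeability trend; it is not the Karpur data, and the trend is an assumption made only to have something to plot.

```r
# Synthetic porosity/permeability pairs (assumed, NOT the real data):
# permeability grows roughly exponentially with porosity, plus noise.
set.seed(1)
phi <- runif(50, min = 0.15, max = 0.36)
k   <- 10^(1 + 8 * phi + rnorm(50, sd = 0.3))

par(mfrow = c(2, 1))
# Linear y axis: the few large permeabilities dominate the picture.
plot(phi, k, main = "Linear permeability axis",
     xlab = "Core porosity", ylab = "Core permeability",
     col = "blue", pch = 19)
# log = "y" draws the same points on a logarithmic y axis.
plot(phi, k, log = "y", main = "Logarithmic permeability axis",
     xlab = "Core porosity", ylab = "Core permeability",
     col = "blue", pch = 19)
```

The log = "y" argument requires all permeability values to be positive, which holds for k.core (minimum 0.42).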
Now we build a linear regression model that predicts core porosity from log porosity, so that the log porosity can be corrected to the core scale.
Corrected.prosity.model <- lm(phi.core ~ phi.N+Facies-1,data=data.k)
Note that we add Facies to the regression model to improve the prediction. The -1 in the formula removes the intercept, so the model estimates a separate coefficient for every facies category instead of treating one category as the reference level. The fitted values are the same either way; only the parameterization changes.
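The effect of -1 is easiest to see on a small example. The sketch below fits the same toy data with and without an intercept; the data frame and its two facies labels are hypothetical.

```r
# Hypothetical toy data: two facies with clearly different means.
toy <- data.frame(
  y      = c(1, 2, 3, 11, 12, 13),
  Facies = factor(c("A", "A", "A", "B", "B", "B"))
)

with.intercept <- lm(y ~ Facies, data = toy)      # A is the reference level
no.intercept   <- lm(y ~ Facies - 1, data = toy)  # one coefficient per facies

# With an intercept: (Intercept) is the mean of A and FaciesB is the
# difference between the mean of B and the mean of A.
print(coef(with.intercept))
# Without an intercept: FaciesA and FaciesB are the two group means.
print(coef(no.intercept))

# Both parameterizations give the same fitted values.
print(all.equal(fitted(with.intercept), fitted(no.intercept)))
```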
Now we make predictions using our model. Note that the argument of predict() is newdata; a data argument would be silently ignored.
phi.core.corr<-predict(Corrected.prosity.model,newdata=data.k)
We can compare the predicted porosity with original log data.
head(cbind(Corrected_Porosity=phi.core.corr,Log_porosity=data.k$phi.N))
##   Corrected_Porosity Log_porosity
## 1          0.3202847        0.410
## 2          0.3189082        0.307
## 3          0.3175183        0.203
## 4          0.3163957        0.119
## 5          0.3157275        0.069
## 6          0.3154468        0.048
Now we build a model that predicts core permeability from the corrected porosity.
Corrected.permeability.model <-lm(k.core~phi.core.corr+Facies-1,data.k)
Now we use the model to make predictions. Because phi.core.corr is not a column of data.k, predict() takes it from the workspace.
k.core.corr<-predict(Corrected.permeability.model,newdata=data.k)
We can compare the predicted permeability with original core data.
head(cbind(Corrected_permeability=k.core.corr,Core_permeability=data.k$k.core))
##   Corrected_permeability Core_permeability
## 1               589.2371          2442.590
## 2              1156.8516          3006.989
## 3              1729.9769          3370.000
## 4              2192.8857          2270.000
## 5              2468.4267          2530.758
## 6              2584.1540          2928.314
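A side-by-side listing is hard to judge by eye, so a single error figure can help. The sketch below defines two small helper functions, rmse() and mae() (names chosen here for illustration), and applies them to the six rows printed above. With the full data, the same helpers could be applied to data.k$k.core and k.core.corr.

```r
# Root-mean-square error and mean absolute error between two vectors.
rmse <- function(observed, predicted) sqrt(mean((observed - predicted)^2))
mae  <- function(observed, predicted) mean(abs(observed - predicted))

# The six rows shown in the comparison above (first rows only, so the
# result says nothing about the error over the whole well).
observed  <- c(2442.590, 3006.989, 3370.000, 2270.000, 2530.758, 2928.314)
predicted <- c(589.2371, 1156.8516, 1729.9769, 2192.8857, 2468.4267, 2584.1540)

print(rmse(observed, predicted))
print(mae(observed, predicted))
```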
Now we summarize the models to assess how well they fit.
summary(Corrected.prosity.model)
##
## Call:
## lm(formula = phi.core ~ phi.N + Facies - 1, data = data.k)
##
## Residuals:
##       Min        1Q    Median        3Q       Max
## -0.103530 -0.011573 -0.000206  0.010463  0.102852
##
## Coefficients:
##           Estimate Std. Error t value Pr(>|t|)
## phi.N     0.013364   0.018060    0.74     0.46
## FaciesF1  0.314805   0.002777  113.37   <2e-16 ***
## FaciesF10 0.207680   0.005072   40.95   <2e-16 ***
## FaciesF2  0.175233   0.009390   18.66   <2e-16 ***
## FaciesF3  0.231939   0.004955   46.81   <2e-16 ***
## FaciesF5  0.272953   0.003914   69.74   <2e-16 ***
## FaciesF7  0.225164   0.008730   25.79   <2e-16 ***
## FaciesF8  0.305884   0.005019   60.94   <2e-16 ***
## FaciesF9  0.264448   0.004825   54.81   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.02326 on 810 degrees of freedom
## Multiple R-squared: 0.9928, Adjusted R-squared: 0.9928
## F-statistic: 1.246e+04 on 9 and 810 DF, p-value: < 2.2e-16
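The adjusted R-squared of 0.9928 looks very high, but it should be read with care. Because the model has no intercept, summary() measures R-squared against zero rather than against the mean of phi.core, which inflates the value, so it is not comparable to the R-squared of a model with an intercept. The residual standard error (0.02326) is a more useful measure: it is about half the standard deviation of phi.core (0.05). Note also that the phi.N coefficient is not significant (p = 0.46), so the predictions are driven almost entirely by the facies terms.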
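How much the missing intercept affects the reported R-squared can be checked by computing a mean-centered R-squared by hand. The sketch below does this on synthetic data (two hypothetical facies with assumed means 0.30 and 0.25); the numbers it prints are unrelated to the model above.

```r
# Synthetic data (assumed, for illustration): two facies with similar means.
set.seed(42)
toy   <- data.frame(Facies = factor(rep(c("A", "B"), each = 50)))
toy$y <- ifelse(toy$Facies == "A", 0.30, 0.25) + rnorm(100, sd = 0.05)

fit.no.int <- lm(y ~ Facies - 1, data = toy)
fit.int    <- lm(y ~ Facies, data = toy)

# R-squared as reported by summary() for the no-intercept model:
# the total sum of squares is sum(y^2), i.e. measured from zero.
r2.reported <- summary(fit.no.int)$r.squared

# R-squared measured from the mean of y, computed by hand.
rss         <- sum(residuals(fit.no.int)^2)
r2.centered <- 1 - rss / sum((toy$y - mean(toy$y))^2)

# For comparison: the same model fitted with an intercept.
r2.with.int <- summary(fit.int)$r.squared

print(c(reported = r2.reported, centered = r2.centered,
        with.intercept = r2.with.int))
```

The centered value matches the R-squared of the intercept model, since both fits produce the same residuals. Applying the same hand calculation to Corrected.prosity.model would give a figure that can be compared with other models.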
summary(Corrected.permeability.model)
##
## Call:
## lm(formula = k.core ~ phi.core.corr + Facies - 1, data = data.k)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -5613.4  -596.9  -130.3   475.0 10449.1
##
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## phi.core.corr  -412352      89814  -4.591 5.11e-06 ***
## FaciesF1        132659      28386   4.673 3.47e-06 ***
## FaciesF10        87869      18969   4.632 4.21e-06 ***
## FaciesF2         73980      16049   4.610 4.69e-06 ***
## FaciesF3         97910      21087   4.643 4.00e-06 ***
## FaciesF5        118916      24729   4.809 1.81e-06 ***
## FaciesF7         95868      20496   4.677 3.40e-06 ***
## FaciesF8        130990      27786   4.714 2.86e-06 ***
## FaciesF9        111324      24050   4.629 4.28e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1546 on 810 degrees of freedom
## Multiple R-squared: 0.7652, Adjusted R-squared: 0.7626
## F-statistic: 293.2 on 9 and 810 DF, p-value: < 2.2e-16
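The adjusted R-squared of 0.7626 is subject to the same caveat, because this model also has no intercept. The residual standard error (1546) is large relative to the standard deviation of k.core (2235.61), so the fit is moderate at best. The negative coefficient on phi.core.corr and the very large facies coefficients are likely a symptom of collinearity: phi.core.corr is itself almost entirely determined by Facies, so the two sets of terms are hard to separate.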
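One way to check for this kind of collinearity is to regress the suspect predictor on the other predictors and look at the R-squared of that auxiliary fit. The sketch below mimics the structure of phi.core.corr with synthetic data: three hypothetical facies levels plus a small log-porosity term (slope 0.013, close to the phi.N estimate above). All values are assumptions for illustration.

```r
# Synthetic stand-in (assumed values): a predictor that is a facies-specific
# constant plus a small contribution from log porosity.
set.seed(7)
n      <- 200
Facies <- factor(sample(c("F1", "F2", "F3"), n, replace = TRUE))
phi.N  <- runif(n, min = 0.05, max = 0.40)
facies.level <- c(F1 = 0.31, F2 = 0.18, F3 = 0.23)
phi.pred <- unname(facies.level[as.character(Facies)]) + 0.013 * phi.N

# Auxiliary regression: how well do the facies alone explain the predictor?
aux    <- lm(phi.pred ~ Facies)
r2.aux <- summary(aux)$r.squared

# Variance inflation factor: values far above 10 are usually taken
# as a sign of serious collinearity.
vif <- 1 / (1 - r2.aux)
print(c(r2.aux = r2.aux, vif = vif))
```

With the real data, the equivalent check would be lm(phi.core.corr ~ Facies, data = data.k). If that R-squared is close to 1, the individual coefficients of the permeability model should not be interpreted separately.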