Dykstra-Parsons coefficient

The Dykstra-Parsons method involves plotting the frequency distribution of the permeability on a log-normal probability graph paper. This is done by arranging the permeability values in descending order and then calculating for each permeability, the percent of the samples with permeability greater than that value. to avoid values of zero or 100%, the percent greater than or equal to value is normalized by n+1, where n it is the number of samples.

data = read.csv("karpur.csv")
head(data)
##    depth caliper ind.deep ind.med  gamma phi.N R.deep  R.med      SP
## 1 5667.0   8.685  618.005 569.781 98.823 0.410  1.618  1.755 -56.587
## 2 5667.5   8.686  497.547 419.494 90.640 0.307  2.010  2.384 -61.916
## 3 5668.0   8.686  384.935 300.155 78.087 0.203  2.598  3.332 -55.861
## 4 5668.5   8.686  278.324 205.224 66.232 0.119  3.593  4.873 -41.860
## 5 5669.0   8.686  183.743 131.155 59.807 0.069  5.442  7.625 -34.934
## 6 5669.5   8.686  109.512  75.633 57.109 0.048  9.131 13.222 -39.769
##   density.corr density phi.core   k.core Facies
## 1       -0.033   2.205  33.9000 2442.590     F1
## 2       -0.067   2.040  33.4131 3006.989     F1
## 3       -0.064   1.888  33.1000 3370.000     F1
## 4       -0.053   1.794  34.9000 2270.000     F1
## 5       -0.054   1.758  35.0644 2530.758     F1
## 6       -0.058   1.759  35.3152 2928.314     F1

This method required using only the K.core column from our data-set.

Here I am sorting the whole data-set descending based on K.core than spliting cored permeability as an individual vector:

#SORTING BASED ON K CORE

data = data[order(data$k.core, decreasing = TRUE), ]
K = data$k.core

Next creating another vector containing the number of samples ≥ k, while the order of data is descending the 1st K will has only one permeability greater or equals to its value while the last K has 819 (data length) of permeability values greater or equals to its value.

sample = c(1:819)

The next parameter required is the percent of samples that have permeability greater or equal to the current K value, which represent the probability of current K value. note: the division on [length of the data-set +1] to neglect having probability of zero and one.

k_percent = (sample * 100) / 820

Now, all parameters is ready to use. Let’s plot Y values (logarithmic permeability) and X values (probability):

xlab = "Portion of Total Sample Having Higher K "
ylab = "Permeability (md)"
plot(k_percent, K, log =  'y', xlab = xlab, ylab = ylab, pch = 10, cex = 0.5, col = "#001c49")

Now, let’s find the Linear Model that fits the data, lm() function used to build linear regression model:

log_k = log(K)
model = lm(log_k ~ k_percent)
plot(k_percent,log_k, xlab = xlab, ylab = ylab, pch = 10, cex = 0.5, col = "#001c49")
abline(model, col = 'red', lwd = 2)

summary(model)
## 
## Call:
## lm(formula = log_k ~ k_percent)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -5.8697 -0.2047  0.1235  0.3150  0.4280 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  9.2584172  0.0377994  244.94   <2e-16 ***
## k_percent   -0.0426137  0.0006549  -65.07   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.5404 on 817 degrees of freedom
## Multiple R-squared:  0.8382, Adjusted R-squared:  0.838 
## F-statistic:  4234 on 1 and 817 DF,  p-value: < 2.2e-16

Our model has good R2 value which is obvious from the plot. abline is just adding the fitted line.

Now let’s find K values at K50 & K84.1 , by define new data frame that contains the required values:

new_data = data.frame(k_percent  = c(50, 84.1))

Now using predict () function to predict from given model:

predicted_values = predict(model, new_data)

so the logarithmic permeability of K50 = 7.13 & K84.1 = 5.67

to calculate heterogeneity index:

V=(K50-K84.1)/(K50)

heterogenity_index = (predicted_values[1] - predicted_values[2]) / predicted_values[1]

heterogenity_index
##         1 
## 0.2038692

So, the heterogeneity index = 0.204 , which indicate slightly heterogeneous class of permeability distribution, the low value of heterogeneity allow to use homogeneous model with minimum error.

2- Lorenz Coefficient This method is measuring the inequality of distribution of a property, in finance for example: Is the income distributed equally among the population? The same concept can be applied in reservoir engineering but between the flow capacity (KH) and storage volume (PHIK).

Using the same data:

data = read.csv("karpur.csv")

Then, we need to calculate the thickness of each permeability based on Depth[i] - Depth[i-1] , then update the data by sorting it based on Permeability.core:

#thickness calculation
thickness = c(0.5)
for (i in 2:819) {
  h = data$depth[i] - data$depth[i-1]
  thickness = append(thickness, h)
}

data = data[order(data$k.core, decreasing = TRUE), ]

Then, calculate the flow capacity and storage capacity which is easily the multiplication between permeability and porosity with corresponding thickness. Then calculate the cumulative sum of each one using cumsum() function.

KH = thickness * data$k.core
PH = thickness * data$phi.core
cum_sum_kh = cumsum(KH)
cum_sum_ph = cumsum(PH)

Finally we need to calculate the following terms:

(∑cum.k.core)/(total)& (∑cum.phi.core)/(total)

frac_tot_vol = (cum_sum_ph/100) / max(cum_sum_ph/100) #to convert phi between (0,1)
frac_tot_flow = cum_sum_kh / max(cum_sum_kh)

Now, plotting the percent of storage capacity on X-axis and the percent of flow capacity on Y-axis:

plot(frac_tot_vol,frac_tot_flow, pch = 10, cex = 0.2, col = '#001c49', text(0.4, 0.8, "D"))
text(0,0, "A", pos = 2)
text(1,1, "B", pos = 1)
text(1,0, "C", pos = 2)
abline(0,1, col = 'red', lwd = 2)

Now, We need to estimate the area under the curve (ADBCA).

using AUC() function that approximate the area under a curve, trapezoid the numerical technique that used:

require("DescTools")
## Loading required package: DescTools
library(DescTools)

area = AUC(frac_tot_vol, frac_tot_flow, method="trapezoid")

The formula of Lorenz Coefficient:

Lorenz Coefficient=(AreaADBA)/(ABCA)

Area ADBA, is just area value - 0.5:

Lcoff  = (area - 0.5) / 0.5

The heterogeneity Index:

Lcoff
## [1] 0.4524741

heterogeneity