Determination of Heterogeneity Index (HI) First by Dykstra-Parsons Coefficient (Vk) and Second by Lorenz Coefficient (Lk)

Heterogeneity is a function of horizontal permeability, and can be quantified by two techniques:

Dykstra-Parsons

Lorenz Coefficient

The data-set used:

data<-read.csv("D:/karpur.csv",header=T)
summary(data)

##      depth         caliper         ind.deep          ind.med       
##  Min.   :5667   Min.   :8.487   Min.   :  6.532   Min.   :  9.386  
##  1st Qu.:5769   1st Qu.:8.556   1st Qu.: 28.799   1st Qu.: 27.892  
##  Median :5872   Median :8.588   Median :217.849   Median :254.383  
##  Mean   :5873   Mean   :8.622   Mean   :275.357   Mean   :273.357  
##  3rd Qu.:5977   3rd Qu.:8.686   3rd Qu.:566.793   3rd Qu.:544.232  
##  Max.   :6083   Max.   :8.886   Max.   :769.484   Max.   :746.028  
##      gamma            phi.N            R.deep            R.med        
##  Min.   : 16.74   Min.   :0.0150   Min.   :  1.300   Min.   :  1.340  
##  1st Qu.: 40.89   1st Qu.:0.2030   1st Qu.:  1.764   1st Qu.:  1.837  
##  Median : 51.37   Median :0.2450   Median :  4.590   Median :  3.931  
##  Mean   : 53.42   Mean   :0.2213   Mean   : 24.501   Mean   : 21.196  
##  3rd Qu.: 62.37   3rd Qu.:0.2640   3rd Qu.: 34.724   3rd Qu.: 35.853  
##  Max.   :112.40   Max.   :0.4100   Max.   :153.085   Max.   :106.542  
##        SP          density.corr          density         phi.core    
##  Min.   :-73.95   Min.   :-0.067000   Min.   :1.758   Min.   :15.70  
##  1st Qu.:-42.01   1st Qu.:-0.016000   1st Qu.:2.023   1st Qu.:23.90  
##  Median :-32.25   Median :-0.007000   Median :2.099   Median :27.60  
##  Mean   :-30.98   Mean   :-0.008883   Mean   :2.102   Mean   :26.93  
##  3rd Qu.:-19.48   3rd Qu.: 0.002000   3rd Qu.:2.181   3rd Qu.:30.70  
##  Max.   : 25.13   Max.   : 0.089000   Max.   :2.387   Max.   :36.30  
##      k.core            Facies         
##  Min.   :    0.42   Length:819        
##  1st Qu.:  657.33   Class :character  
##  Median : 1591.22   Mode  :character  
##  Mean   : 2251.91                     
##  3rd Qu.: 3046.82                     
##  Max.   :15600.00

1- Dykstra-Parsons This method required using only the K.core column from our data-set.

Here I am sorting the whole data-set descending based on K.core than spliting cored permeability as an individual vector:

#SORTING BASED ON K CORE

data = data[order(data$k.core, decreasing = TRUE), ]
K = data$k.core

Next creating another vector containing the number of samples ≥ k, while the order of data is descending the 1st K will has only one permeability greater or equals to its value while the last K has 819 (data length) of permeability values greater or equals to its value.

sample = c(1:819)

The next parameter required is the percent of samples that have permeability greater or equal to the current K value, which represent the probability of current K value. note: the division on [length of the data-set +1] to neglect having probability of zero and one.

k_percent = (sample * 100) / 820

Now, all parameters is ready to use. Let’s plot Y values (logarithmic permeability) and X values (probability):

xlab = "Portion of Total Sample Having Higher K "
ylab = "Permeability (md)"
plot(k_percent, K, log =  'y', xlab = xlab, ylab = ylab, pch = 10, cex = 0.5, col = "#001c49")

Now, let’s find the Linear Model that fits the data, lm() function used to build linear regression model:

log_k = log(K)
model = lm(log_k ~ k_percent)
plot(k_percent,log_k, xlab = xlab, ylab = ylab, pch = 10, cex = 0.5, col = "#001c49")
abline(model, col = 'red', lwd = 2)

summary(model)

## 
## Call:
## lm(formula = log_k ~ k_percent)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -5.8697 -0.2047  0.1235  0.3150  0.4280 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  9.2584172  0.0377994  244.94   <2e-16 ***
## k_percent   -0.0426137  0.0006549  -65.07   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.5404 on 817 degrees of freedom
## Multiple R-squared:  0.8382, Adjusted R-squared:  0.838 
## F-statistic:  4234 on 1 and 817 DF,  p-value: < 2.2e-16

Our model has good R2 value which is obvious from the plot. abline is just adding the fitted line.

Now let’s find K values at K50 & K84.1 , by define new data frame that contains the required values:

new_data = data.frame(k_percent  = c(50, 84.1))

Now using predict () function to predict from given model:

predicted_values = predict(model, new_data)

so the logarithmic permeability of K50 = 7.13 & K84.1 = 5.67

to calculate heterogeneity index:

heterogenity_index = (predicted_values[1] - predicted_values[2]) / predicted_values[1]

heterogenity_index

##         1 
## 0.2038692

So, the heterogeneity index = 0.204 , which indicate slightly heterogeneous class of permeability distribution, the low value of heterogeneity allow to use homogeneous model with minimum error.

2- Lorenz Coefficient

This method is measuring the inequality of distribution of a property, in finance for example: Is the income distributed equally among the population? The same concept can be applied in reservoir engineering but between the flow capacity (K*H) and storage volume (PHI*K).

Using the same data:\

data<-read.csv("D:/karpur.csv",header=T)

Then, we need to calculate the thickness of each permeability based on Depth[i] - Depth[i-1] , then update the data by sorting it based on Permeability.core:

#thickness calculation
thickness = c(0.5)
for (i in 2:819) {
  h = data$depth[i] - data$depth[i-1]
  thickness = append(thickness, h)
}

data = data[order(data$k.core, decreasing = TRUE), ]

Then, calculate the flow capacity and storage capacity which is easily the multiplication between permeability and porosity with corresponding thickness. Then calculate the cumulative sum of each one using cumsum() function.

KH = thickness * data$k.core
PH = thickness * data$phi.core
cum_sum_kh = cumsum(KH)
cum_sum_ph = cumsum(PH)

Finally we need to calculate the following terms:

frac_tot_vol = (cum_sum_ph/100) / max(cum_sum_ph/100) #to convert phi between (0,1)
frac_tot_flow = cum_sum_kh / max(cum_sum_kh)

Now, plotting the percent of storage capacity on X-axis and the percent of flow capacity on Y-axis:

plot(frac_tot_vol,frac_tot_flow, pch = 10, cex = 0.2, col = '#001c49', text(0.4, 0.8, "D"))
text(0,0, "A", pos = 2)
text(1,1, "B", pos = 1)
text(1,0, "C", pos = 2)
abline(0,1, col = 'red', lwd = 2)

Now, We need to estimate the area under the curve (ADBCA).

using AUC() function that approximate the area under a curve, trapezoid the numerical technique that used:

require("DescTools")

## Loading required package: DescTools

## Warning: package 'DescTools' was built under R version 4.2.2

## Loading required package: DescTools

## Warning: package 'DescTools' was built under R version 4.2.2

library(DescTools)

area = AUC(frac_tot_vol, frac_tot_flow, method="trapezoid")

The formula of Lorenz Coefficient:

Area ADBA, is just area value - 0.5:

Lcoff  = (area - 0.5) / 0.5

The heterogeneity Index:

Lcoff

## [1] 0.4524741

is 0.45 , in Lorenz method if the value is closer to zero, it considers more heterogeneous. So the current value considers as heterogeneous or high heterogeneous reservoir.

Determination of Heterogeneity Index (HI) First by Dykstra-Parsons Coefficient (Vk) and Second by Lorenz Coefficient (Lk)

Zahraa Sameer Mohammed

2023-12-27

2- Lorenz Coefficient