Here we measure the degree of heterogeneity in permeability using:

Dykstra-Parsons coefficient

The Dykstra-Parsons method involves plotting the frequency distribution of the permeability on a log-normal probability graph paper. This is done by arranging the permeability values in descending order and then calculating for each permeability, the percent of the samples with permeability greater than that value. to avoid values of zero or 100%, the percent greater than or equal to value is normalized by n+1, where n it is the number of samples.

first, let’s import the data

data = read.csv("karpur.csv")
k_core = data$k.core
summary(k_core)
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
##     0.42   657.33  1591.22  2251.91  3046.82 15600.00

second, let’s take the permeability and sort in descending order

sorted_k = sort(k_core, decreasing = TRUE)
head(sorted_k)
## [1] 15600.00 14225.31 13544.98 13033.53 11841.74 11117.40

next, for each permeability, calculate the percent of the samples with permeability greater than that value. since the permeability is sorted descending order, we simply made a list with the numbers 1, 2, 3, …, n and each number in the list is divided by (n+1).

n = length(sorted_k) #the total number of samples
k_percent = c(1:n)/(n+1) 
head(k_percent)
## [1] 0.001219512 0.002439024 0.003658537 0.004878049 0.006097561 0.007317073

polt the data on a log-normal probability graph, k_percent on the x-axis, sorted_k on the y-axis

plot(
  x = k_percent, 
  y = sorted_k,
  type="l",
  xlab = "% >= k",
  ylab = "k",
  log='y',
  lwd=2
  )

as we can see in the graph, the curve in the interval [0.1, 0.9] can be approximated

vary well with a straight line, so let’s fit a straight line to this portion of the data

nearly_linear_k_values = sorted_k[floor(0.1*n): floor(0.9*n)]
nearly_linear_k_percent = k_percent[floor(0.1*n): floor(0.9*n)]
linear_model = lm(log10(nearly_linear_k_values) ~ nearly_linear_k_percent)
summary(linear_model)
## 
## Call:
## lm(formula = log10(nearly_linear_k_values) ~ nearly_linear_k_percent)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.19128 -0.03880  0.02148  0.04110  0.07176 
## 
## Coefficients:
##                          Estimate Std. Error t value Pr(>|t|)    
## (Intercept)              3.885895   0.005010   775.7   <2e-16 ***
## nearly_linear_k_percent -1.461904   0.009112  -160.4   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.05402 on 655 degrees of freedom
## Multiple R-squared:  0.9752, Adjusted R-squared:  0.9751 
## F-statistic: 2.574e+04 on 1 and 655 DF,  p-value: < 2.2e-16
plot(
  x = k_percent, 
  y = sorted_k,
  type="l",
  xlab = "% >= k",
  ylab = "k",
  log='y',
  lwd=2
  )
abline(linear_model, col="blue", lwd=2)

The Dykstra-Parsons coefficient of permeability variation, V, is determined from the graph as:

\(V = (k_{50}-k_{84.1})/k_{50}\)

where:

k50 = permeability value at %k>= 50 (log mean)

k84.1= permeability value at %k >= 84.1 (one standard deviation)

log_k50_and_84.1 = predict(
  linear_model, 
  data.frame(nearly_linear_k_percent=c(0.50, 0.841))
  )
print(10^log_k50_and_84.1) 
##         1         2 
## 1428.7056  453.3498
k50   = 10^log_k50_and_84.1[1]
k84.1 = 10^log_k50_and_84.1[2]
permeability_variation = (k50 - k84.1)/k50
print(permeability_variation)
##        1 
## 0.682685

Lorenz coefficient

The Lorenz coefficient of permeability variation is obtained by plotting a graph of cumulative kh versus cumulative phi*h, sometimes called a flow capacity plot.

first, let’s calculate the thickness from the depths.

the differences in depths gives the thickness:

d = data$depth
h = d[2:n]-d[1:n-1] #differences in depths
summary(h)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.5000  0.5000  0.5000  0.5086  0.5000  1.0000

we need one more value to have the same number of data samples, since most of the values are 0.5, this what I’m going to add.

h = append(h, 0.5)

second, lets calculate the cumulative kh, and cumulative phi*h

df         = data.frame(h)  # we will use data frames to store the data
df['h']    = h              # 1st column
k          = data$k.core
df['k']    = k              # 2nd column
df['phi']  = data$phi.core  # 3nd column
df         = df[order(k, decreasing = TRUE),] #sort data such that k is in descending order

df['kh']           = df['k']*df['h']              #multiply k and h
df['sum_kh']       = cumsum(df['kh'])             #calculate the cumulative sum
df['sum_kh/total'] = df['sum_kh']/df[n, 'sum_kh'] #divide by the total

#same for phi and h
df['phih']           = df['phi']*df['h']
df['sum_phih']       = cumsum(df['phih'])
df['sum_phih/total'] = df['sum_phih']/df[n, 'sum_phih']

plot cumulative kh versus cumulative phi*h

plot(
  df['sum_phih/total'][1:n,1], df['sum_kh/total'][1:n,1],
  xlab = "fraction of total volume (phih)", 
  ylab = "fraction of total flow capacity (kh)",
  type = "l",
  lwd = 2
  )

we need to calculate the area under the above graph, I implemented the trapezoid method to find the AUC.

x  = c(df['sum_phih/total'][1:n,1])
dx = x[2:n] - x[1:n-1]

y1 = c(df['sum_kh/total'][1:n,1])[1:n-1]
y2 = c(df['sum_kh/total'][1:n,1])[2:n]

lower_area = sum(y1*dx) #Reduce("+", y1*dx)
upper_area = sum(y2*dx) #Reduce("+", y2*dx)

AUC = (lower_area + upper_area)/2

print(AUC)
## [1] 0.7247318

finally we calculate Lorenz coefficient as fallows:

\({Lorenz\ coefficient} = (AUC - 0.5)/0.5\)

Lorenz_coefficient = (AUC - 0.5)/0.5
print(Lorenz_coefficient)
## [1] 0.4494636

The Lorenz coefficient of permeability variation also varies from zero to one. Unfortunately, the Lorenz coefficient is not a unique measure of reservoir heterogeneity. Several different permeability distributions can give the same value of Lorenz coefficient. For log-normal permeability distributions, the Lorenz coefficient is very similar to the Dykstra-Parsons coefficient of permeability variation.