R Markdown

Heterogeneity is the spatial variability in permeability. It can be quantified by two methods: 1- Dykstra-Parsons Coefficient 2- Lorenz Coefficient

# 1- Dykstra-Parsons Coefficient

This method involves simple calculations on the permeability values in tabular form. It starts by counting, for each sample, the number of data points with greater or equal permeability (Number of samples >= k). This count is then converted to a percentage of all samples (% >= k). Plotting permeability on a logarithmic scale against this percentage should give an approximately straight trend line. Finally, we can model that relationship and calculate the heterogeneity index (HI) with a mathematical formula.
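For reference, the Dykstra-Parsons coefficient is commonly written in terms of the permeabilities read from the fitted trend line at the 50% and 84.1% points. The symbols $V$, $k_{50}$ and $k_{84.1}$ are notation introduced here for illustration, not names used in the code:

```latex
V = \frac{k_{50} - k_{84.1}}{k_{50}}
```

Both $k_{50}$ and $k_{84.1}$ are permeabilities (md), not their logarithms. A value of $V$ near 0 indicates a nearly homogeneous interval, while a value near 1 indicates strong heterogeneity.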

First, reading the whole data set

data=read.csv("karpur.csv")
head(data)
##    depth caliper ind.deep ind.med  gamma phi.N R.deep  R.med      SP
## 1 5667.0   8.685  618.005 569.781 98.823 0.410  1.618  1.755 -56.587
## 2 5667.5   8.686  497.547 419.494 90.640 0.307  2.010  2.384 -61.916
## 3 5668.0   8.686  384.935 300.155 78.087 0.203  2.598  3.332 -55.861
## 4 5668.5   8.686  278.324 205.224 66.232 0.119  3.593  4.873 -41.860
## 5 5669.0   8.686  183.743 131.155 59.807 0.069  5.442  7.625 -34.934
## 6 5669.5   8.686  109.512  75.633 57.109 0.048  9.131 13.222 -39.769
##   density.corr density phi.core   k.core Facies phi..Core.frc
## 1       -0.033   2.205  33.9000 2442.590     F1      0.339000
## 2       -0.067   2.040  33.4131 3006.989     F1      0.334131
## 3       -0.064   1.888  33.1000 3370.000     F1      0.331000
## 4       -0.053   1.794  34.9000 2270.000     F1      0.349000
## 5       -0.054   1.758  35.0644 2530.758     F1      0.350644
## 6       -0.058   1.759  35.3152 2928.314     F1      0.353152
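Later steps take the logarithm of `k.core`, so it may be worth checking for missing or non-positive values first. The sketch below shows one possible check on a made-up data frame; `toy`, `ok` and `toy_clean` are hypothetical names, and the values are not taken from `karpur.csv`.

```r
# Hypothetical stand-in for the k.core column; values are made up
toy = data.frame(k.core = c(2442.59, NA, 0, 3370, 0.07))

# log() gives NA for missing values and -Inf for 0, so keep only finite, positive values
ok = is.finite(toy$k.core) & toy$k.core > 0
toy_clean = toy[ok, , drop = FALSE]

# Number of rows dropped by the check
nrow(toy) - nrow(toy_clean)
```

Whether any rows of the real data set need to be dropped is not stated in the source; this is only a precaution.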

For simplicity, sorting permeability values in a descending order.

data = data[order(data$k.core, decreasing = TRUE), ]
k = data$k.core

Since the values are sorted in descending order, the number of samples with permeability >= k is simply the row index.

# Row index = number of samples with permeability >= k (note: this name shadows base::sample)
sample = seq_along(k)

Calculating the percentage of samples with permeability >= k (% >= k)

k_percent =(sample * 100) / length(k)
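As a small illustration of the ranking steps above, the sketch below applies them to a made-up vector of five permeability values. The names `k_toy`, `k_sorted`, `n_ge_k` and `pct_ge_k` are hypothetical and the numbers are not taken from `karpur.csv`.

```r
# Hypothetical permeability values (md), for illustration only
k_toy = c(120, 950, 35, 400, 10)

# Sort in descending order so that position i counts the samples with k >= k_sorted[i]
k_sorted = sort(k_toy, decreasing = TRUE)
n_ge_k = seq_along(k_sorted)

# Percentage of samples with permeability greater than or equal to each value
pct_ge_k = 100 * n_ge_k / length(k_sorted)

data.frame(k = k_sorted, n_ge_k = n_ge_k, pct_ge_k = pct_ge_k)
```

The row index equals the count of samples with greater or equal permeability only when there are no tied values; with ties, the index of the last tied row is the exact count.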

When the output is plotted with permeability on a logarithmic axis, an approximately straight line is observed.

xlab = "Portion of Total Samples Having Larger or Equal k (%)"
ylab = "permeability (md)"
# ylab must be passed by name; passed positionally it is taken as the plot type
plot(k_percent, k, log = 'y', xlab = xlab, ylab = ylab, pch = 10, cex = 0.5, col = "#001c49")

A linear model is fitted to the data.

log_k = log(k)
model = lm(log_k~ k_percent)
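# The y axis now shows the natural log of permeability, so label it accordingly
plot(k_percent, log_k, xlab = xlab, ylab = "ln(permeability (md))", pch = 10, cex = 0.5, col = "#001c49")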
abline(model, col ='red' , lwd = 2)

Brief description of the model coefficients and evaluation metrics

summary(model)
## 
## Call:
## lm(formula = log_k ~ k_percent)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -5.8697 -0.2047  0.1235  0.3150  0.4280 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  9.2584172  0.0377994  244.94   <2e-16 ***
## k_percent   -0.0425617  0.0006541  -65.07   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.5404 on 817 degrees of freedom
## Multiple R-squared:  0.8382, Adjusted R-squared:  0.838 
## F-statistic:  4234 on 1 and 817 DF,  p-value: < 2.2e-16
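Because the model is fitted to the natural log of permeability, a fitted permeability at a given percentage is recovered by exponentiating the prediction. A rough sketch using the rounded coefficients printed above; `b0`, `b1`, `k_at`, `k50` and `k84` are names introduced here for illustration:

```r
b0 = 9.2584172    # intercept from the summary above
b1 = -0.0425617   # slope from the summary above

# Back-transform the fitted ln(k) to permeability (md)
k_at = function(pct) exp(b0 + b1 * pct)

k50 = k_at(50)     # fitted permeability at the 50% point
k84 = k_at(84.1)   # fitted permeability at the 84.1% point
c(k50 = k50, k84.1 = k84)
```

Since the slope is negative, the fitted permeability at 84.1% is lower than at 50%, and each additional percentage point multiplies the fitted permeability by `exp(b1)`.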

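Calculating the HI mathematically. The model predicts ln(k), so the predictions are back-transformed with exp() before applying the Dykstra-Parsons formula (k50 - k84.1) / k50.

new_data = data.frame(k_percent = c(50, 84.1))
# Predicted ln(k) at the 50% and 84.1% points, converted back to permeability (md)
predicted_k = exp(unname(predict(model, new_data)))
heterogeneity_index = (predicted_k[1] - predicted_k[2]) / predicted_k[1]
round(heterogeneity_index, 3)
## [1] 0.766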