The “rock” dataset from the datasets package in R contains measurements on 48 rock samples from a petroleum reservoir. The samples come from twelve cores, each cut into 4 cross-sections. Permeability was measured once per core, and the other three variables were measured on each cross-section:
area: area of pore space, in pixels out of 256 by 256
peri: perimeter, in pixels
shape: perimeter/sqrt(area)
perm: permeability, in milli-Darcies
str(rock)
## 'data.frame': 48 obs. of 4 variables:
## $ area : int 4990 7002 7558 7352 7943 7979 9333 8209 8393 6425 ...
## $ peri : num 2792 3893 3931 3869 3949 ...
## $ shape: num 0.0903 0.1486 0.1833 0.1171 0.1224 ...
## $ perm : num 6.3 6.3 6.3 6.3 17.1 17.1 17.1 17.1 119 119 ...
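The repeated `perm` values in the output above hint at the sampling design (twelve cores, 4 cross-sections each). As a quick check, the sketch below assumes the rows are stored in core order, four consecutive rows per core. The `core` index is a hypothetical helper created here, not a column in the dataset.

```r
# Hypothetical core index: assumes rows are ordered by core, 4 cross-sections each
core <- rep(1:12, each = 4)

# Number of distinct permeability values within each assumed core
perm_per_core <- tapply(rock$perm, core, function(x) length(unique(x)))
perm_per_core
```

If permeability is recorded once per core and the ordering assumption holds, every count should be 1.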
First, I will assign the variables more meaningful names.
library(data.table)  # provides setnames()
t <- as.data.frame(rock)
setnames(t, old=1:4, new=c("Area", "Perimeter", "Shape", "Permeability"))
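Since `setnames()` comes from the data.table package, the same renaming can also be done in base R. A minimal sketch, where `rock_named` is a name chosen here purely for illustration:

```r
# Copy the built-in data so the original `rock` object is left untouched
rock_named <- as.data.frame(rock)

# Assign new names positionally: area, peri, shape, perm
names(rock_named) <- c("Area", "Perimeter", "Shape", "Permeability")
str(rock_named)
```

This assigns names by position, so it relies on the column order shown in the `str(rock)` output above.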
The following scatterplot shows the relationship among all 4 variables:
1. Perimeter of pores is mapped to the x-axis
2. Area of pores is mapped to the y-axis
3. Permeability is encoded by marker color
4. Shape is encoded by marker size
library(plotly)
# mode="markers" is set explicitly so plotly draws points rather than guessing the mode
plot_ly(t, type="scatter", mode="markers", x=~Perimeter, y=~Area,
        color=~Permeability, size=~Shape, colors=c("purple", "orange")) %>%
  layout(title="Measurements on Petroleum Rock Samples",
         xaxis=list(title="Perimeter of Pores"), yaxis=list(title="Area of Pores"))
Since shape is a function of both pore perimeter and pore area, we would expect circle size to vary systematically along both axes. Higher permeability appears to be associated with a smaller pore perimeter and a larger shape value.
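One way to put rough numbers on these visual impressions is a rank correlation matrix. The sketch below works on the original `rock` columns and uses Spearman's method. That choice is an assumption made here, because permeability values vary widely. It is a quick check, not a formal analysis.

```r
# Rank (Spearman) correlations among the four variables
rank_cor <- cor(rock, method = "spearman")

# Correlation of each variable with permeability, rounded for readability
round(rank_cor[, "perm"], 2)
```

A negative value for `peri` and a positive value for `shape` would be consistent with the pattern read off the plot.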