I need help working out and explaining a concept. I’ve created a reprex of my method for calculating a variable circulus deposition rate. The method is based on the lifetime average deposition rate (calculated using the Carlin data) and the intercirculus spacings (the distance between each pair of adjacent circuli). Math is really not my strong suit sometimes, so I want to know if what I did is unnecessarily complicated and/or blatantly erroneous. I feel like standardizing might be the way to go here, but I couldn’t figure out how to make that approach work.
I think this method works, but I need help:
understanding if/why it is/is not mathematically sound
figuring out whether there is a simpler way to do this that I’m entirely overlooking
trying to explain in words what the heck I actually did here
Each spacings[i] represents the distance between circ_num[i] and circ_num[i+1].
library(dplyr) # provides tibble(), mutate(), and %>%
set.seed(123)
exampledata <- tibble(circ_num = seq(1, 15, 1), spacings = rnorm(15, 0.05, 0.0215))
exampledata
## # A tibble: 15 x 2
## circ_num spacings
## <dbl> <dbl>
## 1 1 0.0379
## 2 2 0.0451
## 3 3 0.0835
## 4 4 0.0515
## 5 5 0.0528
## 6 6 0.0869
## 7 7 0.0599
## 8 8 0.0228
## 9 9 0.0352
## 10 10 0.0404
## 11 11 0.0763
## 12 12 0.0577
## 13 13 0.0586
## 14 14 0.0524
## 15 15 0.0380
mean(exampledata$spacings)
## [1] 0.05327626
I think it’s a fair argument to assume that the mean circulus deposition rate corresponds to the mean spacing value. This is based on the fast growth/wide spacing, slow growth/narrow spacing hypothesis: if a fish lays down circuli at a constant rate, the circuli will be evenly spaced. However, since we know that circuli are not evenly spaced in our real data, we can speculate that these circuli take different amounts of time to deposit.
I want to come up with some multiplier that will represent the relationship of each spacing with the mean spacing (and thereby the mean deposition rate), so I calculated spacing value / mean value, which is the “percent of the mean spacing” (expressed as a proportion, so 1 = the mean).
If we used this as the multiplier for our average deposition rate, the relationship would be smaller spacing = fewer days to deposit, larger spacing = more days to deposit.
exampledata %>%
mutate(perc_of_mean = spacings/mean(exampledata$spacings))
## # A tibble: 15 x 3
## circ_num spacings perc_of_mean
## <dbl> <dbl> <dbl>
## 1 1 0.0379 0.712
## 2 2 0.0451 0.846
## 3 3 0.0835 1.57
## 4 4 0.0515 0.967
## 5 5 0.0528 0.991
## 6 6 0.0869 1.63
## 7 7 0.0599 1.12
## 8 8 0.0228 0.428
## 9 9 0.0352 0.661
## 10 10 0.0404 0.759
## 11 11 0.0763 1.43
## 12 12 0.0577 1.08
## 13 13 0.0586 1.10
## 14 14 0.0524 0.983
## 15 15 0.0380 0.714
However, this doesn’t work as a multiplier because the relationship we want is actually the inverse. If a spacing is small, we need the multiplier to make the value it’s multiplied by larger (narrow spacing, more time required to form a circulus because growth is slow). In contrast, if a spacing is large, we need the multiplier to make the value it’s multiplied by smaller (wide spacing, less time required to form a circulus because growth is fast). So I was thinking you could just take the inverse… but you can’t.
If this were correct, the sums of perc_of_mean and perc_of_mean_inverse should be equal, but they are not.
proof<-exampledata %>%
mutate(perc_of_mean = spacings/mean(exampledata$spacings)) %>%
mutate(perc_of_mean_inverse = mean(exampledata$spacings)/spacings)
proof
## # A tibble: 15 x 4
## circ_num spacings perc_of_mean perc_of_mean_inverse
## <dbl> <dbl> <dbl> <dbl>
## 1 1 0.0379 0.712 1.40
## 2 2 0.0451 0.846 1.18
## 3 3 0.0835 1.57 0.638
## 4 4 0.0515 0.967 1.03
## 5 5 0.0528 0.991 1.01
## 6 6 0.0869 1.63 0.613
## 7 7 0.0599 1.12 0.889
## 8 8 0.0228 0.428 2.34
## 9 9 0.0352 0.661 1.51
## 10 10 0.0404 0.759 1.32
## 11 11 0.0763 1.43 0.698
## 12 12 0.0577 1.08 0.923
## 13 13 0.0586 1.10 0.909
## 14 14 0.0524 0.983 1.02
## 15 15 0.0380 0.714 1.40
sum(proof$perc_of_mean); sum(proof$perc_of_mean_inverse)
## [1] 15
## [1] 16.88434
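The mismatch between the two sums is expected rather than a sign of a coding mistake: for positive values that are not all equal, the reciprocals of the ratios to the mean always average more than 1 (the arithmetic mean is at least as large as the harmonic mean). A minimal base-R sketch of that check, regenerating the example spacings with the same seed; the `equal_spacings` vector is a made-up comparison case:

```r
# Regenerate the example spacings (same seed and parameters as the reprex).
set.seed(123)
spacings <- rnorm(15, 0.05, 0.0215)

perc_of_mean <- spacings / mean(spacings)
perc_of_mean_inverse <- mean(spacings) / spacings

# The ratios to the mean always sum to the number of values...
sum_ratio <- sum(perc_of_mean)
# ...but their reciprocals sum to more than that when the spacings differ.
sum_inverse <- sum(perc_of_mean_inverse)

# Hypothetical comparison: with identical spacings the two sums agree.
equal_spacings <- rep(0.05, 15)
sum_equal <- sum(mean(equal_spacings) / equal_spacings)

c(sum_ratio = sum_ratio, sum_inverse = sum_inverse, sum_equal = sum_equal)
```

So the plain reciprocal does reverse the direction of the relationship; what it fails to do is keep the multipliers averaging 1.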
So instead, I took the absolute value of 1 - perc_of_mean (dist_from_1). If the value of perc_of_mean was originally greater than 1, I subtracted dist_from_1 from 1. If the value of perc_of_mean was originally less than 1, I added dist_from_1 to 1. This should create the inverse multiplier.
To check that these are the inverse of the multipliers, the sums of perc_of_mean and mult_rate_by should be equal.
proof2<-exampledata %>%
mutate(perc_of_mean = spacings/mean(exampledata$spacings)) %>%
mutate(dist_from_1 = abs(1-perc_of_mean)) %>%
mutate(mult_rate_by = ifelse(perc_of_mean < 1, dist_from_1+1, 1-dist_from_1))
proof2
## # A tibble: 15 x 5
## circ_num spacings perc_of_mean dist_from_1 mult_rate_by
## <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 0.0379 0.712 0.288 1.29
## 2 2 0.0451 0.846 0.154 1.15
## 3 3 0.0835 1.57 0.568 0.432
## 4 4 0.0515 0.967 0.0330 1.03
## 5 5 0.0528 0.991 0.00932 1.01
## 6 6 0.0869 1.63 0.631 0.369
## 7 7 0.0599 1.12 0.125 0.875
## 8 8 0.0228 0.428 0.572 1.57
## 9 9 0.0352 0.661 0.339 1.34
## 10 10 0.0404 0.759 0.241 1.24
## 11 11 0.0763 1.43 0.432 0.568
## 12 12 0.0577 1.08 0.0837 0.916
## 13 13 0.0586 1.10 0.100 0.900
## 14 14 0.0524 0.983 0.0168 1.02
## 15 15 0.0380 0.714 0.286 1.29
sum(proof2$perc_of_mean); sum(proof2$mult_rate_by)
## [1] 15
## [1] 15
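The equal sums can also be confirmed on paper. In the sketch below, the symbols are introduced only for this note: s_i is a spacing, s̄ is the mean spacing, p_i is perc_of_mean, m_i is mult_rate_by, and n is the number of spacings. Both branches of the ifelse reduce to the same expression, and the sum then follows directly:

```latex
\begin{aligned}
m_i &= 1 + |1 - p_i| = 1 + (1 - p_i) = 2 - p_i && \text{if } p_i < 1 \\
m_i &= 1 - |1 - p_i| = 1 - (p_i - 1) = 2 - p_i && \text{if } p_i \ge 1 \\
\sum_{i=1}^{n} m_i &= \sum_{i=1}^{n} (2 - p_i) = 2n - \sum_{i=1}^{n} \frac{s_i}{\bar{s}} = 2n - n = n
\end{aligned}
```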
Now I’ll test whether this works as a multiplier that varies the lifetime average circulus deposition rate to create estimates for a possible variable deposition rate scenario.
avg_circulus_deposition_rate<-11.1
proof3<-exampledata %>%
mutate(perc_of_mean = spacings/mean(exampledata$spacings)) %>%
mutate(dist_from_1 = abs(1-perc_of_mean)) %>%
mutate(mult_rate_by = ifelse(perc_of_mean < 1, dist_from_1+1, 1-dist_from_1)) %>%
mutate(circ_rate= (avg_circulus_deposition_rate*mult_rate_by))
proof3
## # A tibble: 15 x 6
## circ_num spacings perc_of_mean dist_from_1 mult_rate_by circ_rate
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 0.0379 0.712 0.288 1.29 14.3
## 2 2 0.0451 0.846 0.154 1.15 12.8
## 3 3 0.0835 1.57 0.568 0.432 4.80
## 4 4 0.0515 0.967 0.0330 1.03 11.5
## 5 5 0.0528 0.991 0.00932 1.01 11.2
## 6 6 0.0869 1.63 0.631 0.369 4.10
## 7 7 0.0599 1.12 0.125 0.875 9.72
## 8 8 0.0228 0.428 0.572 1.57 17.4
## 9 9 0.0352 0.661 0.339 1.34 14.9
## 10 10 0.0404 0.759 0.241 1.24 13.8
## 11 11 0.0763 1.43 0.432 0.568 6.30
## 12 12 0.0577 1.08 0.0837 0.916 10.2
## 13 13 0.0586 1.10 0.100 0.900 9.99
## 14 14 0.0524 0.983 0.0168 1.02 11.3
## 15 15 0.0380 0.714 0.286 1.29 14.3
Firstly, if this works, the sum of the variable circulus deposition rates should be equal to the average deposition rate (11.1) multiplied by the number of values (15).
Secondly, the mean of the variable circulus deposition rates should be equal to the average deposition rate (11.1).
sum(proof3$circ_rate);(11.1*length(proof3$circ_num))
## [1] 166.5
## [1] 166.5
mean(proof3$circ_rate);avg_circulus_deposition_rate
## [1] 11.1
## [1] 11.1
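Matching the sum and the mean confirms that the lifetime average is preserved, but it does not by itself guarantee that every individual rate is sensible. Since mult_rate_by equals 2 - perc_of_mean in both branches of the ifelse, any spacing at least twice the mean gives a zero or negative rate. The example data never reach that point (the largest perc_of_mean is 1.63), so the sketch below uses hypothetical spacings, made up purely for illustration:

```r
avg_circulus_deposition_rate <- 11.1

# Hypothetical spacings (not real data); the last one is 2.5 times the mean.
wide_spacings <- c(0.02, 0.03, 0.04, 0.15)

perc_of_mean <- wide_spacings / mean(wide_spacings)
mult_rate_by <- 2 - perc_of_mean
circ_rate <- avg_circulus_deposition_rate * mult_rate_by

# The mean still equals the lifetime average, yet the last rate is negative.
data.frame(wide_spacings, perc_of_mean, mult_rate_by, circ_rate)
mean(circ_rate)
```

If real scales can contain spacings that wide, the mean check alone would not flag the problem, so a range check on circ_rate may be worth adding.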
Here are two examples to think through why this might be right. In the first example, the spacing value (0.03794977) is much less than the mean (0.05327626), so we would expect the deposition rate (days per circulus) to be greater than the average (11.1), which it is (14.3).
## # A tibble: 1 x 6
## circ_num spacings perc_of_mean dist_from_1 mult_rate_by circ_rate
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 0.0379 0.712 0.288 1.29 14.3
In the second example, the spacing value (0.0835) is much greater than the mean (0.05327626), so we would expect the deposition rate to be less than the average (11.1), which it is (4.8).
## # A tibble: 1 x 6
## circ_num spacings perc_of_mean dist_from_1 mult_rate_by circ_rate
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 3 0.0835 1.57 0.568 0.432 4.80
This is the same thing, but written as a one-line formula. If I’m interpreting this correctly…
spacings/mean(spacings) = the proportional relationship between each spacing and the mean spacing, where a value of 1 means the spacing equals mean(spacings)
1-spacings/mean(spacings) = direction and size of the deviation from the mean
(1-(spacings/mean(spacings)))+1 = the reversed proportional relationship (still centered on 1, but the direction is flipped); this simplifies to 2-spacings/mean(spacings)
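One possible way to put the multiplier into words, offered as a sketch rather than the definitive reading: it is the ratio to the mean reflected about 1, so a spacing about 29% below the mean gets a rate about 29% above the average. It is also the straight-line (first-order) approximation of the true reciprocal near 1, which is why it behaves like an inverse for spacings close to the mean and drifts away from the reciprocal for spacings far from it. Here p stands for spacings/mean(spacings) and m for mult_rate_by; both symbols are introduced only for this note, and the series holds for |p - 1| < 1:

```latex
\begin{aligned}
m &= 1 + (1 - p) = 2 - p && \text{(reflection of } p \text{ about 1)} \\
\frac{1}{p} &= \frac{1}{1 + (p - 1)} = 1 - (p - 1) + (p - 1)^2 - \dots \approx 2 - p && \text{(for } p \text{ close to 1)}
\end{aligned}
```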
proof4<-exampledata %>%
mutate(mult_rate_by = (1-(spacings/mean(spacings)))+1) %>%
mutate(circ_rate= (avg_circulus_deposition_rate*mult_rate_by))
proof4
## # A tibble: 15 x 4
## circ_num spacings mult_rate_by circ_rate
## <dbl> <dbl> <dbl> <dbl>
## 1 1 0.0379 1.29 14.3
## 2 2 0.0451 1.15 12.8
## 3 3 0.0835 0.432 4.80
## 4 4 0.0515 1.03 11.5
## 5 5 0.0528 1.01 11.2
## 6 6 0.0869 0.369 4.10
## 7 7 0.0599 0.875 9.72
## 8 8 0.0228 1.57 17.4
## 9 9 0.0352 1.34 14.9
## 10 10 0.0404 1.24 13.8
## 11 11 0.0763 0.568 6.30
## 12 12 0.0577 0.916 10.2
## 13 13 0.0586 0.900 9.99
## 14 14 0.0524 1.02 11.3
## 15 15 0.0380 1.29 14.3
sum(proof4$circ_rate);(11.1*length(proof4$circ_num))
## [1] 166.5
## [1] 166.5
mean(proof4$circ_rate);avg_circulus_deposition_rate
## [1] 11.1
## [1] 11.1
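On the question of a simpler or more standardized route, one alternative (a different technique from the reflection used above, and not something the data here establish) is to take the reciprocal of each spacing and rescale so that the multipliers average exactly 1. This assumes that the time needed to form a circulus is inversely proportional to its spacing, which is an assumption. The names `inverse_spacing`, `mult_rate_recip`, and `circ_rate_recip` are hypothetical:

```r
# Regenerate the example spacings (same seed and parameters as the reprex).
set.seed(123)
spacings <- rnorm(15, 0.05, 0.0215)
avg_circulus_deposition_rate <- 11.1

# Reciprocal of each spacing, rescaled so the multipliers have a mean of 1.
inverse_spacing <- 1 / spacings
mult_rate_recip <- inverse_spacing / mean(inverse_spacing)
circ_rate_recip <- avg_circulus_deposition_rate * mult_rate_recip

# Reflected multiplier from the reprex, for side-by-side comparison.
circ_rate_reflect <- avg_circulus_deposition_rate * (2 - spacings / mean(spacings))

data.frame(spacings, circ_rate_reflect, circ_rate_recip)
mean(circ_rate_recip)
```

Unlike the reflected multiplier, this version stays positive for any positive spacing and still returns the lifetime average as its mean; whether the inverse-proportional assumption suits the biology is a separate question.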
So in conclusion…
Is this method for estimating a variable deposition rate (each circulus of each fish gets its own deposition rate rather than just each fish getting its own lifetime average deposition rate) mathematically correct?
If so, does it make any sense??
And if you can spare the time… do you have any ideas on how I can explain what I did here? The part I’m really stuck on is explaining how I came up with the final multiplier value, (1-(spacings/mean(spacings)))+1. How do you describe this value in words?