In this Project, I will analyze the the relationship of the body weights of mammals and their pulse rates.
Assumption is that the amount of energy available is proportional to the blood flow through the lungs, which is the source of oxygen. Assuming the least amount of blood needed to circulate, the amount of available energy will equal the amount of energy used to maintain the body temperature.
Construct a model relating blood flow through the heart to body weight.
The following data relate weights of some mammals to their heart rate measured in beats per minute. Construct a model that relates heart rate to body weight. Discuss the assumptions of your model. Use the data to check your model.
library(ggplot2)
library(reshape2)
mammals<- c('Vespergo pipistrellas', 'Mouse', 'Rat','Guinea pig', 'Rabbit', 'Little dog', 'Big dog','Sheep', 'Man', 'Horse', 'Ox','Elephant')
body_weight<- c(4,25,200,300,2000,5000,30000,50000,70000,450000,500000,3000000)
pulse_rate<-c(660,670,420,300,205,120,85,70,72,38,40,48)
df<- data.frame(cbind(mammals, body_weight,pulse_rate ))
#df$pulse_rate <- as.integer(df$pulse_rate)
#df$body_weight <- as.integer(df$body_weight)
str(df)
## 'data.frame': 12 obs. of 3 variables:
## $ mammals : Factor w/ 12 levels "Big dog","Elephant",..: 12 7 10 3 9 5 1 11 6 4 ...
## $ body_weight: Factor w/ 12 levels "200","2000","25",..: 7 3 1 4 2 9 5 10 12 8 ...
## $ pulse_rate : Factor w/ 12 levels "120","205","300",..: 8 9 6 3 2 1 12 10 11 4 ...
knitr::kable(df)
mammals | body_weight | pulse_rate |
---|---|---|
Vespergo pipistrellas | 4 | 660 |
Mouse | 25 | 670 |
Rat | 200 | 420 |
Guinea pig | 300 | 300 |
Rabbit | 2000 | 205 |
Little dog | 5000 | 120 |
Big dog | 30000 | 85 |
Sheep | 50000 | 70 |
Man | 70000 | 72 |
Horse | 450000 | 38 |
Ox | 5e+05 | 40 |
Elephant | 3e+06 | 48 |
It is understood that smaller the mammals have much higher resting pulse rates than large ones. For examples, a mouse, with a body mass of 25 grams, may have a heart rate of 670 beats/minute, as compared to 48 beats/minute for a Elephant.
Warm-blooded animals at rest use large quantities of energy to maintain body temperature, due to heat loss through the body surface. Let us make the following assumptions about energy balance over a fixed time interval (1 minute):
A. Energy lost is proportional to the surface area of the body.
B. Energy gained is proportional to the amount of blood flowing through the lungs, the source of oxygen for the body.
C. Energy lost = energy gained for a body in temperature equilibrium.
Model Construction E = energy S = surface area m = mass W = weight of the animal r = pulse rate V = volume of the animal b = volume of blood pumped by the heart in one stroke.
From assumption A, we can assume that the energy required to maintain the body temperature of a mammal E is proportional to its body surface area S, i.e. \(E \propto S\)
Since blood flow through the lungs provides animals oxygen, larger animals have larger lungs and require more oxygen. Therefore, we assume that the amount of energy available is proportional to the blood flow b through the lungs, i.e \(E \propto b\)
Also, we assume that the least amount of blood needed to circulate, the amount of available energy will equal the amount of energy used to maintain the body temperature. This implies \(b \propto S\)
Since larger animals have larger hearts and larger lungs, we can assume the mammals are geometrically similar. Thus, \(S \propto V^{2/3}\)
where V is the volume of the animal.
Since \(V \propto m \propto W\) where m is the mass of the animal and W is the weight of the animal, we have \(S \propto W^{2/3}\). This implies \(b \propto W^{2/3}\).
Also, since blood flow is determined by the total amount of blood pumped by the heart per unit time (minutes), and every time the heart beats the blood is pumped into the arteries, we have \(b \propto rV \propto rW\) where r is the pulse rate and Vh is heart volume. From \(b \propto W^{2/3}\) and \(b \propto rW\), we have \(W^{2/3} \propto rW\) which is \(r \propto W^{{-1/3}}\).
First we will be modeling using Geometric Similarity given our power curve model function
We will be using Least squares criterion to fit our curve model of the form \(y = Ax^n\) where n is fixed to a given collection of data points. In our case, our curve is \(r = Aw^{-1/3}\) . Therefore we will use the least squares criterion to estimate A with n = -1/3.
In addition we will use the Chebyshev criterion to find the bounds.
#body_weight<- c(4,25,200,300,2000,5000,30000,50000,70000,450000,500000,3000000)
#pulse_rate<-c(660,670,420,300,205,120,85,70,72,38,40,48)
df<- data.frame(mammals, body_weight,pulse_rate )
#df
str(df)
## 'data.frame': 12 obs. of 3 variables:
## $ mammals : Factor w/ 12 levels "Big dog","Elephant",..: 12 7 10 3 9 5 1 11 6 4 ...
## $ body_weight: num 4 25 200 300 2000 5000 30000 50000 70000 450000 ...
## $ pulse_rate : num 660 670 420 300 205 120 85 70 72 38 ...
df$z = c(NA,tail(df$pulse_rate,-1)-head(df$pulse_rate,-1))
df$z2= c(NA,tail(df$body_weight,-1)-head(df$body_weight,-1))
# finding the slope
df$slope<- df$z / df$z2
slope<- mean(df$slope, na.rm= TRUE)
slope
## [1] -0.2035175
# calculating f(x) using the slope
df$fx<- -0.2035175*(df$body_weight)^(-1/3)
p1<- ggplot(data = df, aes(x = df$body_weight, y = df$pulse_rate)) +
geom_line()+
ggtitle("Body Weight vs Pulse Rate (Original data)")
p1
p2<- ggplot(data = df, aes(x = df$body_weight, y = df$fx)) +
geom_line()+
ggtitle("Body Weight vs Pulse Rate using Geometric Similarity \n f(x) = -0.2035175*(df$body_weight)^(-1/3)")
p2
\[ f(x) = ax^n \] requires minimization of:
\[ S = \sum_{i=1}^{m} [y_i - f(x_i)]^2 \]
\[ S = \sum_{i=1}^{m} [y_i - ax^n]^2 \]
Solving for a yields
\[ a = \frac {{\sum xi^{2} y_i}} {\sum xi^2n}\]
dfl<- data.frame(mammals, body_weight,pulse_rate )
str(dfl)
## 'data.frame': 12 obs. of 3 variables:
## $ mammals : Factor w/ 12 levels "Big dog","Elephant",..: 12 7 10 3 9 5 1 11 6 4 ...
## $ body_weight: num 4 25 200 300 2000 5000 30000 50000 70000 450000 ...
## $ pulse_rate : num 660 670 420 300 205 120 85 70 72 38 ...
sum1<- sum((dfl$body_weight)^(-1/3) * (dfl$pulse_rate))
sum2<- sum((dfl$body_weight)^(-2/3))
a<- sum1/sum2
dfl$pulse_rate_fit<- round(a*((dfl$body_weight)^(-1/3)), 2)
dfl
## mammals body_weight pulse_rate pulse_rate_fit
## 1 Vespergo pipistrellas 4 660 864.06
## 2 Mouse 25 670 469.08
## 3 Rat 200 420 234.54
## 4 Guinea pig 300 300 204.89
## 5 Rabbit 2000 205 108.86
## 6 Little dog 5000 120 80.21
## 7 Big dog 30000 85 44.14
## 8 Sheep 50000 70 37.23
## 9 Man 70000 72 33.28
## 10 Horse 450000 38 17.90
## 11 Ox 500000 40 17.28
## 12 Elephant 3000000 48 9.51
final <- data.frame( body_weight= dfl$body_weight, pulse_rate = dfl$pulse_rate, pulse_rate_fit = dfl$pulse_rate_fit)
dflg <- melt(final, id= 'body_weight')
dflg
## body_weight variable value
## 1 4 pulse_rate 660.00
## 2 25 pulse_rate 670.00
## 3 200 pulse_rate 420.00
## 4 300 pulse_rate 300.00
## 5 2000 pulse_rate 205.00
## 6 5000 pulse_rate 120.00
## 7 30000 pulse_rate 85.00
## 8 50000 pulse_rate 70.00
## 9 70000 pulse_rate 72.00
## 10 450000 pulse_rate 38.00
## 11 500000 pulse_rate 40.00
## 12 3000000 pulse_rate 48.00
## 13 4 pulse_rate_fit 864.06
## 14 25 pulse_rate_fit 469.08
## 15 200 pulse_rate_fit 234.54
## 16 300 pulse_rate_fit 204.89
## 17 2000 pulse_rate_fit 108.86
## 18 5000 pulse_rate_fit 80.21
## 19 30000 pulse_rate_fit 44.14
## 20 50000 pulse_rate_fit 37.23
## 21 70000 pulse_rate_fit 33.28
## 22 450000 pulse_rate_fit 17.90
## 23 500000 pulse_rate_fit 17.28
## 24 3000000 pulse_rate_fit 9.51
ggplot(data = dflg, aes(x = body_weight, y = value, color = variable)) +
geom_area()+
ggtitle("Body Weight vs Pulse Rate (LS)")
ggplot(data = dflg, aes(x = body_weight, y = value, color = variable)) +
geom_line() +
ggtitle("Body Weight vs Pulse Rate (Least Squares)")
Given N points (x_1, y_1),(x_2, y_2),(x_3, y_3), .., (x_n, y_n), to fit the model y=cx using Chebyshev criterion, we need to:
\[ E = max_i|yi - c(xi)|\]
Goal: to minimize r
r - ( 660 - c4 ) >= 0 r + ( 660 - c4 ) >= 0
r - ( 670 - c25 ) >= 0 r + ( 670 - c25 ) >= 0
r - ( 420 - c200 ) >= 0 r + ( 420 - c200 ) >= 0
r - ( 300 - c300 ) >= 0 r + ( 300 - c300 ) >= 0
r - ( 205 - c2000 ) >= 0 r + ( 205 - c2000 ) >= 0
r - ( 120 - c5000 ) >= 0 r + ( 120 - c5000 ) >= 0
r - ( 85 - c30000 ) >= 0 r + ( 85 - c30000 ) >= 0
r - ( 70 - c50000 ) >= 0 r + ( 70 - c50000 ) >= 0
r - ( 72 - c70000 ) >= 0 r + ( 72 - c70000 ) >= 0
r - ( 38 - c450000 ) >= 0 r + ( 38 - c450000 ) >= 0
r - ( 40 - c500000 ) >= 0 r + ( 40 - c500000 ) >= 0
r - ( 48 - c3000000 ) >= 0 r + ( 48 - c3000000 ) >= 0
fun1<- function(c) (660-20*c)
fun2<- function(c) (670-25*c)
fun3<- function(c) (420-200*c)
fun4<- function(c) (300-300*c)
fun5<- function(c) (205-2000*c)
fun6<- function(c) (120-5000*c)
fun7<- function(c) (85-30000*c)
fun8<- function(c) (70-50000*c)
fun9<- function(c) (72-70000*c)
fun11<- function(c) (38-450000*c)
fun12<- function(c) (40-500000*c)
fun13<- function(c) (48-3000000*c)
fun21<- function(c) 20*c-660
fun22<- function(c) 25*c-670
fun23<- function(c) 200*c-420
fun24<- function(c) 300*c-300
fun25<- function(c) 2000*c-205
fun26<- function(c) 5000*c-120
fun27<- function(c) 30000*c-85
fun28<- function(c) 50000*c-70
fun29<- function(c) 70000*c-72
fun31<- function(c) 450000*c-38
fun32<- function(c) 500000*c-40
fun33<- function(c) 3000000*c-48
x1 = seq(-5,5)
mydf = data.frame(x1, y1=fun1(x1), y2=fun2(x1),y3= fun3(x1), y4=fun4(x1),
y5=fun5(x1),y6= fun6(x1),
y7=fun7(x1), y8=fun8(x1),y9= fun9(x1),
y11=fun11(x1), y12=fun12(x1),y13= fun13(x1), y21=fun21(x1),
y22=fun22(x1),y23= fun23(x1),
y24=fun24(x1), y25=fun25(x1),y26= fun26(x1),
y27=fun27(x1), y28=fun25(x1),y29= fun26(x1),
y31=fun24(x1), y32=fun25(x1),y33= fun26(x1)
)
mydf <- transform(mydf, z = pmax(y1,pmin(y2,y3, y4, y5, y6, y7, y8, y9, y11, y12,y13,
y22,y23,y24, y25, y26, y27, y28, y29,y31,y32,y33)))
ggplot(mydf, aes(x = x1)) +
geom_line(aes(y = y1), colour = 'blue') +
geom_line(aes(y = y2), colour = 'black') +
geom_line(aes(y = y3), colour = 'red') +
geom_line(aes(y = y4), colour = 'yellow') +
geom_line(aes(y = y5), colour = 'green') +
geom_line(aes(y = y6), colour = 'orange') +
geom_line(aes(y = y7), colour = 'gray') +
geom_line(aes(y = y8), colour = 'pink') +
geom_line(aes(y = y9), colour = '#663399') +
geom_line(aes(y = y11), colour = 'violet') +
geom_line(aes(y = y12), colour = '#000066') +
geom_line(aes(y = y13), colour = 'brown') +
geom_line(aes(y = y21), colour = 'blue') +
geom_line(aes(y = y22), colour = 'black') +
geom_line(aes(y = y23), colour = 'red') +
geom_line(aes(y = y24), colour = 'yellow') +
geom_line(aes(y = y25), colour = 'green') +
geom_line(aes(y = y26), colour = 'orange') +
geom_line(aes(y = y27), colour = 'gray') +
geom_line(aes(y = y28), colour = 'pink') +
geom_line(aes(y = y29), colour = '#663399') +
geom_line(aes(y = y31), colour = 'violet') +
geom_line(aes(y = y32), colour = '#000066') +
geom_line(aes(y = y33), colour = 'brown') +
geom_ribbon(aes(ymin=y1,ymax = z), fill = 'gray60')+
coord_flip()
lmfit <- lm(x1 ~ (y1+y2+y3+y4+y5+y6+y7+y8+y9+y11+y12+y13+y21+y22+y23+y24+y25+y26+y27+y28+y29+y31+y32+y33), data=mydf)
summary(lmfit)
##
## Call:
## lm(formula = x1 ~ (y1 + y2 + y3 + y4 + y5 + y6 + y7 + y8 + y9 +
## y11 + y12 + y13 + y21 + y22 + y23 + y24 + y25 + y26 + y27 +
## y28 + y29 + y31 + y32 + y33), data = mydf)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.443e-15 -2.870e-15 -6.939e-16 1.427e-15 1.158e-14
##
## Coefficients: (23 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.300e+01 1.482e-14 2.227e+15 <2e-16 ***
## y1 -5.000e-02 2.235e-17 -2.237e+15 <2e-16 ***
## y2 NA NA NA NA
## y3 NA NA NA NA
## y4 NA NA NA NA
## y5 NA NA NA NA
## y6 NA NA NA NA
## y7 NA NA NA NA
## y8 NA NA NA NA
## y9 NA NA NA NA
## y11 NA NA NA NA
## y12 NA NA NA NA
## y13 NA NA NA NA
## y21 NA NA NA NA
## y22 NA NA NA NA
## y23 NA NA NA NA
## y24 NA NA NA NA
## y25 NA NA NA NA
## y26 NA NA NA NA
## y27 NA NA NA NA
## y28 NA NA NA NA
## y29 NA NA NA NA
## y31 NA NA NA NA
## y32 NA NA NA NA
## y33 NA NA NA NA
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.689e-15 on 9 degrees of freedom
## Multiple R-squared: 1, Adjusted R-squared: 1
## F-statistic: 5.003e+30 on 1 and 9 DF, p-value: < 2.2e-16
From the above Chebyshev graph, the minmax is $ r+(48-3000000*c) $and maxmin is \(r-(20*c-660)\). Hence, equating these two equations to 0.
\(r + 48-3000000*c = 0\)
\(r - 20*c-660 = 0\)
Therefore the optimal value of \(c =236*10^-6\) and r =640
Conclusion:
For the majority of the mammals, the model makes good predictions in both least Square and Geometric Similarity, and support the biology that larger the animals slower pulse rate. However, Elephant is bigger than all the animals but it has pulse rate slightly higher than Ox and Horse. So, apart from the body weight, there is something else that affects the pulse rate. Chebysev has a large error which is not acceptable.