In this project, I will investigate the relationship of the body weights of birds and their pulse rates.
Heart rate of birds- Warm-blooded animals use large quantities of energy to maintain body temperature because of the heat loss through the body surface. In fact, biologists believe that the primary energy drain on a resting warm blooded animal is maintenance of body temperature.
Construct a model relating blood flow through the heart to body weight.
The following data relate weights of some birds to their heart rate measured in beats per minute. Construct a model that relates heart rate to body weight. Discuss the assumptions of your model. Use the data to check your model.
bird<- c('Canary','Pigeon','Crow','Buzzard','Duck','Hen','Goose','Turkey','Ostrich')
body_weight<- c(20,300,341,658,1100,2000,2300,8750,71000)
pulse_rate<-c(1000, 185,378,300,190,312,240,193,60)
df<- data.frame(cbind(bird, body_weight,pulse_rate ))
knitr::kable(df)
| bird | body_weight | pulse_rate |
|---|---|---|
| Canary | 20 | 1000 |
| Pigeon | 300 | 185 |
| Crow | 341 | 378 |
| Buzzard | 658 | 300 |
| Duck | 1100 | 190 |
| Hen | 2000 | 312 |
| Goose | 2300 | 240 |
| Turkey | 8750 | 193 |
| Ostrich | 71000 | 60 |
It is well-known that small warm-blooded animals have much higher resting pulse rates than large ones. For examples, a canary, with a body mass of 20 grams, may have a heart rate of 1000 beats/minute, as compared to 70-80 beats/minute for a human being.
Warm-blooded animals at rest use large quantities of energy to maintain body temperature, due to heat loss through the body surface. Let us make the following assumptions about energy balance over a fixed time interval (1 minute, say):
A. Energy lost is proportional to the surface area of the body.
B. Energy gained is proportional to the amount of blood flowing through the lungs, the source of oxygen for the body.
C. Energy lost = energy gained for a body in temperature equilibrium.
E = energy
S = surface area
m = mass
W = weight of the animal
r = pulse rate
V = volume of the animal
b = volume of blood pumped by the heart in one stroke.
From assumption A, we can assume that the energy required to maintain the body temperature of a bird E is proportional to its body surface area S, i.e. \(E\propto S\)
Since blood flow through the lungs provides animals oxygen, larger animals have larger lungs and require more oxygen. Therefore, we assume that the amount of energy available is proportional to the blood flow b through the lungs, i.e \(E \propto b\)
Also, we assume that the least amount of blood needed to circulate, the amount of available energy will equal the amount of energy used to maintain the body temperature. This implies \(b \propto S\)
Since larger animals have larger hearts and larger lungs, we can assume the birds are geometrically similar. Thus, \(S\propto V^{\frac 2 3}\) where V is the volume of the animal.
Since \(V \propto m \propto W\) where m is the mass of the animal and W is the weight of the animal, we have \(S \propto W^{\frac 2 3}\) . This implies \(b \propto W^{\frac 2 3}\).
Also, since blood flow is determined by the total amount of blood pumped by the heart per unit time (minutes), and every time the heart beats the blood is pumped into the arteries, we have \(b \propto rV \propto rW\) where r is the pulse rate and Vh is heart volume. From \(b\propto W^{\frac 2 3}\) and \(b \propto rW\), we have \(W^{ \frac 2 3} \propto rW\), i.e. \(r \propto W^{\frac {-1} {3}}\).
First we will be modeling using Geometric Similarity given our power curve model function
We will be using Least squares criterion to fit our curve model of the form \(y= Ax^n\) where n is fixed to a given collection of data points. In our case, our curve is \(r = Aw^{\frac {-1} {3}}\). Therefore we will use the least squares criterion to estimate \(A\) with \(n = -1/3\).
body_weight<- c(20,300,341,658,1100,2000,2300,8750,71000)
pulse_rate<-c(1000, 185,378,300,190,312,240,193,60)
df<- data.frame(bird, body_weight,pulse_rate )
df$z = c(NA,tail(df$pulse_rate,-1)-head(df$pulse_rate,-1))
df$z2= c(NA,tail(df$body_weight,-1)-head(df$body_weight,-1))
# finding the slope
df$slope<- df$z / df$z2
slope<- mean(df$slope, na.rm= TRUE)
#slope
# calculating f(x) using the slope
df$fx<- 0.148476*(df$body_weight)^(-1/3)
#########################################
# Multiple plot function
#
# ggplot objects can be passed in ..., or to plotlist (as a list of ggplot objects)
# - cols: Number of columns in layout
# - layout: A matrix specifying the layout. If present, 'cols' is ignored.
#
# If the layout is something like matrix(c(1,2,3,3), nrow=2, byrow=TRUE),
# then plot 1 will go in the upper left, 2 will go in the upper right, and
# 3 will go all the way across the bottom.
#
multiplot <- function(..., plotlist=NULL, file, cols=1, layout=NULL) {
library(grid)
# Make a list from the ... arguments and plotlist
plots <- c(list(...), plotlist)
numPlots = length(plots)
# If layout is NULL, then use 'cols' to determine layout
if (is.null(layout)) {
# Make the panel
# ncol: Number of columns of plots
# nrow: Number of rows needed, calculated from # of cols
layout <- matrix(seq(1, cols * ceiling(numPlots/cols)),
ncol = cols, nrow = ceiling(numPlots/cols))
}
if (numPlots==1) {
print(plots[[1]])
} else {
# Set up the page
grid.newpage()
pushViewport(viewport(layout = grid.layout(nrow(layout), ncol(layout))))
# Make each plot, in the correct location
for (i in 1:numPlots) {
# Get the i,j matrix positions of the regions that contain this subplot
matchidx <- as.data.frame(which(layout == i, arr.ind = TRUE))
print(plots[[i]], vp = viewport(layout.pos.row = matchidx$row,
layout.pos.col = matchidx$col))
}
}
}
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 3.2.5
p1<- ggplot(data = df, aes(x = df$body_weight, y = df$pulse_rate)) +
geom_line()+
ggtitle("Body Weight vs Pulse Rate (Original data)")
p2<- ggplot(data = df, aes(x = df$body_weight, y = df$fx)) +
geom_line()+
ggtitle("Body Weight vs Pulse Rate using Fitted model \n f(x) = 0.148476*(df$body_weight)^(-1/3)")
library(gridExtra)
## Warning: package 'gridExtra' was built under R version 3.2.5
multiplot(p1, p2, cols=1)
dfold<- df
Recall model of form power curve \(f(x) = ax^n\) requires minimization of: \[ S = \sum_{i=1}^m[y_i -f(x_i)]^2\] \[ s = \sum_{i=1}^m[y_i -ax_i^n]^2\] solving the the equation for \(a\) yeilds: \[ a = \frac {\sum x_i^2y_i} {\sum x_i^{2n}} \]
bird<- c('Canary','Pigeon','Crow','Buzzard','Duck','Hen','Goose','Turkey','Ostrich')
body_weight<- c(20,300,341,658,1100,2000,2300,8750,71000)
pulse_rate<-c(1000, 185,378,300,190,312,240,193,60)
df<- data.frame(cbind(bird, body_weight,pulse_rate ))
library(varhandle)
## Warning: package 'varhandle' was built under R version 3.2.3
sum1<- sum(unfactor(df$body_weight)^(-1/3) * unfactor(df$pulse_rate))
sum2<- sum(unfactor(df$body_weight)^(-2/3))
a<- sum1/sum2
a
## [1] 2576.555
df$pulse_rate_fit<- round(a*(unfactor(df$body_weight)^(-1/3)), 2)
df
## bird body_weight pulse_rate pulse_rate_fit
## 1 Canary 20 1000 949.21
## 2 Pigeon 300 185 384.89
## 3 Crow 341 378 368.80
## 4 Buzzard 658 300 296.23
## 5 Duck 1100 190 249.60
## 6 Hen 2000 312 204.50
## 7 Goose 2300 240 195.19
## 8 Turkey 8750 193 125.04
## 9 Ostrich 71000 60 62.22
library(reshape2)
library(ggplot2)
final <- data.frame( body_weight= df$body_weight, pulse_rate = df$pulse_rate, pulse_rate_fit = df$pulse_rate_fit)
#plot(final)
DF <- melt(final, id= 'body_weight')
## Warning: attributes are not identical across measure variables; they will
## be dropped
ggplot(data = DF, aes(x = body_weight, y = value, color = variable)) +
geom_area()
DF<- data.frame(DF)
DF<- unfactor(DF)
#str(DF)
DF$value<- as.numeric(DF$value)
ggplot(data = DF, aes(x = body_weight, y = value, color = variable)) +
geom_line()
Given N points (x_1, y_1),(x_2, y_2),(x_3, y_3), …., (x_n, y_n), to fit the model y=cx using Chebyshev criterion, we need to:
\[\epsilon = max_i\bigl| y_i - y(x_i) \bigr|\] \[ Goal: \ to \ minimize\ \epsilon\]
\[r = max_i\bigl| y_i - cx_i \bigr|\] \[ Goal: to \ minimize\ r\] subject to: \[r-(1000 -c*20) >=0\] \[r+(1000 -c*20) >=0\] \[r-(185 -c*300) >=0\] \[r+(185 -c*300) >=0\] \[r-(378 -c*341) >=0\] \[r+(378 -c*341) >=0\] \[r-(300 -c*658) >=0\] \[r+(300 -c*658) >=0\] \[r-(190 -c*1100) >=0\] \[r+(190 -c*1100) >=0\] \[r-(312 -c*2000) >=0\] \[r+(312 -c*2000) >=0\] \[r-(240 -c*2300) >=0\] \[r+(240 -c*2300) >=0\] \[r-(193 -c*8750) >=0\] \[r+(193 -c*8750) >=0\] \[r-(60 -c*71000) >=0\] \[r+(60 -c*71000) >=0\]
library(ggplot2)
fun1<- function(c) (1000-20*c)
fun2<- function(c) (185-300*c)
fun3<- function(c) (378-341*c)
fun4<- function(c) (300-658*c)
fun5<- function(c) (190-1100*c)
fun6<- function(c) (312-2000*c)
fun7<- function(c) (240-2300*c)
fun8<- function(c) (193-8750*c)
fun9<- function(c) (60-71000*c)
fun11<- function(c) 20*c-1000
fun12<- function(c) 300*c-185
fun13<- function(c) 341*c-378
fun14<- function(c) 658*c-300
fun15<- function(c) 1100*c-190
fun16<- function(c) 2000*c-312
fun17<- function(c) 2300*c-240
fun18<- function(c) 8750*c-193
fun19<- function(c) 71000*c-60
x1 = seq(-5,1000)
mydf = data.frame(x1, y1=fun1(x1), y2=fun2(x1),y3= fun3(x1), y4=fun4(x1),
y5=fun5(x1),y6= fun6(x1),
y7=fun7(x1), y8=fun8(x1),y9= fun9(x1),
y11=fun11(x1), y12=fun12(x1),y13= fun13(x1), y14=fun14(x1),
y15=fun15(x1),y16= fun16(x1),
y17=fun17(x1), y18=fun18(x1),y19= fun19(x1)
)
mydf <- transform(mydf, z = pmax(y1,pmin(y2,y3, y4, y5, y6, y7, y8, y9,
y12,y13, y14, y15, y16, y17, y18, y19)))
ggplot(mydf, aes(x = x1)) +
geom_line(aes(y = y1), colour = 'blue') +
geom_line(aes(y = y2), colour = 'black') +
geom_line(aes(y = y3), colour = 'red') +
geom_line(aes(y = y4), colour = 'yellow') +
geom_line(aes(y = y5), colour = 'green') +
geom_line(aes(y = y6), colour = 'orange') +
geom_line(aes(y = y7), colour = 'gray') +
geom_line(aes(y = y8), colour = 'pink') +
geom_line(aes(y = y9), colour = 'brown') +
geom_line(aes(y = y11), colour = 'blue') +
geom_line(aes(y = y12), colour = 'black') +
geom_line(aes(y = y13), colour = 'red') +
geom_line(aes(y = y14), colour = 'yellow') +
geom_line(aes(y = y15), colour = 'green') +
geom_line(aes(y = y16), colour = 'orange') +
geom_line(aes(y = y17), colour = 'gray') +
geom_line(aes(y = y18), colour = 'pink') +
geom_line(aes(y = y19), colour = 'brown') +
geom_ribbon(aes(ymin=y1,ymax = z), fill = 'gray60')+
coord_flip()
From the above Chebyshev graph , the minmax is r+(60-c71000) and maxmin is r-(1000 -c20). Hence, setting r+(60-c71000) = 0 and r-(1000 -c20) = 0, where r = 1000 -c*20 Therefore, the optimal value of c = 67. The residual error is 340. Please note that the feasible region could not visibly be confirmed. Hence the maxmin was just visual approximation from the graph
For the majority of the birds, the model makes good predictions in both Least Square and Geometric Similarity, and support the biology that larger animals have slower pulse rate. However, this is not always true for all birds. The pulse rate of a Pigeon is almost the same as the pulse rate of a Turkey, but the weight of a Turkey is 29.16 times of the weight of a pigeon. This model reveals that assuming there is no measurement error, there is something else affecting pigeon’s pulse rate; thus further investigations and possible model refinement are needed for birds like pigeon.
However, a research by Florence Buchanan in his paper, The significance of the pulse rate in vertebrate animals University college, London 1910, explains the pigeon data discrepancy. The pigeon weights 300g and pulse rate of 185. However, if we examine its average dioxide per kilo per hour(in grams) which is 3.4 as compared to that of the goose 1.07 and yet body weight of 2300 and pulse rate of 240, we can derive that larger average dioxide per kilo per hour(in grams) leads to faster heart rates.. It is clearly demonstrated in by Florence Buchanan that larger average dioxide per kilo per hour (in grams) leads to faster heart rates. link: https://books.google.com/books?id=3jEsAAAAYAAJ&printsec=frontcover#v=onepage&q&f=false
Please note that Chebyshev criterion resulted in optimal value of 67; however, large error of 340 which is not acceptable. In fact the feasible region could not be geometrically confirmed.