The Arbuthnot data set is named after Dr. John Arbuthnot, an 18th-century physician, writer, and mathematician. He was interested in the ratio of newborn boys to newborn girls, so he gathered the baptism records for children born in London for every year from 1629 to 1710.
A summary of Arbuthnot’s baptism data set for girls and boys:
source("more/arbuthnot.R")
head(arbuthnot)
##   year boys girls
## 1 1629 5218  4683
## 2 1630 4858  4457
## 3 1631 4422  4102
## 4 1632 4994  4590
## 5 1633 5158  4839
## 6 1634 5035  4820
dim(arbuthnot)
## [1] 82  3
names(arbuthnot)
## [1] "year"  "boys"  "girls"
The Present data set records the number of boys and girls born in the US each year from 1940 to 2002, which allows the sex ratio at birth to be analyzed over time.
It contains 63 rows and 3 columns, named year, boys, and girls.
source("more/present.R")
head(present)
##   year    boys   girls
## 1 1940 1211684 1148715
## 2 1941 1289734 1223693
## 3 1942 1444365 1364631
## 4 1943 1508959 1427901
## 5 1944 1435301 1359499
## 6 1945 1404587 1330869
dim(present)
## [1] 63  3
names(present)
## [1] "year"  "boys"  "girls"
The birth data ranges from 1940 to 2002.
min(present$year)
## [1] 1940
max(present$year)
## [1] 2002
The following command plots the total number of baptisms (boys plus girls) for each year from 1629 to 1710.
plot(arbuthnot$year, arbuthnot$boys + arbuthnot$girls, type = "l")
The following command plots the total number of births (boys plus girls) for each year from 1940 to 2002.
plot(present$year, present$boys + present$girls, type = "l")
min(arbuthnot$boys + arbuthnot$girls)
## [1] 5612
max(arbuthnot$boys + arbuthnot$girls)
## [1] 16145
min(present$boys + present$girls)
## [1] 2360399
max(present$boys + present$girls)
## [1] 4268326
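Present-day US births are several hundred times more numerous than the baptisms in Arbuthnot's records: roughly 420 times when comparing the smallest yearly totals and roughly 264 times when comparing the largest.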
min(present$boys + present$girls)/min(arbuthnot$boys + arbuthnot$girls)
## [1] 420.5985
max(present$boys + present$girls)/max(arbuthnot$boys + arbuthnot$girls)
## [1] 264.3745
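Both ratios can also be obtained in one step, because `range()` returns the minimum and maximum together and R divides vectors element by element. The following is a minimal sketch that uses the four totals reported above in place of the full data sets; the names `arbuthnot_range`, `present_range`, and `scale_ratio` are illustrative and not part of the original analysis.

```r
# Smallest and largest yearly totals, copied from the output above.
# With the data loaded, these vectors would come from
# range(arbuthnot$boys + arbuthnot$girls) and
# range(present$boys + present$girls).
arbuthnot_range <- c(5612, 16145)
present_range <- c(2360399, 4268326)

# Element-wise division: min/min first, then max/max.
scale_ratio <- present_range / arbuthnot_range
print(scale_ratio)
```

The two printed values should agree with the ratios computed separately above.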
Early on, more boys are born than girls, but as the years progress the ratio of boys to girls moves closer to 1.
plot(present$year, present$boys/present$girls, type = "l")
More boys than girls are born in every year of the data set, but the proportion of boys declines slightly over time, from about 0.513 in 1940 to about 0.512 in 2002.
present$boys/(present$boys + present$girls)
##  [1] 0.5133386 0.5131376 0.5141926 0.5138001 0.5135613 0.5134745 0.5142562
##  [8] 0.5134883 0.5131024 0.5130881 0.5130778 0.5126891 0.5124173 0.5130027
## [15] 0.5125423 0.5123716 0.5125011 0.5123550 0.5120462 0.5120713 0.5119269
## [22] 0.5122088 0.5117064 0.5128408 0.5115250 0.5124656 0.5118474 0.5121866
## [29] 0.5130068 0.5129073 0.5133154 0.5126337 0.5124973 0.5127013 0.5133340
## [36] 0.5130513 0.5127982 0.5128057 0.5128266 0.5126110 0.5128692 0.5125792
## [43] 0.5123372 0.5126648 0.5122425 0.5126849 0.5124035 0.5121951 0.5121931
## [50] 0.5121286 0.5121179 0.5112054 0.5121992 0.5121845 0.5116894 0.5119398
## [57] 0.5114951 0.5116337 0.5115255 0.5119072 0.5117182 0.5111665 0.5117154
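Whether boys are the majority in every year can be checked directly rather than by reading the printed proportions. The following is a minimal sketch that assumes a stand-in data frame built from the six rows shown by `head(present)`; the name `present_sample` is illustrative, and with the full data loaded `present` can be used in its place.

```r
# Stand-in for the full data set: the six rows printed by head(present).
present_sample <- data.frame(
  year  = 1940:1945,
  boys  = c(1211684, 1289734, 1444365, 1508959, 1435301, 1404587),
  girls = c(1148715, 1223693, 1364631, 1427901, 1359499, 1330869)
)

# Proportion of boys among all births in each year.
prop_boys <- present_sample$boys / (present_sample$boys + present_sample$girls)

# TRUE only if boys make up more than half of births in every year.
all_boys_majority <- all(prop_boys > 0.5)
print(all_boys_majority)
```

Because the sample covers only 1940 to 1945, it confirms the majority claim for those years only; the same two lines applied to `present` cover the whole period.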
plot(present$year, present$boys/(present$boys + present$girls), type = "l")
The year with the greatest number of total births is 1961.
present$totalBirths = present$boys + present$girls
present[ present$totalBirths == max(present$totalBirths), ]
##    year    boys   girls totalBirths
## 22 1961 2186274 2082052     4268326
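`which.max()` is an alternative to comparing every row against `max()`: it returns the position of the first maximum, so the result is always a single row. The following is a minimal sketch on a stand-in data frame made of the six rows from `head(present)` plus the 1961 row shown above; `present_sample` and `peak` are illustrative names, not part of the original data.

```r
# Stand-in for the full data set: six head() rows plus the 1961 row.
present_sample <- data.frame(
  year  = c(1940:1945, 1961),
  boys  = c(1211684, 1289734, 1444365, 1508959, 1435301, 1404587, 2186274),
  girls = c(1148715, 1223693, 1364631, 1427901, 1359499, 1330869, 2082052)
)
present_sample$totalBirths <- present_sample$boys + present_sample$girls

# which.max() gives the position of the largest total; use it to pick the row.
peak <- present_sample[which.max(present_sample$totalBirths), ]
print(peak)
```

If two years tied for the maximum, the `==` comparison would return both rows, while `which.max()` would return only the first.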