See the Arbuthnot data in HistData library. The Arbuthnot data set refers to Dr. John Arbuthnot, an 18th century physician, writer, and mathematician. He was interested in the ratio of newborn boys to newborn girls, so he gathered the christening records for children born in London for every year from 1629 to 1710. It also contains the variable Ratio, giving the ratio of male to female births.
library("HistData", lib.loc="/Library/Frameworks/R.framework/Versions/3.4/Resources/library")
plot(Arbuthnot$Ratio ~ Arbuthnot$Year, ylim=c(0.95,1.15))
abline(h=1, col="red")
Plot the total number of christenings, Males + Females or Total (in 000s) over time. What unusual features do you see?
Between the 1640s and 1660s, the number of christenings dramatically decreased.
plot(Arbuthnot$Males + Arbuthnot$Females ~ Arbuthnot$Year)
Use the data set WomenQueue to:
library(vcd)
## Loading required package: grid
data(WomenQueue, package ="vcd")
WQ.fit <- goodfit(WomenQueue, type="binomial")
## Warning in goodfit(WomenQueue, type = "binomial"): size was not given,
## taken as maximum count
plot(WQ.fit, type="deviation", xlab="Counts")
Perhaps there are less females than males and therefore the chance of a queue with 10 women is lower.
Work on the distribution of male children in families in “Saxony” by fitting a binomial distribution, 𝑛(𝑛 = 12, 𝑝 = 1) , specifying equal probability for boys and girls. [Hint: you need to specify both size and prob values for goodfit ().]
The model is a pretty good fit. There’s only slight deviations.
gf1=goodfit(Saxony, type = "binomial", par = list(size = 12))
gf1
##
## Observed and fitted values for binomial distribution
## with parameters estimated by `ML'
##
## count observed fitted pearson residual
## 0 3 0.9328394 2.1402809
## 1 24 12.0888374 3.4257991
## 2 104 71.8031709 3.7996298
## 3 286 258.4751335 1.7120476
## 4 670 628.0550119 1.6737139
## 5 1033 1085.2107008 -1.5849023
## 6 1343 1367.2793552 -0.6566116
## 7 1112 1265.6303069 -4.3184059
## 8 829 854.2466464 -0.8637977
## 9 478 410.0125627 3.3576088
## 10 181 132.8357027 4.1789562
## 11 45 26.0824586 3.7041659
## 12 7 2.3472734 3.0368664
gf2=goodfit(Saxony, type = "binomial", par = list(prob = 0.5, size = 12))
gf2
##
## Observed and fitted values for binomial distribution
## with fixed parameters
##
## count observed fitted pearson residual
## 0 3 1.49292 1.2334401
## 1 24 17.91504 1.4376359
## 2 104 98.53271 0.5507842
## 3 286 328.44238 -2.3419098
## 4 670 738.99536 -2.5380434
## 5 1033 1182.39258 -4.3445838
## 6 1343 1379.45801 -0.9816094
## 7 1112 1182.39258 -2.0471328
## 8 829 738.99536 3.3108845
## 9 478 328.44238 8.2523747
## 10 181 98.53271 8.3079041
## 11 45 17.91504 6.3991064
## 12 7 1.49292 4.5071617
plot(gf1,xlab= "Number of Males")
plot(gf2,xlab= "Number of Males")