(a). Make a plot of Ratio over Year. What features stand out? Which plot do you prefer to display the tendency for more male births?
library("HistData")
## Warning: package 'HistData' was built under R version 3.4.4
Arbuthnot
## Year Males Females Plague Mortality Ratio Total
## 1 1629 5218 4683 0 8771 1.114243 9.901
## 2 1630 4858 4457 1317 10554 1.089971 9.315
## 3 1631 4422 4102 274 8562 1.078011 8.524
## 4 1632 4994 4590 8 9535 1.088017 9.584
## 5 1633 5158 4839 0 8393 1.065923 9.997
## 6 1634 5035 4820 1 10400 1.044606 9.855
## 7 1635 5106 4928 0 10651 1.036120 10.034
## 8 1636 4917 4605 10400 23359 1.067752 9.522
## 9 1637 4703 4457 3082 11763 1.055194 9.160
## 10 1638 5359 4952 363 13624 1.082189 10.311
## 11 1639 5366 4784 314 9862 1.121656 10.150
## 12 1640 5518 5332 1450 12771 1.034884 10.850
## 13 1641 5470 5200 1375 13142 1.051923 10.670
## 14 1642 5460 4910 1274 13273 1.112016 10.370
## 15 1643 4793 4617 996 13212 1.038120 9.410
## 16 1644 4107 3997 1492 10933 1.027521 8.104
## 17 1645 4047 3919 1871 11479 1.032661 7.966
## 18 1646 3768 3395 2365 12780 1.109867 7.163
## 19 1647 3796 3536 3597 14059 1.073529 7.332
## 20 1648 3363 3181 611 9894 1.057215 6.544
## 21 1649 3079 2746 67 10566 1.121267 5.825
## 22 1650 2890 2722 15 8754 1.061719 5.612
## 23 1651 3231 2840 23 10827 1.137676 6.071
## 24 1652 3220 2908 16 12569 1.107290 6.128
## 25 1653 3196 2959 6 10087 1.080095 6.155
## 26 1654 3441 3179 16 13247 1.082416 6.620
## 27 1655 3655 3349 9 11357 1.091371 7.004
## 28 1656 3668 3382 6 13921 1.084565 7.050
## 29 1657 3396 3289 4 12434 1.032533 6.685
## 30 1658 3157 3013 14 14993 1.047793 6.170
## 31 1659 3209 2781 36 14756 1.153901 5.990
## 32 1660 3724 3247 13 12681 1.146905 6.971
## 33 1661 4748 4107 20 16665 1.156075 8.855
## 34 1662 5216 4803 12 13664 1.085988 10.019
## 35 1663 5411 4881 9 12741 1.108584 10.292
## 36 1664 6041 5681 5 15453 1.063369 11.722
## 37 1665 5114 4858 68596 97306 1.052697 9.972
## 38 1666 4678 4319 1998 12738 1.083121 8.997
## 39 1667 5616 5322 35 15842 1.055242 10.938
## 40 1668 6073 5560 14 17278 1.092266 11.633
## 41 1669 6506 5829 3 19432 1.116143 12.335
## 42 1670 6278 5719 0 20198 1.097744 11.997
## 43 1671 6449 6061 5 15729 1.064016 12.510
## 44 1672 6443 6120 5 18230 1.052778 12.563
## 45 1673 6073 5822 5 17504 1.043112 11.895
## 46 1674 6113 5738 3 21201 1.065354 11.851
## 47 1675 6058 5717 1 17244 1.059647 11.775
## 48 1676 6552 5847 2 18732 1.120575 12.399
## 49 1677 6423 6203 2 19076 1.035467 12.626
## 50 1678 6568 6033 5 20678 1.088679 12.601
## 51 1679 6247 6041 2 21730 1.034100 12.288
## 52 1680 6548 6299 0 21053 1.039530 12.847
## 53 1681 6822 6533 0 23951 1.044237 13.355
## 54 1682 6909 6744 0 20691 1.024466 13.653
## 55 1683 7577 7158 0 20587 1.058536 14.735
## 56 1684 7575 7127 0 23202 1.062860 14.702
## 57 1685 7484 7246 0 23222 1.032846 14.730
## 58 1686 7575 7119 0 22609 1.064054 14.694
## 59 1687 7737 7214 0 21460 1.072498 14.951
## 60 1688 7487 7101 0 22921 1.054359 14.588
## 61 1689 7604 7167 0 23502 1.060974 14.771
## 62 1690 7909 7302 0 21461 1.083128 15.211
## 63 1691 7662 7392 0 22691 1.036526 15.054
## 64 1692 7602 7316 0 20874 1.039092 14.918
## 65 1693 7676 7483 0 20959 1.025792 15.159
## 66 1694 6985 6647 0 24109 1.050850 13.632
## 67 1695 7263 6713 0 19047 1.081931 13.976
## 68 1696 7632 7229 0 18638 1.055748 14.861
## 69 1697 8062 7767 0 20292 1.037981 15.829
## 70 1698 8426 7626 0 20183 1.104904 16.052
## 71 1699 7911 7452 0 20795 1.061594 15.363
## 72 1700 7578 7061 0 19443 1.073219 14.639
## 73 1701 8102 7514 0 20471 1.078254 15.616
## 74 1702 8031 7656 0 19481 1.048981 15.687
## 75 1703 7765 7683 0 20720 1.010673 15.448
## 76 1704 6113 5738 0 22684 1.065354 11.851
## 77 1705 8366 7779 0 22097 1.075460 16.145
## 78 1706 7952 7417 0 19847 1.072132 15.369
## 79 1707 8379 7687 0 21600 1.090022 16.066
## 80 1708 8239 7623 0 21291 1.080808 15.862
## 81 1709 7840 7380 0 21800 1.062331 15.220
## 82 1710 7640 7288 0 24620 1.048299 14.928
plot(Arbuthnot$Year, Arbuthnot$Ratio, pch = 16, col = "blue", cex = 0.5,
main = "Male Births / Femail Births over Year",
xlab = "Year", ylab = "Ratio"
)
tendency = lm(Arbuthnot$Ratio ~ Arbuthnot$Year)
abline(tendency)
Even though the ratio flutuates over the years, it’s always been above one, that means it’s always more male birth. The reason I chose the scatter plot is those date are discrete.
plot(Arbuthnot$Year, Arbuthnot$Total, type = 'n',
main = "Total over Time",
xlab = "Year", ylab = "Total (in 000s)")
lines(Arbuthnot$Year, Arbuthnot$Total, type='l')
There’s a big dip on the graph for the total from 1640 to 1660.
Exercise 3.3 Use the data set WomenQueue to:
library('vcd')
## Loading required package: grid
WomenQueue
## nWomen
## 0 1 2 3 4 5 6 7 8 9 10
## 1 3 4 23 25 19 18 5 1 1 0
graph = goodfit(WomenQueue, type = "binomial", par = list(size = 10))
plot(graph, type = "standing", shade = TRUE)
graph1 = goodfit(WomenQueue, type = "binomial", par = list(prob = .5,size = 10))
plot(graph1, type = "deviation", shade = TRUE)
## Warning in max.c * pmax(pmin(interpolate(abs(res)), 1), 0): longer object
## length is not a multiple of shorter object length
## Warning in max.c * pmax(pmin(interpolate(abs(res)), 1), 0): longer object
## length is not a multiple of shorter object length
The sample size is not big enough to be presented as normal distribution.
Exercise 3.4 Work on the distribution of male children in families in “Saxony” by fitting a binomial distribution, (n=12,p=1/2) , specifying equal probability for boys and girls. [Hint: you need to specify both size and prob values for goodfit ().]
Saxony
## nMales
## 0 1 2 3 4 5 6 7 8 9 10 11 12
## 3 24 104 286 670 1033 1343 1112 829 478 181 45 7
graph2 = goodfit(Saxony, type = "binomial", par = list(prob = 0.5, size = 12))
summary(graph2)
## Warning in summary.goodfit(graph2): Chi-squared approximation may be
## incorrect
##
## Goodness-of-fit test for binomial distribution
##
## X^2 df P(> X^2)
## Pearson 249.1954 12 2.013281e-46
## Likelihood Ratio 205.4060 12 2.493625e-37
graph3 = goodfit(Saxony, type = "binomial", par = list(size = 12))
graph3$par
## $prob
## [1] 0.519215
##
## $size
## [1] 12
summary(graph3)
##
## Goodness-of-fit test for binomial distribution
##
## X^2 df P(> X^2)
## Likelihood Ratio 97.0065 11 6.978187e-16
plot(graph2, type = "standing", shade = TRUE, main = "Bin(n=12, p=0.5)")
## Warning in max.c * pmax(pmin(interpolate(abs(res)), 1), 0): longer object
## length is not a multiple of shorter object length
plot(graph3, type = "standing", shade = TRUE, main = "Bin(n=12, p=p head)")