Exercise

source("more/arbuthnot.R")

1. What command would you use to extract just the counts of girls baptized?

The counts are extracted with arbuthnot$girls. Wrapping that in length() shows that the vector holds 82 yearly counts.

length(arbuthnot$girls)
## [1] 82
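For reference, a data frame column can be pulled out in several equivalent ways. The sketch below uses a small made-up data frame (`toy`, with invented numbers, not Arbuthnot's data) so that it runs without the lab files. With `arbuthnot` loaded, the same expressions should apply to it.

```r
# Toy stand-in for arbuthnot; the values are invented for illustration only.
toy <- data.frame(year = c(1, 2, 3), boys = c(10, 12, 11), girls = c(9, 11, 10))

by_dollar  <- toy$girls        # extract by name with $
by_bracket <- toy[["girls"]]   # extract by name with [[ ]]
by_index   <- toy[, "girls"]   # extract by column name, matrix-style indexing

# All three expressions return the same numeric vector.
identical(by_dollar, by_bracket) && identical(by_dollar, by_index)

# length() counts the entries in the extracted vector.
length(by_dollar)
```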

2. Is there an apparent trend in the number of girls baptized over the years? How would you describe it?

plot(x = arbuthnot$year, y = arbuthnot$girls, type = "l")

Yes. The number of girls baptized decreases from 1640 to 1660, then rises sharply from 1660 to 1700.

3. Now, make a plot of the proportion of boys over time. What do you see?

prop_boys <- arbuthnot$boys / (arbuthnot$boys + arbuthnot$girls)
plot(x = arbuthnot$year, y = prop_boys, type = "p", ylab = "Proportion of Boys")

The proportion of boys appears to decrease over time, but only slightly and slowly.

On Your Own

source("more/present.R")

1. What years are included in this data set? What are the dimensions of the data frame and what are the variable or column names?

head(present)
##   year    boys   girls
## 1 1940 1211684 1148715
## 2 1941 1289734 1223693
## 3 1942 1444365 1364631
## 4 1943 1508959 1427901
## 5 1944 1435301 1359499
## 6 1945 1404587 1330869
tail(present)
##    year    boys   girls
## 58 1997 1985596 1895298
## 59 1998 2016205 1925348
## 60 1999 2026854 1932563
## 61 2000 2076969 1981845
## 62 2001 2057922 1968011
## 63 2002 2057979 1963747
dim(present)
## [1] 63  3
names(present)
## [1] "year"  "boys"  "girls"

The data cover the years 1940 to 2002. The data frame has 63 rows and 3 columns, and the variables are year, boys, and girls.
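The year span and dimensions can also be read off directly rather than from head() and tail(). This is a minimal sketch, assuming a data frame shaped like `present`. `present_subset` is a hypothetical name holding only the first and last rows printed above, so its row count is 2 rather than 63.

```r
# First and last rows of present, copied from the output above.
present_subset <- data.frame(
  year  = c(1940, 2002),
  boys  = c(1211684, 2057979),
  girls = c(1148715, 1963747)
)

range(present_subset$year)  # earliest and latest year
nrow(present_subset)        # number of rows (one per year)
ncol(present_subset)        # number of columns (variables)
```

With the full data loaded, the same three calls on `present` should agree with the dim() output shown above.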

2. How do these counts compare to Arbuthnot’s? Are they on a similar scale?

plot(arbuthnot$year, arbuthnot$boys + arbuthnot$girls, type = "l")

plot(present$year, present$boys + present$girls, type = "l")

Comparing the two plots shows that the counts are not on the same scale. Arbuthnot's yearly totals are on the order of 10^4, while the totals in the present data are on the order of 10^6.
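The difference in scale can also be checked numerically instead of read off the axes. The sketch below defines a small helper, `magnitude` (a name introduced here for illustration), and applies it to the 1940 and 2002 totals taken from the rows printed earlier. The Arbuthnot call is left as a comment because it needs the lab data to be loaded.

```r
# Order of magnitude of a positive number: the exponent of its power-of-ten scale.
magnitude <- function(x) floor(log10(x))

# Totals for 1940 and 2002, from the present rows printed above.
present_totals <- c(1211684 + 1148715, 2057979 + 1963747)
magnitude(present_totals)

# With the lab data loaded, the same helper works on Arbuthnot's totals:
# magnitude(range(arbuthnot$boys + arbuthnot$girls))
```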

3. Make a plot that displays the boy-to-girl ratio for every year in the data set. What do you see? Does Arbuthnot’s observation about boys being born in greater proportion than girls hold up in the U.S.? Include the plot in your response.

present$ratio <- present$boys / present$girls
plot(x = present$year, y = present$ratio, type = "l", ylab = "boy-to-girl ratio")

The plot suggests that the boy-to-girl ratio has generally declined over the period.

The ratio stays above 1 throughout, so Arbuthnot's observation that boys are born in greater proportion than girls also holds in the U.S. data.
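Both statements can be checked numerically rather than by eye. The sketch below uses only the twelve rows printed by head() and tail() earlier, stored under the hypothetical name `shown`, so it is a partial check. With the full data loaded, all(present$ratio > 1) performs the same test over every year.

```r
# The twelve rows of present shown by head() and tail() above.
shown <- data.frame(
  year  = c(1940:1945, 1997:2002),
  boys  = c(1211684, 1289734, 1444365, 1508959, 1435301, 1404587,
            1985596, 2016205, 2026854, 2076969, 2057922, 2057979),
  girls = c(1148715, 1223693, 1364631, 1427901, 1359499, 1330869,
            1895298, 1925348, 1932563, 1981845, 1968011, 1963747)
)
shown$ratio <- shown$boys / shown$girls

# TRUE when boys outnumber girls in every listed year.
all(shown$ratio > 1)

# Compare the average ratio in the earliest and latest listed years.
early_mean <- mean(shown$ratio[shown$year <= 1945])
late_mean  <- mean(shown$ratio[shown$year >= 1997])
early_mean > late_mean
```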

4. In what year did we see the most total number of births in the U.S.? You can refer to the help files or the R reference card http://cran.r-project.org/doc/contrib/Short-refcard.pdf to find helpful commands.

The plot of total births for the present data in Question 2 above shows the number of births peaking in the years around 1960. This probably corresponds to what is commonly called the "baby boom" era.

maxBirth <- present[which.max(present$boys + present$girls), ]
maxBirth
##    year    boys   girls    ratio
## 22 1961 2186274 2082052 1.050057

Total births peaked in 1961, when 4,268,326 babies (about \(4.27\times 10^{6}\)) were born.