What years are included in this data set? What are the dimensions of the data frame and what are the variable or column names?

1940 through 2002, for 63 years of data.

source("C:\\Users\\Jeremy\\Documents\\R\\win-library\\3.2\\DATA606\\labs\\Lab0\\more\\present.R")
source("C:\\Users\\Jeremy\\Documents\\R\\win-library\\3.2\\DATA606\\labs\\Lab0\\more\\arbuthnot.R")
range(present$year)
## [1] 1940 2002
max(present$year) - min(present$year) + 1
## [1] 63

The data frame has 63 rows (observations) and 3 columns (variables): year, boys (born), and girls (born).

str(present)
## 'data.frame':    63 obs. of  3 variables:
##  $ year : num  1940 1941 1942 1943 1944 ...
##  $ boys : num  1211684 1289734 1444365 1508959 1435301 ...
##  $ girls: num  1148715 1223693 1364631 1427901 1359499 ...
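
The dimensions and column names can also be read off directly with dim() and names(). The following is a minimal sketch, assuming a stand-in data frame (the hypothetical name present_head) built from the five rows visible in the str() output above. On the full present data frame the same calls should report 63 rows and 3 columns.

```r
# Stand-in for the first five rows of `present`, copied from the str() output.
# `present_head` is a hypothetical name used only for this illustration.
present_head <- data.frame(
  year  = c(1940, 1941, 1942, 1943, 1944),
  boys  = c(1211684, 1289734, 1444365, 1508959, 1435301),
  girls = c(1148715, 1223693, 1364631, 1427901, 1359499)
)

dims <- dim(present_head)         # c(number of rows, number of columns)
col_names <- names(present_head)  # column (variable) names

print(dims)
print(col_names)
```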

How do these counts compare to Arbuthnot’s? Are they on a similar scale?

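Both data sets record annual counts, but the magnitudes differ greatly: present records more than a million births per sex in each year shown in the str() output above, while Arbuthnot's London christening counts number in the thousands per year. The arbuthnot set covers 82 sequential years, from 1629 through 1710. The present set covers 63 years, from 1940 through 2002.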

range(arbuthnot$year)
## [1] 1629 1710
max(arbuthnot$year) - min(arbuthnot$year) + 1
## [1] 82
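
Whether the counts themselves are on a similar scale can be checked by comparing the order of magnitude of the yearly totals. This is a sketch only, assuming a hypothetical helper count_scale() and a five-row stand-in (present_head) copied from the str() output. The arbuthnot values are not reproduced in this report, so the call on that data set is left as a comment.

```r
# Stand-in for the first five rows of `present`, copied from the str() output.
present_head <- data.frame(
  year  = c(1940, 1941, 1942, 1943, 1944),
  boys  = c(1211684, 1289734, 1444365, 1508959, 1435301),
  girls = c(1148715, 1223693, 1364631, 1427901, 1359499)
)

# Hypothetical helper: range of yearly totals and their order of magnitude.
count_scale <- function(df) {
  totals <- df$boys + df$girls
  list(
    range     = range(totals),
    magnitude = floor(log10(range(totals)))  # 6 means "millions"
  )
}

scale_present <- count_scale(present_head)
print(scale_present)

# With both lab data sets loaded, the same helper could be applied as:
# count_scale(present)
# count_scale(arbuthnot)
```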

Make a plot that displays the boy-to-girl ratio for every year in the data set. What do you see? Does Arbuthnot’s observation about boys being born in greater proportion than girls hold up in the U.S.? Include the plot in your response.

Yes. In both data sets male births exceed female births in every year, in the modern era and in the 17th and early 18th centuries alike. The modern proportion is far more stable: it stays between 0.5112 and 0.5143, whereas the arbuthnot proportion ranges from 0.5027 to 0.5362, with higher year-to-year volatility and wider dispersion.

totalBirths.present <- present$boys + present$girls
b2gRatio.present <- present$boys / totalBirths.present
plot(x = present$year, y = b2gRatio.present, type = 'l', main = "Modern Birth Rates:\n1940 to 2002", xlab = "year", ylab = "proportion of male births", ylim = c(.5,.54))
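
The question asks for the boy-to-girl ratio, while the series plotted here is the proportion of boys among all births. The two carry the same information, since ratio = p / (1 - p), so a proportion above 0.5 corresponds to a ratio above 1. A minimal sketch of that relationship, assuming a five-row stand-in (the hypothetical present_head) copied from the str() output:

```r
# Stand-in for the first five rows of `present`, copied from the str() output.
present_head <- data.frame(
  year  = c(1940, 1941, 1942, 1943, 1944),
  boys  = c(1211684, 1289734, 1444365, 1508959, 1435301),
  girls = c(1148715, 1223693, 1364631, 1427901, 1359499)
)

# Proportion of boys among all births (what the plot shows).
prop_boys <- present_head$boys / (present_head$boys + present_head$girls)

# Boy-to-girl ratio in the literal sense: boys divided by girls.
ratio_b2g <- present_head$boys / present_head$girls

# Recover the ratio from the proportion: ratio = p / (1 - p).
ratio_from_prop <- prop_boys / (1 - prop_boys)

print(round(ratio_b2g, 4))
print(all(present_head$boys > present_head$girls))
```

Either measure supports the same conclusion; the ratio is simply read against 1 instead of 0.5.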

round(fivenum(b2gRatio.present), 4)
## [1] 0.5112 0.5121 0.5125 0.5130 0.5143
b2gRatio.arbuthnot <- arbuthnot$boys / (arbuthnot$boys + arbuthnot$girls)
plot(x = arbuthnot$year, y = b2gRatio.arbuthnot, type = 'l', main = "Historical Birth Rates:\n1629 to 1710", xlab = "year", ylab = "proportion of male births", ylim = c(.5,.54))

round(fivenum(b2gRatio.arbuthnot), 4)
## [1] 0.5027 0.5118 0.5157 0.5211 0.5362
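
The difference in dispersion can be quantified from the two five-number summaries alone. The sketch below copies the rounded fivenum() output shown above and compares the full range and the spread between the hinges; spread() is a hypothetical helper introduced only for this illustration.

```r
# Five-number summaries copied from the rounded output above
# (minimum, lower hinge, median, upper hinge, maximum).
five_present   <- c(0.5112, 0.5121, 0.5125, 0.5130, 0.5143)
five_arbuthnot <- c(0.5027, 0.5118, 0.5157, 0.5211, 0.5362)

# Hypothetical helper: full range and hinge spread of a five-number summary.
spread <- function(five) {
  c(range = five[5] - five[1], hinge_spread = five[4] - five[2])
}

spread_present   <- spread(five_present)
spread_arbuthnot <- spread(five_arbuthnot)

print(round(spread_present, 4))
print(round(spread_arbuthnot, 4))
```

On these rounded values both measures come out roughly ten times wider for arbuthnot than for present.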

In what year did we see the most total number of births in the U.S.?

Total births peaked in 1961, the maximum over the whole 1940 to 2002 series (visible in the plot below).

present$year[which.max(totalBirths.present)]
## [1] 1961
plot(x = present$year, y = totalBirths.present, type = 'l', main = "Modern Births:\n1940 to 2002", xlab = "year", ylab = "total births", ylim = c(0,5000000))