source("more/arbuthnot.R")
source("more/present.R")
What years are included in this data set? What are the dimensions of the data frame and what are the variable or column names?
The present data set includes every year from 1940 through 2002.
Present data has the following dimensions (rows, columns): 63, 3
Present data includes the following columns: year, boys, girls
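The three answers above can be reproduced with base R calls. The following is a minimal sketch that uses a small made-up data frame, `toy` (hypothetical values, not the real `present` data), so that it runs on its own. Substituting `present` for `toy` should reproduce the years, dimensions, and column names reported above.

```r
# Toy stand-in for `present`; the counts are invented for illustration only.
toy <- data.frame(
  year  = 1940:1944,
  boys  = c(105, 110, 120, 125, 122),
  girls = c(100, 104, 115, 119, 117)
)

years_covered <- range(toy$year)  # first and last year in the data
frame_dims    <- dim(toy)         # c(number of rows, number of columns)
column_names  <- names(toy)       # variable (column) names

print(years_covered)
print(frame_dims)
print(column_names)
```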
How do these counts compare to Arbuthnot’s? Are they on a similar scale?
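The present data set covers 63 years, from 1940 to 2002, whereas Arbuthnot's covers 82 years, from 1629 to 1710. The scales differ considerably: the present data record U.S. births in the millions per year, while Arbuthnot's London counts are in the thousands.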
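Because both endpoint years are included, the number of years in each data set can be checked by counting the year sequence directly. The sketch below also defines a hypothetical helper, `total_range()`, that returns the smallest and largest yearly total, which is one way to compare scales. The `toy` frame holds invented values and only demonstrates the call.

```r
# Both endpoints are included, so count the elements of each year sequence.
present_years   <- length(1940:2002)
arbuthnot_years <- length(1629:1710)
print(c(present = present_years, arbuthnot = arbuthnot_years))

# Hypothetical helper: smallest and largest yearly total (boys + girls).
# Assumes the data frame has `boys` and `girls` columns.
total_range <- function(df) range(df$boys + df$girls)

# Invented toy frame, only to show the call; not real data.
toy <- data.frame(boys = c(105, 110, 120), girls = c(100, 104, 115))
print(total_range(toy))
```

Calling `total_range(present)` and `total_range(arbuthnot)` would put the two scales side by side.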
Make a plot that displays the boy-to-girl ratio for every year in the data set. What do you see? Does Arbuthnot’s observation about boys being born in greater proportion than girls hold up in the U.S.? Include the plot in your response.
plot(x = present$year, y = present$boys / (present$boys + present$girls), type = "l")
The proportion of boys born each year is still slightly higher than the proportion of girls, so Arbuthnot's observation holds in the U.S. data. The proportion fluctuates from year to year within a narrower range than it does in Arbuthnot's data.
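The question asks about the boy-to-girl ratio, while the plot shows the proportion of boys among all births. The two carry the same information: the ratio exceeds 1 exactly when the proportion exceeds 0.5. A minimal sketch of that equivalence, using invented toy counts rather than the real data:

```r
# Invented toy counts, for illustration only.
boys  <- c(105, 110, 120, 125, 122)
girls <- c(100, 104, 115, 119, 117)

ratio      <- boys / girls            # boy-to-girl ratio
proportion <- boys / (boys + girls)   # share of boys among all births

# The two measures always agree on which sex is more numerous.
same_verdict <- all((ratio > 1) == (proportion > 0.5))
print(same_verdict)
```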
sd(present$boys / (present$boys + present$girls))
## [1] 0.0006757978
sd(arbuthnot$boys / (arbuthnot$boys + arbuthnot$girls))
## [1] 0.007219292
In what year did we see the most total number of births in the U.S.?
present$year[present$boys + present$girls == max(present$boys + present$girls, na.rm = TRUE)]
## [1] 1961
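An alternative way to find the peak year is `which.max()`, which returns the index of the first maximum. This is a sketch on invented toy values, not the real data. Unlike the `==` comparison above, it reports only one year if several years tie for the maximum.

```r
# Invented toy frame, for illustration only.
toy <- data.frame(
  year  = 1940:1944,
  boys  = c(105, 110, 120, 125, 122),
  girls = c(100, 104, 115, 119, 117)
)

total     <- toy$boys + toy$girls        # total births per year
peak_year <- toy$year[which.max(total)]  # year with the largest total
print(peak_year)
```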