source("more/arbuthnot.R")
source("more/present.R")

Exercise 1

What command would you use to extract just the counts of girls baptized? Try it!

arbuthnot$girls
##  [1] 4683 4457 4102 4590 4839 4820 4928 4605 4457 4952 4784 5332 5200 4910
## [15] 4617 3997 3919 3395 3536 3181 2746 2722 2840 2908 2959 3179 3349 3382
## [29] 3289 3013 2781 3247 4107 4803 4881 5681 4858 4319 5322 5560 5829 5719
## [43] 6061 6120 5822 5738 5717 5847 6203 6033 6041 6299 6533 6744 7158 7127
## [57] 7246 7119 7214 7101 7167 7302 7392 7316 7483 6647 6713 7229 7767 7626
## [71] 7452 7061 7514 7656 7683 5738 7779 7417 7687 7623 7380 7288
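
The $ operator is one of several equivalent ways to pull a single column out of a data frame. A minimal sketch, assuming a small made-up data frame called toy (hypothetical values, not Arbuthnot's counts) in place of arbuthnot:

```r
# Hypothetical stand-in for arbuthnot; the values are invented for illustration.
toy <- data.frame(year = 1:3, girls = c(10, 12, 11))

by_dollar <- toy$girls        # the form used above
by_name   <- toy[["girls"]]   # list-style extraction by column name
by_index  <- toy[, "girls"]   # matrix-style indexing: all rows, one column

# All three return the same numeric vector.
identical(by_dollar, by_name) && identical(by_dollar, by_index)
```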

Exercise 2

Is there an apparent trend in the number of girls baptized over the years?

How would you describe it?

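#I googled how to draw a regression line on a scatter plot to help with this question.
#Overall the trend is upward: the slope tells us that girls' baptisms increase by about 54.6 per year on average.
#The increase is not steady. The counts printed above fall from about 5,000 to under 3,000 before climbing past 7,000.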
plot(x = arbuthnot$year, y = arbuthnot$girls)
lm(formula = arbuthnot$girls ~ arbuthnot$year)
## 
## Call:
## lm(formula = arbuthnot$girls ~ arbuthnot$year)
## 
## Coefficients:
##    (Intercept)  arbuthnot$year  
##       -85618.0            54.6
abline(-85618.0, 54.6)
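
The line can also be drawn without retyping rounded coefficients, because abline() accepts a fitted lm object directly. A minimal sketch, assuming a small made-up data frame called toy (hypothetical values, not Arbuthnot's counts) in place of arbuthnot; the name fit is also just illustrative:

```r
# Hypothetical stand-in for arbuthnot; the values are invented for illustration.
toy <- data.frame(year  = 1:6,
                  girls = c(10, 12, 11, 15, 16, 18))

# Fit once and keep the model object instead of copying its printed coefficients.
fit <- lm(girls ~ year, data = toy)

plot(toy$year, toy$girls, xlab = "year", ylab = "girls")
abline(fit)        # abline() reads the intercept and slope from the lm object

coef(fit)["year"]  # estimated average change in girls per year
```

Keeping the model object avoids rounding error in the plotted line and keeps the plot in sync if the data change.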

Exercise 3

Now, make a plot of the proportion of boys over time. What do you see?

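#The proportion of boys among all baptisms drifts down only very slightly (a slope of about -0.00007 per year).
#The fitted line stays a little above 0.5, so boys still outnumber girls.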
propBoys <- arbuthnot$boys/(arbuthnot$boys+arbuthnot$girls)
plot(arbuthnot$year, propBoys, type = "l")
lm(formula = propBoys ~ arbuthnot$year)
## 
## Call:
## lm(formula = propBoys ~ arbuthnot$year)
## 
## Coefficients:
##    (Intercept)  arbuthnot$year  
##      6.322e-01      -6.902e-05
abline(6.322e-01, -6.902e-05)
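
To judge how small that slope is, it helps to multiply it out over the whole record. A rough sketch of the arithmetic, assuming one row per consecutive year (the 82 girls counts printed in Exercise 1); the variable names are illustrative:

```r
slope   <- -6.902e-05   # slope from the lm() output above
n_years <- 82           # number of yearly counts printed in Exercise 1

# Change in the fitted proportion between the first and last year.
total_change <- slope * (n_years - 1)
total_change
```

Under that assumption the fitted proportion of boys moves by roughly half a percentage point over the entire data set.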

On Your Own

1. What years are included in this data set? What are the dimensions of the data frame and what are the variable or column names?

#Years 1940 - 2002
#Dimensions are 63 rows and 3 columns
#Column names are "year", "boys", and "girls"
dim(present) 
## [1] 63  3
names(present)
## [1] "year"  "boys"  "girls"
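
The first and last year can be read off with range() rather than by scanning the column. A minimal sketch, assuming a small made-up data frame called toy (hypothetical values, not the real birth counts) in place of present:

```r
# Hypothetical stand-in for present; the values are invented for illustration.
toy <- data.frame(year  = 2001:2004,
                  boys  = c(52, 55, 51, 53),
                  girls = c(50, 52, 49, 51))

range(toy$year)  # smallest and largest year in the data
dim(toy)         # number of rows, then number of columns
names(toy)       # column names
```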

2. How do these counts compare to Arbuthnot’s? Are they on a similar scale?

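#There are many more births in the "present" data than baptisms in the "arbuthnot" data, so the counts are not on a similar scale.
#Arbuthnot records a few thousand girls per year (2,722 to 7,779 in the output above), while the U.S. totals run into the millions (4,268,326 births in 1961).
#The fitted slopes differ in the same way: girls' births rise by about 5,211 per year on average, compared to about 55 per year for girls' baptisms.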
plot(present$year, present$girls)
lm(formula = present$girls ~ present$year)
## 
## Call:
## lm(formula = present$girls ~ present$year)
## 
## Coefficients:
##  (Intercept)  present$year  
##     -8477154          5211
abline(-8477154, 5211)

3. Make a plot that displays the boy-to-girl ratio for every year in the data set. What do you see? Does Arbuthnot’s observation about boys being born in greater proportion than girls hold up in the U.S.?

#The boy-to-girl ratio stays above 1, so more boys than girls are born and Arbuthnot's observation holds up in the U.S.
#But the plot shows the ratio decreasing over time.
boyRatio <- present$boys/present$girls
plot(present$year, boyRatio)
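
A reference line at 1 makes it easier to see whether boys outnumber girls, and all() can confirm it for every year. A minimal sketch, assuming a small made-up data frame called toy (hypothetical values, not the real birth counts) in place of present:

```r
# Hypothetical stand-in for present; the values are invented for illustration.
toy <- data.frame(year  = 2001:2004,
                  boys  = c(52, 55, 51, 53),
                  girls = c(50, 52, 49, 51))

ratio <- toy$boys / toy$girls

# Extend the y-axis to include 1 so the reference line is visible.
plot(toy$year, ratio, type = "l",
     ylim = range(c(1, ratio)),
     xlab = "year", ylab = "boys / girls")
abline(h = 1, lty = 2)  # dashed line where boys and girls are equal

all(ratio > 1)          # TRUE only if boys outnumber girls in every year
```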

4. In what year did we see the most total number of births in the U.S.?

#Year 1961 had the most births.
#I made a matrix called totalBirths with two columns: the year and the total births.
#I used the max() command to find the largest value in the total births column.
#This loop searches the total births column for that value, then prints the entire row.
totalBirths <- matrix(c(present$year, present$boys+present$girls), ncol = 2)
maxBirth <- max(totalBirths[,2])
i<-1
while(i <= nrow(totalBirths))
{
  if (maxBirth == totalBirths[i,2])
  {
    print(totalBirths[i,])
  }
  i <- i + 1
}
## [1]    1961 4268326
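
The same lookup can be done without a loop, since which.max() returns the position of the largest value. A minimal sketch, assuming a small made-up data frame called toy (hypothetical values, not the real birth counts) in place of present:

```r
# Hypothetical stand-in for present; the values are invented for illustration.
toy <- data.frame(year  = 2001:2004,
                  boys  = c(52, 55, 51, 53),
                  girls = c(50, 52, 49, 51))

toy$total <- toy$boys + toy$girls

# which.max() gives the row index of the largest total.
peak <- which.max(toy$total)
toy[peak, c("year", "total")]
```

Note that which.max() reports only the first maximum. If ties matter, toy[toy$total == max(toy$total), ] keeps every matching row, as the loop above does.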