Muller-0

Start!

First we get the “present” and “arbuthnot” objects

source("C:/Users/Exped/Desktop/Textbooks/606 Homeworks/Lab material/DATA606-master/inst/labs/Lab0/more/present.r")
source("C:/Users/Exped/Desktop/Textbooks/606 Homeworks/Lab material/DATA606-master/inst/labs/Lab0/more/arbuthnot.r")
df1 = arbuthnot
df2 = present

What years are included in this data set? What are the dimensions of the data frame and what are the variable or column names?

We use range to find the first and last years in the data

range(df2$year)

## [1] 1940 2002

We use dim (short for dimensions) to get the dimensions of our dataframe

dim(df2)

## [1] 63  3

That is 63 rows and 3 columns

We use names to get the names of columns/attributes/variables

names(df2)

## [1] "year"  "boys"  "girls"
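The three calls above work on any data frame. Here is a minimal sketch, assuming a small made-up data frame called toy (hypothetical values, not the real birth counts) with the same column names:

```r
# Hypothetical toy data with the same column names as `present`;
# the counts are made up for illustration only.
toy <- data.frame(
  year  = 2000:2003,
  boys  = c(105, 110, 108, 112),
  girls = c(100, 104, 103, 106)
)

year_range <- range(toy$year)  # first and last year
shape      <- dim(toy)         # c(rows, columns)
columns    <- names(toy)       # column names

print(year_range)
print(shape)
print(columns)
```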

How does df2 (present) compare to df1 (arbuthnot)? Are they on a similar scale?

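# First and last year covered by each data frame
rangeOfYearsdf2 = range(df2$year)
rangeOfYearsdf1 = range(df1$year)
# Total recorded births; use the full column names rather than relying on partial matching of $boy and $girl
sumOfBGdf1 = sum(df1$boys)+sum(df1$girls)
sumOfBGdf2 = sum(df2$boys)+sum(df2$girls)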

We can see that the present df covers 19 fewer years (1940 to 2002, a span of 62 years) than the arbuthnot df (1629 to 1710, a span of 81 years)

However, the total number of births recorded in the present df, about 2.3180942 × 10^8, is much larger than the 938223 recorded in the arbuthnot df

The present df is on a much greater scale, with about 2.308712 × 10^8 more births
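The comparison above comes down to two numbers per data frame: the span of years and the total births. A minimal sketch of that arithmetic, assuming a hypothetical helper summarise_births and two made-up toy data frames standing in for arbuthnot and present:

```r
# Hypothetical helper: summarise a births data frame that has
# `year`, `boys`, and `girls` columns.
summarise_births <- function(df) {
  list(
    span_years   = diff(range(df$year)),         # last year minus first year
    total_births = sum(df$boys) + sum(df$girls)  # all recorded births
  )
}

# Made-up toy inputs, not the real data.
old_toy <- data.frame(year = 1700:1702, boys = c(5, 6, 7), girls = c(4, 5, 6))
new_toy <- data.frame(year = 2000:2001, boys = c(50, 60), girls = c(45, 55))

old_summary <- summarise_births(old_toy)
new_summary <- summarise_births(new_toy)

# How many more years the older set spans, and how many more births the newer set holds
span_difference   <- old_summary$span_years - new_summary$span_years
births_difference <- new_summary$total_births - old_summary$total_births
```

Note that the span counts the distance between the first and last year, so a data frame with 63 yearly rows has a span of 62.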

Make a plot that displays the boy-to-girl ratio for every year in the data set. What do you see? Does Arbuthnot’s observation about boys being born in greater proportion than girls hold up in the U.S.? Include the plot in your response.

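# Step plot of the yearly boy-to-girl ratio; xlim runs to 2050, so the fitted line is drawn past the last year of data
plot(df2$year,df2$boys/df2$girls, type = 's', main = 'Boys to girls ratio in present df', ylab='B/G Ratio', xlab = 'Years', xlim = c(1940, 2050))

# Fit a linear trend to the ratio and draw it over the plot
lineOfBestFit = lm(df2$boys/df2$girls ~ df2$year)
abline(lineOfBestFit,col='#641399')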

We see a steady decline in the boy-to-girl birth ratio; however, Arbuthnot’s observation still holds: boys are still born in greater numbers than girls.
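Both claims can also be checked numerically rather than read off the plot. A minimal sketch, assuming a made-up toy data frame (hypothetical values, not the real counts): a negative slope on year indicates a declining ratio, and a ratio above 1 in every year means boys outnumber girls throughout.

```r
# Made-up toy data: the ratio falls by 0.01 each year but stays above 1.
toy <- data.frame(
  year  = 2000:2004,
  boys  = c(106, 105, 104, 103, 102),
  girls = c(100, 100, 100, 100, 100)
)
toy$ratio <- toy$boys / toy$girls

# Linear trend of the ratio over time
fit   <- lm(ratio ~ year, data = toy)
slope <- unname(coef(fit)["year"])  # negative slope means a declining ratio

# TRUE only if boys outnumber girls in every year
all_above_one <- all(toy$ratio > 1)
```

Passing the data frame through the data argument keeps the formula short and lets the fitted model be reused with predict on new years.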

In what year did we see the most total number of births in the U.S.?

answer = df2$year[df2$girls+df2$boys == max(df2$girls+df2$boys)]

The year with the most births is 1961
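An alternative way to find the peak year is which.max, which returns the position of the largest value. A minimal sketch on a made-up toy data frame (hypothetical values):

```r
# Made-up toy data, not the real birth counts.
toy <- data.frame(
  year  = 2000:2003,
  boys  = c(105, 110, 108, 112),
  girls = c(100, 104, 103, 106)
)

# Total births per year, then the year at the position of the maximum
toy$total <- toy$boys + toy$girls
peak_year <- toy$year[which.max(toy$total)]
```

One difference to keep in mind: which.max returns only the first position when several years tie for the maximum, while the == comparison returns every tied year.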

Michael Muller

February 4, 2017
