Problem 1. Copy and paste the following R code into your R console.

binom.test(x=10, n=30, alternative=“less”)

I received an error when I copied and pasted the command, but R Markdown would not let me export to an HTML file until I deleted the line that produced the error.

Does it run? Now type the command into the R console as you see it instead of copying and pasting it into R. Now it should work. Why? Investigate this problem.

binom.test(x=10, n=30, alternative="less")
## 
##  Exact binomial test
## 
## data:  10 and 30
## number of successes = 10, number of trials = 30, p-value = 0.04937
## alternative hypothesis: true probability of success is less than 0.5
## 95 percent confidence interval:
##  0.0000000 0.4994387
## sample estimates:
## probability of success 
##              0.3333333

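#In the first version R could not parse alternative=“less” because the pasted text uses curly (typographic) quotation marks, which R does not accept as string delimiters. That is why the text was not highlighted green as a string in the editor. Typing the command by hand produces straight quotes ("less"), which R reads as a character string.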
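A minimal sketch of the quote problem, assuming a UTF-8 session. The helper parses() is a hypothetical name introduced only for this illustration; it reports whether R can parse a piece of text as code without running it.

```r
# Build the pasted command with explicit Unicode escapes so the curly
# quotes (U+201C and U+201D) survive any editor.
curly <- "binom.test(x=10, n=30, alternative=\u201cless\u201d)"
straight <- "binom.test(x=10, n=30, alternative=\"less\")"

# Hypothetical helper: TRUE if R can parse the text as code, FALSE otherwise.
parses <- function(code) {
  tryCatch({
    parse(text = code)
    TRUE
  }, error = function(e) FALSE)
}

parses(curly)     # the curly-quote version is rejected by the parser
parses(straight)  # the straight-quote version parses

# Replacing the typographic quotes with straight ones repairs the command.
fixed <- gsub("\u201c|\u201d", "\"", curly)
identical(fixed, straight)
```

Word processors and PDF viewers often substitute curly quotes automatically, so code copied from them may need this kind of cleanup before it runs.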

Problem 2. For this exercise you will be using a subset of data collected from pregnancies that occurred between 1960 and 1967 among women in Oakland, California. First, create a folder called STAT7000 and inside that folder create another folder called assignment2. Next, change your working directory to the folder assignment2 and execute the following lines of code:

setwd("/Users/lexigachman/Desktop/stat 7000/assignment 2")

website='http://www.stat.berkeley.edu/~statlabs/data/babies23.data'
download.file(website, destfile='babies', method='auto')

setwd("/Users/lexigachman/Desktop/stat 7000/assignment 2")
website='http://www.stat.berkeley.edu/~statlabs/data/babies23.data'
download.file(website, destfile='babies', method='auto')

A description of the variables in the data file is found here: http://www.stat.berkeley.edu/~statlabs/data/babies.readme

  1. Read the data into R and store it in the variable babies. Hint: Use the function read.table() or the Import Dataset button. The values in the data file are separated by ‘white space’, that is one or more spaces, tabs, newlines or carriage returns.
babies <- read.csv("~/Desktop/stat 7000/assignment 2/babies", sep="")
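As a small illustration of reading whitespace-separated values, here is a sketch using read.table() on an inline string. The toy values and the names raw and toy are made up for this example and are not taken from babies23.data.

```r
# Toy whitespace-separated text; note the irregular spacing and the
# repeated column name "wt" in the header.
raw <- "id wt ht wt
1 120 62 100
2  113   64  135
3 128 65 115"

# With the default sep = "", any run of spaces or tabs separates values.
toy <- read.table(text = raw, header = TRUE)

# Duplicate header names are made unique, so the second "wt" becomes "wt.1".
names(toy)
nrow(toy)
```

This is likely why the real data set shows a column called wt.1: the file header appears to use the name wt twice.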
  1. What are the names of columns 1, 7, 12, 13? Store columns 1, 7, 12, 13 in the variable babies2. Only include the data for which the values in column 7 are less than 999, the values in column 12 are less than 99, and the values in 13 are less than 999. Hints: use the function subset() and names().
names(babies)[c(1,7,12,13)]

The names of columns 1, 7, 12, and 13 are id, wt, ht, and wt.1.

babies2 <- babies[,c(1,7,12,13)]
names(babies2)
## [1] "id"   "wt"   "ht"   "wt.1"
names(babies2)<-c("id","baby.weight","mother.height","mother.weight")
babies2<-subset(babies2,baby.weight<999 & mother.height<99 & mother.weight<999)
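The filtering step can be checked on a small made-up data frame. This is only a sketch: the values below are invented, and 999 and 99 stand in for the "unknown" codes that the exercise asks to exclude.

```r
# Toy data: rows 2, 3, and 4 each contain one "unknown" code.
toy <- data.frame(
  id            = 1:5,
  baby.weight   = c(120, 999, 113, 128, 108),
  mother.height = c(62, 64, 99, 65, 66),
  mother.weight = c(100, 135, 115, 999, 140)
)

# Keep only rows where all three measurements are known.
clean <- subset(toy, baby.weight < 999 & mother.height < 99 & mother.weight < 999)
clean$id
```

Only the rows with no unknown codes survive, so a single condition joined with & is enough to drop every incomplete record.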
  1. Change the variable names of babies2 to “id”, “baby.weight”, “mother.height”, “mother.weight”.
names(babies2)<-c("id","baby.weight","mother.height","mother.weight")
  1. Add another column to babies2 called “mother.bmi” which records the body mass index of the mother. Calculate the body mass index by using the mother’s weight in kilograms divided by the square of her height in meters. Hint: Note the units of the variables “mother.height”, “mother.weight”.
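# mother.weight is in pounds and mother.height is in inches, so the factor 703 converts lb/in^2 to kg/m^2
babies2$mother.bmi<-as.numeric(babies2$mother.weight)/as.numeric(babies2$mother.height)^2*703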
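To see where the factor 703 comes from, the sketch below compares it with an explicit unit conversion. The weight and height are arbitrary example values, and the variable names are introduced only for this check.

```r
# Exact conversion constants.
lb_to_kg <- 0.45359237  # kilograms per pound
in_to_m  <- 0.0254      # meters per inch

# Arbitrary example: 150 lb, 65 in.
weight_lb <- 150
height_in <- 65

# BMI from metric units versus the 703 shortcut.
bmi_metric <- (weight_lb * lb_to_kg) / (height_in * in_to_m)^2
bmi_703    <- weight_lb / height_in^2 * 703

# The exact factor is about 703.07, so the two results agree closely.
lb_to_kg / in_to_m^2
c(bmi_metric, bmi_703)
```

The rounding of 703.07 to 703 changes the result only in the third decimal place for typical values, which should not affect the obesity classification in most cases.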
  1. A person with a body mass index of 30 or more is classified as obese. How many mothers are obese?
sum(babies2$mother.bmi>=30)
## [1] 36
  1. Find the identification numbers of the ten mothers with the highest body mass index. Hint: use the function order().
ord<-order(babies2$mother.bmi, decreasing=TRUE)
babies2$id[ord][1:10]
##  [1] 2430 7580 2908 8499 8659 6084 6736 7109 6035 7427
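The order() approach can be traced on a tiny made-up example. The ids and BMI values below are invented for illustration.

```r
# Toy data: four mothers with invented BMI values.
toy <- data.frame(
  id         = c(11, 12, 13, 14),
  mother.bmi = c(22.5, 31.0, 27.3, 35.2)
)

# order() returns row positions sorted by BMI, largest first.
ord <- order(toy$mother.bmi, decreasing = TRUE)
ord

# Indexing id by those positions lists the ids from highest to lowest BMI;
# taking the first two gives the top two mothers.
top2 <- toy$id[ord][1:2]
top2
```

Here ord is 4, 2, 3, 1, so the two ids with the highest BMI are 14 and 12. The same pattern with 1:10 yields the ten ids requested in the exercise.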