BIOL-6605 Group assignment 1
#1) The Behavioral Risk Factor Surveillance System (BRFSS) is an annual telephone survey of 350,000 people in the United States. The BRFSS is designed to identify risk factors in the adult population and report emerging health trends. For example, respondents are asked about their diet, weekly exercise, possible tobacco use, and healthcare coverage.
Based on the description above, is this an experiment or observational study? Briefly justify your answer.
This is an observational study because the researchers do not assign a treatment or manipulate any variable; they only collect data from respondents through a survey.
#2) Read in the data set cdc. The file cdc.csv can be found on Canvas.
cdc <- read.csv("cdc.csv")
Examine the data frame using any of the techniques demonstrated in class. Identify the data types (discrete, continuous, etc.) for the following columns. Be specific.
a. height: numerical continuous
b. age: numerical discrete
c. gender: categorical nominal
d. exerany: categorical nominal (the values 0 and 1 are codes for no and yes, not quantities)
Hint. You have to think a little more about exerany (any exercise?). The values 0 & 1 are codes for no (0) and yes (1).
nrow(cdc)
## [1] 20000
ncol(cdc)
## [1] 9
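Beyond nrow() and ncol(), the column types can be checked directly. The sketch below is only an illustration: it uses a small stand-in data frame named toy (a hypothetical name) with invented values, borrowing just the four column names from the question, so it runs without cdc.csv. The same calls should work on cdc itself, assuming the file loads with those columns.

```r
# Toy stand-in for cdc: column names come from the assignment, values are invented.
toy <- data.frame(
  height  = c(70, 64, 60, 66),        # made-up heights
  age     = c(77L, 33L, 49L, 42L),    # whole years
  gender  = c("m", "f", "f", "f"),
  exerany = c(0L, 0L, 1L, 1L)         # 0 = no, 1 = yes
)

str(toy)   # storage type of each column
dim(toy)   # rows and columns in one call

# exerany is stored as an integer, but the numbers are only labels,
# so it can be recoded as a factor to make the categories explicit.
toy$exerany_f <- factor(toy$exerany, levels = c(0, 1), labels = c("no", "yes"))
table(toy$exerany_f)   # counts per category
```

Recoding as a factor is one way to keep R from treating the 0/1 codes as numbers in later summaries; it is a convenience, not a requirement of the assignment.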
#4) A city council has requested a household survey be conducted in a suburban area of their city. The area is broken into many distinct and unique neighborhoods, some including large homes, some with only apartments, and others a diverse mixture of housing structures. Identify the sampling methods described below, and comment on whether or not you think they would be effective in this setting.
Randomly sample 50 households from the city. # Simple random sample. Yes, this would be effective because random selection helps avoid bias.
Divide the city into neighborhoods, and sample 20 households from each neighborhood. # Stratified sample. Yes, this would be good because it covers the entire survey area.
Divide the city into neighborhoods, randomly sample 10 neighborhoods, and sample all households from those neighborhoods. # Cluster sample. No, because this would not cover all of the neighborhoods.
Divide the city into neighborhoods, randomly sample 10 neighborhoods, and then randomly sample 20 households from those neighborhoods. # Multistage sample. Yes, this would be effective because the households are still chosen at random.
Sample the 200 households closest to the city council offices. # Convenience sample. No, because these households would all come from the same district rather than from the larger area.
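The first four schemes above can be sketched in R. This is a rough illustration under stated assumptions, not part of the assignment: the sampling frame (30 neighborhoods of 40 households each) and every object name are hypothetical, the "20 households" in the stratified and multistage schemes are assumed to be drawn at random within each neighborhood, and the last scheme is left out because it would need household locations.

```r
set.seed(1)  # only so the sketch is reproducible

# Hypothetical sampling frame: 30 neighborhoods with 40 households each.
frame <- data.frame(
  household    = 1:1200,
  neighborhood = rep(sprintf("N%02d", 1:30), each = 40)
)

# Simple random sample: 50 households from the whole city.
srs <- frame[sample(nrow(frame), 50), ]

# Stratified sample: 20 random households from every neighborhood.
by_nbhd <- split(frame, frame$neighborhood)
stratified <- do.call(rbind, lapply(by_nbhd, function(d) d[sample(nrow(d), 20), ]))

# Cluster sample: 10 random neighborhoods, every household in them.
chosen <- sample(unique(frame$neighborhood), 10)
cluster <- frame[frame$neighborhood %in% chosen, ]

# Multistage sample: the same 10 neighborhoods, then 20 random households in each.
multistage <- do.call(rbind, lapply(by_nbhd[chosen], function(d) d[sample(nrow(d), 20), ]))

# Number of distinct neighborhoods each design reaches.
sapply(list(srs = srs, stratified = stratified, cluster = cluster, multistage = multistage),
       function(d) length(unique(d$neighborhood)))
```

Comparing the neighborhood counts shows the trade-off noted in the answers: the stratified design reaches every neighborhood by construction, while the cluster and multistage designs reach only the 10 that were drawn.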