`letters` is a built-in vector in R that contains the lowercase English alphabet.
letters vector.
```{r}
# The 9th letter
letters[9]
# The 9th, 11th, and 19th letters
letters[c(9, 11, 19)]
# Every letter except the 25th and 26th
letters[-c(25, 26)]
```
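As a quick check on how square-bracket indexing behaves, the sketch below stores each result in a variable so it can be inspected. The variable names (`ninth`, `picked`, `without_last_two`) are chosen here for illustration only.

```r
# A single positive index returns the element at that position.
ninth <- letters[9]                      # "i"

# A vector of positive indices returns those elements, in the order given.
picked <- letters[c(9, 11, 19)]          # "i" "k" "s"

# Negative indices drop the listed positions and keep everything else.
without_last_two <- letters[-c(25, 26)]  # "a" through "x"

length(without_last_two)                 # 24
tail(without_last_two, 1)                # "x"
```

Positive and negative indices cannot be mixed in one call, so dropping elements and picking elements are separate steps.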
2. This dataset provides birth rates and related data across the 50 states and DC from 2016 to 2021. The data was sourced from the Centers for Disease Control and Prevention (CDC) and includes detailed information such as number of births, gender, birth weight, state, and year of the delivery. A particular emphasis is given to detailed information on the mother's educational level. With this dataset, one can, for example, examine trends and patterns in birth rates across different academic groups and geographic locations.
This code reads the dataset.
a. Use the str function to inspect the data. Comment on the number of rows and columns in the data frame.
b. Use the head function to output the first 4 rows of data.
c. You can look at just the race column by typing `birth.data$Race`. Find the number of births of each race using the table function: `table(birth.data$Race)`.
d. Assign the proportion of the data that comes from Black Americans to the variable `prop.black`. To do this, divide the "Black" count from the race table you created in part c by the number of rows, `nrow(birth.data)`. Note that `length(birth.data)` returns the number of columns of a data frame, not the number of rows.
e. Repeat the process in parts c and d to find the range of years in this dataset and the proportion of observations from 1989.
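Parts d and e both come down to dividing a count by the number of rows. The sketch below works through that on a small data frame; `toy` and all of its values are hypothetical, invented for illustration, and are not taken from `birth.data`.

```r
# Hypothetical toy data frame standing in for birth.data (values are made up).
toy <- data.frame(
  Race = c("Black", "White", "Black", "Asian", "White", "White"),
  Year = c(1988, 1989, 1989, 1990, 1989, 1988)
)

# For a data frame, length() counts columns and nrow() counts rows.
length(toy)  # 2
nrow(toy)    # 6

# Count observations per category, then divide one count by the row total.
race_counts <- table(toy$Race)
prop_black_toy <- as.numeric(race_counts["Black"]) / nrow(toy)  # 2 / 6

# Smallest and largest year in the data.
year_range <- range(toy$Year)            # 1988 1990

# The mean of a logical vector is the proportion of TRUE values.
prop_1989_toy <- mean(toy$Year == 1989)  # 3 / 6 = 0.5
```

Using `mean()` on a logical comparison gives the same answer as `sum(...) / nrow(...)` as long as the column has no missing values.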
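```{r}
# a. Inspect the structure: number of rows (observations) and columns (variables)
str(birth.data)

# b. First 4 rows
head(birth.data, 4)

# c. Number of observations for each race
table(birth.data$Race)

# d. Proportion of observations from Black Americans
black_births <- table(birth.data$Race)["Black"]
prop.black <- black_births / nrow(birth.data)
prop.black

# e. Range of years and proportion of observations from 1989
range(birth.data$Year)
births_1989 <- sum(birth.data$Year == 1989)
prop_1989 <- births_1989 / nrow(birth.data)
prop_1989
```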
See https://openintro.info/stat/labs.php?stat_lab_software=R for more information.
The Behavioral Risk Factor Surveillance System (BRFSS) is an annual telephone survey of 350,000 people in the United States. As its name implies, the BRFSS is designed to identify risk factors in the adult population and report emerging health trends. For example, respondents are asked about their diet and weekly physical activity, their HIV/AIDS status, possible tobacco use, and even their level of healthcare coverage.
The BRFSS Web site (http://www.cdc.gov/brfss) contains a complete description of the survey.
We will focus on a random sample of 20,000 people from the BRFSS survey conducted in 2000. While there are over 200 variables in this data set, we will work with a small subset.
The source command provides the code for entering the data.
View the names of the variables and the structure of the data. See https://openintro.info/stat/data/?data=cdc for more information on the data.
```{r}
str(cdc)
head(cdc, 6)
mean.weight <- mean(cdc$weight, na.rm = TRUE)
mean.weight
```
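To see what each of these inspection calls returns without loading the survey data, here is a minimal sketch on a hypothetical data frame. `toy_cdc`, its values, and every column other than `weight` are assumptions made for illustration. It also shows why `na.rm = TRUE` matters when a column has missing values.

```r
# Hypothetical stand-in for cdc (made-up values; one weight is missing).
toy_cdc <- data.frame(
  height = c(70, 64, 60, 66),
  weight = c(175, NA, 105, 132),
  gender = c("m", "f", "f", "f")
)

# Variable names, overall structure, and the first rows.
var_names <- names(toy_cdc)
str(toy_cdc)
head(toy_cdc, 2)

# With a missing value present, mean() returns NA by default.
mean_with_na <- mean(toy_cdc$weight)

# na.rm = TRUE drops missing values first: (175 + 105 + 132) / 3
mean_weight_toy <- mean(toy_cdc$weight, na.rm = TRUE)
mean_weight_toy
```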