{ {;;;} {*;;;}
Complete all Exercises, and submit answers to
Questions on the Coursera
platform.
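- `statsr`: for data files and functions used in this course
- `dplyr`: for data wrangling
- `ggplot2`: for data visualization

These packages are installed with commands like `install.packages` and `install_github`.
Load them in your working environment with the `library` function. Note that you only need to
install packages once, but you need to load them each time you start a new R session.

```{r load-packages, message = FALSE}
library(dplyr)
library(ggplot2)
library(statsr)
```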
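```{r load-abrbuthnot-data}
data(arbuthnot)
```

This loads a data set called `arbuthnot` that has 82 observations.

```{r view-data}
arbuthnot
```

You can also find `arbuthnot` in the Environment pane (upper right window), which lists
the objects in your workspace.

```{r dim-data}
dim(arbuthnot)
```

The output is `[1] 82 3`, indicating that there are 82 rows and 3 columns (we'll get to
what the `[1]` means in a bit), just as it says next to `arbuthnot` in the Environment pane.

```{r names-data}
names(arbuthnot)
```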
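To make the `dim` and `names` calls concrete without relying on the `statsr` data, here is a minimal sketch on a small made-up data frame. The name `toy` and all of its values are hypothetical and are not taken from `arbuthnot`.

```r
# Hypothetical toy data frame; the values are made up for illustration
# and are NOT from the arbuthnot data.
toy <- data.frame(
  year  = c(2001, 2002, 2003, 2004),
  boys  = c(10, 12, 11, 13),
  girls = c(9, 11, 12, 12)
)

# dim() returns a vector: number of rows first, then number of columns
toy_dim <- dim(toy)
print(toy_dim)    # prints: [1] 4 3

# names() returns the column (variable) names as a character vector
toy_names <- names(toy)
print(toy_names)
```

Both functions take the data frame itself as their only argument, which is the same pattern as `dim(arbuthnot)` and `names(arbuthnot)`.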
Exercise: What years are included in this dataset?
Hint: Take a look at the year
variable in the Data Viewer to answer this question.
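The data frame has the variables `year`, `boys`, and `girls`. At this point, you might
notice that many of the commands in R look a lot like functions: the `dim` and `names`
commands, for example, were each called with a single argument, the name of the data frame.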
Tip: If you use the up and down arrow keys, you can scroll through your
previous commands, your so-called command history. You can also access it
by clicking on the history tab in the upper right panel. This will save
you a lot of typing in the future.
```{r view-boys}
arbuthnot$boys
```
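The `$` operator pulls a single column out of a data frame and returns it as a vector. A minimal sketch on a made-up data frame follows; the name `toy` and its values are hypothetical and are not from `arbuthnot`.

```r
# Hypothetical toy data frame; values are made up for illustration.
toy <- data.frame(
  year = c(2001, 2002, 2003),
  boys = c(10, 12, 11)
)

# $ extracts one column by name and returns it as a plain vector
boys_vec <- toy$boys
print(boys_vec)            # prints: [1] 10 12 11

# [[ ]] with a quoted column name is an equivalent base R form
boys_vec2 <- toy[["boys"]]

# The result is a vector, so ordinary vector functions apply to it
n_years <- length(boys_vec)
```

Note that the data frame name comes before the `$` and the column name after it; a bare column name on its own is not something R can look up.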
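2. Which command extracts just the counts of girls born?
<ol>
<li> `arbuthnot$boys` </li>
<li> `arbuthnot$girls` </li>
<li> `girls` </li>
<li> `arbuthnot[girls]` </li>
<li> `$girls` </li>
</ol>

```{r extract-counts-of-girls-born}
# type your code for Question 2 here, and Knit

```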
as its argument.
</div>

```{r years-in-present-data}
# type your code for Exercise here, and Knit

```

5. Calculate the total number of births for each year and store these values in a new
variable called `total` in the `present` dataset. Then, calculate the proportion of
boys born each year and store these values in a new variable called `prop_boys` in
the same dataset. Plot these values over time and, based on the plot, determine if the
following statement is true or false: The proportion of boys born in the US has
decreased over time.
<ol>
<li> True </li>
<li> False </li>
</ol>

```{r prop-boys-over-time}
# type your code for Question 5 here, and Knit

```


6. Create a new variable called `more_boys` which contains the value of either `TRUE`
if that year had more boys than girls, or `FALSE` if that year did not. Based on this
variable, which of the following statements is true?
<ol>
<li> Every year there are more girls born than boys. </li>
<li> Every year there are more boys born than girls. </li>
<li> Half of the years there are more boys born, and the other half more girls born. </li>
</ol>

```{r more-boys-per-year}
# type your code for Question 6 here, and Knit

```


7. Calculate the boy-to-girl ratio each year, and store these values in a new variable called `prop_boy_girl` in the `present` dataset. Plot these values over time. Which of the following best describes the trend?
<ol>
<li> There appears to be no trend in the boy-to-girl ratio from 1940 to 2013. </li>
<li> There is initially an increase in boy-to-girl ratio, which peaks around 1960. After 1960 there is a decrease in the boy-to-girl ratio, but the number begins to increase in the mid 1970s. </li>
<li> There is initially a decrease in the boy-to-girl ratio, and then an increase between 1960 and 1970, followed by a decrease. </li>
<li> The boy-to-girl ratio has increased over time. </li>
<li> There is an initial decrease in the boy-to-girl ratio, but this number appears to level off around 1960 and remain constant since then. </li>
</ol>

```{r prop-boy-girl-over-time}
# type your code for Question 7 here, and Knit

```

8. In what year did we see the largest total number of births in the U.S.? *Hint:* Sort
your dataset in descending order based on the `total` column. You can do this
interactively in the data viewer by clicking on the arrows next to the variable
names. Or arrange the data in descending order with a new function: `desc` (for
descending order).
<ol>
<li> 1940 </li>
<li> 1957 </li>
<li> 1961 </li>
<li> 1991 </li>
<li> 2007 </li>
</ol>

```{r most-total-births}
# type your code for Question 8 here
# sample code is provided below, edit as necessary, uncomment, and then Knit
#present %>%
#  mutate(total = ?) %>%
#  arrange(desc(total))
```

## Resources for learning R and working in RStudio

That was a short introduction to R and RStudio, but we will provide you with more
functions and a more complete sense of the language as the course progresses. You
might find the following tips and resources helpful.

- In this course we will be using the `dplyr` (for data wrangling) and `ggplot2` (for data visualization) packages.
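
As a rough sketch of how `dplyr` verbs chain together with the pipe, the example below adds a column with `mutate` and then sorts with `arrange(desc(...))`. The data frame `toy`, its column names, and its values are hypothetical and unrelated to the lab's datasets; the sketch assumes the `dplyr` package is installed.

```r
library(dplyr)

# Hypothetical toy data; values are made up for illustration.
toy <- data.frame(
  item  = c("a", "b", "c"),
  count = c(3, 5, 2),
  stringsAsFactors = FALSE
)

toy_sorted <- toy %>%
  mutate(share = count / sum(count)) %>%  # new column built from an existing one
  arrange(desc(count))                    # largest count first

print(toy_sorted)
```

Each step receives the data frame produced by the previous one, so the pipeline reads top to bottom in the order the operations happen.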
This is a derivative of an OpenIntro lab, and is
released under an Attribution-NonCommercial-ShareAlike
3.0 United States license.
}