The `babynames` data set has the number of babies assigned each first name (with at least 10 babies per year). The columns in the data are:

`sex`
: The sex of the child assigned at birth (either ‘F’ or ‘M’)

`year`
: The year (1910 - 2023)

`name`
: The first name assigned

`count`
: The number of babies assigned the name for that year and sex combination

`prop`
: The proportion of babies assigned the name for the year and sex combination

Create a line graph just for the name ‘Taylor’ with 2 lines: one for female babies and one for male babies. Make sure the line graph displays the proportion, not the count. See Brightspace for how the graph should look.
Around what year was ‘Taylor’ more likely to be given to a female baby than a male baby?
Create a data set named `haileys` that represents female babies given the names Hailey, Hayley, Haley, and Haylee since 1975. Make sure to arrange the rows by year in descending order, then by name.
Make sure to display the data frame in the knitted document.
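As a rough illustration of the filter-then-arrange pattern described above, here is a minimal sketch on a small made-up data frame. The object names `toy` and `haileys_sketch` and all of the values are hypothetical stand-ins, not the real `babynames` data, and the sketch assumes "since 1975" includes 1975 itself.

```r
library(dplyr)

# Hypothetical toy data standing in for babynames (values are invented)
toy <- tibble(
  sex   = c("F", "F", "M", "F", "F"),
  year  = c(1970, 1980, 1980, 1990, 1990),
  name  = c("Haley", "Hayley", "Haley", "Hailey", "Haley"),
  count = c(12, 30, 11, 250, 400)
)

haileys_sketch <- toy %>%
  filter(
    sex == "F",                                          # female babies only
    name %in% c("Hailey", "Hayley", "Haley", "Haylee"),  # the four spellings
    year >= 1975                                         # assumption: 1975 is included
  ) %>%
  arrange(desc(year), name)   # newest year first, then alphabetical by name

haileys_sketch
```

Printing the object on its own line at the end of a chunk is one way to have it appear in the knitted document.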
If done correctly, the graph below should match what is seen in Brightspace.
Update the code below by giving `data = ...` a data frame with one row per name, corresponding to the year when that name was the most popular.
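One possible way to keep a single "peak" row per group is `slice_max()` after `group_by()`. The sketch below uses a made-up data frame (`toy` and `peak_years` are hypothetical names) and assumes "most popular" is measured by `prop`; using `count` instead would be an equally reasonable reading.

```r
library(dplyr)

# Hypothetical toy data (values are invented for illustration)
toy <- tibble(
  name = c("Haley", "Haley", "Haley", "Hailey", "Hailey"),
  year = c(1980, 1990, 2000, 1990, 2000),
  prop = c(0.001, 0.004, 0.003, 0.002, 0.005)
)

peak_years <- toy %>%
  group_by(name) %>%
  # keep the one row with the largest prop within each name;
  # with_ties = FALSE guarantees exactly one row per name
  slice_max(order_by = prop, n = 1, with_ties = FALSE) %>%
  ungroup()

peak_years
```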
For this question, you’ll be creating a data frame for how common each letter is.
Add a column to `babynames` called `letter` that has the first letter of each name AND change ‘F’ to ‘Female’ and ‘M’ to ‘Male’ for `sex`. Call the resulting data frame `babynames2`.
To get a subset of a string, use `str_sub(string, start, end)`. For example, if you wanted to keep the 4th - 6th letters of the word ‘bananas’, it would be `str_sub('bananas', 4, 6)`.
Keep all six columns, but only display the `sex`, `name`, and `letter` columns in the knitted document.
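To make the `str_sub()` hint concrete, here is a small sketch that runs the ‘bananas’ example and then applies the same idea inside `mutate()`. The data frame `toy` and the result `toy2` are hypothetical and contain invented rows. `if_else()` is just one way to recode the two values, and it assumes `sex` only ever contains ‘F’ or ‘M’.

```r
library(dplyr)
library(stringr)

# The example from the text: characters 4 through 6 of "bananas"
str_sub("bananas", 4, 6)   # "ana"

# Hypothetical toy data standing in for babynames
toy <- tibble(
  sex  = c("F", "M", "F"),
  name = c("Taylor", "Taylor", "Hailey")
)

toy2 <- toy %>%
  mutate(
    letter = str_sub(name, 1, 1),                 # first letter of each name
    sex    = if_else(sex == "F", "Female", "Male") # assumes only "F" and "M" occur
  )

# Show only some columns without dropping any from toy2 itself
toy2 %>% select(sex, name, letter)
```

Because `select()` is applied only when printing, `toy2` keeps every column.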
Create a data frame that has how common each `letter` is per `year` by `sex`. Name it `baby_letters`. It should have 4 columns:
Display the results in the knitted document.
If done correctly, the code chunk below should make the graph seen in Brightspace