Objectives
By the end of this assignment, you should:
summarize, mutate, group_by)
This assignment is due Thursday, February 6 at noon. Please turn in your .html AND .Rmd files on Canvas. Your .Rmd file should knit without errors before you turn in the assignment.
The first few exercises again concern the babynames data. The dataset is included in the “babynames” package.
The final exercises concern the object complexity norms for the Lewis and Frank (2016) dataset that we have talked about in class. In the experiment, we asked participants to rate a random sample of 10 (out of 60) objects in terms of their conceptual complexity. They gave their responses using a slider scale where 0 = simple, and 1 = complex (and all values in between were possible). We collected data from 2 samples of participants. The data are in the data frame located at data/lewis_object_complexity_norms_by_participant.csv. Each variable is described below.
For each object, calculate the mean, min, and max complexity rating. Use geom_bar(stat = "identity") to plot the mean complexity for each object. Then, use geom_point to plot the minimum complexity in red and the maximum complexity in blue. Sort the bars by the mean complexity rating from lowest to highest (see class slides for an example of this). Change the labels on the plot to make it maximally readable (xlab, ylab, ggtitle, and any other changes you think might help).
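As a rough illustration of the pattern described above, here is a minimal sketch on made-up data. The data frame toy_ratings and the column name rating are assumptions for illustration only; the real column names are in the variable descriptions for the assignment data.

```r
library(dplyr)
library(ggplot2)

# Toy data, NOT the assignment data. `rating` is a hypothetical column name.
toy_ratings <- data.frame(
  object_id = c("a", "a", "b", "b", "c", "c"),
  rating = c(0.2, 0.4, 0.7, 0.9, 0.1, 0.3)
)

# One row per object with the three summary statistics.
toy_summary <- toy_ratings %>%
  group_by(object_id) %>%
  summarize(
    mean_rating = mean(rating),
    min_rating = min(rating),
    max_rating = max(rating)
  )

# reorder() sorts the x-axis categories by mean_rating, lowest to highest.
toy_plot <- ggplot(toy_summary, aes(x = reorder(object_id, mean_rating))) +
  geom_bar(aes(y = mean_rating), stat = "identity") +
  geom_point(aes(y = min_rating), color = "red") +
  geom_point(aes(y = max_rating), color = "blue") +
  xlab("Object") +
  ylab("Complexity rating (0 = simple, 1 = complex)") +
  ggtitle("Mean, minimum, and maximum rating per object")

toy_plot
```

Wrapping the x variable in reorder() is one common way to sort bars; the class slides may show a different approach that works equally well.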
Create a new variable called range that is the difference between the maximum and minimum complexity rating for each object.
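A minimal sketch of adding a derived column with mutate(), again on made-up data. The names toy_summary, min_rating, and max_rating are hypothetical; substitute whatever you called your own summary columns.

```r
library(dplyr)

# Hypothetical per-object summary; names are illustrative only.
toy_summary <- data.frame(
  object_id = c("a", "b", "c"),
  min_rating = c(0.2, 0.7, 0.1),
  max_rating = c(0.4, 0.9, 0.3)
)

# mutate() adds a column computed row by row from existing columns.
toy_summary <- toy_summary %>%
  mutate(range = max_rating - min_rating)

toy_summary
```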
For each object, for each sample, calculate the mean complexity rating and save it to a data frame called participant_sample_mean. Print the first 5 rows of your data frame. (Note: you can print a prettier table by wrapping the data frame you want to output in the kable() function.)
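Grouping by two variables works the same way as grouping by one. The sketch below uses made-up data; the column name rating and the sample labels "s1" and "s2" are assumptions, so check the variable descriptions for the real ones.

```r
library(dplyr)
library(knitr)

# Toy data, NOT the assignment data. `rating` and the sample labels are hypothetical.
toy_ratings <- data.frame(
  object_id = c("a", "a", "a", "a", "b", "b", "b", "b"),
  sample = c("s1", "s1", "s2", "s2", "s1", "s1", "s2", "s2"),
  rating = c(0.2, 0.4, 0.3, 0.5, 0.7, 0.9, 0.6, 0.8)
)

# One row per object-by-sample combination.
toy_sample_mean <- toy_ratings %>%
  group_by(object_id, sample) %>%
  summarize(mean_complexity = mean(rating), .groups = "drop")

# head(x, 5) keeps the first 5 rows; kable() formats them as a table when knitting.
kable(head(toy_sample_mean, 5))
```

Setting .groups = "drop" returns an ungrouped data frame, which avoids surprises in later steps; it is optional.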
Create a scatter plot with sample one on the x-axis and sample two on the y-axis. Add a line to fit the data.
This is a little tricky. The current data frame is tidy (each observation is its own row), but to make a plot that puts sample 1 and sample 2 on different axes, we need sample one and sample two to be their own columns. In other words, we need a data frame that has three columns: object_id, mean_complexity_sample_1, mean_complexity_sample_2. To transform the current data frame into the one we need, we can use the function pivot_wider(). The code below does this for you.
participant_sample_mean_wide <- participant_sample_mean %>%
  pivot_wider(names_from = "sample", values_from = "mean_complexity")
Print the first 5 rows of participant_sample_mean_wide. Then plot the data with a scatter plot.
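To see what pivot_wider() does before applying it to the real data, here is a small sketch on made-up values. The sample labels "s1" and "s2" are assumptions; in your data the new column names will be whatever values the sample column actually contains. Using geom_smooth(method = "lm") for the fitted line is also an assumption about what kind of line is wanted.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Toy long-format data: one row per object per sample (labels are hypothetical).
toy_long <- data.frame(
  object_id = c("a", "a", "b", "b", "c", "c"),
  sample = c("s1", "s2", "s1", "s2", "s1", "s2"),
  mean_complexity = c(0.30, 0.40, 0.80, 0.70, 0.20, 0.25)
)

# Each distinct value of `sample` becomes its own column.
toy_wide <- toy_long %>%
  pivot_wider(names_from = "sample", values_from = "mean_complexity")

head(toy_wide, 5)

# One point per object, with a straight-line fit and no confidence band.
toy_scatter <- ggplot(toy_wide, aes(x = s1, y = s2)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  xlab("Mean complexity, sample 1") +
  ylab("Mean complexity, sample 2")

toy_scatter
```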