Objectives

By the end of this assignment, you should:

This assignment is due Thursday, January 30 at noon. Please submit both your .html and .Rmd files to Canvas. Your .Rmd file should knit without errors before you turn in the assignment.

We’ll do the first few exercises in lab. They concern a dataset called babynames, which is included in the “babynames” package.
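A minimal setup sketch, assuming the tidyverse and babynames packages are already installed on your machine (the install step in the comment is an assumption about your setup, not part of the assignment template):

```r
# Assumes both packages are installed, e.g. via
# install.packages(c("tidyverse", "babynames")).
library(tidyverse)   # provides select(), filter(), arrange(), and %>%
library(babynames)   # provides the babynames data frame

# Look at the column names and types before starting the exercises.
glimpse(babynames)
```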


  1. Alter the code to select just the n column:
select(babynames, name, prop)


  1. Use the logical operators to manipulate the code below to show: [a] All of the names where prop is greater than or equal to 0.08, [b] All of the children named “Sea”, [c] All of the names that have a missing value for n (Hint: this should return an empty data set).
filter(babynames, name == "Garrett")
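As a reminder of how logical tests behave, here is a small sketch on a made-up vector (x is hypothetical and unrelated to babynames). It uses only base R, so it should run without loading any packages:

```r
# Made-up vector for illustration only.
x <- c(5, NA, 12, 8)

x >= 8     # comparison, applied element by element
x == 5     # equality is tested with ==, not =
is.na(x)   # tests for missing values; x == NA would return NA for every element
```

Each line returns a logical vector with one value per element of x, which is the kind of input filter() expects.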


  1. Arrange babynames by n. Add prop as a second (tie-breaking) variable to arrange on. What is the smallest value of n?


  1. Use %>% to write a sequence of functions that: (1) filters babynames to just the girls that were born in 2015, (2) selects the name and n columns, and (3) arranges the results so that the most popular names are near the top.



The next few exercises will focus on data from the Lewis & Frank (2018) replication of the Xu and Tenenbaum (2007) experiment, which we talked about in lecture. We’ll be working with data from the first experiment only. For reference, the journal paper write-up of this study can be found here, and you can see the actual experiment that participants saw here.

The data are in a file called lewis_2018_exp1.csv. We can start by loading the data with the read_csv() function and saving it to a variable called lf_data:

lf_data <- read_csv("data/lewis_2018_exp1.csv")

This data frame is tidy, meaning each column is a variable and each row is an observation. In this case, each observation is a unique participant and trial combination. There are six variables in the data and each variable is described below. The first six rows of the data frame are also displayed below.

exp  subids  trial_num  category    condition            proportion_basic_level_responses
1    1       9          vehicles    three_subordinate    0
1    2       9          animals     three_basic          1
1    3       9          animals     three_superordinate  1
1    4       9          vehicles    three_superordinate  1
1    5       9          animals     three_superordinate  1
1    6       9          vegetables  three_subordinate    0
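A six-row preview like the one above is what head() returns by default. The sketch below shows this on a made-up data frame (toy is hypothetical, not the assignment data), using base R only:

```r
# Made-up data frame with ten rows, for illustration only.
toy <- data.frame(id = 1:10, value = (1:10)^2)

head(toy)      # first six rows (the default)
head(toy, 3)   # first three rows
```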


  1. Select the columns subids, category, proportion_basic_level_responses from the data. Print the first six rows of this data frame.


  1. Print the first six rows of a data frame that does NOT include the category column.

Note: the template for the remaining exercises is blank, and so you will need to add R chunks where appropriate.


  1. Use logical tests and Boolean operators to return only the trials (rows): [a] with category as vegetables, [b] with category as animals and a trial number less than 7, [c] with category as vegetables or animals, [d] with at least one basic level response in the “one” condition.


  1. The following code selects all trials (rows) where the condition was either “three_subordinate” or “one”. Rewrite this code in a way that uses the %in% operator.
filter(lf_data, condition == "three_subordinate" | condition == "one")
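If you have not used %in% before, the sketch below shows what it does on a made-up character vector (fruit is hypothetical and unrelated to lf_data). It uses base R only:

```r
# Made-up values for illustration only.
fruit <- c("apple", "pear", "kiwi", "apple")

# %in% asks, element by element, whether each value on the left
# appears anywhere in the vector on the right.
fruit %in% c("apple", "kiwi")

# The same logical vector, written as == comparisons joined with |
fruit == "apple" | fruit == "kiwi"
```

Both expressions return one TRUE or FALSE per element of fruit, so either form can be used as a condition inside filter().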


  1. How many trials are there where the category is either vegetables or animals? Use nrow().


  1. The three following sets of commands are written without the pipe operator (%>%). Rewrite each one to include the pipe.

[a]

var1 <- mutate(lf_data, category)

[b]

var1 <- select(lf_data, category)
var2 <- nrow(var1)

[c]

var1 <- filter(lf_data, trial_num == 1)
var2 <- filter(var1, category == "animals")
var3 <- select(var2, trial_num, category)
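As a reminder of how the pipe relates to unpiped code, here is a sketch on a made-up data frame (toy and its columns are hypothetical, not the assignment data). It assumes the dplyr package is installed:

```r
library(dplyr)  # assumed installed; provides filter(), arrange(), desc(), and %>%

# Made-up data for illustration only.
toy <- data.frame(id = 1:4, score = c(10, 25, 7, 25))

# Without the pipe: each step is saved to an intermediate variable.
step1 <- filter(toy, score > 8)
no_pipe <- arrange(step1, desc(id))

# With the pipe: x %>% f(y) is evaluated as f(x, y), so the result of
# each step becomes the first argument of the next function.
with_pipe <- toy %>%
  filter(score > 8) %>%
  arrange(desc(id))

# Check that both versions give the same result.
identical(no_pipe, with_pipe)
```

The piped version avoids naming intermediate results and reads top to bottom in the order the steps are applied.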


  1. The two following sets of commands are written with the pipe operator. Rewrite each one to exclude the pipe.

[a]

lf_data %>%
  filter(trial_num < 6) %>%
  nrow()

[b]

lf_data %>%
  select(subids, category, proportion_basic_level_responses) %>%
  filter(subids == 1) %>%
  arrange(category)


  1. Look at the code below. Describe in full sentences what this code does.
lf_data %>%
  select(subids, category, condition) %>%
  filter(category == "vehicles" & condition != "one") %>%
  arrange(-subids)


  1. On the first day of class, we talked about the “Sally Anne Task” that measures children’s understanding of theory of mind (example videos). Describe four variables that you could measure in this task to assess children’s theory of mind performance. Specifically, describe (1) one qualitative variable, (2) one quantitative - binary variable, (3) one quantitative - numeric variable, and (4) one quantitative - real variable. For each variable, give a one-sentence description of the variable AND one example value of that variable with units.


  1. Consider the following claim: “The scientific process is a social endeavor.” To what extent is this statement true or not true? What are the implications of your response for research methods in psychological science? Please respond with a short paragraph.
