List the datasets in dplyr.
data(package='dplyr')
Load the built-in dataset starwars and use
glimpse() to see an overview.
data(starwars)
glimpse(starwars)
## Rows: 87
## Columns: 14
## $ name <chr> "Luke Skywalker", "C-3PO", "R2-D2", "Darth Vader", "Leia Or…
## $ height <int> 172, 167, 96, 202, 150, 178, 165, 97, 183, 182, 188, 180, 2…
## $ mass <dbl> 77.0, 75.0, 32.0, 136.0, 49.0, 120.0, 75.0, 32.0, 84.0, 77.…
## $ hair_color <chr> "blond", NA, NA, "none", "brown", "brown, grey", "brown", N…
## $ skin_color <chr> "fair", "gold", "white, blue", "white", "light", "light", "…
## $ eye_color <chr> "blue", "yellow", "red", "yellow", "brown", "blue", "blue",…
## $ birth_year <dbl> 19.0, 112.0, 33.0, 41.9, 19.0, 52.0, 47.0, NA, 24.0, 57.0, …
## $ sex <chr> "male", "none", "none", "male", "female", "male", "female",…
## $ gender <chr> "masculine", "masculine", "masculine", "masculine", "femini…
## $ homeworld <chr> "Tatooine", "Tatooine", "Naboo", "Tatooine", "Alderaan", "T…
## $ species <chr> "Human", "Droid", "Droid", "Human", "Human", "Human", "Huma…
## $ films <list> <"A New Hope", "The Empire Strikes Back", "Return of the J…
## $ vehicles <list> <"Snowspeeder", "Imperial Speeder Bike">, <>, <>, <>, "Imp…
## $ starships <list> <"X-wing", "Imperial shuttle">, <>, <>, "TIE Advanced x1",…
Convert the built-in base R mtcars dataset to a tibble
(you will need to find the function for this; it isn’t in the chapter),
and store it in the object mt.
mt <- tibble::as_tibble(mtcars)
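One detail worth checking after the conversion: mtcars keeps the car models as row names, and tibbles do not carry row names. The sketch below assumes the tibble package is installed; the column name "model" is only an illustrative choice, not part of the exercise.

```r
library(tibble)

# Default conversion: the row names (car models) are dropped
mt <- as_tibble(mtcars)

# Optional: keep the row names in a new column
# ("model" is a hypothetical column name chosen for this sketch)
mt_named <- as_tibble(mtcars, rownames = "model")

ncol(mt)        # number of columns without the model names
ncol(mt_named)  # one extra column holding the former row names
head(mt_named$model)
```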
Download the zip file and unzip it into a “data” folder that is a subfolder of your working directory (e.g., a folder called “4.2” or something like that).
Read “disgust_scores.csv” into a table.
disgust <- readr::read_csv("data/disgust_scores.csv")
How many rows and columns are in the disgust
dataset?
disgust_rows <- nrow(disgust)
disgust_cols <- ncol(disgust)
In the space provided directly below, write down what type of variable disgust_rows and disgust_cols are. You should be able to tell this just by looking in the environment at the values.
disgust_rows and disgust_cols are both _______ variables.
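If the environment pane is not available, the type can also be checked in code. This is a minimal sketch that uses the built-in mtcars data frame as a stand-in, on the assumption that the disgust file may not be downloaded yet; the same calls work on any data frame or tibble.

```r
# mtcars is used here only as a stand-in for the disgust table
dims <- dim(mtcars)      # rows and columns in a single call
n_rows <- nrow(mtcars)
n_cols <- ncol(mtcars)

dims
typeof(n_rows)           # storage type of the row count
class(n_cols)            # class of the column count
```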
Create a tibble with the columns name, age,
and country of origin for 2 people you know.
people <- tibble::tibble(
name = c("Alex", "Jamie"),
age = c(22, 24),
country = c("Canada", "India")
)
Export this data table in your “data” folder as a CSV and an RDS file. When you save the RDS file, use “gz” compression to reduce file size.
readr::write_csv(people, "data/people.csv")
saveRDS(people, "data/people.rds", compress = "gzip")
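A quick way to confirm that both exports worked is to read the files back in. The sketch below assumes the tibble and readr packages are installed, and it writes to a temporary folder instead of the real “data” folder so that it can be run anywhere; the path names are illustrative only.

```r
library(tibble)
library(readr)

people <- tibble(
  name = c("Alex", "Jamie"),
  age = c(22, 24),
  country = c("Canada", "India")
)

# A temporary folder stands in for the "data" folder (assumption for this sketch)
out_dir <- tempfile("data")
dir.create(out_dir)
csv_path <- file.path(out_dir, "people.csv")
rds_path <- file.path(out_dir, "people.rds")

write_csv(people, csv_path)
saveRDS(people, rds_path, compress = "gzip")

# Read both files back in
from_csv <- read_csv(csv_path, show_col_types = FALSE)
from_rds <- readRDS(rds_path)

# CSV stores plain text, so column types are guessed again on import;
# RDS stores the R object itself, types included
all.equal(from_csv$age, people$age)
identical(from_rds, people)
```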
Set the following objects to the number 1 with the indicated data type:
one_int (integer)
one_dbl (double)
one_chr (character)
one_int <- 1L
one_dbl <- 1
one_chr <- "1"
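The three objects print almost the same, so it can help to ask R directly what each one is. A minimal check with typeof():

```r
one_int <- 1L
one_dbl <- 1
one_chr <- "1"

typeof(one_int)  # "integer"
typeof(one_dbl)  # "double"
typeof(one_chr)  # "character"

# The integer and double versions compare as equal in value,
# but they are not identical objects because the types differ
one_int == one_dbl           # TRUE
identical(one_int, one_dbl)  # FALSE
```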
Create a vector of the numbers 3, 6, and 9.
threes <- c(3, 6, 9)
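Typing the values into c() is the most direct answer. As a sketch of two alternatives that scale better to longer sequences (the names threes_seq and threes_mult are made up for this comparison):

```r
threes <- c(3, 6, 9)

# Generate the same values as a sequence from 3 to 9 in steps of 3
threes_seq <- seq(from = 3, to = 9, by = 3)

# Or multiply the integers 1 to 3 by 3 (a vectorised operation)
threes_mult <- 1:3 * 3

all(threes == threes_seq)
all(threes == threes_mult)
```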
The built-in vector letters contains the letters of the
English alphabet. Use an indexing vector of integers to extract the
letters that spell ‘cat’.
cat <- letters[c(3, 1, 20)]
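Counting through the alphabet by hand is error-prone. One way to look the positions up instead is match(), which returns the position of each value in a vector. In this sketch, idx is just an illustrative name.

```r
# Find where each letter sits in the built-in letters vector
idx <- match(c("c", "a", "t"), letters)
idx            # 3 1 20

# Use those integers as the indexing vector
letters[idx]   # "c" "a" "t"
```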
The function colors() returns all of the color names
that R is aware of. What is the length of the vector returned by this
function? (Use code to find the answer.)
color_length <- length(colors())
Create a named list called random_list that lists the
objects “cat”, “threes”, and “color_length” (i.e., the three objects you
just saved above).
random_list <- list(
cat = cat,
threes = threes,
color_length = color_length
)
random_list
## $cat
## [1] "c" "a" "t"
##
## $threes
## [1] 3 6 9
##
## $color_length
## [1] 657
Run the code below and consider what is happening, comparing it with the output shown above when you run random_list to display it.
Then write a new line of code that prints only the letter “a” from within random_list.
random_list$cat[2]
## [1] "a"
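The $ operator is one of several ways to reach into a list. The sketch below rebuilds random_list so that it runs on its own, then compares single brackets, double brackets, and position-based access; sub_list and element are illustrative names.

```r
random_list <- list(
  cat = c("c", "a", "t"),
  threes = c(3, 6, 9),
  color_length = length(colors())
)

# Single brackets keep the list wrapper: the result is a list of length 1
sub_list <- random_list["cat"]
class(sub_list)          # "list"

# Double brackets (like $) pull out the element itself
element <- random_list[["cat"]]
class(element)           # "character"

# Either way of extracting the element can be followed by a vector index
random_list[["cat"]][2]  # "a"
random_list[[1]][2]      # "a", using the element's position in the list
```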
The code below defines a matrix as a field of “O”s with a single “X” (X marks the spot!). Write a single line of code to extract that “X” from the field of “O”s.
given_mat <- matrix(c(rep("O", times = 39), "X", rep("O", times = 9)),
nrow = 7, ncol = 7)
given_mat[given_mat == "X"]
## [1] "X"
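Logical indexing finds the “X” without saying where it is. To see its location, which() with arr.ind = TRUE reports the row and column, which can then be used for ordinary [row, column] indexing. A minimal sketch (pos is an illustrative name):

```r
given_mat <- matrix(c(rep("O", times = 39), "X", rep("O", times = 9)),
                    nrow = 7, ncol = 7)

# Matrices are filled column by column, so the 40th value lands in row 5, column 6
pos <- which(given_mat == "X", arr.ind = TRUE)
pos

# Extract by explicit row and column
given_mat[pos[1, "row"], pos[1, "col"]]
given_mat[5, 6]
```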
Set the object x to a vector containing the integers 1
to 100 (increasing by 1).
Use vectorised operations to define y as x
squared. Use plot(x, y) to visualise the relationship
between these two numbers.
x <- 1:100
y <- x^2
plot(x, y)
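To see what the vectorised x^2 saves you, the sketch below computes the same values with an explicit loop and checks that the two results agree. The name y_loop is made up for this comparison.

```r
x <- 1:100

# Vectorised: the operation is applied to every element at once
y <- x^2

# Equivalent loop, one element at a time
y_loop <- numeric(length(x))
for (i in seq_along(x)) {
  y_loop[i] <- x[i]^2
}

all(y == y_loop)
```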
The function call runif(n, min, max) will draw
n numbers from a uniform distribution from min
to max. If you set n to 10000,
min to 0 and max to 1, this simulates the
p-values that you would get from 10000 experiments where the null
hypothesis is true. Create the following objects:
pvals: 10000 simulated p-values using runif()
is_sig: a logical vector that is TRUE if the corresponding element of pvals is less than .05, FALSE otherwise
sig_vals: a vector of just the significant p-values
prop_sig: the proportion of those p-values that were significant
set.seed(8675309) # ensures you get the same random numbers each time you run this code chunk
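pvals <- runif(10000, min = 0, max = 1) # 10000 simulated p-values under the null
is_sig <- pvals < .05 # TRUE where the p-value is below .05
sig_vals <- pvals[is_sig] # keep only the significant p-values
prop_sig <- mean(is_sig) # proportion of TRUE values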
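The logical-vector steps in this exercise are easier to follow on a vector small enough to check by eye. The sketch below uses four hand-picked values (not simulated data) and illustrative names p, below, and kept.

```r
# Four made-up p-values, small enough to verify by hand
p <- c(0.01, 0.20, 0.04, 0.50)

# Comparison is vectorised: one TRUE/FALSE per element
below <- p < .05
below          # TRUE FALSE TRUE FALSE

# A logical vector used as an index keeps only the TRUE positions
kept <- p[below]
kept           # 0.01 0.04

# TRUE counts as 1 and FALSE as 0, so the mean is the proportion of TRUEs
mean(below)    # 0.5
```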