Welcome to the PSYC3361 coding W1 self test. The test assesses your ability to use the coding skills covered in the Week 1 online coding modules.
In particular, it assesses your ability to…
It is IMPORTANT to document the code that you write so that someone who is looking at your code can understand what it is doing. Above each chunk, write a few sentences outlining which packages/functions you have chosen to use and what the function is doing to your data. Where relevant, also write a sentence that interprets the output of your code.
Your notes should also document the troubleshooting process you went through to arrive at the code that worked.
For each of the challenges below, the documentation is JUST AS IMPORTANT as the code.
Good luck!!
Jenny
For this exercise, I am loading the tidyverse package and the here package. Tidyverse contains functions to read in the data (read_csv) and to create grouped summaries (group_by and summarise). The here package makes it easy to tell R where the data is when you are reading it in.
library(tidyverse)
library(here)
The data is in .csv format so I am going to use the read_csv() function. This call tells R to find the data “here” within the data folder and to make a new object called babies.
babies <- read_csv(here("data", "Sample_BirthWeight_GestAge.csv"))
Here I am using group_by and summarise to calculate the mean birthweight separately for twins and singletons. It is good practice to use ungroup() in case you want to pipe more operations onto the end later.
babies %>%
  group_by(plurality) %>%
  summarise(mean_bw = mean(birthweight)) %>%
  ungroup()
## # A tibble: 2 × 2
## plurality mean_bw
## <chr> <dbl>
## 1 singleton 3248.
## 2 twin 2311.
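Here I am using group_by and summarise to identify the minimum gestational age (gestation_age_w) in each ethnicity group, using the min() function. Again, remembering to ungroup(). Note that the output shows min_ga as <chr>, so this column was read in as text and min() is comparing strings rather than numbers.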
babies %>%
  group_by(child_ethn) %>%
  summarise(min_ga = min(gestation_age_w)) %>%
  ungroup()
## # A tibble: 10 × 2
## child_ethn min_ga
## <chr> <chr>
## 1 Aboriginal/Torres Strait Islander 33
## 2 African/African-American 26
## 3 Caucasian 26
## 4 East Asian 33
## 5 Hispanic/Latino 37
## 6 Middle-Eastern 28
## 7 Missing 36
## 8 Polynesian/Melanesian 28
## 9 South Asian 28
## 10 South-East Asian 29
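The <chr> under min_ga in the output above suggests that gestation_age_w is stored as text. The sketch below uses a small made-up tibble (toy, with invented values chosen only to show the effect) to illustrate how min() behaves on text and one possible fix, converting with as.numeric() first. It assumes every value in the column looks like a number; any entry that does not would become NA with a warning.

```r
library(dplyr)

# Hypothetical toy data: gestational age stored as text, as <chr> implies.
# The values are invented purely to make the string-vs-number difference visible.
toy <- tibble(
  child_ethn = c("A", "A", "B", "B"),
  gestation_age_w = c("9", "38", "40", "36")
)

# min() on character values compares them as strings, so "38" sorts before "9"
as_text <- toy %>%
  group_by(child_ethn) %>%
  summarise(min_ga = min(gestation_age_w)) %>%
  ungroup()

# Converting to numeric before summarising gives the numeric minimum
as_number <- toy %>%
  mutate(gestation_age_w = as.numeric(gestation_age_w)) %>%
  group_by(child_ethn) %>%
  summarise(min_ga = min(gestation_age_w)) %>%
  ungroup()

print(as_text)
print(as_number)
```

In the real data it would be worth checking why the column was read as text (for example, a non-numeric entry somewhere) before converting it.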
The pipe allows you to string together a number of code operations into a sequence of actions that you can do with your data. For example, it is useful to produce descriptive summaries separately for each group in your data set. By taking the dataframe, piping it to group_by, then piping it again to summarise, we can easily calculate means separately for each group. The Tidyverse documentation has useful examples.
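To make the point about the pipe concrete, here is a minimal sketch comparing the same calculation written with nested function calls and with the pipe. The tibble toy and its values are made up for illustration and simply stand in for the babies data.

```r
library(dplyr)

# Hypothetical toy data standing in for the babies data frame
toy <- tibble(
  plurality = c("singleton", "singleton", "twin", "twin"),
  birthweight = c(3000, 3500, 2200, 2400)
)

# Without the pipe: the calls are nested and have to be read from the inside out
nested <- ungroup(summarise(group_by(toy, plurality), mean_bw = mean(birthweight)))

# With the pipe: the same steps read from top to bottom
piped <- toy %>%
  group_by(plurality) %>%
  summarise(mean_bw = mean(birthweight)) %>%
  ungroup()

print(piped)
```

Both versions should give the same grouped means; the piped version is just easier to read and to extend with further steps.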
Putting images in your Rmd file is going to be useful when you want to insert screenshots into your verification report. You can create a folder within your project called “images” and put the image files in that folder. Then you can refer to the location of the image file with the notation images/baby.jpeg. Put the path within round brackets, i.e. (images/baby.jpeg), and put an exclamation point and a pair of square brackets in front of it: ![](images/baby.jpeg)
Here I am adding another pipe operation onto the bottom of the mean birthweight calculation to write the data to a new csv file that I could open in a different program.
babies %>%
  group_by(plurality) %>%
  summarise(mean_bw = mean(birthweight)) %>%
  ungroup() %>%
  write_csv("bw_by_plurality.csv")
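One way to check that the file was written as expected is to read it straight back in. The sketch below does this with a made-up tibble (toy) and a temporary file (out_file), both of which are assumptions for illustration rather than part of the exercise data.

```r
library(dplyr)
library(readr)

# Hypothetical toy data standing in for the babies data frame
toy <- tibble(
  plurality = c("singleton", "singleton", "twin", "twin"),
  birthweight = c(3000, 3500, 2200, 2400)
)

# Write to a temporary file so the sketch does not clutter the project folder
out_file <- tempfile(fileext = ".csv")

# write_csv() returns its input invisibly, which is why it can sit at the end of a pipe
toy %>%
  group_by(plurality) %>%
  summarise(mean_bw = mean(birthweight)) %>%
  ungroup() %>%
  write_csv(out_file)

# Read the file back in to confirm what was saved
check <- read_csv(out_file, show_col_types = FALSE)
print(check)
```

In a real project you would swap the temporary file for a path inside the project, for instance one built with here().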