Welcome to the PSYC3361 coding W1 self test. The test assesses your ability to use the coding skills covered in the Week 1 online coding modules.

In particular, it assesses your ability to…

  • choose packages/functions
  • read in data
  • group_by and summarise
  • make notes using RMarkdown
  • insert pictures in an Rmd document
  • write data to csv

It is IMPORTANT to document the code that you write so that someone who is looking at your code can understand what it is doing. Above each chunk, write a few sentences outlining which packages/functions you have chosen to use and what the function is doing to your data. Where relevant, also write a sentence that interprets the output of your code.

Your notes should also document the troubleshooting process you went through to arrive at the code that worked.

For each of the challenges below, the documentation is JUST AS IMPORTANT as the code.

Good luck!!

Jenny

1. customise your Rmd document by adding your name as the author and a table of contents, and choosing a theme that you like.

A floating table of contents will stay on the screen when you scroll. The yeti theme changes the font and colour of the html output.

2. load the packages you will need

The tidyverse package will be used. It needs to be installed once with install.packages("tidyverse") and then loaded from the library at the start of each session.

library(tidyverse)

This will load the tidyverse package into the current R session, so that functions such as read_csv, group_by and summarise are available.
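The install-then-load step described above can be sketched as follows. This is only one possible pattern, assuming a CRAN mirror is already configured for the case where installation is needed.

```r
# Install tidyverse only if it is not already available on this machine.
if (!requireNamespace("tidyverse", quietly = TRUE)) {
  install.packages("tidyverse")
}

# Load it for the current session (needed every time R is restarted).
library(tidyverse)
```

Installation happens once per machine, whereas library() has to be run in every new session or at the top of every Rmd document.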

3. read the birthweight data

I will use the read_csv function to open the data file.

frames <- read_csv(file = "data/birthweight_data.csv")

The data frame then appears in the environment pane, where it can be opened and viewed as a spreadsheet, and the console output shows how R has read each column from the file.
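As a minimal sketch of checking what read_csv has read, the example below writes a tiny made-up CSV to a temporary file and reads it back. The rows are invented for illustration and are not the real birthweight data; toy_path and toy are hypothetical names.

```r
library(tidyverse)

# Made-up rows for illustration only; not the real birthweight data.
toy_path <- tempfile(fileext = ".csv")
writeLines(
  c("plurality,birthweight",
    "singleton,3300",
    "twin,2400",
    "singleton,3100"),
  toy_path
)

# show_col_types = FALSE silences the column specification message.
toy <- read_csv(file = toy_path, show_col_types = FALSE)

glimpse(toy)  # one line per column: name, type and first values
spec(toy)     # the column types that read_csv guessed
```

Checking the column types at this point is useful because a column that looks numeric may have been read as text.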

4. calculate the mean birthweight separately for twins and singletons

Using the pipe, we can group the data by plurality (twins and singletons) and then summarise each group by its mean birthweight.

frames %>% group_by(plurality) %>% summarise(mean_weight = mean(birthweight)) %>% ungroup()
## # A tibble: 2 × 2
##   plurality mean_weight
##   <chr>           <dbl>
## 1 singleton       3248.
## 2 twin            2311.

The mean weight of a singleton baby is 3248.103 grams whilst the mean weight of a twin baby is 2310.682 grams.

5. identify the earliest (i.e. the minimum value) gestational age for each ethnicity group

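Using the pipe, the data can be grouped by ethnicity and then summarised with min(), i.e. the smallest value, to find the lowest gestational age in each group. Note that min_gesage in the output below is a character column (<chr>), so min() is comparing the values as text. This matches the numeric minimum only if every value has the same number of digits.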

frames %>% group_by(child_ethn) %>% summarise(min_gesage = min(gestation_age_w)) %>% ungroup()
## # A tibble: 10 × 2
##    child_ethn                        min_gesage
##    <chr>                             <chr>     
##  1 Aboriginal/Torres Strait Islander 33        
##  2 African/African-American          26        
##  3 Caucasian                         26        
##  4 East Asian                        33        
##  5 Hispanic/Latino                   37        
##  6 Middle-Eastern                    28        
##  7 Missing                           36        
##  8 Polynesian/Melanesian             28        
##  9 South Asian                       28        
## 10 South-East Asian                  29
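Because min_gesage is shown as <chr>, it may be worth checking how min() behaves on text. The sketch below uses made-up values (the group labels "A" and "B" and the ages are invented, and min_check is a hypothetical name) to compare the text minimum with the numeric minimum.

```r
library(tidyverse)

# Made-up values for illustration only; stored as text on purpose.
toy <- tibble(
  child_ethn = c("A", "A", "B", "B"),
  gestation_age_w = c("9", "26", "38", "40")
)

min_check <- toy %>%
  group_by(child_ethn) %>%
  summarise(
    min_as_text = min(gestation_age_w),               # compares strings
    min_as_number = min(as.numeric(gestation_age_w))  # compares numbers
  ) %>%
  ungroup()

min_check
```

For group A the two columns disagree, because "26" sorts before "9" as text. If the real column contains non-numeric entries, as.numeric() would turn them into NA with a warning, and min() would then need na.rm = TRUE.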

6. write some notes about how group_by and summarise work with the pipe below, including a link to documentation or a blog post that you think is useful

group_by does not remove any rows. It defines the group/s of interest using a grouping variable, e.g. ethnicity, so that later functions work within each group. summarise then collapses the individual data entries in each group into one summary row, so we may want to compute the mean of a group.

The pipe passes the result of each step along as the input to the next step. This allows chains of operations like those above and reduces the use of nested functions, which can be hard to read.
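To illustrate the point about nesting, here is a small sketch that runs the same summary both ways on made-up data. The values are invented, and nested and piped are hypothetical names.

```r
library(tidyverse)

# Made-up rows for illustration only.
toy <- tibble(
  plurality = c("singleton", "singleton", "twin", "twin"),
  birthweight = c(3000, 3400, 2200, 2400)
)

# Nested version: has to be read from the inside out.
nested <- summarise(group_by(toy, plurality), mean_weight = mean(birthweight))

# Piped version: reads top to bottom, one step per line.
piped <- toy %>%
  group_by(plurality) %>%
  summarise(mean_weight = mean(birthweight))

# Check whether the two versions give the same result.
all.equal(nested, piped)
```

Both versions do the same work. The piped version simply lists the steps in the order they happen.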

This blog/github work is quite helpful as it breaks these topics into subsets and also has additional exercises.

7. download a picture of a baby from the internet and insert it into your document below

There are so many AI generated photos of babies, hopefully this is a real one!

8. write the summary of mean birthweight by twins/singletons that you made in step 4 above to a new csv file

I will save the summarised data as a new object and use write_csv to write it to a new file. NB: the “path” argument of write_csv has been deprecated but still works. R shows a notice to use “file” instead.

weight_summary <- frames %>% group_by(plurality) %>% summarise(mean_weight = mean(birthweight)) %>% ungroup()

write_csv(weight_summary, file = "mean_weight_data_summary.csv")
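One way to confirm the file was written as expected is to read it back in. The sketch below does this with a made-up summary and a temporary file, so it does not touch the project folder. toy_summary, out_path and check are hypothetical names.

```r
library(tidyverse)

# Made-up summary for illustration only.
toy_summary <- tibble(
  plurality = c("singleton", "twin"),
  mean_weight = c(3200, 2300)
)

# Temporary file, so nothing is written to the project folder.
out_path <- tempfile(fileext = ".csv")
write_csv(toy_summary, file = out_path)

# Read the file back to confirm it holds the same rows and columns.
check <- read_csv(file = out_path, show_col_types = FALSE)
check
```

In the real document the same check could be run on "mean_weight_data_summary.csv" after the write_csv call.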

9. Knit your document and publish the output to RPubs