Welcome to the PSYC3361 coding W1 self test. The test assesses your ability to use the coding skills covered in the Week 1 online coding modules.

In particular, it assesses your ability to…

  • choose packages/functions
  • read in data
  • group_by and summarise
  • make notes using RMarkdown
  • insert pictures in an Rmd document
  • write data to csv

It is IMPORTANT to document the code that you write so that someone who is looking at your code can understand what it is doing. Above each chunk, write a few sentences outlining which packages/functions you have chosen to use and what the function is doing to your data. Where relevant, also write a sentence that interprets the output of your code.

Your notes should also document the troubleshooting process you went through to arrive at the code that worked.

For each of the challenges below, the documentation is JUST AS IMPORTANT as the code.

Good luck!!

Jenny

1. customise your Rmd document by adding your name as the author, adding a table of contents, and choosing a theme that you like.

2. load the packages you will need

# load package
library(tidyverse)

3. read the birthweight data

# checking files in main folder
print(list.files())
##  [1] "data"                      "extra_ggplot_practice.Rmd"
##  [3] "images"                    "project.Rproj"            
##  [5] "w1 self test.Rmd"          "W1 stuff"                 
##  [7] "w1-self-test.html"         "w1-self-test.Rmd"         
##  [9] "w2 self test.Rmd"          "w3 self test.Rmd"
# checking the files in the "data" folder to find the correct file path
print(list.files("data"))
## [1] "alone.csv"                    "birthweight_data.csv"        
## [3] "dino.csv"                     "ozbabynames.csv"             
## [5] "summary_birthweight_data.csv"
# read the birthweight data using its path relative to the project folder
frames <- read_csv(file = "data/birthweight_data.csv")

# viewing the data
print(frames)
## # A tibble: 788 × 5
##    true_ID birthweight gestation_age_w child_ethn               plurality
##      <dbl>       <dbl> <chr>           <chr>                    <chr>    
##  1    3100        3030 39              Middle-Eastern           singleton
##  2    3101        3710 40              Caucasian                singleton
##  3    3102        3770 42              African/African-American singleton
##  4    3103        3660 38              Caucasian                singleton
##  5    3104        3800 39              Caucasian                singleton
##  6    3105        3540 41              Caucasian                singleton
##  7    3106        3400 37              South-East Asian         singleton
##  8    3107        3650 39              Middle-Eastern           singleton
##  9    3108        3460 39              South-East Asian         singleton
## 10    3109        3380 39              South-East Asian         singleton
## # ℹ 778 more rows

4. calculate the mean birthweight separately for twins and singletons

# calculate the mean birthweight for each plurality group using the pipe operator
mean_pularity <- frames %>%
  group_by(plurality) %>%
  summarise(
    m = mean(birthweight)
  ) %>%
  ungroup()

5. identify the earliest (i.e. the minimum value) gestational age for each ethnicity group

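# find the minimum gestational age for each ethnicity group using the pipe operator
# (gestation_age_w was read in as character, so min() compares the values as text)
frames %>%
  group_by(child_ethn) %>%
  summarise(
    min = min(gestation_age_w)
  ) %>%
  ungroup()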
## # A tibble: 10 × 2
##    child_ethn                        min  
##    <chr>                             <chr>
##  1 Aboriginal/Torres Strait Islander 33   
##  2 African/African-American          26   
##  3 Caucasian                         26   
##  4 East Asian                        33   
##  5 Hispanic/Latino                   37   
##  6 Middle-Eastern                    28   
##  7 Missing                           36   
##  8 Polynesian/Melanesian             28   
##  9 South Asian                       28   
## 10 South-East Asian                  29
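The `<chr>` under `min` in the output shows that the minimum was calculated on text, not on numbers. Text is compared character by character, so a one-digit value such as "9" would sort after "10". The sketch below uses made-up values (the `toy` tibble and its column names are invented for illustration and are not from the birthweight file) to show the difference and one possible fix, converting the column with `as.numeric()` before summarising. If the real column contains non-numeric entries, `as.numeric()` would turn them into `NA` with a warning, so they would need to be handled, for example with `na.rm = TRUE`.

```r
library(tidyverse)

# made-up values for illustration only; stored as text, like gestation_age_w
toy <- tibble(
  grp = c("x", "x", "y"),
  weeks_chr = c("9", "10", "38")
)

toy_min <- toy %>%
  # convert the text column to numbers before taking the minimum
  mutate(weeks_num = as.numeric(weeks_chr)) %>%
  group_by(grp) %>%
  summarise(
    min_as_text = min(weeks_chr),    # compares strings
    min_as_number = min(weeks_num)   # compares numbers
  )

print(toy_min)
```

For group "x", the text minimum is "10" while the numeric minimum is 9, which is why converting first is the safer choice.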

6. write some notes about how group_by and summarise work with the pipe below, including a link to documentation or a blog post that you think is useful

The tidyverse package is loaded first because it provides the group_by and summarise functions, which help researchers organise their data and present the important results. group_by splits the data into groups based on the values of a variable, and summarise then calculates a summary statistic (such as a mean or a minimum) for each group, returning one row per group. The pipe (%>%) passes the result of one step on to the next, so the functions are carried out in stages and the code can be read from top to bottom in the order it runs, which makes it clearer.
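To make these notes concrete, here is a minimal sketch on made-up data (the `toy` tibble and its values are invented for illustration). Each line of the pipeline takes the result of the line before it.

```r
library(tidyverse)

# made-up data for illustration only (not the birthweight file)
toy <- tibble(
  group = c("a", "a", "b", "b"),
  value = c(1, 3, 10, 20)
)

toy_summary <- toy %>%
  group_by(group) %>%             # one group per distinct value of `group`
  summarise(m = mean(value)) %>%  # collapse each group to a single row
  ungroup()                       # drop any remaining grouping

print(toy_summary)
```

The result has one row per group: the mean is 2 for "a" and 15 for "b".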

7. download a picture of a baby from the internet and insert it into your document below

baby_img

8. write the summary of mean birthweight by twins/singletons that you made in step 4 above to a new csv file

# write the summary to a csv file; `file` replaces the `path` argument,
# which is deprecated as of readr 1.4.0
write_csv(mean_pularity, file = "data/summary_birthweight_data.csv")
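One way to check that a summary was written correctly is to read the file back in. The sketch below is self-contained: it writes a made-up summary table (the values and the "twin" label in `toy_summary` are invented for illustration) to a temporary file using the `file` argument, then reads it back with `read_csv()`.

```r
library(tidyverse)

# made-up summary table standing in for the real one
toy_summary <- tibble(
  plurality = c("singleton", "twin"),
  m = c(3500, 2500)
)

# write to a temporary file so nothing in the project folder is overwritten
out_file <- tempfile(fileext = ".csv")
write_csv(toy_summary, file = out_file)

# read the file back in to confirm the contents
check <- read_csv(out_file, show_col_types = FALSE)
print(check)
```

For the real data, the same check would use the path "data/summary_birthweight_data.csv" in place of the temporary file.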

9. Knit your document and publish the output to RPubs