R Bootcamp (2017 EPI/FACETS Summer Program)

Kazuki Yoshida (ScD Candidate in Epi/Biostat)
6/13-14/2017

Overview

Languages of data analysis

You will encounter several “languages” in data analysis. We will focus on R.

  • R
  • SAS
  • Stata
  • Python
  • Julia

Why program in data analysis?

Point-and-click analysis software exists, so why do we bother to code?

  • Reproducible research
  • Code keeps a better record of what you did than point-and-click does.
  • Beyond occasional casual use, programming offers more opportunities to make your workflow efficient.

What are R and RStudio?

  • R is the programming language itself.
  • R/R.app/R.exe is the “interpreter” that translates what you write in R into instructions for your machine.
  • RStudio is an “integrated development environment (IDE)”.
  • RStudio provides a nice coding environment and makes interaction with the R interpreter easier.
  • RStudio can also edit various static and dynamic documents, e.g., presentations (this one!), R Markdown, LaTeX, and web apps (Shiny).

Where to find more resources

The following sessions are largely based on Wickham & Grolemund's R for Data Science, which is available as a free website or a physical book. This book covers the current state of the “tidyverse” very nicely.

Some additional notable resources are: