# BST02: Using R for Statistics in Medical Research

What is this Course About

Statistics has flourished in recent years, mainly because computers make complex analyses possible
- Many statistical software packages exist for both simple and specialized analyses

The programming language R is popular among data scientists
- Analysts must learn not only how to use the software but also the ideas behind it
- Learning statistical modelling and algorithms is more important than learning a programming language

The most valuable tool of a modern quantitative researcher is his/her personal computer

Planning

General introduction to R

What does R look like ?

R itself is the underlying program of RStudio. Its own interface is quite basic.

What is R ?

  • R is a software environment for statistical computing and graphics
    • extensive catalog of statistical and graphical methods
  • R is mainly used in academia, but many large companies also use it, both in the healthcare industry and elsewhere (e.g., Uber, Google, Airbnb, Facebook)
  • Unlike SPSS, R is purely command-driven
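
Because R is command-driven, an analysis is simply a sequence of typed commands. The following is a minimal sketch using the `sleep` data set that ships with R; the choice of data set and of a two-sample t-test is only for illustration.

```r
# 'sleep' is a built-in data set: extra hours of sleep under two drugs
head(sleep)

# Descriptive statistics for the outcome
summary(sleep$extra)

# Compare the two groups with a (Welch) two-sample t-test
result <- t.test(extra ~ group, data = sleep)
print(result)

# A simple graphic of the same comparison
boxplot(extra ~ group, data = sleep,
        xlab = "Drug group", ylab = "Extra hours of sleep")
```

Every step is stored as text, so the whole analysis can be saved in a script and rerun later.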

A brief history of R

1993: R is created at the University of Auckland, New Zealand, by Ross Ihaka and Robert Gentleman
1997: R Core Team formed (20 members)
2000: R 1.0.0 released
2004: First international user conference in Vienna
2013: 5026 packages available
2017: 10875 packages available
Now: run nrow(available.packages()) in R to get the current count

Why learn R ?

  • R is a free software environment for statistical computing and graphics
  • It compiles and runs on Linux, Windows and macOS
  • Open source language
  • Users are allowed to modify and redistribute the code
  • Advanced statistical language
  • Supports extensions
  • Related to other languages
  • Flexible and fun!

Where do I get R ?

http://cran.r-project.org
- choose your platform, e.g., Windows, Linux
- e.g., for Windows: Windows → base → Download R 3.6.2 for Windows
- Install . . .

How does R work ?

  • Packages built for specific tasks
  • Download R packages from the CRAN web site, from within R, via the menu:
    • Packages
    • Install package(s) . . .
    • Make your choice(s)
  • Load the package using library() (note: installing a package does not load it)
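
The difference between installing and loading can be sketched as follows. The package name `survival` in the commented-out install line is only an example; the demonstration itself uses `splines`, which ships with every R installation, so it runs without internet access.

```r
# Install once per R installation (needs internet access).
# "survival" is only an example name; any CRAN package works the same way.
# install.packages("survival")

# 'splines' is installed together with R, but it is not loaded at start-up
attached_before <- "package:splines" %in% search()
print(attached_before)

# Loading attaches the package to the search path for this session only
library(splines)
attached_after <- "package:splines" %in% search()
print(attached_after)
```

After restarting R the package is still installed, but `library()` has to be called again before its functions can be used.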

How to get help in R ?
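
A few built-in ways to look up documentation from the R console, as a minimal sketch (the function `mean` and the search patterns are only examples):

```r
# Open the help page of a function whose name you know
help("mean")        # shorthand at the console: ?mean

# List objects whose names match a pattern
hits <- apropos("^read\\.csv")
print(hits)

# Run the examples given at the bottom of a help page
example("mean")

# Search the installed help pages by keyword (can be slow the first time):
# help.search("regression")    # shorthand: ??regression

# Show the index of help pages of a whole package:
# help(package = "stats")
```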

Disadvantages of R

  • Appears intimidating to the first-time user
  • Output is not very polished by default (but there are some alternatives)
  • Exporting output is more difficult
  • Cannot easily handle very large data sets (the limit depends on the installed RAM)
  • A lot of things are available but it is sometimes hard to find your way
  • The quality of the available packages varies greatly
  • Has been criticized for using only one CPU core at a time (but the parallel package lets you run tasks on several cores)
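
As a rough illustration of the last point, the `parallel` package that ships with R can spread independent tasks over several worker processes. This is a minimal sketch; the number of workers (two) and the toy task (squaring numbers) are arbitrary choices for illustration.

```r
library(parallel)

# Number of CPU cores R can detect on this machine
print(detectCores())

# Start two worker processes (an arbitrary choice for this sketch)
cl <- makeCluster(2)

# Apply a function to each element, with the work divided over the workers
squares <- parLapply(cl, 1:4, function(x) x^2)

# Always shut the workers down when finished
stopCluster(cl)

print(unlist(squares))
```

For a task this small the overhead of starting workers outweighs any gain; the approach pays off only when each task takes a noticeable amount of time.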

## Summary

- R is a great tool to explore and investigate the data
- Several statistical methods can be performed with R
- It is important to understand the methods before applying them in R

How to use: R uses packages that perform specific tasks
- Install package only once
- Load package every time you open R