Links to CRAN (the Comprehensive R Archive Network) for downloads
Manuals, FAQs, and contributed documentation
Source code and news about new versions of R
This site is the “home base” for R itself and is separate from RStudio/Posit (which provides the IDE).
2 How to Download R and RStudio, and Check Your Versions
2.1 Installing R and RStudio
R is the programming language and engine.
RStudio (by Posit) is a popular Integrated Development Environment (IDE) for R.
An IDE is a software environment that integrates coding, running, and debugging tools in one place, making programming easier and more efficient.
R and RStudio
Install R first, then RStudio.
Download R (CRAN; The Comprehensive R Archive Network)
Download the installer for your operating system and run it
Open RStudio after installing (R should already be installed)
2.2 Checking Your Versions
R version
# To run the code you have 3 options:
# (1) highlight the line and press Run,
# (2) place your cursor at the end of the line and type Cmd/Ctrl + Enter, or
# (3) to run the entire chunk, click on the green button to the right.
R.version.string
[1] "R version 4.4.3 (2025-02-28)"
2.2.1 RStudio version
# If running inside RStudio, this will show RStudio version:
if (exists("RStudio.Version")) paste("RStudio version:", RStudio.Version()$version)
Explanation:
RStudio.Version is a function provided by RStudio. When it is run inside RStudio, it returns a list of metadata that includes an element named “version”. The $ operator extracts just that element from the list.
paste prefixes the version with the label “RStudio version:” so that it prints nicely.
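The same $ extraction works on any named list, including outside RStudio. As a minimal sketch using only base R, the built-in R.version object is a list describing the running R installation; the variable names r_major, r_full, and is_r4_or_newer are illustrative choices, not part of the original tutorial.

```r
# R.version is a built-in list describing the running R installation
r_major <- R.version$major           # major version as text, e.g. "4"
r_full  <- R.version$version.string  # same text as R.version.string

# getRversion() returns a version object that can be compared directly
is_r4_or_newer <- getRversion() >= "4.0.0"

paste("R major version:", r_major)
is_r4_or_newer
```

Comparing with getRversion() is usually safer than comparing version strings as text, because it compares each numeric component in turn.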
3 What Is a CRAN Mirror and How Do You Set It?
CRAN (The Comprehensive R Archive Network) hosts R itself and thousands of R packages.
Because CRAN is mirrored (copied) around the world, you typically choose a CRAN mirror close to you geographically so downloads are faster and more reliable.
You can set your CRAN mirror interactively:
# This opens a menu in R to choose a CRAN mirror:
# chooseCRANmirror() # uncomment and run once in a session
Or you can set it programmatically in your script:
# Example: set CRAN mirror via options() for this session
options(repos = c(CRAN = "https://cran.rstudio.com/"))
# You can check current repos with:
getOption("repos")
CRAN
"https://cran.rstudio.com/"
For reproducible research, it is often helpful to explicitly set the repository in your script or within your RStudio Project options.
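One way to make that explicit setting safe to rerun is to set the repository only when none has been chosen yet. The sketch below assumes the mirror URL used earlier in this section; the variable names repos and cran_unset are illustrative.

```r
# Set a CRAN repository only if none has been chosen yet.
# "@CRAN@" is the placeholder R uses before a mirror is selected.
repos <- getOption("repos")
cran_unset <- is.null(repos) ||
  !("CRAN" %in% names(repos)) ||
  identical(unname(repos["CRAN"]), "@CRAN@")

if (cran_unset) {
  repos["CRAN"] <- "https://cran.rstudio.com/"
  options(repos = repos)
}

# Confirm which repository will be used for installs in this session
getOption("repos")["CRAN"]
```

Placing a block like this at the top of a script leaves an existing user or project setting untouched while still guaranteeing that install.packages() has a repository to use.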
4 R Is Open Source – Anyone Can Create an R Function
R is an open-source language, which means:
The source code is publicly available.
Anyone can write and share R functions and packages.
Most R packages are maintained by individuals or teams and distributed through CRAN, Bioconductor, GitHub, or other platforms.
You can define your own functions easily:
# A simple custom function
add_two <- function(x) {
  x + 2
}
add_two(5)
[1] 7
# Another example: compute a z-score
z_score <- function(x) {
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}
z_score(c(1, 2, 3, 4, 5))
This is the same mechanism that package authors use; their functions are simply organized and distributed as packages.
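A quick way to build confidence in a custom function is to check a property it should satisfy. As a sketch, the z-score function is repeated here so the block runs on its own, and the input vector values (including the missing value) is made up for illustration.

```r
# Same z-score function as above, repeated so this block is self-contained
z_score <- function(x) {
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

# Illustrative input with one missing value
values <- c(1, 2, 3, 4, 5, NA)
z <- z_score(values)
z

# Standardized values should have mean 0 and standard deviation 1
# (up to floating-point rounding); the NA stays NA
mean(z, na.rm = TRUE)
sd(z, na.rm = TRUE)
```

Because na.rm = TRUE is used inside the function, the missing value does not affect the mean or standard deviation, but it is still returned as NA in the result.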
5 Exploring the RStudio Environment (Panes and Toolbars)
Once you have installed R and RStudio, open RStudio. By default, you will see four main panes:
Source (top-left)
Your editor for .R scripts, .Rmd, .qmd files
Run code lines or chunks into the Console
Console (bottom-left)
Where R commands execute
Shows outputs, errors, warnings
Environment/History (top-right)
Environment: data frames and objects currently in memory
History: previously run commands
Files/Plots/Packages/Help/Viewer (bottom-right)
Files: browse project files
Plots: view figures
Packages: installed packages and load/unload controls
Help: documentation for functions and packages
Viewer: renders HTML content (e.g., Quarto documents)
Across the top toolbar you will find buttons for:
Running code chunks or single lines
Saving files
Creating new scripts and Quarto / R Markdown files
Knitting (for .Rmd) or Rendering (for .qmd)
Managing Projects and version control (Git)
5.1 Quick Demonstrations
# Create a few objects (watch the Environment pane update)
nums <- rnorm(10)
df <- data.frame(id = 1:5, value = c(10, 20, 15, 30, NA))
# Use Help pane: open documentation
help("mean") # or ?mean
RStudio window
# Show a basic plot (appears in Plots tab)
plot(nums, type = "b", main = "Demo plot", xlab = "Index", ylab = "Value")
6 Setting a Working Directory and Using R Projects
R needs to know where your files live. This is your working directory.
A very efficient workflow is to:
Create a work folder for your project.
Create an R Project inside that folder.
Keep all your data, code, and documents in that folder.
6.1 Creating an .Rproj and Setting the Working Directory
Create a .Rproj file, name it, and save it in your new work folder:
File → New Project → Existing Directory (or “New Directory” to create a new folder)
To create and save your R Script file:
File → New File → R Script
File → Save (choose .R extension)
For R Markdown: File → New File → R Markdown, then Save as .Rmd
Once you are in your R Project, the working directory will automatically be the project folder.
You can change the working directory manually (less recommended than using Projects):
# Example only; adjust to your own path:
# setwd("/path/to/your/project/folder")
Using R Projects is more robust than calling setwd() in every script, and it keeps your projects self-contained.
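With a Project open, file paths can be written relative to the project folder instead of as full paths. A minimal sketch, assuming a data subfolder and a file named tidy_example.csv (adjust both to your own project):

```r
# Where is R currently looking for files?
project_dir <- getwd()
project_dir

# Build a path relative to the working directory. file.path() inserts
# the separator, so the same code works on Windows, macOS, and Linux.
data_file <- file.path("data", "tidy_example.csv")  # assumed file name

# Check before reading so a missing file gives a clear message
if (file.exists(data_file)) {
  message("Found: ", normalizePath(data_file))
} else {
  message("Not found relative to ", project_dir, ": ", data_file)
}
```

Relative paths like this keep working when the project folder is moved or shared, which is the main practical advantage over hard-coding a path in setwd().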
7 Installing and Loading Libraries
R’s functionality is extended through packages (libraries). You typically:
Install a package once (per machine or environment).
Load it in each session where you want to use it.
7.1 Installing Packages
7.1.1 Install a single package
# Example install (run once; set eval: false to avoid automatic install)
# install.packages("dplyr")
# install.packages("ggplot2")
7.1.2 Install multiple packages at once
# Example install of multiple packages (run once; set eval: false to avoid automatic install)
# install.packages(c("tidyverse", "readr", "dplyr", "ggplot2", "readxl", "data.table"))
7.2 Loading Packages
7.2.1 Load a single package
# Load dplyr package
library(dplyr)
7.2.2 Load multiple packages safely
# Load if available; fall back gracefully if not
loaded_pkgs <- c()
for (pkg in c("dplyr", "ggplot2")) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    library(pkg, character.only = TRUE)
    loaded_pkgs <- c(loaded_pkgs, pkg)
  }
}
# This code creates an empty vector to track successfully loaded packages.
# loops through the list of package names.
# checks whether each package is installed.
# loads the package if it is installed.
# character.only = TRUE tells library that the variable contains the package name as text.
# records the packages that were loaded. (if it is not installed, it isn't loaded)
loaded_pkgs
[1] "dplyr" "ggplot2"
Notes:
Use install.packages("packagename") once per machine or project.
Use library(packagename) in each session/script where needed.
For reproducibility, consider project environments such as renv.
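To follow the "install once, load each session" pattern without reinstalling packages unnecessarily, a script can first report which packages are missing. This is a sketch in base R; the vector needed is illustrative, and "notARealPackage123" is a made-up name used only to show what a missing package looks like.

```r
# Packages a script needs; the last name is deliberately fictitious
needed <- c("stats", "utils", "notARealPackage123")

# requireNamespace() returns TRUE/FALSE without attaching the package
is_installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)

# Keep only the names that could not be found
missing_pkgs <- needed[!is_installed]
missing_pkgs

# Install only what is missing (left commented out on purpose):
# if (length(missing_pkgs) > 0) install.packages(missing_pkgs)
```

Keeping the install line commented out, as elsewhere in this tutorial, avoids triggering downloads every time the document is rendered.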
8 Formatting Flat Files for Loading
Good practices for CSV/TSV flat files:
Use a header row with short, clear, alphanumeric column names (avoid spaces; use underscores if needed)
Use UTF-8 encoding
Use a consistent delimiter (comma for CSV, tab for TSV)
Use ISO 8601 for dates (YYYY-MM-DD) and include time zones if timestamps are present
Avoid embedded line breaks in cells; if present, ensure proper quoting
Keep one “tidy” table per file: each row is one observation, each column is one variable
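Two of the practices above, clean column names and ISO 8601 dates, can be applied with base R before a file is saved. The sketch below uses made-up column names and a made-up timestamp purely for illustration.

```r
# Hypothetical messy column names, as they might arrive from a spreadsheet
raw_names <- c("Subject ID", "Age (years)", "Visit Date")

# Replace runs of non-alphanumeric characters with underscores,
# trim leading/trailing underscores, and lower-case the result
clean_names <- gsub("[^A-Za-z0-9]+", "_", raw_names)
clean_names <- gsub("^_+|_+$", "", clean_names)
clean_names <- tolower(clean_names)
clean_names

# Dates in ISO 8601 (YYYY-MM-DD)
iso_date <- format(as.Date("2025-01-10"), "%Y-%m-%d")

# Timestamps with an explicit time zone (UTC, marked with a trailing Z)
stamp <- as.POSIXct("2025-01-10 14:30:00", tz = "UTC")
iso_stamp <- format(stamp, "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")
iso_date
iso_stamp
```

The trailing Z is only correct because the timestamp is formatted in UTC; for other time zones an explicit offset would be needed instead.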
8.1 Create and Save a Well-Formatted CSV
# Example tidy dataset
tidy_example <- data.frame(
  subject_id = 1:6,
  group = c("control", "control", "control", "treatment", "treatment", "treatment"),
  age_years = c(34, 45, 51, 29, 40, NA),
  visit_date = as.Date(c("2025-01-10", "2025-01-12", "2025-01-13", "2025-01-11", "2025-01-12", "2025-01-14")),
  score = c(87, 90, 85, 92, 88, 91)
)
# Create a data folder, then save CSV
dir.create("data", showWarnings = FALSE)
csv_path <- file.path("data", "tidy_example.csv")
write.csv(tidy_example, csv_path, row.names = FALSE, na = "")
csv_path
[1] "data/tidy_example.csv"
9 Loading a Dataset (Flat File and Other Resources)
9.1 Loading the CSV with Base R
loaded_base <- read.csv(csv_path, stringsAsFactors = FALSE)
str(loaded_base) # structure of the dataframe
'data.frame': 6 obs. of 5 variables:
$ subject_id: int 1 2 3 4 5 6
$ group : chr "control" "control" "control" "treatment" ...
$ age_years : int 34 45 51 29 40 NA
$ visit_date: chr "2025-01-10" "2025-01-12" "2025-01-13" "2025-01-11" ...
$ score : int 87 90 85 92 88 91
head(loaded_base) # first six rows (by default)
# Load tidy_example data back in, add a variable, and resave.
tidy_example <- read.csv(csv_path)
tidy_example$score_dichot <- ifelse(tidy_example$score > 90, 1, 0)
write.csv(tidy_example, csv_path, row.names = FALSE, na = "")
head(tidy_example)
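In the str() output above, visit_date is read back as character rather than as a date. One way to control this in base R is to declare column classes and missing-value codes when reading. The sketch below writes a small temporary CSV with the same column names so that it runs on its own; the two data rows are made up.

```r
# Write a tiny CSV to a temporary file so the example is self-contained
tmp_csv <- tempfile(fileext = ".csv")
writeLines(
  c("subject_id,group,age_years,visit_date,score",
    "1,control,34,2025-01-10,87",
    "2,treatment,,2025-01-11,92"),
  tmp_csv
)

# Declare the type of each column and which strings mean "missing"
typed_base <- read.csv(
  tmp_csv,
  colClasses = c(
    subject_id = "integer",
    group      = "character",
    age_years  = "numeric",
    visit_date = "Date",
    score      = "numeric"
  ),
  na.strings = c("", "NA")
)
str(typed_base)
```

Declaring types up front means a malformed value causes an error at read time instead of silently turning a numeric column into text.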
9.2 Loading the CSV with readr (tidyverse)
# Install readr if needed (run once; eval is FALSE so it won't execute automatically)
# install.packages("readr")
# If readr is available, demonstrate its use safely
if (requireNamespace("readr", quietly = TRUE)) {
  loaded_readr <- readr::read_csv(csv_path, show_col_types = FALSE)
  head(loaded_readr)
}
9.2.1 Handling Column Types and Missing Values Explicitly with readr
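As a sketch of what explicit handling can look like, readr's read_csv() accepts a col_types specification and an na argument listing the strings to treat as missing. The example assumes the readr package is installed (it is skipped otherwise) and writes a small temporary CSV with made-up rows so that it does not depend on earlier chunks.

```r
if (requireNamespace("readr", quietly = TRUE)) {
  # Small temporary CSV with the same column names as the tutorial data
  tmp_csv <- tempfile(fileext = ".csv")
  writeLines(
    c("subject_id,group,age_years,visit_date,score",
      "1,control,34,2025-01-10,87",
      "2,treatment,,2025-01-11,92"),
    tmp_csv
  )

  # Declare each column's type and the strings that mean "missing"
  typed_readr <- readr::read_csv(
    tmp_csv,
    col_types = readr::cols(
      subject_id = readr::col_integer(),
      group      = readr::col_character(),
      age_years  = readr::col_double(),
      visit_date = readr::col_date(format = "%Y-%m-%d"),
      score      = readr::col_double()
    ),
    na = c("", "NA")
  )
  str(typed_readr)
}
```

When col_types is supplied, readr does not need to guess column types, so the column-specification message is not printed.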
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  ggplot(loaded_base, aes(x = group, y = score, fill = group)) +
    geom_boxplot() +
    geom_jitter(width = 0.1, alpha = 0.6) +
    labs(title = "Scores by Group", x = "Group", y = "Score") +
    theme_minimal()
}
ggplot works from a data frame, and it is easiest to use when that data frame is in long format, with one row per observation and one column per variable. In this case, loaded_base is already in a format that ggplot can use.
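When data arrive in wide format instead (one column per measurement occasion), they can be reshaped to long format before plotting. The sketch below uses base R only; the data frame wide and its pre/post scores are made up for illustration.

```r
# Hypothetical wide data: one row per subject, one column per time point
wide <- data.frame(
  subject_id = 1:3,
  pre  = c(80, 85, 90),
  post = c(88, 87, 95)
)

# Long format: one row per subject per time point
long <- data.frame(
  subject_id = rep(wide$subject_id, times = 2),
  time       = rep(c("pre", "post"), each = nrow(wide)),
  score      = c(wide$pre, wide$post)
)
long
```

In the long version, time can be mapped to an aesthetic such as x or fill, which is not possible while pre and post are separate columns.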
fit <- lm(mpg ~ wt + cyl, data = mtcars)
summary(fit)
Call:
lm(formula = mpg ~ wt + cyl, data = mtcars)
Residuals:
    Min      1Q  Median      3Q     Max
-4.2893 -1.5512 -0.4684  1.5743  6.1004
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  39.6863     1.7150  23.141  < 2e-16 ***
wt           -3.1910     0.7569  -4.216 0.000222 ***
cyl          -1.5078     0.4147  -3.636 0.001064 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.568 on 29 degrees of freedom
Multiple R-squared: 0.8302, Adjusted R-squared: 0.8185
F-statistic: 70.91 on 2 and 29 DF, p-value: 6.809e-12
# <- is the assignment operator
# lm fits a linear regression model
# mpg is the outcome (y variable)
# ~ is the formula operator that expresses relationships
# wt and cyl are the x variables
# the dataframe specified is mtcars (built in)
# general form: lm(y ~ x + z, data = dataframe)
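Once a model is stored in an object, its pieces can be extracted and reused. As a sketch, the block below refits the same model so it runs on its own, then predicts mpg for a hypothetical car; the values wt = 3 (3,000 lbs) and cyl = 6 are made up for illustration.

```r
# Refit the same model so this block is self-contained
fit <- lm(mpg ~ wt + cyl, data = mtcars)

# Named vector of estimated coefficients: (Intercept), wt, cyl
coef(fit)

# Predict mpg for a hypothetical car
new_car <- data.frame(wt = 3, cyl = 6)
predicted_mpg <- predict(fit, newdata = new_car)
predicted_mpg

# The same prediction by hand: intercept + b_wt * 3 + b_cyl * 6
manual_mpg <- sum(coef(fit) * c(1, 3, 6))
manual_mpg
```

The manual calculation shows what predict() does for a linear model: it multiplies each coefficient by the corresponding predictor value and adds the results to the intercept.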
13.5 T-test (Group Comparison)
t.test(mpg ~ am, data = mtcars)
Welch Two Sample t-test
data: mpg by am
t = -3.7671, df = 18.332, p-value = 0.001374
alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
95 percent confidence interval:
-11.280194 -3.209684
sample estimates:
mean in group 0 mean in group 1
17.14737 24.39231
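The printed output above is also available as components of the object that t.test() returns, which is convenient for reporting results in text or tables. A minimal sketch (the names tt and diff_means are illustrative):

```r
# Store the test result instead of only printing it
tt <- t.test(mpg ~ am, data = mtcars)

tt$p.value    # p-value of the test
tt$conf.int   # 95 percent confidence interval for the difference in means
tt$estimate   # the two group means

# Difference in means (group 0 minus group 1)
diff_means <- unname(tt$estimate[1] - tt$estimate[2])
diff_means
```

The difference is negative because group 0 has the lower mean mpg, which matches the sign of the confidence interval in the printed output.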
Quarto is a modern open-source scientific and technical publishing system built on Pandoc. It allows you to create dynamic documents, reports, presentations, and websites that combine text, code, and output.
15.2 How to create Quarto documents
In RStudio, go to File → New File → Quarto Document.
Choose a template (e.g., HTML, PDF, Word) and click OK.
Write your content using Markdown and embed R code chunks using triple backticks with {r}.
Save the file with a .qmd extension.
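The steps above can also be sketched from R by writing a minimal .qmd file to disk. The title, heading, and file name below are made up for illustration, and the file is written to a temporary folder. The code-chunk fence is built with strrep() only so that the three backticks do not have to be typed inside this example.

```r
# Three backticks, built programmatically
fence <- strrep("`", 3)

# Contents of a minimal Quarto document: YAML header, heading, one R chunk
qmd_lines <- c(
  "---",
  'title: "My first Quarto document"',
  "format: html",
  "---",
  "",
  "## Summary of mpg",
  "",
  paste0(fence, "{r}"),
  "summary(mtcars$mpg)",
  fence
)

# Save with a .qmd extension (temporary folder used for the example)
qmd_path <- file.path(tempdir(), "example.qmd")
writeLines(qmd_lines, qmd_path)
file.exists(qmd_path)
```

Opening the resulting file in RStudio shows the same structure that File → New File → Quarto Document creates: a YAML header at the top, Markdown text, and R code chunks.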
15.3 Rendering Quarto from R
# Quarto render (requires Quarto installed as a separate tool from https://quarto.org/)
if (requireNamespace("quarto", quietly = TRUE)) {
  # quarto::quarto_render("your_document.qmd")
}