R is a powerful programming language and environment for statistical computing and data visualization. In this course, you will learn:
This lecture provides a foundational introduction to R programming that you’ll use throughout the course.
Learning Objectives: By the end of this lecture, you should be able to:
R is:
RStudio is an integrated development environment (IDE) for R that makes it much easier to use R. We will focus on using R through RStudio.
R has become the standard tool for data analysis and statistical computing in:
Key advantages include:
When you open RStudio, you’ll see a window with multiple panes arranged in a grid.
The four main panes are:
You can customize the appearance and layout of RStudio:
The console is where you type commands and see results. Let’s start with simple arithmetic:
# Simple addition
1 + 1
[1] 2
# Division
10 / 2
[1] 5
# Exponentiation
2 ^ 3
[1] 8
# Square root
sqrt(16)
[1] 4
Notice the [1] prefix in the output. This indicates the position of the first value displayed on that line.
# When displaying many values, you see multiple indices
100:130
 [1] 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123
[25] 124 125 126 127 128 129 130
In this output, [1] marks position 1 (the value 100) and [25] marks position 25 (the value 124, the first value on the second line). Where the line breaks fall depends on the width of your console.
Tip: The [1] notation becomes especially useful when working with vectors and matrices where you need to track the position of values.
If you type an incomplete command and press Enter, R displays a + prompt, waiting for you to complete it:
# Type this and press Enter right after the minus sign
5 -
# The prompt changes to +, which means R is waiting for the rest of the command
# Type 1 and press Enter
1
When you do this in R, you’ll see:
> 5 -
+ 1
[1] 4
To cancel an incomplete command, press Escape and start over.
If you type a command that R doesn’t recognize, you’ll get an error message:
> 3 % 5
Error: unexpected input in "3 % 5"
Error messages are helpful! They tell you what went wrong. Don’t be intimidated by them; they’re just R’s way of saying it didn’t understand your command. Here, a single % is not an operator in R. The percent sign only appears in pairs, as in %% (remainder) or %in% (membership), so 3 % 5 is not valid syntax.
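To make the point about % concrete, here is a small sketch of three operators that are written with paired percent signs. The object names (remainder, quotient, is_member) are chosen only for this illustration.

```r
# Remainder (modulo): what is left over after dividing 7 by 3
remainder <- 7 %% 3
remainder        # [1] 1

# Integer division: how many whole times 3 fits into 7
quotient <- 7 %/% 3
quotient         # [1] 2

# Membership: is the value 3 one of the values in the vector?
is_member <- 3 %in% c(1, 3, 5)
is_member        # [1] TRUE
```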
Some R commands take a long time to run. In RStudio, you can cancel a running command by pressing Escape (or Ctrl + C if you are running R in a terminal), or by clicking the STOP button in the console. Note that canceling may take a moment.
R lets you save data by storing it in objects. An object is simply a name you can use to retrieve stored data.
You assign values to objects using <- or =:
# Create an object named 'a' with value 1
a <- 1
# View the contents
a
[1] 1
# Do arithmetic with the object
a + 2
[1] 3
# Create another object
b <- 10
a + b
[1] 11
Best Practice: Use <- for assignment in R. While = also works, <- is the R convention and makes your code more readable to other R users.
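One practical reason for the convention is that = has a second job inside function calls, where it names an argument instead of creating an object. The sketch below illustrates the difference. The object names (scores_arrow, scores_equal, evenly_spaced) are made up for this example.

```r
# At the top level, both forms create an object
scores_arrow <- c(2, 4, 6)
scores_equal = c(2, 4, 6)
identical(scores_arrow, scores_equal)   # [1] TRUE

# Inside a function call, = names an argument; it does not create an object
evenly_spaced <- seq(1, 10, length.out = 4)
evenly_spaced                           # [1]  1  4  7 10

# No object called length.out was created by the call above
exists("length.out")                    # [1] FALSE
```

Reserving = for arguments and <- for assignment keeps the two roles visually distinct.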
R will overwrite an object without asking for confirmation:
# First assignment
a <- 1
a
[1] 1
# Reassign a new value
a <- 2
a
[1] 2
# You can overwrite with different data types
a <- "text"
a
[1] "text"
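Object names in R have a few rules:
Valid names start with:
A letter
A dot that is not followed by a number (e.g., .variable_name)
Invalid names:
Names that start with a number: 2variables ❌
Names that contain special characters: my-var, my$var, my@var ❌
# Valid names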
my_variable <- 5
my.variable <- 5
myVariable <- 5
x1 <- 10
# Invalid names (these will produce errors)
1variable <- 5 # Error: starts with number
my-variable <- 5 # Error: hyphen not allowed
my variable <- 5 # Error: space not allowed
R is case-sensitive, so name and Name are different objects:
name <- "lowercase"
Name <- "uppercase"
name
[1] "lowercase"
Name
[1] "uppercase"
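If you are unsure whether a name follows the naming rules above, one way to check is base R's make.names(), which rewrites a string into a syntactically valid name. A string that comes back unchanged was already valid. This is a sketch, and the candidate names are illustrative only.

```r
# Names to check (a mix of valid and invalid candidates)
candidate_names <- c("my_variable", "x1", ".variable_name",
                     "2variables", "my-var", "my var")

# make.names() repairs invalid names; unchanged strings were already valid
repaired <- make.names(candidate_names)
is_valid <- repaired == candidate_names

# Show each candidate, whether it is valid, and the repaired version
data.frame(name = candidate_names, valid = is_valid, repaired = repaired)
```

The first three candidates are reported as valid, and the last three are rewritten (for example, "my-var" becomes "my.var").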
Use ls() to see all objects you’ve created:
# Create some objects
x <- 5
y <- 10
z <- "hello"
# List all objects
ls()
[1] "a"    "b"    "name" "Name" "x"    "y"    "z"
# Remove a specific object
rm(z)
# Verify it's gone
ls()
[1] "a"    "b"    "name" "Name" "x"    "y"
R has several basic data types:
# Numeric (default for numbers)
x <- 3.14
y <- 42
class(x)
[1] "numeric"
class(y)
[1] "numeric"
# Character (text)
name <- "Alice"
greeting <- 'Hello, world!'
class(name)
[1] "character"
class(greeting)
[1] "character"
# Logical (TRUE/FALSE)
is_patient <- TRUE
treatment_received <- FALSE
class(is_patient)
[1] "logical"
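Two related details often come up right after the basic types: whole numbers are stored as "numeric" unless you ask for an integer, and values can be converted between types. The following is a minimal sketch with hypothetical object names.

```r
# The L suffix stores a whole number as an integer instead of a double
whole_number <- 42L
class(whole_number)        # [1] "integer"
is.numeric(whole_number)   # [1] TRUE (integers still count as numeric)

# Numbers read in as text must be converted before doing arithmetic
age_text <- "55"
age_number <- as.numeric(age_text)
age_number + 1             # [1] 56

# Logical values convert to 1 (TRUE) and 0 (FALSE)
flag_as_number <- as.numeric(TRUE)
flag_as_number             # [1] 1
```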
A vector is a collection of values of the same type. Create vectors using c():
# Numeric vector
ages <- c(25, 30, 35, 40, 45)
# Character vector
names <- c("Alice", "Bob", "Carol", "Dave", "Eve")
# Logical vector
is_smoker <- c(TRUE, FALSE, TRUE, FALSE, TRUE)
# Check length
length(ages)
[1] 5
# Access individual elements
names[2] # Second element
[1] "Bob"
ages[c(1, 3)] # First and third elements
[1] 25 35
# Create sequences
1:10 # Sequence from 1 to 10
 [1]  1  2  3  4  5  6  7  8  9 10
seq(1, 10, by = 2) # Sequence with step
[1] 1 3 5 7 9
# Repeat values
rep(1, 5) # Repeat 1 five times
[1] 1 1 1 1 1
rep(c("A", "B"), 3) # Repeat vector three times
[1] "A" "B" "A" "B" "A" "B"
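Because every element of a vector must have the same type, R silently converts mixed input to a common type. Vectors also support element-wise arithmetic and selection by condition. The sketch below shows both, reusing the ages vector from above; the other object names are illustrative.

```r
# Mixing types: everything is converted to the most flexible type (character)
mixed <- c(1, "two", TRUE)
class(mixed)                 # [1] "character"
mixed                        # [1] "1"    "two"  "TRUE"

# Arithmetic is applied to every element at once
ages <- c(25, 30, 35, 40, 45)
ages_next_year <- ages + 1
ages_next_year               # [1] 26 31 36 41 46

# A logical condition inside [ ] keeps only the elements where it is TRUE
over_30 <- ages[ages > 30]
over_30                      # [1] 35 40 45
```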
A list is a collection where elements can have different types:
# Create a list
person <- list(
  name = "Alice",
  age = 30,
  is_student = FALSE
)
# Access elements
person$name
[1] "Alice"
person[[2]] # Access by position
[1] 30
person[["age"]] # Access by name
[1] 30
# View the list structure
str(person)
List of 3
 $ name      : chr "Alice"
 $ age       : num 30
 $ is_student: logi FALSE
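A common point of confusion with lists is the difference between single and double brackets: [[ ]] extracts the element itself, while [ ] returns a smaller list. The sketch below recreates the person list and also shows how elements can be added or changed. The city value is made up for illustration.

```r
person <- list(name = "Alice", age = 30, is_student = FALSE)

# Double brackets return the stored value
age_value <- person[["age"]]
class(age_value)      # [1] "numeric"

# Single brackets return a list containing that element
age_sublist <- person["age"]
class(age_sublist)    # [1] "list"

# Add a new element and update an existing one with $
person$city <- "Boston"   # hypothetical value
person$age <- 31
names(person)         # [1] "name"       "age"        "is_student" "city"
```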
A matrix is a two-dimensional collection of elements all of the same type:
# Create a 3x3 matrix
m <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9), nrow = 3, ncol = 3)
m
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
# Access elements
m[1, 2] # Row 1, Column 2
[1] 4
m[2, ] # All of Row 2
[1] 2 5 8
m[, 3] # All of Column 3
[1] 7 8 9
# Matrix operations
t(m) # Transpose
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
m %*% m # Matrix multiplication
     [,1] [,2] [,3]
[1,]   30   66  102
[2,]   36   81  126
[3,]   42   96  150
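Two details in the example above are easy to miss: matrix() fills column by column unless told otherwise, and * is not the same as %*%. The following sketch illustrates both with the same 3x3 values. The object names are chosen for this example only.

```r
# Default: values fill the matrix column by column
m <- matrix(1:9, nrow = 3, ncol = 3)

# byrow = TRUE fills row by row, which here equals the transpose of m
m_by_row <- matrix(1:9, nrow = 3, ncol = 3, byrow = TRUE)
identical(m_by_row, t(m))   # [1] TRUE

# * multiplies matching elements; %*% is true matrix multiplication
elementwise <- m * m
matrix_product <- m %*% m
elementwise[1, 2]           # [1] 16  (4 * 4)
matrix_product[1, 2]        # [1] 66  (1*4 + 4*5 + 7*6)
```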
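Data frames are the most important data structure in R for statistical analysis. They’re like spreadsheets where each row is an observation (for example, one patient), each column is a variable, and different columns can hold different data types: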
# Create a simple data frame
df <- data.frame(
  ID = 1:5,
  Name = c("Alice", "Bob", "Carol", "Dave", "Eve"),
  Age = c(28, 35, 42, 31, 29),
  Diagnosis = c("Yes", "No", "Yes", "No", "No")
)
df
  ID  Name Age Diagnosis
1  1 Alice  28       Yes
2  2   Bob  35        No
3  3 Carol  42       Yes
4  4  Dave  31        No
5  5   Eve  29        No
# Check structure
str(df)
'data.frame':	5 obs. of  4 variables:
 $ ID       : int  1 2 3 4 5
 $ Name     : chr  "Alice" "Bob" "Carol" "Dave" ...
 $ Age      : num  28 35 42 31 29
 $ Diagnosis: chr  "Yes" "No" "Yes" "No" ...
# Dimensions
dim(df)
[1] 5 4
nrow(df)
[1] 5
ncol(df)
[1] 4
# Access a column using $
df$Name
[1] "Alice" "Bob"   "Carol" "Dave"  "Eve"
# Access by row and column
df[1, 2] # Row 1, Column 2
[1] "Alice"
# Access entire column
df[, "Age"]
[1] 28 35 42 31 29
df[, 3]
[1] 28 35 42 31 29
# Access entire row
df[2, ]
  ID Name Age Diagnosis
2  2  Bob  35        No
# Subset rows where Age > 30
df[df$Age > 30, ]
  ID  Name Age Diagnosis
2  2   Bob  35        No
3  3 Carol  42       Yes
4  4  Dave  31        No
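Building on the subsetting shown above, the sketch below adds a derived column, combines two conditions, and computes a group mean in base R. It recreates the same df so that it runs on its own. The new names (Over30, older_diagnosed, mean_age_by_dx) are assumptions made for this illustration.

```r
df <- data.frame(
  ID = 1:5,
  Name = c("Alice", "Bob", "Carol", "Dave", "Eve"),
  Age = c(28, 35, 42, 31, 29),
  Diagnosis = c("Yes", "No", "Yes", "No", "No")
)

# Add a new column computed from an existing one
df$Over30 <- df$Age > 30

# Combine conditions with & (and); use | for "or"
older_diagnosed <- df[df$Age > 30 & df$Diagnosis == "Yes", ]
older_diagnosed$Name        # [1] "Carol"

# Keep only some columns for the selected rows
df[df$Diagnosis == "No", c("Name", "Age")]

# Mean age within each diagnosis group
mean_age_by_dx <- tapply(df$Age, df$Diagnosis, mean)
mean_age_by_dx
```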
In epidemiological research, you’ll usually start with existing data:
# Read a CSV file
data <- read.csv("path/to/file.csv")
# If first row contains variable names (default)
data <- read.csv("data.csv")
# If no header row
data <- read.csv("data.csv", header = FALSE)
# Specify missing value codes
data <- read.csv("data.csv", na.strings = c("", "NA", "."))
Easier method for point-and-click:
Copy this code to your script for reproducibility:
# Code generated by RStudio
library(readr)
mydata <- read_csv("myfile.csv")
Always save the import code in your script. This creates a reproducible record of how you loaded your data.
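To see what the na.strings argument does without needing a real data file, the sketch below writes a tiny CSV to a temporary file and reads it back with base R's read.csv(). The file contents and column names are invented for this example.

```r
# Hypothetical raw data: "." and an empty field both stand for "missing"
example_rows <- c("id,age,sex",
                  "1,34,F",
                  "2,.,M",
                  "3,51,")

# Write the lines to a temporary CSV file
csv_path <- tempfile(fileext = ".csv")
writeLines(example_rows, csv_path)

# Read it back, treating "", "NA", and "." as missing values
imported <- read.csv(csv_path, na.strings = c("", "NA", "."))
str(imported)

# Count missing values in each column
missing_per_column <- colSums(is.na(imported))
missing_per_column
```

Without na.strings, the "." would force the age column to be read as text, so it is worth checking str() right after every import.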
An R script is a plain text file containing R code. Scripts allow you to:
Method 1 (RStudio):
Method 2:
Use # to add comments that explain your code:
# Create sample patient data
set.seed(123)
data <- data.frame(
  age = rnorm(100, mean = 55, sd = 15),
  weight = rnorm(100, mean = 80, sd = 12),
  height = rnorm(100, mean = 1.70, sd = 0.10)
)
# Calculate BMI
data$BMI <- data$weight / (data$height^2)
# Subset to patients over 50 years old
older_patients <- data[data$age > 50, ]
# Summarize BMI for older patients
summary(older_patients$BMI)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   17.8    24.6    27.0    27.3    30.3    39.4
Start with header information and load all packages:
# =====================================================
# Analysis of NHANES Data
# Author: Your Name
# Date: February 2025
# =====================================================
# Load required packages
library(dplyr)
library(ggplot2)
library(NHANES)
# -----
# 1. Load and Prepare Data
# -----
data(NHANES)
head(NHANES)
# -----
# 2. Exploratory Data Analysis
# -----
summary(NHANES)
NHANES %>%
  group_by(Gender) %>%
  summarise(mean_age = mean(Age, na.rm = TRUE),
            mean_bp_sys = mean(BPSys1, na.rm = TRUE))
# -----
# 3. Statistical Tests
# -----
t.test(BPSys1 ~ Gender, data = NHANES)
# Good names
systolic_bp <- 140
patient_age <- 55
calculate_bmi <- function(weight, height) { weight / height^2 }
# Poor names
sb <- 140 # Unclear abbreviation
a <- 55 # Single letter
f1 <- function(w, h) { w / h^2 } # Cryptic
Run a single line:
Run multiple lines:
Run the entire script:
Let’s work through a complete analysis workflow using the NHANES (National Health and Nutrition Examination Survey) dataset:
# Load required packages
library(NHANES)
library(dplyr)
# 1. Load the data
data(NHANES)
# 2. Examine the data
head(NHANES)
# A tibble: 6 × 76
ID SurveyYr Gender Age AgeDecade AgeMonths Race1 Race3 Education MaritalStatus HHIncome
<int> <fct> <fct> <int> <fct> <int> <fct> <fct> <fct> <fct> <fct>
1 51624 2009_10 male 34 " 30-39" 409 White <NA> High School Married 25000-34999
2 51624 2009_10 male 34 " 30-39" 409 White <NA> High School Married 25000-34999
3 51624 2009_10 male 34 " 30-39" 409 White <NA> High School Married 25000-34999
4 51625 2009_10 male 4 " 0-9" 49 Other <NA> <NA> <NA> 20000-24999
5 51630 2009_10 female 49 " 40-49" 596 White <NA> Some College LivePartner 35000-44999
6 51638 2009_10 male 9 " 0-9" 115 White <NA> <NA> <NA> 75000-99999
# ℹ 65 more variables: HHIncomeMid <int>, Poverty <dbl>, HomeRooms <int>, HomeOwn <fct>,
# Work <fct>, Weight <dbl>, Length <dbl>, HeadCirc <dbl>, Height <dbl>, BMI <dbl>,
# BMICatUnder20yrs <fct>, BMI_WHO <fct>, Pulse <int>, BPSysAve <int>, BPDiaAve <int>,
# BPSys1 <int>, BPDia1 <int>, BPSys2 <int>, BPDia2 <int>, BPSys3 <int>, BPDia3 <int>,
# Testosterone <dbl>, DirectChol <dbl>, TotChol <dbl>, UrineVol1 <int>, UrineFlow1 <dbl>,
# UrineVol2 <int>, UrineFlow2 <dbl>, Diabetes <fct>, DiabetesAge <int>, HealthGen <fct>,
# DaysPhysHlthBad <int>, DaysMentHlthBad <int>, LittleInterest <fct>, Depressed <fct>, …
str(NHANES)
tibble [10,000 × 76] (S3: tbl_df/tbl/data.frame)
$ ID : int [1:10000] 51624 51624 51624 51625 51630 51638 51646 51647 51647 51647 ...
$ SurveyYr : Factor w/ 2 levels "2009_10","2011_12": 1 1 1 1 1 1 1 1 1 1 ...
$ Gender : Factor w/ 2 levels "female","male": 2 2 2 2 1 2 2 1 1 1 ...
$ Age : int [1:10000] 34 34 34 4 49 9 8 45 45 45 ...
$ AgeDecade : Factor w/ 8 levels " 0-9"," 10-19",..: 4 4 4 1 5 1 1 5 5 5 ...
$ AgeMonths : int [1:10000] 409 409 409 49 596 115 101 541 541 541 ...
$ Race1 : Factor w/ 5 levels "Black","Hispanic",..: 4 4 4 5 4 4 4 4 4 4 ...
$ Race3 : Factor w/ 6 levels "Asian","Black",..: NA NA NA NA NA NA NA NA NA NA ...
$ Education : Factor w/ 5 levels "8th Grade","9 - 11th Grade",..: 3 3 3 NA 4 NA NA 5 5 5 ...
$ MaritalStatus : Factor w/ 6 levels "Divorced","LivePartner",..: 3 3 3 NA 2 NA NA 3 3 3 ...
$ HHIncome : Factor w/ 12 levels " 0-4999"," 5000-9999",..: 6 6 6 5 7 11 9 11 11 11 ...
$ HHIncomeMid : int [1:10000] 30000 30000 30000 22500 40000 87500 60000 87500 87500 87500 ...
$ Poverty : num [1:10000] 1.36 1.36 1.36 1.07 1.91 1.84 2.33 5 5 5 ...
$ HomeRooms : int [1:10000] 6 6 6 9 5 6 7 6 6 6 ...
$ HomeOwn : Factor w/ 3 levels "Own","Rent","Other": 1 1 1 1 2 2 1 1 1 1 ...
$ Work : Factor w/ 3 levels "Looking","NotWorking",..: 2 2 2 NA 2 NA NA 3 3 3 ...
$ Weight : num [1:10000] 87.4 87.4 87.4 17 86.7 29.8 35.2 75.7 75.7 75.7 ...
$ Length : num [1:10000] NA NA NA NA NA NA NA NA NA NA ...
$ HeadCirc : num [1:10000] NA NA NA NA NA NA NA NA NA NA ...
$ Height : num [1:10000] 165 165 165 105 168 ...
$ BMI : num [1:10000] 32.2 32.2 32.2 15.3 30.6 ...
$ BMICatUnder20yrs: Factor w/ 4 levels "UnderWeight",..: NA NA NA NA NA NA NA NA NA NA ...
$ BMI_WHO : Factor w/ 4 levels "12.0_18.5","18.5_to_24.9",..: 4 4 4 1 4 1 2 3 3 3 ...
$ Pulse : int [1:10000] 70 70 70 NA 86 82 72 62 62 62 ...
$ BPSysAve : int [1:10000] 113 113 113 NA 112 86 107 118 118 118 ...
$ BPDiaAve : int [1:10000] 85 85 85 NA 75 47 37 64 64 64 ...
$ BPSys1 : int [1:10000] 114 114 114 NA 118 84 114 106 106 106 ...
$ BPDia1 : int [1:10000] 88 88 88 NA 82 50 46 62 62 62 ...
$ BPSys2 : int [1:10000] 114 114 114 NA 108 84 108 118 118 118 ...
$ BPDia2 : int [1:10000] 88 88 88 NA 74 50 36 68 68 68 ...
$ BPSys3 : int [1:10000] 112 112 112 NA 116 88 106 118 118 118 ...
$ BPDia3 : int [1:10000] 82 82 82 NA 76 44 38 60 60 60 ...
$ Testosterone : num [1:10000] NA NA NA NA NA NA NA NA NA NA ...
$ DirectChol : num [1:10000] 1.29 1.29 1.29 NA 1.16 1.34 1.55 2.12 2.12 2.12 ...
$ TotChol : num [1:10000] 3.49 3.49 3.49 NA 6.7 4.86 4.09 5.82 5.82 5.82 ...
$ UrineVol1 : int [1:10000] 352 352 352 NA 77 123 238 106 106 106 ...
$ UrineFlow1 : num [1:10000] NA NA NA NA 0.094 ...
$ UrineVol2 : int [1:10000] NA NA NA NA NA NA NA NA NA NA ...
$ UrineFlow2 : num [1:10000] NA NA NA NA NA NA NA NA NA NA ...
$ Diabetes : Factor w/ 2 levels "No","Yes": 1 1 1 1 1 1 1 1 1 1 ...
$ DiabetesAge : int [1:10000] NA NA NA NA NA NA NA NA NA NA ...
$ HealthGen : Factor w/ 5 levels "Excellent","Vgood",..: 3 3 3 NA 3 NA NA 2 2 2 ...
$ DaysPhysHlthBad : int [1:10000] 0 0 0 NA 0 NA NA 0 0 0 ...
$ DaysMentHlthBad : int [1:10000] 15 15 15 NA 10 NA NA 3 3 3 ...
$ LittleInterest : Factor w/ 3 levels "None","Several",..: 3 3 3 NA 2 NA NA 1 1 1 ...
$ Depressed : Factor w/ 3 levels "None","Several",..: 2 2 2 NA 2 NA NA 1 1 1 ...
$ nPregnancies : int [1:10000] NA NA NA NA 2 NA NA 1 1 1 ...
$ nBabies : int [1:10000] NA NA NA NA 2 NA NA NA NA NA ...
$ Age1stBaby : int [1:10000] NA NA NA NA 27 NA NA NA NA NA ...
$ SleepHrsNight : int [1:10000] 4 4 4 NA 8 NA NA 8 8 8 ...
$ SleepTrouble : Factor w/ 2 levels "No","Yes": 2 2 2 NA 2 NA NA 1 1 1 ...
$ PhysActive : Factor w/ 2 levels "No","Yes": 1 1 1 NA 1 NA NA 2 2 2 ...
$ PhysActiveDays : int [1:10000] NA NA NA NA NA NA NA 5 5 5 ...
$ TVHrsDay : Factor w/ 7 levels "0_hrs","0_to_1_hr",..: NA NA NA NA NA NA NA NA NA NA ...
$ CompHrsDay : Factor w/ 7 levels "0_hrs","0_to_1_hr",..: NA NA NA NA NA NA NA NA NA NA ...
$ TVHrsDayChild : int [1:10000] NA NA NA 4 NA 5 1 NA NA NA ...
$ CompHrsDayChild : int [1:10000] NA NA NA 1 NA 0 6 NA NA NA ...
$ Alcohol12PlusYr : Factor w/ 2 levels "No","Yes": 2 2 2 NA 2 NA NA 2 2 2 ...
$ AlcoholDay : int [1:10000] NA NA NA NA 2 NA NA 3 3 3 ...
$ AlcoholYear : int [1:10000] 0 0 0 NA 20 NA NA 52 52 52 ...
$ SmokeNow : Factor w/ 2 levels "No","Yes": 1 1 1 NA 2 NA NA NA NA NA ...
$ Smoke100 : Factor w/ 2 levels "No","Yes": 2 2 2 NA 2 NA NA 1 1 1 ...
$ Smoke100n : Factor w/ 2 levels "Non-Smoker","Smoker": 2 2 2 NA 2 NA NA 1 1 1 ...
$ SmokeAge : int [1:10000] 18 18 18 NA 38 NA NA NA NA NA ...
$ Marijuana : Factor w/ 2 levels "No","Yes": 2 2 2 NA 2 NA NA 2 2 2 ...
$ AgeFirstMarij : int [1:10000] 17 17 17 NA 18 NA NA 13 13 13 ...
$ RegularMarij : Factor w/ 2 levels "No","Yes": 1 1 1 NA 1 NA NA 1 1 1 ...
$ AgeRegMarij : int [1:10000] NA NA NA NA NA NA NA NA NA NA ...
$ HardDrugs : Factor w/ 2 levels "No","Yes": 2 2 2 NA 2 NA NA 1 1 1 ...
$ SexEver : Factor w/ 2 levels "No","Yes": 2 2 2 NA 2 NA NA 2 2 2 ...
$ SexAge : int [1:10000] 16 16 16 NA 12 NA NA 13 13 13 ...
$ SexNumPartnLife : int [1:10000] 8 8 8 NA 10 NA NA 20 20 20 ...
$ SexNumPartYear : int [1:10000] 1 1 1 NA 1 NA NA 0 0 0 ...
$ SameSex : Factor w/ 2 levels "No","Yes": 1 1 1 NA 2 NA NA 2 2 2 ...
$ SexOrientation : Factor w/ 3 levels "Bisexual","Heterosexual",..: 2 2 2 NA 2 NA NA 1 1 1 ...
$ PregnantNow : Factor w/ 3 levels "Yes","No","Unknown": NA NA NA NA NA NA NA NA NA NA ...
# 3. Calculate summary statistics by gender
summary_stats <- NHANES %>%
  group_by(Gender) %>%
  summarise(
    n = n(),
    mean_age = mean(Age, na.rm = TRUE),
    sd_age = sd(Age, na.rm = TRUE),
    mean_bp_sys = mean(BPSys1, na.rm = TRUE),
    sd_bp_sys = sd(BPSys1, na.rm = TRUE),
    mean_bmi = mean(BMI, na.rm = TRUE),
    sd_bmi = sd(BMI, na.rm = TRUE)
  )
print(summary_stats)
# A tibble: 2 × 8
  Gender     n mean_age sd_age mean_bp_sys sd_bp_sys mean_bmi sd_bmi
  <fct>  <int>    <dbl>  <dbl>       <dbl>     <dbl>    <dbl>  <dbl>
1 female  5020     37.6   22.7        117.      18.1     26.8   7.90
2 male    4980     35.8   22.0        121.      16.6     26.5   6.81
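Every summary in the step above passes na.rm = TRUE. The short sketch below shows why, using a made-up vector of systolic readings rather than the NHANES data: a single missing value makes the default result NA.

```r
# Hypothetical systolic blood pressure readings with one missing value
systolic <- c(114, NA, 118, 106)

# By default, any NA makes the result NA
mean_default <- mean(systolic)
mean_default                 # [1] NA

# na.rm = TRUE drops missing values before computing the mean
mean_complete <- mean(systolic, na.rm = TRUE)
mean_complete                # (114 + 118 + 106) / 3

# Always check how many values were dropped
n_missing <- sum(is.na(systolic))
n_missing                    # [1] 1
```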
# 4. Visualize the data
library(ggplot2)
ggplot(NHANES, aes(x = Gender, y = BPSys1, fill = Gender)) +
  geom_boxplot(alpha = 0.7) +
  geom_jitter(width = 0.2, alpha = 0.2) +
  labs(title = "Systolic Blood Pressure by Gender",
       y = "Systolic Blood Pressure (mmHg)",
       x = "Gender") +
  theme_minimal() +
  theme(legend.position = "none")
# 5. Conduct statistical test
# Remove rows with missing blood pressure data
nhanes_clean <- NHANES %>%
filter(!is.na(BPSys1))
# Compare systolic blood pressure between genders
t_test <- t.test(BPSys1 ~ Gender, data = nhanes_clean)
print(t_test)
Welch Two Sample t-test
data: BPSys1 by Gender
t = -9.3, df = 8172, p-value <2e-16
alternative hypothesis: true difference in means between group female and group male is not equal to 0
95 percent confidence interval:
-4.332 -2.829
sample estimates:
mean in group female mean in group male
117.3 120.9
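The object returned by t.test() is a list, so individual results can be pulled out by name instead of being read off the printed output. The sketch below uses simulated data, not NHANES, so the means, standard deviations, and group names are assumptions made purely for illustration.

```r
# Simulated blood pressure values for two hypothetical groups
set.seed(553)
group_a <- rnorm(50, mean = 117, sd = 18)
group_b <- rnorm(50, mean = 121, sd = 17)

# Welch two-sample t-test (the default when var.equal is not set)
result <- t.test(group_a, group_b)

# Extract individual pieces of the result by name
t_statistic <- result$statistic   # t value
p_value <- result$p.value         # p-value
ci <- result$conf.int             # 95% confidence interval for the difference
group_means <- result$estimate    # the two sample means

p_value
ci
group_means
```

Extracting values this way is convenient when results need to go into a table or report.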
Free Online Resources:
Interactive Learning:
install.packages("swirl")
library(swirl)
swirl()
R Objects: Use <- to assign values to named objects
Vectors: Combine multiple values with c()
Functions: Call functions with syntax function_name(argument1, argument2)
Data Frames: The primary structure for statistical analysis in R
Scripts: Always write and save your code in scripts for reproducibility
Help: Use ?function_name or help(function_name) for documentation
Comments: Use # to explain your code for future reference
Working Directory: Understand where R looks for files with getwd() and setwd()
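As a brief illustration of the working directory point, the sketch below checks where R is currently looking and builds a file path from it. The folder and file names ("data", "myfile.csv") are hypothetical.

```r
# Where is R currently looking for files?
current_dir <- getwd()
current_dir

# Build a path in a way that works on any operating system
data_path <- file.path(current_dir, "data", "myfile.csv")   # hypothetical file
data_path

# Check whether the file is really there before trying to read it
file_found <- file.exists(data_path)
file_found
```

If file_found is FALSE, either the file name is wrong or the working directory is not where you expect, which setwd() or an RStudio project can fix.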
sessionInfo()
R version 4.5.2 (2025-10-31)
Platform: aarch64-apple-darwin20
Running under: macOS Tahoe 26.2
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.1
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: America/New_York
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] ggplot2_4.0.1 dplyr_1.1.4 NHANES_2.1.0
loaded via a namespace (and not attached):
[1] vctrs_0.6.5 cli_3.6.5 knitr_1.51 rlang_1.1.6 xfun_0.55
[6] otel_0.2.0 generics_0.1.4 S7_0.2.1 jsonlite_2.0.0 labeling_0.4.3
[11] glue_1.8.0 htmltools_0.5.9 scales_1.4.0 rmarkdown_2.30 grid_4.5.2
[16] evaluate_1.0.5 tibble_3.3.0 fastmap_1.2.0 yaml_2.3.12 lifecycle_1.0.4
[21] compiler_4.5.2 RColorBrewer_1.1-3 htmlwidgets_1.6.4 pkgconfig_2.0.3 rstudioapi_0.17.1
[26] farver_2.1.2 digest_0.6.39 R6_2.6.1 tidyselect_1.2.1 utf8_1.2.6
[31] pillar_1.11.1 magrittr_2.0.4 withr_3.0.2 gtable_0.3.6 tools_4.5.2
Last updated: January 20, 2026
This lecture provides the foundation for all the statistical computing we’ll do in EPI 553. In the next lecture, we’ll review biostatistical foundations essential for understanding advanced modeling techniques.