Complete this before the first module (Data Collection & Prep) if you are new to R
This file is optional and ungraded. It exists for one reason: if you have never used R before, working through it before the first required file (Data Collection & Prep) will make next week’s practice significantly less stressful.
If you can already answer yes to all three of these, skip this file:
- Do you know what <- does in R?
- Have you used filter() or select() from the tidyverse before?
Everything you practice here you will do again in 1_Data_Collection_and_Prep.qmd with real data and a real submission. This file is the warm-up.
A variable stores a value so you can use it later. In R, you assign values with <- — read it as “gets.” x <- 10 means “x gets 10.”
x <- 10
y <- 5
z <- x + y
z
[1] 15
# starts a comment. R ignores everything after it on that line. Use comments to explain what your code does.
# Numeric — any number
score <- 87.5
# Character — text, always in quotes
subject <- "Mathematics"
# Logical — TRUE or FALSE (always capitalized)
passed <- TRUE
# Check what type a variable is
class(score)
[1] "numeric"
class(subject)
[1] "character"
class(passed)
[1] "logical"
When you load real data later, R sometimes imports numeric columns as character. Knowing how to check a type (class()) and fix it (as.numeric(), as.character()) is a data cleaning skill you will use often in later modules.
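As a small sketch of that check-and-fix step, the example below uses a made-up value, raw_score, standing in for a number that was imported as text. The variable names and the value are illustrative assumptions, not course data.

```r
# Hypothetical value: a number that arrived as text (note the quotes)
raw_score <- "87.5"
class(raw_score)          # "character", so arithmetic would fail

# Convert the text to a number
score_num <- as.numeric(raw_score)
class(score_num)          # "numeric"
score_num + 2.5           # 90

# as.character() goes the other way, turning a number back into text
as.character(score_num)   # "87.5"
```

If the text is not a number at all (for example "eighty"), as.numeric() returns NA with a warning, which is a useful signal that the column needs a closer look.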
# Create a variable called 'my_score' with any number between 0 and 100.
my_score <- 67
# Create a variable called 'my_subject' with the name of a subject you
# teach or plan to teach.
my_subject <- "ESL"
# Print both variables.
print(my_score)
[1] 67
print(my_subject)
[1] "ESL"
A vector is a sequence of values of the same type. Create one with c() — which stands for “combine.”
# Numeric vector — quiz scores for 5 students
scores <- c(85, 92, 78, 95, 88)
scores
[1] 85 92 78 95 88
# Character vector — student names
students <- c("Maya", "Jordan", "Sam", "Alex", "Riley")
students
[1] "Maya"   "Jordan" "Sam"    "Alex"   "Riley"
# How many items?
length(scores)
[1] 5
mean(scores)     # average
[1] 87.6
median(scores)   # middle value
[1] 88
max(scores)      # highest
[1] 95
min(scores)      # lowest
[1] 78
sum(scores)      # total
[1] 438
# Access a specific item by position (R counts from 1, not 0)
scores[1]   # first item
[1] 85
scores[3]   # third item
[1] 78
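To see why "the same type" matters for vectors, here is a minimal sketch with made-up values. The names mixed and all_numbers are illustrative assumptions. When a vector mixes numbers and text, R converts every item to character.

```r
# Mixing numbers and text: R converts everything to character
mixed <- c(85, "ninety-two", 78)
mixed               # "85" "ninety-two" "78"
class(mixed)        # "character"

# An all-numeric vector stays numeric, so math works on it
all_numbers <- c(85, 92, 78)
class(all_numbers)  # "numeric"
mean(all_numbers)   # 85
```

This is one common way a column of scores ends up as character: a single text entry is enough to change the type of the whole vector.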
# Create a numeric vector called 'weekly_hours' with 5 values between 1–20,
# representing hours 5 students spent on an LMS in one week.
weekly_hours <- c(5,6,7,8,9)
# Calculate the mean and max.
mean(weekly_hours)
[1] 7
max(weekly_hours)
[1] 9
# Access the second item.
weekly_hours[2]
[1] 6
A data frame is like a spreadsheet — rows are observations (students, learners), columns are variables (scores, attendance, time spent).
student_data <- data.frame(
  student_id   = c(101, 102, 103, 104),
  name         = c("Maya", "Jordan", "Sam", "Alex"),
  quiz_score   = c(85, 92, 78, 95),
  time_on_task = c(25, 30, 20, 35)
)
student_data
  student_id   name quiz_score time_on_task
1        101   Maya         85           25
2        102 Jordan         92           30
3        103    Sam         78           20
4        104   Alex         95           35
nrow(student_data)   # number of rows
[1] 4
ncol(student_data)   # number of columns
[1] 4
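# glimpse(), select(), and filter() come from the tidyverse; load it first if it is not already loaded
library(tidyverse)
glimpse(student_data)   # structure: names, types, first values
Rows: 4
Columns: 4
$ student_id   <dbl> 101, 102, 103, 104
$ name         <chr> "Maya", "Jordan", "Sam", "Alex"
$ quiz_score   <dbl> 85, 92, 78, 95
$ time_on_task <dbl> 25, 30, 20, 35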
summary(student_data)   # summary statistics
   student_id        name             quiz_score     time_on_task
 Min.   :101.0   Length:4           Min.   :78.00   Min.   :20.00
 1st Qu.:101.8   Class :character   1st Qu.:83.25   1st Qu.:23.75
 Median :102.5   Mode  :character   Median :88.50   Median :27.50
 Mean   :102.5                      Mean   :87.50   Mean   :27.50
 3rd Qu.:103.2                      3rd Qu.:92.75   3rd Qu.:31.25
 Max.   :104.0                      Max.   :95.00   Max.   :35.00
# Access a column with $
student_data$quiz_score
[1] 85 92 78 95
# Mean of a column
mean(student_data$quiz_score)
[1] 87.5
# Select specific columns with tidyverse select()
student_data |> select(name, quiz_score)
    name quiz_score
1   Maya         85
2 Jordan         92
3    Sam         78
4   Alex         95
# Keep only students who scored above 85
high_scorers <- student_data |>
  filter(quiz_score > 85)
high_scorers
  student_id   name quiz_score time_on_task
1        102 Jordan         92           30
2        104   Alex         95           35
filter() and select() are from the tidyverse. The |> operator means “take this, then do this.” You will use it constantly in the coming weeks. Read student_data |> filter(quiz_score > 85) as: “take student_data, then keep rows where quiz_score is above 85.”
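The "take this, then do this" reading can be checked directly. The sketch below rebuilds the same student_data so it runs on its own, chains two steps with |>, and compares the result with the same steps written as nested function calls. The names piped and nested are illustrative. It assumes the dplyr package is installed; the |> operator itself is built into R 4.1 and later.

```r
library(dplyr)  # provides filter() and select(); also loaded by library(tidyverse)

# Same example data as above
student_data <- data.frame(
  student_id   = c(101, 102, 103, 104),
  name         = c("Maya", "Jordan", "Sam", "Alex"),
  quiz_score   = c(85, 92, 78, 95),
  time_on_task = c(25, 30, 20, 35)
)

# Piped: take student_data, then keep high scores, then keep two columns
piped <- student_data |>
  filter(quiz_score > 85) |>
  select(name, quiz_score)

# Nested: the same steps written inside-out, without the pipe
nested <- select(filter(student_data, quiz_score > 85), name, quiz_score)

identical(piped, nested)  # TRUE
```

Both versions do the same work. The piped version reads top to bottom in the order the steps happen, which is why it tends to be easier to follow as the number of steps grows.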
# Create a data frame called 'my_class' with 5 rows:
# - student_id (any numbers)
# - subject (any subject name)
# - score (numbers between 60–100)
# - attended (TRUE or FALSE)
my_class <- data.frame(
  student_id = c(12345, 54321, 13524, 42531),
  subject    = c("ESL", "Math", "Science", "ELD"),
  score      = c(55, 70, 85, 100),
  attended   = c(TRUE)   # a single value is recycled to every row
)
# Use glimpse() to inspect it.
glimpse(my_class)
Rows: 4
Columns: 4
$ student_id <dbl> 12345, 54321, 13524, 42531
$ subject    <chr> "ESL", "Math", "Science", "ELD"
$ score      <dbl> 55, 70, 85, 100
$ attended   <lgl> TRUE, TRUE, TRUE, TRUE
# Filter to keep only students with score above 75.
my_class |> filter(score > 75)
  student_id subject score attended
1      13524 Science    85     TRUE
2      42531     ELD   100     TRUE
# Select only student_id and score.
my_class |> select(student_id, score)
  student_id score
1      12345    55
2      54321    70
3      13524    85
4      42531   100
In the next module, you will load a real CSV file. Here is the syntax so it is not new when you see it:
# eval: false means this chunk will NOT run — it is just for reading.
# You will use this in later modules with a real file name.
data <- read_csv("data/your_file_name.csv")
glimpse(data)
head(data)
nrow(data)
Older R tutorials use read.csv(). This course uses read_csv() from the tidyverse, which is faster and more consistent. When you see read.csv() in an online example, you can usually swap it for read_csv().
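To try both functions without a course file, the sketch below writes a tiny made-up CSV to a temporary location and reads it back both ways. The file contents and the names csv_path, base_data, and tidy_data are illustrative assumptions. It assumes the readr package is installed.

```r
library(readr)  # provides read_csv(); also loaded by library(tidyverse)

# Write a tiny, made-up CSV to a temporary file so the example is self-contained
csv_path <- tempfile(fileext = ".csv")
writeLines(c("name,quiz_score", "Maya,85", "Jordan,92"), csv_path)

base_data <- read.csv(csv_path)                          # base R
tidy_data <- read_csv(csv_path, show_col_types = FALSE)  # tidyverse (readr)

class(base_data)  # "data.frame"
class(tidy_data)  # includes "tbl_df": a tibble, the tidyverse flavor of a data frame

# Both hold the same values
mean(base_data$quiz_score)  # 88.5
mean(tidy_data$quiz_score)  # 88.5
```

The main visible difference is that read_csv() returns a tibble, which prints more compactly and shows each column's type. The functions you have practiced here, such as glimpse(), filter(), and select(), work on both.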
If you worked through all four practice sections, you are ready for 1_Data_Collection_and_Prep.qmd. What will be new there:
- drop_na() and replace_na()
- mutate()
- arrange()
- |>
- ggplot2
This file is optional and ungraded. Nothing to render or submit. When you are ready, open 1_Data_Collection_and_Prep.qmd.
Because this file is optional and ungraded, no submission is required. If you would like to publish it as part of your e-portfolio to show your R learning journey, you can.
Click the Render button in the toolbar above. A formatted HTML page will appear in your Viewer tab or a new browser window.
Choose any method that fits your portfolio:
| Option | Best for | Link |
|---|---|---|
| Posit Cloud | Quickest — one click from your workspace | Guide |
| RPubs | Free, public, easy to share a link | rpubs.com |
| Quarto Pub | Clean public portfolio pages | Guide |
| GitHub Pages | Best for a professional portfolio | Guide |
If you are building an e-portfolio, publishing all four .qmd files (this one plus the three required files) tells a complete story — from your first R steps to a full learning analytics capstone. A viewer can follow your progression across the semester.
If you have any questions or run into technical issues, post in the course discussion board or contact your instructor.