R Programming for ABSOLUTE beginners

This RPubs document shows the code used in “R programming for ABSOLUTE beginners,” an excellent introduction to R code from Dr. Greg Martin’s R Programming 101 YouTube channel. See the video at: https://youtu.be/FY8BISK5DpM?si=TWpU_N_q6y6UAD--

Assigning values to variables.

5 + 6

## [1] 11

a <- 5
b <- 6
a + b

## [1] 11

sum(a,b)

## [1] 11

name <- c("Greg", "Gill")
name

## [1] "Greg" "Gill"

name <- c("Greg", "Paul", "Kim")
age <- c(47,52,34)
gender <- c("M","M","F")

name

## [1] "Greg" "Paul" "Kim"

age

## [1] 47 52 34

gender

## [1] "M" "M" "F"

Creating a data frame using the data.frame() function.

I used the head() function to show the data frame in the output here. Martin simply clicks on the data frame in RStudio and shows it to you there.

friends <- data.frame(name, age, gender)
head(friends)

##   name age gender
## 1 Greg  47      M
## 2 Paul  52      M
## 3  Kim  34      F
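If you are not working in RStudio, a few base R functions can stand in for clicking on the data frame. The block below is a minimal sketch that rebuilds friends so it runs on its own, then asks R for the number of rows, the number of columns, the column names and a compact summary.

```r
# Rebuild the friends data frame so this block runs on its own.
name <- c("Greg", "Paul", "Kim")
age <- c(47, 52, 34)
gender <- c("M", "M", "F")
friends <- data.frame(name, age, gender)

nrow(friends)   # number of rows: 3
ncol(friends)   # number of columns: 3
names(friends)  # column names: "name" "age" "gender"
str(friends)    # compact summary of each column's type and first values
```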

An aside on a concept Martin doesn’t explain: A data frame is a grid. Horizontal sections of the grid are called “rows.” Vertical sections are called “columns.” Each data point is stored at the intersection of a row and a column. In the friends data frame, for example, Martin’s first name, “Greg,” is stored at the intersection of the first row and the first column.
Another aside: When you are using a spreadsheet, like Excel, these intersections are called “cells.” Each has a “cell address,” like A1 for the cell at the intersection of Column A and Row 1. Most of what you do in a spreadsheet involves working with data in individual cells. R is designed to work with whole columns at once. It’s a slightly different approach. But, as you’ll see, it’s often much, much faster.
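Here is a small sketch of what working with whole columns looks like. It is not from Martin’s video. It rebuilds the friends data frame, then applies a single expression to every value in the age column. The column name age_next_year is one I made up for illustration.

```r
# Rebuild the friends data frame so this block runs on its own.
friends <- data.frame(
  name = c("Greg", "Paul", "Kim"),
  age = c(47, 52, 34),
  gender = c("M", "M", "F")
)

# One expression acts on every value in the age column at once.
mean(friends$age)   # average of all three ages
friends$age + 1     # everyone's age next year: 48 53 35

# Store the result as a new column (age_next_year is a hypothetical name).
friends$age_next_year <- friends$age + 1
friends
```

In a spreadsheet, you would typically type a formula into one cell and copy it down the column. Here, one line covers every row.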
One last aside: In data journalism, it’s rare to build a data frame manually like this. Most of the time, you import data that someone else has already assembled.
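To give a rough idea of what importing looks like, the sketch below writes a tiny CSV file to a temporary location and reads it back with base R’s read.csv() function. The file contents and the names csv_path and friends_imported are assumptions made for this example. In real work, you would point read.csv() at a file someone else has already assembled.

```r
# Write a small CSV to a temporary file so there is something to import.
# (csv_path and friends_imported are hypothetical names for this example.)
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c("name,age,gender",
    "Greg,47,M",
    "Paul,52,M",
    "Kim,34,F"),
  csv_path
)

# read.csv() reads the file into a data frame.
# By default, the first line of the file supplies the column names.
friends_imported <- read.csv(csv_path)
head(friends_imported)
```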

Subsetting (the hard way)

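Selecting various pieces of the data frame using base R code. Inside the square brackets, the part before the comma picks rows, and the part after the comma picks columns. Leaving either part blank means “all.” Note: This is the hard way to do things. It comes in handy sometimes. But you’ll see an easier way in a moment.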

# Show all rows in the data frame's "name" column:
friends$name

## [1] "Greg" "Paul" "Kim"

# Show all rows and columns in the data frame
friends[ , ]

##   name age gender
## 1 Greg  47      M
## 2 Paul  52      M
## 3  Kim  34      F

# Show all columns in the data frame's first row
friends[1, ]

##   name age gender
## 1 Greg  47      M

# Show the first row of the first column
friends[1,1]

## [1] "Greg"

# Show rows 1 through 3 of the first column (Same result as friends$name)
friends[1:3,1]

## [1] "Greg" "Paul" "Kim"

# Show columns 1 through 3 of the first row (Same result as friends[1, ])
friends[1,1:3]

##   name age gender
## 1 Greg  47      M

# Show all rows in the first two columns for which age is less than 50
friends[friends$age<50,1:2]

##   name age
## 1 Greg  47
## 3  Kim  34
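It may help to see what friends$age<50 produces by itself. The sketch below stores that result in a variable I’ve called under_50 (a name chosen for illustration), then uses it to pick rows. It also shows that you can pick columns by name instead of by position.

```r
# Rebuild the friends data frame so this block runs on its own.
friends <- data.frame(
  name = c("Greg", "Paul", "Kim"),
  age = c(47, 52, 34),
  gender = c("M", "M", "F")
)

# The comparison returns one TRUE or FALSE per row.
under_50 <- friends$age < 50   # under_50 is a hypothetical name
under_50                       # TRUE FALSE TRUE

# R keeps only the rows where the value is TRUE.
friends[under_50, 1:2]

# Same selection, naming the columns instead of counting them.
friends[under_50, c("name", "age")]
```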

Subsetting (the easy way)

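A much easier way to select pieces of the data frame: Use the tidyverse, a collection of packages that includes dplyr. The dplyr package supplies the select(), filter() and arrange() functions used below. The tidyverse makes many other things in R easier to do, too.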

if (!require("tidyverse"))
  install.packages("tidyverse")
library(tidyverse)

friends %>% 
  select(name,age) %>% 
  filter(age < 50) %>% 
  arrange(age)

##   name age
## 1  Kim  34
## 2 Greg  47
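For comparison, here is a sketch of roughly the same steps written the hard way, in base R only. It is my own translation, not code from the video, and the names under_50 and result are made up for the example.

```r
# Rebuild the friends data frame so this block runs on its own.
friends <- data.frame(
  name = c("Greg", "Paul", "Kim"),
  age = c(47, 52, 34),
  gender = c("M", "M", "F")
)

# Keep the name and age columns for rows where age is less than 50.
under_50 <- friends[friends$age < 50, c("name", "age")]

# order() returns the row positions that sort age from lowest to highest.
result <- under_50[order(under_50$age), ]
result
```

The rows come out in the same order, Kim and then Greg. One difference: base R keeps the original row labels, 3 and 1, while the tidyverse output above renumbers them 1 and 2.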

R Programming for ABSOLUTE beginners

Dr. Ken Blake

2024-09-02
