R is a free, open source software program for statistical computing and analysis.
Things to know about R:
R runs on Windows, Macintosh, and Linux.
R code and results can be combined with narrative text in R Markdown and R Notebook documents.
RStudio is a free, open source IDE (integrated development environment) for R.
Things to know about RStudio:
RStudio can import CSV, Excel, text (.txt), SAS (.sas7bdat), SPSS (.sav), and Stata (.dta) files into R without you having to write the code to do so.
1. Arithmetic Operations: +, -, *, /, ^ or **.
# Addition
2 + 5
# Subtraction
3 - 10
# Multiplication
6 * (-3)
# Division
-9/10
# Exponentiation
2^5
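The operator list above mentions ** as an alternative to ^, but the examples only use ^. The following is a minimal sketch showing that both forms give the same result, and how parentheses change the order of evaluation. The variable names are chosen here for illustration only.

```r
# Exponentiation: R accepts ** as another spelling of ^
pow_caret <- 2^5
pow_stars <- 2 ** 5
pow_caret
pow_stars

# Exponentiation is evaluated before the unary minus
neg_sq <- -2^2      # evaluated as -(2^2)
pos_sq <- (-2)^2    # parentheses make -2 the base
neg_sq
pos_sq
```

When in doubt about the order of evaluation, adding parentheses is the safer choice.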
2. Variables: Assigning values to variables using the <- operator or =.
x <- 5
y <- 3
z <- x + y
z
u <- -2 * z
u
v <- z/u
v
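The examples above only use <-. As a small sketch of the = form mentioned in the heading, the code below assigns the same value both ways and compares the results. The names a, b, and m are illustrative and do not come from the original notes.

```r
# At the top level, = assigns a value just like <-
a = 5
b <- 5
identical(a, b)

# Inside a function call, = names an argument rather than creating a variable
m <- mean(x = c(1, 2, 3))
m
```

Many R style guides recommend <- for assignment and reserve = for function arguments, which keeps the two uses visually distinct.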
3. Vectors: Creating vectors using the c() function.
# Numeric vector
vec_numeric <- c(25, 32, 18, 63, 78)
vec_numeric
# Character vector
vec_charactor <- c("apple", "banana", "orange", "mango", "pawpaw")
vec_charactor
# Logical vector
vec_logical <- c(TRUE, FALSE, TRUE)
vec_logical
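Every element of a vector created with c() has the same type. The sketch below checks the length and type of two of the vectors above (repeated here so the code runs on its own) and shows what happens when types are mixed. The vector vec_mixed is a hypothetical example added for illustration.

```r
# Vectors repeated from the examples above
vec_numeric <- c(25, 32, 18, 63, 78)
vec_logical <- c(TRUE, FALSE, TRUE)

# Number of elements and type of each vector
length(vec_numeric)
class(vec_numeric)
class(vec_logical)

# Mixing types: c() converts all elements to one common type
vec_mixed <- c(1, "apple", TRUE)
vec_mixed
class(vec_mixed)
```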
4. Indexing and Slicing a Vector: Accessing elements
of a vector using square brackets [].
# Pick the second number
vec_numeric[2]
# Pick the second and the fifth numbers
vec_numeric[c(2,5)]
# Pick all except the second and fifth numbers
vec_numeric[-c(2,5)]
# Pick the third fruit
vec_charactor[3]
# Pick the third and fifth fruits
vec_charactor[c(3,5)]
# Pick all except the third and fifth fruits
vec_charactor[-c(3,5)]
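Besides single positions and c(), square brackets also accept a range of positions or a logical condition. This is a minimal sketch using the numeric vector from above, repeated so the code is self-contained. The names first_three and above_30 are illustrative.

```r
# Numeric vector repeated from the vectors example
vec_numeric <- c(25, 32, 18, 63, 78)

# Pick the first three numbers using a range
first_three <- vec_numeric[1:3]
first_three

# Pick the numbers greater than 30 using a logical condition
above_30 <- vec_numeric[vec_numeric > 30]
above_30

# Positions of the numbers greater than 30
which(vec_numeric > 30)
```

Indexing with a condition is the same mechanism used to subset the rows of a data frame.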
5. Data Frames: Creating and working with data frames, which are like tables.
# Data frame with age, height (ht), weight (wt), and sex
dat <- data.frame(age = c(25, 30, 35, 28, 22, 27, 33, 29, 31, 24),
                  ht = c(175, 160, 180, 165, 170, 168, 175, 162, 178, 160),
                  wt = c(70, 55, 80, 60, 65, 68, 75, 58, 82, 50),
                  sex = c("Male", "Female", "Male", "Female", "Male", "Male", "Female", "Female", "Male", "Female"))
dat
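One common way of working with a data frame is to add a column computed from existing ones. The sketch below uses the first three rows of the data above, stored under the hypothetical name small_dat, and adds a body mass index column. It assumes that ht is measured in centimetres and wt in kilograms, which the original notes do not state.

```r
# First three rows of the data frame above, repeated so the sketch runs on its own
small_dat <- data.frame(age = c(25, 30, 35),
                        ht = c(175, 160, 180),
                        wt = c(70, 55, 80),
                        sex = c("Male", "Female", "Male"))

# Column names of the data frame
names(small_dat)

# Add a derived column: body mass index
# ASSUMPTION: ht is in centimetres and wt is in kilograms
small_dat$bmi <- small_dat$wt / (small_dat$ht / 100)^2

# Dimensions after adding the new column, then the full data frame
dim(small_dat)
small_dat
```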
6. Accessing Elements of a Data Frame: Accessing
elements of a data frame using square brackets [] or the
$ operator.
Conditions for subsetting data:
Equality (==), inequality (!=), greater than (>), greater than or equal to (>=), less than (<), and less than or equal to (<=).
The & operator is used for AND conditions, the | operator for OR conditions, and the ! operator for NOT conditions.
# Accessing the entire 'age' column
ages <- dat$age
ages
# Accessing the first three elements of the height column ('ht')
first_three_ht <- dat$ht[1:3]
first_three_ht
# Accessing the value in the second row of the weight column ('wt', the third column)
wt_second_row <- dat[2, "wt"]
wt_second_row
# Accessing the first two columns ('age' and 'ht')
first_two_columns <- dat[, 1:2]
first_two_columns
# Accessing the last two columns ('wt' and 'sex')
last_two_columns <- dat[, 3:4]
last_two_columns
# Accessing specific columns by index ('ht' and 'sex')
height_gender <- dat[, c(2, 4)]
height_gender
# Rows where age is at least 30
subset_data_1 <- dat[dat$age >= 30, ]
subset_data_1
# Rows where sex is "Male"
subset_data_2 <- dat[dat$sex == "Male", ]
subset_data_2
# Rows where age is below 30 AND sex is "Female"
subset_data_3 <- dat[dat$age < 30 & dat$sex == "Female", ]
subset_data_3
# Rows where weight is above 50 AND at most 75
subset_data_4 <- dat[dat$wt > 50 & dat$wt <= 75, ]
subset_data_4
# Rows where weight is below 60 OR above 75, keeping only the 'age', 'sex', and 'ht' columns
subset_data_5 <- dat[dat$wt < 60 | dat$wt > 75, c("age", "sex", "ht")]
subset_data_5
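The examples above cover ==, the ordering comparisons, & and |, but not != or !. The following sketch fills that gap using the same data, repeated so the code runs on its own. The names not_male and not_under_30 are illustrative.

```r
# Same data frame as above
dat <- data.frame(age = c(25, 30, 35, 28, 22, 27, 33, 29, 31, 24),
                  ht = c(175, 160, 180, 165, 170, 168, 175, 162, 178, 160),
                  wt = c(70, 55, 80, 60, 65, 68, 75, 58, 82, 50),
                  sex = c("Male", "Female", "Male", "Female", "Male",
                          "Male", "Female", "Female", "Male", "Female"))

# Inequality: rows where sex is not "Male"
not_male <- dat[dat$sex != "Male", ]
not_male

# NOT: negate a whole condition, here "age is below 30"
not_under_30 <- dat[!(dat$age < 30), ]
not_under_30
```

Negating age < 30 selects the same rows as the condition age >= 30 used earlier.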
One way to read a CSV file is to use read.table(). Most people prefer to use read.csv(), which is a wrapper around read.table() with the sep argument preset to a comma (,).
The value returned by read.table() is a data.frame.
The example below reads a CSV file from your local computer into R using the test data set. Download the test data set (test.csv) to your local computer first.
# Set your working directory to the folder on your local computer that contains the test data set.
# How to set your working directory: On the top RStudio menu, click on “Session”, then “Set Working Directory”, then “Choose Directory”.
setwd("D:/Year_2024/Documents/Fall_2024/MA223")
# Read data into R using the read.csv() function.
test_dat_loc <- read.csv("test.csv", header = TRUE, stringsAsFactors = TRUE)
test_dat_loc
# Read the same data set directly from a URL instead of a local file.
url <- "https://raw.githubusercontent.com/sylvadon4/data_sets/main/test.csv"
test_dat_web <- read.csv(url, header = TRUE, stringsAsFactors = TRUE)
test_dat_web
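To illustrate the relationship between read.csv() and read.table(), the sketch below writes a small CSV file to a temporary location and reads it both ways. The file contents and the names tmp, via_csv, and via_table are made up for illustration and are not related to the test data set.

```r
# Write a small CSV to a temporary file (hypothetical data, for illustration only)
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,group,score",
             "1,a,10.5",
             "2,b,8.0",
             "3,a,9.25"), tmp)

# read.csv() presets sep = "," and header = TRUE
via_csv <- read.csv(tmp, stringsAsFactors = TRUE)

# The same file read with read.table(), stating the separator and header explicitly
via_table <- read.table(tmp, header = TRUE, sep = ",", stringsAsFactors = TRUE)

# Both calls return a data.frame with the same contents
class(via_csv)
all.equal(via_csv, via_table)
str(via_csv)

# Remove the temporary file
unlink(tmp)
```

With stringsAsFactors = TRUE, the text column group is read as a factor rather than as character strings.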
R comes with several built-in data sets. To see the list of
pre-loaded data, type the function data():
data()
Load the ChickWeight data as follows:
data(ChickWeight)
# To get details about this data set, remove the hash sign (#) and run the code.
# ?ChickWeight
head() for the first few rows of a matrix or data frame.
tail() for the last few rows of a matrix or data frame.
dim() for the dimensions of a matrix or data frame.
str() for displaying the structure of an R object.
nrow() for the number of rows of a matrix or data frame.
ncol() for the number of columns of a matrix or data frame.
summary() for numeric variables.
quantile() for quartiles.
table() for categorical variables.
sum(is.na()) for counting the number of NAs in the entire dataset.
If you need to change the data type for any column, use the following functions:
as.character() converts to a text string.
as.numeric() converts to a number.
as.factor() converts to a categorical variable.
as.integer() converts to an integer.
url <- "https://raw.githubusercontent.com/sylvadon4/data_sets/main/test.csv"
test_dat_web <- read.csv(url, header = TRUE, stringsAsFactors = TRUE)
test_dat_web
head(test_dat_web)
tail(test_dat_web)
dim(test_dat_web)
str(test_dat_web)
summary(test_dat_web)
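Several of the functions listed above (nrow(), ncol(), quantile(), table(), sum(is.na()), and the as.*() conversions) are not used in the example. The sketch below applies them to the built-in ChickWeight data so that it runs without a download. The variable names are chosen here for illustration.

```r
# ChickWeight ships with R, so no download is needed
n_rows <- nrow(ChickWeight)
n_cols <- ncol(ChickWeight)
c(n_rows, n_cols)

# Quartiles of a numeric column
weight_quartiles <- quantile(ChickWeight$weight)
weight_quartiles

# Counts for a categorical column
diet_counts <- table(ChickWeight$Diet)
diet_counts

# Number of missing values in the entire data set
n_missing <- sum(is.na(ChickWeight))
n_missing

# Type conversions on individual columns
time_int <- as.integer(ChickWeight$Time)
time_fct <- as.factor(ChickWeight$Time)
diet_chr <- as.character(ChickWeight$Diet)
class(time_int)
class(time_fct)
class(diet_chr)
```

The conversions return new vectors. To change a column in place, assign the result back to it, for example dat$sex <- as.factor(dat$sex) for a data frame named dat.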
An R package is a collection of R functions, data, and compiled code designed to solve specific problems or provide additional functionality to the R programming language. Packages are crucial for extending R’s capabilities beyond its base functionality.
To install an R package, you can use the
install.packages() function. Open your R console or script
and type:
install.packages("package_name")
Replace package_name with the name of the package you
want to install. If you need to install multiple packages, you can pass
a vector of package names.
install.packages(c("package1", "package2", "package3"))
Once installed, you need to load the package into your R session
using the library() function. This makes the package’s
functions and features available for use.
library(package_name)
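Calling install.packages() every time a script runs downloads the package again. One possible pattern, sketched below, is to install only when the package is missing. The helper install_if_missing() is hypothetical and not part of base R; it relies on requireNamespace(), which reports whether a package can be loaded.

```r
# Hypothetical helper (not part of base R): install a package only if it is missing
install_if_missing <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # Report whether the package can be loaded now
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# "stats" ships with R, so this call does not trigger an installation
install_if_missing("stats")

# When the package name is stored as a string, library() needs character.only = TRUE
pkg_name <- "stats"
library(pkg_name, character.only = TRUE)
```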
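As an example, the code below loads the ISLR2 package in R. Remove the hash sign (#) from the first line to install the package first. Once it is installed, its documentation can be opened with help(package = "ISLR2").
# install.packages("ISLR2")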
library(ISLR2)
Note: A package can be removed using
remove.packages("package_name").
Southeast Missouri State University, ethompson@semo.edu