Lecture 1.3 - Stats 20 - R Data Structures and R Markdown

1. Data Structures

1.1 Scalar

A scalar is a single number. A scalar is zero-dimensional.

4         # a single number
a = 4     # store the value in a variable named a
a * 10    # use the variable in a calculation
half.a = a/2
half.a

1.2 Vectors

A vector is a row of numbers. A vector is one-dimensional.

vector1 = c(1, 3, 5, 9)  # this vector has 4 elements
vector2 = c(2, 5, -1, 3)
vector1 + vector2
vector3 = 1:20           # generate a vector of the integers from 1 to 20
vector4 = rnorm(20)      # this vector has 20 elements drawn from the normal distribution with mean 0 and standard deviation 1
vector5 = seq(from = 0, to = 50, by = 5) # use the seq() function to construct a vector

You can extract individual elements in a vector by using an indexing structure vector[i]. For example:

vector5[10]
vector5[c(4,7)]
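As a quick check of the indexing above, the sketch below rebuilds vector5 and stores what each index expression picks out. The names tenth, pair and rest are illustrative, and the negative-index line is an addition that is not part of the lecture code: a negative index drops an element instead of selecting it.

```r
# vector5 holds 0, 5, 10, ..., 50 (11 elements)
vector5 = seq(from = 0, to = 50, by = 5)

tenth = vector5[10]        # the 10th element: 45
pair  = vector5[c(4, 7)]   # the 4th and 7th elements: 15 30
rest  = vector5[-1]        # addition: every element except the first

tenth
pair
rest
```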

You can do some operations with vectors.

vector = c(1, 3, 5, 9, 0, 12)
length(vector)
mean(vector)
sd(vector)
sqrt(vector)
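The operations above fall into two groups, which the following sketch tries to make visible: functions such as length() and mean() summarize a vector into one number, while arithmetic and sqrt() work element by element and return a vector of the same length. The variable names total, n, avg and roots are illustrative only.

```r
vector1 = c(1, 3, 5, 9)
vector2 = c(2, 5, -1, 3)
total = vector1 + vector2   # element-wise sum: 3 8 4 12

vector = c(1, 3, 5, 9, 0, 12)
n     = length(vector)      # number of elements: 6
avg   = mean(vector)        # (1 + 3 + 5 + 9 + 0 + 12) / 6 = 5
roots = sqrt(vector)        # one square root per element

total
n
avg
roots
```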

1.3 Matrices

Matrices are two-dimensional vectors. To define a matrix, use the function matrix().

?matrix  # get help about function matrix

matrix(data = c(1:20), nrow = 2)
matrix(data = c(1:20), ncol = 2)
matrix(data = c(1:20), nrow = 4, ncol = 5)
matrix(data = c(1:20), ncol = 5, nrow = 4)

# A more complicated example
m <- matrix(1:6, nrow = 2, dimnames = list(c("a", "b"), LETTERS[1:3]))
m
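One detail that the calls above do not spell out is the fill order. By default matrix() fills the matrix column by column; the byrow argument switches this to row by row. A minimal sketch, with by_col and by_row as illustrative names:

```r
by_col = matrix(data = 1:20, nrow = 2)                # default: filled column by column
by_row = matrix(data = 1:20, nrow = 2, byrow = TRUE)  # filled row by row

dim(by_col)    # 2 rows, 10 columns
by_col[, 1]    # first column: 1 2
by_row[1, ]    # first row: 1 2 3 ... 10
```

When only nrow or only ncol is given, R works out the other dimension from the length of the data.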




Matrix operations are similar to vector operations. Elements of a matrix can be addressed in the usual way: [row, column].

a = matrix(data = c(1:20), nrow = 2)
a[1,2]  # extract element in row 1 column 2
a[1,]   # extract all elements in row 1.
a[,5]   # extract all elements in column 5.
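For reference, here is what those three index expressions pick out, given that the matrix is filled column by column. The last line is an addition not in the lecture: it shows that a vector of column numbers can be used inside the brackets as well. The names on the left are illustrative.

```r
a = matrix(data = c(1:20), nrow = 2)

elem = a[1, 2]        # row 1, column 2: 3
row1 = a[1, ]         # all of row 1: 1 3 5 ... 19
col5 = a[, 5]         # all of column 5: 9 10
sub  = a[2, c(1, 3)]  # addition: row 2, columns 1 and 3: 2 6

elem
row1
col5
sub
```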

1.4 Data Frames

A data frame is a matrix with names above the columns. Use the function data.frame() to create a data frame.

?data.frame
Frame = data.frame(one = c(10, 11, 12), two = c( 2, 3, 4), three = c(18, -3, 0))
Frame

You can extract specific columns from a data frame.

Frame$one   # extract column one by specifying the name of the column
Frame[,1]   # extract column one by specifying the location of the column.
Frame[,c(1,3)]  # extract columns one and three

You can do calculations with columns.

mean(Frame$one)  # find the mean
sd(Frame$one)    # find the standard deviation
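Column calculations can also cover the whole data frame at once or create new columns. The sketch below assumes the same Frame as above; the column name total12 is made up for illustration and is not part of the lecture.

```r
Frame = data.frame(one = c(10, 11, 12), two = c(2, 3, 4), three = c(18, -3, 0))

size      = dim(Frame)        # 3 rows, 3 columns
col_means = colMeans(Frame)   # mean of every column: one = 11, two = 3, three = 5

# Add a new column computed from two existing ones (hypothetical name)
Frame$total12 = Frame$one + Frame$two   # 12 14 16

size
col_means
Frame
```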

YOUR TURN

Create an R Markdown script file that constructs two random normal vectors of length 100. Call them x1 and x2. Make a data frame called dframe containing these two vectors as columns. Plot the data by using the plot() function.

1.5 Lists

A list is a collection of vectors that, unlike the columns of a matrix or data frame, don't have to be of the same length or type.

L = list(one = 1, two = c(1, 2), three = seq(0, 1, length = 5), four = c("mary", "bob"))
L

names(L) # get the names of the list L
n = c(2, 3, 5) 
s = c("aa", "bb", "cc", "dd", "ee") 
b = c(TRUE, FALSE, TRUE, FALSE, FALSE) 
x = list(n, s, b)   # x contains copies of n, s, b 
x
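The lecture code builds the lists but does not show how to get elements back out. A minimal sketch of the usual ways, using the same L and x as above: $ and [[ ]] return the element itself, by name or by position. The names on the left (two, second_name, second, lens) are illustrative.

```r
L = list(one = 1, two = c(1, 2), three = seq(0, 1, length.out = 5), four = c("mary", "bob"))

two         = L$two            # element named "two": 1 2
second_name = L[["four"]][2]   # second entry of element "four": "bob"
lens        = sapply(L, length) # length of each element: 1 2 5 2

n = c(2, 3, 5)
s = c("aa", "bb", "cc", "dd", "ee")
b = c(TRUE, FALSE, TRUE, FALSE, FALSE)
x = list(n, s, b)

second = x[[2]]   # x has no names, so use the position: this is the vector s

two
second_name
lens
second
```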

2. R Markdown (continued) - Block, R code chunk and Inline R code.

2.1 Block

Text included in a block will be shown in a block. R code included in a block is treated as text and won't be evaluated.

summary(cars$dist)
summary(cars$speed)

2.2 R code chunk

An R code chunk has the same structure as a block, but with the letter r in curly brackets, {r}, placed right after the opening three backticks.

R code will be evaluated and printed when included in chunks. Example:

# Define the trucks vector with 6 values
trucks <- c(1, 3, 6, 4, 9, 12)

# Graph trucks using blue points overlaid by a line
plot(trucks, type = "o", col = "blue")

# Create a title
title(main = "Autos")

[Plot: trucks drawn as blue points joined by a line, titled "Autos"]

2.3 Inline R code

To embed R code inline with text, enclose it in single backticks and start with the lowercase letter r:  `r `

Example 1 : The names of the variables in the dataset cars are speed, dist.

Example 2: My favorite number is 3.1416.
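The two examples show only the rendered text, not the inline code behind it. The sketch below is a guess at R expressions that give those values; names(cars) and round(pi, 4) are assumptions, since the lecture does not show the source. Either expression could be placed after the r inside the backticks.

```r
# cars is a data set that ships with R (columns speed and dist)
var_names = names(cars)    # "speed" "dist"
favorite  = round(pi, 4)   # pi rounded to 4 decimal places: 3.1416

var_names
favorite
```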

Next week:

1. Simple graphics in R

2. Text Output in R Markdown

https://www.harding.edu/fmccown/r/