Introduction

All the regular mathematical operations can be carried out. The usual operator symbols and order of precedence apply.

4 + 5
4 * 5
4 / 5
4^5

sqrt(5)
log(5)

test <- 2*3
test

test = 2/3
test
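
As a small sketch of the order of precedence mentioned above (the numbers are chosen only for illustration), brackets can be used to override the default order. The example also shows that log gives the natural logarithm unless another function is used.

```r
# Multiplication binds tighter than addition
2 + 3 * 4      # 14
(2 + 3) * 4    # 20

# Exponentiation binds tighter than unary minus
-2^2           # -4
(-2)^2         # 4

# log() is the natural logarithm; log10() uses base 10
log(exp(1))    # 1
log10(100)     # 2

# <- and = both assign at the top level; <- is the more common convention
test <- 2 * 3
test           # 6
```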

Practice

Calculate the average of 2, 5, 6, 8, 9, 15, 1, 3
Calculate the difference between 225 and 200 as a percentage change

Objects

R uses objects. Among the main objects are:

integer
numeric
string (class character in R)
logical (boolean)
vector
data frame
matrix

When things go wrong, it is often the case that you have the wrong class. For example, you could be trying to add words (“strings”) together. The + operator only works on integers and numerics.
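
A minimal sketch of how to check the class of an object, and of what happens when the classes do not fit the operation. The object names a, b and msg are only illustrative.

```r
a <- "5"      # a string (class "character"), even though it looks like a number
b <- 5        # a numeric

class(a)      # "character"
class(b)      # "numeric"
class(5L)     # "integer" (the L suffix asks for an integer)
class(TRUE)   # "logical"

# Adding a string to a number fails; tryCatch captures the error message
msg <- tryCatch(a + b, error = function(e) conditionMessage(e))
msg

# Converting the string to a numeric first makes the addition work
as.numeric(a) + b   # 10
```

Calling class on an object is usually the quickest first check when a calculation fails unexpectedly.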

String

For an example of a string:

a <- "My name"
a

## [1] "My name"

Practice

Assign your first and last name to the objects a and b
Add a to b

Logical (boolean, true or false)

A logical expression tests whether a statement is true or false.

1 == 1

## [1] TRUE

1 == 2

## [1] FALSE

1 & 2 == 1

## [1] FALSE

1 | 2 == 1

## [1] TRUE

1 > 2

## [1] FALSE

1 < 2

## [1] TRUE
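
The results for 1 & 2 == 1 and 1 | 2 == 1 above can look surprising. The sketch below shows why: == is evaluated before & and |, and any non-zero number counts as TRUE when used in a logical test. The extra operators at the end are not used above and are included only for reference.

```r
# == is evaluated before & and |, so these two lines are equivalent
1 & 2 == 1      # FALSE
1 & (2 == 1)    # FALSE: TRUE & FALSE

# Likewise for |
1 | 2 == 1      # TRUE
1 | (2 == 1)    # TRUE: TRUE | FALSE

# Brackets change the meaning: (1 & 2) is TRUE, and TRUE == 1 is TRUE
(1 & 2) == 1    # TRUE

# Other useful operators
1 != 2          # TRUE  (not equal)
!TRUE           # FALSE (negation)
2 >= 2          # TRUE  (greater than or equal)
```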

Vectors and matrices

Numerics and strings can be concatenated, or combined, to form a vector or matrix. The c function does the concatenation.

mynumbers <- c(3,5,6,7,9)
mynumbers

## [1] 3 5 6 7 9

mynumbers <- c(1:10)
mynumbers

##  [1]  1  2  3  4  5  6  7  8  9 10

To create vectors there are a number of functions that can be used. For example, to create a sequence, use the seq function.

a <- seq(0.5, 2.5, length.out = 100)
a

##   [1] 0.5000000 0.5202020 0.5404040 0.5606061 0.5808081 0.6010101 0.6212121
##   [8] 0.6414141 0.6616162 0.6818182 0.7020202 0.7222222 0.7424242 0.7626263
##  [15] 0.7828283 0.8030303 0.8232323 0.8434343 0.8636364 0.8838384 0.9040404
##  [22] 0.9242424 0.9444444 0.9646465 0.9848485 1.0050505 1.0252525 1.0454545
##  [29] 1.0656566 1.0858586 1.1060606 1.1262626 1.1464646 1.1666667 1.1868687
##  [36] 1.2070707 1.2272727 1.2474747 1.2676768 1.2878788 1.3080808 1.3282828
##  [43] 1.3484848 1.3686869 1.3888889 1.4090909 1.4292929 1.4494949 1.4696970
##  [50] 1.4898990 1.5101010 1.5303030 1.5505051 1.5707071 1.5909091 1.6111111
##  [57] 1.6313131 1.6515152 1.6717172 1.6919192 1.7121212 1.7323232 1.7525253
##  [64] 1.7727273 1.7929293 1.8131313 1.8333333 1.8535354 1.8737374 1.8939394
##  [71] 1.9141414 1.9343434 1.9545455 1.9747475 1.9949495 2.0151515 2.0353535
##  [78] 2.0555556 2.0757576 2.0959596 2.1161616 2.1363636 2.1565657 2.1767677
##  [85] 2.1969697 2.2171717 2.2373737 2.2575758 2.2777778 2.2979798 2.3181818
##  [92] 2.3383838 2.3585859 2.3787879 2.3989899 2.4191919 2.4393939 2.4595960
##  [99] 2.4797980 2.5000000

b <- seq(1,10,1)
b

##  [1]  1  2  3  4  5  6  7  8  9 10
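
The two calls to seq above differ in how the sequence is specified: one fixes the number of points, the other fixes the step size. A short sketch of both forms follows, with values chosen only for illustration. The rep function is not used above and is shown as a related helper.

```r
# Fix the step size with by
seq(0, 1, by = 0.25)          # 0.00 0.25 0.50 0.75 1.00

# Fix the number of points with length.out
seq(0, 1, length.out = 5)     # 0.00 0.25 0.50 0.75 1.00

# Count downwards with a negative step
seq(10, 2, by = -2)           # 10 8 6 4 2

# rep() repeats values rather than generating a sequence
rep(c(1, 2), times = 3)       # 1 2 1 2 1 2
```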

mynumbers <- 1:12
m <- matrix(mynumbers, nrow = 4)
m

##      [,1] [,2] [,3]
## [1,]    1    5    9
## [2,]    2    6   10
## [3,]    3    7   11
## [4,]    4    8   12

Extract elements from the matrix by using [x, y], where x is the row and y is the column. Therefore m[1, 1] is the top left-hand corner. m[ , 1] selects the whole of the first column and m[2, ] selects the whole of the second row.
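
The indexing rules can be sketched with the same 4 by 3 matrix. The second matrix, m2, is an illustrative addition that shows the byrow argument, which changes the direction in which the matrix is filled.

```r
m <- matrix(1:12, nrow = 4)   # same matrix as above, filled column by column

m[1, 1]     # 1: first row, first column
m[4, 3]     # 12: last row, last column
m[, 1]      # 1 2 3 4: the whole first column
m[2, ]      # 2 6 10: the whole second row
dim(m)      # 4 3: number of rows, then number of columns

# byrow = TRUE fills the matrix row by row instead
m2 <- matrix(1:12, nrow = 4, byrow = TRUE)
m2[2, ]     # 4 5 6
```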

Practice

Create a matrix with 3 rows and 3 columns with numbers running from 1 to 9
Extract the top right and bottom left numbers

Data frames

Matrices can only contain one type of object. Data frames can group together vectors of various classes. These will be the most useful for us as we want time series with dates and integers.
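
A rough sketch of the difference, assuming a small made-up price series (the names mixed and prices, and all the values, are hypothetical):

```r
# Mixing numbers and text in a matrix coerces everything to character
mixed <- cbind(c(1, 2), c("a", "b"))
class(mixed[1, 1])    # "character"

# A data frame keeps each column's own class
# (the dates and values here are made up for illustration)
prices <- data.frame(
  Date  = as.Date(c("2024-01-01", "2024-01-02", "2024-01-03")),
  Price = c(100L, 102L, 101L)
)
sapply(prices, class)   # Date: "Date", Price: "integer"
str(prices)
```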

Individual components of a vector, matrix or data frame can be identified by using square brackets and a pair of numbers, the first giving the row and the second the column. A vector needs only one number.

Students <- c('Rob', 'James', 'Sam', 'Jane')
Marks <- c(20, 45, 65, 52)
myclass <- data.frame(Students, Marks)
myclass[, 1]

## [1] "Rob"   "James" "Sam"   "Jane"

myclass[, 2]

## [1] 20 45 65 52

mean(myclass[, 2])

## [1] 45.5

It is also possible to select columns of a data frame by name. Use $ after the data frame name.

myclass$Students

## [1] "Rob"   "James" "Sam"   "Jane"

myclass$Marks

## [1] 20 45 65 52

myclass$Students == 'Rob'

## [1]  TRUE FALSE FALSE FALSE

myclass[myclass$Students == 'Rob', ]

##   Students Marks
## 1      Rob    20

myclass$Marks > 50

## [1] FALSE FALSE  TRUE  TRUE

myclass[myclass$Marks > 50, ]

##   Students Marks
## 3      Sam    65
## 4     Jane    52

Practice

Extract the marks for Jane
Extract those marks that are below 50

Getting help

There is a lot of help available. Because R is open source, there is a large community that is willing to help. You might try an AI assistant such as ChatGPT or a dedicated help forum like Stack Overflow.

Search for functions and assistance. Help is also built into R and RStudio.

If you have the cursor on a function and press F1, you will get help on that function.
If you type ? followed by the function name, for example ?mean, you will get help on that function.
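
A short sketch of the built-in help tools, using mean as the example function. The apropos and example functions are not mentioned above; they are included as two further helpers that ship with R.

```r
?mean              # help page for the mean function
help("mean")       # the same, written as a function call

# apropos() lists objects whose names match a pattern,
# useful when you only remember part of a function name
apropos("^mean")

# example() runs the examples given at the bottom of a help page
example("mean")
```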

Practice

Search for the function that will create a histogram
Use RStudio to get an idea about how the function works

Plotting

The plot function creates graphs from your data. Plotting is the best way to understand the data that you have, and it often reveals errors in it.

The plot function needs two series of matching length. If you get an error, it is frequently because the series are of different lengths. You can add more lines by using the lines function. Here we create two series and plot them.

x <- 1:10
y1 <- x^2
y2 <- 2 * x^2
plot(x, y2)
lines(x, y1)
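
The length mismatch mentioned above can be checked before plotting. The following is a minimal sketch in which y_short is a deliberately shortened, hypothetical series and tryCatch is used only so that the error message can be inspected instead of stopping the script.

```r
x <- 1:10
y <- x^2

# Both series must have the same number of elements
length(x) == length(y)    # TRUE

# A deliberately short series: plot() stops with an error
y_short <- y[1:9]
result <- tryCatch(
  plot(x, y_short),
  error = function(e) conditionMessage(e)
)
result    # the error message reported by plot()
```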

There are a number of parameters that can be used to customise the plot.

col: colour of the points or lines
lty: line type, for example solid or dashed
lwd: line width
legend: a separate function, called after plot, that adds a legend (more below)
main: will provide a main heading
xlab: will provide a label for the x axis
ylab: will provide a label for the y axis
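
A sketch of these parameters in use, on a different data set from the one above. The angle variable, the titles and the colour choices are all illustrative, and type = "l" (draw a line rather than points) is an extra argument not listed above.

```r
angle <- seq(0, 2 * pi, length.out = 100)

# type = "l" draws a line instead of points
plot(angle, sin(angle), type = "l", col = "blue", lty = 1, lwd = 2,
     main = "Sine and cosine", xlab = "Angle (radians)", ylab = "Value")

# Add the second series with its own colour, line type and width
lines(angle, cos(angle), col = "red", lty = 2, lwd = 1)

# legend() is called after plot(); its col/lty/lwd should mirror the lines
legend("bottomleft", legend = c("sin", "cos"),
       col = c("blue", "red"), lty = c(1, 2), lwd = c(2, 1))
```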

Practice

Using the same x, y1 and y2 variables, see if you can create a plot with a heading and alternative labels for the x and y axes. If you are adventurous you can change the colour, weight and type of the lines.

Histogram

It is also possible to create histograms. Here we use the rnorm function to draw 100 standard normal random values and calculate their mean and standard deviation. We then use these two estimates to draw a normal density curve over the histogram.

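z <- rnorm(100)                               # 100 draws from a standard normal
hist(z, prob = TRUE, col = 'cornflowerblue')  # density scale rather than counts
mu <- mean(z)
sig <- sd(z)
x <- seq(-4, 4, length.out = 500)             # grid of points for the curve
y <- dnorm(x, mu, sig)                        # normal density at each grid point
lines(x, y, col = 'red')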

Practice

Repeat the last exercise using 100,000 generated normal random variables. What do you notice?
Take a look at the function rnorm and work out what it does, along with the alternatives dnorm, pnorm and qnorm.
Find the midpoint of a standard normal distribution using the qnorm function and then check the answer by using pnorm. Remember that the probability of being below the midpoint of a standard normal distribution is 50%, and that a standard normal distribution has a mean of zero and a standard deviation of one.
