All the regular mathematical operations can be carried out. The usual operators and order of precedence apply.
4 + 5
4 * 5
4 / 5
4^5
sqrt(5)
log(5)
test <- 2*3
test
test = 2/3
test
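To illustrate the order in which operations are evaluated, here is a small sketch. The numbers and the names p1 to p5 are made up for illustration and do not come from the examples above.

```r
# Illustrative values only; the names p1..p5 are hypothetical.
p1 <- 2 + 3 * 4      # multiplication before addition: 2 + 12
p2 <- (2 + 3) * 4    # brackets first: 5 * 4
p3 <- -2^2           # exponent before the minus sign: -(2^2)
p4 <- log(exp(1))    # log() is the natural logarithm
p5 <- log10(100)     # base-10 logarithm
c(p1, p2, p3, p4, p5)
```

If in doubt about the order, adding brackets makes the intention explicit.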
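R uses objects. Among the main objects that appear in this section are numerics, integers, strings, logical values, vectors, matrices and data frames.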
When things go wrong, it is often because an object has the wrong class. For example, you could be trying to add words (“strings”) together. That can only be done with + if you are using an integer or a numeric.
For example, a string:
a <- "My name"
a
## [1] "My name"
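A quick way to check what you are working with is the class function. The sketch below uses made-up values. It also shows, as an assumption about what you might want instead of +, that paste is one common way to join strings.

```r
# class() reports what kind of object you have.
class(4.5)         # "numeric"
class(4L)          # "integer" (the L suffix marks an integer)
class("My name")   # "character" (R's name for strings)
class(TRUE)        # "logical"

# Adding strings with + fails; tryCatch() captures the error message
# so that the script keeps running.
msg <- tryCatch("My" + "name", error = function(e) conditionMessage(e))
msg

# paste() joins strings, separated by a space by default.
joined <- paste("My", "name")
joined
```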
Logical operators compare two values, a and b, and test whether the comparison is true or false.
1 == 1
## [1] TRUE
1 == 2
## [1] FALSE
1 & 2 == 1
## [1] FALSE
1 | 2 == 1
## [1] TRUE
1 > 2
## [1] FALSE
1 < 2
## [1] TRUE
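The results for 1 & 2 == 1 and 1 | 2 == 1 may look surprising. They follow from two rules: == is evaluated before & and |, and any non-zero number counts as TRUE. A sketch with explicit brackets (the names r1 to r4 are only for illustration):

```r
# == binds more tightly than & and |, so r1 and r2 are the same test.
r1 <- 1 & 2 == 1
r2 <- 1 & (2 == 1)    # TRUE & FALSE
# Non-zero numbers are treated as TRUE in logical operations.
r3 <- 1 | (2 == 1)    # TRUE | FALSE
# Brackets change the meaning: (1 & 2) is TRUE, and TRUE == 1 is TRUE.
r4 <- (1 & 2) == 1
c(r1, r2, r3, r4)
```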
Numerics and strings can be concatenated, or combined, to form a vector or matrix.
mynumbers <- c(3,5,6,7,9)
mynumbers
## [1] 3 5 6 7 9
mynumbers <- c(1:10)
mynumbers
## [1] 1 2 3 4 5 6 7 8 9 10
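One point worth keeping in mind is that a vector holds a single class. As a small sketch with made-up values, mixing numbers and strings in c converts everything to strings:

```r
nums <- c(3, 5, 6)
class(nums)                 # "numeric"

mixed <- c(3, "five", 6)    # the numbers are converted to strings
mixed
class(mixed)                # "character"

length(mixed)               # number of elements in the vector
```

This is a common source of the "wrong class" problems mentioned earlier: a single stray string turns a whole column of numbers into text.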
There are a number of functions that can be used to create vectors. For example, to create a sequence, use the seq function.
a <- seq(0.5,2.5, length=100)
a
## [1] 0.5000000 0.5202020 0.5404040 0.5606061 0.5808081 0.6010101 0.6212121
## [8] 0.6414141 0.6616162 0.6818182 0.7020202 0.7222222 0.7424242 0.7626263
## [15] 0.7828283 0.8030303 0.8232323 0.8434343 0.8636364 0.8838384 0.9040404
## [22] 0.9242424 0.9444444 0.9646465 0.9848485 1.0050505 1.0252525 1.0454545
## [29] 1.0656566 1.0858586 1.1060606 1.1262626 1.1464646 1.1666667 1.1868687
## [36] 1.2070707 1.2272727 1.2474747 1.2676768 1.2878788 1.3080808 1.3282828
## [43] 1.3484848 1.3686869 1.3888889 1.4090909 1.4292929 1.4494949 1.4696970
## [50] 1.4898990 1.5101010 1.5303030 1.5505051 1.5707071 1.5909091 1.6111111
## [57] 1.6313131 1.6515152 1.6717172 1.6919192 1.7121212 1.7323232 1.7525253
## [64] 1.7727273 1.7929293 1.8131313 1.8333333 1.8535354 1.8737374 1.8939394
## [71] 1.9141414 1.9343434 1.9545455 1.9747475 1.9949495 2.0151515 2.0353535
## [78] 2.0555556 2.0757576 2.0959596 2.1161616 2.1363636 2.1565657 2.1767677
## [85] 2.1969697 2.2171717 2.2373737 2.2575758 2.2777778 2.2979798 2.3181818
## [92] 2.3383838 2.3585859 2.3787879 2.3989899 2.4191919 2.4393939 2.4595960
## [99] 2.4797980 2.5000000
b <- seq(1,10,1)
b
## [1] 1 2 3 4 5 6 7 8 9 10
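The two calls above show the two common ways of using seq: fixing the number of points or fixing the step size. A minimal sketch with illustrative values follows. The length argument used earlier is matched by R to the full argument name length.out, and rep is included here only as one further example of a vector-building function.

```r
s1 <- seq(0, 1, by = 0.25)        # fixed step size
s2 <- seq(0, 1, length.out = 5)   # fixed number of points
s1
all.equal(s1, s2)                 # both give the same five values

r <- rep(c(1, 2), times = 3)      # repeat a pattern three times
r
```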
mynumbers <- 1:12
m <- matrix(mynumbers, nrow = 4)
m
##      [,1] [,2] [,3]
## [1,]    1    5    9
## [2,]    2    6   10
## [3,]    3    7   11
## [4,]    4    8   12
Extract elements from the matrix by using [x, y], where x is the row and y is the column. Therefore m[1, 1] is the element in the top left-hand corner, m[ , 1] selects the whole first column and m[2, ] selects the whole second row.
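The indexing rules can be tried out directly. The sketch below rebuilds the same 4 by 3 matrix so that it runs on its own. The byrow argument at the end is an extra not used in the text, shown only to contrast with the default column-by-column filling.

```r
m <- matrix(1:12, nrow = 4)   # filled column by column
m[1, 1]     # top left element: 1
m[, 1]      # first column: 1 2 3 4
m[2, ]      # second row: 2 6 10
m[2, 3]     # row 2, column 3: 10
dim(m)      # 4 rows and 3 columns

# byrow = TRUE fills the matrix row by row instead.
m2 <- matrix(1:12, nrow = 4, byrow = TRUE)
m2[1, ]     # first row: 1 2 3
```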
Matrices can only contain one type of object. Data frames can group together vectors of various classes. These will be the most useful for us as we want time series with dates and integers.
Individual components of a vector, matrix or data frame can be identified by using square brackets and a pair of numbers, with the first equal to the row and the second equal to the column. You only need the first number for a vector.
Students <- c('Rob', 'James', 'Sam', 'Jane')
Marks <- c(20, 45, 65, 52)
myclass <- data.frame(Students, Marks)
myclass[, 1]
## [1] "Rob" "James" "Sam" "Jane"
myclass[, 2]
## [1] 20 45 65 52
mean(myclass[, 2])
## [1] 45.5
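To round out the indexing rules, the sketch below recreates the same data and picks out single elements from a vector and from the data frame. The call to str is an addition that is not in the text; it is one convenient way to see the class of each column.

```r
Students <- c('Rob', 'James', 'Sam', 'Jane')
Marks <- c(20, 45, 65, 52)
myclass <- data.frame(Students, Marks)

Marks[2]          # second element of a vector: 45
Marks[c(1, 3)]    # first and third elements: 20 65
myclass[3, 2]     # row 3, column 2: 65
myclass[3, ]      # the whole third row
str(myclass)      # structure: one line per column with its class
```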
It is also possible to select columns of a data frame by name. Use the $ after the data frame name.
myclass$Students
## [1] "Rob" "James" "Sam" "Jane"
myclass$Marks
## [1] 20 45 65 52
myclass$Students == 'Rob'
## [1] TRUE FALSE FALSE FALSE
myclass[myclass$Students == 'Rob', ]
##   Students Marks
## 1      Rob    20
myclass$Marks > 50
## [1] FALSE FALSE TRUE TRUE
myclass[myclass$Marks > 50, ]
##   Students Marks
## 3      Sam    65
## 4     Jane    52
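The same idea extends to picking one column from the selected rows and to combining conditions with &. A minimal sketch, recreating the data so that it runs on its own (the names top and both are only for illustration):

```r
Students <- c('Rob', 'James', 'Sam', 'Jane')
Marks <- c(20, 45, 65, 52)
myclass <- data.frame(Students, Marks)

# Rows with marks above 50, keeping only the Students column.
top <- myclass[myclass$Marks > 50, "Students"]
top

# Mean of the marks that are above 50: (65 + 52) / 2
mean(myclass$Marks[myclass$Marks > 50])

# Two conditions joined with &: marks between 40 and 60.
both <- myclass[myclass$Marks > 40 & myclass$Marks < 60, ]
both
```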
There is a lot of help available. Because R is open source, there is a large community that can help. You might try an AI assistant such as ChatGPT or a dedicated help forum such as Stack Overflow.
Search for functions and assistance. You can also use the help that is built in to R and RStudio.
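As a rough sketch of the built-in help, the calls below are some of the standard entry points. The functions looked up (mean and rnorm) are arbitrary choices for illustration.

```r
# Open the help page for a function (equivalent to typing ?mean).
help("mean")

# List objects whose names contain a piece of text.
found <- apropos("norm")
found

# Show the arguments that a function accepts.
args(rnorm)

# Run the examples from a help page.
example(mean)
```

When you do not know the function name, typing ??topic at the console searches the installed help pages for that topic.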
If you type ?function you will get help on that function.
The plot function will allow you to create graphs with the data that you have. Plotting is the best way to understand the data that you have. It will also help to identify any errors.
The plot function will plot the data. You need two series of the same length. If you get an error, it is frequently because the series are of different lengths. You can add more lines by using the lines function. Here we create two series and plot them.
x <- 1:10
y1 <- x^2
y2 <- 2 * x^2
plot(x, y2)
lines(x, y1)
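To see what the length error looks like, the sketch below deliberately plots two series of different lengths. The series short is made up for this purpose, and tryCatch is used only so that the script carries on after the error.

```r
x <- 1:10
short <- 1:5    # hypothetical series that is too short

# plot() stops with an error when the lengths differ.
msg <- tryCatch(plot(x, short), error = function(e) conditionMessage(e))
msg

# A quick check to run before plotting.
length(x) == length(short)
```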
There are a number of parameters that can be used to customise the plot.
col: colour, which will determine the colour of the lines
lty: line type, which will determine the type of line, dashed or solid
lwd: line width, which will determine the weight of the line
legend: a separate function, called after plot, that will create a legend (more below)
main: will provide a main heading
xlab: will provide a label for the x axis
ylab: will provide a label for the y axis
Using the same x, y1 and y2 variables, see if you can create a plot with a heading and alternative labels for the x and y axes. If you are adventurous, you can change the colour, weight and type of the lines.
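One possible attempt at this exercise is sketched below. The heading, labels, colours and legend text are arbitrary choices, and type = "l" (draw a line rather than points) is an extra argument that is not in the list above.

```r
x <- 1:10
y1 <- x^2
y2 <- 2 * x^2

# Draw y2 as a solid blue line with a heading and axis labels.
plot(x, y2, type = "l", col = "blue", lwd = 2,
     main = "Two quadratic series", xlab = "x value", ylab = "y value")

# Add y1 as a dashed red line.
lines(x, y1, col = "red", lty = 2, lwd = 2)

# legend() is called after the plot has been drawn.
legend("topleft", legend = c("y2 = 2x^2", "y1 = x^2"),
       col = c("blue", "red"), lty = c(1, 2), lwd = 2)
```

Try your own version before comparing; there are many acceptable answers.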
It is also possible to create histograms. Here we use the rnorm function to draw 100 random values from a standard normal distribution. We then calculate the mean and standard deviation of this sample and use them to draw a normal density curve over the histogram.
z <- rnorm(100)                 # 100 draws from a standard normal
hist(z, prob = TRUE, col = 'cornflowerblue')
mu <- mean(z)
sig <- sd(z)
x <- seq(-4, 4, length = 500)
y <- dnorm(x, mu, sig)          # normal density with the sample mean and sd
lines(x, y, col = 'red')
Look up the help for rnorm and work out what it does, together with the alternatives like dnorm and qnorm. Try the qnorm function and then check the answer by using pnorm. Remember that the probability of being below the mid-point of a normal distribution is 50%. Remember that a standard normal distribution has a mean of zero and a standard deviation of one.
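As a hedged sketch of how the four functions relate, the example below uses the standard normal distribution throughout. The seed value and the probabilities 0.5 and 0.975 are arbitrary choices for illustration.

```r
set.seed(1)             # assumption: any seed will do; it makes the draws repeatable
draws <- rnorm(5)       # r: random draws
draws

dnorm(0)                # d: height of the density curve at 0

q50 <- qnorm(0.5)       # q: the value with 50% of the distribution below it
p50 <- pnorm(q50)       # p: probability of being below that value
c(q50, p50)

q975 <- qnorm(0.975)    # roughly 1.96
pnorm(q975)             # pnorm undoes qnorm, giving 0.975 again
```

Since pnorm and qnorm are inverses of each other, feeding the result of one into the other is a simple way to check an answer.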