1. Vector

A vector is a one-dimensional array whose elements all share one type.

c(1, 1.3, 8)
## [1] 1.0 1.3 8.0
x = 2:5
x
## [1] 2 3 4 5
x^2 + 1/4
## [1]  4.25  9.25 16.25 25.25

The class() function returns the class (type) of an object.

class(x)
## [1] "integer"
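As a rough illustration of how class() responds to different inputs, the sketch below calls it on a few small vectors. The values are made up for this example and are not part of the notes above.

```r
# Illustrative values only; these vectors are assumptions for the example.
class(c(1, 1.3, 8))    # "numeric": the vector holds non-integer values
class(2:5)             # "integer": the colon operator produces integers
class(2:5 / 2)         # "numeric": division returns doubles
class(c("a", "b"))     # "character"
class(c(TRUE, FALSE))  # "logical"
```

This is also why x^2 + 1/4 above prints decimals even though x itself is an integer vector.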

The set.seed() function fixes the state of the random number generator, which guarantees that the same random values are produced each time the code runs.

set.seed(2021)
y = sample(1:50)
y
##  [1]  7 38 46 39 12  6 49 44  5 47 23 48 18  3 26 22 31 19  4 21 35 42  9 45 43
## [26] 11 36 27 30 40 15  2 24 16 20  1  8 17 34 29 32 10 41 37 50 33 13 25 28 14
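One way to check the reproducibility claim is to draw the same sample twice with the same seed and compare the results. The variable names first, second, and third are assumptions made for this sketch.

```r
set.seed(2021)
first = sample(1:50)

set.seed(2021)             # reset the generator to the same state
second = sample(1:50)

identical(first, second)   # TRUE: same seed, same permutation

third = sample(1:50)       # no reseed: the random stream simply continues,
                           # so this draw will in practice differ from first
```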

The summary() function computes summary statistics for an object. For a numeric vector it returns the minimum, first quartile, median, mean, third quartile, and maximum.

summary(y)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1.00   13.25   25.50   25.50   37.75   50.00
class(y)
## [1] "integer"
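The result of summary() on a numeric vector is a named object, so individual statistics can be pulled out by name. A minimal sketch, using a small made-up vector v (an assumption for this example):

```r
v = c(2, 4, 4, 10)   # hypothetical data
s = summary(v)

names(s)             # "Min." "1st Qu." "Median" "Mean" "3rd Qu." "Max."
s["Median"]          # 4
s["Mean"]            # 5
```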

The str() function compactly displays the structure of an R object: its type, its length or dimensions, and its first few values.

str(y)
##  int [1:50] 7 38 46 39 12 6 49 44 5 47 ...

2. Data Frame

The head(x, n) function returns the first n rows of the data set x (6 rows by default).

head(cars, 8)
##   speed dist
## 1     4    2
## 2     4   10
## 3     7    4
## 4     7   22
## 5     8   16
## 6     9   10
## 7    10   18
## 8    10   26
class(cars)
## [1] "data.frame"
summary(cars)
##      speed           dist       
##  Min.   : 4.0   Min.   :  2.00  
##  1st Qu.:12.0   1st Qu.: 26.00  
##  Median :15.0   Median : 36.00  
##  Mean   :15.4   Mean   : 42.98  
##  3rd Qu.:19.0   3rd Qu.: 56.00  
##  Max.   :25.0   Max.   :120.00
summary(cars$speed)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     4.0    12.0    15.0    15.4    19.0    25.0
summary(cars$dist)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    2.00   26.00   36.00   42.98   56.00  120.00

The $ operator extracts a named element from a list or data frame; cars$speed is the speed column of cars as a vector.

speed = cars$speed
mean(speed)
## [1] 15.4
median(speed)
## [1] 15
min(speed)
## [1] 4
max(speed)
## [1] 25

The quantile(x, probs) function returns the sample quantiles of x. By default it returns the quartiles (0%, 25%, 50%, 75%, 100%); the probs argument sets which probabilities are used.

quantile(speed)
##   0%  25%  50%  75% 100% 
##    4   12   15   19   25
quantile(speed, probs=seq(0,1, by = 0.20))
##   0%  20%  40%  60%  80% 100% 
##    4   11   14   17   20   25
sd(speed)
## [1] 5.287644
var(speed)
## [1] 27.95918
dist = cars$dist
# correlation between speed and dist
cor(speed, dist)
## [1] 0.8068949
str(cars)
## 'data.frame':    50 obs. of  2 variables:
##  $ speed: num  4 4 7 7 8 9 10 10 10 11 ...
##  $ dist : num  2 10 4 22 16 10 18 26 34 17 ...
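The probs argument of quantile() used above accepts any vector of probabilities between 0 and 1. The sketch below is one way to request other cut points; the name deciles is an assumption for this example.

```r
speed = cars$speed

# Only the 10th and 90th percentiles
quantile(speed, probs = c(0.1, 0.9))

# Deciles: 11 cut points delimit 10 groups
deciles = quantile(speed, probs = seq(0, 1, by = 0.1))
length(deciles)   # 11

# The 50% quantile is the median
quantile(speed, probs = 0.5) == median(speed)   # TRUE
```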

The dim() function retrieves or sets the dimensions of an object.

The View() function invokes a spreadsheet-style data viewer within RStudio.

dim(cars)
## [1] 50  2
View(cars)

3. Data Subsets

cars[1, ] returns the first row with all columns.

cars[, 1] returns the first column for all rows; the result is a plain vector rather than a data frame.

cars[, 1:2] returns all rows of columns 1 through 2.

cars[1,]
##   speed dist
## 1     4    2
cars[,1]
##  [1]  4  4  7  7  8  9 10 10 10 11 11 12 12 12 12 13 13 13 13 14 14 14 14 15 15
## [26] 15 16 16 17 17 17 18 18 18 18 19 19 19 20 20 20 20 20 22 23 24 24 24 24 25
cars[,1:2]
##    speed dist
## 1      4    2
## 2      4   10
## 3      7    4
## 4      7   22
## 5      8   16
## 6      9   10
## 7     10   18
## 8     10   26
## 9     10   34
## 10    11   17
## 11    11   28
## 12    12   14
## 13    12   20
## 14    12   24
## 15    12   28
## 16    13   26
## 17    13   34
## 18    13   34
## 19    13   46
## 20    14   26
## 21    14   36
## 22    14   60
## 23    14   80
## 24    15   20
## 25    15   26
## 26    15   54
## 27    16   32
## 28    16   40
## 29    17   32
## 30    17   40
## 31    17   50
## 32    18   42
## 33    18   56
## 34    18   76
## 35    18   84
## 36    19   36
## 37    19   46
## 38    19   68
## 39    20   32
## 40    20   48
## 41    20   52
## 42    20   56
## 43    20   64
## 44    22   66
## 45    23   54
## 46    24   70
## 47    24   92
## 48    24   93
## 49    24  120
## 50    25   85

cars$dist is the dist column of cars as a vector; its first three values are shown below.

cars$dist[1:3]
## [1]  2 10  4

The c() function (short for combine) builds the vector of row numbers 2, 8, 4. The values of column 2 in those rows are returned, in the order requested.

cars[c(2,8,4), 2]
## [1] 10 26 22
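The same selection can be written with a column name instead of a column number. A short sketch of the variants follows; drop = FALSE is a standard argument of data frame indexing that keeps the result as a data frame.

```r
# Column by name: same values as cars[c(2, 8, 4), 2]
cars[c(2, 8, 4), "dist"]

# drop = FALSE keeps a one-column data frame instead of a vector
cars[c(2, 8, 4), "dist", drop = FALSE]

# Leaving the column index empty returns both columns;
# rows come back in the order requested (2, 8, 4)
cars[c(2, 8, 4), ]
```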

A minus sign in front of c() excludes the listed positions; here rows 2 through 49 are dropped, leaving rows 1 and 50.

cars[-c(2:49),]
##    speed dist
## 1      4    2
## 50    25   85
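Negative indexing works the same way on plain vectors. The sketch below uses a made-up vector v (an assumption for this example) to show a few forms.

```r
v = c(10, 20, 30, 40, 50)   # hypothetical data

v[-1]          # 20 30 40 50 : everything except the first element
v[-c(2, 4)]    # 10 30 50    : drop positions 2 and 4
v[-(2:4)]      # 10 50       : drop positions 2 through 4

# The parentheses matter: -2:4 is the sequence -2, -1, 0, ..., 4,
# not "minus the positions 2 to 4".

nrow(cars[-(2:49), ])   # 2 : only rows 1 and 50 remain
```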

The which() function returns the positions at which a condition is TRUE. The related which.max() function returns the position of the (first) maximum value.

which.max(cars$speed)
## [1] 50
cars[which.max(cars$speed),"dist"]
## [1] 85
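To make the difference between which() and which.max() concrete, here is a small sketch on the cars data. which() can return several positions, while which.max() and which.min() return a single one.

```r
# All rows where speed exceeds 23
which(cars$speed > 23)               # 46 47 48 49 50

# Two ways to locate the largest stopping distance
which(cars$dist == max(cars$dist))   # 49
which.max(cars$dist)                 # 49

# With ties, which.min() reports only the first position
which.min(cars$speed)                # 1 (rows 1 and 2 both have speed 4)
```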

The %in% operator checks, for each value of its left operand, whether that value is present in its right operand.

cars$dist %in% c(10,20)
##  [1] FALSE  TRUE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE
## [13]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE
## [25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [49] FALSE FALSE
cars[cars$dist %in% c(10,20),]
##    speed dist
## 2      4   10
## 6      9   10
## 13    12   20
## 24    15   20
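The result of %in% always has the length of its left operand, which is what makes it usable as a row filter. A brief sketch; the first vector is made up for this example.

```r
# One TRUE/FALSE per element on the left-hand side
c(1, 10, 5, 20) %in% c(10, 20)          # FALSE TRUE FALSE TRUE

# Counting matches: TRUE counts as 1
sum(cars$dist %in% c(10, 20))           # 4

# Negate with ! to keep the rows that do NOT match
nrow(cars[!(cars$dist %in% c(10, 20)), ])   # 46
```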

4. Functions

f = function(x,y){
  r = sqrt(x^2 + y^2)
  return(10*sin(r)/r) # sin() returns the sine of a number in radians
}
f(3,4)
## [1] -1.917849
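One detail of f worth noting: at the origin r is 0, so the expression becomes 0/0 and R returns NaN. The sketch below shows one possible guard. The name f_safe and the choice to return 10 (the limit of 10*sin(r)/r as r approaches 0) are assumptions of this example, not part of the original notes.

```r
f = function(x, y) {
  r = sqrt(x^2 + y^2)
  return(10 * sin(r) / r)
}

# Hypothetical variant that substitutes the limiting value at r == 0
f_safe = function(x, y) {
  r = sqrt(x^2 + y^2)
  ifelse(r == 0, 10, 10 * sin(r) / r)
}

f(0, 0)        # NaN
f_safe(0, 0)   # 10
f_safe(3, 4)   # same value as f(3, 4)
```

Because ifelse() works element by element, f_safe still accepts whole vectors for x and y.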

The seq(from, to, by, length.out, along.with) function generates a sequence of numbers.

x = seq(-10, 10, length=30)
y=x
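The arguments of seq() give different ways to describe the same kind of sequence: by step size or by number of points. A short sketch with made-up endpoints:

```r
seq(0, 1, by = 0.25)            # 0.00 0.25 0.50 0.75 1.00 : fixed step
seq(-10, 10, length.out = 5)    # -10 -5 0 5 10 : fixed number of points
seq_along(c("a", "b", "c"))     # 1 2 3 : one index per element

# length = 30 is matched to the length.out argument
length(seq(-10, 10, length = 30))   # 30
```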

The outer(X, Y, FUN) function applies the function FUN to every pair of elements from the arrays X and Y and returns the results as a matrix. FUN must be vectorized.

z=outer(x,y,f)
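A smaller case may make the shape of the result easier to see. In this sketch the inputs and the anonymous function are made up for illustration; entry [i, j] of the result is FUN applied to the i-th element of the first argument and the j-th element of the second.

```r
# With no FUN given, outer() multiplies: a 3 x 4 multiplication table
outer(1:3, 1:4)

# A custom function of two arguments
m = outer(1:3, 1:4, function(a, b) a + 10 * b)
dim(m)     # 3 4
m[2, 3]    # 32, i.e. 2 + 10 * 3
```

By the same logic, z above is a 30 x 30 matrix, since x and y each have 30 elements.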

The persp() function draws perspective plots of surfaces over the x-y plane.

persp(x, y, z, theta = 30, phi = 30, expand = 0.5, col = "lightblue")

x=seq(-10,10,length=300)

The plot(x, y) function draws a scatter plot of the vector y against the vector x.

plot(x,sin(x))