This tutorial walks you through some basic matrix notation in R that will be useful in later analyses. First, I will go over how to create a basic matrix:

A <- matrix(
  c(2, 4, 3, 1, 5, 7),   # the data elements
  nrow = 2,              # number of rows
  ncol = 3,              # number of columns
  byrow = TRUE)          # fill matrix by rows
 
A                      # print the matrix 
##      [,1] [,2] [,3]
## [1,]    2    4    3
## [2,]    1    5    7
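
To see what the byrow argument changes, here is a small sketch that builds the same six values both ways. The object names vals, by_row and by_col are just illustrative; nothing else in this tutorial depends on them.

```r
# Same six values, two fill orders (toy example)
vals <- c(2, 4, 3, 1, 5, 7)

by_row <- matrix(vals, nrow = 2, ncol = 3, byrow = TRUE)  # fills across each row
by_col <- matrix(vals, nrow = 2, ncol = 3)                # default: fills down each column

by_row[1, ]   # 2 4 3
by_col[1, ]   # 2 3 5
dim(by_row)   # 2 3 (rows, then columns)
```

If you leave byrow out, R fills the matrix column by column, so it is worth checking the first row after you build a matrix by hand.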

Now that we know how to create a basic matrix and what it looks like, we can apply the same logic to any dataset we have. For this tutorial I am going to use the Boston dataset from the MASS package. First, let's look at its class and structure as a data frame, so we can compare them after we convert it to a matrix.

library(MASS)   # the Boston dataset ships with the MASS package
data(Boston)

class(Boston)
## [1] "data.frame"
str(Boston)
## 'data.frame':    506 obs. of  14 variables:
##  $ crim   : num  0.00632 0.02731 0.02729 0.03237 0.06905 ...
##  $ zn     : num  18 0 0 0 0 0 12.5 12.5 12.5 12.5 ...
##  $ indus  : num  2.31 7.07 7.07 2.18 2.18 2.18 7.87 7.87 7.87 7.87 ...
##  $ chas   : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ nox    : num  0.538 0.469 0.469 0.458 0.458 0.458 0.524 0.524 0.524 0.524 ...
##  $ rm     : num  6.58 6.42 7.18 7 7.15 ...
##  $ age    : num  65.2 78.9 61.1 45.8 54.2 58.7 66.6 96.1 100 85.9 ...
##  $ dis    : num  4.09 4.97 4.97 6.06 6.06 ...
##  $ rad    : int  1 2 2 3 3 3 5 5 5 5 ...
##  $ tax    : num  296 242 242 222 222 222 311 311 311 311 ...
##  $ ptratio: num  15.3 17.8 17.8 18.7 18.7 18.7 15.2 15.2 15.2 15.2 ...
##  $ black  : num  397 397 393 395 397 ...
##  $ lstat  : num  4.98 9.14 4.03 2.94 5.33 ...
##  $ medv   : num  24 21.6 34.7 33.4 36.2 28.7 22.9 27.1 16.5 18.9 ...

We will now convert the dataset into a matrix and see how the output of the same functions changes.

mat <- as.matrix(Boston)
class(mat)
## [1] "matrix"
str(mat)
##  num [1:506, 1:14] 0.00632 0.02731 0.02729 0.03237 0.06905 ...
##  - attr(*, "dimnames")=List of 2
##   ..$ : chr [1:506] "1" "2" "3" "4" ...
##   ..$ : chr [1:14] "crim" "zn" "indus" "chas" ...
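
A matrix can only hold one type of value, so as.matrix has to coerce every column to a common type. The sketch below uses two made-up data frames (df_num and df_mixed are hypothetical, not part of Boston) to show what that means. Note also that on R 4.0.0 and later, class() on a matrix returns both "matrix" and "array", so is.matrix() is the more reliable check.

```r
# Hypothetical toy data frames, only for illustration
df_num   <- data.frame(x = c(1.5, 2.5), y = c(3L, 4L))          # numeric + integer
df_mixed <- data.frame(x = c(1.5, 2.5), label = c("a", "b"))    # numeric + character

m_num   <- as.matrix(df_num)
m_mixed <- as.matrix(df_mixed)

is.numeric(m_num)       # TRUE: the integer column is promoted to double
is.character(m_mixed)   # TRUE: every cell becomes a string
is.matrix(m_num)        # TRUE, regardless of R version
```

Boston works nicely because all 14 columns are numeric or integer. With a character or factor column in the mix, numeric summaries on the resulting matrix would no longer work.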

We will now look at some other properties that will be useful for analysis later on. The nrow and ncol functions tell us how many rows and columns our matrix (dataset) has.

nrow(mat)
## [1] 506
ncol(mat)
## [1] 14
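
If you want both numbers at once, dim() returns them together. This is a minimal sketch on a toy matrix m (an illustrative name), not on the Boston data:

```r
m <- matrix(1:6, nrow = 2)   # toy 2 x 3 matrix

dim(m)      # rows first, then columns
dim(m)[1]   # same value as nrow(m)
dim(m)[2]   # same value as ncol(m)
length(m)   # total number of cells, rows times columns
```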

Now we will look at specific values in the matrix. Following the matrix name with square brackets lets us reference a row and a column, in that order. An example is below:

# first value in mat, first row and first column
mat[1, 1]
## [1] 0.00632
# a middle value in mat, 250th row and 5th column
mat[250, 5]
## [1] 0.431

We can also reference a specific cell by its row name and column name, as shown below:

mat["10","age"]
## [1] 85.9
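
Name-based indexing only works if the matrix has row and column names. Here is a small sketch on a toy matrix with names I made up (r1, r2, a, b, c), showing how to inspect them and how to mix names with positions:

```r
m <- matrix(1:6, nrow = 2,
            dimnames = list(c("r1", "r2"), c("a", "b", "c")))

rownames(m)            # "r1" "r2"
colnames(m)            # "a" "b" "c"
m["r2", "b"]           # 4
m[2, "b"]              # positions and names can be mixed: also 4
"b" %in% colnames(m)   # TRUE; an unknown name would raise a "subscript out of bounds" error
```

Notice that the row names of mat are the strings "1", "2", and so on, which is why the example above uses "10" in quotes.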

We can also reference whole sections. To reference several consecutive rows or columns, put : between the first and last index of the range you wish to select. Some examples are below:

mat[1:4, 1:2]
##      crim zn
## 1 0.00632 18
## 2 0.02731  0
## 3 0.02729  0
## 4 0.03237  0
mat[5:8, 1:2]
##      crim   zn
## 5 0.06905  0.0
## 6 0.02985  0.0
## 7 0.08829 12.5
## 8 0.14455 12.5

If we don't want a contiguous range but still want more than one row or column, we can pass vectors of indices built with the c() function to extract a submatrix:

mat[c(1,3,5), c(1,3)]
##      crim indus
## 1 0.00632  2.31
## 3 0.02729  7.07
## 5 0.06905  2.18
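
Index vectors don't have to be positive positions. As a sketch on a toy matrix (the values and the names x and y are made up), the same bracket notation also accepts column names, negative positions, and logical conditions:

```r
m <- matrix(c(5, 1, 9, 2, 8, 3), nrow = 3,
            dimnames = list(NULL, c("x", "y")))

m[c(1, 3), c("y", "x")]   # rows 1 and 3, columns in the order requested
m[-2, ]                   # every row except row 2
m[m[, "x"] > 4, ]         # only the rows where column x exceeds 4
```

In this toy matrix the last two lines happen to return the same rows, because row 2 is the only one with x below 4.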

We don't need to specify both a row and a column. Leaving one side of the comma blank selects every column of a row, or every row of a column:

# All columns from row 5
mat[5, ]
##      crim        zn     indus      chas       nox        rm       age       dis 
##   0.06905   0.00000   2.18000   0.00000   0.45800   7.14700  54.20000   6.06220 
##       rad       tax   ptratio     black     lstat      medv 
##   3.00000 222.00000  18.70000 396.90000   5.33000  36.20000
# All rows from column 2
mat[, 2]
##     1     2     3     4     5     6     7     8     9    10    11    12    13 
##  18.0   0.0   0.0   0.0   0.0   0.0  12.5  12.5  12.5  12.5  12.5  12.5  12.5 
##    14    15    16    17    18    19    20    21    22    23    24    25    26 
##   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0 
##    27    28    29    30    31    32    33    34    35    36    37    38    39 
##   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0 
##    40    41    42    43    44    45    46    47    48    49    50    51    52 
##  75.0  75.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0  21.0  21.0 
##    53    54    55    56    57    58    59    60    61    62    63    64    65 
##  21.0  21.0  75.0  90.0  85.0 100.0  25.0  25.0  25.0  25.0  25.0  25.0  17.5 
##    66    67    68    69    70    71    72    73    74    75    76    77    78 
##  80.0  80.0  12.5  12.5  12.5   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0 
##    79    80    81    82    83    84    85    86    87    88    89    90    91 
##   0.0   0.0  25.0  25.0  25.0  25.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0 
##    92    93    94    95    96    97    98    99   100   101   102   103   104 
##   0.0  28.0  28.0  28.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0 
##   105   106   107   108   109   110   111   112   113   114   115   116   117 
##   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0 
##   118   119   120   121   122   123   124   125   126   127   128   129   130 
##   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0 
##   131   132   133   134   135   136   137   138   139   140   141   142   143 
##   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0 
##   144   145   146   147   148   149   150   151   152   153   154   155   156 
##   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0 
##   157   158   159   160   161   162   163   164   165   166   167   168   169 
##   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0 
##   170   171   172   173   174   175   176   177   178   179   180   181   182 
##   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0 
##   183   184   185   186   187   188   189   190   191   192   193   194   195 
##   0.0   0.0   0.0   0.0   0.0  45.0  45.0  45.0  45.0  45.0  45.0  60.0  60.0 
##   196   197   198   199   200   201   202   203   204   205   206   207   208 
##  80.0  80.0  80.0  80.0  95.0  95.0  82.5  82.5  95.0  95.0   0.0   0.0   0.0 
##   209   210   211   212   213   214   215   216   217   218   219   220   221 
##   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0 
##   222   223   224   225   226   227   228   229   230   231   232   233   234 
##   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0 
##   235   236   237   238   239   240   241   242   243   244   245   246   247 
##   0.0   0.0   0.0   0.0  30.0  30.0  30.0  30.0  30.0  30.0  22.0  22.0  22.0 
##   248   249   250   251   252   253   254   255   256   257   258   259   260 
##  22.0  22.0  22.0  22.0  22.0  22.0  22.0  80.0  80.0  90.0  20.0  20.0  20.0 
##   261   262   263   264   265   266   267   268   269   270   271   272   273 
##  20.0  20.0  20.0  20.0  20.0  20.0  20.0  20.0  20.0  20.0  20.0  20.0  20.0 
##   274   275   276   277   278   279   280   281   282   283   284   285   286 
##  20.0  40.0  40.0  40.0  40.0  40.0  20.0  20.0  20.0  20.0  90.0  90.0  55.0 
##   287   288   289   290   291   292   293   294   295   296   297   298   299 
##  80.0  52.5  52.5  52.5  80.0  80.0  80.0   0.0   0.0   0.0   0.0   0.0  70.0 
##   300   301   302   303   304   305   306   307   308   309   310   311   312 
##  70.0  70.0  34.0  34.0  34.0  33.0  33.0  33.0  33.0   0.0   0.0   0.0   0.0 
##   313   314   315   316   317   318   319   320   321   322   323   324   325 
##   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0 
##   326   327   328   329   330   331   332   333   334   335   336   337   338 
##   0.0   0.0   0.0   0.0   0.0   0.0  35.0  35.0   0.0   0.0   0.0   0.0   0.0 
##   339   340   341   342   343   344   345   346   347   348   349   350   351 
##   0.0   0.0   0.0  35.0   0.0  55.0  55.0   0.0   0.0  85.0  80.0  40.0  40.0 
##   352   353   354   355   356   357   358   359   360   361   362   363   364 
##  60.0  60.0  90.0  80.0  80.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0 
##   365   366   367   368   369   370   371   372   373   374   375   376   377 
##   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0 
##   378   379   380   381   382   383   384   385   386   387   388   389   390 
##   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0 
##   391   392   393   394   395   396   397   398   399   400   401   402   403 
##   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0 
##   404   405   406   407   408   409   410   411   412   413   414   415   416 
##   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0 
##   417   418   419   420   421   422   423   424   425   426   427   428   429 
##   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0 
##   430   431   432   433   434   435   436   437   438   439   440   441   442 
##   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0 
##   443   444   445   446   447   448   449   450   451   452   453   454   455 
##   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0 
##   456   457   458   459   460   461   462   463   464   465   466   467   468 
##   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0 
##   469   470   471   472   473   474   475   476   477   478   479   480   481 
##   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0 
##   482   483   484   485   486   487   488   489   490   491   492   493   494 
##   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0 
##   495   496   497   498   499   500   501   502   503   504   505   506 
##   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0
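
Two things are worth knowing when you pull out a single row or column. First, R drops the result down to a plain vector unless you ask it not to. Second, for a long column you can preview a few values with head() instead of printing all of them. A minimal sketch on a toy matrix (row_vec and row_mat are illustrative names):

```r
m <- matrix(1:6, nrow = 2, dimnames = list(NULL, c("a", "b", "c")))

row_vec <- m[1, ]                 # a named vector; dim(row_vec) is NULL
row_mat <- m[1, , drop = FALSE]   # still a 1 x 3 matrix

dim(row_vec)
dim(row_mat)
head(m[, 2], 1)                   # preview only the first value of column 2
```

Keeping drop = FALSE in mind helps when a later function expects a matrix rather than a vector.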

Now we will look at some slightly more involved operations you can perform on a matrix, starting with some summary functions:

# first row, all of the columns
row_1 <- mat[1, ]
# maximum value in row 1
max(row_1)
## [1] 396.9
# maximum value in row 2
max(mat[2, ])
## [1] 396.9
# minimum of column 3 (indus)
min(mat[, 3])
## [1] 0.46
# mean of column 3
mean(mat[, 3])
## [1] 11.13678
# median of column 3
median(mat[, 3])
## [1] 9.69
# standard deviation of column 3
sd(mat[, 3])
## [1] 6.860353
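
The summary functions tell you the value but not where it came from, and they return NA if the data contain missing values. The sketch below uses a made-up named vector v (not taken from Boston) to show which.max() and the na.rm argument:

```r
v <- c(a = 3, b = 10, c = NA, d = 7)   # hypothetical values with one missing entry

max(v)                   # NA, because of the missing value
max(v, na.rm = TRUE)     # 10
names(which.max(v))      # "b": the name of the element holding the maximum
range(v, na.rm = TRUE)   # 3 10
```

On a row of mat, names(which.max(...)) would similarly report which variable holds the largest value in that row.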

Another useful function is the apply function. It collapses the matrix along the dimension specified by the margin argument: a margin of 1 applies the function to each row, and a margin of 2 applies it to each column. For example, to find the average of each column, we calculate the mean over all the rows in that column by using a margin of 2:

avg_column <- apply(mat, 2, mean)
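
It is easy to mix up the two margins, so here is a small sketch on a toy matrix that shows both. The names row_means and col_means are illustrative, and the comparison with rowMeans() and colMeans() is just a sanity check using base R helpers:

```r
m <- matrix(1:6, nrow = 2,
            dimnames = list(c("r1", "r2"), c("a", "b", "c")))

row_means <- apply(m, 1, mean)   # margin 1: one value per row    (r1 = 3, r2 = 4)
col_means <- apply(m, 2, mean)   # margin 2: one value per column (a = 1.5, b = 3.5, c = 5.5)

length(row_means)                        # 2, the number of rows
length(col_means)                        # 3, the number of columns
all.equal(row_means, rowMeans(m))        # TRUE
all.equal(col_means, colMeans(m))        # TRUE
```

A quick check is the length of the result: it should match nrow for margin 1 and ncol for margin 2.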

I will now use the matrix notation we learned to run a paired t-test comparing two rows of the matrix:

t.test(mat[5,], mat[10,], paired=TRUE)
## 
##  Paired t-test
## 
## data:  mat[5, ] and mat[10, ]
## t = -1.2579, df = 13, p-value = 0.2306
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -23.530384   6.212686
## sample estimates:
## mean of the differences 
##               -8.658849

Specifically, do the means of the two vectors differ significantly? We can store the test result and pull out the p-value directly:

result <- t.test(mat[5,], mat[10,], paired=TRUE)
names(result)
##  [1] "statistic"   "parameter"   "p.value"     "conf.int"    "estimate"   
##  [6] "null.value"  "stderr"      "alternative" "method"      "data.name"
result$p.value
## [1] 0.2305678
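
The other elements listed by names(result) can be pulled out the same way. As a sketch, the example below runs a paired t-test on two made-up vectors x and y (not Boston rows) and checks the reported statistic against the textbook formula for a paired test, the mean difference divided by its standard error:

```r
# Hypothetical paired measurements, only for illustration
x <- c(5.1, 4.8, 6.0, 5.5, 5.9)
y <- c(4.9, 5.0, 5.4, 5.2, 5.1)

res <- t.test(x, y, paired = TRUE)

res$statistic   # the t value
res$parameter   # degrees of freedom: number of pairs minus 1
res$conf.int    # confidence interval for the mean difference

# Recompute t by hand from the pairwise differences
d <- x - y
t_manual <- mean(d) / (sd(d) / sqrt(length(d)))
all.equal(unname(res$statistic), t_manual)   # TRUE
```

This also shows why the two vectors must have the same length when paired = TRUE: the test works on the element-by-element differences.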

We can see from this analysis that the p-value (about 0.23) is well above the conventional 0.05 threshold, so we fail to reject the null hypothesis: there is no evidence that the means of the two vectors differ significantly.

Now that we have gone over some basic matrix notation, you will find it helpful in many other analyses and functions in R. A data frame whose columns are all numeric can be converted to a matrix, and the same notation carries over to many other functions, such as t.test, and to many other analyses!