library("dplyr")
library("ggplot2")
library("gridExtra")

0.1 Some basic R commands

0.2 Math functions

0.3 Matrix operations

##      [,1] [,2] [,3]
## col1    2    1    0
## col2    1    3    1
## col3    1    1    2
##      col1 col2 col3
## col1    5    5    3
## col2    5   11    6
## col3    3    6    6
##            [,1]       [,2]       [,3]
## col1  0.5555556 -0.1111111 -0.2222222
## col2 -0.2222222  0.4444444 -0.1111111
## col3  0.1111111 -0.2222222  0.5555556
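
The code that produced the three matrices above is not shown. A minimal sketch that gives output of the same shape, assuming the matrix was built with rbind() from three named rows (the name A and the construction are assumptions, not the original code):

```r
# Assumed construction: three named rows bound into a 3 x 3 matrix
A <- rbind(col1 = c(2, 1, 0),
           col2 = c(1, 3, 1),
           col3 = c(1, 1, 2))

A             # the matrix itself
A %*% t(A)    # matrix product of A with its transpose
solve(t(A))   # inverse of the transpose of A
```

Note that %*% is the matrix product, while * would multiply element by element.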

0.4 Descriptive Statistics

0.4.1 Mean

The mean is the arithmetic average of a list of numbers: the sum of the values divided by how many values there are. In R it is computed with the mean() function:

\[ \mu_{x} = \frac{\sum_{i=1}^n x_{i}}{n} \]
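
As a small illustration, the formula can be checked against mean() on a made-up vector (the data and variable names are only for the example):

```r
x <- c(4, 8, 15, 16, 23, 42)        # made-up sample

manual_mean <- sum(x) / length(x)   # sum of the values divided by n
manual_mean
mean(x)                             # built-in equivalent
```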

0.4.2 Median

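The median is the value that separates the higher half from the lower half of a sorted list of numbers. In R it is computed with the median() function.

If the list has an odd number of values n, the median is the value found at the following position of the sorted list:

\[ \frac{n+1}{2} \]

If n is even, the median is the average of the two middle values, that is, the values at positions n/2 and n/2 + 1 of the sorted list (a subscript in parentheses denotes a position in the sorted list):

\[ \frac{x_{(\frac{n}{2})} + x_{(\frac{n}{2}+1)}}{2} \]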
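
The two cases can be sketched by sorting the data and picking the middle position or positions. The vectors below are made up for the example:

```r
odd  <- c(7, 1, 5, 3, 9)        # n = 5 (odd)
even <- c(7, 1, 5, 3, 9, 11)    # n = 6 (even)

# Odd case: value at position (n + 1) / 2 of the sorted data
s_odd <- sort(odd)
n_odd <- length(s_odd)
med_odd <- s_odd[(n_odd + 1) / 2]

# Even case: average of the values at positions n / 2 and n / 2 + 1
s_even <- sort(even)
n_even <- length(s_even)
med_even <- (s_even[n_even / 2] + s_even[n_even / 2 + 1]) / 2

med_odd
med_even
median(odd)     # built-in, for comparison
median(even)
```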

0.4.3 Variance

The variance measures how far a set of numbers is spread out from its mean. In R it is computed with the var() function, which uses the sample denominator n - 1:

\[ Var(X) = \frac{\sum_{i=1}^n(X_{i}-\mu_{x})^2}{n-1} \]
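
A short check of the formula against var(), using made-up data and taking the mean of the sample as the centre:

```r
x <- c(4, 8, 15, 16, 23, 42)    # made-up sample
n <- length(x)

# Sum of squared deviations from the mean, divided by n - 1
manual_var <- sum((x - mean(x))^2) / (n - 1)
manual_var
var(x)                          # built-in equivalent
```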

0.4.4 Standard Deviation

The standard deviation is the square root of the variance. In R it is computed with the sd() function.
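
A quick check, on made-up data, that sd() agrees with the square root of var():

```r
x <- c(4, 8, 15, 16, 23, 42)    # made-up sample

manual_sd <- sqrt(var(x))       # square root of the sample variance
manual_sd
sd(x)                           # built-in equivalent
```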

\[ \sigma_{x} = \sqrt{Var(X)} \]

### Mode

The mode is the most frequent value in a set of data.
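
Base R has no function that returns the statistical mode (mode() reports an object's storage mode instead). One possible sketch counts the values with table(); the helper name stat_mode is hypothetical, and it returns every value tied for the highest count:

```r
# Hypothetical helper: most frequent value(s) of a numeric vector
stat_mode <- function(v) {
  counts <- table(v)                          # frequency of each distinct value
  top <- as.vector(counts) == max(counts)     # which values reach the top count
  as.numeric(names(counts)[top])
}

x <- c(2, 3, 3, 5, 7, 3, 5)    # made-up data
stat_mode(x)
stat_mode(c(1, 1, 2, 2))       # a tie returns both values
```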

0.4.5 Range

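The range is the total amplitude of a list of numbers, that is, the difference between its largest and smallest values. The range() function in R returns the minimum and the maximum, so the amplitude is obtained with diff(range(x)).

\[ \max(x) - \min(x) \]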
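
For example, on a made-up vector:

```r
x <- c(4, 8, 15, 16, 23, 42)    # made-up sample

range(x)                        # smallest and largest value
amplitude <- diff(range(x))     # largest minus smallest
amplitude
max(x) - min(x)                 # same quantity, written out
```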

0.4.6 Standard error of the mean

\[\frac{\sigma}{\sqrt{n}}\]
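
The standard error of the mean can be computed directly from the formula. This is a minimal sketch on made-up data that uses the sample standard deviation from sd() as the estimate of sigma; the variable name se is only for the example:

```r
x <- c(4, 8, 15, 16, 23, 42)     # made-up sample

se <- sd(x) / sqrt(length(x))    # standard deviation divided by sqrt(n)
se
```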

0.4.7 Variation Coefficient

The coefficient of variation is the standard deviation expressed as a proportion of the mean (often reported as a percentage). Because it has no units, it can be used to compare the dispersion of variables measured on different scales.

\[\frac{\sigma}{\mu}\]
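
A small sketch of why this helps with different scales: the same made-up heights expressed in centimetres and in metres have very different standard deviations but the same coefficient of variation. The helper name cv is hypothetical:

```r
# Hypothetical helper: standard deviation relative to the mean
cv <- function(v) sd(v) / mean(v)

heights_cm <- c(160, 170, 180)    # made-up data
heights_m  <- heights_cm / 100    # same data on another scale

sd(heights_cm)                    # depends on the unit
sd(heights_m)
cv(heights_cm)                    # identical on both scales
cv(heights_m)
```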

0.4.8 Covariance

Covariance is the measure of the joint variability of two random variables. cov() is the function used to calculate it.

\[ cov(X,Y) = \frac{\sum_{i=1}^n (X_{i}-\mu_{x})(Y_{i}-\mu_{y})}{n-1} \]
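
The formula can be checked against cov() on two made-up vectors, taking the sample means as the centres:

```r
x <- c(1, 2, 3, 4, 5)    # made-up data
y <- c(2, 4, 5, 4, 5)
n <- length(x)

# Sum of the products of paired deviations, divided by n - 1
manual_cov <- sum((x - mean(x)) * (y - mean(y))) / (n - 1)
manual_cov
cov(x, y)                # built-in equivalent
```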

0.4.9 Correlation

The correlation measures the strength of the linear relationship between two variables. cor() is the R function used to calculate it.

\[ \rho(X,Y) = \frac{cov(X,Y)}{\sigma_{x}\sigma_{y}} \]
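
As a check on made-up data, dividing the covariance by the product of the two standard deviations should match cor(), which computes the Pearson correlation by default:

```r
x <- c(1, 2, 3, 4, 5)    # made-up data
y <- c(2, 4, 5, 4, 5)

manual_cor <- cov(x, y) / (sd(x) * sd(y))
manual_cor
cor(x, y)                # built-in equivalent
```

The result always lies between -1 and 1, unlike the covariance, whose size depends on the units of both variables.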

0.5 Probability

The best-known probability distributions.