Mathematics Review

Jose M. Fernandez, University of Louisville

Wednesday, August 13, 2014

Descriptive Statistics: First Moment

Given \(n\) numbers, we compute the average, or mean, by adding them up and dividing by \(n\):

\[ \bar x = \frac{1}{n}\sum_{i=1}^n x_{i} \]

The mean is known as the “first moment.” The first central moment can be thought of as the average deviation from the mean, which must always be zero because the deviations themselves sum to zero:

\[ \sum_{i=1}^n (x_{i}-\bar x) = 0 \]

Descriptive Statistics: First Moment

Proof: From the definition of the mean, we know that \(n\bar x=\sum_{i=1}^n x_{i}\). This implies that \[ \sum_{i=1}^n (x_{i}-\bar x) = \sum_{i=1}^n x_{i}-\sum_{i=1}^n\bar x \] \[ = n\bar x - \sum_{i=1}^n\bar x = n\bar x - n\bar x = 0\]
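
The result can also be checked numerically. The following is a minimal sketch, assuming R's built-in `cars` data (the same data set used later in these slides); the variable names `x`, `x_bar`, and `deviations` are illustrative only.

```r
# Numerical check that deviations from the mean sum to zero.
x <- cars$speed                  # built-in data set, used here as an example
x_bar <- sum(x) / length(x)      # mean computed directly from the definition
deviations <- x - x_bar          # x_i - x_bar for every observation

all.equal(x_bar, mean(x))        # agrees with R's mean()
sum(deviations)                  # zero, up to floating-point rounding
```

Because of floating-point arithmetic the sum may print as a very small number rather than exactly 0.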

Descriptive Statistics: The Second Moment is the Variance

The variance is defined as the average squared deviation from the mean.

Average
Squared Deviation
Mean

Population variance

\[ \sigma^2 = \frac{1}{n}\sum_{i=1}^n (x_{i}-\mu)^2 \]

Sample variance

\[ s^2 = \frac{1}{n-1}\sum_{i=1}^n (x_{i}-\bar x)^2 \]
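
The two formulas differ only in the divisor. A short sketch, assuming the built-in `cars` data and treating `mean(x)` as the centre in both cases (for the population formula this amounts to treating the data as the whole population); `pop_var` and `samp_var` are illustrative names.

```r
# Compare the 1/n and 1/(n - 1) versions of the variance.
x <- cars$speed
n <- length(x)
ss <- sum((x - mean(x))^2)   # sum of squared deviations from the mean

pop_var  <- ss / n           # population formula: divide by n
samp_var <- ss / (n - 1)     # sample formula: divide by n - 1

all.equal(samp_var, var(x))  # R's var() uses the n - 1 divisor
c(population = pop_var, sample = samp_var)
```

The sample version is always slightly larger, and the gap shrinks as \(n\) grows.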

Descriptive Statistics: Covariance

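Another second moment is the covariance of two random variables \(x\) and \(y\).
The covariance tells us how these two variables move together: it is positive when they tend to lie above or below their means at the same time, and negative when they tend to move in opposite directions.
The covariance measures the average product of the deviations of \(x\) and \(y\) from their respective means.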
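
For a sample of \(n\) pairs, the usual sample covariance, written in the same notation as the sample variance, is \[ s_{xy} = \frac{1}{n-1}\sum_{i=1}^n (x_{i}-\bar x)(y_{i}-\bar y) \] The sketch below computes it by hand and compares it with R's `cov()`, assuming the built-in `cars` data; `cov_xy` is an illustrative name.

```r
# Sample covariance from the definition, checked against cov().
x <- cars$speed
y <- cars$dist
n <- length(x)

# average product of deviations, using the n - 1 divisor
cov_xy <- sum((x - mean(x)) * (y - mean(y))) / (n - 1)

all.equal(cov_xy, cov(x, y))   # cov() also uses n - 1
```

Setting \(y = x\) in the formula gives back the sample variance, which is why both are called second moments.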

Descriptive Statistics with R

R can provide quick summary statistics
R can provide a quick scatter plot
We can report individual statistics: mean, standard deviation, variance, median
We can also perform a t-test

Descriptive Statistics: R Code and Output

summary(cars)

##      speed           dist    
##  Min.   : 4.0   Min.   :  2  
##  1st Qu.:12.0   1st Qu.: 26  
##  Median :15.0   Median : 36  
##  Mean   :15.4   Mean   : 43  
##  3rd Qu.:19.0   3rd Qu.: 56  
##  Max.   :25.0   Max.   :120

Plotting the data

[Figure: plot of the cars data]
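
The code for this chunk is not shown in the slides. A minimal sketch, assuming the figure is a scatter plot of stopping distance against speed from the built-in `cars` data; the axis labels and title are my own choices, not the original's.

```r
# Scatter plot of the two columns of cars (assumed form of the missing chunk).
plot(cars$speed, cars$dist,
     xlab = "speed",
     ylab = "dist",
     main = "cars: stopping distance against speed")
```

Calling `plot(cars)` on the two-column data frame should give essentially the same picture with default labels.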

Individual Stats

sapply(cars, mean, na.rm=TRUE)

## speed  dist 
## 15.40 42.98

sapply(cars, median, na.rm=TRUE)

## speed  dist 
##    15    36

sapply(cars, sd, na.rm=TRUE)

##  speed   dist 
##  5.288 25.769

mean(cars$speed)

## [1] 15.4

sd(cars$speed)

## [1] 5.288
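
The variance can be reported the same way. A short sketch using the standard `var()` function; the name `v` is illustrative.

```r
# Variance of each column, then of a single column.
sapply(cars, var, na.rm = TRUE)

v <- var(cars$speed)
all.equal(v, sd(cars$speed)^2)   # the variance is the squared standard deviation
```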

T Test

t.test(cars$speed, mu=15)

## 
##  One Sample t-test
## 
## data:  cars$speed
## t = 0.5349, df = 49, p-value = 0.5951
## alternative hypothesis: true mean is not equal to 15
## 95 percent confidence interval:
##  13.9 16.9
## sample estimates:
## mean of x 
##      15.4
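
The statistic reported above can be reproduced from the descriptive statistics. A minimal sketch, assuming the standard one-sample formula \(t = (\bar x - \mu_0)/(s/\sqrt{n})\) with \(n-1\) degrees of freedom; `mu0`, `t_stat`, `p_val`, and `res` are illustrative names.

```r
# One-sample t statistic and two-sided p-value computed by hand.
x   <- cars$speed
mu0 <- 15                                      # hypothesised mean
n   <- length(x)

t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))  # estimate minus mu0, over the standard error
p_val  <- 2 * pt(-abs(t_stat), df = n - 1)     # two-sided p-value

res <- t.test(x, mu = mu0)
all.equal(t_stat, unname(res$statistic))       # matches the t reported by t.test()
all.equal(p_val, res$p.value)                  # matches the reported p-value
```

With a p-value of about 0.6, the data give no evidence against a true mean speed of 15.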