August 31, 2016

Mathematical tools: Summation

\[\sum_{i=1}^{n} x_i \equiv x_1 + x_2 + \dots + x_i + \dots + x_n\]

X <- 1:10       # let the vector X be the set of integers from 1 to 10
print(X)
##  [1]  1  2  3  4  5  6  7  8  9 10
X.sum <- sum(X) # now sum X
print(X.sum)    # and show the result
## [1] 55
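
To make the notation concrete, here is a minimal sketch of the same sum written as an explicit loop, which is roughly what sum() does for us. The name `total` is just an illustrative choice.

```r
X <- 1:10
total <- 0            # running total, starts at zero
for (x in X) {        # visit x_1, x_2, ..., x_n in turn
  total <- total + x  # add the current element
}
print(total)
## [1] 55
```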

Mathematical tools: Summation

\[\sum_{i=1}^{n} c = nc \]

X <- rep(5,10)  # make a vector where every value is 5
print(X)
##  [1] 5 5 5 5 5 5 5 5 5 5
X.sum <- sum(X)
print(X.sum)
## [1] 50
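
The output above matches n times c (10 times 5). A short sketch that checks this directly; the names `const` and `n` are assumptions introduced here, standing in for c and n in the formula.

```r
const <- 5                 # the constant (called c in the formula)
n <- 10                    # number of terms in the sum
X <- rep(const, n)         # a vector of n copies of the constant
print(sum(X) == n * const) # adding c up n times is the same as n*c
## [1] TRUE
```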

Mathematical tools: Summation

\[\sum_{i=1}^{n} c x_i = c \sum_{i=1}^{n} x_i \]

X <- 1:10       # whole numbers from 1 to 10
c <- 2          # constant c
print(X)
##  [1]  1  2  3  4  5  6  7  8  9 10
X.sum1 <- sum(X*c)    # 1*2 + 2*2 + ...
X.sum2 <- sum(X) * c  # (1+2+...+10)*2
print(X.sum1)
## [1] 110
print(X.sum2)
## [1] 110

Mathematical tools: Summation

\[\sum_{i=1}^{n}(ax_i + by_i) = a\sum_{i=1}^{n} x_i + b \sum_{i=1}^{n} y_i \]

##     X  Y a b
## 1   1 21 3 5
## 2   2 22 3 5
## 3   3 23 3 5
## 4   4 24 3 5
## 5   5 25 3 5
## 6   6 26 3 5
## 7   7 27 3 5
## 8   8 28 3 5
## 9   9 29 3 5
## 10 10 30 3 5
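
The slide shows only the printed table. A minimal sketch of how a table like this could be built, assuming it is an ordinary data frame (the name `example.data` is made up here):

```r
# the scalars a and b are recycled to fill all 10 rows
example.data <- data.frame(X = 1:10, Y = 21:30, a = 3, b = 5)
print(head(example.data, 3)) # first three rows only
##   X  Y a b
## 1 1 21 3 5
## 2 2 22 3 5
## 3 3 23 3 5
```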

Mathematical tools: Summation

\[\sum_{i=1}^{n}(ax_i + by_i) = a\sum_{i=1}^{n} x_i + b \sum_{i=1}^{n} y_i \]

X <- 1:10 ; Y <- 21:30 ; a <- 3 ; b <- 5 
# Semi-colons let you put multiple commands on one line.
# It's not recommended. 
Z <- sum(a*X + b*Y)
print(Z)
## [1] 1440
print(a*sum(X) + b*sum(Y))
## [1] 1440

Mathematical tools: Summation

\[\sum_{i=1}^{n} (x_i / y_i) \neq \frac{\sum_{i=1}^{n} x_i}{ \sum_{i=1}^{n} y_i}\]

X <- 1:10
Y <- seq(from = 1, by = pi, length.out = 10) 
# create a sequence starting at 1, incremented by 3.141593, 
# with 10 elements. 
head(Y) # let's just check the first 6 elements
## [1]  1.000000  4.141593  7.283185 10.424778 13.566371 16.707963

Mathematical tools: Summation

\[\sum_{i=1}^{n} (x_i / y_i) \neq \frac{\sum_{i=1}^{n} x_i}{ \sum_{i=1}^{n} y_i}\]

X.sum <- sum(X)
Y.sum <- sum(Y)
Z.sum <- sum(X/Y)
print(X.sum/Y.sum)
## [1] 0.3633441
print(Z.sum)
## [1] 4.392788

Quick aside: if statements in R

X.sum <- sum(X)
Y.sum <- sum(Y)
XY <- X.sum/Y.sum
Z.sum <- sum(X/Y)
if(XY != Z.sum){ # if XY does not equal (!=) Z.sum
  print("Your stuff isn't equal!") # do this thing if condition is TRUE
} else {
  print("Your stuff is all good!") # otherwise do this thing
}
## [1] "Your stuff isn't equal!"
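
One caution with `!=` and `==` on decimals: floating-point rounding can make two values that are mathematically equal compare as different. A common idiom, shown here as a sketch with made-up values `p` and `q`, is to compare within a tolerance using all.equal().

```r
p <- 0.1 + 0.2
q <- 0.3
print(p == q)                   # exact comparison is fooled by rounding
## [1] FALSE
print(isTRUE(all.equal(p, q)))  # comparison within a small tolerance
## [1] TRUE
if(isTRUE(all.equal(p, q))){
  print("Close enough to equal!")
} else {
  print("Really not equal!")
}
## [1] "Close enough to equal!"
```

all.equal() returns a description of the difference rather than FALSE when the values differ, which is why it is wrapped in isTRUE() inside the if statement.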

Descriptive Statistics: Mean

Usual measure of central tendency

\[\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}\]

X.bar <- mean(X)
print(X.bar)
## [1] 5.5
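
The same number can be computed straight from the formula. A small sketch; `n` and `X.bar.manual` are names chosen here for illustration.

```r
X <- 1:10
n <- length(X)            # number of observations
X.bar.manual <- sum(X)/n  # the formula, written out
print(X.bar.manual)
## [1] 5.5
print(X.bar.manual == mean(X))
## [1] TRUE
```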

Descriptive Statistics: Mean

# demeaning X
demean <- function(arg){ # this is how you define a function
  out <- paste("Now",arg,"is sad",sep=" ")
  return(out)
} # this completes the definition of the function.
print(demean(X.bar)) # you can put a function in a function!
## [1] "Now 5.5 is sad"
# That didn't do it...
X.demeaned <- X - X.bar
print(rbind(X,X.demeaned))
##            [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## X           1.0  2.0  3.0  4.0  5.0  6.0  7.0  8.0  9.0  10.0
## X.demeaned -4.5 -3.5 -2.5 -1.5 -0.5  0.5  1.5  2.5  3.5   4.5

Some fancy math that will come up with variance…

\[\sum_{i=1}^{n}(x_i - \bar{x})^2=\sum_{i=1}^{n} x_i^2 - n(\bar{x}^2)\]

print(sum((X-X.bar)^2))
## [1] 82.5
print(sum(X^2) - 10*(X.bar^2))
## [1] 82.5
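
Where the identity comes from: expand the square, pull the constant \(\bar{x}\) out of the sums, and use \(\sum_{i=1}^{n} x_i = n\bar{x}\), which is just the definition of the mean rearranged.

\[\begin{aligned}
\sum_{i=1}^{n}(x_i - \bar{x})^2 &= \sum_{i=1}^{n}\left(x_i^2 - 2\bar{x}x_i + \bar{x}^2\right) \\
&= \sum_{i=1}^{n} x_i^2 - 2\bar{x}\sum_{i=1}^{n} x_i + n\bar{x}^2 \\
&= \sum_{i=1}^{n} x_i^2 - 2n\bar{x}^2 + n\bar{x}^2 \\
&= \sum_{i=1}^{n} x_i^2 - n\bar{x}^2
\end{aligned}\]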

Some fancy math that will come up with covariance…

\[\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})=\sum_{i=1}^{n} x_i y_i - n(\bar{x}\bar{y})\]

Y.bar <- mean(Y) # Y is still the sequence defined earlier
print(sum((X-X.bar)*(Y-Y.bar)))
## [1] 259.1814
print(sum(X*Y) - 10*(X.bar*Y.bar))
## [1] 259.1814
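
Dividing the left-hand sum in the covariance identity by \(n-1\) gives the sample covariance, which is what R's built-in cov() computes. A small self-contained check, using made-up vectors `x` and `y` so the arithmetic is easy to follow by hand:

```r
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)
n <- length(x)
cross <- sum((x - mean(x)) * (y - mean(y)))    # left-hand side of the identity
shortcut <- sum(x * y) - n * mean(x) * mean(y) # right-hand side of the identity
print(c(cross, shortcut))
## [1] 6 6
print(all.equal(cross / (n - 1), cov(x, y)))   # agrees with cov()
## [1] TRUE
```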

Alternative measure of central tendency: Median

The median is not sensitive to outliers, so it may be a better measure of central tendency when the data contain extreme values.

networth <- c(rep(1,9),1000) # Bill Gates is having lunch 
# with 9 millionaires. What is the typical net worth (in $ millions)?
print(mean(networth)) # A whole lot!
## [1] 100.9
print(median(networth)) # Or not very much :(
## [1] 1
#print(mode(networth)) # This won't do what you think... 
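
In R, mode() reports how an object is stored, not its most frequent value, and base R has no built-in statistical mode. Here is one possible helper, as a sketch: `stat.mode` is a hypothetical name, and when several values tie it simply returns the first of them.

```r
networth <- c(rep(1, 9), 1000)
print(mode(networth))            # storage mode, not a statistic
## [1] "numeric"
stat.mode <- function(v){
  counts <- table(v)                          # how often each value occurs
  winner <- names(counts)[which.max(counts)]  # first value with the top count
  return(as.numeric(winner))                  # table names are strings
}
print(stat.mode(networth))
## [1] 1
```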

How lines work

\[ y = \beta_0 + \beta_1 x \]

This is the familiar \(y=mx+b\), with intercept \(\beta_0\) in place of \(b\) and slope \(\beta_1\) in place of \(m\). The slope is the marginal effect of \(x\) on \(y\).

\[ \Delta y = \beta_1 \Delta x \quad \forall \, \Delta x\]

Example: Interpreting lines

\[housing = 164 + 0.27 \times income\]

What happens to housing spending if income goes up by $100? The marginal effect of income on housing spending is 0.27, so $100 more income should lead to $27 of extra spending.
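
A quick numerical check of that reading, as a sketch. The function name `housing.line` and the starting income of 50,000 are arbitrary choices for illustration; because the relationship is a line, any starting income gives the same change.

```r
housing.line <- function(income){
  return(164 + 0.27*income)     # the line from the slide
}
base.income <- 50000            # assumed starting income
change <- housing.line(base.income + 100) - housing.line(base.income)
print(change)                   # extra housing spending from $100 more income
## [1] 27
```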

Let's play macroeconomist and call that the Marginal Propensity to Consume (MPC). Then the APC (Average Propensity to Consume) is housing spending divided by income:

\[\frac{housing}{income} = \frac{164}{income} + 0.27\]
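
Dividing housing spending by income leaves a 164/income term that shrinks as income grows, so the APC falls toward 0.27 from above. A sketch with a few made-up income levels; the name `apc` is introduced here for illustration.

```r
apc <- function(income){
  return((164 + 0.27*income)/income) # housing spending divided by income
}
incomes <- c(1000, 10000, 100000)    # made-up income levels
print(apc(incomes))
## [1] 0.43400 0.28640 0.27164
```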

Example: Interpreting lines

housing <- function(income){
  out <- 164 + 0.27*income
  return(out)
}
income.data <- sample(35000:1780000,100) # let's just make up data
# we're pulling 100 numbers from between 35,000 and 1,780,000
data <- data.frame(income=income.data,housing = housing(income.data))

Example: Housing Spending

library(ggplot2) # I'm going to use the ggplot2 package
# This gives me access to functions that aren't already built into R.
# Packages let R be more flexible, expandable, and powerful
ggplot(data,aes(income,housing)) + geom_point() + geom_smooth(method="lm")

Example: Housing Spending

That didn't look very realistic. Here it is again, but with some random variation.
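
The code for this second plot is not shown on the slide. One way to produce something similar, as a sketch: add normally distributed noise to the line before plotting. The seed, the noise standard deviation of 30,000, and the names `noisy.data`, `fit`, and `p` are all assumptions made for illustration.

```r
set.seed(42)                                   # make the made-up data reproducible
income.data <- sample(35000:1780000, 100)
noise <- rnorm(100, mean = 0, sd = 30000)      # assumed amount of random variation
noisy.data <- data.frame(income = income.data,
                         housing = 164 + 0.27*income.data + noise)
fit <- lm(housing ~ income, data = noisy.data) # same kind of fit as geom_smooth(method="lm")
print(coef(fit))                               # the slope should land near 0.27
if(requireNamespace("ggplot2", quietly = TRUE)){ # only build the plot if ggplot2 is installed
  library(ggplot2)
  p <- ggplot(noisy.data, aes(income, housing)) +
    geom_point() + geom_smooth(method = "lm")
  # print(p) draws the plot
}
```

With noise added, the points scatter around the fitted line instead of sitting exactly on it, and the estimated slope is close to, but not exactly, 0.27.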

Example: Average Propensity to Consume

data$av <- data$housing/data$income
ggplot(data,aes(income,av)) + geom_line() + ylab("APC")