1 Installing Rmarkdown

install.packages("rmarkdown")  # one-time installation
library(rmarkdown)             # load the package in each session

2 Running Your R code online

3 Data and Regression

Christian Kleiber and Achim Zeileis (2008), Applied Econometrics with R, Springer-Verlag, New York.
http://cran.r-project.org/web/packages/AER/AER.pdf

Example 1: Cigarette Consumption
This example is taken from Baltagi (Section 3.10 Empirical Example)

Let us install the AER package. This can be done using the menu or with the following command:

install.packages("AER")

Once installed, the package is loaded with:

library(AER)

Now let us load the “CigarettesB” data from the AER package:

data("CigarettesB", package = "AER")
names(CigarettesB)

## [1] "packs"  "price"  "income"

CigarettesB

##      packs    price  income
## AL 4.96213  0.20487 4.64039
## AZ 4.66312  0.16640 4.68389
## AR 5.10709  0.23406 4.59435
## CA 4.50449  0.36399 4.88147
## CT 4.66983  0.32149 5.09472
## DE 5.04705  0.21929 4.87087
## DC 4.65637  0.28946 5.05960
## FL 4.80081  0.28733 4.81155
## GA 4.97974  0.12826 4.73299
## ID 4.74902  0.17541 4.64307
## IL 4.81445  0.24806 4.90387
## IN 5.11129  0.08992 4.72916
## IA 4.80857  0.24081 4.74211
## KS 4.79263  0.21642 4.79613
## KY 5.37906 -0.03260 4.64937
## LA 4.98602  0.23856 4.61461
## ME 4.98722  0.29106 4.75501
## MD 4.77751  0.12575 4.94692
## MA 4.73877  0.22613 4.99998
## MI 4.94744  0.23067 4.80620
## MN 4.69589  0.34297 4.81207
## MS 4.93990  0.13638 4.52938
## MO 5.06430  0.08731 4.78189
## MT 4.73313  0.15303 4.70417
## NE 4.77558  0.18907 4.79671
## NV 4.96642  0.32304 4.83816
## NH 5.10990  0.15852 5.00319
## NJ 4.70633  0.30901 5.10268
## NM 4.58107  0.16458 4.58202
## NY 4.66496  0.34701 4.96075
## ND 4.58237  0.18197 4.69163
## OH 4.97952  0.12889 4.75875
## OK 4.72720  0.19554 4.62730
## PA 4.80363  0.22784 4.83516
## RI 4.84693  0.30324 4.84670
## SC 5.07801  0.07944 4.62549
## SD 4.81545  0.13139 4.67747
## TN 5.04939  0.15547 4.72525
## TX 4.65398  0.28196 4.73437
## UT 4.40859  0.19260 4.55586
## VT 5.08799  0.18018 4.77578
## VA 4.93065  0.11818 4.85490
## WA 4.66134  0.35053 4.85645
## WV 4.82454  0.12008 4.56859
## WI 4.83026  0.22954 4.75826
## WY 5.00087  0.10029 4.71169

3.1 Regression

We regress consumption on price using OLS:

cig_lm <- lm(packs ~ price, data = CigarettesB)
summary(cig_lm)

## 
## Call:
## lm(formula = packs ~ price, data = CigarettesB)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.45472 -0.09968  0.00612  0.11553  0.29346 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   5.0941     0.0627  81.247  < 2e-16 ***
## price        -1.1983     0.2818  -4.253 0.000108 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.163 on 44 degrees of freedom
## Multiple R-squared:  0.2913, Adjusted R-squared:  0.2752 
## F-statistic: 18.08 on 1 and 44 DF,  p-value: 0.0001085

3.2 Results in Equation Form

coef(cig_lm)

## (Intercept)       price 
##    5.094108   -1.198316

cig_lm.sum <- summary(cig_lm)
# The summary of our linear model holds a lot of information; extract what we need.
r2  <- cig_lm.sum$r.squared      # R-squared
r2a <- cig_lm.sum$adj.r.squared  # adjusted R-squared
sig <- cig_lm.sum$sigma          # residual standard error

The R-square value for our linear model is $R^2$=0.2912836.

The adjusted R-square is $\bar{R}^2$=0.2751764.
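The adjusted R-square can be recovered from $R^2$ by hand using $\bar{R}^2 = 1 - (1 - R^2)\frac{T-1}{T-k-1}$. The sketch below checks this against `summary()`. It assumes a six-row subset copied from the data listing above, and the names `cig`, `fit`, `T_obs` and `k` are illustrative, so its numbers differ from the full-sample results.

```r
# Six rows copied from the CigarettesB listing above (illustrative subset only)
cig <- data.frame(
  packs = c(4.96213, 4.66312, 5.10709, 4.50449, 4.66983, 5.04705),
  price = c(0.20487, 0.16640, 0.23406, 0.36399, 0.32149, 0.21929),
  row.names = c("AL", "AZ", "AR", "CA", "CT", "DE")
)
fit <- lm(packs ~ price, data = cig)

T_obs <- nrow(cig)  # number of observations
k     <- 1          # number of regressors (excluding the intercept)

r2         <- summary(fit)$r.squared
r2a_manual <- 1 - (1 - r2) * (T_obs - 1) / (T_obs - k - 1)

# TRUE if the hand computation agrees with summary()
print(all.equal(r2a_manual, summary(fit)$adj.r.squared))
```

With the full sample (T = 46, k = 1) the same formula gives 1 - (1 - 0.2912836) × 45/44 = 0.2751764, the value reported above.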

From the coefficient output above, the estimate of the intercept is $\widehat \beta_0= 5.0941081$, and the estimate of the slope is $\widehat \beta_1= -1.1983162$.

Consequently, the line of best fit is
\[ \widehat {Y_{t}} = 5.0941081 -1.1983162 X_{t} \]
However, this reports the estimates to too many decimal places. Rounding to 3 decimal places gives
\[ \widehat {Y_{t}} = 5.094 -1.198 X_{t}. \]
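The rounding can also be done in R rather than by hand. The following is a minimal sketch, assuming a six-row subset copied from the data listing above; the names `cig`, `fit`, `b` and `eq` are illustrative. Replacing `cig` with the full `CigarettesB` data should reproduce the rounded equation above.

```r
# Six rows copied from the CigarettesB listing above (illustrative subset only)
cig <- data.frame(
  packs = c(4.96213, 4.66312, 5.10709, 4.50449, 4.66983, 5.04705),
  price = c(0.20487, 0.16640, 0.23406, 0.36399, 0.32149, 0.21929),
  row.names = c("AL", "AZ", "AR", "CA", "CT", "DE")
)
fit <- lm(packs ~ price, data = cig)

# Round the estimates to 3 decimal places
b <- round(coef(fit), 3)

# Build the fitted equation as text; "%+.3f" always prints the sign of the slope
eq <- sprintf("Y_hat = %.3f %+.3f X", b[1], b[2])
print(eq)
```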

3.3 Plotting Fitted Line

x<-CigarettesB$price
y<-CigarettesB$packs
plot(x,y,pch=19,cex=0.6,xlab='Price',ylab='Consumption (Packs)')
abline(coef(cig_lm),col='red')
title('Line of Best Fit for Cigarette Data')

3.4 Confidence Intervals

confint(cig_lm, level = 0.95)

##                 2.5 %     97.5 %
## (Intercept)  4.967747  5.2204696
## price       -1.766224 -0.6304087
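The interval reported by `confint()` is the estimate plus or minus a critical t value times the standard error. The sketch below rebuilds it by hand and compares the two. It assumes a six-row subset copied from the data listing above, and the names `cig`, `fit`, `tcrit` and `manual_ci` are illustrative.

```r
# Six rows copied from the CigarettesB listing above (illustrative subset only)
cig <- data.frame(
  packs = c(4.96213, 4.66312, 5.10709, 4.50449, 4.66983, 5.04705),
  price = c(0.20487, 0.16640, 0.23406, 0.36399, 0.32149, 0.21929),
  row.names = c("AL", "AZ", "AR", "CA", "CT", "DE")
)
fit <- lm(packs ~ price, data = cig)

est <- coef(summary(fit))[, "Estimate"]
se  <- coef(summary(fit))[, "Std. Error"]

# Two-sided 95% critical value with the residual degrees of freedom
tcrit <- qt(0.975, df = df.residual(fit))

manual_ci <- cbind(lower = est - tcrit * se, upper = est + tcrit * se)
print(manual_ci)
print(confint(fit, level = 0.95))  # should show the same numbers
```

For the full sample, the slope interval above corresponds to -1.1983 ± t × 0.2818 with 44 degrees of freedom, where t is approximately 2.015.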

3.5 Histogram of residuals

hist(resid(cig_lm))

3.6 Saving Fitted Values

CigarettesB$predicted <- predict(cig_lm)

3.7 Saving Residuals

CigarettesB$residuals <- residuals(cig_lm)

3.8 Saving Residuals: Manual

CigarettesB$residual2=CigarettesB$packs - CigarettesB$predicted
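A quick check that the manual calculation agrees with the built-in functions can be sketched as follows. It assumes a six-row subset copied from the data listing above, and the names `cig`, `fit`, `fitted_manual` and `resid_manual` are illustrative.

```r
# Six rows copied from the CigarettesB listing above (illustrative subset only)
cig <- data.frame(
  packs = c(4.96213, 4.66312, 5.10709, 4.50449, 4.66983, 5.04705),
  price = c(0.20487, 0.16640, 0.23406, 0.36399, 0.32149, 0.21929),
  row.names = c("AL", "AZ", "AR", "CA", "CT", "DE")
)
fit <- lm(packs ~ price, data = cig)
b   <- coef(fit)

# Fitted values from the estimated line, residuals as actual minus fitted
fitted_manual <- b[1] + b[2] * cig$price
resid_manual  <- cig$packs - fitted_manual

# Both comparisons should print TRUE
print(all.equal(unname(fitted_manual), unname(predict(fit))))
print(all.equal(unname(resid_manual), unname(residuals(fit))))

# With an intercept in the model, OLS residuals sum to zero (up to rounding error)
print(sum(resid_manual))
```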

3.9 Residuals versus Fitted Plot

3.9.1 Plot 1

plot(CigarettesB$predicted, CigarettesB$residuals,pch=21,bg="red",col="red")
abline(0,0)

3.9.2 Plot 2

plot(CigarettesB$predicted, CigarettesB$residuals,pch=21,bg="red",col="blue")
abline(0,0)

3.9.3 Plot 3

plot(CigarettesB$predicted, CigarettesB$residuals,pch=21,bg="red",col="red")
abline(0,0)

3.9.4 Plot 4

Other characters can be used to specify pch: "+", "*", "-", ".", "#", "%", "o"

plot(CigarettesB$predicted, CigarettesB$residuals, pch="+",bg="red",col="red")
abline(0,0)

4 Writing Using Rmarkdown

4.1 Inline/Display Form

Equations can be formatted inline or as displayed formulas. In inline form, the expression flows within the line of text. In display form, it is centered and set off from the main text.

Inline form is set off by the use of single dollar-sign ($) characters.

This summation expression $\sum_{i=1}^n X_i$ appears inline.

Using an HTML “span” element, we can change the color of an inline expression: $\sum_{i=1}^n X_i$ appears inline.

Display form is set off by the use of double dollar-sign ($$) characters.

This summation expression is in display form: $$\sum_{i=1}^n X_i$$

This summation expression is in display form.

\[\sum_{i=1}^n X_i\]

4.2 Summations Without Indices

$\sum x_{t}$ can be written by:
```
$\sum x_{t}$
```
$\sum x_{t}^2$ can be written by:
```
$\sum x_{t}^2$
```
$\sum x_{t}y_{t}$ can be written by:
```
$\sum x_{t}y_{t}$
```
$\sum X_{t}$ can be written by:
```
$\sum X_{t}$
```
$\sum X_{t}^2$ can be written by:
```
$\sum X_{t}^2$
```
$\sum X_{t}Y_{t}$ can be written by:
```
$\sum X_{t}Y_{t}$
```

4.3 Summations With Indices - Inline Form

$\sum_{t=1}^T x_{t}$ can be written by:
```
$\sum_{t=1}^T x_{t}$
```
$\sum_{t=1}^T x_{t}^2$ can be written by:
```
$\sum_{t=1}^T x_{t}^2$
```
$\sum_{t=1}^T x_{t}y_{t}$ can be written by:
```
$\sum_{t=1}^T x_{t}y_{t}$
```

4.4 Summations With Indices - Display Form

$$\sum_{t=1}^T x_{t}y_{t}$$ can be written by:
```
$$\sum_{t=1}^T x_{t}y_{t}$$
```

4.5 Correlations

${SS}_{XX} = \sum (X - \bar{X})^2 = \sum X^2 - \frac {(\sum X)^2}{T}$ can be written by:
```
${SS}_{XX} = \sum (X - \bar{X})^2 = \sum X^2 - \frac {(\sum X)^2}{T}$
```
${SS}_{XY} = \sum (X - \bar{X})(Y - \bar{Y}) = \sum XY - \frac {(\sum X)(\sum Y)}{T}$ can be written by:
```
${SS}_{XY} = \sum (X - \bar{X})(Y - \bar{Y}) = \sum XY - \frac {(\sum X)(\sum Y)}{T}$
```
$r = \frac {{SS}_{XY}}{\sqrt {{SS}_{XX}{SS}_{YY}}}$ can be written by:
```
$r = \frac {{SS}_{XY}}{\sqrt {{SS}_{XX}{SS}_{YY}}}$
```
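The formulas above translate directly into R. The following sketch computes ${SS}_{XX}$, ${SS}_{YY}$, ${SS}_{XY}$ and $r$ with the sum-based shortcut and compares $r$ with `cor()`. It assumes six price and packs values copied from the data listing in Section 3, and the variable names are illustrative. The sample size is stored in `T_obs` rather than `T`, because `T` is shorthand for `TRUE` in R.

```r
# Six (price, packs) pairs copied from the CigarettesB listing (illustrative subset)
X <- c(0.20487, 0.16640, 0.23406, 0.36399, 0.32149, 0.21929)  # price
Y <- c(4.96213, 4.66312, 5.10709, 4.50449, 4.66983, 5.04705)  # packs
T_obs <- length(X)

# Shortcut forms of the sums of squares and cross products
SS_XX <- sum(X^2)   - sum(X)^2 / T_obs
SS_YY <- sum(Y^2)   - sum(Y)^2 / T_obs
SS_XY <- sum(X * Y) - sum(X) * sum(Y) / T_obs

# Correlation coefficient
r <- SS_XY / sqrt(SS_XX * SS_YY)

print(r)
print(cor(X, Y))  # built-in check, should match r
```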

4.6 Population Regression Function (PRF)

$E(Y) = \alpha + \beta{X}$ can be written by:
```
$E(Y) = \alpha + \beta{X}$
```
$E(Y) = \beta_0 + \beta_1{X}$ can be written by:
```
$E(Y) = \beta_0 + \beta_1{X}$
```
$Y = \beta_0 + \beta_1{X} +u$ can be written by:
```
$Y = \beta_0 + \beta_1{X} +u$
```
$E(Y_t) = \beta_0 + \beta_1{X_t}$ can be written by:
```
$E(Y_t) = \beta_0 + \beta_1{X_t}$
```
$Y_t = \beta_0 + \beta_1{X_t} +u_t$ can be written by:
```
$Y_t = \beta_0 + \beta_1{X_t} +u_t$
```
$\bar{Y} = \beta_0 + \beta_1\bar{X}+\bar{u}$ can be written by:
```
$\bar{Y} = \beta_0 + \beta_1\bar{X}+\bar{u}$
```
$var(Y) = \sigma^2$ can be written by:
```
$var(Y) = \sigma^2$
```
$\widehat{var(Y)} = \widehat{\sigma}^2$ can be written by:
```
$\widehat{var(Y)} = \widehat{\sigma}^2$
```

4.7 Sample Regression Function (SRF)

$\widehat{Y_{t}} = \widehat{\beta_0} + \widehat{\beta_1}X_{t}$ can be written by:
```
$\widehat{Y_{t}} = \widehat{\beta_0} + \widehat{\beta_1}X_{t}$
```

4.8 Residual Sum of Squares

$SSR = \sum (Y_t - \hat{Y_t})^2$ can be written by:
```
$SSR = \sum (Y_t - \hat{Y_t})^2$
```

4.9 Standard Errors

$\widehat\sigma = \sqrt{\frac {SSR}{T - k -1}}$ can be written by:

$\widehat\sigma = \sqrt{\frac {SSR}{T - k -1}}$
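The residual sum of squares and the standard error of the regression can be computed from these two formulas and checked against `summary()`. This is a minimal sketch, assuming a six-row subset copied from the data listing in Section 3; the names `cig`, `fit`, `T_obs`, `k` and `sigma_hat` are illustrative.

```r
# Six rows copied from the CigarettesB listing (illustrative subset only)
cig <- data.frame(
  packs = c(4.96213, 4.66312, 5.10709, 4.50449, 4.66983, 5.04705),
  price = c(0.20487, 0.16640, 0.23406, 0.36399, 0.32149, 0.21929),
  row.names = c("AL", "AZ", "AR", "CA", "CT", "DE")
)
fit <- lm(packs ~ price, data = cig)

# SSR: sum of squared differences between actual and fitted values
SSR <- sum((cig$packs - fitted(fit))^2)

T_obs <- nrow(cig)  # number of observations
k     <- 1          # number of regressors (excluding the intercept)

sigma_hat <- sqrt(SSR / (T_obs - k - 1))

print(sigma_hat)
print(summary(fit)$sigma)  # "Residual standard error" in the summary output
```

In the full-sample output of Section 3.1, T = 46 and k = 1, which gives the 44 degrees of freedom reported next to the residual standard error.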

4.10 Square Roots

$\sqrt{b^2 - 4ac}$ can be written

$\sqrt{b^2 - 4ac}$

4.11 Fractions

$\frac{4z^3}{16}$ can be written

$\frac{4z^3}{16}$

4.12 Self-Sizing Parentheses

$\sum_{i=1}^{n}\left( \frac{X_i}{Y_i} \right)$ can be written

$\sum_{i=1}^{n}\left( \frac{X_i}{Y_i} \right)$

4.13 Greek Letters

Both upper and lower case versions are available for some letters.

$\alpha, \beta, \gamma, \Gamma$ can be written

$\alpha, \beta,  \gamma, \Gamma$

4.14 Brief Mathematical Notation

$\{1, 2, 3\}$ can be written

$\{1, 2, 3\}$

$\binom{n}{k}$ can be written

$\binom{n}{k}$

$\frac{a}{b}$ can be written

$\frac{a}{b}$

$\lim_{x \to \infty} f(x)$ can be written

$\lim_{x \to \infty} f(x)$

$\hat{x}$ can be written

$\hat{x}$

$\int_{a}^{b}$ can be written

$\int_{a}^{b}$

$\left(\int_{a}^{b} f(x) \; dx\right)$ can be written

$\left(\int_{a}^{b} f(x) \; dx\right)$

$\left. F(x) \right|_{a}^{b}$ can be written

$\left. F(x) \right|_{a}^{b}$

$\left[\int_{-\infty}^{\infty} f(x) \; dx\right]$ can be written

$\left[\int_{-\infty}^{\infty} f(x) \; dx\right]$

$\log(x)$ can be written

$\log(x)$

$\mathrm{P}(A \mid B)$ can be written

$\mathrm{P}(A \mid B)$

$\mathrm{P}(X \le x) = {\tt pbinom}(x, n, \pi)$ can be written

$\mathrm{P}(X \le x) = {\tt pbinom}(x, n, \pi)$

$\overline{x}$ can be written

$\overline{x}$

$\prod_{x = a}^{b} f(x)$ can be written

$\prod_{x = a}^{b} f(x)$

$\sin(x)$ can be written

$\sin(x)$

$\sum_{x = a}^{b} f(x)$ can be written

$\sum_{x = a}^{b} f(x)$

$\tilde{x}$ can be written

$\tilde{x}$

$|A|$ can be written

$|A|$

$A \cap B$ can be written

$A \cap B$

$A \cup B$ can be written

$A \cup B$

$P(A \mid B)$ can be written

$P(A \mid B)$

$x \ge y$ can be written

$x \ge y$

$x \in A$ can be written

$x \in A$

$x \le y$ can be written

$x \le y$

$X \sim {\sf Binom}(n, \pi)$ can be written

$X \sim {\sf Binom}(n, \pi)$

$x \subset B$ can be written

$x \subset B$

$x \subseteq B$ can be written

$x \subseteq B$

$x < y$ can be written

$x < y$

$x = y$ can be written

$x = y$

$x > y$ can be written

$x > y$

$x^{n}$ can be written

$x^{n}$

$x_{1} + x_{2} + \cdots + x_{n}$ can be written

$x_{1} + x_{2} + \cdots + x_{n}$

$x_{1}, x_{2}, \dots, x_{n}$ can be written

$x_{1}, x_{2}, \dots, x_{n}$

$x_{n}$ can be written

$x_{n}$

$\sim$ (Distributed) can be written

$\sim$

4.15 Special Symbols

$a \pm b$ can be written

$a \pm b$

$x \ge 15$ can be written

$x \ge 15$

$a_i \ge 0~~~\forall i$ can be written

$a_i \ge 0~~~\forall i$

4.16 Matrices

Matrices are presented in the array environment. One begins with the statement

\begin{array}

and ends with the statement

\end{array}

Following the opening statement, a format code is used to indicate the formatting of each column. In the example below, we use the code

{rrr}

to indicate that each column is right justified. Each row is then entered, with cells separated by the

&

symbol, and each line (except the last) terminated by

\\

Example 1. The matrix without brackets given below: \[\begin{array} {rrr} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{array}\] can be written

$$\begin{array}
{rrr}
1 & 2 & 3 \\
4 & 5 & 6 \\
7 & 8 & 9
\end{array}$$

Example 2. Matrix letters in boldface. For boldface, use

\mathbf

Hence \[\mathbf{X}\] can be written as

$$\mathbf{X}$$

Example 3. The matrix with brackets given below: \[\mathbf{X} = \left[\begin{array} {rrr} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{array}\right]\] can be written

$$\mathbf{X} = \left[\begin{array}
{rrr}
1 & 2 & 3 \\
4 & 5 & 6 \\
7 & 8 & 9
\end{array}\right]$$
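When the matrix already exists in R, the rows of the array body can be generated instead of typed. The sketch below builds the same 3 × 3 matrix and prints the corresponding array source. The names `X`, `rows` and `body` are illustrative, and the approach is one possible convenience, not part of the array syntax itself.

```r
# Build the matrix row by row: 1 2 3 / 4 5 6 / 7 8 9
X <- matrix(1:9, nrow = 3, byrow = TRUE)
print(X)

# Join the cells of each row with " & "
rows <- apply(X, 1, paste, collapse = " & ")

# Join the rows with a LaTeX line break (two backslashes) and a newline
body <- paste(rows, collapse = " \\\\\n")

# Print the array source, ready to paste between $$ ... $$
cat("\\begin{array}{rrr}\n", body, "\n\\end{array}\n", sep = "")
```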

4.17 Simple Tables

Simple tables look like this (Notice we don’t use dollar signs or anything, just a blank line above and below the table):

Right	Left	Center	Default
12	12	hmmm	12
123	123	123	123
1	1	1	1

can be written


  Right Left     Center   Default
------- ------ ---------- ------- 
     12 12        hmmm        12 
    123 123        123       123 
      1 1            1         1

The headers and table rows must each fit on one line. Column alignments are determined by the position of the header text relative to the dashed line below it.

If the dashed line is flush with the header text on the right side but extends beyond it on the left, the column is right-aligned. If the dashed line is flush with the header text on the left side but extends beyond it on the right, the column is left-aligned. If the dashed line extends beyond the header text on both sides, the column is centered. If the dashed line is flush with the header text on both sides, the default alignment is used (in most cases, this will be left). The table must end with a blank line, or a line of dashes followed by a blank line.

4.18 Greek Letters

$\alpha A$ can be written

$\alpha A$

$\nu N$ can be written

$\nu N$

$\beta B$ can be written

$\beta B$

$\xi\Xi$ can be written

$\xi\Xi$

$\gamma \Gamma$ can be written

$\gamma \Gamma$

$o O$ (omicron) can be written

$o O$

$\delta \Delta$ can be written

$\delta \Delta$

$\pi \Pi$ can be written

$\pi \Pi$

$\epsilon \varepsilon E$ can be written

$\epsilon \varepsilon E$

$\rho\varrho P$ can be written

$\rho\varrho P$

$\zeta Z$ can be written

$\zeta Z$

$\sigma \Sigma$ can be written

$\sigma \Sigma$

$\eta H$ can be written

$\eta H$

$\tau T$ can be written

$\tau T$

$\theta \vartheta \Theta$ can be written

$\theta \vartheta \Theta$

$\upsilon \Upsilon$ can be written

$\upsilon \Upsilon$

$\iota I$ can be written

$\iota I$

$\phi \varphi \Phi$ can be written

$\phi \varphi \Phi$

$\kappa K$ can be written

$\kappa K$

$\chi X$ can be written

$\chi X$

$\lambda \Lambda$ can be written

$\lambda \Lambda$

$\psi \Psi$ can be written

$\psi \Psi$

$\mu M$ can be written

$\mu M$

$\omega \Omega$ can be written

$\omega \Omega$

4.19 Text with Calculation

Two plus two equals 4
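A sentence like the one above is typically produced with inline R code, which knitr evaluates when the document is rendered. The exact source of this sentence is not shown here, so the inline expression in the comment below is an assumption; the plain R lines perform the same substitution by hand, with `answer` and `sentence` as illustrative names.

```r
# Assumed .Rmd source for the sentence above (inline R code sits between
# single backticks and starts with the letter r):
#   Two plus two equals `r 2 + 2`
# knitr evaluates the expression and substitutes its value into the text.

# The same substitution, done by hand in plain R:
answer   <- 2 + 2
sentence <- paste("Two plus two equals", answer)
print(sentence)
```

The same mechanism can insert model results into text, for example a rounded coefficient, so that reported numbers stay in sync with the code.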

Econometrics with R - Presentation 1

Prof. Dr. Ozan ERUYGUR, AHBV University, Department of Economics

September 12, 2019