Slides are available at the following link:

http://rpubs.com/sallychen/437154

What Are R & RStudio

R Project: Powerful support for project management

First step: Create an R project

RStudio Panes

Install and Load R packages

install.packages("readxl") # package to read Excel files in R
install.packages("AER")    # package for Applied Econometrics with R
# load useful libraries
library(AER)
library(readxl)
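
Running install.packages() every time a script is sourced re-downloads packages that are already present. A minimal sketch of one way to avoid that is shown below; the helper names missing_packages and ensure_packages are hypothetical and not part of any package.

```r
# Hypothetical helper: return the entries of `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Hypothetical helper: install whatever is missing, then attach everything.
ensure_packages <- function(pkgs) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  invisible(lapply(pkgs, library, character.only = TRUE))
}

# Packages that ship with R are always present, so none are reported missing.
missing_packages(c("stats", "utils"))
```

Calling ensure_packages(c("readxl", "AER")) at the top of a script would then install only what is missing before loading both packages.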

Download and Import data

Ritvars <- read_excel("Ritvars.xls")
head(Ritvars)
##   ID TEST1 TEST2 IMPROVE DOSAGE DRUGDUM FEMALE AGE INTERVAL
## 1  1    75   100      25  0.452       1      0 108    0.592
## 2  2    80    80       0  0.550       1      1  90    0.329
## 3  3    80    70     -10  0.508       1      1 108    0.362
## 4  4    80    90      10  0.478       1      0 138    0.592
## 5  5    75    75       0  0.423       1      0  87    0.822
## 6  6    90   100      10  0.452       1      0 132    0.690
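
After importing, it can help to sanity-check the columns before modelling. The sketch below rebuilds only the six rows printed above (not the full file) and checks that IMPROVE is the difference between the two test scores. That this relation holds for every row of Ritvars.xls is an assumption.

```r
# First six rows of Ritvars, copied from the head() output above.
ritvars_head <- data.frame(
  ID      = 1:6,
  TEST1   = c(75, 80, 80, 80, 75, 90),
  TEST2   = c(100, 80, 70, 90, 75, 100),
  IMPROVE = c(25, 0, -10, 10, 0, 10)
)

# Check that IMPROVE is the change in score between the two tests.
improve_matches <- all(ritvars_head$IMPROVE ==
                         ritvars_head$TEST2 - ritvars_head$TEST1)
improve_matches

# str() reports each column's type; summary() gives per-column ranges and means.
str(ritvars_head)
summary(ritvars_head)
```

The same str() and summary() calls can be applied to the full Ritvars data frame once it has been read in.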

OLS Regression

\[Improvement = \beta_0 + \beta_1 Dosage + \beta_2 Female + \beta_3 Age + \varepsilon\]

help(lm)
treatment <- lm(IMPROVE ~ DOSAGE + FEMALE + AGE, data = Ritvars)
summary(treatment)
## 
## Call:
## lm(formula = IMPROVE ~ DOSAGE + FEMALE + AGE, data = Ritvars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -20.758  -8.659  -1.263   5.896  42.687 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept) -9.37188    9.44956  -0.992   0.3253  
## DOSAGE      11.28543    5.78582   1.951   0.0558 .
## FEMALE      -4.45350    3.44881  -1.291   0.2015  
## AGE          0.10357    0.09046   1.145   0.2568  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 11.91 on 60 degrees of freedom
## Multiple R-squared:  0.1049, Adjusted R-squared:  0.0601 
## F-statistic: 2.343 on 3 and 60 DF,  p-value: 0.08207
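
To see where the t value and p-value columns come from, the DOSAGE row can be recomputed by hand. This is a small sketch that uses only the numbers printed above; the variable names are chosen for illustration.

```r
# Values copied from the DOSAGE row and the residual degrees of freedom above.
estimate  <- 11.28543
std_error <- 5.78582
df_resid  <- 60

# t statistic for the null hypothesis that the coefficient is zero.
t_value <- estimate / std_error

# Two-sided p-value from the t distribution.
p_value <- 2 * pt(abs(t_value), df = df_resid, lower.tail = FALSE)

# 95% confidence interval for the coefficient.
ci <- estimate + c(-1, 1) * qt(0.975, df = df_resid) * std_error

round(c(t = t_value, p = p_value), 4)
round(ci, 3)
```

Up to rounding, this reproduces the 1.951 and 0.0558 reported by summary(). The interval contains zero, which is consistent with a p-value just above 0.05. On the fitted model itself, confint(treatment) returns such intervals for all coefficients.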

IV Regression

data("CollegeDistance")
cd <- CollegeDistance
help(CollegeDistance)
head(cd)
##   gender ethnicity score fcollege mcollege home urban unemp wage distance
## 1   male     other 39.15      yes       no  yes   yes   6.2 8.09      0.2
## 2 female     other 48.87       no       no  yes   yes   6.2 8.09      0.2
## 3   male     other 48.74       no       no  yes   yes   6.2 8.09      0.2
## 4   male      afam 40.40       no       no  yes   yes   6.2 8.09      0.2
## 5 female     other 40.48       no       no   no   yes   5.6 8.09      0.4
## 6   male     other 54.71       no       no  yes   yes   5.6 8.09      0.4
##   tuition education income region
## 1 0.88915        12   high  other
## 2 0.88915        12    low  other
## 3 0.88915        12    low  other
## 4 0.88915        12    low  other
## 5 0.88915        13    low  other
## 6 0.88915        12    low  other
ols_wage <- lm(wage ~ urban + gender + education + unemp, data = cd)
summary(ols_wage)
## 
## Call:
## lm(formula = wage ~ urban + gender + education + unemp, data = cd)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.1645 -0.8311  0.1562  0.7643  3.7193 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   8.297396   0.157930  52.538   <2e-16 ***
## urbanyes     -0.058977   0.044512  -1.325   0.1852    
## genderfemale -0.093668   0.037772  -2.480   0.0132 *  
## education     0.020404   0.010505   1.942   0.0522 .  
## unemp         0.129851   0.006812  19.063   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.293 on 4734 degrees of freedom
## Multiple R-squared:  0.07348,    Adjusted R-squared:  0.07269 
## F-statistic: 93.85 on 4 and 4734 DF,  p-value: < 2.2e-16
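# education is treated as endogenous and instrumented by distance;
# the exogenous regressors appear on both sides of the |
iv_wage <- ivreg(wage ~ urban + gender + unemp + education |
                   urban + gender + unemp + distance, data = cd)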
summary(iv_wage)
## 
## Call:
## ivreg(formula = wage ~ urban + gender + unemp + education | urban + 
##     gender + unemp + distance, data = cd)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -5.24048 -1.17602 -0.01759  1.32727  4.91374 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -0.821737   1.939098  -0.424    0.672    
## urbanyes     -0.014900   0.060798  -0.245    0.806    
## genderfemale -0.071329   0.051201  -1.393    0.164    
## unemp         0.136343   0.009296  14.667  < 2e-16 ***
## education     0.675636   0.139208   4.853 1.25e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.746 on 4734 degrees of freedom
## Multiple R-Squared: -0.6879, Adjusted R-squared: -0.6894 
## Wald test: 56.89 on 4 and 4734 DF,  p-value: < 2.2e-16
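
The two-stage least squares idea behind ivreg() can be illustrated with two lm() calls. The sketch below uses simulated data rather than CollegeDistance so that the true coefficient is known. All variable names and parameter values are assumptions made for illustration.

```r
set.seed(1)
n <- 5000
z <- rnorm(n)                      # instrument: affects x, not y directly
u <- rnorm(n)                      # unobserved confounder
x <- z + u + rnorm(n)              # endogenous regressor
y <- 1 + 2 * x + 3 * u + rnorm(n)  # true effect of x on y is 2

# OLS is biased because x is correlated with the omitted u.
b_ols <- coef(lm(y ~ x))[["x"]]

# Stage 1: project the endogenous regressor on the instrument.
x_hat <- fitted(lm(x ~ z))

# Stage 2: regress the outcome on the fitted values from stage 1.
b_2sls <- coef(lm(y ~ x_hat))[["x_hat"]]

# With one regressor and one instrument, 2SLS equals cov(z, y) / cov(z, x).
b_ratio <- cov(z, y) / cov(z, x)

round(c(ols = b_ols, two_stage = b_2sls, ratio = b_ratio), 3)
```

The manual second stage recovers the coefficient but not the correct standard errors, because its residuals are computed from x_hat instead of x. ivreg() applies that correction, which is one reason to prefer it in practice.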

Useful Resources for R

http://rpubs.com/sallychen/413493

http://rpubs.com/sallychen/413763