Shige
10/15/2013
The goal of a good workflow is to let you get your work done as efficiently as possible. More theoretical discussion can be found in this paper by Kieran Healy.
Please share …
What do you gain:
You may ask:


Let's pause and make sure you have installed all the packages we need by:
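A minimal sketch of such a check, assuming the packages needed are the three loaded with library() in these slides (Zelig, texreg, and foreign). It only reports what is missing; the install line is left commented out because, depending on your R version and repository, some packages may have to come from another source.

```r
# Packages used in this session (taken from the library() calls in these slides).
needed <- c("Zelig", "texreg", "foreign")

# requireNamespace() returns FALSE, rather than raising an error,
# when a package is not installed.
is_installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- needed[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
  # Uncomment to install from your configured repository:
  # install.packages(missing_pkgs)
} else {
  message("All packages are installed.")
}
```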
This is good for simple calculation and exploratory analysis.
We can go to the “Environment” tab to find more about this data set.
Run “demo1.R”
zelig(depvar ~ indvars, data=dataname, model="modelname")
texreg(list(model1, model2, model3))
A notebook is a report file that combines R program code with the output that code generates. With RStudio, you don't need to do anything special: just click “File/Compile Notebook”.
What are the main weaknesses of a notebook?
That is the complete syntax!
Let's create a simple R Markdown report of what we just did together.
library(Zelig)
library(texreg)
data(turnout)
v.model1 <- zelig(vote ~ race + age, data = turnout, model = "logit")
v.model2 <- zelig(vote ~ race + age + educate, data = turnout, model = "logit")
v.model3 <- zelig(vote ~ race + age + educate + income, data = turnout, model = "logit")
result <- htmlreg(list(v.model1, v.model2, v.model3))
| | Model 1 | Model 2 | Model 3 |
|---|---|---|---|
| (Intercept) | 0.04 (0.18) | -3.05 (0.33)*** | -3.03 (0.33)*** |
| racewhite | 0.65 (0.13)*** | 0.38 (0.14)** | 0.25 (0.15) |
| age | 0.01 (0.00)*** | 0.03 (0.00)*** | 0.03 (0.00)*** |
| educate | | 0.22 (0.02)*** | 0.18 (0.02)*** |
| income | | | 0.18 (0.03)*** |
| AIC | 2234.82 | 2080.03 | 2033.98 |
| BIC | 2251.62 | 2102.43 | 2061.99 |
| Log Likelihood | -1114.41 | -1036.01 | -1011.99 |
| Deviance | 2228.82 | 2072.03 | 2023.98 |
| Num. obs. | 2000 | 2000 | 2000 |
| ***p < 0.001, **p < 0.01, *p < 0.05 | | | |
There are a number of ways to use your own data, which is likely to be stored as a Stata, SPSS, or SAS data set, including:
library(foreign)
binreg <- read.dta("http://www.stata-press.com/data/r12/binreg.dta")
names(binreg)
[1] "cat" "d" "n" "alc" "smo" "soc"
library(Zelig)
z.out <- zelig(cbind(d, (n-d)) ~ alc + smo + soc, data=binreg, model="logit")
| | Model 1 |
|---|---|
| (Intercept) | -3.81 (0.44)*** |
| alc | 0.37 (0.13)** |
| smo | 0.56 (0.24)* |
| soc | 0.18 (0.14) |
| AIC | 77.44 |
| BIC | 81.00 |
| Log Likelihood | -34.72 |
| Deviance | 14.84 |
| Num. obs. | 18 |
| ***p < 0.001, **p < 0.01, *p < 0.05 | |
* webuse binreg
* glm n_lbw_babies alcohol smokes social, family(binomial n_women ) link(logit)
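The cbind(d, (n-d)) response used above is the grouped-data form of a logit model: each row is a group, with a count of events and a count of non-events. As a rough sketch of the same idea in base R's glm(), here is a small example on made-up data. The data frame toy and its column exposed are hypothetical and are not taken from binreg.dta.

```r
# Hypothetical grouped data: each row is a group, not an individual.
toy <- data.frame(
  exposed = c(0, 0, 1, 1),      # hypothetical group-level covariate
  n       = c(50, 60, 40, 55),  # number of individuals in the group
  d       = c(5, 9, 10, 16)     # number of events observed in the group
)

# Two-column response: events first, non-events second.
fit <- glm(cbind(d, n - d) ~ exposed,
           family = binomial(link = "logit"),
           data = toy)

coef(fit)  # intercept and slope on the log-odds scale
nobs(fit)  # counts rows (groups), not individuals
```

This is also why a grouped model reports a small number of observations: the count is the number of rows in the data set, not the number of individuals they summarize.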
library(foreign)
dbox <- read.dta("https://dl.dropboxusercontent.com/u/211468568/class_survey.dta")
Let's try to run the code and see what's in that data.
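A few base R functions are usually enough for a first look at an imported data set. The sketch below uses a small made-up data frame written to a temporary Stata file, so that it runs without network access; the object survey and its columns are invented for illustration and do not describe the class survey file.

```r
library(foreign)

# Hypothetical stand-in for a survey; column names are invented for illustration.
survey <- data.frame(
  id     = 1:3,
  age    = c(21, 34, 28),
  gender = factor(c("female", "male", "female"))
)

# Round-trip through a temporary Stata file so the example runs offline.
path <- tempfile(fileext = ".dta")
write.dta(survey, path)
survey_back <- read.dta(path)

names(survey_back)    # variable names
str(survey_back)      # variable types and the first few values
summary(survey_back)  # per-variable summaries
```

The same three calls can be applied to any data frame returned by read.dta, including dbox above.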
Muenchen and Hilbe (2010) list eight reasons why Stata users should learn R:
I would like to add the following:
I want you to try the following: