150612: Note first that for the major manipulation functions such as summarise, as well as for do, the first argument is named .data, not data. do mainly makes sense with grouped data frames. Inside do, the dot (.) refers to the current group, i.e. the data frame filtered for that group. The output of do has to be a data frame when the expression is unnamed; a named expression may return an arbitrary object, which is then stored in a list column.
by_cyl <- group_by(mtcars, cyl)
summarise(.data = by_cyl, Mean = mean(disp))
## Source: local data frame [3 x 2]
##
## cyl Mean
## 1 4 105.1364
## 2 6 183.3143
## 3 8 353.1000
# summarise evaluates disp within each group, so bare column names work
do(.data = by_cyl, data.frame(Mean = mean(.$disp)))
## Source: local data frame [3 x 2]
## Groups: cyl
##
## cyl Mean
## 1 4 105.1364
## 2 6 183.3143
## 3 8 353.1000
# the dot stands for the data frame defined by the current group; you have to
# return a data frame, to which do, interestingly, adds the group column
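
What the do() call above does per group can be approximated in base R with split, lapply and rbind. This is only a sketch of the idea, not how dplyr implements it; the names pieces, per_group and result are made up for illustration.

```r
# Base-R approximation of do(by_cyl, data.frame(Mean = mean(.$disp)))
pieces <- split(mtcars, mtcars$cyl)          # one data frame per cyl value

per_group <- lapply(pieces, function(d) {    # d plays the role of the dot
  data.frame(Mean = mean(d$disp))
})

# Bind the per-group results and add the grouping column back
result <- do.call(rbind, per_group)
result <- cbind(cyl = as.numeric(names(per_group)), result)
rownames(result) <- NULL
result
```

The result should hold the same three group means as the summarise and do output above, with cyl as the first column.
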
# the following is cool: a named argument gives a list as a column of a data frame
models <- by_cyl %>% do(mod = lm(mpg ~ disp, data = .))
models
## Source: local data frame [3 x 2]
## Groups: <by row>
##
## cyl mod
## 1 4 <S3:lm>
## 2 6 <S3:lm>
## 3 8 <S3:lm>
summarise(models, rsq = summary(mod)$r.squared)
## Source: local data frame [3 x 1]
##
## rsq
## 1 0.64840514
## 2 0.01062604
## 3 0.27015777
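
Since mod is an ordinary list column, the same R-squared values can also be pulled out without summarise. A minimal sketch, assuming a dplyr version that still provides do() (later releases mark it as superseded); the name rsq is just for illustration.

```r
library(dplyr)

by_cyl <- group_by(mtcars, cyl)
models <- by_cyl %>% do(mod = lm(mpg ~ disp, data = .))

# models$mod is a plain list holding one lm object per group
class(models$mod)

# Apply any function to each model, here the R-squared from summary()
rsq <- sapply(models$mod, function(m) summary(m)$r.squared)
rsq
```
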
models %>% do(data.frame(var = names(coef(.$mod)), coef(summary(.$mod))))
## Source: local data frame [6 x 5]
## Groups: <by row>
##
## var Estimate Std..Error t.value Pr...t..
## 1 (Intercept) 40.871955322 3.589605400 11.3861973 1.202715e-06
## 2 disp -0.135141815 0.033171608 -4.0740206 2.782827e-03
## 3 (Intercept) 19.081987419 2.913992892 6.5483988 1.243968e-03
## 4 disp 0.003605119 0.015557115 0.2317344 8.259297e-01
## 5 (Intercept) 22.032798914 3.345241115 6.5863112 2.588765e-05
## 6 disp -0.019634095 0.009315926 -2.1075838 5.677488e-02
There is more good stuff to learn in the examples on the do() help page (?do).
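
One use worth trying is an unnamed expression that returns several rows per group. The sketch below assumes a dplyr version that still provides do(); the name top2 is made up for illustration.

```r
library(dplyr)

by_cyl <- group_by(mtcars, cyl)

# Unnamed expression returning a data frame: the first two rows of each group
top2 <- by_cyl %>% do(head(., 2))
top2
```

Because the expression is unnamed and returns a data frame, do binds the per-group pieces into one data frame rather than creating a list column.
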