S Legrand
2014.07.30
To evaluate a character string as R code, the usual approach is eval(parse(text=myString)). For example:
eval(parse(text="1+2"))
[1] 3
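As a small sketch of the same idea, eval also accepts a second argument that supplies the variables used by the parsed code. The name my_string and the value given to x are made up for illustration.

```r
# Hypothetical input string; x is not assumed to exist in the workspace.
my_string <- "x + 1"

# eval() can look up variables in a list (or environment) passed as envir.
result <- eval(parse(text = my_string), envir = list(x = 2))
result
```

Passing the data explicitly keeps the evaluated string from depending on whatever happens to be defined globally.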
But this raises the question: what does parse return?
A quick check shows that parse returns an object called an expression:
ex1<-parse(text="1+2")
mode(ex1)
[1] "expression"
Invoking help on expression reveals other ways to create an expression
#directly
expression(1+2)
expression(1 + 2)
#using as.expression
cl<-quote(1+2)
as.expression(cl)
expression(1 + 2)
What we want to know is what's inside: what are expressions made of?
We start by dissecting an expression with the [[]] operator:
ex<-expression(1+2)
length(ex)
[1] 1
ex[[1]]
1 + 2
mode(ex[[1]])
[1] "call"
cl<-call("+",1,2)
ex.cl<-list(cl)
mode(ex.cl)<-"expression"
identical(ex.cl, expression(1+2))
[1] TRUE
So an expression is a specialized list with mode expression.
Technically an expression is its own basic type (typeof returns "expression"), and assigning the mode above coerces the list, but it's easiest to think of an expression as a specialized list.
The building blocks of an expression can be any of the following: a call, a name (symbol), or a constant such as a number.
ex1<-expression(1+2)
mode(ex1[[1]])
[1] "call"
ex2<-expression(x)
mode(ex2[[1]])
[1] "name"
ex3<-expression(3)
mode(ex3[[1]])
[1] "numeric"
c(is.expression(expression(1+2)),
is.call(expression(1+2)[[1]]),
is.symbol(expression(x)[[1]]),
is.numeric(expression(3)[[1]]))
[1] TRUE TRUE TRUE TRUE
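Because an expression behaves like a list, the same check can be sketched for all components at once. The three-component expression below is an illustrative combination of the examples above.

```r
ex <- expression(1 + 2, x, 3)

# as.list() exposes the components; mode() is then applied to each one.
modes <- vapply(as.list(ex), mode, character(1))
modes
```

This should report one mode per component: a call, a name, and a numeric constant.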
Expressions can contain multiple components. For example, the following expression contains 2 calls:
ex1<-expression(x<-1,x+2)
ex1
expression(x <- 1, x + 2)
ex2<-parse(text="x<-1;x+2")
ex2
expression(x <- 1, x + 2)
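A brief sketch of what happens when such a multi-component expression is evaluated: the components run in order and the value of the last one is returned. The scratch environment env is an addition for illustration, used so that no global x is created.

```r
ex <- expression(x <- 1, x + 2)

env <- new.env()          # scratch environment; the workspace is left untouched
value <- eval(ex, env)    # evaluates x <- 1, then x + 2
value
get("x", envir = env)     # the assignment happened inside env
```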
A call can be created in several ways:
parse(text="1+2")[[1]]
1 + 2
call("+",1,2)
1 + 2
quote(1+2)
1 + 2
as.call( list(as.name("+"), 1,2))
1 + 2
substitute(1+2)
1 + 2
fn<-function(){1+2}
body(fn)[[2]]
1 + 2
list(as.name("+"),1,2)->cl
mode(cl)<-"call"
identical(cl, call("+",1,2))
[1] TRUE
Calls are specialized lists with mode call.
Technically a call is its own language type (typeof returns "language"), and assigning the mode above coerces the list. But as with expressions, the most convenient way to think of a call is as some kind of specialized list.
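The list-like view of a call can be checked directly. This is a minimal sketch using only base R conversions.

```r
cl <- quote(1 + 2)
length(cl)               # the function name plus two arguments

parts <- as.list(cl)     # a plain list: the symbol `+`, then 1 and 2
parts

# Converting the list back with as.call() rebuilds the call.
identical(as.call(parts), cl)
```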
Calls can also be evaluated using eval
cl1<-parse(text="1+2")[[1]]
eval(cl1)
[1] 3
We can dissect a call and examine its components using the [[]] operator:
cl1[[1]]
`+`
cl1[[2]]
[1] 1
cl1[[3]]
[1] 2
cl1<-quote(1+2)
cl1
1 + 2
fn<-function(x,y){
1+x^y
}
cl1[[1]]<-as.name("fn")
cl1 # Replaced + with fn
fn(1, 2)
eval(cl1)
[1] 2
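Arguments can be replaced in the same way as the function name. In this sketch the variable a is illustrative and is supplied only at evaluation time.

```r
cl <- quote(1 + 2)
cl[[3]] <- as.name("a")          # the call is now 1 + a
cl

# Supply a value for 'a' through the envir argument of eval().
res <- eval(cl, list(a = 10))
res
```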
cl<-quote(1*2+3)
cl
1 * 2 + 3
Note: we now have both * and + inside our call
cl[[1]]
`+`
cl[[2]]
1 * 2
cl[[2]][[1]]
`*`
cl[[2]][[2]]
[1] 1
cl[[2]][[3]]
[1] 2
A call can be viewed as a tree. A non-terminal node is a call with arguments. Ex: call('+',1,2)
A terminal node is a constant value or a name.
We call this tree an abstract syntax tree (AST).
Accessor | Position | Role | Description |
---|---|---|---|
cl | root | Non-terminal | (+, 1 * 2, 3) |
cl[[1]] | label of root | Label | + |
cl[[2]] | 1st child of root | Non-terminal | (*, 1, 2) |
cl[[3]] | 2nd child of root | Terminal | 3 |
cl[[2]][[1]] | label of cl[[2]] | Label | * |
cl[[2]][[2]] | 1st child of cl[[2]] | Terminal | 1 |
cl[[2]][[3]] | 2nd child of cl[[2]] | Terminal | 2 |
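The walk summarized in the table can be sketched as a short recursive function in base R. The helper name ast_lines and its output layout are hypothetical, chosen only for illustration.

```r
# Hypothetical helper: returns one string per AST node, indented by depth.
ast_lines <- function(node, depth = 0) {
  pad <- strrep("  ", depth)
  if (is.call(node)) {
    # Non-terminal: emit a marker, then recurse into the label and the arguments.
    c(paste0(pad, "()"),
      unlist(lapply(as.list(node), ast_lines, depth = depth + 1)))
  } else if (is.name(node)) {
    # Label or terminal name.
    paste0(pad, as.character(node))
  } else {
    # Terminal constant.
    paste0(pad, deparse(node))
  }
}

cl <- quote(1 * 2 + 3)
writeLines(ast_lines(cl))
```

Each recursion step corresponds to one more [[ ]] in the accessor column of the table.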
The package pryr makes it a little easier to see this structure
library(pryr)
call_tree(cl)
\- ()
  \- `+
  \- ()
    \- `*
    \- 1
    \- 2
  \- 3
Here we used the package diagram to do our rendering :)
We can even manipulate the tree to rearrange precedence!
cl.orig<-parse(text="1*2+3")[[1]]
cl.orig
1 * 2 + 3
tmp<-cl.orig
cl.mod<-cl.orig[[2]]
cl.mod[[3]]->tmp[[2]]
cl.mod[[3]]<-tmp
cl.mod
1 * (2 + 3)
call_tree(cl.orig)
\- ()
  \- `+
  \- ()
    \- `*
    \- 1
    \- 2
  \- 3
call_tree(cl.mod)
\- ()
  \- `*
  \- 1
  \- ()
    \- `+
    \- 2
    \- 3
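Since 1 * 2 + 3 and 1 * (2 + 3) both evaluate to 5, the effect of the rearrangement is easier to see with other constants. The sketch below repeats the same steps on 2 * 3 + 4; the variable names are illustrative.

```r
orig <- quote(2 * 3 + 4)

mod <- orig[[2]]          # the subtree 2 * 3
tmp <- orig               # working copy of the + call
tmp[[2]] <- mod[[3]]      # tmp is now 3 + 4
mod[[3]] <- tmp           # mod is now 2 * (3 + 4)

eval(orig)                # multiplication first
eval(mod)                 # addition first
```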
fn<-function(){1+2}
cl<-body(fn)
mode(cl)
[1] "call"
call_tree(cl)
\- ()
  \- `{
  \- ()
    \- `+
    \- 1
    \- 2
cl<-body(
function(){
1+2
}
)
mode(cl)
[1] "call"
length(cl)
[1] 2
mode(cl[[2]])
[1] "call"
cl[[2]]
1 + 2
Here the call cl contains the call cl[[2]]; the first component, cl[[1]], is the name `{`.
fn<-function(x){
y<-x+1
2*y
}
cl<-body(fn)
length(cl)
[1] 3
call_tree(cl)
\- ()
  \- `{
  \- ()
    \- `<-
    \- `y
    \- ()
      \- `+
      \- `x
      \- 1
  \- ()
    \- `*
    \- 2
    \- `y
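Since the body is a call to `{`, its statements can be pulled out as a list by dropping the first component. A minimal sketch, with stmts and stmt_text as illustrative names:

```r
fn <- function(x) {
  y <- x + 1
  2 * y
}

stmts <- as.list(body(fn))[-1]    # drop the leading `{`
stmt_text <- vapply(stmts, deparse, character(1))
stmt_text
```

Each element of stmts is itself a call, so it can be inspected or modified with the techniques shown earlier.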
We can construct a new function in steps:
fn<-function(){}
fn
function(){}
body(fn)<-call("{",quote(y<-x+1),quote(2*y) )
fn
function ()
{
y <- x + 1
2 * y
}
The function body now contains 2 lines.
formals(fn)<-alist(x=)
fn
function (x)
{
y <- x + 1
2 * y
}
The function now has an argument x. If we wanted a default value of x=2, we would have specified alist(x=2)
environment(fn) <- .GlobalEnv
environment(fn)
<environment: R_GlobalEnv>
Here we set the environment of the function to be the Global environment
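The three steps (body, formals, environment) can also be collapsed into one call to as.function, which takes a list whose last element is the body and whose earlier elements are the arguments. This is a sketch of an alternative, not the approach used above; the name fn_direct is made up.

```r
# alist() leaves its arguments unevaluated, so the braces stay a call.
fn_direct <- as.function(alist(x = , {
  y <- x + 1
  2 * y
}))

fn_direct(3)
```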
When we don't know in advance which statements, or how many, the function will contain, we can create the body by generating a list of either calls or character strings.
To illustrate this, we take an example motivated by continued fractions. Given n, we want to generate a function that contains n copies of the line
x<-1/(1+x)
We illustrate 2 approaches.
First, we generate a list of calls:
n<-3
fn1<-function(x){} #A skeleton
cl<-quote(x<-1/(1+x))
cl.list<-c(rep(list(cl),n),quote(x))
#display the list as text using deparse
deparse(cl.list) #for display only!!
[1] "list(x <- 1/(1 + x), x <- 1/(1 + x), x <- 1/(1 + x), x)"
body(fn1)<-as.call(c(as.name("{"),cl.list))
fn1
function (x)
{
x <- 1/(1 + x)
x <- 1/(1 + x)
x <- 1/(1 + x)
x
}
Note the use of as.call with a list of calls.
Second, we generate a list of character strings:
n<-3
s<-"x<-1/(1+x)"
s.list<-c(rep(list(s),n),"x")
s.list
[[1]]
[1] "x<-1/(1+x)"
[[2]]
[1] "x<-1/(1+x)"
[[3]]
[1] "x<-1/(1+x)"
[[4]]
[1] "x"
fn2<-function(x){} #A skeleton
ex<-parse(text=paste(s.list,collapse=";"))
body(fn2)<-as.call(c(as.name("{"),ex))
fn2
function (x)
{
x <- 1/(1 + x)
x <- 1/(1 + x)
x <- 1/(1 + x)
x
}
Note: here we use parse, which produces an expression that becomes the argument of as.call.
identical(fn1,fn2)
[1] TRUE
fn1(2)
[1] 0.5714
fn2(2)
[1] 0.5714
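The call-based approach can be wrapped in a generator that takes n as an argument. The function name make_cf is hypothetical; the body is a sketch that follows the same steps as fn1.

```r
# Hypothetical generator: returns a function whose body repeats the step n times.
make_cf <- function(n) {
  step <- quote(x <- 1 / (1 + x))
  stmts <- c(rep(list(step), n), list(quote(x)))   # n steps, then return x
  f <- function(x) NULL                            # skeleton with one argument
  body(f) <- as.call(c(list(as.name("{")), stmts))
  environment(f) <- globalenv()
  f
}

cf3 <- make_cf(3)
cf3(2)    # same computation as fn1(2)
```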
Use deparse to turn a call into a character string.
cl1<-quote(1+2)
cl2<-parse(text="1+2")[[1]]
identical(deparse(cl1),deparse(cl2))
[1] TRUE
mode(deparse(cl1))
[1] "character"
deparse(cl1)
[1] "1 + 2"
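One caveat worth sketching: deparse returns a character vector, and for calls that print on several lines it returns one string per line. Collapsing with paste gives a single string that parse can read back. The block below is built with call() purely for illustration.

```r
blk <- call("{", quote(a <- 1), quote(a + 2))

lines <- deparse(blk)
length(lines)                          # one string per printed line

txt <- paste(lines, collapse = "\n")   # a single string
back <- parse(text = txt, keep.source = FALSE)[[1]]
eval(back, new.env())                  # evaluate the round-tripped call
```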
a<-3
times<-function(x,y){
cl<-substitute(x*y)
paste(deparse(cl),"is",eval(cl))
}
times(2,a)
[1] "2 * a is 6"
Upon execution, substitute replaces x and y with the argument expressions 2 and a to form a call (the unevaluated AST quote(2 * a)). Turning that call back into a string via deparse then recovers the names of the function inputs.
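The same combination of substitute and deparse is a common way to capture an argument as text, for example to build a label. A minimal sketch, with label_of as a made-up name:

```r
# Hypothetical helper: returns the text of whatever was passed as x.
label_of <- function(x) deparse(substitute(x))

# The argument is never evaluated, so a and b need not exist.
label_of(a + b)
```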