TensorFlow for R

TensorFlow™ is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them. The flexible architecture allows you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device with a single API. TensorFlow was originally developed by researchers and engineers working on the Google Brain Team within Google’s Machine Intelligence research organization for the purposes of conducting machine learning and deep neural networks research, but the system is general enough to be applicable in a wide variety of other domains as well. See https://tensorflow.rstudio.com/.

Since 2017, many contributors have been hard at work on creating R interfaces to TensorFlow, an open-source machine learning framework from Google. We are excited about TensorFlow for many reasons, not the least of which is its state-of-the-art infrastructure for deep learning applications. See https://blog.rstudio.com/2018/02/06/tensorflow-for-r/.

Computation Graph

TensorFlow is a programming system in which you represent computations as graphs. A TensorFlow graph is a description of computations. To compute anything, a graph must be launched in a Session.
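The difference between describing a computation and running it can be illustrated without TensorFlow. The sketch below is only an analogy in base R: quote() stores an unevaluated expression (playing the role of the graph), and eval() computes it with concrete values (playing the role of launching it in a Session). The names graph and inputs are illustrative and are not part of the TensorFlow API.

```r
# Analogy only: base R, not the TensorFlow API.
# Describe the computation w * x + b without evaluating it.
graph <- quote(w * x + b)

# Nothing has been computed yet; `graph` is just an expression object.
print(class(graph))  # "call"

# "Launch" the computation by supplying concrete inputs.
inputs <- list(w = 2, x = 3, b = 1)
result <- eval(graph, inputs)
print(result)  # 7
```

In TensorFlow the same separation lets the library inspect and optimize the whole graph before any numbers flow through it.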

The example uses R's built-in faithful dataset: the waiting time between eruptions and the duration of each eruption for the Old Faithful geyser in Yellowstone National Park, Wyoming, USA. It is a data frame with 272 observations on 2 variables.

x_data = eruptions   numeric   Eruption time in mins
y_data = waiting     numeric   Waiting time to next eruption (in mins)

Explore the Faithful Geyser Dataset

library(tensorflow)

x_data <- faithful$eruptions/max(faithful$eruptions)
y_data <- faithful$waiting/max(faithful$waiting)
fit <- lm(faithful$waiting~faithful$eruptions)
plot(faithful$eruptions,faithful$waiting, main="Faithful Geyser Data", xlab="Eruptions Time in mins", ylab="Waiting Time in mins")
lines(faithful$eruptions, fitted(fit), lwd=2, col="blue")

Building The Graph

Linear regression is a very common statistical method for learning a relationship from a set of continuous data. In linear regression the hypothesis is a straight line, h(x) = wx + b, where w is the weight (a vector when there are several predictors) and b is a scalar called the bias. The weight and bias are the parameters of the model.

In TensorFlow, trainable parameters such as the weight (W) and bias (b) are declared with tf.Variable, while tf.placeholder is used to feed actual training examples. The difference is that tf.Variable requires an initial value when it is declared, whereas tf.placeholder takes no initial value and is filled in at run time through the feed_dict argument of Session.run. In this example the normalized data vectors x_data and y_data are used directly in the graph, so only the variables W and b need to be defined.
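As a minimal sketch of the hypothesis in plain R (no TensorFlow involved), the function below evaluates h(x) = wx + b on the normalized eruption times. The function name hypothesis and the values w_guess and b_guess are assumptions chosen for illustration; they are not fitted parameters.

```r
# Hypothesis h(x) = w * x + b for a single predictor.
# `hypothesis`, `w_guess`, and `b_guess` are illustrative names and values.
hypothesis <- function(x, w, b) w * x + b

# Normalize eruption times by their maximum, as in the example above.
x_norm <- faithful$eruptions / max(faithful$eruptions)

w_guess <- 0.5
b_guess <- 0.4
pred <- hypothesis(x_norm, w_guess, b_guess)

# One prediction per observation in the dataset.
print(length(pred))
print(head(pred))
```

Training amounts to replacing the guessed values with ones that make these predictions as close as possible to the observed waiting times.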

Define variables

W <- tf$Variable(tf$random_uniform(shape(1L), -1.0, 1.0))
b <- tf$Variable(tf$zeros(shape(1L)))
y <- W * x_data + b

Define cost function

Next we build the cost function and the optimizer. The cost function is the mean squared error (MSE) between the hypothesis y and the observed y_data. To find the parameter values that minimize the loss, we use a common optimization algorithm called gradient descent. We do not implement gradient descent by hand, since TensorFlow provides it as tf$train$GradientDescentOptimizer; the argument 0.5 is the learning rate.

loss <- tf$reduce_mean((y - y_data) ^ 2)
optimizer <- tf$train$GradientDescentOptimizer(0.5)
train <- optimizer$minimize(loss)
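To make clearer what the optimizer does at each step, here is a hedged sketch of gradient descent on the mean squared error written in base R. It is not TensorFlow's implementation; it only applies the textbook update rule with the same learning rate (0.5) and step count (201) used in this example. The names mse, grad_w, grad_b, and lr are illustrative.

```r
# Normalized data, as in the example above.
x <- faithful$eruptions / max(faithful$eruptions)
y <- faithful$waiting / max(faithful$waiting)

# Mean squared error for given parameters.
mse <- function(w, b) mean((w * x + b - y)^2)

set.seed(1)                 # fixed seed so the sketch is reproducible
w <- runif(1, -1, 1)        # random initial weight in [-1, 1]
b <- 0                      # bias starts at zero
lr <- 0.5                   # learning rate

for (step in 1:201) {
  err <- w * x + b - y              # residuals of the current hypothesis
  grad_w <- 2 * mean(err * x)       # d(MSE)/dw
  grad_b <- 2 * mean(err)           # d(MSE)/db
  w <- w - lr * grad_w              # move against the gradient
  b <- b - lr * grad_b
}
cat("W =", w, "b =", b, "loss =", mse(w, b), "\n")
```

Because the initial weight is random, early iterates differ between runs, but the final estimates should settle near the least-squares solution.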

Launch the graph and initialize the variables.

Now we begin the training process inside a TensorFlow Session.

sess = tf$Session()
sess$run(tf$global_variables_initializer())

Train the Model

We run 201 training steps and print the current estimates of W and b every 20 steps.

for (step in 1:201) {
  sess$run(train)
  if (step %% 20 == 0)
    cat("Step =",step, "Estimate W=", sess$run(W),"Estimate b=", sess$run(b), "\n")
}
## Step = 20 Estimate W= 0.2198711 Estimate b= 0.5964775 
## Step = 40 Estimate W= 0.3933527 Estimate b= 0.4737093 
## Step = 60 Estimate W= 0.4808808 Estimate b= 0.4117678 
## Step = 80 Estimate W= 0.525042 Estimate b= 0.380516 
## Step = 100 Estimate W= 0.547323 Estimate b= 0.3647482 
## Step = 120 Estimate W= 0.5585646 Estimate b= 0.3567928 
## Step = 140 Estimate W= 0.5642364 Estimate b= 0.352779 
## Step = 160 Estimate W= 0.5670981 Estimate b= 0.3507539 
## Step = 180 Estimate W= 0.5685419 Estimate b= 0.3497321 
## Step = 200 Estimate W= 0.5692704 Estimate b= 0.3492166
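As a sanity check on the estimates printed above, the same line can be fitted in closed form with lm() on the normalized data. This is a sketch in base R, independent of TensorFlow; the names x_norm, y_norm, and ref are illustrative. The coefficients should come out at roughly 0.57 for the slope and 0.35 for the intercept, which is where the gradient descent estimates are heading.

```r
# Normalize both variables by their maxima, as in the training data.
x_norm <- faithful$eruptions / max(faithful$eruptions)
y_norm <- faithful$waiting / max(faithful$waiting)

# Ordinary least squares gives the minimizer of the MSE directly.
ref <- lm(y_norm ~ x_norm)
print(coef(ref))
```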

Visualization of Results

Note that in this case both the weight and the bias are scalars. This is because we have considered only one independent variable (predictor) in our training data. If we had m independent variables in our training dataset, the weight would be an m-dimensional vector while the bias would remain a scalar.
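The multi-predictor case can be sketched in base R with a matrix product. The data below are made up purely for illustration (they are not from the faithful dataset), and the weights are arbitrary rather than fitted.

```r
# Illustrative data: 5 observations, m = 3 predictors (synthetic values).
set.seed(42)
X <- matrix(runif(15), nrow = 5, ncol = 3)

w <- c(0.2, -0.1, 0.4)   # one weight per predictor (arbitrary values)
b <- 0.5                 # a single scalar bias

# h(X) = X w + b: one prediction per row of X.
pred <- drop(X %*% w) + b
print(pred)
```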

Finally, we will plot our results at various steps.

plot(x_data,y_data, main="Fit Normalized Faithful Data by TensorFlow", xlab="Normalized Eruption Time", ylab="Normalized Waiting Time")

# Step 200 (values rounded from the training log above)
w=0.569
b=0.349
y=w*x_data+b
lines(x_data,y,lwd=2, col="black")

# Step 100
w=0.547
b=0.365
y=w*x_data+b
lines(x_data,y,col="orange")

# Step 20
w=0.220
b=0.596
y=w*x_data+b
lines(x_data,y,col="blue")

legend(0.4, 0.95, legend=c("Step 200", "Step 100", "Step 20"), col=c("black", "orange", "blue"), lty=1, cex=0.8)

Convert the estimated normalized weight and bias back to the original data scale

wbar = 0.5694
bbar = 0.3491


x_data <- faithful$eruptions
y_data <- faithful$waiting

xbar = max(faithful$eruptions)
ybar = max(faithful$waiting)
w = wbar*(ybar/xbar)
b = bbar*ybar
ycap = w*faithful$eruptions+b
plot(x_data,y_data, main="Fit Original Faithful Data by TensorFlow", xlab="Eruption Time(mins)", ylab="Waiting Time(mins)")
lines(x_data,ycap,lwd=2, col="dark green")

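# The original-scale axes span roughly 1.6 to 5.1 (x) and 43 to 96 (y),
# so the legend is placed by keyword rather than at normalized coordinates.
legend("topleft", legend="TensorFlow Model", col="dark green", lty=1, cex=0.8)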
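The conversion follows from the normalization itself: the model was fitted as y/ybar = wbar * (x/xbar) + bbar, so multiplying through by ybar gives y = (wbar * ybar / xbar) * x + bbar * ybar. The sketch below repeats that arithmetic and compares the result with lm() on the original data as a rough check; the name ref is illustrative, and the two sets of coefficients should agree only approximately because training was stopped after 201 steps.

```r
# Normalized estimates taken from the training output above.
wbar <- 0.5694
bbar <- 0.3491

# Scaling constants used for the normalization.
xbar <- max(faithful$eruptions)
ybar <- max(faithful$waiting)

# Map the parameters back to the original units.
w <- wbar * ybar / xbar
b <- bbar * ybar

# Least-squares fit on the original data for comparison.
ref <- coef(lm(waiting ~ eruptions, data = faithful))
print(c(slope = w, intercept = b))
print(ref)
```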