Dr. Ng showed how to use gradient descent to find the linear regression fit in MATLAB. Here is a demonstration of how to implement it in R. MATLAB and R differ quite a bit in vector/matrix indexing syntax, but the idea is the same.
First, let’s generate some data to work with. We’ll define a training set of 100 examples with 4 features each:
set.seed(11)
# 100 training examples with 4 features each
x <- matrix(rnorm(400), ncol = 4)
# the response is random noise here, unrelated to x; the goal is just to
# demonstrate the mechanics of gradient descent, not to recover a known model
y <- rnorm(100)
m <- length(y)
# prepend a column of 1s for the intercept term, giving a 100x5 design matrix
X <- cbind(rep(1, 100), x)
# start all 5 parameters at zero
theta <- rep(0, 5)
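As a quick check of the shapes, X should be a 100x5 design matrix and theta a length-5 parameter vector:
dim(X)        # 100 5
length(theta) # 5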
Second, set up the cost function for least-squares linear regression:
compCost <- function(X, y, theta){
  m <- length(y)
  # cost J(theta): the sum of squared residuals divided by 2m
  J <- sum((X%*%theta - y)^2)/(2*m)
  return(J)
}
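In matrix form, the cost implemented above is $J(\theta) = \frac{1}{2m}(X\theta - y)^\top(X\theta - y)$, i.e. half the mean squared residual over the m examples. A quick call confirms it runs at the all-zero starting point (the particular value is not important here):
compCost(X, y, theta)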
Next, set up the gradient descent function, which runs for num_iters iterations:
gradDescent <- function(X, y, theta, alpha, num_iters){
  m <- length(y)
  J_hist <- rep(0, num_iters)
  for(i in 1:num_iters){
    # this is the vectorized form of the gradient of the cost function:
    # X is a 100x5 matrix, theta is a 5x1 column vector, y is a 100x1 column vector,
    # and t(X) is a 5x100 matrix, so t(X)%*%(X%*%theta - y) is a 5x1 column vector
    theta <- theta - alpha*(1/m)*(t(X)%*%(X%*%theta - y))
    # record the cost after every update so we can later plot
    # the number of iterations against the cost history
    J_hist[i] <- compCost(X, y, theta)
  }
  # for an R function to return two values, we store them in a list:
  results <- list(theta, J_hist)
  return(results)
}
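In symbols, the update applied inside the loop is $\theta := \theta - \frac{\alpha}{m}\,X^\top(X\theta - y)$: since $\frac{1}{m}X^\top(X\theta - y)$ is the gradient of the cost $J(\theta)$ defined above, every component of theta moves a small step against its own partial derivative on each iteration.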
Then, let’s set a learning rate alpha and a number of iterations, and run gradient descent:
alpha <- 0.1
num_iters <- 150
results <- gradDescent(X, y, theta, alpha, num_iters)
theta <- results[[1]]
cost_hist <- results[[2]]
print(theta)
## [,1]
## [1,] 0.04636521
## [2,] 0.09681053
## [3,] 0.10089729
## [4,] -0.11856147
## [5,] -0.20665900
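One optional way to sanity-check the result is to compare it against R’s built-in least-squares fit, which solves the same problem in closed form; with enough iterations and a well-chosen alpha, the two estimates should be close:
# closed-form least-squares coefficients (intercept first), for comparison
coef(lm(y ~ x))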
Finally, let’s plot the cost history:
plot(1:num_iters, cost_hist, type = "l", xlab = "Number of iterations", ylab = "Cost J")
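If alpha is chosen well, the curve should decrease steadily and then flatten out; a cost history that climbs or oscillates is the usual sign that the learning rate is too large and should be reduced.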