First we need some data to plot. For this example I will just choose a simple linear model with two predictors and one response variable.
x1 <- rnorm(100);
x2 <- rnorm(100, 10, 4)
yobs <- 0.1 + 2.33*x1 - 0.5*x2 + rnorm(100, 0, 2);
Ok, now lets model it using the lm function. Obviously you should use whatever sort of regression or modeling function you think makes sense for your data. If you just want a smooth surface, I suggest the loess function.
model <- lm(yobs ~ x1 + x2)
print(summary(model))
##
## Call:
## lm(formula = yobs ~ x1 + x2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6.1075 -1.3808 0.1743 1.5311 4.8348
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.43580 0.60284 0.723 0.471
## x1 2.34205 0.20903 11.204 < 2e-16 ***
## x2 -0.50926 0.05323 -9.567 1.16e-15 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.133 on 97 degrees of freedom
## Multiple R-squared: 0.7045, Adjusted R-squared: 0.6984
## F-statistic: 115.6 on 2 and 97 DF, p-value: < 2.2e-16
Looks good. Well obviously it would because it is the same as the generating function. With real data you would have to check the residuals and do whatever model selection and remedial measures necessary.
To start plotting you need to select the values for the x and y axis, then interpolate values for the z axis to generate a surface.
First create a sequence of x and y values corresponding to the range in your data, then calculate the corresponding z values.
## Set range and domain of plot
x <- seq(min(x1), max(x1), length.out = 25);
y <- seq(min(x2), max(x2), length.out = 25);
## Interpolate surface
z <- outer(x,y,function(x,y){predict(model, data.frame(x1=x, x2=y))});
The outer product uses the entered model to calculate values of the response variable given a range of predictor values. If you haven’t used the predict function before I suggest looking at its help file.
With all that done we can finally get plotting. The persp function should not be that unfamiliar if you have plotted using the base plot function, except for the theta and phi values, which I’ll explain in a bit.
p <- persp(x,y,z, theta = 30, phi = 20,
col = "lightblue", shade = 0.8, ticktype = "detailed")
Now lets plot the observed points, and add line segments to the predicted points.
obs <- trans3d(x1, x2, yobs, p);
pred <- trans3d(x1, x2, fitted(model), p);
points(obs, col = "red", pch = 16);
segments(obs$x, obs$y, pred$x, pred$y)
As you’ll notice the persp function creates a surface with a 3D perspective box around it. The default ticktype option (simple) doesn’t give the variable values, so I like to set it to “detailed”.
Now let’s play around a bit with the phi and theta options to see what they do.
p <- persp(x,y,z, theta = 0, phi = 20,
col = "lightblue", shade = 0.8, ticktype = "detailed")
So it appears that theta rotates the image around its vertical axis (i.e. left and right). I wonder if phi rotates it around its horizontal axis (up and down).
p <- persp(x,y,z, theta = 30, phi = 45,
col = "lightblue", shade = 0.8, ticktype = "detailed")
It does! Basically if you want to rotate the perspective play around with the theta and phi parameters.
Perspective plots are good for understanding the relationship between different variables, but they can make it difficult to figure out what the actual value of one variable is given the other two. In order to do that, it would be better to use something called a contour plot. Fortunately, the steps necessary for contour plotting are mostly the same as for perspective plotting; the only difference is in the plot itself. The following examples use the same x, y, and z values as in the perspective plotting examples.
In the base package, you can use the contour function. The nlevels option allows you to choose how many levels of z to display.
contour(x,y,z, nlevels = 20)
I have always found the contour lines to be a bit confusing, especially for a linear function. They represent isoclines, like terrain lines on a map; the closer together they are the steeper the slope. By looking at the intersection of any x and y value and noting the nearest contour lines, you can figure out an approximate value for z. So, for example, in the above plot, an x value of -1 and a y value of 5 intersect just below the -5 contour. The actual value of z at that point is about -4.8, so it’s a pretty good approximation.
The base contour function isn’t bad, but it could use a bit of jazzing up. The filled.contour function in the graphics package can produce much prettier results that can make finding corresponding x values a bit easier.
require(graphics)
## Using the terrain color plaette with 20 levels
cols <- terrain.colors(25)
filled.contour(x, y, z, nlevels = 20, col = cols,
ylab = "y", xlab = "x", key.title = title("z"))
Instead of having the labels on the contour lines, they are on a handy little legend on the right.
So that’s some ideas for 3d plotting in R, I hope you find it useful.