R package predict3d aims to draw predicts plot for various regression models. The main two functions are ggPredict() for 2-dimensional(2d) plot and predict3d() for three dimensional(3d) plot. The ggPredict() function makes a ggplot object using ggplot2 package and the predict3d() function uses rgl package to make the 3d scatter plot.
You can install the predict3d package from CRAN.
install.packages("predict3d")
You can install the developmental version of predict3d package from github.
if(!require(devtools)) install.packages("devtools")
devtools::install_github("cardiomoon/predict3d"")
You can draw simple linear regression model. The mtcars data was extracted from the 1974 Motor Trend US magazine, and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973–74 models). First model has one response(dependent) variable and one continuous explanatory(independent) variable.
require(predict3d)
require(rgl)
fit=lm(mpg~wt,data=mtcars)
summary(fit)
Call:
lm(formula = mpg ~ wt, data = mtcars)
Residuals:
Min 1Q Median 3Q Max
-4.5432 -2.3647 -0.1252 1.4096 6.8727
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 37.2851 1.8776 19.858 < 2e-16 ***
wt -5.3445 0.5591 -9.559 1.29e-10 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 3.046 on 30 degrees of freedom
Multiple R-squared: 0.7528, Adjusted R-squared: 0.7446
F-statistic: 91.38 on 1 and 30 DF, p-value: 1.294e-10
From summaries, you can get the slope and the intercept of the regression line. The intercept of regression line is 37.29 and the slope is -5.34.
You can draw this model with ggPredict() function. By default, the regression equation is printed just above the regression line. You can adjust the position of this label with xpos(relative position on x-axis) and vjust(vertical position) argument(). You can draw error by setting the show.error argument TRUE.
ggPredict(fit)
ggPredict(fit,xpos=0.75,vjust=1.5,show.error = TRUE)
The vs column of mtcars data represents engine shape. The V-shaped engine is coded as 0 and the straight one is coded as 1. You can make new variable engine with vs.
mtcars$engine=ifelse(mtcars$vs==0,"V-shaped","straight")
fit1=lm(mpg~wt+engine,data=mtcars)
summary(fit1)
Call:
lm(formula = mpg ~ wt + engine, data = mtcars)
Residuals:
Min 1Q Median 3Q Max
-3.7071 -2.4415 -0.3129 1.4319 6.0156
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 36.1586 1.7656 20.480 < 2e-16 ***
wt -4.4428 0.6134 -7.243 5.63e-08 ***
engineV-shaped -3.1544 1.1907 -2.649 0.0129 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.78 on 29 degrees of freedom
Multiple R-squared: 0.801, Adjusted R-squared: 0.7873
F-statistic: 58.36 on 2 and 29 DF, p-value: 6.818e-11
From summary, you can get the intercepts and slopes of of two regression lines. The slope is identical for both type of engines. The intercepts are:
You can draw this model with ggPredict() function. You can draw separate figure according to the shape of engine by setting the facet.modx TRUE. You can suppress the regression equation by setting the show.text FALSE.
ggPredict(fit1,digits=1)
ggPredict(fit1,digits=1,show.error = TRUE,facet.modx = TRUE,show.text=FALSE)
You can draw 3d plot of this model with predict3d() function.
predict3d(fit1,radius=0.5)
rglwidget(elementId = "1st")
Once you have create a model with predict3d(), you can rotate, zoom in and zoom out your object with your mouse. You can save your 3d plot as a figure file or pdf file.
rgl.bringtotop()
rgl.snapshot("fig1.png")
rgl.postscript("fig2.pdf","pdf")
You can make regression model in which the explanatory variables are interact.
fit2=lm(mpg~wt*engine,data=mtcars)
summary(fit2)
Call:
lm(formula = mpg ~ wt * engine, data = mtcars)
Residuals:
Min 1Q Median 3Q Max
-3.9950 -1.7881 -0.3423 1.2935 5.2061
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 41.2981 2.7002 15.294 4.01e-15 ***
wt -6.4110 0.9998 -6.412 6.08e-07 ***
engineV-shaped -11.7667 3.7638 -3.126 0.0041 **
wt:engineV-shaped 2.9097 1.2157 2.393 0.0236 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.578 on 28 degrees of freedom
Multiple R-squared: 0.8348, Adjusted R-squared: 0.8171
F-statistic: 47.16 on 3 and 28 DF, p-value: 4.497e-11
In the regression equation, you can see the *. It means all possible combinations. So the equation A*B is the same as A + B+ A:B.
From summary, you can get the intercepts and slopes of of two regression lines. In this model, two regression lines cross each other. In other words, the slopes of the two regression lines are different.
You can draw this model with ggPredict() function. You can suppress the points by setting the show.point FALSE. You can draw 95% confidence interval by setting the se TRUE. You can make facet by rows with the following R code.
ggPredict(fit2, show.point = FALSE,se=TRUE)
ggPredict(fit2, show.point = FALSE,se=TRUE,xpos=0.5,facet.modx = TRUE, facetbycol = FALSE)
You can draw 3d plot of this model with predict3d() function.
predict3d(fit2,radius=0.5)
rglwidget(elementId = "2nd")
You can make regression model with two continuous explanatory variables.
fit3=lm(mpg ~ hp+wt,data=mtcars)
summary(fit3)
Call:
lm(formula = mpg ~ hp + wt, data = mtcars)
Residuals:
Min 1Q Median 3Q Max
-3.941 -1.600 -0.182 1.050 5.854
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 37.22727 1.59879 23.285 < 2e-16 ***
hp -0.03177 0.00903 -3.519 0.00145 **
wt -3.87783 0.63273 -6.129 1.12e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.593 on 29 degrees of freedom
Multiple R-squared: 0.8268, Adjusted R-squared: 0.8148
F-statistic: 69.21 on 2 and 29 DF, p-value: 9.109e-12
Can you get the intercepts and slopes for regression lines of this model? There are two continuous variables. While the slopes for one variable remain constant, the intercepts change according to the other variable. The mean and standard deviation of weights of mtcars are 3.22 and 0.98 respectively. The ggPredict() function uses mean and \(mean \pm 1*sd\) calculated by the following R code provided that the argument mode is 1(default value).
(mean(mtcars$wt,na.rm=TRUE) + c(-1,0,1)*sd(mtcars$wt,na.rm=TRUE))
[1] 2.238793 3.217250 4.195707
The slopes of regression lines for hp are all -0.03 and the intercepts depend on the wt.
By default, the ggPredict() function calculate the mean and standard deviation of the moderator variable and calculate the equation of predictor variable. The roles of the predictor variable(pred,the first predictor variable) and moderator variable(modx,the second predictor variable) can be changed by the following R code.
ggPredict(fit3, show.point=FALSE,se=TRUE,xpos=0.5)
ggPredict(fit3, pred=wt, modx=hp, show.point = FALSE,se=TRUE,xpos=0.5)
You can draw 3d plot for this model with the following R command.
predict3d(fit3)
rglwidget(elementId ="3rd")
You can see that the regression plane is flat. In other words, the slopes of the regression line do not change and only the intercepts change.
You can make regression model with two continuous explanatory variables.
fit4=lm(mpg ~ hp*wt,data=mtcars)
summary(fit4)
Call:
lm(formula = mpg ~ hp * wt, data = mtcars)
Residuals:
Min 1Q Median 3Q Max
-3.0632 -1.6491 -0.7362 1.4211 4.5513
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 49.80842 3.60516 13.816 5.01e-14 ***
hp -0.12010 0.02470 -4.863 4.04e-05 ***
wt -8.21662 1.26971 -6.471 5.20e-07 ***
hp:wt 0.02785 0.00742 3.753 0.000811 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.153 on 28 degrees of freedom
Multiple R-squared: 0.8848, Adjusted R-squared: 0.8724
F-statistic: 71.66 on 3 and 28 DF, p-value: 2.981e-13
Can you get the intercepts and slopes for regression lines of this model? In this model, there are two continuous variables interacting together. The slopes and intercepts for one variable vary according to the other variable. When the argument mode is 2, the ggPredict() function uses the 14th, 50th and the 86th quantiles of the moderator variables calculated by the following R code.
(quantile(mtcars$wt,probs=c(0.14,0.50,0.86),type=6))
14% 50% 86%
2.0621 3.3250 3.9305
The quantile values depend on the calculating algorithms. The ggPredict() function uses the 6th algorithm to calculate quantiles to maintain compatibility with SPSS.
The slopes and the intercepts of the regression lines for hp depend on the wt.
To uses this value, call the ggPredict() function with setting the mode argument 2. If the mode argument is 3, you can draw as many regression lines as you want by setting the colorn argument.
ggPredict(fit4,mode=2,digits=2)
ggPredict(fit4,mode=3,colorn=50, show.text=FALSE)
As shown in the figure above, the regression lines intersect each other, so that the regression plane of the three dimensional(3d) plot is twisted. You can show error in the 3d plot by setting the show.error argument TRUE.
predict3d(fit4, show.error = TRUE)
rglwidget(elementId ="4th")
Once you have create a model with predict3d(), you can make movie with the following R code. You can get the movie.gif in current folder.
movie3d(spin3d(axis = c(0, 0, 1)),dir=".", duration = 6,movie="movie")
You can make regression model with three explanatory variables.
fit5 = lm(mpg ~ hp*wt*engine,data=mtcars)
summary(fit5)
Call:
lm(formula = mpg ~ hp * wt * engine, data = mtcars)
Residuals:
Min 1Q Median 3Q Max
-3.4392 -1.4404 0.0168 1.3475 3.8171
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 41.80412 8.59305 4.865 5.86e-05 ***
hp -0.04802 0.09377 -0.512 0.613
wt -3.75345 3.60736 -1.040 0.308
engineV-shaped 4.54377 12.64388 0.359 0.722
hp:wt -0.01236 0.03627 -0.341 0.736
hp:engineV-shaped -0.05489 0.10581 -0.519 0.609
wt:engineV-shaped -3.92911 4.67846 -0.840 0.409
hp:wt:engineV-shaped 0.03745 0.03927 0.954 0.350
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.175 on 24 degrees of freedom
Multiple R-squared: 0.8992, Adjusted R-squared: 0.8698
F-statistic: 30.58 on 7 and 24 DF, p-value: 1.819e-10
In this model, pickup the intercepts and the slopes values are more complex. So does the regression plot.
ggPredict(fit5)
ggPredict(fit5, facet.modx = TRUE,add.modx.values = FALSE,xpos=0.5, show.point = FALSE)
You can draw the 3d plot of this model with the following R code. When a regression model has more than three explanatory variables, the predict3d() function uses the 3rd explanatory variable as facetting variable and draws separated plots.
predict3d(fit5)
rglwidget(elementId ="5th")
If you want to draw 3d plot in a plot, set the argument overlay TRUE. In this model, you can see the regression planes intersect with each other.
predict3d(fit5, overlay=TRUE)
rglwidget(elementId ="6th")
To be continued…