R package predict3d aims to draw predicts plot for various regression models. The main two functions are ggPredict() for 2-dimensional(2d) plot and predict3d() for three dimensional(3d) plot. The ggPredict() function makes a ggplot object using ggplot2 package and the predict3d() function uses rgl package to make the 3d scatter plot.

Package Installation

You can install the predict3d package from CRAN.

install.packages("predict3d")

You can install the developmental version of predict3d package from github.

if(!require(devtools)) install.packages("devtools")
devtools::install_github("cardiomoon/predict3d"")

Simple Linear Regression Model

You can draw simple linear regression model. The mtcars data was extracted from the 1974 Motor Trend US magazine, and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973–74 models). First model has one response(dependent) variable and one continuous explanatory(independent) variable.

require(predict3d)
require(rgl)

fit=lm(mpg~wt,data=mtcars)
summary(fit)

Call:
lm(formula = mpg ~ wt, data = mtcars)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.5432 -2.3647 -0.1252  1.4096  6.8727 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  37.2851     1.8776  19.858  < 2e-16 ***
wt           -5.3445     0.5591  -9.559 1.29e-10 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.046 on 30 degrees of freedom
Multiple R-squared:  0.7528,    Adjusted R-squared:  0.7446 
F-statistic: 91.38 on 1 and 30 DF,  p-value: 1.294e-10

From summaries, you can get the slope and the intercept of the regression line. The intercept of regression line is 37.29 and the slope is -5.34.

You can draw this model with ggPredict() function. By default, the regression equation is printed just above the regression line. You can adjust the position of this label with xpos(relative position on x-axis) and vjust(vertical position) argument(). You can draw error by setting the show.error argument TRUE.

ggPredict(fit)
ggPredict(fit,xpos=0.75,vjust=1.5,show.error = TRUE)

Multiple Linear Regression Model Without Interaction

The vs column of mtcars data represents engine shape. The V-shaped engine is coded as 0 and the straight one is coded as 1. You can make new variable engine with vs.

mtcars$engine=ifelse(mtcars$vs==0,"V-shaped","straight")
fit1=lm(mpg~wt+engine,data=mtcars)
summary(fit1)

Call:
lm(formula = mpg ~ wt + engine, data = mtcars)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.7071 -2.4415 -0.3129  1.4319  6.0156 

Coefficients:
               Estimate Std. Error t value Pr(>|t|)    
(Intercept)     36.1586     1.7656  20.480  < 2e-16 ***
wt              -4.4428     0.6134  -7.243 5.63e-08 ***
engineV-shaped  -3.1544     1.1907  -2.649   0.0129 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.78 on 29 degrees of freedom
Multiple R-squared:  0.801, Adjusted R-squared:  0.7873 
F-statistic: 58.36 on 2 and 29 DF,  p-value: 6.818e-11

From summary, you can get the intercepts and slopes of of two regression lines. The slope is identical for both type of engines. The intercepts are:

  • For straight engine, \(36.15\)
  • For V-shaped engine, \(36.15-3.15 = 33\)

You can draw this model with ggPredict() function. You can draw separate figure according to the shape of engine by setting the facet.modx TRUE. You can suppress the regression equation by setting the show.text FALSE.

ggPredict(fit1,digits=1)
ggPredict(fit1,digits=1,show.error = TRUE,facet.modx = TRUE,show.text=FALSE)

You can draw 3d plot of this model with predict3d() function.

predict3d(fit1,radius=0.5)
rglwidget(elementId = "1st")

Once you have create a model with predict3d(), you can rotate, zoom in and zoom out your object with your mouse. You can save your 3d plot as a figure file or pdf file.

rgl.bringtotop()
rgl.snapshot("fig1.png")               
rgl.postscript("fig2.pdf","pdf")

Multiple Linear Regression Model With Interaction

You can make regression model in which the explanatory variables are interact.

fit2=lm(mpg~wt*engine,data=mtcars)
summary(fit2)

Call:
lm(formula = mpg ~ wt * engine, data = mtcars)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.9950 -1.7881 -0.3423  1.2935  5.2061 

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)    
(Intercept)        41.2981     2.7002  15.294 4.01e-15 ***
wt                 -6.4110     0.9998  -6.412 6.08e-07 ***
engineV-shaped    -11.7667     3.7638  -3.126   0.0041 ** 
wt:engineV-shaped   2.9097     1.2157   2.393   0.0236 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.578 on 28 degrees of freedom
Multiple R-squared:  0.8348,    Adjusted R-squared:  0.8171 
F-statistic: 47.16 on 3 and 28 DF,  p-value: 4.497e-11

In the regression equation, you can see the *. It means all possible combinations. So the equation A*B is the same as A + B+ A:B.

  • \(A*B = A + B + A:B\)
  • \(A*B*C = A + B + C + A:B + A:C +B:C +A:B:C\)

From summary, you can get the intercepts and slopes of of two regression lines. In this model, two regression lines cross each other. In other words, the slopes of the two regression lines are different.

Intercepts

  • For straight engine, \(41.30\)
  • For V-shaped engine, \(41.30 - 11.77 = 29.53\)

Slopes

  • For straight engine, \(-6.41\)
  • For V-shaped engine, \(-6.41 + 2.91 = 3.5\)

You can draw this model with ggPredict() function. You can suppress the points by setting the show.point FALSE. You can draw 95% confidence interval by setting the se TRUE. You can make facet by rows with the following R code.

ggPredict(fit2, show.point = FALSE,se=TRUE)
ggPredict(fit2, show.point = FALSE,se=TRUE,xpos=0.5,facet.modx = TRUE, facetbycol = FALSE)

You can draw 3d plot of this model with predict3d() function.

predict3d(fit2,radius=0.5)
rglwidget(elementId = "2nd")

Multiple Linear Regression Model - Two continuous explanatory variables

You can make regression model with two continuous explanatory variables.

fit3=lm(mpg ~ hp+wt,data=mtcars)
summary(fit3) 

Call:
lm(formula = mpg ~ hp + wt, data = mtcars)

Residuals:
   Min     1Q Median     3Q    Max 
-3.941 -1.600 -0.182  1.050  5.854 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 37.22727    1.59879  23.285  < 2e-16 ***
hp          -0.03177    0.00903  -3.519  0.00145 ** 
wt          -3.87783    0.63273  -6.129 1.12e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.593 on 29 degrees of freedom
Multiple R-squared:  0.8268,    Adjusted R-squared:  0.8148 
F-statistic: 69.21 on 2 and 29 DF,  p-value: 9.109e-12

Can you get the intercepts and slopes for regression lines of this model? There are two continuous variables. While the slopes for one variable remain constant, the intercepts change according to the other variable. The mean and standard deviation of weights of mtcars are 3.22 and 0.98 respectively. The ggPredict() function uses mean and \(mean \pm 1*sd\) calculated by the following R code provided that the argument mode is 1(default value).

(mean(mtcars$wt,na.rm=TRUE) + c(-1,0,1)*sd(mtcars$wt,na.rm=TRUE))
[1] 2.238793 3.217250 4.195707

The slopes of regression lines for hp are all -0.03 and the intercepts depend on the wt.

  • For weight 2.24, \(37.2272 - 3.8778\times 2.24 = 28.55\)
  • For weight 3.22, \(37.2272 - 3.8778\times 3.22 = 24.75\)
  • For weight 4.20, \(37.2272 - 3.8778\times 4.20 = 20.96\)

By default, the ggPredict() function calculate the mean and standard deviation of the moderator variable and calculate the equation of predictor variable. The roles of the predictor variable(pred,the first predictor variable) and moderator variable(modx,the second predictor variable) can be changed by the following R code.

ggPredict(fit3, show.point=FALSE,se=TRUE,xpos=0.5)
ggPredict(fit3, pred=wt, modx=hp, show.point = FALSE,se=TRUE,xpos=0.5)

You can draw 3d plot for this model with the following R command.

predict3d(fit3)
rglwidget(elementId ="3rd")

You can see that the regression plane is flat. In other words, the slopes of the regression line do not change and only the intercepts change.

Multiple Linear Regression Model - Two interacting continuous explanatory variables

You can make regression model with two continuous explanatory variables.

fit4=lm(mpg ~ hp*wt,data=mtcars)
summary(fit4) 

Call:
lm(formula = mpg ~ hp * wt, data = mtcars)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.0632 -1.6491 -0.7362  1.4211  4.5513 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 49.80842    3.60516  13.816 5.01e-14 ***
hp          -0.12010    0.02470  -4.863 4.04e-05 ***
wt          -8.21662    1.26971  -6.471 5.20e-07 ***
hp:wt        0.02785    0.00742   3.753 0.000811 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.153 on 28 degrees of freedom
Multiple R-squared:  0.8848,    Adjusted R-squared:  0.8724 
F-statistic: 71.66 on 3 and 28 DF,  p-value: 2.981e-13

Can you get the intercepts and slopes for regression lines of this model? In this model, there are two continuous variables interacting together. The slopes and intercepts for one variable vary according to the other variable. When the argument mode is 2, the ggPredict() function uses the 14th, 50th and the 86th quantiles of the moderator variables calculated by the following R code.

(quantile(mtcars$wt,probs=c(0.14,0.50,0.86),type=6))
   14%    50%    86% 
2.0621 3.3250 3.9305 

The quantile values depend on the calculating algorithms. The ggPredict() function uses the 6th algorithm to calculate quantiles to maintain compatibility with SPSS.

The slopes and the intercepts of the regression lines for hp depend on the wt.

Intercepts

  • For weight 2.06, \(49.8084 - 8.2166\times 2.06 = 32.86\)
  • For weight 3.33, \(49.8084 - 8.2166\times 3.33 = 22.49\)
  • For weight 3.93, \(49.8084 - 8.2166\times 3.93 = 17.51\)

Slopes

  • For weight 2.06, \(-0.1201 + 0.02785\times 2.06 = -0.06\)
  • For weight 3.33, \(-0.1201 + 0.02785\times 3.33 =-0.03\)
  • For weight 3.93, \(-0.1201 + 0.02785\times 3.93 = 0.01\)

To uses this value, call the ggPredict() function with setting the mode argument 2. If the mode argument is 3, you can draw as many regression lines as you want by setting the colorn argument.

ggPredict(fit4,mode=2,digits=2)
ggPredict(fit4,mode=3,colorn=50, show.text=FALSE)

As shown in the figure above, the regression lines intersect each other, so that the regression plane of the three dimensional(3d) plot is twisted. You can show error in the 3d plot by setting the show.error argument TRUE.

predict3d(fit4, show.error = TRUE)
rglwidget(elementId ="4th")

Once you have create a model with predict3d(), you can make movie with the following R code. You can get the movie.gif in current folder.

movie3d(spin3d(axis = c(0, 0, 1)),dir=".", duration = 6,movie="movie")

Multiple Linear Regression Model - Three interacting explanatory variables

You can make regression model with three explanatory variables.

fit5 = lm(mpg ~ hp*wt*engine,data=mtcars)
summary(fit5)

Call:
lm(formula = mpg ~ hp * wt * engine, data = mtcars)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.4392 -1.4404  0.0168  1.3475  3.8171 

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)    
(Intercept)          41.80412    8.59305   4.865 5.86e-05 ***
hp                   -0.04802    0.09377  -0.512    0.613    
wt                   -3.75345    3.60736  -1.040    0.308    
engineV-shaped        4.54377   12.64388   0.359    0.722    
hp:wt                -0.01236    0.03627  -0.341    0.736    
hp:engineV-shaped    -0.05489    0.10581  -0.519    0.609    
wt:engineV-shaped    -3.92911    4.67846  -0.840    0.409    
hp:wt:engineV-shaped  0.03745    0.03927   0.954    0.350    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.175 on 24 degrees of freedom
Multiple R-squared:  0.8992,    Adjusted R-squared:  0.8698 
F-statistic: 30.58 on 7 and 24 DF,  p-value: 1.819e-10

In this model, pickup the intercepts and the slopes values are more complex. So does the regression plot.

ggPredict(fit5)
ggPredict(fit5, facet.modx = TRUE,add.modx.values = FALSE,xpos=0.5, show.point = FALSE)

You can draw the 3d plot of this model with the following R code. When a regression model has more than three explanatory variables, the predict3d() function uses the 3rd explanatory variable as facetting variable and draws separated plots.

predict3d(fit5)
rglwidget(elementId ="5th")

If you want to draw 3d plot in a plot, set the argument overlay TRUE. In this model, you can see the regression planes intersect with each other.

predict3d(fit5, overlay=TRUE)
rglwidget(elementId ="6th")

To be continued…