The R package predict3d draws prediction plots for various regression models. This is Part 2 of the package vignette. If you are unfamiliar with predict3d, please read Part I of the vignette, From Simple to Multiple Regression, first.

Models With More Than Three Explanatory Variables

You can fit a multiple regression model with more than three explanatory variables.

fit=lm(mpg~hp*wt+am+disp+gear+carb,data=mtcars)
summary(fit)


Call:
lm(formula = mpg ~ hp * wt + am + disp + gear + carb, data = mtcars)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.0230 -1.3266 -0.3016  1.4476  4.4619 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 42.556287   6.362277   6.689 6.43e-07 ***
hp          -0.120234   0.032011  -3.756 0.000974 ***
wt          -7.783129   2.023098  -3.847 0.000775 ***
am          -1.370464   1.735176  -0.790 0.437372    
disp        -0.003415   0.011177  -0.306 0.762573    
gear         2.073539   1.100990   1.883 0.071828 .  
carb        -0.670690   0.550481  -1.218 0.234925    
hp:wt        0.030688   0.008791   3.491 0.001885 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.152 on 24 degrees of freedom
Multiple R-squared:  0.9013,    Adjusted R-squared:  0.8726 
F-statistic: 31.32 on 7 and 24 DF,  p-value: 1.41e-10

You can draw this model with ggPredict(). When calling this function, you can specify up to three predictor variables. The first predictor is specified with the pred argument and is used as the x-axis variable. The second is specified with the modx argument and is used as the color variable. The third is specified with the mod2 argument and is used as the faceting variable. If you do not specify any predictor variables, ggPredict() automatically uses up to three predictors from the regression model. The following two calls produce the same plot.

ggPredict(fit)  
ggPredict(fit,pred=hp,modx=wt,mod2=am)

You can also draw a 3D plot of this model.

predict3d(fit) 
rglwidget(elementId ="1st")

As the figures above show, when the model has more predictor variables than are specified for the plot, the remaining variables are held at typical values when the slope and intercept of each regression line are calculated, and those values are displayed at the lower right of the plot. The typical value is the median for numeric, integer, and ordered factor vectors and the most frequent value for factor, character, and logical vectors. In this regression model, gear is replaced by 4, carb by 2, and am by 0.
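The typical-value rule described above can be sketched in base R. The helper typical_value() is a hypothetical name used for illustration only; it is not a function exported by predict3d, and the package's internal implementation may differ.

```r
# Hypothetical helper (not part of predict3d): returns the "typical" value
# of a vector, following the rule described in the text.
typical_value <- function(x) {
  x <- x[!is.na(x)]
  if (is.ordered(x)) {
    # Median level of an ordered factor, computed on the integer codes;
    # type = 1 always returns an observed value, never an average of two.
    code <- stats::quantile(as.integer(x), probs = 0.5, type = 1, names = FALSE)
    return(levels(x)[code])
  }
  if (is.numeric(x)) {
    # Numeric and integer vectors: the median
    return(stats::median(x))
  }
  # Factors, characters, and logicals: the most frequent value,
  # returned as a character string (ties go to the first level)
  counts <- table(x)
  names(counts)[which.max(counts)]
}

# Typical values for three of the covariates in the mtcars model
typical <- sapply(mtcars[c("gear", "carb", "am")], typical_value)
typical
```

For mtcars this gives 4 for gear, 2 for carb, and 0 for am, which agrees with the values quoted above.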

You can also draw the plot with only one or two predictors.

ggPredict(fit,pred=hp)  
ggPredict(fit,pred=hp,modx=wt)

Predictor variables can be specified in any order, regardless of their order in the regression formula.

ggPredict(fit, pred=wt)
ggPredict(fit, pred=wt, modx=am)

Polynomial Regression

Polynomial regression is a form of regression analysis in which the relationship between the independent variable x and the dependent variable y is modelled as an nth degree polynomial in x.

\(y = a + bx + cx^2 + dx^3 + \cdots\)
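As a small illustration of the formula above, the following base R sketch fits a quadratic to the built-in mtcars data (chosen here only so that the example is self-contained) and checks that the fitted curve is exactly \(a + bx + cx^2\) evaluated with the estimated coefficients. The object names fit_q, hp_new, by_hand, and by_predict are made up for this example.

```r
# Quadratic regression of mpg on hp; I() protects the power inside the formula
fit_q <- lm(mpg ~ hp + I(hp^2), data = mtcars)
b <- coef(fit_q)

# Evaluate the polynomial by hand at a few horsepower values
hp_new <- c(100, 150, 200)
by_hand <- b[["(Intercept)"]] + b[["hp"]] * hp_new + b[["I(hp^2)"]] * hp_new^2

# predict() evaluates the same polynomial from the fitted model
by_predict <- unname(predict(fit_q, newdata = data.frame(hp = hp_new)))

all.equal(by_predict, by_hand)
```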

The radial data set in the moonBook package contains demographic data for 115 patients who underwent an IVUS (intravascular ultrasound) examination of a radial artery. Body weight can be modelled with higher powers of height. The ggPredict() function can draw such a polynomial regression model.

require(moonBook)

fit1=lm(weight~I(height^2)+height+sex,data=radial)
summary(fit1)

Call:
lm(formula = weight ~ I(height^2) + height + sex, data = radial)

Residuals:
     Min       1Q   Median       3Q      Max 
-17.9490  -3.5091  -0.0529   3.8567  18.7530 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)  
(Intercept) 300.083478 170.598554   1.759   0.0814 .
I(height^2)   0.013618   0.006623   2.056   0.0421 *
height       -3.670114   2.127042  -1.725   0.0873 .
sexM         -0.014115   1.918601  -0.007   0.9941  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 6.605 on 110 degrees of freedom
  (1 observation deleted due to missingness)
Multiple R-squared:  0.5005,    Adjusted R-squared:  0.4869 
F-statistic: 36.75 on 3 and 110 DF,  p-value: < 2.2e-16

ggPredict(fit1,xpos=c(0.4,0.6))

You can also draw a 3D plot of this model.

predict3d(fit1,radius=1)
rglwidget(elementId ="2nd")

Transformation of Variables

When a model does not meet the normality, linearity, or homoscedasticity assumptions, transforming one or more variables can often improve or correct the situation. You can fit a model with transformed variables, here a log transformation, like this. The variable NTAV (normalized total atheroma volume measured by IVUS, in cubic mm) in the radial data is a surrogate marker of atherosclerosis.

fit2=lm(log(NTAV)~log(hsCRP)*age,data=radial)
summary(fit2)

Call:
lm(formula = log(NTAV) ~ log(hsCRP) * age, data = radial)

Residuals:
    Min      1Q  Median      3Q     Max 
-1.1356 -0.1705 -0.0280  0.1953  0.9291 

Coefficients:
                Estimate Std. Error t value Pr(>|t|)    
(Intercept)     4.143208   0.295372  14.027   <2e-16 ***
log(hsCRP)      0.194921   0.135310   1.441    0.153    
age             0.000626   0.004573   0.137    0.891    
log(hsCRP):age -0.002995   0.002130  -1.406    0.163    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.3152 on 107 degrees of freedom
  (4 observations deleted due to missingness)
Multiple R-squared:  0.0481,    Adjusted R-squared:  0.02141 
F-statistic: 1.802 on 3 and 107 DF,  p-value: 0.1512

In this situation, you can draw the model with any combination of a normal or log x-axis scale and a normal or log y-axis scale.

require(ggplot2)
options(ggPredict.show.point=FALSE)
ggPredict(fit2,plot=FALSE)$p+labs(title="A. Log y-axis scale")
ggPredict(fit2,pred=log(hsCRP),modx=age,plot=FALSE)$p+labs(title="B. Log x-axis and log y-axis scale")
ggPredict(fit2,dep=NTAV,plot=FALSE)$p+labs(title="C. Normal x-axis and normal y-axis scale")
ggPredict(fit2,pred=log(hsCRP),modx=age,dep=NTAV,plot=FALSE)$p+labs(title="D. Log x-axis scale")
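The effect of the axis scale can also be checked numerically. The sketch below uses base R and the built-in mtcars data (an assumption made so that the example runs without moonBook); it illustrates the general back-transformation idea and is not a description of predict3d's internals. On the log scale the fitted relationship is a straight line, and applying exp() returns the predictions to the original scale of the response, where the same relationship is curved.

```r
# Log-log model: log(mpg) is linear in log(hp)
fit_log <- lm(log(mpg) ~ log(hp), data = mtcars)

# hp doubles at each step, so the points are equally spaced on a log x-axis
new_hp <- data.frame(hp = c(50, 100, 200))

pred_log <- unname(predict(fit_log, newdata = new_hp))  # on the log(mpg) scale
pred_mpg <- exp(pred_log)                               # back on the mpg scale

diff(pred_log)  # equal increments: a straight line on log-log axes
diff(pred_mpg)  # unequal increments: a curve on the original y-axis
```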

The ggPredict() function can handle terms wrapped in sqrt(), log(), exp(), I(x^2), and factor().

fit3=lm(mpg ~ sqrt(wt)*factor(vs),data=mtcars)
ggPredict(fit3)
ggPredict(fit3,pred=sqrt(wt),modx=factor(vs))

The ggPredict() function can also handle character and factor variables.

fit4=lm(log(NTAV)~age*sex,data=radial)
ggPredict(fit4,xpos=0.5)
fit5=lm(Sepal.Length~Sepal.Width*Species,data=iris)
ggPredict(fit5,xpos=0.5)

Complex Regression Formula

Even when the left-hand side of the formula is not a simple variable name, the ggPredict() function can plot the model.

fit6 = lm(100/mpg ~ hp*wt,data=mtcars)
ggPredict(fit6, xpos=0.5)
ggPredict(fit6, dep=mpg,xpos=0.5)
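When the response is an expression such as 100/mpg, predictions from the model come out on that transformed scale. The base R sketch below shows one way to map them back to mpg by inverting the transformation by hand. It illustrates the idea only and is not necessarily how ggPredict() handles the dep argument internally; the names fit_inv, new_cars, pred_inv, and pred_mpg are made up for this example.

```r
# Same model as fit6, refitted here so the example is self-contained
fit_inv <- lm(100/mpg ~ hp * wt, data = mtcars)

# Two illustrative cars (values chosen arbitrarily within the data range)
new_cars <- data.frame(hp = c(100, 150), wt = c(2.5, 3.5))

# predict() returns values on the scale of the left-hand side, 100/mpg
pred_inv <- unname(predict(fit_inv, newdata = new_cars))

# Invert the transformation to express the predictions as mpg
pred_mpg <- 100 / pred_inv
pred_mpg
```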

fit7 = lm( 2^cyl ~ sqrt(hp)*wt,data=mtcars)
ggPredict(fit7, xpos=0.5)
ggPredict(fit7, dep=cyl, modx=wt,xpos=0.5)