ggPoints() for an interactive scatterplot

Keon-Woong Moon

2016-11-30

ggPoints() is a main function of the ggiraphExtra package. It makes an interactive scatterplot with regression lines and is a useful extension of ggplot2. It builds the plot with stat_smooth() from ggplot2 and geom_point_interactive() from ggiraph.

Package installation

You can install the ggiraphExtra package from GitHub with the following command.

#install.packages("devtools")
devtools::install_github("cardiomoon/ggiraphExtra")

First example

You can make an interactive scatterplot easily. Basically, ggPoints() is a shortcut for geom_point_interactive() and geom_smooth(). The syntax is the same as in ggplot2. By default, ggPoints() makes a static ggplot. To make the scatterplot interactive, set the parameter interactive to TRUE. You can then zoom in or out with your mouse wheel.

require(ggiraphExtra)
require(ggplot2)
require(ggiraph)
require(plyr)

ggplot(mtcars,aes(wt,mpg)) + geom_point() + geom_smooth()

ggPoints(aes(x=wt,y=mpg),data=mtcars,interactive=TRUE)

What is the difference between a standard ggplot using geom_point() and geom_smooth() and ggPoints()?

Here is one example. ggplot() treats a dummy variable such as am (coded 0/1) as numeric, but ggPoints() treats it as a factor.

ggplot(mtcars,aes(wt,mpg,color=am)) + geom_point() + geom_smooth(method="lm")

ggPoints(aes(x=wt,y=mpg,color=am),data=mtcars,method="lm",interactive=TRUE)
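
For comparison, the factor treatment can be reproduced in plain ggplot2 by wrapping the variable in factor(). This is a minimal sketch of the idea, not necessarily how ggPoints() works internally; the object name p_factor is only illustrative.

```r
library(ggplot2)

# Converting am (coded 0/1) to a factor gives one colour and one
# regression line per group instead of a continuous colour scale.
p_factor <- ggplot(mtcars, aes(wt, mpg, color = factor(am))) +
  geom_point() +
  geom_smooth(method = "lm")

# print(p_factor) draws the static plot
```

The result is static; the interactive tooltips and zooming still require ggPoints() with interactive=TRUE.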

You can adjust this behaviour with the maxfactorno parameter. Its default value is 6, which means that any numeric variable with at most 6 unique values is treated as a factor.

ggPoints(aes(x=wt,y=mpg,color=carb,facet=carb),data=mtcars,method="lm",interactive=TRUE)
ggplot(data=mtcars,aes(x=wt,y=mpg,color=carb))+geom_point()+
        geom_smooth(method="lm")+facet_wrap(~carb)

If you do not want this feature, set maxfactorno to a value smaller than the number of unique values of the variable.

ggPoints(aes(x=wt,y=mpg,color=carb),data=mtcars,maxfactorno=3,interactive=TRUE)
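
The maxfactorno rule can be checked by hand before plotting. The sketch below assumes the rule is simply "numeric with at most maxfactorno unique values", as described above; treated_as_factor() is a hypothetical helper written for illustration and is not part of ggiraphExtra.

```r
# Hypothetical helper (not part of ggiraphExtra): TRUE when a numeric
# variable has at most `maxfactorno` distinct values.
treated_as_factor <- function(x, maxfactorno = 6) {
  is.numeric(x) && length(unique(x)) <= maxfactorno
}

treated_as_factor(mtcars$am)                     # am has 2 distinct values
treated_as_factor(mtcars$carb)                   # carb has 6 distinct values
treated_as_factor(mtcars$carb, maxfactorno = 3)  # 6 > 3, so carb stays numeric
treated_as_factor(mtcars$wt)                     # wt is continuous
```

Under this assumption, am and carb are converted with the default setting, while lowering maxfactorno to 3 keeps carb numeric.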

Customize tooltip and subset

You can customize the tooltip. If you want to use car names as the tooltip, make a column containing the desired names and pass the column name to the tooltip parameter.

mtcars$name=rownames(mtcars)
ggPoints(aes(x=wt,y=mpg,color=am),tooltip="name",data=mtcars,interactive=TRUE)
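
A tooltip column does not have to hold names only. As a sketch, the following builds a longer label in base R; the data frame copy mt and the column name label are illustrative, and it is an assumption that the tooltip parameter accepts any character column in the same way as name.

```r
# Work on a copy so the built-in mtcars data stays unchanged
mt <- mtcars

# Combine the car name with the plotted values into one text label
mt$label <- paste0(rownames(mt), " (wt ", mt$wt, ", mpg ", mt$mpg, ")")

head(mt$label, 3)
```

If that assumption holds, the call would be ggPoints(aes(x=wt,y=mpg,color=am),tooltip="label",data=mt,interactive=TRUE).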

Select regression method

You can change the regression method to linear regression by setting the parameter method="lm". With linear regression models, you can see the regression equations when hovering the mouse over the regression line(s).

ggPoints(aes(x=wt,y=mpg,color=am),method="lm",data=mtcars,interactive=TRUE)
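
The coefficients behind each regression line can also be computed directly with lm(). This is a sketch in base R that fits one model per level of am; the exact format in which ggPoints() shows the equation in the tooltip is not reproduced here.

```r
# Fit mpg ~ wt separately for each level of am (0 = automatic, 1 = manual)
fits <- lapply(split(mtcars, mtcars$am), function(d) lm(mpg ~ wt, data = d))

# One column per group, rows are the intercept and the slope for wt
coefs <- sapply(fits, coef)
print(round(coefs, 2))
```

Each column gives the intercept and slope of the line drawn for that group.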

Easy facet

You can make separate plots easily by using the parameter facet.

ggPoints(aes(x=wt,y=mpg,fill=am,facet=am),method="lm",data=mtcars,interactive=TRUE,shape=21)

Polynomial regression

You can plot a polynomial regression model by passing a formula such as y~poly(x,2). With polynomial regression models, you can see the regression equations when hovering the mouse over the regression line(s).

require(gcookbook)

ggPoints(aes(x=heightIn,y=weightLb,fill=sex),method="lm",formula=y~poly(x,2),data=heightweight,title="Linear regression",subtitle="formula=y~poly(x,2)",interactive=TRUE,shape=21)
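
When reading the coefficients of such a model, note that poly(x,2) uses orthogonal polynomials by default, so they are not the coefficients of x and x^2. The sketch below uses mtcars instead of heightweight only to avoid an extra package; which form ggPoints() reports in its tooltip is not stated in this document.

```r
# Orthogonal polynomial terms (the default of poly())
fit_orth <- lm(mpg ~ poly(wt, 2), data = mtcars)

# Raw polynomial terms: coefficients of wt and wt^2 directly
fit_raw <- lm(mpg ~ poly(wt, 2, raw = TRUE), data = mtcars)

# Both describe the same curve, so the fitted values agree
all.equal(fitted(fit_orth), fitted(fit_raw))

# The coefficients themselves differ between the two parameterisations
coef(fit_orth)
coef(fit_raw)
```

Use raw=TRUE when you need an equation you can write out as a + b*x + c*x^2.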

Logistic regression

You can draw a scatterplot for a binary dependent variable. The GBSG2 data contains 686 observations from the German Breast Cancer Study Group 2 (GBSG2) study. You can get a logistic regression line with a jittered scatterplot by setting the parameter method to "glm".

require(TH.data)
data(GBSG2)
ggPoints(aes(x=pnodes,y=cens),data=GBSG2,method="glm",interactive=TRUE)
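
The curve drawn for a binary outcome corresponds to a binomial glm. As a sketch, the same kind of model can be fitted in base R; the example uses mtcars with am as the 0/1 outcome so that it runs without TH.data, and the object names fit and grid are illustrative.

```r
# Logistic regression of a 0/1 outcome on one numeric predictor
fit <- glm(am ~ wt, data = mtcars, family = binomial)

# Predicted probabilities over the observed range of the predictor
grid <- data.frame(wt = seq(min(mtcars$wt), max(mtcars$wt), length.out = 50))
grid$prob <- predict(fit, newdata = grid, type = "response")

head(grid)
```

The prob column traces the S-shaped curve that a logistic regression line follows, bounded between 0 and 1.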

You can get separate logistic regression lines by setting the parameter color or fill. You can get faceted plots by setting the parameter facet.

ggPoints(aes(x=pnodes,y=cens,color=horTh),data=GBSG2,method="glm",se=FALSE,interactive=TRUE)
ggPoints(aes(x=pnodes,y=cens,color=horTh,facet=horTh),data=GBSG2,method="glm",interactive=TRUE)