It has been said that an image is worth a thousand words. It has also been said that good graphics “force one to see that which otherwise would have been missed” (John Tukey). The reason for these advantages is tied to our evolutionary history; visually identifying complex patterns that might indicate a threat to our survival was quite adaptive. Failing to understand a table of numbers, on the other hand, likely never threatened the survival of our ancestors.

The landscape has changed, and with it, so have the threats to our “survival.” The dangers we face now do not come in the form of saber-toothed tigers, bears, sharks, and the like. Rather, the danger we face is misinformation. The replication crisis has shed light on the fact that this new landscape is dangerous in other ways, as misinformation spreads throughout science and the public.

Unfortunately, statistical analysis is enormously complex, with more nuances and complexities growing daily. With the added complexity, the potential for misinformation grows even larger. It is curious, then, that with the added potential for misunderstanding, there hasn’t been a proportional push for greater simplification of interpretation. Given that our visual pattern recognition system is highly developed, graphics seem to be the most intuitive place to start simplifying the message.

I suspect researchers fail to use graphics for two reasons. First, people do not understand how to produce good visualizations. Unfortunately, it is just as easy to deceive with visuals as it is with numbers. Although visualization resources are readily available, a cursory review of the literature demonstrates the message isn’t getting across; researchers rarely use figures, and when they do, they do so poorly. The bigger reason, I suspect, is software limitations. Most standard graphics produced by statistical software are misleading, poor representations of the analyses, or plain ugly. Even if users are somehow able to overcome the software obstacle, deciding what graphic best represents the statistical analysis is another problem.

This paper attempts to address all these limitations by introducing Flexplot: an easy-to-use software package with a point-and-click interface that simplifies the visualization process. Flexplot produces publication-ready graphics that follow best practices in visualization. Flexplot automates much of the decision-making in the backend. It is intelligent enough to determine which variables are categorical versus numeric, whether to do a scatterplot, histogram, bar chart, etc. By providing access to all these sorts of graphics in a single interface, Flexplot frees users to spend their intellectual resources interpreting the graphics.

In this paper, I will proceed as follows. I first introduce the framework and general rules Flexplot uses to generate graphics. Subsequently, I illustrate the use of flexplot by showing how various sorts of modeling strategies (e.g., t-tests, ANOVAs, regressions) can be visualized using flexplot. I conclude with an example that shows how modeling and visualization can be integrated.

The General Linear Model Approach

Flexplot adopts a “General Linear Model” approach to visualization. Although most introductory statistics classes teach ANOVAs, t-tests, regressions, etc. as separate and distinct procedures, most researchers eventually come to learn they are all part and parcel of the general linear model (GLM), which is simply an expanded form of regression. One can, for example, perform a t-test using regression (GLM) by relabeling one group as zero (e.g., Males=0) and the other as one (e.g., Females=1). The slope of said model is equivalent to the mean difference between the two groups (and the intercept is equal to the mean of the group recoded as zero), and the test of significance for that slope is equivalent to a t-test. In mathematical notation:

\(Y = b_0 + b_1\times X\)

where X is a dummy-coded variable indicating group membership. (For a more thorough treatment of general linear models, see Cohen, 1968.)
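The equivalence between the dummy-coded regression and the t-test can be checked numerically. The following is a minimal sketch in base R on simulated data; the group sizes, means, and variable names are assumptions chosen only for illustration.

```r
# Simulated data; group sizes and the effect size are arbitrary illustrations.
set.seed(1)
group   <- rep(c("Male", "Female"), each = 50)
x       <- ifelse(group == "Female", 1, 0)   # dummy code: Male = 0, Female = 1
outcome <- 10 + 2 * x + rnorm(100)

# Regression on the dummy-coded predictor: Y = b0 + b1 * X
fit <- lm(outcome ~ x)
b0  <- unname(coef(fit)[1])   # intercept: mean of the group coded 0
b1  <- unname(coef(fit)[2])   # slope: mean difference between the groups

# Classic equal-variance t-test on the same data
tt <- t.test(outcome[x == 1], outcome[x == 0], var.equal = TRUE)

mean_male <- mean(outcome[x == 0])
mean_diff <- mean(outcome[x == 1]) - mean_male
p_slope   <- summary(fit)$coefficients["x", "Pr(>|t|)"]

print(c(intercept = b0, mean_male = mean_male))
print(c(slope = b1, mean_diff = mean_diff))
print(c(p_regression = p_slope, p_ttest = tt$p.value))
```

Each printed pair should agree: the intercept matches the mean of the group coded zero, the slope matches the mean difference, and the two p-values coincide.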

If a grouping variable is a conversion from a label (e.g., Male versus Female) to a number (e.g., 0 versus 1), then one can plot the grouping variables on an X axis exactly as one would in a scatterplot (see, for example, the left-most panel in the figure below). This scatterplot can be improved further by labeling the axes (second panel).

However, this introduces a problem: because all males share the same X score (in other words, since all males share the score “Male”), there is too much overlap to see what is going on. This problem was solved years ago (reference) by the practice of “jittering.” Jittering means that we simply take the dummy-coded scores (1 and 0 in this case) and add a trivial amount of noise to each score. For example, one male’s score, formerly zero, might become .0987, while another male’s score might become -.0972. Likewise, a female’s score (formerly one) might become 0.993 and another’s might become 1.045. Jittering minimizes overlap between datapoints and makes it easier to see where each individual score falls (see left panel in the figure below).
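Jittering can be sketched in a few lines of base R. This is an illustration only, not Flexplot's own code; the noise half-width of 0.1 is an assumed value.

```r
set.seed(2)
x <- rep(c(0, 1), each = 20)     # dummy codes: 0 = Male, 1 = Female
noise_width <- 0.1               # assumed half-width of the uniform noise
x_jittered  <- x + runif(length(x), -noise_width, noise_width)

# Every jittered score stays close to its original dummy code
print(range(x_jittered[x == 0]))
print(range(x_jittered[x == 1]))
```

Plotting the outcome against `x_jittered` rather than `x` spreads the points horizontally while keeping the two groups clearly separated.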

This image can be improved still further by adding two elements. First, we can overlay means and standard errors. Second, we can alter the amount of noise applied during the jittering such that areas with greater concentration of scores are jittered more, while areas with low concentration are jittered very little. (Technically, these images are jittered proportional to the density function of Outcome for that particular group. With enough datapoints, the “jittered density,” or JD, plots begin looking like a reflected histogram. As an aside, this is very similar to how violin plots are created.) These two improvements (overlaying means/standard errors and density-based jittering) are shown in the right panel of the figure below. With these improvements, it is very easy to see how the two groups compare.
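One way to make the jitter proportional to the density is sketched below in base R. It is a rough approximation under stated assumptions, not Flexplot's implementation: the kernel density estimate comes from `density()`, and `max_width` is a hypothetical cap on the horizontal spread.

```r
set.seed(3)
outcome <- rnorm(200, mean = 50, sd = 10)   # simulated scores for one group
group_x <- 0                                # dummy code for this group

# Estimate the density of the outcome and evaluate it at each observed score
dens      <- density(outcome)
dens_at_y <- approx(dens$x, dens$y, xout = outcome)$y

# Scale the noise so the densest region receives the widest jitter
max_width  <- 0.2                           # hypothetical maximum half-width
half_width <- max_width * dens_at_y / max(dens_at_y)
x_jittered <- group_x + runif(length(outcome), -1, 1) * half_width
```

Scores near the center of the distribution are spread widely, while scores in the tails stay close to the group's position on the X axis, which is what gives the plot its reflected-histogram shape.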

This conversion from t-test to GLM may require a slight modification in conceptualization. The traditional t-test aims to test differences between means. The GLM approach, on the other hand, treats group membership as a predictor and asks whether group membership is associated with the outcome variable. Although the conceptual difference seems nontrivial (i.e., mean difference versus association), the mathematics of both approaches are identical.

Once one is able to shift their conceptualization of grouping variables (or, in GLM terms, categorical predictor variables), the distinctions between groups and predictors become less important. This shift in thinking is necessary to understand the rationale of flexplot.

Much as the GLM conceptualizes statistical modeling in the form of an equation, flexplot also conceptualizes visualization in the form of an equation. The user need only specify the outcome and the predictor variable(s). Sometimes the user may also decide to specify how predictor variables are displayed (on the x-axis, as separate colors, as separate panels, etc.).

Flexplot’s Rules

This section will briefly explain the rules flexplot follows to generate graphics. Those looking for more specific information are welcome to study the code within Flexplot itself.

Flexplot was built using R, then adapted for use in the point-and-click software Jamovi: an open-source statistical computing software that allows developers (such as myself) to create “modules” for the general public. Flexplot is one such module.

The figure below shows the interface for Flexplot. Notice there are three panels of interest: Outcome variable, Predictor variable, and Paneled Variable. Aside from the outcome variable, Flexplot allows a total of four variables to be displayed: two variables in the “Predictor variables” box, and two in the “Paneled variables” box. For simplicity, I will name the first variable entered into Predictor variable as Predictor 1 and the second as Predictor 2. The first variable entered into Paneled variable I will call Panel 1 and the second Panel 2.

Predictor 1 will always be displayed on the X axis. If this variable is numeric, flexplot will produce a scatterplot. If this first variable is categorical, flexplot will produce a JD plot. Predictor 2 will be shown as different lines/colors/symbols. If Predictor 2 is numeric, it will first be “binned,” and each line/color/symbol will represent the relationship between Predictor 1 and the outcome for that particular level of Predictor 2. Likewise, if the variable is categorical, each line/color/symbol will represent the Predictor 1/outcome relationship for that level of (categorical) Predictor 2.

Panel 1 will show up in column panels, while Panel 2 will show up in row panels. As before, whichever of these variables are numeric will be binned first.

The user may wish to view the distribution of a particular variable (e.g., the outcome variable). To do so, the user simply specifies no predictor and no panel variables. Flexplot will then produce histograms (for numeric variables) or barplots (for categorical variables).
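The rules for the outcome and Predictor 1 can be summarized as a small decision function. The sketch below is purely illustrative: `choose_plot` is a hypothetical helper, not a function in Flexplot, and the toy data frame only borrows variable names from the examples in this paper.

```r
# Hypothetical helper (not part of Flexplot): pick a plot type from variable types
choose_plot <- function(outcome, predictor1 = NULL) {
  if (is.null(predictor1)) {
    # No predictors: show the distribution of the outcome
    if (is.numeric(outcome)) "histogram" else "barplot"
  } else if (is.numeric(predictor1)) {
    "scatterplot"
  } else {
    "jittered density plot"
  }
}

# Toy data with made-up values
d <- data.frame(
  weight.loss  = c(2.1, 5.4, 3.3, 7.8),
  motivation   = c(40, 55, 47, 62),
  therapy.type = factor(c("control", "cog", "beh", "beh"))
)

choose_plot(d$weight.loss)                   # "histogram"
choose_plot(d$therapy.type)                  # "barplot"
choose_plot(d$weight.loss, d$motivation)     # "scatterplot"
choose_plot(d$weight.loss, d$therapy.type)   # "jittered density plot"
```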

Now that I have explained the rules behind Flexplot, I will now demonstrate how to visualize various common analyses in research. In the process, I will highlight several features of Flexplot that enhance visual interpretation.

Regression in Flexplot

Much as the GLM is the basis of the majority of statistical analyses, a scatterplot is the basis of most visualizations in flexplot. All other analyses are simply extensions of this outcome/predictor visual.

As expected, to visualize simple regression (i.e., one predictor variable), the user only needs to input the predictor variable and the outcome of interest. Flexplot will default to showing a “loess” or “lowess” line. In short, a loess line is a nonparametric curve that is allowed to “bend” with the data. The reason flexplot defaults to a loess curve is because often nonlinearities in data are masked if the researcher overlays a straight line. For example, in the figure below, the left plot appears to have a very weak linear relationship. It isn’t until we overlay a loess line that we realize the data are actually curvilinear.
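The way a straight line can hide curvature is easy to reproduce with simulated data. The sketch below uses base R's `lm()` and `loess()` on an assumed U-shaped relationship; it does not use Flexplot's own plotting code.

```r
set.seed(4)
x <- runif(200, -3, 3)
y <- x^2 + rnorm(200)            # simulated curvilinear (U-shaped) relationship

linear_fit <- lm(y ~ x)          # straight line
loess_fit  <- loess(y ~ x)       # nonparametric curve that can bend

# Residual sums of squares: the curve that bends should fit far better
rss_linear <- sum(residuals(linear_fit)^2)
rss_loess  <- sum(residuals(loess_fit)^2)
print(c(linear = rss_linear, loess = rss_loess))
```

The fitted straight line is nearly flat here, so a researcher who overlaid only that line would conclude there is little relationship, even though the loess fit leaves far less unexplained variation.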

Flexplot allows the user to specify a regression (straight) line, a loess line, a “robust” line, etc.:

Which results in this image:

Analysis of Variance (ANOVA)/T-Test

As mentioned previously, a categorical variable may be treated as numeric and subsequently plotted as a scatterplot. The examples shown in the first and second graphics are visual representations of a t-test. Because the only difference between an ANOVA and a t-test is the number of groups, a visual for an ANOVA simply has three levels on the x axis, rather than two. The figure below shows JD plots representing a t-test on the left and an ANOVA on the right.

The user has the option of including means + standard errors, means + standard deviations, or medians with interquartile ranges. These may technically not be necessary, because the raw data provide all the information one would need. However, it is helpful to see a visual display of central tendency and variability. The choice of which to display depends on what statistics the user chooses to emphasize. If one is performing significance tests, means + standard errors should probably be reported (since p-values are a function of standard errors). If one is instead emphasizing effect sizes, such as Cohen’s d, means + standard deviations would be more appropriate (since Cohen’s d is a function of standard deviations). The default is to report medians with interquartile ranges. The rationale for this is that it will highlight potential deviations from standard assumptions (normality, homoskedasticity).
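The three kinds of summaries can be computed side by side. The sketch below uses base R on simulated data; `summarize_group` is a hypothetical helper written for this illustration.

```r
set.seed(5)
d <- data.frame(
  group   = rep(c("control", "treatment"), each = 30),
  outcome = c(rnorm(30, 10, 3), rnorm(30, 12, 3))   # simulated scores
)

# Hypothetical helper returning all three center/spread summaries
summarize_group <- function(y) {
  c(mean   = mean(y),
    sd     = sd(y),                       # pairs with effect sizes such as Cohen's d
    se     = sd(y) / sqrt(length(y)),     # pairs with significance tests
    median = median(y),
    q25    = unname(quantile(y, 0.25)),   # lower end of the interquartile range
    q75    = unname(quantile(y, 0.75)))   # upper end of the interquartile range
}

# One row per group, one column per statistic
stats <- t(sapply(split(d$outcome, d$group), summarize_group))
print(round(stats, 2))
```

Because the standard error shrinks with sample size while the standard deviation does not, error bars based on the two can look very different for the same data.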

To produce the graphics in Flexplot, one would simply put rewards (or therapy.type) in the predictor variable box and weight.loss in the outcome variable box, as shown below.

Factorial ANOVA

Recall that a Factorial ANOVA has two categorical variables, rather than one. For example, we might analyze the relationship between weight loss and gender/therapy.type. A graphic to match ought to incorporate both variables. This can be done in one of two ways. The first way is to show the second variable as separate symbols/colors (left-most image in the figure below). Another way is to separate into panels, as in the middle plot in the figure below. The advantage of the left plot (where the second factor is presented as separate symbols/colors) is that it makes it easier to compare because the summary statistics are situated close together. Unfortunately, there is often a lot of overlap, which may mask important relationships. The graphic in the middle, however, reduces overlap, but makes it harder to make comparisons across panels. The far right panel uses a “ghost” line. The red line, or the ghost line, simply repeats the pattern from the “male” panel into the “female” panel, making it much easier to see that, across experimental conditions, females have more weight loss than males.

One difficulty with multivariate data such as this is that one then has to choose which variable goes on the X axis. Each will give a different “view” of the data and, as such, both ought to be displayed (if not for publication, at least during data analysis). The figure, for example, shows therapy.type on the x axis in the left panel, and gender in the right panel. Each gives different insights into the data. In this case, the left panel emphasizes the main effect of therapy.type, while the right panel emphasizes the main effect of gender.

library(ggplot2)
library(cowplot)
library(flexplot)

# Load the example dataset that ships with flexplot
data(exercise_data)
d <- exercise_data

# gender on the x axis, paneled by therapy.type, with a red ghost line
# (no ghost reference is specified, so flexplot notes that it picks one at random)
a <- flexplot(weight.loss ~ gender | therapy.type, data = d, ghost.line = "red")

# therapy.type on the x axis, paneled by gender
b <- flexplot(weight.loss ~ therapy.type | gender, data = d, ghost.line = "red")

# Place the two plots side by side
plot_grid(a, b, ncol = 2)

ANCOVA

An ANCOVA generally involves a categorical grouping variable (e.g., an experimental effect) and a numeric “covariate.” When one performs an ANCOVA, they are typically not interested in the covariate. Rather, they seek to remove the influence of that predictor on the outcome. One way to visualize removing the effect of a predictor on an outcome is with Added Variable Plots (AVPs). In the background, the AVP models the relationship between the covariate and the outcome, residualizes the effect of the covariate (i.e., subtracts the fitted score from the actual score), then plots the categorical variable against the residuals. Residualizing a predictor removes any signal associated with the linear effect of that predictor.

Residualizing does not, however, remove any nonlinear relationships, such as interaction effects and polynomials. Because of this, I recommend plotting two graphics. One graphic will display the grouping variable as separated lines/colors/symbols. This first graphic will make it clear whether nonlinear and/or interaction effects are present. This is then supplemented with an AVP.

For example, the left panel of the figure below shows the relationship between motivation and weight.loss for each therapy.type (with a ghost line, for better interpretation). There may be some nonlinearity present, and the lines may deviate from parallel (indicating an interaction effect). However, these are relatively small sample sizes, so these minor deviations are probably nothing to worry about. The right panel shows the mean differences on the residualized outcome variable (as indicated by weight.loss|motivation), showing that there’s a slight advantage for beh over both cog and control conditions.

Normally when one residualizes a predictor variable from an outcome, the residuals are centered around zero. This can be confusing for those unfamiliar with AVPs. As such, Flexplot adds the mean of the dependent variable back into the residuals to make the visuals less intimidating.
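The residualizing steps behind an AVP can be sketched with base R. This is an approximation on simulated data, not Flexplot's internal code; the variable names are borrowed from the example and the effect sizes are assumptions.

```r
set.seed(6)
n <- 150
therapy.type <- factor(sample(c("control", "cog", "beh"), n, replace = TRUE))
motivation   <- rnorm(n, 50, 10)
weight.loss  <- 0.3 * motivation + 2 * (therapy.type == "beh") + rnorm(n, sd = 3)

# Step 1: model the outcome from the covariate only
covariate_fit <- lm(weight.loss ~ motivation)

# Step 2: residualize, then add the outcome mean back to restore the original scale
residualized <- residuals(covariate_fit) + mean(weight.loss)

# Step 3: compare groups on the residualized outcome
group_means <- tapply(residualized, therapy.type, mean)
print(round(group_means, 2))
```

After Step 2 the residualized scores are uncorrelated with the covariate but keep the mean of the original outcome, so the Y axis stays on a familiar scale.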

To create the left graphic in Jamovi, one would simply put weight.loss in the outcome variable box, motivation in the predictor variable box, and therapy.type in the panel box. Notice the ghost.line box is checked. To create the right graphic, one would do all the same, except also check the “Residualize predictor variable” box.

Multiple Regression

Finally, we come to multiple regression (MR). Researchers tend to call an analysis an MR if they include multiple numeric variables. In actuality, MR is synonymous with the GLM; the mathematics (and flexplot, for that matter) don’t care whether the researcher considers the variables numeric or categorical. As such, we will illustrate multiple variable types in this section.

When plotting multivariate relationships, it is essential that we use paneling, AVPs, and ghost lines; otherwise things become overly complicated. The human brain is only capable of retaining so much information at once. To further simplify the analysis, I generally remove the standard error bars from the plots since they tend to add overlap between different plotted relationships. I also tend to plot regression lines, unless the loess lines indicate serious deviations from linearity. Also, flexplot is only capable of plotting four predictor variables at once. More than that is exceptionally difficult for the human mind to interpret, so I have elected not to allow more than that. (One could plot separate paneled graphs for each of the levels of another variable to accomplish this, but again, that becomes quite difficult to interpret.)

There is one other important feature of flexplot that needs noting before I demonstrate in Jamovi: numeric variables in the second predictor slot, row panels, or column panels will be “binned.” Binning simply means we convert them to categorical variables (e.g., each score on the numeric variable is categorized as high, medium, or low). Without this step, there would be as many panels as there are unique values of the numeric variable.
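Binning a numeric variable into three categories can be sketched with base R's `cut()`. Splitting at the tertiles is an assumption made for this illustration; Flexplot may choose its breakpoints differently.

```r
set.seed(7)
motivation <- rnorm(90, 50, 10)   # simulated numeric variable

# Cut at the tertiles so each bin holds roughly a third of the scores
breaks <- quantile(motivation, probs = c(0, 1/3, 2/3, 1))
motivation_binned <- cut(motivation, breaks = breaks,
                         labels = c("low", "medium", "high"),
                         include.lowest = TRUE)

print(table(motivation_binned))
```

The binned variable has only three levels, so using it for panels produces three panels rather than one per unique score.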

Having said that, let me introduce a strategy I use to visualize multivariate relationships. Let us assume we are interested in studying the relationship between weight.loss and each of the following predictors: motivation, health, muscle gain, and therapy.type. I will first start by plotting the bivariate relationship between weight.loss and each predictor variable. When doing this, I am looking for a couple of things. First, I am looking for bends in the relationships. If there are no bends, I will plot straight lines for the multivariate plot, which are far easier to interpret. I am also looking for the variable with the weakest relationship with weight.loss. I look for the weakest first because I am looking to reduce the number of variables to visualize as quickly as possible. The weakest bivariate relationship will likely be the weakest multivariate relationship. If it’s small enough not to worry about, I will generally remove that variable. In the below figure, gender seems to have the weakest relationship, and some variables have a nonlinear relationship with satisfaction.

Following the bivariate visuals, I then plot the multivariate visual. In this case, I will place gender on the X axis, and plot conscientiousness as separate lines/colors/symbols. The other two variables will simply be placed in panels (whether rows or columns doesn’t matter). This image is shown below.

At this point, I’m trying to dismiss gender as an important predictor. Eliminating a variable would simplify the analysis immensely. The only place I see a large difference between genders is in the first column/row panel. However, this only shows a large difference because there’s only one Male in that panel. As such, I’m going to dismiss gender as an important predictor and instead look at the other three variables.

At this point, I am looking for non-parallel lines. Non-parallel lines signal an interaction, but I’m also wary of interpreting any deviation from parallel as important. I’m inclined to say these fitted lines are roughly parallel, but I’m not entirely sure. This is where modeling can be extremely useful. I can fit a model with interactions and a model without interactions and compare the fits. The table below shows various statistics to assess the two models. Fortunately, all metrics agree that the simpler model is a better model.

Model                 AIC       BIC       Bayes factor   p-value   R-squared
Main Effects Model    2468.37   2486.88   9106.24        0.35      0.27
Interaction Model     2471.81   2505.11   NA             NA        0.28
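A comparison of this kind can be sketched with base R. The data below are simulated with main effects only, and the predictor names are placeholders. The Bayes factor uses a BIC-based approximation, which is an assumption about how such a value could be obtained rather than a statement of how the table was produced.

```r
set.seed(8)
n  <- 300
x1 <- rnorm(n); x2 <- rnorm(n); x3 <- rnorm(n)
y  <- 0.5 * x1 + 0.3 * x2 + 0.2 * x3 + rnorm(n)   # simulated: main effects only

main_effects_model <- lm(y ~ x1 + x2 + x3)
interaction_model  <- lm(y ~ x1 * x2 * x3)   # adds all two- and three-way interactions

comparison <- data.frame(
  model     = c("Main Effects Model", "Interaction Model"),
  aic       = c(AIC(main_effects_model), AIC(interaction_model)),
  bic       = c(BIC(main_effects_model), BIC(interaction_model)),
  r.squared = c(summary(main_effects_model)$r.squared,
                summary(interaction_model)$r.squared)
)

# Nested-model F test, and a BIC-based approximation to the Bayes factor
# in favor of the main effects model (values above 1 favor the simpler model)
p_value      <- anova(main_effects_model, interaction_model)$`Pr(>F)`[2]
bayes_factor <- exp((BIC(interaction_model) - BIC(main_effects_model)) / 2)

print(comparison)
print(c(p.value = p_value, bayes.factor = bayes_factor))
```

Lower AIC and BIC indicate the preferred model. R-squared can only increase when terms are added, so it should be read alongside the other metrics rather than on its own.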

At this point, the visualizations become extremely easy. The model comparison suggests there are only main effects, which means we are much safer in using AVPs. Granted, there may be some nonlinearity present in the residuals of the AVP plot, but I’m not too worried about that at this point. We can visualize the nonlinearity with the AVPs, as shown in the following figure.

These visuals suggest all variables may have a nonlinear association with satisfaction, once we control for all the other variables in the model. There’s also some evidence of heteroskedasticity, but addressing that is beyond the scope of this paper.

Summary

In this paper, we have introduced Flexplot, an easy-to-use software package that allows flexible and simple visualizations with a graphical user interface. We have also introduced various graphical tools, including added variable plots, jittered density plots, ghost lines, paneling, etc. We concluded with a general strategy for complex multivariate visualizations. It is our hope these tools and demonstrations seem both approachable and helpful, and that researchers have no hesitations about utilizing the tools available in Flexplot.

References

