Boxplots are a helpful graphical tool in Exploratory Data Analysis (EDA) for showing univariate distributions of data.
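A boxplot is built from five summary statistics: the median, the lower hinge, the upper hinge, the minimum and the maximum. In R's boxplot(), the whiskers by default stop at the most extreme data point within 1.5 times the box length of the box, and any points beyond that are drawn individually.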
boxplot(cars, col = c("lightgray", "lightblue"), main = "Boxplot of cars-dataset")
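The numbers behind the plot can be inspected directly. The following is a minimal sketch using base R's fivenum() and boxplot.stats() on the same cars data; the object names five, drawn and outliers are only illustrative. It assumes the default whisker rule of boxplot().

```r
# Five-number summary (min, lower hinge, median, upper hinge, max) per column
five <- sapply(cars, fivenum)
rownames(five) <- c("min", "lower hinge", "median", "upper hinge", "max")
print(five)

# What boxplot() draws by default: the whisker ends replace min/max
# whenever points fall outside the 1.5 * box-length range
drawn <- sapply(cars, function(x) boxplot.stats(x)$stats)
rownames(drawn) <- c("lower whisker", "lower hinge", "median",
                     "upper hinge", "upper whisker")
print(drawn)

# Points that would be drawn individually beyond the whiskers
outliers <- lapply(cars, function(x) boxplot.stats(x)$out)
print(outliers)
```

Comparing five and drawn column by column shows where a whisker ends short of the true minimum or maximum.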
Many variations exist that refine the boxplot design or add further information.
Edward Tufte (2001) proposed a pared-down boxplot design that increases the data-ink ratio by removing non-essential ink.
boxplot(cars, horizontal = TRUE, main = "Tufte-style boxplot", pars = list(boxcol = "white",
medlty = "blank", medpch = 16, medcex = 1.3, whisklty = c(1, 1), staplelty = "blank",
outcex = 0.5))
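A notched boxplot adds inferential information by drawing a confidence interval for the median as a notch in the box. If the notches of two boxes do not overlap, this is taken as strong evidence that the two medians differ.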
boxplot(cars, col = c("lightgray", "lightblue"), notch = TRUE)
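To see where the notch limits come from, the sketch below recomputes them by hand for cars$dist and compares them with the conf component returned by boxplot.stats(). It assumes the rule given in the R documentation, median ± 1.58 × (distance between the hinges) / sqrt(n); the names hinge_spread and notch_manual are illustrative.

```r
x <- cars$dist
s <- boxplot.stats(x)

# Manual notch: median +/- 1.58 * (upper hinge - lower hinge) / sqrt(n)
hinge_spread <- s$stats[4] - s$stats[2]
notch_manual <- s$stats[3] + c(-1.58, 1.58) * hinge_spread / sqrt(s$n)

print(notch_manual)
print(s$conf)  # notch limits as computed by boxplot.stats()
```

The interval narrows as the sample grows, since the hinge spread is divided by sqrt(n).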
Furthermore, it can be useful to add information about the density to the plot. Two ways to do that are the violin plot and the bean plot.
library(vioplot)
## Loading required package: sm
## Package `sm', version 2.2-4.1 Copyright (C) 1997, 2000, 2005, 2007, 2008,
## A.W.Bowman & A.Azzalini Type help(sm) for summary information
library(beanplot)
par(mar = c(2, 2, 2, 1))
par(mfrow = c(1, 2))
mu <- 2
si <- 0.6
bimodal <- c(rnorm(1000, -mu, si), rnorm(1000, mu, si))
uniform <- runif(2000, -4, 4)
normal <- rnorm(2000, 0, 3)
vioplot(bimodal, uniform, normal, col = "lightgray")
title(main = "violin plot")
beanplot(decrease ~ treatment, data = OrchardSprays, exp(rnorm(20, 3)), col = "lightblue",
main = "bean plot")
## log="y" selected
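Both plot types are built around a kernel density estimate drawn symmetrically around an axis. As a rough illustration of that idea, the sketch below builds a violin-like outline from base R's density() and polygon(). This is a stand-in, not the code that vioplot or beanplot use internally, and the seed and variable names are arbitrary.

```r
set.seed(42)  # arbitrary seed, only for reproducibility
x <- c(rnorm(1000, -2, 0.6), rnorm(1000, 2, 0.6))  # bimodal sample

d <- density(x)  # Gaussian kernel with the default bandwidth

# Empty canvas: density on the horizontal axis, data values on the vertical axis
plot(NA, xlim = c(-1, 1) * max(d$y), ylim = range(d$x),
     xaxt = "n", xlab = "", ylab = "value", main = "mirrored density")

# Mirror the density curve around x = 0 to get the violin outline
polygon(c(d$y, -rev(d$y)), c(d$x, rev(d$x)), col = "lightgray")

# Mark the median, as a boxplot would
points(0, median(x), pch = 16)
```

For the bimodal sample the outline shows two bulges, a feature that the box of a plain boxplot would hide.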