R Barplot Variable Height

I have a few groups (lists) of data of different sizes:

library('randomNames')
library(knitr)

set.seed(0x0a)

groups <- 5

group.sizes <- round(abs(20 * rnorm(groups))) + 5

group.data <- lapply(group.sizes, function (n) {
  row.data <- rnorm(n)
  names(row.data) <- unlist(lapply(row.data, function (x) {
    randomNames::randomNames(name.order = "first.last", name.sep = " ")
  }))
  row.data
})

names(group.data) <- unlist(lapply(group.data, function (x) {
  randomNames::randomNames(name.order = "first.last", name.sep = " ")
}))

I want to see horizontal barplots of my lists!

for (i in seq(groups)) {
  barplot(
    sort(abs(group.data[[i]]))
    , horiz = TRUE
    , xlab = "A Very Important Numeric Quantity!"
    , main = sprintf("I am %s and\nI have %d bars!"
                     , names(group.data)[[i]], length(group.data[[i]]))
    , las = 1
  )
}

This is not what I had in mind! All the names are cut off, and the bars are different heights: some look squished and others look stretched. Maybe we could show only the top 5 bars per chart to make them look more uniform. However, there are some additional requirements:

We can’t just show the top few; all bars must be displayed.
The whole name must be readable.

What I want is for the bars on every plot to be the same height, which means the plots themselves will be different heights. Each bar should be just slightly taller than its label so the labels do not overlap each other: more like chart 4 or 5, less like chart 3.

So, how to solve? The best I can come up with is absolute units. This is probably a gross hack but I am not good at computers so w/e. Let’s work through this on the first group:

# just solve one for now
group <- group.data[[1]]
group.name <- names(group.data)[1]

# How tall are the labels? Use inches since it does not require a plot to already exist: help(strwidth)
label.height <- max(strheight(names(group), units = "inches"))
label.width <- max(strwidth(names(group), units = "inches"))

op <- par(no.readonly = TRUE)
bottom.margin <- op$mai[1]
left.margin <- op$mai[2]
top.margin <- op$mai[3]
right.margin <- op$mai[4]

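# 0.2 is the default `space` from help(barplot), where it is a fraction of the
# bar width; below it also doubles as a rough per-bar gap in inches
bar.spacing <- 0.2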

plot.height <- length(group) * (label.height + bar.spacing)
plot.width <- 5
left.margin <- left.margin + label.width

device.width <- left.margin + plot.width + right.margin
device.height <- bottom.margin + plot.height + top.margin

par(mai = c(bottom.margin, left.margin, top.margin, right.margin))
barplot(
  sort(abs(group))
  , horiz = TRUE
  , xlab = "A Very Important Numeric Quantity!"
  , main = sprintf("I am %s and\nI have %d bars!", group.name, length(group))
  , las = 1
  , space = bar.spacing
)
par(op) # restore the graphical parameters saved above

This looks decent. The width and height are easy to pass to pdf(), which takes inches. png() takes pixels by default, but it also accepts units = "in" together with a res value. Anyhow, let’s put this in a function so it can be called in a loop. We have to refactor a bit so the computed height can be passed to knit_child as fig.height.

calculate_barplot_sizes <- function(
  bars
  , label.width
  , label.height
  , plot.width = 5
  , bar.spacing = 0.2
  ) {
  op <- par(no.readonly = TRUE)
  bottom.margin <- op$mai[1]
  left.margin <- op$mai[2]
  top.margin <- op$mai[3]
  right.margin <- op$mai[4]

  plot.height <- bars * (label.height + bar.spacing)
  left.margin <- left.margin + label.width

  device.width <- left.margin + plot.width + right.margin
  device.height <- bottom.margin + plot.height + top.margin

  list(
    mai = list(
      bottom = bottom.margin
      , left = left.margin
      , top = top.margin
      , right = right.margin),
    dev = list(width = device.width, height = device.height)
  )
}

# For knitr, reference the outer group.data. In a standalone R script I
# would pass in the whole data.
barplot_wrapper <- function(i, mai, ...) {
  group <- group.data[[i]]
  group.name <- names(group.data)[i]

  op <- par(no.readonly = TRUE)
  par(mai = mai)
  barplot(
    sort(abs(group))
    , horiz = TRUE
    , xlab = "A Very Important Numeric Quantity!"
    , main = sprintf("I am %s and\nI have %d bars!", group.name, length(group))
    , las = 1
    , ...
  )
  par(op)
}
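
The comment above mentions that a standalone R script would pass the data in rather than reach for the outer group.data. Here is a minimal sketch of what that might look like, writing one group straight to a PDF. The helper name barplot_to_pdf, the demo names and the temporary output file are all made up for illustration; the sizing arithmetic is the same as in the walkthrough.

```r
# Hypothetical helper (not from the post): draw one group into its own PDF,
# sized so that every bar gets the same vertical allowance in inches.
barplot_to_pdf <- function(values, title, file,
                           plot.width = 5, bar.spacing = 0.2) {
  # Measure the labels on a null PDF device, so no file is created yet.
  pdf(NULL)
  label.height <- max(strheight(names(values), units = "inches"))
  label.width <- max(strwidth(names(values), units = "inches"))
  mai <- par("mai") # default margins: bottom, left, top, right
  dev.off()

  mai[2] <- mai[2] + label.width # make room for the longest name
  width <- mai[2] + plot.width + mai[4]
  height <- mai[1] + length(values) * (label.height + bar.spacing) + mai[3]

  pdf(file, width = width, height = height)
  on.exit(dev.off())
  par(mai = mai)
  barplot(sort(abs(values)), horiz = TRUE, las = 1, main = title,
          space = bar.spacing)
  invisible(c(width = width, height = height))
}

# Made-up demo data standing in for one element of group.data.
demo <- c("Ada Lovelace" = 1.5, "Grace Hopper" = -0.7, "Alan Turing" = 2.1)
out.file <- tempfile(fileext = ".pdf")
dims <- barplot_to_pdf(demo, "Demo group", out.file)
```

Measuring on pdf(NULL) is just one way to get text sizes before the real device exists; the label sizes could differ a little on a device with other fonts, so treat the numbers as approximate.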

And call the new function on each item to show the new results:

out <- unlist(lapply(seq(groups), function (i) {
  group <- group.data[[i]]
  group.name <- names(group.data)[i]

  # How tall are the labels? Use inches since it does not require a plot to
  # already exist: help(strwidth)
  label.height <- max(strheight(names(group), units = "inches"))
  label.width <- max(strwidth(names(group), units = "inches"))

  dims <- calculate_barplot_sizes(length(group), label.width, label.height)

  knit_child(
    text = sprintf(
      "```{r myfig%d, fig.width=%f, fig.height=%f, echo=FALSE}
barplot_wrapper(%d, mai = c(%f, %f, %f, %f))
      ```"
      , i, dims$dev$width, dims$dev$height, i, dims$mai$bottom, dims$mai$left
      , dims$mai$top, dims$mai$right
    )
  )
}))
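
To make the sprintf template less mysterious, here is a small sketch that fills it in with made-up numbers and prints the chunk text that knit_child would receive. The sizes are placeholders, not values computed from the data above, and the fence is built with strrep only so it can be shown inside a code block.

```r
# Made-up sizes standing in for dims$dev and dims$mai (illustration only).
fence <- strrep("`", 3) # three backticks, built indirectly
chunk <- sprintf(
  "%s{r myfig%d, fig.width=%f, fig.height=%f, echo=FALSE}\nbarplot_wrapper(%d, mai = c(%f, %f, %f, %f))\n%s",
  fence, 1L, 7.5, 4.2, 1L, 1.02, 2.1, 0.82, 0.42, fence
)
cat(chunk, sep = "\n")
```

The chunk label includes the index because knitr by default refuses duplicate chunk labels. The post does not show how out ends up in the page; one common knitr idiom, assumed here, is an inline expression that pastes the pieces together with newlines.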

Ta-da! I see a little bit of variation in the bar heights but it is not too noticeable. I am happy with it and calling it a day.
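
For anyone who wants to put a number on that variation, here is a rough sketch that measures how tall one bar actually comes out, in inches, for a few group sizes. The function measure_bar_height, the fixed 0.15 inch label height and the margins are assumptions for illustration, not values taken from the plots above.

```r
# Hypothetical check (not from the post): size a null PDF device with the
# per-bar rule from the walkthrough, draw n bars, and measure one bar.
measure_bar_height <- function(n, label.height = 0.15, bar.spacing = 0.2) {
  mai <- c(1.02, 0.82, 0.82, 0.42) # assumed margins, in inches
  plot.height <- n * (label.height + bar.spacing)
  pdf(NULL, width = 7, height = mai[1] + plot.height + mai[3])
  on.exit(dev.off())
  par(mai = mai)
  barplot(seq_len(n), horiz = TRUE, space = bar.spacing)
  # Inches per user unit on the y axis; each bar is 1 user unit thick.
  par("pin")[2] / diff(par("usr")[3:4])
}

bar.heights <- sapply(c(5, 10, 40), measure_bar_height)
print(round(bar.heights, 3))
```

If the three numbers differ slightly, that would be consistent with the variation I noticed. A plausible cause is that barplot() treats space as a fraction of the bar width and the axis adds a little padding at both ends, so the per-bar allowance in inches is only approximate.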

Fortune

## 
## The problem here is that the $ notation is a magical shortcut and like any
## other magic if used incorrectly is likely to do the programmatic
## equivalent of turning yourself into a toad.
##    -- Greg Snow (in response to a user that wanted to access a column
##       whose name is stored in y via x$y rather than x[[y]])
##       R-help (February 2012)

jonathanwesleystone@gmail.com

2017-08-12