I have a few groups (lists) of data of different sizes:
library(randomNames)
library(knitr)
set.seed(0x0a)
groups <- 5
group.sizes <- round(abs(20 * rnorm(groups))) + 5
# Build a plain list holding one named numeric vector per group.
group.data <- lapply(group.sizes, function (n) {
  row.data <- rnorm(n)
  # Label every value with a random person's name.
  names(row.data) <- unlist(lapply(row.data, function (x) {
    randomNames::randomNames(name.order = "first.last", name.sep = " ")
  }))
  row.data
})
# Name each group after a random person as well.
names(group.data) <- unlist(lapply(group.data, function (x) {
  randomNames::randomNames(name.order = "first.last", name.sep = " ")
}))
I want to see horizontal barplots of my lists!
for (i in seq(groups)) {
  barplot(
    sort(abs(group.data[[i]]))
    , horiz = TRUE
    , xlab = "A Very Important Numeric Quantity!"
    , main = sprintf("I am %s and\nI have %d bars!"
                     , names(group.data)[[i]], length(group.data[[i]]))
    , las = 1
  )
}
This is not what I had in mind! All the names are cut off, and the bars differ in height from plot to plot: some look squished and others look stretched. Maybe we could show just the top 5 bars per chart to make them look more uniform. However, there are some additional requirements:
I want the bars to be the same height on every plot, which means the plots themselves will be different heights. I also want each bar to be just slightly taller than its label so the labels do not overlap each other: more like chart 4 or 5, less like chart 3.
So, how to solve this? The best I can come up with is absolute units (inches). This is probably a gross hack but I am not good at computers so w/e. Let’s work through this on the first group:
# just solve one for now
group <- group.data[[1]]
group.name <- names(group.data)[1]
# How tall are the labels? Use inches since it does not require a plot to already exist: help(strwidth)
label.height <- max(strheight(names(group), units = "inches"))
label.width <- max(strwidth(names(group), units = "inches"))
op <- par(no.readonly = TRUE)
bottom.margin <- op$mai[1]
left.margin <- op$mai[2]
top.margin <- op$mai[3]
right.margin <- op$mai[4]
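# barplot()'s default `space`, from help(barplot). It is documented as a
# fraction of the bar width, but it is treated as inches in plot.height below.
bar.spacing <- 0.2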
plot.height <- length(group) * (label.height + bar.spacing)
plot.width <- 5
left.margin <- left.margin + label.width
device.width <- left.margin + plot.width + right.margin
device.height <- bottom.margin + plot.height + top.margin
par(mai = c(bottom.margin, left.margin, top.margin, right.margin))
barplot(
  sort(abs(group))
  , horiz = TRUE
  , xlab = "A Very Important Numeric Quantity!"
  , main = sprintf("I am %s and\nI have %d bars!", group.name, length(group))
  , las = 1
  , space = bar.spacing
)
This looks decent. It is easy to pass the width and height to pdf(), which accepts inches. I am not sure about png() or others. Anyhow, let’s put this in a function so it can be called in a loop. We have to refactor a bit to pass fig.height to knit_child.
# Work out the margins and the device size, all in inches, for a horizontal
# barplot with `bars` bars whose labels are label.width by label.height.
calculate_barplot_sizes <- function(
  bars
  , label.width
  , label.height
  , plot.width = 5
  , bar.spacing = 0.2
) {
  # Start from the current default margins (bottom, left, top, right).
  op <- par(no.readonly = TRUE)
  bottom.margin <- op$mai[1]
  left.margin <- op$mai[2]
  top.margin <- op$mai[3]
  right.margin <- op$mai[4]
  plot.height <- bars * (label.height + bar.spacing)
  # Widen the left margin so the longest label fits.
  left.margin <- left.margin + label.width
  device.width <- left.margin + plot.width + right.margin
  device.height <- bottom.margin + plot.height + top.margin
  list(
    mai = list(
      bottom = bottom.margin
      , left = left.margin
      , top = top.margin
      , right = right.margin),
    dev = list(width = device.width, height = device.height)
  )
}
# For knitr, reference the outer group.data. In a standalone R script I
# would pass in the whole data.
barplot_wrapper <- function(i, mai, ...) {
  group <- group.data[[i]]
  group.name <- names(group.data)[i]
  op <- par(no.readonly = TRUE)
  par(mai = mai)
  barplot(
    sort(abs(group))
    , horiz = TRUE
    , xlab = "A Very Important Numeric Quantity!"
    , main = sprintf("I am %s and\nI have %d bars!", group.name, length(group))
    , las = 1
    , ...
  )
  par(op)
}
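Outside knitr, the same inch-based sizes can be given straight to a graphics device. The sketch below is an illustration rather than part of the workflow in this post: device.width and device.height are placeholder values standing in for the computed sizes, and the output goes to temporary files. pdf() takes inches directly; png() defaults to pixels but accepts inches when units = "in" and a resolution res are supplied.

```r
# Placeholder device size in inches (an assumption for this sketch); in
# practice these would come from the margin and label calculations.
device.width <- 7
device.height <- 4
bars <- c(a = 1, b = 2, c = 3)

# pdf() takes width and height in inches directly.
pdf.file <- tempfile(fileext = ".pdf")
pdf(pdf.file, width = device.width, height = device.height)
barplot(bars, horiz = TRUE, las = 1)
pdf.size <- dev.size("in")  # size of the open device, in inches
dev.off()

# png() is not available in every R build, so check first.
if (capabilities("png")) {
  png.file <- tempfile(fileext = ".png")
  # units = "in" switches from pixels to inches; res (pixels per inch)
  # must then be given as well.
  png(png.file, width = device.width, height = device.height,
      units = "in", res = 150)
  barplot(bars, horiz = TRUE, las = 1)
  dev.off()
}
```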
And call the new function on each item to show the new results:
out <- unlist(lapply(seq(groups), function (i) {
  group <- group.data[[i]]
  group.name <- names(group.data)[i]
  # How tall are the labels? Use inches since it does not require a plot to
  # already exist: help(strwidth)
  label.height <- max(strheight(names(group), units = "inches"))
  label.width <- max(strwidth(names(group), units = "inches"))
  dims <- calculate_barplot_sizes(length(group), label.width, label.height)
  knit_child(
    text = sprintf(
      "```{r myfig%d, fig.width=%f, fig.height=%f, echo=FALSE}
barplot_wrapper(%d, mai = c(%f, %f, %f, %f))
```"
      , i, dims$dev$width, dims$dev$height, i, dims$mai$bottom, dims$mai$left
      , dims$mai$top, dims$mai$right
    )
  )
}))
Ta-da! I see a little bit of variation in the bar heights but it is not too noticeable. I am happy with it and calling it a day.
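One possible source of the leftover variation, as far as I can tell from how barplot() lays out its bars: each bar is 1 unit tall with a gap of space units between neighbours, and the default axis style pads the range by 4 percent at each end (see yaxs in help(par)). Adding a fixed number of inches per bar therefore does not give exactly the same bar height for every group. The sketch below is an alternative under stated assumptions, not the method used above: plot_height_for_bars is a hypothetical helper, and it assumes the plot is drawn with par(yaxs = "i") so that no padding is added.

```r
# Plot-region height (in inches) needed so that every bar comes out
# bar.height inches tall. Assumes barplot()'s default bar width of 1 and
# an unpadded axis, i.e. par(yaxs = "i"). Hypothetical helper.
plot_height_for_bars <- function(n.bars, bar.height, space = 0.2) {
  # n bars of width 1 plus (n - 1) gaps of `space`, in bar-width units
  user.range <- n.bars + (n.bars - 1) * space
  bar.height * user.range
}

short.plot <- plot_height_for_bars(10, bar.height = 0.25)
tall.plot <- plot_height_for_bars(25, bar.height = 0.25)
```

The result would take the place of plot.height in the size calculation, with bar.height set a little above the label height.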
Fortune
##
## The problem here is that the $ notation is a magical shortcut and like any
## other magic if used incorrectly is likely to do the programmatic
## equivalent of turning yourself into a toad.
## -- Greg Snow (in response to a user that wanted to access a column
## whose name is stored in y via x$y rather than x[[y]])
## R-help (February 2012)