This function takes a vector of probabilities and generates a frequency grid plot, in which each probability is represented by a corresponding frequency of a unique symbol. These plots help decision makers visualize the relative probabilities of different events (for example, outcomes related to medical screening). For example, the following plot shows the corresponding frequencies of four outcomes with probabilities .05, .10, .25 and .60.

plot of chunk unnamed-chunk-3

Arguments

The function Freq.Grid has multiple arguments that allow the user to customize the plot:

  1. data = the data used to generate the plot. This can be in one of two forms:
    • A vector of probabilities corresponding to each symbol type
    • A matrix of values which will be directly mapped onto the plot
  2. mtx.dimension = If the data is entered as a vector of probabilities, this sets the final dimensions of the matrix of symbols. For example, a value of 10 means that the final plot will be a square 10 x 10 matrix of symbols. A vector of two values, such as the c(20, 50) used in a later example, gives a non-square matrix.

  3. pch.vec = The types of symbols to be drawn. This can be in one of two forms:
    • A vector of pch values, one for each symbol type
    • A value “square” or “circle”
  4. cols = Designates the color of the symbols. Input can be in one of two forms:
    • A vector of colors
    • A name of one of the existing palettes in the ColorBrewer package (e.g., “Set2”, “Accent”)
  5. block.dim = A vector of length two indicating the dimensions of the grouped blocks. For example, a value of c(3, 3) would produce blocks with 3 rows and 3 columns.

  6. block.border = The width of spaces between blocks

  7. Add.character = An optional vector of characters to add to the symbols. This can be in one of two forms:
    • If numerical, then R will take the values as pch values (see help(points))
    • If character, the actual characters will be used
  8. Add.character.cex = The size of the characters added to symbols.

  9. pt.cex = An optional hard-coding of the symbol size in case you don’t like their default size.

  10. sort.mtx = A logical value indicating whether or not you want the function to sort your matrix (only necessary if data is a matrix).

  11. fill.direction = A string indicating whether to fill blocks to the right (“right”) or down (“down”).

  12. what = A binary vector of size 2 indicating whether or not to add things…

  13. add.legend = A logical value indicating whether or not to add an additional plot with descriptions of the symbols.

  14. descriptions = A string vector with descriptions of the different symbols. Used in the legend.
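
To make the data and mtx.dimension arguments concrete, here is a minimal sketch of how a vector of probabilities could be turned into symbol counts for a square grid. It is not the code inside Freq.Grid: the helper name probs_to_counts and the largest-remainder rounding rule are assumptions made for illustration, and blocks and borders are ignored.

```r
# Hypothetical helper (not part of Freq.Grid): convert probabilities into
# whole-number symbol counts that sum exactly to n_symbols.
probs_to_counts <- function(probs, n_symbols) {
  stopifnot(all(probs >= 0), abs(sum(probs) - 1) < 1e-8)
  exact  <- probs * n_symbols
  counts <- floor(exact + 1e-9)          # small tolerance for floating-point error
  leftover <- n_symbols - sum(counts)    # symbols still unassigned after flooring
  if (leftover > 0) {
    # Give the remaining symbols to the largest fractional parts
    top_up <- order(exact - counts, decreasing = TRUE)[seq_len(leftover)]
    counts[top_up] <- counts[top_up] + 1
  }
  counts
}

mtx.dimension <- 15
counts <- probs_to_counts(c(.01, .09, .3, .6), mtx.dimension^2)

# One integer code per symbol type; matrix() fills column by column ("down"),
# and byrow = TRUE fills row by row ("right")
codes <- rep(seq_along(counts), times = counts)
grid_down  <- matrix(codes, nrow = mtx.dimension)
grid_right <- matrix(codes, nrow = mtx.dimension, byrow = TRUE)
```

With these inputs the rarest outcome (probability .01) receives 2 of the 225 symbols. A different rounding rule could give slightly different counts.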


Examples

Let’s do an example where there are four outcomes with probabilities .01, .09, .30 and .60. We’ll create a 15 x 15 matrix display, where each block has 5 rows and 5 columns.

Freq.Grid(c(.01, .09, .3, .6), 
         mtx.dimension = 15, block.dim = c(5, 5), 
         pch.vec = "square", fill.direction = "down", cols = c("Set3"),
         character = c("+", "-", "", ""), add.legend = FALSE
         )

plot of chunk unnamed-chunk-4

Now let’s repeat it with a few changes:

  1. Fill boxes to the right
  2. Increase the total number of points (30 x 30)
  3. Increase borders between boxes
  4. Add a legend
  5. Use different symbol types

Freq.Grid(c(.01, .09, .3, .6), 
         mtx.dimension = 30, block.dim = c(5, 5), 
         pch.vec = c(21, 22, 21, 22), fill.direction = "right", cols = c("Set3"),
         character = c("+", "-", "", ""), add.legend = TRUE, block.border = .4,
         descriptions = c("Outcome 1", "Outcome 2", "Outcome 3", "Outcome 4")
         )

plot of chunk unnamed-chunk-5

Mammography screening results

Next, let’s create plots that compare the outcomes of women with and without breast cancer screening. In this study, women could have one of four outcomes:

  1. Death from breast cancer
  2. False alarm (a positive test for a woman without breast cancer)
  3. Positive diagnosis with unnecessary treatment
  4. No cancer and no positive test.

Based on the relative probabilities of these events, one can decide whether or not to opt for screening:

Outcomes of women without screening

Freq.Grid(c(5 / 1000, 0, 0, 995 / 1000), 
         mtx.dimension = c(50), block.dim = c(10, 10), 
         pch.vec = "circle", fill.direction = "down", cols = c("Accent"), 
         character = c("+", "-", "", ""),
         descriptions = c("Death from Breast Cancer", "False Alarm", 
                               "Diagnosed with Unnecessary Treatment", 
                               "No cancer and no false positive test"),
         main = "Outcomes of women without mammography screening"
         )

plot of chunk unnamed-chunk-6

Outcomes of women with screening

Freq.Grid(c(4 / 1000, 5 / 1000, 100 / 1000, 890 / 1000), 
         mtx.dimension = c(50), block.dim = c(10, 10), 
         pch.vec = "square", fill.direction = "down", cols = c("Accent"), 
         character = c("+", "-", "", ""),
         descriptions = c("Death from Breast Cancer", "False Alarm", 
                               "Diagnosed with Unnecessary Treatment", 
                               "No cancer and no false positive test"),
         main = "Outcomes of women with mammography screening"
         )

plot of chunk unnamed-chunk-7

We can also repeat the same plot in a long format. The probabilities are the same, but the total number of symbols is now down to 1,000 (20 x 50):

Freq.Grid(c(4 / 1000, 5 / 1000, 100 / 1000, 890 / 1000), 
         mtx.dimension = c(20, 50), block.dim = c(10, 10), 
         pch.vec = "square", fill.direction = "down", cols = c("Accent"), 
         character = c("+", "-", "", ""),
         descriptions = c("Death from Breast Cancer", "False Alarm", 
                               "Diagnosed with Unnecessary Treatment", 
                               "No cancer and no false positive test"),
         main = "With mammography screening"
         )

plot of chunk unnamed-chunk-8
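
As a numeric companion to the plots, the same comparison can be tabulated per 1,000 women. This is only a sketch that reuses the probabilities passed to Freq.Grid above; the variable names and the data frame layout are illustrative choices.

```r
# Counts per 1,000 women, taken from the probabilities used in the plots above
outcomes <- c("Death from breast cancer", "False alarm",
              "Diagnosed with unnecessary treatment",
              "No cancer and no false positive test")
no_screening <- c(5, 0, 0, 995)
screening    <- c(4, 5, 100, 890)

comparison <- data.frame(
  outcome      = outcomes,
  no_screening = no_screening,
  screening    = screening,
  difference   = screening - no_screening   # positive = more common with screening
)
print(comparison)
```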

Notes

  1. If you include a very small probability and don’t force the matrix dimensions to be large enough to capture it, you may run into problems (including certain events not being plotted). For example, a probability of .001 corresponds to less than one symbol in a 10 x 10 matrix.

  2. Future versions of the function will plot two different grids side-by-side.
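
The first note can be checked before plotting. The sketch below estimates the smallest square grid in which the rarest outcome still corresponds to at least one symbol. The helper min_dimension is hypothetical and not part of Freq.Grid, and it assumes that symbol counts are roughly the probability times the number of symbols.

```r
# Hypothetical helper (not part of Freq.Grid): smallest square grid in which
# every non-zero probability corresponds to at least one expected symbol.
min_dimension <- function(probs) {
  smallest <- min(probs[probs > 0])
  ceiling(sqrt(1 / smallest))
}

screening <- c(4 / 1000, 5 / 1000, 100 / 1000, 890 / 1000)
needed <- min_dimension(screening)

# Expected number of symbols for the rarest outcome at two grid sizes
expected_at_10 <- min(screening) * 10^2
expected_at_50 <- min(screening) * 50^2
```

Here a 10 x 10 grid leaves the rarest outcome with less than one expected symbol, while the 50 x 50 grid used in the examples gives it about ten.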