

Dee Chiluiza, PhD
Northeastern University
Introduction to data analysis using R, R Studio and R Markdown
Short manual series: Pie Charts



Applications of pie charts


I personally do not recommend using pie charts to present data analysis results; I explain my reasons at the end of this section. However, pie charts do have practical applications for other purposes, so I will show you how to make them.

  1. Everything starts with a question. For example, in the data set mtcars, how many observations are there in each category of the variable cylinders (cyl)?
    As with bar plots, start by creating a table of the data you need to plot. You can use the table to present the frequencies or the percentages; see section “creating new calculated fields.”
library(pander)  # provides pander() for formatted table output

par(mar=c(0.1, 0.1, 0.1, 0.1)) 
CYL_table = table(mtcars$cyl)

pander(as.matrix(CYL_table))
4 11
6 7
8 14
The table above indicates the frequencies of cars per number of cylinders: there are 11 cars with 4 cylinders, 7 cars with 6 cylinders, and 14 cars with 8 cylinders. This information can be presented using a pie chart.

The frequencies can also be used to obtain percentages. To obtain each percentage, divide the frequency by the total number of observations, nrow(mtcars), and multiply the result by 100.
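
As a quick check of the calculation just described, here is a minimal sketch that computes the percentages by hand and again with prop.table(), a base R function that divides each entry by the table total. The object names (cyl_counts, cyl_pct_manual, cyl_pct_prop) are only illustrative and are not used elsewhere in this manual.

```r
# Frequencies of cars per number of cylinders
cyl_counts <- table(mtcars$cyl)

# Percentages computed by hand: count / total * 100
cyl_pct_manual <- cyl_counts / nrow(mtcars) * 100

# The same percentages via prop.table(), which divides by the table total
cyl_pct_prop <- prop.table(cyl_counts) * 100

# Both approaches should agree, and the percentages should add up to 100
all.equal(as.numeric(cyl_pct_manual), as.numeric(cyl_pct_prop))
sum(cyl_pct_manual)

# Rounded to one decimal for display
round(cyl_pct_manual, 1)
```

For example, the 11 four-cylinder cars out of 32 give 11 / 32 * 100 = 34.375 percent.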

# par(mfcol) to present figures using 1x2 grid
# par(mar) to change margins.

par(mfcol=c(1,2), mar=c(0, 0.7, 0, 0.7))

# First table
CYL_table = table(mtcars$cyl)

# Table calculating percentages
CYL_tableProb = CYL_table/nrow(mtcars)*100

# Basic plotting of pie charts
# pie() draws the chart and returns NULL, so its result is not assigned
pie(CYL_table, 
    labels = CYL_table, 
    radius = 0.8)

pie(CYL_tableProb, 
    labels = CYL_tableProb, 
    radius = 0.8)

Code detail: The size of the two graphs above can be changed in two ways: by adjusting the margins with mar=c(0, 0.7, 0, 0.7) inside par(), and by setting radius = inside pie(). Practice with both.
Since the two graphs are identical except for their labels, there is no point in presenting two pie charts, one for frequencies and one for percentages. Both can be combined into a single graph.

Create a label object
In this case, the paste() function is used to combine all the required information into one object; the name of that object is then passed to labels = inside pie().

Pie Chart codes


Start with paste() and give a name to the object being created, in this case pieLabels. Then add the following elements, separating each argument from the next with a comma:

  1. Use unique() to extract the list of values from the variable cylinders. If you need them in order, use unique(sort()). Be consistent across all code, and make sure the sequence matches the order of the table used to create the pie chart.
  2. Enter the text in quotation marks: "Cyl". If you need a space, add it inside the quotation marks.
  3. Add a line break using "\n".
  4. Enter the text "n=" in quotation marks.
  5. Call the first table (CYL_table), which contains the frequency values. The table is already ordered by category (4, 6, 8), so do not wrap it in sort(): sort() reorders the table by frequency and would attach the counts to the wrong categories.
  6. Add another line break using "\n".
  7. Call the second table (CYL_tableProb), which contains the percentage values. Use round() to set the number of decimals.
  8. Enter the text "%" in quotation marks.

Observe the R chunk below with the pieLabels object. In your console, call the pieLabels object and observe all its elements.

pieLabels = paste(unique(sort(mtcars$cyl)),
                  "Cyl",
                  "\n",
                  "n=",
                  CYL_table,   # already in category order (4, 6, 8); do not sort by frequency
                  "\n", 
                  round(CYL_tableProb,1),
                  "%")

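To see what such a label object contains, the following sketch builds a comparable set of labels and prints it in two ways. It is an illustration under a few assumptions: the names cyl_counts, cyl_pct and label_check are hypothetical, paste0() is used instead of paste() to avoid the default spaces, and names() is used to take the categories directly from the table so that categories and counts stay in the same order.

```r
cyl_counts <- table(mtcars$cyl)
cyl_pct <- round(cyl_counts / nrow(mtcars) * 100, 1)

# names() returns the categories in the same order as the counts,
# so each label is built from matching category, count and percentage
label_check <- paste0(names(cyl_counts), " Cyl\n",
                      "n = ", cyl_counts, "\n",
                      cyl_pct, "%")

# print() shows the raw strings, with the line breaks escaped as \n
print(label_check)

# cat() renders the line breaks, as the pie chart will
cat(label_check, sep = "\n\n")
```

Checking the printed labels against the frequency table before plotting is a cheap way to catch labels that are attached to the wrong slice.
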
Now it’s time for a pie!
• Start with pie() and use either of the two tables; both produce the same figure.
• For labels, use the object we created, pieLabels. It replaces the default labels of the original figure.
• Add the other elements listed below and play with their values.

library(RColorBrewer)  # provides brewer.pal()

# par(mar) to change margins.
par(mar=c(0.1, 0.1, 0.1, 0.1))

pie(CYL_table,
    labels = pieLabels,
    radius = 0.6,
    col = brewer.pal(length(CYL_table), "Paired"),
    border = "white",
    lty = 1,
    cex=0.9,
    font = 3)

Code alert: Notice the color code. Since there are three values to display in the graph, we need three colors. We can simply enter 3, right? However, in R, and especially during active data analysis, the number of categories can change depending on filtering processes. One strategy is to use length() with the name of the table being plotted; it returns the exact number of categories presented in the graph. Note that ncol(mtcars) is not suitable here: it returns the number of variables (columns) in the data set, 11, not the number of categories.
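
The point about the number of categories changing after filtering can be sketched as follows. The filter (hp > 120) and the object names (cyl_counts, powerful, powerful_counts, slice_cols) are hypothetical, chosen only for illustration; the idea is that the number of colors should follow the number of entries in the table being plotted.

```r
# The number of slices equals the number of entries in the table,
# which is different from the number of columns in the data set
cyl_counts <- table(mtcars$cyl)
length(cyl_counts)   # 3 categories
ncol(mtcars)         # 11 columns

# After filtering, the number of categories can shrink
# (hypothetical filter: keep only cars with more than 120 hp)
powerful <- subset(mtcars, hp > 120)
powerful_counts <- table(powerful$cyl)
length(powerful_counts)   # 2 categories: no 4-cylinder car passes the filter

# Ask for exactly as many colors as there are slices
slice_cols <- terrain.colors(length(powerful_counts))
pie(powerful_counts, col = slice_cols, border = "white")
```

Because the color count is derived from the table, the same plotting code keeps working when the filter changes.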

pie(CYL_table,
    labels = pieLabels,
    radius = 0.6,
    col = terrain.colors(length(CYL_table)),
    border = "white",
    lty = 1,
    cex=0.8,
    font = 3
    )

legend("topleft",
       legend = paste(unique(sort(mtcars$cyl)), "Cylinders"),
       fill = terrain.colors(length(CYL_table)),
       border = "white")

box(col="red")

# edges=1, 10, 20, etc., can also be used.
# on paste() for labels, sep = "" can be used to remove spaces. 
# Check the use of order() instead of sort()
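
As a possible starting point for the order() suggestion, the sketch below arranges the slices from largest to smallest. It assumes hypothetical object names (by_size, counts_sorted, pct_sorted, labels_sorted). The idea is that order() returns positions rather than sorted values, so one index can reorder the counts, and the labels are then built from the reordered table so they stay aligned with the slices.

```r
cyl_counts <- table(mtcars$cyl)

# order() returns the positions that would sort the counts,
# here from the largest frequency to the smallest
by_size <- order(as.vector(cyl_counts), decreasing = TRUE)

# Reorder the table once, then derive everything else from it
counts_sorted <- cyl_counts[by_size]
pct_sorted <- round(counts_sorted / nrow(mtcars) * 100, 1)

# Labels are built from the reordered table, so names, counts and
# percentages all refer to the same slice
labels_sorted <- paste0(names(counts_sorted), " Cyl\n",
                        "n=", counts_sorted, "\n",
                        pct_sorted, "%")

pie(counts_sorted,
    labels = labels_sorted,
    radius = 0.6,
    col = terrain.colors(length(counts_sorted)),
    border = "white")
```

By contrast, applying sort() to only one of the vectors passed to paste() changes the order of that vector alone, which is how labels end up on the wrong slice.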

Keep in mind: The sizes of the areas in a pie chart are meant to indicate the relative size or proportion of each value. This is a strong reason why pie charts are not recommended for displaying data results: the human eye is not very good at judging relative areas. It is generally preferable to use dot charts or bar graphs. Pie charts can still work well for advertising and sales purposes.
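
For comparison, here is a minimal sketch of the same frequencies shown as a dot chart and as a bar graph, using the base R functions dotchart() and barplot(). The object names (cyl_counts, cyl_labels) and the margin values are illustrative choices, not settings used elsewhere in this manual.

```r
cyl_counts <- table(mtcars$cyl)
cyl_labels <- paste(names(cyl_counts), "cylinders")

# par(mfcol) to present figures using 1x2 grid
par(mfcol = c(1, 2), mar = c(4, 4, 2, 1))

# Dot chart: values are compared by position along a common axis
dotchart(as.numeric(cyl_counts),
         labels = cyl_labels,
         xlim = c(0, max(cyl_counts)),
         pch = 19,
         xlab = "Number of cars")

# Bar graph: values are compared by bar height
barplot(cyl_counts,
        names.arg = cyl_labels,
        ylab = "Number of cars",
        col = terrain.colors(length(cyl_counts)),
        border = "white")
```

In both figures the difference between 11 and 14 cars is read from a shared axis rather than from the area of a slice.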


Disclaimer: This short manual series is a work in progress. Unless clearly stated otherwise, this material should be considered a draft version.


Dee Chiluiza, PhD
26 January, 2023
Boston, Massachusetts, USA

Bruno Dog