How to Choose a Chart Type

When working on any data science project, one of the essential steps to explore and interpret your results is to visualize your data. At the beginning of the project, visualizing your data helps you understand it better and find patterns and trends.

At the end of the project, after you’ve done your analysis and applied different machine learning models, data visualization will help you communicate your results more efficiently.

Humans are visual creatures by nature; things make sense to us when they are represented in an easy-to-understand visualization. It’s far easier to interpret a bar chart than to scan massive amounts of numbers in a spreadsheet.

How to start?

Before you start looking at chart types, you need to ask yourself five critical questions about your data. These questions will help you understand your data better and, in turn, choose a suitable chart type to represent it. Data is just a story told in numbers.

  1. What is the story your data is trying to deliver?

  2. Who will you present your results to?

  3. How big is your data?

  4. What is your data type?

  5. How do the different elements of your data relate to each other?
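Questions 3 to 5 can often be answered with a few lines of R before any plotting. The sketch below assumes the built-in longley dataset (the same one used in the bar plot example); the choice of the GNP and Employed columns is only illustrative.

```r
# Question 3: how big is the data?
n_rows <- nrow(longley)
n_cols <- ncol(longley)

# Question 4: what type is each column?
col_types <- sapply(longley, class)

# Question 5: how do two columns relate? (columns chosen for illustration)
gnp_employed_cor <- cor(longley$GNP, longley$Employed)

print(c(rows = n_rows, cols = n_cols))
print(col_types)
print(gnp_employed_cor)
```

A correlation close to 1 or -1 hints that a scatter plot or line chart may be worth trying; a value near 0 suggests the two columns have little linear relationship to show.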

Top used chart types

There are more than 40 types of charts out there; some are more commonly used than others because they are easier to build and interpret. Let’s talk about the most commonly used chart types and when to use each of them.

Bar plot

barplot(GNP ~ Year, data = longley)

When to use
  • If you are comparing parts of a bigger set of data, highlighting different categories, or showing change over time.
  • If you have long category labels; bars offer more space for them.
  • If you want to illustrate both positive and negative values.

When to avoid
  • If you’re using multiple data points.
  • If you have many categories. Avoid overloading your graph; it shouldn’t have more than 10 bars.
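One way to stay under the 10-bar limit is to keep the largest categories and fold the rest into a single "Other" bar. This is a minimal sketch with made-up counts; the names counts, max_bars, and lumped are hypothetical and used only for illustration.

```r
# Hypothetical counts for 14 categories (made-up numbers for illustration)
counts <- c(A = 40, B = 35, C = 30, D = 28, E = 25, F = 20, G = 18,
            H = 15, I = 12, J = 9, K = 7, L = 5, M = 3, N = 2)

max_bars <- 10  # the limit mentioned in the list above

# Sort from largest to smallest, keep the top (max_bars - 1) categories,
# and fold everything else into a single "Other" bar
sorted <- sort(counts, decreasing = TRUE)
top <- sorted[seq_len(max_bars - 1)]
other <- sum(sorted[-seq_len(max_bars - 1)])
lumped <- c(top, Other = other)

# horiz = TRUE and las = 1 leave more room for long category labels
barplot(lumped, horiz = TRUE, las = 1, main = "Top categories plus Other")
```

Drawing the bars horizontally is optional, but it tends to help when the category labels are long.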

Pie Chart

library(plotrix)  # provides pie3D()
slices <- c(10, 12, 4, 16, 8)
lbls <- c("US", "UK", "Azerbaijan", "Germany", "France")
pie3D(slices, labels = lbls, explode = 0.1, main = "Pie Chart of Countries")

When to use
  • When you show relative proportions and percentages of a whole dataset.
  • With small datasets (this also applies to donut charts).
  • When comparing the effect of ONE factor on different categories.
  • If you have up to 6 categories.
  • When your data is nominal and not ordinal.

When to avoid
  • If you have a big dataset.
  • If you want to make a precise or absolute comparison between values.
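Because a pie chart is about shares of a whole, it can help to print the percentage next to each label. The sketch below reuses the slices and lbls values from the example above and swaps in base R's pie() so that no extra package is needed; pct and pct_lbls are names introduced here for illustration.

```r
slices <- c(10, 12, 4, 16, 8)
lbls <- c("US", "UK", "Azerbaijan", "Germany", "France")

# Convert raw values to percentages of the whole
pct <- round(100 * slices / sum(slices))

# Append the percentage to each label, e.g. "US 20%"
pct_lbls <- paste0(lbls, " ", pct, "%")

# Base R pie(); used here instead of plotrix::pie3D() to keep the sketch dependency-free
pie(slices, labels = pct_lbls, main = "Pie Chart of Countries")
```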

Line chart

v <- c(17, 25, 26, 18, 24, 20, 22, 18, 18)
plot(v, type = "o")

When to use
  • If you have a continuous dataset that changes over time.
  • If your dataset is too big for a bar chart.
  • If you want to display multiple series for the same timeline.
  • If you want to visualize trends instead of exact values.

When to avoid
  • Line charts work better with bigger datasets, so if you have a small one, use a bar chart instead.
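To illustrate several series on the same timeline, here is a minimal sketch using two columns of the built-in longley dataset. The column choice and the colors are assumptions made for the example.

```r
# Two series from the built-in longley dataset, sharing the Year axis
series <- as.matrix(longley[, c("Unemployed", "Armed.Forces")])
line_cols <- c("steelblue", "tomato")

# matplot() draws one line per matrix column against the same x values
matplot(longley$Year, series, type = "l", lty = 1, col = line_cols,
        xlab = "Year", ylab = "Value", main = "Two series on one timeline")

# Legend so the reader can tell the series apart
legend("topleft", legend = colnames(series), lty = 1, col = line_cols)
```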

Scatter plot

When to use
  • To show correlation and clustering in big datasets.
  • If your dataset contains points that have a pair of values.
  • If the order of points in the dataset is not essential.

When to avoid
  • If you have a small dataset.
  • If the values in your dataset are not correlated.
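A minimal scatter plot sketch, assuming the built-in mtcars dataset; the wt and mpg columns are chosen only as an example of paired values.

```r
# Each point is a pair of values: car weight (wt) and fuel economy (mpg)
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
     main = "Weight vs. fuel economy")

# Pearson correlation as a quick numeric check of the visual pattern
r <- cor(mtcars$wt, mtcars$mpg)
print(r)
```

The correlation here is strongly negative, which matches the downward slope of the points: heavier cars tend to travel fewer miles per gallon.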

Density plot
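A density plot shows a smoothed estimate of how the values of a single numeric variable are distributed. The sketch below is one possible way to draw it, assuming the built-in mtcars dataset and base R's density(); it is an illustration, not necessarily the approach originally used for this section.

```r
# Kernel density estimate of fuel economy in the built-in mtcars dataset
d <- density(mtcars$mpg)

# Plot the smoothed curve; rug() marks the individual observations
plot(d, main = "Density of mpg", xlab = "Miles per gallon")
rug(mtcars$mpg)
```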


Chart selection tips

Whenever you decide to create some data visualization, use these best practices to make it more straightforward and effective.

  1. If you have categorical data, use a bar chart if you have more than 5 categories or a pie chart otherwise.

  2. If you have numerical data, use bar charts or histograms if your data is discrete, or line or area charts if it is continuous.

  3. If you want to show the relationship between values in your dataset, use a scatter plot, bubble chart, or line chart.

  4. If you want to compare values, use a pie chart — for relative comparison — or bar charts — for precise comparison.

  5. If you want to compare volumes, use an area chart or a bubble chart.

  6. If you want to show trends and patterns in your data, use a line chart, bar chart, or scatter plot.
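Some of these tips can be written down as a small lookup function. The helper below is hypothetical: suggest_chart(), its goal keywords, and the n_categories argument are names invented for this sketch, and it only encodes tips 1, 3, 5, and 6.

```r
# Hypothetical helper: maps a plotting goal to the chart types from the tips above
suggest_chart <- function(goal, n_categories = NULL) {
  switch(goal,
    # Tip 1: bar chart for more than 5 categories, pie chart otherwise
    categories = {
      stopifnot(is.numeric(n_categories), length(n_categories) == 1)
      if (n_categories > 5) "bar chart" else "pie chart"
    },
    # Tip 3: relationships between values
    relationship = c("scatter plot", "bubble chart", "line chart"),
    # Tip 5: comparing volumes
    volume = c("area chart", "bubble chart"),
    # Tip 6: trends and patterns
    trend = c("line chart", "bar chart", "scatter plot"),
    # Anything else is rejected explicitly
    stop("unknown goal: ", goal)
  )
}

print(suggest_chart("categories", n_categories = 8))
print(suggest_chart("trend"))
```

A function like this is only a starting point; the audience and the story behind the data still decide the final choice.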

Conclusion

Before you choose what chart type to use, you need to get to know your data better, the story behind it, and your target audience/media. Whenever you try to create a visualization, choose simple colors and fonts.

Always aim for simple visualizations rather than complex ones. The goal of visualizing data is to make it easier to understand and read. So, avoid overloading and cluttering your graphs. Having multiple simple graphs is always better than one elaborate graph.
