Beginning in 1786, Playfair [1], often called the father of statistical graphics, started using line, bar, pie, and circle graphs to convey his ideas. More than 200 years later, you can’t talk about data analysis without data visualization, which has become one of the most widely discussed topics in the field.
Interestingly, even before ENIAC was invented, people were already debating which types of statistical graphs are most effective. In the 1920s, the Journal of the American Statistical Association published several comparisons of bar charts and pie charts [2]. In 1975, Kruskal said, “There is neither theory nor systematic body of experiments as a guide” [3]. In 1978, Cox [4] wrote, “There is a major need for a theory of graphical methods”.
This search for optimal visualizations later developed into the study of graphical perception [5]. Cleveland defines graphical perception as the visual decoding of the information encoded on graphs. Heer and Bostock [6] use the term to denote the ability of viewers to interpret such visual encodings and thereby decode information in graphs. Lohse [7] defines graphical perception as the ability of users to comprehend the visual encoding and thereby decode the information presented in the graph.
In this paper, we introduce two example studies of widely used charts to explore the important factors that affect how people perceive the information in graphs, and how graphing techniques can be improved.
Pie and donut charts are widely used in the data analysis community. Even though they have been in use for over a century, we still know little about the perceptual model of reading them. In a random sampling of infographics on the visual content website Visual.ly, 36% of infographics with charts used some form of pie or donut chart [8].
Skau and Kosara [8] identify three ways in which data in pie and donut charts is encoded:
Angle
Area
Arc length
They ask which of these encodings people actually read, how the encodings work in combination, and which can be removed without hindering the reader.
To answer these questions, they designed a study that separates the three encodings and measures how people perceive the resulting charts. They designed new charts that isolate each variable as much as possible, which lets them test the efficacy of angle, area, and arc length independently of the other encodings.
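To make the three encodings concrete, the following minimal sketch computes the central angle, sector area, and arc length that a standard pie slice uses to represent a given percentage. The function name `pie_encodings` and the `radius` parameter are illustrative assumptions, not part of the study.

```python
import math


def pie_encodings(percent, radius=1.0):
    """Return (angle, area, arc_length) for a pie slice showing `percent`.

    Hypothetical helper for illustration only; the angle is in radians.
    """
    fraction = percent / 100.0
    angle = 2 * math.pi * fraction      # central angle of the slice
    area = 0.5 * radius ** 2 * angle    # area of the circular sector
    arc_length = radius * angle         # length of the outer arc
    return angle, area, arc_length


angle, area, arc_length = pie_encodings(25)
print(f"angle={angle:.3f} rad, area={area:.3f}, arc={arc_length:.3f}")
```

In a standard pie chart all three quantities grow linearly with the percentage, which is why the study needs specially designed charts to separate them.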
Figure 1
The study uses six different chart types (Figure 1):
* Baseline Pie - a standard pie chart (Figures 1a and g) using all three visual encodings to represent the percentage. Area, center angle, and arc length are all easy to spot in this chart.
* Baseline Donut - a standard donut chart (Figures 1b and h) using area and arc length to encode data. The angle is almost impossible to read because the center of the circle is removed.
* Arc - a chart showing only arc length (Figures 1c and i). Area and center angle are missing in this chart.
* Angle Pie - a chart showing the angle component of a pie chart (Figures 1d and j) without a filled area or circle segment, thus removing these two cues. Small arrows indicate which part of the chart actually encodes the data.
* Angle Donut - a chart showing the angle component of a donut chart (Figures 1e and k), though without the lines meeting in the center that presumably allow precise judgment of the angle. Area and segment length are not represented.
* Area Chart - a chart showing only the area component to represent a percentage value (Figures 1f and l). The area representing the data “fills up” proportionally as the value increases, thus removing angle cues and providing only a very non-linear segment length.
The standard pie and donut charts serve as the baselines, since they are the complete, familiar forms from which the other four charts are derived.
In the experiments, charts were shown to subjects in random order, and each chart type was shown eight times per participant. The data in the charts came from a pre-selected array of random integers with a possible range from 3 to 97, the same for every participant. The array was shuffled randomly for each participant, making any combination of data and chart type possible.
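The trial design described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' code: the chart-type labels, the seeds, and the pool size of 48 values (six chart types times eight repetitions) are assumed, since the text does not give the array length.

```python
import random

# Assumed labels for the six chart types in Figure 1.
CHART_TYPES = ["baseline_pie", "baseline_donut", "arc",
               "angle_pie", "angle_donut", "area"]
REPETITIONS = 8  # each chart type is shown eight times per participant


def make_value_pool(seed=0):
    """Pre-select one array of random integers in [3, 97], shared by everyone."""
    rng = random.Random(seed)
    return [rng.randint(3, 97) for _ in range(len(CHART_TYPES) * REPETITIONS)]


def make_trials(value_pool, participant_seed):
    """Shuffle values and chart order independently for one participant."""
    rng = random.Random(participant_seed)
    values = list(value_pool)           # copy so the shared pool is unchanged
    rng.shuffle(values)
    charts = CHART_TYPES * REPETITIONS  # 48 chart slots, 8 per type
    rng.shuffle(charts)
    return list(zip(charts, values))


pool = make_value_pool()
for chart, value in make_trials(pool, participant_seed=1)[:3]:
    print(chart, value)
```

Because both lists are shuffled per participant, any value can end up paired with any chart type, while every participant still sees the same set of values.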
Skau and Kosara asked the same question for every chart: “What percentage of the whole is indicated below?” Some of the chart variants made this relationship clearer than others. For example, the arc and area charts clearly have a part and a whole indicated by the blue segment and the gray segment, but the angle charts don’t provide a good indicator of the whole. By keeping the question consistent and providing a brief tutorial at the beginning, they hoped to avoid confusion when participants encountered the more unusual charts.
Means and 95% confidence intervals for log absolute error are reported. Error was smaller for the baseline charts, the area chart, and the arc chart than for the two angle-only charts.
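As a rough illustration of this error measure, the sketch below computes a log absolute error per judgment and a mean with an approximate 95% confidence interval. The exact formula is not given in this text; the sketch assumes the form log2(|judged - true| + 1/8) used by Cleveland and McGill [5], and the normal-approximation interval (1.96 standard errors) is a simplification. The function names and sample data are hypothetical.

```python
import math
import statistics


def log_abs_error(judged, true):
    """Assumed form: log2(|judged - true| + 1/8); the 1/8 avoids log2(0)."""
    return math.log2(abs(judged - true) + 0.125)


def mean_ci95(values):
    """Mean with a normal-approximation 95% confidence interval."""
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, mean - 1.96 * sem, mean + 1.96 * sem


# Made-up (judged, true) percentage pairs, for illustration only.
judgments = [(30, 25), (48, 50), (70, 77), (12, 10), (90, 85)]
errors = [log_abs_error(j, t) for j, t in judgments]
mean, low, high = mean_ci95(errors)
print(f"mean log error = {mean:.2f}, 95% CI = [{low:.2f}, {high:.2f}]")
```

With this form, a perfect answer scores -3 and the score rises by one for each doubling of the absolute error, so lower values mean more accurate judgments.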
Figure 2: Distribution of amount of error per chart type
What they found contradicts common wisdom that angles are critical to pie and donut chart perception. The distribution of mean log error per participant in Figure 2 clearly shows the differences between the two angle charts and the other chart types. The relatively tall and skinny violin plots show a high degree of variance in the amount of error for the angle charts, while the other charts have relatively tight groupings, showing a consistent level of error. The arc-length chart has the tightest grouping of error, and despite a higher mean error, the amount a participant would be wrong by is more predictable.
One of the most unusual findings in the study is that the area-only chart has very similar error to the pie and donut. This is unexpected, given how difficult it generally is to correctly estimate area, and also the chart’s lack of familiarity.
Skau and Kosara’s results cast doubt on the importance of angle: both angle-only charts performed considerably worse than the rest. This suggests that angle cannot be the only way we read a pie or donut chart. At least one of the other encodings is necessary to be able to interpret the angle encoding in a chart. They found that donuts are likely no worse than pies, despite missing the center. This suggests that area and arc length can make up for the missing angle information. While arc length and area alone are better than angle alone, they are still worse than complete pie and donut charts.
Line graphs are today one of the most common types of statistical data graphics [9], and are used to visualize temporal data in fields such as finance, politics, science, engineering, and medicine.
In their study, Javed and McDonnel [10] explored reader performance on comparison, slope, and discrimination tasks for different types of line graphs involving multiple time series.
Many factors affect the perception of multiple time series. Below are the design factors Javed and McDonnel list to classify different line charting techniques [10].
Space management: Whether the time series share a single canvas. A shared canvas usually makes it easier for readers to compare different series, whereas separate canvases may make each individual series easier to read.
Space per line: The amount of vertical display space allocated to a single time series.
Identity: The technique used to differentiate one series from another, such as pattern or color.
Baseline: Whether or not the series share a common x-y axis.
Visual clutter: The clutter associated with the visualization technique, especially for a large number of time series (N).
They designed a quantitative user study to measure time and correctness performance for different combinations of visualization technique, screen space, and number of time series.
Subjects performed the comparison, slope, and discrimination tasks on the different types of charts, and the authors recorded the correctness and completion time for each task.
Figure 3
Simple graph (SG, Figure 3a): All time series are drawn in one shared space, each as a line in its own color.
Small multiples (SM, Figure 3c): Each time series has its own space and color. The Y axes use the same scale.
Horizon graph (HG, Figure 3d) [10]: A set of 2-band horizon graphs, one per time series, where each graph was given an equal amount of vertical screen space. Graphs were drawn using the standard red/blue horizon color scheme. Value (Y) axes used the same scale across all charts, and the baseline reference for the graph was set to the average of the extents to fully utilize the graph’s virtual resolution (equal ranges on each side of the baseline).
Braided graph (BG, Figure 3b) [10]: A single braided line graph using the whole vertical space, where each time series was drawn as a filled line graph using its corresponding color.
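The band construction of the horizon graph described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `horizon_bands` and its return format are assumptions. It places the baseline at the average of the extents and splits each value's distance from the baseline into stacked bands of equal height.

```python
def horizon_bands(values, n_bands=2):
    """Split a series into horizon-graph bands around a mid-range baseline.

    Returns (baseline, band_height, layers), where each layer is
    (sign, heights): sign is +1 at or above the baseline (e.g. blue) and
    -1 below it (e.g. red), and heights[b] is the filled height in band b.
    """
    lo, hi = min(values), max(values)
    baseline = (lo + hi) / 2.0                 # average of the extents
    band_height = (hi - baseline) / n_bands    # equal range on each side
    layers = []
    for v in values:
        offset = abs(v - baseline)
        sign = 1 if v >= baseline else -1
        # Band b covers offsets from b * band_height to (b + 1) * band_height.
        heights = [min(max(offset - b * band_height, 0.0), band_height)
                   for b in range(n_bands)]
        layers.append((sign, heights))
    return baseline, band_height, layers


baseline, band_height, layers = horizon_bands([0, 10, 20, 5])
print(baseline, band_height)
for sign, heights in layers:
    print(sign, heights)
```

When drawn, the bands are overlaid in the same strip with increasing color intensity, so each series needs only a fraction of the vertical space a plain line graph would use for the same value range.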
Their findings can be summarized as follows:
Shared-space charts (SG and BG) were faster than split-space charts for finding the local maximum;
It’s easier to complete the slope task with split-space charts;
Reducing the display space allocated to each chart had a negative effect on correctness, but little effect on completion time.
Their results show that techniques that create separate charts for each time series, such as small multiples and horizon graphs, are generally more efficient for comparisons across time series with a large visual span. On the other hand, shared-space techniques, like standard line graphs, are typically more efficient for comparisons over smaller visual spans where the impact of overlap and clutter is reduced.
Although there is a lot of discussion of “how” to draw charts and graphs, I don’t think there is enough discussion of which graphs are effective for human readers. “The power of graph is its ability to enable one to take in the quantitative information, organize it and see patterns and structure not readily revealed by other means of studying the data”. I believe studies of graphical perception will provide guidelines that help people design graphs that are more efficient to read.
Graphical perception is a completely new concept to me. Although this paper covers only a very limited part of graphical perception, I still learned several findings that contradict common wisdom. In my future endeavors, I would like to read more studies about other types of charts and graphs, hoping to improve the “graphical perception” of my charts.
[1] P. J. FitzPatrick. Leading British statisticians of the nineteenth century. Journal of the American Statistical Association, 55(289):38-70, Mar. 1960.
[2] W. C. Eells. The relative merits of circles and bars for representing component parts. Journal of the American Statistical Association, 21(154):119-132, 1926; F. E. Croxton and R. E. Stryker. Bar charts versus circle diagrams. Journal of the American Statistical Association, 22(160):473-482, 1927.
[3] W. Kruskal. Visions of maps and graphs. In Proceedings of the International Symposium on Computer-Assisted Cartography, Auto-Carto II, pages 27-36, 1975.
[4] D. Cox. Some remarks on the role in statistics of graphical methods. Applied Statistics, 27(1):4-9, 1978.
[5] W. S. Cleveland and R. McGill. Graphical perception: Theory, experimentation and application to the development of graphical methods. Journal of the American Statistical Association, 79(387):531-554, Sept. 1984.
[6] J. Heer and M. Bostock. Crowdsourcing graphical perception: using Mechanical Turk to assess visualization design. In ACM Human Factors in Computing Systems, pages 203-212, 2010.
[7] J. Lohse. A cognitive model for the perception and understanding of graphs. In Proceedings of the ACM CHI’91 Conference on Human Factors in Computing Systems, pages 137-144, 1991.
[8] D. Skau and R. Kosara. Arcs, angles, or areas: Individual data encodings in pie and donut charts. Eurographics Conference on Visualization (EuroVis) 2016, Volume 35, No. 3.
[9] W. S. Cleveland. Visualizing Data. Hobart Press, Summit, NJ, 1994.
[10] W. Javed, B. McDonnel, and N. Elmqvist. Graphical perception of multiple time series. IEEE Transactions on Visualization and Computer Graphics, 16(6):927-934, 2010.
[11] H. Lam, T. Munzner, and R. Kincaid. Overview use in multiple visual information resolution interfaces. IEEE Transactions on Visualization and Computer Graphics, 13(6):1278-1285, 2007.
[12] V. Beattie and M. J. Jones. The impact of graph slope on rate of change judgments in corporate reports. ABACUS, 38(2):177-199, 2002.
[13] D. Simkin and R. Hastie. An information-processing analysis of graph perception. Journal of the American Statistical Association, 82(398):454-465, June 1987.