Directions

The objective of this assignment is for you to understand what characteristics you can use to develop a great data graphic.

NOTE: An Rmarkdown file like this one uses “code chunks” to include R coding that is interspersed with text. What you are reading now is text, and the code chunks are below, designated with three tick marks and the letter r in curly braces. Each code chunk ends with another three tick marks. Arranging text and code chunks is the way we build an R-markdown report.

Each question is worth 5 points.

To submit this homework you will create an html document in RStudio, using the knitr package (button included in RStudio) and then submit the pdf document to Canvas.

Question #1

Answer the following questions for this graphic: Relationship between ages and psychosocial maturity

knitr::include_graphics("http://ars.els-cdn.com/content/image/1-s2.0-S1043276005002602-gr2.jpg")

a. Identify the visual cues, coordinate system, and scale(s)

The primary visual cues in this graphic are:

The coordinate system is a standard Cartesian (2D rectangular) system. The vertical axis is a straight number line for age. The horizontal axis is a timeline split into four specific periods.

The scales: the vertical scale is a linear numeric scale for age in years, marked at 10 and 20. The horizontal scale is a timeline grouped by historical period. It is not a linear time scale because the gaps between the dates are uneven (jumping from 18,000 years to 1,800 years, then to 200 years).
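
The uneven gaps can be checked with a few lines of R. This is only a sketch: the four breakpoints below (20,000, 2,000, 200, and 0 years before present) are an assumption chosen to reproduce the gaps quoted above; they are not read from the figure itself.

```r
# Assumed positions of the four periods, in years before present.
# Hypothetical values, chosen only to match the gaps of 18,000, 1,800, and 200 years.
years_ago <- c(20000, 2000, 200, 0)

# Gaps between consecutive periods. A linear time scale would give equal gaps
# for equally spaced tick marks.
gaps <- -diff(years_ago)
gaps

# Ratio of each breakpoint to the next (excluding the present, which is 0).
ratios <- years_ago[1:2] / years_ago[2:3]
ratios
```

Under these assumed breakpoints, each step back in time is ten times longer than the one after it, so the axis behaves more like a logarithmic scale than a linear one.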

b. How many variables are depicted in the graph? Explicitly link each variable to a visual cue.

There are at least three (3) variables depicted:

  1. Age (independent variable) → encoded via position on the x-axis
  2. Psychosocial maturity score (dependent variable) → encoded via position on the y-axis and the height of the trend line
  3. Group membership (e.g., sex or cohort, if multiple lines appear) → encoded via color or line type (e.g., solid vs. dashed)

If only one population is shown, the graph contains two variables; the third variable (group) would apply only if the chart distinguishes subgroups.

c. Critique this data graphic according to the ideas laid out by Wexler and Tufte

From a Tufte perspective, the graphic performs reasonably well in terms of data-ink ratio: a line chart is an efficient format for showing change over time, and there is minimal non-data ink if the design avoids heavy gridlines or decorative elements. However, Tufte would scrutinize whether the scale on the y-axis starts at zero; if the axis is truncated to magnify small differences, the visual impression of growth may be exaggerated relative to the true scale of change. Tufte also values clear, precise labeling; any ambiguity in axis labels or units would be a weakness.
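
The truncated-axis concern can be made concrete with Tufte's lie factor, the ratio of the effect shown in the graphic to the effect in the data. The sketch below is a simplification for a line chart, because it treats the plotted height above the axis minimum as the visual effect. The scores, the axis limits, and the helper names `plotted_height` and `visual_change` are all hypothetical, not taken from the figure.

```r
# Hypothetical maturity scores at two ages (illustrative values only).
score_young <- 50
score_old   <- 55

# Fraction of the plot height a value reaches for given axis limits.
plotted_height <- function(value, axis_min, axis_max) {
  (value - axis_min) / (axis_max - axis_min)
}

# Relative change in the data itself.
data_change <- (score_old - score_young) / score_young

# Relative change in plotted height for a given axis minimum.
visual_change <- function(axis_min, axis_max = 60) {
  h_young <- plotted_height(score_young, axis_min, axis_max)
  h_old   <- plotted_height(score_old, axis_min, axis_max)
  (h_old - h_young) / h_young
}

# Lie factor = visual effect / data effect.
lie_factor_zero      <- visual_change(0)  / data_change  # axis starts at 0
lie_factor_truncated <- visual_change(45) / data_change  # axis starts at 45
```

With these made-up numbers, the zero-based axis gives a lie factor of about 1, while starting the axis at 45 gives a lie factor of about 10: the same 10% increase is drawn as a doubling in height.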

From a Wexler perspective, the graphic should tell a clear story. The central message, that psychosocial maturity increases with age but continues developing into early adulthood, is effectively communicated by an upward-trending line. However, if error bars or confidence intervals are absent, the viewer cannot assess variability or statistical uncertainty, which limits interpretability for a research audience. The graphic would benefit from annotations identifying key developmental thresholds (e.g., age 16, age 18) to make the “maturity gap” finding more intuitive. Overall, the chart is functional but could be enriched with contextual cues that connect the data to its real-world significance.


Question #2

Answer the following questions for this graphic: World’s top 10 best selling cigarette brands 2004-2007

knitr::include_graphics("https://farm3.static.flickr.com/2695/4149541331_482fbb0aaf_o.png")

a. Identify the visual cues, coordinate system, and scale(s)

The visual cues present in this graphic are:

The coordinate system is a Cartesian system oriented horizontally — the y-axis holds the categorical variable (cigarette brand names) and the x-axis holds the quantitative variable (market share or sales volume).

The scale on the x-axis is continuous and linear, representing market share percentage or units sold across countries surveyed from 2004 to 2007. The y-axis is a nominal categorical scale listing the top 10 brands.

b. How many variables are depicted in the graph? Explicitly link each variable to a visual cue.

This graphic depicts approximately three variables:

  1. Cigarette brand (categorical) → encoded via position on the y-axis and text labels
  2. Market share / sales volume (quantitative) → encoded via bar length along the x-axis
  3. Time period or country dominance (categorical/nominal) → encoded via color, distinguishing which brand led in which period or geography between 2004 and 2007

c. Critique this data graphic using the Wexler and Tufte readings

This graphic exemplifies the “simple is better” philosophy praised by the VizWiz post. From a Tufte standpoint, the horizontal bar chart is a high data-ink ratio format — each mark on the page carries meaningful information. The use of direct labeling reduces the need for a legend and minimizes the eye travel that Tufte criticizes in poorly designed charts. The color palette, noted by the original reviewer as varied without any single dominant color, avoids the visual hierarchy problem where one brand appears more important than another simply due to hue intensity.

From a Wexler standpoint, the chart succeeds in making a clear, accessible argument: cigarette brand competition is fierce and dynamic even under marketing restrictions. However, there are potential weaknesses. The chart does not show uncertainty or sample size variation across countries, which limits analytical depth. Additionally, aggregating 2004–2007 into a single snapshot may obscure year-over-year change; a small multiples display or connected dot plot might better reveal the trend within that window. Despite these critiques, the simplicity of the design makes it broadly accessible and appropriate for a general audience, which aligns with good communication-first visualization design.
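
The small multiples alternative might look like the sketch below, written in base R. The brand names and unit values are made up for illustration and are not the figures from the original graphic.

```r
# Hypothetical yearly sales for three made-up brands, 2004-2007.
sales <- data.frame(
  brand = rep(c("Brand A", "Brand B", "Brand C"), each = 4),
  year  = rep(2004:2007, times = 3),
  units = c(10, 11, 12, 14,   9, 9, 8, 7,   5, 6, 6, 7)
)

# One data frame per brand.
by_brand <- split(sales, sales$brand)

# Shared y-axis limits so the panels can be compared directly.
y_limits <- range(0, sales$units)

# Draw one small panel per brand, side by side.
old_par <- par(mfrow = c(1, length(by_brand)))
for (name in names(by_brand)) {
  panel <- by_brand[[name]]
  plot(panel$year, panel$units, type = "b", ylim = y_limits,
       main = name, xlab = "Year", ylab = "Units sold")
}
par(old_par)  # restore the previous layout
```

Keeping the same y-axis limits in every panel is the key design choice: it lets the reader compare both the level and the direction of change across brands.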


Question #3

Find two data graphics published in a newspaper or on the internet in the last two years.

a. Our World in Data - Compelling graphic

knitr::include_graphics("https://ourworldindata.org/grapher/life-expectancy.png")
Global Life Expectancy Over Time — Our World in Data (2024). Source: ourworldindata.org/life-expectancy

This chart works well for several reasons. It uses position along a common scale as the primary encoding, the most accurate visual cue available, with years on the x-axis and life expectancy on the y-axis. The design reflects Tufte’s data-ink ratio principle: clean white background, minimal gridlines, and no decorative elements. From a Wexler standpoint, the chart tells a clear story immediately: global life expectancy has risen dramatically over two centuries. The long time scale also provides meaningful context, including the visible COVID-19 dip around 2020. Overall, the chart succeeds at both the analytical and communication levels.

b. A graphical display I find less compelling

knitr::include_graphics("https://upload.wikimedia.org/wikipedia/commons/d/db/Misleading_macworld_3d_pie_chart.svg")
Misleading 3D Pie Chart. Source: Wikimedia Commons (CC BY-SA 4.0)

This chart fails on multiple levels. The 3D perspective tilt distorts slice sizes: front slices appear larger than back slices even when values are equal, directly violating Tufte’s principle that visual encoding must be proportional to the data. The shadows, gradients, and depth effects are pure chartjunk that add no informational value. From a Wexler standpoint, comparing slice sizes is nearly impossible, defeating the entire purpose of the chart. The fix is straightforward: replace it with a simple horizontal ranked bar chart, which uses bar length, one of the most accurately perceived encodings, to allow precise, direct comparison across all categories.
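
A minimal sketch of the suggested replacement, a ranked horizontal bar chart in base R, is shown here. The category names and percentages are placeholder values invented for illustration; they are not the values from the pie chart.

```r
# Hypothetical market shares in percent (placeholder values, not the chart's data).
shares <- c(Alpha = 39, Beta = 20, Gamma = 10, Delta = 7, Other = 24)

# barplot(horiz = TRUE) draws the first element at the bottom,
# so sorting in ascending order puts the largest bar at the top.
ranked <- sort(shares)

# Draw the bars and keep their midpoints for labeling.
bar_mid <- barplot(ranked, horiz = TRUE, las = 1,
                   xlab = "Market share (%)",
                   xlim = c(0, max(ranked) * 1.15))

# Label each bar directly so no legend is needed.
text(x = ranked, y = bar_mid, labels = paste0(ranked, "%"), pos = 4)
```

Because every bar starts from the same baseline, small differences between categories are readable at a glance, which the tilted pie does not allow.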


Question #4

Briefly critique the designer’s choices for: What is a Data Scientist

The “What is a Data Scientist” infographic (EMC2/Guardian, 2012) looks visually appealing but it is not the easiest to read and understand. It uses bright colors, icons, and circular diagrams to show different skills a data scientist needs, which looks nice but makes it hard to compare values accurately. Humans are not good at reading circular or radial charts because it is difficult to judge the size of angles and arcs, so the viewer can easily misread the data. Tufte would likely criticize this design because it prioritizes looks over accuracy. If I were the designer, I would have used simple bar charts instead, since they are much easier to read and compare. However, the infographic does one thing well: it gives a quick and memorable overview of what a data scientist does, which is helpful for people who are new to the topic and just need a general idea.


Question #5

Briefly critique the designer’s choices for: Charts that explain food in America

The Vox “Charts that explain food in America” collection is a good example of clear and accessible data visualization. The designers use simple chart types like bar charts, line charts, and area charts, which makes the information easy to follow. This aligns with Tufte’s idea that simple and familiar chart formats work best because they are easier for viewers to read accurately. One thing that works especially well is the use of labels and headlines directly on the charts, so the reader does not have to figure out the main point on their own. Wexler would approve of this approach because it puts the audience first. Color is also used well here, usually highlighting just one key category while keeping the rest muted, which helps guide the reader’s eye without causing confusion. One weakness is that some charts use stacked bars or stacked area formats, which can make it hard to compare categories that do not sit on the bottom baseline. Simple side-by-side charts would have worked better in those cases. Overall, this collection does a good job of balancing visual appeal with clarity, and it is well suited for a general audience that just wants to understand the data quickly.
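
The stacked-bar weakness can be illustrated with a small base R sketch. The categories, years, and values below are hypothetical and are not taken from the Vox charts.

```r
# Hypothetical values: rows are categories, columns are years (made-up data).
food <- matrix(c(30, 34, 35,
                 20, 18, 22,
                 10, 15, 12),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("Category A", "Category B", "Category C"),
                               c("2000", "2010", "2020")))

# Draw the stacked and side-by-side versions next to each other.
old_par <- par(mfrow = c(1, 2))
barplot(food, main = "Stacked", legend.text = TRUE)
barplot(food, beside = TRUE, main = "Side by side", legend.text = TRUE)
par(old_par)  # restore the previous layout

# In the stacked version, the top segment (Category C) starts at a different
# height in every bar, so its lengths are hard to compare across years.
top_segment_baseline <- food["Category A", ] + food["Category B", ]
top_segment_baseline
```

In the side-by-side version every bar starts at zero, so the change in the top category can be read directly instead of being inferred from a shifting baseline.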