Directions

In this chapter we discussed why well-designed data graphics are important and we described a taxonomy for understanding their composition.

The objective of this assignment is for you to understand what characteristics you can use to develop a great data graphic.

Each question is worth 5 points.

To submit this homework, create the document in RStudio using the knitr package (a Knit button is included in RStudio), then publish the document to your RPubs account. Once it is uploaded, submit the link to that document on Canvas. Please make sure that the link is hyperlinked and that I can see both the visualization and the code required to create it.
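
One way to keep the code visible in the knitted document is to set the chunk option echo in a setup chunk near the top of the file. This is a minimal sketch, not a required step: echo is already TRUE by default in knitr, so the call only matters if a template or an earlier chunk has switched it off.

```r
# Contents of a setup chunk: show the source of every later chunk
# in the knitted output, alongside the figure it produces.
# echo = TRUE is knitr's default; stating it explicitly guards
# against a template that turned it off.
knitr::opts_chunk$set(echo = TRUE)
```

With this in place, each graphic in the published page appears together with the chunk that created it.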

Question #1

Answer the following questions for this graphic: Relationship between ages and psychosocial maturity

  1. Identify the visual cues, coordinate system, and scale(s)

The visual cues of this graph are color, position, length, and direction; the coordinate system is Cartesian; and the scales are linear.

  2. How many variables are depicted in the graph? Explicitly link each variable to a visual cue that you listed above.

There are four pairs of the same two variables, menarche and psychosocial maturation, one pair for each time period. Each pair is linked to a different visual cue:

     1. 20,000 years ago: length (the lengths of menarche and psychosocial maturation differ)
     2. 2,000 years ago: position (the two are positioned differently than in the 20,000 years ago group)
     3. 200 years ago: direction (the pair rises slightly compared with the previous two groups)
     4. Present day: color (color distinguishes the two variables)

  3. Critique this data graphic using the taxonomy described in the lecture.

The graphic uses color to distinguish the two variables, menarche and psychosocial maturation, and it shows a mismatch between them in the present day. The time scale also shrinks by 90% at each step (from 20,000 to 2,000 years ago, from 2,000 to 200, and so on).
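
The 90% steps noted in the critique can be checked with a little arithmetic. This is a minimal sketch in R, assuming only the three labeled time points mentioned in the answer (20,000, 2,000, and 200 years ago); the variable names are illustrative.

```r
# Labeled time points, in years before present (taken from the answer;
# "present day" is left out because log10(0) is undefined).
years_ago <- c(20000, 2000, 200)

# Ratio of each time point to the one before it.
# A constant ratio of 0.1 means a 90% decrease at every step.
step_ratio <- years_ago[-1] / years_ago[-length(years_ago)]

# Differences between the same points after taking log10.
# A constant difference means the points are evenly spaced in log terms.
log_spacing <- diff(log10(years_ago))

print(step_ratio)
print(log_spacing)
```

Equal ratios between ticks, rather than equal differences, are what logarithmic spacing looks like, which is worth keeping in mind when describing the time axis.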

Question #2

Answer the following questions for this graphic: World’s top 10 best selling cigarette brands 2004-2007

  1. Identify the visual cues, coordinate system, and scale(s)

The visual cues for this graph are color and length, the coordinate system is Cartesian, and the scale is linear.

  2. How many variables are depicted in the graph? Explicitly link each variable to a visual cue that you listed above.

There are 10 unique variables in this graph, and each variable is linked to both the color and length visual cues. The variables are: Marlboro, Mild Seven, L&M, Winston, Camel, Cleopatra, Derby, Pall Mall, Kent, and Wills Gold Flake.

  3. Critique this data graphic using the taxonomy described in the lecture.

The graphic has two visual cues, color and length. The title and x-axis are clearly marked, making the context and purpose of the graphic easy to understand. We can see that Marlboro was the top-selling cigarette brand from 2004 to 2007. The graphic uses a linear scale in which the x-axis is sales in billions.

Question #3

Find two data graphics published in a newspaper or on the internet in the last two years.

  1. Identify a graphical display that you find compelling. What aspects of the display work well, and how do these relate to the principles that we have just gone over in lecture? Include a screenshot of the display along with your solution. (Hint: use the following in a code chunk: knitr::include_graphics("your_graphic").)
knitr::include_graphics("good graph.png")

I find this graph compelling because it is simple, easy to understand, and effortlessly conveys the information to the viewer. The graph shows the casualties from aircraft incidents per year, in thousands, with the number of passengers carried by airlines, in billions, overlaid on the same graphic. It shows both a dramatic increase in passengers carried and a dramatic decrease in casualties from incidents involving airlines.

  2. Identify a graphical display that you find less compelling. What aspects of the display don’t work well? Are there ways that the display might be improved? Include a screenshot of the display along with your solution. (Hint: use the following in a code chunk: knitr::include_graphics("your_graphic").)
knitr::include_graphics("bad graph.jpg")

This graphic is less compelling simply because it is confusing. The time scale is inconsistent, with a single bar representing as much as a full month or as little as a week. There is also no apparent explanation for why the bars cover different spans of time; for July 2011, for example, two of the three bars show the same valuation.

Question #4

Briefly (one paragraph) critique the designer’s choices. Would you have made different choices? Why or why not? Note: Link contains a collection of many data graphics, and I don’t expect (or want) you to write a full report on each individual graphic. But each collection shares some common stylistic elements. You should comment on a few things that you notice about the design of the collection.

What is a Data Scientist

Answer: The cohesive color scheme makes for a professional and compelling design. However, the graphic is so bloated with text and separate data points that it is difficult to digest all the information shown. To make it a stronger graphic, I would make clear delineations between the sections and include header questions to draw the viewer to them. For example, the first section could be labelled “Who are Data Scientists?”.

Question #5

Briefly (one paragraph) critique the designer’s choices. Would you have made different choices? Why or why not? Note: Link contains a collection of many data graphics, and I don’t expect (or want) you to write a full report on each individual graphic. But each collection shares some common stylistic elements. You should comment on a few things that you notice about the design of the collection.

Charts that explain food in America

Answer: The majority of the graphs use geographic coordinates to show changes in the food industry in America and across the world over the years, though several graphs use the Cartesian coordinate system as well. I like the use of vibrant colors to distinguish the different data points; it makes for a clear visual that is easy to understand. To make these graphics work better as a collection, I would suggest having one or several areas of focus that all the graphs work towards. That way the viewer has a basic understanding of what to expect from each graphic when first viewing it.