Directions

In this chapter we discussed why well-designed data graphics are important and we described a taxonomy for understanding their composition.

The objective of this assignment is for you to understand what characteristics you can use to develop a great data graphic.

Each question is worth 5 points.

To submit this homework you will create the document in RStudio, using the knitr package (button included in RStudio), and then publish the document to your RPubs account. Once it is uploaded, you will submit the link to that document on Canvas. Please make sure that this link is hyperlinked and that I can see the visualization and the code required to create it.

Question #1

Answer the following questions for this graphic: Relationship between ages and psychosocial maturity

  1. Identify the visual cues, coordinate system, and scale(s). Visual cues: Colors are used to indicate different dimensions, green for Menarche and red for Psychosocial maturation. Position and length of the bars indicate the age distribution. Coordinate system: Cartesian. Scales: The X-axis is a numeric scale for time and the Y-axis is a numeric scale for age.

  2. How many variables are depicted in the graph? Explicitly link each variable to a visual cue that you listed above. The graph shows four pairs of the same two variables, Menarche and Psychosocial maturation. 20,000 years ago: different lengths for both. 2,000 years ago: different position from 20,000 years ago. 200 years ago: the trend is moving in an upward direction.

  3. Critique this data graphic using the taxonomy described in the lecture. Visually, the X-axis time scale does not advance in equal units of time, so the graph could be a bit misleading. It is also difficult to read off precise ages; only ballpark ranges can be derived for both variables. That said, the graphic makes it easy to see the high-level relationship between the two variables in terms of age as time passes.
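To make the point about the time scale concrete, here is a minimal base R sketch. It assumes only the three time points named in the answer above (20,000, 2,000, and 200 years ago); the variable names are illustrative and any other tick marks on the original graphic are not represented. Each point is ten times the next, so equal spacing between them matches a log10 scale rather than a linear one.

```r
# Time points named in the answer above, in years before present (assumed subset of the axis)
years_ago <- c(20000, 2000, 200)

# Gaps between neighbouring points on a linear scale: very unequal
linear_gaps <- abs(diff(years_ago))

# Gaps between the same points on a log10 scale: both approximately 1
log_gaps <- abs(diff(log10(years_ago)))

print(linear_gaps)
print(log_gaps)
```

If the axis places these points at equal distances, a reader expecting a linear time axis would underestimate how much longer the earliest interval is than the later one.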

Question #2

Answer the following questions for this graphic: World’s top 10 best selling cigarette brands 2004-2007

  1. Identify the visual cues, coordinate system, and scale(s). Visual cues: Color to indicate different brands and length to indicate sales. Coordinate system: Cartesian. Scales: Linear.

  2. How many variables are depicted in the graph? Explicitly link each variable to a visual cue that you listed above. Marlboro: color and length; Mild Seven: color and length; L&M: color and length; Winston: color and length; Camel: color and length; Cleopatra: color and length; Derby: color and length; Pall Mall: color and length; Kent: color and length; Wills Gold flake: color and length.

  3. Critique this data graphic using the taxonomy described in the lecture. The colors are used efficiently, and no one color stands out more than the others. However, a single color would have worked just as well. The lengths clearly indicate that Marlboro had the highest sales and hence was the most popular choice in the given time period. Labels provide good context, and meaningful comparisons can be made.

Question #3

Find two data graphics published in a newspaper or on the internet in the last two years.

  1. Identify a graphical display that you find compelling. What aspects of the display work well, and how do these relate to the principles that we have just gone over in lecture? Include a screenshot of the display along with your solution. (Hint: use the following in a code chunk: knitr::include_graphics("your_graphic").)
knitr::include_graphics("Graphic 1.png")

The graphic above is a NYT Consumer Price Index graphic. It is a very simple, clear, minimal, yet informative graphic. A single color is used for several categories of inflation because what matters is the length of the bars, which gauges the amount of inflation. The absence of more colors is actually a positive aspect because it avoids unnecessary noise. The axis seems linear at first glance, but the horizontal scale also extends into negative values, indicating a slowdown or reduction in inflation for some categories.

  2. Identify a graphical display that you find less compelling. What aspects of the display don’t work well? Are there ways that the display might be improved? Include a screenshot of the display along with your solution. (Hint: use the following in a code chunk: knitr::include_graphics("your_graphic").)
knitr::include_graphics("Graphic 2.png")

I find this graphic less compelling because there is no indication of what the scale represents. Perhaps a person who follows politics religiously would understand it, but it is hard to determine what 50-60-70…100 mean. All I can determine from this graphic is that more conservative people objected to certifying the 2020 electoral college result, because they are gathered towards the right part of the graphic. The legend is also unclear: the yellow circle is clearly indicated, but UNRATED has several blank circles instead of just one to tell us what it means. It is also unclear what happens below 50 on the X-axis.

Question #4

Briefly (one paragraph) critique the designer’s choices. Would you have made different choices? Why or why not? Note: Link contains a collection of many data graphics, and I don’t expect (or want) you to write a full report on each individual graphic. But each collection shares some common stylistic elements. You should comment on a few things that you notice about the design of the collection.

What is a Data Scientist

Answer: The uniform color throughout the graphic is a nice binding factor that makes it feel a lot more cohesive and indicates that it is all one topic. However, it is very text heavy, so the visuals don’t necessarily guide the eye to what is most important first. Readers need to figure that out themselves. The gradient used for decreasing % amounts and the different sizes of circles are efficient; however, using different colors for the numbers and the % sign adds clutter. The text is also not uniform, mixing sentence case and all caps for no clear purpose, which just looks busy. In the last graphic, varying percentages are all shown in same-sized boxes, which makes it misleading to read the composition of the team visually. The percentages also sum to more than 100%, and it is unclear why.

Question #5

Briefly (one paragraph) critique the designer’s choices. Would you have made different choices? Why or why not? Note: Link contains a collection of many data graphics, and I don’t expect (or want) you to write a full report on each individual graphic. But each collection shares some common stylistic elements. You should comment on a few things that you notice about the design of the collection.

Charts that explain food in America

Answer: Some dynamic visualizations in the collection are compelling and make efficient use of the method to convey changing states. Several heat maps are used to indicate food production, economics, policies, etc. in different categories. These are usually indicative of high-level trends, but I would also include the ability to zoom into a particular region or turn off one or more variables to see the relationship between the ones that I want to understand better. The skew map for meat consumption again gives a very high-level understanding of the data, but it is very difficult to say anything specific based on that map because it lacks any scales or dimensions. The descriptions for each graphic help the reader understand the deeper context and decipher the graphic better. Overall, these are very engaging graphics.