Directions

In this chapter we discussed why well-designed data graphics are important, and we described a taxonomy for understanding their composition.

The objective of this assignment is for you to understand what characteristics you can use to develop a great data graphic.

Each question is worth 5 points.

To submit this homework, you will create the document in RStudio using the knitr package (button included in RStudio) and then publish the document to your RPubs account. Once it is uploaded, you will submit the link to that document on Canvas. Please make sure that this link is hyperlinked and that I can see the visualization and the code required to create it.

Question #1

Answer the following questions for this graphic: Relationship between ages and psychosocial maturity

  1. Identify the visual cues, coordinate system, and scale(s):
  1. How many variables are depicted in the graph? Explicitly link each variable to a visual cue that you listed above.
  1. Critique this data graphic using the taxonomy described in the lecture.

Answer: The data graphic uses color as a visual cue to distinguish menarche from psychosocial maturation. It tries to show that there is a mismatch between menarche and psychosocial maturation in the present day. The graphic also uses a time scale, which is a numeric quantity with some special properties.

Question #2

Answer the following questions for this graphic: World’s top 10 best selling cigarette brands 2004-2007

  1. Identify the visual cues, coordinate system, and scale(s)
  1. How many variables are depicted in the graph? Explicitly link each variable to a visual cue that you listed above.
  1. Critique this data graphic using the taxonomy described in the lecture.

Answer: The data graphic uses color and length as visual cues. We can see that Marlboro was the top-selling cigarette brand between 2004 and 2007. The title and x-axis label give clear context for the purpose of the graphic, which is to support meaningful comparison. It uses a linear scale, with sales in billions ($) on the x-axis.

Question #3

Find two data graphics published in a newspaper or on the internet in the last two years.

  1. Identify a graphical display that you find compelling. What aspects of the display work well, and how do these relate to the principles that we have just gone over in lecture? Include a screenshot of the display along with your solution. (Hint: use the following in a code chunk.)

Answer: I find the graphic compelling because it is simple and beautiful. It uses color as a visual cue to distinguish mothers from fathers, and it tries to show a discrepancy in US labor force participation between the two groups. The graphic also uses a time scale, which is a numeric quantity with some special properties. For example, the summer months are shaded out.

knitr::include_graphics("good.png")

  1. Identify a graphical display that you find less compelling. What aspects of the display don’t work well? Are there ways that the display might be improved? Include a screenshot of the display along with your solution. (Hint: use the following in a code chunk.)

Answer: I find the graphic less compelling because it is complex and confusing. It uses color and length as visual cues. We can see that the Alpha and Delta variants are the most prevalent coronavirus variants in the US. However, it is not clear which of the two has the highest prevalence. The legend is also confusing because its color scales are too similar. As a result, the graphic gives less clear context for its purpose, which is to support meaningful comparison.

knitr::include_graphics("bad.png")
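The two `knitr::include_graphics()` calls above can be wrapped so that a missing screenshot stops the knit with a readable message. This is only a minimal sketch: the helper name `show_screenshot` is hypothetical (it is not part of knitr or of the assignment), and it assumes the screenshots are saved as `good.png` and `bad.png` in the same folder as the R Markdown file.

```r
# Hypothetical helper, not part of the assignment: embed a screenshot,
# stopping early with a readable message if the file is missing.
show_screenshot <- function(path) {
  if (!file.exists(path)) {
    stop("Screenshot not found: ", path, call. = FALSE)
  }
  # include_graphics() is the knitr call given in the hint.
  knitr::include_graphics(path)
}

# Assumed file names, taken from the calls above; run inside a code chunk:
# show_screenshot("good.png")
# show_screenshot("bad.png")
```

Checking for the file first keeps a typo in the file name from surfacing only after the document is published.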

Question #4

Briefly (one paragraph) critique the designer’s choices. Would you have made different choices? Why or why not? Note: Link contains a collection of many data graphics, and I don’t expect (or want) you to write a full report on each individual graphic. But each collection shares some common stylistic elements. You should comment on a few things that you notice about the design of the collection.

What is a Data Scientist

Answer: The cohesive, unified visual cues, especially the color patterns, make the design of the collection more compelling and professional. However, there is too much text and there are too many data points, so it is hard to figure out which data points are important. More precisely, the collection asks three different questions in fewer than two pages (“Who are data science practitioners, what skills do they need, and why are they so different?”). To make it more effective, I would separate the questions and group each graph with the question it addresses. The first section would be “Who are they?”, the second “What skills do they have?”, and the third “What makes them different?” This would give the reader clearer context about which data visualization they are looking at.

Question #5

Briefly (one paragraph) critique the designer’s choices. Would you have made different choices? Why or why not? Note: Link contains a collection of many data graphics, and I don’t expect (or want) you to write a full report on each individual graphic. But each collection shares some common stylistic elements. You should comment on a few things that you notice about the design of the collection.

Charts that explain food in America

Answer: The collection mostly uses a geographic coordinate system to show how the food industry has changed over the years. Of course, some of the 40 data visualizations use a different coordinate system, such as the Cartesian coordinate system, to convey these changes. That said, almost all of the visualizations use color as a clear visual cue to distinguish different data points. To make the collection even better, I would suggest using fewer visualizations to convey where our food comes from and how we eat it. In other words, what the visualizations need is focus, and each one should give clear context about what it is trying to emphasize.