In this chapter we discussed wy well-designed data graphics are important and we described a taxonomy for understanding their composition.
The objective of this assignment is for you to understand what characteristics you can use to develop a great data graphic.
Each question is worth 5 points.
To submit this homework you will create the document in Rstudio, using the knitr package (button included in Rstudio) and then submit the document to your Rpubs account. Once uploaded you will submit the link to that document on Canvas. Please make sure that this link is hyper linked and that I can see the visualization and the code required to create it.
Question #1
Answer the following questions for this graphic Relationship between ages and psychosocial maturity
Identify the visual cues, coordinate system, and scale(s)
Visual cues: color, position, length, and direction Coordinate system: Cartesian Scale: Time
How many variables are depicted in the graph? Explicitly link each variable to a visual cue that you listed above.
Four pairs of two identical variables menarche and psychosocial maturation
20,000 years ago: length (length of menarche and psychosocial maturation are different)
2,000 years ago: position (compared to the 20,000-years-ago variable pair, it’s in a different position)
200 years ago: direction (compared to the other variable pairs, it’s slightly going up)
Present:color to distinguish menarche and psychosocial maturation
Critique this data graphic using the taxonomy described in the lecture.
The data graphic has color visual cues to distinguish menarche and psychosocial maturation. It tries to show there is a mismatch between menarche and psychosocial maturation in the present time. Also, the scale of the graphic shows a time scale where there is a numeric quantity that has some special properties.
Question #2
Answer the following questions for this graphic World’s top 10 best selling cigarette brands 2004-2007
Visual cues: color and length Coordinate system: Cartesian Scale(s): Linear
10 variables Marlboro: color & length Mild Seven: color & length L&M: color & length Winston: color & length Camel: color & length Cleopatra: color & length Derby: color & length Pall Mall: color & length Kent: color & length Wills Gold flake: color & length
The data graphic has visual cues of color and length. We can see that Marlboro was the number 1 top best selling cigarette brand between 2004 and 2007. With the title and x-axis label, it gives us a clear context of what the purpose of the data graphics is to make meaningful comparison. With that said, it has a linear scale where x-axis is sales in billions($).
Question #3
Find two data graphics published in a newspaper on on the internet in the last two years.
I find the graphic compelling because it is simple and beautiful. The data graphic has color visual cues to distinguish. It shows there is a discrepancy of interpreted prices between two perspectives. Also, the scale of the graphic shows a time scale with approriate spacing and key percentage change labeled.
knitr::include_graphics("pic.png")
I find the graphic less compelling because it is messy and confusing. The data graphic has visual cues of color and length. It’s hard to see anything very clearer, and it is impossible to make any comparison or judgement.
knitr::include_graphics("pic1.png")
Question #4
Briefly (one paragraph) critique the designer’s choices. Would you have made different choices? Why or why not? Note: Link contains a collection of many data graphics, and I don’t expect (or want) you to write a full report on each individual graphic. But each collection shares some common stylistic elements. You should comment on a few things that you notice about the design of the collection.
Answer: Using the cohesive and unified visual cue especially color patterns makes the design of the collection more compelling and professional. However, there are too many texts and data points so it’s hard to figure out which data point is important. More precisely, it is asking three different questions in less than 2 pages (“Who are data science practitioners, what skills do they need, and why are they so different?”). To make it more effective, I would seperate each question and group each graph with the questions. So the first section is about “Who are they?”, the second section “What skills do they have?”, and the third section is “What makes them different?” This way, it gives the reader more clear context on what data visualization they are looking at.
Question #5
Briefly (one paragraph) critique the designer’s choices. Would you have made different choices? Why or why not? Note: Link contains a collection of many data graphics, and I don’t expect (or want) you to write a full report on each individual graphic. But each collection shares some common stylistic elements. You should comment on a few things that you notice about the design of the collection.
Charts that explain food in America
Answer: Using geographic coordinate system to show how food industry has changed throughout the years. Total there are 40 data visualizations that have different coordinate system, such as Cartesian coordinate system to convey the food changes. That said, almost all data visualizations have a clear visual cue using color to distinguish different data points in the visualizations. To improve, I would suggest using less visualization to convey where our food comes from and how we eat it. In other words, the visualizations need to have a clear focus and not distract the audience. And they should give a clear context of what each visualization is try to emphasize.