Source file ⇒ /Users/sambamamba/Stat 133 Stuff/Lab 3 Graph Assignment.Rmd
Part I
Answer these questions:
What glyphs are used?
- The glyphs are the red lines and the dots at their ends. Each line represents one quintile of US families, grouped by family income, and each dot marks that quintile's percentage in one of the two time periods.
What are the aesthetics for those glyphs?
- The aesthetics for the red lines and dots are color, x and y position (location), and text labels (the quintile names and the percentage numbers).
Which variable is mapped to each aesthetic?
- label: the quantile names, such as “Wealthiest Fifth”, “Next Fifth”, “Middle Fifth”, “Second-Poorest Fifth”, “Poorest Fifth”
- location: the variable year (the two time periods) is mapped to the x-axis, and the percentage of annual family income is mapped to the y-axis
Which variable, if any, is used for faceting?
- There is only one graph. Since facets are multiple side-by-side graphs, there are no facets in this example.
What are the scales?
- The color scale has a single value, red. The x-axis scale consists of the two years 1971 and 2011. The y-axis scale, the percentage of annual family income, ranges from 6% to 114%.
What variables make up the frame?
- The years define the x-coordinate and the annual family income percentages define the y-coordinate.
What are the guides?
- The guides are the percentage numbers that distinguish the quintiles, the quintile labels under each red line, and the year labels for the two time periods.
Write down what the glyph-ready dataframe looks like.
- A glyph-ready dataframe is a table in which each case (row) is a quintile, with the rows ordered in ascending or descending order by percentage. Below is a rough representation of a glyph-ready dataframe:
| Quintile | 1971 | 2011 |
|---|---|---|
| Wealthiest Fifth | 6% | 9% |
| Next Fifth | 10% | 19% |
| Middle Fifth | 13% | 29% |
| Second-Poorest Fifth | 19% | 46% |
| Poorest Fifth | 42% | 114% |
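As a rough sketch, the glyph-ready dataframe above could be built in R and mapped to the aesthetics described in Part I. This assumes that the first percentage listed for each quintile belongs to 1971 and the second to 2011. The column names `quintile`, `year`, and `pct` are illustrative choices, not names from the original graph, and the plot is only an approximation of it.

```r
library(ggplot2)

# Wide form: one row per quintile (one red line per row).
# Assumption: the first value is for 1971, the second for 2011.
wide <- data.frame(
  quintile = c("Wealthiest Fifth", "Next Fifth", "Middle Fifth",
               "Second-Poorest Fifth", "Poorest Fifth"),
  pct_1971 = c(6, 10, 13, 19, 42),
  pct_2011 = c(9, 19, 29, 46, 114)
)

# Long form: one row per dot glyph (one quintile in one year).
long <- data.frame(
  quintile = rep(wide$quintile, times = 2),
  year     = rep(c(1971, 2011), each = nrow(wide)),
  pct      = c(wide$pct_1971, wide$pct_2011)
)

# year -> x, pct -> y, one line per quintile; color is fixed to red
# rather than mapped to a variable.
p <- ggplot(long, aes(x = year, y = pct, group = quintile)) +
  geom_line(color = "red") +
  geom_point(color = "red") +
  geom_text(aes(label = paste0(pct, "%")), vjust = -0.8) +
  scale_x_continuous(breaks = c(1971, 2011))
```

The long form has one row per dot, which is the shape `geom_point()` and `geom_line()` expect. The wide form matches the table, where each case is a quintile.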
Part II
Output A: flights1 %>% select(carrier, distance, dep_delay, origin) %>% arrange(distance)
Output B: flights1 %>% filter(carrier == "UA")
Output C: flights1 %>% select(carrier, distance, dep_delay, origin) %>% head(2)
Output D: flights1 %>% summarise(total = mean(dep_delay))
Output E: flights1 %>% select(carrier, distance)
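To check what each pipeline returns, the five commands can be run on a small stand-in table. The sketch below is only illustrative: the real `flights1` is not shown in this assignment, so the four rows here are made-up values, and the names `out_a` through `out_e` are hypothetical.

```r
library(dplyr)

# Hypothetical stand-in for flights1; the values are invented for illustration.
flights1 <- data.frame(
  carrier   = c("UA", "AA", "UA", "DL"),
  distance  = c(1400, 1089, 719, 762),
  dep_delay = c(2, 4, -4, -6),
  origin    = c("EWR", "JFK", "EWR", "LGA")
)

# Output A: keep four columns, then sort rows by increasing distance.
out_a <- flights1 %>%
  select(carrier, distance, dep_delay, origin) %>%
  arrange(distance)

# Output B: keep only the rows for carrier "UA" (all columns retained).
out_b <- flights1 %>%
  filter(carrier == "UA")

# Output C: keep four columns, then take the first two rows.
out_c <- flights1 %>%
  select(carrier, distance, dep_delay, origin) %>%
  head(2)

# Output D: collapse to a single row holding the mean departure delay.
# Note that the column is named "total" even though it stores a mean.
out_d <- flights1 %>%
  summarise(total = mean(dep_delay))

# Output E: keep only the carrier and distance columns.
out_e <- flights1 %>%
  select(carrier, distance)
```

If the real `flights1` has missing values in `dep_delay`, `mean(dep_delay)` returns `NA` unless `na.rm = TRUE` is passed.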