When completed, name your final output .html file YourName_512-91-O-2018.html and upload it to the “Visualization Coding Exercise #5 (Homework #5)” assignment in Week #5 on Moodle.

This assignment is worth 20 points. Each problem is worth 2.5 points.

Aesthetics Best Practices

In class, you saw that there are lots of ways to use aesthetics. Perhaps too many, because although they are possible, they are not all recommended. These exercises help you take a look at what works and what does not work.

So far you’ve focused on scatter plots since they are intuitive, easily understood and very common. A major consideration in any scatter plot is dealing with overplotting. You’ll encounter this topic again in the geometries layer, but you can already make some adjustments here.

You’ll have to deal with overplotting when you have:

  1. Large datasets.

  2. Imprecise data, so that points are not clearly separated on your plot (you saw this in last week’s lecture with the iris dataset).

  3. Interval data (that is, data that appears only at fixed values), or

  4. Aligned data values on a single axis.
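One rough way to check whether a dataset falls into the situations listed above is to count how many points sit exactly on top of an earlier point. The sketch below uses base R only; the helper name count_overlaps and the toy data frame are made up for illustration and are not part of ggplot2 or of this assignment.

```r
# Hypothetical helper (not part of ggplot2): count rows whose (x, y)
# position exactly repeats an earlier row, i.e. points that would be
# drawn directly on top of another point.
count_overlaps <- function(df, x, y) {
  sum(duplicated(df[, c(x, y)]))
}

# Toy interval data (made up): values recorded only at fixed steps
toy <- data.frame(
  dose     = c(1, 1, 2, 2, 2, 3),
  response = c(10, 10, 20, 20, 25, 30)
)

# Rows 2 and 4 repeat earlier positions, so two points are hidden
n_hidden <- count_overlaps(toy, "dose", "response")
n_hidden
```

This only catches exact duplicates. Points that merely sit close together also overplot, so treat the count as a lower bound.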

One very common technique when you have solid shapes is to use alpha blending (that is, adding transparency). An alternative is to use hollow shapes. These are adjustments to make before even worrying about positioning. They address the first point above (large datasets), which you’ll see again in the next exercise.
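To see why transparency helps, it can be useful to work out how opaque a spot becomes when several semi-transparent points overlap. The sketch below assumes the graphics device uses standard "over" alpha compositing, in which n stacked points of equal alpha have a combined opacity of 1 - (1 - alpha)^n. The function name stacked_opacity is hypothetical.

```r
# Combined opacity where n points with the same alpha overlap,
# assuming standard "over" alpha compositing (an assumption about
# the graphics device, not something ggplot2 reports).
stacked_opacity <- function(alpha, n) {
  1 - (1 - alpha)^n
}

# Opacity for 1 to 5 overlapping points at alpha = 0.6
overlap <- 1:5
opacity <- stacked_opacity(0.6, overlap)
round(opacity, 3)
```

Under this assumption, a single point at alpha = 0.6 is visibly lighter than a pile of three or more, so dense regions stand out as darker patches.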

  1. Perform the following.

Convert cyl to a factor variable before you make the plots requested below.

  a. Begin by making a basic scatter plot of mpg (y) vs. wt (x), map cyl to color, and set size to 4.

  b. Modify the plot in part a to set shape to 1. This draws hollow circles.

  c. Modify the plot in part a to set alpha to 0.6.

library(ggplot2)

# Convert cyl to a factor so it maps to discrete colors
mtcars$cyl <- as.factor(mtcars$cyl)

#part a
ggplot(mtcars, aes(wt, mpg, col = cyl)) +
  geom_point(size = 4)

#part b
# Hollow circles - an improvement
ggplot(mtcars, aes(wt, mpg, col = cyl)) +
  geom_point(size = 4, shape = 1)

#part c
# Add transparency - very nice
ggplot(mtcars, aes(wt, mpg, col = cyl)) +
  geom_point(size = 4, alpha = .6)

Aesthetics Best Practices - More Practice!

In problem #1, we defined four situations in which you would have to adjust for overplotting. You will consider the first and the last of them here with the diamonds data frame.

  1. Large datasets
  2. Aligned data values on a single axis

The diamonds data frame is available in the ggplot2 package, and you worked with this data set in previous homework exercises. Work with the entire data set (not the sample used in a previous homework assignment).

  2. Perform the following.

  a. Begin by making a basic scatter plot of price (y) vs. carat (x) and map clarity onto color.

  b. Modify the plot in part a by setting alpha to 0.5. This is a good start to dealing with the large dataset.

  c. Align all the diamonds within a clarity class by plotting carat (y) vs. clarity (x). Map price onto color. alpha should still be 0.5.

  d. The plot in part c has all the individual values lined up on a single axis within each clarity category, so you have not overcome overplotting. Modify the plot in part c to use position = "jitter" inside geom_point().

library(ggplot2)

#part a
# Scatter plot: carat (x), price (y), clarity (color)
ggplot(diamonds, aes(carat, price, col = clarity)) + 
  geom_point()

#part b
# Adjust for overplotting
ggplot(diamonds, aes(carat, price, col = clarity)) + 
  geom_point(alpha = 0.5)

#part c
# Scatter plot: clarity (x), carat (y), price (color)
ggplot(diamonds, aes(clarity, carat, col = price)) + 
  geom_point(alpha = 0.5)

#part d
#Dot plot with jittering
ggplot(diamonds, aes(clarity, carat, col = price)) + 
  geom_point(alpha = 0.5, position = "jitter")
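
The jittering in part d can be pictured as adding a small amount of random noise to each point's position so that aligned values spread out. The base R sketch below mimics horizontal jitter only. The helper jitter_x, its width of 0.4, and the toy positions are assumptions chosen for illustration, not ggplot2's implementation.

```r
# Hypothetical helper: mimic horizontal jitter by adding uniform noise
# in the range [-width, width] to each x position.
jitter_x <- function(x, width = 0.4, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)  # optional, for reproducible noise
  x + runif(length(x), min = -width, max = width)
}

# Toy data (made up): three categories coded as positions 1, 2, 3,
# with five points stacked on each position
x <- rep(1:3, each = 5)
x_jittered <- jitter_x(x, width = 0.4, seed = 1)

# Every point stays within `width` of its original category position
all(abs(x_jittered - x) <= 0.4)
```

In ggplot2 itself, the amount of jitter can be set explicitly by passing position = position_jitter(width = ..., height = ...) to geom_point() in place of the "jitter" shorthand. Setting height = 0 keeps the carat values exact and spreads the points only along the clarity axis.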