ggplot2: Elegant Graphics for Data Analysis

Literature cited:

Every ggplot2 plot has three key components:

  1. Data.
  2. A set of aesthetic mappings between variables in the data and visual properties.
  3. At least one layer which describes how to render each observation. Layers are usually created with a geom function.
#Exercise 1.
library(ggplot2)
ggplot(mpg, aes(x = displ, y = hwy)) + geom_point()
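
To make the three components listed above more visible, the same plot can be written with the data and the aesthetic mappings passed to the layer rather than to ggplot(). This is only an illustrative sketch; the object names p_global and p_layer are made up for the comparison.

```r
library(ggplot2)

# Data and mappings given once in ggplot() and inherited by the layer
p_global <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point()

# The same three components, this time supplied directly to the layer
p_layer <- ggplot() +
  geom_point(data = mpg, mapping = aes(x = displ, y = hwy))

# layer_data() returns the data a layer actually draws;
# both versions should place the points at the same positions
head(layer_data(p_global)[, c("x", "y")])
head(layer_data(p_layer)[, c("x", "y")])
```

Setting the data and mappings in ggplot() is usually more convenient when several layers share them.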

#Exercise 2.
library(ggplot2)
ggplot(mpg, aes(cty, hwy)) + geom_point()

#Exercise 3. Colour, Size, Shape and Other Aesthetic Attributes.
library(ggplot2)
ggplot(mpg, aes(displ, cty, colour = class)) +
  geom_point()

#Exercise 4. Histogram.
library(ggplot2)
ggplot(mpg, aes(cty)) + geom_histogram(binwidth = 1)

If you have a scatterplot with a lot of noise, it can be hard to see the dominant pattern. In this case it’s useful to add a smoothed line to the plot with geom_smooth().

#Exercise 5. Adding a Smoother to a Plot.
library(ggplot2)
ggplot(mpg, aes(displ, hwy)) + geom_point() + geom_smooth()
## `geom_smooth()` using method = 'loess'
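
How closely the loess smoother follows the points is controlled by the span argument, which runs from 0 (very wiggly) to 1 (much smoother). The sketch below compares two arbitrary span values; the method and formula are spelled out only to silence the informational message, and the names p_wiggly and p_smooth are illustrative.

```r
library(ggplot2)

base <- ggplot(mpg, aes(displ, hwy)) + geom_point()

# Small span: the curve follows local wiggles in the data
p_wiggly <- base +
  geom_smooth(method = "loess", formula = y ~ x, span = 0.2)

# Large span: the curve shows only the broad trend
p_smooth <- base +
  geom_smooth(method = "loess", formula = y ~ x, span = 1)

p_wiggly
p_smooth
```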

  1. Jittering, geom_jitter(), adds a little random noise to the data which can help avoid overplotting.
#Exercise 6. Jittering.
library(ggplot2)
ggplot(mpg, aes(drv, hwy)) + geom_jitter()

  2. Boxplots, geom_boxplot(), summarise the shape of the distribution with a handful of summary statistics.
#Exercise 7. Boxplots.
library(ggplot2)
ggplot(mpg, aes(drv, hwy)) + geom_boxplot()

  3. Violin plots, geom_violin(), show a compact representation of the “density” of the distribution, highlighting the areas where more points are found.
#Exercise 8. Violin plots.
library(ggplot2)
ggplot(mpg, aes(drv, hwy)) + geom_violin()
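
The "handful of summary statistics" behind a boxplot can be inspected with layer_data(), which returns the values ggplot2 computed for a layer. This is a small sketch for checking what the boxes represent; the names p_box, box_stats and medians are illustrative.

```r
library(ggplot2)

p_box <- ggplot(mpg, aes(drv, hwy)) + geom_boxplot()

# For a boxplot layer the computed data includes the whisker ends
# (ymin, ymax), the hinges (lower, upper) and the median (middle),
# with one row per value of drv
box_stats <- layer_data(p_box)[, c("ymin", "lower", "middle", "upper", "ymax")]
box_stats

# Cross-check: the "middle" column should match the per-group median
medians <- tapply(mpg$hwy, mpg$drv, median)
medians
```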

Labels

A variation on geom_text() is geom_label(): it draws a rounded rectangle behind the text. This makes it useful for adding labels to plots with busy backgrounds:

#Exercise 9. Labels.
library(ggplot2)
label <- data.frame(
  waiting = c(55, 80),
  eruptions = c(2, 4.3),
  label = c("peak one", "peak two")
)
ggplot(faithfuld, aes(waiting, eruptions)) +
  geom_tile(aes(fill = density)) +
  geom_label(data = label, aes(label = label))
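
For comparison, the sketch below draws the same two labels with geom_text(), which omits the background rectangle, so the text sits directly on the tiles. The data frame name peaks and the plot name p_text are illustrative; the label positions are the ones used for the geom_label() plot.

```r
library(ggplot2)

# Label positions and text
peaks <- data.frame(
  waiting = c(55, 80),
  eruptions = c(2, 4.3),
  label = c("peak one", "peak two")
)

# geom_text() draws the text only, with no rectangle behind it
p_text <- ggplot(faithfuld, aes(waiting, eruptions)) +
  geom_tile(aes(fill = density)) +
  geom_text(data = peaks, aes(label = label))
p_text
```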

Collective Geoms

Geoms can be roughly divided into individual and collective geoms. An individual geom draws a distinct graphical object for each observation (row). For example, the point geom draws one point per row. A collective geom displays multiple observations with one geometric object. This may be a result of a statistical summary, like a boxplot, or may be fundamental to the display of the geom, like a polygon. Lines and paths fall somewhere in between: each line is composed of a set of straight segments, but each segment represents two points. How do we control the assignment of observations to graphical elements? This is the job of the group aesthetic.

#Exercise 10. Collective Geoms.
library(ggplot2)
data(Oxboys,package = "nlme")
head(Oxboys)
##   Subject     age height Occasion
## 1       1 -1.0000  140.5        1
## 2       1 -0.7479  143.4        2
## 3       1 -0.4630  144.8        3
## 4       1 -0.1643  147.1        4
## 5       1 -0.0027  147.7        5
## 6       1  0.2466  150.2        6
ggplot(Oxboys, aes(Occasion, height)) +
  geom_boxplot()
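
The role of the group aesthetic described above can be sketched with the same Oxboys data: mapping group to Subject asks geom_line() to draw one line per boy instead of a single line through all observations. The plot name p_grouped is illustrative.

```r
library(ggplot2)
data(Oxboys, package = "nlme")

# group = Subject tells collective geoms which rows belong together,
# so each boy's measurements are connected by their own line
p_grouped <- ggplot(Oxboys, aes(age, height, group = Subject)) +
  geom_point() +
  geom_line()
p_grouped

# The line layer should contain one group per subject
length(unique(layer_data(p_grouped, 2)$group))
```

Without the group mapping, geom_line() would connect all rows in order of age, producing a single zigzag line.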

Surface Plots

ggplot2 does not support true 3d surfaces. However, it does support many common tools for representing 3d surfaces in 2d: contours, coloured tiles and bubble plots. These all work similarly, differing only in the aesthetic used for the third dimension.

#Exercise 11. Contours.
library(ggplot2)
# after_stat(level) replaces the older ..level.. notation
ggplot(faithfuld, aes(eruptions, waiting)) +
  geom_contour(aes(z = density, colour = after_stat(level)))

#Exercise 12. Coloured tiles.
library(ggplot2)
ggplot(faithfuld, aes(eruptions, waiting)) +
  geom_raster(aes(fill = density))

#Exercise 13. Bubble plots work better with fewer observations.
library(ggplot2)
small <- faithfuld[seq(1, nrow(faithfuld), by = 10), ]
ggplot(small, aes(eruptions, waiting)) +
  geom_point(aes(size = density), alpha = 1/3) +
  scale_size_area()

#Exercise 14. Linear model ("lm").
library(ggplot2)
ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  geom_smooth(aes(colour = "loess"), method = "loess", se = FALSE) +
  geom_smooth(aes(colour = "lm"), method = "lm", se = FALSE) +
  labs(colour = "Method")

Drawing Maps

There are four types of map data you might want to visualise: vector boundaries, point metadata, area metadata, and raster images. Typically, assembling these datasets is the most challenging part of drawing maps. Unfortunately, ggplot2 can’t help you with that part of the analysis.

#Exercise 15. World map.
library(ggplot2)
# map_data() needs the maps package installed; coord_map() needs mapproj
world <- map_data("world")
worldmap <- ggplot(world, aes(long, lat, group = group)) +
  geom_path() +
  scale_y_continuous(NULL, breaks = (-2:3) * 30, labels = NULL) +
  scale_x_continuous(NULL, breaks = (-4:4) * 45, labels = NULL)
worldmap + coord_map()

#Exercise 16. A different perspective of the map (orthographic projection).
library(ggplot2)
worldmap + coord_map("ortho")

#Exercise 17. A different perspective of the map (stereographic projection).
library(ggplot2)
worldmap + coord_map("stereographic")