In this section we work through the exercises from the R Programming Bootcamp course using the ggplot2 package.

Recreate the plots shown below. Don’t worry if your plots don’t match exactly, as long as you have a general understanding of ggplot2 and the grammar of graphics. For the first few plots, use the mpg dataset.

library(ggthemes)
## Registered S3 methods overwritten by 'ggplot2':
##   method         from 
##   [.quosures     rlang
##   c.quosures     rlang
##   print.quosures rlang
library(ggplot2)
library(imager)
## Loading required package: magrittr
## 
## Attaching package: 'imager'
## The following object is masked from 'package:magrittr':
## 
##     add
## The following objects are masked from 'package:stats':
## 
##     convolve, spectrum
## The following object is masked from 'package:graphics':
## 
##     frame
## The following object is masked from 'package:base':
## 
##     save.image
head(mpg)
## # A tibble: 6 x 11
##   manufacturer model displ  year   cyl trans  drv     cty   hwy fl    class
##   <chr>        <chr> <dbl> <int> <int> <chr>  <chr> <int> <int> <chr> <chr>
## 1 audi         a4      1.8  1999     4 auto(~ f        18    29 p     comp~
## 2 audi         a4      1.8  1999     4 manua~ f        21    29 p     comp~
## 3 audi         a4      2    2008     4 manua~ f        20    31 p     comp~
## 4 audi         a4      2    2008     4 auto(~ f        21    30 p     comp~
## 5 audi         a4      2.8  1999     6 auto(~ f        16    26 p     comp~
## 6 audi         a4      2.8  1999     6 manua~ f        18    26 p     comp~

Histogram of hwy mpg values:

# Named image1 so that it does not shadow the graphics::image() function
image1 <- load.image("~/R/Imagenes/image1.png")
plot(image1)

We need to reproduce the plot above. It is a histogram that counts how many times each value of the hwy variable occurs.

ggplot(mpg, aes(x = hwy)) + geom_histogram(bins = 20, fill = "red", alpha = 0.5)
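As a small sketch of how the binning can be controlled, the version below sets the bin width directly instead of the number of bins. The width of 2 mpg and the names `p_hist` and `hist_data` are our own choices for illustration, not part of the exercise. `ggplot_build()` lets us look at the counts ggplot2 computed for each bin.

```r
library(ggplot2)

# Same histogram, but with an explicit bin width (assumed value: 2 mpg)
p_hist <- ggplot(mpg, aes(x = hwy)) +
  geom_histogram(binwidth = 2, fill = "red", alpha = 0.5)

# Extract the per-bin data that ggplot2 computed for the histogram layer
hist_data <- ggplot_build(p_hist)$data[[1]]
head(hist_data[, c("xmin", "xmax", "count")])
```

Every car falls into exactly one bin, so the bin counts should add up to the number of rows in mpg.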

Bar plot of car counts per manufacturer, with the fill color defined by cyl:

image2 <- load.image("~/R/Imagenes/image2.png")
plot(image2)

This is a bar chart, because the manufacturer variable is discrete.

ggplot(mpg, aes(x = manufacturer)) + geom_bar(aes(fill = factor(cyl)))
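To see what the bars represent, we can tabulate the same counts by hand. This is only a sketch: `counts` and `p_bar` are names we introduce here, and `position = "dodge"` is an optional variation that places the cylinder groups side by side instead of stacking them.

```r
library(ggplot2)

# The counts that geom_bar() draws: number of cars per manufacturer and cylinder count
counts <- table(mpg$manufacturer, mpg$cyl)
counts

# Same data with the cylinder groups side by side instead of stacked
p_bar <- ggplot(mpg, aes(x = manufacturer)) +
  geom_bar(aes(fill = factor(cyl)), position = "dodge")
p_bar
```

Wrapping cyl in `factor()` makes ggplot2 treat it as a discrete variable, so each cylinder count gets its own fill color rather than a shade on a continuous scale.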

Switch now to use the txhousing dataset that comes with ggplot2.

head(txhousing)
## # A tibble: 6 x 9
##   city     year month sales   volume median listings inventory  date
##   <chr>   <int> <int> <dbl>    <dbl>  <dbl>    <dbl>     <dbl> <dbl>
## 1 Abilene  2000     1    72  5380000  71400      701       6.3 2000 
## 2 Abilene  2000     2    98  6505000  58700      746       6.6 2000.
## 3 Abilene  2000     3   130  9285000  58100      784       6.8 2000.
## 4 Abilene  2000     4    98  9730000  68600      785       6.9 2000.
## 5 Abilene  2000     5   141 10590000  67300      794       6.8 2000.
## 6 Abilene  2000     6   156 13910000  66900      780       6.6 2000.

Create a scatterplot of volume versus sales. Afterwards, play around with the alpha and color arguments to make the information clearer.

pl <- ggplot(txhousing, aes(x = sales, y = volume)) + geom_point(alpha = 0.4, color = "blue")
pl
## Warning: Removed 568 rows containing missing values (geom_point).
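The warning above reports rows that geom_point() dropped because sales or volume is missing. A minimal sketch of how we might check this and filter those rows out beforehand follows; the names `missing_rows`, `tx_complete` and `pl_complete` are our own.

```r
library(ggplot2)

# Flag rows where either plotted variable is missing
missing_rows <- is.na(txhousing$sales) | is.na(txhousing$volume)
sum(missing_rows)

# Drop those rows up front so geom_point() has nothing to remove
tx_complete <- txhousing[!missing_rows, ]
pl_complete <- ggplot(tx_complete, aes(x = sales, y = volume)) +
  geom_point(alpha = 0.4, color = "blue")
pl_complete
```

Filtering first does not change the plot. It only makes the handling of missing values explicit instead of leaving it to the warning.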

Add a smooth fit line to the scatterplot above. Hint: you may need to look up geom_smooth().

pl + geom_smooth(color = "red")
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
## Warning: Removed 568 rows containing non-finite values (stat_smooth).
## Warning: Removed 568 rows containing missing values (geom_point).
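The message above shows that geom_smooth() picked a GAM smoother on its own. As a hedged alternative, the sketch below requests a straight-line fit explicitly with `method = "lm"`. This is a variation of ours, not what the exercise asks for, and `tx_complete`, `p_lm` and `fit` are names introduced for illustration.

```r
library(ggplot2)

# Keep only rows where both plotted variables are present
tx_complete <- txhousing[!is.na(txhousing$sales) & !is.na(txhousing$volume), ]

# Scatterplot with an explicitly chosen linear fit instead of the default smoother
p_lm <- ggplot(tx_complete, aes(x = sales, y = volume)) +
  geom_point(alpha = 0.4, color = "blue") +
  geom_smooth(method = "lm", formula = y ~ x, color = "red")
p_lm

# The same straight line fitted directly, to read off its coefficients
fit <- lm(volume ~ sales, data = tx_complete)
coef(fit)
```

Stating the method and formula also silences the informational message, since ggplot2 no longer has to choose them for us.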