In this section we work through the ggplot2 exercises from the R Programming Bootcamp course.
Recreate the plots shown below. Don’t worry if your plots don’t match them exactly, as long as you have a general understanding of ggplot2 and the grammar of graphics. For the first few plots, use the mpg dataset.
library(ggthemes)
## Registered S3 methods overwritten by 'ggplot2':
## method from
## [.quosures rlang
## c.quosures rlang
## print.quosures rlang
library(ggplot2)
library(imager)
## Loading required package: magrittr
##
## Attaching package: 'imager'
## The following object is masked from 'package:magrittr':
##
## add
## The following objects are masked from 'package:stats':
##
## convolve, spectrum
## The following object is masked from 'package:graphics':
##
## frame
## The following object is masked from 'package:base':
##
## save.image
head(mpg)
## # A tibble: 6 x 11
## manufacturer model displ year cyl trans drv cty hwy fl class
## <chr> <chr> <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
## 1 audi a4 1.8 1999 4 auto(~ f 18 29 p comp~
## 2 audi a4 1.8 1999 4 manua~ f 21 29 p comp~
## 3 audi a4 2 2008 4 manua~ f 20 31 p comp~
## 4 audi a4 2 2008 4 auto(~ f 21 30 p comp~
## 5 audi a4 2.8 1999 6 auto(~ f 16 26 p comp~
## 6 audi a4 2.8 1999 6 manua~ f 18 26 p comp~
Histogram of hwy mpg values:
image <- load.image("~/R/Imagenes/image1.png")
plot(image)
We need to reproduce the plot above. It is a histogram that counts how many times each value of the hwy variable occurs.
ggplot(mpg, aes(hwy)) + geom_histogram(bins=20,fill="red", alpha=0.5)
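As a rough check on what the histogram draws, the per-bin counts can be pulled out of the plot with `layer_data()` from ggplot2. This is only a sketch: the names `p` and `bin_counts` are illustrative and not part of the original exercise.

```r
library(ggplot2)

# Same histogram as in the exercise: hwy split into 20 bins.
p <- ggplot(mpg, aes(hwy)) +
  geom_histogram(bins = 20, fill = "red", alpha = 0.5)

# layer_data() returns the data frame computed for the first layer;
# its `count` column holds the height of each bar.
bin_counts <- layer_data(p, 1)$count

# Every row of mpg falls into exactly one bin, so the counts
# should add up to the number of rows.
sum(bin_counts) == nrow(mpg)
```

Changing `bins` changes how the same observations are grouped, but not their total.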
image2 <- load.image("~/R/Imagenes/image2.png")
plot(image2)
This is a bar chart, since manufacturer is a discrete variable. Each bar is filled according to the number of cylinders (cyl).
ggplot(mpg, aes(manufacturer)) + geom_bar(aes(fill=factor(cyl)))
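The heights of the stacked segments can be reproduced without plotting, using a base R cross-tabulation. This is a minimal sketch; `cyl_counts` and `bar_heights` are hypothetical names introduced here for illustration.

```r
library(ggplot2)  # provides the mpg dataset

# Cross-tabulate manufacturer against cyl; each cell is the height of
# one colored segment in the stacked bar chart.
cyl_counts <- xtabs(~ manufacturer + cyl, data = mpg)

# Row sums give the total height of each manufacturer's bar.
bar_heights <- rowSums(cyl_counts)
bar_heights
```

Wrapping `cyl` in `factor()` inside `aes()` is what makes ggplot2 treat it as a discrete fill instead of a continuous color scale.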
Switch now to use the txhousing dataset that comes with ggplot2.
head(txhousing)
## # A tibble: 6 x 9
## city year month sales volume median listings inventory date
## <chr> <int> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Abilene 2000 1 72 5380000 71400 701 6.3 2000
## 2 Abilene 2000 2 98 6505000 58700 746 6.6 2000.
## 3 Abilene 2000 3 130 9285000 58100 784 6.8 2000.
## 4 Abilene 2000 4 98 9730000 68600 785 6.9 2000.
## 5 Abilene 2000 5 141 10590000 67300 794 6.8 2000.
## 6 Abilene 2000 6 156 13910000 66900 780 6.6 2000.
Create a scatterplot of volume versus sales. Afterwards, play around with the alpha and color arguments to make the plot easier to read.
pl <- ggplot(txhousing,aes(x=sales, y=volume)) + geom_point(alpha=0.4, col="blue")
pl
## Warning: Removed 568 rows containing missing values (geom_point).
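The warning above comes from rows in which sales or volume is missing. A small sketch of how one might count those rows, and optionally drop them before plotting, is shown here; `incomplete`, `n_dropped`, and `tx_complete` are illustrative names, not part of the exercise.

```r
library(ggplot2)  # provides the txhousing dataset

# Rows where either plotted variable is NA cannot be drawn as points.
incomplete <- is.na(txhousing$sales) | is.na(txhousing$volume)
n_dropped <- sum(incomplete)
n_dropped

# Optional: remove those rows up front so geom_point() has nothing to drop.
tx_complete <- txhousing[!incomplete, ]
```

Plotting `tx_complete` instead of `txhousing` should give the same picture without the warning.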
Add a smooth fit line to the scatterplot from above. Hint: You may need to look up geom_smooth().
pl + geom_smooth(col="red")
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
## Warning: Removed 568 rows containing non-finite values (stat_smooth).
## Warning: Removed 568 rows containing missing values (geom_point).
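The message above shows that `geom_smooth()` picked a GAM smoother by default for this dataset. As an alternative, and not the choice made in the exercise, the smoothing method can be set explicitly. The sketch below swaps in a straight-line fit with `method = "lm"` and uses `na.rm = TRUE` to drop missing rows silently; `pl_lm` is a hypothetical name.

```r
library(ggplot2)

# Scatterplot as before, with an explicit linear fit instead of the
# default smoother. na.rm = TRUE removes rows with missing values
# without emitting a warning.
pl_lm <- ggplot(txhousing, aes(x = sales, y = volume)) +
  geom_point(alpha = 0.4, color = "blue", na.rm = TRUE) +
  geom_smooth(method = "lm", formula = y ~ x, color = "red", na.rm = TRUE)

pl_lm
```

A linear fit is easier to interpret than the default smoother, but it may hide curvature that the GAM line would show.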