ggplot2 is a package -> think of it as an app for making graphics
library(ggplot2)
library(dplyr)
library(skimr)
Every package is loaded by writing library(name) -> packages are like the extra apps on my phone
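To stretch the phone analogy a little (this extension is an assumption, not part of the class notes): installing a package is like downloading the app once, and library() is like opening it in each new session. A minimal sketch:

```r
# One time only, like downloading the app (commented out so it does not reinstall):
# install.packages("ggplot2")

# Every new R session, like opening the app:
library(ggplot2)

# search() lists what is currently loaded; ggplot2 should appear in it
search()
```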
dplyr::glimpse(mpg)
Observations: 234
Variables: 11
$ manufacturer <chr> "audi", "audi", "audi", "audi", "audi", "audi", "audi", "audi", "audi", "...
$ model <chr> "a4", "a4", "a4", "a4", "a4", "a4", "a4", "a4 quattro", "a4 quattro", "a4...
$ displ <dbl> 1.8, 1.8, 2.0, 2.0, 2.8, 2.8, 3.1, 1.8, 1.8, 2.0, 2.0, 2.8, 2.8, 3.1, 3.1...
$ year <int> 1999, 1999, 2008, 2008, 1999, 1999, 2008, 1999, 1999, 2008, 2008, 1999, 1...
$ cyl <int> 4, 4, 4, 4, 6, 6, 6, 4, 4, 4, 4, 6, 6, 6, 6, 6, 6, 8, 8, 8, 8, 8, 8, 8, 8...
$ trans <chr> "auto(l5)", "manual(m5)", "manual(m6)", "auto(av)", "auto(l5)", "manual(m...
$ drv <chr> "f", "f", "f", "f", "f", "f", "f", "4", "4", "4", "4", "4", "4", "4", "4"...
$ cty <int> 18, 21, 20, 21, 16, 18, 18, 18, 16, 20, 19, 15, 17, 17, 15, 15, 17, 16, 1...
$ hwy <int> 29, 29, 31, 30, 26, 26, 27, 26, 25, 28, 27, 25, 25, 25, 25, 24, 25, 23, 2...
$ fl <chr> "p", "p", "p", "p", "p", "p", "p", "p", "p", "p", "p", "p", "p", "p", "p"...
$ class <chr> "compact", "compact", "compact", "compact", "compact", "compact", "compac...
glimpse is for exploring data: it gives an overview of what we have.
int: integer (whole numbers); dbl: double, higher precision, numbers with decimals; chr: character (words, text)
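The three types can be checked on single values with base R's typeof(). The variable names below are made up for illustration; the values mirror ones seen in the mpg output above.

```r
year_value  <- 2008L    # the L suffix makes it an integer (int)
displ_value <- 1.8      # a number with decimals is a double (dbl)
model_value <- "audi"   # quoted text is a character string (chr)

typeof(year_value)   # "integer"
typeof(displ_value)  # "double"
typeof(model_value)  # "character"
```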
skim(mpg)
Skim summary statistics
n obs: 234
n variables: 11
-- Variable type:character -----------------------------------------------------
variable missing complete n min max empty n_unique
class 0 234 234 3 10 0 7
drv 0 234 234 1 1 0 3
fl 0 234 234 1 1 0 5
manufacturer 0 234 234 4 10 0 15
model 0 234 234 2 22 0 38
trans 0 234 234 8 10 0 10
-- Variable type:integer -------------------------------------------------------
variable missing complete n mean sd p0 p25 p50 p75 p100 hist
cty 0 234 234 16.86 4.26 9 14 17 19 35 ▅▇▇▇▁▁▁▁
cyl 0 234 234 5.89 1.61 4 4 6 8 8 ▇▁▁▇▁▁▁▇
hwy 0 234 234 23.44 5.95 12 18 24 27 44 ▃▇▃▇▅▁▁▁
year 0 234 234 2003.5 4.51 1999 1999 2003.5 2008 2008 ▇▁▁▁▁▁▁▇
-- Variable type:numeric -------------------------------------------------------
variable missing complete n mean sd p0 p25 p50 p75 p100 hist
displ 0 234 234 3.47 1.29 1.6 2.4 3.3 4.6 7 ▇▇▅▅▅▃▂▁
skim: gives a statistical summary, with more information than glimpse in a compact form.
Data: taken from mpg. Variables: represented as points.
ggplot() +
geom_point(data = mpg, mapping = aes(x = displ, y = hwy, color = class))
drv: variable that tells us whether the car has front-wheel, rear-wheel, or four-wheel drive.
4 = four-wheel drive (4x4); f = front-wheel drive; r = rear-wheel drive
split(mpg, mpg$drv)
$`4`
$f
$r
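To see what split() returns without printing every row, a small sketch like the following may help. The name by_drive is hypothetical.

```r
library(ggplot2)  # provides the mpg data set

# split() returns a named list with one data frame per value of drv
by_drive <- split(mpg, mpg$drv)

# the list names are the drive types
names(by_drive)

# number of rows (cars) in each piece
sapply(by_drive, nrow)
```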
ggplot(mpg) +
  geom_point(aes(x = displ, y = hwy, colour = class)) +
  facet_wrap(~ drv)
aes: aesthetic attribute (the name comes from the English word "aesthetics")
This line chart makes no visual sense for the data we have collected.
Bar chart
ggplot(data = mpg) +
  geom_bar(aes(x = drv))
Ways to create illustrative charts
data_autos_resumida <- tribble(
~ tipo_traccion, ~ num_obs,
"4" , 104,
"f" , 102,
"r" , 25
)
data_autos_resumida
Bar chart built from our already-summarized data (it matches the previous chart).
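The notes do not show the code for this chart. One possible sketch, assuming the totals are computed with dplyr's count() rather than typed by hand (the name drive_counts is hypothetical), uses geom_col(), which plots the y values as given instead of counting rows the way geom_bar() does:

```r
library(ggplot2)
library(dplyr)

# Count the rows per drive type so the totals always agree with the data
drive_counts <- count(mpg, drv, name = "num_obs")

# geom_col() takes the bar heights from num_obs directly
p <- ggplot(data = drive_counts) +
  geom_col(aes(x = drv, y = num_obs))
p
```

The same geom_col() layer should also work on a hand-written table such as data_autos_resumida, with x = tipo_traccion and y = num_obs.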
Layer: a geometric representation, for example geom_point
ggplot(data = mpg) +
  geom_smooth(mapping = aes(x = displ, y = hwy))
ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = class)) +
  geom_smooth(mapping = aes(x = displ, y = hwy))
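The two-layer chart repeats the same x and y mapping in each layer. As a sketch of an alternative, a mapping placed in ggplot() is inherited by every layer, so each layer only needs to state what is specific to it:

```r
library(ggplot2)

# x and y are set once in ggplot() and shared by both layers
p <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point(aes(color = class)) +  # color applies only to the points
  geom_smooth()                     # one smooth line over all the cars
p
```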
Constant: a fixed value, for example a single color, is written outside aes().
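A short sketch of the difference, with hypothetical object names; "blue" is just an example value:

```r
library(ggplot2)

# color inside aes(): mapped to a variable, one color per class, with a legend
p_mapped <- ggplot(mpg) +
  geom_point(aes(x = displ, y = hwy, color = class))

# color outside aes(): a constant, every point is blue, no legend
p_constant <- ggplot(mpg) +
  geom_point(aes(x = displ, y = hwy), color = "blue")
```

Writing color = "blue" inside aes() would instead treat "blue" as a data value and draw a legend for it, which is a common source of confusion.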
Bubble chart: a scatter plot whose points have different sizes (size)
library(dplyr)
library(gapminder)  # provides the gapminder data set
data_americas <- filter(gapminder, continent == "Americas")
data_americas
dplyr::glimpse(data_americas)
Observations: 300
Variables: 6
$ country <fct> Argentina, Argentina, Argentina, Argentina, Argentina, Argent...
$ continent <fct> Americas, Americas, Americas, Americas, Americas, Americas, A...
$ year <int> 1952, 1957, 1962, 1967, 1972, 1977, 1982, 1987, 1992, 1997, 2...
$ lifeExp <dbl> 62.485, 64.399, 65.142, 65.634, 67.065, 68.481, 69.942, 70.77...
$ pop <int> 17876956, 19610538, 21283783, 22934225, 24779799, 26983828, 2...
$ gdpPercap <dbl> 5911.315, 6856.856, 7133.166, 8052.953, 9443.039, 10079.027, ...
ggplot(data_americas) +
  geom_point(mapping = aes(x = gdpPercap, y = lifeExp, size = pop))
This chart shows that life expectancy tends to be higher where GDP per capita is higher.
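One way to back up that reading with a number is a rank correlation, sketched below. The choice of Spearman's method is an assumption made here because the relationship in the chart does not look like a straight line; it is not something the notes prescribe.

```r
library(dplyr)
library(gapminder)

data_americas <- filter(gapminder, continent == "Americas")

# Spearman's rho: positive when higher GDP per capita goes with higher life expectancy
rho <- cor(data_americas$gdpPercap, data_americas$lifeExp, method = "spearman")
rho
```

A positive value describes an association across countries and years; on its own it does not show that GDP causes longer lives.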
10-4