datos
The datos package contains datasets for practicing in Spanish.
results <- help.search("datasets", package = "datos")
results$matches[c("Name", "Title")]
library(datos)  # provides the vehiculos dataset
actuales <- vehiculos
glimpse() to take a look at the data
glimpse(actuales)
## Rows: 33,442
## Columns: 12
## $ id <dbl> 13309, 13310, 13311, 14038, 14039, 14040, 14834, 14835, 1…
## $ fabricante <chr> "Acura", "Acura", "Acura", "Acura", "Acura", "Acura", "Ac…
## $ modelo <chr> "2.2CL/3.0CL", "2.2CL/3.0CL", "2.2CL/3.0CL", "2.3CL/3.0CL…
## $ anio <dbl> 1997, 1997, 1997, 1998, 1998, 1998, 1999, 1999, 1999, 199…
## $ clase <chr> "Automóviles subcompactos", "Automóviles subcompactos", "…
## $ transmision <chr> "Automática 4-velocidades", "Manual 5-velocidades", "Auto…
## $ traccion <chr> "Delantera", "Delantera", "Delantera", "Delantera", "Dela…
## $ cilindros <dbl> 4, 4, 6, 4, 4, 6, 4, 4, 6, 5, 5, 6, 5, 6, 5, 6, 6, 6, 6, …
## $ motor <dbl> 2.2, 2.2, 3.0, 2.3, 2.3, 3.0, 2.3, 2.3, 3.0, 2.5, 2.5, 3.…
## $ combustible <chr> "Regular", "Regular", "Regular", "Regular", "Regular", "R…
## $ autopista <dbl> 26, 28, 26, 27, 29, 26, 27, 29, 26, 23, 23, 22, 23, 22, 2…
## $ ciudad <dbl> 20, 22, 18, 19, 21, 17, 20, 21, 17, 18, 18, 17, 18, 17, 1…
skim(), from the skimr package, provides much more information
skim(actuales)
| Name | actuales |
| Number of rows | 33442 |
| Number of columns | 12 |
| _______________________ | |
| Column type frequency: | |
| character | 6 |
| numeric | 6 |
| ________________________ | |
| Group variables | None |
Variable type: character
| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| fabricante | 0 | 1 | 3 | 34 | 0 | 128 | 0 |
| modelo | 0 | 1 | 1 | 39 | 0 | 3198 | 0 |
| clase | 0 | 1 | 4 | 44 | 0 | 34 | 0 |
| transmision | 8 | 1 | 10 | 43 | 0 | 45 | 0 |
| traccion | 0 | 1 | 7 | 32 | 0 | 7 | 0 |
| combustible | 0 | 1 | 6 | 26 | 0 | 13 | 0 |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 |
|---|---|---|---|---|---|---|---|---|---|
| id | 0 | 1 | 17038.30 | 10087.01 | 1 | 8361.25 | 16723.5 | 25264.75 | 34932.0 |
| anio | 0 | 1 | 1999.11 | 9.38 | 1984 | 1991.00 | 1999.0 | 2008.00 | 2015.0 |
| cilindros | 58 | 1 | 5.77 | 1.74 | 2 | 4.00 | 6.0 | 6.00 | 16.0 |
| motor | 57 | 1 | 3.35 | 1.36 | 0 | 2.30 | 3.0 | 4.30 | 8.4 |
| autopista | 0 | 1 | 23.55 | 6.21 | 9 | 19.00 | 23.0 | 27.00 | 109.0 |
| ciudad | 0 | 1 | 17.49 | 5.58 | 6 | 15.00 | 17.0 | 20.00 | 138.0 |
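Note that complete_rate reads 1 even for columns that have missing values. Assuming complete_rate is the share of non-missing values in each column and that the rendered table rounds it, the counts in the skim() output can be checked by hand. The variable names below are only for illustration.

```r
# Counts copied from the skim() output
n_rows <- 33442
n_missing <- c(transmision = 8, cilindros = 58, motor = 57)

# Assumed definition: share of non-missing values per column
complete_rate <- 1 - n_missing / n_rows

# Unrounded, the rates are slightly below 1
print(complete_rate)

# Rounded to two decimals, they all display as 1
print(round(complete_rate, 2))
```

So a complete_rate of 1 in the table does not guarantee a column is fully populated; n_missing is the column to trust.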
correlate(), from the corrr package, works with the pipe (%>%) to produce a correlation table
actuales %>%
  select_if(is.numeric) %>%
  correlate()
##
## Correlation method: 'pearson'
## Missing treated using: 'pairwise.complete.obs'
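The message about 'pairwise.complete.obs' means that each pair of columns is correlated using only the rows where both values are present. A minimal base R sketch of that behavior follows; the toy data frame is made up for illustration and is not part of datos.

```r
# Hypothetical toy data with missing values in different rows
toy <- data.frame(
  x = c(1, 2, 3, 4, NA),
  y = c(2, 4, 6, NA, 10),
  z = c(5, 3, 4, 1, 2)
)

# Base R call using the same settings reported in the message
cor_pairwise <- cor(toy, method = "pearson", use = "pairwise.complete.obs")

# x and y are both present only in rows 1 to 3, where y is exactly 2 * x
print(cor_pairwise["x", "y"])

# With the base R default (use = "everything"), a missing value propagates
cor_default <- cor(toy)
print(cor_default["x", "y"])
```

Pairwise deletion keeps as much data as possible, at the cost that different cells of the table may be based on different subsets of rows.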
rplot() easily creates a plot
actuales %>%
  select_if(is.numeric) %>%
  correlate() %>%
  rplot()
##
## Correlation method: 'pearson'
## Missing treated using: 'pairwise.complete.obs'
## Don't know how to automatically pick scale for object of type noquote. Defaulting to continuous.
shave() removes the duplicate values (each pair of variables appears twice in a correlation table)
actuales %>%
  select_if(is.numeric) %>%
  correlate() %>%
  shave() %>%
  rplot()
##
## Correlation method: 'pearson'
## Missing treated using: 'pairwise.complete.obs'
## Don't know how to automatically pick scale for object of type noquote. Defaulting to continuous.
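Because a correlation matrix is symmetric, every pair of variables shows up twice. The idea behind shave() can be sketched in base R by blanking one triangle of the matrix. This only illustrates the concept on a plain matrix with made-up data; it is not a claim about how corrr implements it.

```r
# Hypothetical toy data (not from datos)
toy <- data.frame(
  x = c(1, 2, 3, 4, 5),
  y = c(2, 1, 4, 3, 6),
  z = c(5, 3, 4, 1, 2)
)

cors <- cor(toy)

# The matrix is symmetric, so the upper triangle repeats the lower one
print(isSymmetric(cors))

# Set the upper triangle to NA so each pair is kept only once
shaved <- cors
shaved[upper.tri(shaved)] <- NA
print(shaved)
```

Dropping one triangle leaves the same information while making the table and the resulting plot easier to read.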