In this document we identify the common object types in R: vectors, matrices, data frames, and functions.
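As a minimal sketch of the four object types named above, the block below builds one of each. The values are taken from the first rows of the housing data shown in this document; the names `precios`, `m`, `viviendas`, and `precio_por_m2` are hypothetical and used only for illustration.

```r
# A numeric vector: every element has the same type
precios <- c(880, 1200, 250)

# A matrix: values of a single type arranged in rows and columns
m <- matrix(1:6, nrow = 2, ncol = 3)

# A data frame: columns can hold different types
viviendas <- data.frame(
  Tipo = c("Casa", "Casa", "Apartamento"),
  precio_millon = precios,
  Area_contruida = c(237, 800, 86)
)

# A function (hypothetical helper): price per square metre
precio_por_m2 <- function(precio, area) precio / area

# Inspect the class of each object
class(precios)
class(m)
class(viviendas)
class(precio_por_m2)

# Apply the function to two columns of the data frame
viviendas$precio_m2 <- precio_por_m2(viviendas$precio_millon, viviendas$Area_contruida)
```

`class()` is a quick way to check what kind of object a name refers to before passing it to another function.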
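# Load table1 for summary tables and readxl to read the Excel file
library(table1)
library(readxl)
# Read the housing data and preview the first 5 rows
Datos_Vivienda = read_excel("~/Desktop/Datos_Vivienda.xlsx")
head(Datos_Vivienda,5)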
Zona | piso | Estrato | precio_millon | Area_contruida | parqueaderos | Banos | Habitaciones | Tipo | Barrio | cordenada_longitud | Cordenada_latitud |
---|---|---|---|---|---|---|---|---|---|---|---|
Zona Sur | 2 | 6 | 880 | 237 | 2 | 5 | 4 | Casa | pance | -76.46300 | 3.43000 |
Zona Oeste | 2 | 4 | 1200 | 800 | 3 | 6 | 7 | Casa | miraflores | -76.46400 | 3.42800 |
Zona Sur | 3 | 5 | 250 | 86 | NA | 2 | 3 | Apartamento | multicentro | -76.46400 | 3.42900 |
Zona Sur | NA | 6 | 1280 | 346 | 4 | 6 | 5 | Apartamento | ciudad jardín | -76.46400 | 3.43300 |
Zona Sur | 2 | 6 | 1300 | 600 | 4 | 7 | 5 | Casa | pance | -76.46438 | 3.43463 |
The dataset contains a total of 8321 records.
Datos_Vivienda=na.omit(Datos_Vivienda)
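The effect of `na.omit()` can be checked on a small example. The sketch below uses a hypothetical data frame `toy` (not the real `Datos_Vivienda`) to show that every row containing at least one `NA` is dropped, and how to count missing values per column beforehand.

```r
# Hypothetical toy data with missing values in two different columns
toy <- data.frame(
  piso = c(2, NA, 3, 2),
  parqueaderos = c(2, 3, NA, 4),
  precio_millon = c(880, 1200, 250, 1300)
)

# Number of missing values in each column
na_por_columna <- colSums(is.na(toy))
na_por_columna

# Rows before and after removing incomplete records
n_before <- nrow(toy)
toy_complete <- na.omit(toy)
n_after <- nrow(toy_complete)

n_before
n_after
```

Comparing `nrow()` before and after the call is a simple way to see how many records a complete-case filter removes.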
table1(~precio_millon+Zona|Tipo,data = Datos_Vivienda)
| | Apartamento (N=5099) | Casa (N=3219) | Overall (N=8318) |
---|---|---|---|
precio_millon | | | |
Mean (SD) | 367 (289) | 540 (358) | 434 (329) |
Median [Min, Max] | 279 [58.0, 1950] | 430 [77.0, 2000] | 330 [58.0, 2000] |
Zona | | | |
Zona Centro | 24 (0.5%) | 100 (3.1%) | 124 (1.5%) |
Zona Norte | 1198 (23.5%) | 722 (22.4%) | 1920 (23.1%) |
Zona Oeste | 1029 (20.2%) | 169 (5.3%) | 1198 (14.4%) |
Zona Oriente | 62 (1.2%) | 289 (9.0%) | 351 (4.2%) |
Zona Sur | 2786 (54.6%) | 1939 (60.2%) | 4725 (56.8%) |
table1(~Zona,data = Datos_Vivienda)
| | Overall (N=8318) |
---|---|
Zona | |
Zona Centro | 124 (1.5%) |
Zona Norte | 1920 (23.1%) |
Zona Oeste | 1198 (14.4%) |
Zona Oriente | 351 (4.2%) |
Zona Sur | 4725 (56.8%) |
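The percentages in the table above can be reproduced from the counts with base R. This is only a sketch: the counts are typed in by hand from the table, and the name `conteo_zona` is hypothetical. With the actual data, `table(Datos_Vivienda$Zona)` would supply the counts.

```r
# Counts per zone, copied from the table above
conteo_zona <- c(
  "Zona Centro"  = 124,
  "Zona Norte"   = 1920,
  "Zona Oeste"   = 1198,
  "Zona Oriente" = 351,
  "Zona Sur"     = 4725
)

# Total number of records after removing missing values
total <- sum(conteo_zona)
total

# Share of each zone, as a percentage rounded to one decimal
porcentaje <- round(100 * prop.table(conteo_zona), 1)
porcentaje
```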
We use ggplot2 for static visualization and plotly for interactive visualization.
require(ggplot2)
## Loading required package: ggplot2
require(plotly)
## Loading required package: plotly
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
g1=ggplot(Datos_Vivienda,aes(x=precio_millon,colour=Tipo)) + geom_histogram() +facet_grid(~Tipo) +theme_bw()
g1
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
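The message above appears because no bin specification was given. A minimal sketch of setting `binwidth` explicitly follows. It uses a hypothetical simulated data frame `toy` rather than `Datos_Vivienda`, and the width of 100 (million) is an arbitrary choice for illustration, not a recommendation from the original analysis.

```r
library(ggplot2)

# Hypothetical simulated prices standing in for the real data
set.seed(1)
toy <- data.frame(
  precio_millon = c(runif(50, 58, 1950), runif(50, 77, 2000)),
  Tipo = rep(c("Apartamento", "Casa"), each = 50)
)

# Same layout as g1, but with an explicit bin width of 100
g <- ggplot(toy, aes(x = precio_millon, colour = Tipo)) +
  geom_histogram(binwidth = 100) +
  facet_grid(~Tipo) +
  theme_bw()
g
```

Setting `binwidth` in the units of the variable usually makes the histogram easier to read than accepting the default of 30 bins.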
g2=ggplot(Datos_Vivienda,aes(x=precio_millon,y=Zona,colour=Zona))+geom_boxplot()
g2
ggplotly(g2)
## Warning: The following aesthetics were dropped during statistical transformation:
## x_plotlyDomain
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
g2=ggplot(Datos_Vivienda,aes(y=precio_millon,x=Zona,fill=Zona))+geom_boxplot()
ggplotly(g2)
## Warning: The following aesthetics were dropped during statistical transformation:
## y_plotlyDomain
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
ggplot(Datos_Vivienda,aes(y=precio_millon,x=Tipo,fill=Tipo))+geom_boxplot()+facet_wrap(~Zona)
require(leaflet)
## Loading required package: leaflet
leaflet() %>% addTiles() %>% addCircleMarkers(lng = Datos_Vivienda$cordenada_longitud,lat = Datos_Vivienda$Cordenada_latitud,radius = 0.2)
leaflet() %>% addTiles() %>% addCircleMarkers(lng = Datos_Vivienda$cordenada_longitud,lat = Datos_Vivienda$Cordenada_latitud,radius = 0.2,clusterOptions= markerClusterOptions())