José F. Zea
March 4th, 2017
Service platforms enable customers and users to download the data in many open formats (CSV, JSON, etc.), so there is no risk of lock-in at the data level.
Socrata is a software-as-a-service platform that provides a cloud-based solution for open data publishing and visualization.
CKAN is an open source project, developed by the Open Knowledge Foundation, that lets users provision open data catalogs and, in some cases, visualizations and APIs.
Successful cases: Colombia open data portal, Mexico open data portal, Paraguay open data portal, etc.
get_socrata_metada: Fetches detailed metadata of open datasets (extracts the JSON and a basic list of the datasets available from a Socrata domain).
search_data: Returns a list of the available datasets matching the given keywords/tags. The list contains four data frames, one per data type: tabular, geo, href and blobby.
eda_opendata: Generates a basic report in R Markdown or Shiny from selected datasets of an open data portal.
Description Fetches detailed metadata of open datasets (extracts the JSON and a basic list of the datasets available from a Socrata domain).
Usage
get_socrata_metada(url)
Arguments
Details An R data frame containing a listing of datasets along with detailed metadata. The following fields are preserved for every open dataset:
Value The function returns a data frame with detailed information about the datasets.
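To make the shape of that result concrete, here is a minimal base R sketch of how a parsed catalogue might be flattened into a data frame. It does not reproduce the internals of get_socrata_metada: the helper name `flatten_catalog`, the two sample entries and the selected fields (`identifier`, `title`, `keyword`, `modified`) are assumptions chosen for illustration.

```r
# Made-up sample standing in for a parsed JSON catalogue (not real portal data).
catalog <- list(
  list(identifier = "aaaa-0001", title = "Sample dataset A",
       keyword = c("health", "2016"), modified = "2017-01-15"),
  list(identifier = "aaaa-0002", title = "Sample dataset B",
       keyword = character(0), modified = "2017-02-20")
)

# Hypothetical helper: one row per dataset, keywords collapsed into one string.
flatten_catalog <- function(entries) {
  rows <- lapply(entries, function(e) {
    data.frame(
      id       = e$identifier,
      name     = e$title,
      tags     = paste(e$keyword, collapse = ";"),
      modified = as.Date(e$modified),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

metadata <- flatten_catalog(catalog)
print(metadata)
```

Collapsing the keywords into a single string keeps one row per dataset, which makes later filtering by tag a plain text match.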
Description Returns a list of the available datasets matching the given keywords/tags. The list contains four data frames, one per data type: tabular, geo, href and blobby.
Usage
search_data(metadata, tags)
Arguments
Details The selected sample is drawn according to a selection-rejection (list-sequential) algorithm
Value The function returns a list of 4 data frames. They contain information on the tabular, geo, href and blobby data of the open data portal.
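The behaviour described for search_data can be sketched in base R as a filter on tags followed by a split on data type. This is only an illustration, not the package's implementation: the helper `search_by_tags`, the toy metadata and its column names (`viewType`, `tags`) are assumptions.

```r
# Toy metadata (invented rows); real metadata would come from get_socrata_metada().
metadata <- data.frame(
  id       = c("a1", "b2", "c3", "d4"),
  name     = c("Hospitals", "Road map", "Budget PDF", "External link"),
  viewType = c("tabular", "geo", "blobby", "href"),
  tags     = c("health;public", "transport", "budget;public", "health"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep datasets whose tags mention any requested keyword,
# then return one data frame per data type.
search_by_tags <- function(metadata, tags) {
  pattern <- paste(tolower(tags), collapse = "|")
  hits <- metadata[grepl(pattern, tolower(metadata$tags)), , drop = FALSE]
  types <- c("tabular", "geo", "href", "blobby")
  result <- lapply(types, function(t) hits[hits$viewType == t, , drop = FALSE])
  names(result) <- types
  result
}

found <- search_by_tags(metadata, tags = "health")
str(found, max.level = 1)
```

Keywords are treated as regular expressions in this sketch; a real implementation would probably escape them or match whole tags.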
id | name | n_row | n_col | size | elapsed_time |
---|---|---|---|---|---|
229w-qzrf | DATOS ABIERTOS BOGOTA | 16 | 11 | 2.2 Kb | 1.72 |
45pp-fbx3 | RESERVA DE CUPOS | 18 | 7 | 23.8 Mb | 77.69 |
63i4-nng2 | DNP-BASE EJEC PTAL 02-15 (ACT 30 ABRIL 15) | 9 | 6 | 4.5 Mb | 22.13 |
9vy2-biux | LISTA DE JUNTAS DE ACCION COMUNAL MUNICIPIO DE TESALIA - HUILA | 12 | 9 | 6.1 Kb | 0.92 |
b4cc-cqqu | MEDICION 2 SENSÓRICA CONDUCTUAL CHINCHINA | 10 | 11 | 8.1 Kb | 0.95 |
d8dw-68hx | SALAS DE CINE Y CINEMATECAS EN LA CIUDAD DE BOGOTÁ | 13 | 3 | 21.5 Kb | 0.90 |
dmmd-s8ju | ENTIDADES ACREDITADAS CON AVAL A 31 DE ENERO DE 2017 | 17 | 4 | 60.8 Kb | 1.00 |
e2it-6w34 | REGISTRO USUARIOS DE ASISTENCIA TÉCNICA BUSBANZA | 7 | 5 | 37.4 Kb | 0.92 |
f8ve-dac2 | CADENAS PRODUCTIVAS DE TUTAZÁ | 3 | 1 | 5 Kb | 0.98 |
gr6q-6jhn | PARTICIPACIÓN PROVEEDORES POR DEMANDA | 2 | 13 | 3.7 Kb | 0.87 |
ih48-erzn | SÍFILIS GESTACIONAL EN EL DEPARTAMENTO DE NARIÑO AÑO 2008 A 2015 (GOBERNACIÓN DE NARIÑO) | 15 | 8 | 20.9 Kb | 0.90 |
m8cr-nt44 | ACUEDUCTOS VEREDALES CHÍQUIZA | 4 | 12 | 3.6 Kb | 0.97 |
qqbd-442m | ESTACIONES DE SERVICIO DE MISTRATÓ RISARALDA | 1 | 11 | 1.3 Kb | 0.99 |
s52q-9chx | OFERTA HIDRICA JUNIO 2015 | 5 | 10 | 26.9 Kb | 0.98 |
snc5-vevu | CONSOLIDADO DE PROCESOS DE SELECCIÓN | 14 | 2 | 328 Kb | 1.25 |
tdkw-xhkb | LISTADO DE DROGUERIAS -TUNJA | 11 | 2 | 9.2 Kb | 0.92 |
tg45-q549 | ROTACIÓN DEPORTIVA ESCOLAR COPACABANA 2016 | 8 | 13 | 5.2 Kb | 1.00 |
udix-2txh | INTERNET POR SUSCRIPCIÓN Y TECNOLOGÍA | 6 | 11 | 2.1 Kb | 1.00 |
wheb-axi3 | SITIOS TURÍSTICOS-6 | 2 | 14 | 5.9 Kb | 0.95 |
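Columns such as n_row, n_col, size and elapsed_time can be gathered with base R alone. The sketch below is an assumption about how such a row could be produced, not the package's code: `profile_dataset` is a hypothetical helper, and the built-in `iris` data stands in for a download so that the example runs offline.

```r
# Hypothetical helper: time a loader function and describe the data frame it returns.
profile_dataset <- function(id, name, loader) {
  # loader() would normally download the dataset; df is assigned in this function's frame.
  timing <- system.time(df <- loader())
  data.frame(
    id           = id,
    name         = name,
    n_row        = nrow(df),
    n_col        = ncol(df),
    size         = format(object.size(df), units = "auto"),
    elapsed_time = round(timing[["elapsed"]], 2),
    stringsAsFactors = FALSE
  )
}

# Offline stand-in for a real download (invented id and name).
profile <- profile_dataset("demo-0001", "IRIS (BUILT-IN SAMPLE)", function() iris)
print(profile)
```

Binding one such row per dataset with `rbind` would give a table of the same shape as the one above.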
Description Generates a basic report in R Markdown or Shiny from selected datasets of an open data portal.
Usage
eda_opendata(metadata, ids, tags)
Arguments
Details The selected sample is drawn according to a selection-rejection (list-sequential) algorithm
Value The function returns a list of 4 data frames. They contain information on the tabular, geo, href and blobby data of the open data portal.
viewType | count | percent |
---|---|---|
tabular | 3543 | 82.0 |
href | 574 | 13.3 |
blobby | 162 | 3.7 |
geo | 44 | 1.0 |
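The percent column follows directly from the counts. A short base R check, using the figures from the viewType table (the variable names are illustrative):

```r
# Counts copied from the viewType table.
counts <- c(tabular = 3543, href = 574, blobby = 162, geo = 44)

# Share of each data type, rounded to one decimal place.
percent <- round(100 * counts / sum(counts), 1)

summary_table <- data.frame(
  viewType = names(counts),
  count    = unname(counts),
  percent  = unname(percent),
  stringsAsFactors = FALSE
)
print(summary_table)
```

The counts add up to 4323 datasets, the same figure reported as nbr.val in the statistics table.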
Stats | |
---|---|
nbr.val | 4323.0 |
nbr.null | 0.0 |
nbr.na | 0.0 |
min | 1.0 |
max | 25932.0 |
range | 25931.0 |
sum | 375981.0 |
median | 26.0 |
mean | 87.0 |
SE.mean | 10.9 |
CI.mean.0.95 | 21.4 |
var | 513229.0 |
std.dev | 716.4 |
coef.var | 8.2 |
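The row names in the statistics table match those produced by `stat.desc` in the pastecs package, although the source does not say which tool generated the table or which variable it summarises. Assuming the usual definitions, the derived rows can be reproduced in base R from nbr.val, sum and var:

```r
# Values copied from the statistics table.
n   <- 4323      # nbr.val
tot <- 375981    # sum
v   <- 513229    # var

mean_x   <- tot / n                          # mean
sd_x     <- sqrt(v)                          # std.dev
se_mean  <- sd_x / sqrt(n)                   # SE.mean
ci_half  <- qt(0.975, df = n - 1) * se_mean  # CI.mean.0.95 (half-width, t distribution)
coef_var <- sd_x / mean_x                    # coef.var

derived <- round(c(mean = mean_x, std.dev = sd_x, SE.mean = se_mean,
                   CI.mean.0.95 = ci_half, coef.var = coef_var), 1)
print(derived)
```

The gap between the median (26) and the mean (87), together with a coefficient of variation above 8, points to a strongly right-skewed distribution.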