The Array of Things is an urban Project that has been deployed to Chicago and will eventually be expanded to several other Cities in the US. The goal is to place Nodes with Sensors to collect data. The sensors can track air quality, road conditions, traffic patterns as well as other data. The goal of the Array of Things Project is to enhance the quality of life of its population.
The data sources include:
Array of Thing Overview Documentation
Vendor Sensor and Datasheet - Waggle:
The project goal is to collect data from api and generate some plots with statistical data and map addresses with current sensor information. The velocity of data may be a bit of challenge. To address the constant change of data there is a need to store data on a continous bases.
This solution consists of several files:
Create DB.
Get Data.
library(dplyr)
library(tidyr)
library(jsonlite)
library(XML)
library(RCurl)
library(ggplot2)
library(plotly)
library(shiny)
library(ggmap)
library(sf)
In this code we are loading the APIs for projects, nodes, sensors, observations and raw data. We then copy the data from Json and save to a list
projects_aot_url <-getURL("https://api.arrayofthings.org/api/projects")
df.projects_aot_url<-fromJSON(projects_aot_url)
nodes_aot_url <-getURL("https://api.arrayofthings.org/api/nodes")
df.nodes_aot_url<-fromJSON(nodes_aot_url)
sensors_aot_url <-getURL("https://api.arrayofthings.org/api/sensors")
df.sensors_aot_url<-fromJSON(sensors_aot_url)
observations_aot_url<-getURL("https://api.arrayofthings.org/api/observations")
df.observations_aot_url<-fromJSON(observations_aot_url)
raw_aot_url <-getURL("https://api.arrayofthings.org/api/raw-observations")
df.raw_aot_url<-fromJSON(raw_aot_url)
Here we are converting each dataset list into dataframes. We use the names funciton to display the columns of each dataset.
df.projects_aot<-as.data.frame(df.projects_aot_url$data)
df.nodes_aot<-as.data.frame(df.nodes_aot_url$data)
df.sensors_aot<-as.data.frame(df.sensors_aot_url$data)
df.observations_aot<-as.data.frame(df.observations_aot_url$data)
df.raw_aot<-as.data.frame(df.raw_aot_url$data)
names(df.projects_aot)
## [1] "slug" "name" "latest_observation"
## [4] "hull" "first_observation" "bbox"
## [7] "archive_url"
names(df.nodes_aot)
## [1] "vsn" "location" "description"
## [4] "decommissioned_on" "commissioned_on" "address"
names(df.sensors_aot)
## [1] "uom" "subsystem" "sensor" "path" "parameter"
## [6] "ontology" "min" "max" "data_sheet"
names(df.observations_aot)
## [1] "value" "timestamp" "sensor_path" "node_vsn"
names(df.raw_aot)
## [1] "timestamp" "sensor_path" "raw" "node_vsn" "hrf"
In the transformation phase we need to separate the location column into latitude and longtitude then rename the 2 new columns to Long and Lat so we can merge with map data later. We then merge new dataframe with raw data joined by vsn field.
df.nodes_aot <- cbind(df.nodes_aot, data.frame(matrix(unlist(df.nodes_aot$location[,3]), nrow=nrow(df.nodes_aot), byrow=T),stringsAsFactors=FALSE))
names(df.nodes_aot)[7]<-"long"
names(df.nodes_aot)[8]<-"lat"
names(df.raw_aot)[4]<-"vsn"
df.merge_aot<-merge(df.nodes_aot, df.raw_aot, by = "vsn")
In this step we get the latest data. We connect to the MSQL DB that has been inserting hourly raw data with the scheduled getdata.r file.
library(RMySQL)
mydb = dbConnect(MySQL(), user='root', password='123456', dbname= 'aot', host='localhost')
dfstats_aot<-dbGetQuery(mydb, "SELECT count(sensor_path) as num_of_samples,sensor_path, format(avg(hrf),2) as avg_sample,
FORMAT(STD(hrf),2) as std_sample, format(avg(hrf)+ 1*std(hrf),2) as 1Std, format(avg(hrf)+ 2*std(hrf),2) as 2Std,
format(avg(hrf)+ 3*std(hrf),2) as 3Std FROM aot.dfraw_aot
where hrf is not null
group by sensor_path")
dbDisconnect(mydb)
## [1] TRUE
Here we merge the transform data with the data retrieved from the MYSql db by the sensor_path column. In addition, we add an additionl calculated column to the merged data for later display in the map.
df.merge_aot_all<-merge(dfstats_aot, df.merge_aot, by = "sensor_path")
#Add Std calcuation
df.merge_aot_all$std_aot<-as.integer(df.merge_aot_all$hrf) - as.integer(df.merge_aot_all$avg_sample)/as.integer(df.merge_aot_all$std_sample)
This step displays plotly box plots of the merged data. One plotly plot is displayed by sensor_path and the other is displayed by address.
#Plotly Histogram
#plot_ly(df.merge_aot_all, x = ~hrf, y = ~sensor_path)
#Plotly by Sensor Path
plot_ly(df.merge_aot_all, x = ~hrf, color = ~sensor_path,
type = "box",width = 950, height = 1100)%>%
layout(margin = list(l = 200))
#Plotly by Address
plot_ly(df.merge_aot_all, x = ~hrf, color = ~address,
type = "box",width = 950, height = 1100)%>%
layout(margin = list(l = 200))
This last step generates a map with the latest stats and raw data by node address. It uses the leaflet map function and places details in content into the popup.
#install.packages("leaflet")
library(leaflet)
#Set Details of data
content <- paste(sep = "<br/>",
"------------------------------",
df.merge_aot_all$description,
df.merge_aot_all$address,
"------------------------------",
'<b>Lastest Reading in GMT:</>',
df.merge_aot_all$timestamp,
df.merge_aot_all$sensor_path,
df.merge_aot_all$hrf,
"------------------------------",
'<b>Sample Count, Standard Deviation, Average</>',
df.merge_aot_all$num_of_samples,
df.merge_aot_all$std_sample,
df.merge_aot_all$avg_sample)
df = data.frame(Lat = df.merge_aot_all$lat, Long = df.merge_aot_all$long)
m <- leaflet(df, width = 900, height = 600) %>%
setView(lng = df.merge_aot_all$long[nrow(df.merge_aot_all)], lat = df.merge_aot_all$lat[nrow(df.merge_aot_all)], zoom = 10)
m %>%
addTiles() %>%
#addCircles()%>%
addMarkers(df.merge_aot_all$long, df.merge_aot_all$lat, popup = paste(sep = "<br/>", content))
This project took updated API data from serveral sources, collected and transformed the data, merged and displayed the data in statistical boxplots and took goelocation to map location of devices. The map displayed current reading and basic statistical data with the historical hourly collections.