The Array of Things (AoT) is an urban sensing project, a network of interactive, modular sensor boxes that will be installed around Chicago to collect real-time data on the city’s environment, infrastructure, and activity for research and public use.
A total of 500 nodes will be mounted around the city over the next two to three years. The first prototype nodes were installed in summer 2016 and in 2017, and more will be installed throughout 2018.
The objectives of this final project are:
# Packages used below: dplyr (pipes, glimpse), ggplot2 (plots), knitr (kable), ggmap (map)
library(dplyr)
library(ggplot2)
library(knitr)
library(ggmap)
# Load the node locations and recode Status: "Live" -> "True", "Planned" -> "False"
array_of_things_locations_data<-read.csv("https://raw.githubusercontent.com/hovig/MSDS_CUNY/master/DATA606/Project-Proposal/array-of-things-locations-1.csv")
array_of_things_locations_data$Status<-as.character(array_of_things_locations_data$Status)
array_of_things_locations_data$Status[array_of_things_locations_data$Status=="Live"]<-"True"
t<-array_of_things_locations_data$Status[array_of_things_locations_data$Status=="True"]
array_of_things_locations_data$Status[array_of_things_locations_data$Status=="Planned"]<-"False"
f<-array_of_things_locations_data$Status[array_of_things_locations_data$Status=="False"]
status_of_things<-array_of_things_locations_data %>%
group_by(Status) %>%
summarise(count=n())
dat <- data.frame(
status = factor(status_of_things$Status, levels=status_of_things$Status),
count = status_of_things$count
)
# Jitter the coordinates (uniform noise of up to +/- 0.3 degrees) and round to 2 decimals
df<-round(data.frame(
  x = jitter(array_of_things_locations_data$Longitude, amount = .3),
  y = jitter(array_of_things_locations_data$Latitude, amount = .3)),
  digits = 2)
glimpse(array_of_things_locations_data)
## Observations: 41
## Variables: 8
## $ Name <fct> Ashland Av - Division St , Wabansia - Milwaukee,...
## $ Location.Type <fct> CDOT Placemaking Project, CDOT Placemaking Proje...
## $ Category <fct> Urban Placemaking, Urban Placemaking, Urban Plac...
## $ Notes <fct> , , , , , , , , , , , single node, Single node w...
## $ Status <chr> "False", "False", "False", "False", "False", "Fa...
## $ Latitude <dbl> 41.90351, 41.91235, 41.91409, 41.89200, 41.83866...
## $ Longitude <dbl> -87.66716, -87.68214, -87.68302, -87.61164, -87....
## $ Location <fct> (41.9035068, -87.6671648), (41.9123537, -87.6821...
kable(status_of_things)
| Status | count |
|---|---|
| False | 29 |
| True | 12 |
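The counts above can also be read as shares of the 41 mapped nodes. A minimal sketch in base R, assuming only the two counts from the table (the names `status_counts` and `status_share` are introduced here for illustration and are not part of the original analysis):

```r
# Counts copied from the status table above (False = Planned, True = Live).
status_counts <- c(False = 29, True = 12)

# Share of each status among all mapped nodes.
status_share <- prop.table(status_counts)

# Percentages rounded to one decimal place: False 70.7, True 29.3.
round(100 * status_share, 1)
```

Roughly 29% of the mapped nodes were live at the time of the study, and the remaining 71% were still planned.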
# Bar chart of node counts per status; `dat` has no `time` column, so no fill mapping is used
ggplot(data=dat, aes(x=status, y=count)) +
  geom_bar(colour="black", fill="#DD8888", width=.8, stat="identity") +
  xlab("Status type") + ylab("Status count per type") +
  ggtitle("Chicago's planning status")
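# Jittered coordinates as standalone vectors; the plots below pick them up from the workspace
Longitude<-df$x
Latitude<-df$y
# Scatterplot of Latitude against Longitude with a fitted straight line
ggplot(df, aes(x=Longitude, y=Latitude)) + geom_point() + stat_smooth(method="lm", se=FALSE)
# Base-graphics scatterplot with Latitude on the x-axis and the line from lm(Longitude ~ Latitude)
plot(Longitude~Latitude, data=df)
abline(lm(Longitude~Latitude, data=df))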
ggmap(map, extent = 'device')
## Warning: `panel.margin` is deprecated. Please use `panel.spacing` property
## instead
This is an observational study done to map 41 nodes (devices) and to understand which ones went `Live` (`True` in this project) and which ones are still listed as `Planned` (`False`).
The response variable in this study is the `status`, which is categorical, and the explanatory variables are the `count` and the `geolocations` (`longitude`, `latitude`), which are numerical.
It is worth continuing this study once the City of Chicago completes the project; we can then check the spread of the 500 nodes and what data they will be streaming.
Comparing the scatterplots above, with the linear model and the non-linear model, we can conclude that the geolocation data need to be more accurate. We are not sure whether the coordinates were entered into the dataset manually or read from the nodes themselves.
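Because the coordinates are jittered and rounded before plotting, it may be worth checking how much scatter that step alone can introduce. The sketch below is only an illustration under stated assumptions: it uses the first four latitudes printed in the `glimpse()` output, an arbitrary seed (606, not from the original analysis), and hypothetical variable names.

```r
set.seed(606)  # arbitrary seed so the sketch is reproducible

# First four latitudes as printed in the glimpse() output above.
latitude <- c(41.90351, 41.91235, 41.91409, 41.89200)

# Same transformation the analysis applies before plotting.
jittered <- round(jitter(latitude, amount = .3), digits = 2)

# Displacement introduced by jitter plus rounding, in degrees.
# It is bounded by 0.3 (jitter) + 0.005 (rounding).
displacement <- abs(jittered - latitude)
max(displacement)

# Spread of the original values, for comparison.
diff(range(latitude))
```

One degree of latitude is roughly 111 km, so a displacement of up to 0.3 degrees can be much larger than the distance between neighbouring nodes. Rerunning the scatterplots on the unjittered `Latitude` and `Longitude` columns would help separate noise added during plotting from inaccuracy in the source data.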