1 Assignment Instructions

For this assignment, you should take information from a relational database and migrate it to a NoSQL database of your own choosing.

For the relational database, you might use the flights database, the tb database, the “data skills” database your team created for Project 3, or another database of your own choosing or creation.

For the NoSQL database, you may use MongoDB (which we introduced in week 7), Neo4j, or another NoSQL database of your choosing.

Your migration process needs to be reproducible. R code is encouraged, but not required. You should also briefly describe the advantages and disadvantages of storing the data in a relational database vs. your NoSQL database.

2 Pre-Requistes : Available Libraries

  • knitr
  • dplyr
  • tidyr
  • RMySQL
  • mongolite
  • RNeo4j
  • neo4r
  • kableExtra
  • DT
  • data.table
  • ggplot2

3 Show Source Raw Data in Data Frame as Table

3.1 Source data of “airlines”

3.1.1 Kable

kable(head(airlines,10)) %>%
  kable_styling(bootstrap_options = c("striped","hover","condensed","responsive"),full_width   = F,position = "left",font_size = 12) %>%
  row_spec(0, background ="gray")
carrier name
9E Endeavor Air Inc.
AA American Airlines Inc.
AS Alaska Airlines Inc.
B6 JetBlue Airways
DL Delta Air Lines Inc.
EV ExpressJet Airlines Inc.
F9 Frontier Airlines Inc.
FL AirTran Airways Corporation
HA Hawaiian Airlines Inc.
MQ Envoy Air

3.1.2 Data Table

DT::datatable(head(airlines,10), options = list(pagelength=5))
#datatable(data_frame)

3.1.3 Select

DT::datatable(select(head(airlines,10), carrier:name), options = list(pagelength=5))

3.1.4 Knitr

knitr::kable(head(airlines,10), format = "html")
carrier name
9E Endeavor Air Inc.
AA American Airlines Inc.
AS Alaska Airlines Inc.
B6 JetBlue Airways
DL Delta Air Lines Inc.
EV ExpressJet Airlines Inc.
F9 Frontier Airlines Inc.
FL AirTran Airways Corporation
HA Hawaiian Airlines Inc.
MQ Envoy Air

3.1.5 SQL

sqldf("select * from airlines") 

3.2 Source data of “airports”

3.2.1 Kable

kable(head(airports,10)) %>%
  kable_styling(bootstrap_options = c("striped","hover","condensed","responsive"),full_width   = F,position = "left",font_size = 12) %>%
  row_spec(0, background ="gray")
faa name lat lon alt tz dst tzone
04G Lansdowne Airport 41.13047 -80.61958 1044 -5 A America/New_York
06A Moton Field Municipal Airport 32.46057 -85.68003 264 -6 A America/Chicago
06C Schaumburg Regional 41.98934 -88.10124 801 -6 A America/Chicago
06N Randall Airport 41.43191 -74.39156 523 -5 A America/New_York
09J Jekyll Island Airport 31.07447 -81.42778 11 -5 A America/New_York
0A9 Elizabethton Municipal Airport 36.37122 -82.17342 1593 -5 A America/New_York
0G6 Williams County Airport 41.46731 -84.50678 730 -5 A America/New_York
0G7 Finger Lakes Regional Airport 42.88356 -76.78123 492 -5 A America/New_York
0P2 Shoestring Aviation Airfield 39.79482 -76.64719 1000 -5 U America/New_York
0S9 Jefferson County Intl 48.05381 -122.81064 108 -8 A America/Los_Angeles

3.2.2 Data Table

DT::datatable(head(airports,10), options = list(pagelength=5))
#datatable(data_frame)

3.2.3 Select

DT::datatable(select(head(airports,10), faa:tzone), options = list(pagelength=5))

3.2.4 Knitr

knitr::kable(head(airports,10), format = "html")
faa name lat lon alt tz dst tzone
04G Lansdowne Airport 41.13047 -80.61958 1044 -5 A America/New_York
06A Moton Field Municipal Airport 32.46057 -85.68003 264 -6 A America/Chicago
06C Schaumburg Regional 41.98934 -88.10124 801 -6 A America/Chicago
06N Randall Airport 41.43191 -74.39156 523 -5 A America/New_York
09J Jekyll Island Airport 31.07447 -81.42778 11 -5 A America/New_York
0A9 Elizabethton Municipal Airport 36.37122 -82.17342 1593 -5 A America/New_York
0G6 Williams County Airport 41.46731 -84.50678 730 -5 A America/New_York
0G7 Finger Lakes Regional Airport 42.88356 -76.78123 492 -5 A America/New_York
0P2 Shoestring Aviation Airfield 39.79482 -76.64719 1000 -5 U America/New_York
0S9 Jefferson County Intl 48.05381 -122.81064 108 -8 A America/Los_Angeles

3.2.5 SQL

sqldf("select * from airports") 

3.3 Source data of “flights”

3.3.1 Kable

kable(head(flights,10)) %>%
  kable_styling(bootstrap_options = c("striped","hover","condensed","responsive"),full_width   = F,position = "left",font_size = 12) %>%
  row_spec(0, background ="gray")
year month day dep_time sched_dep_time dep_delay arr_time sched_arr_time arr_delay carrier flight tailnum origin dest air_time distance hour minute time_hour
2013 1 1 517 515 2 830 819 11 UA 1545 N14228 EWR IAH 227 1400 5 15 2013-01-01 05:00:00
2013 1 1 533 529 4 850 830 20 UA 1714 N24211 LGA IAH 227 1416 5 29 2013-01-01 05:00:00
2013 1 1 542 540 2 923 850 33 AA 1141 N619AA JFK MIA 160 1089 5 40 2013-01-01 05:00:00
2013 1 1 544 545 -1 1004 1022 -18 B6 725 N804JB JFK BQN 183 1576 5 45 2013-01-01 05:00:00
2013 1 1 554 600 -6 812 837 -25 DL 461 N668DN LGA ATL 116 762 6 0 2013-01-01 06:00:00
2013 1 1 554 558 -4 740 728 12 UA 1696 N39463 EWR ORD 150 719 5 58 2013-01-01 05:00:00
2013 1 1 555 600 -5 913 854 19 B6 507 N516JB EWR FLL 158 1065 6 0 2013-01-01 06:00:00
2013 1 1 557 600 -3 709 723 -14 EV 5708 N829AS LGA IAD 53 229 6 0 2013-01-01 06:00:00
2013 1 1 557 600 -3 838 846 -8 B6 79 N593JB JFK MCO 140 944 6 0 2013-01-01 06:00:00
2013 1 1 558 600 -2 753 745 8 AA 301 N3ALAA LGA ORD 138 733 6 0 2013-01-01 06:00:00

3.3.2 Data Table

DT::datatable(head(flights,10), options = list(pagelength=5))
#datatable(data_frame)

3.3.3 Select

DT::datatable(select(head(flights,10), year:time_hour), options = list(pagelength=5))

3.3.4 Knitr

knitr::kable(head(flights,10), format = "html")
year month day dep_time sched_dep_time dep_delay arr_time sched_arr_time arr_delay carrier flight tailnum origin dest air_time distance hour minute time_hour
2013 1 1 517 515 2 830 819 11 UA 1545 N14228 EWR IAH 227 1400 5 15 2013-01-01 05:00:00
2013 1 1 533 529 4 850 830 20 UA 1714 N24211 LGA IAH 227 1416 5 29 2013-01-01 05:00:00
2013 1 1 542 540 2 923 850 33 AA 1141 N619AA JFK MIA 160 1089 5 40 2013-01-01 05:00:00
2013 1 1 544 545 -1 1004 1022 -18 B6 725 N804JB JFK BQN 183 1576 5 45 2013-01-01 05:00:00
2013 1 1 554 600 -6 812 837 -25 DL 461 N668DN LGA ATL 116 762 6 0 2013-01-01 06:00:00
2013 1 1 554 558 -4 740 728 12 UA 1696 N39463 EWR ORD 150 719 5 58 2013-01-01 05:00:00
2013 1 1 555 600 -5 913 854 19 B6 507 N516JB EWR FLL 158 1065 6 0 2013-01-01 06:00:00
2013 1 1 557 600 -3 709 723 -14 EV 5708 N829AS LGA IAD 53 229 6 0 2013-01-01 06:00:00
2013 1 1 557 600 -3 838 846 -8 B6 79 N593JB JFK MCO 140 944 6 0 2013-01-01 06:00:00
2013 1 1 558 600 -2 753 745 8 AA 301 N3ALAA LGA ORD 138 733 6 0 2013-01-01 06:00:00

3.3.5 SQL

sqldf("select * from flights") 

3.4 Source data of “planes”

3.4.1 Kable

kable(head(planes,10)) %>%
  kable_styling(bootstrap_options = c("striped","hover","condensed","responsive"),full_width   = F,position = "left",font_size = 12) %>%
  row_spec(0, background ="gray")
tailnum year type manufacturer model engines seats speed engine
N10156 2004 Fixed wing multi engine EMBRAER EMB-145XR 2 55 NA Turbo-fan
N102UW 1998 Fixed wing multi engine AIRBUS INDUSTRIE A320-214 2 182 NA Turbo-fan
N103US 1999 Fixed wing multi engine AIRBUS INDUSTRIE A320-214 2 182 NA Turbo-fan
N104UW 1999 Fixed wing multi engine AIRBUS INDUSTRIE A320-214 2 182 NA Turbo-fan
N10575 2002 Fixed wing multi engine EMBRAER EMB-145LR 2 55 NA Turbo-fan
N105UW 1999 Fixed wing multi engine AIRBUS INDUSTRIE A320-214 2 182 NA Turbo-fan
N107US 1999 Fixed wing multi engine AIRBUS INDUSTRIE A320-214 2 182 NA Turbo-fan
N108UW 1999 Fixed wing multi engine AIRBUS INDUSTRIE A320-214 2 182 NA Turbo-fan
N109UW 1999 Fixed wing multi engine AIRBUS INDUSTRIE A320-214 2 182 NA Turbo-fan
N110UW 1999 Fixed wing multi engine AIRBUS INDUSTRIE A320-214 2 182 NA Turbo-fan

3.4.2 Data Table

DT::datatable(head(planes,10), options = list(pagelength=5))
#datatable(data_frame)

3.4.3 Select

DT::datatable(select(head(planes,10), tailnum:engine), options = list(pagelength=5))

3.4.4 Knitr

knitr::kable(head(planes,10), format = "html")
tailnum year type manufacturer model engines seats speed engine
N10156 2004 Fixed wing multi engine EMBRAER EMB-145XR 2 55 NA Turbo-fan
N102UW 1998 Fixed wing multi engine AIRBUS INDUSTRIE A320-214 2 182 NA Turbo-fan
N103US 1999 Fixed wing multi engine AIRBUS INDUSTRIE A320-214 2 182 NA Turbo-fan
N104UW 1999 Fixed wing multi engine AIRBUS INDUSTRIE A320-214 2 182 NA Turbo-fan
N10575 2002 Fixed wing multi engine EMBRAER EMB-145LR 2 55 NA Turbo-fan
N105UW 1999 Fixed wing multi engine AIRBUS INDUSTRIE A320-214 2 182 NA Turbo-fan
N107US 1999 Fixed wing multi engine AIRBUS INDUSTRIE A320-214 2 182 NA Turbo-fan
N108UW 1999 Fixed wing multi engine AIRBUS INDUSTRIE A320-214 2 182 NA Turbo-fan
N109UW 1999 Fixed wing multi engine AIRBUS INDUSTRIE A320-214 2 182 NA Turbo-fan
N110UW 1999 Fixed wing multi engine AIRBUS INDUSTRIE A320-214 2 182 NA Turbo-fan

3.4.5 SQL

sqldf("select * from planes") 

3.5 Source data of “weather”

3.5.1 Kable

kable(head(weather,10)) %>%
  kable_styling(bootstrap_options = c("striped","hover","condensed","responsive"),full_width   = F,position = "left",font_size = 12) %>%
  row_spec(0, background ="gray")
origin year month day hour temp dewp humid wind_dir wind_speed wind_gust precip pressure visib time_hour
EWR 2013 1 1 1 39.02 26.06 59.37 270 10.35702 NA 0 1012.0 10 2013-01-01 01:00:00
EWR 2013 1 1 2 39.02 26.96 61.63 250 8.05546 NA 0 1012.3 10 2013-01-01 02:00:00
EWR 2013 1 1 3 39.02 28.04 64.43 240 11.50780 NA 0 1012.5 10 2013-01-01 03:00:00
EWR 2013 1 1 4 39.92 28.04 62.21 250 12.65858 NA 0 1012.2 10 2013-01-01 04:00:00
EWR 2013 1 1 5 39.02 28.04 64.43 260 12.65858 NA 0 1011.9 10 2013-01-01 05:00:00
EWR 2013 1 1 6 37.94 28.04 67.21 240 11.50780 NA 0 1012.4 10 2013-01-01 06:00:00
EWR 2013 1 1 7 39.02 28.04 64.43 240 14.96014 NA 0 1012.2 10 2013-01-01 07:00:00
EWR 2013 1 1 8 39.92 28.04 62.21 250 10.35702 NA 0 1012.2 10 2013-01-01 08:00:00
EWR 2013 1 1 9 39.92 28.04 62.21 260 14.96014 NA 0 1012.7 10 2013-01-01 09:00:00
EWR 2013 1 1 10 41.00 28.04 59.65 260 13.80936 NA 0 1012.4 10 2013-01-01 10:00:00

3.5.2 Data Table

DT::datatable(head(weather,10), options = list(pagelength=5))
#datatable(data_frame)

3.5.3 Select

DT::datatable(select(head(weather,10), origin:time_hour), options = list(pagelength=5))

3.5.4 Knitr

knitr::kable(head(weather,10), format = "html")
origin year month day hour temp dewp humid wind_dir wind_speed wind_gust precip pressure visib time_hour
EWR 2013 1 1 1 39.02 26.06 59.37 270 10.35702 NA 0 1012.0 10 2013-01-01 01:00:00
EWR 2013 1 1 2 39.02 26.96 61.63 250 8.05546 NA 0 1012.3 10 2013-01-01 02:00:00
EWR 2013 1 1 3 39.02 28.04 64.43 240 11.50780 NA 0 1012.5 10 2013-01-01 03:00:00
EWR 2013 1 1 4 39.92 28.04 62.21 250 12.65858 NA 0 1012.2 10 2013-01-01 04:00:00
EWR 2013 1 1 5 39.02 28.04 64.43 260 12.65858 NA 0 1011.9 10 2013-01-01 05:00:00
EWR 2013 1 1 6 37.94 28.04 67.21 240 11.50780 NA 0 1012.4 10 2013-01-01 06:00:00
EWR 2013 1 1 7 39.02 28.04 64.43 240 14.96014 NA 0 1012.2 10 2013-01-01 07:00:00
EWR 2013 1 1 8 39.92 28.04 62.21 250 10.35702 NA 0 1012.2 10 2013-01-01 08:00:00
EWR 2013 1 1 9 39.92 28.04 62.21 260 14.96014 NA 0 1012.7 10 2013-01-01 09:00:00
EWR 2013 1 1 10 41.00 28.04 59.65 260 13.80936 NA 0 1012.4 10 2013-01-01 10:00:00

3.5.5 SQL

sqldf("select * from weather") 

4 Load data into RDBMS - MySql

Load the tidy data from R data frame to mySQL Google Cloud relational database for data analysis and visual representation

4.1 Connect to mySQL Google Cloud

4.2 Create and Load Tables in mySQL Google Cloud

  • Load into airlines table directly from the data frame via overwrite
# Write the data frame to the database
dbWriteTable(mySqlConn, name = "airlines", value = airlines, row.names = FALSE, overwrite=T)
  • Load into airports table via drop if exists (dbSendQuery) and create table (dbSendQuery) followed by load (dbWriteTable)
# Drop table if it already exists
dbSendQuery(mySqlConn, "DROP TABLE IF EXISTS airports;")
# Create table
dbSendQuery(mySqlConn, "CREATE TABLE airports 
(faa      VARCHAR(100) NOT NULL,
 name   VARCHAR(100) NOT NULL,
 lat    NUMERIC(65,5),
 lon    NUMERIC(65,5),
 alt    INTEGER,
 tz     INTEGER,
 dst    VARCHAR(100),
 tzone  VARCHAR(100),
 CONSTRAINT pk_airports PRIMARY KEY (faa)
 );")
# Load table
dbWriteTable(mySqlConn, "airports", airports, overwrite=T)
  • Load into flights table via drop (dbRemoveTable) if exists (dbExistsTable) and load (dbWriteTable)
# Drop table if it already exists
if (dbExistsTable(mySqlConn, "flights"))
    dbRemoveTable(mySqlConn, "flights")

# Write the data frame to the database
dbWriteTable(mySqlConn, name = "flights", value = flights, row.names = FALSE)
  • Load into planes table via load (dbWriteTable) if not exists (dbExistsTable)
if(!dbExistsTable(mySqlConn,"planes")) {
  dbWriteTable(mySqlConn, value = planes, name = "planes", row.names=F, append = TRUE ) 
}

5 Load data into NoSql - MongoDB

Mongodb is one of the Nosql database. It is very famous for its fluid data structure.

  • All the formats are loaded in JSON format.
  • Databases hold collections of documents.
  • Collections are analogous to tables of an RDBMS that store documents those who are not same in structure. This is possible because MongoDB is a Schema-free database.

Load the tidy data from R data frame to mongoDB NoSQL database for data analysis and visual representation

The mongolite package is more efficient than the RMongo package (unavailable from CRAN) which requires conversion to JSON, string manipulation, and several lines of additional code using the below functions:

  • mongoDbConnect(“Flights”, host = “localhost”, port = 27017)
  • dbInsertDocument(mongodb, ‘Table’, “jsonData”)
  • dbShowCollections(mongodb)
  • dbGetQuery(mongodb, “Table”, “{}”)
  • dbRemoveQuery(mongodb, ‘Table’, convert(Table))
  • dbDisconnect(mongodb)

5.1 Connect to NoSQL MongoDB Atlas

5.1.1 Using mongolite

#find cluster url with username and password replaced with SCRAM credentials
url_path = 'mongodb+srv://root:data607@cluster0-blj5o.mongodb.net/test'

#make connection object that specifies new database and collection (dataset)
mongoNYCFlightsConn <- mongo (collection = "nycflights13", 
                              db = "CUNY_DATA607", 
                              url = url_path,
                              #url = "mongodb://usr:pass@cluster0-shard-00-00-h8acf.mongodb.net:27017,cluster0-shard-00-01-12ucd.mongodb.net:27017,cluster0-shard-00-02-haucd.mongodb.net:27017/dbname?ssl=true&replicaSet=Cluster0-shard-0&authSource=admin", 
                              verbose = TRUE)

#show commands for connection
mongoNYCFlightsConn

5.1.2 Using RMongo

The RMongo package (unavailable from CRAN) requires conversion to JSON, string manipulation, and several lines of additional code using the below functions: - mongoDbConnect(“Flights”, host = “localhost”, port = 27017) - dbInsertDocument(mongodb, ‘Table’, “jsonData”) - dbShowCollections(mongodb) - dbGetQuery(mongodb, “Table”, “{}”) - dbRemoveQuery(mongodb, ‘Table’, convert(Table))
- dbDisconnect(mongodb)

5.2 Show data

5.2.1 Airlines

knitr::include_graphics('CUNY_DATA607.airlines.jpg')

#![Airlines Collection in MongoDB](CUNY_DATA607.airlines.jpg)
#plot(load.image("CUNY_DATA607.airlines.jpg"))
knitr::include_graphics('CUNY_DATA607.airlines-indexes.jpg')

#![Airlines Indexes in MongoDB](CUNY_DATA607.airlines-indexes.jpg)
#plot(load.image("CUNY_DATA607.airlines-indexes.jpg"))

5.2.2 Flights

knitr::include_graphics('CUNY_DATA607.flights.jpg')

#plot(load.image("CUNY_DATA607.flights.jpg"))
knitr::include_graphics('CUNY_DATA607.flights-indexes.jpg')

#plot(load.image("CUNY_DATA607.flights-indexes.jpg"))

5.2.3 NYCFlights13

knitr::include_graphics('CUNY_DATA607.nycflights13.jpg')

#plot(load.image("CUNY_DATA607.nycflights13.jpg"))
knitr::include_graphics('CUNY_DATA607.nycflights13-indexes.jpg')

#plot(load.image("CUNY_DATA607.nycflights13-indexes.jpg"))

5.2.4 NYCFlightsAirlines

knitr::include_graphics('CUNY_DATA607.nycflightsairlines.jpg')

#plot(load.image("CUNY_DATA607.nycflightsairlines.jpg"))
knitr::include_graphics('CUNY_DATA607.nycflightsairlines-indexes.jpg')

#plot(load.image("CUNY_DATA607.nycflightsairlines-indexes.jpg"))

5.3 Query data

5.3.1 First records in the collection

mongoNYCFlightsConn$iterate()$one() # first 1 record in the collection
## $carrier
## [1] "9E"
## 
## $name
## [1] "Endeavor Air Inc."
kable(mongoFlightAirlinesConn$iterate()$one()) # first 1 records in the collection
x
UA
x
United Air Lines Inc.
x
2013
x
1
x
1
x
517
x
515
x
2
x
830
x
819
x
11
x
1545
x
N14228
x
EWR
x
IAH
x
227
x
1400
x
5
x
15
x
2013-01-01 05:00:00

5.3.2 Find distinct airlines from collection

mongoFlightAirlinesConn$distinct("carrier")
##  [1] "UA" "AA" "B6" "DL" "EV" "MQ" "US" "WN" "VX" "FL" "AS"
mongoFlightAirlinesConn$distinct("carrier", "{\"origin\" : \"EWR\", \"dest\": \"IAH\" }")
## [1] "UA"

5.3.3 Find records in the collection

kable(mongoFlightAirlinesConn$find('{"origin":"EWR"}'))
## 
 Found 31 records...
 Imported 31 records. Simplifying into dataframe...
carrier name year month day dep_time sched_dep_time dep_delay arr_time sched_arr_time arr_delay flight tailnum origin dest air_time distance hour minute time_hour
UA United Air Lines Inc. 2013 1 1 517 515 2 830 819 11 1545 N14228 EWR IAH 227 1400 5 15 2013-01-01 05:00:00
UA United Air Lines Inc. 2013 1 1 554 558 -4 740 728 12 1696 N39463 EWR ORD 150 719 5 58 2013-01-01 05:00:00
B6 JetBlue Airways 2013 1 1 555 600 -5 913 854 19 507 N516JB EWR FLL 158 1065 6 0 2013-01-01 06:00:00
UA United Air Lines Inc. 2013 1 1 558 600 -2 923 937 -14 1124 N53441 EWR SFO 361 2565 6 0 2013-01-01 06:00:00
UA United Air Lines Inc. 2013 1 1 559 600 -1 854 902 -8 1187 N76515 EWR LAS 337 2227 6 0 2013-01-01 06:00:00
B6 JetBlue Airways 2013 1 1 601 600 1 844 850 -6 343 N644JB EWR PBI 147 1023 6 0 2013-01-01 06:00:00
AA American Airlines Inc. 2013 1 1 606 610 -4 858 910 -12 1895 N633AA EWR MIA 152 1085 6 10 2013-01-01 06:00:00
UA United Air Lines Inc. 2013 1 1 607 607 0 858 915 -17 1077 N53442 EWR MIA 157 1085 6 7 2013-01-01 06:00:00
MQ Envoy Air 2013 1 1 608 600 8 807 735 32 3768 N9EAMQ EWR ORD 139 719 6 0 2013-01-01 06:00:00
DL Delta Air Lines Inc. 2013 1 1 615 615 0 833 842 -9 575 N326NB EWR ATL 120 746 6 15 2013-01-01 06:00:00
US US Airways Inc. 2013 1 1 622 630 -8 1017 1014 3 245 N807AW EWR PHX 342 2133 6 30 2013-01-01 06:00:00
EV ExpressJet Airlines Inc. 2013 1 1 624 630 -6 909 840 29 4626 N11107 EWR MSP 190 1008 6 30 2013-01-01 06:00:00
UA United Air Lines Inc. 2013 1 1 628 630 -2 1016 947 29 1665 N33289 EWR LAX 366 2454 6 30 2013-01-01 06:00:00
US US Airways Inc. 2013 1 1 629 630 -1 824 833 -9 1019 N426US EWR CLT 91 529 6 30 2013-01-01 06:00:00
EV ExpressJet Airlines Inc. 2013 1 1 632 608 24 740 728 12 4144 N13553 EWR IAD 52 212 6 8 2013-01-01 06:00:00
UA United Air Lines Inc. 2013 1 1 643 646 -3 922 940 -18 556 N497UA EWR PBI 146 1023 6 46 2013-01-01 06:00:00
US US Airways Inc. 2013 1 1 643 645 -2 837 848 -11 926 N178US EWR CLT 91 529 6 45 2013-01-01 06:00:00
UA United Air Lines Inc. 2013 1 1 644 636 8 931 940 -9 1701 N75435 EWR FLL 151 1065 6 36 2013-01-01 06:00:00
UA United Air Lines Inc. 2013 1 1 646 645 1 1023 1030 -7 1496 N38727 EWR SNA 380 2434 6 45 2013-01-01 06:00:00
UA United Air Lines Inc. 2013 1 1 656 700 -4 948 1011 -23 1115 N24212 EWR TPA 156 997 7 0 2013-01-01 07:00:00
UA United Air Lines Inc. 2013 1 1 659 700 -1 959 1008 -9 960 N838UA EWR RSW 164 1068 7 0 2013-01-01 07:00:00
UA United Air Lines Inc. 2013 1 1 701 700 1 1123 1154 -31 1203 N77296 EWR SJU 188 1608 7 0 2013-01-01 07:00:00
UA United Air Lines Inc. 2013 1 1 715 713 2 911 850 21 544 N841UA EWR ORD 156 719 7 13 2013-01-01 07:00:00
UA United Air Lines Inc. 2013 1 1 723 725 -2 1013 1017 -4 962 N514UA EWR PBI 153 1023 7 25 2013-01-01 07:00:00
AS Alaska Airlines Inc. 2013 1 1 724 725 -1 1020 1030 -10 11 N594AS EWR SEA 338 2402 7 25 2013-01-01 07:00:00
AA American Airlines Inc. 2013 1 1 725 730 -5 1052 1040 12 2083 N4WRAA EWR DFW 238 1372 7 30 2013-01-01 07:00:00
UA United Air Lines Inc. 2013 1 1 727 730 -3 959 952 7 1162 N37462 EWR DEN 254 1605 7 30 2013-01-01 07:00:00
UA United Air Lines Inc. 2013 1 1 732 645 47 1011 941 30 1111 N37456 EWR MCO 145 937 6 45 2013-01-01 06:00:00
UA United Air Lines Inc. 2013 1 1 739 739 0 1104 1038 26 1479 N37408 EWR IAH 249 1400 7 39 2013-01-01 07:00:00
UA United Air Lines Inc. 2013 1 1 746 746 0 1119 1129 -10 1668 N24224 EWR SFO 373 2565 7 46 2013-01-01 07:00:00
MQ Envoy Air 2013 1 1 749 710 39 939 850 49 3737 N508MQ EWR ORD 148 719 7 10 2013-01-01 07:00:00
kable(mongoFlightAirlinesConn$find('{"name":"Lansdowne Airport"}'))
## 
 Imported 0 records. Simplifying into dataframe...
mongoFlightAirlinesConn$find("{\"origin\" : \"EWR\"}", handler = NULL)
## 
 Found 31 records...
 Imported 31 records. Simplifying into dataframe...
##    carrier                     name year month day dep_time sched_dep_time
## 1       UA    United Air Lines Inc. 2013     1   1      517            515
## 2       UA    United Air Lines Inc. 2013     1   1      554            558
## 3       B6          JetBlue Airways 2013     1   1      555            600
## 4       UA    United Air Lines Inc. 2013     1   1      558            600
## 5       UA    United Air Lines Inc. 2013     1   1      559            600
## 6       B6          JetBlue Airways 2013     1   1      601            600
## 7       AA   American Airlines Inc. 2013     1   1      606            610
## 8       UA    United Air Lines Inc. 2013     1   1      607            607
## 9       MQ                Envoy Air 2013     1   1      608            600
## 10      DL     Delta Air Lines Inc. 2013     1   1      615            615
## 11      US          US Airways Inc. 2013     1   1      622            630
## 12      EV ExpressJet Airlines Inc. 2013     1   1      624            630
## 13      UA    United Air Lines Inc. 2013     1   1      628            630
## 14      US          US Airways Inc. 2013     1   1      629            630
## 15      EV ExpressJet Airlines Inc. 2013     1   1      632            608
## 16      UA    United Air Lines Inc. 2013     1   1      643            646
## 17      US          US Airways Inc. 2013     1   1      643            645
## 18      UA    United Air Lines Inc. 2013     1   1      644            636
## 19      UA    United Air Lines Inc. 2013     1   1      646            645
## 20      UA    United Air Lines Inc. 2013     1   1      656            700
## 21      UA    United Air Lines Inc. 2013     1   1      659            700
## 22      UA    United Air Lines Inc. 2013     1   1      701            700
## 23      UA    United Air Lines Inc. 2013     1   1      715            713
## 24      UA    United Air Lines Inc. 2013     1   1      723            725
## 25      AS     Alaska Airlines Inc. 2013     1   1      724            725
## 26      AA   American Airlines Inc. 2013     1   1      725            730
## 27      UA    United Air Lines Inc. 2013     1   1      727            730
## 28      UA    United Air Lines Inc. 2013     1   1      732            645
## 29      UA    United Air Lines Inc. 2013     1   1      739            739
## 30      UA    United Air Lines Inc. 2013     1   1      746            746
## 31      MQ                Envoy Air 2013     1   1      749            710
##    dep_delay arr_time sched_arr_time arr_delay flight tailnum origin dest
## 1          2      830            819        11   1545  N14228    EWR  IAH
## 2         -4      740            728        12   1696  N39463    EWR  ORD
## 3         -5      913            854        19    507  N516JB    EWR  FLL
## 4         -2      923            937       -14   1124  N53441    EWR  SFO
## 5         -1      854            902        -8   1187  N76515    EWR  LAS
## 6          1      844            850        -6    343  N644JB    EWR  PBI
## 7         -4      858            910       -12   1895  N633AA    EWR  MIA
## 8          0      858            915       -17   1077  N53442    EWR  MIA
## 9          8      807            735        32   3768  N9EAMQ    EWR  ORD
## 10         0      833            842        -9    575  N326NB    EWR  ATL
## 11        -8     1017           1014         3    245  N807AW    EWR  PHX
## 12        -6      909            840        29   4626  N11107    EWR  MSP
## 13        -2     1016            947        29   1665  N33289    EWR  LAX
## 14        -1      824            833        -9   1019  N426US    EWR  CLT
## 15        24      740            728        12   4144  N13553    EWR  IAD
## 16        -3      922            940       -18    556  N497UA    EWR  PBI
## 17        -2      837            848       -11    926  N178US    EWR  CLT
## 18         8      931            940        -9   1701  N75435    EWR  FLL
## 19         1     1023           1030        -7   1496  N38727    EWR  SNA
## 20        -4      948           1011       -23   1115  N24212    EWR  TPA
## 21        -1      959           1008        -9    960  N838UA    EWR  RSW
## 22         1     1123           1154       -31   1203  N77296    EWR  SJU
## 23         2      911            850        21    544  N841UA    EWR  ORD
## 24        -2     1013           1017        -4    962  N514UA    EWR  PBI
## 25        -1     1020           1030       -10     11  N594AS    EWR  SEA
## 26        -5     1052           1040        12   2083  N4WRAA    EWR  DFW
## 27        -3      959            952         7   1162  N37462    EWR  DEN
## 28        47     1011            941        30   1111  N37456    EWR  MCO
## 29         0     1104           1038        26   1479  N37408    EWR  IAH
## 30         0     1119           1129       -10   1668  N24224    EWR  SFO
## 31        39      939            850        49   3737  N508MQ    EWR  ORD
##    air_time distance hour minute           time_hour
## 1       227     1400    5     15 2013-01-01 05:00:00
## 2       150      719    5     58 2013-01-01 05:00:00
## 3       158     1065    6      0 2013-01-01 06:00:00
## 4       361     2565    6      0 2013-01-01 06:00:00
## 5       337     2227    6      0 2013-01-01 06:00:00
## 6       147     1023    6      0 2013-01-01 06:00:00
## 7       152     1085    6     10 2013-01-01 06:00:00
## 8       157     1085    6      7 2013-01-01 06:00:00
## 9       139      719    6      0 2013-01-01 06:00:00
## 10      120      746    6     15 2013-01-01 06:00:00
## 11      342     2133    6     30 2013-01-01 06:00:00
## 12      190     1008    6     30 2013-01-01 06:00:00
## 13      366     2454    6     30 2013-01-01 06:00:00
## 14       91      529    6     30 2013-01-01 06:00:00
## 15       52      212    6      8 2013-01-01 06:00:00
## 16      146     1023    6     46 2013-01-01 06:00:00
## 17       91      529    6     45 2013-01-01 06:00:00
## 18      151     1065    6     36 2013-01-01 06:00:00
## 19      380     2434    6     45 2013-01-01 06:00:00
## 20      156      997    7      0 2013-01-01 07:00:00
## 21      164     1068    7      0 2013-01-01 07:00:00
## 22      188     1608    7      0 2013-01-01 07:00:00
## 23      156      719    7     13 2013-01-01 07:00:00
## 24      153     1023    7     25 2013-01-01 07:00:00
## 25      338     2402    7     25 2013-01-01 07:00:00
## 26      238     1372    7     30 2013-01-01 07:00:00
## 27      254     1605    7     30 2013-01-01 07:00:00
## 28      145      937    6     45 2013-01-01 06:00:00
## 29      249     1400    7     39 2013-01-01 07:00:00
## 30      373     2565    7     46 2013-01-01 07:00:00
## 31      148      719    7     10 2013-01-01 07:00:00
mongoFlightAirlinesConn$find(paste('{"arr_delay" :{"$gt" : 80}}'))  %>% kable() # flights with an arrival delay > 80mins
## 
 Imported 0 records. Simplifying into dataframe...
mongoFlightAirlines <- mongoFlightAirlinesConn$find() 
## 
 Found 100 records...
 Imported 100 records. Simplifying into dataframe...
kable(head(mongoFlightAirlines))
carrier name year month day dep_time sched_dep_time dep_delay arr_time sched_arr_time arr_delay flight tailnum origin dest air_time distance hour minute time_hour
UA United Air Lines Inc. 2013 1 1 517 515 2 830 819 11 1545 N14228 EWR IAH 227 1400 5 15 2013-01-01 05:00:00
UA United Air Lines Inc. 2013 1 1 533 529 4 850 830 20 1714 N24211 LGA IAH 227 1416 5 29 2013-01-01 05:00:00
AA American Airlines Inc. 2013 1 1 542 540 2 923 850 33 1141 N619AA JFK MIA 160 1089 5 40 2013-01-01 05:00:00
B6 JetBlue Airways 2013 1 1 544 545 -1 1004 1022 -18 725 N804JB JFK BQN 183 1576 5 45 2013-01-01 05:00:00
DL Delta Air Lines Inc. 2013 1 1 554 600 -6 812 837 -25 461 N668DN LGA ATL 116 762 6 0 2013-01-01 06:00:00
UA United Air Lines Inc. 2013 1 1 554 558 -4 740 728 12 1696 N39463 EWR ORD 150 719 5 58 2013-01-01 05:00:00

5.3.4 Join flights and airlines collections

mongoFlightsConn$aggregate('[
                          {"$match" : {"arr_delay" :{"$gt" : 120}}},
                          {
                           "$project":
                           {
                            "_id" :0
                           }
                          },
                          {
                            "$lookup":
                            {
                            "from": "airlines",
                            "localField": "carrier",
                            "foreignField": "carrier",
                            "as": "airline_name"
                            }
                          },
                          {"$unwind" : "$airline_name"},
                          {
                           "$project":
                           {
                            "airline_name._id" :0, "airline_name.carrier":0
                           }
                          }
                      ]') %>% kable()
## 
 Found 25 records...
 Imported 25 records. Simplifying into dataframe...
year month day dep_time sched_dep_time dep_delay arr_time sched_arr_time arr_delay carrier flight tailnum origin dest air_time distance hour minute time_hour airline_name
2013 1 1 811 630 101 1047 830 137 MQ 4576 N531MQ LGA CLT 118 544 6 30 2013-01-01 06:00:00 Envoy Air
2013 1 1 848 1835 853 1001 1950 851 MQ 3944 N942MQ JFK BWI 41 184 18 35 2013-01-01 18:00:00 Envoy Air
2013 1 1 957 733 144 1056 853 123 UA 856 N534UA EWR BOS 37 200 7 33 2013-01-01 07:00:00 United Air Lines Inc.
2013 1 1 1114 900 134 1447 1222 145 UA 1086 N76502 LGA IAH 248 1416 9 0 2013-01-01 09:00:00 United Air Lines Inc.
2013 1 1 1505 1310 115 1638 1431 127 EV 4497 N17984 EWR RIC 63 277 13 10 2013-01-01 13:00:00 ExpressJet Airlines Inc.
2013 1 1 1525 1340 105 1831 1626 125 B6 525 N231JB EWR MCO 152 937 13 40 2013-01-01 13:00:00 JetBlue Airways
2013 1 1 1549 1445 64 1912 1656 136 EV 4181 N21197 EWR MCI 234 1092 14 45 2013-01-01 14:00:00 ExpressJet Airlines Inc.
2013 1 1 1558 1359 119 1718 1515 123 EV 5712 N826AS JFK IAD 53 228 13 59 2013-01-01 13:00:00 ExpressJet Airlines Inc.
2013 1 1 1732 1630 62 2028 1825 123 EV 4092 N16911 EWR DAY 119 533 16 30 2013-01-01 16:00:00 ExpressJet Airlines Inc.
2013 1 1 1803 1620 103 2008 1750 138 MQ 4622 N504MQ LGA BNA 154 764 16 20 2013-01-01 16:00:00 Envoy Air
2013 1 1 1815 1325 290 2120 1542 338 EV 4417 N17185 EWR OMA 213 1134 13 25 2013-01-01 13:00:00 ExpressJet Airlines Inc.
2013 1 1 1842 1422 260 1958 1535 263 EV 4633 N18120 EWR BTV 46 266 14 22 2013-01-01 14:00:00 ExpressJet Airlines Inc.
2013 1 1 1856 1645 131 2212 2005 127 AA 181 N323AA JFK LAX 336 2475 16 45 2013-01-01 16:00:00 American Airlines Inc.
2013 1 1 1934 1725 129 2126 1855 151 MQ 4255 N909MQ JFK BNA 154 765 17 25 2013-01-01 17:00:00 Envoy Air
2013 1 1 1938 1703 155 2109 1823 166 EV 4300 N18557 EWR RIC 68 277 17 3 2013-01-01 17:00:00 ExpressJet Airlines Inc.
2013 1 1 1942 1705 157 2124 1830 174 MQ 4410 N835MQ JFK DCA 60 213 17 5 2013-01-01 17:00:00 Envoy Air
2013 1 1 2006 1630 216 2230 1848 222 EV 4644 N14972 EWR SAV 121 708 16 30 2013-01-01 16:00:00 ExpressJet Airlines Inc.
2013 1 1 2009 1808 121 2145 1942 123 EV 4440 N14143 EWR PIT 65 319 18 8 2013-01-01 18:00:00 ExpressJet Airlines Inc.
2013 1 1 2115 1700 255 2330 1920 250 9E 3347 N924XJ JFK CVG 115 589 17 0 2013-01-01 17:00:00 Endeavor Air Inc.
2013 1 1 2119 1930 109 2358 2136 142 EV 4543 N13123 EWR DSM 200 1017 19 30 2013-01-01 19:00:00 ExpressJet Airlines Inc.
2013 1 1 2205 1720 285 46 2040 246 AA 1999 N5DNAA EWR MIA 146 1085 17 20 2013-01-01 17:00:00 American Airlines Inc.
2013 1 1 2221 2000 141 2331 2124 127 EV 4462 N13566 EWR BUF 56 282 20 0 2013-01-01 20:00:00 ExpressJet Airlines Inc.
2013 1 1 2312 2000 192 21 2110 191 EV 4312 N13958 EWR DCA 44 199 20 0 2013-01-01 20:00:00 ExpressJet Airlines Inc.
2013 1 1 2343 1724 379 314 1938 456 EV 4321 N21197 EWR MCI 222 1092 17 24 2013-01-01 17:00:00 ExpressJet Airlines Inc.
2013 1 2 126 2250 156 233 2359 154 B6 22 N636JB JFK SYR 49 209 22 50 2013-01-02 22:00:00 JetBlue Airways

5.3.5 Visualize trips per carrier

mongoFlightsConn$aggregate('[{"$group":{"_id":"$carrier", "Count": {"$sum":1}}}]') %>% 
  arrange(desc(Count)) %>% 
  ggplot(aes(x=reorder(`_id`,Count),y=Count))+
    geom_bar(stat="identity",color='skyblue')+
    coord_flip()+
    xlab("Carriers")+ 
    theme_bw()
## 
 Found 14 records...
 Imported 14 records. Simplifying into dataframe...

5.4 Observations of MongoDB

  • Setting the database needs little bit of knowledge and need to know how to start the server.
  • There is no GUI for mongodb. Everything is performed in command prompt. Although there are some from 3rd party, it is not combined with database.
  • When R interacts with mongodb, there is a lag between the operations.
  • Writing to database takes time. It is not quick as mysql.
  • Fetching the results are quick.
  • There are no joins. So to perform joins, we need to do lot of operations.

6 Load data into GraphDB - Neo4J

Neo4j is an graphical database. - We can link the data of different tables. - All the data points are called as nodes. - The nodes can have label. - These labels will have properties of that specific node. - The relationship also been tagged as labels. It also can have properties. - All the interactions are performed via CYPER queries.

There was a working RNeo4j package so that R and and Neo4j could connect seemlessly but it appears this package has been taken down from CRAN

6.1 Connect to GraphDB Cloud

# load data into Neo4J
write.csv(airlines, file = "airlines.csv", row.names=FALSE, na="")
write.csv(airports, file = "airports.csv", row.names=FALSE, na="")
write.csv(planes, file = "planes.csv", row.names=FALSE, na="")

#load_csv(on_load = "", neoConn, url="airlines.csv", header = TRUE, periodic_commit = 1000, as = "csv", type = c("row", "graph"), output = c("r", "json"), include_stats = TRUE, include_meta = FALSE)

#LOAD CSV WITH HEADERS FROM "airlines.csv" AS population
#CREATE (a:Population {country: population.country, year: toInt(population.year), population: toInt(population.population)})
#RETURN a 

#LOAD CSV FROM "airports.csv" AS line
#CREATE (b:Cases { country: line[0], year: toInt(line[1]), gender: line[2], cases: toInt(line[3])})
#RETURN b

6.1.1 Using RNeo4j

#setting up the database instance
#neo4j = startGraph("http://localhost:7474/db/data/","neo4j","admin")
neo4j = startGraph("bolt://hobby-nckncgjedgejgbkedmkbnccl.dbs.graphenedb.com:24787", #"https://hobby-nckncgjedgejgbkedmkbnccl.dbs.graphenedb.com:24780/db/data/", 
                   username = "root", 
                   password = "b.LVqKHBbkgC9I.NjBLWxHmLfTT0cji")

#Delete the previous relationships and nodes
clear(neo4j, input = FALSE) 
cypher(neo4j,"MATCH ()-[r:Departs]-() delete r")
cypher(neo4j,"MATCH ()-[r:Arrival]-() delete r")
cypher(neo4j,"match(n) delete n")
#cypher(neo4j,"match(n:airline) delete n")
# For airlines
for(i in 1:nrow(airlines)) {
  cypher(neo4j,paste0("create(carrier:airline{code:'",airlines[i,1],"',name:'",airlines[i,2],"'})",collapse = ""))
}

# For origin
#cypher(neo4j,"match(n:origin) delete n")
origin_codes <- unique(flightfact_mysql$origin) 
for(i in 1:length(origin_codes)) {
  cypher(neo4j,paste0("create(location:origin{origin_code:'",origin_codes[i],"'})",collapse = ""))
}

# For dest
#cypher(neo4j,"match(n:dest) delete n")
dest_codes <- unique(flightfact_mysql$dest)
for(i in 1:length(dest_codes)) {
  cypher(neo4j,paste0("create(location:dest{dest_code:'",dest_codes[i],"'})",collapse = ""))
}

# For depart
#cypher(neo4j,"MATCH ()-[r:Departs]-() delete r")
depart <- unique(flightfact_mysql[,c("origin","carrier")])
for(i in 1:nrow(depart)) {
  cypher(neo4j,paste0("match (air:airline{code:'",depart[i,2],"'}),(org:origin{origin_code:'",depart[i,1],"'}) create (air)-[r:Departs]->(org)"))
}

# For arrival
#cypher(neo4j,"MATCH ()-[r:Arrival]-() delete r")
arrival <- unique(flightfact_mysql[,c("dest","carrier")])
for(i in 1:nrow(arrival)) {
  cypher(neo4j,paste0("match (air:airline{code:'",arrival[i,2],"'}),(dest:dest{dest_code:'",arrival[i,1],"'}) create (air)-[r:Arrival]->(dest)"))
}

6.1.2 Using neo4r

#neo4jConn <- neo4j_api$new(url = "bolt://fuchsia-lakin-mountain-hermann.graphstory.cloud:7687", 
#                           user = "fuchsia_lakin_mountain_hermann", password = "vi4YzVJAt5MbSDPCp7vriTnv")
neoConn <- neo4j_api$new(url = "https://hobby-nckncgjedgejgbkedmkbnccl.dbs.graphenedb.com:24780/db/data/", 
                         user = "root", password = "b.LVqKHBbkgC9I.NjBLWxHmLfTT0cji")
#neoConn$get_version()
#call_neo4j("MATCH (n) DETACH DELETE n", neoConn)
cypher_query <- " "
#Now load the Airlines data dataframe into Neo4J graph Cypher query
for(i in 1:nrow(airlines)) {
  cypher_query <- paste(cypher_query, "CREATE", vec_to_cypher(airlines[i, ], "Airlines"), " ")
}
cypher_query
call_neo4j(paste(cypher_query,";"), neoConn)
cypher_query <- " "
#Now load the Airports data dataframe into Neo4J graph Cypher query
for(i in 1:nrow(airports)) {
  cypher_query <- paste(cypher_query, "CREATE", vec_to_cypher(airports[i, ], "Airports"), " ")
}
cypher_query
call_neo4j(paste(cypher_query,";"), neoConn)
cypher_query <- " "
#Now load the Airports data dataframe into Neo4J graph Cypher query
for(i in 1:nrow(planes)) {
  cypher_query <- paste(cypher_query, "CREATE", vec_to_cypher(planes[i, ], "Planes"), " ")
}
cypher_query
call_neo4j(paste(cypher_query,";"), neoConn)
cypher_query <- " "
#Now load the Airports data dataframe into Neo4J graph Cypher query
for(i in 1:nrow(flights)) {
  cypher_query <- paste(cypher_query, "CREATE", vec_to_cypher(flights[i, ], "Flights"), " ")
}
cypher_query
call_neo4j(paste(cypher_query,";"), neoConn)
call_neo4j("MATCH (a:Flights), (b:Airlines) WHERE (a.carrier) = (b.carrier) CREATE (a) -[:Carrier]-> (b);", neo4jConn)

6.2 Observations of Neo4J

  • Using cloud installation is easy with GrapheneDB and creating a database was simple.
  • The data is seen in visual format. So it is easy to think more about data.
  • It is done in step by step. So we need to understand the data more.
  • Relationships play a major role. It gives meaning to the nodes.
  • The grphical interface is very nice. But it becomes messy if you have more data and nodes.
  • Fetching the required data is very quick and fast.
  • Need to learn new query language called CYPER. Although it is easy for smaller operations.

7 Difference between the databases

Relational DB - MySQL NoSQL DB - MongoDB
RDBMS Document-oriented
Relational database schema is strictly well defined. NoSQL database doesn’t have a fixed schema.
Data are stored in tables as columns and rows. Data are stored in collections with fields as key - value pairs under document.
Mature: Established database - backed with a huge community - somehow stable. Scalable: Horizontally scalable reducing the workload and scaling business with ease.
Row size cannot exceed 64KB, which is shared among all columns. Max document size is 16MB.
Joins are used to combine data from two or more tables NoSQL databases do not use joins. Collections join is implemented at application level.
Object names are not case-insensitive. Object names are case-sensitive, even on case-insensitive file systems.
Row nesting is not supported. Document nesting is allowed, maximum 100 documents.
Relational databases are ACID compliant and is implemented using transactions. NoSQL databases are not ACID compliant, transaction safety is implemented using CAP theorem.
Tables can have upto 1000 columns and fit into 64KB. No limit on key-value pairs, as long as it fits into 16MB
A maximum of 16 columns is permitted for compound index. Maximum 31 fields in a compound index.
Slow performance for high volume High-performance for high volume queries
DBName MySQL MongoDB Neo4J
Description Most widely used database Popular nosql database Getting popular
Type RDBMS Document Store Graph
Transaction ACID No ACID ACID
In-memory Yes Yes No
Application scenarios Traditional structure API’s and IOT Real-time recommendations
Scalability Easy Easy Medium
Main character Zylo data JSON format Relationship

8 CleanUp

dbDisconnect(mySqlConn)
paste("Disconnected from MySQL on", date())

mongoNYCFlightsConn$drop()
mongoAirlinesConn$drop()
mongoFlightsConn$drop()
mongoFlightAirlinesConn$drop()