Goal

For Project 4, you should take information from a relational database and migrate it to a NoSQL database of your own choosing.

For the relational database, you might use the flights database, the tb database, the “data skills” database your team created for Project 3, or another database of your own choosing or creation.

For the NoSQL database, you may use MongoDB, Neo4j (which we introduce in Week 12), or another NoSQL database of your choosing.

Your migration process needs to be reproducible. R code is encouraged, but not required. You should also briefly describe the advantages and disadvantages of storing the data in a relational database vs. your NoSQL database.


Environment Setup

For project 4, I will take information from MySql databse and migrate it to MongoDB and will use the filghts databse.

Load required libraries
library(RMySQL)
## Loading required package: DBI
#install.packages("mongolite")
library(mongolite)
## Warning: package 'mongolite' was built under R version 3.3.2

Procedure

I will complete the project in 3 steps:

Step 1

In this step I will create a connectio with MySql, load database in r, convert the MySql tables to r data frames and finally close connection with r.

Create connection to MySql and load databse into r
flights_mysql <- dbConnect(MySQL(), user = 'root', password = '', dbname='flights', host = 'localhost')
Get the table names from the database
dbListTables(flights_mysql)
## [1] "airlines" "airports" "flights"  "planes"   "weather"
Convert MySql tables to r data frames
airlines_df <- "select * from airlines;"
airlines <- dbGetQuery(flights_mysql, airlines_df)

airports_df <- "select * from airports;"
airports <- dbGetQuery(flights_mysql, airports_df)

flights_df <- "select * from flights;"
flights <- dbGetQuery(flights_mysql, flights_df)

planes_df <- "select * from planes;"
planes <- dbGetQuery(flights_mysql, planes_df)

weather_df <- "select * from weather;"
weather <- dbGetQuery(flights_mysql, weather_df)

knitr::kable(head(airlines))
carrier name
9E Endeavor Air Inc.
AA American Airlines Inc.
AS Alaska Airlines Inc.
B6 JetBlue Airways
DL Delta Air Lines Inc.
EV ExpressJet Airlines Inc.
knitr::kable(head(airports))
faa name lat lon alt tz dst
04G Lansdowne Airport 41.13047 -80.61958 1044 -5 A
06A Moton Field Municipal Airport 32.46057 -85.68003 264 -5 A
06C Schaumburg Regional 41.98934 -88.10124 801 -6 A
06N Randall Airport 41.43191 -74.39156 523 -5 A
09J Jekyll Island Airport 31.07447 -81.42778 11 -4 A
0A9 Elizabethton Municipal Airport 36.37122 -82.17342 1593 -4 A
knitr::kable(head(flights))
year month day dep_time dep_delay arr_time arr_delay carrier tailnum flight origin dest air_time distance hour minute
2013 1 1 517 2 830 11 UA N14228 1545 EWR IAH 227 1400 5 17
2013 1 1 533 4 850 20 UA N24211 1714 LGA IAH 227 1416 5 33
2013 1 1 542 2 923 33 AA N619AA 1141 JFK MIA 160 1089 5 42
2013 1 1 544 -1 1004 -18 B6 N804JB 725 JFK BQN 183 1576 5 44
2013 1 1 554 -6 812 -25 DL N668DN 461 LGA ATL 116 762 6 54
2013 1 1 554 -4 740 12 UA N39463 1696 EWR ORD 150 719 6 54
knitr::kable(head(planes))
tailnum year type manufacturer model engines seats speed engine
N10156 2004 Fixed wing multi engine EMBRAER EMB-145XR 2 55 NA Turbo-fan
N102UW 1998 Fixed wing multi engine AIRBUS INDUSTRIE A320-214 2 182 NA Turbo-fan
N103US 1999 Fixed wing multi engine AIRBUS INDUSTRIE A320-214 2 182 NA Turbo-fan
N104UW 1999 Fixed wing multi engine AIRBUS INDUSTRIE A320-214 2 182 NA Turbo-fan
N10575 2002 Fixed wing multi engine EMBRAER EMB-145LR 2 55 NA Turbo-fan
N105UW 1999 Fixed wing multi engine AIRBUS INDUSTRIE A320-214 2 182 NA Turbo-fan
knitr::kable(head(weather))
origin year month day hour temp dewp humid wind_dir wind_speed wind_gust precip pressure visib
EWR 2013 1 1 0 37.04 21.92 53.97 230 10.35702 11.91865 0 1013.9 10
EWR 2013 1 1 1 37.04 21.92 53.97 230 13.80936 15.89154 0 1013.0 10
EWR 2013 1 1 2 37.94 21.92 52.09 230 12.65858 14.56724 0 1012.6 10
EWR 2013 1 1 3 37.94 23.00 54.51 230 13.80936 15.89154 0 1012.7 10
EWR 2013 1 1 4 37.94 24.08 57.04 240 14.96014 17.21583 0 1012.8 10
EWR 2013 1 1 6 39.02 26.06 59.37 270 10.35702 11.91865 0 1012.0 10
Close Connection with MySql
dbDisconnect(flights_mysql)
## [1] TRUE

Step 2

In this step I will load the data frames to MongoDB and create a databse. Important: For this and following steps MongoDB sever required to run in the background.

MongoDB databse
m_airlines  <- mongo(collection = "airlines", db= "flights", url = "mongodb://localhost")
m_airlines$drop()
## [1] TRUE
m_airlines$insert(airlines)
## 
Complete! Processed total of 16 rows.
## $nInserted
## [1] 16
## 
## $nMatched
## [1] 0
## 
## $nRemoved
## [1] 0
## 
## $nUpserted
## [1] 0
## 
## $writeErrors
## list()
m_airports  <- mongo(collection = "airports", db= "flights", url = "mongodb://localhost")
m_airports$drop()
## [1] TRUE
m_airports$insert(airports)
## 
Processed 1000 rows...
Complete! Processed total of 1397 rows.
## $nInserted
## [1] 1397
## 
## $nMatched
## [1] 0
## 
## $nRemoved
## [1] 0
## 
## $nUpserted
## [1] 0
## 
## $writeErrors
## list()
m_flights  <- mongo(collection = "flights", db= "flights", url = "mongodb://localhost")
m_flights$drop()
## [1] TRUE
m_flights$insert(flights)
## 
Processed 1000 rows...
Processed 2000 rows...
Processed 3000 rows...
Processed 4000 rows...
Processed 5000 rows...
Processed 6000 rows...
Processed 7000 rows...
Processed 8000 rows...
Processed 9000 rows...
Processed 10000 rows...
Processed 11000 rows...
Processed 12000 rows...
Processed 13000 rows...
Processed 14000 rows...
Processed 15000 rows...
Processed 16000 rows...
Processed 17000 rows...
Processed 18000 rows...
Processed 19000 rows...
Processed 20000 rows...
Processed 21000 rows...
Processed 22000 rows...
Processed 23000 rows...
Processed 24000 rows...
Processed 25000 rows...
Processed 26000 rows...
Processed 27000 rows...
Processed 28000 rows...
Processed 29000 rows...
Processed 30000 rows...
Processed 31000 rows...
Processed 32000 rows...
Processed 33000 rows...
Processed 34000 rows...
Processed 35000 rows...
Processed 36000 rows...
Processed 37000 rows...
Processed 38000 rows...
Processed 39000 rows...
Processed 40000 rows...
Processed 41000 rows...
Processed 42000 rows...
Processed 43000 rows...
Processed 44000 rows...
Processed 45000 rows...
Processed 46000 rows...
Processed 47000 rows...
Processed 48000 rows...
Processed 49000 rows...
Processed 50000 rows...
Processed 51000 rows...
Processed 52000 rows...
Processed 53000 rows...
Processed 54000 rows...
Processed 55000 rows...
Processed 56000 rows...
Processed 57000 rows...
Processed 58000 rows...
Processed 59000 rows...
Processed 60000 rows...
Processed 61000 rows...
Processed 62000 rows...
Processed 63000 rows...
Processed 64000 rows...
Processed 65000 rows...
Processed 66000 rows...
Processed 67000 rows...
Processed 68000 rows...
Processed 69000 rows...
Processed 70000 rows...
Processed 71000 rows...
Processed 72000 rows...
Processed 73000 rows...
Processed 74000 rows...
Processed 75000 rows...
Processed 76000 rows...
Processed 77000 rows...
Processed 78000 rows...
Processed 79000 rows...
Processed 80000 rows...
Processed 81000 rows...
Processed 82000 rows...
Processed 83000 rows...
Processed 84000 rows...
Processed 85000 rows...
Processed 86000 rows...
Processed 87000 rows...
Processed 88000 rows...
Processed 89000 rows...
Processed 90000 rows...
Processed 91000 rows...
Processed 92000 rows...
Processed 93000 rows...
Processed 94000 rows...
Processed 95000 rows...
Processed 96000 rows...
Processed 97000 rows...
Processed 98000 rows...
Processed 99000 rows...
Processed 1e+05 rows...
Processed 101000 rows...
Processed 102000 rows...
Processed 103000 rows...
Processed 104000 rows...
Processed 105000 rows...
Processed 106000 rows...
Processed 107000 rows...
Processed 108000 rows...
Processed 109000 rows...
Processed 110000 rows...
Processed 111000 rows...
Processed 112000 rows...
Processed 113000 rows...
Processed 114000 rows...
Processed 115000 rows...
Processed 116000 rows...
Processed 117000 rows...
Processed 118000 rows...
Processed 119000 rows...
Processed 120000 rows...
Processed 121000 rows...
Processed 122000 rows...
Processed 123000 rows...
Processed 124000 rows...
Processed 125000 rows...
Processed 126000 rows...
Processed 127000 rows...
Processed 128000 rows...
Processed 129000 rows...
Processed 130000 rows...
Processed 131000 rows...
Processed 132000 rows...
Processed 133000 rows...
Processed 134000 rows...
Processed 135000 rows...
Processed 136000 rows...
Processed 137000 rows...
Processed 138000 rows...
Processed 139000 rows...
Processed 140000 rows...
Processed 141000 rows...
Processed 142000 rows...
Processed 143000 rows...
Processed 144000 rows...
Processed 145000 rows...
Processed 146000 rows...
Processed 147000 rows...
Processed 148000 rows...
Processed 149000 rows...
Processed 150000 rows...
Processed 151000 rows...
Processed 152000 rows...
Processed 153000 rows...
Processed 154000 rows...
Processed 155000 rows...
Processed 156000 rows...
Processed 157000 rows...
Processed 158000 rows...
Processed 159000 rows...
Processed 160000 rows...
Processed 161000 rows...
Processed 162000 rows...
Processed 163000 rows...
Processed 164000 rows...
Processed 165000 rows...
Processed 166000 rows...
Processed 167000 rows...
Processed 168000 rows...
Processed 169000 rows...
Processed 170000 rows...
Processed 171000 rows...
Processed 172000 rows...
Processed 173000 rows...
Processed 174000 rows...
Processed 175000 rows...
Processed 176000 rows...
Processed 177000 rows...
Processed 178000 rows...
Processed 179000 rows...
Processed 180000 rows...
Processed 181000 rows...
Processed 182000 rows...
Processed 183000 rows...
Processed 184000 rows...
Processed 185000 rows...
Processed 186000 rows...
Processed 187000 rows...
Processed 188000 rows...
Processed 189000 rows...
Processed 190000 rows...
Processed 191000 rows...
Processed 192000 rows...
Processed 193000 rows...
Processed 194000 rows...
Processed 195000 rows...
Processed 196000 rows...
Processed 197000 rows...
Processed 198000 rows...
Processed 199000 rows...
Processed 2e+05 rows...
Processed 201000 rows...
Processed 202000 rows...
Processed 203000 rows...
Processed 204000 rows...
Processed 205000 rows...
Processed 206000 rows...
Processed 207000 rows...
Processed 208000 rows...
Processed 209000 rows...
Processed 210000 rows...
Processed 211000 rows...
Processed 212000 rows...
Processed 213000 rows...
Processed 214000 rows...
Processed 215000 rows...
Processed 216000 rows...
Processed 217000 rows...
Processed 218000 rows...
Processed 219000 rows...
Processed 220000 rows...
Processed 221000 rows...
Processed 222000 rows...
Processed 223000 rows...
Processed 224000 rows...
Processed 225000 rows...
Processed 226000 rows...
Processed 227000 rows...
Processed 228000 rows...
Processed 229000 rows...
Processed 230000 rows...
Processed 231000 rows...
Processed 232000 rows...
Processed 233000 rows...
Processed 234000 rows...
Processed 235000 rows...
Processed 236000 rows...
Processed 237000 rows...
Processed 238000 rows...
Processed 239000 rows...
Processed 240000 rows...
Processed 241000 rows...
Processed 242000 rows...
Processed 243000 rows...
Processed 244000 rows...
Processed 245000 rows...
Processed 246000 rows...
Processed 247000 rows...
Processed 248000 rows...
Processed 249000 rows...
Processed 250000 rows...
Processed 251000 rows...
Processed 252000 rows...
Processed 253000 rows...
Processed 254000 rows...
Processed 255000 rows...
Processed 256000 rows...
Processed 257000 rows...
Processed 258000 rows...
Processed 259000 rows...
Processed 260000 rows...
Processed 261000 rows...
Processed 262000 rows...
Processed 263000 rows...
Processed 264000 rows...
Processed 265000 rows...
Processed 266000 rows...
Processed 267000 rows...
Processed 268000 rows...
Processed 269000 rows...
Processed 270000 rows...
Processed 271000 rows...
Processed 272000 rows...
Processed 273000 rows...
Processed 274000 rows...
Processed 275000 rows...
Processed 276000 rows...
Processed 277000 rows...
Processed 278000 rows...
Processed 279000 rows...
Processed 280000 rows...
Processed 281000 rows...
Processed 282000 rows...
Processed 283000 rows...
Processed 284000 rows...
Processed 285000 rows...
Processed 286000 rows...
Processed 287000 rows...
Processed 288000 rows...
Processed 289000 rows...
Processed 290000 rows...
Processed 291000 rows...
Processed 292000 rows...
Processed 293000 rows...
Processed 294000 rows...
Processed 295000 rows...
Processed 296000 rows...
Processed 297000 rows...
Processed 298000 rows...
Processed 299000 rows...
Processed 3e+05 rows...
Processed 301000 rows...
Processed 302000 rows...
Processed 303000 rows...
Processed 304000 rows...
Processed 305000 rows...
Processed 306000 rows...
Processed 307000 rows...
Processed 308000 rows...
Processed 309000 rows...
Processed 310000 rows...
Processed 311000 rows...
Processed 312000 rows...
Processed 313000 rows...
Processed 314000 rows...
Processed 315000 rows...
Processed 316000 rows...
Processed 317000 rows...
Processed 318000 rows...
Processed 319000 rows...
Processed 320000 rows...
Processed 321000 rows...
Processed 322000 rows...
Processed 323000 rows...
Processed 324000 rows...
Processed 325000 rows...
Processed 326000 rows...
Processed 327000 rows...
Processed 328000 rows...
Processed 329000 rows...
Processed 330000 rows...
Processed 331000 rows...
Processed 332000 rows...
Processed 333000 rows...
Processed 334000 rows...
Processed 335000 rows...
Processed 336000 rows...
Complete! Processed total of 336776 rows.
## $nInserted
## [1] 336776
## 
## $nMatched
## [1] 0
## 
## $nRemoved
## [1] 0
## 
## $nUpserted
## [1] 0
## 
## $writeErrors
## list()
m_planes  <- mongo(collection = "planes", db= "flights", url = "mongodb://localhost")
m_planes$drop()
## [1] TRUE
m_planes$insert(planes)
## 
Processed 1000 rows...
Processed 2000 rows...
Processed 3000 rows...
Complete! Processed total of 3322 rows.
## $nInserted
## [1] 3322
## 
## $nMatched
## [1] 0
## 
## $nRemoved
## [1] 0
## 
## $nUpserted
## [1] 0
## 
## $writeErrors
## list()
m_weather  <- mongo(collection = "weather", db= "flights", url = "mongodb://localhost")
m_weather$drop()
## [1] TRUE
m_weather$insert(weather)
## 
Processed 1000 rows...
Processed 2000 rows...
Processed 3000 rows...
Processed 4000 rows...
Processed 5000 rows...
Processed 6000 rows...
Processed 7000 rows...
Processed 8000 rows...
Complete! Processed total of 8719 rows.
## $nInserted
## [1] 8719
## 
## $nMatched
## [1] 0
## 
## $nRemoved
## [1] 0
## 
## $nUpserted
## [1] 0
## 
## $writeErrors
## list()

Step 3

In this step I will test the data to make sure data was not compromised during the relational to NoSQL migration. I will conduct two tests, in my first test I will make sure the numbers of rows in MySql are the same as the numbers of documents in MongoDB. In my second test I will make sure both database has same dimensions and load the data from MongoBD to r and create data frames to see if they are the same as the data frames I have created in step 1 from MySql.

Test 1
#Numbers of documents MongoDB m_airlines in and numbers on rows in airlines data frame
m_airlines$count()
## [1] 16
nrow(airlines)
## [1] 16
##Numbers of documents MongoDB in m_airports and numbers on rows in airports data frame
m_airports$count()
## [1] 1397
nrow(airports)
## [1] 1397
#Numbers of documents MongoDB in m_flights and numbers on rows in flights data frame
m_flights$count()
## [1] 336776
nrow(flights)
## [1] 336776
#Numbers of documents MongoDB in m_planes and numbers on rows in planes data frame
m_planes$count()
## [1] 3322
nrow(planes)
## [1] 3322
#Numbers of documents MongoDB in m_weather and numbers on rows in weather data frame
m_weather$count()
## [1] 8719
nrow(weather)
## [1] 8719

I have exact same numbers of documents and rows for collection and table containing the same data.

Test 2
mongo_airlines <- m_airlines$find()
## 
 Found 16 records...
 Imported 16 records. Simplifying into dataframe...
##mongo_airlines,  airlines data frame
dim(mongo_airlines) == dim(airlines)
## [1] TRUE TRUE
mongo_airports <- m_airports$find()
## 
 Found 1000 records...
 Found 1397 records...
 Imported 1397 records. Simplifying into dataframe...
##mongo_airports,  airports data frame
dim(mongo_airports) == dim(airports)
## [1] TRUE TRUE
mongo_flights <- m_flights$find()
## 
 Found 1000 records...
 Found 2000 records...
 Found 3000 records...
 Found 4000 records...
 Found 5000 records...
 Found 6000 records...
 Found 7000 records...
 Found 8000 records...
 Found 9000 records...
 Found 10000 records...
 Found 11000 records...
 Found 12000 records...
 Found 13000 records...
 Found 14000 records...
 Found 15000 records...
 Found 16000 records...
 Found 17000 records...
 Found 18000 records...
 Found 19000 records...
 Found 20000 records...
 Found 21000 records...
 Found 22000 records...
 Found 23000 records...
 Found 24000 records...
 Found 25000 records...
 Found 26000 records...
 Found 27000 records...
 Found 28000 records...
 Found 29000 records...
 Found 30000 records...
 Found 31000 records...
 Found 32000 records...
 Found 33000 records...
 Found 34000 records...
 Found 35000 records...
 Found 36000 records...
 Found 37000 records...
 Found 38000 records...
 Found 39000 records...
 Found 40000 records...
 Found 41000 records...
 Found 42000 records...
 Found 43000 records...
 Found 44000 records...
 Found 45000 records...
 Found 46000 records...
 Found 47000 records...
 Found 48000 records...
 Found 49000 records...
 Found 50000 records...
 Found 51000 records...
 Found 52000 records...
 Found 53000 records...
 Found 54000 records...
 Found 55000 records...
 Found 56000 records...
 Found 57000 records...
 Found 58000 records...
 Found 59000 records...
 Found 60000 records...
 Found 61000 records...
 Found 62000 records...
 Found 63000 records...
 Found 64000 records...
 Found 65000 records...
 Found 66000 records...
 Found 67000 records...
 Found 68000 records...
 Found 69000 records...
 Found 70000 records...
 Found 71000 records...
 Found 72000 records...
 Found 73000 records...
 Found 74000 records...
 Found 75000 records...
 Found 76000 records...
 Found 77000 records...
 Found 78000 records...
 Found 79000 records...
 Found 80000 records...
 Found 81000 records...
 Found 82000 records...
 Found 83000 records...
 Found 84000 records...
 Found 85000 records...
 Found 86000 records...
 Found 87000 records...
 Found 88000 records...
 Found 89000 records...
 Found 90000 records...
 Found 91000 records...
 Found 92000 records...
 Found 93000 records...
 Found 94000 records...
 Found 95000 records...
 Found 96000 records...
 Found 97000 records...
 Found 98000 records...
 Found 99000 records...
 Found 1e+05 records...
 Found 101000 records...
 Found 102000 records...
 Found 103000 records...
 Found 104000 records...
 Found 105000 records...
 Found 106000 records...
 Found 107000 records...
 Found 108000 records...
 Found 109000 records...
 Found 110000 records...
 Found 111000 records...
 Found 112000 records...
 Found 113000 records...
 Found 114000 records...
 Found 115000 records...
 Found 116000 records...
 Found 117000 records...
 Found 118000 records...
 Found 119000 records...
 Found 120000 records...
 Found 121000 records...
 Found 122000 records...
 Found 123000 records...
 Found 124000 records...
 Found 125000 records...
 Found 126000 records...
 Found 127000 records...
 Found 128000 records...
 Found 129000 records...
 Found 130000 records...
 Found 131000 records...
 Found 132000 records...
 Found 133000 records...
 Found 134000 records...
 Found 135000 records...
 Found 136000 records...
 Found 137000 records...
 Found 138000 records...
 Found 139000 records...
 Found 140000 records...
 Found 141000 records...
 Found 142000 records...
 Found 143000 records...
 Found 144000 records...
 Found 145000 records...
 Found 146000 records...
 Found 147000 records...
 Found 148000 records...
 Found 149000 records...
 Found 150000 records...
 Found 151000 records...
 Found 152000 records...
 Found 153000 records...
 Found 154000 records...
 Found 155000 records...
 Found 156000 records...
 Found 157000 records...
 Found 158000 records...
 Found 159000 records...
 Found 160000 records...
 Found 161000 records...
 Found 162000 records...
 Found 163000 records...
 Found 164000 records...
 Found 165000 records...
 Found 166000 records...
 Found 167000 records...
 Found 168000 records...
 Found 169000 records...
 Found 170000 records...
 Found 171000 records...
 Found 172000 records...
 Found 173000 records...
 Found 174000 records...
 Found 175000 records...
 Found 176000 records...
 Found 177000 records...
 Found 178000 records...
 Found 179000 records...
 Found 180000 records...
 Found 181000 records...
 Found 182000 records...
 Found 183000 records...
 Found 184000 records...
 Found 185000 records...
 Found 186000 records...
 Found 187000 records...
 Found 188000 records...
 Found 189000 records...
 Found 190000 records...
 Found 191000 records...
 Found 192000 records...
 Found 193000 records...
 Found 194000 records...
 Found 195000 records...
 Found 196000 records...
 Found 197000 records...
 Found 198000 records...
 Found 199000 records...
 Found 2e+05 records...
 Found 201000 records...
 Found 202000 records...
 Found 203000 records...
 Found 204000 records...
 Found 205000 records...
 Found 206000 records...
 Found 207000 records...
 Found 208000 records...
 Found 209000 records...
 Found 210000 records...
 Found 211000 records...
 Found 212000 records...
 Found 213000 records...
 Found 214000 records...
 Found 215000 records...
 Found 216000 records...
 Found 217000 records...
 Found 218000 records...
 Found 219000 records...
 Found 220000 records...
 Found 221000 records...
 Found 222000 records...
 Found 223000 records...
 Found 224000 records...
 Found 225000 records...
 Found 226000 records...
 Found 227000 records...
 Found 228000 records...
 Found 229000 records...
 Found 230000 records...
 Found 231000 records...
 Found 232000 records...
 Found 233000 records...
 Found 234000 records...
 Found 235000 records...
 Found 236000 records...
 Found 237000 records...
 Found 238000 records...
 Found 239000 records...
 Found 240000 records...
 Found 241000 records...
 Found 242000 records...
 Found 243000 records...
 Found 244000 records...
 Found 245000 records...
 Found 246000 records...
 Found 247000 records...
 Found 248000 records...
 Found 249000 records...
 Found 250000 records...
 Found 251000 records...
 Found 252000 records...
 Found 253000 records...
 Found 254000 records...
 Found 255000 records...
 Found 256000 records...
 Found 257000 records...
 Found 258000 records...
 Found 259000 records...
 Found 260000 records...
 Found 261000 records...
 Found 262000 records...
 Found 263000 records...
 Found 264000 records...
 Found 265000 records...
 Found 266000 records...
 Found 267000 records...
 Found 268000 records...
 Found 269000 records...
 Found 270000 records...
 Found 271000 records...
 Found 272000 records...
 Found 273000 records...
 Found 274000 records...
 Found 275000 records...
 Found 276000 records...
 Found 277000 records...
 Found 278000 records...
 Found 279000 records...
 Found 280000 records...
 Found 281000 records...
 Found 282000 records...
 Found 283000 records...
 Found 284000 records...
 Found 285000 records...
 Found 286000 records...
 Found 287000 records...
 Found 288000 records...
 Found 289000 records...
 Found 290000 records...
 Found 291000 records...
 Found 292000 records...
 Found 293000 records...
 Found 294000 records...
 Found 295000 records...
 Found 296000 records...
 Found 297000 records...
 Found 298000 records...
 Found 299000 records...
 Found 3e+05 records...
 Found 301000 records...
 Found 302000 records...
 Found 303000 records...
 Found 304000 records...
 Found 305000 records...
 Found 306000 records...
 Found 307000 records...
 Found 308000 records...
 Found 309000 records...
 Found 310000 records...
 Found 311000 records...
 Found 312000 records...
 Found 313000 records...
 Found 314000 records...
 Found 315000 records...
 Found 316000 records...
 Found 317000 records...
 Found 318000 records...
 Found 319000 records...
 Found 320000 records...
 Found 321000 records...
 Found 322000 records...
 Found 323000 records...
 Found 324000 records...
 Found 325000 records...
 Found 326000 records...
 Found 327000 records...
 Found 328000 records...
 Found 329000 records...
 Found 330000 records...
 Found 331000 records...
 Found 332000 records...
 Found 333000 records...
 Found 334000 records...
 Found 335000 records...
 Found 336000 records...
 Found 336776 records...
 Imported 336776 records. Simplifying into dataframe...
##mongo_flights,  flights data frame
dim(mongo_flights) == dim(flights)
## [1] TRUE TRUE
mongo_planes <- m_planes$find()
## 
 Found 1000 records...
 Found 2000 records...
 Found 3000 records...
 Found 3322 records...
 Imported 3322 records. Simplifying into dataframe...
##mongo_planes,  planes data frame
dim(mongo_planes) == dim(planes)
## [1] TRUE TRUE
mongo_weather <- m_weather$find()
## 
 Found 1000 records...
 Found 2000 records...
 Found 3000 records...
 Found 4000 records...
 Found 5000 records...
 Found 6000 records...
 Found 7000 records...
 Found 8000 records...
 Found 8719 records...
 Imported 8719 records. Simplifying into dataframe...
##mongo_weather,  weather data frame
dim(mongo_weather) == dim(weather)
## [1] TRUE TRUE
knitr::kable(head(mongo_airlines))
carrier name
9E Endeavor Air Inc.
AA American Airlines Inc.
AS Alaska Airlines Inc.
B6 JetBlue Airways
DL Delta Air Lines Inc.
EV ExpressJet Airlines Inc.
knitr::kable(head(mongo_airports))
faa name lat lon alt tz dst
04G Lansdowne Airport 41.13047 -80.61958 1044 -5 A
06A Moton Field Municipal Airport 32.46057 -85.68003 264 -5 A
06C Schaumburg Regional 41.98934 -88.10124 801 -6 A
06N Randall Airport 41.43191 -74.39156 523 -5 A
09J Jekyll Island Airport 31.07447 -81.42778 11 -4 A
0A9 Elizabethton Municipal Airport 36.37122 -82.17342 1593 -4 A
knitr::kable(head(mongo_flights))
year month day dep_time dep_delay arr_time arr_delay carrier tailnum flight origin dest air_time distance hour minute
2013 1 1 517 2 830 11 UA N14228 1545 EWR IAH 227 1400 5 17
2013 1 1 533 4 850 20 UA N24211 1714 LGA IAH 227 1416 5 33
2013 1 1 542 2 923 33 AA N619AA 1141 JFK MIA 160 1089 5 42
2013 1 1 544 -1 1004 -18 B6 N804JB 725 JFK BQN 183 1576 5 44
2013 1 1 554 -6 812 -25 DL N668DN 461 LGA ATL 116 762 6 54
2013 1 1 554 -4 740 12 UA N39463 1696 EWR ORD 150 719 6 54
knitr::kable(head(mongo_planes))
tailnum year type manufacturer model engines seats engine speed
N10156 2004 Fixed wing multi engine EMBRAER EMB-145XR 2 55 Turbo-fan NA
N102UW 1998 Fixed wing multi engine AIRBUS INDUSTRIE A320-214 2 182 Turbo-fan NA
N103US 1999 Fixed wing multi engine AIRBUS INDUSTRIE A320-214 2 182 Turbo-fan NA
N104UW 1999 Fixed wing multi engine AIRBUS INDUSTRIE A320-214 2 182 Turbo-fan NA
N10575 2002 Fixed wing multi engine EMBRAER EMB-145LR 2 55 Turbo-fan NA
N105UW 1999 Fixed wing multi engine AIRBUS INDUSTRIE A320-214 2 182 Turbo-fan NA
knitr::kable(head(mongo_weather))
origin year month day hour temp dewp humid wind_dir wind_speed wind_gust precip pressure visib
EWR 2013 1 1 0 37.04 21.92 53.97 230 10.35702 11.91865 0 1013.9 10
EWR 2013 1 1 1 37.04 21.92 53.97 230 13.80936 15.89154 0 1013.0 10
EWR 2013 1 1 2 37.94 21.92 52.09 230 12.65858 14.56724 0 1012.6 10
EWR 2013 1 1 3 37.94 23.00 54.51 230 13.80936 15.89154 0 1012.7 10
EWR 2013 1 1 4 37.94 24.08 57.04 240 14.96014 17.21583 0 1012.8 10
EWR 2013 1 1 6 39.02 26.06 59.37 270 10.35702 11.91865 0 1012.0 10

All five data frames from both MySql and MongoDB are exactly the same.

Conclusion

From the tests above I can conclude that the migration of flights databse from MySql to MongoDB was successful.

Advantages and disadvantages of storing the data in a relational database vs MongoDB

MongoDB falls under the “Document” category of NoSQL databse. It uses dynamic-schema so it is more flexible to set up. MongoDB was designed with high availability and scalability in mind. It supports replica sets which makes the secondary server primary automatically without any human intervention if the primary server goes down. MongoDB enables horizontal scalability by using a technique called sharding. Sharding distributes the data across physical partitions to overcome the hardware limitations. The data is automatically balanced in the clusters.

Data size in MongoDB is usually larger due to each document has field names that stored it, it is less flexible with querying because there is no join function and it has no tansaction support.

References