Assignment:-
For this assignment, you should take information from a relational database and migrate it to a NoSQL database of your own choosing.

For the relational database, you might use the flights database, the tb database, the “data skills” database your team created for Project 3, or another database of your own choosing or creation.

For the NoSQL database, you may use MongoDB (which we introduced in week 7), Neo4j, or another NoSQL database of your choosing.

Your migration process needs to be reproducible. R code is encouraged, but not required. You should also briefly describe the advantages and disadvantages of storing the data in a relational database vs. your NoSQL database.

I came across NYCflights package and it has flights related data.
This package contains information about all flights that departed from NYC (i.e., EWR, JFK and LGA) in 2013: 336,776 flights with 16 variables. To help understand what causes delays, it also includes a number of other useful datasets: weather, planes, airports, airlines.

I loaded the data into mySQL in my local and will read the data and from it into dataframes and load them to Mongodb.

#library(nycflights13)

library(mongolite)
library(jsonlite)
library(RMySQL)
## Loading required package: DBI
library(sys)
#Importing data into dataframes

mydb = dbConnect(MySQL(), user=Sys.getenv("userid"), password=Sys.getenv("pwd"), dbname='week13-flights', host='localhost')
dbListTables(mydb)
## [1] "airlines" "airports" "flights"  "planes"   "weather"
airlines <- dbReadTable(mydb, "airlines")
head(airlines)
##   carrier                     name
## 1      9E        Endeavor Air Inc.
## 2      AA   American Airlines Inc.
## 3      AS     Alaska Airlines Inc.
## 4      B6          JetBlue Airways
## 5      DL     Delta Air Lines Inc.
## 6      EV ExpressJet Airlines Inc.
airports <- dbReadTable(mydb, "airports")
head(airports)
##   faa                           name      lat       lon  alt tz dst
## 1 04G              Lansdowne Airport 41.13047 -80.61958 1044 -5   A
## 2 06A  Moton Field Municipal Airport 32.46057 -85.68003  264 -6   A
## 3 06C            Schaumburg Regional 41.98934 -88.10124  801 -6   A
## 4 06N                Randall Airport 41.43191 -74.39156  523 -5   A
## 5 09J          Jekyll Island Airport 31.07447 -81.42778   11 -5   A
## 6 0A9 Elizabethton Municipal Airport 36.37122 -82.17342 1593 -5   A
##              tzone
## 1 America/New_York
## 2  America/Chicago
## 3  America/Chicago
## 4 America/New_York
## 5 America/New_York
## 6 America/New_York
flights <- dbReadTable(mydb, "flights")
head(flights)
##   year month day dep_time sched_dep_time dep_delay arr_time sched_arr_time
## 1 2013     1   1      517            515         2      830            819
## 2 2013     1   1      533            529         4      850            830
## 3 2013     1   1      542            540         2      923            850
## 4 2013     1   1      544            545        -1     1004           1022
## 5 2013     1   1      554            600        -6      812            837
## 6 2013     1   1      554            558        -4      740            728
##   arr_delay carrier flight tailnum origin dest air_time distance hour
## 1        11      UA   1545  N14228    EWR  IAH      227     1400    5
## 2        20      UA   1714  N24211    LGA  IAH      227     1416    5
## 3        33      AA   1141  N619AA    JFK  MIA      160     1089    5
## 4       -18      B6    725  N804JB    JFK  BQN      183     1576    5
## 5       -25      DL    461  N668DN    LGA  ATL      116      762    6
## 6        12      UA   1696  N39463    EWR  ORD      150      719    5
##   minute           time_hour
## 1     15 2013-01-01 05:00:00
## 2     29 2013-01-01 05:00:00
## 3     40 2013-01-01 05:00:00
## 4     45 2013-01-01 05:00:00
## 5      0 2013-01-01 06:00:00
## 6     58 2013-01-01 05:00:00
planes <- dbReadTable(mydb, "planes")
head(planes)
##   tailnum year                    type     manufacturer     model engines
## 1  N10156 2004 Fixed wing multi engine          EMBRAER EMB-145XR       2
## 2  N102UW 1998 Fixed wing multi engine AIRBUS INDUSTRIE  A320-214       2
## 3  N103US 1999 Fixed wing multi engine AIRBUS INDUSTRIE  A320-214       2
## 4  N104UW 1999 Fixed wing multi engine AIRBUS INDUSTRIE  A320-214       2
## 5  N10575 2002 Fixed wing multi engine          EMBRAER EMB-145LR       2
## 6  N105UW 1999 Fixed wing multi engine AIRBUS INDUSTRIE  A320-214       2
##   seats speed    engine
## 1    55    NA Turbo-fan
## 2   182    NA Turbo-fan
## 3   182    NA Turbo-fan
## 4   182    NA Turbo-fan
## 5    55    NA Turbo-fan
## 6   182    NA Turbo-fan
weather <- dbReadTable(mydb, "weather")
head(weather)
##   origin year month day hour  temp  dewp humid wind_dir wind_speed
## 1    EWR 2013     1   1    0 37.04 21.92 53.97      230   10.35702
## 2    EWR 2013     1   1    1 37.04 21.92 53.97      230   13.80936
## 3    EWR 2013     1   1    2 37.94 21.92 52.09      230   12.65858
## 4    EWR 2013     1   1    3 37.94 23.00 54.51      230   13.80936
## 5    EWR 2013     1   1    4 37.94 24.08 57.04      240   14.96014
## 6    EWR 2013     1   1    6 39.02 26.06 59.37      270   10.35702
##   wind_gust precip pressure visib           time_hour
## 1  11.91865      0   1013.9    10 2012-12-31 19:00:00
## 2  15.89154      0   1013.0    10 2012-12-31 20:00:00
## 3  14.56724      0   1012.6    10 2012-12-31 21:00:00
## 4  15.89154      0   1012.7    10 2012-12-31 22:00:00
## 5  17.21583      0   1012.8    10 2012-12-31 23:00:00
## 6  11.91865      0   1012.0    10 2013-01-01 01:00:00

Creating Data collection in mongo.
I have started mondo deamon on the machine prior to running this

mon_airlines <- mongo(collection = "airlines", db = "db")
mon_airlines$remove('{}') # Had to empty the collection to avoid duplicates as i ran the program multiple times
mon_airlines$insert(airlines)
## List of 5
##  $ nInserted  : num 16
##  $ nMatched   : num 0
##  $ nRemoved   : num 0
##  $ nUpserted  : num 0
##  $ writeErrors: list()
mon_airlines$count() # Check count to ensure record numbers
## [1] 16
mon_airports <- mongo(collection = "airports", db = "db")
mon_airports$remove('{}') # Had to empty the collection to avoid duplicates as i ran the program multiple times
mon_airports$insert(airports)
## List of 5
##  $ nInserted  : num 1458
##  $ nMatched   : num 0
##  $ nRemoved   : num 0
##  $ nUpserted  : num 0
##  $ writeErrors: list()
mon_airports$count() # Check count to ensure record numbers
## [1] 1458
mon_flights <- mongo(collection = "flights", db = "db")
mon_flights$remove('{}') # Had to empty the collection to avoid duplicates as i ran the program multiple times
mon_flights$insert(flights)
## List of 5
##  $ nInserted  : num 327346
##  $ nMatched   : num 0
##  $ nRemoved   : num 0
##  $ nUpserted  : num 0
##  $ writeErrors: list()
mon_flights$count() # Check count to ensure record numbers
## [1] 327346
mon_planes <- mongo(collection = "planes", db = "db")
mon_planes$remove('{}') # Had to empty the collection to avoid duplicates as i ran the program multiple times
mon_planes$insert(planes)
## List of 5
##  $ nInserted  : num 3252
##  $ nMatched   : num 0
##  $ nRemoved   : num 0
##  $ nUpserted  : num 0
##  $ writeErrors: list()
mon_planes$count() # Check count to ensure record numbers
## [1] 3252
mon_weather <- mongo(collection = "weather", db = "db")
mon_weather$remove('{}') # Had to empty the collection to avoid duplicates as i ran the program multiple times
mon_weather$insert(weather)
## List of 5
##  $ nInserted  : num 23021
##  $ nMatched   : num 0
##  $ nRemoved   : num 0
##  $ nUpserted  : num 0
##  $ writeErrors: list()
mon_weather$count() # Check count to ensure record numbers
## [1] 23021