Introduction

For Project 4, you should take information from a relational database and migrate it to a NoSQL database of your own choosing. For the relational database, you might use the flights database, the tb database, the “data skills” database your team created for Project 3, or another database of your own choosing or creation. For the NoSQL database, you may use MongoDB, Neo4j (which we introduce in Week 12), or another NoSQL database of your choosing. Your migration process needs to be reproducible. R code is encouraged, but not required. You should also briefly describe the advantages and disadvantages of storing the data in a relational database vs. your NoSQL database.
I will be using the ‘nycflights13’ database that was used for a previous assignment. My first step is to use the table import wizard to load the flights tables into MySQL. Once that is complete, I will export them into MongoDB.
library(nycflights13)
library(RMySQL)
## Loading required package: DBI
dbConnect(MySQL(), user="komotunde", password="N!cole09", 
    dbname="nycflights", host="localhost",client.flag=CLIENT_MULTI_STATEMENTS)
## <MySQLConnection:0,0>
#connect to MySQL
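
The connection above is not assigned to a variable and the password is written directly in the script. A minimal sketch of one alternative follows. It assumes the credentials are stored in environment variables named MYSQL_USER and MYSQL_PASSWORD (hypothetical names) and that the MySQL schema contains a table called airlines (an assumption, since the import step does not list table names). The helper read_mysql_table is illustrative and not part of any package.

```r
# Hypothetical helper: pull one table from MySQL into a data frame.
# Assumes the DBI and RMySQL packages are installed; the environment
# variable names are placeholders, not something used elsewhere here.
read_mysql_table <- function(table,
                             dbname = "nycflights",
                             host = "localhost") {
  con <- DBI::dbConnect(
    RMySQL::MySQL(),
    user = Sys.getenv("MYSQL_USER"),
    password = Sys.getenv("MYSQL_PASSWORD"),
    dbname = dbname,
    host = host
  )
  # Close the connection when the function exits, even on error.
  on.exit(DBI::dbDisconnect(con), add = TRUE)
  DBI::dbReadTable(con, table)
}

# Usage (requires a running MySQL server):
# airlines_sql <- read_mysql_table("airlines")
```

Reading the table this way would let the migration start from MySQL itself rather than from a separate CSV export.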

I had a very difficult time with this next part. I had initially planned to use the RMongo package because I could not find rmongodb on CRAN. Using RMongo would not have been an issue, but I could not find much documentation on how to use it. I was lucky to find the following code, which installs rmongodb from GitHub instead.

library(devtools)
install_github(repo = "mongosoup/rmongodb") #installs the rmongodb package
## Skipping install of 'rmongodb' from a github remote, the SHA1 (8eb2bca2) has not changed since last install.
##   Use `force = TRUE` to force installation

Now to connect to Mongo

library(rmongodb)
mongo <- mongo.create(host = "localhost")
mongo.is.connected(mongo)
## [1] TRUE
#connect to Mongo

db.1 <- "test.nycflights"
newmongo.db1 <- "test.nycflights.airlines"
mongo.get.database.collections(mongo, db.1)
## character(0)
#create database and collection names

Now that we are connected to both of our databases and have loaded our packages, we can start working. I will be working with the flights and airlines CSV files.

library(RCurl)
## Loading required package: bitops
airlines <- getURL("https://raw.githubusercontent.com/komotunde/DATA607/master/Project4/airlines.csv")
airlines <- read.csv(text = airlines)
head(airlines)
##   carrier                     name
## 1      9E        Endeavor Air Inc.
## 2      AA   American Airlines Inc.
## 3      AS     Alaska Airlines Inc.
## 4      B6          JetBlue Airways
## 5      DL     Delta Air Lines Inc.
## 6      EV ExpressJet Airlines Inc.
library(jsonlite)
airlines1 <- lapply(split(airlines, 1:nrow(airlines)), function(x)mongo.bson.from.JSON(toJSON(x)))
airlines1[1:3]
## $`1`
##  1 : 3    
##      carrier : 2      9E
##      name : 2     Endeavor Air Inc.
## 
## 
## $`2`
##  1 : 3    
##      carrier : 2      AA
##      name : 2     American Airlines Inc.
## 
## 
## $`3`
##  1 : 3    
##      carrier : 2      AS
##      name : 2     Alaska Airlines Inc.

Now to insert our data into Mongo

mongo.insert.batch(mongo, newmongo.db1, airlines1)
## [1] TRUE

Now to check and see if it loaded

mongo.count(mongo, newmongo.db1, query = '{"carrier":"9E"}') 
## [1] 0
#at this point I am not returning any values.

mongo.count(mongo, newmongo.db1, query = '{"name":"Envoy Air"}') 
## [1] 0
#neither of these returned anything so I will go back and make some changes.
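
One possible explanation for the zero counts, offered as a hypothesis rather than a confirmed diagnosis: in the BSON printed earlier, carrier and name sit under a top-level key named 1 (BSON type 3, an embedded document) instead of at the top level, so a query on a top-level carrier or name field would find nothing. This nesting is consistent with how jsonlite serializes data frames, which the sketch below illustrates with jsonlite alone. The one-row data frame row1 is a stand-in for one element of the split airlines data.

```r
library(jsonlite)

# A one-row data frame standing in for one element of split(airlines, ...).
row1 <- data.frame(carrier = "9E", name = "Endeavor Air Inc.",
                   stringsAsFactors = FALSE)

# toJSON() on a data frame serializes it as an array of row objects,
# so even a single row is wrapped in [ ... ].
as_array <- toJSON(row1)

# unbox() marks a one-row data frame as a singleton, so toJSON()
# emits a bare object whose top-level keys are carrier and name.
as_object <- toJSON(unbox(row1))

print(as_array)
print(as_object)
```

If this is the cause, building each document from toJSON(unbox(x)) instead of toJSON(x) should place carrier and name at the top level, where the two queries above look for them. This has not been checked against a live MongoDB instance here.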