Row

Setting Up MongoDB

This is the hardest part of the whole project. Making sure that you get mongo installed and set up right is pretty straight forward, but must be done in the right order and tested. I used Homebrew in Mac to install it via the command terminal. this can be done as follows:

Open a command window and download Homebrew

`/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"`  

You will need to update homebrew prior to installing MongoDB

`brew update`  

Install MongoDB using the Commandline Terminal

`brew install mongodb`  

Use Commandline Terminal to build a directory of MongoDB

You have options of where you choose to place this directory. I chose to install it to the default location data/db so that I (and consequently R) would not have to search high and low to find it.

To install in the default location, run:

`mkdir -p /data/db` or change the location of the directory to your desired location  

Activate MongoDB

In order for you to interact with the MongoDB instance, it must be running. This takes two steps, first you have to run the MongoDB Daemon in a command terminal window type mongod and then open a second command terminal window and type mongo which will activate the database to receive and send data.

Remember when you are through using MongoDB to first exit the mongo instance using quit() in the mongo terminal, then close the daemon in the mongod terminal by hitting control + c at the same time, making sure you are in the terminal window at the time you do so.

From SQL to MongoDB

Installing Packages and Loading SQL

Because I use SQLite to avoid annoying MY_SQL conflicts with my aging Mac system, you need to install RSQLite to access the flights data in the database. To save this data into MongoDB, you will also need to install mongolite.

install.packages("RSQLite")
install.packages("mongolite")

Load both packages, and set the directory to point R toward the SQLite database.

require(RSQLite)
require(mongolite)

Make a connection to the database using DBI package’s dbConnect making sure that you call the database by its correct name. I used “flights.sqlite” and checked the tables to be sure that they were all there.

Extracting data from SQL to send to MongoDB

setwd("/Volumes/Document_Drive/test")
db.flights = dbConnect(SQLite(), dbname="flights.sqlite")
dbListTables(db.flights)
[1] "airlines" "airports" "flights"  "planes"   "weather" 

With all the right tables showing up, I made a query to join two tables, airlines and flights, to get the total number of delays for each airline and the total delay time. I set this query equal to the variable toMongo so that I could pass this into the new NoSQL database.

I could just transfer SQL tables to MongoDB, but it defeats the purpose of a non-relational table, so I created what could be a nice bite of data for Mongo to chew on to show how to integrate R and MongoDB.

toMongo <- dbGetQuery(db.flights, "SELECT airlines.name as Airline, COUNT(*) as Delays, SUM(flights.arr_delay) as
Total_Delay FROM flights JOIN airlines ON flights.carrier=airlines.carrier WHERE flights.arr_delay>0 GROUP BY Airline ORDER BY Total_Delay DESC;")
dbDisconnect(db.flights)

Create MongoDB & Save Data

Creating MongoDB database in R is very simple, you just expressly call the connection using the variable it was saved into and assign it to a new database using the method mongo saving it to a new variable. Here the collection is toMongo from the last chunk, and we are saving it in a MongoDB data base named, flights.

Then you just insert this into your new database.

mongodata = mongo(collection = "test", db = "flights")
mongodata$insert(toMongo)
List of 5
 $ nInserted  : num 16
 $ nMatched   : num 0
 $ nRemoved   : num 0
 $ nUpserted  : num 0
 $ writeErrors: list()

You can see the data in the new database by calling it up using $find attribute of the connection.

mongodata$find()

Airline Delays Total_Delay
ExpressJet Airlines Inc. 24484 1181808
JetBlue Airways 23609 944574
United Air Lines Inc. 22222 814458
Delta Air Lines Inc. 16413 619485
Envoy Air 11693 442604
American Airlines Inc. 10706 409671
Endeavor Air Inc. 6637 327023
Southwest Airlines Co. 5304 216125
US Airways Inc. 7349 213206
AirTran Airways Corporation 1895 77874
Virgin America 1746 76557
Frontier Airlines Inc. 392 18651
Mesa Airlines Inc. 258 13179
Alaska Airlines Inc. 189 6495
Hawaiian Airlines Inc. 97 3398
SkyWest Airlines Inc. 10 606
ExpressJet Airlines Inc. 24484 1181808
JetBlue Airways 23609 944574
United Air Lines Inc. 22222 814458
Delta Air Lines Inc. 16413 619485
Envoy Air 11693 442604
American Airlines Inc. 10706 409671
Endeavor Air Inc. 6637 327023
Southwest Airlines Co. 5304 216125
US Airways Inc. 7349 213206
AirTran Airways Corporation 1895 77874
Virgin America 1746 76557
Frontier Airlines Inc. 392 18651
Mesa Airlines Inc. 258 13179
Alaska Airlines Inc. 189 6495
Hawaiian Airlines Inc. 97 3398
SkyWest Airlines Inc. 10 606
ExpressJet Airlines Inc. 24484 1181808
JetBlue Airways 23609 944574
United Air Lines Inc. 22222 814458
Delta Air Lines Inc. 16413 619485
Envoy Air 11693 442604
American Airlines Inc. 10706 409671
Endeavor Air Inc. 6637 327023
Southwest Airlines Co. 5304 216125
US Airways Inc. 7349 213206
AirTran Airways Corporation 1895 77874
Virgin America 1746 76557
Frontier Airlines Inc. 392 18651
Mesa Airlines Inc. 258 13179
Alaska Airlines Inc. 189 6495
Hawaiian Airlines Inc. 97 3398
SkyWest Airlines Inc. 10 606
ExpressJet Airlines Inc. 24484 1181808
JetBlue Airways 23609 944574
United Air Lines Inc. 22222 814458
Delta Air Lines Inc. 16413 619485
Envoy Air 11693 442604
American Airlines Inc. 10706 409671
Endeavor Air Inc. 6637 327023
Southwest Airlines Co. 5304 216125
US Airways Inc. 7349 213206
AirTran Airways Corporation 1895 77874
Virgin America 1746 76557
Frontier Airlines Inc. 392 18651
Mesa Airlines Inc. 258 13179
Alaska Airlines Inc. 189 6495
Hawaiian Airlines Inc. 97 3398
SkyWest Airlines Inc. 10 606
ExpressJet Airlines Inc. 24484 1181808
JetBlue Airways 23609 944574
United Air Lines Inc. 22222 814458
Delta Air Lines Inc. 16413 619485
Envoy Air 11693 442604
American Airlines Inc. 10706 409671
Endeavor Air Inc. 6637 327023
Southwest Airlines Co. 5304 216125
US Airways Inc. 7349 213206
AirTran Airways Corporation 1895 77874
Virgin America 1746 76557
Frontier Airlines Inc. 392 18651
Mesa Airlines Inc. 258 13179
Alaska Airlines Inc. 189 6495
Hawaiian Airlines Inc. 97 3398
SkyWest Airlines Inc. 10 606
ExpressJet Airlines Inc. 24484 1181808
JetBlue Airways 23609 944574
United Air Lines Inc. 22222 814458
Delta Air Lines Inc. 16413 619485
Envoy Air 11693 442604
American Airlines Inc. 10706 409671
Endeavor Air Inc. 6637 327023
Southwest Airlines Co. 5304 216125
US Airways Inc. 7349 213206
AirTran Airways Corporation 1895 77874
Virgin America 1746 76557
Frontier Airlines Inc. 392 18651
Mesa Airlines Inc. 258 13179
Alaska Airlines Inc. 189 6495
Hawaiian Airlines Inc. 97 3398
SkyWest Airlines Inc. 10 606
ExpressJet Airlines Inc. 24484 1181808
JetBlue Airways 23609 944574
United Air Lines Inc. 22222 814458
Delta Air Lines Inc. 16413 619485
Envoy Air 11693 442604
American Airlines Inc. 10706 409671
Endeavor Air Inc. 6637 327023
Southwest Airlines Co. 5304 216125
US Airways Inc. 7349 213206
AirTran Airways Corporation 1895 77874
Virgin America 1746 76557
Frontier Airlines Inc. 392 18651
Mesa Airlines Inc. 258 13179
Alaska Airlines Inc. 189 6495
Hawaiian Airlines Inc. 97 3398
SkyWest Airlines Inc. 10 606
ExpressJet Airlines Inc. 24484 1181808
JetBlue Airways 23609 944574
United Air Lines Inc. 22222 814458
Delta Air Lines Inc. 16413 619485
Envoy Air 11693 442604
American Airlines Inc. 10706 409671
Endeavor Air Inc. 6637 327023
Southwest Airlines Co. 5304 216125
US Airways Inc. 7349 213206
AirTran Airways Corporation 1895 77874
Virgin America 1746 76557
Frontier Airlines Inc. 392 18651
Mesa Airlines Inc. 258 13179
Alaska Airlines Inc. 189 6495
Hawaiian Airlines Inc. 97 3398
SkyWest Airlines Inc. 10 606
ExpressJet Airlines Inc. 24484 1181808
JetBlue Airways 23609 944574
United Air Lines Inc. 22222 814458
Delta Air Lines Inc. 16413 619485
Envoy Air 11693 442604
American Airlines Inc. 10706 409671
Endeavor Air Inc. 6637 327023
Southwest Airlines Co. 5304 216125
US Airways Inc. 7349 213206
AirTran Airways Corporation 1895 77874
Virgin America 1746 76557
Frontier Airlines Inc. 392 18651
Mesa Airlines Inc. 258 13179
Alaska Airlines Inc. 189 6495
Hawaiian Airlines Inc. 97 3398
SkyWest Airlines Inc. 10 606
ExpressJet Airlines Inc. 24484 1181808
JetBlue Airways 23609 944574
United Air Lines Inc. 22222 814458
Delta Air Lines Inc. 16413 619485
Envoy Air 11693 442604
American Airlines Inc. 10706 409671
Endeavor Air Inc. 6637 327023
Southwest Airlines Co. 5304 216125
US Airways Inc. 7349 213206
AirTran Airways Corporation 1895 77874
Virgin America 1746 76557
Frontier Airlines Inc. 392 18651
Mesa Airlines Inc. 258 13179
Alaska Airlines Inc. 189 6495
Hawaiian Airlines Inc. 97 3398
SkyWest Airlines Inc. 10 606
ExpressJet Airlines Inc. 24484 1181808
JetBlue Airways 23609 944574
United Air Lines Inc. 22222 814458
Delta Air Lines Inc. 16413 619485
Envoy Air 11693 442604
American Airlines Inc. 10706 409671
Endeavor Air Inc. 6637 327023
Southwest Airlines Co. 5304 216125
US Airways Inc. 7349 213206
AirTran Airways Corporation 1895 77874
Virgin America 1746 76557
Frontier Airlines Inc. 392 18651
Mesa Airlines Inc. 258 13179
Alaska Airlines Inc. 189 6495
Hawaiian Airlines Inc. 97 3398
SkyWest Airlines Inc. 10 606
ExpressJet Airlines Inc. 24484 1181808
JetBlue Airways 23609 944574
United Air Lines Inc. 22222 814458
Delta Air Lines Inc. 16413 619485
Envoy Air 11693 442604
American Airlines Inc. 10706 409671
Endeavor Air Inc. 6637 327023
Southwest Airlines Co. 5304 216125
US Airways Inc. 7349 213206
AirTran Airways Corporation 1895 77874
Virgin America 1746 76557
Frontier Airlines Inc. 392 18651
Mesa Airlines Inc. 258 13179
Alaska Airlines Inc. 189 6495
Hawaiian Airlines Inc. 97 3398
SkyWest Airlines Inc. 10 606
ExpressJet Airlines Inc. 24484 1181808
JetBlue Airways 23609 944574
United Air Lines Inc. 22222 814458
Delta Air Lines Inc. 16413 619485
Envoy Air 11693 442604
American Airlines Inc. 10706 409671
Endeavor Air Inc. 6637 327023
Southwest Airlines Co. 5304 216125
US Airways Inc. 7349 213206
AirTran Airways Corporation 1895 77874
Virgin America 1746 76557
Frontier Airlines Inc. 392 18651
Mesa Airlines Inc. 258 13179
Alaska Airlines Inc. 189 6495
Hawaiian Airlines Inc. 97 3398
SkyWest Airlines Inc. 10 606
ExpressJet Airlines Inc. 24484 1181808
JetBlue Airways 23609 944574
United Air Lines Inc. 22222 814458
Delta Air Lines Inc. 16413 619485
Envoy Air 11693 442604
American Airlines Inc. 10706 409671
Endeavor Air Inc. 6637 327023
Southwest Airlines Co. 5304 216125
US Airways Inc. 7349 213206
AirTran Airways Corporation 1895 77874
Virgin America 1746 76557
Frontier Airlines Inc. 392 18651
Mesa Airlines Inc. 258 13179
Alaska Airlines Inc. 189 6495
Hawaiian Airlines Inc. 97 3398
SkyWest Airlines Inc. 10 606

This looks tabular because I passed it into a kable table, but within Mongo, it is saved in a binary structure called BSON, which allows you to search by any field name in a document and get a corresponding value much as you would with JSON. Here, `mongolite is doing this for us.

Advantages/Disadvantages

SQL and Relational Databases

The advantages of using a relational store is that you can reduce redundancy in data by creating multiple tables to house attributes of tables which house more observations. An example would be a table of which had the categories of customers a store might have, with say 20 types, each of which might have 30 fields describing that type of customer. This could be called by a single customer-type key in the customer database which has 3-million records. This is far more efficient than storing all of the 30 fields with each customer to whom it applies, saving 30 million duplicate cells, and reducing query time, maintenance costs and physical computing power.

The disadvantages of Relational databases are their lack of ease in storing and accessing complex or non-tabular data efficiently. Searching such data is even less efficient. Also, at a certain point, distributing a relational system becomes cumbersome, requiring a lot of work to preserve the integrity of the individual tables as a database grows.

NoSQL and MongoDB

Advantages of using NoSQL is that you can store any shape, size or configuration of data within an individual record. This means JSON strings, the pages of “War and Peace”, the simplified binary description of RBG images in arrays, whole word documents, key-value pairs…almost anything which can be made into characters can be saved in NoSQL. SO it is flexible enough that you could keep every inch of your documents in one space.

NoSQL is also typically much more robust in a distributed environment playing nicely across nodes. They are highly scalable. And many types of computations and analysis can be done within a NoSQL that simply could not in a straight up SQL database.

Disadvantages of NoSQL, queries can be very complex, particularly for data that would be easily queryable in a traditional relation. They require more thought to administrate and plan for because the scope of the environment is not as clearly defined as with SQL. So you need to consider YOUR use case more clearly before you commit to much to a system.