For this assignment, you should take information from a relational database and migrate it to a NoSQL database of your own choosing.

For the relational database, you might use the flights database, the tb database, the “data skills” database your team created for Project 3, or another database of your own choosing or creation.

For the NoSQL database, you may use MongoDB (which we introduced in week 7), Neo4j, or another NoSQL database of your choosing.

Your migration process needs to be reproducible. R code is encouraged, but not required. You should also briefly describe the advantages and disadvantages of storing the data in a relational database vs. your NoSQL database.

Libraries:

library(RMySQL)
library(dbConnect)
library(DBI)
library(mongolite)

Use R code to connect to MySQL Database:

mydb = dbConnect(MySQL(), 
                 user='root', password='haley123', dbname='grads', host='localhost')

Query MySQL DB to Pull data into R.

grads<- dbGetQuery(mydb, "select * from graduation")

head (grads)
##    Demographic    DBN                           School Name   Cohort
## 1 Total Cohort 01M292 HENRY STREET SCHOOL FOR INTERNATIONAL     2003
## 2 Total Cohort 01M292 HENRY STREET SCHOOL FOR INTERNATIONAL     2004
## 3 Total Cohort 01M292 HENRY STREET SCHOOL FOR INTERNATIONAL     2005
## 4 Total Cohort 01M292 HENRY STREET SCHOOL FOR INTERNATIONAL     2006
## 5 Total Cohort 01M292 HENRY STREET SCHOOL FOR INTERNATIONAL 2006 Aug
## 6 Total Cohort 01M448   UNIVERSITY NEIGHBORHOOD HIGH SCHOOL     2001
##   Total Cohort Total Grads - n Total Grads - % of cohort Total Regents - n
## 1            5               s                         s                 s
## 2           55              37                      67.3                17
## 3           64              43                      67.2                27
## 4           78              43                      55.1                36
## 5           78              44                      56.4                37
## 6           64              46        71.900000000000006                32
##   Total Regents - % of cohort Total Regents - % of grads
## 1                           s                          s
## 2                        30.9                       45.9
## 3                        42.2                       62.8
## 4                        46.2                       83.7
## 5                        47.4                       84.1
## 6                          50         69.599999999999994
##   Advanced Regents - n Advanced Regents - % of cohort
## 1                    s                              s
## 2                    0                              0
## 3                    0                              0
## 4                    0                              0
## 5                    0                              0
## 6                    7                           10.9
##   Advanced Regents - % of grads Regents w/o Advanced - n
## 1                             s                        s
## 2                             0                       17
## 3                             0                       27
## 4                             0                       36
## 5                             0                       37
## 6                          15.2                       25
##   Regents w/o Advanced - % of cohort Regents w/o Advanced - % of grads
## 1                                  s                                 s
## 2                               30.9                              45.9
## 3                               42.2                              62.8
## 4                               46.2                              83.7
## 5                               47.4                              84.1
## 6                               39.1                              54.3
##   Local - n Local - % of cohort Local - % of grads Still Enrolled - n
## 1         s                   s                  s                  s
## 2        20                36.4               54.1                 15
## 3        16                  25 37.200000000000003                  9
## 4         7                   9               16.3                 16
## 5         7                   9               15.9                 15
## 6        14                21.9               30.4                 10
##   Still Enrolled - % of cohort Dropped Out - n Dropped Out - % of cohort
## 1                            s               s                         s
## 2                         27.3               3                       5.5
## 3                         14.1               9                      14.1
## 4                         20.5              11                      14.1
## 5                         19.2              11                      14.1
## 6                         15.6               6                       9.4
grads2<- dbGetQuery(mydb, "select  distinct `School Name`, `Total Grads - n`,`Dropped Out - n`, `Still Enrolled - n`
from graduation limit 10")

grads2
##                              School Name Total Grads - n Dropped Out - n
## 1  HENRY STREET SCHOOL FOR INTERNATIONAL               s               s
## 2  HENRY STREET SCHOOL FOR INTERNATIONAL              37               3
## 3  HENRY STREET SCHOOL FOR INTERNATIONAL              43               9
## 4  HENRY STREET SCHOOL FOR INTERNATIONAL              43              11
## 5  HENRY STREET SCHOOL FOR INTERNATIONAL              44              11
## 6    UNIVERSITY NEIGHBORHOOD HIGH SCHOOL              46               6
## 7    UNIVERSITY NEIGHBORHOOD HIGH SCHOOL              33               1
## 8    UNIVERSITY NEIGHBORHOOD HIGH SCHOOL              67              11
## 9    UNIVERSITY NEIGHBORHOOD HIGH SCHOOL              75               4
## 10   UNIVERSITY NEIGHBORHOOD HIGH SCHOOL              64              11
##    Still Enrolled - n
## 1                   s
## 2                  15
## 3                   9
## 4                  16
## 5                  15
## 6                  10
## 7                  16
## 8                   9
## 9                  33
## 10                 41

Connect to MongoDB and then migrate into MySQL tables grads and grads2 into a MongoDB collection

mongodb <- mongo(collection = "test", db = "school")
mongodb$insert(grads)
## List of 5
##  $ nInserted  : num 25096
##  $ nMatched   : num 0
##  $ nRemoved   : num 0
##  $ nUpserted  : num 0
##  $ writeErrors: list()
mongodb2 <- mongo(collection = "test", db= "school2")
mongodb2$insert(grads2)
## List of 5
##  $ nInserted  : num 10
##  $ nMatched   : num 0
##  $ nRemoved   : num 0
##  $ nUpserted  : num 0
##  $ writeErrors: list()

Advantages/ Disadvantages of NoSQL (MongoDB) Vs SQL

I noticed while using NoSql the data doesn’t need to be “Structured”- this means, every document in the collection can be different each other.

For example, If you’re using MySQL or PostgresSQL you would conform certain variables. Such as first name, last, email, phone number. You wouldn’t be able to add anything other information into the tables unless the table allows it.

While working in an unstructured database like MongoDB, the collection allows you to have more lead way. You can be able to have additional documents within the same collection.

A disadvantage for using NoSQL is that is still pretty new. Relational databases such as MySQL have been around longer, so NoSQL db can be stuck without inheriting needed support.