DATABASE MIGRATION (MYSQL TO NOSQL)

For this assignment, you should take information from a relational database and migrate it to a NoSQL database of your own choosing.

For the relational database, you might use the flights database, the tb database, the “data skills” database your team created for Project 3, or another database of your own choosing or creation.

For the NoSQL database, you may use MongoDB (which we introduced in week 7), Neo4j, or another NoSQL database of your choosing.

Your migration process needs to be reproducible. R code is encouraged, but not required. You should also briefly describe the advantages and disadvantages of storing the data in a relational database vs. your NoSQL database.

Database Migration: RDBMS to NOSQL (MYSQL TO MONGODB)

# Different kinds of packages that will be used in this project put in one place.
library(RMySQL)     #connecting to MySQL and performing db function
## Warning: package 'RMySQL' was built under R version 3.5.1
## Loading required package: DBI
## Warning: package 'DBI' was built under R version 3.5.1
library(mongolite)  #connecting to MongoDB
## Warning: package 'mongolite' was built under R version 3.5.1
library(stringr)    #string functions
## Warning: package 'stringr' was built under R version 3.5.1

I used a website to download the sample dataset which has 10,000 records of different employees.

https://www.sample-videos.com/download-sample-sql.php

Then, I downloaded the .sql file and ran the queries on MYSQL Workbench.

Lets connect to MYSQL

# connecting to MYSQL using my username and password
mydb <- dbConnect(MySQL(), user='root', password='everest8848', dbname = 'test', host='localhost')
dbSendQuery(mydb, 'use test')
## <MySQLResult:0,0,0>

To store database for further processing

#storing data 
emp_df <- dbGetQuery(mydb, 'select * from user_details')

#to make sure data is correct 
emp_df_25 <- head(emp_df, 25)
emp_df_25
##    user_id  username  first_name   last_name gender
## 1        1  rogers63       david        john Female
## 2        2    mike28      rogers        paul   Male
## 3        3  rivera92       david        john   Male
## 4        4    ross95       maria     sanders   Male
## 5        5    paul85      morris      miller Female
## 6        6   smith34      daniel     michael Female
## 7        7   james84     sanders        paul Female
## 8        8  daniel53        mark        mike   Male
## 9        9  brooks80      morgan       maria Female
## 10      10  morgan65        paul      miller Female
## 11      11 sanders84       david      miller Female
## 12      12   maria40 chrishaydon        bell Female
## 13      13   brown71     michael       brown   Male
## 14      14   james63      morgan       james   Male
## 15      15 jenny0993      rogers chrishaydon Female
## 16      16    john96      morgan      wright   Male
## 17      17  miller64      morgan      wright   Male
## 18      18    mark46       david        ross Female
## 19      19 jenny0988       maria      morgan Female
## 20      20    mark80        mike        bell   Male
## 21      21  morris72      miller     michael   Male
## 22      22  wright39        ross      rogers Female
## 23      23    paul68      brooks        mike   Male
## 24      24   smith60      miller      daniel   Male
## 25      25    bell43        mike      wright   Male
##                            password status
## 1  e6a33eee180b07e563d74fee8c2c66b8      1
## 2  2e7dc6b8a1598f4f75c3eaa47958ee2f      1
## 3  1c3a8e03f448d211904161a6f5849b68      1
## 4  62f0a68a4179c5cdd997189760cbcf18      1
## 5  61bd060b07bddfecccea56a82b850ecf      1
## 6  7055b3d9f5cb2829c26cd7e0e601cde5      1
## 7  b7f72d6eb92b45458020748c8d1a3573      1
## 8  299cbf7171ad1b2967408ed200b4e26c      1
## 9  aa736a35dc15934d67c0a999dccff8f6      1
## 10 a28dca31f5aa5792e1cefd1dfd098569      1
## 11 0629e4f9f0e01e6f20bc2066175e09f7      1
## 12 17f286a78c74db7ee24374c608a2f20c      1
## 13 fa0c46cc4339a8a51a7da1b33e9d2831      1
## 14 b945416fa907fac533d94efe1974ec07      1
## 15 388823cb9249d4cebc9d677a99e1d79d      1
## 16 d0bb977705c3cdad1e346c898f32a1b7      1
## 17 58b207ee33794b046511203967c8e0d7      1
## 18 21cdcb68a932871524e16680fac72e18      1
## 19 ec9ed18ae2a13fef709964af24bb60e6      1
## 20 084489b355edd349bca1c798788de19a      1
## 21 bdb047eb9ea511052fc690a8ac72a7d3      1
## 22 1b6859df2da2a416c5b0fa044b1c6a75      1
## 23 12d836bf64839f987338414ccbec657f      1
## 24 494610644518624d05e2bdc8b9df3c36      1
## 25 2bd4e16a15f5527cb43282ee0ef94619      1

It’s always good to clean up after you get the data and disconnect from MYSQL

#disconnecting MYSQL
dbDisconnect(mydb)
## [1] TRUE
# to stop certain packages to autorun everytime we run the RStudio
detach("package:RMySQL", unload=TRUE)
# including more libraries needed for further work
library(devtools)
## Warning: package 'devtools' was built under R version 3.5.1
## Warning: package 'usethis' was built under R version 3.5.1
library(rJava)
library(jsonlite)
## Warning: package 'jsonlite' was built under R version 3.5.1
library(rmongodb)

# connect to MongoDB
mongo = mongo.create(host = "localhost",db ="test")

#Confirm sucessful connection
mongo.is.connected(mongo)
## [1] TRUE
mongo.get.databases(mongo)
## [1] "config" "emp"

So, now we are ready to upload the dataframe into MongoDB.

#Convert Dataframe into BSON for mongoDB
emp_list=mongo.bson.from.df(emp_df)

emp = mongo.bson.from.list(emp_list)

#now lets insert BSON into MongoDB collection

mongo.insert(mongo,"emp.emp_collection1",emp)
## [1] TRUE
#once again lets confirm that we have the data 

mongo.count(mongo, "emp.emp_collection1")
## [1] 22
## lets see what we have collection so far

tmp = mongo.find.one(mongo, ns = "emp.emp_collection1")
## now finally lets see the real data that we have imported to mongoDB

emp_list_25=mongo.bson.from.df(emp_df_25)

emp_25 = mongo.bson.from.list(emp_list_25)

mongo.insert(mongo,"emp.emp_collection25",emp_25)
## [1] TRUE
mongo.find.one(mongo, ns = "emp.emp_collection25")
##  _id : 7      5bfa528cca58c00c4a178322
##  1 : 3    
##      user_id : 16     1
##      username : 2     rogers63
##      first_name : 2   david
##      last_name : 2    john
##      gender : 2   Female
##      password : 2     e6a33eee180b07e563d74fee8c2c66b8
##      status : 16      1
## 
##  2 : 3    
##      user_id : 16     2
##      username : 2     mike28
##      first_name : 2   rogers
##      last_name : 2    paul
##      gender : 2   Male
##      password : 2     2e7dc6b8a1598f4f75c3eaa47958ee2f
##      status : 16      1
## 
##  3 : 3    
##      user_id : 16     3
##      username : 2     rivera92
##      first_name : 2   david
##      last_name : 2    john
##      gender : 2   Male
##      password : 2     1c3a8e03f448d211904161a6f5849b68
##      status : 16      1
## 
##  4 : 3    
##      user_id : 16     4
##      username : 2     ross95
##      first_name : 2   maria
##      last_name : 2    sanders
##      gender : 2   Male
##      password : 2     62f0a68a4179c5cdd997189760cbcf18
##      status : 16      1
## 
##  5 : 3    
##      user_id : 16     5
##      username : 2     paul85
##      first_name : 2   morris
##      last_name : 2    miller
##      gender : 2   Female
##      password : 2     61bd060b07bddfecccea56a82b850ecf
##      status : 16      1
## 
##  6 : 3    
##      user_id : 16     6
##      username : 2     smith34
##      first_name : 2   daniel
##      last_name : 2    michael
##      gender : 2   Female
##      password : 2     7055b3d9f5cb2829c26cd7e0e601cde5
##      status : 16      1
## 
##  7 : 3    
##      user_id : 16     7
##      username : 2     james84
##      first_name : 2   sanders
##      last_name : 2    paul
##      gender : 2   Female
##      password : 2     b7f72d6eb92b45458020748c8d1a3573
##      status : 16      1
## 
##  8 : 3    
##      user_id : 16     8
##      username : 2     daniel53
##      first_name : 2   mark
##      last_name : 2    mike
##      gender : 2   Male
##      password : 2     299cbf7171ad1b2967408ed200b4e26c
##      status : 16      1
## 
##  9 : 3    
##      user_id : 16     9
##      username : 2     brooks80
##      first_name : 2   morgan
##      last_name : 2    maria
##      gender : 2   Female
##      password : 2     aa736a35dc15934d67c0a999dccff8f6
##      status : 16      1
## 
##  10 : 3   
##      user_id : 16     10
##      username : 2     morgan65
##      first_name : 2   paul
##      last_name : 2    miller
##      gender : 2   Female
##      password : 2     a28dca31f5aa5792e1cefd1dfd098569
##      status : 16      1
## 
##  11 : 3   
##      user_id : 16     11
##      username : 2     sanders84
##      first_name : 2   david
##      last_name : 2    miller
##      gender : 2   Female
##      password : 2     0629e4f9f0e01e6f20bc2066175e09f7
##      status : 16      1
## 
##  12 : 3   
##      user_id : 16     12
##      username : 2     maria40
##      first_name : 2   chrishaydon
##      last_name : 2    bell
##      gender : 2   Female
##      password : 2     17f286a78c74db7ee24374c608a2f20c
##      status : 16      1
## 
##  13 : 3   
##      user_id : 16     13
##      username : 2     brown71
##      first_name : 2   michael
##      last_name : 2    brown
##      gender : 2   Male
##      password : 2     fa0c46cc4339a8a51a7da1b33e9d2831
##      status : 16      1
## 
##  14 : 3   
##      user_id : 16     14
##      username : 2     james63
##      first_name : 2   morgan
##      last_name : 2    james
##      gender : 2   Male
##      password : 2     b945416fa907fac533d94efe1974ec07
##      status : 16      1
## 
##  15 : 3   
##      user_id : 16     15
##      username : 2     jenny0993
##      first_name : 2   rogers
##      last_name : 2    chrishaydon
##      gender : 2   Female
##      password : 2     388823cb9249d4cebc9d677a99e1d79d
##      status : 16      1
## 
##  16 : 3   
##      user_id : 16     16
##      username : 2     john96
##      first_name : 2   morgan
##      last_name : 2    wright
##      gender : 2   Male
##      password : 2     d0bb977705c3cdad1e346c898f32a1b7
##      status : 16      1
## 
##  17 : 3   
##      user_id : 16     17
##      username : 2     miller64
##      first_name : 2   morgan
##      last_name : 2    wright
##      gender : 2   Male
##      password : 2     58b207ee33794b046511203967c8e0d7
##      status : 16      1
## 
##  18 : 3   
##      user_id : 16     18
##      username : 2     mark46
##      first_name : 2   david
##      last_name : 2    ross
##      gender : 2   Female
##      password : 2     21cdcb68a932871524e16680fac72e18
##      status : 16      1
## 
##  19 : 3   
##      user_id : 16     19
##      username : 2     jenny0988
##      first_name : 2   maria
##      last_name : 2    morgan
##      gender : 2   Female
##      password : 2     ec9ed18ae2a13fef709964af24bb60e6
##      status : 16      1
## 
##  20 : 3   
##      user_id : 16     20
##      username : 2     mark80
##      first_name : 2   mike
##      last_name : 2    bell
##      gender : 2   Male
##      password : 2     084489b355edd349bca1c798788de19a
##      status : 16      1
## 
##  21 : 3   
##      user_id : 16     21
##      username : 2     morris72
##      first_name : 2   miller
##      last_name : 2    michael
##      gender : 2   Male
##      password : 2     bdb047eb9ea511052fc690a8ac72a7d3
##      status : 16      1
## 
##  22 : 3   
##      user_id : 16     22
##      username : 2     wright39
##      first_name : 2   ross
##      last_name : 2    rogers
##      gender : 2   Female
##      password : 2     1b6859df2da2a416c5b0fa044b1c6a75
##      status : 16      1
## 
##  23 : 3   
##      user_id : 16     23
##      username : 2     paul68
##      first_name : 2   brooks
##      last_name : 2    mike
##      gender : 2   Male
##      password : 2     12d836bf64839f987338414ccbec657f
##      status : 16      1
## 
##  24 : 3   
##      user_id : 16     24
##      username : 2     smith60
##      first_name : 2   miller
##      last_name : 2    daniel
##      gender : 2   Male
##      password : 2     494610644518624d05e2bdc8b9df3c36
##      status : 16      1
## 
##  25 : 3   
##      user_id : 16     25
##      username : 2     bell43
##      first_name : 2   mike
##      last_name : 2    wright
##      gender : 2   Male
##      password : 2     2bd4e16a15f5527cb43282ee0ef94619
##      status : 16      1

NOW, since we have got all the positive output, lets finally check if you can pull out the information from MongoDB

# checking queries
query = mongo.bson.buffer.create()
mongo.bson.buffer.append(query, 'firstname', c("david","marie","mike", "mark"))
## [1] TRUE
# testing if there are records or not
result =mongo.bson.from.buffer(query)  
result
##  firstname : 4    
##      0 : 2    david
##      1 : 2    marie
##      2 : 2    mike
##      3 : 2    mark

Relational DBMS vs NoSQL database

RDBMS

1. Relational database is well defined. It is a table based database.

2. Data are stored in tables as records (columns and rows).

3. Object names are not case-insensitive.

4. RDBMS can perform numeric computations at a faster rate than NoSQL.

5. Since data is stored in records, you can link with other databases.

6. You can join tables and records and databases in RDBMS and use the records as references also.

7. Maintaining RDBMS systems is expensive and need trained manpower.

NoSQL

1. NoSQL database doesn’t have a fixed schema.

2. Data are stored in collections as key - value pairs.

3. Object names are case-sensitive.

4. NoSQL have a better relationship with object oriented programing like photos, documents, videos,etc.

5. You cannot use joins in NoSQL and document nesting is allowed.

6. NoSQL is more popular in blogs data collection and cloud computing.

7. It support automatic repair, easier data distribution, simpler data models and less manpower.