Week 12 Assignment - NoSQL Migration

For this assignment, you should take information from a relational database and migrate it to a NoSQL database of your own choosing.

For the relational database, you might use the flights database, the tb database, the “data skills” database your team created for Project 3, or another database of your own choosing or creation.

For the NoSQL database, you may use MongoDB (which we introduced in week 7), Neo4j, or another NoSQL database of your choosing.

Your migration process needs to be reproducible. R code is encouraged, but not required. You should also briefly describe the advantages and disadvantages of storing the data in a relational database vs. your NoSQL database.

For this Assignment, I will use the Movies Database I created in MySQL for Week 2 Assignment

library(RMariaDB)
library(pool)
library(mongolite)

Get Data from MySQL

connection <- dbConnect(RMariaDB::MariaDB(),
                        user="movieuser",
                        password="password",
                        dbname="movies",
                        host="localhost")


dbListFields(connection, "movies")
## [1] "MovieID" "Title"   "Length"  "URL"
Query <- "SELECT * FROM movies;"

tb <- dbGetQuery(connection, Query)

Viewing MySQL Movies Data

tb
##   MovieID                                     Title Length
## 1       1                               Devil Winds     90
## 2       2         The Biggest of Oklahoma Tornadoes     50
## 3       3     Tornado Video Classics - Volume Three     60
## 4       4 El Reno Oklahoma Tornado Full Storm Chase     65
## 5       5       Tornado Video Classics - Volume Two     70
## 6       6                               Dirty Deeds     84
##                                           URL
## 1 https://www.youtube.com/watch?v=pOIQSwB38NM
## 2 https://www.youtube.com/watch?v=aULlfbjmKuQ
## 3 https://www.youtube.com/watch?v=3Y3MbsorN7M
## 4 https://www.youtube.com/watch?v=3SP5VSwUNDQ
## 5 https://www.youtube.com/watch?v=qTPsoDs-rcM
## 6 https://www.youtube.com/watch?v=5of46VZM7zg

Connecting to NoSQL using MongoDB

con <- mongo(collection = "tb", db = "movies", url = "mongodb://localhost",
      verbose = FALSE, options = ssl_options())

View the MongoDB for content, this should be empty at this point

con$count("{}")
## [1] 18

Insert the MySQL DB to the MongoDB

con$insert(tb)
## List of 5
##  $ nInserted  : num 6
##  $ nMatched   : num 0
##  $ nRemoved   : num 0
##  $ nUpserted  : num 0
##  $ writeErrors: list()

View the MongoDB result

MongoData <- con$find('{}')
print(MongoData)
##    MovieID                                     Title Length
## 1        1                               Devil Winds     90
## 2        2         The Biggest of Oklahoma Tornadoes     50
## 3        3     Tornado Video Classics - Volume Three     60
## 4        4 El Reno Oklahoma Tornado Full Storm Chase     65
## 5        5       Tornado Video Classics - Volume Two     70
## 6        6                               Dirty Deeds     84
## 7        1                               Devil Winds     90
## 8        2         The Biggest of Oklahoma Tornadoes     50
## 9        3     Tornado Video Classics - Volume Three     60
## 10       4 El Reno Oklahoma Tornado Full Storm Chase     65
## 11       5       Tornado Video Classics - Volume Two     70
## 12       6                               Dirty Deeds     84
## 13       1                               Devil Winds     90
## 14       2         The Biggest of Oklahoma Tornadoes     50
## 15       3     Tornado Video Classics - Volume Three     60
## 16       4 El Reno Oklahoma Tornado Full Storm Chase     65
## 17       5       Tornado Video Classics - Volume Two     70
## 18       6                               Dirty Deeds     84
## 19       1                               Devil Winds     90
## 20       2         The Biggest of Oklahoma Tornadoes     50
## 21       3     Tornado Video Classics - Volume Three     60
## 22       4 El Reno Oklahoma Tornado Full Storm Chase     65
## 23       5       Tornado Video Classics - Volume Two     70
## 24       6                               Dirty Deeds     84
##                                            URL
## 1  https://www.youtube.com/watch?v=pOIQSwB38NM
## 2  https://www.youtube.com/watch?v=aULlfbjmKuQ
## 3  https://www.youtube.com/watch?v=3Y3MbsorN7M
## 4  https://www.youtube.com/watch?v=3SP5VSwUNDQ
## 5  https://www.youtube.com/watch?v=qTPsoDs-rcM
## 6  https://www.youtube.com/watch?v=5of46VZM7zg
## 7  https://www.youtube.com/watch?v=pOIQSwB38NM
## 8  https://www.youtube.com/watch?v=aULlfbjmKuQ
## 9  https://www.youtube.com/watch?v=3Y3MbsorN7M
## 10 https://www.youtube.com/watch?v=3SP5VSwUNDQ
## 11 https://www.youtube.com/watch?v=qTPsoDs-rcM
## 12 https://www.youtube.com/watch?v=5of46VZM7zg
## 13 https://www.youtube.com/watch?v=pOIQSwB38NM
## 14 https://www.youtube.com/watch?v=aULlfbjmKuQ
## 15 https://www.youtube.com/watch?v=3Y3MbsorN7M
## 16 https://www.youtube.com/watch?v=3SP5VSwUNDQ
## 17 https://www.youtube.com/watch?v=qTPsoDs-rcM
## 18 https://www.youtube.com/watch?v=5of46VZM7zg
## 19 https://www.youtube.com/watch?v=pOIQSwB38NM
## 20 https://www.youtube.com/watch?v=aULlfbjmKuQ
## 21 https://www.youtube.com/watch?v=3Y3MbsorN7M
## 22 https://www.youtube.com/watch?v=3SP5VSwUNDQ
## 23 https://www.youtube.com/watch?v=qTPsoDs-rcM
## 24 https://www.youtube.com/watch?v=5of46VZM7zg

Comparison of MySQL and NoSQL

MySQL

MySQL is relational in nature since all the data is stored in different tables and relations are established using primary keys or other keys known as foreign keys.

The model is based on the concept of Tables.

Difficult to scale data.

Detailed database model needed before creation.

Availability of standard language.

Schema is rigid

The design is not flexible.

NoSQL

Easy to scale data. A primary reason for this different data structure could be driven by the simplicity of design, simpler horizontal scaling to clusters of machines and more control over availability.

Model based on the concept of document. Documents are addressed in the database via a unique key that represents the document.

No need to develop detailed database model.

No standard query language is available.

The schema is dynamic.

The design is flexible, new field or columns can be added easily.

Advantages of NoSQL

NoSQL are easier to manage and they provide higher level of flexibility with newer data models.

The open source nature of NoSQL databases makes them an appealing solution for smaller organizations with limited budgets.

NoSQL database experts often use elastic scalability as a major selling point of NoSQL. NoSQL databases are designed to function on full throttle even with low-cost hardware.

The non-relational nature of a NoSQL database allows database architects to quickly create a database without needing to develop a detailed (fine-grained) database model.

Disadvantages of NoSQL

NoSQL community is relatively new and lacks the maturity of the MySQL user base.

NoSQL databases lacks reporting tools for analysis and performance testing.

NoSQL’s lack of standardization can cause a problem during migration.

References Gajani, A. (2019). The key differences between SQL and NoSQL DBs. Retrieved from https://www.monitis.com/blog/cc-in-review-the-key-differences-between-sql-and-nosql-dbs/