Task:
For this assignment, you should take information from a relational database and migrate it to a NoSQL database of your own choosing.
For the relational database, you might use the flights database, the tb database, the “data skills” database your team created for Project 3, or another database of your own choosing or creation.
For the NoSQL database, you may use MongoDB (which we introduced in week 7), Neo4j, or another NoSQL database of your choosing.
Your migration process needs to be reproducible. R code is encouraged, but not required. You should also briefly describe the advantages and disadvantages of storing the data in a relational database vs. your NoSQL database.
Libraries
library(mongolite)
library(RPostgreSQL)
R code to connect to PostgreSQL Database:
drv <- dbDriver("PostgreSQL")
# creates a connection to the postgres database
con <- dbConnect(drv, dbname = "dvdrental",
host = "localhost", port = 5432,
user = "postgres", password = pw)
# dbListTables shows available tables in our dataset
dbListTables(con)
## [1] "category" "film_category" "country" "actor"
## [5] "language" "inventory" "payment" "rental"
## [9] "city" "film" "address" "film_actor"
## [13] "customer" "staff" "store"
Query to pull out data and store in R.
query1 <- dbGetQuery(con, "SELECT * FROM actor")
query2 <- dbGetQuery(con, "SELECT * FROM film")
## Warning in postgresqlExecStatement(conn, statement, ...): RS-DBI driver
## warning: (unrecognized PostgreSQL field type tsvector (id:3614) in column
## 12)
Connect to MongoDB and migrate the PostgreSQL tables actor and film into a MongoDB collection
mongodb <- mongo(collection = "dvdrental")
mongodb$insert(query1)
## List of 5
## $ nInserted : num 200
## $ nMatched : num 0
## $ nRemoved : num 0
## $ nUpserted : num 0
## $ writeErrors: list()
mongodb$insert(query2)
## List of 5
## $ nInserted : num 1000
## $ nMatched : num 0
## $ nRemoved : num 0
## $ nUpserted : num 0
## $ writeErrors: list()
Outut from within MongoDB Compass. In here you can notice the dvdrental collection, and the key:value pairs of data.
MongoDB
Advantages of NoSQL vs SQL.
In NoSQL, data doesn’t need to be “Structured” - meaning, each document within a collection can be different from one another.
An example being - in a table in a structured database (PostgresSQL, MySQL, etc.), you must conform to the specific variables - first name, last name, age, etc.. You can’t add more information to the table unless the table supports it.
In an unstructured database (MongoDB), one document can have firstname, lastname, and age, while another document can have firstname, lastname, favoritefood, favoritemovie, and age. These two documents will still be under the same collection.
Some disadvantages of NoSQL is that it’s still relativly new. Relational databases have been around much longer than NoSQL databases, which means there is less support for NoSQL.