library(RMySQL)
library(mongolite)
library(tidyverse)
For this assignment, you should take information from a relational database and migrate it to a NoSQL database of your own choosing.
For the relational database, you might use the flights database, the tb database, the “data skills” database your team created for Project 3, or another database of your own choosing or creation.
For the NoSQL database, you may use MongoDB (which we introduced in week 7), Neo4j, or another NoSQL database of your choosing.
test_json <- '{"Star Wars Films" : [ { "Title":"The Phantom Menace", "Joe":"5" , "Kristyn ":"4", "Matt ":"3", "Kristin":"4", "Justin":"5", "Kristina":"3" }, { "Title":"The Clone Wars", "Joe":"4" , "Kristyn ":"3", "Matt ":"2", "Kristin":"3", "Justin":"4", "Kristina":"4" }, {"Title":"Revenge of the Sith", "Joe":"5" , "Kristyn ":"4", "Matt ":"4", "Kristin":"4", "Justin":"5", "Kristina":"5"}, { "Title":"A New Hope", "Joe":"5" , "Kristyn ":"5", "Matt ":"4", "Kristin":"4", "Justin":"4", "Kristina":"4"},{ "Title":"Empire Strikes Back", "Joe":"5" , "Kristyn ":"5", "Matt ":"4", "Kristin":"4", "Justin":"5", "Kristina":"4" },{"Title":"Return of the Jedi", "Joe":"4" , "Kristyn ":"5", "Matt ":"2", "Kristin":"4", "Justin":"5", "Kristina":"5" }]}'
star_wars_ratin <- mongo(url = "mongodb://localhost", options = ssl_options(weak_cert_validation = T))
star_wars_ratin$drop()
star_wars_ratin$insert(test_json)
## List of 6
## $ nInserted : int 1
## $ nMatched : int 0
## $ nModified : int 0
## $ nRemoved : int 0
## $ nUpserted : int 0
## $ writeErrors: list()
star_wars_df <- as.tibble(star_wars_ratin$find())
## Warning: `as.tibble()` is deprecated, use `as_tibble()` (but mind the new semantics).
## This warning is displayed once per session.
star_wars_df
## # A tibble: 1 x 1
## `Star Wars Films`
## <list>
## 1 <df[,7] [6 x 7]>
Star_Wars_DF <- as.data.frame(star_wars_df[[1]][[1]])
Star_Wars_DF
## Title Joe Kristyn Matt Kristin Justin Kristina
## 1 The Phantom Menace 5 4 3 4 5 3
## 2 The Clone Wars 4 3 2 3 4 4
## 3 Revenge of the Sith 5 4 4 4 5 5
## 4 A New Hope 5 5 4 4 4 4
## 5 Empire Strikes Back 5 5 4 4 5 4
## 6 Return of the Jedi 4 5 2 4 5 5
db <- dbDriver("MySQL")
mydb <- dbConnect(db, username="root", password="Yogibear1", host="localhost")
dbSendQuery(mydb, "USE star_wars_films")
## <MySQLResult:1,0,0>
mydb <- dbConnect(db, username="root", password="Yogibear1", host="localhost", dbname = "star_wars_films")
dbSendQuery(mydb, "
CREATE TABLE film (
title VARCHAR(50),
Joe INT,
Kristyn INT,
Matt INT,
Kristin INT,
Justin INT,
Kristina INT);")
## <MySQLResult:8,1,0>
dbSendQuery(mydb, "INSERT INTO film
(title, Joe, Kristyn, Matt, Kristin, Justin, Kristina)
VALUES('The Phantom Menace','5','3', '3','4','5','4');")
## <MySQLResult:71959296,1,1>
dbSendQuery(mydb, "INSERT INTO film
(title, Joe, Kristyn, Matt, Kristin, Justin, Kristina)
VALUES('The Clone Wars','4','5', '3','4','4','5');")
## <MySQLResult:209408448,1,2>
dbSendQuery(mydb, "INSERT INTO film
(title, Joe, Kristyn, Matt, Kristin, Justin, Kristina)
VALUES('Revenge of the Sith','5','5', '4','4','5','5');")
## <MySQLResult:71507696,1,3>
dbSendQuery(mydb, "INSERT INTO film
(title, Joe, Kristyn, Matt, Kristin, Justin, Kristina)
VALUES('A New Hope','5','5', '3','4','5','3');")
## <MySQLResult:71523352,1,4>
dbSendQuery(mydb, "INSERT INTO film
(title, Joe, Kristyn, Matt, Kristin, Justin, Kristina)
VALUES('Empire Strike Back','5','4', '3','5','5','4');")
## <MySQLResult:210719992,1,5>
dbSendQuery(mydb, "INSERT INTO film
(title, Joe, Kristyn, Matt, Kristin, Justin, Kristina)
VALUES('Return of the Jedi','4','5', '3','3','4','4');")
## <MySQLResult:71515496,1,6>
try1 = fetch(dbSendQuery(mydb, "SELECT * FROM film;"))
mysql_star_wars_df <- as.data.frame(try1)
mysql_star_wars_df
## title Joe Kristyn Matt Kristin Justin Kristina
## 1 The Phantom Menace 5 3 3 4 5 4
## 2 The Clone Wars 4 5 3 4 4 5
## 3 Revenge of the Sith 5 5 4 4 5 5
## 4 A New Hope 5 5 3 4 5 3
## 5 Empire Strike Back 5 4 3 5 5 4
## 6 Return of the Jedi 4 5 3 3 4 4
star_wars_rating <- mongo(url = "mongodb://localhost", options = ssl_options(weak_cert_validation = T))
star_wars_rating$drop()
star_wars_rating$count("{}")
## [1] 0
star_wars_rating$insert(try1)
## List of 5
## $ nInserted : num 6
## $ nMatched : num 0
## $ nRemoved : num 0
## $ nUpserted : num 0
## $ writeErrors: list()
star_wars_rating$count("{}")
## [1] 6
nosql_star_wars_df <- as.data.frame(star_wars_rating$find())
nosql_star_wars_df
## title Joe Kristyn Matt Kristin Justin Kristina
## 1 The Phantom Menace 5 3 3 4 5 4
## 2 The Clone Wars 4 5 3 4 4 5
## 3 Revenge of the Sith 5 5 4 4 5 5
## 4 A New Hope 5 5 3 4 5 3
## 5 Empire Strike Back 5 4 3 5 5 4
## 6 Return of the Jedi 4 5 3 3 4 4
In essence, structured query language uses defined schema to define examples. This examples are placed in rows and linked with primary keys. No-SQL uses documents that have flexible schema. Each instance can have a fluid number of named schemas, for example, co-authors can be set up without having to change the schema.
No-SQL allows easy migration of data because of the easy and flexible schema. No-SQL databases such as MongoDB also scale better and have higher performance compared to SQL databases.
If databases have many links (with primary keys) it may make the most sense to stick with a structured query language. Also, if databases are better represented with emphasis placed on the connection between different variables, it may make more sense to use a no-sql graph database