library(dplyr)
library(RMySQL)
library(pool)
library(mongolite)
source("AssignmentWk12-password.R")
Here we pull the data down from MySQL:
movie_rate_db <- dbPool(
RMySQL::MySQL(),
dbname = 'ratemovie',
host = 'localhost',
username = 'root',
password = password
)
ratings <- as.data.frame(movie_rate_db %>% tbl("ratings"))
ratings
##                      title greg nick joe deb brian laura
## 1                  Dunkirk    5    4   5   4     5     5
## 2 Star Wars: The Last Jedi    4    5   4   5     2     3
## 3              Baby Driver    5    5   5   4     5     5
## 4                The Mummy    3    3   1   4     3     2
## 5          The Emoji Movie    1    3   1   4     1     2
## 6          Despicable Me 3    3    3   1   4     3     2
Once we have pulled the movie ratings from MySQL, we connect to a MongoDB database and use the count() function to make sure the target collection is empty.
con <- mongo(collection = 'ratings', db = 'movieratingsdb', url = "mongodb://mike:1234@ds113046.mlab.com:13046/movieratingsdb")
con$count("{}")
## [1] 0
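The connection string above embeds the username and password directly, whereas the MySQL password is loaded from a separate file. A minimal sketch of one way to keep the MongoDB credentials out of the script as well, assuming they are stored in environment variables; the helper `build_mongo_url` and the variable names `MONGO_USER` and `MONGO_PASSWORD` are hypothetical and not part of mongolite:

```r
# Hypothetical helper: assemble a MongoDB connection string from environment
# variables so the credentials never appear in the script itself.
# MONGO_USER and MONGO_PASSWORD are assumed names, not a mongolite convention.
build_mongo_url <- function(host, port, db,
                            user = Sys.getenv("MONGO_USER"),
                            password = Sys.getenv("MONGO_PASSWORD")) {
  if (!nzchar(user) || !nzchar(password)) {
    stop("Set MONGO_USER and MONGO_PASSWORD before connecting.")
  }
  # Percent-encode reserved characters such as '@' or ':' in the credentials.
  user <- utils::URLencode(user, reserved = TRUE)
  password <- utils::URLencode(password, reserved = TRUE)
  sprintf("mongodb://%s:%s@%s:%d/%s", user, password, host, as.integer(port), db)
}

# Usage (needs both variables set, for example in ~/.Renviron):
# url <- build_mongo_url("ds113046.mlab.com", 13046, "movieratingsdb")
# con <- mongo(collection = "ratings", db = "movieratingsdb", url = url)
```

The resulting string has the same shape as the one passed to mongo() above, so the rest of the code would not need to change.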
Next, we insert the ratings data frame into the MongoDB collection, where each row becomes one document, and run count() again to make sure all six rows were uploaded.
con$insert(ratings)
## List of 5
## $ nInserted : num 6
## $ nMatched : num 0
## $ nRemoved : num 0
## $ nUpserted : num 0
## $ writeErrors: list()
con$count("{}")
## [1] 6
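To make the row-to-document mapping concrete, here is a rough base R sketch of what one inserted row looks like conceptually. The `ratings_sample` data frame is a two-row stand-in for the real `ratings` table, and the conversion is only an illustration of the idea, not mongolite's actual implementation:

```r
# Stand-in for the `ratings` data frame pulled from MySQL (two rows, three columns).
ratings_sample <- data.frame(
  title = c("Dunkirk", "The Mummy"),
  greg = c(5, 3),
  nick = c(4, 3),
  stringsAsFactors = FALSE
)

# Turn each row into a named list, roughly the shape of one MongoDB document:
# field names come from the column names, values from that row.
documents <- lapply(
  seq_len(nrow(ratings_sample)),
  function(i) as.list(ratings_sample[i, , drop = FALSE])
)

str(documents[[1]])
```

One row in, one document out, which is why nInserted and the final count both equal the number of rows in the data frame.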
Lastly, we retrieve every document in the MongoDB collection with find() and print the result.
alldata <- con$find('{}')
print(alldata)
##                      title greg nick joe deb brian laura
## 1                  Dunkirk    5    4   5   4     5     5
## 2 Star Wars: The Last Jedi    4    5   4   5     2     3
## 3              Baby Driver    5    5   5   4     5     5
## 4                The Mummy    3    3   1   4     3     2
## 5          The Emoji Movie    1    3   1   4     1     2
## 6          Despicable Me 3    3    3   1   4     3     2
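Comparing the two printed tables by eye works for six rows. For larger tables, a programmatic check is one option. The sketch below assumes both tables have a `title` column; `same_ratings` is a hypothetical helper, written to ignore row order and integer-versus-double storage, since the two databases may return numbers in different types:

```r
# Hypothetical check: TRUE when two rating tables hold the same rows,
# regardless of row order, column order, or integer-versus-double storage.
same_ratings <- function(a, b) {
  if (!setequal(names(a), names(b)) || nrow(a) != nrow(b)) return(FALSE)
  b <- b[, names(a), drop = FALSE]        # align column order
  a <- a[order(a$title), , drop = FALSE]  # align row order by title
  b <- b[order(b$title), , drop = FALSE]
  isTRUE(all.equal(a, b, check.attributes = FALSE))
}

# Usage with the objects created earlier in this document:
# same_ratings(ratings, alldata)
```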
In essence, a SQL (relational) database uses a predefined schema to describe its records. These records are stored as rows in tables and linked to one another through primary and foreign keys.
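As a small illustration of rows linked through keys, the sketch below mimics a relational join in base R. The `movies` and `scores` tables and the `movie_id` key are hypothetical; the actual `ratings` table in this assignment is a single wide table with one column per reviewer:

```r
# Hypothetical normalized tables: one row per movie, one row per individual rating.
movies <- data.frame(
  movie_id = c(1L, 2L),                 # primary key
  title = c("Dunkirk", "The Mummy"),
  stringsAsFactors = FALSE
)
scores <- data.frame(
  movie_id = c(1L, 1L, 2L),             # foreign key pointing at movies$movie_id
  reviewer = c("greg", "nick", "greg"),
  rating = c(5, 4, 3),
  stringsAsFactors = FALSE
)

# Similar in spirit to a SQL join of scores and movies on movie_id.
joined <- merge(scores, movies, by = "movie_id")
print(joined)
```

Every table has a fixed set of columns, so adding a new attribute means changing the schema for all rows.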
A NoSQL document database such as MongoDB instead stores documents with a flexible schema. Each document can carry a varying number of named fields; for example, co-authors can be added to one document without changing a schema shared by all the others.
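The co-author case can be sketched with plain R lists standing in for documents. The book titles, author names, and field names below are made up for illustration:

```r
# Two hypothetical documents in the same collection; only the second has co-authors.
book_a <- list(title = "Solo Work", author = "A. Writer")
book_b <- list(
  title = "Joint Work",
  author = "B. Writer",
  coauthors = c("C. Writer", "D. Writer")
)
collection <- list(book_a, book_b)

# Fields are looked up per document; a field that was never set is simply absent.
n_coauthors <- vapply(
  collection,
  function(doc) length(doc[["coauthors"]]),
  integer(1)
)
print(n_coauthors)
```

In a relational table the same data would need either a nullable column on every row or a separate co-author table linked by a key.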
Because the schema is flexible, migrating data into a NoSQL database is comparatively easy. NoSQL databases such as MongoDB are also designed to scale horizontally, and they can outperform SQL databases on some workloads.
If the data has many relationships between tables (linked through keys), it may make the most sense to stay with a SQL database. And if the data is best represented by the connections between entities, a NoSQL graph database may be a better fit than a MongoDB document database.