Introduction

The goal of this assignment is to migrate a relational database to a NoSQL database. In this case, the destination NoSQL db will be a mongo database. The tb database will be used as an example.

Connect to the MySQL database

library(RMariaDB)

con <- dbConnect(
  drv = MariaDB(), 
  username = "root",
  password = "e20j3ss9d", 
  host = NULL, 
  port = 3306
)

From here the “con” variable will allow us to interact with the MySQL database running locally. We can change to the “tb” schema and store all the data there in a variable called “data.” After that, we can disconnect “con.”

dbSendQuery(con, "USE tb;")
## <MariaDBResult>
##   SQL  USE tb;
##   ROWS Fetched: 0 [complete]
##        Changed: 0
data <- dbGetQuery(con,"Select * FROM tb")

head(data)
##       country year    sex child adult elderly
## 1 Afghanistan 1995 female    -1    -1      -1
## 2 Afghanistan 1995   male    -1    -1      -1
## 3 Afghanistan 1996 female    -1    -1      -1
## 4 Afghanistan 1996   male    -1    -1      -1
## 5 Afghanistan 1997 female     5    96       1
## 6 Afghanistan 1997   male     0    26       0
dbDisconnect(con)

Send data to mongodb

Now that we have the data stored in memory, we can use the mongolite libary to create a collection called “tb” and insert our data into that collection.

library(mongolite)

mongodb <- mongo(collection = "tb")

mongodb$insert(data)
## List of 5
##  $ nInserted  : num 3800
##  $ nMatched   : num 0
##  $ nRemoved   : num 0
##  $ nUpserted  : num 0
##  $ writeErrors: list()

If we look into the collection with find() we can see the same data.

head(mongodb$find())
##       country year    sex child adult elderly
## 1 Afghanistan 1995 female    -1    -1      -1
## 2 Afghanistan 1995   male    -1    -1      -1
## 3 Afghanistan 1996 female    -1    -1      -1
## 4 Afghanistan 1996   male    -1    -1      -1
## 5 Afghanistan 1997 female     5    96       1
## 6 Afghanistan 1997   male     0    26       0

Relational and NoSQL databases: Advantages and Disadvantages

NoSQL databases are excellent for modern web applications that require high performance but also only require functionality that is mostly limited to the CRUD cycle of a web app: create, read, update, delete. NoSQL is also suited for handling large or evolving data sets that are common to “big data” projects – and can more easily scale to add more servers to the database, to handle increased traffic.

If data will have a pre-defined structure and needs to be consistent, a relational database is a better choice. Relational databases are also older and more standardized, so there is a wealth of knowledge to draw from and available support in a business environment.