The goal of this assignment is to migrate a relational database to a NoSQL database. In this case, the destination NoSQL db will be a mongo database. The tb database will be used as an example.
library(RMariaDB)
con <- dbConnect(
drv = MariaDB(),
username = "root",
password = "e20j3ss9d",
host = NULL,
port = 3306
)
From here the “con” variable will allow us to interact with the MySQL database running locally. We can change to the “tb” schema and store all the data there in a variable called “data.” After that, we can disconnect “con.”
dbSendQuery(con, "USE tb;")
## <MariaDBResult>
## SQL USE tb;
## ROWS Fetched: 0 [complete]
## Changed: 0
data <- dbGetQuery(con,"Select * FROM tb")
head(data)
## country year sex child adult elderly
## 1 Afghanistan 1995 female -1 -1 -1
## 2 Afghanistan 1995 male -1 -1 -1
## 3 Afghanistan 1996 female -1 -1 -1
## 4 Afghanistan 1996 male -1 -1 -1
## 5 Afghanistan 1997 female 5 96 1
## 6 Afghanistan 1997 male 0 26 0
dbDisconnect(con)
Now that we have the data stored in memory, we can use the mongolite libary to create a collection called “tb” and insert our data into that collection.
library(mongolite)
mongodb <- mongo(collection = "tb")
mongodb$insert(data)
## List of 5
## $ nInserted : num 3800
## $ nMatched : num 0
## $ nRemoved : num 0
## $ nUpserted : num 0
## $ writeErrors: list()
If we look into the collection with find() we can see the same data.
head(mongodb$find())
## country year sex child adult elderly
## 1 Afghanistan 1995 female -1 -1 -1
## 2 Afghanistan 1995 male -1 -1 -1
## 3 Afghanistan 1996 female -1 -1 -1
## 4 Afghanistan 1996 male -1 -1 -1
## 5 Afghanistan 1997 female 5 96 1
## 6 Afghanistan 1997 male 0 26 0
NoSQL databases are excellent for modern web applications that require high performance but also only require functionality that is mostly limited to the CRUD cycle of a web app: create, read, update, delete. NoSQL is also suited for handling large or evolving data sets that are common to “big data” projects – and can more easily scale to add more servers to the database, to handle increased traffic.
If data will have a pre-defined structure and needs to be consistent, a relational database is a better choice. Relational databases are also older and more standardized, so there is a wealth of knowledge to draw from and available support in a business environment.