#library(mongolite)
library(dbConnect)
## Loading required package: RMySQL
## Loading required package: DBI
## Loading required package: gWidgets
library(RMySQL)

For this assignment, you should take information from a relational database and migrate it to a NoSQL database of your own choosing. For the relational database, you might use the flights database, the tb database, the “data skills” database your team created for Project 3, or another database of your own choosing or creation. For the NoSQL database, you may use MongoDB (which we introduced in week 7), Neo4j, or another NoSQL database of your choosing. Your migration process needs to be reproducible. R code is encouraged, but not required. You should also briefly describe the advantages and disadvantages of storing the data in a relational database vs. your NoSQL database.

I will be using the relational database team cunyverse created for our final project.

url <- "https://raw.githubusercontent.com/miasiracusa/Data607/master/finalproject/Air_Quality.csv"
airquality <- read.csv(url)
names(airquality) <- c("Indicator Data ID", "Indicator ID", "Type of Toxin", "Unit of measurement", "Type of Location", "Location ID", "Location", "Year", "Measurement")
head(airquality)
##   Indicator Data ID Indicator ID
## 1            130728          646
## 2            130729          646
## 3            130730          646
## 4            130731          646
## 5            130732          646
## 6            130727          646
##                                               Type of Toxin
## 1 Air Toxics Concentrations- Average Benzene Concentrations
## 2 Air Toxics Concentrations- Average Benzene Concentrations
## 3 Air Toxics Concentrations- Average Benzene Concentrations
## 4 Air Toxics Concentrations- Average Benzene Concentrations
## 5 Air Toxics Concentrations- Average Benzene Concentrations
## 6 Air Toxics Concentrations- Average Benzene Concentrations
##     Unit of measurement Type of Location Location ID      Location Year
## 1 Average Concentration          Borough           1         Bronx 2005
## 2 Average Concentration          Borough           2      Brooklyn 2005
## 3 Average Concentration          Borough           3     Manhattan 2005
## 4 Average Concentration          Borough           4        Queens 2005
## 5 Average Concentration          Borough           5 Staten Island 2005
## 6 Average Concentration         Citywide           1 New York City 2005
##   Measurement
## 1         2.8
## 2         2.8
## 3         4.7
## 4         1.9
## 5         1.6
## 6         2.9

Since my computer is so old, I always always always have issues with MySQL, so I needed to make part of my code comments in order to knit it. My Rstudio also doesn’t come with mongolite or the neo4j package.

db_user <- 'root'
db_password <- 'password'
db_name <- 'airquality'
db_host <- '127.0.0.1' # for local access
db_port <- 3306

drv <- dbDriver("MySQL")

# creating connection
connect <- dbConnect(drv, user = db_user, password = db_password, dbname = db_name, host = db_host, port = db_port)

if (dbExistsTable(connect, "airquality"))
    dbRemoveTable(connect, "airquality")
## [1] TRUE
#dbWriteTable(connect, name = "airquality", value = airquality, append = T,  row.names = FALSE)
#the above is a comment because I am getting the error "Error in .local(conn, statement, ...) : 
# could not run statement: The used command is not allowed with this MySQL version" and I have a very old computer, so I cannot help this issue/cannot update
#make connection
##mongoDB_Movies = mongo(collection = "airquality", db = "airquality")
#input data
##my_collection$insert(imdb_movies)

Relational databases have predefined schema, and are beneficial for representing relationships within data. Non-relational databases have dynamic schema and ar beneficial for unstructed or changing data.