Rpubs: http://rpubs.com/umais/week12_assignment
Github: https://github.com/umais/DATA-607/tree/master/Week12_Assignment
In this assignment I will be taking data from a relational database and migrating it to MongoDB database that is set up on MongoLab. The data set that I have chosen is from Project 3 . I am going to retrieve the Job Title, Job Location and Skill Name from the Normalized tables DataScienceJob and DataScienceSkills and store them in a mongo database collection called jobskills.
In this step I will be using the RMySQL library to connect to MySQL data store and retrieve the Skills data. I will then connect to the MongoDB database called openletter on MongoLab and Insert the Skill data reteived from MySQL in the collection called jobskills.
my_collection = mongo(collection = "jobskills", url = "mongodb://ruser:Tw1stone23@ds031618.mlab.com:31618/openletter")
mydb = dbConnect(MySQL(), user='root', password='Welcome@1', dbname='project3', host='localhost')
results = dbSendQuery(mydb, "SELECT j.JobTitle,j.JobLocation,s.SkillName FROM DataScienceJobs j INNER JOIN DataScienceSkills s ON j.JobId=s.JobId ;")
jobSkills=fetch(results, n=-1)
my_collection$drop()
## [1] TRUE
my_collection$insert(jobSkills)
##
Processed 1000 rows...
Complete! Processed total of 1643 rows.
## $nInserted
## [1] 1643
##
## $nMatched
## [1] 0
##
## $nRemoved
## [1] 0
##
## $nUpserted
## [1] 0
##
## $writeErrors
## list()
head(jobSkills)
## JobTitle JobLocation SkillName
## 1 Data Scientist Sunnyvale, CA Python
## 2 Data Scientist Sunnyvale, CA C/C++
## 3 Data Scientist Sunnyvale, CA Apache Spark
## 4 Data Scientist Sunnyvale, CA Kafka
## 5 Data Scientist Sunnyvale, CA ElasticSearch
## 6 Data Scientist San Francisco, CA Postgres/Redshift
alldata <- my_collection$find('{}')
##
Found 1000 records...
Found 1643 records...
Imported 1643 records. Simplifying into dataframe...
head(alldata)
## JobTitle JobLocation SkillName
## 1 Data Scientist Sunnyvale, CA Python
## 2 Data Scientist Sunnyvale, CA C/C++
## 3 Data Scientist Sunnyvale, CA Apache Spark
## 4 Data Scientist Sunnyvale, CA Kafka
## 5 Data Scientist Sunnyvale, CA ElasticSearch
## 6 Data Scientist San Francisco, CA Postgres/Redshift