Read .csv file into MySQL
RMySQL package and the syntax below:library(RMySQL)
library(knitr)
library(kableExtra)
# connecting to MySQL from R
mydb = dbConnect(MySQL(), user='root', password=password, dbname='hw12', host='localhost', port = 3306)
dbListTables(mydb)
## [1] "averages"
sql_data <- dbGetQuery(mydb, "SELECT * FROM averages")
sql_data <- data.frame(sql_data)
kable(head(sql_data, n = 20L), align = rep('c', 5)) %>%
kable_styling(bootstrap_options = c("striped", "responsive", "condensed"), full_width = TRUE)
| congress | chamber | bioguide | last_name | state | district | party | votes | agree_pct | predicted_agree | net_trump_vote |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | house | A000055 | Aderholt | AL | 4 | Republican | 148 | 0.9729730 | 0.9561596 | 63.0 |
| 115 | house | A000055 | Aderholt | AL | 4 | R | 95 | 0.9684211 | 0.9463492 | 63.0 |
| 116 | house | A000055 | Aderholt | AL | 4 | R | 53 | 0.9811321 | 0.9737444 | 63.0 |
| 0 | house | A000367 | Amash | MI | 3 | Independent | 146 | 0.6301370 | 0.7679919 | 9.4 |
| 115 | house | A000367 | Amash | MI | 3 | R | 96 | 0.5416667 | 0.8473855 | 9.4 |
| 116 | house | A000367 | Amash | MI | 3 | I | 50 | 0.8000000 | 0.6155562 | 9.4 |
| 0 | house | A000369 | Amodei | NV | 2 | Republican | 146 | 0.9726027 | 0.8163489 | 12.3 |
| 115 | house | A000369 | Amodei | NV | 2 | R | 94 | 0.9893617 | 0.8765268 | 12.3 |
| 116 | house | A000369 | Amodei | NV | 2 | R | 52 | 0.9423077 | 0.7075659 | 12.3 |
| 0 | house | A000370 | Adams | NC | 12 | Democrat | 145 | 0.1172414 | 0.1395692 | -40.0 |
| 115 | house | A000370 | Adams | NC | 12 | D | 92 | 0.1739130 | 0.2037872 | -40.0 |
| 116 | house | A000370 | Adams | NC | 12 | D | 53 | 0.0188679 | 0.0280965 | -40.0 |
| 0 | house | A000371 | Aguilar | CA | 31 | Democrat | 147 | 0.1904762 | 0.2161542 | -21.1 |
| 115 | house | A000371 | Aguilar | CA | 31 | D | 94 | 0.2765957 | 0.3138676 | -21.1 |
| 116 | house | A000371 | Aguilar | CA | 31 | D | 53 | 0.0377358 | 0.0428511 | -21.1 |
| 0 | house | A000372 | Allen | GA | 12 | Republican | 149 | 0.9664430 | 0.8743644 | 16.2 |
| 115 | house | A000372 | Allen | GA | 12 | R | 95 | 0.9684211 | 0.9057892 | 16.2 |
| 116 | house | A000372 | Allen | GA | 12 | R | 54 | 0.9629630 | 0.8190800 | 16.2 |
| 0 | house | A000374 | Abraham | LA | 5 | Republican | 125 | 0.9360000 | 0.9377738 | 29.4 |
| 115 | house | A000374 | Abraham | LA | 5 | R | 96 | 0.9270833 | 0.9458395 | 29.4 |
Now that I had my data in a Relational database, and could read it into R, I did some research to find a package in R that would be able to connect to a NoSQL database. I decided to use the mongolite package and MongoDB. I’ve used MongoDB in the past, and it seemed like a great option for this assignment.
With my ‘averages’ data in a dataframe in R that was pulling directly from my SQL connection, I then created a collection in my new MongoDB instance and ran it locally. In order to connect and insert the data into my new collection using mongolite in R, I did the following:
library(mongolite)
# created a new collection in MongoDB
collection_add <- mongo(collection = "hw12data", db = "hw12")
# inserted the data into the collection from R to MongoDB
collection_add$insert(sql_data)
JSON view of the data in MongoDB collection after successful migration
Table view of the data in MongoDB collection after successful migration
paste0('In my MySQL dataset there are ', length(sql_data$bioguide), ' observations.')
## [1] "In my MySQL dataset there are 1416 observations."
paste0('In my MongoDB database collection there are ', collection_add$count(), ' observations.')
## [1] "In my MongoDB database collection there are 1416 observations."
It appears that there was a successful migration!