In this assignment, we will recall tables from the SQL database into R. The data is stored in the database server: cunydata607sql.mysql.database.azure.com. The two tables created were Friends and Ratings. Friends have names, ages, occupancy, and favorite genres. Ratings contain the ratings recorded for each movie.
# install packages needed
library(DBI)
library(odbc)
library(RSQLite)
library(RMySQL)
##
## Attaching package: 'RMySQL'
## The following object is masked from 'package:RSQLite':
##
## isIdCurrent
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
The tables were made in EXCEL and uploaded through the “Table Data Import Wizard” Available on MySQL Workbench.
In the code below, we will establish a connection with MySQL server with the stored data, and view insights from the tables.
con <- dbConnect(MySQL(), user='bishoy.sokkar60', password='<bishoy.sokkar607>', dbname='bishoy.sokkar60', host='cunydata607sql.mysql.database.azure.com')
To view the tables in the database:
dbListTables(con)
## [1] "friends" "ratings"
To learn more about the table variables in the data base:
dbListFields(con, "friends")
## [1] "my_row_id" "Friend" "Age" "Occupancy"
## [5] "Favorite Genre"
dbListFields(con, "ratings")
## [1] "my_row_id" "Friend" "Wonka" "The Beekeeper"
## [5] "A Ghost Story" "Anyone But You" "Origin" "Past Lives"
To retrieve tables from MySQL server, we use the following function:
#to create tables in R with the data
Friends_data = dbGetQuery(con, "SELECT * FROM friends ")
Ratings_data = dbGetQuery(con, "SELECT * FROM ratings")
# to glimpse to confirm all information transfered correctly.
glimpse(Friends_data)
## Rows: 5
## Columns: 4
## $ `Friend` <chr> "Alicia", "Lauren", "Jacky", "Vivian ", "Jacob"
## $ Age <int> 29, 24, 32, 34, 27
## $ Occupancy <chr> "Area Manager", "Area Manager", "Area Manager", "Oper…
## $ `Favorite Genre` <chr> "Romance", "Horror", "Comedy", "Fantasy", "Action"
glimpse(Ratings_data)
## Rows: 5
## Columns: 7
## $ `Friend` <chr> "Alicia", "Lauren", "Jacky", "Vivian ", "Jacob"
## $ Wonka <int> 2, 3, 1, 5, 1
## $ `The Beekeeper` <int> 3, 4, 2, 4, 3
## $ `A Ghost Story` <int> 1, 5, 5, 2, 2
## $ `Anyone But You` <int> 1, 5, 1, 4, 1
## $ Origin <int> 4, 4, 3, 3, 2
## $ `Past Lives` <int> 4, 2, 5, 1, 2
When uploading the tables into the SQL database, I selected NN (Not Null) which constrains the column from containing null values.