Load Libraries

#library(tidyverse)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(DBI)
library(RMySQL)
## Warning: package 'RMySQL' was built under R version 4.3.2

Connect to MySQL Database

password has been masked

lamadb <- dbConnect(RMySQL::MySQL(), 
                 user = "chhiring.lama65", 
                 password = password, 
                 dbname = "chhiring.lama65",
                 host = "cunydata607sql.mysql.database.azure.com")

List the tables

dbListTables(lamadb)
## [1] "movie_ratings"

Show Column Names using dbListFields function (assigned it to column_names)

table_name <- "movie_ratings"  
column_names <- dbListFields(lamadb, table_name)

Construct the SQL query using sprintf function and then retrieve the data from MySQL database using dbGetQuery function

query <- sprintf("SELECT * FROM %s", table_name)
table_data <- dbGetQuery(lamadb, query)

I have chosen to make null for one of Paricipants’ values. If it was an integer value, one of the approaches to handle it would be taking mean value from the specific field and replace the null value with it. The impact of a null value in the Participants field depends on the specific context and the importance of that field in the analysis. If the Participants field is not critical for the analysis or does not significantly affect the interpretation of the data, we might choose to leave the null value as it is. However, if the Participants field is relevant and its absence could lead to incomplete or inaccurate insights, imputing the missing value using relevant information ensures a more comprehensive dataset for analysis. Ultimately, the decision should align with the goals and requirements of our analysis.

table_data
##   Movie_ID              Movies    Participants  Rating Rating_Score
## 1      111         The Creator     Janesh Kaur   Awful            1
## 2       34       The Holdovers      Nancy Jain Not bad            3
## 3       65         The Marvels   Jaspreet Devi  Boring            2
## 4       89               Wonka David Goldsmith   Great            4
## 5      105 Society of the Snow                   Super            5
## 6      124            Napoleon     Putar Junee  Boring            2