introduction: In the assignment two, I will connect the MYSQL workbench from my computer to Azure SQL database server using credential. Then Create a table called movie_rating and insert data into the table. After that I will a create a connection to Azure with Rstudio in my computer. At that point, I will transfer data from Azure SQL database to R dataframe. Subsequently, I will drop missing data from dataframe for accurate analysis and explain the approach why I use it.
This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
## package 'tidyverse' successfully unpacked and MD5 sums checked
##
## The downloaded binary packages are in
## C:\Users\asadn\AppData\Local\Temp\Rtmp4OEouK\downloaded_packages
## package 'DBI' successfully unpacked and MD5 sums checked
##
## The downloaded binary packages are in
## C:\Users\asadn\AppData\Local\Temp\Rtmp4OEouK\downloaded_packages
## package 'odbc' successfully unpacked and MD5 sums checked
##
## The downloaded binary packages are in
## C:\Users\asadn\AppData\Local\Temp\Rtmp4OEouK\downloaded_packages
## package 'RSQLite' successfully unpacked and MD5 sums checked
##
## The downloaded binary packages are in
## C:\Users\asadn\AppData\Local\Temp\Rtmp4OEouK\downloaded_packages
## package 'RMySQL' successfully unpacked and MD5 sums checked
##
## The downloaded binary packages are in
## C:\Users\asadn\AppData\Local\Temp\Rtmp4OEouK\downloaded_packages
#Connect Azure sqldatabase server from mysql workbench
CREATE TABLE md.asaduzzaman39
.movie_rating
( movie_name
varchar(71) NOT NULL, Jarif
INT
NOT NULL, David
INT NOT NULL, Patric
INT NOT
NULL, Mark
INT NOT NULL, Himayat
INT NOT NULL,
PRIMARY KEY (movie_name
));
INSERT INTO md.asaduzzaman39
.movie_rating
(movie_name
, Jafir
, David
,
Patric
, Mark
, Himayat
) VALUES
(‘Avatar’, ‘5’, ‘3’, ‘2’, ‘3’, ‘4’); commit;
## [1] "movie_rating"
## movie_name Jafir David Patric Mark Himayat
## 1 Avatar 5 3 2 3 4
## 2 DarkGame 1 NA 4 NA NA
## 3 Do the Right Thing 2 5 4 3 3
## 4 John 2 5 6 3 5
## 5 Joker 4 5 3 2 5
## 6 The Pursuit of Happyness 5 4 5 3 5
## 7 Tufan 3 4 5 2 4
## movie_name Jafir David Patric Mark Himayat
## 1 Avatar 5 3 2 3 4
## 2 DarkGame 1 NA 4 NA NA
## 3 Do the Right Thing 2 5 4 3 3
## 4 John 2 5 6 3 5
## 5 Joker 4 5 3 2 5
## 6 The Pursuit of Happyness 5 4 5 3 5
## 7 Tufan 3 4 5 2 4
## movie_name Jafir David Patric Himayat
## 1 Avatar 5 3 2 4
## 3 Do the Right Thing 2 5 4 3
## 4 John 2 5 6 5
## 5 Joker 4 5 3 5
## 6 The Pursuit of Happyness 5 4 5 5
## 7 Tufan 3 4 5 4
Missing data: The missing value is the absent value in the dataset. It can the null, blank, unknown and NA. I will use the remove missing data approach because For missing values, it can get inaccurate results for out data analysis. It can create the a challenge to analyst the data. if the portion of missing data is too high,it can reduce the accuracy and reliability in the analysis.