## Student database
Before the database could be connected to RStudio, it needed to be created. The table has six columns: the ID, first name, last name, score, GPA, and a probability value. The probabilities were generated randomly and are based on the student's GPA. Once the database and the table were created, connecting to the database was easy. The RMySQL library provides the connection to the database that was created.
We are able to print the table and view the contents of the selected table as a data frame. You can also insert items into the table from R. This is displayed below.
library(RMySQL)
## Warning: package 'RMySQL' was built under R version 4.0.3
## Loading required package: DBI
#connecting to the database
mysqlconnection = dbConnect(MySQL(), user = 'root', password = 'root', dbname = 'student', host = 'localhost')
dbListTables(mysqlconnection)
## [1] "student_table"
#fetches the whole database table and prints it as a data frame
result = dbSendQuery(mysqlconnection, "select * from student_table")
data.frame = fetch(result)
print(data.frame)
## SID Firstname Lastname score gpa probability
## 1 1 Ben Hebbel 91 3.1 0.50
## 2 2 Dylan Nasser 93 3.1 0.60
## 3 3 Bob Smith 88 3.5 0.70
## 4 4 John Doe 95 3.1 0.60
## 5 5 Jane Doe 98 3.7 0.90
## 6 6 Jack Ryan 98 3.7 0.90
## 7 7 John Wick 91 3.6 0.80
## 8 8 Bruce Wayne 84 3.0 0.40
## 9 9 Bruce Willis 85 3.4 0.60
## 10 10 Brad Pit 80 2.7 0.60
## 11 11 Clark Kent 78 2.8 0.30
## 12 12 Mike Smith 74 2.4 0.30
## 13 13 Will Smith 79 2.9 0.40
## 14 14 Clint Eastwood 85 3.0 0.85
#printing a data frame that shows students that have a score greater than 90
result = dbSendQuery(mysqlconnection, "select * from student_table where score > 90")
data.frame = fetch(result)
print(data.frame)
## SID Firstname Lastname score gpa probability
## 1 1 Ben Hebbel 91 3.1 0.5
## 2 2 Dylan Nasser 93 3.1 0.6
## 3 4 John Doe 95 3.1 0.6
## 4 5 Jane Doe 98 3.7 0.9
## 5 6 Jack Ryan 98 3.7 0.9
## 6 7 John Wick 91 3.6 0.8
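The send-then-fetch pattern used above can usually be collapsed into one call with `dbGetQuery()` from DBI, which sends the query, fetches all rows, and clears the result set. The following is a minimal sketch, not the setup used in this report: it assumes the RSQLite package is installed and uses an in-memory SQLite database as a stand-in for the MySQL server, loaded with four of the rows shown above.

```r
library(DBI)

# Stand-in connection (assumption): in-memory SQLite instead of the MySQL server.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Four rows copied from the student table printed above.
students <- data.frame(
  SID = c(1L, 2L, 3L, 8L),
  Firstname = c("Ben", "Dylan", "Bob", "Bruce"),
  Lastname = c("Hebbel", "Nasser", "Smith", "Wayne"),
  score = c(91, 93, 88, 84),
  gpa = c(3.1, 3.1, 3.5, 3.0),
  probability = c(0.5, 0.6, 0.7, 0.4)
)
dbWriteTable(con, "student_table", students)

# One call replaces dbSendQuery() followed by fetch().
high_scores <- dbGetQuery(con, "select * from student_table where score > 90")
print(high_scores)

# Release the connection when finished.
dbDisconnect(con)
```

Storing the result under a name such as `high_scores` also avoids reusing `data.frame`, which is the name of a base R function.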
#dbSendQuery(mysqlconnection, "insert into student_table(SID, Firstname, Lastname, score, gpa, probability) values(11, 'Clark', 'Kent', 84, 2.8, 0.6)")
#dbSendQuery(mysqlconnection, "insert into student_table(SID, Firstname, Lastname, score, gpa, probability) values(12, 'Mike', 'Smith', 74, 2.4, 0.3)")
#dbSendQuery(mysqlconnection, "insert into student_table(SID, Firstname, Lastname, score, gpa, probability) values(13, 'Will', 'Smith', 79, 2.9, 0.4)")
#dbSendQuery(mysqlconnection, "insert into student_table(SID, Firstname, Lastname, score, gpa, probability) values(14, 'Clint', 'Eastwood', 85, 3., 0.85)")
result = dbSendQuery(mysqlconnection, "select * from student_table")
data.frame = fetch(result)
print(data.frame)
## SID Firstname Lastname score gpa probability
## 1 1 Ben Hebbel 91 3.1 0.50
## 2 2 Dylan Nasser 93 3.1 0.60
## 3 3 Bob Smith 88 3.5 0.70
## 4 4 John Doe 95 3.1 0.60
## 5 5 Jane Doe 98 3.7 0.90
## 6 6 Jack Ryan 98 3.7 0.90
## 7 7 John Wick 91 3.6 0.80
## 8 8 Bruce Wayne 84 3.0 0.40
## 9 9 Bruce Willis 85 3.4 0.60
## 10 10 Brad Pit 80 2.7 0.60
## 11 11 Clark Kent 78 2.8 0.30
## 12 12 Mike Smith 74 2.4 0.30
## 13 13 Will Smith 79 2.9 0.40
## 14 14 Clint Eastwood 85 3.0 0.85
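The commented-out inserts above paste values directly into the SQL text. A hedged alternative is to let `sqlInterpolate()` from DBI quote the values and then run the statement with `dbExecute()`, which returns the number of affected rows. This sketch again assumes RSQLite as an in-memory stand-in for MySQL; the column types in the `create table` statement are assumptions, since the report does not show how the table was defined.

```r
library(DBI)

# Stand-in connection (assumption): in-memory SQLite instead of the MySQL server.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Column types are illustrative guesses, not the report's actual schema.
dbExecute(con, "create table student_table (
  SID integer, Firstname text, Lastname text,
  score real, gpa real, probability real)")

# Build the insert with named placeholders so the values are quoted safely.
insert_sql <- sqlInterpolate(
  con,
  "insert into student_table (SID, Firstname, Lastname, score, gpa, probability)
   values (?sid, ?first, ?last, ?score, ?gpa, ?prob)",
  sid = 12L, first = "Mike", last = "Smith", score = 74, gpa = 2.4, prob = 0.3
)

# dbExecute() is meant for statements that do not return rows.
rows_inserted <- dbExecute(con, insert_sql)

inserted <- dbGetQuery(con, "select * from student_table where SID = 12")
print(inserted)

dbDisconnect(con)
```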
#printing the students that have a high chance of getting hired (probability of at least 0.8)
result = dbSendQuery(mysqlconnection, "select * from student_table where probability >= 0.8")
data.frame = fetch(result)
print(data.frame)
## SID Firstname Lastname score gpa probability
## 1 5 Jane Doe 98 3.7 0.90
## 2 6 Jack Ryan 98 3.7 0.90
## 3 7 John Wick 91 3.6 0.80
## 4 14 Clint Eastwood 85 3.0 0.85
#calculating the average probability of the four students in the data frame printed above
#signature for reference: mean(x, trim = 0, na.rm = FALSE)
x <-c(0.9,0.9,0.8,0.85)
result.mean <- mean(x)
print(result.mean)
## [1] 0.8625
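The probabilities above were retyped by hand into `x`. As a sketch of a less error-prone route, the mean can be taken from the `probability` column of the query result itself. The data frame `hired` below is a hypothetical name, rebuilt here from the four rows printed above so the example runs without a database connection.

```r
# Rebuild the query result shown above (rows with probability >= 0.8).
hired <- data.frame(
  SID = c(5L, 6L, 7L, 14L),
  Firstname = c("Jane", "Jack", "John", "Clint"),
  Lastname = c("Doe", "Ryan", "Wick", "Eastwood"),
  score = c(98, 98, 91, 85),
  gpa = c(3.7, 3.7, 3.6, 3.0),
  probability = c(0.90, 0.90, 0.80, 0.85)
)

# Average the column directly instead of copying the values into a vector.
avg_probability <- mean(hired$probability)
print(avg_probability)
```

With a live connection, the same column would come from the data frame returned by the `probability >= 0.8` query, so the average updates automatically if rows are added to the table.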
# Conclusion
The code above calculates the average probability for the students whose probability is greater than or equal to 80%. The probability is based on the student's GPA. On average, the four students in that data frame have about an 86% chance of getting hired.