The objective is to construct a survey of six recent popular movies. Several people are asked to rate each of the movies they have seen on a scale of 1 to 5. The results are stored in a MySQL database and then loaded into an R data frame for analysis.
The RMySQL and ggplot2 packages are used to connect to the database, run the SELECT query, and plot the resulting data frame.
library('DBI')
library('RMySQL')
library('ggplot2')
Connect R to the database using the RMySQL library, then run a SELECT query and store the result in a data frame.
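The chunk that opens the connection is not shown in this report, so the following is only a minimal sketch of how such a step could look. The helper name `load_ratings`, the connection details, the `MYSQL_PASSWORD` environment variable, and the three-table layout (`participants`, `movies`, `ratings`) with their column names are all assumptions for illustration; only the four output column names are taken from the data frame printed below.

```r
# Hypothetical helper: connection details and table/column names are
# assumptions for illustration, not taken from the original report.
load_ratings <- function(dbname   = "movies",       # assumed schema name
                         host     = "localhost",    # assumed host
                         user     = "survey_user",  # assumed account
                         password = Sys.getenv("MYSQL_PASSWORD")) {
  con <- DBI::dbConnect(RMySQL::MySQL(), dbname = dbname, host = host,
                        user = user, password = password)
  # Close the connection even if the query fails.
  on.exit(DBI::dbDisconnect(con), add = TRUE)

  # Assumed layout: one row per (participant, movie) pair in `ratings`.
  sql <- "
    SELECT p.participant_id AS Participantid,
           p.name           AS ParticipantName,
           m.title          AS movieTitle,
           r.rate           AS rate
    FROM ratings r
    JOIN participants p ON p.participant_id = r.participant_id
    JOIN movies m       ON m.movie_id = r.movie_id
    ORDER BY m.movie_id, p.participant_id"

  DBI::dbGetQuery(con, sql)  # returns the result set as a data.frame
}

# Usage (requires a running MySQL server with the assumed schema):
# mydata <- load_ratings()
# mydata
```

Wrapping the connection in a function with `on.exit()` is a design choice of this sketch, so that the connection is released whether or not the query succeeds.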
##    Participantid ParticipantName               movieTitle rate
## 1              1          Zahara            DON'T BREATHE    3
## 2              2         Mohamad            DON'T BREATHE    5
## 3              3          Zeinab            DON'T BREATHE    4
## 4              4           Hadee            DON'T BREATHE    5
## 5              5           Salma            DON'T BREATHE    5
## 6              6         Anthony            DON'T BREATHE    5
## 7              1          Zahara            SUICIDE SQUAD    4
## 8              2         Mohamad            SUICIDE SQUAD    4
## 9              3          Zeinab            SUICIDE SQUAD    5
## 10             4           Hadee            SUICIDE SQUAD    4
## 11             5           Salma            SUICIDE SQUAD    4
## 12             6         Anthony            SUICIDE SQUAD    5
## 13             1          Zahara KUBO AND THE TWO STRINGS    5
## 14             2         Mohamad KUBO AND THE TWO STRINGS    3
## 15             3          Zeinab KUBO AND THE TWO STRINGS    3
## 16             4           Hadee KUBO AND THE TWO STRINGS    3
## 17             5           Salma KUBO AND THE TWO STRINGS    4
## 18             6         Anthony KUBO AND THE TWO STRINGS    3
## 19             1          Zahara            SAUSAGE PARTY    2
## 20             2         Mohamad            SAUSAGE PARTY    4
## 21             3          Zeinab            SAUSAGE PARTY    3
## 22             4           Hadee            SAUSAGE PARTY    4
## 23             5           Salma            SAUSAGE PARTY    5
## 24             6         Anthony            SAUSAGE PARTY    2
## 25             1          Zahara   MECHANIC: RESURRECTION    4
## 26             2         Mohamad   MECHANIC: RESURRECTION    4
## 27             3          Zeinab   MECHANIC: RESURRECTION    2
## 28             4           Hadee   MECHANIC: RESURRECTION    4
## 29             5           Salma   MECHANIC: RESURRECTION    4
## 30             6         Anthony   MECHANIC: RESURRECTION    1
## 31             1          Zahara            PETE'S DRAGON    1
## 32             2         Mohamad            PETE'S DRAGON    2
## 33             3          Zeinab            PETE'S DRAGON    2
## 34             4           Hadee            PETE'S DRAGON    2
## 35             5           Salma            PETE'S DRAGON    5
## 36             6         Anthony            PETE'S DRAGON    1
Inspect the data types and the built-in summary statistics.
str(mydata)
## 'data.frame': 36 obs. of 4 variables:
## $ Participantid : int 1 2 3 4 5 6 1 2 3 4 ...
## $ ParticipantName: chr "Zahara" "Mohamad" "Zeinab" "Hadee" ...
## $ movieTitle : chr "DON'T BREATHE" "DON'T BREATHE" "DON'T BREATHE" "DON'T BREATHE" ...
## $ rate : int 3 5 4 5 5 5 4 4 5 4 ...
summary(mydata)
##  Participantid ParticipantName     movieTitle             rate
##  Min.   :1.0   Length:36          Length:36          Min.   :1.00
##  1st Qu.:2.0   Class :character   Class :character   1st Qu.:2.75
##  Median :3.5   Mode  :character   Mode  :character   Median :4.00
##  Mean   :3.5                                         Mean   :3.50
##  3rd Qu.:5.0                                         3rd Qu.:4.25
##  Max.   :6.0                                         Max.   :5.00
Plot rating against movie as a boxplot to compare the rating distributions visually, and overlay each movie's mean on its box.
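# Per-movie mean and median rating
means <- aggregate(rate ~ movieTitle, data = mydata, FUN = mean)
medians <- aggregate(rate ~ movieTitle, data = mydata, FUN = median)
# One box per movie, with the mean overlaid as a dark red diamond
p <- ggplot(mydata, aes(factor(movieTitle), rate))
p + geom_boxplot(aes(fill = movieTitle)) +
  # `fun` replaces the deprecated `fun.y` argument (ggplot2 >= 3.3.0)
  stat_summary(fun = mean, colour = "darkred",
               geom = "point", shape = 18, size = 3) +
  # Print each movie title just above its mean point
  geom_text(data = means, aes(label = movieTitle,
                              y = rate + 0.08))
# Rank the movies by mean and by median rating, highest first
the_means <- means[order(means$rate, decreasing = TRUE), ]
the_medians <- medians[order(medians$rate, decreasing = TRUE), ]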
Based on the mean values, the following table ranks the movies from the highest rating to the lowest.
the_means
##                 movieTitle     rate
## 1            DON'T BREATHE 4.500000
## 6            SUICIDE SQUAD 4.333333
## 2 KUBO AND THE TWO STRINGS 3.500000
## 5            SAUSAGE PARTY 3.333333
## 3   MECHANIC: RESURRECTION 3.166667
## 4            PETE'S DRAGON 2.166667
Based on the median values, the following table ranks the movies from the highest rating to the lowest.
the_medians
##                 movieTitle rate
## 1            DON'T BREATHE  5.0
## 3   MECHANIC: RESURRECTION  4.0
## 6            SUICIDE SQUAD  4.0
## 5            SAUSAGE PARTY  3.5
## 2 KUBO AND THE TWO STRINGS  3.0
## 4            PETE'S DRAGON  2.0
In conclusion, the boxplot shows that opinions on Sausage Party and Mechanic: Resurrection are the most divided, as their large interquartile ranges illustrate. The other four movies have small, similar interquartile ranges, which indicates that participants largely agree about each of them regardless of how high or low the rating is. Based on the median, Don’t Breathe receives the highest rating (5.0), followed by Mechanic: Resurrection and Suicide Squad at 4.0. Sausage Party scores 3.5, slightly higher than Kubo and the Two Strings at 3.0. Pete’s Dragon is last with a median rating of 2.0.
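The interquartile-range claim can also be checked numerically rather than read off the plot. The sketch below is not part of the original analysis: it re-enters the ratings from the printed data frame so that it runs without the database, and the names `ratings`, `long`, and `spread` are introduced here for illustration only.

```r
# Ratings copied from the data frame printed earlier, one vector per movie,
# in participant order (Zahara, Mohamad, Zeinab, Hadee, Salma, Anthony).
ratings <- list(
  "DON'T BREATHE"            = c(3, 5, 4, 5, 5, 5),
  "SUICIDE SQUAD"            = c(4, 4, 5, 4, 4, 5),
  "KUBO AND THE TWO STRINGS" = c(5, 3, 3, 3, 4, 3),
  "SAUSAGE PARTY"            = c(2, 4, 3, 4, 5, 2),
  "MECHANIC: RESURRECTION"   = c(4, 4, 2, 4, 4, 1),
  "PETE'S DRAGON"            = c(1, 2, 2, 2, 5, 1)
)

# stack() turns the named list into a long data frame (values, ind).
long <- stack(ratings)
names(long) <- c("rate", "movieTitle")

# Interquartile range per movie, using R's default quantile method.
spread <- aggregate(rate ~ movieTitle, data = long, FUN = IQR)
names(spread)[2] <- "iqr"

# Widest spread first.
spread[order(spread$iqr, decreasing = TRUE), ]
```

With R's default quantile method this gives an interquartile range of 1.75 for Sausage Party and 1.50 for Mechanic: Resurrection, against 0.75 for each of the other four movies, which agrees with the reading of the boxplot.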
----