The assignment asked for at least six individuals to rate five movies. That did not suit my case, because the people I know watch movies so rarely that they have never seen Black Panther! Yes, this makes me the cool friend. Therefore, I opted for the one thing that we all like watching: Korean popular music (K-Pop) videos.
K-Pop has become quite a phenomenon in Western society over the last decade. SHINee is one of the distinguished boy bands that continue to greatly influence the trends of the K-Pop music industry. They debuted as a 5-member group in 2008, and have since released over 35 music videos.
This assignment investigates which attribute of five randomly chosen SHINee music videos appeals most to fans, and which video has the most even distribution across the attributes based on fans’ rankings.
The five music videos that were selected from SHINee’s discography are:
| Music_Videos | Released_Date |
|---|---|
| Sherlock | 2012 |
| Everybody | 2013 |
| View | 2015 |
| Tell Me What To Do | 2016 |
| Good Evening | 2018 |
The music videos were ranked on three attributes: Visual, Theme/Story, and Performance.
Using my social media profile, which has over 1000 followers, I posted a link to a SurveyMonkey form on February 5th, 2019, and kept it open from 9 AM to 2 PM EST. Nearly 85% of my followers are fans of SHINee, and any of them could participate in the survey.
Because the participants are devoted fans, I tried to reduce bias by switching from a ‘rating scale’ to a ‘ranking scale’, where participants rank the five music videos from 1st place to 5th place on each of the 3 attributes. Within an attribute, each rank and each music video can be used only once: for example, once a participant places a music video 1st for an attribute, neither that video nor the 1st-place spot is available for the remaining four ranks. A rank of 1 means the video is the best of the five on that attribute.
The following screenshot shows the survey, which lets participants drag and drop their rankings for each attribute.
Screenshot
After the allotted time for responses elapsed, the survey was closed with a total of 50 participants. The .csv file was exported from SurveyMonkey and cleaned into a contingency table of rank counts, which was then loaded into a MySQL database.
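The cleaning step is not shown in this report. As a rough sketch of how raw rankings could be tallied into the kind of count table used later, the R code below assumes a hypothetical layout with one row per participant and one column per music video, each cell holding the rank (1 to 5) given for a single attribute. The three responses and the names `raw_visual` and `tally` are made up for illustration and are not the survey data.

```r
# Hypothetical raw responses for ONE attribute (not the real survey data):
# one row per participant, one column per music video, cell = rank given (1-5).
videos <- c("Sherlock", "Everybody", "View", "Tell Me What To Do", "Good Evening")
raw_visual <- rbind(
  c(3, 1, 2, 4, 5),
  c(4, 2, 1, 5, 3),
  c(5, 3, 4, 1, 2)
)
colnames(raw_visual) <- videos

# A ranking scale means every participant uses each rank exactly once.
stopifnot(all(apply(raw_visual, 1, function(r) setequal(r, 1:5))))

# Count how often each video received each rank (rows = videos, columns = ranks).
# The rank labels follow the column names used in this report.
tally <- t(apply(raw_visual, 2, function(r) tabulate(r, nbins = 5)))
colnames(tally) <- c("first", "second", "third", "forth", "fifth")
tally
```

Because each participant assigns every rank once and ranks every video once, all row totals and all column totals of such a table equal the number of participants.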
The following code created the database on MySQL.
-- Samantha Deokinanan
-- CUNY MSDS DATA 607 Assignment 2
-- Database: SHINee Ranking 2/5/2019
-- Host: localhost
-- ------------------------------------------------------
-- Table structure for contingency table 'SHINeeRanking'
DROP TABLE IF EXISTS `SHINeeRanking`;
CREATE TABLE `SHINeeRanking` (
`ID` integer NOT NULL,
`mvs` varchar(100) NOT NULL,
-- Rank counts are stored as integers so they arrive in R as numeric values
`first_visual` integer NOT NULL,
`second_visual` integer NOT NULL,
`third_visual` integer NOT NULL,
`forth_visual` integer NOT NULL,
`fifth_visual` integer NOT NULL,
`first_theme_story` integer NOT NULL,
`second_theme_story` integer NOT NULL,
`third_theme_story` integer NOT NULL,
`forth_theme_story` integer NOT NULL,
`fifth_theme_story` integer NOT NULL,
`first_performance` integer NOT NULL,
`second_performance` integer NOT NULL,
`third_performance` integer NOT NULL,
`forth_performance` integer NOT NULL,
`fifth_performance` integer NOT NULL,
PRIMARY KEY (`ID`)
);
-- Dumping data for table `SHINeeRanking`
-- Forward slashes avoid MySQL treating '\t' in the path as a tab escape
LOAD DATA INFILE 'path/to/hw2dataset_SHINeeRanking.csv'
INTO TABLE SHINeeRanking
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;
Now, it is time to connect R to the MySQL database.
# Loading the required library
library(RMySQL)
# Connecting to the MySQL server
db <- dbConnect(MySQL(), user = 'root', password = '', dbname = 'shineeranking', host = 'localhost', port = 3306)
# Retrieving the table and making it available in RStudio
retrieved <- dbSendQuery(db, "SELECT * FROM shineeranking")
SHINeeRanking <- fetch(retrieved, n = -1)
# Releasing the result set now that all rows have been fetched
dbClearResult(retrieved)
# Dropping the ID column, which is not needed for the analysis
SHINeeRanking <- SHINeeRanking[, -1]
# The database of the music video ranking
library(kableExtra)
kable(SHINeeRanking) %>%
kable_styling() %>%
scroll_box(width = "100%", height = "200px")
| mvs | first_visual | second_visual | third_visual | forth_visual | fifth_visual | first_theme_story | second_theme_story | third_theme_story | forth_theme_story | fifth_theme_story | first_performance | second_performance | third_performance | forth_performance | fifth_performance |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Sherlock | 0 | 0 | 14 | 21 | 15 | 10 | 7 | 14 | 13 | 6 | 17 | 15 | 7 | 9 | 2 |
| Everybody | 8 | 12 | 11 | 11 | 8 | 7 | 4 | 2 | 13 | 24 | 16 | 9 | 10 | 12 | 3 |
| View | 13 | 12 | 4 | 10 | 11 | 7 | 14 | 13 | 8 | 8 | 6 | 10 | 17 | 9 | 8 |
| Tell Me What To Do | 15 | 6 | 14 | 4 | 11 | 10 | 9 | 9 | 12 | 10 | 3 | 10 | 7 | 10 | 20 |
| Good Evening | 14 | 20 | 7 | 4 | 5 | 16 | 16 | 12 | 4 | 2 | 8 | 6 | 9 | 10 | 17 |
Using SHINeeRanking, the table was split by attribute, and each part was converted to matrix form for analysis.
# Visual Ranking
visual<-SHINeeRanking[,c(2:6)]
row.names(visual)<-SHINeeRanking$mvs
colnames(visual)<-c("first", "second", "third", "forth", "fifth")
visualm<-data.matrix(visual)
# Theme/Story Ranking
theme_story<-SHINeeRanking[,c(7:11)]
row.names(theme_story)<-SHINeeRanking$mvs
colnames(theme_story)<-c("first", "second", "third", "forth", "fifth")
theme_storym<-data.matrix(theme_story)
# Performance Ranking
performance<-SHINeeRanking[,c(12:16)]
row.names(performance)<-SHINeeRanking$mvs
colnames(performance)<-c("first", "second", "third", "forth", "fifth")
performancem<-data.matrix(performance)
I wanted to visualize how the rank a music video receives depends on the video, within each attribute. Therefore, a chi-squared test of independence was used, which evaluates whether there is a significant association between the two categorical variables (music video and rank).
For visualization, corrplot() was used to plot the Pearson residuals from each test. Positive residuals (a video received that rank more often than expected) are displayed in white, and negative residuals in black. The size of each circle is proportional to the magnitude of the residual.
library(corrplot)
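To make the plotted quantity concrete, here is a small sketch that recomputes the Pearson residuals by hand for the Visual counts shown in the table above. The counts are retyped into a matrix named `visual_counts` (a name chosen here for illustration) so the example runs without the database connection.

```r
# Visual rank counts, retyped from the table above (rows = videos, columns = ranks).
visual_counts <- matrix(
  c( 0,  0, 14, 21, 15,
     8, 12, 11, 11,  8,
    13, 12,  4, 10, 11,
    15,  6, 14,  4, 11,
    14, 20,  7,  4,  5),
  nrow = 5, byrow = TRUE,
  dimnames = list(
    c("Sherlock", "Everybody", "View", "Tell Me What To Do", "Good Evening"),
    c("first", "second", "third", "forth", "fifth")
  )
)

# Expected count under independence: row total * column total / grand total.
expected <- outer(rowSums(visual_counts), colSums(visual_counts)) / sum(visual_counts)

# Pearson residual of each cell: (observed - expected) / sqrt(expected).
pearson <- (visual_counts - expected) / sqrt(expected)

# The chi-squared statistic is the sum of the squared residuals.
sum(pearson^2)   # 70.6

# Compare with the residuals that chisq.test() returns.
all.equal(pearson, chisq.test(visual_counts)$residuals, check.attributes = FALSE)
```

Every row and column of this table sums to 50, so each expected count is 10. A cell such as Sherlock in fourth place (21 observed) therefore has a large positive residual, while Sherlock in first place (0 observed) has a large negative one.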
Firstly, for Visual, the test revealed that music video and rank are significantly associated, p-value < 0.001. The cells that contribute most are shown in the plot below.
The music videos most frequently ranked first for Visual are Tell Me What To Do and Good Evening. Sherlock, by contrast, was never ranked first or second for Visual.
chi_visual<-chisq.test(visualm)
chi_visual
##
## Pearson's Chi-squared test
##
## data: visualm
## X-squared = 70.6, df = 16, p-value = 7.832e-09
corrplot(chi_visual$residuals, col = c("black", "white"), bg = "lightblue", is.cor = FALSE, sig.level = .05)
Next, the test revealed that music video and rank are also significantly associated for Theme/Story, p-value < 0.001. The cells that contribute most are shown in the plot below. Good Evening was the music video most strongly associated with first place for the best theme or story, while Everybody was frequently ranked fifth among the music videos.
chi_theme<-chisq.test(theme_storym)
chi_theme
##
## Pearson's Chi-squared test
##
## data: theme_storym
## X-squared = 58.8, df = 16, p-value = 8.326e-07
corrplot(chi_theme$residuals, col = c("black", "white"), bg = "lightblue", is.cor = FALSE, sig.level = .05)
Lastly, the chi-squared test for Performance revealed that music video and rank are significantly associated, p-value < 0.001. The cells that contribute most are shown in the plot below. Sherlock and Everybody were the music videos most often ranked first for Performance, while Tell Me What To Do and Good Evening were commonly ranked fifth.
chi_pref<-chisq.test(performancem)
chi_pref
##
## Pearson's Chi-squared test
##
## data: performancem
## X-squared = 53.6, df = 16, p-value = 6.026e-06
corrplot(chi_pref$residuals, col = c("black", "white"), bg = "lightblue", is.cor = FALSE, sig.level = .05)
In conclusion, View was the music video that was not frequently ranked as outstanding on any of the three attributes. It was ranked close to evenly on Visual, often second or third for Theme/Story, and often third for Performance. This may suggest that fans consider View well balanced across Visual, Theme/Story and Performance, with no attribute distinctly more impressive than the others. In contrast, Good Evening was mostly associated with high ranks for both Visual and Theme/Story, and Sherlock was mostly associated with high ranks for the Performance attribute.
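One possible way to put a number on "well balanced" is sketched below. It is not part of the original analysis. It sums the squared Pearson residuals in each video's row, which is that video's contribution to the chi-squared statistic, so a smaller value means the video's ranks were spread more evenly. The counts are retyped from the table above, and `as_counts`, `unevenness` and `scores` are illustrative names.

```r
ranks  <- c("first", "second", "third", "forth", "fifth")
videos <- c("Sherlock", "Everybody", "View", "Tell Me What To Do", "Good Evening")

# Helper (illustrative): build a labelled 5 x 5 count matrix from row-wise values.
as_counts <- function(x) {
  matrix(x, nrow = 5, byrow = TRUE, dimnames = list(videos, ranks))
}

# Rank counts retyped from the table above.
visual <- as_counts(c(0, 0, 14, 21, 15,   8, 12, 11, 11, 8,   13, 12, 4, 10, 11,
                      15, 6, 14, 4, 11,   14, 20, 7, 4, 5))
theme_story <- as_counts(c(10, 7, 14, 13, 6,   7, 4, 2, 13, 24,   7, 14, 13, 8, 8,
                           10, 9, 9, 12, 10,   16, 16, 12, 4, 2))
performance <- as_counts(c(17, 15, 7, 9, 2,   16, 9, 10, 12, 3,   6, 10, 17, 9, 8,
                           3, 10, 7, 10, 20,   8, 6, 9, 10, 17))

# Each video's contribution to X-squared: sum of squared Pearson residuals in its row.
unevenness <- function(m) rowSums(chisq.test(m)$residuals^2)

scores <- cbind(
  Visual      = unevenness(visual),
  Theme_Story = unevenness(theme_story),
  Performance = unevenness(performance)
)
scores <- cbind(scores, Total = rowSums(scores))

# Videos ordered from most evenly ranked to least evenly ranked overall.
round(scores[order(scores[, "Total"]), ], 1)
```

On these counts, View has the smallest total (16.2) and Sherlock the largest (56.0), which is consistent with the reading that View is the most evenly ranked video overall. Tell Me What To Do has the most even ranks on Theme/Story alone (0.6).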