MyAnimeList, also known as MAL, is an anime and manga social networking website with a database that lets users organize anime into personal lists. Users typically rate an anime after watching it, and these ratings make it possible to find users with similar tastes.
This project will explore the contents of this dataset to gain insights. An item-item collaborative filtering recommender system will then be built to recommend anime and predict ratings for users. The recommender system will be analyzed and evaluated to see how well it performs when recommending items. Please note that the project will not be limited to this proposal. As I progress through the assignment, some details may change as I learn more about the data and the planned recommender system.
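To make the item-item idea concrete, here is a minimal sketch in base R. It is only an illustration under stated assumptions: the rating matrix is a made-up toy example, plain cosine similarity over co-rating users is one of several possible similarity measures, and the helper names `item_cosine` and `predict_rating` are hypothetical, not part of any package.

```r
# Toy user x item rating matrix (made-up values); NA means "not rated".
ratings <- matrix(
  c( 8,  7, NA,
     9, NA,  4,
    NA,  6,  5,
     7,  8,  3),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("u", 1:4), c("A", "B", "C"))
)

# Cosine similarity between two item columns, using only users who rated both.
item_cosine <- function(x, y) {
  both <- !is.na(x) & !is.na(y)
  if (sum(both) < 2) return(NA_real_)
  sum(x[both] * y[both]) / sqrt(sum(x[both]^2) * sum(y[both]^2))
}

# Item x item similarity matrix.
items <- colnames(ratings)
sim <- matrix(NA_real_, length(items), length(items),
              dimnames = list(items, items))
for (i in items) {
  for (j in items) {
    sim[i, j] <- item_cosine(ratings[, i], ratings[, j])
  }
}

# Predict a rating as the similarity-weighted mean of the user's other ratings.
predict_rating <- function(user, item) {
  rated <- setdiff(names(which(!is.na(ratings[user, ]))), item)
  w <- sim[item, rated]
  ok <- !is.na(w)
  if (!any(ok)) return(NA_real_)
  sum(w[ok] * ratings[user, rated[ok]]) / sum(abs(w[ok]))
}

# User u1 has not rated item C; estimate it from their ratings of A and B.
predict_rating("u1", "C")
```

Because raw ratings are all positive, plain cosine similarities tend to be close to 1. Centering each user's ratings first (adjusted cosine) is a common alternative that may be worth comparing.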
This dataset contains preference data from 73,516 users on 12,294 anime. It consists of two files:
Anime.csv
Contains detailed information about the anime.

anime_id | name | genre | type | episodes | rating | members |
---|---|---|---|---|---|---|
32281 | Kimi no Na wa. | Drama, Romance, School, Supernatural | Movie | 1 | 9.37 | 200630 |
5114 | Fullmetal Alchemist: Brotherhood | Action, Adventure, Drama, Fantasy, Magic, Military, Shounen | TV | 64 | 9.26 | 793665 |
28977 | Gintama° | Action, Comedy, Historical, Parody, Samurai, Sci-Fi, Shounen | TV | 51 | 9.25 | 114262 |
9253 | Steins;Gate | Sci-Fi, Thriller | TV | 24 | 9.17 | 673572 |
9969 | Gintama' | Action, Comedy, Historical, Parody, Samurai, Sci-Fi, Shounen | TV | 51 | 9.16 | 151266 |
32935 | Haikyuu!!: Karasuno Koukou VS Shiratorizawa Gakuen Koukou | Comedy, Drama, School, Shounen, Sports | TV | 10 | 9.15 | 93351 |
Ratings.csv
Stores information about ratings given to each anime by a user.

user_id | anime_id | rating |
---|---|---|
1 | 20 | -1 |
1 | 24 | -1 |
1 | 79 | -1 |
1 | 226 | -1 |
1 | 241 | -1 |
1 | 355 | -1 |
The ratings range from 1 to 10, with 10 being the best. A rating of -1 means that the user did not provide a rating for that item.
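Since -1 is a placeholder rather than a real score, it should probably be recoded before computing any averages or similarities. The sketch below shows one way to do this in base R. The three rows for user 1 mirror the sample above, while the two rows for user 3 are hypothetical values added only so the example has some real ratings.

```r
# Small stand-in for the ratings file; user 3's rows are hypothetical.
ratings <- data.frame(
  user_id  = c(1, 1, 1, 3, 3),
  anime_id = c(20, 24, 79, 20, 154),
  rating   = c(-1, -1, -1, 8, 6)
)

# Recode the -1 placeholder as NA so it cannot distort numeric summaries.
ratings$rating[ratings$rating == -1] <- NA

# Keep only rows with an explicit rating for modelling.
rated <- ratings[!is.na(ratings$rating), ]

mean(rated$rating)  # mean over explicit ratings only
```

Whether to drop the unrated rows or keep them as implicit "watched" signals is a design choice I have not settled yet.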
The data is freely available to the public and was obtained from Kaggle. A more condensed version can also be found on Kaggle under MyAnimeList.
The goal is to recommend anime and make predictions about a user's taste, specifically what a user will want to watch or buy in the future.
This project will be an exploration of modern recommender systems with the use of
The system will be implemented in R, with the data split into a training set and a test set at a ratio of 80:20. The error of each model will be reported as root mean square error (RMSE).
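As a rough sketch of this evaluation setup, the base R code below performs a random 80:20 split and computes RMSE. The data is randomly generated for illustration only, the helper name `rmse` is my own, and the global-mean predictor is just a placeholder baseline, not the planned recommender.

```r
set.seed(42)  # make the random split reproducible

# Randomly generated stand-in for the cleaned ratings (illustration only).
n <- 100
ratings <- data.frame(
  user_id  = sample(1:10, n, replace = TRUE),
  anime_id = sample(1:20, n, replace = TRUE),
  rating   = sample(1:10, n, replace = TRUE)
)

# Random 80:20 split by row.
train_idx <- sample(seq_len(n), size = floor(0.8 * n))
train_set <- ratings[train_idx, ]
test_set  <- ratings[-train_idx, ]

# Root mean square error between observed and predicted ratings.
rmse <- function(actual, predicted) {
  sqrt(mean((actual - predicted)^2))
}

# Placeholder baseline: predict the training-set mean for every test rating.
baseline <- mean(train_set$rating)
rmse(test_set$rating, rep(baseline, nrow(test_set)))
```

Any model's predictions on `test_set` can be passed to `rmse` in the same way, which makes this baseline a useful floor for comparison.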