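Load dplyr, which provides %>%, filter(), select(), and arrange() used below, then read the data into R.
library(dplyr)
x <- read.csv("cluster_reading_anon.csv", stringsAsFactors = FALSE)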
Subset the data frame to one class (Campus A), keeping the student nickname, book title, author, publisher, genre, and the stars the student awarded the book.
y <- x %>%
  filter(Campus == "A") %>%
  select(2:6, 9)
Convert character variables to factor variables.
y[,1:5] <- lapply(y[,1:5], as.factor)
Check the structure of the new dataframe.
str(y)
## 'data.frame': 108 obs. of 6 variables:
## $ Nickname : Factor w/ 20 levels "atsuhito","fumiya",..: 12 12 8 8 8 8 8 8 8 20 ...
## $ Book_title: Factor w/ 87 levels "a chrismas carol",..: 57 40 2 4 3 79 43 86 84 28 ...
## $ Author : Factor w/ 69 levels "alex raynham",..: 37 63 13 19 57 8 58 36 36 28 ...
## $ Publisher : Factor w/ 6 levels "Cambridge","Macmillan",..: 6 5 5 4 1 5 5 4 4 3 ...
## $ Genre : Factor w/ 13 levels "action adventure",..: 3 3 5 9 7 1 5 11 9 1 ...
## $ Stars : int 1 1 2 2 1 2 3 3 3 4 ...
Create a student-by-title contingency table, used here as a document-feature matrix: each cell counts how many times a student read a given title.
Title_dfm <- table(y$Nickname, y$Book_title)
Likewise, create tables for author, publisher, genre, and stars.
Author_dfm <- table(y$Nickname, y$Author)
Publisher_dfm <- table(y$Nickname, y$Publisher)
Genre_dfm <- table(y$Nickname, y$Genre)
Stars_dfm <- table(y$Nickname, y$Stars)
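Bind the five tables column-wise. Note that cbind() returns a matrix rather than a data frame, with one row per student and one column per feature value.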
Cluster_df <- cbind(Author_dfm, Genre_dfm, Publisher_dfm, Title_dfm, Stars_dfm)
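To make the shape of this matrix concrete, here is a minimal sketch on made-up data. The students, genres, and star values are hypothetical and are not taken from the class data; only table() and cbind() are used, as above.

```r
# Hypothetical toy data: three students, four reading records
toy <- data.frame(
  Nickname = factor(c("aki", "aki", "ben", "chie")),
  Genre    = factor(c("fantasy", "mystery", "fantasy", "romance")),
  Stars    = c(1, 2, 1, 3)
)

# Each table() call counts how often a student (row) occurs
# together with a feature value (column)
genre_counts <- table(toy$Nickname, toy$Genre)
stars_counts <- table(toy$Nickname, toy$Stars)

# cbind() joins the tables side by side into one matrix,
# with one row per student and one column per feature value
toy_matrix <- cbind(genre_counts, stars_counts)
print(toy_matrix)
```

Each row is a student's profile of counts, which is what the distance calculation in the next step compares.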
Plot a dendrogram based on students’ book reading interests.
Cluster_df %>% dist %>% hclust %>% plot
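The pipeline above relies on defaults: dist() computes Euclidean distance and hclust() uses complete linkage unless told otherwise. The sketch below spells these defaults out on a small hypothetical count matrix (the student names and counts are invented for illustration).

```r
# Hypothetical count matrix: rows are students, columns are genres
m <- rbind(
  aki  = c(fantasy = 2, mystery = 0, romance = 0),
  ben  = c(fantasy = 2, mystery = 1, romance = 0),
  chie = c(fantasy = 0, mystery = 0, romance = 3)
)

d  <- dist(m, method = "euclidean")   # the default distance in dist()
hc <- hclust(d, method = "complete")  # the default linkage in hclust()

print(round(d, 2))
plot(hc, main = "Toy reading-interest dendrogram")
```

Students whose count profiles are close, here aki and ben, are joined low in the tree, while chie joins at a greater height. Other distance or linkage methods may give a different tree, so the choice is worth stating when reporting results.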

Why is Kakuto different from the others?
x %>% filter(Nickname == "kakuto") %>% select(Book_title, Genre, Stars)
## Book_title Genre Stars
## 1 big hair day fantasy 1
## 2 a death in oxford mystery 1
## 3 let me out! fantasy 1
## 4 next door to love romance 2
## 5 book boy biography 1
## 6 why? historical fiction 1
## 7 help! fantasy 1
How about Shimpei and Shintaro?
x %>%
  filter(Nickname == "shimpei" | Nickname == "shintaro") %>%
  select(Nickname, Book_title, Genre, Stars) %>%
  arrange(Book_title, Genre, Stars)
## Nickname Book_title Genre Stars
## 1 shintaro a tale of two cities historical fiction 2
## 2 shimpei anna and the fighter other 2
## 3 shimpei dangerous journey action adventure 2
## 4 shintaro dangerous journey action adventure 4
## 5 shimpei l.a.raid mystery 2
## 6 shintaro marco non-fiction 3
## 7 shimpei marco young adult 2
## 8 shimpei picture puzzle mystery 2
## 9 shimpei project omega mystery 2
## 10 shimpei the house on the hill romance 2
## 11 shintaro the house on the hill romance 2
## 12 shintaro the man in the iron mask action adventure 2
## 13 shintaro the three masketeers action adventure 2
Comment: Shimpei and Shintaro both read “Dangerous Journey”, “Marco”, and “The House on the Hill”. They like similar genres and awarded similar ratings, mostly two stars.
Jun and Hide?
x %>%
  filter(Nickname == "jun" | Nickname == "hide") %>%
  select(Nickname, Book_title, Genre, Stars) %>%
  arrange(Book_title, Genre, Stars)
## Nickname Book_title Genre Stars
## 1 jun ali and his camera young adult 1
## 2 hide alice in wonderland fantasy 2
## 3 hide american life other 5
## 4 hide extreme sports sport 2
## 5 jun jennifer lopez non-fiction 3
## 6 hide jojo's story non-fiction 4
## 7 hide michael jordan biography 5
## 8 jun michael jordan sport 1
## 9 jun new york other 3
## 10 hide new york other 4
## 11 jun sadie's big day at the office other 1
## 12 jun the fireboy children's literature 1
## 13 hide the mummy returns fantasy 2
## 14 hide the swiss family robinson action adventure 3
Comment: Both Jun and Hide read “Michael Jordan” and “New York”, and like to read non-fiction including sport and biography.
Ryo and Tomo.
x %>%
  filter(Nickname == "ryo" | Nickname == "tomo") %>%
  select(Nickname, Book_title, Genre, Stars) %>%
  arrange(Book_title, Genre, Stars)
## Nickname Book_title Genre Stars
## 1 tomo american life other 5
## 2 ryo marcel and the shakespeare letters children's literature 1
## 3 ryo six sketches children's literature 1
Comment: Ryo and Tomo didn’t read much.
One last pair, Mai and Rio.
x %>%
  filter(Nickname == "mai" | Nickname == "rio") %>%
  select(Nickname, Book_title, Genre, Stars) %>%
  arrange(Book_title, Genre, Stars)
## Nickname Book_title Genre Stars
## 1 rio a chrismas carol fantasy 3
## 2 rio a midsummer night's dream fantasy 3
## 3 mai a midsummmer night’s dream fantasy 3
## 4 mai hamlet classical literature 4
## 5 mai strong medicine mystery 3
## 6 rio strong medicine mystery 3
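Comment: Mai and Rio both read “A Midsummer Night’s Dream” and “Strong Medicine”, and like fantasy and mystery. Note that the first title is spelled differently in their two records, so the title table counts it as two separate books.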
Comment: Returning to Kakuto, he awarded one star to six of his seven books, so he apparently didn’t enjoy his selections.
Conclusion: When students discuss their graded readers in class, a dendrogram based on reading interests may be used to form groups. Students within a cluster may have more in common with one another than with the rest of the class, so this arrangement can serve as an occasional alternative to self-selection or random assignment.
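One way the grouping idea in the conclusion might be put into practice is to cut the tree into a fixed number of clusters with cutree(). This is only a sketch on a hypothetical matrix; the student names, the counts, and the choice of k = 2 are assumptions for illustration, not part of the analysis above.

```r
# Hypothetical count matrix: rows are students, columns are features
m <- rbind(
  aki  = c(2, 0, 0),
  ben  = c(2, 1, 0),
  chie = c(0, 0, 3),
  dai  = c(0, 1, 3)
)

hc <- hclust(dist(m))        # Euclidean distance, complete linkage (defaults)
groups <- cutree(hc, k = 2)  # assign each student to one of two clusters

# List the members of each discussion group
print(split(names(groups), groups))
```

With the class data, the same call on the tree built from Cluster_df would return one group label per student. The number of groups k would need to be chosen to suit the class size.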