Concept
- Analyze a user’s engagement with his friends.
- Assumption: closer friends will engage most with the user.
- Quantify every friend in terms of ‘like_count’ and ‘comment_count’ on the user’s status messages.
- Map each user to a point in 2-dimensional space (like count and comment count).
- Visualize the proximity of these points (Facebook users).
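The quantification step above can be sketched in base R. This is only an illustration: the `engagements` records, the friend names, and the use of Euclidean distance as a first look at proximity are assumptions made for the example, not part of the original project.

```r
# Hypothetical raw records: one row per like or comment on the user's
# status messages. Names and values are made up for illustration.
engagements <- data.frame(
  friend = c("alice", "bob", "alice", "carol", "bob", "alice"),
  type   = c("like", "like", "comment", "like", "comment", "like"),
  stringsAsFactors = FALSE
)

# Cross-tabulate friend against engagement type to get per-friend counts.
counts <- table(
  engagements$friend,
  factor(engagements$type, levels = c("like", "comment"))
)

# One row per friend: a point in (like_count, comment_count) space.
friends <- data.frame(
  friend        = rownames(counts),
  like_count    = as.integer(counts[, "like"]),
  comment_count = as.integer(counts[, "comment"]),
  stringsAsFactors = FALSE
)

# Pairwise Euclidean distances between the points (assumed measure).
coords <- as.matrix(friends[, c("like_count", "comment_count")])
rownames(coords) <- friends$friend
distances <- as.matrix(dist(coords))

print(friends)
print(distances)
```

Friends with similar like and comment counts end up close together in this space, which is what the visualization is meant to reveal.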
Features
- Provide an access token for the user’s Facebook account so that his friends can be clustered.
- Visualize his friends lying in different clusters.
- Select the number of clusters he wants his friends to be grouped into.
- View the list of friends that belong to a particular cluster.
- Configure the number of friends to be viewed in the cluster.
- View the cluster to which the user himself belongs.
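As a rough sketch of the last three features, the helpers below list a limited number of friends from one cluster and look up the user's own cluster. The `clustered` data frame, the function names, and the choice to rank members by like count are all hypothetical; the original does not say how the displayed friends are chosen.

```r
# Hypothetical clustering result: one row per person with a cluster label.
clustered <- data.frame(
  name       = c("me", "alice", "bob", "carol", "dave", "erin"),
  like_count = c(0, 25, 30, 2, 1, 28),
  cluster    = c(2, 1, 1, 2, 2, 1),
  stringsAsFactors = FALSE
)

# Return at most n members of a cluster, most active first
# (ranking by like_count is an assumption for this example).
friends_in_cluster <- function(data, cluster_id, n = 10) {
  members <- data[data$cluster == cluster_id, ]
  members <- members[order(-members$like_count), ]
  head(members$name, n)
}

# Return the cluster label of a given person, e.g. the user himself.
user_cluster <- function(data, user_name) {
  data$cluster[data$name == user_name]
}

top_two    <- friends_in_cluster(clustered, cluster_id = 1, n = 2)
my_cluster <- user_cluster(clustered, "me")
print(top_two)
print(my_cluster)
```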
Algorithm
- The user’s status data is collected using the Facebook Graph API.
- Likes, comments, and the friends who liked or commented are fetched.
- The raw data is then aggregated to find the total like count and comment count for every friend of the user, as well as for the user himself.
- The aggregated data is then modelled using a random forest.
- The random forest’s proximity matrix is used to measure the closeness of users.
- The proximity matrix is fed to the k-means algorithm to return the required number of clusters.
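The last three steps might look roughly like the following in R. This is a minimal sketch under stated assumptions: the `friends` data is invented, and running `randomForest` without a response variable (its unsupervised mode) is an assumption, since the original does not say how the forest was fitted. The values of `ntree`, `k`, and `nstart` are arbitrary.

```r
library(randomForest)

set.seed(42)  # both the forest and k-means are randomized

# Hypothetical aggregated data: one row per person.
friends <- data.frame(
  like_count    = c(25, 30, 28, 27, 2, 1, 3, 12, 14, 11),
  comment_count = c(10, 12,  9, 11, 0, 1, 0,  5,  4,  6)
)

# With no response variable, randomForest runs in unsupervised mode.
# proximity = TRUE asks for the N x N proximity matrix, where entry
# (i, j) reflects how often rows i and j share a terminal node.
rf   <- randomForest(x = friends, ntree = 500, proximity = TRUE)
prox <- rf$proximity

# Treat each row of the proximity matrix as a feature vector and
# cluster the rows with k-means.
k  <- 3
km <- kmeans(prox, centers = k, nstart = 10)

print(km$cluster)
```

Clustering the rows of the proximity matrix groups people whose patterns of closeness to everyone else are similar, rather than clustering on the raw counts directly.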
Example cluster of a Facebook user
## randomForest 4.6-7
## Type rfNews() to see new features/changes/bug fixes.
## Loading required package: lattice
## Loading required package: ggplot2
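# Scatter plot of all users, coloured by their k-means cluster.
plot(rfd, col = km$cluster, xlab = 'commentIndex', ylab = 'likeIndex')
# Mark the user's own position with a large 'X'.
text(rfd[userIndex, 1], rfd[userIndex, 2], labels = 'X', cex = 3, col = 624)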
