Social network mining is one of the key problems in the YELP dataset challenge (http://www.yelp.com/dataset_challenge).
Mining the YELP social graph can reveal social influencers and subcommunities with desired commercial preferences.
In this project, I analyze the network of YELP users (nodes) and their friendship relationships (edges).
To generate the network of the YELP community, I used the data in the file “yelp_academic_dataset_user.json”.
Network analysis consisted of three steps: (i) analysis of node-degree distribution, (ii) identification of social hubs, (iii) calculation of the shortest paths connecting any two YELP users in a connected component of the network.
Network construction and analysis were done using the R package igraph.
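The graph construction and the first two analysis steps can be sketched as follows. The analysis itself was done in R with igraph; the Python below is only a minimal standard-library illustration of the same idea, not the code used in the project. It assumes that each line of the user file is a JSON object with a `user_id` field and a `friends` field holding a list of user ids; these field names and the list layout are assumptions, and the function names and sample records are hypothetical.

```python
import json
from collections import Counter


def build_friend_graph(json_lines):
    """Return an undirected adjacency map {user_id: set of friend ids}."""
    graph = {}
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        user = record["user_id"]             # assumed field name
        friends = record.get("friends", [])  # assumed: list of user ids
        graph.setdefault(user, set())
        for friend in friends:
            if friend == user:
                continue  # ignore self-loops
            # Friendship is treated as symmetric, so add both directions.
            graph[user].add(friend)
            graph.setdefault(friend, set()).add(user)
    return graph


def degree_distribution(graph):
    """Map each degree k to the number of users with exactly k friends."""
    return Counter(len(neighbors) for neighbors in graph.values())


def find_hubs(graph, min_degree):
    """Return users with at least min_degree friends, highest degree first."""
    hubs = [user for user, nbrs in graph.items() if len(nbrs) >= min_degree]
    return sorted(hubs, key=lambda user: len(graph[user]), reverse=True)


# Made-up records for illustration only (not taken from the dataset).
sample_lines = [
    '{"user_id": "a", "friends": ["b", "c", "d"]}',
    '{"user_id": "b", "friends": ["a"]}',
    '{"user_id": "c", "friends": ["a", "d"]}',
    '{"user_id": "d", "friends": ["a", "c"]}',
    '{"user_id": "e", "friends": []}',
]
sample_graph = build_friend_graph(sample_lines)
sample_degrees = degree_distribution(sample_graph)
sample_hubs = find_hubs(sample_graph, min_degree=3)
```

On the real file, the same functions could be applied to the open file handle, and the degree counts plotted on log-log axes to inspect the shape of the distribution.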
The node-degree distribution shows power-law behavior, which is characteristic of scale-free networks.
Due to the presence of hubs (users with 2000+ friends), YELP users can be connected in a small number of steps (2-3).
The key finding of this work is that the network of YELP users is scale-free, consistent with other social and biological networks.
An important feature of scale-free networks is the small-world property: any two users can be connected in a small number of steps.
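The role of hubs in shortening paths can be illustrated with a breadth-first search, which finds the minimum number of steps between two users in an unweighted graph. This is a minimal sketch in plain Python, not the igraph routine used in the analysis; the function name and the toy graph are hypothetical.

```python
from collections import deque


def shortest_path_length(graph, source, target):
    """Return the number of steps on a shortest path, or None if unreachable.

    graph is an adjacency map {node: set of neighboring nodes}.
    """
    if source == target:
        return 0
    visited = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == target:
                return dist + 1
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None  # source and target lie in different connected components


# Toy graph: "h" acts as a hub, "z" is an isolated user.
toy_graph = {
    "h": {"a", "b", "c", "d"},
    "a": {"h", "b"},
    "b": {"h", "a"},
    "c": {"h"},
    "d": {"h"},
    "z": set(),
}
steps_via_hub = shortest_path_length(toy_graph, "a", "d")
steps_direct = shortest_path_length(toy_graph, "a", "b")
steps_unreachable = shortest_path_length(toy_graph, "a", "z")
```

In the toy graph, users who are not direct friends are still two steps apart because both are connected to the hub, while the isolated user cannot be reached at all, which is why path lengths are only defined within a connected component.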
Further research in this direction may allow decomposing the network into subcommunities with desired commercial preferences, which may be used for targeted marketing strategies.