PRESENTATION679

Dingxian Cao, Yida Yin, Peiyang Yu, Vy Nguyen, Kirthanaa Raghuraman

10/5/2016

stackexchange.com

  • 318 communities, e.g. stats.stackexchange.com
  • 8 subfolders: Badges, Comments, PostHistory, PostLinks, Posts, Tags, Users, Votes
> use stats_stackexchange_com
> show collections
Badges Comments PostHistory PostLinks Posts Tags Users Votes

> db.Posts.findOne()
{
    "@Id" : "1",
    "@PostTypeId" : "1",
    "@AcceptedAnswerId" : "15",
    "@CreationDate" : "2010-07-19T19:12:12.510",
    "@ViewCount" : "2054",
    "@Body" : "<p>How should I elicit prior distributions from experts when fitting a Bayesian model?</p>\n",
    "@OwnerUserId" : "8",
    "@Title" : "Eliciting priors from experts",
    "@Tags" : "<bayesian><prior><elicitation>"
}
  • data size: 200 GB
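
The `@Tags` field in the record above packs all of a post's tags into one string, with each tag wrapped in angle brackets. A minimal sketch of splitting that string into a list follows. `parseTags` is a hypothetical helper name, not part of the original analysis, and the sketch assumes tag names never contain `<` or `>`.

```javascript
// Hypothetical helper: split a Stack Exchange "@Tags" string such as
// "<bayesian><prior><elicitation>" into an array of tag names.
// Assumption: tag names never contain "<" or ">".
function parseTags(tagString) {
  if (!tagString) return [];                  // no "@Tags" field on this post
  const matches = tagString.match(/<([^<>]+)>/g) || [];
  return matches.map((m) => m.slice(1, -1));  // drop the surrounding brackets
}

// The two relevant fields of the post shown above.
const post = { "@Id": "1", "@Tags": "<bayesian><prior><elicitation>" };
console.log(parseTags(post["@Tags"]));
```

Because the function is plain JavaScript, the same body could in principle be pasted into the mongo shell and applied to documents returned by `db.Posts.find()`.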

Tag Frequency

head(arrange(d, desc(count)), 10)
##          tags  count
## 1  javascript 303928
## 2        java 237520
## 3     android 199823
## 4         php 191217
## 5          c# 174990
## 6      python 169329
## 7      jquery 135530
## 8        html 132973
## 9         ios 106888
## 10        css  91274
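
A table like the one above can be built by counting how often each tag occurs across all posts and sorting by count. The sketch below shows that counting step on a few made-up `@Tags` strings. The sample strings and the `countTags` helper are illustrative assumptions, not the data or code behind the table.

```javascript
// Hypothetical helper: count tag occurrences over a list of "@Tags" strings
// and return rows sorted by descending count, like arrange(d, desc(count)).
function countTags(tagStrings) {
  const counts = new Map();
  for (const s of tagStrings) {
    // Each tag is wrapped in angle brackets, e.g. "<java><android>".
    for (const m of (s || "").matchAll(/<([^<>]+)>/g)) {
      counts.set(m[1], (counts.get(m[1]) || 0) + 1);
    }
  }
  return [...counts.entries()]
    .map(([tags, count]) => ({ tags, count }))
    .sort((a, b) => b.count - a.count);
}

// Made-up sample input, for illustration only.
const sample = ["<javascript><jquery>", "<java><android>", "<javascript><html>", "<javascript>"];
console.log(countTags(sample).slice(0, 10)); // top 10 rows, like head(..., 10)
```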

Tag wordcloud

wordcloud(d$tags, d$count, max.words = 100, scale = c(8, 0.2), min.freq = -Inf,
          colors = colors, random.order = FALSE, random.color = FALSE, ordered.colors = FALSE)

Tag Network

plot(g, layout = layout.circle(g), vertex.size = d / max(d) * 50, edge.curved = TRUE,
     vertex.label.font = 2, vertex.label.dist = 1, vertex.label.cex = 1.2)
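
The slides do not show how `g` or `d` were built. One common construction, assumed here, links two tags whenever they appear on the same post and takes `d` to be each tag's degree, which the plot call then rescales so the largest vertex has size 50. The sketch below illustrates that idea on made-up tag lists. `buildTagNetwork` and `vertexSizes` are hypothetical names, and tag names are assumed not to contain `|`.

```javascript
// Assumed construction (not shown in the slides): tags are vertices, and two
// tags share an edge when they occur together on at least one post.
function buildTagNetwork(tagLists) {
  const edges = new Set();            // undirected edges stored as "a|b" with a < b
  for (const tags of tagLists) {
    const unique = [...new Set(tags)].sort();
    for (let i = 0; i < unique.length; i++) {
      for (let j = i + 1; j < unique.length; j++) {
        edges.add(unique[i] + "|" + unique[j]);
      }
    }
  }
  // Degree = number of distinct neighbours of each tag.
  const degree = new Map();
  for (const e of edges) {
    for (const tag of e.split("|")) {
      degree.set(tag, (degree.get(tag) || 0) + 1);
    }
  }
  return { edges: [...edges], degree };
}

// Scale degrees the same way as vertex.size = d / max(d) * 50.
function vertexSizes(degree) {
  const max = Math.max(...degree.values());
  const sizes = {};
  for (const [tag, deg] of degree) sizes[tag] = (deg / max) * 50;
  return sizes;
}

// Made-up tag lists, for illustration only.
const net = buildTagNetwork([
  ["javascript", "jquery"],
  ["javascript", "html", "css"],
  ["html", "css"],
]);
console.log(net.edges, vertexSizes(net.degree));
```

Storing each edge as a sorted pair keeps the graph undirected and collapses repeated co-occurrences into a single edge. A weighted variant would count the repeats instead.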