Twitter Pre-Module Assignment - Network Analytics

1. Objective and Data Description

For this assignment we will use Twitter data to explore network graphs. The dataset was collected in July 2014, using four major Twitter accounts (Snapchat, Dropbox and two others) as seed nodes and collecting their 4,000 most recent followers. The same was done for each of these followers. Then, for each of these accounts, all the accounts they follow were collected. The dataset includes 59,974 Twitter users (nodes) and 73,277 follower relationships (edges).

The data is available in the following files:

    1. graph_complete.txt: provides the edges of the graph in the form from => to. Each line is an edge where each node is separated by a space.
    2. graph_subset.txt: a subset of the complete network. This file contains roughly 1% of the total number of edges (randomly selected). Each line is an edge where each node is separated by a space.
    3. ids_to_usernames.csv: maps the integer ids given in the two data files to the actual Twitter usernames of the users in our dataset. There are two comma-separated fields in this file: the integer id and the string username.
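As a minimal sketch of the two file layouts described above, the following reads a few made-up lines from inline text rather than from the real files. The ids, the names, and the column names `from` and `to` are illustrative assumptions, not taken from the dataset.

```r
# Toy stand-ins for the real files; every id and name below is made up.
edge_text <- "4 0\n5 0\n4 1"
id_text   <- "id,name\n0,alpha\n1,beta\n4,gamma\n5,delta"

# Edge list: one edge per line, two space-separated node ids (from, to).
edges_demo <- read.table(text = edge_text, header = FALSE,
                         col.names = c("from", "to"))

# Id-to-username map: two comma-separated fields (a header row is assumed here).
id_map_demo <- read.csv(text = id_text, header = TRUE,
                        stringsAsFactors = FALSE)

nrow(edges_demo)                             # number of edges read
id_map_demo$name[id_map_demo$id == 4]        # username for id 4
```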

2. Import Libraries

For this assignment, we will use the igraph package for network analysis.

pacman::p_load(igraph, DT, ggplot2, dplyr)

3. Network Structure Visualization

Let us first plot the network by using the information in the file graph_subset.txt. Note that this is not the complete network, but only a subset of its edges. By visualizing the graph, we can get an idea of the structure of the network we will be working on.

For better visualization, we treat this data file as being in ncol format and set directed=TRUE. We also set vertex.size=1, vertex.color="green", layout=layout_with_kk (Kamada-Kawai), vertex.label=NA, edge.arrow.size=.4, edge.curved=.1 and edge.width=.5.

graph_subset <- read_graph("graph_subset.txt", format="ncol", directed = TRUE)
plot.igraph(graph_subset, 
            vertex.size = 1, 
            vertex.color = "green", 
            layout = layout_with_kk, 
            vertex.label = NA, 
            edge.arrow.size = .4,
            edge.curved = .1,
            edge.width = .5,
            main = "Subset of Twitter Network (Kamada Kawai)")

Let us also try other graph layouts to see the network from different perspectives. The igraph package offers many layouts, listed below. We pick five alternatives: in circle, on sphere, Fruchterman-Reingold, nicely and tree.

layouts <- grep("^layout_", ls("package:igraph"), value=TRUE)[-1] 
layouts
##  [1] "layout_as_bipartite"  "layout_as_star"       "layout_as_tree"      
##  [4] "layout_components"    "layout_in_circle"     "layout_nicely"       
##  [7] "layout_on_grid"       "layout_on_sphere"     "layout_randomly"     
## [10] "layout_with_dh"       "layout_with_drl"      "layout_with_fr"      
## [13] "layout_with_gem"      "layout_with_graphopt" "layout_with_kk"      
## [16] "layout_with_lgl"      "layout_with_mds"      "layout_with_sugiyama"
plot.igraph(graph_subset, 
            vertex.size = 1, 
            vertex.color = "green", 
            layout = layout_in_circle, 
            vertex.label = NA, 
            edge.arrow.size = .4,
            edge.curved = .1,
            edge.width = .5,
            main = "Subset of Twitter Network (Layout In Circle)")

plot.igraph(graph_subset, 
            vertex.size = 1, 
            vertex.color = "green", 
            layout = layout_on_sphere, 
            vertex.label = NA, 
            edge.arrow.size = .4,
            edge.curved = .1,
            edge.width = .5,
            main = "Subset of Twitter Network (Layout on Sphere)")

plot.igraph(graph_subset, 
            vertex.size = 1, 
            vertex.color = "green", 
            layout = layout_with_fr, 
            vertex.label = NA, 
            edge.arrow.size = .4,
            edge.curved = .1,
            edge.width = .5,
            main = "Subset of Twitter Network (Fruchterman-Reingold)")

plot.igraph(graph_subset, 
            vertex.size = 1, 
            vertex.color = "green", 
            layout = layout_nicely, 
            vertex.label = NA, 
            edge.arrow.size = .4,
            edge.curved = .1,
            edge.width = .5,
            main = "Subset of Twitter Network (Layout Nicely)")

graph_subset1 <- read_graph("graph_subset.txt", format="ncol", directed = FALSE)
plot.igraph(graph_subset1, 
            vertex.size = 1, 
            vertex.color = "green", 
            layout = layout_as_tree(graph_subset1, circular = TRUE),
            vertex.label = NA, 
            edge.arrow.size = .4,
            edge.curved = .1,
            edge.width = .5,
            main = "Subset of Twitter Network (Tree)")
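The plotting calls above differ only in the layout and the title, so they could be wrapped in a small helper. The sketch below is one possible way to do that; `plot_layout` is a hypothetical name, the ring graph is a made-up stand-in for graph_subset, and `edge.width` is used as igraph's edge-width parameter.

```r
library(igraph)

# Hypothetical helper: compute a layout, plot with the shared settings,
# and return the coordinates invisibly.
plot_layout <- function(g, layout_fun, title) {
  coords <- layout_fun(g)
  plot(g,
       vertex.size = 1,
       vertex.color = "green",
       layout = coords,
       vertex.label = NA,
       edge.arrow.size = .4,
       edge.curved = .1,
       edge.width = .5,
       main = title)
  invisible(coords)
}

# Toy graph standing in for graph_subset.
g_demo <- make_ring(10, directed = TRUE)

pdf(NULL)  # draw to a null device so no file is written
coords_circle <- plot_layout(g_demo, layout_in_circle, "Demo (Layout In Circle)")
coords_fr     <- plot_layout(g_demo, layout_with_fr, "Demo (Fruchterman-Reingold)")
dev.off()
```

With the real data, the call would take graph_subset in place of g_demo.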

We can make two interesting observations based on a visual inspection of the graphs. First, the nodes appear to be organised around a small number of hubs. The subset cannot be fully connected, however: as the counts reported below show, it has 1,042 vertices but only 734 edges, and a connected graph on n vertices needs at least n - 1 edges. Second, there appear to be four key nodes, each at the centre of a large cluster that contains a significant fraction of all the nodes. This is likely because the data was collected from four major Twitter accounts.
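Connectivity can also be checked directly instead of judged by eye. The following is a minimal sketch on a made-up toy graph, using igraph's `is_connected()` and `components()`; the same calls could be applied to graph_subset.

```r
library(igraph)

# Toy directed graph on 5 vertices with 2 edges: 1->2 and 3->4.
# Vertex 5 is isolated.
g_toy <- make_graph(c(1, 2, 3, 4), n = 5, directed = TRUE)

# Weak connectivity ignores edge direction.
connected <- is_connected(g_toy, mode = "weak")
comp <- components(g_toy, mode = "weak")

# Each edge can merge at most two components, so a graph has at least
# vcount - ecount components.
lower_bound <- vcount(g_toy) - ecount(g_toy)

connected     # FALSE
comp$no       # 3 components: {1, 2}, {3, 4}, {5}
lower_bound   # 3
```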

Let us take a closer look at the graph. We begin by examining the user names. The table below shows the first few user names.

usernames <- read.csv("ids_to_usernames.csv", stringsAsFactors = FALSE)
head(usernames)
##   id            name
## 1  0        Snapchat
## 2  1 insomniacevents
## 3  2         Dropbox
## 4  3  olympiacos_org
## 5  4          24_ida
## 6  5           IDDVx
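Since igraph stores vertex names as character strings, looking up a username for a vertex needs a conversion back to the integer id. A small sketch using the first four rows printed above; the names `usernames_demo`, `vertex_ids` and `looked_up` are illustrative.

```r
# First four rows of the username table shown above.
usernames_demo <- data.frame(
  id   = c(0, 1, 2, 3),
  name = c("Snapchat", "insomniacevents", "Dropbox", "olympiacos_org"),
  stringsAsFactors = FALSE
)

# Vertex names as they would come out of a named graph (character strings).
vertex_ids <- c("2", "0")

# match() returns the row position of each id in the lookup table.
looked_up <- usernames_demo$name[match(as.integer(vertex_ids), usernames_demo$id)]
looked_up   # "Dropbox" "Snapchat"
```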

Next, we take a look at the relationships among the users.

V(graph_subset) # prints the vertices (i.e. Twitter users) 
## + 1042/1042 vertices, named, from 2bc1c4e:
##    [1] 70    0     186   219   477   530   543   545   636   746   931  
##   [12] 940   955   957   1115  1281  1355  1482  1753  1899  1957  2445 
##   [23] 2664  2794  3067  3389  3590  3710  3754  3771  3863  3896  3964 
##   [34] 3970  4050  1     4144  4232  4264  4362  4403  4404  4441  4473 
##   [45] 4497  4711  4807  4973  5098  5100  5108  5134  5169  5262  5303 
##   [56] 5396  5435  5439  5571  5616  5704  5714  5779  5826  5977  6034 
##   [67] 6168  6234  6256  6315  6327  6522  6621  7029  7049  7157  7221 
##   [78] 7341  7370  7497  7532  7787  7850  7995  8051  2     8098  8209 
##   [89] 8352  8359  8426  8539  8552  8562  8594  8758  8996  8998  9045 
##  [100] 9377  9411  9426  9471  9494  9506  9591  9816  9898  9971  10054
## + ... omitted several vertices
E(graph_subset) # prints the edges (i.e. relationships)
## + 734/734 edges from 2bc1c4e (vertex names):
##  [1] 70  ->0 186 ->0 219 ->0 477 ->0 530 ->0 543 ->0 545 ->0 636 ->0
##  [9] 746 ->0 931 ->0 940 ->0 955 ->0 957 ->0 1115->0 1281->0 1355->0
## [17] 1482->0 1753->0 1899->0 1957->0 2445->0 2664->0 2794->0 3067->0
## [25] 3389->0 3590->0 3710->0 3754->0 3771->0 3863->0 3896->0 3964->0
## [33] 3970->0 4050->1 4144->1 4232->1 4264->1 4362->1 4403->1 4404->1
## [41] 4441->1 4473->1 4497->1 4711->1 4807->1 4973->1 5098->1 5100->1
## [49] 5108->1 5134->1 5169->1 5262->1 5303->1 5396->1 5435->1 5439->1
## [57] 5571->1 5616->1 5704->1 5714->1 5779->1 5826->1 5977->1 6034->1
## [65] 6168->1 6234->1 6256->1 6315->1 6327->1 6522->1 6621->1 7029->1
## [73] 7049->1 7157->1 7221->1 7341->1 7370->1 7497->1 7532->1 7787->1
## + ... omitted several edges
deg_in <- degree(graph_subset, mode="in") # Number of incoming relationships among users 
deg_in_sorted <- sort.int(deg_in, decreasing=TRUE, index.return=FALSE)
head(deg_in_sorted)
##    1    2    0    3 4406 7049 
##   49   49   33   33    4    3
(summary(deg_in))
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.0000  0.0000  0.0000  0.7044  1.0000 49.0000
deg_out <- degree(graph_subset, mode="out") # Number of outgoing relationships among users 
deg_out_sorted <- sort.int(deg_out, decreasing=TRUE, index.return=FALSE)
head(deg_out_sorted) 
##    1 8599 5137 6738 1564 1288 
##   13    3    3    3    3    3
(summary(deg_out))
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.0000  0.0000  1.0000  0.7044  1.0000 13.0000
ggplot() + geom_point(aes(c(1:length(deg_in)),deg_in)) +
  ylab("Incoming Relationships") +
  xlab("Users") + 
  ggtitle("Number of Incoming Relationships among Users") +
  annotate("text", x = 100, y = 47, label = "insomniacevents (1)") +
  annotate("text", x = 150, y = 51, label = "Dropbox (2)") +
  annotate("text", x = 70, y = 30, label = "Snapchat (0)") +
  annotate("text", x = 200, y = 35, label = "olympiacos_org (3)")

ggplot() + geom_point(aes(c(1:length(deg_out)),deg_out)) +
  ylab("Outgoing Relationships") +
  xlab("Users") +
  ggtitle("Number of Outgoing Relationships among Users") +
  annotate("text", x = 150, y = 12, label = "insomniacevents (1)") 

There are 1,042 vertices (i.e. Twitter users) and 734 edges (i.e. relationships) between them. We see that nodes 1 (insomniacevents), 2 (Dropbox), 0 (Snapchat) and 3 (olympiacos_org) stand out as having a significantly higher number of incoming relationships (49, 49, 33 and 33 respectively). The other nodes have 4 or fewer incoming relationships. Some nodes have no incoming relationships. The average number of incoming relationships per node is 0.7.

As for outgoing relationships, we see that node 1 (insomniacevents) stands out as having 13 outgoing relationships while the rest have 3 or fewer. Again, some nodes have no outgoing relationships. The average number of outgoing relationships per node is 0.7.
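The two averages coincide because every edge contributes exactly one incoming and one outgoing relationship, so both means equal the number of edges divided by the number of vertices. A quick check with the counts reported above:

```r
n_vertices <- 1042   # vertices in the subset graph
n_edges    <- 734    # edges in the subset graph

# Mean in-degree = mean out-degree = edges / vertices.
mean_degree <- n_edges / n_vertices
round(mean_degree, 4)   # 0.7044, the mean in both summaries above
```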

4. Data Analysis

We will now examine the complete graph contained in the file graph_complete.txt and the username file ids_to_usernames.csv.

We want to plot the distribution of the number of followers of each user in our dataset (x-axis number of followers, y-axis number of nodes). For each edge, user a is said to be a follower of user b if there is some edge a => b.

  1. We start by counting the number of followers of each user using the table command. This is the same as the in-degree of each node in the graph. The table below shows the users with the most followers. Again, we see that nodes 0 (Snapchat), 1 (insomniacevents), 2 (Dropbox) and 3 (olympiacos_org) stand out, with 4,020 followers each. The other nodes have 120 or fewer followers.
graph_complete <- read.table("graph_complete.txt", header = FALSE)
followers <- as.data.frame(table(graph_complete$V2))
names(followers) <- c("Node", "Number of Followers")
followers_usernames <- merge.data.frame(followers, usernames, by.x = 'Node', by.y = 'id')
followers_sorted <- followers_usernames %>% 
  arrange(desc(`Number of Followers`))
head(followers_sorted)
##    Node Number of Followers            name
## 1     0                4020        Snapchat
## 2     1                4020 insomniacevents
## 3     2                4020         Dropbox
## 4     3                4020  olympiacos_org
## 5 11705                 120       elcapimar
## 6  1288                 120       Kill_Joy7

Of the 59,973 users, 27,457 have at least 1 follower. This means that 32,516 (= 59,973 - 27,457) users have no followers. The average number of incoming relationships per node is 1.222.

graph_complete1 <- read_graph("graph_complete.txt", format="ncol", directed = TRUE)
deg_in_df <- data.frame(followers = degree(graph_complete1, mode="in"), userid = names(degree(graph_complete1)))
deg_in_df_sorted <- sort.int(deg_in_df$followers, decreasing=TRUE, index.return=FALSE)
table(deg_in_df_sorted)
## deg_in_df_sorted
##     0     1     2     3     4     5     6     7     8     9    10    11 
## 32516 24454  1368   434   249   142    83    65    57    39    27    30 
##    12    13    14    15    16    17    18    19    20    21    22    23 
##    21    21    12    16    14    10     8    10    18     4     1     7 
##    24    25    26    27    28    29    30    31    32    33    34    35 
##     7     6     3     4     4     4     1     5     3     6     3     5 
##    36    37    38    39    40    41    42    43    45    46    47    48 
##     3     3     4     3    85     3     1     3     1     1     4     1 
##    49    50    51    52    53    55    56    57    58    59    60    61 
##     4     1     4     1     1     1     2     1     3     1    55     1 
##    63    65    67    68    70    73    75    77    79    80    86    96 
##     2     1     1     1     2     1     3     2     2    37     1     1 
##    97    99   100   102   108   111   116   120  4020 
##     1     2    32     1     1     2     1    31     4
summary(deg_in_df_sorted)
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
##    0.000    0.000    0.000    1.222    1.000 4020.000
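The table command and the in-degree agree for every node that has at least one follower, but table() never sees nodes that do not occur in the target column, which is why the zero-follower users only show up in the degree-based count. A minimal sketch on a made-up edge list (the names `edges_demo`, `g_demo`, `tab_in` and `deg_in_demo` are illustrative):

```r
library(igraph)

# Made-up edge list: a, b and c follow x; a also follows b.
edges_demo <- data.frame(from = c("a", "b", "c", "a"),
                         to   = c("x", "x", "x", "b"),
                         stringsAsFactors = FALSE)
g_demo <- graph_from_data_frame(edges_demo, directed = TRUE)

# table() only counts nodes that occur in the target column ...
tab_in <- table(edges_demo$to)
# ... whereas degree() also reports the zeros.
deg_in_demo <- degree(g_demo, mode = "in")

# Nodes with no followers = all nodes minus those appearing as a target.
n_zero <- sum(deg_in_demo == 0)
n_zero   # 2 (nodes a and c)
```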

We can validate this by examining the in-degree of each node.

V(graph_complete1)  
## + 59973/59973 vertices, named, from 310da36:
##     [1] 4     0     5     6     7     8     9     10    11    12    13   
##    [12] 14    15    16    17    18    19    20    21    22    23    24   
##    [23] 25    26    27    28    29    30    31    32    33    34    35   
##    [34] 36    37    38    39    40    41    42    43    44    45    46   
##    [45] 47    48    49    50    51    52    53    54    55    56    57   
##    [56] 58    59    60    61    62    63    64    65    66    67    68   
##    [67] 69    70    71    72    73    74    75    76    77    78    79   
##    [78] 80    81    82    83    84    85    86    87    88    89    90   
##    [89] 91    92    93    94    95    96    97    98    99    100   101  
##   [100] 102   103   104   105   106   107   108   109   110   111   112  
## + ... omitted several vertices
E(graph_complete1) 
## + 73277/73277 edges from 310da36 (vertex names):
##   [1] 4  ->0 5  ->0 6  ->0 7  ->0 8  ->0 9  ->0 10 ->0 11 ->0 12 ->0 13 ->0
##  [11] 14 ->0 15 ->0 16 ->0 17 ->0 18 ->0 19 ->0 20 ->0 21 ->0 22 ->0 23 ->0
##  [21] 24 ->0 25 ->0 26 ->0 27 ->0 28 ->0 29 ->0 30 ->0 31 ->0 32 ->0 33 ->0
##  [31] 34 ->0 35 ->0 36 ->0 37 ->0 38 ->0 39 ->0 40 ->0 41 ->0 42 ->0 43 ->0
##  [41] 44 ->0 45 ->0 46 ->0 47 ->0 48 ->0 49 ->0 50 ->0 51 ->0 52 ->0 53 ->0
##  [51] 54 ->0 55 ->0 56 ->0 57 ->0 58 ->0 59 ->0 60 ->0 61 ->0 62 ->0 63 ->0
##  [61] 64 ->0 65 ->0 66 ->0 67 ->0 68 ->0 69 ->0 70 ->0 71 ->0 72 ->0 73 ->0
##  [71] 74 ->0 75 ->0 76 ->0 77 ->0 78 ->0 79 ->0 80 ->0 81 ->0 82 ->0 83 ->0
##  [81] 84 ->0 85 ->0 86 ->0 87 ->0 88 ->0 89 ->0 90 ->0 91 ->0 92 ->0 93 ->0
##  [91] 94 ->0 95 ->0 96 ->0 97 ->0 98 ->0 99 ->0 100->0 101->0 102->0 103->0
## + ... omitted several edges
deg_in1 <- degree(graph_complete1, mode="in") 
deg_in1_sorted <- sort.int(deg_in1, decreasing=TRUE, index.return=FALSE)
head(deg_in1_sorted)
##    0    1    2    3  266  336 
## 4020 4020 4020 4020  120  120
ggplot() + geom_point(aes(c(1:length(deg_in1)),deg_in1)) +
  ylab("Incoming Relationships") +
  xlab("Users") + 
  ggtitle("Number of Incoming Relationships among Users") +
  annotate("text", x = 20000, y = 3500, label = 'Nodes 0-3 stand out with 4,020 followers each.')

  2. Next, we apply the same process to count the number of accounts each user (node) follows, that is, the number of outgoing relationships (the out-degree) of each node in the graph.
nodes <- as.data.frame(table(graph_complete$V1))
names(nodes) <- c("Node", "Number of Users")
nodes_sorted <- nodes %>% 
  arrange(desc(`Number of Users`))
head(nodes_sorted)
##   Node Number of Users
## 1    1            1976
## 2  173             121
## 3  655             121
## 4 1212             121
## 5 1288             121
## 6 1804             121

We see that node 1 (insomniacevents) stands out as having 1,976 outgoing relationships while the rest have 121 or fewer. Of the 59,973 users, 37,679 have at least 1 outgoing relationship. This means that 22,294 (= 59,973 - 37,679) users have no outgoing relationships. The average number of outgoing relationships per node is 1.222.

deg_out_df <- data.frame(followers = degree(graph_complete1, mode="out"), userid = names(degree(graph_complete1)))
deg_out_df_sorted <- sort.int(deg_out_df$followers, decreasing=TRUE, index.return=FALSE)
table(deg_out_df_sorted)
## deg_out_df_sorted
##     0     1     2     3     4     5     6     7     8     9    10    11 
## 22294 36155   725   146    53    29    18    12     6     4     8    12 
##    12    13    14    15    16    17    18    19    20    21    23    24 
##     9     4     5     5     4     5     1     2     3    22     1     2 
##    25    26    27    28    29    30    32    33    34    36    37    38 
##     4     2     1     1     2     1     1     2     1     1     4     1 
##    40    41    42    44    45    46    47    48    49    50    51    52 
##     2   123     2     3     1     2     1     5     1     1     3     2 
##    53    54    55    56    59    60    61    62    63    67    70    72 
##     2     2     2     1     2     2    68     2     1     1     3     1 
##    73    77    78    80    81    84    89    90    95    98    99   100 
##     1     1     1     2    59     1     1     2     1     3     2     1 
##   101   104   105   113   121  1976 
##    59     1     2     2    54     1
summary(deg_out_df_sorted)
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
##    0.000    0.000    1.000    1.222    1.000 1976.000

And we can validate this by examining the out-degree of each node.

followers_out <- data.frame(followers = degree(graph_complete1, mode = "out"), 
                            userid = names(degree(graph_complete1)))
followers_out_usernames <- merge.data.frame(followers_out, usernames, by.x = 'userid', by.y = 'id')
followers_out_usernames_sorted <- followers_out_usernames %>% 
  arrange(desc(followers))
head(followers_out_usernames_sorted)
##   userid followers            name
## 1      1      1976 insomniacevents
## 2  10098       121   RachelNIntuit
## 3  10181       121      TnahaSingh
## 4  10193       121      tuprasanti
## 5  10350       121           xwx28
## 6  10856       121     AutotaskLen
deg_out1 <- degree(graph_complete1, mode="out") 
deg_out1_sorted <- sort.int(deg_out1, decreasing=TRUE, index.return=FALSE)
head(deg_out1_sorted)
##    1  173  655 1212 1288 1804 
## 1976  121  121  121  121  121
ggplot() + geom_point(aes(c(1:length(deg_out1)),deg_out1)) +
  ylab("Outgoing Relationships") +
  xlab("Users") + 
  ggtitle("Number of Outgoing Relationships among Users") +
  annotate("text", x = 18000, y = 1800, label = 'Node 1 stands out with 1,976 outgoing relationships.')

Combining the two, we can see that node 1 (insomniacevents) has the highest total number of relationships (5,996). Dropbox, olympiacos_org and Snapchat have the next three highest totals. The rest have 241 or fewer relationships.

followers_total <- data.frame(followers = degree(graph_complete1, mode = "total"), 
                            userid = names(degree(graph_complete1)))
followers_total_usernames <- merge.data.frame(followers_total, usernames, by.x = 'userid', by.y = 'id')
followers_total_usernames_sorted <- followers_total_usernames %>% 
  arrange(desc(followers))
head(followers_total_usernames_sorted)
##   userid followers            name
## 1      1      5996 insomniacevents
## 2      2      4125         Dropbox
## 3      3      4073  olympiacos_org
## 4      0      4071        Snapchat
## 5  11705       241       elcapimar
## 6   1288       241       Kill_Joy7
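The total degree used above is simply the sum of the in-degree and the out-degree; for node 1 this is 4,020 + 1,976 = 5,996. A small sketch on a made-up directed graph illustrates the identity:

```r
library(igraph)

# Toy directed graph with edges 1->2, 2->1 and 3->1.
g_toy <- make_graph(c(1, 2, 2, 1, 3, 1), directed = TRUE)

d_in    <- degree(g_toy, mode = "in")
d_out   <- degree(g_toy, mode = "out")
d_total <- degree(g_toy, mode = "total")

# Total degree equals in-degree plus out-degree for every vertex.
all(d_total == d_in + d_out)   # TRUE
```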
  3. We can now plot the distribution. We want the number of followers on the x-axis and, on the y-axis, how many users have that number of followers. We use the geom_density function in ggplot2, which shows a smoothed density rather than raw counts, so the counts are also tabulated after the plot. There are 32,516 users with no followers, 24,454 users with 1 follower and 4 users with 4,020 followers.
ggplot(deg_in_df, aes(x=followers)) +
  geom_density(fill="seagreen3") +
  labs(title="Density Distribution of Followers",
       x = "Number of Followers",
       y="Density") 

count <- data.frame(table((followers$`Number of Followers`)))
count
##    Var1  Freq
## 1     1 24454
## 2     2  1368
## 3     3   434
## 4     4   249
## 5     5   142
## 6     6    83
## 7     7    65
## 8     8    57
## 9     9    39
## 10   10    27
## 11   11    30
## 12   12    21
## 13   13    21
## 14   14    12
## 15   15    16
## 16   16    14
## 17   17    10
## 18   18     8
## 19   19    10
## 20   20    18
## 21   21     4
## 22   22     1
## 23   23     7
## 24   24     7
## 25   25     6
## 26   26     3
## 27   27     4
## 28   28     4
## 29   29     4
## 30   30     1
## 31   31     5
## 32   32     3
## 33   33     6
## 34   34     3
## 35   35     5
## 36   36     3
## 37   37     3
## 38   38     4
## 39   39     3
## 40   40    85
## 41   41     3
## 42   42     1
## 43   43     3
## 44   45     1
## 45   46     1
## 46   47     4
## 47   48     1
## 48   49     4
## 49   50     1
## 50   51     4
## 51   52     1
## 52   53     1
## 53   55     1
## 54   56     2
## 55   57     1
## 56   58     3
## 57   59     1
## 58   60    55
## 59   61     1
## 60   63     2
## 61   65     1
## 62   67     1
## 63   68     1
## 64   70     2
## 65   73     1
## 66   75     3
## 67   77     2
## 68   79     2
## 69   80    37
## 70   86     1
## 71   96     1
## 72   97     1
## 73   99     2
## 74  100    32
## 75  102     1
## 76  108     1
## 77  111     2
## 78  116     1
## 79  120    31
## 80 4020     4

Let’s see the distribution more clearly by dropping the outliers.

followers_outliers <- deg_in_df[(deg_in_df$followers<4020),] # remove outliers
  
ggplot(followers_outliers, aes(x=followers)) +
  geom_density(fill="seagreen3") +
  labs(title="Density Distribution of Followers (without outliers)",
       x = "Number of Followers",
       y="Density") 

  4. Let’s transform the x-axis of the previous graph to a log scale to get a better understanding of the distribution. Note that some users have 0 followers, and log(0) is -Inf in R, so these users could not be placed on a log axis. Hence, we replace 0 with 0.1.
deg_in_df[(deg_in_df$followers ==0),'followers'] <- 0.1

ggplot(deg_in_df, aes(x=log(followers))) +
  geom_density(fill="seagreen3") +
  labs(title="Log-Density Distribution of Followers",
       x = "log(Number of Followers)",
       y="Density") 
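One caveat of replacing zeros in place is that any later summary of the same column sees the shifted values. A sketch on made-up numbers of keeping the raw counts intact by transforming a copy; `log1p()` is shown as one possible alternative, not as what the analysis above uses.

```r
followers_raw <- c(0, 0, 1, 3)   # made-up follower counts

# Work on a copy so the raw counts stay available for later summaries.
followers_log <- followers_raw
followers_log[followers_log == 0] <- 0.1

mean(followers_raw)   # 1
mean(followers_log)   # 1.05, slightly larger because zeros became 0.1

# log1p(x) = log(1 + x) is finite at zero, so no replacement is needed.
log1p_values <- log1p(followers_raw)
```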

Let’s also see the log-density distribution more clearly by dropping the outliers.

followers_outliers[(followers_outliers$followers == 0),'followers'] <- 0.1
  
ggplot(followers_outliers, aes(x=log(followers))) +
  geom_density(fill="seagreen3") +
  labs(title="Log-Density Distribution of Followers (without outliers)",
       x = "log(Number of Followers)",
       y="Density") 

The distribution is highly skewed to the right. Most users have very few followers, but a few users have many. This is commonly seen in social networks. The skewness shows up as a low mean but a high standard deviation of followers (see the mean and standard deviation reported below).

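  5. The average number of followers is 1.276 per user and the standard deviation is 33.25. Note that deg_in_df is used here after its zeros were replaced with 0.1 for the log plot, which inflates the mean slightly; the mean of the raw in-degrees is 1.222, as reported by summary() earlier. The large dispersion around the mean reflects the skewness of the distribution.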
round(mean(deg_in_df$followers),3)
## [1] 1.276
round(sd(deg_in_df$followers),2)
## [1] 33.25
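The mean and standard deviation can also be recovered from a frequency table of degrees, without expanding it back to one row per user. A minimal sketch on made-up counts, checked against `sd()` on the expanded vector (all names below are illustrative):

```r
# Made-up frequency table: deg_counts[i] users have deg_values[i] followers.
deg_values <- c(0, 1, 2, 10)
deg_counts <- c(5, 3, 1, 1)

n <- sum(deg_counts)

# Weighted mean: total followers divided by total users.
m <- sum(deg_values * deg_counts) / n

# Sample standard deviation (n - 1 denominator, as used by sd()).
s <- sqrt(sum(deg_counts * (deg_values - m)^2) / (n - 1))

m   # 1.5
# Same result as expanding the table back to one value per user.
all.equal(s, sd(rep(deg_values, deg_counts)))   # TRUE
```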
  6. Finally, we report the Twitter usernames of the top 10 users with the most followers. Because many users are tied, we print every user whose follower count is among the 10 highest distinct values.
followers_top10 <- followers_sorted[followers_sorted$`Number of Followers` %in% 
                                      unique(followers_sorted$`Number of Followers`)[1:10],]
followers_top10
##     Node Number of Followers            name
## 1      0                4020        Snapchat
## 2      1                4020 insomniacevents
## 3      2                4020         Dropbox
## 4      3                4020  olympiacos_org
## 5  11705                 120       elcapimar
## 6   1288                 120       Kill_Joy7
## 7  14978                 120     JohnIrons95
## 8   1804                 120  sydneyhbrodsky
## 9   2337                 120     Can1ffs_bae
## 10  2370                 120   mcclainxkylie
## 11   266                 120        091Jacko
## 12  2796                 120 redBONEmango_YA
## 13   336                 120      byers_alec
## 14  3419                 120 glasstablegirlz
## 15  3883                 120  hannabaldovino
## 16  3916                 120    jillian_1015
## 17  4098                 120        kjrasing
## 18  4139                 120   SongsThisWeek
## 19  4146                 120       AminGa123
## 20  4164                 120     fatimeDuraj
## 21  4293                 120     MarvinaPete
## 22  4959                 120      djshawnjay
## 23  5183                 120    Massy_DeeJay
## 24  6313                 120       Nique_Kee
## 25  6588                 120        Enzolibo
## 26  6716                 120         JayFulk
## 27  6984                 120        miklit23
## 28  7300                 120   tha__symbolic
## 29  7343                 120    __rossgeller
## 30  7382                 120     jesse_jamal
## 31  7525                 120     BOXFEDmusic
## 32  7686                 120      deephousCo
## 33  7970                 120          GOvOSE
## 34  8673                 120   PetjaSairanen
## 35  9513                 120     Asit_Tewari
## 36 11163                 111          namfow
## 37  3355                 108      sammeyer98
## 38  7448                 102   djMikeHawkins
## 39  1100                 100  adamzfurniture
## 40 11298                 100      cocaine345
## 41 11444                 100     Steenberger
## 42 11686                 100   Brendon_Smale
## 43  1212                 100       brady1995
## 44 13443                 100 ArgirisKarvelas
## 45 13959                 100     ComEstadios
## 46 14141                 100      Tommac1602
## 47   173                 100       LuArDanni
## 48  1940                 100 MaricelaSGarcia
## 49   220                 100         gabider
## 50  2432                 100         thetli8
## 51  2881                 100     jodiieAnnex
## 52  3343                 100   haley_harloff
## 53  3958                 100   hesperfection
## 54  4406                 100          CTG757
## 55  4730                 100    Baharum135LC
## 56  5265                 100      MikeBeEasy
## 57  5634                 100    15serranista
## 58  5687                 100         _Lyss18
## 59  5758                 100    Goombaaabeth
## 60  6459                 100      jeremysk82
## 61  6491                 100  Callie_Mccleod
## 62  6665                 100  AlanSalazarx_O
## 63  7049                 100      paulinaxo_
## 64  7583                 100        17Sihota
## 65  7878                 100   bunnytheraver
## 66  8162                 100    Natashamarko
## 67  9040                 100 cristinaportalu
## 68  9141                 100        HTMLCOIN
## 69  9266                 100 marialu16955736
## 70  9812                 100        pbower10
## 71  2581                  99       David_Oz5
## 72  2675                  99    jujuchaudin7
## 73  4036                  97      Sean_Chell
## 74  7397                  96       istekovic
## 75  6925                  86    HardwellOnMe

Alvin ENG Han Wen (hwe210@stern.nyu.edu)

14 October 2017