Why?

I would prefer not to have to do this kind of self-promoting analysis, essentially showing off how cool I am etc., but given the persistent personal attacks, it seems necessary. Essentially, I download my own Twitter network data (inbound and outbound follower links), then filter the accounts by the content of their self-descriptions to show that, indeed, lots of competent mainstream researchers follow me online. Thus, critics are wrong to assert that essentially all mainstream scientists ignore me and my work. What mainstream researchers don’t often do is cite my work, but that has to do with the topics I study (very sensitive) and my odd publication habits, which are a conscious choice. However, because critics are rather annoying, I will publish a few more papers in mainstream outlets just to prove the point. Afterwards, I can hopefully go back to publishing in open science journals and get on with the science without having to worry too much about the plausibility of these personal attacks.

Additionally, I think analyzing your own place in the network of cool people is interesting. One can also look among one’s cool followers to find cool people to follow back. When doing this, it is useful to filter out people one already follows, but this is not done below.

One could do more comprehensive network analyses by downloading the data for e.g. all members of the Intelligence journal editorial board who are on Twitter (about 50% of them IIRC), and mapping out their follow network and finding my place in it. This would provide a network centrality measure, a useful indicator of how mainstream a given person is.
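As a rough sketch of what such a centrality measure could look like, the snippet below computes normalized in-degree (the share of other accounts in the set that follow a given account) from a follow edge list. The edge list and account names are made up for illustration, and in-degree is only one of several possible centrality measures; nothing here uses real data.

```r
# Toy follow network: each row means `from` follows `to`.
# All account names are hypothetical.
edges <- data.frame(
  from = c("a", "a", "b", "c", "c", "d", "me"),
  to   = c("b", "me", "me", "a", "me", "a", "a"),
  stringsAsFactors = FALSE
)

# Every account that appears anywhere in the network
accounts <- sort(unique(c(edges$from, edges$to)))

# In-degree: how many accounts in the set follow each account
in_degree <- table(factor(edges$to, levels = accounts))

# Normalize by the number of possible followers (everyone except oneself)
centrality <- as.numeric(in_degree) / (length(accounts) - 1)
names(centrality) <- accounts

sort(centrality, decreasing = TRUE)
```

With real data, the edge list would come from the follower and friend lists of each account in the chosen set, restricted to links between members of that set.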

Init

Load libraries etc.

options(digits = 2)
library(pacman)
p_load(twitteR, kirkegaard, dplyr, lubridate, fmsb, stringi)
source("hide.R")

Useful functions

Some ad hoc useful functions.

# add_metrics -------------------------------------------------------------

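#centile function: rescales ranks to the range 0-100
#rank_method = NULL falls back to rank()'s default ties method ("average")
as_centile = function(x, descending = TRUE, rank_method = NULL) {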
  #reverse
  if (!descending) x = x * -1
  
  #ranks
  x_ranks = rank(x, ties.method = rank_method) %>% 
    #subtract 1, so first rank is 0, not 1
    `-`(1)
  
  #divide by max rank
  x_centiles = x_ranks / max(x_ranks)
  x_centiles * 100
}

c(1, 1, 1, 2, 2, 3, 4, 5, NA, NaN, Inf, -Inf) %>% as_centile
##  [1]  18  18  18  41  41  55  64  73  91 100  82   0

add_metrics = function(df) {
  df %>% mutate(
    #age in days (units set explicitly, since the default "auto" units depend on the data)
    age_days = difftime(lubridate::now(), created, units = "days") %>% as.numeric,
    age_days_centile = age_days %>% as_centile,
    
    #activity
    tweets_per_day = statusesCount / age_days,
    tweets_per_day_centile = tweets_per_day %>% as_centile,
    
    #followers per tweets per day
    followers_per_tpd = followersCount / tweets_per_day,
    followers_per_tpd_centile = followers_per_tpd %>% as_centile,
    followers_per_tweet = followersCount / statusesCount,
    followers_per_tweet_centile = followers_per_tweet %>% as_centile,
    followers_per_day = followersCount / age_days,
    followers_per_day_centile = followers_per_day %>% as_centile,
    
    #followers per other-follow
    in_out_ratio = followersCount / friendsCount,
    in_out_ratio_centile = in_out_ratio %>% as_centile
  )
}

# Restructure
twit_to_df = function(x) {
  lapply(x, function(l) {
    #convert to df
    l = l$toDataFrame()
    
    #reorder columns
    dplyr::select(l, name, screenName, everything())
    
  }) %>%
    ldf_to_df()
}

#searching
#open the matching rows in the data viewer
browse_by_regex = function(x, regex) {
  x %>% dplyr::filter(stri_detect_regex(description, regex, case_insensitive = TRUE)) %>% View()
}

#return the matching rows
output_by_regex = function(x, regex) {
  x %>% dplyr::filter(stri_detect_regex(description, regex, case_insensitive = TRUE))
}
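To get a feel for what the description search catches, here is a small sketch that runs the "science people in general" pattern used later in this post against a few made-up profile descriptions. It uses base R `grepl` as a stand-in for `stri_detect_regex(..., case_insensitive = TRUE)`; for a simple pattern like this the two should agree, but that is an assumption rather than something checked against the real data.

```r
# Made-up profile descriptions, for illustration only
descriptions <- c(
  "Professor of Psychology",
  "PhD student in genetics",
  "Postdoc in neuroscience",
  "Grad student, statistics",
  "Software developer and dad"
)

# Pattern used later in this post for "science people in general"
pattern <- "university|ph\\.?d|(post-doc)|professor|(grad(uate)? student)"

# Base R stand-in for stri_detect_regex(..., case_insensitive = TRUE)
hits <- grepl(pattern, descriptions, ignore.case = TRUE, perl = TRUE)

data.frame(description = descriptions, hit = hits)
```

In this toy run, "Postdoc" written without a hyphen is not matched, so the counts from this pattern are probably on the conservative side.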

Data

Log in with secrets, not shown in output. Then get the data and restructure it.

#setup_twitter_oauth() #this call isn't shown for security reasons.

# download data -------------------------------------------------
#Emil's user object
emil = getUser('KirkegaardEmil')

#his network
emil_friends = emil$getFriends()
emil_followers = emil$getFollowers()

#tweets
#emil_tweets = twitteR::

#dataframe versions
d_emil_friends = twit_to_df(emil_friends) %>% add_metrics()
d_emil_followers = twit_to_df(emil_followers) %>% add_metrics()

Find science people among my followers

Finally, we search by regex to find interesting people. When looking for cool people to follow back, it is also useful to filter out the ones one already follows.

#professors
d_emil_followers %>% hide_some_people %>% output_by_regex("professor")
#science people in general
d_emil_followers %>% hide_some_people %>% output_by_regex("university|ph\\.?d|(post-doc)|professor|(grad(uate)? student)")
#psychology people
d_emil_followers %>% hide_some_people %>% output_by_regex("psych(ology)?")
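The follow-back filter mentioned above is not applied in this post, but it could be sketched with an anti-join on `screenName`. The two data frames below are toy stand-ins for `d_emil_followers` and `d_emil_friends`, with made-up accounts.

```r
library(dplyr)

# Toy stand-ins for d_emil_followers and d_emil_friends (hypothetical accounts)
followers <- data.frame(
  screenName  = c("alice", "bob", "carol"),
  description = c("Professor of psychology", "PhD student", "Gardener"),
  stringsAsFactors = FALSE
)
friends <- data.frame(
  screenName = c("alice", "dave"),
  stringsAsFactors = FALSE
)

# Keep only followers whose screenName does not appear among the friends,
# i.e. followers who are not yet followed back
not_followed_back <- anti_join(followers, friends, by = "screenName")

not_followed_back$screenName
```

With the real data, the same step would presumably be `d_emil_followers %>% anti_join(d_emil_friends, by = "screenName")`, placed before the regex filter.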

So we see that Emil, notorious pseudoscientist etc., manages to attract lots of very competent people and mainstream researchers as followers. The question for people trying to argue for the crank model is: why? I cannot think of any plausible answer to this question. It is certainly not true that mainstream academics don’t interact with me or read my work (which I post on Twitter all the time).