Introduction

The pace of publication on COVID-19 has been staggering. Here we begin to describe the proliferation of the published literature.

We used the National Library of Medicine's search string:
> ((wuhan[All Fields] AND (“coronavirus”[MeSH Terms] OR “coronavirus”[All Fields])) AND 2019/12[PDAT] : 2030[PDAT]) OR 2019-nCoV[All Fields] OR 2019nCoV[All Fields] OR COVID-19[All Fields] OR SARS-CoV-2[All Fields]

Daily new cases of COVID-19 were extracted from the Our World in Data website.

Dataset Construction

Records were excluded if their electronic publication date (DEP) fell on or before January 1, 2020, or on or after June 8, 2020, the last Monday before the search query was run. The R code used for dataset construction is shown below.
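The date window can be illustrated in isolation. The sketch below assumes, as in the dataset construction code, that both boundary comparisons are strict, so the boundary dates themselves are dropped; the dates in `dep` are made up for illustration and are not study data.

```r
# Hypothetical electronic publication dates (not from the study data)
dep <- as.Date(c("2019-12-31", "2020-01-01", "2020-01-02",
                 "2020-06-07", "2020-06-08", "2020-06-09"))

# Strict comparisons, mirroring filter(DEP > "2020-01-01") and
# filter(DEP < "2020-06-08") in the dataset construction code
keep <- dep > as.Date("2020-01-01") & dep < as.Date("2020-06-08")

# Show which dates survive the window
print(data.frame(dep, keep))
kept <- dep[keep]
```

Under these assumptions only January 2 and June 7 are retained from the six example dates.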

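# Packages assumed from the functions called in this report
library(dplyr)          # %>%, filter, select, mutate, group_by, summarise, left_join
library(lubridate)      # isoweek, as_date
library(ggplot2)        # plots
library(compareGroups)  # compareGroups, createTable, export2md

setwd("C:/Users/jkempke/Box Sync/Side Projects/COVID/Global Publications/Data")
load("covid_pubmed_20200609_1000.RData")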

# Some initial cleaning: restrict to the study window, keep the fields used
# below, and derive weekday, ISO week, language name and adjusted publication type
pub_db1 <- pub_db %>%
    filter(DEP > "2020-01-01") %>%
    filter(DEP < "2020-06-08") %>%
    select(LA, PST, PL, DEP, PMID, PT, PT2, TA) %>%
    mutate(
        day = weekdays(DEP),
        LA = droplevels(LA),
        PL = droplevels(PL),
        week = isoweek(DEP),
        week.cat = as.factor(week),
        # Map language codes to language names; any other code becomes NA
        LA = case_when(
            LA == "chi" ~ "Chinese",
            LA == "dut" ~ "Dutch",
            LA == "eng" ~ "English",
            LA == "fre" ~ "French",
            LA == "ger" ~ "German",
            LA == "hun" ~ "Hungarian",
            LA == "ita" ~ "Italian",
            LA == "pol" ~ "Polish",
            LA == "por" ~ "Portuguese",
            LA == "rus" ~ "Russian",
            LA == "spa" ~ "Spanish",
            LA == "swe" ~ "Swedish",
            TRUE ~ NA_character_
        ),
        PT = as.character(PT),
        PT2 = as.character(PT2),
        # Use PT when PT2 is empty, otherwise PT2 (NA when PT2 is missing)
        PT.adj = if_else(PT2 == "", PT, if_else(!is.na(PT2), PT2, NA_character_))
    ) %>%
    # Case reports keep their primary publication type
    mutate(PT.adj = if_else(PT == "Case Reports", PT, PT.adj))

# Data frame for examining frequencies by journal
pub_db.journal <- pub_db1 %>% group_by(TA) %>% summarise(N = n())

pub_db.journal <- pub_db.journal %>% mutate(Total.N = sum(N), Percent = round(100 * 
    N/Total.N, digits = 1)) %>% dplyr::select(-Total.N) %>% filter(Percent > 0.4) %>% 
    arrange(desc(Percent))

# Data frame for associating each week with the date of its corresponding Monday
date <- seq(as.Date("2020/01/06"), by = "weeks", length.out = 22)
mondays <- as.data.frame(date)
mondays <- mondays %>% mutate(day = "Monday", week = isoweek(date))

# Create time series for graphing: one aggregated by day, another by week
pub_db.day <- pub_db1 %>% group_by(DEP) %>% summarise(N = n()) %>% mutate(Cum.N = cumsum(N), 
    day = weekdays(DEP))

pub_db.week <- pub_db1 %>% group_by(week) %>% summarise(N = n()) %>% mutate(Cum.N = cumsum(N))

# Exclude the first two weeks of January (ISO weeks 2 and 3) as outliers with low publication counts.
pub_db.week.trimmed <- filter(pub_db.week, week > 3)

pub_db.weekts <- left_join(pub_db.week, mondays, by = "week")
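# Now import the global COVID-19 daily new case numbers and merge.
# Caution: if the file contains aggregate rows (for example a "World" location),
# summing new_cases over all rows below would count those cases twice.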
global.covid <- read.csv("owid-covid-data.csv")

global.covid <- global.covid %>% dplyr::select(date, new_cases) %>% mutate(date = as.character(date), 
    Date = as_date(date), week = isoweek(Date)) %>% filter(Date > "2020-01-01") %>% 
    filter(Date < "2020-06-08")


global.covid.day <- global.covid %>% group_by(Date) %>% summarise(N = sum(new_cases)) %>% 
    mutate(Cum.N = cumsum(N), day = weekdays(Date))

global.covid.week <- global.covid %>% group_by(week) %>% summarise(Cases = sum(new_cases)) %>% 
    mutate(Cum.Cases = cumsum(Cases))


weekts <- left_join(pub_db.weekts, global.covid.week, by = "week")
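The week-to-Monday lookup used in the code above can be checked in isolation. The sketch below uses only base R and assumes that `format(x, "%V")` returns the ISO 8601 week number on the platform in use (the report itself relies on `lubridate::isoweek`). The `weekly` counts are made-up illustration values, and `weekly_dated` is a hypothetical name.

```r
# Mondays from 2020-01-06, one per week for 22 weeks (same sequence as the report)
mondays <- data.frame(date = seq(as.Date("2020-01-06"), by = "week", length.out = 22))

# ISO 8601 week number and ISO weekday (1 = Monday); both are locale independent
mondays$week <- as.integer(format(mondays$date, "%V"))
mondays$iso_weekday <- as.integer(format(mondays$date, "%u"))

# Hypothetical weekly counts keyed by ISO week (not study data)
weekly <- data.frame(week = c(2L, 12L, 23L), N = c(10L, 20L, 30L))

# Left join: keep every row of `weekly` and attach the Monday that starts its week
weekly_dated <- merge(weekly, mondays[, c("week", "date")], by = "week", all.x = TRUE)
print(weekly_dated)
```

Because the study window lies entirely within 2020, the week number alone identifies a week. A window spanning a year boundary would need the ISO year in the join key as well.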

Time Series

  • General statistics (weekly publication counts, all 22 weeks):
    • Weekly mean (SD) = 803.32 (778.40)
    • 10th percentile = 17.2
    • 25th percentile = 90
    • 50th percentile = 473.5
    • 75th percentile = 1626.5
    • 90th percentile = 1771.3
  • Excluding the first 2 weeks of January 2020 (ISO weeks 2 and 3):
    • Weekly mean (SD) = 883.45 (771.58)
    • 10th percentile = 74.8
    • 25th percentile = 159
    • 50th percentile = 748.5
    • 75th percentile = 1693.5
    • 90th percentile = 1775.8
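The figures above can be reproduced from the weekly counts listed in the week.cat rows of the summary table under Publication Characteristics. The sketch below is one way to do so, not necessarily the code used for the report: it assumes R's default quantile method (type 7), and `summarise_weeks` is a hypothetical helper name.

```r
# Weekly publication counts for ISO weeks 2 to 23, copied from the week.cat rows
# of the summary table in this report
n_week <- c(3, 1, 15, 37, 79, 80, 120, 172, 182, 234, 400, 547, 950, 1107,
            1306, 1684, 1775, 1722, 2284, 1783, 1738, 1454)
names(n_week) <- 2:23

probs <- c(0.10, 0.25, 0.50, 0.75, 0.90)

# Hypothetical helper: mean, SD and selected percentiles of a vector of counts
summarise_weeks <- function(x) {
    list(mean = mean(x), sd = sd(x), quantiles = quantile(x, probs = probs))
}

# All 22 weeks, then with ISO weeks 2 and 3 removed
all_weeks <- summarise_weeks(n_week)
trimmed <- summarise_weeks(n_week[as.integer(names(n_week)) > 3])

print(all_weeks)
print(trimmed)
```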
ggplot(pub_db.day, aes(x = DEP, fill = day)) + geom_bar(aes(y = N), stat = "identity", 
    color = "black") + theme_bw() + labs(x = "Electronic Publication Date", y = "Number of Publications") + 
    scale_x_date(breaks = "1 week", date_labels = "%b %d") + theme(axis.title = element_text(size = 14), 
    axis.text.x = element_text(angle = 60, hjust = 1, size = 12), axis.text.y = element_text(size = 12))

ggplot(pub_db.weekts, aes(x = date, y = N)) + geom_bar(stat = "identity", color = "black") + 
    theme_bw() + geom_text(aes(label = N), nudge_y = 500) + scale_x_date(breaks = "1 week", 
    date_labels = "%b %d") + labs(x = "Week of Electronic Publication Date", y = "Number of Publications") + 
    theme(axis.title = element_text(size = 14), axis.text.x = element_text(angle = 60, 
        hjust = 1, size = 12), axis.text.y = element_text(size = 12))

ggplot(weekts, aes(x = date)) + geom_bar(aes(y = N), stat = "identity", color = "black", 
    fill = "light blue") + geom_line(aes(y = Cases/1000), size = 1) + theme_bw() + 
    geom_text(aes(y = N, label = N), nudge_y = 100) + scale_x_date(breaks = "1 week", 
    date_labels = "%b %d") + scale_y_continuous(name = "Number of Publications", 
    sec.axis = sec_axis(trans = ~. + 0, name = "Number of Global Cases (in thousands)")) + 
    labs(x = "Monday Date") + theme(axis.title = element_text(size = 14), axis.text.x = element_text(angle = 60, 
    hjust = 1, size = 12), axis.text.y = element_text(size = 12))

ggplot(pub_db.weekts, aes(x = date, y = Cum.N)) + geom_bar(stat = "identity", color = "black") + 
    geom_text(aes(label = Cum.N), nudge_y = 1000) + theme_bw() + scale_x_date(breaks = "1 week", 
    date_labels = "%b %d") + labs(x = "Week of Electronic Publication Date", y = "Cumulative Number of Publications") + 
    theme(axis.title = element_text(size = 14), axis.text.x = element_text(angle = 60, 
        hjust = 1, size = 12), axis.text.y = element_text(size = 12))

Publication Characteristics

compare.pubs1 <- compareGroups(~PL + LA + PST + day + week.cat, data = pub_db1, max.xlev = 60)

compare.pubs1.table <- createTable(compare.pubs1, hide.no = "no")


export2md(compare.pubs1.table, loc = "left", size = "large")
Summary descriptives table
[ALL], N=17673. Each variable row gives the number of non-missing values; each category row gives count (percent).
PL (place of publication): 17672
Argentina 3 (0.02%)
Australia 204 (1.15%)
Austria 26 (0.15%)
Belgium 1 (0.01%)
Bosnia and Herzegovina 2 (0.01%)
Brazil 144 (0.81%)
Canada 350 (1.98%)
Chile 8 (0.05%)
China 281 (1.59%)
China (Republic : 1949- ) 7 (0.04%)
Czech Republic 1 (0.01%)
Denmark 80 (0.45%)
Egypt 5 (0.03%)
England 4849 (27.4%)
France 283 (1.60%)
Germany 565 (3.20%)
Greece 12 (0.07%)
Hungary 15 (0.08%)
India 152 (0.86%)
Iran 57 (0.32%)
Ireland 275 (1.56%)
Italy 289 (1.64%)
Japan 71 (0.40%)
Korea (South) 128 (0.72%)
Malaysia 4 (0.02%)
Mexico 13 (0.07%)
Nepal 11 (0.06%)
Netherlands 1470 (8.32%)
New Zealand 48 (0.27%)
Norway 37 (0.21%)
Oman 1 (0.01%)
Poland 49 (0.28%)
Portugal 17 (0.10%)
Romania 3 (0.02%)
Russia (Federation) 1 (0.01%)
Saudi Arabia 5 (0.03%)
Scotland 38 (0.22%)
Singapore 74 (0.42%)
South Africa 10 (0.06%)
Spain 241 (1.36%)
Sweden 21 (0.12%)
Switzerland 634 (3.59%)
Thailand 2 (0.01%)
Turkey 71 (0.40%)
United Arab Emirates 26 (0.15%)
United States 7088 (40.1%)
LA (language): 17673
Chinese 103 (0.58%)
Dutch 14 (0.08%)
English 17250 (97.6%)
French 92 (0.52%)
German 59 (0.33%)
Hungarian 13 (0.07%)
Italian 2 (0.01%)
Polish 1 (0.01%)
Portuguese 23 (0.13%)
Russian 2 (0.01%)
Spanish 106 (0.60%)
Swedish 8 (0.05%)
PST (publication status): 17673
aheadofprint 11435 (64.7%)
epublish 2899 (16.4%)
ppublish 3339 (18.9%)
day (weekday of electronic publication date): 17673
Friday 3498 (19.8%)
Monday 2718 (15.4%)
Saturday 1388 (7.85%)
Sunday 745 (4.22%)
Thursday 3256 (18.4%)
Tuesday 2871 (16.2%)
Wednesday 3197 (18.1%)
week.cat (ISO week of electronic publication date): 17673
2 3 (0.02%)
3 1 (0.01%)
4 15 (0.08%)
5 37 (0.21%)
6 79 (0.45%)
7 80 (0.45%)
8 120 (0.68%)
9 172 (0.97%)
10 182 (1.03%)
11 234 (1.32%)
12 400 (2.26%)
13 547 (3.10%)
14 950 (5.38%)
15 1107 (6.26%)
16 1306 (7.39%)
17 1684 (9.53%)
18 1775 (10.0%)
19 1722 (9.74%)
20 2284 (12.9%)
21 1783 (10.1%)
22 1738 (9.83%)
23 1454 (8.23%)

Journals contributing at least 0.5% of total publications

knitr::kable(pub_db.journal, format = "html")
TA N Percent
BMJ 495 2.8
J Med Virol 344 1.9
medRxiv 241 1.4
Clin Infect Dis 169 1.0
J Infect 175 1.0
Lancet 183 1.0
N Engl J Med 177 1.0
Dermatol Ther 154 0.9
Int J Infect Dis 149 0.8
JAMA 139 0.8
bioRxiv 129 0.7
Int J Environ Res Public Health 117 0.7
Infect Control Hosp Epidemiol 113 0.6
J Am Acad Dermatol 112 0.6
Lancet Infect Dis 104 0.6
Med Hypotheses 112 0.6
Sci Total Environ 102 0.6
Travel Med Infect Dis 109 0.6
Ann Intern Med 83 0.5
Brain Behav Immun 89 0.5
Crit Care 97 0.5
Head Neck 85 0.5
J Clin Virol 84 0.5
J Eur Acad Dermatol Venereol 88 0.5
Psychiatry Res 82 0.5
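The cut-off in this table is applied after rounding: the dataset construction code rounds each journal's share to one decimal place and then keeps rows with `Percent > 0.4`. A raw share of roughly 0.45% or more is therefore enough to be listed. The sketch below illustrates this with three counts copied from the table and one hypothetical journal ("Example J", N = 79) that falls just below the cut-off.

```r
total_n <- 17673  # total records, from the summary table in this report

# Three journals from the table plus one hypothetical journal just under the cut-off
journals <- data.frame(
    TA = c("BMJ", "Ann Intern Med", "Psychiatry Res", "Example J"),
    N  = c(495, 83, 82, 79)
)

# Raw share, share rounded to one decimal, and the filter applied to the rounded value
journals$raw_percent <- 100 * journals$N / total_n
journals$Percent <- round(journals$raw_percent, digits = 1)
journals$kept <- journals$Percent > 0.4

print(journals)
```

With 17673 records in total, this rule keeps any journal with at least 80 publications in the window.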