Download the raw data from Kaggle and unzip it: https://www.kaggle.com/stackoverflow/statsquestions

library(tidyverse)

stats_questions <- "~/Downloads/statsquestions"

questions <- read_csv(file.path(stats_questions, "Questions.csv"))
answers <- read_csv(file.path(stats_questions, "Answers.csv"))
tags <- read_csv(file.path(stats_questions, "Tags.csv"))

Don’t include questions from the most recent few months, since there may not have been adequate time for them to be answered! Also leave out 2009, when there were almost no questions.

library(lubridate)

answers_by_question <- answers %>%
  count(Id = ParentId) %>%
  rename(NumAnswers = n)

joined <- questions %>%
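  # Explicit UTC boundaries, so the window does not depend on the local time zone
  filter(CreationDate >= ymd("2010-01-01", tz = "UTC"),
         CreationDate <= ymd("2016-07-01", tz = "UTC")) %>%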
  left_join(answers_by_question, by = "Id") %>%
  replace_na(list(NumAnswers = 0))

by_year <- joined %>%
  group_by(Year = year(CreationDate)) %>%
  summarize(NumQuestions = n(),
            AverageAnswers = mean(NumAnswers),
            PercentAnswered = mean(NumAnswers > 0))
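
To see why the left join and `replace_na()` matter, here is a minimal sketch on made-up toy data (the `toy_*` tables are hypothetical and not taken from the Kaggle files). A question with no answers never shows up in the answer counts, so without the `replace_na()` step it would carry an `NA` rather than a zero into the averages.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: three questions, of which question 2 has no answers
toy_questions <- tibble(Id = c(1, 2, 3))
toy_answers <- tibble(ParentId = c(1, 1, 3))

# Count answers per question; question 2 is absent from this table
toy_counts <- toy_answers %>%
  count(Id = ParentId) %>%
  rename(NumAnswers = n)

# Keep every question, then turn the missing count into an explicit zero
toy_joined <- toy_questions %>%
  left_join(toy_counts, by = "Id") %>%
  replace_na(list(NumAnswers = 0))

# The mean of a logical vector is the share of TRUE values
toy_summary <- toy_joined %>%
  summarize(AverageAnswers = mean(NumAnswers),
            PercentAnswered = mean(NumAnswers > 0))

toy_summary
```

In this toy case the average is one answer per question and two of the three questions are answered. An inner join would have dropped question 2 entirely and reported every question as answered.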

By either metric, the extent to which questions get answered has been decreasing:

ggplot(by_year, aes(Year, AverageAnswers)) +
  geom_line()

ggplot(by_year, aes(Year, PercentAnswered)) +
  geom_line() +
  scale_y_continuous(labels = scales::percent_format())

What about time of day? Note that times are UTC (so the same as UK time in winter, and an hour behind it during British Summer Time).
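
As a quick check on how UTC hours line up with UK clock time, here is a small sketch using two made-up timestamps (they are illustrative and not drawn from the data). It assumes the `Europe/London` time zone is available on your system.

```r
library(lubridate)

# Two hypothetical timestamps at noon UTC, one in winter and one in summer
winter_utc <- ymd_hms("2015-01-15 12:00:00", tz = "UTC")
summer_utc <- ymd_hms("2015-07-15 12:00:00", tz = "UTC")

# hour() reads the hour in the time zone attached to the value
hour(winter_utc)                            # 12
hour(with_tz(winter_utc, "Europe/London"))  # 12 (GMT equals UTC)
hour(with_tz(summer_utc, "Europe/London"))  # 13 (BST is UTC+1)
```

So the `Hour` axis in the plot below is UTC throughout; local UK time is shifted by an hour for the summer months.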

joined %>%
  group_by(Year = year(CreationDate), Hour = hour(CreationDate)) %>%
  summarize(NumQuestions = n(),
            AverageAnswers = mean(NumAnswers),
            PercentAnswered = mean(NumAnswers > 0)) %>%
  ggplot(aes(Hour, PercentAnswered, color = Year, group = Year)) +
  geom_line() +
  expand_limits(y = 0) +
  scale_y_continuous(labels = scales::percent_format())

No trend: a question asked at UTC midnight is about as likely to get an answer as a question asked at midday, and this holds within each year.

One problem is that this doesn’t include closed or deleted questions, and older unanswered questions are more likely to have been deleted. The Stack Exchange Data Explorer provides data for closed (but not deleted) questions, which could be used to investigate this further.

Still, I do think this is a real effect!