── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.4.4 ✔ tibble 3.2.1
✔ lubridate 1.9.3 ✔ tidyr 1.3.1
✔ purrr 1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(pageviews) # This package gets data on Wikipedia viewing
library(DT) # DT stands for datatable, and creates interactive tables
library(infer) # for some stats like t_test
library(devtools)
Loading required package: usethis
gun_control <- article_pageviews(article = "Gun control", start = as.Date("2017-1-1"), end = as.Date("2023-12-31"))
glimpse(gun_control)
gun_violence <- article_pageviews(article = "Gun violence in the United States", start = as.Date("2017-1-1"), end = as.Date("2023-12-31"))
glimpse(gun_violence)
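The plotting code that follows refers to a data frame named `guns`, which none of the code shown here creates. The following is a minimal sketch of one plausible construction, assuming `guns` is simply the two downloads stacked row-wise with `bind_rows()`. The small tibbles (`gun_control_toy`, `gun_violence_toy`) are hypothetical stand-ins for the real `article_pageviews()` results, and their view counts are made up for illustration.

```r
library(dplyr)
library(tibble)

# Stand-ins for the article_pageviews() results (illustrative values only).
gun_control_toy <- tibble(
  article = "Gun_control",
  date = as.Date(c("2017-01-01", "2017-01-02")),
  views = c(100, 120)
)
gun_violence_toy <- tibble(
  article = "Gun_violence_in_the_United_States",
  date = as.Date(c("2017-01-01", "2017-01-02")),
  views = c(300, 310)
)

# Assumption: guns is the two article data frames stacked into one long table,
# with the article column telling the rows apart.
guns_toy <- bind_rows(gun_control_toy, gun_violence_toy)
guns_toy
```

With the real downloads, the equivalent call under this assumption would be `guns <- bind_rows(gun_control, gun_violence)`.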
guns %>%
  ggplot(aes(x = date, y = views, color = article)) +
  geom_line()
This line graph shows daily views of the "Gun control" and "Gun violence in the United States" Wikipedia articles from 2017 through 2023, with one line per article.
guns %>%
  pivot_wider(names_from = article, values_from = views) %>%
  ggplot(aes(x = Gun_violence_in_the_United_States, y = Gun_control)) + # create scatterplot
  geom_point() +
  geom_smooth(method = lm) + # create regression line
  labs(x = "Views of the Wikipedia Gun violence in the US article", y = "Views of the Wikipedia Gun control article", title = "Relationship between Wikipedia article views")
`geom_smooth()` using formula = 'y ~ x'
This scatterplot shows the relationship between daily views of the "Gun violence in the United States" Wikipedia article (x-axis) and daily views of the "Gun control" article (y-axis), with a fitted linear regression line.
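The strength of the linear relationship shown in the scatterplot can also be summarized as a single number, the Pearson correlation coefficient. The sketch below is one way to compute it from long-format data shaped like `guns` (columns `date`, `article`, `views`). The helper name `article_correlation` and the toy data are hypothetical and not part of the original analysis.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Hypothetical helper: Pearson correlation between two articles' daily views.
# Expects long data with columns date, article, and views.
article_correlation <- function(long_views, x_article, y_article) {
  wide <- long_views %>%
    select(date, article, views) %>%
    pivot_wider(names_from = article, values_from = views)
  # Use only days on which both articles have a view count.
  cor(wide[[x_article]], wide[[y_article]], use = "complete.obs")
}

# Toy data (illustrative values only): B is always A + 5,
# so the two series are perfectly linearly related.
toy <- tibble(
  date = rep(as.Date("2017-01-01") + 0:3, times = 2),
  article = rep(c("A", "B"), each = 4),
  views = c(10, 20, 30, 40, 15, 25, 35, 45)
)
article_correlation(toy, "A", "B")
```

On the real data the call would look like `article_correlation(guns, "Gun_violence_in_the_United_States", "Gun_control")`, assuming `guns` has the column names used in the plotting code.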
top <- top_articles(start = as.Date("2019-10-19"))
top %>%
  select(article, views) %>%
  filter(article != "Main_Page", article != "Special:Search") %>%
  slice_max(views, n = 10) %>%
  datatable()
top %>%
  select(article, views) %>%
  filter(article != "Main_Page", article != "Special:Search") %>%
  top_n(10, views) %>%
  ggplot(aes(x = as_factor(article), y = views)) +
  coord_flip() +
  geom_col(fill = "Orange") +
  labs(y = "Number of Views", x = "Article", title = "Top Wikipedia articles, Oct. 19, 2019")
The bar graph shows that no article related to the Grambling State University shooting appeared among the ten most-viewed Wikipedia articles on October 19, 2019, the day after the tragedy, so Wikipedia views did not reflect increased interest in the event.
top <- top_articles(start = as.Date("2019-10-23"))
top %>%
  select(article, views) %>%
  filter(article != "Main_Page", article != "Special:Search") %>%
  slice_max(views, n = 10) %>%
  datatable()
top %>%
  select(article, views) %>%
  filter(article != "Main_Page", article != "Special:Search") %>%
  top_n(10, views) %>%
  ggplot(aes(x = as_factor(article), y = views)) +
  coord_flip() +
  geom_col(fill = "Orange") +
  labs(y = "Number of Views", x = "Article", title = "Top Wikipedia articles, Oct. 23, 2019")
The bar graph shows that the Ridgeway High School shooting (October 22, 2019) was not among the top-viewed Wikipedia articles on October 23, 2019, the day after the incident.
california <- article_pageviews(article = "Gun_control", start = as.Date("2019-10-7"), end = as.Date("2019-11-4"))
california <- california %>%
  mutate(day = -14:14) %>%
  mutate(event = "california")
louisiana %>%
  ggplot(aes(x = day, y = views)) +
  geom_line()
california %>%
  ggplot(aes(x = day, y = views)) +
  geom_line()
shootings <- bind_rows(louisiana, california)
shootings %>%
  ggplot(aes(x = day, y = views, color = event)) +
  geom_line() +
  theme_minimal() +
  labs(x = "Days before/after Shooting", y = "Wikipedia Views", color = "Event", title = "Views of the Wikipedia Gun Control Article before and after Two Mass Shootings")
The graph shows daily views of the Wikipedia "Gun control" article in the 14 days before and after school shootings in California and Louisiana, with one line per event.
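The `day` variable in the code above is assigned as `-14:14`, which is only correct when the download contains exactly 29 rows in date order. As a hedged alternative, the sketch below derives the offset from the `date` column itself, so a missing or extra day cannot silently shift the alignment. The helper name `add_event_day`, the event date, and the view counts are hypothetical and used only for illustration.

```r
library(dplyr)
library(tibble)

# Hypothetical helper: label each row with its offset in days from the event
# and with the name of the event it belongs to.
add_event_day <- function(views, event_date, event_label) {
  views %>%
    mutate(
      day = as.integer(as.Date(date) - as.Date(event_date)),
      event = event_label
    )
}

# Toy daily view counts (illustrative values only).
toy_views <- tibble(
  date = as.Date("2019-10-20") + 0:4,
  views = c(800, 850, 900, 950, 870)
)

# Day 0 is the (assumed) event date; negative days fall before it.
toy_window <- add_event_day(toy_views, "2019-10-22", "example")
toy_window
```

Data frames labeled this way can be stacked with `bind_rows()` and plotted by `day` and `event` exactly as in the combined plot.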
Warning: The statistic is based on a difference or ratio; by default, for
difference-based statistics, the explanatory variable is subtracted in the
order "TRUE" - "FALSE", or divided in the order "TRUE" / "FALSE" for
ratio-based statistics. To specify this order yourself, supply `order =
c("TRUE", "FALSE")`.
# A tibble: 2 × 4
  after_event  Mean StdDev     N
  <lgl>       <dbl>  <dbl> <int>
1 FALSE        858.   177.    30
2 TRUE         894.   205.    28
The average number of daily views of the Wikipedia "Gun control" article in the days up to and including the two shootings (M = 858, SD = 177, n = 30) was not statistically significantly different from the average number of daily views in the 14 days after the shootings (M = 894, SD = 205, n = 28). A Welch t-test computed from these rounded summary statistics gives approximately t(53.5) = 0.71, p = 0.48.
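The code that produced the test is not shown, so as a cross-check the Welch statistic can be recomputed directly from the printed summary table. The sketch below assumes an unequal-variance (Welch) two-sample t-test, which matches the fractional degrees of freedom reported. The helper name `welch_from_summary` is hypothetical, and because the table values are rounded, the result is approximate.

```r
# Hypothetical helper: Welch's two-sample t-test from summary statistics.
# Group 1 is subtracted from group 2, matching the "TRUE" - "FALSE" order
# mentioned in the warning above.
welch_from_summary <- function(m1, s1, n1, m2, s2, n2) {
  v1 <- s1^2 / n1                       # squared standard error, group 1
  v2 <- s2^2 / n2                       # squared standard error, group 2
  t_stat <- (m2 - m1) / sqrt(v1 + v2)
  # Welch-Satterthwaite approximation to the degrees of freedom.
  df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
  p <- 2 * pt(-abs(t_stat), df)         # two-sided p-value
  c(t = t_stat, df = df, p = p)
}

# Rounded values from the summary table: FALSE group first, then TRUE.
res <- welch_from_summary(858, 177, 30, 894, 205, 28)
round(res, 3)
```

Running the same test on the raw data, for example with `t_test()` from infer, avoids the rounding and is the better source for reported values.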