library(tidyverse)
wiki_pages_df <- read_csv("https://myxavier-my.sharepoint.com/:x:/g/personal/grotjana_xavier_edu/Eckm9F_hfGRIsMAcLU5aLD0Bur7Kry1oe9BNmf54yOzllA?download=1")
In-Class Activity Web Scraping
Wikipedia Web Scraping
Fifteen random Wikipedia articles were scraped from the web to build a data frame that records the names of the articles.
Running Code
Load the data that stores the names of the scraped Wikipedia articles.
You should now have a data frame that you can use to make a histogram. The histogram below shows the distribution of character lengths of the article names.
wiki_pages_df %>%
  ggplot(aes(x = nchar(wiki_pages))) +
  geom_histogram()
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
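The `stat_bin()` message appears because `geom_histogram()` falls back to 30 bins when no bin width is given, which is usually too many for only 15 titles. One way to address it is to set `binwidth` explicitly. The sketch below is only an illustration: `example_df` and its five titles are a hypothetical stand-in for `wiki_pages_df`, and the width of 5 characters is an assumed value that should be tuned to the real data.

```r
library(ggplot2)

# Hypothetical stand-in for wiki_pages_df; the real titles come from the CSV loaded earlier.
example_df <- data.frame(
  wiki_pages = c("Alpha", "Beta particle", "Gamma-ray burst", "Delta", "Epsilon Eridani"),
  stringsAsFactors = FALSE
)

# binwidth = 5 groups title lengths into 5-character bins (an assumed value, not a recommendation).
title_length_plot <- ggplot(example_df, aes(x = nchar(wiki_pages))) +
  geom_histogram(binwidth = 5) +
  labs(x = "Characters in article name", y = "Number of articles")

title_length_plot
```

To apply the same idea to the scraped data, replace `example_df` with `wiki_pages_df` and adjust `binwidth` until the shape of the distribution is easy to read.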