Scraping https://akipress.com/

1. Create a function that scrapes one page of a new site. Get at least two texts and one link.

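library(rvest)       # read_html(), html_nodes(), html_text(), html_attr(), and the %>% pipe
library(data.table)  # rbindlist(), used in step 4

# Scrape one listing page and return one row per news item.
get_akipress_one_page <- function(url) {
  page <- read_html(url)
  title <- page %>% html_nodes('.newstitle') %>% html_text()
  category <- page %>% html_nodes('.catreg') %>% html_text()
  # Prepend the domain to each href to build the full link
  link <- paste0('https://akipress.com', page %>% html_nodes('.newstitle') %>% html_attr('href'))
  # Assumes every .newstitle has a matching .catreg; otherwise data.frame() may fail or recycle values
  df <- data.frame('title' = title, 'category' = category, 'link' = link, stringsAsFactors = FALSE)
  return(df)
}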
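
The extraction steps inside the function can be tried offline before hitting the live site. The sketch below runs the same selectors against a small HTML string. The markup is invented for illustration and the real akipress.com page structure may differ; only the class names `.newstitle` and `.catreg` come from the function above.

```r
library(rvest)

# Hypothetical markup for illustration only; not copied from akipress.com.
snippet <- '<html><body>
  <div><span class="catreg">Politics</span><a class="newstitle" href="/news:1/">First headline</a></div>
  <div><span class="catreg">Economy</span><a class="newstitle" href="/news:2/">Second headline</a></div>
</body></html>'

page <- read_html(snippet)  # read_html() also accepts literal HTML
titles <- page %>% html_nodes('.newstitle') %>% html_text()
categories <- page %>% html_nodes('.catreg') %>% html_text()
hrefs <- page %>% html_nodes('.newstitle') %>% html_attr('href')

# The columns only line up row by row if both selectors match equally often.
stopifnot(length(titles) == length(categories))

data.frame(title = titles, category = categories,
           link = paste0('https://akipress.com', hrefs))
```

If the length check fails on a real page, the selectors probably match elements outside the news list and need to be narrowed.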

2. Create the links for the first 10 pages.

links <- paste0('https://akipress.com/reg:4/page:', 1:10, '/')
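
Because `paste0` is vectorised, the single call above produces one URL per page number. A quick check, with `sprintf` shown as an equivalent alternative (the name `links_alt` is only for this illustration):

```r
links <- paste0('https://akipress.com/reg:4/page:', 1:10, '/')

# Same result with a format string; %d is filled from the integer vector
links_alt <- sprintf('https://akipress.com/reg:4/page:%d/', 1:10)

length(links)               # 10
links[1]                    # "https://akipress.com/reg:4/page:1/"
identical(links, links_alt) # TRUE
```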

3. Apply your function to your vector containing the links.

df <- lapply(links, get_akipress_one_page)
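
With a plain `lapply`, one failed request stops the whole run. A minimal sketch of a more forgiving loop follows, assuming it is acceptable to skip pages that fail. `safe_scrape` and `fake_scraper` are hypothetical names introduced here; in real use `get_akipress_one_page` would be passed in place of the stand-in scraper.

```r
# Hypothetical helper: a failing page yields NULL instead of an error.
safe_scrape <- function(scraper, url, pause = 1) {
  Sys.sleep(pause)  # brief pause between requests
  tryCatch(scraper(url), error = function(e) {
    message('Skipping ', url, ': ', conditionMessage(e))
    NULL
  })
}

# Stand-in scraper for demonstration only; it fails on page 2.
fake_scraper <- function(url) {
  if (grepl('page:2', url)) stop('simulated failure')
  data.frame(title = 'x', link = url)
}

demo_links <- paste0('https://akipress.com/reg:4/page:', 1:3, '/')
pages <- lapply(demo_links, function(u) safe_scrape(fake_scraper, u, pause = 0))
pages <- Filter(Negate(is.null), pages)  # drop the pages that failed
length(pages)  # 2
```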

4. Use rbindlist to combine your data frames into one.

df <- rbindlist(df)
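
`rbindlist` comes from the data.table package and returns a data.table rather than a plain data frame. The toy example below uses made-up rows to show the stacking. The `idcol` argument is optional and records which list element each row came from.

```r
library(data.table)

# Made-up data standing in for two scraped pages
pages <- list(
  data.frame(title = c('a', 'b'), link = c('/1', '/2')),
  data.frame(title = 'c', link = '/3')
)

combined <- rbindlist(pages, idcol = 'page')
nrow(combined)   # 3
combined$page    # 1 1 2
class(combined)  # "data.table" "data.frame"
```

If a plain data frame is needed afterwards, `as.data.frame(combined)` converts it.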