We will use the R package "rvest" for web scraping.

The example page: https://heavenlyfood.cn/books/menu.php?id=2021 (国度的操练为着教会的建造)
The site is written in Simplified Chinese, so we will convert the text to Traditional Chinese.
We will use the R package "ropencc" for this job; you can download "ropencc" from GitHub. We then write each chapter to its own text file.
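As a minimal sketch of the conversion step, the snippet below creates a Simplified-to-Traditional converter and runs it on a sample string. The GitHub repository name (`qinwf/ropencc`), the `S2T` constant, and the sample text are assumptions for illustration; only `run_convert(trans, ...)` appears in the code in this document. The block is guarded so that it does nothing if the package is not installed.

```r
# Install once from GitHub (the repository name is an assumption):
# install.packages("devtools")
# devtools::install_github("qinwf/ropencc")

sample_text <- "开放中文转换"  # hypothetical sample input
converted <- NULL

if (requireNamespace("ropencc", quietly = TRUE)) {
  library(ropencc)
  # S2T is assumed to select Simplified -> Traditional conversion
  trans <- converter(S2T)
  converted <- run_convert(trans, sample_text)
  print(converted)
}
```

The converter object is named `trans` here so that it matches the name used in the loop under Content Grabbing.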

Content Grabbing
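The loop in this section expects two vectors, `title` and `url`, holding the chapter names and their relative links. One way to build them with rvest is sketched below. The HTML string is a hypothetical stand-in for the menu page, and the `a.chapter` selector and link paths are assumptions; the real page's markup may differ, so inspect it and adjust the selector.

```r
library(rvest)

# Hypothetical stand-in for the menu page (not the real markup).
menu_html <- '<html><body>
  <a class="chapter" href="books/read.php?id=1">Chapter 1</a>
  <a class="chapter" href="books/read.php?id=2">Chapter 2</a>
</body></html>'

# For the live site, pass the menu URL to read_html() instead.
menu <- read_html(menu_html)

# Select the chapter links, then pull out their text and href attributes.
links <- html_nodes(menu, "a.chapter")
title <- html_text(links, trim = TRUE)
url   <- html_attr(links, "href")
```

Because `url` holds relative links, each one is later joined to the site root with `paste0()`.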

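# Assumes `title` and `url` (chapter names and relative links) and
# `trans` (an ropencc converter) have already been created.
for (i in seq_along(title)) {

  # Build the chapter URL and read the page
  chapter_url <- paste0("https://heavenlyfood.cn/", url[i])
  bible1 <- read_html(chapter_url)

  # Grab the content
  bible_cont <- html_nodes(bible1, ".cont")
  cont <- html_text(bible_cont, trim = TRUE)

  # Put the chapter title on the first line, then convert
  # Simplified Chinese to Traditional Chinese
  cont[1] <- title[i]
  cont <- run_convert(trans, cont)

  # Write each chapter to its own text file, without quotes,
  # row names, or a header line
  nam <- paste0(title[i], ".txt")
  write.table(cont, nam, quote = FALSE, row.names = FALSE, col.names = FALSE)
}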
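Chapter titles can contain characters that are not valid in file names on some systems. The sketch below shows one possible way to guard against that. `safe_filename()` is a hypothetical helper that is not part of the original code, and the sample title and text are made up for illustration.

```r
# Hypothetical helper: replace characters that are unsafe in file names
# with underscores, then append the extension.
safe_filename <- function(title, ext = ".txt") {
  cleaned <- gsub('[\\\\/:*?"<>|]', "_", title, perl = TRUE)
  paste0(trimws(cleaned), ext)
}

# Example: write a two-line chapter to a temporary directory.
out_file <- file.path(tempdir(), safe_filename("Chapter 1: Intro?"))
writeLines(c("Chapter 1: Intro?", "Body text"), out_file, useBytes = TRUE)
```

Inside the loop, `nam <- safe_filename(title[i])` could replace the plain `paste0()` call if the titles turn out to need it.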

Running the code produces eight text files, one for each chapter.