What is RedditExtractoR?
A minimalistic R wrapper for the Reddit API (application programming interface)
Install the RedditExtractoR package
install.packages("RedditExtractoR")
Load the package into the console
library(RedditExtractoR)
Building the search query
Use the get_reddit() function to query Reddit data, adjusting the arguments below:
- Arguments
search_terms
A string of terms to be searched on Reddit.
regex_filter
An optional regular expression filter that removes URLs whose titles do not match the condition.
subreddit
An optional character string that restricts the search to the specified subreddit.
cn_threshold
Comment number threshold that removes URLs with fewer comments than cn_threshold. 0 by default.
page_threshold
Page threshold that controls the number of pages to be searched.
sort_by
"comments" to arrange by number of comments, or "new" to arrange by date.
wait_time
Wait time in seconds between page requests. 2 by default.
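To make the two filtering arguments concrete, here is a minimal base R sketch of the kind of narrowing that regex_filter and cn_threshold describe. It does not call Reddit and is not the package's internal code: the threads data frame and its column names (title, num_comments) are made up for illustration.

```r
# Hypothetical stand-in for a list of candidate threads.
# The column names (title, num_comments) are illustrative assumptions.
threads <- data.frame(
  title = c("Bone marrow donor meets recipient",
            "My cat today",
            "Bone marrow transplant, day 100"),
  num_comments = c(120, 3, 0),
  stringsAsFactors = FALSE
)

regex_filter <- "transplant|donor"  # keep only titles matching this pattern
cn_threshold <- 1                   # drop threads with fewer comments than this

# A thread survives only if its title matches AND it has enough comments.
keep <- grepl(regex_filter, threads$title) &
  threads$num_comments >= cn_threshold
filtered <- threads[keep, ]
print(filtered)
```

In this toy data the second thread is dropped because its title does not match, and the third because it falls below the comment threshold.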
Posts in the r/pics subreddit containing the term bone marrow
bone_marrow_posts = get_reddit(search_terms = "bone marrow", subreddit = "pics", cn_threshold = 0, sort_by = "comments")
Output
library(dplyr)  # provides sample_n(), select(), and the %>% pipe
DT::datatable(sample_n(bone_marrow_posts, 25) %>% select(structure, author, user, comment))
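Beyond browsing a random sample, one simple next step is to count how many comments each user contributed. The sketch below uses base R only and a small made-up data frame with the same four columns selected above; the values, including those in the structure column, are invented placeholders and not real Reddit output.

```r
# Hypothetical stand-in for the query result, limited to the columns
# shown in the table above. All values are invented for illustration.
posts <- data.frame(
  structure = c("1", "1_1", "2", "2_1", "3"),
  author = rep("op_user", 5),
  user = c("alice", "bob", "alice", "carol", "alice"),
  comment = c("Great news", "Agreed", "Congrats", "Thanks", "Wow"),
  stringsAsFactors = FALSE
)

# Count comments per commenting user, most active first.
comment_counts <- sort(table(posts$user), decreasing = TRUE)
print(comment_counts)
```

With real data, the same two lines at the end could be applied to bone_marrow_posts$user instead of the placeholder.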