Suppose you have a bunch of submitted student reports. Since the students work in groups, some files are submitted more than once. You want to discard the duplicates.

First, get the file paths:

reports <- dir("Report file directory", full.names = TRUE)

Then calculate the MD5 sums. Identical files have identical sums:

library(tools)
md5 <- md5sum(reports)
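
To see what md5sum() returns, here is a minimal sketch on throwaway files. The temporary directory and the file names are made up for illustration and have nothing to do with the actual report folder.

```r
library(tools)

# Hypothetical demo files in a temporary directory (not the real reports)
demo_dir <- file.path(tempdir(), "md5_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines("Group A report", file.path(demo_dir, "alice.txt"))
writeLines("Group A report", file.path(demo_dir, "bob.txt"))   # same content as alice.txt
writeLines("Group B report", file.path(demo_dir, "carol.txt"))

demo_files <- dir(demo_dir, full.names = TRUE)
demo_md5 <- md5sum(demo_files)

# md5sum() returns a character vector of hashes, named by file path
names(demo_md5)
unname(demo_md5)
```

The two files with the same content get the same hash even though their names differ, which is what the rest of the post relies on.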

Put the file names and their sums into a data.frame:

dr <- data.frame(name = names(md5), hash = md5)

Get the rows with unique MD5 sums:

unique_reports <- dr[match(unique(dr$hash), dr$hash), ]
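
The same selection can also be written with duplicated(), which some may find easier to read. The sketch below uses a small made-up data frame (the file names and hash strings are placeholders, not real MD5 sums) and compares the two forms.

```r
# Toy data: placeholder names and hashes, for illustration only
toy <- data.frame(
  name = c("a.pdf", "b.pdf", "c.pdf", "d.pdf"),
  hash = c("h1", "h2", "h1", "h2"),
  stringsAsFactors = FALSE
)

# Approach from the post: first position of each unique hash
by_match <- toy[match(unique(toy$hash), toy$hash), ]

# Alternative: drop every row whose hash has already been seen
by_duplicated <- toy[!duplicated(toy$hash), ]

# Both keep the first file for each hash
identical(by_match$name, by_duplicated$name)
```

Either way, the file that is kept for each group is simply the first one in the listing, so which group member's copy survives depends on the file order.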

If you are on Mac OS X, you can open these files directly from R:

# shQuote() keeps the command valid even if the path contains spaces or quotes
system(paste("open", shQuote(unique_reports$name[1])))

Who needs Finder (or any other file browser) when you can do everything from R! :)

Note that I do not advocate such a workflow. But it works for me! :) Sometimes :)
