This explains some of the problems people run into when trying to download data files with R (written primarily for the Getting and Cleaning Data course on Coursera, but it has more general utility).
Some initial download instructions might look like:
myurl <- "https://127.0.0.1/imaginary/file.csv"
download.file(url=myurl, destfile="localcopy.csv")
But this will often fail (even if the file were real), because R's default download method does not always handle https.
First tip: does it work if you take the s off https, making it http://?
If it does, most of your problems are solved and you can get on with your life (at least for anything that is not an Excel file or a similar binary file, see later).
myurl <- "http://127.0.0.1/imaginary/file.csv"
download.file(url=myurl, destfile="localcopy.csv")
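The "try https, then drop the s" tip can be sketched as a small wrapper. This is only an illustration: to_http and download_with_fallback are hypothetical helper names, not part of R, and the sketch assumes a failed download raises an error that tryCatch can intercept.

```r
# Hypothetical helper: rewrite an https:// address as http://
to_http <- function(url) {
  sub("^https://", "http://", url)
}

# Try the address as given; if R signals an error, retry over plain http
download_with_fallback <- function(url, destfile, ...) {
  tryCatch(
    download.file(url=url, destfile=destfile, ...),
    error = function(e) {
      message("https download failed, retrying with http")
      download.file(url=to_http(url), destfile=destfile, ...)
    }
  )
}

# The address is imaginary, so the call is left commented out
# download_with_fallback("https://127.0.0.1/imaginary/file.csv", "localcopy.csv")
```

Bear in mind that some servers only offer https, in which case the fallback will fail as well.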
On the other hand, if you have to use an https connection to get the data,
adding method="curl" will generally get things working on a Macintosh, as R can then hand the download off to the curl program behind the scenes.
myurl <- "https://127.0.0.1/imaginary/file.csv"
download.file(url=myurl, destfile="localcopy.csv", method="curl")
On Windows, the two main routes are either to install curl from http://curl.haxx.se, or to use the Internet2 option, which tells R to download using Internet Explorer:
myurl <- "https://127.0.0.1/imaginary/file.csv"
setInternet2(use = TRUE)
download.file(url=myurl, destfile="localcopy.csv")
Alternatively, try installing curl from http://curl.haxx.se and using the same method="curl" instructions as for the Macintosh.
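Before relying on method="curl" it can help to check that curl is actually installed. A minimal sketch, assuming curl would be found on the system PATH; curl_available and choose_method are hypothetical helper names introduced only for this illustration.

```r
# Sys.which() returns "" when the program cannot be found on the PATH
curl_available <- function() {
  unname(nzchar(Sys.which("curl")))
}

# Hypothetical helper: use "curl" when it is installed, otherwise let R decide
choose_method <- function() {
  if (curl_available()) "curl" else "auto"
}

myurl <- "https://127.0.0.1/imaginary/file.csv"
# The address is imaginary, so the call is left commented out
# download.file(url=myurl, destfile="localcopy.csv", method=choose_method())
```

If curl_available() returns FALSE just after installing curl, restarting R so that it picks up the updated PATH may be needed.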
There is also an add-on package called downloader which sorts out most of these difficulties. Instead of the above, you can use install.packages("downloader") to get the package on your machine, then
require(downloader)
myurl <- "https://127.0.0.1/imaginary/file.csv"
download(myurl, destfile="localcopy.csv")
Some kinds of files (in particular Excel and h5) are binary rather than text files. These need an added setting, mode="wb", to warn R that the file is binary and should be handled appropriately.
myurl <- "http://127.0.0.1/imaginary/file.xlsx"
download.file(url=myurl, destfile="localcopy.xlsx", mode="wb")
or, when using downloader,
myurl <- "http://127.0.0.1/imaginary/file.xlsx"
download(myurl, destfile="localcopy.xlsx", mode="wb")
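If you download a mixture of text and binary files, the mode can be picked from the file extension. A minimal sketch: pick_mode is a hypothetical helper, and its list of binary extensions only covers the Excel and h5 cases mentioned above, so extend it as needed.

```r
# Hypothetical helper: "wb" for known binary extensions, "w" (the default) otherwise
pick_mode <- function(filename) {
  ext <- tolower(tools::file_ext(filename))
  if (ext %in% c("xls", "xlsx", "h5")) "wb" else "w"
}

myurl <- "http://127.0.0.1/imaginary/file.xlsx"
dest <- "localcopy.xlsx"
# The address is imaginary, so the call is left commented out
# download.file(url=myurl, destfile=dest, mode=pick_mode(dest))
```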
You may be trying to download the file from within an organisation that limits internet traffic. You can try to configure R to use your organisation's proxy settings, see
?download.file
but ultimately there may come the point where it is easier to download the file with a web browser.
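As a rough illustration of the proxy settings mentioned above, proxies are commonly supplied to R through environment variables. The proxy address here is hypothetical, and whether a given download method honours these variables is an assumption to check against ?download.file.

```r
# Hypothetical proxy address: replace with the one your organisation provides
proxy <- "http://proxy.example.org:8080"

# Set the variables for the current R session only
Sys.setenv(http_proxy=proxy, https_proxy=proxy)

# Confirm what R will see
Sys.getenv("http_proxy")
```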