Peruvian Currency Exchange Scraper

A simple scraper extracting currency exchange data from SUNAT

Here I will demonstrate a simple web scraping example using a table of exchange rates from Peru’s tax agency, SUNAT. Today that page looked like this:

SUNAT currency exchange for Oct 2017

First we load the `rvest` library and read the webpage we want to scrape from its URL.

  library(rvest)

## Loading required package: xml2

  url <- 'http://www.sunat.gob.pe/cl-at-ittipcam/tcS01Alias'
  webpage <- read_html(url)
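Because the page is fetched over the network, a failed request would stop the script at `read_html`. Below is a minimal sketch of one way to guard that call. The helper name `safe_read_html` is hypothetical (it is not part of `rvest`), and returning `NULL` on failure is just one possible design choice.

```r
library(rvest)

# Hypothetical helper: returns the parsed page, or NULL if reading fails
safe_read_html <- function(src) {
  tryCatch(
    read_html(src),
    error = function(e) {
      message("Could not read page: ", conditionMessage(e))
      NULL
    }
  )
}

# A literal HTML string parses without touching the network
ok <- safe_read_html("<html><body><p>hello</p></body></html>")

# A path that does not exist makes read_html fail, so we get NULL back
bad <- safe_read_html("no-such-file-xyz.html")
is.null(bad)
```

With the real page one would call `safe_read_html(url)` and check for `NULL` before going on to extract the tables.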

We get the tables on the webpage with `html_nodes`.

  tbls <- html_nodes(webpage, "table")
  length(tbls)

## [1] 6

There are 6 tables. Through trial and error I found that my table of interest is table 2.

  tbl2<-html_table(tbls[[2]])
  print(tbl2)

##    X1     X2    X3  X4     X5    X6  X7     X8    X9 X10    X11   X12
## 1 Día Compra Venta Día Compra Venta Día Compra Venta Día Compra Venta
## 2   3  3.267 3.271   4  3.266 3.268   5  3.258 3.260   6  3.254 3.256
## 3   7  3.266 3.268  10  3.270 3.273  11  3.265 3.267  12  3.260 3.262
## 4  13  3.254 3.256  14  3.248 3.251  17  3.244 3.247  18  3.244 3.246
## 5  19  3.242 3.244  20  3.235 3.237  21  3.237 3.240  24  3.238 3.241
## 6  25  3.238 3.242  26  3.233 3.235  27  3.236 3.239  28  3.244 3.248

  dim(tbl2)

## [1]  6 12

  num.cols<-dim(tbl2)[2]
  num.rows<-dim(tbl2)[1]
  num.cols

## [1] 12

  num.rows

## [1] 6
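The index 2 was found by trial and error, and it would change if SUNAT added or removed a table. As a rough sketch of an alternative, the table can be located by the labels it contains. The HTML below is a toy stand-in that I made up for illustration, not the real SUNAT markup, and the names `page`, `is_rate_tbl`, `idx` and `rates` are my own.

```r
library(rvest)

# Toy page standing in for the SUNAT HTML (hypothetical markup)
page <- read_html('
<html><body>
  <table><tr><td>Menu</td></tr></table>
  <table>
    <tr><td>D&iacute;a</td><td>Compra</td><td>Venta</td></tr>
    <tr><td>3</td><td>3.267</td><td>3.271</td></tr>
    <tr><td>4</td><td>3.266</td><td>3.268</td></tr>
  </table>
</body></html>')

tbls <- html_nodes(page, "table")

# Flag the tables whose text mentions both rate labels
txt <- html_text(tbls)
is_rate_tbl <- grepl("Compra", txt) & grepl("Venta", txt)

# Take the first match
idx <- which(is_rate_tbl)[1]
idx

# Use the first row as column names for the matched table
rates <- html_table(tbls[[idx]], header = TRUE)
```

On a page with nested tables an outer table would also match, because its text includes the text of the inner one, so the result should still be checked by eye.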

Reformatting the data into a tidy data.frame

We already have the number of rows and columns, so we use them to build three vectors (day, buy rate, sell rate) and then combine those into a data.frame.

  dia<-c()
  compra<-c()
  venta<-c()
  num.cols

## [1] 12

  num.rows

## [1] 6

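  # Row 1 holds the column labels, so the data starts at row 2
  for(i in 2:num.rows){
     # Each row packs num.cols/3 (day, buy, sell) triplets side by side
     for(j in 1:(num.cols/3)){
        dia<-c(dia,as.numeric(tbl2[i,(j-1)*3+1]))
        compra<-c(compra,as.numeric(tbl2[i,(j-1)*3+2]))
        venta<-c(venta,as.numeric(tbl2[i,(j-1)*3+3]))
     }
  }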
  
  output<-data.frame(dia,compra,venta)
  print(output)

##    dia compra venta
## 1    3  3.267 3.271
## 2    4  3.266 3.268
## 3    5  3.258 3.260
## 4    6  3.254 3.256
## 5    7  3.266 3.268
## 6   10  3.270 3.273
## 7   11  3.265 3.267
## 8   12  3.260 3.262
## 9   13  3.254 3.256
## 10  14  3.248 3.251
## 11  17  3.244 3.247
## 12  18  3.244 3.246
## 13  19  3.242 3.244
## 14  20  3.235 3.237
## 15  21  3.237 3.240
## 16  24  3.238 3.241
## 17  25  3.238 3.242
## 18  26  3.233 3.235
## 19  27  3.236 3.239
## 20  28  3.244 3.248
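The nested loop above grows the vectors one element at a time. The same reshaping can also be sketched without explicit indexing arithmetic by stacking each (day, buy, sell) column triplet. The small data.frame `tbl` below is a stand-in for `tbl2` with only two data rows and two triplets. The names `tbl`, `body`, `starts`, `pieces` and `tidy` are mine, and the `fecha` column assumes the month is October 2017, as in the screenshot caption.

```r
# Stand-in for tbl2: a header row plus two data rows, two triplets wide
tbl <- data.frame(
  X1 = c("Día", "3", "7"),
  X2 = c("Compra", "3.267", "3.266"),
  X3 = c("Venta", "3.271", "3.268"),
  X4 = c("Día", "4", "10"),
  X5 = c("Compra", "3.266", "3.270"),
  X6 = c("Venta", "3.268", "3.273"),
  stringsAsFactors = FALSE
)

body <- tbl[-1, ]                      # drop the header row
starts <- seq(1, ncol(body), by = 3)   # first column of each triplet

# Turn every triplet into its own three-column data.frame
pieces <- lapply(starts, function(s) {
  data.frame(
    dia    = as.numeric(body[[s]]),
    compra = as.numeric(body[[s + 1]]),
    venta  = as.numeric(body[[s + 2]])
  )
})

# Stack the pieces, drop empty cells, and restore day order
tidy <- do.call(rbind, pieces)
tidy <- tidy[!is.na(tidy$dia), ]
tidy <- tidy[order(tidy$dia), ]
rownames(tidy) <- NULL

# Assumption: all days belong to October 2017
tidy$fecha <- as.Date(sprintf("2017-10-%02d", tidy$dia))
print(tidy)
```

Stacking by column visits the days out of order (3, 7, 4, 10 here), which is why the result is sorted by `dia` afterwards. The row-by-row loop does not need that step.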

Luis Avila

10/30/2017
