wdman: Basics

John D Harrison

2017-01-19

The goal of this vignette is to describe the basic functionality of the wdman package.

Introduction

wdman (Webdriver Manager) is an R package that allows the user to manage the downloading/running of third-party binaries relating to the webdriver/selenium projects. The package was inspired by a similar Node package, webdriver-manager.

The checking/downloading of binaries is handled by the binman package and the running of the binaries as processes is handled by the subprocess package.

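The wdman package currently manages the following binaries: the Selenium standalone binary, chromedriver, PhantomJS, geckodriver and iedriver.

Associated with the above are five functions to download/manage the binaries: selenium, chrome, phantomjs, gecko and iedriver.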

The driver functions take a number of common arguments (verbose, check, retcommand) which we describe:

Verbosity

Each of the driver functions has a verbose argument which controls message output to the user. If verbose = TRUE then messages are relayed to the user to inform them when drivers are checked, downloaded or run. The default value for the driver functions is TRUE.

selServ <- selenium(verbose = TRUE)

## checking Selenium Server versions:

## BEGIN: PREDOWNLOAD

## BEGIN: DOWNLOAD

## BEGIN: POSTDOWNLOAD

## checking chromedriver versions:

## BEGIN: PREDOWNLOAD

## BEGIN: DOWNLOAD

## BEGIN: POSTDOWNLOAD

## checking geckodriver versions:

## BEGIN: PREDOWNLOAD

## BEGIN: DOWNLOAD

## BEGIN: POSTDOWNLOAD

## checking phantomjs versions:

## BEGIN: PREDOWNLOAD

## BEGIN: DOWNLOAD

## BEGIN: POSTDOWNLOAD

selServ$stop()

## [1] TRUE

versus

selServ <- selenium(verbose = FALSE)
selServ$stop()

## [1] TRUE

Check for updates

Each driver function has a check argument. If check = TRUE the function will liaise with the driver repository for any updates. If new driver versions are available these will be downloaded. The binman package is used for this purpose.

Command line output

For diagnostic purposes each driver function has a retcommand argument. If retcommand = TRUE the command that would have been launched as a process is instead returned as a string. As an example:

selCommand <- selenium(retcommand = TRUE, verbose = FALSE, check = FALSE)
selCommand

## [1] "/usr/bin/java -Dwebdriver.chrome.driver=/home/john/.local/share/binman_chromedriver/linux64/2.27/chromedriver -Dwebdriver.gecko.driver=/home/john/.local/share/binman_geckodriver/linux64/0.13.0/geckodriver -Dphantomjs.binary.path=/home/john/.local/share/binman_phantomjs/linux64/2.1.1/phantomjs-2.1.1-linux-x86_64/bin/phantomjs -jar /home/john/.local/share/binman_seleniumserver/generic/3.0.1/selenium-server-standalone-3.0.1.jar -port 4567"
chromeCommand <- chrome(retcommand = TRUE, verbose = FALSE, check = FALSE)
chromeCommand

## [1] "/home/john/.local/share/binman_chromedriver/linux64/2.27/chromedriver --port=4567 --url-base=wd/hub --verbose"
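Because the returned value is a plain string, it can be inspected with base R before anything is launched. The following is a minimal sketch and not part of wdman: it splits the chromedriver command shown above into the binary path and its arguments, then recovers the port. It assumes that no path contains spaces, which holds for this example but not in general.

```r
# Command string copied from the chrome(retcommand = TRUE) output above.
chromeCommand <- "/home/john/.local/share/binman_chromedriver/linux64/2.27/chromedriver --port=4567 --url-base=wd/hub --verbose"

# Split on single spaces: the first token is the binary, the rest are arguments.
# Assumption: no spaces inside the path (true here, not guaranteed elsewhere).
parts <- strsplit(chromeCommand, " ", fixed = TRUE)[[1]]
binary <- parts[1]
args <- parts[-1]

# Recover the port from the --port=<n> argument.
portArg <- grep("^--port=", args, value = TRUE)
port <- as.integer(sub("^--port=", "", portArg))

basename(binary)
args
port
```

The same approach applies to the string returned by selenium(retcommand = TRUE), although its arguments use a different style (-Dname=value and -port 4567).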

Selenium Standalone

The selenium function manages the Selenium Standalone binary. It can check for updates at http://selenium-release.storage.googleapis.com/index.html and run the resulting binaries as processes.

Running the Selenium binary

The binary takes a port argument which defaults to port = 4567L. There are a number of optional arguments to use a particular version of the binaries related to browsers selenium may control. By default the selenium function will look to use the latest version of each.

selServ <- selenium(verbose = FALSE, check = FALSE)
selServ$process

## Process Handle
## command   : /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Dwebdriver.chrome.driver=/home/john/.local/share/binman_chromedriver/linux64/2.27/chromedriver -Dwebdriver.gecko.driver=/home/john/.local/share/binman_geckodriver/linux64/0.13.0/geckodriver -Dphantomjs.binary.path=/home/john/.local/share/binman_phantomjs/linux64/2.1.1/phantomjs-2.1.1-linux-x86_64/bin/phantomjs -jar /home/john/.local/share/binman_seleniumserver/generic/3.0.1/selenium-server-standalone-3.0.1.jar -port 4567
## system id : 105698
## state     : running

The selenium function returns a list of functions and a handle representing the running process.

The returned output, error and log functions give access to the stdout pipe, the stderr pipe and the cumulative stdout/stderr messages respectively.

selServ$log()

## $stderr
##  [1] "10:28:57.720 INFO - Selenium build info: version: '3.0.1', revision: '1969d75'"                                                                                          
##  [2] "10:28:57.721 INFO - Launching a standalone Selenium Server"                                                                                                              
##  [3] "2017-01-17 10:28:57.736:INFO::main: Logging initialized @186ms"                                                                                                          
##  [4] ""                                                                                                                                                                        
##  [5] "10:28:57.780 INFO - Driver provider org.openqa.selenium.ie.InternetExplorerDriver registration is skipped:"                                                              
##  [6] " registration capabilities Capabilities [{ensureCleanSession=true, browserName=internet explorer, version=, platform=WINDOWS}] does not match the current platform LINUX"
##  [7] "10:28:57.781 INFO - Driver provider org.openqa.selenium.edge.EdgeDriver registration is skipped:"                                                                        
##  [8] " registration capabilities Capabilities [{browserName=MicrosoftEdge, version=, platform=WINDOWS}] does not match the current platform LINUX"                             
##  [9] "10:28:57.781 INFO - Driver class not found: com.opera.core.systems.OperaDriver"                                                                                          
## [10] "10:28:57.782 INFO - Driver provider com.opera.core.systems.OperaDriver registration is skipped:"                                                                         
## [11] "Unable to create new instances on this machine."                                                                                                                         
## [12] "10:28:57.782 INFO - Driver class not found: com.opera.core.systems.OperaDriver"                                                                                          
## [13] "10:28:57.783 INFO - Driver provider com.opera.core.systems.OperaDriver is not registered"                                                                                
## [14] "10:28:57.784 INFO - Driver provider org.openqa.selenium.safari.SafariDriver registration is skipped:"                                                                    
## [15] " registration capabilities Capabilities [{browserName=safari, version=, platform=MAC}] does not match the current platform LINUX"                                        
## [16] "2017-01-17 10:28:57.815:INFO:osjs.Server:main: jetty-9.2.15.v20160210"                                                                                                   
## [17] ""                                                                                                                                                                        
## [18] "2017-01-17 10:28:57.836:INFO:osjsh.ContextHandler:main: Started o.s.j.s.ServletContextHandler@2ef5e5e3{/,null,AVAILABLE}"                                                
## [19] ""                                                                                                                                                                        
## [20] "2017-01-17 10:28:57.849:INFO:osjs.ServerConnector:main: Started ServerConnector@724af044{HTTP/1.1}{0.0.0.0:4567}"                                                        
## [21] ""                                                                                                                                                                        
## [22] "2017-01-17 10:28:57.851:INFO:osjs.Server:main: Started @301ms"                                                                                                           
## [23] ""                                                                                                                                                                        
## [24] "10:28:57.852 INFO - Selenium Server is up and running"                                                                                                                   
## 
## $stdout
## character(0)
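The list returned by log() can be searched like any other character vector. As a small sketch (the helper name server_is_up is hypothetical and not part of wdman), the following checks whether the start-up message shown above has appeared on stderr. The sample list reuses two lines from the output above so that the snippet runs without a server.

```r
# Hypothetical helper: TRUE once the Selenium start-up message is in the log.
# 'log' is expected to have the shape returned by selServ$log(), i.e. a list
# with character vectors named 'stderr' and 'stdout'.
server_is_up <- function(log) {
  any(grepl("Selenium Server is up and running", log$stderr, fixed = TRUE))
}

# Sample data copied from the log output above (abridged to two lines).
sampleLog <- list(
  stderr = c(
    "10:28:57.721 INFO - Launching a standalone Selenium Server",
    "10:28:57.852 INFO - Selenium Server is up and running"
  ),
  stdout = character(0)
)

server_is_up(sampleLog)

# An empty log means the message has not been seen yet.
server_is_up(list(stderr = character(0), stdout = character(0)))
```

With a running server the call would be server_is_up(selServ$log()).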

The stop function sends a signal that terminates the process:

selServ$stop()

## [1] TRUE
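Since a driver keeps running until stop is called, it can be convenient to guarantee the call even when an error occurs. The sketch below uses on.exit for this. The helper with_driver and the stand-in fake_start are hypothetical names introduced for illustration; the stand-in mimics only the stop element of the returned list, so the snippet runs without launching anything.

```r
# Hypothetical helper: start a driver, run 'action' on it, always stop it.
with_driver <- function(start, action) {
  drv <- start()
  # on.exit runs whether 'action' returns normally or signals an error.
  on.exit(drv$stop(), add = TRUE)
  action(drv)
}

# Stand-in for a wdman driver function: returns a list with a stop function
# that records that it was called.
stopped <- FALSE
fake_start <- function() {
  list(stop = function() {
    stopped <<- TRUE
    TRUE
  })
}

# The action fails, but the driver is still stopped.
result <- tryCatch(
  with_driver(fake_start, function(drv) stop("something went wrong")),
  error = function(e) conditionMessage(e)
)

result
stopped
```

With wdman the first argument could be, for example, function() selenium(verbose = FALSE, check = FALSE).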

Available browsers

By default the selenium function includes paths to chromedriver, geckodriver and phantomjs so that the Chrome, Firefox and PhantomJS browsers are available respectively. All versions (chromever, geckover etc) are given as “latest”. If the user passes a value of NULL for any driver it will be excluded.

On Windows operating systems the option to include the Internet Explorer driver is also given. This is set to iedrver = NULL, so it is not run by default. Set it to iedrver = "latest" or a specific version string to include it on Windows.
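As an illustration of excluding a driver, the sketch below builds the arguments for selenium with geckover = NULL and, if wdman is installed, asks for the command string rather than launching a process. Only arguments named in this vignette are used. The call is wrapped in tryCatch because it is expected to fail when the binaries have not been downloaded yet; that guard is an addition for this example, not something wdman requires.

```r
# Arguments for selenium(): exclude geckodriver and return the command only.
# list() keeps the NULL entry, so geckover = NULL is passed through do.call.
selArgs <- list(
  geckover = NULL,
  retcommand = TRUE,
  verbose = FALSE,
  check = FALSE
)

# Only call wdman when it is available; treat any failure as "no command".
selCommand <- NULL
if (requireNamespace("wdman", quietly = TRUE)) {
  selCommand <- tryCatch(
    do.call(wdman::selenium, selArgs),
    error = function(e) NULL
  )
}

# When a command was produced, check whether it still mentions geckodriver.
if (!is.null(selCommand)) {
  grepl("webdriver.gecko.driver", selCommand, fixed = TRUE)
}
```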

Issues with Windows and Firefox/GeckoDriver

This issue is fixed now.

To run the binaries related to the Selenium/webdriver projects wdman uses the R package subprocess. Currently the Windows version of this package uses blocking pipes when it instantiates a process. This causes issues with firefox/geckodriver when called from selenium. A “shim” is required as the stderr pipe is blocking and firefox/geckodriver waits for the pipe to free.

An example of implementing this shim for Windows can be seen in the RSelenium package. The rsDriver function currently implements such a shim. It basically clears the error pipe so that firefox/geckodriver can finish its startup.

Chrome Driver

The chrome function manages the Chrome Driver binary. It can check for updates at https://chromedriver.storage.googleapis.com/index.html and run the resulting binaries as processes.

The chrome function runs the Chrome Driver binary as a standalone process. It takes a default port argument port = 4567L. Users can then connect directly to the Chrome Driver to drive a Chrome browser.

Similarly to the selenium function the chrome function returns a list of four functions and a handle to the underlying running process.

cDrv <- chrome(verbose = FALSE, check = FALSE)
cDrv$process

## Process Handle
## command   : /home/john/.local/share/binman_chromedriver/linux64/2.27/chromedriver --port=4567 --url-base=wd/hub --verbose
## system id : 105886
## state     : running
cDrv$log()

## $stderr
## character(0)
## 
## $stdout
## [1] "Starting ChromeDriver 2.27.440175 (9bc1d90b8bfa4dd181fbbf769a5eb5e575574320) on port 4567"
## [2] "Only local connections are allowed."
cDrv$stop()

## [1] TRUE
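To connect directly, a client needs the driver's base URL. The command line above shows the port (--port=4567) and the URL base (--url-base=wd/hub), and the log notes that only local connections are allowed. A small sketch that assembles the URL from those pieces follows; the helper name driver_url and the host localhost are assumptions made for illustration.

```r
# Hypothetical helper: build the base URL a client would connect to.
# Defaults mirror the chromedriver command line shown above.
driver_url <- function(port = 4567L, base = "wd/hub", host = "localhost") {
  # Drop leading/trailing slashes so the pieces join cleanly.
  base <- gsub("^/+|/+$", "", base)
  sprintf("http://%s:%d/%s", host, as.integer(port), base)
}

driver_url()
driver_url(port = 4444L)
```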

PhantomJS

The phantomjs function manages the PhantomJS binary. It can check for updates at https://bitbucket.org/ariya/phantomjs/downloads and run the resulting binaries as processes.

The phantomjs function runs the PhantomJS binary as a standalone process in webdriver mode. It takes a default port argument port = 4567L. Users can then connect directly to the “ghostdriver” to drive a PhantomJS browser. Currently the default version is set to version = "2.1.1". At the time of writing 2.5.0-beta has been released. It currently does not have an up-to-date version of ghostdriver associated with it. For this reason it will be unstable/unpredictable to use it in webdriver mode.

Similarly to the selenium function the phantomjs function returns a list of four functions and a handle to the underlying running process.

pjsDrv <- phantomjs(verbose = FALSE, check = FALSE)
pjsDrv$process

## Process Handle
## command   : /home/john/.local/share/binman_phantomjs/linux64/2.1.1/phantomjs-2.1.1-linux-x86_64/bin/phantomjs --webdriver=4567 --webdriver-loglevel=INFO
## system id : 106709
## state     : running
pjsDrv$log()

## $stderr
## character(0)
## 
## $stdout
## [1] "[INFO  - 2017-01-17T19:16:41.984Z] GhostDriver - Main - running on port 4567"
pjsDrv$stop()

## [1] TRUE

Gecko Driver

The gecko function manages the Gecko Driver binary. It can check for updates at https://github.com/mozilla/geckodriver/releases and run the resulting binaries as processes.

The gecko function runs the Gecko Driver binary as a standalone process. It takes a default port argument port = 4567L. Users can then connect directly to the Gecko Driver to drive a Firefox browser. Currently the default version is set to version = "latest".

A very IMPORTANT point to note is that geckodriver implements the W3C webdriver protocol, which at the time of writing is not finalised. Currently packages such as RSelenium implement the JSON wire protocol which, whilst similar, expects different return values from the underlying driver.

The geckodriver implementation, like the W3C webdriver specification, is incomplete at this point in time.

Similarly to the selenium function the gecko function returns a list of four functions and a handle to the underlying running process.

gDrv <- gecko(verbose = FALSE, check = FALSE)
gDrv$process

## Process Handle
## command   : /home/john/.local/share/binman_geckodriver/linux64/0.13.0/geckodriver --port=4567 --log=info
## system id : 107783
## state     : running
gDrv$log()

## $stderr
## [1] "1484681330494\tgeckodriver\tINFO\tListening on 127.0.0.1:4567"
## 
## $stdout
## character(0)
gDrv$stop()

## [1] TRUE

IE Driver

The iedriver function manages the Internet Explorer Driver binary. It can check for updates at http://selenium-release.storage.googleapis.com/index.html and run the resulting binaries as processes (the iedriver is distributed currently with the Selenium standalone binary amongst other files).

The iedriver function runs the Internet Explorer Driver binary as a standalone process. It takes a default port argument port = 4567L. Users can then connect directly to the IE driver to drive an Internet Explorer browser.

Please note that additional settings are required to drive an Internet Explorer browser. Security settings and zoom level need to be set correctly in the browser. The author of this document also needed to set a registry entry (for IE 11). This is outlined at https://github.com/SeleniumHQ/selenium/wiki/InternetExplorerDriver in the required configuration section.

Similarly to the selenium function the iedriver function returns a list of four functions and a handle to the underlying running process.

ieDrv <- iedriver(verbose = FALSE, check = FALSE)
ieDrv$process

## Process Handle
## command   : C:\Users\john\AppData\Local\binman\binman_iedriverserver\win64\3.0.0\IEDriverServer.exe /port=4567 /log-level=FATAL /log-file=C:\Users\john\AppData\Local\Temp\RtmpqSdw94\file5247395f2a.txt
## system id : 7484
## state     : running
ieDrv$log()

## $stderr
## character(0)
## 
## $stdout
## [1] "Started InternetExplorerDriver server (64-bit)"                                          
## [2] "3.0.0.0"                                                                                 
## [3] "Listening on port 4567"                                                                  
## [4] "Log level is set to FATAL"                                                               
## [5] "Log file is set to C:\\Users\\john\\AppData\\Local\\Temp\\RtmpqSdw94\\file5247395f2a.txt"
## [6] "Only local connections are allowed"
ieDrv$stop()

## [1] TRUE

Issues and problems

If you experience issues or problems running one of the drivers/functions, first try running the command in a terminal on your OS.

You can access the command to run by using the retcommand argument in each of the main package functions.

If you continue to have problems consider posting an issue at https://github.com/johndharrison/wdman/issues