The goal of this document is to help you implement a new backend for DBI.

If you are writing a package that connects a database to R, I highly recommend that you make it DBI compatible because it makes your life easier by spelling out exactly what you need to do. The consistent interface provided by DBI makes it easier for your implement the package (because you have fewer arbitrary choices to make), and easier for your users (because it follows a familiar pattern).

I’ll illustrate the process using a fictional database called Kazam.

Getting started

Start by creating a package. It’s up to you what to call the package, but following the existing pattern of RSQLite, RMySQL, RPostgreSQL and ROracle will make it easier for people to find it. For this example, I’ll call my package RKazam.

In your DESCRIPTION, make sure to include:

Depends:
  DBI (>= 0.3.0),
  methods

This is one of the few situations in which using depends (instead of imports) is correct, because you want to make sure the generics from the DBI package (like dbConnect(), dbGetQuery() etc) are available.

Driver

Start by making a driver class which inherits from DBIDriver. This class doesn’t need to do anything, it’s just used to dispatch other generics to the right method. Users don’t need to know about this, so you can remove it from the default help listing with @keywords internal:

#' Driver for Kazam database.
#' 
#' @keywords internal
#' @export
setClass("KazamDriver", contains = "DBIDriver")

The driver class was more important in older versions of DBI, so you should also provide a dummy dbUnloadDriver() method.

#' @export
#' @rdname Kazam-class
setMethod("dbUnloadDriver", "KazamDriver", function(drv, ...) {
  TRUE
})
#> [1] "dbUnloadDriver"

If your package needs global setup or tear down, do this in the .onLoad() and .onUnload() functions.

You might also want to add a show method so the object prints nicely:

setMethod("show", "KazamDriver", function(object) {
  cat("<KazamDriver>\n")
})
#> [1] "show"

Next create Kazam() which instantiates this class.

#' @export
Kazam <- function() {
  new("KazamDriver")
}

Kazam()
#> <KazamDriver>

Connection

Next create a connection class that inherits from DBIConnection. This should store all the information needed to connect to the database. If you’re talking to a C api, this will include a slot that holds an external pointer.

#' Kazam connection class.
#' 
#' @export
#' @keywords internal
setClass("KazamConnection", 
  contains = "DBIConnection", 
  slots = list(
    host = "character", 
    username = "character", 
    # and so on
    ptr = "externalptr"
  )
)

Now you have some of the boilerplate out of the way, you can start work on the connection. The most important method here is dbConnect() which allows you to connect to a specified instance of the database. Note the use of @rdname Kazam. This ensures that Kazam() and the connect method are documented together.

#' @param drv An object created by \code{Kazam()} 
#' @rdname Kazam
#' @export
#' @examples
#' \dontrun{
#' db <- dbConnect(RKazam::Kazam())
#' dbWriteTable(db, "mtcars", mtcars)
#' dbGetQuery(db, "SELECT * FROM mtcars WHERE cyl == 4")
#' }
setMethod("dbConnect", "KazamDriver", function(drv, ...) {
  # ...
  
  new("KazamConnection", host = host, ...)
})
#> [1] "dbConnect"

Next, implement show() and dbDisconnect() methods.

Results

Finally, you’re ready to implement the meat of the system: fetching results of a query into a data frame. First define a results class:

#' Kazam results class.
#' 
#' @keywords internal
#' @export
setClass("KazamResult", 
  contains = "DBIResult",
  slots = list(ptr = "externalptr")
)

Then write a dbSendQuery() method. This takes a connection and SQL string as arguments, and returns a result object. Again ... is needed for compatibility with the generic, but you can add other arguments if you need them.

#' Send a query to Kazam.
#' 
#' @export
#' @examples 
#' # This is another good place to put examples
setMethod("dbSendQuery", "KazamConnection", function(conn, statement, ...) {
  # some code
  new("KazamResult", ...)
})
#> [1] "dbSendQuery"

The hardest part of every DBI package is writing the dbFetch() method. This needs to take a result set and (optionally) number of records to return, and create a dataframe:

#' Retrieve records from Kazam query
#' @export
setMethod("dbFetch", "KazamResult", function(res, n = -1, ...) {
  ...
})
#> [1] "dbFetch"

Finally implement, dbClearResult():

#' @export
setMethod("dbClearResult", "KazamResult", function(res, ...) { 
  
})
#> [1] "dbClearResult"

Next implement dbHasCompleted() which should return a boolean indicating if there are any rows remaining to be fetched.

#' @export
setMethod("dbHasCompleted", "KazamResult", function(res, ...) { 
  
})
#> [1] "dbHasCompleted"

With these four methods in place, you can now use the default dbGetQuery() to send a query to the database, retrieve results if available and then clean up. Spend some time now making sure this works with an existing database.

SQL methods

You’re now on the home stretch, and can make your wrapper substantially more useful by implementing methods that wrap around variations in SQL across databases:

Metadata methods

Finally there are a lot of extra metadata methods that you might want to implement. However, I’ve run out of energy to list them all - instead look at the source code for the DBI package.