A good coding style is like correct punctuation: you can manage without it, but it makes things much easier to read. This guide describes the style that I use. It is based on Google’s R style guide and Hadley Wickham’s “Advanced R”, with a few tweaks. The goal of this guide is to make our R code easier to read, share, and verify.
File names should be meaningful and end in .R.
# Good
fit-models.R
utility_functions.R
# Bad
foo.r
stuff.r
If files need to be run in sequence, prefix them with numbers:
0-download.R
1-parse.R
2-explore.R
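Numeric prefixes also make it easy to run the scripts in order. The following is a minimal sketch, not part of the original guide: the helper name list_numbered_scripts and the directory argument scripts_dir are hypothetical. It sorts by the numeric value of the prefix, so 10-report.R comes after 2-explore.R even without zero-padding.

```r
# Hypothetical helper: list numbered scripts in a directory in run order.
list_numbered_scripts <- function(scripts_dir = ".") {
  files <- list.files(scripts_dir, pattern = "^[0-9]+-.*\\.R$")
  # Sort by the numeric prefix rather than alphabetically,
  # so that 10- follows 9- instead of 1-.
  prefix <- as.integer(sub("-.*$", "", files))
  files[order(prefix)]
}

# Run each script in sequence (uncomment to execute):
# for (f in list_numbered_scripts("scripts")) source(file.path("scripts", f))
```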
“There are only two hard things in Computer Science: cache invalidation and naming things.” - Phil Karlton
Variable and function names should be lowercase. Use an underscore (_) to separate words within a name. Generally, variable names should be nouns and function names should be verbs. Strive for names that are concise and meaningful (this is not easy!).
# Good
day_one
day_1
# Bad
first_day_of_the_month
dayOne
dayone
djm1
Where possible, avoid using names of existing functions and variables. Doing so will cause confusion for the readers of your code.
# Bad
T <- FALSE
c <- 10
mean <- function(x) sum(x)
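To see why this is confusing, here is a small sketch of what happens once mean is redefined. The variable names (x, masked_result, and so on) are only for illustration.

```r
x <- c(1, 2, 3)

# Redefining mean() in the current environment masks base::mean().
mean <- function(x) sum(x)
masked_result <- mean(x)      # 6: the sum, not the average

# The original is still reachable through its namespace.
base_result <- base::mean(x)  # 2

# Removing the masking definition restores the usual behaviour.
rm(mean)
restored_result <- mean(x)    # 2
```

A reader who sees mean(x) has no reason to suspect it returns a sum, which is exactly the problem.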
Place spaces around all infix operators (=, +, -, <-, etc.). The same rule applies when using = in function calls. Always put a space after a comma, and never before (just like in regular English).
# Good
average <- mean(feet / 12 + inches, na.rm = TRUE)
# Bad
average<-mean(feet/12+inches,na.rm=TRUE)
There’s a small exception to this rule: :, :: and ::: don’t need spaces around them.
# Good
x <- 1:10
base::get
# Bad
x <- 1 : 10
base :: get
Place a space before left parentheses, except in a function call.
# Good
if (debug) do(x)
plot(x, y)
# Bad
if(debug)do(x)
plot (x, y)
Extra spacing (i.e., more than one space in a row) is ok if it improves alignment of equal signs or assignments (<-).
list(
  total = a + b + c,
  mean  = (a + b + c) / n
)
Do not place spaces around code in parentheses or square brackets (unless there’s a comma, in which case see above).
# Good
if (debug) do(x)
diamonds[5, ]
# Bad
if ( debug ) do(x) # No spaces around debug
x[1,] # Needs a space after the comma
x[1 ,] # Space goes after comma not before
An opening curly brace should never go on its own line and should always be followed by a new line. A closing curly brace should always go on its own line, unless it’s followed by else.
Always indent the code inside curly braces.
# Good
if (y < 0 && debug) {
  message("Y is negative")
}
if (y == 0) {
  log(x)
} else {
  y ^ x
}
# Bad
if (y < 0 && debug)
message("Y is negative")
if (y == 0) {
  log(x)
}
else {
  y ^ x
}
It’s ok to leave very short statements on the same line:
if (y < 0 && debug) message("Y is negative")
Strive to limit your code to 80 characters per line. This fits comfortably on a printed page with a reasonably sized font. If you find yourself running out of room, this is a good indication that you should encapsulate some of the work in a separate function.
When indenting your code, use two spaces. Never use tabs or mix tabs and spaces.
The only exception is if a function definition runs over multiple lines. In that case, indent the second line to where the definition starts:
long_function_name <- function(a = "a long argument",
                               b = "another argument",
                               c = "another long argument") {
  # As usual code is indented by two spaces.
}
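The line-length and tab rules above are easy to check mechanically. The following is a rough sketch only: check_layout is a hypothetical helper, not an existing tool, and it takes the lines of a script as a character vector.

```r
# Hypothetical helper: report lines that break the width or tab rules.
check_layout <- function(lines, max_width = 80) {
  list(
    too_long = which(nchar(lines) > max_width),  # indices of over-long lines
    has_tab  = grep("\t", lines, fixed = TRUE)   # indices of lines with tabs
  )
}

code <- c(
  "x <- 1",
  "\ty <- 2",
  paste(rep("a", 81), collapse = "")
)
layout_report <- check_layout(code)
```

Here layout_report$too_long is 3 and layout_report$has_tab is 2. For a real file, pass in readLines("fit-models.R") instead of the toy vector.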
Use <-, not =, for assignment.
# Good
x <- 5
# Bad
x = 5
Comment your code. Each line of a comment should begin with the comment symbol and a single space: #. Comments should explain the why, not the what. Short comments can be placed after code preceded by two spaces, #, and then one space.
Use commented lines of - and = to break up your file into easily readable chunks.
# Load data ---------------------------
# Plot data ---------------------------
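Putting the comment rules together, here is a short illustration with made-up data (feet, inches, and the other names are only for this example). The block comment explains why the conversion happens, the trailing comment is preceded by two spaces, and a dashed line separates the chunks.

```r
# Convert heights to inches ------------------------------

# The downstream summary expects a single numeric column, so combine
# the two recorded parts into one value.
feet <- c(5, 6)
inches <- c(10, 2)
height <- feet * 12 + inches  # 12 inches per foot

# Summarise heights --------------------------------------

average_height <- mean(height)
```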
Function definitions should first list arguments without default values, followed by those with default values.
In both function definitions and function calls, multiple arguments per line are allowed; line breaks are only allowed between assignments.
# Good
PredictCTR <- function(query, property, num.days,
                       show.plot = TRUE)
# Bad
PredictCTR <- function(query, property, num.days, show.plot =
                       TRUE)
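The argument-ordering rule can be illustrated with a small sketch. The function rescale and its arguments are hypothetical, chosen only to show required arguments first and defaulted arguments last.

```r
# Hypothetical function: x and center are required, so they come first;
# divisor and na_rm have defaults, so they come last.
rescale <- function(x, center, divisor = 1, na_rm = FALSE) {
  if (na_rm) x <- x[!is.na(x)]
  (x - center) / divisor
}

# Callers pass the required arguments by position
# and the optional ones by name.
plain <- rescale(c(2, 4, 6), 4)
halved <- rescale(c(2, NA, 6), 4, divisor = 2, na_rm = TRUE)
```

With this ordering a caller can ignore the optional arguments entirely, and any that are set are named at the call site.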
Functions should contain a comments section immediately below the function definition line. These comments should consist of a one-sentence description of the function; a list of the function’s arguments, denoted by Args:, with a description of each (including the data type); and a description of the return value, denoted by Returns:. The comments should be descriptive enough that a caller can use the function without reading any of the function’s code.
CalculateSampleCovariance <- function(x, y, verbose = TRUE) {
  # Computes the sample covariance between two vectors.
  #
  # Args:
  #   x: One of two vectors whose sample covariance is to be calculated.
  #   y: The other vector. x and y must have the same length, greater than one,
  #      with no missing values.
  #   verbose: If TRUE, prints sample covariance; if not, not. Default is TRUE.
  #
  # Returns:
  #   The sample covariance between x and y.
  n <- length(x)
  # Error handling
  if (n <= 1 || n != length(y)) {
    stop("Arguments x and y must have the same length, greater than one: ",
         length(x), " and ", length(y), ".")
  }
  if (anyNA(x) || anyNA(y)) {
    stop("Arguments x and y must not have missing values.")
  }
  covariance <- var(x, y)
  if (verbose) {
    cat("Covariance = ", round(covariance, 4), ".\n", sep = "")
  }
  return(covariance)
}
Enjoy R!