March 4, 2016

Quotes

"to err is human; to forgive, divine" - Alexander Pope, "Essay on Criticism"

"to err is human; to really screw up, you need a computer" - Internet

"Programs must be written for people to read, and only incidentally for machines to execute." ― Harold Abelson, Structure and Interpretation of Computer Programs

Presentation overview

  • Overview and intended audience
  • Introduction
  • Common sources of error messages
    • Data types & structures
    • Data wrangling
  • Additional resources
  • Contact

Intended audience

  • This presentation is for beginners to the R language who want to overcome the fear of decoding error messages. It aims to build portable skills for assessing code and minimizing downtime caused by R's confusing and cryptic errors.

Introduction

Everyone makes mistakes

  • Errors (and their messages) happen to everyone, even experienced programmers.
  • Learning how to interpret and solve errors will help us write code more efficiently.
  • Developing skills to cope with the frustration of error messages will make writing code more enjoyable.

All the feels

Describe your feelings about this picture

B.R.I.E Method

it's cheesy, I know

  • A first step toward confidently approaching error messages is to follow the B.R.I.E Method:
    • Breathe and relax your mind
    • Re-read the error message and focus on key wording
    • Isolate the Error within your code to reduce complexity
  • Hopefully this method can act as a guide to assessing and solving errors in your R code (and in other programming languages as well!)
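
The "Isolate the Error" step could look something like the sketch below. The variables x and label are made up for illustration; the idea is to capture the message, then run each piece of the failing line on its own until the culprit shows up.

```r
# Hypothetical example data (not from the slides)
x <- c(10, 20, 30)
label <- "2"                       # looks like a number, but is stored as text

# Re-read: capture the error message instead of letting the script stop
msg <- tryCatch(
  sum(x) / length(x) * label,
  error = function(e) conditionMessage(e)
)
msg                                # the text of the error message

# Isolate: evaluate the smaller pieces one at a time
part1 <- sum(x) / length(x)        # this part runs without an error
is.numeric(part1)                  # TRUE
is.numeric(label)                  # FALSE, so label is the culprit
```

Once the failing piece is this small, the message is usually much easier to match to a cause.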

Common sources of error messages

Data types & structures

  • Many error messages originate from improper manipulation of data structures.
  • Base R has 5 common data structures:
    • Atomic vector: the common types are logical, integer, double (often called numeric), and character.
    • Matrix
    • Array
    • List
    • Data Frame
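
As a rough sketch, each of these structures can be created and inspected with base R functions. The object names and values below are made up for illustration.

```r
# Illustrative objects, one per structure (names and values are hypothetical)
v <- c(1.5, 2, 3)                                # atomic vector of doubles
m <- matrix(1:6, nrow = 2)                       # matrix: 2 dimensions, one type
a <- array(1:24, dim = c(2, 3, 4))               # array: any number of dimensions
l <- list(1, "a", TRUE)                          # list: elements may differ in type
d <- data.frame(x = 1:3, y = c("a", "b", "c"))   # data frame: columns of equal length

typeof(v)          # "double"
is.matrix(m)       # TRUE
dim(a)             # 2 3 4
is.list(l)         # TRUE
is.data.frame(d)   # TRUE
str(d)             # compact summary of the columns and their types
```

Checking an object with typeof(), class(), or str() before using it is a quick way to confirm that it is the structure you think it is.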

Error #1: non-numeric argument to binary operator

name_string <- "Jasmine"
name_string * 2
## Error in name_string * 2: non-numeric argument to binary operator
  • To decode this error, focus on the key phrase 'binary operator'. A binary operator takes two operands, and multiplication requires a numeric value on each side of the * symbol.
  • The error tells the user that one of the values supplied to the multiplication is not a number.
  • Note: A unary operator takes a single operand, such as a + (positive) or - (negative) in front of a number.
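
One possible way to diagnose and work around this error is sketched below. It assumes either that the text was meant to be repeated, or that a number was accidentally stored as text; num_string is a hypothetical variable added for illustration.

```r
name_string <- "Jasmine"

# Diagnose: check the type before doing arithmetic
is.numeric(name_string)            # FALSE
class(name_string)                 # "character"

# If the goal was to repeat the text, use string functions instead of *
repeated <- paste(rep(name_string, 2), collapse = "")
repeated                           # "JasmineJasmine"

# If the value is a number stored as text, convert it first
num_string <- "21"                 # hypothetical value
doubled <- as.numeric(num_string) * 2
doubled                            # 42
```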

Error #2: invalid 'type' (character) of argument

vec <- c(1, 2, 3, 4, 5, "hello world")
sum(vec)
## Error in sum(vec): invalid 'type' (character) of argument
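  • This error states that values of type character cannot be passed to the aggregate function sum().
  • Once again this is an improper mixing of data types: c() coerces every element of an atomic vector to a single type, so including "hello world" turns the numbers into character strings.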
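
The coercion can be seen by inspecting the vector. The clean-up shown afterwards is only one possible approach, and it assumes the non-numeric entry should simply be ignored.

```r
vec <- c(1, 2, 3, 4, 5, "hello world")

# Diagnose: the whole vector is character, not just the last element
class(vec)                         # "character"
vec                                # "1" "2" "3" "4" "5" "hello world"

# One possible fix: convert to numeric; entries that cannot be converted become NA
nums <- suppressWarnings(as.numeric(vec))
nums                               # 1 2 3 4 5 NA

# Sum the values, skipping the NA
total <- sum(nums, na.rm = TRUE)
total                              # 15
```

Dropping values silently can hide real data problems, so it is worth checking which entries became NA before summing.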

Data wrangling

  • Data wrangling and cleaning can take up the majority of the time in a data science workflow.
  • Many types of errors can arise when subsetting and manipulating multi-dimensional data objects.
  • With so many packages available, it is common to get flustered by function-specific errors, such as a function requiring an input argument to be a list instead of a data frame.

Error #3: all arguments must have the same length

df <- data.frame("age" = c(32, 31, 34), 
                 "pet" = c(FALSE, TRUE, TRUE), 
                 "name" = c("Adam", "Blake", "Anders"))
table(df$age, df$pets)
## Error in table(df$age, df$pets): all arguments must have the same length
  • The error says that the two arguments supplied to table() are not the same length, which cross tabulation requires.
  • The misspelled column (pets instead of pet) does not exist, so df$pets returns NULL, which has a length of 0. That does not match df$age, which has a length of 3.
  • The error message does not indicate that the column name was misspelled.
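
A short sketch of how the misspelling could be tracked down: inspect the column names and check what df$pets actually returns. The bracket-indexing check is an additional suggestion, not part of the original slides.

```r
df <- data.frame(age  = c(32, 31, 34),
                 pet  = c(FALSE, TRUE, TRUE),
                 name = c("Adam", "Blake", "Anders"))

# Diagnose: list the real column names and test the suspicious one
names(df)                          # "age" "pet" "name"
"pets" %in% names(df)              # FALSE
is.null(df$pets)                   # TRUE: $ returns NULL for a missing column
length(df$pets)                    # 0

# Bracket indexing by name fails with a more direct message
msg <- tryCatch(df[, "pets"], error = function(e) conditionMessage(e))
msg                                # mentions "undefined columns selected"

# With the correct column name the cross tabulation works
tab <- table(df$age, df$pet)
tab
```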

more to come …

Additional resources

Contact Information

Jasmine Dumas (@jasdumas)

fin