
Note: this is only a page for testing paged.js. If you are interested in the content, please read the original post.


Last week I learned about an interesting JupyterCon talk given by Joel Grus titled “I Don’t Like Notebooks”. I applaud the open-mindedness of the conference committee in inviting Joel to give such a talk to (presumably a great many) notebook lovers. I agree with Joel that criticizing a popular tool is not “an unhelpful way to spend time” (see slide #4). I wish there were more criticism of popular tools in the R world, and personally I’d welcome criticism and complaints about my own work.1

Joel’s talk triggered a lot of discussion on Twitter (come on, people, let’s get off Twitter whenever there is something worth a deep discussion). For example, Hilary Parker doesn’t like notebooks, either. Philip Guo declared that the summer of 2018 ended with “the first great notebook war”. “A war between whom and whom?” you may ask. I guess it was a war between notebooks and editors/IDEs.

Before I start this post, I want to mention that I know little about Python (I have written one short Python script in total in my life), and I don’t use Jupyter myself. In fact, I also rarely use any notebooks, including R Markdown notebooks. That is mainly because I’m a software developer (engineer). For this reason, I think I can understand Joel’s complaints about notebooks pretty well. My main point is that if you use notebooks for software engineering, you are probably using the wrong tool, no matter how popular it is. I’m not sure if Joel would agree with me, but if I were to give this talk, that would be my main message to the audience.

While reading Joel’s critiques of Jupyter notebooks, I couldn’t help wondering whether they apply to R Markdown notebooks, or R Markdown documents in general, so in this post I’ll also mention how some of the problems have been addressed in the R Markdown ecosystem.

1 The two cultures: the R vs Python culture, or data analysis vs software engineering culture

I feel a major difference between the R culture and Python culture is that Python users seem to create code more often, whereas R users often use code. There seems to be a strong atmosphere of software engineering in the Python world: in the beginning was the custom class (with methods). For R users, in the beginning was the data.

I think notebooks are more suitable for a world where the data analysis culture is stronger than the engineering culture. Joel insists that even if you are only experimenting or prototyping, you should follow good software engineering rules (slides #46-49). I tend to disagree, because prototyping is prototyping, and engineering is engineering. Good software engineering is important, but I don’t think it is necessary to write unit tests or factor out code at the prototyping stage. It is fine to do these things later, but again, I agree with Joel that if you are going to develop software seriously later, you’d better leave the notebook and use a real editor or IDE instead (e.g., to write reusable modules or packages).

How about doing data analysis in notebooks? Is “good software engineering” relevant? I’d argue that it is not highly relevant, and it is fine to use notebooks. Analyzing data and developing software are different in several aspects. The latter is meant to create generally useful and reusable products, but the former is often not generalizable—you only analyze a specific dataset, you want to draw conclusions from the specific data, and you may not be interested in or have the time to make your code reusable by other people (or it may not be possible). When the analysis is done, it is done (pretty much). Of course, if you happen to discover any parts that could be potentially reusable and useful, you could factor them out into a package, but I think this should be relatively rare—how many packages like dplyr have been abstracted or distilled from the process of data analysis? Not many.

How would you write unit tests for data analysis? I feel it will be both tricky and unnecessary. For a function/method, if you defined it, you know what its expected output should be. For data, you often don’t know what exactly to expect in the output. For example, when you subset a dataset, how do you know the result is correct?

That is probably not something you, as a data analyst, need to worry about. It is the responsibility of the package author (the software engineer) to write enough unit tests in the package that you are using.

On the other hand, data analysts often do tests in an informal way, too. As they explore the data, they may draw plots or create summary tables, in which they may be able to discover problems (e.g., wrong categories, outliers, and so on). Notebooks are great for these inline output elements, from which you can make quick discoveries.
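To make the idea of informal checks concrete, here is a minimal sketch in Python of what such exploration might look like. The toy dataset, the column names, and the "five times the median" rule are all made up for illustration; a real analysis would more likely use a plot or a data frame summary.

```python
from collections import Counter
from statistics import median

# Hypothetical toy dataset; the column names and values are invented.
rows = [
    {"species": "setosa", "petal_length": 1.4},
    {"species": "setosa", "petal_length": 1.5},
    {"species": "Setosa", "petal_length": 1.3},      # inconsistent capitalization
    {"species": "virginica", "petal_length": 5.5},
    {"species": "virginica", "petal_length": 55.0},  # probably a typo
    {"species": "virginica", "petal_length": 5.8},
]


def category_counts(rows, column):
    """Summary table: how often each category appears."""
    return Counter(row[column] for row in rows)


def suspicious_values(rows, column, factor=5.0):
    """Flag values above `factor` times the median (a crude rule of thumb)."""
    values = [row[column] for row in rows]
    cutoff = factor * median(values)
    return [value for value in values if value > cutoff]


counts = category_counts(rows, "species")
outliers = suspicious_values(rows, "petal_length")

print(counts)    # a stray "Setosa" category stands out in the table
print(outliers)  # the implausible measurement is flagged
```

Neither check is a unit test in the engineering sense. They are simply quick looks at the data, and a notebook shows their output right next to the code that produced it.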

2 What do notebooks and spreadsheets have in common?

Or, to ask it another way, why are notebooks so popular? My answer is that they make the (typically boring and abstract) program code much more tangible. You can see the output of each snippet of source code right below the snippet itself. Although Joel’s #1 complaint about notebooks is that it is hard to reason about the output due to the possible out-of-order execution of cells, I think being able to see the output inside the source document can actually help you reason about the code better. Yes, you can use the traditional way of running code line by line and see the output in the console, but the distance between source and output is much longer. Imagine you create ten plots in ten snippets of code: it will be quite messy to have ten separate plot windows floating around on your screen, and it will be hard to know which piece of code created which plot. It will be worse if a plot doesn’t have a title, in which case you may not even remember what story a particular plot is supposed to tell.

I think notebooks are popular for the same reason that explains the popularity of spreadsheets such as Excel. I haven’t met a single software engineer who loves Excel. Everyone hates it and makes fun of it, but why do so many users still use it? Again, Excel makes things tangible. You can touch the data (although it is usually a very bad idea), and draw graphics in a sheet that contains the source data (bad idea again). It makes you feel everything is well under your control: oh here is my data, and here is a graph next to it; oh I should use that column to draw the graph instead, so let me change it and I can see the updated graph immediately.2 You can do everything in a single place, and the short distance between the source (data) and the output is ace.

Excel makes things tangible at the price of making things messy (e.g., it may contain manually edited data that is hard to keep track of, or merged cells or graphs that make it hard for other software to read the data). By comparison, although notebooks can mess up the state, that is only an intermediate problem. At their core, notebooks are still relatively clean and encourage the reproducibility principle, i.e., you should use code to generate results automatically instead of manually copying and pasting results into your report. If you are concerned about the state, you can restart the session and recompile the whole notebook from scratch. Spreadsheets are often hopeless here: you cannot easily restart your brain and redo exactly the same things.

3 Things Joel likes about notebooks

Joel mentioned two things he liked about notebooks: well-documented code (and the idea of mixing Markdown and code) and inline plots. I want to expand on the first point a bit. The idea should be attributed to literate programming, a programming paradigm invented by Donald Knuth. Knuth probably didn’t expect its popularity in tools for data analysis decades later, but I absolutely love the idea of writing narratives in a specialized authoring language (such as Markdown or LaTeX) instead of in traditional code comments. Using an authoring language makes the narratives easier to read. By comparison, code comments are always boring plain text with no structure or rich-text elements. Literate programming is more suitable in the data analysis culture.

4 Joel’s complaints about notebooks

If I have read Joel’s slides carefully enough and counted correctly, he mentioned the following 11 problems with Jupyter notebooks.

4.1 Hidden state and out-of-order execution

I agree that when this happens, the output can be highly confusing, but I think the biggest issue his arguments bring to light is that all those “notable tutorials”, “definitive guide”, and “comprehensive beginner’s guide” fail to mention the serious problem of hidden state. I care a lot about reproducibility, and so does RStudio. We decided on Day One that R Markdown documents should be compiled in new R sessions, which (mostly) gets rid of the problems of hidden state and out-of-order execution. Although many users have complained about it and wanted to run R Markdown documents in the current R session instead, we have never changed that decision. Since we created R Markdown notebooks, we have also been constantly alerting users to the potential problems of notebooks. For example, in our book “R Markdown: The Definitive Guide”, we emphasized this issue over and over again:

[…] Again, for the sake of reproducibility, you will need to compile the whole document eventually in a clean environment.

Joel admitted that “the ipython console has plenty of hidden state, too” (slide #26) but he “can scroll back and see the command history to reason about the state” (slide #27). Jupyter provides the %history magic to print the command history, but Joel found it too much hassle and counterintuitive. I want to mention that this isn’t a problem for R Markdown notebooks. When you run cells (we call them “code chunks” in the R world), you can always see the history in the “History” pane in RStudio. For example, I ran the second cell before the first one, and you can tell that from the right pane below:

Command history of R Markdown notebooks in RStudio


Anyway, I think as long as Jupyter users are educated well enough to develop the habit of recompiling the whole notebook from scratch in the end instead of just leaving the results from out-of-order execution as the final product, Jupyter is probably fine.
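The difference between an interactive session and a clean recompile can be illustrated with a small simulation. This is only a sketch, not how Jupyter or R Markdown works internally: each string stands in for one hypothetical cell, a plain dictionary plays the role of the session state, and `run_cells` is a helper invented for this example.

```python
# Each string stands in for one notebook cell (hypothetical contents).
cells = [
    "x = 1",
    "x = x + 1",
    "result = x * 10",
]


def run_cells(cells, order, namespace=None):
    """Execute cells in the given order inside one shared namespace.

    The namespace plays the role of the session (kernel) state.
    """
    namespace = {} if namespace is None else namespace
    for index in order:
        exec(cells[index], namespace)
    return namespace


# Interactive session: the second cell was run twice before the last one.
session = run_cells(cells, order=[0, 1, 1, 2])
interactive_result = session["result"]

# Clean recompile: a fresh namespace, every cell once, top to bottom.
clean = run_cells(cells, order=range(len(cells)))
clean_result = clean["result"]

print(interactive_result, clean_result)
```

The two results differ because the interactive session carries state that the visible cells no longer explain, which is why the final product should come from the clean, top-to-bottom run.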

4.2 Notebooks are difficult for beginners

This (slides #33-43) is essentially a corollary of the hidden state issue above. Joel suggested that the (beginner’s) tutorials scream out loud “DON’T RUN YOUR CELLS OUT OF ORDER YOU FOOL”. I can see the point, but honestly this advice is not strictly true. Some cells may be independent and introduce no (hidden) state, so they can be run in arbitrary order, which is sometimes a benefit (you can work on a small part of the notebook at a time). Allowing users to run cells in an arbitrary order doesn’t necessarily mean giving them enough rope to hang themselves. Even when you script in a text editor or IDE, I don’t believe you always run the code in linear order from beginning to end; you may still need to focus on or iterate on certain parts from time to time.

4.3 Notebooks encourage bad habits

Joel mentioned three bad habits encouraged by notebooks:

  1. Unnamed notebooks like Untitled24.ipynb, Untitled25.ipynb, etc. I don’t believe this problem is specific to Jupyter notebooks. Everyone must have a bunch of these files (Untitled.docx, Untitled.pptx, Untitled23.R, Untitled5.Rmd, …). Sometimes it is just that the file is not worth saving, and sometimes the user is just too busy or lazy to name a file, which is completely understandable.

  2. Notebooks don’t encourage users to follow good software engineering rules. I tend to agree with the person (whose name was blacked out on slide #46) that data science is not about creating software. However, Joel’s “data science” might involve more “creating software” than others. After all, who knows what “data science” really means…3 Of course, if your data science is about creating software, you should definitely follow software engineering rules.

  3. Notebooks encourage users to import notebooks instead of writing modules or libraries. Joel’s complaint here (slides #50-53) makes perfect sense to me.4 If you want to import code, import code (instead of a notebook). That’s it.

    That said, importing a notebook into another is not necessarily a totally bad idea, if you actually import the whole thing (not only the code but also the text). This feature is known as “child documents” in the R Markdown world (actually in the broader knitr world). I think this is a very useful feature, because you can break a giant notebook into smaller ones, on which you can work individually. In the end, you can compile the master notebook, which will run the child notebooks.

4.4 Notebooks discourage modularity and testing

As I said earlier, if you use notebooks to develop software, you are probably using the wrong tool.

4.5 Jupyter’s autocomplete, linting, and way of looking up the help are awkward

Again, don’t use notebooks to develop software. However, autocomplete and linting are also very helpful even when you are not developing software. I cannot say anything for Jupyter, but for R Markdown notebooks, you get autocomplete and linting “for free” from the RStudio IDE, e.g.,