a workflow for slides and papers

sebastian barfort
NUMEDIG, april 2014

Writing academic papers is a pain

  • not only do you need a good idea
  • you also need to turn that idea into a nice looking product
  • it's a bit like sausage making in the sense that little attention is paid to how slides and papers are made
  • my focus today is this process
    • most of you probably use either MS Word or LaTeX?
    • I'll try to make the case for a more stripped down text file approach

What do we want?

  • academic papers
  • nice looking slides to sell our ideas

How do we want them?

  • today: primarily pdf
  • in the future (but the future is now): also html

The perfect workflow

input

  • text
  • equations
  • plots
  • tables
  • citations

output

  • pdf or html
  • because of some unexplainable facts about the world: docx
  • we do not want to worry about citations, equations, etc.
  • we do not want to do any post-processing (this is tedious and not reproducible)

If only there were some program that could do all these things…

Enter: Pandoc!

Pandoc

  • written by John MacFarlane (philosophy professor at UC Berkeley, hobby programmer)
  • Pandoc is your Swiss Army knife for document conversion
  • Among a huge set of inputs, Pandoc accepts LaTeX

Article

  1. .tex to .html

    pandoc -s paper.tex -o paper_tex.html

  2. .tex to .docx

    pandoc -s paper.tex -o paper_tex.docx

Slides

  1. .tex to beamer slides

    pandoc -t beamer slides.tex -o slides_tex.pdf

  2. .tex to .html slides

    pandoc -s --mathml -i -t dzslides slides.tex -o slides_tex.html

  3. using the slidy framework

    pandoc -s --webtex -i -t slidy slides.tex -o slides_tex.html

If you absolutely love LaTeX, stop here...

I think we can do better

  • I can write an entire slideshow in the time it takes you to type \documentclass{beamer}, \title{}, etc.
  • I never remember the commands
  • I never understand the error messages
  • … and I think beamer slides are ugly

If only there were some simpler program that could give us the same functionality…

Enter Markdown!

Markdown philosophy

  • writing should not be an alienating experience trapped in WYSIWYG editors
  • a file should be readable intuitively and not be buried in markup
  • markdown is a markup language, but one meant to be read by humans rather than machines
  • markdown was created for the web (you know it if you use GitHub, Stack Overflow, etc.)

Example

Suppose we want to create a nested list

  • fruits
    • apples
      • macintosh
      • red delicious
    • pears
    • peaches
  • vegetables
    • broccoli
    • chard

LaTeX

\begin{itemize}      
\item fruits         
    \begin{itemize}       
    \item apples          
        \begin{itemize}     
        \item macintosh     
        \item red delicious 
        \end{itemize}       
    \item pears           
    \item peaches         
    \end{itemize}         
\item vegetables        
    \begin{itemize}       
    \item broccoli
    \item chard           
    \end{itemize}         
\end{itemize}    

HTML

<ul>
    <li>fruits
    <ul>
        <li>apples
        <ul>
            <li>macintosh</li>
            <li>red delicious</li>
        </ul></li>
        <li>pears</li>
        <li>peaches</li>
    </ul></li>
    <li>vegetables
    <ul>
        <li>broccoli</li>
        <li>chard</li>
    </ul></li>
</ul>                       

Markdown

* fruits
    - apples
        - macintosh
        - red delicious
    - pears 
    - peaches
* vegetables
    - broccoli
    - chard
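
The three listings above describe the same list, and Pandoc can produce the LaTeX and HTML versions from the Markdown one. A minimal sketch, assuming `pandoc` is on the PATH; the file name `list.md` is made up for illustration.

```shell
# Save the Markdown list from the slide above (list.md is a hypothetical name).
cat > list.md <<'EOF'
* fruits
    - apples
        - macintosh
        - red delicious
    - pears
    - peaches
* vegetables
    - broccoli
    - chard
EOF

# Convert only if pandoc is installed, so the script is safe to run anywhere.
if command -v pandoc >/dev/null 2>&1; then
    pandoc -f markdown -t latex list.md   # prints nested itemize environments
    pandoc -f markdown -t html list.md    # prints nested <ul> elements
fi
```

Without `-s` Pandoc prints only the fragment, which makes it easy to compare with the hand-written LaTeX and HTML.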

Article

  1. .md to .html

    pandoc -s paper.md -o paper_md.html

  2. .md to .pdf

    pandoc -s paper.md -o paper_md.pdf
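
A rough sketch of what a `paper.md` with an equation and a citation might look like. Everything in it is a placeholder invented for this example (the title, the author, the `refs.bib` file and its single entry), and the `--citeproc` flag assumes pandoc 2.11 or later; older releases used a separate `pandoc-citeproc` filter.

```shell
# refs.bib and its single entry are placeholders, not a real reference.
cat > refs.bib <<'EOF'
@book{doe2014,
  author    = {Doe, Jane},
  title     = {A Placeholder Book},
  publisher = {Example Press},
  year      = {2014}
}
EOF

# A tiny paper: YAML metadata, inline math, one citation key.
cat > paper.md <<'EOF'
---
title: A Placeholder Paper
author: Jane Doe
---

# Model

We estimate $y_i = \alpha + \beta x_i + \varepsilon_i$ by OLS,
following @doe2014.

# References
EOF

# Convert only if pandoc is installed; HTML output avoids needing LaTeX.
if command -v pandoc >/dev/null 2>&1; then
    pandoc -s paper.md --citeproc --bibliography=refs.bib --mathml \
        -o paper_md.html || echo "pandoc failed; check the version" >&2
fi
```

The formatted bibliography is appended at the end of the document, so it lands under the empty References heading.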

Slides

  1. .md to beamer slides

    pandoc -t beamer slides.md -o slides_md.pdf

  2. using the slidy .html framework

    pandoc -s --webtex -i -t slidy slides.md -o slides_md.html
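
For reference, a `slides.md` for these commands could look roughly like the following. This is a hypothetical reconstruction based on the first slides of this talk, not the actual source file; it assumes Pandoc's title-block syntax (the three `%` lines) and one top-level heading per slide.

```shell
# Hypothetical slides.md: title block, then one heading per slide.
cat > slides.md <<'EOF'
% a workflow for slides and papers
% sebastian barfort
% april 2014

# What do we want?

- academic papers
- nice looking slides to sell our ideas

# How do we want them?

- today: primarily pdf
- in the future: also html
EOF

# Convert only if pandoc is installed.
if command -v pandoc >/dev/null 2>&1; then
    # -s writes a complete HTML file; -i reveals list items one at a time.
    pandoc -s -i -t slidy slides.md -o slides_md.html
fi
```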

The general idea

  • markdown is extremely easy to learn
  • it is widely used, with almost endless possibilities
  • so why not use it for academic papers?

Some technical stuff…

  • customize your stylesheet

    • latex template
    • install proper fonts (minion pro)
    • pandoc templates
  • get help here

  • my style files are on github

  • let's see an example of a real paper (which won't be online until there's a draft ready)
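
The template customization mentioned above can be sketched with a small HTML template. This is an illustration only, not the style files referred to on the slide: `mytemplate.html` and `note.md` are made-up names, and the font rule assumes Minion Pro is installed. `$title$` and `$body$` are Pandoc template variables.

```shell
# A minimal HTML template; pandoc fills in $title$ and $body$.
cat > mytemplate.html <<'EOF'
<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>$title$</title>
  <style>body { font-family: "Minion Pro", serif; max-width: 40em; }</style>
</head>
<body>
$if(title)$<h1>$title$</h1>$endif$
$body$
</body>
</html>
EOF

# A throwaway input file to try the template on.
cat > note.md <<'EOF'
---
title: Template test
---

Some *text*.
EOF

# Convert only if pandoc is installed.
if command -v pandoc >/dev/null 2>&1; then
    pandoc note.md --template=mytemplate.html -o note.html
fi
```

Running `pandoc -D html` (or `pandoc -D latex`) prints the default template, which is usually a better starting point than a blank file.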

More technical stuff

  • html doesn't understand latex tables (and vice versa)
  • so we need to be able to typeset
    • markdown tables
    • latex and html tables
  • you can include raw html and latex code in your markdown file
  • can your favorite stats program do that?

Some can…

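library(stargazer)

## two OLS models on the built-in attitude data
linear.1 <- lm(rating ~ complaints + privileges + learning + raises + critical,
               data = attitude)
linear.2 <- lm(rating ~ complaints + privileges + learning, data = attitude)

## create an indicator dependent variable, and run a probit model
attitude$high.rating <- (attitude$rating > 70)
probit.model <- glm(high.rating ~ learning + critical + advance, data = attitude,
                    family = binomial(link = "probit"))

## print all three models as one table
## (type = "html" is an assumption; use "latex" for pdf output)
stargazer(linear.1, linear.2, probit.model, title = "Results", type = "html")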

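Results
================================================================================
                     Dependent variable:
                     rating                                          high.rating
                     OLS                                             probit
                     (1)                     (2)                     (3)
--------------------------------------------------------------------------------
complaints           0.692***                0.682***
                     (0.149)                 (0.129)
privileges           -0.104                  -0.103
                     (0.135)                 (0.129)
learning             0.249                   0.238*                  0.164***
                     (0.160)                 (0.139)                 (0.053)
raises               -0.033
                     (0.202)
critical             0.015                                           -0.001
                     (0.147)                                         (0.044)
advance                                                              -0.062
                                                                     (0.042)
Constant             11.010                  11.260                  -7.476**
                     (11.700)                (7.318)                 (3.570)
--------------------------------------------------------------------------------
Observations         30                      30                      30
R2                   0.715                   0.715
Adjusted R2          0.656                   0.682
Log Likelihood                                                       -9.087
Akaike Inf. Crit.                                                    26.180
Residual Std. Error  7.139 (df = 24)         6.863 (df = 26)
F Statistic          12.060*** (df = 5; 24)  21.740*** (df = 3; 26)
================================================================================
Note: *p<0.1; **p<0.05; ***p<0.01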

  • it would be even better to output markdown tables
  • this is almost possible in R and is a Google Summer of Code (GSoC) 2014 project

library(pander)
m <- mtcars[1:5, 1:3]
pandoc.table(m, style = "rmarkdown")


|         &nbsp;          |  mpg  |  cyl  |  disp  |
|:-----------------------:|:-----:|:-----:|:------:|
|      **Mazda RX4**      |  21   |   6   |  160   |
|    **Mazda RX4 Wag**    |  21   |   6   |  160   |
|     **Datsun 710**      | 22.8  |   4   |  108   |
|   **Hornet 4 Drive**    | 21.4  |   6   |  258   |
|  **Hornet Sportabout**  | 18.7  |   8   |  360   |

output

                     mpg    cyl   disp
Mazda RX4            21     6     160
Mazda RX4 Wag        21     6     160
Datsun 710           22.8   4     108
Hornet 4 Drive       21.4   6     258
Hornet Sportabout    18.7   8     360

Thank you