"Enough Docker to be Dangerous"

Joyce Cahoon
November 10, 2017

Install Docker


First things first, install: https://docs.docker.com/engine/installation/

Then create a Docker Hub account: https://hub.docker.com

What is Docker?


Basic Docker Fun


docker run <image>

docker start <name|id>

docker stop <name|id>

docker ps [-a]   (-a also lists stopped containers)

docker rm <name|id>

Let's Try It


docker

docker login 

docker run tutum/hello-world

In another terminal, run:

docker ps

Accessing Your Container


docker run -p 8080:80 tutum/hello-world


Check it out in your browser at http://localhost:8080/

Clean Up


docker ps -a 

docker rm <name|id> 

docker ps -a 
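
Removing containers one at a time gets tedious once several have piled up. The sketch below is one possible bulk clean-up; the helper name remove_stopped is hypothetical (it is not a Docker command), and it assumes only containers in the exited state should be removed.

```shell
# Hypothetical helper: remove every container whose status is "exited".
remove_stopped() {
  # -a lists all containers, -q prints only their IDs,
  # --filter narrows the list to containers that have exited
  stopped=$(docker ps -a -q --filter status=exited)
  if [ -n "$stopped" ]; then
    # Left unquoted on purpose so each ID becomes its own argument
    docker rm $stopped
  else
    echo "No stopped containers to remove."
  fi
}
```

Call remove_stopped and then docker ps -a to confirm the list is empty. Docker also ships docker container prune, which does a similar job after asking for confirmation.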

Let's Get Fancy


docker run -d --name web1 -p 8080:80 tutum/hello-world

docker run -d --name web2 -p 8082:80 tutum/hello-world

docker run -d --name web3 -p 8083:80 tutum/hello-world

docker stop web1 

docker ps -a 

docker start web1 
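
The three docker run lines above differ only in the container name and the host port; the part after the colon in -p is always port 80 inside the container. A minimal sketch that makes the pattern explicit, assuming a hypothetical helper called start_web:

```shell
# Hypothetical helper: start one named hello-world container in the background.
start_web() {
  name="$1"        # container name, e.g. web1
  host_port="$2"   # port on your machine; the container itself listens on 80
  docker run -d --name "$name" -p "${host_port}:80" tutum/hello-world
}

# Usage (equivalent to the first command above):
# start_web web1 8080
```

Each container is then reachable at http://localhost:<host_port>/, and the name can be used with docker stop and docker start in place of the ID.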

Docker and R


docker run --rm -p 8787:8787 rocker/verse


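Check out http://localhost:8787/. More recent rocker images may reject the default password, in which case set one with -e PASSWORD=<your password> in the docker run command. Otherwise you should be able to sign in with: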

  • Username: rstudio
  • Password: rstudio

Now you can use RStudio in your browser the same way you would on your desktop.

Enabling Reproducibility


docker run --rm -p 8787:8787 -v ~/r-docker-tutorial:/home/rstudio/r-docker-tutorial rocker/verse


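Use the \( \texttt{-v} \) flag to mount a folder from your local hard drive into the container, so you can read data from it and save results to it.

Make sure the path to the left of \( \texttt{:} \) is a path on your machine; the path to the right is where that folder appears inside the container.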
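
The two halves of the -v argument are easier to keep straight when they are built from separate variables. The sketch below assumes a hypothetical helper named run_rstudio. It creates the host folder first and resolves it to an absolute path, since Docker may otherwise treat a relative name as a named volume rather than a folder on your machine.

```shell
# Hypothetical helper: start RStudio with one host folder mounted into the container.
run_rstudio() {
  host_dir="${1:-$HOME/r-docker-tutorial}"   # left of ':' = folder on your machine
  mkdir -p "$host_dir"                       # create it yourself so you own it
  host_dir=$(cd "$host_dir" && pwd)          # resolve to an absolute path
  # right of ':' = where the folder shows up inside the container
  container_dir="/home/rstudio/$(basename "$host_dir")"
  docker run --rm -p 8787:8787 -v "${host_dir}:${container_dir}" rocker/verse
}

# Usage:
# run_rstudio ~/r-docker-tutorial
```

Anything saved under that folder from within RStudio remains on your hard drive after the container is removed.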

Example: Simulation in purrr


Use functions in \( \texttt{purrr} \) to perform iterative tasks:

for each ___ do ____


There is nothing new here; you may already be handling this in your code with

  • copy & paste
  • for loops
  • \( \texttt{apply} \)

Example: Simulation in purrr


How I used to run simulations…

nreps <- 10000
n <- 5
# Column 1 holds the root mean square, column 2 the median absolute deviation
results <- matrix(NA, nrow = nreps, ncol = 2)
for(i in seq_len(nreps)){
  one_rep <- rnorm(n = n, mean = 0, sd = 1)
  results[i, 1] <- sqrt(mean(one_rep^2))
  results[i, 2] <- median(abs(one_rep-median(one_rep)))
}

Example: Simulation in purrr


How I used to run simulations…

Let's do this for varying \( \texttt{n} \) and \( \texttt{sd} \):

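nreps <- 10000
combinations <- expand.grid(n = c(5, 10, 15), sd = c(1, 2, 3))
# One row per replicate: each (n, sd) combination is repeated nreps times
results <- combinations[rep(seq_len(nrow(combinations)), each = nreps),]
# Create the result columns up front instead of growing them inside the loop
results$rms <- NA_real_
results$mad <- NA_real_
system.time(
  for(i in seq_len(nrow(results))){
    one_rep <- rnorm(n = results$n[i], mean = 0, sd = results$sd[i])
    results$rms[i] <- sqrt(mean(one_rep^2))
    results$mad[i] <- median(abs(one_rep-median(one_rep)))
  }
)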

Example: Simulation in purrr


Maybe this is what we should do…

nreps <- 10000
simulator <- function(n, sd){
  one_rep <- rnorm(n = n, mean = 0, sd = sd)
  rms <- sqrt(mean(one_rep^2))
  mad <- median(abs(one_rep-median(one_rep)))
  return(list(one_rep = one_rep, 
              rms = rms, 
              mad = mad))
}
combinations <- expand.grid(n = c(5, 10, 15), sd = c(1, 2, 3)) 
results <- combinations[rep(seq_len(nrow(combinations)), each = nreps),]
system.time(
  results <- apply(results, 1, function(x) simulator(x[1], x[2])) 
)
# Now you have to unlist everything.... 
head(results)

Example: Simulation in purrr


Life could be so much easier…

library(purrr)
library(dplyr)
library(reshape)
library(ggplot2)

nreps <- 10000
simulator <- function(n, sd){ rnorm(n = n, mean = 0, sd = sd)}
combinations <- expand.grid(n = c(5, 10, 15), sd = c(1, 2, 3))
results <- combinations[rep(seq_len(nrow(combinations)), each = nreps),]
# sim is a list column of simulated samples; rms and mad summarise each sample
system.time(output <- results %>%
    mutate(sim = map2(n, sd, simulator),
           rms = map_dbl(sim, ~sqrt(mean(.^2))),
           mad = map_dbl(sim, ~median(abs(. - median(.))))))

# Let's visualize the results ---------------------------------------------
output_melted <- melt(select(output, -sim), id.vars=c("n", "sd"))
ggplot(output_melted, aes(x = factor(n), y = value, col = variable)) +
  labs(x = "n") +
  geom_boxplot() +
  facet_wrap(~sd)
ggsave("picture.pdf")

Saving Your Current Instance


docker ps

docker commit -m "[INSERT COMMENT]" [CONTAINER ID] [IMAGE NAME]

docker images

Getting Image Up To Docker Hub


docker login --username=yourhubusername

docker images


If all your files and data are where they need to be, then:

docker tag [IMAGE ID] your_docker_hub_username/[IMAGE NAME]:[ANY TAG]

docker push your_docker_hub_username/[IMAGE NAME]:[ANY TAG]
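
Tagging and pushing must use exactly the same image reference, in the form username/name:tag. A minimal sketch that builds the reference once and reuses it, assuming a hypothetical helper named publish_image and that you have already run docker login:

```shell
# Hypothetical helper: tag a local image and push it to Docker Hub.
publish_image() {
  image_id="$1"       # ID shown by `docker images`
  hub_user="$2"       # your Docker Hub username
  image_name="$3"     # repository name on Docker Hub
  tag="${4:-latest}"  # falls back to "latest" when no tag is given
  ref="${hub_user}/${image_name}:${tag}"
  # Push only if tagging succeeded
  docker tag "$image_id" "$ref" && docker push "$ref"
}

# Usage (all four values are placeholders):
# publish_image <image id> your_docker_hub_username <image name> v1
```

Building the reference in one place avoids tagging an image as v1 and then pushing a different tag by accident.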

Run it on another machine


You can find public images here: https://hub.docker.com/

docker run jyu21/slg_purr_example:v1 

docker ps

docker exec -it [CONTAINER ID] bash 

Rscript [YOUR SCRIPT].R
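
The last two steps above can also be collapsed into one command, since docker exec can run Rscript directly without opening a shell first. A minimal sketch, assuming a hypothetical helper named run_script and a container that is already running:

```shell
# Hypothetical helper: run an R script inside a running container
# without opening an interactive shell first.
run_script() {
  container="$1"   # name or ID reported by `docker ps`
  script="$2"      # path to the script as seen from inside the container
  docker exec "$container" Rscript "$script"
}

# Usage (both values are placeholders):
# run_script <container id> <your script>.R
```

The script's output is printed in your own terminal, which is handy when the run is part of a larger shell script.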


ROpenSci's Launching Docker
Draw.io's Making Diagrams
Sean Kross' Enough Docker To Be Dangerous
Mine Cetinkaya-Rundel's Purrr
Dr. Post, Kara & Todd for SLG