December 6, 2016

What is R?

R is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues. R can be considered as a different implementation of S. There are some important differences, but much code written for S runs unaltered under R. - The R Project website.

R is free, open-source software released under the GNU GPL, and is DOD compliant on a number of platforms.

R has packages developed for just about everything imaginable, including creating your own xkcd-style graphics.

This tool is developing FAST

2016 RStudio release dates:

-1 November

-18 July

-14 May

-18 April

Each R Release adds functionality, and requires package upgrades

2016 R Release Dates:

-3.3.2 October "Sincere Pumpkin Patch"

-3.3.1 June "Bug in Your Hair"

-3.2.5 April "Very, Very Secure Dishes"

-3.2.3 December 2015 "Wooden Christmas-Tree"

Analysis Preparation Tasks

Visualization in R

There are many ways to create visualizations in R, and we will not cover all of them here. This brief will discuss the following techniques:

  1. Plot Functions in Base R
  2. Grammar of Graphics - ggplot2
  3. 3-D web-based graphics, including plot_ly

We will also consider the following enablers:

  1. Interactive Environments - Markdown and Shiny
  2. Power Tools - magrittr and dplyr

This presentation in code

Data - The Morley Dataset

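An example dataset that ships with base R (in the datasets package). Despite the name, it holds Michelson's 1879 speed-of-light measurements rather than data from the Michelson-Morley experiment: five experiments of 20 runs each, with Speed recorded in km/s minus 299,000.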

data(morley); library(knitr);library(magrittr)
morley[1:3,] %>% kable()
    Expt Run Speed
001    1   1   850
002    1   2   740
003    1   3   900

Base R - Boxplot

library(dplyr); data(morley)
boxplot(Speed~Run, data = morley, xlab = "Run",
        ylab = "Speed", main = "R Base Plot Example")

ggplot - smoothed with errors

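library(ggplot2); library(magrittr)
# Experiments 1-4 only; geom_smooth() draws a smoothed fit with a standard-error band
mp = morley[morley$Expt %in% 1:4, ] %>%
  ggplot(aes(x = Speed, y = Run)) +
  geom_smooth()
# One panel per experiment, two panels per row
mp + facet_wrap(~Expt, ncol = 2)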

Resized

plot_ly - Interactive Surface Plot

library(plotly); library(reshape2)
# Cast to a Run x Expt matrix of speeds, then draw it as an interactive surface
PD = acast(morley, Run ~ Expt, value.var = "Speed")
plot_ly(z = PD, type = 'surface')

You can be creative…

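library(ggplot2); library(xkcd); library(extrafont)
# Assumed prep (not on the slide): bars for experiment 1; xkcdrect() needs xmin/xmax/ymin/ymax
data <- transform(morley[morley$Expt == 1, ], xmin = Run - 0.4, xmax = Run + 0.4, ymin = 0, ymax = Speed)
mapping <- aes(xmin = xmin, ymin = ymin, xmax = xmax, ymax = ymax)
xrange <- range(data$xmin, data$xmax); yrange <- c(0, max(data$ymax))
p = ggplot() + xkcdrect(mapping, data) +
  xkcdaxis(xrange, yrange) + xlab("Run") + ylab("Speed")
p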

Equations Look nice, too

\(\LaTeX\) is built in

\[ F(\omega) = \int_{-\infty}^{\infty} f(t) e^{- j \omega t} dt \]

This audience might prefer:

\[ \hat{\beta} = (X'X)^{-1}X'Y \]
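The least-squares estimator above can be checked numerically in a few lines of base R. This is a minimal sketch, assuming a regression of Speed on Run in the morley data purely for illustration (that model is not from the talk); the names X, Y, beta_hat and fit are ours.

```r
data(morley)

# Design matrix with an intercept column; the Speed ~ Run model is an
# illustrative assumption, not something taken from the talk
X <- cbind(Intercept = 1, Run = morley$Run)
Y <- morley$Speed

# beta-hat = (X'X)^{-1} X'Y, solved as a linear system rather than
# forming the inverse explicitly
beta_hat <- solve(crossprod(X), crossprod(X, Y))

# lm() fits the same model (via a QR decomposition), so the
# coefficients should agree up to floating-point error
fit <- lm(Speed ~ Run, data = morley)
all.equal(as.numeric(beta_hat), as.numeric(coef(fit)))
```

Solving the normal equations with solve(A, b) is generally preferable to computing the inverse and multiplying, although for real work lm() remains the safer choice.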

Applications to DOD

Here's a wordcloud of this talk:

Some notes from the Hadleyverse

-dplyr: Advanced summarization / data extraction tools for R

-reshape2: Allows the user to 'flatten' and 'recast' tabular data

-magrittr: introduces the 'pipe' %>% operator, greatly streamlining code

-ggplot2 (covered previously): Advanced Graphics
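To make the notes above concrete, here is a small sketch of the three data-handling packages working together on the morley data. The summary statistics chosen and the names by_expt, wide and long are illustrative assumptions, not taken from the talk.

```r
library(dplyr)     # summarization verbs; also re-exports %>% from magrittr
library(reshape2)  # melt() and dcast()

data(morley)

# dplyr + pipe: mean and standard deviation of Speed per experiment
by_expt <- morley %>%
  group_by(Expt) %>%
  summarise(mean_speed = mean(Speed), sd_speed = sd(Speed))

# reshape2: recast to wide form, one row per Run and one column per Expt
wide <- dcast(morley, Run ~ Expt, value.var = "Speed")

# ...and flatten back to long form, one row per (Run, Expt) pair
long <- melt(wide, id.vars = "Run", variable.name = "Expt", value.name = "Speed")
```

The pipe lets each step read left to right: take the data, group it, then summarise it, with no nested function calls or intermediate variables.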

Putting it on the web:

-RPubs: Publish to your own (free) website. Examples are here

-Shinyapps.io: Publish applications built in R/Shiny to the web. Fee structure depends on how much you want to host. A simple example is here, a more complicated example is here
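For a sense of what a Shiny application looks like before it is published, here is a minimal sketch built on the morley data. The layout, the input and output IDs (expt, speed_plot) and the plot itself are assumptions made for illustration; this is not one of the linked examples.

```r
library(shiny)

data(morley)

# UI: a drop-down to pick one of the five experiments, plus a plot
ui <- fluidPage(
  titlePanel("Morley speed-of-light data"),
  selectInput("expt", "Experiment", choices = sort(unique(morley$Expt))),
  plotOutput("speed_plot")
)

# Server: redraw the plot whenever the selected experiment changes
server <- function(input, output) {
  output$speed_plot <- renderPlot({
    d <- morley[morley$Expt == as.integer(input$expt), ]
    plot(d$Run, d$Speed, type = "b", xlab = "Run", ylab = "Speed")
  })
}

app <- shinyApp(ui, server)
# Printing `app` at the console (or calling runApp(app)) launches it locally
```

Once an app like this runs locally, deployment to shinyapps.io is typically handled by the rsconnect package, for example with rsconnect::deployApp() after account setup.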

Lessons Learned

  • Three choices for Markdown documents, in order of output quality:
    • HTML
    • PDF via \(\LaTeX\)
    • MS Word
  • Three choices for Markdown presentations, in order of output quality:
    • HTML via ioslides (this presentation)
    • HTML via Slidy (not bad)
    • Beamer via \(\LaTeX\). I used to be a huge Beamer fan, but ioslides and Slidy blow it away
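Each of the six choices above corresponds to an output format name in the rmarkdown package, so switching between them can be scripted. The helper render_as() and the file name talk.Rmd are hypothetical, introduced here only as a sketch; the format names and rmarkdown::render() itself are part of rmarkdown.

```r
# Map each output type discussed above to its rmarkdown format name
formats <- c(
  html     = "html_document",
  pdf      = "pdf_document",           # needs a LaTeX installation
  word     = "word_document",
  ioslides = "ioslides_presentation",
  slidy    = "slidy_presentation",
  beamer   = "beamer_presentation"     # needs a LaTeX installation
)

# Hypothetical helper: render an .Rmd file to one of the formats above.
# match.arg() stops with an error if `type` is not a known name.
render_as <- function(input, type = "html") {
  type <- match.arg(type, names(formats))
  rmarkdown::render(input, output_format = formats[[type]])
}

# Example call (not run here, since talk.Rmd is a placeholder):
# render_as("talk.Rmd", "ioslides")
```

The same choice is more commonly made through the output field in the document's YAML header; scripting it is mainly useful when one source file has to be produced in several formats.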

Helpful References

  • Books:
    • R in a Nutshell
    • Advanced R Programming
    • The R Cookbook
  • Web:
    • Stack Overflow
    • HipsteR
    • R-bloggers
    • r-project.org

When you start with R it feels like this:

But after a while, it's like this:

Fin.