Knitr sample

I noticed that when I had both the equals signs under "Knitr sample" and the hash mark, the hash mark printed. I think this means that the equals-sign underline overrides the hash and stops any further Markdown processing of that line. I have changed it to a single hash (#) to keep it consistent with the other sections of the document.

The code chunk below is very simple: it sets the seed, draws 100 standard normal values, and prints their mean.

set.seed(1)
x<-rnorm(100)
mean(x)
## [1] 0.1089

Can I continue to use the same variable in a later chunk?

plot(x)

[Figure: plot of chunk unnamed-chunk-3]
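The plot above suggests the answer is yes: chunks in one document are evaluated in the same environment, so a variable created in one chunk is visible in later ones. A minimal sketch of that behavior, assuming the knitr package is installed; the two-chunk document here is made up for illustration and is built as a character vector so it can be knitted in place.

```r
library(knitr)

# Build the chunk fence from its parts so the example nests cleanly here.
fence <- strrep("`", 3)

# Two separate chunks: the first creates x, the second only reads it.
src <- c(
  paste0(fence, "{r}"),
  "set.seed(1)",
  "x <- rnorm(100)",
  fence,
  "",
  paste0(fence, "{r}"),
  "length(x)",
  fence
)

# knit(text = ...) returns the knitted Markdown as a character string.
# A fresh environment keeps x out of the calling workspace.
shared_out <- knit(text = src, quiet = TRUE, envir = new.env())
cat(shared_out)
```

The second chunk never defines x, yet its output reports the length of the vector created in the first chunk.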

I am quite sure I need the annoying double space at the end of a line to get a line break. If there is only one line here, that is true.

However here,
there should be two lines.

I just need to make sure I can publish to RPubs.

Introduction

Testing some Markdown and knitr.

Here, echo is set to FALSE.
As you can see, this means we see the results, but not the code.

## [1] 1 2 3 4 5

Here, results is set to 'hide'.
Conversely, here we see the code, but not the results.

print(c(1:5))
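The two options can be compared side by side. This is a minimal sketch, assuming the knitr package is installed; the two chunks are invented for the comparison and print different vectors so their output can be told apart.

```r
library(knitr)

# Build the chunk fence from its parts so the example nests cleanly here.
fence <- strrep("`", 3)

src <- c(
  # echo=FALSE: the source is hidden, the printed result is kept.
  paste0(fence, "{r, echo=FALSE}"),
  "print(1:5)",
  fence,
  "",
  # results='hide': the source is kept, the printed result is hidden.
  paste0(fence, "{r, results='hide'}"),
  "print(6:10)",
  fence
)

echo_out <- knit(text = src, quiet = TRUE, envir = new.env())
cat(echo_out)
```

In the knitted text, the first chunk contributes only its output and the second contributes only its source line.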

Inline variables and r code

I wonder what time it is?
Here we are going to use a chunk with a name.

What I should do here is set echo=FALSE and results='hide', because I simply want to set these variables for later use. For this document, though, I want to leave the chunk visible so I can see it later.

Since a variable assignment does not print anything by default, I don't really need results='hide', but setting both seems like normal good practice to me.

time<-format(Sys.time(), '%a %b %d %X %Y')
rand<-rnorm(1)

We can use variables interspersed with text by writing a single backtick, the letter r, a space, the R code (such as a variable name to print), and a closing backtick. This makes echo=FALSE and results='hide' important, because we can now embed the code early and call it later when needed.

The current time is Fri Oct 10 3:29:17 PM 2014. My favorite random number is -0.6204.
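The pattern of a silent setup chunk followed by inline use can be sketched as below, assuming the knitr package is installed. The chunk label setvars and the variable fav are hypothetical names, and a fixed value stands in for the random number so the result is repeatable.

```r
library(knitr)

# Build the chunk fence from its parts so the example nests cleanly here.
fence <- strrep("`", 3)

src <- c(
  # Silent setup chunk: neither its code nor its results are shown.
  paste0(fence, "{r setvars, echo=FALSE, results='hide'}"),
  "fav <- 42",
  fence,
  "",
  # Inline code: backtick, r, space, expression, backtick.
  "My favorite number is `r fav`."
)

inline_out <- knit(text = src, quiet = TRUE, envir = new.env())
cat(inline_out)
```

Only the sentence survives in the knitted text, with the inline expression replaced by its value.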

Plotting

Simulate the data

x<-rnorm(100)
y<-x + rnorm(100, sd=.5)

Plot the data

par(mar=c(5,4,1,1), las=1)
plot(x,y,main='Some simulated data. Note fig.height=4 in the R code header')

[Figure: plot of chunk scatterplot]

Tables and Graphics

Let's do some linear modeling to play with tables.

library(datasets)
data(airquality)

fit<-lm(Ozone~Wind + Temp+Solar.R, data=airquality)

So here we have our lm fit.
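Before formatting anything, the coefficient table can be pulled out of the fit as a plain matrix with base R. A small sketch using the same model; the name coef_table is just illustrative.

```r
library(datasets)

# Same model as above: ozone as a function of wind, temperature, and solar radiation.
fit <- lm(Ozone ~ Wind + Temp + Solar.R, data = airquality)

# coef(summary(fit)) is a numeric matrix with one row per term and
# columns for the estimate, standard error, t value, and p-value.
coef_table <- coef(summary(fit))
print(round(coef_table, 4))
```

This matrix holds the same numbers that a table formatter would render, so it is a handy check on whatever is printed later.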

Let's put this together in a table.

library(xtable)
xt<-xtable(summary(fit))
print(xt, type='html')

              Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)   -64.3421     23.0547    -2.79    0.0062
Wind           -3.3336      0.6544    -5.09    0.0000
Temp            1.6521      0.2535     6.52    0.0000
Solar.R         0.0598      0.0232     2.58    0.0112

Hmmm… three months and I have never seen the type argument to print?
I need to look into this.

Anyway, above we see the table printed in HTML.
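A likely explanation for the type argument: print() is an S3 generic, so print(xt, type='html') is handled by the method for the object's class (print.xtable), and extra arguments such as type belong to that method rather than to print itself. The sketch below illustrates the mechanism in base R only; the class demo_table and its print method are hypothetical stand-ins, not xtable's code.

```r
# Hypothetical class, used only to illustrate S3 dispatch.
make_demo_table <- function(df) {
  structure(list(data = df), class = "demo_table")
}

# The method, not the generic, is what declares the `type` argument.
print.demo_table <- function(x, type = "text", ...) {
  if (type == "html") {
    cat("<table>", nrow(x$data), "rows</table>\n")
  } else {
    print(x$data, ...)
  }
  invisible(x)
}

tbl <- make_demo_table(data.frame(a = 1:2))
print(tbl, type = "html")

# The generic only knows about x and ...; `type` lives on the method.
generic_args <- names(formals(print))
method_args <- names(formals(print.demo_table))
```

By the same logic, the help page to check would be the one for the method (?print.xtable) rather than ?print.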

Now the plots!

[Figures: four plots from chunk unnamed-chunk-6]

Global Options

Don't forget, for Markdown:

  • Put the double space (two single spaces) at the end of the line ahead of a list.
  • Make sure that there is a blank line ahead of the list.
  • To nest or indent list items, place four (4) spaces ahead of the list item.
  • With ordered (numbered) lists, each item must start with, in order:
    • a number
    • a period
    • a space
    • then the content.

You can apply settings to all code chunks by:

  1. Loading the knitr library with library(knitr)
  2. Creating a code chunk
  3. Calling opts_chunk$set(setting1 = value1, setting2 = value2) inside that chunk

This is an example. Normally we would use echo=FALSE and results='hide', but as this is a tutorial document, I want to see the code.

I wonder what the precedence rules are here. I am guessing that a local chunk setting its own values will override these global settings.

Also, I am curious to see if order of execution is relevant here.

library(knitr)
opts_chunk$set(echo=TRUE)
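The precedence guess can be tested directly. A minimal sketch, assuming the knitr package is installed; the chunk labels (setup, uses-global, overrides-global) are made up, and the global default is set to echo=FALSE here so that a local override is visible in the output.

```r
library(knitr)

# Build the chunk fence from its parts so the example nests cleanly here.
fence <- strrep("`", 3)

src <- c(
  # Global default for every later chunk: hide the source.
  paste0(fence, "{r setup, include=FALSE}"),
  "knitr::opts_chunk$set(echo = FALSE)",
  fence,
  "",
  # No local option: inherits echo = FALSE.
  paste0(fence, "{r uses-global}"),
  "1 + 1",
  fence,
  "",
  # Local option: echo = TRUE for this chunk only.
  paste0(fence, "{r overrides-global, echo=TRUE}"),
  "2 + 2",
  fence
)

opts_out <- knit(text = src, quiet = TRUE, envir = new.env())
cat(opts_out)
```

In the knitted text the source of the second chunk is missing while the source of the third is shown, so the local setting wins. Since the global call is itself ordinary chunk code, it can only affect chunks that run after it, which also speaks to the order-of-execution question.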

Options

Just a list of some common options.

  • Output
    • results = 'asis', 'hide'
    • echo = TRUE, FALSE
    • fig.height = numeric (inches)
    • cache = TRUE, FALSE

Caching

Caching is important in situations where some portion of the code takes a long time to run, say importing a data frame or running a fit. Caching means we do not have to execute this code constantly: execute it once, cache the data, and as long as the data remains current (not sure how R decides this) you pull the existing copy instead of creating a new one. Chunks with significant side effects may not be cacheable.

n <- runif(1000000)
print(summary(n))
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.00    0.25    0.50    0.50    0.75    1.00
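On the question of what "current" means: knitr's documentation describes the cache as being rebuilt when the chunk's code or options change, not when outside data changes. The base-R sketch below imitates that idea by keying a saved result on a hash of the code text. It is an illustration under that assumption, not knitr's implementation, and cached_eval and cache_dir are hypothetical names.

```r
# Start from an empty cache directory inside the session's temp folder.
cache_dir <- file.path(tempdir(), "cache-sketch")
unlink(cache_dir, recursive = TRUE)
dir.create(cache_dir, showWarnings = FALSE)

cached_eval <- function(code) {
  # Key the cache on an MD5 hash of the code text.
  code_file <- tempfile()
  writeLines(code, code_file)
  key <- unname(tools::md5sum(code_file))
  unlink(code_file)

  cache_file <- file.path(cache_dir, paste0(key, ".rds"))
  if (file.exists(cache_file)) {
    # Same code as before: reuse the stored result.
    return(list(value = readRDS(cache_file), from_cache = TRUE))
  }

  # New or changed code: run it and store the result.
  value <- eval(parse(text = code), envir = new.env())
  saveRDS(value, cache_file)
  list(value = value, from_cache = FALSE)
}

first <- cached_eval("summary(runif(100000))")
second <- cached_eval("summary(runif(100000))")  # identical code text
third <- cached_eval("summary(runif(100001))")   # changed code text
```

The second call returns the stored summary instead of drawing new random numbers, while the third call runs again because its code text differs. This is also why side effects are a problem for caching: a reused result skips everything else the code would have done.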

General Questions

  • Order of precedence for chunk options.
    • Assumption confirmed: settings made locally (within an individual R code chunk) override the defaults, even those set with the opts_chunk$set function call.
  • Where did print(type=x) come from? It does not appear in the help.
  • Spell checking! My personal bane.
    • F7 is the shortcut.