An Example for R Markdown

This is an example of using R Markdown to produce an HTML page from a Markdown document.

R Markdown embeds R code in a Markdown document.

A simple data analysis with R

yctl <- c(4.17, 5.58, 5.18, 6.11, 4.50, 4.61, 5.17, 4.53, 5.33, 5.14)
ytrt <- c(4.81, 4.17, 4.41, 3.59, 5.87, 3.83, 6.03, 4.89, 4.32, 4.69)
trt <- c(rep(0, 10), rep(1, 10))
weight <- c(yctl, ytrt)

lm.1 <- lm(weight ~ trt)
lm.0 <- lm(weight ~ 1) 

summary(lm.1)
## 
## Call:
## lm(formula = weight ~ trt)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.0710 -0.4938  0.0685  0.2462  1.3690 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   5.0320     0.2202  22.850 9.55e-15 ***
## trt          -0.3710     0.3114  -1.191    0.249    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.6964 on 18 degrees of freedom
## Multiple R-squared:  0.07308,    Adjusted R-squared:  0.02158 
## F-statistic: 1.419 on 1 and 18 DF,  p-value: 0.249

Formulate a hypothesis:

\[H_0 : \beta_1 = 0\] \[H_1 : \beta_1 \neq 0\]

\[ \begin{aligned} H_0 : \beta_1 &= 0\\ H_1 : \beta_1 &\neq 0 \end{aligned} \]

Result:

We cannot reject \(H_0\). Comparing the treated group to the controls, the estimated difference in mean weight is -0.371, but this difference is not statistically significantly different from 0 (p = 0.249 > 0.05).

Variable      Estimate   Std. Error   t value   p-value
(Intercept)     5.0320       0.2202    22.850   9.55e-15
trt            -0.3710       0.3114    -1.191   0.249

## Analysis of Variance Table
## 
## Model 1: weight ~ 1
## Model 2: weight ~ trt
##   Res.Df    RSS Df Sum of Sq      F Pr(>F)
## 1     19 9.4175                           
## 2     18 8.7292  1   0.68821 1.4191  0.249
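
The slope test in the coefficient table and the model comparison in the analysis of variance table are the same test: with a single predictor, the F statistic is the square of the slope's t statistic, and the two p-values coincide. The following is a minimal sketch that checks this by hand, assuming the data and the models lm.0 and lm.1 defined above; the names coefs, est, se, t_stat, p_val and f_stat are introduced only for this illustration.

```r
# Data and models copied from the analysis above
yctl <- c(4.17, 5.58, 5.18, 6.11, 4.50, 4.61, 5.17, 4.53, 5.33, 5.14)
ytrt <- c(4.81, 4.17, 4.41, 3.59, 5.87, 3.83, 6.03, 4.89, 4.32, 4.69)
trt <- c(rep(0, 10), rep(1, 10))
weight <- c(yctl, ytrt)

lm.1 <- lm(weight ~ trt)
lm.0 <- lm(weight ~ 1)

# Slope row of the coefficient table
coefs <- coef(summary(lm.1))
est <- coefs["trt", "Estimate"]
se <- coefs["trt", "Std. Error"]

# t statistic and two-sided p-value computed by hand
t_stat <- est / se
p_val <- 2 * pt(-abs(t_stat), df = df.residual(lm.1))

# F statistic from comparing the nested models
f_stat <- anova(lm.0, lm.1)$F[2]

# t_squared and F should agree, and p should match the table
c(t = t_stat, t_squared = t_stat^2, F = f_stat, p = p_val)
```

The estimate itself is simply the difference between the two group means, mean(ytrt) - mean(yctl), because trt is a 0/1 indicator.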

R code chunks

Now we write some code chunks in this markdown file:

x <- 1+1      # a simple calculator
set.seed(123)
rnorm(5)      # boring random numbers
## [1] -0.56047565 -0.23017749  1.55870831  0.07050839  0.12928774

Inline R Code and Mathematical Expressions

Inline R code is also supported, e.g. the value of x is 2.

The mean of the numbers 2,3,4 is 3.

\(5 \times \pi\) = 15.7079633.
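
The rendered page shows only the results of the inline expressions, not their source. In an .Rmd file, an inline expression is written inside backticks, starting with the letter r followed by the R code. As a rough sketch, assuming the three values above come from expressions like the ones below (the exact source expressions are not visible on this page, so these are guesses), the same numbers can be checked in an ordinary chunk; the names mean_234 and five_pi are used only here.

```r
# Assumed equivalents of the inline expressions above
x <- 1 + 1                    # same assignment as in the earlier chunk
mean_234 <- mean(c(2, 3, 4))  # mean of the numbers 2, 3, 4
five_pi <- 5 * pi             # the value typeset next to 5 x pi

x
mean_234
five_pi
```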

Plots

We can also produce plots:

# first attempt
plot(cars)

# third attempt
library(ggplot2)  # qplot() and geom_smooth() come from the ggplot2 package
qplot(speed, dist, data = cars) + geom_smooth()
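
qplot() is deprecated in recent ggplot2 releases (3.4.0 and later), so the same scatterplot with a smoother can also be written with ggplot(). This is a minimal sketch, assuming the ggplot2 package is installed; the object name p is chosen only for illustration.

```r
library(ggplot2)

# Scatterplot of stopping distance against speed with a smoothed trend.
# With no method given, geom_smooth() picks a default smoother and
# prints a message saying which one it used.
p <- ggplot(cars, aes(x = speed, y = dist)) +
  geom_point() +
  geom_smooth()

print(p)
```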

Problem 6. [10 points]

Conduct a simulation to empirically demonstrate the properties of the LSE \(\hat{\beta}_1\); specifically, illustrate the findings that \(\hat{\beta}_1\) is unbiased and has variance \(\frac{\sigma^2}{S_{xx}}\). A suggested structure for your simulation is:

  1. Set parameter values of your choice for \(\beta_0, \beta_1, \sigma^2_{\epsilon}\) and choose a sample size \(n\).
  2. Select values for \(x_i\) (recall these are assumed to be fixed in Problem 3b).
  3. Repeat these steps:
    • Simulate errors \(\epsilon_i\) from a mean-zero distribution with variance \(\sigma^2_{\epsilon}\).
    • Construct observed responses \(y_i\) according to the simple linear regression model.
    • Fit the simple linear regression and save the estimated slope \(\hat{\beta}_1\).
  4. Repeat step (3) many times to find the empirical distribution of \(\hat{\beta}_1\). Report the mean and variance of the estimates across simulated datasets and provide a histogram or other visual display.

Solution:

For code implementing this simulation exercise, please refer to the .Rmd file that generates this report; here, we note the relevant quantities and discuss the results of the simulation. Because simulations are computationally demanding, this code chunk uses the cache=TRUE option, so that the code is executed only once and the results are saved.

For this simulation we used \(n=20\), \(\beta_0 = 3\), \(\beta_1 = 3\), \(x = 1, 2, \ldots, 20\), \(\sigma^2 = 5\) and generated errors using a normal distribution.

From Problem 3, we expect that \(E(\hat{\beta}_1) = 3\) and \(Var(\hat{\beta}_1) = \frac{\sigma^2}{ \sum_{i=1}^n (x_i-\bar x)^2} = \frac{5}{665} = 0.00752\). From 1000 simulated datasets, we had an empirical mean of 3.0007191 and an empirical variance of 0.007152. A density plot of estimated coefficients across all simulations is shown below.
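
The report defers to its .Rmd source for the simulation code, which is not reproduced on this page. The following is only a minimal sketch of the four suggested steps, assuming the parameter values stated above. The seed, the helper name simulate_slope, and the histogram settings are illustrative choices, so the empirical mean and variance it produces will not match the reported figures digit for digit.

```r
set.seed(1)  # hypothetical seed, not the one used in the report

# Steps 1 and 2: parameters, sample size, and fixed design points
n      <- 20
beta0  <- 3
beta1  <- 3
sigma2 <- 5
x      <- 1:n
n_sims <- 1000

# Theoretical variance of the slope estimator: sigma^2 / S_xx
s_xx       <- sum((x - mean(x))^2)
var_theory <- sigma2 / s_xx

# Step 3: simulate errors, build responses, fit, and return the slope
simulate_slope <- function() {
  eps <- rnorm(n, mean = 0, sd = sqrt(sigma2))
  y   <- beta0 + beta1 * x + eps
  unname(coef(lm(y ~ x))[2])
}

# Step 4: repeat many times and summarise the empirical distribution
slopes <- replicate(n_sims, simulate_slope())

c(mean = mean(slopes), var = var(slopes), var_theory = var_theory)
hist(slopes, breaks = 30, main = "Simulated slope estimates",
     xlab = expression(hat(beta)[1]))
```

Increasing n_sims tightens both the empirical mean and the empirical variance around their theoretical values.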

Key Formatting Constructs

The key formatting constructs are discussed at http://rmarkdown.rstudio.com/authoring_basics.html.
To force a line break, end the previous line with two spaces.

Emphasis

This is italic. This is bold.

Superscripts

This is y².

Lists

Unordered

  • Item 1
  • Item 2
    • Item 2a
    • Item 2b

Ordered

  1. Item 1
  2. Item 2
  3. Item 3
    • Item 3a
    • Item 3b

Block Quotes

A friend once said:

It’s always better to give than to receive.

\(H_0 : \beta_1 = 0\)
\(H_1 : \beta_1 \neq 0\)

Displaying Blocks of Code Without Evaluating

In some situations, you want to display R code but not evaluate it. Here is an example of how such a block is formatted.

This text is displayed verbatim.

Math

We can embed LaTeX math expressions in R Markdown:
\[f(\alpha, \beta) \propto x^{\alpha-1}(1-x)^{\beta-1}.\]

Conclusion

Markdown is easy to write: it is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. R Markdown combines regular text, HTML, LaTeX, and executable code (R, and other languages such as Python) in a single document. For more details on using R Markdown, see http://rmarkdown.rstudio.com.

Some LaTeX Basics

In this section, we show you some rudiments of the LaTeX typesetting language.

Subscripts and Superscripts

To indicate a subscript, use the underscore _ character. To indicate a superscript, use a single caret character ^. Note: this can be confusing, because the R Markdown language delimits superscripts with two carets. In LaTeX equations, a single caret indicates the superscript.
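
A small illustration of both operators, with symbols chosen only for this example: a one-character subscript or superscript can follow the operator directly, while anything longer has to be grouped in braces.

```latex
$$x_i^2 + y_{ij}^{n+1}$$
```

Without the braces, only the first character after _ or ^ is lowered or raised.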

Square Roots

We indicate a square root using the \sqrt operator.

$$\sqrt{b^2 - 4ac}$$

\[\sqrt{b^2 - 4ac}\]

Aligned Equations

\[ \begin{aligned} \dot{x} & = \sigma(y-x)\\ \dot{y} & = \rho x-y -xz\\ \dot{z} & = -\beta z +xy \end{aligned} \]

Fractions

Displayed fractions are typeset using the \frac operator.

$$\frac{4z^3}{16}$$

\[\frac{4z^3}{16}\]

Summation Expressions

Here is an example.

$$\sum_{i=1}^{n} X^3_i$$

\[\sum_{i=1}^{n} X^3_i\]

Parentheses

In LaTeX, you can create parentheses, brackets, and braces that size themselves automatically to contain large expressions. You do this using the \left and \right operators. Here is an example:

$$\sum_{i=1}^{n} \left( \frac{X_i}{Y_i} \right)$$

\[\sum_{i=1}^{n} \left( \frac{X_i}{Y_i} \right)\]

Greek Letters

Many statistical expressions use Greek letters. Much of the Greek alphabet is implemented in LaTeX.

$$\alpha, \beta,  \gamma, \Gamma$$

\[\alpha, \beta, \gamma, \Gamma\]

Special Symbols

All common mathematical symbols are implemented, and you can find a listing on the LaTeX cheat sheet.

$$a \pm b$$
$$x \ge 15$$

\[a \pm b\] \[x \ge 15\]

Special Functions

LaTeX typesets special functions in a different font from mathematical variables. These functions, such as \(\sin\), \(\cos\), etc. are indicated in LaTeX with a backslash. Here is an example that also illustrates how to typeset an integral.

$$\int_0^{2\pi} \sin x~dx$$

\[\int_0^{2\pi} \sin x~dx\]

Matrices

Matrices are presented in the array environment. One begins with the statement \begin{array} and ends with the statement \end{array}. Following the opening statement, a format code indicates the formatting of each column. In the example below, we use the code {rrr} to indicate that each column is right-justified. Each row is then entered, with cells separated by the & symbol, and each line (except the last) terminated by \\.

$$\begin{array}
{rrr}
1 & 2 & 3 \\
4 & 5 & 6 \\
7 & 8 & 9
\end{array}
$$

\[\begin{array} {rrr} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{array} \]
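
Because the array syntax is regular (one format letter per column, cells joined by &, rows joined by \\), it can also be generated from an R matrix rather than typed by hand. The following is a minimal sketch; matrix_to_latex_array is a hypothetical helper written for this illustration, not a function from R or any package, and it right-justifies every column as in the example above.

```r
# Hypothetical helper (not part of the original document): build a LaTeX
# array from an R matrix, right-justifying every column.
matrix_to_latex_array <- function(m) {
  # One "r" per column, e.g. "rrr" for three columns
  col_spec <- paste(rep("r", ncol(m)), collapse = "")
  # Cells within a row are separated by " & "
  rows <- apply(m, 1, function(row) paste(row, collapse = " & "))
  # Every row except the last is terminated by a double backslash
  body <- paste(rows, collapse = " \\\\\n")
  paste0("\\begin{array}{", col_spec, "}\n", body, "\n\\end{array}")
}

m <- matrix(1:9, nrow = 3, byrow = TRUE)
out <- matrix_to_latex_array(m)
cat("$$", out, "$$", sep = "\n")
```

In an R Markdown chunk, setting the knitr chunk option results = "asis" passes the printed text through to the document so that it is typeset as math.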

In math textbooks, matrices are often surrounded by brackets and assigned to a boldface letter. Here is an example:

$$\mathbf{X} = \left[\begin{array}
{rrr}
1 & 2 & 3 \\
4 & 5 & 6 \\
7 & 8 & 9
\end{array}\right]
$$

\[\mathbf{X} = \left[\begin{array} {rrr} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{array}\right] \]

x = 'hello, python world!'
print(x)
print(x.split(' '))
## hello, python world!
## ['hello,', 'python', 'world!']