Learning Objectives:

Students will learn how to …

  • Generate random variables in R from the normal and multivariate normal distributions
  • Calculate probabilities
  • Visualize densities

Example 1: Normal Distribution

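R has built-in functions for many common distributions. For the normal distribution these are rnorm() (random draws), dnorm() (density), pnorm() (cumulative probability), and qnorm() (quantiles). Let’s learn a little bit more about them!

For this example, please use the rnorm() function. To learn about the arguments that this function takes, type ?rnorm to read the help file.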

Step 1: Generate

Generate 10000 observations from the standard normal distribution using the defaults of the rnorm() function (mean = 0, sd = 1).

### GENERATE
x<-rnorm(10000)
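
The defaults of rnorm() are mean = 0 and sd = 1. As a minimal sketch, the call below states them explicitly and checks the sample against the theoretical values. The seed value and the name x_demo are illustrative choices, not part of the lab.

```r
# Fix the seed so the draw is reproducible (the value 42 is arbitrary)
set.seed(42)

# Equivalent to rnorm(10000): the defaults are mean = 0 and sd = 1
x_demo <- rnorm(10000, mean = 0, sd = 1)

# The sample mean and standard deviation should be close to 0 and 1
mean(x_demo)
sd(x_demo)
```

Without set.seed(), each run produces a different sample, so the summaries vary slightly from run to run.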

Step 2: Visualize

Create a plot of the kernel density estimate computed from these data.

### PLOT DENSITY

## BASE
plot(density(x))

## GGPLOT
library(tidyverse)

as.data.frame(x)%>%
  ggplot(aes(x))+
  geom_density()
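
One way to judge the estimate is to overlay the theoretical standard normal density on the base R plot. This is a sketch only; the names x_demo and d_est, the seed, and the plotting options are assumptions made for illustration.

```r
# Draw a fresh sample so this example is self-contained (seed is arbitrary)
set.seed(1)
x_demo <- rnorm(10000)

# Kernel density estimate from the sample
d_est <- density(x_demo)
plot(d_est, main = "Estimated vs. theoretical density")

# Overlay the theoretical N(0, 1) density as a dashed red line
curve(dnorm(x), add = TRUE, col = "red", lty = 2)
```

With 10000 observations the two curves should nearly coincide; smaller samples give a rougher estimate.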

Step 3: Density

We can find the height of the standard normal density curve at a given point, here 0, using dnorm().

### DENSITY AT 0
dnorm(0)
## [1] 0.3989423
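
The value returned by dnorm(0) can be checked against the standard normal density formula f(x) = exp(-x^2 / 2) / sqrt(2 * pi), which gives 1 / sqrt(2 * pi) at x = 0. The helper name std_normal_density below is hypothetical and used only for this check.

```r
# Hand-coded standard normal density (illustrative helper, not a built-in)
std_normal_density <- function(x) exp(-x^2 / 2) / sqrt(2 * pi)

# Both calls give the height of the curve at 0
std_normal_density(0)
dnorm(0)

# The hand-coded formula agrees with dnorm() up to numerical tolerance
all.equal(std_normal_density(0), dnorm(0))

# The density is symmetric about 0
all.equal(dnorm(-1), dnorm(1))
```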

Step 4: Probability

We can evaluate the area under the density curve to the left of a given point, here 0, using pnorm(). This area is the probability P(X ≤ 0).

### PROBABILITY
pnorm(0)
## [1] 0.5
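
Other probabilities follow from the same function. The sketch below shows an upper tail and an interval, then cross-checks P(X ≤ 0) by integrating the density numerically and by counting simulated values. The names x_demo, p_interval, p_numeric, and p_empirical are illustrative.

```r
# Upper-tail probability P(X > 1.96)
pnorm(1.96, lower.tail = FALSE)

# Interval probability P(-1 <= X <= 1) as a difference of two left-tail areas
p_interval <- pnorm(1) - pnorm(-1)
p_interval

# Cross-check 1: integrate the density from -Inf to 0
p_numeric <- integrate(dnorm, lower = -Inf, upper = 0)$value
p_numeric

# Cross-check 2: proportion of simulated values at or below 0 (seed is arbitrary)
set.seed(1)
x_demo <- rnorm(10000)
p_empirical <- mean(x_demo <= 0)
p_empirical
```

The simulated proportion only approximates pnorm(0), and the approximation improves as the number of observations grows.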

Question:

What observations do you have about the normal distribution based on your answers above?

Example 2: Multivariate Normal Distribution

Step 1A: Generate Correlated Data

Generate 1000 observations from a bivariate normal distribution with mean vector (0, 0), unit variances, and correlation 0.5, using the mvrnorm() function from the MASS package.

### MULTIVARIATE
#install.packages("MASS")
library(MASS)
## 
## Attaching package: 'MASS'
## The following object is masked from 'package:dplyr':
## 
##     select

## GENERATE
xx<-mvrnorm(1000, mu=c(0,0), Sigma=matrix(c(1, 0.5, 0.5, 1), 2))
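
Because both variances in Sigma equal 1, the off-diagonal entry 0.5 is also the correlation. A quick sanity check, sketched below with illustrative names (sigma_demo, xx_demo) and an arbitrary seed, compares the sample moments with the values passed to mvrnorm().

```r
library(MASS)

# Covariance matrix with unit variances and covariance 0.5
sigma_demo <- matrix(c(1, 0.5, 0.5, 1), nrow = 2)

# Draw 1000 bivariate normal observations (seed is arbitrary)
set.seed(1)
xx_demo <- mvrnorm(1000, mu = c(0, 0), Sigma = sigma_demo)

# Each row is one observation, each column one variable
dim(xx_demo)

# Sample means should be near 0; sample covariance should be near sigma_demo
colMeans(xx_demo)
cov(xx_demo)

# Sample correlation between the two columns should be near 0.5
cor(xx_demo)[1, 2]
```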

Step 1B: Kernel Density

### TWO-DIMENSIONAL KERNEL DENSITY ESTIMATE
### n=50 GRID POINTS IN EACH DIRECTION

xx.kde<-kde2d(xx[,1], xx[,2], n=50)
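
It can help to inspect what kde2d() returns before plotting it. The sketch below assumes freshly simulated data under illustrative names (xx_demo, kde_demo) and uses a simple Riemann sum to confirm that the estimated surface encloses a volume of roughly 1.

```r
library(MASS)

# Simulate correlated data so the example is self-contained (seed is arbitrary)
set.seed(1)
xx_demo <- mvrnorm(1000, mu = c(0, 0), Sigma = matrix(c(1, 0.5, 0.5, 1), 2))

# Two-dimensional kernel density estimate on a 50 x 50 grid
kde_demo <- kde2d(xx_demo[, 1], xx_demo[, 2], n = 50)

# The result is a list: grid coordinates x and y, and a matrix z of density heights
names(kde_demo)
dim(kde_demo$z)

# Approximate the volume under the surface: sum of heights times the cell area
dx <- diff(kde_demo$x[1:2])
dy <- diff(kde_demo$y[1:2])
volume <- sum(kde_demo$z) * dx * dy
volume
```

The x, y, z list layout is what contour(), image(), and persp() accept directly, which is why the estimate can be passed to them as a single argument.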

Step 2: Visualize Correlation

Plots built into base R:

### PLOT (SCATTER)
plot(xx[,1], xx[,2], pch=16)

### CONTOUR (add = TRUE OVERLAYS IT ON THE SCATTER PLOT ABOVE)
contour(xx.kde, lwd = 2, add = TRUE,
        col = hcl.colors(10, "Spectral"))

### HEATMAP
image(xx.kde)

### PERSPECTIVE
persp(xx.kde, phi=45, theta=30)

Step 3A: Generate Independent Data

Generate 1000 observations from a bivariate normal distribution with zero correlation, that is, with the identity matrix as the covariance matrix.

## ZERO CORR 
yy<-mvrnorm(1000, mu=c(0,0), Sigma=matrix(c(1, 0, 0, 1), 2))
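
For jointly normal variables, zero correlation implies independence, so joint probabilities should factor into products of marginal probabilities. The sketch below checks this empirically; the names yy_demo, p_joint, and p_product and the seed are illustrative assumptions.

```r
library(MASS)

# diag(2) is the 2 x 2 identity matrix, the same as matrix(c(1, 0, 0, 1), 2)
set.seed(1)
yy_demo <- mvrnorm(1000, mu = c(0, 0), Sigma = diag(2))

# Sample correlation should be near 0
cor(yy_demo)[1, 2]

# Proportion of points with both coordinates at or below 0
p_joint <- mean(yy_demo[, 1] <= 0 & yy_demo[, 2] <= 0)

# Product of the two marginal proportions
p_product <- mean(yy_demo[, 1] <= 0) * mean(yy_demo[, 2] <= 0)

# Under independence these two numbers should be close (both near 0.25)
c(joint = p_joint, product = p_product)
```

Running the same comparison on correlated data would show the joint proportion drifting away from the product.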

Step 3B: Kernel Density

### ESTIMATED DENSITY
yy.kde<-kde2d(yy[,1], yy[,2], n=50)

Step 4: Visualize Independence

Plots built into base R:

### PLOT (SCATTER)
plot(yy[,1], yy[,2], pch=16)

### CONTOUR (add = TRUE OVERLAYS IT ON THE SCATTER PLOT ABOVE)
contour(yy.kde, lwd = 2, add = TRUE,
        col = hcl.colors(10, "Spectral"))

### HEATMAP
image(yy.kde)

### PERSPECTIVE
persp(yy.kde, phi=45, theta=30)

Question:

Compare the sets of plots. What do you observe?