The Normal Distribution in R

Plotting the Normal Distribution in R

To plot a normal distribution in R, we can use either base R or ggplot2.

Example 1 (Using Base R):

To plot a normal distribution with mean = 0 and standard deviation = 1, we can use the following code:

# Create a sequence of 100 equally spaced numbers between -5 and 5
x <- seq(-5, 5, length.out = 100)

# Compute the standard normal density (mean = 0, sd = 1) at each value in x
y <- dnorm(x)

# Draw the density as a line (type = "l") with the default axes suppressed,
# then add an x-axis labelled in standard deviations (s) from the mean
plot(x, y, type = "l", lwd = 3, axes = FALSE, xlab = "x", ylab = "Density")
axis(1, at = -3:3, labels = c("-3s", "-2s", "-1s", "mean", "1s", "2s", "3s"))
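
As a quick check on what dnorm() returns, the sketch below compares it with the normal density formula, f(x) = exp(-(x - m)^2 / (2 * s^2)) / (s * sqrt(2 * pi)), written out by hand. The helper name normal_density is hypothetical and is introduced only for illustration.

```r
# Hypothetical helper: the normal density written out from its formula
normal_density <- function(x, mean = 0, sd = 1) {
  exp(-(x - mean)^2 / (2 * sd^2)) / (sd * sqrt(2 * pi))
}

x <- seq(-5, 5, length.out = 100)

# Largest absolute difference between the hand-written formula and dnorm()
max_diff <- max(abs(normal_density(x) - dnorm(x)))

# TRUE when the two agree up to floating-point rounding
densities_match <- isTRUE(all.equal(normal_density(x), dnorm(x)))
print(densities_match)
```

In practice dnorm() is the function to use; the hand-written version is only meant to show what the "height" of the curve at each x represents.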

Example 2: Plot a normal density curve that uses the mean and standard deviation of mpg from the mtcars dataset, using the ggplot2 package.

# Load the tidyverse package, which includes ggplot2
library(tidyverse)

# Load the mtcars dataset
data(mtcars)

# Create a ggplot object using the mtcars dataset, mapping mpg to the x-axis
ggplot(mtcars, aes(x = mpg)) +

# Add a statistical function to the plot, in this case, a normal density curve
# The 'args' argument passes the mean and standard deviation of the mpg variable to dnorm
stat_function(fun = dnorm,
              args = list(mean = mean(mtcars$mpg), sd = sd(mtcars$mpg))) +

# Set the x-axis label
scale_x_continuous("Miles per gallon") +

# Set the y-axis label
scale_y_continuous("Density")
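
To see how well this fitted curve describes the data, it can help to draw it over a histogram of mpg. The following is a minimal base R sketch, a swapped-in alternative to the ggplot2 code above chosen so that no extra package is needed. The variable names mpg_mean and mpg_sd are illustrative.

```r
# Sample mean and standard deviation of mpg (mtcars ships with base R)
mpg_mean <- mean(mtcars$mpg)
mpg_sd <- sd(mtcars$mpg)

# Histogram on the density scale (freq = FALSE) so it is comparable to dnorm()
hist(mtcars$mpg, freq = FALSE,
     main = "mpg with Fitted Normal Curve",
     xlab = "Miles per gallon", ylab = "Density")

# Overlay the normal density that has the same mean and standard deviation
curve(dnorm(x, mean = mpg_mean, sd = mpg_sd), add = TRUE, lwd = 2)
```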

Simulating the Normal Distribution in R

The normal distribution can be simulated using the base R function rnorm(). This function generates random numbers from a normal distribution with a specified mean and standard deviation.
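
Because rnorm() draws pseudo-random numbers, repeated calls return different values unless the seed is reset first. The short sketch below illustrates this. The seed value 1 and the variable names are arbitrary choices made for illustration.

```r
# Resetting the seed before each call reproduces the same draws
set.seed(1)
first_draw <- rnorm(5, mean = 0, sd = 1)

set.seed(1)
second_draw <- rnorm(5, mean = 0, sd = 1)

# Without resetting the seed, the generator continues and gives new values
third_draw <- rnorm(5, mean = 0, sd = 1)

print(identical(first_draw, second_draw))  # same seed, same numbers
print(identical(first_draw, third_draw))   # generator has moved on, new numbers
```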

Example 3: Generate 1000 random numbers from a normal distribution with a mean of 0 and a standard deviation of 1. Plot a histogram of the generated data to visualize the distribution.

# Set the seed for reproducibility
set.seed(12358)

# Generate 1000 random numbers from a normal distribution
normal_random_numbers <- rnorm(1000, mean = 0, sd = 1)

# Create a histogram of the generated normal random numbers
hist(
  normal_random_numbers,                                   # Data to be plotted
  main = "Histogram: Standard Normal Distribution",        # Title of the histogram
  xlab = "Normal Random Numbers",                          # Label for the x-axis
  ylab = "Frequency"                                       # Label for the y-axis
)
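
A simulated sample will not match the requested parameters exactly. As a rough check, the sketch below compares the sample mean and standard deviation of the same 1000 draws with the target values of 0 and 1. The names sample_mean and sample_sd are illustrative.

```r
set.seed(12358)
normal_random_numbers <- rnorm(1000, mean = 0, sd = 1)

# Sample estimates; these should be close to, but not exactly, 0 and 1
sample_mean <- mean(normal_random_numbers)
sample_sd <- sd(normal_random_numbers)

print(c(mean = sample_mean, sd = sample_sd))
```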

Example 4: Generate 1000 random numbers from a normal distribution with a mean of 50 and a standard deviation of 5. Plot a histogram of the generated data to visualize the distribution.

# Set the seed for reproducibility
set.seed(12358)

# Generate 1000 random numbers from a normal distribution
normal_random_numbers <- rnorm(1000, mean = 50, sd = 5)

# Create a histogram of the generated normal random numbers
hist(
  normal_random_numbers,                         # Data to be plotted
  main = "Histogram: Normal Distribution",       # Title of the histogram
  xlab = "Normal Random Numbers",                # Label for the x-axis
  ylab = "Frequency"                             # Label for the y-axis
)
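
One way to judge whether the simulated values behave like a normal distribution is to count how many fall within one standard deviation of the mean, which is about 68% in theory. The following is a minimal sketch, assuming the same seed and parameters as Example 4. The names within_1sd and theoretical_1sd are illustrative.

```r
set.seed(12358)
normal_random_numbers <- rnorm(1000, mean = 50, sd = 5)

# Share of simulated values between 45 and 55 (mean plus or minus one sd)
within_1sd <- mean(abs(normal_random_numbers - 50) <= 5)

# Theoretical probability of the same interval from the normal CDF
theoretical_1sd <- pnorm(55, mean = 50, sd = 5) - pnorm(45, mean = 50, sd = 5)

print(c(simulated = within_1sd, theoretical = theoretical_1sd))
```

The simulated share varies with the seed and the sample size, so it should only be expected to land near the theoretical value.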