What is a Discrete Variable

  • A discrete variable is a random variable that takes values from a finite or countably infinite set
  • Usually represented by integers
  • The probability distribution of a discrete variable is described by a probability mass function (PMF), which gives the probability of each possible value

Example:
- The random variable X is the number of true-or-false questions you get correct on a test
- This is a discrete random variable because X can only take the whole-number values 0, 1, ..., n, where n is the number of questions on the test
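
As a minimal sketch of this example, assume a test with 10 true-or-false questions where every answer is a pure guess. Both the number of questions (`n_questions`) and the guessing probability (`p_correct`) are assumptions made for illustration. Under those assumptions X follows a binomial distribution, and its PMF can be tabulated with base R's `dbinom()`:

```r
# Hypothetical setup: 10 true-or-false questions, each guessed with probability 0.5
n_questions <- 10
p_correct <- 0.5

# X can only take the integer values 0, 1, ..., n_questions
k <- 0:n_questions

# PMF: probability of getting exactly k questions correct
pmf <- dbinom(k, size = n_questions, prob = p_correct)

# The PMF assigns a probability to every possible value, and these sum to 1
print(data.frame(correct = k, probability = round(pmf, 4)))
print(sum(pmf))
```

Because X has only 11 possible values here, the whole distribution fits in a small table.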

What is a Continuous Variable

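  • A continuous variable is a random variable that takes values from an uncountably infinite set, such as an interval of real numbers
  • Usually represented by real numbers, which are rounded when measured in practice
  • The probability distribution of a continuous variable is described by a probability density function (PDF); probabilities are areas under the density curve, so the probability of any single exact value is 0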

Examples of continuous variables:
- height
- weight
- time
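
To illustrate how probabilities work for a continuous variable, the sketch below assumes height (in inches) follows a normal distribution with mean 80 and standard deviation 5. These parameter values and the names `mu` and `sigma` are chosen only for illustration. The probability of an interval is an area under the density, computed here in two ways with base R:

```r
# Hypothetical model: height in inches, normally distributed
mu <- 80
sigma <- 5

# Probability that height falls between 75 and 85 inches,
# as a difference of the cumulative distribution function at the endpoints
p_interval <- pnorm(85, mean = mu, sd = sigma) - pnorm(75, mean = mu, sd = sigma)

# The same probability as the area under the density curve
p_area <- integrate(dnorm, lower = 75, upper = 85, mean = mu, sd = sigma)$value

# The density at a single point is a height of the curve, not a probability;
# P(X = 80) itself is 0
density_at_80 <- dnorm(80, mean = mu, sd = sigma)

print(c(p_interval = p_interval, p_area = p_area, density_at_80 = density_at_80))
```

The two interval probabilities agree up to numerical integration error, while `density_at_80` is just the value of the PDF at that point.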

Types of Discrete Variable Distributions

  • Discrete Uniform
    PMF:\(P(X=k) = \frac{1}{n} \qquad \text{for } k=1,2,\dots,n\)

  • Binomial
    PMF:\(P(X=k) = \binom{n}{k} p^k (1-p)^{n-k} \qquad \text{for } k=0,1,\dots,n\)

  • Poisson
    PMF:\(P(X=k) = e^{-\lambda} \frac{\lambda^k}{k!} \qquad \text{for } k=0,1,2,\dots\)
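
The three PMFs above can be evaluated numerically. The following is a small sketch with illustrative parameter values (`n`, `p`, and `lambda` are assumptions, not values from these slides). It checks one value of the binomial and Poisson PMFs against the formulas and confirms that each PMF sums to 1:

```r
# Illustrative parameter values (assumptions)
n <- 10        # number of values (uniform) and number of trials (binomial)
p <- 0.3       # success probability for the binomial
lambda <- 4    # mean of the Poisson

# Discrete uniform on 1, ..., n: every value has probability 1/n
pmf_unif <- rep(1 / n, n)

# Binomial on 0, ..., n
k_binom <- 0:n
pmf_binom <- dbinom(k_binom, size = n, prob = p)

# Poisson on 0, 1, 2, ...; truncated at 50 because the support is infinite
k_pois <- 0:50
pmf_pois <- dpois(k_pois, lambda = lambda)

# Compare P(X = 3) for the binomial and P(X = 2) for the Poisson with the formulas
binom_matches <- isTRUE(all.equal(pmf_binom[4], choose(n, 3) * p^3 * (1 - p)^(n - 3)))
pois_matches <- isTRUE(all.equal(pmf_pois[3], exp(-lambda) * lambda^2 / factorial(2)))

# Each PMF sums to 1 (the truncated Poisson tail is negligible for lambda = 4)
totals <- c(uniform = sum(pmf_unif), binomial = sum(pmf_binom), poisson = sum(pmf_pois))
print(c(binom_matches = binom_matches, pois_matches = pois_matches))
print(totals)
```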

Example of a Discrete Uniform Distribution

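library(plotly)

# Support of the discrete uniform distribution: the integers a, a + 1, ..., b
a <- 1
b <- 10

x <- seq(a, b, by = 1)

# Each of the n = b - a + 1 values has probability 1/n.
# (dunif() is the continuous uniform density and would return 1 / (b - a) here.)
y <- rep(1 / length(x), length(x))

plot_ly(x = x, y = y, type = "bar", name = "Discrete Uniform Distribution") %>%
  layout(title = "Discrete Uniform Distribution",
         xaxis = list(title = "X"),
         yaxis = list(title = "Probability Mass"))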

Example of a Discrete Uniform Distribution (cont’d)

Types of Continuous Variable Distributions

  • Continuous Uniform
    \(f(x) = \begin{cases}\frac{1}{b-a} & \text{for } a \leq x \leq b \\0 & \text{otherwise}\end{cases}\)
  • Normal
    \(f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\)
  • Standard Normal
    \(f(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}}\)
  • Exponential
    \(f(x) = \begin{cases}\lambda e^{-\lambda x} & \text{for } x \geq 0 \\0 & \text{for } x < 0\end{cases}\)
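
As a rough numerical check of these densities, the sketch below uses base R's `dunif()`, `dnorm()`, and `dexp()` with illustrative parameter values (`a`, `b`, `mu`, `sigma`, and `rate` are assumptions). It verifies that the area under each density is 1 and that the built-in functions agree with the formulas above at one point:

```r
# Illustrative parameter values (assumptions)
a <- 0; b <- 2          # continuous uniform on [a, b]
mu <- 1; sigma <- 2     # normal mean and standard deviation
rate <- 1.5             # exponential rate (lambda in the formula)

# Total area under each density, by numerical integration
area_unif <- integrate(dunif, lower = a, upper = b, min = a, max = b)$value
area_norm <- integrate(dnorm, lower = -Inf, upper = Inf, mean = mu, sd = sigma)$value
area_std  <- integrate(dnorm, lower = -Inf, upper = Inf)$value
area_exp  <- integrate(dexp, lower = 0, upper = Inf, rate = rate)$value

# Compare the built-in densities with the formulas at x = 0.5
x0 <- 0.5
norm_formula <- 1 / (sigma * sqrt(2 * pi)) * exp(-(x0 - mu)^2 / (2 * sigma^2))
exp_formula <- rate * exp(-rate * x0)

print(c(uniform = area_unif, normal = area_norm,
        standard_normal = area_std, exponential = area_exp))
print(all.equal(dunif(x0, min = a, max = b), 1 / (b - a)))
print(all.equal(dnorm(x0, mean = mu, sd = sigma), norm_formula))
print(all.equal(dexp(x0, rate = rate), exp_formula))
```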

Example of a Standard Normal Distribution

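library(plotly)

# Grid of z-scores covering four standard deviations on each side of the mean
x <- seq(-4, 4, length.out = 1000)

# Standard normal density (dnorm() defaults to mean = 0, sd = 1)
y <- dnorm(x)

plot_ly(x = x, y = y, type = "scatter", mode = "lines",
        name = "Standard Normal Density") %>%
  layout(title = "Standard Normal Distribution",
         xaxis = list(title = "Z-Score"),
         yaxis = list(title = "Density"))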

Example of a Standard Normal Distribution (cont’d)

Example Graph of Discrete Variables

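library(ggplot2)

# Counts of animals by type; the counts are whole numbers (a discrete variable)
data <- data.frame(
  animal = c("Mammals", "Fish", "Birds", "Reptiles", "Amphibians"),
  count = c(20, 25, 5, 30, 10)
)

# stat = "identity" plots the counts as given instead of tallying rows
ggplot(data, aes(x = animal, y = count, fill = animal)) +
  geom_bar(stat = "identity") +
  labs(fill = "Type of Animals") +
  xlab("Animals") +
  ylab("Count")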

Example Graph of Discrete Variables (cont’d)

Example Graph of Continuous Variables

data <- data.frame(
  height = rnorm(1000, mean = 80, sd = 5)
)

ggplot(data, aes(x = height)) +
  geom_histogram(binwidth = 0.5, fill = "red", color = "white") +
  xlab("Height (in)") +
  ylab("Count")

Example Graph of Continuous Variables (cont’d)

Summary

  • Both graphs show what the distributions of a discrete variable and a continuous variable look like. The discrete graph shows categories on the x-axis and counts on the y-axis, whereas the continuous graph shows height, which can take any value within a range, and its histogram forms an approximately normal (bell-shaped) curve.

  • Overall, this information shows how discrete and continuous variables differ. It also gives an idea of which types of data are modeled by discrete variables and which by continuous variables. Additionally, the graphs and distributions show what a random variable looks like under a specific distribution.