Point Estimation

2024-02-04

Introduction

Let's understand point estimation in statistics.
We'll explore how to estimate unknown parameters from sample data.
Point estimators are functions of a random sample that produce a single approximate value for an unknown population parameter.
The sample size largely determines the accuracy of the estimate.
- The larger the sample, the more accurate the estimate tends to be.
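
The effect of sample size can be checked with a small simulation. This is a minimal sketch under assumed values: the population is taken to be normal with mean 50 and standard deviation 10, and `true_mean`, `sample_sizes`, and `spread` are illustrative names, not part of the original slides.

```r
# Hypothetical population: normal with true mean 50 and sd 10 (assumed values)
set.seed(42)
true_mean <- 50
sample_sizes <- c(10, 100, 1000)

# For each sample size, draw 500 samples and record the spread of the sample means
spread <- sapply(sample_sizes, function(n) {
  estimates <- replicate(500, mean(rnorm(n, mean = true_mean, sd = 10)))
  sd(estimates)  # smaller spread means the estimates sit closer to the true mean
})

names(spread) <- sample_sizes
print(round(spread, 2))
```

The spread of the estimates should shrink roughly in proportion to \(1/\sqrt{n}\), so a tenfold increase in sample size buys only about a threefold gain in precision.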

Population and Sample

Population Parameter: Denoted by \(\theta\) (unknown)
- Represents a characteristic of the entire population.
- Typically unknown and needs to be estimated.
Sample Statistic: Denoted by \(\hat{\theta}\)
- An estimate of the population parameter.
- Obtained from the sample data.

Estimating the Mean (Eqn in Latex)

The sample mean \(\bar{X}\) is the point estimator of the population mean \(\mu\).
\[ \hat{\mu} = \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} \]
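
As a quick check of the formula, the sketch below computes the estimate term by term and compares it with R's built-in `mean()`. The five measurements in `x` are made-up illustrative values.

```r
# Hypothetical sample of five measurements (illustrative values)
x <- c(4.2, 5.1, 3.8, 4.9, 5.0)
n <- length(x)

# Point estimate of the population mean: sum of the observations divided by n
mu_hat <- sum(x) / n

# The built-in mean() implements the same estimator
stopifnot(isTRUE(all.equal(mu_hat, mean(x))))
print(mu_hat)  # 4.6
```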

Estimating Variance (Eqn in Latex)

The sample variance \(S^2\) is the point estimator of the population variance \(\sigma^2\). Dividing by \(n-1\) rather than \(n\) makes the estimator unbiased.
\[ \hat{\sigma}^2 = S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1} \]
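
The same kind of check works for the variance. In this sketch the values in `x` are again made up for illustration; R's `var()` uses the \(n-1\) divisor, so it should agree with the formula above.

```r
# Hypothetical sample of five measurements (illustrative values)
x <- c(4.2, 5.1, 3.8, 4.9, 5.0)
n <- length(x)

# Sample variance: squared deviations from the sample mean, divided by n - 1
s2 <- sum((x - mean(x))^2) / (n - 1)

# The built-in var() uses the same n - 1 divisor
stopifnot(isTRUE(all.equal(s2, var(x))))
print(s2)  # 0.325
```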

Distribution of Sample and its Mean

Confidence Intervals (Eqn in Latex)

A confidence interval gives a range of plausible values for the parameter, centered on the point estimate.
\[ \text{Confidence Interval: } \hat{\theta} \pm \text{Margin of Error} \]
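
The slides do not say how the margin of error is computed. One common choice for a mean, assumed here, is a t-based margin: the critical value times the estimated standard error. The sample in `x` is simulated and the 95% level is an illustrative choice.

```r
# Hypothetical sample of 30 observations (simulated, assumed values)
set.seed(1)
x <- rnorm(30, mean = 25, sd = 5)
n <- length(x)

# Point estimate and a t-based 95% margin of error (an assumed, common choice)
theta_hat <- mean(x)
margin <- qt(0.975, df = n - 1) * sd(x) / sqrt(n)

# Interval: point estimate plus or minus the margin of error
ci <- c(lower = theta_hat - margin, upper = theta_hat + margin)
print(ci)
```

The result should match the interval reported by `t.test(x)$conf.int`, which uses the same construction.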

Point Estimate of Population Example

Let's estimate the proportion of computer owners in a certain city who use antivirus software. We survey a random sample of 20 citizens.

The calculated sample proportion is 0.6, which serves as the point estimate of the population proportion.
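
A sample proportion of 0.6 from 20 respondents implies that 12 of them reported using antivirus. The sketch below encodes that inferred count as a 0/1 vector (the name `responses` is illustrative); the sample proportion is then just the mean of the vector.

```r
# 20 survey responses: 1 = uses antivirus, 0 = does not
# (12 ones is inferred from 0.6 * 20; the individual responses are not in the source)
responses <- c(rep(1, 12), rep(0, 8))

# The sample proportion is the mean of the 0/1 responses
p_hat <- mean(responses)
print(p_hat)  # 0.6
```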

Problem: Estimating Pollution Level

Let's estimate the average concentration of a harmful pollutant, \(\mu_{\text{pollutant}}\), in the air so that mitigation strategies can be implemented.

The first step is to collect data from various monitoring stations.

Then we estimate the average concentration of the pollutant from the sample data.

Finally, we create a 3D scatter plot of the data that also shows the estimated mean concentration.

Visualization of the Data and point estimate

R Code for the 3D Plot

library(plotly)

# Create a hypothetical dataset of 100 monitoring stations
set.seed(123)
monitoring_data <- data.frame(
  Longitude = rnorm(100, mean = 12, sd = 2),
  Latitude = rnorm(100, mean = 34, sd = 1),
  Pollutant_Concentration = rnorm(100, mean = 25, sd = 5)
)

# Point estimate of the population mean concentration (the sample mean)
point_estimate <- mean(monitoring_data$Pollutant_Concentration)

# Color encodes concentration; marker size is fixed at 5
# (the earlier version also mapped size to concentration, which conflicted with it)
plot_ly(monitoring_data,
        x = ~Longitude,
        y = ~Latitude,
        z = ~Pollutant_Concentration,
        color = ~Pollutant_Concentration,
        type = "scatter3d",
        mode = "markers",
        marker = list(line = list(color = "red", width = 2),
                      size = 5),
        text = ~paste("Concentration: ", round(Pollutant_Concentration, 2)),
        showlegend = FALSE) %>%
  # Mark the point estimate at the mean station location; inherit = FALSE keeps
  # the color mapping and hover text of the station trace out of this trace
  add_trace(x = mean(monitoring_data$Longitude),
            y = mean(monitoring_data$Latitude),
            z = point_estimate,
            type = "scatter3d",
            mode = "markers",
            marker = list(color = "red", size = 7, symbol = "x"),
            text = "Point Estimate",
            showlegend = FALSE,
            inherit = FALSE) %>%
  colorbar(title = "Concentration") %>%
  # The plot title belongs to the layout, not the scene; leave top margin for it
  layout(title = "3D Scatter Plot - Pollutant Concentration Across the City",
         scene = list(xaxis = list(title = "Longitude"),
                      yaxis = list(title = "Latitude"),
                      zaxis = list(title = "Pollutant Concentration")),
         margin = list(l = 0, r = 0, b = 0, t = 40))

Additional Related Formulas (Eqn in Latex)

Bias of an Estimator
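
The formula for this heading did not survive in the text. Assuming the standard definition is intended, the bias of an estimator \(\hat{\theta}\) is the difference between its expected value and the true parameter:
\[ \text{Bias}(\hat{\theta}) = E[\hat{\theta}] - \theta \]
An estimator is unbiased when this difference is zero. For example, \(E[\bar{X}] = \mu\), so the sample mean is an unbiased estimator of the population mean.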

Standard Error (SE)
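
The formula for this heading did not survive in the text either. Assuming the standard error of the sample mean is meant, it is the standard deviation of \(\bar{X}\) across repeated samples, estimated by substituting the sample standard deviation \(S\) for the unknown \(\sigma\):
\[ \text{SE}(\bar{X}) = \frac{\sigma}{\sqrt{n}} \approx \frac{S}{\sqrt{n}} \]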

Thank You!
