2024-09-21

Introduction to Point Estimation

Point estimation is the process of using data from a sample to calculate a single value that serves as an approximation of an unknown population parameter.

In point estimation, the point estimate is calculated from sample data. Sample size affects the precision of the estimate: with random sampling, a larger sample generally yields a point estimate closer to the true parameter.
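One way to see the effect of sample size is through the standard error of a sample proportion, sqrt(p(1 - p) / n), which shrinks as the sample size n grows. The sketch below assumes simple random sampling and a hypothetical true proportion of 0.4; neither the proportion nor the sample sizes come from the survey.

```r
# Hypothetical true proportion, chosen only for illustration
p <- 0.4

# Hypothetical sample sizes, each four times the previous one
n <- c(25, 100, 400, 1600)

# Standard error of the sample proportion under simple random sampling
se <- sqrt(p * (1 - p) / n)

se_table <- data.frame(n = n, standard_error = se)
print(se_table)
```

Under these assumptions, each fourfold increase in sample size halves the standard error, so the gain in precision slows as the sample grows.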

Common Point Estimators:
- Sample Mean (Used to estimate the population mean, i.e. the center of continuous data)
- Sample Proportion (Used to estimate population proportion)
- Sample Variance (Used to estimate population variance)
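As a minimal sketch, the three estimators listed above can be computed in base R with mean() and var(). The values below are hypothetical and serve only to illustrate the calls.

```r
# Hypothetical sample of eight continuous measurements (illustrative values)
x <- c(4.1, 5.3, 4.8, 6.0, 5.5, 4.9, 5.2, 5.0)

sample_mean <- mean(x)  # point estimate of the population mean
sample_var  <- var(x)   # point estimate of the population variance (divides by n - 1)

# Hypothetical yes/no outcomes, where TRUE marks a "success"
success <- c(TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, FALSE, TRUE)

# The mean of a logical vector is the share of TRUE values
sample_prop <- mean(success)  # point estimate of the population proportion
```

Note that var() uses the n - 1 denominator, which is the usual unbiased estimator of the population variance.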

Point Estimate Formula

The point estimate formula is applied to sample data to obtain the point estimate. For a sample proportion:

\(\hat{p} = x / n\)

\(\hat{p}\): sample proportion
\(x\): number of successes in the sample
\(n\): sample size
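The sample proportion formula can be sketched as a small R function. The function name sample_proportion and the counts used here (45 successes in a sample of 120) are hypothetical and chosen only for illustration.

```r
# Point estimate of a population proportion:
# number of successes divided by the sample size
sample_proportion <- function(successes, sample_size) {
  # Basic sanity checks on the inputs
  stopifnot(sample_size > 0, successes >= 0, successes <= sample_size)
  successes / sample_size
}

# Hypothetical example: 45 successes observed in a sample of 120
p_hat <- sample_proportion(45, 120)
print(p_hat)  # 0.375
```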

Customer Survey Data

  • The CMT customer satisfaction survey is used as sample data to demonstrate data plots and point estimation

  • The survey includes data obtained from customers in 7 survey years between 2015 and 2022 (2018 does not appear in the data)

  • Customers are given the choice between 5 different rating categories to quantify their satisfaction (Very Poor, Poor, Satisfactory, Above Average, Excellent)

  • A percentage is assigned to each customer rating category for each year

Data Obtained from Customer Satisfaction Survey

# Load the raw survey export (semicolon-separated, with a header row)
df <- read.csv("CTM_Customer_Satisfaction_Survey_20240922.csv", sep = ";", header = TRUE)

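# Survey years covered by the data (2018 is not included)
years <- c(2015, 2016, 2017, 2019, 2020, 2021, 2022)

# Percentage of responses in each rating category, by year.
# Note: data.frame() converts these column names to X2015, X2016, ... by default.
ratings <- data.frame(
  Rating = c("Very Poor", "Poor", "Satisfactory", "Above Average", "Excellent"),
  `2015` = c(1.2, 3.5, 29.7, 31.2, 34.4),
  `2016` = c(0.2, 2.4, 29.1, 33.7, 34.6),
  `2017` = c(1.5, 2, 26.4, 31.5, 38.6),
  `2019` = c(0.8, 3, 24.2, 29.5, 42.6),
  `2020` = c(0.4, 4.3, 29.7, 30.2, 35.5),
  `2021` = c(1.8, 6.94, 27.95, 26.78, 36.52),
  `2022` = c(1.39, 6.82, 31.43, 23.5, 36.86)
)

# Sum the percentages in each year column (each total should be close to 100)
ratings_sum <- colSums(ratings[, 2:8])

# One row per survey year, ready for plotting
plot_data <- data.frame(Year = years, TotalRating = ratings_sum)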
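As a hedged illustration of point estimation on this data, the sketch below treats the seven yearly "Excellent" percentages as a sample and uses their mean as a point estimate of the typical yearly share of "Excellent" ratings. Weighting each year equally is an assumption, since the number of respondents per year is not given here. The names excellent, excellent_mean and excellent_sd are introduced only for this example.

```r
# "Excellent" percentages per survey year, copied from the ratings table
excellent <- c(
  "2015" = 34.4, "2016" = 34.6, "2017" = 38.6, "2019" = 42.6,
  "2020" = 35.5, "2021" = 36.52, "2022" = 36.86
)

# Point estimate of the typical yearly "Excellent" share (equal weight per year)
excellent_mean <- mean(excellent)

# Sample standard deviation, describing the year-to-year spread
excellent_sd <- sd(excellent)

print(round(excellent_mean, 2))
print(round(excellent_sd, 2))
```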

Scatter Plot Showing Sum of Ratings for Each Year