Point estimate

2023-11-13


definition

In statistics, a point estimate is a single value used to approximate an unknown population parameter. Its purpose is to provide the best guess for the true value of that parameter based on information obtained from a sample of data. A point estimate will rarely equal the true value exactly, so it is usually reported together with a confidence interval, which gives a range of plausible values for the parameter.
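As a rough illustration of pairing a point estimate with a confidence interval, the sketch below simulates a sample and reports both. The simulated data and the 95% level are assumptions chosen for illustration; `t.test()` is base R's one-sample t procedure, used here only for the interval it returns.

```r
# Simulated sample (assumption: normal data with mean 10 and sd 2)
set.seed(123)
x <- rnorm(100, mean = 10, sd = 2)

# Point estimate of the population mean
point_estimate <- mean(x)

# 95% t-based confidence interval around that estimate
ci <- t.test(x, conf.level = 0.95)$conf.int

cat(sprintf("Point estimate: %.2f\n", point_estimate))
cat(sprintf("95%% CI: [%.2f, %.2f]\n", ci[1], ci[2]))
```

The interval is centered on the sample mean, so the point estimate always falls inside it.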

how to calculate a point estimate

There are a few different ways of calculating a point estimate, depending on the parameter being estimated. For a population mean, the point estimate is the sample mean: \[\bar{X} = \frac{1}{n}\sum_{i=1}^n X_i\] where \(X_i\) is each observation in the sample and \(n\) is the sample size.
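A minimal sketch of the mean calculation in R, using a small hypothetical sample (the values are made up for illustration). It computes the estimate by hand and checks it against the built-in `mean()`.

```r
# Hypothetical sample of 8 measurements (illustrative values only)
x <- c(9.1, 10.4, 11.2, 8.7, 10.0, 12.3, 9.8, 10.5)
n <- length(x)

# Sum of the observations divided by the sample size
mean_manual <- sum(x) / n

# Built-in equivalent
mean_builtin <- mean(x)

print(c(manual = mean_manual, builtin = mean_builtin))
```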

For a population proportion, the point estimate is the sample proportion: \[\hat{p} = \frac{S}{n}\] where \(S\) is the number of successes in the sample and \(n\) is the sample size.
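The proportion can be sketched the same way. The vector of outcomes below is hypothetical, with `TRUE` standing for a success.

```r
# Hypothetical binary outcomes (TRUE = success); illustrative values only
outcomes <- c(TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, TRUE, FALSE)

successes <- sum(outcomes)   # number of successes in the sample
n <- length(outcomes)        # sample size

# Point estimate of the population proportion
p_hat <- successes / n
print(p_hat)
```

Because R treats `TRUE` as 1 and `FALSE` as 0, `mean(outcomes)` gives the same result in one step.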

calculation of point estimate 2

To calculate the point estimate of a population variance, the formula is:

\[S^2 = \frac{\sum_{i=1}^n (X_i - \bar{X})^2}{n-1}\]

where \(X_i\) is each observation in the sample, \(\bar{X}\) is the sample mean, and \(n\) is the sample size.
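A short sketch of the variance calculation, again on a hypothetical sample. Each deviation from the sample mean is squared before summing, and the sum is divided by the sample size minus one; the result is compared with R's built-in `var()`, which uses the same denominator.

```r
# Hypothetical sample (illustrative values only)
x <- c(9.1, 10.4, 11.2, 8.7, 10.0, 12.3, 9.8, 10.5)
n <- length(x)
x_bar <- mean(x)

# Sum of squared deviations from the sample mean, divided by n - 1
s2_manual <- sum((x - x_bar)^2) / (n - 1)

# Built-in sample variance
s2_builtin <- var(x)

print(c(manual = s2_manual, builtin = s2_builtin))
```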

ggplot slide

This plot shows a histogram of a hypothetical sample, with the point estimate of the population mean (the sample mean) marked by a dashed red line.

box plot

scatter plot


code for histogram

Here is the code that creates the histogram of the hypothetical sample:

    library(ggplot2)

    # Simulate a hypothetical sample of 100 values
    set.seed(123)
    data <- data.frame(x = rnorm(100, mean = 10, sd = 2))

    # Point estimate of the population mean
    sample_mean <- mean(data$x)

    # Histogram with the sample mean marked by a dashed red line
    ggplot(data, aes(x = x)) +
      geom_histogram(binwidth = 1, fill = "lightblue", color = "black", alpha = 0.7) +
      geom_vline(xintercept = sample_mean, linetype = "dashed", color = "red") +
      labs(
        title = "Histogram with Point Estimate",
        subtitle = sprintf("Sample Mean: %.2f", sample_mean),
        x = "",
        y = "Frequency"
      )