2. Point & Interval estimation

A point estimate is a single-value estimate of a population parameter. For instance, a sample mean is a point estimate of a population mean.

An interval estimate gives a range of values within which the parameter is expected to lie, at a stated confidence level. A confidence interval is the most common type of interval estimate.

2.1 Point estimate of population mean

2.1.1

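Let’s calculate the sample size, mean and standard deviation of the wage variable:

# Handle missing values the same way in all three statistics
n <- sum(!is.na(wage1$wage))
xbar <- mean(wage1$wage, na.rm = TRUE)
s <- sd(wage1$wage, na.rm = TRUE)
c("Number of wages:"=n, "Mean:"=xbar, "Standard deviation:"=s)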

##    Number of wages:               Mean: Standard deviation: 
##          526.000000            5.896103            3.693086

2.1.2

Now we will calculate the margin of error and the lower and upper bounds of the 95% confidence interval for the mean:

margin <- qt(0.975, df=n-1) * s / sqrt(n)
low <- xbar - margin
high <- xbar + margin
c("From:" = low, "To:" = high)

##    From:      To: 
## 5.579768 6.212437
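
As a cross-check, the same kind of interval can be obtained from base R's t.test(), which reports a t-based confidence interval for the mean in its conf.int element. The sketch below is only an illustration: x is a simulated, hypothetical stand-in for the wage data so that the code runs without wage1. With the real data one would pass wage1$wage instead.

```r
# Hypothetical stand-in data; replace x with wage1$wage for the real analysis.
set.seed(1)
x <- rexp(526, rate = 1 / 6)

n <- length(x)
xbar <- mean(x)
s <- sd(x)

# Manual 95% t-interval: mean plus/minus t quantile times standard error
margin <- qt(0.975, df = n - 1) * s / sqrt(n)
manual_ci <- c(xbar - margin, xbar + margin)

# Built-in equivalent: t.test() stores the interval in $conf.int
builtin_ci <- as.numeric(t.test(x, conf.level = 0.95)$conf.int)

# The two intervals should agree up to floating-point tolerance
all.equal(manual_ci, builtin_ci)
```

Computing the interval by hand keeps each ingredient visible, while t.test() is a convenient way to confirm the arithmetic.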

2.2 Interval estimation

2.2.1

Let’s calculate the standard error of the mean:

# Standard error of the mean (full data set)
s / sqrt(n)

## [1] 0.1610262
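
Written out with the numbers reported above, the standard error and the 95% interval from 2.1.2 fit together as follows. The t quantile of about 1.9645 is the rounded value of qt(0.975, df = 525) implied by the reported bounds, and all figures are rounded for display.

```latex
\mathrm{SE}(\bar{x}) = \frac{s}{\sqrt{n}} = \frac{3.693086}{\sqrt{526}} \approx 0.1610,
\qquad
\bar{x} \pm t_{0.975,\,n-1}\,\mathrm{SE}(\bar{x})
  \approx 5.896 \pm 1.9645 \times 0.1610
  \approx (5.580,\ 6.212)
```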

2.2.2

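Let’s now draw 55 random samples of size 44 from the wage data and store the mean of each one. Note that the standard deviation stored for every sample is s, the standard deviation of the full data set. It is treated here as the known population value, which is why the normal quantile 1.96 is used in the next step:

samp_mean <- rep(NA, 55)   # one mean per sample
samp_sd <- rep(NA, 55)     # standard deviation used for each interval
samp_n <- 44               # size of each sample
for (i in 1:55) {
  samp <- sample(wage1$wage, samp_n)   # draw without replacement
  samp_mean[i] <- mean(samp)
  samp_sd[i] <- s   # full-data sd; sd(samp) would give a per-sample estimate
}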
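
Because sample() draws at random, every run of the loop produces different samples and therefore different intervals. A minimal sketch of a reproducible variant is shown below. It assumes a simulated vector wage as a hypothetical stand-in for wage1$wage, and the seed value is arbitrary. It also computes each sample's own standard deviation, which is the quantity one would pair with a t quantile if the spread were estimated per sample.

```r
# Hypothetical stand-in for wage1$wage so the sketch runs on its own.
set.seed(2021)                    # fix the random number generator
wage <- rexp(526, rate = 1 / 6)

n_samples <- 55
samp_n <- 44

# Draw all samples at once: a 44 x 55 matrix, one column per sample
samples <- replicate(n_samples, sample(wage, samp_n))

samp_mean <- colMeans(samples)          # mean of each sample
samp_sd_own <- apply(samples, 2, sd)    # each sample's own standard deviation
```

Calling set.seed() once before the sampling step is enough to make the whole report give the same numbers each time it is knitted.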

2.2.3

Let’s compute the lower and upper bounds of those 55 confidence intervals and see what the first interval looks like:

lower_ie <- samp_mean - 1.96 * samp_sd / sqrt(samp_n)
upper_ie <- samp_mean + 1.96 * samp_sd / sqrt(samp_n)
c("Lower bound:" = lower_ie[1], "Upper bound:" = upper_ie[1])

## Lower bound: Upper bound: 
##     4.561490     6.743964
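
A 95% confidence level means that, over many repeated samples, roughly 95% of intervals built this way should contain the target value. The sketch below counts how many of 55 intervals cover the full-data mean. It is an illustration under stated assumptions: wage is simulated, hypothetical data standing in for wage1$wage, and the full-data standard deviation is treated as known.

```r
# Hypothetical stand-in for wage1$wage so the sketch runs on its own.
set.seed(42)
wage <- rexp(526, rate = 1 / 6)
xbar <- mean(wage)   # target value the intervals should cover
s <- sd(wage)        # treated as the known standard deviation

samp_n <- 44
samp_mean <- replicate(55, mean(sample(wage, samp_n)))

z <- qnorm(0.975)
lower_ie <- samp_mean - z * s / sqrt(samp_n)
upper_ie <- samp_mean + z * s / sqrt(samp_n)

# TRUE where an interval contains the full-data mean
covers <- lower_ie <= xbar & xbar <= upper_ie
sum(covers)    # number of intervals (out of 55) that cover xbar
mean(covers)   # observed coverage proportion
```

With only 55 intervals the observed proportion will fluctuate around the nominal level from run to run, so a value somewhat above or below 0.95 is not surprising.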

2.2.4

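Finally, we can plot the 55 sample means together with their confidence intervals:

# plotCI() is not part of base R; packages such as gplots and plotrix
# provide it, so one of them must be loaded before this call.
plotCI(1:55,
    samp_mean,
    uiw = qnorm(0.975) * samp_sd / sqrt(samp_n),  # half-width = margin of error
    pt.bg = par("bg"),
    pch = 21,
    xlab = "Sample number (1 to 55)",
    ylab = "Mean wage (samples of size 44)",
    main = "Mean confidence interval")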

Point & Interval Estimation

Tomasz Dąbrowski

9 11 2021

Welcome to my mathematical statistics report.

1. Descriptive statistics

The data set contains cross-sectional wage data: a random sample taken from the U.S. Current Population Survey for the year 1976. It is a data frame with 526 observations (rows) and 24 columns.

The visualization of the dataset:

1.1

Boxplot and violin plot comparing married and unmarried respondents:

1.2

Histograms showing wage and its mean (blue) and density (red area under the curve):
