Confidence_Interval

Author

Kausthub Balaji

Abstract

Confidence Interval

A confidence interval is a range of values, computed from sample data, that is likely to contain an unknown population parameter such as the mean. It expresses the uncertainty attached to an estimate made from a statistical sample. The confidence level chosen for the interval is the proportion of intervals, constructed the same way from repeated samples, that would contain the true parameter value.

Uses

Confidence Intervals are commonly used in

  • Hypothesis Testing

  • Regression Analysis

  • Inferring a Census

Introduction

A confidence interval is a range of values above and below a sample statistic, such as the sample mean, that estimates an unknown population parameter at a designated confidence level. The confidence level is the proportion of such intervals, over repeated sampling, that would contain the parameter.

Statisticians use this concept to measure the uncertainty in an estimate obtained from a sample. If several samples are drawn from a given population and a confidence interval is computed for each at the same confidence level, the intervals show how well the sample data can represent the true value in the population.

Misconceptions

A common misunderstanding is that a confidence interval represents the percentage of values that fall within its limits. A 95% confidence interval does not contain 95% of the values in the sample, and it does not mean that any single value in the interval has a 95% probability of being the population mean. It means that if the sampling were repeated many times and an interval computed each time, about 95% of those intervals would contain the population mean.
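The repeated-sampling reading of the confidence level can be checked with a small simulation. The sketch below is illustrative only: the population mean, standard deviation, sample size, and number of trials are made-up values, and it assumes a normal population with known standard deviation so that a z-based interval applies.

```python
import math
import random
from statistics import NormalDist, mean


def coverage(true_mean=50.0, sigma=10.0, n=30, level=0.95, trials=2000, seed=0):
    """Fraction of simulated intervals that contain the true mean.

    All default values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    # Two-sided critical value of the standard normal (about 1.96 for 95%).
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        x_bar = mean(sample)
        if x_bar - margin <= true_mean <= x_bar + margin:
            hits += 1
    return hits / trials


print(coverage())
```

The printed fraction should be close to 0.95. It describes how often the procedure succeeds over many samples, not the share of data values inside any one interval.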

Statistical Methods

Several statistical methods are used to construct confidence intervals. Some of them are

  • t-test : Compares the means of two samples, or estimates a mean, when the population variance is unknown. The corresponding interval uses critical values from the t distribution.

  • z-test : Tests whether the means of two samples differ when the population standard deviation is known and the sample is large. The corresponding interval uses critical values from the standard normal distribution.

Factors affecting confidence intervals

The factors affecting a confidence interval are

  • Confidence Level : The higher the confidence level, the wider the interval.

  • Sample Size : A larger sample represents the population better. For a given confidence level, the larger the sample size, the narrower the interval.

  • Population Size : The population size usually has little effect, and when the population is very large relative to the sample it is ignored entirely. It is considered only when working with a small, closed population, such as a known group of people.
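The first two factors can be illustrated numerically. The following is a minimal sketch, assuming the large-sample interval x̄ ± z · s / √n; the function name `interval_width` and the values s = 10, n = 100 are hypothetical.

```python
import math
from statistics import NormalDist


def interval_width(level, s, n):
    """Full width of the interval x_bar +/- z * s / sqrt(n)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return 2 * z * s / math.sqrt(n)


# Higher confidence level -> wider interval (s and n held fixed).
for level in (0.90, 0.95, 0.99):
    print(level, round(interval_width(level, s=10, n=100), 2))

# Larger sample -> narrower interval (level and s held fixed).
for n in (25, 100, 400):
    print(n, round(interval_width(0.95, s=10, n=n), 2))
```

Because the width is proportional to 1/√n, quadrupling the sample size halves the interval.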

Formulae

Confidence Interval (CI) for an infinite population is given by

CI = x̄ ± z · s / √n

where

x̄ = sample mean

z = critical value for the chosen confidence level (for example, 1.96 for 95%)

s = sample standard deviation

n = sample size
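As a minimal sketch, the large-sample interval x̄ ± z · s / √n can be computed from the quantities defined above. The function name `z_interval` and the example inputs are hypothetical; the critical value z is obtained from the standard normal distribution.

```python
import math
from statistics import NormalDist


def z_interval(x_bar, s, n, level=0.95):
    """Return (lower, upper) for x_bar +/- z * s / sqrt(n)."""
    if n < 1:
        raise ValueError("sample size must be at least 1")
    # Two-sided critical value, e.g. about 1.96 for a 95% level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin


# Hypothetical inputs: sample mean 50, standard deviation 10, sample size 100.
lower, upper = z_interval(50.0, 10.0, 100)
print(round(lower, 2), round(upper, 2))
```

For small samples with an unknown population variance, a critical value from the t distribution would be used in place of z.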

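Confidence Interval (CI) for a population of size N is given by

CI = x̄ ± z · (s / √n) · √((N − n) / (N − 1))

where

x̄ = sample mean

z = critical value for the chosen confidence level (for example, 1.96 for 95%)

s = sample standard deviation

n = sample size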
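For a finite population, a common approach is to shrink the margin of error by the finite population correction √((N − n) / (N − 1)). The sketch below assumes that standard correction factor; the function name and the example inputs are hypothetical.

```python
import math
from statistics import NormalDist


def finite_population_interval(x_bar, s, n, N, level=0.95):
    """z-interval with the standard finite population correction (assumed form)."""
    if N < 2 or not 1 <= n <= N:
        raise ValueError("need N >= 2 and 1 <= n <= N")
    z = NormalDist().inv_cdf(0.5 + level / 2)
    fpc = math.sqrt((N - n) / (N - 1))  # tends to 1 as N grows
    margin = z * s / math.sqrt(n) * fpc
    return x_bar - margin, x_bar + margin


# Hypothetical inputs: 100 units sampled from a population of 1000.
print(finite_population_interval(50.0, 10.0, 100, 1000))
# With a very large N the correction is negligible.
print(finite_population_interval(50.0, 10.0, 100, 10_000_000))
```

When n equals N the whole population has been measured, so the margin collapses to zero; as N grows the result approaches the infinite-population interval.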

Real-Life Application

Biology

Confidence intervals are used to estimate the mean characteristics of many plant and animal species. For example, to find the mean length or weight of a species of snake, instead of finding and measuring every snake, a sample of only 50 is chosen and its mean and standard deviation are recorded. These are then used to construct an interval for the true mean of the snakes in the entire population.

Clinical Trials

Confidence intervals are regularly used in clinical trials to determine the mean change in blood pressure, heart rate, cholesterol, etc. produced by some new drug or treatment.

For example, a doctor may believe that a new drug is able to reduce blood pressure in patients. To test this, he may recruit 20 patients to take part in a trial in which they use the new drug for one month. At the end of the month, the doctor may record the mean decrease in blood pressure and the standard deviation of the decrease across the patients in the sample.

He could then use the sample mean and sample standard deviation to construct an interval for the true mean change in blood pressure that patients are likely to experience in the population.

Marketing

Confidence intervals are regularly used by marketing departments within businesses to determine whether some new marketing technique, method, tactic, etc. produces significantly more revenue.

For example, a marketing team at a grocery retailer may run two different marketing campaigns at 20 different stores each during one quarter and measure the average sales produced by each campaign at each store at the end of the quarter.

Manufacturing

Confidence intervals are frequently used by engineers in manufacturing plants to determine whether some new process, technique, method, etc. causes a significant change in the number of defective products produced by the plant.

For example, an engineer may believe that a new process will change the number of defective widgets produced per day, which is currently 50. To test this, he may implement the new process and record the number of defective products produced each day for one month at the plant.

He could then use the sample mean and sample standard deviation of the number of daily defects to construct a confidence interval for the true mean number of defective products produced by the new process.

Population Studies

Confidence intervals can also help with population studies. For example, governments or health insurance organizations may want to know what proportion of the population has a certain health condition.

They could aggregate data from a number of doctors to get a large sample and then use a confidence interval to get a range for the proportion of people with the health condition.

Market Research

Confidence intervals can help market researchers better understand customers. For example, you may want to know the average age or average household income of your customers.

With a confidence interval for the average income of your customers, you can estimate how much disposable income they have and whether or not they can afford your product.

With a confidence interval for the average age of your customers, you can find out whether it makes sense to expand into a related product line and cross-sell it to existing customers (depending on the target demographic).

Problems and Solutions

  1. The price (in $) of a particular product was recorded in 16 stores selected at random in a neighborhood of a city. The following prices were noted:

    95, 108, 97, 112, 99, 106, 105, 100, 99, 98, 104, 110, 107, 111, 103, 110.

    Assuming that the prices of this product follow a normal distribution with a variance of 25 and an unknown mean:

    a. What is the sample mean?

    b. Determine the 95% confidence interval for the population mean.

Thus, the sample mean is $104 and the 95% confidence interval for the population mean is ($101.55, $106.45).
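The working for this problem is not shown in the text, so the following is a sketch of one way to reproduce the result. It assumes a z-based interval, which applies because the population variance (25) is given as known.

```python
import math
from statistics import NormalDist, mean

prices = [95, 108, 97, 112, 99, 106, 105, 100,
          99, 98, 104, 110, 107, 111, 103, 110]

sigma = math.sqrt(25)            # known population standard deviation
n = len(prices)                  # 16 stores
x_bar = mean(prices)             # sample mean

z = NormalDist().inv_cdf(0.975)  # two-sided 95% critical value, about 1.96
margin = z * sigma / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(x_bar, round(lower, 2), round(upper, 2))
```

The standard error is 5 / √16 = 1.25, so the margin is about 1.96 × 1.25 = 2.45 on either side of the sample mean, which agrees with the interval stated above.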

  2. The average height of a random sample of 400 people from a city is 1.75 m. It is known that the heights in the population follow a normal distribution with a variance of 0.16.

    a. Determine the 95% confidence interval for the average height of the population.

    b. With a confidence level of 90%, what is the minimum sample size needed for the true mean height to be less than 2 cm from the sample mean?

Thus, the 95% confidence interval for the average height is (1.7108 m, 1.7892 m), and a sample of at least 1083 people is needed for the true mean height to be less than 2 cm from the sample mean at a 90% confidence level.
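Both parts can be reproduced with a short calculation. This is a sketch assuming a z-based interval with the known variance of 0.16; for the sample size, the margin of error z · σ / √n is required to be at most 0.02 m, which gives n ≥ (z · σ / 0.02)².

```python
import math
from statistics import NormalDist

sigma = math.sqrt(0.16)   # known population standard deviation, 0.4 m
n = 400
x_bar = 1.75

# Part 1: 95% confidence interval for the mean height.
z95 = NormalDist().inv_cdf(0.975)
margin = z95 * sigma / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin
print(round(lower, 4), round(upper, 4))

# Part 2: smallest n whose 90% margin of error is within 0.02 m.
z90 = NormalDist().inv_cdf(0.95)   # about 1.645
max_error = 0.02
n_min = math.ceil((z90 * sigma / max_error) ** 2)
print(n_min)
```

The sample size is rounded up because any smaller whole number of people would leave the margin of error above 2 cm.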

Conclusion

Confidence intervals quantify how precisely a sample estimates a population parameter. When inferring or predicting based on a sample, there is always some uncertainty about how well the information gleaned from the sample reflects the population being studied. Confidence intervals are thus an important tool: they give a range of plausible values for the parameter at a chosen confidence level.

References