Administration Topics

Chapter 7 Statistical Inference and Tests of Inference

Prologue

In the social sciences, biology, or any other field, we face a problem when we need to generalize about groups. Sometimes a group is too large for us to reach everyone, or studying all of it would be prohibitive in both time and money. So how can we study a smaller representation of a larger group that we cannot study in its entirety? To accomplish this fundamental aspect of research, we can take a representative, random sample from the population. We can then use probability theory to make a decision about the hypothesis that we wish to test.

Now when we talk about statistics, we can keep this in mind:

  • A Parameter is to a Population as a Statistic is to a Sample

Chapter Outline

Statistical Inference

Statistical inference is achieved by using tests of statistical significance, or techniques that help us to generalize to a larger group.


Statistical Inference…. Not a problem.

Eleanor scores 680 on the Mathematics part of the SAT. The distribution of SAT scores in a reference population is Normal, with mean 500 and standard deviation 100. Gerald takes the American College Testing (ACT) Mathematics test and scores 27. ACT scores are Normally distributed with mean 18 and standard deviation 6. Assuming that both tests measure the same kind of ability, who did better?

  1. Eleanor
  2. Gerald

Explanation

To make an informed comparison, we need to standardize both scores, because the two tests are reported on different scales.

  • Eleanor \[z_E = \frac{680 - 500}{100} = 1.8\]
  • Gerald \[z_G = \frac{27 - 18}{6} = 1.5\]

Since Eleanor has the higher standardized score, we can conclude that Eleanor did better!
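The comparison above can be reproduced in a few lines of Python. This is only a minimal sketch: the helper name `z_score` is an illustrative choice, and the numbers are the ones given in the SAT/ACT example.

```python
def z_score(x, mean, sd):
    """Standardize a raw score: how many standard deviations x lies from the mean."""
    return (x - mean) / sd


# Values taken from the SAT/ACT example in the text.
z_eleanor = z_score(680, mean=500, sd=100)
z_gerald = z_score(27, mean=18, sd=6)

# The larger standardized score indicates the better relative performance.
better = "Eleanor" if z_eleanor > z_gerald else "Gerald"
print(z_eleanor, z_gerald, better)
```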

Statistical Inference … a few more words.

Consider this type of question:

Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-nuclear demonstrations.

Which is more probable?

  1. Linda is a bank teller.
  2. Linda is a bank teller and is active in the feminist movement.

Hint

Think about the probabilities of each event, and that of both of them together.

Explanation

If you selected 2, step back and think. Suppose we denote the event that Linda is a bank teller by A and the event that she is active in the feminist movement by B. The probabilities in question can then be written as:

  • P(A)
  • \(P(A \cap B)\)

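Every outcome in which both A and B occur is also an outcome in which A occurs, so \(P(A \cap B) \le P(A)\): option 1 is at least as probable as option 2. Choosing option 2 is called the conjunction fallacy, which occurs when a specific combination of conditions is assumed to be more probable than a single general one.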
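One way to convince yourself is to count. The sketch below uses a hypothetical population of 1,000 people whose counts are invented purely for illustration (they are not real data). Whatever counts you substitute, the people who are both tellers and feminists are always a subset of the tellers.

```python
# Hypothetical counts, invented purely for illustration (not real data).
population = (
    [{"teller": True, "feminist": True}] * 5
    + [{"teller": True, "feminist": False}] * 15
    + [{"teller": False, "feminist": True}] * 300
    + [{"teller": False, "feminist": False}] * 680
)


def probability(event):
    """Share of the population for which the event (a predicate) is true."""
    return sum(1 for person in population if event(person)) / len(population)


p_a = probability(lambda p: p["teller"])                          # P(A)
p_a_and_b = probability(lambda p: p["teller"] and p["feminist"])  # P(A and B)

# The conjunction can never be more probable than either event alone.
print(p_a, p_a_and_b, p_a_and_b <= p_a)
```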

Sampling

Examples:

Sampling

Brief Intro to the Central Limit Theorem and Sampling

  • Next week we will cover this in more detail, but for now:

  • Central Limit Theorem – when many random samples of the same, sufficiently large size are drawn from the same population, the distribution of their means approximates a normal distribution, whatever the shape of the population

  • The mean of this distribution, called the sampling distribution of the mean, is equal to the mean of the population
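A small simulation can make these two claims concrete. The sketch below is an illustration under stated assumptions: the population is hypothetical (a deliberately skewed set of 50,000 exponential values), and the sample size of 50 and the 2,000 repetitions are arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical, deliberately skewed population (exponential, mean near 10).
population = [random.expovariate(1 / 10) for _ in range(50_000)]
population_mean = statistics.fmean(population)


def sample_means(pop, n, repetitions):
    """Draw `repetitions` random samples of size n and return their means."""
    return [statistics.fmean(random.sample(pop, n)) for _ in range(repetitions)]


means = sample_means(population, n=50, repetitions=2_000)

# The mean of the sample means should sit close to the population mean,
# and their spread should be close to sigma / sqrt(n).
print("population mean:      ", population_mean)
print("mean of sample means: ", statistics.fmean(means))
print("sd of sample means:   ", statistics.stdev(means))
print("sigma / sqrt(n):      ", statistics.pstdev(population) / 50 ** 0.5)
```

Plotting a histogram of `means` would show a roughly bell-shaped curve even though the population itself is strongly skewed.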

Sampling Distribution

Definition: The sampling distribution of a statistic is the probability distribution for the possible values of the statistic that results when random samples of size n are repeatedly drawn from the population.


Each value of x-bar is equally likely, with probability 1/4
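For a very small population, the sampling distribution can be listed exhaustively. The original figure is not available here, so the sketch below assumes a hypothetical population of four values and samples of size n = 3 drawn without replacement, which happens to give four possible samples, each with probability 1/4.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

population = [3, 6, 9, 12]  # hypothetical values, chosen only for illustration
n = 3

# Every possible sample of size n (without replacement) is equally likely.
samples = list(combinations(population, n))
counts = Counter(Fraction(sum(s), n) for s in samples)

# Probability of each possible value of x-bar.
sampling_distribution = {
    mean: Fraction(count, len(samples)) for mean, count in counts.items()
}

for mean, prob in sorted(sampling_distribution.items()):
    print(f"x-bar = {mean}  probability = {prob}")

# The mean of the sampling distribution equals the population mean.
mean_of_means = sum(m * p for m, p in sampling_distribution.items())
print(mean_of_means == Fraction(sum(population), len(population)))
```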

Random Samples

Types of Random Samples

Sampling can occur in two types of practical situations:

  • Observational studies: The data existed before you decided to study it. Watch out for
    • Nonresponse: Are the responses biased because only opinionated people responded?
    • Undercoverage: Are certain segments of the population systematically excluded?
    • Wording bias: The question may be too complicated or poorly worded.
  • Experimentation: The data are generated by imposing an experimental condition or treatment on the experimental units.
    • Hypothetical populations can make random sampling difficult if not impossible.
    • Samples must sometimes be chosen so that the experimenter believes they are representative of the whole population.
    • Samples must behave like random samples!

Examples

  • Stratified random sample: Divide the population into subpopulations, or strata, and select a simple random sample from each stratum.
  • Cluster sample: Divide the population into subgroups called clusters; select a simple random sample of clusters and take a census of every element in the cluster.
  • 1-in-k systematic sample: Randomly select one of the first k elements in an ordered population, and then select every k-th element thereafter.
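The three sampling plans above can be sketched with Python's standard `random` module. The function names, the population of 100 numbered units, and the way it is split into strata and clusters are all hypothetical choices made for illustration.

```python
import random


def stratified_sample(strata, per_stratum, rng):
    """Simple random sample of `per_stratum` elements from every stratum."""
    return {name: rng.sample(members, per_stratum) for name, members in strata.items()}


def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters, then take a census of each chosen cluster."""
    chosen = rng.sample(clusters, n_clusters)
    return [element for cluster in chosen for element in cluster]


def systematic_sample(ordered_population, k, rng):
    """Random start among the first k elements, then every k-th element after it."""
    start = rng.randrange(k)
    return ordered_population[start::k]


rng = random.Random(42)
units = list(range(1, 101))  # hypothetical population of 100 numbered units

strata = {"low": units[:50], "high": units[50:]}
clusters = [units[i:i + 10] for i in range(0, 100, 10)]

print(stratified_sample(strata, per_stratum=5, rng=rng))
print(cluster_sample(clusters, n_clusters=2, rng=rng))
print(systematic_sample(units, k=10, rng=rng))
```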

More Examples

Non-Random Sampling Plans: Not used for statistical inference

Normal Curve (Outside Textbook Information)

…The Bell Shaped Curve…

The normal curve is symmetric, bell shaped, and asymptotic. The inflection points fall at one standard deviation above and below the mean.


Normal Curve: Areas Under the Curve

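The normal curve always has the same proportions: about 68% of the area lies within one standard deviation of the mean, about 95% within two standard deviations, and about 99.7% within three.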
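The areas under the normal curve within one, two, and three standard deviations of the mean can be checked numerically. This sketch uses `statistics.NormalDist` from the Python standard library; the helper name `area_within` is an illustrative choice.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def area_within(k):
    """Proportion of the curve lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {area_within(k):.4f}")
```

Because every normal curve can be standardized, the same proportions hold for any mean and standard deviation.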

Hypotheses

Hypothesis tests follow a logical format …

Every z-score has the same form, \(z = \frac{X - M}{s}\), where X = what you got, M = what you expected, and s = the standard error (the expected size of random error).

Comparing Means

Work through Example Pg. 209

Upcoming Items

  • Two Sample z test (next chapter)
  • Multiple Samples (future chapters)

Test Statistic: z score

Obtain the value for the test statistic that was calculated… Page 561

Z-scores can be used to determine proportions of the curve
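As a rough illustration, the proportion of the curve below or above a given z-score can be computed instead of being read from a printed table. The sketch assumes the standard normal distribution and reuses Eleanor's z-score of 1.8 from earlier in the chapter; the function names are illustrative.

```python
from statistics import NormalDist


def proportion_below(z):
    """Area under the standard normal curve to the left of z."""
    return NormalDist().cdf(z)


def proportion_above(z):
    """Area under the standard normal curve to the right of z."""
    return 1 - NormalDist().cdf(z)


z = 1.8  # Eleanor's standardized SAT score
print(f"below z: {proportion_below(z):.4f}")
print(f"above z: {proportion_above(z):.4f}")
```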

Probabilities, Significance and Confidence

For Critical Values of Z

Significance level (probability)   1-tailed (directional)   2-tailed (non-directional)
0.05                               1.65                     1.96
0.01                               2.33                     2.58
0.001                              3.09                     3.29
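The critical values in this table can be recovered from the inverse of the standard normal cumulative distribution. The following is a sketch using `statistics.NormalDist.inv_cdf`; the function name `critical_z` is an illustrative choice. Note that the exact 1-tailed value at 0.05 is about 1.645, which the table reports as 1.65.

```python
from statistics import NormalDist


def critical_z(alpha, tails):
    """Critical z for significance level alpha in a 1- or 2-tailed test."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # A 2-tailed test splits alpha evenly between the two tails.
    return NormalDist().inv_cdf(1 - alpha / tails)


for alpha in (0.05, 0.01, 0.001):
    print(alpha, round(critical_z(alpha, 1), 3), round(critical_z(alpha, 2), 3))
```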

Example: Pg. 212

\(z_{obtained} = \frac{51.70-50}{\frac{10}{\sqrt{100}}} = \frac{1.70}{\frac{10}{10}} = \frac{1.70}{1} = 1.70\)

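Depending on whether your hypothesis is directional or non-directional, you reach different conclusions at the 0.05 level. For a 1-tailed (directional) test, 1.70 exceeds the critical value of 1.65, so the null hypothesis is rejected. For a 2-tailed (non-directional) test, 1.70 falls short of the critical value of 1.96, so the null hypothesis is retained.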
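The Pg. 212 calculation and the comparison against the 0.05 critical values can be sketched as follows. The function name `z_test` is illustrative, and the 1-tailed comparison assumes the directional hypothesis predicts a sample mean above the population mean.

```python
from math import sqrt


def z_test(sample_mean, population_mean, population_sd, n):
    """One-sample z statistic: (x-bar - mu) / (sigma / sqrt(n))."""
    standard_error = population_sd / sqrt(n)
    return (sample_mean - population_mean) / standard_error


# Numbers from the worked example in the text.
z_obtained = z_test(sample_mean=51.70, population_mean=50, population_sd=10, n=100)

# Critical values at the 0.05 level, taken from the table in the text.
one_tailed_decision = "reject null" if z_obtained > 1.65 else "retain null"
two_tailed_decision = "reject null" if abs(z_obtained) > 1.96 else "retain null"

print(round(z_obtained, 2), one_tailed_decision, two_tailed_decision)
```

The same obtained z leads to different decisions because a 2-tailed test splits the 0.05 between both tails and therefore demands a more extreme value.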

Decision Making


Types of Errors

Graphically

  • When the null is retained we expect the following to be the case


  • When the null is rejected we expect the following to be the case


Types of Errors

Type I: \(\alpha\) Error

  1. The probability of falsely rejecting a true null hypothesis.
  2. A Type I error occurs when two samples appear to be different, but are actually from the same population.
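A simulation can illustrate what the Type I error rate means in practice. In this sketch the null hypothesis is true by construction, since every sample is drawn from the hypothesized population, so every rejection is a Type I error. The population values (mean 50, standard deviation 10), the sample size, and the number of trials are hypothetical choices.

```python
import random
from math import sqrt
from statistics import fmean


def type_i_error_rate(mu, sigma, n, critical_z, trials, rng):
    """Share of 2-tailed z tests that reject a null hypothesis that is true."""
    rejections = 0
    for _ in range(trials):
        # The sample really does come from the null population.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        z = (fmean(sample) - mu) / (sigma / sqrt(n))
        if abs(z) > critical_z:
            rejections += 1
    return rejections / trials


rate = type_i_error_rate(
    mu=50, sigma=10, n=25, critical_z=1.96, trials=4000, rng=random.Random(1)
)
print(rate)  # should land near the chosen alpha of 0.05
```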


Type II: \(\beta\) Error

  1. Failure to reject a false null hypothesis.
  2. Type II errors occur when two samples appear to be from the same population, but are actually from different populations.


Equations: Population Parameters

Equations calculated FOR POPULATION MEASURES only.

Equations: Sample Statistics

Equations calculated for SAMPLE MEASURES only

New Equation: z Test of Statistical Significance

The z test equation is: \(z = \frac{\bar{x}-\mu}{\frac{\sigma}{\sqrt{n}}}\)

Key Concepts

Next Time