Chapter 8 Slides

Probability Distributions and One-Sample \(z\) and \(t\) Tests

Date: October 6, 2014

Author, Sherri Ann Verdugo, M.S.

Instructor, CSUF Sociology Department

Administration

Feedback Packets (previous graded assignments and homework)

Prologue & Introduction

This chapter continues chapter 7 and focuses on what we are actually doing when we perform a \(z\) test. Understanding the underlying process often gives you more insight into the task at hand. Case in point: taking care of a pool. You don't need to understand the chemistry behind the chlorine and chemical test kits, but if you do, you have a better understanding of how to keep the chlorine at appropriate levels. Let's dive into statistics a little bit further.

Key Concepts

Key Concepts, continued

Note: in the real world this assumption requires further testing. For this class, we will use the assumption without testing it. Outside of class, however, we may find ourselves testing the normality of a distribution along a variable x.

Chapter 8 Outline

Introduction

Normal Distributions

What is normal?

Family of Gaussian distributions, with standard deviation increasing from 1 to 4

Normal Distributions, continued

Major characteristics

Reason for the term "normal"

Normal Distributions: Standard Scores

This time our equation is \(z = \frac{x-\mu}{\sigma}\), and we look up the result in the table for proportions of area under the standard normal curve.

How do we read this table?

Reading the table for proportions of area under the standard normal curve

z Score example using the table on pages 227 to 230

Sandy's IQ

Question: What proportion of people are above and below Sandy's IQ?

So how do we handle this?

Sandy's z Score

The z equation is: \(\frac{x - \mu}{\sigma}\)

From the z table in the book:

The proportion of people with IQs below Sandy is .9332

What about when z is negative? Pg. 229

What if George has a score of 80 on the IQ test? This is not a problem, because z scores can be positive (above the mean) or negative (below the mean).

Nearly 98% of all IQ scores fall above George's score.
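
The table lookups for Sandy and George can be checked numerically. This is a minimal sketch, assuming z = 1.5 for Sandy and z = -2.0 for George. Those z values are not stated on the slides; they are inferred here from the quoted proportions (.9332 and "nearly 98%").

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def proportion_below(z: float) -> float:
    """Proportion of the area under the standard normal curve below z."""
    return STANDARD_NORMAL.cdf(z)


def proportion_above(z: float) -> float:
    """Proportion of the area under the standard normal curve above z."""
    return 1.0 - STANDARD_NORMAL.cdf(z)


# ASSUMPTION: z values inferred from the proportions quoted on the slides.
sandy_z = 1.5
george_z = -2.0

print(f"Below Sandy:  {proportion_below(sandy_z):.4f}")
print(f"Above George: {proportion_above(george_z):.4f}")
```

A negative z needs no special handling here: the same function covers both sides of the mean, which is what the printed table does with its separate page for negative values.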

One Sample z Test for Statistical Significance: One Tail

Chapter 7 Equation for z

One Sample z Test for Statistical Significance: Two Tail

Working out the sampling distribution of the mean

test         x1    x2    x3    x4    x5    x6    x7    x8    x9    x10
1            5     5     5     5     5     5     4     4     4     3
2            4     4     4     3     3     2     3     3     2     2
3            3     2     1     2     1     1     2     1     1     1
\(\Sigma\)   12    11    10    10    9     8     9     8     7     6
\(\bar{x}\)  4.0   3.7   3.3   3.3   3.0   2.7   3.0   2.7   2.3   2.0

\(\bar{x}\) Values and Frequencies

\(\bar{x}\) 4.0 3.7 3.3 3.0 2.7 2.3 2.0
\(f\) 1 1 2 2 2 1 1
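
The frequency table can be rebuilt by listing every possible sample. This is a rough sketch, assuming the population is the five scores 1 through 5 and that each sample holds three different scores; the slides do not state this outright, but it is what the frequencies imply.

```python
from collections import Counter
from itertools import combinations
from math import comb

# ASSUMPTION: population of five scores, samples of three distinct scores
# (sampling without replacement), inferred from the table of sample means.
population = [5, 4, 3, 2, 1]
sample_size = 3

samples = list(combinations(population, sample_size))
assert len(samples) == comb(len(population), sample_size)  # 5 choose 3

# Mean of each sample, rounded to one decimal as on the slide.
sample_means = [round(sum(s) / sample_size, 1) for s in samples]

# Frequency of each distinct sample mean, largest mean first.
frequencies = Counter(sample_means)
for value in sorted(frequencies, reverse=True):
    print(f"{value:.1f}  {frequencies[value]}")

# The mean of all the sample means equals the population mean.
mean_of_means = sum(sum(s) for s in samples) / (sample_size * len(samples))
population_mean = sum(population) / len(population)
print(f"mean of sample means = {mean_of_means}, mu = {population_mean}")
```

The last line previews the Central Limit Theorem: the sample means pile up around the population mean.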

The FAMOUS & FABULOUS Central Limit Theorem (CLT)

CLT: If repeated random samples of size n are drawn from a population that is normally distributed along some variable x, having a mean \(\mu\) and a standard deviation \(\sigma\), then the sampling distribution of all theoretically possible sample means will be a normal distribution having a mean \(\mu\) and a standard deviation of \(\frac{\sigma}{\sqrt{n}}\).

  1. Sampling distribution of sample means will be a NORMAL DISTRIBUTION
  2. The mean of the sampling distribution of sample means:
    • the mean of all sample means (designated \(\bar{x}\)) will be EQUAL TO \(\mu\)
    • \(\mu\) is the mean from the population that the sample is drawn from.
  3. The standard deviation of the sampling distribution of sample means
    • will be EQUAL TO \(\frac{\sigma}{\sqrt{n}}\)
    • new concept: Standard Error of the Mean, or SEM

Standard Error of the Mean

The standard deviation of the sampling distribution, designated with the symbol \(\sigma_{\bar{x}}\), where \(\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\).
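
One way to see the CLT and the standard error at work is to simulate it. This is a sketch only: it borrows \(\mu=70\), \(\sigma=20\), and \(n=100\) from the exam-score example used later in these slides, and the number of repeated samples (2,000) and the random seed are arbitrary choices made here.

```python
import random
from statistics import pstdev

# Population parameters and sample size from the exam-score example.
MU, SIGMA, N = 70.0, 20.0, 100

# ASSUMPTION: 2,000 repeated samples and a fixed seed, chosen only so the
# sketch runs quickly and gives the same numbers every time.
NUM_SAMPLES = 2_000
rng = random.Random(0)

# Draw NUM_SAMPLES random samples of size N and keep each sample's mean.
sample_means = [
    sum(rng.gauss(MU, SIGMA) for _ in range(N)) / N
    for _ in range(NUM_SAMPLES)
]

mean_of_means = sum(sample_means) / NUM_SAMPLES
observed_sem = pstdev(sample_means)
theoretical_sem = SIGMA / N ** 0.5  # sigma / sqrt(n)

print(f"mean of sample means:     {mean_of_means:.2f} (mu = {MU})")
print(f"st. dev. of sample means: {observed_sem:.2f} "
      f"(sigma / sqrt(n) = {theoretical_sem})")
```

The simulated values will not match the theory exactly, because 2,000 samples is far short of "all theoretically possible" samples, but they should land close.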

CLT graphically

Histogram with \(n=100\), \(\mu=70\), and \(\sigma=20\)

From the textbook, the actual appearance of a sampling distribution

% of Possible Sample Means    Range from Mean (in standard errors)    Range in Exam Scores
68.27%                        \(\mu \pm 1 \sigma_{\bar{x}}\)          68-72
95.45%                        \(\mu \pm 2 \sigma_{\bar{x}}\)          66-74
99.73%                        \(\mu \pm 3 \sigma_{\bar{x}}\)          64-76

Area under the normal curve = percentage of all possible sample means
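
The ranges in the table follow directly from the standard error. A short sketch using the values from the histogram slide (\(\mu=70\), \(\sigma=20\), \(n=100\)); the variable names are only illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 70.0, 20.0, 100  # values from the histogram slide

sem = sigma / sqrt(n)  # standard error of the mean
print(f"SEM = {sem}")

# Range of sample means within k standard errors of mu, for k = 1, 2, 3.
ranges = {k: (mu - k * sem, mu + k * sem) for k in (1, 2, 3)}

# Share of the normal curve lying within k standard errors of the mean.
coverage = {k: NormalDist().cdf(k) - NormalDist().cdf(-k) for k in (1, 2, 3)}

for k in (1, 2, 3):
    low, high = ranges[k]
    print(f"mu ± {k} SEM: {low:.0f}-{high:.0f}  ({coverage[k]:.2%})")
```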

CLT at work....an example through a story

Scenario: A competency exam is given to a sample of 100 ninth grade students enrolled in a six week course to prepare for the exam. The sample mean is 73, the population mean is 70, and the population standard deviation is 20. Set alpha to .05 and use a one-tailed (directional) hypothesis.

Hypothesis

Equation for z using chapter 7

\(z = \frac{\bar{x}-\mu}{\frac{\sigma}{\sqrt{n}}}\)

Plugging in from our story problem and solving

\(z = \frac{73-70}{\frac{20}{\sqrt{100}}} = \frac{3}{\frac{20}{10}} = \frac{3}{2} = 1.50\)

Significant or not...

The critical value of z when \(\alpha=0.05\) (one-tailed) is 1.65, and we reject \(H_0\) if our z is greater. Drum roll....the result is: \(1.50 < 1.65\), so we fail to reject \(H_0\). The result is not statistically significant.
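
The whole story problem fits in a few lines of code. This is a minimal sketch of the one-tailed test; the function name is made up for illustration, and the critical value is computed from the normal distribution rather than read from the table, so it comes out near 1.645 instead of the rounded 1.65.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(sample_mean: float, mu: float, sigma: float, n: int) -> float:
    """z = (x_bar - mu) / (sigma / sqrt(n))."""
    return (sample_mean - mu) / (sigma / sqrt(n))


# Values from the competency exam scenario.
z = one_sample_z(sample_mean=73, mu=70, sigma=20, n=100)

alpha = 0.05
# Upper-tail critical value for a one-tailed test; the table rounds it to 1.65.
z_critical = NormalDist().inv_cdf(1 - alpha)

reject_h0 = z > z_critical
print(f"z = {z:.2f}, critical z = {z_critical:.3f}, reject H0: {reject_h0}")
```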

Normality Assumption

The assumption that the population being studied is normally distributed along variable x.

Law of Large Numbers

A law that states that if the size of the sample, \(n\), is sufficiently large (no less than 30, preferably no less than 50), then the Central Limit Theorem will apply even if the population is not normally distributed along variable x.

Rules for N

If                     Then
\(n \geq 100\)         It is always safe to relax the normality assumption (for this class)
\(50 \leq n < 100\)    It is ALMOST always safe to relax the normality assumption
\(30 \leq n < 50\)     It is PROBABLY safe to relax the normality assumption
\(n < 30\)             It is PROBABLY NOT safe to relax the normality assumption

A last little bit on Z

For \(z\) we have to know population parameters, \(\mu\) and \(\sigma\)

Let's dissect the following graph

One Sample t-test for Statistical Inference

Variance revisited

We are about to use a new version of the standard deviation equation, but first recall the original: the standard deviation is the square root of the variance, \(\sqrt{\frac{\Sigma (x-\bar{x})^2}{n}}\).
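
As a refresher, here is the recalled equation worked step by step. The three scores are made up for illustration and do not come from the textbook.

```python
from math import sqrt
from statistics import pstdev

# ASSUMPTION: three illustrative scores, not data from the textbook.
scores = [5, 4, 3]

n = len(scores)
x_bar = sum(scores) / n

# Sum of squared deviations from the mean, divided by n, then square-rooted.
variance = sum((x - x_bar) ** 2 for x in scores) / n
std_dev = sqrt(variance)

print(f"mean = {x_bar}, variance = {variance:.4f}, st. dev. = {std_dev:.4f}")

# statistics.pstdev applies the same divide-by-n formula.
assert abs(std_dev - pstdev(scores)) < 1e-12
```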

The t- test

Changes in the Sampling Distribution of t as Sample Size Decreases

Degrees of Freedom (a.k.a. "DF")

Source: http://en.wikipedia.org/wiki/Degrees_of_freedom_(statistics)

The t Table

http://www.sjsu.edu/faculty/gerstman/StatPrimer/t-table.pdf

Alternative t Formula

Z...again: A z Test for Proportions

\(z\) test for Proportions

This test is designed to determine whether a proportion observed in a sample differs significantly from the corresponding proportion in the population.

Equation: A z Test for Proportions

What the Symbols Mean

Example: A z Test for Proportions in Action

Scenario (pg. 253): This is a \(z\) Test for Proportion

In a small community, minorities make up 20% of the population. The School Board President believes that minority teachers are under-represented in a specific school system. The system currently has 100 teachers, of whom only 15 are minority faculty.

Let:

Hypothesis

\(H_0: P_s = P_p\) and \(H_1: P_s < P_p\)
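
The scenario can be run as code. This is a sketch under stated assumptions: the equation on the slide is not reproduced in this text, so the common one-sample form \(z = \frac{P_s - P_p}{\sqrt{P_p(1-P_p)/n}}\) is assumed, and \(\alpha = .05\) is assumed as well because the scenario does not state an alpha level. The function name is illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def z_for_proportion(p_sample: float, p_population: float, n: int) -> float:
    """ASSUMED form: z = (P_s - P_p) / sqrt(P_p * (1 - P_p) / n)."""
    standard_error = sqrt(p_population * (1 - p_population) / n)
    return (p_sample - p_population) / standard_error


p_s = 15 / 100  # 15 minority teachers out of 100
p_p = 0.20      # minorities are 20% of the community
z = z_for_proportion(p_s, p_p, n=100)

alpha = 0.05  # ASSUMPTION: not stated in the scenario
# H1 is P_s < P_p, so the rejection region is in the lower tail.
z_critical = NormalDist().inv_cdf(alpha)

reject_h0 = z < z_critical
print(f"z = {z:.2f}, critical z = {z_critical:.3f}, reject H0: {reject_h0}")
```

Under these assumptions the z statistic does not reach the lower-tail critical value, so the sketch fails to reject \(H_0\). Check the result against the textbook's worked answer on pg. 253.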

Example: A z Test for Proportions in Action, continued

Our equation that we are interested in is:

Decision

Interval Estimation

Interval Estimation: Confidence Intervals for Means and Proportions

Confidence Intervals for z Scores...

For 95%: \(\bar{x} \pm 1.96(\frac{\sigma}{\sqrt{n}})\)

For 99%: \(\bar{x} \pm 2.58(\frac{\sigma}{\sqrt{n}})\)
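
To see the two formulas in action, here is a small sketch that reuses the competency exam numbers (\(\bar{x}=73\), \(\sigma=20\), \(n=100\)). Reusing that example here is a choice made for illustration; the slides do not work a confidence interval for it.

```python
from math import sqrt


def z_confidence_interval(x_bar: float, sigma: float, n: int,
                          z_critical: float) -> tuple[float, float]:
    """x_bar ± z_critical * (sigma / sqrt(n))."""
    margin = z_critical * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin


# ASSUMPTION: exam-score example reused purely for illustration.
ci_95 = z_confidence_interval(x_bar=73, sigma=20, n=100, z_critical=1.96)
ci_99 = z_confidence_interval(x_bar=73, sigma=20, n=100, z_critical=2.58)

print(f"95% CI: ({ci_95[0]:.2f}, {ci_95[1]:.2f})")
print(f"99% CI: ({ci_99[0]:.2f}, {ci_99[1]:.2f})")
```

The 99% interval is wider than the 95% interval: more confidence costs precision.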

What about the t test confidence intervals???

t Test at 95% Confidence Intervals

t Test at 99% Confidence Intervals

SPSS DOES THIS FOR YOU.

Visit http://academic.udayton.edu/gregelvers/psy216/spss/ttests.htm for a tutorial outside of class.

Confidence Intervals for Proportions

Equation: Confidence Intervals for Proportions using \(z_{critical}\) at 95%

Equation: Confidence Intervals for Proportions using \(z_{critical}\) at 99%

CI for Proportions: an Example at 95%... a scenario

You are hired as a campaign manager for a local politician in a two person race. A random stratified telephone survey of 900 likely voters shows your candidate ahead of the opponent with 53% of the vote. How well does that percentage reflect what the electorate will actually do? You tell your boss that you will construct a 95% confidence interval around the .53 proportion that your candidate received in the sample.

That's a lot to comprehend....let's break it down

Given:

CI for Proportions: an Example at 95%... continued

Upper Limit

Lower Limit

CI for the politician: (.4973, .5627)
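
The limits can be recomputed in code. This is a sketch that assumes the usual form \(P_s \pm z_{critical}\sqrt{\frac{P_s(1-P_s)}{n}}\), since the equation slide is not reproduced in this text. The fourth decimal place can differ from a hand calculation depending on how the standard error is rounded.

```python
from math import sqrt


def proportion_confidence_interval(p_sample: float, n: int,
                                   z_critical: float) -> tuple[float, float]:
    """ASSUMED form: P_s ± z_critical * sqrt(P_s * (1 - P_s) / n)."""
    standard_error = sqrt(p_sample * (1 - p_sample) / n)
    margin = z_critical * standard_error
    return p_sample - margin, p_sample + margin


# Survey of 900 likely voters, .53 for the candidate, 95% confidence.
lower, upper = proportion_confidence_interval(p_sample=0.53, n=900,
                                              z_critical=1.96)
print(f"95% CI: ({lower:.4f}, {upper:.4f})")

# The interval is symmetric about the sample proportion.
assert abs((lower + upper) / 2 - 0.53) < 1e-12
```

Note where .50 falls relative to the interval: that is what decides whether the lead is safe to report.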

A FEW MORE WORDS ON PROBABILITY

Statistical testing is rooted in probability. We have to introduce some more probability concepts:

  1. P(A)
    • The probability of outcome A occurring
  2. P(A or B)
    • The probability of either outcome A or outcome B occurring
  3. P(A and B)
    • The probability of both outcomes A and B occurring jointly
  4. P(A | B) Bayes' Formula
    • Conditional probability: the probability of outcome A occurring given that outcome B has already occurred.

Conditional Probability: Addition Rules of Probability

The probability of outcome \(A\) occurring given that the outcome for \(B\) has already occurred.

When A and B cannot both occur (mutually exclusive): P(A or B) = P(A) + P(B). For heads/tails on a "fair" coin.... .50 + .50 = 1.0

When A and B can both occur: P(A or B) = P(A) + P(B) - P(A and B)

Multiplication Rules of Probability and Statistically Independent Probability

\(P(A \text{ and } B) = P(A) \times P(B \mid A)\), which reduces to \(P(A) \times P(B)\) when A and B are statistically independent.
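
The addition and multiplication rules can be verified by counting outcomes. This is a minimal sketch using one roll of a fair six-sided die; the die and the two events are made up for illustration and do not come from the slides.

```python
from fractions import Fraction

# ASSUMPTION: one roll of a fair six-sided die; events A and B are
# illustrative and do not come from the slides.
sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}  # the roll is even
B = {4, 5, 6}  # the roll is greater than 3


def prob(event: set) -> Fraction:
    """Probability of an event when every outcome is equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))


def cond_prob(event: set, given: set) -> Fraction:
    """P(event | given) = P(event and given) / P(given)."""
    return prob(event & given) / prob(given)


# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_a_or_b = prob(A) + prob(B) - prob(A & B)
assert p_a_or_b == prob(A | B)

# Multiplication rule: P(A and B) = P(A) x P(B | A)
p_a_and_b = prob(A) * cond_prob(B, A)
assert p_a_and_b == prob(A & B)

# A and B are independent only if P(A and B) = P(A) x P(B).
independent = prob(A & B) == prob(A) * prob(B)

print(f"P(A or B) = {p_a_or_b}, P(A and B) = {p_a_and_b}, "
      f"independent: {independent}")
```

These two events are not independent, so the shortcut P(A) X P(B) would give the wrong answer here; the conditional form is the one that always holds.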

Permutations and Combinations

Permutations and Combinations Examples

Permutation

Combination

Review

We will be coming up on more equations and techniques...if you are confused

COME BY MY OFFICE DURING OFFICE HOURS :)

Equations

Equations, continued

Probability Terms

Permutation and Combination Equations

Level of Significance for Critical Value of \(z\)

Level of Significance    Critical Value of \(z\) (two-tailed)
.05                      1.96
.01                      2.58
.001                     3.29
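
These critical values can be recovered from the standard normal distribution. This is a short sketch that treats them as two-tailed values, so half of alpha sits in each tail; the function name is illustrative.

```python
from statistics import NormalDist


def two_tailed_critical_z(alpha: float) -> float:
    """z that leaves alpha / 2 in each tail of the standard normal curve."""
    return NormalDist().inv_cdf(1 - alpha / 2)


critical_values = {
    alpha: two_tailed_critical_z(alpha) for alpha in (0.05, 0.01, 0.001)
}

for alpha, z in critical_values.items():
    print(f"alpha = {alpha}: critical z = {z:.2f}")
```

For a one-tailed test the whole of alpha sits in one tail, which is why the one-tailed critical value at .05 is smaller (about 1.65).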

Upcoming Items
