I. Data

There are two main data types: qualitative and quantitative. Quantitative data sets consist of lists of numbers and are often referred to as numeric data. Qualitative data sets consist of lists of categorizations, like class rank in college. If we think of CLASS as a qualitative variable, it has four levels: freshman, sophomore, junior and senior. We tend to use single-letter codes when possible, so we might describe CLASS as having the four levels F, S, J and R (with R for senior since S is already used).

The Lady Tasting Tea example was an experiment with qualitative data. The outcomes of the taste test were 8 reports of either success or failure. For example, the data from Muriel’s attempts might have been \[\{S,F,S,S,S,F,S,S\}\] were she to have successfully identified 6 of the 8 cups of tea.

Initializing RStudio

The data set we will use primarily is Data3350, which was produced in 2015 during an undergraduate research project about personality and humor. The VarsData3350 PDF file has descriptions of each variable in the Data3350 file. Both are available for download in D2L. Be sure to put the Data3350 file in your R folder in Documents, and make sure your working directory is set the same way (Session menu). The code block below uses the library function to load the Mosaic and readxl packages, then imports the two data frames used in this module: Data3350 and Dolphin.

library(mosaic)
library(readxl)
Dolphin = read_excel("Dolphin.xlsx")
Data3350 = read_excel("Data3350.xlsx")

II. Proportion Tests

When working with qualitative or category data, we can’t calculate means and standard deviations. Instead, we tally up the counts of each outcome and calculate percentages or proportions.

We have two main proportion tests, one based on the \(z\) distribution, one based on the \(\chi^2\) distribution. The \(z\)-proportion test can only handle one or two samples. We use \(\chi^2\) for category variables with more than two levels. Because \(\chi^2\) is a more robust statistic, we sometimes use it for small sample experiments in the two sample case.

Case Study: Swimming with dolphins

Let’s discuss an experiment that produced category data. A group of 30 patients experiencing chronic depression were invited to a Caribbean island to take part in therapy. The researchers randomly split the group in two. Half of them received clinical treatment for their depression only (along with a huge dose of beach therapy, one presumes). The other half received the same clinical treatment (and beach therapy) plus they went swimming with dolphins.

Results

Let’s scan the data frame which summarizes the results of the experiment.

Dolphin

Setting up the Hypothesis Test

The researchers want to know whether there is enough evidence to show that the dolphin therapy is significantly better than the clinical and beach therapy option. What null hypothesis will test their claim?

\[H_0 : \text{ Results are independent of treatment}\] How could we test it? We can randomly assign the outcomes of Improved or Did Not to the dolphin group. We then check to see the percentage of times that 10 or more successes land in the randomized dolphin group.

Let’s tally the data to show the researchers’ experimental results:

tally(Result ~ Treatment, data = Dolphin)
          Treatment
Result     Control Dolphins
  Did Not       12        5
  Improved       3       10

Mosaic’s Randomization Options

  1. Shuffle. Permutes the values in the sample data.
  2. Sample. Draws a sub-sample from the sample data
    without replacement.
  3. Resample. Draws a sub-sample from the sample data
    with replacement.
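As a rough base-R illustration of the same three ideas, the sketch below uses only `sample()` rather than Mosaic's own functions. The small vector `outcomes` is made up for illustration and is not part of any course data set.

```r
# Hypothetical toy data, not from the Dolphin study
outcomes = c("S", "F", "S", "S", "F", "S")

set.seed(1)  # only so the example is reproducible

shuffled  = sample(outcomes)                  # permute all of the values
subsample = sample(outcomes, size = 3)        # 3 values, without replacement
resampled = sample(outcomes, replace = TRUE)  # same size, with replacement

# A shuffle keeps exactly the same counts of each outcome
table(shuffled)
```

Only the resample can contain more copies of a value than the original data does, which is why shuffling and sampling without replacement are the natural choices for randomization tests like the one below.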

Randomization for Dolphins Case Study

What if we randomly shuffled the Result variable entries? We would create random treatment vs. control group samples which could be used to estimate the probability of having at least 10 “Improved” Results in the Dolphin group. Here’s a look using Mosaic’s shuffle command.

tally(shuffle(Result) ~ Treatment, data = Dolphin)
               Treatment
shuffle(Result) Control Dolphins
       Did Not        8        9
       Improved       7        6

Run the code block above several times, and you will notice how rare it is to have 10 or more “Improved” results in the Dolphins group.

To create several hundred random samples and be able to count them up, we’ll use the sample function to draw samples of the Result observations of size \(n=15\). Why? This approach simulates randomly picking 15 “Improved” and “Did Not Improve” for the dolphin group. We don’t need to create the Control group since all remaining results will end up there. The dollar sign in Dolphin$Result is an operator that chooses a specific column from a data frame, in this case, the Result.

dolph = sample(Dolphin$Result, size = 15)
tally(dolph)
X
 Did Not Improved 
       9        6 

Since tally creates a vector of results, we can just grab the number of patients who improved by referencing the 2nd vector component with the square brackets: [2].

tally(dolph)[2]
Improved 
       6 

This randomization can be repeated dozens, hundreds or even thousands of times. In this way, we can estimate the empirical probability of seeing results at least as extreme as the experimental ones if the outcomes were only random noise rather than evidence that dolphin therapy works better.

Rossman & Chance

The dolphin study is a teaching approach pioneered
by Alan Rossman and Beth Chance who have created
web apps to simulate data for their own statistics
courses. You should definitely spend some time
experimenting with their Dolphin Study App.

Below, we are combining the code snips from above into one line that will do the following:

  1. Collect random samples of size \(n=15\) from the results column.
  2. Determine how many patients in this random group “Improved”
  3. Repeat for 50 randomized groups, collecting the number that improved in each group.
  4. Assign the result to the variable randDolph so we can create a histogram from the table of results.
randDolph = do(50) * tally(sample(Dolphin$Result, size = 15))[2]
randDolph

Create a histogram to clarify the picture.

histogram(~ Improved, data = randDolph,
          width = 1,
          center = 10,
          type = "count",
          main = "Histogram: Random Improvements",
          xlab = "Improvements for Dolphin Group: 50 Samples of 15 Results")

Empirical \(p\)-values using RStudio

We’re going to need lots more samples to determine this probability. Let’s jump up to 2,000 samples and see what the probability is that at least 10 of the “Improved” outcomes appear at random in a sample of 15.

Don’t push the number of trials too high. Executing the code block below will probably take at least 10 seconds, depending upon the speed of the processor used to run it.

randDolph = do(2000) * tally(sample(Dolphin$Result, 15))[2]
histogram( ~Improved, data = randDolph, 
          width = 1,
          center = 10,
          type = "count",
          main = "Histogram: Random Improvements",
          xlab = "Improvements for Dolphin Group: 2000 Samples of 15 Results")

We would like to know how many of these two thousand random samples ended up with 10 or more “Improved” outcomes. Dividing that count by 2,000 gives the probability we’re estimating. That’s our \(p\)-value.

sum(randDolph >= 10)
[1] 12
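For readers who want to see the counting step without Mosaic, here is a minimal base-R sketch of the same simulation. It assumes the 30 outcomes can be rebuilt from the tally (13 “Improved”, 17 “Did Not”); the names `results`, `improved_counts` and `p_empirical`, the seed, and the choice of 10,000 trials are all illustrative.

```r
# Rebuild the 30 outcomes from the tally: 13 Improved, 17 Did Not
results = rep(c("Improved", "Did Not"), times = c(13, 17))

set.seed(3350)  # arbitrary seed, only for reproducibility

# For each trial, deal 15 outcomes to the dolphin group and count "Improved"
improved_counts = replicate(10000,
                            sum(sample(results, size = 15) == "Improved"))

# Proportion of trials with 10 or more improvements: the empirical p-value
p_empirical = mean(improved_counts >= 10)
p_empirical
```

Using `mean()` on the logical vector returns the proportion directly, so no separate division by the number of trials is needed.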

Rossman & Chance online app: App Setup

Figures: Dolphin App, Left Panel; Dolphin App, Right Panel

So the \(p\)-value is \[p=\frac{12 }{2000}= 0.006\]

The RStudio simulation shows that, by random chance, 10 or more improvements in the Dolphin group occurred less than 1% of the time. Since the \(p\)-value is low, our randomized simulation gives evidence against the null hypothesis.

If the improvements are not due to random chance, the researchers have evidence for the claim that the Dolphin therapy seemed to help these clinical patients.

Empirical \(p\)-values from Rossman-Chance app

You can (and should!) perform the same simulations much more efficiently on the Rossman-Chance Dolphin applet. When you do, make sure to set the randomization \(\fbox{statistic}\) to “Cell 1 Count” in the bottom-left corner of your screen (see figure). When you click the link, the right panel will not show up. To make it appear, check the “Show Shuffle Options” box. To generate a \(p\)-value, set “Count Samples” to \(\fbox{>= 10}\). (See figure showing Right Panel.)

I ran 10,000 trials in about one second on the Rossman-Chance Dolphin applet and found an empirical estimate of \[p=\frac{126}{10000} = 0.0126\] which is in the same range as the estimate from our smaller RStudio simulation. You should run both the code in RStudio as well as in the app to best understand what’s going on. The screenshot below shows the results when I used the web app for 10,000 trials.

Dolphin App
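As a cross-check on the simulated estimates, the exact probability can be sketched with the hypergeometric distribution, which is not otherwise used in this module. The assumption is that dealing 15 of the 30 outcomes (13 “Improved”, 17 “Did Not”) to the dolphin group is the same as drawing 15 items without replacement. The name `p_exact` is illustrative.

```r
# Dolphin group: 15 patients drawn from 13 "Improved" and 17 "Did Not"
# Add up P(X = k) for k = 10, 11, 12, 13 improvements in the dolphin group
p_exact = sum(dhyper(10:13, m = 13, n = 17, k = 15))
round(p_exact, 4)
```

The result is about 0.0127, which agrees closely with the 10,000-trial applet estimate.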

III. Modern Statistical Hypothesis Testing

Hypothesis Testing Steps

  1. Identify correct procedure
  2. Setup null \((H_0)\) and
    alternate \((H_a)\) hypotheses
  3. Verification: are data appropriate
    for procedure?
  4. Set \(\alpha\) (\(\alpha = .05\) is default value)
  5. Run Stats App to determine \(p\)-value
  6. Statistical Conclusion: “Reject \(H_0\)” or
    “Fail to reject \(H_0\)”
  7. Research conclusion

Let’s perform the same analysis but with the theoretical approach to determining the \(p\)-value. I will also conduct the entire process using the hypothesis testing steps shown above.

Verification Procedures.

Before we begin, I should specifically discuss Step 3: Verification. We will end up using the \(\chi^2\) distribution for our theoretical estimation of the \(p\)-value mainly because the \(z\)-proportion test would fail its verification procedure due to small sample size.

Verification Procedure: \(z\)-proportion Tests

To use the \(z\)-statistic to produce estimated \(p\)-values for the binomial distribution, we must have at least 10 successes and 10 failures in all our samples. For the Dolphin study, we have only 30 observations total, so there is no possible way to have at least 10 observations in each of the four cells in the cross-tabulation table.

tally(Result ~ Treatment, data = Dolphin)
          Treatment
Result     Control Dolphins
  Did Not       12        5
  Improved       3       10

We see that the successes count is only 3 for the Control group, and failures count is only 5 for the Treatment group.
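The check above can be sketched in a couple of lines of base R. The matrix `counts` simply re-enters the four observed cells from the tally; the name is illustrative.

```r
# Observed counts from the tally above (filled column by column)
counts = matrix(c(12, 3, 5, 10), nrow = 2,
                dimnames = list(Result = c("Did Not", "Improved"),
                                Treatment = c("Control", "Dolphins")))

# z-proportion verification: every cell needs at least 10 observations
all(counts >= 10)

# Which cells fall short of 10?
which(counts < 10, arr.ind = TRUE)
```

Because two of the four cells are below 10, `all(counts >= 10)` is FALSE and the \(z\)-proportion test fails verification.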

Verification Procedure: \(\chi^2\) Two-Way Tests

\(\chi^2\) in R

The Mosaic version of the \(\chi^2\) test is xchisq.test
which uses a Yates correction. To match graphing
calculator or by-hand results, turn off the Yates
correction by using argument \(\fbox{correct = FALSE}\).

To use the \(\chi^2\)-statistic to produce estimated \(p\)-values for the independence of two categorical variables, low cell counts are the problem children we must avoid, just like with the \(z\)-proportion tests. However, the \(\chi^2\) statistic is more robust (less breakable) than \(z\).

For all \(\chi^2\) statistical testing, we require that there be no more than 20% low Expected cell counts where a “low cell count” is defined to be strictly less than 5.

xchisq.test(Result ~ Treatment, data = Dolphin, correct = FALSE)

    Pearson's Chi-squared test

data:  x
X-squared = 6.6516, df = 1, p-value = 0.009907

   12        5   
 (8.50)   (8.50) 
 [1.44]   [1.44] 
< 1.20>  <-1.20> 
   
    3       10   
 (6.50)   (6.50) 
 [1.88]   [1.88] 
<-1.37>  < 1.37> 
   
key:
    observed
    (expected)
    [contribution to X-squared]
    <Pearson residual>

We ignore the \(p\)-value and other output, initially. For the verification procedure, we see that all the expected cell counts are at least 5, so these data are appropriate for \(\chi^2\) procedures.

How is the Expected Matrix calculated?

The null hypothesis for the \(\chi^2\) Two Way Test is that the two category variables are independent of one another. In this case, we’re saying that the results would be the same in both the Dolphin group and the Control group. Since 13 out of 30 study participants improved, we have an overall Improved Rate of \(\frac{13}{30} \approx 0.433\). Applying the \(43.3\%\) success rate to the 15 participants in the Dolphin group, we expect \(0.4\overline{3} \times 15 = 6.5\) of them to Improve.
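The same arithmetic extends to every cell: each expected count is its row total times its column total, divided by the grand total. The base-R sketch below applies that rule to the Dolphin table and then rebuilds the uncorrected \(\chi^2\) statistic by hand. The names `observed`, `expected` and `x2` are illustrative.

```r
# Observed counts from the tally (filled column by column)
observed = matrix(c(12, 3, 5, 10), nrow = 2,
                  dimnames = list(Result = c("Did Not", "Improved"),
                                  Treatment = c("Control", "Dolphins")))

# Expected count = (row total * column total) / grand total
expected = outer(rowSums(observed), colSums(observed)) / sum(observed)
expected

# Pearson chi-squared statistic without the Yates correction
x2 = sum((observed - expected)^2 / expected)
x2

# Share of low expected cells (the rule allows at most 20%)
mean(expected < 5)
```

The expected counts come out to 8.5 and 6.5, matching the values in parentheses in the xchisq.test output, and the statistic matches the reported X-squared of 6.6516.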

Hypothesis Testing: Dolphin Study

While different textbooks and teachers may chunk the process differently, a scientific report of statistical conclusions requires all of the elements shown below. In scientific presentation of results, we tend to use paragraph, not bullet-point, format. The bullet points below are simply there to highlight each required portion of the statistical hypothesis test.

1. Procedure

For the study design, we are comparing the proportion of successes in two samples to one another, and generally we would use a 2-sample \(z\)-proportion test for this purpose. However, as shown above, the sample size is too small to expect accurate \(p\)-values from the \(z\)-proportion approach, so we will use a \(\chi^2\) Two Way Test, sometimes called a \(\chi^2\) Test of Independence.

2. Hypothesis

Our null hypothesis is that the results of the experimental treatment are completely independent of which group the participants were assigned to. We use the symbols \(H_0\) to indicate the null and \(H_a\) to indicate the alternate hypothesis.

\[\begin{align*} H_0 &: \text{RESULT is independent of TREATMENT}\\ H_a &: \text{RESULT is dependent upon TREATMENT} \end{align*}\]

In statistics, we often indicate variables by using words or abbreviations in uppercase letters.

3. Verification

As shown above, all four expected cell counts are at least 6.5, so we have 0% low expected cells. The data are appropriate for using \(\chi^2\) procedures.

4. Alpha

We will discuss the nuances of the level of significance later in the course. For now, we will use the default setting of \(\alpha = 0.05\).

5. Run Stats App

We can use a graphing calculator or statistics software like JMP, SPSS or RStudio to run this procedure. We can even be the app ourselves and run the procedure by-hand using statistical tables.

xchisq.test(Result ~ Treatment, data = Dolphin, correct = FALSE)

    Pearson's Chi-squared test

data:  x
X-squared = 6.6516, df = 1, p-value = 0.009907

   12        5   
 (8.50)   (8.50) 
 [1.44]   [1.44] 
< 1.20>  <-1.20> 
   
    3       10   
 (6.50)   (6.50) 
 [1.88]   [1.88] 
<-1.37>  < 1.37> 
   
key:
    observed
    (expected)
    [contribution to X-squared]
    <Pearson residual>

The calculated \(\chi^2\) statistic was 6.6516, and the \(p\)-value was 0.009907. If the null hypothesis were true, results at least this far from the expected counts would occur only about 1% of the time.

6. Statistical Conclusion

When results this extreme would be unlikely under the null hypothesis, we have evidence that the null is likely false. Therefore, our criterion for rejecting the null hypothesis is \[\text{If}\hspace{3mm}p<\alpha\hspace{3mm} \rightarrow \hspace{3mm}\text{Reject } H_0\]

Our statistical conclusion for the Dolphin study is to reject the null because \[p\approx 0.0099<0.05 = \alpha\] For any \(p\)-value greater than or equal to \(\alpha\), we would “fail to reject \(H_0\)”. Since we’re math geeks, we use the symbol \(H_0\) and the words “null hypothesis” interchangeably.

7. Research Conclusion

The research conclusion is a statement using real-world terms that describes the results of the entire process. For this hypothesis test, consider the following statement. \[\text{Evidence suggests that RESULT depends upon TREATMENT }(p=0.0099)\text{.}\]

IV. Examples

Let’s load our big data set for this class and work out some examples. You can use “File: Import Dataset” or the code block below if the file is in your working directory.

Data3350 = read_excel("Data3350.xlsx")

Dating Example: Do you Accept?

From the “VarsData3350” PDF file we find the following description of the ACCEPTDATE variable.

Dating Question

“At a time in your life when you are not involved with anyone, a person asks you out. This person has a great personality, but you do not find this person physically attractive. Do you accept the date?”

Responses: Y = “Yes,” N = “No”

Suppose that we investigate whether this category variable is completely independent of another category variable: biological sex where “F” indicates female and “M” indicates male.

tally(AccDate ~ Sex , data = Data3350)
       Sex
AccDate  F  M
      N 36 36
      Y 60 33

A visualization of the data where proportions are shown as areas is called a mosaic plot.

mosaicplot(AccDate ~ Sex , data = Data3350, 
           color = TRUE,
           main = "Mosaic Plot: AcceptDate by Sex")

Note how the “No” row is narrower: overall more participants said “Yes.” Note the total dark gray area is much larger than the light gray: females make up almost 60% of the subjects. The shading also highlights how the females responded “Yes” almost twice as often as they responded “No” while the males split about 50-50 Yes vs. No.

While the mosaic plot does indicate a sex-based difference seems to exist, the question is whether or not that difference is statistically significant. That’s why we run a hypothesis test.

Using the \(\chi^2\) default, we would have the following output.

xchisq.test(AccDate ~ Sex , data = Data3350, correct = FALSE)

    Pearson's Chi-squared test

data:  x
X-squared = 3.5146, df = 1, p-value = 0.06083

   36       36   
(41.89)  (30.11) 
 [0.83]   [1.15] 
<-0.91>  < 1.07> 
   
   60       33   
(54.11)  (38.89) 
 [0.64]   [0.89] 
< 0.80>  <-0.94> 
   
key:
    observed
    (expected)
    [contribution to X-squared]
    <Pearson residual>

The downside of the xchisq.test function (from the Mosaic package) is its lack of 1-tailed testing options. There is a stereotype that women evaluate potential dates more holistically while men evaluate potential dates mostly on appearance. We can test the stereotype by using the following 1-tailed hypothesis test.

Hypothesis Test: Do you Accept?

Hypothesis Testing Style

This example is written up in paragraph form,
but note how the details from each of the seven
steps listed before are discussed. You can use
bullet-point format if you’d like. Most
researchers do not.

We will use a 1-tailed, 2-proportion test which, in RStudio, will mean a \(\chi^2\) Test of Independence.

Let “Yes” responses to the dating question be considered successes. Let \(p_F\) represent the percentage of successes in the female sub-population, and \(p_M\) represent the percentage of successes for the male sub-population. \[\begin{align*}H_0 &: p_F = p_M\\ \\ H_a &: p_F > p_M\end{align*}\]

For verification, note that the xchisq.test function generates the Expected matrix.

chi = xchisq.test(AccDate ~ Sex , data = Data3350, correct = FALSE)

    Pearson's Chi-squared test

data:  x
X-squared = 3.5146, df = 1, p-value = 0.06083

   36       36   
(41.89)  (30.11) 
 [0.83]   [1.15] 
<-0.91>  < 1.07> 
   
   60       33   
(54.11)  (38.89) 
 [0.64]   [0.89] 
< 0.80>  <-0.94> 
   
key:
    observed
    (expected)
    [contribution to X-squared]
    <Pearson residual>

The key at the bottom indicates the values in parentheses are the cells of the expected matrix. All four expected cell counts are far greater than 5, so these data are appropriate for \(\chi^2\) procedures. Remember, even though we would run this as a 2-proportion \(z\)-test on the graphing calculator, RStudio always defaults to \(\chi^2\) for proportion tests, even the 1-proportion variety.

One-tailed proportion testing in RStudio

With the default level of significance, we want to run a 1-tailed test using a “less than” alternative hypothesis. Why “less than”? Because “Females” and “No” were listed first for SEX and ACCEPT DATE respectively, so R interprets the alternative hypothesis as comparing the percentage of “No” responses for females (left-hand side) vs. the percentage of “No” responses for males (right-hand side). Confusing? Don’t worry, I had to figure it out by trial and error, too.

prop.test(AccDate ~ Sex , data = Data3350,
          alternative = "less"
          , correct = FALSE
          )

    2-sample test for equality of proportions without continuity correction

data:  tally(AccDate ~ Sex)
X-squared = 3.5146, df = 1, p-value = 0.03041
alternative hypothesis: less
95 percent confidence interval:
 -1.00000000 -0.01871767
sample estimates:
   prop 1    prop 2 
0.3750000 0.5217391 

Because \[p = 0.0304 < .05 =\alpha\] we reject the null. The evidence from this proportion test suggests that the pattern of Yes/No responses to the “Accept Date” question depends upon sex (\(p = .0304\)).
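To make the link between the two-tailed \(\chi^2\) output and the one-tailed result concrete, the sketch below re-runs the test from the summary counts alone. It calls base R's `stats::prop.test` directly rather than the Mosaic formula interface, and the names `no_counts` and `group_n` are illustrative.

```r
# "No" counts and group sizes from the tally: females 36 of 96, males 36 of 69
no_counts = c(F = 36, M = 36)
group_n   = c(F = 96, M = 69)

two_sided = stats::prop.test(no_counts, group_n, correct = FALSE)
one_sided = stats::prop.test(no_counts, group_n, alternative = "less",
                             correct = FALSE)

# When the sample difference points in the hypothesized direction,
# the one-tailed p-value is half of the two-tailed p-value
c(two_sided = two_sided$p.value, one_sided = one_sided$p.value)
```

The two-tailed value matches the 0.06083 reported by xchisq.test, and halving it gives the one-tailed 0.03041.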

Example: Lady Tasting Tea

Suppose that Dr. Muriel Bristol had identified 8 of 8 cups of tea correctly. How can we quickly generate a theoretical probability in RStudio? The proportion statistic is \[\hat{p} = \frac{x}{n}\] where \(x\) is the number of successes out of \(n\) total observations. We can use the function \(\fbox{prop.test}\) and specify \(x\) and \(n\). We are testing the hypothesis \[\begin{align*}H_0 &: \text{prop} = 0.5\\H_a &: \text{prop} > 0.5 \end{align*}\] which is why we include the \(\fbox{alternative = greater}\) option in the function call.

prop.test(x = 8, n = 8, p = .5,
          alternative = "greater",
          correct = FALSE)
Chi-squared approximation may be incorrect

    1-sample proportions test without continuity correction

data:  8 out of 8
X-squared = 8, df = 1, p-value = 0.002339
alternative hypothesis: true p is greater than 0.5
95 percent confidence interval:
 0.7472764 1.0000000
sample estimates:
p 
1 

Notice that the output includes a warning message, which prints in red type at the top. Why? Well, the expected counts are only 4 successes and 4 failures, so the sample size is too small. The warning is R’s message to double check the verification.

Suppose Bristol tasted more tea, and that she identified 12 of 14 successfully: \[\hat p =\frac{12}{14}\]

For verification of a 1-proportion test, we use the proportion \(p_0\) from the hypothesis along with the sample size \(n\) to generate the Expected cell counts: \(n*p_0 = 14 * 0.5 = 7\) for the successes, and \(n*(1-p_0) = 14 * 0.5 = 7\) for the failures. The test now runs without any warnings because the expected cell counts are all greater than 5.
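The expected-count check for both tea scenarios can be sketched with a small helper. The function `expected_counts` is hypothetical (it is not part of Mosaic or base R). The last lines add an exact binomial probability for the 8-of-8 case, assuming each cup is an independent guess with success probability 0.5, as in the null hypothesis.

```r
# Hypothetical helper: expected successes and failures under H0
expected_counts = function(n, p0) {
  c(successes = n * p0, failures = n * (1 - p0))
}

expected_counts(8, 0.5)    # both below 5, so R warns
expected_counts(14, 0.5)   # both above 5, so no warning

# Exact binomial probability of 8 correct out of 8 under pure guessing:
# P(X > 7) = P(X = 8) = 0.5^8
p_exact_8 = pbinom(7, size = 8, prob = 0.5, lower.tail = FALSE)
p_exact_8
```

When the sample is too small for the \(\chi^2\) approximation, an exact calculation like this one is a reasonable fallback.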

prop.test(x = 12, n = 14, p = 0.5,
          alternative = "greater",
          correct = FALSE)

    1-sample proportions test without continuity correction

data:  12 out of 14
X-squared = 7.1429, df = 1, p-value = 0.003763
alternative hypothesis: true p is greater than 0.5
95 percent confidence interval:
 0.6470626 1.0000000
sample estimates:
        p 
0.8571429 

The \(p\)-value is 0.0038, which is far less than \(\alpha = 0.05\), so we reject the null (i.e., that she is guessing at random). Our research conclusion is that we have strong evidence that Bristol can actually taste the difference in the tea.

V. Proportion Testing for Qualitative Variables with more than 2 Levels

Consider our Data3350 data frame. We have two straightforward variables.

  1. SitClass is seating preference in class: front, middle or back.
  2. VarsAth asks whether one is a varsity athlete: response is Yes/No.

Let’s use a \(\chi^2\) Test of Independence with the following hypothesis:

\[\begin{align*}H_0 &: \text{Class seating preference is independent of varsity athlete status}\\H_a &: \text{Class seating preference depends upon varsity athlete status} \end{align*}\]

Let’s present the data using \(\fbox{tally}\).

tally(SitClass ~ VarsAth, data = Data3350)
        VarsAth
SitClass  N  Y
       B 27  8
       F 56  2
       M 65  7

While not a necessary hypothesis testing step, the mosaic plot tells a nice visual story and should be included in any proportion testing.

mosaicplot(SitClass ~ VarsAth, data = Data3350,
           color = TRUE,
           main = "Class seating preference vs. Varsity Athlete status")

Notice the strong preference for seating in the back by varsity athletes, perhaps a function of many athletes being taller than average. The areas are more difficult to compare between front and middle for non-athletes, but it’s clear non-athletes prefer seats in the back far less than either of the other options.

We will need to be careful about verification as there are several low cell counts in the observed data. Let’s run the procedure, then hit the pause button while we check on the Expected cell counts.

xchisq.test(SitClass ~ VarsAth, data = Data3350,
            correct = FALSE)
Chi-squared approximation may be incorrect

    Pearson's Chi-squared test

data:  x
X-squared = 8.9442, df = 2, p-value = 0.01142

   27        8   
(31.39)  ( 3.61) 
[0.6150] [5.3540]
<-0.784> < 2.314>
   
   56        2   
(52.02)  ( 5.98) 
[0.3038] [2.6451]
< 0.551> <-1.626>
   
   65        7   
(64.58)  ( 7.42) 
[0.0027] [0.0236]
< 0.052> <-0.154>
   
key:
    observed
    (expected)
    [contribution to X-squared]
    <Pearson residual>

R prints a red warning line at the top of the output because it flags any expected cell count less than 5. Our requirement, however, is that no more than 20% of the Expected cell counts be low. We have only 1 of 6 low expected cells, about 17%, so our verification procedure is slightly more liberal than R’s.
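The 20% rule can be checked directly from the expected matrix. The sketch below re-enters the observed counts from the tally and uses base R's `chisq.test` (not the Mosaic wrapper). The names `seating`, `fit` and `low` are illustrative.

```r
# Observed counts from the tally: rows B, F, M; columns N, Y
seating = matrix(c(27, 56, 65, 8, 2, 7), nrow = 3,
                 dimnames = list(SitClass = c("B", "F", "M"),
                                 VarsAth = c("N", "Y")))

# suppressWarnings() hides the "approximation may be incorrect" warning
fit = suppressWarnings(chisq.test(seating, correct = FALSE))

low = fit$expected < 5
sum(low)    # number of low expected cells
mean(low)   # share of low expected cells; the course rule allows up to 0.20
```

Here `mean(low)` is 1/6, so the data pass the course's verification rule even though R prints its warning.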

Since the data do pass verification and appear to be appropriate for \(\chi^2\) procedures, we find that \[p = 0.01142 < 0.05 =\alpha\] (again using our default level of significance). Thus, we reject the null. There is strong evidence that class seating preferences depend upon whether one is a varsity athlete.

VI. Supplementary Material: 1- and 2-Proportion Tests with Summary Stats Only

We can run both 1-proportion \(z\)-tests and 2-proportion \(z\)-tests even if we only know the summary statistics but don’t have a data frame.

Example: 1-Proportion \(z\)-test

A recent study in Pakistan estimated that approximately 11% of all undergraduate students in that country had Type A personalities. A recent survey at the University of North Georgia showed a sample of 155 North Georgia students contained 22 students with Type A personalities. Is there evidence at the 0.05 level that a higher proportion of students at UNG have Type A personalities?

For this 1-proportion \(z\)-test, our hypotheses are: \[\begin{align*}H_0 &: \text{prop} = .11\\H_a &: \text{prop} > .11\end{align*}\] The data pass verification as \(np_0 = 155(.11) = 17.05\geq 10\), and clearly \(n(1-p_0)\) is even larger. To run the test, we can specify \(x,n,p_0\) as parameters along with the alternative hypothesis. Note that we can type out the whole word “greater” in the alternative hypothesis parameter, or we can abbreviate it as “g.” We can also abbreviate TRUE and FALSE as shown.

prop.test( x = 22 , n = 155 , p = .11 , 
           alternative = "g" ,
           correct = F)

    1-sample proportions test without continuity correction

data:  22 out of 155
X-squared = 1.6147, df = 1, p-value = 0.1019
alternative hypothesis: true p is greater than 0.11
95 percent confidence interval:
 0.1019576 1.0000000
sample estimates:
        p 
0.1419355 

We fail to reject the null and have no evidence at the 0.05 level that the Type A personality rates are significantly higher (or different) at North Georgia than at universities in Pakistan.
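To show where the reported X-squared and \(p\)-value come from, here is a by-hand sketch of the 1-proportion \(z\) statistic, using the standard error under the null hypothesis. The variable names are illustrative.

```r
x = 22; n = 155; p0 = 0.11

p_hat = x / n                               # sample proportion
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)  # z statistic under H0
p_value = pnorm(z, lower.tail = FALSE)      # upper tail for "greater"

c(z = z, z_squared = z^2, p_value = p_value)
```

The squared \(z\) statistic reproduces the X-squared value of 1.6147 in the output above, which is why R can report a \(\chi^2\) statistic for what we call a \(z\)-test.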

Example: 2-Proportion \(z\)-test

Are young men more likely to experience ADD/ADHD symptoms than young women? A recent survey found 22 of 78 females had experienced significant ADD/ADHD symptoms while 12 of 40 males had experienced them. Test for a difference at the .1 level.

For this 2-proportion \(z\)-test, our hypotheses are: \[\begin{align*}H_0 &: p_F = p_M\\H_a &: p_F < p_M\end{align*}\] The data pass verification because the number of successes are 22 and 12 in the female and male samples respectively, and both samples have more failures than successes. Thus, the number of successes and failures in both samples is greater than or equal to 10, as required.

We use the concatenate function c to provide the numerators and denominators of two different \(\hat p\) fractions, one for females and one for males.

prop.test( x = c(22,12), n = c(78,40), 
           alternative = "less",
           correct = FALSE)

    2-sample test for equality of proportions without continuity correction

data:  c out of c22 out of 7812 out of 40
X-squared = 0.041528, df = 1, p-value = 0.4193
alternative hypothesis: less
95 percent confidence interval:
 -1.0000000  0.1277498
sample estimates:
   prop 1    prop 2 
0.2820513 0.3000000 

We fail to reject the null and have no evidence for a different rate of ADD/ADHD symptoms based on biological sex.
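The 2-proportion statistic can be sketched by hand as well, using the pooled proportion for the standard error. The variable names are illustrative.

```r
x = c(F = 22, M = 12)   # successes
n = c(F = 78, M = 40)   # sample sizes

p_hat  = x / n                 # sample proportions
p_pool = sum(x) / sum(n)       # pooled proportion under H0
se     = sqrt(p_pool * (1 - p_pool) * sum(1 / n))

z = unname((p_hat["F"] - p_hat["M"]) / se)
p_value = pnorm(z)             # lower tail for "less"

c(z = z, z_squared = z^2, p_value = p_value)
```

Again, squaring \(z\) recovers the X-squared value reported by prop.test (0.041528), and the lower-tail probability matches the \(p\)-value of 0.4193.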

Example: \(\chi^2\) Test of Independence

Does level of smoking affect level of exercise? Test at the 0.05 level. The smokers were categorized as heavy smokers, occasional smokers or had never smoked. They were also categorized according to level of exercise: frequent, some, and none.

Frequency of Exercise vs. Smoking Level \[\begin{array}{l|ccc}&\textbf{Frequent} &\textbf{Some} & \textbf{None}\\ \hline \textbf{Heavy} & 7 & 3 & 1\\ \textbf{Occasional} & 21 & 11 & 4\\ \textbf{Never} & 87 & 84 & 18\\ \end{array}\]

We need to create the observed matrix which I will call tab. We can then run the Chi-squared test.

tab = (matrix(c(7, 3, 1, 21, 11, 4, 87 , 84 , 18),nrow=3))
xchisq.test(tab)
Chi-squared approximation may be incorrect

    Pearson's Chi-squared test

data:  x
X-squared = 3.5177, df = 4, p-value = 0.4752

  7.00    21.00    87.00  
( 5.36)  (17.54)  (92.10) 
[0.5017] [0.6815] [0.2821]
< 0.708> < 0.826> <-0.531>
     
  3.00    11.00    84.00  
( 4.57)  (14.95)  (78.48) 
[0.5381] [1.0433] [0.3878]
<-0.734> <-1.021> < 0.623>
     
  1.00     4.00    18.00  
( 1.07)  ( 3.51)  (18.42) 
[0.0048] [0.0689] [0.0096]
<-0.070> < 0.262> <-0.098>
     
key:
    observed
    (expected)
    [contribution to X-squared]
    <Pearson residual>

Note that three of the nine expected cells are less than 5, so the data are not appropriate for Chi-Squared procedures since 33% of the expected cells have low cell counts. We cannot use these data for Chi-Squared – do not proceed.

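However, we can still illustrate how to create a mosaic plot from summary statistics. I am also creating column names and row names so the mosaic plot is easier to interpret. Because matrix fills column by column, the rows of tab are the exercise levels and the columns are the smoking levels.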

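# matrix() filled tab column by column, so rows are exercise levels
# and columns are smoking levels (the transpose of the printed table)
rownames(tab) = c("Frequent","Some","None")
colnames(tab) = c("Heavy","Occasional","Never")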
mosaicplot(tab)
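Because matrix fills column by column by default, entering counts row by row needs byrow = TRUE if the matrix is meant to have the same layout as the printed table. The sketch below builds the table that way with labels attached, using base R's `chisq.test`. The name `smoke_by_exercise` is illustrative.

```r
# Rows are smoking levels, columns are exercise levels, as in the table
smoke_by_exercise = matrix(c( 7,  3,  1,
                             21, 11,  4,
                             87, 84, 18),
                           nrow = 3, byrow = TRUE,
                           dimnames = list(
                             Smoking  = c("Heavy", "Occasional", "Never"),
                             Exercise = c("Frequent", "Some", "None")))

# The chi-squared statistic is the same for a table and its transpose
fit   = suppressWarnings(chisq.test(smoke_by_exercise))
fit_t = suppressWarnings(chisq.test(t(smoke_by_exercise)))
c(fit$statistic, fit_t$statistic)

# Share of low expected cells (above 0.20 means do not proceed)
mean(fit$expected < 5)
```

The orientation therefore does not affect the test result, but it does decide which labels belong on the rows and which on the columns.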

Special Note: For tests, quizzes and projects in this course, we will not be using the Agresti Plus 4 Method, nor will we use the Yates correction for \(\chi^2\) or the continuity correction in prop.test.

VII. Exercises

  1. Using the SitClass variable from the Data3350 data frame, test the hypothesis that classroom seating preference depends upon membership in the corps of cadets (variable Corps) at the \(\alpha = 0.05\) level. Include a mosaic plot and describe its relationship to your \(p\)-value and conclusions.

  2. Using the SitClass variable from the Data3350 data frame, test the hypothesis that classroom seating preference depends upon biological Sex at the \(\alpha = 0.05\) level. Include a mosaic plot and describe its relationship to your \(p\)-value and conclusions.

  3. Using the AccDate variable from the Data3350 data frame, test the hypothesis that the Yes responses to the dating question are more likely for those in social Greek fraternities and sororities at the \(\alpha = 0.1\) level. Include a mosaic plot with a description of its relationship to your \(p\)-value and conclusions.

  4. A strong sense of Coping Humor indicates a person who uses humor to relieve stress and deal with the struggles of life. Test at the .05 level whether more than 10% of North Georgia students exhibit strong Coping Humor. A recent study used a criterion of scoring 30 or higher on the Coping Humor Scale to identify strong Coping Humor, and found that 21 of 175 North Georgia students did so.

  5. A recent study asked UNG students whether they frequently texted at work about things unrelated to work. For younger students, 25 of 103 said they frequently did so, while 15 of 46 students who were 21 or older reported doing so. Test for an age difference for Texting Frequently at Work at the .05 level.

