Introduction to Probability and Statistics

Based on the book by William Mendenhall III, Robert J. Beaver and Barbara M. Beaver

Raquel Arcos Bulos

2021-07-11

Estimating the Difference Between Two Population Means
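
As a reminder of what the code in the exercises computes: each solution uses the large-sample confidence interval for \(\mu_1 - \mu_2\). The form below assumes two independent random samples, each large enough (commonly taken as \(n \ge 30\)) for the normal approximation to hold; \(\bar{x}_i\), \(s_i\) and \(n_i\) are the sample mean, standard deviation and size of group \(i\), written `x1`, `s1`, `n1`, and so on in the R code.

\[
(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
\]

Here \(z_{\alpha/2}\) is the upper \(\alpha/2\) point of the standard normal distribution, about 1.96 for 95% confidence and about 2.58 for 99%.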

  1. Teaching Biology. An experiment was conducted to compare a teacher-developed curriculum that was standards-based, activity-oriented, and inquiry-centered with the traditional presentation using lecture, vocabulary, and memorized facts. The test results when students were tested on biology concepts, published in The American Biology Teacher, are shown in the following table.
n1 <- 365
x1 <- 18.5
s1 <- 8.03
n2 <- 298
x2 <- 16.5
s2 <- 6.96
  a. Find a 95% confidence interval for the difference in mean scores for the two teaching methods.
se <- sqrt(s1^2/n1 + s2^2/n2)  # standard error of x1 - x2
IC <- (x1 - x2) + c(-1, 1) * 1.96 * se; IC
## [1] 0.8584523 3.1415477
  b. Does the confidence interval in part a provide evidence that there is a real difference in the average scores using the two different teaching methods? Explain.
    The confidence interval does not contain zero, so it is unlikely that the population means are equal. The data therefore indicate a real difference in the average scores under the two teaching methods.
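
The calculation in part a can be wrapped in a small function so that the same steps are reused with other summary statistics. This is a minimal sketch, not part of the original solution: the name `ci_diff_means` and its `conf` argument are hypothetical, and it assumes the large-sample normal approximation applies.

```r
# Hypothetical helper: large-sample z interval for mu1 - mu2,
# computed from summary statistics only.
ci_diff_means <- function(x1, s1, n1, x2, s2, n2, conf = 0.95) {
  se <- sqrt(s1^2 / n1 + s2^2 / n2)   # standard error of x1 - x2
  z  <- qnorm(1 - (1 - conf) / 2)     # upper alpha/2 critical value
  (x1 - x2) + c(-1, 1) * z * se       # lower and upper limits
}

# Teaching Biology data from the exercise above
ci1 <- ci_diff_means(18.5, 8.03, 365, 16.5, 6.96, 298)
ci1
```

Because `qnorm` returns the unrounded critical value rather than 1.96, the limits differ from the output in part a only from the fifth decimal place on.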
  2. Are You Dieting? To compare two weight reduction diets A and B, 60 dieters were randomly selected. One group of 30 dieters was placed on diet A and the other 30 on diet B, and their weight losses were recorded over a 30-day period. The means and standard deviations of the weight-loss measurements for the two groups are shown in the table. Find a 95% confidence interval for the difference in mean weight loss for the two diets. Can you conclude that there is a difference in the average weight loss for the two diets? Why or why not?
n1 <- 30
x1 <- 21.3
s1 <- 2.6
n2 <- 30
x2 <- 13.4
s2 <- 1.9
se <- sqrt(s1^2/n1 + s2^2/n2)  # standard error of x1 - x2
IC <- (x1 - x2) + c(-1, 1) * qnorm(0.05/2, lower.tail = FALSE) * se; IC
## [1] 6.74767 9.05233

Intervals constructed in this way will contain the parameter 95% of the time in repeated sampling, so we are fairly confident that this particular interval contains the difference in population means. Because the interval does not contain zero, we conclude that there is a difference between the two diets.
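
The repeated-sampling interpretation can be checked with a short simulation. The sketch below is illustrative only and rests on assumptions that are not in the exercise: both populations are taken to be normal, and their true means and standard deviations are set equal to the sample values. The names `mu_a`, `sd_a`, `mu_b`, `sd_b`, `covers` and `coverage` are hypothetical.

```r
# Assumption: normal populations whose parameters equal the sample
# statistics from the diet exercise (chosen only for illustration).
set.seed(1)
mu_a <- 21.3; sd_a <- 2.6
mu_b <- 13.4; sd_b <- 1.9
n <- 30
true_diff <- mu_a - mu_b

# Draw 5000 pairs of samples, build the z interval for each pair,
# and record whether it contains the true difference.
covers <- replicate(5000, {
  a  <- rnorm(n, mu_a, sd_a)
  b  <- rnorm(n, mu_b, sd_b)
  se <- sqrt(var(a) / n + var(b) / n)
  ci <- (mean(a) - mean(b)) + c(-1, 1) * qnorm(0.975) * se
  ci[1] <= true_diff && true_diff <= ci[2]
})

coverage <- mean(covers)  # proportion of intervals that cover true_diff
coverage
```

The proportion should land near 0.95. It tends to fall slightly short, because the normal critical value is being used with samples of only 30 observations each.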

  3. Starting Salaries. As a group, students majoring in the engineering disciplines have the highest salary expectations, followed by those studying the computer science fields, according to a Michigan State University study. To compare the starting salaries of college graduates majoring in electrical engineering and computer science, random samples of 50 recent college graduates in each major were selected and the following information obtained.
n1 <- 50
x1 <- 62428
s1 <- 12500
n2 <- 50
x2 <- 57762
s2 <- 13330
  a. Find a point estimate for the difference in the average starting salaries of college students majoring in electrical engineering and computer science. What is the margin of error for your estimate?
espuntual <- (x1-x2); espuntual
## [1] 4666
ME <- qnorm(0.05/2,lower.tail = FALSE)*sqrt(s1^2/n1+s2^2/n2); ME
## [1] 5065.2
  b. Based upon the results in part a, do you think that there is a significant difference in the average starting salaries for electrical engineers and computer scientists? Explain.
se <- sqrt(s1^2/n1 + s2^2/n2)  # standard error of x1 - x2
IC <- (x1 - x2) + c(-1, 1) * qnorm(0.05/2, lower.tail = FALSE) * se; IC
## [1] -399.2 9731.2

Because the confidence interval contains zero, we cannot conclude that the average starting salaries of electrical engineering graduates and computer science graduates differ.
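
A 95% interval that contains zero corresponds to a two-sided large-sample z test that does not reject equal means at the 5% level. The sketch below makes that link explicit for the salary data. It is an addition to the original solution, and the names `z_stat` and `p_value` are hypothetical.

```r
# Summary statistics from the Starting Salaries exercise
n1 <- 50; x1 <- 62428; s1 <- 12500
n2 <- 50; x2 <- 57762; s2 <- 13330

se      <- sqrt(s1^2 / n1 + s2^2 / n2)              # standard error of x1 - x2
z_stat  <- (x1 - x2) / se                           # test statistic under mu1 = mu2
p_value <- 2 * pnorm(abs(z_stat), lower.tail = FALSE)  # two-sided p-value

c(z = z_stat, p = p_value)
```

The statistic is below 1.96 in absolute value and the p-value is above 0.05, which agrees with zero lying inside the 95% interval.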

  4. Hotel Costs, again. Suppose that we randomly select 50 billing statements from each of the computer databases of the Marriott, Westin, and Doubletree hotel chains. The means and standard deviations for the data are given in the table:
# Group 1: Marriott
n1 <- 50
x1 <- 150
s1 <- 17.2
# Group 2: Westin
n2 <- 50
x2 <- 165
s2 <- 22.5
# Group 3: Doubletree
n3 <- 50
x3 <- 125
s3 <- 12.8
  a. Find a 95% confidence interval for the difference in the average room rates for the Marriott and the Westin hotel chains.
se <- sqrt(s1^2/n1 + s2^2/n2)  # standard error of x1 - x2
IC <- (x1 - x2) + c(-1, 1) * qnorm(0.05/2, lower.tail = FALSE) * se; IC
## [1] -22.85009  -7.14991
  b. Find a 99% confidence interval for the difference in the average room rates for the Westin and the Doubletree hotel chains.
se <- sqrt(s2^2/n2 + s3^2/n3)  # standard error of x2 - x3 (sample sizes n2 and n3)
IC <- (x2 - x3) + c(-1, 1) * qnorm(0.01/2, lower.tail = FALSE) * se; IC
## [1] 30.57028 49.42972
  c. Do the intervals in parts a and b contain the value \(\mu_1 - \mu_2 = 0\)? Why is this of interest to the researcher?
    Neither interval contains zero. This value is of interest to the researcher because an interval that included it would mean the data are consistent with no difference between the population means being compared.

  d. Do the data indicate a difference in the average room rates between the Marriott and the Westin chains? Between the Westin and the Doubletree chains?
    Because zero is not included in either interval, we conclude that the average room rates differ between the Marriott and Westin chains, and also between the Westin and Doubletree chains.
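
With three groups, storing the summary statistics in named vectors keeps each comparison tied to the right chain and avoids mixing up subscripts. The following is a sketch under the same large-sample assumption as above; `rate_mean`, `rate_sd`, `rate_n`, `ci_pair` and `excludes_zero` are hypothetical names introduced for illustration.

```r
# Summary statistics from the Hotel Costs exercise
rate_mean <- c(Marriott = 150,  Westin = 165,  Doubletree = 125)
rate_sd   <- c(Marriott = 17.2, Westin = 22.5, Doubletree = 12.8)
rate_n    <- c(Marriott = 50,   Westin = 50,   Doubletree = 50)

# Hypothetical helper: z interval for mean(a) - mean(b),
# looking up each chain's statistics by name.
ci_pair <- function(a, b, conf) {
  se <- sqrt(rate_sd[[a]]^2 / rate_n[[a]] + rate_sd[[b]]^2 / rate_n[[b]])
  z  <- qnorm(1 - (1 - conf) / 2)
  (rate_mean[[a]] - rate_mean[[b]]) + c(-1, 1) * z * se
}

# TRUE when both limits have the same sign, so zero is outside the interval
excludes_zero <- function(ci) prod(sign(ci)) > 0

ci_mw <- ci_pair("Marriott", "Westin", 0.95)      # part a
ci_wd <- ci_pair("Westin", "Doubletree", 0.99)    # part b
c(excludes_zero(ci_mw), excludes_zero(ci_wd))
```

Both checks return `TRUE`, matching the conclusion in parts c and d that neither interval contains zero.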