HW5

Question1)

a)Imagine that Verizon claims that they take 7.6 minutes to repair phone services for its customers on average. The PUC seeks to verify this claim at 99% confidence (i.e., significance α = 1%) using traditional statistical methods.

i)Visualize the distribution of Verizon’s repair times, marking the mean with a vertical line

data1 <- data.table::fread("C:/R-language/BACS/verizon.csv")
plot(density(data1$Time), col="blue", lwd=2, main="Verizon's repair time")
# Base graphics calls are not chained with "+"; abline() draws on the current plot
abline(v=mean(data1$Time))

ii)Given what the PUC wishes to test, how would you write the hypothesis? (not graded)

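Null Hypothesis (H0): The population mean repair time for Verizon’s phone services is 7.6 minutes.

Alternative Hypothesis (H1): The population mean repair time for Verizon’s phone services is not 7.6 minutes (two-tailed test).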

Since the PUC seeks to verify Verizon’s claim at 99% confidence, this implies that the level of significance (α) for the test is 1%.

iii)Estimate the population mean, and the 99% confidence interval (CI) of this estimate.

mean_time <- mean(data1$Time)
cat("the population mean of time might be", mean_time)

## the population mean of time might be 8.522009

sde_time <- sd(data1$Time) / sqrt(length(data1$Time));sde_time

## [1] 0.3600527

po_ti <- mean_time + 2.58 * sde_time
ne_ti <- mean_time - 2.58 * sde_time
cat("the 99% confidence interval (CI) of this estimate might be", ne_ti, "to", po_ti)

## the 99% confidence interval (CI) of this estimate might be 7.593073 to 9.450946
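
The interval above uses 2.58, the normal approximation, as the critical value. A minimal sketch of the same calculation with the exact t critical value from `qt()` follows. The vector `x` is simulated stand-in data (an assumption, not the Verizon sample), so the printed numbers will differ from those above.

```r
# Hypothetical stand-in for data1$Time (simulated, right-skewed)
set.seed(1)
x <- rexp(1687, rate = 1 / 8.5)

n     <- length(x)
x_bar <- mean(x)
se    <- sd(x) / sqrt(n)

# Exact two-sided 99% critical value from the t distribution
t_crit <- qt(0.995, df = n - 1)

ci_99_t <- c(lower = x_bar - t_crit * se, upper = x_bar + t_crit * se)
ci_99_t
```

With a sample this large, the t critical value is very close to 2.58, so the two intervals should nearly coincide.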

iv)Find the t-statistic and p-value of the test

hypothesized_mean <- 7.6
t_stat <- (mean_time - hypothesized_mean) / sde_time
p_value <- 2 * pt(-abs(t_stat), df = length(data1$Time) - 1)
cat("t-statistic:", round(t_stat, 3), "\n")

## t-statistic: 2.561

cat("p-value:", format(p_value, scientific = FALSE))

## p-value: 0.01053068

v)Briefly describe how these values relate to the Null distribution of t (not graded)

The t-statistic is equal to 2.561, which means that the sample mean is 2.561 standard errors away from the hypothesized mean of 7.6 minutes.

The p-value is equal to 0.0105, which means that if the null hypothesis were true (i.e., the population mean repair time for Verizon’s phone services is 7.6 minutes), there would be a 1.05% chance of obtaining a t-statistic at least as extreme as 2.561 in either direction (i.e., |t| ≥ 2.561).

Therefore, we can’t reject the null hypothesis since the p-value is slightly larger than the level of significance (α), which is 0.01 in this case.
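
The same decision can be read off the null t distribution directly. The sketch below uses only values reported in this write-up (t = 2.561, a sample size of 1687, α = 0.01); the variable names are illustrative.

```r
t_obs    <- 2.561      # t-statistic reported above
deg_free <- 1687 - 1   # sample size of 1687 minus one
alpha    <- 0.01

# Two-sided rejection cutoff of the null t distribution
t_cut <- qt(1 - alpha / 2, df = deg_free)

# Two-sided p-value: area in both tails beyond |t_obs|
p_two_sided <- 2 * pt(-abs(t_obs), df = deg_free)

# Reject H0 only if the observed t lies beyond the cutoff
reject_null <- abs(t_obs) > t_cut

cat("cutoff:", round(t_cut, 3), " p-value:", round(p_two_sided, 4),
    " reject H0:", reject_null, "\n")
```

The observed t falls just inside the cutoff, which is the same statement as the p-value being just above α.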

vi)What is your conclusion about the company’s claim from this t-statistic, and why?

As noted in part a-v), the t-statistic is 2.561 and the two-tailed p-value is 0.0105 with a sample size of 1687. Because the p-value is slightly larger than the 1% significance level, we cannot reject the null hypothesis that the population mean repair time for Verizon’s phone services is 7.6 minutes. Consistently, the 99% CI from part a-iii) (7.593 to 9.451) contains 7.6.
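
As a cross-check, the manual t-statistic and p-value can be compared with R's built-in `t.test()`. This is a sketch on simulated stand-in data (`x` is an assumption, not the Verizon sample), so it illustrates the agreement of the two routes rather than reproducing the numbers above.

```r
set.seed(1)
x <- rexp(1687, rate = 1 / 8.5)   # hypothetical stand-in for data1$Time

# Built-in one-sample, two-sided t-test against the hypothesized mean
tt <- t.test(x, mu = 7.6, alternative = "two.sided", conf.level = 0.99)

# Manual computation, mirroring the steps in part a-iv)
t_manual <- (mean(x) - 7.6) / (sd(x) / sqrt(length(x)))
p_manual <- 2 * pt(-abs(t_manual), df = length(x) - 1)

c(t_builtin = unname(tt$statistic), t_manual = t_manual)
c(p_builtin = tt$p.value, p_manual = p_manual)
```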

b)Let’s re-examine Verizon’s claim using bootstrapped testing:

i)Bootstrapped Percentile: Estimate the bootstrapped 99% CI of the population mean

boot_mean <- function(sample0) {
  resample <- sample(sample0, length(sample0), replace=TRUE)
  return( mean(resample) )
}
set.seed(42379878)
num_boots <- 3000
mean_boots <- replicate(
  num_boots,
  boot_mean(data1$Time)
)
ci_99 <- quantile(mean_boots, probs=c(0.005, 0.995));ci_99

##     0.5%    99.5% 
## 7.603120 9.484516

ii)Bootstrapped Difference of Means: What is the 99% CI of the bootstrapped difference between the sample mean and the hypothesized mean?

boot_mean_diffs <- function(sample0,mean_hyp) {
  resample <- sample(sample0, length(sample0), replace=TRUE)
    return( mean(resample) - mean_hyp )
}
set.seed(42379878)
num_boots <- 3000
mean_diffs <- replicate(
  num_boots,
  boot_mean_diffs(data1$Time,hypothesized_mean)
)
diff_ci_99 <- quantile(mean_diffs, probs=c(0.005, 0.995));diff_ci_99

##        0.5%       99.5% 
## 0.003120036 1.884516212

iii)Plot the distributions of the two bootstraps above on two separate plots.

par( mfrow= c(1,2) )
# Base graphics calls are not chained with "+"; abline() draws on the current plot
plot(density(mean_boots), col="blue", main="Bootstrapped Percentile")
abline(v=ci_99, lty="dashed")

plot(density(mean_diffs), col="blue", main="Bootstrapped Difference of Means")
abline(v=diff_ci_99, lty="dashed")

iv)Does the bootstrapped approach agree with the traditional t-test in part [a]?

No. Since the 99% CI of the bootstrapped difference (0.003 to 1.885) does not contain zero, we can reject Verizon’s claim at the 1% level, although the lower bound is only barely above zero.
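
Because the lower percentile bound sits so close to the hypothesized value, it may be worth checking how much that bound moves from one bootstrap run to the next. The sketch below repeats the percentile bootstrap under several seeds. Here `x` is simulated stand-in data and `boot_lower` is a hypothetical helper, so only the pattern, not the numbers, carries over to the Verizon sample.

```r
set.seed(1)
x <- rexp(1687, rate = 1 / 8.5)   # hypothetical stand-in for data1$Time

# Lower 0.5% percentile bound of the bootstrapped mean for a given seed
boot_lower <- function(sample0, seed, num_boots = 3000) {
  set.seed(seed)
  boots <- replicate(
    num_boots,
    mean(sample(sample0, length(sample0), replace = TRUE))
  )
  unname(quantile(boots, probs = 0.005))
}

# Repeat the bootstrap under five different seeds
lower_bounds <- sapply(1:5, function(s) boot_lower(x, seed = s))

# Spread of the lower bound that is due to resampling alone
range(lower_bounds)
```

If the spread is wider than the gap between the bound and the hypothesized mean, the reject/do-not-reject decision could flip with a different seed.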

c)Verizon claims that the median is a fairer test, and claims that the median repair time is no more than 3.5 minutes at 99% confidence (i.e., significance α = 1%).

i)Bootstrapped Percentile: Estimate the bootstrapped 99% CI of the population median

boot_median <- function(sample0) {
  resample <- sample(sample0, length(sample0), replace=TRUE)
  return( median(resample) )
}
set.seed(42379878)
num_boots <- 3000
median_boots <- replicate(
  num_boots,
  boot_median(data1$Time)
)
med_ci_99 <- quantile(median_boots, probs=c(0.005, 0.995));med_ci_99

##  0.5% 99.5% 
##  3.22  3.93

ii)Bootstrapped Difference of Medians: What is the 99% CI of the bootstrapped difference between the sample median and the hypothesized median?

hyp_med <- 3.5
boot_median_diffs <- function(sample0,median_hyp) {
  resample <- sample(sample0, length(sample0), replace=TRUE)
  return( median(resample) -  median_hyp)
}
set.seed(42379878)
num_boots <- 3000
median_diffs <- replicate(
  num_boots,
  boot_median_diffs(data1$Time,hyp_med)
)
meddiff_ci_99 <- quantile(median_diffs, probs=c(0.005, 0.995));meddiff_ci_99

##  0.5% 99.5% 
## -0.28  0.43

iii)Plot the distributions of the two bootstraps above on two separate plots.

par( mfrow= c(1,2) )
# Base graphics calls are not chained with "+"; abline() draws on the current plot
plot(density(median_boots), col="blue", main="Bootstrapped Percentile")
abline(v=med_ci_99, lty="dashed")

plot(density(median_diffs), col="blue", main="Bootstrapped Difference of Medians")
abline(v=meddiff_ci_99, lty="dashed")

iv)What is your conclusion about Verizon’s claim about the median, and why?

Because the 99% CI of the bootstrapped median difference (-0.28 to 0.43) contains zero, we cannot reject Verizon’s claim that the median repair time is no more than 3.5 minutes.
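
Since the claim is directional ("no more than 3.5 minutes"), one possible variant, not the method used above, is a one-sided 99% lower bound taken at the 1% percentile of the bootstrapped medians. The sketch below runs on simulated stand-in data (`x`, `median_boots_sim`, and `lower_99` are all illustrative names), so its result says nothing about the Verizon sample itself.

```r
set.seed(1)
x <- rexp(1687, rate = 1 / 5)   # hypothetical stand-in for data1$Time
hyp_median <- 3.5               # hypothesized median from the claim

median_boots_sim <- replicate(
  3000,
  median(sample(x, length(x), replace = TRUE))
)

# One-sided 99% lower bound: 1% percentile of the bootstrapped medians
lower_99 <- unname(quantile(median_boots_sim, probs = 0.01))

# The claim "median <= 3.5" is contradicted only if the bound exceeds 3.5
reject_claim <- lower_99 > hyp_median
cat("lower bound:", round(lower_99, 3), " reject claim:", reject_claim, "\n")
```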

Question2)

H-null: The mean usage time of the new smartwatch is the same or less than for the previous smartwatch.

H-alt: The mean usage time is greater than that of our previous smartwatch.

Answer the following questions for each scenario:

1.Would this scenario create systematic or random error (or both or neither)?

2.Which part of the t-statistic or significance (diff, sd, n, alpha) would be affected?

3.Will it increase or decrease our power to reject the null hypothesis?

4.Which kind of error (Type I or Type II) becomes more likely because of this scenario?

Scenario a)Only collected data from a pool of young consumers, and missed many older customers

1.This scenario would create systematic error. The sample is not representative of the entire population, as it only includes young consumers and excludes older customers who might use the product less frequently.

2.The difference in mean usage time (diff) may be affected if older customers indeed use the product less frequently than young consumers.

3.This scenario would likely decrease our power to reject the null hypothesis. If the sample is not representative of the entire population, the observed difference in mean usage time may be less extreme than the true difference in the population. As a result, we may not be able to reject the null hypothesis.

4.This scenario increases the likelihood of Type II error. In this scenario, if the true mean usage time of the new smartwatch is actually higher than that of the previous smartwatch, but we fail to reject the null hypothesis due to a biased sample, we would make a Type II error.

Scenario b) 20 of the respondents are reporting data from the wrong wearable device.

1.This scenario creates systematic error because it is a mistake made in data collection that affects a specific group of respondents.

2.The sample size (n), difference in means (diff) and standard deviation (sd) would be affected since the 20 respondents who reported data from the wrong device would have different usage times than those who reported data from the correct device.

3.Removing these 20 respondents would likely increase the power to reject the null hypothesis since they were reporting data from the wrong device, and their usage times would not be representative of the true population.

4.This scenario increases the likelihood of a Type I error, as removing respondents who reported data from the wrong device may create a biased sample that is not representative of the general population.

Scenario c) “95% confidence” criteria change to just 90%.

1.This scenario creates neither systematic nor random error: it is a deliberate change in the decision criterion used for hypothesis testing, not in how the data are collected or measured.

2.The significance level (alpha) would be affected because it would change from 0.05 (for 95% confidence) to 0.10 (for 90% confidence).

3.Relaxing the criteria for hypothesis testing from 95% confidence to 90% confidence would increase the power to reject the null hypothesis since the critical value for t at 90% confidence is smaller than the critical value at 95% confidence. This means that a smaller difference in means would be needed to reject the null hypothesis.

4.This scenario increases the likelihood of a Type I error since relaxing the criteria for hypothesis testing makes it easier to reject the null hypothesis.
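
The effect of relaxing the criterion can be sketched numerically with `qt()` and `power.t.test()`. The sample size, true difference, and standard deviation below are hypothetical values chosen only for illustration, and a one-sample, one-sided test is assumed to match the directional hypotheses above.

```r
deg_free <- 49   # hypothetical: n = 50 users

# One-tailed critical t values for alpha = 0.05 and alpha = 0.10
t_crit_95 <- qt(1 - 0.05, df = deg_free)
t_crit_90 <- qt(1 - 0.10, df = deg_free)

# Power under each alpha, holding n, diff and sd fixed (all hypothetical)
power_95 <- power.t.test(n = 50, delta = 0.3, sd = 1, sig.level = 0.05,
                         type = "one.sample", alternative = "one.sided")$power
power_90 <- power.t.test(n = 50, delta = 0.3, sd = 1, sig.level = 0.10,
                         type = "one.sample", alternative = "one.sided")$power

round(c(t_crit_95 = t_crit_95, t_crit_90 = t_crit_90,
        power_95 = power_95, power_90 = power_90), 3)
```

The lower cutoff at 90% confidence raises power, while the Type I error rate rises from 5% to 10% by construction.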

Scenario d) Measured usage times on five weekdays and took a daily average (without considering weekends).

1.This scenario creates systematic error because the data are collected only on weekdays, which biases the measurements toward usage at those times.

2.The difference in means (diff) and standard deviation (sd) would also be affected since the biased data could increase the standard deviation and affect the difference.

3.Including biased data would decrease the power to reject the null hypothesis since it would decrease the difference in means and increase the variability in the sample.

4.This scenario increases the likelihood of a Type II error because biased data could make it difficult to detect a true difference in means, and we may fail to reject the null hypothesis when we should have.
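
Across the scenarios, the power arguments rest on how diff, sd, and n feed into the t-statistic. A rough sketch with `power.t.test()` shows the direction of each effect. All numbers are hypothetical, `power_at` is an illustrative helper, and a one-sample, one-sided test at α = 0.05 is assumed.

```r
# Power of a one-sided, one-sample t-test for given diff, sd and n
power_at <- function(delta, sd, n = 50) {
  power.t.test(n = n, delta = delta, sd = sd, sig.level = 0.05,
               type = "one.sample", alternative = "one.sided")$power
}

power_base       <- power_at(delta = 0.30, sd = 1.0)           # hypothetical baseline
power_small_diff <- power_at(delta = 0.15, sd = 1.0)           # smaller observed diff
power_large_sd   <- power_at(delta = 0.30, sd = 1.5)           # noisier measurements
power_small_n    <- power_at(delta = 0.30, sd = 1.0, n = 30)   # fewer usable respondents

round(c(base = power_base, small_diff = power_small_diff,
        large_sd = power_large_sd, small_n = power_small_n), 3)
```

A smaller diff, a larger sd, or a smaller n each lowers power, which makes a Type II error more likely.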

111078517

2023-03-19
