HYPOTHESIS TESTING

pacman::p_load(tidyverse,Lock5Data,ISLR,wooldridge,expss,hablar,rio)
  1. A survey of American consumers asked respondents to report the amount of money they spend on bakery products in a typical month. If we assume that the population standard deviation is $5, can we conclude at the 10% significance level that the mean monthly expenditures on bakery products for all Americans is not equal to $30?
#Two tailed test of population mean with known variance
  #Ho: μ = 30
  #Ha: μ ≠ 30

mu <- 30
sigma <- 5
n <- 30  #the sample size is not stated in the problem; 30 is assumed here
alfa <- 0.10


zcrit <- qnorm(p=alfa/2)
x_bar1 <- ((zcrit*sigma)+(mu*sqrt(n)))/(sqrt(n));x_bar1
## [1] 28.49846
x_bar2 <- ((-zcrit*sigma)+(mu*sqrt(n)))/(sqrt(n));x_bar2
## [1] 31.50154
#z <- (x_bar-mu)/(sigma/(sqrt(n)));z
#pvalue= 2*(1-pnorm(abs(z))); pvalue
#The problem gives no sample mean, so the result is stated as a rejection region for x bar.
#Reject the null hypothesis if x bar is less than 28.49846 or greater than 31.50154. A sample mean in that region would indicate, at the 10% significance level, that the mean monthly expenditure on bakery products for all Americans is not equal to $30.
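The cutoffs for x bar become a decision once a sample mean is available. The sketch below uses a hypothetical sample mean of 29.1 (the problem statement gives none) together with the assumed n = 30. It shows that the z statistic with its two tailed p-value and the rejection region for x bar lead to the same decision.

```r
# Hypothetical input: x_bar_obs is NOT from the problem; n = 30 is assumed
mu <- 30; sigma <- 5; n <- 30; alfa <- 0.10
x_bar_obs <- 29.1

# z statistic and two tailed p-value
z_obs <- (x_bar_obs - mu) / (sigma / sqrt(n))
pvalue_obs <- 2 * pnorm(-abs(z_obs))

# Rejection region for x bar: mu -/+ z critical value times the standard error
cutoffs <- mu + c(-1, 1) * qnorm(1 - alfa / 2) * sigma / sqrt(n)

# Both rules give the same decision
reject_by_p <- pvalue_obs < alfa
reject_by_region <- x_bar_obs < cutoffs[1] | x_bar_obs > cutoffs[2]
c(reject_by_p, reject_by_region)
```

With this hypothetical sample mean, x bar lies between the two cutoffs, so neither rule rejects the null hypothesis.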
  1. An economist surveyed homeowners in a large city to determine the percentage increase in their heating bills over the last 5 years. The economist particularly wanted to know if there was enough evidence to infer that heating cost increases were greater than the rate of inflation, which was 10%. Assuming that the percentage increase in heating is normally distributed with a standard deviation of 3% can the economist conclude at the 5% significance level that heating costs increased faster than inflation?
#Upper tail test of population mean with known variance
  #Ho: μ <= 0.10
  #Ha: μ > 0.10

mu <- 0.10
sigma <- 0.03
n <- 30  #the sample size is not stated in the problem; 30 is assumed here
alfa <- 0.05


zcrit <- qnorm(p=alfa, lower.tail = FALSE)
x_bar <- ((zcrit*sigma)+(mu*sqrt(n)))/(sqrt(n));x_bar
## [1] 0.1090092
#z <- (x_bar-mu)/(sigma/(sqrt(n)))
#pvalue= pnorm(z,lower.tail= FALSE); pvalue
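#The problem gives no sample mean, so the result is stated as a rejection region for x bar.
#Reject the null hypothesis if x bar is greater than 0.1090092 (10.9%). A sample mean in that region would indicate, at the 5% significance level, that heating costs increased faster than inflation.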
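The critical value for x bar and the p-value are two views of the same rule. As a quick check, under the same assumed n = 30, the sketch below computes the p-value of a sample mean that sits exactly on the cutoff; it should equal the significance level.

```r
mu <- 0.10; sigma <- 0.03; n <- 30; alfa <- 0.05   # n = 30 is assumed

# Critical sample mean: mu plus the z critical value times the standard error
x_crit <- mu + qnorm(alfa, lower.tail = FALSE) * sigma / sqrt(n)

# p-value of a sample mean lying exactly on the cutoff
z_at_crit <- (x_crit - mu) / (sigma / sqrt(n))
p_at_crit <- pnorm(z_at_crit, lower.tail = FALSE)

all.equal(p_at_crit, alfa)
```

Any sample mean above `x_crit` therefore has a p-value below 0.05, which is why the two approaches always agree.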
  1. Suppose that you are working for a start-up that develops education software for children. You’re working on a new software package and are now trying to determine how much to charge. Based on experience and market trends, the leadership team thinks £50 is reasonable. As a data scientist, you are asked to do some research. The plan is for you to conduct a survey to check how much people would be willing to pay for the software. The leadership team will plan to charge £50 unless there is substantial evidence that people are willing to pay more. Your objective is to use the survey data to determine if the company should re-think the £50 price point. You design a survey and send it to n = 30 potential customers. After everyone has responded, you find that the average willingness to pay in your sample is x_bar = £55.7 and s2 = £64.8. What should the management do?
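#Upper tail test of population mean with unknown variance
  #Ho: μ <= 50
  #Ha: μ > 50
  #The significance level is not stated in the problem; alfa = 0.01 is assumed below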
n=30
xbarra=55.7
s=sqrt(64.8)
alfa=0.01
mu0=50

#Test statistic
Tcal=(xbarra-mu0)/(s/sqrt(n))

#p-value (upper tail only, because Ha: μ > 50)
p=pt(Tcal,n-1,lower.tail=FALSE); p
## [1] 0.0002780401
ifelse(p<alfa,"reject H0","Don't reject H0")
## [1] "reject H0"
#pvalue < level of significance
#We reject H0: the sample gives strong evidence that the mean willingness to pay exceeds the £50 initially proposed, so management should reconsider the price point.
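Several of these exercises repeat the same calculation from summary statistics. A minimal sketch of a reusable helper follows; `t_test_summary` is a hypothetical name, not a function from any loaded package, and the 1% significance level is the assumption used in this exercise.

```r
# Hypothetical helper: one-sample t test from summary statistics
t_test_summary <- function(xbar, s, n, mu0,
                           alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  t_stat <- (xbar - mu0) / (s / sqrt(n))
  df <- n - 1
  p_value <- switch(alternative,
    two.sided = 2 * pt(-abs(t_stat), df),
    less      = pt(t_stat, df),
    greater   = pt(t_stat, df, lower.tail = FALSE)
  )
  list(statistic = t_stat, df = df, p.value = p_value)
}

# Software pricing survey: Ha is "willing to pay more than 50", so upper tail
res <- t_test_summary(xbar = 55.7, s = sqrt(64.8), n = 30, mu0 = 50,
                      alternative = "greater")
res$p.value < 0.01
```

Making the `alternative` argument explicit keeps the choice between a one tailed and a two tailed p-value visible at the call site.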
  1. The manufacturer of the X-15 steel-belted radial truck tire claims that the mean mileage the tire can be driven before the tread wears out is 60,000 miles. Assume the mileage wear follows the normal distribution and the standard deviation of the distribution is 5,000 miles. Cross Truck Company bought 48 tires and found that the mean mileage for its trucks is 59,500 miles. Is the company’s experience different from that claimed by the manufacturer at the .05 significance level?
#Two tailed test of population mean with known variance
  #Ho: μ = 60000
  #Ha: μ ≠ 60000
n= 48
xbarra=59500
sigma=5000
mu0=60000
alfa= 0.05

Zcal=(xbarra-mu0)/(sigma/sqrt(n))

#Two tailed p-value, because the question asks whether the mean is "different"
p=2*pnorm(-abs(Zcal))

ifelse(p<alfa,"reject H0","Don't reject H0")
## [1] "Don't reject H0"
#pvalue > level of significance
#Do not reject H0: there is not enough evidence that the mean mileage experienced by the company differs from the 60,000 miles claimed by the manufacturer.
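A two tailed test at the .05 level can be cross-checked with a 95% confidence interval: H0 is not rejected exactly when the hypothesized mean lies inside the interval. The sketch below uses only the figures given in the problem.

```r
n <- 48; xbarra <- 59500; sigma <- 5000; mu0 <- 60000; alfa <- 0.05

# 95% confidence interval for the mean with known sigma
half_width <- qnorm(1 - alfa / 2) * sigma / sqrt(n)
ci <- c(lower = xbarra - half_width, upper = xbarra + half_width)
ci

# H0 is not rejected exactly when mu0 falls inside the interval
inside <- unname(mu0 >= ci["lower"] & mu0 <= ci["upper"])
inside
```

The claimed 60,000 miles lies inside the interval, which agrees with the decision not to reject H0.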
  1. The waiting time for customers at Mac Burger Restaurants follows a normal distribution with a mean of 3 minutes and a standard deviation of 1 minute. At the Warren Road Mac Burger, the quality assurance department sampled 50 customers and found that the mean waiting time was 2.75 minutes. At the .05 significance level, can we conclude that the mean waiting time is less than 3 minutes?
  #Ho: μ >= 3
  #Ha: μ < 3

miu <- 3
desv <- 1
n <- 50
xbar <- 2.75

z <- ((xbar-miu)/(desv/sqrt(n)))

pnorm(z,lower.tail = TRUE)
## [1] 0.03854994
#pvalue < level of significance
#We reject the null hypothesis: at the 5% significance level, there is enough evidence to conclude that the mean waiting time is less than 3 minutes
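The same decision can be reached by comparing the z statistic with the lower tail critical value instead of computing a p-value. A short sketch with the figures from the problem:

```r
miu <- 3; desv <- 1; n <- 50; xbar <- 2.75
alfa <- 0.05

z <- (xbar - miu) / (desv / sqrt(n))
zcrit <- qnorm(alfa)   # lower tail critical value, about -1.645

z < zcrit   # TRUE means z falls in the rejection region
```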
  1. The management of White Industries is considering a new method of assembling its golf cart. The present method requires 42.3 minutes, on average, to assemble a cart. The mean assembly time for a random sample of 24 carts, using the new method, was 40.6 minutes, and the standard deviation of the sample was 2.7 minutes. Using the .10 level of significance, can we conclude that the assembly time using the new method is faster?
  #Ho: μ >= 42.3
  #Ha: μ < 42.3

miu <- 42.3
s <- 2.7
n <- 24 
xbar <- 40.6

t <- (xbar-miu)/(s/(sqrt(n)))

pvalue= pt(t,df=n-1, lower.tail= TRUE); pvalue
## [1] 0.00261788
#pvalue < level of significance
#We reject the null hypothesis: at the 10% significance level, we can conclude that the assembly time using the new method is faster.
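With raw data, base R's `t.test` does this calculation directly. The problem gives only summary statistics, so the sketch below builds synthetic data (not the real assembly times) rescaled to have exactly the reported mean and standard deviation, purely to show that `t.test` reproduces the manual result.

```r
set.seed(1)   # any seed works; the rescaling below fixes the mean and sd exactly
n <- 24
raw <- rnorm(n)

# Synthetic data rescaled to mean 40.6 and sd 2.7
times <- 40.6 + 2.7 * (raw - mean(raw)) / sd(raw)

fit <- t.test(times, mu = 42.3, alternative = "less")
fit$statistic   # same t as the manual calculation
fit$p.value     # same p-value as pt(t, df = n - 1)
```

Because the t statistic depends on the data only through the mean, the standard deviation, and n, any data set with these summaries gives the same test result.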
  1. The mean income per person in the United States is $40,000, and the distribution of incomes follows a normal distribution. A random sample of 10 residents of Wilmington, Delaware, had a mean of $50,000 with a standard deviation of $10,000. At the .05 level of significance, is that enough evidence to conclude that residents of Wilmington, Delaware, have more income than the national average?
Mu <- 40000
Xbarra <- 50000
S <- 10000
n <- 10
alfa <- 0.05

#Case of study: one mean with unknown variance
#Test: Student's t, upper tail

#Ho: μ <= 40000
#H1: μ > 40000

Tcalc <- ((Xbarra - Mu)/(S/sqrt(n)))
Tcalc
## [1] 3.162278
df <- n-1

Tcrit <- qt(1-alfa, df)  #upper tail test, so the whole alfa goes in one tail

Tcalc > Tcrit
## [1] TRUE
#The above expression is true, so the null hypothesis is rejected at the 5% significance level
#There is enough evidence to conclude that Wilmington residents have a higher mean income than the national average
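The critical value comparison can be complemented with the p-value, which is the upper tail area beyond the calculated t. A brief sketch using the figures from the problem:

```r
Mu <- 40000; Xbarra <- 50000; S <- 10000; n <- 10; alfa <- 0.05

Tcalc <- (Xbarra - Mu) / (S / sqrt(n))
pvalue <- pt(Tcalc, df = n - 1, lower.tail = FALSE)   # upper tail area
pvalue < alfa
```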
  1. According to the local union president, the mean gross income of plumbers in the Salt Lake City area follows the normal probability distribution with a mean of $45,000 and a standard deviation of $3,000. A recent investigative reporter for KYAK TV found, that for a sample of 120 plumbers, the mean gross income was $45,500. At the .10 significance level, is it reasonable to conclude that the mean income is not equal to $45,000? Determine the p-value.
# H0: μ = 45000
# Ha: μ ≠ 45000

x_bar <- 45500
mu <- 45000
sigma <- 3000
n <- 120

z <- (x_bar - mu) / (sigma/ sqrt(n))

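#Area in one tail
p.tail <- pnorm(abs(z), lower.tail = FALSE); p.tail
## [1] 0.03394458
#Two tailed p-value, because Ha is "not equal": twice the tail area, about 0.068
p.value <- 2 * p.tail
p.value < 0.10
## [1] TRUE
#pvalue < level of significance
#We reject H0: at the .10 significance level, there is enough evidence to conclude that the mean income is not equal to $45,000.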
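Because the alternative here is "not equal", the relevant p-value is a two tailed area. Reporting it lets a reader apply any significance level, not only .10. The sketch below evaluates the decision at three common levels.

```r
x_bar <- 45500; mu <- 45000; sigma <- 3000; n <- 120

z <- (x_bar - mu) / (sigma / sqrt(n))
p_two_tailed <- 2 * pnorm(-abs(z))

# Decision at several common significance levels
alfas <- c(0.10, 0.05, 0.01)
data.frame(alfa = alfas, reject_H0 = p_two_tailed < alfas)
```

H0 is rejected at the .10 level but not at .05 or .01, so the evidence against a mean of $45,000 is moderate rather than conclusive.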
  1. A new weight-watching company, Weight Reducers International, advertises that those who join will lose, on average, 10 pounds in the first two weeks with a standard deviation of 2.8 pounds. A random sample of 50 people who joined the new weight reduction program revealed the mean loss to be 9 pounds. At the .05 level of significance, can we conclude that those joining Weight Reducers on average will lose less than 10 pounds? Determine the p-value.
#Ho: μ >= 10
#H1: μ < 10

xbar= 9
mu = 10
sigma = 2.8
n = 50 

z <- (xbar - mu) / (sigma / sqrt(n))
z
## [1] -2.525381
pvalue <- pnorm(z, lower.tail = TRUE); pvalue
## [1] 0.00577864
#pvalue < level of significance
#We reject the null hypothesis: at the 5% significance level, we can conclude that those joining Weight Reducers will lose, on average, less than 10 pounds
  1. According to a recent survey, Americans get a mean of 7 hours of sleep per night. A random sample of 50 students at West Virginia University revealed the mean number of hours slept last night was 6 hours and 48 minutes (6.8 hours). The standard deviation of the sample was 0.9 hours. Is it reasonable to conclude that students at West Virginia sleep less than the typical American? Compute the p-value.
#Ho: μ >= 7
#Ha: μ < 7

xbar2 = 6.8
mu2 = 7
s = 0.9
n2 = 50

t <- (xbar2-mu2)/(s/(sqrt(n2)))

pvalue2= pt(t,df=n2-1, lower.tail= TRUE); pvalue2
## [1] 0.06126869
#Level of significance = 5% (assumed; the problem does not state one)
#pvalue > level of significance
#At a 5% significance level, we don't reject the null hypothesis, which means that we cannot conclude that students at West Virginia sleep less than the typical American
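With n = 50 the t distribution is already close to the normal, so the choice between them changes the p-value only slightly. The sketch below compares the two for this exercise; the 5% level remains an assumption, since the problem does not state one.

```r
xbar2 <- 6.8; mu2 <- 7; s <- 0.9; n2 <- 50

t_stat <- (xbar2 - mu2) / (s / sqrt(n2))
p_t <- pt(t_stat, df = n2 - 1)   # Student t with 49 degrees of freedom
p_z <- pnorm(t_stat)             # normal approximation

c(t = p_t, normal = p_z)
```

Both p-values are above 0.05, so the decision not to reject H0 does not depend on which distribution is used.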