Problem 1: Commuting by Tube.

Transport for London reports that 68% of commuters use the Underground at least once per week.
You take a random sample of \(n = 120\) London residents.

Part A

We are dealing with sample proportions, since each person either uses or does not use the Underground weekly. The 68% figure gives the population proportion, p = 0.68, so the CLT for proportions is the way to go. The sampling distribution of the sample proportion should follow the normal model N(0.68, sqrt(0.68*0.32/120)), which rounds to N(0.68, 0.043). The calculations are below:

p <- 0.68          # population proportion
n <- 120           # sample size

mean_p <- p
sd_p <- sqrt(p*(1-p)/n)

mean_p

## [1] 0.68

sd_p

## [1] 0.04258325
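
The normal model in Part A is only reasonable when the sample is large enough. A minimal sketch of the usual success-failure check follows. The threshold of 10 is a common textbook rule of thumb, assumed here rather than stated in the problem, and the names expected_successes, expected_failures and clt_ok are illustrative.

```r
p <- 0.68   # population proportion (from the problem)
n <- 120    # sample size (from the problem)

expected_successes <- n * p        # expected number of weekly Underground users
expected_failures  <- n * (1 - p)  # expected number of non-users

# Assumed rule of thumb: both expected counts should be at least 10
clt_ok <- expected_successes >= 10 && expected_failures >= 10

expected_successes
expected_failures
clt_ok
```

Here the expected counts are 81.6 and 38.4, so the check passes under that assumed threshold.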

Part B

The probability that fewer than 75 of the 120 sampled residents use the Underground weekly can be found by first calculating the sample proportion 75/120 = 0.625. We then convert this proportion to a z-score and calculate the probability of a sample having a z-score as small as, or smaller than, that value. The z-score is (0.625 - 0.68)/0.04258325 = -1.291588, which we round to -1.29. The probability that a random sample has a z-score less than -1.29 is equal to 0.09852533. The relevant calculations are below.

sample_p <- 75/120
sample_z <- (sample_p - mean_p)/sd_p

sample_p

## [1] 0.625

sample_z

## [1] -1.291588

pnorm(-1.29, mean=0, sd=1)

## [1] 0.09852533

Therefore, the probability that fewer than 75 of the sampled residents use the Underground weekly is approximately 0.09852533, or roughly 9.9%.
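
As an optional cross-check that is not part of the original solution, the normal approximation can be compared with the exact binomial probability. This sketch assumes the count of weekly users in the sample is Binomial(120, 0.68). The names se_p, approx_prob and exact_prob are illustrative.

```r
n <- 120    # sample size (from the problem)
p <- 0.68   # population proportion (from the problem)

# Normal approximation: P(p_hat < 75/120), using the unrounded z-score
se_p <- sqrt(p * (1 - p) / n)
approx_prob <- pnorm((75 / n - p) / se_p)

# Exact binomial: "fewer than 75" means at most 74 users
exact_prob <- pbinom(74, size = n, prob = p)

c(normal_approx = approx_prob, exact_binomial = exact_prob)
```

The two values need not agree exactly, because the normal model is continuous while the count of users is discrete.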

Problem 2: British Museum Visit Times.

Tourist time in the British Museum (minutes) has population mean \(\mu = 90\) and standard deviation \(\sigma = 30\).
We take a random sample of \(n = 50\).

Part A

We are dealing with sample means. The population mean of 90 and standard deviation of 30 are given, and the sample size of 50 is reasonably large, so the CLT for means is the model we need to use. The sampling distribution of the sample mean should follow the normal model N(90, 30/sqrt(50)), which rounds to N(90, 4.24). The calculations are:

mu <- 90 # population mean
sigma <- 30 # standard deviation
n <- 50 #sample size
mean_x <- mu
sd_x <- sigma / sqrt(n)

mean_x

## [1] 90

sd_x

## [1] 4.242641

Part B

The probability that our sample has an average time greater than 100 minutes can be found by calculating the z-score (100-90)/4.24 = 2.36. The probability that a random sample of 50 visitors has an average time greater than 100 minutes is equal to 0.009211063. The relevant calculations are below.

x <- 100
1 - pnorm(x, mean = mean_x, sd = sd_x)

## [1] 0.009211063

Therefore, the probability that our sample has an average visit time greater than 100 minutes is approximately 0.009211063 or 0.92%.
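
The write-up describes a z-score, while the code passes the mean and standard deviation directly to pnorm. A short sketch showing that both routes give the same upper-tail probability follows. It uses pnorm's lower.tail argument, and the names z, p_from_z and p_direct are illustrative.

```r
mu <- 90       # population mean (from the problem)
sigma <- 30    # population standard deviation (from the problem)
n <- 50        # sample size (from the problem)
se <- sigma / sqrt(n)

# Route 1: standardise first, then use the standard normal
z <- (100 - mu) / se
p_from_z <- pnorm(z, lower.tail = FALSE)

# Route 2: let pnorm standardise internally
p_direct <- pnorm(100, mean = mu, sd = se, lower.tail = FALSE)

all.equal(p_from_z, p_direct)
```

Using lower.tail = FALSE is equivalent to 1 - pnorm(...) and avoids the subtraction.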

Problem 3: The London Eye.

The number of selfies taken per capsule ride is approximately normal with mean \(\mu = 12\) and standard deviation \(\sigma = 5\).
We take a random sample of \(n = 16\) capsule rides and consider the average number of selfies taken in this sample.

Part A

mu <- 12 # population mean
sigma <- 5 # standard deviation
n <- 16 # sample size
mean_x <- mu
sd_x <- sigma / sqrt(n)

mean_x

## [1] 12

sd_x

## [1] 1.25

Part B

x <- 15
1 - pnorm(x, mean = mean_x, sd = sd_x)

## [1] 0.008197536

Therefore, the probability that our sample’s average number of selfies per capsule ride exceeds 15 is about 0.008197536, or 0.82%.
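
Because the population is itself approximately normal, the sample mean is approximately normal even with a sample as small as \(n = 16\). As a worked check of the Part B code, writing \(\bar{X}\) for the sample mean (notation introduced here for illustration):

\[
z = \frac{15 - 12}{5/\sqrt{16}} = \frac{3}{1.25} = 2.4,
\qquad
P(\bar{X} > 15) = P(Z > 2.4) \approx 0.0082
\]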

Problem 4: Recycling Habits.

In the UK, about \(p = 0.74\) of households report recycling regularly.
Suppose you randomly sample \(n = 120\) households.

Part A

p <- 0.74 #population proportion
n <- 120 #sample size

mean_p <- p
sd_p <- sqrt(p * (1 - p) / n)

mean_p

## [1] 0.74

sd_p

## [1] 0.04004164

Part B

sample_p <- 90 / 120
sample_z <- (sample_p - mean_p) / sd_p

sample_p

## [1] 0.75

sample_z

## [1] 0.24974

# Probability that z is greater than sample_z
1 - pnorm(sample_z, mean = 0, sd = 1)

## [1] 0.4013942

Therefore, the probability that at least 90 of the 120 households recycle regularly is approximately 0.4013942, or 40.14%.
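
The steps behind the Part B code can be written out as a worked equation. Here \(\hat{p}\) denotes the sample proportion, notation introduced for illustration:

\[
\hat{p} = \frac{90}{120} = 0.75,
\qquad
z = \frac{0.75 - 0.74}{\sqrt{0.74 \times 0.26 / 120}} \approx \frac{0.01}{0.0400} \approx 0.25,
\qquad
P(\hat{p} \ge 0.75) = P(Z \ge 0.25) \approx 0.401
\]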

Problem 5: Bumblebee Flight Distances.

A scientist finds that bumblebee flight distances between flowers, in meters, follow a normal distribution with mean \(\mu = 3.2\) and standard deviation \(\sigma = 2.1\).
Suppose \(n = 64\).

Part A

mu <- 3.2 #population mean
sigma <- 2.1 #population sd
n <- 64 #sample size

mean_x <- mu
sd_x <- sigma / sqrt(n)

mean_x

## [1] 3.2

sd_x

## [1] 0.2625

Part B

x <- 3.5
1 - pnorm(x, mean = mean_x, sd = sd_x)

## [1] 0.126549

Therefore, the probability that the sample of 64 flight distances has an average distance greater than 3.5 meters is approximately 0.126549, or 12.65%.
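
As an optional sanity check that is not part of the original solution, the Part B probability can be approximated by simulation. This sketch assumes flight distances are drawn independently from the stated normal distribution. The seed, the number of repetitions and the names reps, sample_means and sim_prob are arbitrary choices made for illustration.

```r
set.seed(1)    # arbitrary seed, only for reproducibility

mu <- 3.2      # population mean (from the problem)
sigma <- 2.1   # population standard deviation (from the problem)
n <- 64        # sample size (from the problem)
reps <- 10000  # number of simulated samples (assumed)

# Draw many samples of size n and record each sample mean
sample_means <- replicate(reps, mean(rnorm(n, mean = mu, sd = sigma)))

# Proportion of simulated sample means above 3.5 meters
sim_prob <- mean(sample_means > 3.5)
sim_prob
```

The simulated proportion should land close to the pnorm result, with some run-to-run variation that depends on the seed and the number of repetitions.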

Problem 6: Social Media Use.

Among university students, the population mean time on social media per day is \(\mu = 2.8\) hours with standard deviation \(\sigma = 1.1\) hours.
Let \(n = 50\).

Part A

mu <- 2.8 #population mean
sigma <- 1.1 #standard deviation
n <- 50 #sample size

mean_x <- mu
sd_x <- sigma / sqrt(n)

mean_x

## [1] 2.8

sd_x

## [1] 0.1555635

Part B

x <- 3.0
1 - pnorm(x, mean = mean_x, sd = sd_x)

## [1] 0.09928285

Therefore, the probability that the sample of 50 students spends an average of 3 hours or more per day on social media is approximately 0.09928285, or 9.93%.
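
The population distribution is not stated to be normal here, so the calculation relies on the CLT with \(n = 50\). As a worked check of the Part B code, writing \(\bar{X}\) for the sample mean (notation introduced here for illustration):

\[
z = \frac{3.0 - 2.8}{1.1/\sqrt{50}} \approx \frac{0.2}{0.1556} \approx 1.29,
\qquad
P(\bar{X} \ge 3.0) = P(Z \ge 1.29) \approx 0.099
\]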

Week 7 RMarkdown
