\(~\)
This document shows how to present your solutions and outputs for problems on the Normal Distribution using R. The same examples from the module are used to illustrate the procedure.
\(~\)
This part presents the steps for the solution, together with the R code and a short explanation of the commands used. After this part, sample problems with solutions are presented as a guide for your activity outputs. Before we run the scripts, we need to load the ggplot2 package.
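A minimal setup sketch, assuming ggplot2 is already installed (if it is not, install.packages("ggplot2") would need to be run once beforehand):

```r
# Attach ggplot2 so that ggplot(), aes(), geom_line(), geom_ribbon(),
# and labs() are available to the scripts that follow.
library(ggplot2)
```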
\(~\)
Find the area under the normal curve to the left of \(z = -2.53\).
We need to have an illustration of the standard normal curve for the given problem. We do this first by generating a sequence of \(z\) values for the normal curve that we are going to plot. This can be done by using the seq function. The following script considers \(z\) values from -4 to +4. The value for length.out can be varied should there be problems in rendering the plot of the standard normal curve. In this case, setting length.out to 1000 subdivides the interval -4 to +4 into a sequence of 1000 \(z\) values.
z <- seq(from = -4, to = +4, length.out = 1000)
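As a quick check of what seq produces here, the sketch below inspects the generated sequence. The name step is introduced only for illustration; the spacing works out to 8/999 because 1000 points span an interval of width 8.

```r
z <- seq(from = -4, to = 4, length.out = 1000)

length(z)             # number of z values generated
range(z)              # smallest and largest z value
step <- z[2] - z[1]   # spacing between consecutive z values
step
all.equal(step, 8 / 999)
```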
\(~\)
We create a data frame of ordered pairs: each \(z\) value and its corresponding probability density. The following script creates a data frame containing the 1000 \(z\) values and their densities. You can check the data frame by executing the command View(normal). In the call to dnorm, the mean is set to 0 and the standard deviation to 1, since these are the parameters of the standard normal distribution, which is what we are going to plot.
normal <- data.frame(x = z, y = dnorm(z, mean = 0, sd = 1))
\(~\)
We now create the plot of the standard normal distribution. The following script builds the curve for \(z\) values from -4 to +4 and stores it in the object stdnormal.
stdnormal <- ggplot(normal, aes(x,y)) + geom_line()+ labs(title = "Standard Normal Distribution", y = "Density", x = "Z")
\(~\)
To generate the plot:
stdnormal
\(~\)
The following script shades the area under the curve for \(z\) values less than -2.53. This shaded portion corresponds to the required area under the standard normal curve, which is to the left of \(z = -2.53\). In the subset command, the first argument is the data frame containing the \(z\) values and their corresponding probability densities (in this case, the object ‘normal’ created in the second script above). The second argument is the condition that selects the \(z\) values covered by the required area (in this case, the \(z\) values less than -2.53).
stdnormal+geom_ribbon(data = subset(normal, z < -2.53), aes(ymax = y), ymin = 0, fill = "blue", alpha = 0.3)
\(~\)
The following script now gives the area under the standard normal curve which is to the left of \(z = -2.53\).
pnorm(-2.53)
## [1] 0.005703126
\(~\)
After this result in your .docx file, provide an interpretation of the result or a concluding statement, such as:
The area under the (standard) normal curve to the left of \(z = -2.53\) is 0.0057 (for the final answer, you may round to four decimal places).
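To see that pnorm really returns an area under the density curve, the sketch below compares it with a numerical integration of dnorm over the same region. The names area_numeric and area_pnorm are introduced only for illustration.

```r
# Area under the standard normal density from -Inf to -2.53,
# obtained by numerical integration.
area_numeric <- integrate(dnorm, lower = -Inf, upper = -2.53)$value

# The same area from the cumulative distribution function.
area_pnorm <- pnorm(-2.53)

area_numeric
area_pnorm
round(area_pnorm, 4)
```

The two values should agree up to the numerical error of integrate.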
\(~\)
Suppose that the area under the standard normal curve, to the right of an unknown \(z\) value is 0.25. Determine the value of \(z\).
\(~\)
We present the given area under the standard normal curve. The same script in the preceding example will be used with just a few tweaks in some parts of the code.
We generate a sequence of \(z\) values to consider for the graph/plot of the standard normal curve.
z <- seq(from = -4, to = +4, length.out = 1000)
\(~\)
Again, we create the data frame containing the generated \(z\) values and their corresponding probability density.
normal <- data.frame(x = z, y = dnorm(z, mean = 0, sd = 1))
\(~\)
We now create the standard normal distribution.
stdnormal <- ggplot(normal, aes(x,y)) + geom_line()+ labs(title = "Standard Normal Distribution", y = "Density", x = "Z")
\(~\)
We generate the plot.
stdnormal
\(~\)
Now, we shade the area of interest under the standard normal curve. The only difference from the script in the preceding example is the condition in the subset command that represents the set of \(z\) values covered by the area to be shaded. In the preceding example, the reference \(z\) value was given, so we entered the range of \(z\) values directly into the script. In this case, the reference \(z\) value is unknown but can be determined with the qnorm function, so this function is incorporated into the subset command. Note the argument lower.tail = F: by default, qnorm treats the given probability as the area to the left, and here we are considering the area to the right.
stdnormal+geom_ribbon(data = subset(normal, z > qnorm(0.25, lower.tail = F)), aes(ymax = y), ymin = 0, fill = "blue", alpha = 0.3)
\(~\)
These two examples show that, to shade a different area under the normal curve, we only need to change the condition in the subset command, which sets the \(z\) values covered by the given or required area.
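Since only the subset condition changes between examples, the plotting steps could be wrapped in a small helper. The following is a sketch only: shade_region, lower, and upper are hypothetical names that do not appear in the module, and ggplot2 is assumed to be installed.

```r
library(ggplot2)

# Hypothetical helper: draws the standard normal curve and shades
# the region where lower < z < upper.
shade_region <- function(lower = -Inf, upper = Inf) {
  z <- seq(from = -4, to = 4, length.out = 1000)
  normal <- data.frame(x = z, y = dnorm(z, mean = 0, sd = 1))
  ggplot(normal, aes(x, y)) +
    geom_line() +
    geom_ribbon(data = subset(normal, x > lower & x < upper),
                aes(ymin = 0, ymax = y), fill = "blue", alpha = 0.3) +
    labs(title = "Standard Normal Distribution", y = "Density", x = "Z")
}

# Area to the left of z = -2.53 (first example).
p_left <- shade_region(upper = -2.53)

# Area of 0.25 to the right of an unknown z (second example).
p_right <- shade_region(lower = qnorm(0.25, lower.tail = FALSE))

p_left
p_right
```

The condition inside the helper refers to the data frame column x rather than to a separate vector, so it does not depend on any other object in the workspace.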
\(~\)
Now, we determine the unknown or required \(z\) value:
z_right <- qnorm(0.25, lower.tail = F)
z_right
## [1] 0.6744898
The value of \(z\) with an area of 0.25 to its right is 0.67 (the \(z\) value can be rounded off to two decimal places).
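As a check on the role of lower.tail, the sketch below shows that an area of 0.25 to the right is the same as an area of 0.75 to the left, and that pnorm recovers the original area. The names z_upper and z_lower are introduced only for illustration.

```r
z_upper <- qnorm(0.25, lower.tail = FALSE)  # area of 0.25 to the right
z_lower <- qnorm(1 - 0.25)                  # area of 0.75 to the left

all.equal(z_upper, z_lower)                 # both give the same z value
pnorm(z_upper, lower.tail = FALSE)          # recovers the right-tail area
round(z_upper, 2)
```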
\(~\)
Here now are some examples that will serve as your guide in working on the problems for the Normal distribution. Follow the presentation of the solution for each of these problems when you work on the practice exercise and the LRA for Module 7.
\(~\)
Find the probability that a \(z\) value is between -0.54 and 2.07.
\(~\)
We are required to determine \(P(-0.54 \le z \le 2.07)\). This probability is represented by the area under the standard normal curve between \(z = -0.54\) and \(z = 2.07\), illustrated as follows:
z <- seq(from = -4, to = +4, length.out = 1000)
normprob <- data.frame(x = z, y = dnorm(z, mean = 0, sd = 1))
stdnormal <- ggplot(normprob, aes(x,y)) + geom_line() + labs(title = "Standard Normal Distribution", y = "Density", x = "Z")
stdnormal + geom_ribbon(data = subset(normprob, z > -0.54 & z < 2.07), aes(ymax = y), ymin = 0, fill = "blue", alpha = 0.5)
pnorm(2.07) - pnorm(-0.54)
## [1] 0.6861753
The probability that a \(z\) value is between -0.54 and 2.07 is 0.6862.
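The subtraction used above works for any interval: the area between two values is the area to the left of the upper value minus the area to the left of the lower value. A minimal sketch, where area_between is a hypothetical helper name that is not part of the module:

```r
# Hypothetical helper: area under a normal curve between a and b.
area_between <- function(a, b, mean = 0, sd = 1) {
  pnorm(b, mean = mean, sd = sd) - pnorm(a, mean = mean, sd = sd)
}

area_between(-0.54, 2.07)
round(area_between(-0.54, 2.07), 4)
```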
\(~\)
Find the probability of obtaining a z-value that is at least 1.58.
\(~\)
We determine \(P(z \ge 1.58)\). This is represented by the area under the standard normal curve to the right of \(z = 1.58\).
z <- seq(from = -4, to = +4, length.out = 1000)
normprob <- data.frame(x = z, y = dnorm(z, mean = 0, sd = 1))
stdnormal <- ggplot(normprob, aes(x,y)) + geom_line() + labs(title = "Standard Normal Distribution", y = "Density", x = "Z")
stdnormal + geom_ribbon(data = subset(normprob, z > 1.58), aes(ymax = y), ymin = 0, fill = "blue", alpha = 0.5)
pnorm(1.58, lower.tail = F)
## [1] 0.05705343
The probability of obtaining a \(z\) value that is at least 1.58 is 0.0571.
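Because the normal distribution is continuous, "at least 1.58" and "greater than 1.58" give the same probability, and the right-tail area is the complement of the left-tail area. The sketch below checks this; upper_tail and complement are names introduced only for illustration.

```r
upper_tail <- pnorm(1.58, lower.tail = FALSE)  # area to the right of 1.58
complement <- 1 - pnorm(1.58)                  # 1 minus the area to the left

all.equal(upper_tail, complement)
round(upper_tail, 4)
```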
\(~\)
Determine a \(z\) value such that the probability of obtaining any larger \(z\) value than this is 0.10.
\(~\)
The problem gives the area, 0.10, under the standard normal curve to the right of the unknown \(z\) value. This means that the probability of obtaining any \(z\) value greater than the required value, \(P(Z > z)\), is equal to 0.10.
z <- seq(from = -4, to = +4, length.out = 1000)
normprob <- data.frame(x = z, y = dnorm(z, mean = 0, sd = 1))
stdnormal <- ggplot(normprob, aes(x,y)) + geom_line() + labs(title = "Standard Normal Distribution", y = "Density", x = "Z")
stdnormal + geom_ribbon(data = subset(normprob, z > qnorm(0.10, lower.tail = F)), aes(ymax = y), ymin = 0, fill = "blue", alpha = 0.5)
qnorm(0.10, lower.tail = F)
## [1] 1.281552
The required \(z\) value, such that the probability of obtaining any larger \(z\) value would be 0.10, is 1.28.
\(~\)
Given a normal distribution with \(\mu\) = 40 and \(\sigma\) = 6, find the area below the value 32.
\(~\)
We are required to find the area below the value 32. This is equivalent to finding the probability of a value less than 32 for the given normally distributed variable.
We solve for the equivalent \(z\)-score of 32.
z1 <- (32 - 40)/6
z1
## [1] -1.333333
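The computation above applies the standardization formula for a normally distributed value \(x\) with mean \(\mu\) and standard deviation \(\sigma\). Written out with the given values:

```latex
\[
z = \frac{x - \mu}{\sigma} = \frac{32 - 40}{6} = -\frac{4}{3} \approx -1.3333
\]
```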
\(~\)
The required area is now presented under the standard normal curve, to the left of the obtained \(z\)-score.
z <- seq(from = -4, to = +4, length.out = 1000)
normprob <- data.frame(x = z, y = dnorm(z, mean = 0, sd = 1))
stdnormal <- ggplot(normprob, aes(x,y)) + geom_line() + labs(title = "Standard Normal Distribution", y = "Density", x = "Z")
stdnormal + geom_ribbon(data = subset(normprob, z < z1), aes(ymax = y), ymin = 0, fill = "blue", alpha = 0.5)
\(~\)
To compute the required area, using either the \(z\)-score or the original values directly:
pnorm(z1)
## [1] 0.09121122
pnorm(32, mean = 40, sd=6)
## [1] 0.09121122
The area under the normal curve which is below the value 32 is 0.0912.
\(~\)
Given a normal distribution with \(\mu\) = 40 and \(\sigma\) = 6, find the area between 42 and 51.
\(~\)
We are required to find the area between 42 and 51. This is equivalent to finding the probability of a value between 42 and 51 for the given normally distributed variable.
We solve first for the equivalent \(z\)-scores of 42 and 51.
z1 <- (42 - 40)/6
z1
## [1] 0.3333333
z2 <- (51- 40)/6
z2
## [1] 1.833333
\(~\)
The required area is now shown under the standard normal curve, between the computed z-values.
z <- seq(from = -4, to = +4, length.out = 1000)
normprob <- data.frame(x = z, y = dnorm(z, mean = 0, sd = 1))
stdnormal <- ggplot(normprob, aes(x,y)) + geom_line() + labs(title = "Standard Normal Distribution", y = "Density", x = "Z")
stdnormal + geom_ribbon(data = subset(normprob, z > z1 & z < z2), aes(ymax = y), ymin = 0, fill = "blue", alpha = 0.5)
\(~\)
To compute the required area, using either the \(z\)-scores or the original values directly:
pnorm(z2) - pnorm(z1)
## [1] 0.3360648
pnorm(51, mean = 40, sd=6) - pnorm(42, mean = 40, sd=6)
## [1] 0.3360648
The area under the normal curve which is between 42 and 51 is 0.3361.
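Both bounds can also be standardized in one step, since R arithmetic is vectorized. A minimal sketch, where mu, sigma, x_bounds, z_bounds, and area are names introduced only for illustration:

```r
mu <- 40
sigma <- 6
x_bounds <- c(42, 51)

z_bounds <- (x_bounds - mu) / sigma   # z-scores of 42 and 51
z_bounds

area <- diff(pnorm(z_bounds))         # same as pnorm(z2) - pnorm(z1)
round(area, 4)
```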
\(~\)
Given a normal distribution with \(\mu\) = 40 and \(\sigma\) = 6, find the \(x\) value that has 45% of the area below it.
\(~\)
Here, we are given the area below (to the left of) an unknown \(x\) value, and we are required to determine this \(x\) value. We present the area under the standard normal curve to the left of the \(z\) value that corresponds to this unknown \(x\) value. The value of \(z\) for the given area will also be determined.
z <- seq(from = -4, to = +4, length.out = 1000)
normprob <- data.frame(x = z, y = dnorm(z, mean = 0, sd = 1))
stdnormal <- ggplot(normprob, aes(x,y)) + geom_line() + labs(title = "Standard Normal Distribution", y = "Density", x = "Z")
stdnormal + geom_ribbon(data = subset(normprob, z < qnorm(0.45)), aes(ymax = y), ymin = 0, fill = "blue", alpha = 0.5)
\(~\)
The z-value with the area to its left equal to 0.45 is:
qnorm(0.45)
## [1] -0.1256613
\(~\)
Now, to determine the required \(x\) value corresponding to this z value:
x <- qnorm(0.45, mean = 40, sd = 6)
x
## [1] 39.24603
The \(x\) value that has 45% of the area under the normal curve below it is 39.2460.
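The link between the \(z\) value and the \(x\) value is the inverse of standardization, \(x = \mu + z\sigma\). The sketch below shows that this gives the same result as calling qnorm with the mean and standard deviation; the names mu, sigma, z45, x_from_z, and x_direct are introduced only for illustration.

```r
mu <- 40
sigma <- 6

z45 <- qnorm(0.45)                               # z value with 0.45 to its left
x_from_z <- mu + z45 * sigma                     # convert z back to the x scale
x_direct <- qnorm(0.45, mean = mu, sd = sigma)   # direct computation

all.equal(x_from_z, x_direct)
x_from_z
```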
\(~\)
Given a normal distribution with \(\mu\) = 40 and \(\sigma\) = 6, find the \(x\) value that has 13% of the area above it.
\(~\)
Here, we are given the area above (to the right of) an unknown \(x\) value, and we have to find this \(x\) value. We present the area under the standard normal curve to the right of the \(z\) value that corresponds to this unknown \(x\). The value of \(z\) for the given area will also be determined.
z <- seq(from = -4, to = +4, length.out = 1000)
normprob <- data.frame(x = z, y = dnorm(z, mean = 0, sd = 1))
stdnormal <- ggplot(normprob, aes(x,y)) + geom_line() + labs(title = "Standard Normal Distribution", y = "Density", x = "Z")
stdnormal + geom_ribbon(data = subset(normprob, z > qnorm(0.13, lower.tail = F)), aes(ymax = y), ymin = 0, fill = "blue", alpha = 0.5)
\(~\)
The z-value with an area of 0.13 to its right is:
qnorm(0.13, lower.tail = F)
## [1] 1.126391
\(~\)
Now, to determine the required \(x\) value corresponding to this z value:
x <- qnorm(0.13, mean = 40, sd = 6, lower.tail = F)
x
## [1] 46.75835
The \(x\) value that has 13% of the area under the normal curve above it is 46.7584.
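A result like this can be verified by a round trip: the area above the computed \(x\) value should come back as 0.13, and standardizing it should give the \(z\) value found earlier. A minimal sketch, where x13, area_above, and z13 are names introduced only for illustration:

```r
x13 <- qnorm(0.13, mean = 40, sd = 6, lower.tail = FALSE)

# Area above x13 on the original scale.
area_above <- pnorm(x13, mean = 40, sd = 6, lower.tail = FALSE)
area_above

# Standardizing x13 should match qnorm(0.13, lower.tail = FALSE).
z13 <- (x13 - 40) / 6
all.equal(z13, qnorm(0.13, lower.tail = FALSE))
```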
\(~\)