Data 606_Week2_Lab 2 (Distribution)

3.3. (P*) GRE scores, Part I.

Sophia’s verbal: 160
Sophia’s quantitative: 157
Verbal mean, SD: 151, 7
Quantitative mean, SD: 153, 7.67

# Sophia's scores
Q3.3v.soph = 160
Q3.3q.soph = 157
# Verbal distribution 
Q3.3v.mean = 151
Q3.3v.sd = 7
# Quantitative distribution
Q3.3q.mean = 153
Q3.3q.sd = 7.67

Short-hand for these normal distributions

Verbal: N(mu = 151, sigma = 7)
Quantitative: N(mu = 153, sigma = 7.67)

Sophia’s Z-score on verbal, quantitative; draw standard normal distribution and mark Z-scores

Q3.3v.soph.z <- (Q3.3v.soph - Q3.3v.mean) / Q3.3v.sd
Q3.3q.soph.z <- (Q3.3q.soph - Q3.3q.mean) / Q3.3q.sd
cat("Sophia's Z-score on verbal is ", round(Q3.3v.soph.z, 3), " and on math is", round(Q3.3q.soph.z, 3))

## Sophia's Z-score on verbal is  1.286  and on math is 0.522
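The standardization used above is Z = (x - mu) / sigma. As a minimal sketch, the same calculation can be wrapped in a small helper; the name `z_score` and the variable names below are assumptions introduced for illustration, not part of the original lab.

```r
# Hypothetical helper (not from the original lab): standardize a raw score
z_score <- function(x, mu, sigma) {
  (x - mu) / sigma
}

# Sophia's scores with the population parameters given in the problem
z_verbal <- z_score(160, mu = 151, sigma = 7)
z_quant  <- z_score(157, mu = 153, sigma = 7.67)

# Rounded to three decimals for comparison with the output above
round(c(verbal = z_verbal, quantitative = z_quant), 3)
```

Rounded to three decimals, these should agree with the 1.286 and 0.522 printed above.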

# Mark Sophia's score on each density curve
# The marker height is the density at her score (dnorm), so each segment ends exactly on the curve
par(mfrow = c(2, 1))
xseq <- seq(120, 180, .01)
# First, the verbal distribution
Q3.3v.dens <- dnorm(xseq, Q3.3v.mean, Q3.3v.sd)
Q3.3v.densplot <- plot(xseq, Q3.3v.dens, type = "l", xlab = "", ylab = "", yaxt = 'n', lwd = 2, cex = 2, main = "GRE verbal", cex.axis = .8)
segments(Q3.3v.soph, 0, Q3.3v.soph, dnorm(Q3.3v.soph, Q3.3v.mean, Q3.3v.sd), col = "red")
text(168, .028, "Sophia's z-score \nof 1.286", cex = .8, col = "red")
# Next, the quantitative distribution
Q3.3q.dens <- dnorm(xseq, Q3.3q.mean, Q3.3q.sd)
Q3.3q.densplot <- plot(xseq, Q3.3q.dens, type = "l", xlab = "", ylab = "", yaxt = 'n', lwd = 2, cex = 2, main = "GRE quantitative", cex.axis = .8)
segments(Q3.3q.soph, 0, Q3.3q.soph, dnorm(Q3.3q.soph, Q3.3q.mean, Q3.3q.sd), col = "red")
text(167, .044, "Sophia's z-score \nof .522", cex = .8, col = "red")
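The plots above are drawn on the raw score scale, while the prompt asks for the standard normal distribution. A possible sketch of that version follows: both Z-scores are marked on a single N(0, 1) curve. The variable names, colors, and legend are illustrative choices, not taken from the original lab.

```r
# Z-scores computed from the values given in the problem
z_verbal <- (160 - 151) / 7
z_quant  <- (157 - 153) / 7.67

# Standard normal density, N(0, 1), over +/- 4 standard deviations
z <- seq(-4, 4, by = 0.01)
plot(z, dnorm(z), type = "l", lwd = 2,
     xlab = "Z", ylab = "", yaxt = "n", main = "Standard normal")

# Marker heights come straight from the density at each Z-score
segments(z_verbal, 0, z_verbal, dnorm(z_verbal), col = "red")
segments(z_quant,  0, z_quant,  dnorm(z_quant),  col = "blue")
legend("topleft", legend = c("Verbal Z", "Quantitative Z"),
       col = c("red", "blue"), lty = 1, bty = "n")
```

Putting both markers on one curve makes the relative standing on the two sections directly comparable.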

What do these Z-scores tell us?

Both Z-scores are positive, so Sophia scored above the test-taking population’s mean on each section. Her verbal Z-score of ~1.286 places her about 1.3 standard deviations above the mean. Her quantitative Z-score of ~0.522 places her about half a standard deviation above the mean.

Relative to others, which section did she do better on?

Her verbal score is higher both in raw terms (160 vs. 157) and as a Z-score (1.286 vs. 0.522). The Z-score is the fairer comparison because it lets us compare scores from two approximately normal distributions with different means and standard deviations. It shows that, relative to the population for each section, her verbal performance was better than her quantitative performance.

Find her percentile score for these two exams

Referring to the normal probability table on p428:

Sophia’s verbal percentile is approximately .8997, i.e. about the 90th percentile (closer to .9 based on the third decimal place of the Z-score, presumably)
Sophia’s quantitative percentile is approximately .6985, i.e. about the 70th percentile
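The table lookup can be cross-checked in R with `pnorm()`, which returns the area to the left of a value under a normal curve. The following is a minimal sketch with illustrative variable names that are not in the original lab. Because the printed table rounds Z to two decimals, the results differ slightly from the table values.

```r
# Z-scores computed from the values given in the problem
z_verbal <- (160 - 151) / 7
z_quant  <- (157 - 153) / 7.67

# Area to the left of each Z-score under N(0, 1)
ptl_verbal <- pnorm(z_verbal)
ptl_quant  <- pnorm(z_quant)

round(c(verbal = ptl_verbal, quantitative = ptl_quant), 3)
```

Both values should land close to the table readings, at roughly the 90th and 70th percentiles.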

What % of test takers did better than her on verbal? on quantitative?

Q3.3v.soph.ptl = .8997
Q3.3q.soph.ptl = .6985
Q3.3v.soph.above = 1 - Q3.3v.soph.ptl
Q3.3q.soph.above = 1 - Q3.3q.soph.ptl
cat("", (round((100 * Q3.3v.soph.above), 1)), "% of test takers beat Sophia on verbal and\n", (round((100 * Q3.3q.soph.above), 1)), "% of test takers did better on quantitative.")

##  10 % of test takers beat Sophia on verbal and
##  30.1 % of test takers did better on quantitative.
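As an alternative to subtracting table values from 1, the upper-tail area can be requested directly. A sketch under the same stated parameters follows; `lower.tail = FALSE` is a standard argument of `pnorm()`, while the variable names are illustrative.

```r
# Proportion scoring above Sophia, straight from raw scores and parameters
above_verbal <- pnorm(160, mean = 151, sd = 7, lower.tail = FALSE)
above_quant  <- pnorm(157, mean = 153, sd = 7.67, lower.tail = FALSE)

cat(round(100 * above_verbal, 1), "% scored higher on verbal\n")
cat(round(100 * above_quant, 1), "% scored higher on quantitative\n")
```

This skips the table's rounding of Z to two decimals, so the verbal figure comes out marginally below the 10% obtained from the table.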

Explain why simply comparing raw scores from the two sections could lead to an incorrect conclusion as to which section a student did better on.

Sophia achieved a higher raw score on verbal than on quantitative; she also did better than ~90% of test takers on verbal and ~70% on quantitative, so here both comparisons agree. They need not: because the two sections have different means and standard deviations, a test taker could get a higher raw score on one section but a higher percentile on the other. Z-scores allow us to compare performance relative to each section’s distribution, which is an important function of standardized tests.
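To make the possibility of a mismatch concrete, here is a small sketch with a hypothetical test taker. The scores 160 and 162 are invented for illustration only; the population parameters are those given in the problem.

```r
# Hypothetical student (scores invented for illustration only)
verbal_raw <- 160
quant_raw  <- 162

# Population parameters from the problem statement
z_verbal <- (verbal_raw - 151) / 7
z_quant  <- (quant_raw - 153) / 7.67

# Which section looks better by raw score, and which by Z-score?
higher_raw <- if (quant_raw > verbal_raw) "quantitative" else "verbal"
higher_z   <- if (z_quant > z_verbal) "quantitative" else "verbal"

cat("Higher raw score:", higher_raw, "\n")
cat("Higher Z-score:  ", higher_z, "\n")
```

For this invented student the raw score favors quantitative while the Z-score favors verbal, which is exactly the situation in which comparing raw scores would mislead.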

If the distributions of the scores on these exams are not nearly normal, how would the answers to (b)-(f) change? Explain reasoning.

The Z-score serves as a tool to compare different normal distributions. Z-scores could still be calculated if the distributions weren’t approximately normal, but we could no longer use the normal probability table to convert them into percentiles. Accordingly, we couldn’t easily compare Sophia’s performance on each test to that of other test takers, and we would need more information about the distributions in order to understand her relative performance.
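A rough illustration of why the normal table stops being reliable: the sketch below uses an exponential distribution as a stand-in for a skewed score distribution. This choice is purely an assumption for demonstration and says nothing about how GRE scores are actually distributed.

```r
# Illustrative skewed distribution (an assumption, not GRE data):
# exponential with rate 1, which has mean 1 and standard deviation 1
mu    <- 1
sigma <- 1
x     <- 2

# The Z-score is still computable for a non-normal distribution
z <- (x - mu) / sigma

# Percentile if we wrongly assume normality vs. the actual percentile
ptl_normal_approx <- pnorm(z)
ptl_true          <- pexp(x, rate = 1)

round(c(normal_approx = ptl_normal_approx, true = ptl_true), 4)
```

The two percentiles disagree by a couple of percentage points here, and the gap can be larger for other shapes or other values of `x`.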


Jeremy O’Brien

February 28, 2018
