library(grid)
library(futile.logger)
library(VennDiagram)

## Warning: package 'VennDiagram' was built under R version 3.4.4

library(knitr)
library(ggplot2)

## Warning: package 'ggplot2' was built under R version 3.4.4

library('DATA606') # Load the package

## Loading required package: shiny

## Warning: package 'shiny' was built under R version 3.4.4

## Loading required package: openintro

## Please visit openintro.org for free statistics materials

## 
## Attaching package: 'openintro'

## The following object is masked from 'package:ggplot2':
## 
##     diamonds

## The following objects are masked from 'package:datasets':
## 
##     cars, trees

## Loading required package: OIdata

## Loading required package: RCurl

## Warning: package 'RCurl' was built under R version 3.4.4

## Loading required package: bitops

## Loading required package: maps

## Warning: package 'maps' was built under R version 3.4.4

## Loading required package: markdown

## 
## Welcome to CUNY DATA606 Statistics and Probability for Data Analytics 
## This package is designed to support this course. The text book used 
## is OpenIntro Statistics, 3rd Edition. You can read this by typing 
## vignette('os3') or visit www.OpenIntro.org. 
##  
## The getLabs() function will return a list of the labs available. 
##  
## The demo(package='DATA606') will list the demos that are available.

## 
## Attaching package: 'DATA606'

## The following object is masked from 'package:utils':
## 
##     demo

3.3 GRE scores, Part I. Sophia, who took the Graduate Record Examination (GRE), scored 160 on the Verbal Reasoning section and 157 on the Quantitative Reasoning section. The mean score for the Verbal Reasoning section for all test takers was 151 with a standard deviation of 7, and the mean score for the Quantitative Reasoning section was 153 with a standard deviation of 7.67. Suppose that both distributions are nearly normal.

(a) Write down the short-hand for these two normal distributions.
Verbal: N(μ=151,σ=7) | Quant: N(μ=153,σ=7.67)
(b) What is Sophia’s Z-score on the Verbal Reasoning section? On the Quantitative Reasoning section? Draw a standard normal distribution curve and mark these two Z-scores.

\[\text{Verbal:}\quad Z=\cfrac{160-151}{7}=1.285714\] Rounded to 2 decimal places, this is 1.29.
\[\text{Quant:}\quad Z=\cfrac{157-153}{7.67}=0.521512\] Rounded to 2 decimal places, this is 0.52.
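The same arithmetic can be checked in R. This is a minimal sketch; the variable names `verbal_z` and `quant_z` are hypothetical and do not appear in the original write-up, while the scores, means, and standard deviations are the ones given in the exercise.

```r
# Z = (observed score - mean) / standard deviation
# Variable names are illustrative; the numbers come from the exercise text.
verbal_z <- (160 - 151) / 7
quant_z  <- (157 - 153) / 7.67

# Round to 2 decimal places for reporting
round(c(verbal = verbal_z, quant = quant_z), 2)
```

Keeping the unrounded values in variables avoids retyping them in the later plotting and `pnorm()` calls.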

curve(dnorm, from = -3, to=3)
abline(v=1.285714, col="purple")
abline(v=0.521512, col="red")
text(1.285714+1, 0.3, "Verbal: 1.29",col="purple") 
text(0.521512-1, 0.3, "Quant: 0.52", col="red")

(c) What do these Z-scores tell you?
She scored 1.29 standard deviations above the mean on the Verbal Reasoning section and 0.52 standard deviations above the mean on the Quantitative Reasoning section.
(d) Relative to others, which section did she do better on?
She did better on the Verbal Reasoning section since her Z-score on that section was higher.
(e) Find her percentile scores for the two exams.

We can use pnorm() to get the percentiles under the normal model.

pnorm(1.285714)

## [1] 0.9007286

pnorm(0.5215124)

## [1] 0.6989951

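Verbal: pnorm() gives 0.9007 for the unrounded Z-score (0.9015 if the rounded Z-score of 1.29 is used), so she scored at about the 90th percentile.
Quantitative: pnorm() gives 0.6990 for the unrounded Z-score (0.6985 if the rounded Z-score of 0.52 is used), so she scored at about the 70th percentile.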
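As a cross-check, `pnorm()` also accepts the raw score together with the mean and standard deviation, so the Z-score does not have to be computed by hand first. This is a sketch only; `verbal_pct` and `quant_pct` are hypothetical names introduced for illustration.

```r
# Percentile = P(X <= observed score) under the stated normal model.
# Variable names are illustrative; the parameters come from the exercise text.
verbal_pct <- pnorm(160, mean = 151, sd = 7)
quant_pct  <- pnorm(157, mean = 153, sd = 7.67)

# Express as whole-number percentiles
round(100 * c(verbal = verbal_pct, quant = quant_pct))
```

Both forms agree because `pnorm(x, mean, sd)` standardizes `x` internally before evaluating the standard normal distribution function.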

(f) What percent of the test takers did better than her on the Verbal Reasoning section? On the Quantitative Reasoning section?
Given the above, about 10% of test takers did better than her on the Verbal Reasoning section (100-90) and about 30% of test takers did better than her on the Quantitative Reasoning section (100-70).
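The upper-tail proportions can also be obtained directly rather than by subtracting from 100. A minimal sketch, using the `lower.tail = FALSE` argument of `pnorm()`; the names `verbal_above` and `quant_above` are hypothetical.

```r
# P(X > observed score): share of test takers who scored higher.
# Variable names are illustrative; the parameters come from the exercise text.
verbal_above <- pnorm(160, mean = 151, sd = 7, lower.tail = FALSE)
quant_above  <- pnorm(157, mean = 153, sd = 7.67, lower.tail = FALSE)

# Express as percentages with one decimal place
round(100 * c(verbal = verbal_above, quant = quant_above), 1)
```

Rounded to whole percentages, these match the 10% and 30% figures obtained by subtraction.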
(g) Explain why simply comparing her raw scores from the two sections would lead to the incorrect conclusion that she did better on the Quantitative Reasoning section.
We cannot compare the raw scores since they are on different scales. Comparing her percentile scores is more appropriate when comparing her performance to others.
(h) If the distributions of the scores on these exams are not nearly normal, would your answers to parts (b) - (f) change? Explain your reasoning.
Answer to part (b) would not change as Z-scores can be calculated for distributions that are not normal. However, we could not answer parts (d)-(f) since we cannot use the normal probability table to calculate probabilities and percentiles without a normal model.
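To illustrate the last point, the sketch below uses a deliberately non-normal distribution. The exponential distribution with rate 1 is an assumption chosen purely for illustration (it has mean 1 and standard deviation 1); it is not part of the exercise, and the variable names are hypothetical.

```r
# Assumption for illustration only: values follow an exponential
# distribution with rate 1 (mean 1, sd 1), which is right-skewed.
x <- 2
z <- (x - 1) / 1                    # the Z-score is still well defined

normal_model <- pnorm(z)            # percentile a normal model would report
actual       <- pexp(x, rate = 1)   # percentile under the exponential itself

c(z = z, normal_model = normal_model, actual = actual)
```

The Z-score is computed the same way in both cases, but the percentile implied by the normal model (about 0.84) differs from the percentile under the skewed distribution (about 0.86), which is why the percentile-based answers depend on the normality assumption.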

Appendix

library(grid)
library(futile.logger)
library(VennDiagram)
library(knitr)
library(ggplot2)
library('DATA606') # Load the package
curve(dnorm, from = -3, to=3)
abline(v=1.285714, col="purple")
abline(v=0.521512, col="red")
text(1.285714+1, 0.3, "Verbal: 1.29",col="purple") 
text(0.521512-1, 0.3, "Quant: 0.52", col="red")
pnorm(1.285714)
pnorm(0.5215124)

Exercise 3.3

Ashish Kumar

10/08/2018
