Let us continue getting started with R as we start discussing important statistical concepts.

Case-scenario 1

Kate must take four quizzes in a math class. If her scores on the first three quizzes are 71, 69, and 79, what score does she need on the final quiz for her overall mean to be at least 70?

Solution

Given that \(x_1 = 71, x_2 = 69, x_3 = 79\),

we want to find \(x_4\) such that the mean (average) grade satisfies \(\bar{x} \geq 70\).

Notice that in this case \(n = 4\).

At the boundary, where the mean is exactly 70: \(70 \times 4 = 71 + 69 + 79 + x_4\)

so \(x_4 = 280 - 219 = 61\). When \(x_4 = 61\), the quiz average will be exactly 70, and any higher score raises it above 70.

# Grades so far
grades_before <- c(71, 69, 79)
# Average quiz grade wanted
wanted_grade <- 70
# Number of quizzes
n_quizzes <- 4
# Needed grade on quiz 4
x_4 <- n_quizzes*wanted_grade - sum(grades_before)
# Minimum grade needed by Kate
x_4
[1] 61

According to the calculations above, Kate must score 61 or better on the final quiz to get an average quiz grade of at least 70.
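The same arithmetic can be wrapped in a small helper so that it works for any number of earlier quizzes. The function name `needed_score` and its arguments are hypothetical, introduced here only for illustration; this is a minimal sketch that assumes exactly one quiz remains.

```r
# Hypothetical helper: minimum score needed on the one remaining quiz
# so that the overall mean reaches `target`
needed_score <- function(previous, target) {
  n_total <- length(previous) + 1   # quizzes taken so far plus the final one
  n_total * target - sum(previous)
}

# Kate's case: three quizzes so far, target mean of 70
kate_needed <- needed_score(c(71, 69, 79), 70)
kate_needed
```

For Kate's grades this reproduces the value 61 found above.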

We can confirm this by using the mean() function in R:

# Quiz grades
kate_grades <- c(71, 69, 79, 61)
# Find mean
mean(kate_grades)
[1] 70
# Find standard deviation
sd(kate_grades)
[1] 7.393691
# Find maximum grade
max(kate_grades)
[1] 79
# Find minimum grade
min(kate_grades)
[1] 61
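To see what sd() computes, the sample standard deviation can be rebuilt step by step. The sketch below assumes the usual sample formula with \(n - 1\) in the denominator, which is what sd() uses; the variable names `sq_dev` and `sd_by_hand` are illustrative only.

```r
kate_grades <- c(71, 69, 79, 61)
n <- length(kate_grades)

# Squared deviations of each grade from the mean
sq_dev <- (kate_grades - mean(kate_grades))^2

# Sample variance divides by n - 1, not n; the standard deviation is its square root
sd_by_hand <- sqrt(sum(sq_dev) / (n - 1))
sd_by_hand

# Check against the built-in function
all.equal(sd_by_hand, sd(kate_grades))
```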

We can also use the summary() function to find basic statistics, including the median!

summary(kate_grades)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
     61      67      70      70      73      79 

Next, I would like you to explain in detail every single task we completed above. In addition, let us deal with a similar case scenario and complete every single task we executed in Case-scenario 1.

Frank must take six quizzes in a Physics class. If his scores on the first five quizzes are 41, 69, 63, 94, and 99, what score does he need on the final quiz for his overall mean to be at least 70?

Now let us go back to Case-scenario 1

Another useful function is quantile(), which finds the sample quantiles (percentiles) of a data set:

# the 25th percentile (first quartile)
quantile(kate_grades, 1/4)
25% 
 67 
# the 75th percentile (third quartile)
quantile(kate_grades, 3/4)
75% 
 73 
# the function IQR finds the interquartile range
# IQR(x) = quantile(x, 3/4) - quantile(x, 1/4)
IQR(kate_grades)
[1] 6
?quantile
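The value 67 is not one of Kate's grades because quantile() interpolates between neighbouring sorted values. The sketch below reproduces the first quartile by hand, assuming R's default method (type = 7 in ?quantile); the names `h`, `lo`, `hi` and `q1_by_hand` are illustrative.

```r
kate_grades <- c(71, 69, 79, 61)
x <- sort(kate_grades)          # 61 69 71 79
n <- length(x)
p <- 1/4

# Fractional position of the quantile within the sorted data
h  <- 1 + (n - 1) * p           # 1.75 for n = 4, p = 1/4
lo <- floor(h)
hi <- ceiling(h)

# Linear interpolation between the two neighbouring order statistics
q1_by_hand <- x[lo] + (h - lo) * (x[hi] - x[lo])
q1_by_hand
```

Here the position 1.75 lies three quarters of the way from 61 to 69, which gives 61 + 0.75 × 8 = 67.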

Make comments about the output and run a similar query using Frank_grades.

Case-scenario 2

The average salary of 10 men is 72,000 and the average salary of 4 women is 84,000. Find the mean salary of all 14 people.

Solution

We can find the combined mean by adding the total salary of each group (its mean times its size) and dividing by the total number of people. Simply averaging the two means would ignore the different group sizes.

Let \(n_1 = 10\) denote the number of men and \(y_1 = 72000\) their mean salary. Let \(n_2 = 4\) denote the number of women and \(y_2 = 84000\) their mean salary. Then the mean salary of all 14 individuals is: \(\frac{n_1 y_1 + n_2 y_2}{n_1 + n_2}\)

We can compute this in R as follows:

n_1 <- 10
n_2 <- 4
y_1 <- 72000
y_2 <- 84000
# Mean salary overall
salary_ave <-  (n_1*y_1 + n_2*y_2)/(n_1+n_2)
salary_ave
[1] 75428.57
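The combined mean is a weighted mean of the group means, with the group sizes as weights, so base R's weighted.mean() gives the same result. The vector names `group_means` and `group_sizes` below are illustrative.

```r
group_means <- c(72000, 84000)   # mean salary of men, women
group_sizes <- c(10, 4)          # number of men, women

# Weighted mean: each group mean counts in proportion to its size
combined <- weighted.mean(group_means, w = group_sizes)
combined

# For comparison: the plain average of the two means ignores group sizes
naive <- mean(group_means)
naive
```

The plain average, 78000, overstates the combined mean because it gives the 4 women the same weight as the 10 men.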

Solve a similar problem by changing the number of men and women as well as the average income for each group. Make comments about the output.

Case-scenario 3

The frequency distribution below lists the results of a test given in Professor Wang’s String theory class.

Score   Number of students
10      5
9       10
8       6
7       8
6       3
5       2
  1. Find the mean, the median, and the standard deviation of the scores.

  2. What percentage of the data lies within one standard deviation of the mean?

  3. What percentage of the data lies within two standard deviations of the mean?

  4. What percentage of the data lies within three standard deviations of the mean?

  5. Draw a histogram to illustrate the data.

Solution

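The allScores.csv file contains all the students’ scores on the test. We can read this file in R using the read.table() function with sep = "," (read.csv() is a wrapper that uses header = TRUE and sep = "," by default). Hint: first create the CSV file yourself. The results below correspond to a Score column with one row per student (34 rows); if you instead store the frequency table as 6 rows and 2 columns, expand it with rep() before computing any statistics.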

getwd()
[1] "C:/Users/npenaper/Downloads"
scores <- read.table("allScores.csv", header = TRUE, sep = ",")
WangScores <- scores$Score
View(scores)
View(WangScores)
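If the CSV file is not at hand, the same 34 scores can be rebuilt directly from the frequency table with rep(). This is a sketch of an alternative to reading the file; the names `score_values`, `n_students` and `WangScores_from_table` are illustrative and do not appear in the original notebook.

```r
score_values <- c(10, 9, 8, 7, 6, 5)    # distinct scores from the table
n_students   <- c(5, 10, 6, 8, 3, 2)    # how many students got each score

# Repeat each score as many times as it was observed
WangScores_from_table <- rep(score_values, times = n_students)

length(WangScores_from_table)   # total number of students
mean(WangScores_from_table)
```

The length should equal the sum of the "Number of students" column, which is a quick check that the table was entered correctly.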

Make comments about the code we just ran above.

  1. Find the mean, the median, and the standard deviation
# Mean 
Scores_mean  <- mean(WangScores)
Scores_mean
[1] 8
# Median
Scores_median <- median(WangScores)
Scores_median
[1] 8
# Find number of observations
Scores_n <- length(WangScores)
# Find standard deviation
Scores_sd <- sd(WangScores)
  2. What percentage of the data lies within one standard deviation of the mean?
# Proportion of observations within one standard deviation of the mean
# (abs() makes the check two-sided: scores far below the mean are excluded too)
scores_w1sd <- sum(abs(WangScores - Scores_mean)/Scores_sd < 1)/ Scores_n
scores_w1sd
[1] 0.7058824
# Difference from the empirical rule (68%)
scores_w1sd - 0.68
[1] 0.02588235
  3. What percentage of the data lies within two standard deviations of the mean?
# Within 2 sd (two-sided, hence abs())
scores_w2sd <- sum(abs(WangScores - Scores_mean)/ Scores_sd < 2)/Scores_n
scores_w2sd
[1] 0.9411765
# Difference from the empirical rule (95%)
scores_w2sd - 0.95
[1] -0.008823529
  4. What percentage of the data lies within three standard deviations of the mean?
# Within 3 sd (two-sided, hence abs())
scores_w3sd <- sum(abs(WangScores - Scores_mean)/ Scores_sd < 3)/Scores_n
scores_w3sd
[1] 1
# Difference from the empirical rule (99.73%)
scores_w3sd - 0.9973
[1] 0.0027
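Being within k standard deviations of the mean is a two-sided condition, \(|x - \bar{x}| < k \cdot s\), so scores far below the mean must be excluded as well as scores far above it. The sketch below wraps that check in a hypothetical helper, `within_k_sd`, and rebuilds the scores from the frequency table so that it runs without the CSV file.

```r
# Scores rebuilt from the frequency table (one value per student)
scores <- rep(c(10, 9, 8, 7, 6, 5), times = c(5, 10, 6, 8, 3, 2))

# Hypothetical helper: proportion of x within k standard deviations of its mean
within_k_sd <- function(x, k) {
  z <- (x - mean(x)) / sd(x)   # standardized scores
  mean(abs(z) < k)             # abs() makes the check two-sided
}

# Proportions for k = 1, 2, 3
props <- sapply(1:3, function(k) within_k_sd(scores, k))
props
```

The three proportions can then be compared with the empirical-rule values 0.68, 0.95 and 0.9973, which apply to bell-shaped data.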

Explain the implications of the results obtained in this problem. In addition, create a similar query but this time addressing Frank_Scores.

  5. Draw a histogram
# Create histogram
hist(WangScores)
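Because the scores are whole numbers, one possible refinement is to centre each bin on a score so that every bar matches one row of the frequency table. The sketch below assumes the data described in the table (rebuilt with rep()); the break points, title and axis label are choices made for illustration.

```r
# Scores rebuilt from the frequency table (one value per student)
scores <- rep(c(10, 9, 8, 7, 6, 5), times = c(5, 10, 6, 8, 3, 2))

# Breaks at half-integers put each whole-number score in its own bin
h <- hist(scores,
          breaks = seq(4.5, 10.5, by = 1),
          main = "Test scores",
          xlab = "Score")

# Bar heights, from score 5 up to score 10
h$counts
```

The counts returned by hist() should match the "Number of students" column read from bottom to top.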

Explain the output and create a similar histogram for Frank_Scores.
