This is an R Markdown Notebook. When you execute code within the notebook, the results appear beneath the code.

Try executing this chunk by clicking the Run button within the chunk or by placing your cursor inside it and pressing Ctrl+Shift+Enter.

plot(cars)

Add a new chunk by clicking the Insert Chunk button on the toolbar or by pressing Ctrl+Alt+I.

mean(cars$speed)
## [1] 15.4
mean(cars$dist)
## [1] 42.98
max(cars$speed)
## [1] 25
max(cars$dist)
## [1] 120
4+1
## [1] 5
5-2
## [1] 3
2^2
## [1] 4
sqrt(25)
## [1] 5
2^5
## [1] 32
log(2.72) #natural log of 2.72 (approximately e), so the result is close to 1
## [1] 1.000632
log10(5)
## [1] 0.69897
log10(10)
## [1] 1
log10(100)
## [1] 2
#Compute logs with other bases using the base argument
log(10,base=5)
## [1] 1.430677
log(10,base=2)
## [1] 3.321928
log(1000,base=10)
## [1] 3
#Question_1: Compute the log base 5 of 10 and the log base 10 of 10.
log(10, base=5)
## [1] 1.430677
log(10,base=10)
## [1] 1
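
The `base` argument used above is a convenience for the change-of-base identity. The following is a minimal sketch of that identity; the variable names `x`, `b`, `direct`, and `by_hand` are illustrative and not part of the original notebook.

```r
# Change of base: log_b(x) equals log(x) / log(b), where log() is the natural log
x <- 10
b <- 5
direct <- log(x, base = b)    # same call as in the examples above
by_hand <- log(x) / log(b)    # natural logs only
all.equal(direct, by_hand)    # TRUE, up to floating-point tolerance

# log2() and log10() are built-in shortcuts for bases 2 and 10
all.equal(log2(8), 3)         # TRUE
```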

Computing some offensive metrics in Baseball

BA = (29/112)
BA
## [1] 0.2589286
Batting_Avg = round(BA,digits = 3)
Batting_Avg
## [1] 0.259
#Question_2: What is the batting average of a player with 42 hits in 212 at bats?
BA2 = round(42/212, digits = 3)
BA2
## [1] 0.198
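
Since the same calculation is repeated for each player, it could be wrapped in a small function. This is only a sketch: `batting_average` and its argument names are hypothetical and do not appear in the original notebook.

```r
# Hypothetical helper: batting average = hits / at bats, rounded to three decimals
batting_average <- function(hits, at_bats) {
  round(hits / at_bats, digits = 3)
}

batting_average(29, 112)   # 0.259, matching Batting_Avg above
batting_average(42, 212)   # 0.198, matching BA2 above

# Arguments can be vectors, so several players are handled in one call
both_players <- batting_average(c(29, 42), c(112, 212))
both_players
```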

Compute the OBP for a player with the following general stats

#AB=515,H=172,BB=84,HBP=5,SF=6
#OBP=(H+BB+HBP)/(AB+BB+HBP+SF)

OBP=(172+84+5)/(515+84+5+6)
OBP
## [1] 0.4278689
#Question_3: Compute the OBP for a player with the following general stats:
#AB=565,H=156,BB=65,HBP=3,SF=7

OBP3=round(((156+65+3)/(565+65+3+7)),digits = 3)
OBP3
## [1] 0.35

#Often you will want to test whether one value is less than, greater than, or equal to another.

3 == 8
## [1] FALSE
3 !=8
## [1] TRUE
3 <= 8
## [1] TRUE
3>4
## [1] FALSE
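
One caveat worth knowing when testing equality: `==` compares numbers exactly, which can surprise with decimal fractions. The short sketch below uses base R's `all.equal()` for a tolerance-based comparison; the variable names are illustrative.

```r
# Exact equality can fail for decimal fractions stored in binary floating point
exact_match <- 0.1 + 0.2 == 0.3
exact_match                          # FALSE

# all.equal() compares within a small tolerance; isTRUE() turns the result
# into a plain TRUE/FALSE
near_match <- isTRUE(all.equal(0.1 + 0.2, 0.3))
near_match                           # TRUE
```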

#The logical operators are & for logical AND, | for logical OR, and ! for NOT. These are some examples:

# Logical Disjunction (or)
FALSE | FALSE # False OR False
## [1] FALSE
# Logical Conjunction (and)
TRUE & FALSE #True AND False
## [1] FALSE
# Negation
! FALSE # Not False
## [1] TRUE
# Combination of statements
2 < 3 | 1 == 5 # 2<3 is True, 1==5 is False, True OR False is True
## [1] TRUE
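
The operators above also work element by element on whole vectors. The following sketch uses two made-up logical vectors, `a` and `b`, to illustrate this together with the base R helpers `any()` and `all()`.

```r
a <- c(TRUE, TRUE, FALSE, FALSE)   # illustrative values
b <- c(TRUE, FALSE, TRUE, FALSE)

a & b          # TRUE FALSE FALSE FALSE
a | b          # TRUE TRUE TRUE FALSE
!a             # FALSE FALSE TRUE TRUE

any(a & b)     # TRUE: at least one position where both are TRUE
all(a | b)     # FALSE: the last position is FALSE in both vectors
```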

Assigning Values to Variables

Total_Bases <- 6 + 5
Total_Bases*3
## [1] 33

#To see the variables that are currently defined, use ls (as in “list”)

ls()
## [1] "BA"          "BA2"         "Batting_Avg" "OBP"         "OBP3"       
## [6] "Total_Bases"

#To delete a variable, use rm (as in “remove”)

rm(Total_Bases)
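
To confirm that a variable is really gone, `exists()` can be checked before and after the call to `rm()`. This is a small sketch; `demo_total_bases` is a hypothetical name chosen so it does not clash with anything defined above.

```r
demo_total_bases <- 6 + 5                      # hypothetical variable for this demo
defined_before <- exists("demo_total_bases")   # TRUE
rm(demo_total_bases)
defined_after <- exists("demo_total_bases")    # FALSE
c(before = defined_before, after = defined_after)
```

Calling `rm(list = ls())` would remove every variable in the workspace at once, so it is best used deliberately.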

Vectors

#Question_4: Define two vectors, runs_per_9innings and hits_per_9innings, each with five elements.

runs_per_9innings <- c(10,15,20,30,1)
hits_per_9innings <- c(8,12,9,12,15)

runs_per_9innings
## [1] 10 15 20 30  1
hits_per_9innings
## [1]  8 12  9 12 15
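
Once defined, vectors can be combined element by element and filtered with logical conditions. The sketch below redefines the two vectors so it runs on its own; `runs_per_hit` and `high_hit_runs` are illustrative names, not part of the original exercise.

```r
runs_per_9innings <- c(10, 15, 20, 30, 1)
hits_per_9innings <- c(8, 12, 9, 12, 15)

# Arithmetic on vectors is element-wise
runs_per_hit <- runs_per_9innings / hits_per_9innings
round(runs_per_hit, 2)

# Logical indexing: keep the runs where hits exceed 10
high_hit_runs <- runs_per_9innings[hits_per_9innings > 10]
high_hit_runs                  # 15 30 1

length(runs_per_9innings)      # 5
```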

Data Frames

data.frame(bonus = c(2, 3, 1),#in millions 
           active_roster = c("yes", "no", "yes"), 
           salary = c(1.5, 2.5, 1))#in millions 
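
A data frame is easier to work with once it is stored under a name. The following sketch assigns the same data to a hypothetical variable `players`, then reads a column, filters rows, and adds a derived column (`total` is also an illustrative name).

```r
players <- data.frame(
  bonus = c(2, 3, 1),                      # in millions
  active_roster = c("yes", "no", "yes"),
  salary = c(1.5, 2.5, 1)                  # in millions
)

players$salary                                       # one column as a vector
active <- players[players$active_roster == "yes", ]  # rows 1 and 3
active

# Add a derived column: total pay in millions
players$total <- players$bonus + players$salary
str(players)
```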

How to Make a Random Sample

sample(1:10, size=5)
## [1] 6 8 1 3 7
bar <- data.frame(var1 = LETTERS[1:10], var2 = 1:10)
# Check data frame
bar
n <- 5

samplerows <- sample(1:nrow(bar), size=n) 
# print sample rows
samplerows
## [1]  8 10  2  3  4
# extract rows
barsample <- bar[samplerows, ]
# print sample
print(barsample)
##    var1 var2
## 8     H    8
## 10    J   10
## 2     B    2
## 3     C    3
## 4     D    4
bar[sample(1:nrow(bar), n), ]
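
Because `sample()` draws at random, the outputs shown above will differ from run to run. If reproducible results are needed, one common approach is to fix the random seed first with `set.seed()`. In this sketch the seed value 42 is arbitrary, and `first_draw` and `second_draw` are illustrative names.

```r
set.seed(42)                          # arbitrary seed value
first_draw <- sample(1:10, size = 5)

set.seed(42)                          # resetting the seed repeats the draw
second_draw <- sample(1:10, size = 5)

identical(first_draw, second_draw)    # TRUE
```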

Numerical measures of center and spread

Suppose the yearly compensations of MLB teams’ CEOs are sampled and the following values are found (in millions):

12 .4 5 2 50 8 3 1 4 0.25

sals <- c(12, .4, 5, 2, 50, 8, 3, 1, 4, 0.25)
# the average
mean(sals) 
## [1] 8.565
# the variance
var(sals)
## [1] 225.5145
# the standard deviation
sd(sals)
## [1] 15.01714
# the median
median(sals)
## [1] 3.5
# Tukey's five number summary, useful for boxplots
# five numbers: min, lower hinge, median, upper hinge, max
fivenum(sals)
## [1]  0.25  1.00  3.50  8.00 50.00
# summary statistics
summary(sals)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.250   1.250   3.500   8.565   7.250  50.000
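
To see what `var()` and `sd()` compute, the sample variance can be rebuilt by hand as the sum of squared deviations from the mean divided by n - 1. This is a sketch for illustration; `deviations`, `var_by_hand`, and `sd_by_hand` are names introduced here.

```r
sals <- c(12, .4, 5, 2, 50, 8, 3, 1, 4, 0.25)

n <- length(sals)
deviations <- sals - mean(sals)              # distance of each value from the mean
var_by_hand <- sum(deviations^2) / (n - 1)   # sample variance uses n - 1
sd_by_hand <- sqrt(var_by_hand)              # standard deviation is its square root

all.equal(var_by_hand, var(sals))            # TRUE
all.equal(sd_by_hand, sd(sals))              # TRUE
```

Note that the mean (8.565) sits well above the median (3.5) here, because the single value of 50 pulls the mean upward while leaving the median unchanged.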

How about the mode?

#Question_5: Get the first element of hits_per_9innings.

hits_per_9innings[1]
## [1] 8
#Question_6: Get the last element of hits_per_9innings.

hits_per_9innings[length(hits_per_9innings)]
## [1] 15
#Question_7: Find the most frequent value of hits_per_9innings.

#Define function to find the most frequent value
getMostFrequent <- function(X) {
  ux <- unique(X)
  ux[which.max(tabulate(match(X,ux)))]
}

getMostFrequent(hits_per_9innings)
## [1] 12
#Question_8: Summarize the following survey with the `table()` command:
#What is your favorite day of the week to watch baseball? A total of 10 fans submitted this survey.
#Saturday, Saturday, Sunday, Monday, Saturday,Tuesday, Sunday, Friday, Friday, Monday

game_day<-c("Saturday", "Saturday", "Sunday", "Monday", "Saturday","Tuesday", "Sunday", "Friday", "Friday", "Monday")

table(game_day)
## game_day
##   Friday   Monday Saturday   Sunday  Tuesday 
##        2        2        3        2        1
#Question_9: What is the most frequent answer recorded in the survey? Use the getMostFrequent function to compute the result.

getMostFrequent(game_day)
## [1] "Saturday"
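
When two or more values are tied for most frequent, `getMostFrequent` returns only the one that appears first. A possible alternative, sketched below with `table()`, returns every tied value; `getAllMostFrequent` is a hypothetical helper and not part of the original notebook.

```r
# Hypothetical helper: return every value that reaches the highest count
getAllMostFrequent <- function(x) {
  counts <- table(x)
  names(counts)[counts == max(counts)]   # values come back as character strings
}

single_mode <- getAllMostFrequent(c(8, 12, 9, 12, 15))
single_mode                                                 # "12"

tied_modes <- getAllMostFrequent(c("Fri", "Fri", "Mon", "Mon", "Sat"))
tied_modes                                                  # "Fri" "Mon"
```

Because `table()` stores values as names, numeric input is returned as text; wrapping the result in `as.numeric()` would convert it back when needed.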

When you save the notebook, an HTML file containing the code and output will be saved alongside it (click the Preview button or press Ctrl+Shift+K to preview the HTML file).

The preview shows you a rendered HTML copy of the contents of the editor. Consequently, unlike Knit, Preview does not run any R code chunks. Instead, the output of the chunk when it was last run in the editor is displayed.