Case-scenario 1

HR_before <- c(11, 13, 12)

wanted_HR <- 20

n_seasons <- 4

x_4 <- n_seasons*wanted_HR - sum(HR_before)

x_4
## [1] 44
Robert_HRs <- c(11, 13, 12, 44)
# Find mean
mean(Robert_HRs)
## [1] 20
sd(Robert_HRs)
## [1] 16.02082
max(Robert_HRs)
## [1] 44
min(Robert_HRs)
## [1] 11
summary(Robert_HRs)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   11.00   11.75   12.50   20.00   20.75   44.00
fivenum(Robert_HRs)
## [1] 11.0 11.5 12.5 28.5 44.0
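
The calculation above can be wrapped in a small function so it can be reused for similar problems. This is only a sketch: the name `needed_for_mean` and its arguments are hypothetical and do not appear in the original solution.

```r
# Hypothetical helper: value needed in one additional season so that the
# mean over all seasons (previous ones plus the new one) equals `target`.
needed_for_mean <- function(previous, target) {
  n_total <- length(previous) + 1
  n_total * target - sum(previous)
}

HR_before <- c(11, 13, 12)
x_4 <- needed_for_mean(HR_before, target = 20)
x_4

# Check: appending the result to the earlier seasons reproduces the target mean
mean(c(HR_before, x_4))
```

The check at the end confirms the result by recomputing the mean with the new value included.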

Question 1

Now complete the problem below, which represents a similar case scenario. You may use the steps executed in Case-scenario 1 as a template for your solution.

This is outfielder Juan Soto's sixth season in the majors. If he received 79, 108, 41, 145, and 135 walks during his first five seasons, how many walks does he need this season for his average number of walks per season to be at least 100?

Soto_Walks_Before <- c(79, 108, 41, 145, 135)

wanted_BB <- 100

n_soto_seasons <- 6

Soto_Walks_6 <- n_soto_seasons*wanted_BB - sum(Soto_Walks_Before)
Soto_Walks_6
## [1] 92
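
Because the question asks for an average of at least 100, it may help to confirm that 92 is the smallest number of walks that works. The short check below is a sketch using only the walk totals stated in the question.

```r
Soto_Walks_Before <- c(79, 108, 41, 145, 135)

# With 92 walks the six-season mean reaches the target
mean_with_92 <- mean(c(Soto_Walks_Before, 92))
mean_with_92 >= 100

# With one walk fewer the mean falls short of the target
mean_with_91 <- mean(c(Soto_Walks_Before, 91))
mean_with_91 >= 100
```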

Case-scenario 2

The average salary of 10 baseball players is 72,000 dollars a week and the average salary of 4 soccer players is 84,000. Find the mean salary of all 14 professional players.

n_1 <- 10
n_2 <- 4
y_1 <- 72000
y_2 <- 84000
# Mean salary overall
salary_ave <-  (n_1*y_1 + n_2*y_2)/(n_1+n_2)
salary_ave
## [1] 75428.57
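
The combined mean above is a weighted mean of the two group averages, with the group sizes as weights. As a cross-check, the same value can be obtained with base R's `weighted.mean()`. The variable names below are illustrative only.

```r
group_means <- c(72000, 84000)  # average weekly salary in each group
group_sizes <- c(10, 4)         # number of players in each group

# Weighted mean: sum(group_means * group_sizes) / sum(group_sizes)
salary_ave_check <- weighted.mean(group_means, w = group_sizes)
salary_ave_check
```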

Question 2

The average salary of 7 basketball players is 102,000 dollars a week and the average salary of 9 NFL players is 91,000. Find the mean salary of all 16 professional players.

n_3 <- 7
y_3 <- 102000
n_4 <- 9
y_4 <- 91000
# Mean salary overall
salary_ave_2 <- (n_3*y_3 + n_4*y_4)/(n_3 + n_4)
salary_ave_2
## [1] 95812.5

Case-scenario 3

The frequency distribution below lists the number of active players in the Barclays Premier League and the time left in their contract.

contract_length <- read.table("allcontracts.csv", header = TRUE, sep = ",")
contract_years <- contract_length$years
# Mean 
contracts_mean  <- mean(contract_years)
contracts_mean
## [1] 3.458918
# Median
contracts_median <- median(contract_years)
contracts_median
## [1] 3
# Find number of observations
contracts_n <- length(contract_years)
# Find standard deviation
contracts_sd <- sd(contract_years)
# Proportion of observations with a z-score below 1
# (no abs() is applied, so observations far below the mean are also counted)
contracts_w1sd <- sum((contract_years - contracts_mean)/contracts_sd < 1)/contracts_n
contracts_w1sd
## [1] 0.8416834

What percentage of the data lies within one standard deviation of the mean?

# Difference from the 68% expected under the empirical rule
contracts_w1sd - 0.68
## [1] 0.1616834

What percentage of the data lies within two standard deviations of the mean?

# Proportion of observations with a z-score below 2
contracts_w2sd <- sum((contract_years - contracts_mean)/contracts_sd < 2)/contracts_n
contracts_w2sd
## [1] 1
# Difference from the 95% expected under the empirical rule
contracts_w2sd - 0.95
## [1] 0.05

As we already knew, 100% of the data is within 2 standard deviations of the mean, so this calculation was not necessary.
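
The proportions computed above compare each z-score to 1 or 2 without taking its absolute value, so observations far below the mean are counted as well. A two-sided count, which is what "within k standard deviations" usually means, could be sketched as follows. The helper `prop_within_sd` and the vector `years_example` are hypothetical: the data are made up for illustration and are not taken from `allcontracts.csv`.

```r
# Hypothetical helper: share of observations whose z-score has
# absolute value below k (a two-sided count).
prop_within_sd <- function(x, k = 1) {
  z <- (x - mean(x)) / sd(x)
  mean(abs(z) < k)
}

# Made-up example data, NOT the contract data from the file
years_example <- c(1, 2, 2, 3, 3, 3, 4, 4, 5, 7)

two_sided_1 <- prop_within_sd(years_example, k = 1)
two_sided_2 <- prop_within_sd(years_example, k = 2)

# One-sided version (no abs()), for comparison with the two-sided count
one_sided_1 <- mean((years_example - mean(years_example)) / sd(years_example) < 1)

c(two_sided_1, two_sided_2, one_sided_1)
```

The one-sided proportion can never be smaller than the two-sided one, because every observation within k standard deviations on both sides also has a z-score below k.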

# Create histogram
hist(contract_years,
     xlab = "Years Left in Contract",
     col = "blue", border = "black",
     xlim = c(0, 8), ylim = c(0, 225),
     breaks = 6)
