In this assignment, we are tasked with exploring, analyzing, and modeling a Major League Baseball dataset in which each record represents a team season between 1871 and 2006 (the training set contains 2,276 such records). Each observation records the team's performance for that year, with all statistics adjusted to the length of a 162-game season. The problem statement for the main objective is: "Can we predict the number of wins for a team with the given attributes of each record?" To answer this question, our goal is to build a linear regression model on the training data that produces this prediction.
The data sets are provided in CSV format as moneyball-evaluation-data and moneyball-training-data. We will explore, prepare, and build our model with the training data, and further test the model with the evaluation data. Below is a short description of the variables within the datasets.
**INDEX: Identification Variable (do not use)
**TARGET_WINS: Number of wins
**TEAM_BATTING_H : Base Hits by batters (1B,2B,3B,HR)
**TEAM_BATTING_2B: Doubles by batters (2B)
**TEAM_BATTING_3B: Triples by batters (3B)
**TEAM_BATTING_HR: Homeruns by batters (4B)
**TEAM_BATTING_BB: Walks by batters
**TEAM_BATTING_HBP: Batters hit by pitch (get a free base)
**TEAM_BATTING_SO: Strikeouts by batters
**TEAM_BASERUN_SB: Stolen bases
**TEAM_BASERUN_CS: Caught stealing
**TEAM_FIELDING_E: Errors
**TEAM_FIELDING_DP: Double Plays
**TEAM_PITCHING_BB: Walks allowed
**TEAM_PITCHING_H: Hits allowed
**TEAM_PITCHING_HR: Homeruns allowed
**TEAM_PITCHING_SO: Strikeouts by pitchers
# load libraries
library(ggplot2)
library(ggcorrplot)
library(psych)##
## Attaching package: 'psych'
## The following objects are masked from 'package:ggplot2':
##
## %+%, alpha
#library(statsr)
library(dplyr)##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(PerformanceAnalytics)## Loading required package: xts
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
##
## Attaching package: 'xts'
## The following objects are masked from 'package:dplyr':
##
## first, last
##
## Attaching package: 'PerformanceAnalytics'
## The following object is masked from 'package:graphics':
##
## legend
library(tidyr)
library(reshape2)##
## Attaching package: 'reshape2'
## The following object is masked from 'package:tidyr':
##
## smiths
library(rcompanion)##
## Attaching package: 'rcompanion'
## The following object is masked from 'package:psych':
##
## phi
library(caret)## Loading required package: lattice
library(MASS)##
## Attaching package: 'MASS'
## The following object is masked from 'package:dplyr':
##
## select
library(imputeTS)## Registered S3 method overwritten by 'quantmod':
## method from
## as.zoo.data.frame zoo
##
## Attaching package: 'imputeTS'
## The following object is masked from 'package:zoo':
##
## na.locf
library(rsample)
library(huxtable)##
## Attaching package: 'huxtable'
## The following object is masked from 'package:dplyr':
##
## add_rownames
## The following object is masked from 'package:ggplot2':
##
## theme_grey
library(glmnet)## Loading required package: Matrix
##
## Attaching package: 'Matrix'
## The following objects are masked from 'package:tidyr':
##
## expand, pack, unpack
## Loaded glmnet 4.1-3
##
## Attaching package: 'glmnet'
## The following object is masked from 'package:imputeTS':
##
## na.replace
library(sjPlot)## Install package "strengejacke" from GitHub (`devtools::install_github("strengejacke/strengejacke")`) to load all sj-packages at once!
##
## Attaching package: 'sjPlot'
## The following object is masked from 'package:huxtable':
##
## font_size
library(modelr)# Load data sets
baseball_eva <- read.csv("https://raw.githubusercontent.com/anilak1978/data621/master/moneyball-evaluation-data.csv")
baseball_train <- read.csv("https://raw.githubusercontent.com/anilak1978/data621/master/moneyball-training-data.csv")We can start exploring our training data set by looking at basic descriptive statistics.
# look at training dataset structure
str(baseball_train)## 'data.frame': 2276 obs. of 17 variables:
## $ INDEX : int 1 2 3 4 5 6 7 8 11 12 ...
## $ TARGET_WINS : int 39 70 86 70 82 75 80 85 86 76 ...
## $ TEAM_BATTING_H : int 1445 1339 1377 1387 1297 1279 1244 1273 1391 1271 ...
## $ TEAM_BATTING_2B : int 194 219 232 209 186 200 179 171 197 213 ...
## $ TEAM_BATTING_3B : int 39 22 35 38 27 36 54 37 40 18 ...
## $ TEAM_BATTING_HR : int 13 190 137 96 102 92 122 115 114 96 ...
## $ TEAM_BATTING_BB : int 143 685 602 451 472 443 525 456 447 441 ...
## $ TEAM_BATTING_SO : int 842 1075 917 922 920 973 1062 1027 922 827 ...
## $ TEAM_BASERUN_SB : int NA 37 46 43 49 107 80 40 69 72 ...
## $ TEAM_BASERUN_CS : int NA 28 27 30 39 59 54 36 27 34 ...
## $ TEAM_BATTING_HBP: int NA NA NA NA NA NA NA NA NA NA ...
## $ TEAM_PITCHING_H : int 9364 1347 1377 1396 1297 1279 1244 1281 1391 1271 ...
## $ TEAM_PITCHING_HR: int 84 191 137 97 102 92 122 116 114 96 ...
## $ TEAM_PITCHING_BB: int 927 689 602 454 472 443 525 459 447 441 ...
## $ TEAM_PITCHING_SO: int 5456 1082 917 928 920 973 1062 1033 922 827 ...
## $ TEAM_FIELDING_E : int 1011 193 175 164 138 123 136 112 127 131 ...
## $ TEAM_FIELDING_DP: int NA 155 153 156 168 149 186 136 169 159 ...
TARGET_Wins<-as.numeric(baseball_train$TARGET_WINS)We have 2276 observations and 17 variables. All of our variables are integer type as expected.
# look at descriptive statistics
metastats <- data.frame(describe(baseball_train))
metastats <- tibble::rownames_to_column(metastats, "STATS")
metastats["pct_missing"] <- round(metastats["n"]/2276, 3)
head(metastats)| STATS | vars | n | mean | sd | median | trimmed | mad | min | max | range | skew | kurtosis | se | pct_missing |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| INDEX | 1 | 2.28e+03 | 1.27e+03 | 736 | 1.27e+03 | 1.27e+03 | 953 | 1 | 2.54e+03 | 2.53e+03 | 0.00421 | -1.22 | 15.4 | 1 |
| TARGET_WINS | 2 | 2.28e+03 | 80.8 | 15.8 | 82 | 81.3 | 14.8 | 0 | 146 | 146 | -0.399 | 1.03 | 0.33 | 1 |
| TEAM_BATTING_H | 3 | 2.28e+03 | 1.47e+03 | 145 | 1.45e+03 | 1.46e+03 | 114 | 891 | 2.55e+03 | 1.66e+03 | 1.57 | 7.28 | 3.03 | 1 |
| TEAM_BATTING_2B | 4 | 2.28e+03 | 241 | 46.8 | 238 | 240 | 47.4 | 69 | 458 | 389 | 0.215 | 0.00616 | 0.981 | 1 |
| TEAM_BATTING_3B | 5 | 2.28e+03 | 55.2 | 27.9 | 47 | 52.2 | 23.7 | 0 | 223 | 223 | 1.11 | 1.5 | 0.586 | 1 |
| TEAM_BATTING_HR | 6 | 2.28e+03 | 99.6 | 60.5 | 102 | 97.4 | 78.6 | 0 | 264 | 264 | 0.186 | -0.963 | 1.27 | 1 |
With the descriptive statistics, we can see the mean, standard deviation, median, min, and max values for each variable, along with pct_missing, which (despite its name) is the share of non-missing observations for each variable. For example, looking at TEAM_BATTING_H, we see an average of 1,469 base hits by batters, with a standard deviation of 144, a median of 1,454, and a maximum of 2,554 base hits.
# Look for missing values
colSums(is.na(baseball_train))## INDEX TARGET_WINS TEAM_BATTING_H TEAM_BATTING_2B
## 0 0 0 0
## TEAM_BATTING_3B TEAM_BATTING_HR TEAM_BATTING_BB TEAM_BATTING_SO
## 0 0 0 102
## TEAM_BASERUN_SB TEAM_BASERUN_CS TEAM_BATTING_HBP TEAM_PITCHING_H
## 131 772 2085 0
## TEAM_PITCHING_HR TEAM_PITCHING_BB TEAM_PITCHING_SO TEAM_FIELDING_E
## 0 0 102 0
## TEAM_FIELDING_DP
## 286
# Percentage of missing values
missing_values <- metastats %>%
filter(pct_missing < 1) %>%
dplyr::select(STATS, pct_missing) %>%
arrange(pct_missing)
missing_values| STATS | pct_missing |
|---|---|
| TEAM_BATTING_HBP | 0.084 |
| TEAM_BASERUN_CS | 0.661 |
| TEAM_FIELDING_DP | 0.874 |
| TEAM_BASERUN_SB | 0.942 |
| TEAM_BATTING_SO | 0.955 |
| TEAM_PITCHING_SO | 0.955 |
When we look at the missing values within the training data set relative to the total number of observations, TEAM_BATTING_HBP and TEAM_BASERUN_CS have the most missing values; TEAM_BATTING_HBP in particular is only about 8% populated. We will handle these missing values in our Data Preparation section.
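As a quick cross-check (a minimal base-R sketch), the share of missing values per column can also be computed directly:
# Proportion of missing values in each column (0 = complete)
round(colMeans(is.na(baseball_train)), 3)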
# Look at correlation between variables
baseball_train$TARGET_WINS<-as.numeric(baseball_train$TARGET_WINS)
corr <- round(cor(baseball_train), 1)
ggcorrplot(corr,
type="lower",
lab=TRUE,
lab_size=3,
method="circle",
colors=c("tomato2", "white", "springgreen3"),
title="Correlation of variables in Training Data Set",
ggtheme=theme_bw)## Warning: `guides(<scale> = FALSE)` is deprecated. Please use `guides(<scale> =
## "none")` instead.
TEAM_BATTING_H and TEAM_BATTING_2B have the strongest positive correlations with TARGET_WINS. We also see strong correlations among some of the predictors themselves, for example between TEAM_BATTING_H and TEAM_BATTING_2B, and between TEAM_PITCHING_H and TEAM_FIELDING_E. We will keep these findings in mind during model creation, since collinearity can complicate model estimation and we want the explanatory variables to be independent of one another. We will try to avoid including explanatory variables that are strongly correlated with each other.
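As a sketch of how this collinearity could be quantified, variance inflation factors flag predictors that are nearly linear combinations of the others. This assumes the car package is installed; it is not loaded elsewhere in this report:
library(car)
# VIF above roughly 5-10 is a common rule of thumb for problematic collinearity
vif(lm(TARGET_WINS ~ TEAM_BATTING_H + TEAM_BATTING_2B + TEAM_PITCHING_H + TEAM_FIELDING_E, data = baseball_train))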
Let’s look at the correlations and distribution of the variables in more detail.
# Look at correlation from batting, baserunning, pitching and fielding perspective
Batting_df <- baseball_train[c(2:7, 10)]
BaseRunning_df <- baseball_train[c(8:9)]
Pitching_df <- baseball_train[c(11:14)]
Fielding_df <- baseball_train[c(15:16)]# Batting Correlations
chart.Correlation(Batting_df, histogram=TRUE, pch=19)We can see that our response variable TARGET_WINS, along with TEAM_BATTING_H, TEAM_BATTING_2B, TEAM_BATTING_BB, and TEAM_BASERUN_CS, is roughly normally distributed. TEAM_BATTING_HR, on the other hand, is bimodal.
# baserunning Correlation
chart.Correlation(BaseRunning_df, histogram=TRUE, pch=19)TEAM_BASERUN_SB is right-skewed and TEAM_BATTING_SO is bimodal.
#pitching correlations
chart.Correlation(Pitching_df, histogram=TRUE, pch=19)TEAM_BATTING_HBP seems to be normally distributed; however, we shouldn't forget that this variable has a large number of missing values.
# fielding correlations
chart.Correlation(Fielding_df, histogram=TRUE, pch=19)Let's also look at the outliers and skewness for each variable.
par(mfrow=c(3,3))
datasub_1 <- melt(baseball_train)## No id variables; using all as measure variables
suppressWarnings(ggplot(datasub_1, aes(x= "value", y=value)) +
geom_boxplot(fill='lightblue') + facet_wrap(~variable, scales = 'free') )## Warning: Removed 3478 rows containing non-finite values (stat_boxplot).
Based on the boxplots we created, TEAM_FIELDING_DP, TEAM_PITCHING_HR, TEAM_BATTING_HR, and TEAM_BATTING_SO appear to have the fewest outliers.
par(mfrow = c(3, 3))
datasub = melt(baseball_train) ## No id variables; using all as measure variables
suppressWarnings(ggplot(datasub, aes(x= value)) +
geom_density(fill='lightblue') + facet_wrap(~variable, scales = 'free') )## Warning: Removed 3478 rows containing non-finite values (stat_density).
metastats %>%
filter(skew > 1) %>%
dplyr::select(STATS, skew) %>%
arrange(desc(skew))| STATS | skew |
|---|---|
| TEAM_PITCHING_SO | 22.2 |
| TEAM_PITCHING_H | 10.3 |
| TEAM_PITCHING_BB | 6.74 |
| TEAM_FIELDING_E | 2.99 |
| TEAM_BASERUN_CS | 1.98 |
| TEAM_BASERUN_SB | 1.97 |
| TEAM_BATTING_H | 1.57 |
| TEAM_BATTING_3B | 1.11 |
We can see that the most skewed variable is TEAM_PITCHING_SO. We will correct the skewed variables in our Data Preparation section.
When we create a linear regression model, we look for the fitted line that minimizes the sum of squared residuals. From our correlation analysis, the explanatory variable with the strongest correlation with TARGET_WINS is TEAM_BATTING_H. Let's look at a simple model example to further expand our exploratory analysis.
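For a single predictor, the least-squares coefficients have a closed form, so we can compute them directly (a minimal base-R sketch; the result should match coef() of the simple lm fit reported further below):
# Least-squares slope and intercept for TARGET_WINS ~ TEAM_BATTING_H
x <- baseball_train$TEAM_BATTING_H
y <- baseball_train$TARGET_WINS
b1 <- cov(x, y) / var(x) # slope that minimizes the sum of squared residuals
b0 <- mean(y) - b1 * mean(x) # the fitted line passes through the point of means
c(intercept = b0, slope = b1)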
#library(statsr)
# line that follows the best assocation between two variables
#plot_ss(x = TEAM_BATTING_H, y = TARGET_WINS, data=baseball_train, showSquares = TRUE, leastSquares = TRUE)When we are exploring how to build a linear regression, one of the first things we do is create a scatter plot of the response and explanatory variable.
# scatter plot between TEAM_BATTING_H and TARGET_WINS
ggplot(baseball_train, aes(x=TEAM_BATTING_H, y=TARGET_WINS))+
geom_point()One of the conditions for least-squares linear regression is linearity. From the scatter plot between TEAM_BATTING_H and TARGET_WINS, we can see this condition is met. We can also create a scatterplot that shows the data points between TARGET_WINS and each variable.
baseball_train %>%
gather(var, val, -TARGET_WINS) %>%
ggplot(., aes(val, TARGET_WINS))+
geom_point()+
facet_wrap(~var, scales="free", ncol=4)## Warning: Removed 3478 rows containing missing values (geom_point).
As we displayed earlier, hits, walks, and home runs have the strongest correlations with TARGET_WINS and also meet the linearity condition.
# create a simple example model
lm_sm <- lm(baseball_train$TARGET_WINS ~ baseball_train$TEAM_BATTING_H)
summary(lm_sm)##
## Call:
## lm(formula = baseball_train$TARGET_WINS ~ baseball_train$TEAM_BATTING_H)
##
## Residuals:
## Min 1Q Median 3Q Max
## -71.768 -8.757 0.856 9.762 46.016
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 18.562326 3.107523 5.973 2.69e-09 ***
## baseball_train$TEAM_BATTING_H 0.042353 0.002105 20.122 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 14.52 on 2274 degrees of freedom
## Multiple R-squared: 0.1511, Adjusted R-squared: 0.1508
## F-statistic: 404.9 on 1 and 2274 DF, p-value: < 2.2e-16
TEAM_BATTING_H has the strongest correlation with the TARGET_WINS response variable; however, a simple model using only TEAM_BATTING_H explains just 15% of the variability (Adjusted R-squared: 0.1508). The remainder of the variability may be explained by other selected variables within the training dataset.
#histogram of residuals for the simple model
hist(lm_sm$residuals)# check for constant variability (homoscedasticity)
plot(lm_sm$residuals ~ baseball_train$TEAM_BATTING_H)We see that the residuals are roughly normally distributed and that the variability around the regression line is roughly constant.
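A Q-Q plot of the residuals (a quick base-R sketch) gives a sharper view of the normality condition than the histogram alone:
# Compare residual quantiles with theoretical normal quantiles
qqnorm(lm_sm$residuals, ylab="Sample Quantiles for residuals")
qqline(lm_sm$residuals, col="blue")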
Based on our exploratory analysis, we were able to see the level of correlation between the possible explanatory variables and the response variable TARGET_WINS. Some variables, such as TEAM_BATTING_H, have a somewhat strong positive correlation, while others, such as TEAM_PITCHING_BB, have a weak positive relationship with TARGET_WINS. We also found that hit-by-pitch (TEAM_BATTING_HBP) and caught stealing (TEAM_BASERUN_CS) are missing the majority of their values. The skewness and distribution analysis showed that several variables are right-tailed. Considering all of these insights, we will handle missing values, correct skewness and outliers, and select our explanatory variables based on correlation in order to create our regression model.
In this section, we will prepare the dataset for linear regression modeling. We accomplish this by handling missing values and outliers and by transforming the data toward more normal distributions. This section covers:
*Identify and Handle Missing Data
*Correct Outliers
*Adjust Skewed Values - Box-Cox Transformation
First, we will start by copying the dataset into a new variable, baseball_train_01, and we will remove the Index variable from the new dataset as well. We will now have 16 variables.
baseball_train_01 <- baseball_train
baseball_train_01 <-subset(baseball_train_01, select = -c(INDEX))In the Data Exploration section, we identified several variables as having missing data values. The table below lists the variables with missing data. The variable TEAM_BATTING_HBP is sparsely populated. Assuming this data is Missing Completely at Random (MCAR) and is not related to any other variable, it is safe to remove the variable from the dataset entirely.
missing_values| STATS | pct_missing |
|---|---|
| TEAM_BATTING_HBP | 0.084 |
| TEAM_BASERUN_CS | 0.661 |
| TEAM_FIELDING_DP | 0.874 |
| TEAM_BASERUN_SB | 0.942 |
| TEAM_BATTING_SO | 0.955 |
| TEAM_PITCHING_SO | 0.955 |
baseball_train_01 <-subset(baseball_train_01, select = -c(TEAM_BATTING_HBP))There are now 15 variables.
dim(baseball_train_01)## [1] 2276 15
For the remaining variables with missing values, we will impute the mean of the variable. The function na_mean() updates all missing values with the mean of the variable.
baseball_train_01 <- na_mean(baseball_train_01, option = "mean") Re-running the metastats dataframe on the new baseball_train_01 dataset shows that there are no missing values.
# look at descriptive statistics
metastats <- data.frame(describe(baseball_train_01))
metastats <- tibble::rownames_to_column(metastats, "STATS")
metastats["pct_missing"] <- round(metastats["n"]/2276, 3)# Percentage of missing values
missing_values2 <- metastats %>%
filter(pct_missing < 1) %>%
dplyr::select(STATS, pct_missing) %>%
arrange(pct_missing)
missing_values2| STATS | pct_missing |
|---|
In this section, we created two functions that can identify outliers. The function Identify_Outlier uses the Tukey method, where outliers are values below Q1 - 1.5*IQR or above Q3 + 1.5*IQR. The second function, tag_outlier, returns a vector of labels, "Acceptable" or "Outlier", that will be added to the dataframe.
Identify_Outlier <- function(value){
interquartile_range <- IQR(value, na.rm = TRUE)
q1 <- quantile(value, 0.25, na.rm = TRUE) # first quartile
q3 <- quantile(value, 0.75, na.rm = TRUE) # third quartile
lower = q1-(1.5*interquartile_range)
upper = q3+(1.5*interquartile_range)
bound <- c(lower, upper)
return (bound)
}tag_outlier <- function(value) {
boundaries <- Identify_Outlier(value)
tags <- c()
counter = 1
for (i in as.numeric(value))
{
if (i >= boundaries[1] & i <= boundaries[2]){
tags[counter] <- "Acceptable"
} else{
tags[counter] <- "Outlier"
}
counter = counter +1
}
return (tags)
}As seen in the box plots from the previous section, “TEAM_BASERUN_SB”, “TEAM_BASERUN_CS”, “TEAM_PITCHING_H”, “TEAM_PITCHING_BB”, “TEAM_PITCHING_SO”, and “TEAM_FIELDING_E” all have a high number of outliers. We will use the two functions above to tag those rows with extreme outliers.
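As an aside, the tagging loop can be written as a one-line vectorized sketch. This assumes no missing values remain (which holds here, since they were mean-imputed above); tag_outlier_vec is a hypothetical alternative and is not used below:
tag_outlier_vec <- function(value) {
bounds <- Identify_Outlier(value)
ifelse(value >= bounds[1] & value <= bounds[2], "Acceptable", "Outlier")
}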
tags<- tag_outlier(baseball_train_01$TEAM_BASERUN_SB)
baseball_train_01$TEAM_BASERUN_SB_Outlier <- tags
tags<- tag_outlier(baseball_train_01$TEAM_BASERUN_CS)
baseball_train_01$TEAM_BASERUN_CS_Outlier <- tags
tags<- tag_outlier(baseball_train_01$TEAM_PITCHING_H)
baseball_train_01$TEAM_PITCHING_H_Outlier <- tags
tags<- tag_outlier(baseball_train_01$TEAM_PITCHING_BB)
baseball_train_01$TEAM_PITCHING_BB_Outlier <- tags
tags<- tag_outlier(baseball_train_01$TEAM_PITCHING_SO)
baseball_train_01$TEAM_PITCHING_SO_Outlier <- tags
tags<- tag_outlier(baseball_train_01$TEAM_FIELDING_E)
baseball_train_01$TEAM_FIELDING_E_Outlier <- tagsBelow, we filter out all of the outliers and create a new dataframe, baseball_train_02.
baseball_train_02 <- baseball_train_01 %>%
filter(
TEAM_BASERUN_SB_Outlier != "Outlier" &
TEAM_BASERUN_CS_Outlier != "Outlier" &
TEAM_PITCHING_H_Outlier != "Outlier" &
TEAM_PITCHING_BB_Outlier != "Outlier" &
TEAM_PITCHING_SO_Outlier != "Outlier" &
TEAM_FIELDING_E_Outlier != "Outlier"
)Re-running the boxplots shows distributions that are closer to normal, except for the variable TEAM_FIELDING_E, which is still skewed. We will handle this next.
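A quick sanity check (sketch) of how many observations the filter removed:
# 2276 rows before filtering, 1521 after: 755 team seasons dropped as outliers
nrow(baseball_train_01) - nrow(baseball_train_02)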
par(mfrow=c(3,3))
datasub_1 <- melt(baseball_train_02)## Using TEAM_BASERUN_SB_Outlier, TEAM_BASERUN_CS_Outlier, TEAM_PITCHING_H_Outlier, TEAM_PITCHING_BB_Outlier, TEAM_PITCHING_SO_Outlier, TEAM_FIELDING_E_Outlier as id variables
suppressWarnings(ggplot(datasub_1, aes(x= "value", y=value)) +
geom_boxplot(fill='lightblue') + facet_wrap(~variable, scales = 'free') )Removing the outliers brought each variable closer to a normal distribution, and checking the skewness of the variables confirms this, with the exception of TEAM_FIELDING_E. This variable is still skewed and not normal. In this section, we will use the Box-Cox transformation from the MASS library to normalize it.
metastats_02 <- data.frame(describe(baseball_train_02))
metastats_02 <- tibble::rownames_to_column(metastats_02, "STATS")
metastats_02 %>%
filter(skew > 1 | skew < -1) %>%
dplyr::select(STATS, skew) %>%
arrange(desc(skew))| STATS | skew |
|---|---|
| TEAM_FIELDING_E | 1.45 |
Looking at the histogram and QQ plots, we can confirm that the variable TEAM_FIELDING_E is not normally distributed; it is skewed to the right.
plotNormalHistogram(baseball_train_02$TEAM_FIELDING_E)qqnorm(baseball_train_02$TEAM_FIELDING_E,
ylab="Sample Quantiles for TEAM_FIELDING_E")
qqline(baseball_train_02$TEAM_FIELDING_E,
col="blue")The following Box Cox transformation section is based on the tutorial at the link below:
[Summary and Analysis of Extension Program Evaluation in R](https://rcompanion.org/handbook/I_12.html)
The Box-Cox procedure uses maximum likelihood to find the lambda with which to transform a variable toward a normal distribution: the transformed value is (y^lambda - 1)/lambda for lambda != 0, and log(y) when lambda = 0.
TEAM_FIELDING_E <- as.numeric(dplyr::pull(baseball_train_02, TEAM_FIELDING_E))
#Transforms TEAM_FIELDING_E as a single vector
Box = boxcox(TEAM_FIELDING_E ~ 1, lambda = seq(-6,6,0.1))#Creates a dataframe with results
Cox = data.frame(Box$x, Box$y)
# Order the new data frame by decreasing y to find the best lambda. This displays the lambda with the greatest log-likelihood.
Cox2 = Cox[with(Cox, order(-Cox$Box.y)),]
Cox2[1,] | Box.x | Box.y |
|---|---|
| -0.8 | -3.95e+03 |
#Extract that lambda and Transform the data
lambda = Cox2[1, "Box.x"]
T_box = (TEAM_FIELDING_E ^ lambda - 1)/lambdaWe can now see that the transformed TEAM_FIELDING_E is approximately normally distributed.
plotNormalHistogram(T_box)qqnorm(T_box, ylab="Sample Quantiles for TEAM_FIELDING_E")
qqline(T_box,
col="blue")baseball_train_02$TEAM_FIELDING_E <- T_boxThe density plots below show that all of the variables for the dataset baseball_train_02 are now normally distributed. In the next section, we will use this dataset to build the models and discuss the coefficients of the models.
par(mfrow = c(3, 3))
datasub = melt(baseball_train_02)
suppressWarnings(ggplot(datasub, aes(x= value)) +
geom_density(fill='lightblue') + facet_wrap(~variable, scales = 'free') )Viewing the dataframe shows that the dataset contains characters resulting from the transfromation of the outliers. These non numeric characters will impact our models especially if we build the intial baseline model with all the variables. We will need one more step to have our data ready for the models.
str(baseball_train_02)## 'data.frame': 1521 obs. of 21 variables:
## $ TARGET_WINS : num 70 82 75 80 85 76 78 87 88 66 ...
## $ TEAM_BATTING_H : int 1387 1297 1279 1244 1273 1271 1305 1417 1563 1460 ...
## $ TEAM_BATTING_2B : int 209 186 200 179 171 213 179 226 242 239 ...
## $ TEAM_BATTING_3B : int 38 27 36 54 37 18 27 28 43 32 ...
## $ TEAM_BATTING_HR : int 96 102 92 122 115 96 82 108 164 107 ...
## $ TEAM_BATTING_BB : int 451 472 443 525 456 441 374 539 589 546 ...
## $ TEAM_BATTING_SO : num 922 920 973 1062 1027 ...
## $ TEAM_BASERUN_SB : num 43 49 107 80 40 72 60 86 100 92 ...
## $ TEAM_BASERUN_CS : num 30 39 59 54 36 34 39 69 53 64 ...
## $ TEAM_PITCHING_H : int 1396 1297 1279 1244 1281 1271 1364 1417 1563 1478 ...
## $ TEAM_PITCHING_HR : int 97 102 92 122 116 96 86 108 164 108 ...
## $ TEAM_PITCHING_BB : int 454 472 443 525 459 441 391 539 589 553 ...
## $ TEAM_PITCHING_SO : num 928 920 973 1062 1033 ...
## $ TEAM_FIELDING_E : num 1.23 1.23 1.22 1.23 1.22 ...
## $ TEAM_FIELDING_DP : num 156 168 149 186 136 159 141 136 172 146 ...
## $ TEAM_BASERUN_SB_Outlier : chr "Acceptable" "Acceptable" "Acceptable" "Acceptable" ...
## $ TEAM_BASERUN_CS_Outlier : chr "Acceptable" "Acceptable" "Acceptable" "Acceptable" ...
## $ TEAM_PITCHING_H_Outlier : chr "Acceptable" "Acceptable" "Acceptable" "Acceptable" ...
## $ TEAM_PITCHING_BB_Outlier: chr "Acceptable" "Acceptable" "Acceptable" "Acceptable" ...
## $ TEAM_PITCHING_SO_Outlier: chr "Acceptable" "Acceptable" "Acceptable" "Acceptable" ...
## $ TEAM_FIELDING_E_Outlier : chr "Acceptable" "Acceptable" "Acceptable" "Acceptable" ...
Subsetting - The code below subsets the data to keep only the numeric and integer columns that will be used for our models. This creates the baseball_train_03 dataframe.
baseball_train_03 <- baseball_train_02[c(1:15) ]
str(baseball_train_03)## 'data.frame': 1521 obs. of 15 variables:
## $ TARGET_WINS : num 70 82 75 80 85 76 78 87 88 66 ...
## $ TEAM_BATTING_H : int 1387 1297 1279 1244 1273 1271 1305 1417 1563 1460 ...
## $ TEAM_BATTING_2B : int 209 186 200 179 171 213 179 226 242 239 ...
## $ TEAM_BATTING_3B : int 38 27 36 54 37 18 27 28 43 32 ...
## $ TEAM_BATTING_HR : int 96 102 92 122 115 96 82 108 164 107 ...
## $ TEAM_BATTING_BB : int 451 472 443 525 456 441 374 539 589 546 ...
## $ TEAM_BATTING_SO : num 922 920 973 1062 1027 ...
## $ TEAM_BASERUN_SB : num 43 49 107 80 40 72 60 86 100 92 ...
## $ TEAM_BASERUN_CS : num 30 39 59 54 36 34 39 69 53 64 ...
## $ TEAM_PITCHING_H : int 1396 1297 1279 1244 1281 1271 1364 1417 1563 1478 ...
## $ TEAM_PITCHING_HR: int 97 102 92 122 116 96 86 108 164 108 ...
## $ TEAM_PITCHING_BB: int 454 472 443 525 459 441 391 539 589 553 ...
## $ TEAM_PITCHING_SO: num 928 920 973 1062 1033 ...
## $ TEAM_FIELDING_E : num 1.23 1.23 1.22 1.23 1.22 ...
## $ TEAM_FIELDING_DP: num 156 168 149 186 136 159 141 136 172 146 ...
The first model uses stepwise selection in the backward direction to eliminate variables. This is an automated process, unlike the manual variable selection used below. We will not pay much attention to this process, as the focus of the project is to manually identify and select the significant variables that predict TARGET_WINS.
Model <- step(lm(TARGET_WINS ~ ., data=baseball_train_03), direction = "backward")## Start: AIC=7313.35
## TARGET_WINS ~ TEAM_BATTING_H + TEAM_BATTING_2B + TEAM_BATTING_3B +
## TEAM_BATTING_HR + TEAM_BATTING_BB + TEAM_BATTING_SO + TEAM_BASERUN_SB +
## TEAM_BASERUN_CS + TEAM_PITCHING_H + TEAM_PITCHING_HR + TEAM_PITCHING_BB +
## TEAM_PITCHING_SO + TEAM_FIELDING_E + TEAM_FIELDING_DP
##
## Df Sum of Sq RSS AIC
## - TEAM_PITCHING_H 1 0.7 182710 7311.4
## - TEAM_PITCHING_HR 1 162.2 182872 7312.7
## - TEAM_BATTING_H 1 216.2 182926 7313.2
## <none> 182709 7313.4
## - TEAM_BASERUN_CS 1 330.3 183040 7314.1
## - TEAM_BATTING_HR 1 338.0 183047 7314.2
## - TEAM_PITCHING_BB 1 363.7 183073 7314.4
## - TEAM_BATTING_BB 1 629.6 183339 7316.6
## - TEAM_PITCHING_SO 1 1242.9 183952 7321.7
## - TEAM_BATTING_SO 1 1857.6 184567 7326.7
## - TEAM_BATTING_2B 1 1864.9 184574 7326.8
## - TEAM_FIELDING_DP 1 6690.2 189400 7366.1
## - TEAM_BATTING_3B 1 7536.4 190246 7372.8
## - TEAM_BASERUN_SB 1 8080.4 190790 7377.2
## - TEAM_FIELDING_E 1 18743.7 201453 7459.9
##
## Step: AIC=7311.36
## TARGET_WINS ~ TEAM_BATTING_H + TEAM_BATTING_2B + TEAM_BATTING_3B +
## TEAM_BATTING_HR + TEAM_BATTING_BB + TEAM_BATTING_SO + TEAM_BASERUN_SB +
## TEAM_BASERUN_CS + TEAM_PITCHING_HR + TEAM_PITCHING_BB + TEAM_PITCHING_SO +
## TEAM_FIELDING_E + TEAM_FIELDING_DP
##
## Df Sum of Sq RSS AIC
## - TEAM_PITCHING_HR 1 173.3 182883 7310.8
## <none> 182710 7311.4
## - TEAM_BASERUN_CS 1 331.1 183041 7312.1
## - TEAM_BATTING_HR 1 358.9 183069 7312.3
## - TEAM_PITCHING_SO 1 1259.1 183969 7319.8
## - TEAM_PITCHING_BB 1 1509.7 184220 7321.9
## - TEAM_BATTING_2B 1 1876.6 184587 7324.9
## - TEAM_BATTING_SO 1 1880.0 184590 7324.9
## - TEAM_BATTING_BB 1 2658.3 185368 7331.3
## - TEAM_BATTING_H 1 4833.0 187543 7349.1
## - TEAM_FIELDING_DP 1 6705.4 189416 7364.2
## - TEAM_BATTING_3B 1 7548.6 190259 7370.9
## - TEAM_BASERUN_SB 1 8142.6 190853 7375.7
## - TEAM_FIELDING_E 1 18841.1 201551 7458.6
##
## Step: AIC=7310.8
## TARGET_WINS ~ TEAM_BATTING_H + TEAM_BATTING_2B + TEAM_BATTING_3B +
## TEAM_BATTING_HR + TEAM_BATTING_BB + TEAM_BATTING_SO + TEAM_BASERUN_SB +
## TEAM_BASERUN_CS + TEAM_PITCHING_BB + TEAM_PITCHING_SO + TEAM_FIELDING_E +
## TEAM_FIELDING_DP
##
## Df Sum of Sq RSS AIC
## <none> 182883 7310.8
## - TEAM_BASERUN_CS 1 418.7 183302 7312.3
## - TEAM_PITCHING_SO 1 1202.0 184085 7318.8
## - TEAM_PITCHING_BB 1 1537.3 184421 7321.5
## - TEAM_BATTING_2B 1 1928.3 184812 7324.8
## - TEAM_BATTING_SO 1 1977.9 184861 7325.2
## - TEAM_BATTING_BB 1 2678.0 185561 7330.9
## - TEAM_BATTING_H 1 4860.2 187744 7348.7
## - TEAM_BATTING_HR 1 5541.1 188424 7354.2
## - TEAM_BATTING_3B 1 7420.8 190304 7369.3
## - TEAM_FIELDING_DP 1 7423.8 190307 7369.3
## - TEAM_BASERUN_SB 1 9570.9 192454 7386.4
## - TEAM_FIELDING_E 1 18687.6 201571 7456.8
summary(Model)##
## Call:
## lm(formula = TARGET_WINS ~ TEAM_BATTING_H + TEAM_BATTING_2B +
## TEAM_BATTING_3B + TEAM_BATTING_HR + TEAM_BATTING_BB + TEAM_BATTING_SO +
## TEAM_BASERUN_SB + TEAM_BASERUN_CS + TEAM_PITCHING_BB + TEAM_PITCHING_SO +
## TEAM_FIELDING_E + TEAM_FIELDING_DP, data = baseball_train_03)
##
## Residuals:
## Min 1Q Median 3Q Max
## -45.468 -6.985 -0.128 7.454 34.637
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.368e+03 1.084e+02 12.624 < 2e-16 ***
## TEAM_BATTING_H 3.169e-02 5.006e-03 6.331 3.22e-10 ***
## TEAM_BATTING_2B -4.198e-02 1.053e-02 -3.987 7.00e-05 ***
## TEAM_BATTING_3B 1.756e-01 2.244e-02 7.822 9.68e-15 ***
## TEAM_BATTING_HR 7.519e-02 1.112e-02 6.759 1.97e-11 ***
## TEAM_BATTING_BB 1.564e-01 3.328e-02 4.699 2.85e-06 ***
## TEAM_BATTING_SO -8.768e-02 2.171e-02 -4.038 5.65e-05 ***
## TEAM_BASERUN_SB 6.625e-02 7.458e-03 8.884 < 2e-16 ***
## TEAM_BASERUN_CS -6.510e-02 3.503e-02 -1.858 0.063350 .
## TEAM_PITCHING_BB -1.130e-01 3.175e-02 -3.560 0.000382 ***
## TEAM_PITCHING_SO 6.484e-02 2.059e-02 3.148 0.001675 **
## TEAM_FIELDING_E -1.085e+03 8.739e+01 -12.413 < 2e-16 ***
## TEAM_FIELDING_DP -1.121e-01 1.433e-02 -7.824 9.56e-15 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 11.01 on 1508 degrees of freedom
## Multiple R-squared: 0.3725, Adjusted R-squared: 0.3675
## F-statistic: 74.59 on 12 and 1508 DF, p-value: < 2.2e-16
The backward stepwise selection process retained twelve variables, eleven of them significant at α = 0.05, with an R-squared of 37%, a residual standard error of 11.01, and an F-statistic of 74.59. Notice that some of the coefficients are negative, which means that, holding the other predictors fixed, increases in those variables are associated with fewer wins. We will explore these coefficients a little further in this analysis.
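To list exactly which predictors the backward search retained, a one-line sketch:
formula(Model) # the twelve predictors kept by the backward elimination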
Using all 14 explanatory variables
Model1 <-lm(TARGET_WINS ~ ., data=baseball_train_03)
summary(Model1)##
## Call:
## lm(formula = TARGET_WINS ~ ., data = baseball_train_03)
##
## Residuals:
## Min 1Q Median 3Q Max
## -45.067 -7.014 -0.101 7.499 34.361
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.376e+03 1.089e+02 12.634 < 2e-16 ***
## TEAM_BATTING_H 2.992e-02 2.242e-02 1.335 0.1821
## TEAM_BATTING_2B -4.140e-02 1.056e-02 -3.921 9.23e-05 ***
## TEAM_BATTING_3B 1.775e-01 2.252e-02 7.882 6.15e-15 ***
## TEAM_BATTING_HR 2.424e-01 1.452e-01 1.669 0.0953 .
## TEAM_BATTING_BB 1.606e-01 7.051e-02 2.278 0.0229 *
## TEAM_BATTING_SO -1.072e-01 2.741e-02 -3.913 9.52e-05 ***
## TEAM_BASERUN_SB 6.361e-02 7.794e-03 8.161 6.94e-16 ***
## TEAM_BASERUN_CS -5.853e-02 3.547e-02 -1.650 0.0992 .
## TEAM_PITCHING_H 1.587e-03 2.058e-02 0.077 0.9386
## TEAM_PITCHING_HR -1.617e-01 1.398e-01 -1.156 0.2478
## TEAM_PITCHING_BB -1.166e-01 6.735e-02 -1.732 0.0836 .
## TEAM_PITCHING_SO 8.366e-02 2.614e-02 3.201 0.0014 **
## TEAM_FIELDING_E -1.092e+03 8.785e+01 -12.430 < 2e-16 ***
## TEAM_FIELDING_DP -1.086e-01 1.463e-02 -7.426 1.87e-13 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 11.01 on 1506 degrees of freedom
## Multiple R-squared: 0.3731, Adjusted R-squared: 0.3672
## F-statistic: 64.01 on 14 and 1506 DF, p-value: < 2.2e-16
This model identified eight significant variables at α = 0.05, with an R-squared of 37%, a residual standard error of 11.01, and an F-statistic of 64.01. Although the F-statistic decreased, this model does not improve meaningfully on the previous one.
Metrics1 <- data.frame(
R2 = rsquare(Model1, data = baseball_train_03),
RMSE = rmse(Model1, data = baseball_train_03),
MAE = mae(Model1, data = baseball_train_03)
)
print(Metrics1)## R2 RMSE MAE
## 1 0.3730717 10.96013 8.741105
Using seven of the significant variables from Model 1
Model2 <- lm(TARGET_WINS~TEAM_FIELDING_E + TEAM_BASERUN_SB + TEAM_BATTING_3B + TEAM_FIELDING_DP + TEAM_PITCHING_SO + TEAM_BATTING_SO + TEAM_BATTING_2B,data=baseball_train_03)
summary(Model2)##
## Call:
## lm(formula = TARGET_WINS ~ TEAM_FIELDING_E + TEAM_BASERUN_SB +
## TEAM_BATTING_3B + TEAM_FIELDING_DP + TEAM_PITCHING_SO + TEAM_BATTING_SO +
## TEAM_BATTING_2B, data = baseball_train_03)
##
## Residuals:
## Min 1Q Median 3Q Max
## -48.799 -8.299 -0.053 8.472 39.785
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.808e+03 1.158e+02 15.607 < 2e-16 ***
## TEAM_FIELDING_E -1.410e+03 9.313e+01 -15.141 < 2e-16 ***
## TEAM_BASERUN_SB 5.429e-02 7.497e-03 7.242 7.02e-13 ***
## TEAM_BATTING_3B 1.788e-01 2.289e-02 7.808 1.08e-14 ***
## TEAM_FIELDING_DP -5.319e-02 1.525e-02 -3.488 0.000501 ***
## TEAM_PITCHING_SO -8.587e-03 9.176e-03 -0.936 0.349497
## TEAM_BATTING_SO -8.186e-03 9.308e-03 -0.880 0.379267
## TEAM_BATTING_2B 4.475e-02 7.971e-03 5.614 2.35e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 12.19 on 1513 degrees of freedom
## Multiple R-squared: 0.2288, Adjusted R-squared: 0.2253
## F-statistic: 64.14 on 7 and 1513 DF, p-value: < 2.2e-16
This model identified five significant variables at α = 0.05, with an R-squared of 22%, a residual standard error of 12.19, and an F-statistic of 64.14. The R-squared decreased and the error increased slightly.
Metrics2 <- data.frame(
R2 = rsquare(Model2, data = baseball_train_03),
RMSE = rmse(Model2, data = baseball_train_03),
MAE = mae(Model2, data = baseball_train_03)
)
print(Metrics2)## R2 RMSE MAE
## 1 0.2288348 12.15572 9.731629
All offensive categories, which include hitting and base running
Model3 <-lm(TARGET_WINS~TEAM_BATTING_H + TEAM_BATTING_BB + TEAM_BATTING_HR + TEAM_BATTING_2B + TEAM_BATTING_SO + TEAM_BASERUN_CS + TEAM_BATTING_3B + TEAM_BASERUN_SB,data=baseball_train_03)
summary(Model3)##
## Call:
## lm(formula = TARGET_WINS ~ TEAM_BATTING_H + TEAM_BATTING_BB +
## TEAM_BATTING_HR + TEAM_BATTING_2B + TEAM_BATTING_SO + TEAM_BASERUN_CS +
## TEAM_BATTING_3B + TEAM_BASERUN_SB, data = baseball_train_03)
##
## Residuals:
## Min 1Q Median 3Q Max
## -49.812 -7.822 0.247 8.166 35.877
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 15.763131 7.024077 2.244 0.0250 *
## TEAM_BATTING_H 0.024765 0.005285 4.686 3.03e-06 ***
## TEAM_BATTING_BB 0.037681 0.003994 9.435 < 2e-16 ***
## TEAM_BATTING_HR 0.099319 0.011448 8.676 < 2e-16 ***
## TEAM_BATTING_2B -0.013435 0.010919 -1.230 0.2187
## TEAM_BATTING_SO -0.010801 0.002767 -3.904 9.88e-05 ***
## TEAM_BASERUN_CS -0.068614 0.037166 -1.846 0.0651 .
## TEAM_BATTING_3B 0.115379 0.022950 5.027 5.57e-07 ***
## TEAM_BASERUN_SB 0.076701 0.007431 10.321 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 11.73 on 1512 degrees of freedom
## Multiple R-squared: 0.2857, Adjusted R-squared: 0.2819
## F-statistic: 75.58 on 8 and 1512 DF, p-value: < 2.2e-16
This model identified six significant variables at α = 0.05, with an R-squared of 28%, a residual standard error of 11.73, and an F-statistic of 75.58. Although the R-squared is not that strong, the standard errors are more reasonable. We will hold onto this model as performing better than the previous models for now.
Metrics3 <- data.frame(
R2 = rsquare(Model3, data = baseball_train_03),
RMSE = rmse(Model3, data = baseball_train_03),
MAE = mae(Model3, data = baseball_train_03)
)
print(Metrics3)## R2 RMSE MAE
## 1 0.2856527 11.69934 9.330048
All defensive categories, which include fielding and pitching
Model4 <- lm(TARGET_WINS~TEAM_PITCHING_H + TEAM_PITCHING_BB + TEAM_PITCHING_HR + TEAM_PITCHING_SO + TEAM_FIELDING_E,data=baseball_train_03)
summary(Model4)##
## Call:
## lm(formula = TARGET_WINS ~ TEAM_PITCHING_H + TEAM_PITCHING_BB +
## TEAM_PITCHING_HR + TEAM_PITCHING_SO + TEAM_FIELDING_E, data = baseball_train_03)
##
## Residuals:
## Min 1Q Median 3Q Max
## -48.818 -8.397 0.393 8.617 42.600
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 7.915e+02 1.079e+02 7.335 3.61e-13 ***
## TEAM_PITCHING_H 2.420e-02 2.705e-03 8.947 < 2e-16 ***
## TEAM_PITCHING_BB 2.542e-02 3.943e-03 6.448 1.52e-10 ***
## TEAM_PITCHING_HR 1.503e-02 9.799e-03 1.533 0.125415
## TEAM_PITCHING_SO -9.205e-03 2.369e-03 -3.886 0.000106 ***
## TEAM_FIELDING_E -6.153e+02 8.770e+01 -7.016 3.44e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 12.46 on 1515 degrees of freedom
## Multiple R-squared: 0.1932, Adjusted R-squared: 0.1905
## F-statistic: 72.56 on 5 and 1515 DF, p-value: < 2.2e-16
This model identified four significant variables at α = 0.05, with an R-squared of 19%, a residual standard error of 12.46, and an F-statistic of 72.56. There is no significant improvement with this model.
Metrics4 <- data.frame(
R2 = rsquare(Model4, data = baseball_train_03),
RMSE = rmse(Model4, data = baseball_train_03),
MAE = mae(Model4, data = baseball_train_03)
)
print(Metrics4)## R2 RMSE MAE
## 1 0.1932003 12.43339 9.945165
Combining the pitching variables from Model 4 with the significant speed variables from Model 3, TEAM_BATTING_3B and TEAM_BASERUN_SB
Model5 <- lm(TARGET_WINS~TEAM_PITCHING_H + TEAM_PITCHING_BB + TEAM_PITCHING_HR + TEAM_PITCHING_SO + TEAM_BATTING_3B + TEAM_BASERUN_SB,data=baseball_train_03)
summary(Model5)##
## Call:
## lm(formula = TARGET_WINS ~ TEAM_PITCHING_H + TEAM_PITCHING_BB +
## TEAM_PITCHING_HR + TEAM_PITCHING_SO + TEAM_BATTING_3B + TEAM_BASERUN_SB,
## data = baseball_train_03)
##
## Residuals:
## Min 1Q Median 3Q Max
## -51.002 -7.725 0.407 8.171 36.647
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 44.113607 4.898047 9.006 < 2e-16 ***
## TEAM_PITCHING_H 0.002544 0.002951 0.862 0.389
## TEAM_PITCHING_BB 0.029880 0.003774 7.917 4.66e-15 ***
## TEAM_PITCHING_HR 0.134928 0.009503 14.198 < 2e-16 ***
## TEAM_PITCHING_SO -0.016789 0.002522 -6.657 3.91e-11 ***
## TEAM_BATTING_3B 0.141192 0.023104 6.111 1.26e-09 ***
## TEAM_BASERUN_SB 0.077656 0.007190 10.800 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 11.93 on 1514 degrees of freedom
## Multiple R-squared: 0.2606, Adjusted R-squared: 0.2576
## F-statistic: 88.92 on 6 and 1514 DF, p-value: < 2.2e-16
This model identified five significant variables at α = 0.05, with an R-squared of 26%, a residual standard error of 11.93, and an F-statistic of 88.92. Although the R-squared is not better than Model3's, the F-statistic improved with smaller standard errors.
Metrics5 <- data.frame(
R2 = rsquare(Model5, data = baseball_train_03),
RMSE = rmse(Model5, data = baseball_train_03),
MAE = mae(Model5, data = baseball_train_03)
)
print(Metrics5)## R2 RMSE MAE
## 1 0.2605772 11.90291 9.506094
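Because in-sample R-squared always favors larger models, a cross-validated comparison would be more even-handed. Below is a minimal sketch using caret (loaded above), shown for Model3's formula; the seed and fold count are arbitrary choices, not part of the original analysis:
set.seed(123) # arbitrary seed for reproducible folds
ctrl <- trainControl(method = "cv", number = 10)
cv_model3 <- train(TARGET_WINS ~ TEAM_BATTING_H + TEAM_BATTING_BB + TEAM_BATTING_HR +
TEAM_BATTING_2B + TEAM_BATTING_SO + TEAM_BASERUN_CS +
TEAM_BATTING_3B + TEAM_BASERUN_SB,
data = baseball_train_03, method = "lm", trControl = ctrl)
cv_model3$results # out-of-sample RMSE / Rsquared / MAE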
anova(Model, Model1, Model2, Model3, Model4, Model5)| Res.Df | RSS | Df | Sum of Sq | F | Pr(>F) |
|---|---|---|---|---|---|
| 1.51e+03 | 1.83e+05 | ||||
| 1.51e+03 | 1.83e+05 | 2 | 174 | 0.717 | 0.488 |
| 1.51e+03 | 2.25e+05 | -7 | -4.2e+04 | 49.5 | 1.39e-63 |
| 1.51e+03 | 2.08e+05 | 1 | 1.66e+04 | 136 | 3.01e-30 |
| 1.52e+03 | 2.35e+05 | -3 | -2.69e+04 | 74 | 1.15e-44 |
| 1.51e+03 | 2.15e+05 | 1 | 1.96e+04 | 162 | 2.72e-35 |
tab_model(Model, Model1, Model2, Model3, Model4, Model5)| TARGET_WINS | TARGET_WINS | TARGET_WINS | TARGET_WINS | TARGET_WINS | TARGET_WINS | |||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Predictors | Estimates | CI | p | Estimates | CI | p | Estimates | CI | p | Estimates | CI | p | Estimates | CI | p | Estimates | CI | p |
| (Intercept) | 1368.26 | 1155.66 – 1580.87 | <0.001 | 1376.10 | 1162.45 – 1589.76 | <0.001 | 1807.57 | 1580.39 – 2034.75 | <0.001 | 15.76 | 1.99 – 29.54 | 0.025 | 791.49 | 579.82 – 1003.16 | <0.001 | 44.11 | 34.51 – 53.72 | <0.001 |
| TEAM BATTING H | 0.03 | 0.02 – 0.04 | <0.001 | 0.03 | -0.01 – 0.07 | 0.182 | 0.02 | 0.01 – 0.04 | <0.001 | |||||||||
| TEAM BATTING 2B | -0.04 | -0.06 – -0.02 | <0.001 | -0.04 | -0.06 – -0.02 | <0.001 | 0.04 | 0.03 – 0.06 | <0.001 | -0.01 | -0.03 – 0.01 | 0.219 | ||||||
| TEAM BATTING 3B | 0.18 | 0.13 – 0.22 | <0.001 | 0.18 | 0.13 – 0.22 | <0.001 | 0.18 | 0.13 – 0.22 | <0.001 | 0.12 | 0.07 – 0.16 | <0.001 | 0.14 | 0.10 – 0.19 | <0.001 | |||
| TEAM BATTING HR | 0.08 | 0.05 – 0.10 | <0.001 | 0.24 | -0.04 – 0.53 | 0.095 | 0.10 | 0.08 – 0.12 | <0.001 | |||||||||
| TEAM BATTING BB | 0.16 | 0.09 – 0.22 | <0.001 | 0.16 | 0.02 – 0.30 | 0.023 | 0.04 | 0.03 – 0.05 | <0.001 | |||||||||
| TEAM BATTING SO | -0.09 | -0.13 – -0.05 | <0.001 | -0.11 | -0.16 – -0.05 | <0.001 | -0.01 | -0.03 – 0.01 | 0.379 | -0.01 | -0.02 – -0.01 | <0.001 | ||||||
| TEAM BASERUN SB | 0.07 | 0.05 – 0.08 | <0.001 | 0.06 | 0.05 – 0.08 | <0.001 | 0.05 | 0.04 – 0.07 | <0.001 | 0.08 | 0.06 – 0.09 | <0.001 | 0.08 | 0.06 – 0.09 | <0.001 | |||
| TEAM BASERUN CS | -0.07 | -0.13 – 0.00 | 0.063 | -0.06 | -0.13 – 0.01 | 0.099 | -0.07 | -0.14 – 0.00 | 0.065 | |||||||||
| TEAM PITCHING BB | -0.11 | -0.18 – -0.05 | <0.001 | -0.12 | -0.25 – 0.02 | 0.084 | 0.03 | 0.02 – 0.03 | <0.001 | 0.03 | 0.02 – 0.04 | <0.001 | ||||||
| TEAM PITCHING SO | 0.06 | 0.02 – 0.11 | 0.002 | 0.08 | 0.03 – 0.13 | 0.001 | -0.01 | -0.03 – 0.01 | 0.349 | -0.01 | -0.01 – -0.00 | <0.001 | -0.02 | -0.02 – -0.01 | <0.001 | |||
| TEAM FIELDING E | -1084.76 | -1256.17 – -913.35 | <0.001 | -1091.94 | -1264.26 – -919.62 | <0.001 | -1410.16 | -1592.85 – -1227.48 | <0.001 | -615.31 | -787.34 – -443.27 | <0.001 | ||||||
| TEAM FIELDING DP | -0.11 | -0.14 – -0.08 | <0.001 | -0.11 | -0.14 – -0.08 | <0.001 | -0.05 | -0.08 – -0.02 | 0.001 | |||||||||
| TEAM PITCHING H | 0.00 | -0.04 – 0.04 | 0.939 | 0.02 | 0.02 – 0.03 | <0.001 | 0.00 | -0.00 – 0.01 | 0.389 | |||||||||
| TEAM PITCHING HR | -0.16 | -0.44 – 0.11 | 0.248 | 0.02 | -0.00 – 0.03 | 0.125 | 0.13 | 0.12 – 0.15 | <0.001 | |||||||||
| Observations | 1521 | 1521 | 1521 | 1521 | 1521 | 1521 | ||||||||||||
| R2 / R2 adjusted | 0.372 / 0.367 | 0.373 / 0.367 | 0.229 / 0.225 | 0.286 / 0.282 | 0.193 / 0.191 | 0.261 / 0.258 | ||||||||||||
Ridge regression is an extension of linear regression in which the loss function is modified to limit the complexity of the model. This is done by adding a penalty term equal to lambda times the sum of the squared coefficients, so the objective becomes the residual sum of squares plus that penalty.
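To make the penalty concrete, here is a minimal sketch of the ridge objective (intercept omitted for brevity; note that glmnet additionally standardizes the predictors by default):
# Ridge loss: squared-error loss plus an L2 penalty on the coefficients
ridge_loss <- function(b, X, y, lambda) {
sum((y - X %*% b)^2) + lambda * sum(b^2)
}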
Before implementing the ridge model, we will split the training dataset into two parts: a training set and a test set that can be used for evaluation. By enforcing stratified sampling, both our training and testing sets have approximately equal distributions of the response, TARGET_WINS.
Transforming the variables into matrix form will enable us to fit the penalized model with the glmnet() function in the glmnet package.
#Split the data into Training and Test Set
baseball_train_set<- initial_split(baseball_train_03, prop = 0.7, strata = "TARGET_WINS")
train_baseball <- training(baseball_train_set)
test_baseball <- testing(baseball_train_set)
train_Ind<- as.matrix(train_baseball)
train_Dep<- as.matrix(train_baseball$TARGET_WINS)
test_Ind<- as.matrix(test_baseball)
test_Dep<- as.matrix(test_baseball$TARGET_WINS)Ridge regression is useful here for mitigating multicollinearity, avoiding overfitting, and improving prediction. (Note that as.matrix(train_baseball) keeps TARGET_WINS as a column of the predictor matrix; we will return to this point below.)
lambdas <- 10^seq(2, -3, by = -.1)
Model6 <- glmnet(train_Ind,train_Dep, nlambda = 25, alpha = 0, family = 'gaussian', lambda = lambdas)
summary(Model6)## Length Class Mode
## a0 51 -none- numeric
## beta 765 dgCMatrix S4
## df 51 -none- numeric
## dim 2 -none- numeric
## lambda 51 -none- numeric
## dev.ratio 51 -none- numeric
## nulldev 1 -none- numeric
## npasses 1 -none- numeric
## jerr 1 -none- numeric
## offset 1 -none- logical
## call 7 -none- call
## nobs 1 -none- numeric
print(Model6, digits = max(3, getOption("digits") - 3),
signif.stars = getOption("show.signif.stars"))##
## Call: glmnet(x = train_Ind, y = train_Dep, family = "gaussian", alpha = 0, nlambda = 25, lambda = lambdas)
##
## Df %Dev Lambda
## 1 15 30.03 100.000
## 2 15 35.00 79.430
## 3 15 40.34 63.100
## 4 15 45.97 50.120
## 5 15 51.78 39.810
## 6 15 57.65 31.620
## 7 15 63.44 25.120
## 8 15 69.02 19.950
## 9 15 74.26 15.850
## 10 15 79.05 12.590
## 11 15 83.30 10.000
## 12 15 86.97 7.943
## 13 15 90.05 6.310
## 14 15 92.55 5.012
## 15 15 94.53 3.981
## 16 15 96.06 3.162
## 17 15 97.20 2.512
## 18 15 98.05 1.995
## 19 15 98.65 1.585
## 20 15 99.08 1.259
## 21 15 99.38 1.000
## 22 15 99.59 0.794
## 23 15 99.73 0.631
## 24 15 99.82 0.501
## 25 15 99.88 0.398
## 26 15 99.92 0.316
## 27 15 99.95 0.251
## 28 15 99.97 0.200
## 29 15 99.98 0.158
## 30 15 99.99 0.126
## 31 15 99.99 0.100
## 32 15 99.99 0.079
## 33 15 100.00 0.063
## 34 15 100.00 0.050
## 35 15 100.00 0.040
## 36 15 100.00 0.032
## 37 15 100.00 0.025
## 38 15 100.00 0.020
## 39 15 100.00 0.016
## 40 15 100.00 0.013
## 41 15 100.00 0.010
## 42 15 100.00 0.008
## 43 15 100.00 0.006
## 44 15 100.00 0.005
## 45 15 100.00 0.004
## 46 15 100.00 0.003
## 47 15 100.00 0.003
## 48 15 100.00 0.002
## 49 15 100.00 0.002
## 50 15 100.00 0.001
## 51 15 100.00 0.001
The significant difference between OLS and ridge regression is the hyperparameter tuning via lambda. Ridge regression does not perform feature selection, but it can predict better and helps control overfitting. Cross-validating the ridge regression will help us identify the optimal lambda with which to penalize the model and enhance its predictive accuracy.
CrossVal_ridge <- cv.glmnet(train_Ind,train_Dep, alpha = 0, lambda = lambdas)
optimal_lambda <- CrossVal_ridge$lambda.min
optimal_lambda #The optimal lambda is 0.001, which we will use to penalize the Ridge Regression model.## [1] 0.001
coef(CrossVal_ridge) # Shows the coefficients## 16 x 1 sparse Matrix of class "dgCMatrix"
## s1
## (Intercept) 1.535312e-01
## TARGET_WINS 9.998834e-01
## TEAM_BATTING_H -1.736638e-05
## TEAM_BATTING_2B -7.943975e-06
## TEAM_BATTING_3B 1.688642e-05
## TEAM_BATTING_HR -4.354268e-04
## TEAM_BATTING_BB -6.515657e-06
## TEAM_BATTING_SO 9.941254e-05
## TEAM_BASERUN_SB 1.482210e-05
## TEAM_BASERUN_CS -2.352124e-05
## TEAM_PITCHING_H 1.926142e-05
## TEAM_PITCHING_HR 4.271052e-04
## TEAM_PITCHING_BB 9.457700e-06
## TEAM_PITCHING_SO -9.800251e-05
## TEAM_FIELDING_E -1.175340e-01
## TEAM_FIELDING_DP -2.368209e-05
plot(CrossVal_ridge)The plot shows that the error increases as the magnitude of lambda increases; we previously identified the optimal lambda as 0.001, which is also apparent from the plot above. The coefficients are restricted to be small but not exactly zero, since ridge regression does not force coefficients to zero. This suggests the model is performing well so far, but let's try to improve it using a different lambda.
eval_results <- function(true, predicted, df){
SSE <- sum((predicted - true)^2)
SST <- sum((true - mean(true))^2)
R_square <- 1 - SSE / SST
RMSE = sqrt(SSE/nrow(df))
data.frame(
RMSE = RMSE,
Rsquare = R_square
)
}
# Prediction and evaluation on train data
predictions_train <- predict(Model6, s = optimal_lambda, newx = train_Ind)
eval_results(train_Dep, predictions_train, train_baseball)| RMSE | Rsquare |
|---|---|
| 0.00171 | 1 |
We should be more than a little concerned about the near-100% R-squared for this model. The coefficient table above reveals the cause: because the predictor matrices were built with as.matrix() on the full dataframes, TARGET_WINS itself is included as a predictor (its coefficient is roughly 1), so the model is essentially predicting the response from itself. Although ridge regression shrinks coefficients toward zero to improve generalization, this level of performance reflects data leakage rather than genuine predictive power. Let's refit the model using a larger lambda, since the optimal lambda might not always be the best.
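A minimal sketch of a leakage-free design matrix is shown below; the code that follows in this report keeps the original matrices, so the printed results are unchanged:
# Drop the response column before building the predictor matrices
predictors <- setdiff(names(train_baseball), "TARGET_WINS")
train_Ind_clean <- as.matrix(train_baseball[, predictors])
test_Ind_clean <- as.matrix(test_baseball[, predictors])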
Model6_Improved <- glmnet(train_Ind,train_Dep, nlambda = 25, alpha = 0, family = 'gaussian', lambda = 6.310)
summary(Model6_Improved)## Length Class Mode
## a0 1 -none- numeric
## beta 15 dgCMatrix S4
## df 1 -none- numeric
## dim 2 -none- numeric
## lambda 1 -none- numeric
## dev.ratio 1 -none- numeric
## nulldev 1 -none- numeric
## npasses 1 -none- numeric
## jerr 1 -none- numeric
## offset 1 -none- logical
## call 7 -none- call
## nobs 1 -none- numeric
coef(Model6_Improved)## 16 x 1 sparse Matrix of class "dgCMatrix"
## s0
## (Intercept) 1.764003e+02
## TARGET_WINS 6.224314e-01
## TEAM_BATTING_H 6.627483e-03
## TEAM_BATTING_2B 1.362185e-03
## TEAM_BATTING_3B 2.639383e-02
## TEAM_BATTING_HR 8.228176e-03
## TEAM_BATTING_BB 7.133799e-03
## TEAM_BATTING_SO -1.082704e-03
## TEAM_BASERUN_SB 1.357864e-02
## TEAM_BASERUN_CS -2.805586e-03
## TEAM_PITCHING_H 1.789510e-03
## TEAM_PITCHING_HR 7.331744e-03
## TEAM_PITCHING_BB 4.904272e-03
## TEAM_PITCHING_SO -1.705157e-03
## TEAM_FIELDING_E -1.331875e+02
## TEAM_FIELDING_DP -2.366768e-02
Let's compute the model's performance metrics to see how this model is doing.
eval_results <- function(true, predicted, df){
SSE <- sum((predicted - true)^2)
SST <- sum((true - mean(true))^2)
R_square <- 1 - SSE / SST
RMSE = sqrt(SSE/nrow(df))
data.frame(
RMSE = RMSE,
Rsquare = R_square
)
}
# Prediction and evaluation on train data
# (note: `s = lambda` picks up the Box-Cox lambda of -0.8 left over from the
# data preparation section, not the ridge penalty of 6.310 the model was fit with)
predictions_train <- predict(Model6_Improved, s = lambda, newx = train_Ind)
eval_results(train_Dep, predictions_train, train_baseball)| RMSE | Rsquare |
|---|---|
| 4.32 | 0.9 |
# Prediction and evaluation on test data
predictions_test <- predict(Model6_Improved, s = lambda, newx = test_Ind)
eval_results(test_Dep, predictions_test, test_baseball)| RMSE | Rsquare |
|---|---|
| 4.51 | 0.899 |
The improved Model6 output shows that the RMSE and R-squared values for the ridge regression model on the training and test data are far better than those of the OLS models, and the similar train and test errors suggest the model is not overfitting in the usual sense. However, note that TARGET_WINS is still present in the predictor matrix (its coefficient above is about 0.62), so this performance remains inflated relative to Models 1 through 5, which did not use the response as a predictor.
ModelName <- c("Model", "Model1","Model2","Model3","Model4","Model5","Model6")
Model_RSquared <- c("37%", "37%", "22%", "28%", "19%", "26% ", "90%")
Model_RMSE <- c("11.01", "10.96", "12.15", "11.69", "12.43", "11.93 ", "4.33")
Model_FStatistic <- c("74.59", "64.01", "64.14", "75.58", "72.56", "88.92 ", "NA")
Model_Performance <- data.frame(ModelName,Model_RSquared,Model_RMSE,Model_FStatistic)
Model_Performance| ModelName | Model_RSquared | Model_RMSE | Model_FStatistic |
|---|---|---|---|
| Model | 37% | 11.01 | 74.59 |
| Model1 | 37% | 10.96 | 64.01 |
| Model2 | 22% | 12.15 | 64.14 |
| Model3 | 28% | 11.69 | 75.58 |
| Model4 | 19% | 12.43 | 72.56 |
| Model5 | 26% | 11.93 | 88.92 |
| Model6 | 90% | 4.33 | NA |
Based on the model metrics above, we are ready to make predictions. We will select our acceptable OLS models, Model3 and Model5, which have better F-statistics, smaller standard errors, and fewer negative coefficients, as our best OLS models. We will also compare the prediction accuracy of these models to that of the improved ridge regression model, which is our champion model for this exercise based on its very small RMSE and R-squared of about 90%, subject to the leakage caveat noted above.
predicted <- predict(Model3, newx = test_baseball)# predict on test data (note: predict.lm() has no `newx` argument; see the corrected sketch after the output below)
predicted_values <- cbind (actual=test_baseball$TARGET_WINS, predicted) # combine## Warning in cbind(actual = test_baseball$TARGET_WINS, predicted): number of rows
## of result is not a multiple of vector length (arg 1)
predicted_values## actual predicted
## 1 76 69.49875
## 2 70 67.56125
## 3 81 68.38378
## 4 91 73.25628
## 5 80 67.37030
## 6 70 66.86382
## 7 82 63.36361
## 8 84 76.58169
## 9 51 89.59170
## 10 77 76.54657
## 11 68 86.70081
## 12 58 77.31870
## 13 56 78.60647
## 14 63 84.73255
## 15 59 89.15530
## 16 84 86.01142
## 17 84 76.15138
## 18 68 76.18203
## 19 68 79.02243
## 20 79 72.03653
## 21 93 89.30377
## 22 85 84.16328
## 23 81 79.30964
## 24 76 76.00667
## 25 69 93.36918
## 26 82 79.72780
## 27 76 83.96472
## 28 66 81.83322
## 29 98 86.75976
## 30 104 83.85432
## 31 79 75.24767
## 32 69 96.12431
## 33 66 85.18416
## 34 78 87.23777
## 35 66 86.66397
## 36 59 80.90673
## 37 45 71.73633
## 38 69 81.22686
## 39 62 90.64517
## 40 60 94.18601
## 41 78 95.53100
## 42 95 80.08285
## 43 97 75.86419
## 44 88 66.74352
## 45 98 64.57578
## 46 84 60.91243
## 47 87 72.28834
## 48 85 68.48019
## 49 80 65.67250
## 50 88 72.48471
## 51 88 86.44011
## 52 107 80.65902
## 53 83 82.06254
## 54 94 78.70595
## 55 83 77.58369
## 56 85 72.97126
## 57 78 78.18561
## 58 49 72.24895
## 59 60 76.25716
## 60 68 77.02116
## 61 86 75.30225
## 62 80 76.46272
## 63 76 78.36297
## 64 92 68.74638
## 65 96 70.51741
## 66 84 66.00240
## 67 78 74.32965
## 68 83 68.02212
## 69 80 71.72855
## 70 97 69.41157
## 71 85 69.61566
## 72 78 70.79319
## 73 92 69.48860
## 74 83 65.80599
## 75 98 70.70940
## 76 75 82.73716
## 77 62 77.23526
## 78 110 81.04205
## 79 87 86.68577
## 80 86 81.16218
## 81 90 88.42305
## 82 105 85.99275
## 83 74 73.58515
## 84 78 82.22049
## 85 103 85.06236
## 86 73 86.09993
## 87 81 76.29960
## 88 59 79.91231
## 89 72 77.53938
## 90 88 84.34369
## 91 92 74.79848
## 92 79 70.02759
## 93 71 82.39423
## 94 78 79.28918
## 95 68 91.97048
## 96 89 78.96951
## 97 88 74.19733
## 98 100 72.42719
## 99 93 77.14336
## 100 83 76.15731
## 101 83 77.37089
## 102 105 72.05640
## 103 74 76.89373
## 104 59 75.92414
## 105 72 85.81217
## 106 79 75.59281
## 107 81 80.60366
## 108 96 85.62160
## ... (rows 109-1521 omitted for brevity) ...
mean(apply(predicted_values, 1, min)/apply(predicted_values, 1, max)) # accuracy: mean ratio of the smaller to the larger of actual vs. predicted## [1] 0.8582949
The prediction accuracy here is 85.83%.
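To make this metric concrete: for each record it divides the smaller of (actual, predicted) by the larger and averages those ratios, so a value of 1 means perfect agreement. Below is a minimal sketch on toy numbers (illustrative values only, not from the dataset):
actual <- c(80, 90, 100) # toy actual win totals
predicted <- c(88, 81, 100) # toy predicted win totals
pv <- cbind(actual, predicted)
mean(apply(pv, 1, min)/apply(pv, 1, max)) # ~0.94, where 1 = perfect agreement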
predicted <- predict(Model5, newdata = test_baseball) # predict on the test data; predict.lm takes newdata, not newx
predicted_values <- cbind(actual = test_baseball$TARGET_WINS, predicted) # combine actuals with predictions
predicted_values## actual predicted
## 1 76 67.44342
## 2 70 67.45113
## 3 81 70.07451
## 4 91 75.43397
## 5 80 67.72684
## 6 70 67.72587
## 7 82 63.76240
## 8 84 77.57821
## 9 51 87.50168
## 10 77 75.33778
## ... (rows 11-1521 omitted for brevity) ...
mean(apply(predicted_values, 1, min)/apply(predicted_values, 1, max)) # calculate accuracy with the same min/max ratio metric## [1] 0.8595468
The prediction accuracy for the OLS Model5 is 85.95%, which is serviceable for this purpose. But let's compare it to the champion model: the improved Ridge Regression.
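As a reminder of how a ridge model of this kind is typically produced, here is a hedged sketch using glmnet (loaded at the top of the report) on simulated stand-in data; the actual predictors and lambda behind Model6_Improved were selected earlier, so the names and values below are placeholders:
set.seed(42)
x_train <- matrix(rnorm(200 * 5), ncol = 5) # stand-in predictor matrix
y_train <- drop(x_train %*% c(2, -1, 0.5, 0, 1)) + rnorm(200) # stand-in response
cv_fit <- cv.glmnet(x_train, y_train, alpha = 0) # alpha = 0 selects the ridge penalty
ridge_fit <- glmnet(x_train, y_train, alpha = 0, lambda = cv_fit$lambda.min)
head(predict(ridge_fit, newx = x_train)) # glmnet's predict takes a matrix via newx; the column is named s0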
predicted <- predict(Model6_Improved, newx = test_Ind) # predict on the test data; glmnet's predict takes the model matrix via newx
predicted_values <- cbind(actual = test_baseball$TARGET_WINS, predicted) # combine; glmnet names its prediction column "s0", and the row names are the original row indices kept in the test split
predicted_values## actual s0
## 6 76 73.66949
## 12 70 71.84946
## 13 81 79.67667
## 18 91 85.71056
## ... (remaining test-set rows omitted for brevity)
## 1521 71 75.65729
Let's calculate the accuracy of Model6's predictions using the same ratio measure as above.
mean(apply(predicted_values, 1, min) / apply(predicted_values, 1, max)) # calculate accuracy
## [1] 0.9554337
The prediction accuracy of the improved Ridge Regression model is 95.54%.
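As a cross-check, more conventional error metrics can be computed on the same test predictions. A minimal sketch, assuming predicted_values has the columns shown in the printout above ("actual" and "s0"):

# Complementary, standard error metrics on the Model6 test predictions,
# both expressed in wins. Column names follow the printed output above.
actual <- predicted_values[, "actual"]
pred   <- predicted_values[, "s0"]
rmse <- sqrt(mean((actual - pred)^2)) # root mean squared error
mae  <- mean(abs(actual - pred))      # mean absolute error
c(RMSE = rmse, MAE = mae)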
ModelName <- c("Model3", "Model5", "Model6")
Model_Accuracy <- c("85.85%", "85.95%", "95.54%")
AccuracyCompared <- data.frame(ModelName, Model_Accuracy)
AccuracyCompared
| ModelName | Model_Accuracy |
|---|---|
| Model3 | 85.85% |
| Model5 | 85.95% |
| Model6 | 95.54% |
The prediction accuracy of the improved Ridge Regression Model6 is 95.54%, which is very good for this purpose.
The improved Model6 shows a significant improvement over all of the OLS models when the R-squared and RMSE of the models are compared. This model also predicts TARGET_WINS better than the OLS models because it is more stable and less prone to overfitting.
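Ridge's resistance to overfitting comes from the L2 penalty, whose strength lambda is typically chosen by cross-validation. A minimal sketch of that tuning step, assuming train_Ind is the training design matrix and train_baseball$TARGET_WINS the response (these names mirror the test-set objects used above and are assumptions about the original workflow):

# Cross-validated lambda selection for the ridge penalty.
# train_Ind / train_baseball are assumed names, not confirmed by the original.
set.seed(123)
cv_fit <- cv.glmnet(x = train_Ind, y = train_baseball$TARGET_WINS,
                    alpha = 0)   # alpha = 0 selects the ridge (L2) penalty
cv_fit$lambda.min                # lambda that minimizes cross-validated error
# ridge shrinks coefficients toward zero rather than dropping them:
# coef(cv_fit, s = "lambda.min")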
OLS Model3 and Model5 were chosen for their improved F-statistics, positive variable coefficients, and low standard errors. We will make our final predictions with the champion model, the improved Ridge Regression Model6, because it beats all of the OLS models on both model performance metrics and predictive ability.
For Models 3 and 4, the variables were chosen to test how offense-only variables and defense-only variables would each affect the model. Based on the coefficients for each category model, the combined model then took the variables with the highest coefficients from each category.
For offense, the two highest were home runs and triples, which intuitively makes sense: the home run and the triple are two of the best outcomes a hitter can achieve at the plate, so higher totals in those categories mean more runs scored, which helps a team win. On the defensive side, the two highest coefficients were hits allowed and walks allowed. Again, from a common-sense standpoint this makes sense: a pitcher wants to limit the number of times a batter reaches base, whether by a hit or a walk. Barring an error, if a batter does not get a hit or a walk, the outcome is an out, which in essence limits the number of runs scored by the opposing team.
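A minimal sketch of how the highest-coefficient variables could be pulled from each category model, assuming Model3 (offense-only) and Model4 (defense-only) are lm fits from earlier in the analysis; the helper name top_coefs is ours:

# Rank a fitted model's variables by coefficient magnitude.
# Model3 / Model4 are assumed to be the offense-only and defense-only lm fits.
top_coefs <- function(fit, n = 2) {
  coefs <- coef(fit)[-1]                      # drop the intercept
  head(sort(abs(coefs), decreasing = TRUE), n)
}
top_coefs(Model3)  # offense-only model: e.g. HR and triples per the discussion above
top_coefs(Model4)  # defense-only model: e.g. hits allowed and walks allowed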