DATA 621 - HW 1

Murali Kunissery

March 30, 2019

Overview

In this homework assignment, you will explore, analyze and model a data set containing approximately 2200 records. Each record represents a professional baseball team from the years 1871 to 2006 inclusive. Each record has the performance of the team for the given year, with all of the statistics adjusted to match the performance of a 162 game season.

Your objective is to build a multiple linear regression model on the training data to predict the number of wins for the team. You can only use the variables given to you (or variables that you derive from the variables provided). Below is a short description of the variables of interest in the data set:

[Table of variable descriptions: image in the original PDF, not reproduced here]

Data Exploration

The dataset consists of 17 variables, with 2276 total cases. Of those 17, one is a row index, one is the response (TARGET_WINS), and 15 are explanatory variables, which can be broken down into four groups:

  • batting
  • baserun
  • fielding
  • pitching
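
The summary statistics below are computed on the 191 complete cases (rows with no missing values); na_count is the number of missing values per variable across all 2276 rows.

variable vars n mean sd median trimmed mad min max range skew kurtosis se na_count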
TARGET_WINS 2 191 80.92670 12.115013 82 81.11765 13.3434 43 116 73 -0.1698314 -0.2952783 0.8766116 0
TEAM_BATTING_H 3 191 1478.62827 76.147869 1477 1477.42484 74.1300 1308 1667 359 0.1302702 -0.3710350 5.5098664 0
TEAM_BATTING_2B 4 191 297.19895 26.329335 296 296.62745 25.2042 201 373 172 0.0915189 0.4778716 1.9051238 0
TEAM_BATTING_3B 5 191 30.74346 9.043878 29 30.13072 8.8956 12 61 49 0.7007420 0.7446217 0.6543921 0
TEAM_BATTING_HR 6 191 178.05236 32.413243 175 176.81046 35.5824 116 260 144 0.2980673 -0.7172373 2.3453399 0
TEAM_BATTING_BB 7 191 543.31937 74.842133 535 541.31373 74.1300 365 775 410 0.3115199 -0.1474175 5.4153867 0
TEAM_BATTING_SO 8 191 1051.02618 104.156382 1050 1046.95425 97.8516 805 1399 594 0.3985050 0.3955105 7.5364913 102
TEAM_BASERUN_SB 9 191 90.90576 29.916401 87 89.06536 29.6520 31 177 146 0.5553966 -0.1414909 2.1646748 131
TEAM_BASERUN_CS 10 191 39.94241 11.898334 38 39.49020 11.8608 12 74 62 0.3468509 0.0006392 0.8609332 772
TEAM_BATTING_HBP 11 191 59.35602 12.967123 58 58.86275 11.8608 29 95 66 0.3185754 -0.1119828 0.9382681 2085
TEAM_PITCHING_H 12 191 1479.70157 75.788625 1480 1478.50327 72.6474 1312 1667 355 0.1279056 -0.3894781 5.4838725 0
TEAM_PITCHING_HR 13 191 178.17801 32.391678 175 176.93464 35.5824 116 260 144 0.2989191 -0.7190905 2.3437795 0
TEAM_PITCHING_BB 14 191 543.71728 74.916681 537 541.74510 72.6474 367 775 408 0.3144366 -0.1338563 5.4207808 0
TEAM_PITCHING_SO 15 191 1051.81675 104.347208 1052 1047.80392 97.8516 805 1399 594 0.3945586 0.3903991 7.5502990 102
TEAM_FIELDING_E 16 191 107.05236 16.632162 106 106.58170 17.7912 65 145 80 0.1780432 -0.3567367 1.2034610 0
TEAM_FIELDING_DP 17 191 152.33508 17.611682 152 152.04575 19.2738 113 204 91 0.2164822 -0.2115741 1.2743366 286

Looking at the data above, several variables have missing (NA) values; TEAM_BATTING_HBP has the most, with 2085 of 2276 values missing.
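
The na_count column can be reproduced with base R. The snippet below is a minimal sketch on a made-up toy data frame (the object name `train` and its values are hypothetical, not the assignment data); it counts missing values per column and the share of rows affected.

```r
# Hypothetical toy data standing in for the training set; values are made up.
train <- data.frame(
  TARGET_WINS      = c(70, 85, 92, 64, 81),
  TEAM_BATTING_SO  = c(900, NA, 1010, 850, NA),
  TEAM_BATTING_HBP = c(NA, NA, 55, NA, NA)
)

# Number of missing values per column, and the share of rows affected.
na_count <- colSums(is.na(train))
na_share <- round(na_count / nrow(train), 2)

print(data.frame(na_count, na_share))
```

A variable with a very high missing share, as TEAM_BATTING_HBP has here, is a candidate for removal rather than imputation.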

The boxplots below help show the spread of data within the dataset, and show various outliers. As shown in the graph below, TEAM_PITCHING_H seems to have the highest spread with the most outliers.

## Warning: Removed 3478 rows containing non-finite values (stat_boxplot).

The graph below zooms into the other variables, so it becomes easier to see spread and outliers from the other variables.

The histograms below show that many of the variables are right-skewed, while only a few are left-skewed.

The above boxplots show all of the variables listed in the dataset. This visualization may assist in showing how the data is spread.

The correlation plot below shows how variables in the dataset are related to each other. Looking at the plot, we can see that certain variables are more related than others.

For this project, it makes sense to break down the correlation by wins - since that’s what we’re trying to predict.

Correlation with TARGET_WINS
TARGET_WINS 1.0000000
TEAM_BATTING_H 0.4699467
TEAM_BATTING_2B 0.3129840
TEAM_BATTING_3B -0.1243459
TEAM_BATTING_HR 0.4224168
TEAM_BATTING_BB 0.4686879
TEAM_BATTING_SO -0.2288927
TEAM_BASERUN_SB 0.0148364
TEAM_BASERUN_CS -0.1787560
TEAM_BATTING_HBP 0.0735042
TEAM_PITCHING_H 0.4712343
TEAM_PITCHING_HR 0.4224668
TEAM_PITCHING_BB 0.4683988
TEAM_PITCHING_SO -0.2293648
TEAM_FIELDING_E -0.3866880
TEAM_FIELDING_DP -0.1958660

Below is a visual representation of the correlation plot.

According to the correlation plot, TEAM_BATTING_H, TEAM_BATTING_2B, TEAM_BATTING_HR, TEAM_BATTING_BB, TEAM_PITCHING_H, TEAM_PITCHING_HR, and TEAM_PITCHING_BB are the most positively correlated with TARGET_WINS.
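
A correlation-with-target column like the one above can be pulled out of the full correlation matrix. This is a minimal sketch on simulated data; the toy data frame and the choice of `use = "pairwise.complete.obs"` are assumptions, since the report does not say how missing values were handled when computing the correlations.

```r
set.seed(1)
n <- 50

# Simulated stand-in data; coefficients and noise level are made up.
toy <- data.frame(
  TEAM_BATTING_H  = rnorm(n, 1470, 140),
  TEAM_FIELDING_E = rnorm(n, 240, 60)
)
toy$TARGET_WINS <- 0.05 * toy$TEAM_BATTING_H -
  0.03 * toy$TEAM_FIELDING_E + rnorm(n, sd = 5)
toy$TEAM_BATTING_H[c(3, 7)] <- NA   # introduce a few missing values

# Keep only the column of the correlation matrix that relates to the target.
cors <- cor(toy, use = "pairwise.complete.obs")[, "TARGET_WINS"]
print(sort(cors, decreasing = TRUE))
```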

Data Preparation

Removal of Data

The variable TEAM_BATTING_HBP is missing over 90% of its values (2085 of 2276). That variable will be removed completely.

The variables TEAM_PITCHING_HR and TEAM_BATTING_HR are very closely correlated with each other, which suggests collinearity. The TEAM_PITCHING_HR variable will be dropped from the dataset.

Using the vif and vifstep functions from the usdm package, the data will also be tested for remaining collinearity issues. Variables found to have collinearity problems will be discarded.
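
For reference, the variance inflation factor of a predictor is 1 / (1 - R^2), where R^2 comes from regressing that predictor on all the other predictors. The sketch below computes it by hand in base R on simulated data (the variables x1, x2, x3 are hypothetical), to show what a large VIF looks like; it is an illustration of the definition, not a replacement for the usdm functions.

```r
set.seed(2)
n <- 200
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)   # nearly a copy of x1, so strongly collinear
x3 <- rnorm(n)                  # unrelated to the other two
X <- data.frame(x1, x2, x3)

# VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing column j on the rest.
vif_by_hand <- sapply(names(X), function(v) {
  f  <- reformulate(setdiff(names(X), v), response = v)
  r2 <- summary(lm(f, data = X))$r.squared
  1 / (1 - r2)
})
print(round(vif_by_hand, 2))
```

Here x1 and x2 get very large values while x3 stays close to 1, which is the pattern a threshold-based filter is designed to catch.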

Imputation of Missing (NA) values

The data exploration revealed multiple variables with numerous NA values. There are several ways to handle NA data: deleting the observations, deleting the variables, imputation with the mean/median/mode, or imputation with a prediction.

Imputation with the mean/median/mode is an easy way to fill in the missing values; however, it reduces the variance in the dataset and shrinks standard errors, which can invalidate hypothesis tests.
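
The variance reduction is easy to demonstrate. The following sketch uses simulated values (not the baseball data): 30% of a variable is set to NA, the gaps are filled with the observed mean, and the standard deviation before and after is compared.

```r
set.seed(3)
x <- rnorm(1000, mean = 100, sd = 15)

# Knock out 30% of the values at random.
x_missing <- x
x_missing[sample(1000, 300)] <- NA

# Mean imputation: every gap gets the same value.
x_mean_imputed <- ifelse(is.na(x_missing),
                         mean(x_missing, na.rm = TRUE),
                         x_missing)

sd_observed <- sd(x_missing, na.rm = TRUE)
sd_imputed  <- sd(x_mean_imputed)
print(c(observed = sd_observed, mean_imputed = sd_imputed))
```

Because every imputed value sits exactly at the mean, the imputed series is always less spread out than the observed values alone.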

In this case, missing values will be imputed by prediction using the mice package (Multivariate Imputation by Chained Equations) with its random forest method.
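
A minimal sketch of such an imputation is shown below. The toy data, the seed, and the settings `m = 1` and `maxit = 5` are assumptions chosen for illustration; the report does not state the settings actually used. The "rf" method also needs a random forest package installed alongside mice.

```r
library(mice)  # the "rf" method additionally needs a random forest backend

set.seed(621)
n <- 120

# Simulated stand-in for a few columns of the training data.
toy <- data.frame(
  TEAM_BATTING_H  = rnorm(n, 1470, 140),
  TEAM_BATTING_SO = rnorm(n, 730, 240),
  TEAM_BASERUN_SB = rnorm(n, 130, 90)
)
toy$TEAM_BATTING_SO[sample(n, 15)] <- NA
toy$TEAM_BASERUN_SB[sample(n, 25)] <- NA

# Impute with random forests, then extract the first completed data set.
imp <- mice(toy, method = "rf", m = 1, maxit = 5, seed = 621, printFlag = FALSE)
imputed_toy <- mice::complete(imp, 1)

print(colSums(is.na(imputed_toy)))
```

Because the imputed values are drawn with some randomness, fitted model statistics can differ slightly between runs unless the seed is fixed.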

Variables whose VIF exceeds the chosen threshold (th = 10 in the vifstep call below) will be discarded to avoid collinearity issues.

vif(imputed)
##           Variables      VIF
## 1       TARGET_WINS 1.495584
## 2    TEAM_BATTING_H 3.993383
## 3   TEAM_BATTING_2B 2.468801
## 4   TEAM_BATTING_3B 3.034968
## 5   TEAM_BATTING_HR 4.765619
## 6   TEAM_BATTING_BB 5.567294
## 7   TEAM_BATTING_SO 5.174404
## 8   TEAM_BASERUN_SB 2.428659
## 9   TEAM_BASERUN_CS 1.966029
## 10  TEAM_PITCHING_H 3.690147
## 11 TEAM_PITCHING_BB 4.780373
## 12 TEAM_PITCHING_SO 2.991219
## 13  TEAM_FIELDING_E 4.724461
## 14 TEAM_FIELDING_DP 1.715499
v1 <- vifstep(imputed, th=10)

Output - The below table shows the results of above data manipulation.

The NA values have been filled in using MICE with the random forest method. No variable exceeded the VIF threshold of 10 (the largest VIF, 5.57, belongs to TEAM_BATTING_BB), so vifstep dropped no additional variables and the 14 variables listed above remain.

variable vars n mean sd median trimmed mad min max range skew kurtosis se
TARGET_WINS 1 2276 80.79086 15.75215 82.0 81.31229 14.8260 0 146 146 -0.3987232 1.0274757 0.3301823
TEAM_BATTING_H 2 2276 1469.26977 144.59120 1454.0 1459.04116 114.1602 891 2554 1663 1.5713335 7.2785261 3.0307891
TEAM_BATTING_2B 3 2276 241.24692 46.80141 238.0 240.39627 47.4432 69 458 389 0.2151018 0.0061609 0.9810087
TEAM_BATTING_3B 4 2276 55.25000 27.93856 47.0 52.17563 23.7216 0 223 223 1.1094652 1.5032418 0.5856226
TEAM_BATTING_HR 5 2276 99.61204 60.54687 102.0 97.38529 78.5778 0 264 264 0.1860421 -0.9631189 1.2691285
TEAM_BATTING_BB 6 2276 501.55888 122.67086 512.0 512.18331 94.8864 0 878 878 -1.0257599 2.1828544 2.5713150
TEAM_BATTING_SO 7 2276 730.70299 245.61906 737.5 736.16959 278.7288 0 1399 1399 -0.2549685 -0.3084427 5.1484434
TEAM_BASERUN_SB 8 2276 129.42443 93.36049 103.0 114.21570 64.4931 0 697 697 1.8720112 4.5696253 1.9569377
TEAM_BASERUN_CS 9 2276 66.40158 40.54928 54.0 59.12514 22.9803 0 201 201 1.6605026 2.3270043 0.8499571
TEAM_PITCHING_H 10 2276 1779.21046 1406.84293 1518.0 1555.89517 174.9468 1137 30132 28995 10.3295111 141.8396985 29.4889618
TEAM_PITCHING_BB 11 2276 553.00791 166.35736 536.5 542.62459 98.5929 0 3645 3645 6.7438995 96.9676398 3.4870317
TEAM_PITCHING_SO 12 2276 811.82821 542.08185 804.0 791.04116 253.5246 0 19278 19278 22.5269865 695.8563547 11.3626268
TEAM_FIELDING_E 13 2276 246.48067 227.77097 159.0 193.43798 62.2692 65 1898 1833 2.9904656 10.9702717 4.7743279
TEAM_FIELDING_DP 14 2276 143.05185 27.70682 146.0 144.23381 25.2042 52 228 176 -0.3858534 0.0383034 0.5807651

Build Models

Using the training data provided, we will build three different linear regression models to determine which gives the best prediction of the number of wins for a baseball team. The three approaches are: all variables, only highly significant variables, and backward elimination of variables.

Model 1: All Variables

Model 1 uses all variables remaining after data preparation. Once the data has been prepared (imputed, etc., as described above), all of the variables are included to establish a base model. This shows which variables are significant in our dataset and lets us build other models from there.

## 
## Call:
## lm(formula = TARGET_WINS ~ ., data = imputed)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -47.214  -8.583   0.082   8.223  60.878 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      28.3713002  5.2607342   5.393 7.65e-08 ***
## TEAM_BATTING_H    0.0474426  0.0036076  13.151  < 2e-16 ***
## TEAM_BATTING_2B  -0.0199528  0.0090826  -2.197  0.02813 *  
## TEAM_BATTING_3B   0.0461603  0.0168594   2.738  0.00623 ** 
## TEAM_BATTING_HR   0.0701159  0.0096527   7.264 5.15e-13 ***
## TEAM_BATTING_BB   0.0082687  0.0052063   1.588  0.11238    
## TEAM_BATTING_SO  -0.0115965  0.0024963  -4.646 3.59e-06 ***
## TEAM_BASERUN_SB   0.0375287  0.0044513   8.431  < 2e-16 ***
## TEAM_BASERUN_CS  -0.0093903  0.0093628  -1.003  0.31600    
## TEAM_PITCHING_H  -0.0003316  0.0003697  -0.897  0.36989    
## TEAM_PITCHING_BB  0.0013362  0.0035593   0.375  0.70738    
## TEAM_PITCHING_SO  0.0025734  0.0008624   2.984  0.00287 ** 
## TEAM_FIELDING_E  -0.0291932  0.0025105 -11.628  < 2e-16 ***
## TEAM_FIELDING_DP -0.1185404  0.0125576  -9.440  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 12.92 on 2262 degrees of freedom
## Multiple R-squared:  0.3314, Adjusted R-squared:  0.3275 
## F-statistic: 86.23 on 13 and 2262 DF,  p-value: < 2.2e-16

Conclusions based on model:

F-statistic is 86.23, R-squared is 0.3314. Out of the 13 predictors, 9 have statistically significant p-values at the 5% level.
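
As a consistency check, the F-statistic and adjusted R-squared can be recomputed from the multiple R-squared, the number of predictors, and the residual degrees of freedom printed in the Model 1 summary output. The three input values below are copied from that output; small differences from the printed figures come from R-squared being rounded to four decimals.

```r
# Values copied from the Model 1 summary output.
r2       <- 0.3314   # Multiple R-squared
p        <- 13       # number of predictors
df_resid <- 2262     # residual degrees of freedom (n - p - 1)

# F = (R^2 / p) / ((1 - R^2) / (n - p - 1))
f_stat <- (r2 / p) / ((1 - r2) / df_resid)

# Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1), with n - 1 = df_resid + p
adj_r2 <- 1 - (1 - r2) * (df_resid + p) / df_resid

print(c(F = f_stat, adj_r2 = adj_r2))
```

The same two formulas applied to the Model 2 and Model 3 outputs show why the F-statistic rises as predictors are removed even though R-squared falls: the numerator is divided by a smaller p.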

Model 2: Highly Significant Variables Only

Based on Model 1, Model 2 keeps only the highly significant variables, to see whether a smaller set of predictors gives a better model. Variables are chosen by their significance level in the R output: those with p < 0.01 are retained, which drops TEAM_BATTING_2B (p = 0.028) along with the non-significant variables.

## 
## Call:
## lm(formula = TARGET_WINS ~ TEAM_BATTING_H + TEAM_BATTING_3B + 
##     TEAM_BATTING_HR + TEAM_BATTING_SO + TEAM_BASERUN_SB + TEAM_PITCHING_SO + 
##     TEAM_FIELDING_E + TEAM_FIELDING_DP, data = imputed)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -48.374  -8.475   0.172   8.411  59.779 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      35.6500049  4.5345287   7.862 5.80e-15 ***
## TEAM_BATTING_H    0.0414945  0.0026739  15.518  < 2e-16 ***
## TEAM_BATTING_3B   0.0533683  0.0164396   3.246  0.00119 ** 
## TEAM_BATTING_HR   0.0790639  0.0091277   8.662  < 2e-16 ***
## TEAM_BATTING_SO  -0.0134542  0.0023235  -5.791 7.99e-09 ***
## TEAM_BASERUN_SB   0.0403159  0.0038111  10.578  < 2e-16 ***
## TEAM_PITCHING_SO  0.0022997  0.0005851   3.930 8.75e-05 ***
## TEAM_FIELDING_E  -0.0323135  0.0017623 -18.336  < 2e-16 ***
## TEAM_FIELDING_DP -0.1114227  0.0122581  -9.090  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 12.95 on 2267 degrees of freedom
## Multiple R-squared:  0.3269, Adjusted R-squared:  0.3245 
## F-statistic: 137.6 on 8 and 2267 DF,  p-value: < 2.2e-16

Conclusions based on model: F-statistic is 137.6, R-squared is 0.3269.

The F-statistic is higher than in the first model (137.6 vs 86.23), while the R-squared drops slightly (0.3269 vs 0.3314).

Model 3: Backwards Elimination and Significance

Variables will be removed one at a time to find the best-fitting model. After each removal the model is rerun, until the best output (R-squared, F-statistic) is produced. Only the final output is shown. This approach is the counterpart of forward selection; I find it easier to start from the full model and eliminate variables than to add them.
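
For comparison, base R can automate a backward search with step(). Note that this is a different criterion from the manual procedure described above: step() removes terms to minimise AIC rather than judging R-squared and the F-statistic by eye. The sketch runs on simulated data in which only x1 and x2 truly affect the response; all names and values are hypothetical.

```r
set.seed(4)
n <- 300

# Simulated data: x3 and x4 are pure noise.
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
toy$y <- 3 * toy$x1 - 2 * toy$x2 + rnorm(n)

# Start from the full model and let step() drop terms one at a time by AIC.
full    <- lm(y ~ ., data = toy)
reduced <- step(full, direction = "backward", trace = 0)

print(formula(reduced))
```

The strong predictors are retained; whether each noise variable is dropped depends on the sample, which is one reason an automated search and a manual one can end at different models.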

## 
## Call:
## lm(formula = TARGET_WINS ~ TEAM_BATTING_H + TEAM_BASERUN_SB + 
##     TEAM_FIELDING_E + TEAM_BATTING_HR, data = imputed)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -50.610  -8.995  -0.021   8.594  57.952 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      5.077766   2.924654   1.736 0.082665 .  
## TEAM_BATTING_H   0.050216   0.002036  24.669  < 2e-16 ***
## TEAM_BASERUN_SB  0.045517   0.003430  13.270  < 2e-16 ***
## TEAM_FIELDING_E -0.025152   0.001632 -15.408  < 2e-16 ***
## TEAM_BATTING_HR  0.022498   0.006010   3.744 0.000186 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 13.28 on 2271 degrees of freedom
## Multiple R-squared:  0.2902, Adjusted R-squared:  0.2889 
## F-statistic: 232.1 on 4 and 2271 DF,  p-value: < 2.2e-16

Conclusions based on model: F-statistic is 232.1, R-squared is 0.2902. The F-statistic is larger than in either of the other two models, while the R-squared is lower (0.2902 vs 0.3314 and 0.3269).

Select Models

The three models from the previous section have been summarised below. Of the three, I decided to use Model 3 for the predictions. While the first model had the highest R-squared, it had multiple variables that weren’t statistically significant, and some that had multicollinearity issues. The F-statistic in Model 3 is also much higher than in the other two.

A comparison of the multiple linear regression models, based on: mean squared error, root MSE, adjusted R-squared, and F-statistic.

Metric              Model 1            Model 2            Model 3
Mean Squared Error  165.835691578037   166.943210870667   176.04763746818
Root MSE            12.8777207446829   12.9206505591115   13.2682944445841
Adjusted R-squared  0.327522358762679  0.324524368912807  0.28894120788989
F-statistic         86.2317003105261   137.624643676696   232.113536336003
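
The four metrics can be extracted from any fitted lm object as sketched below, using a simulated one-predictor model (the data and object names are hypothetical). The sketch assumes the mean squared error divides the residual sum of squares by n, which is consistent with the table: Model 1's root MSE (12.88) is slightly below its residual standard error (12.92), and the latter divides by the residual degrees of freedom.

```r
set.seed(5)

# Simulated data and a simple fitted model to extract metrics from.
toy <- data.frame(x = rnorm(100))
toy$y <- 2 * toy$x + rnorm(100)
fit <- lm(y ~ x, data = toy)

s    <- summary(fit)
mse  <- mean(residuals(fit)^2)   # divides by n, not by residual df
rmse <- sqrt(mse)

metrics <- c(MSE    = mse,
             RMSE   = rmse,
             adj_r2 = s$adj.r.squared,
             F      = unname(s$fstatistic["value"]))
print(metrics)
```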

Predictions

As with the training data, the evaluation data also needs some prep work. Following the same steps used for the training data, columns have been removed from the evaluation data and NA values imputed using the MICE random forest method.

variable vars n mean sd median trimmed mad min max range skew kurtosis se
TEAM_BATTING_H 1 259 1469.38996 150.65523 1455 1463.68421 114.1602 819 2170 1351 0.5876139 3.6642947 9.361261
TEAM_BATTING_2B 2 259 241.32046 49.51612 239 242.32536 48.9258 44 376 332 -0.3273282 0.6693023 3.076782
TEAM_BATTING_3B 3 259 55.91120 27.14410 52 52.94737 26.6868 14 155 141 0.9790284 0.6987468 1.686652
TEAM_BATTING_HR 4 259 95.63320 56.33221 101 93.67943 66.7170 0 242 242 0.1712363 -0.9031262 3.500313
TEAM_BATTING_BB 5 259 498.95753 120.59215 509 505.98086 94.8864 15 792 777 -0.9209916 2.5265655 7.493232
TEAM_BATTING_SO 6 259 704.83012 240.10056 684 710.75598 260.9376 0 1268 1268 -0.2545376 -0.1626454 14.919123
TEAM_BASERUN_SB 7 259 127.35135 95.85440 95 111.41148 63.7518 0 580 580 1.6916469 3.2195354 5.956103
TEAM_BASERUN_CS 8 259 63.77606 32.41871 56 59.80383 23.7216 0 154 154 1.1259031 0.7866266 2.014401
TEAM_PITCHING_H 9 259 1813.46332 1662.91308 1515 1554.25359 173.4642 1155 22768 21613 9.2764797 102.0702914 103.328391
TEAM_PITCHING_BB 10 259 552.41699 172.95006 526 536.46411 97.8516 136 2008 1872 4.1113772 29.2127324 10.746594
TEAM_PITCHING_SO 11 259 794.44402 614.24483 745 762.10526 249.0768 0 9963 9963 12.8125903 188.7218639 38.167316
TEAM_FIELDING_E 12 259 249.74903 230.90260 163 197.36364 59.3040 73 1568 1495 3.0887263 10.8748551 14.347589
TEAM_FIELDING_DP 13 259 143.12355 26.76108 146 144.24402 23.7216 69 204 135 -0.3703465 -0.0030839 1.662852

After imputing and cleaning the data, the predict function and Model 3 give the following predicted values for the evaluation set, including prediction intervals:

fit lwr upr
66.95672 40.880709 93.03273
67.43342 41.358742 93.50809
75.98284 49.923146 102.04254
89.55465 63.491225 115.61808
69.58667 43.511302 95.66204
72.19661 46.126149 98.26707
82.00799 55.916172 108.09980
75.53983 49.476875 101.60279
69.96007 43.883004 96.03713
74.27298 48.205350 100.34060
75.74324 49.675657 101.81082
82.30221 56.240178 108.36423
78.53395 52.464311 104.60358
80.43735 54.371598 106.50311
78.89742 52.839012 104.95583
79.45161 53.391886 105.51134
73.08235 47.018533 99.14617
82.03673 55.975953 108.09750
68.24615 42.170190 94.32211
92.06882 65.996037 118.14159
81.83769 55.766774 107.90861
86.03648 59.970180 112.10278
77.72201 51.662311 103.78171
73.72481 47.660960 99.78867
86.03458 59.975694 112.09347
89.84519 63.781684 115.90869
54.45751 28.258865 80.65615
76.68335 50.615363 102.75134
82.32863 56.252081 108.40518
78.42227 52.345308 104.49924
86.85597 60.794554 112.91740
84.26294 58.204269 110.32160
82.25922 56.199494 108.31894
83.03982 56.976409 109.10322
79.57505 53.514985 105.63512
80.80982 54.735095 106.88455
75.36312 49.303059 101.42318
87.93801 61.859696 114.01633
86.09443 60.031839 112.15703
87.52757 61.457160 113.59798
82.10938 56.049490 108.16927
87.27549 61.213164 113.33782
29.31669 2.902522 55.73085
99.84551 73.668177 126.02284
90.35918 64.254072 116.46428
90.49413 64.408260 116.58000
96.47838 70.381850 122.57490
73.44007 47.376738 99.50341
69.44328 43.368647 95.51792
76.74061 50.675700 102.80552
79.96191 53.898538 106.02529
87.88202 61.810600 113.95344
77.30164 51.242060 103.36123
73.11018 47.045985 99.17438
78.19611 52.140776 104.25144
79.36140 53.303992 105.41881
88.73278 62.649811 114.81575
73.73578 47.655823 99.81574
61.95676 35.855734 88.05778
77.46812 51.399177 103.53706
86.65506 60.581039 112.72908
76.28141 50.209251 102.35357
86.03124 59.971479 112.09101
85.92691 59.835900 112.01792
86.26173 60.186929 112.33653
100.09273 73.954455 126.23101
75.04732 48.987807 101.10683
83.43234 57.361880 109.50281
78.53441 52.453831 104.61500
84.91823 58.837869 110.99858
84.91352 58.826463 111.00057
77.22193 51.146685 103.29718
79.48407 53.423623 105.54452
83.71259 57.626735 109.79844
85.51554 59.452696 111.57839
86.53136 60.467170 112.59555
81.32469 55.265711 107.38366
82.52369 56.461400 108.58598
71.65740 45.582676 97.73212
78.31456 52.241959 104.38717
87.34317 61.275576 113.41077
89.36368 63.292299 115.43507
96.90586 70.815208 122.99651
81.69814 55.631485 107.76479
80.59007 54.529325 106.65081
82.29506 56.237640 108.35249
79.32640 53.264432 105.38837
82.63104 56.573959 108.68811
84.52398 58.467118 110.58085
90.10467 64.029770 116.17956
79.07110 52.998397 105.14381
86.68059 60.455585 112.90560
73.05495 46.991059 99.11884
82.62380 56.552084 108.69552
84.66098 58.580191 110.74176
79.75951 53.687218 105.83181
82.82420 56.751667 108.89673
96.52888 70.425919 122.63184
86.66035 60.579720 112.74097
89.01628 62.946297 115.08626
83.13716 57.068397 109.20593
72.98590 46.921757 99.05005
83.76381 57.699717 109.82790
78.33177 52.267473 104.39607
81.18666 55.111167 107.26215
73.36628 47.293968 99.43859
49.90414 23.762095 76.04619
83.63799 57.577470 109.69851
84.01010 57.943853 110.07635
61.22378 35.135283 87.31228
82.74785 56.691415 108.80428
87.10839 61.045000 113.17178
94.54157 68.472284 120.61086
91.40280 65.339931 117.46568
83.96104 57.905471 110.01662
83.19022 57.133738 109.24670
91.19839 65.134637 117.26214
82.38539 56.326918 108.44386
79.74267 53.684993 105.80034
77.08037 50.974567 103.18616
89.55816 63.479145 115.63718
66.89562 40.810659 92.98059
66.68533 40.606158 92.76450
59.89504 33.794455 85.99562
68.72900 42.650953 94.80704
87.58099 61.505190 113.65678
88.79357 62.705353 114.88179
74.93689 48.872560 101.00121
87.80374 61.740343 113.86714
93.40481 67.324368 119.48525
86.47473 60.404772 112.54469
80.62942 54.565745 106.69309
79.50166 53.441811 105.56151
85.08547 59.024124 111.14683
85.26432 59.191920 111.33672
62.06705 35.953815 88.18029
76.44605 50.386822 102.50527
79.24505 53.185184 105.30491
90.50510 64.427964 116.58223
82.48411 56.427745 108.54047
67.22321 41.144362 93.30205
69.59687 43.517865 95.67588
91.83087 65.751538 117.91021
76.68432 50.619638 102.74900
72.84124 46.768482 98.91399
72.24085 46.174479 98.30722
78.41507 52.355214 104.47492
81.67418 55.615680 107.73268
84.76441 58.704550 110.82427
81.29038 55.233154 107.34760
82.81827 56.748320 108.88823
84.25666 58.199321 110.31400
45.66251 19.336215 71.98881
73.75445 47.692176 99.81672
77.06106 51.002189 103.11992
76.20099 50.139810 102.26216
87.37396 61.298194 113.44973
75.66213 49.550989 101.77328
87.38062 61.307755 113.45348
72.24133 46.170628 98.31203
96.42067 70.336372 122.50496
99.59181 73.505118 125.67851
86.70641 60.644960 112.76786
97.93392 71.840897 124.02695
89.68816 63.615194 115.76112
84.51716 58.453008 110.58131
82.21262 56.153039 108.27219
81.74794 55.689434 107.80644
77.01756 50.953610 103.08151
82.89366 56.833710 108.95361
88.25147 62.172918 114.33003
85.95470 59.883617 112.02578
78.30696 52.242821 104.37109
90.77680 64.696470 116.85714
81.67735 55.618968 107.73573
73.76752 47.698114 99.83693
74.94489 48.883996 101.00579
75.24340 49.183430 101.30336
73.93674 47.872526 100.00096
79.81539 53.756706 105.87407
86.24414 60.137680 112.35061
84.26461 58.188029 110.34119
85.83620 59.776159 111.89623
82.03095 55.957902 108.10400
89.79544 63.594426 115.99646
99.90248 73.702781 126.10217
87.88728 61.806310 113.96826
54.12375 27.960332 80.28716
65.66877 39.580342 91.75720
114.16115 87.985760 140.33655
70.43420 44.367103 96.50129
79.74326 53.675492 105.81102
78.86134 52.804029 104.91866
81.32947 55.262172 107.39677
84.45924 58.385296 110.53318
70.30220 44.234476 96.36993
77.16646 51.109767 103.22315
76.07628 50.015294 102.13726
75.62430 49.564447 101.68416
82.26294 56.204874 108.32101
76.20941 50.148153 102.27067
80.12921 54.071309 106.18711
73.43959 47.376662 99.50252
87.42575 61.367135 113.48436
81.53604 55.480086 107.59200
78.46754 52.408001 104.52708
81.64758 55.590331 107.70482
79.03403 52.977130 105.09092
81.39617 55.326007 107.46633
71.92309 45.849000 97.99718
101.09759 74.989901 127.20529
90.66562 64.585981 116.74526
78.83789 52.778269 104.89751
67.95195 41.881498 94.02240
70.56938 44.501759 96.63700
85.04454 58.980861 111.10822
83.98917 57.931260 110.04708
94.09180 68.013078 120.17052
79.33739 53.280667 105.39410
77.35790 51.299370 103.41643
81.51432 55.457818 107.57082
81.25195 55.194345 107.30956
85.07674 59.015367 111.13811
80.70677 54.648898 106.76464
87.91608 61.642170 114.18999
74.72902 48.668669 100.78937
81.52271 55.466760 107.57867
81.98110 55.919767 108.04243
80.87262 54.816372 106.92886
80.22462 54.122673 106.32657
74.33703 48.261582 100.41247
92.23486 66.163982 118.30574
78.01514 51.955724 104.07455
85.88725 59.815497 111.95900
77.61229 51.554755 103.66983
73.71032 47.648306 99.77234
84.35541 58.300515 110.41030
77.39786 51.337404 103.45832
85.36315 59.295021 111.43128
72.92169 46.855820 98.98756
87.34318 61.279314 113.40704
85.83924 59.771728 111.90676
84.76841 58.695736 110.84108
86.98948 60.930192 113.04877
65.75522 39.670702 91.83973
90.00589 63.942093 116.06969
81.40150 55.345952 107.45704
84.94114 58.875515 111.00676
73.87934 47.814991 99.94370
88.89702 62.822163 114.97189
82.98967 56.923477 109.05586
60.57565 34.445048 86.70626
90.93710 64.853848 117.02035
40.90990 14.593604 67.22620
69.77056 43.701874 95.83924
74.83523 48.775536 100.89493
82.52300 56.460288 108.58572
85.00544 58.937830 111.07306
80.52329 54.460150 106.58642
##       fit              lwr              upr        
##  Min.   : 29.32   Min.   : 2.903   Min.   : 55.73  
##  1st Qu.: 76.14   1st Qu.:50.078   1st Qu.:102.20  
##  Median : 81.54   Median :55.480   Median :107.59  
##  Mean   : 80.53   Mean   :54.454   Mean   :106.61  
##  3rd Qu.: 86.03   3rd Qu.:59.971   3rd Qu.:112.09  
##  Max.   :114.16   Max.   :87.986   Max.   :140.34
##        1 
## 81.07535
##        fit      lwr      upr
## 1 81.07535 55.02037 107.1303
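
Output with fit, lwr and upr columns is what predict() returns for an lm object when interval = "prediction" is requested. The sketch below shows the call on a simulated model; the data, the single predictor, and the 95% level are assumptions for illustration, not the report's actual inputs.

```r
set.seed(6)

# Simulated training data with roughly the same noise level as Model 3.
train <- data.frame(x = rnorm(200))
train$y <- 80 + 5 * train$x + rnorm(200, sd = 13)
fit <- lm(y ~ x, data = train)

# New observations to predict; columns must match the model's predictors.
newdata <- data.frame(x = c(-1, 0, 1))

# Returns a matrix with the point prediction and the interval bounds.
pred <- predict(fit, newdata = newdata, interval = "prediction", level = 0.95)
print(pred)
```

A prediction interval covers a single new observation, so it is much wider than a confidence interval for the mean; with a residual standard error near 13 wins, a half-width of about 26 wins is what one would expect.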
