Try to develop a model that predicts total winnings using the same techniques we just reviewed:

- Check assumptions: linear relationship between the dependent and independent variables (plots), normal predictor variables (skewness, Shapiro), non-correlation (VIF), and consistent errors (ncvTest)
- Create a model using lm()
- Use the model to create predicted values and, if you have time, compute an RMSE

We are going to use data located on a website here: http://www.stat.ufl.edu/~winner/data/pga2004.dat with information on the dataset located here: http://www.stat.ufl.edu/~winner/data/pga2004.txt

Use the data.table package to get your data into R, then develop an initial model.
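As a reminder of what the checklist looks like in code, here is a minimal sketch run on R's built-in mtcars data rather than the PGA data. It uses base R only: the skewness and VIF helpers (`skew`, `vif_manual`) are hypothetical hand-rolled stand-ins, and the residual plot is only a visual substitute for ncvTest(). If the car package is installed, `car::vif(fit)` and `car::ncvTest(fit)` are the packaged versions of the last two checks. The choice of mpg, wt, hp and qsec is arbitrary.

```r
# Assumption checks sketched on mtcars (NOT the PGA data); base R only.
dat <- mtcars[, c("mpg", "wt", "hp", "qsec")]
predictors <- dat[, -1]

# 1. Linearity: scatterplot matrix of the response and the predictors.
pairs(dat)

# 2. Normality of each predictor: sample skewness and Shapiro-Wilk p-value.
skew <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}
skews <- sapply(predictors, skew)
shapiro_p <- sapply(predictors, function(x) shapiro.test(x)$p.value)

# 3. Non-correlation: VIF_j = 1 / (1 - R^2_j), where R^2_j comes from
#    regressing predictor j on the remaining predictors.
vif_manual <- function(X) {
  sapply(names(X), function(v) {
    f <- reformulate(setdiff(names(X), v), response = v)
    1 / (1 - summary(lm(f, data = X))$r.squared)
  })
}
vifs <- vif_manual(predictors)

# 4. Consistent errors: fit the model and plot residuals against fitted values.
#    A funnel shape would suggest non-constant variance.
fit <- lm(mpg ~ wt + hp + qsec, data = dat)
plot(fitted(fit), resid(fit), xlab = "Fitted", ylab = "Residuals")
abline(h = 0, lty = 2)

skews
shapiro_p
vifs
```

The same four steps carry over to the PGA data once it is loaded and cleaned, with TotalWin as the response.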

library(data.table)
# fill = TRUE pads rows that have fewer fields with NA instead of stopping
pgadata <- fread("http://www.stat.ufl.edu/~winner/data/pga2004.dat", fill = TRUE)
head(pgadata)
##       V1         V2 V3    V4   V5   V6    V7   V8  V9 V10     V11    V12
## 1: Aaron   Baddeley 23 288.0 53.1 58.2 1.767 50.9 123  27  632878  23440
## 2:  Adam      Scott 24 295.4 57.7 65.6 1.757 59.3   7  16 3724984 232812
## 3:  Alex      Cejka 34 285.8 64.2 63.8 1.795 50.7  54  24 1313484  54729
## 4: Andre      Stolz 34 297.9 59.0 63.0 1.787 47.7 101  20  808373  40419
## 5: Arjun      Atwal 31 289.4 60.5 62.5 1.766 43.5 146  30  486053  16202
## 6: Arron Oberholser 29 284.6 68.8 67.0 1.780 50.9  52  23 1355433  58932
##    V13
## 1:  NA
## 2:  NA
## 3:  NA
## 4:  NA
## 5:  NA
## 6:  NA
str(pgadata)
## Classes 'data.table' and 'data.frame':   196 obs. of  13 variables:
##  $ V1 : chr  "Aaron" "Adam" "Alex" "Andre" ...
##  $ V2 : chr  "Baddeley" "Scott" "Cejka" "Stolz" ...
##  $ V3 : chr  "23" "24" "34" "34" ...
##  $ V4 : num  288 295 286 298 289 ...
##  $ V5 : num  53.1 57.7 64.2 59 60.5 68.8 74.2 64.4 64.3 62.6 ...
##  $ V6 : num  58.2 65.6 63.8 63 62.5 67 68.9 64.2 63.4 65.3 ...
##  $ V7 : num  1.77 1.76 1.79 1.79 1.77 ...
##  $ V8 : num  50.9 59.3 50.7 47.7 43.5 50.9 40.4 53.8 42.2 47.7 ...
##  $ V9 : num  123 7 54 101 146 52 80 75 141 83 ...
##  $ V10: int  27 16 24 20 30 23 23 27 20 15 ...
##  $ V11: int  632878 3724984 1313484 808373 486053 1355433 962167 1036958 500818 943589 ...
##  $ V12: int  23440 232812 54729 40419 16202 58932 41833 38406 25041 62906 ...
##  $ V13: int  NA NA NA NA NA NA NA NA NA NA ...
##  - attr(*, ".internal.selfref")=<externalptr>
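Note that str() reports V3 (the age column) as character, which usually means at least one entry could not be read as a number. A quick way to find such entries is sketched below on a hypothetical vector (`age_raw`); the last value is made up for illustration, since the actual offending values in the file are not shown here. With the real data, the vector would be `pgadata$V3`.

```r
# Hypothetical stand-in for pgadata$V3; "n/a" is an invented bad value.
age_raw <- c("23", "24", "34", "34", "31", "29", "n/a")

# Convert, silencing the coercion warning, then locate entries that
# became NA only because of the conversion.
age_num <- suppressWarnings(as.numeric(age_raw))
bad <- which(is.na(age_num) & !is.na(age_raw))

age_raw[bad]  # the entries that block numeric conversion
```

Inspecting those rows in full often shows whether the problem is a single bad value or a whole row whose fields are misaligned.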
# Assign descriptive column names (the 13th column is NA in the rows shown)
names(pgadata) <- c("First", "Last", "Age", "AveDrive", "DriveAccur", "GreensReg",
                    "AvePuts", "Save%", "MoneyRank", "NoEvents", "TotalWin", "AverWin", "NA")
head(pgadata)
##    First       Last Age AveDrive DriveAccur GreensReg AvePuts Save%
## 1: Aaron   Baddeley  23    288.0       53.1      58.2   1.767  50.9
## 2:  Adam      Scott  24    295.4       57.7      65.6   1.757  59.3
## 3:  Alex      Cejka  34    285.8       64.2      63.8   1.795  50.7
## 4: Andre      Stolz  34    297.9       59.0      63.0   1.787  47.7
## 5: Arjun      Atwal  31    289.4       60.5      62.5   1.766  43.5
## 6: Arron Oberholser  29    284.6       68.8      67.0   1.780  50.9
##    MoneyRank NoEvents TotalWin AverWin NA
## 1:       123       27   632878   23440 NA
## 2:         7       16  3724984  232812 NA
## 3:        54       24  1313484   54729 NA
## 4:       101       20   808373   40419 NA
## 5:       146       30   486053   16202 NA
## 6:        52       23  1355433   58932 NA
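From here, one possible path to an initial model is sketched below. To stay self-contained, it rebuilds only the six rows printed by head() above as a stand-in (`pga`); with the real data you would skip that step and work on pgadata directly. The cleanup choices (dropping the "NA" column, renaming "Save%" to the hypothetical name `SavePct`, converting Age to numeric) and the choice of GreensReg and AvePuts as predictors are assumptions for illustration, not a recommended final model. A fit on six rows is only meant to show the mechanics.

```r
# Stand-in for pgadata: the six rows shown by head(). Replace with pgadata.
pga <- data.frame(
  Last      = c("Baddeley", "Scott", "Cejka", "Stolz", "Atwal", "Oberholser"),
  Age       = c("23", "24", "34", "34", "31", "29"),  # character, as in str()
  AveDrive  = c(288.0, 295.4, 285.8, 297.9, 289.4, 284.6),
  GreensReg = c(58.2, 65.6, 63.8, 63.0, 62.5, 67.0),
  AvePuts   = c(1.767, 1.757, 1.795, 1.787, 1.766, 1.780),
  `Save%`   = c(50.9, 59.3, 50.7, 47.7, 43.5, 50.9),
  TotalWin  = c(632878, 3724984, 1313484, 808373, 486053, 1355433),
  check.names = FALSE
)
pga[["NA"]] <- NA  # mimic the empty 13th column

# Cleanup: drop the empty column, use a syntactic name, make Age numeric.
pga[["NA"]] <- NULL
names(pga)[names(pga) == "Save%"] <- "SavePct"
pga$Age <- as.numeric(pga$Age)

# Initial model: total winnings on two illustrative predictors.
fit <- lm(TotalWin ~ GreensReg + AvePuts, data = pga)

# Predicted values and in-sample RMSE (root mean squared error).
pga$Predicted <- predict(fit, newdata = pga)
rmse <- sqrt(mean((pga$TotalWin - pga$Predicted)^2))
rmse
```

Because the RMSE here is computed on the same rows used to fit the model, it will understate the error on new players; holding out part of the data gives a more honest figure.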