Untitled

Introduction

In this paper I attmept to create a model to predict team wins in MLB. My model will make these predictions on a train and eval dataset of real life baseball team yearly statistics.
Through data exploration, I realized that there are several data entry mistakes in these dataframes.
- Several entries of 0 which are impossible
- Historically inaccurate entries. Further there is no way to verify if the historically incorrect values are corrupt through the entire observation(all stats), or just for the singular column entry.
  - To identify and handle these misentries, I provide links to historical MLB records and adjust the data accordingly.
  - Unfortunately a problem may arise from this on the prediction side of things. Being that I have filtered for wins (min-20,max- 116), the model may miss on any test indexes that would need to predict outside that range. As this is an entry level attempt to create a linear model, hopefully this does not end up hurting the model to badly.
The paper proceeds by using the exploratory information gained to perform some intial data transformations. I break the categories into 3 areas
- Batting
- Baserunning
- Pithcing
I explore these categories, filter them as i see best, and attempt to visualize and dispaly their correlation to our target variable(Wins)
This project highlights a major concern with observational studies. I came to the realization that many of the pitching statistics must have been incorrectly entered into the dataframe. There are an outright 780 entries which are duplicate values of the batting statistics, but the correlations in some of the pitching fields do not make sense either. I will guide you through the data which brought me to this realization.

Data Exploration

## Warning: package 'stringr' was built under R version 3.4.4

## Warning: package 'corrplot' was built under R version 3.4.4

General data observations of Entire Dataset

	vars	n	mean	sd	median	trimmed	mad	min	max	range	skew	kurtosis	se
INDEX	1	2276	1268.46353	736.34904	1270.5	1268.56970	952.5705	1	2535	2534	0.0042149	-1.2167564	15.4346788
TARGET_WINS	2	2276	80.79086	15.75215	82.0	81.31229	14.8260	0	146	146	-0.3987232	1.0274757	0.3301823
TEAM_BATTING_H	3	2276	1469.26977	144.59120	1454.0	1459.04116	114.1602	891	2554	1663	1.5713335	7.2785261	3.0307891
TEAM_BATTING_2B	4	2276	241.24692	46.80141	238.0	240.39627	47.4432	69	458	389	0.2151018	0.0061609	0.9810087
TEAM_BATTING_3B	5	2276	55.25000	27.93856	47.0	52.17563	23.7216	0	223	223	1.1094652	1.5032418	0.5856226
TEAM_BATTING_HR	6	2276	99.61204	60.54687	102.0	97.38529	78.5778	0	264	264	0.1860421	-0.9631189	1.2691285
TEAM_BATTING_BB	7	2276	501.55888	122.67086	512.0	512.18331	94.8864	0	878	878	-1.0257599	2.1828544	2.5713150
TEAM_BATTING_SO	8	2174	735.60534	248.52642	750.0	742.31322	284.6592	0	1399	1399	-0.2978001	-0.3207992	5.3301912
TEAM_BASERUN_SB	9	2145	124.76177	87.79117	101.0	110.81188	60.7866	0	697	697	1.9724140	5.4896754	1.8955584
TEAM_BASERUN_CS	10	1504	52.80386	22.95634	49.0	50.35963	17.7912	0	201	201	1.9762180	7.6203818	0.5919414
TEAM_BATTING_HBP	11	191	59.35602	12.96712	58.0	58.86275	11.8608	29	95	66	0.3185754	-0.1119828	0.9382681
TEAM_PITCHING_H	12	2276	1779.21046	1406.84293	1518.0	1555.89517	174.9468	1137	30132	28995	10.3295111	141.8396985	29.4889618
TEAM_PITCHING_HR	13	2276	105.69859	61.29875	107.0	103.15697	74.1300	0	343	343	0.2877877	-0.6046311	1.2848886
TEAM_PITCHING_BB	14	2276	553.00791	166.35736	536.5	542.62459	98.5929	0	3645	3645	6.7438995	96.9676398	3.4870317
TEAM_PITCHING_SO	15	2174	817.73045	553.08503	813.5	796.93391	257.2311	0	19278	19278	22.1745535	671.1891292	11.8621151
TEAM_FIELDING_E	16	2276	246.48067	227.77097	159.0	193.43798	62.2692	65	1898	1833	2.9904656	10.9702717	4.7743279
TEAM_FIELDING_DP	17	1990	146.38794	26.22639	149.0	147.57789	23.7216	52	228	176	-0.3889390	0.1817397	0.5879114

Our Training dataset contains 2276 different MLB teams yearly results
18 Total categories
- Numeric index representing each team
- Category tracking Wins(Our target category)
- 7 different batting statistics
- 2 base running statistics
- 4 pitching statistics
- 2 fielding statistics
Our testing dataset has 259 observations and has all of the same variables except for our target variable(wins)
- This leaves us with an issue. Below I filter our test data for modern era historical records. This allows us to correct many of the outliers that are present within the data.

Explore target variable Wins

	vars	n	mean	sd	median	trimmed	mad	min	max	range	skew	kurtosis	se
X1	1	2276	80.79086	15.75215	82	81.31229	14.826	0	146	146	-0.3987232	1.027476	0.3301823

Our mean and median are of little practical value. 162 games in a season, so we should expect to see 81 wins on average given our total sample size.
Although not exact as seasons since 1800’s havent all been 162 games, but my interval test shows a left tail interval of 80.35-81 which our 80.79 mean falls within confidently
While our average wins seem fine, our min and max seem incorrect
- The most wins by an mlb team ever was 116
- The least wins by an MLB team in the modern era was 20

Remove outliers discovered in Wins category

	vars	n	mean	sd	median	trimmed	mad	min	max	range	skew	kurtosis	se
X1	1	2255	80.58537	15.05826	82	81.18726	14.826	21	116	95	-0.4395385	0.3152447	0.3171039

The mean of our sample moved further from our expected population mean when we removed incorrect outliers, however it is still within the Confidence interval. We lost 35 observations when removing these data entries which suffered from some sort of data entry mistake. This does raise concerns as to the vailidity of all the entires, however, with no way to confirm index to specific teams lets continue

View target variable distribution before and after filters

The median and mean are around 82 and the histogram looks normal, however there is somewhat of a left tail. We can assume our target variable is normally distributed

Descriptive Look at batting statistics

Create and view summary of batting DF

	vars	n	mean	sd	median	trimmed	mad	min	max	range	skew	kurtosis	se
TEAM_BATTING_H	1	2255	1466.81685	135.92839	1454	1458.24377	114.1602	992	2496	1504	1.2098420	4.8627441	2.8624434
TEAM_BATTING_2B	2	2255	241.09490	46.28571	238	240.29695	47.4432	69	458	389	0.2012387	-0.0427041	0.9747061
TEAM_BATTING_3B	3	2255	54.93348	27.63678	47	51.89695	23.7216	0	223	223	1.1205339	1.5849716	0.5819881
TEAM_BATTING_HR	4	2255	100.22395	60.39496	103	98.07867	77.0952	0	264	264	0.1771368	-0.9584953	1.2718252
TEAM_BATTING_BB	5	2255	503.80089	119.41435	513	513.29640	93.4038	29	878	849	-0.9614819	2.1131941	2.5146829
TEAM_BATTING_SO	6	2155	739.51323	244.66509	753	744.96464	284.6592	0	1399	1399	-0.2526910	-0.4140827	5.2704582
TEAM_BATTING_HBP	7	191	59.35602	12.96712	58	58.86275	11.8608	29	95	66	0.3185754	-0.1119828	0.9382681
Singles	8	2255	1070.56452	121.79442	1049	1058.05319	97.8516	709	2112	1403	1.7574832	6.7989398	2.5648036

Fig 1&2

##          TEAM_BATTING_H     TEAM_BATTING_2B    TEAM_BATTING_3B   
## breaks   Integer,17         Integer,10         Numeric,13        
## counts   Integer,16         Integer,9          Integer,12        
## density  Numeric,16         Numeric,9          Numeric,12        
## mids     Numeric,16         Numeric,9          Numeric,12        
## xname    "dots[[1L]][[1L]]" "dots[[1L]][[2L]]" "dots[[1L]][[3L]]"
## equidist TRUE               TRUE               TRUE              
##          TEAM_BATTING_HR    TEAM_BATTING_BB    TEAM_BATTING_SO   
## breaks   Numeric,15         Numeric,19         Numeric,15        
## counts   Integer,14         Integer,18         Integer,14        
## density  Numeric,14         Numeric,18         Numeric,14        
## mids     Numeric,14         Numeric,18         Numeric,14        
## xname    "dots[[1L]][[4L]]" "dots[[1L]][[5L]]" "dots[[1L]][[6L]]"
## equidist TRUE               TRUE               TRUE              
##          TEAM_BATTING_HBP   Singles           
## breaks   Integer,9          Integer,16        
## counts   Integer,8          Integer,15        
## density  Numeric,8          Numeric,15        
## mids     Numeric,8          Numeric,15        
## xname    "dots[[1L]][[7L]]" "dots[[1L]][[8L]]"
## equidist TRUE               TRUE

##       TEAM_BATTING_H TEAM_BATTING_2B TEAM_BATTING_3B TEAM_BATTING_HR
## stats Numeric,5      Numeric,5       Numeric,5       Numeric,5      
## n     2255           2255            2255            2255           
## conf  Numeric,2      Numeric,2       Numeric,2       Numeric,2      
## out   Numeric,61     Numeric,11      Numeric,26      Numeric,0      
## group Numeric,61     Numeric,11      Numeric,26      Numeric,0      
## names ""             ""              ""              ""             
##       TEAM_BATTING_BB TEAM_BATTING_SO TEAM_BATTING_HBP Singles   
## stats Numeric,5       Numeric,5       Numeric,5        Numeric,5 
## n     2255            2155            191              2255      
## conf  Numeric,2       Numeric,2       Numeric,2        Numeric,2 
## out   Numeric,120     Numeric,0       95               Numeric,70
## group Numeric,120     Numeric,0       1                Numeric,70
## names ""              ""              ""               ""

Triples,hits,walks, and singles seem to have a large amount of outliers.
- strikeouts and Homeruns are the only variables with no outliers
All of the means and medians seem very close to each other
- However, the sd seems rather high in certain categories relative to the mean. Without expert knowledge, this could be normal
- Some Min values in these categories are 0, which is impossible.
  - Exploring these categories, there are less than 10 or so 0's per category. Given that the other categories in these observations seem in line, some sort of data entry mistake was likely made again. In this case, lets replace the 0’s with NA.

Data manipulation of batting statistics

Replace 0 with NA’s

## Warning in `[<-.data.frame`(`*tmp*`, , c(3:8, 11), value = structure(list(:
## provided 8 variables to replace 7 variables

	vars	n	mean	sd	median	trimmed	mad	min	max	range	skew	kurtosis	se
TEAM_BATTING_H	1	2255	1466.81685	135.92839	1454.0	1458.24377	114.1602	992	2496	1504	1.2098420	4.8627441	2.8624434
TEAM_BATTING_2B	2	2255	241.09490	46.28571	238.0	240.29695	47.4432	69	458	389	0.2012387	-0.0427041	0.9747061
TEAM_BATTING_3B	3	2254	54.95785	27.61866	47.0	51.91075	23.7216	8	223	215	1.1240833	1.5881365	0.5817357
TEAM_BATTING_HR	4	2244	100.71524	60.13266	104.0	98.55679	75.6126	3	264	261	0.1785408	-0.9565209	1.2694014
TEAM_BATTING_BB	5	2255	503.80089	119.41435	513.0	513.29640	93.4038	29	878	849	-0.9614819	2.1131941	2.5146829
TEAM_BATTING_SO	6	2140	744.69673	237.52652	756.5	747.73598	283.9179	67	1399	1332	-0.1320162	-0.7184334	5.1345834
TEAM_BATTING_HBP	7	191	59.35602	12.96712	58.0	58.86275	11.8608	29	95	66	0.3185754	-0.1119828	0.9382681
Singles	8	2255	1070.56452	121.79442	1049.0	1058.05319	97.8516	709	2112	1403	1.7574832	6.7989398	2.5648036

Fig 3

NA counts
	na_count
INDEX	0
TARGET_WINS	0
TEAM_BATTING_H	0
TEAM_BATTING_2B	0
TEAM_BATTING_3B	1
TEAM_BATTING_HR	11
TEAM_BATTING_BB	0
TEAM_BATTING_SO	115
TEAM_BASERUN_SB	121
TEAM_BASERUN_CS	757
TEAM_BATTING_HBP	2064
TEAM_PITCHING_H	0
TEAM_PITCHING_HR	0
TEAM_PITCHING_BB	0
TEAM_PITCHING_SO	100
TEAM_FIELDING_E	0
TEAM_FIELDING_DP	271
Singles	0

Examine historical record to detect incorrect outliers

Triples
- max-153
- min- 11
Homerun records
- This link provides some good evidence that our assumptions so far were correct. The least HR recorded ever were 3, and the most were 264, which matches with our dataset
Doubles records
- Total doubles team record is 376
- min is 110
hits records
- Hits record is 1,783,
Base on Balls
- min- 282
- max- 835
Singles
- min- 811
- max- 1,338

Filter for historical records

Applied filters print in code blocks below
NA values in all batting predictor columns except strikeouts are discarded with these filters
- Looking at fig 3 above, there are not many of these NA values, as such I felt like they should just be discarded as a precaution
For strikeouts there were 115 NA values, To not lose all these observations I chose to fill these NA values with the median of Strikeouts. The distribution for Strikeouts seems normal, although median is higher, so I think mean or median would be an adequate choice for NA replacement

## Change in batting dataset
batting <- batting %>% 
    filter(TEAM_BATTING_3B < 154  &
           TEAM_BATTING_3B > 10   &
           TEAM_BATTING_2B < 377  &
           TEAM_BATTING_2B > 109  &   
           TEAM_BATTING_H  < 1784 &
           TEAM_BATTING_BB > 281  &
           TEAM_BATTING_BB < 836  &
           Singles         < 1339 &
           Singles         > 810   )
# FILL Strikeout NA's with median value
batting$TEAM_BATTING_SO[is.na(batting$TEAM_BATTING_SO)] <- median(batting$TEAM_BATTING_SO, na.rm=TRUE)    

## Change in overall dataset's

money_ball_train_2 <- money_ball_train_2 %>% 
       filter(TEAM_BATTING_3B < 154  &
           TEAM_BATTING_3B > 10   &
           TEAM_BATTING_2B < 377  &
           TEAM_BATTING_2B > 109  &   
           TEAM_BATTING_H  < 1784 &
           TEAM_BATTING_BB > 281  &
           TEAM_BATTING_BB < 836  &
           Singles         < 1339 &
           Singles         > 810   )
# FILL Strikeout NA's with median value
money_ball_train_2$TEAM_BATTING_SO[is.na(money_ball_train_2$TEAM_BATTING_SO)] <- median(money_ball_train_2$TEAM_BATTING_SO, na.rm=TRUE)

money_eval <- money_eval %>% 
    filter(TEAM_BATTING_3B < 154  &
           TEAM_BATTING_3B > 10   &
           TEAM_BATTING_2B < 377  &
           TEAM_BATTING_2B > 109  &   
           TEAM_BATTING_H  < 1784 &
           TEAM_BATTING_BB > 281  &
           TEAM_BATTING_BB < 836  &
           Singles         < 1339 &
           Singles         > 810   )
# FILL Strikeout NA's with median value
money_eval$TEAM_BATTING_SO[is.na(money_eval$TEAM_BATTING_SO)] <- median(money_eval$TEAM_BATTING_SO, na.rm=TRUE)

Figure 3 : Reexamine boxplot post data manipulation

##       TEAM_BATTING_H TEAM_BATTING_2B TEAM_BATTING_3B TEAM_BATTING_HR
## stats Numeric,5      Numeric,5       Numeric,5       Numeric,5      
## n     2092           2092            2092            2092           
## conf  Numeric,2      Numeric,2       Numeric,2       Numeric,2      
## out   Numeric,15     Numeric,6       Numeric,29      Numeric,0      
## group Numeric,15     Numeric,6       Numeric,29      Numeric,0      
## names ""             ""              ""              ""             
##       TEAM_BATTING_BB TEAM_BATTING_SO TEAM_BATTING_HBP Singles   
## stats Numeric,5       Numeric,5       Numeric,5        Numeric,5 
## n     2092            2092            191              2092      
## conf  Numeric,2       Numeric,2       Numeric,2        Numeric,2 
## out   Numeric,12      Numeric,0       95               Numeric,15
## group Numeric,12      Numeric,0       1                Numeric,15
## names ""              ""              ""               ""

Our data set has been trimmed from 2241 to 2090
Looking at fig 4 as compared to fig 2, we can see many of the outliers from singles, hits,bb from fig 2 were removed
- Category Triples seems to still have many outliers
- There are some small outliers present in other categories still as well

Batting statistics overall correlations

HBP has the least correlations, It can likley be safely discarded
3b are heavily negatively correlated with HR and strikeouts
Homeruns and strikeouts share the largest correlation(positive), of all batting statistics

How do batting statistics correlate with our target(Wins)?

There aren’t really any strong predictors.
- Relative to the availble predictors, Hits, Doubles, and Homeruns ahve the strongest correlation with target vairable(wins).
- Strikeouts and HBP stand out as having little to no correlation with wins

Stolen bases

Edit stolen bases via historical records and observe correlations

Stolen bases and Caught stealing
fewest stolen bases=13
most stolen bases= 415
fewest caught stealing=8
most caught stelaing=191
Caught stealing has a substantial amount of NA’s. IF we filter out NA, we will lose nearly 1/3 of our data. Therefore I chose to use the median for all caught stealing==NA fields

NA counts
	na_count
INDEX	0
TARGET_WINS	0
TEAM_BATTING_H	0
TEAM_BATTING_2B	0
TEAM_BATTING_3B	0
TEAM_BATTING_HR	0
TEAM_BATTING_BB	0
TEAM_BATTING_SO	0
TEAM_BASERUN_SB	0
TEAM_BASERUN_CS	0
TEAM_BATTING_HBP	1853
TEAM_PITCHING_H	0
TEAM_PITCHING_HR	0
TEAM_PITCHING_BB	0
TEAM_PITCHING_SO	100
TEAM_FIELDING_E	0
TEAM_FIELDING_DP	128
Singles	0

Baserunning doesn’t appear to have much correlation with our target variable

Pitching categories

Display pitching categories

pitching <- money_ball_train_2 %>% 
    select(2,12:17)
kable(describe(pitching))

	vars	n	mean	sd	median	trimmed	mad	min	max	range	skew	kurtosis	se
TARGET_WINS	1	2044	80.52789	13.89309	81.0	80.96638	14.8260	21	116	95	-0.3090862	-0.0162643	0.3072970
TEAM_PITCHING_H	2	2044	1539.55479	220.16288	1498.0	1513.97127	146.7774	1137	6723	5586	7.2180667	150.1985045	4.8697161
TEAM_PITCHING_HR	3	2044	111.71771	59.92485	114.0	110.10391	68.1996	3	343	340	0.1784958	-0.5918350	1.3254597
TEAM_PITCHING_BB	4	2044	553.66830	109.40697	541.0	546.10452	91.9212	312	2169	1857	2.2520497	23.8773902	2.4199397
TEAM_PITCHING_SO	5	1944	807.61060	228.05424	818.0	803.06427	250.5594	301	2309	2008	0.3567090	0.6392627	5.1723752
TEAM_FIELDING_E	6	2044	188.89922	109.36885	150.5	166.38570	49.6671	65	765	700	2.3312285	5.9463732	2.4190964
TEAM_FIELDING_DP	7	1916	147.77192	24.99992	149.0	148.74967	23.7216	68	228	160	-0.3456466	0.2527489	0.5711377

par(mfrow=c(2, 4)) 
mapply(hist,pitching,main=colnames(pitching),xlab="FIG-5",col="blue")

##          TARGET_WINS        TEAM_PITCHING_H    TEAM_PITCHING_HR  
## breaks   Integer,11         Integer,13         Numeric,8         
## counts   Integer,10         Integer,12         Integer,7         
## density  Numeric,10         Numeric,12         Numeric,7         
## mids     Numeric,10         Numeric,12         Numeric,7         
## xname    "dots[[1L]][[1L]]" "dots[[1L]][[2L]]" "dots[[1L]][[3L]]"
## equidist TRUE               TRUE               TRUE              
##          TEAM_PITCHING_BB   TEAM_PITCHING_SO   TEAM_FIELDING_E   
## breaks   Integer,11         Integer,12         Integer,16        
## counts   Integer,10         Integer,11         Integer,15        
## density  Numeric,10         Numeric,11         Numeric,15        
## mids     Numeric,10         Numeric,11         Numeric,15        
## xname    "dots[[1L]][[4L]]" "dots[[1L]][[5L]]" "dots[[1L]][[6L]]"
## equidist TRUE               TRUE               TRUE              
##          TEAM_FIELDING_DP  
## breaks   Integer,18        
## counts   Integer,17        
## density  Numeric,17        
## mids     Numeric,17        
## xname    "dots[[1L]][[7L]]"
## equidist TRUE

mtext("Fig 5", SOUTH<-1, line=2, adj=5.0, 
      col="blue")
par(mfrow=c(2, 4))

mapply(boxplot,pitching,main=colnames(pitching),xlab=" FIG-6")

##       TARGET_WINS TEAM_PITCHING_H TEAM_PITCHING_HR TEAM_PITCHING_BB
## stats Numeric,5   Numeric,5       Numeric,5        Numeric,5       
## n     2044        2044            2044             2044            
## conf  Numeric,2   Numeric,2       Numeric,2        Numeric,2       
## out   Numeric,7   Numeric,99      Numeric,3        Numeric,49      
## group Numeric,7   Numeric,99      Numeric,3        Numeric,49      
## names ""          ""              ""               ""              
##       TEAM_PITCHING_SO TEAM_FIELDING_E TEAM_FIELDING_DP
## stats Numeric,5        Numeric,5       Numeric,5       
## n     1944             2044            1916            
## conf  Numeric,2        Numeric,2       Numeric,2       
## out   Numeric,9        Numeric,178     Numeric,36      
## group Numeric,9        Numeric,178     Numeric,36      
## names ""               ""              ""

Pitching stats look like they have alot of outliers

Historical pitching records

Pithcing records were much harder to find
homeruns allowed
- Most Homeruns allowed- 258 +Errors
- most errors= 867

Filter for historical records and run correlation plot

Having some knowledge of baseball, this seems weird
- More hits and bb’s and homeruns are associated with an increase in total team wins?
Something else is wrong with the data

Look at batting to pitching correlations

M <- cor(money_ball_train_2,use = "pairwise.complete.obs")
corrplot(M)

Pitching HR has a perfect correaltion with Batting HR
Pitching BB has perfect correlation with batting BB
Pitching SO has perfect correlation with batting SO
Pitching Hits has strong correlation with batting Hits

Problems with correlations

I noticed some duplicate entries between batting and pitching stats
I am hoping that perhaps these duplicate values are causing problems in my correlations
Lets check to see how many duplicate values there are between the abtting/ pitching categories

##   total_equal_categories
## 1                    785

785 teams accoring to our DF have the same amount of hits pitching as hits hitting
- That means that over 30% of teams on this list randomly struckout as many times as their pitching struck out oppposing batters
- This is nearly impossible
Below correlation plot is a plot of these shared values displaying how all of these categories are identical entries(785 observation Df)

Filter all of these misentries out of DF

There still appears to be a heavy correlation between the pitching and hitting categories
More worriesome is the pitching stats still have correlations that make no sense

Batting and Pitching staistics shouldn’t be correlated in the same direction.
These pitching stats suffer from far too many mistakes to be trusted

I have to drop all of the pitching statistics

Becuase of collinearity issues, this liely would have had to happen anyway
This shows how dangerous using observatiional data can be

Create custom stats in place of poor pitching stats

Singles
a slugging scale ( hr=4, triple=3, double=2, single/bb=1)
an on base stat (hits + bb+HBP)

{r, ref.label=knitr::all_labels(),echo=TRUE,eval=FALSE}

Untitled

Justin Herman

June 6, 2018

Introduction

Data Exploration

General data observations of Entire Dataset

Explore target variable Wins

Remove outliers discovered in Wins category

View target variable distribution before and after filters

Descriptive Look at batting statistics

Create and view summary of batting DF

Fig 1&2

Data manipulation of batting statistics

Replace 0 with NA’s

Fig 3

Examine historical record to detect incorrect outliers

Filter for historical records

Figure 3 : Reexamine boxplot post data manipulation

Batting statistics overall correlations

How do batting statistics correlate with our target(Wins)?

Stolen bases

Edit stolen bases via historical records and observe correlations

Pitching categories

Display pitching categories

Historical pitching records

Filter for historical records and run correlation plot

Look at batting to pitching correlations

Problems with correlations

Filter all of these misentries out of DF

I have to drop all of the pitching statistics

Create custom stats in place of poor pitching stats