AoHP: Assessing Validity

Assessing Validity / Method Comparison

Criterion validity: the extent to which test scores are associated with some other measure of the same ability.

Concurrent validity: the extent to which test scores are associated with those of other accepted tests that measure the same ability.

Here we are going to compare our practical estimates (stopwatch timing and GPS maximum velocity) with our “criterion” method (dual-beam timing gates).

Load packages and data

Remember, if you have not installed the packages beforehand, you will need to do this first using the following code:

install.packages(c("tidyverse", "readxl", "ggpubr"))

library(tidyverse)
library(readxl)
library(ggpubr)

Stopwatch data

Remember, we need to set our working directory to the folder where our data sheet is saved. We can then read the Excel document, and the correct sheet within it, into R.
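The code that reads the data is not shown on this page, so the following is only a minimal sketch of what that step might look like. The file name `sprint_data.xlsx` and the sheet index are assumptions for illustration; replace them with your own.

```r
# Hypothetical file and sheet: both are assumptions, not taken from the original page
data_path  <- "sprint_data.xlsx"
sheet_name <- 1   # first sheet; a sheet name such as "Sheet1" also works

# Set the working directory first if the file is stored elsewhere, e.g.
# setwd("path/to/your/folder")

if (file.exists(data_path)) {
  # read_excel() comes from the readxl package loaded above
  data <- readxl::read_excel(data_path, sheet = sheet_name)
  print(head(data))
} else {
  message("File not found in working directory: ", getwd())
}
```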

# A tibble: 6 × 7
  Player        Max_V_GPS Max_V_TG   SW1   SW2    TG     dif
  <chr>             <dbl>    <dbl> <dbl> <dbl> <dbl>   <dbl>
1 Player Eight       7.03     7.14  3.38  3.22  3.34 -0.0500
2 Player Eleven      7.49     7.58  3.26  3.22  3.26 -0.0100
3 Player Five        7.96     7.94  3.22  3.16  3.14 -0.0800
4 Player Four        7.39     7.14  3.40  3.41  3.34 -0.0600
5 Player Nine        5.47     5.38  4.16  4.6   4.18  0.0250
6 Player One         5.39     5.10  4.30  4.23  4.37  0.0750

Validity statistics

When assessing “concurrent validity” we run a simple linear regression (a general linear model) to see the extent to which our practical measure (X) can predict our criterion (Y).

Y = intercept + slope × X

This can be graphed with a simple x-y scatter plot:

ggscatter(data, x = "SW2", y = "TG") + 
  geom_point(colour = "blue") +
  geom_smooth(method=lm, se = TRUE)
`geom_smooth()` using formula = 'y ~ x'

We can run a simple correlation with cor.test(data$SW1, data$TG) and get a standardized r value between -1 and +1, where -1 = a perfect negative association and +1 = a perfect positive association.

cor.test(x = data$SW1, y = data$TG)

    Pearson's product-moment correlation

data:  data$SW1 and data$TG
t = 46.765, df = 8, p-value = 4.833e-11
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.9919991 0.9995851
sample estimates:
     cor 
0.998176 

As we can see from the output above, the Pearson’s correlation is very high, indicating a strong positive association: r = 0.998 (95% CI 0.992 to 1.000).
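To see where the numbers in the cor.test output come from, the sketch below rebuilds the t statistic, p-value and confidence interval from the reported r and degrees of freedom alone. It uses the standard t test for a correlation and the Fisher z transformation for the interval; the variable names are illustrative only.

```r
# Values copied from the cor.test() output above
r  <- 0.998176   # sample correlation
df <- 8          # degrees of freedom (n - 2)
n  <- df + 2     # number of paired observations

# t statistic for testing the correlation against zero
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df)

# 95% confidence interval via Fisher's z transformation
z    <- atanh(r)
se_z <- 1 / sqrt(n - 3)
ci   <- tanh(z + c(-1, 1) * qnorm(0.975) * se_z)

print(round(t_stat, 2))
print(signif(p_value, 3))
print(round(ci, 4))
```

Because r is entered to six decimal places, the results should agree with the printed output to within rounding.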

Another approach which gives us a bit more information is to run the linear model code as: lm(SW1 ~ TG, data)

lm_model <- lm(SW1 ~ TG, data = data)

summary(lm_model)

Call:
lm(formula = SW1 ~ TG, data = data)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.04266 -0.02276  0.01017  0.01535  0.04747 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.37863    0.07098   5.334 0.000699 ***
TG           0.89985    0.01924  46.765 4.83e-11 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.02992 on 8 degrees of freedom
Multiple R-squared:  0.9964,    Adjusted R-squared:  0.9959 
F-statistic:  2187 on 1 and 8 DF,  p-value: 4.833e-11

The model summary gives an intercept of 0.38: the point where the line of best fit crosses the y-axis, i.e. the predicted value of Y when X = 0. The slope (0.90) describes how steep the line is: the amount Y (stopwatch time, SW1) increases for every 1-unit increase in X (timing gate time, TG).

Note that lm(SW1 ~ TG, data) treats stopwatch time as the outcome, so the fitted equation predicts stopwatch time from timing gate time. For a timing gate 20 m time of, let’s say, 3.4 s, you would multiply this (3.4) by 0.90 (slope) and add 0.38 (intercept):

3.4 * 0.90 + 0.38

The answer? 3.44 s. To predict timing gate time from a stopwatch time instead (the practical use case), fit the model the other way round with lm(TG ~ SW1, data) and apply its intercept and slope in the same way. If that prediction is accurate, you could use your stopwatch in place of the timing gates and just apply the prediction equation.
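The arithmetic above can be wrapped in a small helper so it is easy to reuse. This is a sketch only: the coefficients are copied from the model summary, and `predict_sw1` is a hypothetical function name. Because the model was fitted as SW1 ~ TG, the input is a timing gate time and the output is the predicted stopwatch time.

```r
# Coefficients copied from the summary(lm_model) output above
intercept <- 0.37863
slope     <- 0.89985

# Hypothetical helper: predicted stopwatch time (SW1) for a given gate time (TG)
predict_sw1 <- function(tg) intercept + slope * tg

print(round(predict_sw1(3.4), 2))

# If the fitted model is available in the session, predict() does the same job
if (exists("lm_model") && inherits(lm_model, "lm")) {
  print(predict(lm_model, newdata = data.frame(TG = 3.4)))
}
```

Using the unrounded coefficients gives essentially the same answer as the rounded hand calculation.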

Error in prediction

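The residual standard error describes the typical error in this prediction, and in this case it is only about ±0.03 s (Residual standard error: 0.02992). As a rough guide, the prediction of 3.44 s above would typically fall between about 3.41 and 3.47 s; allowing two standard errors (roughly 95% of cases) widens this to about 3.38 to 3.50 s.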

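Also note that if you take the square root of the R-squared from the model (0.9964), you get the same Pearson’s r (0.998) as reported by cor.test above. The square root is always positive, so the sign of r has to be taken from the sign of the slope.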
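Both points in this section can be checked with a few lines of R. The sketch below uses only values printed in the model summary; the band is a rough guide that ignores uncertainty in the fitted intercept and slope, so a formal prediction interval would be slightly wider.

```r
# Values copied from the model summary and the worked example above
rse        <- 0.02992   # residual standard error
r_squared  <- 0.9964    # multiple R-squared
slope      <- 0.89985   # slope, used only for its sign
prediction <- 3.44      # predicted time from the worked example

# Rough error bands around the prediction
band_1se <- prediction + c(-1, 1) * rse   # about two-thirds of cases
band_2se <- prediction + c(-2, 2) * rse   # roughly 95% of cases

# Pearson's r recovered from R-squared, with the sign taken from the slope
r_from_r2 <- sign(slope) * sqrt(r_squared)

print(round(band_1se, 2))
print(round(band_2se, 2))
print(round(r_from_r2, 3))
```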

Systematic bias / differences in the mean

Whilst it is important to understand the association between two measures, we might also want to know whether there are any “systematic” differences, or bias, between the systems. To do so we would run a paired t-test (or an ANOVA when comparing more than two methods).

t.test(x = data$SW2, y = data$TG, paired = TRUE, conf.level = .95)

    Paired t-test

data:  data$SW2 and data$TG
t = 0.55676, df = 9, p-value = 0.5913
alternative hypothesis: true mean difference is not equal to 0
95 percent confidence interval:
 -0.1118017  0.1848017
sample estimates:
mean difference 
         0.0365 

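This shows a very small and non-significant mean difference between SW2 and the timing gates (0.037 s, 95% CI -0.11 to 0.18 s, p = 0.591). There is therefore no evidence that the stopwatch systematically under- or overestimates the sprint time, although the confidence interval is too wide to rule out a bias of up to about 0.1 to 0.2 s.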
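A paired t-test is simply a one-sample t-test on the within-player differences. As a sketch of how the printed interval and p-value arise, the code below rebuilds them from the mean difference, t statistic and degrees of freedom shown in the output; the variable names are illustrative.

```r
# Values copied from the paired t-test output above
mean_diff <- 0.0365    # mean of (SW2 - TG)
t_stat    <- 0.55676   # t statistic
df        <- 9         # degrees of freedom (n - 1)

# Standard error of the mean difference, recovered from t = mean / SE
se_diff <- mean_diff / t_stat

# 95% confidence interval and two-sided p-value
ci      <- mean_diff + c(-1, 1) * qt(0.975, df) * se_diff
p_value <- 2 * pt(-abs(t_stat), df)

print(round(ci, 4))
print(round(p_value, 4))

# Equivalent call on the raw data, if `data` is loaded:
# t.test(data$SW2 - data$TG)
```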

We can graph this nicely using a box plot but first we need to transform our data into “long format”.

The code selects the 3 columns we need (player ID, 20 m time from the gates and 20 m time from the stopwatch), renames the two timing columns and then pivots them into long format.

Sprint_long<-subset(data, select = c(Player, SW1, TG)) %>% 
  rename(
    Gates = TG,
    Stopwatch = SW1
    )  %>%
  pivot_longer(
    cols = c(Gates, Stopwatch),
    names_to = "Method",
    values_to = "Time") 

head(Sprint_long)
# A tibble: 6 × 3
  Player        Method     Time
  <chr>         <chr>     <dbl>
1 Player Eight  Gates      3.34
2 Player Eight  Stopwatch  3.38
3 Player Eleven Gates      3.26
4 Player Eleven Stopwatch  3.26
5 Player Five   Gates      3.14
6 Player Five   Stopwatch  3.22

Now we can use ggplot to draw a box plot.

ggplot(Sprint_long, aes(x = Method, y = Time, fill = Method)) +
  geom_boxplot() +
  geom_jitter(alpha = 0.7) +
  scale_fill_viridis_d(option = "viridis", direction = 1) +
  labs(title = "Comparison of stopwatch & timing gates", x = "", y = "Twenty meter time (s)") +
  theme_classic() +
  theme(legend.position = "none")

GPS data

Maybe you can have a go at doing this for the GPS data versus timing gates?
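As a starting point, the sketch below repeats the three steps (correlation, regression, paired t-test) for maximum velocity. It is built only from the six rows printed by head() earlier, so the numbers it produces are illustrative and will differ from those for the full data set; with the full data loaded, swap `gps_head` for `data`. The object names are hypothetical.

```r
# Six rows copied from the head() output earlier on this page (not the full data set)
gps_head <- data.frame(
  Player    = c("Player Eight", "Player Eleven", "Player Five",
                "Player Four", "Player Nine", "Player One"),
  Max_V_GPS = c(7.03, 7.49, 7.96, 7.39, 5.47, 5.39),
  Max_V_TG  = c(7.14, 7.58, 7.94, 7.14, 5.38, 5.10)
)

# 1. Association between the practical measure (GPS) and the criterion (gates)
gps_cor <- cor.test(gps_head$Max_V_GPS, gps_head$Max_V_TG)

# 2. Regression with the criterion as the outcome, so GPS predicts gate velocity
gps_model <- lm(Max_V_TG ~ Max_V_GPS, data = gps_head)

# 3. Systematic bias: paired t-test on GPS versus gates
gps_bias <- t.test(gps_head$Max_V_GPS, gps_head$Max_V_TG, paired = TRUE)

print(gps_cor)
print(summary(gps_model))
print(gps_bias)
```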

Conclusion

In conclusion, our stopwatch data showed little evidence of bias when compared to timing gates (SW2, paired t-test), and was very highly correlated with timing gate time too (SW1, r = 0.998). It could be used to predict sprint time with good accuracy. The stopwatch might be accurate enough to measure differences between participants, particularly in a group of varying speed like ours.

It will be interesting to see how closely correlated our GPS units were with timing gate derived maximal velocities.

Previously published data by Kyprianou et al. (2019)¹ found close agreement (0.04 m/s; 95% CI -0.03 to 0.11 m/s); however, they compared to a laser gun, which probably gives a better estimate of maximum velocity than our timing gate method.

References

  1. Kyprianou, E., Lolli, L., Haddad, H.A., Di Salvo, V., Varley, M.C., Mendez Villanueva, A., Gregson, W. and Weston, M., 2019. A novel approach to assessing validity in sports performance research: integrating expert practitioner opinion into the statistical analysis. Science and Medicine in Football, 3(4), pp.333-338.