Here we analyze several baseball metrics to find the one that correlates most strongly with game length in minutes, and that can therefore serve as a loose predictor of it. The metrics come from the data frame below:

##       Game League Runs Margin Pitchers Attendance Time
## 1  CLE-DET     AL   14      6        6      38774  168
## 2  CHI-BAL     AL   11      5        5      15398  164
## 3  BOS-NYY     AL   10      4       11      55058  202
## 4  TOR-TAM     AL    8      4       10      13478  172
## 5   TEX-KC     AL    3      1        4      17004  151
## 6  OAK-LAA     AL    6      4        4      37431  133
## 7  MIN-SEA     AL    5      1        5      26292  151
## 8  CHI-PIT     NL   23      5       14      17929  239
## 9  LAD-WAS     NL    3      1        6      26110  156
## 10 FLA-ATL     NL   19      1       12      17539  211
## 11 CIN-HOU     NL    3      1        4      30395  147
## 12 MIL-STL     NL   12     12        9      41121  185
## 13  ARI-SD     NL   11      7       10      32104  164
## 14  COL-SF     NL    9      5        7      32695  180
## 15 NYM-PHI     NL   15      1       16      45204  317


The variables of interest for the correlation analysis are League, Runs, Margin, Pitchers, and Attendance, with Time as the response variable. The Game column does not lend itself to computing a correlation, as it is a categorical label that only identifies each matchup. League is also categorical, but because it has just two levels we can encode it as a single dummy variable, with AL mapped to 0 and NL mapped to 1.
We can now calculate the correlation coefficient between each variable and Time:

Variable      Correlation
League          0.4121187
Runs            0.6813144
Margin         -0.0713583
Pitchers        0.8943082
Attendance      0.2571925
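The coefficients in this table can be reproduced directly from the data frame. The following is a minimal Python sketch using only the standard library, not the code used for the original analysis; the helper `pearson` and the variable names are illustrative assumptions, and the values are copied from the table of games above.

```python
from math import sqrt

# Columns copied from the data frame above (15 games, rows 1-15 in order).
league = ["AL"] * 7 + ["NL"] * 8
runs = [14, 11, 10, 8, 3, 6, 5, 23, 3, 19, 3, 12, 11, 9, 15]
margin = [6, 5, 4, 4, 1, 4, 1, 5, 1, 1, 1, 12, 7, 5, 1]
pitchers = [6, 5, 11, 10, 4, 4, 5, 14, 6, 12, 4, 9, 10, 7, 16]
attendance = [38774, 15398, 55058, 13478, 17004, 37431, 26292, 17929,
              26110, 17539, 30395, 41121, 32104, 32695, 45204]
time = [168, 164, 202, 172, 151, 133, 151, 239, 156, 211, 147, 185, 164, 180, 317]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Dummy encoding described in the text: AL -> 0, NL -> 1.
league_dummy = [0 if lg == "AL" else 1 for lg in league]

correlations = {
    "League": pearson(league_dummy, time),
    "Runs": pearson(runs, time),
    "Margin": pearson(margin, time),
    "Pitchers": pearson(pitchers, time),
    "Attendance": pearson(attendance, time),
}

# The variable whose correlation with Time has the largest absolute value.
strongest = max(correlations, key=lambda name: abs(correlations[name]))

for name, r in correlations.items():
    print(f"{name:<12}{r: .7f}")
print("Strongest:", strongest)
```

Ranking by absolute value matters here because a strongly negative correlation would be just as useful for prediction as a strongly positive one.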


The correlation between Pitchers and Time has the greatest absolute value, so we take the number of pitchers to be the metric most strongly correlated with game length. Next, we can fit a linear regression model to describe the relationship between the number of pitchers and the length of a game. The model is represented by the red dashed line below:

The model's intercept is 94.8432502 and its slope is 10.7101727, so the fitted line is Time = 94.8432502 + 10.7101727 × Pitchers. The model implies that for each additional pitcher used in a game, we can expect the game to last about 10.7 minutes longer.
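The slope and intercept follow from the ordinary least-squares formulas, slope = Sxy / Sxx and intercept = mean(Time) − slope × mean(Pitchers). Below is a minimal Python sketch of that calculation, assuming the Pitchers and Time columns from the data frame above; `predict_minutes` is a hypothetical helper added for illustration.

```python
# Pitchers and Time columns copied from the data frame above.
pitchers = [6, 5, 11, 10, 4, 4, 5, 14, 6, 12, 4, 9, 10, 7, 16]
time = [168, 164, 202, 172, 151, 133, 151, 239, 156, 211, 147, 185, 164, 180, 317]

n = len(pitchers)
mean_x = sum(pitchers) / n
mean_y = sum(time) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - mean_x) ** 2 for x in pitchers)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(pitchers, time))

# Ordinary least-squares estimates for Time = intercept + slope * Pitchers.
slope = sxy / sxx
intercept = mean_y - slope * mean_x


def predict_minutes(num_pitchers):
    """Predicted game length in minutes for a given number of pitchers."""
    return intercept + slope * num_pitchers


print(f"intercept = {intercept:.7f}, slope = {slope:.7f}")
print(f"predicted length with 8 pitchers: {predict_minutes(8):.1f} minutes")
```

Predictions are only meaningful within the observed range of 4 to 16 pitchers; the intercept corresponds to zero pitchers, which cannot occur in a real game.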

We can assess statistical significance with a correlation test, whose test statistic follows a t-distribution with 13 degrees of freedom (n − 2 = 15 − 2) under the null hypothesis that the true correlation is zero. Doing so yields a p-value of 6.8838508 × 10^{-6} and a 95% confidence interval of (0.705038, 0.9646464). That is, we are 95% confident that the true correlation between the number of pitchers and the duration of a game lies between 0.705038 and 0.9646464. Because the p-value is far below the 5% significance level, we reject the null hypothesis of zero correlation and conclude that the correlation is statistically significant.
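The pieces of this test can be worked through by hand. The sketch below is a Python approximation under stated assumptions, not the routine used for the original analysis: the test statistic is t = r·sqrt(n − 2) / sqrt(1 − r²), the two-sided p-value is obtained by numerically integrating the t density with Simpson's rule (the standard library has no t-distribution CDF), and the confidence interval is assumed to come from the Fisher z-transformation. The function names are illustrative.

```python
from math import atanh, gamma, pi, sqrt, tanh
from statistics import NormalDist

r = 0.8943082   # sample correlation between Pitchers and Time (from the table)
n = 15          # number of games
df = n - 2      # degrees of freedom for the correlation test

# Test statistic under H0: true correlation = 0.
t_stat = r * sqrt(df) / sqrt(1 - r ** 2)


def t_pdf(x, nu):
    """Density of Student's t-distribution with nu degrees of freedom."""
    coef = gamma((nu + 1) / 2) / (sqrt(nu * pi) * gamma(nu / 2))
    return coef * (1 + x * x / nu) ** (-(nu + 1) / 2)


def two_sided_p(t, nu, steps=20000):
    """Two-sided p-value via composite Simpson's rule on [0, |t|]."""
    b = abs(t)
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, nu) + t_pdf(b, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    central_half = total * h / 3  # P(0 < T < |t|)
    return 1 - 2 * central_half


p_value = two_sided_p(t_stat, df)

# 95% confidence interval via the Fisher z-transformation (assumed method).
z = atanh(r)
se = 1 / sqrt(n - 3)
z_crit = NormalDist().inv_cdf(0.975)
ci_low, ci_high = tanh(z - z_crit * se), tanh(z + z_crit * se)

print(f"t = {t_stat:.4f}, p = {p_value:.3e}")
print(f"95% CI: ({ci_low:.6f}, {ci_high:.6f})")
```

Because the interval is built on the z scale and transformed back, it is not symmetric around r, which is why the upper bound sits closer to the estimate than the lower bound does.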
Consider the residual plots:

In the residuals vs. fitted plot, the points are scattered roughly at random around zero with no clear pattern. This is a good sign, as it indicates that the linear model does not systematically under- or over-predict across the range of fitted values. Additionally, in the Q-Q plot most points lie near the reference line, which suggests that the residuals are approximately normally distributed, as the inference above assumes.
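The quantities behind both diagnostic plots can be computed without any plotting library. The following Python sketch assumes the Pitchers and Time columns from the data frame above and uses the plotting positions (i − 0.5)/n for the theoretical normal quantiles, which is one common convention and may differ slightly from the one used to draw the original plots.

```python
from math import sqrt
from statistics import NormalDist

# Pitchers and Time columns copied from the data frame above.
pitchers = [6, 5, 11, 10, 4, 4, 5, 14, 6, 12, 4, 9, 10, 7, 16]
time = [168, 164, 202, 172, 151, 133, 151, 239, 156, 211, 147, 185, 164, 180, 317]

n = len(pitchers)
mean_x, mean_y = sum(pitchers) / n, sum(time) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(pitchers, time))
         / sum((x - mean_x) ** 2 for x in pitchers))
intercept = mean_y - slope * mean_x

# Residuals vs. fitted: observed minus predicted game length.
fitted = [intercept + slope * x for x in pitchers]
residuals = [y - f for y, f in zip(time, fitted)]

# Q-Q data: sorted standardized residuals against theoretical normal quantiles.
resid_sd = sqrt(sum(e ** 2 for e in residuals) / (n - 2))  # residual standard error
sample_q = sorted(e / resid_sd for e in residuals)
theory_q = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
qq_pairs = list(zip(theory_q, sample_q))

for f, e in zip(fitted, residuals):
    print(f"fitted = {f:7.2f}   residual = {e:7.2f}")
```

Plotting `residuals` against `fitted` gives the first diagnostic, and plotting the pairs in `qq_pairs` gives the second; points in the latter that fall far from a straight line flag games the normality assumption describes poorly.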