Train Loading Statistics
Relationship Between Volume and Load Time
Introduction
This project analyzes train tracking data to examine the relationship between shipment volume and total load time. The dataset includes operational records such as the number of barrels (BBLS) loaded onto trains and the total hours required to complete each load. Understanding this relationship is important because load time is a key factor in operational efficiency and scheduling within transportation systems.
It is reasonable to expect that larger shipment volumes require more time to load and process. As the number of barrels increases, additional time may be needed for handling, coordination, and equipment usage. This aligns with general operational principles where increased workload typically results in longer processing times.
If this reasoning is correct, the data should show a positive linear relationship between shipment volume (BBLS) and total load hours.
Descriptive Statistics
| Variable | Mean | Median | Variance | SD |
| --- | --- | --- | --- | --- |
| Hours | 24.58 | 24 | 45.90 | 6.77 |
| Volume (BBLS) | 69,406.22 | 71,343.35 | 28,919,438.00 | 5,377.68 |
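As a quick consistency check on the table above, the standard deviation of each variable should equal the square root of its variance. The sketch below uses only the rounded values reported in the table, not the raw dataset; the function name `sd_from_variance` is chosen here for illustration.

```python
import math

# Values copied from the descriptive statistics table (rounded as reported).
reported = {
    "Hours": {"variance": 45.90, "sd": 6.77},
    "Volume (BBLS)": {"variance": 28_919_438.00, "sd": 5_377.68},
}


def sd_from_variance(variance: float) -> float:
    """Return the standard deviation implied by a variance."""
    return math.sqrt(variance)


for name, stats in reported.items():
    implied = sd_from_variance(stats["variance"])
    print(f"{name}: reported SD = {stats['sd']:.2f}, sqrt(variance) = {implied:.2f}")
```

Both implied values agree with the reported SDs to two decimal places, so the Variance and SD columns are consistent with each other.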
Plots
Dependent Variable: Hours
The histogram of load time shows that most loads take between approximately 20 and 30 hours, with a concentration around the mid-20s. The distribution is slightly right-skewed, with a few higher-duration observations, including one extreme outlier.
Independent Variable: Volume (BBLS)
The histogram of shipment volume shows that most loads fall between approximately 65,000 and 72,000 BBLS. The distribution is slightly left-skewed, with a few lower-volume observations that appear as outliers.
The scatterplot shows a positive relationship between shipment volume and total load time, as indicated by the upward-sloping regression line. However, the data points are widely dispersed, suggesting that the relationship is relatively weak.
Correlation and Covariance
| Measure | Value |
| --- | --- |
| Correlation | 0.1767 |
| Covariance | 6,437.1680 |
The correlation between shipment volume and total load time is 0.1767, indicating a weak positive relationship. As volume increases, load time tends to increase, but the relationship is not strong.
The covariance is 6,437.17, which is also positive and therefore points in the same direction. It is harder to interpret directly because its magnitude depends on the units of both variables (hours and barrels), whereas the correlation is unit-free.
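The link between the two measures can be made explicit: the correlation is the covariance divided by the product of the two standard deviations. The sketch below reproduces the reported correlation from the rounded summary statistics in this report. The function name is illustrative, and small differences in the last decimal place come from rounding in the inputs.

```python
def correlation_from_covariance(cov_xy: float, sd_x: float, sd_y: float) -> float:
    """Pearson correlation: covariance scaled by both standard deviations."""
    return cov_xy / (sd_x * sd_y)


# Rounded values taken from the tables in this report.
cov_volume_hours = 6_437.1680
sd_volume = 5_377.68
sd_hours = 6.77

r = correlation_from_covariance(cov_volume_hours, sd_volume, sd_hours)
print(f"r = {r:.3f}")  # close to the reported 0.1767
```

Dividing by the standard deviations removes the units, which is why the correlation can be read on a fixed scale from -1 to 1 while the covariance cannot.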
This analysis uses a single independent variable, so no additional plots for multiple variables are included.
Regression Analysis
| | Dependent variable: Hours |
| --- | --- |
| Volume (BBLS) | 0.0002** |
| | (0.0001) |
| Constant | 9.1290 |
| | (7.6900) |
| Observations | 128 |
| R² | 0.0312 |
| Adjusted R² | 0.0235 |
| Residual Std. Error | 6.6947 (df = 126) |
| F Statistic | 4.0601** (df = 1; 126) |
| Note: | *p<0.1; **p<0.05; ***p<0.01 |
The regression results show a positive association between shipment volume and total load time that is statistically significant at the 5% level (p = 0.046).
The coefficient for volume is 0.0002226, meaning that each additional barrel is associated with about 0.0002226 more hours of load time. Put in more practical terms, an increase of 1,000 BBLS is associated with an increase of approximately 0.22 hours, or about 13 minutes.
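For a regression with one predictor, the slope equals the covariance divided by the variance of the predictor, and the intercept follows from the two means. The sketch below recomputes both from the rounded summary statistics reported earlier and converts the slope to a per-1,000-BBLS change. It is a check on the arithmetic, not a re-estimation from the raw data, and the function names are illustrative.

```python
def ols_slope(cov_xy: float, var_x: float) -> float:
    """Slope of a simple linear regression of y on x."""
    return cov_xy / var_x


def ols_intercept(mean_y: float, mean_x: float, slope: float) -> float:
    """Intercept: the fitted line passes through (mean_x, mean_y)."""
    return mean_y - slope * mean_x


# Rounded values taken from the tables in this report.
slope = ols_slope(cov_xy=6_437.1680, var_x=28_919_438.00)
intercept = ols_intercept(mean_y=24.58, mean_x=69_406.22, slope=slope)
hours_per_1000_bbls = slope * 1_000

print(f"slope     = {slope:.7f} hours per BBL")
print(f"intercept = {intercept:.2f} hours")
print(f"change per 1,000 BBLS = {hours_per_1000_bbls:.2f} hours "
      f"({hours_per_1000_bbls * 60:.0f} minutes)")
```

The recomputed slope and intercept match the regression table up to rounding in the inputs.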
However, the R² value of 0.031 indicates that only about 3.1% of the variation in load time is explained by shipment volume. This suggests that while volume has a measurable impact, other factors likely play a larger role in determining load time.
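With a single predictor, R² is the square of the correlation coefficient, and the adjusted R² and F statistic follow from R², the number of observations, and the number of predictors. The sketch below applies these standard formulas to the reported correlation (0.1767) and sample size (128). The function names are illustrative, and the results differ from the table only through rounding of the correlation.

```python
def r_squared_from_r(r: float) -> float:
    """R-squared of a one-predictor regression is the squared correlation."""
    return r ** 2


def adjusted_r_squared(r2: float, n: int, k: int) -> float:
    """Adjust R-squared for n observations and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def f_statistic(r2: float, n: int, k: int) -> float:
    """Overall F statistic with (k, n - k - 1) degrees of freedom."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


n_obs, n_predictors = 128, 1
r2 = r_squared_from_r(0.1767)
adj_r2 = adjusted_r_squared(r2, n_obs, n_predictors)
f_stat = f_statistic(r2, n_obs, n_predictors)

print(f"R^2 = {r2:.4f}, adjusted R^2 = {adj_r2:.4f}, F = {f_stat:.2f}")
```

The small gap between R² and adjusted R² reflects the penalty for estimating one slope with 128 observations. Both values lead to the same conclusion that volume explains only a few percent of the variation in load time.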
Conclusion
The results partially support the hypothesis of a positive relationship between shipment volume and load time. While the relationship is statistically significant, it is weak, indicating that volume explains only a small portion of the variation in load time. This suggests that other operational factors play a larger role. An extreme outlier in load time was also observed and may have influenced the results. Overall, volume contributes to load time, but it is not the primary driver.
Works Cited:
Clugy, Eric. Train Tracking Data. Unpublished dataset. Harvest Midstream, 2026.