1 Lawrence University | CMSC/STAT 405: Advanced Data Computing
2 Institute of Digital Anti-Aging Healthcare, Inje University, Gimhae, 50834, Republic of Korea
Over the last two decades, Major League Baseball has begun tracking nearly every aspect of ball flight, including movement, spin, and trajectory. This tracking data has been widely integrated into scouting and the evaluation of player metrics.
For performance analysis, however, accurately detecting, tracking, and evaluating a baseball pitch in real time is extremely difficult with ordinary video.
Baseballs are small, move at high speeds, can blur due to camera frame-rate limitations, and may shrink or disappear visually as they travel farther from the camera.
Modern tracking systems provide valuable data on pitch movement and player performance.
Low-cost video-based tracking is promising, but inconsistent environments make detection difficult.
Deep learning strengthens object detection by handling visual challenges such as blur, lighting changes, and background clutter.
Figure 1: What is YOLO?
YOLO (You Only Look Once) is an object detection model. It scans an image once and predicts the bounding box, class label, and confidence score of each object it finds.
This single pass makes YOLO especially useful because it is faster and uses less processing power than earlier tracking systems.
In this study, YOLOv12 is used to detect small, fast-moving baseballs in video. The system then tracks the ball across frames to estimate its trajectory, speed, and pitch outcome.
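As a rough illustration of how tracked detections can become a speed estimate, the sketch below sums the distance between consecutive ball centres and divides by the elapsed time. It is not the paper's method: the function name, the frame rate, and the `metres_per_pixel` scale are all hypothetical, and it assumes the ball is detected in every consecutive frame and moves roughly parallel to the image plane.

```python
import math

def estimate_speed_kph(centres, fps, metres_per_pixel):
    """Average speed from consecutive ball centres (x, y) given in pixels.

    Hypothetical helper: assumes one detection per consecutive frame and a
    single fixed pixel-to-metre scale for the whole trajectory.
    """
    if len(centres) < 2:
        raise ValueError("need at least two detections")
    # Total path length in pixels across all consecutive pairs of centres.
    path_px = sum(math.dist(a, b) for a, b in zip(centres, centres[1:]))
    # N detections span N - 1 frame intervals.
    elapsed_s = (len(centres) - 1) / fps
    metres_per_second = path_px * metres_per_pixel / elapsed_s
    return metres_per_second * 3.6  # convert m/s to kph

# Made-up example: the ball moves 50 px per frame at 60 fps.
centres = [(100, 300), (150, 300), (200, 300), (250, 300)]
speed = estimate_speed_kph(centres, fps=60, metres_per_pixel=0.011)
print(f"{speed:.1f} kph")
```

With these made-up numbers the estimate lands near 119 kph. A single scale factor like this only holds for one camera position, which is one reason a fixed camera angle matters for this kind of system.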
In object detection models, the IoU, or intersection over union, is important when evaluating precision. It is the area where the predicted and labelled boxes overlap divided by the total area the two boxes cover together. When the predicted box around an object like a ball overlaps completely with the real labelled ball, the IoU is 100%. This paper used mean average precision (mAP) to get a more nuanced look at precision.
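The IoU calculation can be sketched in a few lines. This is a minimal example, assuming boxes are given as `(x1, y1, x2, y2)` corner coordinates in pixels; the box values are made up for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union for two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle (zero if the boxes do not touch).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    # Union = both areas minus the overlap, which would otherwise be counted twice.
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

labelled = (100, 100, 120, 120)   # hypothetical 20 x 20 px ball label
predicted = (105, 105, 125, 125)  # same size, shifted 5 px right and down

print(iou(labelled, labelled))              # perfect overlap
print(round(iou(labelled, predicted), 3))   # partial overlap
```

In this example a 5-pixel shift on a 20-pixel box already drops the IoU to about 0.39, below the 50% threshold. This is part of why small objects like baseballs are hard to score well on.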
| Model | mAP@50 | mAP@50-95 | Precision | Recall | Inference Time (ms) | Pitch Speed Est. |
|---|---|---|---|---|---|---|
| YOLOv8 | 0.875 | 0.52 | 0.84 | 0.81 | 6.3 | ~117 kph |
| YOLOv11 | 0.915 | 0.57 | 0.88 | 0.86 | 7.5 | ~119 kph |
| YOLOv11n | 0.880 | 0.54 | 0.85 | 0.83 | 9.1 | ~116 kph |
| YOLOv11m | 0.920 | 0.58 | 0.89 | 0.87 | 7.0 | ~120 kph |
| YOLOv12 | 0.945 | 0.60 | 0.91 | 0.89 | 8.6 | ~121 kph |
mAP@50: a forgiving threshold for IoU, with any overlap of 50% or higher counted as a correct detection. Roughly speaking, a score of 0.945 means the approximate location is found almost 95% of the time.
mAP@50-95: the “strict” test, which averages the score over IoU thresholds from 50% to 95%. It is never higher than mAP@50; YOLOv12's score of 0.60 beat all other YOLO models tested.
Recall: with the highest recall of 0.89, the v12 model detects almost 90% of the labelled objects.
Inference Time: measured in ms per frame; YOLOv12's 8.6 ms translates to about 116 frames per second.
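The frame-rate figure follows directly from the inference time: frames per second is 1000 divided by the milliseconds spent per frame. The sketch below applies that to the inference times in the table. It assumes inference is the only per-frame cost, so video decoding and drawing the boxes would lower the real rate.

```python
# Inference times in milliseconds per frame, copied from the table above.
inference_ms = {
    "YOLOv8": 6.3,
    "YOLOv11": 7.5,
    "YOLOv11n": 9.1,
    "YOLOv11m": 7.0,
    "YOLOv12": 8.6,
}

def frames_per_second(ms_per_frame):
    """Upper bound on throughput if inference is the only per-frame cost."""
    return 1000.0 / ms_per_frame

fps = {model: frames_per_second(ms) for model, ms in inference_ms.items()}
for model, value in fps.items():
    print(f"{model}: {value:.0f} fps")
```

Every model in the table stays above 100 frames per second under this assumption, so the slower inference of YOLOv12 is unlikely to be the limiting factor for ordinary video.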
In the basic interface for viewing results, the YOLOv12 output looks simple: boxes are drawn with either a “Bat” or a “Ball” label.
Data imbalance: The model had more bat examples than ball examples.
Missed ball detections: The ball class recall was about 80%.
Simplified motion model: The model assumes the ball moves at a mostly constant velocity, but real pitches are affected by gravity, drag, spin, and curve.
Camera angle requirements: The model also required a specific camera angle and field dimensions, which may not hold at amateur levels.
The current system does not fully model how a baseball actually moves through the air. Future models could include physics ideas like gravity, drag, spin, and curve. This would make trajectory prediction more realistic.
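To show how much a constant-velocity assumption can differ from a physics-based one, the sketch below compares the two over half a second of flight. It is an illustration, not the paper's model: the drag constant, release height, and release speed are assumed values, spin is left out entirely, and the integration is a simple fixed-step Euler scheme.

```python
import math

G = 9.81        # gravitational acceleration in m/s^2
DRAG_K = 0.006  # assumed drag constant in 1/m (illustrative, not from the paper)

def constant_velocity(pos, vel, t):
    """Position after t seconds if the ball keeps its initial velocity."""
    return (pos[0] + vel[0] * t, pos[1] + vel[1] * t)

def with_gravity_and_drag(pos, vel, t, dt=0.001):
    """Position after t seconds with gravity and quadratic drag (Euler steps)."""
    x, y = pos
    vx, vy = vel
    for _ in range(round(t / dt)):
        speed = math.hypot(vx, vy)
        # Drag opposes the velocity; gravity pulls straight down.
        ax = -DRAG_K * speed * vx
        ay = -G - DRAG_K * speed * vy
        x += vx * dt
        y += vy * dt
        vx += ax * dt
        vy += ay * dt
    return (x, y)

start = (0.0, 1.8)       # metres: distance from release point, height
velocity = (33.0, 0.0)   # m/s: about 119 kph, thrown horizontally
flight_time = 0.5        # seconds

straight = constant_velocity(start, velocity, flight_time)
curved = with_gravity_and_drag(start, velocity, flight_time)
print("constant velocity:", straight)
print("gravity and drag: ", curved)
```

Under these assumptions the physics-based path ends more than a metre lower and somewhat shorter than the straight-line prediction, which suggests why a tracker built on constant velocity could lose the ball late in its flight.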
Use multi-camera input for more accurate speed estimation.
The approach may be transferable to other high-speed sports such as cricket, tennis, or badminton.
Overall: Performance could be improved further if a larger and more representative training data set becomes available.
Final Remarks
The system achieved high scores and successfully performed the targeted tasks. The results of this study indicate that baseball coaches and players can use the system to get valuable information during training and to improve their pitching or catching performance. While there are still inconsistencies in the model, with further development this could become a more accurate, accessible, and practical tool for real-world baseball training and analysis.
References
https://www.sciencedirect.com/science/article/pii/S2405844026005037?via%3Dihub
Google Gemini and ChatGPT