Percentage Change of Average MLB Player Salary 1985–2015
This article was published Aug 12, 2017
| Reading time (minutes) | Word count | Likes | Comments |
|---|---|---|---|
| 29 | 7401 | 95 | 3 |
Author Will Koehrsen used sabermetrics to determine that teams should make an effort to discover players before they have a breakout season. His reasoning is twofold: baseball players regress to the mean in performance, and salaries are strongly affected by performance in “flashier” metrics, including runs batted in (RBI) for batters and number of wins for pitchers, in the two years before signing.
There are several conclusions that can be drawn from this analysis, but there are also numerous caveats that prevent these conclusions from being accepted as fact. I must also state the ever-present rule that correlation does not imply causation. In other words, players' statistics may be correlated with salary, but that does not mean the statistics caused a higher or lower salary.
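To make the correlation caveat concrete, here is a minimal sketch of how a correlation between one statistic and salary could be measured. It assumes the Pearson coefficient, which may not be the measure the article used, and the RBI and salary values are made-up illustrative numbers, not data from the analysis.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for illustration only (NOT from the article's dataset).
rbi = [45, 60, 72, 88, 101]
salary_musd = [1.2, 2.0, 3.1, 4.8, 6.5]

r = pearson_r(rbi, salary_musd)
print(f"Pearson r between RBI and salary: {r:.3f}")
```

A coefficient near 1 says only that the two series move together. It does not say that a higher RBI total caused the higher salary, which is exactly the caveat stated above.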
The main conclusions are as follows:
Author Will Koehrsen, Data Scientist at Cortex Intel, Data Science Communicator; Visit Will’s Medium page at https://medium.com/@williamkoehrsen for more info
What I Learned from Writing a Data Science Article Every Week for a Year
Drawing on his own experience writing a data science article every week for a year, the author argues that slow yet consistent dedication to learning data science is the single most important factor in becoming proficient in the field.
Gaussian Mixture Models and Expectation-Maximization (A full explanation)
Taking a Bayesian perspective, this article uses complex mathematics to teach how Gaussian Mixture Models and the Expectation-Maximization algorithm might be implemented by data scientists. Both methods are common in machine learning, so the article offers an interesting, albeit high-level, primer on advanced data science.
Exploring the Tokyo Neighbourhoods: Data-Science in Real Life
This article delves deeply into what being a data scientist in the corporate world means, demonstrating how information must be processed according to the client’s desires, which oftentimes complicates the process. Specifically, Python libraries were used to scrape web data and the Foursquare API was used to explore the major districts of Tokyo, all in order to fulfill the parameters of the project.
Building COVID-19 interactive dashboard from Jupyter Notebooks
In this article, the author demonstrates the step-by-step process he used to gather, process, plot, and export up-to-date coronavirus case data so that the pandemic could be better understood through visualization.
Cleaning and Preparing Data in Python
Noting that 70-80% of a data scientist’s work is consumed by the process of data wrangling, this article explores best practices for processing data, using Python to demonstrate.