Date the article was published: August 10, 2018

Summary of Article

This article proposes a structure for the field of data science, describing it as a combination of data mining, statistical inference, and machine learning. It also frames data science as decision making across these three stages. In the data mining stage, you are simply gathering data and drawing no conclusions. In the statistical inference stage, you make one important decision that determines the course of your investigation. Finally, in the machine learning stage, you build a “recipe” to replicate that decision. The article also distinguishes data science from data engineering: data science deals with data once it exists, whereas data engineering covers everything up to that point.

Areas of Application

## [1] "Decision Making (Decision Trees)"             
## [2] "Risk Control"                                 
## [3] "Launching AI and wanting to validate it first"

Author Information

Cassie Kozyrkov is a data scientist and statistician. She founded Decision Intelligence at Google, where she is the Chief Decision Scientist. She also writes her own blog on Medium, which has approximately 40K followers, and she is often featured in the publication “Towards Data Science.”

## [1] "https://kozyrkov.medium.com/"

Cassie Kozyrkov’s Blog

What Do I Think?

I thought this article was really helpful in providing a strong structure for the field of data science. Kozyrkov provides a clear-cut definition of what she believes data science is, but she also leaves room for interpretation. Furthermore, I appreciated how she distinguished data science from data engineering, machine learning, etc. It gave me a clearer picture of what data science could be (and of its subsections).

Random Plots

Wine Plot

MPG Plot

Iris Datatable with DT