Is data massiveness what makes this data interesting?
Data becomes the building blocks of data products
Amazon recommendation systems
Netflix
Finance: credit ratings, trading algorithms
Education: personalized learning
Government: policies based on data
Introduction to Data Science
Data products particularities
Massive, culturally saturated feedback loop
Our behavior changes the product
The product changes our behavior
Technology plays an important role
Data centers
Large-scale processing
Large amounts of memory
Bandwidth
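The feedback loop above (our behavior changes the product, the product changes our behavior) can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the function name `simulate_feedback_loop`, its parameters, and the "recommend the most-clicked item" rule are hypothetical and do not describe how Amazon or Netflix actually work. All items are equally appealing to the simulated user, so any concentration of clicks comes from the loop itself.

```python
import random

def simulate_feedback_loop(n_items=5, n_rounds=1000, explore=0.1,
                           click_prob=0.5, seed=0):
    """Toy popularity-based recommender (hypothetical, for illustration only).

    Each round the product shows the currently most-clicked item, or a random
    item with probability `explore`. The user clicks the shown item with the
    same probability `click_prob` regardless of which item it is.
    """
    rng = random.Random(seed)
    clicks = [1] * n_items  # every item starts with one click
    for _ in range(n_rounds):
        if rng.random() < explore:
            shown = rng.randrange(n_items)  # occasional random recommendation
        else:
            # The product adapts to past behavior: show the most-clicked item.
            shown = max(range(n_items), key=lambda i: clicks[i])
        if rng.random() < click_prob:
            # Behavior adapts to the product: only shown items can be clicked.
            clicks[shown] += 1
    return clicks

clicks = simulate_feedback_loop()
total = sum(clicks)
for item, count in enumerate(clicks):
    print(f"item {item}: {count / total:.0%} of clicks")
```

Although no item is intrinsically better than another, one item ends up with most of the clicks simply because it was recommended more often.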
"The Rise of Big Data", Kenneth Neil Cukier and Viktor Mayer-Schoenberger, Foreign Affairs, May/June 2013.
Datafication
Datafication is a recent concept
taking all aspects of life and turning them into data
Location datafied with latitude and longitude, later with GPS
Facebook datafies friendship through likes
Google's augmented-reality glasses datafy the gaze
Twitter datafies stray thoughts
LinkedIn datafies professional networks
Now we can use this data
Datafication
We are able to learn information about people (who share their data)
We also share data (in a passive way) when we use the internet
We are being datafied through cookies
We are datafied by sensors when we walk down the street
Cameras
Importance of datafication
"Once we datafy things, we can transform their purpose and turn the information into new forms of value"
"we" –> entrepreneurs
"value" –> increased efficiency through automation
What is a Data Scientist in Academia?
Not clearly defined
…an academic data scientist is a scientist, trained in anything from social science to biology, who works with large amounts of data, and must grapple with computational problems posed by the structure, size, messiness, and the complexity and nature of the data, while simultaneously solving a real world problem. (Cathy O'Neil & Rachel Schutt)
What is a Data Scientist in Industry?
Chief Data Scientist
Sets the data strategy of the company
Engineering and infrastructure for collecting and logging data
Privacy concerns
How data will be used to make decisions
How data is used to build products
Manages the team of engineers, scientists, and analysts
Communicates with the CEO, CTO, etc.
Patents innovative solutions
Sets research goals
What is a Data Scientist in Industry?
This is better defined
Someone who knows how to extract meaning from and interpret data, which requires both tools and methods from statistics and machine learning, as well as being human.
Collects, cleans and munges data
Understands biases in data
Performs exploratory data analysis
Finds patterns, builds models and algorithms
Designs experiments
Communicates with team members
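The first steps in the list above (collecting, cleaning and munging data, then exploratory data analysis) can be sketched in a few lines. This is a minimal illustration using only the Python standard library. The records, the field names (`user`, `age`, `minutes`) and the helper functions are all hypothetical, and real projects typically use dedicated data-analysis libraries instead.

```python
import statistics

# Hypothetical raw records, as they might arrive from a log or a web form.
raw_records = [
    {"user": "a", "age": "34", "minutes": "12.5"},
    {"user": "b", "age": "", "minutes": "7.0"},     # missing age
    {"user": "c", "age": "29", "minutes": "n/a"},   # unparseable value
    {"user": "d", "age": "41", "minutes": "30.0"},
    {"user": "a", "age": "34", "minutes": "12.5"},  # exact duplicate
]

def to_float(text):
    """Return text as a float, or None if it cannot be parsed."""
    try:
        return float(text)
    except ValueError:
        return None

def clean(records):
    """Drop exact duplicates and rows with missing or unparseable numbers."""
    seen = set()
    cleaned = []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key in seen:
            continue  # skip a record we have already kept or rejected
        seen.add(key)
        age, minutes = to_float(rec["age"]), to_float(rec["minutes"])
        if age is None or minutes is None:
            continue  # skip rows we cannot interpret
        cleaned.append({"user": rec["user"], "age": age, "minutes": minutes})
    return cleaned

def summarize(records, field):
    """Basic exploratory summary of one numeric field."""
    values = [rec[field] for rec in records]
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
    }

cleaned = clean(raw_records)
print(f"kept {len(cleaned)} of {len(raw_records)} records")
print(summarize(cleaned, "minutes"))
```

Dropping incomplete rows is itself a modeling decision: if the users who leave a field blank differ from those who fill it in, the cleaned data is biased, which is one reason the list also includes understanding biases in data.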
Big Data & Data Science & Knowledge Discovery
Big Data [sorry] & Data Science: What Does a Data Scientist Do? by Carlos Somohano, Founder Data Science London
Data Science (defined)
The Science and Art of
Discovering what we don't know from data
As in data mining
Obtaining predictive, actionable insight from data
Creating data products that have business impact
Communicating relevant business stories from data
Building confidence in decisions that drive business value