Business Problem: How do we quickly attract and retain high-quality data scientists to meet growing corporate need?

Analytical Goals

The purpose of this analysis is to provide insight into the demand for data scientists, and into the hard data science skills most sought after by the American companies competing with us for a limited talent pool in an already hot and still growing marketplace.

Because corporate goals demand that we build out a team of data scientists and engineers to extract useful information from our corporate database and build new products within the next 12 months, delays in hiring could be catastrophic.

The following analysis is designed to help us quickly find uniquely qualified individuals who can perform in a competitive marketplace, despite extreme competition from companies in areas with larger technology-based economies, which tend to attract a consistently larger share of potential intelligence workers.

Map of United States Data Science Postings. Circle size is relative to the number of listings.

What are we competing against?

According to PricewaterhouseCoopers, the demand for data scientists has created a situation in which vital analytics jobs in strong markets take twice as long to fill, as long as 90 days with active recruiting.

For us, recruiting in Oklahoma City, it could take even longer, given the difficulty of finding suitable applicants willing to relocate and the cost and logistics involved in interviewing and vetting potential candidates.


Data Science Jobs Require:

So… how do we get all of this quickly, in a smaller marketplace and at reasonable salaries?

What does the national marketplace look like, and how might we find candidates and convince them to relocate to Oklahoma City to work with us?

High Demand - Limited Supply

A quick search of LinkedIn.com’s listings in the United States returns 39,973 members listing Data Scientist as a job title.

That might sound like a lot, until you consider that there are currently (as of October 22, 2017) 5,996 openings posted on Indeed.com for the United States alone. Moreover, according to Stack Overflow’s 2017 developer report, fewer than 12% of IT professionals are actively seeking a new position. This includes both employed and unemployed data scientists.

For employers seeking skilled talent, this means that only around 4,300 of the people currently working as Data Scientists might be seeking a new position, against almost 6,000 open jobs on a single job board right now, so the competition for highly skilled, experienced workers could be fierce.
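The supply and demand arithmetic above can be checked with a short calculation. This is a minimal sketch using only the figures quoted in this section; the variable names are ours, and treating 12% as an upper bound is an assumption based on the "less than 12%" wording. The report's own estimate of about 4,300 seekers corresponds to roughly 10.8% of the 39,973 LinkedIn members.

```python
# Figures quoted in the text above.
linkedin_data_scientists = 39_973  # LinkedIn members listing Data Scientist as a job title
indeed_openings = 5_996            # Indeed.com openings as of October 22, 2017
max_share_seeking = 0.12           # "less than 12%" actively seeking, used as an upper bound
report_estimate = 4_300            # the report's own estimate of active seekers

# Upper bound on the number of data scientists actively seeking a new position.
max_seekers = linkedin_data_scientists * max_share_seeking

# Open jobs per active seeker, under the upper bound and under the report's estimate.
openings_per_seeker_upper = indeed_openings / max_seekers
openings_per_seeker_report = indeed_openings / report_estimate

print(f"Upper bound on active seekers: {max_seekers:,.0f}")
print(f"Openings per seeker (12% bound): {openings_per_seeker_upper:.2f}")
print(f"Openings per seeker (~4,300 estimate): {openings_per_seeker_report:.2f}")
```

Under either assumption there is more than one open posting per active seeker on a single job board, which is the imbalance this section describes.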

These numbers might underestimate the number of companies and openings competing for analytical talent, as positions with titles like Quantitative Analyst, Data Analyst, Business Analyst and Machine Learning Engineer may be drawing from the same talent pool to do similar work under another job title.

With most highly skilled and experienced workers staying where they are, the natural place to look for talent is this year’s crop of graduates from science and engineering PhD programs.

Recent Graduates?

According to the National Science Foundation’s special report on 2015 doctorate recipients, 55,000 science and engineering PhDs were conferred on US graduate students, so there is potential to find skilled and capable employees in this pool.

However, as the following chart shows, many of these new graduates will not be available to recruiters, as they are earmarked for academia or are studying on temporary visas and slated to return home. Among the graduates who remain, it becomes increasingly difficult to find candidates with all the language, technology and platform skills we might hope to find in an experienced data scientist.


What are other employers looking for in a data scientist?

To better target the right candidates, and even more importantly reach out to and interview prospects who might become solid analytical workers for our company, we decided to put our data science skills to work for us and analyze job postings.

We wanted to learn what hard technical skills other companies are looking for in data scientists, to see whether we could build better job postings, work more effectively with human resources to qualify candidates and engage our recruiting partners effectively as we attempt to build out a 12-member data analytics and engineering team over the next 12 months.

The next few slides show what we found, what we think this might mean for us in our recruiting efforts and some possible suggestions on how to overcome some of the challenges we face recruiting far from traditionally coveted technological centers.

Approach: analyze data science skills on Indeed to build a clearer picture of who we are competing against.

The challenge is to distill thousands of posts like this into meaningful, relevant and actionable information to make recommendations.
Posting from Indeed.com, October 22, 2017
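One way to distill posting text into skill counts is simple keyword matching. The sketch below is illustrative only and is not the code used for this report (the bibliography suggests the original work was done in R). The skill vocabulary, the function name count_skills and the sample postings are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical skill vocabulary; the report's actual keyword list is not shown.
SKILLS = ["python", "r", "sql", "spark", "tableau", "machine learning", "oracle"]


def count_skills(postings, skills=SKILLS):
    """Count how many postings mention each skill at least once."""
    # Lookarounds keep short names such as "r" from matching inside other words.
    patterns = {
        skill: re.compile(
            r"(?<![A-Za-z0-9+#])" + re.escape(skill) + r"(?![A-Za-z0-9+#])",
            re.IGNORECASE,
        )
        for skill in skills
    }
    counts = Counter()
    for text in postings:
        for skill, pattern in patterns.items():
            if pattern.search(text):
                counts[skill] += 1
    return counts


# Made-up postings, for illustration only.
sample_postings = [
    "Seeking a Data Scientist with Python, SQL and machine learning experience.",
    "Must know R or Python; Spark and Tableau a plus.",
    "Oracle SQL developer with strong Tableau skills.",
]

skill_counts = count_skills(sample_postings)
for skill, n in skill_counts.most_common():
    print(skill, n)
```

Counting each skill once per posting, rather than once per occurrence, keeps a single long posting from dominating the totals. That is a design choice made for this sketch, not something the report specifies.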

Method of Analysis

Evaluate our Position

Use the results of this analysis and our own understanding of the job requirements as data scientists to build a viable alternative search process.

Compare national results to the languages, technologies and skills we are currently using in our data science stack:

How do the top 25 most sought-after data science skills compare to our own stack?
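A comparison like this can be expressed as a ranking plus a set comparison. The following is a minimal sketch under stated assumptions: the market counts and the stack list are made-up placeholders, not the report's data, and compare_stack is a hypothetical helper name.

```python
def compare_stack(market_counts, our_stack, top_n=25):
    """Compare the top_n most requested skills with our own stack.

    market_counts maps lowercase skill names to posting counts.
    Returns (overlap with market rank, market-only skills, stack-only skills).
    """
    ranked = sorted(market_counts, key=market_counts.get, reverse=True)[:top_n]
    rank = {skill: position + 1 for position, skill in enumerate(ranked)}
    ours = {skill.lower() for skill in our_stack}

    overlap = [(skill, rank[skill]) for skill in ranked if skill in ours]
    market_only = [skill for skill in ranked if skill not in ours]
    stack_only = sorted(ours - set(ranked))
    return overlap, market_only, stack_only


# Placeholder numbers and a hypothetical stack, for illustration only.
market_counts = {"python": 900, "sql": 800, "r": 700, "spark": 400, "tableau": 350, "sas": 300}
our_stack = ["Python", "Spark", "Tableau", "Scala"]

overlap, market_only, stack_only = compare_stack(market_counts, our_stack, top_n=5)
print("In both:", overlap)
print("Market only:", market_only)
print("Our stack only:", stack_only)
```

Skills that appear in both lists are where we compete head-on with other employers; skills that appear only in our stack are where we are more likely to need to train new hires.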

What Our Scraping Revealed About Key Skills


Our Core Stack
How do we compare?

Based on our results from scraping 6000+ Data Scientist postings and pulling more than 50,000 relevant technical skill words, it is clear that we are competing for a common set of core skills against many other companies with deeper pockets in more sought-after cities.

Main Scripting Languages

With Python as the most sought-after scripting language, and Oracle placing first on the list of SQL variants, we will be going up against a fair amount of competition when trying to recruit premium, experienced candidates with Python skills.

Databases and Business Analysis Tools

Similarly, with Spark, Tableau and machine learning skills in the top 25, we might find ourselves working hard, or paying more than we have allocated, to attract and keep a team experienced in our core skills.

Given our analysis, how can we find, hire and train effective data scientists given the skills we need and the geographic and financial challenges we face?

Recommendations

Consider recruiting candidates with similar experience in a different language, big data environment, tool or flavor of SQL.

The important aspect of data science is being able to think through problems creatively and systematically; the tool a data scientist uses is secondary to understanding the problem.
Data science stack
Job listings with alternative skills

Alternative Search Methods

Search based on languages and skills similar to the ones we use in our core stack!

If a prospect has some knowledge of or experience with one of our core languages, and has worked in other similar environments and platforms, consider them a viable candidate for interview, particularly if they have some of the following:

Because the recruiting process is long and expensive, we might be better off finding junior-level data professionals with related technical skills who excel in business areas or engineering, or who have superlative soft skills. We are more likely to get the full attention of these candidates than of those with lots of machine learning, Python, R and Spark experience.

Build Our Own Data Scientists
If a candidate knows the basics in any programming language, or database, we can probably build a data scientist to suit our needs more quickly and affordably than we can recruit one!

Bibliography of works cited in the report and works used to help us build the system

Law, J., & Rosenblum, J. (n.d.). rvest tutorial: scraping the web using R. Retrieved October 15, 2017, from

DevNami. (n.d.). R Programming Import Data from URL. Retrieved from

R and SQLite: Part 1. (2012, November 18). Retrieved October 22, 2017, from

Entrepreneur Tactics. (n.d.). Scraping Website Data From Multiple Pages For FREE Using Import io. Retrieved from

Melvin L. (n.d.). Simple Web Scraping using R. Retrieved from

Bourret, R. (n.d.). rpbourret.com - XPath in Five Paragraphs. Retrieved October 22, 2017, from

National Science Foundation, National Center for Science and Engineering Statistics. 2017. Doctorate Recipients from U.S. Universities: 2015. Special Report NSF 17-306. Arlington, VA. Available at

PricewaterhouseCoopers. (n.d.). What’s next for the 2017 data science and analytics job market? Retrieved October 23, 2017


Fahad Arif Lidiia Tronina
“Seek not to become a man of knowledge but a man of value.” (Einstein)
“The world is one big data problem.” (Andrew McAfee)
Peter Goodridge Bethany Poulin
“Far better an approximate answer to the right question, which is often vague, than the exact answer to the wrong question, which can always be made precise.” (John Tukey)
“Fail your way to greatness!”

Acknowledgement from Project Manager

This project was the culmination of great teamwork, shared responsibility and insightful ideation and problem solving.

Although Peter’s herculean effort in extracting data was magnificent, Fahad’s diligence in problem solving, in chasing down useful solutions and in providing input on the database was equally valuable.

Lidiia was super proactive in getting everything ready for the last-minute surge to make a polished final product, and she handled the ticking clock with grace.

The process was remarkable in how democratically, smoothly and humanely it proceeded, making my job as the de facto project manager extremely easy!

Thank you, guys, for the best project I could have imagined!

We would also like to extend our thanks to those who put their understanding out on the web for all to see, use and grow from.