We examine two datasets of job listings: one of New York City government jobs and one of technology jobs within New York City that were posted to dice.com. Both datasets were obtained from kaggle.com.


Load libraries

library(stringr)
library(dplyr)
library(tidyr)
library(zoo)
library(ggplot2)
library(knitr)


Get city jobs dataset

We start by importing the data for NYC jobs from GitHub and cleaning up the column names. Then we drop the posting_type column and remove the duplicate listings that remain.

raw_nyc_df <- read.csv('https://raw.githubusercontent.com/mehtablocker/cuny_607/master/project_3/nyc-jobs.csv')
nyc_jobs_df <- raw_nyc_df
names(nyc_jobs_df) <- names(nyc_jobs_df) %>% tolower() %>% gsub("\\.", "_", .)
names(nyc_jobs_df)[names(nyc_jobs_df)=="x__of_positions"] <- "n_of_positions"
nyc_jobs_df <- nyc_jobs_df %>% select(-posting_type) %>% unique()
nyc_jobs_df %>% tail() %>% kable()
job_id agency n_of_positions business_title civil_service_title title_code_no level job_category full_time_part_time_indicator salary_range_from salary_range_to salary_frequency work_location division_work_unit job_description minimum_qual_requirements preferred_skills additional_information to_apply hours_shift work_location_1 recruitment_contact residency_requirement posting_date post_until posting_updated process_date
3966 379312 OFFICE OF MANAGEMENT & BUDGET 1 Analyst FEMA Public Assistance Policy & Monitoring & Compliance 6088 1 Finance, Accounting, & Procurement Policy, Research & Analysis F 45491 60660 Annual 255 Greenwich Street Office of Budget Review NA 2019-01-07T00:00:00.000 2019-01-07T00:00:00.000 2019-02-19T00:00:00.000
3968 380166 OFFICE OF MANAGEMENT & BUDGET 1 Assistant Director Value Engineering 0608A M4 Engineering, Architecture, & Planning Technology, Data & Innovation Policy, Research & Analysis F 137637 137637 Annual 255 Greenwich Street IFA (Mgrl) NA 2019-01-11T00:00:00.000 2019-01-11T00:00:00.000 2019-02-19T00:00:00.000
3970 383297 OFFICE OF MANAGEMENT & BUDGET 1 Analyst DEPARTMENT OF ENVIRONMENTAL PROTECTION (DEP) 6088 1 Finance, Accounting, & Procurement Policy, Research & Analysis F 60660 68244 Annual 255 Greenwich Street Infra, Librar. And Cultural NA 2019-02-07T00:00:00.000 2019-02-07T00:00:00.000 2019-02-19T00:00:00.000
3972 383303 MUNICIPAL WATER FIN AUTHORITY 1 Analyst New York City Municipal Water Finance Authority (NYW) 6088 1 Finance, Accounting, & Procurement Policy, Research & Analysis F 60660 68244 Annual 255 Greenwich Street Municipal Water Authority NA 2019-02-11T00:00:00.000 2019-02-11T00:00:00.000 2019-02-19T00:00:00.000
3974 383828 OFFICE OF MANAGEMENT & BUDGET 1 Analyst REPORTING AND SYSTEMS MANAGEMENT 6088 1 Finance, Accounting, & Procurement Policy, Research & Analysis F 45491 68244 Annual 255 Greenwich Street Community Development NA 2019-02-12T00:00:00.000 2019-02-12T00:00:00.000 2019-02-19T00:00:00.000
3976 383833 OFFICE OF MANAGEMENT & BUDGET 1 Analyst Administration for Children’s Services 6088 1 Finance, Accounting, & Procurement Policy, Research & Analysis F 45491 68244 Annual 255 Greenwich Street Social Services NA 2019-02-12T00:00:00.000 2019-02-12T00:00:00.000 2019-02-19T00:00:00.000
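
The name cleanup and de-duplication steps can be sketched on a toy example. This is a minimal illustration, not the real data: the raw names mimic what read.csv tends to produce from headers with spaces and symbols, and the toy rows and the "Internal"/"External" values are assumptions made for the example.

```r
# Hypothetical raw names, as read.csv might mangle them (assumed for illustration)
raw_names <- c("Job.ID", "X..Of.Positions", "Business.Title", "Posting.Type")

# Lower-case everything and turn dots into underscores
clean_names <- gsub("\\.", "_", tolower(raw_names))

# Give the awkward "x__of_positions" a readable name
clean_names[clean_names == "x__of_positions"] <- "n_of_positions"

# Toy listings: job 1 appears twice, differing only in posting_type
toy <- data.frame(
  job_id = c(1, 1, 2),
  posting_type = c("Internal", "External", "External"),
  business_title = c("Analyst", "Analyst", "Engineer"),
  stringsAsFactors = FALSE
)

# Dropping posting_type makes the two rows for job 1 identical,
# so unique() keeps only one of them
deduped <- unique(toy[, names(toy) != "posting_type"])

print(clean_names)
print(deduped)
```

In this sketch, two of the three toy rows survive. The column has to be dropped before unique() is called, because rows that differ in any column are not treated as duplicates.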


Filter for data-specific jobs

We filter for data jobs by using a regular expression to search the business_title column for the case-insensitive terms “data” or “analytics.” Because the match is on substrings, titles containing words such as “database” are also included. Then we create another table for non-data jobs.

data_jobs_df <- nyc_jobs_df %>% filter(grepl("data|analytics", business_title, ignore.case = TRUE))
other_jobs_df <- nyc_jobs_df %>% filter(!grepl("data|analytics", business_title, ignore.case = TRUE))
data_jobs_df %>% head() %>% kable()
job_id agency n_of_positions business_title civil_service_title title_code_no level job_category full_time_part_time_indicator salary_range_from salary_range_to salary_frequency work_location division_work_unit job_description minimum_qual_requirements preferred_skills additional_information to_apply hours_shift work_location_1 recruitment_contact residency_requirement posting_date post_until posting_updated process_date
195805 ADMIN FOR CHILDREN’S SVCS 2 Tracking and Monitoring Data Analyst ASSOCIATE STAFF ANALYST 12627 0 Social Services F 59536 88649 Annual 66 John Street, New York, Ny Headstart (ECE) NA 2015-05-29T00:00:00.000 2015-06-16T00:00:00.000 2019-02-19T00:00:00.000
226044 DEPT OF INFO TECH & TELECOMM 1 Payroll Data Associate CLERICAL ASSOCIATE 10251 3 Clerical & Administrative Support 32888 50000 Annual 75 Park Place New York Ny Human Resources NA 2015-12-21T00:00:00.000 2015-12-21T00:00:00.000 2019-02-19T00:00:00.000
234203 ADMIN FOR CHILDREN’S SVCS 1 Business and Data Analyst Manager ASSOCIATE STAFF ANALYST 12627 0 Finance, Accounting, & Procurement F 63817 95022 Annual 150 William Street, New York N Asst Comm Bdgt & Clmng-Financ NA 2016-03-03T00:00:00.000 2016-03-04T00:00:00.000 2019-02-19T00:00:00.000
289615 NYC HOUSING AUTHORITY 1 Data Support Analyst COMPUTER SYSTEMS MANAGER 10050 M1 Technology, Data & Innovation F 80000 100000 Annual Analysis & Reporting Capital Projects Admin NA 2017-07-07T00:00:00.000 2018-07-13T00:00:00.000 2019-02-19T00:00:00.000
294911 NYC EMPLOYEES RETIREMENT SYS 1 CERTIFIED IT ADMINISTRATOR (DATABASE) CERT IT ADMINISTRATOR (DB) 13644 2 Technology, Data & Innovation F 79471 111598 Annual 335 Adams Street, Brooklyn Ny Executive Management NA 2017-07-19T00:00:00.000 2017-07-20T00:00:00.000 2019-02-19T00:00:00.000
340965 DEPT OF HEALTH/MENTAL HYGIENE 1 Data Manager, Viral Hepatitis, Bureau of Communicable Diseases CITY RESEARCH SCIENTIST 21744 1 Health F 59708 65678 Annual 42-09 28th Street Communicable Diseases NA 2018-05-07T00:00:00.000 2018-05-07T00:00:00.000 2019-02-19T00:00:00.000
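
To make the behavior of the title filter concrete, here is a small sketch on made-up titles (the vector below is an assumption for illustration, loosely modeled on titles in the table above). It computes the match once and reuses it for both groups.

```r
# Hypothetical business titles for illustration
titles <- c(
  "Tracking and Monitoring Data Analyst",
  "CERTIFIED IT ADMINISTRATOR (DATABASE)",
  "Senior Analytics Manager",
  "Budget Analyst"
)

# TRUE where the title contains "data" or "analytics", ignoring case
is_data <- grepl("data|analytics", titles, ignore.case = TRUE)

data_titles  <- titles[is_data]   # matched titles
other_titles <- titles[!is_data]  # everything else

print(data_titles)
print(other_titles)
```

Only "Budget Analyst" falls into the non-data group here. The database administrator title is matched because "DATABASE" contains "data", which is worth keeping in mind when interpreting the counts.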


Analyze quantity and salary

The table above shows that many key fields are empty, including job_description and preferred_skills. This limits our analysis to a few areas, namely the number of listings and their salaries.


Of all the jobs working for New York City, how many are data jobs?

### Total number of jobs in the dataset:
nrow(nyc_jobs_df)
## [1] 2205
### Number of data jobs:
nrow(data_jobs_df)
## [1] 71
### Data jobs, as a percentage of total:
nrow(data_jobs_df)/nrow(nyc_jobs_df)
## [1] 0.03219955

In this dataset, only about 3.2 percent of jobs are data jobs.


In terms of the high range of salary, how well do data jobs pay relative to non-data jobs?

### Data jobs
summary(data_jobs_df$salary_range_to)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##      15   65339   84301   78740   99000  161497
### Non-data jobs
summary(other_jobs_df$salary_range_to)
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
##     10.36  56703.25  75595.50  78639.47 101673.00 230000.00
par(mfrow=c(1,2))
boxplot(data_jobs_df$salary_range_to, xlab="Data Jobs", ylab="Salary in Dollars", ylim=c(0, 200000))
boxplot(other_jobs_df$salary_range_to, xlab="Non-Data Jobs", ylab="Salary in Dollars", ylim=c(0, 200000))

par(mfrow=c(1,1))

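The distribution is wider for non-data jobs, but the median salary is higher for data jobs (84,301 versus 75,595.50). The minimum values of 15 and 10.36 suggest that some listings report hourly or daily rates rather than annual salaries, so these summaries should be read with caution. Note also that the plots are capped at 200,000, so the non-data maximum of 230,000 is not shown. It is important to remember that these are all government jobs, which overall may pay less than private sector jobs.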
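
One way to keep the comparison on a single pay basis is to restrict both groups to annual salaries before summarizing. The sketch below uses a toy data frame, and it assumes that salary_frequency takes values such as "Hourly" in addition to the "Annual" value visible in the tables above. The is_data_job column is a hypothetical helper introduced only for this example.

```r
# Toy listings; values and the "Hourly" label are assumptions for illustration
jobs <- data.frame(
  is_data_job      = c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE),
  salary_range_to  = c(15, 84000, 99000, 10.36, 75000, 101000),
  salary_frequency = c("Hourly", "Annual", "Annual", "Hourly", "Annual", "Annual"),
  stringsAsFactors = FALSE
)

# Median of the upper salary bound per group, all pay bases mixed together
median_mixed <- tapply(jobs$salary_range_to, jobs$is_data_job, median)

# Same summary after keeping only annual salaries
annual <- jobs[jobs$salary_frequency == "Annual", ]
median_annual <- tapply(annual$salary_range_to, annual$is_data_job, median)

print(median_mixed)
print(median_annual)
```

In this toy example the medians rise once the hourly rows are removed, which shows how a handful of non-annual listings can pull the summary statistics down.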


Get Dice jobs

Next we import the data for technology jobs within New York City that were posted to dice.com. We split the combined employmenttype_jobstatus column into employment_type and job_status, and rename several other columns for readability.

raw_dice_df <- read.csv('https://raw.githubusercontent.com/mehtablocker/cuny_607/master/project_3/dice_com_nyc_jobs.csv', stringsAsFactors = FALSE)
dice_jobs_df <- as_tibble(raw_dice_df) %>% 
  separate(employmenttype_jobstatus, into=c("employment_type", "job_status"), sep = ", ", fill="right", extra = "drop")
dice_jobs_df <- dice_jobs_df %>% 
  rename(advertiser_url = advertiserurl, 
         job_description = jobdescription,
         job_id = jobid, 
         job_location = joblocation_address, 
         job_title = jobtitle, 
         post_date = postdate)
dice_jobs_df %>% head() %>% kable()
advertiser_url company employment_type job_status job_description job_id job_location job_title post_date shift site_name skills uniq_id
https://www.dice.com/jobs/detail/Front-End-Developer-Genesis10-New-York-NY-10001/gentx001/16-03932?icid=sr14288-477p&q=&l=California,%20Us,%20CALIFORNIA Genesis10 Full Time Direct Placement This is a fulltime position for a Javascript developer for a financial software, data, and media company. This role is based our of the Midtown, NY location (candidates must work onsite). You’ll be part of a team working on the application that is the glue that holds the firm’s main product together and is used as a communication tool, a price dissemination system and a way to receive information from all other applications within the firm. Responsibilities:   Help us design, create and build our next-generation user interface Be a key contributor in the re-architecting of our application service layers to improve our scalability, stability and performance Work with the team, our product and our customers to define priorities and the technology we use Work with other teams and learn about our search infrastructure, core databases and how we support hundreds of applications through our APIs Required Skills and Experience:   3+ years of production-level development Solid JavaScript knowledge Must want to and like working on UI Preferred:   An understanding of C++ fundamentals Experience working on other highly visible applications  Start Date: 06/03/2016 Dice Id : gentx001 New York, NY Front End Developer 7 hours ago Telecommuting not available|Travel not required C++, Developer, Development, JavaScript, User Interface 28f5e0c1cc3314813e674f0c32b04d1b
https://www.dice.com/jobs/detail/Senior-Full-Stack-Developer-Genesis10-New-York-NY-10001/gentx001/16-02139?icid=sr14773-493p&q=&l=California,%20Us,%20CALIFORNIA Genesis10 Full Time Direct Placement Description:   Our client’s Open Software Frameworks team is building a variety of products used by both independent software vendors and in-house developers. Their diverse portfolio of products includes Application Portal, Geo Spatial Mapping platform and IDE used to build their terminal applications. This team is contributing to a number of cutting-edge open source projects. Based on the breadth of the team’s work, you will need to be a strong full stack developer. You strive to design, implement and support the ideal solution. Balance between elegant design and system performance and reliability is always at the front of your mind. You will have the opportunity to work closely with users, UX and Product teams. Requirements:   * 3+ years of experience programming in C/C++ * 3+ years of experience with HTML5, CSS and JavaScript * Strong OOD/OOP skills and experience applying modern design patterns * Knowledge of algorithms, standard data structures and multithreading We’d love to see: * Experience with C#, .Net internals and WPF * Experience developing distributed systems in a Windows or Linux environment * Familiarity with and understanding of an Agile methodology If you have the described qualifications and are interested in this exciting opportunity, please apply! About SWATT:   The Genesis10 Software and Technology Team (SWATT) is a specialized recruiting service focused on helping accomplished software developers, programmers, platform engineers and elite technology professionals find once-in-a-lifetime career opportunities in New York City with the world’s most advanced technology organizations. Whether local to New York or relocating from across North America, we take an authentic approach to helping people make life-changing technology career decisions. 
For more information go to http://swatt.genesis10.com/ “Genesis10 is an Equal Opportunity Employer, M/F/D/V”  Start Date: 06/27/2016 Dice Id : gentx001 New York, NY Senior Full Stack Developer 7 hours ago Telecommuting not available|Travel not required .Net, Algorithms, Developer, JavaScript, Linux, OOD, OOP, Programming, Systems, Windows b9520049ff3e799ec368ae0aa99ec5f5
https://www.dice.com/jobs/detail/Java-Full-Stack-Engineer-%2528Angular-JS-is-must%2529-Stratitude-Inc-New-York-City-NY-10001/10292564/569166?icid=sr14409-481p&q=&l=California,%20Us,%20CALIFORNIA Stratitude Inc Contract Corp-To-Corp 12+ months Required Skills                                   1.      6-9 years of experience in software development.2.      Experience working with MVC based front-end library Angular JS3.      Strong expertise in Java/J2EE, Spring , Hibernate.4.      Experience with SQL/NoSQL databases.5.      Responsible for module design / high architecture.6.      Participate in customer interaction, code review and follow-up.7.      Work closely with the development team to optimize and improve the e-commerce platform to grow subscription business. This involved identifying opportunities , developing requirements and co-ordinating with QA. Dice Id : 10292564 New York City, NY Java Full Stack Engineer (Angular JS is must) 7 hours ago Telecommuting not available|Travel not required Angular JS, Java/J2EE, Spring , Hibernate 755688570dda03850c9ea9974e241110
https://www.dice.com/jobs/detail/Linux-Engineer-Genesis10-New-York-NY-10001/gentx001/16-04058?icid=sr14777-493p&q=&l=California,%20Us,%20CALIFORNIA Genesis10 Full Time Contract Genesis10 is currently seeking a Product Specific Technologist with our client in the financial industry in their New York, NY location. This is a 12 month + contract position. Description:   * High Server Patch project is major effort to remediate security patch level on Grid /Quartz LINUX Infrastructure and deploy recurring patching solution * We are hiring this position to work closely with business technology group, application owners and operations teams to establish and coordinate deployment of automated recurring patching solution * This sophisticated platform requires very specialized skills and this engineer will provide solution, design, development, and implementation and be responsible for co-ordination to facilitating the migration of this platform Responsibilities:   * Requirement analysis of the project * Technical recommendations for automated patching * End to end technical plan for patching * Co-ordination of execution of automated recurring plan Requirements:   * Experience with J2EE, Java, Python, C++ programming, shell script * Strong understanding and working experience of Linux Operating system * Good understanding of Network /Server Infrastructure spanning multiple sites * Experience with J2EE, Java, Python, C++ programming, shell script * Good communication skills, self-motivated, positive attitude and ability to work in a global team environment * Strong troubleshooting skills and application Administration/Support skills * Experience in engineering/upgrading complex solutions in line with demanding client requirements * Solutions engineering (design & implementation of messaging solutions) experience * Proven timely delivery of key infrastructure and products * Good understanding and work experience in an Enterprise Environment * Strong understanding and hands on 
Linux Operating system * Strong Development experience to write tool on Linux Operating system * Previous software solution to deploy patching solution will be plus * Ability to handle highly volatile support with platform and clients spanning multiple time zones * Ability to adapt quickly to the client needs * Team Player * Good Communication skills * Able to work as a W2 employee of Genesis10 (no Corp-to-Corp) If you have the described qualifications and are interested in this exciting opportunity, please apply! About Genesis10:   Genesis10 is a leading U.S. business and technology consulting firm with hundreds of clients needing proven talent and solutions to power their strategic initiatives. If you are a high performing business or IT professional with solid, referenced experience, we want to meet you. Genesis10 recruiters and delivery professionals are highly accomplished career advocates, who get to know you beyond your resume to position you with the opportunities that fit your skills, experience and aspirations. We have benefit options to fit your needs and a support staff that works with you from placement throughout your engagement - project after project. To learn more about Genesis10 and to view all our available career opportunities, please visit us at www.genesis10.com. “Genesis10 is an Equal Opportunity Employer, M/F/D/V” Dice Id : gentx001 New York, NY Linux Engineer 7 hours ago Telecommuting not available|Travel not required Application, C++, Development, Engineer, Engineering, J2EE, Java, Linux, Network, Programming, Python, Shell Script, Software, System 75296a05c107731f400d602f3e10bf47
https://www.dice.com/jobs/detail/Front-end%2526%252345UI-developer%2526%252347UI%2526%252345Web-designer-Fahrenheit-IT-Staffing-%2526-Consulting-New-York-NY-10010/10111360/3042319-28-MH17?icid=sr14379-480p&q=&l=California,%20Us,%20CALIFORNIA Fahrenheit IT Staffing & Consulting NA Front end-UI developer/UI-Web designer job opportunity at a top-financial services company located in NYC * Midtown 6 * 12 months with high potential for FTE conversion. AN ONLINE PORTFOLIO IS REQUIRED TO APPLY TO THIS JOB  Contact oleon@fahrenheitit.com846 582 1467 Looking for CANDIDATES LOCATED IN THE TRI-STATE AREA ONLY If qualified and currently considering new job opportunities lets set up a time to discuss other details at your earliest convenience. Job details: This job is a combination of front end development (css, html, javascript) as well as user interface/web design As far as design looking for someone who can show artistic/creative samples of user interfaces the candidate has worked in.  Presentation layers are critical [think colors, fonts, easy-to-read language, easy-to-follow design]  Looking for someone who can create interactive charts, graphs for reporting purposes where users can view dashboards, charts, graphs of analytical data (experience with one of the following D3, Angular js, or Node js.  This team is responsible for presenting technical data through graphs, charts commercial quality data visualization for a large audience of stakeholders & users.  Dice Id : 10111360 New York, NY Front end-UI developer/UI-Web designer 7 hours ago Telecommuting not available Travel not required c6f99f38bf465487323ed9d46f19514d
https://www.dice.com/jobs/detail/Infrastructure-Production-Developer-Genesis10-New-York-NY-10001/gentx001/16-02261?icid=sr14184-473p&q=&l=California,%20Us,%20CALIFORNIA Genesis10 Full Time Direct Placement Description:   Our client’s Infrastructure Production group in delivers a wide range of technologies. The team builds out common services that every department can use and consume to monitor, visualize and diagnose their applications and infrastructure. This team is also the forefront in implementing modern technology ideology within the organization and assist all of the teams with implementation, automation and design. The Production Engineering team is a new team, and is one of the most fast-paced and soon to be widely used across the entire company. You will have the ability to be a part of a large cultural shift within the organization. If you like large scale systems, billions of data points a day, automating all of the things, hacking on open source software and making a cultural impact, ask us where to sign up. Responsibilities:   * Design, architect, automate and deliver large scale production ready services for employees to consume. * Build internal tools to monitor, visualize and diagnose all aspects of applications & hardware in our client’s stack. * Work closely with our client’s product and platform teams with architecture, design and scaling challenges they may have. * Help teams replace legacy software and design patterns with modern technologies. * Develop and maintain documentation, training and SLAs for managed infrastructure. 
Requirements:   * Minimum 2 - 3 years of experience building similar systems * Experience with large scale data processing * Previous experience automating and implementing large scale fault tolerant distributed systems * Experience with physical hardware and provisioning process * Experience working with Opensource software Common Tools used:   * Ruby / Go / Python / Java * Linux * Kafka * Hadoop / Zookeeper / HBase * Mesos * Icinga / OpenTSDB * Chef * Spark If you have the described qualifications and are interested in this exciting opportunity, please apply! About SWATT:   The Genesis10 Software and Technology Team (SWATT) is a specialized recruiting service focused on helping accomplished software developers, programmers, platform engineers and elite technology professionals find once-in-a-lifetime career opportunities in New York City with the world’s most advanced technology organizations. Whether local to New York or relocating from across North America, we take an authentic approach to helping people make life-changing technology career decisions. For more information go to http://swatt.genesis10.com/ “Genesis10 is an Equal Opportunity Employer, M/F/D/V”  Start Date: 06/27/2016 Dice Id : gentx001 New York, NY Infrastructure Production Developer 7 hours ago Telecommuting not available|Travel not required Developer, Hardware, Java, Linux, Python, Software, Systems 1f20cc46a1df5eb235007fac5fa0cbf4
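
The effect of the fill and extra arguments in the separate() call can be seen on a few made-up values. This is a minimal sketch: the strings below are assumptions chosen to show the three cases (two pieces, one piece, three pieces) and do not come from the dataset.

```r
library(tidyr)

# Hypothetical combined values for illustration
toy <- data.frame(
  employmenttype_jobstatus = c(
    "Full Time, Direct Placement",  # two pieces: splits cleanly
    "Full Time",                    # one piece: job_status is filled with NA
    "Contract W2, 6 Months, C2H"    # three pieces: the third is dropped
  ),
  stringsAsFactors = FALSE
)

split_df <- separate(
  toy, employmenttype_jobstatus,
  into = c("employment_type", "job_status"),
  sep = ", ",
  fill = "right",   # pad missing pieces on the right with NA, without a warning
  extra = "drop"    # silently discard pieces beyond the second
)

print(split_df)
```

Without these two arguments, separate() would still run but would emit warnings for rows that have too few or too many pieces.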


Filter for data science jobs

Since this dataset consists only of technology jobs, finding data science jobs specifically requires a bit more nuance. For example, if we filter for the words “data” or “analytics” as before, we catch many software development and database jobs that are not really data science roles.

ds_dice_df <- dice_jobs_df %>% filter(grepl("data|analytics", job_title, ignore.case = TRUE))
ds_dice_df %>% select(job_title, company, employment_type, skills) %>% head() %>% kable()
job_title company employment_type skills
Data Architect - III Mitchell Martin Contract Corp-To-Corp AIX, Architecture, Business Requirements, CSS, Development, HTML, HTTP, Informatica, Oracle, PLSQL, PL/SQL, Programming, Project, Shell Scripting, SQL, SQL Server, Supervision, Unix
Database Engineer Princeton Information Ltd Contract Independent Access, Applications, Architecture, Business Requirements, Computer, Configuration Management, Consulting, Database, Engineer, Engineers, IT, Linux, Management, Oracle, Project, SDLC, Security, SQL, SQL Server, Systems, Telecom, Web, Windows
Director of Research Databases and Software Development Mount Sinai Medical Center Full Time 2 years of progressive database and/or software development management experience. Experience with healthcare and research preferred
Data Scientist - NYC Amazon Full Time Analysis, Automated, Data Mining, Data Modeling, Development, Java, Matlab, Modeling, Perl, Python, Research, Validation
Business Analyst Data Warehouse NYC Telecomm Software Full Time business analyst, tableau, etl, data warehouse
Project Manager for Big Data Blackstone Professional Recruiting Contract Corp-To-Corp “project manager”

We can refine our search by excluding words like “engineer” and “architect” to get a more relevant result.

ds_dice_df <- ds_dice_df %>% 
  filter(!grepl("architect|architecture|engineer|developer|development|administrator|administration", job_title, ignore.case = TRUE))
ds_dice_df %>% select(job_title, company, employment_type, skills) %>% head() %>% kable()
job_title company employment_type skills
Data Scientist - NYC Amazon Full Time Analysis, Automated, Data Mining, Data Modeling, Development, Java, Matlab, Modeling, Perl, Python, Research, Validation
Business Analyst Data Warehouse NYC Telecomm Software Full Time business analyst, tableau, etl, data warehouse
Project Manager for Big Data Blackstone Professional Recruiting Contract Corp-To-Corp “project manager”
Senior Product Manager, Enterprise Data Management Bridge Search Associates Full Time Product Manager, Enterprise, Data, Marketing, Pricing, Go-To-Market
Executive Director, Implementation and Data Governance Blue Horizon Tek Solutions, Inc. Full Time Establish the Competency Center Strategy, Guidelines, Methodology, Processes, and Best Practices
Senior Analytics Specialist Intelligent Capital Network, Inc. Contract Corp-To-Corp analytics
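
The include-then-exclude logic can be written as a single logical expression. The sketch below applies the same two patterns to a handful of made-up titles (the vector is an assumption for illustration).

```r
# Hypothetical job titles for illustration
titles <- c(
  "Data Architect - III",
  "Database Engineer",
  "Data Scientist - NYC",
  "Senior Analytics Specialist",
  "Java Developer"
)

# Titles that mention data or analytics
include <- grepl("data|analytics", titles, ignore.case = TRUE)

# Titles that point to engineering, development or administration roles
exclude <- grepl(
  "architect|architecture|engineer|developer|development|administrator|administration",
  titles, ignore.case = TRUE
)

# Keep titles that match the first pattern but not the second
kept <- titles[include & !exclude]

print(kept)
```

Only the data scientist and analytics specialist titles remain. Since grepl matches substrings, "architect" already covers "architecture", so the exclusion pattern could be shortened without changing the result.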


Search for keywords

We can text mine the job_description and skills columns to find specific keywords.
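
Keyword searches of this kind are sensitive to how the pattern treats surrounding punctuation, especially for a one-letter keyword such as R. The sketch below compares a space-and-comma pattern, like the one used for R in this section, with a word-boundary pattern on made-up snippets. The snippets are assumptions for illustration, and unlike the search in this section, both patterns here are case-sensitive. The word-boundary version is shown as an alternative to consider, not as the method used in this analysis.

```r
# Hypothetical text snippets for illustration
snippets <- c(
  "Experience in using R, Matlab, or other statistical software",
  "Skills: R/Perl/Python",
  "Statistical tools such as R.",
  "R and SQL required",
  "Reports to the R&D team",
  "Familiarity with RStudio"
)

# Requires a space before R and a space or comma after it
space_comma <- grepl(" R | R,", snippets)

# Requires R to stand alone as a word, whatever punctuation surrounds it
word_boundary <- grepl("\\bR\\b", snippets)

print(data.frame(snippets, space_comma, word_boundary))
```

In this sketch the space-and-comma pattern finds only the first snippet, while the word-boundary pattern finds the first five. The fifth is a false positive ("R&D"), so a broader pattern trades missed mentions for extra noise, and the matches still need to be reviewed by eye.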


How many job postings mention the R programming language?

r_dice_df <- ds_dice_df %>% 
  filter(grepl(" R | R,", job_description, ignore.case = TRUE) | grepl(" R | R,", skills, ignore.case = TRUE))
nrow(r_dice_df)
## [1] 6
r_dice_df %>% head() %>% kable()
advertiser_url company employment_type job_status job_description job_id job_location job_title post_date shift site_name skills uniq_id
https://www.dice.com/jobs/detail/Data-Scientist-%2526%252345-NYC-Amazon-New-York-NY-10001/amazon20/418984?icid=sr9445-315p&q=&l=New%20York,%20NY Amazon Full Time NA Excited by Big Data, Machine Learning and Predictive Software? Interested in creating new state-of-the-art solutions using Machine Learning and Data Mining techniques on Terabytes of Data? At AWS, we are developing state-of-the-art large-scale Machine Learning Services and Applications on the Cloud involving large amounts of data. We work on applying predictive technology to a wide spectrum of problems. We are looking for talented and experienced Machine Learning Scientists who can apply innovative Machine Learning techniques to real-world problems. You will get to work in a team dedicated to advancing Machine Learning solutions at AWS and converting it to business-impacting solutions. Major responsibilities * Use machine learning, data mining and statistical techniques to create new, scalable solutions for business problems * Analyze and extract relevant information from business data to help automate and optimize key processes * Design, develop and evaluate highly innovative models for predictive learning * Establish scalable, efficient, automated processes for large scale data analyses, model development, model validation and model implementation * Research and implement novel machine learning and statistical approaches Basic Qualifications * An MS/PhD in CS, Machine Learning, Operational research, Statistics or in a highly quantitative field. PhD strongly preferred. 
* 4+ years of industrial experience in predictive modeling and analysis, predictive software development * Strong Problem solving ability * Good skills with Java or C++, Perl/Python (or similar scripting language) * Experience in using R, Matlab, or any other statistical software * Experience in mentoring junior team members, and guiding them on machine learning and data modeling applications * Strong communication and data presentation skills * Ability to travel to client locations when needed. Up to 50% regionally. Preferred Qualifications * 4+ years of industrial experience in predictive modeling and analysis, predictive software development * Experience handling terabyte size datasets * Experience working with distributed systems and grid computing * Publications or presentation in recognized Machine Learning and Data Mining journals/conferences   Posted Date: 6/17/2016 Dice Id : amazon20 New York, NY Data Scientist - NYC 1 week ago Telecommuting not available|Travel required to 50%. Analysis, Automated, Data Mining, Data Modeling, Development, Java, Matlab, Modeling, Perl, Python, Research, Validation 0dec6af599287542a20279a6f6a87802
https://www.dice.com/jobs/detail/Data-Scientist-M2-Resources-Inc-New-York-NY-10285/10125751/686713?icid=sr10841-362p&q=&l=New%20York,%20NY M2 Resources Inc Full Time Fulltime Role: Data Scientist Location: New York City, NYType of hire: Fulltime/Permanent Basic Qualifications: PhD/Post Doc in any field with advanced quantitative focus in modelling oriented discipline including but not limited to Machine learning, Statistics, Psychometrics, Mathematics, Physics, Chemistry, Biology, Bioinformatics, Econometrics, Neuroscience, Computer Science.5+ years of analytical experience including 2-3 years of post-PhD experience in the field of advanced quantitative techniques while working for leading global academic institutes or corporate innovation research labs or analytics organizations of large corporate or in consulting companies in analytics roles.Nice blend of big data technologies coupled with strong knowledge of predictive modeling methods! Additionally, you must be skilled at clearly communicating your findings and translating them into practical solutions. 
Sound knowledge and application in the following: Advanced statistical methods including complex multivariate statistical methods, discrete choice modelling, conjoint based analysisAdvance knowledge of machine learning methods including classification, regression and clustering methodsKnowledge of heuristic methods and optimization techniques including system modeling and simulations.Deep programming skills and 3+ years’ experience in R, Perl, Python, or other languages appropriate for large scale analysis of numerical and categorical dataAdvanced quantitative methods relevant to modelling risk and consumer behavior: both parametric and non-parametric modelling, using unguided, semi-guided and guided approaches as appropriate.Willingness and desire to learn from other data scientists and modellers in the team on the art and science of modelling, feature engineering, decision trade-offs between model complexity and model deploymentExcellent prototyping skillsExcellent interpersonal and collaboration skills, ability to explain complicated mathematical concepts, algorithms and data structures to all business partners Technical Pluses: Experience with graph algorithms such as semi-supervised learning on graphs, graph clustering, community detection, interest/topic graphs, and social network analysisKnowledge of emerging platforms Responsibilities: Understand complex business challenges, develop hypotheses, integrate internal and external data sources, analyze them using cutting edge machine learning or statistical modelling techniques to uncover causality (i.e., we go beyond correlations and interesting trends in making decisions that affect people’s financial wellbeing) and synthesizing insightsPropose innovative modelling solutions, evaluate their effectiveness through proof of concept experimentations and refine and enhance them as necessary to ensure scalability and provide support for their implementation. 
Create new models through entire life cycle using the most effective application of supervised, semi-supervised and unsupervised parametric and non-parametric modeling methods.Investigate the impact of new computing technologies and niche, cutting edge analytical techniques and specialized applications, on the future of bankingDrive, understand, and adapt latest developments in machine learning and statistical modelling and apply them appropriately to solve business problems. Clean, manipulate and investigate large data sets If interested, please share me your resume to sarath(at)m2ri(dot)com / contact me at 856-624-9036 / Apply through dice   Thanks & Regards, Sarathkumar M2 Resources Inc.   Phone: 856-624-9036 (Direct)Email: sarath@m2ri.comGtalk: sarathm2r Dice Id : 10125751 New York, NY Data Scientist 2 weeks ago Telecommuting not available|Travel not required Data Science, Quantitative modelling, Machine learning, Statistics, Predictive modelling, R/Perl/Python d03eac43ee018bfcc08f234cca54c0c7
https://www.dice.com/jobs/detail/SAS-Data-Analyst-%2528Remote%2529-Odesus-New-York-NY-10020/10106335/JG9095?icid=sr11591-387p&q=&l=New%20York,%20NY Odesus C2H W2 6+ Months Junior SAS Architect (Telecommuting/Remote position)Our entertainment client in New York City is hiring 2 Jr. SAS Architects for their Avenue of the Americas location. The Jr. SAS Architects will assist in quickly building 2000+ forecasting models using SAS Forecaster. The SAS Architects will be able to work virtually and that they will not need to be located in NYC. However, they will be asked to come into NYC sporadically. Anyone more than 2 hours away, will have travel and hotel reimbursed. Qualifications:Experience with Time series modeling, ETS, Forecasting, R or even “mixed modeling”.The resources for running SAS Forecast Server should be comfortable executing project setup, running models, creating / evaluating tables of outputs, understand forecast reconciliations (for hierarchical models) and understanding how to incorporate events / inputs into forecasting models.These people could be relatively more junior as we have relatively well defined process in place for typical time series model development and validation. Responsibilities: The resources for running SAS Forecast Server should be comfortable executing project setup, running models, creating / evaluating tables of outputs, understand forecast reconciliations (for hierarchical models) and understanding how to incorporate events / inputs into forecasting models.These people could be relatively more junior as we have relatively well defined process in place for typical time series model development and validation. Dice Id : 10106335 New York, NY SAS Data Analyst (Remote) 4 weeks ago Telecommuting available|Travel not required SAS Architect, forecasting models, SAS Forecaster, ETS a9b83fec319ba8a9d21b203f78bd381f
https://www.dice.com/jobs/detail/Advance-Analytics-Net2Source-Inc.-New-York-NY-10001/10271304/Analytics_NY?icid=sr11724-391p&q=&l=New%20York,%20NY Net2Source Inc. Full Time Contract Corp-To-Corp Net2Source, Inc. is one of the fastest growing IT Consulting company across USA. N2S is headquartered at NJ, USA with its branch offices in Asia Pacific Region. N2S offers a wide gamut of consulting solutions customized to client needs including staffing, training and technologyPosition : Advance Analytics Location : New York,NYDuration : Full Time Mandatory skill : R, SAS, Strong Analytical SkillsJob Description : * Strong analytical skills and intellectual curiosity * Proven experience with various Data and statistical analysis tools such as R, SAS, Python, SQL Strong communication, presentation and writing skills with emphasis on demonstrated ability to translate complex concepts between business and technical resources* Strong ability to take business issues and transform them into analytics requirementsStrong interpersonal and teamwork skills* Ability to take business context and strategy married with current state of analytics and derive future state analytics and the requirements necessary to achieve clients strategic business objectives Thought leadership in the areas of data, data structure, analytics, and strategic consulting * Ability to leverage internal and client resources as well as knowledge of market, competitors, and industry to assess implications for clients* business strategy, identify potential business, analytics, and data challenges, frame potential solutions, and drive these solutions in the overall business and analytics strategy process* Experience building long term data and analytics strategic plans that support overall corporate and departmental business objectives and future vision * Innovative thinker that can paint a vision of the possible for colleagues as well as clients and then drive the implementation of that vision through effective 
project planning and executionAbout Net2Source, Inc.Net2Source is an employer-of-choice for over 1000 consultants across the globe. We recruit top-notch talent for over 40 Fortune and Government clients coast-to-coast across the U.S. We are one of the fastest-growing companies in the U.S. and this may be your opportunity to join us! Want to read more about Net2Source? , Visit us at www.net2source.comRegards,Shashi KumarSr. Technical RecruiterNet2Source Inc.Direct : 201-479-2153Board: (201) 340-8700 Ext. 442Fax : (201) 221-8131 Address: One Evertrust Plaza, Suite # 305, Jersey City, NJ - 07302https://in.linkedin.com/in/shashi-singh-76b47685Website: www.net2source.com Refer and Earn : For contractual position upto $500 and for Full time upto $1000 Note: Under Bill s.1618 Title III passed by the 105th U.S. Congress this mail cannot be considered Spam as long as we include contact information and a remove link for removal from our mailing list. To be removed from our mailing list reply with “remove” and include your “original email address/addresses” in the subject heading. Include complete address/addresses and/or domain to be removed. We will immediately update it accordingly. We apologize for the inconvenience if any caused Dice Id : 10271304 New York, NY Advance Analytics 1 month ago Telecommuting not available|Travel not required Advance Analytics,optimization,Oracle Data Mining,Oracle R Enterprise,text mining,statistical analysis,computations,R scripts,R functions,Data Scientists,RStudio 80525fe2d264ca2bae9166bf9fee69ea
https://www.dice.com/jobs/detail/Sr.-Data-Scientist-%2526%252345-Machine-Learning%252C-Python%252C-R%252C-Predictive-Analytics-%2526%252347-Big-Data-Precision-Systems-New-York-NY-10001/precisn/3032DI?icid=sr32793-1094p&q=&l=California,%20Us,%20CALIFORNIA Precision Systems Full Time Permanent Our direct client is looking for a Sr. Data Scientist with a programming background! The business itself focuses on using big data analytics and social graph theories to capitalize on human relationships and to predict and understand human behavior in ways that benefit their clients. So, this role is really at the heart of their business! You will be replacing a senior consultant - they want to bring this role in-house.Apply today to be a part of an exciting, fast-paced organization in the center of some of the latest technologies!Essential Skills:- 3+ years of non-academic data science experience- Software engineering background (Java preferred)- Strong foundation and expertise in at least two of the following: predictive modeling, statistical learning/inference, survey and experiment design and analysis, independent research, or NLP- Expertise in various statistical packages such as R, STATA, Python, Weka, or Apache Spark- Experience with SQL queries and data visualization tools- Familiarity with graph theory/algorithms and numerical optimizationPlusses:- BA/BS in Economics, Statistics, Mathematics, Computer Science, Machine Learning, or other related technical field- PhD preferred* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *Please click the Apply Now button to apply for the job.We will review your resume and call if you are qualified.Resumes will NOT be sent to clients without your approval.REFERRALS WANTED - $ 1000 REWARD!Refer a colleague to us, and Precision will give you $ 1000 if we find a job for that person!(The fine print: The referred candidate must be previously unknown to us. Start date must be within 6 months of referral.) 
Job ID 3032DI-3230 Dice Id : precisn New York, NY Sr. Data Scientist - Machine Learning, Python, R, Predictive Analytics / Big Data 5 hours ago Telecommuting not available|Travel not required Data Science, Predictive Analytics, Machine Learning, Big Data, Python, Java, R, Statistics, SQL, Hadoop, Clustering Methods, Graph Theory a06157d9a8fe7f48a28adf6d405c55d2
https://www.dice.com/jobs/detail/Network-Data-Scientist-Open-Systems-Technologies-New-York-NY-10022/opensyst/31192?icid=sr61267-2043p&q=&l=California,%20Us,%20CALIFORNIA Open Systems Technologies Full Time NA The team focuses on real-time monitoring of changing market conditions, loads on distribution servers and network outages when building high-quality distribution algorithms.Analyze the efficiency of distribution algorithms and suggest innovative features and enhancements to improve qualityTake full ownership of the full software development life-cycle, including researching infrastructure needs for adopting new technologiesQualifications3+ years of experience programming in Python/R/SAS/Matlab or similar technologyKnowledge of data science and machine learning methodologiesBackground in statistical inference over time-series, unsupervised data in a professional environmentKnowledge of querying relational databases (MSSQL, Oracle, MySQL) or NoSQL (Cassandra/MongoDB)Experience with distributed datastores (Hadoop/S3) or search technologies (Splunk/Solr) Dice Id : opensyst New York, NY Network Data Scientist 4 days ago Telecommuting not available|Travel not required data science, python, r, sas, matlab, machine learning, MSSQL, oracle, mysql, nosql, cassandra, mongodb, splunk, solr, hadoop, s3 c79924320c73852f152b9af29b7b3a32

Of our 59 filtered job listings, six (about 10%) explicitly mention R.


How many job postings mention Python?

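# Match "python" preceded by a space and followed by a space or comma,
# in either the job description or the skills field (case-insensitive)
python_dice_df <- ds_dice_df %>% 
  filter(grepl(" python | python,", job_description, ignore.case=TRUE) | grepl(" python | python,", skills, ignore.case=TRUE))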
nrow(python_dice_df)
## [1] 8
python_dice_df %>% head() %>% kable()
advertiser_url company employment_type job_status job_description job_id job_location job_title post_date shift site_name skills uniq_id
https://www.dice.com/jobs/detail/Data-Scientist-%2526%252345-NYC-Amazon-New-York-NY-10001/amazon20/418984?icid=sr9445-315p&q=&l=New%20York,%20NY Amazon Full Time NA Excited by Big Data, Machine Learning and Predictive Software? Interested in creating new state-of-the-art solutions using Machine Learning and Data Mining techniques on Terabytes of Data? At AWS, we are developing state-of-the-art large-scale Machine Learning Services and Applications on the Cloud involving large amounts of data. We work on applying predictive technology to a wide spectrum of problems. We are looking for talented and experienced Machine Learning Scientists who can apply innovative Machine Learning techniques to real-world problems. You will get to work in a team dedicated to advancing Machine Learning solutions at AWS and converting it to business-impacting solutions. Major responsibilities * Use machine learning, data mining and statistical techniques to create new, scalable solutions for business problems * Analyze and extract relevant information from business data to help automate and optimize key processes * Design, develop and evaluate highly innovative models for predictive learning * Establish scalable, efficient, automated processes for large scale data analyses, model development, model validation and model implementation * Research and implement novel machine learning and statistical approaches Basic Qualifications * An MS/PhD in CS, Machine Learning, Operational research, Statistics or in a highly quantitative field. PhD strongly preferred. 
* 4+ years of industrial experience in predictive modeling and analysis, predictive software development * Strong Problem solving ability * Good skills with Java or C++, Perl/Python (or similar scripting language) * Experience in using R, Matlab, or any other statistical software * Experience in mentoring junior team members, and guiding them on machine learning and data modeling applications * Strong communication and data presentation skills * Ability to travel to client locations when needed. Up to 50% regionally. Preferred Qualifications * 4+ years of industrial experience in predictive modeling and analysis, predictive software development * Experience handling terabyte size datasets * Experience working with distributed systems and grid computing * Publications or presentation in recognized Machine Learning and Data Mining journals/conferences   Posted Date: 6/17/2016 Dice Id : amazon20 New York, NY Data Scientist - NYC 1 week ago Telecommuting not available|Travel required to 50%. Analysis, Automated, Data Mining, Data Modeling, Development, Java, Matlab, Modeling, Perl, Python, Research, Validation 0dec6af599287542a20279a6f6a87802
https://www.dice.com/jobs/detail/Data-Scientist-M2-Resources-Inc-New-York-NY-10285/10125751/686713?icid=sr10841-362p&q=&l=New%20York,%20NY M2 Resources Inc Full Time Fulltime Role: Data Scientist Location: New York City, NYType of hire: Fulltime/Permanent Basic Qualifications: PhD/Post Doc in any field with advanced quantitative focus in modelling oriented discipline including but not limited to Machine learning, Statistics, Psychometrics, Mathematics, Physics, Chemistry, Biology, Bioinformatics, Econometrics, Neuroscience, Computer Science.5+ years of analytical experience including 2-3 years of post-PhD experience in the field of advanced quantitative techniques while working for leading global academic institutes or corporate innovation research labs or analytics organizations of large corporate or in consulting companies in analytics roles.Nice blend of big data technologies coupled with strong knowledge of predictive modeling methods! Additionally, you must be skilled at clearly communicating your findings and translating them into practical solutions. 
Sound knowledge and application in the following: Advanced statistical methods including complex multivariate statistical methods, discrete choice modelling, conjoint based analysisAdvance knowledge of machine learning methods including classification, regression and clustering methodsKnowledge of heuristic methods and optimization techniques including system modeling and simulations.Deep programming skills and 3+ years’ experience in R, Perl, Python, or other languages appropriate for large scale analysis of numerical and categorical dataAdvanced quantitative methods relevant to modelling risk and consumer behavior: both parametric and non-parametric modelling, using unguided, semi-guided and guided approaches as appropriate.Willingness and desire to learn from other data scientists and modellers in the team on the art and science of modelling, feature engineering, decision trade-offs between model complexity and model deploymentExcellent prototyping skillsExcellent interpersonal and collaboration skills, ability to explain complicated mathematical concepts, algorithms and data structures to all business partners Technical Pluses: Experience with graph algorithms such as semi-supervised learning on graphs, graph clustering, community detection, interest/topic graphs, and social network analysisKnowledge of emerging platforms Responsibilities: Understand complex business challenges, develop hypotheses, integrate internal and external data sources, analyze them using cutting edge machine learning or statistical modelling techniques to uncover causality (i.e., we go beyond correlations and interesting trends in making decisions that affect people’s financial wellbeing) and synthesizing insightsPropose innovative modelling solutions, evaluate their effectiveness through proof of concept experimentations and refine and enhance them as necessary to ensure scalability and provide support for their implementation. 
Create new models through entire life cycle using the most effective application of supervised, semi-supervised and unsupervised parametric and non-parametric modeling methods.Investigate the impact of new computing technologies and niche, cutting edge analytical techniques and specialized applications, on the future of bankingDrive, understand, and adapt latest developments in machine learning and statistical modelling and apply them appropriately to solve business problems. Clean, manipulate and investigate large data sets If interested, please share me your resume to sarath(at)m2ri(dot)com / contact me at 856-624-9036 / Apply through dice   Thanks & Regards, Sarathkumar M2 Resources Inc.   Phone: 856-624-9036 (Direct)Email: sarath@m2ri.comGtalk: sarathm2r Dice Id : 10125751 New York, NY Data Scientist 2 weeks ago Telecommuting not available|Travel not required Data Science, Quantitative modelling, Machine learning, Statistics, Predictive modelling, R/Perl/Python d03eac43ee018bfcc08f234cca54c0c7
https://www.dice.com/jobs/detail/Advance-Analytics-Net2Source-Inc.-New-York-NY-10001/10271304/Analytics_NY?icid=sr11724-391p&q=&l=New%20York,%20NY Net2Source Inc. Full Time Contract Corp-To-Corp Net2Source, Inc. is one of the fastest growing IT Consulting company across USA. N2S is headquartered at NJ, USA with its branch offices in Asia Pacific Region. N2S offers a wide gamut of consulting solutions customized to client needs including staffing, training and technologyPosition : Advance Analytics Location : New York,NYDuration : Full Time Mandatory skill : R, SAS, Strong Analytical SkillsJob Description : * Strong analytical skills and intellectual curiosity * Proven experience with various Data and statistical analysis tools such as R, SAS, Python, SQL Strong communication, presentation and writing skills with emphasis on demonstrated ability to translate complex concepts between business and technical resources* Strong ability to take business issues and transform them into analytics requirementsStrong interpersonal and teamwork skills* Ability to take business context and strategy married with current state of analytics and derive future state analytics and the requirements necessary to achieve clients strategic business objectives Thought leadership in the areas of data, data structure, analytics, and strategic consulting * Ability to leverage internal and client resources as well as knowledge of market, competitors, and industry to assess implications for clients* business strategy, identify potential business, analytics, and data challenges, frame potential solutions, and drive these solutions in the overall business and analytics strategy process* Experience building long term data and analytics strategic plans that support overall corporate and departmental business objectives and future vision * Innovative thinker that can paint a vision of the possible for colleagues as well as clients and then drive the implementation of that vision through effective 
project planning and executionAbout Net2Source, Inc.Net2Source is an employer-of-choice for over 1000 consultants across the globe. We recruit top-notch talent for over 40 Fortune and Government clients coast-to-coast across the U.S. We are one of the fastest-growing companies in the U.S. and this may be your opportunity to join us! Want to read more about Net2Source? , Visit us at www.net2source.comRegards,Shashi KumarSr. Technical RecruiterNet2Source Inc.Direct : 201-479-2153Board: (201) 340-8700 Ext. 442Fax : (201) 221-8131 Address: One Evertrust Plaza, Suite # 305, Jersey City, NJ - 07302https://in.linkedin.com/in/shashi-singh-76b47685Website: www.net2source.com Refer and Earn : For contractual position upto $500 and for Full time upto $1000 Note: Under Bill s.1618 Title III passed by the 105th U.S. Congress this mail cannot be considered Spam as long as we include contact information and a remove link for removal from our mailing list. To be removed from our mailing list reply with “remove” and include your “original email address/addresses” in the subject heading. Include complete address/addresses and/or domain to be removed. We will immediately update it accordingly. We apologize for the inconvenience if any caused Dice Id : 10271304 New York, NY Advance Analytics 1 month ago Telecommuting not available|Travel not required Advance Analytics,optimization,Oracle Data Mining,Oracle R Enterprise,text mining,statistical analysis,computations,R scripts,R functions,Data Scientists,RStudio 80525fe2d264ca2bae9166bf9fee69ea
https://www.dice.com/jobs/detail/Sr.-Data-Scientist-%2526%252345-Machine-Learning%252C-Python%252C-R%252C-Predictive-Analytics-%2526%252347-Big-Data-Precision-Systems-New-York-NY-10001/precisn/3032DI?icid=sr32793-1094p&q=&l=California,%20Us,%20CALIFORNIA Precision Systems Full Time Permanent Our direct client is looking for a Sr. Data Scientist with a programming background! The business itself focuses on using big data analytics and social graph theories to capitalize on human relationships and to predict and understand human behavior in ways that benefit their clients. So, this role is really at the heart of their business! You will be replacing a senior consultant - they want to bring this role in-house.Apply today to be a part of an exciting, fast-paced organization in the center of some of the latest technologies!Essential Skills:- 3+ years of non-academic data science experience- Software engineering background (Java preferred)- Strong foundation and expertise in at least two of the following: predictive modeling, statistical learning/inference, survey and experiment design and analysis, independent research, or NLP- Expertise in various statistical packages such as R, STATA, Python, Weka, or Apache Spark- Experience with SQL queries and data visualization tools- Familiarity with graph theory/algorithms and numerical optimizationPlusses:- BA/BS in Economics, Statistics, Mathematics, Computer Science, Machine Learning, or other related technical field- PhD preferred* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *Please click the Apply Now button to apply for the job.We will review your resume and call if you are qualified.Resumes will NOT be sent to clients without your approval.REFERRALS WANTED - $ 1000 REWARD!Refer a colleague to us, and Precision will give you $ 1000 if we find a job for that person!(The fine print: The referred candidate must be previously unknown to us. Start date must be within 6 months of referral.) 
Job ID 3032DI-3230 Dice Id : precisn New York, NY Sr. Data Scientist - Machine Learning, Python, R, Predictive Analytics / Big Data 5 hours ago Telecommuting not available|Travel not required Data Science, Predictive Analytics, Machine Learning, Big Data, Python, Java, R, Statistics, SQL, Hadoop, Clustering Methods, Graph Theory a06157d9a8fe7f48a28adf6d405c55d2
https://www.dice.com/jobs/detail/Data-Analyst-at-%25246B-Hedge-Fund-in-Midtown-Averity-New-York-NY-10016/90906950/699900?icid=sr10018-334p&q=&l=New%20York,%20NY Averity Full Time NA We are looking for a strong Data Analyst who possesses the ability to work directly with Investment Committee on utilizing data to answer important questions and support the decision making process. What’s the Job?As a Data Analyst you will partner with a mix of developers, engineers and the business to integrate, cleanse, monitor and prep multiple large streams of incoming data.Build tools to load and monitor financial dataImplement tool sets and databases to support research and investment decision processCreate and use bespoke softwareUse expertise in analyzing data to answer business questions Who Are We?We are a top-tier Multi-Strategy Hedge Fund focused on managing assets for our global clients. Compensation$160,000 (commensurate with experience)Fully Covered Medical401k MatchFlexible Paid Time Off PolicyCatered Meals and Fully Stocked Kitchen What Skills Do You Need?Knowledge of SQL and advanced Excel neededUnderstanding of C# and Python a plus Experience with a variety of asset class data is desirable, including equity, derivatives, futures, bonds, preferred stocks, foreign exchange and commodities What’s In It For You?This is a great opportunity to make a substantial impact in a growing hedge fund and gain experience working with some of wall streets best investment individuals. Dice Id : 90906950 New York, NY Data Analyst at $6B Hedge Fund in Midtown 2 weeks ago Telecommuting not available|Travel not required SQL, Excel, Python, C# b29cc4ed3617a6edf1258c25c8a44aef
https://www.dice.com/jobs/detail/Consultant-Data-Project-Manager--Investment-Bank-Thomson-Keene-Inc.-New-York-NY-10007/10527936/699751?icid=sr10010-334p&q=&l=New%20York,%20NY Thomson Keene Inc. Contract W2 12 months Technical project manager is required to support a well regarded development team working on a flagship global reference data platform ensuring the successful delivery of a significant change program.  You will ideally have a technical background and understanding of the agile SDLC and typical software testing cycles.  The development team are using leading edge data management technologies, a knowledge of this domain is highly desirable (Scala, Python, RDF messaging).  Main responsibilities will include proactive management and tracking of issues and risks, communicating to relevant stakeholders and ensuring timelines are maintained.  You will keep the PMO office updated creating dashboards for upward reporting utilising JIRA (already in place).  This is an excellent opportunity to work with a high performing technical team using the most current technologies in a key strategic group within the bank. Dice Id : 10527936 New York, NY Consultant Data Project Manager Investment Bank 2 weeks ago Telecommuting not available|Travel not required PM, Project Manager, Data, Agile, JIRA 12d835e60b4a626dc6614e31cad6732a

Of our 59 filtered job listings, eight (about 14%) explicitly mention Python.
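
The patterns used above only match a keyword that has a space before it and a space or comma after it, so a mention such as "R/Perl/Python" or a keyword at the end of a sentence is not counted. The following is a minimal sketch of a looser alternative, not the method used for the counts in this report. The helper name `boundary_pattern` and the toy strings are hypothetical, and switching to this pattern would change the reported numbers.

```r
# Toy strings (hypothetical, not rows from the dataset) covering edge cases
toy <- c("Experience in R, SAS or Matlab",
         "R/Perl/Python preferred",
         "Strong Python.",
         "Our team builds forecasting models")

# Pattern used in this report: space before, space or comma after
orig_pattern <- " python | python,"

# Alternative: any non-alphanumeric character (or the string edge) on both sides
boundary_pattern <- function(keyword) {
  paste0("(^|[^[:alnum:]])", keyword, "([^[:alnum:]]|$)")
}

orig_hits     <- grepl(orig_pattern, toy, ignore.case = TRUE)
boundary_hits <- grepl(boundary_pattern("python"), toy, ignore.case = TRUE)

# For a one-letter keyword such as R, case-sensitive matching avoids
# counting a stray lowercase "r"
r_hits <- grepl(boundary_pattern("R"), toy)

data.frame(toy, orig_hits, boundary_hits, r_hits)
```

On the toy strings, the original pattern misses both Python mentions, while the boundary version picks them up without matching the "r" inside ordinary words.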


Keywords in Indeed data

Next we load Mary Anna’s dataset of job listings scraped from indeed.com, which is stored in a PostgreSQL database.

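# Connect to the PostgreSQL database that holds the Indeed listings
listings_db <- src_postgres(host="mkivenson-job-scrape-data.cvc7wr5vvljm.us-east-1.rds.amazonaws.com", user="postgres", password="postgres607", dbname="listings")
# Pull every row of the "listings" table into a local data frame
listings_df <- tbl(listings_db, "listings") %>% collect(n=Inf)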
nrow(listings_df)
## [1] 1111
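
As an aside, the same table could be read through the DBI interface, with the credentials taken from environment variables rather than written into the script. This is only a sketch under assumptions: it requires the DBI and RPostgres packages, the function name `load_listings` and the environment variable names are hypothetical, and it has not been run against this database.

```r
# A sketch, not the report's method: connect via DBI + RPostgres.
# The environment variable names below are assumptions for illustration.
load_listings <- function(host     = Sys.getenv("LISTINGS_DB_HOST"),
                          user     = Sys.getenv("LISTINGS_DB_USER"),
                          password = Sys.getenv("LISTINGS_DB_PASSWORD"),
                          dbname   = "listings") {
  con <- DBI::dbConnect(RPostgres::Postgres(),
                        host = host, user = user,
                        password = password, dbname = dbname)
  # Close the connection when the function exits, even after an error
  on.exit(DBI::dbDisconnect(con), add = TRUE)
  # Read the whole "listings" table into a local data frame
  DBI::dbReadTable(con, "listings")
}
```

Calling `load_listings()` with the environment variables set would return a data frame that can be passed to the keyword searches.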


We can re-run the same keyword searches for R and Python on this new dataset, this time against the description and summary columns.

# Listings whose description or summary mentions R (space before, space or comma after)
r_listings_df <- listings_df %>% 
  filter(grepl(" R | R,", description, ignore.case=TRUE) | grepl(" R | R,", summary, ignore.case=TRUE))
nrow(r_listings_df)
## [1] 463
# Listings whose description or summary mentions Python, using the same rule
python_listings_df <- listings_df %>% 
  filter(grepl(" python | python,", description, ignore.case=TRUE) | grepl(" python | python,", summary, ignore.case=TRUE))
nrow(python_listings_df)
## [1] 586
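
Because the same search is repeated for each keyword and each dataset, it could be wrapped in a small helper. The sketch below uses base R only. The function name `count_mentions` and the toy data frame are hypothetical, and the patterns are the same ones used above.

```r
# Count rows where `pattern` matches in at least one of the given columns
count_mentions <- function(df, pattern, cols) {
  # One logical vector per column, combined with OR across columns
  hits <- Reduce(`|`, lapply(cols, function(col) {
    grepl(pattern, df[[col]], ignore.case = TRUE)
  }))
  sum(hits)
}

# Toy data (hypothetical) with the same column names as the Indeed table
toy_df <- data.frame(
  description = c("We use R, Python and SQL",
                  "Java developer",
                  "Knowledge of python required"),
  summary     = c("Data scientist",
                  "Must know R and Scala",
                  "Analyst"),
  stringsAsFactors = FALSE
)

n_r      <- count_mentions(toy_df, " R | R,", c("description", "summary"))
n_python <- count_mentions(toy_df, " python | python,", c("description", "summary"))
```

With the real data, the equivalent call would be `count_mentions(listings_df, " R | R,", c("description", "summary"))`.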

Of the 1111 data science job listings on Indeed, 463 (about 42%) explicitly mention R and 586 (about 53%) mention Python.
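
To compare the two sources on the same scale, the counts reported above can be converted to percentages. The sketch below simply restates those counts in a small data frame. The column names are chosen for illustration.

```r
# Counts reported above for the Dice and Indeed keyword searches
counts <- data.frame(
  source     = c("Dice", "Dice", "Indeed", "Indeed"),
  keyword    = c("R", "Python", "R", "Python"),
  n_mentions = c(6, 8, 463, 586),
  n_listings = c(59, 59, 1111, 1111)
)

# Share of listings that mention each keyword, in percent
counts$pct <- round(100 * counts$n_mentions / counts$n_listings, 1)
counts
```

The shares come out to roughly 10% and 14% for Dice, against roughly 42% and 53% for Indeed. The Dice figures rest on only 59 listings.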