We examine two datasets of job listings: one for New York City government jobs and another for technology jobs within New York City that were posted to dice.com. Both datasets were obtained from kaggle.com.
library(stringr)
library(dplyr)
library(tidyr)
library(zoo)
library(ggplot2)
library(knitr)
We start by importing the NYC jobs data from GitHub and cleaning up the column names. Then we drop the posting_type column and remove duplicate listings.
raw_nyc_df <- read.csv('https://raw.githubusercontent.com/mehtablocker/cuny_607/master/project_3/nyc-jobs.csv')
nyc_jobs_df <- raw_nyc_df
names(nyc_jobs_df) <- names(nyc_jobs_df) %>% tolower() %>% gsub("\\.", "_", .)
names(nyc_jobs_df)[names(nyc_jobs_df)=="x__of_positions"] <- "n_of_positions"
nyc_jobs_df <- nyc_jobs_df %>% select(-posting_type) %>% unique()
nyc_jobs_df %>% tail() %>% kable()
| | job_id | agency | n_of_positions | business_title | civil_service_title | title_code_no | level | job_category | full_time_part_time_indicator | salary_range_from | salary_range_to | salary_frequency | work_location | division_work_unit | job_description | minimum_qual_requirements | preferred_skills | additional_information | to_apply | hours_shift | work_location_1 | recruitment_contact | residency_requirement | posting_date | post_until | posting_updated | process_date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3966 | 379312 | OFFICE OF MANAGEMENT & BUDGET | 1 | Analyst | FEMA Public Assistance Policy & Monitoring & Compliance | 6088 | 1 | Finance, Accounting, & Procurement Policy, Research & Analysis | F | 45491 | 60660 | Annual | 255 Greenwich Street | Office of Budget Review | NA | 2019-01-07T00:00:00.000 | 2019-01-07T00:00:00.000 | 2019-02-19T00:00:00.000 | |||||||||
| 3968 | 380166 | OFFICE OF MANAGEMENT & BUDGET | 1 | Assistant Director | Value Engineering | 0608A | M4 | Engineering, Architecture, & Planning Technology, Data & Innovation Policy, Research & Analysis | F | 137637 | 137637 | Annual | 255 Greenwich Street | IFA (Mgrl) | NA | 2019-01-11T00:00:00.000 | 2019-01-11T00:00:00.000 | 2019-02-19T00:00:00.000 | |||||||||
| 3970 | 383297 | OFFICE OF MANAGEMENT & BUDGET | 1 | Analyst | DEPARTMENT OF ENVIRONMENTAL PROTECTION (DEP) | 6088 | 1 | Finance, Accounting, & Procurement Policy, Research & Analysis | F | 60660 | 68244 | Annual | 255 Greenwich Street | Infra, Librar. And Cultural | NA | 2019-02-07T00:00:00.000 | 2019-02-07T00:00:00.000 | 2019-02-19T00:00:00.000 | |||||||||
| 3972 | 383303 | MUNICIPAL WATER FIN AUTHORITY | 1 | Analyst | New York City Municipal Water Finance Authority (NYW) | 6088 | 1 | Finance, Accounting, & Procurement Policy, Research & Analysis | F | 60660 | 68244 | Annual | 255 Greenwich Street | Municipal Water Authority | NA | 2019-02-11T00:00:00.000 | 2019-02-11T00:00:00.000 | 2019-02-19T00:00:00.000 | |||||||||
| 3974 | 383828 | OFFICE OF MANAGEMENT & BUDGET | 1 | Analyst | REPORTING AND SYSTEMS MANAGEMENT | 6088 | 1 | Finance, Accounting, & Procurement Policy, Research & Analysis | F | 45491 | 68244 | Annual | 255 Greenwich Street | Community Development | NA | 2019-02-12T00:00:00.000 | 2019-02-12T00:00:00.000 | 2019-02-19T00:00:00.000 | |||||||||
| 3976 | 383833 | OFFICE OF MANAGEMENT & BUDGET | 1 | Analyst | Administration for Children’s Services | 6088 | 1 | Finance, Accounting, & Procurement Policy, Research & Analysis | F | 45491 | 68244 | Annual | 255 Greenwich Street | Social Services | NA | 2019-02-12T00:00:00.000 | 2019-02-12T00:00:00.000 | 2019-02-19T00:00:00.000 |
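The name-cleaning steps can be illustrated on a few toy header strings. This is a minimal sketch: the raw header names below are assumptions for illustration (the actual CSV headers are not shown here), and `make.names()` stands in for the renaming that `read.csv()` performs by default.

```r
# Hypothetical raw CSV headers, assumed for illustration only
raw_names <- c("Job ID", "# Of Positions", "Salary Range From")

# read.csv() makes names syntactically valid by default (check.names = TRUE),
# which is equivalent to calling make.names() on the header
syntactic <- make.names(raw_names)

# Same cleaning as in the text: lower-case, then replace dots with underscores
clean <- gsub("\\.", "_", tolower(syntactic))

# The "#" turns into "X." and ends up as "x_", hence the manual rename
clean[clean == "x__of_positions"] <- "n_of_positions"
print(clean)
```

The manual rename is needed because a header that starts with a non-letter character gets an `X` prefix, which carries no meaning after cleaning.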
We filter for data science jobs by using a case-insensitive regular expression to search the business_title column for the terms “data” or “analytics.” Then we create a second table containing the remaining, non-data jobs.
data_jobs_df <- nyc_jobs_df %>% filter(grepl("data|analytics", business_title, ignore.case = TRUE))
other_jobs_df <- nyc_jobs_df %>% filter(!grepl("data|analytics", business_title, ignore.case = TRUE))
data_jobs_df %>% head() %>% kable()
| job_id | agency | n_of_positions | business_title | civil_service_title | title_code_no | level | job_category | full_time_part_time_indicator | salary_range_from | salary_range_to | salary_frequency | work_location | division_work_unit | job_description | minimum_qual_requirements | preferred_skills | additional_information | to_apply | hours_shift | work_location_1 | recruitment_contact | residency_requirement | posting_date | post_until | posting_updated | process_date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 195805 | ADMIN FOR CHILDREN’S SVCS | 2 | Tracking and Monitoring Data Analyst | ASSOCIATE STAFF ANALYST | 12627 | 0 | Social Services | F | 59536 | 88649 | Annual | 66 John Street, New York, Ny | Headstart (ECE) | NA | 2015-05-29T00:00:00.000 | 2015-06-16T00:00:00.000 | 2019-02-19T00:00:00.000 | |||||||||
| 226044 | DEPT OF INFO TECH & TELECOMM | 1 | Payroll Data Associate | CLERICAL ASSOCIATE | 10251 | 3 | Clerical & Administrative Support | 32888 | 50000 | Annual | 75 Park Place New York Ny | Human Resources | NA | 2015-12-21T00:00:00.000 | 2015-12-21T00:00:00.000 | 2019-02-19T00:00:00.000 | ||||||||||
| 234203 | ADMIN FOR CHILDREN’S SVCS | 1 | Business and Data Analyst Manager | ASSOCIATE STAFF ANALYST | 12627 | 0 | Finance, Accounting, & Procurement | F | 63817 | 95022 | Annual | 150 William Street, New York N | Asst Comm Bdgt & Clmng-Financ | NA | 2016-03-03T00:00:00.000 | 2016-03-04T00:00:00.000 | 2019-02-19T00:00:00.000 | |||||||||
| 289615 | NYC HOUSING AUTHORITY | 1 | Data Support Analyst | COMPUTER SYSTEMS MANAGER | 10050 | M1 | Technology, Data & Innovation | F | 80000 | 100000 | Annual | Analysis & Reporting | Capital Projects Admin | NA | 2017-07-07T00:00:00.000 | 2018-07-13T00:00:00.000 | 2019-02-19T00:00:00.000 | |||||||||
| 294911 | NYC EMPLOYEES RETIREMENT SYS | 1 | CERTIFIED IT ADMINISTRATOR (DATABASE) | CERT IT ADMINISTRATOR (DB) | 13644 | 2 | Technology, Data & Innovation | F | 79471 | 111598 | Annual | 335 Adams Street, Brooklyn Ny | Executive Management | NA | 2017-07-19T00:00:00.000 | 2017-07-20T00:00:00.000 | 2019-02-19T00:00:00.000 | |||||||||
| 340965 | DEPT OF HEALTH/MENTAL HYGIENE | 1 | Data Manager, Viral Hepatitis, Bureau of Communicable Diseases | CITY RESEARCH SCIENTIST | 21744 | 1 | Health | F | 59708 | 65678 | Annual | 42-09 28th Street | Communicable Diseases | NA | 2018-05-07T00:00:00.000 | 2018-05-07T00:00:00.000 | 2019-02-19T00:00:00.000 |
We can see from the above table that many key fields are empty, including job_description and preferred_skills. This restricts our analysis to a few areas, such as job counts and salaries.
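One way to quantify how much is missing is to compute, per column, the share of rows that are either NA or an empty string. The sketch below uses a small made-up data frame (`toy_df` and its values are hypothetical); the same `sapply()` call could be pointed at data_jobs_df.

```r
# Hypothetical stand-in for the job listings table
toy_df <- data.frame(
  job_id = c(1, 2, 3),
  job_description = c("", NA, "Analyze data"),
  preferred_skills = c("", "", ""),
  stringsAsFactors = FALSE
)

# Share of rows in each column that are NA or empty strings
missing_share <- sapply(toy_df, function(col) mean(is.na(col) | col == ""))
print(round(missing_share, 2))
```

Columns with a share close to 1 are effectively unusable for text analysis.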
Of all the jobs working for New York City, how many are data jobs?
### Total number of jobs in the dataset:
nrow(nyc_jobs_df)
## [1] 2205
### Number of data jobs:
nrow(data_jobs_df)
## [1] 71
### Data jobs, as a fraction of total:
nrow(data_jobs_df)/nrow(nyc_jobs_df)
## [1] 0.03219955
In this dataset, only about 3.2 percent of jobs are data jobs.
Looking at the upper end of each posted salary range, how well do data jobs pay relative to non-data jobs?
### Data jobs
summary(data_jobs_df$salary_range_to)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 15 65339 84301 78740 99000 161497
### Non-data jobs
summary(other_jobs_df$salary_range_to)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 10.36 56703.25 75595.50 78639.47 101673.00 230000.00
par(mfrow=c(1,2))
boxplot(data_jobs_df$salary_range_to, xlab="Data Jobs", ylab="Salary in Dollars", ylim=c(0, 200000))
boxplot(other_jobs_df$salary_range_to, xlab="Non-Data Jobs", ylab="Salary in Dollars", ylim=c(0, 200000))
par(mfrow=c(1,1))
The distribution is wider for non-data jobs, but the median salary is higher for data jobs (about 84,300 versus 75,600), while the means are nearly identical (about 78,700 versus 78,600). It is important to remember that these are all government jobs, which overall may pay less than private sector jobs.
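The minimum values in the summaries above (15 and 10.36) suggest that some postings may list hourly rather than annual pay, which would distort the comparison. A minimal sketch of restricting the comparison to annual salaries follows. The toy rows and the "Hourly" label are assumptions; only "Annual" is visible in the tables shown here.

```r
# Hypothetical postings; the "Hourly" value is assumed for illustration
toy_jobs <- data.frame(
  business_title = c("Data Analyst", "Clerk", "Data Intern", "Engineer"),
  salary_range_to = c(84000, 56000, 15, 101000),
  salary_frequency = c("Annual", "Annual", "Hourly", "Annual"),
  stringsAsFactors = FALSE
)

# Keep only postings quoted as annual salaries
annual_jobs <- toy_jobs[toy_jobs$salary_frequency == "Annual", ]

# Flag data jobs with the same pattern used in the text
is_data <- grepl("data|analytics", annual_jobs$business_title, ignore.case = TRUE)

# Median upper salary for non-data (FALSE) and data (TRUE) jobs
median_by_group <- tapply(annual_jobs$salary_range_to, is_data, median)
print(median_by_group)
```

Medians are fairly robust to a handful of hourly entries, but means and box plot whiskers are not, so filtering first gives a cleaner comparison.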
Next we import the data for technology jobs within New York City that were posted to dice.com. We separate one of the columns and rename a few others.
raw_dice_df <- read.csv('https://raw.githubusercontent.com/mehtablocker/cuny_607/master/project_3/dice_com_nyc_jobs.csv', stringsAsFactors = FALSE)
dice_jobs_df <- as_tibble(raw_dice_df) %>%
  separate(employmenttype_jobstatus, into = c("employment_type", "job_status"), sep = ", ", fill = "right", extra = "drop")
dice_jobs_df <- dice_jobs_df %>%
  rename(advertiser_url = advertiserurl,
         job_description = jobdescription,
         job_id = jobid,
         job_location = joblocation_address,
         job_title = jobtitle,
         post_date = postdate)
dice_jobs_df %>% head() %>% kable()
| advertiser_url | company | employment_type | job_status | job_description | job_id | job_location | job_title | post_date | shift | site_name | skills | uniq_id |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| https://www.dice.com/jobs/detail/Front-End-Developer-Genesis10-New-York-NY-10001/gentx001/16-03932?icid=sr14288-477p&q=&l=California,%20Us,%20CALIFORNIA | Genesis10 | Full Time | Direct Placement | This is a fulltime position for a Javascript developer for a financial software, data, and media company. This role is based our of the Midtown, NY location (candidates must work onsite). You’ll be part of a team working on the application that is the glue that holds the firm’s main product together and is used as a communication tool, a price dissemination system and a way to receive information from all other applications within the firm. Responsibilities:  Help us design, create and build our next-generation user interface Be a key contributor in the re-architecting of our application service layers to improve our scalability, stability and performance Work with the team, our product and our customers to define priorities and the technology we use Work with other teams and learn about our search infrastructure, core databases and how we support hundreds of applications through our APIs Required Skills and Experience:  3+ years of production-level development Solid JavaScript knowledge Must want to and like working on UI Preferred:  An understanding of C++ fundamentals Experience working on other highly visible applications Start Date: 06/03/2016 | Dice Id : gentx001 | New York, NY | Front End Developer | 7 hours ago | Telecommuting not available|Travel not required | C++, Developer, Development, JavaScript, User Interface | 28f5e0c1cc3314813e674f0c32b04d1b | |
| https://www.dice.com/jobs/detail/Senior-Full-Stack-Developer-Genesis10-New-York-NY-10001/gentx001/16-02139?icid=sr14773-493p&q=&l=California,%20Us,%20CALIFORNIA | Genesis10 | Full Time | Direct Placement | Description:  Our client’s Open Software Frameworks team is building a variety of products used by both independent software vendors and in-house developers. Their diverse portfolio of products includes Application Portal, Geo Spatial Mapping platform and IDE used to build their terminal applications. This team is contributing to a number of cutting-edge open source projects. Based on the breadth of the team’s work, you will need to be a strong full stack developer. You strive to design, implement and support the ideal solution. Balance between elegant design and system performance and reliability is always at the front of your mind. You will have the opportunity to work closely with users, UX and Product teams. Requirements:  * 3+ years of experience programming in C/C++ * 3+ years of experience with HTML5, CSS and JavaScript * Strong OOD/OOP skills and experience applying modern design patterns * Knowledge of algorithms, standard data structures and multithreading We’d love to see: * Experience with C#, .Net internals and WPF * Experience developing distributed systems in a Windows or Linux environment * Familiarity with and understanding of an Agile methodology If you have the described qualifications and are interested in this exciting opportunity, please apply! About SWATT:  The Genesis10 Software and Technology Team (SWATT) is a specialized recruiting service focused on helping accomplished software developers, programmers, platform engineers and elite technology professionals find once-in-a-lifetime career opportunities in New York City with the world’s most advanced technology organizations. Whether local to New York or relocating from across North America, we take an authentic approach to helping people make life-changing technology career decisions. 
For more information go to http://swatt.genesis10.com/ “Genesis10 is an Equal Opportunity Employer, M/F/D/V” Start Date: 06/27/2016 | Dice Id : gentx001 | New York, NY | Senior Full Stack Developer | 7 hours ago | Telecommuting not available|Travel not required | .Net, Algorithms, Developer, JavaScript, Linux, OOD, OOP, Programming, Systems, Windows | b9520049ff3e799ec368ae0aa99ec5f5 | |
| https://www.dice.com/jobs/detail/Java-Full-Stack-Engineer-%2528Angular-JS-is-must%2529-Stratitude-Inc-New-York-City-NY-10001/10292564/569166?icid=sr14409-481p&q=&l=California,%20Us,%20CALIFORNIA | Stratitude Inc | Contract Corp-To-Corp | 12+ months | Required Skills                   1.      6-9 years of experience in software development.2.      Experience working with MVC based front-end library Angular JS3.      Strong expertise in Java/J2EE, Spring , Hibernate.4.      Experience with SQL/NoSQL databases.5.      Responsible for module design / high architecture.6.      Participate in customer interaction, code review and follow-up.7.      Work closely with the development team to optimize and improve the e-commerce platform to grow subscription business. This involved identifying opportunities , developing requirements and co-ordinating with QA. | Dice Id : 10292564 | New York City, NY | Java Full Stack Engineer (Angular JS is must) | 7 hours ago | Telecommuting not available|Travel not required | Angular JS, Java/J2EE, Spring , Hibernate | 755688570dda03850c9ea9974e241110 | |
| https://www.dice.com/jobs/detail/Linux-Engineer-Genesis10-New-York-NY-10001/gentx001/16-04058?icid=sr14777-493p&q=&l=California,%20Us,%20CALIFORNIA | Genesis10 | Full Time | Contract | Genesis10 is currently seeking a Product Specific Technologist with our client in the financial industry in their New York, NY location. This is a 12 month + contract position. Description: Â * High Server Patch project is major effort to remediate security patch level on Grid /Quartz LINUX Infrastructure and deploy recurring patching solution * We are hiring this position to work closely with business technology group, application owners and operations teams to establish and coordinate deployment of automated recurring patching solution * This sophisticated platform requires very specialized skills and this engineer will provide solution, design, development, and implementation and be responsible for co-ordination to facilitating the migration of this platform Responsibilities: Â * Requirement analysis of the project * Technical recommendations for automated patching * End to end technical plan for patching * Co-ordination of execution of automated recurring plan Requirements: Â * Experience with J2EE, Java, Python, C++ programming, shell script * Strong understanding and working experience of Linux Operating system * Good understanding of Network /Server Infrastructure spanning multiple sites * Experience with J2EE, Java, Python, C++ programming, shell script * Good communication skills, self-motivated, positive attitude and ability to work in a global team environment * Strong troubleshooting skills and application Administration/Support skills * Experience in engineering/upgrading complex solutions in line with demanding client requirements * Solutions engineering (design & implementation of messaging solutions) experience * Proven timely delivery of key infrastructure and products * Good understanding and work experience in an Enterprise Environment * Strong understanding and 
hands on Linux Operating system * Strong Development experience to write tool on Linux Operating system * Previous software solution to deploy patching solution will be plus * Ability to handle highly volatile support with platform and clients spanning multiple time zones * Ability to adapt quickly to the client needs * Team Player * Good Communication skills * Able to work as a W2 employee of Genesis10 (no Corp-to-Corp) If you have the described qualifications and are interested in this exciting opportunity, please apply! About Genesis10: Â Genesis10 is a leading U.S. business and technology consulting firm with hundreds of clients needing proven talent and solutions to power their strategic initiatives. If you are a high performing business or IT professional with solid, referenced experience, we want to meet you. Genesis10 recruiters and delivery professionals are highly accomplished career advocates, who get to know you beyond your resume to position you with the opportunities that fit your skills, experience and aspirations. We have benefit options to fit your needs and a support staff that works with you from placement throughout your engagement - project after project. To learn more about Genesis10 and to view all our available career opportunities, please visit us at www.genesis10.com. “Genesis10 is an Equal Opportunity Employer, M/F/D/V” | Dice Id : gentx001 | New York, NY | Linux Engineer | 7 hours ago | Telecommuting not available|Travel not required | Application, C++, Development, Engineer, Engineering, J2EE, Java, Linux, Network, Programming, Python, Shell Script, Software, System | 75296a05c107731f400d602f3e10bf47 | |
| https://www.dice.com/jobs/detail/Front-end%2526%252345UI-developer%2526%252347UI%2526%252345Web-designer-Fahrenheit-IT-Staffing-%2526-Consulting-New-York-NY-10010/10111360/3042319-28-MH17?icid=sr14379-480p&q=&l=California,%20Us,%20CALIFORNIA | Fahrenheit IT Staffing & Consulting | NA | Front end-UI developer/UI-Web designer job opportunity at a top-financial services company located in NYC * Midtown 6 * 12 months with high potential for FTE conversion. AN ONLINE PORTFOLIO IS REQUIRED TO APPLY TO THIS JOB  Contact oleon@fahrenheitit.com846 582 1467 Looking for CANDIDATES LOCATED IN THE TRI-STATE AREA ONLY If qualified and currently considering new job opportunities lets set up a time to discuss other details at your earliest convenience. Job details: This job is a combination of front end development (css, html, javascript) as well as user interface/web design As far as design looking for someone who can show artistic/creative samples of user interfaces the candidate has worked in. Presentation layers are critical [think colors, fonts, easy-to-read language, easy-to-follow design]  Looking for someone who can create interactive charts, graphs for reporting purposes where users can view dashboards, charts, graphs of analytical data (experience with one of the following D3, Angular js, or Node js. This team is responsible for presenting technical data through graphs, charts commercial quality data visualization for a large audience of stakeholders & users.  | Dice Id : 10111360 | New York, NY | Front end-UI developer/UI-Web designer | 7 hours ago | Telecommuting not available Travel not required | c6f99f38bf465487323ed9d46f19514d | |||
| https://www.dice.com/jobs/detail/Infrastructure-Production-Developer-Genesis10-New-York-NY-10001/gentx001/16-02261?icid=sr14184-473p&q=&l=California,%20Us,%20CALIFORNIA | Genesis10 | Full Time | Direct Placement | Description:  Our client’s Infrastructure Production group in delivers a wide range of technologies. The team builds out common services that every department can use and consume to monitor, visualize and diagnose their applications and infrastructure. This team is also the forefront in implementing modern technology ideology within the organization and assist all of the teams with implementation, automation and design. The Production Engineering team is a new team, and is one of the most fast-paced and soon to be widely used across the entire company. You will have the ability to be a part of a large cultural shift within the organization. If you like large scale systems, billions of data points a day, automating all of the things, hacking on open source software and making a cultural impact, ask us where to sign up. Responsibilities:  * Design, architect, automate and deliver large scale production ready services for employees to consume. * Build internal tools to monitor, visualize and diagnose all aspects of applications & hardware in our client’s stack. * Work closely with our client’s product and platform teams with architecture, design and scaling challenges they may have. * Help teams replace legacy software and design patterns with modern technologies. * Develop and maintain documentation, training and SLAs for managed infrastructure. 
Requirements:  * Minimum 2 - 3 years of experience building similar systems * Experience with large scale data processing * Previous experience automating and implementing large scale fault tolerant distributed systems * Experience with physical hardware and provisioning process * Experience working with Opensource software Common Tools used:  * Ruby / Go / Python / Java * Linux * Kafka * Hadoop / Zookeeper / HBase * Mesos * Icinga / OpenTSDB * Chef * Spark If you have the described qualifications and are interested in this exciting opportunity, please apply! About SWATT:  The Genesis10 Software and Technology Team (SWATT) is a specialized recruiting service focused on helping accomplished software developers, programmers, platform engineers and elite technology professionals find once-in-a-lifetime career opportunities in New York City with the world’s most advanced technology organizations. Whether local to New York or relocating from across North America, we take an authentic approach to helping people make life-changing technology career decisions. For more information go to http://swatt.genesis10.com/ “Genesis10 is an Equal Opportunity Employer, M/F/D/V” Start Date: 06/27/2016 | Dice Id : gentx001 | New York, NY | Infrastructure Production Developer | 7 hours ago | Telecommuting not available|Travel not required | Developer, Hardware, Java, Linux, Python, Software, Systems | 1f20cc46a1df5eb235007fac5fa0cbf4 |
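The effect of the `fill` and `extra` arguments in the `separate()` call can be seen on a toy column. This is a sketch with made-up values (the third entry, with three parts, is hypothetical): `fill = "right"` pads missing pieces with NA on the right, and `extra = "drop"` discards any pieces beyond the second.

```r
library(tidyr)

# Hypothetical combined values: two parts, one part, and three parts
toy <- data.frame(
  employmenttype_jobstatus = c("Full Time, Direct Placement",
                               "Full Time",
                               "Contract Corp-To-Corp, 12+ months, Extra"),
  stringsAsFactors = FALSE
)

# Split on ", " into exactly two columns
split_df <- separate(toy, employmenttype_jobstatus,
                     into = c("employment_type", "job_status"),
                     sep = ", ", fill = "right", extra = "drop")
print(split_df)
```

Without these two arguments, `separate()` would still return a result but would emit warnings for rows that have too few or too many pieces.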
Since this dataset consists only of technology jobs, isolating data science jobs requires a bit more nuance. For example, if we filter for the words “data” or “analytics” as before, we also catch many software development and database roles that are not really data science positions.
ds_dice_df <- dice_jobs_df %>% filter(grepl("data|analytics", job_title, ignore.case = T))
ds_dice_df %>% select(job_title, company, employment_type, skills) %>% head() %>% kable()
| job_title | company | employment_type | skills |
|---|---|---|---|
| Data Architect - III | Mitchell Martin | Contract Corp-To-Corp | AIX, Architecture, Business Requirements, CSS, Development, HTML, HTTP, Informatica, Oracle, PLSQL, PL/SQL, Programming, Project, Shell Scripting, SQL, SQL Server, Supervision, Unix |
| Database Engineer | Princeton Information Ltd | Contract Independent | Access, Applications, Architecture, Business Requirements, Computer, Configuration Management, Consulting, Database, Engineer, Engineers, IT, Linux, Management, Oracle, Project, SDLC, Security, SQL, SQL Server, Systems, Telecom, Web, Windows |
| Director of Research Databases and Software Development | Mount Sinai Medical Center | Full Time | 2 years of progressive database and/or software development management experience. Experience with healthcare and research preferred |
| Data Scientist - NYC | Amazon | Full Time | Analysis, Automated, Data Mining, Data Modeling, Development, Java, Matlab, Modeling, Perl, Python, Research, Validation |
| Business Analyst Data Warehouse NYC | Telecomm Software | Full Time | business analyst, tableau, etl, data warehouse |
| Project Manager for Big Data | Blackstone Professional Recruiting | Contract Corp-To-Corp | “project manager” |
We can refine our search by excluding words like “engineer” and “architect” to get a more relevant result.
ds_dice_df <- ds_dice_df %>%
  filter(!grepl("architect|architecture|engineer|developer|development|administrator|administration", job_title, ignore.case = TRUE))
ds_dice_df %>% select(job_title, company, employment_type, skills) %>% head() %>% kable()
| job_title | company | employment_type | skills |
|---|---|---|---|
| Data Scientist - NYC | Amazon | Full Time | Analysis, Automated, Data Mining, Data Modeling, Development, Java, Matlab, Modeling, Perl, Python, Research, Validation |
| Business Analyst Data Warehouse NYC | Telecomm Software | Full Time | business analyst, tableau, etl, data warehouse |
| Project Manager for Big Data | Blackstone Professional Recruiting | Contract Corp-To-Corp | “project manager” |
| Senior Product Manager, Enterprise Data Management | Bridge Search Associates | Full Time | Product Manager, Enterprise, Data, Marketing, Pricing, Go-To-Market |
| Executive Director, Implementation and Data Governance | Blue Horizon Tek Solutions, Inc. | Full Time | Establish the Competency Center Strategy, Guidelines, Methodology, Processes, and Best Practices |
| Senior Analytics Specialist | Intelligent Capital Network, Inc. | Contract Corp-To-Corp | analytics |
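The include-then-exclude logic can be sketched in base R on a handful of titles taken from the tables above. The exclusion pattern here is a shortened variant, not the one used in the analysis: "architect" already matches "architecture", and "administrat" matches both "administrator" and "administration".

```r
# A few job titles from the tables above, plus one non-data title
titles <- c("Data Architect - III",
            "Database Engineer",
            "Data Scientist - NYC",
            "Senior Analytics Specialist",
            "Front End Developer")

include_pat <- "data|analytics"
# Shortened exclusion pattern (an assumption of this sketch)
exclude_pat <- "architect|engineer|developer|development|administrat"

# Keep titles that match the include pattern and not the exclude pattern
keep <- grepl(include_pat, titles, ignore.case = TRUE) &
  !grepl(exclude_pat, titles, ignore.case = TRUE)
print(titles[keep])
```

Note that "data" also matches "Database", which is one reason the exclusion step is needed at all.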
We can text-mine the job_description and skills columns to find specific keywords.
How many job postings mention the R programming language?
r_dice_df <- ds_dice_df %>%
  filter(grepl(" R | R,", job_description, ignore.case = TRUE) | grepl(" R | R,", skills, ignore.case = TRUE))
nrow(r_dice_df)
## [1] 6
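The pattern " R | R," only matches an R that is preceded by a space and followed by a space or comma, so mentions such as "R/Perl/Python" or a sentence-final "R." would be missed. A hedged alternative is a case-sensitive word-boundary pattern, sketched below on made-up snippets. This is not the pattern used in the analysis, and it can still produce false positives (for example "R&D"), so the count above should not be assumed to change in any particular direction.

```r
# Hypothetical text snippets for comparing the two patterns
snippets <- c("Experience in using R, Matlab, or any other statistical software",
              "R/Perl/Python",
              "tools such as SAS and R.",
              "Java, Python, SQL")

# Pattern from the text, case-insensitive as in the original call
doc_hits <- grepl(" R | R,", snippets, ignore.case = TRUE)

# Alternative: a standalone capital R delimited by word boundaries
strict_hits <- grepl("\\bR\\b", snippets)

print(data.frame(snippet = snippets, doc_hits, strict_hits))
```

Dropping `ignore.case` in the alternative also avoids matching a stray lower-case "r" surrounded by spaces.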
r_dice_df %>% head() %>% kable()
| advertiser_url | company | employment_type | job_status | job_description | job_id | job_location | job_title | post_date | shift | site_name | skills | uniq_id |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| https://www.dice.com/jobs/detail/Data-Scientist-%2526%252345-NYC-Amazon-New-York-NY-10001/amazon20/418984?icid=sr9445-315p&q=&l=New%20York,%20NY | Amazon | Full Time | NA | Excited by Big Data, Machine Learning and Predictive Software? Interested in creating new state-of-the-art solutions using Machine Learning and Data Mining techniques on Terabytes of Data? At AWS, we are developing state-of-the-art large-scale Machine Learning Services and Applications on the Cloud involving large amounts of data. We work on applying predictive technology to a wide spectrum of problems. We are looking for talented and experienced Machine Learning Scientists who can apply innovative Machine Learning techniques to real-world problems. You will get to work in a team dedicated to advancing Machine Learning solutions at AWS and converting it to business-impacting solutions. Major responsibilities * Use machine learning, data mining and statistical techniques to create new, scalable solutions for business problems * Analyze and extract relevant information from business data to help automate and optimize key processes * Design, develop and evaluate highly innovative models for predictive learning * Establish scalable, efficient, automated processes for large scale data analyses, model development, model validation and model implementation * Research and implement novel machine learning and statistical approaches Basic Qualifications * An MS/PhD in CS, Machine Learning, Operational research, Statistics or in a highly quantitative field. PhD strongly preferred. 
* 4+ years of industrial experience in predictive modeling and analysis, predictive software development * Strong Problem solving ability * Good skills with Java or C++, Perl/Python (or similar scripting language) * Experience in using R, Matlab, or any other statistical software * Experience in mentoring junior team members, and guiding them on machine learning and data modeling applications * Strong communication and data presentation skills * Ability to travel to client locations when needed. Up to 50% regionally. Preferred Qualifications * 4+ years of industrial experience in predictive modeling and analysis, predictive software development * Experience handling terabyte size datasets * Experience working with distributed systems and grid computing * Publications or presentation in recognized Machine Learning and Data Mining journals/conferences  Posted Date: 6/17/2016 | Dice Id : amazon20 | New York, NY | Data Scientist - NYC | 1 week ago | Telecommuting not available|Travel required to 50%. | Analysis, Automated, Data Mining, Data Modeling, Development, Java, Matlab, Modeling, Perl, Python, Research, Validation | 0dec6af599287542a20279a6f6a87802 | |
| https://www.dice.com/jobs/detail/Data-Scientist-M2-Resources-Inc-New-York-NY-10285/10125751/686713?icid=sr10841-362p&q=&l=New%20York,%20NY | M2 Resources Inc | Full Time | Fulltime | Role: Data Scientist Location: New York City, NYType of hire: Fulltime/Permanent Basic Qualifications: PhD/Post Doc in any field with advanced quantitative focus in modelling oriented discipline including but not limited to Machine learning, Statistics, Psychometrics, Mathematics, Physics, Chemistry, Biology, Bioinformatics, Econometrics, Neuroscience, Computer Science.5+ years of analytical experience including 2-3 years of post-PhD experience in the field of advanced quantitative techniques while working for leading global academic institutes or corporate innovation research labs or analytics organizations of large corporate or in consulting companies in analytics roles.Nice blend of big data technologies coupled with strong knowledge of predictive modeling methods! Additionally, you must be skilled at clearly communicating your findings and translating them into practical solutions. 
Sound knowledge and application in the following: Advanced statistical methods including complex multivariate statistical methods, discrete choice modelling, conjoint based analysisAdvance knowledge of machine learning methods including classification, regression and clustering methodsKnowledge of heuristic methods and optimization techniques including system modeling and simulations.Deep programming skills and 3+ years’ experience in R, Perl, Python, or other languages appropriate for large scale analysis of numerical and categorical dataAdvanced quantitative methods relevant to modelling risk and consumer behavior: both parametric and non-parametric modelling, using unguided, semi-guided and guided approaches as appropriate.Willingness and desire to learn from other data scientists and modellers in the team on the art and science of modelling, feature engineering, decision trade-offs between model complexity and model deploymentExcellent prototyping skillsExcellent interpersonal and collaboration skills, ability to explain complicated mathematical concepts, algorithms and data structures to all business partners Technical Pluses: Experience with graph algorithms such as semi-supervised learning on graphs, graph clustering, community detection, interest/topic graphs, and social network analysisKnowledge of emerging platforms Responsibilities: Understand complex business challenges, develop hypotheses, integrate internal and external data sources, analyze them using cutting edge machine learning or statistical modelling techniques to uncover causality (i.e., we go beyond correlations and interesting trends in making decisions that affect people’s financial wellbeing) and synthesizing insightsPropose innovative modelling solutions, evaluate their effectiveness through proof of concept experimentations and refine and enhance them as necessary to ensure scalability and provide support for their implementation. 
Create new models through entire life cycle using the most effective application of supervised, semi-supervised and unsupervised parametric and non-parametric modeling methods.Investigate the impact of new computing technologies and niche, cutting edge analytical techniques and specialized applications, on the future of bankingDrive, understand, and adapt latest developments in machine learning and statistical modelling and apply them appropriately to solve business problems. Clean, manipulate and investigate large data sets If interested, please share me your resume to sarath(at)m2ri(dot)com / contact me at 856-624-9036 / Apply through dice   Thanks & Regards, Sarathkumar M2 Resources Inc.  Phone: 856-624-9036 (Direct)Email: sarath@m2ri.comGtalk: sarathm2r | Dice Id : 10125751 | New York, NY | Data Scientist | 2 weeks ago | Telecommuting not available|Travel not required | Data Science, Quantitative modelling, Machine learning, Statistics, Predictive modelling, R/Perl/Python | d03eac43ee018bfcc08f234cca54c0c7 | |
| https://www.dice.com/jobs/detail/SAS-Data-Analyst-%2528Remote%2529-Odesus-New-York-NY-10020/10106335/JG9095?icid=sr11591-387p&q=&l=New%20York,%20NY | Odesus | C2H W2 | 6+ Months | Junior SAS Architect (Telecommuting/Remote position)Our entertainment client in New York City is hiring 2 Jr. SAS Architects for their Avenue of the Americas location. The Jr. SAS Architects will assist in quickly building 2000+ forecasting models using SAS Forecaster. The SAS Architects will be able to work virtually and that they will not need to be located in NYC. However, they will be asked to come into NYC sporadically. Anyone more than 2 hours away, will have travel and hotel reimbursed. Qualifications:Experience with Time series modeling, ETS, Forecasting, R or even “mixed modeling”.The resources for running SAS Forecast Server should be comfortable executing project setup, running models, creating / evaluating tables of outputs, understand forecast reconciliations (for hierarchical models) and understanding how to incorporate events / inputs into forecasting models.These people could be relatively more junior as we have relatively well defined process in place for typical time series model development and validation. Responsibilities: The resources for running SAS Forecast Server should be comfortable executing project setup, running models, creating / evaluating tables of outputs, understand forecast reconciliations (for hierarchical models) and understanding how to incorporate events / inputs into forecasting models.These people could be relatively more junior as we have relatively well defined process in place for typical time series model development and validation. | Dice Id : 10106335 | New York, NY | SAS Data Analyst (Remote) | 4 weeks ago | Telecommuting available|Travel not required | SAS Architect, forecasting models, SAS Forecaster, ETS | a9b83fec319ba8a9d21b203f78bd381f | |
| https://www.dice.com/jobs/detail/Advance-Analytics-Net2Source-Inc.-New-York-NY-10001/10271304/Analytics_NY?icid=sr11724-391p&q=&l=New%20York,%20NY | Net2Source Inc. | Full Time | Contract Corp-To-Corp | Net2Source, Inc. is one of the fastest growing IT Consulting company across USA. N2S is headquartered at NJ, USA with its branch offices in Asia Pacific Region. N2S offers a wide gamut of consulting solutions customized to client needs including staffing, training and technologyPosition : Advance Analytics Location : New York,NYDuration : Full Time Mandatory skill : R, SAS, Strong Analytical SkillsJob Description : * Strong analytical skills and intellectual curiosity * Proven experience with various Data and statistical analysis tools such as R, SAS, Python, SQL Strong communication, presentation and writing skills with emphasis on demonstrated ability to translate complex concepts between business and technical resources* Strong ability to take business issues and transform them into analytics requirementsStrong interpersonal and teamwork skills* Ability to take business context and strategy married with current state of analytics and derive future state analytics and the requirements necessary to achieve clients strategic business objectives Thought leadership in the areas of data, data structure, analytics, and strategic consulting * Ability to leverage internal and client resources as well as knowledge of market, competitors, and industry to assess implications for clients* business strategy, identify potential business, analytics, and data challenges, frame potential solutions, and drive these solutions in the overall business and analytics strategy process* Experience building long term data and analytics strategic plans that support overall corporate and departmental business objectives and future vision * Innovative thinker that can paint a vision of the possible for colleagues as well as clients and then drive the implementation of that vision through 
effective project planning and executionAbout Net2Source, Inc.Net2Source is an employer-of-choice for over 1000 consultants across the globe. We recruit top-notch talent for over 40 Fortune and Government clients coast-to-coast across the U.S. We are one of the fastest-growing companies in the U.S. and this may be your opportunity to join us! Want to read more about Net2Source? , Visit us at www.net2source.comRegards,Shashi KumarSr. Technical RecruiterNet2Source Inc.Direct : 201-479-2153Board: (201) 340-8700 Ext. 442Fax : (201) 221-8131 Address: One Evertrust Plaza, Suite # 305, Jersey City, NJ - 07302https://in.linkedin.com/in/shashi-singh-76b47685Website: www.net2source.com Refer and Earn : For contractual position upto $500 and for Full time upto $1000 Note: Under Bill s.1618 Title III passed by the 105th U.S. Congress this mail cannot be considered Spam as long as we include contact information and a remove link for removal from our mailing list. To be removed from our mailing list reply with “remove” and include your “original email address/addresses” in the subject heading. Include complete address/addresses and/or domain to be removed. We will immediately update it accordingly. We apologize for the inconvenience if any caused | Dice Id : 10271304 | New York, NY | Advance Analytics | 1 month ago | Telecommuting not available|Travel not required | Advance Analytics,optimization,Oracle Data Mining,Oracle R Enterprise,text mining,statistical analysis,computations,R scripts,R functions,Data Scientists,RStudio | 80525fe2d264ca2bae9166bf9fee69ea | |
| https://www.dice.com/jobs/detail/Sr.-Data-Scientist-%2526%252345-Machine-Learning%252C-Python%252C-R%252C-Predictive-Analytics-%2526%252347-Big-Data-Precision-Systems-New-York-NY-10001/precisn/3032DI?icid=sr32793-1094p&q=&l=California,%20Us,%20CALIFORNIA | Precision Systems | Full Time | Permanent | Our direct client is looking for a Sr. Data Scientist with a programming background! The business itself focuses on using big data analytics and social graph theories to capitalize on human relationships and to predict and understand human behavior in ways that benefit their clients. So, this role is really at the heart of their business! You will be replacing a senior consultant - they want to bring this role in-house.Apply today to be a part of an exciting, fast-paced organization in the center of some of the latest technologies!Essential Skills:- 3+ years of non-academic data science experience- Software engineering background (Java preferred)- Strong foundation and expertise in at least two of the following: predictive modeling, statistical learning/inference, survey and experiment design and analysis, independent research, or NLP- Expertise in various statistical packages such as R, STATA, Python, Weka, or Apache Spark- Experience with SQL queries and data visualization tools- Familiarity with graph theory/algorithms and numerical optimizationPlusses:- BA/BS in Economics, Statistics, Mathematics, Computer Science, Machine Learning, or other related technical field- PhD preferred* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *Please click the Apply Now button to apply for the job.We will review your resume and call if you are qualified.Resumes will NOT be sent to clients without your approval.REFERRALS WANTED - $ 1000 REWARD!Refer a colleague to us, and Precision will give you $ 1000 if we find a job for that person!(The fine print: The referred candidate must be previously unknown to us. Start date must be within 6 months of referral.) 
Job ID 3032DI-3230 | Dice Id : precisn | New York, NY | Sr. Data Scientist - Machine Learning, Python, R, Predictive Analytics / Big Data | 5 hours ago | Telecommuting not available|Travel not required | Data Science, Predictive Analytics, Machine Learning, Big Data, Python, Java, R, Statistics, SQL, Hadoop, Clustering Methods, Graph Theory | a06157d9a8fe7f48a28adf6d405c55d2 | |
| https://www.dice.com/jobs/detail/Network-Data-Scientist-Open-Systems-Technologies-New-York-NY-10022/opensyst/31192?icid=sr61267-2043p&q=&l=California,%20Us,%20CALIFORNIA | Open Systems Technologies | Full Time | NA | The team focuses on real-time monitoring of changing market conditions, loads on distribution servers and network outages when building high-quality distribution algorithms.Analyze the efficiency of distribution algorithms and suggest innovative features and enhancements to improve qualityTake full ownership of the full software development life-cycle, including researching infrastructure needs for adopting new technologiesQualifications3+ years of experience programming in Python/R/SAS/Matlab or similar technologyKnowledge of data science and machine learning methodologiesBackground in statistical inference over time-series, unsupervised data in a professional environmentKnowledge of querying relational databases (MSSQL, Oracle, MySQL) or NoSQL (Cassandra/MongoDB)Experience with distributed datastores (Hadoop/S3) or search technologies (Splunk/Solr) | Dice Id : opensyst | New York, NY | Network Data Scientist | 4 days ago | Telecommuting not available|Travel not required | data science, python, r, sas, matlab, machine learning, MSSQL, oracle, mysql, nosql, cassandra, mongodb, splunk, solr, hadoop, s3 | c79924320c73852f152b9af29b7b3a32 |
Of our 59 filtered job listings, six explicitly mention R.
How many job postings mention Python?
python_dice_df <- ds_dice_df %>%
  filter(grepl(" python | python,", job_description, ignore.case=TRUE) | grepl(" python | python,", skills, ignore.case=TRUE))
nrow(python_dice_df)
## [1] 8
python_dice_df %>% head() %>% kable()
| advertiser_url | company | employment_type | job_status | job_description | job_id | job_location | job_title | post_date | shift | site_name | skills | uniq_id |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| https://www.dice.com/jobs/detail/Data-Scientist-%2526%252345-NYC-Amazon-New-York-NY-10001/amazon20/418984?icid=sr9445-315p&q=&l=New%20York,%20NY | Amazon | Full Time | NA | Excited by Big Data, Machine Learning and Predictive Software? Interested in creating new state-of-the-art solutions using Machine Learning and Data Mining techniques on Terabytes of Data? At AWS, we are developing state-of-the-art large-scale Machine Learning Services and Applications on the Cloud involving large amounts of data. We work on applying predictive technology to a wide spectrum of problems. We are looking for talented and experienced Machine Learning Scientists who can apply innovative Machine Learning techniques to real-world problems. You will get to work in a team dedicated to advancing Machine Learning solutions at AWS and converting it to business-impacting solutions. Major responsibilities * Use machine learning, data mining and statistical techniques to create new, scalable solutions for business problems * Analyze and extract relevant information from business data to help automate and optimize key processes * Design, develop and evaluate highly innovative models for predictive learning * Establish scalable, efficient, automated processes for large scale data analyses, model development, model validation and model implementation * Research and implement novel machine learning and statistical approaches Basic Qualifications * An MS/PhD in CS, Machine Learning, Operational research, Statistics or in a highly quantitative field. PhD strongly preferred. 
* 4+ years of industrial experience in predictive modeling and analysis, predictive software development * Strong Problem solving ability * Good skills with Java or C++, Perl/Python (or similar scripting language) * Experience in using R, Matlab, or any other statistical software * Experience in mentoring junior team members, and guiding them on machine learning and data modeling applications * Strong communication and data presentation skills * Ability to travel to client locations when needed. Up to 50% regionally. Preferred Qualifications * 4+ years of industrial experience in predictive modeling and analysis, predictive software development * Experience handling terabyte size datasets * Experience working with distributed systems and grid computing * Publications or presentation in recognized Machine Learning and Data Mining journals/conferences  Posted Date: 6/17/2016 | Dice Id : amazon20 | New York, NY | Data Scientist - NYC | 1 week ago | Telecommuting not available|Travel required to 50%. | Analysis, Automated, Data Mining, Data Modeling, Development, Java, Matlab, Modeling, Perl, Python, Research, Validation | 0dec6af599287542a20279a6f6a87802 | |
| https://www.dice.com/jobs/detail/Data-Scientist-M2-Resources-Inc-New-York-NY-10285/10125751/686713?icid=sr10841-362p&q=&l=New%20York,%20NY | M2 Resources Inc | Full Time | Fulltime | Role: Data Scientist Location: New York City, NYType of hire: Fulltime/Permanent Basic Qualifications: PhD/Post Doc in any field with advanced quantitative focus in modelling oriented discipline including but not limited to Machine learning, Statistics, Psychometrics, Mathematics, Physics, Chemistry, Biology, Bioinformatics, Econometrics, Neuroscience, Computer Science.5+ years of analytical experience including 2-3 years of post-PhD experience in the field of advanced quantitative techniques while working for leading global academic institutes or corporate innovation research labs or analytics organizations of large corporate or in consulting companies in analytics roles.Nice blend of big data technologies coupled with strong knowledge of predictive modeling methods! Additionally, you must be skilled at clearly communicating your findings and translating them into practical solutions. 
Sound knowledge and application in the following: Advanced statistical methods including complex multivariate statistical methods, discrete choice modelling, conjoint based analysisAdvance knowledge of machine learning methods including classification, regression and clustering methodsKnowledge of heuristic methods and optimization techniques including system modeling and simulations.Deep programming skills and 3+ years’ experience in R, Perl, Python, or other languages appropriate for large scale analysis of numerical and categorical dataAdvanced quantitative methods relevant to modelling risk and consumer behavior: both parametric and non-parametric modelling, using unguided, semi-guided and guided approaches as appropriate.Willingness and desire to learn from other data scientists and modellers in the team on the art and science of modelling, feature engineering, decision trade-offs between model complexity and model deploymentExcellent prototyping skillsExcellent interpersonal and collaboration skills, ability to explain complicated mathematical concepts, algorithms and data structures to all business partners Technical Pluses: Experience with graph algorithms such as semi-supervised learning on graphs, graph clustering, community detection, interest/topic graphs, and social network analysisKnowledge of emerging platforms Responsibilities: Understand complex business challenges, develop hypotheses, integrate internal and external data sources, analyze them using cutting edge machine learning or statistical modelling techniques to uncover causality (i.e., we go beyond correlations and interesting trends in making decisions that affect people’s financial wellbeing) and synthesizing insightsPropose innovative modelling solutions, evaluate their effectiveness through proof of concept experimentations and refine and enhance them as necessary to ensure scalability and provide support for their implementation. 
Create new models through entire life cycle using the most effective application of supervised, semi-supervised and unsupervised parametric and non-parametric modeling methods.Investigate the impact of new computing technologies and niche, cutting edge analytical techniques and specialized applications, on the future of bankingDrive, understand, and adapt latest developments in machine learning and statistical modelling and apply them appropriately to solve business problems. Clean, manipulate and investigate large data sets If interested, please share me your resume to sarath(at)m2ri(dot)com / contact me at 856-624-9036 / Apply through dice   Thanks & Regards, Sarathkumar M2 Resources Inc.  Phone: 856-624-9036 (Direct)Email: sarath@m2ri.comGtalk: sarathm2r | Dice Id : 10125751 | New York, NY | Data Scientist | 2 weeks ago | Telecommuting not available|Travel not required | Data Science, Quantitative modelling, Machine learning, Statistics, Predictive modelling, R/Perl/Python | d03eac43ee018bfcc08f234cca54c0c7 | |
| https://www.dice.com/jobs/detail/Advance-Analytics-Net2Source-Inc.-New-York-NY-10001/10271304/Analytics_NY?icid=sr11724-391p&q=&l=New%20York,%20NY | Net2Source Inc. | Full Time | Contract Corp-To-Corp | Net2Source, Inc. is one of the fastest growing IT Consulting company across USA. N2S is headquartered at NJ, USA with its branch offices in Asia Pacific Region. N2S offers a wide gamut of consulting solutions customized to client needs including staffing, training and technologyPosition : Advance Analytics Location : New York,NYDuration : Full Time Mandatory skill : R, SAS, Strong Analytical SkillsJob Description : * Strong analytical skills and intellectual curiosity * Proven experience with various Data and statistical analysis tools such as R, SAS, Python, SQL Strong communication, presentation and writing skills with emphasis on demonstrated ability to translate complex concepts between business and technical resources* Strong ability to take business issues and transform them into analytics requirementsStrong interpersonal and teamwork skills* Ability to take business context and strategy married with current state of analytics and derive future state analytics and the requirements necessary to achieve clients strategic business objectives Thought leadership in the areas of data, data structure, analytics, and strategic consulting * Ability to leverage internal and client resources as well as knowledge of market, competitors, and industry to assess implications for clients* business strategy, identify potential business, analytics, and data challenges, frame potential solutions, and drive these solutions in the overall business and analytics strategy process* Experience building long term data and analytics strategic plans that support overall corporate and departmental business objectives and future vision * Innovative thinker that can paint a vision of the possible for colleagues as well as clients and then drive the implementation of that vision through 
effective project planning and executionAbout Net2Source, Inc.Net2Source is an employer-of-choice for over 1000 consultants across the globe. We recruit top-notch talent for over 40 Fortune and Government clients coast-to-coast across the U.S. We are one of the fastest-growing companies in the U.S. and this may be your opportunity to join us! Want to read more about Net2Source? , Visit us at www.net2source.comRegards,Shashi KumarSr. Technical RecruiterNet2Source Inc.Direct : 201-479-2153Board: (201) 340-8700 Ext. 442Fax : (201) 221-8131 Address: One Evertrust Plaza, Suite # 305, Jersey City, NJ - 07302https://in.linkedin.com/in/shashi-singh-76b47685Website: www.net2source.com Refer and Earn : For contractual position upto $500 and for Full time upto $1000 Note: Under Bill s.1618 Title III passed by the 105th U.S. Congress this mail cannot be considered Spam as long as we include contact information and a remove link for removal from our mailing list. To be removed from our mailing list reply with “remove” and include your “original email address/addresses” in the subject heading. Include complete address/addresses and/or domain to be removed. We will immediately update it accordingly. We apologize for the inconvenience if any caused | Dice Id : 10271304 | New York, NY | Advance Analytics | 1 month ago | Telecommuting not available|Travel not required | Advance Analytics,optimization,Oracle Data Mining,Oracle R Enterprise,text mining,statistical analysis,computations,R scripts,R functions,Data Scientists,RStudio | 80525fe2d264ca2bae9166bf9fee69ea | |
| https://www.dice.com/jobs/detail/Sr.-Data-Scientist-%2526%252345-Machine-Learning%252C-Python%252C-R%252C-Predictive-Analytics-%2526%252347-Big-Data-Precision-Systems-New-York-NY-10001/precisn/3032DI?icid=sr32793-1094p&q=&l=California,%20Us,%20CALIFORNIA | Precision Systems | Full Time | Permanent | Our direct client is looking for a Sr. Data Scientist with a programming background! The business itself focuses on using big data analytics and social graph theories to capitalize on human relationships and to predict and understand human behavior in ways that benefit their clients. So, this role is really at the heart of their business! You will be replacing a senior consultant - they want to bring this role in-house.Apply today to be a part of an exciting, fast-paced organization in the center of some of the latest technologies!Essential Skills:- 3+ years of non-academic data science experience- Software engineering background (Java preferred)- Strong foundation and expertise in at least two of the following: predictive modeling, statistical learning/inference, survey and experiment design and analysis, independent research, or NLP- Expertise in various statistical packages such as R, STATA, Python, Weka, or Apache Spark- Experience with SQL queries and data visualization tools- Familiarity with graph theory/algorithms and numerical optimizationPlusses:- BA/BS in Economics, Statistics, Mathematics, Computer Science, Machine Learning, or other related technical field- PhD preferred* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *Please click the Apply Now button to apply for the job.We will review your resume and call if you are qualified.Resumes will NOT be sent to clients without your approval.REFERRALS WANTED - $ 1000 REWARD!Refer a colleague to us, and Precision will give you $ 1000 if we find a job for that person!(The fine print: The referred candidate must be previously unknown to us. Start date must be within 6 months of referral.) 
Job ID 3032DI-3230 | Dice Id : precisn | New York, NY | Sr. Data Scientist - Machine Learning, Python, R, Predictive Analytics / Big Data | 5 hours ago | Telecommuting not available|Travel not required | Data Science, Predictive Analytics, Machine Learning, Big Data, Python, Java, R, Statistics, SQL, Hadoop, Clustering Methods, Graph Theory | a06157d9a8fe7f48a28adf6d405c55d2 | |
| https://www.dice.com/jobs/detail/Data-Analyst-at-%25246B-Hedge-Fund-in-Midtown-Averity-New-York-NY-10016/90906950/699900?icid=sr10018-334p&q=&l=New%20York,%20NY | Averity | Full Time | NA | We are looking for a strong Data Analyst who possesses the ability to work directly with Investment Committee on utilizing data to answer important questions and support the decision making process. What’s the Job?As a Data Analyst you will partner with a mix of developers, engineers and the business to integrate, cleanse, monitor and prep multiple large streams of incoming data.Build tools to load and monitor financial dataImplement tool sets and databases to support research and investment decision processCreate and use bespoke softwareUse expertise in analyzing data to answer business questions Who Are We?We are a top-tier Multi-Strategy Hedge Fund focused on managing assets for our global clients. Compensation$160,000 (commensurate with experience)Fully Covered Medical401k MatchFlexible Paid Time Off PolicyCatered Meals and Fully Stocked Kitchen What Skills Do You Need?Knowledge of SQL and advanced Excel neededUnderstanding of C# and Python a plus Experience with a variety of asset class data is desirable, including equity, derivatives, futures, bonds, preferred stocks, foreign exchange and commodities What’s In It For You?This is a great opportunity to make a substantial impact in a growing hedge fund and gain experience working with some of wall streets best investment individuals. | Dice Id : 90906950 | New York, NY | Data Analyst at $6B Hedge Fund in Midtown | 2 weeks ago | Telecommuting not available|Travel not required | SQL, Excel, Python, C# | b29cc4ed3617a6edf1258c25c8a44aef | |
| https://www.dice.com/jobs/detail/Consultant-Data-Project-Manager--Investment-Bank-Thomson-Keene-Inc.-New-York-NY-10007/10527936/699751?icid=sr10010-334p&q=&l=New%20York,%20NY | Thomson Keene Inc. | Contract W2 | 12 months | Technical project manager is required to support a well regarded development team working on a flagship global reference data platform ensuring the successful delivery of a significant change program. You will ideally have a technical background and understanding of the agile SDLC and typical software testing cycles. The development team are using leading edge data management technologies, a knowledge of this domain is highly desirable (Scala, Python, RDF messaging). Main responsibilities will include proactive management and tracking of issues and risks, communicating to relevant stakeholders and ensuring timelines are maintained. You will keep the PMO office updated creating dashboards for upward reporting utilising JIRA (already in place). This is an excellent opportunity to work with a high performing technical team using the most current technologies in a key strategic group within the bank. | Dice Id : 10527936 | New York, NY | Consultant Data Project Manager Investment Bank | 2 weeks ago | Telecommuting not available|Travel not required | PM, Project Manager, Data, Agile, JIRA | 12d835e60b4a626dc6614e31cad6732a |
Of our 59 filtered job listings, eight explicitly mention Python.
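The patterns used above only match a keyword that is preceded by a space and followed by a space or a comma, so a mention such as "R/Perl/Python" on its own would be missed. As a rough sketch of an alternative, a word-boundary pattern could be used instead. The helper `mentions_keyword` and the toy data frame below are illustrative assumptions and are not part of the original analysis.

```r
# Hypothetical helper: TRUE when `keyword` appears as a whole word in `text`.
# \\b marks a word boundary, so punctuation such as "/" or "." may surround the keyword.
mentions_keyword <- function(text, keyword) {
  grepl(paste0("\\b", keyword, "\\b"), text, ignore.case = TRUE, perl = TRUE)
}

# Toy examples, made up for illustration
toy_df <- data.frame(
  skills = c("R/Perl/Python",
             "SQL, Excel, Python, C#",
             "Experience with Pythonic code",
             "Java"),
  stringsAsFactors = FALSE
)

# The pattern from the analysis above versus the word-boundary version
original_match <- grepl(" python | python,", toy_df$skills, ignore.case = TRUE)
boundary_match <- mentions_keyword(toy_df$skills, "python")

data.frame(skills = toy_df$skills, original_match, boundary_match)
```

On this toy input the word-boundary version also picks up "R/Perl/Python" while still ignoring "Pythonic". A single-letter keyword such as R needs more care, since a word-boundary pattern would also match strings like "R&D".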
Next we load a second dataset of job listings, which Mary Anna scraped from indeed.com and stored in a PostgreSQL database.
listings_db <- src_postgres(host="mkivenson-job-scrape-data.cvc7wr5vvljm.us-east-1.rds.amazonaws.com", user="postgres", password="postgres607", dbname="listings")
listings_df <- tbl(listings_db, "listings") %>% collect(n=Inf)
nrow(listings_df)
## [1] 1111
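Hard-coding a database password in a script makes it easy to leak. A minimal sketch of an alternative is shown here, assuming the DBI and RPostgres packages are installed and that the password is stored in an environment variable. The variable name `LISTINGS_DB_PASSWORD` and the helper `load_listings` are hypothetical and do not come from the original analysis.

```r
# Hypothetical helper: connect with DBI, read the whole "listings" table,
# and close the connection when the function exits.
# Assumes the password is available in the LISTINGS_DB_PASSWORD environment variable.
load_listings <- function(host, password = Sys.getenv("LISTINGS_DB_PASSWORD")) {
  con <- DBI::dbConnect(
    RPostgres::Postgres(),
    host = host,
    dbname = "listings",
    user = "postgres",
    password = password
  )
  on.exit(DBI::dbDisconnect(con))
  DBI::dbReadTable(con, "listings")
}
```

Calling `load_listings(host = ...)` with the host name used above would return the listings as an ordinary data frame, without the password appearing in the script.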
We can re-run the same keyword searches for R and Python on this new dataset.
r_listings_df <- listings_df %>%
  filter(grepl(" R | R,", description, ignore.case=TRUE) | grepl(" R | R,", summary, ignore.case=TRUE))
nrow(r_listings_df)
## [1] 463
python_listings_df <- listings_df %>%
  filter(grepl(" python | python,", description, ignore.case=TRUE) | grepl(" python | python,", summary, ignore.case=TRUE))
nrow(python_listings_df)
## [1] 586
Of the 1111 data science job listings on Indeed, 463 explicitly mention R and 586 mention Python.
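To put the counts from the two sources on a common scale, they can be converted to shares of the listings searched. The sketch below uses only the totals reported above; the data frame and its column names are illustrative.

```r
# Counts reported above for each source and language
mentions <- data.frame(
  source     = c("Dice", "Dice", "Indeed", "Indeed"),
  language   = c("R", "Python", "R", "Python"),
  n_mentions = c(6, 8, 463, 586),
  n_listings = c(59, 59, 1111, 1111),
  stringsAsFactors = FALSE
)

# Percentage of listings mentioning each language, rounded to one decimal place
mentions$pct <- round(100 * mentions$n_mentions / mentions$n_listings, 1)
mentions
```

This gives roughly 10% (R) and 14% (Python) for the Dice listings, against roughly 42% and 53% for Indeed. The two samples were assembled differently, so the comparison is only indicative, and within each source a single listing may mention both languages.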