Proposal for Final Project Using R

2011-2016 H-1B Petition Situation



By Alexa Chenyang Li


Why This Topic



H-1B is an employment-based, non-immigrant visa category for temporary foreign workers in the United States. For a foreign national to apply for an H-1B visa, a US employer must offer a job and petition for the H-1B visa with the US immigration department. This is the most common visa status applied for and held by international students once they complete college or higher education (Master's, PhD) and work in a full-time position.

For an international student majoring in a STEM field who wants to secure a job in the U.S., asking for H-1B sponsorship is an indispensable step. It is therefore important for international students to know what is happening with H-1B, especially now that the H-1B petition situation is fluctuating.

For a business analytics major, whose future job title usually contains the word “analyst”, learning how H-1B petitions for analysts have changed will be very useful for job hunting. In this project, I will analyze:

  • Which worksites have the most H-1B petitions and the most certified H-1B petitions;
  • Which employers submit the largest number of H-1B visa applications;
  • Which kinds of analyst receive the most H-1B petitions and the most certified H-1B petitions;
  • Whether the number of H-1B petitions or certified H-1B petitions has a linear relationship with the prevailing wage;
  • What the proportions of full-time and part-time jobs are among all H-1B petitions and among certified H-1B petitions;
  • What the median of each variable is.


What Dataset

  • The original dataset is from the United States Department of Labor, Employment & Training Administration; it was then compiled and re-posted on Dataju.cn and Kaggle.

  • This dataset contains 10 variables, including:
    • CASE_STATUS : Status associated with the last significant event or decision. Valid values include “Certified,” “Certified-Withdrawn,” “Denied,” and “Withdrawn”;
    • EMPLOYER_NAME : Name of employer submitting labor condition application;
    • SOC_NAME : Occupational name associated with the SOC_CODE. SOC_CODE is the occupational code associated with the job being requested for temporary labor condition, as classified by the Standard Occupational Classification (SOC) System;
    • JOB_TITLE : Title of the job;
    • FULL_TIME_POSITION : Y = Full Time Position; N = Part Time Position;
    • PREVAILING_WAGE : Prevailing Wage for the job being requested for temporary labor condition. The wage is listed at annual scale in USD. The prevailing wage for a job position is defined as the average wage paid to similarly employed workers in the requested occupation in the area of intended employment. The prevailing wage is based on the employer’s minimum requirements for the position;
    • YEAR : Year in which the H-1B visa petition was filed;
    • WORKSITE : City and State information of the foreign worker’s intended area of employment;
    • lon : Longitude of the worksite;
    • lat : Latitude of the worksite;
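
To make the variables above concrete, here is a minimal sketch in base R of how analyst petitions might be selected and summarized. The rows in the data frame are invented purely for illustration and are not taken from the real dataset; the upper-case status values and the idea of matching the word "analyst" in JOB_TITLE are assumptions about how the analysis could be set up, not a final design.

```r
# Toy rows invented for illustration only; not taken from the real dataset.
h1b <- data.frame(
  CASE_STATUS = c("CERTIFIED", "DENIED", "CERTIFIED", "WITHDRAWN"),
  JOB_TITLE = c("BUSINESS ANALYST", "SOFTWARE ENGINEER",
                "DATA ANALYST", "FINANCIAL ANALYST"),
  FULL_TIME_POSITION = c("Y", "Y", "N", "Y"),
  PREVAILING_WAGE = c(65000, 90000, 58000, 70000),
  YEAR = c(2015, 2015, 2016, 2016),
  stringsAsFactors = FALSE
)

# Keep petitions whose job title contains "analyst" (case-insensitive).
analysts <- h1b[grepl("analyst", h1b$JOB_TITLE, ignore.case = TRUE), ]

# Among those, keep the certified ones. toupper() guards against
# differences in capitalization (an assumption about the raw data).
certified_analysts <- analysts[toupper(analysts$CASE_STATUS) == "CERTIFIED", ]

# Share of full-time positions among analyst petitions.
full_time_share <- mean(analysts$FULL_TIME_POSITION == "Y")
```

On the real data, the same filters would be applied after reading the downloaded file into a data frame; the file name and exact column encodings would need to be checked first.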


What Methodology (Subject to Change)

  • Use summary statistics, including the mean and median, to describe how certain variables change over the period from 2011 to 2016;
  • Use regression to test whether there is a linear relationship between certain variables, then explore possible causal effects;
  • Use R packages including, but not limited to, ggplot2, dplyr and the wider tidyverse to process and visualize the data.
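
The first two steps above can be sketched as follows in base R. This is only an illustration under stated assumptions: the rows are made up, the summary is taken per year, and the regression relates the yearly petition count to the yearly median prevailing wage, which is one possible reading of the research question. A fitted slope here would describe an association, not by itself a causal effect.

```r
# Toy rows invented for illustration only; not taken from the real dataset.
h1b <- data.frame(
  YEAR = c(2011, 2011, 2012, 2012, 2012, 2013, 2013, 2013, 2013),
  PREVAILING_WAGE = c(60000, 64000, 62000, 66000, 70000,
                      66000, 70000, 70000, 74000)
)

# Summary statistic: median prevailing wage per year.
median_wage <- aggregate(PREVAILING_WAGE ~ YEAR, data = h1b, FUN = median)
names(median_wage)[2] <- "MEDIAN_WAGE"

# Number of petitions per year (one row per petition).
counts <- aggregate(PREVAILING_WAGE ~ YEAR, data = h1b, FUN = length)
names(counts)[2] <- "PETITIONS"

# One row per year with both quantities.
yearly <- merge(counts, median_wage, by = "YEAR")

# Simple linear regression: petition count explained by median wage.
fit <- lm(PETITIONS ~ MEDIAN_WAGE, data = yearly)
slope <- coef(fit)[["MEDIAN_WAGE"]]
```

The `yearly` data frame is also a convenient input for plotting, since it already holds one row per year.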

Why Important

This project will give an overview of H-1B petition trends from 2011 to 2016. It will be useful to international students doing job hunting and job research with the aim of securing a long-term job in the U.S., since the analysis will suggest target companies, target worksites and even target industries from the standpoint of H-1B petitions.