---
title: "US Perm Visa Denied"
author: "CT"
date: "2/26/2020"
output: html_document
---

US Visa Application for Labor Certification Dataset

Source: https://www.foreignlaborcert.doleta.gov/performancedata.cfm

Background: This dataset contains administrative data from employers' Applications for Permanent Employment Certification (ETA Form 9089) and the certification determinations processed by the Department's Office of Foreign Labor Certification, Employment and Training Administration, where the determination was issued on or after October 1, 2018, and on or before September 30, 2019.
The employer, not the employee, files the application. In general, the Department of Labor (DOL) works to ensure that the admission of foreign workers to work in the U.S. will not adversely affect the job opportunities, wages and working conditions of U.S. workers. Once a permanent labor certification application has been approved by the DOL, the employer needs to seek immigration authorization from U.S. Citizenship and Immigration Services (USCIS). DOL processes Applications for Permanent Employment Certification, ETA Form 9089, except for Schedule A and sheepherder applications, which are filed under 20 CFR § 656.16. The date the labor certification application is received by the DOL is known as the filing date and is used by USCIS and the Department of State as the priority date. After the labor certification application is certified by DOL, it is valid for 180 days and should be submitted to the appropriate USCIS Service Center with a Form I-140, Immigrant Petition for Alien Worker.

Purpose: The purpose of this project was to practice data visualization techniques in R as a beginner.

The process: Data cleaning, data subset and variable selection

Because the dataset originally contained 154 columns and over 50,000 observations, a subset was selected for the analysis: visa applications with the status "denied". This step narrowed the dataset down to close to 25,000 observations. From there, some 29 variables were selected to explore key trends in the denied subset. Of the variables retained, the ones used to produce the graphs include:

case_number, case_status, employer_name, employer_state, pw_soc_code, pw_soc_title, job_info_education, job_info_major, job_info_alt_field, job_info_experience, job_info_foreign_ed, job_info_job_req_normal, country_of_citizenship, recr_info_professional_occ, foreign_worker_info_education, pw_job_title_9089, ri_coll_tch_basic_process, job_info_foreign_lang_req

global knit options


##Get the data

##Read the data

## [1] "C:/Users/rande/Documents/Data visualization/Spring2020"

Creating new dataframe with the desired columns

##Loading the libraries

##Creating new data frame for the denied cases only

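library(data.table)  # provides fread()
df <- fread(filename)  # filename is assumed to hold the path to the data file

# Keep only the rows whose case status is "Denied"
den_df <- df[case_status == "Denied"]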

##Selection of columns for the new data frame
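
The column selection step could be sketched roughly as follows. This is only an illustration on a made-up toy data frame, assuming dplyr is used for the selection; `den_toy`, `keep_cols`, `den_small` and `unused_column` are hypothetical names, and only three of the retained variables are shown.

```r
library(dplyr)

# Toy stand-in for the denied subset; the values and the extra column
# are made up for illustration
den_toy <- data.frame(
  case_number = c("A-001", "A-002"),
  case_status = c("Denied", "Denied"),
  employer_state = c("CA", "TEXAS"),
  unused_column = c(1, 2),
  stringsAsFactors = FALSE
)

# Hypothetical short list; the real analysis keeps more variables
keep_cols <- c("case_number", "case_status", "employer_state")

# all_of() raises an error if a listed column is missing,
# which catches typos in the variable names early
den_small <- den_toy %>% select(all_of(keep_cols))
names(den_small)
```

Keeping the wanted names in a character vector makes it easy to extend the list later without touching the selection call.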

dealing with states

##creating new variable as new_state to deal with the issue of having abbreviations and state names spelled out
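
One possible way to build such a `new_state` variable is sketched below. It assumes the state column mixes two-letter abbreviations with spelled-out names in varying case, and it relies on the `state.abb` and `state.name` vectors that ship with base R. The toy data and the helper `to_state_abb` are hypothetical and not taken from the original analysis.

```r
# Toy data: the values are made up for illustration only
den_states <- data.frame(
  employer_state = c("CA", "CALIFORNIA", "Texas", "NY", "new york", NA),
  stringsAsFactors = FALSE
)

# Map a mixed vector of abbreviations and full names to abbreviations.
# state.abb and state.name cover the 50 states only, so DC and the
# territories come back as NA and would need separate handling.
to_state_abb <- function(x) {
  x_up <- toupper(trimws(x))
  from_name <- state.abb[match(x_up, toupper(state.name))]
  ifelse(x_up %in% state.abb, x_up, from_name)
}

den_states$new_state <- to_state_abb(den_states$employer_state)
den_states
```

Values that are already valid abbreviations pass through unchanged; everything else is looked up by full name, and unmatched entries become `NA` so they can be inspected rather than silently miscounted.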

Countries of Citizenship of the applicants denied - Top 30

Create the dataframe

For each data frame, missing values are removed using filter

## # A tibble: 30 x 2
##    country_of_citizenship count
##    <fct>                  <int>
##  1 INDIA                   8546
##  2 SOUTH KOREA             2382
##  3 MEXICO                  1777
##  4 CHINA                   1487
##  5 PHILIPPINES             1361
##  6 CANADA                   805
##  7 UNITED KINGDOM           360
##  8 PAKISTAN                 308
##  9 VENEZUELA                261
## 10 JAPAN                    256
## # ... with 20 more rows
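
A table of this shape could be produced along the following lines. This is a minimal sketch assuming dplyr; `den_toy` is a made-up stand-in for the denied subset, and the result will not match the counts above.

```r
library(dplyr)

# Toy stand-in for the denied subset; values are made up
den_toy <- data.frame(
  country_of_citizenship = c("INDIA", "INDIA", "MEXICO", NA,
                             "CHINA", "INDIA", "MEXICO"),
  stringsAsFactors = FALSE
)

top_countries <- den_toy %>%
  filter(!is.na(country_of_citizenship)) %>%                      # drop missing values
  count(country_of_citizenship, name = "count", sort = TRUE) %>%  # rows per country, largest first
  head(30)                                                        # keep at most the top 30

top_countries
```

The same filter, count and head pattern would apply to the other "Top 30" data frames by swapping the column name.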

Selection of different fill colors

fillColor = "#7BB700"
fillColor4 = "#f10fad"
fillcolor2 = "#FFD505"

Plotting Countries of origin for the applicants denied - Plot #1
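
A horizontal bar chart of the country counts might look like the sketch below. It assumes ggplot2 is the plotting library, which the source does not state explicitly; `country_counts` and `country_plot` are hypothetical names, and the three rows are copied from the top of the table above.

```r
library(ggplot2)

fillColor <- "#7BB700"  # same value as defined earlier in the report

# First three rows of the country table; the full data frame has 30 rows
country_counts <- data.frame(
  country_of_citizenship = c("INDIA", "SOUTH KOREA", "MEXICO"),
  count = c(8546, 2382, 1777)
)

country_plot <- ggplot(
  country_counts,
  aes(x = reorder(country_of_citizenship, count), y = count)
) +
  geom_col(fill = fillColor) +
  coord_flip() +  # horizontal bars keep long country names readable
  labs(
    x = "Country of citizenship",
    y = "Denied applications",
    title = "Denied applications by country of citizenship"
  )

country_plot
```

Reordering the factor by `count` sorts the bars by size instead of alphabetically, which is usually easier to read for a ranked list.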

## New dataframe for state where employers are located

Plotting the Top 30 states where the employers are located

assigning Plot 2

Foreign Worker Education level - data frame

Assigning Plot 3

Pie chart of foreign workers education - gives proportion of denied visas and applicant education level
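
As a rough illustration of how such a pie chart can be built, the sketch below computes the share of each education level and draws it with ggplot2 by wrapping a single stacked bar into polar coordinates. The use of dplyr and ggplot2 is an assumption; the toy values and the names `edu_toy`, `edu_share` and `edu_pie` are hypothetical.

```r
library(dplyr)
library(ggplot2)

# Toy education levels; made up for illustration
edu_toy <- data.frame(
  foreign_worker_info_education = c("Master's", "Bachelor's", "Master's", "None",
                                    "Bachelor's", "Master's", NA, "Doctorate"),
  stringsAsFactors = FALSE
)

edu_share <- edu_toy %>%
  filter(!is.na(foreign_worker_info_education)) %>%   # drop missing values
  count(foreign_worker_info_education, name = "count") %>%
  mutate(share = count / sum(count))                  # proportion of each level

# A pie chart in ggplot2 is a stacked bar drawn in polar coordinates
edu_pie <- ggplot(edu_share,
                  aes(x = "", y = share, fill = foreign_worker_info_education)) +
  geom_col(width = 1) +
  coord_polar(theta = "y") +
  labs(fill = "Education level") +
  theme_void()

edu_pie
```

Computing `share` before plotting keeps the proportions available for labels or for a plain table alongside the chart.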

Exploring whether or not the employer had a layoff in the 6 months prior to filing the application

Exploring the proportion of employers who had a layoff in the 6 months prior to filing

## Dataframe for the applications denied that were refiles

## # A tibble: 2 x 2
##   refile count
##   <fct>  <int>
## 1 N       7192
## 2 Y         49
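
To read these counts as proportions, a short calculation such as the following may help. The two counts are copied from the table above; `refile_counts` and `share` are hypothetical names.

```r
# Counts copied from the refile table above
refile_counts <- data.frame(
  refile = c("N", "Y"),
  count = c(7192L, 49L)
)

# Share of each category among the rows with a non-missing refile value
refile_counts$share <- refile_counts$count / sum(refile_counts$count)
refile_counts
```

With these numbers, refiles make up well under 1% of the denied applications that have a recorded refile value.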

Plotting denied visas

Dataframe for the Top 30 majors of foreign workers denied

new color

fillcolor4= "#9B111E"

Plotting the academic majors of the denied cases

##Libraries

library(dplyr)
library(plotly)

Data frame for job titles of visa denied

Plotting job titles of visas denied
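
Since dplyr and plotly are loaded at this point, an interactive bar chart of job titles could be sketched as below. The job titles are made up, and `job_toy`, `job_counts` and `job_plot` are hypothetical names; this is not the original plotting code.

```r
library(dplyr)
library(plotly)

# Toy job titles; made up for illustration
job_toy <- data.frame(
  pw_job_title_9089 = c("Software Engineer", "Cook", "Software Engineer",
                        "Accountant", "Cook", "Software Engineer"),
  stringsAsFactors = FALSE
)

job_counts <- job_toy %>%
  count(pw_job_title_9089, name = "count", sort = TRUE) %>%
  head(30)  # keep at most the top 30 titles

# Horizontal bars, ordered by count, with hover tooltips from plotly
job_plot <- plot_ly(
  job_counts,
  x = ~count,
  y = ~reorder(pw_job_title_9089, count),
  type = "bar",
  orientation = "h"
) %>%
  layout(
    xaxis = list(title = "Denied applications"),
    yaxis = list(title = "")
  )

job_plot
```

The horizontal orientation leaves room for long job titles on the axis, which tends to matter with 30 categories.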

Exploring the top 30 employers who got visas denied

plotting top 30 employers with visas denied

Dataframe of whether denial is for Schedule A or a Sheepherder Occupation

Proportion of denials for Schedule A or sheepherder occupations

Proportion of denials that used basic recruitment process for professional occupations

### Count of Visa by original file year ###

## # A tibble: 19 x 2
##    orig_file_date count
##    <chr>          <int>
##  1 1996               1
##  2 1997               1
##  3 1998               1
##  4 2001              22
##  5 2002               1
##  6 2003               5
##  7 2004               1
##  8 2005               4
##  9 2006               1
## 10 2007               3
## 11 2008               2
## 12 2009               2
## 13 2010               1
## 14 2011               2
## 15 2012               5
## 16 2013               4
## 17 2014              13
## 18 2015              28
## 19 2016               8
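
A count by year like the one above could be derived roughly as follows. The sketch assumes the original filing date is stored as an ISO-style string ("YYYY-MM-DD"), which the source does not confirm; the toy dates and the names `date_toy` and `year_counts` are hypothetical.

```r
library(dplyr)

# Toy filing dates; the format "YYYY-MM-DD" is an assumption
date_toy <- data.frame(
  orig_file_date = c("2015-03-02", "2015-11-20", "2001-04-30", NA, "2016-01-15"),
  stringsAsFactors = FALSE
)

year_counts <- date_toy %>%
  filter(!is.na(orig_file_date)) %>%                                   # drop missing dates
  mutate(orig_file_date = format(as.Date(orig_file_date), "%Y")) %>%   # keep the year only, as text
  count(orig_file_date, name = "count")

year_counts
```

If the real column uses another date format, the `format` argument of `as.Date()` would need to be set accordingly.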