R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

When you click the Knit button, a document will be generated that includes both content and the output of any embedded R code chunks within the document.

This is a final project to show off what you have learned. Select your data set from the list below: http://vincentarelbundock.github.io/Rdatasets/ (click on the csv index for a list). I selected Resume Names: https://vincentarelbundock.github.io/Rdatasets/csv/AER/ResumeNames.csv

The presentation approach is up to you, but it should contain the following:

1. Data Exploration: This should include summary statistics, means, medians, quartiles, or any other relevant information about the data set. Please include some conclusions in the R Markdown text.

ResumeNames<- read.csv(file ="https://vincentarelbundock.github.io/Rdatasets/csv/AER/ResumeNames.csv", header = TRUE, sep = ",")

head(ResumeNames)

##   X    name gender ethnicity quality call    city jobs experience honors
## 1 1 Allison female      cauc     low   no chicago    2          6     no
## 2 2 Kristen female      cauc    high   no chicago    3          6     no
## 3 3 Lakisha female      afam     low   no chicago    1          6     no
## 4 4 Latonya female      afam    high   no chicago    4          6     no
## 5 5  Carrie female      cauc    high   no chicago    3         22     no
## 6 6     Jay   male      cauc     low   no chicago    2          6    yes
##   volunteer military holes school email computer special college minimum equal
## 1        no       no   yes     no    no      yes      no     yes       5   yes
## 2       yes      yes    no    yes   yes      yes      no      no       5   yes
## 3        no       no    no    yes    no      yes      no     yes       5   yes
## 4       yes       no   yes     no   yes      yes     yes      no       5   yes
## 5        no       no    no    yes   yes      yes      no      no    some   yes
## 6        no       no    no     no    no       no     yes     yes    none   yes
##       wanted requirements reqexp reqcomm reqeduc reqcomp reqorg
## 1 supervisor          yes    yes      no      no     yes     no
## 2 supervisor          yes    yes      no      no     yes     no
## 3 supervisor          yes    yes      no      no     yes     no
## 4 supervisor          yes    yes      no      no     yes     no
## 5  secretary          yes    yes      no      no     yes    yes
## 6      other           no     no      no      no      no     no
##                           industry
## 1                    manufacturing
## 2                    manufacturing
## 3                    manufacturing
## 4                    manufacturing
## 5 health/education/social services
## 6                            trade

summary(ResumeNames)

##        X            name              gender           ethnicity        
##  Min.   :   1   Length:4870        Length:4870        Length:4870       
##  1st Qu.:1218   Class :character   Class :character   Class :character  
##  Median :2436   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :2436                                                           
##  3rd Qu.:3653                                                           
##  Max.   :4870                                                           
##    quality              call               city                jobs      
##  Length:4870        Length:4870        Length:4870        Min.   :1.000  
##  Class :character   Class :character   Class :character   1st Qu.:3.000  
##  Mode  :character   Mode  :character   Mode  :character   Median :4.000  
##                                                           Mean   :3.661  
##                                                           3rd Qu.:4.000  
##                                                           Max.   :7.000  
##    experience        honors           volunteer           military        
##  Min.   : 1.000   Length:4870        Length:4870        Length:4870       
##  1st Qu.: 5.000   Class :character   Class :character   Class :character  
##  Median : 6.000   Mode  :character   Mode  :character   Mode  :character  
##  Mean   : 7.843                                                           
##  3rd Qu.: 9.000                                                           
##  Max.   :44.000                                                           
##     holes              school             email             computer        
##  Length:4870        Length:4870        Length:4870        Length:4870       
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##    special            college            minimum             equal          
##  Length:4870        Length:4870        Length:4870        Length:4870       
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##     wanted          requirements          reqexp            reqcomm         
##  Length:4870        Length:4870        Length:4870        Length:4870       
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##    reqeduc            reqcomp             reqorg            industry        
##  Length:4870        Length:4870        Length:4870        Length:4870       
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##

jobsmean <- mean(ResumeNames$jobs)
print(jobsmean)

## [1] 3.661396

jobsmedian <- median(ResumeNames$jobs)
print(jobsmedian)

## [1] 4

experiencemean <- mean(ResumeNames$experience)
print(experiencemean)

## [1] 7.842916

experiencemedian <- median(ResumeNames$experience)
print(experiencemedian)

## [1] 6

experiencequantile <- quantile(ResumeNames$experience)
print(experiencequantile)

##   0%  25%  50%  75% 100% 
##    1    5    6    9   44

jobsquantile <- quantile(ResumeNames$jobs)
print(jobsquantile)

##   0%  25%  50%  75% 100% 
##    1    3    4    4    7

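For jobs, the mean is 3.66 and the median is 4; for experience, the mean is 7.84 and the median is 6. Experience covers a wide range, from 1 to 44, and the number of jobs ranges from 1 to 7. Because the mean of experience exceeds its median and the maximum (44) lies far above the third quartile (9), the distribution of experience is right-skewed.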
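The statistics above can also be collected in a single table rather than one call at a time. The following is a minimal sketch, not the approach used in this project: `first_six` and `summary_table` are names introduced here for illustration, and the data frame holds only the six `jobs` and `experience` values printed by `head(ResumeNames)` above, so its results differ from the full-data figures.

```r
# Illustration only: the six rows shown by head(ResumeNames), typed in by hand.
first_six <- data.frame(
  jobs       = c(2, 3, 1, 4, 3, 2),
  experience = c(6, 6, 6, 6, 22, 6)
)

# For each column, compute the mean, the median, and the 25% and 75% quartiles.
summary_table <- sapply(first_six[c("jobs", "experience")], function(x) {
  c(mean = mean(x), median = median(x), quantile(x, c(0.25, 0.75)))
})

# One row per statistic, one column per variable.
print(round(summary_table, 2))
```

Replacing `first_six` with `ResumeNames` should give the same means, medians, and quartiles as the individual calls above.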

2. Data wrangling: Please perform some basic transformations. They will need to make sense but could include column renaming, creating a subset of the data, replacing values, or creating new columns with derived data.

Top<- data.frame(subset(ResumeNames, experience>= 15 & jobs>= 3))
names(Top)[names(Top) == 'holes'] <- 'gaps'
names(Top)[names(Top) == 'call'] <- 'forward'

# The Top dataset reflects the top candidates (experience of at least 15 and at least 3 jobs). I also renamed the columns ‘holes’ to ‘gaps’ and ‘call’ to ‘forward’ in the Top dataset. Below, I also wanted to recode the character fields into numeric indicators for the requirements column, with 1 meaning yes.

Top$forward <- as.character(Top$forward)
Top$forward[Top$forward == "no"] <- "0"
Top$forward[Top$forward == "yes"] <- "1"

# I tried to change the values above to show whether the candidate moved forward (received a callback). The values updated, but the column is still ‘chr’, not ‘int’.
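The column stays character because "0" and "1" are assigned as strings. One possible way to get an integer column is to compare the values with "yes" and convert the logical result using `as.integer()`. The sketch below is an assumption about what was intended, not the code used in this project; `demo` and its five values are made up for illustration and are not taken from the data set.

```r
# Hypothetical callback values, made up for illustration.
demo <- data.frame(
  forward = c("no", "yes", "no", "no", "yes"),
  stringsAsFactors = FALSE
)

# The comparison gives TRUE/FALSE; as.integer() maps TRUE to 1 and FALSE to 0.
demo$forward <- as.integer(demo$forward == "yes")

# str() now reports an integer vector instead of a character vector.
str(demo$forward)

# With 0/1 coding, the mean is the share of callbacks in this toy vector.
print(mean(demo$forward))
```

Calling `as.integer()` on a column that already holds the strings "0" and "1" would also work, since those strings parse as numbers.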

3. Graphics: Please make sure to display at least one scatter plot, box plot, and histogram. Don’t be limited to this. Please explore the many other options in R packages such as ggplot2.

# Check the class of each column in the Top data frame
sapply(Top, class)

hist(ResumeNames$jobs,main = "Jobs Histogram", xlab = "jobs")

boxplot(Top$experience, main = "Experience", ylab = "Experience")

 plot(jobs ~ experience, data=Top)

 plot(jobs ~ experience, data=ResumeNames)
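A scatter plot of jobs against experience can be backed up with a number: the Pearson correlation from `cor()`. The sketch below is illustrative only. `first_six` and `r` are names introduced here, and the data are just the six rows printed by `head(ResumeNames)` earlier, so the result says nothing about the full data set.

```r
# Illustration only: the six rows shown by head(ResumeNames), typed in by hand.
first_six <- data.frame(
  jobs       = c(2, 3, 1, 4, 3, 2),
  experience = c(6, 6, 6, 6, 22, 6)
)

# Pearson correlation: values near 0 suggest little linear relationship,
# values near -1 or 1 suggest a strong one.
r <- cor(first_six$jobs, first_six$experience)
print(r)
```

Running `cor(ResumeNames$jobs, ResumeNames$experience)` on the full data would be one way to check the impression given by the plots.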




```r
str(Top)

## 'data.frame':    423 obs. of  28 variables:
##  $ X           : int  5 8 57 113 163 175 178 179 182 183 ...
##  $ name        : chr  "Carrie" "Kenya" "Laurie" "Anne" ...
##  $ gender      : chr  "female" "female" "female" "female" ...
##  $ ethnicity   : chr  "cauc" "afam" "cauc" "cauc" ...
##  $ quality     : chr  "high" "high" "high" "low" ...
##  $ forward     : chr  "0" "0" "0" "0" ...
##  $ city        : chr  "chicago" "chicago" "chicago" "chicago" ...
##  $ jobs        : int  3 4 4 3 4 6 5 6 5 6 ...
##  $ experience  : int  22 21 21 23 21 18 26 18 26 18 ...
##  $ honors      : chr  "no" "no" "no" "no" ...
##  $ volunteer   : chr  "no" "yes" "no" "yes" ...
##  $ military    : chr  "no" "no" "no" "no" ...
##  $ gaps        : chr  "no" "yes" "yes" "yes" ...
##  $ school      : chr  "yes" "no" "yes" "yes" ...
##  $ email       : chr  "yes" "yes" "yes" "no" ...
##  $ computer    : chr  "yes" "yes" "yes" "yes" ...
##  $ special     : chr  "no" "yes" "no" "no" ...
##  $ college     : chr  "no" "no" "yes" "no" ...
##  $ minimum     : chr  "some" "some" "some" "none" ...
##  $ equal       : chr  "yes" "yes" "no" "yes" ...
##  $ wanted      : chr  "secretary" "secretary" "secretary" "secretary" ...
##  $ requirements: chr  "yes" "yes" "yes" "yes" ...
##  $ reqexp      : chr  "yes" "yes" "yes" "no" ...
##  $ reqcomm     : chr  "no" "no" "no" "no" ...
##  $ reqeduc     : chr  "no" "no" "no" "no" ...
##  $ reqcomp     : chr  "yes" "yes" "no" "yes" ...
##  $ reqorg      : chr  "yes" "yes" "no" "no" ...
##  $ industry    : chr  "health/education/social services" "health/education/social services" "finance/insurance/real estate" "manufacturing" ...
```

4. Meaningful question for analysis: Please state at the beginning a meaningful question for analysis. Use the first three steps and anything else that would be helpful to answer the question you are posing from the data set you chose. Please write a brief conclusion paragraph in R Markdown at the end.

I feel like I was unable to use this project well to answer a real question I had about this dataset. I initially wanted to look at how much more likely a lower-quality candidate with a white-sounding name is to be called back than a higher-quality candidate with an African American-sounding name, looking at the top 50% who were qualified. I was unable to convert some character columns to integers. I saw a lot of errors while working on this, including `Error: attempt to use zero-length variable name`, but I feel like something else was the issue (not that I had accidentally highlighted the backticks along with the R code). The visualization I created shows very little of significance. I do not believe there was any direct correlation between number of jobs and experience based on this dataset. In retrospect, I wish I had used another CSV data set for this project. However, I did attempt this project with another dataset and still came across roadblocks, so I decided to submit this instead and hope for enough points to pass. :)

BONUS

```r
ResumeNames <- read.csv('https://raw.githubusercontent.com/marjete/final-R/main/ResumeNames.csv')
head(ResumeNames)

##   X    name gender ethnicity quality call    city jobs experience honors
## 1 1 Allison female      cauc     low   no chicago    2          6     no
## 2 2 Kristen female      cauc    high   no chicago    3          6     no
## 3 3 Lakisha female      afam     low   no chicago    1          6     no
## 4 4 Latonya female      afam    high   no chicago    4          6     no
## 5 5  Carrie female      cauc    high   no chicago    3         22     no
## 6 6     Jay   male      cauc     low   no chicago    2          6    yes
##   volunteer military holes school email computer special college minimum equal
## 1        no       no   yes     no    no      yes      no     yes       5   yes
## 2       yes      yes    no    yes   yes      yes      no      no       5   yes
## 3        no       no    no    yes    no      yes      no     yes       5   yes
## 4       yes       no   yes     no   yes      yes     yes      no       5   yes
## 5        no       no    no    yes   yes      yes      no      no    some   yes
## 6        no       no    no     no    no       no     yes     yes    none   yes
##       wanted requirements reqexp reqcomm reqeduc reqcomp reqorg
## 1 supervisor          yes    yes      no      no     yes     no
## 2 supervisor          yes    yes      no      no     yes     no
## 3 supervisor          yes    yes      no      no     yes     no
## 4 supervisor          yes    yes      no      no     yes     no
## 5  secretary          yes    yes      no      no     yes    yes
## 6      other           no     no      no      no      no     no
##                           industry
## 1                    manufacturing
## 2                    manufacturing
## 3                    manufacturing
## 4                    manufacturing
## 5 health/education/social services
## 6                            trade
```

Final Project

Marjete

2022-07-31
