# This is a final project to show off what you have learned.
Select your data set from the list below: http://vincentarelbundock.github.io/Rdatasets/ (click on the csv index for a list). I selected Resume Names: https://vincentarelbundock.github.io/Rdatasets/csv/AER/ResumeNames.csv. The presentation approach is up to you, but it should contain the following:
ResumeNames <- read.csv(file = "https://vincentarelbundock.github.io/Rdatasets/csv/AER/ResumeNames.csv", header = TRUE, sep = ",")
head(ResumeNames)
## X name gender ethnicity quality call city jobs experience honors
## 1 1 Allison female cauc low no chicago 2 6 no
## 2 2 Kristen female cauc high no chicago 3 6 no
## 3 3 Lakisha female afam low no chicago 1 6 no
## 4 4 Latonya female afam high no chicago 4 6 no
## 5 5 Carrie female cauc high no chicago 3 22 no
## 6 6 Jay male cauc low no chicago 2 6 yes
## volunteer military holes school email computer special college minimum equal
## 1 no no yes no no yes no yes 5 yes
## 2 yes yes no yes yes yes no no 5 yes
## 3 no no no yes no yes no yes 5 yes
## 4 yes no yes no yes yes yes no 5 yes
## 5 no no no yes yes yes no no some yes
## 6 no no no no no no yes yes none yes
## wanted requirements reqexp reqcomm reqeduc reqcomp reqorg
## 1 supervisor yes yes no no yes no
## 2 supervisor yes yes no no yes no
## 3 supervisor yes yes no no yes no
## 4 supervisor yes yes no no yes no
## 5 secretary yes yes no no yes yes
## 6 other no no no no no no
## industry
## 1 manufacturing
## 2 manufacturing
## 3 manufacturing
## 4 manufacturing
## 5 health/education/social services
## 6 trade
summary(ResumeNames)
## X name gender ethnicity
## Min. : 1 Length:4870 Length:4870 Length:4870
## 1st Qu.:1218 Class :character Class :character Class :character
## Median :2436 Mode :character Mode :character Mode :character
## Mean :2436
## 3rd Qu.:3653
## Max. :4870
## quality call city jobs
## Length:4870 Length:4870 Length:4870 Min. :1.000
## Class :character Class :character Class :character 1st Qu.:3.000
## Mode :character Mode :character Mode :character Median :4.000
## Mean :3.661
## 3rd Qu.:4.000
## Max. :7.000
## experience honors volunteer military
## Min. : 1.000 Length:4870 Length:4870 Length:4870
## 1st Qu.: 5.000 Class :character Class :character Class :character
## Median : 6.000 Mode :character Mode :character Mode :character
## Mean : 7.843
## 3rd Qu.: 9.000
## Max. :44.000
## holes school email computer
## Length:4870 Length:4870 Length:4870 Length:4870
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
## special college minimum equal
## Length:4870 Length:4870 Length:4870 Length:4870
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
## wanted requirements reqexp reqcomm
## Length:4870 Length:4870 Length:4870 Length:4870
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
## reqeduc reqcomp reqorg industry
## Length:4870 Length:4870 Length:4870 Length:4870
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
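Because read.csv() leaves the text columns as character, summary() above reports only Length, Class, and Mode for them. The following is a minimal sketch of one way to get category counts instead. It uses a few columns from the six rows printed by head() as a stand-in for the full data; the object names `sample_rows`, `is_text`, and `ethnicity_counts` are illustrative assumptions, not part of the original project.

```r
# Stand-in data: a few columns from the six rows printed by head(ResumeNames).
# `sample_rows` is a hypothetical name used only for this sketch.
sample_rows <- data.frame(
  name       = c("Allison", "Kristen", "Lakisha", "Latonya", "Carrie", "Jay"),
  gender     = c("female", "female", "female", "female", "female", "male"),
  ethnicity  = c("cauc", "cauc", "afam", "afam", "cauc", "cauc"),
  quality    = c("low", "high", "low", "high", "high", "low"),
  call       = c("no", "no", "no", "no", "no", "no"),
  jobs       = c(2L, 3L, 1L, 4L, 3L, 2L),
  experience = c(6L, 6L, 6L, 6L, 22L, 6L),
  stringsAsFactors = FALSE
)

# Convert every character column to a factor so that summary() counts levels.
is_text <- sapply(sample_rows, is.character)
sample_rows[is_text] <- lapply(sample_rows[is_text], factor)

summary(sample_rows[c("gender", "ethnicity", "quality")])

# table() gives the same counts for a single column.
ethnicity_counts <- table(sample_rows$ethnicity)
print(ethnicity_counts)
```

On the full data, passing `stringsAsFactors = TRUE` to read.csv() should have a similar effect at load time.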
jobsmean <- mean(ResumeNames$jobs)
print(jobsmean)
## [1] 3.661396
jobsmedian <- median(ResumeNames$jobs)
print(jobsmedian)
## [1] 4
experiencemean <- mean(ResumeNames$experience)
print(experiencemean)
## [1] 7.842916
experiencemedian <- median(ResumeNames$experience)
print(experiencemedian)
## [1] 6
experiencequantile <- quantile(ResumeNames$experience)
print(experiencequantile)
## 0% 25% 50% 75% 100%
## 1 5 6 9 44
jobsquantile <- quantile(ResumeNames$jobs)
print(jobsquantile)
## 0% 25% 50% 75% 100%
## 1 3 4 4 7
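The mean, median, and quantile calls above repeat the same steps for each column. As a sketch of a more compact alternative, the statistics can be collected for several numeric columns at once. The helper `describe` and the objects `numeric_cols` and `stats_table` are hypothetical names, and the values are the jobs and experience entries from the six head() rows, used only as a stand-in for the full data.

```r
# Stand-in values: jobs and experience from the six head() rows.
numeric_cols <- data.frame(
  jobs       = c(2L, 3L, 1L, 4L, 3L, 2L),
  experience = c(6L, 6L, 6L, 6L, 22L, 6L)
)

# Hypothetical helper: mean, median, and quartiles for one numeric vector.
describe <- function(x) {
  c(mean = mean(x), median = median(x), quantile(x))
}

# Apply the helper to every column; the result is a matrix with one
# column per variable and one row per statistic.
stats_table <- sapply(numeric_cols, describe)
print(round(stats_table, 2))
```

With the real data, `sapply(ResumeNames[c("jobs", "experience")], describe)` would be the equivalent call.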
##2. Data wrangling: Please perform some basic transformations. They will need to make sense but could include column renaming, creating a subset of the data, replacing values, or creating new columns with derived data.
Top <- subset(ResumeNames, experience >= 15 & jobs >= 3)
names(Top)[names(Top) == 'holes'] <- 'gaps'
names(Top)[names(Top) == 'call'] <- 'forward'
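# The Top dataset reflects the top candidates (experience >= 15 and jobs >= 3). I also renamed the column 'holes' to 'gaps' and the column 'call' to 'forward' in the Top dataset. Below, I also wanted to recode the yes/no character fields, such as the requirements column, into numeric values, with 1 meaning yes. The code below does this for the forward column.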
Top$forward <- as.character(Top$forward)
Top$forward[Top$forward == "no"] <- "0"
Top$forward[Top$forward == "yes"] <- "1"
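# I tried to recode the values above to show whether the candidate moved forward (had a callback). The values were updated, but the column is still of type 'chr', not 'int', because the replacement values "0" and "1" are character strings.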
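One possible way to end up with an integer column is sketched below. The vectors `forward`, `forward_int`, and `already_recoded` are made-up examples for illustration and are not taken from the data set.

```r
# Made-up stand-in vector with both values present (not from the data set).
forward <- c("no", "yes", "no", "no", "yes")

# Comparing to "yes" gives TRUE/FALSE; as.integer() maps that to 1/0.
forward_int <- as.integer(forward == "yes")

class(forward_int)  # "integer"
mean(forward_int)   # share of "yes" values

# If the column already holds the strings "0" and "1",
# as.integer() alone converts it.
already_recoded <- c("0", "1", "0")
as.integer(already_recoded)
```

Applied to the project data, this would presumably look like `Top$forward <- as.integer(Top$forward == "yes")`, run in place of the three recoding lines above rather than after them.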
##3. Graphics: Please make sure to display at least one scatter plot, box plot and histogram. Don’t be limited to this. Please explore the many other options in R packages such as ggplot2.
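# Check the class of each column of the data set. On its own, df refers to the built-in function stats::df rather than to the data, so the data frame is named explicitly.
sapply(ResumeNames, class)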
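# Histogram of jobs, using the full data set
hist(ResumeNames$jobs, main = "Jobs Histogram", xlab = "jobs")
# Box plot of experience among the top candidates
boxplot(Top$experience, main = "Experience", ylab = "Experience")
# Scatter plots of jobs against experience: top candidates first, then the full data set
plot(jobs ~ experience, data = Top)
plot(jobs ~ experience, data = ResumeNames)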
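The plots above could be extended in a couple of ways. The sketch below uses base graphics only, to avoid depending on an extra package, and shows a box plot split by group and a jittered scatter plot. The data are a few columns from the six head() rows, used as a stand-in; `sample_rows` and `group_stats` are hypothetical names.

```r
# Stand-in data: a few columns from the six head() rows shown earlier.
sample_rows <- data.frame(
  ethnicity  = c("cauc", "cauc", "afam", "afam", "cauc", "cauc"),
  jobs       = c(2, 3, 1, 4, 3, 2),
  experience = c(6, 6, 6, 6, 22, 6)
)

# Box plot of experience split by group, using the formula interface.
# boxplot() also returns the statistics it drew, which are kept here.
group_stats <- boxplot(experience ~ ethnicity, data = sample_rows,
                       main = "Experience by ethnicity",
                       xlab = "Ethnicity", ylab = "Experience")

# jobs and experience are whole numbers, so many points overlap exactly.
# jitter() adds a little random noise so overlapping points become visible.
plot(jitter(jobs) ~ jitter(experience), data = sample_rows,
     main = "Jobs vs. experience", xlab = "Experience", ylab = "Jobs")
```

Replacing `sample_rows` with `ResumeNames` or `Top` should produce the same kinds of plots for the real data.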
```r
str(Top)
## 'data.frame': 423 obs. of 28 variables:
## $ X : int 5 8 57 113 163 175 178 179 182 183 ...
## $ name : chr "Carrie" "Kenya" "Laurie" "Anne" ...
## $ gender : chr "female" "female" "female" "female" ...
## $ ethnicity : chr "cauc" "afam" "cauc" "cauc" ...
## $ quality : chr "high" "high" "high" "low" ...
## $ forward : chr "0" "0" "0" "0" ...
## $ city : chr "chicago" "chicago" "chicago" "chicago" ...
## $ jobs : int 3 4 4 3 4 6 5 6 5 6 ...
## $ experience : int 22 21 21 23 21 18 26 18 26 18 ...
## $ honors : chr "no" "no" "no" "no" ...
## $ volunteer : chr "no" "yes" "no" "yes" ...
## $ military : chr "no" "no" "no" "no" ...
## $ gaps : chr "no" "yes" "yes" "yes" ...
## $ school : chr "yes" "no" "yes" "yes" ...
## $ email : chr "yes" "yes" "yes" "no" ...
## $ computer : chr "yes" "yes" "yes" "yes" ...
## $ special : chr "no" "yes" "no" "no" ...
## $ college : chr "no" "no" "yes" "no" ...
## $ minimum : chr "some" "some" "some" "none" ...
## $ equal : chr "yes" "yes" "no" "yes" ...
## $ wanted : chr "secretary" "secretary" "secretary" "secretary" ...
## $ requirements: chr "yes" "yes" "yes" "yes" ...
## $ reqexp : chr "yes" "yes" "yes" "no" ...
## $ reqcomm : chr "no" "no" "no" "no" ...
## $ reqeduc : chr "no" "no" "no" "no" ...
## $ reqcomp : chr "yes" "yes" "no" "yes" ...
## $ reqorg : chr "yes" "yes" "no" "no" ...
## $ industry : chr "health/education/social services" "health/education/social services" "finance/insurance/real estate" "manufacturing" ...
```
##4. Meaningful question for analysis: Please state at the beginning a meaningful question for analysis. Use the first three steps and anything else that would be helpful to answer the question you are posing from the data set you chose. Please write a brief conclusion paragraph in R markdown at the end.
I feel like I was unable to use this project well to answer a real question I had about this dataset. I initially wanted to look at how much more likely a lower-quality candidate is to be called back than a higher-quality candidate because their name sounds white rather than African American, by looking at the top 50% who were qualified. I was unable to convert some character columns to integers. I saw a lot of errors while working on this, including `Error: attempt to use zero-length variable name`, but I feel like something else was the issue (I had not accidentally highlighted the backticks along with the R code). The visualizations I created show very little of significance. I do not believe there was any direct correlation between number of jobs and experience based on this dataset. In retrospect, I wish I had used another csv data set for this project. I did attempt this project with another dataset, but I still came across roadblocks, so I decided to submit this instead and hope for enough points to pass. :)
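For reference, here is a rough sketch of how the callback question and the jobs/experience correlation might be approached. The rows in `toy` are made up purely for illustration and are NOT taken from ResumeNames, so the numbers it prints say nothing about the real data. The names `toy`, `called`, `rates`, and `jobs_experience_cor` are assumptions introduced for this sketch.

```r
# Made-up rows for illustration only; they are NOT taken from ResumeNames.
toy <- data.frame(
  ethnicity  = c("cauc", "cauc", "cauc", "cauc", "afam", "afam", "afam", "afam"),
  quality    = c("low", "low", "high", "high", "low", "low", "high", "high"),
  call       = c("yes", "no", "yes", "no", "no", "no", "yes", "no"),
  jobs       = c(2, 3, 4, 3, 1, 2, 4, 5),
  experience = c(6, 5, 10, 8, 4, 6, 12, 9)
)

# Recode the callback as 0/1 so that its mean is a callback rate.
toy$called <- as.integer(toy$call == "yes")

# Callback rate for each ethnicity/quality combination.
rates <- aggregate(called ~ ethnicity + quality, data = toy, FUN = mean)
print(rates)

# Pearson correlation between number of jobs and experience.
jobs_experience_cor <- cor(toy$jobs, toy$experience)
print(jobs_experience_cor)
```

Running the same two steps on the full ResumeNames data (with `call` in place of the toy column) would give the actual callback rates and correlation.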
#BONUS
```r
ResumeNames <- read.csv('https://raw.githubusercontent.com/marjete/final-R/main/ResumeNames.csv')
head(ResumeNames)
## X name gender ethnicity quality call city jobs experience honors
## 1 1 Allison female cauc low no chicago 2 6 no
## 2 2 Kristen female cauc high no chicago 3 6 no
## 3 3 Lakisha female afam low no chicago 1 6 no
## 4 4 Latonya female afam high no chicago 4 6 no
## 5 5 Carrie female cauc high no chicago 3 22 no
## 6 6 Jay male cauc low no chicago 2 6 yes
## volunteer military holes school email computer special college minimum equal
## 1 no no yes no no yes no yes 5 yes
## 2 yes yes no yes yes yes no no 5 yes
## 3 no no no yes no yes no yes 5 yes
## 4 yes no yes no yes yes yes no 5 yes
## 5 no no no yes yes yes no no some yes
## 6 no no no no no no yes yes none yes
## wanted requirements reqexp reqcomm reqeduc reqcomp reqorg
## 1 supervisor yes yes no no yes no
## 2 supervisor yes yes no no yes no
## 3 supervisor yes yes no no yes no
## 4 supervisor yes yes no no yes no
## 5 secretary yes yes no no yes yes
## 6 other no no no no no no
## industry
## 1 manufacturing
## 2 manufacturing
## 3 manufacturing
## 4 manufacturing
## 5 health/education/social services
## 6 trade
```