Import your data

library(tidyverse)
library(readxl)
# Excel file
data <- read_excel("Salaries.xlsx")
data
## # A tibble: 397 × 6
##    rank      discipline yrs.since.phd yrs.service sex    salary
##    <chr>     <chr>              <dbl>       <dbl> <chr>   <dbl>
##  1 Prof      B                     19          18 Male   139750
##  2 Prof      B                     20          16 Male   173200
##  3 AsstProf  B                      4           3 Male    79750
##  4 Prof      B                     45          39 Male   115000
##  5 Prof      B                     40          41 Male   141500
##  6 AssocProf B                      6           6 Male    97000
##  7 Prof      B                     30          23 Male   175000
##  8 Prof      B                     45          45 Male   147765
##  9 Prof      B                     21          20 Male   119250
## 10 Prof      B                     18          18 Female 129000
## # … with 387 more rows

Question

Which rank makes up the majority of those with high-paying salaries?

Explanation of Variables Used

The data set contains six variables: rank, discipline, yrs.since.phd, yrs.service, sex, and salary. Rank is the person's academic rank. From highest to lowest, the ranks are “Prof” (Professor), “AssocProf” (Associate Professor), and “AsstProf” (Assistant Professor). Discipline is coded as “A” or “B”; in the documentation for the carData Salaries data set, which this file appears to match, “A” denotes theoretical departments and “B” denotes applied departments. Yrs.since.phd is the number of years since the person earned a PhD (Doctor of Philosophy). Yrs.service is the number of years of service, that is, how long the person has been working. Sex indicates whether the person is male or female, which allows comparisons between the two groups. Finally, salary shows how much each person earns. All of these variables are useful for the analysis, but salary is the most important because it is the outcome the question asks about.
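Because rank has a natural order, it can help to store it as an ordered factor so that sorting and comparisons follow the hierarchy. The following is a minimal sketch, assuming the usual academic ordering of assistant, associate, and full professor; `rank_levels` and `sample_ranks` are illustrative names that do not appear in the original analysis.

```r
# Assumed ordering of academic ranks, lowest to highest
rank_levels <- c("AsstProf", "AssocProf", "Prof")

# A few rank values as they appear in the data
sample_ranks <- c("Prof", "AsstProf", "AssocProf", "Prof")

# Convert to an ordered factor so sorting and comparisons respect the hierarchy
rank_ordered <- factor(sample_ranks, levels = rank_levels, ordered = TRUE)

# Sorts from lowest to highest rank
sort(rank_ordered)

# TRUE for every rank above assistant professor
rank_ordered > "AsstProf"
```

On the full data set, the same conversion could be applied with `mutate(rank = factor(rank, levels = rank_levels, ordered = TRUE))`.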

Relevant R Code and Analyses

# Scatter plot of salary against years since PhD
data %>%
    ggplot(aes(x = yrs.since.phd, y = salary)) +
    geom_point()
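
Since the question is about rank, one possible extension of this plot is to colour each point by rank. The sketch below is only an illustration: `sample_rows` is a hypothetical name holding the first ten rows printed above, so that the example runs on its own, and the axis labels are assumptions.

```r
library(ggplot2)

# First ten rows of the data as printed above (illustrative subset only)
sample_rows <- data.frame(
    rank = c("Prof", "Prof", "AsstProf", "Prof", "Prof",
             "AssocProf", "Prof", "Prof", "Prof", "Prof"),
    yrs.since.phd = c(19, 20, 4, 45, 40, 6, 30, 45, 21, 18),
    salary = c(139750, 173200, 79750, 115000, 141500,
               97000, 175000, 147765, 119250, 129000)
)

# Map rank to colour so the three ranks can be told apart in the scatter plot
p <- ggplot(sample_rows, aes(x = yrs.since.phd, y = salary, color = rank)) +
    geom_point() +
    labs(x = "Years since PhD", y = "Salary", color = "Rank")
p
```

Replacing `sample_rows` with `data` would draw the same plot for all 397 rows.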

arrange(data, desc(salary))
## # A tibble: 397 × 6
##    rank  discipline yrs.since.phd yrs.service sex   salary
##    <chr> <chr>              <dbl>       <dbl> <chr>  <dbl>
##  1 Prof  B                     38          38 Male  231545
##  2 Prof  A                     43          43 Male  205500
##  3 Prof  A                     29           7 Male  204000
##  4 Prof  A                     42          18 Male  194800
##  5 Prof  B                     26          19 Male  193000
##  6 Prof  B                     49          60 Male  192253
##  7 Prof  B                     34          33 Male  189409
##  8 Prof  B                     56          49 Male  186960
##  9 Prof  A                     33          18 Male  186023
## 10 Prof  A                     39           9 Male  183800
## # … with 387 more rows
select(data, rank, discipline, yrs.since.phd, salary)
## # A tibble: 397 × 4
##    rank      discipline yrs.since.phd salary
##    <chr>     <chr>              <dbl>  <dbl>
##  1 Prof      B                     19 139750
##  2 Prof      B                     20 173200
##  3 AsstProf  B                      4  79750
##  4 Prof      B                     45 115000
##  5 Prof      B                     40 141500
##  6 AssocProf B                      6  97000
##  7 Prof      B                     30 175000
##  8 Prof      B                     45 147765
##  9 Prof      B                     21 119250
## 10 Prof      B                     18 129000
## # … with 387 more rows
data %>%
    # filter(salary < 100000 | salary > 300000) would instead keep only the out-of-range rows
    # Set salaries below 100,000 or above 300,000 to NA
    mutate(salary = ifelse(salary < 100000 | salary > 300000, NA, salary))
## # A tibble: 397 × 6
##    rank      discipline yrs.since.phd yrs.service sex    salary
##    <chr>     <chr>              <dbl>       <dbl> <chr>   <dbl>
##  1 Prof      B                     19          18 Male   139750
##  2 Prof      B                     20          16 Male   173200
##  3 AsstProf  B                      4           3 Male       NA
##  4 Prof      B                     45          39 Male   115000
##  5 Prof      B                     40          41 Male   141500
##  6 AssocProf B                      6           6 Male       NA
##  7 Prof      B                     30          23 Male   175000
##  8 Prof      B                     45          45 Male   147765
##  9 Prof      B                     21          20 Male   119250
## 10 Prof      B                     18          18 Female 129000
## # … with 387 more rows
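
To answer the question more directly, the high earners can be counted by rank. This is a minimal sketch rather than part of the original analysis: it treats a salary of at least 100,000 as "high paying" (an assumption borrowed from the cutoff used above), and `sample_rows` and `high_paid` are hypothetical names. The sample holds only the first ten rows printed in this report so that the example is self-contained.

```r
library(dplyr)

# First ten rows of the data as printed in this report (illustrative subset only)
sample_rows <- data.frame(
    rank = c("Prof", "Prof", "AsstProf", "Prof", "Prof",
             "AssocProf", "Prof", "Prof", "Prof", "Prof"),
    salary = c(139750, 173200, 79750, 115000, 141500,
               97000, 175000, 147765, 119250, 129000)
)

# Keep assumed "high paying" salaries, then count them by rank and compute each rank's share
high_paid <- sample_rows %>%
    filter(salary >= 100000) %>%
    count(rank, name = "n_high_paid") %>%
    mutate(share = n_high_paid / sum(n_high_paid))

high_paid
```

In this ten-row sample, all eight salaries of at least 100,000 belong to rank “Prof”. Running the same pipeline on `data` would give the counts for the full data set.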

Conclusion

In conclusion, the rank of “Prof” makes up the large majority of those with high-paying salaries. When the data are sorted by salary, all ten of the highest-paid people are professors. They are not concentrated in one discipline, however: five of the ten are in discipline A and five are in discipline B. Years of experience help increase salary, but they are not the whole story. Two of the ten highest earners have fewer than ten years of service (7 and 9), and in the first ten rows of the data the only two salaries below 100,000 belong to the assistant professor and the associate professor. Rank reflects the experience and level of responsibility that a person has built up in their discipline, which is likely why it goes together with higher pay. That is why people ranked as professor make up the vast majority of those with high-paying salaries.