GEOG5023 Exercise 1 – Li Xu

Q1. What are the column names? How many observations are there? How many variables?

The column names are listed below. There are 725 observations and 11 variables in total.

# Read the faculty salary data (comma-separated; first row holds the column names)
faculty2 <- read.csv("C:/Users/Li Xu/Documents/aaa/CU Boulder/GEOG5023/faculty.csv", 
    sep = ",", header = TRUE)
names(faculty2)
##  [1] "AYSALARY" "R1"       "R2"       "R7"       "PRIOREXP" "YRBG"    
##  [7] "YRRANK"   "TERMDEG"  "YRDG"     "EMINENT"  "FEMALE"
nrow(faculty2)
## [1] 725
ncol(faculty2)
## [1] 11
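
As a compact alternative to calling nrow() and ncol() separately, dim() returns both counts at once and str() lists each variable with its type. The sketch below runs on a small made-up data frame; the object `toy` and its values are illustrative assumptions, not the faculty data.

```r
# Toy data frame standing in for faculty2 (values are made up for illustration)
toy <- data.frame(AYSALARY = c(41000, 52500, 67000),
                  FEMALE   = c(0, 1, 0))

dims <- dim(toy)   # integer vector: c(number of rows, number of columns)
dims
str(toy)           # one line per variable: name, type, first few values
```

On the real data the equivalent calls would be dim(faculty2) and str(faculty2).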

Q2. Is annual salary normally distributed?

The distribution is roughly normal, apart from several very high values in the upper tail.

hist(faculty2$AYSALARY, breaks = seq(20000, 120000, by = 2000), main = "Faculty Annual Salary", 
    xlab = "Annual Salary (US Dollars)", col = "blue")

[Figure: histogram, "Faculty Annual Salary"]
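
A histogram gives only a visual impression, so a normal Q-Q plot and a Shapiro-Wilk test are common follow-up checks. The sketch below uses simulated right-skewed values; the vector `sim_salary` and its parameters are assumptions chosen for illustration and say nothing about the actual faculty salaries.

```r
# Simulated right-skewed salaries (an assumption for illustration only)
set.seed(1)
sim_salary <- rlnorm(725, meanlog = log(50000), sdlog = 0.25)

# Q-Q plot: points bending away from the reference line in the upper tail
# indicate more extreme high values than a normal distribution would produce
qqnorm(sim_salary, main = "Normal Q-Q plot (simulated salaries)")
qqline(sim_salary, col = "red")

# Shapiro-Wilk test: a small p-value is evidence against normality
sw <- shapiro.test(sim_salary)
sw$p.value
```

To apply the same checks to the exercise data, replace sim_salary with faculty2$AYSALARY.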

Q3. Does it appear that male and female faculty members make the same annual salary?

The boxplot below suggests that male faculty members earn more than female faculty members on average.

boxplot(faculty2$AYSALARY ~ faculty2$FEMALE, main = "Faculty Annual Salary by gender", 
    ylab = "Annual Salary (US Dollars)", xlab = "Gender (0=Male, 1=Female)", 
    col = rainbow(2))

[Figure: boxplot, "Faculty Annual Salary by gender"]
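
The visual comparison can be backed up with per-group summaries and a two-sample test. The sketch below is run on simulated data; the data frame `sim`, its group sizes, and its means are assumptions for illustration, not results from the faculty data.

```r
# Simulated data (illustrative assumption): 0 = male, 1 = female
set.seed(2)
sim <- data.frame(
  FEMALE   = rep(c(0, 1), times = c(60, 40)),
  AYSALARY = c(rnorm(60, mean = 55000, sd = 8000),
               rnorm(40, mean = 50000, sd = 8000))
)

# Median and mean salary per group
group_medians <- tapply(sim$AYSALARY, sim$FEMALE, median)
group_means   <- tapply(sim$AYSALARY, sim$FEMALE, mean)
group_medians
group_means

# Welch two-sample t-test for a difference in mean salary between groups
tt <- t.test(AYSALARY ~ FEMALE, data = sim)
tt$p.value
```

With the exercise data, the same calls would use faculty2 in place of sim.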

Q4. Does there appear to be a relationship between salary and the number of years of employment?

The plot below suggests a positive relationship between salary and the number of years of employment.

plot(faculty2$YRBG, faculty2$AYSALARY, main = "Faculty Annual Salary vs Time of Employment", 
    xlab = "Time of employment (years)", ylab = "Annual Salary (US Dollars)", 
    pch = 1, col = "blue")

[Figure: scatterplot, "Faculty Annual Salary vs Time of Employment"]
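
A scatterplot shows the direction of a relationship; a correlation coefficient and a fitted line put a number on it. The sketch below uses simulated values; the vectors `years` and `salary` and the coefficients used to generate them are assumptions for illustration, not estimates from the faculty data.

```r
# Simulated data (illustrative assumption, not the faculty data)
set.seed(3)
years  <- runif(100, min = 0, max = 35)
salary <- 40000 + 900 * years + rnorm(100, sd = 7000)

# Pearson correlation: the sign gives the direction, the magnitude the strength
r <- cor(years, salary)
r

# Simple linear regression; the slope estimates the salary change per extra year
fit <- lm(salary ~ years)
coef(fit)

plot(years, salary, xlab = "Years", ylab = "Salary")
abline(fit, col = "red")   # overlay the fitted line on the scatterplot
```

For the exercise data, the corresponding calls would be cor(faculty2$YRBG, faculty2$AYSALARY) and lm(AYSALARY ~ YRBG, data = faculty2).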

Q5. BONUS: Create a new variable combining R1, R2, and R3 into one categorical variable of rank. Does one category appear to have higher salaries?

Combining R1, R2, and R7 (the rank indicators present in the data; there is no R3 column) into one categorical variable lets us draw the boxplot below. Full professors have the highest salaries, and instructors/lecturers have the lowest.

# Build one rank label per faculty member from the 0/1 indicator columns
faculty2$POSITION[faculty2$R1 == 1] <- "FP"
faculty2$POSITION[faculty2$R2 == 1] <- "AP"
faculty2$POSITION[faculty2$R7 == 1] <- "IL"
# No indicator set: label as "Other" (the string "NA" is easily confused with a missing value)
faculty2$POSITION[faculty2$R1 == 0 & faculty2$R2 == 0 & faculty2$R7 == 0] <- "Other"

boxplot(faculty2$AYSALARY ~ faculty2$POSITION, main = "Faculty Annual Salary by position", 
    ylab = "Annual Salary (US Dollars)", xlab = "Position (FP=Full Professor, AP=Associate Professor, IL=Instructor/Lecturer, Other)", 
    col = rainbow(4))

[Figure: boxplot, "Faculty Annual Salary by position"]
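
One possible refinement is to store the combined rank as a factor with an explicit level order, so that boxplot() draws the categories in a chosen order rather than alphabetically. The sketch below uses a made-up four-row data frame; the object `toy`, the helper vector `rank_label`, and the fallback label "Other" are illustrative assumptions.

```r
# Toy indicator columns (made up); each row has at most one indicator set to 1
toy <- data.frame(R1 = c(1, 0, 0, 0),
                  R2 = c(0, 1, 0, 0),
                  R7 = c(0, 0, 1, 0))

# Start from the fallback label, then overwrite where an indicator is set
rank_label <- rep("Other", nrow(toy))
rank_label[toy$R1 == 1] <- "FP"
rank_label[toy$R2 == 1] <- "AP"
rank_label[toy$R7 == 1] <- "IL"

# The explicit level order controls the left-to-right order of boxes in boxplot()
toy$POSITION <- factor(rank_label, levels = c("FP", "AP", "IL", "Other"))
table(toy$POSITION)
```

Using a plain label such as "Other" for the fallback category also avoids confusion between the character string "NA" and R's missing value NA.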


Created by: Li Xu; Created on: 01/22/2013; Updated on: 01/26/2013