Key Concepts:
1.
Part 1: Import the faculty.csv file into R. What are the column names? How many observations are there? How many variables?
# Import the data
faculty <- read.csv("E:/Quant/inClassExercises/InClassExerciseData/faculty.csv")
# Get the column names
names(faculty)
## [1] "AYSALARY" "R1" "R2" "R7" "PRIOREXP" "YRBG"
## [7] "YRRANK" "TERMDEG" "YRDG" "EMINENT" "FEMALE"
# How many observations and variables? names() returned 11 column names, so
# there are 11 variables. length() on a data frame counts its columns, which
# confirms this.
length(faculty)
## [1] 11
# Each column holds one value per observation, so the length of any single
# column gives the number of observations.
length(faculty$AYSALARY)
## [1] 725
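Both counts can also be read off in one step. The following is a minimal sketch, assuming a small made-up data frame `toy` that stands in for `faculty`; its values are placeholders, not the real data.

```r
# Hypothetical stand-in for the faculty data; values are made up for illustration
toy <- data.frame(
  AYSALARY = c(52000, 61000, 48000, 75000),
  FEMALE   = c(0, 1, 1, 0)
)

# dim() returns c(rows, columns), i.e. observations and variables together
toy_dim <- dim(toy)

# nrow() and ncol() return each count separately
n_obs  <- nrow(toy)
n_vars <- ncol(toy)

# str() prints the type and the first few values of every column
str(toy)
```

Applied to the real data, `dim(faculty)` should agree with the 725 observations and 11 variables found above.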
Part 2: Is annual salary normally distributed?
# A histogram shows whether the distribution looks roughly symmetric and
# bell-shaped, as a normal distribution would
hist(faculty$AYSALARY)
# A boxplot makes skew and outliers easier to spot
boxplot(faculty$AYSALARY)
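A normal Q-Q plot and a Shapiro-Wilk test are two further checks that could complement the histogram and boxplot. This is only a sketch: `salary` holds simulated right-skewed values standing in for `faculty$AYSALARY`, not the real data.

```r
# Simulated right-skewed values standing in for faculty$AYSALARY (not the real data)
set.seed(1)
salary <- rlnorm(200, meanlog = 11, sdlog = 0.4)

# Normal Q-Q plot: points that bend away from the reference line suggest
# a departure from normality
qqnorm(salary)
qqline(salary)

# Shapiro-Wilk test: a small p-value is evidence against normality
sw <- shapiro.test(salary)
sw$p.value
```

With the real data, the same calls would take `faculty$AYSALARY` in place of `salary`.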
Part 3: Does it appear that male and female faculty members make the same annual salary?
# To answer this, we can plot 2 side-by-side boxplots and compare them.
boxplot(faculty$AYSALARY ~ faculty$FEMALE,
        main = "Annual salary of men and women",
        ylab = "Annual Salary", xlab = "Sex \n (0 = men; 1 = women)")
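The visual comparison can be backed by a numeric summary per group. A minimal sketch, assuming a made-up data frame `toy` with the same column names as `faculty` (the values are placeholders):

```r
# Made-up values standing in for the faculty data (illustration only)
toy <- data.frame(
  AYSALARY = c(50000, 62000, 58000, 47000, 71000, 53000),
  FEMALE   = c(0, 0, 0, 1, 1, 1)
)

# Median salary within each level of FEMALE (0 = men, 1 = women)
med_by_sex <- tapply(toy$AYSALARY, toy$FEMALE, median)
med_by_sex

# Number of faculty in each group, to judge how much weight each median deserves
table(toy$FEMALE)
```

The median is used here because it matches the center line drawn in each boxplot.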
Part 4: Does there appear to be a relationship between salary and the number of years of employment?
# We can explore relationships using scatterplots, with years on the x-axis
# and salary on the y-axis. It is unclear which YEAR column measures years of
# employment, so we plot all three candidates.
plot(faculty$YRBG, faculty$AYSALARY)
plot(faculty$YRRANK, faculty$AYSALARY)
plot(faculty$YRDG, faculty$AYSALARY)
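A correlation coefficient and a fitted line can help judge what the scatterplots show. A minimal sketch under stated assumptions: `toy` is a made-up data frame, and `YEARS` is a hypothetical column name standing in for whichever year column turns out to be appropriate.

```r
# Made-up values standing in for the faculty data (illustration only)
toy <- data.frame(
  YEARS    = c(1, 3, 5, 8, 12, 20),   # hypothetical years-of-employment column
  AYSALARY = c(45000, 48000, 52000, 60000, 66000, 80000)
)

# Pearson correlation: values near +1 or -1 indicate a strong linear relationship
r <- cor(toy$YEARS, toy$AYSALARY)

# Fit a straight line with salary as the response and years as the predictor
fit <- lm(AYSALARY ~ YEARS, data = toy)

# Scatterplot with the fitted line overlaid
plot(toy$YEARS, toy$AYSALARY, xlab = "Years", ylab = "Annual Salary")
abline(fit)
```

Putting years on the x-axis treats salary as the outcome, which matches how the question is phrased.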
BONUS: Create a new variable combining R1, R2, and R7 into one categorical variable of rank. Does one category appear to have higher salaries?
# Combine the indicator columns into a single new field:
# 1 if R1 is set, 2 if R2 is set, 7 if R7 is set, and 99 otherwise
faculty$Ranks <- ifelse(faculty$R1 == 1, 1,
                 ifelse(faculty$R2 == 1, 2,
                 ifelse(faculty$R7 == 1, 7, 99)))
# Create one boxplot of salary for each category in the new Ranks field.
boxplot(faculty$AYSALARY ~ faculty$Ranks)
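Storing the combined codes as a factor makes the categories explicit and allows a per-category summary alongside the boxplots. A minimal sketch, assuming a made-up data frame `toy`; the labels "R1", "R2", and "R7" simply echo the source column names and say nothing about what each rank means.

```r
# Made-up values standing in for the faculty data (illustration only)
toy <- data.frame(
  AYSALARY = c(45000, 50000, 60000, 64000, 82000, 90000),
  Ranks    = c(1, 1, 2, 2, 7, 7)
)

# Treat the numeric codes as categories; labels echo the source column names
toy$Ranks <- factor(toy$Ranks, levels = c(1, 2, 7), labels = c("R1", "R2", "R7"))

# Median salary per rank category
med_by_rank <- tapply(toy$AYSALARY, toy$Ranks, median)
med_by_rank
```

Any rows coded 99 in the real data would become `NA` under these levels, which is one way to notice faculty who fall in none of the three categories.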