Brian Mochtyak

Question 1

Combining the two seems appropriate because salary and bonus are usually traded off in negotiation: employees, including CEOs, will often accept a lower salary in exchange for a higher bonus, and vice versa. Summing the two variables to get total compensation therefore seems reasonable.

Question 2

For my model I chose four predictors: number of years at the firm, industry, compensation for the past five years, and the company's sales.

library(car)  # provides vif()
fullmod3 <- lm(compensation ~ YearsFirm + Industry + Compfor5Yrs + Sales, data = ceo_df2)
vif(fullmod3)
##                 GVIF Df GVIF^(1/(2*Df))
## YearsFirm   1.206710  1        1.098503
## Industry    2.065558 48        1.007585
## Compfor5Yrs 1.219638  1        1.104372
## Sales       1.510742  1        1.229123

Question 3

The relationship looks linear apart from a few outliers. Industry has the highest GVIF, about 2.07, but its degrees-of-freedom-adjusted value, GVIF^(1/(2*Df)), is only about 1.01, so it doesn't suggest multicollinearity; it is just something we need to keep an eye on.
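For a predictor with a single coefficient, the VIF is 1 / (1 - R^2), where R^2 comes from regressing that predictor on the other predictors. The sketch below computes this by hand in base R. It uses simulated stand-in data, not ceo_df2, and the helper name manual_vif is made up for illustration.

```r
# Simulated stand-in data; the real ceo_df2 is not reproduced here.
set.seed(1)
n <- 200
toy <- data.frame(
  YearsFirm   = rnorm(n, mean = 10, sd = 5),
  Compfor5Yrs = rnorm(n, mean = 5000, sd = 1500)
)
# Make Sales mildly correlated with Compfor5Yrs so the VIFs are not all 1.
toy$Sales <- 3000 + 0.5 * toy$Compfor5Yrs + rnorm(n, mean = 0, sd = 1000)

# Hypothetical helper: VIF of one predictor from its auxiliary regression.
manual_vif <- function(var, data) {
  others <- setdiff(names(data), var)
  aux <- lm(reformulate(others, response = var), data = data)
  1 / (1 - summary(aux)$r.squared)
}

vifs <- sapply(names(toy), manual_vif, data = toy)
print(round(vifs, 2))
```

Values close to 1 mean a predictor shares little variance with the others. For a multi-level factor such as Industry, squaring GVIF^(1/(2*Df)) gives a number on the same scale as an ordinary VIF.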

Question 4

As I stated in the question above, linearity looks fine apart from a few outliers, and the same applies to heteroskedasticity.
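A visual check can be backed up with a formal test. The sketch below is a studentized (Koenker) Breusch-Pagan test written in base R: the squared residuals are regressed on the predictors, and n times the R^2 of that regression is compared to a chi-squared distribution. The data are simulated and the coefficients are arbitrary, so the result says nothing about the CEO model itself.

```r
# Simulated stand-in data with constant error variance.
set.seed(2)
n <- 200
toy <- data.frame(Sales = runif(n, 1, 10), YearsFirm = runif(n, 0, 30))
toy$compensation <- 500 + 120 * toy$Sales + 10 * toy$YearsFirm +
  rnorm(n, mean = 0, sd = 100)

fit <- lm(compensation ~ Sales + YearsFirm, data = toy)

# Auxiliary regression of squared residuals on the same predictors.
toy$sq_resid <- resid(fit)^2
aux <- lm(sq_resid ~ Sales + YearsFirm, data = toy)

# Test statistic n * R^2, chi-squared with df = number of predictors.
bp_stat <- n * summary(aux)$r.squared
bp_p <- pchisq(bp_stat, df = 2, lower.tail = FALSE)
print(c(statistic = bp_stat, p_value = bp_p))
```

A small p-value would point toward heteroskedasticity. For the visual version, plot(fit, which = 1) draws residuals against fitted values and plot(fit, which = 3) draws the scale-location plot.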

Question 5A

Based on the change in adjusted R-squared when each variable is removed, it seems that the industry variable had the biggest impact on the adjusted R-squared.
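This comparison can be sketched as a drop-one loop: refit the model without each term and record how far the adjusted R-squared falls. The data below are simulated and deliberately built so that Industry matters most, so the ranking is an artifact of the simulation and not evidence about ceo_df2. The industry labels and effect sizes are made up.

```r
# Simulated stand-in data; industry labels and shifts are hypothetical.
set.seed(3)
n <- 300
toy <- data.frame(
  YearsFirm = runif(n, 0, 30),
  Industry  = factor(sample(c("Bank", "Energy", "Retail", "Tech"), n, replace = TRUE)),
  Sales     = runif(n, 1, 10)
)
industry_shift <- c(Bank = 0, Energy = 300, Retail = -200, Tech = 500)
toy$compensation <- 1000 + 5 * toy$YearsFirm + 40 * toy$Sales +
  unname(industry_shift[as.character(toy$Industry)]) +
  rnorm(n, mean = 0, sd = 150)

full <- lm(compensation ~ YearsFirm + Industry + Sales, data = toy)
full_adj <- summary(full)$adj.r.squared

# Refit without each term and record the fall in adjusted R-squared.
model_terms <- attr(terms(full), "term.labels")
drop_in_adj_r2 <- sapply(model_terms, function(term) {
  reduced <- lm(reformulate(setdiff(model_terms, term), response = "compensation"),
                data = toy)
  full_adj - summary(reduced)$adj.r.squared
})
print(round(sort(drop_in_adj_r2, decreasing = TRUE), 3))
```

The term with the largest drop contributes the most explanatory power given the other terms in the model.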

Question 5B

The model accounts for about 39% of the variability in CEO compensation.

Question 5C

It seems that R handles missing data by just reporting NA for all the variables, and no, this does not seem appropriate.
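One way to check how a fit treated missing values is to compare the number of rows supplied with the number actually used. Under R's default setting, na.action = "na.omit", lm() drops any row that has an NA in a model variable. The sketch below shows this on simulated data; the variable names are placeholders.

```r
# Simulated data with three predictor values knocked out.
set.seed(4)
toy <- data.frame(x = rnorm(50))
toy$y <- 2 * toy$x + rnorm(50)
toy$x[c(3, 17, 40)] <- NA

fit <- lm(y ~ x, data = toy)

rows_supplied <- nrow(toy)               # rows passed to lm()
rows_used     <- nobs(fit)               # rows that entered the fit
rows_dropped  <- length(na.action(fit))  # rows removed because of NA

print(getOption("na.action"))
print(c(supplied = rows_supplied, used = rows_used, dropped = rows_dropped))
```

If rows_used is much smaller than rows_supplied, the estimates describe only the complete cases, which is worth reporting alongside the model.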

Question 5D

Other variables I would find important are the number of employees, the company's valuation or stock price, and other financial information such as debt, capital investments, etc.