December 9, 2015

Boxplot of Academic Year Salary by Sex

Boxplot of Academic Year Salary by Highest Degree

Scatterplot of Academic Year Salary by Number of Years Since Highest Degree was Earned

Scatterplot of Academic Year Salary by Number of Years in Current Rank

Academic Year Salary by Number of Years Since Highest Degree was Earned, with 95% confidence interval

Scatterplot of Academic Year Salary by Number of Years in Current Rank, grouped by Academic Rank

Multiple Linear Regression

I conducted a multiple linear regression with sl as the dependent variable and sx, yr, dg, yd, and a recoded rk variable (TTP$prof) as independent variables.

## 
## Call:
## lm(formula = sl ~ sx + yr + dg + yd + TTP$prof, data = TTP)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -6066.3 -1719.5  -452.5   957.8  9826.7 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 31474.72    3295.63   9.550 1.72e-12 ***
## sx           -547.47    1018.44  -0.538  0.59347    
## yr            356.25     109.64   3.249  0.00216 ** 
## dg           -559.33    1204.37  -0.464  0.64454    
## yd             77.37      76.84   1.007  0.31930    
## TTP$prof    -6856.45    1186.70  -5.778 6.23e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2880 on 46 degrees of freedom
## Multiple R-squared:  0.7863, Adjusted R-squared:  0.763 
## F-statistic: 33.84 on 5 and 46 DF,  p-value: 2.461e-14

Null Hypothesis

The null hypothesis is that the dependent variable, sl, is not related to the entire set of independent variables (sx, yr, dg, yd, and the recoded rk variable, TTP$prof).

Multiple R-squared: 0.7863, Adjusted R-squared: 0.763

F-statistic: 33.84 on 5 and 46 DF, p-value: 2.461e-14

If \(\alpha\) = .05, then the p-value, 2.461e-14, is less than \(\alpha\). Therefore, I reject the null hypothesis that there is no relationship between the dependent variable and the entire set of independent variables.
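As a check on the overall test, the F statistic and its p-value can be recovered from the summary values. The sketch below assumes only the numbers printed in the regression output (R-squared 0.7863, 5 predictors, 46 residual degrees of freedom); the variable names are illustrative and were not part of the original analysis.

```r
# Values copied from the summary() output of the full model
r2       <- 0.7863   # Multiple R-squared
k        <- 5        # number of predictors
df_resid <- 46       # residual degrees of freedom

# Overall F statistic implied by R-squared (differs slightly due to rounding)
f_from_r2 <- (r2 / k) / ((1 - r2) / df_resid)

# Upper-tail p-value for the reported F statistic
p_overall <- pf(33.84, df1 = k, df2 = df_resid, lower.tail = FALSE)

print(round(f_from_r2, 2))
print(p_overall)
```

The value computed from the rounded R-squared should agree with the reported 33.84 to within rounding error.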

Null Hypothesis

The null hypothesis is that the dependent variable, sl, is not related to the independent variable, sx, once the other independent variables are in the model. I examine the regression coefficient for one independent variable, sx:

     Estimate  Std. Error  t value  Pr(>|t|)
sx    -547.47     1018.44   -0.538   0.59347

Again, if \(\alpha\) = .05, then the p-value, 0.59347, is greater than \(\alpha\). Therefore, I would fail to reject the null hypothesis that there is no relationship between sl and sx.
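The t value and p-value for sx follow directly from the estimate and its standard error. This is a minimal sketch using only the rounded values reported in the coefficient table; the object names are illustrative.

```r
# Values copied from the coefficient table for sx
est      <- -547.47
se       <- 1018.44
df_resid <- 46        # residual degrees of freedom of the full model

# t statistic: estimate divided by its standard error
t_sx <- est / se

# Two-sided p-value from the t distribution
p_sx <- 2 * pt(abs(t_sx), df = df_resid, lower.tail = FALSE)

print(round(t_sx, 3))
print(round(p_sx, 5))
```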

Confidence Interval

confint (TTP_prob)
##                   2.5 %     97.5 %
## (Intercept) 24840.96576 38108.4788
## sx          -2597.47771  1502.5290
## yr            135.56889   576.9402
## dg          -2983.60125  1864.9356
## yd            -77.31372   232.0466
## TTP$prof    -9245.14390 -4467.7541

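I examined the 95% confidence interval for the sx coefficient: the 2.5% bound is -2597.48 and the 97.5% bound is 1502.53.

I am 95% confident that the difference in sl associated with a one-unit change in sx, holding the other independent variables constant, is between -2597.48 and 1502.53. Because this interval contains 0, it agrees with the decision not to reject the null hypothesis for sx.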
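The interval for sx can be reproduced by hand as the estimate plus or minus the critical t value times the standard error. The sketch below uses the rounded estimate and standard error from the coefficient table, so the last digits may differ slightly from the confint output; the object names are illustrative.

```r
# Values copied from the coefficient table for sx
est      <- -547.47
se       <- 1018.44
df_resid <- 46

# Critical value for a two-sided 95% interval
t_crit <- qt(0.975, df = df_resid)

# Lower and upper bounds of the interval
ci_sx <- est + c(-1, 1) * t_crit * se

print(round(ci_sx, 2))
```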

Computation and Report of New Regression

## 
## Call:
## lm(formula = sl ~ sx, data = TTP)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -8602.8 -4296.6  -100.8  3513.1 16687.9 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    24697        938  26.330   <2e-16 ***
## sx             -3340       1808  -1.847   0.0706 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5782 on 50 degrees of freedom
## Multiple R-squared:  0.0639, Adjusted R-squared:  0.04518 
## F-statistic: 3.413 on 1 and 50 DF,  p-value: 0.0706

Compute the t-test of the difference in mean sl by sx

## 
##  Two Sample t-test
## 
## data:  sl by sx
## t = 1.8474, df = 50, p-value = 0.0706
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -291.257 6970.550
## sample estimates:
## mean in group 0 mean in group 1 
##        24696.79        21357.14

If \(\alpha\) = .05, then the p-value, 0.0706, is greater than \(\alpha\). Therefore, I would fail to reject the null hypothesis that there is no relationship between the dependent variable, sl, and the independent variable, sx.

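Comparison of t-test and regression

I ran the t-test and the regression analysis with sl as the dependent variable and sx as the independent variable. Both tests produce the same p-value, 0.0706, and the same t statistic in absolute value (1.847). The 95% confidence intervals also agree once the sign is accounted for: the t-test reports the difference as the mean of group 0 minus the mean of group 1 (24696.79 - 21357.14, about 3340), while the regression coefficient for sx is the mean of group 1 minus the mean of group 0 (-3340), so the regression interval is the t-test interval with the signs reversed.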
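The agreement between the two methods holds in general when the t-test assumes equal variances. The sketch below illustrates this on simulated data, not the TTP data set; the group sizes, means, and standard deviation are arbitrary assumptions chosen only for the demonstration.

```r
set.seed(1)  # simulated data only; not the TTP data set

# Hypothetical data: a binary indicator and a numeric outcome
sx <- rep(c(0, 1), times = c(30, 22))
sl <- 24000 - 3000 * sx + rnorm(length(sx), sd = 5000)
d  <- data.frame(sl = sl, sx = sx)

# Regression on the binary indicator and the pooled-variance t-test
fit <- lm(sl ~ sx, data = d)
tt  <- t.test(sl ~ sx, data = d, var.equal = TRUE)

# p-values from the two approaches
p_reg <- coef(summary(fit))["sx", "Pr(>|t|)"]
p_tt  <- tt$p.value

# Confidence intervals: the t-test uses group 0 minus group 1,
# so its interval is the regression interval with signs reversed
ci_reg <- unname(confint(fit)["sx", ])
ci_tt  <- as.numeric(tt$conf.int)

print(all.equal(p_reg, p_tt))
print(all.equal(ci_reg, -rev(ci_tt)))
```

Setting var.equal = TRUE matters here: the default Welch test does not pool the variances and would generally give a slightly different p-value and interval than the regression.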