Main Article

From a young age we’re told to do what makes us happy. So why shouldn’t that apply to college? The answer is that it should. Of course we should go to the college that makes us happy. The only problem is, how do you find the place that will be best for four years of your life? Many start their search geographically. Do I want to stay close to home? Is there a place for me out West? Another big factor is the type of school. Do I want to go to a big party school? How about a small liberal arts college? Maybe you decide to apply to liberal arts schools out West, and what do you know! You got into Claremont McKenna and it’s a no-brainer. You’re going. Congratulations! But what does that mean for life after college? By analyzing data on median salaries by region and type of college, we attempted to determine how these two commonly considered attributes correlate with students’ potential earnings.

We began with two datasets. The first compared school type with starting and mid-career median salary, and the second compared school region with starting and mid-career median salary. We joined these dataframes by school name, which produced a table of 268 entries with information on school type, region, starting median salary, and mid-career median salary. To this table we added a column that compares median starting salary to median mid-career salary for each school by giving the increase in pay as a percentage of the starting salary. The result is a table containing all the information we needed to begin our analysis.

| school_name | school_type | region | starting_median_salary | mid_career_median_salary | pchange_start_mid |
| --- | --- | --- | --- | --- | --- |
| University of Pennsylvania | Ivy League | Northeastern | 60900 | 120000 | 97.044 |
| University of Delaware | State | Southern | 45900 | 84500 | 84.096 |
| Oklahoma State University | State | Southern | 42800 | 80700 | 88.551 |
| Black Hills State University | State | Midwestern | 35300 | 43900 | 24.363 |
| University of Alabama at Huntsville (UAH) | State | Southern | 43100 | 82700 | 91.879 |
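The join and the percent-change column described above can be sketched in a few lines. This is only an illustration, not the code used for the analysis: the dictionary layout and the names `salaries_by_type`, `region_by_school`, `percent_change`, and `join_by_school` are assumptions made for the example, and the values are copied from three of the sample rows shown here.

```python
# Stand-in for the "school type" dataset, keyed by school name.
# Layout and names are hypothetical; values come from the sample rows above.
salaries_by_type = {
    "University of Pennsylvania": {"school_type": "Ivy League", "start": 60900, "mid": 120000},
    "University of Delaware": {"school_type": "State", "start": 45900, "mid": 84500},
    "Black Hills State University": {"school_type": "State", "start": 35300, "mid": 43900},
}

# Stand-in for the "region" dataset, keyed by the same school names.
region_by_school = {
    "University of Pennsylvania": "Northeastern",
    "University of Delaware": "Southern",
    "Black Hills State University": "Midwestern",
}


def percent_change(start, mid):
    """Increase from starting to mid-career salary, as a percentage of the starting salary."""
    return (mid - start) / start * 100


def join_by_school(by_type, by_region):
    """Inner join on school name: keep only schools that appear in both tables."""
    rows = []
    for name, info in by_type.items():
        if name not in by_region:
            continue
        rows.append({
            "school_name": name,
            "school_type": info["school_type"],
            "region": by_region[name],
            "starting_median_salary": info["start"],
            "mid_career_median_salary": info["mid"],
            "pchange_start_mid": percent_change(info["start"], info["mid"]),
        })
    return rows


joined = join_by_school(salaries_by_type, region_by_school)
for row in joined:
    print(row["school_name"], round(row["pchange_start_mid"], 3))
```

For the University of Pennsylvania, the calculation is (120000 - 60900) / 60900 × 100, which rounds to the 97.044 shown in the table.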

We first looked at the relationship between school type and starting median salary:

We can see right away that graduates of Engineering and Ivy League schools have significantly higher starting median salaries. So much so that the middle 50% of Engineering and Ivy League schools doesn’t even overlap with the middle 50% of the next-highest school type (Liberal Arts). The regression table for this data is:

| term | estimate | std.error | statistic | p.value |
| --- | --- | --- | --- | --- |
| (Intercept) | 59411.111 | 1072.400 | 55.400 | 0.000 |
| school_typeIvy League | 1063.889 | 1933.296 | 0.550 | 0.583 |
| school_typeLiberal Arts | -13664.303 | 1261.143 | -10.835 | 0.000 |
| school_typeParty | -13696.111 | 1478.201 | -9.265 | 0.000 |
| school_typeState | -15284.825 | 1126.202 | -13.572 | 0.000 |

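“Engineering” is used as the baseline for comparison, so it is represented by the “(Intercept)” row. Because a regression on a categorical variable estimates group averages, the intercept tells us that the average starting median salary across Engineering schools is 59411.11. Each subsequent row gives the difference between this value and the average starting median salary for that type of school. For example, Ivy League schools average 59411.11 + 1063.89 = 60475.00, a gap too small to be statistically significant (p = 0.583), while Liberal Arts, Party, and State schools all sit more than 13,000 below the baseline.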
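To make the baseline-plus-difference reading concrete, here is a minimal sketch that turns the coefficients in the table back into one average per school type. The variable names are ours and purely illustrative; the numbers are copied from the table.

```python
# Intercept = average starting median salary for the baseline type (Engineering).
baseline_type = "Engineering"
intercept = 59411.111

# Each remaining coefficient is an offset from the Engineering baseline.
offsets = {
    "Ivy League": 1063.889,
    "Liberal Arts": -13664.303,
    "Party": -13696.111,
    "State": -15284.825,
}

# Recover the average for every school type: baseline plus that type's offset.
average_by_type = {baseline_type: intercept}
for school_type, offset in offsets.items():
    average_by_type[school_type] = intercept + offset

# Print from highest to lowest average starting median salary.
for school_type, value in sorted(average_by_type.items(), key=lambda kv: -kv[1]):
    print(f"{school_type:13s} {value:10.2f}")
```

Read this way, the table is just a compact listing of five group averages, with Engineering and Ivy League at the top and the other three types clustered roughly 14,000 to 15,000 lower.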

Next we looked at the relationship between region and starting median salary:

Here we see that no single region stands out the way Engineering and Ivy League schools did among school types, although California and the Northeast sit somewhat above the other three regions. It is interesting to note that the large spread of the Northeastern region is likely due to the Ivy League schools, which are found only in the Northeast and raise the value of the 3rd quartile. The regression table for this data is:

| term | estimate | std.error | statistic | p.value |
| --- | --- | --- | --- | --- |
| (Intercept) | 50155.556 | 1133.384 | 44.253 | 0.000 |
| regionMidwestern | -6353.993 | 1351.475 | -4.702 | 0.000 |
| regionNortheastern | -888.413 | 1334.179 | -0.666 | 0.506 |
| regionSouthern | -5867.320 | 1339.629 | -4.380 | 0.000 |
| regionWestern | -6004.274 | 1474.405 | -4.072 | 0.000 |

“California” is used as the baseline for comparison here, so it is represented by the “(Intercept)” row. The intercept tells us that the average starting median salary across schools in California is 50155.56. Each subsequent row gives the difference between this value and the average starting median salary for that region. The Northeastern difference (about 888 lower) is not statistically significant, while the Midwestern, Southern, and Western regions are each roughly 6,000 lower.
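For readers less familiar with regression output, the "statistic" column is the estimate divided by its standard error, and the p-value is derived from that ratio. The short sketch below recomputes the ratio from the rounded numbers in the region table; the names `rows` and `t_statistic` are illustrative only.

```python
# term -> (estimate, std.error, statistic as reported), copied from the region table.
rows = {
    "(Intercept)": (50155.556, 1133.384, 44.253),
    "regionMidwestern": (-6353.993, 1351.475, -4.702),
    "regionNortheastern": (-888.413, 1334.179, -0.666),
    "regionSouthern": (-5867.320, 1339.629, -4.380),
    "regionWestern": (-6004.274, 1474.405, -4.072),
}


def t_statistic(estimate, std_error):
    """How many standard errors the estimate lies away from zero."""
    return estimate / std_error


recomputed = {term: t_statistic(est, se) for term, (est, se, _) in rows.items()}

for term, (_, _, reported) in rows.items():
    print(f"{term:20s} recomputed={recomputed[term]:8.3f} reported={reported:8.3f}")
```

A ratio close to zero, as for the Northeastern row, means the difference from California is small relative to its uncertainty, which is why that row has a large p-value while the other regions do not.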

After looking at region and school type separately, we decided to look at the two together in a multiple regression. Below is the faceted boxplot of starting median salary by school type within each region:

Here we see that the highest median starting salaries belong to Engineering schools in California. On the lower side we see Midwestern and Western Liberal Arts schools along with Midwestern and Southern State schools. The regression table for this model is:

| term | estimate | std.error | statistic | p.value |
| --- | --- | --- | --- | --- |
| (Intercept) | 73650.000 | 2874.115 | 25.625 | 0.000 |
| regionMidwestern | -17750.000 | 4064.613 | -4.367 | 0.000 |
| regionNortheastern | -13294.444 | 3177.454 | -4.184 | 0.000 |
| regionSouthern | -20983.333 | 3710.467 | -5.655 | 0.000 |
| regionWestern | -19100.000 | 4064.613 | -4.699 | 0.000 |
| school_typeIvy League | 119.444 | 1975.047 | 0.060 | 0.952 |
| school_typeLiberal Arts | -26316.667 | 3710.467 | -7.093 | 0.000 |
| school_typeParty | -23150.000 | 4978.113 | -4.650 | 0.000 |
| school_typeState | -25345.238 | 3007.866 | -8.426 | 0.000 |
| regionMidwestern:school_typeLiberal Arts | 13654.167 | 4908.486 | 2.782 | 0.006 |
| regionNortheastern:school_typeLiberal Arts | 13233.111 | 4032.875 | 3.281 | 0.001 |
| regionSouthern:school_typeLiberal Arts | 21375.000 | 4837.857 | 4.418 | 0.000 |
| regionWestern:school_typeLiberal Arts | 13123.810 | 4938.447 | 2.657 | 0.008 |
| regionMidwestern:school_typeParty | 13775.000 | 6096.919 | 2.259 | 0.025 |
| regionNortheastern:school_typeParty | 8194.444 | 5667.831 | 1.446 | 0.150 |
| regionSouthern:school_typeParty | 15401.515 | 5638.311 | 2.732 | 0.007 |
| regionWestern:school_typeParty | 16000.000 | 7040.115 | 2.273 | 0.024 |
| regionMidwestern:school_typeState | 12635.238 | 4199.787 | 3.009 | 0.003 |
| regionNortheastern:school_typeState | 9137.683 | 3397.613 | 2.689 | 0.008 |
| regionSouthern:school_typeState | 16050.571 | 3858.070 | 4.160 | 0.000 |
| regionWestern:school_typeState | 14791.790 | 4228.178 | 3.498 | 0.001 |

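“California Engineering schools” are used as the base case, so for this table the “(Intercept)” row represents the average starting median salary for these schools, which is 73650. The “region” rows give the difference from this baseline for Engineering schools in other regions, and the “school_type” rows give the difference for other types of schools within California. The remaining rows, which pair a region with a school type, are interaction terms: they adjust the sum of the region and school type differences for that specific combination. The estimate for any combination is therefore the intercept plus its region row, its school type row, and its interaction row.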
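Because the interaction table is harder to read than the single-variable ones, the following sketch shows how the rows combine into an estimate for one region and school type. It is a reading aid under our own naming (`predicted_starting_salary` and the dictionaries are hypothetical), using the coefficients exactly as printed in the table. Ivy League has no interaction rows because those schools appear only in the Northeast, so the sketch refuses other Ivy League combinations rather than inventing a number.

```python
INTERCEPT = 73650.000  # California Engineering schools (the base case)

REGION = {
    "California": 0.0,
    "Midwestern": -17750.000,
    "Northeastern": -13294.444,
    "Southern": -20983.333,
    "Western": -19100.000,
}

SCHOOL_TYPE = {
    "Engineering": 0.0,
    "Ivy League": 119.444,
    "Liberal Arts": -26316.667,
    "Party": -23150.000,
    "State": -25345.238,
}

# Interaction terms; combinations involving a baseline level contribute 0.
INTERACTION = {
    ("Midwestern", "Liberal Arts"): 13654.167,
    ("Northeastern", "Liberal Arts"): 13233.111,
    ("Southern", "Liberal Arts"): 21375.000,
    ("Western", "Liberal Arts"): 13123.810,
    ("Midwestern", "Party"): 13775.000,
    ("Northeastern", "Party"): 8194.444,
    ("Southern", "Party"): 15401.515,
    ("Western", "Party"): 16000.000,
    ("Midwestern", "State"): 12635.238,
    ("Northeastern", "State"): 9137.683,
    ("Southern", "State"): 16050.571,
    ("Western", "State"): 14791.790,
}


def predicted_starting_salary(region, school_type):
    """Intercept + region row + school type row + interaction row (if any)."""
    if school_type == "Ivy League" and region != "Northeastern":
        # The article notes Ivy League schools exist only in the Northeast.
        raise ValueError("no Ivy League schools outside the Northeast in this data")
    return (
        INTERCEPT
        + REGION[region]
        + SCHOOL_TYPE[school_type]
        + INTERACTION.get((region, school_type), 0.0)
    )


for combo in [("California", "Engineering"), ("Southern", "State"), ("Northeastern", "Ivy League")]:
    print(combo, round(predicted_starting_salary(*combo), 2))
```

For Southern State schools, for instance, the sum is 73650 - 20983.333 - 25345.238 + 16050.571, or about 43372, far below the California Engineering baseline. The function returns a number for any region and type present in the dictionaries, so it says nothing about whether the data actually contains schools for a given combination.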

The data we have seen so far is all based on the median salaries of graduates at the start of their careers. However, it is important to note that different careers have different rates of salary growth. Graduates working in specialized fields may start off earning upwards of fifty thousand dollars, but what if the median salary cap in their field is sixty thousand dollars? To account for this, we decided to take a look at the percent increase from median starting salary to median mid-career salary:

As we can see, starting salary is not a great indicator of mid-career salary. While Engineering and Ivy League schools may give you a head start, salary growth is dominated by Liberal Arts schools. The regression table for this data gives estimates of how school type and region relate to salary growth:

| term | estimate | std.error | statistic | p.value |
| --- | --- | --- | --- | --- |
| (Intercept) | 66.415 | 8.738 | 7.601 | 0.000 |
| regionMidwestern | 4.598 | 12.357 | 0.372 | 0.710 |
| regionNortheastern | 13.112 | 9.660 | 1.357 | 0.176 |
| regionSouthern | 11.350 | 11.280 | 1.006 | 0.315 |
| regionWestern | 16.375 | 12.357 | 1.325 | 0.186 |
| school_typeIvy League | 19.268 | 6.004 | 3.209 | 0.002 |
| school_typeLiberal Arts | 24.512 | 11.280 | 2.173 | 0.031 |
| school_typeParty | 21.704 | 15.134 | 1.434 | 0.153 |
| school_typeState | 16.833 | 9.144 | 1.841 | 0.067 |
| regionMidwestern:school_typeLiberal Arts | 0.403 | 14.922 | 0.027 | 0.978 |
| regionNortheastern:school_typeLiberal Arts | -6.924 | 12.260 | -0.565 | 0.573 |
| regionSouthern:school_typeLiberal Arts | 0.353 | 14.708 | 0.024 | 0.981 |
| regionWestern:school_typeLiberal Arts | -23.688 | 15.013 | -1.578 | 0.116 |
| regionMidwestern:school_typeParty | -11.537 | 18.535 | -0.622 | 0.534 |
| regionNortheastern:school_typeParty | -12.479 | 17.231 | -0.724 | 0.470 |
| regionSouthern:school_typeParty | -12.992 | 17.141 | -0.758 | 0.449 |
| regionWestern:school_typeParty | -27.068 | 21.403 | -1.265 | 0.207 |
| regionMidwestern:school_typeState | -13.955 | 12.768 | -1.093 | 0.275 |
| regionNortheastern:school_typeState | -13.015 | 10.329 | -1.260 | 0.209 |
| regionSouthern:school_typeState | -16.375 | 11.729 | -1.396 | 0.164 |
| regionWestern:school_typeState | -24.790 | 12.854 | -1.929 | 0.055 |

Our baseline case here is, once again, California Engineering schools, with subsequent rows corresponding to differences in average percent increase for each category. Region is not as strongly related to salary growth as school type is: none of the region rows is statistically significant at the 5% level, whereas the Ivy League (p = 0.002) and Liberal Arts (p = 0.031) rows are, and when school type is held constant the differences in salary growth are similar for each region. Generally, it appears that growth from starting salary to mid-career salary is smallest among Engineering and Party schools, and greatest for Liberal Arts and (of course) Ivy League schools.

Through analysis of the regression tables and models from the data frame we created, we found that both region and school type correlate with starting median salary and mid-career median salary. However, school type seems to be more strongly correlated with these metrics, especially the percentage change in median salary between the two career stages. Where Engineering and Party schools fell short in percentage salary growth, Liberal Arts and Ivy League schools thrived. Though the region and type of school one attends factor into potential career salaries, many other factors contribute to how much money someone will make. Nonetheless, this data is legitimate and shows a distinction in median salaries by region and type of school. Ultimately, it should be used not to determine the limits of one’s salary based on one’s school, but to see where most people with a similar education stand on the socioeconomic scale at the beginning and middle of their careers, as well as how their salaries change between those two points.

~~

Citations and References

Data obtained from Kaggle.com, originally from the Wall Street Journal based on data from Payscale, Inc. https://www.kaggle.com/wsj/college-salaries

Supplementary Materials