1 Introduction

This research project examines student engagement and the overall academic experience of undergraduate students enrolled at two business schools in the US.

We will examine the relationships among several of the survey variables. The goal is to identify factors that support or hinder student success and satisfaction at these universities.

The survey includes the following sections:

  • Students’ Engagement in Learning

  • Student Learning Styles

  • Writing and Reading Load

  • Remedial Experience

  • Encouragement and Support

  • Growth and Development

  • Campus Resource Utilization

  • Retention

  • How Students Pay for College

The purpose of our analysis is to find patterns in student responses and examine how these factors correlate with a student’s sense of academic belonging, satisfaction, and potential persistence to graduation.

2 Data Management

This section prepares the survey data for analysis by handling missing values in each section of the questionnaire.
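The report does not state which method was used to handle missing values. As an illustration only, the sketch below assumes per-item median imputation, one common choice for Likert-type items. The function name `impute_item_medians` and the sample responses are hypothetical.

```python
from statistics import median

def impute_item_medians(rows):
    """Replace None in each item column with that column's observed median."""
    n_items = len(rows[0])
    medians = []
    for j in range(n_items):
        observed = [row[j] for row in rows if row[j] is not None]
        medians.append(median(observed))
    return [
        [medians[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]

# Hypothetical responses: 4 respondents x 3 items, None marks a missing answer.
responses = [
    [4, None, 2],
    [3, 5, None],
    [None, 4, 1],
    [5, 3, 3],
]
completed = impute_item_medians(responses)
```

Dropping respondents with missing answers is an alternative; it avoids inventing values but shrinks the sample for every later step.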

2.1 Students’ Engagement in Learning

2.2 Student Learning Styles

2.3 Writing and Reading Load

2.4 Remedial Experience

2.5 Encouragement and Support

2.6 Growth and Development

2.7 Campus Resource Utilization

2.8 Retention

2.9 How Students Pay for College

3 Reliability of the Analysis

For each section we first compute an initial correlation matrix. If the matrix contains many negatively correlated items, we split the items into subscales; if all items are positively correlated, we compute Cronbach's alpha.
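The decision rule described here can be sketched as follows. This is a minimal illustration, not the code used for the report: the helper names and the three item columns are hypothetical, and the check simply tests whether every off-diagonal Pearson correlation is positive.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlation_matrix(columns):
    """Correlation matrix for a list of item columns."""
    return [[pearson(a, b) for b in columns] for a in columns]

def all_positive(matrix):
    """True when every off-diagonal correlation is greater than zero."""
    k = len(matrix)
    return all(matrix[i][j] > 0 for i in range(k) for j in range(i + 1, k))

# Hypothetical item columns (one list per question, one entry per respondent).
items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [5, 4, 3, 2, 1],  # runs opposite to the first two items
]
use_alpha_directly = all_positive(correlation_matrix(items))
```

With the third item included the check fails, which corresponds to the case where the section is split into subscales before any reliability calculation.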

3.1 Students’ Engagement in Learning

The correlation matrix contains many negatively correlated items. Because of this, we will have to split the items into subscales.

3.2 Student Learning Styles

The correlation matrix for the Student Learning Styles section shows high correlations among the questions.

Confidence Interval of Cronbach’s Alpha
LCI alpha UCI
0.8325 0.8508 0.8677

Because the correlation matrix is strongly positive, we computed Cronbach's alpha. We obtained an alpha of .8508 with a confidence interval of (.8325, .8677). These values are high, which indicates a high level of internal consistency.
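For reference, Cronbach's alpha for k items is k/(k-1) times one minus the ratio of the summed item variances to the variance of the total score. The sketch below implements that standard formula on hypothetical responses; it does not reproduce the confidence interval, since the report does not say how the interval was computed.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for rows of item responses (one row per respondent)."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical responses: 5 respondents x 4 items.
responses = [
    [4, 5, 4, 4],
    [2, 3, 2, 3],
    [5, 5, 4, 5],
    [1, 2, 2, 1],
    [3, 3, 3, 4],
]
alpha = cronbach_alpha(responses)
```

Items that move together make the total-score variance much larger than the sum of the item variances, which pushes alpha toward 1.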

3.3 Writing and Reading Load

The correlation matrix shows only positive correlations, so we compute Cronbach's alpha.

Confidence Interval of Cronbach’s Alpha
LCI alpha UCI
0.3965 0.4703 0.5365

For this section we obtained a lower Cronbach's alpha of .4703 with a confidence interval of (.3965, .5365). With a value this low, we cannot accept this section as reliable.

3.4 Remedial Experience

The correlation matrix for this section is entirely positive, so we can compute Cronbach's alpha.

Confidence Interval of Cronbach’s Alpha
LCI alpha UCI
0.8347 0.8521 0.8684

We obtained an alpha of .8521 with a confidence interval of (.8347, .8684). These values are high, which indicates a high level of internal consistency.

3.5 Encouragement and Support

The correlation matrix contains only positive values, so we can compute Cronbach's alpha.

Confidence Interval of Cronbach’s Alpha
LCI alpha UCI
0.7851 0.8083 0.8298

We obtained a high Cronbach's alpha of .8083 with a confidence interval of (.7851, .8298), so we conclude that the questions in this section have high internal consistency.

3.6 Campus Resource Utilization

The correlation matrix above contains nearly equal numbers of positive and negative values, so we will have to split the items in this section into subscales.

We separated the items into 3 subscales. This did not reveal any trends; the matrix still contains many negative as well as positive correlations. Since we cannot remove these negative values from the matrix, we will not compute Cronbach's alpha, and we conclude that we cannot use this section.

3.7 Retention

Every value in this correlation matrix is positive, so we can compute Cronbach's alpha.

Confidence Interval of Cronbach’s Alpha
LCI alpha UCI
0.8589 0.8747 0.8891

We obtained a high Cronbach's alpha of .8747 with a confidence interval of (.8589, .8891), so this section has high internal consistency.

3.8 How Students Pay for College

The correlation matrix for this section contains both negatively and positively correlated values, so we cannot compute Cronbach's alpha for the section as a whole. We therefore split the How Students Pay for College section into two subscales: personal income/savings and outside assistance.

The matrix for the outside assistance subscale is an improvement over the full matrix. All of its values are positively correlated, so we can compute Cronbach's alpha.

Confidence Interval of Cronbach’s Alpha
LCI alpha UCI
0.5828 0.6309 0.6747

This subscale has a lower Cronbach's alpha of .6309 with a confidence interval of (.5828, .6747). Because the value is only about .63, it is questionable whether this subscale should be used.

4 PCA

This section presents the results of the principal component analysis (PCA). The significant principal components identified will be added to the analytic data set for use in regression modeling.
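To make the regression inputs concrete: a respondent's component score is the weighted sum of their standardized item responses, with the factor loadings as weights. The sketch below is illustrative only. The loadings are the six values reported for q51 to q56 in Section 4.2, the responses are hypothetical, and standardizing each item first is an assumption about how the PCA was run.

```python
from statistics import mean, stdev

def standardize(column):
    """Center a column at zero and scale it to unit sample standard deviation."""
    m, s = mean(column), stdev(column)
    return [(v - m) / s for v in column]

def component_scores(rows, loadings):
    """One score per respondent: loadings dotted with standardized responses."""
    columns = [
        standardize([row[j] for row in rows]) for j in range(len(loadings))
    ]
    return [
        sum(w * col[i] for w, col in zip(loadings, columns))
        for i in range(len(rows))
    ]

# Loadings for q51..q56 as reported in the Learning Styles table (Section 4.2).
loadings = [0.279, 0.420, 0.454, 0.413, 0.453, 0.405]

# Hypothetical responses: 4 respondents x 6 items.
rows = [
    [3, 4, 4, 5, 4, 3],
    [2, 2, 3, 2, 2, 2],
    [4, 5, 5, 4, 5, 4],
    [1, 2, 1, 2, 1, 2],
]
scores = component_scores(rows, loadings)
```

Because every item is centered, the scores average to zero across respondents; a single column of such scores then stands in for the whole section in a regression model.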

4.1 Students’ Engagement in Learning PCA

We first generate a plot to help us decide, visually and statistically, how many underlying dimensions are in the engagement data. We do this before running PCA or using principal components in regression.

The plot above shows that about 2 components are appropriate, so we proceed with principal component analysis.

Factor loadings of the first few principal components, and the cumulative proportion of variation they explain, in the Engagement questionnaire.
PC1 PC2
q41 -0.241 0.202
q42 -0.280 0.050
q43 -0.229 -0.039
q44 -0.215 -0.184
q45 -0.074 -0.082
q46 -0.168 -0.284
q47 -0.265 0.030
q48 -0.204 0.220
q49 -0.254 0.163
q410 -0.018 -0.462
q411 -0.074 -0.470
q412 -0.285 0.047
q413 -0.273 -0.078
q414 -0.257 -0.130
q415 -0.187 -0.347
q416 -0.165 -0.278
q417 -0.278 0.164
q418 -0.286 0.105
q419 -0.222 0.134
q420 -0.237 0.144
q421 0.008 -0.156
Proportion and cumulative proportion of variance explained by each principal component in the engagement survey.
PC1 PC2
Standard deviation 2.487 1.701
Proportion of Variance 0.294 0.138
Cumulative Proportion 0.294 0.432

Both components explain a relatively small proportion of the variance: PC1 explains 29.4% and PC2 explains 13.8%, for a cumulative 43.2%.
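The proportions in the table can be recovered from the reported standard deviations. The sketch below assumes the PCA was run on standardized items, so that the total variance equals the number of items (21 for this section); the reported figures are consistent with that assumption, but the report does not state it. The function name is hypothetical.

```python
def variance_explained(std_devs, n_items):
    """Proportion and cumulative proportion of variance for each component.

    Assumes standardized items, so total variance equals n_items.
    """
    eigenvalues = [sd ** 2 for sd in std_devs]  # component variances
    proportions = [ev / n_items for ev in eigenvalues]
    cumulative = []
    running = 0.0
    for p in proportions:
        running += p
        cumulative.append(running)
    return proportions, cumulative

# Standard deviations of PC1 and PC2 from the engagement table; 21 items.
proportions, cumulative = variance_explained([2.487, 1.701], n_items=21)
```

The results agree with the table up to rounding of the reported standard deviations, roughly 0.29 for PC1 and 0.14 for PC2.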

The graph shows that the distribution appears to be skewed to the right. Further analysis will be needed to decide whether a Box-Cox transformation is required to fix the distributional issue in the regression residuals.
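As a rough sketch of what that further analysis could look like, the code below applies the Box-Cox transformation over a grid of lambda values and keeps the one with the highest profile log-likelihood. It is a simplified, intercept-only version: in a regression setting the variance term would come from the model residuals instead. Box-Cox is only defined for strictly positive values, so a variable containing zeros or negatives would have to be shifted first. The data and function names are hypothetical.

```python
from math import exp, log
from statistics import pvariance

def box_cox(values, lam):
    """Box-Cox transform: log(y) when lam == 0, else (y**lam - 1) / lam."""
    if lam == 0:
        return [log(v) for v in values]
    return [(v ** lam - 1) / lam for v in values]

def profile_loglik(values, lam):
    """Profile log-likelihood of lam under a normal, intercept-only model."""
    n = len(values)
    transformed = box_cox(values, lam)
    return (-n / 2 * log(pvariance(transformed))
            + (lam - 1) * sum(log(v) for v in values))

def best_lambda(values, grid):
    """Grid value of lam with the highest profile log-likelihood."""
    return max(grid, key=lambda lam: profile_loglik(values, lam))

# Hypothetical right-skewed, strictly positive data (symmetric on the log scale).
values = [exp(x / 4) for x in range(-8, 9)]
grid = [i / 10 for i in range(-20, 21)]  # lam from -2.0 to 2.0 in steps of 0.1
lam = best_lambda(values, grid)
```

A selected lambda near 0 points to a log transformation, and a value near 1 suggests that no transformation is needed.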

4.2 Student Learning Styles PCA

The plot above suggests using one principal component.

Factor loadings of the first principal component, and the proportion of variation it explains, in the Learning Style questionnaire.
PC1
q51 0.279
q52 0.420
q53 0.454
q54 0.413
q55 0.453
q56 0.405
Proportion and cumulative proportion of variance explained by the principal component in the Learning Style survey.
PC1
Standard deviation 1.877
Proportion of Variance 0.587
Cumulative Proportion 0.587

The first principal component explains a larger proportion of the variance, about 59%.

The plot above shows that the distribution appears to be skewed to the right. We will need further analysis to see whether a Box-Cox transformation is required to fix the distributional issue in the regression residuals.

4.3 Writing and Reading Load PCA

The plot above suggests using one principal component.

Factor loadings of the first principal component, and the proportion of variation it explains, in the Writing and Reading questionnaire.
PC1
q61 0.676
q62 0.307
q63 0.670
Proportion and cumulative proportion of variance explained by the principal component in the writing and reading survey.
PC1
Standard deviation 1.216
Proportion of Variance 0.493
Cumulative Proportion 0.493

The first principal component explains about 49% of the variance.

The distribution plot is skewed to the right. We will need further analysis to see whether a Box-Cox transformation is required.

4.4 Remedial Experience PCA

The plot above suggests using 2 principal components.

Factor loadings of the first few principal components, and the cumulative proportion of variation they explain, in the Remedial questionnaire.
PC1 PC2
q81 0.127 0.625
q82 0.318 -0.134
q83 0.394 -0.266
q84 0.395 -0.303
q85 0.363 -0.359
q86 0.354 0.035
q87 0.316 0.287
q88 0.321 0.299
q89 0.334 0.356
Proportion and cumulative proportion of variance explained by each principal component in the remedial survey.
PC1 PC2
Standard deviation 2.067 1.075
Proportion of Variance 0.475 0.128
Cumulative Proportion 0.475 0.603

PC1 explains 47.5% of the variance, compared with only 12.8% for the second component. We will use only the first principal component.

The distribution shown above is skewed to the left, so we will have to perform a Box-Cox transformation to fix the distributional issue.

4.5 Encouragement and Support PCA

The table above suggests using the first two principal components.

Factor loadings of the first few principal components, and the cumulative proportion of variation they explain, in the Encouragement and Support questionnaire.
PC1 PC2
q91 0.297 -0.378
q92 0.426 -0.162
q93 0.416 -0.178
q94 0.435 0.347
q95 0.430 0.331
q96 0.376 0.278
q97 0.203 -0.701
Proportion and cumulative proportion of variance explained by each principal component in the Encouragement and Support survey.
PC1 PC2
Standard deviation 1.832 1.102
Proportion of Variance 0.480 0.173
Cumulative Proportion 0.480 0.653

Our first component explains 48% of the total variance, and the second component explains only 17%. We will use only the first component in our analysis.

The distribution shown above is approximately normal.

4.6 Retention PCA

The table above suggests using one component.

Factor loadings of the first principal component, and the proportion of variation it explains, in the Retention questionnaire.
PC1
q121 -0.458
q122 -0.461
q123 -0.439
q124 -0.455
q125 -0.422
Proportion and cumulative proportion of variance explained by the principal component in the Retention survey.
PC1
Standard deviation 1.835
Proportion of Variance 0.673
Cumulative Proportion 0.673

The first PC explains 67% of our total variance.

We can see that the plot is skewed to the right, so we might need to perform a Box-Cox transformation. Further analysis is needed.

4.7 How Students Pay for College PCA

The plot above suggests using one PC.

Factor loadings of the first principal component, and the proportion of variation it explains, in the Pay questionnaire.
PC1
q133 -0.418
q134 -0.586
q135 -0.528
q136 -0.451
Proportion and cumulative proportion of variance explained by the principal component in the pay survey.
PC1
Standard deviation 1.391
Proportion of Variance 0.484
Cumulative Proportion 0.484

The first PC explains 48.4% of the variance.

The graph above is skewed to the right. We will need to perform a Box-Cox transformation to fix the distributional issue.

5 Project Questions

  1. How well does student engagement inside and outside the classroom predict students' overall satisfaction with their college experience?

  2. Does how a student pays for school correlate with the student's engagement in school?

The first question looks at the idea that students who are more actively engaged in their learning may feel more connected to their academic experience. The survey data provided to us captures student engagement in detail through its 21 questions (Section 4), giving us enough variables to represent engagement.

The second question stems from the assumption that students who pay for their own college would be more likely to be engaged in their learning. Looking at these two together can offer meaningful insight into how engaged a student will be depending on whether they paid for school themselves or someone else did.

