class: center, middle, inverse, title-slide

.title[
# STA490: Student Satisfaction Analysis
]
.subtitle[
##
]
.author[
### Josie Gallop, Chloé Winters, Ava Destefano
]
.date[
### 2025-05-09
]

---

# Agenda

<font size = 5>

.pull-left[
- Introduction
- Variables
- Practical Questions
- Satisfaction Models with ROC Analysis
- Loyalty Models with ROC Analysis
- Transfer Student Satisfaction
- Results
- Conclusion and Recommendations
]

<BR> <BR>
</font>

---

# Introduction

<font size = 6>

.pull-left[
- Student survey results
- Collected from 2 northern universities
- 332 observations of 123 variables
- Focus on loyalty and satisfaction variables
]

<BR> <BR>
</font>

---

## Variables

<font size = 5>

.pull-left[
- **q1** - school year
- **q2** - whether the student started school here or elsewhere
- **q3** - credit hours being taken this semester
- **q4** - engagement in learning
- **q5** - student learning style
- **q6** - writing and reading load
- **q7** - how challenging examinations have been for students
- **q8** - remedial experience
- **q9** - encouragement and support
- **q10** - growth and development
- **q11** - how often students utilize campus resources
- **q12** - retention
- **q13** - how students pay for school
]

.pull-right[
- **q14** - when students plan to take classes at the school again
- **q15** - GPA
- **q16** - total credit hours
- **q17** - whether the student would recommend the school of business
- **q18** - how the student evaluates their experience
- **q19** - age
- **q20** - sex
- **q21** - children?
- **q22** - English as a first language?
- **q23** - international or foreign student?
- **q24** - race
- **q25** - which of the 2 schools the student is attending
]

<BR> <BR>
</font>

---

## Practical Questions

<font size = 6>

• What factors influence student satisfaction? <BR> <BR>
• What factors influence student loyalty?
<BR> <BR>
</font>

---

## Additional Variables

<font size = 6>

• Some survey questions had multiple parts, each recorded as its own variable <BR> <BR>
• We condensed each multi-part question into the average of its parts <BR> <BR>
• This process created 9 averaged variables: avg_engagement, avg_learning, avg_writing, avg_remedial, avg_encouragement, avg_growth, avg_resource, avg_retention, and avg_payment <BR> <BR>

</font>

---

## Data Split

<font size = 6>

• Split the data into two groups <BR> <BR>
• 80% for training <BR> <BR>
• 20% for testing <BR> <BR>
• Training data will be used for building our models <BR> <BR>

</font>

<!-- Start of Chloe's Slides -->

---
class:inverse middle center
name:model building

# Modeling for Student Loyalty

---

## Modeling for Student Loyalty

<font size = 6>

• We have created several logistic regression models for the survey data with student loyalty (q17) as the binary response variable <BR> <BR>
• Now, let's further investigate the statistical significance of certain variables <BR> <BR>

</font>

---
class:inverse middle center
name:model building

# Loyalty Model Building Process

---

## Full Model

<font size = 6>

• Includes all single-question variables and the averaged variables for multi-part questions <BR> <BR>

</font>

---

## Full Model (Part 1)

<div class="scroll-table-container">
<table class="table" style="width: auto !important; margin-left: auto; margin-right: auto;">
<thead>
<tr>
<th style="text-align:left;"> </th>
<th style="text-align:right;"> Estimate </th>
<th style="text-align:right;"> Std.
Error </th> <th style="text-align:right;"> z value </th> <th style="text-align:right;"> Pr(>&#124;z&#124;) </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:right;"> 11.5949 </td> <td style="text-align:right;"> 4.8676 </td> <td style="text-align:right;"> 2.3821 </td> <td style="text-align:right;"> 0.0172 </td> </tr> <tr> <td style="text-align:left;"> q1 </td> <td style="text-align:right;"> 0.1191 </td> <td style="text-align:right;"> 0.3254 </td> <td style="text-align:right;"> 0.3662 </td> <td style="text-align:right;"> 0.7142 </td> </tr> <tr> <td style="text-align:left;"> q2 </td> <td style="text-align:right;"> 0.1418 </td> <td style="text-align:right;"> 0.5018 </td> <td style="text-align:right;"> 0.2827 </td> <td style="text-align:right;"> 0.7774 </td> </tr> <tr> <td style="text-align:left;"> q3 </td> <td style="text-align:right;"> -0.1032 </td> <td style="text-align:right;"> 0.6045 </td> <td style="text-align:right;"> -0.1707 </td> <td style="text-align:right;"> 0.8645 </td> </tr> <tr> <td style="text-align:left;"> q7 </td> <td style="text-align:right;"> 0.0685 </td> <td style="text-align:right;"> 0.1720 </td> <td style="text-align:right;"> 0.3985 </td> <td style="text-align:right;"> 0.6903 </td> </tr> <tr> <td style="text-align:left;"> q14 </td> <td style="text-align:right;"> -0.1295 </td> <td style="text-align:right;"> 0.2871 </td> <td style="text-align:right;"> -0.4509 </td> <td style="text-align:right;"> 0.6520 </td> </tr> <tr> <td style="text-align:left;"> q15 </td> <td style="text-align:right;"> 0.4600 </td> <td style="text-align:right;"> 0.2305 </td> <td style="text-align:right;"> 1.9956 </td> <td style="text-align:right;"> 0.0460 </td> </tr> <tr> <td style="text-align:left;"> q16 </td> <td style="text-align:right;"> -0.2303 </td> <td style="text-align:right;"> 0.1874 </td> <td style="text-align:right;"> -1.2290 </td> <td style="text-align:right;"> 0.2191 </td> </tr> <tr> <td style="text-align:left;"> 
q19 </td> <td style="text-align:right;"> -0.1661 </td> <td style="text-align:right;"> 0.1980 </td> <td style="text-align:right;"> -0.8391 </td> <td style="text-align:right;"> 0.4014 </td> </tr>
<tr> <td style="text-align:left;"> q20 </td> <td style="text-align:right;"> -0.0043 </td> <td style="text-align:right;"> 0.4207 </td> <td style="text-align:right;"> -0.0103 </td> <td style="text-align:right;"> 0.9918 </td> </tr>
<tr> <td style="text-align:left;"> q21 </td> <td style="text-align:right;"> -1.5065 </td> <td style="text-align:right;"> 0.8920 </td> <td style="text-align:right;"> -1.6890 </td> <td style="text-align:right;"> 0.0912 </td> </tr>
<tr> <td style="text-align:left;"> q22 </td> <td style="text-align:right;"> 0.0382 </td> <td style="text-align:right;"> 0.7491 </td> <td style="text-align:right;"> 0.0511 </td> <td style="text-align:right;"> 0.9593 </td> </tr>
</tbody>
</table>
</div>

---

## Full Model (Part 2)

<table class="table" style="width: auto !important; margin-left: auto; margin-right: auto;">
<thead>
<tr>
<th style="text-align:left;"> </th>
<th style="text-align:right;"> Estimate </th>
<th style="text-align:right;"> Std.
Error </th> <th style="text-align:right;"> z value </th> <th style="text-align:right;"> Pr(>&#124;z&#124;) </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> q23 </td> <td style="text-align:right;"> -0.2699 </td> <td style="text-align:right;"> 1.5454 </td> <td style="text-align:right;"> -0.1747 </td> <td style="text-align:right;"> 0.8613 </td> </tr> <tr> <td style="text-align:left;"> q24 </td> <td style="text-align:right;"> -0.1055 </td> <td style="text-align:right;"> 0.2314 </td> <td style="text-align:right;"> -0.4559 </td> <td style="text-align:right;"> 0.6485 </td> </tr> <tr> <td style="text-align:left;"> q25 </td> <td style="text-align:right;"> -2.8470 </td> <td style="text-align:right;"> 0.6191 </td> <td style="text-align:right;"> -4.5984 </td> <td style="text-align:right;"> 0.0000 </td> </tr> <tr> <td style="text-align:left;"> avg_engagement </td> <td style="text-align:right;"> 1.3438 </td> <td style="text-align:right;"> 0.6294 </td> <td style="text-align:right;"> 2.1350 </td> <td style="text-align:right;"> 0.0328 </td> </tr> <tr> <td style="text-align:left;"> avg_learning </td> <td style="text-align:right;"> -0.3719 </td> <td style="text-align:right;"> 0.4398 </td> <td style="text-align:right;"> -0.8457 </td> <td style="text-align:right;"> 0.3977 </td> </tr> <tr> <td style="text-align:left;"> avg_writing </td> <td style="text-align:right;"> 0.0232 </td> <td style="text-align:right;"> 0.3361 </td> <td style="text-align:right;"> 0.0690 </td> <td style="text-align:right;"> 0.9450 </td> </tr> <tr> <td style="text-align:left;"> avg_remedial </td> <td style="text-align:right;"> 0.8946 </td> <td style="text-align:right;"> 0.5894 </td> <td style="text-align:right;"> 1.5177 </td> <td style="text-align:right;"> 0.1291 </td> </tr> <tr> <td style="text-align:left;"> avg_encouragement </td> <td style="text-align:right;"> -1.8878 </td> <td style="text-align:right;"> 0.5190 </td> <td style="text-align:right;"> -3.6375 </td> <td style="text-align:right;"> 
0.0003 </td> </tr>
<tr> <td style="text-align:left;"> avg_growth </td> <td style="text-align:right;"> -1.2140 </td> <td style="text-align:right;"> 0.4124 </td> <td style="text-align:right;"> -2.9438 </td> <td style="text-align:right;"> 0.0032 </td> </tr>
<tr> <td style="text-align:left;"> avg_resource </td> <td style="text-align:right;"> 0.8016 </td> <td style="text-align:right;"> 0.4634 </td> <td style="text-align:right;"> 1.7299 </td> <td style="text-align:right;"> 0.0837 </td> </tr>
<tr> <td style="text-align:left;"> avg_retention </td> <td style="text-align:right;"> 0.6135 </td> <td style="text-align:right;"> 0.2949 </td> <td style="text-align:right;"> 2.0808 </td> <td style="text-align:right;"> 0.0375 </td> </tr>
<tr> <td style="text-align:left;"> avg_payment </td> <td style="text-align:right;"> -1.0897 </td> <td style="text-align:right;"> 0.7251 </td> <td style="text-align:right;"> -1.5027 </td> <td style="text-align:right;"> 0.1329 </td> </tr>
</tbody>
</table>

---

## Reduced Model

<font size = 6>

• Includes only the averaged variables for multi-part questions <BR> <BR>

</font>

---

## Reduced Model

<div class="center-output">

|                  |   Estimate| Std. Error|    z value| Pr(>&#124;z&#124;)|
|:-----------------|----------:|----------:|----------:|------------------:|
|(Intercept)       |  4.2192044|  1.7561998|  2.4024626|          0.0162851|
|avg_engagement    |  1.3711692|  0.5000397|  2.7421205|          0.0061044|
|avg_learning      | -0.4616486|  0.3524269| -1.3099131|          0.1902252|
|avg_writing       | -0.2206449|  0.2642178| -0.8350872|          0.4036686|
|avg_remedial      |  0.0530763|  0.4665322|  0.1137678|          0.9094219|
|avg_encouragement | -2.3182741|  0.4550208| -5.0948747|          0.0000003|
|avg_growth        | -0.4393832|  0.3201346| -1.3724952|          0.1699093|
|avg_resource      |  0.7246292|  0.3885169|  1.8651166|          0.0621651|
|avg_retention     |  0.3953444|  0.2351779|  1.6810444|          0.0927543|
|avg_payment       | -0.2665375|  0.5671161| -0.4699876|          0.6383639|

</div>

---

## Stepwise Model

<font size = 6>

• Uses forward and backward selection to build the model <BR> <BR>

</font>

---

## Stepwise Model

<div class="center-output">

|                  | Estimate| Std. Error| z value| Pr(>&#124;z&#124;)|
|:-----------------|--------:|----------:|-------:|------------------:|
|(Intercept)       |  10.1762|     2.5876|  3.9327|             0.0001|
|q15               |   0.4151|     0.2055|  2.0201|             0.0434|
|q16               |  -0.2117|     0.1393| -1.5194|             0.1287|
|q21               |  -1.2940|     0.7513| -1.7224|             0.0850|
|q25               |  -2.8818|     0.5815| -4.9561|             0.0000|
|avg_engagement    |   0.9806|     0.5326|  1.8410|             0.0656|
|avg_remedial      |   0.8700|     0.5468|  1.5911|             0.1116|
|avg_encouragement |  -1.9179|     0.4930| -3.8907|             0.0001|
|avg_growth        |  -1.2310|     0.3921| -3.1394|             0.0017|
|avg_resource      |   0.7995|     0.4471|  1.7882|             0.0737|
|avg_retention     |   0.6035|     0.2782|  2.1697|             0.0300|
|avg_payment       |  -0.9953|     0.6797| -1.4644|             0.1431|

</div>

---

## Cross Validation

Table: Average prediction errors for Full, Stepwise, and Reduced Models

| Full_Model| Stepwise_Model| Reduced_Model|
|----------:|--------------:|-------------:|
|  0.1461538|      0.1461538|     0.1807692|

---

## Cross Validation

<font size = 6>

• We used 5-fold cross-validation <BR> <BR>
• The Full and Stepwise models share the lowest average prediction error (0.146) <BR> <BR>
• The Reduced model performs worse than both (0.181) <BR> <BR>
• Stepwise and Forward selection are
very similar; Stepwise is simpler <BR> <BR>

</font>

---

## ROC Curve

.pull-center[
[Figure: ROC curves for the loyalty models]
]

---

## ROC Analysis

<font size = 6>

• An AUC closer to 1 indicates better classification performance <BR> <BR>
• The reduced model has the lowest AUC <BR> <BR>
• Stepwise and Forward selection have high AUC <BR> <BR>
• Stepwise and Forward selection have very similar AUC; Stepwise is simpler <BR> <BR>
• Stepwise is my suggestion for the final model <BR> <BR>

</font>

---

# Results

<font size = 6>

• The stepwise model is our chosen final model <BR> <BR>
• The final model has an accuracy of 0.9242 <BR> <BR>
• Factors with the most probable impact on student loyalty: GPA, Credit Hours, Children in House, School, Student Engagement, Remedial Experience, Student Encouragement, Student Growth, Campus Resources, Retention, and How Students Pay for School <BR> <BR>

</font>

---
class:inverse middle center
name:Visual

# Modeling for Student Satisfaction

---

## Data Preparation

<font size = 6>

• Transform question 18 into a binary variable <BR> <BR>
• Split the data into training and testing data <BR> <BR>
• 80% for training and 20% for testing <BR> <BR>

</font>

---

# Full Model

<font size = 6>

• Still uses the averaged variables <BR> <BR>
• Only a few variables are statistically significant, including GPA <BR> <BR>

</font>

---

<font size = 5>

.pull-left[
## Reduced Model
- Question 15
- Question 25
- avg_encouragement
- avg_growth
- avg_retention
]

.pull-right[
## Stepwise Model
- Question 15
- Question 16
- Question 19
- Question 25
- avg_encouragement
- avg_growth
- avg_retention
- avg_writing
]

<BR> <BR>
</font>

---

# Predictive Errors

<font size = 6>

• The full and stepwise models have the smallest predictive errors <BR> <BR>
• But the stepwise model is the simpler of the two <BR> <BR>

</font>

| PE1 | PE2 | PE3 |
|:---:|:---:|:---:|
|0.769|0.792|0.769|

---

# ROC Analysis

.pull-center[
[Figure: ROC curves for the satisfaction models]
]

---

# Results

<font size = 6>

• The stepwise model is our chosen final model <BR>
<BR>
• The final model has an accuracy of 0.6818 <BR> <BR>
• Factors with the most probable impact on student satisfaction: GPA, age, school, Student Growth, Student Encouragement, and Student Retention <BR> <BR>

</font>

---
class:inverse middle center
name:Visual

# Practical Interpretation of Our Findings

---

## Further Investigation of our Findings

<font size = 6>

• We have created several logistic regression models for the survey data with student loyalty (q17) and student satisfaction (q18) as binary response variables <BR> <BR>
• Now, let's further investigate the statistical significance of certain variables <BR> <BR>

</font>

---

## Transfer Student Satisfaction

<font size = 6>

• One question I had for this project: does loyalty or satisfaction differ significantly between transfer and non-transfer students? <BR> <BR>
• Could transfer students have lower satisfaction due to the long and difficult transfer process? <BR> <BR>

</font>

---

## Logistic Regression Model Findings

<font size = 6>

• We used q17 (student loyalty: yes or no) as our binary response variable <BR> <BR>
• Let's look back at the full model to start <BR> <BR>

</font>

---

## Transfer Students vs Non-Transfer Students

<font size = 6>

• A student's transfer status, q2, was not a statistically significant predictor of loyalty, p = 0.569 <BR> <BR>
• Nor was it a statistically significant predictor of satisfaction, p = 0.259 <BR> <BR>
• No significant difference in satisfaction between transfer students and non-transfer students <BR> <BR> <BR>

</font>

---

## Non-Significant Factors

<font size = 6>

• Whether or not a student is an international student, q23, was also not statistically significant for loyalty, p = 0.228, or for satisfaction, p = 0.657 <BR> <BR>
• Gender, q20, was also not statistically significant for loyalty, p = 0.909, or for satisfaction, p = 0.982 <BR> <BR>

</font>

---

## Which Factors Were Significant?
<font size = 6>

• q19, a student's age group, was statistically significant for loyalty, p = 0.036 <BR> <BR>
• q10.7, whether a student has skills in using computing and information technology, was statistically significant for loyalty, p = 0.038 <BR> <BR>
• q6.2, the number of books students read for their own enjoyment outside of class, was statistically significant for loyalty, p = 0.017 <BR> <BR>

</font>

---

## Results

<font size = 6>

• Factors like technology skills, age group, and how often a student reads do have statistically significant effects on student loyalty <BR> <BR>
• These factors should be considered when building future reduced or stepwise selection models like the ones in our project <BR> <BR>
• Some factors that did not show statistical significance were a bit surprising, so these could be investigated further <BR> <BR>

</font>

---
class:inverse middle center
name:Visual

# Summary and Conclusion

---

# Summary

<font size = 6>

• We looked at three logistic regression models: full, reduced, and stepwise <BR> <BR>
• Of these three, the stepwise model showed the best performance for both student satisfaction and student loyalty <BR> <BR>
• The stepwise model had small predictive errors, a good ROC curve, and was simpler than the full model <BR> <BR>

</font>

---

## Student Loyalty Summary

<font size = 6>

• The stepwise model included q15 (GPA), q16 (credit hours), q21 (whether they have children living with them), q25 (which school), and seven of the averaged variables <BR> <BR>
• Of these, q15, q25, avg_encouragement, avg_growth, and avg_retention were statistically significant (p < .05); q16, q21, avg_engagement, avg_remedial, avg_resource, and avg_payment were not.
<BR> <BR>
• These statistically significant variables would be worth looking into regarding student loyalty <BR> <BR>

</font>

---

## Student Satisfaction Summary

<font size = 6>

• The stepwise model included q15 (GPA), q16 (credit hours), q19 (age group), q25 (which school), avg_encouragement, avg_growth, avg_retention, and avg_writing <BR> <BR>
• All of these factors showed statistical significance (p < .05) except for q16 and avg_writing <BR> <BR>

</font>

---

# Conclusion

<font size = 6>

• q15 (GPA), q25 (which school), avg_encouragement, avg_growth, and avg_retention were in both final models and were statistically significant <BR> <BR>
• q16 (credit hours) was in both final models, but not statistically significant in either <BR> <BR>

</font>

---

# Conclusion

<font size = 6>

• GPA and which of the two schools (FS vs SM) a student attends are significant for both loyalty and satisfaction <BR> <BR>
• Additionally, avg_encouragement, avg_growth, and avg_retention were significant for both loyalty and satisfaction <BR> <BR>
• These variables are worth looking into further to understand which factors play the biggest role in a student's loyalty and satisfaction

</font>

---

# Future Recommendations

<font size = 6>

• Further explore each individual variable <BR> <BR>
• Make question 11 easier to work with <BR> <BR>
• Explore different modeling options <BR> <BR>

</font>

---
name: Thank you
class: inverse center middle

# Thank you!

Slides created using R packages:

[**xaringan**](https://github.com/yihui/xaringan)<br>
[**gadenbuie/xaringanthemer**](https://github.com/gadenbuie/xaringanthemer)<br>
[**knitr**](http://yihui.name/knitr)<br>
[**R Markdown**](https://rmarkdown.rstudio.com)<br>
via <br>
[**RStudio Desktop**](https://posit.co/download/rstudio-desktop/)
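
---

## Appendix: Reading the Stepwise Loyalty Coefficients

The estimates in the Stepwise Model table for loyalty (q17) are on the log-odds scale, so exponentiating one gives an odds ratio, and the inverse logit of the full linear predictor gives a predicted probability. The block below is a minimal sketch of that arithmetic, not the code used to fit the models. It is written in Python using only the standard library; the helper names `odds_ratios` and `predicted_probability` and the example student are hypothetical. The slides do not state how each survey answer is coded, so the direction of every effect is an assumption to check against the survey codebook.

```python
import math

# Estimates copied from the "Stepwise Model" table for loyalty (q17).
STEPWISE_LOYALTY = {
    "(Intercept)": 10.1762,
    "q15": 0.4151,
    "q16": -0.2117,
    "q21": -1.2940,
    "q25": -2.8818,
    "avg_engagement": 0.9806,
    "avg_remedial": 0.8700,
    "avg_encouragement": -1.9179,
    "avg_growth": -1.2310,
    "avg_resource": 0.7995,
    "avg_retention": 0.6035,
    "avg_payment": -0.9953,
}


def odds_ratios(coefs):
    """Return exp(estimate) for every predictor, leaving out the intercept."""
    return {name: math.exp(b) for name, b in coefs.items() if name != "(Intercept)"}


def predicted_probability(coefs, student):
    """Inverse logit of the linear predictor for one student.

    `student` maps predictor names to values; predictors that are
    missing from the mapping are treated as 0.
    """
    log_odds = coefs["(Intercept)"] + sum(
        b * student.get(name, 0.0)
        for name, b in coefs.items()
        if name != "(Intercept)"
    )
    # Numerically stable form of 1 / (1 + exp(-log_odds)).
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)


if __name__ == "__main__":
    for name, ratio in odds_ratios(STEPWISE_LOYALTY).items():
        print(f"{name:18s} odds ratio = {ratio:.3f}")

    # Hypothetical answers for illustration only (not a row from the survey).
    example_student = {
        "q15": 3, "q16": 2, "q21": 2, "q25": 1,
        "avg_engagement": 2.5, "avg_remedial": 1.5, "avg_encouragement": 2.0,
        "avg_growth": 2.0, "avg_resource": 1.5, "avg_retention": 3.0,
        "avg_payment": 1.5,
    }
    print(f"predicted probability = {predicted_probability(STEPWISE_LOYALTY, example_student):.3f}")
```

Under this sketch, the odds ratio for q25 is exp(-2.8818), roughly 0.056: holding the other predictors fixed, a one-unit increase in the q25 code multiplies the estimated odds of the modeled q17 outcome by about 0.06. Which school and which q17 answer that corresponds to depends on the coding, which the slides do not give.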