Collegium: A dashboard to help community colleges find peer institutions

Mark A. Perkins and Sean Field

Summary

Many community colleges and two-year institutions require program and institutional evaluation. However, because no two institutions are the same, it is often difficult to find suitable points of comparison for an institution’s metrics. Collegium is an open-source software tool that helps institutional researchers and other community college stakeholders identify peer institutions for benchmarking. It uses k-means cluster analysis to narrow down lists of potential peer institutions. Unlike earlier methods, this dashboard integrates both institution-level and county-level variables to help identify peers.

Statement of Need

Methods for identifying and selecting peer institutions have evolved over several decades, beginning with early descriptive studies and progressing to statistical and hybrid methodologies. Curry’s (1972) pioneering work laid the groundwork by comparing the “character” of universities, a framework that continued to develop through the 1980s. Terenzini et al. (1980) employed cluster analysis using the BMDP2M program, highlighting faculty productivity and salaries. Brinkman & Teeter (1987) later emphasized the subjective judgment inherent in cluster analysis. Szelest (1996) and, later, Boronico & Choksi (2012) refined the approach with hierarchical cluster analysis, using IPEDS data to narrow down potential peers. Weeks et al. (2000) documented hybrid approaches that blend statistical methods with subjective judgment. Zhao & Dean (1997) and Yan (2017) further advanced the field with latent class modeling and Naïve Bayesian Classification. More recently, D’Allegro & Zhou (2013), D’Allegro (2017), and Chatman (2017) employed percentile selection indices and proximity indices to enhance precision. This evolution reflects broader trends in data availability and methodological advancement in higher education research.

Effective benchmarking and strategic planning in higher education depend on identifying peer institutions, yet community factors are often overlooked in that process, especially for two-year institutions. Existing tools lack user-friendliness, which hinders strategic planning and resource allocation. An intuitive dashboard built on data analytics can enhance institutional benchmarking. There is also a need for user-friendly software to manage institutional data and identify peer institutions, giving more stakeholders access to analytical capabilities and enabling them to participate in data-driven strategic planning and benchmarking.

Description

Collegium integrates IPEDS and US Census data to empower educational benchmarking. Through K-means clustering, it facilitates the identification of peer institutions based on diverse metrics. Users can explore and analyze institutional data, ranging from enrollment and graduation rates to demographic characteristics, fostering informed decision-making in the education sector. With features enabling data export and visualization, the dashboard serves as a comprehensive platform for educators, policymakers, and researchers alike, promoting collaboration and insights-driven strategies. Its user-friendly interface and analytical capabilities make it a valuable resource for navigating the complexities of educational data and deriving actionable insights for improving institutional performance and student outcomes.

Data Collection

The required libraries are loaded, including factoextra (Kassambara and Mundt 2020), FactoMineR (Le, Josse, and Husson 2008), tidyverse (Wickham et al. 2019), dplyr (Wickham et al. 2022), data.table (Dowle and Srinivasan 2023), DT (Xie, Cheng, and Tan 2022), shiny (“Shiny - Welcome to Shiny” 2023), shinyWidgets (Perrier, Meyer, and Granjon 2023), and shinydashboard (Chang and Ribeiro 2021).

Data collection involves retrieving and processing data from IPEDS and the US Census Bureau. For IPEDS, connections were established to gather institution information, enrollment, graduation rates, and cost data. These data were then joined, transformed, and exported for analysis. Census data retrieval covered population and demographic information, from which demographic distributions were calculated; the results were then cleaned and exported for integration.

The integrated dataset combines IPEDS and census data by county codes, undergoing further cleaning and preparation. The final CSV export offers insights into educational institutions’ relationship with demographic characteristics at the county level, achieved through comprehensive data collection and transformation. This approach facilitates in-depth analysis and research.
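The county-level join described above can be sketched as follows. This is a minimal illustration in Python rather than Collegium’s own R code; the column names (`unitid`, `county_fips`, `pct_under_25`, and so on) and every data row are made up for the example. County codes are kept as strings so that leading zeros survive the join, and institutions without a matching county are reported rather than silently dropped.

```python
import csv
import io

# Hypothetical rows standing in for the IPEDS and Census extracts.
IPEDS_CSV = """unitid,institution,county_fips,enrollment
100001,Example Community College,01001,4200
100002,Sample Technical College,01003,2750
100003,Placeholder Junior College,99999,1300
"""

CENSUS_CSV = """county_fips,population,pct_under_25
01001,58805,31.2
01003,231767,27.9
"""


def read_rows(text):
    """Parse CSV text into a list of dicts, one per row (values stay strings)."""
    return list(csv.DictReader(io.StringIO(text)))


def join_on_county(institutions, counties, key="county_fips"):
    """Inner-join institution rows to county rows on a shared county code.

    Returns (joined, unmatched): merged rows, plus the institution rows
    whose county code has no counterpart in the county table.
    """
    by_county = {row[key]: row for row in counties}
    joined, unmatched = [], []
    for inst in institutions:
        county = by_county.get(inst[key])
        if county is None:
            unmatched.append(inst)
            continue
        merged = dict(inst)
        merged.update({k: v for k, v in county.items() if k != key})
        joined.append(merged)
    return joined, unmatched


joined, unmatched = join_on_county(read_rows(IPEDS_CSV), read_rows(CENSUS_CSV))
for row in joined:
    print(row["institution"], row["county_fips"], row["population"])
print("unmatched:", [row["unitid"] for row in unmatched])
```

Checking the unmatched list before exporting is a cheap way to catch county codes that were reformatted or truncated during cleaning.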

Dashboard Overview

The Shiny dashboard, using IPEDS and US Census data, facilitates educational benchmarking through interactive visualizations and K-means clustering. Data is loaded from ipedsgradmassive2.csv, filtered, and analyzed. The UI, defined with tabs for “Introduction”, “K Means”, and “Data Dictionary”, offers specific functionalities. Server logic, employing reactive expressions, updates plots and tables based on user inputs. Plot outputs visualize PCA and clustering results. Data filtering enables users to explore specific clusters. Download handlers allow exporting filtered datasets and data dictionaries in CSV format, enhancing decision-making processes in education.
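To make the clustering step concrete, here is a minimal sketch of k-means together with the quantity a scree plot displays, written in Python rather than the dashboard’s R. It assumes the variables are z-scored before clustering and fits k-means with Lloyd’s algorithm; both are illustrative choices, not a description of Collegium’s internals, and the feature rows are invented.

```python
import random


def standardize(rows):
    """Z-score each column using the population standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has sd 0; fall back to 1.0 to avoid dividing by zero.
    sds = [(sum((x - m) ** 2 for x in c) / len(c)) ** 0.5 or 1.0
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=100, seed=0):
    """Fit k-means with Lloyd's algorithm; return (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids


def total_withinss(points, labels, centroids):
    """Total within-cluster sum of squares, the y-axis of a scree plot."""
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))


# Hypothetical rows: enrollment, graduation rate (%), county median income.
ROWS = [
    [1200, 22.0, 48000], [1500, 25.0, 51000], [1350, 24.0, 47000],
    [1100, 21.0, 49500], [9800, 38.0, 72000], [10500, 41.0, 69000],
    [9100, 36.0, 75000], [11200, 40.0, 71000],
]

scaled = standardize(ROWS)
wcss = {}
for k in (1, 2, 3, 4):
    labels, centroids = kmeans(scaled, k)
    wcss[k] = total_withinss(scaled, labels, centroids)
    print(k, round(wcss[k], 3))
```

Plotting `wcss` against `k` gives the scree (elbow) curve: the value of `k` after which the curve flattens is a common, though subjective, choice for the number of clusters.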

Dashboard Workflow

Instead of figures, we present a link to the working dashboard (https://marksresearch.shinyapps.io/collegium/). The dashboard loads with an image and a title. Using the hamburger menu in the upper left corner, users can navigate to the tab dedicated to refining their selection of peer institutions. The interface accommodates three distinct clustering stages. Users determine the optimal number of clusters via the scree plot, then locate their college in the search box to identify its associated cluster. Using the “Select Cluster” slicer, users can refine their options further, progressing through Levels 2 and 3 to compile a refined list of potential peer institutions. Upon completion, users can employ either random selection or hybrid methodologies from the existing literature to finalize their peer institution selection.
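The staged narrowing in this workflow can be sketched as repeated clustering: cluster the current pool, keep only the institutions that share a cluster with the target college, and repeat. The Python sketch below is an illustration of that idea under stated assumptions, not the dashboard’s R implementation. The function `narrow_peers`, the college names, and the feature values are hypothetical; the features are assumed to be already scaled, and the number of clusters per level (`ks`) stands in for the values a user would read off the scree plot.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_labels(points, k, iters=100, seed=0):
    """Cluster points with Lloyd's algorithm and return one label per point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels


def narrow_peers(features, target, ks=(2, 2, 2)):
    """Narrow a pool of institutions over successive clustering levels.

    features: dict mapping institution name to a list of scaled values.
    ks: number of clusters to use at each level (one entry per level).
    Returns the institutions still sharing a cluster with the target.
    """
    pool = dict(features)
    for k in ks:
        if len(pool) <= k:  # too few institutions left to cluster again
            break
        names = list(pool)
        labels = kmeans_labels([pool[n] for n in names], k)
        target_label = labels[names.index(target)]
        pool = {n: pool[n] for n, lab in zip(names, labels)
                if lab == target_label}
    return [n for n in pool if n != target]


# Hypothetical, already-scaled features for ten made-up colleges.
FEATURES = {
    "College A": [-1.2, -0.9], "College B": [-1.1, -1.0],
    "College C": [-0.9, -0.7], "College D": [-0.6, -1.3],
    "College E": [-0.2, -0.4], "College F": [0.1, 0.3],
    "College G": [0.8, 0.9], "College H": [1.0, 1.2],
    "College I": [1.3, 0.7], "College J": [1.5, 1.4],
}

peers = narrow_peers(FEATURES, "College A")
print(peers)
```

The list that comes out is a shortlist, not a final answer: as the workflow notes, the last step is still a judgment call, made by random selection or by a hybrid method from the literature.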

Conclusion and Availability

Collegium, hosted on GitHub (https://github.com/MPerk78/collegium), advances educational benchmarking by integrating IPEDS and US Census data with K-means clustering. Its user-friendly interface and interactive features empower stakeholders to strategize based on diverse metrics. Valuable for educators, policymakers, and researchers, Collegium fosters collaboration and data-driven strategies. It is open-source and can be adapted for future enhancements, including expansion beyond community colleges and the addition of more datasets. By simplifying data integration and analysis, Collegium streamlines peer institution identification, supporting informed decision-making about institutional performance and student outcomes.

References

Boronico, Jess, and Shail S. Choksi. 2012. “Identifying Peer Institutions Using Cluster Analysis.” American Journal of Business Education 5 (3): 233–44. https://doi.org/10.19030/ajbe.v5i3.6994.
Brinkman, Paul T., and Deborah J. Teeter. 1987. “Methods for Selecting Comparison Groups.” New Directions for Institutional Research 1987 (53): 5–23. https://doi.org/10.1002/ir.37019875303.
Chang, Winston, and Barbara B. Ribeiro. 2021. “Shinydashboard: Create Dashboards with ’Shiny’.” https://CRAN.R-project.org/package=shinydashboard.
Chatman, Steve. 2017. “Constructing a Peer Institution: A New Peer Methodology.” Association for Institutional Research. https://www.airweb.org/docs/default-source/documents-for-pages/reports-and-publications/professional-file/apf-143-2017-summer_constructing-a-peer-institution-a-new-peer-methodology.pdf.
Curry, Denis J. 1972. “The Seven Comparison States: Their Selection, Use and Applicability for Higher Education.” Boulder, Colorado: National Center for Higher Education Management Systems (NCHEMS). https://files.eric.ed.gov/fulltext/ED095768.pdf.
D’Allegro, Mary Lou. 2017. “A Case Study to Examine Three Peer Grouping Methodologies.” Association for Institutional Research. https://www.airweb.org/resources/publications/professional-file/article-142.
D’Allegro, Mary Lou, and Kai Zhou. 2013. “A Case Study to Examine Peer Grouping and Aspirant Selection.” Association for Institutional Research. https://www.airweb.org/resources/publications/professional-file/article-132.
Dowle, Matt, and Arun Srinivasan. 2023. “Data.table: Extension of ‘Data.frame‘.” https://CRAN.R-project.org/package=data.table.
Kassambara, Alboukadel, and Fabian Mundt. 2020. “Factoextra: Extract and Visualize the Results of Multivariate Data Analyses.” https://CRAN.R-project.org/package=factoextra.
Le, Sebastien, Julie Josse, and Francois Husson. 2008. “FactoMineR: An R Package for Multivariate Analysis.” Journal of Statistical Software 25 (1).
Perrier, Victor, Fanny Meyer, and David Granjon. 2023. “shinyWidgets: Custom Inputs Widgets for Shiny.”
“Shiny - Welcome to Shiny.” 2023. https://shiny.posit.co/r/getstarted/shiny-basics/lesson1/index.html.
Szelest, Bruce P. 1996. “In Search of Peer Institutions: Two Methods of Exploring and Determining Peer Institutions.” In Proceedings of North East Association for Institutional Research 23rd Annual Conference, Princeton, New Jersey.
Terenzini, Patrick T., Leif Hartmark, Wendell G. Lorang, and Robert C. Shirley. 1980. “A Conceptual and Methodological Approach to the Identification of Peer Institutions.” Research in Higher Education 12 (4): 347–64. https://doi.org/10.1007/BF00976187.
Weeks, Susan F., Dave Puckett, and Ruth Daron. 2000. “Developing Peer Groups for the Oregon University System: From Politics to Analysis (And Back).” Research in Higher Education 41 (1): 1–20. https://doi.org/10.1023/A:1007089728061.
Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain Francois, Garrett Grolemund, et al. 2019. “Welcome to the Tidyverse.” Journal of Open Source Software 4 (43).
Wickham, Hadley, Romain Francois, Lionel Henry, and Kirill Muller. 2022. “Dplyr: A Grammar of Data Manipulation.” https://CRAN.R-project.org/package=dplyr.
Xie, Yihui, Joe Cheng, and Xianying Tan. 2022. “DT: A Wrapper of the JavaScript Library ‘DataTables’.” https://CRAN.R-project.org/package=DT.
Yan, Ti. 2017. “Program-Level Peer Selection in Benchmarking Costs of Instruction.” In Annual Conference. https://www.neair.org/docs/2017_Conference_Proceedings.pdf#page=213.
Zhao, Jisehn, and Donald C. Dean. 1997. “Selecting Peer Institutions: A Hybrid Approach.” In Thirty-Seventh Annual Forum. Orlando, FL. https://files.eric.ed.gov/fulltext/ED410877.pdf.