Developing Data Science and Advanced Analytics Capabilities to Expand IR/IE Office Portfolios
Data science and advanced analytics capabilities generally refer to a set of methods, skills, and tools aimed at extracting actionable insights and knowledge from enterprise and external data resources to inform decision-making. In data maturity models, these capabilities typically are on the higher end of the analytical hierarchy. Within the context of institutional research, developing data science and advanced analytics capabilities in universities involves establishing a range of essential functions and activities that will go beyond traditional IR activities.
These necessary capabilities should include:
- Leveraging integrated or centralized data resources: In today’s digitalized environment, we cannot afford to rely on spreadsheets or data saved locally on desktop computers to provide robust data solutions. Universities need to build agile data infrastructures, including centralized databases that consolidate data from various systems (admissions, enrollment, financial aid, etc.). Effective data management ensures data is accurate, clean, connected, and accessible across departments. IR/IE professionals need both the expertise to use these resources and access to them to participate fully in the new data environment. For example, at Carnegie Mellon, our data science team has direct access to data assets in the Snowflake data lake, which enables us to use data from across the campus to update our data models and dashboards in real time as newer data is added.
- Automated Dashboards: We need to support the development of a centralized data access portal where data professionals from different functional areas can share their data reports and dashboards. This one-stop shop approach helps break down organizational silos and ensure quality data is shared with data consumers. Advanced analytics activities should include the development of real-time insights through dynamic dashboards on key metrics (e.g., graduation rates, course completion, faculty workloads). IR/IE offices have the expertise and can collaborate with other functional areas to advance their institutions’ analytics capabilities.
- Predictive Analytics: Using advanced modeling techniques like Markov chains, regression models, or machine learning, universities can forecast future enrollment, which is essential for resource planning and financial forecasting. IR/IE offices should actively engage in developing predictive models to identify students with a higher likelihood of attrition or of failing prerequisite courses based on a combination of academic performance, engagement metrics, and demographic data.
- AI and Machine Learning Applications: AI and GenAI tools are developing at an almost dizzying speed. We now have the capabilities to process both quantitative and qualitative data much faster and more intuitively. Data science and advanced analytics can help us analyze qualitative data, such as student feedback, and automate student learning assessment tasks. Increasingly, cloud computing and analytics platform providers such as Amazon, Microsoft, and Dataiku have made advanced data science models available to scale up predictive analytics. IR/IE professionals need to embrace the opportunities to leverage cloud-based modeling algorithms to enhance the agility of their predictive models.
- Scenario Planning and Simulation: Universities use data science models to simulate different strategic scenarios, such as the impact of tuition changes, new academic programs, or changes in student demographics, to guide long-term decision-making. Data science and advanced analytics capabilities will allow IR/IE offices to participate in such activities and be counted as a partner. In enrollment management, marketing, and advancement areas, IR/IE offices can help implement controlled experiments (A/B testing) that measure the effectiveness of different student interventions or programmatic changes, ensuring that decisions are evidence-based.
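To make the A/B testing idea in the last item concrete, here is a minimal sketch of a two-proportion z-test comparing retention between a control group and an intervention group. The function name and the counts are hypothetical, chosen only for illustration; a real analysis would also need to consider how students were assigned to groups and whether the sample is large enough for the normal approximation.

```python
import math

def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Return (z, two-sided p-value) for the difference between two proportions."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled proportion under the null hypothesis of no difference.
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: students retained out of those randomly assigned to each group.
z, p = two_proportion_z_test(success_a=410, n_a=500, success_b=440, n_b=500)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference in retention is unlikely to be due to chance alone, which is the kind of evidence the controlled-experiment approach is meant to provide.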
These capabilities have been recognized not only by the IR/IE community but also by business and IT leaders across higher education in the Change with Analytics Playbook jointly published by AIR, EDUCAUSE, and NACUBO. They are also key skills recognized by Chief Data Analytics Officers in industries outside higher education as critical to developing an organizational strategy for data analytics.
The development of data science and advanced analytics capabilities in colleges and universities is increasingly crucial for several reasons. First, the higher education landscape is becoming more competitive, with institutions vying to attract and retain students while optimizing resources amid fluctuating enrollment and financial pressures. To stay ahead, universities must utilize data analytics to predict trends, improve student outcomes, and guide strategic decisions. Thomas Davenport, in his seminal work, “Competing on Analytics: The New Science of Winning”, pointed to the critical importance of developing advanced analytics as a difference maker in enhancing an organization’s competitiveness.
Second, while ideally universities would develop the data analytics maturity needed to fully realize the potential of data-informed decision-making, many have yet to achieve some of these advanced capabilities. A Chronicle of Higher Education study released in 2023 found that 73% of higher education leaders believe that higher education is behind industry in developing such capabilities. Having advanced analytics capabilities allows universities to develop more precise forecasting, improve resource allocation, and strengthen institutional resilience.
Lastly, the rapid development of AI and Generative AI (GenAI) has revolutionized data processing capabilities, providing unprecedented opportunities for personalized learning, academic performance analysis, and operational efficiency. For IR/IE professionals, this is a moment of truth. Either you learn how to develop the analytics capabilities to ride the wave of AI or you will be left far behind. Universities need to harness these advancements to stay at the forefront of educational innovation and meet evolving expectations.
As demand for data science and advanced analytics has increased, so too has the need for skills and technical competency development among IR/IE professionals. As the profession advances and meets the needs of the AIR Statement of Aspirational Practice, we as professionals need to ensure we are developing the skills that will bring our institutions into the second half of the 21st century. While having access to advanced analytics is an increasing priority for many campus leaders and chief data officers, according to the most recent National Survey of IR Offices, less than 10% of most IR/IE staff members' time is spent working on advanced analysis and data science projects. Instead, most staff time is spent on data gathering and management, and basic analytics. This gap is troubling. As data becomes increasingly important to both strategic and operational decision-making, we must advance the skills of our teams to spend less time gathering data and more time leveraging data to identify trends and inform decision-making.
Building these advanced data management skills, automated dashboards, and robust AI and machine learning capabilities requires a broader skill set than what has typically been required in IR/IE offices. Creating and managing these automated dashboards and reports requires IR/IE professionals to develop skills in database management, information systems, and data security/permissions management. There is an increasing need to be able to build data pipelines that feed automated reports with data drawn from live information systems that provide daily or more frequent updates, giving campus leaders the ability to respond to changes in the enrollment landscape and adjust resources to meet student demand.
Delivering on these opportunities may require IR professionals to learn programming skills in SQL to manage database objects and develop data warehouse resources, and at least basic skills in Python and/or R to assist with data pipeline flows that clean and prepare data for analytical use. With these skills, IR/IE practitioners will be able to better collaborate with IT colleagues and build the data resources that work with the various information systems across campus and meet the business use cases of campus decision makers. It is also important to note that having such technical skills will facilitate IR/IE professionals’ involvement in AI and GenAI program development on campus.
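As a rough illustration of how SQL and Python can work together in such a pipeline, the sketch below loads raw enrollment records, cleans them in Python, and writes a reporting table that a dashboard could query. Everything here is an assumption made for the example: the in-memory SQLite database stands in for a campus data warehouse, and the table names, column names, and records are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for a campus data warehouse (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_enrollment (student_id TEXT, term TEXT, credits TEXT)")
conn.executemany(
    "INSERT INTO raw_enrollment VALUES (?, ?, ?)",
    [
        ("S001", "2024FA", "15"),
        ("S002", "2024fa ", "12"),   # inconsistent term formatting
        ("S003", "2024FA", ""),      # missing credits
        ("S001", "2024FA", "15"),    # duplicate row
    ],
)

def clean_row(student_id, term, credits):
    """Normalize the term code and convert credits to a number (None if missing)."""
    term = term.strip().upper()
    credits = float(credits) if credits.strip() else None
    return student_id, term, credits

# Extract, clean, and de-duplicate the raw records.
rows = conn.execute("SELECT student_id, term, credits FROM raw_enrollment").fetchall()
cleaned = sorted({clean_row(*r) for r in rows}, key=lambda r: r[0])

# Load the cleaned records into a table intended for reporting.
conn.execute("CREATE TABLE enrollment_clean (student_id TEXT, term TEXT, credits REAL)")
conn.executemany("INSERT INTO enrollment_clean VALUES (?, ?, ?)", cleaned)

# A summary query of the kind an automated report might run.
summary = conn.execute(
    "SELECT term, COUNT(*) AS headcount, SUM(credits) AS total_credits "
    "FROM enrollment_clean GROUP BY term"
).fetchall()
print(summary)
```

In practice the same extract, clean, and load steps would run on a schedule against the institution's actual warehouse, with access controls and validation rules agreed with IT colleagues.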
IR/IE offices traditionally work extensively with frozen or snapshot data sets. As we expand our reach into advanced analytics, we will need to be able to use real-time or frequently updating operational dashboards to perform analysis and develop data reports. Using the data resources from operational systems, IR/IE offices will have the tools to track changes over time, identify cyclical trends, and create forecasts that help campus leaders anticipate changes in the enrollment pipeline and adjust to unexpected changes in the environment. As campuses respond to the anticipated changes in enrollment demographics, it becomes even more important to respond quickly when campus leadership requests updates to reports. Building dashboards that draw on sources from the operational systems allows IR/IE practitioners to respond to these requests for updated information in minutes or hours instead of days or weeks.
The shift to data science and advanced analytics will be a challenging process for many IR offices. IR/IE professionals are increasingly tasked with creating predictive models based on past data to forecast future trends. Developing these models requires not only the data infrastructure for advanced analytics but also expertise in feature engineering, machine learning, and data management. Effective communication is crucial to making the model results accessible to non-technical audiences, using clear data visualizations and regular data snapshots. However, not taking on the challenge to enhance our skillsets and services may limit our contribution to campus analytics efforts as AI and GenAI uses are becoming mainstream.
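One of the simpler forecasting approaches mentioned earlier, a Markov chain, can be sketched in a few lines. The example below projects next year's headcount by class level from this year's headcount and a set of year-to-year transition rates. The states, transition probabilities, and headcounts are hypothetical values for illustration; in practice the rates would be estimated from historical student records.

```python
STATES = ["first_year", "sophomore", "junior", "senior", "graduated", "left"]

# Hypothetical year-to-year transition probabilities; each row sums to 1.
TRANSITIONS = {
    "first_year": {"first_year": 0.05, "sophomore": 0.80, "left": 0.15},
    "sophomore":  {"sophomore": 0.05, "junior": 0.85, "left": 0.10},
    "junior":     {"junior": 0.05, "senior": 0.90, "left": 0.05},
    "senior":     {"senior": 0.10, "graduated": 0.87, "left": 0.03},
    "graduated":  {"graduated": 1.0},
    "left":       {"left": 1.0},
}

def project(headcount, new_first_years=0):
    """Advance the headcount by one year and add the incoming class."""
    nxt = {state: 0.0 for state in STATES}
    for state, count in headcount.items():
        for target, prob in TRANSITIONS[state].items():
            nxt[target] += count * prob
    nxt["first_year"] += new_first_years
    return nxt

# Hypothetical current headcount by class level.
current = {
    "first_year": 1000, "sophomore": 900, "junior": 850,
    "senior": 800, "graduated": 0, "left": 0,
}
forecast = project(current, new_first_years=1000)
for state in STATES:
    print(f"{state:>10}: {forecast[state]:.1f}")
```

Calling `project` repeatedly on its own output extends the forecast several years ahead, and changing the transition rates or the size of the incoming class gives a simple way to compare planning scenarios.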
Besides enhancing our advanced analytics capabilities, IR/IE professionals need to reach out to our IT partners, campus data consumers, and institutional leaders to become an essential part of the data analytics fabric on campus. We need to support the building of a coordinated data analytics infrastructure and the creation of systems that support data access and insight-sharing across the institution. This partnership is essential to advancing organizational goals, making data widely accessible, and ensuring that leaders can make data-informed decisions.
Henry Zheng is Carnegie Mellon University's inaugural Vice Provost for Institutional Effectiveness and Planning (IEP). As CMU’s Vice Provost for IEP, Henry oversees the Institutional Research and Analysis and Data Sciences and Advanced Analytics functions. He plays a critical role in leading and facilitating data-informed decision-making at CMU, including the launching of a university-wide community of practice in data analytics and an enterprise-level portal for managerial data reporting. Prior to joining CMU, Dr. Zheng served in several leadership roles including as an Associate Vice President for Strategic Analytics at The Ohio State University and Vice Provost for Institutional Research and Strategic Analytics at Lehigh University.
Matthew Hoolsema serves as Director of Data Science and Advanced Analytics at Carnegie Mellon University. He oversees data modeling and dashboarding projects for university leadership and the campus community, including development of new predictive analytic models to support early interventions with students. Prior to working in higher education, Hoolsema was a K-12 teacher, and he continues to promote data-informed strategies to support student success and institutional effectiveness. He can be contacted via e-mail at mhoolsema@cmu.edu.