Although the COVID-19 pandemic forced a temporary shift in focus, we now see a return to the development and implementation of advanced technologies and to broader uses of Big Data in education and other sectors. The release of OpenAI’s ChatGPT (Generative Pre-trained Transformer) chatbot in November 2022 generated a flurry of warnings about how artificial intelligence (AI) might disrupt higher education. The Atlantic declared that “The College Essay Is Dead.” A TechCrunch headline asked, “Is ChatGPT a ‘virus that has been released into the wild’?” ChatGPT is built on a family of large language models and is fine-tuned with both supervised and reinforcement learning techniques. It is also a Big Data machine: it consumes an enormous amount of text data, about 500 billion words and growing.
Recently, ChatGPT and the even newer Bard and Bing chatbots have grabbed our attention for their novelty and their ability to answer questions in a conversational style. Although these tools carry risks and need refinement, we find the hands-on experience fascinating: they quickly return fairly complete answers and essays of varying complexity. Chatbots such as ChatGPT and image-generating AIs such as DALL-E are the latest AI applications to generate media hysteria. However, other AI-supported systems have already been used in higher education, including Georgia Tech’s AI (Jill Watson) for student tutoring and the U.S. Department of Education’s chatbot for federal financial aid (Aidan). The soaring interest in ChatGPT and other AI tools signals that the AI/ML revolution is accelerating and that a tipping point may soon be here. In this short essay, we briefly describe some of the key factors driving the development of AI and machine learning (ML), as well as the implications and opportunities for Institutional Research (IR) and Institutional Effectiveness (IE) professionals.
Technology Advances in Higher Education
The emergence of Big Data is one of several key facilitating conditions that propelled the adoption of AI and ML in key application areas. According to Gartner, Big Data are the “high-volume, high-velocity and/or high-variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making, and process automation.”
For many years, the data warehouse has been the dominant system for centrally storing and managing large amounts of data from various sources, generally structured data connected to an organization’s enterprise resource planning (ERP) systems. Data warehouses support standardization and data governance (DG) by providing a central repository that is optimized for fast querying, reporting, and analysis.
However, the volume, velocity, and variety of data in the Big Data revolution far exceed what the traditional data warehouse was designed to handle. The Internet of Things (IoT) and the many devices associated with it (e.g., phones, GPS, appliances, and health trackers) are generating much more data, and in many forms. The traditional structured data approach will not work well in the age of IoT. As an alternative to the data warehouse, the data lake can store not only structured data but also raw, unstructured, and semi-structured data. Data lakes are generally built on cloud-based platforms and therefore have far more flexibility to extend their capacity for storing and processing large amounts of data, such as log files, sensor data, and social media data.
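The contrast between the two storage approaches can be sketched in a few lines of Python. This is a minimal illustration only, not a description of any particular warehouse or lake product: the column names, the sample records, and the helper functions (`load_into_warehouse`, `load_into_lake`, `read_from_lake`) are all hypothetical. The warehouse-style loader rejects anything that does not match a fixed schema, while the lake-style loader keeps the raw payload and interprets it only when it is read.

```python
import json

# Hypothetical warehouse table: every row must match a fixed set of columns.
WAREHOUSE_COLUMNS = ("student_id", "term", "credits")


def load_into_warehouse(record: dict) -> tuple:
    """Schema-on-write: reject records that do not match the fixed columns."""
    if set(record) != set(WAREHOUSE_COLUMNS):
        raise ValueError(f"record does not match schema: {sorted(record)}")
    return tuple(record[col] for col in WAREHOUSE_COLUMNS)


def load_into_lake(raw: str) -> dict:
    """Schema-on-read: store the raw payload untouched."""
    return {"raw": raw}


def read_from_lake(entry: dict) -> dict:
    """Interpret a stored payload as JSON, falling back to plain text."""
    try:
        parsed = json.loads(entry["raw"])
    except json.JSONDecodeError:
        return {"text": entry["raw"]}
    return parsed if isinstance(parsed, dict) else {"value": parsed}


# Hypothetical sample data: one structured record, one sensor event, one log line.
structured = {"student_id": 1001, "term": "2023FA", "credits": 15}
sensor_event = '{"device": "badge-reader-7", "ts": "2023-03-01T09:15:00", "signal": -61}'
log_line = "09:15:02 WARN login retry user=1001"

warehouse_row = load_into_warehouse(structured)
lake = [load_into_lake(json.dumps(structured)),
        load_into_lake(sensor_event),
        load_into_lake(log_line)]
lake_views = [read_from_lake(entry) for entry in lake]
```

In this sketch the lake accepts all three inputs, whereas passing the sensor event to `load_into_warehouse` would raise a `ValueError`. The flexibility comes at a cost: the lake defers questions of structure and quality to whoever reads the data later.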
To build AI or ML applications, developers need large volumes of data to train algorithms to detect connections and patterns. ML algorithms learn, find patterns, and gradually develop an understanding of the problem by working through a very large number of iterations, scenarios, and manipulations. According to Analytics India Magazine, ChatGPT’s creators trained the model on databases from the internet that included a massive 570 GB of data sourced from books, Wikipedia, research articles, webtexts, websites, and other forms of content and writing on the net. Approximately 300 billion words were fed into the system to train the model, and the model continues to add to its database. Vigorous debates are under way about the pros and cons of ChatGPT for teaching and learning, as well as its applications in literature searches, grant writing, and beyond. Text-generating bots like ChatGPT are receiving a great deal of attention now, but perhaps we should see them as just one of numerous AI tools, anticipate the growth of others, and proactively consider how they will affect IR/IE. You might enjoy a recent podcast in which Tom Davenport muses on generative AI and argues that the only people who will lose their jobs are those who don’t learn how to work with AI.
Implications for IR/IE
To remain valued colleagues (who have a ‘seat at the table’), IR/IE professionals must contribute to the discussions about AI in higher education. Being involved in these discussions with senior administrative officials can help cement the perception that IR/IE professionals are knowledgeable, broadly skilled, and able to situate issues within the context of their specific campus environment (yes, IR/IE professionals are indeed multi-talented!).
A good first step is to understand the definitions and general uses of AI and its subcomponents, machine learning (ML) and deep learning (DL). Second, consider where, when, and how AI techniques are being considered for or used in higher education, and how these uses are similar to and different from applications in business and industry. Although AI applications are increasingly present in scientific research, think specifically about the potential for AI in higher education administration and student success.
Overall, we believe that AI, when used properly, can assist in student success, improve accessibility and inclusion, and help ensure that higher education is transparent, fiscally responsible, accountable, and ethical. Still, the need to remain consciously aware of responsible and ethical uses of data may pose an even greater challenge with AI. Being cautious of plagiarism and overuse is critical, and IR/IE professionals are reminded of the AIR Statement of Ethical Principles, which was recently updated, in part, to address issues that may occur with AI. A number of writers, including O’Neil, Mathies, and Zeide, as well as recent social movements such as the Algorithmic Justice League, remind us of the risks and potential biases that can result.
If this topic interests you, consider these additional questions:
- Should the potential for bias discourage the use of algorithms in admissions applications (e.g., women in STEM majors)?
- Do algorithms used in advising systems that flag a student as ‘at-risk’ for course or program failure provide more help or harm to student motivation and success?
- What if ChatGPT is used to write one’s admissions essay or a course assignment?
- Does ChatGPT mean that instructors will have fewer instructional tasks, possibly leading to an increase in the number of students per class (decreasing the costs per course)?
- Does the incorporation of virtual teaching assistants allow the institution to offer fewer TA positions, resulting in a cost savings? More broadly, what are the economic and student success implications for the use of more virtual TAs rather than live TAs?
- Beyond chatbots, what does cell phone ping data tell us about students’ locations? Is it possible that we make incorrect assumptions based on cell phone location data alone?
- How can we consider the context of a predictive analysis that includes a high number of predictor variables? If the predictive model says that prospective students who live in green houses are more likely to accept the admissions offer, does that necessarily justify a policy change to admit students who live in green houses? (Context is important.)
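The last question above, about models with many predictor variables, can be illustrated with a small simulation. The sketch below is hypothetical throughout: the function names, the sample size of 50 students, and the use of pure random noise as “predictors” are assumptions made for illustration, not a method taken from any admissions model. It shows why context matters: when enough unrelated variables are screened against an outcome, at least one is likely to look predictive by chance.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def strongest_noise_correlation(n_students, n_predictors, seed=0):
    """Largest absolute correlation between a random outcome and pure-noise predictors."""
    rng = random.Random(seed)
    # Simulated outcome: whether each student accepted the offer (a coin flip).
    accepted = [float(rng.random() < 0.5) for _ in range(n_students)]
    best = 0.0
    for _ in range(n_predictors):
        # A predictor with no real relationship to the outcome.
        predictor = [rng.gauss(0, 1) for _ in range(n_students)]
        best = max(best, abs(pearson(predictor, accepted)))
    return best


one_predictor = strongest_noise_correlation(n_students=50, n_predictors=1)
many_predictors = strongest_noise_correlation(n_students=50, n_predictors=200)
print(f"strongest correlation with 1 noise predictor:    {one_predictor:.2f}")
print(f"strongest correlation with 200 noise predictors: {many_predictors:.2f}")
```

Because both runs share a seed, the 200-predictor run includes the single predictor from the first run, so its strongest correlation can only be equal or larger. None of these simulated variables has any real connection to the outcome, which is the “green house” problem in miniature.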
Opportunities for IR/IE
The questions above are just a few of many, but they remind us of the need to be good thought partners with faculty and administrative colleagues on campus. ChatGPT and other AI tools offer an excellent opportunity for IR/IE professionals to reach in and take an active part in the discussions. Consider how you can contribute to the conversations with these topics:
- Broadly, what is the impact of ChatGPT and other AI tools on campus related to student access, engagement, and success?
- What is the impact of ChatGPT on organizational finance and accountability?
- How can we utilize these data to offer decision support that ensures access, diversity, and student success during and after degree or certificate completion?
- What are the security and ethical issues (as well as data sharing laws) to be addressed?
- Do you have an elevator speech on how your IR/IE team can contribute to AI/ML developments on campus (just in case you get stuck in the elevator with the President)?
AI/ML applications rely on large amounts of data for training. Data quality, variety, and volume are important, and this is where IR/IE professionals can shine. We must collaborate with campus partners to ensure efficient, transparent, and effective collection, management, and analysis of data in our rapidly changing and highly technical environment. This requires us to reach out through proactive collaboration and remain a central contributor in DG plans and DG committee formation. We can’t emphasize enough the importance of involvement in your campus’s DG process!
Although many tasks keep IR/IE officials plenty busy each day, the world around us requires that we remain up-to-date with new topics and techniques. We encourage you to scan relevant resources daily, such as The Chronicle of Higher Education and Inside Higher Ed, AIR Professional File, on-demand webinars at AIR, Research in Higher Education, AIR Hub, and EDUCAUSE’s Digital Transformations news. In addition, guidance and mentorship from other IR colleagues are a great way to learn. Engage with colleagues at professional meetings such as AIR Forum, on AIR Hub, through AIR Coffee Chats, and via good old-fashioned phone calls.
AI applications will increasingly simplify routine tasks and data processing. Consider what new skills and techniques your office and staff need. Do you need to update your skills in data science or data visualization? Can these tools enhance your data analysis, summary, or storytelling skills? For example, ChatGPT can help an analyst document SQL code, write a Python program, or summarize key articles on a particular subject. It can also help analyze unstructured data from social media feeds or student feedback forums. IR/IE professionals should leverage these opportunities to improve their effectiveness.
Today’s work in IR/IE grows larger, more complex, and more challenging each day. Although it may pose some daunting challenges, this is an exciting time that offers IR/IE professionals a tremendous opportunity to shine and be recognized as valued colleagues on campus. AI will likely become essential in the IR professional’s toolbox, complementing the use of qualitative research, statistical analysis, and data visualization (some of which can also be done through AI). However, proficiency with these tools alone isn’t enough; effective practice also requires a thorough understanding of the data-informed decision-making process, and IR/IE professionals are uniquely positioned for success.
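As a point of comparison for the unstructured-data use case above, here is a minimal keyword-based sketch in Python. It is not how ChatGPT works and does not call any AI service; it is a simple hand-built baseline that an office might use to sanity-check themes that an AI tool reports. The theme names, keyword lists, and sample comments are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical theme keywords; a real project would develop and validate
# these with campus partners rather than hard-code them.
THEMES = {
    "advising": {"advisor", "advising", "registration"},
    "cost": {"tuition", "fees", "cost", "expensive"},
    "teaching": {"professor", "lecture", "instructor", "class"},
}


def tag_themes(comment: str) -> set:
    """Return the themes whose keywords appear in a free-text comment."""
    words = set(re.findall(r"[a-z']+", comment.lower()))
    return {theme for theme, keywords in THEMES.items() if words & keywords}


def summarize(comments) -> Counter:
    """Count how many comments mention each theme at least once."""
    counts = Counter()
    for comment in comments:
        counts.update(tag_themes(comment))
    return counts


# Hypothetical student feedback comments.
comments = [
    "My advisor helped me fix my registration hold quickly.",
    "Tuition and fees are too expensive for part-time students.",
    "The professor made every lecture engaging.",
    "Parking is impossible near the library.",
]
theme_counts = summarize(comments)
```

A baseline like this misses anything outside its keyword lists (the parking comment goes untagged), which is exactly the kind of gap where a language model may add value and where an analyst’s judgment is still needed to check the result.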
We believe that several facets of AI in higher education will remain relevant for some time. We encourage you to consider the implications for you and your institution. Don’t let the opportunities to grow and thrive pass you by!
Thanks to Michelle Appel, Eric Atchison, Matt Grandstaff, and Mike Urmeneta for reviewing previous versions of this essay.
Henry Zheng is Vice Provost of Institutional Effectiveness and Planning at Carnegie Mellon University.
Karen Webber is Professor Emeritus in the McBee Institute of Higher Education at The University of Georgia, former Director of IR, Associate Provost for IE, and the 2022-2023 AIR President.