The collection, analysis, and reporting of data in ethical ways have always been foundational to institutional research. Ethical research requires not only a high degree of veracity and accuracy in our work, but also recognition of the broad range of variables that can influence processes and outcomes. The coronavirus pandemic has brought about new societal challenges, and national conversations about racial injustice have brought longstanding issues to the forefront. Without serious consideration of these challenges, we risk introducing new biases, or amplifying old ones, in our analyses and our processes.
While it has never been reasonable to assume that our population has equal access to education, the pandemic has exacerbated the inequity. The digital divide that has always existed in our society has had a much greater impact as students in areas with poor internet access struggle to connect to the online instruction that substitutes for in-class experiences. Many parents and guardians have had to shift from providing at-home educational support for their K-12 children to being at-home teachers or tutors, which may be more problematic for particular subject areas or for families whose primary language is not English. And there has surely been a greater need among lower-income families for students to care for siblings or to work to supplement family incomes, creating significant distractions from schoolwork. These issues likely impact both prospective and current college students.
The effects of the pandemic reach beyond our students, of course, to our faculty and staff. With child care and school facilities closed, many parents, often mothers, have had two full-time jobs as they attend to both their careers and their children. While this is likely to prove a career disadvantage to working parents generally, it may be especially problematic for assistant professors who have limited time frames in which they are expected to gain tenure and advance to associate professor.
Recent conversations about racial injustice have reinforced that it is not enough to remove explicit racial bias, or even to ensure that people don't have personal racial animosity. Basing our judgment of a student’s scholarly ability primarily on outcomes without consideration of race disregards the institutional and structural factors that impact these outcomes unfairly. For example, research shows that “Black and Latino students are far more likely to attend schools in low-income neighborhoods, which is tied closely to academic achievement, in part because of a lack of resources…. Disproportionate rates and severity of discipline begin in preschool and extend over the years of Black children’s education.” Understanding and accounting for these factors is critical to attaining equity.
Our current data collections do not provide us with all of the information we need to address the inequities intensified by the pandemic, and our analyses do not consider all of the factors necessary for a racially just system. Institutional research professionals must collaborate with their colleagues in admissions, enrollment services, and student life to better understand the people and issues represented by the data.
We must ask ourselves the following questions:
- How do we ensure that analyses in which we engage address these issues in a fair and equitable way?
- How do we use data to support students who might be equally talented but had very different support systems and opportunities during the pandemic?
- What do the disparate experiences of students during the pandemic reveal to us about students with different opportunities during “normal” times?
- How do we identify students who may need additional support and figure out what they need to reach their goals?
To answer these questions, IR professionals should consider the following:
Make sure your analyses are fair. For example, we know that many algorithms have bias built into them. This bias comes from factors like how the data that feed into them are chosen, biased training data, and the unequal access to resources mentioned above. Whether IR professionals are creating algorithms in-house or partnering with external vendors, there are steps they should take to ensure that their algorithms are fair. First, they should ensure that the training data are representative of the institution’s current population. Consider how far back the training data go: it is important to balance having the data be as recent as possible, so that they reflect the current population, against ensuring the sample is large enough to train the algorithm reliably. Next, staff should test the algorithm on their own data and conduct disparate impact analyses. If the algorithm over-identifies a certain group as “at risk,” that may be because students in that group are, indeed, at greater risk of failing. With the knowledge that a group of students is being over-identified, the institution should develop a plan to focus support on that population while continuing to test the algorithm to confirm that the identifications are accurate.
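The disparate impact analysis described above can be sketched in a few lines of code. The example below is a minimal, hypothetical illustration (the function names, groups, and data are invented for this sketch, not drawn from any real institutional dataset): it compares each group’s “at risk” flag rate to a reference group’s rate, so a ratio far from 1.0 signals that the model may be over- or under-identifying a group and warrants closer review.

```python
# Hypothetical sketch of a disparate impact check on an "at risk" flag.
# All names and data here are invented for illustration.
from collections import defaultdict

def flag_rates(records):
    """Share of students flagged 'at risk' within each group.

    records: iterable of (group, is_flagged) pairs.
    """
    flagged = defaultdict(int)
    totals = defaultdict(int)
    for group, is_flagged in records:
        totals[group] += 1
        flagged[group] += int(is_flagged)
    return {g: flagged[g] / totals[g] for g in totals}

def disparate_impact(records, reference_group):
    """Ratio of each group's flag rate to the reference group's rate.

    Ratios well above 1.0 suggest a group is being over-identified
    relative to the reference group and should be examined further.
    """
    rates = flag_rates(records)
    ref = rates[reference_group]
    return {g: rates[g] / ref for g in rates}

# Toy data: (group, flagged_at_risk)
records = [("A", True), ("A", False), ("A", False), ("A", False),
           ("B", True), ("B", True), ("B", False), ("B", False)]
print(disparate_impact(records, "A"))  # group B is flagged at twice group A's rate
```

In practice the same comparison would be run on the institution’s actual model output, broken out by the demographic categories the institution tracks, and a large ratio would trigger both the support plan and the accuracy checks described above.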
Discover what COVID has done to your data. We know that the disruptions of the COVID-19 pandemic have changed student behaviors, and thus the data that institutions have access to are different in profound ways. For instance, as colleges pivoted online, use of learning management systems (LMS) increased significantly. For colleges that used interactions with the LMS as a predictor of student success, the algorithm predicted that many more students would be successful. Of course, this was not the case. Instead, the increased use of the LMS was simply a byproduct of the forced move online, a change that left many students less successful. Chances are the pandemic has changed other data points at our institutions as well. IR staff should interrogate their data to find where the pandemic has changed the data themselves or what they mean.
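One simple way to interrogate a predictor for pandemic-driven change is to compare its distribution before and after the pivot online. The sketch below is hypothetical (the function, variable names, and login counts are invented for illustration): it reports the relative shift in the mean of a predictor such as weekly LMS logins, so a large shift flags the variable for re-examination before it is trusted in a model trained on pre-pandemic data.

```python
# Hypothetical sketch: detect whether a predictor's distribution shifted
# after the pivot to remote instruction. All data here are invented.
import statistics

def shift_report(pre, post):
    """Compare pre- and post-pandemic values of a predictor and
    report the relative change in its mean."""
    pre_mean = statistics.mean(pre)
    post_mean = statistics.mean(post)
    return {
        "pre_mean": pre_mean,
        "post_mean": post_mean,
        "relative_change": (post_mean - pre_mean) / pre_mean,
    }

# Made-up weekly LMS login counts for the same course, before and after the pivot
pre_logins = [3, 4, 2, 5, 3, 4]
post_logins = [9, 11, 8, 12, 10, 10]
report = shift_report(pre_logins, post_logins)
print(report)  # a large relative_change flags the predictor for re-examination
```

A fuller analysis would compare whole distributions rather than means (for example, with a population stability index or a two-sample test), but even this crude check would have surfaced the LMS example above: the predictor’s meaning changed even though the model’s inputs looked healthier than ever.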
Collaborate and share data to identify and address the need for support services, technology, and internet access. We have long realized the value of comparative data to best understand the needs of our students, faculty, and staff. Our institutions have for decades shared data through consortia, survey collaborations, and governmental education departments. These datasets have allowed us to answer critical questions about national trends in enrollment, faculty hiring, majors of interest, and much, much more. COVID-19 has reinforced the tremendous value of sharing data across institutions. How can you know, for example, that increasing equity gaps are due to the pandemic and not to issues specific to your particular institution, unless you can see that institutions across the country are struggling with the same issues at the same time? And how can your institution know how to address such issues unless it can determine the underlying cause? Especially for the pandemic, data collections through surveys and focus groups are probably the most telling sources of information about our students’ experiences. When we share those data, we learn about common issues and can then go forward to share and discover best practices as well. We may have already known this, but the pandemic has underscored the fact that institutional research is at its best and most useful when it is collaborative and shared.
As the pandemic continues to impact all aspects of data and data collection, IR professionals will need to employ a number of tools and tactics to ensure the ethical and effective use of those data. We need to work to address inequities intensified by the pandemic even knowing that our analyses do not consider all of the factors necessary for an equitable system.
Ethical Use of Data: Scenarios for Discussion
by Julie Carpenter-Hubin and Donald C. Hubin
Julie Carpenter-Hubin served as Assistant Vice President of Institutional Research and Planning at The Ohio State University, retiring in 2019. She chaired the 2010 Association for Institutional Research (AIR) Annual Forum, served as the Association’s President for 2012–13, and is currently a member of the eAIR Editorial Task Force. Julie represented OSU to the Association of American Universities Data Exchange and served on its governing council 2004–07, chairing that body in 2005–06. She was a member of the National Research Council’s Data Panel, which advised the NRC on the data collection for its 2010 Assessment of Research-Doctorate Programs. Julie currently serves as a peer reviewer for the Higher Learning Commission (HLC), and is a member of the HLC Institutional Actions Committee. Julie has consulted with and advised colleges and universities nationally and internationally regarding the structure and responsibilities of their institutional research teams. Julie holds a B.A. in German Languages & Literatures and an M.A. in public policy and administration, both from OSU.
Iris Palmer is deputy director for community colleges with the Education Policy program at New America. She is a member of the higher education team and also works closely with the Center on Education & Labor. She provides research and analysis on community colleges, adults enrolled in higher education, apprenticeship, and the ethical use of predictive analytics in higher education. Iris previously worked at the National Governors Association on postsecondary issues, where she helped states strengthen the connection between higher education and the workforce, support competency-based systems, use data from effectiveness and efficiency metrics, and improve licensure for veterans. Prior to joining NGA, she worked at HCM Strategists on the Lumina Foundation’s initiative to develop innovative higher education models, including new technologies and competency-based approaches. Before joining HCM Strategists, she worked at the U.S. Department of Education in all of the offices related to higher education: the Office of Career, Technical, and Adult Education; the Office of Postsecondary Education; the Policy Office; and the Office of the Undersecretary. She received her master of public policy degree from George Mason University and her undergraduate degree in political science from Goucher College.