Datathons4Justice Address Social Justice Issues with Data Science
The world is currently facing many complex challenges. Issues such as climate change, war, and rising authoritarianism exist alongside (and are sometimes linked with) ongoing injustices based on race, gender, gender identity, sexual orientation, and other personally- and socially-constructed identities. Effective responses to many of these challenges have thus far remained elusive.
The growing field of data science can provide the research base to identify and enact ambitious, compelling, and evidence-based solutions to global problems. Emerging and established data scientists must ask compelling and nuanced research questions; find, procure, clean, and analyze data from many sources; and draw reasoned conclusions that activists and policymakers can leverage during the decision-making process. In short, we need a well-trained generation of data scientists who apply their craft to real-world scenarios and create effective solutions.
Data science’s connection to issues of criminal justice and economic equity is clear. Communities of color experience disproportionate impacts from policing practices, judicial sentencing, hiring biases, limited social mobility, voting restriction laws, and a host of other obstacles. Similarly, women and LGBTQIA+ individuals face less economic mobility, poor representation in the arts and media, and diminished educational pathways when compared to their male and/or heterosexual counterparts. However, the connections in other forms of activist data science may be more subtle. For example, the general public is often less aware that climate change disproportionately impacts communities of color, human trafficking targets women and children of color at disproportionately high rates, healthcare is delivered to individuals of color in a vastly inferior manner, and light pollution more severely affects the quality and quantity of sleep for lower socioeconomic status groups and people of color. Data scientists must work in concert with mathematicians, physical and biological scientists, social scientists, humanities scholars, and activists to propose solutions to these types of discrimination.
To simultaneously train emerging data scientists and call attention to specific social justice issues, universities, companies, and other organizations should consider hosting a datathon — in particular, a Datathon4Justice.
QSIDE’s Inaugural Datathon4Justice
In October 2021, the Institute for the Quantitative Study of Inclusion, Diversity, and Equity (QSIDE) hosted its inaugural Datathon4Justice. More than 10 research teams of students, faculty members, and activists from more than 30 institutions of higher learning and mission-driven organizations took part in the event, which took place virtually. QSIDE’s Datathon4Justice focused on two unique datasets: (i) judicial sentencing disparities in the Minnesota court system and (ii) police data from a small town in rural western Massachusetts with a troubling history of reported racial and gender bias. Some participating schools could support their own teams, while other teams comprised a mix of registrants from different schools and organizations. Groups worked with the provided datasets for two days before reporting their findings to everyone.
This initial Datathon4Justice inspired two research labs—the Small Town Policing Accountability (SToPA) lab and JUdicial System Transparency for Fairness through Archived Inferred Records (JUSTFAIR) state research lab—to continue the projects. These labs meet weekly to study state-level judicial sentencing disparities and police accountability, and more than 50 scholars and activists regularly engage in the ongoing work. QSIDE’s datathon also inspired subsequent datathons at a number of other institutions, including Tufts University and the University of Utah. In addition, QSIDE partnered with the National Math Festival to create a curriculum and support a High School Datathon4Justice earlier this year; MathWorks Math Modeling Challenge, a program of SIAM, provided additional support for this event. The datathon saw four U.S. high schools engage in introductory data science with RStudio to analyze the dataset from QSIDE’s existing research about diversity in major U.S. art museums.
The University of Utah’s Datathon4Justice
In March 2022, the University of Utah hosted its own in-person Datathon4Justice that concentrated on light pollution, an environmental justice issue in Salt Lake City. Communities of low socioeconomic status are often located close to industrial spaces, which are typically well lit. The emitted light can impact the health of residents in nearby neighborhoods in a number of ways, including poor sleep quality. Furthermore, streetlights in Salt Lake City’s lower socioeconomic neighborhoods tend to have brighter and harsher bulbs; the resulting glare can negatively affect driving conditions and cause more accidents involving cars, pedestrians, and/or bicycles. For these and other reasons, light pollution is particularly relevant to the local community in Salt Lake City and thus made for a timely topic for datathon participants.
The Datathon4Justice consisted of three components: (i) preparatory workshops, (ii) the datathon itself, and (iii) opportunities for teams to present their work. A month before the event, organizers from the University of Utah’s Department of Mathematics and School of Medicine developed and ran a four-session workshop series to prepare students. Each session was attended by 15-20 students and tailored to specific groups of participants based on their comfort levels with Python programming.
The first two sessions were meant for non-programmers; attendees primarily came from social sciences backgrounds and were interested in conceptually understanding the programming basics to better communicate with their teammates. The last two sessions targeted students with more programming experience. For example, session three discussed exploratory data analysis and examined the ways in which data science can drive meaningful change, while session four introduced two Python packages: GeoPandas and Scikit-learn. This final session utilized the Datathon4Justice dataset as an example and served as a data-focused introduction to the datathon; the justice-focused introduction occurred during the kickoff event. Exit survey results indicated that students felt more comfortable with data science and Python packages after the workshop and were more confident taking part in the datathon.
To obtain data for the event, organizers contacted Daniel Mendoza—a professor in the Departments of Atmospheric Sciences, Internal Medicine, and City and Metropolitan Planning at the University of Utah—to discuss potential environmental justice topics that would work well in the datathon format. Mendoza is the coordinator of Utah’s Dark Sky Studies minor, and he quickly suggested light pollution as a theme — particularly since he already possessed large datasets of light fixtures and crashes within Salt Lake City. He also recommended that the organizers consult with Alpha Lambert and Brenna Connely—two students in the Dark Sky Studies capstone course—when preparing potential questions for participants.
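As a rough illustration of the kind of exploratory question that such light fixture and crash data invites, the following minimal sketch assigns each crash to its nearest light fixture and counts crashes per fixture. Everything in it is an assumption made for illustration: the record fields (`id`, `lat`, `lon`), the coordinates, the 100-meter cutoff, and the helper names are hypothetical and do not come from the datathon dataset. It uses only the Python standard library; with GeoPandas, a nearest-neighbor spatial join such as `sjoin_nearest` would likely be the more natural tool for the same step.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_fixture(crash, fixtures):
    """Return (fixture, distance_m) for the fixture closest to a crash."""
    return min(
        ((f, haversine_m(crash["lat"], crash["lon"], f["lat"], f["lon"])) for f in fixtures),
        key=lambda pair: pair[1],
    )


def crashes_by_fixture(crashes, fixtures, max_dist_m=100.0):
    """Count crashes per fixture id, skipping crashes whose nearest fixture is beyond max_dist_m."""
    counts = {f["id"]: 0 for f in fixtures}
    for crash in crashes:
        fixture, dist = nearest_fixture(crash, fixtures)
        if dist <= max_dist_m:
            counts[fixture["id"]] += 1
    return counts


# Made-up records for illustration only; these are NOT the datathon data.
fixtures = [
    {"id": "A", "lat": 40.7608, "lon": -111.8910},
    {"id": "B", "lat": 40.7700, "lon": -111.9000},
]
crashes = [
    {"lat": 40.7609, "lon": -111.8911},  # a few meters from fixture A
    {"lat": 40.7607, "lon": -111.8909},  # a few meters from fixture A
    {"lat": 40.7701, "lon": -111.9001},  # a few meters from fixture B
    {"lat": 40.8000, "lon": -111.9500},  # kilometers away from both fixtures
]

counts = crashes_by_fixture(crashes, fixtures)
print(counts)
```

A count like this is only a starting point. Any comparison across neighborhoods would also need to account for traffic volume, fixture brightness, and demographic context before supporting a conclusion.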
During the Datathon4Justice kickoff festivities, QSIDE co-founders Chad Topaz and Jude Higdon spoke about QSIDE’s work and the impact of their initial datathon. Lambert and Connely then characterized light pollution as an environmental justice issue, particularly in Salt Lake City. Team formation and initial groupwork time followed these introductions.
The next morning, 10 teams with three to four members each regrouped to continue their assessments. Generous support from the Utah Center for Data Science provided coffee, a light breakfast, and lunch. Throughout the day, experienced data scientists—both graduate students and faculty—stopped by to answer questions and engage with teams. The event concluded that afternoon with a final address by Mendoza that further described his ongoing research on light pollution’s effects in the Salt Lake City area.
Datathon participants were genuinely excited to use their mathematical and computational skills to address relevant, real-world issues. Mick Wagner, a third-year applied mathematics undergraduate, reveled in the realization of career opportunities at the intersection of mathematics and social justice. “I have always assumed that to survive capitalism, I would have to keep these interests separate,” Wagner said. “But after the datathon, I was excited to see people and companies working towards something I thought was just a dream career.”
After the event, teams had the opportunity to present their work at the university’s Undergraduate Research Symposium on April 5 and QSIDE’s virtual Student Quantitative Action Research for Equity, Diversity (SQuARED), and Justice Conference on April 16. At the research symposium, five teams debuted polished, refined, and interpreted findings based on their initial analyses at the datathon; for some participants, this was the first research and presentation opportunity of their academic careers. The subsequent SQuARED Justice Conference, which took place as an online poster session with undergraduate and graduate student research from 15 colleges and universities, attracted more than 100 attendees from around the world.
Datathons engage students and faculty by showing participants that they can utilize computational math and data science to address quantitative questions about social justice. The success of both QSIDE’s and the University of Utah’s Datathons4Justice demonstrates that these types of events effectively encourage students from interdisciplinary backgrounds to explore science, technology, engineering, and applied math fields.
Acknowledgments: We would like to acknowledge the QSIDE Institute, the Utah Center for Data Science, and the University of Utah’s Office of Undergraduate Research for their support in organizing Utah’s Datathon4Justice.
About the Authors
Trent DeGiovanni
Ph.D. Candidate, University of Utah
Trent DeGiovanni is a Ph.D. candidate in the Department of Mathematics at the University of Utah.
Wesley Hamilton
MathWorks
Wesley Hamilton works at MathWorks on STEM outreach and workforce development initiatives. He was previously a Wylie Assistant Professor at the University of Utah. He also serves on the SIAM Education Committee.
Rebecca Hardenbrook
Ph.D. Candidate, University of Utah
Rebecca Hardenbrook is a fourth-year Ph.D. candidate in applied mathematics and a 2018-2019 Global Change and Sustainability Center Fellow at the University of Utah.
Jude Higdon
Chief Operations Officer, Institute for the Quantitative Study of Inclusion, Diversity, and Equity
Jude Higdon currently serves as the Chief Operations Officer for the Institute for the Quantitative Study of Inclusion, Diversity, and Equity, as well as the Chief Information Officer and Director of Technology at Bennington College.
Owen Koppe
Undergraduate Student, University of Utah
Owen Koppe is an undergraduate student at the University of Utah, where he studies applied mathematics and computer science.
Keshav Patel
Graduate Research Fellow, University of Utah
Keshav Patel is a National Science Foundation Graduate Research Fellow at the University of Utah.
Chad M. Topaz
Co-Founder, Institute for the Quantitative Study of Inclusion, Diversity, and Equity
Chad M. Topaz is the co-founder of the Institute for the Quantitative Study of Inclusion, Diversity, and Equity, a professor of complex systems at Williams College, and an adjunct professor of applied mathematics (by courtesy) at the University of Colorado-Boulder.