“☑ Fairness Toolkits, A Checkbox Culture?” On the Factors that Fragment Developer Practices in Handling Algorithmic Harms
Abstract
Fairness toolkits are developed to support machine learning (ML) practitioners in using algorithmic fairness metrics and mitigation methods. Past studies have investigated practical challenges in toolkit usage, which are crucial to understanding how to support practitioners. However, the extent to which fairness toolkits impact practitioners' practices and enable reflexivity around algorithmic harms remains unclear (i.e., distributive unfairness beyond algorithmic fairness, and harms that are not related to the outputs of ML systems). Little is currently understood about the root factors that fragment practices when using fairness toolkits and how practitioners reflect on algorithmic harms. Yet, a deeper understanding of these facets is essential to enable the design of support tools for practitioners. To investigate the impact of toolkits on practices and identify the factors that shape these practices, we carried out a qualitative study with 30 ML practitioners with varying backgrounds. Through a mixed within- and between-subjects design, we tasked the practitioners with developing an ML model and analyzed their reported practices to surface potential factors that lead to differences in practices. Interestingly, we found that fairness toolkits act as double-edged swords, with potentially positive and negative impacts on practices. Our findings showcase a plethora of human and organizational factors that play a key role in the way toolkits are envisioned and employed. These results bear implications for the design of future toolkits and educational training for practitioners, and call for the creation of new policies to handle the organizational constraints faced by practitioners.