Princeton University's Ruha Benjamin on bias in data and AI

Key Takeaways:

The least apparent bias is often the most dangerous. We can't prevent the consequences of bias or even take it seriously if we're not aware of it — or worse, choose to ignore its less obvious manifestations.
Widen the lens and don't settle for 'happy talk.' Diversity is not the status quo for most organizations, so it should make us uncomfortable. If it doesn't, it may be a sign that you're stopping the conversation short.
Focus on building the right team before you start building AI systems. Diversity needs to start from the groundwork that happens before the foundation is poured.
"Big me up." Surround yourself with role models and people who build you up rather than tear you down — and be a support system for them as well.

Key Quotes:

How does flawed data scale into racist and sexist AI systems?

These systems [rely] on historic data, historic forms of decision-making practices that then get fed into the algorithms to train them how to make decisions. And so if we acknowledge that part of that historic data and those patterns of decision-making have been discriminatory, it means that both the data and oftentimes the models, which are built to make important, sometimes life-and-death decisions for people, are being reproduced under the guise of objectivity.

The main danger of it, perhaps more dangerous than human bias, is that we assume that the technology is neutral and objective. So we don't question it as much -- not as much as we would [with] a racist judge or doctor or police. We think, 'Oh, it's coming to us through a computer screen. Okay. I give these resources to this patient and not this patient.' And [we] don't question where that decision came from.

On the consequences of facial recognition biases in policing (as noted by the work of Algorithmic Justice League's Joy Buolamwini) and why it's important to widen the lens of how these technologies are used in society as a whole:

If [there's a predominance of what Joy Buolamwini calls 'pale male faces' training] your data, then your system becomes very well-equipped to identify individuals who fit that profile and not as suitable to identify people that have other phenotypes. That's certainly one dimension of the policing context that her work has helped raise attention to.

But I would also say that in addition to thinking about training data and the algorithms, we need to look at the context in which these systems are used. Both the inputs and outputs of these systems. If facial recognition or predictive policing are being used by institutions and organizations that have a long track record of profiling certain communities over others, then in some ways they can be very successful at identifying, let's say, people with darker phenotypes -- but it's still going to be used to harm those communities because that's where the police are focused, or that's where the organization is focused.

The key thing for me is it's not simply about making the technologies better at doing what they say they're supposed to do, but it's also widening the lens to think about how they're being used, what kinds of systems they're being used in, and bring the question back to society, not just the designers of technology.

How do we widen the lens and move away from the 'happy talk' that frames diversity as a feel-good issue only when it suits the interests of a corporation or organization?

A lot of this has to do with the kinds of teams, both the diversity of knowledge and the diversity of people. As we saw with some high profile examples, both Joy Buolamwini and Timnit Gebru [the researcher forced out of Google for criticism over its AI's inaccuracies in identifying women and people of color], so much of what goes under the umbrella of diversity, equity, and inclusion is what sociologists call 'happy talk.' We want to celebrate diversity and think about what it gives us as a company or as an organization, but when it feels like that diversity is causing trouble or holding things up or making work difficult, all of a sudden it's not a welcomed difference anymore.

And we saw that with Timnit being pushed out and with the backlash against Joy's work. And so I think one of the first things we can do when we think about diverse teams is to understand that that diversity is supposed to make us uncomfortable with the status quo. It's not simply supposed to make us feel good or yield more profit.

On the opportunities missed when the pipeline shuts out people who could cause what the late civil rights pioneer and Congressman John Lewis called 'good trouble':

I think we can point to the problem much earlier in the process, the so-called pipeline, where many people who would be able to point out these issues don't even get the chance to. They don't even get the opportunities, the internships, the positions, the training in order to really be heard in the first place. And so certainly the kind of gaslighting in these more high profile cases is ongoing, but at the same time, we have so many people with potential who could be contributing to more socially conscious design and technology that never get the opportunity to make good trouble, as it were.

Do bias detection and transparency in an established system help, or is this approach too late in the process?

I do think it's too late. I think once a technology is created and rolled out, there are too many different people that are invested in maintaining it and ensuring that it continues. People have already invested so many hours and resources, so I think anything that comes down the line, it's very unlikely to make meaningful change because the stakeholders [in] maintaining that are just too invested. And so we need to have things in place before it gets to that point.

Since starting from scratch isn't always feasible, how can leaders address gaps and bias in data?

Like we label food in terms of [its] quality ... there are some initiatives that I've seen that are trying to do something similar with labeling datasets that account for bias and other characteristics of the data. Because a lot of times people just adopt data that already exists. They're not necessarily creating and producing new data and they don't necessarily have robust systematic mechanisms to evaluate the quality of that data -- the gaps in that data, the bias in it. And so there are different kinds of labeling and monitoring for the procurement of data that I think would be helpful. And I would point to the work of Rashida Richardson who has co-authored a wonderful piece, 'Dirty Data, Bad Predictions,' that offers some really good policy and legal prescriptions about this process of procurement that I think will be helpful.

As a visionary woman and change agent in the technology sector, how has Ruha found her voice?

I really do surround myself with women who 'big me up' ... it's a kind of West Indian Patois expression. I think it's really important to surround myself with women who reflect my highest self back at me. Also, I have lots of role models, and not role models who necessarily know that they're my role models, who sit down and mentor me, but people who I look to and look at their trajectory, including Professor Alondra Nelson. And so having those people to see how they navigate is really something that has helped me find my own voice. And lastly, I would say the thing [in] my own kind of self-talk is to remind myself that the people who are in positions of power, who monopolize those positions, got there through gender and racial preferences, not necessarily [because they were] the most qualified. And so it's sobering for me to just remind myself that the world is not a meritocracy.

On Ruha's motivation for founding the Ida B. Wells Just Data Lab:

I definitely look ahead, and that's one of the motivations for me creating the Ida B. Wells Just Data Lab. I've created a goal for myself to mentor a hundred students every year to work on data justice. And so looking ahead to me, that's what keeps me hopeful and optimistic is that the students coming up now, they want the technical know-how, but they also want the social and historical know-how to create systems that actually shine a light on power and that work to undo unfair systems. And so the fact that I get to hang out all day with students and mentor them, and also try to imagine creating different, not just technologies, but societies. That's the key thing. When I say technology is not going to save us, it means we have to think about the larger social structures and relationships and work on that as much as we work on the technologies. [Data leaders who want to support and contribute to this work can go to thejustdatalab.com for details.]

More About Ruha:

Ruha Benjamin is a professor of African American Studies at Princeton University and the founding director of the Ida B. Wells Just Data Lab. She has studied the social dimensions of science, technology, and medicine for over fifteen years and speaks widely on issues of innovation, equity, health, and justice in the U.S. and globally.

Ruha's second book, Race After Technology: Abolitionist Tools for the New Jim Code, examines the relationship between machine bias and systemic racism, analyzing specific cases of "discriminatory design" and offering tools for a socially conscious approach to tech development. She is also the editor of Captivating Technology.

Ruha also recommends the workbook, Advancing Racial Literacy in Tech.
