Detecting Bias in Generative AI
Studies show AI produces biased output based on race, sex, disability, dialect, and psychiatric diagnosis.
The widespread use of generative AI for evaluation increases the potential harm caused by unmitigated bias.
Strategic prompting is an evidence-based way to assess AI bias, but it has limitations.
Human automation bias leads users to trust AI output as objective, making them less skeptical of it.
Imagine you’re polishing your resume for a research job. You ask ChatGPT for feedback from a hiring manager’s perspective. Say your experience also includes membership in a disability advocacy organization and a few awards for your advocacy work.
Would you question whether ChatGPT’s feedback would be the same for an identical resume without the disability advocacy?
We don’t have to wonder about this. In a 2024 experiment,1 researchers asked ChatGPT to compare two resumes: one with a leadership award for autism advocacy and another without this award, but identical in all other aspects. ChatGPT concluded that the first resume showed “less emphasis on leadership roles in projects and grant applications” than the second.
ChatGPT ranked a candidate with more leadership experience lower than a candidate with less leadership experience due to the presence of the term “autism.” Unfortunately, this pattern appears across other identity categories as well, and understanding why is the first step to mitigating this bias.
Why the Default Isn’t Neutral
Racial and gender biases in non-generative artificial intelligence tools, like facial recognition software, have been well documented.2 Generative AI is no different in terms of bias propagation.
Studies have found evidence for biased output based on race, sex, disability status, psychiatric diagnosis, and dialect.1,3,4,5 This bias appears even when AI is not given explicit information about race or gender.4
In one experiment, the AI bot was given two transcripts conveying the same meaning, one in African American Vernacular English (AAVE) and the other in Standard American English (SAE). When asked to assess the employability of the speakers, the bot assigned more prestigious jobs, such as lawyer and psychologist, to speakers of SAE, and less prestigious jobs, such as cook and guard, to speakers of AAVE.5
In another example, Bloomberg conducted an in-depth analysis of bias in Stable Diffusion, an AI tool that generates images from text prompts. Its reporters asked the tool to generate human faces representing various occupations and found that “men with lighter skin tones represented the majority of subjects in every high-paying job, including ‘politician,’ ‘lawyer,’ ‘judge,’ and ‘CEO.’”
The........