
AI doctors hate women and minorities

19 September 2025


Bias baked into medical models 

Artificial intelligence tools touted as the saviours of overstretched hospitals could end up worsening care for women and ethnic minorities, according to new research.

Studies from US and UK universities have found that large language models used in healthcare underplay female symptoms and show less empathy toward Black and Asian patients. Researchers warn this could reinforce patterns of under-treatment that already plague Western healthcare systems.

Marzyeh Ghassemi, associate professor at MIT’s Jameel Clinic, said: “Some patients could receive much less supportive guidance based purely on their perceived race by the model.”

She added: “My hope is that we will start to refocus models in health on addressing crucial health gaps, not adding an extra per cent to task performance that the doctors are honestly pretty good at anyway.”

The warnings come as Microsoft, Amazon, OpenAI and Google rush to flog AI products that claim to cut physicians’ workloads. Microsoft even boasted in June that it had an AI tool “four times more successful than human doctors at diagnosing complex ailments.”

MIT’s Jameel Clinic found that models such as OpenAI’s GPT-4, Meta’s Llama 3 and Palmyra-Med consistently recommended lower levels of care for female patients, sometimes advising them to self-treat at home. A separate MIT study showed GPT-4 and others offered colder, less supportive responses to Black and Asian users seeking mental health help.

The London School of Economics uncovered similar issues with Google’s Gemma model, already in use by more than half of UK local authorities. The research found that women’s physical and mental health issues were consistently downplayed compared with men’s when Gemma was used to generate social care case notes.

Travis Zack, chief medical officer of AI start-up Open Evidence, said: “If you’re in any situation where there’s a chance that a Reddit subforum is advising your health decisions, I don’t think that that’s a safe place to be.”

OpenAI admitted that many of the studies evaluated older versions of GPT-4 and insisted the model had “improved accuracy since its launch.” Google said it takes model bias “extremely seriously” and is developing techniques to limit discrimination.

Even so, the research suggests that unless the training data changes, AI risks amplifying the same blind spots that have dogged medicine for decades.
