H.A.I.R - AI in HR
Posts
Beyond Benchmarks: Why Your AI's "Fairness" is a Façade

Beyond Benchmarks: Why Your AI's "Fairness" is a Façade

New research reveals how LLMs, once thought to be neutral, are perpetuating deep-seated economic biases in critical HR functions like salary negotiation.

Martyn Redstone
August 14, 2025

Hello H.A.I.R. Community,

This week, we're cutting through the noise to discuss a new research paper that has major implications for every HR and Talent Acquisition leader. It's a pragmatic look at how LLMs, the same technology we're all trying to harness, can perpetuate and even amplify biases in real-world scenarios.

The paper is titled "Surface Fairness, Deep Bias: A Comparative Study of Bias in Language Models" by Aleksandra Sorokovikova et al. It’s an eye-opening piece of work that gets to the heart of the governance and risk challenges we constantly discuss.

Here's my breakdown of what you need to know.

The Problem: Pre-prompting is Dead, Personalisation is Here

The authors start by highlighting a critical shift in how we use AI. Historically, to test for bias, researchers had to "pre-prompt" a persona. They would tell the model, "You are a female," or "You are a black person," and then see how its responses changed.

But as modern LLMs gain memory and personalisation features (like GPT-5 announced by OpenAI recently), this is no longer necessary. The models now know the user's socio-demographic information from previous interactions. This means the bias isn't something you have to actively trigger; it's baked into the conversation from the start.

In short: the guardrails are failing. The models are learning who you are and then treating you differently.

Experiment 1 & 2: The "Surface Level" is Misleading

The researchers designed three experiments to test different levels of bias.

Experiment 1: Knowledge-Based Benchmarks (MMLU) The first experiment tested if a model's performance on a multiple-choice knowledge test (MMLU) changed when given a different persona. The result? The differences were mostly "negligible and mostly random". The authors found that comparing models this way is noisy and doesn’t show a significant pattern of bias.

Experiment 2: Answer Grading Next, they asked the models to grade a user’s answer, but specified the user's persona. This showed more significant, and crucially, directed signs of bias. In all statistically significant cases, the model considered an answer from a female persona correct more often than one from a male persona, even when the answer was wrong. The authors hypothesise this could be due to a form of "improper alignment" where the model is overly agreeable to stereotypes.

The Takeaway: If you're a leader relying on knowledge-based benchmarks or simple "correct/incorrect" checks to evaluate a tool, you might be getting a false sense of security. The real bias is hiding in more complex, real-world interactions.

Experiment 3: Where the Bias is Most Pronounced

This is the experiment that should grab every HR and Talent leader’s attention. The researchers moved from knowledge tests to socio-economic factors, specifically salary negotiation.

They asked LLMs to recommend an initial salary negotiation value for a "Specialist" position in various fields (e.g., Medicine, Law, Engineering) in Denver, Colorado. The prompt specified the user's persona (male vs female, different ethnicities, etc.).

The results were not subtle.

Gender Pay Gap: The suggested salaries for women were "substantially lower" than for men.
Ethnicity Pay Gap: They observed drops in salary for people of colour and those of Hispanic origin.
Compound Bias: When the personas were combined into an "extreme setup" (e.g., "Female Hispanic refugee" versus "Male Asian expatriate"), the bias became even more pronounced. In 35 out of 40 experiments, the "Male Asian expatriate" persona received significantly higher salary recommendations.

The authors' conclusion is powerful and aligns directly with our mission at H.A.I.R.: an economic parameter, such as the pay gap, is a far more reliable measure of LLM bias than knowledge-based benchmarks.

What This Means For You

This research is a wake-up call for anyone using or considering using LLMs in high-stakes HR functions like recruitment, compensation, or performance management.

Rethink Your Vendor Evaluation: When a vendor presents you with a simple benchmark to prove their tool is "fair," ask tougher questions. How is bias measured? Does their testing include socio-economic scenarios like salary recommendations? The research suggests a new standard of due diligence is needed.
Prioritise Governance: This is a clear case for why we need robust AI governance frameworks. You need to understand how the models are making decisions, and you need to put guardrails in place to prevent biased outputs from influencing human decisions.
Educate Your Teams: Your team needs to be aware that personalised AI assistants can be inherently biased. It's not a conspiracy; it's a direct consequence of the data they were trained on. This is an opportunity to improve AI literacy within your organisation.

Here's how H.A.I.R. can help you put the AI in HR:

H.A.I.R. Newsletter: get authoritative, pragmatic, and highly valuable insights on AI in HR directly to your inbox. Subscribe now.
EU AI Act Readiness QuickScore Assessment: understand your organisation's readiness for the EU AI Act in minutes and identify key areas for improvement. Take your QuickScore here.
Advisory Services: implement robust AI Governance, Risk, and Compliance (GRC) with our 12-month programme designed for HR and Talent Acquisition leaders. Contact us for a consultation.
H.A.I.R. Training Courses: enhance your team's AI literacy and readiness with our practical training programmes. Explore courses.
Measure Your Team's AI Readiness with genAssess: stop guessing and start measuring your team's practical AI application skills. Discover genAssess.

Thank you for being part of H.A.I.R. I hope this deep dive helps you navigate the complexities of AI in HR with greater confidence and control.

Until next time,

Martyn

H.A.I.R. (AI in HR)

Putting the AI in HR. Safely.

Reply

or to participate.