What is AI Bias and Where Does It Come From?
Algorithmic bias occurs when a computer system produces results that are systematically prejudiced because of flawed assumptions in the machine learning process. AI systems are not inherently biased; they learn bias from the data we give them.
If that data reflects existing societal biases, the model will learn and often amplify those biases.
Common Types of Bias
- Historical Bias: This occurs when the data is a reflection of past prejudices, even if the world has changed. For example, if a model for hiring tech executives is trained on historical data from the 1980s, it might learn that being male is a key predictor of success, simply because very few women were executives at the time.
- Representation Bias: This happens when the dataset used to train the model does not accurately represent the environment where the model will be used. For example, training a facial recognition system primarily on images of light-skinned individuals will lead to poor performance and higher error rates for dark-skinned individuals. A quick audit of the training data's group composition (sketched after this list) can surface this kind of gap.
- Measurement Bias: This occurs when the data is measured or collected inconsistently across different groups. For example, using "arrests" as a proxy for "criminal activity" can introduce bias, as policing practices may differ significantly between neighborhoods.
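Representation bias is often visible in a simple audit of how groups are distributed in the training data. The snippet below is a minimal sketch: the skin_tone column and the deployment-population shares are hypothetical stand-ins for whatever attribute and reference population apply to your own system.
Python
import pandas as pd

# Hypothetical training set for a facial recognition model (skin_tone is an assumed column)
train_df = pd.DataFrame({
    'skin_tone': ['light', 'light', 'light', 'light', 'dark', 'light', 'dark', 'light']
})

# Share of each group in the training data
train_share = train_df['skin_tone'].value_counts(normalize=True)

# Hypothetical share of each group in the population the model will actually serve
deployment_share = pd.Series({'light': 0.6, 'dark': 0.4})

# Large gaps between the two columns suggest representation bias
audit = pd.DataFrame({'train': train_share, 'deployment': deployment_share})
audit['gap'] = (audit['train'] - audit['deployment']).abs()
print(audit)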
How to Measure Fairness: Group Fairness Metrics
To detect bias, we need to measure it. Fairness metrics allow us to quantitatively assess whether a model's outcomes are equitable across subgroups defined by protected attributes such as race, gender, or age.
Here are two of the most common "group fairness" metrics:
1. Demographic Parity (or Statistical Parity)
- The Idea: The model's predictions should be independent of the protected attribute. In other words, the likelihood of receiving a positive outcome should be the same for all groups.
- The Formula: P(prediction=1∣group=A)≈P(prediction=1∣group=B)
- When to Use It: This is useful when the goal is to ensure equal allocation of resources or opportunities, such as sending out a marketing flyer. You want the same percentage of people from each group to receive the flyer.
- The Downside: It can be a poor choice if there is a real-world correlation between the protected attribute and the outcome (e.g., certain medical conditions are more prevalent in specific age groups). Forcing parity could harm the less-privileged group by denying them accurate, beneficial predictions.
2. Equalized Odds
- The Idea: The model should make errors at the same rates for all groups. It requires that the model's true positive rate and false positive rate are the same for group A and group B.
- The Formula:
- P(prediction=1∣true label=1,group=A)≈P(prediction=1∣true label=1,group=B) (Equal True Positive Rate)
- P(prediction=1∣true label=0,group=A)≈P(prediction=1∣true label=0,group=B) (Equal False Positive Rate)
- When to Use It: This is critical in high-stakes decisions like loan applications, criminal justice, or medical diagnoses. It means the chances of a qualified applicant being correctly approved (true positive) and of an unqualified applicant being mistakenly approved (false positive) are the same regardless of their group (see the code sketch at the end of this section).
Code Example: Calculating Demographic Parity
Let's check the selection rate for a hypothetical loan approval model.
Python
import pandas as pd

# Sample data: model predictions and group identity
data = {
    'gender': ['Male', 'Male', 'Female', 'Male', 'Female', 'Female', 'Male'],
    'loan_approved_prediction': [1, 0, 1, 1, 0, 1, 0]  # 1 = approved
}
df = pd.DataFrame(data)

# Calculate the selection rate (share of positive predictions) for each group
selection_rates = df.groupby('gender')['loan_approved_prediction'].mean()
print(selection_rates)
# Output:
# gender
# Female    0.666667   (2 out of 3 approved)
# Male      0.500000   (2 out of 4 approved)

# Calculate the Demographic Parity Difference
parity_difference = abs(selection_rates['Female'] - selection_rates['Male'])
print(f"\nDemographic Parity Difference: {parity_difference:.2f}")

if parity_difference > 0.1:  # A common threshold
    print("Potential bias detected according to Demographic Parity.")