Bias & Fairness

“The machine is only as unbiased as the data it learns from — and the humans who built it.”

Learning Objectives

By the end of this module, you will be able to:

Identify sources of bias in AI systems
Recognize when AI outputs may reflect harmful biases
Apply practical checks for bias in your AI-assisted work
Consider fairness implications in AI-enabled features
Discuss the responsibility of developers using AI tools

Why Bias Matters

AI systems are increasingly making or influencing decisions that affect people:

Who sees which job listings
What content gets recommended
How code suggestions shape applications
What answers people receive to questions

Bias in these systems can cause real harm — often in ways that aren’t immediately visible.

Sources of Bias

1. Training Data Bias

AI learns patterns from data. If the data contains biases, the AI learns them:

Data Characteristic	Resulting Bias
Historical discrimination	AI perpetuates it
Underrepresented groups	AI performs worse for them
Stereotypical associations	AI reproduces them
Geographic/cultural gaps	AI defaults to dominant culture

Example: An AI trained mostly on code from certain communities may suggest patterns that don’t match other communities’ conventions.

2. Selection Bias

What data gets collected affects what the AI knows:

Internet text overrepresents certain demographics
Code repositories favor certain languages and styles
Feedback data comes from users who engage (not all users)

Example: AI coding assistants trained on popular open-source projects may not understand domain-specific patterns from underrepresented industries.

3. Algorithmic Amplification

AI can amplify small biases into large effects:

Small bias in training data
         ↓
AI learns the pattern
         ↓
AI outputs reinforce the pattern
         ↓
Pattern becomes more prevalent
         ↓
Future training data is even more biased

Example: If AI slightly prefers certain naming conventions, developers may adopt them, making them more common, reinforcing the AI’s preference.

4. Human Bias in Design

The humans building AI make choices that embed biases:

What problems are worth solving?
How is “success” defined?
Whose feedback shapes improvements?
What tradeoffs are acceptable?

Example: Deciding which languages to support first reflects whose needs are prioritized.

Bias in Code Generation

AI coding assistants can exhibit bias in subtle ways:

Variable Naming Assumptions

# AI might default to assumptions:
def calculate_salary(employee):
    # Assumes certain salary structures
    # Assumes certain employment models
    pass

def parse_name(full_name):
    # Assumes name format (first/last)
    # May not handle non-Western name structures
    first, last = full_name.split()  # Fails for many names

Default User Assumptions

// AI might generate:
function getUserPreferences(user) {
    return {
        language: 'en-US',  // Why this default?
        timezone: 'America/New_York',  // Whose timezone?
        dateFormat: 'MM/DD/YYYY'  // Not universal
    };
}

Example Data Bias

# AI-generated example data often reflects biases:
users = [
    {"name": "John Smith", "role": "engineer"},
    {"name": "Jane Doe", "role": "designer"},  # Gender/role associations
    {"name": "Bob Johnson", "role": "manager"},
]

Documentation Assumptions

# AI might write:
def authenticate_user(username, password):
    """
    Authenticates a user with his credentials.  # Gendered language
    ...
    """

Recognizing Bias in AI Outputs

Questions to Ask

When reviewing AI-generated content:

Question	What It Catches
Who is assumed to be the default user?	Exclusionary defaults
What examples are provided?	Stereotypical representations
What edge cases are missing?	Underrepresented scenarios
Whose perspective is centered?	Cultural assumptions
What language is used?	Gendered or exclusionary terms

Red Flags

Watch for:

Homogeneous examples: All example users look the same
Assumed norms: “Normal” that isn’t universal
Missing considerations: No accessibility, no i18n
Stereotypical patterns: Roles/traits associated with demographics
Exclusive language: “He” as default, jargon that excludes

Green Flags

Look for:

Diverse examples: Varied names, contexts, scenarios
Explicit configurability: Defaults that can be changed
Inclusive language: They/them, accessible explanations
Edge case awareness: Different cultures, abilities, contexts

Practical Bias Checks

Check 1: The “Who’s Missing?” Test

For any AI-generated feature or example:

List who’s represented
Consider who’s NOT represented
Ask: would this work for them?

# AI generates user validation:
def validate_age(age):
    if age < 18 or age > 65:
        return False  # Why 65? What about users outside this range?
    return True

Check 2: The “Default Swap” Test

Replace defaults with alternatives:

Would this work in a different country?
Would this work for a different gender?
Would this work for a different ability level?
Would this work for a different economic context?

Check 3: The “Explain the Assumption” Test

For every assumption in generated code, ask:

“Why this default? What’s the justification?”

If the answer is “I don’t know” or “that’s just how it is,” examine further.

Check 4: The “Harm Scenario” Test

Consider how the code could cause harm:

Could it exclude someone unfairly?
Could it make incorrect assumptions about people?
Could it perpetuate stereotypes?
Could it disadvantage certain groups?

Fairness in AI-Enabled Features

When building features that use AI, consider:

Who Benefits?

Question	Consideration
Who does this feature help?	Ensure broad benefit
Who might it disadvantage?	Mitigate harm
Who was consulted in design?	Include affected groups
Who tests it?	Diverse testing

What Decisions Are Made?

If AI influences decisions about people:

Can decisions be explained?
Can decisions be appealed?
Are decisions auditable?
Is there human oversight for high-stakes decisions?

What Data Is Used?

Is the data representative?
Does the data contain historical biases?
Can users control their data?
Are data limitations disclosed?

Developer Responsibility

You Can’t Fix Everything

AI systems have biases. You can’t eliminate all of them.

What you CAN do:

Recognize biases when they appear
Question assumptions in AI output
Improve code before accepting it
Advocate for inclusive practices
Document known limitations

Professional Standards

As a professional using AI tools:

Responsibility	Action
Don’t blindly accept biased output	Review critically
Don’t ship known harmful biases	Fix before release
Don’t ignore feedback about bias	Listen and respond
Don’t assume bias is someone else’s problem	Take ownership

When You Find Bias

Fix it if you can (in your code)
Report it if appropriate (to tool providers)
Document it for others (in your team)
Learn from it (improve your review process)

Practical Exercises

Exercise 1: Bias Audit

Take AI-generated code from your project:

List all assumptions in the code
For each assumption, consider who it might exclude
Identify at least one bias or oversight
Suggest a fix

Exercise 2: Example Data Review

Ask AI to generate sample user data.

Review the names, roles, characteristics
What patterns do you notice?
Are any groups overrepresented or underrepresented?
Ask AI to regenerate with explicit diversity

Compare the results.

Exercise 3: Default Analysis

Find three default values in code you’ve written or accepted:

Why that default?
Who does it serve well?
Who might it not serve?
Should it be configurable instead?

Exercise 4: Inclusive Rewrite

Take this biased code:

def greet_customer(customer):
    """Say hello to a customer.

    Welcomes the customer and asks if he needs help.
    """
    title = "Mr." if customer.gender == "male" else "Mrs."
    return f"Hello {title} {customer.last_name}! How can I help you, sir?"

Rewrite it to be more inclusive. What changes? What assumptions did the original make?

Key Insights

Concept	Practical Rule
Bias sources	Training data, selection, amplification, design
Code manifestation	Defaults, examples, assumptions, language
Detection	Ask “who’s missing?” and “why this default?”
Responsibility	Review, fix, report, document
Perfection isn’t possible	Improvement is always possible

Connection to What’s Next

Bias connects to broader ethics:

Module 16: Privacy & Data — how data practices affect fairness
Module 17: The Future of Work — societal implications

Reflection Questions

Think of a default in software you use regularly. Who does it serve? Who might it not serve?
“The machine is only as unbiased as the data it learns from.” If this is true, what are the implications for using AI-generated code?
When you find bias in AI output, what factors determine how you respond? When is “fixing it locally” enough? When should you do more?

Next module: Privacy & Data — responsible data practices in AI-assisted development.