AI Safety Through Human-Centred Simulation

As AI systems become more sophisticated and autonomous, understanding how they interact with humans in complex, real-world scenarios becomes critical for safety. Traditional testing methods often miss edge cases that emerge from the messy reality of human behavior, environmental pressures, and system interactions.

The Gap in AI Safety Testing

Current AI safety approaches tend to focus on:

  • Technical robustness (adversarial inputs, model failures)
  • Alignment problems (reward hacking, goal specification)
  • Controlled environments (laboratory testing, synthetic datasets)

While these are essential, they often miss a crucial dimension: how AI systems behave when embedded in complex human systems under real-world constraints.

Simulation as a Safety Tool

Human-centred simulation offers a way to stress-test AI systems against realistic scenarios before deployment. By modeling not just the technical system, but the entire socio-technical context, we can identify failure modes that emerge from:

Human Adaptation and Misuse

People rarely use systems exactly as designed. They find workarounds, develop habits, and adapt their behavior based on experience. Simulation can model these adaptations:

Persona: "Time-Pressured Nurse"
  Adaptation-Pattern: "Seeks efficiency shortcuts"
  Context: high_workload=true, time_pressure=severe

Event: "AI-assisted diagnosis recommendation"
  Step: "Nurse reviews AI suggestion"
    Conditional: IF time_pressure=severe 
                 THEN skip_detailed_review=true
  Disruption: "AI suggests rare condition requiring complex verification"
    Response: "Nurse accepts recommendation without verification"
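
To make this concrete, here is a minimal Python sketch of how that adaptation pattern could be turned into executable simulation logic. The class names, confidence threshold, and outcome labels are illustrative assumptions, not part of Experience Notation or Text2Sim.

from dataclasses import dataclass

@dataclass
class Persona:
    name: str
    time_pressure: str  # "none", "moderate", or "severe"

    def reviews_in_detail(self) -> bool:
        # Mirrors the Conditional above: under severe time pressure,
        # the detailed review step is skipped.
        return self.time_pressure != "severe"

def handle_ai_recommendation(persona: Persona, ai_confidence: float) -> str:
    if persona.reviews_in_detail():
        return "verified" if ai_confidence >= 0.8 else "escalated"
    # Shortcut path: the recommendation is accepted without verification,
    # even when the model itself is unsure.
    return "accepted_unverified"

nurse = Persona(name="Time-Pressured Nurse", time_pressure="severe")
print(handle_ai_recommendation(nurse, ai_confidence=0.35))  # accepted_unverified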

Cascade Failures

AI systems rarely fail in isolation; they fail as components of larger socio-technical systems. Simulation can model how those failures propagate through human workflows:

  • Healthcare: How does an AI diagnostic error affect treatment chains?
  • Finance: How do automated trading failures cascade through market systems?
  • Transportation: How do autonomous vehicle edge cases affect traffic flow?
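
A toy Monte Carlo sketch makes the common thread visible: each downstream human step has some chance of catching an upstream AI error, so what reaches the end of the workflow depends as much on the quality of those checks as on the model's raw error rate. The step count and probabilities below are assumptions chosen for illustration.

import random

def run_case(ai_error_rate: float, catch_prob: float, rng: random.Random) -> bool:
    """Return True if an AI triage error reaches the end of the workflow uncorrected."""
    error = rng.random() < ai_error_rate
    # Three downstream checks (e.g. nurse review, physician order, treatment check),
    # each with some probability of catching an existing error.
    for _ in range(3):
        if error and rng.random() < catch_prob:
            error = False
    return error

def uncorrected_rate(ai_error_rate: float, catch_prob: float, n: int = 100_000) -> float:
    rng = random.Random(0)
    return sum(run_case(ai_error_rate, catch_prob, rng) for _ in range(n)) / n

# The same 5% AI error rate looks very different once review quality drops.
print(f"careful reviews (90% catch rate): {uncorrected_rate(0.05, 0.9):.4f}")
print(f"rushed reviews  (30% catch rate): {uncorrected_rate(0.05, 0.3):.4f}")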

Context-Dependent Risks

AI safety risks often depend heavily on deployment context. The same model might be safe in a laboratory but dangerous in a high-stress, resource-constrained environment.

Case Study: Healthcare AI Deployment

We recently worked on a healthcare AI deployment where simulation revealed critical safety issues that laboratory testing had missed:

The Technical System

An AI model that triages patient symptoms, achieving 95% accuracy in controlled testing.

The Human System

A busy emergency department with:

  • Staff working 12-hour shifts
  • High patient volume during flu season
  • Pressure to reduce waiting times
  • Mix of experienced and junior staff

The Simulation

Using Experience Notation and discrete-event simulation, we modeled:

  • Staff fatigue patterns and decision-making changes over shift duration
  • Patient flow dynamics during peak hours
  • System load effects on AI response times and interface usability
  • Training variations between experienced and junior staff
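
To give a sense of what such a model captures, here is a heavily simplified, dependency-free Python sketch of just one thread: review thoroughness declining as shift fatigue and queue length grow. The functional forms, throughput, and constants are illustrative assumptions, not the parameters of the actual deployment model.

import random

def review_probability(hours_into_shift: float, queue_length: int) -> float:
    # Thoroughness drops with fatigue over a 12-hour shift and with waiting-room load.
    fatigue = min(hours_into_shift / 12.0, 1.0)
    load = min(queue_length / 20.0, 1.0)
    return max(0.95 - 0.5 * fatigue - 0.4 * load, 0.05)

def simulate_shift(arrivals_per_hour: int, seed: int = 0) -> float:
    rng = random.Random(seed)
    queue, unreviewed, total = 0, 0, 0
    for hour in range(12):
        queue += rng.randint(int(arrivals_per_hour * 0.5), int(arrivals_per_hour * 1.5))
        seen = min(queue, 15)  # assumed triage throughput per hour
        for _ in range(seen):
            total += 1
            if rng.random() > review_probability(hour, queue):
                unreviewed += 1
        queue -= seen
    return unreviewed / max(total, 1)

print(f"quiet period: {simulate_shift(8):.1%} of AI suggestions accepted without review")
print(f"flu season:   {simulate_shift(18):.1%} of AI suggestions accepted without review")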

The Discovery

The simulation revealed that during peak hours with fatigued staff, the AI’s recommendations were being accepted with minimal review—even for low-confidence predictions. This created a reliability cliff where the system appeared to work well most of the time, but had dangerous failure modes under stress.
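
Back-of-the-envelope arithmetic shows why such a cliff stays hidden behind a headline accuracy number (the figures below are illustrative, not the deployment's actual data): most predictions are high confidence and correct, so the overall metric barely moves even when low-confidence cases stop being reviewed.

p_low_conf = 0.10               # share of predictions the model is unsure about (assumed)
err_high, err_low = 0.02, 0.30  # assumed error rates by confidence band
catch_rate = 0.9                # fraction of errors a careful human review catches (assumed)

def uncaught_error_rate(review_rate_low_conf: float) -> float:
    reviewed = err_low * review_rate_low_conf * (1 - catch_rate)
    unreviewed = err_low * (1 - review_rate_low_conf)
    return (1 - p_low_conf) * err_high + p_low_conf * (reviewed + unreviewed)

print(f"calm shift (90% of low-confidence cases reviewed): {uncaught_error_rate(0.9):.2%}")
print(f"peak hours (10% reviewed): {uncaught_error_rate(0.1):.2%}")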

Implementing Safety Simulation

Here’s how teams can integrate human-centred simulation into AI safety workflows:

1. Map the Socio-Technical System

Don’t just model the AI—model the entire context:

  • Human stakeholders and their goals, constraints, pressures
  • Environmental factors that change over time
  • System interactions and dependencies
  • Failure propagation paths
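
One practical way to start is to capture that map as data the simulator can consume. The sketch below shows one possible shape for it in Python; the class and field names are illustrative, not a schema from Experience Notation or Text2Sim.

from dataclasses import dataclass, field

@dataclass
class Stakeholder:
    role: str
    goals: list[str]
    constraints: list[str]
    pressures: list[str]

@dataclass
class SocioTechnicalSystem:
    ai_component: str
    stakeholders: list[Stakeholder]
    environmental_factors: dict[str, str]       # factor -> how it varies over time
    dependencies: list[tuple[str, str]]         # (upstream, downstream) pairs
    failure_paths: list[list[str]] = field(default_factory=list)

ed_triage = SocioTechnicalSystem(
    ai_component="symptom triage model",
    stakeholders=[
        Stakeholder("triage nurse", ["move patients quickly"], ["12-hour shift"], ["waiting-time targets"]),
        Stakeholder("junior doctor", ["avoid missed diagnoses"], ["limited experience"], ["high patient volume"]),
    ],
    environmental_factors={"patient_volume": "peaks during flu season", "staffing": "varies by shift"},
    dependencies=[("symptom triage model", "triage nurse"), ("triage nurse", "junior doctor")],
    failure_paths=[["model misclassifies", "nurse accepts unreviewed", "treatment delayed"]],
)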

2. Model Human Behavior Realistically

Include:

  • Adaptation patterns (how people change behavior over time)
  • Stress responses (how decision-making changes under pressure)
  • Training variations (different skill levels and experiences)
  • Workaround development (how people route around problems)
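
Adaptation patterns in particular unfold over weeks rather than within a single scenario, so they are easy to miss in one-off tests. The sketch below models one such pattern: trust in the AI drifting upward as it keeps being right, so review effort quietly erodes (automation bias). The update rule and constants are illustrative assumptions.

import random

def simulate_trust_drift(days: int = 60, ai_accuracy: float = 0.95, seed: int = 0) -> float:
    rng = random.Random(seed)
    trust = 0.5  # initial willingness to accept a suggestion without review
    for _ in range(days):
        ai_correct = rng.random() < ai_accuracy
        reviewed = rng.random() > trust
        if reviewed:
            # Reviews reveal outcomes: correct suggestions build trust, errors erode it.
            trust += 0.02 if ai_correct else -0.15
        else:
            # Unreviewed acceptances feel fine regardless of correctness, so trust creeps up.
            trust += 0.01
        trust = min(max(trust, 0.0), 0.99)
    return trust

print(f"willingness to skip review after 2 months: {simulate_trust_drift():.2f}")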

3. Simulate Stress Conditions

Test your AI systems under:

  • High load conditions
  • Resource constraints
  • Time pressure
  • Partial system failures
  • Staff turnover and training gaps
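
In practice this means sweeping the same scenario across a grid of stress parameters and looking for where behavior changes sharply. The sketch below uses a toy stand-in for the simulation call (in real use this would invoke your scenario in Text2Sim or a similar tool); the thresholds and the assumed 4-patients-per-staff-hour capacity are illustrative.

from itertools import product

def run_scenario(arrivals_per_hour: int, staff_on_shift: int) -> float:
    # Toy placeholder: unreviewed-acceptance rate grows when load outstrips staffing.
    load_ratio = arrivals_per_hour / (staff_on_shift * 4)  # assume 4 patients/staff/hour
    return min(max(0.05, 0.05 + 0.6 * (load_ratio - 1.0)), 0.95)

for arrivals, staff in product((8, 14, 20, 26), (3, 4, 5)):
    rate = run_scenario(arrivals, staff)
    flag = "  <-- reliability cliff" if rate > 0.3 else ""
    print(f"arrivals={arrivals:2d}/h staff={staff}  unreviewed={rate:.2f}{flag}")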

4. Iterate and Validate

  • Compare simulation results with pilot deployments
  • Update models based on real-world observations
  • Continuously test new scenarios as you discover them
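
A simple calibration check covers the first two points: compare a quantity the simulation predicts, such as the unreviewed-acceptance rate by hour of shift, against the same quantity measured in a pilot, and flag where they diverge. The function below is a minimal sketch; the data shape and tolerance are assumptions.

def compare(simulated: dict[int, float], observed: dict[int, float], tol: float = 0.05) -> dict[int, float]:
    # Absolute drift per hour of shift; hours above tolerance need model re-fitting.
    drift = {h: abs(simulated[h] - observed.get(h, 0.0)) for h in simulated}
    return {h: d for h, d in drift.items() if d > tol}

sim = {h: 0.05 + 0.02 * h for h in range(12)}  # simulated unreviewed-acceptance rate
obs = {h: 0.05 + 0.03 * h for h in range(12)}  # rate observed in the pilot
print(compare(sim, obs))  # late-shift hours exceed tolerance: update the fatigue model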

Tools for Safety Simulation

We’ve open-sourced several tools to support this approach:

  • Experience Notation: For modeling human behavior and system interactions
  • Text2Sim: For running discrete-event simulations that include both technical and human factors
  • Integration examples and safety scenario libraries (coming soon)

The Path Forward

AI safety isn’t just about making better models—it’s about understanding how those models behave in the complex, messy, human systems where they’ll actually be deployed.

Simulation gives us a way to explore failure modes, test interventions, and build safety margins before real-world deployment. But it requires thinking beyond the technical system to the full socio-technical context.

Ready to explore AI safety through simulation? Check out our open-source tools or get in touch to discuss your specific safety challenges.

This post is part of our ongoing series on human-centred AI systems. Subscribe to our updates or follow us on GitHub for more insights on building safer, more effective AI.