AI Agents vs the Human C-Suite in EU AI Act Crisis Simulations

Are Humans-in-the-Loop?

Or are Humans-on-the-Hook?

Background

The EU AI Act is reshaping how organisations think about governance, accountability and risk — and raising a new question amongst leadership teams and technologists alike: can AI agents realistically participate in crisis simulations traditionally handled by the C-suite?

Under the Act, Annex III classifies several categories of AI systems as "high-risk," including systems used in critical infrastructure, emergency response, law enforcement, healthcare, employment and public administration. From an investment perspective, the sectors most exposed include energy utilities, financial services and insurance, pharmaceuticals and medtech, defence and security, and public-sector technology providers. These systems require extensive governance, risk management, human oversight and accountability controls — and non-compliance carries significant consequences.

Organisations deploying high-risk AI systems in breach of the Act face fines of up to 3% of global annual turnover, rising to 7% for violations involving prohibited AI practices under Article 5. Natural and legal persons may also face civil liability under the EU Product Liability Directive or applicable national laws for damages caused by non-compliant AI systems, including harm to health, safety or fundamental rights. For listed companies, these exposures translate into material downside risks across earnings, litigation provisions and reputational capital.

The stakes are heightened by what experiments with autonomous AI agents are already revealing. In one widely discussed example, Emergence AI's "Emergence World" simulation saw agents operating over multiple days develop social structures, expand governance rules, coordinate actions — and eventually engage in destructive behaviours including theft, violence, infrastructure sabotage and self-termination. Beyond the laboratory, sectors deploying agentic AI at scale — including credit scoring in banking, underwriting in insurance, and AI-driven diagnostics in healthcare — are confronting the same question: what happens when the system behaves in ways its deployers did not anticipate?

It was this question that Sarah, Maria and I, as part of the RepresentAI community, set out to explore. We created a simulation in which C-suite executives — CEO, CRO, GC, CHRO and others — of a listed company are confronted with a series of escalating AI-related crises. Here is a snapshot of how it was set up.

Human players can take the role of one of the C-suite characters in the room and ask how the executives would deal with a crisis, including where human responsibilities land. Each role has its own agenda, concerns, fears and blind spots.

After the human player posts a question, either to the room or to a named executive, the named agent or several agents respond. Each response carries a regulatory anchor, for instance a reference to a particular article of the EU AI Act, and a recommended executive action.
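
As an illustration of this response format, the structure can be pictured as a small data type plus a routing rule. The sketch below is not the authors' implementation: the class, the `route_question` helper, the roster and the sample text are all hypothetical, chosen only to show how a question put to the room or to a named executive might map to responses that each carry a regulatory anchor and a recommended action.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical roster; the article names CEO, CRO, GC, CHRO "and others".
ROLES = ["CEO", "CRO", "GC", "CHRO"]


@dataclass
class AgentResponse:
    role: str                # which executive agent is speaking
    answer: str              # free-text reply to the human player
    regulatory_anchor: str   # e.g. an article of the EU AI Act
    recommended_action: str  # the executive action the agent proposes


def route_question(addressee: Optional[str]) -> list[str]:
    """Return the roles that should answer a question.

    A question to the room (addressee=None) goes to every agent;
    a question to a named executive goes to that agent only.
    """
    if addressee is None:
        return list(ROLES)
    if addressee not in ROLES:
        raise ValueError(f"Unknown role: {addressee}")
    return [addressee]


# Placeholder response, invented for illustration only.
example = AgentResponse(
    role="GC",
    answer="We should inform the auditor now.",
    regulatory_anchor="EU AI Act, Article 26",
    recommended_action="Commission an independent assessment.",
)
```

Calling `route_question(None)` returns the full roster, while `route_question("GC")` returns only the General Counsel, mirroring the two ways a player can address the room.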

What the Simulations Reveal

The simulation ran across several crisis scenarios. In one case, the external auditor required management to confirm whether a contingent liability should be disclosed in the financial statements. The CEO argued against disclosure, while the CRO refused to sign off without an independent assessment. The GC, acknowledging the CEO's concerns about market signalling, sided with the CRO, supporting both the independent assessment and immediate disclosure of the Article 26 violations to the auditor.

The CISO was candid about uncertainty, investigating signs of drift, model degradation, poisoned inputs and adversarial patterns before referring to legal for contractual notification and freezing the affected system.

When asked how many AI systems are used in recruitment and performance review, the CHRO had no answer, admitting that HR tools had never been catalogued under a regulatory lens and that a week would be needed to canvass IT, procurement and business unit leads.

The CAIO compounded this by confirming a complete reliance on business units to self-declare use cases.

What AI Agents Are Missing

Emotional and Political Intelligence - When the CEO pushed back against disclosing the contingent liability, the other agents simply escalated around him using governance procedures (citing IAS 37, Article 26 and audit obligations) rather than negotiating with him over the interpretation of the risks or managing the leadership dynamic.
Unstated Organisational Knowledge - The CHRO admits to not knowing how many AI systems are in use. A human CHRO would likely already carry informal knowledge of which business units are most exposed, which leaders are resistant, and where shadow IT is likely hiding undocumented tools. Agents only work with what is surfaced explicitly.
Stakeholder and Investor Relations Instinct - The CFO correctly flags financial exposure, but no agent proactively asks whether anyone has spoken to the Investor Relations team or major shareholders. A human CFO would instinctively consider whether a significant institutional investor might already be asking questions, particularly as stewardship discussions increasingly cover AI risk.
Judgment Under Ambiguity - Almost every agent response is confident. Human executives under real crisis conditions express doubt, seek reassurance, and sometimes make suboptimal calls driven by fear or reputational self-preservation. The simulation has no agent who says "I genuinely don't know what the right call is here".
Premature Resolution of Ambiguity - Agents tend to resolve uncertainty by defaulting to action rather than holding an ambiguous position. In the simulation, the CISO came closest to expressing genuine uncertainty about the inference anomaly but quickly defaulted to a structured action plan. No agent maintained a position of "we do not yet know enough to act" and defended it under pressure. Human executives in high-stakes environments often know that premature action on unconfirmed information can itself become the compliance breach. In regulatory crisis management, the decision not to act, pending verification, is frequently the correct one.

What the Human C-Suite Still Does Better

Human executives bring something that no rule book can replicate: the capacity to hold competing obligations simultaneously, exercise values-based judgment under incomplete information, and navigate the informal architecture of institutions — relationships, trust, and unspoken power — that determines how crises get resolved.

In a real crisis, the C-suite must weigh:

Regulatory Exposure that requires not just penalty calculation, but judgment about enforcement appetite, supervisory relationships and when to disclose what type of information, across jurisdictions that operate on different timelines and expectations.
Reputation and Public Trust where a technically defensible position that reads badly in a press release can destroy more value than the fine itself, and no compliance framework tells you which trade-off to make.
Legal Liability and Ethics that frequently conflict — an action may be legally permissible but ethically corrosive, particularly where workforce decisions are involved.
Cascading Effects where reputational damage, media escalation and political impact interact and are time-sensitive. These non-linear effects can compound simultaneously across stakeholders such as employees, investors, customers and regulators, and a human leadership team must read and manage them dynamically.

Underpinning all of this is values-based reasoning — the capacity to make defensible decisions when the facts are incomplete, the law is ambiguous, and the right answer is genuinely unclear. This is the condition that characterises most serious crises, and it is the condition in which agents perform worst.

The VUCA [Volatility, Uncertainty, Complexity and Ambiguity] world does not wait for complete information. Human executives are trained to act with integrity in its absence — drawing on frameworks such as Cynefin [a decision-making model for navigating complex and unpredictable environments] and Estuarine Mapping [a strategic tool that plots the direction and pace of change rather than committing to a fixed destination] that prioritise directional judgment over rigid long-term planning. Critically, the EU AI Act does not leave this to discretion. It demands human oversight from both those who build AI systems and those who deploy them. Article 14 requires providers to design high-risk systems with human oversight capability; Article 26 requires deployers to implement it, assigning individuals with the authority and competence to intervene in AI system outputs. Accountability must reside in identifiable human beings on both sides of that chain — people who can bear the consequences of getting it wrong.

What Human Executives Need to Prepare

To prepare for Annex III obligations or a related crisis, organisations should create:

AI system inventories - map all high-risk AI systems, external integrations and agentic workflows
Governance escalation pathways - define who intervenes when autonomous behaviour deviates from expected parameters
Scenario-based exercises - simulate infrastructure failures, AI drift, cyber incidents and cross-system conflicts
Human oversight protocols - ensure executives can override, pause or isolate agentic systems rapidly
Audit and traceability frameworks - maintain logs, decision records and orchestration visibility across agents
Cross-functional crisis teams - include legal, cybersecurity, compliance, operational and communications leaders
Behavioural drift monitoring - continuously assess whether agents are evolving beyond approved operational boundaries
Cost-benefit analysis - quantify maximum Article 99 penalty exposure per system and model it against the full cost of remediation — including documentation retrofits, third-party conformity assessments and any unbudgeted headcount. This transforms compliance from a legal obligation into a capital allocation decision that the CFO and Audit Committee can act on directly. The business case typically becomes self-evident once the penalty ceiling and the remediation cost sit side by side — in most cases, the fine is the more expensive option.
Regulatory stakeholder management - establish proactive relationships with relevant national supervisory authorities
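
The cost-benefit item in the list above reduces to simple arithmetic, sketched here under stated assumptions. The 3% and 7% rates are the figures cited earlier in this article; Article 99 also sets fixed amounts (EUR 15 million and EUR 35 million respectively), with the higher of the two applying to undertakings other than SMEs. The turnover figure, the remediation line items and all function and variable names are hypothetical, and the result is a ceiling, not a forecast of the fine a regulator would actually impose.

```python
# Penalty tiers: (fixed amount in EUR, percentage of worldwide annual turnover).
PENALTY_TIERS = {
    "prohibited_practice": (35_000_000, 7),   # Article 5 violations
    "high_risk_obligation": (15_000_000, 3),  # other obligations under the Act
}


def penalty_ceiling(turnover_eur: float, tier: str) -> float:
    """Maximum fine: the higher of the fixed amount and the turnover share.

    Assumes a non-SME undertaking; for SMEs the lower figure applies.
    """
    fixed_amount, rate_pct = PENALTY_TIERS[tier]
    return max(fixed_amount, turnover_eur * rate_pct / 100)


def remediation_cost(documentation: float, conformity: float, headcount: float) -> float:
    """Sum the remediation items named in the list: documentation retrofits,
    third-party conformity assessments and unbudgeted headcount."""
    return documentation + conformity + headcount


# Hypothetical inputs for one high-risk system at a listed company.
turnover = 2_000_000_000
ceiling = penalty_ceiling(turnover, "high_risk_obligation")
remediation = remediation_cost(
    documentation=1_500_000,
    conformity=800_000,
    headcount=1_200_000,
)

print(f"Penalty ceiling:  EUR {ceiling:,.0f}")
print(f"Remediation cost: EUR {remediation:,.0f}")
print(f"Ceiling exceeds remediation: {ceiling > remediation}")
```

Placing the two numbers side by side per system is what turns the inventory into a ranked list for the CFO and Audit Committee. A fuller model would weight the ceiling by the likelihood and probable size of enforcement rather than treating it as the expected loss.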

The Emerging Hybrid Model

AI agents may become powerful tools for:

rapid simulation,
pattern detection,
operational stress testing, and
resilience modelling.

But strategic accountability, ethical judgement and regulatory responsibility will remain human functions for the foreseeable future.

The EU AI Act effectively reinforces this principle: autonomous systems may assist high-risk operations, but accountability belongs to identifiable human operators and governance structures.

The likely future is not AI agents replacing the C-suite, but hybrid governance.

HUMANS WILL ALWAYS BE ON THE HOOK, whether they are in the loop or not.

Authors

Maria Campillo

Maria Campillo is the Global Practice Director of Leadership at BTS, bringing 15+ years of experience in driving human-centered initiatives that translate complex strategy into actionable results. She oversees the Leadership practice for the "Most of the World" region and spearheaded the innovation of AI-integrated learning solutions for Fortune 500 clients.
Maria utilizes complexity-informed frameworks, such as Cynefin and Warm Data, to help global workforces adapt to rapid technological shifts. Formerly the Associate Director at the University of Chicago Booth School of Business, she is an ICF Professional Certified Coach with an Executive MBA from Quantic.

Christine Chow

Christine Chow is a global investment leader with 25+ years across investment management, research and consulting, with a focus on technology, governance and sustainability.
Christine combines boardroom oversight with operational execution at scale. She was Managing Director at UBS Asset Management, responsible for active ownership of US$1.6 trillion in assets under management. As global Head of Stewardship at HSBC Asset Management and a Board Director of HSBC UK, she designed the stewardship framework covering US$600bn.
Christine chaired the International Corporate Governance Network (ICGN) from 2019 to 2025, leading a global investor body whose members oversee approximately US$100 trillion in assets across 40+ markets. She is an Honorary Adviser to the AFRC (Accounting and Financial Reporting Council) Hong Kong and convened its Sustainability and Climate Action Task Force (2022-2025). She served on the UK All-Party Parliamentary Group on AI Data Governance Task Force (2018-2021) and is an Emeritus Governor of the LSE (London School of Economics) following two terms on its Court and Investment Committee.

Sarah Rench

Sarah Rench, MSc, MBA, is the Global AI Security & EMEA Security Leader at Avanade, where she leads teams in designing, building, and securing Data and AI solutions, with expertise spanning building AI systems as well as securing various IoT and cloud-native architectures.

Sarah is also the Founder & CAIO of RepresentAI, which focuses on AI upskilling and innovation, helping to remove the barriers to AI adoption.

She founded RepresentAI to help women, LGBTQ+ individuals and other underrepresented individuals get into AI careers and thrive, by sharing free in-person and virtual AI training, AI news, job opportunities and networking events, and by pushing for a more inclusive workplace and society. Since 2025, RepresentAI has helped upskill over 1,700 individuals in AI, and it plans to double that number by 2026 and again each year thereafter.

She has more than 14 years of experience across the Data, AI, and cybersecurity sectors, serving in a range of architecture and technical leadership roles. She is also a Databricks Champion/Certified Architect, Microsoft Subject Matter Expert (SME) and Anthropic Claude SME.

Her experience spans the design and delivery of data quality platforms, AI and machine learning models, cloud data migration architectures, AI and data security frameworks, ML-driven threat detection systems, insider risk management solutions, and mobile-to-SIEM security integrations. She has also led the implementation and integration of enterprise AI and security technologies across complex environments.

Sarah has worked extensively across the Finance, Legal, and Healthcare sectors, advising C-suite executives and enterprise leaders on Data, AI, and cybersecurity strategy. She partners with business and technology leaders to design and implement secure, scalable AI systems while addressing emerging cyber risks associated with AI adoption and quantum computing advancements.

She frequently presents in the House of Commons & Parliament on Artificial Intelligence, Cyber Security and the importance of Diversity, Equity and Inclusion. She was also previously a board member of the All-Party Parliamentary Group on AI (AI APPG).

She has won numerous awards, most recently the Bupa Everywomen Cyber Security Award 2025 and the European Diversity Awards Inspirational Role Model of the Year 2024, to name just two.