55% of organizations have AI systems running in production with no formal governance framework in place. Not a partial framework. Not an immature one. None at all.
That figure comes from research published in 2024 by multiple sources tracking the governance gap in enterprise AI. And it explains a lot about what happened that year.
Air Canada lost a tribunal ruling because its chatbot gave a customer incorrect policy information and nobody had built a mechanism to stop it. McDonald's shut down its AI drive-thru across more than 100 locations because it couldn't manage the edge cases that show up at scale. IBM's Watson for Oncology, after a $4 billion investment, was discontinued because the training data didn't match the real patients it was supposed to serve.
In each case, the technology worked in the demo. The governance didn't exist to make it work in production.
What Governance Failure Actually Looks Like
Most people think of AI governance as compliance paperwork. An ethics statement. A committee that meets quarterly. A policy PDF that the data science team never reads.
That's governance theater. It doesn't stop anything.
Real governance failure looks like this: your AI system makes a decision, something goes wrong, and you cannot reconstruct what data it used, what logic it applied, or who was responsible for the outcome. At that point, the organization absorbs 100% of the liability. The vendor's contract says so.
The Air Canada case is the clearest recent example. A customer asked the airline's AI chatbot about bereavement fares after a family member's death. The chatbot gave incorrect information. Air Canada tried to argue in the British Columbia Civil Resolution Tribunal that the chatbot was a "separate legal entity" responsible for its own actions. The tribunal rejected that argument. Air Canada was liable.
The failure wasn't that the chatbot said the wrong thing. Chatbots say wrong things. The failure was that there was no system in place to ensure the chatbot only retrieved verified, current policies — and no way to reconstruct what it retrieved when the dispute arose. Without that audit trail, there was no defense.
The Numbers Behind the Failure
The governance gap isn't a marginal problem. The data from 2024 and 2025 describes a systemic one.
75% of organizations experienced AI-related security breaches in 2024. 99% reported some form of financial loss tied to AI risk incidents. The average cost per incident was $4.4 million. These aren't research projections. They're reported outcomes from organizations already running AI in production.
The jump in project cancellation rates tells a similar story. In 2024, 17% of organizations scrapped most of their AI initiatives. In 2025, that number jumped to 42%. That's not disillusionment with AI as a technology. It's the consequence of deploying AI without the controls to make it reliable at scale.
Gartner's projection is that 60% of AI projects will be abandoned by 2026 due to poor data quality alone. Over 40% of agentic AI projects — the autonomous, multi-step systems that are increasingly being sold as enterprise solutions — will be canceled by 2027. Not because the agents don't work. Because the organizations deploying them can't account for what they do.
A model that works in a demo is not the same thing as a model that is defensible in production. The gap between those two things is governance.
The Readiness Paradox
There's a pattern in the data that explains why this keeps happening.
72% of organizations are currently deploying AI. But only 18% have an enterprise-wide AI governance council with actual decision-making authority. The other 54% are deploying AI that nobody has formal authority to stop.
The data management picture is similar. 90.6% of organizations claim they have effective information management. But only 30.3% have implemented the data classification systems required to safely train or ground an AI model. Most organizations believe they are further along than they are.
This overestimation creates a specific failure mode. Teams build in a controlled environment where edge cases are managed manually. The demo looks good. The pilot looks good. Then the system reaches production volume — 10,000 transactions a day instead of 100 — and the manual controls that masked the governance gaps can no longer keep up. The system fails publicly, and the project gets canceled.
McDonald's reached this point in June 2024. The AI ordering system had been tested and deployed across more than 100 locations. At volume, it couldn't handle the variability of real customer orders. Viral clips circulated of the system adding incorrect items in loops. The project was terminated. The technical system worked. The controls didn't scale.
Governance Debt: What You're Actually Paying for Later
The concept of technical debt is well understood. You take a shortcut now and pay interest on it later. Governance debt works the same way.
When teams build AI systems for demos, they skip controls. They use public APIs without proper authentication. They train on unverified datasets to move faster. They skip the audit logging because it adds latency. Each of these decisions incurs governance debt — a cost that doesn't appear in the project budget but surfaces the moment the system needs to move into a regulated environment.
Paying governance debt usually means one of two things. Either you rebuild significant parts of the system to meet compliance requirements, which can cost as much as the original build. Or you cancel the project because the rework isn't worth it relative to the projected value.
This is why the organizations that try to "bolt on" governance after deployment fail at a higher rate than those that build it in from the start. You can't add immutable audit trails to a system that was never designed to produce them. You can't retroactively prove training data provenance when you didn't track it at ingestion. These aren't features you add. They're architectural decisions that have to be made at the beginning.
The AI You Didn't Build
There's a second governance problem that most organizations aren't measuring: the AI they didn't sanction.
Research from ISACA and others puts the number at around 90% of companies having employees who use personal or unsanctioned AI accounts for work tasks. This is sometimes called Shadow AI — the same category as Shadow IT, but with a faster adoption curve because the tools are free or nearly free and immediately useful.
The risk isn't just that employees are using unvetted tools. It's that company data flows into systems the organization has no visibility into, no contract with, and no ability to audit. Customer records, internal strategy documents, proprietary code — all of it can end up in external model training pipelines depending on the terms of the tools being used.
At the same time, SaaS products now ship with AI capabilities embedded by default. The CRM tool your team has been using for five years may have quietly added an AI feature that summarizes customer conversations. Whether that summary is accurate, whether it stores data in compliant regions, and whether it logs decisions for audit purposes — most organizations have no idea.
Governance programs that focus exclusively on internally built models are covering a small fraction of the actual exposure. The bigger attack surface is the AI your organization is already using that nobody has inventoried.
The Regulatory Shift: From Ethics to Law
For most of 2022 and 2023, AI governance was a voluntary exercise. Companies published ethical AI principles. Some formed advisory boards. The actual systems in production were largely ungoverned.
That period ended on August 1, 2024, when the EU AI Act entered into force. It's the first comprehensive legal framework for AI anywhere in the world, and it imposes specific technical and organizational requirements — not aspirational guidelines.
For high-risk AI systems — those used in employment decisions, credit scoring, healthcare, or law enforcement — the Act requires:
- Explainability. Organizations must provide "meaningful information about the logic involved" in automated decisions. If a system denies a loan application, the applicant has a right to understand why. If you can't reconstruct the reasoning, you can't comply.
- Data quality standards. High-risk systems must meet specific standards for the data used in their development. This applies to training data, validation data, and the data used to ground the model at query time. IBM Watson's failure — training on synthetic rather than real patient data — would be a compliance violation under this standard.
- Human oversight. The Act requires that humans have the authority to monitor, interpret, and intervene in the AI's operation. This isn't a soft requirement. It means you need defined roles, defined authority, and documented processes for override.
- Technical documentation. Article 11 of the Act requires a maintained documentation repository demonstrating compliance across the entire AI lifecycle — from data ingestion through model updates and decommissioning.
In the United States, federal oversight has remained minimal by design. The current posture prioritizes industry freedom and national competitiveness over prescriptive mandates. But that has created a different problem: 550+ state AI bills were introduced in early 2025 alone. For any enterprise that operates across state lines or sells into European markets, the EU Act's requirements are the practical floor. Designing governance to satisfy the strictest jurisdiction is the only way to avoid building separate compliance programs for each market.
What Operational Governance Actually Requires
Governance theater is static. Real governance is dynamic.
A static ethics policy doesn't stop a model from drifting. A quarterly committee review doesn't catch a bias that develops in week 6 as real-world data distribution shifts away from the training set. An ethics statement doesn't give anyone the authority to pause a deployment when something goes wrong at 2 a.m.
Operational governance means moving controls from PDFs into infrastructure. Specifically:
Machine-enforced gateways. Before a model response reaches a user, it passes through validation logic that checks it against verified policy. Air Canada's chatbot failure was preventable with this control. The model would have been allowed to respond only with information retrieved from a verified, current policy store — not from its own hallucinated reconstruction of what the policy might be.
Immutable audit trails. Every decision the model makes is logged with enough context to reconstruct it later. What query came in. What data was retrieved. What version of the model was running. What the output was. This log cannot be edited. It persists. When a dispute arises — and in production systems at scale, disputes arise — you can show exactly what happened.
Drift detection. Model performance is monitored continuously, not reviewed in quarterly retrospectives. Specific thresholds trigger alerts. If accuracy on a class of inputs drops below a defined threshold, the system flags it before it becomes a visible failure. If the distribution of outputs shifts in a way that suggests emerging bias, that shows up in monitoring before a regulator finds it.
Named ownership. A specific person, not a committee, has the authority and responsibility to pause a deployment. Committees diffuse accountability. When something goes wrong, diffused accountability means nobody acts. The IBM Watson for Oncology project suffered from this: accountability for clinical safety was spread across a large organization with no single owner empowered to stop it.
A 90-Day Start
Most organizations don't build governance because they think it requires years of infrastructure work before it becomes useful. That's a mistake. The basics can be in place within 90 days, and basic governance is far better than none.
The Minimum Viable Governance framework breaks it into three questions and one gut check.
Question 1: What AI systems do we have? Build an inventory. This includes models your team built, models embedded in vendor software, and tools employees are using without IT approval. You can't govern what you haven't counted. This takes 30 days for most organizations and consistently turns up AI exposure that leadership didn't know existed.
Question 2: What could go wrong? For each system in the inventory, identify the top two or three failure modes that would cause the most damage. Not every possible risk — the top ones. A customer-facing AI that provides incorrect information. A recruiting tool that filters out candidates based on a biased feature. A fraud detection system that flags legitimate transactions at a rate that erodes customer trust. Name the specific risks and assign a severity.
Question 3: Who decides? Assign a named individual, not a committee, with the authority to pause each system's deployment. This person gets the monitoring data. They have defined criteria for when to act. They have escalation paths that don't require waiting for a committee meeting.
The gut check. For each AI system that touches a consequential decision, ask: "Would I be comfortable if this AI made this decision about my family?" It's not a formal framework. But it catches the things formal assessments miss. Amazon's hiring algorithm, which was found to penalize resumes that included the word "women's" (as in "women's chess club"), would not have passed this test.
The Pre-Deployment Requirements That Kill Projects Later
The most common reason projects fail at the production gate is that they were built without documentation that regulators or internal governance boards require before going live. This is governance debt becoming due.
Before any AI system touches a live workflow, three things need to be in place.
Data certification. You need to be able to prove where your training data came from and that you have the legal right to use it. You need a bias audit — a documented assessment of whether the training data produces disparate outcomes across demographic groups. And you need data lineage: the ability to trace any piece of data in the model back to its source. If you can't produce these documents, the EU AI Act won't let you deploy the system in European markets. Increasingly, enterprise procurement requirements won't let you deploy it anywhere.
Adversarial testing. Before deployment, the system should be tested by people trying to break it. Prompt injection. Jailbreaks. Attempts to extract training data. Attempts to manipulate the system into producing outputs it wasn't designed to produce. This is called red-teaming and it should be a required gate before any customer-facing AI system goes live. Most organizations skip it. Most of the viral AI failures of 2024 would have surfaced in red-teaming.
Defined rollback criteria. If the system's performance degrades below a defined threshold after deployment, what happens? If there's no answer to that question before the system goes live, you're operating without a circuit breaker. Production AI systems at scale will have performance issues. The organizations that handle these well have already decided what the response looks like before the issue occurs.
What Mature Governance Actually Returns
Governance is usually framed as a cost and a constraint. The data suggests it's also a competitive differentiator.
Organizations with mature AI governance outperform their peers by up to 49% on financial metrics, according to research tracking the relationship between governance maturity and business outcomes. The mechanism is straightforward: governed systems fail less, recover faster when they do fail, and earn enough institutional trust to get expanded to more use cases. Ungoverned systems fail, generate legal exposure, and get shut down.
Executive involvement follows the same pattern. In organizations with mature governance programs, 81% have active C-level sponsorship of AI governance. In organizations without mature governance, that number drops to 28%. The organizations generating returns from AI treat governance as a leadership priority, not a compliance function.
The question isn't whether governance is worth the cost. It's whether you want to pay for it as a planned investment before deployment or as an unplanned crisis after something breaks publicly.
Air Canada, McDonald's, and IBM Watson each paid the second way. The projects they governed too late cost far more — in money, in reputation, and in lost time — than governance built in from the start would have.
The question every AI project needs answered before it starts is not "can this model do the task?" The question is: "Can we account for, explain, and defend what this model does?" If the answer is no, the project isn't ready to start.
Not sure where your governance gaps are?
The AI Readiness Assessment maps your current AI inventory against the compliance requirements and deployment risks relevant to your industry — and identifies the gaps before they become incidents.
Sources
- Gartner — "Predicts 2025: Artificial Intelligence Governance and Risk" (2025)
- Gartner — "60% of AI Projects with Data Issues Will Fail" (2024)
- AvePoint — "AvePoint Report Finds AI Rollouts Stalled Up to 12 Months; 75% of Organizations Face Security Breaches" (2024)
- Actian Corporation — "The Governance Gap: Why 60% of AI Initiatives Fail" (2024)
- Trullion — "Why Over 40% of Agentic AI Projects Will Fail — and Which Will Survive" (2025)
- British Columbia Civil Resolution Tribunal — Moffatt v. Air Canada, 2024 BCCRT 149 (2024)
- Reuters — "McDonald's ends AI drive-thru trial with IBM" (June 2024)
- Stat News — "IBM's Watson supercomputer recommended 'unsafe and incorrect' cancer treatments, internal documents show" (2018); project officially wound down by 2022
- AiExponent — "AI Governance Framework: Start in 90 Days With MVG" (2025)
- ISACA — "The Rise of Shadow AI: Auditing Unauthorized AI Tools in the Enterprise" (2025)
- EU — "Regulation (EU) 2024/1689 on Artificial Intelligence (EU AI Act)" — entered into force August 1, 2024
- King & Spalding — "AI Quarterly Update: State-Led Regulatory Approach" (2025)
- Informatica — "The Surprising Reason Most AI Projects Fail" (2025)
- Liminal.ai — "Enterprise AI Governance: Complete Implementation Guide" (2025)
- Databricks — "AI Governance Best Practices: How to Build Responsible and Effective AI Programs" (2025)
- Emerj — "Architecting the AI-Native Enterprise for Workforce Agility" (2025)