AI Hallucinations and the Discipline of Legal Authority

In 2023, if you had asked a room of lawyers what an “AI hallucination” was, many would have been unsure. That has changed quickly, particularly following the observations of Dame Victoria Sharp in the combined High Court cases of Ayinde v London Borough of Haringey and Al-Haroun v Qatar National Bank [2025] EWHC 1383 (Admin).

By Matthew Lee, Barrister, Doughty Street

Those cases brought into focus a tension with how lawyers are trained to work. Generative AI systems can produce legal-sounding text, complete with case names, citations and confident statements of principle. Sometimes, however, the material is not merely wrong but invented. Because it is plausible, it can pass unnoticed unless every authority is checked. That is not an argument against using AI, but a reminder of a basic discipline: the authority must exist and it must say what you claim it says.

What Do We Mean by “Hallucination”?

The term is now used with enough regularity to appear in formal judicial guidance. The updated Artificial Intelligence (AI) Guidance for Judicial Office Holders[1] defines it as follows:

“AI hallucinations are incorrect or misleading results that AI models generate. These errors can be caused by a variety of factors, including insufficient training data, the model’s statistical nature, incorrect assumptions made by the model, or biases in the data used to train the model”

Dame Victoria Sharp captured both the opportunity and the risk in The Mayflower Lecture 2025:

“Generative AI, designed as it is to produce coherent text rather than the “right” answer, if asked to generate a legal argument, can produce “hallucinations.” These are fake cases, or fake citations from fake cases or even fake quotations from real cases… The risk of injustice and misinformation filtering into the legal system is nonetheless real, but at least such problems can be spotted, and those responsible for misusing technology can be held to account by the courts and professional regulators.”[2]

This calm seriousness is echoed in Ayinde, where the Divisional Court emphasised not only individual responsibility but the need for leadership, training and regulation across the profession.

Where Are We Now?

I try to be careful not to overclaim: AI use is certainly not confirmed in every case where it is suspected. The reported cases vary, the context is often incomplete, and court users are not always consistently described. With that caveat, my current working count of UK decisions involving hallucinated material stands at 38, which I treat as a minimum rather than a definitive figure.

Responsibility is mixed. On what I have been able to trace so far, 21 cases involved litigants in person, 10 involved lawyers, and the remainder involved other forms of representation. The problem is also spread across jurisdictions, from the High Court and Court of Appeal to tribunals of many kinds. That breadth suggests the issue is not confined to one corner of the system, and that pressures on written submissions and access to competent advice matter.

Having read hundreds of cases from outside the UK, I am confident there is no single explanation for why hallucinated material reaches the court. Sometimes it reflects a genuine misunderstanding of what these tools can do. Sometimes it is poor supervision or over-reliance on unchecked drafts. Sometimes it is the pressure to produce something that looks authoritative. At the most serious end, there may be deliberate attempts to mislead. Increasingly, courts want to know not only that hallucinated material is present, but why it appeared in that particular case.

Is Technology Alone Enough?

I am cautious when I hear confident claims that this is an easy technical problem with a technical fix. Tools that check citations against databases may help with obvious fabrications, and I would welcome anything that makes verification quicker. But hallucinations are tied to how large language models generate plausible text rather than verified truth, and the most troubling examples are often subtle, coherent and superficially sound. For that reason, I find it helpful to distinguish between the types of hallucination that seem most common in the case law internationally.

The 8 Common Hallucinations in Legal Work

At the start of my research, I began grouping hallucinations into eight types[3]. I may revisit these categories over time, but for now they provide a useful way of explaining why the problem is not easily solved.

The first type is a fabricated case and citation, where both the parties and the reference are invented. This is often the easiest to spot. The second involves a wrong case name paired with a real citation, so that checking the reference leads to an existing decision but not the one claimed. The third is the reverse: a real case name accompanied by the wrong citation.

The fourth type consists of conflated authorities, where elements of multiple real cases are merged into a single, plausible but inaccurate source. The fifth involves a correct statement of law supported by an invented authority. The principle may be sound, but the case relied upon does not exist.

The sixth type is a real case with misstated facts or a distorted ratio. The seventh involves a misleading paraphrase of secondary material, such as textbooks, articles, or headnotes, where the tone of scholarship is retained but the substance is altered. The eighth arises where a real citation is used to smuggle in a false one, for example where a legitimate article or case is cited even though it already contains an earlier hallucinated authority.

The earlier categories are generally easier to detect. The later ones are harder, because the surface appears reliable while the substance quietly shifts. It is Types 6 to 8 in particular that present difficulties unlikely to be addressed through current technical measures alone.

As an aside, a reader of my blog recently informed me that ChatGPT has begun referring to these categories as “Lee’s types of hallucinations”. I hope that doesn’t catch on, but it struck me as an oddly circular development, not least because the phrasing leaves open whether the hallucinations are being analysed or attributed to the author, which neatly captures how AI-generated language can misstate and confuse at the same time.

The Risk of Contamination

The deeper concern arises when hallucinations stop being isolated errors and become part of the wider information environment. A fabricated citation explained in a judgment can, paradoxically, gain durability once it appears in an official source that is scraped, indexed and reused. Similar risks arise when invented authorities enter academic or professional writing and are repeated without checking. Over time, the problem is not a single falsehood, but the erosion of traceable authority.

Expert evidence deserves particular care. Kohls v Ellison[4], a case from the United States, is a salutary reminder that expertise does not immunise anyone against fluent but unverified drafting. The following is taken from the judgment. The expert:

“…included citations to two non-existent academic articles and incorrectly cited the authors of a third….admits that he used GPT-4o to assist in drafting…failed to discern that GPT-4o generated fake citations to academic articles…. The irony. [expert] a credentialed expert on the dangers of AI and misinformation, has fallen victim to the siren call of relying too heavily on AI…”

This case is a stark reminder to people like me, and to others who believe they are fairly proficient in the use of certain AI tools and aware of the risks involved. These tools can be highly persuasive, and the hallucinations subtle, even to those who understand the risks. The message is clear: if you use AI at any level, you must carefully check the output.

Looking Ahead

When I began tracking these issues, I assumed they would fade quickly, either through better tools or settled professional habits. There is a strong argument that we are in a transitional phase. The Master of the Rolls, Sir Geoffrey Vos, recently articulated that view clearly in part of his response to a question about how the fundamentals of modern justice can and should be delivered more quickly, more efficiently and at more proportionate cost, but still justly, in the forthcoming generation:

“…One of the problems with providing a definitive answer to this question is that technology is moving very fast, and the thought leaders in our society have been struggling to keep up with its capabilities. If we were to devise solutions based on today’s AI capabilities, those solutions would be outdated in months…

The first assumption is and must be that AI will be far more capable than many can now imagine. It is pointless to dwell upon the hallucinations of today, when we know that such hallucinations will very likely soon be things of the past. We need, as I have said, to consider the more fundamental questions of how justice should be delivered when AI is able to decide cases, both civil and criminal, as or more reliably than humans and certainly far more cheaply and quickly…”[5]

I take that view seriously and I hope it proves to be right. If hallucinations do become a thing of the past, a significant concern shared by many, including myself, would fall away. In particular, it would reduce the risk of our legal canon becoming polluted as the use of AI increases across the profession. I also agree that the pace of technological development is rapid. The tools I use now may soon feel antiquated.

At the same time, I find myself quietly reflecting on an important counterpoint. The risk of hallucination does not arise accidentally. It seems to flow from the core design goal of many of these systems, which is to generate coherent and persuasive language rather than to guarantee truth. In law, the standard is necessarily stricter than being usually right. Authority must exist, be correctly identified and genuinely support the proposition advanced. Even a low residual error rate becomes problematic where a single undetected mistake can determine the outcome of a case.

This is not an argument against progress, nor against optimism about what may come next. It is simply a reminder that legal systems are uniquely sensitive to error, and that even small mistakes can have lasting effects. As we move through this transitional period, the challenge is not only to anticipate what AI may soon be capable of doing, but also to remain attentive to the standards that law already demands. After all, the legal documents of today are likely to shape the AI models of the future.

[1] https://www.judiciary.uk/wp-content/uploads/2025/10/Artificial-Intelligence-AI-Guidance-for-Judicial-Office-Holders-2.pdf

[2] https://www.judiciary.uk/wp-content/uploads/2025/12/THE-MAYFLOWER-LECTURE-2025.pdf

[3] https://naturalandartificiallaw.com/ai-hallucinations-in-case-law/

[4] https://naturalandartificiallaw.com/ai-in-expert-evidence-legal/

[5] Speech by The Master of the Rolls: Justice for all, justice for the accused, February 5, 2026, https://www.judiciary.uk/speech-by-the-master-of-the-rolls-justice-for-all-justice-for-the-accused/
