Tools and Strategies for Detecting AI Case Citation Hallucinations in Legal Materials[1]

The use of generative AI tools in the legal field is rapidly expanding. According to a Bloomberg Law report released in August 2024, adoption rates can vary significantly by experience level.[2]

  • Early-Career Attorney (Less than 5 Years of Experience) – 64% Adoption Rate
  • Mid-Career Attorney (5 to 9 Years of Experience) – 80% Adoption Rate
  • Partner-Level Attorney (More than 10 Years of Experience) – 61 to 72% Adoption Rate
  • Senior-Level Attorney (30 Years of Experience or More) – 54% Adoption Rate

The benefits of these tools are undeniable. They can reduce time spent on repetitive, labor-intensive tasks, assist with brainstorming, and even support substantive legal work. However, accuracy remains paramount in the legal field, where even minor mistakes can have major consequences. AI-generated outputs should never be blindly trusted. They must always be scrutinized through editing and verification processes.

AI as a Supplement, Rather Than a Replacement

While AI tools continue to reshape the practice of law, their role should be understood as complementary to, rather than a substitute for, human expertise and legal judgment. In an article published in March 2025, seven lawyers from Baker Botts LLP acknowledged the advantages of incorporating AI into legal practice, while arguing that these new tools “cannot replace the exercise of independent legal judgment.”[3] They advised that practitioners treat AI outputs like the work of a “sharp but green first-year lawyer who requires significant oversight,” and that these tools be employed as a “sparring partner” to test and refine their work, rather than as a way to outsource legal research and analysis.

Tracking AI Hallucinations in Legal Decisions

Earlier this year, a database tracking legal decisions from around the world in which “generative AI produced hallucinated content – typically fake citations, but also other types of AI-generated arguments” became publicly accessible.[4] Created by French lawyer and data scientist Damien Charlotin, the AI Hallucination Cases database had identified 508 cases as of November 3, 2025.[5] Although this figure represents a minuscule fraction of court filings worldwide, any systematic effort to document such occurrences is a herculean undertaking, and the database likely captures only a portion of the problem.

Attempts to Reduce Hallucinations with Retrieval-Augmented Generation (RAG)

Over the past few years, major legal research vendors have integrated a technique known as Retrieval-Augmented Generation (RAG) into their AI-assisted search engines.[6] Touted as a way to “significantly mitigate[] the risk of hallucinations,” RAG draws on a defined legal corpus of text-based sources, which can include legislation, case law, legal briefs, treatises, and other primary and secondary legal sources.[7] The model’s output is then grounded solely in the information within that defined corpus. This approach has been shown to reduce AI-generated errors, but it does not eliminate them.
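At its core, RAG retrieves the passages most relevant to a query from a fixed corpus and instructs the model to answer only from those passages. The toy corpus, keyword-overlap scoring, and prompt format in the sketch below are illustrative assumptions, not any vendor’s actual design:

```python
# A minimal retrieval-augmented generation sketch. Real systems use
# embedding-based search over millions of documents; this toy version
# ranks a three-item corpus by simple word overlap.
from collections import Counter

CORPUS = {
    "roe": "Roe v. Wade, 410 U.S. 113 (1973), addressed abortion rights.",
    "marbury": "Marbury v. Madison, 5 U.S. 137 (1803), established judicial review.",
    "brown": "Brown v. Board of Education, 347 U.S. 483 (1954), ended school segregation.",
}

def score(query: str, passage: str) -> int:
    """Count overlapping lowercase word tokens between query and passage."""
    q = Counter(query.lower().split())
    p = Counter(passage.lower().split())
    return sum((q & p).values())

def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k corpus passages that best match the query."""
    ranked = sorted(CORPUS.values(), key=lambda p: score(query, p), reverse=True)
    return ranked[:k]

def build_prompt(query: str) -> str:
    """Constrain the model to the retrieved text, reducing hallucination risk."""
    context = "\n".join(retrieve(query))
    return f"Answer using ONLY these sources:\n{context}\n\nQuestion: {query}"
```

The key design point is the final step: because the prompt instructs the model to answer only from retrieved passages, errors shift from invented authorities toward retrieval mistakes (the “naive retrieval” and “inapplicable authorities” failure modes the Stanford study describes), which still require human review.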

Although advances like RAG have improved the performance of AI models in recent years, “bespoke legal AI tools still hallucinate an alarming amount of the time.”[8] A 2024 study found that AI tools used for legal research hallucinated between 17% and 33% of the time.[9] Drawing on this data, the study’s authors developed a typology of causes for hallucinations produced by RAG-assisted AI models, which included naive retrieval, inapplicable authorities, reasoning errors, and sycophancy. Their findings emphasize the “need for rigorous, transparent benchmarking and public evaluations of AI tools in law.”[10] However, the development of such frameworks will take time.

Even when using RAG-assisted, industry-specific tools where hallucinations are less likely to occur, AI systems remain, to some extent, a “black box.” This means that while their results may appear sound, the reasoning behind them is not always clear. Until the issue of AI interpretability is resolved, a challenge even OpenAI’s Sam Altman has acknowledged, no AI output should be trusted beyond reproach.[11] Careful verification and human oversight must continue to be considered fundamental parts of responsible legal research for the foreseeable future.

Proceeding with Vigilance and Caution

The enthusiasm surrounding the transformative potential of AI and its applications within the legal field shows no signs of slowing. Yet amid this excitement, practitioners and researchers must remember that such tools are not substitutes for traditional research grounded in reliable secondary sources. As one commentator aptly observed, “technology is no substitute for competence, and the use of legal technology without sufficient knowledge of the underlying law one is working with is a recipe for malpractice.”[12]

While technological innovation in this area is both impressive and inevitable, these tools must be approached with informed caution. Their continued evolution reflects remarkable levels of technical expertise, creativity, and vision. When used properly, they can streamline workflows and enhance efficiency across the legal profession. However, these benefits must be balanced with a clear understanding of their limitations and the risks of overreliance on automated outputs.

The legal profession has now reached a point where, even for those who choose not to engage directly with AI tools, it is impossible to avoid encountering their influence within legal research and analysis. As law librarians and other information professionals, this reality underscores the importance of actively engaging in critical discussions about emerging technologies and providing informed guidance to the organizations and communities we serve.

Best Practices When Using Generative-AI Tools

First, while RAG-specific tools are not entirely foolproof, they offer a significant advantage over general-purpose AI systems by drawing on an established and trusted knowledge base. For any form of legal drafting, it is best to rely exclusively on these industry-specific tools. Second, users must continue to verify the conclusions and arguments generated by legal AI tools and remain vigilant for potential hallucinations. When reviewing legal materials that were, or may have been, produced using AI tools, the following warning signs should elicit a closer look.[13]

  1. Novel case citations that are not recognized by official databases
  2. Unusual or overly generic statutory references
  3. Lack of pinpoint citations or ambiguity in attribution
  4. Unexpectedly broad or overly confident legal conclusions
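
As a simple aid for spotting the first warning sign, a script can extract reporter-style citations from a document so that each one can be checked against an official database. The abbreviated reporter list below is an illustrative assumption; a production checker would use a complete reporter table, such as the one the Free Law Project maintains:

```python
# Extract reporter-style case citations so each can be manually verified.
# The reporter list here is a small illustrative subset, not exhaustive.
import re

REPORTERS = r"(?:U\.S\.|S\. Ct\.|F\.4th|F\.3d|F\.2d|F\. Supp\. 3d|F\. Supp\. 2d|F\. Supp\.)"
CITATION_RE = re.compile(rf"\b\d{{1,4}}\s+{REPORTERS}\s+\d{{1,4}}\b")

def extract_citations(text: str) -> list[str]:
    """Return every reporter-style citation found in the text, in order."""
    return CITATION_RE.findall(text)
```

Each extracted citation can then be looked up in Westlaw, Lexis, or a free source such as CourtListener; a citation that matches nothing in any official database is a strong candidate for a hallucination.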

As mentioned earlier, it is recommended that AI outputs be treated as the work of a first-year associate. They can be sharp and incredibly helpful, but they are still prone to mistakes due to inexperience. Just as a junior associate would not be allowed to submit a court filing without careful review and editing, AI-generated content should always be verified before use.

One practical insight gained during research for this article is that AI-enhanced editing tools within office software can inadvertently auto-complete citations, producing mis-cites that may appear as hallucinated cases. Keeping this in mind is essential both when analyzing citations in the work of other legal professionals and trying to prevent mis-citations in your own work.

Finally, verify, verify, verify. The importance of verification cannot be overstated. A key purpose of this article is to underscore the fundamental need for what Thomson Reuters refers to as “proper lawyering,” or simply the act of checking one’s sources.[14] Equally important, however, is the need to foster a broader conversation about what humans do best: adapting workflows and strategies to integrate new technologies. As these technologies become more deeply embedded in legal practice, particular attention must be paid to ensure that they are used responsibly and ethically.[15]

The following section highlights proprietary and open-source tools that can assist with verifying cited authorities. The tools and strategies discussed are by no means exhaustive. The pace of technological change is so rapid that some advice may become outdated within a matter of weeks. Replicability is also a challenge, as models are frequently updated, and the broader issue of AI interpretability remains unresolved. Nevertheless, sharing resources, strategies, and insights, and continuing the conversation about emerging solutions, remains critical as these tools evolve.

Tools and Strategies

Westlaw

Drafting Assistant Essential – A legal drafting tool for research, formatting, citation review, and identifying possible drafting issues. It includes several tools that can be used individually or in combination with one another.

  • Cite Formatting – Checks citations and suggests corrections for citation formatting.
    • Since this tool only provides citation corrections and does not link authorities to Westlaw documents, it can be helpful to pair it with another tool, such as WestCheck or Quick Check, which can pull or navigate to primary sources in Westlaw.
  • Quick Check – Reviews cited cases and provides warnings for cited authorities that have been overruled or have received negative treatment, recommendations for additional authorities that may further support the document’s argument, and quotation analysis.
    • Once a document has been processed, the user can navigate to the tab titled “Warnings for Cited Authority.” In the upper right-hand corner, there is a button for “Unverified Citations,” which highlights citations that may require additional review and verification.
  • WestCheck – Extracts and checks citations in KeyCite, generates a list of cited cases, and retrieves cited documents on Westlaw.
    • This tool only returns verified cases, so the absence of a cited authority from its results can serve as a quick indication that the case may not exist. However, if the citation is a pincite or refers to a state or specialty court, the tool may be unable to locate the authority, and additional verification will be required. The authors have also observed instances where WestCheck failed to recognize certain cases as authorities and omitted them from the results. Therefore, it remains essential to review the document manually to ensure that all cited authorities are properly identified and verified.
  • Deal Proof – Analyzes and reviews document drafts. Identifies potential errors and discrepancies, analyzes references, and generates reports.

Litigation Document Analyzer – A toolset that can be used to review documents, identify erroneous citations and potential misstatements of law, summarize and analyze arguments, and suggest counterarguments and supporting authorities. It includes a Judicial Analysis tool, which can provide a report detailing the veracity of cited authorities.

Bloomberg

Brief Analyzer – A tool that analyzes documents and generates a report of cited authorities and arguments, and highlights suggested content for review. Its output includes a list of cited authorities that the system identified but was unable to resolve. However, as with WestCheck, the tool may occasionally fail to recognize a citation as an authority and omit it from the results. This limitation once again underscores the importance of conducting a manual review to ensure that all cited authorities are accurately identified and verified.

Lexis

Document Analysis – A toolset that can be used to analyze a brief or pasted document, compare multiple briefs, or analyze an agreement. With a brief analysis, the output is presented in a dashboard with recommendations for review, identification of similar briefs, citation checks, and quote verification. The efficacy of this tool’s citation review depends on its ability to recognize authorities within the document. As with WestCheck and Bloomberg’s Brief Analyzer, a manual review should always be performed to confirm that all cited authorities have been correctly identified and verified.

Stand-Alone Products

CiteCheck AI by LawDroid[16] – A web-based platform that verifies citations within uploaded documents. Users can upload five documents at no cost; any additional uploads require a paid subscription. The tool generates a validation report listing all citations identified in the document and indicates which citations were deemed invalid.

Open-Source Tools

When using free, publicly available resources, users should take care not to upload any confidential or proprietary materials.

Case Strainer[17] – A free, web-based platform created by Johnathan Franklin, Digital Innovation Librarian at the University of Washington. The website relies on CourtListener and other free sources to verify cases, and works with URLs, files, and pasted text.

Is This Case Real?[18] – A free online case checker that can be used to verify citations using data and tools from the Free Law Project. The website only allows users to paste text for review. The API relies on citations in CourtListener’s database to validate case citations.[19]
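For those who prefer to script their verification, the lookup can be sketched directly against the Free Law Project’s citation-lookup endpoint. The URL and the response fields (`citation`, `status`) below are assumptions based on the project’s announcement and should be checked against CourtListener’s current API documentation before use:

```python
# Hedged sketch: submit text to CourtListener's citation-lookup endpoint
# and flag citations it could not match. Endpoint URL and response shape
# are assumptions; verify against the current CourtListener API docs.
import json
import urllib.parse
import urllib.request

API_URL = "https://www.courtlistener.com/api/rest/v3/citation-lookup/"

def lookup_citations(text: str) -> list[dict]:
    """POST the text and return the parsed JSON list of citation results."""
    data = urllib.parse.urlencode({"text": text}).encode()
    req = urllib.request.Request(API_URL, data=data)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def flag_unverified(results: list[dict]) -> list[str]:
    """Return citations the API could not resolve (assumed status != 200)."""
    return [r["citation"] for r in results if r.get("status") != 200]
```

As with the web interface, an unresolved citation is not proof of a hallucination (pincites and specialty-court citations can fail to match), but it marks exactly which authorities need manual verification.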

Damien Charlotin’s Database of AI Hallucinations[20] – While this resource is not designed to detect hallucinations, it serves as a valuable reference by providing examples of documents that contain them. It also helps prepare users to engage in informed discussions with stakeholders about the importance of education and training in the responsible and ethical use of AI tools.


[1] The authors feel compelled to note that this article was written without the use of generative AI.

[2] Bloomberg Law, “Artificial Intelligence: The Impact on the Legal Industry,” August 2025.

[3] D. Curciani, D. David, R. Harper, J. Lawrence, M. Molner, M. Welsh & T. Wofford, “Trust, But Verify: Avoiding the Perils of Artificial Intelligence Hallucinations in Court,” The Computer & Internet Lawyer 42, no. 3 (March 2025): 1-3.

[4] Damien Charlotin, “AI Hallucination Cases,” accessible at https://www.damiencharlotin.com/hallucinations.

[5] Some attorneys involved in cases that appear in Damien Charlotin’s database have offered statements regarding their use of AI and, in some instances, their decision not to verify its output. In these explanations, lawyers “blam[ed] IT issues, personal and family emergencies, their own poor judgment and carelessness, and demands from their firms and the industry to be more productive and take on more casework…But most often, they simply blame[d] their assistants.” J. Koebler & J. Roscoe, “18 Lawyers Caught Using AI Explain Why They Did It,” 404 Media, Sept. 30, 2025.

[6] Westlaw Precision’s AI-Assisted Research with CoCounsel uses a retrieval augmented generation (RAG) engine and is the “first generative AI offering from Thomson Reuters.” (Thomson Reuters, “Introducing AI-Assisted Research: Legal research meets generative AI,” Nov. 15, 2023.) Lexis also used RAG for its generative AI models. (LexisNexis, “Retrieval Augmented Generation (RAG) for Trusted Generative AI,” Sept. 18, 2025.)

[7] B. Munir, M. Z. Abbasi, W. B. Wilson, & A. Columbo Jr., “Evaluating AI in Legal Operations: A Comparative Analysis of Accuracy, Completeness, and Hallucinations in ChatGPT-4, Copilot, DeepSeek, Lexis+ AI, and Llama 3,” International Journal of Legal Information 53, no. 2 (2025): 103–114, https://doi.org/10.1017/jli.2025.10052; see also M. Hindi, L. Mohammed, O. Maaz, & A. Alwarafy, “Enhancing the Precision and Interpretability of Retrieval-Augmented Generation (RAG) in Legal Technology: A Survey,” IEEE Access 13 (2025): 46171–46189.

[8] Stanford University, “AI on Trial: Legal Models Hallucinate in 1 out of 6 (or More) Benchmarking Queries,” Human-Centered Artificial Intelligence, May 23, 2024.

[9] This study evaluated three products from LexisNexis and Thomson Reuters, including Westlaw. V. Magesh, F. Surani, M. Dahl, M. Suzgun, C. D. Manning, & D. E. Ho, “Hallucination‐Free? Assessing the Reliability of Leading Legal Research Tools,” Journal of Empirical Legal Studies 22, no. 2 (2025): 216–242, https://doi.org/10.1111/jels.12413.

[10] V. Magesh, F. Surani, M. Dahl, M. Suzgun, C. D. Manning, & D. E. Ho, “Hallucination‐Free? Assessing the Reliability of Leading Legal Research Tools,” Journal of Empirical Legal Studies 22, no. 2 (2025): 216–242, https://doi.org/10.1111/jels.12413. This need has also been discussed in other studies. See R. Bhambhoria, S. Dahan, J. Li, & X. Zhu, “Evaluating AI for Law: Bridging the Gap with Open-Source Solutions,” arXiv:2404.12349, April 2024.

[11] R. Curry, “Sam Altman Says OpenAI Doesn’t Fully Understand How GPT Works Despite Rapid Progress,” The Observer, May 30, 2024.

[12] N. Magnanelli, “The Legal Tech Bro Blues: Generative AI, Legal Indeterminacy, and the Future of Legal Research and Writing,” Georgetown Law Technology Review 8, no. 2 (2024): 300–315.

[13] Paxton, “How to Avoid AI Hallucinations in Legal Research: Best Practices for Lawyers,” May 13, 2025.

[14] Z. Warren, “GenAI Hallucinations Are Still Pervasive in Legal Filings, But Better Lawyering Is the Cure,” Thomson Reuters, August 18, 2025.

[15] In July 2024, the American Bar Association’s Standing Committee on Ethics and Professional Responsibility released its first ethics guidance on the use of AI tools. In its Opinion, the Committee “identifies some ethical issues involving the use of GAI tools and offers general guidance for lawyers attempting to navigate this emerging landscape.” (American Bar Association, Standing Committee on Ethics and Professional Responsibility, Formal Opinion 512, July 29, 2024.)

[16] https://citecheck.ai/

[17] https://wolf.law.uw.edu/casestrainer/

[18] https://isthiscasereal.com 

[19] M. Lissner, “Combat Hallucinations and Look Up Citations with our New API,” Free Law Project, April 16, 2024; CourtListener, “Citation Lookup and Verification API,” n.d.

[20] https://www.damiencharlotin.com/hallucinations

About the authors

Ross Prolic, Research Librarian

Rachel Wertheim, Senior Research Analyst, Harbor Global