When Legal AI Sounds Right But Fails Across Borders

By Michael Krallmann, CEO, TransLegal.

Legal AI has reached an unsettling stage of maturity. On the one hand, outputs have surface credibility: they read well, follow structures familiar to lawyers and often cite statutes and cases. On the other, accuracy can be poor, a problem that intensifies by orders of magnitude in a cross-border context.

In a multilingual or multi-jurisdictional context, legal errors rarely announce themselves explicitly. Instead, they come disguised by surface gloss: a translated clause that feels accurate; a definition that sounds familiar; a comparative explanation that seems plausible. The danger lies not in obvious mistakes, but in answers that are almost, but not quite, right.

Beyond hallucination – the equivalence problem

Foundation models are very good at generating the dominant English-language framing of relevant concepts, but far less reliable at recognising when a term from another jurisdiction does not align cleanly with that framing, aligns only partially or aligns in one legal context but not another. The output reads well because the language is fluent, but surface gloss hides an equivalence problem. What is missing is not language capability, but an understanding of legal (non-)equivalence.

This problem is amplified by the underlying model architecture. Legaltech tools may rely on large general models with light layers of prompting or interface logic. Some may even add retrieval from local sources, but this alone does not overcome conceptual divergence. If the underlying model does not understand differences between legal concepts across systems, it cannot reliably signal uncertainty, explain limitations or make accurate suggestions. Instead, it will fill its knowledge gap with confidence.

Moreover, the training data of foundation models is insufficient to provide the legal and conceptual context necessary for cross-border translation in the legal domain. Even the largest models are trained on broad, unevenly distributed legal materials, with strong biases towards specific jurisdictions, languages and legal traditions. They do not carry the structured knowledge needed to distinguish between superficially similar concepts that operate differently in different legal systems. Without access to human-curated, jurisdiction-specific legal data at the point of generation, a model has to rely on its internal priors. It cannot ground its answers in authoritative definitions, comparative mappings or contextual explanations because those materials are simply not there. When built on genuinely comparative legal data rather than generic documents, systems can show relevant legal context explicitly, instead of guessing. Without such context, cross-border legal AI remains fluent and fast, but fundamentally under-informed.
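The idea of grounding generation in curated comparative mappings can be sketched in miniature. Everything below is hypothetical illustration, not TransLegal's actual schema or data: the terms, jurisdictions and equivalence labels are placeholders. The point is only that a curated lookup can surface non-equivalence explicitly, and flag uncertainty when no mapping exists, instead of letting a model guess.

```python
from dataclasses import dataclass

@dataclass
class Mapping:
    """One curated comparative-law entry (illustrative only)."""
    target_term: str
    equivalence: str  # "full", "partial", or "none"
    note: str

# Hypothetical curated entries, keyed by
# (source term, source jurisdiction, target jurisdiction).
COMPARATIVE_MAP = {
    ("consideration", "england-wales", "germany"): Mapping(
        target_term="(no direct equivalent)",
        equivalence="none",
        note="German contract law does not require consideration; "
             "do not translate as if an equivalent doctrine exists.",
    ),
    ("good faith", "germany", "england-wales"): Mapping(
        target_term="good faith",
        equivalence="partial",
        note="Treu und Glauben operates as a general duty in German law; "
             "English law recognises good faith only in narrower contexts.",
    ),
}

def lookup(term: str, source: str, target: str) -> str:
    """Return a mapped term with an explicit equivalence signal,
    or flag the gap rather than invent an answer."""
    entry = COMPARATIVE_MAP.get((term.lower(), source, target))
    if entry is None:
        return f"UNMAPPED: no curated mapping for '{term}' ({source} -> {target})"
    return f"{entry.target_term} [{entry.equivalence}] - {entry.note}"
```

Even at this toy scale, the design choice is visible: the system either answers from human-curated comparative data, with the degree of equivalence stated, or says plainly that it cannot, which is precisely the behaviour a fluent general-purpose model lacks.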

Invisible risk

The result is a growing class of legal outputs that are linguistically polished but legally unreliable. For cross-border teams, the issue surfaces in small but consequential ways: advice that translates poorly between jurisdictions; drafting that assumes non-existent rights or remedies; compliance analysis that reflects one legal culture while being applied in another. These weaknesses compound at scale.

A particular difficulty is that users often do not realise a problem exists. Lawyers are trained to read critically, but also to trust legal language framed in familiar patterns. When AI-generated output mirrors how a lawyer would explain an issue in their home jurisdiction, the instinct is to accept it.

This is why cross-border liability related to legal AI is likely to grow as adoption spreads and more of its outputs are relied upon. Use cases are expanding from internal research to client-facing work, contract generation, regulatory analysis and multilingual communications. Many users also operate under the false assumption that if a model can answer legal questions fluently in multiple languages, it is effectively multilingual in a legal sense.

True legal multilingualism is not about language alone. It is about the construction, definition and application of legal meaning in specific systems. Terms can be translated correctly and still mislead, doctrines can share a name but still diverge in effect, and procedures can appear analogous while serving different purposes. Foundation models are not designed to present these distinctions unless they are explicitly encoded.

Even when legal AI works exactly as designed, the underlying large language models still optimise for plausible language, not jurisdictional accuracy and accountability. The potential pitfall is treating linguistic gloss as a proxy for legal equivalence.

For legaltech companies and legal teams, the implication is uncomfortable but necessary. Cross-border accuracy cannot be bolted on through interface design or prompt engineering alone. It requires deliberate work focused on legal terminology, comparative structure and jurisdiction-specific meaning, guided by human expertise. Without that accuracy, even the most polished systems will continue to produce answers that feel robust until tested in the real world.

Organisations that recognise this challenge early will have an advantage, not as first movers, but because they understand more clearly where legal risk actually arises in AI-assisted workflows.

Towards a solution

If your work depends on legal outputs that cross languages or jurisdictions, it is worth considering not just whether the answer sounds right, but whether the system producing it understands and can explain why it might not be. Exploring that distinction is often the first step towards reducing a risk that is still largely invisible.

At TransLegal, we’re developing human-curated legal datasets and AI-powered systems to improve AI performance in multilingual and multi-jurisdictional legal environments. Our work focuses on expert-led data creation and quality assurance to support more accurate and accountable legal AI systems.

If you are responsible for deploying or relying on legal AI in a cross-border context, it may be worth discussing how jurisdictional accuracy is handled in your current systems. You can explore our data demo model at translegal.com or get in touch to continue the conversation.

About the author

Michael Krallmann is CEO of TransLegal, where he leads the development of structured cross-jurisdictional legal datasets designed to improve conceptual accuracy in multilingual legal AI systems. He holds a PhD in Law and Translation and works at the intersection of comparative law, language and legal technology.

[ This is a sponsored thought leadership article by TransLegal for Artificial Lawyer. ]

