Legal AI Has A Growing Token Price Problem

If legal AI tools are the vehicles our work is now transported by, then tokens are the oil that drives it all. And that’s an issue because this ‘oil’ is getting increasingly expensive.

Legal AI has this token problem because the cost of leveraging an LLM for legal tasks, especially via the latest models, is rapidly rising. Agents and widescale doc-based needs make it worse. It’s such a systemic issue now it will change the way we use AI in the law.

Some companies, such as Harvey, are trying to find ways to reduce token costs (see below). Meanwhile, the strategy of fine-tuning open-source LLMs – see Kirkland, and also Thomson Reuters – makes a lot of sense in this context. AL takes a look at the issues and the repercussions, and hears from some legal tech experts.

Houston, We Have A Token Cost Problem…

In the beginning, the job was to get lawyers to use genAI. Now many thousands are using it. But this creates a new challenge: token costs.

Why are token costs rising? Because the tasks lawyers now perform with the latest frontier models are getting more and more token-intensive. There is more reasoning, there is more agentic work – which can mean going back to documents to read again and again – and there is simply more textual data being run through the systems now.
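To make the arithmetic concrete, here is a rough sketch of how re-reading multiplies the bill. Every figure in it (the document size, the number of passes, the per-million-token prices) is a hypothetical placeholder chosen for illustration, not a real price from any provider. The function names are likewise invented for this example.

```python
def task_cost(input_tokens, output_tokens, price_in_per_m, price_out_per_m):
    """Dollar cost of one model call, given per-million-token prices."""
    return (input_tokens * price_in_per_m + output_tokens * price_out_per_m) / 1_000_000


def agentic_cost(doc_tokens, passes, output_tokens_per_pass, price_in_per_m, price_out_per_m):
    """Dollar cost when an agent re-reads the same documents on every pass."""
    return passes * task_cost(doc_tokens, output_tokens_per_pass, price_in_per_m, price_out_per_m)


# Hypothetical figures, chosen only to show the scaling.
DOC_TOKENS = 200_000    # assumed size of a document bundle
OUTPUT_TOKENS = 2_000   # assumed output (including reasoning) per pass
PRICE_IN = 5.0          # assumed dollars per million input tokens
PRICE_OUT = 25.0        # assumed dollars per million output tokens

single = agentic_cost(DOC_TOKENS, 1, OUTPUT_TOKENS, PRICE_IN, PRICE_OUT)
agent = agentic_cost(DOC_TOKENS, 10, OUTPUT_TOKENS, PRICE_IN, PRICE_OUT)
print(f"single pass: ${single:.2f}, ten passes: ${agent:.2f}")
```

Under these assumed numbers the input side dominates, so an agent that returns to the documents ten times costs roughly ten times as much as a single read, before any price rise is factored in.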

Meanwhile OpenAI and Anthropic are increasing the token prices for the latest models – and naturally everyone wants to use the new models, the same as everyone wants the latest iPhone.

The net result is spiralling costs. This affects both the law firms and the legal AI companies: how do vendors charge for this increased token cost, and do law firms swallow it or pass the costs on? (More on that below.) But first some comments from the market.

Shawn Curran, CEO of Jylo, told AL: ‘Token use has increased because models like Opus 4.7 and GPT5.5 are the first models trained on enterprise usage and data rather than just the internet, are smarter, better, and more aligned to white collar tasks so people are loving them.

‘Because the internet is a commodity there was competition, now OpenAI and Anthropic have the bulk of users, they have the data fly-wheel, and are pulling ahead and can set the price without too much competition now.

‘What does it mean for legal AI companies? Tokenmaxxing is a thing, per seat pricing is gone, if Anthropic, Microsoft and OpenAI have moved away from it, no-one is going to subsidise legal tech vendors on all you can eat.

‘Therefore firms are going to have to spend more, unfortunately firms paid a premium for wrappers when the supply chain models were useless and now they may have to pay an even bigger fortune.’

Antti Innanen, the legal AI expert behind Laverne, told AL: ‘Yes, the tokens are getting more expensive. Or perhaps more accurately: what is becoming more expensive is using the newest frontier models, especially reasoning models. The frontier is expensive, the baseline is getting cheaper.

‘My view is that the new models were heavily subsidized at first to attract users and developers. I am not entirely sure what the long-term strategy is for API pricing. APIs are an important revenue stream for the foundation model providers.

‘Legal tech companies are responding in different ways.

1. Using cheaper models – Some rely on older models. Others use escalation protocols or model-routing systems that reserve expensive models for complex tasks. [Using an] old model is the main money-making move.

2. Creating legal-specific layers – Legal tech vendors continue to build legal-specific features and workflows. Some of this is genuine innovation. Some of it is innovation theatre and fear designed to justify pricing. At the end of the day, most of these companies are middle men. They buy tokens and resell them at a premium.

That premium has to be justified somehow, often through convenience, workflow integration, proprietary data, or customer concerns about risk.

3. Local models – The obvious escape route is local models.’

And Jake Jones, co-founder of Flank added: ‘Work is shifting to agentic workloads that use orders of magnitude more tokens. Even as models get smarter and cheaper, longer-running tasks will keep consumption climbing.

‘That’s already shifted our pricing away from ‘this is a tool, here’s a licence’, towards ‘this is an autonomous system displacing humans’, priced on the displacement, not the licence. In short: it’s expensive, but you get what you pay for, and still a fraction of what you’d pay an ALSP to do the work.’

Meanwhile, Winston Weinberg, CEO of Harvey, shared with AL how they are thinking very deeply about token costs, and in fact the overall related costs of running agents and AI infrastructure in general. Weinberg highlighted the work of Factory, which helps route tasks to the appropriate model for each need, i.e. not all tasks demand the most expensive model.
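As a rough illustration of the routing idea, the sketch below sends simple tasks to a cheap tier and reserves an expensive tier for complex or very long ones. It is not Factory's or Harvey's implementation. The model names, prices, task types and threshold are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    price_per_m_tokens: float  # hypothetical blended price in dollars


# Hypothetical tiers; names and prices are placeholders, not real products.
CHEAP = Model("small-model", 0.5)
FRONTIER = Model("frontier-model", 15.0)

# Hypothetical rule: task types assumed to need the frontier tier.
COMPLEX_TASKS = {"draft_agreement", "multi_document_analysis"}


def route(task_type: str, tokens: int, long_context_threshold: int = 100_000) -> Model:
    """Pick the cheapest tier assumed to be adequate for the task."""
    if task_type in COMPLEX_TASKS or tokens > long_context_threshold:
        return FRONTIER
    return CHEAP


def cost(model: Model, tokens: int) -> float:
    """Dollar cost of pushing `tokens` through `model`."""
    return tokens * model.price_per_m_tokens / 1_000_000


# Hypothetical workload: (task type, token count).
workload = [
    ("summarise_email", 2_000),
    ("extract_dates", 5_000),
    ("multi_document_analysis", 300_000),
]

routed = sum(cost(route(task, n), n) for task, n in workload)
all_frontier = sum(cost(FRONTIER, n) for _, n in workload)
print(f"routed: ${routed:.4f}, all frontier: ${all_frontier:.4f}")
```

In practice the hard part is the routing rule itself. A static lookup like this one is the simplest possible version, and the saving depends entirely on how much of the workload is genuinely simple.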

He also pointed to their work with LangChain, which – put simply – reduces the cost of running agents by making it less expensive to check their performance.

Here’s a summary: ‘Verifiers can be a cost bottleneck for running agent evaluations and RL post-training at scale. We find we can reduce verifier costs by an order of magnitude by batching verifiers and using open models. Tuning prompts for verifiers allow us to target particular behaviour further.’
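One possible reading of 'batching verifiers' is that several checks share a single model call, so the agent's transcript is read once rather than once per check. The sketch below counts input tokens under that assumption only. It is not a description of how Harvey or LangChain actually do it, and the token figures are hypothetical.

```python
def unbatched_tokens(transcript_tokens: int, rubric_tokens: int, n_checks: int) -> int:
    """Each check is a separate call that re-reads the whole transcript."""
    return n_checks * (transcript_tokens + rubric_tokens)


def batched_tokens(transcript_tokens: int, rubric_tokens: int, n_checks: int) -> int:
    """One call reads the transcript once and applies every rubric."""
    return transcript_tokens + n_checks * rubric_tokens


# Hypothetical sizes, for illustration only.
TRANSCRIPT = 50_000  # assumed tokens in one agent run to be checked
RUBRIC = 200         # assumed tokens per verification instruction
CHECKS = 20          # assumed number of checks per run

unbatched = unbatched_tokens(TRANSCRIPT, RUBRIC, CHECKS)
batched = batched_tokens(TRANSCRIPT, RUBRIC, CHECKS)
print(f"unbatched: {unbatched:,} tokens, batched: {batched:,} tokens, "
      f"ratio: {unbatched / batched:.1f}x")
```

With these assumed sizes the input-token saving alone is more than tenfold. Switching the verifier to a cheaper open model, as the summary also mentions, would be a separate saving on top.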

What Does This All Mean?

So what does this all mean? Here are some thoughts from Artificial Lawyer.

  • Eventually, all legal AI companies will have to pass on the costs of tokens in a more direct way, as they can’t absorb all the costs alone, especially when also sometimes offering very subsidised per seat deals to win customers.
  • More legal AI companies may start to build their own fine-tuned open-source LLMs to allow them to reduce costs instead of having total reliance on the main frontier models for all needs. The more they avoid relying on Claude and ChatGPT, the more they can reduce *some* of their costs.
  • It shows that Thomson Reuters’ move to build their ‘own AI’, via an open-source model makes even more sense now. It’s not just about better answers trained on their data, it’s about reducing their cost of using AI. That potentially means bigger profit margins for the publicly listed business.
  • And it shows that Kirkland & Ellis’s plan – or at least AL believes this is so – to fine-tune their own open-source models for internal use is not just smart in terms of providing privacy and a good message to the clients – it may also save them money in the long term.

And also…

  • Does going all-in with Claude cause cost problems? AL asked ChatGPT to estimate the costs of a 3,000-lawyer firm going full tilt with frontier models and the costs were very, very large (more on that later).
  • Should more law firms start buying GPUs as well to do what TR and Kirkland are doing? Possibly, if they can afford it. But, few will do so.
  • Will token costs keep rising now that OpenAI and Anthropic have a near monopoly on enterprise users? Will their IPOs force them to up prices even more? Possibly. And that in turn could drive more changes in how legal AI economics operates.

Conclusion

The next time you use an AI tool at work for a very document-heavy task you may want to ask: how much is this costing us?

This in turn will create a whole new sensibility around legal AI use. Can it be done more cheaply? Are we wasting money? Do we copy Kirkland? How are our suppliers helping, or not?

Legal tech companies, meanwhile, face the challenge of what to do with their own costs. Pass them on? Swallow them? End per seat deals? Move to a mobile phone / data use pricing strategy? Start buying GPUs and / or develop new strategies to reduce frontier model use?

And will inhouse teams start to pick law firms on the basis of how they handle token costs?

AL will never think of legal AI use the same way after really dwelling on the token issue. Why? Because AL is all about the intersection of the business of law and AI. And if the cost of using AI changes then that also impacts this interrelationship.

As said at the beginning: if legal AI tools are the vehicles our work is now transported by, then tokens are the oil that drives it all. And that’s an issue because this ‘oil’ is getting increasingly expensive – and just as our AI needs are growing rapidly.

And as noted, this in turn will reshape how the legal tech market functions, and perhaps the legal market as a whole.

Richard Tromans, Founder, Artificial Lawyer


