Hanzo Offers Gigabyte Pricing for ‘Smaller LLMs’, ‘10X Cheaper’

eDiscovery company Hanzo is making its Spotlight AI technology generally available, claiming that it ‘drastically lowers the cost’ of applying ‘smaller LLMs’ to enterprise legal use cases, with data processing costs as low as ‘$99 per Gigabyte (GB), which is ten times more cost-effective than current industry norms’.

Sounds interesting… so how do they do this?

The company stated that ‘rather than using API calls to extremely large language models like ChatGPT, the company has engineered Spotlight AI to work with smaller models hosted on secure private cloud instances for each customer. This approach ensures optimum performance by orchestrating optimal and cost-effective models to achieve the required results’.

However, when this site asked which ‘smaller models’ they were referring to, they didn’t immediately name one. Instead they replied: ‘Hanzo uses several smaller LLMs [that are] fit for purpose, as opposed to a general purpose LLM like GPT-4.’

The next step is charging customers by gigabytes (GB) of data processed, rather than by ‘tokens’.

Or, as they explain: ‘Most generative AI systems charge based on the number of ‘tokens’ created during data processing. Tokens are numerical representations of letters, characters, phrases, or sentences that allow neural networks to process human language.

The pricing model based on tokens can be particularly costly in the legal industry where data discovery use cases require vast numbers of documents to be processed. Using generic LLMs like ChatGPT 4, which haven’t been optimized for the task, can make AI cost-prohibitive in the legal domain.’

So, if you are doing eDiscovery work, for example, then you’re looking at tens of thousands of documents, emails, chat messages and more, which potentially means huge numbers of tokens.
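To make the pricing difference concrete, here is a back-of-envelope sketch comparing per-token and per-GB billing. All figures other than the quoted $99/GB rate are illustrative assumptions (roughly 4 characters per token for English text, and a hypothetical $10 per million tokens for a large frontier model), not Hanzo’s or any vendor’s actual rates.

```python
# Back-of-envelope comparison of token-based vs per-GB pricing for a
# document review workload. Only the $99/GB figure comes from the
# announcement; the per-token rate and chars-per-token ratio are
# illustrative assumptions.

CHARS_PER_TOKEN = 4          # rough rule of thumb for English text
GB = 1_000_000_000           # bytes in a gigabyte (decimal)

def tokens_per_gb(chars_per_token: float = CHARS_PER_TOKEN) -> float:
    """Approximate token count for 1 GB of plain text (1 byte ~ 1 char)."""
    return GB / chars_per_token

def token_priced_cost(gb: float, usd_per_million_tokens: float) -> float:
    """Cost of processing `gb` of text at a per-token rate."""
    return gb * tokens_per_gb() / 1_000_000 * usd_per_million_tokens

def gb_priced_cost(gb: float, usd_per_gb: float = 99.0) -> float:
    """Cost at the flat per-GB rate quoted in the announcement."""
    return gb * usd_per_gb

# Example: 5 GB of collected documents.
print(round(token_priced_cost(5, 10.0)))   # 12500
print(round(gb_priced_cost(5)))            # 495
```

Under these assumptions a single gigabyte of plain text is around 250 million tokens, which is why per-token billing adds up so quickly on discovery-scale datasets, and why a flat per-GB rate is easier for legal teams to budget against.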

And finally they add that: ‘Due to its unique approach based on optimal orchestration of smaller LLMs, Hanzo can provide a price that is 10x to 20x times lower cost, which makes adoption of this technology feasible.’

Well, that’s quite a statement… But it’s certainly true that LLM token costs are seen by some as a barrier to large-scale document analysis.

Julien Masanès, CEO, Hanzo, added: ‘Pricing of AI tools has been opaque, and frankly that’s limited the adoption of the technology across the profession. At Hanzo, we have worked hard to crack the economics of AI as a key factor for enterprises, recognising that the ultimate measure of success is the return on investment. AI must prove its worth by helping companies manage the overwhelming data involved in litigation and investigation. We invite all industry professionals interested in this significant shift to test and validate in their own environments.’

The company also quoted an anonymous expert in discovery at a Fortune 100 company, who said: ‘Generative AI is going to be transformational for legal and compliance functions but the cost is a barrier today. Hanzo’s work to engineer the cost out of this technology opens the door to a number of new use cases for Gen AI in legal departments, from data discovery to litigation. We also see potential for the technology in compliance, with the ability to proactively identify risks.’

Well, let’s see where this goes. It has always seemed to this site that token pricing was an odd approach for users who may have spent many years thinking in terms of GBs – even if tokens make sense to AI engineers who run LLMs.

As to the point about smaller LLMs, there’s some greyness there, but an example could be something like Mistral 7B, with 7 billion parameters, as compared with GPT-4’s rumoured 1.76 trillion. But let’s see how this selection of smaller LLMs evolves as well.