Equall, a Paris-based genAI company, has released two new LLMs, SaulLM-141B and SaulLM-54B, designed specifically for the legal domain. The team claims that they outperform GPT-4 and Mistral on a range of legal tasks. These models, which are much larger than those previously developed by the group, once again raise the question: is it better to use a general LLM, or one made specifically for legal tasks?
The new models were co-developed with CentraleSupélec at the Université Paris-Saclay and CINES. As for Equall, the company describes itself this way: ‘We build serious products for serious legal work – and are redefining how lawyers and corporations deal with legal risk.’
(And if you’d like to see another legally-focused LLM, then check out 273 Ventures’ KL3M.)
The numbers 141 and 54 refer to the number of parameters in each model, in billions, not to the volume of training data. The models follow an earlier and much smaller experiment. The training sources include multiple free libraries. See below for a sample of the pre-training data used in one of their models, which gives a sense of where their results are coming from.
The new models are ‘achieving state-of-the-art performance on legal reasoning benchmarks.’ They are also ‘sharing their research and training methodologies with the community in an aim to spur further innovation, research, and value creation in the legal industry’, stated Pierre Colombo, Chief Scientist at Equall.
Performance was measured with LegalBench, a leading benchmark used to evaluate legal reasoning in LLMs. LegalBench comprises over 150 tasks across six legal reasoning categories: issue-spotting, rule-recall, rule-application, rule-conclusion, interpretation, and rhetorical understanding.
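To make the idea of scoring across six categories more concrete, here is a minimal sketch in Python of how per-task results might be rolled up into per-category scores. It is not Equall's or LegalBench's evaluation code: the function names, the toy data, and the choice of balanced accuracy as the per-task metric are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

# The six LegalBench reasoning categories named in the article.
CATEGORIES = (
    "issue-spotting",
    "rule-recall",
    "rule-application",
    "rule-conclusion",
    "interpretation",
    "rhetorical-understanding",
)


def balanced_accuracy(gold, predicted):
    """Average of per-class recall (assumed metric, for illustration)."""
    per_class = defaultdict(lambda: [0, 0])  # label -> [correct, total]
    for g, p in zip(gold, predicted):
        per_class[g][1] += 1
        if g == p:
            per_class[g][0] += 1
    return mean(correct / total for correct, total in per_class.values())


def category_scores(task_results):
    """Average task scores within each category.

    task_results is a list of (category, gold_labels, predicted_labels),
    one entry per task. This layout is hypothetical.
    """
    by_category = defaultdict(list)
    for category, gold, predicted in task_results:
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        by_category[category].append(balanced_accuracy(gold, predicted))
    return {cat: mean(scores) for cat, scores in by_category.items()}


# Toy data only: these are not real LegalBench tasks or model outputs.
toy_results = [
    ("issue-spotting", ["Yes", "Yes", "No", "No"], ["Yes", "No", "No", "No"]),
    ("issue-spotting", ["Yes", "No"], ["Yes", "No"]),
    ("rule-recall", ["A", "B"], ["A", "A"]),
]
scores = category_scores(toy_results)
```

Averaging within each category before comparing models keeps a category with many tasks from drowning out one with only a few, which matters when the question is whether a legal-specific model is better at, say, rule-application rather than just better overall.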
‘While our training work is not complete, the results mark a significant achievement – and we hope that our research will light the path for further innovation. Now, we are working to optimise the models for interactive experiences and other legal use cases through additional data integration, further fine-tuning, and enhanced user preference alignment,’ the company stated.
General or Legal Specific?
Is it better to have legal-specific LLMs, or to use a general model from a provider like OpenAI and then fine-tune it and add a system prompt? Generally, the answer has been the latter, at least in terms of what we see in the market. However, Equall stated: ‘Our work with Saul further confirms that domain-specific models can surpass general-purpose systems for legal work.’
‘From deeper and more contextualized understanding, to increased accuracy and relevance, to expert-level insights, enhanced user experience and beyond — specialized models can meaningfully elevate the performance of AI in law. While the impact of specialized LLMs has already been well observed in several fields, like code, medicine and science, domain specialization in the legal industry has been less deeply explored. The SaulLM project aims to contribute to this discussion.’
And then you may still ask: but why a special model for law?
The company explained that: ‘Due to the profound nuance and intricacies in legal practice, unlocking the full potential of AI in law requires the development of highly-specialized systems combining multiple models, diverse techniques and complex algorithms to service legal use-cases and replicate legal work processes.’
The entire Saul family of models is now available on Equall’s Hugging Face page. In the spirit of fostering innovation and increasing access to legal resources, the company is releasing the Saul models openly, for ‘free and responsible use under the permissive MIT license’.
So, there you go. Try it out and see what you think. Legal-specific or general LLMs, or both at the same time? Only one way to find out.