Harvey Launches ‘Legal Agent Bench’

Agents… they sound great, but do these autonomous programs actually do what you want? Are they accurate? Are they reliable? Enter Legal Agent Benchmark by Harvey, which launches with the support of a list of major names such as Nvidia, OpenAI, Anthropic, Mistral, and DeepMind.

Think of Harvey’s ‘Big Law Bench’ for testing AI outputs. Now think of something made to measure the performance of agents. In a nutshell, that’s it.

So, how does ‘Legal Agent Bench’ (LAB) work?

First, it will be open source and open for everyone to get their agents tested.

The first version of LAB includes more than 1,200 agent tasks across 24 legal practice areas, with results evaluated against more than 75,000 expert-written rubric criteria.

As Niko Grupen, Head of Applied Research at Harvey, told AL: ‘You can bring your agents [to the LAB] to solve tasks.’

Those tasks, each with a specially designed rubric – or as AL suggested ‘an agent assault course’ – test the agent and show how it performed. As Grupen explained, it could be an M&A deal, and the agent has to find key provisions in synthetic data, consider those provisions’ importance, and then write a report.

Grupen said it’s useful to think of agents as working across three main areas: planning, interacting, and adapting.

‘Agents will decompose a task, execute it, interact with data, tools, and also with other agents, and humans to ask for help and review. There is also adaptation where the agent may need to know more information,’ he said.

He noted that the team has also planted issues in the test material to see whether the agents will spot them.

In short, imagine giving an associate a bunch of documents, some rules on how to work, and some specific instructions as well. Tell them who they can ask for help, and where they can look for more information, and then say: ‘Begin and tell me when you are done, or if you get stuck.’

He also noted that people can test out Harvey agents, or ones they’ve customized, and he underlined that vibe-coders will be welcome to test agents out on the open-source LAB.

‘LAB provides a shared framework for model providers, agent builders, researchers, and the legal community to measure and track progress. It evaluates agent performance on real-world legal tasks using a structure of instructions, client matters, and required work product,’ the company explained.
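For readers who think in code, here is one way to picture that structure. This is only a minimal Python sketch, not Harvey's code or LAB's actual harness: the names `Criterion`, `Task` and `score`, the keyword-based checks, and the weights are all hypothetical, invented here for illustration. It assumes a task bundles instructions, client matter documents and a rubric, and that a work product earns a weighted share of the rubric criteria it satisfies.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Criterion:
    """One rubric line: a description plus a check on the work product."""
    description: str
    check: Callable[[str], bool]  # hypothetical stand-in for expert grading
    weight: float = 1.0


@dataclass
class Task:
    """Hypothetical task: instructions, client matter, and a rubric."""
    instructions: str
    client_matter: List[str]  # synthetic documents handed to the agent
    rubric: List[Criterion] = field(default_factory=list)


def score(task: Task, work_product: str) -> float:
    """Return the weighted share of rubric criteria the work product meets."""
    total = sum(c.weight for c in task.rubric)
    if total == 0:
        return 0.0
    earned = sum(c.weight for c in task.rubric if c.check(work_product))
    return earned / total


task = Task(
    instructions="Review the agreement and report on key provisions.",
    client_matter=["Synthetic share purchase agreement (placeholder text)"],
    rubric=[
        Criterion("Identifies the change-of-control clause",
                  lambda text: "change of control" in text.lower()),
        Criterion("Identifies the indemnity cap",
                  lambda text: "indemnity cap" in text.lower()),
        # A planted issue the agent is expected to spot, weighted more heavily.
        Criterion("Flags the missing governing-law clause",
                  lambda text: "governing law" in text.lower(), weight=2.0),
    ],
)

report = "The change of control clause is triggered; the indemnity cap is low."
result = score(task, report)
print(f"{result:.2f}")  # prints 0.50: two of four weighted points earned
```

In this toy example the report finds two provisions but misses the planted issue, so it earns half the available points. A real rubric would rely on expert-written criteria and far richer grading than keyword matching.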

‘We want model providers, startups, researchers, legal AI companies, and law firms to run the benchmark, audit the rubrics, improve the harness, contribute new task families, and help define what legal agent evaluation should measure next’, added Harvey.

There will also be a leaderboard in the coming weeks, so that you can see which AI systems are best at supporting certain agentic tasks.

And in terms of fellow collaborators that are supporting the project, the list includes: LangChain, Fireworks AI, Baseten, Applied Compute, Mithril/Jared Quincy David, Stanford Liftlab, Trajectory, Moritz Hardt, Snorkel, Mercor, Irwan Bello, and Kelvin Guu.

Harvey added: ‘For the collaborators listed we are working closely with the foundation model labs, inference providers, Neo labs, and AI companies to evaluate the best open and closed source legal agents. These research groups have already made contributions to the benchmark and the research directions that it enables.’

Is this a big deal?

Yes. Agents are going to become a key part of legal tech. The tech is improving and, as Grupen told AL, just as coding agents have massively improved and had a big impact on engineering, so too will legal agents have a big impact here.

Allowing public testing of agents like this encourages faith in such products and helps lawyers to develop use cases and build familiarity with this approach.

Great stuff.
