Liquid Legal Creates Open Source NLP Library

The Liquid Legal Institute (LLI), an organisation focused on building a ‘Common Legal Platform’, has created an extensive open source library of free NLP resources that may be of use to the legal and legal tech world.

The library, which is on GitHub, covers a wide range of content: general know-how, free NLP software, legal sector datasets that you can use to train NLP models for a specific project (see part of the resources listed below), and tutorials to help you learn more about how to use this software.

It’s a great endeavour that helps to broaden the use of NLP software, which is steadily helping to reshape a significant part of the legal tech market (see The New Legal AI Map).

Artificial Lawyer caught up with LLI co-founder, Bernhard Waltl, who is also a Legal Operations Officer in Munich at German car group, BMW, to find out some more.

One particularly interesting aspect Waltl mentioned was that this is not necessarily about encouraging law firms to start building their own NLP tools, although it would be a great resource if they wanted to. Rather, the priority is to break down the knowledge barriers that hold lawyers back from understanding what NLP is and what it can do.

Ironically, this may in turn result in lawyers looking to existing commercial legal tech vendors, but at least they will then do so with a deeper knowledge of what NLP is all about and how it works. And that’s got to be a good thing. In other words, free resources can in theory help to sustain and expand a commercial market rather than undermine it, at least for what is still seen by some as a new approach.

Here’s the interview that covers this point and much more. Enjoy.

LLI co-founder, Bernhard Waltl

– First, what is Liquid Legal?

Liquid Legal is a mindset, a movement, an attitude, and a laboratory for the future of the legal industry. It all started a couple of years ago with our first official book, in which the transformation of legal was sketched.

The book’s subtitle unveils a lot: ‘Transforming Legal into a Business Savvy, Information Enabled and Performance Driven Industry’. Since then, we have been overwhelmed by the momentum Liquid Legal has experienced.

In 2018 we founded a non-profit association with a lightweight structure of 5+1 working groups, namely Digitalisation, Education, Standardisation, Methodologies, and Material Law, in which our members work on projects that contribute to the ‘+1’ working group and our overall vision: the Common Legal Platform.

– Now the information on NLP, is this all from other sources, or did some come from LLI?

We observed that many NLP projects in the legal field lack a common understanding and a knowledge base that contains all the freely available state-of-the-art information on NLP for legal. This is counter-intuitive, as so much information is freely available.

This was the starting point for the Legal Text Analytics Repository, which is the result of a project that was proposed and completed within a very short period of time. Most of the content is linked: we ‘just’ collected it and linked to the original sources.

The NLP content that has been created by LLI members is linked there as well. We are constantly adding new information and are more than happy to add things that we have missed. The great thing with GitHub is that it is really open to everyone: not only to those who want to inform themselves, but also to those who want to contribute.

– Do you hope law firms will build their own legal tech tools (by using this information)?

No, we don’t expect law firms or in-house departments to write their own legal tech tools, at least not with the repository alone. However, we think that the repository lowers the barrier to trying out and understanding legal tech tools.

Ultimately, it creates a common and shared understanding. This contributes to our mission of the ‘Common Legal Platform’: We get more aligned with what we have in common and become much more efficient in solving our own, individual challenges. Open source is definitely a step forward – although it might appear as a novelty for the legal industry. We, at the LLI, are more than happy to bring this mindset to the legal domain.

– What is there?

The repository is structured into different sections: Selected Tasks and Use Cases, Methods, Libraries, Datasets and Data, Annotation and Data Schemes, Annotation Tools, Research Groups and Labs, and Tutorials.

We assembled all the information that is required to start and implement an NLP project. We want to be as hands-on as possible: no overhead, no registration, and above all: no paywall.

– Why did you do this?

Well, you must understand the LLI mindset: we want to solve the problems that cause headaches for everyone during digitalisation. This includes getting started with AI / NLP projects. We decided to summarise all the relevant information in one place and create an overview that everyone can use. Of course, we hope that the community accepts our offering and gives something back, e.g., a resource that is not yet linked.

We are more than happy that the various communities have reacted so positively: the International AI and Law Association linked the repository on their official webpage and many NLP communities have shared it within their network. As of now we have approx. 200 GitHub stars, which we consider a success.

– Is this the only open source area, and are there others you are building?

Most of the projects are creating something that is freely available. Not everything is published via GitHub; some of it appears on other channels. Just have a look at our webpage to get an overview of all the great projects that are either completed or ongoing. We cordially invite you to become a member, join the LLI community and work with us on Liquid Legal.

Congratulations, this is a great project!

[Image: some of what’s on offer in the library. Source: LLI, via GitHub.]