GPT-3 – A Game Changer For Legal Tech?

Tech circles have been lit up over the last few weeks following the unveiling by OpenAI of GPT-3, a new type of pretrained language model capable of generating natural language text and computer code with the most minimal of inputs. Could it be a game changer for legal technology, especially in relation to NLP tools that analyse text, as well as doc generation systems?

Artificial Lawyer was initially quite sceptical about GPT-3 (or, to give it its full name: Generative Pretrained Transformer 3, an unsupervised Transformer language model). How was the ability to auto-generate pages of chimeric text that appear to be written by a real person going to be useful, especially in the legal world? Wasn’t this just a gimmick?

So, AL contacted legal AI experts in California to get their views, which seemed apt as OpenAI is an artificial intelligence research laboratory founded in San Francisco in 2015 by Elon Musk and other tech figures. Its CEO is Sam Altman, former president of the legendary startup accelerator Y Combinator, and Microsoft has provided $1 billion in funding to support its work. In short, this is a serious project.

But, what did the legal sector experts think?

First, Dan Rubins, a legal tech founder based in California, who is well-known in the market for having a sensible and level-headed approach to the field, with both a solid understanding of the law and coding. This is what he said:

‘I think GPT-3 will be quite impactful, but it’s the trend that really matters.

In the abstract sense, GPT-3 has already shattered benchmarks, but it really crossed the ‘uncanny valley’ for me when I saw this demo: 

https://twitter.com/sharifshameem/status/1282676454690451457 

This was the first time I saw a machine perform a task similar to what I do, all day, every day.

Plenty of technical challenges also lie ahead. For example, GPT-3 apparently costs between $4m and $12m to train, so we are unlikely to see the pace of iteration that we’ve seen with prior models, especially in domain-specific applications like medicine and law. GPT-3 is a couple of orders of magnitude larger than its predecessor – 175B parameters vs. 1.5B for GPT-2.

Microsoft released a 17B parameter model last month, so there’s a kind of ridiculousness to current NLP model sizes. Researchers will apply pruning and other techniques to shrink the models, as happened with the initial release of BERT and the subsequent release of DistilBERT; it just takes some time (funding), creativity (funding), and effort (funding).

In legal tech, few will be using a custom GPT-3 model anytime soon.’
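Rubins’ point about shrinking models is worth unpacking. The technique behind DistilBERT is knowledge distillation: a small ‘student’ model is trained to match the softened output distribution of a large ‘teacher’. The sketch below illustrates the core loss in plain Python with made-up logits – it is not any real model’s output, just the shape of the idea.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with a temperature knob: higher T flattens the distribution,
    exposing the teacher's 'dark knowledge' about near-miss classes."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy of the student against the teacher's softened outputs.
    Minimising this pushes the small student to mimic the large teacher."""
    p = softmax(teacher_logits, temperature)   # teacher's soft targets
    q = softmax(student_logits, temperature)   # student's predictions
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))

# Illustrative logits: a student that tracks the teacher scores a lower
# loss than one that contradicts it.
teacher = [3.0, 1.0, 0.2]
good_student = [2.9, 1.1, 0.3]
bad_student = [0.2, 1.0, 3.0]
```

In practice the distillation loss is combined with the ordinary training loss, but even this toy version shows why the approach takes, as Rubins says, time, creativity, and effort: the big teacher must exist and be queried before any small student can be trained.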

Next, AL asked the team at NLP-driven doc analysis company, Evisort, who are also based in California, what they thought about this new capability. Was it going to be useful to legal tech companies? This is what Amine Anoun, Chief Technology Officer, said:

‘GPT-3 is the biggest advance in AI language models since its predecessor, GPT-2, was released in 2019. Trained with two orders of magnitude more parameters, it’s poised to beat many current accuracy benchmarks in tasks like natural language generation, named entity recognition, and question answering.

Even though it’s not specifically trained on legal language, GPT-3 can still be useful to lawyers. With few-shot learning, additional legal contracts and metadata can be passed into the pre-trained GPT-3 model, enabling it to quickly learn to generate new contracts or clauses.

For example, Evisort’s proprietary AI extracts key metadata and clauses from contracts, and can flag risk based on a user’s risk playbook. But with GPT-3, we can go one step further and suggest edits and redlines with less training data to mitigate those risks, increasing speed and efficiency during contract negotiation.

Another application of GPT-3 is the ability to support smart searches. A user can ask questions like ‘Can I terminate the contract at any time?’ or ‘What state law governs the agreement?’.

These queries are already possible today with a higher accuracy than human review, but with the amount of variation and complexity of contract language, GPT-3 could push the envelope on precision and recall even more.

It’s unlikely a technology like GPT-3 will fully replace contract drafting anytime soon, but it can augment the process of contract generation, analysis and e-discovery.

We’re very excited about the possible applications of GPT-3 at Evisort. By combining our specialized AI algorithms with a language model of this scale, we can help lawyers manage their contracts even more efficiently.’
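As a rough illustration of the few-shot learning Anoun describes: instead of fine-tuning, a prompt is assembled from a handful of worked instruction-and-clause pairs followed by the new request, and the model is left to continue the pattern. The helper and clause texts below are hypothetical – a sketch of the prompting pattern, not Evisort’s pipeline or OpenAI’s actual API.

```python
def build_few_shot_prompt(examples, new_instruction):
    """Assemble a few-shot prompt: worked examples first, then the new
    case, ending at 'Clause:' so the model completes the clause itself."""
    parts = []
    for instruction, clause in examples:
        parts.append(f"Instruction: {instruction}\nClause: {clause}\n")
    parts.append(f"Instruction: {new_instruction}\nClause:")
    return "\n".join(parts)

# Illustrative examples only – not real contract language from any dataset.
examples = [
    ("Draft a 30-day termination clause.",
     "Either party may terminate this Agreement on thirty (30) days' written notice."),
    ("Draft a New York governing-law clause.",
     "This Agreement shall be governed by the laws of the State of New York."),
]
prompt = build_few_shot_prompt(examples, "Draft a mutual confidentiality clause.")
```

The resulting string would be sent to the model as a single prompt; the appeal, as Anoun notes, is that the same pattern works for new clause types with no retraining at all.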

And Jerry Ting, CEO and co-founder of Evisort, added: ‘We will be exploring its capabilities and testing it, as we do with any major developments in AI, but it’s too early to tell whether it’s directly applicable yet.

If we do use it, it’ll probably be in conjunction with our proprietary solutions as a bolt on to supplement all the inventions we’ve already created.’

Meanwhile, Joshua Browder – also based on the West Coast – and the founder of DoNotPay, sent a quick reply to say:

‘It’s truly amazing. I’m using it right now and we are building it into several of our products.’

So… DoNotPay is going to try to leverage GPT-3. You read it here first.

So, there you go. Given their views, it looks like we should take GPT-3 seriously, but not get carried away.

It’s going to be very interesting over the next couple of years to see how legal tech companies use this (and future iterations of it… roll on GPT-4), especially those tapping NLP and NLG (natural language generation) systems to build their products.

Plus, some additional insights from UK-based legal tech company, Genie AI, via their CEO, Rafie Faruq, who is an expert in machine learning.

‘The main innovations of GPT-3 are twofold. Firstly, an incredibly large model size (175bn parameters, compared to Microsoft’s Turing-NLG with just 17bn), which is partly enabled by new datasets such as the Common Crawl dataset of nearly a trillion words.

Secondly, its few-shot learning, which sometimes achieves state-of-the-art performance on a variety of NLP tasks. The ability of GPT-3 to generate language in a task-agnostic way, without model fine-tuning, is a monumental breakthrough in NLP. However, GPT-2 wasn’t released for public use for a long time.

Furthermore, even if GPT-3 is released beyond a mere API, hosting a 175bn-parameter model in a legal tech product while expecting fast inference (and a good user experience for lawyers) is highly unlikely.

So, in summary, this is unlikely to be usable by legal tech companies in a production grade application for some time.

We already use bi-directional transformers (at Genie AI), and will certainly upgrade to GPT-3 if it is possible. We’ve signed up for the beta (not many people have access to GPT-3 right now) and will be playing with it as soon as it comes out, but, as per the quote above, at this stage we’re quite doubtful it’s actually usable in a legal tech product, because of its sheer size.

Predictions are likely to be very slow, and fine-tuning or retraining it on legal data is likely to be very hard due to the computational resource needed. Language modelling in the last few years has basically been a battle of who has the most computational resource (as has much of machine learning).’

Have you tested the software yet? Let AL know what you think.

4 Comments

  1. GPT-2 was not much more than a curiosity in the context of legal work. Coherent but nonetheless fake stories about the discovery of a herd of unicorns took the internet by storm, but the reality was that getting GPT-2 to output something relevant and useful was impossible. Parameters could be set to determine how variant the output would be, but we found nothing of practical application in our research at Contract Genetica. Fine-tuning the large model using GPUs was also impractical even with the most powerful hardware, and CPU fine-tuning was time-consuming.

    GPT-3 shows more promise (and commensurate hype), but the model’s vastness means fine-tune training to better address legal tasks presents its challenges. GPT-3 can, for example, convert legalese to simpler text and vice versa. But it leverages public data, so it has inevitably assimilated US court and SEC filing data (a bread-and-butter data source of the legal AI community). Not so useful for agreements that tend to be confidential, such as English law contracts.

    GPT-3 is in private beta at present and use is via an API to OpenAI’s infrastructure so they can monitor the output – partly to prevent abuse of the technology (its use by those with a preference for disruption and misinformation) and partly to better understand its unfortunate bias towards discriminatory and extreme output (where prompted in this direction). This is a problem OpenAI are working to filter out.

    Needless to say, these challenges mean we are unlikely to see large-scale implementation from law firms for some time, but discrete use cases abound.

    • @John Danaby, you lack imagination. Neither GPT-2 nor GPT-3 is supposed to be used for its textual output.

      Their complexity is to be turned to filtering and detection.

  2. That demo is pretty impressive. For the lawyer who wants to build, e.g., an automated advice system and render the look of the UI (exactly what I am trying to do this week), the ability to just say what you want, get code implementing your vision, and see the results is breathtaking. Clause/doc generation is an obvious application, but the freedom and immediate results that this can give the legal designer generally is a game changer. The fact that Microsoft have thrown a billion into the kitty speaks volumes.

  3. Dan Rubins was right to point out that it’s the trend in NLP advances that matters. In October of 2018, Google started it with “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.” In 2019, there were at least eight additional advances by others using the transformer architecture. After BERT opened the door: tinyBERT, Grover, BigBIRD, RoBERTa, two ERNIEs, a KERMIT, and (recently) BEHRT, where the EHR stands for Electronic Health Records (collectively dubbed “Muppetware”).

