Legal AI pioneer Kira Systems believes it has become the first legal tech company to develop a Differential Privacy capability, which ensures that when clients share trained NLP models with others, confidential information cannot be discovered by ‘reverse engineering’ the software’s responses to potentially sensitive documents.
[*** Note: see the comment below from Kevin Gidney, CTO and co-founder of Seal Software. His main point is that Seal has already allowed customers to share models with protections for a long time, although he is not claiming Seal used differential privacy to do so.]
The simple version of what this does: it makes it very hard, if not impossible, for someone with bad intentions to extract confidential deal insights from Kira’s software once a trained model has been shared with them.
A classic, if theoretical, example of the kind of thing Kira wants to make impossible: an M&A deal with multiple targets, where someone has trained up what the company calls a Smart Field on a certain target’s documents, and someone involved in the deal then sends that field to a third party, e.g. at a bank.
Someone at that third party gets hold of the model and decides to run the Smart Field on several companies’ documents that they believe could be targets of the M&A deal. By seeing how those documents respond to the trained NLP model, they get a steer on which one is going to be bought. Such information could then be used for insider trading – a massive breach of lawyer-client confidentiality, albeit one that most lawyers have probably never even considered possible.
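To make the risk concrete, here is a minimal hypothetical sketch in Python – the model, company names and scores below are invented for illustration and are not Kira’s actual behaviour – showing how an attacker could probe a shared model and simply pick the candidate it responds to most strongly:

```python
# Hypothetical sketch of the probing attack described above. The model,
# company names, and scores are all invented for illustration.

def score_with_model(document: str) -> float:
    """Stand-in for running a shared, trained Smart Field over a document.

    A model trained on one target's documents tends to respond more
    strongly to further documents from that same target."""
    fake_scores = {
        "CompanyA filings": 0.31,
        "CompanyB filings": 0.95,  # the model was trained on CompanyB
        "CompanyC filings": 0.28,
    }
    return fake_scores[document]

candidates = ["CompanyA filings", "CompanyB filings", "CompanyC filings"]

# The attacker simply picks the candidate the model 'lights up' on.
likely_target = max(candidates, key=score_with_model)
print(likely_target)  # -> 'CompanyB filings'
```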
What the differential privacy algorithm does is so greatly widen the range of possible responses the system could give that its output offers no clues to someone ‘taking shots in the dark’ and hoping to get a hit.
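In code terms, one standard way of doing this – a minimal sketch of the textbook Laplace mechanism, not Kira’s actual implementation, and with illustrative sensitivity and epsilon values – is to add calibrated random noise to every response the model gives out:

```python
import numpy as np

def private_score(true_score: float, sensitivity: float = 1.0,
                  epsilon: float = 0.1) -> float:
    """Return a differentially private version of a model response.

    Laplace noise scaled to sensitivity/epsilon is added, so the output
    distribution barely changes whether or not any single confidential
    document was in the training data. Smaller epsilon = more privacy."""
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_score + noise

# The attacker's probing now returns wildly varying answers: the 0.95
# signal from the real target is drowned out by noise of magnitude ~10.
for _ in range(3):
    print(private_score(0.95))
```

Run the probing attack from the earlier sketch against these noisy scores and the ‘winner’ changes from run to run – which is exactly the point.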
Artificial Lawyer spoke to Kira’s CSO, Steve Obenski, and recently hired differential privacy expert, Sam Fletcher, about the move.
Obenski pointed out that the capability was essential for lawyers wanting to show that they are operating with the right ethical approach, as they can explain that the AI tool they used makes it impossible for a third party to exploit the machine learning model to gain insider intelligence on a deal.
Artificial Lawyer asked if this was a bit like having a combination lock with so many possible combinations that no-one can guess the right answer.
‘Yes, this is like having a lock where you have increased the size of the dial so that it goes from 1 to infinity,’ Fletcher explained.
This mattered, he said, because the way to exploit an NLP model is not ‘breaking open the model’, but rather ‘watching how it affects documents and seeing how it behaves’ to get a clue.
Aside from the technical aspect, this is a very smart marketing move that could separate Kira from the growing number of other players in this segment of the market: the company may decide to say, as Obenski stressed, that ‘Kira is the only tool like this that makes it easy for a lawyer to satisfy ethical obligations’.
As more and more legal tech companies develop NLP doc analysis tools, the battle is on to create market differentiation – not an easy thing to do when much of the base technology is seen by some as very similar. Of course, years of training and product development have gone into Kira and into the NLP models of other well-established companies’ doc analysis platforms.
Even so, saying ‘we are different – AND we help lawyers with something as critical as ethical demands’ could help commercially as much as anything else.
Alexander Hudek, CTO and Co-Founder at Kira Systems, added: ‘Confidentiality is a cornerstone for relationships between law firms and their clients. Differential Privacy is the only technique available today that can guarantee the privacy required by law firms.
‘Adding this capability to our products allows law firms to finally be able to share trained AI securely. I’m looking forward to seeing how people use this to further transform legal work.’
Not really correct. At Seal I designed this type of protection into the system from day one. If you look at the patents filed, you will see that we replace all items within the data with placeholders, such as WatchTowerContractingParties for the parties within the text [see the illustrative sketch below – Ed.]. It is called out within the following patent, US10185712B2, and further built on within other pending patents for distributed learning.
When Seal created its Market Place, a number of years back, this was a key feature that allowed customers to share the models from within the analytics system, in addition to them all being encrypted via PKI methods so that only the target system could decrypt the model.
The first release of the Seal system had this type of method, and the patent was from 2015, where it was detailed for other functions.
I applaud Kira for moving with the times; however, they are far from the first to have this.
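For readers unfamiliar with the placeholder approach Gidney describes, a minimal illustrative sketch – the anonymise function and party names here are invented, and this is not the implementation from patent US10185712B2 – might look like this:

```python
import re

def anonymise(text: str, parties: list[str]) -> str:
    """Replace named parties with neutral placeholder tokens before the
    text (or a model trained on it) leaves the source system."""
    for i, name in enumerate(parties, start=1):
        text = re.sub(re.escape(name), f"WatchTowerContractingParty{i}", text)
    return text

clause = "This Agreement is made between Acme Holdings Ltd and Beta Corp."
print(anonymise(clause, ["Acme Holdings Ltd", "Beta Corp"]))
# -> 'This Agreement is made between WatchTowerContractingParty1
#     and WatchTowerContractingParty2.'
```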