Artificial Lawyer recently caught up with Nick Brestoff, the founder of Intraspexion, a new legal tech company that seeks to predict and prevent potential litigation before it happens. Think ‘Minority Report’, but using algorithms instead of pre-cogs.
(The short video about half-way down the story is also really useful.)
Q1. You have described Intraspexion as an ‘early warning system’ that helps companies to avoid litigation. Could you explain how the system works?
A1. We want a client to be able to see a risk in time to be able to avoid a lawsuit. So we train a Deep Learning algorithm with generic lawsuits in order to be able to see the risks in their emails.
Think Shakespeare, The Tempest, Act 2, Scene 1, because ‘past is prologue’.
In stair steps, here’s how the system works:
Step 1: We train a Deep Learning algorithm by providing it with a sufficient number of examples for a specific lawsuit case “type,” such as text from employment discrimination lawsuits. As I’ll explain, there are many categories like this, but we had to start somewhere. More generally, we would take the unstructured text in each category and then train a case-specific algorithm with it.
Step 2: Then, from the training data, our software counts and ranks the words, and we have attorneys or paralegals note which of the words would be relevant to a user. In short, we create a visualisation Library which the system accesses after the emails are scored by the algorithm. I stress that visualisation is the Library’s only function. We do not use the Library for any searching or sorting, and the algorithm does not use key words or concept search or any of the technologies used in eDiscovery.
Step 3: The Legal Department installs and operates the system.
Step 4: The system accesses a copy of yesterday’s emails, indexes them with an identification number, turns the unstructured words in the emails and attachments into number strings (now we have two strings), and passes them both to the algorithm.
Step 5: The algorithm compares the emails (now as number strings) to the way it’s been trained (and to a “negative” set of unrelated text) and outputs the accuracy score, the identification number, and the email number to the user interface.
Step 6: There are only two screens in the user interface. A user is not bothered by all that I’ve just now related to you. In the first screen, a user gets some data and learns that, by the time the emails related to a specific risk category are presented, probably more than 99% of the emails have been bypassed as being unrelated. In addition, the user sees a bar chart to show the distribution of the high-scoring risky emails. Attorneys and paralegals will go to the highest scoring risks first and look for true positives and false positives. This information is useful for re-training the algorithm in a feedback loop that better reflects the company’s culture.
Step 7: The second screen shows the scores and the associated emails. The system can find the emails because we gave each one a unique index number when it was scanned. And it can highlight the relevant words by using the Library.
You can see Steps 6 and 7 in the short video.
Step 8: The user decides what to do next. Because of the index number, an attorney can find a risky email in its native state with a single click, and pass it to the company’s existing case/investigation management system. Once there, the attorney can ask, “Who else is in on this?” and access other internal databases, e.g., performance reviews. So now we’ve moved from Shakespeare to Sherlock. When a corporate attorney finishes the investigation, he or she reports to a control group executive.
Step 9: Now a decision-maker has an early warning of a risk and he or she can decide what to do about it. Hopefully, the risk has been seen in time to minimise the damage or avoid the damage altogether.
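Steps 4 through 7 above can be sketched in a few lines of Python. This is only an illustration under loud assumptions, not Intraspexion's implementation: the names `VOCAB`, `WEIGHTS`, `LIBRARY`, `encode`, `score`, `highlight`, and `review_queue` are hypothetical, the threshold of 0.5 is arbitrary, and a hand-weighted logistic scorer stands in for the trained Deep Learning model, which the interview says does not rely on key words.

```python
import math
import re

# Hypothetical vocabulary mapping words to integer ids (0 = unknown word).
VOCAB = {"fired": 1, "age": 2, "complaint": 3, "lunch": 4, "meeting": 5}
# Hypothetical hand-set weights; a stand-in for a trained deep learning model.
WEIGHTS = {1: 2.0, 2: 1.5, 3: 2.5, 4: -1.0, 5: -0.5}
BIAS = -3.0
# Hypothetical visualisation Library: used only for highlighting, never for scoring.
LIBRARY = {"fired", "age", "complaint"}


def encode(text):
    """Step 4: turn the unstructured words of an email into a string of numbers."""
    return [VOCAB.get(word, 0) for word in re.findall(r"[a-z']+", text.lower())]


def score(ids):
    """Step 5: stand-in scorer that returns a value between 0 and 1."""
    z = BIAS + sum(WEIGHTS.get(i, 0.0) for i in ids)
    return 1.0 / (1.0 + math.exp(-z))


def highlight(text):
    """Step 7: mark Library words in the original text for the reviewer."""
    def mark(match):
        word = match.group(0)
        return f"**{word}**" if word.lower() in LIBRARY else word
    return re.sub(r"[A-Za-z']+", mark, text)


def review_queue(emails, threshold=0.5):
    """Index, encode, and score emails; return the high scorers, highest first."""
    results = []
    for index, text in enumerate(emails, start=1):  # Step 4: unique index number
        s = score(encode(text))
        if s >= threshold:  # everything below the threshold is bypassed
            results.append((s, index, highlight(text)))
    return sorted(results, reverse=True)


emails = [
    "Team lunch after the meeting",
    "She filed a complaint after being fired because of her age",
]
for s, index, text in review_queue(emails):
    print(f"{s:.2f}  email #{index}: {text}")
```

The index number travels with the score, which is what lets a reviewer jump from a high score back to the original email. The bar chart in Step 6 and the re-training feedback loop are left out of this sketch.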
So what is this, really? Let me use the UK’s own Richard Susskind’s description (modified somewhat). Intraspexion puts the lawyers ahead of where the guardrail has failed at the top of the cliff. The lawyers of the future may be aspirational and point to an even higher place on the mountain, but now they have a way to steer their clients away from the bottom with the ambulances.
Corporate counsel are closest to the risks in their company’s own internal data, and they’ve been blind to them. Now they can see.
Q2. You have mentioned that you recently received a patent for the technology, for which, congratulations. When did this idea first come to you and what were the drivers that led you to create this?
A2. On December 12, 2016, the U.S. Patent and Trademark Office allowed my patent for “Using Classified Text and Deep Learning Algorithms to Identify Risk and Provide Early Warning.” I had been thinking about the idea for years. On October 25, 2012, Law Technology News (now known as Legaltech News) published my article entitled “Data Lawyers and Preventive Law,” but the notion goes back to one of my law professors, the late Louis M. Brown (1909-1996), who wrote a book entitled “Preventive Law” in 1950. I tried many ways to implement his idea. Nothing worked. Eventually, in 2014, I retired from the legal profession. I had been a litigator for 38 years.
So, one motivation for my book, my startup company, and now my patent is my experience with litigation. Litigation is an enormous pain in so many ways. There should be less of it.
I was also motivated by two giants: Professor Brown, who was teaching when I was in law school at the University of Southern California, and the UK’s own Sir Richard Susskind. Harold Brown, Professor Brown’s son and a partner in a preeminent Beverly Hills entertainment law firm, wrote the Foreword to my book and Professor Susskind endorsed it.
On occasion, I have heard them whispering to me.
There’s one more driver, education. Before I went to law school, I earned engineering degrees at UCLA and at the California Institute of Technology. With an awareness that computer science was on the cusp of enabling a way for there to be “less litigation,” I cooked up a name, mashing Intranet and “introspection” together, and started Intraspexion on August 20, 2015.
It was quite a leap into retirement.
Q3. Would this be used primarily for employment issues, where perhaps the risk signals are sometimes quite clear? Or could this be used in many other areas, such as bribery, fraud or other ‘white collar’ type crime committed inside a company?
A3. I chose employment discrimination as a starting case. (We’ll add other categories based on client demand.) It’s a high frequency category. But in the U.S. federal litigation system alone, there are over 150 categories. Some of these categories, such as Freedom of Information Act requests and cases involving prisoner rights, are not relevant to the business world. But there are categories of civil litigation matters for breach of contract, fraud, insurance, product liability, and many more, and we can train a Deep Learning algorithm for any of them in an automated way. Think of them as filters.
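One way to picture the "filters" idea is a registry with one scorer per litigation category, where every email is passed through each of them. The sketch below is an assumption-laden illustration: `make_stub_scorer`, `FILTERS`, and `apply_filters` are hypothetical names, and the stub scorers merely look for a cue string so the example runs. In the system described here, each entry would be a separately trained Deep Learning model.

```python
from typing import Callable, Dict, List

Scorer = Callable[[str], float]


def make_stub_scorer(cue: str) -> Scorer:
    """Hypothetical stand-in for a trained model: 0.9 if the cue appears, else 0.1."""
    return lambda text: 0.9 if cue in text.lower() else 0.1


# One filter per risk category; the category names come from the interview.
FILTERS: Dict[str, Scorer] = {
    "employment discrimination": make_stub_scorer("discriminat"),
    "breach of contract": make_stub_scorer("breach"),
    "product liability": make_stub_scorer("defect"),
}


def apply_filters(text: str, threshold: float = 0.5) -> List[str]:
    """Return the categories whose filter scores the text at or above the threshold."""
    return [name for name, scorer in FILTERS.items() if scorer(text) >= threshold]


print(apply_filters("The supplier is in breach and the part has a defect"))
```

Adding a category then amounts to registering one more trained scorer, without touching the others.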
Q4. The system works with ‘deep learning’. Would you classify this as an aspect of Artificial Intelligence? And also, can we draw a line between deep learning and machine learning in general, if so, how can we differentiate them?
A4. Artificial Intelligence is an overarching term, and I’ll run through the “nesting” as I understand it. The first subset is Machine Intelligence. A subset of Machine Intelligence is neural networks. Neural networks with more than three layers of nodes for computations are called deep nets. That subset became known as Deep Learning.
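Depth in a neural network is usually counted in layers of nodes. As a rough illustration, the following sketch builds a tiny fully connected network in plain Python and applies the "more than three layers, counting input and output" rule of thumb. The function names, layer sizes, random weights, and `tanh` activation are all assumptions chosen for the example. Nothing here is trained, and it is not the architecture Intraspexion uses.

```python
import math
import random


def make_network(layer_sizes, seed=0):
    """Create one random weight matrix per pair of adjacent layers."""
    rng = random.Random(seed)
    return [
        [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    ]


def forward(network, inputs):
    """Pass the inputs through every layer in turn, applying tanh at each node."""
    activations = inputs
    for weights in network:
        activations = [
            math.tanh(sum(w * a for w, a in zip(row, activations)))
            for row in weights
        ]
    return activations


def is_deep(layer_sizes):
    """Rule of thumb: more than three layers, counting input and output."""
    return len(layer_sizes) > 3


shallow = [4, 3, 1]  # input, one hidden layer, output
deep = [4, 8, 8, 1]  # input, two hidden layers, output
print(is_deep(shallow), is_deep(deep))
print(forward(make_network(deep), [0.1, 0.2, 0.3, 0.4]))
```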
Deep Learning neural nets were noticed by the AI academic community when they started winning programming competitions and playing video (and other) games better than world-class players. For more, see Jeremy Howard’s TED talk entitled, “The wonderful and terrifying implications of computers that can learn.”
Also, see any YouTube video featuring the UK’s own Demis Hassabis, the CEO of Google DeepMind.
In addition, one of my colleagues, Jagannath Rajagopal, produces the deeplearning.tv channel, which is also on YouTube. He is the creator and producer of more than 30 episodes about Deep Learning. He clarifies Deep Learning, in part by leaving the math out of it. He presents each episode (each concept) in somewhere between 3 and 6 minutes.
There’s a lot of hype surrounding Deep Learning now, but it may be more like the Internet than anything else. Andrew Ng, a Stanford University computer scientist and the Chief Scientist at Baidu, has called Deep Learning the “new electricity,” meaning that, as electricity changed every industry during the Industrial Revolution, Deep Learning holds the promise of changing every industry in our time.
Q5. If a company licensed the software, who in the company would be in charge of it? Would it be operated by the in-house legal team, or another party, or other group of employees?
A5. Our system should be installed by the Legal Department and operated in good faith under its direction and control. The reason for this focus is to invoke the attorney work-product doctrine and the attorney-client privilege in accordance with policies notifying employees that the company reserves the right to monitor its own (but only its own) computer resources. As a matter of “best practice,” employees would sign off on these policies and managers would never undermine them by conduct or advice to the contrary.
Q6. How have you handled the issue of the software scanning potentially confidential material sent from 3rd parties to the company? Do all parties have to give consent to have their emails scanned in this way?
A6. The system scans only internal emails and other communications, including (in the future) text messages, voice mails converted to text, warranty claims, and so on. So it stands to reason that company attorneys would be allowed to see even information marked “confidential.” However, Deep Learning algorithms “understand” words in the context of a specific risk category, so I would expect one of our algorithms to present an email only if it had a high score in relation to a category of risk for which it had been trained.
In this sense, it’s important to say, the algorithm is augmenting the intelligence of human users, and is not making decisions for them. No one wants to paw through tens of thousands of emails looking for a risky few. But everyone’s in favour of finding the needle in the haystack that presents a risk. Does anyone want more litigation?
Q7. Are there any clients you can mention yet?
A7. Intraspexion was in stealth mode until I announced our existence and gave my first talk about us and our team on November 7, 2016. We’re already in discussions with two NYSE-level companies, two VCs I didn’t approach, a global consultancy, a company that focuses on data-enabled contracts, and a company with whom we might form a close marketing relationship. And we’re set to go with a pilot project with one of them. I don’t have permission to say anything more than that now. I can say that Intraspexion was accepted into the Inception Program for Deep Learning startups sponsored by NVIDIA.