US legal technology and ediscovery company CS DISCO has today announced the public launch of DISCO AI, a ‘next generation’ deep learning platform that has been two years in the making.
According to the company, DISCO AI applies the latest advancements in both machine learning and cloud computing to solve complex unstructured data analysis challenges in the ediscovery space.
DISCO said that because of its native cloud technology it had the advantage of massive GPU compute-on-demand capabilities to power the latest machine learning technologies and algorithms, such as Google’s Word2Vec and a series of Convolutional Neural Networks (CNNs), to deliver higher levels of classification accuracy ‘[that are] faster than ever previously seen in the legal space’, it claimed.
Kiwi Camara, founder and CEO of DISCO, said: ‘Artificial intelligence and legal technology are two areas that are just taking off…. We will continue to bring the latest advances in technology to bear on the problems lawyers face. We’re just getting started.’
Using DISCO AI, lawyers are presented with predictions for suggested document classifications (or tags) relevant to particular aspects of a case, such as key issues, importance, confidential information, and overall case relevance within the DISCO interface.
Working in the background during the normal course of a review, DISCO AI displays Tag Predictions — a suggested tag with a score from -100 to +100, indicating the likelihood that the tag should be applied to the document by a human — in real-time.
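DISCO has not published how the score is calculated. As a rough illustration only, if the underlying model produced a probability that a tag applies, a simple linear rescaling would place it on the -100 to +100 range described above. The function name `tag_score` and the linear mapping are assumptions made for this sketch, not DISCO's method.

```python
def tag_score(probability: float) -> int:
    """Map a probability in [0, 1] to a score in [-100, +100].

    Assumption: a linear rescaling, used here only to illustrate the range.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return round(200 * probability - 100)


# An even chance sits at the midpoint; near-certainty sits near the ends.
print(tag_score(0.5))
print(tag_score(0.95))
print(tag_score(0.0))
```

Under this reading, a score near zero would mean the system has no strong view either way, while scores near the extremes would signal a confident suggestion for or against the tag.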
The company said that its platform’s ability to correctly predict the likelihood that a tag should or should not be applied to a document is consistently in the 85% to 95% range, even with as few as 50 examples and data sets as small as 2,000 documents.
Artificial Lawyer conducted a Q&A interview with DISCO to find out more about the details of the new AI ediscovery system. Please see the company’s responses below.
What kind of machine learning and/or natural language processing is DISCO AI using?
DISCO, a native cloud technology, has the advantage of massive GPU compute-on-demand to power the latest machine learning technologies and algorithms. Underneath the hood, DISCO parlays words into meaning using a revolutionary tool called Word2Vec. Word2Vec was developed at Google to convert words to numbers in a way that encapsulates the immediate context around the word.
Because words with similar meanings often occur in similar contexts, Word2Vec is able to extract the meaning of words to an astonishing degree. DISCO’s Convolutional Neural Network runs on top of Word2Vec in order to pinpoint key building blocks used to develop tag recommendations. The combination of this software with on-demand cloud infrastructure enables radical new advances in AI for legal technology.
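The idea that words in similar contexts have similar meanings can be shown with a much simpler stand-in than Word2Vec itself. The sketch below is not Word2Vec and not DISCO's code: it just counts each word's neighbours in a made-up four-sentence corpus and compares the resulting count vectors with cosine similarity. The corpus, the function names and the window size are all assumptions for illustration.

```python
from collections import Counter
from math import sqrt


def context_vectors(sentences, window=1):
    """Build a neighbour-count vector for every word in the corpus."""
    vectors = {}
    for sentence in sentences:
        words = sentence.lower().split()
        for i, word in enumerate(words):
            lo, hi = max(0, i - window), min(len(words), i + window + 1)
            context = words[lo:i] + words[i + 1:hi]
            vectors.setdefault(word, Counter()).update(context)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical toy corpus: 'plaintiff' and 'defendant' appear in the same slots.
corpus = [
    "the plaintiff signed the contract",
    "the defendant signed the contract",
    "the plaintiff breached the agreement",
    "the defendant breached the agreement",
]
vecs = context_vectors(corpus)
similar = cosine(vecs["plaintiff"], vecs["defendant"])
less_similar = cosine(vecs["plaintiff"], vecs["contract"])
print(similar > less_similar)
```

Here 'plaintiff' and 'defendant' end up with identical neighbour counts, so they score as more alike than 'plaintiff' and 'contract'. Word2Vec learns dense vectors with a neural network rather than raw counts, but it exploits the same signal.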
You have described this as an AI system. What elements of AI technology are you using, for example, machine learning and NLP?
DISCO AI brings artificial intelligence to the task of finding evidence through Deep Learning, chaining together multiple machine learning models to create a modular and powerful artificial intelligence that can be applied to problems in the real world. DISCO’s automation advancements powered by the cloud rely on deep learning technologies, such as Convolutional Neural Networks (CNNs). CNNs have achieved human-level performance on many tasks in image and text processing.
Whereas many predictive coding technologies of the past simply counted the number of times each word appeared in each document, CNNs read the document word by word, an ability that is groundbreaking for predictive technologies like ediscovery document review.
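The difference between counting words and reading them in order can be seen in a small example. The sketch below is not a CNN; it only contrasts a word-count representation with the overlapping word windows that a convolutional filter would slide across. The two sentences and the helper names are made up for illustration.

```python
from collections import Counter


def bag_of_words(text):
    """Count how often each word appears, ignoring order."""
    return Counter(text.lower().split())


def windows(text, width=2):
    """Return the overlapping word windows, in reading order."""
    words = text.lower().split()
    return [tuple(words[i:i + width]) for i in range(len(words) - width + 1)]


a = "the buyer sued the seller"
b = "the seller sued the buyer"

# Same words, same counts: a count-based model cannot tell these apart.
same_counts = bag_of_words(a) == bag_of_words(b)
# The windows differ, so a model that reads in order can distinguish them.
same_windows = windows(a) == windows(b)
print(same_counts, same_windows)
```

The two sentences have opposite meanings but identical word counts. Only a representation that preserves word order, such as the windows above, keeps the information about who sued whom.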
Fast computation of CNNs is achieved by an army of graphics processing units (GPUs). Unlike central processing units (CPUs), GPUs are special computational devices capable of completing thousands of mathematical operations simultaneously, billions of times per second. Software libraries, such as NVIDIA’s cuDNN™, enable lightning-quick evaluation of data using CNNs.
If you’re using NLP, please can you say how this is being used in the analysis of legal documents?
Like all predictive coding systems, DISCO AI relies on iterative learning. First, the reviewer codes some documents, then the system makes predictions. The reviewer responds by coding the predicted documents, and so on until the reviewer is satisfied that the document review is complete.
DISCO’s system differs from other predictive coding solutions. Whereas other systems foist a fixed process on the reviewer, we designed DISCO with the goal of helping the reviewer rather than imposing an immutable review structure. In other systems, the machine decides which documents will be reviewed, a process that is called active learning. For DISCO, the machine makes predictions but it is ultimately the reviewer who decides what to review. We call this setting interactive learning, and it flows from DISCO’s core belief that technology should augment and not dominate the lawyer’s workflow.
Central to interactive learning is the idea of a separation between the reviewer and the machine. The reviewer chooses documents to review based on his or her best judgement. The machine observes the reviewer’s decisions and updates its recommendations. Our name for this paradigm is Continuous Asynchronous Learning. It is asynchronous because the user’s workflow is uninterrupted, and the machine is continuously learning from the reviewer to improve its predictions.
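The separation described above, where the reviewer keeps working while the machine learns in the background, can be sketched with a queue and a worker thread. This is a toy under stated assumptions, not DISCO's implementation: the class `BackgroundTagModel`, its method names and the word-vote scoring are all hypothetical, and a real system would use a far richer model.

```python
import queue
import threading
from collections import Counter, defaultdict


class BackgroundTagModel:
    """Toy model that learns from reviewer decisions in a worker thread."""

    def __init__(self):
        self._decisions = queue.Queue()
        self._lock = threading.Lock()
        # word -> Counter({True: times tagged, False: times not tagged})
        self._word_votes = defaultdict(Counter)
        threading.Thread(target=self._learn_forever, daemon=True).start()

    def record_decision(self, text, tagged):
        """Called from the reviewer's side; returns without waiting for learning."""
        self._decisions.put((text, tagged))

    def _learn_forever(self):
        """Worker loop: fold each decision into the word votes as it arrives."""
        while True:
            text, tagged = self._decisions.get()
            with self._lock:
                for word in set(text.lower().split()):
                    self._word_votes[word][tagged] += 1
            self._decisions.task_done()

    def wait_until_caught_up(self):
        """Block until every queued decision has been learned (used in the demo)."""
        self._decisions.join()

    def predict(self, text):
        """Return a value in [-1, 1]; positive suggests the tag applies."""
        yes = no = 0
        with self._lock:
            for word in set(text.lower().split()):
                if word in self._word_votes:
                    yes += self._word_votes[word][True]
                    no += self._word_votes[word][False]
        return (yes - no) / (yes + no) if yes + no else 0.0


model = BackgroundTagModel()
# The reviewer picks documents in whatever order they like.
model.record_decision("invoice for disputed shipment", True)
model.record_decision("office lunch menu", False)
model.wait_until_caught_up()

relevant = model.predict("shipment invoice attached")
irrelevant = model.predict("lunch menu for friday")
print(relevant > 0, irrelevant < 0)
```

The reviewer's calls to `record_decision` never wait on the model, which is the asynchronous part; the worker thread updates its votes after every decision, which is the continuous part. The `wait_until_caught_up` call exists only so the demonstration prints a stable result.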
When did the company start and when did you decide to move into using AI technology?
DISCO was initially developed at a litigation boutique in Houston in 2012. It was born out of the firm’s frustration with conventional ediscovery tools that were slow and difficult for lawyers to use. Instead of being forced to adapt our work methods to technology, we wanted to invent technology that works the way lawyers work.
DISCO was the result, and today we are the fastest-growing ediscovery solution in North America. After more than two years of development and a successful limited availability program, the legal technology company announced the general availability of DISCO AI as of June 1st.
There seem to be a lot of different views at the moment about the different levels of sophistication of ediscovery systems. How does the new DISCO AI system improve upon what else is in the market?
Conventional approaches to TAR have been generally underutilised due to the use of old technology stacks, unintuitive user experiences and rigid workflows. DISCO AI does not require traditional seed sets or extensive administrative setup. Using DISCO AI, lawyers are presented with predictions for suggested document classifications (or tags) relevant to particular aspects of a case, such as key issues, importance, confidential information, and overall case relevance within the DISCO interface.