Internet Evidence Is the Next Big Thing In eDiscovery

By Kevin Gibson, CEO & Chairman, Hanzo

Hear that rumbling? It’s the next sea change in eDiscovery, and this one’s a tsunami, headed right for you.

We’re talking about the discovery of internet evidence. If you haven’t been looking online for discoverable information in your litigation and investigation matters, you literally have no idea how much data is out there. Moreover, as society changes, the boundaries between work and personal content are blurring. Collaboration apps break down barriers and include links that lead outside the company, making the web a vast treasure trove of responsive information.

Do you want to find it, or do you want to find out about it when your opponent sends it to you? Nobody likes a surprise, unless you’re the one strategically using it to advance your case.

Yet identifying discoverable data on the internet has hit two stumbling blocks. First, it’s hard to find responsive information in the mind-boggling mass of unstructured data that is the internet. It’s like looking for a needle in a haystack. (Some have even decided that there must not be any discoverable evidence online because the idea of looking for it seems overwhelming.)

Second, common collection technology does not capture dynamic data in a usable, persuasive format.

We’ve been here before. Ediscovery itself was a seismic shift in the world of paper discovery. Managing email in ediscovery seemed insurmountable before it became commonplace. While the discovery of online data may initially appear impossible, the rules governing discovery, regardless of the data source, have remained quite consistent, and relevance to the issue is key. Today, ediscovery is simply inclusive of a wider variety and volume of electronically stored information (ESI).

A Universe of Evidence Is on the Web

There’s a massive, growing, changing, and largely untapped universe of data on the internet. Google, Amazon, Microsoft, and Facebook store at least 1,200 petabytes of data, and that estimate is already out of date, as the internet grows daily. Nor are humans alone in adding data to the web; Internet of Things devices are generating online data too.

With as much time as people spend on the internet today, you can be certain there’s web data relevant to ediscovery matters. We’re online constantly. We’re communicating, socialising, working, buying everything under the sun, and marketing and selling our own products and services.

We’re reading the news, following local and national politics, investing, choosing consumer goods, researching answers to questions, obtaining medical advice, and seeking like-minded communities. We’re finding our next job, house, pet, or life partner. We’re leaving a massive trail of breadcrumbs about who we are, how we are, where we are, what we think, and what we value.

While not all online information is relevant to litigation, much of it could be, if lawyers can find it, understand its meaning through context, and capture it. Forward-thinking attorneys have already started looking to the internet for evidence. Perhaps your litigation opponent has posted a diatribe on a blog or ranted in a chat group, giving you all the information you need to win your case.

Here’s the thing: if you’re not looking for evidence online, you won’t know what you’re missing, until an opponent drops it in your lap.

Levelling the Playing Field: Identifying Online Discoverable Evidence

The first problem with web evidence, though, is that there’s too much of it. How do you home in on useful information about a particular person, company, or issue in the sea of online data? It’s simply not a fair fight, asking humans to keep up with the volumes of automatically generated data. Humans can’t review data fast enough to keep up with the web’s growth. Even if they could, it’s not cost-effective.

But while reviewing the internet for discoverable data may seem a Sisyphean task, the truth is that it’s entirely possible; it’s just that neither you nor any other human can do it.

How is that data accessible, then? AI-powered investigation tools that leverage data science and machine learning to automate searching at scale can make the impossible possible. AI investigators are tireless fact-finders that can sift through massive amounts of online data and simplify the process of identifying relevant discoverable data associated with a person of interest. They can find everything that person has said and done online: their entire breadcrumb trail of potential evidence.

Capturing and Using Dynamic Web Evidence

The second challenge is collecting web evidence in a format that’s both persuasive and defensible in court. Historically, ediscovery professionals have settled for poor reproductions, using PDFs or image files to capture the complex world of the web. Worse yet, some have printed out websites and presented them in court. That’s not compelling or persuasive evidence, and it can’t be authenticated.

Fortunately, native-format web archives satisfy both of those requirements. Web data differs from every previous form of communication in that it’s dynamic. Unlike email or Word documents, the internet has never been a paper-based medium; online data cannot be reduced to paper without losing critical information. The very nature of the internet is its interconnectedness and ceaseless changeability. The linkages between pages have inherent meaning, providing context, detail, and richness. Without a ‘live’ version of the site, you lose that context and its associated meaning, leaving factfinders wondering what they’re missing.

Factfinders aren’t the only ones wondering: courts often reject these substandard forms of evidence, finding that they can’t be reliably authenticated.

But when you capture website data in its native format, you preserve all of that dynamic content and metadata, from videos and interactive page features to links, including all of the content on linked pages. In essence, you can navigate a native-format preserved site as if you were on the live internet. That’s how you can show a jury who said what while, at the same time, demonstrating to the court that the evidence you’re offering is an authentic copy of the original site.

Ediscovery with online evidence isn’t just the wave of the future: it’s here today. The tools to find the evidentiary needles in the haystack are available, and by capturing and preserving data in its native format, we have the opportunity to raise the standards of compelling and admissible evidence.

Are you ready?


About the Author: Kevin Gibson, CEO & Chairman, Hanzo, Inc.

Kevin foundationally believes that the legal system is the cornerstone of democracy and that technology should be a powerful force in the defense and support of that legal system.

As CEO at Hanzo, Kevin leverages his vast experience gained over a 25-year career in high tech, where he has been responsible for the formation, growth, and sale of a number of technology businesses, to advance the mission of technology in service and support of a high-functioning legal system.

Kevin also serves as a director of VO2 Media, Jiva Technology and Business Fitness Pty Ltd. He was formerly the Chairman of Ebooks Corporation and held senior executive positions with SAP and Ariba. Kevin received a Master’s degree in Electronic Engineering and an MBA from the University of Western Australia.

(P.S. The World Wide Web is 30 years old. Here’s another piece to consider, produced by Google: ‘The World Wide Web: The Invention That Connected The World’.)