
File Analysis versus eDiscovery (A comparative analysis)

January 25, 2019 | By Nybble

File Analysis and eDiscovery overlap conceptually in their use cases and applicable solutions. The two categories may appear similar at first, but they are in fact quite different.

It is important to establish the boundaries of each category to ensure that solutions and use cases aren’t paired inappropriately, which would result in unnecessary risks, costs and delays. Ultimately, the goal is to map project requirements to solution functionality, giving all stakeholders the most likely path to success.


eDiscovery

eDiscovery is now a well-defined process, required when one party must produce electronically stored information (ESI) to the other party as part of a legal matter. Duke Law has outlined this process in the widely adopted Electronic Discovery Reference Model (EDRM), which consists of the following stages:

Information Governance – Getting your electronic house in order to mitigate risk & expenses should eDiscovery become an issue, from initial creation of ESI through its final disposition
Identification – Locating potential sources of ESI & determining its scope, breadth & depth
Preservation – Ensuring that ESI is protected against inappropriate alteration or destruction
Collection – Gathering ESI for further use in the eDiscovery process (processing, review, etc.)
Processing – Reducing the volume of ESI and converting it, if necessary, to forms more suitable for review & analysis
Review – Evaluating ESI for relevance & privilege
Analysis – Evaluating ESI for content & context, including key patterns, topics, people & discussion
Production – Delivering ESI to others in appropriate forms & using appropriate delivery mechanisms
Presentation – Displaying ESI before audiences (at depositions, hearings, trials, etc.), to elicit further information, validate existing facts or positions, or persuade an audience

eDiscovery solutions have matured over the last 20 years to assist in these tasks with increasing sophistication and granularity. They can perform keyword searches, redact lines of text, preserve documents and present relevant documents so that lawyers can review them page by page, line by line and word by word. They have transformed discovery by cutting the hours lawyers spend searching, making it practical to review volumes of documents that would have been unmanageable with paper records decades ago.
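To make this line-by-line granularity concrete, here is a minimal Python sketch of a keyword search that reports hits per line, plus a simple redaction step. It is an illustration only: the function names (`find_hits`, `redact`), the sample documents and the `[REDACTED]` mask are all hypothetical, and a real eDiscovery platform does far more than this (preservation, privilege tagging, audit trails and so on).

```python
import re

def _whole_word_pattern(terms):
    # Match any of the terms as whole words, ignoring case.
    joined = "|".join(re.escape(term) for term in terms)
    return re.compile(r"\b(?:" + joined + r")\b", re.IGNORECASE)

def find_hits(documents, keywords):
    """Return (document name, line number, line) for every line containing a keyword."""
    pattern = _whole_word_pattern(keywords)
    hits = []
    for name, text in documents.items():
        for line_number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((name, line_number, line))
    return hits

def redact(text, terms, mask="[REDACTED]"):
    """Replace every whole-word occurrence of the given terms with a mask."""
    return _whole_word_pattern(terms).sub(mask, text)

# Hypothetical sample data, invented for this sketch.
documents = {
    "memo.txt": "Quarterly update\nThe merger with Acme closes in May.\nNo further news.",
    "email.txt": "Hi Pat,\nPlease keep the Acme merger confidential.",
}

hits = find_hits(documents, ["merger"])
redacted = redact(documents["email.txt"], ["Pat"])

for name, line_number, line in hits:
    print(f"{name}:{line_number}: {line}")
print(redacted)
```

Every hit is tied to a specific document and line, which is the level of detail a reviewer needs when deciding whether a passage is responsive or privileged.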

The defining element of eDiscovery is that it is always done with a production of ESI in mind and almost always as part of an adversarial process. As the solutions and the applicable case law have evolved, eDiscovery has become more defined and mature than ever.

File Analysis

File Analysis is defined by Gartner as a solution that “scans, maps and manages unstructured data stores… thus enabling data and analytics leaders to make better data management decisions for unstructured data, which in turn reduces risk and lowers costs associated with data.”
Gartner further notes that available solutions:

“[A]nalyze, index, search, track and report on file metadata and file content. This enables organizations to take action on current and legacy files and objects according to what was identified. FA provides detailed metadata and contextual information to enable improved information governance and organizational efficiency for unstructured data management.”

File Analysis solutions can analyze:

File shares
Email databases
Content collaboration platforms (CCPs)
Records management, enterprise content management (ECM) systems and Microsoft SharePoint
Log files
Unstructured Internet of Things (IoT)-generated objects
Data archives

A typical File Analysis project indexes terabytes, and sometimes petabytes, of data. File Analysis solutions are built to scale across an organization’s entire data infrastructure, with the end goal being an intuitive information governance process that reduces the organization’s information risk.
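As a rough illustration of what reporting on file metadata can look like at the file level, the Python sketch below walks a directory tree, records basic metadata for each file without reading its content, and summarizes counts and sizes by file extension. The function names (`inventory`, `summarize`) and the throwaway sample files are assumptions made for this example; a commercial File Analysis product would also cover content classification, ownership, permissions and sources well beyond a local file system.

```python
import os
import tempfile
from collections import defaultdict
from datetime import datetime, timezone

def inventory(root):
    """Walk `root` and yield one metadata record per file (content is not read)."""
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            info = os.stat(path)
            yield {
                "path": path,
                "extension": os.path.splitext(name)[1].lower() or "(none)",
                "size_bytes": info.st_size,
                "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
            }

def summarize(records):
    """Aggregate file count and total bytes per extension."""
    summary = defaultdict(lambda: {"files": 0, "bytes": 0})
    for record in records:
        bucket = summary[record["extension"]]
        bucket["files"] += 1
        bucket["bytes"] += record["size_bytes"]
    return dict(summary)

# Demo on a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "hr"))
    samples = [("a.txt", "hello"), ("hr/b.txt", "world!"), ("hr/c.csv", "x,y")]
    for rel_path, content in samples:
        with open(os.path.join(root, rel_path), "w") as handle:
            handle.write(content)
    summary = summarize(inventory(root))

for extension, totals in sorted(summary.items()):
    print(extension, totals)
```

The output is an aggregate view (how much data of each type exists and where), not a document-by-document review, which is the kind of picture that informs governance decisions such as what to archive, migrate or delete.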

Putting the pieces together

After reviewing the characteristics of both solutions, it is clear that eDiscovery and File Analysis differ in three main ways:

1. Scale
2. Granularity
3. Use Cases

First, for an eDiscovery solution to be cost effective, the volume of data selected for analysis must be manageable. This usually falls in the gigabyte range, with a large project being a single terabyte of data. File Analysis solutions, on the other hand, are built to analyze terabytes or petabytes of data in a cost-effective way. Because of this difference in scale, users need a firm picture of what they want to accomplish before they deploy a particular type of solution.

Second, because eDiscovery is used by legal professionals to produce ESI to another party, eDiscovery projects are highly granular. After relevant ESI has been identified, an eDiscovery solution helps a user scan documents word by word to find responsive information. A File Analysis solution is less granular because of the scope of the projects it is used for. Although it is designed to scan unstructured data stores down to the file level at high speed, it would be nearly impossible for a user to go through the volume of data it indexes at that word-by-word level of detail.

Finally, the use cases of these solutions are different, and a savvy consumer must identify the end goal before deploying either an eDiscovery solution or a File Analysis solution. Otherwise, the scale, cost and granularity will not meet the organization’s expectations.

Nybble is ActiveNav’s data-loving mascot, here to answer all your data privacy and governance questions.