File Analysis and eDiscovery possess some conceptual overlap in terms of use cases and applicable solutions. Both categories may appear similar at first but are in fact quite different.
It is important to establish the boundaries of each category so that solutions and use cases aren’t paired inappropriately, which would result in unnecessary risks, costs and delays. Ultimately, the goal is to map project requirements to solution functionality in a way that gives all stakeholders the most likely path to success.
eDiscovery
eDiscovery is now a well-defined process required when one party must produce electronically stored information (ESI) to the other party as part of a legal matter. Duke Law has outlined this process in the widely adopted Electronic Discovery Reference Model (EDRM).
eDiscovery solutions have matured over the last 20 years to assist in these tasks with increasing sophistication and granularity. They can perform keyword searches, redact lines of text, preserve documents, and give lawyers a format for reviewing relevant documents page by page, line by line, and word by word. They have transformed the discovery world by cutting the hours lawyers spend searching, and by making it practical to review volumes of documents that would have been unmanageable with paper records decades ago.
The most distinctive element of eDiscovery is that it is always done with a production of ESI in mind and almost always as part of an adversarial process. As the solutions and the applicable case law have evolved, eDiscovery has become more defined and mature than ever.
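To make the word-level granularity concrete, here is a minimal Python sketch of a keyword search combined with redaction. It is a toy illustration only: the function name `search_and_redact`, the `[REDACTED]` mask and the sample lines are assumptions made for this example and do not represent how any particular eDiscovery product works.

```python
import re

def search_and_redact(lines, keyword, mask="[REDACTED]"):
    """Return (hit_line_numbers, redacted_lines) for a case-insensitive keyword.

    Hypothetical helper for illustration; real review platforms add
    indexing, audit trails and privilege handling on top of this idea.
    """
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    hits = []
    redacted = []
    for number, line in enumerate(lines, start=1):
        if pattern.search(line):
            hits.append(number)  # remember which lines were responsive
        redacted.append(pattern.sub(mask, line))  # mask every occurrence
    return hits, redacted

sample = [
    "The merger closes Friday.",
    "Lunch at noon.",
    "MERGER terms attached.",
]
hits, redacted = search_and_redact(sample, "merger")
print(hits)
print(redacted)
```

The point of the sketch is the unit of work: every line of every document is inspected, which is only affordable when the document set has already been narrowed down.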
File Analysis
File Analysis is defined by Gartner as a solution that “scans, maps and manages unstructured data stores… thus enabling data and analytics leaders to make better data management decisions for unstructured data, which in turn reduces risk and lowers costs associated with data.”
Gartner further notes that available solutions:
“[A]nalyze, index, search, track and report on file metadata and file content. This enables organizations to take action on current and legacy files and objects according to what was identified. FA provides detailed metadata and contextual information to enable improved information governance and organizational efficiency for unstructured data management.”
As Gartner’s description indicates, File Analysis solutions are able to analyze both file metadata and file content.
A typical File Analysis project usually indexes terabytes and sometimes petabytes of data. File Analysis solutions are built to scale across an organization’s entire data infrastructure, with the end goal being an intuitive information governance process that reduces the organization’s information risk.
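By contrast, a File Analysis pass works at the level of files and their metadata rather than individual words. The following Python sketch, offered only as an illustration, walks a directory tree and totals file counts and sizes per extension. The function name `summarize_by_extension` and the report shape are assumptions for this example; commercial File Analysis solutions cover far more (content indexing, ownership, access rights, age) and run against much larger stores.

```python
import os
from collections import defaultdict

def summarize_by_extension(root):
    """Walk `root` and total file count and bytes per file extension.

    Hypothetical helper for illustration; it reads metadata only and
    never opens file contents.
    """
    summary = defaultdict(lambda: {"files": 0, "bytes": 0})
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.stat(path).st_size
            except OSError:
                continue  # skip files that vanish or cannot be read
            # Normalize the extension so ".TXT" and ".txt" are grouped together.
            ext = os.path.splitext(name)[1].lower() or "(none)"
            summary[ext]["files"] += 1
            summary[ext]["bytes"] += size
    return dict(summary)

if __name__ == "__main__":
    for ext, stats in sorted(summarize_by_extension(".").items()):
        print(ext, stats["files"], stats["bytes"])
```

A report like this says nothing about what any single document contains, but it is cheap enough to run across an entire file share, which is the trade-off between scale and granularity discussed in this article.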
Putting the pieces together
After reviewing the characteristics of both solutions, it is clear that eDiscovery and File Analysis differ in three main respects:
1. Scale
2. Granularity
3. Use Cases
First, for an eDiscovery solution to be cost effective, the volume of data it analyzes must be kept manageable. That volume usually falls in the gigabyte range, with a single terabyte counting as a large project. File Analysis solutions, on the other hand, are built to analyze terabytes or petabytes of data in a cost-effective way. Because of this difference in scale, users need a firm picture of what they want to accomplish before they deploy a particular type of solution.
Second, because eDiscovery is used by legal professionals to produce ESI to another party, eDiscovery projects are highly granular. After relevant ESI has been identified, an eDiscovery solution helps a user scan documents word by word in order to find responsive information. A File Analysis solution is less granular because of the scope of the projects it is used for. Although it is designed to scan unstructured data stores down to the file level at high speed, it would be nearly impossible for a user to review the amount of data it indexes word by word.
Finally, the use cases of these solutions are different, and a savvy consumer must identify the end goal before deploying either an eDiscovery solution or a File Analysis solution. Otherwise, the scale, cost and granularity will not meet the organization’s expectations.