
Electronic discovery (also ediscovery or e-discovery) refers to discovery in legal proceedings such as litigation, government investigations, or Freedom of Information Act requests, where the information sought is in electronic format (often referred to as electronically stored information or ESI). Electronic discovery is subject to rules of civil procedure and agreed-upon processes, often involving review for privilege and relevance before data are turned over to the requesting party. Electronic information is considered different from paper information because of its intangible form, volume, transience and persistence. Electronic information is usually accompanied by metadata that is not found in paper documents and that can play an important part as evidence (e.g. the date and time a document was written could be useful in a copyright case). The preservation of metadata from electronic documents creates special challenges to prevent spoliation.

In the United States, at the federal level, electronic discovery is governed by common law, case law and specific statutes, but primarily by the Federal Rules of Civil Procedure (FRCP), including amendments effective December 1, 2006, and December 1, 2015. In addition, state law and regulatory agencies increasingly also address issues relating to electronic discovery. In England and Wales, Part 31 of the Civil Procedure Rules and Practice Direction 31B on Disclosure of Electronic Documents apply. Other jurisdictions around the world also have rules relating to electronic discovery.


Stages of process

The Electronic Discovery Reference Model (EDRM) is a widely used diagram that represents a conceptual view of the stages involved in the ediscovery process.


Identification

The identification phase is when potentially responsive documents are identified for further analysis and review. In the United States, in Zubulake v. UBS Warburg, Hon. Shira Scheindlin ruled that failure to issue a written legal hold notice whenever litigation is reasonably anticipated will be deemed grossly negligent. This holding brought additional focus to the concepts of legal holds, eDiscovery, and electronic preservation. Custodians who are in possession of potentially relevant information or documents are identified. To ensure a complete identification of data sources, data mapping techniques are often employed. Since the scope of data can be overwhelming or uncertain in this phase, attempts are made to reasonably reduce the overall scope, such as by limiting the identification of documents to a certain date range or set of custodians.
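The scope reduction described above can be illustrated with a minimal Python sketch. The `Document` record, its field names, and the sample data are assumptions made for illustration only; real identification tools work from much richer metadata.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Document:
    # Hypothetical record: only the fields needed for scoping.
    doc_id: str
    custodian: str
    created: date


def in_scope(doc, custodians, start, end):
    """True if the document belongs to a named custodian and was
    created within the inclusive date range."""
    return doc.custodian in custodians and start <= doc.created <= end


documents = [
    Document("DOC-1", "alice", date(2020, 3, 1)),
    Document("DOC-2", "bob", date(2020, 6, 15)),
    Document("DOC-3", "alice", date(2018, 1, 9)),
]

# Keep only alice's documents from calendar year 2020.
scoped = [
    d for d in documents
    if in_scope(d, {"alice"}, date(2020, 1, 1), date(2020, 12, 31))
]
```

Here only `DOC-1` survives: `DOC-2` belongs to a custodian outside the agreed set and `DOC-3` falls outside the date range.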


Preservation

A duty to preserve begins upon the reasonable anticipation of litigation. During preservation, data identified as potentially relevant is placed on legal hold, which is meant to ensure that the data cannot be destroyed. Care is taken to ensure this process is defensible, while the end goal is to reduce the possibility of data spoliation or destruction. Failure to preserve can lead to sanctions. Even if a court finds that the failure to preserve was merely negligent, it can order the party at fault to pay fines if the lost data puts the defense "at an undue disadvantage in establishing their defense."
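One way to picture how a legal hold interacts with routine deletion is the following sketch. It assumes a hypothetical retention job that consults a hold registry before purging anything; the class, function, and record fields are illustrative names, not the interface of any particular product.

```python
class LegalHoldRegistry:
    """Tracks which custodians are under an active legal hold."""

    def __init__(self):
        self._held = set()

    def place_hold(self, custodian):
        self._held.add(custodian)

    def release_hold(self, custodian):
        self._held.discard(custodian)

    def is_held(self, custodian):
        return custodian in self._held


def purge_expired(records, registry):
    """Split records into (kept, deleted).

    A record is deleted only if its retention period has expired AND
    its custodian is not under a legal hold.
    """
    kept, deleted = [], []
    for record in records:
        if record["expired"] and not registry.is_held(record["custodian"]):
            deleted.append(record)
        else:
            kept.append(record)
    return kept, deleted
```

The design choice worth noting is that the hold check lives inside the deletion routine itself, so a technician running the normal retention job cannot bypass it by accident.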


Collection

Once documents have been preserved, collection can begin. Collection is the transfer of data from a company to its legal counsel, who will determine the relevance and disposition of the data. Some companies that deal with frequent litigation have software in place to quickly place legal holds on certain custodians when an event (such as a legal notice) is triggered and to begin the collection process immediately. Other companies may need to call in a digital forensics expert to prevent the spoliation of data. The size and scale of this collection is determined by the identification phase.
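A collection step that guards against accidental alteration might look like the sketch below. It is a simplified illustration, assuming files are copied between two directories and verified with SHA-256; the function names and the manifest layout are assumptions, and real forensic collection involves considerably more (access controls, chain-of-custody records, and so on).

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def collect(source_dir, dest_dir):
    """Copy every file under source_dir into dest_dir.

    Returns a manifest mapping relative path -> SHA-256 of the copy,
    and raises if a copy does not hash identically to its source.
    """
    source_dir, dest_dir = Path(source_dir), Path(dest_dir)
    manifest = {}
    for src in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        rel = src.relative_to(source_dir)
        dst = dest_dir / rel
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also attempts to preserve timestamps
        before, after = sha256_of(src), sha256_of(dst)
        if before != after:
            raise IOError(f"hash mismatch for {rel}")
        manifest[rel.as_posix()] = after
    return manifest


def demo():
    """Collect one sample file between two temporary directories."""
    with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
        Path(src, "memo.txt").write_bytes(b"quarterly figures")
        return collect(src, dst)
```

The manifest gives counsel a record against which later copies can be checked.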


Processing

During the processing phase, native files are prepared to be loaded into a document review platform. Often, this phase also involves the extraction of text and metadata from the native files. Various data culling techniques are employed during this phase, such as deduplication and de-NISTing. Sometimes native files will be converted to a petrified, paper-like format (such as PDF or TIFF) at this stage, to allow for easier redaction and Bates labeling. Modern processing tools can also employ advanced analytic tools to help document review attorneys more accurately identify potentially relevant documents.
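Both culling techniques named above are usually hash-based, and a minimal sketch makes the idea concrete. De-NISTing removes files whose hashes appear on a list of known system and application files (in practice drawn from the NIST National Software Reference Library), and deduplication drops exact copies. The `known_system_hashes` set below is a stand-in for such a list, and the choice of SHA-256 is an assumption for illustration.

```python
import hashlib


def digest(content: bytes) -> str:
    return hashlib.sha256(content).hexdigest()


def cull(files, known_system_hashes):
    """Return the names of files that survive culling.

    files: iterable of (name, content_bytes) pairs.
    known_system_hashes: hashes of known system files (stand-in list).
    """
    seen = set()
    kept = []
    for name, content in files:
        file_hash = digest(content)
        if file_hash in known_system_hashes:
            continue  # de-NISTing: known system file, not user content
        if file_hash in seen:
            continue  # deduplication: exact copy of an earlier file
        seen.add(file_hash)
        kept.append(name)
    return kept
```

For example, given a memo, a byte-identical copy of it, and a file whose hash is on the known list, only the first memo is kept for review.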


Review

During the review phase, documents are reviewed for responsiveness to discovery requests and for privilege. Different document review platforms can assist in many tasks related to this process, including the rapid identification of potentially relevant documents, and the culling of documents according to various criteria (such as keyword, date range, etc.). Most review tools also make it easy for large groups of document review attorneys to work on cases, featuring collaborative tools and batches to speed up the review process and eliminate work duplication.
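The batching idea can be sketched in a few lines. This assumes documents are identified by simple string IDs and that batches are handed out round-robin; real review platforms typically use more elaborate assignment rules, so treat the function names and strategy as illustrative only.

```python
def make_batches(doc_ids, batch_size):
    """Split a list of document IDs into consecutive batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [doc_ids[i:i + batch_size] for i in range(0, len(doc_ids), batch_size)]


def assign(batches, reviewers):
    """Hand batches to reviewers round-robin.

    Each document sits in exactly one batch, so no two reviewers
    are given the same document.
    """
    assignment = {reviewer: [] for reviewer in reviewers}
    for index, batch in enumerate(batches):
        assignment[reviewers[index % len(reviewers)]].append(batch)
    return assignment
```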


Production

Documents are turned over to opposing counsel, based on agreed-upon specifications. Often this production is accompanied by a load file, which is used to load documents into a document review platform. Documents can be produced either as native files, or in a petrified format (such as PDF or TIFF), alongside metadata.
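To give a feel for what a load file contains, the sketch below writes a simple comma-delimited one with sequential Bates numbers. The column names (`BEGBATES`, `ENDBATES`, `CUSTODIAN`, `PATH`), the `ABC` prefix, and the seven-digit width are assumptions chosen for illustration; actual layouts are set by the production specifications the parties agree on.

```python
import csv
import io


def bates_number(prefix, sequence, width=7):
    """Format a Bates number such as ABC0000001."""
    return f"{prefix}{sequence:0{width}d}"


def build_load_file(documents, prefix="ABC"):
    """Return load-file text with one row per produced document.

    documents: list of dicts with 'path', 'pages' and 'custodian' keys.
    Each page consumes one Bates number, so a document's range spans
    as many numbers as it has pages.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["BEGBATES", "ENDBATES", "CUSTODIAN", "PATH"])
    next_page = 1
    for doc in documents:
        begin = bates_number(prefix, next_page)
        end = bates_number(prefix, next_page + doc["pages"] - 1)
        writer.writerow([begin, end, doc["custodian"], doc["path"]])
        next_page += doc["pages"]
    return out.getvalue()
```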


Types of electronically stored information

Any data that is stored in an electronic form may be subject to production under common eDiscovery rules. This type of data has historically included email and office documents, but can also include photos, video, databases, and other file types. Also included in ediscovery is "raw data", which forensic investigators can review for hidden evidence. The original file format is known as the "native" format. Litigators may review material from ediscovery in one of several formats: printed paper, "native file", or a petrified, paper-like format, such as PDF files or TIFF images. Modern document review platforms accommodate the use of native files, and allow for them to be converted to TIFF and Bates-stamped for use in court.


Electronic messages

In 2006, the U.S. Supreme Court's amendments to the Federal Rules of Civil Procedure created a category for electronic records that, for the first time, explicitly named emails and instant message chats as likely records to be archived and produced when relevant.

One type of preservation problem arose during the Zubulake v. UBS Warburg LLC lawsuit. Throughout the case, the plaintiff claimed that the evidence needed to prove the case existed in emails stored on UBS' own computer systems. Because the emails requested were either never found or destroyed, the court found that it was more likely that they existed than not. The court found that while the corporation's counsel directed that all potential discovery evidence, including emails, be preserved, the staff that the directive applied to did not follow through. This resulted in significant sanctions against UBS.

Some archiving systems apply a unique code to each archived message or chat to establish authenticity. These systems prevent original messages from being altered or deleted, and from being accessed by unauthorized persons.

The formalized changes to the Federal Rules of Civil Procedure in December 2006 and in 2007 effectively forced civil litigants into a compliance mode with respect to their proper retention and management of electronically stored information (ESI). Improper management of ESI can result in a finding of spoliation of evidence and the imposition of one or more sanctions, including adverse inference jury instructions, summary judgment, monetary fines, and other sanctions. In some cases, such as Qualcomm v. Broadcom, attorneys can be brought before the bar.
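The "unique code" that some archiving systems attach to each message can be pictured as a keyed digest. The sketch below uses HMAC-SHA256 from the Python standard library as one plausible choice; the source does not say which scheme any given archive uses, so the function names and the scheme itself are assumptions.

```python
import hashlib
import hmac


def seal(message: bytes, key: bytes) -> str:
    """Compute an authenticity code for a message at archive time."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(message: bytes, key: bytes, code: str) -> bool:
    """Check that a retrieved message still matches its archived code."""
    return hmac.compare_digest(seal(message, key), code)
```

If even one byte of the stored message changes, `verify` returns `False`, which is what lets the archive demonstrate that a produced message is the one originally captured.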


Databases and other structured data

Structured data typically resides in databases or datasets. It is organized in tables with columns and rows along with defined data types. The most common are relational database management systems (RDBMS) capable of handling large volumes of data, such as Oracle, IBM Db2, Microsoft SQL Server, Sybase, and Teradata. The structured data domain also includes spreadsheets (only those whose data is organized in database-like tables), desktop databases like FileMaker Pro and Microsoft Access, structured flat files, XML files, data marts, data warehouses, etc.


Audio

Voicemail is often discoverable under electronic discovery rules. Employers may have a duty to retain voicemail if there is an anticipation of litigation involving that employee. Data from voice assistants like Amazon Alexa and Siri have been used in criminal cases.


Reporting formats

Although petrifying documents to static image formats (TIFF and JPEG) was the standard document review method for almost two decades, native format review has increased in popularity as a method for document review since around 2004. Because it requires the review of documents in their original file formats, applications and toolkits capable of opening multiple file formats have also become popular. This is also true in the ECM (Enterprise Content Management) storage markets, which are converging quickly with ESI technologies.

Petrification involves the conversion of native files into an image format that does not require use of the native applications. This is useful in the redaction of privileged or sensitive information, since redaction tools for images are traditionally more mature, and easier for non-technical people to apply to uniform image types. Efforts by inadequately trained personnel to redact similarly petrified PDF files have resulted in the removal of redaction layers and the exposure of redacted information, such as social security numbers and other private information.

Traditionally, electronic discovery vendors had been contracted to convert native files into TIFF images (for example 10 images for a 10-page Microsoft Word document) with a load file for use in image-based discovery review database applications. Increasingly, database review applications have embedded native file viewers with TIFF capabilities. Having both native and image file capabilities can either increase or decrease the total storage required, since there may be multiple formats and files associated with each individual native file. Deployment, storage, and best practices are becoming especially critical and necessary to maintain cost-effective strategies.

Structured data are most often produced in delimited text format. When the number of tables subject to discovery is large or relationships between the tables are essential, the data are produced in native database format or as a database backup file.
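Producing a table as delimited text can be sketched with Python's built-in `sqlite3` and `csv` modules. SQLite, the `trades` table, and the pipe delimiter are stand-ins chosen for illustration; the same pattern applies to other database systems through their own drivers.

```python
import csv
import io
import sqlite3


def export_table(connection, table, delimiter="|"):
    """Return a table's contents as delimited text with a header row."""
    # Table names cannot be bound as SQL parameters, so accept only
    # plain identifiers to avoid injecting arbitrary SQL.
    if not table.isidentifier():
        raise ValueError("unexpected table name")
    cursor = connection.execute(f"SELECT * FROM {table}")
    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)
    return out.getvalue()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trades (id INTEGER, trader TEXT, amount REAL)")
conn.execute("INSERT INTO trades VALUES (1, 'alice', 250.0)")
exported = export_table(conn, "trades")
```

A flat export like this loses the relationships between tables, which is why the native database format or a backup file is preferred when those relationships matter.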


Common issues

A number of different people may be involved in an electronic discovery project: lawyers for both parties, forensic specialists, IT managers, and records managers, amongst others. Forensic examination often uses specialized terminology (for example, "image" refers to the acquisition of digital media), which can lead to confusion. While attorneys involved in case litigation try their best to understand the companies and organizations they represent, they may fail to understand the policies and practices that are in place in the company's IT department. As a result, some data may be destroyed after a legal hold has been issued, by unknowing technicians performing their regular duties. To combat this trend, many companies are deploying software which properly preserves data across the network, preventing inadvertent data spoliation. Given the complexities of modern litigation and the wide variety of information systems on the market, electronic discovery often requires IT professionals from both the attorney's office (or vendor) and the parties to the litigation to communicate directly to address technology incompatibilities and agree on production formats. Failure to get expert advice from knowledgeable personnel often leads to additional time and unforeseen costs in acquiring new technology or adapting existing technologies to accommodate the collected data.


Emerging trends


Alternative collection methods

Currently the two main approaches for identifying responsive material on custodian machines are:

(1) where physical access to the organization's network is possible, agents are installed on each custodian machine which push large amounts of data for indexing across the network to one or more servers that have to be attached to the network; or
(2) where it is impossible or impractical to attend the physical location of the custodian system, storage devices are attached to custodian machines (or company servers) and then each collection instance is manually deployed.

In relation to the first approach there are several issues:

* In a typical collection process large volumes of data are transmitted across the network for indexing, and this impacts normal business operations
* The indexing process is not 100% reliable in finding responsive material
* IT administrators are generally unhappy with the installation of agents on custodian machines
* The number of concurrent custodian machines that can be processed is severely limited due to the network bandwidth required

New technology is able to address problems created by the first approach by running an application entirely in memory on each custodian machine and only pushing responsive data across the network. This process has been patented and embodied in a tool that has been the subject of a conference paper.

In relation to the second approach, despite self-collection being a hot topic in eDiscovery, concerns are being addressed by limiting the involvement of the custodian to simply plugging in a device and running an application to create an encrypted container of responsive documents.
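The bandwidth argument for filtering on the custodian machine can be shown with a toy sketch. It is not the patented tool mentioned above: it simply assumes that responsiveness is decided by a case-insensitive keyword match and compares how many bytes would cross the network with and without local filtering. All names are illustrative.

```python
def responsive_only(local_documents, search_terms):
    """Yield only documents containing at least one search term.

    Non-matching documents never leave the custodian machine.
    local_documents: iterable of (name, text) pairs.
    """
    terms = [term.lower() for term in search_terms]
    for name, text in local_documents:
        lowered = text.lower()
        if any(term in lowered for term in terms):
            yield name, text


def bytes_to_transfer(documents):
    """Total UTF-8 size of the document texts that would be sent."""
    return sum(len(text.encode("utf-8")) for _, text in documents)
```

With two local documents of which one mentions a search term, only that one is transmitted, so the transfer volume shrinks in proportion to how selective the search is.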


Technology-assisted review

Technology-assisted review (TAR), also known as computer-assisted review or predictive coding, involves the application of supervised machine learning or rule-based approaches to infer the relevance (or responsiveness, privilege, or other categories of interest) of ESI. Technology-assisted review has evolved rapidly since its inception circa 2005. Following research studies indicating its effectiveness, TAR was first recognized by a U.S. court in 2012, by an Irish court in 2015, and by a U.K. court in 2016. More recently, a U.S. court declared that it is "black letter law that where the producing party wants to utilize TAR for document review, courts will permit it" (Rio Tinto v. Vale, S.D.N.Y., 2015).
In a subsequent matter, the same court stated,
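The supervised-learning idea behind TAR can be sketched with a small naive Bayes text classifier written in plain Python. This is a teaching example under strong assumptions (whitespace tokenization, four hand-labeled training documents, Laplace smoothing); it is not how any particular TAR product works, and production systems use far larger training sets and validation protocols.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {label: Counter() for label in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocabulary = set()
        for counts in self.word_counts.values():
            self.vocabulary.update(counts)
        return self

    def score(self, text, label):
        """Log of prior times smoothed word likelihoods for one class."""
        total_docs = sum(self.doc_counts.values())
        log_prob = math.log(self.doc_counts[label] / total_docs)
        counts = self.word_counts[label]
        denominator = sum(counts.values()) + len(self.vocabulary)
        for word in tokenize(text):
            log_prob += math.log((counts[word] + 1) / denominator)
        return log_prob

    def predict(self, text):
        return max(self.classes, key=lambda label: self.score(text, label))


# Reviewer-coded seed set (invented examples).
model = NaiveBayes().fit(
    [
        "merger pricing agreement draft",
        "pricing call with competitor",
        "office party friday",
        "lunch order friday",
    ],
    ["responsive", "responsive", "not_responsive", "not_responsive"],
)
```

After training on the seed set, `model.predict` can rank or label unreviewed documents, and reviewers' corrections can be fed back in as additional training examples.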


Convergence with information governance

Anecdotal evidence for this emerging trend points to the business value of information governance (IG), defined by Gartner as "the specification of decision rights and an accountability framework to encourage desirable behavior in the valuation, creation, storage, use, archival, and deletion of information. It includes the processes, roles, standards, and metrics that ensure the effective and efficient use of information in enabling an organization to achieve its goals."

As compared to eDiscovery, information governance as a discipline is rather new. Yet there is traction for convergence. eDiscovery, as a multibillion-dollar industry, is rapidly evolving, ready to embrace optimized solutions that strengthen cybersecurity (for cloud computing). Since the early 2000s eDiscovery practitioners have developed skills and techniques that can be applied to information governance. Organizations can apply the lessons learned from eDiscovery to accelerate their path forward to a sophisticated information governance framework.

The Information Governance Reference Model (IGRM) illustrates the relationship between key stakeholders and the Information Lifecycle and highlights the transparency required to enable effective governance. Notably, the updated IGRM v3.0 emphasizes that Privacy & Security Officers are essential stakeholders. This topic is addressed in an article entitled "Better Ediscovery: Unified Governance and the IGRM", published by the American Bar Association.


See also

* Data mining
* Data retention
* Discovery (law)
* Early case assessment
* Electronically stored information (Federal Rules of Civil Procedure)
* File hosting service
* Forensic search
* Information governance
* Legal governance, risk management, and compliance
* Machine learning
* Telecommunications data retention



External links


Federal Judicial Center: Materials on Electronic Discovery

American Bar Association article on eDiscovery

eDiscovery with AI (Artificial intelligence)