United States: How to Manage the Risks and Costs Associated with Searching ESI (Tip of the Month)

Originally published May 2010

Keywords: Electronically stored information, ESI, discovery, search terms, information retrieval, concept searches


A large corporation is served with a complaint accusing it of participating in a price-fixing conspiracy. Multiple discovery requests follow seeking electronically stored information (ESI). In-house counsel speaks with the company's IT department to estimate the scale of the review and is disturbed by the sheer number of files to be reviewed. How can a meaningful review be accomplished in a reasonable time frame in a cost-effective way?

The Risks Associated with Using Search Terms

Search terms have become the new panacea for many electronic discovery vendors, which trumpet the use of technology to reduce both the cost of identifying relevant documents and the number of documents that need to be reviewed. But some critics contend that keyword searches may fail to identify potentially relevant information. As Magistrate Judge Paul Grimm observed in his 2008 Victor Stanley decision, "while it is universally acknowledged that keyword searches are useful tools for search and retrieval of ESI, ... there is a growing body of literature that highlights the risks associated with conducting an unreliable or inadequate keyword search or relying exclusively on such searches."

While no computer-assisted information retrieval (IR) system yet developed can simply scan through a mountain of data and infallibly identify exactly those documents that an attorney would deem relevant, many IR options exist beyond traditional keyword searches that can reduce the risks associated with using search terms. For example, established search algorithms, commonly called "Boolean" or "set-theoretic" models, make binary decisions regarding the responsiveness of documents based on various simple tests, such as the presence of keywords within a certain distance of one another, linked by AND, OR, and NOT. A document is judged as either responsive or not, with no middle ground. Other search approaches, often broadly gathered under the rubric of "concept searches," move beyond this paradigm in a variety of ways.
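To make the Boolean model concrete, the following is a minimal Python sketch of a binary responsiveness test. The query, the five-word proximity window, the function names and the three sample documents are all hypothetical and chosen only for illustration; commercial review tools use their own syntax and indexing.

```python
import re

def tokenize(text):
    """Lowercase a document and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def within(tokens, a, b, distance):
    """True if terms a and b occur within `distance` words of each other."""
    pos_a = [i for i, t in enumerate(tokens) if t == a]
    pos_b = [i for i, t in enumerate(tokens) if t == b]
    return any(abs(i - j) <= distance for i in pos_a for j in pos_b)

def is_responsive(text):
    """Hypothetical query: (price within 5 words of agreement) AND NOT newsletter."""
    tokens = tokenize(text)
    return within(tokens, "price", "agreement", 5) and "newsletter" not in tokens

# Hypothetical documents, invented for this example.
docs = {
    "memo": "We reached an agreement with them on the price floor.",
    "news": "Newsletter: the price agreement story made headlines.",
    "chat": "Let's keep our numbers aligned with theirs next quarter.",
}

# Each document receives a yes-or-no answer, with no middle ground.
results = {name: is_responsive(text) for name, text in docs.items()}
```

In this sketch only the memo is tagged responsive. The chat message, which a reviewer might well consider relevant to a price-fixing claim, is excluded because it contains neither keyword. That is the risk described above.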

Mitigating the Risks Associated with Using Search Terms

There are several ways to mitigate the risks of using traditional keyword searching. While not all of these options are appropriate in every case, consideration should be given to the following:

  • Even within the Boolean paradigm, search tools can take advantage of "fuzzy" text comparisons and auxiliary structures such as thesauri to expand upon the queries generated by attorneys and thereby deal with the misspellings, optical character recognition (OCR) errors, synonymy (multiple words for a single concept) and polysemy (multiple meanings for a single word) that plague keyword searches. The keyword search also can be enhanced by interviewing key custodians about the language that they use in correspondence, and by consulting with electronic discovery experts who are trained in keyword search development.
  • "Algebraic" IR methods generate a measure of how similar each document is to what a query ideally seeks, thereby enabling the tool to rank documents by relevance rather than simply assigning them to two undifferentiated camps: responsive and non-responsive. 
  • "Probabilistic" or "Bayesian" search algorithms make use of more user input than simply the initial query in order to estimate a particular document's relevance.
  • Tools such as domain name restrictions, discussion threading, topic clustering, people analytics, and analytics to identify duplicate copies of files can facilitate effective review.
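The first two options above can be sketched together in Python. This is an illustration under stated assumptions, not a description of any vendor's product: the thesaurus entries, the similarity cutoff and the sample documents are hypothetical, and TF-IDF weighting with cosine similarity is used as one common example of an "algebraic" model.

```python
import difflib
import math
import re
from collections import Counter

# Hypothetical thesaurus; in practice it would be built with custodians and experts.
THESAURUS = {"price": ["pricing", "rate"], "agreement": ["deal", "arrangement"]}

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def expand(query_terms, vocabulary):
    """Add synonyms and near-spellings (such as OCR errors) found in the corpus."""
    expanded = set(query_terms)
    for term in query_terms:
        expanded.update(THESAURUS.get(term, []))
        expanded.update(difflib.get_close_matches(term, vocabulary, n=5, cutoff=0.85))
    return expanded

def tfidf_vector(tokens, idf):
    """Weight each term by its count times its inverse document frequency."""
    counts = Counter(tokens)
    return {t: counts[t] * idf.get(t, 0.0) for t in counts}

def cosine(u, v):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0

# Hypothetical documents; "agreemnt" stands in for an OCR error.
docs = {
    "memo": "Our pricing deal with them holds through June.",
    "scan": "The price agreemnt was signed on Friday.",
    "lunch": "Lunch is at noon on Friday.",
}
tokenized = {name: tokenize(text) for name, text in docs.items()}
vocabulary = sorted({t for tokens in tokenized.values() for t in tokens})

# Smoothed inverse document frequency: rarer terms carry more weight.
idf = {
    t: math.log((1 + len(docs)) / (1 + sum(t in toks for toks in tokenized.values()))) + 1
    for t in vocabulary
}

query = expand(["price", "agreement"], vocabulary)
query_vec = tfidf_vector(list(query), idf)

# Documents are ordered by similarity to the query rather than split into two camps.
ranking = sorted(
    docs,
    key=lambda name: cosine(query_vec, tfidf_vector(tokenized[name], idf)),
    reverse=True,
)
```

Here a strict search for "price" AND "agreement" would return nothing, because neither document contains both words as typed. With expansion, the scanned document and the memo both receive a nonzero score and rise to the top of the review queue, while the lunch notice falls to the bottom.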

The advantages of choosing a search system that is suited to your particular problem can be significant. By ranking documents rather than simply tagging all responsive documents as equally good, algebraic and probabilistic algorithms facilitate faster identification of key documents. By taking into account reviewers' tagging decisions and not simply the initial query, these systems reduce the need for human reviewers, who charge by the hour, to make the same judgment call again and again. (Systems can be calibrated to permit some redundancy, to ensure that tagging mistakes do not poison an entire search.) Seemingly abstruse discussions of IR algorithm improvements quickly resolve themselves into bottom-line impacts on the cost and time required to respond to discovery requests.
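How reviewers' tagging decisions might feed back into the ranking can be sketched with a small naive Bayes classifier. Naive Bayes is used here only as a simple, well-known example of a probabilistic model; the class name, the tagged documents and the unreviewed documents are hypothetical, and production tools are considerably more sophisticated.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

class FeedbackRanker:
    """Multinomial naive Bayes over reviewer tags, with add-one smoothing."""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}

    def tag(self, text, responsive):
        """Record one reviewer decision."""
        self.word_counts[responsive].update(tokenize(text))
        self.doc_counts[responsive] += 1

    def log_odds(self, text):
        """Log-odds that a document is responsive; higher means review sooner."""
        vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        score = math.log((self.doc_counts[True] + 1) / (self.doc_counts[False] + 1))
        for label, sign in ((True, 1), (False, -1)):
            total = sum(self.word_counts[label].values()) + len(vocab)
            for token in tokenize(text):
                if token in vocab:  # words never seen in a tagged document are ignored
                    score += sign * math.log((self.word_counts[label][token] + 1) / total)
        return score

# Hypothetical reviewer decisions.
ranker = FeedbackRanker()
ranker.tag("Keep our numbers aligned with theirs next quarter.", True)
ranker.tag("Match their increase so nobody undercuts.", True)
ranker.tag("Lunch order for the quarter-end party.", False)
ranker.tag("Parking passes renew next month.", False)

# Hypothetical documents still awaiting review.
unreviewed = {
    "a": "They agreed nobody undercuts; stay aligned.",
    "b": "Party parking is behind the building.",
}
queue = sorted(unreviewed, key=lambda name: ranker.log_odds(unreviewed[name]), reverse=True)
```

In this sketch document "a" moves ahead of document "b" in the review queue even though it contains none of the obvious keywords, because it shares vocabulary with documents that reviewers have already tagged as responsive.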

The Risks Associated with Using Concept Searching

If the use of concept searching can reduce risks and costs associated with retrieving relevant documents, why are these tools not already more popular among lawyers? There are several remaining risks:

  • There is not yet much case law certifying non-Boolean search methods as acceptable. This is changing: in a 2007 opinion, for example, Judge Facciola of the District Court for the District of Columbia noted that "recent scholarship ... argues that concept searching, as opposed to keyword searching, is more efficient and more likely to produce the most comprehensive results." Still, the use of concept searching remains largely untested in the courts, and opposing lawyers may balk at its use. On the other hand, successfully challenging the thoughtful use of Bayesian concept clustering is much more difficult and complicated than simply pointing to omitted search terms in a Boolean search string.
  • Boolean search tools are ubiquitous and fungible. By contrast, software packages supporting concept searching are less well known. Each package comes with its own idiosyncratic user interface (no universally agreed syntax here). And there are significant differences between mathematical concept searching and thesaurus-based concept searching. Thus, choosing a quality vendor can be a bigger challenge.
  • Boolean searches generate output that lawyers understand, or at least think they do. The output of a keyword search is a list of documents that match the search query and, more importantly, a list of documents that do not. When asked if all responsive documents have been produced, an attorney who trusts keyword searches implicitly will answer, without reservation, "yes." The ranked output from an algebraic or probabilistic search provides no bright lines and, thus, requires more nuanced communication with the court. But this difference merely makes explicit the uncertainties that are already present in Boolean searches and that the binary output obscures: user-defined queries are far from perfect, and the statement that no document responsive to the search query has been withheld is a far cry from certification that no document responsive to the document request has been withheld. Nontraditional search algorithms do not create this uncomfortable truth; they just bring it to the fore.

Regardless of the information retrieval methodology selected, documentation of which model was chosen and how it was implemented is an important tool to facilitate defense of the chosen process.

So, which search method is right for you? That can vary based on the types and volume of documents searched, the time frame and budget permitted, your aversion to risk, and your organization's comfort with technology, among other factors. But if the universe to be searched is large and costs are likely to be scrutinized, consideration should be given to making use of concept search technology, especially as prices for such technology have fallen dramatically.

Copyright 2010. Mayer Brown LLP, Mayer Brown International LLP, Mayer Brown JSM and/or Tauil & Chequer Advogados, a Brazilian law partnership with which Mayer Brown is associated. All rights reserved.

Mayer Brown is a global legal services organization comprising legal practices that are separate entities (the Mayer Brown Practices). The Mayer Brown Practices are: Mayer Brown LLP, a limited liability partnership established in the United States; Mayer Brown International LLP, a limited liability partnership incorporated in England and Wales; Mayer Brown JSM, a Hong Kong partnership, and its associated entities in Asia; and Tauil & Chequer Advogados, a Brazilian law partnership with which Mayer Brown is associated. "Mayer Brown" and the Mayer Brown logo are the trademarks of the Mayer Brown Practices in their respective jurisdictions.

This Mayer Brown article provides information and comments on legal issues and developments of interest. The foregoing is not a comprehensive treatment of the subject matter covered and is not intended to provide legal advice. Readers should seek specific legal advice before taking any action with respect to the matters discussed herein.
