Search Term Selection: Avoiding the Pitfalls
With less than 30 percent of all information ever appearing as ink on paper, the “paper trail” often turns out to be a “bitstream.” The sheer volume of data held by organizations makes it clear that electronically stored information plays an essential part in litigation today. Once the information has been preserved, what’s next? Well, it would make no sense for anyone to read through all of upper management’s e-mails or review all the documents stored on an organization’s network. The solution? Applying search terms to the electronically stored information to identify responsive files and documents.
Successful searches of electronic data must produce information that is useful not only in what it tells you but also in a volume that can be reviewed. The most efficient way to achieve this is by constructing a list of terms that can be used to search through digital evidence to identify the most relevant documents for review.
Selecting search terms may seem easy enough: pick terms that describe what we are looking for and search whatever electronic documents we’ve recovered. But careful selection is critical unless you want to review responsive yet irrelevant documents. Here are some elements to take into account when thinking about searching electronically stored information:
- Determine what is to be searched: emails, documents, deleted files? Careful determination will reduce the number of hits to review and allow you to focus on what matters. Keep in mind that a focused search may provide you with focused results but may also prevent you from finding critical elements if the scope is too narrow.
- Be careful of generic terms: they will likely produce a large volume of irrelevant documents that must still be reviewed. The term “confidential” may be critical to the review, but the organization may include an automatically generated disclaimer at the bottom of all its e-mails that contains the sentence, “The content of this message is CONFIDENTIAL.” Every one of those e-mails would then be flagged.
- Be mindful of language: what may be targeted in English may be generic in French. Also, think of building your list of search terms in English and French (and any other language you think appropriate).
- Short words may produce a tall amount of work: short words, such as abbreviations, might produce thousands of search hits. These terms might appear in random text patterns such as those found in remnants of deleted documents or in binary system files on the computer system.
- Be wary of “embedded” words: short words may be contained in others. For example, if we’re searching for the word “car” as part of a scheme involving the use of company or rental cars, the term would flag documents containing “North Carolina,” “South Carolina,” “carriage,” “carries,” “Carnegie Hall,” and thousands more.
- Corporate culture: organizations make up their own language. Organizations have words derived from internal acronyms or inside language that only employees might use to describe elements specific to the organization.
- Some words may mean nothing to you, but they mean something to the computer. For example, if you are searching for any documents relating to the “Atlantic” region of Canada, you should keep in mind that any system file containing a reference to the “Atlantic” time zone will be identified, which could mean a lot of useless files to review.
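Two of the pitfalls above, embedded words and boilerplate disclaimers, can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the sample documents, the function names and the idea of stripping a known disclaimer before searching are all hypothetical, and real review platforms apply their own matching rules. It uses whole-word, case-insensitive matching and counts how many documents each term hits, which is one simple way to spot a term that is too generic before anyone starts reviewing.

```python
import re
from collections import Counter

# Assumed disclaimer text, borrowed from the "confidential" example above.
DISCLAIMER = "The content of this message is CONFIDENTIAL."


def build_pattern(term):
    """Compile a case-insensitive, whole-word pattern for one search term."""
    return re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)


def count_hits(documents, terms):
    """Count how many documents each term matches, ignoring the disclaimer."""
    patterns = {term: build_pattern(term) for term in terms}
    hits = Counter()
    for text in documents:
        # Remove the automatically generated disclaimer before searching.
        body = text.replace(DISCLAIMER, "")
        for term, pattern in patterns.items():
            if pattern.search(body):
                hits[term] += 1
    return hits


# Hypothetical sample documents, for illustration only.
documents = [
    "Please book a rental car for Friday. " + DISCLAIMER,
    "Our North Carolina office carries the new stock.",
    "This pricing sheet is confidential and must not be forwarded.",
]

hits = count_hits(documents, ["car", "confidential"])
for term, count in hits.items():
    print(term, count)
```

Whole-word matching has a cost of its own: the pattern for “car” no longer matches “cars,” so plural and variant forms would need to be listed as separate terms. That trade-off is worth discussing when the list is drawn up.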
In closing, the keyword selection process should be a joint effort by those involved in the case. This ensures that adequate terms are selected and that they meet the objectives of all involved.