
The Joy of Wildcards (And Boolean Operators)

Quicklaw and Westlaw Canada use “*”. CanLII uses “!”. I’m referring to the symbol that these databases use as a “wildcard”, that is, the symbol used to represent one or more characters in a string when carrying out a search. Conversely, when it comes to the symbol used to truncate a word, Quicklaw and Westlaw Canada use “!”, but CanLII uses “*”. (Google does not allow users to truncate search terms at all, although it does use “*” as a wildcard in phrase searches.)

Not only does the symbol used for the wildcard vary among online services, but some services use different symbols depending on whether the wildcard replaces multiple characters or only a single character. For example, CCH Online uses “?” as a single-character wildcard (e.g. “mari?uana”) but “*” as a multiple-character wildcard (e.g. “securit*” would retrieve “security”, “securities” and so forth).
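To make the single-versus-multiple distinction concrete, here is a minimal Python sketch that turns a term written with “?” and “*” into a regular expression. It is only an illustration: the function name `wildcard_to_regex` is hypothetical, and treating “*” as "zero or more word characters" is my assumption rather than a documented rule of CCH Online or any other service.

```python
import re

def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Convert a search term into a compiled, case-insensitive regex.

    Assumptions for this sketch (not taken from any vendor's documentation):
    '?' stands for exactly one word character,
    '*' stands for zero or more word characters.
    """
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(r"\w")       # exactly one character
        elif ch == "*":
            parts.append(r"\w*")      # any run of characters, possibly empty
        else:
            parts.append(re.escape(ch))  # literal character
    # \b anchors keep the match to whole words
    return re.compile(r"\b" + "".join(parts) + r"\b", re.IGNORECASE)

text = "Securities law treats marijuana and marihuana alike; security matters."
print(wildcard_to_regex("mari?uana").findall(text))  # ['marijuana', 'marihuana']
print(wildcard_to_regex("securit*").findall(text))   # ['Securities', 'security']
```

The single-character form picks up both spellings of the same word, while the multiple-character form picks up every ending that follows the stem.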

By default, most search engines will perform an absolute search: they will return any documents that include the precise words entered. Quicklaw and Westlaw Canada will also include the plurals of the search terms in their search. More sophisticated search engines such as Google and CanLII will apply their own fuzzy logic to match words or phrases that are similar to the ones entered. Wildcards are therefore very useful when searching databases like Quicklaw and Westlaw Canada, as they allow users to find records that contain variants of a particular search term.

Where did the term “wildcard” (or “wild card”) come from? Its first recorded use (with the meaning “a playing card that has its rank chosen by the player holding it”) was in 1927, according to the Oxford English Dictionary. The term was subsequently adopted in computing; in a number of computer operating systems the asterisk is used as the wildcard symbol for multiple characters, with the question mark being used to indicate a single character.

So what about the symbols chosen to represent the Boolean search terms (AND, OR, NOT)? The ampersand is fairly consistently used as a replacement for the Boolean term AND; CCH, Quicklaw and Westlaw Canada all use it. The vertical bar or pipe (“|”) is a pretty common alternative to OR; CCH, DisclosureNet and Google all use it. (“|” is used for OR in a number of computer programming languages.) There is more variety in the symbolic replacements for NOT; CanLII uses a dash (“-”), CCH uses the caret (“^”) and Quicklaw and Westlaw Canada use the percentage sign (“%”). Generally though, try not to use “NOT” in a search as you can end up inadvertently excluding relevant search results.
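For readers who think in code, the three Boolean operators behave like set operations. The sketch below is a simplified illustration in Python, not how any of these databases is implemented; the three documents and the helper `containing` are invented for the example.

```python
import re

# Three hypothetical documents, invented purely for this illustration.
DOCS = {
    1: "Contract for the sale of land",
    2: "Sale of land after the lease expired",
    3: "Contract of employment",
}

def containing(word: str) -> set[int]:
    """Return the ids of the documents that contain the exact word."""
    return {
        doc_id
        for doc_id, text in DOCS.items()
        if word.lower() in re.findall(r"\w+", text.lower())
    }

contract_and_sale = containing("contract") & containing("sale")  # AND = intersection
contract_or_sale = containing("contract") | containing("sale")   # OR  = union
sale_not_lease = containing("sale") - containing("lease")        # NOT = difference

print(contract_and_sale)  # {1}
print(contract_or_sale)   # {1, 2, 3}
print(sale_not_lease)     # {1}
```

Note what happens in the last search: document 2 is about a sale of land, but it is dropped simply because it mentions a lease in passing. That is the kind of inadvertent exclusion that makes NOT risky.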

As an aside, the term “Boolean” comes from the inventor of Boolean logic, George Boole, who was a nineteenth-century English mathematician.

The inconsistent use of symbols means that users have to be careful when running the same search on different platforms. Using “*” instead of “!” (or vice versa) can lead to a narrower search than was intended. It is therefore important, when training users on a new database, to make sure they are aware of any quirks in the database’s search syntax.
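One way to guard against this when re-running a search elsewhere is to translate the truncation symbol explicitly. The following Python sketch assumes only the truncation symbols mentioned at the start of this column (“!” for Quicklaw and Westlaw Canada, “*” for CanLII); the function name `retarget_truncation` is hypothetical, and a real query would need the other operators translated as well.

```python
# Truncation symbols as stated in this column; check each service's own
# help pages before relying on them.
TRUNCATION = {
    "Quicklaw": "!",
    "Westlaw Canada": "!",
    "CanLII": "*",
}

def retarget_truncation(term: str, source: str, target: str) -> str:
    """Swap a trailing truncation symbol from one service's syntax to another's."""
    src, dst = TRUNCATION[source], TRUNCATION[target]
    if term.endswith(src):
        return term[: -len(src)] + dst
    return term  # no truncation symbol, so nothing to change

print(retarget_truncation("securit!", "Quicklaw", "CanLII"))        # securit*
print(retarget_truncation("securit*", "CanLII", "Westlaw Canada"))  # securit!
```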

For ease of use, I’ve included a quick reference table below.

 

| | CanLII | CCH | Google | Maritime Law Book | Quicklaw | Westlaw Canada |
| --- | --- | --- | --- | --- | --- | --- |
| AND | AND (default) | AND, & (default) | (default) | AND | AND, & | AND, & |
| OR | OR | OR, \| | OR, \| | OR | OR | OR (default) |
| NOT | NOT, - | NOT, ^ | - | NOT | AND NOT, % | BUT NOT, % |
| Phrase Search | “…” | “…” | “…” | | “…” (default) | “…” |
| Truncation | * (but note that CanLII searches for all variations of a word automatically) | * | (not used) | * | ! | ! |
| Wildcard | * (multiple character wildcard) | ? (single character wildcard); * (multiple character wildcard) | (used for whole words, but not individual letters) | | * (single character wildcard) | * (single character wildcard) |
| Proximity [1] | /n | “ ” @n | AROUND(n) | /n/ | /n | /n |
| Precedes [1] | | “ ” /n | | /0,+n/ | +n | +n |

[1] n is used to indicate how close the two terms should be to one another, e.g. “contract /5 sale” would find everything that had the words “contract” AND “sale” within 5 words of each other.
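As a rough illustration of what a proximity or "precedes" search checks, here is a short Python sketch. It is a simplification under my own assumptions: the function `near` is hypothetical, words are counted by position, and each service may count intervening words or handle punctuation differently.

```python
import re

def near(text: str, first: str, second: str, n: int, ordered: bool = False) -> bool:
    """Return True if `first` and `second` occur within n words of each other.

    With ordered=True, `first` must also come before `second`,
    which mimics a "precedes" style operator.
    """
    words = re.findall(r"\w+", text.lower())
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    for i in first_positions:
        for j in second_positions:
            gap = j - i  # positive when `second` follows `first`
            if ordered and 0 < gap <= n:
                return True
            if not ordered and 0 < abs(gap) <= n:
                return True
    return False

sentence = "The contract for the sale of goods was signed."
print(near(sentence, "contract", "sale", 5))                # True
print(near(sentence, "sale", "contract", 5, ordered=True))  # False
```

The second call fails because “sale” comes after “contract” in the sentence, so the ordered test is not satisfied even though the two words are close together.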


Comments

  1. Susannah, I have a suggestion and comment. First, the comment.

    The key to understanding search operators in English is to keep in mind it is a syntactical language. The sentence

    Dick and Jane ran up the hill

    is a classic English sentence: subject verb object, in that order. This is what the English reader and writer expects to see – and anything which breaks up that order takes away from the meaning – so sentences should not be more than 25 words without good reason, and verbs, subjects and objects should not be separated if possible. So, proximity searching often works better than simple Boolean because it assumes that the closer the words, the more relevant the meaning. If you take my drift.

    This won’t work for inflected languages like German or Latin, nor for Gaelic for that matter, but that’s another story.

    The suggestion is to include Maritime Law Book in your table (the same publisher who supplies the case comments to SLAW) – they have excellent databases structured around a very good taxonomy, and including them would make the table much more useful.

    Neil Campbell
    Faculty of Law
    University of Victoria

  2. Thank you!