CanLII Now Offering Sorting by Number of Cites
On Friday CanLII announced a new search results sorting option:
December 7, 2007
News Release No 2007-06
Dear users,
I am happy to announce a set of new features that CanLII is now offering to you in order to help you deal even more efficiently with search results.
In your search results, you are now able to sort cases based on the number of times a case has been cited. You can do so by clicking on the “The most cited” link in the “Sort” menu of your search results page. By choosing “Sort by The most cited” the search engine will display the most frequently cited cases first.
Each result indicates the number of times the case has been cited. You will notice that the number of times a case has been cited is also an active link. By following that link, you will access a list of the citing cases.
Thank you for using CanLII!
Ivan Mokanov
for CanLII
There seems to be something unusual about the “number of cites” function. For example, there are 52 references (51 excluding the case itself) to the SCC’s Resurfice v Hanke 2007 SCC 7. However, when one sorts by “most cited” one gets “cited by 39 cases”.
That can’t be right, even by Intel’s version of mathematics.
I wonder how they are generating number of cites. Is it just that particular format of citation, for example (the neutral citation)?
How did you pull up the 52/51 cases?
Actually, when I search the cite “2007 SCC 7” in full text, I pull up 42, so there are at least a couple it seems to be missing. I’ll ask them how it is generated…
search on resurf* /s hanke in the main text box and restrict your results to decisions from and after Feb 8/07. Anything else misses cases. Here’s the link.
And the full link.
http://canlii.org/eliisa/search.do?language=en&searchTitle=Search+all+CanLII+Databases&sortOrder=relevance&searchPage=eliisa%2FmainPageSearch.vm&text=resurf*+%2Fs+hanke&id=&startDate=2007-02-08&endDate=2007-12-31&legislation=legislation&caselaw=courts&boardTribunal=tribunals
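For anyone who wants to see what that link actually asks for, its query string can be unpacked with Python's standard library. This is only a reading aid: the parameter names are simply whatever appears in the URL above, and nothing here assumes how CanLII treats them on the server side.

```python
from urllib.parse import urlsplit, parse_qs

# The search link quoted above, split across lines only for readability.
url = (
    "http://canlii.org/eliisa/search.do?language=en"
    "&searchTitle=Search+all+CanLII+Databases&sortOrder=relevance"
    "&searchPage=eliisa%2FmainPageSearch.vm&text=resurf*+%2Fs+hanke&id="
    "&startDate=2007-02-08&endDate=2007-12-31"
    "&legislation=legislation&caselaw=courts&boardTribunal=tribunals"
)

# parse_qs decodes "+" as a space and "%2F" as "/".
params = parse_qs(urlsplit(url).query, keep_blank_values=True)

query_text = params["text"][0]
date_range = (params["startDate"][0], params["endDate"][0])

print(query_text)
print(date_range)
```

The decoded search text is `resurf* /s hanke`, and the date window runs from 2007-02-08 to 2007-12-31, which matches the search described in the comment.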
The number of citing cases is based on all cases that cite one particular case (here 2007 SCC 7) that have been identified by our system.
When a case is cited in other cases by its neutral citation (2007 SCC 7) or any of its parallel citations ([2007] 1 S.C.R. 333 ; (2007), 69 Alta. L.R. (4th) 1), those cases are identified as citing cases and links are placed in their content to the cited case (2007 SCC 7).
So, as Connie wrote, if you run a full text search for “2007 SCC 7”, you’ll find 42 cases that contain the string “2007 SCC 7” including 2007 SCC 7 itself.
The number of citing cases is 39, which means that 2 citing cases have not been identified. Those two probably contain citations that are formed mysteriously, such as [2007] S.C.C. 7 in Ferguson v. Steel, 2007 ABQB 596 (CanLII).
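To make the arithmetic and the matching rule concrete, here is a minimal Python sketch. It is not CanLII's code: it assumes, purely for illustration, a recognizer that accepts only the three citation strings listed above exactly as written, and `cites_exactly` is a hypothetical helper name.

```python
import re

# Citations for 2007 SCC 7 exactly as listed in the comment above.
KNOWN_CITATIONS = [
    "2007 SCC 7",
    "[2007] 1 S.C.R. 333",
    "(2007), 69 Alta. L.R. (4th) 1",
]

# One pattern per citation; the lookarounds stop "2007 SCC 7" from
# matching inside a longer token such as "2007 SCC 70".
PATTERNS = [
    re.compile(r"(?<!\w)" + re.escape(c) + r"(?!\w)") for c in KNOWN_CITATIONS
]

def cites_exactly(text):
    """Return True if text contains one of the known citations verbatim."""
    collapsed = re.sub(r"\s+", " ", text)  # tolerate line breaks, double spaces
    return any(p.search(collapsed) for p in PATTERNS)

# The counts quoted above: the 42 full-text hits include the case itself.
full_text_hits = 42
citing_cases_found = 39
missed = (full_text_hits - 1) - citing_cases_found
```

Under that assumption a reference written as [2007] S.C.C. 7 is not recognized, because it is none of the three listed strings, and `missed` works out to the 2 unidentified citing cases mentioned above.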
Thanks, both, for explaining a little more fully. I think we would have to look through every result in David’s search to figure out what wasn’t pulled up and why…
Ivan, Connie
I avoid the problem of how the case is cited, and how it is misspelled, by using resurf* /s hanke in the “full text” search box. That pulls up 52 references to the SCC decision – 51 not including Resurfice, itself. Not 39 or any other number.
At least one of the cases won’t be found by using any version of “2007 SCC 7” or the full names in the citation search box because it has “no. 7” in the reference. Another spells the corporate name “Resurface”.
If CanLII isn’t doing some sort of sufficiently generic, wildcard-based search for case references, CanLII had best post a caveat explaining the limitations of the number-of-cites figure provided by the “most cited” sort.
DC
With David’s search resurf* /s hanke 60 results are returned.
Of these, 6 are the case itself in some form or other, and three results refer to a version of Resurfice that is earlier than the SCC judgment.
1. Manitoba Association of Architects v. City of Winnipeg et al, 2006 MBQB 126
2. Whey v. Halifax (Regional Municipality), 2005 NSSC 348
3. Nielsen (Estate of) v. Epton, 2006 ABQB 21
That leaves 51 results that do refer to the SCC judgment.
Of these, 38 have what would be considered, perhaps, standard unexceptionable citations to Hanke, typically the neutral citation…
1 Lane v. Alcock Enterprises et al, 2007 NLTD 157
2 Rizzi v. Mavros, 2007 ONCA 350
3 Radke v. M.S. (Litigation guardian of), 2007 BCCA 216
4 B.S.A. Investors Ltd. v. DSB, 2007 BCCA 556
5 Jackson v. Kelowna General Hospital, 2007 BCCA 129
6 Barabash v. U-Haul Co. (Canada) Ltd., 2007 BCPC 195
7 Hawkins v. Mathieson et al, 2007 MBQB 163
8 Hutchings v. Dow, 2007 BCCA 148
9 Ashcroft v. Dhaliwal, 2007 BCSC 533
10 Agno v. Wilson, 2007 BCSC 1160
11 Windsor v. Canadian Pacific Railway Limited, 2007 ABCA 294
12 Duley v. Friesen, 2007 BCSC 1723
13 Simpson v. Baechler et al., 2007 BCSC 347
14 Michaud v. Brodsky, 2007 MBQB 239
15 Nason v. Nunes et al, 2007 BCSC 266
16 Martin v. Capital Health Authority, 2007 ABQB 260
17 Ruffle v. Canada (Correctional Service), 2007 BCSC 1264
18 Hall v. MacDougall, 2007 BCSC 1296
19 Whyte v. Morin, 2007 BCSC 1329
20 Bohun v. Sennewald et al, 2007 BCSC 269
21 Greenall v. MacDougall and HMTQ, 2007 BCSC 339
22 Durand v. Bolt, 2007 BCSC 480
23 Williams v. Thomas Development (1989) Corporation, 2007 NLCA 54
24 Naidu v. Mann, 2007 BCSC 1313
25 Lyon v. Ridge Meadows Hospital, 2007 BCSC 1000
26 Nash v. MacDougall and HMTQ, 2007 BCSC 563
27 Jackson v. Rooney, 2007 BCSC 761
28 Nolet c. Boisclair, 2007 QCCS 4417
29 Zazelenchuk v. Kumleben, 2007 ABQB 650
30 Vasiliopoulos v. Dosanjh, 2007 BCSC 703
31 Wilde v. Archean Energy Ltd., 2007 ABCA 385
32 B.(P.) v. V.E.(R.), 2007 BCSC 1568
33 Wainwright (Town of) v. G-M Pearson Environmental Management Ltd., 2007 ABQB 576
34 Burbank v. R.T.B., 2007 BCCA 215
35 Marszalek et al v. Bishop et al, 2007 BCSC 324
36 Tonizzo v. Moysa, 2007 ABQB 245
37 Mainland Sawmills Ltd. v. USW Union Local – 1-3567, 2007 BCSC 1433
38 Haj Khalil v. Canada, 2007 FC 923
The remainder, 13 cases, exhibit some form of citation that might cause CanLII’s citation recognizing algorithm to choke. I’ve put the Hanke citation below the citing case and between curly braces because, thank God, no legal citation format uses curly braces:
1. Misko v. John Doe, 2007 ONCA 660
{(2007), 278 D.L.R. (4th) 643}
2. Barker v. Montfort Hospital, 2007 ONCA 282
{(2007) S.C.C. 7}
3. Adams v. Borrel et al, 2007 NBQB 102
{misspelling, [2007] S.C.C. 7}
4. Dey v. Sookram, 2007 CanLII 24077 (ON SCC)
{2007 S.CJ.No.7}
5. Block v. Canadian Pacific Hotels Corporation, 2007 ABQB 166
{[2007] S.C.J, No.7}
6. B.S.A. Investors Ltd. v. DSB, 2007 BCCA 94
{2007 SCC No. 7.}
7. Seatle (Guardian ad litem of) v. Purvis, 2007 BCCA 349
{(2007), 357 N.R. 175, 69 Alta. L.R. (4th) 1}
8. Abel v. Hamelin, 2007 CanLII 17185 (ON S.C.)
{[2007] S.C.C. 7}
9. Graham v. WCB et al, 2007 NWTSC 54
{[2007] S.C.J. No. 7}
10. Walford v. Jacuzzi Canada Ltd., 2007 ONCA 729
{[2007] S.C.J. No. 7}
11. Vescio v. Garfield, 2007 CanLII 24676 (ON S.C.)
{[2007] S.C.J. No. 7}
12. Hill v. Victoria Hospital Corporation, 2007 CanLII 27582 (ON S.C.)
{(2007), S.C.J. No. 7}
13. Ferguson v. Steel, 2007 ABQB 596
{2007 S.C.C. 7}
Of these, cases 1 and 7 use perfectly good citations (I’ve checked them in LexisNexis/Quicklaw) and should both have been caught by CanLII. Cases 9, 10, 11 and 12 use the Quicklaw citation [2007] S.C.J. No. 7 (case 12 with parentheses in place of the square brackets), while cases 4 and 5 butcher it. Cases 2, 3, 6, 8 and 13 get the neutral citation wrong in some way or other.
It seems to me that CanLII ought to be able to incorporate good alternative citations, including Quicklaw citations, into its count of cases that cite X; and I suspect that with a bit of work it should be able to catch the messed-up neutral citations, perhaps by running a double-check against the case names. I would doubt that trying to sweep in all the badly formed other citations is worth the candle, even if it were possible: there are more ways to botch things, Horatio, than your LRW prof ever imagined.
But I’d have to say that there’s no way at all it could allow for all misspellings of the case name. It’s one thing for David to see that Resurfice is akin to the word “resurface,” but another to get a computer to act like David (in this, I mean). A couple of good typos and the case name becomes anybody’s guess.
So I’d say that CanLII has some more work to do on its algorithm, and that David shouldn’t expect the perfection of human searches from a machine. In the meantime, it might be a good idea, as David suggests, for CanLII to state explicitly how a case is counted as one that cites another.
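As a rough illustration of what catching the botched neutral citations with a case-name double-check might look like, here is a minimal Python sketch. Everything in it is an assumption made for the example: the pattern `LOOSE_SCC` and the helpers `loose_scc_citations` and `probably_cites` are hypothetical, and nothing here reflects how CanLII's recognizer actually works.

```python
import re

# Loose pattern for SCC neutral citations. It tolerates brackets or
# parentheses around the year, periods in "S.C.C.", and a stray "No.".
LOOSE_SCC = re.compile(
    r"(?<!\w)[\[(]?(?P<year>\d{4})[\])]?,?\s*"
    r"S\.?\s?C\.?\s?C\.?\s*"
    r"(?:No\.?\s*)?(?P<num>\d+)",
    re.IGNORECASE,
)

def loose_scc_citations(text):
    """Return (year, number) pairs for anything resembling an SCC neutral cite."""
    return [(int(m["year"]), int(m["num"])) for m in LOOSE_SCC.finditer(text)]

def probably_cites(text, year, num, name_hint):
    """Loose citation match, double-checked against part of the case name."""
    has_citation = (year, num) in loose_scc_citations(text)
    has_name = name_hint.lower() in text.lower()
    return has_citation and has_name

# The malformed neutral citations from the list above.
variants = ["(2007) S.C.C. 7", "[2007] S.C.C. 7", "2007 SCC No. 7.", "2007 S.C.C. 7"]
caught = [probably_cites("Resurfice Corp. v. Hanke, " + v, 2007, 7, "hanke")
          for v in variants]
```

The name check is what keeps a loose pattern from sweeping in unrelated text that happens to resemble a citation; the sketch deliberately leaves the Quicklaw S.C.J. form alone, since that is a different numbering scheme and would need its own lookup.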
Simon,
I’m still waiting for the release of a computer that’s the analogue of David Gerrold’s H.A.R.L.I.E.
You have 60 hits on Resurfice, not 52 because you didn’t date limit your search to cases from Feb 08/07 onwards. The result, as you note, was you picked up some of the lower court references and the SCC case itself more than once. I could move the back date forward one day. That would eliminate Resurfice, itself. However, I prefer to keep it for my convenience.
I suspect a “fuzzy” enough algorithm sifting a large enough database would quickly “see” the Resurfice/Resurface overlap. I hope a competent programmer would recognize that “Resurfice” is a coined word and realize it’s an invitation for misspelling. The truncated root, then – resurf* – is the logical place to start.
The resurf* /s hanke “full text” search, so far, hasn’t missed any case referring to the SCC decision.
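The two ideas in this comment, a truncated root and a "fuzzy" comparison, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names are hypothetical, the similarity measure is `difflib.SequenceMatcher` from the standard library rather than anything CanLII uses, and the sketch ignores the `/s` proximity operator entirely (it only checks that both words occur somewhere in the text).

```python
import re
from difflib import SequenceMatcher

def name_similarity(a, b):
    """Similarity ratio between two names, from 0.0 to 1.0 (case-insensitive)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def mentions_resurf_and_hanke(text):
    """True if some word starts with 'resurf' and the word 'hanke' appears."""
    words = [w.lower() for w in re.findall(r"[A-Za-z]+", text)]
    has_root = any(w.startswith("resurf") for w in words)
    return has_root and "hanke" in words

# The coined name and its most likely misspelling differ by one letter.
score = name_similarity("Resurfice", "Resurface")
```

The prefix test catches both spellings without any scoring at all, which is the point of starting from the truncated root; the similarity score would only matter for typos that damage the root itself.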
wow–great detective work.
I just hope we haven’t dissuaded Ivan et al. at Lexum from continuing the fantastic development work! I think this feature is very forward-thinking, especially for a publicly available free service. I guess the challenge has been made to make it even stronger.
David, I suspect that it would take one whopper of a computer, fuzzy logic and all, to accurately catch all of the potential typos and misspellings of case names. I should imagine that the game isn’t worth the candle, especially when you’ve got citations to play with. There, I think, some decent programming could work wonders. But surely what needs to happen as well is to insist that courts use one common citation system, regardless of what Quicklaw does, and to get it right. Not asking a lot, I think.
I didn’t leave the debate because I was dissuaded; I was just busy watching a movie… Quite good actually.
I totally agree with Simon. Many things can be done with programming. However, sometimes there are smarter ways to get the benefit than writing code to catch a few unnecessary exceptions. The use of the neutral citation is one smarter way. There is no reason not to use a neutral citation to cite a case when such citation is available. 2007 SCC 7 – could it be any simpler? This will make QL’s, eCarswell’s and CanLII’s lives much easier.
Glad the movie was good, Ivan. Getting everyone to use the neutral citation for a case will surely be a process that gets slower and slower the closer it gets to perfection. In the meantime, and until failure to use them approaches zero, it would make sense for CanLII to have a notice put up to make sure people know clearly what counts as a citation in “cited in ‘n’ cases.” And though I wouldn’t argue for the effort of devising a fuzzy logic search for misspelled names, I would like to see CanLII amend its citation recognition algorithm to bring in the obvious (wrongful) variations on the neutral citation scheme and the commercial publisher citations as well. This latter would seem to be very important, given that lawyers and courts are likely to use commercial databases as though these were transparently available to everyone; and unless you guys are truly impressive persuaders, I’d bet that the mountain won’t come to Mohammed on this one, and commercial providers will go on using proprietary citations, and their mess will continue to leak into the public system.
Simon,
I resorted to what amounts to a haystack-sifting, brute-force search when the citation algorithm failed. Fortunately, I already had the information that told me it failed. I’d have cross-checked, though, by running a search based on non-citation parameters in any event.
I’m not asking CanLII to provide a better algorithm for searching based on word-pattern recognition. What it has is currently more than adequate. CanLII will eventually have systems that are more powerful. For me, if I’m researching proposition X established by case Y, there will be times where the research isn’t complete until I’m satisfied that I’ve located both the relevant cases that expressly refer to case Y and cases that don’t mention case Y but whose result necessarily means the judge had to have applied proposition X. Or, at the lower levels, the result would have been different had the judge been apprised on case X. At the appellate levels? We get to wonder (I’m being more circumspect than usual) at appellate inconsistency.
Oops. That should have been judge apprised of case Y and proposition X.
In the meantime, until CanLII is able to convince the courts to include neutral citations, it seems to me that the algorithm should at least be modified so that it picks up valid, major, SCC commercial citations such as the DLR. It doesn’t, at the moment, hence an Ontario case citing Resurfice as Hanke v. Resurfice Corp. (2007), 278 D.L.R. (4th) 643 (S.C.C.) isn’t picked up.
my understanding is that the ‘major’ commercial publisher reporter cites are incorporated into canlii’s reflex database only once annually, after the calendar year is finished. while this is more efficient, increasing the frequency of these updates might be something that canlii can work towards.
however, in the grand scheme of things, court decisions which do not include a neutral citation will tend to become less available and thus less cited. perhaps that is by intention ??
also, while the DLRs continue to carry the torch for a comprehensive canadian report series, the apparent trend is towards more specialization, and given the DLR’s impending exclusion from QL, i have some reservations about the viability of the DLRs going forward.