Visual Searching

It’s not easy to get a computer to recognize Charley’s aunt — or your copy of Insurance and Risk Management in Commercial Leases, for that matter. Seems our visual cortex et al. do a rather marvelous job of making sense of photon streams. Thanks to ASCII and to some nifty OCR developments, words aren’t all that hard for the machines we live by, but people and objects are a tough nut that the computer world is working hard to crack.

There is, of course, the impetus provided by the U.S. Department of Homeland Security’s wish to recognize the face of terror when it shows up in a queue at the airport; as a consequence, facial recognition systems are getting quite sophisticated. But on a more mundane level, software developers see profit in the offing if they can get your camera-equipped computer (i.e. your smart phone) to identify certain things around you.

The Canadian company Idée Inc. has released TinEye Mobile, currently just for the iPhone but soon to be more widely useful. This lets you take a photo of a music album cover and have a wealth of data about that album returned to you. In the near future similar systems are going to come from Idée Inc. for books and DVDs.

More experimental is work being done by MOBVIS, a university research project at Ljubljana, Slovenia, funded by the European Commission. Again aimed at the camera in your phone, the goal is to devise a system that will recognize your photo of a building or part of a city regardless of the angle you shoot it from, and, having recognized it, provide you with information about that object. You can see a video illustrating this here.

One of the things I see going on here is the use of the now ubiquitous camera to bypass the need for words and the tedious business of coming up with them and keyboarding them in. The gestalt says it all. The worder (to coin a phrase) in me is worried about this trend and what it might mean for the kind of reasoning and thinking I’ve learned to value. And of course it sails right by law and all its works, a good thousand metres to the right. Or does it? I believe we will have to learn to make use of, understand, incorporate — some such term — images in law in a fairly fundamental way in the not too distant future. This was mooted way back when icons and signage first became popular, but it’s back again now in a much more serious way. We could, instead, have two streams that have little to do with each other, the verbal and the visual, each with its “proper” domain. But I suspect that this would be a mistake.

For the present, all I really want is a program that would let me take a photo of a text and have that converted into digital format — OCR for the iPhone. That’d be a start.


  1. See, too, Microsoft Tag, which I’ve just come across.