How Black Is the AI Black Box?

It’s always interesting to me how things can sometimes coalesce and synchronize around an idea. For example, I’ve been thinking about a comment that Nicole Shanahan made in a recent collection of presentations delivered at CodeX, the Stanford Center for Legal Informatics. She was talking about “lawyering in the AI age” and touched on “predictive policing,” where the computer is used to predict human behaviour. Based on her experience with how algorithms and data work, Shanahan characterizes this as “not really a rational goal.”

However, she notes, there are products on the market today and,

“… no one is reviewing what those algorithms are doing, and some are even proprietary so we can’t even access them. I think we as a legal community need to decide if we have computers doing our policing for us we should probably have some standards of reviewing those machines. And we don’t. We don’t.”

And she’s absolutely right: we need standards for reviewing machine algorithms. We cannot blindly rah-rah our way toward an AI future without taking a close look at the processes that often manifest themselves to us as a “black box” full of machine learning algorithms.

Cathy O’Neil writes about this in her recent book, “Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy.” Despite being written by a “data scientist,” this is a very accessible book that provides some good cautionary tales illustrating how things can go bad under the algorithmic hood.

She considers some “critical life moments”: going to university, interacting with banks or the justice system, or trying to find and keep a job. “All of these life domains are increasingly controlled by secret models wielding arbitrary punishments.” And she wonders whether “we’ve eliminated human bias or simply camouflaged it with technology.”

And later, in reference to gaming Google search, O’Neil observes:

“Our livelihoods increasingly depend on our ability to make our case to machines. … The key is to learn what the machines are looking for. But there too, in a digital universe touted to be fair, scientific, and democratic, the insiders find a way to gain a crucial edge.”

And this reminded me of Lawrence Lessig’s “Code and Other Laws of Cyberspace,” originally published in 1999 and updated using a “collaborative Wiki” in 2005, and specifically of his exploration of regulation in cyberspace and the idea that “code is law.”

“We can build, or architect, or code cyberspace to protect values that we believe are fundamental. Or we can build, or architect, or code cyberspace to allow those values to disappear. There is no middle ground. There is no choice that does not include some kind of building. Code is never found; it is only ever made, and only ever made by us.”

And to help us get a sense of what might be inside the black box he suggests we ask,

“Who are the lawmakers? Who writes this law that regulates us? What role do we have in defining this regulation? What right do we have to know of the regulation? And how might we intervene to check it?”

And while I was thinking about all of this, a colleague* was kind enough to send around a link to a recent post by Brian Sheppard over on the Legal Rebels blog called, “Does machine-learning-powered software make good research decisions?: Lawyers can’t know for sure.” A provocative title, to be sure. And it is well worth the read for the nice short primers he includes on algorithms and machine learning alone.

He starts by asking, “Do lawyers need to know what they are doing?”

“The problem is that few lawyers understand what machine learning can do for them, let alone how it works. This is due in part to the guarded manner in which companies implement it. Machine learning is powered by complex proprietary algorithms, which companies keep under wraps and frequently change.”

But how understandable are these algorithms anyway? To begin with, algorithms are written using programming languages that will largely remain opaque to the uninitiated even if they do happen to get a chance to evaluate them. Then Sheppard notes that even the “companies that own the algorithms [and by extension potentially the coders that developed them] have trouble knowing exactly what their algorithms are doing.” That’s a little unsettling. And, as Shanahan commented with respect to predictive policing, it “makes me a little nervous.”

Sheppard quotes the Bloomberg Law data scientist Robert Kingan who provides this description of the black box:

“Many machine learning techniques result in models of the data that consist of, say, hundreds of thousands to millions of numerical weights used to determine how input data is transformed to output. One can apply tests to such an algorithm and review examples from ‘gold standard’ training data to get a feel for how it behaves, but it may be impossible to interpret the algorithm itself in human terms.”

Which, as Sheppard observes, “leaves both lawyers and research companies fumbling in the dark: Lawyers don’t have a complete picture of what is happening, and research companies are relying on the lawyers to teach their machines.”
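To make Kingan’s description a little more concrete, here is a minimal sketch in Python. It is a toy illustration only, not anything taken from Bloomberg Law or from Sheppard’s article: the perceptron learner, the three word-count features and the four labelled “gold standard” examples are all assumptions made up for the example. It shows the two things Kingan describes: a model that is nothing more than a list of numerical weights, and a test that checks how the model behaves on labelled examples without explaining why it behaves that way.

```python
def predict(weights, bias, features):
    """Turn input features into an output (1 = relevant, 0 = not relevant)."""
    total = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if total > 0 else 0


def train(examples, n_features, max_epochs=100):
    """Perceptron learning: nudge the weights whenever a prediction is wrong."""
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for features, label in examples:
            error = label - predict(weights, bias, features)
            if error != 0:
                mistakes += 1
                weights = [w + error * x for w, x in zip(weights, features)]
                bias += error
        if mistakes == 0:  # every example is classified correctly
            break
    return weights, bias


def accuracy(weights, bias, examples):
    """Fraction of labelled examples the model gets right."""
    correct = sum(predict(weights, bias, f) == label for f, label in examples)
    return correct / len(examples)


# Hypothetical "gold standard" data: counts of three made-up terms
# ("contract", "negligence", "court") in a document, plus a human label
# saying whether the document is relevant to a contract-law query.
GOLD_STANDARD = [
    ([2, 0, 1], 1),
    ([0, 2, 1], 0),
    ([1, 0, 1], 1),
    ([0, 1, 1], 0),
]

weights, bias = train(GOLD_STANDARD, n_features=3)
print("learned weights:", weights, "bias:", bias)
print("accuracy on gold standard:", accuracy(weights, bias, GOLD_STANDARD))
```

With three weights you can still read the model and guess what each number means. The systems Kingan is describing have hundreds of thousands to millions of them, at which point the accuracy test is still available but reading the weights is not.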

So, as I said, a little unsettling: even if we do get a chance to look in the box it might be pretty black inside.

* Thank you Sandra Geddes over at Bennett Jones SLP!


  1. Great post, Tim.

    This topic has been top of mind for me as well and it was helpful to read your post, as well as the excellent Sheppard article in Legal Rebels that you cite.

    I shared the Sheppard piece with the team developing and was quickly presented with this recent article from the MIT Technology Review in reply:

    The MIT authors acknowledge the black box aspect and offer up a principled approach to accountability predicated on:


    Each aspect is further defined, but the shorter version is that an identifiable someone should be capable of explaining the approach, and the data relied upon should be known, verifiable and justifiable to the purpose.

    Does having those insights in respect of algorithms used in research tools mean legal researchers will be immediately comfortable putting their faith in the tool? Not immediately, no, but comfort will follow with demonstrated reliability and transparency.

  2. Thank you Colin. You’ve reminded me of another book that’s in my “to read” pile: “Automate This: How Algorithms Took Over Our Markets, Our Jobs, and the World,” by Christopher Steiner. Here’s a review by Ashlee Vance at Bloomberg.

    Looking forward to reading that MIT piece. Hopefully we actually can evaluate these algorithms in “human terms.”

  3. David Collier-Brown

    A subset of the AI folks are interested in systems that can explain how they came to a conclusion.

    I recently stole that very idea from model-checkers and applied it to a much more pedestrian program, a language interpreter, which previously had been annoyingly inscrutable. I recommend it, and not just where we’re making decisions about someone’s life.

    [To oversimplify, a model checker exhaustively tries all possible sequences of inputs to a program, and reports which ones caused the program to fail to meet its requirements. Each is an explanation of how to make the program fail.]
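The model-checking idea in the last comment can be sketched in a few lines of Python. This is a deliberately oversimplified illustration under stated assumptions, not the commenter's code and not how a production model checker works: the toy "account" program, its requirement, and the cap on sequence length (`max_length`) are all hypothetical. It tries every input sequence up to that length and returns the ones that make the program break its requirement.

```python
from itertools import product


def buggy_account(balance, action):
    """Hypothetical program under test: one step of a toy account."""
    if action == "deposit":
        return balance + 1
    if action == "withdraw":
        return balance - 1  # bug: no check for an empty account
    raise ValueError(action)


def fixed_account(balance, action):
    """Same program, but withdrawals from an empty account are ignored."""
    if action == "withdraw" and balance == 0:
        return balance
    return buggy_account(balance, action)


def never_negative(balance):
    """The requirement being checked: the balance is never below zero."""
    return balance >= 0


def check(step, requirement, inputs, initial_state, max_length):
    """Try every input sequence up to max_length; return the failing ones."""
    counterexamples = []
    for length in range(1, max_length + 1):
        for sequence in product(inputs, repeat=length):
            state = initial_state
            failed_at = None
            for i, action in enumerate(sequence):
                state = step(state, action)
                if not requirement(state):
                    failed_at = i
                    break
            # Keep only sequences that fail on their final step, so each
            # counterexample is reported once, in its shortest form.
            if failed_at == length - 1:
                counterexamples.append(sequence)
    return counterexamples


ACTIONS = ("deposit", "withdraw")
failures = check(buggy_account, never_negative, ACTIONS, 0, 3)
for sequence in failures:
    print("fails after:", " -> ".join(sequence))

fixed_failures = check(fixed_account, never_negative, ACTIONS, 0, 3)
print("failures after the fix:", fixed_failures)
```

Each sequence it returns is an explanation in the commenter's sense: a concrete recipe a person can follow to see the program fail. The catch is that exhaustive search grows quickly with the number of inputs and the sequence length, which is why the sketch caps the length.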