Column

The Science of Bad Decisions

Freakonomics is one of my favourite series of books, blogs and audio podcasts. As the authors proclaim, they look at “the hidden side of everything.”

So imagine my delight when I found a recent podcast that examined how decisions by adjudicators (baseball umpires, judges and bank loan officers) can be affected by totally random factors such as the order in which they are made and the time of day.

The problem is, people don’t really understand randomness. They understand that, if you flip a coin, the odds of landing heads or tails are 50/50. This means that, out of any 10 flips, 5 will be heads, right? Wrong.

It’s quite possible to have 10 heads in a row. Rare, perhaps, but possible. It’s completely random.

Most people who get two or three heads in a row will expect to get tails on the next flip. But the odds are still 50/50.

This is called the “gambler’s fallacy.”
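For readers who like to check the arithmetic, here is a minimal Python sketch of the coin-flip claims above. It is my own illustration, not something taken from the podcast or the paper, and the function names `prob_exact_heads` and `heads_rate_after_streak` are made up for the example. The first function computes exact probabilities for a fair coin; the second simulates a long run of flips and measures how often heads follows a streak of heads.

```python
import random
from fractions import Fraction
from math import comb


def prob_exact_heads(flips: int, heads: int) -> Fraction:
    """Exact probability of getting `heads` heads in `flips` fair coin flips."""
    return Fraction(comb(flips, heads), 2 ** flips)


def heads_rate_after_streak(n_flips: int, streak: int, seed: int = 0) -> float:
    """Simulate fair flips and return the share of heads among the flips
    that immediately follow `streak` heads in a row."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    flips = [rng.random() < 0.5 for _ in range(n_flips)]
    followers = [
        flips[i]
        for i in range(streak, n_flips)
        if all(flips[i - streak:i])  # the previous `streak` flips were all heads
    ]
    return sum(followers) / len(followers)


if __name__ == "__main__":
    print(float(prob_exact_heads(10, 5)))    # exactly 5 heads in 10 flips
    print(float(prob_exact_heads(10, 10)))   # 10 heads in a row
    print(heads_rate_after_streak(1_000_000, 3))  # heads rate after 3 heads
```

Exactly 5 heads in 10 flips happens only about a quarter of the time (252 chances in 1,024), and 10 heads in a row comes up about once in 1,024 tries. The simulated heads rate after three heads in a row should land very close to one half, which is the whole point: the coin has no memory.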

“Decision-Making Under the Gambler’s Fallacy: Evidence from Asylum Judges, Loan Officers, and Baseball Umpires,” a research paper published earlier this year by three U.S. economists, found evidence that decisions made by umpires, judges and loan officers were affected by this fallacy.

The authors looked at Major League Baseball because there is an enormous amount of data about every pitch thrown in every game. The paper examined data about 1.5 million pitches collected by PITCHf/x, including the type of pitch (curve or fastball), where it landed, and whether it was called a ball or strike by the umpire.

The authors asked whether an umpire’s decision to call a strike depended on whether the previous pitch had been called a strike or a ball. They focussed on the close calls that could go either way.

(Umpires were right almost 100% of the time if the pitch was down the center of the plate or way outside the strike zone. But on the close calls, the umpires were right only about 2/3 of the time when the call was compared with the actual recorded pitch location. This was consistent among all 127 umpires in the data. These are the best, most experienced umpires around. High-definition, super-slow-motion video must be hugely detrimental to their self-esteem. Let this be a lesson to all adjudicators, especially those who can’t easily verify decisions against objective criteria. How sure are we that we get things right? As certain as baseball umpires?)

In any case, the research also showed that, on those same borderline calls, umpires were about 3.5% less likely to call a strike if the previous pitch was a strike, and about 5% less likely if the previous two were strikes.

Umpires don’t do this consciously. Baseball umpires – like referees and umpires in all sports – are trained to make every call independently, and not to think about whether previous calls were right or wrong.

According to Hunter Wendelstedt, a Major League umpire since 1999 and head of a leading umpire school, who was featured on the Freakonomics podcast commenting on the PITCHf/x data:

“If you miss something – the worst thing to do, you can never make up a call. People are like, ‘That’s a makeup call.’ Well, no, it’s not, because if you try and make up a call – now you’ve missed two. And that’s something that we would never, ever want to do.”

And it’s also not what the research data shows. What it shows is that, regardless of whether the previous call was right or wrong, the umpire is marginally less likely to make the same call again on the next pitch, and less likely still to make the same call three times in a row.

And the paper showed that the same tendency applied to U.S. judges deciding refugee cases and bank loan officers in India deciding whether to approve loans to customers.

The research on the Indian bank loans showed the loan officers made bad decisions (measured by the percentage of approved loans that later went bad) roughly eight percent of the time, simply because of the order in which they reviewed the applications. If they approved two or more applications in a row, they were less likely to approve the next one they saw, even if the application was just as good. And vice versa.

The effect on refugee asylum decisions made by U.S. federal court judges was also striking.

If cases are randomly assigned, one would expect some consistency in the rate of decisions for and against applicants. That’s not what the authors found.

(Quite apart from the decision-order effects the authors were studying, they also found that, in New York, some judges granted asylum in 80% of cases, while others approved less than 10%. This is an astonishing discrepancy in outcomes in what should be a random mix of assigned cases.)

To study the effect of the order of cases, the study focussed on judges with more moderate approval rates. They studied about 150,000 decisions by more than 350 judges. According to Toby Moskowitz, one of the study authors:

“If the previous case was approved by the judge, then the next case is less likely to be approved by almost one percent. Where it gets really interesting is, if the previous two cases were approved, then that drops even further to about one-and-a-half percent. And if these happen on the same day, that goes up even further, closer to 3 percent. And then obviously if it’s two cases in the same day it gets even bigger, it starts to approach about 5 percent.”

And the same difference of up to 5 percent worked the other way too. If the previous cases were denied, then the next case was more likely to be approved. So an applicant’s odds of approval in a particular case could swing by as much as 10 percent, due simply to the order in which cases were heard.

The authors had no way to determine the correctness of any of these decisions, but the statistics are significant – especially for the applicants whose entire future may be affected by whether a judge approved a previous applicant or not.

As Freakonomics author Stephen Dubner commented on the podcast:

…if I hear that a baseball umpire might be wrong … I think, “Well, but the stakes are not very high.” But in the case of an asylum seeker, this is a binary choice. This is not a one ball or strike out of many. This is I’m either in the country or I’m not in the country.

The podcast also refers to another study, from Israel, which found that time of day can also have an effect on judges’ decisions. That study found that judges were more likely to grant parole early in the day or right after lunch.

And there’s a phenomenon known as “sequential contrasts,” which says that our opinion of a book, or meal, or job applicant, is affected by the last book we read, meal we ate or applicant we interviewed. Our frame of reference changes and we rate the next one better or worse, accordingly.

One study of this effect in financial markets showed that investors mistakenly perceive current earnings news more positively, if the previous day’s earnings news was bad, and more negatively if the previous news was good.

These studies are all part of a growing body of research into the social science of decision-making. They shed light, not only on how adjudicators evaluate the merits of cases, but also on the many extraneous factors and biases that can affect decisions without our knowledge.

For more on the effect of the gambler’s fallacy on decision making, see:

Chen, Daniel L. and Moskowitz, Tobias J. and Shue, Kelly, Decision-Making Under the Gambler’s Fallacy: Evidence from Asylum Judges, Loan Officers, and Baseball Umpires (January 12, 2016). Fama-Miller Working Paper. Available at SSRN: https://ssrn.com/abstract=2538147 or http://dx.doi.org/10.2139/ssrn.2538147

For more on sequential contrast effects, see:

Hartzmark, Samuel M. and Shue, Kelly, A Tough Act to Follow: Contrast Effects in Financial Markets (July 15, 2016). Available at SSRN: https://ssrn.com/abstract=2613702 or http://dx.doi.org/10.2139/ssrn.2613702

And check out other interesting Freakonomics stories at www.freakonomics.com.

Comments

  1. Michael

    Yup. Fun site.

    Not just “people.” Too many judges and lawyers, too. Too many people who take polls as gospel – whatever they think that gospel is.

    Maple Leaf fans, by definition.

    This article at the next link isn’t about Canadians, of course

    https://sciencebasedmedicine.org/5-out-of-4-americans-do-not-understand-statistics/

    which, all things considered, is probably a good thing because our results would be equally bad or worse. (See Maple Leaf fans, Canuck fans, Nickelback fans, and people who buy 10 6/49 tickets on the basis that that way they’ll have 10 times as much chance of winning the lottery.)

    And, in my experience, arbitrators too but, at least in cases where my side had a useful say in who the arbitrator was, usually not. (That wouldn’t always have been my doing. More often than not it was because the C.A. retained by my client, or by my firm, did and was very good at educating where necessary.)

    If you want more hits on the subject, here’s a plain vanilla google search link: https://www.google.nl/search?q=studies+on+whether+people+understand+statistics&rlz=1C9BKJA_enCA650CA650&oq=studies+on+whether+people+understand+statistics&aqs=chrome..69i57.28709j0j8&hl=en-GB&sourceid=chrome-mobile&ie=UTF-8

    I didn’t run the same search on Bing to see if the results would be relevantly different.

    I did, however, run the same search on Duckduckgo.

    https://duckduckgo.com/?q=studies+on+whether+people+understand+statistics&t=h_&ia=web
    The results differ from the Google search.

    Go figure. What is the statistical likelihood of that.

  2. Excellent blog. Those interested in this topic might also be surprised to read about another persistent cognitive bias, the anchoring effect, and its role in judging. See Birte Englich, Thomas Mussweiler and Fritz Strack, “Playing Dice with Criminal Sentences: The Influence of Irrelevant Anchors on Experts’ Judicial Decision Making” (2006) 32 Personality and Social Psychology Bulletin 188, which can be found here:

    http://www.eucim-te.eu/data/dppsenglich/File/PDFSStudien/PSPB_32(1).pdf.