Das U-Blog by Prashanth: Book Review: "Noise" by Daniel Kahneman, Olivier Sibony, and Cass Sunstein

I recently read the book Noise by Daniel Kahneman, Olivier Sibony, and Cass Sunstein. I will refer to it as the current book, and I will refer to Thinking, Fast and Slow by Kahneman (one of the authors of the current book) as the previous book, because the current book was written after it and briefly reviews many of its concepts. As I reviewed the previous book in the post just before this one on this blog [LINK], I will sometimes compare some aspects of the current book to the previous book.

The current book introduces the concepts of statistical noise & statistical bias in human judgments, discusses the psychological biases that can lead to statistical biases & noise (of which statistical noise can be clearly seen even in the absence of clear information about statistical biases), demonstrates how statistical noise can lead to uncontrolled & large variations in human judgments in fields like criminal justice, medicine, forensic science, insurance claims adjustment, corporate hiring, and college admissions, explains the sorts of systematic techniques at individual & organizational levels that can be used to reduce noise in judgments, and discusses some tradeoffs that may be encountered when implementing these noise reduction strategies. The authors' discussion of many of the psychological biases that lead to statistical noise in judgments reviews concepts from the previous book, especially Systems 1 & 2.

When reading the current book, I found myself generally agreeing with the discussions of techniques to reduce noise in domains where the presence of significant statistical noise in judgments is broadly recognized as a severe problem. These techniques include aggregating predictions or evaluations that are made independently, structuring/sequencing discussions among people judging things so that their decisions don't affect each other through emergent group-based social dynamics, carefully accounting for base rates from external information when assessing various internal probabilities, and breaking up decision processes into smaller steps that are more clearly defined in their intent and in example decisions/anchors. It helped a lot that I had read the previous book first, such that even if I didn't remember every detail of every psychological bias presented in both the previous book and the current book, those things looked familiar upon reading them in the current book.
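To make the first of these techniques concrete, the following is a minimal simulation sketch of why aggregating independent judgments reduces noise. All of the names & numbers in it (the true value, the spread of individual judgments, the panel size) are hypothetical choices for illustration and do not come from the current book, and the assumption of unbiased Gaussian noise is a simplification.

```python
import random
import statistics

random.seed(0)  # fixed seed so that the sketch is repeatable

# Hypothetical values chosen for illustration only (not from the book).
TRUE_VALUE = 100.0  # the "correct" answer to some judgment problem
NOISE_SD = 20.0     # spread of a single person's judgment around that answer
PANEL_SIZE = 9      # number of independent judgments that get averaged
TRIALS = 10_000     # number of simulated judgment problems


def one_judgment():
    """One unbiased but noisy judgment: true value plus independent noise."""
    return random.gauss(TRUE_VALUE, NOISE_SD)


def panel_average(n):
    """Average of n judgments that are made independently of each other."""
    return statistics.fmean([one_judgment() for _ in range(n)])


single = [one_judgment() for _ in range(TRIALS)]
averaged = [panel_average(PANEL_SIZE) for _ in range(TRIALS)]

single_sd = statistics.pstdev(single)
averaged_sd = statistics.pstdev(averaged)

print(f"spread of single judgments:   {single_sd:.2f}")
print(f"spread of averaged judgments: {averaged_sd:.2f}")
```

Under these assumptions, the spread of the averaged judgments should be close to the spread of single judgments divided by the square root of the panel size (here, about one third). This only holds if the judgments really are independent, and averaging does nothing to remove a bias shared by all of the judges.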

There were also a few new things that I learned from the current book. I learned about how the process of judgment feels so satisfying and infuses confidence into the person making the judgment specifically from the psychological signal of having completed the judgment, which explains why so many people who make professional judgments in many domains are so reluctant to turn their discretion over to more systematic rules or algorithms. I also learned about how simple models of human predictive judgments, when those predictive judgments are about specific outcomes, may do a better job at predicting the outcomes that are the objects of judgment than at predicting the judgments that humans would make, simply because those models lack within-person noise pervasive in human judgments.

However, my overall opinion of the current book was shaped more by my many major and minor criticisms of it (the latter to an appropriately lesser extent). These minor and major criticisms as well as my concluding remarks will be presented in separate sections after the jump; to spoil my concluding remarks, I do not recommend this book to others.

Minor criticisms

The current book presents the empirical finding that many people do not understand the idea that $ \mathrm{Pr}(\mathrm{A} \cap \mathrm{B}) \leq \mathrm{Pr}(\mathrm{A}) $ must always hold for any events $ \mathrm{A} $ & $ \mathrm{B} $ but does not further contextualize this finding. This gap in the explanation also exists in the previous book, which I pointed out in my review linked within this post, and in the book The Drunkard's Walk by Leonard Mlodinow, which I pointed out in my review of that book on this blog [LINK]. I will not dwell on this point further except to lament the lack of imagination of the authors about why else people asked such questions about probability may make that mistake and how else to empirically test the psychological processes that lead to such mistakes.
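As a brief aside, here is a minimal sketch of why that inequality must hold, assuming only the standard additivity of probability for mutually exclusive events; the notation $ \mathrm{B}^{c} $ for the complement of $ \mathrm{B} $ is introduced here for illustration and is not taken from either book. The event $ \mathrm{A} $ splits into the mutually exclusive events $ \mathrm{A} \cap \mathrm{B} $ & $ \mathrm{A} \cap \mathrm{B}^{c} $, so

$$ \mathrm{Pr}(\mathrm{A}) = \mathrm{Pr}(\mathrm{A} \cap \mathrm{B}) + \mathrm{Pr}(\mathrm{A} \cap \mathrm{B}^{c}) \geq \mathrm{Pr}(\mathrm{A} \cap \mathrm{B}) $$

where the inequality follows because $ \mathrm{Pr}(\mathrm{A} \cap \mathrm{B}^{c}) \geq 0 $.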

Additionally, the current book uncritically presents nudge theory as an effective way to change the mental environment for people making judgments. Such claims are also presented in the previous book. It is not surprising that nudge theory features somewhat heavily in the current book, as Sunstein (one of the authors of the current book) is the academic researcher who formalized it & did pioneering empirical work on it. In my review of the previous book linked within this post, I noted how more recent empirical work on nudge theory has shown that the effects of nudges on people's behavior in many contexts are not statistically significantly different from zero. The current book does not acknowledge the existence of such contradictory work at all. In the concluding remarks of this post, I present some admittedly unfounded speculation on my part as to some of the authors' motives for this omission.

Major criticisms

My major criticisms of the current book are largely that the authors' claims about the extent to which noise is a problem in various domains may be overstated and that their recommendations seem antithetical to many American political & societal values (which is salient to the current book because the authors explicitly say, near the beginning, that they are writing from the perspective of American history, culture, societal values, and societal expectations). These and related major criticisms are presented in the following subsections in no particular order. That said, I do recognize that some of these criticisms may be anecdotal or may stem from my own biases as an American and as a recent academic researcher in the social sciences with an understanding & appreciation of the grave history of socioeconomic marginalization of various groups.

Noise reduction strategies for singular business decisions

To be clear, I am willing to accept the need for more systematic & regulated predictive judgments to reduce noise in medicine, forensic science, insurance claims adjustment, and some other domains involving recurring similar decisions, especially where there is somewhat less direct interaction between the people making judgments and the people who are affected by those judgments. (However, this does not extend to all domains involving recurring similar decisions, because I have a separate criticism about domains like hiring of new employees where more direct interactions exist between the people making judgments and the people who are affected by those judgments.) The criticism in this subsection is mostly about the application of noise reduction strategies to singular business decisions; this criticism may apply to the use of noise reduction strategies in the context of singular decisions in other domains too, but I haven't thought through that as carefully.

In the particular domain of singular business decisions, the authors point out that excessive optimism abounds. This is consistent with what I have read about how people who start their own businesses underestimate the frequency with which past businesses like theirs have failed quickly.

My concern is that if too many people in the business world adopt more conservative (less risk-taking) judgment rules and the System 1 of each person adapts to this new judgment environment, the end result of widespread increased pessimism in the business world might be shrinking entrepreneurship & stagnation of the market. The latter point implies that a dynamic & innovative market might require most of its business owners or managers to be at least a little more optimistic than past results should warrant. This is also consistent with other things that I have read about how many businesses in Europe of all sizes are excessively conservative (with respect to risky innovations or other business moves), stagnant, bureaucratic, and lacking in innovation in contrast to their similarly-sized counterparts in the US because in Europe, the culture of excessively systematic breakdown & assessment of risk information snuffs out entrepreneurial optimism & flexibility and instead triggers risk aversion (which optimism from not as systematically analyzing risks would hide) common in humans. Moreover, if only one company in a competitive market in the US were to do this, that company would be at a significant disadvantage because of the risks of being left behind not only by competitors but also by investors who might not like such excessive caution when it leads to mediocre returns compared to competitors. Overall, this is an example of how a suggestion by the authors, if implemented to the fullest extent, would have much bigger higher-order effects based on exactly the individual & group-based psychological biases that the authors describe.

It is telling, too, that the authors, in contrast to their presentation of real success stories of noise reduction efforts in domains of recurring decisions like physiological medicine or hiring (along with honest accounts of failures of such efforts in domains like psychiatry), only share hypothetical scenarios for noise reduction efforts in domains of singular decisions like business moves. Ostensibly, this may just be because business owners or managers are so locked into conventional wisdom that they refuse to consider any improvements to their internal business processes, but I suspect that at least in the US, there may be some value in the conventional wisdom, even if its practitioners cannot articulate why the conventional wisdom may work better than these suggested improvements.

Signals from rules changing too often or being gamed

In some domains of recurring decisions, especially hiring applicants for similar job roles in a company, the authors' rebuttal to complaints that rigid noise reduction rules applied to such judgments would not be flexible enough to account for unforeseen individual circumstances or changing societal values/beliefs is that people making these rules should just do better and should update these noise reduction rules sufficiently often to account for changing circumstances or societal values/beliefs. As I see it, the problem is that the authors fundamentally assume that everyone in the system (whether higher-level managers, hiring managers, interviewers, or job applicants in the domain of hiring) is used to these rules, will accept changes in these rules quickly, and has a steady-state behavioral response to these rules, yet people's responses tend to be slower, which makes this assumption generally incompatible with the admonition to frequently update these rules.

In the specific context of hiring, noise reduction rules that change at a frequency above a certain threshold, even if ostensibly for desirable reasons like accounting for previously unforeseen differences in individual circumstances or for changes in societal values/beliefs, would seem just as arbitrary (to the people at these organizations enforcing these rules as well as to job applicants) as individual human judgments. Above that threshold, such frequent changes to rules would become essentially formalized noise at the organizational level indistinguishable from arbitrary individual human judgments. Such frequent changes may even perversely signal to job applicants that the organization is not a trustworthy or stable employer, as it exhibits the same noise as (instead of much less noise than) individual humans but at a larger scale. The authors do not acknowledge these concerns to any significant degree. Moreover, if those rules were to be published, it is likely that the system would be gamed, but the authors brush off criticism about the possibility of rules being gamed by essentially just saying "do better" (at creating & updating rules that minimize the possibility of being gamed), yet this advice could be counterproductive by offering even more opportunities for the organization's intent to be perverted by those who wish to exploit that system.

Organizations operating to minimize legal liability

Despite the authors making clear at the beginning of the current book that it would mostly focus on examples in the US and would be written from the perspective of the social, political, and business cultures in the US, the authors demonstrate a remarkable lack of understanding of the extent to which institutional/organizational processes in the US to make judgments are shaped to minimize legal liability/the threat of being sued. For example, the authors discuss hiring or employment rules that may or may not account for an applicant's or employee's pregnancy or recently having given birth without considering laws protecting applicants or employees against such discrimination. This is especially surprising to me considering the extensive discussion elsewhere in the current book about legal issues and considering how many seemingly odd organizational practices may make more sense when considering that the organization in question or another one like it had been sued because of the opposite of that practice in the past.

Mentally settling into rigid rule-following

Despite reviewing the concepts of Systems 1 & 2 in the current book, the authors fail to imagine how when rigid rules for judgment are enacted & accepted, even in a large organization, the people in charge of enforcing those rules may not only consciously feel that they are forced by the system to give up the opportunity for mercy but may also subconsciously be led by System 1 to not even notice cases that show the deficiencies of those rules and instead claim that their hands are tied. From the previous book, this is an example of a feature of System 1 that "what you see is all there is". The latter point is especially troubling because that mentality of "just following orders" was memorably explained by Hannah Arendt as part of the concept of the "banality of evil" in the context of Nazi Germany (though the particular examples that she used turned out to undercut her arguments) and could also be observed in Mao-era China, the reign of the Khmer Rouge, and the British colonization of India (in which native Indians hired to be British soldiers or police officers at multiple points massacred their fellow Indians, in some cases with seemingly no hesitation in that moment). I struggle to understand how the authors could overlook that such rigid rule-following without mercy is associated with brutal dictatorships and therefore would not resonate with American readers given prevailing societal values in the US.
I further struggle to understand how Kahneman, as one of the authors, could have endorsed these ideas and their implications given his particular background. By his own telling, his upbringing as a French Jew during the Holocaust informed his view of human psychology even as a child, and his residence mostly in Israel since its creation (when he was 14 years old) in the wake of the Holocaust likely exposed him to enough information about the horrors of Nazi Germany & similar brutal dictatorships. Please note that my references to specific aspects of Kahneman's background should make clear that I do not imply that Kahneman "should" have believed certain things only because of his Jewish background or upbringing during the Holocaust or that his residency in Israel since its creation would lead him to specific views about the politics of that region.

Moreover, the authors' explanation that such rigid rules should channel the vagaries of System 1 systematically into more productive & less noisy judgments without requiring people to use System 2 all the time (as the previous book makes clear that using System 2 drains people's energy & focus, so it is unrealistic to expect any human to use System 2 all the time) contradicts the authors' expectation that the people in those systems who enforce & update those rules forever vigilantly use System 2 to assess the performance of those rules & update them frequently. The quote "if men were angels, no government would be necessary" from Federalist Paper #51 illustrates this contradiction, as the rules underpinning governments to channel people's System 1 more productively would not be needed if people were using System 2 all the time. That said, I recognize that it is possible that I am overstating the extent to which this contradiction is a problem by making it seem like a binary choice between using only either System 1 or System 2 instead of a spectrum in between.

Whether and why noise in the judicial system is a problem

To be clear, I agree with the authors that the idea that verdicts & sentences in cases of criminal law or judgments in cases of civil law, many of which are life-changing, could depend critically on the judge's short-term mood, demonstrating within-person noise sensitive to mundane things like the weather that day or how hungry the judge happens to be at that moment, is outrageous. With that aside, I think that the authors seem to fundamentally overstate the extent to which between-person noise in legal judgments is a problem & mischaracterize outrage about such disparities as being purely about their quantitative aspects. This can be seen in 2 ways.

The first way is to consider a scenario where ordinary people are outraged about judgments against companies that have committed crimes because too many of those companies have received at most nominal punishments & too few of them (but enough to create a large observed range of punishments) have received substantial punishments; this is a simplified view of one aspect of the Occupy Wall Street protests in the early 2010s following the financial system collapse in the US in 2008 & the Great Recession in 2009. In this scenario, if the probability distribution of quantified punishments (in terms of money that the company has to pay or jail sentences for the company's managers) were to uniformly shift such that the new minimum punishment were to be placed much higher than the old maximum punishment, then I think that it is plausible that ordinary people's thirst for retribution against companies that in the future commit crimes would be quenched and that further disparities evident from the large range of punishments wouldn't matter as much to ordinary people. As a specific example with numbers, I think that such people would be far more outraged if a company found guilty had a 99.9% chance of being punished with a fine of $5 and a 0.1% chance of being punished with a fine of $100 billion than if a company found guilty had a 99.9% chance of being punished with a fine of $100,000,000,005 ($100 billion + $5) and a 0.1% chance of being punished with a fine of $200 billion, even though the ranges are identical. The authors' logic about the importance of noise, by contrast, would imply that such people would & should continue to be outraged about the width/range of that probability distribution irrespective of its mean/median, which I find to be implausible especially given that the current book itself discusses deficiencies in ordinary people's understanding of probability. 
It is odder still that the authors don't acknowledge the effects of "satisficing" preferences from microeconomic theory & empiricism that are applicable in the context of finding a high enough level of punishment that is broadly acceptable, given that the authors have done research in behavioral economics.
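The arithmetic behind the numerical example of fines above can be checked with a short sketch. The probabilities & fine amounts are the ones from that example; the variable & function names are hypothetical, and using the range & variance as stand-ins for the "width" of each distribution is my own simplification.

```python
from fractions import Fraction

# Each entry is (probability, fine in dollars), copied from the example in the text.
# The names "lenient" & "harsh" are illustrative labels only.
lenient = [(Fraction(999, 1000), 5), (Fraction(1, 1000), 100_000_000_000)]
harsh = [(Fraction(999, 1000), 100_000_000_005), (Fraction(1, 1000), 200_000_000_000)]


def fine_range(dist):
    """Difference between the largest and smallest possible fines."""
    fines = [fine for _, fine in dist]
    return max(fines) - min(fines)


def expected_fine(dist):
    """Probability-weighted mean fine, kept exact with fractions."""
    return sum(p * fine for p, fine in dist)


def fine_variance(dist):
    """Probability-weighted squared deviation from the mean fine."""
    mean = expected_fine(dist)
    return sum(p * (fine - mean) ** 2 for p, fine in dist)


for name, dist in (("lenient", lenient), ("harsh", harsh)):
    print(name, "range:", fine_range(dist), "expected fine:", float(expected_fine(dist)))

print("same range:", fine_range(lenient) == fine_range(harsh))
print("same variance:", fine_variance(lenient) == fine_variance(harsh))
print("shift in expected fine:", expected_fine(harsh) - expected_fine(lenient))
```

Both distributions have a range of $99,999,999,995 and the same variance, because the second is just the first with every outcome shifted up by $100 billion; only the expected fine differs, by exactly that amount. A measure of noise based only on spread therefore cannot distinguish between the two scenarios.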

The second way is that I think that public anger about between-person noise especially in criminal sentencing outcomes, just like those disparities themselves, may arise more from disagreements over the extent to which criminal sentences should be used for rehabilitation (in which case they should be short & structured appropriately) versus punishment (in which case they should be long & isolating). Rigid rules to reduce noise, in the form of mandatory minimum sentences, not only failed to resolve this qualitative disagreement but may have further inflamed it, as such mandatory minimum sentences were implemented in the 1980s coinciding with high levels of crime & high levels of public support for the War on Crime, while the overturning of mandatory minimum sentences on technical legal grounds came in the 2000s coincident with lower levels of crime & lower levels of public support for the War on Crime per se. The authors somehow don't address these points at all.

Concluding remarks

In almost all of my major criticisms, I have been able to use the authors' own concepts & logic to significantly weaken their arguments (following the notion of exegesis) in the current book; this significantly weakens the respect that I have for the current book. Moreover, the authors demonstrate hypocrisy by too often glibly dismissing objections to their arguments (often with the rebuttal that the solution is to simply "do better" when implementing noise reduction strategies, even as that rebuttal offers no further details about that solution and does not demonstrate nuanced understanding of the criticism) instead of taking those objections seriously, even as they exhort readers to think carefully & be open to new information. The authors also seem to treat the absence of clear articulations from their study subjects about reasons for opposing noise reduction strategies as evidence of the absence or illegitimacy of such reasons, showing the authors falling prey to some of the psychological biases affecting System 1.

The condescendingly dismissive attitude underpinning the authors' rebuttals especially irritated me, particularly given the authors hypocritically expecting readers at many points to carefully consider new ideas & viewpoints, and seemed counterproductive with respect to getting ordinary people to trust technocrats or technocratic solutions (a label that the authors cheerfully apply to these noise reduction strategies) in various domains. Admittedly without much more than circumstantial evidence & my own conjecture, I suspect that the condescension most likely came from Sunstein, who was a technocrat in the Obama administration, especially because the glib attitude, condescension, lack of humility, and prescriptive nature (in the economic sense) of the current book contrasted with the detail, humility, and descriptive nature (also in the economic sense) of the previous book (as Kahneman, who was one of three authors of the current book, was the sole author of the previous book).

The point about Sunstein made me wonder further whether Sunstein's failure in the current book to address examples of empirical invalidation of nudge theory (as he was the pioneer formalizing that area of research), instead of being an innocuous oversight or an omission for the sake of brevity, might actually indicate an active effort by him to dishonestly deny the existence or legitimacy of such contradictory work. Additionally & speculatively (as I don't have strong evidence beyond the authors' respective backgrounds & my own conjecture to support the following point), if Sunstein was the author primarily responsible for the recommendations in the book that seemed to endorse rigid rule-following in many organizational contexts that make those organizations seem more like brutal dictatorships, then I can sympathize more in hindsight with the large number of ordinary Americans who chafed during the Obama presidency against the technocratic approach of the Obama administration as seeming dictatorial to them (though my sympathy is still limited knowing that much of that sentiment came from a steady stream of right-wing propaganda). I get the sense that the Obama administration's technocratic approach worked well from a systems perspective but made too many people feel bad about not understanding why these systems are in place & didn't do a good enough job of explaining those things in simple terms to ordinary people.

For all of the reasons in this post, I cannot recommend the current book to anyone else. I still do stand by my recommendation of the previous book despite more recent studies (as I pointed out in my review of the previous book, linked within this post) failing to reproduce many of the key empirical findings presented in the previous book, as the previous book has high levels of detail & humility (without a glib or unjustifiably dismissive attitude toward criticism of the ideas) that make its novel ideas interesting & thought-provoking.