Book Review: "Weapons of Math Destruction" by Cathy O'Neil

I've recently read Weapons of Math Destruction by Cathy O'Neil. It is a short but dense exposition of the various ways that computer algorithms can determine the courses of people's lives and exacerbate existing societal inequities, biases, and prejudices, in areas like education, civic engagement, health at the workplace, and many others. It argues that while many algorithms used in big data can be used for good, many instead widen inequalities and reinforce cycles of poverty because of their opacity, lack of accountability, poor use of statistics, and lack of critical examination by those in charge. Those in charge instead use the results and predictions generated by such algorithms to fire workers, deny opportunities to potential employees, financially prey on poor people, and so on, which turns the predictions into self-fulfilling prophecies, as only confirmatory data is fed back in. In particular, many of these algorithms and models use questionable proxies to predict certain attributes or behaviors (especially when the desired attributes are hard to quantify but the proxies are easy to measure). The models are rarely transparent about what inputs are collected and how they are manipulated to produce outputs, and further research and fine-tuning are rarely performed to correct models producing results that most humans would recognize as incorrect (but which computers would miss). The book concludes that extensions of existing regulations on the use of health and financial data are needed to curtail the misuse of such algorithms, and that data scientists simultaneously need to be scrupulous about the ways that their work is developed and used.

I rather enjoyed reading this book: it's pretty fast-paced, yet it gives many detailed examples of the abuse of these algorithms to form a compelling narrative. Additionally, in many ways it follows on from the book The Attention Merchants by Tim Wu (which I have previously reviewed): where that book shows the various ways that companies collect and sell customer data, this book shows the various ways that data can be used for the benefit of those companies (even if that works against some of those customers). There are only two issues that I have with this book. One is that the few times that politics comes up, the author's political bias (in favor of liberals in the US) is obvious. Perhaps this is just due to the nature of the author's passionate crusade against abuse of algorithms and for institutional action uplifting poor and marginalized people, as that would necessitate regulation of such mathematical instruments. Such regulation would be (and has been) loudly opposed by large corporations that maintain their short-term profits and long-term status quo through these algorithms, as well as by the conservative politicians that they support. The other is that there aren't many examples of big data and related algorithms truly working toward greater socioeconomic equity, especially where such algorithms find patterns that wouldn't be found by humans. While I get that the author is trying to build a brief but dense narrative warning against the excesses and abuses of such algorithms (as she professes herself to not be a big data evangelist), I would have liked to see more nuanced examples of proper uses of big data, because as this book stands, it seems just as one-sided and polemical as uncritical big data evangelism. Overall, I certainly feel like I got a better sense of the potential dangers of unchecked and uncritical use of algorithms to shape the economy and society.

Plus, now that I'm about halfway through my PhD, I've started to think more about the sorts of jobs that I'd like to take after I finish. I've decided that I don't want to go into finance because (as mentioned in this book too) I'm not comfortable with playing with other people's money, as it is too easy to be seduced by mathematical simplicity and elegance into doing questionable things. That said, one thing (among many) that has caught my fancy is the study of policy problems (especially those related to STEM fields, but as they affect ordinary people). However, the story in this book about the role of the Mathematica Policy Research company in developing the arbitrary and statistically unsound metrics for evaluating teachers in DC public schools has made me realize that if I end up joining a policy research organization, consultancy, or think tank, I'll need to make sure that it is as responsible and transparent as possible about the data that it collects and processes.