Law, Liberation, and Causal Inference


Lily Hu is a PhD candidate in Applied Mathematics and Philosophy at Harvard University.

This post is part of a symposium on the Methods of Political Economy.

Struggles over the “right” interpretation of key ethical notions are perpetual, indissoluble conditions of the law. But this burden of normativity is so onerous that efforts at alleviating any of it are always being offered up, and taken up, in legal analysis. Frameworks that can side-step moral and political dispute with an agree-to-disagree convergence on a simplified empirical analysis gain much in practical directedness and operationalizability. The law must, at the end of the day, do something.

First through the escape hatch from normativity was neoclassical economics, whose clear models and desiderata, though lacking in normative sensitivity and in faithfulness to the goings-on of the complex social world, do achieve their goal of guiding the law in issuing verdicts. Now, as economics takes an empirical turn, troves of data and a well-developed toolkit of statistical inference methods have moved to supplement (if not supplant) the largely theoretical models of yesteryear. Such empirical analyses are perhaps even better suited to the law, which must, after all, deal in actually existing complex social systems. If key legal goals can be mapped onto social outcomes that may be summarized in the language of statistics, then careful analysis of the numbers and data both tells us what is the case and offers guidance on what the law should do about it.

The incorporation of empirical analysis via statistical methods into interpretive and normative legal frameworks calls for scrutiny of the role this input plays in the law. I want to suggest that we can take lessons from disputes over statistical methodology and its use in legal reasoning to better illuminate the more general relationship between “fact”-finding and normative judging.

A recent methodological dispute among quantitative social scientists about how to properly study racial discrimination in the criminal justice system well illustrates how the nuts and bolts of causal inference—in this case about the quantitative ventures to compute “effects of race”—feature a slurry of theoretical, empirical, and normative reasoning that is often displaced into debates about purely technical matters in methodology.

The crux of the debate concerned the possible distortionary effect of using arrest data from administrative police records to measure causal effects of race. If the composition of recorded police encounters is biased by upstream racially biased policing procedures, one camp argued, standard inference procedure produces erroneous estimates of causal effects of race, and correspondingly, of racial discrimination. The opposing side disagreed. According to these researchers, biased policing practices do not necessarily invalidate the usual methods, since other statistical facts of the data might shake out in a way that preserves our standard inferential capacities. For example, even if policing is biased such that Black people are stopped for much more minor “violations” than are whites, it is possible that the nature of these Black police stops is such that potential police violence in these encounters compared to white ones exactly counterbalances those differences in potential use of force due to the different composition of stops. That is, the causal effect of race on police use of force could be biased by, on the one hand, the kinds of stops that Black vs. white people are subject to—i.e., jaywalk stops or assault stops—and on the other, the nature of the police encounters themselves—i.e., police officers’ sense of “suspicion” or “level of threat” in the encounter. There’s no saying that these two sources of bias might not happily cancel each other out so as to allow statistical inference to carry on business as usual.
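The first camp's worry can be made concrete with a small piece of arithmetic. The sketch below is only an illustration: every number in it is hypothetical, and the two kinds of stop ("minor" and "serious") and the function name `recorded_force_rate` are assumptions introduced here, not anything taken from the studies in dispute.

```python
# All numbers are hypothetical and chosen only to make the arithmetic visible.

def recorded_force_rate(stop_mix, force_prob):
    """Average use-of-force rate across one group's recorded stops."""
    return sum(share * force_prob[kind] for kind, share in stop_mix.items())

# Assumed probability of force by kind of stop. Within each kind of stop,
# the assumed rate for Black individuals is double the rate for whites.
force_prob = {
    "black": {"minor": 0.10, "serious": 0.40},
    "white": {"minor": 0.05, "serious": 0.20},
}

# Assumed composition of recorded stops under biased policing:
# Black individuals are stopped mostly for minor matters, whites mostly for serious ones.
stop_mix = {
    "black": {"minor": 0.8, "serious": 0.2},
    "white": {"minor": 0.2, "serious": 0.8},
}

# Comparison read straight off the records, ignoring how the stops were selected.
naive_gap = (recorded_force_rate(stop_mix["black"], force_prob["black"])
             - recorded_force_rate(stop_mix["white"], force_prob["white"]))

# Comparison that holds the mix of stops fixed (here, at the white mix).
same_mix_gap = (recorded_force_rate(stop_mix["white"], force_prob["black"])
                - recorded_force_rate(stop_mix["white"], force_prob["white"]))

print(f"naive gap in recorded data: {naive_gap:+.2f}")
print(f"gap holding the mix of stops fixed: {same_mix_gap:+.2f}")
```

With these assumed numbers the naive gap is about -0.01 while the fixed-mix gap is about +0.17: the records alone suggest no disparity even though, by construction, force is twice as likely for Black individuals in every kind of stop. Other assumed numbers can make the two gaps coincide, which is in the spirit of the opposing camp's point. Nothing in the records says which set of numbers describes the world.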

The dispute, then, came down to the oldest problem in the book of statistical inference: whether any estimand in an inference exercise can be identified depends crucially on whether some set of assumptions about the data can be made. And even in today’s quantitative social science, despite its bigger and bigger datasets combined with more and more sophisticated methods measuring all sorts of estimands that we are told correspond to interesting facts about the social world, neither data nor statistical methodology can organize themselves into an account of what was going on in the world to produce these data and statistics. That burden of judgment, the analyst must bear prior to any cranking of the statistical machinery.

If social theory prefigures empirical analysis, then one might think that any fault in otherwise valid statistical inference methods must be laid at the feet of these starting theories of the social world, not the quantitative methods as such. Such a division of labor, however, is naïve; in practice, the statistician faces professional pressures to start from a picture of the social world as neat and readymade for her analysis. After all, progress in statistics is made possible only by the heavy hand of assumption: the more, the better for “causal” inference. In a standard exercise of inference, there are statistical assumptions that defy empirical referent—say, that errors are drawn independently from a Gaussian distribution. There are statistical assumptions whose empirical status is, at best, pending—non-interference, i.e., that potential police officer use of force on a stopped individual is independent of the racial composition of other police stops. And there are statistical assumptions that in all likelihood are empirically false—ignorability, i.e., that whether a police encounter becomes violent is independent of race of the stopped individual once we take account of all the information written up in police records. Yet despite their uneasy footing in the realm of empirical plausibility, these assumptions are thought to present no real issue for statistics. They remain the lifeblood of causal inference methodology. Indeed, facing the question head on, statistician and noted stats blogger Andrew Gelman recently endorsed exactly this view.
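For readers who want the three assumptions just named in symbols, one common way of writing them in potential-outcomes notation is sketched below. The notation is an assumption made here for illustration and does not come from the studies under discussion: Y_i(r) stands for the potential use of force in stop i were the stopped individual's race r, R_i for recorded race, X_i for the information in the police record, and ε_i for the model's error term.

```latex
% Notation assumed for illustration only (requires amsmath).
\begin{align*}
\text{Gaussian errors:} \quad
  & \varepsilon_i \overset{\text{iid}}{\sim} \mathcal{N}(0, \sigma^2) \\
\text{Non-interference:} \quad
  & Y_i(r_1, \dots, r_n) = Y_i(r_i) \\
\text{Ignorability:} \quad
  & Y_i(r) \perp\!\!\!\perp R_i \mid X_i \quad \text{for each } r
\end{align*}
```

Written this way, each line is visibly a claim about how the world generates the data, not something the data can certify on their own.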

So, if statistics is already in the business of making assumptions about data generation, the distribution of errors, and the like—which, after all, are assumptions about how the elements of a statistical model relate to each other and to the elements of a social system—what is one more assumption in the way of how policing might interact with race? What is one more assumption that race figures only in a decision to stop an individual and not any further in whether that police encounter becomes violent? What is one more assumption that policing decisions due to suspicion can be disentangled from decisions based on race? Assumptions are assumptions are just assumptions!

The statistical skirmish among the social scientists this past summer thrusts into open view the strange place of the theoretical and the normative within empirical methods. Whether the causal effect of race on police use of force can be identified statistically using administrative police records hangs on assumptions that the analyst is willing to put forth about how policing works and how race works in our society—questions, as I have just argued, that must be decided prior to the use of inference methods to “detect discrimination.”

In particular, the credence one puts in the whole exercise depends on one’s views about the extent to which assumptions about the racial character of policing might be substantive assumptions about how the raced world works, and thus importantly differ from the standard mere statistical assumptions needed to get inference off the ground. Even among those who accept that convenient statistical assumptions fail to hold in most observational studies of race, exactly how race is theorized to work in violation of these assumptions dictates what subsequent statistical moves and inferential capacities are available to the working quant. Does it call into question police reports of “furtive movements” and “sense of threat” as supposedly “non-raced” assessments of risk? Does the nature of the category of race challenge run-of-the-mill approaches to statistically “adjust for” class? Does racial profiling make police encounters with Black individuals on average more or less dangerous than white encounters? These questions show that whether causal inference business as usual can measure a causal effect of race on police use of force depends on one’s prior views about what else but the role of race in policing and the social world more broadly. When causal effects of race are plugged in as evidence of racial discrimination, we have a formula that might make the value-free statistician blush: whether policing is racially discriminatory depends on one’s prior views about which differences across racial groups are the “relevant” differences that do and do not “justify” differential police treatment.

If normative thinking about race sets the statistical assumptions from which inference proceeds, then whether any particular move within the causal inference methodology is apt will depend in part on what one posits to be true about how the system of race (and gender and class, etc.) works and creates difference in a raced (and gendered and classed, etc.) society. This presents a basic worry for how the law presently interprets causal effects of race in the evidentiary stage of anti-discrimination trials.

To get at the causal effect of race on sentencing, says one expert, we need to stratify the data based on past parole violation because those who do and do not violate parole constitute distinct classes of individuals—classes independent of race for which causal effects of race must be measured separately.

No, that’s not right, says another: past parole violation is a variable downstream of race, and so conditioning on it induces post-treatment bias (no need to sweat the technical details here). Shrouded in tables, figures, “preferred model” specifications, double asterisks indicating statistical significance, the whole nine yards of econometric analysis, statistical disagreement about method just is, in many such cases, substantive disagreement about race. It is hard to see how expert analyses so thoroughly imbued with normative thinking about race can be seen to present cold hard expert social scientific evidence on some “true” effect of race.
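For readers who do want the technical details, the second expert's concern can be sketched with exact arithmetic on a toy model. Everything below is hypothetical: the unobserved factor `u`, the probabilities in `p_violation`, and the sentence lengths in `expected_sentence` are assumptions made up for illustration, and the model builds in the second expert's premise that a recorded parole violation depends on race.

```python
# Toy model with made-up numbers; it illustrates the mechanism, not any real case.
# race: 0 = white, 1 = Black. u: an unobserved factor (0 or 1) that, by assumption,
# is independent of race and raises both recorded violations and sentences.

P_U = {0: 0.5, 1: 0.5}

def p_violation(race, u):
    """Assumed chance of a recorded parole violation; it depends on race."""
    return 0.1 + 0.3 * race + 0.4 * u

def expected_sentence(race, u):
    """Assumed expected sentence in months; the built-in effect of race is 2."""
    return 10 + 2 * race + 6 * u

def mean_sentence(race, violation=None):
    """E[sentence | race], or E[sentence | race, violation], by summing over u."""
    weighted_sum = 0.0
    total_weight = 0.0
    for u, p_u in P_U.items():
        if violation is None:
            weight = p_u
        else:
            p_v = p_violation(race, u)
            weight = p_u * (p_v if violation == 1 else 1 - p_v)
        weighted_sum += weight * expected_sentence(race, u)
        total_weight += weight
    return weighted_sum / total_weight

unstratified = mean_sentence(1) - mean_sentence(0)
within_violators = mean_sentence(1, 1) - mean_sentence(0, 1)
within_non_violators = mean_sentence(1, 0) - mean_sentence(0, 0)

print(f"unstratified gap: {unstratified:.2f} months")
print(f"gap among recorded violators: {within_violators:.2f} months")
print(f"gap among non-violators: {within_non_violators:.2f} months")
```

Under these assumptions the unstratified comparison recovers the built-in effect of 2 months, while the gap within each stratum comes out smaller (1 month among recorded violators). Stratifying on a variable that race itself influences changes who is being compared. If instead `p_violation` did not depend on race, as the first expert's picture implies, stratifying would do no harm in this toy model. Which function describes the world is exactly the substantive question in dispute.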

All this might seem to weigh in favor of the thought that it is, as they say, “political” all the way down, even in our most advanced empirically-inclined statistical practice. But setting aside here that particular line—if only because it turns in part on what one takes to be “political” vs. not, a can of worms I do not wish to open here—I do not think we should cede what I take to be our upper hand on matters of empirics.

This past summer, protests initially spurred by the killing of George Floyd, just one in an unremitting string of police murders of Black individuals in the country, grew into a broader political rebellion against the undeniably racial character of state-sanctioned violence and murder within a massive, highly punitive criminal justice system (and, indeed, of the racial character of many of our other institutions). These were acts of a radical morality from below that attacked not only a dominant understanding of the justice of these arms of the state. Their insistence was, at the same time, also plainly an empirical one about how cops, courts, and corrections (and much else besides) work in the 21st-century United States, and, as a result, how a basic and customary violence is knit into encounters between Blacks and these instruments of the state. And the success of these uprisings depends in part on how well these “data” are ultimately taken up as disclosing matters of empirical “fact” by those who, not being daily reminded of their racialization, do not readily perceive its workings.

Then, if the battle over empirics is part of the political battle, and if the information and reasoning that marginalized persons put forth in social movements constitute, as I believe they do, a superior form of empirical evidence, then there is no reason to shy away from battles in statistical inference. Even if we think such methods as now practiced are undeserving of their current normative stature, their authority remains ours to gain.

What is more, if, as I’ve argued, the substantive issues at stake in methodological debates in statistical inference just are the ones we engage in when we do social critique, then there is no politics that can sidestep these matters. What kinds of empirical considerations should form the background assumptions from which we conduct inquiry about the social world? (How) should social scientific practice be adapted to respond to differential social access to knowledge on matters of racial justice? How should we conceptualize notions such as race and causation, so they are well-suited to our various theoretical and practical ventures? Reigning scientific practice certainly has its answers to this set of questions. But this debate was never just the specialized province of methodologists. These are questions about what theory of the social world and what political, ethical, and empirical commitments should guide us in how and what we think and do—questions that are just as central to political as to scientific practice and theory.

This gets to my final point. Statistical methods can never be sources of normative innovation. Instead, we should think their role to be to fill in a more detailed portrait of the social world, given some substantive (qualitative, interpretive) starting sketch. Both the starting sketch and the final portrait are subject to standards of empirical and political scrutiny. They can give more or less accurate accounts of how the world in fact is, and they can be more or less useful for our political projects. A belief in the basic proposition that the urgency of our political projects stems in large part from the many gross injustices that have in the past and continue still to permeate our society, that these injustices actually exist and are genuine features of our social landscape, is a belief that there is a thread that directly connects our politics to our empirical diagnoses.

Investment in getting statistical analyses right—getting the substantive questions about methodology, our social concepts, and political epistemology right—should be a part of any political struggle rooted in a critical assessment of the world as it stands. Progressive interpretations of anti-discrimination and equal protection are exemplary of legal ideals that are future-looking towards a horizon of egalitarianism but are historically born out of and presently grounded in a critical orientation toward a set of empirical facts about social injustice and legally enshrined inequality. Empirical analyses in these areas should be seen as potential allies—though, of course, not automatic ones—to a progressive legal agenda.