Towards a Legal Understanding of Social Data


Amanda Parsons (@AmandaH_Parsons) is an Associate Professor at the University of Colorado Law School.

Salomé Viljoen (@salome_viljoen_) is Assistant Professor of Law at the University of Michigan Law School.

It is by now a commonplace observation that we live in an age of informational capitalism, an arrangement that combines the familiar system of capitalism as a mode of production with the less familiar phenomenon of informationalism as a logic of accumulation. While law has not been silent in the face of this development in our political economy—in fact, law has in many ways been central to its rise—there is, nevertheless, a widely shared sense that law is poorly equipped to confront the problems it has generated.

In recent work, we explore one way to understand this disconnect: as a conceptual mismatch between, on the one hand, how strategies of accumulation specific to the digital economy operate, and on the other, the basic assumptions regarding accumulation that underlie the legal regimes tasked with regulating those processes. Simply put, our argument runs as follows: commercial entities collect data about people (i.e., social data) to make money in various ways. These strategies make use of data’s predictive capacity (i.e., data’s prediction value). Prediction value is a sub-species of use value, and its translation into exchange value (or a priced, market value) is messy and complex in ways that law has a hard time grasping. This may sound abstract, but we think understanding why and how companies go about producing social data to pursue profits and power is vital for both addressing the harms of informational capitalism and harnessing the potential social benefits of data value.

Social Data & Prediction Value

One of the distinguishing features of informational capitalism is that successful strategies of accumulation adopt (and increasingly, require) the systematic collection and use of social data. Social data, as we will understand it here, can refer to two related categories of data: data that is directly about human activities, such as how we move, communicate, play, learn, work and consume, as well as data that doesn’t directly record human activities but nevertheless can be used to infer and/or predict human behavior, such as weather data that can predict traffic patterns.

Social data assets are used by entities to provide insight into human behavior, guide predictions about behavior, and optimize strategies to intervene in and modify behavior. In other words, the value of social data lies in its capacity to apprehend and predict human behavior, or what we refer to as its prediction value. Prediction value is ‘valuable’ insofar as it grants its holder the capacity to apprehend and influence (i.e., control) future behavior in ways that align with that holder’s objectives.

The pursuit of prediction value isn’t new, of course, but improved chip processing, widespread internet and smartphone access, and better data science and machine learning techniques all contribute to the quality of prediction value that can be made available at low cost. In this environment, companies face general market pressure to accumulate and exploit prediction value to remain competitive. Consider that 9 of the 12 largest companies by market capitalization in 2023 engage in data-intensive business practices. This group includes companies such as Meta, Tesla, NVIDIA, and Alphabet, which are widely known to engage in a high degree of commercial surveillance, as well as less obvious candidates, such as UnitedHealth and Visa.

As plenty of others have argued, these practices can result in excessive commercial surveillance, leading to political polarization, social atomization, and new means of exercising social domination. Digital companies now track everything from users’ hand and eye movements, to women’s menstrual cycles, to immigrants’ locations via gaming apps, to children’s video consumption. This creates myriad opportunities for discrimination, manipulation, exploitation, and social discipline. It is these risks of harm that (understandably) preoccupy much thinking about the digital economy, and the role of commercial surveillance within it.

But the companies that engage in commercial surveillance are not doing so with the primary aim of discriminating or punishing (or not most of them, anyway). They are cultivating social data as a strategy of profit making. Thus, a related problem to those above, and the primary focus here, is that the wealth and power that accrue as prediction value escape traditional legal processes of value regulation and redistribution. The same digital companies that engage in (and/or benefit from) ubiquitous commercial surveillance achieve massive market capitalizations by doing so, without, for example, paying any taxes in the countries in which they are operating.

To gain purchase on this problem, we return to older traditions of economic thinking that took seriously the messy and difficult task of transforming the productive (or use) value of something into its priced, monetary value. To be clear, we have no interest in getting dragged into old fights about—and projects of—systematic and formalized value theory. But to understand the data value-accumulating forces that drive the social disruptions detailed above, and why the pursuit of value in this form slips the traces of legal oversight, it can be helpful to hold apart the cultivation of social data’s use value (i.e., prediction value) and consider what is happening during its imperfect, delayed, messy (and sometimes nonexistent) transformation into a market (or price) value.

How Prediction Value Meets Exchange Value

Companies follow three basic scripts to translate social data value into money, as well as into economic and political power for themselves and their investors.

The first involves immediately and directly converting social data’s prediction value into money. Companies can sell data directly (as data brokers do), but they can also sell access to data value. For example, targeted advertising is a dominant way that companies turn data value into monetary value: they sell advertisers access to superior customer insights at a premium. Platform ad companies like Meta use social data to predict purchasing behaviors and charge advertisers to direct ads towards those consumers most likely to buy their products. Diapers are advertised to new parents, gaming equipment to teenage boys.

The second script involves indirectly converting prediction value into money. Companies use prediction value to improve products and services, create new products and services, and even expand into new industries. For example, lenders can supplement information on a potential borrower’s financial status with knowledge of how frequently they let their smartphone battery die to better predict loan default rates. Streaming platforms can use information on viewers’ streaming behavior to craft original entertainment content that will be particularly salient to those viewers. Or a fitness tracking company could use information on users’ physical activity to enter the health insurance industry at lower risk (and lower cost).

The third script involves using social data and its resulting prediction value to amass power. The ability to predict, as well as control and manipulate, human behavior confers upon companies economic and political power wholly separate from any monetary gains. Prediction value gives the companies that possess it the ability to shape the world in ways most beneficial to them. For example, Uber used location and user data to “grey out” ridesharing services around city halls and for civil servants in municipalities in which it was operating illegally. This practice was key to its strategy to get drivers and riders in these municipalities used to the service, and thus build political constituencies that would resist city governments’ attempts to regulate the company (or discipline it for operating illegally in their jurisdictions in the first place).

All three of these scripts require companies to amass large volumes of social data. As a result, many of the business practices that dominate informational capitalism focus on growth, often at the expense of current profits. For example, Amazon’s Prime program lost the company billions of dollars per year, but allowed it to build up large networks of users from whom it can glean social data. Google has built up an entire ecosystem of products and services (an email service, a web browser, a mobile phone provider, among many others) to maximize the amount of social data that the company is able to amass. It has also used aggressive merger and acquisition strategies, such as its acquisition of Fitbit, to gain access to data on millions of users beyond its own purview. Each of these strategies can be seen as a sign of the centrality of prediction value in our economy.

What the Law Doesn’t See

Understanding the business practices that spin social data hay into market capitalization gold is itself a useful exercise in taxonomizing the digital economy. But the fact that prediction value does not always reduce quickly or neatly (or in some cases, at all) into exchange value becomes a legal problem insofar as relevant and important areas of law do not register, apprehend, or consider significant forms of value production that predate and/or do not convert into exchange value. Legal fields affected by this conceptual misapprehension include both regimes, like tax and antitrust, that have historically been tasked with regulating value creation, and regimes, like data privacy law, that are newly tasked with regulating value creation because of the primary role they play in affecting how social data is collected and used.

To effectively regulate informational capitalism, legal actors in these regimes must understand how companies use social data and its predictive value to create both wealth and power for themselves and their investors. In the case of tax law, for example, understanding the role of social data as a driving force in economic accumulation lends more force to arguments made by proponents of digital services taxes that digital companies should be paying taxes in their users’ home countries. Additionally, recognizing that the business models of informational capitalism often focus on growth over income supports a renewed focus in tax law on how to effectively and equitably tax the wealth being built in the digital economy, rather than merely taxing income.

In the case of data privacy law, appreciating the political economic causes of privacy erosion begins with owning up to the primary role privacy law plays in regulating social data value production. Existing laws are focused on the privacy harm done to individuals from improper data collection or use; they are not designed with the task of value regulation in mind. For instance, certain (opportunistic) privacy arguments may foreclose beneficial applications of prediction value (for example, independent research to hold companies and regulators accountable), even as they routinely fail to curb privacy’s worst offenders. Recognizing data privacy’s role in regulating the conditions of social data value grounds data privacy’s turn from a focus on public overreach to commercial surveillance, and to embracing the need for structural reforms to secure privacy for everyone.

As these brief examples indicate, existing law fundamentally fails to grasp one of the central features of the prevailing economic system. Conceptual clarity around the role of social data and prediction value is necessary if law is to meet the challenge of addressing the harms of informational capitalism, preventing legal arbitrage by powerful interests, and, importantly, harnessing the potential social benefits of social data and its prediction value.