As markets began to usurp other forms of social regulation throughout the 20th century, metrics became increasingly central to the coordination of new spheres of market-mediated relations. More recently, digital metrics have been operationalized to facilitate the platformization of those domains. Platforms use automated scoring systems to rank content and actors across the markets they mediate. Search engines, e-commerce sites, and social media feeds all have ways to rank material and deliver it to users according to their calculation of “relevance.” This has been described as platforms’ gatekeeper power, and the way it is used to control the visibility of material across platforms continues to challenge legacy forms of regulatory intervention.
This post explores metrics and gatekeeper power through the Google Scholar platform and its intermediation of the “scholarly economy”—the domain in which research is produced, consumed, bought and sold. Google Scholar has become an important piece of academic infrastructure. Not only is it used to search for academic publications, but its bibliometric and “scholar profile” systems have become critical in evaluating scholars and scholarship. Unlike search and bibliometrics systems provided by legacy academic publishers, Google Scholar’s services are free and—leveraging Google’s massive investment in user experience, cloud infrastructures, and search algorithms—are generally more usable than other academic tools and repositories. The result is a growing centrality of Google Scholar in search and bibliometric evaluation. The engine of the Google Scholar ecosystem is the metric of citation counts.
As Google Scholar continues to intermediate academic life, interposing its opaque and trade-secret-protected systems of scholarly evaluation, it is displacing a key set of contextual norms around academic autonomy, including transparency and the academy’s ability to understand how it evaluates itself. This has led to calls for greater accountability, including through law. Yet despite transparency rights in data protection laws, FRAND-type (Fair, Reasonable and Non-Discriminatory) rules in digital market regulation (designed precisely to apply to gatekeepers), emerging prohibitions on automated scoring, and structural and behavioral antitrust enforcement, law still lacks a meaningful language for this accountability deficit and for the significance of metrics to platformization more broadly.
Bibliometrics, Scholarly Autonomy, and Google Scholar’s Gatekeeper Power
Bibliometrics—the counting of citations and other indicators about academic work—has been big business for some time. And Google Scholar is not the only player in scholarly platforms and bibliometrics; indeed, many of the issues described below apply across the bibliometrics and academic analytics ecosystem. Several legacy academic publishers have likewise sought to centralize and commercialize their bibliometrics as critical infrastructure for academic evaluation. HeinOnline, for instance, has made substantial investments in its citation-counting system in order to feed its analytics directly into the U.S. News law school ranking service. By serving as the data and analytics channel for an important law school ranking system, HeinOnline is able to capture tremendous value from its repository while simultaneously cementing its centrality.
Those legacy publishers, however, typically offer bibliometrics and analytics as paid services, with a higher degree of transparency, customer service, and policing of metrics for quality and manipulation. Academic publishers typically build bibliometrics systems from repositories of intellectual property they control, whereas Google, which is not a publisher, uses its free and highly usable search, citation-counting, and scholar-profiling systems to intermediate the entire scholarly information system and control the flow of data between research producers, research managers, and research repositories. In doing so, it has been able to build a metric that opaquely controls the visibility of research across the platform—its own system of citation counting.
Citation counts and h-index scores on scholar profiles are offered as one evaluative instrument on Google Scholar, but citation counts are also a primary determinant of search ranking. That is, query parsing being equal, documents with higher citation counts rank higher in search, making them more likely to be accessed and cited again by researchers. At the same time, it is unclear which documents are included in Google Scholar’s scholarly index, how citations are extracted from documents is opaque, and users’ capacity to define search “relevance” through their own parameters is highly constrained. Unlike academic publishers, Google Scholar will not correct errors or manipulations of its bibliometric system, premising its refusal on the fact that it is free, non-commercial, and a gift to the academic community.
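To make this dynamic concrete, here is a minimal sketch of a citation-weighted ranking function alongside the standard h-index calculation. It is purely illustrative: Google Scholar’s actual ranking function is trade-secret-protected, so the names and the logarithmic citation boost below are assumptions, not descriptions of its system.

```python
import math
from dataclasses import dataclass

@dataclass
class Document:
    title: str
    text_relevance: float  # hypothetical query-match score in [0, 1]
    citations: int

def rank(docs: list[Document]) -> list[Document]:
    """Order results so that, text relevance being equal, more-cited
    documents rise to the top (an assumed logarithmic citation boost)."""
    return sorted(
        docs,
        key=lambda d: d.text_relevance + math.log1p(d.citations),
        reverse=True,
    )

def h_index(citation_counts: list[int]) -> int:
    """Standard h-index: the largest h such that h of a scholar's
    papers have at least h citations each."""
    counts = sorted(citation_counts, reverse=True)
    return sum(1 for i, c in enumerate(counts, start=1) if c >= i)

# Two documents matching a query equally well: the more-cited one wins.
results = rank([
    Document("A", text_relevance=0.9, citations=12),
    Document("B", text_relevance=0.9, citations=450),
])
print([d.title for d in results])    # ['B', 'A']
print(h_index([450, 12, 12, 3, 1]))  # 3
```

Even in this toy version the feedback loop is visible: higher counts produce higher rank, which produces more readers and, over time, more citations.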
On one hand, the academy should be skeptical of the “gift” claim. In order to curate their own profile, Google Scholar users must agree to ordinary Google terms of service enabling Google Scholar data to be integrated with Google’s online advertising apparatus. Indeed, Marion Fourcade and Daniel Kluttz have demonstrated how platforms deploy gift claims as part of broader strategies for accumulation. But beyond direct commercialization, the absence of accountability and transparency also opens space for alternative agendas in the evaluative and gatekeeping process.
In the Google Scholar example, these interests are difficult to identify or verify, and ascribing direct political motivation would be speculative. But all of the system’s design choices have consequences. For instance, the software that extracts citations from documents, and the way those citations are displayed on the platform, privilege certain academic actors and fields over others. There is no adjustment for the number of authors (which benefits STEM disciplines and computer science), nor any indication of whether citations come from peer-reviewed sources (which benefits disciplines that use unmoderated repositories). Authors who publish semi-scholarly works in newspapers also benefit when those articles are referenced in other scholarly or semi-scholarly work. These choices about how the unit of value is construed make certain domains of knowledge appear more prominent. For instance, on most bibliometric services the discipline with the highest citation counts is pharmacology; on Google Scholar it is computer science. Google Scholar thus participates in a broader paradigm shift in which computer science appears as the discipline around which all knowledge is organized.
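A similarly stylized sketch shows how much turns on the counting convention alone. The figures below are invented for illustration; the comparison is between Google Scholar-style raw counts and the fractional (1/n per co-author) counting that some bibliometricians prefer.

```python
# Invented figures, for illustration only.
papers = [
    # (field, citations, number_of_authors)
    ("computer science", 900, 9),
    ("pharmacology",     800, 4),
    ("law",              300, 1),
]

def raw_count(citations: int, n_authors: int) -> float:
    return float(citations)       # no adjustment for co-authorship

def fractional_count(citations: int, n_authors: int) -> float:
    return citations / n_authors  # each co-author credited 1/n

for scheme in (raw_count, fractional_count):
    ranking = sorted(papers, key=lambda p: scheme(p[1], p[2]), reverse=True)
    print(scheme.__name__, [field for field, _, _ in ranking])
# raw_count        ['computer science', 'pharmacology', 'law']
# fractional_count ['law', 'pharmacology', 'computer science']
```

The same documents, counted under two defensible conventions, yield opposite pictures of which field dominates.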
This lack of transparency and accountability is not a problem for Google Scholar alone. While we can know, for instance, what materials are included in the Web of Science Index, there is little transparency regarding how journals are selected for inclusion in the first place. Platforms like SSRN exercise profoundly unaccountable control over what is included in their index, with unclear standards for what counts as “scholarly” or of scholarly merit and with opaque mechanisms for policing their own metrics. But Google Scholar, more than other bibliometric services, amplifies these problems. As Onora O’Neill describes, old intermediaries are replaced with new intermediaries “whose contributions are harder to grasp, and who are not and cannot be disciplined by the measures used to discipline the old intermediaries.”
The Challenge of Regulating Metrics
Within the scholarly economy, metrics represent a unit of symbolic capital that works as a proxy for recognition and quality. While the appropriate uses of citation counting are contested, with bibliometricians scorning its naïve use to evaluate research and researchers, its utility for research managers and funders (i.e., buyers in the scholarly economy) is not dissimilar to that of other personal rating systems, such as financial credit ratings. But the law has developed only a limited range of conceptual tools for addressing metrics and ratings as economic instruments.
James Grimmelmann has provided a comprehensive legal analysis of ratings through their treatment in copyright jurisprudence. Looking at the corpus of case law, Grimmelmann identifies three juridical approaches to understanding ratings: as statements of fact that are either true or false; as creative opinions that exist independently of the world; and as self-fulfilling prophecies that remake the world in their own image. “The telos of a rating-as-fact is truth; the telos of a rating-as-opinion is authenticity; the telos of a rating-as-prophecy is power.” He notes that if ratings are statements of fact, then a knowingly incorrect rating is a lie. If ratings are statements of opinion, there is no way to prove them false, and hence no basis for liability. If ratings are self-fulfilling prophecies, then “true or false” is entirely the wrong question to ask. This conceptual taxonomy can be applied beyond copyright law, however, and offers a useful lens for understanding what different legal treatments of metrics might achieve.
For instance, consider regulating scholarly metrics for accuracy. Google Scholar citation counts are factual—they do refer to the existence of citations somewhere within Google Scholar’s scholarly index. But citations should not be understood as a “realist” form of measurement: citation counting is not a mechanical derivation of reality—counts are the output of a deeply complex socio-technical process. Further, evaluating the truth of metrics would require some transparency into how those metrics were produced. Indeed, in specific domains such as credit ratings, the primary regulatory instrument addressing ratings is transparency. When it comes to accuracy, transparency is the first step towards contestation.
Data protection laws do offer some transparency under the penumbra of “data subject rights,” enabling a data subject to understand what is known about them and to ensure that the data is accurate. Some regimes, like the EU General Data Protection Regulation (GDPR), also notionally afford transparency into the logic of automated data processing in certain situations. These rules aim to ensure the quality of the information processed, to enhance the legitimacy of its processing, and to make decisions more fair and reasonable.
But these transparency regimes appear unsuited to a system like Google Scholar, because individual transparency affords no structural information about how the system works to coordinate the field. Rather, it merely enables individual scholars to better manage their own universe of ratings through their own labor. While potentially increasing metric accuracy, these rules would not render Google Scholar citation counts transparent in any way that affords insight into, or a means of challenging, its gatekeeper power.
Scholarly metrics might also be addressed in law as opinions that must be “authentic.” While citation counts are technically measurements rather than opinions, the way they are operationalized into rankings represents decisions that might be censured for inappropriately reflecting the interests of the platform over those of market participants. There has been U.S. litigation addressing the truth of academic metrics and the ways they are produced to benefit the economic interests of particular players. The 1994 case of Gordon & Breach v. American Institute of Physics and American Physical Society concerned the creation by an academic scholarly society of a metric (citation counts per 1000 characters of journal text). The metric indicated that a commercial academic publisher’s journals were not cost effective, while the scholarly society’s own journals were highly cost effective. The commercial publisher argued that because the metric was intended to influence libraries’ purchasing of scholarly periodicals, it ought to be regulated as a form of misleading commercial speech.
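On the description above, the disputed metric and its use can be stylized as follows (the symbols are mine, not the court’s or the society’s):

\[
\text{impact}_j = \frac{c_j}{\ell_j / 1000}, \qquad
\text{cost-effectiveness}_j = \frac{\text{impact}_j}{p_j},
\]

where \(c_j\) is the number of citations to journal \(j\), \(\ell_j\) its length in characters, and \(p_j\) its subscription price: a journal can publish heavily cited work and still score poorly if it charges enough for it.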
Ultimately the court was wary of threatening academic inquiry, and it found these metrics to be a constitutionally protected form of academic speech because they were produced by a scholarly society and had been subject to peer review. The treatment of commercially motivated academic metrics posed a dilemma, which the court appeared to resolve by noting that the metrics were produced in line with the contextual norms of the academy rather than those of commerce.
Whether Google Scholar metrics would, alternatively, be treated as commercial speech is unclear. For instance, they do not necessarily propose a commercial transaction as required by the Lanham Act. Rather, they are intended to bring users into a privately controlled information ecosystem, the commercial machinations of which are complex. It is therefore difficult to analogize Google’s citation counting system to any other form of economic scoring or rating, especially in a way that addresses the problems of opaque commercial intermediation of the academic field.
Another mechanism for treating Google Scholar metrics as “opinions,” or for ensuring their authenticity, might be FRAND (fair, reasonable, and non-discriminatory) conditions of scholarly intermediation. Such rules apply to search engines, ranking systems, and app stores under the EU P2B Regulation and the proposed Digital Markets Act, to prevent gatekeepers from intermediating in their own commercial interests by privileging their own services. But there is nothing inherently unfair or discriminatory in the way the Google Scholar search and scoring ecosystem works. Search engines must discriminate one way or another, and if a platform does not self-preference or require payment for additional services to maintain rankings, ideas like fairness or non-discrimination cannot challenge the broader politics inherent in its system’s design. Put another way, they cannot challenge the way a platform’s business model displaces the norms of the academic context.
The recently leaked draft EU AI Regulation would be similarly ineffective here: it would prohibit “algorithmic social scoring” only where scoring is not carried out for a specific legitimate purpose of evaluation and classification, and where it generates detrimental treatment.
Grimmelmann’s final conceptualization of metrics is as a form of power. In this conceptualization, as Grimmelmann notes, truth or falsity is entirely the wrong question to ask. But what is the right question? How can law attend to the forms of power that these metrics generate? Doing so requires understanding the ways in which the contextual norms of commercial platforms undermine the contextual norms of scholarly work. It means accepting that, despite its convenience and usability, this form of platformization risks collapsing scholarly contexts, both between disciplines and between academia and industry. If ratings-as-power requires thinking through metrics as self-fulfilling prophecies, Google Scholar must be thought of as an instrument for reshaping the organization of universities along platformed lines—for remaking the university in the platform’s own image.
The academy must take responsibility for these consequences. Safeguarding academic norms requires some mechanism for ensuring academic oversight and governance of platform tools, akin to the ways that academic peer review (or alternative forms of academic evaluation like collaborative editing) and editorship of commercial journals keep certain forms of industrial influence out (while, of course, letting other very pernicious forms in). Requiring that the academy itself set the gatekeeping agenda of a commercially produced academic platform may, however, be beyond what law can achieve. Reconciling commercially motivated intermediation of the scholarly economy with scholarly norms may simply be impossible. In that case, the relevant legal idea is not laws that regulate markets, but laws that attend to organizations—and finding a way to use them to govern a collectively controlled automated information infrastructure built for and by the academy itself.