VALUING SOCIAL DATA

Social data production—accumulating, processing, and using large volumes of data about people—is a unique form of value creation that characterizes the digital economy. Social data production also presents critical challenges for the legal regimes that encounter it. This Article provides scholars and policymakers with the tools to comprehend this new form of value creation through two descriptive contributions. First, it presents a theoretical account of social data as a value form cultivated and exploited for two distinct (albeit related) forms of value: prediction value and exchange value. Second, it creates and defends a taxonomy of three “scripts” that companies follow to build up and leverage prediction value and explains their normative and legal ramifications.

Through the examples of tax and data privacy law, the Article applies these descriptive contributions to demonstrate how legal regimes fail to regulate social data value creation effectively. Tax law shows how regimes historically tasked with regulating value creation struggle to apprehend this new form of value; data privacy law shows how regimes that have historically governed social data struggle to regulate data’s role in value creation.

The Article argues that analyzing data’s prediction value separately from its exchange value clarifies the challenges the law faces in governing social data production and its surrounding political economy. This improved understanding will equip legal scholars to better confront the harms of law’s failures in the digital economy, reduce legal arbitrage by powerful actors, and facilitate opportunities to maximize the beneficial potential of social data value.

Introduction

Anna is sixteen weeks pregnant. 1 This example is adapted from various presentations made by Salomé Viljoen in 2022 and 2023, including at the 2022 Annual Meeting of the American Society of International Law, Seventh Annual Detlev Vagts Roundtable. Salomé Viljoen, Remarks by Salomé Viljoen, 116 Proc. Am. Soc’y Int’l L. Ann. Meeting 122, 122–25 (2022). Anna uses Pineapple, a popular fertility and pregnancy tracking and social media app. In the app, Anna inputs information about her ovulation cycle, pregnancy symptoms, sleep patterns, eating habits, exercise, and moods. Anna also consumes Pineapple’s content on pregnancy and fetal development and engages with other Pineapple users in their forum, swapping questions on pregnancy and preparing for a new baby.

Pineapple collects, aggregates, and synthesizes data that Anna and other users share. This data includes not only the information that Anna inputs in the app but also data about the content she consumes and her interactions with other users. Pineapple does not sell this data directly—in fact, its privacy policy explicitly states that it will never sell or license individual user data. But Pineapple does sell insights about its user base as a whole to clients like advertisers, employment agencies, and consumer credit agencies. This is how Pineapple makes its money—the app is free to Anna. Pineapple’s clients then combine the data they receive from Pineapple with data from other companies to build out a more complete picture of the behavior of pregnant people. This could include data on TV viewing patterns from video streaming platforms, movement and sleep patterns from wearable fitness devices, or online purchasing behaviors. 2 While this is a hypothetical example, it draws on real uses of social data. See, e.g., Hooman Mohajeri Moghaddam, Gunes Acar, Ben Burgess, Arunesh Mathur, Danny Yuxing Huang, Nick Feamster, Edward W. Felten, Prateek Mittal & Arvind Narayanan, Watching You Watch: The Tracking Ecosystem of Over-the-Top TV Streaming Devices, 2019 Proc. Ass’n for Computing Mach. Conf. on Comput. & Commc’ns Sec. 131, 142 (studying two thousand “over-the-top” streaming channels, finding widespread user tracking and data collection with little recourse for consumers to disable tracking through countermeasures); Tong Yan, Yachao Lu & Nan Zhang, Privacy Disclosure From Wearable Devices, 2015 Proc. Ass’n for Computing Mach. Workshop on Priv.-Aware Mobile Computing 13, 18 (studying how aggregated data from fitness trackers can be used to infer a user’s behavioral patterns, such as when they will go grocery shopping, get coffee, or work out); Jonah Engel Bromwich & Jessica Testa, They See You When You’re Shopping, N.Y. Times (Nov. 26, 2019), https://www.nytimes.com/2019/11/26/style/powerfront-software-ecommerce-cartoons.html (on file with the Columbia Law Review) (describing how e-commerce customer service representatives can now visualize emotional profiles of customers visiting their sites or using support chats); Aljoscha Dietrich, Kurunandan Jain, Georg Gutjahr, Bianca Steffes & Christoph Sorge, I Recognize You by Your Steps: Privacy Impact of Pedometer Data, Computs. & Sec., Jan. 2023, no. 102994, at 1, 7 (demonstrating how fifteen minutes of pedometer data alone—data collected on nearly every smartphone—can identify users).

Becca has never used Pineapple. But she does watch streaming services, own a wearable fitness device, and shop online. And Becca’s behavioral patterns on these platforms have shifted in ways similar to those of Anna and other Pineapple users. Pineapple’s clients can therefore infer that Becca, too, is likely pregnant and treat her accordingly.
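To make the mechanism concrete, the following is a minimal, hypothetical sketch (in Python, using scikit-learn) of the kind of cross-population inference described above: a classifier is trained on behavioral features and self-reported pregnancy status from app users, then applied to the same kinds of behavioral signals observed about people who never used the app. Every feature, number, and name in the sketch is invented for illustration; it is not drawn from the Article, from Pineapple (itself a hypothetical), or from any real firm’s practice, and it stands in for a wide range of possible modeling techniques.

```python
# A minimal, hypothetical sketch of cross-population inference: a model fit on
# app users' data is used to score non-users (like Becca) whose pregnancy
# status was never collected. All features and numbers are invented.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def simulate_behavior(n, pregnant_share):
    """Generate toy cross-platform behavioral features and pregnancy labels."""
    pregnant = rng.random(n) < pregnant_share
    streaming_hours = rng.normal(np.where(pregnant, 18.0, 12.0), 3.0, size=n)
    nightly_sleep = rng.normal(np.where(pregnant, 6.5, 7.5), 0.8, size=n)
    baby_purchases = rng.poisson(np.where(pregnant, 4, 1), size=n)
    features = np.column_stack([streaming_hours, nightly_sleep, baby_purchases])
    return features, pregnant.astype(int)

# "App users": behavioral data plus self-reported pregnancy status.
X_users, y_users = simulate_behavior(n=5_000, pregnant_share=0.3)
model = LogisticRegression(max_iter=1_000).fit(X_users, y_users)

# "Non-users": the app never collected their data, but the same behavioral
# signals are observable elsewhere, so the model can still score them.
X_non_users, _ = simulate_behavior(n=3, pregnant_share=0.5)
for i, p in enumerate(model.predict_proba(X_non_users)[:, 1]):
    print(f"non-user {i}: estimated probability of pregnancy = {p:.2f}")
```

The structural point matters more than the particular technique: the training data comes entirely from one group of people, yet the model’s outputs drive inferences about, and decisions regarding, another.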

Why do companies care about Anna’s and Becca’s pregnancy status? Because early pregnancy data is incredibly valuable. Pregnancy signals that a consumer is about to undergo a significant change in their daily habits and their buying activities; the birth of a child is a time when someone’s buying habits, brand loyalties, and daily routines are in flux. Getting to such consumers early is a valuable opportunity to shape their future purchasing behaviors. Diaper companies can advertise to Becca or Anna before competitors and get them locked into their brand. Grocery stores, online subscription services, car manufacturers, and others can also reach out, offering deals favorable to new parents that entice them to switch entrenched behaviors and brand loyalties. Aggregate pregnancy data also provides an opportunity to understand the nature of consumer change more generally—how and why do consumption patterns change? When are such changes most robust, and why? How can you predict (and modify) those behavioral changes?

Data about Anna’s and Becca’s pregnancies is what this Article calls social data. 3 Part I provides a more detailed definition of this concept. Social data refers jointly to two interrelated types of data about people. The first is data that directly materializes and stores traces of human activity. 4 This first category is adapted from Julie E. Cohen’s concept of the “data refinery”—the data-processing practice of “refin[ing] and massag[ing] flows of personal data to produce virtual representations . . . optimized for modulating human behavior systematically”—as a “centrally important means of economic production.” Julie E. Cohen, Between Truth and Power: The Legal Constructions of Informational Capitalism 66–68 (2019) [hereinafter Cohen, Between Truth and Power]. This includes, for example, information on Anna’s or Becca’s TV viewing patterns, ovulation, or movement. This type of social data is directly collected from data subjects, like the data Pineapple collects and uses about Anna. The second is data that is used to apprehend, infer, or predict human activity. 5 Part I describes the difference between the two below. But for now, it is important to note that data used to infer or predict human activity need not always derive from data directly about human activity. For further discussion of the significance of the second category, see Alicia Solow-Niederman, Information Privacy and the Inference Economy, 117 Nw. U. L. Rev. 357, 400–03 (2022). For example, Pineapple collects data about Anna (and other users) to aggregate and analyze for insights about pregnant people as a group, which it sells to third parties. Those third parties may use this data in turn to gain insight about, and drive decisions regarding, Becca. Thus, data about Anna and her pregnancy is also data about Becca’s pregnancy, even though it was not directly collected from Becca. 6 See Salomé Viljoen, A Relational Theory of Data Governance, 131 Yale L.J. 573, 606–08 (2021) [hereinafter Viljoen, Relational Theory] (describing a hypothetical scenario in which a tattoo AI company creates an algorithm based on the dataset of the social media company it acquires that allows data collected from one person to be used to infer information about a second person). Indeed, data that can be used to infer or predict human activity need not be collected from people at all—for example, data about weather can be used to predict and infer commuting behavior. In contrast with the more commonplace term “personal data,” “social data” nicely expresses the view (and a central focus of this Article) that data is socially useful and economically valuable—not only for what it can tell the world about any one person, like Anna, but also, and especially, for what it can tell the world about people. 7 Id. at 609–11. For examples of the ubiquity of the term “personal data” in common references to data privacy, see, e.g., Colleen McClain, Michelle Faverio, Monica Anderson & Eugenie Park, Pew Rsch. Ctr., How Americans View Data Privacy 12, 18 (2023), https://www.pewresearch.org/internet/2023/10/18/views-of-data-privacy-risks-personal-data-and-digital-privacy-laws/ [https://perma.cc/M82M-BATC] (using the term “personal data” frequently to refer to aggregated, predictive insights); Hossein Rahnama & Alex “Sandy” Pentland, The New Rules of Data Privacy, Harv. Bus. Rev. (Feb. 
25, 2022) https://hbr.org/2022/02/the-new-rules-of-data-privacy (on file with the Columbia Law Review) (describing “personal data” as individual data points aggregated to generate broader consumer insights).

The value of Anna’s and Becca’s social data is what this Article refers to as prediction value. 8 Part I provides a more detailed definition of this concept. Prediction value is a particular form of use value that lies in social data’s capacity to infer or predict things about people—in this case, pregnancy status—and to act on that knowledge. For example, a firm with access to Becca’s social data may send Becca a diaper coupon or free prenatal vitamins, a hospital where Anna will give birth may use it to inform labor and delivery staffing plans, or an employer’s hiring algorithm may flag Becca as a potentially risky and expensive hire and exclude her from a pool of prospective employees. Social data stores the value of being able to apprehend behavior, to infer, predict, and direct the future actions of people (who are not always the data subject), and to develop informed strategies to obtain some objective. It provides the valuable capacity to gain insight into, and exert some measure of control over, future behavior. 9 See David Graeber & David Wengrow, The Dawn of Everything 364–65 (2021) (arguing that “control of information” is one of three bases of social power). The capacity of social data to store insight into human behavior, guide predictions about that behavior, and optimize strategies to guide and change human behavior is (much of) what drives companies to collect the data they do and to use it the way that they do. 10 Part II covers this at length. For a specific example relevant to the scenario here, see Eric Siegel, Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die 6–12 (rev. ed. 2016) (detailing the variety of predictive models organizations employ across various sectors, including healthcare, retail, and the mortgage industry, to attract, retain, and win back customers). Specifically, Eric Siegel covers Target’s use of big data to identify pregnant customers: Target created a training model based on users who signed up for Target’s baby registry and then applied it to customers who had not registered. Id. at 48–50.

Social data cultivation is key to the business strategies of some of the wealthiest and most powerful companies operating today. Companies face generalized market pressure to engage in the accumulation and cultivation of social data and its prediction value to stay competitive. 11 See Cohen, Between Truth and Power, supra note 4, at 6 (“In a regime of informational capitalism, market actors use knowledge, culture, and networked information technologies as means of extracting and appropriating surplus value, including consumer surplus.”). Indeed, the widespread practice of treating social data as a key input to production is part of what it means to refer to contemporary capitalism as an informational capitalism. 12 See Viljoen, Relational Theory, supra note 6, at 577, 586 (“Data plays a central role in both descriptive and critical accounts that characterize the contemporary digital political economy as informational capitalism.”). Recent technological transformations, like improved chip processing power, ubiquitous connected devices like smartphones, and improvements in machine-learning techniques, have all contributed to the feasibility and utility of entities cultivating, refining, and extracting social data value. 13 See Beth Gutelius & Sanjay Pinto, Ctr. for Urb. Econ. Dev., Pain Points: Data on Work Intensity, Monitoring, and Health at Amazon Warehouses 20 (2023), https://cued.uic.edu/wp-content/uploads/sites/219/2023/10/Pain-Points_Final_Oct2023.pdf [https://perma.cc/WQ2K-8V9V] (describing the impact on workers of the “company’s system of technology-enabled workplace monitoring”); Valerio De Stefano & Simon Taes, Algorithmic Management and Collective Bargaining, 29 Transfer 21, 23–25 (2023) (discussing how employers use data gathered from various technological tools to track workers’ physical locations, mental and emotional states, and digital activity to make decisions and predictions regarding workers’ conduct and productivity); Matt Burgess, All the Data Apple Collects About You—And How to Limit It, Wired ( Jan. 16, 2023), https://www.wired.com/story/apple-privacy-data-collection/ (on file with the Columbia Law Review) (describing all of the ways that Apple products may track and collect data about how users interact with their products). These technological changes have allowed entities to exploit for economic gain what has long been true: People are social beings, deeply knowable and materially influenced by relations to one another. Thus, the stakes of understanding social data’s particular form of value, and the social and economic effects that its widespread cultivation produces, have grown more salient.

A primary way the digital economy works is by using prediction value to increase monetary value: to grow profits by raising revenue and by lowering costs, or to grow market share (and, the thinking goes, future profits) by expanding customer bases and entering new markets. 14 Sections II.A and II.B provide a detailed analysis of these profitmaking strategies. As Part II will survey in greater detail, companies deploy a variety of strategies to transform prediction value into exchange value—the priced, monetary value of a good, service, or company, typically expressed as a “market price.” Exchange value, as a general theory and form of value, posits that the value of a thing derives from its exchange, expressed via price, on a real or imagined market. 15 See Dave Elder-Vass, Inventing Value 1, 22 (2022) (noting the mainstream notion of value as the equilibrium price of demand and supply and arguing that equilibrium price is “not the same thing as actual price[]” but a “regulative concept: the notional price at which marginalist theory says goods ought to be exchanged”); see also R.H. Coase, The Nature of the Firm, 4 Economica 386, 388 (1937) (“Outside the firm, price movements direct production, which is co-ordinated through a series of exchange transactions on the market.”). So, prediction value can be—and is—transformed into exchange value. But it does not have to be. Prediction value is distinct from (and not always neatly transformed into) exchange value. Part I defends in greater detail the descriptive and analytic virtues of cataloging this distinction.

Prediction value confers on its holder the power to apprehend, shape, and thus exert some measure of control over people’s behavior. In fact, the central preoccupation of privacy scholars and many other observers of the digital economy is this potential for the control power of social data, cultivated for its conversion into priced value, to be repurposed toward other (potentially disempowering) ends. 16 See infra section III.C. These purposes can coexist with strategies to grow exchange value, such as the use of prediction value in labor settings to reduce operating costs by eroding workplace protections, or lie outside the commercial realm entirely, such as immigration officials repurposing location data cultivated for commercial ends to detect and detain suspected undocumented immigrants. 17 Paul Blest, ICE Is Using Location Data From Games and Apps to Track and Arrest Immigrants, Report Says, Vice News (Feb. 7, 2020), https://www.vice.com/en/article/v7479m/
ice-is-using-location-data-from-games-and-apps-to-track-and-arrest-immigrants-report-says [https://perma.cc/C24R-GB9R].
Indeed, much of privacy law’s traditional concern regarding privately cultivated surveillance capacities is how such capacities fall into the hands of state actors and empower state action without sufficient scrutiny. 18 On use of data in workplace settings, see infra section II.A. On the traditional focus on state actors as a source of power-related privacy harm, see infra section III.C.

Of course, there is also some amount of speculative behavior around prediction value, as when entities, in order to secure valuations of high exchange value, overclaim or overpromise on the prediction value their products can deliver—a phenomenon the computer scientists Sayash Kapoor and Arvind Narayanan refer to as “AI snake oil.” 19 See Sayash Kapoor & Arvind Narayanan, About the Book and This Substack, AI Snake Oil, https://www.aisnakeoil.com/about [https://perma.cc/SM2P-J4ET] (last visited Feb. 19, 2024) (explaining the foundations of the AI snake oil project, which seeks “to dispel hype, remove misconceptions, and clarify the limits of AI”); see also Louise Matsakis, The Princeton Researchers Calling Out ‘AI Snake Oil’, Semafor (Sept. 15, 2023), https://www.semafor.com/article/09/15/2023/the-princeton-researchers-calling-out-ai-snake-oil [https://perma.cc/XMU9-TTNP]. But this, too, highlights the importance of disentangling assessments of social data value from priced exchange value—to better identify when claims of social data value (and its potential to transform into priced exchange value) are overblown. 20 See Inioluwa Deborah Raji, I. Elizabeth Kumar, Aaron Horowitz & Andrew D. Selbst, The Fallacy of AI Functionality, 2022 Proceedings Ass’n for Computing Mach. Conf. on Fairness, Accountability & Transparency 959, 961–65 (describing various ways in which AI systems fail to produce certain claimed outcomes, whether because the objectives were impossible, the systems were designed in faulty ways, or the systems’ capabilities were falsified, misrepresented, or overstated); Angelina Wang, Sayash Kapoor, Solon Barocas & Arvind Narayanan, Against Predictive Optimization: On the Legitimacy of Decision-Making Algorithms that Optimize Predictive Accuracy, Ass’n for Computing Mach. J. Responsible Computing, Mar. 2024, no. 9, at 1, 8–16 (identifying specific shortcomings in datasets or modeling that interfere with the ability to optimize the predictive value of AI and machine-learning models). As Aaron Shapiro notes in his excellent work on gig platforms, when it comes to understanding the way platforms capitalize on prediction value by turning it into market valuation, there is a considerable “gap between what platforms do and what they say they do.” 21 Aaron Shapiro, Platform Sabotage, 16 J. Cultural Econ. 203, 204 (2023) [hereinafter Shapiro, Platform Sabotage]. Tim Hwang compares the behavioral advertising market to the subprime mortgage crisis of 2008, arguing that companies’ claims that behavioral advertising is more effective are, like the supposed value of subprime mortgage–backed financial products, empirically dubious. Yet, similar to the 2008 financial crisis, these empirically dubious value propositions nevertheless produce widespread social disruption as companies pursue them. Tim Hwang, Subprime Attention Crisis 76–92 (2020). Clarifying the two modes of value production (and how they relate to each other) can help regulators and other observers traverse this gap and evaluate when claims are plausible and when they are not.

This Article argues for the importance of understanding how social data value is cultivated and used for regulating the digital economy. Part I provides greater detail on the concepts of social data and prediction value and argues for the distinctive value proposition of cultivating, accumulating, and using social data. It also provides theoretical context to distinguish the concept of exchange value—priced monetary value—from the concept of value more broadly and from prediction value as a particular kind of use value.

Part II offers a taxonomy of the business models and practices developed around cultivating and using social data value. This taxonomy divides into three scripts the ways in which companies leverage prediction value to produce wealth and power for themselves and their investors. The first script is direct and immediate conversion of social data value into exchange value through means such as direct sale of data, or through the premiums charged for targeted, as opposed to untargeted, advertising. The second script is indirect and often delayed conversion of prediction value into exchange value through improving and developing new products and services, lowering costs, increasing and stabilizing revenue, and expanding into new business lines and industries. The third script is leveraging prediction value to accrue power. This script catalogs how social data value can be a source of economic and political power, and thus of value to companies in their longer-term aims to secure market power and favorable regulatory environments. After cataloging and describing these three scripts, the Article explores some specific business practices associated with following these scripts, each of which focuses on growth and expansion. These practices include offering free and low-cost services, creating ecosystems of products and services, and embarking on aggressive merger and acquisition strategies. The Article shows how these strategies differ from traditional ones in ways that carry both legal and normative significance.
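To give the first script a concrete, stylized form, the short sketch below works through the arithmetic behind a targeted-advertising premium: an advertiser’s willingness to pay for an impression rises with the predicted probability that the viewer will convert, so better predictions about who is watching translate into higher prices for the same ad slot. The figures are invented for illustration and are not drawn from the Article or from any real ad market; the sketch is meant only to show why prediction value can command a price premium, not to model how any particular exchange actually sets prices.

```python
# A stylized illustration of the first script: prediction value converted into
# exchange value through the premium charged for targeted advertising.
# All figures are hypothetical.

def willingness_to_pay(conversion_prob: float, value_per_conversion: float) -> float:
    """A risk-neutral advertiser's expected value of a single ad impression."""
    return conversion_prob * value_per_conversion

VALUE_PER_CONVERSION = 40.00  # hypothetical profit per new customer won

# Without social data, the advertiser knows only the population base rate.
untargeted = willingness_to_pay(conversion_prob=0.002,
                                value_per_conversion=VALUE_PER_CONVERSION)

# With social data, the ad is shown only to users a model scores as likely
# expectant parents, raising the predicted conversion probability.
targeted = willingness_to_pay(conversion_prob=0.015,
                              value_per_conversion=VALUE_PER_CONVERSION)

print(f"untargeted bid per impression: ${untargeted:.3f}")
print(f"targeted bid per impression:   ${targeted:.3f}")
print(f"targeting premium:             {targeted / untargeted:.1f}x")
```

On these invented numbers, the targeted impression is worth 7.5 times the untargeted one, which is the sense in which the first script converts prediction value directly into priced value.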

In Part III, this Article explores how disambiguating prediction value and exchange value (conceptually and normatively) can illuminate why such a variety of existing legal regimes fail to properly manage the social and economic disruptions that have accompanied capitalism’s informational turn. In short, the same legal regimes that structure the transformation of prediction value into exchange value fail to grasp—in its entirety—the messy, imperfect, and socially disruptive process by which this transformation occurs. While the regulatory challenges of the digital economy increasingly place strain on various areas of law, most consider only small portions of this process and lack a systematic understanding of social data value production. 22 Julie E. Cohen & Ari Ezra Waldman, Introduction: Framing Regulatory Managerialism as an Object of Study and Strategic Displacement, 86 Law & Contemp. Probs., no. 3, 2023, at i, ii–iii (detailing the “long and growing” list of harms from informational capitalism as managed under current regulatory paradigms and noting that adequate responses to these harms are beyond the capacities of current regulatory approaches).

The Article identifies two contexts in which the legal regimes that structure this process index only part of its legally relevant features. The first context is legal regimes that have historically been tasked with governing value creation. 23 See infra Part III. Such regimes are focused on evaluating and regulating companies’ claims of exchange value and thus only apprehend or index prediction value (indeed, they only consider such value normatively and legally relevant) at the point it is transformed into exchange value. As section III.B will show, this can miss many legally salient features of prediction value, such as how it is cultivated and the wider social effects that cultivation creates. This leaves such regimes poorly equipped to properly achieve their normative goals. The Article chronicles these struggles through the example of tax law.

The second context is legal regimes that have not historically understood themselves to be tasked with governing value creation but that are focused on the legal significance of informational power. Such regimes are attentive to the capacity of information about people to create power over them, but they regulate social data along a strict public–private divide. Through the example of privacy and data governance law, the Article shows the conceptual and programmatic challenges of this approach. Privacy and data governance law govern private data collection primarily via individual control and consent rights. 24 This is also referred to as the “notice and choice” regime. See Neil Richards & Woodrow Hartzog, Privacy’s Trust Gap: A Review, 126 Yale L.J. 1180, 1197–98 (2017) (reviewing Finn Brunton & Helen Nissenbaum, Obfuscation: A User’s Guide for Privacy and Protest (2015)); Neil Richards & Woodrow Hartzog, Taking Trust Seriously in Privacy Law, 19 Stan. Tech. L. Rev. 431, 434 (2016). There are hundreds of privacy laws in the United States, and providing a systematic review of the role consent plays in each one is beyond the scope of this project. Yet, U.S. privacy laws are generally understood to derive from the Fair Information Practice Principles (FIPPs), which lay out principles for fair data processing, including meaningful individual rights of control and consent over data collection and use. Fed. Priv. Council, Fair Information Practice Principles (FIPPs), https://www.fpc.gov/
resources/fipps/ [https://perma.cc/BYS2-SY4F] (last visited May 8, 2024). For some canonical examples of privacy laws that operationalize consent, see, e.g., Children’s Online Privacy Protection Act (COPPA) of 1998, 15 U.S.C. §§ 6501–6506 (2018); Fair Credit Reporting Act (FCRA) of 1970, 15 U.S.C. §§ 1681–1681x; Family Educational Rights and Privacy Act (FERPA) of 1974, 20 U.S.C. § 1232g (2018); Health Insurance Portability and Accountability Act (HIPAA) of 1996, Pub. L. No. 104-191, 110 Stat. 1936 (codified as amended in scattered sections of 18, 26, 29, and 42 U.S.C. (2018)); California Consumer Privacy Act (CCPA) of 2018, Cal. Civ. Code §§ 1798.100–.199.100 (2023); Biometric Information Privacy Act (BIPA), 740 Ill. Comp. Stat. Ann. 14/10, 14/15 (West 2023).
Privacy law traditionally apprehends or indexes social concerns regarding prediction value, and its capacity to coerce action and remake social relations, only if or when it falls into the hands of public actors. And while the near-exclusive focus on state surveillance in the field is shifting, both popular and doctrinal conceptions of socially coercive privacy harm remain primarily focused on public, rather than private, actors. This ignores many salient concerns regarding informational power that arise as social data is imbricated into the strategies of commercial actors and neglects the role of privacy and data governance law in facilitating this form of value creation. It also overlooks the potential social benefits of prediction value if cultivated in procedurally fair ways and put toward collectively determined ends.

This analysis has broad implications for other areas of the law. For example, other legal fields that, like tax law, have historically been tasked with governing value creation have legal frameworks developed around the concept of exchange value and are not achieving their normative goals when applied to prediction value. Antitrust and financial regulation are prominent examples here, as there is growing evidence to suggest these regimes are struggling to index the profit-seeking behavior of technology companies and thus achieve these legal regimes’ regulatory briefs in the digital economy. 25 See Saule T. Omarova & Graham S. Steele, Banking and Antitrust, 133 Yale L.J. 1162, 1244–55 (2024) (describing the challenges faced by banking regulators striving to manage digital financial markets and financial tech companies); Sanjukta Paul, Fissuring and the Firm Exemption, 82 Law & Contemp. Probs., no. 3, 2019, at 65, 72–76 (analyzing why antitrust law has failed to adequately regulate platforms like ride-hailing firms). This understanding will also be invaluable to legal fields that, like privacy and data protection law, have not historically been seen as regimes governing value creation and that, as a result, have not developed a positive agenda for regulating prediction value. First Amendment law is a prominent example. 26 Here, the Article refers particularly to discussions of free speech and the First Amendment as a primary legal regime regulating against the discursive and democratic ills of social media platforms. Similar to the developments in privacy and data governance law discussed in Part III, free speech scholars are increasingly attentive to the role free speech law plays in structuring the platform economy and the conceptual limits of that role. See, e.g., Jack M. Balkin, To Reform Social Media, Reform Informational Capitalism, in Social Media, Freedom of Speech and the Future of Our Democracy 233, 233–34 (Lee C. Bollinger & Geoffrey R. Stone eds., 2022) (arguing that, in order to restore a robust digital public sphere, people should focus on the “industrial organization of digital media and the . . . business models of social media companies” rather than focusing reform on First Amendment doctrines). There is, of course, a more general literature on free speech and the First Amendment that has argued that current free speech doctrine serves as a (Lochnerized) regulatory paradigm for economic activity. See, e.g., Amy Kapczynski, The Lochnerized First Amendment and the FDA: Toward a More Democratic Political Economy, 118 Colum. L. Rev. Online 179, 179–80 (2018), https://www.columbialawreview.org/wp-content/uploads/2018/11/Kapczynski-THE_LOCHNERIZED_FIRST_AMENDMENT_
AND_THE_FDA_TOWARD_A_MORE_DEMOCRATIC_POLITICAL_ECONOMY.pdf [https://perma.cc/6FZM-YZZ4]; Robert Post & Amanda Shanor, Adam Smith’s First Amendment, 128 Harv. L. Rev. Forum 165, 165–67 (2015), https://harvardlawreview.org/
wp-content/uploads/2015/03/vol128_PostShanor2.pdf [https://perma.cc/88XH-LXMG];
Jedidiah Purdy, Neoliberal Constitutionalism: Lochnerism for a New Economy, 77 Law & Contemp. Probs., no. 4, 2014, at 195, 202–03.

The idea that information like social data confers power, and is thus a source of value with significant ontological, political-economic, and legal implications, is not new. 27 In 1963, Kenneth E. Boulding remarked that “[t]he very concept of a knowledge industry contains enough dynamite to blast traditional economics into orbit.” Kenneth E. Boulding, The Knowledge Industry, Challenge, May 1963, at 36, 38 (reviewing Fritz Machlup, The Production and Distribution of Knowledge in the United States (1962)). Fritz Machlup was one of the first to systematically analyze the production and distribution of commoditized knowledge. See Fritz Machlup, The Production and Distribution of Knowledge in the United States 13–43 (1962) (reviewing knowledge classifications and proposing his own). In 1994, Robert E. Babe remarked on the curious need for mainstream economics to treat information as a commodity to make sense of it within the economic paradigm, a need that “obscures many essential properties of information, as well as consequences of informational exchange.” Robert E. Babe, The Place of Information in Economics, in Information and Communication in Economics 41, 41–42 (Robert E. Babe ed., 1994). Others, particularly political economists of communication and historians of science, have long identified and analyzed the role of informationalism in contemporary capitalist value formation as it emerged and took on growing importance. 28 See generally S.M. Amadae, Rationalizing Capitalist Democracy (2003) (describing how informational tools “led to a far-reaching and comprehensive system for defining appropriate beliefs and actions”); Manuel Castells, The Rise of the Network Society (2d ed. 2010) (describing the historical transformations leading to the creation of a network society); Dan Schiller, How to Think About Information (2007) (“Companies engaged in making and selling entertainment, banking, communications, data processing, engineering, advertising, law, and other information-intensive services have played an increasingly critical role in overall U.S. investment, employment, and international trade.”). Ian Parker lays out the broad contours of this emergence: “In the 1870s, in North America, about 70 percent of Gross Domestic Product (GDP) was based on material commodity production, with commoditized services constituting about 30 percent of GDP.” See Ian Parker, Commodities as Sign-Systems, in Information and Communication in Economics, supra note 27, at 69, 74. In the early 1990s, when Parker was writing about them, these percentages had reversed. See id. “The rise of the service sector” (and the decline of material commodity production or manufacturing) has been one of “the most fundamental . . . shifts” to happen across advanced capitalist nations in the latter half of the twentieth century. See id. This shift to the service sector encompasses three broad categories: increases in government expenditure associated with capitalist states’ fiscal policies, the commoditization of care work and other services previously associated with social reproduction (in part a result of the integration of women into the wage-labor market), and the rapid growth of the informational economy. See id. Since around the 1970s, however, the average share of government expenditures as a portion of GDP has plateaued. This means that the continued relative growth of the sector is driven by the other two trends: the ongoing transformation of care and of information into commodities. See id. 
Previous work has established the centrality of social data as a vital, even paradigmatic, factor of production under informational capitalism. 29 See Cohen, Between Truth and Power, supra note 4, at 42 (“[P]latforms represent both horizontal and vertical strategies for extracting the surplus value of user data. Because that project requires large numbers of users generating large amounts of data, the platform[’s] . . . goal is to become and remain the indispensable point of intermediation . . . in its target markets.”); Nick Couldry & Ulises A. Mejias, Data Colonialism: Rethinking Big Data’s Relation to the Contemporary Subject, 20 Television & New Media 336, 337 (2018) (“Just as historical colonialism over the long-run provided the essential preconditions for the emergence of industrial capitalism, . . . we can expect that data colonialism will provide the preconditions for a new stage of capitalism . . . for which the appropriation of human life through data will be central.”); Jathan Sadowski, When Data Is Capital: Datafication, Accumulation, and Extraction, Big Data & Soc’y. Jan.–June 2019, at 1, 1, https://journals.sagepub.com/doi/epdf/10.1177/2053951718820549 [https://perma.cc/
RBB3-8AAJ] [hereinafter Sadowski, When Data Is Capital] (“Industries focused on technology, infrastructure, finance, manufacturing, insurance, and energy are now treating data as a form of capital.”).
Others have identified the importance of behavioral monitoring and prediction to the governance capacities and challenges of the digital economy. 30 See Katharina Pistor, The Code of Capital 213 (2019) (“By constantly contesting the existing boundaries of legal rules in general, and by expanding the remit of the code’s modules to make them fit for ever newer asset classes, lawyers turn any of their clients’ assets into capital.”). See generally Shoshana Zuboff, The Age of Surveillance Capitalism (2019) [hereinafter Zuboff, Surveillance Capitalism] (“We celebrate the networked world for the many ways in which it enriches our capabilities and prospects, but it has birthed whole new territories of anxiety, danger, and violence as the sense of a predictable future slips away.”). Legal scholars have also explored the legal facilitations and fallouts of the informational turn. 31 See Cohen, Between Truth and Power, supra note 4, at 1 (“Networked information technologies inevitably will alter, and are already altering, the future of law . . . .”). See generally Kiel Brennan-Marquez & Daniel Susser, Knight First Amend. Inst., Privacy, Autonomy, and the Dissolution of Markets, (Aug. 11, 2022), https://s3.amazonaws.com/
kfai-documents/documents/854ed7a7b7/Brennan-Marquez—Sussner—-Privacy–Autonomy–
and-the-Dissolution-of-Markets—08.11.22.pdf [https://perma.cc/SM73-SKQR] (describing how, with the rise of information companies, “the capacity of individuals to self-determine—and the capacity of polities to self-govern—is under threat”); Amy Kapczynski, The Law of Informational Capitalism, 129 Yale L.J. 1460 (2020) [hereinafter Kapczynski, Informational Capitalism] (reviewing Cohen, Between Truth and Power, supra note 4, and Zuboff, Surveillance Capitalism, supra note 30) (discussing the inequitable impacts of informational power); Omri Marian, Taxing Data, 47 B.Y.U. L. Rev. 511 (2022) (arguing that in data-rich markets, data should form the tax base); Katharina Pistor, Rule by Data: The End of Markets?, 83 Law & Contemp. Probs., no. 2, 2020, at 101 [hereinafter Pistor, Rule by Data] (analyzing the possibility of using data to organize markets).

This Article builds on that earlier work with two goals in mind. First, the Article’s primary goal is to provide a granular and reasonably systematic accounting of the various ways data is used (or can be used) by platforms and other firms to produce value (and power). It takes up this goal in Part II. The Article’s second goal is to provide a theoretical account of social data as a value form whose cultivation is a primary aim of digital firms—indeed, it is part of what marks the digital economy as “digital.” This theoretical contribution, laid out in Part I, is in service of the primary goal: to illuminate the distinctive value proposition of data and to help explain the conceptual and normative significance of social data as a value form. Taken together, Parts I and II describe the current structure of social data production and suggest why legal scholars and regulators have had trouble grasping the implications and effects of data production under the particular conditions of the contemporary digital economy. Part III explores these legal implications directly. In addition to its two substantive goals, the Article makes a modest methodological contribution to how legal scholarship engages with law’s constitution and regulation of production. Its theoretical account supplies a way of analyzing and evaluating productive activity that does not, at the conceptual level, presuppose market ordering of that activity. In doing so, the Article provides a model for similar analysis, when appropriate, for other kinds of productive activity.