Introduction
Sasha Costanza-Chock triggered the alarm when they walked through the full-body scanner at the Detroit Metro Airport.
They knew it would happen because it happens to transgender, nonbinary, and gender-nonconforming people all the time.
The machine deemed Sasha “risky” because their body, datafied into machine-readable code, differed from the pictures of bodies that trained the machine’s algorithm.
Their breasts were too pronounced relative to data associated with “male,” and their groin area deviated from data associated with “female.”
Pulled out of the line for a physical body search, Sasha found themself in an awkward, humiliating, and potentially dangerous situation.
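Although the scanner’s actual software is proprietary, its binary-template logic can be pictured in a minimal, hypothetical sketch: the operator must select one of exactly two body templates, and any region that deviates from the selected template is flagged for a manual search. The template values, threshold, and function names below are invented for illustration only and do not describe any vendor’s real system.

```python
# Hypothetical sketch of binary-template anomaly flagging.
# Every name and number here is invented for illustration.

FLAG_THRESHOLD = 0.3  # deviation beyond which a body region is flagged "risky"

# The operator must pick one of exactly two templates before the scan runs.
BINARY_TEMPLATES = {
    "male":   {"chest": 0.1, "groin": 0.8},
    "female": {"chest": 0.7, "groin": 0.2},
}

def screen(scan: dict, operator_choice: str) -> list:
    """Return body regions that deviate from the selected binary template."""
    template = BINARY_TEMPLATES[operator_choice]  # no third option exists
    flagged = []
    for region, measurement in scan.items():
        if abs(measurement - template[region]) > FLAG_THRESHOLD:
            flagged.append(region)  # flagged region -> physical body search
    return flagged

# A body that matches neither template is flagged whichever button is pressed.
print(screen({"chest": 0.7, "groin": 0.8}, "male"))    # ['chest']
print(screen({"chest": 0.7, "groin": 0.8}, "female"))  # ['groin']
```

Whichever template the operator selects, a body outside the binary is marked anomalous; the design leaves no input that would let Sasha pass without a flag.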
Toby P., a transgender man living in Colorado, was singled out by a different kind of automated administrative technology.
After Toby sustained a debilitating injury at work, his employer completed the required workers’ compensation First Report of Injury Form by checking the box next to “Female,” a designation that matched Toby’s assigned sex at birth and the information in his human resources file.
The state’s automated fraud-detection system, which compares this claim form with information pooled from state databases, denied Toby’s claim. The “system,” Toby told me, “saw ‘female’ here and ‘male’ [everywhere else] . . . and figured something didn’t match.”
Seven months, twenty-five phone calls, sixteen refiled forms, and two demand letters later, Toby is still hurt and still without the compensation to which he is entitled. He is “basically bankrupt.”
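The fraud detector’s cross-database consistency check, as Toby describes it, can be sketched in similarly hypothetical terms; the record structure, field names, and denial rule below are assumptions made for illustration, not a description of Colorado’s actual system.

```python
# Hypothetical sketch of a cross-database consistency check of the kind
# Toby describes. Records, field names, and the denial rule are invented.

POOLED_STATE_RECORDS = {
    "toby_p": {"dmv": "M", "vital_records": "M", "tax": "M"},
}

def flag_as_fraud(claim: dict) -> bool:
    """Flag a claim whose sex marker disagrees with any pooled state record."""
    pooled_markers = POOLED_STATE_RECORDS[claim["claimant_id"]].values()
    # Any mismatch is treated as evidence of fraud rather than as evidence
    # that one of the government's own records is simply out of date.
    return any(marker != claim["sex"] for marker in pooled_markers)

claim = {"claimant_id": "toby_p", "sex": "F"}  # copied from the HR file
print(flag_as_fraud(claim))  # True -> claim denied, burden shifts to Toby
```

On this logic, the mismatch itself is the anomaly; nothing in the rule asks which record is accurate or why the records might legitimately differ.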
Sasha and Toby fell through the cracks of the automated administrative state.
As government agencies turn to algorithms and artificial intelligence (AI) to administer benefits programs, detect fraud, and secure spaces, transgender, nonbinary, and gender-nonconforming individuals are put in situations where they can’t win. They become “anomalies” or “deviants” in systems designed for efficiency.
Technologies “have politics.”
Just as race and gender hierarchies can be embedded in technological systems,
in this case it is cisnormativity—the assumption that everyone’s gender identity and presentation accord with their assigned sex at birth—that is designed into the automated systems that singled out Sasha and Toby. The underlying data that train machines to recognize males and females, the algorithms that identify anomalies in a person’s body relative to that database, the forms inconsistently designed to collect sex and gender data in the first place, and the systems’ restriction to only male/female options all reflect the assumption that gender is binary. Anyone who deviates from a normative, binary body is deemed “risky” and singled out, potentially exposing them to harm. Gender-nonconforming individuals who also hold other minoritized identities, whether as religious minorities, immigrants, people of color, or people with disabilities, are multiply burdened.
But this Article is not simply about the biases replicated and entrenched by AI and algorithmic technologies, a story deftly told by others and summarized in Part I. Nor is it just about gender as a tool of classification, a story as old as the nation.
This is a story about law. Specifically, this Article argues that the law has mandated, influenced, and guided the state to automate in a way that binarizes gender data, thereby erasing and harming transgender, nonbinary, and gender-nonconforming individuals.
The law’s active role in the creation of this kind of automated state has been overlooked because the two dominant strands in legal scholarship on algorithmic technologies are focused elsewhere. One of those strands sees automation and its harms flourishing in a regulatory void. Scholarship in this vein rightly argues that automated systems used by private, for-profit technology companies cause harm because “the law has offered insufficient protection.”
Other scholars suggest that algorithmic technologies are built amidst “lawlessness,” or the lack of regulation.
A second important strand of law and technology scholarship focuses on how law can address automation’s harms. This research explores how the technologies work, where they go wrong, and how we might use law to regulate them, fix them, and restore the status quo ante by holding technologies and those that use them accountable for discrimination, bias, and harm.
Few scholars have focused on how the law creates the automated administrative state,
and fewer still have focused on how the law constructs gender data in the automated state.
This Article fills that gap: Sasha’s and Toby’s stories are actively and indelibly framed, constructed, and sustained by law every step of the way.
The process begins at the source, where statutes mandate the collection of sex and gender data. As Part II describes, the law of gender data collection relies on assumptions of static gender, taps into uninformed perceptions of the gender binary as “common sense,” and creates the conditions for civil servants to design forms with primarily binary gender questions. This creates binary gender data streams. Part III shows how interstate compacts and interagency contracts, all of which I collected from public records requests, require states to share datasets that include sex and gender. The law of gender data sharing looks outward and inward to privilege the gender binary: It has expressive effects that normalize the gender binary, conflationary effects that confuse the social aspects of gender with the biological aspects of sex, and interoperability effects that force the gender binary onto any agency that wants to realize the benefits of participating in shared data systems. Part IV demonstrates how automation mandates, agency policymaking by procurement, trade secrecy law, and privacy and data protection law actively encourage automation to improve efficiencies while preventing anyone from interrogating the underlying assumptions of the algorithms that use sex and gender data. This web of legal rules guides automation to exclude those outside the norm and erects barriers around automated tools that protect the gender binary from change.
In other words, the law forces an oversimplified legibility on its subjects, leaving those most marginalized at risk.
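The interoperability effect that Part III documents can be illustrated with a minimal sketch, assuming a hypothetical shared schema that accepts only two sex markers; the schema and field names are invented and merely stand in for the field definitions found in real data-sharing agreements.

```python
# Hypothetical shared-data-schema validator. The schema is invented and
# stands in for field definitions in interagency data-sharing agreements.

SHARED_SCHEMA = {"sex": {"M", "F"}}  # the pooled system accepts only two values

def conform_record(record: dict) -> dict:
    """An agency's record must fit the shared schema before it can be shared."""
    if record["sex"] not in SHARED_SCHEMA["sex"]:
        # An agency that issues "X" markers must coerce or drop them to
        # participate, so the binary propagates to every partner agency.
        raise ValueError(f"unshareable sex marker: {record['sex']}")
    return record

print(conform_record({"name": "A. Doe", "sex": "F"}))  # accepted as-is

try:
    conform_record({"name": "B. Doe", "sex": "X"})
except ValueError as err:
    print(err)  # unshareable sex marker: X
```

To realize the benefits of participation, an agency must either refuse to recognize nonbinary markers or strip them out before sharing; either way, the shared schema exports the binary.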
This rich account of how law collects, shares, and uses sex and gender data in state-run automated systems offers several insights about automation and the automated state in general that challenge or add nuance to the conventional wisdom in the legal literature. Part V discusses four of those lessons.
The automated state is discretionary.
Scholars have argued that automation erodes traditional agency discretion, a pillar of the administrative state.
But this Article shows that civil servants have discretion to guide automation in ways that binarize gender data. The discretion may be buried, but its fingerprints are everywhere—in the design of data-collection forms, in the terms of data-sharing agreements, in the procurement of technologies, and in the design and completion of privacy impact assessments (PIAs).
Relatedly, the automated state is also driven by stereotypes.
Rather than merely shifting expertise from civil servants hired for their substantive knowledge to engineers with technological knowledge about how algorithms work, the automated state relies on both civil servants’ and engineers’ supposedly commonsense perceptions of sex and gender.
Because most people have traditionally presumed that sex and gender are the same and static, automated systems designed by engineers and used by the government reflect those stereotypes.
The automated state is also managerial.
Far from a product of the law stepping out of the way, the state’s use of algorithmic decisionmaking processes represents the synthesis of the logics (and pathologies) of data-driven governance, risk assessment, public–private partnerships, and procedural compliance, leveraging the power of law and the state to achieve efficiency goals. By orienting algorithmic tools toward the neoliberal goal of targeted governance through risk assessments that are supposed to cover most people most of the time, the law singles out those outside the norm for disproportionate harm.
Finally, and relatedly, the automated state is structurally subordinating.
Law infuses the government’s data ecosystem with sex and gender information in a way that is both over- and underinclusive: It is overinclusive because it collects sex and gender data even when doing so is unnecessary; it is underinclusive because its reliance on the gender binary excludes transgender, nonbinary, and gender-nonconforming individuals from any of the benefits that could come from data’s capacity to create insight.
This kind of automated state harms gender-diverse populations. But the reification of the gender binary in the automated state is not a niche concern; it harms anyone constrained by strict gender expectations.
Plus, those most dependent on government resources and thereby subject to the state’s informational demands will bear the greatest burdens of the state’s automated use of binary gender data streams.
This poses a particular problem for members of the LGBTQ+ community, approximately one million of whom are on Medicaid.
Nearly half of LGBT people of color live in low-income households.
Transgender people are nearly two and a half times more likely than non-transgender people to face food insecurity.
LGBT people have higher rates of unemployment than the general population.
For some scholars and advocates, the solution to these problems is for the state to stop collecting sex and gender data.
But as various scholars have shown, legibility comes with benefits as well as risks.
I don’t know whether there is a way to get it right, to find the “Goldilocks Zone” for gender, data, and power, especially given the state’s historic commitment to queer oppression and the historical aims of what James C. Scott might call top-down legibility.
But I would like to try. This Article offers a way to navigate the legibility dilemmas triggered by state gender data collection.
The Article’s lessons about the automated state—its persistent reliance on civil servant discretion, its use of stereotypes and perceptions of common sense, its orientation toward efficiency, and its subordinating capacities—suggest that scholars and advocates ignore the liminal space between the law on the books and the law on the ground at our peril.
To be sure, we can pass new laws that guarantee an “X” gender marker option; we can also litigate in court when state gender designations discriminate against those outside the gender binary. But “new categories are not enough.”
Nor will a statute “deprogram” a gender binary so embedded in our culture and in the technologies of private and state surveillance.
To protect transgender, nonbinary, and gender-nonconforming individuals from automation-based harms on a more systematic level, we can also develop the state’s “gender competence.”
That is, in addition to changing the law on the books, scholars and advocates can also help change how civil servants understand gender data and its value, limits, and powers.
These are the goals of Part VI, which wrestles with the live and pressing questions of the proper role of the state: Should the state ever collect and use gender data? If not, why? If so, how can the state do so in a way that serves the interests of gender-diverse populations rather than its own disciplinary interests? Resolving these questions is beyond the scope of this Article, but in a world in which the state does collect and use gender data, its role should be particularly narrow. Part VI offers three principles, familiar to privacy scholars, for building a future in which government uses of gender data and algorithmic technology foster rather than erode antisubordination goals. A necessity principle urges the state to ask whether it actually needs sex or gender data to achieve its goals and, if it does, to determine which one it needs. An antisubordination principle would limit sex and gender data collection to only those uses that benefit and support greater inclusion of gender-diverse populations. And an inclusivity principle would ensure that once the state decides to collect sex or gender data for emancipatory ends, it does so sensitively and in a contextually inclusive way.
Luckily, privacy law principles of data minimization—that one should only collect as much personal data as is necessary to achieve a stated purpose—and antisubordination—that law should disrupt traditional hierarchies of power enjoyed by data collectors—are capable of doing just that.
Part VI concludes with this Article’s ultimate recommendation: The law on the books and the law on the ground should take gender diversity into account. The state should be able to collect, share, and use sex and gender data only when necessary to support a gender-inclusive antisubordination agenda: to combat discrimination, to provide adequate healthcare, to guarantee benefits that have been traditionally denied, and to enable self-determination for gender-diverse populations.
To date, the law’s role in creating an automated state that binarizes gender data has been mostly hidden from view. It is a puzzle of statutes, rules, interstate compacts, intergovernmental cooperation, procurement, street-level bureaucracy, and managerial policymaking, all of which is summarized in Table 1. This Article pieces that puzzle together. It relies on a mix of primary source materials, including a novel, computationally derived dataset of more than 12,000 government forms scraped from state agency websites, documents obtained through public records requests, and first-person interviews with lawyers and government officials.
Table 1. Law and the Binarization of Gender Data, Summary
Law of Data Collection (examples):
- Statutes requiring sex/gender data collection (e.g., security, identity verification, distribution of benefits)
- Information primarily gathered through forms created by street-level bureaucrats
Data binarized by . . .
- Mediation by the state, which creates the data
- Perceptions of “common sense” about sex/gender, which govern form design
- Path dependencies, which ensure that forms remain the same over time
- Assumption that gender is a static/secure identifier, which implies gender binary only

Law of Data Sharing:
- Data sharing required to realize security and efficiency benefits
- Data sharing permitted at discretion of state agency leadership
- Interagency agreements
- Interstate compacts
Data binarized by . . .
- Normalization of the binary by dissemination
- Conflation of sex and gender
- Interoperability, which requires all data to look the same

Law of Data Use:
- Automation mandates
- Efficiency mandates
- Innovation, chief innovation offices
- Procurement
- Trade secrecy
- Privacy law compliance (privacy impact assessments)
Data binarized by . . .
- Efficiency mandates, which mean binary design
- Managerialization via innovation offices, which ensures narrow cost–benefit analysis
- No interrogation of design via procurement process
- Symbolic compliance, which weaponizes PIAs to serve automation rather than privacy