INTRODUCTION
In a 2017 interview, Chief Justice John Roberts was asked if he could foresee a day when artificial intelligence would play a role in judicial decisionmaking. Immediately noting that “[i]t’s a day that’s here,” the Chief Justice described several types of technology that are widely used in the judicial system, concluding that the courts “have not yet really absorbed how it’s going to change the way we do business.”
Roberts was right—courts today make use of a variety of new technologies, including artificial intelligence,
but the impact of these innovations on the judicial system remains largely unaddressed.
Artificial intelligence already plays a significant role in judicial decisionmaking through the widespread use of recidivism risk assessment algorithms in state criminal justice systems. Today, at least twenty states use risk assessment algorithms to predict recidivism in their bail, parole, and sentencing proceedings,
encouraging or requiring judges to consider them in making their determinations.
And while these tools promise to decrease recidivism, reduce prison overcrowding, and combat racial bias,
critics have identified a number of potential constitutional issues that stem from the use of these algorithms, including due process and equal protection claims.
Further complicating these constitutional questions is the fact that state governments often contract with private companies to develop and license these tools, in part due to the level of technological and statistical expertise needed to create them.
Because of this outsourcing, defendants and judges alike have limited information about how these risk assessment systems operate. For example, in a 2016 case from Wisconsin, a private developer denied an offender’s request for information about the algorithm that was used to determine his sentence on the ground that it was a trade secret.
As states increasingly contract with private companies to incorporate algorithmic risk assessment into their criminal justice systems,
a number of questions arise about the strength of constitutional protections when private actors play a role in the sentencing process.
While scholarship on risk assessment algorithms has focused primarily on the constitutionality of risk assessment and challenges to the invocation of trade secret protections,
this Note addresses the inadvertent and largely unexamined role that private actors—risk assessment developers—have come to play in individual sentencing determinations. Arguing that a lack of oversight and control by state actors has created an accountability gap in the use of these tools, this Note proposes a solution for increasing accountability and legitimacy in recidivism risk assessment based on case law addressing delegations to private actors. Part I provides a brief history of the development and use of risk assessment instruments in sentencing, followed by a more comprehensive explanation of how modern risk assessment tools differ from earlier models. Part II explains how the process for developing modern risk assessment algorithms has obscured the way these tools operate, making judges more likely to rely on these seemingly objective yet opaque assessments in sentencing decisions. This has inadvertently allowed private developers to play a significant role in sentencing individual defendants, while remaining unrestricted by traditional notions of constitutional accountability that bind state actors. Part III suggests that the private delegation doctrine—a largely dormant, New Deal–era doctrine developed to increase government oversight and control of private actors exercising government power—can provide a framework for understanding and filling this accountability gap.
The Note concludes in Part IV, which uses private delegation principles identified in Part III to craft legislative remedies to restore constitutional accountability to the use of privately developed risk assessment algorithms in sentencing.
I. RISK ASSESSMENT IN SENTENCING
This Part provides an overview of the history and current applications of recidivism risk assessment in sentencing. Section I.A describes the development of risk assessment tools in the twentieth century, as well as the policy reasons for using them, before looking at examples of state statutes that authorize or mandate risk assessment in sentencing. Section I.B discusses the machine learning process, illustrating how today’s risk assessment algorithms differ from earlier risk assessment methodologies.
Finally, section I.C summarizes some of the constitutional concerns surrounding the use of modern risk assessment algorithms in sentencing, setting the stage for Part II’s discussion of opacity, privatization, and constitutional accountability.
A. The History and Use of Recidivism Risk Assessment Tools
Now ubiquitous,
the use of risk assessment tools in the U.S. criminal justice system dates back to the 1920s, when sociologist Ernest Burgess developed a statistical method for estimating recidivism risk for parole determinations.
In the 1960s and 1970s, statisticians made further efforts to create models for identifying offenders with a high risk of committing violent crimes.
These early methods were not particularly accurate or useful, with some identifying up to ninety-nine percent of study participants as “dangerous.”
Perhaps because of the unreliability of these early models, clinical risk assessments, in which correctional staff and clinical psychologists would undertake unstructured, one-on-one interviews to assess an individual’s likelihood of recidivating, were more popular during this time.
In sentencing specifically, risk assessment became common in the 1980s, when sentencing commissions began to use criminal history as an approximation of recidivism risk in sentencing guidelines.
Although the instruments for assessing risk have changed since these tools were first developed, three policy arguments continue to drive the use of risk assessment tools. First, risk assessments may help to reduce prison populations and save taxpayer money by enabling judges to sentence low-risk defendants to shorter prison terms.
Second, they may increase fairness in the criminal justice system by providing an assessment of a defendant’s dangerousness that is purportedly free from the biases that can plague judicial decisionmaking.
Finally, they may reduce recidivism and increase public safety by enabling judges to better understand a defendant’s rehabilitative needs.
The use of predictive risk assessment tools in sentencing has increased dramatically in recent years.
The first actuarial risk assessment instrument for sentencing was implemented in Virginia in 1994,
and as of 2014 approximately twenty states used some form of risk assessment in sentencing.
In the current working draft of the Model Penal Code, the American Law Institute has even added a provision encouraging states to incorporate risk assessment tools into the sentencing process.
In states that have adopted predictive risk assessments, an offender’s risk score—which often appears both as a numerical value and as a qualitative “high,” “medium,” or “low risk” designation—is typically provided to the sentencing judge in the offender’s presentence investigation report.
Many state legislatures have recently started to require the consideration of recidivism risk assessments during the sentencing process.
For example, Kentucky’s state sentencing policy mandates that judges consider “the results of a defendant’s risk and needs assessment included in the presentence investigation” and “[t]he likely impact of a potential sentence on the reduction of the defendant’s potential future criminal behavior.”
Likewise, Tennessee law mandates that “[i]n preparing presentence reports . . . the department of correction shall include information identifying the defendant’s risks and needs as determined through the use of a validated assessment instrument.”
Other states have taken a more permissive approach to the use of recidivism risk assessment tools, simply encouraging or permitting consideration of risk assessments without explicitly requiring it.
For example, Louisiana law gives courts the option of using a “single presentence investigation validated risk and needs assessment tool prior to sentencing an adult offender eligible for assessment.”
Washington State’s sentencing law takes a similar approach, stating that the judge at a sentencing hearing “may order the department to complete a risk assessment report,” except in certain cases.
B. Modern Risk Assessment Tools
The increased use of risk assessment tools in sentencing is closely tied to the development of new modeling techniques utilizing machine learning,
the process by which a computer program is given a large quantity of data and tasked with identifying variables in the data that correlate with a specified outcome.
The use of machine learning has greatly increased in recent years, largely due to “the accumulation of large datasets for analysis and advances in computing power and machine learning theory that have enabled much more complex analysis of those datasets.”
This technology is now used in myriad ways, including in self-driving cars,
disease detection,
online-shopping product recommendations,
and, of course, recidivism risk assessment tools.
The machine learning process proceeds through eight (sometimes overlapping or repeated) steps, as identified by Professor Paul Ohm and David Lehr and summarized in the table below.
Table 1: The Machine Learning Process
Problem Definition: The developer determines the outcome the final algorithm should predict. In the case of a recidivism risk assessment, this would be the likelihood of a given defendant committing another crime in the future.
Data Collection: The developer assembles a sufficiently large dataset from which the machine learning model can identify patterns. For recidivism risk, this dataset would likely include extensive criminal histories collected from local government offices.
Data Cleaning: The developer combs through the dataset to identify incorrect or missing information. Missing or incorrect values may require entire entries to be eliminated from the dataset, or replacement values to be imputed from the existing data. Although a single error may not be significant, it is impossible to assess the aggregate impact of cleaning decisions on the overall dataset without detailed explanations from the developer.
Summary Statistics Review: The developer conducts a review of the aggregate statistics to identify outliers.
Data Partitioning: The dataset is divided into two subsets: one that will be used to “train” the algorithm and a second that will later be used to test its predictive ability.
Model Selection: The developer chooses the model that will generate the predictive algorithm. While all machine learning models work toward the same end—producing an algorithm with maximal predictive accuracy—there are several different methods for achieving this goal. Different models are better equipped to digest and predict quantitative or qualitative values, to over- or underestimate in their predictions, and to provide some amount of explanation for how results are calculated.
Model Training: The selected model is applied to the subset of data that has been designated for training. This is the “learning” portion of the process, when the model identifies patterns in the dataset and develops the predictive algorithm based on these patterns. This process may be repeated several times, with developers fine-tuning both the data and the model over the course of multiple learning cycles. It is important to keep in mind that the ultimate objective of a machine learning model is not to identify inherent causal relationships in the dataset, but “to make classifications based on mathematical descriptions . . . that yield the lowest error rates.”
Model Deployment: The patterns that have been identified in the data are converted into a usable interface. In the risk assessment context, this includes translating the “quantitative outcome into a qualitative ‘risk score’ used by criminal justice actors at sentencing.”
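To make these steps concrete, the following sketch walks through the pipeline in Python using scikit-learn. The dataset, file name, column names, model choice, and risk-category cutoffs are all illustrative assumptions; the sketch does not describe any actual vendor’s tool.

# A minimal, hypothetical sketch of the eight steps in Table 1.
# All file names, column names, and thresholds are invented for illustration.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Problem definition: predict rearrest within five years of release
# (the recidivism label chosen by the developer).
# Data collection: load criminal-history records from a (hypothetical) file.
records = pd.read_csv("criminal_histories.csv")

# Data cleaning: drop rows missing the outcome label and impute missing ages.
records = records.dropna(subset=["rearrested_within_5_years"])
records["age_at_release"] = records["age_at_release"].fillna(
    records["age_at_release"].median()
)

# Summary statistics review: inspect aggregate statistics for outliers.
print(records.describe())

# Data partitioning: split the data into training and test subsets.
features = records[["age_at_release", "prior_arrests", "prior_convictions"]]
labels = records["rearrested_within_5_years"]
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.25, random_state=0
)

# Model selection and training: a gradient-boosted classifier "learns"
# patterns correlating the inputs with the recidivism label.
model = GradientBoostingClassifier().fit(X_train, y_train)
print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))

# Model deployment: translate the quantitative output into the qualitative
# category a judge ultimately sees.
def risk_category(probability):
    if probability < 0.33:
        return "low"
    if probability < 0.66:
        return "medium"
    return "high"

first_defendant_probability = model.predict_proba(X_test)[0, 1]
print("risk category:", risk_category(first_defendant_probability))

Each commented step corresponds to a row of Table 1; in a deployed product, every one of these choices is made by the private developer.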
Today’s risk assessment algorithms differ from earlier tools in their complexity and sophistication, and consequently, in their opacity.
This can make it difficult to understand the many decisions that go into creating them, particularly when these decisions are outsourced to private contractors.
As Professor Jessica Eaglin has observed, “Decisions about the data to collect, the recidivism event to observe, and the risk factors selected have great import to understanding what and how a resulting recidivism risk tool predicts.”
For example, in the problem definition stage, if the algorithm developer chooses to define recidivism as an arrest occurring within five years of release, the algorithm will identify a different set of offenders in the dataset as recidivists than if the developer had selected a ten-year timeframe. Because this decision changes which offenders are classified as recidivists—and which are not—it inevitably influences the patterns that the model identifies in drawing connections between those offenders.
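As a minimal illustration, the sketch below uses invented release and rearrest dates to label the same three offenders under a five-year and a ten-year definition of recidivism; one offender counts as a recidivist only under the longer window, so the two definitions hand the model different training labels.

# Hypothetical illustration: the same offenders are labeled differently
# depending on whether recidivism is defined as rearrest within five or ten
# years of release. All dates and names are invented.
from datetime import date

records = [
    {"name": "A", "released": date(2000, 1, 1), "rearrested": date(2004, 6, 1)},
    {"name": "B", "released": date(2000, 1, 1), "rearrested": date(2008, 6, 1)},
    {"name": "C", "released": date(2000, 1, 1), "rearrested": None},
]

def is_recidivist(record, window_years):
    """Label an offender a recidivist if rearrested within the chosen window."""
    if record["rearrested"] is None:
        return False
    return (record["rearrested"] - record["released"]).days <= window_years * 365

for window in (5, 10):
    labeled = [r["name"] for r in records if is_recidivist(r, window)]
    print(f"{window}-year definition labels as recidivists: {labeled}")
# Offender B is a recidivist only under the ten-year definition, so the two
# definitions yield different training labels and, in turn, different patterns
# for the model to learn.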
C. Constitutional Challenges to Algorithmic Risk Assessment
The increasing use of risk assessment algorithms in criminal justice has been accompanied by vocal concern about the constitutionality of their operation and deployment. In State v. Loomis, a notable Wisconsin case from 2016,
Eric Loomis brought a due process challenge against the use of COMPAS, the most widely used risk assessment tool in America,
after he received a high risk score and was sentenced to six years in prison.
The Wisconsin Supreme Court held that the use of COMPAS did not violate due process, in part because Loomis had the opportunity to review the accuracy of the information that produced his risk score.
Despite the outcome in Loomis, the due process implications of government use of predictive algorithms in criminal adjudication remain unsettled.
And while Loomis did not explicitly bring an equal protection claim,
a number of scholars have identified potential equal protection challenges arising from the use of risk assessment algorithms, which may consider protected characteristics in determining risk scores and are known to have disparate impacts on certain groups.
As the following Part demonstrates, the private development of risk assessment algorithms inhibits investigation of the merits of these constitutional claims. Even when public actors apply privately developed risk assessment tools in sentencing determinations, issues of transparency and accountability persist.
Judges struggle to understand and interrogate algorithmic results, allowing private actors to influence sentencing outcomes without being subject to traditional accountability mechanisms.
After all, it is difficult, if not impossible, to enforce constitutional guarantees when the violation is obscured from even the judge’s view.
II. PRIVATIZATION BY AUTOMATION
As seen in Part I, a number of constitutional concerns arise from the use of recidivism risk assessment algorithms in sentencing.
In the outpouring of scholarship on the constitutional problems presented by these algorithms, however, one question has been largely overlooked: Who is accountable for ensuring the constitutional compliance of risk assessment systems? This Part outlines the gap in constitutional accountability that arises from the use of privately developed risk assessment algorithms in sentencing, focusing on the legal and technological obstacles judges face in applying algorithmic risk scores.
Section II.A begins with a summary of these obstacles, looking at the opacity and false sense of objectivity that surround algorithmic decisionmaking and potentially lead judges to rely heavily on the results of risk assessment tools. While previous scholarship has discussed these opacity and objectivity problems in and of themselves, section II.B demonstrates how these obstacles obscure the role that private developers play in shaping a risk assessment algorithm, giving them undue influence in sentencing determinations. Section II.C concludes this Part by identifying the gap in constitutional accountability that arises when the decisions of risk assessment developers are not clear to the judges tasked with applying their tools.
A. Limits on Judicial Understanding of Risk Scores
This section summarizes the difficulties judges face in understanding and applying risk assessment algorithms.
Both legal and technological obstacles may obscure essential information about the development of these tools and the meaning of their outputs. Due in part to this opacity, judges may be inclined to rely on algorithmic predictions, which appear scientific and objective when little background information is available.
1. Legal and Technological Opacity. — Two kinds of opacity limit judges’ ability to understand the operation of risk assessment algorithms: legal opacity and technological opacity. Legal opacity refers to legal obstacles, such as trade secret protections,
that prevent a judge from accessing information about the algorithm,
whereas technological opacity relates to a judge’s inability to understand available information about the algorithm due to a lack of relevant expertise.
This section argues that, without transparency in the development and function of risk assessment algorithms, judges are unable to understand and properly apply their results—an idea best demonstrated by Loomis, the Wisconsin case discussed in Part I.
In Loomis, the defendant challenged the use of the privately developed COMPAS risk assessment tool in determining his six-year prison sentence.
On appeal to the Wisconsin Supreme Court, Loomis argued that the sentencing court’s decision incorporated information that was unavailable to him because the creators of COMPAS, invoking trade secret protections, were able to avoid disclosing how the algorithm determined his risk score.
In upholding the sentence, the court distinguished cases in which a due process violation arises because the sentencing court relies on information that the defendant is barred from reviewing, reasoning that Loomis had access to the same material as the court that sentenced him.
Essentially, no due process issue existed because the sentencing judge and the defendant had equally limited information about the operation of the algorithm.
Much of the scholarship discussing Loomis has focused on the successful invocation of trade secret privileges by Northpointe,
the developer of COMPAS, to prevent disclosure of how the algorithm operates.
Although trade secret protections were a central issue in Loomis,
the failure to disclose proprietary source code is only one of many ways in which the process for developing these tools has been obscured. In addition to the legal opacity arising from the use of trade secret protections to prevent source code disclosure, there is a layer of technological opacity working to obscure the ways in which risk assessment algorithms make their determinations.
While the trade secret obstacle certainly limits transparency, the opacity arising when judges with little to no technical knowledge apply incredibly complex software is equally troublesome. Even if trade secret protections were eliminated and the source code for these tools were provided to both judges and defendants, it is unlikely that this disclosure would meaningfully increase their understanding of how these tools function.
Whereas the significance and weight of factors used in traditional statistical models can be clearly explained, there is currently no practical way to demonstrate how a given input—such as the offender’s age—influences the risk score determined by an algorithm.
Looking back at the reasoning in Loomis,
it is easy to imagine a situation in which the court and defendant receive source code that they have no means of interpreting, but the judge finds no due process issue because there is equal access to information for all involved. It is therefore important to ensure that judges understand the inner workings of the technology that they apply, especially when, as in cases like Loomis, that technology may play a central role in sentencing.
A large source of the technological opacity described above is what has been called the “black box” nature of machine learning
—the idea that machine learning algorithms transform a set of inputs, such as criminal history, into an output, like a recidivism risk score, through a process that is not easily explained.
When the algorithm used to assess a defendant’s recidivism risk is created through machine learning, rather than by a human statistician, it is not always clear which of the data points provided as inputs actually factor into the final output, or how frequently and heavily those inputs are considered in the algorithm’s complex calculus.
Consequently, when a judge or defendant receives an offender’s algorithmically generated risk score, they “may not understand why someone is considered to have a low, medium, or high risk of recidivism.”
Because “[t]he results of machine learning analysis are not intuitively explainable,” they “cannot support causal explanations of the kind that underlie the reasons traditionally offered to justify governmental action.”
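The contrast can be illustrated with a short sketch on synthetic data: a logistic regression exposes a single coefficient for each input that can be stated and explained, while a tree-ensemble model of the kind often produced through machine learning offers no comparably direct account of how one input, such as age, moved a particular defendant’s score. The data, feature names, and models below are invented for illustration only.

# Illustrative sketch (synthetic data): a transparent linear model versus an
# opaque ensemble model. Nothing here reflects any actual risk assessment tool.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))  # columns: age, prior_arrests, employment
y = (X[:, 1] + 0.5 * rng.normal(size=500) > 0).astype(int)  # synthetic labels

linear = LogisticRegression().fit(X, y)
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# The linear model's weight on "age" can be read off and explained directly.
print("logistic coefficient for age:", linear.coef_[0][0])

# The forest's prediction for one defendant aggregates hundreds of decision
# trees; no single number states how age affected this particular score.
defendant = X[:1]
print("forest risk score:", forest.predict_proba(defendant)[0, 1])
print("number of trees aggregated:", len(forest.estimators_))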
2. The False Objectivity of Data. — A second problem arising from the use of risk assessment algorithms in sentencing is the false sense of objectivity that surrounds algorithmic predictions.
Professor Kate Crawford has described this phenomenon as “data fundamentalism,” or “the notion that correlation always indicates causation, and that massive data sets and predictive analytics always reflect objective truth.”
Because of the pervasive idea that all data is scientific, and that processes like machine learning are inherently logical, the unique attributes of modern risk assessment tools “combine to make machine-learning techniques appear qualitatively more independent from humans when compared to other statistical techniques.”
When interpreting the output of a machine learning algorithm, it can be tempting to view the results as objective, fair, and inevitable simply because they were produced by a computer.
The guise of objectivity is an issue whenever data is used to supplement human decisionmaking, but it is particularly concerning when judges use proprietary algorithms to inform their judgments. Judges, directed by statute to consider the results of risk assessment algorithms in their sentencing decisions,
might place strong emphasis on risk scores.
As Professors Robert Brauneis and Ellen P. Goodman explain, judges “are expected to exercise human judgment over algorithmic predictions so that they may catch faulty predictions. In theory, the algorithmic edict is advisory only. In practice, decisionmakers place heavy reliance on the numbers, raising the stakes for their fairness.”
Judicial reliance on algorithmic predictions is especially problematic when there has been little to no independent validation of their results.
This false sense of objectivity is connected to the opacity problems discussed above, which also contribute to judicial reliance on algorithmic risk scores. As Eaglin explains, private risk assessment developers are incentivized to avoid disclosing details related to their tools, since this information may undermine the credibility of their software, and therefore their competitiveness.
It is possible that, rather than relying on the results of an opaque machine because it seems “scientific,” judges may disregard a risk score when its origins are unclear.
However, many scholars have suggested that, despite the lack of transparency, decisionmakers do in fact rely on the results of risk assessment tools in making sentencing determinations.
Although these opacity and objectivity problems make sense in theory, it is worthwhile to examine the reality of how judges use and apply risk scores. Another case from Wisconsin provides a useful example. Paul Zilly agreed to a plea deal with prosecutors to serve one year in jail for theft.
After seeing his risk assessment score, a judge rejected the deal and sentenced Zilly to two years in prison along with three years of supervision, noting that his risk score was “about as bad as it could be.”
Although the judge later reduced Zilly’s sentence, he explained that, “[h]ad I not had the COMPAS, . . . it would likely be that I would have given one year, six months.”
While this evidence is admittedly anecdotal, it suggests that judges may rely heavily on risk scores to inform their sentencing decisions, imposing harsher sentences on those determined to present a high risk of recidivism. Scholars and journalists have noted this issue, raising alarms about allowing machines to determine prison sentences,
but the problem is more complicated than that. A focus on the algorithm as decisionmaker distracts from the human actors that have come to play a major role in sentencing determinations: the private developers of risk assessment tools.
B. The Privatization Problem
The opacity and objectivity problems discussed above prevent judges from understanding the inner workings of risk assessment tools, thereby obscuring the subjective decisions that shape these algorithms and increasing the likelihood that those decisions will influence someone’s sentence.
This means that private developers play a significant part in sentencing determinations without being subject to traditional constitutional accountability mechanisms.
As Brauneis and Goodman explain, in providing algorithmic tools to government bodies, “private entities assume a significant role in public administration. [Critical information] comes to reside in the impenetrable brains of private vendors while the government, which alone is accountable to the public, is hollowed out, dumb and dark.”
Returning to the eight steps of machine learning described by Lehr and Ohm,
it is clear that risk assessment developers make policy decisions throughout the machine learning process that can substantially impact the results of their algorithm.
For example, developers must quantitatively define what “recidivism” means in order for the model to be able to predict it, a process which can implicate major policy questions in criminal law.
A developer could instruct the model to predict recidivism based on whether someone will be arrested or convicted, or to predict the likelihood of this event occurring within one, five, or ten years.
While these subjective, private determinations have obvious significance for the meaning of a defendant’s risk score,
they are not at all clear to a judge applying the risk assessment tool.
The collection and cleaning of the dataset from which the model learns necessarily present further opportunities for developers to influence the tool’s predictive outcomes. After all, “[n]o predictive tool is better than the data set from which it originates.”
Private developers make decisions about how to gather the data that will form the foundation of the risk assessment tool—including the jurisdictions to pull data from and the size of the dataset—which can have significant impacts on how the algorithm functions.
Determinations about which data points to exclude and how to replace erroneous or missing values all reflect subjective judgments about the underlying data.
The process of assembling the risk assessment algorithm presents another set of opportunities to influence a tool’s output. Not every predictive factor that a model identifies will be incorporated into the tool that is ultimately packaged and sold to the government. Private developers must therefore judge which factors to include and which to omit,
decisions which are often made without taking relevant state sentencing laws into consideration.
Furthermore, an algorithm’s risk determination must ultimately be converted into a digestible output for judges to use, which requires developers to make additional policy judgments about what constitutes low, medium, and high risk.
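A brief sketch makes the point: the cutoffs separating “low,” “medium,” and “high” are parameters the developer selects, and the same underlying probability can be reported to the judge as a different category depending on that choice. The numbers below are purely illustrative.

# Hypothetical cut points: the qualitative category a judge sees depends on
# thresholds chosen by the developer, a policy judgment embedded in the tool.
def categorize(probability, cutoffs):
    low_cutoff, high_cutoff = cutoffs
    if probability < low_cutoff:
        return "low"
    if probability < high_cutoff:
        return "medium"
    return "high"

score = 0.42  # an offender's predicted probability of recidivism
print(categorize(score, cutoffs=(0.33, 0.66)))  # -> "medium"
print(categorize(score, cutoffs=(0.50, 0.80)))  # -> "low"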
Because of both legal and technological opacity, as well as the veil of objectivity that surrounds algorithmic decisionmaking, most of the actions of these private developers are obscured from public view.
Although private actors influence the results of risk assessment algorithms at every stage of their development, these choices are not immediately clear to the judges applying these tools. As Eaglin notes, “[P]revious efforts to estimate risk at sentencing—like guidelines and mandatory penalties—made normative judgments about how to sentence and why, [but] those choices were apparent on the face of the mechanized tool. . . . With actuarial risk tools, normative judgments are more difficult or even impossible to discern.”
C. The Constitutional Accountability Gap
At the heart of this problem is the principle of constitutional accountability—the idea that “the Constitution imposes limits on the actions that governments can take . . . [, and that] individuals injured by exercises of government power can enforce these constitutional limits in court.”
As Professor Paul R. Verkuil explains, “Accountability for acts of government is difficult when duties are delegated to private hands and secrecy covers the tracks.”
Because the Constitution “erects no shield against merely private conduct, however discriminatory or wrongful,”
the privatization of government functions raises questions about how to ensure that constitutional accountability is preserved in the now-private performance of previously public functions.
A central facet of constitutional accountability is the ability of individuals to enforce “constitutional restrictions in court through judicial review.”
As the preceding sections demonstrate, the opacity surrounding privately developed risk assessment algorithms, along with the false sense of objectivity that these algorithms provide, combine to prevent judges from reviewing the decisions of private actors that are embedded in these technologies.
While judges have a symbolic ability to review and deviate from the determinations of risk assessment algorithms, the issues raised in section II.A effectively limit the extent to which this is possible.
Without additional information about how these algorithms are developed, it is impossible for judges to know whether they represent due process or equal protection violations,
or to use them in a way that is consistent with the intent of the legislature.
The private action here is subtly obscured but raises the same issues as other instances of privatization of government functions
—namely that the mechanisms of government oversight in these relationships are structured in a way that does not comport with notions of constitutional accountability. Without the ability to more fully understand how these tools operate, judges are unable to interrogate their results in a way that sufficiently preserves constitutional protections.
As Lehr and Ohm explain, “Only one who is attentive to the many ways in which data can be selected and shaped—say, during data cleaning or model training—will characterize fully the source of the stink.”
The use of privately developed risk assessment algorithms in sentencing is, therefore, an instance in which “private market providers are cloaked in state clothes,” giving rise to potentially harmful conduct in need of greater constitutional oversight.
Implementations of machine learning technology in government decisionmaking obscure private action behind a technological veil, masking the reality of the situation—that the developers of these tools have been given inordinate power in sentencing, power that should be subject to traditional constitutional limitations.
While source code disclosure may nominally give both defendants and judges the ability to understand how these algorithms operate, the chance that a judge or criminal defendant will be able to make sense of this technical information in order to mount or assess a constitutional challenge is extremely low.
Rather than solely focusing on the constitutionality of a risk assessment algorithm or the defendant’s ability to challenge trade secret protections, it is equally important to examine the judge’s ability to apply the risk assessment in a way that allows for enforcement of constitutional guarantees.
III. THE PRIVATE DELEGATION DOCTRINE
Responding to the gap in constitutional accountability described in Part II, this Part turns to the private delegation doctrine as a framework for understanding the shortcomings of existing laws that govern the use of privately developed risk assessment algorithms in sentencing. Section III.A begins with a description of the traditional nondelegation doctrine, which limits delegations to public actors, and then discusses the justifications for applying similar, and possibly heightened, scrutiny when government power is placed in the hands of private parties. Section III.B summarizes the origins and evolution of the private delegation doctrine, which limits the ability of Congress to delegate authority to private actors. Section III.C then identifies key principles that guide courts’ analyses of private delegations. As Part IV then explores, these private delegation principles provide a framework for understanding the problems presented by privately developed risk assessment algorithms, as well as a path to restoring accountability to their use.
A. Private Versus Public Delegation
In administrative law, the nondelegation doctrine limits Congress’s ability to transfer power to administrative agencies and other actors. While the doctrine formally states that Congress may not delegate legislative power at all,
congressional grants of power are rarely invalidated in practice.
Recognizing that strict limitations on delegation are impractical,
the modern doctrine requires that the legislature prescribe sufficient policies and standards to restrict the scope of discretion that actors have in wielding the power delegated to them.
When reviewing congressional delegations of power to public actors, the Court asks whether the delegation establishes an “intelligible principle” to which the agency must conform.
One function of the intelligible principle is to enable judicial review of agency action,
meaning the legislative directive must be “sufficiently definite and precise to enable Congress, the courts, and the public to ascertain whether the Administrator . . . has conformed to those standards.”
A delegation’s legitimacy therefore depends in part on courts’ ability to ensure continued accountability
—to understand the legislative intent behind the delegation and determine whether the challenged action conforms to the will of Congress.
While the nondelegation doctrine described above applies the intelligible principle test to delegations of legislative power to public actors, courts have distinguished this formulation from the delegation of power to private actors.
Under the private delegation doctrine—the “lesser-known cousin” of nondelegation
—“the question . . . becomes whether ‘grants of government power to private entities are adequately structured to preserve constitutional accountability.’”
This distinction between delegations to public versus private actors is particularly important at a time when government functions are facing increasing privatization.
As courts and scholars have noted, the nondelegation doctrine’s concerns about government oversight are heightened when government power is placed in the hands of private actors.
Whereas the lax enforcement of the nondelegation doctrine may be justified on the grounds that the Constitution vests some amount of discretion in the executive branch and provides mechanisms of accountability, a heightened level of scrutiny may be required when these justifications do not exist, as in a private delegation.
Although Supreme Court jurisprudence tends to approve of the delegation of government power to private actors, it “emphasize[s] the presence of government review of private decisionmaking in upholding private delegations.”
B. The History of Private Delegation
1. New Deal Origins. — Although delegations to private actors were typically upheld prior to the mid-twentieth century, the New Deal “gave sharp focus to the private delegation doctrine, as reliance on private regulation and corporatism represented cornerstones of President Roosevelt’s early efforts to revive the national economy.”
In A.L.A. Schechter Poultry Corp. v. United States, the Supreme Court considered whether provisions of the National Industrial Recovery Act that permitted private industrial groups to write local trade codes unconstitutionally delegated lawmaking power to those private groups.
The majority, led by Chief Justice Hughes, asked whether it could “be seriously contended that Congress could delegate its legislative authority to trade or industrial associations or groups so as to empower them to enact the laws they deem to be wise and beneficent.”
Hughes went on to explain that this delegation could not be “made valid by . . . a preface of generalities as to permissible aims,” and that “[s]uch a delegation of legislative power is unknown to our law and is utterly inconsistent with the constitutional prerogatives and duties of Congress.”
One year later, the Court decided Carter v. Carter Coal Co., invalidating the Bituminous Coal Conservation Act as an impermissible delegation of power to private parties because it allowed coal miners and producers to enter labor agreements that were binding on all other miners and producers in the local area.
Because the Act delegated not “to an official or an official body, presumptively disinterested, but to private persons whose interests may be and often are adverse to the interests of others in the same business,” the Court deemed it “legislative delegation in its most obnoxious form.”
The Court balked at the idea of Congress conferring power on a private majority “to regulate the affairs of an unwilling minority,”
and found that the regulation of coal production “is necessarily a governmental function, since, in the very nature of things, one person may not be entrusted with the power to regulate the business of another.”
2. The Modern Private Delegation Doctrine. — The Supreme Court has not invalidated a law under the private delegation doctrine since Carter Coal,
giving the impression that “the most salient characteristic of current private delegation doctrine is its dormant status.”
Nevertheless, there are indications that the doctrine remains influential, in part because the Supreme Court “has continued to emphasize the presence of government review of private decisionmaking in upholding private delegations.”
In assessing a private delegation today, “the pre–New Deal cases remain valid . . . both because they have never been overruled and, more importantly, because the principles on which they relied remain relevant and vital.”
Despite the dormancy of private delegation at the Supreme Court, both state and lower federal courts have continued to apply the doctrine, helping to provide a fuller picture of its principles and considerations.
The recent battle over Amtrak’s regulatory capacity illustrates the continued influence of private delegation principles.
In 2013, the D.C. Circuit invalidated section 207 of the Passenger Rail Investment and Improvement Act of 2008 (PRIIA), which required the Federal Railroad Administration and Amtrak, a statutorily created, for-profit company, to jointly develop standards and metrics to assess the quality of passenger rail service.
In determining that the statute unconstitutionally delegated power to a private actor, Judge Brown emphasized the potential for Amtrak to abuse its statutory authority for private gain,
explaining that the doctrine established in Carter Coal “ensures that regulations are not dictated by those who ‘are not bound by any official duty,’ but may instead act ‘for selfish reasons or arbitrarily.’”
The court also stressed the structural limitations of the private delegation doctrine, rejecting the government’s argument that “‘[n]o more is constitutionally required’ than the government’s ‘active oversight, participation, and assent’ in its private partner’s rulemaking decisions.”
The Supreme Court later vacated this decision on the grounds that Amtrak is not a private entity,
in part because the federal government controls the company’s stock and oversees its operations.
While Justice Alito agreed with the majority that Amtrak was a government entity, he nevertheless relied on the private delegation doctrine in arguing that the statute may be invalid as a delegation of additional regulatory power to a private arbitrator.
Involving a private actor in the regulatory process, Alito argued, would allow the government to “regulate without accountability . . . by passing off a Government operation as an independent private concern.”
For Congress to delegate to a nongovernmental actor would be to violate the carefully constructed system of accountability established by the Constitution.
On remand, while largely ignoring the Supreme Court’s holding that Amtrak is a government entity, the D.C. Circuit again invalidated the statute, with only slight modifications to its earlier approach.
Rather than holding that the statute improperly delegated to Amtrak as a private entity, the court read Carter Coal broadly as prohibiting delegations of authority to any self-interested party.
The court concluded that, because the PRIIA enabled Amtrak, a self-interested market participant, to regulate its competitors, the statute violated due process.
Whereas the Supreme Court majority had focused on the role of government oversight in determining that Amtrak was a government entity,
the D.C. Circuit, again relying on Carter Coal, found that no level of oversight was sufficient to remedy the conflict of interest, and therefore the due process violation, inherent in Amtrak’s regulatory role.
C. Principles of Private Delegation
The D.C. Circuit’s obstinacy in light of the Supreme Court’s ruling is notably extreme, but the Amtrak saga nevertheless demonstrates that private delegation considerations continue to animate courts’ assessments of the role of private actors in government. While there is a lack of agreement among courts and scholars as to the exact factors to be considered in evaluating delegations to private actors,
or even whether delegation to private actors is permissible under any circumstances,
common themes emerge from the jurisprudence: emphasis on government oversight and a desire to limit potential conflicts of interest. At the heart of both concerns is the need to maintain the role of the Constitution, and therefore the public, in directing the exercise of government power.
1. Limiting Conflicts of Interest. — A central concern in private delegations is that “private entities may face significant conflicts of interest or other tensions with public goals as a result of market incentives or professional culture.”
This idea played a key role in Carter Coal, in which the Court objected to the ability of private parties in the coal industry to “regulate the business of another, and especially of a competitor.”
The potential for conflicts of interest in government decisionmaking has animated Supreme Court jurisprudence in other areas of the law as well.
For example, in Gibson v. Berryhill, the Court prohibited the Alabama Board of Optometry from adjudicating delicensing proceedings within the optometry industry because individual board members, who were also private practitioners, had a financial interest in the adjudications.
Although the Board itself was a public body, individual members stood to gain by delicensing optometrists who might compete with their own private businesses,
leading the Court to conclude that “those with substantial pecuniary interest in legal proceedings should not adjudicate these disputes.”
The concern that conflicts of interest will prevent private actors from wielding government power in a neutral way was echoed nearly eighty years after Carter Coal in the Amtrak case, when the D.C. Circuit expressed concern that the power given to Amtrak allowed self-interested motives to overshadow consideration of the public good in the regulatory process.
Even on remand, after the Supreme Court held that Amtrak was not a private entity for the purposes of nondelegation, the D.C. Circuit stressed the self-interested nature of Amtrak’s regulatory power in invalidating the statute.
As noted above, the court’s intransigence was extreme, but these opinions nevertheless emphasize the potential for “[s]kewed incentives” that may lead a private delegate to abuse public authority for personal gain.
2. Ensuring Government Oversight. — A second theme that emerges from both the early Supreme Court cases and more recent lower court cases is that delegations of authority to private actors may be constitutional when mechanisms for government oversight of private decisions are in place.
As Verkuil explains, “When powers are delegated . . . Congress has a constitutional stake in the process. Its job is to assure the People that the work of government stays in the hands of those responsible for its execution.”
One function of government oversight under the private delegation doctrine is to ensure that private parties act within the scope of their delegated power.
In this way, the role of oversight in a private delegation aligns with the intelligible principle standard, which functions in part to enable judicial review of an agency’s compliance with the legislative will.
By requiring the delegating statute to provide an oversight mechanism, the private delegation doctrine emphasizes the government’s role in enforcing the boundary between the proper use of authority and a deviation from the delegated power. A secondary function of government oversight may be to monitor private delegations for the conflicts of interests discussed above
—a principle illustrated by Geo-Tech Reclamation Industries, Inc. v. Hamrick.
In Hamrick, the Fourth Circuit explained that it was “unable . . . to discern within the language [of the statute] any meaningful standard” to oversee the actions of the private party, allowing administrative decisionmaking to be “subservient to selfish or arbitrary motivations.”
By this reasoning, government supervision of private delegations accomplishes two things: It allows a public actor to review the delegate’s actions for compliance with the legislature’s intent and, through this review, limits the private actor’s ability to use its public power for private ends.
The emphasis on government supervision may also serve to maintain a formal government presence in the private exercise of government power. While a private actor may be tasked with administrative and advisory duties,
key decisions about the policy and structure of the delegation must be made and explained by the legislature.
This idea is reflected in the Supreme Court’s New Deal discussions of private delegation as well. Four years after Carter Coal, the Court upheld a later incarnation of the Bituminous Coal Act, which gave the National Bituminous Coal Commission the power to set minimum coal prices after considering proposals made by local boards of coal producers.
The statute provided that these proposals could be “approved, disapproved, or modified by the Commission,” and allowed the local boards to serve “as an aid to the Commission but subject to its pervasive surveillance and authority.”
In short, the Court determined that because the local boards “function[ed] subordinately to the Commission” and were not entrusted with the power to set prices, there was no improper delegation.
Private actors may therefore play a role in government, but oversight mechanisms cannot be superficial, and a government actor must be actively involved in evaluating and moderating the decisions of the private delegate.
IV. CREATING CONSTITUTIONAL ACCOUNTABILITY IN RISK ASSESSMENT
The use of privately developed risk assessment tools in sentencing presents the same concerns that the private delegation doctrine is intended to address: a lack of government oversight and the potential for private self-interest to overshadow the public good.
While a court is unlikely to strike down the use of these tools in sentencing as an impermissible delegation,
private delegation principles can provide a useful framework for understanding the constitutional accountability gap that arises from the use of these tools.
Framing the work of risk assessment developers in private delegation terms, this Part proposes legislative remedies to close the accountability gap created by the use of algorithmic tools in sentencing. Section IV.A identifies oversight failures and conflicts of interest in the use of privately developed risk assessment instruments as the source of the accountability gap discussed in section II.C. Section IV.B then proposes legislative remedies to mitigate these issues, restoring accountability to the use of these technologies.
A. Applying Private Delegation Principles to Risk Assessment
As discussed in Part II, judges lack the necessary information to review and apply risk assessment algorithms in a critical and meaningful way, giving the private developers of these tools a significant role in sentencing determinations.
This section applies a private delegation lens to this problem, concluding that the sparse statutory regimes that govern risk assessment provide limited oversight and fail to check conflicts of interest, giving rise to the accountability gap discussed in section II.C.
1. Potential for Conflicts of Interest. — On the surface, the potential for private interests to overshadow the public good in the risk assessment context appears minimal because the incentives of private risk assessment developers align with those of the government actors applying their tools. Judges want to accurately predict an offender’s recidivism risk to ensure the efficient use of government resources,
minimize the risk of future crime,
and protect their own political reputations.
Likewise, risk assessment developers are motivated to produce accurate tools to increase the demand for their products. In this sense, the public and private incentives are closely aligned, perhaps assuaging concerns that risk assessment developers may abuse the power that is delegated to them.
But the private motives of algorithm developers nevertheless influence the way that these tools are used. As noted in section II.A, private companies may rely on trade secret protections to prevent disclosure of information about their software out of concern that competitors will exploit this information for competitive advantage.
Furthermore, some have suggested that developers avoid disclosure not solely out of concern about misappropriation of their intellectual property but also to avoid added scrutiny of their product that might negatively impact their business.
Likewise, during design, developers may opt for a machine learning model that is more likely to produce false positives than false negatives,
since recidivism by a “low risk” offender after release is worse for business than the imprisonment of someone deemed “high risk” who actually would not have reoffended.
Private motivations therefore have the potential to influence how risk assessment algorithms function, as well as how judges are able to interpret their results, to the detriment of the public good.
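A small sketch on synthetic data illustrates the tradeoff: lowering the decision threshold at which a defendant is labeled high risk produces more false positives and fewer false negatives, and the developer, not the legislature or the court, picks that threshold. The data and thresholds below are invented and reflect no actual vendor’s design.

# Illustrative sketch (synthetic data): shifting the decision threshold trades
# false negatives (missed recidivists) for false positives (needless "high
# risk" labels).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 2))
y = (X[:, 0] + rng.normal(scale=1.5, size=1000) > 0).astype(int)

model = LogisticRegression().fit(X, y)
probabilities = model.predict_proba(X)[:, 1]

for threshold in (0.5, 0.3):
    predictions = (probabilities >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y, predictions).ravel()
    print(f"threshold {threshold}: false positives={fp}, false negatives={fn}")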
2. The Lack of Government Oversight. — The private delegation lens also reveals clear gaps in government oversight of the companies developing risk assessment tools. Often, the statutory frameworks requiring or permitting the use of risk assessment tools in sentencing are sparse and provide limited instruction on how risk scores should be applied.
Because of legal and technological opacity,
judges are unable to review the choices made by private developers to ensure that a given risk assessment algorithm functions in a way that is consistent with the purpose and scope of the delegation,
or even with established sentencing policy.
Furthermore, existing statutory frameworks often provide no instruction to tool developers on how to construct risk assessment tools, leaving these private actors to make their own policy determinations when gathering data, defining recidivism, and developing an algorithm.
At the same time, these statutes often provide little direction to judges on how to apply the risk scores they receive, leaving application to judicial discretion; yet judges have so little information about the meaning of the risk score in front of them that it may be difficult to exercise this discretion in a meaningful way.
The existing statutory frameworks permitting or requiring the use of predictive risk assessment tools create an overly broad and undefined delegation of power to private actors in sentencing.
To give judges a meaningful opportunity to review risk scores, tool vendors must provide additional information about how their products are developed and function. Likewise, more information is needed from legislators about the purpose and policies of using risk assessments in sentencing, enabling courts to review the actions of private developers for conformity and to apply risk assessment results in a way that promotes meaningful judicial discretion.
B. Restoring Accountability and Oversight
Building on section IV.A’s identification of oversight failures and conflicts of interest in the use of privately developed risk assessment algorithms, this section proposes legislative solutions to close the accountability gap arising from the use of these tools. Legislative and regulatory changes, as opposed to litigation, are likely to be effective means of increasing accountability in the use of privately developed risk assessment algorithms. Although enforcement of the private delegation doctrine may be more common at the state level,
courts are still reluctant to invalidate existing delegations to private entities.
In the context of privately developed risk assessments, in which the complexity and private nature of the technology limit government oversight and control, the enforcement of accountability through judicial review is reduced.
While a private delegation challenge is unlikely to succeed in court, the doctrine highlights the legislature’s policymaking role and provides important principles for developing legislative remedies. State legislatures and sentencing commissions often have the resources to develop comprehensive, informed solutions,
whereas courts may not be able to craft widely applicable, nuanced reforms within the confines of a single case.
To restore constitutional accountability to the algorithm-assisted sentencing process, the structure of the laws permitting the use of this technology must be modified.
As Verkuil explains, when “making public actions ‘private,’ . . . [delegations] should come with strings attached that ensure fairness at the individual level and accountability at the political level.”
Because of its focus on the structure of the relationship between public and private actors, the private delegation doctrine provides a useful framework for designing legislative and regulatory remedies that address the need for increased information from both risk assessment developers and legislators.
Even without expertise in statistics or computer science, thoughtful legislators can reshape the way in which risk assessment algorithms are used and restore accountability through more purposive and deliberate lawmaking.
To increase constitutional accountability in this delegation, state statutes should require sentencing commissions to develop additional guidelines and reporting requirements for the use of risk assessment algorithms in sentencing.
Sentencing commissions should identify the dataset from which a tool will be developed, determine the definition of recidivism to be used in training the algorithm, and set a maximum acceptable error rate for recidivism risk prediction. Sentencing commissions may be particularly well-positioned to carry out these policymaking tasks, as their membership typically includes judges, district attorneys, public defenders, academic experts in criminal justice, and community members.
By requiring a public actor to make these key decisions, legislators can ensure that critical policy decisions remain in the hands of publicly accountable officials and provide judges with greater transparency around the choices that shaped the algorithm.
In addition to prescribing policies and methodologies for the development of these algorithms, risk assessment statutes should also direct sentencing commissions to establish reporting procedures for tool developers, requiring them to provide error rates, discrepancies in accuracy among different populations, comparative rates of false positives and negatives, and other basic information about the performance of their algorithms.
By requiring this reporting, sentencing commissions can provide judges with necessary context for understanding both an offender’s risk score and the broader limitations of algorithmic prediction. With this information readily available, sentencing judges may be less inclined to rely heavily on algorithmic outputs, developers may be faced with fewer demands for proprietary source code, and other government actors may be better equipped to assess the efficacy of these tools. In returning key policymaking functions to a government body and enabling judges to better understand the function and limitations of risk assessment tools, legislators can ensure that adequate standards are in place to guide tool developers
and that public actors maintain “surveillance and authority” over the actions of these private parties.
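As a rough illustration of what such reporting might involve, the sketch below computes an overall error rate and per-group false positive and false negative rates from a small, invented validation table; a commission could require developers to disclose figures of this kind for each jurisdiction in which a tool is deployed. The column names and groups are illustrative assumptions.

# Hypothetical reporting computation: overall error rate and per-group false
# positive and false negative rates from a (tiny, invented) validation set.
import pandas as pd

validation = pd.DataFrame({
    "group":     ["A", "A", "A", "B", "B", "B"],
    "predicted": [1, 0, 1, 1, 1, 0],  # 1 = labeled high risk
    "actual":    [1, 0, 0, 0, 1, 1],  # 1 = actually recidivated
})

def report(df):
    error_rate = (df["predicted"] != df["actual"]).mean()
    false_positive_rate = (
        ((df["predicted"] == 1) & (df["actual"] == 0)).sum()
        / max((df["actual"] == 0).sum(), 1)
    )
    false_negative_rate = (
        ((df["predicted"] == 0) & (df["actual"] == 1)).sum()
        / max((df["actual"] == 1).sum(), 1)
    )
    return pd.Series({
        "error_rate": error_rate,
        "false_positive_rate": false_positive_rate,
        "false_negative_rate": false_negative_rate,
    })

print("overall:\n", report(validation), sep="")
print("\nby group:\n",
      validation.groupby("group")[["predicted", "actual"]].apply(report),
      sep="")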
CONCLUSION
As the applications of machine learning expand and government willingness to contract with private developers of algorithmic risk assessment tools increases, it is important to take stock of the accountability issues that arise from the public use of private technology. To do so, this Note explores the role that private actors, who operate outside traditional mechanisms of constitutional accountability, play in shaping the outcomes these tools produce. The private nature of many recidivism risk assessment algorithms leaves sentencing judges unable to understand and adequately apply their results, leading to a greater reliance on the policy decisions of private developers. As a result, private actors are given an outsized role in sentencing, and the legislative purpose of risk assessment statutes is undermined. To remedy this problem, this Note uses the private delegation doctrine as a framework to improve judicial engagement with algorithmic risk scores. Because existing statutory frameworks allow private actors to wield government power with limited public oversight and control, legislators must increase the specifications included in risk assessment statutes, bolstering the ability of judges to understand and apply these technologies.