Artificial Intelligence (AI) capabilities are rapidly advancing. Highly capable AI could produce radically different futures depending on how it is developed and deployed. We are currently unable to specify human goals and societal values in a way that reliably directs AI behavior. Specifying the desirability (value) of an AI system taking a particular action in a particular state of the world is unwieldy beyond a very limited set of state-action pairs. The purpose of machine learning is to train on a subset of states and have the resulting agent generalize an ability to choose high-value actions in unencountered circumstances. Inevitably, the function ascribing value to an agent’s actions during training is an incomplete encapsulation of human values, and the training process is a sparse exploration of the states pertinent to all possible futures. AI is therefore deployed with a coarse map of human-preferred territory and will often choose actions unaligned with our preferred paths.
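The incompleteness of any hand-written value specification can be made concrete with a toy sketch (not from the paper; the domain, states, and actions below are invented for illustration): a lookup table assigns values to a handful of state-action pairs, leaving the agent without guidance in the vast majority of situations it will encounter.

```python
# Hypothetical toy domain: states are room labels, actions are chores.
# Only a few state-action pairs have explicitly specified values.
specified_values = {
    ("kitchen", "clean_dishes"): 1.0,
    ("kitchen", "break_dishes"): -1.0,
    ("hallway", "vacuum"): 0.5,
}

def value_of(state, action):
    """Return the specified value, or None if the pair was never specified."""
    return specified_values.get((state, action))

all_states = ["kitchen", "hallway", "garage", "garden", "office"]
all_actions = ["clean_dishes", "break_dishes", "vacuum", "mow_lawn"]

# Count how much of the full state-action space the specification covers.
covered = sum(
    value_of(s, a) is not None for s in all_states for a in all_actions
)
total = len(all_states) * len(all_actions)
print(f"specified: {covered}/{total} state-action pairs")
# -> specified: 3/20 state-action pairs
```

Even in this tiny world, 17 of 20 pairs are unspecified; a trained agent must generalize values to them, and nothing guarantees its generalization matches human preferences.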
Law-making and legal interpretation convert opaque human goals and values into legible directives. Law Informs Code is the research agenda embedding legal processes and concepts in AI. Just as parties to a legal contract cannot foresee every potential “if-then” contingency of their future relationship, and legislators cannot predict all the circumstances under which their bills will be applied, we cannot ex ante specify “if-then” rules that provably direct good AI behavior. Legal theory and practice offer arrays of tools to address these problems. For instance, legal standards allow humans to develop shared understandings and adapt them to novel situations, i.e., to generalize expectations regarding actions taken in unspecified states of the world. In contrast to more prosaic uses of the law (e.g., as a deterrent of bad behavior), when the law is leveraged as an expression of how humans communicate their goals and of what society values, Law Informs Code.
We describe how data generated by legal processes and the tools of law (methods of law-making, statutory interpretation, contract drafting, applications of standards, and legal reasoning) can facilitate the robust specification of inherently vague human goals to increase human-AI alignment. Toward society-AI alignment, we present a framework for understanding law as the applied philosophy of multi-agent alignment, harnessing public law as an up-to-date knowledge base of democratically endorsed values ascribed to state-action pairs. Although law is partly a reflection of historically contingent political power – and thus not a perfect aggregation of citizen preferences – if properly parsed, its distillation offers the most legitimate computational comprehension of societal values available. Other data sources suggested for AI alignment – surveys, humans labeling “ethical” situations, or (most commonly) the beliefs of the AI developers – lack an authoritative source of synthesized preference aggregation. Law is grounded in verifiable resolutions: ultimately obtained from a court opinion, but short of that, elicited from legal experts. If law informs powerful AI, engaging in the deliberative political process to improve law would take on even more meaning.