Words or Code First? Is the Legacy Document or the Inherent Structure of the Bargain the Starting Point for Contract Automation?

by Oliver R. Goodenough, CodeX Affiliated Faculty

The Insurance Initiative of CodeX, the Stanford Center for Legal Informatics, is investigating how to convert the contracts which represent insurance policies into native expressions of computer code.[1]  This will allow the automation of many of the processes relating to such contracts, from their formation and administration to advanced analytic tools for both customers and companies.  The central element in this approach is restating the insurance bargains as “computable contracts.”  As we define them, with a measure of mathematical precision, a computable contract is “one that is specified in sufficient detail to provide unambiguous answers to questions about compliance of clearly specified circumstances with the terms and conditions of the contract.”  If the bargains of insurance can be represented precisely, their automation becomes a tractable problem – complex, but doable.

That said, there is a fundamental question about the starting point for the process: should the goal be (i) translating the provisions of the legacy, word-based formulations that we have used to formalize insurance bargains for centuries, or should it be (ii) taking a new look at the underlying, word-independent structure of the insurance bargain and seeking to maximize the utility of a native, computer-centered representation of that bargain?  Many of the initiatives in contract automation take the first option as their starting point.  While that approach can produce useful work, I believe that it will ultimately be too limiting.  Rather, we should consider how best to represent the understanding about behaviors, rules and outcomes that make up insurance bargains directly and natively in code.

A number of benefits will flow from this.  First, it will enable the code embodying the bargain to be realized in relatively good programming, rather than tracking the kludgy and convoluted necessities that arise from using natural language as a programming medium.  Second, it will enable deploying interfaces that can liberate the user experience from the shackles of ponderous, opaque legal language, instead enabling graphical techniques and other design elements to communicate the terms.  The improvement of Windows over DOS provides a useful historical analogy.  And third, it will allow – and encourage – the designers of insurance bargains to develop new policies that would have been hard to conceptualize in words but can be more easily envisioned in a code-first environment.

Notwithstanding these and other potential benefits from a code-first approach, a number of initiatives are putting the traditional document, and the word-based formulas it contains, at the center of the automation process.  Many of the legal specification projects currently in development reflect this word-first methodology.  Translating existing documents has attractive features.  To begin with, it respects the traditions of legal drafting, providing comfort to adopters and regulatory or judicial reviewers who don’t want to be challenged.  As a part of this comfort, a “doppelganger” text-based document can often exist or be generated.  No need for judges – or the legal department in the insurance company – to trouble themselves about understanding something new.

Second, the word-first strategy can be a bit easier to deploy, at least as an initial matter.  Rather than beginning with stepping back and conceptualizing the bargain in its totality, the conversion process just requires a reasonably skilled practitioner to make a step-by-step translation of the data and logic in the contract as it currently exists.

Third, natural language processing may be able to assist this translation in a very direct way, acting in place of that reasonably skilled practitioner, at least with respect to creating a first draft in code that can be debugged and corrected in a later review.

Finally, such a word-centric approach better enables using a deontic vocabulary in the transition to code, that is, one employing the language of ethical obligation to formulate duties.  Concepts like “should” and “ought” have been built into the conversion methodology of some of the legal model languages that start with the word text and then look for a conversion.  A discussion of whether this is actually useful in specifying legal formulations as opposed to philosophical statements requires a much longer treatment than this short paper allows; the short, assertive version reflects my belief that one of the functions of law is to convert deontic moral precepts into actionable if/then propositions which change behavior by changing the payoff matrix around that behavior through penalties, damages and coercion, and not by making people feel bad about what they have done.[2]  Deontic language may creep into legal formulations, but it isn’t a property that clear legal specification needs to (or, in my opinion, and being deontic about it, should) encourage.

But all of these possibly positive factors, in my opinion, simply encourage people to adopt strategies that will impede the full development of the advantages of a computable contract-based automation program.  Consider an analogy.  When steam and other power sources became available to help mechanize transportation, some of the earliest efforts involved creating a steam-powered mechanical horse.  The full development of trains, cars, bicycles, etc. could only advance once the model of the horse was discarded and other configurations, more native to “self-moving” (“auto mobile”) through mechanical means, were explored and adopted.  I believe the same is true of representing legal expression in code.

The hard step, as mentioned above, is that the code-first approach needs experts to conceptualize abstractly the structure of the bargain at the heart of an insurance policy, and then to create a platform that (i) allows this structure to be represented efficiently and accurately in code, (ii) allows variations in the design of the specific elements of a particular family of bargains to be easily inserted or removed, probably from a somewhat general template for that family, and (iii) provides an interface for the various users of such a platform – policy designers, customers and brokers, policy and claims administrators, risk analysts, regulators, etc. – that allows them relatively easily to design variations, perhaps through a mixture of interactive features ranging from words to toggles and drop-down menus, and to extract the information or action they need as the policy is administered.

Luckily, the insurance bargain is particularly open to such an approach.  To begin with, it has a recurring structure that shapes its terms.  The details within the terms may vary or have considerable complexity, but that variation and complexity generally fit within this recurring structure.  The elements of this structure are:

  1. Is the policy in effect at the time of the events that give rise to a claim? By way of example, is the policy contract signed?  Has the premium been paid?  Has the term of the coverage expired?
  2. Have any conditions necessary to keep coverage in effect been met? Has a required periodic fire safety inspection occurred?  Or a required doctor visit?
  3. Has a claim proving the occurrence of the required elements for a valid indemnity claim been made? The contract may lay out different pathways for what constitutes such a valid claim.
  4. Is the claim disallowed because of the existence of an applicable exclusion? Was the accident caused by a prohibited activity, such as skydiving, or by driving while intoxicated?  Was the business interruption caused by a pandemic?
  5. If a valid claim has been made, what is the amount of any payment due, applying the arithmetic of the policy? Are there deductibles? Co-pays? Maximum pay-outs?
  6. Ancillary matters relating to administration and disputes. What law applies?  For a dispute, courts or arbitration?  What is the time frame for steps in the claims process?  Where are notices and claims sent?
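
The first five elements above can be read as an ordered sequence of checks followed by a calculation.  The Python sketch below is one illustrative way to lay that out; it is not the Insurance Initiative's platform, and every name in it (Policy, Claim, evaluate_claim, the field names and the sample figures) is a hypothetical assumption made for the example.  The co-pay is assumed, for simplicity, to be a fixed share of the loss remaining after the deductible.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Policy:
    # Element 1: is the policy in effect?
    signed: bool
    premium_paid: bool
    start: date
    end: date
    # Element 2: have ongoing conditions (inspection, doctor visit) been met?
    conditions_met: bool
    # Element 4: causes that defeat an otherwise valid claim.
    excluded_causes: set = field(default_factory=set)
    # Element 5: the arithmetic of the policy (all figures hypothetical).
    deductible: float = 0.0
    copay_rate: float = 0.0          # insured's share after the deductible
    max_payout: float = float("inf")
    # Element 6 (governing law, notice addresses, etc.) is metadata that
    # this sketch does not evaluate.


@dataclass
class Claim:
    event_date: date
    cause: str
    proven: bool                     # Element 3: required elements shown?
    loss: float


def evaluate_claim(policy: Policy, claim: Claim) -> tuple:
    """Walk elements 1 to 5 in order; return (payment, reason)."""
    in_effect = (policy.signed and policy.premium_paid
                 and policy.start <= claim.event_date <= policy.end)
    if not in_effect:
        return 0.0, "policy not in effect"
    if not policy.conditions_met:
        return 0.0, "condition not met"
    if not claim.proven:
        return 0.0, "claim not established"
    if claim.cause in policy.excluded_causes:
        return 0.0, "excluded cause"
    after_deductible = max(claim.loss - policy.deductible, 0.0)
    payment = after_deductible * (1 - policy.copay_rate)
    return min(payment, policy.max_payout), "payable"


policy = Policy(signed=True, premium_paid=True,
                start=date(2024, 1, 1), end=date(2024, 12, 31),
                conditions_met=True, excluded_causes={"skydiving"},
                deductible=500.0, copay_rate=0.25, max_payout=10_000.0)
print(evaluate_claim(policy, Claim(date(2024, 6, 1), "fire", True, 3000.0)))
# (1875.0, 'payable')
```

In this layout the policy is data and the six-element structure is the query run against that data, which is the division of labor the text goes on to describe.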

I believe that an effective, easily adapted, and widely deployable platform for representing insurance bargains directly in code can be created by using these elements as a starting point, and not the language of particular policies.  That isn’t to say that specific policies won’t be extremely useful in figuring out coverage methods and targets, but they will stand as a resource, and not as the object of line-by-line translation.  This recurring general structure can inform how we program the data and the queries to that data that together make up a policy.  With some additional granularity, the “local logic” of a particular family of policy variations – life, casualty, cyber, workplace – can be realized as a starting point for a template within such a platform.  In such a case, the platform will also provide users with the ability to make both standard and bespoke variations within the boundaries of such a family and of such a template.  These code-first representations should demonstrate greater flexibility, capacity and efficiency than a dogged policy-by-policy direct automation of the words could accomplish.
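
The idea of a family template with standard and bespoke variations can be illustrated with a small Python sketch.  This is only an assumption about how such a template might look, not a description of any existing platform; PolicyTemplate, CASUALTY_TEMPLATE, make_variation and all of the values are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PolicyTemplate:
    family: str                  # e.g. life, casualty, cyber, workplace
    deductible: float
    max_payout: float
    excluded_causes: frozenset


# A hypothetical general template for one family of bargains.
CASUALTY_TEMPLATE = PolicyTemplate(
    family="casualty",
    deductible=500.0,
    max_payout=10_000.0,
    excluded_causes=frozenset({"driving while intoxicated"}),
)


def make_variation(template: PolicyTemplate, **overrides) -> PolicyTemplate:
    """Derive a variation without altering the template itself."""
    if "family" in overrides:
        # A variation stays within the boundaries of its family.
        raise ValueError("a variation cannot change the policy family")
    # dataclasses.replace copies the template and rejects unknown fields.
    return replace(template, **overrides)


low_deductible = make_variation(CASUALTY_TEMPLATE, deductible=250.0)
print(low_deductible.deductible, low_deductible.max_payout)
# 250.0 10000.0
```

Because the template is immutable, each variation is a new object, so the family default remains available as a baseline for later comparison.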

I also believe that logic programming languages, such as Prolog or Epilog, will be particularly effective for representing these elements, and, just as importantly, for enabling comparisons, gap analysis, portfolio reviews, and other critical byproducts of the automation process.[3]  While such representation in logic programming can be achieved through a direct text conversion approach, I believe that such an approach will, in the end, suffer from the limitations described above. I also believe that a code-first approach, rooted in the core structure of the insurance bargain, will help unlock benefits in user experience, adaptability and analytics.
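
To give a rough sense of what gap analysis and portfolio review could mean once coverage is held as data, here is a minimal sketch.  It is written in Python purely for illustration, not in the logic programming languages recommended above, and it assumes a deliberately simplified model in which each policy is reduced to the set of perils it covers.  The function names and sample perils are hypothetical.

```python
from collections import defaultdict


def coverage_gaps(required_perils, portfolio):
    """Return the required perils that no policy in the portfolio covers."""
    covered = set().union(*portfolio.values())
    return set(required_perils) - covered


def overlapping_cover(portfolio):
    """Return perils covered by more than one policy, with the policy names."""
    by_peril = defaultdict(set)
    for name, perils in portfolio.items():
        for peril in perils:
            by_peril[peril].add(name)
    return {peril: names for peril, names in by_peril.items()
            if len(names) > 1}


# Hypothetical portfolio: policy name -> perils covered.
portfolio = {
    "home": {"fire", "theft"},
    "business": {"fire", "cyber"},
}
print(coverage_gaps({"fire", "flood", "cyber"}, portfolio))
# {'flood'}
```

A real comparison would also have to account for conditions, exclusions and limits, which this sketch ignores.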

The creation of a proof-of-concept of such a platform is a project we are currently pursuing in the Insurance Initiative, where it can be tested against other approaches.  Stay tuned, as they used to say in the legacy technology of broadcast television.


[1] See, generally, Goodenough, O., & Salkind, S. (2022). Computable Contracts and Insurance: An Introduction. MIT Computational Law Report. Available at https://law.mit.edu/pub/computablecontractsandinsuranceanintroduction. And see the material at https://law.stanford.edu/codex-the-stanford-center-for-legal-informatics/codex-insurance-initiative/.

[2] For a somewhat more developed discussion, see Flood, M.D., Goodenough, O.R. Contract as automaton: representing a simple financial agreement in computational form. Artif Intell Law (2021). https://doi.org/10.1007/s10506-021-09300-9 at p 397. The approach argued here is consistent with Justice Holmes’s formulation of the “bad man theory of law,” which takes as its starting point the idea that law as viewed by a potential malefactor is the set of consequences that would flow as a result of the bad deed.  See, e.g., Holmes, O.W., The Path of the Law, 10 Harv. L. Rev. (1897) and Perry, S. Holmes versus Hart: The Bad Man in Legal Theory. In S. Burton (Ed.), The Path of the Law and its Influence: The Legacy of Oliver Wendell Holmes, Jr. (Cambridge Studies in Philosophy and Law, pp. 158-196). Cambridge: Cambridge University Press (2000). doi:10.1017/CBO9780511527432.009.

[3] See, e.g., Goodenough, O.R. and Genesereth, M. (2022). Why a Logic Programming Approach Works for Automating Insurance Contracts. Release forthcoming.