Enforceable Content Management

A recent CyberProf post by Lauren Gelman (and reply by James Grimmelmann) highlighted an incident between Facebook and a self-professed webcrawler, Pete Warden. What did Warden do to allegedly earn the ire of the mother of all social networks? He grabbed user data, aggregating it into a giant dataset, portions of which he apparently intended to distribute for commercial gain.

Warden cried foul. He claimed his activity was allowed by Facebook’s electronic “mall cop”: robots.txt. (BTW: the “mall cop” is my own designation, not Warden’s.) Facebook, according to his account, threatened him with legal action. It’s not clear what exactly happened (Warden also claimed he got sued). Let’s just agree that that part doesn’t really matter for the purpose of this post.

In his reply to Lauren, James brought up an intriguing point. He noted, among other things, that the debate about the enforceability of robots.txt is making the case for the Automated Content Access Protocol (ACAP) and other robots.txt extensions.

So let’s take a look at ACAP. With the stated mission of “improving universal access to content,” the ACAP folks offer a format that signals terms and conditions to webcrawlers in ways they claim robots.txt falls short. It is implemented through a tool they offer, which apparently adds a variety of “allow” and “disallow” parameters to the content owner’s existing robots.txt file.
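For readers who haven’t peeked under the hood, here is a minimal sketch of how a well-behaved crawler consults robots.txt today, using Python’s standard urllib.robotparser. The robots.txt content and the “ACAP-style” comment line are illustrative only; they are not taken from Facebook’s file or from the ACAP specification.

```python
# Minimal sketch (not anyone's production code) of a crawler checking
# robots.txt before fetching pages, via Python's standard library.
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt. The commented "ACAP-..." line stands in for the
# kind of extended allow/disallow vocabulary ACAP layers on top of the file;
# the exact directive name here is hypothetical, and parsers that don't
# understand such extensions simply ignore them.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /public/

# ACAP-disallow-crawl: /members/   (ACAP-style extension, illustrative only)
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

for path in ("/public/profile.html", "/private/data.json"):
    url = "https://www.example.com" + path
    verdict = "allowed" if parser.can_fetch("MyCrawler", url) else "disallowed"
    print(path, "->", verdict)
```

Note what is missing from this exchange: nothing binds the crawler. The site publishes its wishes, the crawler chooses whether to honor them, and ACAP’s extensions enrich the vocabulary without changing that basic voluntariness.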

Ok. Interesting. But query whether ACAP is enforceable. I don’t think it is.

On its website, ACAP declares that “Implementation today is largely symbolic and part of the campaign to encourage search engines and other aggregators to adopt ACAP and respect the online copyright of content providers.” Folks, symbolic + encourage + respect ≠ enforceable.

But I want to be clear that I am not completely dismissing the ACAP initiative just because it is not, ab initio, enforceable. I think I’m agreeing with James’ point when I say that it does appear to embody certain incremental-improvement qualities over the venerable robots.txt. Yes, it is a step in the right direction of seeking to more efficiently communicate terms and conditions to non-humans. But it’s just a step, a minor update, if you will. As such I score it as “robots.txt version 1.15.”

So, how do we do better, much better? Enter the CodeX Autonomous Intelligent Cyber Entity (AiCE, pronounced “ice”) project. AiCE represents a game-changing paradigm for many on-line activities, not just the content management game. The following is a very brief project snapshot, but sufficient to make the point.

Using the same normative and utilitarian framework that gave rise to corporate entities, the CodeX AiCE project proposes a new, legally sanctioned entity: one that is software-based and programmatically sophisticated enough to earn the “autonomous” and “intelligent” labels.

These two labels identify the independent decision-making and computational law qualities designed into AiCE. Consequently, AiCE is able to evaluate a broad range of relevant laws; enter into enforceable contracts; evaluate and analyze legal and other website metadata; monitor and report on an on-line business’s activities to the relevant parties; and so on. The spectrum of applications is limited only by imagination and computational power.

From this brief description you should already be able to discern that, in a content management configuration, AiCE can perform a wide variety of sophisticated missions. For example, it can dictate to a (non-AiCE) webcrawler what portions of a given site’s content are available, intervene in a timely manner (an enforceability quality) if those instructions are not observed, initiate legal action where necessary (e.g., issue a cease and desist), and so on. Where the webcrawler happens to be another AiCE, the possible range of interactions expands significantly.
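To make the dictate/monitor/escalate loop concrete, here is a purely hypothetical Python sketch of an AiCE-style gatekeeper. Every name in it (AccessTerms, AiceGatekeeper, the three-strikes threshold) is invented for illustration; the CodeX project defines no such API, and a real AiCE would be far more sophisticated.

```python
# Purely hypothetical sketch of an AiCE-style content gatekeeper.
# Class and method names are invented for illustration; this is not
# an API from the CodeX AiCE project.
from dataclasses import dataclass, field


@dataclass
class AccessTerms:
    """Machine-readable terms an AiCE would publish to crawlers."""
    allowed_paths: tuple = ("/public/",)
    commercial_reuse: bool = False


@dataclass
class AiceGatekeeper:
    terms: AccessTerms
    violations: list = field(default_factory=list)

    def evaluate_request(self, crawler_id: str, path: str) -> bool:
        """Dictate which portions of the site the crawler may access."""
        allowed = any(path.startswith(p) for p in self.terms.allowed_paths)
        if not allowed:
            self.record_violation(crawler_id, path)
        return allowed

    def record_violation(self, crawler_id: str, path: str) -> None:
        """Timely intervention: log the breach and decide whether to escalate."""
        self.violations.append((crawler_id, path))
        if len(self.violations) >= 3:  # illustrative threshold
            self.escalate(crawler_id)

    def escalate(self, crawler_id: str) -> None:
        """Stand-in for initiating legal action, e.g. a cease-and-desist."""
        print(f"[AiCE] issuing cease-and-desist notice to {crawler_id}")


gatekeeper = AiceGatekeeper(AccessTerms())
for path in ("/public/a", "/members/b", "/members/c", "/members/d"):
    print(path, "->", gatekeeper.evaluate_request("crawler-42", path))
```

The point of the sketch is structural: intervention and escalation are built into the entity itself rather than left to after-the-fact human follow-up, which is what distinguishes this from the purely advisory robots.txt exchange shown earlier.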

AiCE’s enforceability qualities are not envisioned to be confined to technical attributes. The project proposes to render AiCE activities enforceable through a new (still at the concept/under-construction phase) law: the Uniform AiCE Transactions Act (UATA).

UATA is a novel, intelligent, computational law framework envisioned to govern all types of AiCE activity. It will, in part, be based on an amalgam of various established legal doctrines and will borrow (as necessary) from a wide variety of legal resources such as UETA, UCITA and the American Law Institute’s Principles of the Law of Software Contracts. It will also serve as the legal framework that endows AiCE with the legally sanctioned, corporate-like entity status mentioned earlier.

(Note: UATA is an ambitious and critical undertaking. Currently it is being worked on within the AiCE project, but as it continues to grow it may very well need to be segmented out for semi-parallel pursuit.)

In sum, James’ note about the enforceability of robots.txt is a valid and valuable reminder that effective (i.e., enforceable) content access tools are necessary to promote healthy universal access. The AiCE platform is envisioned to be the tool of choice to that end.

-Eran