Context Stewardship: What Source-by-Source Authorization Misses

Abstract

AI systems access data. That access is governed. Each source is reviewed, scoped, and authorized. The governance frameworks that manage this process are mature and well understood. But AI systems do not merely access data. They combine it. And the combination produces something that no individual authorization addressed: inference. An AI that reads a calendar, an email account, and a CRM can derive a health condition that exists in none of those sources individually. The inference was never permitted. It was never prevented. It sits in a gap that current frameworks do not recognize. This essay introduces Context Stewardship, an interpretive lens within the AI Life Cycle Core Principles framework, to name that gap and propose controls for it. Context Stewardship reframes six existing governance principles around a single question: what risks emerge when authorized data sources are combined? It introduces Inference Boundary Mapping, a documented specification that defines what an AI system may and may not derive from combined context. The argument is that data access is not the right unit of governance for AI systems that synthesize across sources. Data combination is. Until frameworks reflect that distinction, organizations will continue to govern what AI can reach while ignoring what it can infer.

Between Access and Inference

An AI reads your calendar. That is authorized.

It reads your email. No problem. That is authorized too.

It queries your company’s CRM. Authorized, for certain fields.

Each access was scoped. Each was reviewed. Each was approved under data control frameworks designed to ensure that AI systems reach only the data they are permitted to reach.

But no one authorized what happens next. The AI combines what it found. A medical appointment on the calendar. Test results mentioned in an email thread. An insurance tier in the CRM. From these, it derives a health condition. No single source contained that information. No single authorization contemplated it. The inference was never permitted. It was simply never prevented.

This is the gap that current data governance frameworks do not see because they govern access. They do not govern synthesis. They ask whether the AI may reach a data source but not what becomes possible when authorized sources are combined.

This gap, between source-level authorization and synthesis-level inference, is the most under-addressed data governance risk in enterprise AI deployment today. Context Stewardship, which I explain in more detail below, is designed to address it.

Context Stewardship is part of the AI Life Cycle Core Principles (AILCCP) framework. The AILCCP is a system of 37 principles organized across 10 strategic pillars, designed to guide organizations through the full life cycle of AI development, deployment, and decommissioning. Now in its third year of development, the AILCCP spans areas from oversight and accountability to reliability and robustness. (A public-facing version is available here.) It functions as both a compliance roadmap and a strategic tool for managing the legal, ethical, and reputational risks that accompany AI adoption. Context Stewardship is one of the latest additions to the framework. It is not a standalone principle but an interpretive lens that cuts across six AILCCP principles, capitalized here as named principles: Governance, Data Stewardship, Privacy, Consent, Security, and Resilience. It reframes each of these around a single question: what risks emerge from combining authorized data sources that none of those sources would present alone?

The Combinatorial Problem

The authorization pattern I described above is how most organizations operate. Each approval in that sequence was evaluated independently, but no one asked what the combination would make possible.

This reflects a structural limitation in existing frameworks. They were designed for a world in which systems accessed data sources one at a time. They lack vocabulary for what happens when an AI agent moves across sources, synthesizing context as it goes. A system with access to three enterprise sources has a manageable set of possible inferences. A system with access to fifteen has a set that grows far faster than the number of sources, because every new source interacts with all the existing ones. No individual review of each source can meaningfully cover that space.
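
To make that growth concrete, here is a back-of-the-envelope sketch of the arithmetic. It treats every multi-source subset as a potential synthesis surface, which is a simplifying assumption of mine rather than a formal risk model:

```python
def synthesis_surfaces(n_sources: int) -> int:
    """Count the multi-source subsets available to a system that can
    draw on n_sources at once (every subset of two or more sources)."""
    # 2**n total subsets, minus the empty set and the n single-source subsets.
    return 2**n_sources - n_sources - 1

for n in (3, 5, 10, 15):
    print(f"{n} sources -> {synthesis_surfaces(n)} multi-source combinations")
# 3 -> 4, 5 -> 26, 10 -> 1013, 15 -> 32752
```

Reviewing fifteen sources one at a time covers fifteen decisions; the synthesis space those same sources open runs into the tens of thousands.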

The health inference I opened with is simple by design. In practice, the inferences available from combined enterprise data are far more varied and far harder to anticipate. Consider a second example. An AI assistant authorized to access a company’s code repository notices that a senior engineer’s commit frequency has dropped. Individually, that data point means little. People take vacations. They shift to architecture work. They mentor junior staff. But the same AI also has access to the employee’s calendar, where it sees several midday blocks marked “personal.” And it has access to the internal wiki, where it can see that the engineer recently viewed pages on equity vesting schedules and the company’s non-compete policy. No single source signals anything. Combined, the AI infers departure risk. It was never asked to. No authorization contemplated this inference. But nothing prevented it either.

Of course, not every combination of data sources produces problematic inferences. Many combinations are benign or beneficial, and mature organizations do attempt (at least on paper) cross-system risk analysis, tiering AI access by data sensitivity, decision impact, and safety vulnerabilities. Some security teams also treat AI agents as distinct entities with their own access credentials, applying zero-trust principles (the assumption that no entity is trusted by default, and every access request must be verified and scoped) to limit what each agent can reach. But these processes are still emerging, not yet standard, and even where they exist, they tend to operate at the access level rather than the inference level.

How Context Stewardship Reframes Existing Principles

Closing this gap requires reframing the questions those frameworks ask. Context Stewardship does this by cutting across six AILCCP principles and shifting each from source-level compliance to synthesis-level risk.

Three of these reframings carry the most weight. 

Privacy frameworks typically evaluate exposure at the source level. Context Stewardship treats aggregated context as an expanded area of vulnerability. Each additional source the AI can access opens new categories of inference that were unavailable from any individual source. The health inference from the opening illustrates this at a basic level. The departure risk example complicates it further: there, the privacy risk arises not from obviously sensitive data but from the combination of routine signals that no one would think to restrict individually. 

Consent mechanisms inform users about what each system collects, but they never make it clear that an AI drawing on multiple sources can derive things from their combination that no individual consent addressed. Consenting to an AI assistant that reads your calendar is one thing. Understanding that the same assistant cross-references your calendar with your email, your documents, and your CRM records is quite another.

Security controls enforce least-privilege access to individual resources. Context Stewardship introduces a parallel concept: least-context-necessary. An AI that has access to ten enterprise sources but only needs three for a given task should be constrained to those three. Not because the other seven are sensitive in isolation, but because their addition opens inferential possibilities that may be unnecessary and ungoverned.
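
A minimal sketch of how least-context-necessary might be enforced in practice follows. The task names, source labels, and function are hypothetical illustrations, not part of the AILCCP or any particular agent framework; the point is simply that each task declares its working context in advance and everything undeclared is denied:

```python
# Hypothetical example: the agent is authorized for ten sources, but each task
# declares the minimum context it needs, and anything undeclared is denied.
AUTHORIZED_SOURCES = {
    "calendar", "email", "crm", "documents", "wiki",
    "code_repo", "project_tracker", "chat", "tickets", "hr_records",
}

TASK_CONTEXT = {
    "scheduling_assist": {"calendar", "email"},
    "account_summary": {"crm", "email", "calendar"},
}

def scoped_sources(task: str) -> set[str]:
    """Return only the sources declared necessary for this task."""
    declared = TASK_CONTEXT.get(task)
    if declared is None:
        # Deny by default: an undeclared task gets no context at all.
        raise ValueError(f"no context declaration for task {task!r}")
    return declared & AUTHORIZED_SOURCES

print(scoped_sources("scheduling_assist"))  # {'calendar', 'email'}
```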

The remaining three principles are also reframed. 

Data Stewardship shifts from data quality and provenance to scope authorization, asking (in relevant part) whether access was granted with awareness of how sources would combine. 

Resilience takes on new failure modes specific to synthesized environments: stale data from one source contradicting current data from another, compromised data from one source propagating through the AI’s reasoning, or temporal mismatches creating a distorted picture that the AI treats as coherent. 

And Governance at the organizational level must answer who bears responsibility for risks that arise from integration decisions, particularly when no single source owner anticipated them.

Inference Boundary Mapping

Among the controls accompanying Context Stewardship, one addresses what I believe is the most practically urgent problem: specifying what an AI system may derive from combined context.

The concept is straightforward. If an organization authorizes an AI to access both calendar data and email data, Inference Boundary Mapping asks which inferences from that combination are within scope and which are not. A scheduling optimization that combines calendar availability with email response patterns may be entirely appropriate. A health status inference derived from medical appointment entries and insurance correspondence is not.

In practice, Inference Boundary Mapping takes the form of a documented specification, developed before deployment and revisited periodically, that defines permissible inferences for each combination of data sources the AI can access. Scope is defined by use case: an AI authorized to assist with scheduling has a different inference boundary than one authorized to assist with workforce planning, even if both access the same underlying sources. The exercise requires a cross-functional team. Technical staff identify what inferences the AI is capable of drawing from combined sources. Legal and compliance staff determine which of those inferences fall within the purpose for which access was granted. The resulting specification becomes a governance artifact, auditable and enforceable, that sits between source-level authorization and system-level behavior.
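
As a rough sketch of how such a specification might be encoded so it can be audited and enforced, consider the structure below. The field names, inference categories, and default-deny check are illustrative assumptions on my part; the AILCCP does not prescribe a particular format:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class InferenceBoundary:
    """One entry in an inference boundary specification (illustrative)."""
    use_case: str          # scope is defined by use case
    sources: frozenset     # the combination of sources this boundary covers
    permitted: frozenset   # inference categories within scope
    prohibited: frozenset  # inference categories explicitly out of scope
    owner: str             # who is accountable for this boundary

SCHEDULING = InferenceBoundary(
    use_case="scheduling_assist",
    sources=frozenset({"calendar", "email"}),
    permitted=frozenset({"availability_optimization", "response_pattern_timing"}),
    prohibited=frozenset({"health_status", "departure_risk", "performance_rating"}),
    owner="privacy-office@example.com",
)

def in_scope(boundary: InferenceBoundary, inference_category: str) -> bool:
    """Default-deny: an inference is in scope only if explicitly permitted."""
    return (inference_category in boundary.permitted
            and inference_category not in boundary.prohibited)

print(in_scope(SCHEDULING, "availability_optimization"))  # True
print(in_scope(SCHEDULING, "health_status"))              # False
```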

But the harder cases are less obvious. Consider an AI with access to email, meeting transcripts, project management tools, and peer review records. It could construct a detailed performance profile by correlating communication patterns, meeting participation, task completion rates, and peer feedback. Each of these sources was authorized for the AI to help with workflow management. The performance profile that emerges from their combination was not. Yet unlike the health inference, the line here is blurry. Correlating task completion with meeting load to suggest workload redistribution seems helpful. Correlating email tone with peer review sentiment to predict a performance rating seems invasive and creepy. The underlying data is the same. The difference is in what the AI is permitted to derive from it, and that distinction exists nowhere in a source-level authorization framework. Inference Boundary Mapping makes that distinction explicit and assigns ownership to it.
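
Continuing the illustrative sketch above, the workflow-management case could be encoded the same way: identical sources, with the helpful correlation named as permitted, the profiling one named as prohibited, and an accountable owner attached. The categories and owner below are again hypothetical:

```python
# Illustrative entry: same four sources, two very different inference categories.
WORKFLOW_BOUNDARY = {
    "use_case": "workflow_management",
    "sources": {"email", "meeting_transcripts", "project_tracker", "peer_reviews"},
    "permitted": {"workload_redistribution"},         # task load vs. meeting load
    "prohibited": {"performance_rating_prediction"},  # tone + sentiment profiling
    "owner": "people-analytics-governance@example.com",
}
```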

None of this is easy. An AI system with access to fifteen data sources can draw inferences that no pre-deployment review will fully anticipate. Defining boundaries in advance requires judgment about what an AI might derive from data combinations, and that judgment will sometimes be wrong or incomplete. But the alternative, allowing AI systems to derive whatever their combined context permits, transfers risk from a design decision to an unexamined default. Organizations that make that transfer without acknowledging it may find themselves unable to explain, after the fact, why a particular inference was generated and who authorized it. That is a liability exposure, not merely a compliance gap.

Beyond Enterprise Data: A Structural Pattern

Everything I have described so far concerns enterprise data. But the same structural pattern appears wherever AI systems draw on multiple data sources to build an integrated picture of their environment.

AI research is also moving toward systems that learn internal representations of how the world works, built from training data drawn from multiple sources. These learned representations drive predictions, and those predictions drive actions. The training data that shapes these representations raises the same questions that Context Stewardship asks of enterprise AI. Were the sources selected with awareness of how they would interact? Could the system draw inferences from combined training data that no one intended? When the system produces a flawed prediction, can responsibility be traced back to specific sources?

The pattern is consistent. Whenever multiple sources are synthesized, the whole exceeds what any individual source authorization contemplated. That is true whether the synthesis happens in an enterprise assistant combining calendar and email data or in a research system combining training datasets to build a model of its environment.

The Regulatory Gap

Current and proposed AI legislation does not adequately address combinatorial inference risk. Transparency mandates focus on disclosing what data a system accesses and how it processes that data. Impact assessments, which are increasingly required by state consumer privacy law, evaluate risks at the system level or the data source level. Neither is designed to capture risks that emerge specifically from synthesis across individually authorized sources.

This matters more as AI systems become more deeply embedded in enterprise operations. The number of data sources they access will grow. The inferential possibilities will expand. Legislative frameworks that treat data access authorization as the primary safeguard will increasingly miss where the actual risk resides.

I do not think this requires entirely new legislation. In many cases, existing regulatory frameworks could accommodate combinatorial risk analysis if regulators recognized the category. What is missing is analytical vocabulary. Context Stewardship and controls like Inference Boundary Mapping are an attempt to provide it.

What This Means in Practice

For legal counsel advising on AI integration, the practical implication is this: reviewing AI data access authorizations one source at a time is necessary but insufficient. The review should also ask what becomes possible when authorized sources are combined, and whether the organization has mechanisms to govern those possibilities.

For policy makers, the implication is that transparency and access control mandates, while valuable, do not reach the synthesis layer. The combinatorial risks I described are present in every enterprise AI deployment that accesses more than one data source. They are not speculative.

For organizations deploying AI agents in enterprise environments, the question is concrete: when your AI system derives something from combined data that no individual authorization addressed, who is accountable? If the answer is unclear, the organization has a gap that no amount of source-level compliance will close.

Context Stewardship does not claim to solve every problem that enterprise AI creates. It identifies the specific layer where source-by-source authorization fails and proposes both an analytical framework and practical controls to address it. The AILCCP was designed to evolve as AI deployment reveals new categories of risk. Context Stewardship represents that evolution.