Rethinking Human – AI Agent Collaboration for the Knowledge Worker
Dr. Megan Ma, Associate Director, CodeX and Jay Mandal, CodeX Fellow
2025 has been described as the “Year of AI Agents.” Yet, what exactly does this mean and how should we be prepared for it? Beyond a lack of definition around the notion of “agents”, there is a further sense of confusion about the utility of these powerful tools. Particularly for the legal industry, AI vendors have long been marketing the integration of generative AI tools as opportunities for “workforce augmentation,” capable of producing work of a quality comparable to that of an associate. But unlike the use of large language models (LLMs) as tools for the execution of legal work products, the core messaging behind AI agents is that working with them should resemble interacting with a (machine) colleague. The problem? Most professional services operate in the form of an assembly line, rather than a collaborative team.
For clarity, we note here that in reference to AI agents, we are not conflating them with agentic workflows. While agentic workflows are the processes that are executed by agents, the focal point of this piece centers on the specific engagement with agents. That is, when and how do we enable meaningful human-agent interaction?
AI agents introduce an era of knowledge sharing and iterative real-time collaboration. This is unfamiliar territory for most legal professionals. Consider, for example, the structural workflow of a traditional law firm. A junior associate is asked to engage in the initial drafting, which is then passed to a senior associate for their redlines. These redlines are returned to the junior associate for their correction. Eventually, this is sent up the chain to the partner for their comments, then sent back down the chain for revisions and updates.
On the other hand, the structural workflow that agents introduce is much more akin to a joint art project or a team sport. There is not necessarily a specific hierarchy; rather, the more seasoned individual, often inadvertently, becomes a guiding force behind the success and completion of a given goal. What this configuration entails is an acute awareness of the specific strengths and weaknesses of the parties involved, and importantly, a sensitivity to the division of labor. In the case of AI agents, there must be an understanding of the criteria of success, in addition to the explicit expectations around task completion at every stage of the workflow.
The issue, of course, is that every task within a specified workflow encompasses sets of substeps that are at a lower level of abstraction. Given the recent popularity of OpenAI’s Deep Research agent, let us consider researching a particular topic for an essay and/or scientific report as a tangible example. In order to start the process of drafting a research paper, there must be a clearly defined problem statement that necessarily requires an impetus for investigation, and accordingly, a proposed recommendation.
Both anecdotally and based on observations from our own testing, Deep Research is not great with iteration. While it can execute completely, and occasionally beautifully, on a task on its first try, the process of making pointwise edits poses a significant challenge. At first blush, it provides a decent work product. However, the process of transforming the initial output into an excellent work product remains a struggle. This is because the process of refinement extends beyond directive language and transitions into expressive language. In other words, as opposed to defining what we need accomplished, editing requires detailing how we would prefer it to be accomplished.
For humans, defining the vision and criteria of success from the top down misses what we often need. Humans, like LLMs, also require some extent of prompting (nudging) to transform how they react to phenomena into a tangible ask. In an analog world, this could come in the form of trial and error: testing multiple pathways and experimenting with the results obtained. While this was highly inefficient, it also led to sparks of creativity and the necessary modification of expectations and goals for a particular research question.
With the advent of LLMs, while we greatly reduce the inefficiency behind the initial research process, we also lose the necessary refinement of our “intuitions” that we were once afforded. Accordingly, the edits that we suggest to our agents become a haphazard method of revision. Ultimately, quality remains at a level of mediocrity because we have removed the opportunity for expectations to be clarified in the messy act of brainstorming and planning for the execution of the project. In other words, we have not defined the right form of user engagement with agents, and accordingly, the path to adoption of AI agents remains largely unclear.
So, how exactly do we re-integrate the inefficiency and/or “messiness” of planning in our workflows? In early 2025, a framework known as Collaborative Gym (Co-Gym) was released as a study into enabling human-machine collaboration. In their study, Stanford researchers demonstrated that agents as initial collaborators encourage higher performance in achieving the task outcomes of the user.
What is particularly fascinating about the study is that it reveals that creating a dedicated environment that allows agents to interact with humans and one another (if more than one agent is involved) encourages a shared working space with ultimately better-defined goals. Having a specific environment does not confine agents to working autonomously towards an identified outcome. Instead, this environment encourages the agents to ask the user questions, and even wrestle with the problem, prior to execution. In turn, users can treat this environment as a “scratchpad” to explore at a deeper level exactly what expectations and milestones are needed to accomplish a particular task.
We argue that this will become a new form of task decomposition. We apply elements of this framework to corporate mergers & acquisitions (M&A) as a concrete example of how we can implement Co-Gym into the legal professional services space, and more broadly, for the knowledge worker.
When thinking about the workflow of an M&A deal, the overall end-to-end process can be decomposed into key stages: starting from initial deal discussions to the ultimate integration and post-closing of a deal. While there is a general process for an M&A transaction, certain stages could run in parallel, or be revisited, based on the specific circumstances of the deal. We will explore the dynamic nature of such deals, and the need for real-time adjustments, in considering how an AI agent could service M&A transactions.
M&A general workflow – a deal decomposed:
| 1. Pre-deal diligence for a deal – enter into an NDA and engage in preliminary discussions and review of initial documents shared between parties. |
| 2. Enter into a Letter of Intent (LOI) – draft and negotiate a non-binding LOI, with key terms such as price, structure (e.g., share purchase, asset acquisition, merger of entities), and key conditions of the deal. It sets the groundwork for further negotiations. |
| 3. Diligence Process – thorough review of target company’s documents, including contracts, corporate and financing history, litigation history, IP, and regulatory compliance. Identify potential risks and impediments in completing the transaction. |
| 4. Regulatory Compliance – evaluate applicable laws triggered (such as antitrust and securities regulations) and take steps to secure approval and ensure compliance with such regulations and authorities to consummate this deal. |
| 5. Draft transaction documents – draft the core agreements underlying the sale. These agreements define the key terms, such as the structure, price, and the other rights, liabilities and obligations of both parties. |
| 6. Negotiating a deal – negotiate the main purchase agreement and related agreements, including the parties’ representations and warranties, indemnities, payment conditions and closing conditions. Negotiate all legal protections for the buyer. |
| 7. Closing of deal – finalize the legal documentation and facilitate the ownership transfer per the agreement terms. Complete the payment transfer and satisfy all final conditions of the deal. |
| 8. Integration and Post-closing matters – assist with agreed upon post-closing matters, such as integration, regulatory compliance for the merged entity, and employment issues or other disputes. |
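As a rough illustration of this decomposition, the eight stages could be modeled as an ordered queue in which any stage can be pulled back to the front. This is only a sketch under our own assumptions: the names `Stage`, `DealPlan`, `advance`, and `revisit` are hypothetical and do not come from Co-Gym or any existing library.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    """The eight stages from the table above (names are illustrative)."""
    PRE_DEAL_DILIGENCE = 1
    LETTER_OF_INTENT = 2
    DILIGENCE = 3
    REGULATORY_COMPLIANCE = 4
    DRAFT_DOCUMENTS = 5
    NEGOTIATION = 6
    CLOSING = 7
    POST_CLOSING = 8


@dataclass
class DealPlan:
    """Hypothetical work plan: an ordered queue plus a record of stages begun."""
    queue: list[Stage] = field(default_factory=lambda: list(Stage))
    history: list[Stage] = field(default_factory=list)

    def advance(self) -> Stage:
        """Begin the next queued stage and record it in the history."""
        stage = self.queue.pop(0)
        self.history.append(stage)
        return stage

    def revisit(self, stage: Stage) -> None:
        """Move a stage to the front of the queue, even if it already ran."""
        if stage in self.queue:
            self.queue.remove(stage)
        self.queue.insert(0, stage)


plan = DealPlan()
for _ in range(6):              # work through to the negotiation stage
    plan.advance()
plan.revisit(Stage.DILIGENCE)   # new information sends the deal back to diligence
print(plan.queue[0].name)       # DILIGENCE
```

Keeping the history separate from the queue means a revisited stage appears twice in the record, which mirrors how a deal team would actually account for a second round of diligence.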
Consider a scenario in which a legal team is leveraging autonomous AI agents to support the execution of an M&A workflow. We could establish the discrete steps and substeps at each stage, the intended business and legal outcomes upfront, the parties for engagement, among other details. Nevertheless, this workflow would face the following key obstacles:
- Overall objectives may change. For example, following the diligence process, the acquirer may want to recalibrate their approach and rethink the key terms negotiated in the term sheet based on red flags uncovered in diligence.
- Need to reorder processes. If new material information is unearthed about a company late in the acquisition process (such as an undisclosed IP dispute), the flow of the M&A process may need to be reordered. One possibility is that the acquirer may need to reprioritize the diligence stage, which could impact the structure and key terms in the draft transaction documents.
- Objectives within a step may change. For example, in the middle of the deal, there may be a previously unforeseen urgency to accelerate the deal (e.g., external market forces, competitive pressure to integrate and release the acquiree’s product quickly, etc.). As a result, the acquirer’s key objectives at the deal negotiation and deal closing stages may be reprioritized in order to accelerate the completion of the transaction. In this case, removing impediments to closing the deal would take precedence over other objectives, such as ensuring deal terms are fully negotiated.
Consequently, autonomous AI agents would underperform: they would take liberties with certain decisions, which could lead to hallucination and/or dead ends in the workflow. In these scenarios, there is no forum or interface to dynamically consider the human lawyer’s evolving intentions in light of new circumstances. In order to address these situations, the strategies employed by the Co-Gym framework would be useful.
Rather than directing AI agents to work towards an overall objective and workplan upfront – with a predefined set of tasks and outcomes – we argue for human-machine reflection and active engagement at each new decision point of a deal, including (a) the initial planning of the M&A deal; (b) in advance of each major milestone; and (c) at each critical decision point when new information and/or circumstances are introduced in the deal (e.g., newly discovered diligence information that impacts the deal, or an exogenous factor such as market changes).
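The three kinds of decision points (a) to (c) can be read as a simple gate on the agent's autonomy. The sketch below is a minimal illustration under our own assumptions; `Trigger`, `Event`, and `next_action` are hypothetical names, not part of Co-Gym.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Trigger(Enum):
    INITIAL_PLANNING = auto()     # (a) initial planning of the deal
    UPCOMING_MILESTONE = auto()   # (b) in advance of a major milestone
    NEW_INFORMATION = auto()      # (c) new information or circumstances
    ROUTINE_TASK = auto()         # anything else in the work plan


@dataclass
class Event:
    description: str
    trigger: Trigger


# Decision points at which the agent should stop and engage the lawyer.
CHECK_IN_TRIGGERS = {
    Trigger.INITIAL_PLANNING,
    Trigger.UPCOMING_MILESTONE,
    Trigger.NEW_INFORMATION,
}


def next_action(event: Event) -> str:
    """Return what the agent should do when this event occurs."""
    if event.trigger in CHECK_IN_TRIGGERS:
        return "pause_and_consult_lawyer"
    return "continue_autonomously"


print(next_action(Event("undisclosed IP dispute found", Trigger.NEW_INFORMATION)))
# pause_and_consult_lawyer
```

In practice, classifying an event as "new information that impacts the deal" is itself a judgment call; the sketch assumes that classification has already been made.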
Let us consider applying the Co-Gym framework to the M&A workflow. Each new decision point of an M&A deal, as described above, would call for a collaborative interface between humans and AI agents. The AI agents would leverage the collaborative environment to ask questions, suggest next steps, and even argue with the human lawyer in order to refine the objectives and create a new work plan to execute. This would encourage real-time adaptability at any key juncture in the M&A process.
For example, if a new litigation threat on a key patent is discovered as parties are negotiating an M&A share purchase agreement, this material change in circumstances would trigger the Co-Gym collaborative process. The AI agent would first reflect on how this information could impact the transaction. Next, the AI agent would raise a set of questions with the human lawyer to better understand – through queries back and forth – the significance of the threat, and its effect on the preferences and priorities of the human lawyer, including the overall objectives of the deal. Should the human lawyer indicate that this new litigation liability substantively heightens the risk of moving forward with this deal, then the AI agent could draft an interim work plan that prioritizes diligence on the new liability information. The AI agent would also consider putting on hold, deprioritizing, and/or reshuffling other workstreams, such as negotiations on M&A deal terms. This parallels the actions a lawyer would take in dealing with this type of situation in the course of an M&A transaction, without the assistance of an AI agent colleague.
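The exchange described above (question the lawyer, then draft an interim plan) could look roughly like the following. This is a sketch under stated assumptions: the questions, the task strings, and the `replan` function are invented for illustration, and `ask_lawyer` stands in for whatever interface the human actually uses to respond.

```python
from typing import Callable

RISK_QUESTION = "Does this threat substantively heighten the risk of proceeding?"
QUESTIONS = [
    "How central is the affected patent to the value of the deal?",
    RISK_QUESTION,
]


def replan(plan: list[str], ask_lawyer: Callable[[str], str]) -> list[str]:
    """Ask clarifying questions, then return an interim work plan."""
    answers = {question: ask_lawyer(question) for question in QUESTIONS}
    if answers[RISK_QUESTION].strip().lower() != "yes":
        return list(plan)  # no heightened risk indicated: keep the plan as is
    # Negotiation tasks are deferred; diligence on the new liability goes first.
    deferred = [task for task in plan if task.startswith("negotiate")]
    remaining = [task for task in plan if task not in deferred]
    return ["diligence: new patent litigation threat"] + remaining + deferred


current = [
    "negotiate indemnities",
    "draft disclosure schedules",
    "negotiate closing conditions",
]
for task in replan(current, ask_lawyer=lambda question: "yes"):
    print(task)
# diligence: new patent litigation threat
# draft disclosure schedules
# negotiate indemnities
# negotiate closing conditions
```

The point of passing `ask_lawyer` in as a function is that the agent never decides the significance of the threat on its own; the plan changes only in response to the lawyer's answers.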
In contrast, without the Co-Gym approach, we could lose this crucial step of re-assessment. Instead, the AI agent would autonomously evaluate the importance of this recently developed liability risk without a deeper understanding of the human lawyer’s preferences, potentially misinterpreting the goals of the transaction (e.g., under- or over-adjusting to its importance, or ignoring it altogether). Moreover, there would be no opportunity for a human lawyer to interject, spar, and strategize given the new information introduced.
Therefore, the Co-Gym framework suggests that an explicit planning environment for human-machine collaboration could helpfully address the dynamic, variable, multi-stage nature of an M&A transaction. In effect, this could serve as a model for other legal workflows that have similar dynamism.
We note that, in the Co-Gym paper, the researchers raise the need for advancements in core aspects of intelligence (communication capabilities, situational awareness, and the balance between autonomy and human control) in order to better facilitate this collaboration process. We are mindful of these current limitations. While some will be overcome by exponential developments in the technology, we also believe that further investigation into different tangible use cases could illuminate how we can rectify other constraints, such as situational awareness and the balance between autonomy and human control.
As a proposal for forthcoming work, we are interested in how Co-Gym could similarly be applied to other dynamic multi-stage legal workflows, including (a) other corporate transactions, such as VC financings and IPOs; (b) litigation; and (c) specialized practices, such as patent prosecution, bankruptcy proceedings, immigration filings, and more. In the interim, we will continue to experiment with how humans and AI agents may work together, by researching and developing potential prototypes illustrating this form of engagement. This includes our next piece, which focuses on how division of machine labor (task distribution across AI agents) could work in professional services.