The capability-trajectory measurement the chapter's "the curve is real" section anchors to. The METR site lets you toggle between log and linear scale; the linear view is the one that captures the current's acceleration most viscerally. Worth visiting periodically rather than reading once. The chart updates as new models land. METR also publishes the randomized controlled trial showing experienced developers were slower with AI tools than without them, which is the result that earned the chapter's trust in this source (but may also be out of date by the time you read this).

Carnegie Mellon, TheAgentCompany benchmark

The honest counterweight to capability-trajectory enthusiasm. TheAgentCompany simulates a realistic multi-step office workflow and measures how often an agent can complete it autonomously. The leaderboard is public and updates over time. As of this writing, the leading entries on the leaderboard still fail a slim majority of tasks. That is the ceiling worth holding in your head alongside the trajectory. Both readings are true at the same time.

Jamie Dimon's annual shareholder letters, JPMorgan Chase

The trajectory of one institution's thinking about AI, told through the chairman's annual voice across roughly a decade, with agents entering the story only in the most recent letters. The chapter quotes the April 2024 letter ("the printing press, the steam engine, electricity, computing and the Internet") and the April 2026 letter ("AI will affect virtually every function, application, and process in the company"). Read the two in sequence and the rightward shift is unmistakable. Earlier letters in the archive (2017 onward) show the start of the arc. JPMorgan's investor relations site holds the full set; the secondary trade-press coverage is reliable for headlines and unreliable for context.

JPMorgan's agentic KYC redesign, April 2026

The "complete reimagination, agentic-first" framing of client onboarding, attributed to a senior JPMorgan executive in coverage of the firm's new agentic KYC system going to production at the end of April 2026. The chapter relies on this framing for the move from "AI as augmentation" to "AI as operating logic." What is striking is the operational specificity: the same redesign that produced the language is also the redesign that took the firm's KYC cycle from a multi-day process to under one minute. That is what reimagine looks like in operating terms.

Boston Consulting Group, "Are You Generating Value from AI? The Widening Gap" (2025)

The clearest articulation of the leader-laggard divergence argument from a major consulting firm. The title alone signals the thesis. Read it for the gap data and the diagnosis. The prescription is shaped by what the firm sells, as any firm's would be, which colors the framing without undermining the data behind it.

MIT NANDA, "The GenAI Divide" / "State of AI in Business 2025"

The 95-percent-no-P&L finding lives here, along with a more transparent methodology section than most things in the research base. Aditya Challapally is the lead author. The paper is also the source of several of the surprising-detail findings that surface throughout this book, including the mid-market-outperforms-large-enterprise data and the shadow-AI-outperforming-sanctioned-AI observation. Worth reading in full rather than reading about.

Chapter 2: The Choice You're Already Making

Sebastian Siemiatkowski / Klarna, Bloomberg interview, May 2025

The reversal interview is the centerpiece, but the broader arc is what the chapter relies on: the February 2024 announcement that AI was handling the work of roughly 700 customer service agents, the May 2025 Bloomberg interview where Siemiatkowski reversed publicly, the September 2025 IPO that priced anyway, and the subsequent reporting that confirmed the rehiring. Reading the sequence in order rather than the reversal moment alone gives you the felt experience of how reimagine failure unfolds when it unfolds in public.

Chapter 3: Reading the River

Anthropic's Sonnet 4.5 release and demonstration, September 2025

The historical surge anchor in this chapter. The released demonstration of the model autonomously rebuilding a working web application over hours, with thousands of tool calls, is, at the time of writing, the best single artifact for what "long-running agentic workflow" looks like in practice. Worth watching the recording rather than reading about it; the duration is part of the point.

Anthropic's public statements on Mythos

The held-back-frontier-model anchor in this chapter. As of this writing, Anthropic's communication about why it has chosen not to release Mythos is the best current example of lab foresight functioning as a surge signal. By the time you read this, the situation may have evolved; the lab's public framing of the decision is the artifact worth tracking, not the specific resolution. Watch how the lab describes what condition was missing (and was later put in place), because that condition is the one your enterprise will eventually have to navigate around.

OSWorld benchmark

The benchmark that produced the 36-point swing this chapter cites in its discussion of the April 2026 surge. OSWorld is one of the more honest agentic-capability measurements in the public research base, in that it treats the agent as a worker operating against representative computer-use tasks and reports failure modes alongside successes. Useful counterweight to vendor benchmark cards. The leaderboard updates over time, so the headline numbers in this book will lag the current state by the time you reach this section.

UK AI Security Institute, Frontier AI Trends Report

A government-led perspective on where frontier capability is going, with special attention to the kinds of dangerous-capability thresholds that drive lab decisions to hold models back. Reading this in parallel to lab-published material gives a sharper picture of the surge-signal pattern than either source alone. The argument in this chapter that lab foresight is itself a surge signal lands more concretely once you have read what governments are watching for.

Chapter 4: Tension 1: When the Work Itself Changes

Microsoft 2026 Work Trend Index Annual Report

Microsoft surveys workers and managers globally every year on what is changing in the experience of work. The 2026 edition is one of the first to publish an empirical decomposition of where AI impact actually comes from at the enterprise level. The headline finding, the random forest result on roughly nineteen thousand workers showing organizational factors at about twice the impact strength of individual factors, is the strongest empirical anchor for the argument I am making here that the constraint is the operating system rather than the people inside it. Microsoft has a commercial interest in the conclusion, since it sells the agentic platforms that the redesign relies on, but the methodology is unusually transparent for a vendor-published study, and the finding is consistent with what the named-enterprise cases independently show.

The Block-Sequoia essay and the Dorsey podcast appearance, March and April 2026

Block CEO Jack Dorsey and Sequoia partner Roelof Botha co-published an essay titled "From Hierarchy to Intelligence" on March 31, 2026, framing Block's reorganization as building the company as an intelligence rather than as a hierarchy. Dorsey followed up on Sequoia's "Long Strange Trip" podcast on April 2, 2026, with the verbatim "in the most ideal case, there is no layer" framing and the specifics on current and target layer counts. Read both. The essay is the strategic framing; the podcast is the operational concrete. Block is the fullest public case of decision-structure redesign, with the founder doing the unusual work of being publicly explicit about both the play and its limits.

JPMorgan Company Update, February 23, 2026

JPMorgan's annual Company Update is dense with substantive material on what an at-scale agentic AI program looks like inside the largest US bank. The Dimon "we have displaced people from AI" framing in context (paired with redeployment infrastructure at scale and active plans to expand it further) is the strongest CEO-voice anchor available for what coupling done deliberately looks like. Erdoes on the AWM controls scale-out (200 to 3,000 to another 3,000-5,000 next) is the operational concrete behind the framing. Worth reading or watching in full rather than via press summaries; the press coverage tends to crop the framing in ways that lose the operational substance.

WSJ, "Why Moderna Merged Its Tech and HR Departments," May 12, 2025

The longest single primary source on the Moderna case, the one that plays out at the governance level: how the function is run and overseen rather than how its tasks get done. Tracey Franklin in her own words on the merger, the "virtual HR agent" framing, and the redesign of work across the firm based on what is best done by people versus what can be done by GPTs. Franklin's voice in this article is the most candid public executive-voice anchor on the governance form of the coupling argument (the claim that workflow change and workforce change have to move together).

WSJ, "KPMG and the Future of Audit," April 10, 2026 (Mackenzie interview)

Thomas Mackenzie on KPMG's audit redesign, with the "I think next to no human beings" framing, the "slicing the bottom of the pyramid versus lifting the pyramid up" framing, and the operational specifics on routine audit testing as the first wedge. Pair this article with EY's April 7, 2026 press release on the 130,000-auditor agentic rollout and PwC's Matt Wood interview from December 31, 2025; reading the three together is the best way to see the buyer-seller dynamic at the function level.

Michael Hammer, "Reengineering Work: Don't Automate, Obliterate," Harvard Business Review, 1990

The historical anchor for the coupling argument I am making here. Hammer's article on Ford accounts-payable is more than thirty years old and reads even better now than it did when it was published, because the technology that finally makes its argument completely operational is the agentic AI of the past two years. The category-error framing is still load-bearing. Worth reading in full if you have any doubt that the discipline of coupled workflow-and-workforce redesign predates the current technology.

NBER Working Paper 34984, "Artificial Intelligence, Productivity, and the Workforce: Evidence from Corporate Executives," March 2026

A Federal Reserve Bank of Atlanta-led collaboration with Richmond Fed and Duke, surveying corporate executives on AI productivity and workforce effects. The headline finding (compositional reallocation of labor within and across firms, with routine clerical work exposed to substitution and analytical and managerial work more often complemented) is convergent with the "squeeze in the middle of the work" framing above at the broader level of work allocation. Worth reading directly rather than via the secondary commentary the paper has attracted; the paper is more careful than most of its summaries.

Chapter 5: Tension 2: What the Agent Must Know

Fortune, "How JPMorgan's CIO is reshaping work at the bank with a $19.8 billion annual tech and AI budget" (Lori Beer interview, John Kell, April 29, 2026)

Beer is the clearest senior-CIO voice in the public record on how agentic context provision and bounded authority are being designed together inside a large bank. The "right level to create an agent" framing, the "rebuilding the factory" metaphor for software-development redesign, and the permissions-architecture detail are all in the same piece. Worth reading in full rather than via press summaries.

JPMorgan Chase Technology Blog, "Securing the next generation of AI agents" (March 23, 2026)

The bank's own technical articulation of why context provision and bounded authority travel together at the reimagine end. The "lethal trifecta" framing on agents that combine untrusted inputs, sensitive data access, and external action authority is the most precise vocabulary in the current public record for the configuration that requires the highest level of continuous runtime safeguards. The piece is technical and brief; read it for the structural framing rather than for the implementation specifics.

UNLEASH, "Why Moderna merged HR and IT to better 'architect the flow of work'" (Tracey Franklin interview, Allie Nawrat, June 27, 2025)

Franklin's own framing of the HR-Tech merger, the "Ask HR" routing architecture, and the system-wide rethinking. The architect-the-flow-of-work language is in her voice. A strong primary source on what level-two coalition looks like from the executive holding the joint title.

McKinsey, "The State of AI" (QuantumBlack, November 17, 2025)

The most-cited large-sample survey on the state of agentic AI adoption in mid-to-late 2025. The decomposition matters: sixty-two percent of organizations at least experimenting with AI agents, twenty-three percent scaling somewhere in the enterprise, no single function above approximately ten percent reporting fully scaled agents. Useful corrective to coverage that has conflated the experimenting figure with the scaling figure.

Andrej Karpathy on X, "+1 for 'context engineering' over 'prompt engineering'" (June 25, 2025) and the preceding Tobi Lütke post (June 18, 2025)

The two X posts that started the practitioner-vocabulary cluster. Lütke's preceding post is the originator. Karpathy's "delicate art and science" formulation in reply has become the most-cited verbatim across the secondary literature. Reading them in order is the most direct way to see the conversational arc that the analyst firms later joined.

Chapter 6: Tension 3: Built-In or Bolted-On

Moffatt v. Air Canada, 2024 BCCRT 149

The British Columbia Civil Resolution Tribunal decision quoted in the opening of this chapter. Worth reading in full as the fullest available primary-source articulation of what happens when an enterprise treats an agent as legally separable from the rest of the operation. The decision is short, the language is direct, and the finding (that an agent is part of the firm's operational surface area whether the firm has set it up that way or not) maps to the chapter's argument with no translation required.

Mindful AI Foundation publications

Four pieces from the Mindful AI Foundation develop the architectural case for built-in governance in AI systems specifically. The peer-reviewed entry point is Mikkilineni and Kelly (2025), "From Static Prediction to Mindful Machines: A Paradigm Shift in Distributed AI Systems" (Computers 14(12), 541, MDPI). The paper is the source of the "coherence debt" term used in this chapter. Three additional pieces (a manuscript on governance of commitments, and two book-in-progress extracts) extend the argument in different directions and are available through the Mindful AI Foundation. Taken together, the four pieces make the maximalist version of the built-in argument and are the right reading for executives who want the architectural case at full length.

Derek Waldron interview with McKinsey, "JPMorgan Chase's Derek Waldron on building an AI-first bank culture" (October 2025)

The prompt-engineer-to-context-engineer framing in Waldron's own voice. The full interview covers the emergence of new job categories at JPMorgan and the firm's knowledge-management approach to making institutional information available to AI. The chapter borrows a single sentence; the rest of the interview is worth reading in full.

Matt Wood on the PwC One launch (Wood LinkedIn post, March 19, 2026; PwC corporate press release)

The most authoritative single articulation in the verified executive-voice corpus of the bolt-on / built-in distinction. Wood has since moved from PwC to AWS, where he is Chief AI and Technology Officer, so the launch-day framing should be read with his role at the time in mind. The PwC press release carries the institutional articulation; Wood's LinkedIn carries the personal voice.

Simon Willison, "The lethal trifecta for AI agents" (simonwillison.net, June 16, 2025)

Willison coined the term to describe the architectural conditions that make prompt-injection attacks like the Microsoft 365 Copilot one in this chapter possible. Three properties in combination produce the danger: access to private data, exposure to untrusted content, and the ability to externally communicate. The framing is the plainest vocabulary in the current public record for the configuration that built-in architectures address by cutting one leg of the trifecta rather than monitoring for the combination.

Bain & Company, "The Three Layers of an Agentic AI Platform" (April 2026)

The closest pre-existing articulation of the specific bolt-on / built-in distinction in the consulting-firm literature, with the "embedded by design, not bolted on after deployment" framing applied to AI specifically. Useful as evidence that the distinction is in circulation among advisors to the executive audience this book is written for.

Chapter 7: The Plan You Keep Remaking

No new sources here. The goal in this chapter was to put to work the cases and arguments built up across Chapters 1 through 6: the capability trajectory and the leader-laggard gap, the retrofit-or-reimagine choice, the river and its surges, and the three Tensions with their named cases (Block, Moderna, and JPMorgan on the reimagine side; Air Canada, Microsoft 365 Copilot, and Deloitte on the governance-failure side). All of them are in the earlier chapters' Further Reading sections, and the companion website aggregates everything in one place. If you want to go deeper on anything here, go back to the chapter where it first appeared.

Further Reading