Agents, Memory, and the Demo That Didn't Need a Design Tool
Governance gaps, open source agent security, AI memory architecture, and a prototyping workflow that made PowerPoint redundant.
Show Notes
This week's episode opens with David on two tools from Merill Fernando that caught his attention. The first is Lokka, an MCP implementation that bridges the Model Context Protocol directly to Microsoft Graph and Azure APIs, allowing compatible AI systems to query a live tenant in plain English rather than relying on stale training data. The second is the Microsoft Graph Skill, which addresses the specific problem of LLMs working with an API that has over 27,000 endpoints and is updated weekly. The two tools are designed to work together: the Graph Skill provides current knowledge of what the API can do and how to call it, while Lokka handles execution against the tenant in a read-only, safe-by-default manner. It is a practical combination that closes one of the more frustrating gaps in agentic development on the Microsoft stack.
David then moves to the Agent Governance Toolkit, an open source project from Microsoft released at the start of April. The framing is a question that is harder to answer than it first appears: as agents gain autonomy to act, who actually governs what they do? The toolkit consists of seven independently installable packages covering policy enforcement, cryptographic agent identity, inter-agent trust scoring, execution rings modelled on CPU privilege levels, SRE practices applied to agent systems, compliance mapping for the EU AI Act, and plugin lifecycle management. OWASP published the first formal taxonomy of agentic AI risks in December 2025, covering goal hijacking, tool misuse, memory poisoning, cascading failures, and rogue agent behaviour among others, and the toolkit is a direct response to those risks. Microsoft's stated intent is to move the project into community governance under OWASP or LFAI, which David notes is a meaningful signal of how seriously they are treating it.
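As a rough illustration of the execution-rings idea, here is a minimal Python sketch of a ring check. It is not the Agent Governance Toolkit's API: the ring names, the action names, and the `is_allowed` helper are all hypothetical, chosen only to show how a CPU-style privilege model might gate agent actions.

```python
from enum import IntEnum


class Ring(IntEnum):
    """Hypothetical privilege rings; a lower number means more privilege, as on a CPU."""
    KERNEL = 0   # e.g. changing policy or identity
    SERVICE = 1  # e.g. writing to external systems
    USER = 2     # e.g. read-only tool calls
    SANDBOX = 3  # e.g. pure reasoning with no side effects


# Hypothetical mapping from action name to the least-privileged ring allowed to run it.
REQUIRED_RING = {
    "update_policy": Ring.KERNEL,
    "send_email": Ring.SERVICE,
    "read_document": Ring.USER,
    "summarise_text": Ring.SANDBOX,
}


def is_allowed(agent_ring: Ring, action: str) -> bool:
    """Allow an action only if the agent's ring is at least as privileged as required.

    Actions that are not in the table are denied by default.
    """
    required = REQUIRED_RING.get(action)
    if required is None:
        return False
    return agent_ring <= required
```

In this sketch an agent running in `Ring.USER` can call `read_document` but not `send_email`, and anything unlisted is refused, which mirrors the safe-by-default posture discussed in the episode.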
Richard picks up from there, connecting the agent governance question to the Copilot adoption picture ahead of Microsoft's Q3 earnings on Tuesday. The adoption numbers are well known at this point, around 15 million paid seats, roughly 3% of the M365 base, but the context matters. Microsoft spent $37.5 billion on infrastructure in a single quarter, up 66% year on year, and the market is asking whether the monetisation trajectory justifies that capex. Richard's observation is that the slow adoption is almost always a governance problem in practice, or more precisely a permissions problem dressed up as a governance problem: Copilot surfaces whatever is already accessible in an estate, and most SharePoint environments were not built with that kind of exposure in mind.
The second part of Richard's segment takes a different direction entirely. Working on a client project over the past couple of weeks, he found himself wanting to produce something with the clarity and fidelity of the wireframes a UX team had sent over, without the tools or the time to produce them conventionally. The result was a click-through React prototype built entirely in VS Code using GitHub Copilot and Claude, with chaptered narrative flow, role-based screens, and deterministic state. The point is not that he built an app. It is that he vibe-coded what would previously have been a PowerPoint deck or a Visio diagram, and the output answered substantially more questions at a meaningfully higher fidelity. Anthropic shipping Claude Design in the same week, a dedicated prototyping surface built on Opus 4.7 with a handoff path to Claude Code, struck him as serendipitous validation that the gap is real and widely perceived.
David closes his section with MemPalace, an open source memory project released by Milla Jovovich and her partner. David acknowledges that the name is a specific reason to pay attention, but the project is, in fact, a technical memory architecture. The connection to The Fifth Element is deliberate: the film's premise is that the four classical elements are necessary but insufficient, and life itself is what binds them. David's argument is that agents can have all the reasoning, retrieval, and tool use in the world, but without persistent memory and continuity of identity across sessions, something fundamental is still missing. MemPalace structures memory across four layers (wings, rooms, closets, and drawers), stores verbatim content without summarisation to preserve context fidelity, and uses a compressed symbolic index designed for fast scanning by a language model. Everything runs locally using ChromaDB and SQLite, with no cloud sync or API keys required.
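To make the four-layer idea concrete, here is a minimal Python sketch using only the standard library's `sqlite3` module. It is not MemPalace's actual schema or index format: the table layout, the assumption that the four layers nest inside one another, the function names, and the slash-separated path index are all hypothetical stand-ins, and the ChromaDB vector search is left out entirely.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory store; a real setup would point at a local file."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE drawers (
            wing    TEXT NOT NULL,
            room    TEXT NOT NULL,
            closet  TEXT NOT NULL,
            drawer  TEXT NOT NULL,
            content TEXT NOT NULL,
            PRIMARY KEY (wing, room, closet, drawer)
        )
        """
    )
    return conn


def remember(conn, wing, room, closet, drawer, content):
    """Store content verbatim, with no summarisation step."""
    conn.execute(
        "INSERT OR REPLACE INTO drawers VALUES (?, ?, ?, ?, ?)",
        (wing, room, closet, drawer, content),
    )


def recall(conn, wing, room, closet, drawer):
    """Return the exact stored text for one drawer, or None if it does not exist."""
    row = conn.execute(
        "SELECT content FROM drawers "
        "WHERE wing = ? AND room = ? AND closet = ? AND drawer = ?",
        (wing, room, closet, drawer),
    ).fetchone()
    return row[0] if row else None


def index_lines(conn):
    """Return one short path per drawer, a stand-in for a compact scannable index."""
    rows = conn.execute(
        "SELECT wing, room, closet, drawer FROM drawers "
        "ORDER BY wing, room, closet, drawer"
    )
    return ["/".join(row) for row in rows]
```

The design point the sketch tries to capture is the split between a small index that a model can scan cheaply and full verbatim content that is fetched only when a specific drawer is needed.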
Cyrus covers the security section at pace. He moves through Autopatch improvements and the arrival of hotpatching for Windows as the default from May, Android XR device support arriving in Intune, and Entra's new tenant configuration management API, which allows full tenant snapshots in JSON and drift detection at scale. He also covers the conditional access optimisation agent, which moves governance from periodic review to continuous live monitoring, and the Defender entity analyser, which reached general availability on 1 April and now integrates with the new Sentinel MCP graph tool released on 20 April. On Purview, administrative units now allow ring-fenced governance for large organisations, giving local teams scoped control without losing central visibility. Cyrus closes with a striking data point from the IBM and Palo Alto Networks research: 61% of C-suite leaders surveyed said their AI model assets or data had already been compromised, and 67% said they had been targeted by an AI-enabled attack in the last year. The OWASP agentic AI top ten, which David covered earlier in the episode, is not a theoretical exercise.
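The snapshot-and-drift idea from the Entra item can be sketched in a few lines of Python. This does not call the tenant configuration management API, and the setting names in the usage example are invented. It only shows the general technique, under the assumption that two snapshots are available as JSON strings: flatten each one to dotted paths and report every path whose value differs.

```python
import json

MISSING = "<missing>"  # marker for a setting present in only one snapshot


def flatten(obj, prefix=""):
    """Flatten nested dicts into {"a.b.c": value}; lists are treated as leaf values."""
    flat = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            path = f"{prefix}.{key}" if prefix else key
            flat.update(flatten(value, path))
    else:
        flat[prefix] = obj
    return flat


def detect_drift(baseline_json: str, current_json: str) -> dict:
    """Return {path: (baseline_value, current_value)} for every setting that differs."""
    baseline = flatten(json.loads(baseline_json))
    current = flatten(json.loads(current_json))
    drift = {}
    for path in sorted(baseline.keys() | current.keys()):
        before = baseline.get(path, MISSING)
        after = current.get(path, MISSING)
        if before != after:
            drift[path] = (before, after)
    return drift
```

A usage example with hypothetical settings:

```python
baseline = '{"conditionalAccess": {"requireMfa": true}, "guestAccess": "restricted"}'
current = '{"conditionalAccess": {"requireMfa": false}, "guestAccess": "restricted"}'
print(detect_drift(baseline, current))
```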