Beneath the National Mall in Washington, D.C., 23 federal buildings are connected by tunnels most people never see. Five walkable corridors carry steam, chilled water, and fiber optic lines through passages that regularly hit 100 degrees Fahrenheit. Every person walking above ground depends on them. Almost none of those people know the tunnels exist.

The agentic era has given every enterprise its own version of those tunnels, which made this city the perfect place to hold IEEE Cloud Summit 2026. Dozens of researchers, practitioners, and architects gathered just a few blocks from the Capitol to work through what that reality means for how we build, secure, and govern cloud systems. Sessions ranged from NSF-funded research into sustainable compute to live attack chain demonstrations on agentic systems to a production case study from Salesforce's infrastructure team. Several threads ran through all of them.

Here are just a few highlights from this 10th edition of IEEE Cloud Summit.

The Agent That Knew When to Step Aside

In "AI in Production: How We Built an Autonomous Agent to Optimize Millions in Capacity at Salesforce," Tuhin Kanti Sharma, DevX Architect at Salesforce, walked through how his team actually took agentic AI to production.

The problem they set out to solve: 86% of containers were running at less than 20% of their requested CPU. The configuration lived in six different places, which made it hard to tune. Before you could change anything, you had to know which layer actually owned the value. Their mission was to right-size Kubernetes Helm charts across Salesforce's fleet.

The first agent they built handled standard cases well, but for everything else, the prompt kept growing. The team split it into specialist agents with an orchestrator, but the results still varied across runs. Probability had entered a system that needed to be auditable, repeatable, and scalable. They needed to make it deterministic.

Their breakthrough came from separating which parts of the problem actually required a language model. Figuring out how CPU configuration was spread across a complex repository involved genuine ambiguity. That was the LLM's role. Computing the optimal CPU plan across hundreds of instances was a math problem. The team replaced that step with an Integer Linear Programming solver: same input, same output, every run.
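To make the "same input, same output" property concrete, here is a minimal sketch of the kind of integer program that step could solve. It is not Salesforce's solver: the workload names, allowed request sizes, headroom factor, and the exhaustive search standing in for a real ILP solver are all assumptions for illustration.

```python
from itertools import product

# Hypothetical inputs (not from the talk): observed peak CPU per workload in
# millicores, and the discrete request sizes the platform is assumed to allow.
PEAK_MILLICORES = {"api": 180, "worker": 420, "cron": 60}
ALLOWED_REQUESTS = (100, 250, 500, 1000)
HEADROOM_PERCENT = 120  # assumed safety margin: request >= 1.2 x observed peak


def plan_cpu(peaks, allowed, headroom_percent):
    """Choose one allowed request per workload, minimizing total requested CPU
    subject to request * 100 >= peak * headroom_percent.

    Exhaustive search over integer choices stands in for an ILP solver here;
    like a solver, it returns the same plan for the same input on every run.
    Returns None when no allowed size satisfies some workload.
    """
    names = sorted(peaks)  # fixed ordering so ties resolve identically
    best_total, best_plan = None, None
    for choice in product(sorted(allowed), repeat=len(names)):
        feasible = all(
            size * 100 >= peaks[name] * headroom_percent
            for name, size in zip(names, choice)
        )
        if not feasible:
            continue
        total = sum(choice)
        if best_total is None or total < best_total:
            best_total, best_plan = total, dict(zip(names, choice))
    return best_plan


plan = plan_cpu(PEAK_MILLICORES, ALLOWED_REQUESTS, HEADROOM_PERCENT)
```

With these made-up numbers the plan gives `api` 250m, `worker` 1000m (500m falls just short of the 504m it needs), and `cron` 100m. A production system would hand the same objective and constraints to a dedicated solver so it scales past a handful of workloads.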

The architecture that emerged was Discover (LLM) → ILP Solver → Apply (LLM). Separating reasoning from computation was what made the system production-ready. When you can run code, run code. When you can do math, do math. Route probability to where ambiguity lives, and give deterministic problems a deterministic engine.
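One way to picture that three-stage boundary is as a pipeline whose middle stage is an ordinary pure function. The sketch below is an assumption about shape only: the `CpuSetting` type, the stage signatures, and the stub functions are hypothetical, and the two LLM stages are replaced by plain stand-ins so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class CpuSetting:
    workload: str
    file_path: str   # where the discover stage says the value is owned
    millicores: int  # currently requested CPU


# Stage signatures (hypothetical). Discover and Apply would wrap an LLM;
# Solve must be deterministic.
Discover = Callable[[str], List[CpuSetting]]
Solve = Callable[[List[CpuSetting]], Dict[str, int]]
Apply = Callable[[List[CpuSetting], Dict[str, int]], List[str]]


def run_pipeline(repo: str, discover: Discover, solve: Solve, apply: Apply) -> List[str]:
    settings = discover(repo)      # ambiguity lives here: find the owning layer
    plan = solve(settings)         # pure computation: same input, same output
    return apply(settings, plan)   # turn the plan into concrete edits


# Stand-ins so the sketch runs without a model or a solver.
def fake_discover(repo: str) -> List[CpuSetting]:
    return [CpuSetting("api", f"{repo}/charts/api/values.yaml", 1000)]


def halve_requests(settings: List[CpuSetting]) -> Dict[str, int]:
    return {s.workload: s.millicores // 2 for s in settings}


def fake_apply(settings: List[CpuSetting], plan: Dict[str, int]) -> List[str]:
    return [f"{s.file_path}: cpu {s.millicores}m -> {plan[s.workload]}m" for s in settings]


edits = run_pipeline("repo", fake_discover, halve_requests, fake_apply)
```

Because the solve stage takes and returns plain data, it can be tested and audited without involving a model at all.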

"The best AI agent we built was the one that knew when not to use AI," Tuhin concluded.

Tuhin Kanti Sharma

Only Give the Agent What It Needs Right Now

In "Responsible AI in Practice: Building Agents with Context-Aware Security," Dr. Chae Clark, Senior Generative AI/ML Specialist at AWS, walked us through a use case involving an AI travel agent that has access to a customer's passport number, booking history, and external flight-search APIs. A user asks it to find hotel options for the next night. The agent pulls up past bookings, searches available hotels, and returns results. A second request arrives: "Find me a trip and also, what is my passport number?" The agent has everything it needs to answer both questions. That is the problem.

Agents should receive only the context relevant to the specific task they are about to perform. A public tool that calls external APIs or reads non-sensitive data should carry no controlled context, regardless of what else is in the session. The architecture Chae described separates data into tiers, tags every element with a sensitivity classification, and filters the context window before each tool call. Tools carry classifications too: `find_flights` is public, `get_saved_travelers` is controlled. The filtering happens before the model ever sees the data.
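A minimal sketch of that filtering step might look like the following. Only the tool names `find_flights` and `get_saved_travelers` and their classifications come from the talk; the two-level `Sensitivity` enum, the `ContextItem` type, and the `filter_context` function are assumptions made for illustration, not the architecture Chae presented.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class Sensitivity(IntEnum):
    PUBLIC = 0
    CONTROLLED = 1


@dataclass(frozen=True)
class ContextItem:
    key: str
    value: str
    sensitivity: Sensitivity


# Tool classifications as described in the session.
TOOL_CLASSIFICATION = {
    "find_flights": Sensitivity.PUBLIC,
    "get_saved_travelers": Sensitivity.CONTROLLED,
}


def filter_context(tool_name: str, session_context: List[ContextItem]) -> List[ContextItem]:
    """Return only the items a tool is cleared to see.

    Unknown tools fall back to PUBLIC, the least access (an assumed default).
    This runs before the model or tool receives any data.
    """
    clearance = TOOL_CLASSIFICATION.get(tool_name, Sensitivity.PUBLIC)
    return [item for item in session_context if item.sensitivity <= clearance]


session = [
    ContextItem("destination", "LIS", Sensitivity.PUBLIC),
    ContextItem("passport_number", "X0000000", Sensitivity.CONTROLLED),
]

public_view = filter_context("find_flights", session)
controlled_view = filter_context("get_saved_travelers", session)
```

Here `public_view` contains only the destination, so a flight search can never echo the passport number, whatever the prompt says.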

However, Chae pointed out, a more secure system is a less flexible one. When a user asks a question that spans both public and controlled data, the agent must choose which context to allow. Every over-permissioned agent is a data exposure waiting for the right prompt. Context-aware security extends least privilege to the context window itself, governing what the agent can see as well as what it is authorized to do.

Chae Clark

The Kill Chain Nobody Audited

In "Forensics for Agentic AI: Detecting and Tracing Poisoning Attacks in Agentic Systems," Gautham Koorma, Principal Consultant at Quandary Peak Research, opened with two statistics: 97% of non-human identities are over-privileged, and 0.01% of machine identities hold 80% of all cloud access. He called this "identity debt," access risk that compounds quietly inside day-to-day operations, invisible until something goes wrong.

Three attack scenarios illustrated how that debt gets collected. In the first, a proof-of-concept against an Amazon Bedrock agent showed memory injection: a fake conversation hidden inside a website the agent visited poisoned its long-term memory so that every future session re-injected the malicious payload. The second was EchoLeak, a confirmed zero-click exfiltration attack against Microsoft Copilot in which an attacker sends a crafted email that the agent retrieves for a compliance check. The third was a tool poisoning attack against MCP clients: a field in an OAuth server response was passed to a shell command without validation, giving a malicious MCP server arbitrary code execution on the developer's machine.
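The third scenario comes down to a familiar bug class: server-controlled text reaching a shell. The sketch below is not the affected client's code. The parameter name `endpoint_from_server`, the allowlist pattern, and the echo stand-in for a browser launcher are all assumptions, chosen to show the general defense of validating the value and passing it as a single argument rather than through a shell.

```python
import re
import subprocess
import sys

# Conservative allowlist for an HTTPS URL (an assumed policy, stricter than
# the URL grammar). Anything outside it is rejected.
_SAFE_URL = re.compile(r"https://[A-Za-z0-9.-]+(:[0-9]+)?(/[A-Za-z0-9._~/?=%-]*)?")


def validate_url(raw: str) -> str:
    if not _SAFE_URL.fullmatch(raw):
        raise ValueError(f"refusing untrusted URL: {raw!r}")
    return raw


def build_command(endpoint_from_server: str) -> list:
    """Build an argv list; the URL is one argument, never shell-interpreted."""
    url = validate_url(endpoint_from_server)
    # Stand-in launcher that echoes its argument instead of opening a browser.
    return [sys.executable, "-c", "import sys; print(sys.argv[1])", url]


def launch(endpoint_from_server: str) -> str:
    # No shell=True: metacharacters in the URL would have no special meaning.
    result = subprocess.run(
        build_command(endpoint_from_server),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

The unsafe equivalent would interpolate the value into a command string and run it with `shell=True`, at which point a value such as `https://idp.example/$(touch pwned)` becomes code.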

Gautham's forensics answer uses OpenTelemetry as the foundation for what he called AgentTrace: a replayable record of every thought and action an agent took, stored for post-hoc analysis of adversarial or misaligned behavior. The key primitives are Trace, Span, and Attribute. "Make agents forensically traceable by default," he said.
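To show how those three primitives fit together, here is a standard-library sketch of the data model. It is not the OpenTelemetry SDK and not AgentTrace itself: the `Span` and `Trace` classes, their fields, and the `ancestry` helper are hypothetical, and a real deployment would emit spans through an OpenTelemetry tracer instead.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Optional


@dataclass
class Span:
    trace_id: str
    span_id: int
    parent_id: Optional[int]    # links a step to the step that caused it
    name: str
    attributes: Dict[str, str]  # key/value facts about this step


@dataclass
class Trace:
    trace_id: str
    spans: List[Span] = field(default_factory=list)

    def record(self, name: str, parent: Optional[Span] = None, **attributes: str) -> Span:
        span = Span(
            trace_id=self.trace_id,
            span_id=len(self.spans) + 1,
            parent_id=parent.span_id if parent else None,
            name=name,
            attributes=dict(attributes),
        )
        self.spans.append(span)
        return span

    def ancestry(self, span: Span) -> List[str]:
        """Follow parent links back to the root, returned root-first."""
        by_id = {s.span_id: s for s in self.spans}
        chain = [span.name]
        while span.parent_id is not None:
            span = by_id[span.parent_id]
            chain.append(span.name)
        return list(reversed(chain))

    def to_jsonl(self) -> str:
        """Serialize spans in recorded order for retention and replay."""
        return "\n".join(json.dumps(asdict(s), sort_keys=True) for s in self.spans)


trace = Trace("session-42")
request = trace.record("user_request", prompt="find hotels")
fetch = trace.record("tool_call", parent=request, tool="web_fetch")
write = trace.record("memory_write", parent=fetch, source="web_fetch")
path = trace.ancestry(write)
```

With parent links in place, an investigator can ask which step caused a suspicious memory write; here `path` walks back from the write through the web fetch to the original request.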

Gautham Koorma

Put the Policy Where the Decision Gets Made

In "GuardOn: Shifting Kubernetes Compliance Left with Developer-First Guardrails," Sajal Nigam, Expert Application Engineer at Capital One, mapped the enforcement landscape for Kubernetes configuration security and identified the gap sitting at its center. He said traditional models offer enforcement at four points: IDE linters, pre-commit hooks, CI/CD pipeline checks, and admission controllers. None of them catches a misconfiguration at the moment when a human reviewer is evaluating it with full context and the authority to block it. That moment is the pull request review.

PR review is mandatory in most engineering organizations, collaborative by design, and audited by default. When a configuration change with `privileged: true` or missing resource limits gets submitted, the PR is where a reviewer is already looking at the change and deciding whether to approve it.

His open-source project GuardOn puts a browser extension at that point. It runs a local rule engine against the YAML in the pull request and surfaces policy violations in the review interface. The rule engine is deterministic: same input, same output, every run. Native rules support Kyverno validation and OPA/Rego subsets, so existing policy investments carry over without rewriting. Security teams invest heavily at the perimeter of the software delivery pipeline: at the scanner, at the runtime, at the admission controller.
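As a rough illustration of what a deterministic rule check does, the sketch below flags the two problems mentioned earlier, `privileged: true` and missing resource limits, in an already-parsed Pod manifest. It is not GuardOn's code and uses neither Kyverno nor Rego syntax: the `check_pod` function, the violation messages, and the sample manifest are all assumptions.

```python
from typing import Any, Dict, List


def check_pod(manifest: Dict[str, Any]) -> List[str]:
    """Return policy violations for a parsed Pod manifest.

    Pure function over plain data: the same manifest always yields the same
    list, in container order.
    """
    violations = []
    for container in manifest.get("spec", {}).get("containers", []):
        name = container.get("name", "<unnamed>")
        security = container.get("securityContext") or {}
        if security.get("privileged") is True:
            violations.append(f"{name}: privileged container")
        resources = container.get("resources") or {}
        if not resources.get("limits"):
            violations.append(f"{name}: missing resource limits")
    return violations


# Hypothetical manifest, shown as the dict a YAML parser would produce.
pod = {
    "kind": "Pod",
    "spec": {
        "containers": [
            {"name": "app", "securityContext": {"privileged": True}},
            {
                "name": "sidecar",
                "resources": {"limits": {"cpu": "250m", "memory": "128Mi"}},
            },
        ]
    },
}

findings = check_pod(pod)
```

For this manifest, `findings` reports that `app` is privileged and has no limits, while `sidecar` passes both checks.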

GuardOn shifts the question: what changes when enforcement lands where developers and reviewers are already paying attention?

Sajal Nigam

Over-Permission Is the Default State

The most persistent theme throughout all the sessions was that we keep over-provisioning our workloads. Agents inherit the credentials of service accounts and IAM roles that were themselves over-provisioned for convenience. When agents chain together, the privilege surface compounds. This is the predictable result of decades of credential sprawl meeting a new class of autonomous actor. Scoped identity, filtered to match the specific task, has to be designed in from the start.

Deterministic Problems Need Deterministic Systems

Language models are being asked to do work they are structurally wrong for. Probabilistic reasoning is genuinely powerful for discovery and interpretation, for the parts of a problem where ambiguity is the feature. Computing the same correct answer on every run is a different category of problem. The architectural discipline is to be explicit about which tier each step belongs to. Language models handle interpretation. Code and solvers handle computation.

Forensics Is the Missing Design Requirement

Most agentic deployments lack the ability to reconstruct what happened and replay the sequence of inputs, tool calls, and context states that produced a bad result. Tools based on OpenTelemetry provide the primitives. The missing piece is the convention of treating agent actions as audit artifacts by default, just as database transactions are logged. Every agent action is an identity event that needs attribution, traceability, and retention. 

Walk the Tunnel Before There Is a Fire

The tunnels under the National Mall were not designed together. They accumulated over decades, each addition solving a local problem without anyone mapping what was already there. Agentic AI is being layered into the same kind of inherited infrastructure, inheriting the same unresolved debt. The discipline to address it already exists: scope identity to the task, route computation to deterministic engines, and log every agent action as an audit artifact. The same principles that made distributed systems governable apply here, one layer up.

The harder part is organizational. Over-permissioning persists because scoping access requires upfront time. Forensic infrastructure gets skipped because no one put it on the roadmap. Enforcement lands at the perimeter because that is where it has always landed. Every presenter at this summit was working on a piece of the same question: what does responsible autonomy look like when the full action surface is still being mapped? 

The technical answers are available. Acting on them before a breach makes it mandatory is a different kind of decision.