The Complete Guide to Sandboxing Autonomous Agents: Tools, Frameworks, and Safety Essentials

The pattern shows up in incident reports, forum posts, and whispered Slack conversations with disturbing regularity: an AI coding assistant, given a routine task, interprets it just slightly wrong—and destroys something important. A Cursor user watches their agent wipe a Git repository. An Amazon Q developer extension ships a prompt-stealing worm. A Claude Code session, asked to "clean up old files," nukes configuration directories that took weeks to build.

None of these agents were malicious. They were helpful. That's the problem.

This is the paradox at the heart of agentic coding: the same autonomy that makes AI coding assistants revolutionary also makes them dangerous. When you give an AI the ability to write code and execute it, you're not just dealing with autocomplete anymore. You're handing the keys to your kingdom to an entity that hallucinates package names in nearly 20% of generated imports, that can be tricked by white-on-white text hidden in web pages, and that will cheerfully phone home to an attacker's server if the right prompt injection lands.

The industry's response has been rapid but uneven. Some tools ship with sophisticated sandboxing built-in. Others leave security as an exercise for the reader. And a disturbing number of developers are running AI-generated code directly on their production systems because, well, it's faster.

This guide is for the rest of us—the practitioners who understand that agentic coding isn't going away, but who need to deploy it without getting destroyed. We'll cover the full spectrum of isolation technologies, from lightweight OS primitives to hardware-backed microVMs. We'll examine how the major tools actually implement security (spoiler: the variation is wild). And we'll give you a practical framework for choosing the right level of protection for your threat model.

The bottom line, stated upfront: container isolation alone is insufficient for untrusted AI-generated code execution. Defense-in-depth combining OS primitives, hardware virtualization, and network segmentation is now mandatory. If that sounds paranoid, you haven't been paying attention to the CVEs.

The Threat Model: What Actually Goes Wrong

Understanding why sandboxing matters requires understanding what happens when it fails. The threat landscape for agentic coding systems differs fundamentally from traditional application security because the attack surface is the agent itself.

Prompt Injection: OWASP's #1 LLM Vulnerability

The most dangerous attacks don't target your infrastructure—they target your agent's reasoning. Prompt injection—ranked as the #1 risk in OWASP's 2025 Top 10 for LLM Applications—comes in two flavors, and both are worse than you think.

Direct injection is the obvious one: users telling agents to "ignore previous instructions and do something malicious." It's well-understood and partially mitigated by modern models. The real threat is indirect injection—malicious instructions hidden in external data sources that the agent consumes during normal operation.

Researchers have demonstrated successful attacks via web pages (white-on-white text invisible to users but parsed by agents), PDF documents (hidden text layers), code comments in repositories the agent clones, and even YouTube transcripts. The ZombAIs attack proved that agents with web browsing capability could be compromised by hidden HTML instructions, leading to autonomous malware downloads without any user interaction.

CVE-2024-5565 in Vanna.AI demonstrated the endpoint: remote code execution via AI-generated SQL and Python, triggered entirely through prompt injection. The agent wasn't buggy. It was working exactly as designed—it just happened to be designed in a way that trusted external input.

Recent research on the Hackode framework reports an average 84.29% attack success rate for inducing vulnerable code generation across several evaluated LLMs. Your system prompt saying "never delete files" is not a security control.

Supply Chain Attacks: Slopsquatting and Hallucinated Dependencies

LLMs hallucinate. This isn't news. What is news is that attackers have figured out how to weaponize it.

"Slopsquatting" exploits the fact that language models hallucinate non-existent package names with disturbing regularity. A large-scale study analyzing 2.23 million package references found that 19.7% pointed to packages that don't exist—440,445 hallucinated references total, encompassing 205,474 unique non-existent package names. Crucially, 58% of these appeared repeatedly across multiple prompts, making them predictable targets for attacker registration on npm and PyPI.

Traditional typosquatting remains equally effective. By late 2025, the Shai-Hulud and Shai-Hulud v2 campaigns had compromised hundreds of npm packages—upwards of 800 in some tallies. The payloads hunted for GitHub tokens and cloud API keys, with CI/GitHub Actions lateral movement as a common escalation path.

Network Exfiltration

When an AI agent has access to environment variables containing secrets (and most do), network exfiltration becomes trivial. Smart Labs AI and the University of Augsburg ran 1,068 attack variants per model against popular agents, demonstrating that it's straightforward to make them read internal files, encode the contents (e.g., Base64), and leak them via seemingly harmless HTTP requests.

The agent isn't doing anything obviously malicious. It's making an HTTP request—the same thing it does when fetching documentation or installing packages. The difference is that the request URL contains your AWS credentials.

DNS exfiltration is even harder to catch. Since DNS traffic is typically allowed through even restrictive firewalls, a compromised agent can encode sensitive data in DNS queries and exfiltrate it character by character. Slow? Yes. Effective? Devastatingly.

Filesystem Attacks

Naive path validation has proven repeatedly inadequate. CVE-2025-53109 and CVE-2025-53110 in the filesystem MCP server—the @modelcontextprotocol/server-filesystem reference implementation used by Anthropic's tools and others—demonstrated that simple path prefix matching could be bypassed through symlink exploitation. Crafted symlinks allowed attackers to escape sandboxed directories entirely, enabling reads of /etc/sudoers, writes to macOS Launch Agents for persistence, and full system takeover.

The lesson: if your sandbox relies on checking whether a path starts with /home/user/project, you've already lost.

Inter-Agent Trust

Multi-agent systems are increasingly common—orchestrators that coordinate multiple specialized agents to complete complex tasks. But research testing 17 LLMs found that 82.4% will execute malicious tool calls or code when requested by "peer agents," compared to 41.2% success rate for direct prompt injection and 52.9% for RAG backdoor attacks.

Agents trust each other. If you compromise one agent in a multi-agent system, you've likely compromised them all.

The Isolation Spectrum

Sandboxing technologies form a hierarchy from convenient-but-weak to secure-but-complex. Understanding this spectrum is essential because there is no one-size-fits-all solution—the right choice depends on your threat model, performance requirements, and operational capacity.

Tier 1: Hardware Virtualization (Firecracker, Kata Containers)

The gold standard for untrusted code execution is complete virtual machine isolation. Each execution environment boots its own Linux kernel, isolated from the host by the hypervisor boundary. System calls from the guest cannot reach the host kernel directly—they're mediated by virtualized hardware.

Firecracker, the Rust-based virtual machine monitor that powers AWS Lambda, represents the current state of the art. It boots microVMs in under 125 milliseconds with less than 5 MiB memory overhead per instance. The minimal device model—only virtio-net, virtio-block, virtio-vsock, serial console, and a keyboard controller for reset—reduces attack surface from QEMU's approximately 1.4 million lines of C to around 50,000 lines of memory-safe Rust.

AWS processes trillions of Lambda invocations monthly on this foundation. No publicly disclosed VM escapes from user-space code in Firecracker have been documented as of late 2025—though this is exactly the kind of claim that readers should re-verify as the landscape evolves.

Kata Containers combines OCI compatibility with VM-backed isolation, supporting multiple hypervisors including QEMU, Cloud Hypervisor, and Firecracker itself. Boot times are typically in the low hundreds of milliseconds with roughly an order of magnitude more memory overhead than standard containers—acceptable for enterprise multi-tenancy but configuration-dependent.

The tradeoff is complexity. You need Linux with KVM support (bare metal or nested virtualization in cloud), kernel images, rootfs management, networking configuration, and operational overhead for monitoring and log aggregation across VMs.

Best for: Multi-tenant production code execution, serverless platforms, any scenario involving fully untrusted code from external sources.

Tier 2: User-Space Kernel Interception (gVisor)

Rather than hardware virtualization, gVisor implements a user-space kernel (called "Sentry") written in Go that intercepts every system call from the container and emulates a Linux kernel interface. The container still shares the host kernel, but cannot invoke syscalls directly—they're filtered and mediated by Sentry. This approach reduces host kernel exposure from approximately 350 syscalls to around 68. Google Cloud Functions, Cloud Run, and GKE all use gVisor for multi-tenant isolation.

The tradeoff is performance. In Google's own benchmarks and follow-up research, basic syscalls are often 2–9× slower than native, and filesystem operations that require host mediation can be tens to over 100× slower on synthetic I/O microbenchmarks. Real-world impact varies significantly by workload. Startup remains fast (50-100ms), and memory overhead is modest.

Best for: Kubernetes multi-tenant environments, workloads that can tolerate syscall overhead, teams already invested in the container ecosystem.

Tier 3: Container Hardening (Docker + seccomp + namespaces)

Standard containers using Docker, containerd, or runc provide process-level isolation using Linux kernel namespaces (pid, mount, network, ipc, user, uts), resource limits via cgroups, and syscall filtering via seccomp-bpf. This is fast—near-native performance with sub-100ms startup. It's also well-understood, with extensive tooling and documentation.

The critical limitation: containers share the host kernel. They are not security boundaries in the way hypervisors are. NIST and security researchers have been clear on this point. Container escapes remain an active CVE category.

January 2024 brought CVE-2024-21626 ("Leaky Vessels") in runc, where a WORKDIR set to /proc/self/fd/<fd> could exploit file descriptor leaks for container escape. November 2025 added three more high-severity runc vulnerabilities (CVE-2025-31133, CVE-2025-52565, CVE-2025-52881), all rated high under CVSS 4.0 (around 7+).

Proper hardening helps: seccomp profiles that filter syscalls, dropped capabilities (especially CAP_SYS_ADMIN), user namespace remapping, read-only root filesystems. But even hardened containers accept kernel-level risk.

A baseline hardened Docker configuration for AI agent execution:

docker run -d \
  --user 1001:1001 \
  --read-only \
  --tmpfs /tmp:rw,noexec,nosuid,size=64M \
  --cap-drop ALL \
  --security-opt no-new-privileges:true \
  --security-opt seccomp=/path/to/profile.json \
  --memory="512m" \
  --cpus="1.0" \
  --pids-limit 100 \
  --network none \
  agent-image:latest

This is a starting point; the CIS Docker Benchmark provides more comprehensive hardening guidance.

Best for: Development environments, CI/CD pipelines, scenarios where the threat model is "prevent accidental damage" rather than "resist active adversaries."

Tier 4: OS-Level Sandboxing (Bubblewrap, Seatbelt)

Lightweight OS primitives can create meaningful isolation without container or VM overhead. When sandboxing is enabled, Anthropic's Claude Code uses Linux's Bubblewrap and macOS Seatbelt to run sandboxed bash shells that enforce filesystem and network boundaries.

This approach defines exactly which directories and hosts the agent can access. Processes inherit restrictions, so nested shells can't escape. Startup is essentially instant, and resource overhead is minimal.

The limitation is the shared kernel—a severe kernel exploit could escape. But for trusted-ish code (your own agent working on your own codebase), this provides meaningful protection against accidents and low-sophistication attacks.

Best for: Local development, single-user scenarios, teams wanting fine-grained policy control without container complexity.

Tier 5: Permission-Gated Runtimes (Deno)

Deno and similar runtimes require explicit permission grants for network, filesystem, and subprocess access. No capabilities are available by default. This isn't sandboxing in the formal sense—a bug in the Deno runtime itself could allow escape. But it makes policies explicit and easier to audit. Permission-gated runtimes work well as an inner layer combined with OS or container isolation.

Best for: Controlling which APIs agents can call, not isolating execution. Complementary to true sandboxing.

Tier 6: Prompt-Only Controls

Telling the LLM "don't delete files" and hoping it listens. This has an 84%+ failure rate against targeted attacks. It's not sandboxing. It's a sign that says "Please don't break in."

Verdict: Not acceptable for any production agentic system.

How Major Tools Actually Implement Security

The landscape of sandboxing approaches across major agentic coding tools reveals significant variation—from sophisticated multi-layer isolation to "that's the user's problem."

Claude Code: Dual-Boundary Isolation

Anthropic's approach is the most technically sophisticated among mainstream tools. When sandboxing is enabled, Claude Code implements isolation using OS-level primitives rather than traditional containerization (detailed in their engineering blog post on sandboxing).

On Linux, it uses Bubblewrap for filesystem and process isolation. On macOS, Apple's Seatbelt (sandbox-exec) provides enforcement. Critically, all network traffic routes through proxy servers running outside the sandbox on Unix domain sockets—the Linux network namespace is removed entirely. Processes inside the sandbox can only communicate via HTTP (through the HTTP proxy) or TCP (through SOCKS5).

This architecture achieves two things: it prevents direct network exfiltration even if the sandbox is partially compromised, and it allows fine-grained logging of all network activity. Configuration supports domain allowlists and denylists, filesystem path controls with read/write granularity, and protection for sensitive paths like ~/.ssh and ~/.aws.

Anthropic reports an 84% reduction in permission prompts with this approach—users can let the agent work more autonomously because the blast radius of any mistake is contained. The sandbox runtime is available as an open-source research preview.

The web-hosted version adds credential isolation: git keys and signing keys never exist inside the sandbox. A proxy service handles authentication, verifying operations and applying real credentials on the host. If the sandbox is compromised, the attacker still can't access credentials.

GitHub Copilot Workspace: Process Gates Over Technical Isolation

GitHub's Copilot agents run in ephemeral execution environments with integrated firewall controls. The technical isolation is standard container-level, but the security model emphasizes process controls: the agent has read-only repository access and can only push to Copilot-managed branches. No code the agent writes can run until a human approves a pull request.

This is a fundamentally different philosophy—rather than trying to make sandbox escape impossible, it makes sandbox escape irrelevant for the most dangerous actions. The agent can be compromised, but it can't ship code to production without human approval. Network access is firewall-controlled with configurable allowlists.

Cursor

Cursor 2.x adds an optional agent sandbox on macOS that gives the agent a dedicated worktree and, in sandboxed mode, can block network access and isolate git state. The implementation uses git worktree isolation for parallel agent operations. By default, however, many users still let Cursor run commands directly in their local environment. The sandbox is opt-in.

OpenHands (formerly OpenDevin)

OpenHands runs all agent-generated code in isolated Docker containers, with optional Daytona integration for stronger isolation via zero-trust sandboxes and ephemeral environments. The architecture separates the runtime container from the main application.

This is solid container-level isolation with the option to upgrade to stronger boundaries through third-party integration.

Aider and Continue.dev

Aider and Continue.dev tools do not include built-in sandboxing. Users are responsible for implementing their own isolation via Docker or similar mechanisms. Recent documentation sometimes recommends Docker setups, but the tools themselves don't enforce it. This isn't necessarily wrong—some users have security expertise and prefer to control their own stack. But it shifts significant security responsibility to end users who may not realize they need to act.

Cloud Execution Services

If building your own sandboxing infrastructure sounds like a bad time (it is), a category of services has emerged specifically for AI-generated code execution.

E2B: The Firecracker-Native Option

E2B markets itself as the go-to Firecracker sandbox for AI agents, claiming adoption at 88% of the Fortune 100 (a vendor claim worth noting as such). Users include Perplexity, Hugging Face, and Groq.

The architecture is straightforward: your agent runs locally, and when it needs to execute code, it calls E2B's API. E2B spins up an isolated Firecracker microVM in approximately 150 milliseconds, executes the code, returns the output, and cleans up. Credentials never enter the sandbox.

Current public pricing is on the order of $0.05 per vCPU-hour for sandboxes. Sessions support up to 24 hours on paid plans. SDKs exist for Python and JavaScript, with integrations for the major AI frameworks.

The tradeoff is vendor lock-in and network latency—150-500ms per execution adds up for interactive workloads.

Modal: gVisor with GPU Support

Modal uses gVisor containers rather than Firecracker, with the key differentiator being GPU support. H100, A100, L40S, and T4 GPUs are available with per-second billing, making Modal the default choice for AI workloads that need to run models inside the sandbox.

Sandbox-specific pricing runs at 3× standard container rates per Modal's documentation, positioning Modal between E2B's pure-sandbox play and general-purpose serverless GPU compute.

Daytona: Speed-Optimized Containers

Daytona pivoted to AI agent sandboxing in early 2025, offering Docker containers with optional Kata Containers or Sysbox for enhanced isolation. Daytona claims cold starts under 90ms—if accurate, the fastest option available. Language Server Protocol support enables sophisticated code intelligence.

The limitation is relative immaturity—features like advanced networking and snapshotting are still developing.

Together Code Sandbox

Together.ai's sandbox service can start full VM instances from snapshots in approximately 500ms according to their benchmarks, targeting "IDE-style" persistent agents that need large VMs with state. Up to 64 vCPUs are available with versioned storage.

Pricing is VM-like (per vCPU-minute), making it less cost-effective for quick ephemeral tasks but appropriate for long-running development environments.

Building Defense-in-Depth

Effective security for agentic coding requires layered defenses. Any single control may fail. The goal is ensuring that when one layer fails, others contain the damage.

Layer 0: Hardware Isolation

For fully untrusted code, Firecracker microVMs or Kata Containers provide the foundation. Each execution environment gets its own kernel, isolated by Extended Page Tables (EPT). Cross-VM memory access is impossible without hypervisor escape.

Layer 1: OS-Level Controls

Landlock LSM (Linux 5.13+) enables unprivileged process self-sandboxing with hierarchical filesystem restrictions and network controls (TCP bind/connect since Linux 6.7). Seccomp-BPF provides syscall filtering with single-digit nanosecond overhead. Linux namespaces (mount, PID, network, user) provide resource isolation.

Layer 2: Container Hardening

Docker and similar container runtimes ship with permissive defaults designed for compatibility, not security. Hardening requires explicit configuration:

Capability dropping removes Linux capabilities—granular root privileges that containers inherit by default. The flag --cap-drop ALL removes all 40+ capabilities (like CAP_NET_RAW for raw sockets, CAP_SYS_ADMIN for mount operations). Add back only what's strictly needed with --cap-add:

docker run --cap-drop ALL --cap-add NET_BIND_SERVICE myimage

Custom seccomp profiles restrict which syscalls the container can make. Docker's default profile blocks ~44 dangerous syscalls, but you can create stricter profiles that allowlist only the specific syscalls your application needs:

docker run --security-opt seccomp=/path/to/profile.json myimage

User namespace remapping maps container root (UID 0) to an unprivileged host user, so even if an attacker escapes as "root," they land as nobody on the host:

# In /etc/docker/daemon.json
{ "userns-remap": "default" }

Read-only root filesystem prevents attackers from modifying binaries or dropping persistence mechanisms:

docker run --read-only --tmpfs /tmp myimage

Resource limits via cgroups prevent denial-of-service through resource exhaustion:

docker run --memory=512m --cpus=1 --pids-limit=100 myimage

A fully hardened container invocation might look like:

docker run \
  --cap-drop ALL \
  --security-opt no-new-privileges:true \
  --security-opt seccomp=/path/to/strict-profile.json \
  --read-only \
  --tmpfs /tmp:size=64m \
  --memory=512m \
  --cpus=1 \
  --pids-limit=100 \
  --network=none \
  myimage

Layer 3: Application Sandbox

Tool-specific restrictions add another layer. Configure filesystem allowlists (project directory only), network allowlists (package registries and specific APIs), and deny sensitive paths (~/.ssh, ~/.aws, ~/.gnupg).

Layer 4: Network Segmentation

Default-deny egress with allowlisted domains prevents exfiltration. Block internal addresses (10.x, 192.168.x, 169.254.169.254) to prevent cloud metadata access. Route all traffic through logging proxies—if the agent makes a suspicious request, you want to know about it. DNS deserves special attention. It's the favorite exfiltration channel for sophisticated attackers because it's almost always allowed through firewalls. Consider DNS logging and anomaly detection.

Layer 5: CI/CD Gates

Static analysis with CodeQL or Semgrep, secret scanning, and dependency review should run on all AI-generated code before merge. This catches the accidental security issues that agents routinely introduce.

Layer 6: Human Review

All AI-generated code should be reviewed before production deployment. Not skimmed—reviewed. Audit logs of agent actions enable forensic analysis and behavioral detection. For especially sensitive operations—database modifications, credential creation, production deployments—require explicit human approval in the agent workflow itself.

Decision Framework

For Trusted Internal Development

When the AI assists with your own code and speed matters, Landlock + seccomp provides minimal overhead. Claude Code's sandbox mode or Cursor's agent sandbox add meaningful protection without significant performance impact. This accepts some risk in exchange for velocity. Appropriate when you trust your codebase's inputs and the agent isn't exposed to external content.

For Package Installation and Untrusted Dependencies

When AI may install packages from public registries, Docker with seccomp hardening balances isolation and performance. Add package validation against known-good registries—don't let the agent pip install hallucinated-package-namewithout verification.

For Multi-Tenant SaaS

When customer code executes in your infrastructure, gVisor or Kata Containers provide the necessary isolation. One customer's compromised agent cannot impact another's data. Google Cloud Run 2nd generation or GKE Sandbox (gVisor-based) offers managed infrastructure if you don't want to operate this yourself.

For Fully Untrusted Code

Production execution of AI-generated code from untrusted sources requires Firecracker microVMs. E2B or self-hosted Firecracker with jailer provide this foundation. (Note that AWS Lambda uses Firecracker under the hood but isn't a configurable sandbox service you can wire agents into directly.)

Fallacies and Pitfalls

1. Containers Are Not Security Boundaries

Docker, OCI runtimes, and Kubernetes Pod Security Policies are convenience layers, not vaults—not in the way hypervisors are. A misconfigured seccomp profile, a kernel vulnerability, or privileged mode all break them. Treat containers as process groups with guardrails.

2. Firecracker Is Not Bulletproof

Firecracker is genuinely more isolated than containers, but hypervisor bugs exist (rare, patched promptly), nested virtualization adds complexity, and physical-layer attacks (Rowhammer, side-channels) remain theoretically possible. For typical threat models—accidental damage, script kiddies, opportunistic attackers—Firecracker is excellent. For adversaries with hardware-level access or nation-state resources, nothing is bulletproof.

3. Network Isolation Is Hardest

Making a sandbox unable to talk to the network is easy. Making it unable to exfiltrate data is hard. A compromised sandbox can encode data in DNS queries, use ICMP to tunnel data, or exploit any application-layer protocol you've allowed. Egress allowlists (block-by-default) help. DNS inspection helps more. Anomaly detection for traffic patterns is approaching necessity.

4. Prompts Are Not Guardrails

If a capability is dangerous, remove it via policy, not prompts. The system message is not a security boundary.

5. RAG Is an Emerging Attack Surface

Vector databases, prompt injection chains, and memory poisoning are among the most concerning emerging attack vectors for agent systems. A compromised retrieval pipeline can inject adversarial context into every agent decision. Isolate RAG components, sanitize retrieved content, and monitor for injection signatures.

Outlook 2026

Hardware-Assisted Isolation

AMD SEV-SNP, Intel TDX, and ARM CCA are maturing, enabling encrypted VMs even at the hypervisor level. This provides defense against insider threats at cloud providers—your agent's execution is protected even from the infrastructure operator. Expect cost premiums initially, with commoditization likely by 2027—though as with all forward-looking statements, verify as the landscape develops.

Formal Verification

Academic work on proving sandbox properties is progressing. Tools like Isabelle/HOL are being applied to seccomp policies and hypervisor configurations. "Certified isolation" remains rare but is growing.

Standardized Sandbox APIs

E2B, Replit, Modal, and others are converging on similar interfaces. Expect standardized APIs for isolated code execution, potentially leading to vendor-agnostic implementations and price commoditization.

Recommendations

1. Solo Developers and Startups

Docker with seccomp profiles plus a network proxy provides adequate protection for development workloads. Or outsource to E2B or Replit and skip the infrastructure entirely. Time investment: 1-2 days to set up proper seccomp rules. Cost: free (open-source) or approximately $100/month for managed service.

2. Enterprise and Multi-Tenant Systems

Firecracker or gVisor with SIEM integration. Kafka or SQS for work distribution. Dedicated personnel for operations and red teaming. Time investment: 4-8 weeks to build the control plane. Cost: $5K-$20K/month for infrastructure plus personnel.

3. AI/ML Teams Building Agentic Platforms

Start with E2B or Replit—let them solve the infrastructure problem. Your differentiation is in agent design, monitoring, and governance, not container orchestration. Time investment: 2-3 weeks to integrate. Then focus on the hard problems.

Checklist

Sandboxing is not optional. If you're running LLM-generated code, you need a boundary.
Choose based on threat model, not marketing. "AI-safe" doesn't mean secure. Isolate based on what could actually go wrong.
Combine multiple layers. Sandbox plus monitoring plus human-in-the-loop gates plus signed artifacts plus red teaming.
Test failure modes, not happy paths. Can the agent delete files? Exfiltrate data? DoS the host? Test for it. Reproduce symlink escapes, disk fills, network exfiltration attempts.
Keep it simple to start. Docker plus seccomp is a solid foundation. Upgrade to Firecracker when you need it, not before.
Document decisions. Write down why you chose a sandbox and what you're protecting against. This helps during audits and when onboarding new team members.
Prompts are not guardrails. This one bears repeating.

The future of software development includes AI agents that can write and execute code autonomously. That future can be secure—but only if we're honest about the risks and rigorous about mitigation. The technology stack has matured to meet the challenge. The question is whether we'll use it.