Restricted Execution for LLM Agents: Sandboxing Untrusted Code
LLM agents that execute code present a fundamental security challenge: how do you safely run potentially malicious commands generated by an AI? The answer lies in process isolation and sandboxing—techniques that have been refined over decades in Linux and macOS systems.
The Problem: Trusting AI-Generated Code
When an LLM agent needs to interact with a system—whether running shell commands, executing scripts, or manipulating files—you're essentially executing untrusted code. The agent might:
- Access sensitive files it shouldn't read
- Modify or delete critical system files
- Establish network connections to exfiltrate data
- Consume excessive system resources
- Execute privilege escalation attacks
This isn't theoretical. Any production LLM agent with command execution capabilities requires robust isolation to prevent accidental or malicious damage.
Linux Sandboxing Technologies
Firejail: Namespace-Based Isolation
Firejail leverages Linux namespaces and seccomp-bpf to create lightweight sandboxes. It isolates processes using:
- Filesystem isolation: Private /tmp, restricted filesystem access via whitelists/blacklists
- Network isolation: Disable networking entirely or restrict to specific interfaces
- PID namespace: Process cannot see or signal other processes
- Seccomp filters: Syscall filtering to block dangerous operations
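These flags can be assembled programmatically before handing a command to the sandbox. A minimal Python sketch, where the helper name and option set are illustrative (only the `firejail` flags themselves come from the tool):

```python
import shutil
import subprocess

def build_firejail_argv(command, allow_network=False, private_dir=None):
    """Build a firejail invocation for an untrusted command."""
    argv = ["firejail", "--quiet"]
    if not allow_network:
        argv.append("--net=none")                # drop all network access
    if private_dir is not None:
        argv.append(f"--private={private_dir}")  # throwaway home directory
    return argv + list(command)

argv = build_firejail_argv(["echo", "hello"], private_dir="/tmp/sandbox")
# Only attempt execution where firejail is actually installed.
if shutil.which("firejail"):
    subprocess.run(argv, check=True)
```

Building the argv as data, rather than interpolating into a shell string, also sidesteps one class of injection when the command originates from an LLM.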
```shell
# Run command with no network and restricted filesystem
firejail --net=none --private=/tmp/sandbox command
```

Landlock: Kernel-Level Access Control
Landlock (merged in Linux 5.13) provides unprivileged sandboxing at the kernel level. Unlike traditional MAC systems (SELinux, AppArmor), Landlock allows processes to self-restrict without root privileges:
```c
// Restrict filesystem access to specific paths: disable new privileges,
// then apply a prepared ruleset (raw syscalls; glibc has no Landlock wrappers)
prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
landlock_restrict_self(ruleset_fd, 0);
```

Landlock is particularly valuable for LLM agents because it enables fine-grained filesystem restrictions without requiring privileged operations or complex policy files.
Bubblewrap: User Namespace Containers
Bubblewrap creates lightweight containers using user namespaces. It's the sandboxing engine behind Flatpak and provides:
- Unprivileged container creation
- Custom filesystem layouts via bind mounts
- Network namespace isolation
- No setuid binaries required
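The bind-mount layout can likewise be expressed as data. A hedged Python sketch (the helper is illustrative; only the `bwrap` flags come from the tool):

```python
import os
import shutil
import subprocess

def build_bwrap_argv(command, ro_binds=(), unshare_net=True):
    """Build a bubblewrap argv: read-only binds, private /tmp, optional net isolation."""
    argv = ["bwrap"]
    for path in ro_binds:
        argv += ["--ro-bind", path, path]  # mount path read-only at the same location
    argv += ["--tmpfs", "/tmp"]            # fresh, private /tmp
    if unshare_net:
        argv.append("--unshare-net")       # new (empty) network namespace
    return argv + list(command)

argv = build_bwrap_argv(
    ["ls", "/usr"],
    ro_binds=[p for p in ("/usr", "/lib", "/lib64", "/bin") if os.path.exists(p)],
)
# Only attempt execution where bubblewrap is actually installed.
if shutil.which("bwrap"):
    subprocess.run(argv)
```

Everything not explicitly bound is simply absent inside the sandbox, so the default posture is deny-by-omission.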
```shell
bwrap --ro-bind /usr /usr --tmpfs /tmp --unshare-net command
```

macOS Sandboxing: sandbox-exec
macOS provides sandbox-exec for process sandboxing, built on the Seatbelt sandbox framework. Profiles are written in a Scheme-like policy language; although Apple has deprecated sandbox-exec, it remains functional and widely used:
```scheme
;; Deny everything by default, then allow reads of system libraries
;; and writes to /tmp. Run with: sandbox-exec -f policy.sb command
(version 1)
(deny default)
(allow file-read* (subpath "/usr/lib"))
(allow file-write* (subpath "/tmp"))
(deny network*)
```

While less flexible than Linux solutions, sandbox-exec provides adequate isolation for many LLM agent use cases on macOS.
Comparison of Sandboxing Technologies
| Feature | Firejail | Landlock | Bubblewrap | sandbox-exec | Docker |
|---|---|---|---|---|---|
| Platform | Linux | Linux 5.13+ | Linux | macOS | Linux/macOS |
| Privileges Required | None (setuid binary) | None | None | None | Docker daemon access |
| Filesystem Isolation | Whitelist/blacklist | Path-based ACL | Bind mounts | Policy-based | Full container FS |
| Network Isolation | Yes | TCP bind/connect only (6.7+) | Yes | Yes | Yes |
| PID Namespace | Yes | No | Yes | Limited | Yes |
| Seccomp Filtering | Yes | No | Via --seccomp fd | No | Yes |
| Overhead | Low | Minimal | Low | Low | Medium-High |
| Complexity | Medium | Low | Medium | Low | High |
| Maturity | Stable | New (2021) | Stable | Stable | Very Stable |
| Use Case | General sandboxing | Fine-grained FS control | Desktop apps | macOS isolation | Full isolation |
go-restricted-runner: Unified Sandboxing Interface
The go-restricted-runner library provides a unified Go interface for multiple sandboxing backends:
```go
import "github.com/inercia/go-restricted-runner/pkg/runner"

// Create a firejail-based runner
r, err := runner.New(runner.TypeFirejail, runner.Options{
    "allow_networking":   false,
    "allow_read_folders": []string{"/tmp/workspace"},
}, logger)

// Execute untrusted command
output, err := r.Run(ctx, "sh", "echo 'Hello from sandbox'", nil, nil, false)
```

Supported Runners
- exec: Direct execution (no isolation) - for trusted environments
- sandbox-exec: macOS sandboxing via Seatbelt
- firejail: Linux namespace-based isolation
- docker: Container-based isolation with full filesystem separation
Interactive Process Communication
For LLM agents that need REPL-style interaction or streaming output:
```go
stdin, stdout, stderr, wait, err := r.RunWithPipes(
    ctx, "python3", []string{"-i"}, nil, nil,
)

// Send commands to the sandboxed process
fmt.Fprintln(stdin, "import os; print(os.getcwd())")
stdin.Close()

// Read streaming output
output, _ := io.ReadAll(stdout)
wait()
```

This enables agents to maintain stateful sessions with interpreters (Python, Node.js, etc.) while remaining isolated.
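For comparison, the same pipe-based session pattern can be sketched with nothing but the Python standard library (no sandboxing is applied here; in production the interpreter would be launched through one of the runners above):

```python
import subprocess
import sys

# Start an interactive interpreter with streams piped.
# CPython writes its >>> prompts to stderr, so stdout stays clean.
proc = subprocess.Popen(
    [sys.executable, "-i"],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.DEVNULL,
    text=True,
)

# Write one command, close stdin (EOF ends the session), collect output.
out, _ = proc.communicate("print(21 * 2)\n")
proc.wait()
```

`communicate` handles the write/close/drain sequence in one call; a long-lived agent session would instead keep stdin open and read stdout incrementally, mirroring the RunWithPipes flow above.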
Practical Considerations
Defense in Depth
Sandboxing should be one layer in a defense-in-depth strategy:
- Input validation: Sanitize LLM-generated commands before execution
- Resource limits: Use cgroups to limit CPU, memory, and I/O
- Audit logging: Record all executed commands for forensic analysis
- Least privilege: Run sandboxed processes as unprivileged users
- Timeout enforcement: Kill long-running processes automatically
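Several of these layers fit in a few lines around the actual execution call. A hedged Python sketch (Unix-only; the specific limits and logger names are illustrative, and `setrlimit` here stands in for the cgroup-based limits mentioned above):

```python
import logging
import resource
import subprocess
import sys

logging.basicConfig(level=logging.INFO)
audit = logging.getLogger("agent.audit")

def limit_resources():
    # Runs in the child just before exec: cap CPU seconds and open files.
    resource.setrlimit(resource.RLIMIT_CPU, (5, 5))
    resource.setrlimit(resource.RLIMIT_NOFILE, (64, 64))

def run_untrusted(argv, timeout=10):
    audit.info("exec: %r", argv)          # audit log before execution
    result = subprocess.run(
        argv,
        capture_output=True,
        text=True,
        timeout=timeout,                  # hard wall-clock timeout
        preexec_fn=limit_resources,       # rlimits applied inside the child
    )
    audit.info("exit=%d", result.returncode)
    return result

result = run_untrusted([sys.executable, "-c", "print('ok')"])
```

On timeout, `subprocess.run` kills the child and raises `TimeoutExpired`, which satisfies the timeout-enforcement layer without extra plumbing.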
Performance vs. Security Trade-offs
- Firejail: Low overhead, good for frequent short-lived processes
- Docker: Higher overhead, better isolation, suitable for long-running agents
- Landlock: Minimal overhead, but requires kernel 5.13+
Limitations
No sandbox is perfect. Kernel vulnerabilities, container escapes, and side-channel attacks remain possible. Critical systems should combine sandboxing with:
- Regular security updates
- Intrusion detection systems
- Network segmentation
- Principle of least privilege at the infrastructure level
Conclusion
As LLM agents become more autonomous and capable, robust sandboxing transitions from optional to mandatory. Technologies like firejail, Landlock, and bubblewrap provide the primitives needed to safely execute untrusted code. Libraries like go-restricted-runner make these technologies accessible to developers building agent systems.
The key insight: treat LLM-generated code with the same suspicion you'd apply to arbitrary user input. Sandbox everything, assume breach, and build systems that fail safely when isolation is compromised.