
Bridging the Gap: Empowering LLMs with Secure Command-Line Access

Large Language Models (LLMs) are transforming how we interact with technology, offering incredible potential for automation and assistance. However, one of the significant hurdles in making LLMs truly effective partners, especially for DevOps and SREs, is their inherent lack of awareness of our actual operational environments. They can't "see" our systems, check logs, or understand real-time states without a bridge to the real world.

The DevOps Dream: LLMs as Your On-Call Co-Pilot

Imagine an LLM that could not only understand your runbooks but also safely execute the diagnostic steps when an alert fires at 3 AM. Picture it:

  • Alert! High CPU on prod-web-7!
  • Instead of fumbling through dashboards, you ask your LLM: "What's causing high CPU on prod-web-7? Can you check recent logs for errors and see resource utilization?"
  • The LLM, guided by predefined, safe tools, could check system metrics, parse relevant log snippets, and even look up recent deployments related to the affected service.

This is the dream: LLMs assisting with the "Ops" side of DevOps, triaging incidents, gathering initial data, and even suggesting remediation steps based on established procedures. They could act as intelligent assistants, significantly reducing the cognitive load on engineers during stressful on-call situations.

The Double-Edged Sword: Letting LLMs Run Commands

The power to execute commands is immense, but so is the risk. Giving an LLM direct, unrestricted shell access is like handing over the keys to your kingdom without knowing if the recipient is a benevolent ruler or a mischievous gremlin. Accidental rm -rf / commands, data exfiltration, or unintentional system modifications are all too real possibilities. The core challenge is: how do we grant LLMs the ability to interact with systems without exposing ourselves to catastrophic failures?

Introducing MCPShell: Safe, Controlled Command Execution for LLMs

This is where the MCPShell project steps in. It's designed to be a secure gateway, allowing LLMs that support the Model Context Protocol (MCP) to interact with command-line tools in a controlled and safe manner.

How Does It Work?

The magic lies in its configuration-driven approach. You, the human operator, define a set of "tools" that the LLM can use. These definitions are written in simple YAML files and specify:

  1. The Command: What actual shell command to run (e.g., kubectl get pods -n my-namespace or aws s3 ls my-bucket).
  2. Parameters: What inputs the LLM can provide (e.g., a namespace, a bucket name, a pod label).
  3. Security Constraints: This is the crucial part. Using the powerful Common Expression Language (CEL), you define strict rules that parameters must adhere to before the command is even considered for execution.

For example, a tool to list Kubernetes pods might have constraints like:

  • The namespace parameter must not contain special characters like ; or |.
  • The namespace parameter must match a predefined list of allowed namespaces.
  • No parameters can contain ../ to prevent directory traversal.
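
Expressed as CEL, rules like these might look something like the following sketch (the exact parameter name and the allow-list values are illustrative, not taken from the project's shipped configurations):

```yaml
constraints:
  - "!namespace.matches('[;|&$]')"                    # reject shell metacharacters
  - "namespace in ['frontend-prod', 'backend-prod']"  # hypothetical allow-list
  - "!namespace.contains('../')"                      # block directory traversal
```

Each expression must evaluate to true for the call to proceed, so the constraints compose naturally: adding a rule can only tighten what the LLM is allowed to do.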

If the LLM attempts to use a tool with parameters that violate these constraints, MCPShell blocks the execution.

Key Security Features at a Glance:

  • Declarative Tool Definition: You explicitly define what commands can be run and what parameters they accept. There's no "guessing" by the LLM.
  • Parameter Validation with CEL: Robust, expression-based validation of all inputs before execution. This is your primary defense against malicious or malformed inputs.
  • Emphasis on Read-Only Operations: The project strongly encourages defining tools for read-only actions (get, list, describe). While you could define tools that make changes, the philosophy is to start with observation and diagnostics.
  • Shell and Command Templating: Commands are constructed using templates, reducing the risk of direct injection if parameters are also constrained.
  • Prerequisites: You can define prerequisites for tools, like requiring specific executables to be present or running on a particular OS.
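
To see why parameter validation and command templating work together, here is a rough Python analogue of what a CEL constraint like namespace.matches('^[a-z0-9-]+$') achieves before a template is filled in. MCPShell itself is written in Go and uses CEL, not Python; this hypothetical snippet only illustrates the idea:

```python
import re

# Mirrors the CEL constraint "namespace.matches('^[a-z0-9-]+$')"
# from the Kubernetes example below in this article.
NAMESPACE_RE = re.compile(r"^[a-z0-9-]+$")

def build_command(namespace: str) -> str:
    """Validate the parameter first, then fill in the command template."""
    if not NAMESPACE_RE.fullmatch(namespace):
        raise ValueError(f"namespace {namespace!r} rejected by constraint")
    return f"kubectl get pods -n {namespace} --output=json"

print(build_command("frontend-prod"))
# An injection attempt such as "prod; rm -rf /" contains characters
# outside [a-z0-9-], so it is rejected before any shell is involved.
```

Because validation happens before substitution, a malicious value never reaches the shell at all, rather than being escaped after the fact.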

Simplified Examples: Giving Your LLM Eyes on Your Systems

Let's look at how you might empower an LLM:

1. Checking Kubernetes Pods:

You could define a tool named get_kubernetes_pods like this (simplified):

```yaml
mcp:
  tools:
    - name: "get_kubernetes_pods"
      description: "Get a list of Kubernetes pods in a specific namespace."
      params:
        namespace:
          type: string
          description: "The Kubernetes namespace to query."
          required: true
      constraints:
        - "namespace.matches('^[a-z0-9-]+$')"   # Only allow simple namespace names
        - "!namespace.contains('kube-system')"  # Don't let it poke in critical system namespaces easily
      run:
        command: "kubectl get pods -n {{ .namespace }} --output=json"
```

Now, your LLM can be asked, "Show me pods in the frontend-prod namespace," and MCPShell will safely execute kubectl get pods -n frontend-prod --output=json if the namespace meets the constraints.

2. Listing AWS S3 Bucket Contents:

Similarly, for AWS:

```yaml
mcp:
  tools:
    - name: "list_s3_bucket_objects"
      description: "Lists objects in a specific AWS S3 bucket."
      params:
        bucket_name:
          type: string
          description: "The name of the S3 bucket."
          required: true
        max_items:
          type: number
          description: "Maximum number of items to list (1-100)."
          default: 20
      constraints:
        - "bucket_name.matches('^[a-z0-9.-]+$')"  # Basic S3 bucket name validation
        - "max_items >= 1 && max_items <= 100"    # Sensible limit
      run:
        command: "aws s3api list-objects-v2 --bucket {{ .bucket_name }} --max-items {{ .max_items }}"
```

The LLM could then fulfill requests like, "List the first 10 objects in the my-backup-bucket."

3. Running a Command in a Custom Sandbox (macOS Example):

To lock down a command even further, you can use the sandbox-exec runner on macOS to restrict what the executed process is allowed to do:

```yaml
mcp:
  tools:
    - name: "sandboxed_safe_command"
      description: "Runs a very restricted command within a macOS sandbox."
      params:
        user_input:
          type: string
          description: "Some input for the command"
      constraints:
        - "!user_input.contains('dangerous_stuff')"
      run:
        runner: sandbox-exec
        command: "echo {{ .user_input }}"  # A simple, safe command
        options:
          custom_profile: |
            (version 1)
            (allow default)      # Allow everything by default...
            (deny network*)      # ...then explicitly deny all network access
            (deny file-write*)   # ...and all file writes
            # Allow only specific read access if needed, e.g.:
            # (allow file-read* (literal "/tmp/safe-file.txt"))
```
This example demonstrates a custom sandbox profile that denies network and file write access, ensuring the echo command (or any other specified command) operates with minimal privileges.


A Glimpse of Available Tooling

The MCPShell project comes with a rich set of predefined, read-only tool configurations that you can adapt, including wrappers for:

  • Kubernetes (kubectl): For inspecting pods, services, deployments, logs, etc.
  • AWS CLI (aws): Covering various services like EC2, S3, Route53, networking components (VPCs, Security Groups, Load Balancers), and more.
  • Network Diagnostics: Tools for ping, traceroute, and dig.
  • Disk Diagnostics: For checking disk space (df, du).
  • System Performance: Tools to look at running processes.
  • GitHub CLI (gh): For interacting with GitHub repositories.

These examples are typically suffixed with -ro (read-only) in their configuration files, reinforcing the safety-first approach.

The Path Forward

MCPShell offers a pragmatic and security-conscious solution to one of the most significant challenges in applied AI: enabling LLMs to interact with live systems. By putting humans firmly in control of defining what can be run and how it can be run, it paves the way for LLMs to become invaluable assistants in complex operational environments.

While the temptation to grant LLMs write access is strong, the path to truly autonomous operations is one that must be walked carefully, with security and control as paramount concerns. Projects like MCPShell are vital steps on that journey.

Check out the project, explore its examples, and consider how it might help your LLM become a more aware and helpful co-pilot in your daily operations!