decoct

Compress infrastructure configuration for LLM context windows. Strips platform defaults, redacts secrets, and annotates deviations from your design standards, saving 15–57% of tokens depending on platform and tier. Ships 25 bundled platform schemas and accepts INI, YAML, and JSON input. Auto-detects Docker Compose, Kubernetes, Ansible, cloud-init, Terraform state, GitHub Actions, Traefik, and Prometheus.

pip install decoct

The problem

Infrastructure operations increasingly involve feeding configuration into LLM context windows — for AI-assisted troubleshooting, agent-driven operations, architecture review, and code generation against live state.

But the data is full of noise. A typical Docker Compose file is packed with platform defaults the model already knows, system-managed metadata nobody asked about, and structural boilerplate. The things that actually matter — where your configuration deviates from your standards — are buried and indistinguishable from everything else.

The result: context windows stuffed with low-value tokens, expensive to run, and missing the intent that would make them useful.

The approach

To concoct is to build something up — mixing ingredients, adding complexity. Infrastructure configuration is concocted: layers of platform defaults, boilerplate, system-managed fields, and actual intent all mixed into one document.

To decoct is the opposite. It's an old term from chemistry meaning to extract the essence by boiling something down — simmering raw material until only the concentrated, useful compounds remain. It's also, we're aware, a slightly unusual word to say out loud in a professional setting — which is fine. Memorable is good.

That's what this tool does. decoct takes the concoction of your infrastructure configuration and reduces it to its essence: what you intentionally changed, and what's wrong. Everything else boils away.

What it does

Given a Docker Compose file with platform defaults, secrets, and conformant values:

Input · 142 tokens
services:
  web:
    image: nginx:1.25.3
    restart: unless-stopped
    network_mode: bridge          # platform default
    privileged: false            # platform default
    read_only: false             # platform default
    stdin_open: false            # platform default
    tty: false                   # platform default
    ports:
      - "8080:80"
    environment:
      DB_PASSWORD: s3cret!Pass99
Output · 58 tokens · 59% saved
services:
  web:
    image: nginx:1.25.3
    ports:
      - "8080:80"
    environment:
      DB_PASSWORD: [REDACTED]
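The two transforms shown above can be sketched in a few lines. This is an illustrative sketch, not decoct's implementation: the default table, the secret-key pattern, and the function names are all invented here for clarity.

```python
import re

# Hypothetical table of Docker Compose platform defaults (illustrative,
# not decoct's bundled schema).
COMPOSE_DEFAULTS = {
    "network_mode": "bridge",
    "privileged": False,
    "read_only": False,
    "stdin_open": False,
    "tty": False,
}

# Keys that look secret-bearing (illustrative pattern).
SECRET_KEY = re.compile(r"(PASSWORD|SECRET|TOKEN|API_KEY)", re.IGNORECASE)

def strip_defaults(service: dict) -> dict:
    """Drop keys whose values match the known platform default."""
    return {k: v for k, v in service.items()
            if COMPOSE_DEFAULTS.get(k, object()) != v}

def redact_secrets(service: dict) -> dict:
    """Replace secret-looking environment values with a marker."""
    env = service.get("environment")
    if isinstance(env, dict):
        service = {**service, "environment": {
            k: ("[REDACTED]" if SECRET_KEY.search(k) else v)
            for k, v in env.items()}}
    return service

web = {
    "image": "nginx:1.25.3",
    "network_mode": "bridge",
    "privileged": False,
    "ports": ["8080:80"],
    "environment": {"DB_PASSWORD": "s3cret!Pass99"},
}
out = redact_secrets(strip_defaults(web))
print(out)
# {'image': 'nginx:1.25.3', 'ports': ['8080:80'],
#  'environment': {'DB_PASSWORD': '[REDACTED]'}}
```

Default stripping compares values, not keys: an explicitly set `privileged: true` would survive, because it deviates from the default.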

With assertions

Add your design standards as assertions. Conformant services collapse entirely; deviations are annotated with what's expected:

Input — two services
services:
  web:
    image: nginx:1.25.3
    restart: unless-stopped
    # ... healthcheck, logging, etc.
  db:
    image: postgres:latest
    restart: always
Output — only problems remain · 2 deviations
# decoct: 2 deviations from standards
# [!] no-latest: services.db.image
# [!] restart-policy: services.db.restart
services:
  web: {}
  db:
    image: postgres:latest  # [!] must not use :latest
    restart: always  # [!] standard: unless-stopped

The entire compliant web service collapsed to {} (an empty object). The database kept only its two problems, clearly annotated. An LLM reading this immediately understands: web is fine, db has two issues.
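The assertion logic behind that output can be sketched as a function that keeps only deviating values, annotated with the expected standard. The rule names mirror the output above (no-latest, restart-policy); the function shape and everything else is invented for illustration, not decoct's API.

```python
def check_service(name: str, svc: dict) -> tuple[dict, list[str]]:
    """Return (kept fields with annotations, deviation paths) for one service."""
    kept, deviations = {}, []
    image = svc.get("image", "")
    if image.endswith(":latest"):
        kept["image"] = f"{image}  # [!] must not use :latest"
        deviations.append(f"no-latest: services.{name}.image")
    restart = svc.get("restart")
    if restart is not None and restart != "unless-stopped":
        kept["restart"] = f"{restart}  # [!] standard: unless-stopped"
        deviations.append(f"restart-policy: services.{name}.restart")
    return kept, deviations

services = {
    "web": {"image": "nginx:1.25.3", "restart": "unless-stopped"},
    "db":  {"image": "postgres:latest", "restart": "always"},
}
for name, svc in services.items():
    kept, devs = check_service(name, svc)
    print(name, kept, devs)
```

A conformant service yields an empty dict and no deviations, which is exactly why `web` collapses in the output above.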

Compression tiers

Measured against 11 production Docker Compose files. Cloud-init configs achieve up to 57% from default stripping alone due to high default density.

Tier                   Measured savings  What it does                                        Requires
Generic cleanup        ~15%              Redact secrets, strip comments                      (none)
Platform defaults      ~20%              Also strip values matching platform defaults        --schema or auto-detected
Standards conformance  ~32%              Also strip conformant values, annotate deviations   --assertions

Template-generated configs (Ansible/Jinja2) reach 35–41%. Assertions are custom design rules — your organisation's standards layered on top of platform defaults. Conformant values are stripped just like defaults, driving real token savings. With class-based reconstitution via the emit-classes pass, common conformant patterns across services are deduplicated further — grouped into named classes that an LLM can reference to reconstruct full configuration from the compressed output.
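The deduplication idea behind class-based reconstitution can be sketched as follows. Services sharing identical values for a set of keys are grouped under a named class; each service then carries only a reference. The grouping keys, the `x-class` reference field, and the class naming are all assumptions made for this sketch, not decoct's actual output format.

```python
from collections import defaultdict

def emit_classes(services: dict, shared_keys=("restart", "network_mode")):
    """Pull identical shared-key values out of services into named classes."""
    groups = defaultdict(list)
    for name, svc in services.items():
        # Signature of this service's values for the shared keys.
        sig = tuple(sorted((k, svc[k]) for k in shared_keys if k in svc))
        groups[sig].append(name)
    classes = {}
    compact = {name: dict(svc) for name, svc in services.items()}
    for i, (sig, members) in enumerate(groups.items()):
        if not sig or len(members) < 2:
            continue  # nothing shared, or only one service: not worth a class
        cls = f"class-{i}"
        classes[cls] = dict(sig)
        for m in members:
            for k, _ in sig:
                compact[m].pop(k, None)
            compact[m]["x-class"] = cls  # reference for reconstitution
    return classes, compact

services = {
    "web":    {"image": "nginx:1.25.3", "restart": "unless-stopped"},
    "api":    {"image": "api:2.1",      "restart": "unless-stopped"},
    "worker": {"image": "worker:2.1",   "restart": "unless-stopped"},
}
classes, compact = emit_classes(services)
print(classes)           # {'class-0': {'restart': 'unless-stopped'}}
print(compact["web"])    # {'image': 'nginx:1.25.3', 'x-class': 'class-0'}
```

An LLM reading the compressed output can expand each `x-class` reference back to the full configuration, so the savings compound with the number of conformant services.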

Pipeline passes

Each pass transforms the document in-place. Passes requiring --schema or --assertions are only included when those options are provided.

  1. strip-secrets always runs
  2. strip-comments always runs
  3. strip-defaults requires --schema
  4. drop-fields requires --schema
  5. keep-fields requires --schema
  6. emit-classes requires --schema
  7. strip-conformant requires --assertions
  8. prune-empty always runs
  9. annotate-deviations requires --assertions
  10. deviation-summary requires --assertions
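The conditional assembly described above can be sketched as a function that builds the ordered pass list from the provided options. The function name and option parameters are illustrative, not decoct's internal API; the pass names and ordering come from the list above.

```python
def build_pipeline(schema=None, assertions=None) -> list:
    """Return the ordered pass names to run for the given options."""
    spec = [
        ("strip-secrets", True),                          # always runs
        ("strip-comments", True),                         # always runs
        ("strip-defaults", schema is not None),
        ("drop-fields", schema is not None),
        ("keep-fields", schema is not None),
        ("emit-classes", schema is not None),
        ("strip-conformant", assertions is not None),
        ("prune-empty", True),                            # always runs
        ("annotate-deviations", assertions is not None),
        ("deviation-summary", assertions is not None),
    ]
    return [name for name, enabled in spec if enabled]

# With no options, only the three always-on passes run.
print(build_pipeline())
# ['strip-secrets', 'strip-comments', 'prune-empty']

# With both options, all ten passes run in order.
print(len(build_pipeline(schema="compose", assertions="standards")))  # 10
```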

Where the project stands

Phase 1 — Complete

Deterministic pipeline with 8 passes, CLI, and 183 tests. Proved the concept with synthetic fixtures.

Phase 2 — Complete

Production-validated pipeline with 10 passes, 403 tests, 25 bundled schemas covering 1,494 platform defaults, deployment standard assertions, INI/JSON/YAML input, LLM-assisted schema and assertion learning, class-based reconstitution, and directory mode with platform auto-detection for 8 platforms.

v0.1.0 released

Planned next: additional platform schemas, HCL/XML input support, MCP tool server integration, and expanded assertion coverage for Kubernetes and Terraform.

The bigger picture

decoct is the compression layer in a larger ecosystem of LLM-powered infrastructure tooling. The deterministic pipeline handles the mechanical work — stripping what can be mechanically determined to be noise. Future LLM integration adds learning (deriving schemas and assertions from documentation) and resolution (handling ambiguous cases the deterministic pipeline can't decide).

The core principle: secrets are always handled deterministically, the pipeline always works without LLM access, and LLM features are an optional enhancement layered on top.