# decoct
Compress infrastructure configuration for LLM context windows. Strips platform defaults, redacts secrets, and annotates deviations from your design standards — saving 15–57% of tokens depending on platform and tier. 25 bundled schemas covering INI, YAML, and JSON input. Auto-detects Docker Compose, Kubernetes, Ansible, cloud-init, Terraform state, GitHub Actions, Traefik, and Prometheus.
```shell
pip install decoct
```

## The problem
Infrastructure operations increasingly involve feeding configuration into LLM context windows — for AI-assisted troubleshooting, agent-driven operations, architecture review, and code generation against live state.
But the data is full of noise. A typical Docker Compose file is packed with platform defaults the model already knows, system-managed metadata nobody asked about, and structural boilerplate. The things that actually matter — where your configuration deviates from your standards — are buried and indistinguishable from everything else.
The result: context windows stuffed with low-value tokens, expensive to run, and missing the intent that would make them useful.
## The approach
To concoct is to build something up — mixing ingredients, adding complexity. Infrastructure configuration is concocted: layers of platform defaults, boilerplate, system-managed fields, and actual intent all mixed into one document.
To decoct is the opposite. It's an old term from chemistry meaning to extract the essence by boiling something down — simmering raw material until only the concentrated, useful compounds remain. It's also, we're aware, a slightly unusual word to say out loud in a professional setting — which is fine. Memorable is good.
That's what this tool does. decoct takes the concoction of your infrastructure configuration and reduces it to its essence: what you intentionally changed, and what's wrong. Everything else boils away.
## What it does
Given a Docker Compose file with platform defaults, secrets, and conformant values:
```yaml
services:
  web:
    image: nginx:1.25.3
    restart: unless-stopped
    network_mode: bridge    # platform default
    privileged: false       # platform default
    read_only: false        # platform default
    stdin_open: false       # platform default
    tty: false              # platform default
    ports:
      - "8080:80"
    environment:
      DB_PASSWORD: s3cret!Pass99
```

decoct reduces it to:

```yaml
services:
  web:
    image: nginx:1.25.3
    ports:
      - "8080:80"
    environment:
      DB_PASSWORD: [REDACTED]
```

## With assertions
Add your design standards as assertions. Conformant services collapse entirely; deviations are annotated with what's expected:
```yaml
services:
  web:
    image: nginx:1.25.3
    restart: unless-stopped
    # ... healthcheck, logging, etc.
  db:
    image: postgres:latest
    restart: always
```

becomes:

```yaml
# decoct: 2 deviations from standards
#   [!] no-latest: services.db.image
#   [!] restart-policy: services.db.restart
services:
  web: {}
  db:
    image: postgres:latest  # [!] must not use :latest
    restart: always         # [!] standard: unless-stopped
```
The entire compliant `web` service collapsed to `{}` (an empty object). The database kept only its two problems, clearly annotated. An LLM reading this immediately understands: `web` is fine, `db` has two issues.
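The deviation check above can be sketched as a small rule evaluation over the parsed document. This is an illustrative sketch only, not decoct's internals: the `check_assertions` helper, the rule dictionaries, and their `ok` predicates are hypothetical names invented for this example.

```python
def check_assertions(services, rules):
    """Return {service: [annotations]} for values that deviate from the rules."""
    deviations = {}
    for name, svc in services.items():
        notes = []
        for rule in rules:
            value = svc.get(rule["field"])
            if not rule["ok"](value):
                # Annotate with the rule id and the full document path
                notes.append(f"[!] {rule['id']}: services.{name}.{rule['field']}")
        if notes:
            deviations[name] = notes
    return deviations

# Two rules mirroring the example: no :latest tags, a fixed restart policy
rules = [
    {"id": "no-latest", "field": "image",
     "ok": lambda v: v is not None and not v.endswith(":latest")},
    {"id": "restart-policy", "field": "restart",
     "ok": lambda v: v == "unless-stopped"},
]

services = {
    "web": {"image": "nginx:1.25.3", "restart": "unless-stopped"},
    "db": {"image": "postgres:latest", "restart": "always"},
}

print(check_assertions(services, rules))
# web passes both rules and produces no annotations; db fails both
```

A conformant service produces no entry at all, which is what lets it collapse out of the output entirely.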
## Compression tiers
Savings were measured against 11 production Docker Compose files. Cloud-init configs achieve up to 57% from default stripping alone, thanks to their high default density.
| Tier | Measured savings | What it does | Requires |
|---|---|---|---|
| Generic cleanup | ~15% | Redact secrets, strip comments | — |
| Platform defaults | ~20% | Also strip values matching platform defaults | `--schema` or auto-detected |
| Standards conformance | ~32% | Also strip conformant values, annotate deviations | `--assertions` |
Template-generated configs (Ansible/Jinja2) reach 35–41%. Assertions are custom design rules: your organisation's standards layered on top of platform defaults. Conformant values are stripped just like defaults, driving real token savings. With class-based reconstitution via the `emit-classes` pass, common conformant patterns across services are deduplicated further, grouped into named classes that an LLM can reference to reconstruct the full configuration from the compressed output.
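Conceptually, the platform-defaults tier is a recursive comparison against a table of known defaults, with emptied mappings pruned afterwards. A minimal sketch, assuming a defaults table that mirrors the document's structure; `strip_defaults` is a hypothetical name for illustration, not decoct's API:

```python
def strip_defaults(doc, defaults):
    """Recursively drop values equal to known platform defaults."""
    if not isinstance(doc, dict):
        return doc
    out = {}
    for key, value in doc.items():
        if key in defaults and not isinstance(defaults[key], dict):
            if value == defaults[key]:
                continue  # matches the platform default: boil it away
            out[key] = value  # deviates from the default: keep it
        elif isinstance(value, dict):
            nested = strip_defaults(value, defaults.get(key, {}))
            if nested:        # prune-empty: drop mappings emptied by stripping
                out[key] = nested
        else:
            out[key] = value
    return out

# Assumed (illustrative) Docker Compose service-level defaults
compose_defaults = {"network_mode": "bridge", "privileged": False,
                    "read_only": False, "stdin_open": False, "tty": False}
service = {"image": "nginx:1.25.3", "privileged": False, "tty": False}
print(strip_defaults(service, compose_defaults))  # {'image': 'nginx:1.25.3'}
```

The bundled schemas play the role of `compose_defaults` here: 1,494 known platform defaults across 25 schemas, compared against the parsed document rather than hardcoded.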
## Pipeline passes
Each pass transforms the document in place. Passes requiring `--schema` or `--assertions` are only included when those options are provided.

| Pass | Runs when |
|---|---|
| `strip-secrets` | always |
| `strip-comments` | always |
| `strip-defaults` | `--schema` |
| `drop-fields` | `--schema` |
| `keep-fields` | `--schema` |
| `emit-classes` | `--schema` |
| `strip-conformant` | `--assertions` |
| `prune-empty` | always |
| `annotate-deviations` | `--assertions` |
| `deviation-summary` | `--assertions` |
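The conditional assembly above can be sketched as follows. The `build_pipeline` helper is hypothetical and returns pass names only; it illustrates the gating and ordering from the table, not decoct's actual implementation:

```python
def build_pipeline(schema=False, assertions=False):
    """Return the ordered pass names for the given options."""
    passes = ["strip-secrets", "strip-comments"]   # always run
    if schema:
        passes += ["strip-defaults", "drop-fields", "keep-fields", "emit-classes"]
    if assertions:
        passes.append("strip-conformant")
    passes.append("prune-empty")                   # always runs
    if assertions:
        passes += ["annotate-deviations", "deviation-summary"]
    return passes

print(build_pipeline())
# ['strip-secrets', 'strip-comments', 'prune-empty']
print(build_pipeline(schema=True, assertions=True))  # all ten passes, in order
```

Note the ordering: conformant values are stripped before `prune-empty` runs, so a fully conformant service empties out and is pruned, while deviation annotations are added afterwards so they survive pruning.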
## Where the project stands
### Phase 1 — Complete
Deterministic pipeline with 8 passes, CLI, and 183 tests. Proved the concept with synthetic fixtures.
### Phase 2 — Complete
Production-validated pipeline with 10 passes, 403 tests, 25 bundled schemas covering 1,494 platform defaults, deployment standard assertions, INI/JSON/YAML input, LLM-assisted schema and assertion learning, class-based reconstitution, and directory mode with platform auto-detection for 8 platforms.
### v0.1.0 released

Planned next: additional platform schemas, HCL/XML input support, MCP tool server integration, and expanded assertion coverage for Kubernetes and Terraform.
## The bigger picture
decoct is the compression layer in a larger ecosystem of LLM-powered infrastructure tooling. The deterministic pipeline handles the mechanical work — stripping what can be mechanically determined to be noise. Future LLM integration adds learning (deriving schemas and assertions from documentation) and resolution (handling ambiguous cases the deterministic pipeline can't decide).
The core principle: secrets are always handled deterministically, the pipeline always works without LLM access, and LLM features are an optional enhancement layered on top.
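Deterministic secret handling can be as simple as key-pattern matching, which needs no LLM access at all. A minimal sketch of that idea; the `redact_secrets` helper and its pattern list are hypothetical illustrations, not decoct's actual detection rules:

```python
import re

# Assumed key patterns that mark a value as sensitive (illustrative only)
SECRET_KEY = re.compile(r"(password|secret|token|api[_-]?key)", re.IGNORECASE)

def redact_secrets(doc):
    """Replace values under secret-looking keys, recursing into containers."""
    if isinstance(doc, dict):
        return {k: "[REDACTED]" if SECRET_KEY.search(k) else redact_secrets(v)
                for k, v in doc.items()}
    if isinstance(doc, list):
        return [redact_secrets(v) for v in doc]
    return doc  # scalars under non-secret keys pass through unchanged

env = {"environment": {"DB_PASSWORD": "s3cret!Pass99", "DB_HOST": "db"}}
print(redact_secrets(env))
# {'environment': {'DB_PASSWORD': '[REDACTED]', 'DB_HOST': 'db'}}
```

Because the redaction depends only on the document and a fixed pattern set, it behaves identically with or without any LLM in the loop, which is the guarantee the core principle describes.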
```shell
pip install decoct
```