immutlex

Immutable knowledge base for AI agents

Your agent can search the web.
It still can't remember.

> Ask the substrate, then prompt.

immutlex gives your agent a durable, cross-linked memory it queries instead of guessing - a knowledge graph with fused semantic and lexical search and 65 MCP tools, built from the ground up on immutable object storage. All from local models in a single binary. No cloud, no API keys.

It's all plain markdown in storage you own. Take your documents anywhere, anytime - you just won't want to.

Request a pilot · See it work

wiki-link graph · fused semantic + lexical search · 65 MCP tools · local models, on device · no cloud, no API keys

The paradigm shift

Stop stuffing context. Start asking the substrate.

The old way

Stuff everything into the prompt and hope.

one enormous prompt

Here is the codebase. Here is the spec.
Here is the conversation history.
Now answer my question.
# token bill: enormous - hallucinations: frequent

The model pattern-matches across too much undifferentiated noise.
Working memory gets crammed into the question.

With immutlex auto-inject

Ask the question. The substrate attaches the answer.

the question, grounded

What should I tell the client about the spec issue?
# substrate attaches the 3-5 ranked chunks
# most relevant to that query, before the model
# ever sees the prompt

The substrate is the working memory. The prompt is the question.
Customer 0: ~10x token cost reduction; confidently wrong answers driven to near zero on domain queries.
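
To make the auto-inject idea concrete, here is a minimal Python sketch of picking ranked chunks under a token budget and placing them ahead of the question. It is an illustration only, not immutlex's implementation: the `Chunk` type, `auto_inject`, `estimate_tokens`, the word-count token estimate, and the greedy selection rule are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    doc_id: str
    text: str
    score: float  # relevance score from search; higher is better


def estimate_tokens(text: str) -> int:
    # Assumption: a crude stand-in for a real tokenizer (one token per word).
    return len(text.split())


def auto_inject(question: str, chunks: list, budget: int, max_chunks: int = 5) -> str:
    """Prepend the best-scoring chunks that fit within the token budget."""
    selected = []
    used = 0
    for chunk in sorted(chunks, key=lambda c: c.score, reverse=True):
        if len(selected) == max_chunks:
            break
        cost = estimate_tokens(chunk.text)
        if used + cost > budget:
            continue  # too large for what is left; a smaller chunk may still fit
        selected.append(chunk)
        used += cost
    context = "\n".join(f"[{c.doc_id}] {c.text}" for c in selected)
    if not context:
        return f"Question: {question}"
    return f"{context}\n\nQuestion: {question}"
```

The model then receives a short, cited context block plus the question, rather than the whole corpus.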

See it work

Watch an agent query its own knowledge graph

The best engineers already do a version of this by hand: an Obsidian vault of plain markdown, a pile of glued-together scripts, an LLM told to read and maintain it. It's the pattern Andrej Karpathy described: stop stuffing context into the prompt and give the model a knowledge base to ask. immutlex is built from the ground up on immutable object storage, with that pattern engineered in. In this demo, an agent ingests a multimodal corpus (text, code, diagrams, a whiteboard photo) and immediately searches and traverses it, with the desktop watching every event live.

Ingest markdown with [[wikilinks]], drop in PDFs, CSVs, and images. immutlex chunks, embeds, indexes, and builds a navigable knowledge graph - then your agent queries it.

agent <-> immutlex

Agent: "What connects transformers to attention?"
immutlex: Search (fused BM25 + semantic, 11.4x compression):
  1. transformer-architecture - "based entirely on self-attention"
  2. attention-is-all-you-need - "replacing recurrence with attention"
  Graph: transformer-architecture <-> self-attention (1 hop)
  PageRank: 0.50 / 0.50 - Community: 2 clusters (Louvain)
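
One common way to fuse a lexical ranking with a semantic ranking is reciprocal rank fusion (RRF). The page does not say which fusion method immutlex itself uses, so treat the sketch below as an assumption-laden illustration: `rrf_fuse` is a hypothetical helper, the two input rankings are made up, and `k = 60` is simply the conventional RRF constant.

```python
def rrf_fuse(rankings: list, k: int = 60) -> list:
    """Fuse several ranked lists of doc ids into one list of (doc_id, score)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank); documents ranked well
            # by both retrievers accumulate the highest totals.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical rankings for the question in the demo above.
bm25 = ["transformer-architecture", "attention-is-all-you-need", "positional-encoding"]
semantic = ["transformer-architecture", "self-attention", "attention-is-all-you-need"]

for doc_id, score in rrf_fuse([bm25, semantic]):
    print(f"{doc_id}: {score:.4f}")
```

RRF only needs ranks, not raw scores, which is why it suits combining BM25 scores and cosine similarities that live on different scales.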

From document to graph

A plain markdown file. A queryable graph underneath.

You write ordinary markdown. immutlex reads the YAML frontmatter and the [[wikilinks]] in the body, turns every link into a graph edge, and infers the rest semantically - so the file stays portable while the graph underneath becomes navigable.

attention-is-all-you-need.md

---
id: attention-is-all-you-need
title: Attention Is All You Need
tags: [transformers, attention, nlp]
links: [transformer-architecture, self-attention]
---

The Transformer replaces recurrence with
pure [[self-attention]], the core idea behind the
[[transformer-architecture]] that followed.

What immutlex derives

Forward links -> self-attention, transformer-architecture

Backlinks <- attention-survey, encoder-decoder

Semantic edges ~ positional-encoding (0.81 cosine)

Forward links come straight from the body. Backlinks are generated automatically - every doc that points here. Semantic neighbors are inferred from the embeddings, so related docs connect even without an explicit link.
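
The forward-link and backlink derivation described above can be sketched in a few lines of Python. This is a simplified illustration, not immutlex's parser: the regex, the helper names, and the bodies of `attention-survey` and `encoder-decoder` are assumptions, and the handling of `|alias` and `#heading` suffixes follows Obsidian-style conventions that the page does not confirm.

```python
import re
from collections import defaultdict

# Matches [[target]], [[target|alias]] and [[target#heading]]; captures the target.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:[|#][^\[\]]*)?\]\]")


def forward_links(body: str) -> list:
    """Return wikilink targets in order of first appearance, without duplicates."""
    seen = {}
    for match in WIKILINK.finditer(body):
        seen.setdefault(match.group(1).strip(), None)
    return list(seen)


def build_backlinks(docs: dict) -> dict:
    """Invert forward links: for each target, list every doc that points at it."""
    backlinks = defaultdict(list)
    for doc_id in sorted(docs):
        for target in forward_links(docs[doc_id]):
            backlinks[target].append(doc_id)
    return dict(backlinks)


docs = {
    "attention-is-all-you-need": (
        "The Transformer replaces recurrence with pure [[self-attention]], "
        "the core idea behind the [[transformer-architecture]] that followed."
    ),
    # Hypothetical bodies for the two docs that link here.
    "attention-survey": "A survey that starts from [[attention-is-all-you-need]].",
    "encoder-decoder": "Superseded in part by [[attention-is-all-you-need]].",
}

print(forward_links(docs["attention-is-all-you-need"]))
print(build_backlinks(docs)["attention-is-all-you-need"])
```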

query the graph

Agent: graph op="neighbors" doc="attention-is-all-you-need"
out: self-attention, transformer-architecture
in: attention-survey, encoder-decoder
Agent: query mode="path" from="encoder-decoder" to="self-attention"
encoder-decoder -> attention-is-all-you-need -> self-attention
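
A path query like the one above is, conceptually, a shortest-path search over directed link edges. The sketch below uses plain breadth-first search on the example graph; `shortest_path` and the edge dictionary are hypothetical, and how immutlex actually answers `mode="path"` is not specified on this page.

```python
from collections import deque
from typing import Optional


def shortest_path(edges: dict, start: str, goal: str) -> Optional[list]:
    """Breadth-first search over directed edges; returns the node list or None."""
    if start == goal:
        return [start]
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt in parent:
                continue  # already reached by a path that is at least as short
            parent[nxt] = node
            if nxt == goal:
                # Walk parent pointers back to the start, then reverse.
                path = [nxt]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            queue.append(nxt)
    return None


# Forward links from the example above.
edges = {
    "attention-is-all-you-need": ["self-attention", "transformer-architecture"],
    "attention-survey": ["attention-is-all-you-need"],
    "encoder-decoder": ["attention-is-all-you-need"],
}

print(" -> ".join(shortest_path(edges, "encoder-decoder", "self-attention")))
```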

See your own knowledge base

Your whole corpus, as a graph you can fly through.

Every document is a node; every [[wikilink]] is an edge. As your knowledge base grows, the graph fills in - this is a real vault with tens of thousands of docs rendered live in the browser. Pilot users get a hosted viewer to explore their own.

[Figure: the immutlex web viewer showing a knowledge-base graph of tens of thousands of documents. Browse, search, and traverse your vault in 3D.]

It's a thin client - your knowledge base never leaves your machine. The hosted viewer at npiesco.github.io/immutlex talks to the daemon running on your own hardware; nothing is uploaded, and the same build serves every user. Pilot users: point it at your daemon and explore your graph in the browser.

The 90-second proof

The diff that converts a skeptic

Before - no substrate

$0.40

per answer - the agent re-reads 12 schema files and still invents column names that don't exist.

After - immutlex auto-inject

$0.04

same prompt. The substrate injects 3 ranked chunks from the real schema. The model answers correctly, with citations.

The delta

10x

cost drop. 0% to 100% accuracy on the failing query, with a citation grounding every claim.

The substrate is the working memory. The prompt is the question. Most developers stuff working memory into the question - that is the inefficiency immutlex eliminates.

What your agent can ask

65 MCP tools across 9 categories

Ingest & Validate

Ingest markdown, PDFs, CSVs, images, and any web page. Validate frontmatter and wikilinks before they land.

Query & Retrieve

Fused BM25 + semantic search, heading and backlink search, document rehydration with provenance.

Graph Analytics

Neighbors, shortest path, PageRank, community detection, orphan detection over the wiki-link graph.

Context & Budget

Auto-inject ranked chunks under a token budget. Extractive, diversity-aware summarization.

Relevance & Temporal

Decay, mark-noise, and revalidate - so stale knowledge fades instead of being deleted.

WAL & Compaction

A write-ahead log with durable stages and compaction - the substrate is immutable and auditable.

Infrastructure

S3 / MinIO / ADLS object storage behind one connection string. Local or cloud, your choice.

Admin

Operator surface: vault overview, ingest jobs, worker nodes, leases, index generations.

System

Health, status, and license info - role, storage reachability, pending jobs, generations.
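
As an illustration of the Graph Analytics category above, here is a small power-iteration PageRank over a link graph. It is a textbook sketch rather than immutlex's implementation: the function name, the damping factor of 0.85, and the fixed iteration count are assumptions. On the two-document graph from the earlier demo, where each document links to the other, it yields the same 0.50 / 0.50 split shown there.

```python
def pagerank(edges: dict, damping: float = 0.85, iterations: int = 50) -> dict:
    """Power-iteration PageRank over a dict of node -> list of link targets."""
    nodes = sorted(set(edges) | {t for targets in edges.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing links is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not edges.get(node))
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n for node in nodes}
        for node in nodes:
            targets = edges.get(node, [])
            for target in targets:
                new_rank[target] += damping * rank[node] / len(targets)
        rank = new_rank
    return rank


two_docs = {
    "transformer-architecture": ["self-attention"],
    "self-attention": ["transformer-architecture"],
}
for doc_id, score in pagerank(two_docs).items():
    print(f"{doc_id}: {score:.2f}")
```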

Why it's different

Everything local. Nothing phoning out.

Every model runs locally, inside the binary. Text, code, images, and OCR are all handled on-device - no download step, no model files to manage, no network at inference time.
No cloud. No API keys. Zero telemetry. Storage is one S3-compatible connection string: local MinIO on a VPS, your S3 bucket, or your ADLS.
Immutable revisions. Every wiki document is a zero-padded, content-addressed revision with a full audit trail. Nothing is silently overwritten.
Your data is the moat. Export the whole bucket or pull plain .md files and walk away anytime. The substrate is portable; what you won't want to leave behind is the grounding that accretes on top.
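
To show what "zero-padded, content-addressed revisions" could look like in practice, the sketch below writes each revision under a new key and never overwrites an old one. Everything here is hypothetical: the key layout, the 8-digit padding, the truncated SHA-256 digest, and the in-memory `RevisionStore` are assumptions for illustration, not immutlex's actual storage format.

```python
import hashlib


def revision_key(doc_id: str, revision: int, content: bytes, width: int = 8) -> str:
    """Build an object key from a zero-padded revision number and a content hash."""
    digest = hashlib.sha256(content).hexdigest()
    # Zero padding makes lexicographic key order match revision order.
    return f"{doc_id}/{revision:0{width}d}-{digest[:16]}.md"


class RevisionStore:
    """In-memory stand-in for an object store that only ever adds keys."""

    def __init__(self):
        self._objects = {}
        self._latest = {}

    def put(self, doc_id: str, content: bytes) -> str:
        revision = self._latest.get(doc_id, 0) + 1
        key = revision_key(doc_id, revision, content)
        assert key not in self._objects  # a new revision never reuses a key
        self._objects[key] = content
        self._latest[doc_id] = revision
        return key

    def history(self, doc_id: str) -> list:
        """All revision keys for a document, oldest first."""
        return sorted(k for k in self._objects if k.startswith(doc_id + "/"))


store = RevisionStore()
first = store.put("self-attention", b"draft one")
second = store.put("self-attention", b"draft two")
print(store.history("self-attention"))
```

Because old keys are never rewritten, the listing itself doubles as an audit trail.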

By the numbers

78K

lines of Rust

1,825

tests in the suite

65

MCP tools

100%

local, no network at inference

Fused lexical + semantic search and a native knowledge graph, in one self-contained binary.

The bigger story

Substrate alone is the floor. The full stack is the ceiling.

Auto-injection is the floor lift every customer gets on day one. The teams that stack grounding layers on top compound them into something qualitatively different.

Substrate auto-injection

Every turn gets a ranked block of relevant chunks before the model sees the prompt.

Repo-grounded reading

The agent reads the canonical source instead of pattern-matching from training data.

Verified-fact memory

Hard-won gotchas auto-load every session - always on, not retrieved.

Each layer is useful alone. Stacked, customer 0 sees near-zero confident hallucinations on a domain the substrate has been grounded in for months.

Extend it

Standalone tools. One thing in common: clean markdown.

These are independent tools - each useful entirely on its own, each one an MCP server your agent can call (codino and fsgdb ship a CLI too). What ties them to immutlex is not a dependency, it's a format: every one of them emits plain .md - and codino and fsgdb emit it already linked, with frontmatter and back-and-forward [[wikilinks]], so it lands in the graph fully formed. Use them with immutlex, use them without it. The markdown goes wherever you want.

defuddle-rs open source

Ingest any website

The doc your agent needs is a web page buried in nav, ads, share widgets, and chrome. Point defuddle-rs at the URL and it keeps the article, drops the rest, and hands back clean markdown - ready to drop into immutlex or anything else that reads .md. One page or a whole capture flow.

Clean-room Rust defuddle - the same parser ships as a crate, MCP server, Python package, browser extension, and WASM. MIT, free.

codino add-on

Give last week's conversation back to your agent

Every hard problem your agent already solved is sitting in a Claude Code, Codex, or Copilot session log - and forgotten the moment the window closed. codino reads those logs where the tools already keep them, finds the threads worth keeping, and turns them into markdown your agent can ask again. The work you already paid for stops evaporating.

Hybrid retrieval over any SQLite or JSONL: SQLite FTS5 lexical + DuckDB HNSW semantic, fused with RRF. Exports interlinked wiki markdown - YAML frontmatter, bidirectional [[wikilinks]], no dangling links. Source stays read-only.

fsgdb add-on

Teach your agent how the codebase actually works

Your agent can write code but doesn't know what calls what, what breaks if it changes a function, or which tests cover it. fsgdb maps the real structure of your repo - call graphs, blast radius, dead code, coverage, git history - and dumps it as plain markdown, ready to drop into immutlex or read anywhere your agent already looks. Scan once, re-scan whenever the code moves.

Local tree-sitter parsing into a full Cypher graph database over the Bolt protocol. 34 MCP tools; exports the graph as immutlex-compatible linked markdown - frontmatter plus a workspace index that cross-links every module.

Local-first and zero-telemetry, same as immutlex - but they stand alone. defuddle-rs is open source; codino and fsgdb are paid tools that pair naturally with any plan, no coupling required. Ask us which ones fit your stack.

Pricing

Pay for the intelligence layer, not per seat

No per-seat pricing, ever. You pay for what immutlex processes and stores for you - documents ingested and storage used. Two numbers you already know, scaling linearly.

Pro - for teams under ~250 people

One VPS you manage. No cloud account required.

The full system on a single box you control, with bundled storage. Everything runs local; nothing phones out. The fastest way to put a durable memory in front of your agents.

25,000 docs / mo · 10 GB included · Bundled storage · Priority email support

$249 / mo

Overage: $5 / GB / mo and $10 / 1K docs.

Request a pilot

Introductory rate while we onboard early teams - the Pro price may rise as immutlex matures.

Pro Plus - more headroom, cheaper overage

Same single-box simplicity, room to grow.

For teams pushing past Pro's limits. More documents and storage included, and a lower overage rate when you spill over - so growing costs less, not more.

35,000 docs / mo · 25 GB included · Cheaper overage · Priority email support

$499 / mo

Overage: $4 / GB / mo and $8 / 1K docs.

Request a pilot

Introductory rate while we onboard early teams - the Pro Plus price may rise as immutlex matures.
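
As a worked example of the two-number pricing, the sketch below estimates a monthly bill from documents ingested and storage used, using the Pro and Pro Plus figures listed above. It assumes overage applies only to usage above the included amounts and is prorated linearly (fractions of a GB or of 1K docs are billed proportionally); the page does not spell out either detail, and `Plan` and `monthly_cost` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    base: float          # USD per month
    included_docs: int   # documents ingested per month
    included_gb: float   # storage included
    per_gb: float        # overage, USD per GB per month
    per_1k_docs: float   # overage, USD per 1,000 docs


PRO = Plan("Pro", 249, 25_000, 10, 5, 10)
PRO_PLUS = Plan("Pro Plus", 499, 35_000, 25, 4, 8)


def monthly_cost(plan: Plan, docs: int, gb: float) -> float:
    """Base price plus linear overage on usage above the included amounts."""
    extra_docs = max(0, docs - plan.included_docs)
    extra_gb = max(0.0, gb - plan.included_gb)
    return plan.base + extra_gb * plan.per_gb + extra_docs / 1000 * plan.per_1k_docs


# 30,000 docs and 12 GB in a month:
print(monthly_cost(PRO, 30_000, 12))       # 249 + 2 * 5 + 5 * 10 = 309.0
print(monthly_cost(PRO_PLUS, 30_000, 12))  # within the included amounts: 499.0
```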

Enterprise - usage-based, deployed your way

Base by capacity, plus usage - no per-seat pricing. Deploy Managed (your corpus on immutlex-operated storage, no cloud account to stand up) or Hybrid (compute we operate, documents stay in your own S3 or ADLS and never leave your cloud - the compliance path for banking, healthcare, and manufacturing).

Enterprise · $1,500 / mo · 50 GB · 50K docs / mo included

Enterprise · $3,000 / mo · 100 GB · 100K docs / mo included

Enterprise · $6,000 / mo · 250 GB · 250K docs / mo included

Enterprise · $12,000 / mo · 500 GB · 500K docs / mo included

Enterprise overage $4 / GB/mo and $8 / 1K docs. Need more than 500 GB or 500K docs a month? Contact us for custom pricing.

Self-hosted - fully owned

Own the entire stack. Nothing leaves your walls.

Run immutlex end to end inside your own environment: your infrastructure, your network, your data. No shared infra, no hybrid dependency, full data sovereignty - built and priced for your org, your compliance posture, and your scale.

Air-gap friendly · Full data sovereignty · Nothing phoning out

Tailored to your org.

Talk to us

Built on open formats - your data stays yours. Everything is plain markdown with YAML frontmatter in S3-compatible storage you control, on open-source foundations. Export every document to a directory of .md files and take it anywhere - no proprietary format, no vendor lock. What you keep by staying is everything the system builds on top of those files: the wiki-link graph, the revision lineage, and the months of grounding that accrete around your corpus.

Give your agent a memory.

A durable, cross-linked knowledge base your agent queries instead of guessing - wiki-link graph, fused search, 65 MCP tools, all from local models in one binary.

Point immutlex at your real documents - markdown, PDFs, CSVs, images.
Wire auto-inject into your agent and watch the token bill and hallucinations drop.
Run it where your agents already do: a VPS, your cloud, or a laptop.
Vaults are unlimited; isolation is per-vault, with a license-gated MCP surface.

The biggest unlock is not a bigger model - it's switching from stuffing context to asking the substrate.
Everything runs local: no cloud dependency, no API keys, zero telemetry.
78,000 lines of Rust, 1,825 tests, and 65 MCP tools behind it.

Request a pilot: hello@immutlex.dev
