
# Dhee


**Dhee — the Developer Brain for AI coding agents**


Local memory + context router for Claude Code, Codex, Cursor, Gemini CLI, Aider, Cline, and any MCP client.


*Give your agent a brain that remembers what it learned, shares context across your team via git, and cuts LLM tokens by 90% — without a hosted service.*


PyPI · Python 3.9+ · MIT License · #1 on LongMemEval recall


**#1 on LongMemEval retrieval** — R@1 94.8% · R@5 99.4% · R@10 99.8% on the full 500-question set. [Reproduce it →](benchmarks/longmemeval/)


*Dhee demo — fat skills, thin tokens, self-evolving retrieval*


[What is Dhee](#what-is-dhee) · [Quick Start](#quick-start) · [Repo-Shared Context](#repo-shared-context--git-is-the-sync-layer) · [Benchmarks](#benchmarks) · [How It Works](#how-it-works) · [vs Alternatives](#vs-alternatives) · [Integrations](#integrations)

---

## What is Dhee?

**Dhee is the developer brain that lives next to your AI coding agent.** It runs locally, uses SQLite, plugs into any MCP client, and does three jobs the model can't do for itself:

1. **🧠 Remembers.** Doc chunks, decisions, what worked, what failed, user preferences. Ebbinghaus decay pushes stale knowledge out of the hot path; frequently used memory gets promoted. Five years in, your per-turn injection is still ~300 tokens of the *right* stuff.

2. **🔁 Routes.** A 10 MB `git log` becomes a 40-token digest with a pointer. Raw output only re-enters context when the model explicitly expands it. Over a session that's a 90%+ token cut with zero information loss.

3. **🌱 Self-evolves.** Dhee watches which digests the model expands, which rules it ignores, and which retrievals it actually uses — and tunes its own depth per tool, per intent, and per file type. No config to hand-maintain. The longer your team uses it, the better it fits your workflow.

### Who it's for

- **Every Claude Code / Cursor / Codex / Gemini CLI / Aider / Cline user** who has ever hit a context limit or a $200 token bill.
- **Any team** with a 2,000-line `CLAUDE.md`, a Skills library, an `AGENTS.md`, or a prompt library that's "too big for context." Stop pruning. Dhee handles delivery.
- **Anyone who wants their team to share context through git** — the same way they share code.

---

## Quick Start

**One command. No venv. No config. No pasting into `settings.json`.**

```bash
curl -fsSL https://raw.githubusercontent.com/Sankhya-AI/Dhee/main/install.sh | sh
```

The installer creates `~/.dhee/`, installs the `dhee` package, and auto-wires Claude Code and Codex hooks. Open your agent in any project — cognition is on.
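The "Ebbinghaus decay" described above can be pictured as exponential forgetting that slows down for frequently used memories. A toy sketch — purely illustrative, not Dhee's actual implementation or API:

```python
import math

def retention(days_since_use: float, strength: float) -> float:
    """Toy Ebbinghaus curve: retention decays exponentially with time
    since last use; higher strength (more past uses) slows the decay."""
    return math.exp(-days_since_use / strength)

# Both memories were last touched 10 days ago, but the frequently used
# one (strength 30) decays far more slowly than the one-off (strength 3).
hot = retention(10, strength=30)
stale = retention(10, strength=3)
assert hot > stale  # the hot memory stays in the per-turn injection
```

Under a model like this, each use bumps a memory's strength, so the knowledge you keep relying on stays retrievable while one-off trivia fades out of the hot path.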
<details>
<summary>Other install paths</summary>

```bash
# Via pip
pip install dhee
dhee install   # configure supported agent harnesses

# From source
git clone https://github.com/Sankhya-AI/Dhee.git
cd Dhee && ./scripts/bootstrap_dev_env.sh
source .venv-dhee/bin/activate
dhee install
```

</details>
After install, Dhee auto-ingests project docs (`CLAUDE.md`, `AGENTS.md`, `SKILL.md`, etc.) on the first session. Run `dhee ingest` any time to re-chunk.

```bash
dhee install             # configure local agent harnesses
dhee link /path/to/repo  # share context with teammates through this repo
dhee context refresh     # refresh repo context after pull/checkout
dhee handoff             # compact continuity for current repo/session
dhee key set openai      # store a provider key locally (encrypted)
dhee router report       # token-savings stats + replay projection
dhee router tune         # re-tune retrieval policy from usage
```

---

## Repo-Shared Context — git is the sync layer

Most "team memory" tools need a server. Dhee uses the one your team already trusts: **git**.

```bash
dhee link /path/to/repo
```

Dhee creates a tracked folder inside your repo:

```text
/.dhee/
  config.json
  context/manifest.json
  context/entries.jsonl
```

Commit it. Teammates who pull the repo and have Dhee installed get the **same shared context** — decisions, conventions, what-not-to-do — surfaced into their agent automatically.
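For a feel of what gets shared, a record in `entries.jsonl` might look like the following — the field names here are hypothetical, for illustration only, not Dhee's actual schema:

```json
{"id": "c7f3", "author": "alice", "kind": "decision", "text": "Use FastAPI, not Flask, for all new services", "created_at": "2025-01-12T09:30:00Z"}
```

One JSON object per line keeps the file append-only and merge-friendly: concurrent edits land as distinct lines rather than conflicting hunks.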
Shared context is **append-only and git-friendly**. If two developers edit overlapping context concurrently, Dhee keeps both versions and reports a conflict instead of silently dropping one developer's work. The installed `pre-push` hook blocks unresolved conflicts from leaving the laptop:

```bash
dhee context check --repo /path/to/repo
```

**No hosted service. No org account. Your repo is the team brain.**

---

## Benchmarks

> **#1 on LongMemEval recall.** R@1 **94.8%**, R@5 **99.4%**, R@10 **99.8%** — full 500 questions, no held-out split, no cherry-picking.

| System | R@1 | R@3 | R@5 | R@10 |
|:-------|:----|:----|:----|:-----|
| **Dhee** | **94.8%** | **99.0%** | **99.4%** | **99.8%** |
| [MemPalace](https://github.com/MemPalace/mempalace#benchmarks) (raw) | — | — | 96.6% | — |
| [MemPalace](https://github.com/MemPalace/mempalace#benchmarks) (hybrid v4, held-out 450q) | — | — | 98.4% | — |
| [agentmemory](https://github.com/rohitg00/agentmemory#benchmarks) | — | — | 95.2% | 98.6% |

Stack: NVIDIA `llama-nemotron-embed-vl-1b-v2` embedder + `llama-3.2-nv-rerankqa-1b-v2` reranker, top-k 10.

**Proof is in-tree, not screenshots.** Exact command, metrics, and per-question output live under [`benchmarks/longmemeval/`](benchmarks/longmemeval/). Recompute R@k yourself — any mismatch is a bug you can open.
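Recomputing R@k takes only a few lines — a minimal sketch, assuming per-question records that pair a gold memory id with a ranked list of retrieved ids (illustrative field names; the in-tree JSONL schema may differ):

```python
def recall_at_k(results, k: int) -> float:
    """Fraction of questions whose gold id appears in the top-k retrieved ids."""
    hits = sum(1 for r in results if r["gold"] in r["retrieved"][:k])
    return hits / len(results)

# Two toy questions: the first is a top-1 hit, the second only a top-3 hit.
results = [
    {"gold": "m7", "retrieved": ["m7", "m2", "m9"]},
    {"gold": "m4", "retrieved": ["m1", "m8", "m4"]},
]
print(recall_at_k(results, 1))  # 0.5
print(recall_at_k(results, 3))  # 1.0
```

Run the same computation over the per-question output in `benchmarks/longmemeval/` and compare against the published numbers.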
---

## How It Works

```
        ┌──────────────────────────────┐
        │       Your fat context       │
        │   CLAUDE.md · AGENTS.md ·    │
        │  SKILL.md · prompts · docs · │
        │   sessions · tool output     │
        └──────────────┬───────────────┘
                       │ ingest once
                       ▼
┌────────────────────────────────────────────────────┐
│              Dhee · local SQLite brain             │
│                                                    │
│  doc chunks · short-term · long-term · insights ·  │
│ beliefs · policies · intentions · episodes · edits │
└─────────────────────────┬──────────────────────────┘
                          │
           ┌──────────────┴───────────────┐
           ▼                              ▼
     Session start                Each user prompt
    (full assembly)            (matching slice only)
           │                              │
           └──────────────┬───────────────┘
                          ▼
             ┌────────────────────────────┐
             │     Token-budgeted XML     │
             │                            │
             │     What worked last…      │
             │                            │
             └────────────────────────────┘
                          │
               Model sees only what it
               needs, when it needs it.
```

On the tool-use side, the **router** digests raw output **at source** — never letting raw `Read`, `Bash`, or subagent results into context unless the model asks.

### The four-operation API

Every interface — hooks, MCP, Python, CLI — exposes the same four operations.

```python
from dhee import Dhee

d = Dhee()
d.remember("User prefers FastAPI over Flask")
d.recall("what framework does this project use?")
d.context("fixing the auth bug")
d.checkpoint("Fixed auth bug", what_worked="git blame first", outcome_score=1.0)
```

| Operation | LLM calls | Cost |
|:----------|:---------:|:----:|
| `remember` / `recall` / `context` | 0 | ~$0.0002 |
| `checkpoint` | 1 per ~10 memories | ~$0.001 |
| **Typical 20-turn Opus session** | **~1** | **~$0.004** |

Dhee overhead: ~$0.004/session. Token savings on the same 20-turn session: **~$0.50+**. **>100× ROI.**

### The router — digest at source

Four MCP tools replace `Read` / `Bash` / `Agent` on heavy calls:

- `dhee_read(file_path, offset?, limit?)` — symbols, head, tail, kind, token estimate + pointer.
- `dhee_bash(command)` — output digested by class (git log, pytest, grep, listing, generic).
- `dhee_agent(text)` — file refs, headings, bullets, error signals from any subagent return.
- `dhee_expand_result(ptr)` — only called when the digest genuinely isn't enough.

A 10 MB `git log --oneline -50000` becomes a ~200-token digest. This is where the serious savings live.

### Self-evolution — the part nobody else does

Most memory layers are static: you write rules, they retrieve. Dhee watches what happens and tunes itself.

- **Intent classification.** Every `Read`/`Bash`/`Agent` call is bucketed (source, test, config, doc, data, build). Each bucket gets its own retrieval depth.
- **Expansion ledger.** Every `dhee_expand_result(ptr)` is logged with `(tool, intent, depth)`.
- **Policy tuning.** `dhee router tune` reads the ledger and atomically rewrites `~/.dhee/router_policy.json` — deeper for what gets expanded, shallower for what doesn't.

Frontend-heavy teams get deeper JS/TS digests. Data teams get richer CSV/JSONL summaries.
**You don't pick — Dhee picks, based on what you actually expand.**

---

## vs Alternatives

| | **Dhee** | CLAUDE.md | Mem0 | Letta | MemPalace | agentmemory |
|:--|:-:|:-:|:-:|:-:|:-:|:-:|
| **Tokens / turn** | **~300** | 2,000+ | varies | ~1K+ | varies | ~1,900 |
| **LongMemEval R@5** | **99.4%** | — | — | — | 96.6% | 95.2% |
| **Self-evolving retrieval** | **Yes** | No | No | No | No | No |
| **Auto-digest tool output** | **Yes** | No | No | No | No | No |
| **Git-shared team context** | **Yes** | Manual | No | No | No | No |
| **Works across MCP agents** | **Yes** | No | Partial | No | Yes | Yes |
| **External DB required** | No (SQLite) | No | Qdrant/pgvector | Postgres+vector | No | No |
| **License** | MIT | — | Apache-2 | Apache-2 | MIT | MIT |

Dhee is the only one that **reduces tokens, leads on recall, self-evolves its retrieval policy, and shares team context through git.**

---

## Integrations

### Claude Code — native hooks

```bash
pip install dhee && dhee install
```

Six lifecycle hooks fire at the right moments. No SKILL.md, no plugin directory. The agent doesn't even know Dhee is there — it just gets better context.

### MCP server — Cursor, Codex, Gemini CLI, Cline, Goose, anything MCP

```json
{
  "mcpServers": {
    "dhee": { "command": "dhee-mcp" }
  }
}
```

### Python SDK / CLI / Docker

```bash
dhee remember "User prefers Python"
dhee recall "programming language"
dhee ingest CLAUDE.md AGENTS.md
dhee checkpoint "Fixed auth" --what-worked "checked logs"
```

### Provider options

```bash
pip install dhee[openai,mcp]   # cheapest embeddings
pip install dhee[nvidia,mcp]   # current SOTA stack
pip install dhee[gemini,mcp]
pip install dhee[ollama,mcp]   # local, no API costs
```

---

## Public vs Enterprise
| | **Public Dhee** (this repo, MIT) | **Dhee Enterprise** (private) |
|:--|:--|:--|
| Local memory + router | ✅ | ✅ |
| Self-evolving retrieval | ✅ | ✅ |
| Git-shared repo context | ✅ | ✅ |
| Claude Code / Codex / MCP | ✅ | ✅ |
| Org / team management | — | ✅ |
| Repo Brain code-intelligence | — | ✅ |
| Owner dashboard, billing, licensing | — | ✅ |
| Sentry-derived security telemetry | — | ✅ |

Public Dhee is the developer brain — lightweight, trustworthy, and complete on its own. The commercial layer is closed-source and lives in `Sankhya-AI/dhee-enterprise`.

---

## FAQ

**What problem does Dhee solve?**
Large agent projects accumulate a fat `CLAUDE.md`, `AGENTS.md`, skills library, and tool output that get re-injected every turn. Dhee chunks, indexes, and decays that knowledge, and digests fat tool output at the source — so only the relevant ~300 tokens reach the model.

**How is Dhee different from Mem0, Letta, MemPalace, agentmemory?**
Dhee is the only memory layer that (a) leads [LongMemEval](https://github.com/xiaowu0162/LongMemEval) at R@5 99.4% on the full 500-question set, (b) self-evolves its retrieval policy per tool and per intent, (c) ships a **router** that digests `Read`/`Bash`/subagent output at source, and (d) shares team context through git instead of a server.

**Does Dhee work with Claude Code, Cursor, Codex, Gemini CLI, Aider?**
Yes. Native Claude Code hooks, an MCP server for every other host, plus a Python SDK and CLI. One install, every agent.

**How does the team-context sharing actually work?**
`dhee link /path/to/repo` writes a `.dhee/` directory inside your repo. Commit it.
Teammates pull, install Dhee, and their agent surfaces the same shared decisions and conventions. Append-only with conflict detection — no overwrites, no server, no account.

**Is Dhee production-ready? What storage?**
SQLite by default. No Postgres, no Qdrant, no pgvector, no infra. 1000+ tests, reproducible benchmarks in-tree, MIT, works offline with Ollama or online with OpenAI / NVIDIA NIM / Gemini.

**Where are the benchmarks and can I reproduce them?**
[`benchmarks/longmemeval/`](benchmarks/longmemeval/) — full command, per-question JSONL, `metrics.json`. Clone, run, recompute R@k. Any mismatch is an issue you can open.

---

## Contributing

```bash
git clone https://github.com/Sankhya-AI/Dhee.git
cd Dhee && ./scripts/bootstrap_dev_env.sh
source .venv-dhee/bin/activate
pytest
```

---

**Your fat skills stay fat. Your token bill stays thin. Your agent gets smarter every session.**

[GitHub](https://github.com/Sankhya-AI/Dhee) · [PyPI](https://pypi.org/project/dhee/) · [Issues](https://github.com/Sankhya-AI/Dhee/issues) · Sankhya AI


MIT License — built by Sankhya AI Labs.


Topics: ai-agents · agent-memory · llm-memory · developer-brain · claude-code · claude-code-hooks · claudemd · agentsmd · mcp · mcp-server · model-context-protocol · context-router · context-engineering · context-compression · token-optimization · llm-tools · vector-memory · sqlite · longmemeval · retrieval-augmented-generation · rag · mem0-alternative · letta-alternative · mempalace-alternative · cursor · codex · gemini-cli · aider · cline · goose