From 3372b16a691a451d69c2f0be46fafc4f49c6c341 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E5=B8=AD=E9=98=B3=E9=98=B3?=
Date: Tue, 12 May 2026 15:00:30 +0800
Subject: [PATCH 1/4] feat: add contributing

---
 contributing.md | 452 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 452 insertions(+)
 create mode 100644 contributing.md

diff --git a/contributing.md b/contributing.md
new file mode 100644
index 00000000..65833a56
--- /dev/null
+++ b/contributing.md
@@ -0,0 +1,452 @@
# Contributing to MemOS

Thanks for your interest in contributing to MemOS! 🎉

MemOS is a Memory Operating System for LLMs and AI agents, maintained by [记忆张量MemTensor](https://www.memtensor.com.cn/) and a growing community of contributors. Whether you want to fix a bug, add a feature, improve docs, or just ask a question — you're welcome here.

---

## Table of Contents

* [Ways to Contribute](#ways-to-contribute)
* [Before You Start](#before-you-start)
* [Setting Up Your Development Environment](#setting-up-your-development-environment)
* [Development Workflow](#development-workflow)
* [Commit Message Guidelines](#commit-message-guidelines)
* [What Makes a Good PR](#what-makes-a-good-pr)
* [Review Process](#review-process)
* [Writing Tests](#writing-tests)
* [Writing Documentation](#writing-documentation)
* [Community](#community)
* [Code of Conduct](#code-of-conduct)
* [License](#license)
* [Recognition](#recognition)

---

## Ways to Contribute

You don't have to write code to be a contributor. Things that genuinely help the project:

* **🐛 Report bugs** — open a [GitHub Issue](https://github.com/MemTensor/MemOS/issues) with a minimal reproduction
* **💡 Propose features or design ideas** — start a thread in [GitHub Discussions](https://github.com/MemTensor/MemOS/discussions)
* **🔧 Submit code** — bug fixes, new memory backends, plugins, performance improvements
* **📚 Improve documentation** — typos, missing examples, unclear explanations. Docs live in a separate repo: [MemTensor/MemOS-Docs](https://github.com/MemTensor/MemOS-Docs)
* **🧪 Add tests** — coverage for edge cases or under-tested modules
* **🌍 Translate** — help us reach more developers in more languages
* **❓ Answer questions** — help newcomers in Discussions, Discord, and the WeChat group
* **📣 Share what you built** — write a blog post, demo, or tutorial using MemOS, and tell us about it

All of these count as contributions. We're happy to recognize non-code contributors as well — open an issue or message us if you'd like to be added.
---

## Before You Start

### First time here?

* Read the [project overview](https://github.com/MemTensor/MemOS#readme) to get a sense of what MemOS does
* Try the [Quickstart](https://memos-docs.openmem.net/open_source/getting_started/installation) to set up a local instance
* Skim [Core Concepts](https://memos-docs.openmem.net/open_source/home/core_concepts) — especially the distinction between Plaintext, Activation, and Parametric memory

### Found something to work on?

* **For bugs or small fixes** — feel free to open a PR directly
* **For larger changes** (new modules, API changes, architectural changes) — please open an Issue or Discussion first to align with maintainers before writing code. This avoids the situation where a substantial PR is rejected because it doesn't fit the project direction

### Not sure where to start?

We use two labels to help newcomers find a good entry point:

* 🌱 [`good first issue`](https://github.com/MemTensor/MemOS/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22) — small, well-scoped tasks that don't require deep familiarity with the codebase. **Start here for your first contribution.**
* 🙋 [`help wanted`](https://github.com/MemTensor/MemOS/issues?q=is%3Aopen+is%3Aissue+label%3A%22help+wanted%22) — issues where maintainers actively welcome external contributions. May be larger or more involved than `good first issue`, but the direction is already agreed on.

**How to claim an issue:**

1. Comment on the issue saying you'd like to take it (this avoids two people working on the same thing)
2. Wait for a maintainer to assign it to you, or just go ahead if it's been sitting unclaimed
3. If you go quiet for more than a week without progress, we may release the issue back to the pool — feel free to pick it up again later

If nothing on the list catches your eye, ask in [Discussions](https://github.com/MemTensor/MemOS/discussions) — we're happy to suggest something based on your interests.

---

## Setting Up Your Development Environment

### Prerequisites

Make sure these are installed locally:

* **Git**
* **Python 3.9+** — verify with `python3 --version`
* **Make**
* **Poetry** — for dependency management

Install Poetry using the official installer:

```bash
curl -sSL https://install.python-poetry.org | python3 -
poetry --version
```

If you see `poetry: command not found`, add the Poetry executable directory to your `PATH` as prompted by the installer, then restart your terminal.

### Fork and clone the repository

```bash
# Fork the repo on GitHub first, then:
git clone https://github.com/YOUR-USERNAME/MemOS.git
cd MemOS
git remote add upstream https://github.com/MemTensor/MemOS.git
```

### Install dependencies

From the repository root:

```bash
make install
```

This installs all project dependencies and sets up pre-commit hooks. If you later switch branches or upstream dependencies change, run `make install` again to keep your environment in sync.
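As a quick sanity check, you can confirm the package imports from the Poetry environment. This is a minimal sketch, assuming the repository's `src/` layout exposes an importable `memos` package (run it with `poetry run python`):

```python
# sanity_check.py — run with: poetry run python sanity_check.py
# Assumes the repo installs an importable `memos` package; adjust the
# import name if it differs in your checkout.
import memos

print("MemOS imported from:", memos.__file__)
print("Version:", getattr(memos, "__version__", "unknown"))
```

If the import fails, re-run `make install` and make sure you are inside the Poetry environment.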
### Choose your memory backend

MemOS supports multiple memory types, each with different database dependencies. You only need to set up the ones you'll actually use.

**Textual Memory** (you must pick one):

| Backend | Identifier | Database needed |
| --- | --- | --- |
| **Tree** (recommended) | `tree_text` | Graph database — Neo4j Desktop, Neo4j Community, or PolarDB |
| **General** | `general_text` | Vector database — Qdrant or compatible |
| **Naive** | `naive_text` | None (testing only) |

**Preference Memory** (optional):

| Backend | Identifier | Database needed |
| --- | --- | --- |
| **Pref** | `pref` | Milvus |

For most contributors, the simplest setup is:

* **Memory type:** `tree` (`tree_text`)
* **Graph database:** Neo4j Community (via Docker)
* **Vector database:** Qdrant in local embedded mode (no separate service needed)

> Neo4j Community has no native vector retrieval, so it's paired with Qdrant for vector search. Qdrant in local embedded mode reads/writes local files directly, so you don't need to run a separate Qdrant server.

### Configure `.env`

Create a `.env` file in the repo root:

```bash
cd MemOS
touch .env
```

For the contents, refer to the [`.env` configuration guide](https://memos-docs.openmem.net/open_source/getting_started/installation#2.-.env-content). You'll need API keys for your chosen LLM provider — these can be obtained from [BaiLian](https://bailian.console.aliyun.com/) (for `OPENAI_API_KEY`, `MOS_EMBEDDER_API_KEY`, `MEMRADER_API_KEY`, etc.) or any compatible provider.

### Start dependent services

If you're using the Neo4j + Qdrant setup:

```bash
cd docker
docker compose up neo4j
```

### Run the dev server

In a new terminal:

```bash
cd MemOS
make serve
```

The API server will start on `http://localhost:8000`.

For more deployment options (full Docker setup, slim/full image variants, ARM/x86 builds), see the [full Setting Up guide](https://memos-docs.openmem.net/open_source/contribution/setting_up).

---

## Development Workflow

### 1. Sync with upstream

If you've forked previously, pull in the latest upstream changes before starting:

```bash
git checkout dev
git fetch upstream
git pull upstream dev
git push origin dev
```

### 2. Create a feature branch

Branch off `dev` (not `main`):

```bash
git checkout -b feat/your-feature-name
```

Use a descriptive branch name:

* `feat/add-redis-backend`
* `fix/memory-leak-in-scheduler`
* `docs/clarify-memcube-api`

### 3. Make your changes

Implement your feature, fix, or improvement in the appropriate files. For example, you might add a function in `src/memos/your_module.py` and corresponding tests in `tests/test_your_module.py`.

### 4. Run tests

```bash
make test
```

All tests should pass before you open a PR. If you've added new functionality, please add tests for it (see [Writing Tests](#writing-tests) below).

### 5. Rebase onto the latest `dev`

Before committing or opening a PR, rebase to make sure your branch is on top of the latest upstream:

```bash
git fetch upstream
git rebase upstream/dev
```

### 6. Commit your changes

Follow the [Commit Message Guidelines](#commit-message-guidelines) below.

### 7. Push to your fork

```bash
git push origin feat/your-feature-name
```
### 8. Open a Pull Request

> ⚠️ **Open PRs against `dev`, not `main`.** PRs against `main` will be asked to retarget.

* Go to [the upstream repository](https://github.com/MemTensor/MemOS) on GitHub
* Click **Pull Requests** → **New Pull Request**
* Select `dev` as the base branch and your feature branch as compare
* Fill in the PR description carefully — what you changed, why, and any tradeoffs or open questions
* Link to any related issue with `Closes #123` or `Refs #123`

If your PR is a work-in-progress and you'd like early feedback, mark it as **Draft** when opening — maintainers will know not to do a full review yet.

---

## Commit Message Guidelines

We follow the [Conventional Commits](https://www.conventionalcommits.org/) format:

```plaintext
<type>: <description>

[optional body]

[optional footer]
```

### Types

| Type | Use for |
| --- | --- |
| `feat` | A new feature |
| `fix` | A bug fix |
| `docs` | Documentation-only changes |
| `style` | Formatting changes (no logic change) |
| `refactor` | Code restructuring without behavior change |
| `test` | Adding or updating tests |
| `chore` | Maintenance tasks, build tooling, dependencies |
| `ci` | CI/CD or workflow related changes |

### Examples

```plaintext
feat: add Redis Streams backend for MemScheduler

fix: prevent memory leak in MemCube cleanup

docs: clarify MemCube vs MemReader in core concepts

refactor: extract retry logic into shared helper
```

Keep the description in the imperative mood ("add", "fix", "update"), not past tense.

For larger changes, include a body explaining the **why**, not just the **what** — the diff already shows what changed.

---

## What Makes a Good PR

Things that help your PR get merged faster:

* **Scoped** — one logical change per PR. Don't bundle unrelated fixes
* **Tested** — new code should have tests; bug fixes should include a regression test
* **Documented** — public APIs need docstrings; user-facing changes need a note in the PR description
* **Conventional commit messages** — see above
* **Linked to an issue** — for non-trivial changes, reference the issue (`Closes #123`)
* **Passes CI** — the PR can't be merged until checks are green

---

## Review Process

* A maintainer will usually review within a few business days. If a PR sits untouched for over a week, feel free to ping politely
* We may ask for changes — this isn't personal, it's how we keep the codebase consistent. Please don't take rejection or revision requests as discouragement
* Once approved and CI is green, a maintainer will merge using **squash and merge** by default

---

## Writing Tests

We use `pytest`. Tests live under `tests/`, mirroring the structure of `src/`.

```bash
# Run all tests
make test

# Run a specific test file
poetry run pytest tests/test_your_module.py

# Run a specific test
poetry run pytest tests/test_your_module.py::test_specific_behavior
```

Guidelines:

* New features should include tests covering the happy path and key edge cases
* Bug fixes should include a regression test that fails on the old code and passes on the new
* Use descriptive test names — `test_search_returns_empty_when_no_match` is better than `test_search_2`
* Avoid relying on external services in unit tests — mock them or use fixtures

For detailed conventions, see [How to Write Unit Tests](https://memos-docs.openmem.net/open_source/contribution/writing_tests).
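To make these conventions concrete, here is a minimal, self-contained sketch. The `search` function below is a stand-in defined inside the test file for illustration, not a real MemOS API:

```python
# tests/test_search_example.py — illustrative only
import pytest


def search(memories: list[str], query: str) -> list[str]:
    """Stand-in for the code under test."""
    return [m for m in memories if query.lower() in m.lower()]


def test_search_returns_empty_when_no_match():
    assert search(["likes tea", "lives in Shanghai"], "coffee") == []


def test_search_is_case_insensitive():
    # regression-style test: pins down behavior a past bug might violate
    assert search(["User likes TEA"], "tea") == ["User likes TEA"]


@pytest.mark.parametrize("query", ["", "   "])
def test_search_does_not_raise_on_blank_query(query):
    assert isinstance(search(["likes tea"], query), list)
```

Note how each test name states the expected behavior, and the parametrized case covers an edge condition without touching any external service.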
---

## Writing Documentation

The MemOS documentation lives in a separate repository: [MemTensor/MemOS-Docs](https://github.com/MemTensor/MemOS-Docs).

If you want to:

* **Fix a typo or small issue** — click "Edit on GitHub" at the bottom of any doc page
* **Add a new doc page or restructure existing ones** — open a PR against the MemOS-Docs repo
* **Document a new feature you're adding** — please update both the code (in this repo) and the docs (in MemOS-Docs) as part of your change

For style and structure conventions, see [Documentation Writing Guidelines](https://memos-docs.openmem.net/open_source/contribution/writing_docs).

---

## Community

Questions, ideas, showing off what you built — pick whichever channel fits:

| Channel | Best for |
| --- | --- |
| [GitHub Issues](https://github.com/MemTensor/MemOS/issues) | Bug reports, concrete feature requests |
| [GitHub Discussions](https://github.com/MemTensor/MemOS/discussions) | Open-ended questions, design ideas, sharing projects |
| [Discord](https://discord.gg/Txbx3gebZR) | Real-time chat, mostly English-speaking community |
| [WeChat Group](https://statics.memtensor.com.cn/memos/qr-code.png) | Real-time chat in Chinese, first choice for users in China |

For sensitive issues (security vulnerabilities, Code of Conduct concerns), please contact the maintainers privately rather than using public channels. See our [Code of Conduct](CODE_OF_CONDUCT.md) for the reporting email.

---

## Code of Conduct

By participating in this project, you agree to abide by our [Code of Conduct](CODE_OF_CONDUCT.md). We're committed to making MemOS a welcoming, harassment-free community for everyone.

---

## License

MemOS is licensed under the [Apache License 2.0](LICENSE). By contributing, you agree that your contributions will be licensed under the same license.

---

## Recognition

Every merged PR earns you a place in the [Contributors graph](https://github.com/MemTensor/MemOS/graphs/contributors) and on your GitHub profile. We're working on broader recognition for non-code contributions too — stay tuned.

Thanks again for being here.
✨ From fc03e9784ebaf4626e3fcc3d4abc4ced9b1e11a8 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E5=B8=AD=E9=98=B3=E9=98=B3?= Date: Tue, 12 May 2026 16:05:45 +0800 Subject: [PATCH 2/4] docs: add docs --- docs/README.md | 3 - .../best_practice/common_errors_solutions.md | 155 + .../mcp_for_cozespace_and_tools.md | 417 ++ .../best_practice/memory_structure_design.md | 138 + .../best_practice/network_workarounds.md | 115 + .../best_practice/performance_tuning.md | 132 + .../contribution/commit_guidelines.md | 17 + .../contribution/development_workflow.md | 75 + docs/cn/open_source/contribution/overview.md | 14 + .../cn/open_source/contribution/setting_up.md | 205 + .../open_source/contribution/writing_docs.md | 546 +++ .../open_source/contribution/writing_tests.md | 43 + .../open_source/getting_started/examples.md | 680 ++++ .../getting_started/installation.md | 645 +++ .../getting_started/rest_api_server.md | 501 +++ .../getting_started/your_first_memory.md | 322 ++ docs/cn/open_source/home/architecture.md | 139 + docs/cn/open_source/home/core_concepts.md | 105 + docs/cn/open_source/home/memos_intro.md | 114 + docs/cn/open_source/home/overview.md | 47 + docs/cn/open_source/modules/mem_chat.md | 180 + docs/cn/open_source/modules/mem_cube.md | 290 ++ docs/cn/open_source/modules/mem_feedback.md | 155 + docs/cn/open_source/modules/mem_reader.md | 181 + docs/cn/open_source/modules/mem_scheduler.md | 495 +++ .../memories/general_textual_memory.md | 157 + .../modules/memories/kv_cache_memory.md | 519 +++ .../modules/memories/naive_textual_memory.md | 508 +++ .../modules/memories/nebula_graph_db.md | 126 + .../modules/memories/neo4j_graph_db.md | 189 + .../open_source/modules/memories/overview.md | 244 ++ .../modules/memories/parametric_memory.md | 44 + .../modules/memories/polardb_graph_db.md | 461 +++ .../memories/preference_textual_memory.md | 227 ++ .../modules/memories/tree_textual_memory.md | 509 +++ docs/cn/open_source/modules/mos/overview.md | 106 + docs/cn/open_source/modules/mos/users.md | 306 ++ .../modules/mos/users_configurations.md | 719 ++++ .../open_source/open_source_api/chat/chat.md | 86 + .../open_source_api/core/add_memory.md | 71 + .../open_source_api/core/delete_memory.md | 62 + .../open_source_api/core/get_memory.md | 81 + .../open_source_api/core/get_memory_by_id.md | 58 + .../open_source_api/core/search_memory.md | 95 + .../open_source_api/help/error_codes.md | 48 + .../open_source_api/message/feedback.md | 78 + .../open_source_api/message/get_message.md | 75 + .../message/get_suggestion_queries.md | 70 + .../open_source_api/scheduler/ wait.md | 77 + .../open_source_api/scheduler/get_status.md | 97 + .../open_source_api/start/configuration.md | 7 + .../open_source_api/start/overview.md | 52 + .../open_source_api/tools/check_cube.md | 50 + .../open_source_api/tools/get_user_names.md | 53 + docs/cn/openclaw/changes.md | 119 + docs/cn/openclaw/examples/hermes_usage.md | 112 + docs/cn/openclaw/examples/multi_agent.md | 97 + docs/cn/openclaw/examples/recall_filter.md | 107 + docs/cn/openclaw/guide.md | 343 ++ docs/cn/openclaw/hermes_local_plugin.md | 8 + docs/cn/openclaw/local_plugin.md | 127 + docs/cn/openclaw/plugin_compare.md | 81 + .../best_practice/common_errors_solutions.md | 75 + .../mcp_for_cozespace_and_tools.md | 420 ++ .../best_practice/memory_structure_design.md | 136 + .../best_practice/network_workarounds.md | 109 + .../best_practice/performance_tuning.md | 55 + .../contribution/commit_guidelines.md | 17 + .../contribution/development_workflow.md | 74 + 
docs/en/open_source/contribution/overview.md | 14 + .../en/open_source/contribution/setting_up.md | 212 + .../open_source/contribution/writing_docs.md | 537 +++ .../open_source/contribution/writing_tests.md | 43 + .../open_source/getting_started/examples.md | 581 +++ .../getting_started/installation.md | 574 +++ .../getting_started/rest_api_server.md | 500 +++ .../getting_started/your_first_memory.md | 268 ++ docs/en/open_source/home/architecture.md | 98 + docs/en/open_source/home/core_concepts.md | 107 + docs/en/open_source/home/memos_intro.md | 117 + docs/en/open_source/home/overview.md | 46 + docs/en/open_source/modules/mem_chat.md | 180 + docs/en/open_source/modules/mem_cube.md | 285 ++ docs/en/open_source/modules/mem_feedback.md | 152 + docs/en/open_source/modules/mem_reader.md | 181 + docs/en/open_source/modules/mem_scheduler.md | 495 +++ .../memories/general_textual_memory.md | 145 + .../modules/memories/kv_cache_memory.md | 267 ++ .../modules/memories/naive_textual_memory.md | 504 +++ .../modules/memories/nebula_graph_db.md | 125 + .../modules/memories/neo4j_graph_db.md | 188 + .../open_source/modules/memories/overview.md | 279 ++ .../modules/memories/parametric_memory.md | 49 + .../modules/memories/polardb_graph_db.md | 464 +++ .../memories/preference_textual_memory.md | 226 ++ .../modules/memories/tree_textual_memory.md | 515 +++ docs/en/open_source/modules/model_backend.md | 104 + docs/en/open_source/modules/mos/memos_mcp.md | 110 + docs/en/open_source/modules/mos/memos_neo.md | 171 + docs/en/open_source/modules/mos/overview.md | 105 + docs/en/open_source/modules/mos/users.md | 306 ++ .../modules/mos/users_configurations.md | 719 ++++ docs/en/openclaw/changes.md | 118 + docs/en/openclaw/examples/hermes_usage.md | 112 + docs/en/openclaw/examples/multi_agent.md | 98 + docs/en/openclaw/examples/recall_filter.md | 108 + docs/en/openclaw/guide.md | 343 ++ docs/en/openclaw/hermes_local_plugin.md | 8 + docs/en/openclaw/local_plugin.md | 127 + docs/en/openclaw/plugin_compare.md | 82 + docs/openapi.json | 3569 ----------------- docs/product-api-tests.md | 65 - 112 files changed, 22429 insertions(+), 3637 deletions(-) delete mode 100644 docs/README.md create mode 100644 docs/cn/open_source/best_practice/common_errors_solutions.md create mode 100644 docs/cn/open_source/best_practice/mcp_for_cozespace_and_tools.md create mode 100644 docs/cn/open_source/best_practice/memory_structure_design.md create mode 100644 docs/cn/open_source/best_practice/network_workarounds.md create mode 100644 docs/cn/open_source/best_practice/performance_tuning.md create mode 100644 docs/cn/open_source/contribution/commit_guidelines.md create mode 100644 docs/cn/open_source/contribution/development_workflow.md create mode 100644 docs/cn/open_source/contribution/overview.md create mode 100644 docs/cn/open_source/contribution/setting_up.md create mode 100644 docs/cn/open_source/contribution/writing_docs.md create mode 100644 docs/cn/open_source/contribution/writing_tests.md create mode 100644 docs/cn/open_source/getting_started/examples.md create mode 100644 docs/cn/open_source/getting_started/installation.md create mode 100644 docs/cn/open_source/getting_started/rest_api_server.md create mode 100644 docs/cn/open_source/getting_started/your_first_memory.md create mode 100644 docs/cn/open_source/home/architecture.md create mode 100644 docs/cn/open_source/home/core_concepts.md create mode 100644 docs/cn/open_source/home/memos_intro.md create mode 100644 docs/cn/open_source/home/overview.md create mode 100644 
docs/cn/open_source/modules/mem_chat.md create mode 100644 docs/cn/open_source/modules/mem_cube.md create mode 100644 docs/cn/open_source/modules/mem_feedback.md create mode 100644 docs/cn/open_source/modules/mem_reader.md create mode 100644 docs/cn/open_source/modules/mem_scheduler.md create mode 100644 docs/cn/open_source/modules/memories/general_textual_memory.md create mode 100644 docs/cn/open_source/modules/memories/kv_cache_memory.md create mode 100644 docs/cn/open_source/modules/memories/naive_textual_memory.md create mode 100644 docs/cn/open_source/modules/memories/nebula_graph_db.md create mode 100644 docs/cn/open_source/modules/memories/neo4j_graph_db.md create mode 100644 docs/cn/open_source/modules/memories/overview.md create mode 100644 docs/cn/open_source/modules/memories/parametric_memory.md create mode 100644 docs/cn/open_source/modules/memories/polardb_graph_db.md create mode 100644 docs/cn/open_source/modules/memories/preference_textual_memory.md create mode 100644 docs/cn/open_source/modules/memories/tree_textual_memory.md create mode 100644 docs/cn/open_source/modules/mos/overview.md create mode 100644 docs/cn/open_source/modules/mos/users.md create mode 100644 docs/cn/open_source/modules/mos/users_configurations.md create mode 100644 docs/cn/open_source/open_source_api/chat/chat.md create mode 100644 docs/cn/open_source/open_source_api/core/add_memory.md create mode 100644 docs/cn/open_source/open_source_api/core/delete_memory.md create mode 100644 docs/cn/open_source/open_source_api/core/get_memory.md create mode 100644 docs/cn/open_source/open_source_api/core/get_memory_by_id.md create mode 100644 docs/cn/open_source/open_source_api/core/search_memory.md create mode 100644 docs/cn/open_source/open_source_api/help/error_codes.md create mode 100644 docs/cn/open_source/open_source_api/message/feedback.md create mode 100644 docs/cn/open_source/open_source_api/message/get_message.md create mode 100644 docs/cn/open_source/open_source_api/message/get_suggestion_queries.md create mode 100644 docs/cn/open_source/open_source_api/scheduler/ wait.md create mode 100644 docs/cn/open_source/open_source_api/scheduler/get_status.md create mode 100644 docs/cn/open_source/open_source_api/start/configuration.md create mode 100644 docs/cn/open_source/open_source_api/start/overview.md create mode 100644 docs/cn/open_source/open_source_api/tools/check_cube.md create mode 100644 docs/cn/open_source/open_source_api/tools/get_user_names.md create mode 100644 docs/cn/openclaw/changes.md create mode 100644 docs/cn/openclaw/examples/hermes_usage.md create mode 100644 docs/cn/openclaw/examples/multi_agent.md create mode 100644 docs/cn/openclaw/examples/recall_filter.md create mode 100644 docs/cn/openclaw/guide.md create mode 100644 docs/cn/openclaw/hermes_local_plugin.md create mode 100644 docs/cn/openclaw/local_plugin.md create mode 100644 docs/cn/openclaw/plugin_compare.md create mode 100644 docs/en/open_source/best_practice/common_errors_solutions.md create mode 100644 docs/en/open_source/best_practice/mcp_for_cozespace_and_tools.md create mode 100644 docs/en/open_source/best_practice/memory_structure_design.md create mode 100644 docs/en/open_source/best_practice/network_workarounds.md create mode 100644 docs/en/open_source/best_practice/performance_tuning.md create mode 100644 docs/en/open_source/contribution/commit_guidelines.md create mode 100644 docs/en/open_source/contribution/development_workflow.md create mode 100644 docs/en/open_source/contribution/overview.md create mode 100644 
docs/en/open_source/contribution/setting_up.md create mode 100644 docs/en/open_source/contribution/writing_docs.md create mode 100644 docs/en/open_source/contribution/writing_tests.md create mode 100644 docs/en/open_source/getting_started/examples.md create mode 100644 docs/en/open_source/getting_started/installation.md create mode 100644 docs/en/open_source/getting_started/rest_api_server.md create mode 100644 docs/en/open_source/getting_started/your_first_memory.md create mode 100644 docs/en/open_source/home/architecture.md create mode 100644 docs/en/open_source/home/core_concepts.md create mode 100644 docs/en/open_source/home/memos_intro.md create mode 100644 docs/en/open_source/home/overview.md create mode 100644 docs/en/open_source/modules/mem_chat.md create mode 100644 docs/en/open_source/modules/mem_cube.md create mode 100644 docs/en/open_source/modules/mem_feedback.md create mode 100644 docs/en/open_source/modules/mem_reader.md create mode 100644 docs/en/open_source/modules/mem_scheduler.md create mode 100644 docs/en/open_source/modules/memories/general_textual_memory.md create mode 100644 docs/en/open_source/modules/memories/kv_cache_memory.md create mode 100644 docs/en/open_source/modules/memories/naive_textual_memory.md create mode 100644 docs/en/open_source/modules/memories/nebula_graph_db.md create mode 100644 docs/en/open_source/modules/memories/neo4j_graph_db.md create mode 100644 docs/en/open_source/modules/memories/overview.md create mode 100644 docs/en/open_source/modules/memories/parametric_memory.md create mode 100644 docs/en/open_source/modules/memories/polardb_graph_db.md create mode 100644 docs/en/open_source/modules/memories/preference_textual_memory.md create mode 100644 docs/en/open_source/modules/memories/tree_textual_memory.md create mode 100644 docs/en/open_source/modules/model_backend.md create mode 100644 docs/en/open_source/modules/mos/memos_mcp.md create mode 100644 docs/en/open_source/modules/mos/memos_neo.md create mode 100644 docs/en/open_source/modules/mos/overview.md create mode 100644 docs/en/open_source/modules/mos/users.md create mode 100644 docs/en/open_source/modules/mos/users_configurations.md create mode 100644 docs/en/openclaw/changes.md create mode 100644 docs/en/openclaw/examples/hermes_usage.md create mode 100644 docs/en/openclaw/examples/multi_agent.md create mode 100644 docs/en/openclaw/examples/recall_filter.md create mode 100644 docs/en/openclaw/guide.md create mode 100644 docs/en/openclaw/hermes_local_plugin.md create mode 100644 docs/en/openclaw/local_plugin.md create mode 100644 docs/en/openclaw/plugin_compare.md delete mode 100644 docs/openapi.json delete mode 100644 docs/product-api-tests.md diff --git a/docs/README.md b/docs/README.md deleted file mode 100644 index 8be17ffb..00000000 --- a/docs/README.md +++ /dev/null @@ -1,3 +0,0 @@ -All documentation has been moved to a separate repository: https://github.com/MemTensor/MemOS-Docs. Please edit documentation there. - -所有文档已迁移至独立仓库 https://github.com/MemTensor/MemOS-Docs 。请在该仓库中编辑文档。 diff --git a/docs/cn/open_source/best_practice/common_errors_solutions.md b/docs/cn/open_source/best_practice/common_errors_solutions.md new file mode 100644 index 00000000..1a259c20 --- /dev/null +++ b/docs/cn/open_source/best_practice/common_errors_solutions.md @@ -0,0 +1,155 @@ +--- +title: 常见错误与解决方案 +--- + +## 1. 
数据库与向量相关错误 + +### Embedding 维度不匹配 + +**现象**: +更改 Embedding 模型后(例如从 `openai` 切换到 `ollama`),系统报错或检索效果极差。 +日志中可能出现 `Dimension mismatch` 或 Qdrant 相关的 `Wrong input vector size` 错误。 + +**原因**: +Qdrant 在创建 Collection 时会根据配置文件中的 `vector_dimension` 固定向量维度。 +* OpenAI `text-embedding-3-small`: 1536 维 +* Ollama `nomic-embed-text`: 768 维 +* BAAI `bge-m3`: 1024 维 + +MemOS 的 `QdrantVecDB` 在初始化时,如果发现 Collection 已存在,会跳过创建步骤。此时如果使用了新维度的模型,写入向量时就会报错。 + +**解决方案**: +1. **修改 Collection 名称**:在配置文件中更改 `collection_name`,让 MemOS 创建一个新的 Collection。 + ```yaml + vec_db: + config: + collection_name: "memos_v2" # 原名为 memos_v1 + vector_dimension: 768 # 确保此维度与新模型一致 + ``` +2. **删除旧数据**:如果你在开发环境,可以直接删除 Qdrant 的存储卷或 Drop 掉旧的 Collection。 + +### 数据后端启动失败 (Neo4j/Qdrant) + +**现象**: +启动 MemOS 时报错 `ConnectionRefusedError`, `ServiceUnavailable` 或 `AuthError`。 + +**常见原因与检查清单**: + +1. **Docker 容器未启动**: + 确保你已经运行了必要的中间件容器。 + ```bash + docker ps + # 检查是否有 neo4j 和 qdrant 容器在运行 + ``` + +2. **端口未映射**: + 检查 `docker run` 命令是否包含了 `-p` 参数。 + * Qdrant 需要暴露 `6333` (gRPC/HTTP) + * Neo4j 需要暴露 `7474` (HTTP) 和 `7687` (Bolt) + +3. **Neo4j 认证失败**: + MemOS 默认配置通常使用 `neo4j/password` 或 `neo4j/neo4j`。 + 请检查你的环境变量或配置文件: + ```bash + export NEO4J_PASSWORD="your_actual_password" + ``` + *注意:Neo4j 首次启动要求修改默认密码,请确保已在浏览器 (http://localhost:7474) 中完成此步骤。* + +## 2. 模型服务错误 + +### Ollama 连接失败 + +**现象**: +报错 `Connection refused` 连接到 `localhost:11434` 失败,或者提示模型不存在。 + +**解决方案**: +1. **启动服务**:确保在终端运行了 `ollama serve`。 +2. **拉取模型**:MemOS 的 `OllamaEmbedder` 会尝试检查本地模型,如果不存在会尝试 pull,但建议手动执行以确保成功: + ```bash + ollama pull nomic-embed-text + ``` +3. **地址问题**:如果是 Docker 运行 MemOS,`localhost` 指向容器内部。需使用 `host.docker.internal` (Mac/Windows) 或宿主机 IP (Linux) 配置 `api_base`。 + +## 3. 配置错误 + +### 缺失必要字段 + +```python +# ✅ 始终需要包含必填字段 +llm_config = { + "backend": "openai", + "config": { + "api_key": "your-api-key", + "model_name_or_path": "gpt-4" + } +} +``` + +### 后端不匹配 + +```python +# ✅ KVCache 需要使用 HuggingFace 后端 +# 参考 src/memos/memories/activation/kv.py +kv_config = { + "backend": "kv_cache", + "config": { + "extractor_llm": { + "backend": "huggingface", + "config": { + "model_name_or_path": "Qwen/Qwen3-1.7B" + } + } + } +} +``` + +## 4. 运行时资源问题 + +### 记忆加载失败 (Schema Mismatch) + +**现象**: +`mem_cube.load()` 报错,通常是因为 JSON 文件结构与当前代码版本不兼容。 + +**解决方案**: +重新初始化 MemCube 并覆盖旧数据(注意数据丢失风险): + +```python +try: + mem_cube.load("memory_dir") +except Exception: + logger.warning("Loading failed, initializing new memory cube") + mem_cube = GeneralMemCube(config) + # 谨慎操作:这会覆盖旧数据 + mem_cube.dump("memory_dir") +``` + +### GPU 显存不足 + +**解决方案**: +使用 `CUDA_VISIBLE_DEVICES` 指定显卡,或切换更小的模型(如 0.5B/1.5B 版本)。 + +```python +import os +os.environ["CUDA_VISIBLE_DEVICES"] = "0" +``` + +## 5. 用户管理常见问题 + +**现象**: +调用 `get_user` 返回 None 或报错。 + +**解决方案**: +MemOS 需要明确的用户注册流程。 + +```python +# 1. 注册 MemCube 到特定用户 +mos.register_mem_cube(cube_path="path", user_id="user_id", cube_id="cube_id") + +# 2. 
创建或获取用户 +try: + # 尝试创建用户 + user_id = mos.create_user(user_name="john", role=UserRole.USER) +except ValueError: + # 如果用户已存在,则获取 + user = mos.user_manager.get_user_by_name("john") +``` diff --git a/docs/cn/open_source/best_practice/mcp_for_cozespace_and_tools.md b/docs/cn/open_source/best_practice/mcp_for_cozespace_and_tools.md new file mode 100644 index 00000000..cb44b73c --- /dev/null +++ b/docs/cn/open_source/best_practice/mcp_for_cozespace_and_tools.md @@ -0,0 +1,417 @@ +--- +title: MemOS MCP集成指南 +description: 在Coze等平台配置MemOS的MCP服务,实现智能体与记忆系统的无缝集成 +--- + +本指南将帮助您在Coze空间等平台中配置MemOS的MCP服务,实现智能体与记忆系统的无缝集成。 + +## 选择MCP部署方式 + +MemOS提供两种MCP部署方式,您可以根据实际需求选择: + +### 使用MemOS云服务(推荐) + +如果您希望快速接入,无需自己部署服务器,推荐使用MemOS官方云服务。 + +**优势:** +- ✅ 开箱即用,无需部署 +- ✅ 高可用性保障 +- ✅ 自动扩展和维护 +- ✅ 支持多种客户端(Claude、Cursor、Cline等) + +**配置方式:** + +请访问 [MemOS云服务MCP配置指南](https://memos-docs.openmem.net/cn/mcp_agent/mcp/guide) 获取详细的配置说明。 + +主要步骤: +1. 在 [MemOS API控制台](https://memos-dashboard.openmem.net/cn/apikeys/) 注册账号并获取API Key +2. 在MCP客户端中配置 `@memtensor/memos-api-mcp` 服务 +3. 设置环境变量(`MEMOS_API_KEY`、`MEMOS_USER_ID`、`MEMOS_CHANNEL`) + +### 自己部署MCP服务 + +如果您需要私有化部署或定制化需求,可以在自己的服务器上部署MCP服务。 + +**优势:** +- ✅ 数据完全私有化 +- ✅ 可定制化配置 +- ✅ 完全掌控服务 +- ✅ 适合企业内部使用 + +**前置要求:** +- Python 3.9+ +- Neo4j数据库(或其他支持的图数据库) +- HTTPS域名(用于Coze等平台) + +继续阅读下方内容了解详细部署步骤。 + +--- + +## 自部署MCP服务配置 + +以下内容适用于需要自己部署MCP服务的用户。 + +## 架构说明 + +自部署MCP服务采用以下架构: + +``` +客户端(Coze/Claude等) + ↓ [HTTPS] +MCP服务器(8002端口) + ↓ [HTTP调用] +Server API(8001端口) + ↓ +MemOS核心服务 +``` + +**组件说明:** +- **Server API**: 提供REST API接口(`/product/*`),处理记忆的增删改查 +- **MCP服务器**: 通过HTTP传输暴露MCP协议,调用Server API完成操作 +- **HTTPS反向代理**: Coze等平台要求使用HTTPS安全连接 + +::steps{level="3"} + +### 步骤1: 启动Server API + +Server API是MCP服务的后端,提供实际的记忆管理功能。 + +```bash +cd /path/to/MemOS +python src/memos/api/server_api.py --port 8001 +``` + +验证Server API是否正常运行: + +```bash +curl http://localhost:8001/docs +``` + +如果返回API文档页面,说明启动成功。 + +::note +**配置文件**
Server API会自动加载配置,确保Neo4j等依赖服务已正确配置。可参考 `examples/data/config/tree_config_shared_database.json` 配置示例。
::

### 步骤2: 启动MCP HTTP服务

在另一个终端启动MCP服务:

```bash
cd /path/to/MemOS
python examples/mem_mcp/simple_fastmcp_serve.py --transport http --port 8002
```

MCP服务启动后会显示类似以下信息:

```
╭──────────────────────────────────────────────────╮
│  MemOS MCP via Server API                        │
│  Transport: HTTP                                 │
│  Server URL: http://localhost:8002/mcp           │
╰──────────────────────────────────────────────────╯
```

**环境变量配置(可选):**

可以通过 `.env` 文件或环境变量配置Server API地址:

```bash
export MEMOS_API_BASE_URL="http://localhost:8001/product"
```
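启动后可以先做一次最小连通性自检(示意脚本,非官方工具;假设 Server API 按上文运行在 8001 端口):

```python
# check_backend.py —— Server API 连通性自检示意
import os

import requests

base = os.getenv("MEMOS_API_BASE_URL", "http://localhost:8001/product")
# Server API 在根路径提供 /docs 文档页,可用来确认服务已就绪
root = base.rsplit("/", 1)[0]
resp = requests.get(f"{root}/docs", timeout=5)
print("Server API 可达:", resp.status_code == 200)
```

若输出为 `False`,请先回到步骤1检查 Server API 是否启动、端口是否正确。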
::note
**工具列表**
MCP服务提供以下工具:
- `add_memory`: 添加记忆
- `search_memories`: 搜索记忆
- `chat`: 与记忆系统对话

完整工具列表参考 `examples/mem_mcp/simple_fastmcp_serve.py`
::

### 步骤3: 配置HTTPS反向代理

Coze等平台要求使用HTTPS连接。您需要配置HTTPS反向代理(如Nginx)将流量转发到MCP服务。

**Nginx配置示例:**

```nginx
server {
    listen 443 ssl http2;
    server_name your-domain.com;

    ssl_certificate /path/to/cert.pem;
    ssl_certificate_key /path/to/key.pem;

    location /mcp {
        proxy_pass http://localhost:8002/mcp;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # SSE支持
        proxy_buffering off;
        proxy_cache off;
    }
}
```
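反向代理配置完成后,可用下面的 Python 片段快速自检证书是否有效、何时过期(示意脚本,基于标准库;请将 `your-domain.com` 替换为实际域名):

```python
# check_cert.py —— HTTPS 证书自检示意
import socket
import ssl

host = "your-domain.com"  # 替换为你的实际域名
ctx = ssl.create_default_context()
with socket.create_connection((host, 443), timeout=5) as sock:
    # 证书无效(如自签名、过期)时,wrap_socket 会直接抛出异常
    with ctx.wrap_socket(sock, server_hostname=host) as tls:
        cert = tls.getpeercert()
        print("证书颁发给:", dict(x[0] for x in cert["subject"]))
        print("过期时间:", cert["notAfter"])
```

如果这里抛出证书校验异常,Coze 等平台大概率也无法接受该证书。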
::warning
**HTTPS证书**
确保使用有效的SSL证书,自签名证书可能无法被Coze等平台接受。可使用Let's Encrypt免费获取证书。
::

### 步骤4: 测试MCP服务

使用客户端测试脚本验证服务:

```bash
cd /path/to/MemOS
python examples/mem_mcp/simple_fastmcp_client.py
```

成功输出示例:

```
Working FastMCP Client
========================================
Connected to MCP server

 1. Adding memory...
 Result: Memory added successfully

 2. Searching memories...
 Result: [搜索结果]

 3. Chatting...
 Result: [AI响应]

✓ All tests completed!
```

::

## 在Coze空间配置MCP

服务部署完成后,在Coze空间中配置MCP连接。

::steps{level="3"}

### 步骤1: 打开Coze空间并进入工具配置页面

![Coze空间配置页面](https://statics.memtensor.com.cn/memos/coze_space_1.png)

### 步骤2: 添加自定义MCP工具

在工具配置页面中添加自定义工具:

![添加自定义工具](https://statics.memtensor.com.cn/memos/coze_space_2.png)

### 步骤3: 配置MCP连接地址

配置MCP连接URL,使用您配置的HTTPS地址:

```
https://your-domain.com/mcp
```

可用的MCP工具:
- **add_memory**: 添加新记忆
- **search_memories**: 搜索已有记忆
- **chat**: 基于记忆的对话

::note
**测试连接**
配置完成后,在Coze中测试MCP连接是否正常。确保能够成功调用各个工具。
::

::

---

## 直接使用REST API(高级)

对于需要更灵活集成的场景,可以直接使用Server API的REST接口。

::steps{level="3"}

### 步骤1: 启动Server API

```bash
cd /path/to/MemOS
python src/memos/api/server_api.py --port 8001
```

**端口说明**
- Server API默认运行在8001端口
- 提供 `/product/*` REST API端点

### 步骤2: 在Coze IDE配置自定义工具

1. 在Coze中选择"IDE插件"创建方式
2. 配置请求到您部署的Server API服务

![Coze IDE插件配置](https://statics.memtensor.com.cn/memos/coze_tools_1.png)

### 步骤3: 实现add_memory工具

在IDE中配置 `add_memory` 操作并发布:

![配置add_memory操作](https://statics.memtensor.com.cn/memos/coze_tools_2.png)

详细代码如下:

```python
import json
import requests
from runtime import Args
from typings.add_memory.add_memory import Input, Output

def handler(args: Args[Input]) -> Output:
    memory_content = args.input.memory_content
    user_id = args.input.user_id
    cube_id = args.input.cube_id

    # 调用Server API的add接口
    url = "https://your-domain.com:8001/product/add"
    payload = json.dumps({
        "user_id": user_id,
        "messages": memory_content,  # 支持字符串或消息数组
        "writable_cube_ids": [cube_id] if cube_id else None
    })
    headers = {
        'Content-Type': 'application/json'
    }

    response = requests.post(url, headers=headers, data=payload, timeout=30)
    response.raise_for_status()

    return response.json()
```

::note
**API参数说明**
- 使用Server API的标准参数格式
- `messages`: 替代原来的 `memory_content`,支持字符串或消息数组
- `writable_cube_ids`: 替代原来的 `mem_cube_id`,支持多个cube
- Server API运行在8001端口,路径为 `/product/add`
- 确保与MemOS Server API接口一致,可参考 `examples/api/server_router_api.py` 中的示例

**IDE配置**
在IDE中可以自定义工具的参数、返回值格式等,确保与MemOS API接口一致。采用此方法完成 search 接口以及用户注册接口的编写,并点击发布。
::

**其他工具实现:**

类似地实现search和chat工具:

```python
# Search工具
def search_handler(args: Args[Input]) -> Output:
    url = "https://your-domain.com:8001/product/search"
    payload = json.dumps({
        "user_id": args.input.user_id,
        "query": args.input.query,
    })
    headers = {
        'Content-Type': 'application/json'
    }

    response = requests.post(url, headers=headers, data=payload, timeout=30)
    response.raise_for_status()

    return response.json()

# Chat工具
def chat_handler(args: Args[Input]) -> Output:
    url = "https://your-domain.com:8001/product/chat/complete"
    payload = {
        "user_id": args.input.user_id,
        "query": args.input.query
    }
    # 直接传字典给 json= 参数,由 requests 负责序列化,避免二次编码
    response = requests.post(url, json=payload, timeout=30)
    response.raise_for_status()
    return response.json()
```

### 步骤4: 发布并测试工具

发布完成后,可以在"我的资源"中查看插件:

![发布后的插件资源](https://statics.memtensor.com.cn/memos/coze_tools_3.png)

### 步骤5: 集成到智能体工作流

将插件添加到智能体工作流中:

1. 创建新的智能体或编辑现有智能体
2. 在工具列表中添加已发布的MemOS插件
3. 配置工作流,调用记忆工具
4. 测试记忆存储和检索功能

::

---

## 常见问题

### Q1: MCP服务无法连接到Server API

**解决方案:**
- 检查Server API是否正常运行:`curl http://localhost:8001/docs`
- 检查环境变量 `MEMOS_API_BASE_URL` 配置是否正确
- 查看MCP服务日志,确认调用地址

### Q2: Coze无法连接到MCP服务

**解决方案:**
- 确保使用HTTPS连接
- 检查SSL证书是否有效
- 测试反向代理配置:`curl https://your-domain.com/mcp`
- 检查防火墙和安全组设置

### Q3: Neo4j连接失败

**解决方案:**
- 确保Neo4j服务正常运行
- 检查配置文件中的连接信息(uri、user、password)
- 参考 `examples/data/config/tree_config_shared_database.json` 配置示例

### Q4: 如何查看完整的API示例?

**参考文件:**
- MCP服务端: `examples/mem_mcp/simple_fastmcp_serve.py`
- MCP客户端: `examples/mem_mcp/simple_fastmcp_client.py`
- API测试: `examples/api/server_router_api.py`

---

## 总结

通过本指南,您可以:
- ✅ 选择适合的MCP部署方式(云服务或自部署)
- ✅ 完成MCP服务的完整部署流程
- ✅ 在Coze等平台中集成MemOS记忆功能
- ✅ 使用REST API直接集成

无论选择哪种方式,MemOS都能为您的智能体提供强大的记忆管理能力。
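部署完成后,还可以用如下最小脚本对 `/product/add` 与 `/product/search` 做一次端到端冒烟测试(示意脚本,请求字段以上文接口说明为准,返回结构以实际部署为准):

```python
# smoke_test.py —— Server API 冒烟测试示意
import requests

BASE = "http://localhost:8001/product"
USER = "demo_user"

# 写入一条记忆
add = requests.post(f"{BASE}/add", json={
    "user_id": USER,
    "messages": "用户喜欢喝乌龙茶",
}, timeout=30)
add.raise_for_status()
print("add:", add.json())

# 检索刚写入的记忆
search = requests.post(f"{BASE}/search", json={
    "user_id": USER,
    "query": "用户喜欢喝什么?",
}, timeout=30)
search.raise_for_status()
print("search:", search.json())
```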
diff --git a/docs/cn/open_source/best_practice/memory_structure_design.md b/docs/cn/open_source/best_practice/memory_structure_design.md
new file mode 100644
index 00000000..75d80396
--- /dev/null
+++ b/docs/cn/open_source/best_practice/memory_structure_design.md
@@ -0,0 +1,138 @@
---
title: 记忆结构设计最佳实践
---

## 记忆类型选择

### 树形明文记忆

**最适用于**:知识管理、研究助手、层级结构数据

```python
tree_config = {
    "backend": "tree_text",
    "config": {
        "extractor_llm": {
            "backend": "ollama",
            "config": {
                "model_name_or_path": "qwen3:0.6b"
            }
        },
        "graph_db": {
            "backend": "neo4j",
            "config": {
                "host": "localhost",
                "port": 7687
            }
        }
    }
}
```

### 偏好明文记忆

**最适用于**:个性化对话、智能推荐、客户服务

```python
preference_config = {
    "backend": "preference_text",
    "config": {
        "extractor_llm": {
            "backend": "ollama",
            "config": {
                "model_name_or_path": "qwen3:0.6b",
            }
        },
        "vector_db": {
            "backend": "milvus",
            "config": {
                "collection_name": [
                    "explicit_preference",
                    "implicit_preference"
                ],
                "vector_dimension": 768,
                "distance_metric": "cosine",
                "uri": "./milvus_demo.db"
            }
        },
        "embedder": {
            "backend": "ollama",
            "config": {
                "model_name_or_path": "nomic-embed-text:latest"
            }
        },
        "reranker": {
            "backend": "cosine_local",
            "config": {
                "level_weights": {
                    "topic": 1.0,
                    "concept": 1.0,
                    "fact": 1.0
                },
                "level_field": "background"
            }
        }
    }
}
```

### 通用明文记忆(带向量索引)

**最适用于**:对话式 AI、私人助理、问答系统

```python
general_config = {
    "backend": "general_text",
    "config": {
        "extractor_llm": {
            "backend": "ollama",
            "config": {
                "model_name_or_path": "qwen3:0.6b"
            }
        },
        "vector_db": {
            "backend": "qdrant",
            "config": {
                "collection_name": "general"
            }
        },
        "embedder": {
            "backend": "ollama",
            "config": {
                "model_name_or_path": "nomic-embed-text"
            }
        }
    }
}
```

### 纯明文记忆(仅文本)

**最适用于**:简单应用、原型开发

```python
naive_config = {
    "backend": "naive_text",
    "config": {
        "extractor_llm": {
            "backend": "ollama",
            "config": {
                "model_name_or_path": "qwen3:0.6b"
            }
        }
    }
}
```

## 容量规划

如果你启用了调度器,可以设置记忆容量来控制资源使用情况:

```python
scheduler_config = {
    "memory_capacities": {
        "working_memory_capacity": 20,      # 工作记忆
        "user_memory_capacity": 500,        # 用户记忆
        "long_term_memory_capacity": 2000   # 长时记忆
    }
}
```
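下面是一个纯 Python 的示意片段(不依赖 MemOS 具体 API),演示如何按应用场景集中管理上文定义的几个配置模板,并在取用时做最基本的字段校验。它假设沿用前面代码块中的 `tree_config`、`preference_config`、`general_config`、`naive_config` 变量:

```python
# config_registry.py —— 配置选择示意(纯 Python,非 MemOS 官方 API)
MEMORY_CONFIGS = {
    "knowledge": tree_config,               # 知识管理 / 层级结构数据
    "personalization": preference_config,   # 个性化对话 / 推荐
    "qa": general_config,                   # 对话式 AI / 问答
    "prototype": naive_config,              # 快速原型
}


def pick_config(scenario: str) -> dict:
    """按场景取出配置模板,并校验必填字段。"""
    cfg = MEMORY_CONFIGS[scenario]
    assert "backend" in cfg and "config" in cfg, "配置缺少 backend/config 必填字段"
    return cfg


print(pick_config("qa")["backend"])  # -> general_text
```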
diff --git a/docs/cn/open_source/best_practice/network_workarounds.md b/docs/cn/open_source/best_practice/network_workarounds.md
new file mode 100644
index 00000000..a8327a23
--- /dev/null
+++ b/docs/cn/open_source/best_practice/network_workarounds.md
@@ -0,0 +1,115 @@
---
title: 网络问题解决方案
desc: 以下是一些在开发过程中可能遇到的网络问题的应对方案。
---

## **下载 Huggingface 模型**

### 镜像站点(HF-Mirror)

要通过镜像站点下载 Huggingface 模型,可以按照以下步骤进行操作:

::steps{level="4"}

#### 安装依赖项

运行以下命令安装必要的依赖项:

```bash
pip install -U huggingface_hub
```

#### 设置环境变量

将环境变量 `HF_ENDPOINT` 设置为 `https://hf-mirror.com`。

#### 下载模型或数据集

使用 huggingface-cli 下载模型或数据集。例如:

- 下载模型:

  ```bash
  huggingface-cli download --resume-download gpt2 --local-dir gpt2
  ```

- 下载数据集:

  ```bash
  huggingface-cli download --repo-type dataset --resume-download wikitext --local-dir wikitext
  ```

::

获取更详细的说明和其他方法,请参见 [此链接](https://hf-mirror.com/)。

### 其他来源

某些地区仍可能无法访问部分模型。在这种情况下,可以使用 modelscope:

::steps{level="4"}

#### 安装 ModelScope

运行以下命令安装必要的依赖项:

```bash
pip install modelscope[framework]
```

#### 下载模型或数据集

使用 modelscope 下载模型或数据集。例如:

* 下载模型:

  ```bash
  modelscope download --model 'Qwen/Qwen2-7b' --local_dir 'path/to/dir'
  ```

* 下载数据集:

  ```bash
  modelscope download --dataset 'Tongyi-DataEngine/SA1B-Dense-Caption' --local_dir './local_dir'
  ```

::

获取更详细的说明和其他方法,请参见 [官方文档](https://modelscope.cn/docs/home)。

## **使用 Poetry**

### 安装过程中的网络错误

在某些地区使用 "poetry install" 可能会遇到网络错误,可以按照以下步骤解决:

::steps{level="4"}

#### 更新配置

在 `pyproject.toml` 文件中添加以下配置以使用镜像源:

```toml
[[tool.poetry.source]]
name = "mirrors"
url = "https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple/"
priority = "primary"
```

#### 重新配置 Poetry

在终端中运行 `poetry lock` 命令,使用新的镜像源重新配置 Poetry。

::

**提示:**
注意 `poetry lock` 会修改 `pyproject.toml` 和 `poetry.lock` 文件。为避免提交不必要的更改:

- 方案一:成功执行 `poetry install` 后,使用 `git reset --hard HEAD` 还原到 Git HEAD 节点。
- 方案二:执行 `git add` 时,排除 `pyproject.toml` 和 `poetry.lock` 文件,仅添加其他文件。

以后在添加或移除依赖包时,可以使用如下命令:

```bash
poetry add <package-name>
```

更多命令和说明,请参见 [Poetry CLI 官方文档](https://python-poetry.org/docs/cli/)。

diff --git a/docs/cn/open_source/best_practice/performance_tuning.md b/docs/cn/open_source/best_practice/performance_tuning.md
new file mode 100644
index 00000000..8fe27d82
--- /dev/null
+++ b/docs/cn/open_source/best_practice/performance_tuning.md
@@ -0,0 +1,132 @@
---
title: 性能调优
---

MemOS 的性能优化主要围绕 **记忆提取 (Mem-Reader)**、**向量嵌入 (Embedding)** 和 **检索排序 (Search Ranking)** 展开。大部分配置可以通过修改 YAML 配置文件(如 `memos_config_w_scheduler.yaml`)或直接调整源代码来实现。

## 1. 记忆提取优化 (Mem-Reader Prompt)

`Mem-Reader` 组件负责从对话中提取关键信息。目前的实现中,Prompt 是定义在源代码模板中的。

### 修改 Prompt 模板

要调整提取逻辑(例如忽略闲聊、专注于特定事实),你需要直接修改源码文件:

* **文件路径**: `src/memos/templates/mem_reader_prompts.py`
* **目标变量**: `SIMPLE_STRUCT_MEM_READER_PROMPT` (用于英文) 或 `SIMPLE_STRUCT_MEM_READER_PROMPT_ZH` (用于中文)

**示例修改**:

在 `src/memos/templates/mem_reader_prompts.py` 中:

```python
SIMPLE_STRUCT_MEM_READER_PROMPT = """
You are a preference extraction expert.
Your task is to extract ONLY user preferences and dislikes from the conversation.
Ignore all other information including plans and daily events.
...
"""
```

## 2. 向量嵌入模型优化 (Embedding Models)

Embedding 模型的选择决定了语义检索的准确性和速度。通常在 YAML 配置文件中进行设置。

### 配置文件修改

在你的配置文件(如 `memos_config.yaml`)中,找到 `mem_reader` 或 `text_mem` 下的 `embedder` 部分:

```yaml
mem_reader:
  backend: "simple_struct"
  config:
    # ... 其他配置
    embedder:
      # 方案 A: 使用 Ollama (速度快,适合本地)
      backend: "ollama"
      config:
        model_name_or_path: "nomic-embed-text:latest"

      # 方案 B: 使用 Sentence Transformer (精度高,显存占用大)
      # backend: "sentence_transformer"
      # config:
      #   model_name_or_path: "BAAI/bge-m3"
```

* **推荐模型**:
  * **快速/本地**: `nomic-embed-text` (Ollama)
  * **高精度**: `BAAI/bge-m3` 或 `OpenAI` 的 `text-embedding-3-small` (需使用 `universal_api` backend)

## 3. 检索排序优化 (Search Ranking)

检索性能主要受召回数量 (`top_k`) 和重排序策略影响。

### 调整召回数量 (Top-K)

在 `mem_scheduler` 的配置中调整 `top_k`。增加此值可以提高召回率,但会增加处理时间。

```yaml
mem_scheduler:
  backend: "general_scheduler"
  config:
    # 初始检索的候选数量
    top_k: 20
    # ...
+``` + +### 引入 Reranker (进阶) + +MemOS 支持在检索后引入 Reranker 进行精排。这通常需要在初始化 `Searcher` 组件时指定。如果你是作为开发者集成 MemOS,可以在代码中配置: + +```python +from memos.reranker.factory import RerankerFactory + +# 在初始化 Searcher 时 +reranker = RerankerFactory.from_config({ + "backend": "sentence_transformer", + "config": { + "model_name_or_path": "BAAI/bge-reranker-base" + } +}) +``` + +## 4. 系统资源与容量限制 + +合理限制各类记忆的容量可以防止内存无限增长,并保持检索速度。这通常在 `mem_cube` 的配置中设置。 + +### 内存容量配置 (Memory Size) + +在 YAML 配置文件中,配置 `memory_size` 字典: + +```yaml +mem_cube: + backend: "general" + config: + text_mem: + backend: "tree" + config: + # 限制各类记忆的条目数 + memory_size: + WorkingMemory: 10 # 最近几轮对话的短期记忆 + LongTermMemory: 2000 # 长期记忆上限 + UserMemory: 500 # 用户画像/偏好上限 +``` + +### 批处理与并发 + +在 `mem_scheduler` 中可以配置并发处理能力: + +```yaml +mem_scheduler: + config: + thread_pool_max_workers: 10 # 并行处理线程数 + consume_interval_seconds: 0.01 # 消息队列消费间隔 + enable_parallel_dispatch: true # 开启并行分发 +``` diff --git a/docs/cn/open_source/contribution/commit_guidelines.md b/docs/cn/open_source/contribution/commit_guidelines.md new file mode 100644 index 00000000..aac4095b --- /dev/null +++ b/docs/cn/open_source/contribution/commit_guidelines.md @@ -0,0 +1,17 @@ +--- +title: 提交规范 +--- + +请遵循 [Conventional Commits](https://www.conventionalcommits.org/) 格式: + +- `feat:` 用于新增功能 +- `fix:` 用于修复 bug +- `docs:` 用于文档更新 +- `style:` 用于格式调整(不影响代码逻辑) +- `refactor:` 用于代码重构 +- `test:` 用于新增或更新测试 +- `chore:` 用于其他维护任务 +- `ci:` 用于 CI/CD 或工作流相关变更 + +**示例:** +`feat: add user authentication` diff --git a/docs/cn/open_source/contribution/development_workflow.md b/docs/cn/open_source/contribution/development_workflow.md new file mode 100644 index 00000000..088784fb --- /dev/null +++ b/docs/cn/open_source/contribution/development_workflow.md @@ -0,0 +1,75 @@ +--- +title: 开发流程 +--- + +按照以下步骤参与项目开发。 + +::steps{level="4"} + +#### 与上游仓库同步 + +如果你之前 fork 了该仓库,请与上游仓库的变更保持同步: + +```bash +git checkout dev # 切换到 dev 分支 +git fetch upstream # 获取上游仓库的最新更改 +git pull upstream dev # 将更改合并到本地 dev 分支 +git push origin dev # 将合并后的代码推送到你自己的 fork +``` + +#### 创建功能分支 + +为你的新功能或修订创建一个新的分支: + +```bash +git checkout -b feat/descriptive-name +``` + +#### 添加你的功能或修订 + +在相应文件中实现你的功能、修订或改进。 + +* 例如,你可以在 `src/memos/hello_world.py` 中添加一个函数,并在 `tests/test_hello_world.py` 中编写相应的测试用例。 + +#### 测试你的更改 + +运行测试套件以确保更改正确: + +```bash +make test +``` + +#### 提交更改 + +在提交前或 PR 前,rebase 到最新 upstream/dev: + +```bash +git fetch upstream +git rebase upstream/dev # 把你的 feat 分支基于最新 dev 重放 +``` + +提交更改时请遵循项目的提交规范(参见 [提交规范](commit_guidelines.md))。 + +#### 推送到你的 Fork 仓库 + +将功能分支推送到你 fork 的远程仓库: + +```bash +git push origin feat/descriptive-name +``` + +#### 创建 Pull Request + +提交你的更改以供审核: + +* **重要提示:** 请务必将 Pull Request 提交到: + + * ✅ 上游仓库的 `dev` 分支, + * ❎ 而不是上游仓库的 `main` 分支。 +* 打开 GitHub 上的原始仓库 +* 点击 "Pull Requests" +* 点击 "New Pull Request" +* 选择 `dev` 作为目标分支,你的分支作为对比分支 +* 仔细填写 PR 描述 + +:: diff --git a/docs/cn/open_source/contribution/overview.md b/docs/cn/open_source/contribution/overview.md new file mode 100644 index 00000000..83f58c4b --- /dev/null +++ b/docs/cn/open_source/contribution/overview.md @@ -0,0 +1,14 @@ +--- +title: 参与 MemOS 开发 +desc: 欢迎阅读 MemOS 贡献指南!了解如何配置开发环境、遵循我们的开发流程、撰写规范的提交信息、完善文档以及添加测试用例。 +--- + +- **首次贡献者:** 请先阅读 [环境配置指南](setting_up.md),以准备你的开发环境。 +- **准备开始编码了吗?** 请查看 [开发流程](development_workflow.md) 指南,了解我们提交更改的流程。 +- **撰写规范的提交信息:** 请参考我们的 [提交规范](commit_guidelines.md)。 +- **参与文档编写:** 如果你正在帮助我们改进文档,请阅读 [文档编写指南](writing_docs.md)。 +- **添加或完善测试用例:** 请参考 [测试编写指南](writing_tests.md)。 + +你的贡献让这个项目更加优秀!✨ 如果有任何问题,欢迎提交 issue 
或参与讨论,也可以扫描下方二维码加入我们的 Discord 或微信社群与我们联系。 + +QR Code diff --git a/docs/cn/open_source/contribution/setting_up.md b/docs/cn/open_source/contribution/setting_up.md new file mode 100644 index 00000000..4e5cd330 --- /dev/null +++ b/docs/cn/open_source/contribution/setting_up.md @@ -0,0 +1,205 @@ +--- +title: 配置开发环境 +desc: 若要参与 MemOS 的开发,你需要在本地配置开发环境。 +--- + +::steps{level="4"} + +#### Fork 并克隆仓库 + +在本地设置项目仓库: + +- 在 GitHub 上 fork 仓库 +- 将你的 fork 克隆到本地: + + ```bash + git clone https://github.com/YOUR-USERNAME/MemOS.git + cd MemOS + ``` + +- 添加上游仓库作为远程源: + + ```bash + git remote add upstream https://github.com/MemTensor/MemOS.git + ``` + +#### 准备开发依赖 + +确保本地已安装: + +- Git +- Python 3.9+ +- Make + +验证 Python: + +```bash +python3 --version +``` + +#### 安装 Poetry + +MemOS 使用 Poetry 管理 Python 依赖。推荐使用官方安装脚本: + +```bash +curl -sSL https://install.python-poetry.org | python - +``` + +验证安装是否成功: + +```bash +poetry --version +``` + +如果提示 `poetry: command not found`,请将安装器输出中提示的 Poetry 可执行文件目录加入 PATH,然后重新打开终端再验证。 + +更多安装选项参考:[官方安装指南](https://python-poetry.org/docs/#installing-with-the-official-installer)。 + +#### 安装依赖并设置 Pre-commit 钩子 + +在仓库根目录安装所有依赖与开发工具: + +```bash +make install +``` + +提示: + +- 如果你切换分支或依赖发生变化,可能需要**重新运行 `make install`** 以保持环境一致 + +### 理解记忆模块与依赖选择 +在配置环境之前,我们需要先了解 MemOS 的记忆模块分类及其对应的数据库依赖。这将决定你需要安装哪些组件。 + +#### 记忆类型 + +MemOS 的记忆系统主要分为两类(括号内为配置项 `backend` 的标识符): + +- **明文记忆 (Textual Memory)**:属于事实记忆,**需要选择其中一种**。 + - `tree` (`tree_text`): 树状记忆(推荐),结构化程度最高。 + - `general` (`general_text`): 通用记忆,基于向量检索。 + - `naive` (`naive_text`): 简单记忆,无特殊依赖(仅用于测试)。 +- **偏好记忆 (Preference Memory)**:属于用户偏好,**可选**。 + - `pref`: 用于存储和检索用户偏好。 + +#### 数据库依赖矩阵 + +不同的记忆类型需要不同的数据库支持: + +| 记忆类型 | 依赖组件 | 备注 | +| :--- | :--- | :--- | +| **Tree** | **图数据库** | 必选。支持 Neo4j Desktop, Neo4j Community , PolarDB | +| **General** | **向量数据库** | 必选。推荐使用 Qdrant(或兼容向量 DB) | +| **Naive** | 无 | 无需安装数据库 | +| **Pref** | **Milvus** | 如果启用偏好记忆,必须安装 Milvus | + +#### 关于 Tree 记忆与图数据库的选择 + +如果你选择使用 **Tree 明文记忆后端**(配置标识通常为 `tree_text`),则需要准备一个 **图数据库(Graph DB)** 作为存储与查询基础。目前可选方案包括: + +- **Neo4j Desktop**(PC 端推荐):在本机安装并通过图形界面管理数据库,适合快速上手与调试。 +- **PolarDB**:云上托管的图数据库服务(付费),适合生产或团队协作场景。 +- **Neo4j Community**(社区版):开源免费,适合服务器或 Linux 环境部署。 + +**特别说明**: + +- 使用 **Neo4j Desktop** 时,你主要关注数据库的启动与连接即可,日常调试更方便。 +- 使用 **Neo4j Community** 时,需要注意:它**不提供原生向量索引能力**。如果你的流程需要向量检索/相似度搜索能力,通常需要通过**外挂向量库**(例如 Qdrant)来补齐相关能力。 + +#### 本教程的配置方案 + +为便于开发者快速跑通核心链路,本教程采用以下组合: + +- **明文记忆后端**:`tree_text`(概念上对应 Tree 记忆) +- **图数据库**:Neo4j Community(可使用 Docker 启动) +- **向量能力**:Qdrant(本地模式) + +由于 Neo4j Community 不支持原生向量索引,本教程引入 Qdrant 作为向量能力的补充。为了降低环境复杂度,我们**不启动 Qdrant 的服务端进程**(不运行 Qdrant 容器),而是使用 Qdrant 的**本地模式**:在配置中以本地路径(`path`)形式指定存储位置,由系统在该目录下初始化并读写所需的数据文件。若未显式指定路径,则会使用默认路径进行初始化与持久化(具体默认位置以项目实现与配置为准)。 + +#### 创建配置文件 + +.env 内容,快速配置请见 docker 安装下的[env 配置](/open_source/getting_started/installation#2.-.env-内容) +.env详细配置请见[env配置](/open_source/getting_started/rest_api_server/#本地运行) + +::note +**请注意**
.env 文件需要放在 MemOS 项目根目录下
::

```bash
cd MemOS
touch .env
```
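创建并填写完成后,可用下面的小脚本确认 .env 能被正确读取(示意脚本,假设使用 python-dotenv;`OPENAI_API_KEY` 等键名以你的实际配置为准):

```python
# check_env.py —— 在 MemOS 项目根目录下执行
import os

from dotenv import load_dotenv

load_dotenv()  # 默认读取当前目录下的 .env

for key in ["OPENAI_API_KEY", "MOS_EMBEDDER_API_KEY"]:
    print(key, "已配置" if os.getenv(key) else "缺失")
```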
#### 配置 Dockerfile 文件

::note
**请注意**
Dockerfile 文件在 docker 目录下
::

```bash
# 进入 docker 目录
cd docker
```

镜像包含快速模式和完整模式两种,分别使用精简包和全量包(均区分 ARM 和 x86 架构):

- **精简包**:精简体量过大的 nvidia 相关等依赖,对镜像实现轻量化,使本地部署更加轻量快速。
  - `registry.cn-shanghai.aliyuncs.com/memtensor/memos-base:v1.0`
  - `registry.cn-shanghai.aliyuncs.com/memtensor/memos-base-arm:v1.0`(ARM)
- **全量包**:将 MemOS 全部依赖打包为镜像,可体验完整功能,通过配置 Dockerfile 可直接构建启动。
  - `registry.cn-shanghai.aliyuncs.com/memtensor/memos-full-base:v1.0.0`
  - `registry.cn-shanghai.aliyuncs.com/memtensor/memos-full-base-arm:v1.0.0`(ARM)

```dockerfile
# 当前示例使用精简包(ARM)镜像
FROM registry.cn-shanghai.aliyuncs.com/memtensor/memos-base-arm:v1.0

WORKDIR /app

ENV HF_ENDPOINT=https://hf-mirror.com

ENV PYTHONPATH=/app/src

COPY src/ ./src/

EXPOSE 8000

CMD ["uvicorn", "memos.api.server_api:app", "--host", "0.0.0.0", "--port", "8000", "--reload"]
```

#### 启动 Docker 客户端

如果尚未安装 Docker,请先从官网下载安装对应版本:https://www.docker.com/

安装完成之后,可通过客户端启动 Docker,或者通过命令行启动:

```bash
# 通过命令行启动 docker
sudo systemctl start docker

# 启动后,查看 docker 状态
docker ps

# 查看 docker 镜像(可选)
docker images
```

#### 构建并启动服务

::note
**请注意**
+构建命令同样在 docker 目录下 +:: +```bash +# 在docker目录下 +docker compose up neo4j +``` +#### 新建终端启动server端口 : + +```bash +cd MemOS +make serve +``` +:: diff --git a/docs/cn/open_source/contribution/writing_docs.md b/docs/cn/open_source/contribution/writing_docs.md new file mode 100644 index 00000000..ea8cdace --- /dev/null +++ b/docs/cn/open_source/contribution/writing_docs.md @@ -0,0 +1,546 @@ +--- +title: 文档编写指南 +desc: 本项目使用 Nuxt Content 构建支持 Markdown 和富 Vue 组件的文档系统。 +--- + +## 创建新文档 + +::steps +### 创建 Markdown 文件 +在 `content/` 目录或其子目录中创建新的 `.md` 文件。根据内容类型选择合适的位置。 + +### 添加 Frontmatter +在文件顶部添加 YAML frontmatter 来提供元数据。frontmatter 支持以下字段: + +::card{title="Frontmatter 字段"} +**必填字段:** +- `title`(字符串) - 显示在导航和页面标题中的文档标题 + +**可选字段:** +- `desc`(字符串) - 文档内容的简要描述 +- `banner`(字符串) - 页面顶部展示的横幅图片链接 +- `links`(数组) - 包含标签、URL 和图标的相关链接数组 + +![Frontmatter 示例](https://statics.memtensor.com.cn/memos/frontmatter.png) +:: + +**完整 Frontmatter 示例:** + +```yaml +--- +title: MemOS 文档 +desc: 欢迎阅读 MemOS 的官方文档——一个旨在赋能大语言模型(LLMs)实现高级、模块化记忆能力的 Python 包。 +banner: https://statics.memtensor.com.cn/memos/memos-banner.gif +links: + - label: 'PyPI' + to: https://pypi.org/project/MemoryOS/ + target: _blank + avatar: + src: https://statics.memtensor.com.cn/icon/pypi.svg + alt: PyPI logo + - label: '开源地址' + to: https://github.com/MemTensor/MemOS + target: _blank + icon: i-simple-icons-github +--- +``` + +### 编写内容 +使用 Markdown 语法和 MDC 组件撰写文档内容。利用已有组件构建结构清晰、交互友好、内容丰富的文档。 + +### 更新导航 +将新文档添加到 `content/settings.yml` 中的 `nav` 部分,以便在站点导航中访问。 + +### 合并到主分支 +一旦变更合并到 `main` 分支,文档将自动更新并部署。 +:: + +## 组件示例 + +本项目使用 Nuxt Content 的 MDC(Markdown Components)语法,支持在 Markdown 中使用 Vue 组件。这些组件有助于创建风格一致、结构良好、体验优秀的文档内容。 + +### Image + +为文档添加图片时,可以使用多种方式进行引用: + +#### 使用 Base64Image 组件引用本地图片 + +对于存储在 `public/assets` 目录下的图片,推荐使用 `Base64Image` 组件,该组件能将图片直接嵌入页面以提高性能: + +```mdc +:Base64Image{src="/assets/memos-architecture.png" alt="MemOS Architecture"} +``` + +#### 使用 Markdown 语法引用远程图片 + +对于托管在外部服务器上的图片,使用标准的 Markdown 图片语法: + +```markdown +![MemOS Architecture](https://statics.memtensor.com.cn/memos/memos-architecture.png) +``` + +### Steps + +使用 `steps` 组件将文档标题自动编号,生成逐步引导的教程。 + +::code-preview +--- +class: "[&>div]:*:w-full" +--- + :::steps{level="4"} +#### Fork 并克隆仓库 + +在本地设置项目仓库: + +- 在 GitHub 上 fork 仓库 +- 将你的 fork 克隆到本地: + + ```bash + git clone https://github.com/YOUR-USERNAME/MemOS.git + cd MemOS + ``` + +- 添加上游仓库作为远程源: + + ```bash + git remote add upstream https://github.com/MemTensor/MemOS.git + ``` + +#### 准备开发依赖 + +确保本地已安装: + +- Git +- Python 3.9+ +- Make + +验证 Python: + +```bash +python3 --version +``` + +#### 安装 Poetry + +MemOS 使用 Poetry 管理 Python 依赖。推荐使用官方安装脚本: + +```bash +curl -sSL https://install.python-poetry.org | python3 - +``` + +验证安装是否成功: + +```bash +poetry --version +``` + +如果提示 `poetry: command not found`,请将安装器输出中提示的 Poetry 可执行文件目录加入 PATH,然后重新打开终端再验证。 + +更多安装选项参考:[官方安装指南](https://python-poetry.org/docs/#installing-with-the-official-installer)。 + +#### 安装依赖并设置 Pre-commit 钩子 + +在仓库根目录安装所有依赖与开发工具: + +```bash +make install +``` + +提示: + +- 如果你切换分支或依赖发生变化,可能需要**重新运行 `make install`** 以保持环境一致 +::: + +#code +````mdc +::steps{level="4"} + +#### Fork 并克隆仓库 + +在本地设置项目仓库: + +- 在 GitHub 上 fork 仓库 +- 将你的 fork 克隆到本地: + + ```bash + git clone https://github.com/YOUR-USERNAME/MemOS.git + cd MemOS + ``` + +- 添加上游仓库作为远程源: + + ```bash + git remote add upstream https://github.com/MemTensor/MemOS.git + ``` + +#### 准备开发依赖 + +确保本地已安装: + +- Git +- Python 3.9+ +- Make + +验证 Python: + +```bash +python3 --version +``` + +#### 安装 Poetry + +MemOS 使用 Poetry 管理 Python 
依赖。推荐使用官方安装脚本: + +```bash +curl -sSL https://install.python-poetry.org | python3 - +``` + +验证安装是否成功: + +```bash +poetry --version +``` + +如果提示 `poetry: command not found`,请将安装器输出中提示的 Poetry 可执行文件目录加入 PATH,然后重新打开终端再验证。 + +更多安装选项参考:[官方安装指南](https://python-poetry.org/docs/#installing-with-the-official-installer)。 + +#### 安装依赖并设置 Pre-commit 钩子 + +在仓库根目录安装所有依赖与开发工具: + +```bash +make install +``` + +提示: + +- 如果你切换分支或依赖发生变化,可能需要**重新运行 `make install`** 以保持环境一致 +:: +```` +:: + + +### Accordion + +使用 `accordion` 和 `accordion-item` 创建可折叠内容区域。适用于组织 FAQ、可展开详情或分组信息等场景。 + +::code-preview +--- +class: "[&>div]:*:my-0" +--- + :::accordion + ::::accordion-item + --- + icon: i-lucide-circle-help + label: MemOS 是否兼容通过 API 访问的大语言模型(LLM)? + --- + 是的。MemOS 设计时尽可能兼容各种类型的模型。不过需要注意的是,如果你使用的是基于 API 的模型,那么激活记忆和参数记忆将无法使用。 + :::: + + ::::accordion-item + --- + icon: i-lucide-circle-help + label: MemOS 如何提升大语言模型应用的效果? + --- + MemOS 通过提供结构化的、持久化的记忆功能、智能调度机制、长期知识保留能力,以及用于快速推理的 KV cache,增强了大语言模型的应用效果。它支持精细化的访问控制与用户隔离,保障在多用户环境中的记忆安全。其模块化架构使得新记忆类型、LLM 及存储后端可以无缝集成,适用于各种智能应用场景。 + :::: + + ::::accordion-item{icon="i-lucide-circle-help" label="MemOS 的定价是多少?"} + MemOS 开源版本是免费的。 + :::: +::: + + +#code +```mdc +::accordion + +:::accordion-item{label="MemOS 是否兼容通过 API 访问的大语言模型(LLM)?" icon="i-lucide-circle-help"} +是的。MemOS 的设计目标是尽可能兼容各种类型的模型。然而需要注意的是,如果你使用的是基于 API 的模型,那么激活记忆和参数记忆将无法使用。 +::: + +:::accordion-item{label="MemOS 如何提升大语言模型应用的效果?" icon="i-lucide-circle-help"} +MemOS 通过提供结构化、持久化的记忆,配合智能调度、长期知识保留机制以及用于快速推理的 KV cache,有效增强了大语言模型的应用能力。它支持细粒度的访问控制与用户隔离机制,确保在多用户环境中的记忆安全。其模块化架构支持无缝集成新的记忆类型、LLM 和存储后端,能够适配多种智能应用场景。 +::: + +:::accordion-item{label="MemOS 的定价是多少?" icon="i-lucide-circle-help"} +MemOS 开源版本是免费的。 +::: + +:: +``` +:: + +### Badge + +使用 badge 展示状态指示或标签。在内容中高亮版本号、状态或分类信息时非常实用。 + +::code-preview +--- +label: Preview +--- + :::badge + **v1.0.0** + ::: + +#code +```mdc +::badge +**v1.0.0** +:: +``` +:: + + + +### Callout + +使用 callout 可以强调重要的上下文信息。Callout 用于引起用户注意,例如备注、提示、警告或注意事项,使关键信息更加突出。 + +你可以通过 `icon` 和 `color` 属性自定义样式,或者使用预定义的语义样式 `note`、`tip`、`warning`、`caution` 进行快捷调用。 + +::code-preview +--- +class: "[&>div]:*:my-0 [&>div]:*:w-full" +--- + :::callout + 这是一个支持完整 **markdown** 的 `callout` 提示框。 + ::: + +#code +```mdc +::callout +这是一个支持完整 **markdown** 的 `callout` 提示框。 +:: +``` +:: + +::code-preview + :::div{.flex.flex-col.gap-4.w-full} + ::::note{.w-full.my-0} + 基础备注内容 + :::: + + ::::note{.w-full.my-0 to="/open_source/getting_started/quick_start"} + 带链接的备注 —— 点击跳转到快速开始指南 + :::: + + ::::note{.w-full.my-0 to="/open_source/modules/mem_cube" icon="ri:database-line"} + 带自定义图标的备注 —— 了解更多关于 MemCube 的信息 + :::: + + ::::tip{.w-full.my-0} + 这里是一个有用的建议。 + :::: + + ::::warning{.w-full.my-0} + 请谨慎操作,此行为可能导致意外结果。 + :::: + + ::::caution{.w-full.my-0} + 此操作无法撤销。 + :::: + ::: + +#code +```mdc +::note +基础备注内容 +:: + +::note{to="/open_source/getting_started/quick_start"} +带链接的备注 —— 点击跳转到快速开始指南 +:: + +::note{to="/open_source/modules/mem_cube" icon="ri:database-line"} +带自定义图标的备注 —— 了解更多关于 MemCube 的信息 +:: + +::tip +这里是一个有用的建议。 +:: + +::warning +请小心执行此操作,它可能会导致意外结果。 +:: + +::caution +此操作无法撤销。 +:: +``` +:: + +### Card + +使用 `card` 可高亮展示内容模块。卡片适用于展示功能、资源或相关信息,以视觉上区分并增强交互性。 + +你可以通过 `title`、`icon` 和 `color` 属性自定义样式。Card 还支持使用 `` 属性进行导航跳转。 + +::code-preview +--- +class: "[&>div]:*:my-0 [&>div]:*:w-full" +--- + :::card + --- + icon: i-simple-icons-github + target: _blank + title: 开源项目 + to: https://github.com/MemTensor/MemOS + --- + 使用我们的开源版本 + ::: + +#code +```mdc +::card +--- +title: 开源项目 +icon: i-simple-icons-github +to: 
https://github.com/MemTensor/MemOS +target: _blank +--- +使用我们的开源版本 +:: +``` +:: + +### CardGroup + +使用 `card-group` 可将多个卡片以网格形式排列。适合展示结构化、响应式布局的卡片集合,视觉效果良好。 + +::code-preview + :::card-group{.w-full} + ::::card + --- + icon: ri:play-line + title: 最简Pipeline + to: /open_source/getting_started/examples#example-1-minimal-pipeline + --- + 最小可用Pipeline — 添加、搜索、更新和导出明文记忆。 + :::: + + ::::card + --- + icon: ri:tree-line + title: 仅 TreeTextMemory + to: /open_source/getting_started/examples#example-2-treetextmemory-only + --- + 使用基于 Neo4j 的分层记忆,构建结构化、多跳的知识图谱。 + :::: + + ::::card + --- + icon: ri:database-2-line + title: 仅 KVCacheMemory + to: /open_source/getting_started/examples#example-3-kvcachememory-only + --- + 通过短期 KV cache加速会话,实现快速上下文注入。 + :::: + + ::::card + --- + icon: hugeicons:share-07 + title: 混合 TreeText + KVCache + to: /open_source/getting_started/examples#example-4-hybrid + --- + 在单一 MemCube 中结合可解释的图记忆与快速 KV cache。 + :::: + ::: + +#code +```mdc +::card-group + +:::card +--- +icon: ri:play-line +title: 最简Pipeline 示例 +to: /open_source/getting_started/examples#example-1-minimal-pipeline +--- +最小可运行的Pipeline 示例——添加、搜索、更新及导出明文记忆。 +::: + +:::card +--- +icon: ri:tree-line +title: 仅使用 TreeTextMemory +to: /open_source/getting_started/examples#example-2-treetextmemory-only +--- +使用基于 Neo4j 的层级记忆构建结构化的多跳知识图谱。 +::: + +:::card +--- +icon: ri:database-2-line +title: 仅使用 KVCacheMemory +to: /open_source/getting_started/examples#example-3-kvcachememory-only +--- +通过短期KV cache加速会话,实现快速上下文注入。 +::: + +:::card +--- +icon: hugeicons:share-07 +title: 混合使用 TreeText 和 KVCache +to: /open_source/getting_started/examples#example-4-hybrid +--- +在单一 MemCube 中结合可解释的图记忆与高速KV cache。 +::: + +:: +``` +:: + +## 导航图标 + +在 `content/settings.yml` 中添加导航条目时,可以使用 `(ri:图标名称)` 的语法嵌入图标: + +```yaml +- "(ri:home-line) 首页": overview.md +- "(ri:team-line) 用户管理": modules/mos/users.md +- "(ri:flask-line) 测试编写": contribution/writing_tests.md +``` + +可用图标请参考:[https://icones.js.org/](https://icones.js.org/) + +## 本地预览 + +若需本地预览文档,可在项目根目录下执行以下命令。 + +首先安装依赖: + +```bash +pnpm install +``` + +启动开发服务器: + +```bash +pnpm dev +``` + +上述命令将启动本地 Web 服务器,通常访问地址为 `http://127.0.0.1:3000`。 + +## 深入了解 + +### Nuxt Content 与排版系统 + +本项目使用 Nuxt Content,支持丰富的排版组件与样式。如需了解更多组件用法与自定义选项,请参考: + +* [Nuxt UI Typography 文档](https://ui.nuxt.com/getting-started/typography) + +## 编写规范 + +::note +**文档编写建议** + +1. **结构清晰**:使用恰当的标题层级组织内容 +2. **合理使用组件**:如 note、card 等组件提升可读性与互动性 +3. **代码示例清晰**:为技术文档提供清晰的代码片段,并使用语法高亮 +4. **图标使用**:在导航中使用合适的图标以增强用户体验与层次感 +:: + +::card{title="Quick Reference"} +提交前请先本地测试你的文档效果。运行 `pnpm dev` 以预览你的变更并确保所有组件正确渲染。 +:: diff --git a/docs/cn/open_source/contribution/writing_tests.md b/docs/cn/open_source/contribution/writing_tests.md new file mode 100644 index 00000000..8058224f --- /dev/null +++ b/docs/cn/open_source/contribution/writing_tests.md @@ -0,0 +1,43 @@ +--- +title: 如何编写单元测试 +desc: 本项目使用 [pytest](https://docs.pytest.org/) 进行单元测试。 +--- + +## 编写测试 + +1. 在 `tests/` 目录下创建一个新的 Python 文件,文件名应以 `test_` 开头。 +2. 在该文件中定义以 `test_` 开头的函数。 +3. 
使用 `assert` 语句检查预期结果。 + +以下是一个基本示例: + +```python +# tests/test_example.py + +def test_addition(): + assert 1 + 1 == 2 +``` + +## 运行测试 + +要运行所有测试,请在项目根目录下执行以下命令: + +```bash +make test +``` + +该命令将自动发现并运行 `tests/` 目录下的所有测试用例。 + +## 高级技巧 + +Pytest 提供了许多高级功能,例如 fixtures 和 mocking。 + +### Fixtures + +Fixtures 是可以为测试提供数据或设置初始状态的函数。使用 `@pytest.fixture` 装饰器进行定义。 + +### Mocking + +Mocking 用于用 mock 对象替换系统中的某些部分。这对于隔离被测试的代码非常有用。常用的工具是 `unittest.mock` 库,通常配合 `patch` 函数使用。 + +有关 mocking 的示例,请参见 `tests/test_hello_world.py`。 diff --git a/docs/cn/open_source/getting_started/examples.md b/docs/cn/open_source/getting_started/examples.md new file mode 100644 index 00000000..16e4e476 --- /dev/null +++ b/docs/cn/open_source/getting_started/examples.md @@ -0,0 +1,680 @@ +--- +title: MemOS 示例 +desc: "恭喜你——你已经掌握了快速入门并构建了第一个可用的记忆!现在是时候通过结合不同的记忆类型和功能,看看 MemOS 可以实现多大的可能性。使用这些精选示例来激发你自己的智能体、聊天机器人或知识系统的灵感。" +--- + +::card-group + + :::card + --- + icon: ri:play-line + title: 最简Pipeline + to: /cn/open_source/getting_started/examples#示例-1最简pipeline + --- + 最小的可用Pipeline — 添加、搜索明文记忆。 + ::: + + :::card + --- + icon: ri:tree-line + title: 多信息源的添加与检索 + to: /cn/open_source/getting_started/examples#示例-2多信息源记忆的添加与检索 + --- + 添加文本、图片、文件、工具调用的多信息源messages到记忆,并能够检索它们。 + ::: + + :::card + --- + icon: ri:apps-line + title: 多Cube添加和检索 + to: /cn/open_source/getting_started/examples#示例-3多cube添加和检索 + --- + 添加不同记忆到不同的Cube,在检索时同时召回它们。 + ::: + + :::card + --- + icon: ri:database-2-line + title: 仅 KVCacheMemory + to: /cn/open_source/getting_started/examples#示例-4仅-kvcachememory + --- + 使用短期 KV cache加速会话,实现快速上下文注入。 + ::: + + :::card + --- + icon: ri:calendar-check-line + title: 记忆调度 + to: /cn/open_source/getting_started/examples#示例-5多忆调度 + --- + 为多用户、多会话智能体运行动态记忆调用。 + ::: + +:: + +## 示例 1:最简Pipeline + +### 何时使用: +- 你想要最小的入门可用示例。 +- 你只需要将简单的明文记忆存储到数据库中,并能够检索它们。 + +### 关键点: +- 支持基础的个人用户记忆添加、搜索。 + +### 完整示例代码 +```python +import json +from memos.api.routers.server_router import add_memories, search_memories +from memos.api.product_models import APIADDRequest, APISearchRequest + +user_id = "test_user_1" +add_req = APIADDRequest( + user_id=user_id, + writable_cube_ids=["cube_test_user_1"], + messages = [ + {"role": "user", "content": "I’ve planned to travel to Guangzhou during the summer vacation. 
What chain hotels are available for accommodation?"}, + {"role": "assistant", "content": "You can consider [7 Days Inn, Ji Hotel, Hilton], etc."}, + {"role": "user", "content": "I’ll choose 7 Days Inn."}, + {"role": "assistant", "content": "Okay, feel free to ask me if you have any other questions."} + ], + async_mode="sync", + mode="fine", +) + +add_rsp = add_memories(add_req) +print("add_memories rsp: \n\n", add_rsp) + +search_req = APISearchRequest( + user_id=user_id, + readable_cube_ids=["cube_test_user_1"], + query="Please recommend a hotel that I haven’t stayed at before.", + include_preference=True, +) + +search_rsp = search_memories(search_req).data +print("\n\nsearch_rsp: \n\n", json.dumps(search_rsp, indent=2, ensure_ascii=False)) +```` + +## 示例 2:多信息源记忆的添加与检索 + +### 何时使用: + +- 除单纯的文本对话外,你需要将文件、图片内容或工具调用历史信息加入记忆 +- 同时你想要检索这些多源信息的记忆 + +### 关键点: + +- 多种信息来源的记忆添加 +- 需要有可下载的文件、图片url +- 添加的信息需要严格符合OpenAI Messages格式 +- system prompt中的工具Schema需要包装在 中 + +### 完整示例代码 +添加文本+文件到记忆中 +```python +import json +from memos.api.routers.server_router import add_memories, search_memories +from memos.api.product_models import APIADDRequest, APISearchRequest + +user_id = "test_user_2" +add_req = APIADDRequest( + user_id=user_id, + writable_cube_ids=["cube_test_user_2"], + messages = [ + { + "role": "user", + "content": [ + { + "type": "text", + "text": "Please read this file, summarize the key points, and provide a final conclusion." + }, + { + "type": "file", + "file": { + "file_id": "file_123", + "filename": "report.md", + "file_data": "@http://139.196.232.20:9090/graph-test/algorithm/2025_11_13/1763043889_1763043782_PM1%E8%BD%A6%E9%97%B4PMT%E9%9D%B4%E5%8E%8B%E8%BE%B9%E5%8E%8B%E5%8E%8B%E5%8A%9B%E6%97%A0%E6%B3%95%E5%BB%BA%E7%AB%8B%E6%95%85%E9%9A%9C%E6%8A%A5%E5%91%8A20240720.md" + } + }, + ] + }, + { + "role": "assistant", + "content": [ + { + "type": "text", + "text": "Final Summary: During the PMT boot-pressure startup test of the PM1 workshop on July 20, 2024, the drive could not run because the edge pressures on both sides failed to reach the 2.5-bar interlock requirement. After troubleshooting, the PLC output signals, hydraulic pipelines, and valves were all found to be normal. The root cause was ultimately identified as poor contact at the negative terminal of the proportional valve’s DC 24V power supply inside the PLC cabinet, caused by a short-jumpered terminal block. After re-connecting the negative incoming lines in parallel, the equipment returned to normal operation. It is recommended to replace terminal blocks in batches, inspect instruments with uncertain service life, and optimize the troubleshooting process by tracing common-mode issues from shared buses and power supply sources." 
+ } + ] + } + ], + async_mode="sync", + mode="fine", +) + +add_rsp = add_memories(add_req) +print("add_memories rsp: \n\n", add_rsp) + +search_req = APISearchRequest( + user_id=user_id, + readable_cube_ids=["cube_test_user_2"], + query="Workshop PMT boot pressure startup test", + include_preference=False, +) +search_rsp = search_memories(search_req).data +print("\n\nsearch_rsp: \n\n", json.dumps(search_rsp, indent=2, ensure_ascii=False)) +``` +添加多种混合信息源的messages到记忆中 +```python +import json +from memos.api.routers.server_router import add_memories, search_memories +from memos.api.product_models import APIADDRequest, APISearchRequest + +user_id = "test_user_2" +add_req = APIADDRequest( + user_id=user_id, + writable_cube_ids=["cube_test_user_2"], + messages = [ + { + "role": "system", + "content": [ + { + "type": "text", + "text": "You are a professional industrial fault analysis assistant. Please read the PDF, images, and instructions provided by the user and provide a professional technical summary.\n\n\n[\n {\n \"name\": \"file_reader\",\n \"description\": \"Used to read the content of files uploaded by the user and return the text data (in JSON string format).\",\n \"parameters\": [\n {\"name\": \"file_id\", \"type\": \"string\", \"required\": true, \"description\": \"The file ID to be read\"}\n ],\n \"returns\": {\"type\": \"text\", \"description\": \"Returns the extracted text content of the file\"}\n }\n]\n" + } + ] + }, + { + "role": "user", + "content": [ + { + "type": "text", + "text": "Please read this file and image, summarize the key points, and provide a final conclusion." + }, + { + "type": "file", + "file": { + "file_id": "file_123", + "filename": "report.pdf", + "file_data": "@http://139.196.232.20:9090/graph-test/algorithm/2025_11_13/1763043889_1763043782_PM1%E8%BD%A6%E9%97%B4PMT%E9%9D%B4%E5%8E%8B%E8%BE%B9%E5%8E%8B%E5%8E%8B%E5%8A%9B%E6%97%A0%E6%B3%95%E5%BB%BA%E7%AB%8B%E6%95%85%E9%9A%9C%E6%8A%A5%E5%91%8A20240720.md" + } + }, + { + "type": "image_url", + "image_url": { + "url": "https://play-groud-test-1.oss-cn-shanghai.aliyuncs.com/%E5%9B%BE%E7%89%871.jpeg" + } + } + ] + }, + { + "role": "assistant", + "tool_calls": [ + { + "id": "call_file_reader_001", + "type": "function", + "function": { + "name": "file_reader", + "arguments": "{\"file_id\": \"file_123\"}" + } + } + ] + }, + { + "role": "tool", + "tool_call_id": "call_file_reader_001", + "content": [ + { + "type": "text", + "text": "{\"file_id\":\"file_123\",\"extracted_text\":\"PM1 workshop PMT boot pressure startup test record… Final fault cause: poor contact at the negative terminal of the DC 24V power supply circuit due to a short-jumped terminal block.\"}" + } + ] + }, + { + "role": "assistant", + "content": [ + { + "type": "text", + "text": "Final Summary: During the PMT boot-pressure startup test of the PM1 workshop on July 20, 2024, the drive could not run because the edge pressures on both sides failed to reach the 2.5-bar interlock requirement. After troubleshooting, the PLC output signals, hydraulic pipelines, and valves were all found to be normal. The root cause was ultimately identified as poor contact at the negative terminal of the proportional valve’s DC 24V power supply inside the PLC cabinet, caused by a short-jumpered terminal block. After re-connecting the negative incoming lines in parallel, the equipment returned to normal operation. 
It is recommended to replace terminal blocks in batches, inspect instruments with uncertain service life, and optimize the troubleshooting process by tracing common-mode issues from shared buses and power supply sources." + } + ] + } +], + async_mode="sync", + mode="fine", +) + +add_rsp = add_memories(add_req) + +print("add_memories rsp: \n\n", add_rsp) + + + +search_req = APISearchRequest( + user_id=user_id, + readable_cube_ids=["cube_test_user_2"], + query="Workshop PMT boot pressure startup test", + include_preference=False, +) + +search_rsp = search_memories(search_req).data +print("\n\nsearch_rsp: \n\n", json.dumps(search_rsp, indent=2, ensure_ascii=False)) +``` + +## 示例 3:多Cube添加和检索 + +### 何时使用: + +- 向彼此隔离的不同的Cube空间中添加记忆 +- 你希望同时检索不同Cube空间中的记忆 + +### 关键点: + +- 在检索时输入含有多个cube id的readable_cube_ids列表 + +### 完整示例代码 +```python +import json +from memos.api.routers.server_router import add_memories, search_memories +from memos.api.product_models import APIADDRequest, APISearchRequest + +user_id = "test_user_3" +add_req = APIADDRequest( + user_id=user_id, + writable_cube_ids=["cube_test_user_3_1"] , + messages = [ + {"role": "user", "content": "I’ve planned to travel to Guangzhou during the summer vacation. What chain hotels are available for accommodation?"}, + {"role": "assistant", "content": "You can consider [7 Days Inn, Ji Hotel, Hilton], etc."}, + {"role": "user", "content": "I’ll choose 7 Days Inn."}, + {"role": "assistant", "content": "Okay, feel free to ask me if you have any other questions."} + ], + async_mode="sync", + mode="fine", +) + +add_rsp = add_memories(add_req) +print("add_memories rsp: \n\n", add_rsp) + +add_req = APIADDRequest( + user_id=user_id, + writable_cube_ids=["cube_test_user_3_2"] , + messages = [ + {"role": "user", "content": "I love you, I need you."}, + {"role": "assistant", "content": "Wow, I love you too"}, + ], + async_mode="sync", + mode="fine", +) + +add_rsp = add_memories(add_req) +print("add_memories rsp: \n\n", add_rsp) + +search_req = APISearchRequest( + user_id=user_id, + readable_cube_ids=["cube_test_user_3_1", "cube_test_user_3_2"], + query="Please recommend a hotel, Love u u", + include_preference=True, +) + +search_rsp = search_memories(search_req).data +print("\n\nsearch_rsp: \n\n", json.dumps(search_rsp, indent=2, ensure_ascii=False)) +``` + +## 示例 4:仅 KVCacheMemory + +### 何时使用: + +- 你想要短期工作记忆以加快多轮对话速度。 +- 适合聊天机器人会话加速或提示复用。 +- 最适合缓存隐藏状态 / KV 对。 + +### 关键点: + +- 使用 KVCacheMemory,不含显式明文记忆。 +- 演示提取 → 添加 → 合并 → 获取 → 删除。 +- 展示如何导出/加载 KV cache。 + +### 完整示例代码 + + +```python +import json +from transformers import DynamicCache + +from memos.memories.activation.item import KVCacheItem +from memos.configs.memory import MemoryConfigFactory +from memos.memories.factory import MemoryFactory + +def get_cache_info(cache): + if not cache: + return None + + num_layers = 0 + total_size_bytes = 0 + + if hasattr(cache, "layers"): + num_layers = len(cache.layers) + for layer in cache.layers: + if hasattr(layer, "key_cache") and layer.key_cache is not None: + total_size_bytes += layer.key_cache.nelement() * layer.key_cache.element_size() + if hasattr(layer, "value_cache") and layer.value_cache is not None: + total_size_bytes += layer.value_cache.nelement() * layer.value_cache.element_size() + + if hasattr(layer, "keys") and layer.keys is not None: + total_size_bytes += layer.keys.nelement() * layer.keys.element_size() + if hasattr(layer, "values") and layer.values is not None: + total_size_bytes += layer.values.nelement() * layer.values.element_size() + + elif 
hasattr(cache, "key_cache") and hasattr(cache, "value_cache"): + num_layers = len(cache.key_cache) + for k, v in zip(cache.key_cache, cache.value_cache, strict=False): + if k is not None: + total_size_bytes += k.nelement() * k.element_size() + if v is not None: + total_size_bytes += v.nelement() * v.element_size() + + return { + "num_layers": num_layers, + "size_bytes": total_size_bytes, + "size_mb": f"{total_size_bytes / (1024 * 1024):.2f} MB", + } + + +def serialize_item(obj): + if isinstance(obj, list): + return [serialize_item(x) for x in obj] + + if isinstance(obj, KVCacheItem): + return { + "id": obj.id, + "metadata": obj.metadata, + "records": obj.records.model_dump() + if hasattr(obj.records, "model_dump") + else obj.records, + "memory": get_cache_info(obj.memory), + } + + if isinstance(obj, DynamicCache): + return get_cache_info(obj) + + return str(obj) + + +# 为 KVCacheMemory(HuggingFace 后端)创建配置 +config = MemoryConfigFactory( + backend="kv_cache", + config={ + "extractor_llm": { + "backend": "huggingface", + "config": { + "model_name_or_path": "Qwen/Qwen3-0.6B", + "max_tokens": 32, + "add_generation_prompt": True, + "remove_think_prefix": True, + }, + }, + }, +) + +# 实例化 KVCacheMemory +kv_mem = MemoryFactory.from_config(config) + +# 提取一个 KVCacheItem(DynamicCache) +prompt = [ + {"role": "user", "content": "What is MemOS?"}, + {"role": "assistant", "content": "MemOS is a memory operating system for LLMs."}, +] +print("===== Extract KVCacheItem =====") +cache_item = kv_mem.extract(prompt) +print(json.dumps(serialize_item(cache_item), indent=2, default=str)) + +# 将缓存添加到内存中 +kv_mem.add([cache_item]) +print("All caches:") +print(json.dumps(serialize_item(kv_mem.get_all()), indent=2, default=str)) + +# 通过 ID 获取 +retrieved = kv_mem.get(cache_item.id) +print("Retrieved:") +print(json.dumps(serialize_item(retrieved), indent=2, default=str)) + +# 合并缓存 +item2 = kv_mem.extract([{"role": "user", "content": "Tell me a joke."}]) +kv_mem.add([item2]) +merged = kv_mem.get_cache([cache_item.id, item2.id]) +print("Merged cache:") +print(json.dumps(serialize_item(merged), indent=2, default=str)) + +# 删除其中一个 +kv_mem.delete([cache_item.id]) +print("After delete:") +print(json.dumps(serialize_item(kv_mem.get_all()), indent=2, default=str)) + +# 导出和加载缓存 +kv_mem.dump("tmp/kv_mem") +print("Dumped to tmp/kv_mem") +kv_mem.delete_all() +kv_mem.load("tmp/kv_mem") +print("Loaded caches:") +print(json.dumps(serialize_item(kv_mem.get_all()), indent=2, default=str)) +``` + +## 示例 5:记忆调度 + +### 何时使用: + +- 你希望自定义记忆调度逻辑或扩展后台任务,以异步触发的方式不断对记忆进行管理和优化。 +- 适用于 SaaS 智能体或多会轮对话的LLM应用任务。 +- 展示 MemScheduler的记忆管理任务设置与运行方式。 + +### 关键点: + +- 通过 `mem_scheduler.register_handlers` 注册自定义回调。 +- 使用 `add_handler` 和 `chat_stream_playground` 进行交互。 +- 演示了如何获取和使用从环境量初始化完成的MemScheduler实例。 + +### 完整示例代码 + +```python +import asyncio +import json +import os +import sys +import time + +from pathlib import Path + + +# 在依赖路径的导入之前设置路径 +FILE_PATH = Path(__file__).absolute() +BASE_DIR = FILE_PATH.parent.parent.parent +sys.path.insert(0, str(BASE_DIR)) # 启用从任何工作目录执行 + +# 在导入 server_router 之前设置环境变量,以确保组件正确初始化 +os.environ["ENABLE_CHAT_API"] = "true" + +from memos.api.product_models import APIADDRequest, ChatPlaygroundRequest # noqa: E402 + +# 从 server_router 导入以进行初始化 +from memos.api.routers.server_router import ( # noqa: E402 + add_handler, + chat_stream_playground, + mem_scheduler, +) +from memos.log import get_logger # noqa: E402 +from memos.mem_scheduler.schemas.message_schemas import ScheduleMessageItem # noqa: E402 +from 
memos.mem_scheduler.schemas.task_schemas import ( # noqa: E402 + MEM_UPDATE_TASK_LABEL, + QUERY_TASK_LABEL, +) + + +logger = get_logger(__name__) + + +def init_task(): + conversations = [ + {"role": "user", "content": "I just adopted a golden retriever puppy yesterday."}, + {"role": "assistant", "content": "Congratulations! What did you name your new puppy?"}, + { + "role": "user", + "content": "His name is Max. I live near Central Park in New York where we'll walk daily.", + }, + {"role": "assistant", "content": "Max will love those walks! Any favorite treats for him?"}, + { + "role": "user", + "content": "He loves peanut butter biscuits. Personally, I'm allergic to nuts though.", + }, + {"role": "assistant", "content": "Good to know about your allergy. I'll note that."}, + # 问题 1 (宠物) - 名字 + {"role": "user", "content": "What's my dog's name again?"}, + {"role": "assistant", "content": "Your dog is named Max."}, + # 问题 2 (宠物) - 品种 + {"role": "user", "content": "Can you remind me what breed Max is?"}, + {"role": "assistant", "content": "Max is a golden retriever."}, + # 问题 3 (宠物) - 零食 + {"role": "user", "content": "What treats does Max like?"}, + {"role": "assistant", "content": "He loves peanut butter biscuits."}, + # 问题 4 (地址) + {"role": "user", "content": "Where did I say I live?"}, + {"role": "assistant", "content": "You live near Central Park in New York."}, + # 问题 5 (过敏) + {"role": "user", "content": "What food should I avoid due to allergy?"}, + {"role": "assistant", "content": "You're allergic to nuts."}, + {"role": "user", "content": "Perfect, just wanted to check what you remembered."}, + {"role": "assistant", "content": "Happy to help! Let me know if you need anything else."}, + ] + + questions = [ + {"question": "What's my dog's name again?", "category": "Pet"}, + {"question": "Can you remind me what breed Max is?", "category": "Pet"}, + {"question": "What treats does Max like?", "category": "Pet"}, + {"question": "Where did I say I live?", "category": "Address"}, + {"question": "What food should I avoid due to allergy?", "category": "Allergy"}, + ] + return conversations, questions + + +working_memories = [] + + +# 定义自定义查询处理函数 +def custom_query_handler(messages: list[ScheduleMessageItem]): + for msg in messages: + # 打印用户输入内容 + print(f"\n[scheduler] User input query: {msg.content}") + # 手动构造带有 MEM_UPDATE 标签的新消息以触发记忆更新 + new_msg = msg.model_copy(update={"label": MEM_UPDATE_TASK_LABEL}) + # 提交消息给调度器处理 + mem_scheduler.submit_messages([new_msg]) + + +# 定义自定义记忆更新处理函数 +def custom_mem_update_handler(messages: list[ScheduleMessageItem]): + global working_memories + search_args = {} + top_k = 2 + for msg in messages: + # 在文本记忆中搜索与当前内容相关的记忆(返回 top_k=2) + results = mem_scheduler.retriever.search( + query=msg.content, + user_id=msg.user_id, + mem_cube_id=msg.mem_cube_id, + mem_cube=mem_scheduler.current_mem_cube, + top_k=top_k, + method=mem_scheduler.search_method, + search_args=search_args, + ) + working_memories.extend(results) + working_memories = working_memories[-5:] + for mem in results: + print(f"\n[scheduler] Retrieved memory: {mem.memory}") + + +async def run_with_scheduler(): + print("==== run_with_automatic_scheduler_init ====") + conversations, questions = init_task() + + # 使用 server_router 组件进行初始化 + # 配置通过 init_server() 中的环境变量加载 + + user_id = "user_1" + mem_cube_id = "mem_cube_5" + + print(f"Adding conversations for user {user_id}...") + + # 使用 add_handler 添加记忆 + add_req = APIADDRequest( + user_id=user_id, + writable_cube_ids=[mem_cube_id], + messages=conversations, + 
async_mode="sync", # 在此示例中使用同步模式以便立即添加 + ) + add_handler.handle_add_memories(add_req) + + for item in questions: + print("===== Chat Start =====") + query = item["question"] + print(f"Query:\n {query}\n") + + # 使用 chat_handler 进行聊天 + chat_req = ChatPlaygroundRequest( + user_id=user_id, + query=query, + readable_cube_ids=[mem_cube_id], + writable_cube_ids=[mem_cube_id], + ) + response = chat_stream_playground(chat_req) + + answer = "" + buffer = "" + async for chunk in response.body_iterator: + if isinstance(chunk, bytes): + chunk = chunk.decode("utf-8") + buffer += chunk + while "\n\n" in buffer: + msg, buffer = buffer.split("\n\n", 1) + for line in msg.split("\n"): + if line.startswith("data: "): + json_str = line[6:] + try: + data = json.loads(json_str) + if data.get("type") == "text": + answer += data["data"] + except json.JSONDecodeError: + pass + print(f"\nAnswer: {answer}") + + +if __name__ == "__main__": + mem_scheduler.register_handlers( + { + QUERY_TASK_LABEL: custom_query_handler, # 查询任务 + MEM_UPDATE_TASK_LABEL: custom_mem_update_handler, # 记忆更新任务 + } + ) + + asyncio.run(run_with_scheduler()) + + time.sleep(20) + mem_scheduler.stop() +``` + +::note +**请注意**
+使用 dump() 和 load() 来持久化你的记忆立方体。 + +务必确保你的向量数据库维度与你的嵌入器匹配。 + +如使用基于图的明文记忆功能,你需要安装 Neo4j Desktop。 +:: + +## 下一步 + +你才刚刚开始!接下来可以尝试: + +- 选择与你使用场景匹配的示例。 +- 组合模块以构建更智能、更持久的智能体! + +还需要更多帮助? +查看 API 文档或贡献你自己的示例吧! diff --git a/docs/cn/open_source/getting_started/installation.md b/docs/cn/open_source/getting_started/installation.md new file mode 100644 index 00000000..a95375e8 --- /dev/null +++ b/docs/cn/open_source/getting_started/installation.md @@ -0,0 +1,645 @@ +--- +title: "安装指南" +desc: "MemOS 完整安装指南。" +--- + + +::card-group + + :::card + --- + icon: ri:database-2-line + title: 通过Docker安装 + to: /cn/open_source/getting_started/installation#通过docker安装 + --- + 适合快速部署:一键启动服务与依赖组件。 + ::: + + :::card + --- + icon: ri:play-line + title: 从源码安装 + to: /cn/open_source/getting_started/installation#从源码安装 + --- + 适合二次开发与贡献:可编辑安装、可跑测试、可本地调试。 + ::: + + :::card + --- + icon: ri:tree-line + title: 通过pip安装 + to: /cn/open_source/getting_started/installation#通过pip安装 + --- + 最简单的安装方式:快速开始使用 MemOS。 + ::: + + +:: + + + +## 通过Docker安装 +```bash +git clone https://github.com/MemTensor/MemOS.git +cd MemOS +``` + +#### 创建 .env 配置文件 +::note +**请注意**
+.env 文件配置需要放在MemOS 项目根目录下 +:: + +::steps{level="4"} + +#### 1. 新建 .env +```bash +cd MemOS +touch .env +``` + +#### 2. .env 内容 + +.env 快速配置如下 +```bash + +# OpenAI API 密钥 (需自定义配置) +OPENAI_API_KEY=sk-xxx +# OpenAI API 基础 URL +OPENAI_API_BASE=http://xxx:3000/v1 +# 默认模型名称 +MOS_CHAT_MODEL=qwen3-max + +# Memory Reader LLM 模型 +MEMRADER_MODEL=qwen3-max +# Memory Reader API 密钥 +MEMRADER_API_KEY=sk-xxx +# Memory Reader API 基础 URL +MEMRADER_API_BASE=http://xxx:3000/v1 + +# Embedder 模型名称 +MOS_EMBEDDER_MODEL=text-embedding-v4 +# 配置embedding backend 两种选择 ollama | universal_api +MOS_EMBEDDER_BACKEND=universal_api +# Embedder API 基础 URL +MOS_EMBEDDER_API_BASE=http://xxx:8081/v1 +# Embedder API 密钥 +MOS_EMBEDDER_API_KEY=xxx +# Embedding 向量维度 +EMBEDDING_DIMENSION=1024 +# Reranker 后端 (http_bge | etc.) +MOS_RERANKER_BACKEND=cosine_local + +# Neo4j 连接 URI +# 可选值: neo4j-community | neo4j | nebular | polardb +NEO4J_BACKEND=neo4j-community +# 当 backend=neo4j* 时必须 +NEO4J_URI=bolt://localhost:7687 +NEO4J_USER=neo4j +NEO4J_PASSWORD=12345678 +NEO4J_DB_NAME=neo4j +MOS_NEO4J_SHARED_DB=false + +# 是否使用 redis 的调度器 +DEFAULT_USE_REDIS_QUEUE=false + +# 启用聊天 API +ENABLE_CHAT_API=true +# 聊天模型列表 可以通过百炼申请. 模型可自选 +CHAT_MODEL_LIST=[{"backend": "qwen", "api_base": "https://xxx/v1", "api_key": "sk-xxx", "model_name_or_path": "qwen3-max", "extra_body": {"enable_thinking": true} ,"support_models": ["qwen3-max"]}] +``` +#### .env 以百炼为示例配置如下 +```bash +# 可通过百炼平台申请 +# https://bailian.console.aliyun.com/?spm=a2c4g.11186623.0.0.2f2165b08fRk4l&tab=api#/api +# 申请成功后,获取API_KEY和BASE_URL,示例配置如下 + +# OpenAI API 密钥 (用百炼的API_KEY) +OPENAI_API_KEY=you_bailian_api_key +# OpenAI API 基础 URL +OPENAI_API_BASE=https://dashscope.aliyuncs.com/compatible-mode/v1 +# 默认模型名称 +MOS_CHAT_MODEL=qwen3-max + +# Memory Reader LLM 模型 +MEMRADER_MODEL=qwen3-max +# Memory Reader API 密钥 (用百炼的API_KEY) +MEMRADER_API_KEY=you_bailian_api_key +# Memory Reader API 基础 URL +MEMRADER_API_BASE=https://dashscope.aliyuncs.com/compatible-mode/v1 + +# Embedder模型名称可以参考下面链接 +# https://bailian.console.aliyun.com/?spm=a2c4g.11186623.0.0.2f2165b08fRk4l&tab=api#/api/?type=model&url=2846066 +MOS_EMBEDDER_MODEL=text-embedding-v4 +# 配置embedding backend 两种选择 ollama | universal_api +MOS_EMBEDDER_BACKEND=universal_api +# Embedder API 基础 URL +MOS_EMBEDDER_API_BASE=https://dashscope.aliyuncs.com/compatible-mode/v1 +# Embedder API 密钥 (用百炼的API_KEY) +MOS_EMBEDDER_API_KEY=you_bailian_api_key +# Embedding 向量维度 +EMBEDDING_DIMENSION=1024 +# Reranker 后端 (http_bge | etc.) +MOS_RERANKER_BACKEND=cosine_local + +# Neo4j 连接 URI +# 可选值: neo4j-community | neo4j | nebular | polardb +NEO4J_BACKEND=neo4j-community +# 当 backend=neo4j* 时必须 +NEO4J_URI=bolt://localhost:7687 +NEO4J_USER=neo4j +NEO4J_PASSWORD=12345678 +NEO4J_DB_NAME=neo4j +MOS_NEO4J_SHARED_DB=false + +# 是否使用 redis 的调度器 +DEFAULT_USE_REDIS_QUEUE=false + +# 启用聊天 API +ENABLE_CHAT_API=true + +CHAT_MODEL_LIST=[{"backend": "qwen", "api_base": "https://dashscope.aliyuncs.com/compatible-mode/v1", "api_key": "you_bailian_api_key", "model_name_or_path": "qwen3-max-preview", "extra_body": {"enable_thinking": true} ,"support_models": ["qwen3-max-preview"]}] +``` +![MemOS bailian](https://cdn.memtensor.com.cn/img/get_key_url_by_bailian_compressed.png) +
百炼申请 API_KEY 和 BASE_URL 示例
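+
+配置完成后,建议先做一次连通性自检,确认 API_KEY 与 BASE_URL 可用。以下为一个最小自检示例(仅供参考,假设使用百炼的 OpenAI 兼容对话接口与 qwen3-max 模型,密钥请替换为你自己的值):
+
+```bash
+# 向百炼 OpenAI 兼容接口发送一次最小对话请求
+curl https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -H "Authorization: Bearer you_bailian_api_key" \
+  -d '{"model": "qwen3-max", "messages": [{"role": "user", "content": "ping"}]}'
+# 若返回包含 choices 字段的 JSON,说明密钥与地址配置正确
+```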
+ +:: + + +#### 配置Dockerfile文件 +::note +**请注意**
+Dockerfile 文件在 docker 目录下 +:: + +```bash +#进入docker目录下 +cd docker +``` +包含快速模式和完整模式,可区分使用精简包(区分arm和x86)和全量包(区分arm和x86) + +```bash + +● 精简包:简化体量过大的 nvidia相关等依赖,对镜像实现轻量化,使本地部署更加轻量快速。 +url: registry.cn-shanghai.aliyuncs.com/memtensor/memos-base:v1.0 +url: registry.cn-shanghai.aliyuncs.com/memtensor/memos-base-arm:v1.0 + +● 全量包:将 MemOS 全部依赖包打为镜像,可体验完整功能,通过配置 Dockerfile可直接构建启动。 +url: registry.cn-shanghai.aliyuncs.com/memtensor/memos-full-base:v1.0.0 +url: registry.cn-shanghai.aliyuncs.com/memtensor/memos-full-base-arm:v1.0.0 +``` + +```bash +# 当前示例使用精简包 url +FROM registry.cn-shanghai.aliyuncs.com/memtensor/memos-base-arm:v1.0 + +WORKDIR /app + +ENV HF_ENDPOINT=https://hf-mirror.com + +ENV PYTHONPATH=/app/src + +COPY src/ ./src/ + +EXPOSE 8000 + +CMD ["uvicorn", "memos.api.server_api:app", "--host", "0.0.0.0", "--port", "8000", "--reload"] + +``` + +#### 启动docker客户端 +```bash + # 如果没有安装docker,请安装对应版本,下载地址如下: + https://www.docker.com/ + + # 安装完成之后,可通过客户端启动docker,或者通过命令行启动docker + # 通过命令行启动docker + sudo systemctl start docker + +# 安装完成后,查看docker状态 +docker ps + +# 查看docker镜像 (可不用) +docker images + +``` + +#### 构建并启动服务 : +::note +**请注意**
+构建命令同样在 docker 目录下 +:: +```bash +# 在docker目录下 +docker compose up +``` +![MemOS buildComposeupSuccess](https://cdn.memtensor.com.cn/img/memos_build_composeup_success_compressed.png) +
示例图片,实际端口以 docker 中自定义的配置为准
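+
+继续之前,可先确认容器与接口状态(以下命令仅供参考,具体服务名以你的 docker-compose.yml 为准):
+
+```bash
+# 查看容器是否处于运行状态
+docker compose ps
+
+# 跟踪服务日志,排查启动报错
+docker compose logs -f
+
+# 确认 API 已就绪(FastAPI 默认暴露 OpenAPI 描述)
+curl -s http://localhost:8000/openapi.json | head -c 200
+```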
+ +#### 通过 [http://localhost:8000/docs](http://localhost:8000/docs) 访问 API。 + +![MemOS Architecture](https://cdn.memtensor.com.cn/img/memos_run_server_success_compressed.png) + +#### ADD Memory +```bash +curl --location --request POST 'http://127.0.0.1:8000/product/add' \ +--header 'Content-Type: application/json' \ +--data-raw '{ + + "messages": [{ + "role": "user", + "content": "我喜欢吃草莓" + }], + "user_id": "8736b16e-1d20-4163-980b-a5063c3facdc", + "writable_cube_ids":["b32d0977-435d-4828-a86f-4f47f8b55bca"] +}' + +# 响应 +{ + "code": 200, + "message": "Memory created successfully", + "data": null +} +``` + +#### Search Memory +```bash +curl --location --request POST 'http://127.0.0.1:8000/product/search' \ +--header 'Content-Type: application/json' \ +--data-raw '{ + "query": "我喜欢吃什么", + "user_id": "8736b16e-1d20-4163-980b-a5063c3facdc", + "readable_cube_ids": ["b32d0977-435d-4828-a86f-4f47f8b55bca"], + "top_k":20 + }' +# 响应 +{ + "code": 200, + "message": "Search completed successfully", + "data": { + "text_mem": [ + { + "cube_id": "7231eda8-6c57-4f6e-97ce-98b699eebb98", + "memories": [ + { + "id": "2f40be8f-736c-4a5f-aada-9489037769e0", + "memory": "[user观点]用户喜欢草莓。", + "metadata": { + "user_id": "de8215e3-3beb-4afc-9b64-ae594d62f1ea", + "session_id": "root_session", + "status": "activated", + "type": "fact", + "key": "用户对草莓的喜好", + "confidence": 0.99, + "source": null, + "tags": [ + "喜好", + "草莓" + ], + "visibility": null, + "updated_at": "2025-09-18T08:23:44.625479000+00:00", + "memory_type": "UserMemory", + "sources": [], + "embedding": [], + "created_at": "2025-09-18T08:23:44.625511000+00:00", + "usage": [ + "{ + "time": "2025-09-18T08:24:17.759748", + "info": { + "user_id": "de8215e3-3beb-4afc-9b64-ae594d62f1ea", + "session_id": "root_session" + } + }" + ], + "background": "用户表达了对草莓的喜好,显示出他们在饮食偏好上的倾向。", + "relativity": 0.6349761312470591, + "vector_sync": "success", + "ref_id": "[2f40be8f]", + "id": "2f40be8f-736c-4a5f-aada-9489037769e0", + "memory": "[user观点]用户喜欢草莓。" + }, + "ref_id": "[2f40be8f]" + }, + ... + } + } + ], + "act_mem": [], + "para_mem": [] + } +} +``` + + +## 从源码安装 +```bash +git clone https://github.com/MemTensor/MemOS.git +cd MemOS +``` + +#### 创建 .env 配置文件 +MemOS 的 server_api 依赖环境变量启动,因此需要在启动目录下创建 .env 文件。 +1. 新建 .env +```bash +cd MemOS +touch .env +``` + +2. .env 内容,快速配置请见 docker 安装下的[env 配置](/open_source/getting_started/installation#2.-.env-内容) +.env详细配置请见[env配置](/open_source/getting_started/rest_api_server/#本地运行) + +::note +**请注意**
.env 文件需要放在 MemOS 项目根目录下
+::
+
+
+#### 安装依赖
+```bash
+# 执行安装命令
+pip install -e .
+pip install --no-cache-dir -r ./docker/requirements.txt -i https://mirrors.aliyun.com/pypi/simple/
+# 将 PYTHONPATH 配置为当前项目绝对路径下的 src 目录
+export PYTHONPATH=/******/MemOS/src
+```
+
+#### 安装图数据库
+MemOS 的记忆底层通过图数据库进行存储。在开源项目中,推荐使用 Neo4j 运行您的第一个项目;社区同时支持 Neo4j 企业版/社区版与 PolarDB。
+
+::note
+**PC开发者的最快选择:Neo4j Desktop**<br>
如果您计划使用 Neo4j 作为图记忆,Neo4j Desktop可能是最方便的安装方式。
+另外,您需要在 .env 文件中设置 **NEO4J_BACKEND=neo4j** +:: + + +#### 启动 MemOS Server。 +```bash +# 项目根目录下 +uvicorn memos.api.server_api:app --host 0.0.0.0 --port 8000 --workers 1 +``` + +#### ADD Memory +```bash +curl --location --request POST 'http://127.0.0.1:8000/product/add' \ +--header 'Content-Type: application/json' \ +--data-raw '{ + + "messages": [{ + "role": "user", + "content": "我喜欢吃草莓" + }], + "user_id": "8736b16e-1d20-4163-980b-a5063c3facdc", + "writable_cube_ids":["b32d0977-435d-4828-a86f-4f47f8b55bca"] +}' + +# 响应 +{ + "code": 200, + "message": "Memory created successfully", + "data": null +} +``` + +#### Search Memory +```bash +curl --location --request POST 'http://127.0.0.1:8000/product/search' \ +--header 'Content-Type: application/json' \ +--data-raw '{ + "query": "我喜欢吃什么", + "user_id": "8736b16e-1d20-4163-980b-a5063c3facdc", + "readable_cube_ids": ["b32d0977-435d-4828-a86f-4f47f8b55bca"], + "top_k":20 + }' +# 响应 +{ + "code": 200, + "message": "Search completed successfully", + "data": { + "text_mem": [ + { + "cube_id": "7231eda8-6c57-4f6e-97ce-98b699eebb98", + "memories": [ + { + "id": "2f40be8f-736c-4a5f-aada-9489037769e0", + "memory": "[user观点]用户喜欢草莓。", + "metadata": { + "user_id": "de8215e3-3beb-4afc-9b64-ae594d62f1ea", + "session_id": "root_session", + "status": "activated", + "type": "fact", + "key": "用户对草莓的喜好", + "confidence": 0.99, + "source": null, + "tags": [ + "喜好", + "草莓" + ], + "visibility": null, + "updated_at": "2025-09-18T08:23:44.625479000+00:00", + "memory_type": "UserMemory", + "sources": [], + "embedding": [], + "created_at": "2025-09-18T08:23:44.625511000+00:00", + "usage": [ + "{ + "time": "2025-09-18T08:24:17.759748", + "info": { + "user_id": "de8215e3-3beb-4afc-9b64-ae594d62f1ea", + "session_id": "root_session" + } + }" + ], + "background": "用户表达了对草莓的喜好,显示出他们在饮食偏好上的倾向。", + "relativity": 0.6349761312470591, + "vector_sync": "success", + "ref_id": "[2f40be8f]", + "id": "2f40be8f-736c-4a5f-aada-9489037769e0", + "memory": "[user观点]用户喜欢草莓。" + }, + "ref_id": "[2f40be8f]" + }, + ... + } + } + ], + "act_mem": [], + "para_mem": [] + } +} +``` + + +## 通过pip安装 +安装 MemOS 最简单的方法是使用 pip。 + +::steps{level="4"} + +#### 创建并激活 Conda 环境(推荐) + +为避免依赖冲突,强烈建议使用独立的 Conda 环境。 + +```bash +conda create -n memos python=3.11 +conda activate memos +``` + +#### 从 PyPI 安装 MemOS +安装 MemOS 及其全部可选组件: + +```bash +pip install -U "MemoryOS[all]" +``` + +#### 安装图数据库 +Memos的记忆底层是通过图数据库进行存储的,在开源项目中,推荐使用Neo4j运行您的第一个项目。社区同时支持Neo4j企业版/社区版与PolarDB。 + +::note +**PC开发者的最快选择:Neo4j Desktop**
如果您计划使用 Neo4j 作为图记忆,Neo4j Desktop可能是最方便的安装方式。 +:: + + +#### 创建 .env 配置文件 +MemOS 的 server_api 依赖环境变量启动,因此需要在启动目录下创建 .env 文件。 +1. 新建 .env +```bash +touch .env +``` + +2. 示例 .env 内容 +.env详细配置请见[env配置](/open_source/getting_started/rest_api_server) + +有关详细的开发环境设置、工作流程指南和贡献最佳实践,请参阅我们的 [贡献指南](/open_source/contribution/overview)。 + +#### 启动 MemOS Server +MemOS 不会自动加载 .env 文件,请使用 python-dotenv 方式启动。 +```bash +python -m dotenv run -- \ + uvicorn memos.api.server_api:app \ + --host 0.0.0.0 \ + --port 8000 +``` +启动成功后,你将看到类似输出: +```text +INFO: Uvicorn running on http://0.0.0.0:8000 +INFO: Application startup complete. +``` + +#### 开始您的记忆操作吧 +添加记忆(调用方式和从源码部署是一致哒,这次我们试试**同步**方式来添加记忆): +```text +curl --location --request POST 'http://127.0.0.1:8000/product/add' \ +--header 'Content-Type: application/json' \ +--data-raw '{ + "messages": [{ + "role": "user", + "content": "我喜欢吃草莓" + }], + "user_id": "8736b16e-1d20-4163-980b-a5063c3facdc", + "writable_cube_ids":["b32d0977-435d-4828-a86f-4f47f8b55bca"], + "async_mode": "sync", + "mode": "fine" +}' +``` + +::note +**期望的输出**
+```json +{ + "code": 200, + "message": "Memory added successfully", + "data": [ + { + "memory": "用户喜欢吃草莓。", + "memory_id": "d01a354e-e5f6-4e2a-bd89-c57ae", + "memory_type": "UserMemory", + "cube_id": "b32d0977-435d-4828-a86f-4f47f8b55bca" + } + ] +} +``` +:: + +检索记忆(调用方式和从源码部署是一致哒): +```text +curl --location --request POST 'http://127.0.0.1:8000/product/search' \ +--header 'Content-Type: application/json' \ +--data-raw '{ + "query": "我喜欢吃什么", + "user_id": "8736b16e-1d20-4163-980b-a5063c3facdc", + "readable_cube_ids": ["b32d0977-435d-4828-a86f-4f47f8b55bca"], + "top_k":20 + }' +``` + +::note +**期望的输出**
+```json
+{
+  "code": 200,
+  "message": "Search completed successfully",
+  "data": {
+    "text_mem": [
+      {
+        "cube_id": "b32d0977-435d-4828-a86f-4f47f8b55bca",
+        "memories": [
+          {
+            "id": "f18cbe36-4cd9-456f-9b9f-6be89c35b2bf",
+            "memory": "用户喜欢吃草莓。",
+            "metadata": {
+              "user_id": "8736b16e-1d20-4163-980b-a5dc",
+              "session_id": "default_session",
+              "status": "activated",
+              "type": "fact",
+              "key": "草莓喜好",
+              "confidence": 0.99,
+              "source": null,
+              "tags": ["饮食喜好", "草莓"],
+              "visibility": null,
+              "updated_at": "2025-12-26T20:35:08.178564000+00:00",
+              "info": null,
+              "covered_history": null,
+              "memory_type": "WorkingMemory",
+              "sources": [],
+              "embedding": [],
+              "created_at": "2025-12-26T20:35:08.177484000+00:00",
+              "usage": [],
+              "background": "用户表达了对草莓的喜好,表明他们喜欢这种水果,可能在饮食选择中倾向于包含草莓。",
+              "file_ids": [],
+              "relativity": 0.0,
+              "ref_id": "[f18cbe36]"
+            },
+            "ref_id": "[f18cbe36]"
+          }
+        ]
+      }
+    ],
+    "act_mem": [],
+    "para_mem": [],
+    "pref_mem": [
+      {
+        "cube_id": "b32d0977-435d-4828-a86f-4f47f8b55bca",
+        "memories": []
+      }
+    ],
+    "pref_note": "",
+    "tool_mem": [
+      {
+        "cube_id": "b32d0977-435d-4828-a86f-4f47f8b55bca",
+        "memories": []
+      }
+    ],
+    "pref_string": ""
+  }
+}
+```
+::
+
+::
+
+::note
+**下载示例代码**<br>
恭喜您🎉已完成从 pip 安装 MemOS,并跑通最小验证用例!您还可以通过以下命令下载示例代码,了解 MemOS 各内部模块的调用方式:
+```bash
+memos download_examples
+```
+::
diff --git a/docs/cn/open_source/getting_started/rest_api_server.md b/docs/cn/open_source/getting_started/rest_api_server.md
new file mode 100644
index 00000000..28b8dc12
--- /dev/null
+++ b/docs/cn/open_source/getting_started/rest_api_server.md
@@ -0,0 +1,501 @@
+---
+title: REST API 服务
+desc: MemOS 提供了一个使用 FastAPI 编写的 REST API 服务。用户可以通过 REST 接口执行所有操作。
+---
+
+![MemOS Architecture](https://cdn.memtensor.com.cn/img/memos_run_server_success_compressed.png)
+<br>
MemOS REST API 服务支持的 API
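+
+服务启动后,可以先确认接口是否就绪(假设使用默认的 8000 端口):
+
+```bash
+# 返回 HTTP 200 即表示服务已就绪;也可直接在浏览器打开该地址查看交互式文档
+curl -I http://localhost:8000/docs
+```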
+ +### 功能特点 + +- 添加新记忆:为指定用户创建一条新的记忆。 +- 搜索记忆:为指定用户搜索其记忆内容。 +- 获取用户所有记忆:获取某个用户的所有记忆内容。 +- 记忆反馈:为指定用户反馈记忆内容。 +- 与 MemOS 对话:与 MemOS 进行对话,返回 SSE 流式响应。 + + +## 本地运行 + +### 1、本地下载 +```bash +# 将代码下载到本地文件夹下 +git clone https://github.com/MemTensor/MemOS +``` + +### 2、配置环境变量 +```bash +# 进入文件夹目录下 +cd MemOS +``` + +#### 在根目录中创建一个 `.env` 文件并设置你的环境变量。 +##### .env 快速模式配置如下,完整模式参考 .env.example。 + +```bash + +# OpenAI API 密钥 (需自定义配置) +OPENAI_API_KEY=sk-xxx +# OpenAI API 基础 URL +OPENAI_API_BASE=http://xxx:3000/v1 +# 默认模型名称 +MOS_CHAT_MODEL=qwen3-max + +# Memory Reader LLM 模型 +MEMRADER_MODEL=qwen3-max +# Memory Reader API 密钥 +MEMRADER_API_KEY=sk-xxx +# Memory Reader API 基础 URL +MEMRADER_API_BASE=http://xxx:3000/v1 + +# Embedder 模型名称 +MOS_EMBEDDER_MODEL=text-embedding-v4 +# 配置embedding backend 两种选择 ollama | universal_api +MOS_EMBEDDER_BACKEND=universal_api +# Embedder API 基础 URL +MOS_EMBEDDER_API_BASE=http://xxx:8081/v1 +# Embedder API 密钥 +MOS_EMBEDDER_API_KEY=xxx +# Embedding 向量维度 +EMBEDDING_DIMENSION=1024 +# Reranker 后端 (http_bge | etc.) +MOS_RERANKER_BACKEND=cosine_local + +# Neo4j 连接 URI +# 可选值: neo4j-community | neo4j | nebular | polardb +NEO4J_BACKEND=neo4j-community +# 当 backend=neo4j* 时必须 +NEO4J_URI=bolt://localhost:7687 +NEO4J_USER=neo4j +NEO4J_PASSWORD=12345678 +NEO4J_DB_NAME=neo4j +MOS_NEO4J_SHARED_DB=false + +# 是否使用 redis 的调度器 +DEFAULT_USE_REDIS_QUEUE=false + +# 启用聊天 API +ENABLE_CHAT_API=true +# 聊天模型列表 可以通过百炼申请. 模型可自选 +CHAT_MODEL_LIST=[{"backend": "qwen", "api_base": "https://xxx/v1", "api_key": "sk-xxx", "model_name_or_path": "qwen3-max", "extra_body": {"enable_thinking": true} ,"support_models": ["qwen3-max"]}] +``` + +### 3、以百炼为例自定义配置 + +```bash +# 可通过百炼平台申请 +# https://bailian.console.aliyun.com/?spm=a2c4g.11186623.0.0.2f2165b08fRk4l&tab=api#/api +# 申请成功后,获取API_KEY和BASE_URL,示例配置如下 + +# OpenAI API 密钥 (用百炼的API_KEY) +OPENAI_API_KEY=you_bailian_api_key +# OpenAI API 基础 URL +OPENAI_API_BASE=https://dashscope.aliyuncs.com/compatible-mode/v1 +# 默认模型名称 +MOS_CHAT_MODEL=qwen3-max + +# Memory Reader LLM 模型 +MEMRADER_MODEL=qwen3-max +# Memory Reader API 密钥 (用百炼的API_KEY) +MEMRADER_API_KEY=you_bailian_api_key +# Memory Reader API 基础 URL +MEMRADER_API_BASE=https://dashscope.aliyuncs.com/compatible-mode/v1 + +# Embedder模型名称可以参考下面链接 +# https://bailian.console.aliyun.com/?spm=a2c4g.11186623.0.0.2f2165b08fRk4l&tab=api#/api/?type=model&url=2846066 +MOS_EMBEDDER_MODEL=text-embedding-v4 +# 配置embedding backend 两种选择 ollama | universal_api +MOS_EMBEDDER_BACKEND=universal_api +# Embedder API 基础 URL +MOS_EMBEDDER_API_BASE=https://dashscope.aliyuncs.com/compatible-mode/v1 +# Embedder API 密钥 (用百炼的API_KEY) +MOS_EMBEDDER_API_KEY=you_bailian_api_key +# Embedding 向量维度 +EMBEDDING_DIMENSION=1024 +# Reranker 后端 (http_bge | etc.) +MOS_RERANKER_BACKEND=cosine_local + +# Neo4j 连接 URI +# 可选值: neo4j-community | neo4j | nebular | polardb +NEO4J_BACKEND=neo4j-community +# 当 backend=neo4j* 时必须 +NEO4J_URI=bolt://localhost:7687 +NEO4J_USER=neo4j +NEO4J_PASSWORD=12345678 +NEO4J_DB_NAME=neo4j +MOS_NEO4J_SHARED_DB=false + +# 是否使用 redis 的调度器 +DEFAULT_USE_REDIS_QUEUE=false + +# 启用聊天 API +ENABLE_CHAT_API=true + +CHAT_MODEL_LIST=[{"backend": "qwen", "api_base": "https://dashscope.aliyuncs.com/compatible-mode/v1", "api_key": "you_bailian_api_key", "model_name_or_path": "qwen3-max-preview", "extra_body": {"enable_thinking": true} ,"support_models": ["qwen3-max-preview"]}] +``` +![MemOS bailian](https://cdn.memtensor.com.cn/img/get_key_url_by_bailian_compressed.png) +
百炼申请 API_KEY 和 BASE_URL 示例
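+
+完成配置后,建议校验 Embedder 的实际输出维度与 `EMBEDDING_DIMENSION` 是否一致,否则向量检索可能报错。以下示例仅供参考(假设使用百炼 OpenAI 兼容的 /embeddings 接口,且本机已安装 jq;密钥请替换为你自己的值):
+
+```bash
+# 统计返回向量的维度,输出应与 .env 中的 EMBEDDING_DIMENSION(如 1024)一致
+curl -s https://dashscope.aliyuncs.com/compatible-mode/v1/embeddings \
+  -H "Content-Type: application/json" \
+  -H "Authorization: Bearer you_bailian_api_key" \
+  -d '{"model": "text-embedding-v4", "input": "维度自检"}' \
+  | jq '.data[0].embedding | length'
+```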
+ +配置docker/requirement.txt中依赖包的版本等(可忽略)。完整版可参考 requirements.txt。 + +### 4、启动docker +```bash + # 如果没有安装docker,请安装对应版本,下载地址如下: + https://www.docker.com/ + +# 安装完成之后,可通过客户端启动docker,或者通过命令行启动docker +# 通过命令行启动docker +sudo systemctl start docker + +# 安装完成后,查看docker状态 +docker ps + +# 查看docker镜像 (可不用) +docker images + +``` + + +### 方式一:Docker 使用仓库依赖包镜像启动(推荐使用) +::steps{level="4"} + +```bash +#进入docker目录下 +cd docker +``` + +#### 镜像包使用确认 +包含快速模式和完整模式,可区分使用精简包(区分arm和x86)和全量包(区分arm和x86) + +```bash + +● 精简包:简化体量过大的 nvidia相关等依赖,对镜像实现轻量化,使本地部署更加轻量快速。 +url: registry.cn-shanghai.aliyuncs.com/memtensor/memos-base:v1.0 +url: registry.cn-shanghai.aliyuncs.com/memtensor/memos-base-arm:v1.0 + +● 全量包:将 MemOS 全部依赖包打为镜像,可体验完整功能,通过配置 Dockerfile可直接构建启动。 +url: registry.cn-shanghai.aliyuncs.com/memtensor/memos-full-base:v1.0.0 +url: registry.cn-shanghai.aliyuncs.com/memtensor/memos-full-base-arm:v1.0.0 +``` +#### 配置Dockerfile文件 + +```bash +# 当前示例使用精简包 url +FROM registry.cn-shanghai.aliyuncs.com/memtensor/memos-base-arm:v1.0 + +WORKDIR /app + +ENV HF_ENDPOINT=https://hf-mirror.com + +ENV PYTHONPATH=/app/src + +COPY src/ ./src/ + +EXPOSE 8000 + +CMD ["uvicorn", "memos.api.server_api:app", "--host", "0.0.0.0", "--port", "8000", "--reload"] + +``` + +#### 构建并启动服务 : +```bash +# 在docker目录下 +docker compose up +``` +![MemOS buildComposeupSuccess](https://cdn.memtensor.com.cn/img/memos_build_composeup_success_compressed.png) +
示例图片,实际端口以 docker 中自定义的配置为准
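+
+如果随后无法访问接口,可先查看容器状态与日志定位问题(命令仅供参考):
+
+```bash
+# 在 docker 目录下执行
+docker compose ps
+docker compose logs --tail=100
+```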
+ +#### 通过 [http://localhost:8000/docs](http://localhost:8000/docs) 访问 API。 + +![MemOS Architecture](https://cdn.memtensor.com.cn/img/memos_run_server_success_compressed.png) + + +#### 测试用例 (添加用户记忆->查询用户记忆) 参考Docker Compose up测试用例 + +:: + + + +### 方式二:客户端install Docker Compose up +::steps{level="4"} +开发环境的 Docker Compose up 已预配置了 qdrant、neo4j。 +运行服务器需要环境变量 `OPENAI_API_KEY`。 + + +#### 进入docker文件夹 +```bash +# 当前文件夹下进入docker文件夹 +cd docker +``` + +#### 安装对应依赖模块 +```bash + +pip install --upgrade pip && pip install --no-cache-dir -r requirements.txt +# 使用阿里云源安装依赖 +pip install --upgrade pip && pip install --no-cache-dir -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple/ + +# command not found: pip 使用pip3 + + + +``` + + +#### 在docker目录下使用 Docker Compose Up启动容器(保证vpn正常连接): + +```bash + +# 首次运行需要build +docker compose up --build +# 再次运行则不需要 +docker compose up + +``` + +#### 通过 [http://localhost:8000/docs](http://localhost:8000/docs) 访问 API。 + +#### 示例流程 + +##### (查询用户记忆(没有继续往后)->添加用户记忆->查询用户记忆) + +##### 添加用户记忆 http://localhost:8000/product/add (POST) +```bash +# 请求参数 +{ + "user_id": "8736b16e-1d20-4163-980b-a5063c3facdc", + "mem_cube_id": "b32d0977-435d-4828-a86f-4f47f8b55bca", + "async_mode": "async", + "messages": [ + { + "role": "user", + "content": "我喜欢草莓" + } + ] +} +# 响应 +{ + "code": 200, + "message": "Memory created successfully", + "data": null +} +``` + +##### 查询用户记忆 http://localhost:8000/product/search (POST) +```bash +# 请求参数 +{ + "query": "我喜欢什么", + "user_id": "8736b16e-1d20-4163-980b-a5063c3facdc", + "mem_cube_id": "b32d0977-435d-4828-a86f-4f47f8b55bca" +} +# 响应 +{ + "code": 200, + "message": "Search completed successfully", + "data": { + "text_mem": [ + { + "cube_id": "7231eda8-6c57-4f6e-97ce-98b699eebb98", + "memories": [ + { + "id": "2f40be8f-736c-4a5f-aada-9489037769e0", + "memory": "[user观点]用户喜欢草莓。", + "metadata": { + "user_id": "de8215e3-3beb-4afc-9b64-ae594d62f1ea", + "session_id": "root_session", + "status": "activated", + "type": "fact", + "key": "用户对草莓的喜好", + "confidence": 0.99, + "source": null, + "tags": [ + "喜好", + "草莓" + ], + "visibility": null, + "updated_at": "2025-09-18T08:23:44.625479000+00:00", + "memory_type": "UserMemory", + "sources": [], + "embedding": [], + "created_at": "2025-09-18T08:23:44.625511000+00:00", + "usage": [ + "{ + "time": "2025-09-18T08:24:17.759748", + "info": { + "user_id": "de8215e3-3beb-4afc-9b64-ae594d62f1ea", + "session_id": "root_session" + } + }" + ], + "background": "用户表达了对草莓的喜好,显示出他们在饮食偏好上的倾向。", + "relativity": 0.6349761312470591, + "vector_sync": "success", + "ref_id": "[2f40be8f]", + "id": "2f40be8f-736c-4a5f-aada-9489037769e0", + "memory": "[user观点]用户喜欢草莓。" + }, + "ref_id": "[2f40be8f]" + }, + ... 
+ } + } + ], + "act_mem": [], + "para_mem": [] + } +} + + + +# 响应失败,原因排查 +# src/memos/api/config.py +# 检查get_neo4j_community_config方法中配置的"neo4j_vec_db"和"EMBEDDING_DIMENSION" +``` + + +#### 对服务器代码或库代码进行修改将自动重新加载服务器。 + + +:: + +### 方式三:客户端install 使用 CLI 命令 + +::steps{level="4"} + +#### 安装依赖 + +```bash +# pip install --upgrade pip && pip install --no-cache-dir -r ./docker/requirements.txt +# 使用阿里云源安装依赖 +pip install --no-cache-dir -r ./docker/requirements.txt -i https://mirrors.aliyun.com/pypi/simple/ + + +``` + +#### 在终端中打开运行以下命令进行安装: + +```bash + +# 目前可能需要手动安装的包 这两个包需要找资源 +# neo4j.5.26.4.tar qdrant.v1.15.3.tar +docker load -i neo4j.5.26.4.tar +docker load -i qdrant.v1.15.3.tar +# 查看是否安装成功 +docker images +# 查看是否跑起来了 +docker ps -a + +# 若启动时出现ModuleNotFoundError: No module named 'memos',是因为路径匹配有问题,请执行 +export PYTHONPATH=/you-file-absolute-path/MemOS/src + +# 根目录 + uvicorn memos.api.server_api:app --host 0.0.0.0 --port 8000 --workers 1 + + + +``` + +#### 访问 API + +启动完成后,通过 [http://localhost:8000/docs](http://localhost:8000/docs) 访问 API。 + + +:: + +### 方式四:不使用 Docker +::steps{level="4"} +#### 参考上方配置环境变量,已经好配置.env文件 + +#### 安装 Poetry 用于依赖管理: + +```bash +curl -sSL https://install.python-poetry.org | python3 - +``` + +#### Poetry 环境变量配置: + +```bash + +#要开始使用,您需要在“PATH”中找到Poetry的bin目录(/Users/jinyunyuan/.local/bin)`环境变量 +# 现代 macOS 系统默认的 Shell 是 zsh。你可以通过以下命令确认 +1. 确定你使用的 Shell + +echo $SHELL +# 如果输出是 /bin/zsh 或 /usr/bin/env zsh,那么你就是 zsh。 +# (如果你的系统版本较老,可能还在使用 bash,输出会是 /bin/bash) +2. 打开对应的 Shell 配置文件 +# 如果使用的是 zsh (绝大多数情况): +# 使用 nano 编辑器(推荐新手) +nano ~/.zshrc + +# 或者使用 vim 编辑器 +# vim ~/.zshrc +# 如果使用的是 bash: +nano ~/.bash_profile +# 或者 +nano ~/.bashrc + +3. 添加 PATH 环境变量 + +# 在打开的文件的最末尾,新起一行,粘贴安装提示给你的那行命令: +export PATH="/you-path/.local/bin:$PATH" + +4. 保存并退出编辑器 + +# 如果你用的是 nano: +# 按 Ctrl + O 来写入(保存),按 Enter 确认文件名。 +# 然后按 Ctrl + X 退出编辑器。 + +# 如果你用的是 vim: +# 按 i 进入插入模式,粘贴代码后,按 ESC 键退出插入模式。 +# 输入 :wq,然后按 Enter 来保存并退出。 + +5. 使配置立刻生效 +# 刚刚修改的配置文件不会自动在当前已打开的终端窗口生效,你需要运行以下命令之一来重新加载它: + +# 对于 zsh: +source ~/.zshrc + +# 对于 bash: +source ~/.bash_profile + +6. 验证安装是否成功 +# 现在,你可以执行提示中的测试命令来检查一切是否就绪: +poetry --version +# 成功后将显示版本号 Poetry (version 2.2.0) + +``` + +#### 安装所有项目依赖和开发工具: + +```bash +make install +``` + +#### 先在docker中启动 neo4j 和 qdrant + +#### 启动 FastAPI 服务器(在MomOS目录下): + +```bash +uvicorn memos.api.product_api:app --host 0.0.0.0 --port 8000 --reload +``` + +#### 服务器运行后,您可以使用OpenAPI文档测试API,网址为 [http://localhost:8000/docs](http://localhost:8000/docs) 或者 [http://127.0.0.1:8000/docs](http://127.0.0.1:8000/docs) + +#### 测试用例 (注册用户->添加用户记忆->查询用户记忆) 参考Docker Compose up测试用例 + +:: + + +### 方式五:使用 PyCharm 启动 + +#### 运行 server_api +```bash +1、进入MemOS/docker/Dockerfile文件,修改运行配置 +# Start the docker +CMD ["uvicorn", "memos.api.server_api:app", "--host", "0.0.0.0", "--port", "8000", "--reload"] + +2、进入目录MemOS/src/memos/api 直接运行server_api.py + +``` diff --git a/docs/cn/open_source/getting_started/your_first_memory.md b/docs/cn/open_source/getting_started/your_first_memory.md new file mode 100644 index 00000000..ea9ef8ae --- /dev/null +++ b/docs/cn/open_source/getting_started/your_first_memory.md @@ -0,0 +1,322 @@ +--- +title: 创建你的第一个记忆 +desc: "动手实战!我们将带您使用 **SimpleStructMemReader** 从对话中提取记忆,并把它存进 **TreeTextMemory** 进行管理与检索。" +--- + +## 学习目标 + +本教程将引导您完成 MemOS 的核心工作流,掌握以下能力: + +1. **读 (Read)**:怎么用 `SimpleStructMemReader` 把乱七八糟的聊天记录变成结构化的记忆。 +2. **存 (Add)**:怎么把提取出来的记忆存进 `TreeTextMemory`(图数据库)。 +3. 
**搜 (Search)**:怎么用自然语言把存进去的记忆搜出来。 + +--- + +## 核心组件简介 + +在开始实战前,先了解我们将要使用的两个关键组件: + +### SimpleStructMemReader(结构化记忆提取器) + +这是一个基于 LLM 的智能信息提取模块,能够: + - 自动分析对话、文档等非结构化数据 + - 识别用户偏好、事实陈述、行为模式等关键信息 + - 输出标准化的结构化记忆单元 + +### TreeTextMemory(树状文本记忆库) + +这是一个基于图数据库的记忆管理系统,能够: + - 以树状结构组织记忆,支持层级关系 + - 建立记忆间的语义关联 + - 支持高效的语义检索和图遍历 + - 底层兼容 Neo4j 等图数据库 + +## 动手试试 + +我们将通过一个具体案例演示:如何从用户关于"网球状态不佳"的对话中提取关键信息,建立可检索的记忆系统。 + +### 1. 导入模块 + +```python +from memos import log +from memos.configs.mem_reader import SimpleStructMemReaderConfig +from memos.configs.memory import TreeTextMemoryConfig +from memos.mem_reader.simple_struct import SimpleStructMemReader +from memos.memories.textual.tree import TreeTextMemory + +logger = log.get_logger(__name__) +``` + +### 2. 初始化核心组件 + +```python + +# 1. 初始化 TreeTextMemory(记忆仓库) +tree_config = TreeTextMemoryConfig.from_json_file( + "examples/data/config/tree_config_shared_database.json" +) +my_tree_textual_memory = TreeTextMemory(tree_config) + +# ⚠️ 注意:这里为了演示方便清空了旧数据。生产环境千万别这么干! +my_tree_textual_memory.delete_all() + +# 2. 初始化 SimpleStructMemReader(信息提取器) +reader_config = SimpleStructMemReaderConfig.from_json_file( + "examples/data/config/simple_struct_reader_config.json" +) +reader = SimpleStructMemReader(reader_config) +``` + +### 3. 准备一段对话 + +以下是一段用户与 AI 的对话,用户表达了打网球时的状态问题: + +```python +scene_data = [ + [ + { + "role": "user", + "chat_time": "3 May 2025", + "content": "This week I’ve been feeling a bit off, especially when playing tennis. My body just doesn’t feel right.", + }, + { + "role": "assistant", + "chat_time": "3 May 2025", + "content": "It sounds like you've been having some physical discomfort lately...", + }, + # ... (中间省略了几轮吐槽) ... + { + "role": "user", + "chat_time": "3 May 2025", + "content": "I think it might be due to stress and lack of sleep recently...", + }, + ] +] +``` + +### 4. 提取并存储 + +**SimpleStructMemReader** 会自动分析对话,提取出“用户最近压力大”、“睡眠不足”、“网球表现下降”等关键记忆点,然后存入数据库。 + +```python +# 1. 提取 (Extract) +# Reader 会调用 LLM 分析对话,返回一个记忆列表 +memory = reader.get_memory( + scene_data, + type="chat", + info={"user_id": "1234", "session_id": "2222"} +) + +# 2. 存储 (Add) +for m_list in memory: + added_ids = my_tree_textual_memory.add(m_list) + + # 看看存进去了啥 + for i, id in enumerate(added_ids): + print(f"存入第 {i} 条记忆: " + my_tree_textual_memory.get(id).memory) + + # 等待后台整理完成(建立索引需要一点时间) + my_tree_textual_memory.memory_manager.wait_reorganizer() +``` + +### 5. 检索记忆 + +**基础搜索 (Search):** + +就像用搜索引擎一样,直接问它。 + +```python +# 稍微等一下索引构建 +import time +time.sleep(2) + +init_time = time.time() + +# 试着搜一下关于“童年”的事(假设之前的对话里包含相关内容) +# 或者搜 "Why is the user feeling bad?" 
试试 +results = my_tree_textual_memory.search( + "Talk about the user's childhood story?", + top_k=10, + info={ + "query": "Talk about the user's childhood story?", + "user_id": "111", + "session_id": "2234", + }, +) + +for i, r in enumerate(results): + print(f"搜到的第 {i} 条结果: {r.memory}") + +print(f"搜索耗时: {round(time.time() - init_time)}s") +``` + +**高级搜索 (Fine Mode):** + +如果您想要更聪明一点的搜索结果(比如让 LLM 帮您总结一下搜到的内容),可以开启 `mode="fine"`。 + +```python +# 开启 Fine 模式 +results_fine_search = my_tree_textual_memory.search( + "Recent news in the first city you've mentioned.", + top_k=10, + mode="fine", # 关键在这里 + info={ + "query": "Recent news in NewYork", + "user_id": "111", + "session_id": "2234", + "chat_history": [ + {"role": "user", "content": "I want to know three beautiful cities"}, + {"role": "assistant", "content": "New York, London, and Shanghai"}, + ], + }, +) + +for i, r in enumerate(results_fine_search): + print(f"Fine Search 结果: {r.memory}") +``` + +### 6. 进阶:多模态与工具 (Modality & Tools) + +MemOS 的能力不仅限于文本对话处理,还支持多模态输入和高级功能。 + +#### 1. 读取文档 (Documents) + +可直接读取本地文档并转化为记忆: + +```python +# 构造文档数据 +doc_data = [ + { + "type": "file", + "file": { + "filename": "tennis_rule.txt", + "path": "./tennis_rule.txt", # 确保文件存在 + # 或者直接提供 content: "file_data": "..." + } + } +] + +# 告诉 Reader 这是 "doc" 类型 +doc_memories = reader.get_memory( + doc_data, + type="doc", + info={"user_id": "1234", "session_id": "docs_import"} +) + +# 存入记忆 +for m in doc_memories: + my_tree_textual_memory.add(m) +``` + +#### 2. 工具调用 (Tools) + +当 Agent 使用工具(如搜索、计算器)时,MemOS 能解析工具的输入输出,记录下“用户查询了天气”、“计算结果是50”等事实。 + +```python +tool_scene = [ + [ + {"role": "user", "content": "What's the weather in Beijing?"}, + { + "role": "assistant", + "content": "", + "tool_calls": [{"id": "call_1", "function": {"name": "get_weather", "arguments": "{'city': 'Beijing'}"}}] + }, + { + "role": "tool", + "tool_call_id": "call_1", + "content": "Sunny, 25°C" + } + ] +] + +# Reader 会自动理解这是工具交互 +tool_memories = reader.get_memory(tool_scene, type="chat", info={"user_id": "1234"}) +``` + +### 7. 用户偏好 (Preferences) + +除了事实性记忆(TreeTextMemory),MemOS 还有专门的 **PreferenceTextMemory** 来管理用户喜好(如“喜欢吃辣”、“讨厌下雨”)。它使用向量数据库(如 Milvus/Qdrant)来存储,方便快速检索用户的个性化设置。 + +```python +from memos.memories.textual.simple_preference import SimplePreferenceTextMemory +# 注意:初始化需要配置 VectorDB, Embedder 等,这里仅作示意 +# pref_memory = SimplePreferenceTextMemory(...) + +# 从对话中自动提取偏好 +pref_memories = pref_memory.get_memory(chat_data, type="chat", info=...) + +# 存入偏好 +pref_memory.add(pref_memories) + +# 搜索偏好 +prefs = pref_memory.search("What is the user's UI preference?", top_k=1) +print(prefs[0].memory) # 输出: "User prefers dark mode" +``` + +### 8. 记忆反馈 (Feedback) + +记忆不是一成不变的。用户可能会纠正 AI:“我不喜欢红色,我改主意了,我喜欢蓝色”。**MemFeedback** 模块就是用来处理这种“修正”的。 + +它可以: +1. **修改**错误的记忆。 +2. **删除**过时的记忆。 +3. **合并**冲突的记忆。 + +```python +from memos.mem_feedback.simple_feedback import SimpleMemFeedback + +# 初始化反馈模块 +# feedback_module = SimpleMemFeedback(...) + +# 处理用户反馈 +# 假设用户说:"Actually, I started playing tennis in 2020, not 2018." +feedback_module.process_feedback({ + "user_id": "1234", + "feedback_content": "Actually, I started playing tennis in 2020, not 2018.", + "chat_history": [...], # 提供上下文 + "feedback_time": "Now" +}) + +# 反馈模块会自动在后台更新 Graph 数据库中的节点和关系 +``` + +### 总结 + +通过本教程,您已经掌握了 MemOS 的核心工作流: +1. **信息提取**: 使用 Reader 从各种数据源提取结构化信息 +2. **记忆存储**: 使用 TreeTextMemory 管理事实记忆,PreferenceMemory 管理用户偏好 +3. **智能检索**: 通过自然语言查询获取相关记忆 +4. 
**持续优化**: 通过反馈机制保持记忆的准确性和时效性
+
+下一步,您可以尝试运行 `examples/mem_os/simple_memos.py`,体验一个整合了所有这些功能的完整 Agent!
+
+### 9. 收尾
+
+测试完成后,建议进行以下清理操作:
+
+```python
+# 关闭后台线程
+my_tree_textual_memory.memory_manager.close()
+
+# 备份一下记忆
+my_tree_textual_memory.dump("tmp/my_tree_textual_memory")
+
+# 删库跑路(仅限测试环境!)
+my_tree_textual_memory.drop()
+```
+
+---
+
+## 下一步?
+
+- **尝试自己的 LLM 后端:** 切换到 OpenAI、HuggingFace 或 Ollama。
+- **探索 [TreeTextMemory](/open_source/modules/memories/tree_textual_memory):** 构建基于图的层级记忆。
+- **添加 [Activation Memory](/open_source/modules/memories/kv_cache_memory):** 缓存键值状态,加速推理。
+- **深入学习:** 查看 [API Reference](/api-reference/search-memories) 和 [Examples](/open_source/getting_started/examples) 了解高级工作流程。
+
+接下来,您可以去看看更高级的玩法:
+- **[MemReader](/open_source/modules/mem_reader)**:其实它还能读图片和 PDF。
+- **[MemFeedback](/open_source/modules/mem_feedback)**:如果有记忆记错了,怎么让 AI 自动修正?
+- **[MemCube](/open_source/modules/mem_cube)**:怎么把各种记忆能力打包在一起,做一个真正的全能大脑。
diff --git a/docs/cn/open_source/home/architecture.md b/docs/cn/open_source/home/architecture.md
new file mode 100644
index 00000000..16a10d11
--- /dev/null
+++ b/docs/cn/open_source/home/architecture.md
@@ -0,0 +1,139 @@
+---
+title: 架构设计
+desc: MemOS 采用模块化设计,各核心组件协同工作,将传统 LLM 升级为具备完整记忆生命周期管理能力的记忆增强系统。
+---
+## 核心模块
+
+### MOS (Memory Operating System,记忆操作系统)
+
+MemOS 的编排层——管理跨多种记忆类型(明文、激活、参数化)的预测性、异步调度,并编排**多用户、多会话**记忆工作流。
+
+**核心功能**
+ - **统一 API 网关**:为所有记忆操作(添加、搜索、更新、传输、回滚)提供一致接口
+ - **工作流编排**:协调 MemCube、MemReader、MemScheduler 等组件的执行流程
+ - **互操作性支持**:通过记忆交换协议 (MIP) 实现跨模型、跨设备的记忆迁移
+ - **资源调度**:智能分配计算资源,平衡实时响应与后台处理需求
+
+
+### MemCube (记忆容器)
+
+MemCube 是 MemOS 的**模块化记忆存储单元**,可视为一个独立且可移植的“记忆卡”。每个 MemCube 可专门服务于特定用户、智能体或会话,容纳一种或多种记忆类型。
+
+**多 Cube 支持 (Multi-Cube):**
+
+MemOS 支持通过**组合视图 (Composite View)** 同时操作多个 MemCube,实现灵活的记忆隔离与共享:
+
+| 操作类型 | 策略 | 应用场景 |
+|---------|------|---------|
+| **写入** | 扇出写入 (Fan-out Write) | 将记忆同时写入用户个人 Cube 和项目共享 Cube |
+| **读取** | 并行搜索 (Parallel Search) | 同时查询多个 Cube,聚合结果提供全局视图 |
+
+**动态管理能力:**
+
+ - **热插拔支持**:可在运行时动态注册、更新或移除 MemCube
+ - **容器化存储**:支持记忆在会话、模型和设备间的安全传输
+ - **隔离保障**:确保不同用户或应用间的记忆数据相互隔离
+
+### 异步添加机制 (Asynchronous Addition)
+
+为了在高并发场景下保持低延迟,MemOS 提供了异步记忆添加模式 (`async_mode`),利用 **MemScheduler** 进行后台调度:
+
+| 记忆类型 | 处理策略 | 优势 |
+|---------|---------|------|
+| **文本记忆** | 快速提取 + 异步处理 | API 立即返回基础结果,复杂处理后台执行 |
+| **偏好记忆** | 全异步处理 | 最大化减少 API 响应延迟 |
+
+**MemScheduler 角色:**
+作为异步任务调度器,负责:
+ - 管理后台处理任务的优先级队列
+ - 协调重索引、图推理等计算密集型操作
+ - 监控任务状态,确保处理一致性
+
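+下面给出一个极简的调用草图,示意如何在写入记忆时切换同步/异步模式。这里复用本文档 MemCube 章节中出现的 `APIADDRequest` 写法,字段取值仅作演示,具体以实际 API 为准:
+
+```python
+from memos.api.product_models import APIADDRequest
+
+# 高并发写入:async_mode="async" 时 API 先返回基础结果,
+# 重索引、图推理等复杂处理交由 MemScheduler 在后台完成
+add_req = APIADDRequest(
+    user_id="user_001",              # 演示用的用户 ID
+    writable_cube_ids=["cube_001"],  # 写入目标 Cube
+    messages=[{"role": "user", "content": "I just adopted a puppy named Max."}],
+    async_mode="async",              # 改为 "sync" 即同步处理,适合测试或小流量场景
+)
+```
+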
+### 记忆类型体系
+
+MemOS 支持几种专门的记忆类型以满足不同需求:
+
+#### 1. 参数化记忆(**即将推出**)
+
+ - **特性**: 知识固化于模型权重中,推理零延迟
+ - **应用场景**:稳定的领域知识、核心技能
+ - **生命周期**:长期持久,更新成本较高
+
+#### 2. 激活记忆
+
+ - **特性**:运行时 KV 缓存与隐藏状态,快速复用
+ - **应用场景**:多轮对话上下文、频繁访问的背景信息
+ - **生命周期**:短期有效,会话级保持
+
+#### 3. 明文记忆
+
+结构化或非结构化知识块;可编辑、可追溯,适合快速更新、个性化和多代理共享。
+
+| 记忆子类 | 存储结构 | 核心优势 | 典型应用 |
+|---------|---------|---------|---------|
+| **GeneralTextMemory** | 向量存储 | 语义检索灵活,支持元数据过滤 | 非结构化文档、聊天记录 |
+| **TreeTextMemory** | 图结构存储 | 层次化组织,支持多跳推理 | 结构化知识库、用户画像 |
+
+
+::note{title="架构选型建议"}
+**起步建议**:从 `GeneralTextMemory` 开始快速验证概念
+**演进路径**:随着业务复杂度提升,逐步引入 `TreeTextMemory` 处理结构化知识
+::
+
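+在两类明文记忆之间做选择时,可以参考下面这个极简草图。配置字段在 GeneralTextMemory 章节有完整示例,此处从略:
+
+```python
+from memos.configs.memory import MemoryConfigFactory
+from memos.memories.factory import MemoryFactory
+
+# 起步:基于向量存储的 GeneralTextMemory,验证概念最快
+config = MemoryConfigFactory(
+    backend="general_text",
+    config={
+        "extractor_llm": {...},  # LLM、向量库、嵌入模型的具体配置此处从略
+        "vector_db": {...},
+        "embedder": {...},
+    },
+)
+memory = MemoryFactory.from_config(config)
+
+# 演进:需要层级组织与多跳推理时,改用 TreeTextMemory
+# (初始化方式见教程:TreeTextMemoryConfig.from_json_file(...) + TreeTextMemory(config))
+```
+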
+#### 基础支撑组件
+
+MemOS 的基础模块提供标准化能力,确保系统的可扩展性与一致性:
+
+| 组件类别 | 核心功能 | 实现示例 |
+|---------|---------|---------|
+| **文本处理** | 智能分块、记忆提取 | Chunkers, MemReaders |
+| **向量化** | 文本嵌入生成 | Embedders (bge-m3, text-embedding-3-large) |
+| **存储接口** | 多数据库适配 | GraphDBs (Neo4j, PolarDB)、VecDBs (Qdrant) |
+| **模型连接** | 统一 LLM 接口 | LLMs (OpenAI, Ollama) |
+| **质量优化** | 检索结果重排 | Rerankers (bge-reranker-v2-m3) |
+
+## 代码组织架构
+
+MemOS 项目组织清晰,支持即插即用:
+
+```
+src/memos/
+    api/            # API 定义
+    chunkers/       # 文本分块工具
+    configs/        # 配置模式
+    context/        # 日志上下文
+    embedders/      # 嵌入模型
+    graph_dbs/      # 图数据库后端 (例如,Neo4j)
+    vec_dbs/        # 向量数据库后端 (例如,Qdrant)
+    llms/           # LLM 连接器
+    mem_agent/      # 深度检索
+    mem_chat/       # 记忆增强聊天逻辑
+    mem_cube/       # MemCube 管理
+    mem_feedback/   # 记忆反馈
+    mem_os/         # MOS 编排
+    mem_reader/     # 记忆读取器
+    mem_scheduler/  # 记忆调度模块
+    memories/       # 记忆类型实现
+    multi_mem_cube/ # 多视图 Cube
+    parsers/        # 解析工具
+    reranker/       # 重排模块
+    templates/      # 提示词模板
+    types/          # 类型定义
+```
+
+::note{title="开发指引"}
+**专业提示**
+ - **快速实验**:使用 `examples/` 目录中的示例快速验证功能 + - **深度定制**:参考 `src/` 中的模块实现进行二次开发 + - **配置管理**:所有组件均支持通过 `configs/` 进行灵活配置 +:: + +## 可扩展性 + +MemOS 是**模块化设计的**。 +添加您自己的记忆类型、存储后端或 LLM 连接器,只需最少的更改——这要归功于其**统一配置和工厂模式**。 + +::note +**专业提示**
[贡献](/open_source/contribution/overview) 一个新的后端,或分享您的自定义记忆类型——得益于统一的配置和工厂模式,新组件很容易接入。
+::
diff --git a/docs/cn/open_source/home/core_concepts.md b/docs/cn/open_source/home/core_concepts.md
new file mode 100644
index 00000000..bb9185a3
--- /dev/null
+++ b/docs/cn/open_source/home/core_concepts.md
@@ -0,0 +1,105 @@
+---
+title: 核心概念
+desc: MemOS 将记忆视为一等资源。其核心设计围绕如何为您的 LLM 应用程序组织、存储、检索和治理记忆。
+---
+
+## 概述
+
+* [MOS (记忆操作系统)](#mos-记忆操作系统)
+* [MemCube](#memcube)
+* [记忆类型](#记忆类型)
+* [横切概念](#横切概念)
+
+
+## MOS (记忆操作系统)
+
+**定义**
+MOS 是 MemOS 的编排调度层,负责协调多个 MemCube 与各类记忆操作。它作为中间件,将 LLM 与结构化、可解释的记忆系统连接起来,以支持复杂的推理与规划任务。
+
+**使用场景**
+当您需要在用户、会话或智能体之间建立一致、可审计且可追溯的记忆工作流时,应使用 MOS 进行统一调度。
+
+## MemCube
+
+**定义**
+MemCube 是 MemOS 中的可插拔、可扩展记忆容器。每个用户、会话或任务均可分配独立的 MemCube,其中可承载一种或多种类型的记忆。
+
+**使用场景**
+随着系统规模增长,可通过配置不同的 MemCube 实现记忆的隔离、复用与水平扩展。
+
+## 记忆类型
+
+MemOS 将记忆视为动态演化的知识系统,而非静态数据存储。其核心记忆类型如下:
+
+| 记忆类型 | 描述 | 何时使用 |
+|----------------|----------------------------------------------|---------------------------------------------|
+| **参数记忆** | 内化至模型权重的知识 | 常青技能、稳定领域专业知识 |
+| **激活记忆** | 可复用的 KV 缓存与隐藏状态 | 对话中的快速重用、多轮会话 |
+| **明文记忆** | 文本、文档、图节点、工具或偏好等 | 可搜索、可审查、可演进的知识 |
+
+### 参数记忆
+
+**定义**
+参数记忆指固化在模型权重中的知识,可视为模型的“长期记忆”。它始终在线,为推理任务提供零延迟的知识支持。
+
+**使用场景**
+适用于稳定的领域知识、经过提炼的通用问题解法以及不易变化的操作技能。
+
+### 激活记忆
+
+**定义**
+激活记忆是模型可复用的“工作记忆”,包括预先计算的键值缓存(KV Cache)与隐藏状态,可直接注入注意力机制中,避免对重复内容的重复编码。
+
+**为什么重要:**
+将稳定的上下文信息(如产品说明、操作指南)以 KV Cache 形式存储,可大幅降低首词元延迟(TTFT),并提升多轮对话与检索增强生成(RAG)的吞吐效率。
+
+**何时使用:**
+- 在连续查询中复用背景知识
+- 加速基于固定上下文的对话系统
+- 配合 MemScheduler 将高频明文记忆自动转换为 KV 缓存
+
+### 明文记忆
+
+**定义**
+结构化或非结构化的知识单元,具有用户可见性和可解释性。除了传统的文档、聊天日志、图节点和向量嵌入外,MemOS 还将以下内容视为明文记忆:
+- **工具记忆 (Tool Memory)**:包括工具的定义 (Schema) 和使用轨迹 (Trajectory),用于增强智能体(Agent)的工具调用能力。
+- **偏好记忆 (Preference Memory)**:显式或隐式的用户偏好,用于个性化推荐和响应。
+
+**使用场景**
+适用于语义搜索、个性化体验构建、复杂任务的工具增强以及随时间演进的可追溯事实。支持标签、来源追踪和完整的生命周期管理。
+
+
+## 它们如何协同工作
+
+MemOS 让您在生命周期循环中调度所有三种记忆类型:
+
+- 提炼过程:高频使用的明文记忆可被蒸馏为参数记忆,提升推理效率。
+- 缓存优化:常见推理路径可固化为可复用的 KV 模板,减少重复计算。
+- 降级归档:使用频率降低的参数或激活记忆可降级为明文存储,便于审计与再训练。
+
+借助 MemOS,您的 AI 系统不仅能存储信息,更能实现持续**记忆**、**深度理解**和**自主进化**。
+
+::note
+**系统洞察**
+ - 随着时间的推移,频繁使用的明文记忆可以提炼为参数记忆。 + - 低频参数或缓存则可归档为明文,形成可审计、可再训练的知识闭环。 +:: + +## 横切概念 + +### 混合检索 + +结合向量相似性检索和图遍历算法,实现稳健且具有上下文感知能力的混合搜索。 + +### 治理与生命周期 + +每个记忆单元都具备完整的生命周期状态(激活、合并、归档),并支持来源跟踪和细粒度访问控制,这对满足审计和数据合规性要求至关重要。 + +::note +**合规提示**
+请确保对每个记忆单元的来源与状态变更进行完整记录,以符合数据治理与审计规范。 +:: + +## 关键要点 + +MemOS 为您的 LLM 应用提供结构化、可演进、可治理的记忆系统,使智能体能够进行长远规划、复杂推理与持续自适应,释放下一代 AI 应用的真正潜力。 diff --git a/docs/cn/open_source/home/memos_intro.md b/docs/cn/open_source/home/memos_intro.md new file mode 100644 index 00000000..25e7b343 --- /dev/null +++ b/docs/cn/open_source/home/memos_intro.md @@ -0,0 +1,114 @@ +--- +title: 什么是 MemOS? +desc: "**MemOS** 是为大语言模型 (LLMs) 和智能体打造的**记忆操作系统**。它将记忆视为**可管理、调度和解释的一级资源**,而不是隐藏在模型权重内部的不透明层。" +--- + +![MemOS Architecture](https://statics.memtensor.com.cn/memos/memos-architecture.png) + + +随着 LLMs 的发展,它们需要处理复杂任务——如多轮对话、长期规划、决策制定和个性化用户体验——**赋予它们结构化、管理和演进记忆的能力**对于实现真正的长期智能和适应性变得至关重要。 + +然而,大多数主流 LLMs 仍然严重依赖静态参数化记忆(模型权重)。这使得更新知识、跟踪记忆使用或积累演进的用户偏好变得困难。结果是什么?刷新知识成本高、行为脆弱以及个性化有限。 + +**MemOS** 通过将记忆重新定义为具有统一结构、生命周期管理和调度逻辑的**核心模块化系统资源**来解决这些挑战。它提供了一个基于 Python 的层,位于您的 LLM 和外部知识源之间,实现**持久化、结构化和高效的记忆操作**。 + +使用 MemOS,您的 LLM 可以随时间保留知识,更稳健地管理上下文,并使用可解释和可审计的记忆进行推理——解锁更智能、可靠和自适应的 AI 行为。 + + +::note +**提示**
MemOS 帮助弥合静态参数化权重和动态、用户特定记忆之间的差距。 + 将其视为您代理的"大脑",具有明文和激活记忆的即插即用模块。 +:: + +## 为什么我们需要MemOS? + +LLMs 强大,但严重依赖参数化记忆(权重),这些权重难以检查、更新或共享。 +典型的向量搜索 (RAG) 有助于检索外部事实,但缺乏统一治理、生命周期控制或跨代理共享。 + +**MemOS** 改变了这一点。 +将其视为记忆的操作系统: +就像操作系统调度 CPU、RAM 和文件一样,MemOS **调度、转换和治理**多种记忆类型——从参数化权重到临时缓存再到明文、可追溯的知识。 + +::note +**深入理解**
MemOS 通过将参数化、激活和明文记忆融合到生命周期中,帮助您的 LLM 演进。 +:: + + +## 核心构建模块 +### MemCubes + +**灵活的容器**,容纳一种或多种记忆类型。 +每个用户、会话或代理都可以有自己的 MemCube——可交换、可重用和可追溯。 + +### 记忆生命周期 + +每个记忆单元可以流经以下状态: + +- **生成** → **激活** → **合并** → **归档** → **冻结** + +每个步骤都通过**来源跟踪**和审计日志进行版本控制。旧记忆可以"时间机器"回到之前的版本进行恢复或反事实模拟。 + + +### 操作与治理 + +模块包括: + +- **MemScheduler** — 动态转换记忆类型以实现最佳复用。 +- **MemLifecycle** — 管理状态转换、合并和归档。 +- **MemGovernance** — 处理访问控制、编辑、合规性和审计跟踪。 + + +::note +**合规提醒**
每个记忆单元都携带完整的来源元数据,因此您可以审计谁创建、修改或查询了它。 +:: + + +## 多视角记忆 + +MemOS 在生命周期中融合**三种记忆形式**: + +| 类型 | 描述 | 用例 | +|----------------| ---------------------------------------------------- | ---------------------------------------------- | +| **参数记忆** | 知识提炼到模型权重中 | 常青技能、稳定领域事实 | +| **激活记忆** | 用于推理复用的 KV caches 和隐藏状态 | 快速多轮聊天、低延迟生成 | +| **明文记忆** | 文本、文档、图、向量块、用户可见事实| 语义搜索、演进、可解释记忆 | + +随着时间的推移: + +- 频繁使用的明文记忆可以提炼为参数化权重。 +- 稳定的上下文被提升为 KV cache 以快速注入。 +- 使用频率低或过时的知识可以被降级。 + + +## MemOS 有什么不同? + +- 混合检索 — 符号和语义混合检索、向量和图混合检索。 +- 多代理和多用户图 — 私有和共享。 +- 来源和审计跟踪 — 每个记忆单元都被治理和可解释。 +- 自动 KV cache 提升以重用稳定上下文。 +- 记忆的生命周期调度 — 减少陈旧事实或臃肿权重的调用。 + + +## 适合谁? + +- 需要**多轮、演进记忆**的对话代理 +- 处理**合规性、领域更新和个性化**的企业级 Copilot +- 在**共享知识图**上协作的多代理系统 +- 想要模块化、可查记忆而不是黑盒提示的 AI 构建者 + +## 关键要点 + +**MemOS** 将您的 LLM 从"只是预测 tokens" +升级为可以**记忆**、**推理**和**适应**的智能演进系统—— +就像您代理思维的操作系统。 + +**使用 MemOS,您的 AI 不仅仅是存储事实——它在成长。** + +## 主要特性 + +- **模块化记忆架构**: 支持明文、激活 (KV cache) 和参数化 (adapters/LoRA) 记忆。 +- **MemCube**: 所有记忆类型的统一容器,具有简单的加载/保存和 API 访问。 +- **MOS**: 面向 LLMs 的记忆增强系统,具有即插即用的记忆模块。 +- **基于图的后端**: 原生支持 Neo4j 和其他图数据库,用于结构化、可解释的记忆。 +- **易于集成**: 可与 HuggingFace、Ollama 和自定义 LLMs 配合使用。 +- **可扩展**: 添加您自己的记忆模块或后端。 diff --git a/docs/cn/open_source/home/overview.md b/docs/cn/open_source/home/overview.md new file mode 100644 index 00000000..c5b8d59a --- /dev/null +++ b/docs/cn/open_source/home/overview.md @@ -0,0 +1,47 @@ +--- +title: MemOS 文档 +desc: 欢迎来到 MemOS 官方文档 – 一个专为大型语言模型 (LLMs) 提供高级模块化记忆功能的 Python 包。 +banner: https://statics.memtensor.com.cn/memos/memos-banner.gif +links: + - label: 'PyPI' + to: https://pypi.org/project/MemoryOS/ + target: _blank + avatar: + src: https://statics.memtensor.com.cn/icon/pypi.svg + alt: PyPI logo + - label: 'Open Source' + to: https://github.com/MemTensor/MemOS + target: _blank + icon: i-simple-icons-github +--- + +## 什么是 MemOS? + +随着大型语言模型(LLMs)的不断演进,其所承担的任务日益复杂,包括多轮对话、规划、决策制定以及个性化代理等。在此背景下,如何高效管理和利用记忆,成为实现长期智能与适应性能力的关键因素。 +然而,主流 LLM 架构往往在记忆结构化、管理和集成方面存在不足,导致知识更新成本高、行为状态不可持续以及难以积累用户偏好。 + +**MemOS** 通过将记忆重新定义为具有统一结构、生命周期管理和调度策略的核心一级资源来解决这些挑战。它提供了一个 Python 包,为基于 LLM 的应用程序提供统一的记忆层,实现持久化、结构化和高效的记忆操作。这使 LLMs 具备长期知识保留、强大的上下文管理和记忆增强推理能力,支持更智能和自适应的行为。 + +![MemOS Architecture](https://statics.memtensor.com.cn/memos/memos-architecture.png) + +## 主要特性 + +- **模块化记忆架构**:支持明文、激活(KV cache)和参数(适配器/LoRA)记忆。 +- **MemCube**:所有记忆类型的统一容器,易于加载/保存和 API 访问。 +- **MOS**:LLMs 的记忆增强系统,具有即插即用的记忆模块。 +- **基于图的后端**:原生支持 Neo4j 和其他图数据库,用于结构化、可解释的记忆。 +- **易于集成**:与 HuggingFace、Ollama 和自定义 LLMs 兼容。 +- **可扩展**:添加您自己的记忆模块或后端。 + + +## 安装 + +请参阅我们的 [安装指南](/open_source/getting_started/installation) 获取完整的安装说明,包括基础安装、可选依赖项和外部依赖项。 + +## 贡献 + +我们欢迎贡献!请参阅 [贡献指南](/open_source/contribution/overview) 了解设置环境和提交 pull request 的详细信息。 + +## 许可证 + +MemOS 在 Apache 2.0 许可证下发布。 diff --git a/docs/cn/open_source/modules/mem_chat.md b/docs/cn/open_source/modules/mem_chat.md new file mode 100644 index 00000000..12c7aed2 --- /dev/null +++ b/docs/cn/open_source/modules/mem_chat.md @@ -0,0 +1,180 @@ +--- +title: MemChat +desc: "MemChat 是你的“记忆外交官”,它协调用户输入、记忆检索与 LLM 生成,打造连贯且具备长期记忆的对话体验。" +--- + +## 1. 简介 + +**MemChat** 是 MemOS 的对话控制中心。 + +它不仅仅是一个聊天接口,更是连接“即时对话”与“长时记忆”的桥梁。在与用户交流的过程中,MemChat 负责实时地从 MemCube(记忆立方体)中检索相关背景信息,构建上下文,并将新的对话内容沉淀为新的记忆。通过它,你的 Agent 不再是“金鱼记忆”,而是能够真正理解过往、持续成长的智能伙伴。 + +--- + +## 2. 
核心能力 + +### 记忆增强对话 (Memory-Augmented Chat) +在回答用户问题前,MemChat 会自动从 MemCube 中检索相关的 Textual Memory(文本记忆),将其注入到 Prompt 中。这使得 Agent 能够基于过往的交互历史或知识库来回答问题,而不仅仅依赖于 LLM 的预训练知识。 + +### 自动记忆沉淀 (Auto-Memorization) +对话后,MemChat 会利用 Extractor LLM 自动从对话流中提取有价值的信息(如用户偏好、事实知识),并存储到 MemCube 中。无需用户手动干预,整个过程完全自动化。 + +### 上下文管理 +自动管理对话历史窗口 (`max_turns_window`)。当对话过长时,它会智能裁剪旧的上下文,同时依赖检索到的长期记忆来保持对话的连贯性,有效解决了 LLM Context Window 的限制问题。 + +### 灵活配置 +支持通过配置开关不同类型的记忆(文本记忆、激活记忆等),适应不同的应用场景。 + +--- + +## 3. 代码结构 + +核心逻辑位于 `memos/src/memos/mem_chat/` 下。 + +* **`simple.py`**: **默认实现 (SimpleMemChat)**。这是一个开箱即用的 REPL(Read-Eval-Print Loop)实现,包含了完整的“检索 -> 生成 -> 存储”闭环逻辑。 +* **`base.py`**: **接口定义 (BaseMemChat)**。定义了 MemChat 的基本行为,如 `run()` 和 `mem_cube` 属性。 +* **`factory.py`**: **工厂类**。负责根据配置 (`MemChatConfig`) 实例化具体的 MemChat 对象。 + +--- + +## 4. 关键接口 + +主要的交互入口是 `MemChat` 类(通常由 `MemChatFactory` 创建)。 + +### 4.1 初始化 +你需要先创建一个配置对象,然后通过工厂方法创建实例。创建后,必须将 `MemCube` 实例挂载到 `mem_chat.mem_cube` 上。 + +### 4.2 `run()` +启动一个交互式的命令行对话循环。适合开发调试,它会处理用户输入、调用记忆检索、生成回复并打印。 + +### 4.3 属性 +* **`mem_cube`**: 关联的记忆立方体对象。MemChat 通过它来读写记忆。 +* **`chat_llm`**: 用于生成回复的 LLM 实例。 + +--- + +## 5. 工作流程 + +MemChat 的一轮对话循环通常包含以下步骤: + +1. **接收输入 (Input)**: 获取用户的文本输入。 +2. **记忆检索 (Recall)**: (如果开启 `enable_textual_memory`) 使用用户输入作为 Query,从 `mem_cube.text_mem` 中检索 Top-K 条相关记忆。 +3. **构建提示词 (Prompt Construction)**: 将系统提示词、检索到的记忆、最近的对话历史 (History) 拼接成完整的 Prompt。 +4. **生成回复 (Generation)**: 调用 `chat_llm` 生成回复。 +5. **记忆提取与存储 (Memorization)**: (如果开启 `enable_textual_memory`) 将本轮对话 (User + Assistant) 发送给 `mem_cube` 的提取器,提取新记忆并存入数据库。 + +--- + +## 6. 开发示例 + +下面是一个完整的代码示例,展示了如何配置 MemChat,并挂载一个基于 Qdrant 和 OpenAI 的 MemCube。 + +### 6.1 代码实现 + +```python +import os +import sys + +# 确保 src 模块可以被导入 +sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), "../../../src"))) + +from memos.configs.mem_chat import MemChatConfigFactory +from memos.configs.mem_cube import GeneralMemCubeConfig +from memos.mem_chat.factory import MemChatFactory +from memos.mem_cube.general import GeneralMemCube + +def get_mem_chat_config() -> MemChatConfigFactory: + """生成 MemChat 配置""" + return MemChatConfigFactory.model_validate( + { + "backend": "simple", + "config": { + "user_id": "user_123", + "chat_llm": { + "backend": "openai", + "config": { + "model_name_or_path": os.getenv("MOS_CHAT_MODEL", "gpt-4o"), + "temperature": 0.8, + "max_tokens": 1024, + "api_key": os.getenv("OPENAI_API_KEY"), + "api_base": os.getenv("OPENAI_API_BASE"), + }, + }, + "max_turns_window": 20, + "top_k": 5, + "enable_textual_memory": True, # 开启显式记忆 + }, + } + ) + +def get_mem_cube_config() -> GeneralMemCubeConfig: + """生成 MemCube 配置""" + return GeneralMemCubeConfig.model_validate( + { + "user_id": "user03alice", + "cube_id": "user03alice/mem_cube_tree", + "text_mem": { + "backend": "general_text", + "config": { + "cube_id": "user03alice/mem_cube_general", + "extractor_llm": { + "backend": "openai", + "config": { + "model_name_or_path": os.getenv("MOS_CHAT_MODEL", "gpt-4o"), + "api_key": os.getenv("OPENAI_API_KEY"), + "api_base": os.getenv("OPENAI_API_BASE"), + }, + }, + "vector_db": { + "backend": "qdrant", + "config": { + "collection_name": "user03alice_mem_cube_general", + "vector_dimension": 1024, + }, + }, + "embedder": { + "backend": os.getenv("MOS_EMBEDDER_BACKEND", "universal_api"), + "config": { + "provider": "openai", + "api_key": os.getenv("MOS_EMBEDDER_API_KEY", "EMPTY"), + "model_name_or_path": os.getenv("MOS_EMBEDDER_MODEL", "bge-m3"), + "base_url": os.getenv("MOS_EMBEDDER_API_BASE"), + }, + }, + }, + }, 
+ } + ) + +def main(): + print("Initializing MemChat...") + mem_chat = MemChatFactory.from_config(get_mem_chat_config()) + + print("Initializing MemCube...") + mem_cube = GeneralMemCube(get_mem_cube_config()) + + # 关键步骤:挂载记忆立方体 + mem_chat.mem_cube = mem_cube + + print("Starting Chat Session...") + try: + mem_chat.run() + finally: + print("Saving memory cube...") + mem_chat.mem_cube.dump("new_cube_path") + +if __name__ == "__main__": + main() +``` + +--- + +## 7. 配置说明 + +在配置 `MemChatConfigFactory` 时,以下参数至关重要: + +* **`user_id`**: 必填。用于标识当前对话的用户,确保记忆的隔离性。 +* **`chat_llm`**: 对话模型配置。建议使用能力较强的模型(如 GPT-4o),以获得更好的回复质量和指令遵循能力。 +* **`enable_textual_memory`**: `True` / `False`。是否开启文本记忆。如果开启,系统会在对话前进行检索,并在对话后进行存储。 +* **`max_turns_window`**: 整数。对话历史保留的轮数。超过此限制的历史记录将被截断,从而依赖长期记忆来补充上下文。 +* **`top_k`**: 整数。每次从记忆库中检索多少条最相关的记忆片段注入到 Prompt 中。 diff --git a/docs/cn/open_source/modules/mem_cube.md b/docs/cn/open_source/modules/mem_cube.md new file mode 100644 index 00000000..b7fd92aa --- /dev/null +++ b/docs/cn/open_source/modules/mem_cube.md @@ -0,0 +1,290 @@ +--- +title: MemCube +desc: "MemCube 是你的“记忆收纳箱”,统一管理三种类型的记忆:明文记忆、激活记忆和参数化记忆。它提供简洁的接口,方便加载、保存和操作多个记忆模块,让开发者轻松构建、保存和共享记忆增强应用。" +--- +## 什么是 MemCube? + +**MemCube** 是一个容器,包含了三种主要类型的记忆: + +- **明文记忆** (例如,`GeneralTextMemory`、`TreeTextMemory`): 用于存储和检索非结构化或结构化文本知识。 +- **激活记忆** (例如,`KVCacheMemory`): 用于存储键值缓存以加速 LLM 推理和上下文重用。 +- **参数化记忆** (例如,`LoRAMemory`): 用于存储模型适应参数(如 LoRA 权重)。 + +每种记忆都可以独立配置,根据应用需求灵活组合。 + +## 结构 + +MemCube 由配置定义(参见 `GeneralMemCubeConfig`),该配置为每种记忆类型指定后端和配置。典型结构是: + +``` +MemCube + ├── user_id + ├── cube_id + ├── text_mem: TextualMemory + ├── act_mem: ActivationMemory + └── para_mem: ParametricMemory +``` + +所有记忆模块都可通过 MemCube 接口访问: + +- `mem_cube.text_mem` +- `mem_cube.act_mem` +- `mem_cube.para_mem` + +## View 架构 + +从 MemOS 2.0 开始,运行时操作(add/search)应通过 **View 架构**: + +### SingleCubeView + +用于管理单个 MemCube。当系统只需要一个记忆空间时使用。 + +```python +from memos.multi_mem_cube.single_cube import SingleCubeView + +view = SingleCubeView( + cube_id="my_cube", + naive_mem_cube=naive_mem_cube, + mem_reader=mem_reader, + mem_scheduler=mem_scheduler, + logger=logger, + searcher=searcher, + feedback_server=feedback_server, # 可选 +) + +# 添加记忆 +view.add_memories(add_request) + +# 搜索记忆 +view.search_memories(search_request) +``` + +### CompositeCubeView + +用于管理多个 MemCube。当需要跨多个记忆空间进行统一操作时使用。 + +```python +from memos.multi_mem_cube.composite_cube import CompositeCubeView + +# 创建多个 SingleCubeView +view1 = SingleCubeView(cube_id="cube_1", ...) +view2 = SingleCubeView(cube_id="cube_2", ...) 
+ +# 用于多 cube 操作的组合视图 +composite = CompositeCubeView(cube_views=[view1, view2], logger=logger) + +# 跨所有 cube 搜索 +results = composite.search_memories(search_request) +# 结果包含 cube_id 字段以标识来源 +``` + +### API 请求字段 + +#### 添加记忆(add模式) + +| 字段 | 描述 | +| --------------------- | ---------------------------------------------------------------- | +| `writable_cube_ids` | add 操作的目标 cube | +| `async_mode` | `"async"`(启用 scheduler 后台处理)或 `"sync"`(禁用 scheduler 同步处理) | + +#### 搜索记忆(search模式) + +| 字段 | 描述 | +| --------------------- | ---------------------------------------------------------------- | +| `readable_cube_ids` | search 操作的目标 cube | +| `async_mode` | `"async"`(启用 scheduler 后台处理)或 `"sync"`(禁用 scheduler 同步处理) | + +## 核心方法(GeneralMemCube) + +GeneralMemCube 是 MemCube 的标准实现,通过统一的接口管理系统的所有记忆。GeneralMemCube 提供以下核心方法来管理记忆数据的生命周期。 + +### 初始化 + +```python +from memos.mem_cube.general import GeneralMemCube +mem_cube = GeneralMemCube(config) +``` + +### 静态数据操作 + +| 方法 | 描述 | +| ----------------------------------------- | ----------------------------------------- | +| `init_from_dir(dir)` | 从本地目录加载 MemCube | +| `init_from_remote_repo(repo, base_url)` | 从远程仓库加载 MemCube(如 Hugging Face) | +| `load(dir)` | 从目录加载所有记忆到现有实例 | +| `dump(dir)` | 将所有记忆保存到目录以持久化 | + +## 文件存储 + +MemCube 保存后的目录包含以下文件,每个文件对应一种记忆类型: + +- `config.json` (MemCube 配置) +- `textual_memory.json` (明文记忆) +- `activation_memory.pickle` (激活记忆) +- `parametric_memory.adapter` (参数化记忆) + +## 使用示例 + +### 导出示例 (dump_cube.py) + +```python +import json +import os +import shutil + +from memos.api.handlers import init_server +from memos.api.product_models import APIADDRequest +from memos.log import get_logger +from memos.multi_mem_cube.single_cube import SingleCubeView + +logger = get_logger(__name__) +EXAMPLE_CUBE_ID = "example_dump_cube" +EXAMPLE_USER_ID = "example_user" + +# 1. 初始化服务 +components = init_server() +naive = components["naive_mem_cube"] + +# 2. 创建 SingleCubeView +view = SingleCubeView( + cube_id=EXAMPLE_CUBE_ID, + naive_mem_cube=naive, + mem_reader=components["mem_reader"], + mem_scheduler=components["mem_scheduler"], + logger=logger, + searcher=components["searcher"], + feedback_server=components["feedback_server"], +) + +# 3. 通过 View 添加记忆 +result = view.add_memories(APIADDRequest( + user_id=EXAMPLE_USER_ID, + writable_cube_ids=[EXAMPLE_CUBE_ID], + messages=[ + {"role": "user", "content": "This is a test memory"}, + {"role": "user", "content": "Another memory to persist"}, + ], + async_mode="sync", # 使用同步模式确保立即完成 +)) +print(f"✓ Added {len(result)} memories") + +# 4. 
导出特定 cube_id 的数据 +output_dir = "tmp/mem_cube_dump" +if os.path.exists(output_dir): + shutil.rmtree(output_dir) +os.makedirs(output_dir, exist_ok=True) + +# 导出图数据(仅导出当前 cube_id 的数据) +json_data = naive.text_mem.graph_store.export_graph( + include_embedding=True, # 包含 embedding 以支持语义搜索 + user_name=EXAMPLE_CUBE_ID, # 按 cube_id 过滤 +) + +# 修复 embedding 格式:将字符串解析为列表以兼容导入 +import contextlib +for node in json_data.get("nodes", []): + metadata = node.get("metadata", {}) + if "embedding" in metadata and isinstance(metadata["embedding"], str): + with contextlib.suppress(json.JSONDecodeError): + metadata["embedding"] = json.loads(metadata["embedding"]) + +print(f"✓ Exported {len(json_data.get('nodes', []))} nodes") + +# 保存到文件 +memory_file = os.path.join(output_dir, "textual_memory.json") +with open(memory_file, "w", encoding="utf-8") as f: + json.dump(json_data, f, indent=2, ensure_ascii=False) +print(f"✓ Saved to: {memory_file}") +``` + +### 导入与搜索示例 (load_cube.py) + +> **Embedding 兼容性说明**:示例数据使用 **bge-m3** 模型,维度为 **1024**。如果您的环境使用不同的 embedding 模型或维度,导入后的语义搜索可能不准确或失败。请确保您的 `.env` 配置与导出时的 embedding 配置一致。 + +```python +import json +import os + +from memos.api.handlers import init_server +from memos.api.product_models import APISearchRequest +from memos.log import get_logger +from memos.multi_mem_cube.single_cube import SingleCubeView + +logger = get_logger(__name__) +EXAMPLE_CUBE_ID = "example_dump_cube" +EXAMPLE_USER_ID = "example_user" + +# 1. 初始化服务 +components = init_server() +naive = components["naive_mem_cube"] + +# 2. 创建 SingleCubeView +view = SingleCubeView( + cube_id=EXAMPLE_CUBE_ID, + naive_mem_cube=naive, + mem_reader=components["mem_reader"], + mem_scheduler=components["mem_scheduler"], + logger=logger, + searcher=components["searcher"], + feedback_server=components["feedback_server"], +) + +# 3. 从文件加载数据到 graph_store +load_dir = "examples/data/mem_cube_tree" +memory_file = os.path.join(load_dir, "textual_memory.json") + +with open(memory_file, encoding="utf-8") as f: + json_data = json.load(f) + +naive.text_mem.graph_store.import_graph(json_data, user_name=EXAMPLE_CUBE_ID) + +nodes = json_data.get("nodes", []) +print(f"✓ Imported {len(nodes)} nodes") + +# 4. 显示加载的数据 +print(f"\nLoaded {len(nodes)} memories:") +for i, node in enumerate(nodes[:3], 1): # 显示前3条 + metadata = node.get("metadata", {}) + memory_text = node.get("memory", "N/A") + mem_type = metadata.get("memory_type", "unknown") + print(f" [{i}] Type: {mem_type}") + print(f" Content: {memory_text[:60]}...") + +# 5. 
语义搜索验证
+query = "test memory dump persistence demonstration"
+print(f'\nSearching: "{query}"')
+
+search_result = view.search_memories(
+    APISearchRequest(
+        user_id=EXAMPLE_USER_ID,
+        readable_cube_ids=[EXAMPLE_CUBE_ID],
+        query=query,
+    )
+)
+
+text_mem_results = search_result.get("text_mem", [])
+memories = []
+for group in text_mem_results:
+    memories.extend(group.get("memories", []))
+
+print(f"✓ Found {len(memories)} relevant memories")
+for i, mem in enumerate(memories[:2], 1):  # 显示前2条
+    print(f"  [{i}] {mem.get('memory', 'N/A')[:60]}...")
+```
+
+### 完整示例
+
+参见代码仓库中的示例:
+
+- `MemOS/examples/mem_cube/dump_cube.py` - 导出 MemCube 数据(add + export)
+- `MemOS/examples/mem_cube/load_cube.py` - 导入 MemCube 数据并进行语义搜索(import + search)
+
+### 旧 API 说明
+
+早期版本中直接调用 `mem_cube.text_mem.get_all()` 的方式已废弃,请使用 View 架构。旧示例已移至 `MemOS/examples/mem_cube/_deprecated/`。
+
+## 开发者说明
+
+* MemCube 强制执行模式一致性,确保安全的加载/转储
+* 每种记忆类型都是可插拔的,支持独立测试
+* 参见 `/tests/mem_cube/` 了解集成测试和使用模式
diff --git a/docs/cn/open_source/modules/mem_feedback.md b/docs/cn/open_source/modules/mem_feedback.md
new file mode 100644
index 00000000..31bd0042
--- /dev/null
+++ b/docs/cn/open_source/modules/mem_feedback.md
@@ -0,0 +1,155 @@
+---
+title: MemFeedback
+desc: "MemFeedback 是你的“记忆错题本”,让你的 Agent 能够听懂“你记错了”,并自动修正记忆库。它是实现记忆自进化的关键组件。"
+---
+
+## 1. 简介
+
+**MemFeedback** 是 MemOS 的“后悔药”。
+
+在长时记忆系统中,最头疼的往往不是“记不住”,而是“记错了改不掉”。当用户说“不,我的生日是明天”或者“把这个项目的代号改成 X”时,简单的 RAG 系统通常无能为力。
+
+MemFeedback 能够听懂这些自然语言指令,自动去数据库里精准定位冲突的记忆,并执行原子级的修正操作(比如把旧记忆归档、写入新记忆)。通过它,你的 Agent 能够像人一样在交流中不断纠错和学习。
+
+---
+
+## 2. 核心能力
+
+它能处理四种常见的反馈场景:
+
+### 纠错 (Correction)
+用户指出事实错误。系统不会粗暴地删除旧数据,而是将其**归档 (Archive)**,并写入新数据。这样既修正了错误,又保留了版本历史(Traceability)。如果是正在进行的对话(WorkingMemory),则直接原地更新,保证上下文连贯。
+
+### 补充 (Addition)
+如果用户只是补充了新信息,且与旧记忆不冲突,那就很简单——直接作为新节点存入记忆库。
+
+### 全局替换 (Keyword Replacement)
+类似于 IDE 里的“全局重构”。比如用户说“把所有文档里的‘张三’都改成‘李四’”,系统会结合 Reranker 自动圈定受影响的文档范围,批量更新所有相关记忆。
+
+### 偏好进化 (Preference Evolution)
+专门处理“我不吃香菜”、“我喜欢 Python”这类偏好。系统会记录下这个偏好产生的场景,不断丰富用户画像,让 Agent 越用越顺手。
+
+---
+
+## 3. 代码结构
+
+核心逻辑都在 `memos/src/memos/mem_feedback/` 下。
+
+* **`simple_feedback.py`**: **推荐直接看这个**。它是官方封装好的版本,把 LLM、向量数据库、检索器都组装好了,开箱即用。
+* **`feedback.py`**: 核心实现类 `MemFeedback`。脏活累活都在这儿:意图识别、冲突比对、安全风控。
+* **`base.py`**: 接口定义。
+* **`utils.py`**: 工具箱。
+
+---
+
+## 4. 关键接口
+
+主入口就一个:`process_feedback()`。通常在 RAG 流程结束、用户给出反馈后异步调用。
+
+### 4.1 输入参数
+
+| 参数 | 说明 |
+| :--- | :--- |
+| `user_id` / `user_name` | 用户标识与 Cube ID。 |
+| `chat_history` | 对话历史,让 LLM 知道你们刚才聊了啥。 |
+| `feedback_content` | 用户说的那句反馈(比如“不对,是五点”)。 |
+| **`retrieved_memory_ids`** | **可选,但强烈建议提供**。把上一轮 RAG 检索到的记忆 ID 传进来,相当于给了系统一个“靶子”,告诉它要修正哪条记忆。如果不传,系统得自己去海量记忆里重新搜,不仅慢,还容易改错。 |
+| `corrected_answer` | 是否顺便生成一句修正后的回复。 |
+
+### 4.2 输出结果
+
+返回一个字典,告诉你这次操作改了什么:
+* **`record`**: 数据库变更明细(比如 `{ "add": [...], "update": [...] }`)。
+* **`answer`**: 给用户的自然语言回复。
+
+---
+
+## 5. 工作流程
+
+MemFeedback 的工作流程像是一个严谨的编辑部:
+
+1. **审稿 (意图识别)**: 先看用户是在纠错、补充信息,还是在改名。
+2. **定位 (召回)**: 找到要修改的那条记忆(如果你传了 ID,这步就省了)。
+3. **校对 (比对)**: 让 LLM 仔细比对新旧信息,确定是完全新增 (ADD) 还是需要更新 (UPDATE)。
+4. **风控 (安全检查)**: 防止 LLM 瞎改。比如 ID 对不对?是不是要把一篇长文档全删了?(会有阈值拦截)。
+5. **出版 (写入)**: 最后执行图数据库操作,归档旧的,写入新的。
+
+---
+
+## 6. 
开发示例
+
+这里有一份示例代码清单,展示了如何初始化服务、预置一个“错误记忆”,然后通过用户反馈将其修正。
+
+### 6.1 准备工作
+
+首先,我们需要初始化 `SimpleMemFeedback` 服务。
+
+```python
+# 假设 llm, embedder, graph_db 等组件已通过 Factory 初始化完成
+# 完整初始化代码请参考 examples/mem_feedback/example_feedback.py
+
+from memos.mem_feedback.simple_feedback import SimpleMemFeedback
+
+feedback_server = SimpleMemFeedback(
+    llm=llm,
+    embedder=embedder,
+    graph_store=graph_db,
+    memory_manager=memory_manager,
+    mem_reader=mem_reader,
+    searcher=searcher,
+    reranker=mem_reranker,
+    pref_mem=None,
+)
+```
+
+### 6.2 模拟场景与执行反馈
+
+场景:系统错误地记住了“你喜欢苹果,不喜欢香蕉”,现在我们要纠正它。
+
+```python
+import json
+from memos.mem_feedback.utils import make_mem_item
+
+# 1. 模拟对话历史
+# 用户问偏好,助手答错了
+history = [
+    {"role": "user", "content": "我喜欢什么水果,不喜欢什么水果"},
+    {"role": "assistant", "content": "你喜欢苹果,不喜欢香蕉"},
+]
+
+# 2. 预置“错误记忆”
+# 我们手动往库里塞一条错误的事实
+mem_text = "你喜欢苹果,不喜欢香蕉"
+# ... (省略 make_mem_item 的详细参数,见源码) ...
+memory_manager.add([make_mem_item(mem_text, ...)], ...)
+
+# 3. 用户反馈
+feedback_content = "错了,实际上我喜欢的是山竹"
+print(f"Feedback Input: {feedback_content}")
+
+# 4. 执行修正
+# MemFeedback 会发现冲突,把旧记忆归档,写入新记忆“喜欢山竹”
+# 其余参数(user_id、retrieved_memory_ids 等,见第 4.1 节)此处从略
+res = feedback_server.process_feedback(
+    chat_history=history,
+    feedback_content=feedback_content,
+)
+
+# 5. 查看结果
+print(json.dumps(res, indent=4))
+```
+
+---
+
+## 7. 配置说明
+
+要让 MemFeedback 转起来,你需要准备好以下组件的配置(通常在 `.env` 或 YAML 里):
+
+* **LLM (`extractor_llm`)**: 脑子要好使,建议用 GPT-4o 级别的模型。Temperature 设低点(比如 0),因为它要干的是逻辑分析,不需要太发散。
+* **Embedder (`embedder`)**: 用于把新记忆变成向量。
+* **GraphDB (`graph_db`) 与 MemoryManager (`memory_manager`)**: 记忆存在哪、怎么存,这两兄弟负责。
+* **MemReader (`mem_reader`)**: 如果是纯新增的记忆,用它来解析。
+
+
+---
diff --git a/docs/cn/open_source/modules/mem_reader.md b/docs/cn/open_source/modules/mem_reader.md
new file mode 100644
index 00000000..81b024a6
--- /dev/null
+++ b/docs/cn/open_source/modules/mem_reader.md
@@ -0,0 +1,181 @@
+---
+title: "MemReader"
+desc: "MemReader 是你的“记忆翻译官”,它负责把杂乱的输入(聊天、文档、图片)翻译成系统能理解的、结构化的记忆片段。"
+---
+
+## 1. 简介
+
+在构建 AI 应用时,我们经常遇到这样的问题:用户发来的东西千奇百怪——有的是随口的聊天,有的是 PDF 文档,有的是图片。**MemReader** 的作用就是把这些原始数据(Raw Data)“嚼碎”并“消化”,变成带有 Embedding 和元数据的标准记忆块(Memory Item)。
+
+简单来说,它做三件事:
+1. **归一化**:不管你发来的是字符串还是 JSON,先统一变成标准格式。
+2. **切片 (Chunking)**:把长对话或长文档切成合适的小块,方便后续处理。
+3. **精炼 (Extraction)**:调用 LLM 把非结构化的信息提取成结构化的知识点(Fine 模式),或者直接生成快照(Fast 模式)。
+
+---
+
+## 2. 核心模式
+
+MemReader 设计了两种工作模式,分别对应“快”和“准”两种需求:
+
+### ⚡ Fast 模式(唯快不破)
+* **特点**:**不调用 LLM**,只做切片和 Embedding。
+* **适用场景**:
+  * 用户发消息飞快,系统需要毫秒级响应。
+  * 只需保留对话的“快照”,不需要深度理解。
+* **产物**:原始文本片段 + 向量索引 + 来源追踪 (Sources)。
+
+### 🧠 Fine 模式(精雕细琢)
+* **特点**:**调用 LLM** 进行深度分析。
+* **适用场景**:
+  * 长时记忆写入(需要提取关键事实)。
+  * 文档分析(需要总结核心观点)。
+  * 多模态理解(需要看懂图片里的内容)。
+* **产物**:结构化的事实 + 关键信息提取 (Key) + 背景 (Background) + 向量索引 + 来源追踪 (Sources) + 多模态细节。
+
+---
+
+## 3. 代码结构
+
+MemReader 的代码结构非常清晰,主要由以下几部分组成:
+
+* **`base.py`**: 定义了所有 Reader 必须遵守的接口规范。
+* **`simple_struct.py`**: **最常用的实现**。专攻纯文本对话和本地文档,轻量高效。
+* **`multi_modal_struct.py`**: **全能型选手**。能处理图片、文件 URL、Tool 调用等复杂输入。
+* **`read_multi_modal/`**: 存放了各种具体的解析器(Parser),比如专门解析图片的 `ImageParser`,解析文件的 `FileParser` 等。
+
+---
+
+## 4. 如何选择?
+
+| 你的需求 | 推荐选择 | 理由 |
+| :--- | :--- | :--- |
+| **只处理纯文本对话** | `SimpleStructMemReader` | 简单、直接、性能好。 |
+| **需要处理图片、文件链接** | `MultiModalStructMemReader` | 内置了多模态解析能力。 |
+| **需要从 Fast 升级到 Fine** | 任意 Reader 的 `fine_transfer` 方法 | 支持“先存后优”的渐进式策略。 |
+
+---
+
+## 5. 
API 概览 + +### 统一工厂:`MemReaderFactory` + +不要自己去 `new` 对象,使用工厂模式是最佳实践: + +```python +from memos.configs.mem_reader import MemReaderConfigFactory +from memos.mem_reader.factory import MemReaderFactory + +# 从配置创建 Reader +cfg = MemReaderConfigFactory.model_validate({...}) +reader = MemReaderFactory.from_config(cfg) +``` + +### 核心方法:`get_memory()` + +这是你最常调用的方法。 + +```python +memories = reader.get_memory( + scene_data, # 你的输入数据 + type="chat", # 类型:chat 或 doc + info=user_info, # 用户信息(user_id, session_id) + mode="fine" # 模式:fast 或 fine(强烈建议显式指定!) +) +``` + +**返回结果**:`list[list[TextualMemoryItem]]` + +::note{icon="ri:bnb-fill"} +为什么是双层列表? +因为一个长对话可能会被切成多个窗口(Window),外层列表代表窗口,内层列表代表该窗口提取出的记忆项。 +:: + +--- + +## 6. 开发实战 + +### 场景一:处理简单的聊天记录 + +这是最基础的用法,使用 `SimpleStructMemReader`。 + +```python +# 1. 准备输入:标准的 OpenAI 格式对话 +conversation = [ + [ + {"role": "user", "content": "我明天下午 3 点有个会"}, + {"role": "assistant", "content": "会议主题是什么?"}, + {"role": "user", "content": "讨论 Q4 项目截止日期"}, + ] +] + +# 2. 提取记忆 (Fine 模式) +memories = reader.get_memory( + conversation, + type="chat", + mode="fine", + info={"user_id": "u1", "session_id": "s1"} +) + +# 3. 结果 +# memories 里会包含提取出的事实,例如:"用户明天下午3点有关于Q4项目的会议" +``` + +### 场景二:处理多模态输入 + +当用户发来图片或文件链接时,切换到 `MultiModalStructMemReader`。 + +```python +# 1. 准备输入:包含文件和图片的复杂消息 +scene_data = [ + [ + { + "role": "user", + "content": [ + {"type": "text", "text": "看看这个文件和图片"}, + # 文件支持 URL 自动下载解析 + {"type": "file", "file": {"file_data": "https://example.com/readme.md"}}, + # 图片支持 URL + {"type": "image_url", "image_url": {"url": "https://example.com/chart.png"}}, + ] + } + ] +] + +# 2. 提取记忆 +memories = multimodal_reader.get_memory( + scene_data, + type="chat", + mode="fine", # 只有 Fine 模式才会调用视觉模型解析图片 + info={"user_id": "u1", "session_id": "s1"} +) +``` + +### 场景三:渐进式优化 (Fine Transfer) + +为了用户体验,你可以先用 Fast 模式快速存下对话,等系统空闲时再把它“精炼”成 Fine 记忆。 + +```python +# 1. 先快速存(毫秒级) +fast_memories = reader.get_memory(conversation, mode="fast", ...) + +# ... 存入数据库 ... + +# 2. 后台异步精炼 +refined_memories = reader.fine_transfer_simple_mem( + fast_memories_flat_list, # 注意这里传入的是展平后的 Item 列表 + type="chat" +) + +# 3. 用 refined_memories 替换掉原来的 fast_memories +``` + +--- + +## 7. 配置项说明 + +在 `.env` 或配置文件中,你可以调整以下关键参数: + +* **`chat_window_max_tokens`**: **滑窗大小**。默认 1024。决定了多少上下文会被打包在一起处理。设得太小容易丢失语境,设得太大容易超出 LLM 的 Token 限制。 +* **`remove_prompt_example`**: **是否移除 Prompt 里的示例**。True = 省 Token 但可能降低提取质量;False = 保留示例提高准确度但消耗更多 Token(保留 Few-shot 示例)。 +* **`direct_markdown_hostnames`** (仅多模态): **域名白名单**。列表中的域名(如 `raw.githubusercontent.com`)会被直接当作 Markdown 文本处理,跳过 OCR/格式转换步骤,加速处理。 diff --git a/docs/cn/open_source/modules/mem_scheduler.md b/docs/cn/open_source/modules/mem_scheduler.md new file mode 100644 index 00000000..15215e33 --- /dev/null +++ b/docs/cn/open_source/modules/mem_scheduler.md @@ -0,0 +1,495 @@ +--- +title: "MemScheduler" +desc: "MemScheduler 是你的“记忆组织调度器”,它在后台异步管理记忆的流转和更新,协调工作记忆、长时记忆和激活记忆之间的交互,让对话系统能够动态地组织和利用记忆。" +--- + +## 主要特性 + +- 🚀 **与 MemOS 系统并发操作**:独立线程/进程运行,不阻塞主业务逻辑。 +- 🧠 **多记忆协调**:智能管理工作记忆、长时记忆和用户个性化记忆的流转。 +- ⚡ **事件驱动调度**:基于消息队列(Redis/Local)的异步任务分发机制。 +- 🔍 **高效检索**:集成向量检索与图谱检索,快速定位相关记忆。 +- 📊 **全面监控**:实时监控记忆使用率、任务队列状态和调度延迟。 +- 📝 **详细日志记录**:全链路追踪记忆操作,便于调试和系统分析。 + +## MemScheduler 架构 + +`MemScheduler` 采用模块化架构,分为三层: + +### 调度层(核心) +1. **调度器(路由器)**:智能消息路由器,根据消息类型(`QUERY`, `ANSWER`, `MEM_UPDATE` 等)将任务分发给对应的处理器。 +2. **消息处理**:通过带有特定标签(Label)的消息驱动业务逻辑,定义消息格式和处理规则。 + +### 执行层(保障) +3. **任务队列**:支持 Redis Stream(生产环境)和 Local Queue(开发测试)两种模式,提供异步任务缓冲和持久化。 +4. **记忆管理**:执行三层记忆(Working/Long-term/User)的读写、压缩、遗忘和类型转换。 +5. 
**检索系统**:混合检索模块,结合用户意图、场景管理与关键词匹配,快速定位相关记忆。 + +### 支撑层(辅助) +6. **监控**:跟踪任务积压、处理耗时和记忆库健康状态。 +7. **日志记录**:维护全链路记忆操作日志,便于调试和分析。 + +## MemScheduler 初始化 + +在 MemOS 的架构中,`MemScheduler` 是作为服务器组件的一部分在启动时被初始化的。 + +### 在 Server Router 中初始化 + +在 `src/memos/api/routers/server_router.py` 中,调度器通过 `init_server()` 函数被自动加载: + +```python +from memos.api import handlers +from memos.api.handlers.base_handler import HandlerDependencies +from memos.mem_scheduler.base_scheduler import BaseScheduler +from memos.mem_scheduler.utils.status_tracker import TaskStatusTracker + +# ... 其他导入 ... + +# 1. 初始化所有服务器组件 (包括 DB, LLM, Memory, Scheduler) +# init_server() 会读取环境变量并初始化全局单例组件 +components = handlers.init_server() + +# Create dependency container for handlers +dependencies = HandlerDependencies.from_init_server(components) + +# Initialize handlers... +# search_handler = SearchHandler(dependencies) +# ... + +# 2. 从组件字典中获取调度器实例 +# 调度器在 init_server 内部已经被初始化并启动(如果启用了的话) +mem_scheduler: BaseScheduler = components["mem_scheduler"] + +# 3. 用户还可以在components中获取其他调度相关组件 (可选,用于自定义任务处理) +# redis_client 用于直接操作 Redis 或监控任务状态 +redis_client = components["redis_client"] +# ... +``` + + +## 调度任务与数据模型 + +调度器通过消息驱动的方式分发和执行任务。本节介绍支持的任务类型、消息结构和执行日志。 + +### 消息类型与处理器 + +调度器通过注册特定的任务标签(Label)与处理器(Handler)来分发和执行任务。以下是当前版本(基于 `GeneralScheduler` 和 `OptimizedScheduler`)默认支持的调度任务: + +| 消息标签 (Label) | 对应常量 | 处理器方法 | 描述 | +| :--- | :--- | :--- | :--- | +| `query` | `QUERY_TASK_LABEL` | `_query_message_consumer` | 处理用户查询,触发意图识别、记忆检索,并将其转换为记忆更新任务。 | +| `answer` | `ANSWER_TASK_LABEL` | `_answer_message_consumer` | 处理 AI 回复,记录对话日志。 | +| `mem_update` | `MEM_UPDATE_TASK_LABEL` | `_memory_update_consumer` | 核心任务。执行长时记忆的更新流程,包括提取 Query Keyword、更新 Monitor、检索相关记忆并替换工作记忆(Working Memory)。 | +| `add` | `ADD_TASK_LABEL` | `_add_message_consumer` | 处理新记忆的添加日志记录(支持本地和云端日志)。 | +| `mem_read` | `MEM_READ_TASK_LABEL` | `_mem_read_message_consumer` | 使用 `MemReader` 深度处理和导入外部记忆内容。 | +| `mem_organize` | `MEM_ORGANIZE_TASK_LABEL` | `_mem_reorganize_message_consumer` | 触发记忆的重组和合并(Merge)操作。 | +| `pref_add` | `PREF_ADD_TASK_LABEL` | `_pref_add_message_consumer` | 处理用户偏好记忆(Preference Memory)的提取和添加。 | +| `mem_feedback` | `MEM_FEEDBACK_TASK_LABEL` | `_mem_feedback_message_consumer` | 处理用户反馈,用于修正记忆或强化偏好。 | +| `api_mix_search` | `API_MIX_SEARCH_TASK_LABEL` | `_api_mix_search_message_consumer` | (OptimizedScheduler 特有) 执行异步混合搜索任务,结合快速检索与精细检索。 | + +### 消息数据结构 (ScheduleMessageItem) + +调度器使用统一的 `ScheduleMessageItem` 结构在队列中传递消息。 + +> **注意**:`mem_cube` 对象本身不直接包含在消息模型中,而是通过 `mem_cube_id` 在运行时由调度器解析。 + +| 字段 | 类型 | 描述 | 默认值/备注 | +| :--- | :--- | :--- | :--- | +| `item_id` | `str` | 消息唯一标识符 (UUID) | 自动生成 | +| `user_id` | `str` | 关联的用户 ID | (必需) | +| `mem_cube_id` | `str` | 关联的 Memory Cube ID | (必需) | +| `label` | `str` | 任务标签 (如 `query`, `mem_update`) | (必需) | +| `content` | `str` | 消息载荷 (通常为 JSON 字符串或文本) | (必需) | +| `timestamp` | `datetime` | 消息提交时间 | 自动生成 (UTC now) | +| `session_id` | `str` | 会话 ID,用于上下文隔离 | `""` | +| `trace_id` | `str` | 链路追踪 ID,用于全链路日志关联 | 自动生成 | +| `user_name` | `str` | 用户显示名称 | `""` | +| `task_id` | `str` | 业务级任务 ID (用于关联多个消息) | `None` | +| `info` | `dict` | 额外的自定义上下文信息 | `None` | +| `stream_key` | `str` | (内部使用) Redis Stream 的键名 | `""` | + +### 执行日志结构 (ScheduleLogForWebItem) + +调度器会生成用于前端展示或持久化存储的结构化日志消息。 + +| 字段 | 类型 | 描述 | 备注 | +| :--- | :--- | :--- | :--- | +| `item_id` | `str` | 日志条目唯一标识符 | 自动生成 | +| `task_id` | `str` | 关联的父任务 ID | 可选 | +| `user_id` | `str` | 用户 ID | (必需) | +| `mem_cube_id` | `str` | Memory Cube ID | (必需) | +| `label` | `str` | 日志类别 (如 
`addMessage`, `addMemory`) | (必需) |
+| `log_content` | `str` | 简短的日志描述文本 | (必需) |
+| `from_memory_type` | `str` | 源记忆区域 | 如 `UserInput`, `LongTermMemory` |
+| `to_memory_type` | `str` | 目标记忆区域 | 如 `WorkingMemory` |
+| `memcube_log_content` | `list[dict]` | 结构化的详细内容 | 包含具体的记忆文本、引用 ID 等 |
+| `metadata` | `list[dict]` | 记忆项元数据 | 包含置信度、状态、标签等 |
+| `status` | `str` | 任务状态 | 如 `completed`, `failed` |
+| `timestamp` | `datetime` | 日志创建时间 | 自动生成 |
+| `current_memory_sizes` | `MemorySizes` | 当前各区域记忆数量快照 | 用于监控面板展示 |
+| `memory_capacities` | `MemoryCapacities` | 各区域记忆容量限制 | 用于监控面板展示 |
+
+## 调度功能示例
+
+### 1. 消息处理与自定义 Handler
+
+调度器最强大的功能是支持注册自定义的消息处理器(Handler)。你可以定义特定类型的消息(如 `MY_CUSTOM_TASK`),并编写函数来处理它。
+
+```python
+import time
+import uuid
+from datetime import datetime
+
+# 1. 导入必要的类型定义和调度器实例
+# 注意:mem_scheduler 需要从 server_router 导入,因为它是一个全局单例
+from memos.api.routers.server_router import mem_scheduler
+from memos.mem_scheduler.schemas.message_schemas import ScheduleMessageItem

+# 定义一个自定义的任务标签
+MY_TASK_LABEL = "MY_CUSTOM_TASK"
+
+
+# 定义处理器函数
+def my_task_handler(messages: list[ScheduleMessageItem]):
+    """
+    处理自定义任务的函数
+    """
+    for msg in messages:
+        print(f"⚡️ [Handler] 收到任务: {msg.item_id}")
+        print(f"📦 内容: {msg.content}")
+        # 在这里执行你的业务逻辑,例如:调用 LLM、写数据库、触发其他任务等
+
+
+# 2. 注册处理器到调度器
+# 这一步将您的自定义逻辑挂载到调度系统中
+mem_scheduler.register_handlers({
+    MY_TASK_LABEL: my_task_handler
+})
+
+# 3. 提交任务
+task = ScheduleMessageItem(
+    item_id=str(uuid.uuid4()),
+    user_id="user_123",
+    mem_cube_id="cube_001",
+    label=MY_TASK_LABEL,
+    content="这是一条测试消息",
+    timestamp=datetime.now()
+)
+
+# 如果调度器未启动,这里会直接放入队列等待处理(如果是 Redis 队列)
+# 或者在本地队列模式下可能需要先调用 mem_scheduler.start()
+mem_scheduler.submit_messages([task])
+
+print(f"Task submitted: {task.item_id}")
+
+# 防止调度器主进程提前退出
+time.sleep(10)
+```
+
+### 2. Redis 队列 vs 本地队列
+
+- **本地队列 (Local Queue)**:
+  - **适用场景**:单元测试、简单的单机脚本。
+  - **特点**:速度快,但进程重启后数据丢失,不支持多进程/多实例共享。
+  - **配置**:`MOS_SCHEDULER_USE_REDIS_QUEUE=false`
+
+- **Redis 队列 (Redis Stream)**:
+  - **适用场景**:生产环境、分布式部署。
+  - **特点**:数据持久化,支持消费者组(Consumer Group),允许多个调度器实例共同处理任务(负载均衡)。
+  - **配置**:`MOS_SCHEDULER_USE_REDIS_QUEUE=true`
+  - **调试**:可以使用 `show_redis_status.py` 脚本查看队列堆积情况。
+
+## 综合应用场景
+
+### 场景 1: 基础对话流与记忆更新
+
+以下是一个完善的示例,展示了如何初始化环境、注册自定义逻辑、模拟对话流以及触发记忆更新。
+
+```python
+import asyncio
+import json
+import os
+import sys
+import time
+from pathlib import Path
+
+# --- 环境准备 ---
+# 1. 设置项目根目录到 sys.path,确保能导入 memos 模块
+FILE_PATH = Path(__file__).absolute()
+BASE_DIR = FILE_PATH.parent.parent.parent
+sys.path.insert(0, str(BASE_DIR))
+
+# 2. 设置必要的环境变量 (模拟 .env 配置)
+os.environ["ENABLE_CHAT_API"] = "true"
+os.environ["MOS_ENABLE_SCHEDULER"] = "true"
+# 决定使用 Redis 还是 Local 队列
+os.environ["MOS_SCHEDULER_USE_REDIS_QUEUE"] = "false"
+
+# --- 导入组件 ---
+# 注意:导入 server_router 会触发组件初始化,确保环境变量在此之前设置好
+from memos.api.product_models import APIADDRequest, ChatPlaygroundRequest
+from memos.api.routers.server_router import (
+    add_handler,
+    chat_stream_playground,
+    mem_scheduler,  # 这里的 mem_scheduler 已经是初始化好的单例
+)
+from memos.log import get_logger
+from memos.mem_scheduler.schemas.message_schemas import ScheduleMessageItem
+from memos.mem_scheduler.schemas.task_schemas import (
+    MEM_UPDATE_TASK_LABEL,
+    QUERY_TASK_LABEL,
+)
+
+logger = get_logger(__name__)
+
+# 全局变量用于演示记忆检索结果
+working_memories = []
+
+# --- 自定义处理器 ---
+
+def custom_query_handler(messages: list[ScheduleMessageItem]):
+    """
+    处理用户查询消息:
+    1. 打印查询内容
+    2. 
将消息转换为 MEM_UPDATE 任务,触发记忆检索/更新流程 + """ + for msg in messages: + print(f"\n[Scheduler 🟢] 收到用户查询: {msg.content}") + + # 复制消息并将标签改为 MEM_UPDATE,这是一种常见的“任务链”模式 + new_msg = msg.model_copy(update={"label": MEM_UPDATE_TASK_LABEL}) + + # 提交新任务回调度器 + mem_scheduler.submit_messages([new_msg]) + + +def custom_mem_update_handler(messages: list[ScheduleMessageItem]): + """ + 处理记忆更新任务: + 1. 使用检索器 (Retriever) 查找相关记忆 + 2. 更新全局的工作记忆列表 + """ + global working_memories + search_args = {} + top_k = 2 + + for msg in messages: + print(f"[Scheduler 🔵] 正在为查询检索记忆...") + # 调用核心检索功能 + results = mem_scheduler.retriever.search( + query=msg.content, + user_id=msg.user_id, + mem_cube_id=msg.mem_cube_id, + mem_cube=mem_scheduler.current_mem_cube, + top_k=top_k, + method=mem_scheduler.search_method, + search_args=search_args, + ) + + # 模拟工作记忆的更新 + working_memories.extend(results) + working_memories = working_memories[-5:] # 保持最新的5条 + + for mem in results: + # 打印检索到的记忆片段 + print(f" ↳ [Memory Found]: {mem.memory[:50]}...") + +# --- 模拟业务数据 --- + +def get_mock_data(): + """生成模拟对话数据""" + conversations = [ + {"role": "user", "content": "I just adopted a golden retriever puppy named Max."}, + {"role": "assistant", "content": "That's exciting! Max is a great name."}, + {"role": "user", "content": "He loves peanut butter treats but I am allergic to nuts."}, + {"role": "assistant", "content": "Noted. Peanut butter for Max, no nuts for you."}, + ] + + questions = [ + {"question": "What is my dog's name?", "category": "Pet"}, + {"question": "What am I allergic to?", "category": "Allergy"}, + ] + return conversations, questions + +# --- 主流程 --- + +async def run_demo(): + print("==== MemScheduler Demo Start ====") + conversations, questions = get_mock_data() + + user_id = "demo_user_001" + mem_cube_id = "cube_demo_001" + + print(f"1. 初始化用户记忆库 ({user_id})...") + # 使用 API Handler 添加初始记忆 (同步模式) + add_req = APIADDRequest( + user_id=user_id, + writable_cube_ids=[mem_cube_id], + messages=conversations, + async_mode="sync", + ) + add_handler.handle_add_memories(add_req) + print(" 记忆添加完成。") + + print("\n2. 开始对话测试 (并在后台触发调度任务)...") + for item in questions: + query = item["question"] + print(f"\n>> User: {query}") + + # 发起聊天请求 + chat_req = ChatPlaygroundRequest( + user_id=user_id, + query=query, + readable_cube_ids=[mem_cube_id], + writable_cube_ids=[mem_cube_id], + ) + + # 获取流式响应 + response = chat_stream_playground(chat_req) + + # 处理流式输出 (简化版) + full_answer = "" + buffer = "" + async for chunk in response.body_iterator: + if isinstance(chunk, bytes): + chunk = chunk.decode("utf-8") + buffer += chunk + while "\n\n" in buffer: + msg, buffer = buffer.split("\n\n", 1) + for line in msg.split("\n"): + if line.startswith("data: "): + try: + data = json.loads(line[6:]) + if data.get("type") == "text": + full_answer += data["data"] + except: pass + + print(f">> AI: {full_answer}") + + # 等待一小会儿让后台调度器处理任务并打印日志 + await asyncio.sleep(1) + +if __name__ == "__main__": + # 1. 注册我们的自定义 Handler + # 这会覆盖或添加到默认的调度逻辑中 + mem_scheduler.register_handlers( + { + QUERY_TASK_LABEL: custom_query_handler, + MEM_UPDATE_TASK_LABEL: custom_mem_update_handler, + } + ) + + # 2. 
确保调度器已启动 + if not mem_scheduler._running: + mem_scheduler.start() + + try: + asyncio.run(run_demo()) + except KeyboardInterrupt: + pass + finally: + # 防止调度器主进程提前退出 + time.sleep(10) + + print("\n==== 停止调度器 ====") + mem_scheduler.stop() +``` + +### 场景 2: 异步任务并发与断点重启 (Redis) + +该示例展示了如何使用 Redis 队列实现异步任务的并发处理以及断点重启功能。运行此示例需要配置 Redis 环境。 + +```python +from pathlib import Path +from time import sleep + +from memos.api.routers.server_router import mem_scheduler +from memos.mem_scheduler.schemas.message_schemas import ScheduleMessageItem + + +# 调试:打印调度器配置 +print("=== Scheduler Configuration Debug ===") +print(f"Scheduler type: {type(mem_scheduler).__name__}") +print(f"Config: {mem_scheduler.config}") +print(f"use_redis_queue: {mem_scheduler.use_redis_queue}") +print(f"Queue type: {type(mem_scheduler.memos_message_queue).__name__}") +print(f"Queue maxsize: {getattr(mem_scheduler.memos_message_queue, 'maxsize', 'N/A')}") +print("=====================================\n") + +queue = mem_scheduler.memos_message_queue + + +# 定义处理函数 +def my_test_handler(messages: list[ScheduleMessageItem]): + print(f"My test handler received {len(messages)} messages: {[one.item_id for one in messages]}") + for msg in messages: + # 根据 task_id 创建文件(使用 item_id 作为数字 ID 0..99) + task_id = str(msg.item_id) + file_path = tmp_dir / f"{task_id}.txt" + try: + sleep(5) + file_path.write_text(f"Task {task_id} processed.\n") + print(f"writing {file_path} done") + except Exception as e: + print(f"Failed to write {file_path}: {e}") + + +def submit_tasks(): + mem_scheduler.memos_message_queue.clear() + + # 创建 100 条消息(task_id 0..99) + users = ["user_A", "user_B"] + messages_to_send = [ + ScheduleMessageItem( + item_id=str(i), + user_id=users[i % 2], + mem_cube_id="test_mem_cube", + label=TEST_HANDLER_LABEL, + content=f"Create file for task {i}", + ) + for i in range(100) + ] + # 批量提交消息并打印完成信息 + print(f"Submitting {len(messages_to_send)} messages to the scheduler...") + mem_scheduler.memos_message_queue.submit_messages(messages_to_send) + print(f"Task submission done! tasks in queue: {mem_scheduler.get_tasks_status()}") + + +# 注册处理函数 +TEST_HANDLER_LABEL = "test_handler" +mem_scheduler.register_handlers({TEST_HANDLER_LABEL: my_test_handler}) + +# 5秒重启 +mem_scheduler.orchestrator.tasks_min_idle_ms[TEST_HANDLER_LABEL] = 5_000 + +tmp_dir = Path("./tmp") +tmp_dir.mkdir(exist_ok=True) + +# 测试停止并重启:如果 tmp 中已有 >1 个文件,跳过提交并打印信息 +existing_count = len(list(Path("tmp").glob("*.txt"))) if Path("tmp").exists() else 0 +if existing_count > 1: + print(f"Skip submission: found {existing_count} files in tmp (>1), continue processing") +else: + submit_tasks() + +# 6. 等待直到 tmp 有 100 个文件或超时 +poll_interval = 1 +expected = 100 +tmp_dir = Path("tmp") +tasks_status = mem_scheduler.get_tasks_status() +mem_scheduler.print_tasks_status(tasks_status=tasks_status) +while ( + mem_scheduler.get_tasks_status()["remaining"] != 0 + or mem_scheduler.get_tasks_status()["running"] != 0 +): + count = len(list(tmp_dir.glob("*.txt"))) if tmp_dir.exists() else 0 + tasks_status = mem_scheduler.get_tasks_status() + mem_scheduler.print_tasks_status(tasks_status=tasks_status) + print(f"[Monitor] Files in tmp: {count}/{expected}") + sleep(poll_interval) +print(f"[Result] Final files in tmp: {len(list(tmp_dir.glob('*.txt')))})") + +# 7. 
停止调度器
+sleep(20)
+print("Stopping the scheduler...")
+mem_scheduler.stop()
+```
diff --git a/docs/cn/open_source/modules/memories/general_textual_memory.md b/docs/cn/open_source/modules/memories/general_textual_memory.md
new file mode 100644
index 00000000..309cbe28
--- /dev/null
+++ b/docs/cn/open_source/modules/memories/general_textual_memory.md
@@ -0,0 +1,157 @@
+---
+title: "GeneralTextMemory: 通用明文记忆"
+desc: "`GeneralTextMemory` 是MemOS中一个灵活的、基于向量的明文记忆模块,用于存储、搜索和管理非结构化知识。如果说 Naive 模块是‘关键词匹配’,那么 GeneralTextMemory 就是‘理解意思’的智能索引,它适用于会话代理、个人助理和任何需要语义记忆检索的系统。"
+---
+## 目录
+
+- [记忆结构](#记忆结构)
+  - [元数据域 (`TextualMemoryMetadata`)](#元数据域-textualmemorymetadata)
+- [搜索机制](#搜索机制)
+- [API总结 (`GeneralTextMemory`)](#api总结-generaltextmemory)
+  - [初始化](#初始化)
+  - [核心方法](#核心方法)
+- [文件存储](#文件存储)
+- [示例用法](#示例用法)
+- [扩展与进阶](#扩展与进阶)
+  - [互联网检索](#互联网检索)
+  - [MultiModal Reader](#multimodal-reader)
+- [开发者注意事项](#开发者注意事项)
+
+
+## 记忆结构
+
+每个记忆被表达为一个`TextualMemoryItem`:
+
+| 字段 | 类型 | 描述 |
+| ---------- | --------------------------- | ---------------------------------- |
+| `id` | `str` | UUID (如果省略则自动生成) |
+| `memory` | `str` | 记忆内容主体 (必填) |
+| `metadata` | `TextualMemoryMetadata` | 元数据(用于搜索/过滤) |
+
+### 元数据域 (`TextualMemoryMetadata`)
+
+| 字段 | 类型 | 描述 |
+| ------------- | -------------------------------------------------- | ----------------------------------- |
+| `type` | `"procedure"`, `"fact"`, `"event"`, `"opinion"` | 记忆类型 |
+| `memory_time` | `str (YYYY-MM-DD)` | 记忆所指的日期/时间 |
+| `source` | `"conversation"`, `"retrieved"`, `"web"`, `"file"` | 记忆源 |
+| `confidence` | `float (0-100)` | 确定性/可信度评分 |
+| `entities` | `list[str]` | 主要实体/概念 |
+| `tags` | `list[str]` | 主题标签 |
+| `visibility` | `"private"`, `"public"`, `"session"` | 访问范围 |
+| `updated_at` | `str` | 最近更新时间戳 (ISO 8601) |
+
+所有的值都经过验证,无效的值将引发错误。
+
+## 搜索机制
+
+与前文提到的 `NaiveTextMemory` 使用**关键词匹配算法**不同,`GeneralTextMemory` 使用**向量语义搜索**。
+
+**与 NaiveTextMemory 的算法特点对比**
+
+| 特性 | 关键词匹配 | 向量语义搜索 |
+| -------------- | ---------------------------- | -------------------------------- |
+| **理解语义** | ❌ 不理解同义词 | ✅ 理解相似概念 |
+| **资源占用** | ✅ 极低 | ⚠️ 需要嵌入模型和向量数据库 |
+| **执行速度** | ✅ 快速(O(n)) | ⚠️ 较慢(索引构建+查询) |
+| **适用规模** | < 1K 条记忆 | 10K - 100K 条记忆 |
+| **可预测性** | ✅ 结果直观 | ⚠️ 黑盒模型 |
+
+## API总结 (`GeneralTextMemory`)
+
+### 初始化
+```python
+GeneralTextMemory(config: GeneralTextMemoryConfig)
+```
+
+### 核心方法
+| 方法 | 描述 |
+| ------------------------ | --------------------------------------------------- |
+| `extract(messages)` | 从消息列表中提取记忆 (基于LLM) |
+| `add(memories)` | 添加一个或多个记忆 (条目或字典) |
+| `search(query, top_k)` | 使用向量相似度检索top-k记忆 |
+| `get(memory_id)` | 通过ID获取单个记忆 |
+| `get_by_ids(ids)` | 通过ID获取多个记忆 |
+| `get_all()` | 返回所有记忆 |
+| `update(memory_id, new)` | 通过ID更新一个记忆 |
+| `delete(ids)` | 通过ID删除记忆 |
+| `delete_all()` | 删除所有记忆 |
+| `dump(dir)` | 将所有记忆序列化到目录中的JSON文件 |
+| `load(dir)` | 从存储的文件中加载记忆 |
+
+## 文件存储
+
+当调用 `dump(dir)` 时,系统会将记忆保存到:
+
+```
+<dir>/<memory_filename>
+```
+
+该文件(通常为 `textual_memory.json`,与 MemCube 的文件存储约定一致)包含所有记忆条目的JSON列表,可以使用 `load(dir)` 重新加载。
+
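+在进入完整示例之前,先看一个只关注元数据的小草图:`add()` 既接受 `TextualMemoryItem` 条目,也接受等价的字典(见上方核心方法表)。下面用字典形式演示如何按“元数据域”表格显式写入元数据,字段取值仅作演示,`m` 为下文示例中初始化的记忆实例:
+
+```python
+# 带完整元数据的写入:后续可按 tags / visibility 等字段过滤检索
+m.add(
+    [
+        {
+            "memory": "User is allergic to nuts.",
+            "metadata": {
+                "type": "fact",                  # 记忆类型,见元数据域表格
+                "memory_time": "2025-05-03",     # 记忆所指日期
+                "source": "conversation",        # 记忆来源
+                "confidence": 95.0,              # 可信度评分 (0-100)
+                "entities": ["nuts", "allergy"],
+                "tags": ["health"],
+                "visibility": "private",
+            },
+        }
+    ]
+)
+```
+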
+## 示例用法
+
+```python
+import os
+from memos.configs.memory import MemoryConfigFactory
+from memos.memories.factory import MemoryFactory
+
+config = MemoryConfigFactory(
+    backend="general_text",
+    config={
+        "extractor_llm": { ... },
+        "vector_db": { ... },
+        "embedder": { ... },
+    },
+)
+m = MemoryFactory.from_config(config)
+
+# 提取并添加记忆
+memories = m.extract([
+    {"role": "user", "content": "I love tomatoes."},
+    {"role": "assistant", "content": "Great! Tomatoes are delicious."},
+])
+m.add(memories)
+
+# 通过id手动创建并添加一个记忆
+memory_id = "xxx"
+m.add(
+    [
+        {
+            "id": memory_id,
+            "memory": "User is Chinese.",
+            # 其余元数据字段从略
+        }
+    ]
+)
+
+# 检索记忆
+results = m.search("Tell me more about the user", top_k=2)
+
+# 更新记忆
+m.update(memory_id, {"memory": "User is Canadian."})  # 其余字段从略
+
+# 删除记忆
+m.delete([memory_id])
+
+# 将所有记忆序列化到目录中的JSON文件/从存储的文件中加载记忆
+m.dump("tmp/mem")
+m.load("tmp/mem")
+```
+
+::note
+**扩展:互联网检索**
+GeneralTextMemory 可以与互联网检索结合使用,从网页提取内容并添加到记忆库。
+查看示例:[从互联网检索记忆](./tree_textual_memory#从互联网检索记忆可选) +:: + +::note +**进阶:使用 MultiModal Reader**
+如果需要处理图片、URL、文件等多模态内容,可以使用 `MultiModalStructMemReader`。
+查看完整示例:[使用 MultiModalStructMemReader](./tree_textual_memory#使用-multimodalstructmemreader高级) +:: + +## 开发者注意事项 + +* 使用Qdrant(或兼容)向量DB进行快速相似度搜索 +* 嵌入和提取模型是可配置的(支持olama/OpenAI) +* `/tests`中的集成测试涵盖了所有方法。 diff --git a/docs/cn/open_source/modules/memories/kv_cache_memory.md b/docs/cn/open_source/modules/memories/kv_cache_memory.md new file mode 100644 index 00000000..ffafd844 --- /dev/null +++ b/docs/cn/open_source/modules/memories/kv_cache_memory.md @@ -0,0 +1,519 @@ +--- +title: "KVCacheMemory: 激活记忆" +desc: "`KVCacheMemory` 是MemOS中用于存储和管理KV cache的专用记忆模块,主要用于加速大语言模型(LLMs)推理并支持有效的上下文复用。作为激活记忆,它有助于提升会话式和生成式人工智能系统的性能。" +--- + +## KV Cache记忆使用案例 + +在MemOS中,KV Cache最适合存储**语义稳定且经常复用的背景信息**,例如: +- 常见问题(FAQs)或特定领域知识 +- 先前的对话历史 + +这些稳定的**明文记忆项**由`MemScheduler`模块自动识别和管理。一旦被选中,它们就会被提前转换成KV格式的表示(`KVCacheItem`)。这个预计算步骤以可复用的格式存储记忆的激活状态(键值对张量),允许它们在推理期间注入到模型的注意力缓存中。 + +一旦进行转换,这些KV记忆就可以**跨查询复用**,而不需要对原始内容重新编码。这减少了处理和存储大量文本的计算开销,使其成为需要**快速响应时间**和**高吞吐量**的应用程序的理想选择。 + +## 为什么是KV Cache记忆 +将`MemScheduler`与KV Cache记忆集成可以实现显著的性能优化,特别是在LLM推理的**预填充阶段**。 + +### 无KV Cache记忆 + +- 每个新查询都被添加到完整的提示模板中,包括背景知识。 +- 模型必须在整个序列上**重新计算token嵌入和注意力**——即使是未更改的记忆。 + +### 有KV Cache记忆 + +- 背景知识以键值对张量的形式**缓存一次**。 +- 对于每个查询,只对新用户输入(查询token)进行编码。 +- 之前缓存的KV被直接注入到注意力机制中。 + +### 好处 + +这种分离减少了预填充阶段的冗余计算,从而导致: + +- 跳过背景知识的重复编码 +- 更快的查询token和缓存记忆之间的注意力计算 +- **降低首次token时间(Time To First Token, TTFT)** 生成过程中的延迟 + +这种优化在以下方面特别有价值: + +- 多回合聊天机器人交互 +- 检索增强生成或上下文增强生成(RAG, CAG) +- 在固定文档或FAQ风格记忆上操作的助理 + + +### KV Cache记忆加速评估 + +为了验证基于KV的记忆注入对性能的影响,我们进行了一组在MemOS中模拟真实记忆复用的对照实验。 + +#### 实验建立 + +在典型的使用中,`MemScheduler`模块持续跟踪交互模式,并将高频、稳定的明文记忆提升为KV格式。这些KV记忆作为激活缓存加载到GPU内存中,并在推理过程中重复使用。 + +评估比较两种记忆策略: + +1. **基于提示的注入**: 背景知识被作为原始文本添加 +2. **KV Cache注入**: 记忆被直接注入到模型的注意力缓存 + +我们对这些策略进行了测试: + +- **三种文本长度**: 短文本, 中等长度文本和长文本 +- **三种查询类型**: 短查询, 中等查询和长查询 + +主要指标是**首次token时间(TTFT)**,这是响应式生成的关键延迟指标。 + +#### 实验结果 + +下表显示了跨三个模型的结果(Qwen3-8B, Qwen3-32B, Qwen2.5-72B).KV Cache注入下的TTFT始终低于基于提示的注入,而两种策略的输出token保持一致. + +::note{icon="ri:bnb-fill"} +`Build (s)`是指将记忆转换为KV格式的一次性预处理成本,分摊到多个查询中. 
+:: + +| Model | Ctx | CtxTok | Qry | QryTok | Build (s) | KV TTFT (s) | Dir TTFT (s) | Speedup (%) | +| ----------- | ------ | ------ | ------ | ------ | --------- | ----------- | ------------ | ----------- | +| Qwen3-8B | long | 6064 | long | 952.7 | 0.92 | 0.50 | 2.37 | 79.1 | +| | | | medium | 302.7 | 0.93 | 0.19 | 2.16 | 91.1 | +| | | | short | 167 | 0.93 | 0.12 | 2.04 | 94.2 | +| | medium | 2773 | long | 952.7 | 0.41 | 0.43 | 1.22 | 64.6 | +| | | | medium | 302.7 | 0.41 | 0.16 | 1.08 | 85.1 | +| | | | short | 167 | 0.43 | 0.10 | 0.95 | 89.7 | +| | short | 583 | long | 952.7 | 0.12 | 0.39 | 0.51 | 23.0 | +| | | | medium | 302.7 | 0.12 | 0.14 | 0.32 | 55.6 | +| | | | short | 167 | 0.12 | 0.08 | 0.29 | 71.3 | +| Qwen3-32B | long | 6064 | long | 952.7 | 0.71 | 0.31 | 1.09 | 71.4 | +| | | | medium | 302.7 | 0.71 | 0.15 | 0.98 | 84.3 | +| | | | short | 167 | 0.71 | 0.11 | 0.96 | 88.8 | +| | medium | 2773 | long | 952.7 | 0.31 | 0.24 | 0.56 | 56.9 | +| | | | medium | 302.7 | 0.31 | 0.12 | 0.47 | 75.1 | +| | | | short | 167 | 0.31 | 0.08 | 0.44 | 81.2 | +| | short | 583 | long | 952.7 | 0.09 | 0.20 | 0.24 | 18.6 | +| | | | medium | 302.7 | 0.09 | 0.09 | 0.15 | 39.6 | +| | | | short | 167 | 0.09 | 0.07 | 0.14 | 53.5 | +| Qwen2.5-72B | long | 6064 | long | 952.7 | 1.26 | 0.48 | 2.04 | 76.4 | +| | | | medium | 302.7 | 1.26 | 0.23 | 1.82 | 87.2 | +| | | | short | 167 | 1.27 | 0.15 | 1.79 | 91.4 | +| | medium | 2773 | long | 952.7 | 0.58 | 0.39 | 1.05 | 62.7 | +| | | | medium | 302.7 | 0.58 | 0.18 | 0.89 | 79.2 | +| | | | short | 167 | 0.71 | 0.23 | 0.82 | 71.6 | +| | short | 583 | long | 952.7 | 0.16 | 0.33 | 0.43 | 23.8 | +| | | | medium | 302.7 | 0.16 | 0.15 | 0.27 | 43.2 | +| | | | short | 167 | 0.16 | 0.10 | 0.25 | 60.5 | + + +#### 基于 vLLM 的性能表现 + +MemOS 现在支持使用 vLLM 管理激活内存。为了评估KV Cache预存不同长度的前缀文本带来的影响,我们在一个配备 8 张 `H800 80GB GPU(112 vCPU,1920 GiB 内存)`的系统,以及一个配备 8张 `RTX4090-24G-PCIe(112 vCPU,960 GiB 内存)` 的系统上分别进行了性能测试。评估覆盖了当前两种核心模型:Qwen3-32B 和 Qwen2.5-72B。 + +基准测试在一系列记忆和上下文长度组合下运行,以模拟各种激活内存场景: +- **记忆文本长度(tokens)**:500、1000、2000 +- **上下文文本长度(tokens)**:500、1000、2000、4000 + +下表总结了基准测试结果。 + +**Qwen2.5-72B** +- On 4090(2 Nodes 16 GPUs) + +| mem tks | prompt tks | TTFT (without cache, ms) | TTFT (With cache, ms) | TTFT Speedup (%) | Abs Dis(ms) | +| ------- | ---------- | ------------------------ | --------------------- | ---------------- | ----------- | +| 0.5k | 0.5k | 1787.21 | 851.47 | 52.358% | 935.74 | +| 0.5k | 1k | 2506.26 | 1290.68 | 48.502% | 1215.58 | +| 0.5k | 2k | 3843.48 | 2897.97 | 24.600% | 945.51 | +| 0.5k | 4k | 6078.01 | 5200.86 | 14.432% | 877.15 | +| 1k | 0.5k | 2274.61 | 920.16 | 59.546% | 1354.45 | +| 1k | 1k | 2907.17 | 1407.65 | 51.580% | 1499.52 | +| 1k | 2k | 4278.53 | 2916.47 | 31.835% | 1362.06 | +| 1k | 4k | 6897.99 | 5218.94 | 24.341% | 1679.05 | +| 2k | 0.5k | 3460.12 | 782.73 | 77.379% | 2677.39 | +| 2k | 1k | 4443.34 | 1491.24 | 66.439% | 2952.10 | +| 2k | 2k | 5733.14 | 2758.48 | 51.885% | 2974.66 | +| 2k | 4k | 8152.76 | 5627.41 | 30.975% | 2525.35 | + + +- On H800(4 GPUs) + +| mem tks | prompt tks | TTFT (without cache, ms) | TTFT (With cache, ms) | TTFT Speedup (%) | Abs Dis(ms) | +| ------- | ---------- | ------------------------ | --------------------- | ---------------- | ----------- | +| 0.5k | 0.5k | 51.65 | 52.17 | -1.007% | -0.52 | +| 0.5k | 1k | 55.70 | 57.03 | -2.388% | -1.33 | +| 0.5k | 2k | 74.23 | 78.56 | -5.833% | -4.33 | +| 0.5k | 4k | 77.56 | 77.45 | 0.142% | 0.11 | +| 1k | 0.5k | 55.90 | 55.73 | 0.304% | 0.17 | +| 1k | 1k | 55.35 | 52.89 | 
4.444% | 2.46 | +| 1k | 2k | 80.14 | 73.82 | 7.886% | 6.32 | +| 1k | 4k | 82.83 | 73.51 | 11.252% | 9.32 | +| 2k | 0.5k | 75.82 | 71.31 | 5.948% | 4.51 | +| 2k | 1k | 80.60 | 78.71 | 2.345% | 1.89 | +| 2k | 2k | 83.91 | 78.60 | 6.328% | 5.31 | +| 2k | 4k | 99.15 | 80.12 | 19.193% | 19.03 | + +**Qwen3-32B** + +- On 4090(1 Nodes 8 GPUs) + +| mem tks | prompt tks | TTFT (without cache, ms) | TTFT (With cache, ms) | TTFT Speedup (%) | Abs Dis(ms) | +| ------- | ---------- | ------------------------ | --------------------- | ---------------- | ----------- | +| 0.5k | 0.5k | 288.72 | 139.29 | 51.756% | 149.43 | +| 0.5k | 1k | 428.72 | 245.85 | 42.655% | 182.87 | +| 0.5k | 2k | 683.65 | 538.59 | 21.218% | 145.06 | +| 0.5k | 4k | 1170.48 | 986.94 | 15.681% | 183.54 | +| 1k | 0.5k | 409.83 | 137.96 | 66.337% | 271.87 | +| 1k | 1k | 507.95 | 262.21 | 48.379% | 245.74 | +| 1k | 2k | 743.48 | 539.71 | 27.408% | 203.77 | +| 1k | 4k | 1325.34 | 1038.59 | 21.636% | 286.75 | +| 2k | 0.5k | 686.01 | 147.34 | 78.522% | 538.67 | +| 2k | 1k | 762.96 | 246.22 | 67.728% | 516.74 | +| 2k | 2k | 1083.93 | 498.05 | 54.051% | 585.88 | +| 2k | 4k | 1435.39 | 1053.31 | 26.619% | 382.08 | + + +- On H800(2 GPUs) + +| mem tks | prompt tks | TTFT (without cache, ms) | TTFT (With cache, ms) | TTFT Speedup (%) | Abs Dis(ms) | +| ------- | ---------- | ------------------------ | --------------------- | ---------------- | ----------- | +| 0.5k | 0.5k | 161.18 | 97.61 | 39.440% | 63.57 | +| 0.5k | 1k | 164.00 | 121.39 | 25.982% | 42.61 | +| 0.5k | 2k | 257.34 | 215.20 | 16.375% | 42.14 | +| 0.5k | 4k | 365.14 | 317.95 | 12.924% | 47.19 | +| 1k | 0.5k | 169.45 | 100.52 | 40.679% | 68.93 | +| 1k | 1k | 180.91 | 128.25 | 29.108% | 52.66 | +| 1k | 2k | 271.69 | 210.00 | 22.706% | 61.69 | +| 1k | 4k | 389.30 | 314.64 | 19.178% | 74.66 | +| 2k | 0.5k | 251.43 | 130.92 | 47.930% | 120.51 | +| 2k | 1k | 275.81 | 159.60 | 42.134% | 116.21 | +| 2k | 2k | 331.11 | 218.17 | 34.110% | 112.94 | +| 2k | 4k | 451.06 | 334.80 | 25.775% | 116.26 | + + +结果清楚地表明,集成 vLLM 的 KV 缓存重用功能为 MemOS 带来了革命性的性能提升。 + +## KV Cache的记忆结构 + +通过`KVCacheMemory`实现基于KV的记忆复用,在保持相同输出的同时,大大减少了模型大小和查询类型之间的延迟。通过将可复用记忆从明文提示转移到预先计算的KV Cache,MemOS消除了冗余的上下文编码,并实现了更快的响应时间,特别是在实时的、记忆增强的LLM应用程序中。 + +每个缓存被存储为一个`KVCacheItem`: + +| 字段 | 类型 | 描述 | +| ------------- | -------------- | ------------------------------------------- | +| `kv_cache_id` | `str` | 缓存中的唯一ID(UUID) | +| `kv_cache` | `DynamicCache` | 实际的KV Cache(transformers) | +| `metadata` | `dict` | 元数据 (源, 抽取时间等.) 
| + + +## API总结 (`KVCacheMemory`) + +### 初始化 +```python +KVCacheMemory(config: KVCacheMemoryConfig) +``` + +### 核心方法 +| 方法 | 描述 | +| ------------------------ | -------------------------------------------------------- | +| `extract(text)` | 使用LLM从输入文本中提取KV Cache | +| `add(memories)` | 添加一个或多个`KVCacheItem`到记忆中 | +| `get(memory_id)` | 根据ID获取单个缓存 | +| `get_by_ids(ids)` | 根据IDs获取多个缓存 | +| `get_all()` | 返回所有存储的缓存 | +| `get_cache(cache_ids)` | 从多个IDs合并并返回组合缓存 | +| `delete(ids)` | 通过IDs删除缓存 | +| `delete_all()` | 删除所有缓存 | +| `dump(dir)` | 将所有缓存序列化到目录中的pickle文件 | +| `load(dir)` | 从目录中的pickle文件加载缓存 | +| `from_textual_memory(mem)` | 将`TextualMemoryItem` 转换为 `KVCacheItem` | + + +当调用`dump(dir)`, 系统写到: + +``` +/ +``` + +该文件包含所有KV Cache的pickle字典,可以使用`load(dir)`重新加载。 + + +## 如何使用 + +### HF KVCache Memory + +```python +import json + +from transformers import DynamicCache + +from memos.configs.memory import MemoryConfigFactory +from memos.memories.activation.item import KVCacheItem +from memos.memories.factory import MemoryFactory + + +def get_cache_info(cache): + if not cache: + return None + + num_layers = 0 + total_size_bytes = 0 + + if hasattr(cache, "layers"): + num_layers = len(cache.layers) + for layer in cache.layers: + if hasattr(layer, "key_cache") and layer.key_cache is not None: + total_size_bytes += layer.key_cache.nelement() * layer.key_cache.element_size() + if hasattr(layer, "value_cache") and layer.value_cache is not None: + total_size_bytes += layer.value_cache.nelement() * layer.value_cache.element_size() + + if hasattr(layer, "keys") and layer.keys is not None: + total_size_bytes += layer.keys.nelement() * layer.keys.element_size() + if hasattr(layer, "values") and layer.values is not None: + total_size_bytes += layer.values.nelement() * layer.values.element_size() + + elif hasattr(cache, "key_cache") and hasattr(cache, "value_cache"): + num_layers = len(cache.key_cache) + for k, v in zip(cache.key_cache, cache.value_cache, strict=False): + if k is not None: + total_size_bytes += k.nelement() * k.element_size() + if v is not None: + total_size_bytes += v.nelement() * v.element_size() + + return { + "num_layers": num_layers, + "size_bytes": total_size_bytes, + "size_mb": f"{total_size_bytes / (1024 * 1024):.2f} MB", + } + + +def serialize_item(obj): + if isinstance(obj, list): + return [serialize_item(x) for x in obj] + + if isinstance(obj, KVCacheItem): + return { + "id": obj.id, + "metadata": obj.metadata, + "records": obj.records.model_dump() + if hasattr(obj.records, "model_dump") + else obj.records, + "memory": get_cache_info(obj.memory), + } + + if isinstance(obj, DynamicCache): + return get_cache_info(obj) + + return str(obj) + + +if __name__ == "__main__": + # ===== 示例:使用工厂和 HFLLM 构建及管理 KVCacheMemory ===== + + # 1. 创建 KVCacheMemory 配置(使用 HuggingFace 后端) + config = MemoryConfigFactory( + backend="kv_cache", + config={ + "extractor_llm": { + "backend": "huggingface", + "config": { + "model_name_or_path": "Qwen/Qwen3-0.6B", # 使用有效的 HuggingFace 模型名称 + "max_tokens": 32, + "add_generation_prompt": True, + "remove_think_prefix": True, + }, + }, + }, + ) + + # 2. 使用工厂实例化 KVCacheMemory + kv_mem = MemoryFactory.from_config(config) + + # 3. 
从提示中提取 KVCacheItem (DynamicCache)(内部使用 HFLLM.build_kv_cache) + prompt = [ + {"role": "user", "content": "What is MemOS?"}, + {"role": "assistant", "content": "MemOS is a memory operating system for LLMs."}, + ] + print("===== Extract KVCacheItem =====") + cache_item = kv_mem.extract(prompt) + print(json.dumps(serialize_item(cache_item), indent=2, default=str)) + print() + + # 4. 添加提取的 KVCacheItem + print("===== Add KVCacheItem =====") + kv_mem.add([cache_item]) + print(json.dumps(serialize_item(kv_mem.get_all()), indent=2, default=str)) + print() + + # 5. 根据 ID 获取 + print("===== Get KVCacheItem by id =====") + retrieved = kv_mem.get(cache_item.id) + print(json.dumps(serialize_item(retrieved), indent=2, default=str)) + print() + + # 6. 合并缓存(使用两个项目进行模拟) + print("===== Merge DynamicCache =====") + item2 = kv_mem.extract([{"role": "user", "content": "Tell me a joke."}]) + kv_mem.add([item2]) + merged_cache = kv_mem.get_cache([cache_item.id, item2.id]) + print(json.dumps(serialize_item(merged_cache), indent=2, default=str)) + print() + + # 7. 删除一个 + print("===== Delete one KVCacheItem =====") + kv_mem.delete([cache_item.id]) + print(json.dumps(serialize_item(kv_mem.get_all()), indent=2, default=str)) + print() + + # 8. 转储和加载 + print("===== Dump and Load KVCacheMemory =====") + kv_mem.dump("tmp/kv_mem") + print("Memory dumped to 'tmp/kv_mem'.") + kv_mem.delete_all() + kv_mem.load("tmp/kv_mem") + print( + "Memory loaded from 'tmp/kv_mem':", + json.dumps(serialize_item(kv_mem.get_all()), indent=2, default=str), + ) +``` + +### VLLM KVCache Memory + +```python +#!/usr/bin/env python3 +""" +演示如何使用带有 vLLM 后端的 VLLMKVCacheMemory 的示例。 +此示例展示了如何使用新的兼容 vLLM 的 KV cache 记忆。 +""" + +from memos.configs.memory import MemoryConfigFactory +from memos.memories.factory import MemoryFactory + + +def main(): + """演示 VLLMKVCacheMemory 用法的主函数。""" + + print("=== VLLM KV Cache Memory Example ===\n") + + # 1. 创建 VLLMKVCacheMemory 配置(使用 vLLM 后端) + config = MemoryConfigFactory( + backend="vllm_kv_cache", # 使用新的 vLLM KV cache 后端 + config={ + "extractor_llm": { + "backend": "vllm", + "config": { + "model_name_or_path": "Qwen/Qwen3-0.6B", + "api_base": "http://localhost:8088/v1", + "temperature": 0.7, + "max_tokens": 1024, + "model_schema": "memos.configs.llm.VLLMLLMConfig", + }, + }, + }, + ) + + # 2. 使用工厂实例化 VLLMKVCacheMemory + print("Initializing VLLM KV Cache Memory...") + vllm_kv_mem = MemoryFactory.from_config(config) + print("✓ VLLM KV Cache Memory initialized successfully.\n") + + # 3. 从提示中提取 VLLMKVCacheItem + print("===== Extract VLLMKVCacheItem =====") + system_prompt = [ + {"role": "system", "content": "You are a helpful AI assistant."}, + {"role": "user", "content": "What is MemOS?"}, + {"role": "assistant", "content": "MemOS is a memory operating system for LLMs."}, + ] + + try: + cache_item = vllm_kv_mem.extract(system_prompt) + print("✓ KV cache item extracted successfully") + print(f" ID: {cache_item.id}") + print(f" Memory (prompt): {cache_item.memory[:100]}...") + print(f" Metadata: {cache_item.metadata}") + print() + except Exception as e: + print(f"✗ Failed to extract KV cache item: {e}") + return + + # 4. 添加提取的 VLLMKVCacheItem + print("===== Add VLLMKVCacheItem =====") + vllm_kv_mem.add([cache_item]) + all_items = vllm_kv_mem.get_all() + print(f"✓ Added cache item. Total items: {len(all_items)}") + print() + + # 5. 
根据 ID 获取 + print("===== Get VLLMKVCacheItem by id =====") + retrieved = vllm_kv_mem.get(cache_item.id) + if retrieved: + print(f"✓ Retrieved cache item: {retrieved.id}") + print(f" Memory (prompt): {retrieved.memory[:100]}...") + else: + print("✗ Failed to retrieve cache item") + print() + + # 6. 获取缓存(返回 vLLM 的提示字符串) + print("===== Get Cache (Prompt String) =====") + prompt_string = vllm_kv_mem.get_cache([cache_item.id]) + if prompt_string: + print(f"✓ Retrieved prompt string: {prompt_string[:100]}...") + print(" This prompt can be used for vLLM generation with preloaded KV cache") + else: + print("✗ Failed to retrieve prompt string") + print() + + # 7. 提取另一个缓存项进行演示 + print("===== Extract Another VLLMKVCacheItem =====") + another_prompt = [ + {"role": "system", "content": "You are a coding assistant."}, + {"role": "user", "content": "Write a Python function to calculate fibonacci numbers."}, + ] + + try: + cache_item2 = vllm_kv_mem.extract(another_prompt) + vllm_kv_mem.add([cache_item2]) + print(f"✓ Added second cache item. Total items: {len(vllm_kv_mem.get_all())}") + print() + except Exception as e: + print(f"✗ Failed to extract second KV cache item: {e}") + print() + + # 8. 在 vLLM 服务器上预加载 KV cache + print("===== Preload KV Cache on vLLM Server =====") + try: + vllm_kv_mem.preload_kv_cache([cache_item.id, cache_item2.id]) + print("✓ KV cache preloaded on vLLM server successfully") + print(" The server now has the KV cache ready for fast generation") + except Exception as e: + print(f"✗ Failed to preload KV cache: {e}") + print() + + # 9. 删除一个项目 + print("===== Delete One VLLMKVCacheItem =====") + vllm_kv_mem.delete([cache_item.id]) + remaining_items = vllm_kv_mem.get_all() + print(f"✓ Deleted cache item. Remaining items: {len(remaining_items)}") + print() + + # 10. 转储和加载 + print("===== Dump and Load VLLMKVCacheMemory =====") + try: + vllm_kv_mem.dump("tmp/vllm_kv_mem") + print("✓ Memory dumped to 'tmp/vllm_kv_mem'") + + # 清除记忆并重新加载 + vllm_kv_mem.delete_all() + vllm_kv_mem.load("tmp/vllm_kv_mem") + reloaded_items = vllm_kv_mem.get_all() + print(f"✓ Memory loaded from 'tmp/vllm_kv_mem': {len(reloaded_items)} items") + except Exception as e: + print(f"✗ Failed to dump/load memory: {e}") + print() + + print("=== Example completed successfully ===") + + +if __name__ == "__main__": + main() +``` + +## 开发者注意事项 + +* 使用HuggingFace `DynamicCache` 高效的键值存储 +* 基于pickle的序列化,用于快速加载/保存 +* `/tests`中的集成测试涵盖了所有方法。 diff --git a/docs/cn/open_source/modules/memories/naive_textual_memory.md b/docs/cn/open_source/modules/memories/naive_textual_memory.md new file mode 100644 index 00000000..6393687e --- /dev/null +++ b/docs/cn/open_source/modules/memories/naive_textual_memory.md @@ -0,0 +1,508 @@ +--- +title: "NaiveTextMemory: 简单明文记忆" +desc: "MemOS 中最轻量级的记忆模块,专为快速原型开发和简单场景设计。无需向量数据库,使用关键词匹配即可快速检索。让我们用最简单的方式开始使用 MemOS 记忆系统! 
+`NaiveTextMemory` 是一个基于内存的明文记忆模块,将记忆存储在内存列表中,使用关键词匹配进行检索。它是学习 MemOS 的最佳起点,也适用于演示、测试和小规模应用。" + +--- + +## 目录 + +- [你将学到什么](#你将学到什么) +- [为什么选择 NaiveTextMemory](#为什么选择-naivetextmemory) +- [核心概念](#核心概念) + - [记忆结构](#记忆结构) + - [元数据字段](#元数据字段-textualmemorymetadata) + - [搜索机制](#搜索机制) +- [API 参考](#api-参考) + - [初始化](#初始化) + - [核心方法](#核心方法) + - [配置参数](#配置参数) +- [动手实践](#动手实践) + - [快速开始](#快速开始) + - [完整示例](#完整示例) + - [文件存储](#文件存储) +- [使用场景指南](#使用场景指南) +- [与其他记忆模块对比](#与其他记忆模块对比) +- [最佳实践](#最佳实践) +- [下一步](#下一步) + +## 你将学到什么 + +在本指南的最后,你将能够: +- 使用 LLM 从对话中自动提取结构化记忆 +- 在内存中存储和管理记忆(无需数据库) +- 使用关键词匹配搜索记忆 +- 持久化和恢复记忆数据 +- 理解何时使用 NaiveTextMemory,何时升级到其他模块 + +## 为什么选择 NaiveTextMemory + +### 优势特性 + +::list{icon="ph:check-circle-duotone"} +- **零依赖**:无需向量数据库或嵌入模型 +- **快速启动**:几行代码即可运行 +- **轻量高效**:低资源占用,执行速度快 +- **简单直观**:关键词匹配,结果可预测 +- **易于调试**:所有记忆都在内存中,方便查看 +- **完美起点**:学习 MemOS 的最佳入门选择 +:: + +### 适用场景 + +::list{icon="ph:lightbulb-duotone"} +- 快速原型开发和概念验证 +- 简单对话代理(记忆数量 < 1000 条) +- 测试和演示场景 +- 资源受限环境(无法运行嵌入模型) +- 关键词搜索场景(查询与记忆直接匹配) +:: + +::note +**性能提示**
+当记忆数量超过 1000 条时,建议升级到 [GeneralTextMemory](/open_source/modules/memories/general_textual_memory),它使用向量搜索,性能更优。 +:: + + +## 核心概念 + +### 记忆结构 + +每个记忆表示为一个 `TextualMemoryItem` 对象,包含以下字段: + +| 字段 | 类型 | 必填 | 描述 | +| ---------- | --------------------------- | ---- | ----------------------------- | +| `id` | `str` | ✗ | 唯一标识符(自动生成 UUID) | +| `memory` | `str` | ✓ | 记忆的主要文本内容 | +| `metadata` | `TextualMemoryMetadata` | ✗ | 元数据(用于分类、过滤和检索)| + +### 元数据字段 (`TextualMemoryMetadata`) + +元数据提供了丰富的上下文信息,用于分类、过滤和组织记忆: + +| 字段 | 类型 | 默认值 | 描述 | +| ------------- | -------------------------------------------------- | ---------- | ------------------------------ | +| `type` | `"procedure"` / `"fact"` / `"event"` / `"opinion"` | `"fact"` | 记忆类型分类 | +| `memory_time` | `str (YYYY-MM-DD)` | 当前日期 | 记忆关联的时间 | +| `source` | `"conversation"` / `"retrieved"` / `"web"` / `"file"` | - | 记忆来源 | +| `confidence` | `float (0-100)` | 80.0 | 确定性/可信度评分 | +| `entities` | `list[str]` | `[]` | 提及的实体或概念 | +| `tags` | `list[str]` | `[]` | 主题标签 | +| `visibility` | `"private"` / `"public"` / `"session"` | `"private"` | 访问控制范围 | +| `updated_at` | `str` | 自动生成 | 最近更新时间戳(ISO 8601) | + +## API 参考 + +### 初始化 + +```python +from memos.memories.textual.naive import NaiveTextMemory +from memos.configs.memory import NaiveTextMemoryConfig + +memory = NaiveTextMemory(config: NaiveTextMemoryConfig) +``` + +### 核心方法 + +| 方法 | 参数 | 返回值 | 描述 | +| ------------------------ | ------------------------------------- | ----------------------------- | -------------------------------------- | +| `extract(messages)` | `messages: list[dict]` | `list[TextualMemoryItem]` | 使用 LLM 从对话中提取结构化记忆 | +| `add(memories)` | `memories: list / dict / Item` | `None` | 添加一个或多个记忆 | +| `search(query, top_k)` | `query: str, top_k: int` | `list[TextualMemoryItem]` | 关键词匹配检索 top-k 记忆 | +| `get(memory_id)` | `memory_id: str` | `TextualMemoryItem` | 通过 ID 获取单个记忆 | +| `get_by_ids(ids)` | `ids: list[str]` | `list[TextualMemoryItem]` | 通过 ID 列表批量获取记忆 | +| `get_all()` | - | `list[TextualMemoryItem]` | 返回所有记忆 | +| `update(memory_id, new)` | `memory_id: str, new: dict` | `None` | 更新指定记忆的内容或元数据 | +| `delete(ids)` | `ids: list[str]` | `None` | 删除一个或多个记忆 | +| `delete_all()` | - | `None` | 清空所有记忆 | +| `dump(dir)` | `dir: str` | `None` | 将记忆序列化为 JSON 文件保存 | +| `load(dir)` | `dir: str` | `None` | 从 JSON 文件加载记忆 | + +### 搜索机制 + +`NaiveTextMemory` 使用**关键词匹配算法**: + +::steps{} + +#### 步骤 1: 分词 +将查询和每条记忆内容分解为词汇列表 + +#### 步骤 2: 计算匹配度 +统计查询词汇与记忆词汇的交集数量 + +#### 步骤 3: 排序 +按匹配词数降序排列所有记忆 + +#### 步骤 4: 返回结果 +取前 top-k 条记忆作为搜索结果 + +:: + + + +::note +**示例对比**
+查询:"猫咪"
+- **关键词匹配**:只匹配包含"猫"、"猫咪"的记忆
+- **语义搜索**:还能匹配"宠物"、"小猫"、"喵星人"等相关记忆(稍后我们将在“通用明文记忆”文章中学习) +:: + +### 配置参数 + +**NaiveTextMemoryConfig** + +| 参数 | 类型 | 必填 | 默认值 | 描述 | +| ------------------ | ---------------------- | ---- | ---------------------- | ------------------------------------------ | +| `extractor_llm` | `LLMConfigFactory` | ✓ | - | 用于从对话中提取记忆的 LLM 配置 | +| `memory_filename` | `str` | ✗ | `textual_memory.json` | 持久化存储的文件名 | + +**配置示例** + +```json +{ + "backend": "naive_text", + "config": { + "extractor_llm": { + "backend": "openai", + "config": { + "model_name_or_path": "gpt-4o-mini", + "temperature": 0.8, + "max_tokens": 1024, + "api_base": "xxx", + "api_key": "sk-xxx" + } + }, + "memory_filename": "my_memories.json" + } +} +``` + +## 动手实践 + +### 快速开始 + +只需 3 步即可开始使用 NaiveTextMemory: + +::steps{} + +#### 步骤 1: 创建配置 + +```python +from memos.configs.memory import MemoryConfigFactory + +config = MemoryConfigFactory( + backend="naive_text", + config={ + "extractor_llm": { + "backend": "openai", + "config": { + "model_name_or_path": "gpt-4o-mini", + "api_key": "your-api-key", + "api_base": "your-api-base" + }, + }, + }, +) +``` + +#### 步骤 2: 初始化记忆模块 + +```python +from memos.memories.factory import MemoryFactory + +memory = MemoryFactory.from_config(config) +``` + +#### 步骤 3: 提取并添加记忆 + +```python +# 从对话中自动提取记忆 +memories = memory.extract([ + {"role": "user", "content": "I love tomatoes."}, + {"role": "assistant", "content": "Great! Tomatoes are delicious."}, +]) + +# 添加到记忆库 +memory.add(memories) +print(f"✓ 已添加 {len(memories)} 条记忆") +``` + +::alert{type="info"} +**进阶:使用 MultiModal Reader**
+如果需要处理图片、URL、文件等多模态内容,可以使用 `MultiModalStructMemReader`。
+查看完整示例:[使用 MultiModalStructMemReader](./tree_textual_memory#使用-multimodalstructmemreader高级) +:: + +:: + +### 完整示例 + +以下是一个完整的端到端示例,展示所有核心功能: + +```python +from memos.configs.memory import MemoryConfigFactory +from memos.memories.factory import MemoryFactory + +# ======================================== +# 1. 初始化 +# ======================================== +config = MemoryConfigFactory( + backend="naive_text", + config={ + "extractor_llm": { + "backend": "openai", + "config": { + "model_name_or_path": "gpt-4o-mini", + "api_key": "your-api-key", + }, + }, + }, +) +memory = MemoryFactory.from_config(config) + +# ======================================== +# 2. 提取并添加记忆 +# ======================================== +memories = memory.extract([ + {"role": "user", "content": "I love tomatoes."}, + {"role": "assistant", "content": "Great! Tomatoes are delicious."}, +]) +memory.add(memories) +print(f"✓ 已添加 {len(memories)} 条记忆") + +# ======================================== +# 3. 搜索记忆 +# ======================================== +results = memory.search("tomatoes", top_k=2) +print(f"\n🔍 找到 {len(results)} 条相关记忆:") +for i, item in enumerate(results, 1): + print(f" {i}. {item.memory}") + +# ======================================== +# 4. 获取所有记忆 +# ======================================== +all_memories = memory.get_all() +print(f"\n📊 总共 {len(all_memories)} 条记忆") + +# ======================================== +# 5. 更新记忆 +# ======================================== +if memories: + memory_id = memories[0].id + memory.update( + memory_id, + { + "memory": "User loves tomatoes.", + "metadata": {"type": "opinion", "confidence": 95.0} + } + ) + print(f"\n✓ 已更新记忆: {memory_id}") + +# ======================================== +# 6. 持久化存储 +# ======================================== +memory.dump("tmp/mem") +print("\n💾 记忆已保存到 tmp/mem/textual_memory.json") + +# ======================================== +# 7. 加载记忆 +# ======================================== +memory.load("tmp/mem") +print("✓ 记忆已从文件加载") + +# ======================================== +# 8. 删除记忆 +# ======================================== +if memories: + memory.delete([memories[0].id]) + print(f"\n🗑️ 已删除 1 条记忆") + +# 删除所有记忆 +# memory.delete_all() +``` + +::note +**扩展:互联网检索**
+NaiveTextMemory 专注于本地记忆管理。如需从互联网检索信息并添加到记忆库,请查看:
+[从互联网检索记忆](./tree_textual_memory#从互联网检索记忆可选) +:: + +### 文件存储 + +调用 `dump(dir)` 时,系统会将记忆保存到: + +``` +/ +``` + +该文件包含所有记忆条目的JSON列表,可以使用`load(dir)`重新加载. + +**默认文件结构** + +```json +[ + { + "id": "550e8400-e29b-41d4-a716-446655440000", + "memory": "User loves tomatoes.", + "metadata": { + "type": "opinion", + "confidence": 95.0, + "entities": ["user", "tomatoes"], + "tags": ["food", "preference"], + "updated_at": "2026-01-14T10:30:00Z" + } + }, + ... +] +``` + +使用 `load(dir)` 可以完整恢复所有记忆数据。 + +::note +**重要提示**
+记忆存储在内存中,进程重启后会丢失。请定期调用 `dump()` 保存数据! +:: +## 使用场景指南 + +### 最适合的场景 + +::list{icon="ph:check-circle-duotone"} +- **快速原型开发**:无需配置向量数据库,几分钟即可启动 +- **简单对话代理**:记忆数量 < 1000 条的小规模应用 +- **测试和演示**:快速验证记忆提取和检索逻辑 +- **资源受限环境**:无法运行嵌入模型或向量数据库的场景 +- **关键词搜索**:查询内容与记忆文本直接匹配的场景 +- **学习和教学**:了解 MemOS 记忆系统的最佳起点 +:: + +### 不推荐的场景 + +::list{icon="ph:x-circle-duotone"} +- **大规模应用**:超过 10,000 条记忆(搜索性能退化) +- **语义搜索需求**:需要理解同义词(如"猫"和"宠物") +- **生产环境**:对性能和准确性有严格要求 +- **多语言场景**:需要跨语言语义理解 +- **复杂关系推理**:需要理解记忆之间的关联关系 +:: + +::alert{type="info"} +**升级路径**
+对于上述不推荐的场景,建议升级到: +- [GeneralTextMemory](/open_source/modules/memories/general_textual_memory) - 向量语义搜索,适合 10K-100K 条记忆 +- [TreeTextMemory](/open_source/modules/memories/tree_textual_memory) - 图结构存储,支持关系推理和多跳查询 +:: + +## 与其他记忆模块对比 + +选择合适的记忆模块对于项目成功至关重要。以下对比帮助你做出决策: + +| 特性 | **NaiveTextMemory** | **GeneralTextMemory** | **TreeTextMemory** | +| -------------- | --------------------- | -------------------------- | --------------------------- | +| **搜索方式** | 关键词匹配 | 向量语义搜索 | 图结构 + 向量搜索 | +| **依赖组件** | 仅 LLM | LLM + 嵌入器 + 向量数据库 | LLM + 嵌入器 + 图数据库 | +| **适用规模** | < 1K 条 | 1K - 100K 条 | 10K - 1M 条 | +| **查询复杂度** | O(n) 线性扫描 | O(log n) 近似最近邻 | O(log n) + 图遍历 | +| **语义理解** | ❌ | ✅ | ✅ | +| **关系推理** | ❌ | ❌ | ✅ | +| **多跳查询** | ❌ | ❌ | ✅ | +| **存储后端** | 内存列表 | 向量数据库(Qdrant 等) | 图数据库(Neo4j/PolarDB) | +| **配置复杂度** | 低 ⭐ | 中 ⭐⭐ | 高 ⭐⭐⭐ | +| **学习曲线** | 极简 | 中等 | 较陡 | +| **生产就绪** | ❌ 仅原型/演示 | ✅ 适合大多数场景 | ✅ 适合复杂应用 | + +::alert{type="success"} +**选择建议**
+- **刚开始学习?** → 从 NaiveTextMemory 开始
+- **需要语义搜索?** → 使用 GeneralTextMemory
+- **需要关系推理?** → 选择 TreeTextMemory +:: + +## 最佳实践 + +遵循以下建议,充分发挥 NaiveTextMemory 的优势: + +::steps{} + +### 1. 定期持久化数据 + +```python +# 在关键操作后立即保存 +memory.add(new_memories) +memory.dump("tmp/mem") # ✓ 立即持久化 + +# 定期自动备份 +import schedule +schedule.every(10).minutes.do(lambda: memory.dump("tmp/mem")) +``` + +### 2. 控制记忆规模 + +```python +# 定期清理旧记忆 +if len(memory.get_all()) > 1000: + old_memories = sorted( + memory.get_all(), + key=lambda m: m.metadata.updated_at + )[:100] # 最旧的 100 条 + + memory.delete([m.id for m in old_memories]) + print("✓ 已清理 100 条旧记忆") +``` + +### 3. 优化搜索查询 + +```python +# ❌ 不好:模糊查询 +results = memory.search("东西", top_k=5) + +# ✅ 好:使用具体关键词 +results = memory.search("番茄 西红柿", top_k=5) +``` + +### 4. 合理使用元数据 + +```python +# 添加记忆时设置清晰的元数据 +memory.add({ + "memory": "User prefers dark mode", + "metadata": { + "type": "opinion", # ✓ 明确分类 + "tags": ["UI", "preference"], # ✓ 便于过滤 + "confidence": 90.0, # ✓ 标注可信度 + "entities": ["user", "dark mode"] # ✓ 实体标注 + } +}) +``` + +### 5. 规划升级路径 + +```python +# 监控记忆数量,及时升级 +memory_count = len(memory.get_all()) +if memory_count > 800: + print("⚠️ 记忆数量接近上限,建议升级到 GeneralTextMemory") + # 迁移代码参考: + # 1. 导出现有记忆:memory.dump("backup") + # 2. 创建 GeneralTextMemory 配置 + # 3. 导入记忆到新模块 +``` + +:: + + +## 下一步 + +恭喜!你已经掌握了 NaiveTextMemory 的核心用法。接下来可以: + +::list{icon="ph:arrow-right-duotone"} +- **升级到向量搜索**:学习 [GeneralTextMemory](/open_source/modules/memories/general_textual_memory) 的语义检索能力 +- **探索图结构**:了解 [TreeTextMemory](/open_source/modules/memories/tree_textual_memory) 的关系推理功能 +- **集成到应用**:查看 [完整 API 文档](/api-reference/search-memories) 构建生产级应用 +- **运行示例代码**:浏览 `/examples/` 目录获取更多实战案例 +- **了解图数据库**:如果需要高级功能,可以学习 [Neo4j](/open_source/modules/memories/neo4j_graph_db) 或 [PolarDB](/open_source/modules/memories/polardb_graph_db) +:: + +::alert{type="success"} +**提示**
+NaiveTextMemory 是学习 MemOS 的完美起点。当你的应用需要更强大的功能时,可以无缝迁移到其他记忆模块! +:: diff --git a/docs/cn/open_source/modules/memories/nebula_graph_db.md b/docs/cn/open_source/modules/memories/nebula_graph_db.md new file mode 100644 index 00000000..82f70731 --- /dev/null +++ b/docs/cn/open_source/modules/memories/nebula_graph_db.md @@ -0,0 +1,126 @@ +--- +title: 基于 NebulaGraph 的明文记忆后端 +desc: "该模块为记忆增强系统(如 RAG、认知代理或个人助手)提供基于 NebulaGraph 的记忆图谱存储与查询能力。继承自 `BaseGraphDB`,支持多用户隔离、结构化搜索、外挂向量索引等能力,适用于大规模图谱构建与推理。" +--- + +## 为什么选择 NebulaGraph? + +* 适合大规模分布式部署 +* 支持点、边的标签与属性灵活定义 +* 支持向量索引(Nebula 5 起) + + +## 推荐配置模板 + +适用于生产场景、兼容多租户逻辑隔离: + +```json +"graph_db": { + "backend": "nebular", + "config": { + "uri": ["localhost:9669"], + "user": "root", + "password": "your_password", + "space": "database_name", + "user_name": "user_name", + "use_multi_db": false, + "auto_create": true, + "embedding_dimension": 1024 + } +} +``` + +* `space`:Nebula 图空间名称,相当于数据库 +* `user_name`:用于多用户逻辑隔离(自动注入过滤条件) +* `embedding_dimension`:根据你的嵌入模型调整(如 text-embedding-3-large 为 3072) +* `auto_create`: 是否自动创建图空间及 Schema(推荐测试环境使用) + + +## 多租户使用模式 + +NebulaGraph 后端支持两种多租户架构: + +### 单库多用户(Shared DB + `user_name`) + +适用于多个用户/Agent 共用图空间,每位用户使用逻辑隔离: + +```python +GraphDBConfigFactory( + backend="nebular", + config={ + "space": "shared_graph", + "user_name": "alice", + "use_multi_db": False, + ... + }, +) +``` + +### 多库(Multi DB,每用户一空间) + +适用于资源隔离更强场景,每个用户独占一个图空间(space): + +```python +GraphDBConfigFactory( + backend="nebular", + config={ + "space": "user_alice_graph", + "use_multi_db": True, + "auto_create": True, + ... + }, +) +``` + +## 快速使用示例 + +```python +import os +import json +from memos.graph_dbs.factory import GraphStoreFactory +from memos.configs.graph_db import GraphDBConfigFactory + +config = GraphDBConfigFactory( + backend="nebular", + config={ + "uri": json.loads(os.getenv("NEBULAR_HOSTS", "localhost")), + "user": os.getenv("NEBULAR_USER", "root"), + "password": os.getenv("NEBULAR_PASSWORD", "xxxxxx"), + "space": os.getenv("space"), + "use_multi_db": True, + "auto_create": True, + "embedding_dimension": os.getenv("embedding_dimension", 1024), + }, + ) + +graph = GraphStoreFactory.from_config(config) + +topic = TextualMemoryItem( + memory="This research addresses long-term multi-UAV navigation for energy-efficient communication coverage.", + metadata=TreeNodeTextualMemoryMetadata( + memory_type="LongTermMemory", + key="Multi-UAV Long-Term Coverage", + hierarchy_level="topic", + type="fact", + memory_time="2024-01-01", + source="file", + sources=["paper://multi-uav-coverage/intro"], + status="activated", + confidence=95.0, + tags=["UAV", "coverage", "multi-agent"], + entities=["UAV", "coverage", "navigation"], + visibility="public", + updated_at=datetime.now().isoformat(), + embedding=embed_memory_item( + "This research addresses long-term " + "multi-UAV navigation for " + "energy-efficient communication " + "coverage." + ), + ), + ) + +graph.add_node( + id=topic.id, memory=topic.memory, metadata=topic.metadata.model_dump(exclude_none=True) +) +``` diff --git a/docs/cn/open_source/modules/memories/neo4j_graph_db.md b/docs/cn/open_source/modules/memories/neo4j_graph_db.md new file mode 100644 index 00000000..83689183 --- /dev/null +++ b/docs/cn/open_source/modules/memories/neo4j_graph_db.md @@ -0,0 +1,189 @@ +--- +title: Neo4j 图数据库 +desc: "该模块为记忆增强系统(如RAG、认知代理或个人内存助手)提供基于图结构的记忆存储和查询。
它定义了一个干净的抽象类(`BaseGraphDB`),并使用**Neo4j**实现了一个可用于生产环境的实现。" +--- + +## 为什么记忆需要图存储? + +与向量存储不同,一个图数据库允许: + +- 将记忆组织成**链、层次和因果关系** +- 执行**多跳推理**和**子图遍历** +- 支持记忆**重复数据删除、冲突检测和调度** +- 随时间动态地演化图记忆 + +这构成了长期的、可解释的和组成性记忆推理的主干。 + +## 特色 + +- 跨不同图数据库的统一接口 +- 内置对Neo4j的支持 +- 支持向量增强检索(`search_by_embedding`) +- 模块化、可插拔和可测试 +- [v0.2.1 新特性] 支持**多租户图存储架构**(单库多用户) +- [v0.2.1 新特性] 兼容**Neo4j** 社区版(Community Edition) + +## 目录结构 + +``` + +src/memos/graph_dbs/ +├── base.py # BaseGraphDB的抽象接口 +├── factory.py # 工厂从配置中实例化GraphDB +├── neo4j.py # Neo4jGraphDB的产品实现 + +```` + +## 如何使用 + +```python +from memos.graph_dbs.factory import GraphStoreFactory +from memos.configs.graph_db import GraphDBConfigFactory + +# 步骤1:构建工厂配置 +config = GraphDBConfigFactory( + backend="neo4j", + config={ + "uri": "bolt://localhost:7687", + "user": "your_neo4j_user_name", + "password": "your_password", + "db_name": "memory_user1", + "auto_create": True, + "embedding_dimension": 768 + } +) + +# 步骤2:实例化图存储 +graph = GraphStoreFactory.from_config(config) + +# 步骤3:增加记忆 +graph.add_node( + id="node-001", + memory="Today I learned about retrieval-augmented generation.", + metadata={"type": "WorkingMemory", "tags": ["RAG", "AI"], "timestamp": "2025-06-05", "sources": []} +) + +```` + +## 可插拔的设计 + +### 接口: `BaseGraphDB` + +```` +函数功能介绍: +1.节点操作: +插入:add_node(添加单节点) + add_nodes_batch(批量添加节点) +查询:get_node(查询单节点) + get_nodes(查询多个节点) + get_memory_count(查询节点数量) + node_not_exist(节点是否存在) + search_by_embedding(向量搜索可添加filter条件过滤,filter使用参见函数neo4j_example.example_complex_shared_db_search_filter获取完整的方法文档) +更新:update_node(更新单个节点) +删除:delete_node(删除单个节点) + clear (通过user_name删除所有相关节点) + 参见函数neo4j_example.example_complex_shared_db_delete_memory获取完整的方法文档 + +2.边操作 +插入:add_edge(添加三元组记忆) +查询:get_edges(查询多个关系) + edge_exists(是否存在关系) + get_children_with_embeddings(查询关系类型PARENT的节点列表) + get_subgraph(查询多跳节点) +删除:delete_edge(删除关系) + +3.导入导出操作: + import_graph(从序列化的字典中导入整个图,参数包含所有待加载节点和边的字典 参数:{'nodes':[],'edges':[]) + export_graph(以结构化形式导出所有图节点和边,支持分页) + +参见src/memos/graph_dbs/base.py获取完整的方法文档。 +```` +### 当前的后端: + +| 后端 | 状态 | 文件 | +| ------- | ------ | ---------- | +| Neo4j | Stable | `neo4j.py` | + +## 单库多租户(Shared DB, Multi-Tenant) + +通过配置 `user_name` 字段,MemOS 支持在单个 Neo4j 数据库中隔离多个用户的记忆图谱,适用于协同系统、多角色场景: + +```python +config = GraphDBConfigFactory( + backend="neo4j", + config={ + "uri": "bolt://localhost:7687", + "user": "neo4j", + "password": "your_password", + "db_name": "shared-graph", + "user_name": "alice", + "use_multi_db": false, + "embedding_dimension": 768, + }, +) +``` + +每个用户的数据通过 `user_name` 字段在读写、搜索、导出中逻辑隔离,系统自动完成过滤。 + +::note +**示例参考**
+完整示例见:`examples/basic_modules/neo4j_example.example_complex_shared_db(db_name="shared-traval-group-complex-new")`
::

## Neo4j 社区版(Community Edition)支持

新增后端标识:`neo4j-community`

使用方式与标准 Neo4j 类似,但会自动关闭企业版功能:

- ❌ 不支持 `auto_create` 数据库
- ❌ 不支持原生向量索引(必须使用外接向量库,目前仅支持 Qdrant)
- ✅ 强制启用 `user_name` 逻辑隔离(适用于社区版,或多个 user_name 属于同一业务、无需强物理隔离的场景)

示例配置:

```python
config = GraphDBConfigFactory(
    backend="neo4j-community",
    config={
        "uri": "bolt://localhost:7687",
        "user": "neo4j",
        "password": "12345678",
        "db_name": "paper",
        "user_name": "bob",
        "auto_create": False,
        "embedding_dimension": 768,
        "use_multi_db": False,
        "vec_config": {
            "backend": "qdrant",
            "config": {
                "host": "localhost",
                "port": 6333,
                "collection_name": "neo4j_vec_db",
                "vector_dimension": 768,
                "distance_metric": "cosine"
            },
        },
    },
)
```

::note
**示例参考**

`examples/basic_modules +/neo4j_example.example_complex_shared_db(db_name="paper", +community=True)` +:: + +## 扩展 + +你可以添加任何其他图形引擎的支持(例如,**TigerGraph**, **DGraph**, **Weaviate hybrid**): + +1. 子类 `BaseGraphDB` +2. 创建配置数据类(例如, `DgraphConfig`) +3. 将它注册到: + + * `GraphDBConfigFactory.backend_to_class` + * `GraphStoreFactory.backend_to_class` + +参见 `src/memos/graph_dbs/neo4j.py` 作为参考实现。 diff --git a/docs/cn/open_source/modules/memories/overview.md b/docs/cn/open_source/modules/memories/overview.md new file mode 100644 index 00000000..f4bcd077 --- /dev/null +++ b/docs/cn/open_source/modules/memories/overview.md @@ -0,0 +1,244 @@ +--- +title: "记忆模块总览" +desc: "MemOS 记忆系统完整指南 - MemOS 提供了丰富的记忆模块,满足从轻量级文本记忆到高级图结构的各种需求。本指南帮助你快速找到最适合的记忆解决方案。" +--- + +# 为什么需要不同的记忆模块 + +记忆模块是赋予Agent“长期记忆”能力的核心组件。它不只是像数据库一样死板地存取数据,而是能够像人类一样,对信息进行自动化地提取、分类、关联和动态更新。通过选择不同的记忆模块,你可以让 Agent拥有不同能力。 + +## 🎯 快速选择指南 + +::alert{type="info"} +**不确定选哪个?** 跟随这个决策树: +- 🚀 **快速测试/演示:简单上手,无需额外软件** → [NaiveTextMemory](#naivetextmemory-简单明文记忆) +- 📝 **通用文本记忆:记住聊天内容或大量文档,并能根据语义搜索** → [GeneralTextMemory](#generaltextmemory-通用文本记忆) +- 👤 **用户偏好管理:专门针对用户画像设计** → [PreferenceTextMemory](#preferencetextmemory-偏好记忆) +- 🌳 **结构化知识图谱:数据之间有复杂的逻辑关联** → [TreeTextMemory](#treetextmemory-分层结构记忆) +- ⚡ **推理加速:访问量很大,希望回复能更平稳、响应更快** → [KVCacheMemory](#kvcachememory-激活记忆) +:: + +--- + +## 📚 记忆模块分类 + +### 一、文本记忆系列 + +专注于存储和检索文本形式的记忆,适用于绝大多数应用场景。 + +#### NaiveTextMemory: 简单明文记忆 +::card +**适用场景:** 快速原型、演示、教学、小规模应用 + +**核心特性:** +- ✅ 零依赖,纯内存存储 +- ✅ 关键词匹配检索 +- ✅ 极简 API,5 分钟上手 +- ✅ 支持文件持久化 + +**局限性:** +- ❌ 无向量语义搜索 +- ❌ 不适合大规模数据 +- ❌ 检索精度有限 + +📖 [查看文档](./naive_textual_memory) +:: + +#### GeneralTextMemory: 通用文本记忆 +::card +**适用场景:** 会话代理、个人助理、知识管理系统 + +**核心特性:** +- ✅ 基于向量的语义搜索 +- ✅ 丰富的元数据支持(类型、时间、来源等) +- ✅ 灵活的过滤和查询 +- ✅ 适合中大规模应用 + +**技术要求:** +- 需要向量数据库(Qdrant 等) +- 需要 Embedding 模型 + +📖 [查看文档](./general_textual_memory) +:: + +#### PreferenceTextMemory: 偏好记忆 +::card +**适用场景:** 个性化推荐、用户画像、智能助理 + +**核心特性:** +- ✅ 自动识别显式和隐式偏好 +- ✅ 偏好去重与冲突检测 +- ✅ 按偏好类型、强度筛选 +- ✅ 向量语义检索 + +**专用功能:** +- 双重偏好提取(explicit/implicit) +- 偏好强度评分 +- 时间衰减支持 + +📖 [查看文档](./preference_textual_memory) +:: + +#### TreeTextMemory: 分层结构记忆 +::card +**适用场景:** 知识图谱、复杂关系推理、多跳查询 + +**核心特性:** +- ✅ 基于图数据库的结构化存储 +- ✅ 支持层次关系和因果链 +- ✅ 多跳推理能力 +- ✅ 去重、冲突检测、记忆调度 + +**高级功能:** +- 支持 MultiModal Reader(图片、URL、文件) +- 支持互联网检索(BochaAI、Google、Bing) +- 工作记忆替换机制 + +**技术要求:** +- 需要图数据库(Neo4j 等) +- 需要向量数据库和 Embedding 模型 + +📖 [查看文档](./tree_textual_memory) +:: + +--- + +### 二、专用记忆模块 + +针对特定场景优化的记忆系统。 + +#### KVCacheMemory: 激活记忆 +::card +**适用场景:** LLM 推理加速、高频背景知识复用 + +**核心特性:** +- ⚡ 预计算 KV Cache,跳过重复编码 +- ⚡ 大幅减少预填充阶段计算 +- ⚡ 适合高吞吐量场景 + +**典型用例:** +- 常见问题(FAQ)缓存 +- 对话历史复用 +- 领域知识预加载 + +**工作原理:** +稳定的文本记忆 → 预转换为 KV Cache → 推理时直接注入 + +📖 [查看文档](./kv_cache_memory) +:: + +#### ParametricMemory: 参数化记忆 +::card +**状态:** 🚧 正在开发中 + +**设计目标:** +- 将知识编码到模型权重(LoRA、专家模块) +- 动态加载/卸载能力模块 +- 支持多任务、多角色架构 + +**未来功能:** +- 参数模块生成与压缩 +- 版本控制与回滚 +- 热插拔能力模块 + +📖 [查看文档](./parametric_memory) +:: + +--- + +### 三、图数据库后端 + +为 TreeTextMemory 提供图存储能力。 + +#### Neo4j Graph DB +::card +**推荐度:** ⭐⭐⭐⭐⭐ + +**特性:** +- 完整的图数据库功能 +- 支持向量增强检索 +- 多租户架构(v0.2.1+) +- 兼容社区版 + +📖 [查看文档](./neo4j_graph_db) +:: + +#### Nebula Graph DB +::card +**特性:** +- 分布式图数据库 +- 高可用性 +- 适合大规模部署 + +📖 [查看文档](./nebula_graph_db) +:: + +#### PolarDB Graph DB +::card +**特性:** +- 阿里云 PolarDB 图计算 +- 云原生架构 +- 企业级可靠性 + +📖 [查看文档](./polardb_graph_db) +:: + +--- + +## 🛠️ 使用场景推荐 + +### 场景 1: 快速原型开发 +**推荐:** [NaiveTextMemory](./naive_textual_memory) +```python +from memos.memories import NaiveTextMemory +memory = 
NaiveTextMemory() +memory.add("用户喜欢喝咖啡") +results = memory.search("咖啡") +``` + +### 场景 2: 聊天机器人记忆 +**推荐:** [GeneralTextMemory](./general_textual_memory) +- 支持语义搜索 +- 按时间、类型、来源过滤 +- 适合对话历史管理 + +### 场景 3: 个性化推荐系统 +**推荐:** [PreferenceTextMemory](./preference_textual_memory) +- 自动提取用户偏好 +- 偏好冲突检测 +- 强度评分与筛选 + +### 场景 4: 知识图谱应用 +**推荐:** [TreeTextMemory](./tree_textual_memory) +- 多跳关系查询 +- 层次结构管理 +- 复杂推理场景 + +### 场景 5: 高性能 LLM 服务 +**推荐:** [KVCacheMemory](./kv_cache_memory) +- FAQ 系统 +- 客服机器人 +- 大批量请求处理 + +--- + +## 🔗 高级功能 + +### MultiModal Reader(多模态读取) +在 TreeTextMemory 中支持处理: +- 对话中的图片 +- 网页 URL +- 本地文件(PDF、DOCX、TXT、Markdown) +- 混合模式(文本+图片+URL) + +👉 [查看示例](./tree_textual_memory#使用-multimodalstructmemreader高级) + +### Internet Retrieval(互联网检索) +从网络获取实时信息并添加到记忆: +- BochaAI 搜索 +- Google 搜索 +- Bing 搜索 + +👉 [查看示例](./tree_textual_memory#从互联网检索记忆可选) + +--- diff --git a/docs/cn/open_source/modules/memories/parametric_memory.md b/docs/cn/open_source/modules/memories/parametric_memory.md new file mode 100644 index 00000000..ba9a1b7c --- /dev/null +++ b/docs/cn/open_source/modules/memories/parametric_memory.md @@ -0,0 +1,44 @@ +--- +title: 参数记忆 (正在开发中) +--- + +::note +**正在开发中** +该功能仍在积极开发中,敬请期待更新! +:: + +`参数记忆 (Parametric Memory)` 是 MemOS 架构中核心的**长期知识与能力载体**。与明文记忆或激活记忆不同,参数化记忆将语言结构、世界知识以及通用推理能力的深度表征进行编码,直接嵌入在模型的权重之中。 + +在 MemOS 的架构设计中,参数化记忆不仅仅局限于静态的预训练权重,它还包含了**LoRA 适配器**和专家插件模块等模块化的权重组件。这使得你能够在无需重新训练整个模型的情况下,逐步扩展或定制 LLM 的能力。 + +例如,你可以将结构化或固定的知识提炼为参数形式,将其保存为独立的**能力模块 (Capability Blocks)**,并在推理过程中动态加载或卸载。这使得为法律推理、财务分析或特定领域摘要等任务创建“专家子模型”变得轻而易举——而这一切都由 MemOS 统一管理。 + +## 设计目标 + +::list{icon="ph:check-circle-duotone"} +- **可控性** — 支持按需生成、加载、切换或组合参数模块。 +- **可塑性** — 与明文记忆和激活记忆协同演进;支持知识的提炼与回滚。 +- **可追溯性** *(开发中)* — 提供参数模块的版本控制与管理功能。 +:: + +## 当前状态 + +`参数化记忆 (Parametric Memory)` 目前仍处于设计和原型验证阶段。我们计划在未来的版本中发布用于生成、压缩以及热插拔参数模块的 API,旨在更好地支持多任务、多角色及多智能体架构。 + +请持续关注我们的更新! + +## 相关模块 + +虽然参数记忆正在开发中,但今天你已经可以尝试这些: +- **[GeneralTextMemory](/open_source/modules/memories/general_textual_memory)**: 基于向量的灵活语义存储 +- **[TreeTextMemory](/open_source/modules/memories/tree_textual_memory)**: 结构化、层次化和知识图谱 +- **[Activation Memory](/open_source/modules/memories/kv_cache_memory)**: 高效的运行时状态缓存 + +## 开发者注意事项 + +参数化记忆将补全 MemOS 关于统一 **Memory³** 架构的愿景: +- **参数化记忆**: 内化与嵌入的隐式知识 +- **激活记忆**: 短暂的运行时状态 +- **明文记忆**: 结构化、可追溯的显式外部记忆 + +三者有机结合,将构建出一个适应性强、可持续进化且具备可解释性的智能系统。 diff --git a/docs/cn/open_source/modules/memories/polardb_graph_db.md b/docs/cn/open_source/modules/memories/polardb_graph_db.md new file mode 100644 index 00000000..5ed46898 --- /dev/null +++ b/docs/cn/open_source/modules/memories/polardb_graph_db.md @@ -0,0 +1,461 @@ +--- +title: "PolarDB 图数据库" +desc: "MemOS 支持使用 **PolarDB**(基于 Apache AGE 扩展)作为图数据库后端,用于存储和检索知识图谱式的记忆数据。PolarDB 结合了 PostgreSQL 的强大功能和图数据库的灵活性,特别适合需要同时进行关系型和图数据查询的场景。" +--- + +## 功能特性 + +::list{icon="ph:check-circle-duotone"} +- 完整的图数据库操作:节点增删改查、边管理 +- 向量嵌入搜索:支持 IVFFlat 索引的语义检索 +- 连接池管理:自动管理数据库连接,支持高并发 +- 多租户隔离:支持物理和逻辑两种隔离模式 +- JSONB 属性存储:灵活的元数据存储 +- 批量操作:支持批量插入节点和边 +- 自动时间戳:自动维护 `created_at` 和 `updated_at` +- SQL 注入防护:内置参数化查询和字符串转义 +:: + +## 目录结构 + +``` +MemOS/ +└── src/ + └── memos/ + ├── configs/ + │ └── graph_db.py # PolarDBGraphDBConfig 配置类 + └── graph_dbs/ + ├── base.py # BaseGraphDB 抽象基类 + ├── factory.py # GraphDBFactory 工厂类 + └── polardb.py # PolarDBGraphDB 实现 +``` + +## 快速开始 + +### 1. 安装依赖 + +```bash +# 安装 psycopg2 驱动(二选一) +pip install psycopg2-binary # 推荐:预编译版本 +# 或 +pip install psycopg2 # 需要 PostgreSQL 开发库 + +# 安装 MemOS +pip install MemoryOS -U +``` + +### 2. 
配置 PolarDB + +#### 方式一:使用配置文件(推荐) + +```json +{ + "graph_db_store": { + "backend": "polardb", + "config": { + "host": "localhost", + "port": 5432, + "user": "postgres", + "password": "your_password", + "db_name": "memos_db", + "user_name": "alice", + "use_multi_db": true, + "auto_create": false, + "embedding_dimension": 1024, + "maxconn": 100 + } + } +} +``` + +#### 方式二:代码初始化 + +```python +from memos.configs.graph_db import PolarDBGraphDBConfig +from memos.graph_dbs.polardb import PolarDBGraphDB + +# 创建配置 +config = PolarDBGraphDBConfig( + host="localhost", + port=5432, + user="postgres", + password="your_password", + db_name="memos_db", + user_name="alice", + use_multi_db=True, + embedding_dimension=1024, + maxconn=100 +) + +# 初始化数据库 +graph_db = PolarDBGraphDB(config) +``` + +### 3. 基本操作示例 + +```python +# ======================================== +# 步骤 1: 添加节点 +# ======================================== +node_id = graph_db.add_node( + label="Memory", + properties={ + "content": "Python 是一种高级编程语言", + "memory_type": "Knowledge", + "tags": ["programming", "python"] + }, + embedding=[0.1, 0.2, 0.3, ...], # 1024维向量 + user_name="alice" +) +print(f"✓ 节点已创建: {node_id}") + +# ======================================== +# 步骤 2: 更新节点 +# ======================================== +graph_db.update_node( + id=node_id, + fields={ + "content": "Python 是一种解释型、面向对象的高级编程语言", + "updated": True + }, + user_name="alice" +) +print("✓ 节点已更新") + +# ======================================== +# 步骤 3: 创建关系 +# ======================================== +# 先创建第二个节点 +node_id_2 = graph_db.add_node( + label="Memory", + properties={ + "content": "Django 是 Python 的 Web 框架", + "memory_type": "Knowledge" + }, + embedding=[0.15, 0.25, 0.35, ...], + user_name="alice" +) + +# 创建边 +edge_id = graph_db.add_edge( + source_id=node_id, + target_id=node_id_2, + edge_type="RELATED_TO", + properties={ + "relationship": "框架与语言", + "confidence": 0.95 + }, + user_name="alice" +) +print(f"✓ 关系已创建: {edge_id}") + +# ======================================== +# 步骤 4: 向量搜索 +# ======================================== +query_embedding = [0.12, 0.22, 0.32, ...] # 查询向量 + +results = graph_db.search_by_embedding( + embedding=query_embedding, + top_k=5, + memory_type="Knowledge", + user_name="alice" +) + +print(f"\n🔍 找到 {len(results)} 个相似节点:") +for node in results: + print(f" - {node.get('content')} (相似度: {node.get('score', 'N/A')})") + +# ======================================== +# 步骤 5: 删除节点 +# ======================================== +graph_db.delete_node(id=node_id, user_name="alice") +print(f"✓ 节点 {node_id} 已删除") +``` + +## 配置详解 + +### PolarDBGraphDBConfig 参数说明 + +| 参数 | 类型 | 默认值 | 必填 | 说明 | +|------|------|--------|------|------| +| `host` | str | - | ✓ | 数据库主机地址 | +| `port` | int | 5432 | ✗ | 数据库端口 | +| `user` | str | - | ✓ | 数据库用户名 | +| `password` | str | - | ✓ | 数据库密码 | +| `db_name` | str | - | ✓ | 目标数据库名称 | +| `user_name` | str | None | ✗ | 租户标识(用于逻辑隔离) | +| `use_multi_db` | bool | True | ✗ | 是否使用多数据库物理隔离 | +| `auto_create` | bool | False | ✗ | 是否自动创建数据库 | +| `embedding_dimension` | int | 1024 | ✗ | 向量嵌入维度 | +| `maxconn` | int | 100 | ✗ | 连接池最大连接数 | + +### 多租户模式对比 + +| 特性 | 物理隔离
(`use_multi_db=True`) | 逻辑隔离
(`use_multi_db=False`) | +|------|-----------------------------------|-------------------------------------| +| **隔离级别** | 数据库级别 | 应用层标签过滤 | +| **配置要求** | `db_name` 通常等于 `user_name` | 必须提供 `user_name` | +| **性能** | 更好(独立资源) | 较好(共享资源) | +| **成本** | 高(每租户独立DB) | 低(共享数据库) | +| **适用场景** | 企业客户、高安全要求 | SaaS 多租户、开发测试 | +| **数据迁移** | 方便(整库导出) | 需要按标签过滤 | + +### 配置示例 + +#### 示例 1:物理隔离(企业版推荐) + +```json +{ + "graph_db_store": { + "backend": "polardb", + "config": { + "host": "prod-polardb.example.com", + "port": 5432, + "user": "admin", + "password": "secure_password", + "db_name": "customer_001", + "user_name": null, + "use_multi_db": true, + "auto_create": false, + "embedding_dimension": 1536, + "maxconn": 200 + } + } +} +``` + +#### 示例 2:逻辑隔离(SaaS 推荐) + +```json +{ + "graph_db_store": { + "backend": "polardb", + "config": { + "host": "shared-polardb.example.com", + "port": 5432, + "user": "app_user", + "password": "app_password", + "db_name": "shared_memos", + "user_name": "tenant_alice", + "use_multi_db": false, + "auto_create": false, + "embedding_dimension": 768, + "maxconn": 50 + } + } +} +``` + +## 高级特性 + +### 1. 批量插入节点 + +```python +# 批量添加节点(高性能) +nodes_data = [ + { + "label": "Memory", + "properties": {"content": f"节点 {i}", "memory_type": "Test"}, + "embedding": [0.1 * i] * 1024, + } + for i in range(100) +] + +node_ids = graph_db.add_nodes_batch( + nodes=nodes_data, + user_name="alice" +) +print(f"✓ 批量创建了 {len(node_ids)} 个节点") +``` + +### 2. 复杂查询示例 + +```python +# 查找特定类型的记忆并按时间排序 +def get_recent_memories(graph_db, memory_type, limit=10): + """获取最近的记忆节点""" + query = f""" + SELECT * FROM "{graph_db.db_name}_graph"."Memory" + WHERE properties->>'memory_type' = %s + AND properties->>'user_name' = %s + ORDER BY updated_at DESC + LIMIT %s + """ + + conn = graph_db._get_connection() + try: + with conn.cursor() as cursor: + cursor.execute(query, [memory_type, "alice", limit]) + results = cursor.fetchall() + return results + finally: + graph_db._return_connection(conn) + +# 使用示例 +recent = get_recent_memories(graph_db, "WorkingMemory", limit=5) +print(f"最近 5 条工作记忆: {len(recent)} 条") +``` + +### 3. 向量索引优化 + +```python +# 创建或更新向量索引 +graph_db.create_index( + label="Memory", + vector_property="embedding", + dimensions=1024, + index_name="memory_vector_index" +) +print("✓ 向量索引已优化") +``` + +### 4. 
连接池监控 + +```python +# 查看连接池状态(仅供调试) +import logging +logging.basicConfig(level=logging.DEBUG) + +# 获取连接时会输出详细日志 +conn = graph_db._get_connection() +# [DEBUG] [_get_connection] Successfully acquired connection from pool +graph_db._return_connection(conn) +# [DEBUG] [_return_connection] Successfully returned connection to pool +``` + +## BaseGraphDB 接口 + +PolarDB 实现了 `BaseGraphDB` 抽象类的所有方法,确保与其他图数据库后端的互换性。 + +### 核心方法 + +| 方法 | 说明 | 参数 | +|------|------|------| +| `add_node()` | 添加单个节点 | label, properties, embedding, user_name | +| `add_nodes_batch()` | 批量添加节点 | nodes, user_name | +| `update_node()` | 更新节点属性 | id, fields, user_name | +| `delete_node()` | 删除节点 | id, user_name | +| `delete_node_by_params()` | 按条件删除节点 | params, user_name | +| `add_edge()` | 创建关系 | source_id, target_id, edge_type, properties, user_name | +| `update_edge()` | 更新关系属性 | edge_id, properties, user_name | +| `delete_edge()` | 删除关系 | edge_id, user_name | +| `search_by_embedding()` | 向量相似度搜索 | embedding, top_k, memory_type, user_name | +| `get_node()` | 获取单个节点 | id, user_name | +| `get_memory_count()` | 统计节点数量 | memory_type, user_name | +| `remove_oldest_memory()` | 清理旧记忆 | memory_type, keep_latest, user_name | + +### 完整方法签名示例 + +```python +from typing import Any + +# 添加节点 +def add_node( + self, + label: str = "Memory", + properties: dict[str, Any] | None = None, + embedding: list[float] | None = None, + user_name: str | None = None +) -> str: + """添加一个新节点到图数据库""" + pass + +# 向量搜索 +def search_by_embedding( + self, + embedding: list[float], + top_k: int = 10, + memory_type: str | None = None, + user_name: str | None = None, + filters: dict[str, Any] | None = None +) -> list[dict[str, Any]]: + """基于向量嵌入进行相似度搜索""" + pass + +# 批量操作 +def add_nodes_batch( + self, + nodes: list[dict[str, Any]], + user_name: str | None = None +) -> list[str]: + """批量添加多个节点""" + pass +``` + +## 扩展开发指南 + +如果需要基于 PolarDB 实现自定义功能,可以继承 `PolarDBGraphDB` 类: + +```python +from memos.graph_dbs.polardb import PolarDBGraphDB +from memos.configs.graph_db import PolarDBGraphDBConfig + +class CustomPolarDBGraphDB(PolarDBGraphDB): + """自定义 PolarDB 图数据库实现""" + + def __init__(self, config: PolarDBGraphDBConfig): + super().__init__(config) + # 自定义初始化逻辑 + self.custom_index_created = False + + def create_custom_index(self): + """创建自定义索引""" + conn = self._get_connection() + try: + with conn.cursor() as cursor: + cursor.execute(f""" + CREATE INDEX IF NOT EXISTS idx_custom_field + ON "{self.db_name}_graph"."Memory" + ((properties->>'custom_field')); + """) + conn.commit() + self.custom_index_created = True + print("✓ 自定义索引已创建") + except Exception as e: + print(f"❌ 创建索引失败: {e}") + conn.rollback() + finally: + self._return_connection(conn) + + def search_by_custom_field(self, field_value: str): + """基于自定义字段搜索""" + query = f""" + SELECT * FROM "{self.db_name}_graph"."Memory" + WHERE properties->>'custom_field' = %s + """ + + conn = self._get_connection() + try: + with conn.cursor() as cursor: + cursor.execute(query, [field_value]) + results = cursor.fetchall() + return results + finally: + self._return_connection(conn) + +# 使用自定义实现 +config = PolarDBGraphDBConfig( + host="localhost", + port=5432, + user="postgres", + password="password", + db_name="custom_db" +) + +custom_db = CustomPolarDBGraphDB(config) +custom_db.create_custom_index() +results = custom_db.search_by_custom_field("special_value") +``` + +## 参考资源 + +- [Apache AGE 官方文档](https://age.apache.org/) +- [PostgreSQL 连接池文档](https://www.psycopg.org/docs/pool.html) +- [PolarDB 
官方文档](https://www.alibabacloud.com/product/polardb) +- [MemOS GitHub 仓库](https://github.com/MemOS-AI/MemOS) + +## 下一步 + +- 了解 [Neo4j 图数据库](./neo4j_graph_db.md) 的使用 +- 查看 [通用文本记忆](./general_textual_memory.md) 的配置 +- 探索 [树形文本记忆](./tree_textual_memory.md) 的高级特性 diff --git a/docs/cn/open_source/modules/memories/preference_textual_memory.md b/docs/cn/open_source/modules/memories/preference_textual_memory.md new file mode 100644 index 00000000..8e8370c2 --- /dev/null +++ b/docs/cn/open_source/modules/memories/preference_textual_memory.md @@ -0,0 +1,227 @@ +--- +title: "PreferenceTextMemory: 存储和管理用户偏好的明文记忆" +desc: "`PreferenceTextMemory` 是MemOS中用于存储和管理用户偏好的明文记忆模块。它适用于需要根据用户偏好进行记忆检索的场景。" +--- + +## 目录 + +- [为什么需要偏好记忆](#为什么需要偏好记忆) + - [优势特性](#优势特性) + - [应用场景](#应用场景) +- [核心概念与工作流程](#核心概念与工作流程) + - [记忆结构](#记忆结构) + - [元数据字段](#元数据字段) + - [核心工作流](#核心工作流) +- [API 参考](#api-参考) + - [初始化](#初始化) + - [核心方法](#核心方法) + - [文件存储](#文件存储) +- [动手实践:从 0 到 1](#动手实践从-0-到-1) + - [创建 PreferenceTextMemory 配置](#创建-preferencetextmemory-配置) + - [初始化 PreferenceTextMemory](#初始化-preferencetextmemory) + - [抽取结构化记忆](#抽取结构化记忆) + - [搜索记忆](#搜索记忆) + - [备份与恢复](#备份与恢复) + - [完整代码示例](#完整代码示例) + + +## 为什么需要偏好记忆 + +### 优势特性 + +::list{icon="ph:check-circle-duotone"} +- **双重偏好提取**:自动识别显式偏好和隐式偏好 +- **语义理解**:使用向量嵌入理解偏好的深层含义 +- **智能去重**:自动检测和合并重复或冲突的偏好 +- **精准检索**:基于向量相似度的语义搜索 +- **持久化存储**:支持向量数据库(Qdrant/Milvus) +- **可扩展性**:支持大规模偏好数据管理 +- **个性化增强**:为每个用户维护独立的偏好档案 +:: + +### 应用场景 + +::list{icon="ph:lightbulb-duotone"} +- 个性化对话代理(记住用户喜好) +- 智能推荐系统(基于偏好推荐) +- 客户服务系统(提供定制化服务) +- 内容过滤系统(根据偏好筛选内容) +- 学习辅助系统(适应学习风格) +:: + +::alert{type="info"} + +总结来说,当你需要构建能够"记住"用户喜好并据此提供个性化服务的系统时,`PreferenceTextMemory` 是最佳选择。 +:: + +## 核心概念与工作流程 +### 记忆结构 + +在MemOS中,偏好记忆以`PreferenceTextMemory`表示,每条记忆都是一个`TextualMemoryItem`,使用Milvus数据库存储。 +- `id`: 唯一记忆ID(如果省略则自动生成) +- `memory`: 主要文本 +- `metadata`: 包括层次结构信息、嵌入、标签、实体、源和状态 + +偏好记忆又可以分为显式偏好记忆和隐式偏好记忆: +- **显式偏好记忆**:用户明确表达的喜好或厌恶。**示例**: + - "我喜欢深色模式" + - "我不吃辣" + - "请用简短的回答" + - "我更喜欢技术文档而不是视频教程" + +- **隐式偏好记忆**:从用户行为和对话模式中推断出的偏好。**示例**: + - 用户总是询问代码示例 → 偏好实践导向的学习 + - 用户经常要求详细解释 → 偏好深入理解 + - 用户多次提到环保话题 → 关注可持续发展 + +::alert{type="success"} +**智能提取**
+`PreferenceTextMemory` 使用 LLM 自动从对话中同时提取显式和隐式偏好,无需手动标注! +:: + +### 元数据字段 (`PreferenceTextualMemoryMetadata`) + +| 字段 | 类型 | 描述 | +| ------------- | -------------------------------------------------- | ----------------------------------- | +| `preference_type` | `"explicit_preference"`, `"implicit_preference"` | 偏好记忆类型,分为显式偏好记忆和隐式偏好记忆 | +| `dialog_id` | `str` | 对话ID,用于关联偏好记忆与特定对话 | +| `original_text` | `str` | 原始文本,包含用户偏好信息 | +| `embedding` | `str` | 嵌入向量,用于语义搜索和检索 | +| `preference` | `str` | 用户偏好信息 | +| `create_at` | `str` | 创建时间戳 (ISO 8601) | +| `mem_cube_id` | `str` | 记忆立方ID,用于关联偏好记忆与特定记忆立方 | +| `score` | `float ` | 检索结果中偏好记忆和query的相似度评分 | + +### 核心工作流 + +当您运行此示例时,您的工作流将: + +1. **抽取:** 使用LLM从原始文本中提取结构化记忆. + + +2. **嵌入:** 为相似性搜索生成向量嵌入. + + +3. **存储:** 将偏好记忆存储到Milvus数据库中,同时更新元数据字段. + + +4. **搜索:** 通过向量相似度查询,返回最相关的偏好记忆. + +## API 参考 + +### 初始化 + +```python +PreferenceTextMemory(config: PreferenceTextMemoryConfig) +``` + +### 核心方法 + +| 方法 | 描述 | +| --------------------------- | ----------------------------------------------------- | +| `get_memory(messages)` | 从原始对话中抽取偏好记忆. | +| `search(query, top_k)` | 使用向量相似度检索top-k偏好记忆. | +| `load(dir)` | 从存储的文件中加载偏好记忆. | +| `dump(dir)` | 将所有偏好记忆序列化到目录中的JSON文件. | +| `add(memories)` | 批量添加偏好记忆到Milvus数据库. | +| `get_with_collection_name(collection_name, memory_id)` | 通过集合名称和记忆ID获取特定类型的偏好记忆. | +| `get_by_ids_with_collection_name(collection_name, memory_ids)` | 通过集合名称和记忆IDs**批量**获取特定类型的偏好记忆. | +| `get_all()` | 获取所有偏好记忆. | +| `get_memory_by_filter(filter)` | 根据过滤条件获取偏好记忆. | +| `delete(memory_ids)` | 删除指定ID的偏好记忆. | +| `delete_by_filter(filter)` | 根据过滤条件删除偏好记忆. | +| `delete_with_collection_name(collection_name, memory_ids)` | 删除指定集合名称和IDs的所有偏好记忆. | +| `delete_all()` | 删除所有偏好记忆. | + + +### 文件存储 + +当调用 `dump(dir)` 时,MemOS将所有偏好记忆序列化到目录中的JSON文件中: +``` +/ +``` + +--- + +## 动手实践:从 0 到 1 + +::steps{} + +### 创建 PreferenceTextMemory 配置 +定义: +- 你的embedding模型(例如,nomic-embed-text:latest), +- 你的Milvus数据库后端, +- 记忆抽取器(基于LLM)(可选). + +```python +from memos.configs.memory import PreferenceTextMemoryConfig + +config = PreferenceTextMemoryConfig.from_json_file("examples/data/config/preference_config.json") +``` + +### 初始化 PreferenceTextMemory + +```python +from memos.memories.textual.preference import PreferenceTextMemory + +preference_memory = PreferenceTextMemory(config) +``` + +### 抽取结构化记忆 + +使用记忆抽取器将对话、文件或文档解析为多个`TextualMemoryItem`. + +```python +scene_data = [[ + {"role": "user", "content": "Tell me about your childhood."}, + {"role": "assistant", "content": "I loved playing in the garden with my dog."} +]] + +memories = preference_memory.get_memory(scene_data, type="chat", info={"user_id": "1234"}) +preference_memory.add(memories) +``` + +### 搜索记忆 + +```python +results = preference_memory.search("Tell me more about the user", top_k=2) +``` + +### 备份与恢复 +支持偏好记忆的持久化存储与随时重载: +```python +preference_memory.dump("tmp/pref_memories") +preference_memory.load("tmp/pref_memories") +``` + +:: + +### 完整代码示例 + +该示例整合了上述所有步骤,提供一个端到端的完整流程,以Milvus为例 —— 复制即可运行! 
+ +```python +from memos.configs.memory import PreferenceTextMemoryConfig +from memos.memories.textual.preference import PreferenceTextMemory + +# 创建PreferenceTextMemory +config = PreferenceTextMemoryConfig.from_json_file("examples/data/config/preference_config.json") + +preference_memory = PreferenceTextMemory(config) +preference_memory.delete_all() + +scene_data = [[ + {"role": "user", "content": "Tell me about your childhood."}, + {"role": "assistant", "content": "I loved playing in the garden with my dog."} +]] + +# 从原始对话中抽取偏好记忆,并添加到Milvus数据库中 +memories = preference_memory.get_memory(scene_data, type="chat", info={"user_id": "1234"}) +preference_memory.add(memories) + +# 搜索记忆 +results = preference_memory.search("Tell me more about the user", top_k=2) + +# 持久化存储偏好记忆 +preference_memory.dump("tmp/pref_memories") +``` diff --git a/docs/cn/open_source/modules/memories/tree_textual_memory.md b/docs/cn/open_source/modules/memories/tree_textual_memory.md new file mode 100644 index 00000000..3074a770 --- /dev/null +++ b/docs/cn/open_source/modules/memories/tree_textual_memory.md @@ -0,0 +1,509 @@ +--- +title: "TreeTextMemory: 树形明文记忆" +desc: > + 让我们在MemOS中构建你的第一个**基于图的、树形明文记忆**! +
+ **TreeTextMemory** 支持以结构化方式组织、关联并检索记忆,同时保留丰富的上下文信息与良好的可解释性。 +
+ MemOS当前使用[Neo4j](/open_source/modules/memories/neo4j_graph_db)作为后端,未来计划支持更多图数据库。 +--- + + + +## 目录 + +- [你将学到什么](#你将学到什么) +- [核心概念与工作流程](#核心概念与工作流程) + - [记忆结构](#记忆结构) + - [元数据字段](#元数据字段-treenodetextualmemorymetadata) + - [核心工作流](#核心工作流) +- [API 参考](#api-参考) +- [动手实践:从 0 到 1](#动手实践从-0-到-1) + - [创建 TreeTextMemory 配置](#创建-treetextmemory-配置) + - [初始化 TreeTextMemory](#初始化-treetextmemory) + - [抽取结构化记忆](#抽取结构化记忆) + - [搜索记忆](#搜索记忆) + - [从互联网检索记忆(可选)](#从互联网检索记忆可选) + - [替换工作记忆](#替换工作记忆) + - [备份与恢复](#备份与恢复) + - [完整代码示例](#完整代码示例) +- [为什么选择 TreeTextMemory](#为什么选择-treetextmemory) +- [下一步](#下一步) + +## 你将学到什么 + +在本指南的最后,你会: +- 从原始文本或对话中提取结构化记忆 +- 在图数据库中存储他们作为**节点** +- 将记忆链接成**层次结构**和语义图 +- 使用**向量相似度+图遍历**进行搜索 + +## 核心概念与工作流程 + +### 记忆结构 + +每个节点在`TreeTextMemory` 是一个 `TextualMemoryItem`: +- `id`: 唯一记忆ID(如果省略则自动生成) +- `memory`: 主要文本 +- `metadata`: 包括层次结构信息、嵌入、标签、实体、源和状态 + +### 元数据字段 (`TreeNodeTextualMemoryMetadata`) + +| 字段 | 类型 | 描述 | +| --------------- |-------------------------------------------------------| ------------------------------------------ | +| `memory_type` | `"WorkingMemory"`, `"LongTermMemory"`, `"UserMemory"` | 生命周期分类 | +| `status` | `"activated"`, `"archived"`, `"deleted"` | 节点状态 | +| `visibility` | `"private"`, `"public"`, `"session"` | 访问范围 | +| `sources` | `list[str]` | 来源列表 (例如: 文件, URLs) | +| `source` | `"conversation"`, `"retrieved"`, `"web"`, `"file"` | 原始来源类型 | +| `confidence` | `float (0-100)` | 确定性得分 | +| `entities` | `list[str]` | 提及的实体或概念 | +| `tags` | `list[str]` | 主题标签 | +| `embedding` | `list[float]` | 基于向量嵌入的相似性搜索 | +| `created_at` | `str` | 创建时间戳(ISO 8601) | +| `updated_at` | `str` | 最近更新时间戳(ISO 8601) | +| `usage` | `list[str]` | 使用历史 | +| `background` | `str` | 附加上下文 | + + +::note +**最佳实践**
使用有意义的标签(tags)和背景(background)——它们有助于组织记忆图,支撑多跳推理。
::

### 核心工作流

运行示例时,完整的工作流如下:

1. **抽取:** 使用 LLM 从原始文本中提取结构化记忆。

2. **嵌入:** 为相似性搜索生成向量嵌入。

3. **存储和链接:** 将节点及其关系写入图数据库(Neo4j)。

4. **搜索:** 先通过向量相似度查询,再沿图的多跳关系扩展结果。


::note
**提示**

图链接有助于检索纯向量搜索可能遗漏的上下文! +:: + +## API 参考 + +### 初始化 + +```python +TreeTextMemory(config: TreeTextMemoryConfig) +``` + +### 核心方法 + +| 方法 | 描述 | +| --------------------------- | ----------------------------------------------------- | +| `add(memories)` | 添加一个或多个记忆(项目或字典) | +| `replace_working_memory()` | 更换所有的WorkingMemory节点 | +| `get_working_memory()` | 得到所有的WorkingMemory节点 | +| `search(query, top_k)` | 使用向量+图搜索检索top-k个记忆 | +| `get(memory_id)` | 通过ID获取单个记忆 | +| `get_by_ids(ids)` | 通过IDs获取多个记忆 | +| `get_all()` | 将整个记忆图导出为字典 | +| `update(memory_id, new)` | 通过ID更新记忆 | +| `delete(ids)` | 通过IDs删除记忆 | +| `delete_all()` | 删除所有的记忆和关系 | +| `dump(dir)` | 在目录中将图序列化为JSON | +| `load(dir)` | 从保存的JSON文件加载图 | +| `drop(keep_last_n)` | 备份图和删除数据库,保留N个备份 | + +### 文件存储 + +当调用 `dump(dir)`, MemOS将树形明文记忆导出为JSON文件: + +``` +/ +``` + +这个文件包含一个JSON结构,有 `nodes` and `edges`. 它可以使用 `load(dir)`重新加载. + +--- + +## 动手实践:从 0 到 1 + +::steps{} + +### 创建 TreeTextMemory 配置 +定义: +- 你的embedding模型(例如,nomic-embed-text:latest), +- 你的图数据库后端(Neo4j), +- 记忆抽取器(基于LLM)(可选). + +```python +from memos.configs.memory import TreeTextMemoryConfig + +config = TreeTextMemoryConfig.from_json_file("examples/data/config/tree_config.json") +``` + + +### 初始化 TreeTextMemory + +```python +from memos.memories.textual.tree import TreeTextMemory + +tree_memory = TreeTextMemory(config) +``` + +### 抽取结构化记忆 + +使用记忆抽取器将对话、文件或文档解析为多个`TextualMemoryItem`. + +#### 使用 SimpleStructMemReader(基础) + +```python +from memos.mem_reader.simple_struct import SimpleStructMemReader + +reader = SimpleStructMemReader.from_json_file("examples/data/config/simple_struct_reader_config.json") + +scene_data = [[ + {"role": "user", "content": "Tell me about your childhood."}, + {"role": "assistant", "content": "I loved playing in the garden with my dog."} +]] + +memories = reader.get_memory(scene_data, type="chat", info={"user_id": "1234"}) +for m_list in memories: + tree_memory.add(m_list) +``` + +#### 使用 MultiModalStructMemReader(高级) + +`MultiModalStructMemReader` 支持处理多模态内容(文本、图片、URL、文件等),能够自动感知(智能路由)到不同的解析器: + +```python +from memos.configs.mem_reader import MultiModalStructMemReaderConfig +from memos.mem_reader.multi_modal_struct import MultiModalStructMemReader + +# 创建 MultiModal Reader 配置 +multimodal_config = MultiModalStructMemReaderConfig( + llm={ + "backend": "openai", + "config": { + "model_name_or_path": "gpt-4o-mini", + "api_key": "your-api-key" + } + }, + embedder={ + "backend": "openai", + "config": { + "model_name_or_path": "text-embedding-3-small", + "api_key": "your-api-key" + } + }, + chunker={ + "backend": "text_splitter", + "config": { + "chunk_size": 1000, + "chunk_overlap": 200 + } + }, + extractor_llm={ + "backend": "openai", + "config": { + "model_name_or_path": "gpt-4o-mini", + "api_key": "your-api-key" + } + }, + # 可选:指定哪些域名直接返回 Markdown + direct_markdown_hostnames=["github.com", "docs.python.org"] +) + +# 初始化 MultiModal Reader +multimodal_reader = MultiModalStructMemReader(multimodal_config) + +# ======================================== +# 示例 1: 处理包含图片的对话 +# ======================================== +scene_with_image = [[ + { + "role": "user", + "content": [ + {"type": "text", "text": "这是我家的花园"}, + {"type": "image_url", "image_url": {"url": "https://example.com/garden.jpg"}} + ] + }, + { + "role": "assistant", + "content": "你的花园很漂亮!" 
    }
]]

memories = multimodal_reader.get_memory(
    scene_with_image,
    type="chat",
    info={"user_id": "1234", "session_id": "session_001"}
)
for m_list in memories:
    tree_memory.add(m_list)
print(f"✓ 已添加 {len(memories)} 条多模态记忆")

# ========================================
# 示例 2: 处理网页 URL
# ========================================
scene_with_url = [[
    {
        "role": "user",
        "content": "请分析这篇文章: https://example.com/article.html"
    },
    {
        "role": "assistant",
        "content": "我会帮你分析这篇文章"
    }
]]

url_memories = multimodal_reader.get_memory(
    scene_with_url,
    type="chat",
    info={"user_id": "1234", "session_id": "session_002"}
)
for m_list in url_memories:
    tree_memory.add(m_list)
print(f"✓ 已从 URL 提取并添加 {len(url_memories)} 条记忆")

# ========================================
# 示例 3: 处理本地文件
# ========================================
# 支持的文件类型: PDF, DOCX, TXT, Markdown, HTML 等
file_paths = [
    "./documents/report.pdf",
    "./documents/notes.md",
    "./documents/data.txt"
]

file_memories = multimodal_reader.get_memory(
    file_paths,
    type="doc",
    info={"user_id": "1234", "session_id": "session_003"}
)
for m_list in file_memories:
    tree_memory.add(m_list)
print(f"✓ 已从文件提取并添加 {len(file_memories)} 条记忆")

# ========================================
# 示例 4: 混合模式(文本 + 图片 + URL)
# ========================================
mixed_scene = [[
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "这是我的项目文档:"},
            {"type": "text", "text": "https://github.com/user/project/README.md"},
            {"type": "image_url", "image_url": {"url": "https://example.com/diagram.png"}}
        ]
    }
]]

mixed_memories = multimodal_reader.get_memory(
    mixed_scene,
    type="chat",
    info={"user_id": "1234", "session_id": "session_004"}
)
for m_list in mixed_memories:
    tree_memory.add(m_list)
print(f"✓ 已从混合内容提取并添加 {len(mixed_memories)} 条记忆")
```

::alert{type="info"}
**MultiModal Reader 优势**
- **智能路由**:自动识别内容类型(图片/URL/文件)并选择合适的解析器
- **格式支持**:支持 PDF、DOCX、Markdown、HTML、图片等多种格式
- **URL 解析**:自动提取网页内容(包括 GitHub、文档站点等)
- **大文件处理**:自动分块处理超大文件,避免 token 超限
- **上下文保持**:使用滑动窗口保持分块间的上下文连续性
::

::note
**配置提示**
- 使用 `direct_markdown_hostnames` 参数指定哪些域名直接返回 Markdown 格式
- 支持 `mode="fast"`(快速)与 `mode="fine"`(精细)两种提取模式,fine 模式提取的内容更详细
+- 查看完整示例: `/examples/mem_reader/multimodal_struct_reader.py` +:: + +### 搜索记忆 + +尝试向量搜索+图搜索: +```python +results = tree_memory.search("Talk about the garden", top_k=5) +for i, node in enumerate(results): + print(f"{i}: {node.memory}") +``` + +### 从互联网检索记忆(可选) +你也可以从 Google / Bing / Bocha(博查) 等搜索引擎实时获取网页内容,并自动切分为记忆节点。MemOS 提供了统一接口。 + +以下示例演示如何检索“Alibaba 2024 ESG report”相关网页,并自动提取为结构化记忆。 + +```python + +# 创建embedder +embedder = EmbedderFactory.from_config( + EmbedderConfigFactory.model_validate({ + "backend": "ollama", + "config": {"model_name_or_path": "nomic-embed-text:latest"}, + }) +) + +# 配置检索器(以 BochaAI 为例) +retriever_config = InternetRetrieverConfigFactory.model_validate({ + "backend": "bocha", + "config": { + "api_key": "sk-xxx", # 替换为你的 BochaAI API Key + "max_results": 5, + "reader": { # 自动分块的 Reader 配置 + "backend": "simple_struct", + "config": ..., # 你的mem-reader config + }, + } +}) + +# 实例化检索器 +retriever = InternetRetrieverFactory.from_config(retriever_config, embedder) + +# 执行网页检索 +results = retriever.retrieve_from_internet("Alibaba 2024 ESG report") + +# 添加到记忆图中 +for m in results: + tree_memory.add(m) + +``` +你也可以直接在 TreeTextMemoryConfig 中配置 internet_retriever 字段,例如: + + +```json +{ + "internet_retriever": { + "backend": "bocha", + "config": { + "api_key": "sk-xxx", + "max_results": 5, + "reader": { + "backend": "simple_struct", + "config": ... + } + } + } +} +``` + +这样,在调用 tree_memory.search(query) 时,系统会自动调用互联网检索(如 BochaAI / Google / Bing)然后将结果与本地图中的节点一起排序返回,无需手动调用 retriever.retrieve_from_internet + +### 替换工作记忆 + +用一个新的节点替换你当前的 `WorkingMemory`: +```python +tree_memory.replace_working_memory( + [{ + "memory": "User is discussing gardening tips.", + "metadata": {"memory_type": "WorkingMemory"} + }] +) +``` + +### 备份与恢复 +支持树结构的持久化存储与随时重载: +```python +tree_memory.dump("tmp/tree_memories") +tree_memory.load("tmp/tree_memories") +``` + +:: + + +### 完整代码示例 + +该示例整合了上述所有步骤,提供一个端到端的完整流程 —— 复制即可运行! + +```python +from memos.configs.embedder import EmbedderConfigFactory +from memos.configs.memory import TreeTextMemoryConfig +from memos.configs.mem_reader import SimpleStructMemReaderConfig +from memos.embedders.factory import EmbedderFactory +from memos.mem_reader.simple_struct import SimpleStructMemReader +from memos.memories.textual.tree import TreeTextMemory + +# 嵌入设置 +embedder_config = EmbedderConfigFactory.model_validate({ + "backend": "ollama", + "config": {"model_name_or_path": "nomic-embed-text:latest"} +}) +embedder = EmbedderFactory.from_config(embedder_config) + +# 创建TreeTextMemory +tree_config = TreeTextMemoryConfig.from_json_file("examples/data/config/tree_config.json") +my_tree_textual_memory = TreeTextMemory(tree_config) +my_tree_textual_memory.delete_all() + +# 阅读器设置 +reader_config = SimpleStructMemReaderConfig.from_json_file( + "examples/data/config/simple_struct_reader_config.json" +) +reader = SimpleStructMemReader(reader_config) + +# 从对话中抽取 +scene_data = [[ + { + "role": "user", + "content": "Tell me about your childhood." + }, + { + "role": "assistant", + "content": "I loved playing in the garden with my dog." 
    },
]]
memory = reader.get_memory(scene_data, type="chat", info={"user_id": "1234", "session_id": "2222"})
for m_list in memory:
    my_tree_textual_memory.add(m_list)

# 搜索
results = my_tree_textual_memory.search(
    "Talk about the user's childhood story?",
    top_k=10
)
for i, r in enumerate(results):
    print(f"{i}'th result: {r.memory}")

# 从文档添加(可选)
doc_paths = ["./text1.txt", "./text2.txt"]
doc_memory = reader.get_memory(
    doc_paths, "doc", info={
        "user_id": "your_user_id",
        "session_id": "your_session_id",
    }
)
for m_list in doc_memory:
    my_tree_textual_memory.add(m_list)

# 转储与清理(可选)
my_tree_textual_memory.dump("tmp/my_tree_textual_memory")
my_tree_textual_memory.drop()
```

## 为什么选择 TreeTextMemory

- **层次结构:** 像思维导图一样组织记忆——节点可以拥有父节点、子节点和交叉链接。
- **图式链接:** 超越纯粹的层次结构,支持构建多跳推理链。
- **语义搜索+图扩展:** 兼得向量检索与图遍历的优点。
- **可解释性:** 可以追踪记忆如何随时间连接、合并与演化。

::note
**尝试一下**
从文档或 Web 内容中添加记忆节点,手动链接它们,或自动合并相似的节点!
::

## 下一步

- **了解更多 [Neo4j](/open_source/modules/memories/neo4j_graph_db):** TreeTextMemory 由图数据库后端提供支持。了解 Neo4j 如何处理节点、边和遍历,将帮助您设计更有效的记忆层次结构、多跳推理和上下文链接策略。
- **添加 [Activation Memory](/open_source/modules/memories/kv_cache_memory):** 使用运行时 KV-cache 试验会话状态。
- **探索图推理:** 为多跳检索和答案合成构建工作流。
- **更进一步:** 高级用法请查阅 [API Reference](/api-reference/search-memories),或运行 `examples/` 目录中的更多示例。

现在你的 Agent 不仅能记住事实,还能记住它们之间的联系!
diff --git a/docs/cn/open_source/modules/mos/overview.md b/docs/cn/open_source/modules/mos/overview.md
new file mode 100644
index 00000000..fd0817f0
--- /dev/null
+++ b/docs/cn/open_source/modules/mos/overview.md
@@ -0,0 +1,106 @@
---
title: API 开发指南 (Component & Handler 架构)
desc: MemOS v2.0 采用了更加模块化和解耦的架构。旧版的 MOS 类已被弃用,现在推荐使用 Components (组件) + Handlers (处理器) 的模式进行开发。
---

这种架构将“系统的元件”(Components)与“业务逻辑的执行”(Handlers)分离开来,使得系统更易于扩展、测试和维护。

## 1. 核心概念

### 1.1 Components (核心组件)

Components 是 MemOS 的各个“器官”,它们在服务器启动时被初始化(通过 `init_server()`),并在整个生命周期中复用。

核心组件包括:

#### 核心记忆组件

1. **MemCube**:记忆容器,用于隔离不同用户/不同应用场景的记忆,并统一管理多种记忆模块。
2. **MemReader**:记忆加工器,把用户输入的各类素材(聊天、文档、图片)解析为系统可写入的记忆片段。
3. **MemScheduler**:后台调度器,负责管理后台任务队列,将记忆的存储、索引、组织等耗时操作异步处理,支持多任务的并发执行。
4. **MemChat**:对话控制器,负责在对话过程中自动完成“记忆检索 -> 上下文管理 -> LLM 调度 -> 记忆更新”的对话闭环。
5. **MemFeedback**:记忆纠错器,自动理解用户的自然语言反馈,精准定位冲突记忆并执行原子级修正(纠错/补充/替换)。

### 1.2 Handlers (业务处理器)

Handlers 是 MemOS 的“大脑”,它们封装了具体的业务逻辑(如添加、搜索、对话),通过调用和协调 Components(器官)的各项能力来完成具体的用户任务,组织方式见下方示意。

#### 核心 Handler 概览

| Handler | 作用 | 核心方法 |
| :--- | :--- | :--- |
| **AddHandler** | 添加记忆 (对话/文档/文本) | `handle_add_memories` |
| **SearchHandler** | 搜索记忆 (语义检索) | `handle_search_memories` |
| **ChatHandler** | 对话 (带记忆增强) | `handle_chat_complete`, `handle_chat_stream` |
| **FeedbackHandler** | 反馈 (修正记忆/人工干预) | `handle_feedback_memories` |
| **MemoryHandler** | 管理 (获取详情/删除) | `handle_get_memory`, `handle_delete_memories` |
| **SchedulerHandler** | 调度 (查询异步任务状态) | `handle_scheduler_status`, `handle_scheduler_wait` |
| **SuggestionHandler** | 建议 (生成推荐问题) | `handle_get_suggestion_queries` |
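下面用一段可运行的极简 Python 示意展示 Handler 如何通过依赖容器获取组件。注意:这并非 MemOS 的真实实现,类与字段名仅对应上文的描述,实际定义请以源码为准。

```python
from dataclasses import dataclass
from typing import Any

# 示意:依赖容器,字段对应上文提到的核心组件(真实定义以 MemOS 源码为准)
@dataclass
class HandlerDependencies:
    naive_mem_cube: Any
    mem_reader: Any
    feedback_server: Any

class SearchHandler:
    """示意:Handler 只接收依赖容器,不自行创建组件,便于在测试中注入替身对象。"""

    def __init__(self, deps: HandlerDependencies):
        self.mem_cube = deps.naive_mem_cube

    def handle_search_memories(self, query: str, top_k: int = 10):
        # 真实实现还包含查询重写、向量召回与重排序,这里仅示意调用入口
        return self.mem_cube.search(query, top_k)
```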

## 2. API 详解

### 2.1 初始化 (Initialization)
初始化是系统启动的基石。所有 Handler 的运行都依赖于统一的组件注册与依赖注入机制。

- 组件加载(`init_server`):系统启动时会初始化所有核心组件,包括 LLM(大语言模型)、存储层(向量数据库、图数据库)、调度器(Scheduler)以及各类 Memory Cube。
- 依赖注入(`HandlerDependencies`):为了保证代码的解耦与可测试性,所有组件被统一封装到一个依赖容器(`HandlerDependencies`)中。当 Handler 启动时,只需接收这个容器,就能获取所需的 `naive_mem_cube`、`mem_reader`、`feedback_server` 等资源,而无需各自重复创建这些组件。

### 2.2 添加记忆 (AddHandler)
AddHandler 是大脑的“记忆接纳指令”,负责将外部信息转化并写入系统记忆。它不仅负责接纳和转化各类信息,还能自动识别反馈并路由到专门的反馈处理流程。

- 核心功能:
  - 多模态支持:能够处理用户对话、文档、图片等多种输入形式,统一转化为系统内部的记忆对象。
  - 同步与异步模式:通过 `async_mode` 参数控制处理方式。**同步模式**("sync"):立即处理,调用者阻塞等待结果,适合调试;**异步模式**("async"):任务推入后台队列由 MemScheduler 并发处理,API 立即返回任务 ID,适合生产环境提升响应速度。
  - 自动反馈路由:如果请求中标记了 `is_feedback=True`,Handler 会自动提取对话中的最后一条用户消息作为反馈内容,将其转发到 MemFeedback 处理,而不是作为普通新记忆添加。
  - 多目标写入:支持向多个 MemCube 同时写入记忆。当指定多个目标 Cube 时,系统会并行处理所有写入任务;当仅有一个目标时,则使用轻量级的处理方式。

### 2.3 搜索记忆 (SearchHandler)
SearchHandler 是大脑的“记忆检索指令”,提供基于语义的智能记忆查询能力,是实现 RAG(检索增强生成)的关键组件。

- 核心功能:
  - 语义检索:利用向量嵌入(Embedding)技术,根据查询语句的语义相似度召回相关记忆;相比简单的关键词匹配,能更准确地理解用户意图。
  - 灵活的搜索范围:支持指定检索的目标数据范围。例如,可以仅在特定用户的记忆库中搜索,也可以跨多个用户搜索共享的公开记忆,满足不同的隐私和业务需求。
  - 多种检索模式:根据应用场景在速度和准确率之间灵活选择。**快速模式**适合实时性要求高的场景,**精细模式**适合追求高记忆精准度的场景,**混合模式**兼顾两者。
  - 多步推理检索:对于复杂问题,支持引入深度推理能力,通过多轮理解和检索逐步逼近最相关的记忆。

### 2.4 对话 (ChatHandler)
ChatHandler 是大脑的“对话协调指令”,负责将用户的对话需求转化为完整的业务流程。它不直接存储数据,而是通过协调其他 Handler 来完成端到端的对话任务。

- 核心功能:
  - 流程编排:自动执行“记忆检索 → LLM 生成 → 记忆保存”的完整对话闭环。用户每次提问都能基于历史记忆获得更智能的回复,同时每一次对话都被沉淀为新的记忆,实现对话即学习。
  - 上下文管理:负责处理 history(历史对话)与 query(当前问题)的拼接,确保 LLM 理解完整的对话语境,避免信息丢失。
  - 多种交互模式:支持标准请求-响应模式(`APIChatCompleteRequest`)和流式响应(Stream)模式。标准模式适合简单问题,流式模式适合长文本回复,满足不同的前端交互需求。
  - 消息推送(可选):支持在生成回复后自动将结果推送到第三方平台(如钉钉),实现多渠道集成。

### 2.5 反馈与修正 (FeedbackHandler)
FeedbackHandler 是大脑的“反馈纠正指令”,负责理解用户对 AI 表现的自然语言反馈,自动精准定位并修正相关的记忆内容。

- 核心功能:
  - 记忆修正:当用户指出 AI 的错误(如“会议地点不是北京是上海”)时,Handler 会自动更新或标记相关的旧记忆。系统采用版本管理而非直接删除,保持修改历史的可追溯性。
  - 正负反馈:支持用户通过点赞或点踩的方式标记特定记忆的质量。系统据此调整该记忆的权重和可信度,使后续检索更加准确。
  - 精准定位:支持两种反馈方式。一种是基于对话历史自动定位冲突信息,另一种是用户直接指定具体的记忆来修正,提高反馈的有效性和准确度。

### 2.6 记忆管理 (MemoryHandler)
MemoryHandler 是大脑的“记忆管理指令”,提供了对记忆数据的底层 CRUD(增删改查)能力,主要用于系统管理后台或数据清理等运维场景。

- 核心功能:
  - 精细化管理:不同于 AddHandler 的业务级写入,此 Handler 允许通过记忆 ID 直接获取单条记忆的详细信息或执行物理删除。这种直接操作方式绕过了业务逻辑的包装,主要用于调试、审计或系统清理。
  - 底层直通:某些管理操作需要直接与底层的记忆器官(naive_mem_cube)交互,以提供最高效和最低延迟的数据操作能力,满足系统运维的需求。

### 2.7 任务调度状态 (SchedulerHandler)
SchedulerHandler 是大脑的“任务监控指令”,负责追踪系统中所有异步任务的实时执行状态,让用户能够了解后台任务的进度和结果。

- 核心功能:
  - 状态追踪:配合 Redis 后端,追踪任务的实时状态(Queued 排队中、Running 执行中、Completed 已完成、Failed 已失败)。
  - 结果获取:提供任务结果查询接口。当异步任务完成后,用户可以通过此接口获取最终的执行结果或错误信息,从而了解操作是否成功以及失败的原因。
  - 同步等待(调试工具):在测试和集成测试时,提供将异步任务强制转为同步等待的工具,使开发者能够像调试同步代码一样调试异步流程,提高开发效率。

### 2.8 猜你想问 (SuggestionHandler)
SuggestionHandler 是大脑的“建议生成指令”,通过预测用户的潜在需求,主动推荐相关问题,帮助用户探索系统能力和发现感兴趣的话题。

- 核心功能:
  - 双模式生成:
    - 基于对话的建议:当用户提供了最近的对话记录时,系统分析对话上下文,推断用户可能感兴趣的后续话题,生成 3 个相关的推荐问题。
    - 基于用户画像的建议:当没有对话上下文时,系统从用户的近期记忆中推断其兴趣和状态,生成与用户最近生活或工作相关的推荐问题。这适合在对话开始或话题转换时使用。
  - 多语言支持:推荐问题自动适配用户语言设置,支持中英文等多种语言,提升不同用户的体验。
diff --git a/docs/cn/open_source/modules/mos/users.md b/docs/cn/open_source/modules/mos/users.md
new file mode 100644
index 00000000..7bfcfbc5
--- /dev/null
+++ b/docs/cn/open_source/modules/mos/users.md
@@ -0,0 +1,306 @@
---
title: 用户管理
desc: "**MOS** 提供全面的用户管理功能,以支持多用户、多会话记忆操作。本文档详细介绍 MOS 的用户管理方法。"
+--- + +## 用户角色 + +MOS支持4种不同权限级别的用户角色: + +| 角色 | 描述 | 权限 | +|------|-------------|-------------| +| `ROOT` | 系统管理员 | 访问所有的立方体和用户,且不可被删除 | +| `ADMIN` | 管理员用户 | 可以管理用户和立方体,访问所有立方体 | +| `USER` | 常规用户 | 可以创建和管理自己的立方体,访问共享的立方体 | +| `GUEST` | 受限用户 | 仅仅可以访问共享的立方体,不能创建立方体 | + +## 用户管理方法 + +### 1. `create_user` + +在MOS系统中创建一个新的用户 + +**参数:** +- `user_id` (str): 用户的唯一标识符 +- `role` (UserRole, optional): 用户角色。默认是 `UserRole.USER` +- `user_name` (str, optional): 展示用户的用户名。如果不提供, 使用 `user_id` + +**返回值:** +- `str`: 创建的用户ID + +**示例:** +```python +import uuid +from memos.mem_user.user_manager import UserRole + +# 创建一个标准用户 +user_id = str(uuid.uuid4()) +memory.create_user(user_id=user_id, role=UserRole.USER, user_name="John Doe") + +# 创建一个管理员用户 +admin_id = str(uuid.uuid4()) +memory.create_user(user_id=admin_id, role=UserRole.ADMIN, user_name="Admin User") + +# 创建一个访客用户 +guest_id = str(uuid.uuid4()) +memory.create_user(user_id=guest_id, role=UserRole.GUEST, user_name="Guest User") +``` + +**注意:** +- 如果具有相同`user_name`的用户已经存在,则该方法返回现有用户的ID +- 初始化过程中系统会自动创建一个root用户 +- 用户ID在整个系统中必须是唯一的 + +### 2. `list_users` + +检索系统中所有激活用户的信息 + +**参数:** +- 无 + +**返回值:** +- `list`: 包含用户信息的字典列表: + - `user_id` (str): 唯一用户识别 + - `user_name` (str): 展示用户的名称 + - `role` (str): 用户角色 (根用户, 管理员, 普通用户, 访客) + - `created_at` (str): 创建用户的ISO格式时间戳 + - `is_active` (bool): 用户帐号是否激活 + +**示例:** +```python +# 所有用户列表 +users = memory.list_users() +for user in users: + print(f"User: {user['user_name']} (ID: {user['user_id']})") + print(f"Role: {user['role']}") + print(f"Active: {user['is_active']}") + print(f"Created: {user['created_at']}") + print("---") +``` + +**输出示例:** +``` +User: root (ID: root) +Role: root +Active: True +Created: 2024-01-15T10:30:00 +--- +User: John Doe (ID: 550e8400-e29b-41d4-a716-446655440000) +Role: user +Active: True +Created: 2024-01-15T11:00:00 +--- +``` + +### 3. `create_cube_for_user` + +创建一个新的记忆立方,并将指定用户设为其所有者 + +**参数:** +- `cube_name` (str): 立方体名称 +- `owner_id` (str): 立方体所有者的用户ID +- `cube_path` (str, optional): 立方体的本地文件路径或远程存储库URL +- `cube_id` (str, optional): 定制立方体标识符,如果没有提供,使用生成的UUID + +**返回值:** +- `str`: 创建的立方体ID + +**示例:** +```python +import uuid + +# 第一次创建用户 +user_id = str(uuid.uuid4()) +memory.create_user(user_id=user_id, user_name="Alice") + +# 为用户创建一个立方 +cube_id = memory.create_cube_for_user( + cube_name="Alice's Personal Memory", + owner_id=user_id, + cube_path="/path/to/alice/memory", + cube_id="alice_personal_cube" +) + +print(f"Created cube: {cube_id}") +``` + +**注意:** +- 所有者自动访问创建的所有立方体 +- 立方体所有者可以和其他用户共享 +- 如果提供了 `cube_path` , 它可以是本地目录路径或远程存储库URL +- 自定义`cube_id`必须在整个系统中唯一 + +### 4. 
`get_user_info` + +检索有关当前用户及其可访问立方体的详细信息 + +**参数:** +- 无 + +**返回值:** +- `dict`: 包含用户信息和可访问立方体的字典: + - `user_id` (str): 当前用户ID + - `user_name` (str): 当前用户名称 + - `role` (str): 当前用户角色 + - `created_at` (str): 创建用户的ISO格式时间戳 + - `accessible_cubes` (list): 每个可访问立方体的字典列表: + - `cube_id` (str): 立方体标识符 + - `cube_name` (str): 立方体名称 + - `cube_path` (str): 立方体文件路径或仓库URL + - `owner_id` (str): 立方体所有者ID + - `is_loaded` (bool): 立方体当前是否加载在记忆中 + +**示例:** +```python +# 获取当前用户信息 +user_info = memory.get_user_info() + +print(f"Current User: {user_info['user_name']} ({user_info['user_id']})") +print(f"Role: {user_info['role']}") +print(f"Created: {user_info['created_at']}") +print("\nAccessible Cubes:") +for cube in user_info['accessible_cubes']: + print(f"- {cube['cube_name']} (ID: {cube['cube_id']})") + print(f" Owner: {cube['owner_id']}") + print(f" Loaded: {cube['is_loaded']}") + print(f" Path: {cube['cube_path']}") +``` + +**输出示例:** +``` +Current User: Alice (550e8400-e29b-41d4-a716-446655440000) +Role: user +Created: 2024-01-15T11:00:00 + +Accessible Cubes: +- Alice's Personal Memory (ID: alice_personal_cube) + Owner: 550e8400-e29b-41d4-a716-446655440000 + Loaded: True + Path: /path/to/alice/memory +- Shared Project Memory (ID: project_cube) + Owner: bob_user_id + Loaded: False + Path: /path/to/project/memory +``` + +### 5. `share_cube_with_user` + +和其他用户共享一个记忆立方,授予他们访问立方体内容的权限 + +**参数:** +- `cube_id` (str): 共享立方体ID +- `target_user_id` (str): 共享立方体的用户ID + +**返回值:** +- `bool`: 如果共享,返回`True`, 否则,返回`False` + +**示例:** +```python +# 和其他用户共享一个立方 +success = memory.share_cube_with_user( + cube_id="alice_personal_cube", + target_user_id="bob_user_id" +) + +if success: + print("Cube shared successfully") +else: + print("Failed to share cube") +``` + +**注意:** +- 当前用户必须有权访问正在共享的立方体 +- 目标用户必须存在且激活 +- 共享一个立方体授予目标用户对该立方体的读写访问权限 +- 立方体所有者总是共享他的立方体 +- 具有立方体访问权限的用户可以与其他用户共享立方体(如果他们具有适当的权限) + +## 完整的用户管理工作流 + +下面是演示用户管理操作的完整示例: + +```python +import uuid +from memos.configs.mem_os import MOSConfig +from memos.mem_os.main import MOS +from memos.mem_user.user_manager import UserRole + +# 初始化MOS +mos_config = MOSConfig.from_json_file("examples/data/config/simple_memos_config.json") +memory = MOS(mos_config) + +# 1. 创建用户 +alice_id = str(uuid.uuid4()) +bob_id = str(uuid.uuid4()) + +memory.create_user(user_id=alice_id, user_name="Alice", role=UserRole.USER) +memory.create_user(user_id=bob_id, user_name="Bob", role=UserRole.USER) + +# 2. 所有用户列表 +print("All users:") +users = memory.list_users() +for user in users: + print(f"- {user['user_name']} ({user['role']})") + +# 3. 为用户创建一个立方 +alice_cube_id = memory.create_cube_for_user( + cube_name="Alice's Personal Memory", + owner_id=alice_id, + cube_path="/path/to/alice/memory" +) + +bob_cube_id = memory.create_cube_for_user( + cube_name="Bob's Work Memory", + owner_id=bob_id, + cube_path="/path/to/bob/work" +) + +# 4. 和其他用户共享立方 +memory.share_cube_with_user(alice_cube_id, bob_id) +memory.share_cube_with_user(bob_cube_id, alice_id) + +# 5. 获取用户信息 +alice_info = memory.get_user_info() +print(f"\nAlice's accessible cubes: {len(alice_info['accessible_cubes'])}") + +# 6. 添加记忆到立方 +memory.add( + messages=[ + {"role": "user", "content": "I like playing football."}, + {"role": "assistant", "content": "That's great! Football is a wonderful sport."} + ], + user_id=alice_id, + mem_cube_id=alice_cube_id +) + +# 7. 
搜索记忆 +retrieved = memory.search( + query="What does Alice like?", + user_id=alice_id +) +print(f"Retrieved memories: {retrieved['text_mem']}") +``` + +## 错误处理 + +用户管理方法包括全面的错误处理: + +- **用户验证**: 方法在操作之前验证用户是否存在并处于激活状态 +- **立方体访问验证**: 确保用户在操作前对立方体具有适当的访问权限 +- **防止重复**: 优雅地处理重复的用户名和立方体ID +- **权限检查**: 验证敏感操作的用户角色和权限 + +## 数据库持久性 + +用户管理数据持久化在SQLite数据库中: +- **位置**: 默认 `~/.memos/memos_users.db` +- **表**: `users`, `cubes`, `user_cube_association` +- **关系**: 用户和立方体之间是多对多关系 +- **软删除**: 用户和立方体被软删除(标记为非激活),而不是永久删除 + +## 安全注意事项 + +- **基于角色的访问控制**: 不同的用户角色具有不同的权限 +- **立方体所有权**: 立方体所有者可以完全控制他们的立方体 +- **访问验证**: 所有操作在执行前都要验证用户访问权限 +- **根用户保护**: 根用户不可删除,具有系统完全访问权限 diff --git a/docs/cn/open_source/modules/mos/users_configurations.md b/docs/cn/open_source/modules/mos/users_configurations.md new file mode 100644 index 00000000..c48acbd1 --- /dev/null +++ b/docs/cn/open_source/modules/mos/users_configurations.md @@ -0,0 +1,719 @@ +--- +title: MemOS配置指南 +desc: 本文档全面概述了MemOS系统中不同组件的所有配置字段和初始化方法. +--- + +1. [配置概述](#configuration-overview) +2. [MOS配置](#mos-configuration) +3. [LLM配置](#llm-configuration) +4. [MemReader配置](#memreader-configuration) +5. [MemCube配置](#memcube-configuration) +6. [记忆配置](#memory-configuration) +7. [嵌入器配置](#embedder-configuration) +8. [向量数据库配置](#vector-database-configuration) +9. [图数据库配置](#graph-database-configuration) +10. [调度器配置](#scheduler-configuration) +11. [初始化方法](#initialization-methods) +12. [配置样例](#configuration-examples) + +## 配置概述 + +MemOS使用具有不同后端工厂模式的分层配置系统。每个组件都有: +- 一个基本配置类 +- 特定于后端配置类 +- 一个基于后端创建适当配置的工厂类 + +## MOS配置 + +用于协调所有组件的主 MOS 配置 + +### MOSConfig 字段 + +| 字段 | 类型 | 默认值 | 描述 | +|-------|------|---------|-------------| +| `user_id` | str | "root" | MOS的用户ID,此配置用户ID将作为默认值 | +| `session_id` | str | 自动生成UUID | MOS的会话ID | +| `chat_model` | LLMConfigFactory | 必填 | LLM配置的聊天 | +| `mem_reader` | MemReaderConfigFactory | 必填 | MemReader配置 | +| `mem_scheduler` | SchedulerFactory | 非必填 | 调度器配置 | +| `max_turns_window` | int | 15 | 最大对话次数保持 | +| `top_k` | int | 5 | 每个查询要检索的最大记忆 | +| `enable_textual_memory` | bool | True | 启用明文记忆 | +| `enable_activation_memory` | bool | False | 启用激活记忆 | +| `enable_parametric_memory` | bool | False | 启用参数记忆 | +| `enable_mem_scheduler` | bool | False | 启用记忆调度 | + + +### MOS配置样例 + +```json +{ + "user_id": "root", + "chat_model": { + "backend": "huggingface", + "config": { + "model_name_or_path": "Qwen/Qwen3-1.7B", + "temperature": 0.1, + "remove_think_prefix": true, + "max_tokens": 4096 + } + }, + "mem_reader": { + "backend": "simple_struct", + "config": { + "llm": { + "backend": "ollama", + "config": { + "model_name_or_path": "qwen3:0.6b", + "temperature": 0.8, + "max_tokens": 1024, + "top_p": 0.9, + "top_k": 50 + } + }, + "embedder": { + "backend": "ollama", + "config": { + "model_name_or_path": "nomic-embed-text:latest" + } + }, + "chunker": { + "backend": "sentence", + "config": { + "tokenizer_or_token_counter": "gpt2", + "chunk_size": 512, + "chunk_overlap": 128, + "min_sentences_per_chunk": 1 + } + } + } + }, + "max_turns_window": 20, + "top_k": 5, + "enable_textual_memory": true, + "enable_activation_memory": false, + "enable_parametric_memory": false +} +``` + +## LLM配置 + +不同LLM后端的配置 + +### 基本的LLM字段 + +| 字段 | 类型 | 默认值 | 描述 | +|-------|------|---------|-------------| +| `model_name_or_path` | str | 必填 | 模型名称或路径 | +| `temperature` | float | 0.8 | 采样温度 | +| `max_tokens` | int | 1024 | 生成最大token数 | +| `top_p` | float | 0.9 | Top-p采样参数 | +| `top_k` | int | 50 | Top-k 采样参数 | +| `remove_think_prefix` | bool | False | 从输出中移除think标签 | + +### 特定后端字段 + +#### 
OpenAI LLM +| 字段 | 类型 | 默认值 | 描述 | +|-------|------|---------|-------------| +| `api_key` | str | 必填 | OpenAI API key | +| `api_base` | str | "https://api.openai.com/v1" | OpenAI API base URL | + +#### Ollama LLM +| 字段 | 类型 | 默认值 | 描述 | +|-------|------|---------|-------------| +| `api_base` | str | "http://localhost:11434" | Ollama API base URL | + +#### HuggingFace LLM +| 字段 | 类型 | 默认值 | 描述 | +|-------|------|---------|-------------| +| `do_sample` | bool | False | 使用采样VS贪婪编码 | +| `add_generation_prompt` | bool | True | 应用生成模板 | + +### LLM配置样例 + +```json +// OpenAI +{ + "backend": "openai", + "config": { + "model_name_or_path": "gpt-4o", + "temperature": 0.8, + "max_tokens": 1024, + "top_p": 0.9, + "top_k": 50, + "api_key": "sk-...", + "api_base": "https://api.openai.com/v1" + } +} + +// Ollama +{ + "backend": "ollama", + "config": { + "model_name_or_path": "qwen3:0.6b", + "temperature": 0.8, + "max_tokens": 1024, + "top_p": 0.9, + "top_k": 50, + "api_base": "http://localhost:11434" + } +} + +// HuggingFace +{ + "backend": "huggingface", + "config": { + "model_name_or_path": "Qwen/Qwen3-1.7B", + "temperature": 0.1, + "remove_think_prefix": true, + "max_tokens": 4096, + "do_sample": false, + "add_generation_prompt": true + } +} +``` + +## MemReader配置 + +记忆读取组件的配置 + +### 基本MemReader字段 + +| 字段 | 类型 | 默认值 | 描述 | +|-------|------|---------|-------------| +| `created_at` | datetime | 自动生成 | 创建时间戳 | +| `llm` | LLMConfigFactory | 必填 | LLM配置 | +| `embedder` | EmbedderConfigFactory | 必填 | 嵌入器配置 | +| `chunker` | chunkerConfigFactory | 必填 | 块配置 | + +### 后端类型 + +- `simple_struct`: 结构化记忆阅读器 + +### MemReader配置样例 + +```json +{ + "backend": "simple_struct", + "config": { + "llm": { + "backend": "ollama", + "config": { + "model_name_or_path": "qwen3:0.6b", + "temperature": 0.0, + "remove_think_prefix": true, + "max_tokens": 8192 + } + }, + "embedder": { + "backend": "ollama", + "config": { + "model_name_or_path": "nomic-embed-text:latest" + } + }, + "chunker": { + "backend": "sentence", + "config": { + "tokenizer_or_token_counter": "gpt2", + "chunk_size": 512, + "chunk_overlap": 128, + "min_sentences_per_chunk": 1 + } + } + } +} +``` + +## MemCube配置 + +记忆立方组件配置 + +### GeneralMemCubeConfig字段 + +| 字段 | 类型 | 默认值 | 描述 | +|-------|------|---------|-------------| +| `user_id` | str | "default_user" | 用户IDMemCube | +| `cube_id` | str | 自动生成UUID | MemCube的立方ID | +| `text_mem` | MemoryConfigFactory | 必填 | 明文记忆配置 | +| `act_mem` | MemoryConfigFactory | 必填 | 激活记忆配置 | +| `para_mem` | MemoryConfigFactory | 必填 | 参数记忆配置 | + +### 允许的后端 + +- **明文记忆**: `naive_text`, `general_text`, `tree_text`, `uninitialized` +- **激活记忆**: `kv_cache`, `uninitialized` +- **参数记忆**: `lora`, `uninitialized` + +### MemCube配置样例 + +```json +{ + "user_id": "root", + "cube_id": "root/mem_cube_kv_cache", + "text_mem": {}, + "act_mem": { + "backend": "kv_cache", + "config": { + "memory_filename": "activation_memory.pickle", + "extractor_llm": { + "backend": "huggingface", + "config": { + "model_name_or_path": "Qwen/Qwen3-1.7B", + "temperature": 0.8, + "max_tokens": 1024, + "top_p": 0.9, + "top_k": 50, + "add_generation_prompt": true, + "remove_think_prefix": false + } + } + } + }, + "para_mem": { + "backend": "lora", + "config": { + "memory_filename": "parametric_memory.adapter", + "extractor_llm": { + "backend": "huggingface", + "config": { + "model_name_or_path": "Qwen/Qwen3-1.7B", + "temperature": 0.8, + "max_tokens": 1024, + "top_p": 0.9, + "top_k": 50, + "add_generation_prompt": true, + "remove_think_prefix": false + } + } + } + } +} +``` + +## 
记忆配置 + +配置不同类型的记忆系统 + +### 基础记忆字段 + +| 字段 | 类型 | 默认值 | 描述 | +|-------|------|---------|-------------| +| `cube_id` | str | None | 唯一的 MemCube 标识符默认可以是 cube_name 或 path| + +### 明文记忆配置 + +#### 基础明文记忆 +| 字段 | 类型 | 默认值 | 描述 | +|-------|------|---------|-------------| +| `memory_filename` | str | "textual_memory.json" | 存储记忆的文件名 | + +#### 纯明文记忆(仅文本) +| 字段 | 类型 | 默认值 | 描述 | +|-------|------|---------|-------------| +| `extractor_llm` | LLMConfigFactory | 必填 | LLM用于记忆提取 | + +#### 通用明文记忆(带向量索引) +| 字段 | 类型 | 默认值 | 描述 | +|-------|------|---------|-------------| +| `extractor_llm` | LLMConfigFactory | 必填 | LLM用于记忆提取 | +| `vector_db` | VectorDBConfigFactory | 必填 | 向量数据库配置 | +| `embedder` | EmbedderConfigFactory | 必填 | 嵌入器配置 | + +#### 树形明文记忆 +| 字段 | 类型 | 默认值 | 描述 | +|-------|------|---------|-------------| +| `extractor_llm` | LLMConfigFactory | 必填 | LLM用于记忆提取 | +| `dispatcher_llm` | LLMConfigFactory | 必填 | LLM用于记忆调度 | +| `embedder` | EmbedderConfigFactory | 必填 | 嵌入器配置 | +| `graph_db` | GraphDBConfigFactory | 必填 | 图数据库配置 | + +### 激活记忆配置 + +#### 基础激活记忆 +| 字段 | 类型 | 默认值 | 描述 | +|-------|------|---------|-------------| +| `memory_filename` | str | "activation_memory.pickle" | 存储记忆的文件名 | + +#### KV Cache记忆 +| 字段 | 类型 | 默认值 | 描述 | +|-------|------|---------|-------------| +| `extractor_llm` | LLMConfigFactory | 必填 | LLM用于记忆提取 (must be huggingface) | + +### 参数记忆配置 + +#### 基础参数记忆 +| 字段 | 类型 | 默认值 | 描述 | +|-------|------|---------|-------------| +| `memory_filename` | str | "parametric_memory.adapter" | 存储记忆的文件名 | + +#### LoRA记忆 +| 字段 | 类型 | 默认值 | 描述 | +|-------|------|---------|-------------| +| `extractor_llm` | LLMConfigFactory | 必填 | LLM用于记忆提取 (must be huggingface) | + +### 记忆配置样例 + +```json +// 树形明文记忆 +{ + "backend": "tree_text", + "config": { + "memory_filename": "tree_memory.json", + "extractor_llm": { + "backend": "ollama", + "config": { + "model_name_or_path": "qwen3:0.6b", + "temperature": 0.0, + "remove_think_prefix": true, + "max_tokens": 8192 + } + }, + "dispatcher_llm": { + "backend": "ollama", + "config": { + "model_name_or_path": "qwen3:0.6b", + "temperature": 0.0, + "remove_think_prefix": true, + "max_tokens": 8192 + } + }, + "embedder": { + "backend": "ollama", + "config": { + "model_name_or_path": "nomic-embed-text:latest" + } + }, + "graph_db": { + "backend": "neo4j", + "config": { + "uri": "bolt://localhost:7687", + "user": "neo4j", + "password": "12345678", + "db_name": "user08alice", + "auto_create": true, + "embedding_dimension": 768 + } + } + } +} +``` + +## 嵌入器配置 + +嵌入模型配置 + +### 基本嵌入器字段 + +| 字段 | 类型 | 默认值 | 描述 | +|-------|------|---------|-------------| +| `model_name_or_path` | str | 必填 | 模型名称或路径 | +| `embedding_dims` | int | None | 嵌入维度数量 | + +### 后端特定字段 + +#### Ollama嵌入器 +| 字段 | 类型 | 默认值 | 描述 | +|-------|------|---------|-------------| +| `api_base` | str | "http://localhost:11434" | Ollama API base URL | + +#### Sentence Transformer 嵌入器 +除了基本配置之外没有其他字段。 + +### 嵌入器配置样例 + +```json +// Ollama 嵌入器 +{ + "backend": "ollama", + "config": { + "model_name_or_path": "nomic-embed-text:latest", + "api_base": "http://localhost:11434" + } +} + +// Sentence Transformer 嵌入器 +{ + "backend": "sentence_transformer", + "config": { + "model_name_or_path": "all-MiniLM-L6-v2", + "embedding_dims": 384 + } +} +``` + +## 向量数据库配置 + +配置向量数据库 + +### 基础向量数据库字段 + +| 字段 | 类型 | 默认值 | 描述 | +|-------|------|---------|-------------| +| `collection_name` | str | 必填 | 集合名称 | +| `vector_dimension` | int | None | 向量维度 | +| `distance_metric` | str | None | 距离度量 (余弦, 欧式, 点积) | + +### Qdrant向量数据库字段 + +| 字段 | 类型 | 默认值 | 描述 
| +|-------|------|---------|-------------| +| `host` | str | None | Qdrant主机 | +| `port` | int | None | Qdrant端口 | +| `path` | str | None | Qdrant本地路径 | + +### 向量数据库配置示例 + +```json +{ + "backend": "qdrant", + "config": { + "collection_name": "memories", + "vector_dimension": 768, + "distance_metric": "cosine", + "path": "/path/to/qdrant" + } +} +``` + +## 图数据库配置 + +配置图数据库 + +### 基础图数据库字段 + +| Field | 类型 | 默认值 | 描述 | +|-------|------|---------|-------------| +| `uri` | str | 必填 | 数据库URI | +| `user` | str | 必填 | 数据库用户名 | +| `password` | str | 必填 | 数据库密码 | + +### Neo4j图数据库字段 + +| Field | 类型 | 默认值 | 描述 | +|-------|------|---------|-------------| +| `db_name` | str | 必填 | 目标数据库名称 | +| `auto_create` | bool | False | 如果不存在,创建数据库 | +| `embedding_dimension` | int | 768 | 向量嵌入维度 | + +### 图数据库配置示例 + +```json +{ + "backend": "neo4j", + "config": { + "uri": "bolt://localhost:7687", + "user": "neo4j", + "password": "12345678", + "db_name": "user08alice", + "auto_create": true, + "embedding_dimension": 768 + } +} +``` + +## 调度器配置 + +用于管理记忆检索和激活的记忆调度系统的配置 + +### 基本调度器字段 + +| 字段 | 类型 | 默认值 | 描述 | +|-------|------|---------|-------------| +| `top_k` | int | 10 | 在初始检索中要考虑的候选记忆数 | +| `top_n` | int | 5 | 处理后返回的最终结果数 | +| `enable_parallel_dispatch` | bool | True | 是否使用线程池启用并行消息处理 | +| `thread_pool_max_workers` | int | 5 | 线程池中的最大线程数(1-20) | +| `consume_interval_seconds` | int | 3 | 从队列中消费消息的间隔(以秒为单位)(0-60) | + +### 通用调度器字段 + +| 字段 | 类型 | 默认值 | 描述 | +|-------|------|---------|-------------| +| `act_mem_update_interval` | int | 300 | 更新激活记忆的时间间隔(秒) | +| `context_window_size` | int | 5 | 对话历史记录的上下文窗口的大小 | +| `activation_mem_size` | int | 5 | 激活记忆的大小 | +| `act_mem_dump_path` | str | 自动生成 | 用于激活记忆存储的文件路径 | + +### 后端类型 + +- `general_scheduler`: 具有激活记忆管理的高级调度程序 + +### 调度器配置示例 + +```json +{ + "backend": "general_scheduler", + "config": { + "top_k": 10, + "top_n": 5, + "act_mem_update_interval": 300, + "context_window_size": 5, + "activation_mem_size": 1000, + "thread_pool_max_workers": 10, + "consume_interval_seconds": 3, + "enable_parallel_dispatch": true + } +} +``` + +## 初始化方法 + +### 来自JSON文件 + +```python +from memos.configs.mem_os import MOSConfig + +# 从JSON文件加载配置 +mos_config = MOSConfig.from_json_file("path/to/config.json") +``` + +### 来自字典 + +```python +from memos.configs.mem_os import MOSConfig + +# 从字典创建配置 +config_dict = { + "user_id": "root", + "chat_model": { + "backend": "huggingface", + "config": { + "model_name_or_path": "Qwen/Qwen3-1.7B", + "temperature": 0.1 + } + } + # ... 
other fields +} + +mos_config = MOSConfig(**config_dict) +``` + +### 工厂模式用法 + +```python +from memos.configs.llm import LLMConfigFactory + +# 使用工厂模式创建LLM配置 +llm_config = LLMConfigFactory( + backend="huggingface", + config={ + "model_name_or_path": "Qwen/Qwen3-1.7B", + "temperature": 0.1 + } +) +``` + +## 配置样例 + +### 创建完整MOS + +```python +from memos.configs.mem_os import MOSConfig +from memos.mem_os.main import MOS + +# 加载配置 +mos_config = MOSConfig.from_json_file("examples/data/config/simple_memos_config.json") + +# 初始化MOS +mos = MOS(mos_config) + +# 创建用户和注册立方 +user_id = "user_123" +mos.create_user(user_id=user_id) +mos.register_mem_cube("path/to/mem_cube", user_id=user_id) + +# 使用MOS +response = mos.chat("Hello, how are you?", user_id=user_id) +``` + +### 树形记忆配置 + +```python +from memos.configs.memory import MemoryConfigFactory + +# 创建树形记忆配置 +tree_memory_config = MemoryConfigFactory( + backend="tree_text", + config={ + "memory_filename": "tree_memory.json", + "extractor_llm": { + "backend": "ollama", + "config": { + "model_name_or_path": "qwen3:0.6b", + "temperature": 0.0, + "max_tokens": 8192 + } + }, + "dispatcher_llm": { + "backend": "ollama", + "config": { + "model_name_or_path": "qwen3:0.6b", + "temperature": 0.0, + "max_tokens": 8192 + } + }, + "embedder": { + "backend": "ollama", + "config": { + "model_name_or_path": "nomic-embed-text:latest" + } + }, + "graph_db": { + "backend": "neo4j", + "config": { + "uri": "bolt://localhost:7687", + "user": "neo4j", + "password": "password", + "db_name": "memories", + "auto_create": True, + "embedding_dimension": 768 + } + } + } +) +``` + +### 多后端LLM配置 + +```python +from memos.configs.llm import LLMConfigFactory + +# OpenAI 配置 +openai_config = LLMConfigFactory( + backend="openai", + config={ + "model_name_or_path": "gpt-4o", + "temperature": 0.8, + "max_tokens": 1024, + "api_key": "sk-...", + "api_base": "https://api.openai.com/v1" + } +) + +# Ollama 配置 +ollama_config = LLMConfigFactory( + backend="ollama", + config={ + "model_name_or_path": "qwen3:0.6b", + "temperature": 0.8, + "max_tokens": 1024, + "api_base": "http://localhost:11434" + } +) + +# HuggingFace 配置 +hf_config = LLMConfigFactory( + backend="huggingface", + config={ + "model_name_or_path": "Qwen/Qwen3-1.7B", + "temperature": 0.1, + "remove_think_prefix": True, + "max_tokens": 4096, + "do_sample": False, + "add_generation_prompt": True + } +) +``` + +这个全面的配置系统允许灵活和可扩展的MemOS系统设置不同的后端和组件 diff --git a/docs/cn/open_source/open_source_api/chat/chat.md b/docs/cn/open_source/open_source_api/chat/chat.md new file mode 100644 index 00000000..990e0f8a --- /dev/null +++ b/docs/cn/open_source/open_source_api/chat/chat.md @@ -0,0 +1,86 @@ +--- +title: 对话 +desc: 集成“检索、生成、存储”全链路的 RAG 闭环接口,支持基于 MemCube 的个性化回复与记忆自动沉淀。 +--- + +:::note +有关API字段、格式等信息的完整列表,详见[Chat 接口文档](/api_docs/chat/chat)。 +::: + +**接口路径**: +* **全量响应**:`POST /product/chat/complete` +* **流式响应 (SSE)**:`POST /product/chat/stream` + +**功能描述**:本接口是 MemOS 的核心业务编排入口。它能够自动从指定的 `readable_cube_ids` 中召回相关记忆,结合当前语境生成回复,并可选地将对话结果自动回写至 `writable_cube_ids` 中,实现 AI 应用的自我进化。 + + +## 1. 核心架构:ChatHandler 编排流程 + +1. **记忆检索 (Retrieval)**:根据 `readable_cube_ids` 调用 **SearchHandler**,从隔离的 Cube 中提取相关的事实、偏好及工具背景。 +2. **上下文增强生成 (Generation)**:将检索到的记忆片段注入 Prompt,调用指定的 LLM(通过 `model_name_or_path`)生成针对性回复。 +3. **记忆自动闭环 (Storage)**:若开启 `add_message_on_answer=true`,系统会调用 **AddHandler** 将本次对话异步存入指定的 Cube,无需开发者手动调用添加接口。 +## 2. 
关键接口参数 + +### 2.1 身份与语境 +| 参数名 | 类型 | 必填 | 说明 | +| :--- | :--- | :--- | :--- | +| **`query`** | `str` | 是 | 用户当前的提问内容。 | +| **`user_id`** | `str` | 是 | 用户唯一标识,用于鉴权与数据隔离。 | +| `history` | `list` | 否 | 短期历史对话记录,用于维持当前会话的连贯性。 | +| `session_id` | `str` | 否 | 会话 ID。作为“软信号”提升该会话内相关记忆的召回权重。 | + +### 2.2 MemCube 读写控制 +| 参数名 | 类型 | 默认值 | 说明 | +| :--- | :--- | :--- | :--- | +| **`readable_cube_ids`** | `list` | - | **读:** 允许检索的记忆 Cube 列表(可跨个人库与公共库)。 | +| **`writable_cube_ids`** | `list` | - | **写:** 对话完成后,自动生成的记忆应存入的目标 Cube 列表。 | +| **`add_message_on_answer`** | `bool` | `true` | 是否开启自动回写。建议开启以维持记忆的持续更新。 | + +### 2.3 算法与模型配置 +| 参数名 | 类型 | 默认值 | 说明 | +| :--- | :--- | :--- | :--- | +| `mode` | `str` | `fast` | 检索模式:`fast` (快速), `fine` (精细), `mixture` (混合)。 | +| `model_name_or_path` | `str` | - | 指定使用的 LLM 模型名称或路径。 | +| `system_prompt` | `str` | - | 覆盖默认的系统提示词。 | +| `temperature` | `float` | - | 采样温度,控制生成文本的创造性。 | +| `threshold` | `float` | `0.5` | 记忆召回的相关性阈值,低于该值的记忆将被剔除。 | + +## 3. 工作原理 + +MemOS提供两种响应模式可供选型: +### 3.1 全量响应 (`/complete`) +* **特点**:等待模型生成全部内容后一次性返回 JSON。 +* **场景**:非交互式任务、后台逻辑处理、或对实时性要求较低的简单应用。 + +### 3.2 流式响应 (`/stream`) +* **特点**:采用 **Server-Sent Events (SSE)** 协议,实时推送 Token。 +* **场景**:聊天机器人、智能助手等需要即时打字机反馈效果的 UI 交互。 + +## 4. 快速上手 + +推荐使用开源版内置的 `MemOSClient` 进行调用。以下示例展示了如何询问关于 R 语言学习的建议,并利用记忆功能: + +```python +from memos.api.client import MemOSClient + +client = MemOSClient(api_key="...", base_url="...") + +# 发起对话请求 +res = client.chat( + user_id="dev_user_01", + query="根据我之前的偏好,推荐一套 R 语言数据清理方案", + readable_cube_ids=["private_cube_01", "public_kb_r_lang"], # 读:个人偏好+公共库 + writable_cube_ids=["private_cube_01"], # 写:沉淀至个人空间 + add_message_on_answer=True, # 开启自动记忆回写 + mode="fine" # 使用精细检索模式 +) + +if res: + print(f"AI 回复内容: {res.data}") +``` + + +:::note +**开发者提示:** +若需要针对 `Playground` 环境进行调试,请访问专用的调试流接口 /product/chat/stream/playground 。 +::: diff --git a/docs/cn/open_source/open_source_api/core/add_memory.md b/docs/cn/open_source/open_source_api/core/add_memory.md new file mode 100644 index 00000000..2e108a3a --- /dev/null +++ b/docs/cn/open_source/open_source_api/core/add_memory.md @@ -0,0 +1,71 @@ +--- +title: 添加记忆 (Add Memory) +desc: MemOS 的核心生产接口。通过 MemCube 隔离机制,实现个人记忆、知识库及多租户场景下的异步记忆生产。 +--- + +**接口路径**:`POST /product/add` +**功能描述**:这是系统存储非结构化数据的核心入口。它支持通过对话列表、纯文本或元数据,将原始数据转化为结构化的记忆片段。在开源版中,系统通过 **MemCube** 实现记忆的物理隔离与动态组织。 + +## 1. 核心机理:MemCube 与隔离 + +在开源架构中,理解 MemCube 是高效使用接口的关键: + +* **隔离单元**:MemCube 是记忆生成的原子单位,Cube 之间完全独立,系统仅在单个 Cube 内部进行去重和冲突解决。 +* **灵活映射**: + * **个人模式**:将 `user_id` 作为 `writable_cube_ids` 传入,即建立个人私有记忆。 + * **知识库模式**:将知识库的唯一标识(QID)作为 `writable_cube_ids` 传入,内容即存入该知识库。 +* **多目标写入**:接口支持同时向多个 Cube 写入记忆,实现跨域同步。 + + +## 2. 关键接口参数 + +核心参数定义如下: + +| 参数名 | 类型 | 必填 | 默认值 | 说明 | +| :--- | :--- | :--- | :--- | :--- | +| **`user_id`** | `str` | 是 | - | 用户唯一标识符,用于权限校验。 | +| **`messages`** | `list/str`| 是 | - | 待存储的消息列表或纯文本内容。 | +| **`writable_cube_ids`** | `list[str]`| 是 | - | **核心参数**:指定写入的目标 Cube ID 列表。 | +| **`async_mode`** | `str` | 否 | `async` | 处理模式:`async` (后台队列处理) 或 `sync` (当前请求阻塞)。 | +| **`is_feedback`** | `bool` | 否 | `false` | 若为 `true`,系统将自动路由至反馈处理器执行记忆更正。 | +| `session_id` | `str` | 否 | `default` | 会话标识符,用于追踪对话上下文。 | +| `custom_tags` | `list[str]`| 否 | - | 自定义标签,可作为后续搜索时的过滤条件。 | +| `info` | `dict` | 否 | - | 扩展元数据。其中的所有键值对均支持后续过滤检索。 | +| `mode` | `str` | 否 | - | 仅在 `async_mode='sync'` 时生效,可选 `fast` (快速) 或 `fine` (精细)。 | + +## 3. 工作原理 (Component & Handler) + +当请求到达后端时,系统由 **AddHandler** 调度核心组件执行以下逻辑: + +1. **多模态解析**:由 `MemReader` 组件将 `messages` 转化为内部记忆对象。 +2. 
**反馈路由**:若 `is_feedback=True`,Handler 会提取对话末尾作为反馈,直接修正已有记忆,不生成新事实。 +3. **异步分发**:若为 `async` 模式,`MemScheduler` 将任务推入任务队列,接口立即返回 `task_id`。 +4. **内部组织**:算法在目标 Cube 内执行组织逻辑,通过去重和融合优化记忆质量。 + +## 4. 快速上手示例 + +推荐使用 `MemOSClient` SDK 进行标准化调用: + +```python +from memos.api.client import MemOSClient + +# 初始化客户端 +client = MemOSClient(api_key="...", base_url="...") + +# 场景一:为个人用户添加记忆 +client.add_message( + user_id="sde_dev_01", + writable_cube_ids=["user_01_private"], + messages=[{"role": "user", "content": "我正在学习 R 语言的 ggplot2。"}], + async_mode="async", + custom_tags=["Programming", "R"] +) +# 场景二:往知识库导入内容并开启反馈 +client.add_message( + user_id="admin_01", + writable_cube_ids=["kb_finance_2026"], + messages="2026年财务审计流程已更新,请参考附件。", + is_feedback=True, # 标记为反馈以更正旧版流程 + info={"source": "Internal_Portal"} +) +``` diff --git a/docs/cn/open_source/open_source_api/core/delete_memory.md b/docs/cn/open_source/open_source_api/core/delete_memory.md new file mode 100644 index 00000000..e24af068 --- /dev/null +++ b/docs/cn/open_source/open_source_api/core/delete_memory.md @@ -0,0 +1,62 @@ +--- +title: 删除记忆 (Delete Memory) +desc: 从指定的 MemCube 中永久移除记忆条目、关联文件或符合特定过滤条件的记忆集合。 +--- + +**接口路径**:`POST /product/delete_memory` +**功能描述**:本接口用于维护记忆库的准确性与合规性。当用户要求遗忘特定信息、数据过时或需要清理特定的上传文件时,可以通过此接口在向量数据库与图数据库中同步执行物理删除。 + +## 1. 核心机理:Cube 级物理清理 + +在开源版中,删除操作遵循严格的 **MemCube** 隔离逻辑: + +* **作用域限制**:通过 `writable_cube_ids` 参数,删除操作被严格锁定在指定的记忆体中,绝不会误删其他 Cube 的内容。 +* **多维删除**:支持按 **记忆 ID**(精确)、**文件 ID**(关联删除)以及 **Filter 过滤器**(条件逻辑)三种维度并发执行清理。 +* **原子性同步**:删除操作由 **MemoryHandler** 触发,确保底层向量索引与图数据库中的实体节点同步移除,防止召回“幻觉”。 + + + +## 2. 关键接口参数 +核心参数定义如下: + +| 参数名 | 类型 | 必填 | 说明 | +| :--- | :--- | :--- | :--- | +| **`writable_cube_ids`** | `list[str]` | 是 | 指定执行删除操作的目标 Cube 列表。 | +| **`memory_ids`** | `list[str]` | 否 | 待删除的记忆唯一标识符列表。 | +| **`file_ids`** | `list[str]` | 否 | 待删除的原始文件标识符列表,将同步清理该文件产生的全部记忆。 | +| **`filter`** | `object` | 否 | 逻辑过滤器。支持按标签、元信息或时间戳批量删除符合条件的记忆。 | + +## 3. 工作原理 (MemoryHandler) + +1. **权限与路由**:系统通过 `user_id` 校验操作权限,并将请求路由至 **MemoryHandler**。 +2. **定位存储**:根据 `writable_cube_ids` 定位底层的 **naive_mem_cube** 组件。 +3. **分发清理任务**: + * **按 ID 清理**:直接根据 UUID 在主数据库和向量库中执行记录抹除。 + * **按 Filter 清理**:先检索出符合条件的记忆 ID 集合,再执行批量物理移除。 +4. **状态反馈**:操作完成后返回成功状态,相关内容将立即从 [**检索接口**](./search_memory.md) 的召回范围中消失。 + +## 4. 快速上手示例 + +使用 `MemOSClient` 执行不同维度的删除操作: + +```python +# 初始化客户端 +client = MemOSClient(api_key="...", base_url="...") + +# 场景一:精确删除单条已知的错误记忆 +client.delete_memory( + writable_cube_ids=["user_01_private"], + memory_ids=["2f40be8f-736c-4a5f-aada-9489037769e0"] +) + +# 场景二:批量清理某一特定标签下的所有过时记忆 +client.delete_memory( + writable_cube_ids=["kb_finance_2026"], + filter={"tags": {"contains": "deprecated_policy"}} +) +``` +## 5. 注意事项 + +不可恢复性:删除操作是物理删除。一旦执行成功,该记忆将无法再通过检索接口召回。 + +文件关联性:通过 `file_ids` 删除时,系统会自动溯源并清理该文件解析出的事实记忆和摘要。 diff --git a/docs/cn/open_source/open_source_api/core/get_memory.md b/docs/cn/open_source/open_source_api/core/get_memory.md new file mode 100644 index 00000000..003cf4a7 --- /dev/null +++ b/docs/cn/open_source/open_source_api/core/get_memory.md @@ -0,0 +1,81 @@ +--- +title: 获取记忆 (Get Memories) +desc: 分页查询或全量导出指定 Cube 中的记忆集合,支持按类型过滤及子图提取。 +--- + +**接口路径**: +* **分页查询**:`POST /product/get_memory` +* **全量导出**:`POST /product/get_all` + +**功能描述**:用于列出或导出指定 **MemCube** 中的记忆资产。通过这两个接口,您可以获取系统生成的原始记忆片段、用户偏好或工具使用记录,支持分页展示与结构化导出。 + +## 1. 核心机理:分页 vs. 
全量导出 + +在开源版中,系统通过 **MemoryHandler** 提供了两种不同的集合访问模式: + +* **业务分页模式 (`/get_memory`)**: + * **设计初衷**:为前端 UI 列表设计。支持 `page` 和 `page_size` 参数。 + * **特性**:默认包含偏好记忆(`include_preference`),支持轻量级的数据加载。 +* **全量导出模式 (`/get_all`)**: + * **设计初衷**:为数据迁移或复杂关系分析设计。 + * **核心能力**:支持传入 `search_query` 提取相关的**子图(Subgraph)**,或按 `memory_type`(文本/动作/参数)导出全量数据。 + + +## 2. 关键接口参数 + +### 2.1 分页查询参数 (`/get_memory`) + +| 参数名 | 类型 | 必填 | 说明 | +| :--- | :--- | :--- | :--- | +| **`mem_cube_id`** | `str` | 是 | 目标 MemCube ID。 | +| **`user_id`** | `str` | 否 | 用户唯一标识符。 | +| **`page`** | `int` | 否 | 页码(从 1 开始)。若设为 `None` 则尝试全量导出。 | +| **`page_size`** | `int` | 否 | 每页条目数。 | +| `include_preference` | `bool` | 否 | 是否包含偏好记忆。 | + +### 2.2 全量/子图导出参数 (`/get_all`) + +| 参数名 | 类型 | 必填 | 说明 | +| :--- | :--- | :--- | :--- | +| **`user_id`** | `str` | 是 | 用户 ID。 | +| **`memory_type`** | `str` | 是 | 记忆类型:`text_mem`, `act_mem`, `para_mem`。 | +| `mem_cube_ids` | `list` | 否 | 待导出的 Cube ID 列表。 | +| `search_query` | `str` | 否 | 若提供,将基于此查询召回并返回相关的记忆子图。 | + +## 3. 快速上手示例 + +### 3.1 前端分页展示 (SDK 调用) + +```python +# 获取第一页,每页 10 条记忆 +res = client.get_memory( + user_id="sde_dev_01", + mem_cube_id="cube_research_01", + page=1, + page_size=10 +) + +for mem in res.data: + print(f"[{mem['type']}] {mem['memory_value']}") +``` +### 3.2 导出特定的事实记忆子图 +```python +# 提取与“R 语言”相关的全部事实记忆 +res = client.get_all( + user_id="sde_dev_01", + memory_type="text_mem", + search_query="R language visualization" +) +``` + +## 4. 响应结构说明 +接口返回标准的业务响应,其中 data 包含记忆对象数组。每条记忆通常包含以下核心字段: + +`id`: 记忆唯一标识,用于执行 获取详情 或 删除 操作。 + +`memory_value`: 经过算法加工后的记忆文本。 + +`tags`: 关联的自定义标签。 + +::note +开发者提示: 如果您已知记忆 ID 并希望查看其完整的元数据(如 confidence 或 usage 记录),请使用`获取记忆详情`(Get_ memory_by_id)接口。 ::: diff --git a/docs/cn/open_source/open_source_api/core/get_memory_by_id.md b/docs/cn/open_source/open_source_api/core/get_memory_by_id.md new file mode 100644 index 00000000..761b8921 --- /dev/null +++ b/docs/cn/open_source/open_source_api/core/get_memory_by_id.md @@ -0,0 +1,58 @@ +--- +title: 获取记忆详情 (Get Memory Detail) +desc: 通过记忆唯一标识符 (ID) 获取单条记忆的完整元数据,包括置信度、背景信息及使用记录。 +--- + +**接口路径**:`GET /product/get_memory/{memory_id}` +**功能描述**:本接口允许开发者检索单条记忆的所有底层细节。与返回摘要信息的检索接口不同,此接口会暴露该记忆的生命周期数据(如向量同步状态、AI 提取背景等),是系统管理与故障排查的核心工具。 + +## 1. 为什么需要获取详情? + +* **元数据透视**:查看 AI 在提取该条记忆时的 `confidence`和 `background`。 +* **生命周期检验**:确认该记忆的 `vector_sync`(向量同步)是否成功,以及其 `updated_at` 时间戳。 +* **使用追踪**:通过 `usage` 记录,追踪该记忆在哪些会话中被召回并辅助了生成。 + + +## 2. 关键接口参数 + +该接口采用标准的 RESTful 路径参数形式: + +| 参数名 | 位置 | 类型 | 必填 | 说明 | +| :--- | :--- | :--- | :--- | :--- | +| **`memory_id`** | Path | `str` | 是 | 记忆的唯一标识符(UUID)。您可以从 [**获取记忆列表**](./get_memory_list.md) 或 [**检索**](./search_memory.md) 的结果中获得此 ID。 | + +## 3. 工作原理 (MemoryHandler) + +1. **直通查询**:由 **MemoryHandler** 直接绕过业务编排层,与底层核心组件 **naive_mem_cube** 交互。 +2. **数据补全**:系统会从持久化数据库中拉取完整的 `metadata` 字典并返回,不进行任何语义截断。 + +## 4. 响应数据详解 + +响应体中的 `data` 对象包含以下核心字段: + +| 字段名 | 说明 | +| :--- | :--- | +| **`id`** | 记忆唯一标识符。 | +| **`memory`** | 记忆的文本内容,通常包含标注(如 `[user观点]`)。 | +| **`metadata.confidence`** | AI 提取该记忆的置信度分数(0.0 - 1.0)。 | +| **`metadata.type`** | 记忆分类,如 `fact` (事实) 或 `preference` (偏好)。 | +| **`metadata.background`** | 详细描述 AI 为何提取该记忆及其上下文背景。 | +| **`metadata.usage`** | 列表形式,记录该记忆被模型使用的历史时间与环境。 | +| **`metadata.vector_sync`**| 向量数据库同步状态,通常为 `success`。 | + +## 5. 
快速上手示例 + +使用 SDK 发起详情查询: + +```python +# 假设已知一条记忆的 ID +mem_id = "2f40be8f-736c-4a5f-aada-9489037769e0" + +# 获取完整详情 +res = client.get_memory_by_id(memory_id=mem_id) + +if res and res.code == 200: + metadata = res.data.get('metadata', {}) + print(f"记忆背景: {metadata.get('background')}") + print(f"同步状态: {metadata.get('vector_sync')}") +``` diff --git a/docs/cn/open_source/open_source_api/core/search_memory.md b/docs/cn/open_source/open_source_api/core/search_memory.md new file mode 100644 index 00000000..6b5c1340 --- /dev/null +++ b/docs/cn/open_source/open_source_api/core/search_memory.md @@ -0,0 +1,95 @@ +--- +title: 检索记忆 (Search Memory) +desc: 基于 MemCube 隔离机理,利用语义检索和逻辑过滤从记忆库中召回最相关的上下文信息。 +--- + +**接口路径**:`POST /product/search` +**功能描述**:本接口是 MemOS 实现检索增强生成 (RAG) 的核心。它能够跨越多个隔离的 **MemCube** 进行语义匹配,自动召回相关的事实、用户偏好及工具调用记录。 + +## 1. 核心机理:Readable Cubes + +与云服务的单一用户视角不同,开源版接口通过 **`readable_cube_ids`** 实现了极其灵活的检索范围控制: + +* **跨 Cube 检索**:您可以同时指定多个 Cube ID(如 `[用户私有Cube, 企业公共知识库Cube]`),算法会并行从这些隔离的记忆体中召回最相关内容。 +* **软信号权重**:通过传入 `session_id`,系统会在召回时优先考虑该会话内的内容。这仅作为提升相关性的“权重”,而非强制过滤。 +* **绝对隔离**:未包含在 `readable_cube_ids` 列表中的 Cube 内容在算法层是完全不可见的,确保了多租户环境下的数据安全性。 + + + +## 2. 关键接口参数 + +核心检索参数定义如下: + +### 检索基础 +| 参数名 | 类型 | 必填 | 说明 | +| :--- | :--- | :--- | :--- | +| **`query`** | `str` | 是 | 用户的搜索查询语句,系统将基于此进行语义匹配。 | +| **`user_id`** | `str` | 是 | 请求发起者的唯一标识,用于鉴权与上下文追踪。 | +| **`readable_cube_ids`**| `list[str]`| 是 | **核心参数**:指定本次检索可读取的 Cube ID 列表。 | +| **`mode`** | `str` | 否 | **搜索策略**:可选 `fast` (快速), `fine` (精细), `mixture` (混合)。 | + +### 召回控制 +| 参数名 | 类型 | 默认值 | 说明 | +| :--- | :--- | :--- | :--- | +| **`top_k`** | `int` | `10` | 召回文本记忆的数量上限。 | +| **`include_preference`**| `bool` | `true` | 是否召回相关的用户偏好记忆(显式/隐式偏好)。 | +| **`search_tool_memory`**| `bool` | `true` | 是否召回相关的工具调用记录。 | +| **`filter`** | `dict` | - | 逻辑过滤器,支持按标签或元数据进行精确过滤。 | +| **`dedup`** | `str` | - | 去重策略:`no` (不去重), `sim` (语义去重), `None` (默认精确文本去重)。 | + +## 3. 工作原理 (SearchHandler 策略) + +当请求到达后端时,**SearchHandler** 会根据指定的 `mode` 调用不同的组件执行检索: + +1. **查询重写**:利用 LLM 对用户的 `query` 进行语义增强,提升匹配精度。 +2. **多模式匹配**: + * **Fast 模式**:通过向量索引进行快速召回,适用于对响应速度要求极高的场景。 + * **Fine 模式**:增加重排序(Rerank)环节,提升召回内容的相关度。 + * **Mixture 模式**:结合语义搜索与图谱搜索,召回更具深度的关联记忆。 +3. **多维聚合**:系统并行检索事实、偏好(`pref_top_k`)和工具记忆(`tool_mem_top_k`),并将结果聚合返回。 +4. **后处理去重**:根据 `dedup` 配置对高度相似的记忆条目进行压缩。 + +## 4. 
快速上手示例 + +通过 SDK 进行多 Cube 联合检索: + +```python +from memos.api.client import MemOSClient + +client = MemOSClient(api_key="...", base_url="...") + +# 场景:同时检索用户记忆和两个专业知识库 +res = client.search_memory( + user_id="sde_dev_01", + query="根据我之前的偏好,推荐一些 R 语言的可视化方案", + # 传入可读的 Cube 列表,包括个人空间和两个知识库 + readable_cube_ids=["user_01_private", "kb_r_lang", "kb_data_viz"], + mode="fine", # 使用精细模式以获得更准确的推荐 + include_preference=True, # 召回“用户喜欢简洁风格”等偏好 + top_k=5 +) + +if res: + # 结果包含在 memory_detail_list 中 + print(f"召回结果: {res.data}") +``` + +## 5.进阶:使用过滤器 (Filter) +SearchHandler 支持复杂的过滤器,以满足更细粒度的业务需求: +```python + +# 示例:仅搜索标签为 "Programming" 且创建于 2026 年之后的记忆 +search_filter = { + "and": [ + {"tags": {"contains": "Programming"}}, + {"created_at": {"gt": "2026-01-01"}} + ] +} + +res = client.search_memory( + query="数据清洗逻辑", + user_id="sde_dev_01", + readable_cube_ids=["user_01_private"], + filter=search_filter +) +``` diff --git a/docs/cn/open_source/open_source_api/help/error_codes.md b/docs/cn/open_source/open_source_api/help/error_codes.md new file mode 100644 index 00000000..6a7f1b72 --- /dev/null +++ b/docs/cn/open_source/open_source_api/help/error_codes.md @@ -0,0 +1,48 @@ +--- +title: 错误码 +--- + +| 错误码 | 含义 | 推荐解决方法 | +| :--- | :--- | :--- | +| **参数错误** | | | +| 40000 | 请求参数错误 | 检查参数名、类型及格式是否符合要求 | +| 40001 | 请求数据不存在 | 检查资源 ID(如 memory_id)是否正确 | +| 40002 | 必填参数不能为空 | 补充缺失的必填字段 | +| 40003 | 参数为空 | 检查传入的列表或对象是否为空 | +| 40006 | 不支持的类型 | 检查 type 字段取值 | +| 40007 | 不支持的文件类型 | 仅上传允许的格式(.pdf, .docx, .doc, .txt) | +| 40008 | Base64 内容非法 | 检查 Base64 字符串是否包含非法字符 | +| 40009 | Base64 格式非法 | 检查 Base64 编码格式是否正确 | +| 40010 | 用户 ID 过长 | user_id 长度不能超过 100 字符 | +| 40011 | 会话 ID 过长 | conversation_id 长度不能超过 100 字符 | +| 40020 | 项目 ID 非法 | 确认 Project ID 格式正确 | +| **认证与权限错误** | | | +| 40100 | 需要 API Key 认证 | 在 Header 中添加有效 API Key | +| 40130 | 需要 API Key 认证 | 在 Header 中添加有效 API Key | +| 40132 | API Key 无效或已过期 | 检查 API Key 状态或重新生成 | +| **配额与限流错误** | | | +| 40300 | 超过接口调用次数上限 | 获取更多额度 | +| 40301 | 超过请求 Token 调用上限 | 减少输入内容或获取更多额度 | +| 40302 | 超过响应 Token 调用上限 | 缩短预期输出或获取更多额度 | +| 40303 | 单次对话长度超过限制 | 缩减单次输入/输出长度 | +| 40304 | 账户总 API 调用次数耗尽 | 获取更多额度 | +| 40305 | 输入超过单次 Token 上限 | 缩减输入内容 | +| 40306 | 删除记忆鉴权失败 | 确认是否有权删除该记忆 | +| 40307 | 删除记忆不存在 | 检查 memory_id 是否有效 | +| 40308 | 删除记忆对应的用户不存在 | 检查 user_id 是否正确 | +| **系统与服务错误** | | | +| 50000 | 系统内部异常 | 服务器繁忙或出现异常,请联系支持 | +| 50002 | 操作失败 | 检查操作逻辑或稍后重试 | +| 50004 | 记忆服务暂时不可用 | 稍后重试记忆写入/获取操作 | +| 50005 | 搜索服务暂时不可用 | 稍后重试记忆搜索操作 | +| **知识库与操作错误** | | | +| 50103 | 文件数量超过限制 | 单次上传文件数量不超过20个 | +| 50104 | 单个文件大小超过限制 | 确保单个文件不超过 100MB | +| 50105 | 所有文件总大小超过限制 | 确保单次上传总大小不超过 300MB | +| 50107 | 文件上传格式不符合要求 | 检查并更换文件格式 | +| 50120 | 知识库不存在 | 确认知识库 ID 是否正确 | +| 50123 | 知识库不关联此项目 | 确认知识库是否已授权给当前项目 | +| 50131 | 任务不存在 | 检查 task_id 是否正确(常见于查询处理状态) | +| 50143 | 添加记忆失败 | 算法服务处理异常,请稍后重试 | +| 50144 | 添加消息失败 | 保存聊天历史记录失败 | +| 50145 | 保存反馈并写入记忆失败 | 反馈处理过程中出现异常 | diff --git a/docs/cn/open_source/open_source_api/message/feedback.md b/docs/cn/open_source/open_source_api/message/feedback.md new file mode 100644 index 00000000..eaa73d72 --- /dev/null +++ b/docs/cn/open_source/open_source_api/message/feedback.md @@ -0,0 +1,78 @@ +--- +title: 添加反馈 +desc: 提交用户对大模型回复的反馈内容,帮助 MemOS 实时更正、优化或删除不准确的记忆。 +--- + + +**接口路径**:`POST /product/feedback` +**功能描述**:本接口用于处理用户对 AI 回复或记忆内容的反馈。通过分析 `feedback_content`,系统可以自动定位并修改存储在 **MemCube** 中的错误事实,或根据用户的正负反馈调整记忆的权重。 + +## 1. 
核心机理:记忆纠偏循环 + +**FeedbackHandler** 提供了比普通添加接口更精细的控制逻辑: + +* **精确修正 (Precise Correction)**:通过提供 `retrieved_memory_ids`,系统可以直接针对某几条特定的检索结果进行更正,避免误伤其他记忆。 +* **语境分析**:结合 `history`(对话历史),系统能够理解反馈背后的真实意图(例如“你说错了,我现在的公司是 A 而不是 B”)。 +* **结果回显**:如果开启 `corrected_answer=true`,接口在处理完记忆更正后,会尝试返回一个基于新事实生成的更正后回答。 + +## 2. 关键接口参数 +本接口核心参数定义如下: + +| 参数名 | 类型 | 必填 | 默认值 | 说明 | +| :--- | :--- | :--- | :--- | :--- | +| **`user_id`** | `str` | 是 | - | 用户唯一标识符。 | +| **`history`** | `list` | 是 | - | 最近的对话历史,用于提供反馈的语境。 | +| **`feedback_content`** | `str` | 是 | - | **核心:** 用户的反馈文本内容。 | +| **`writable_cube_ids`**| `list` | 否 | - | 需要执行记忆更正的目标 Cube 列表。 | +| `retrieved_memory_ids` | `list` | 否 | - | 可选。上一次检索出的、需要被修正的特定记忆 ID 列表。 | +| `async_mode` | `str` | 否 | `async` | 处理模式:`async` (后台处理) 或 `sync` (实时处理并等待)。 | +| `corrected_answer` | `bool` | 否 | `false` | 是否需要系统在修正记忆后返回一个纠正后的新回答。 | +| `info` | `dict` | 否 | - | 附加元数据。 | + +## 3. 工作原理 + +1. **冲突检测**:`FeedbackHandler` 接收反馈后,会对比 `history` 与 `writable_cube_ids` 中现有的记忆事实。 +2. **定位与更新**: + * 若提供了 `retrieved_memory_ids`,则直接更新对应节点。 + * 若未提供 ID,系统通过语义匹配找到最相关的过时记忆进行覆盖或标记为无效。 +3. **权重调整**:对于态度模糊的反馈,系统会调整特定记忆条目的 `confidence`(置信度)或可信度等级。 +4. **异步生产**:在 `async` 模式下,修正逻辑由 `MemScheduler` 异步执行,接口立即返回 `task_id`。 + +## 4. 快速上手示例 + + +```python +from memos.api.client import MemOSClient + +client = MemOSClient(api_key="...", base_url="...") + +# 场景:修正 AI 关于用户职业的错误记忆 +res = client.add_feedback( + user_id="dev_user_01", + feedback_content="我不再减肥了,现在不需要控制饮食。", + history=[ + {"role": "assistant", "content": "您正在减肥中,近期是否控制了摄入食物的热量?"}, + {"role": "user", "content": "我不再减肥了..."} + ], + writable_cube_ids=["private_cube_01"], + # 指定具体的错误记忆 ID,以实现精准打击 + retrieved_memory_ids=["mem_id_old_job_123"], + corrected_answer=True # 要求 AI 重新根据新事实回复我 +) + +if res and res.code == 200: + print(f"修正进度: {res.message}") + if res.data: + print(f"更正后的回复: {res.data}") +``` + + +## 5. 使用场景 +### 5.1 纠正 AI 的错误推断 +人工干预:在管理后台提供“纠错”按钮,当管理员发现 AI 提取的记忆条目有误时,调用此接口进行人工更正。 +### 5.2 更新过时的用户偏好 +用户即时纠偏:在对话 UI 中,如果用户说出类似“记错了”、“不是这样的”等话语,可以自动触发此接口,利用 is_feedback=True 实现记忆的实时净化。 + +::note +如果反馈涉及的是公共知识库,请确保当前用户拥有对该 Cube 的写入权限。 +:: diff --git a/docs/cn/open_source/open_source_api/message/get_message.md b/docs/cn/open_source/open_source_api/message/get_message.md new file mode 100644 index 00000000..1387d881 --- /dev/null +++ b/docs/cn/open_source/open_source_api/message/get_message.md @@ -0,0 +1,75 @@ +--- +title: 获取消息 +desc: 获取指定会话中的用户与助手原始对话历史,用于构建聊天 UI 或提取原始语境。 +--- + +::warning +**[直接看 API文档 点这里哦](/api_docs/message/get_message)** +
+
+ +**本文聚焦于开源项目的功能说明,详细接口字段及限制请点击上方文字链接查看** +:: + +**接口路径**:`POST /product/get/message` +**功能描述**:该接口用于获取指定会话中用户与助手的原始对话记录。与返回摘要信息的“记忆”接口不同,此接口返回的是未经加工的原始文本,是构建聊天历史回溯功能的核心接口。 + +## 1. 记忆 (Memory) vs 消息 (Message) + +在开发过程中,请区分以下两类数据: +* **获取记忆 (`/get_memory`)**:返回的是系统处理后的**事实与偏好摘要**(例如:“用户喜欢 R 语言进行可视化”)。 +* **获取消息 (`/get_message`)**:返回的是**原始对话文本**(例如:“我最近在自学 R 语言,推荐个可视化包”)。 + +## 2. 关键接口参数 +本接口支持以下参数: + +| 参数名 | 类型 | 必填 | 默认值 | 说明 | +| :--- | :--- | :--- | :--- | :--- | +| `user_id` | `str` | 是 | - | 与获取消息关联的用户唯一标识符。 | +| `conversation_id` | `str` | 否 | `None` | 指定会话的唯一标识符。 | +| `message_limit_number` | `int` | 否 | `6` | 限制返回的消息条数,最大建议值为 50。 | +| `conversation_limit_number`| `int` | 否 | `6` | 限制返回的会话历史条数。 | +| `source` | `str` | 否 | `None` | 标识消息的来源渠道。 | + +## 3. 工作原理 + + +1. **定位会话**:系统根据提供的 `conversation_id` 在底层存储中检索属于该用户及会话的消息记录。 +2. **切片处理**:根据 `message_limit_number` 参数,系统从最新的消息开始倒序截取指定条数,确保返回的是最近的对话。 +3. **安全隔离**:所有请求均通过 `RequestContextMiddleware` 中间件,严格校验 `user_id` 的归属权,防止越权访问。 + +## 4. 快速上手示例 + +使用开源版内置的 `MemOSClient` 快速拉取对话历史: + +```python +from memos.api.client import MemOSClient + +# 初始化客户端 +client = MemOSClient( + api_key="YOUR_LOCAL_API_KEY", + base_url="http://localhost:8000/product" +) + +# 获取指定会话的最近 10 条对话记录 +res = client.get_message( + user_id="memos_user_123", + conversation_id="conv_r_study_001", + message_limit_number=10 +) + +if res and res.code == 200: + # 遍历返回的消息列表 + for msg in res.data: + print(f"[{msg['role']}]: {msg['content']}") +``` + +## 5. 使用场景 +### 5.1 聊天 UI 历史加载 +当用户点击进入某个历史会话时,调用此接口可恢复对话现场。建议结合 `message_limit_number` 实现分页加载,提升前端性能。 + +### 5.2 外部模型上下文注入 +如果您正在使用自定义的大模型逻辑(非 MemOS 内置 chat 接口),可以通过此接口获取原始对话历史,并将其手动拼接至模型的 messages 数组中。 + +### 5.3 消息回溯分析 +您可以定期导出原始对话记录,用于评估 AI 的回复质量或分析用户的潜在意图。 diff --git a/docs/cn/open_source/open_source_api/message/get_suggestion_queries.md b/docs/cn/open_source/open_source_api/message/get_suggestion_queries.md new file mode 100644 index 00000000..34bcbd1e --- /dev/null +++ b/docs/cn/open_source/open_source_api/message/get_suggestion_queries.md @@ -0,0 +1,70 @@ +--- +title: 获取建议问题 (Get Suggestions) +desc: 基于当前对话语境或 Cube 内的近期记忆,自动生成 3 条后续对话建议。 +--- + +# 获取建议问题 (Get Suggestion Queries) + +**接口路径**:`POST /product/suggestions` +**功能描述**:本接口用于实现“猜你想问”功能。系统会根据提供的对话上下文或目标 **MemCube** 中的近期记忆,通过大语言模型生成 3 个相关的建议问题,帮助用户延续对话。 + +## 1. 核心机理:双模式生成策略 + +**SuggestionHandler** 根据入参的不同,支持两种灵活的生成模式: + +* **基于对话的即时建议 (Context-based)**: + * **触发条件**:在请求中提供了 `message`(对话记录)。 + * **逻辑**:系统分析最近的对话内容,生成 3 个与当前话题紧密相关的后续问题。 +* **基于记忆的发现建议 (Memory-based)**: + * **触发条件**:未提供 `message`。 + * **逻辑**:系统会从 `mem_cube_id` 指定的记忆体中检索“最近记忆”,并据此生成与用户近期生活、工作状态相关的启发式问题。 + + + +## 2. 关键接口参数 + +核心参数定义如下: + +| 参数名 | 类型 | 必填 | 默认值 | 说明 | +| :--- | :--- | :--- | :--- | :--- | +| **`user_id`** | `str` | 是 | - | 用户唯一标识符。 | +| **`mem_cube_id`** | `str` | 是 | - | **核心参数**:指定建议生成所依据的记忆空间。 | +| **`language`** | `str` | 否 | `zh` | 生成建议使用的语言:`zh` (中文) 或 `en` (英文)。 | +| `message` | `list/str`| 否 | - | 当前对话上下文。若提供,则生成基于对话的建议。 | + +## 3. 工作原理 (SuggestionHandler) + +1. **语境识别**:`SuggestionHandler` 首先检查 `message` 字段。若有值,则提取对话精髓;若为空,则转向底层 `MemCube` 获取最近动态。 +2. **模板匹配**:系统根据 `language` 参数自动切换内置的中英文提示词模板(Prompt Templates)。 +3. **模型推理**:调用 LLM 对背景资料进行推导,确保生成的 3 个问题既符合逻辑又具有启发性。 +4. **格式化输出**:将建议问题以数组形式返回,便于前端直接渲染为点击按钮。 + +## 4. 
快速上手示例 + +使用 SDK 获取针对当前对话的中文建议: + +```python +from memos.api.client import MemOSClient + +client = MemOSClient(api_key="...", base_url="...") + +# 场景:根据刚刚关于“R语言”的对话生成建议 +res = client.get_suggestions( + user_id="dev_user_01", + mem_cube_id="private_cube_01", + language="zh", + message=[ + {"role": "user", "content": "我想学习 R 语言的可视化。"}, + {"role": "assistant", "content": "推荐您学习 ggplot2 包,它是 R 语言可视化的核心工具。"} + ] +) + +if res and res.code == 200: + # 示例输出: ["如何安装 ggplot2?", "有哪些经典的 ggplot2 教程?", "R 语言还有哪些可视化包?"] + print(f"建议问题: {res.data}") +``` + +## 5. 使用场景建议 +对话引导:在 AI 回复完用户后,自动调用此接口,在回复框下方展示建议按钮,引导用户深入探讨。 + +冷启动激活:当用户进入一个新的会话且尚未发言时,通过“基于记忆模式”展示用户可能感兴趣的往期话题,打破沉默。 diff --git a/docs/cn/open_source/open_source_api/scheduler/ wait.md b/docs/cn/open_source/open_source_api/scheduler/ wait.md new file mode 100644 index 00000000..9849ffe6 --- /dev/null +++ b/docs/cn/open_source/open_source_api/scheduler/ wait.md @@ -0,0 +1,77 @@ +--- +title: 高级任务同步 (Advanced Task Synchronization) +desc: 提供阻塞等待与流式进度观测能力,确保在执行后续操作前,指定用户的异步任务已全部处理完成。 +--- + + +**接口路径**: +* **同步阻塞等待**:`POST /product/scheduler/wait` +* **实时进度流 (SSE)**:`GET /product/scheduler/wait/stream` + +**功能描述**:在自动化脚本、数据迁移或集成测试场景中,通常需要确保所有的异步记忆提取任务(如 LLM 事实提取、向量入库)已完全结束。本模块接口允许客户端“挂起”请求,直到调度器检测到目标用户的任务队列已清空。 + +## 1. 核心机理:调度器空闲检测 + +系统通过 **SchedulerHandler** 实时监控底层 **MemScheduler** 的运行状态: + +* **队列检查**:系统会检查 Redis Stream 中属于该用户的待处理任务(Pending)及排队任务(Remaining)。 +* **空闲判定**:仅当队列计数为 0 且当前没有 Worker 正在执行该用户的任务时,判定为“空闲 (Idle)”。 +* **超时保护**:为防止无限期阻塞,接口支持设置 `timeout_seconds`。若达到上限任务仍未完成,接口将返回当前状态并停止等待。 + + + +## 2. 关键接口参数 + +这两个接口共享以下查询参数(Query Parameters): + +| 参数名 | 类型 | 必填 | 默认值 | 说明 | +| :--- | :--- | :--- | :--- | :--- | +| **`user_name`** | `str` | 是 | - | 目标用户的名称或 ID。 | +| `timeout_seconds`| `num` | 否 | - | 最大等待时长(秒)。超过此时间将自动返回。 | +| `poll_interval` | `num` | 否 | - | 内部检查队列状态的频率(秒)。 | + +## 3. 响应模式选型 + +### 3.1 同步阻塞模式 (`/wait`) +* **特点**:标准的 HTTP 响应。连接会保持开启,直到任务清空或超时。 +* **场景**:编写自动化测试脚本或在执行 `search` 前确保数据已入库。 + +### 3.2 实时流模式 (`/wait/stream`) +* **特点**:基于 **Server-Sent Events (SSE)** 技术。 +* **场景**:在管理后台展示动态进度条,实时显示任务队列的缩减过程。 + +## 4. 快速上手示例 + +使用开源版 SDK 进行阻塞式等待: + +```python +from memos.api.client import MemOSClient + +client = MemOSClient(api_key="...", base_url="...") +user_name = "dev_user_01" + +# --- 场景 A:同步阻塞等待 (常用于 Python 自动化脚本) --- +print(f"正在等待用户 {user_name} 的任务队列清空...") +res = client.wait_until_idle( + user_name=user_name, + timeout_seconds=300, + poll_interval=2 +) +if res and res.code == 200: + print("✅ 任务已全部完成。") + +# --- 场景 B:流式进度观测 (常用于前端进度条渲染) --- +print("开始监听任务实时进度流...") +# 注意:SSE 接口在 SDK 中通常返回一个生成器 (Generator) +progress_stream = client.stream_scheduler_progress( + user_name=user_name, + timeout_seconds=300 +) + +for event in progress_stream: + # 实时打印剩余任务数 + print(f"当前排队任务数: {event['remaining_tasks_count']}") + if event['status'] == 'idle': + print("🎉 调度器已空闲") + break +``` diff --git a/docs/cn/open_source/open_source_api/scheduler/get_status.md b/docs/cn/open_source/open_source_api/scheduler/get_status.md new file mode 100644 index 00000000..87e1a4a5 --- /dev/null +++ b/docs/cn/open_source/open_source_api/scheduler/get_status.md @@ -0,0 +1,97 @@ +--- +title: 任务调度与状态监控 (Scheduler Status) +desc: 监控 MemOS 异步任务的生命周期,提供包括任务进度、队列积压及系统负载在内的全方位观测能力。 +--- + +**接口路径**: +* **系统级概览**:`GET /product/scheduler/allstatus` +* **任务进度查询**:`GET /product/scheduler/status` +* **用户队列指标**:`GET /product/scheduler/task_queue_status` + +**功能描述**:本模块接口旨在为开发者提供异步记忆生产链路的可观测性。通过这些接口,您可以实时追踪特定任务的完成状态,监控 Redis 任务队列的积压情况,以及获取整个调度系统的运行指标。 + +## 1. 
核心机理:MemScheduler 调度体系 + +在开源架构中,**MemScheduler** 负责处理所有高耗时的后台任务(如 LLM 记忆提取、向量索引构建等): + +* **状态流转**:任务在生命周期内会经历 `waiting` (等待中)、`in_progress` (执行中)、`completed` (已完成) 或 `failed` (失败) 等状态。 +* **队列监控**:系统基于 Redis Stream 实现任务分发。通过监控 `pending` (已交付未确认) 和 `remaining` (排队中) 任务数,可以评估系统的处理压力。 +* **多维度观测**:支持从“单任务”、“单用户队列”以及“全系统 summary”三个维度进行状态透视。 + + +## 2. 接口详解 + +### 2.1 任务进度查询 (`/status`) +用于追踪特定异步任务的当前执行阶段。 + +| 参数名 | 类型 | 必填 | 说明 | +| :--- | :--- | :--- | :--- | +| **`user_id`** | `str` | 是 | 请求查询的用户唯一标识符。 | +| `task_id` | `str` | 否 | 可选。若提供,则仅查询该特定任务的状态。 | + +**返回状态说明**: +* `waiting`: 任务已进入队列,等待空闲 Worker 执行。 +* `in_progress`: Worker 正在调用大模型提取记忆或写入数据库。 +* `completed`: 记忆已成功持久化并完成向量索引同步。 +* `failed`: 任务失败。 + +### 2.2 用户队列指标 (`/task_queue_status`) +用于监控指定用户在 Redis 中的任务积压情况。 + +| 参数名 | 类型 | 必填 | 说明 | +| :--- | :--- | :--- | :--- | +| **`user_id`** | `str` | 是 | 需查询队列状况的用户 ID。 | + +**核心指标项**: +* `pending_tasks_count`: 已分发给 Worker 但尚未收到确认(Ack)的任务数。 +* `remaining_tasks_count`: 当前仍在队列中排队等待分配的任务总数。 +* `stream_keys`: 匹配到的 Redis Stream 键名列表。 + +### 2.3 系统级概览 (`/allstatus`) +获取调度器的全局运行概况,通常用于管理员后台监控。 + +* **核心返回信息**: + * `scheduler_summary`: 包含系统当前的负载与健康状况。 + * `all_tasks_summary`: 所有正在运行及排队任务的聚合统计。 + +## 3. 工作原理 (SchedulerHandler) + +当您发起状态查询请求时,**SchedulerHandler** 会执行以下操作: + +1. **缓存检索**:首先从 Redis 状态缓存中查找 `task_id` 对应的实时进度。 +2. **队列确认**:若查询队列指标,Handler 会调用 Redis 统计指令(如 `XLEN`, `XPENDING`)分析 Stream 状态。 +3. **指标聚合**:对于全局状态请求,Handler 会汇总所有活跃节点的指标,生成系统级的 summary 数据。 + +## 4. 快速上手示例 + +使用 SDK 轮询任务状态直至完成: + +```python +from memos.api.client import MemOSClient +import time + +client = MemOSClient(api_key="...", base_url="...") + +# 1. 系统级概览:查看整个 MemOS 系统的运行健康度 +global_res = client.get_all_scheduler_status() +if global_res: + print(f"系统运行概况: {global_res.data['scheduler_summary']}") + +# 2. 队列指标监控:检查特定用户的任务积压情况 +queue_res = client.get_task_queue_status(user_id="dev_user_01") +if queue_res: + print(f"待处理任务数: {queue_res.data['remaining_tasks_count']}") + print(f"已下发未完成任务数: {queue_res.data['pending_tasks_count']}") + +# 3. 任务进度追踪:轮询特定任务直至结束 +task_id = "task_888999" +while True: + res = client.get_task_status(user_id="dev_user_01", task_id=task_id) + if res and res.code == 200: + current_status = res.data[0]['status'] # data 为状态列表 + print(f"任务 {task_id} 当前状态: {current_status}") + + if current_status in ['completed', 'failed', 'cancelled']: + break + time.sleep(2) +``` diff --git a/docs/cn/open_source/open_source_api/start/configuration.md b/docs/cn/open_source/open_source_api/start/configuration.md new file mode 100644 index 00000000..b6ec7cbb --- /dev/null +++ b/docs/cn/open_source/open_source_api/start/configuration.md @@ -0,0 +1,7 @@ +--- +title: 项目配置 +--- + +关于 MemOS 开源版 API 服务器的详细配置说明(包括 LLM 引擎、存储后端及环境变量设置),请参考: + +👉 [**REST API 服务器配置指南**](../../../getting_started/rest_api_server.md) diff --git a/docs/cn/open_source/open_source_api/start/overview.md b/docs/cn/open_source/open_source_api/start/overview.md new file mode 100644 index 00000000..824643eb --- /dev/null +++ b/docs/cn/open_source/open_source_api/start/overview.md @@ -0,0 +1,52 @@ +--- +title: 概述 +--- + +## 1. 接口介绍 + +MemOS 开源项目提供了一套基于 **FastAPI** 编写的高性能 REST API 服务。系统采用 **Component (组件) + Handler (处理器)** 架构,所有核心逻辑(如记忆提取、语义搜索、异步调度)均可通过标准的 REST 接口进行调用。 + +![MemOS Architecture](https://cdn.memtensor.com.cn/img/memos_run_server_success_compressed.png) +
MemOS REST API 服务架构概览
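
下面给出一个最小化的请求示意,便于直观感受这套 REST 接口的调用形态。示例中的端口(8000)、API Key 与查询内容均为占位假设,实际地址与字段请以你的部署和各接口文档为准:

```bash
# 示意:调用语义检索接口(端口、API Key、user_id 均为占位假设)
curl -X POST "http://localhost:8000/product/search" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"user_id": "memos_user_123", "query": "用户的编程语言偏好"}'
```

鉴权所需的 `Authorization` 头与 `user_id` 字段的具体要求,见下文「鉴权认证与上下文」一节。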
+ +### 核心功能特点 + +* **多维记忆生产**:支持通过 `AddHandler` 处理对话、文本或文档,并自动转化为结构化记忆。 +* **MemCube 物理隔离**:基于 Cube ID 实现不同用户或知识库之间的数据物理隔离与独立索引。 +* **端到端对话闭环**:通过 `ChatHandler` 编排“检索 -> 生成 -> 异步存储”的全流程。 +* **异步任务调度**:内置 `MemScheduler` 调度引擎,支持大规模记忆生产任务的削峰填谷与状态追踪。 +* **自我纠偏机制**:提供反馈接口,允许利用自然语言对已存储的记忆进行修正或标记。 + +## 2. 入门指南 + +通过以下两个核心步骤,快速将记忆能力集成到您的 AI 应用中: + +* [**添加记忆**](./core/add_memory.md):通过 `POST /product/add` 接口,将原始消息流写入指定的 MemCube,开启生产链路。 +* [**检索记忆**](./core/search_memory.md):通过 `POST /product/search` 接口,基于语义相似度从多个 Cube 中召回相关上下文。 + +## 3. 接口分类 + +MemOS 的功能接口分为以下几大类: + +* **[核心记忆 (Core)](./core/add_memory.md)**:包含记忆的增、删、改、查等原子操作。 +* **[智能对话 (Chat)](./chat/chat.md)**:实现带记忆增强的流式或全量对话响应。 +* **[消息管理 (Message)](./message/feedback.md)**:涵盖用户反馈、猜你想问(Suggestion)等增强交互接口。 +* **[异步调度 (Scheduler)](./scheduler/get_status.md)**:用于监控后台记忆提取任务的进度与队列状态。 +* **[系统工具 (Tools)](./tools/check_cube.md)**:提供 Cube 存在性校验及记忆归属反查等辅助功能。 + +## 4. 鉴权认证与上下文 + +### 鉴权机制 +在开源环境中,所有的 API 请求需要在 Header 中包含 `Authorization` 字段。 +* **开发环境**:您可以在本地 `.env` 或 `configuration.md` 中自定义 `API_KEY`。 +* **生产部署**:建议通过 `RequestContextMiddleware` 扩展 OAuth2 或更高级的身份校验逻辑。 + +### 请求上下文 +* **user_id**:请求体中必须包含此标识,用于 Handler 层的身份追踪。 +* **MemCube ID**:开源版的核心隔离单元。通过指定 `readable_cube_ids` 或 `writable_cube_ids`,您可以精确控制数据读写的物理边界。 + +## 5. 下一步行动 + +* 👉 [**系统配置**](./start/configuration.md):配置您的 LLM 提供商与向量数据库引擎。 +* 👉 [**添加第一条记忆**](./core/add_memory.md):尝试通过 SDK 或 Curl 提交第一组对话消息。 +* 👉 [**探索常见错误**](./help/error_codes.md):了解 API 状态码及其背后的异常处理机制。 diff --git a/docs/cn/open_source/open_source_api/tools/check_cube.md b/docs/cn/open_source/open_source_api/tools/check_cube.md new file mode 100644 index 00000000..066d8b9e --- /dev/null +++ b/docs/cn/open_source/open_source_api/tools/check_cube.md @@ -0,0 +1,50 @@ +--- +title: 检查 MemCube 存在性 (Check Cube Existence) +desc: 校验指定的 MemCube ID 是否已在系统中初始化并可用。 +--- + +**接口路径**:`POST /product/exist_mem_cube_id` +**功能描述**:本接口用于验证指定的 `mem_cube_id` 是否已经存在于系统中。它是确保数据一致性的“守门员”接口,建议在动态创建知识库或为新用户分配空间前调用,以避免重复初始化或无效操作。 + +## 1. 核心机理:Cube 索引校验 + +在 MemOS 架构中,MemCube 的存在性决定了后续所有记忆操作的合法性: + +* **逻辑校验**:系统通过 **MemoryHandler** 检索底层存储索引,确认该 ID 是否已注册。 +* **冷启动保障**:对于按需创建 Cube 的场景,该接口可用于判断是否需要执行初次 `add` 操作来激活记忆空间。 + + + +## 2. 关键接口参数 +请求体定义如下: + +| 参数名 | 类型 | 必填 | 说明 | +| :--- | :--- | :--- | :--- | +| **`mem_cube_id`** | `str` | 是 | 待校验的 MemCube 唯一标识符。 | + +## 3. 工作原理 (MemoryHandler) + +1. **直通索引**:**MemoryHandler** 接收请求后,直接调用底层 **naive_mem_cube** 的元数据查询接口。 +2. **状态检索**:系统在持久化层中查找该 ID 对应的配置文件或数据库记录。 +3. **布尔反馈**:返回结果不包含记忆内容,仅以 `code` 或 `data` 形式告知该 Cube 是否已激活。 + +## 4. 快速上手示例 + +使用 SDK 校验目标 Cube 状态: + +```python +from memos.api.client import MemOSClient + +client = MemOSClient(api_key="...", base_url="...") + +# 场景:在导入文档前确认目标知识库已创建 +kb_id = "kb_finance_2026" +res = client.exist_mem_cube_id(mem_cube_id=kb_id) + +if res and res.code == 200: + # 假设 data 字段返回布尔值或存在性对象 + if res.data.get('exists'): + print(f"✅ MemCube '{kb_id}' 已就绪。") + else: + print(f"❌ MemCube '{kb_id}' 尚未初始化。") +``` diff --git a/docs/cn/open_source/open_source_api/tools/get_user_names.md b/docs/cn/open_source/open_source_api/tools/get_user_names.md new file mode 100644 index 00000000..16dfcd04 --- /dev/null +++ b/docs/cn/open_source/open_source_api/tools/get_user_names.md @@ -0,0 +1,53 @@ +--- +title: 反向查询用户 (Get User Names) +desc: 通过记忆唯一标识符 (ID) 反向查询该条记忆所属的用户名称。 +--- + +**接口路径**:`POST /product/get_user_names_by_memory_ids` +**功能描述**:本接口提供了一种“逆向追踪”能力。当您在系统日志或共享存储中获取到特定的 `memory_id`,但无法确定其产生者时,可以使用此接口批量获取对应的用户名。 + +## 1. 
核心机理:元数据溯源 + +在 MemOS 的存储架构中,每条生成的记忆条目都与原始用户的元数据绑定。本接口通过以下逻辑执行溯源: + +* **多对一映射**:支持一次传入多个 `memory_id`,系统将返回对应的用户列表。 +* **管理透明度**:该工具通常用于管理后台,帮助管理员识别公共 Cube 中不同条目的贡献者。 + + + +## 2. 关键接口参数 + +请求体定义如下: + +| 参数名 | 类型 | 必填 | 说明 | +| :--- | :--- | :--- | :--- | +| **`memory_ids`** | `list[str]` | 是 | 待查询的记忆唯一标识符列表。 | + +## 3. 工作原理 (MemoryHandler) + +1. **ID 解析**:**MemoryHandler** 接收 ID 列表后,查询全局索引表。 +2. **关系检索**:系统从底层的持久化层(或关系图谱节点)中提取关联的 `user_id` 或 `user_name` 属性。 +3. **数据脱敏**:根据系统配置,返回对应的用户显示名称或标识符。 + +## 4. 快速上手示例 + +使用 SDK 执行反向查询: + +```python +from memos.api.client import MemOSClient + +client = MemOSClient(api_key="...", base_url="...") + +# 准备待查的记忆 ID 列表 +target_ids = [ + "2f40be8f-736c-4a5f-aada-9489037769e0", + "5e92be1a-826d-4f6e-97ce-98b699eebb98" +] + +# 执行查询 +res = client.get_user_names_by_memory_ids(memory_ids=target_ids) + +if res and res.code == 200: + # res.data 通常返回一个映射字典或用户列表 + print(f"该记忆片段归属于用户: {res.data}") +``` diff --git a/docs/cn/openclaw/changes.md b/docs/cn/openclaw/changes.md new file mode 100644 index 00000000..bc3d792e --- /dev/null +++ b/docs/cn/openclaw/changes.md @@ -0,0 +1,119 @@ +--- +title: OpenClaw 插件更新日志 +--- + +::OpenclawReleaseTimeline +--- +releases: + - date: '2026-05-08' + plugins: + - title: '云插件' + version: 'v0.1.15' + sections: + - title: '新增' + items: + - '为 OpenClaw / Moltbot / ClawDBot 插件清单新增 `activation.onCapabilities: ["hook"]` 能力声明。' + - '适配 OpenClaw 5.3 及其之后版本的插件加载机制:OpenClaw 会在插件注册前基于能力声明判断插件是否应被加载。该声明确保本插件作为 lifecycle hook 插件能够被正确识别和加载,从而继续注册 `before_agent_start`、`agent_end` 等 hook。' + - title: '改进' + items: + - '调整 `hooks.allowConversationAccess: true` 的自动补充时机,在 gateway 就绪后再写入宿主配置,使宿主配置更新能够触发 gateway 自动重启并应用所需的 hook 权限。' + + - date: '2026-04-29' + plugins: + - title: '云插件' + version: 'v0.1.14' + summary: '适配 OpenClaw 2026.4.23 及其之后版本对 agent_end 的权限限制:插件会在启动 gateway 时自动检查配置,并为插件补充 `hooks.allowConversationAccess: true`,帮助用户避免因缺少该配置导致记忆写入相关 hook 无法正常工作。' + + - date: '2026-04-16' + plugins: + - title: '云插件' + version: 'v0.1.13' + summary: '全面支持多 Agent 模式下的共享知识库访问与协同处理。' + sections: + - title: '共享知识库支持(多 Agent 场景)' + items: + - '**多 Agent 知识库支持**:全面支持了多 Agent 对知识库的协同访问与处理。允许不同的 Agent 节点共享、检索和调用同一个知识库中的数据,提升了复杂任务下多智能体协作时的知识获取效率与上下文一致性。' + + - date: '2026-04-03' + plugins: + - title: '云插件' + version: 'v0.1.12' + summary: '推出本地可视化配置界面,深度重构配置解析架构并适配 OpenClaw 插件安全审查。' + sections: + - title: '可视化配置 UI (Config UI)' + items: + - '**本地配置服务**:内置 HTTP 服务提供插件管理后台,支持在浏览器中可视化查看与修改配置,并实现配置变更的实时同步(默认访问地址为 `http://127.0.0.1:38463`)。' + - '**启动稳定性保障**:服务启动流程中引入了网关就绪检测 (`waitForGatewayReady`),确保服务状态稳定。' + - '**界面体验优化**:新增响应式布局与可折叠悬浮导航工具,并补充了全新的 SVG 图标。' + - title: '架构优化与安全合规' + items: + - '**适配插件安全审查(移除子进程)**:为了符合严格的插件沙箱与安全合规要求,完全移除了 `child_process` 的 `spawn`/`exec` 调用。插件自更新机制由原来的“后台静默下载并强制更新”改为了“仅检测版本并在日志中打印手动更新命令提示”,避免后台进程逃逸风险。' + - '**适配插件安全审查(移除默认越权)**:移除了 `plugin.json` 声明文件中的所有 `default` 默认值设定,确保插件在无显式配置时不会触发越权或非预期调用。' + - '**配置 Schema 集中管理**:重构配置解析逻辑 (`getConfigResolution`),集中管理环境变量、用户配置与默认值的优先级策略,提升了代码的安全性和健壮性。' + + - date: '2026-03-30' + plugins: + - title: '云插件' + version: 'v0.1.11' + summary: '强化多 Agent 场景的细粒度控制,增强动态用户标识提取能力。' + sections: + - title: '会话与用户身份管理' + items: + - '**Direct Session User ID 支持**:新增 `useDirectSessionUserId` 配置,开启后可直接从 `sessionKey` 中解析并提取真实会话的用户 ID,满足复杂代理场景下的数据隔离需求。' + - title: '多 Agent 配置增强' + items: + - '**Agent 运行白名单**:新增 `allowedAgents` 配置项,允许在多 Agent 模式下仅对特定的 Agent 触发记忆召回和记录,避免全局拦截带来的冗余消耗。' + - '**差异化覆盖机制 (Agent Overrides)**:引入 `agentOverrides` 配置对象,支持针对不同的 Agent 单独覆盖如知识库 ID (`knowledgebaseIds`)、召回条数 (`memoryLimitNumber`)、功能开关 
(`recallEnabled`) 等核心参数。' + + - date: '2026-03-24' + plugins: + - title: '云插件' + version: 'v0.1.10' + sections: + - items: + - '**消息入库质量提升**:新增并强化对 OpenClaw 入站元数据、时间戳包裹、飞书尾部系统提示的清洗,减少无效噪音写入记忆。' + - '**多渠道消息前缀清洗优化**:扩展并统一处理 WebChat、WhatsApp、Telegram、Slack、Discord、Zalo 等 channel 的消息 envelope/前缀,降低平台包装信息对记忆写入与召回质量的干扰。' + - '**召回展示更准确**:召回结果时间展示优先使用更新时间,提升时间语义一致性。' + - '**Recall Filter 更稳健**:默认参数与运行时回退值(超时、重试)保持对齐,提升本地模型场景稳定性。' + - '**超时与资源管理优化**:修复定时器清理问题,避免异常路径下的资源泄漏。' + - '**配置能力补全**:插件 schema 补齐 Recall Filter 相关字段,配置更完整、可控性更强。' + - '**可观测性增强**:增加过滤前后数量日志,便于排查召回质量与过滤效果。' + - date: '2026-03-13' + plugins: + - title: '云插件' + version: 'v0.1.9' + summary: '无感升级与记忆召回优化。本次更新主要包含以下改进,旨在提升插件的易用性与 Token 利用率:' + sections: + - title: '插件无感自检测升级' + items: + - '新增插件版本自检测机制,后台定期检查 NPM 仓库最新版本。' + - '检测到新版本后自动触发静默升级流程,用户无需手动操作即可持续获取最新能力与修复。' + - title: '支持用户配置模型进行 Memory Recall' + items: + - '引入基于 LLM 的记忆二次筛选能力。' + - '新增 recallFilterModel、recallFilterBaseUrl 等配置项,可指定独立模型进行相关性评审。' + - '可有效剔除干扰项,仅保留对当前对话真正有用的记忆片段。' + - title: '对话注入瘦身(System Prompt 优化)' + items: + - '重构记忆注入逻辑,将静态协议与指令移动到 appendSystemContext。' + - 'prependContext 仅保留动态检索得到的 memory-list 数据。' + - '显著降低重复提示词带来的 Token 消耗,并提升模型对核心记忆的聚焦。' + - date: '2026-03-09' + plugins: + - title: '云插件' + version: 'v0.1.8' + summary: '支持用户开启多Agent模式,实现从上下文中识别agent进行记忆隔离,同时做了开关,兼容旧版本。' + + - date: '2026-03-05' + plugins: + - title: '云插件' + version: 'v0.1.7' + summary: '支持用户自定义searchMemory接口的relativity字段。' + + - date: '2026-02-26' + plugins: + - title: '云插件' + version: '其他历史版本(基础功能)' + summary: '支持 before_agent_start 事件中 searchMemory、在 agent_end 事件中进行 addMessage。' +--- +:: diff --git a/docs/cn/openclaw/examples/hermes_usage.md b/docs/cn/openclaw/examples/hermes_usage.md new file mode 100644 index 00000000..606549c2 --- /dev/null +++ b/docs/cn/openclaw/examples/hermes_usage.md @@ -0,0 +1,112 @@ +--- +title: 本地插件使用 +desc: MemOS 本地插件在 OpenClaw 与 Hermes 中的基础使用、工具调用、团队共享和多 Agent 场景示例。 +--- + +## 基本使用 + +`@memtensor/memos-local-plugin` 同时支持 OpenClaw 与 Hermes。安装完成后,按你使用的 Agent 正常启动即可;插件会在每轮任务开始前注入本地记忆上下文,并在任务结束后写入 Trace、Policy、World Model 和 Skill。 + +| Agent | 启动方式 | Viewer | +| --- | --- | --- | +| OpenClaw | 正常启动或重启 OpenClaw gateway | `http://127.0.0.1:18799` | +| Hermes | `hermes chat` | `http://127.0.0.1:18800` | + +### 验证记忆功能 + +1. 与 OpenClaw 或 Hermes Agent 进行任意对话。 +2. 打开对应 Memory Viewer,确认对话内容已出现在 **Memories** / **Tasks** 页面。 +3. 新开一个对话,让 Agent 回忆之前的内容: + +```text +你:你还记得我之前让你帮我处理过什么事情吗? 
+Agent:(调用 memory_search)是的,我们之前讨论过…… +``` + +--- + +## 记忆工具 + +本地插件会通过各自 Agent 宿主暴露记忆工具。不同宿主展示名称可能略有差异,但核心能力一致。 + +| 工具 | 说明 | +| --- | --- | +| `memory_search` | 从 Skill、Trace/Episode、World Model 三层检索相关上下文。 | +| `memory_get` | 获取某条记忆详情。 | +| `memory_timeline` | 查看某个 episode / task 附近的时间线。 | +| `skill_list` | 列出当前可用 Skill。 | +| `skill_get` | 获取某个 Skill 的调用指南。 | +| `memory_environment` | 查询 L3 World Model,了解项目结构、环境规律和约束。 | + +### 调用示例 + +```text +Agent 调用: + memory_search("Nginx 部署配置") + → 返回相关 Skill、Trace 片段和环境认知 + +Agent 调用: + skill_get("nginx-proxy") + → 返回可执行步骤、适用条件和注意事项 +``` + +插件也会记录工具调用成功 / 失败结果,用于后续 decision repair。 + +--- + +## 团队共享 + +默认情况下,OpenClaw 与 Hermes 各自使用独立本地数据库。需要协作时,可以在 Memory Viewer 中启用 Team Sharing,把本地结晶出的 Skill 和可选 Trace 摘要共享给同一局域网 / VPN 内的其他实例。 + +### 配置方式 + +打开对应 Agent 的 Memory Viewer,进入 **Settings → Team Sharing**,按面板提示填写团队地址和 token,保存后插件会自动重启并加载设置。 + +### 预期效果 + +- 私有本地数据默认留在当前 Agent 的运行目录中。 +- 明确共享的 Skill 可被其他实例检索和复用。 +- Hub 不在算法关键路径上;共享失败时,本地记忆写入、召回和 Skill 检索仍可继续。 + +--- + +## 多 Agent 场景 + +同一台机器上同时安装 OpenClaw 和 Hermes 时,它们的端口和数据完全隔离: + +| 资源 | OpenClaw | Hermes | +| --- | --- | --- | +| Viewer | `18799` | `18800` | +| 数据目录 | `~/.openclaw/memos-plugin/` | `~/.hermes/memos-plugin/` | +| 配置入口 | Viewer → Settings | Viewer → Settings | + +```text +OpenClaw: + memory_search("deploy config") + → 优先使用 OpenClaw 本地经验 + +Hermes: + memory_search("deploy config") + → 优先使用 Hermes 本地经验 + +开启 Hub 后: + 两者可以显式复用团队共享 Skill +``` + +--- + +## Viewer 管理 + +Memory Viewer 提供这些常用入口: + +| 页面 | 用途 | +| --- | --- | +| Overview | 查看核心状态、版本、事件流和健康状态。 | +| Memories | 查看 L1 Trace 和原始执行记录。 | +| Tasks | 查看按任务聚合的对话与执行结果。 | +| Policies | 查看从多个 Trace 归纳出的策略。 | +| World Models | 查看环境认知与约束。 | +| Skills | 查看、检索或停用结晶出的 Skill。 | +| Import | 导入旧版插件数据、OpenClaw 会话 JSONL、Hermes `MEMORY.md`,或导入 / 导出 JSON 备份。 | +| Settings | 配置模型、团队共享、日志和 telemetry。 | +| Help | 查看 `V`、`α`、`R_human`、`η`、support、gain 等字段含义。 | diff --git a/docs/cn/openclaw/examples/multi_agent.md b/docs/cn/openclaw/examples/multi_agent.md new file mode 100644 index 00000000..9f4cabf4 --- /dev/null +++ b/docs/cn/openclaw/examples/multi_agent.md @@ -0,0 +1,97 @@ +--- +title: 多智能体记忆隔离 +--- +## 云插件 + +MemOS Openclaw 云插件支持多个 Agent 之间完全隔离记忆和和消息历史。每个 Agent 都只能看到自己的记忆,不会串台。 + +### 如何使用 + +只需简单配置,即可让不同 Agent 拥有独立的记忆空间。支持自动识别和静态指定两种模式。 + +#### 1. 开启多 Agent 模式 + +在 `openclaw.json` 配置中添加: + +```json +{ + "plugins": { + "entries": { + "memos-cloud-openclaw-plugin": { + "config": { + "multiAgentMode": true + } + } + } + } +} +``` + +或设置环境变量: + +```bash +MEMOS_MULTI_AGENT_MODE=true +``` + +#### 2. 自动识别 Agent + +开启后,插件会自动读取 `ctx.agentId` ,不同 Agent 的记忆自动隔离。无需额外配置。 + +#### 3. 
静态指定 Agent(可选) + +如果需要固定某个 Agent ID,可以在配置中指定: + +```json +{ + "config": { + "agentId": "marketing_agent" + } +} +``` + +### 原理介绍 + +- **/search/memory**:检索记忆——只返回当前 Agent 的记忆 +- **/add/message**:添加记录——自动标记为当前 Agent 的数据 +- **向下兼容**:默认 Agent `"main"` 会被忽略,保证老用户的单 Agent 数据不受影响 + +### 适用场景 + +- **多角色协作**:战略/业务/营销/技术 Agent 分工协作 +- **业务线独立**:不同业务线的 Agent 独立运行互不干扰 +- **人设一致性**:保持 Agent 长期人设和行为风格一致 + +--- + +## 本地插件 + +`@memtensor/memos-local-plugin` 同时支持 OpenClaw 与 Hermes。默认情况下,每个 Agent 使用独立运行目录和本地数据库;如果在同一套运行目录内区分多个会话 / Agent,检索会优先限定在当前 Agent 的上下文中。需要跨实例协作时,可以在 Memory Viewer 的 **Settings → Team Sharing** 中开启团队共享。 + +### 规则 + +- **默认隔离**:OpenClaw 使用 `~/.openclaw/memos-plugin/`,Hermes 使用 `~/.hermes/memos-plugin/`,两者不会自动共享数据库。 +- **当前 Agent 优先**:检索时优先使用当前 Agent / session 的 Trace、Policy、World Model 和 Skill。 +- **可选共享**:开启 `hub.enabled` 后,可在局域网 / VPN 内共享本地结晶的 Skill 和可选 Trace 摘要。 +- **失败降级**:Hub 不在算法关键路径上;共享服务不可用时,本地插件自动退回本机记忆模式。 + +### 操作示例 + +```text +OpenClaw: + memory_search("deploy config") + → 优先检索 OpenClaw 本地库中的 Skill / Trace / World Model + +Hermes: + memory_search("deploy config") + → 优先检索 Hermes 本地库中的 Skill / Trace / World Model + +开启 Hub 后: + OpenClaw / Hermes 可以拉取团队共享 Skill + 私有 Trace 默认仍留在各自机器和运行目录中 +``` + +### 预期结果 + +- OpenClaw 与 Hermes 默认互不读取对方的本地数据库 +- 同一团队内可显式共享高价值 Skill,减少重复踩坑 +- 即使 Hub 不可用,本地记忆写入、召回和技能检索仍然可用 diff --git a/docs/cn/openclaw/examples/recall_filter.md b/docs/cn/openclaw/examples/recall_filter.md new file mode 100644 index 00000000..04d047db --- /dev/null +++ b/docs/cn/openclaw/examples/recall_filter.md @@ -0,0 +1,107 @@ +--- +title: 记忆召回的二次过滤 +--- +## 云插件 + +MemOS Openclaw 云插件支持使用指定的大语言模型对召回的记忆进行二次精准过滤。过滤后,只有与当前任务高度相关的记忆才会被注入到上下文中,有效避免无关记忆的干扰并节省 Token。 + +### 如何使用 + +只需配置兼容 OpenAI 格式的模型接口(如本地 Ollama 或第三方大模型 API)并开启过滤开关,即可启用记忆二次过滤功能。 + +#### 1. 开启记忆过滤功能 + +在配置大模型过滤记忆时,**必须**配置 API Key 和 Base URL。 + +在 `openclaw.json` 配置中添加: +```json +{ + "plugins": { + "entries": { + "memos-cloud-openclaw-plugin": { + "config": { + "recallFilterEnabled": true, + "recallFilterBaseUrl": "http://127.0.0.1:11434/v1", + "recallFilterApiKey": "sk-...", + "recallFilterModel": "qwen2.5_7b" + } + } + } + } +} +``` + +或设置环境变量: +```bash +MEMOS_RECALL_FILTER_ENABLED=true +MEMOS_RECALL_FILTER_BASE_URL="http://127.0.0.1:11434/v1" +MEMOS_RECALL_FILTER_API_KEY="sk-..." +MEMOS_RECALL_FILTER_MODEL="qwen2.5_7b" +``` + +#### 2. 配置鉴权与进阶参数(可选) + +如果需要调整超时时间及失败策略,可以在配置中指定: +```json +{ + "config": { + "recallFilterTimeoutMs": 6000, + "recallFilterFailOpen": true + } +} +``` + +### 原理介绍 +- **召回后拦截**:在每轮对话前从云端召回记忆后,插件会把候选的记忆条目发送给你配置的过滤模型做二次筛选。 +- **精准保留**:过滤模型判断后,只保留被标记为 `keep` 的相关条目,最终注入到 Agent 的上下文中。 +- **高可用回退**:默认开启了失败放行(`recallFilterFailOpen: true`)。当过滤模型请求超时或失败时,会自动回退为“不过滤”全量注入,保证当前对话不被中断。 + +### 适用场景 +- **超长记忆精简**:长期对话积累大量记忆时,剔除与当前 Prompt 无关的内容,大幅降低主模型上下文的 Token 消耗。 +- **提升推理精度**:为需要专注处理复杂任务的 Agent 过滤掉早期无关的记忆干扰,提高核心任务的推理准确度。 +- **本地模型协同**:搭配本地运行的小模型(如 Ollama 运行的 `qwen2.5_7b`)作为低成本前置过滤器,在不增加主模型 API 费用的前提下提升记忆注入质量。 + +--- + +## 本地插件 + +`@memtensor/memos-local-plugin` 的本地检索内置多阶段过滤。它会先从 Skill、Trace/Episode、World Model 三层召回候选,再通过 RRF + MMR 做融合与去冗余;如果配置了可用 LLM,还可以在注入前做相关性复核,进一步筛掉表面关键词相似但对当前任务帮助不大的内容。 + +### 如何配置 + +直接在对应 Agent 的 Memory Viewer 里配置: + +| Agent | Memory Viewer | +| --- | --- | +| OpenClaw | `http://127.0.0.1:18799` | +| Hermes | `http://127.0.0.1:18800` | + +配置步骤: + +1. 打开 Memory Viewer。 +2. 进入 **Settings → AI Models**。 +3. 在 **LLM** 区域选择 provider,并填写 endpoint、API Key、model 等信息。 +4. 点击 **Test** 确认模型可用。 +5. 
保存设置;Viewer 会自动重启插件并加载新配置。 + +保存后,本地检索会在召回、RRF/MMR 排序之后使用该 LLM 做相关性复核。未配置 LLM 时,插件仍会使用内置的多通道召回和机械阈值过滤。 + +### 本地召回流程 + +```text +用户问题 +→ 构建检索 query 与标签 +→ Tier 1: Skill 候选 +→ Tier 2: Trace / Episode 候选 +→ Tier 3: World Model 候选 +→ 向量 / FTS5 / pattern / 错误特征多通道召回 +→ RRF 融合 + MMR 多样性控制 +→ 可选 LLM 相关性复核 +→ 注入给 Agent +``` + +### 预期结果 + +- 注入上下文的记忆更聚焦,噪音更少 +- Skill、Trace/Episode、World Model 不会只靠单一向量相似度命中 +- LLM 不可用时会使用更严格的机械阈值回退,不影响基础召回 diff --git a/docs/cn/openclaw/guide.md b/docs/cn/openclaw/guide.md new file mode 100644 index 00000000..424a8d27 --- /dev/null +++ b/docs/cn/openclaw/guide.md @@ -0,0 +1,343 @@ +--- +title: OpenClaw 云插件 +desc: 增强 OpenClaw 的记忆能力并减少 72% 的 Token 消耗:MemOS OpenClaw 插件现已上线! +--- + +OpenClaw 近期备受关注,但在实际使用中,用户普遍会遇到两个难以回避的问题: + +1. **Token 消耗过快**:OpenClaw 能处理许多长尾任务,但每次运行都会消耗大量 Token。当你让它监控屏幕、运行定时任务或处理复杂工作流时,Token 消耗更是惊人。 + + > ("你知道 Token 就是金钱🫠") + +2. **记忆功能薄弱**:虽然很多人声称 OpenClaw 的记忆能力超越 ChatGPT,但实践中你会发现它虽然能记住一些信息,却往往不是你需要的关键信息。重要偏好可能被遗忘,而无关紧要的闲聊却被记得一清二楚。 + + > ("能不能请你记住一些对我真正重要的事情???") + +::tip +**这不是 OpenClaw 的错,所有 AI Agent 都面临这些挑战。** +:: + +本教程将指导你通过 MemOS OpenClaw 插件解决这 3 个核心痛点: +- **显著降低 Token 消耗** — 智能检索相关记忆,而非无差别加载全部历史 +- **让记忆真正有用** — 专业级记忆分类与管理,记住该记的,遗忘该忘的 +- **保留 OpenClaw 的核心优势** — 跨设备控制、主动交互、类人体验保持不变 + +--- + +## 为什么 OpenClaw 成了"Token 杀手"🥷? + +### OpenClaw 的问题 + +```plaintext +第 1 次对话: 500 tokens +第 2 次对话: 500 + 800 = 1,300 tokens +第 3 次对话: 1,300 + 600 = 1,900 tokens +第 10 次对话: 10,000+ tokens +``` + +当你让 OpenClaw 监控屏幕、执行定时任务并按计划运行时,这个数字增长得更快。 + +### OpenClaw 原生记忆管理的三个关键缺陷 + +OpenClaw 的记忆存储在本地 `.md` 文件中,分为全局记忆和每日记忆。虽然听起来不错,但实际使用中存在三个不可避免的问题: + +#### 1. 全局记忆膨胀失控 +随着全局记忆不断累积,上下文超载随之而来。更糟糕的是,这些记忆会持续干扰当前对话——你可能只想问一个简单的问题,它却把三个月前的每一句话都翻出来。 + +#### 2. 每日记忆检索困难 +每日记忆不断累积,使检索变得繁琐。要回忆昨天的活动,你必须经历额外的检索过程。维护跨会话记忆几乎变得不可能。 + +#### 3. 记忆依赖模型的主动记录 +OpenClaw 的记忆系统依赖模型自身记录信息,而非自动记录。这意味着它经常遗漏细节——你提到某件事,它马上就忘了。 + +> 我自己就遇到过好几次:我明确强调了某个项目配置,但第二天重启对话时,它完全没有印象,需要我重新解释一遍。 + +--- + +## OpenClaw vs OpenClaw + MemOS:记忆方案对比 + +### OpenClaw 原生记忆方案 + +#### 记忆存储方案 + +**核心哲学:文件即真理** — 摒弃不透明的向量数据库,选择 Markdown 文件作为记忆的核心载体。 + +![OpenClaw记忆方案](https://cdn.memtensor.com.cn/img/1772698365666_utw5a2_compressed.png) + +#### 记忆检索方案:双引擎驱动 + +| 引擎 | 技术 | 特点 | +|-----|------|------| +| **向量搜索** (Vector Search) | 余弦相似度 | 捕捉语义关联,擅长处理"概念匹配",如将"登录流程"关联至"身份验证" | +| **BM25 搜索** (Lexical Matching) | 基于 FTS5 的词法匹配 | 处理"精确 Token",如错误代码、函数名或特定 ID | + +**检索触发方式**:通过 Prompt 触发,模型自动决策 + +**加权分数融合**:`Score = (0.7 * VectorScore) + (0.3 * BM25Score)` + +#### 现有方案痛点 + +- **检索算法简陋**:召回不稳定、相关性弱,Agent 反复试错,Token 快速累积 +- **上下文注入过量**:固定读取 today + yesterday + 长期记忆,无效上下文占比高 +- **记忆缺少结构与去冗余**:工具调用长输出直接写入并反复重传,成本滚雪球 + +### OpenClaw + MemOS 的记忆方案 + +![MemOS-OpenClaw](https://cdn.memtensor.com.cn/img/1772627912577_gvwyaz_compressed.png) + +#### 三大核心效果 + +**效果一:Token 成本可控 💰** +> 从"全量灌上下文"变成"按任务精确召回" + +OpenClaw 不再每次固定塞入 today+yesterday+长期记忆,而是由 MemOS 按当前任务检索最相关的少量记忆(可设定召回预算/条数),显著降低无效上下文占比,避免 Token 滚雪球。 + +**效果二:检索更稳更准 🎯** +> 减少反复试错与重问,提升一次命中率 + +MemOS 提供更强的记忆组织与检索能力(结构化、分层/多粒度、语义检索 + 规则过滤等),让 OpenClaw 召回的内容相关性更强、稳定性更高,减少 Agent 因"召回不稳"导致的重复推理与反复确认。 + +**效果三:记忆更干净可用 ✨** +> 结构化 + 去冗余 + 高压缩,避免"长输出污染" + +工具调用的长输出(如遍历结果、config/schema 等)不会直接原样反复写入上下文;MemOS 可以做摘要/压缩、去重与归档,长期运行越用越"清爽",记忆质量随时间提升而不是劣化。 + +--- + +## 集成 MemOS OpenClaw 插件后的效果👇🏻 + +- ✅ 每次仅检索 3-5 条相关记忆 +- ✅ 在 2,000-3,000 tokens 内保持上下文稳定性 +- ✅ 无论对话多长,成本始终保持可控 + +### MemOS 插件能为 OpenClaw 带来的增强 + +| 功能 | 说明 | +|-----|------| +| **自动记忆所有对话** | 不依赖模型主动记录,确保关键信息不被遗漏 | +| **精准召回** | 基于当前任务意图检索相关记忆,避免无关历史数据 | +| **记住用户偏好** | 专门分类和存储偏好信息,跨会话保持有效 | + +MemOS OpenClaw 重构了 Token 
消耗模型,将成本从"历史长度函数"转变为"任务相关性函数"。你的本地 OpenClaw 成本变得可控,系统运行更加稳定。 + +--- + +## 快速开始 + +只需 3 步,即可让你的 Agent 具备基础记忆能力。 + +### 1. 安装 OpenClaw + +确保你的系统中已安装 OpenClaw 环境: + +```bash +# 安装最新版 +npm install -g openclaw@latest + +# 初始化并配置启动 +openclaw onboard +``` + +### 2. 获取并配置 API Key + +#### 2.1 获取 Key + +登陆/注册 MemOS Cloud 获取你的 API Key 🔗 [MemOS Cloud](https://memos-dashboard.openmem.net/cn/apikeys/) + +![image.png](https://cdn.memtensor.com.cn/img/1772443326905_kkxve6_compressed.webp) + +#### 2.2 设置环境变量 + +插件会按顺序尝试读取 env 文件(**openclaw → moltbot → clawdbot**)。对于每个键,优先使用首个包含该值的文件。 +如果这些文件都不存在(或缺少对应键),则会回退到进程环境变量。 + +**配置位置** +- 文件(优先级顺序): + - `~/.openclaw/.env` + - `~/.moltbot/.env` + - `~/.clawdbot/.env` +- 每行格式为 `KEY=value` + +**快速配置(Shell)** +```bash +echo 'export MEMOS_API_KEY="mpg-..."' >> ~/.zshrc +source ~/.zshrc + +# or + +echo 'export MEMOS_API_KEY="mpg-..."' >> ~/.bashrc +source ~/.bashrc +``` + +**快速配置(Windows PowerShell)** +```powershell +[System.Environment]::SetEnvironmentVariable("MEMOS_API_KEY", "mpg-...", "User") +``` + +如果缺少 `MEMOS_API_KEY`,插件会提示配置说明和 API Key 获取链接。 + +**最小配置** +```env +MEMOS_API_KEY=YOUR_TOKEN +``` + +### 3. 安装插件 + +#### 方案 A — NPM(推荐) + +```bash +openclaw plugins install @memtensor/memos-cloud-openclaw-plugin@latest +openclaw gateway restart +``` + +> Windows 用户注意:如果遇到 `Error: spawn EINVAL`,这是 OpenClaw 插件安装器在 Windows 上的已知问题。请使用下面的方案 B(手动安装)。 + +请确认在 `~/.openclaw/openclaw.json` 中已启用: + +```json +{ + "plugins": { + "entries": { + "memos-cloud-openclaw-plugin": { "enabled": true } + } + } +} +``` + +#### 方案 B — 手动安装(Windows 兼容方案) + +1. 从 [NPM](https://www.npmjs.com/package/@memtensor/memos-cloud-openclaw-plugin) 下载最新的 `.tgz` 包。 +2. 解压到本地目录(例如:`C:\Users\YourName\.openclaw\extensions\memos-cloud-openclaw-plugin`)。 +3. 配置 `~/.openclaw/openclaw.json`(或 `%USERPROFILE%\.openclaw\openclaw.json`): + +```json +{ + "plugins": { + "entries": { + "memos-cloud-openclaw-plugin": { "enabled": true } + }, + "load": { + "paths": [ + "C:\\Users\\YourName\\.openclaw\\extensions\\memos-cloud-openclaw-plugin" + ] + } + } +} +``` + +::info +注意:解压后的目录通常包含一个 `package` 子目录。请将路径指向包含 `package.json` 的文件夹。 +:: + +配置修改后请重启 gateway。 + +### 4. 更新插件 + +你可以通过以下命令手动更新云服务插件到最新版本: + +```bash +openclaw plugins update @memtensor/memos-cloud-openclaw-plugin@latest +openclaw gateway restart +``` + +## 开源项目进阶配置 + +如果希望进一步解锁更多可能性,还可以通过 MemOS Github 项目进行进一步探索和配置! + +### 可视化配置界面 (Config UI) + +自 `v0.1.12` 版本起,云插件内置了本地可视化配置服务,让您可以更直观地管理和修改插件配置。 + +**如何访问:** +1. 启动 OpenClaw 节点或宿主网关。 +2. 插件成功加载并检测到网关就绪后,会自动在后台启动 Config UI 服务。 +3. 在终端控制台日志中会打印访问链接(默认地址通常为 `http://127.0.0.1:38463`)。 +4. 在浏览器中打开该链接,即可进入插件的可视化管理后台。 + +**功能特点:** +- **直观编辑**:支持以表单形式编辑所有核心配置(如知识库 ID、大模型检索参数、多 Agent 覆盖规则等)。 +- **实时同步**:在界面上保存的配置变更会立即在插件运行时生效,无需重启服务。 +- **状态监控**:界面提供与宿主网关的心跳检测,确保配置同步链路健康。 + +### 多Agent支持与隔离(Multi-Agent) + +插件内置对多 Agent 模式的强大支持(通过 `agent_id` 参数实现),非常适合在复杂工作流或团队代理场景下使用。 + +**1. 开启与数据隔离** +- **开启方式**:在配置中设置 `"multiAgentMode": true` 或配置环境变量 `MEMOS_MULTI_AGENT_MODE=true`。 +- **自动隔离**:开启后,插件会自动读取上下文中的 `ctx.agentId`。在进行记忆检索和写入时,会自动附带该 Agent 标识,从而保证同一用户下的不同 Agent 之间记忆数据完全隔离(注:默认的 `"main"` Agent 会被忽略以保证旧数据兼容性)。 + +**2. 按 Agent 开关记忆(白名单控制)** +在多 Agent 模式下,如果不想让所有 Agent 都产生记忆消耗,你可以使用 `allowedAgents` 精确控制白名单: +```json +{ + "plugins": { + "entries": { + "memos-cloud-openclaw-plugin": { + "enabled": true, + "config": { + "multiAgentMode": true, + "allowedAgents": ["research-agent", "coding-agent"] + } + } + } + } +} +``` +*(提示:1. 如果 `allowedAgents` 未配置或为空数组 `[]`,则表示**所有 Agent** 都允许使用记忆检索和写入。2. 
如果进行了配置,那么不在配置中的 Agent 将被完全跳过,只有配置中的 Agent 才会生效进行记忆检索和写入,从而避免 Token 浪费)。* + +**3. 按 Agent 独立配置参数(agentOverrides)** +除了简单的开关,你还可以通过 `agentOverrides` 为**每个 Agent 单独覆写记忆参数**。例如,让研究助手拥有更宽松的检索阈值,而让代码助手只读取特定的代码库知识: + +```json +{ + "plugins": { + "entries": { + "memos-cloud-openclaw-plugin": { + "enabled": true, + "config": { + "multiAgentMode": true, + "allowedAgents": ["research-agent", "coding-agent"], + "memoryLimitNumber": 6, + "relativity": 0.45, + + "agentOverrides": { + "research-agent": { + "knowledgebaseIds": ["kb-research-papers"], + "memoryLimitNumber": 12, + "relativity": 0.3, + "queryPrefix": "research context: " + }, + "coding-agent": { + "knowledgebaseIds": ["kb-codebase"], + "memoryLimitNumber": 9, + "addEnabled": false + } + } + } + } + } + } +} +``` +*(在上面的例子中,`coding-agent` 被禁止了记忆写入,且只能检索 `kb-codebase` 知识库中的前 9 条高相关性记忆)。* + +### 环境变量深度定制 + +除了必需的 API Key,你还可以通过环境变量调整插件行为。 + +更多细节配置项可以见 [MemTensor GitHub 官方插件仓库](https://github.com/MemTensor/MemOS/tree/main/apps/MemOS-Cloud-OpenClaw-Plugin) + +## 测试记忆功能 + +现在,可以与你的 Agent 进行多轮对话,例如: + +**第一次会话:** +- "我最喜欢的编程语言是 Python" +- "我正在开发一个电商项目" + +**第二次会话(新启动):** +- "你还记得我喜欢用什么编程语言吗?" +- "我之前说的项目进展如何?" + +现在,你的 OpenClaw 会从 MemOS Cloud 中检索记忆并给出准确回答啦~ diff --git a/docs/cn/openclaw/hermes_local_plugin.md b/docs/cn/openclaw/hermes_local_plugin.md new file mode 100644 index 00000000..606cf48e --- /dev/null +++ b/docs/cn/openclaw/hermes_local_plugin.md @@ -0,0 +1,8 @@ +--- +title: Hermes 本地插件已合并 +desc: Hermes 本地插件已合并到统一的 MemOS 本地插件安装指南。 +--- + +Hermes 本地插件已合并到统一的 **本地插件** 文档中。新版 `@memtensor/memos-local-plugin` 使用同一套本地优先记忆核心,同时支持 OpenClaw 与 Hermes Agent。 + +请查看 [本地插件安装指南](/cn/openclaw/local_plugin)。 diff --git a/docs/cn/openclaw/local_plugin.md b/docs/cn/openclaw/local_plugin.md new file mode 100644 index 00000000..c0b53588 --- /dev/null +++ b/docs/cn/openclaw/local_plugin.md @@ -0,0 +1,127 @@ +--- +title: 本地插件 +desc: 使用 @memtensor/memos-local-plugin 为 OpenClaw 与 Hermes Agent 提供本地优先的长期记忆、三层检索、技能结晶和可观测管理面板。 +--- + +`@memtensor/memos-local-plugin` 是 MemOS 新一代本地插件:一套本地优先的记忆核心,同时适配 **OpenClaw** 与 **Hermes Agent**。它不会把记忆数据托管到云端,而是在你的机器上维护 SQLite 数据库、技能包和日志,让 Agent 在本地持续积累可复用经验。 + +如果你只想为 OpenClaw 快速接入云端托管记忆,请查看 [OpenClaw 云插件](/cn/openclaw/guide)。如果你更看重隐私、本机运行、可观测性,或希望 OpenClaw / Hermes 都使用同一套本地记忆能力,请使用本页面的本地插件。 + +## 核心能力 + +| 能力 | 说明 | +| --- | --- | +| 本地优先 | OpenClaw 与 Hermes 各自拥有独立运行目录,SQLite、Skill、日志和配置都保留在本机。 | +| 双 Agent 适配 | OpenClaw 通过 TypeScript 插件进程内接入;Hermes 通过 Python Provider + JSON-RPC 桥接到同一套 Node.js 记忆核心。 | +| 四层记忆 | L1 Trace 记录每一步执行,L2 Policy 归纳跨任务策略,L3 World Model 压缩环境认知,Skill 将高价值经验结晶为可调用能力。 | +| 三层检索 | 按 Skill → Trace/Episode → World Model 检索,并融合向量、FTS5、关键词 pattern 与错误特征,使用 RRF + MMR 控制相关性和多样性。 | +| 反馈驱动进化 | 工具结果、环境反馈、用户显式反馈会更新记忆价值,推动策略归纳、技能结晶和 decision repair。 | +| 本地 Viewer | 提供 Overview、Memories、Tasks、Policies、World Models、Skills、Analytics、Logs、Import、Settings、Help 等页面。 | +| 导入与迁移 | 支持 JSON 导入导出、旧版插件数据迁移,以及按当前 Agent 导入 OpenClaw 会话 JSONL 或 Hermes `MEMORY.md`。 | +| 可选团队共享 | 默认完全隔离;如需协作,可在 Memory Viewer 的 Team Sharing 面板中开启局域网 / VPN 内 Skill 和可选 Trace 摘要共享。 | + +## 工作原理 + +插件在每轮任务开始前检索相关上下文,并把结果注入给 Agent;任务结束后,它会把对话、工具调用、观察结果和反馈写入本地流水线。高价值模式会从原始 Trace 逐步沉淀为 Policy、World Model 和可调用 Skill。下次遇到相似任务时,Agent 可以直接得到“该怎么做”和“哪些坑要避开”的上下文。 + +| 阶段 | 发生了什么 | 产物 | +| --- | --- | --- | +| 1. Agent 适配 | OpenClaw / Hermes 通过各自 Adapter 把会话、工具调用和反馈交给统一的 `MemoryCore`。 | 标准化的 turn、tool outcome、feedback | +| 2. 本地写入 | `MemoryCore` 把执行过程拆成可追溯的步骤记录。 | L1 Trace | +| 3. 经验归纳 | 多个相似 Trace 会归纳为跨任务策略,并进一步压缩为环境认知。 | L2 Policy、L3 World Model | +| 4. 
技能结晶 | 高价值策略会生成可调用 Skill,并根据后续反馈更新可靠性。 | Skill、η、生命周期状态 | +| 5. 检索注入 | 下一轮任务开始前,Retriever 从 Skill、Trace/Episode、World Model 三层召回上下文。 | 注入给 Agent 的本地记忆上下文 | + +## 快速开始 + +### Step 1:一行命令安装或升级 + +安装与升级使用同一条命令。当前安装脚本面向 macOS / Linux: + +```bash +curl -fsSL https://raw.githubusercontent.com/MemTensor/MemOS/main/apps/memos-local-plugin/install.sh | bash +``` + +安装器会自动检测系统中是否已安装 OpenClaw / Hermes。交互式终端会询问安装到哪个 Agent;非交互环境会自动安装到检测到的 Agent。安装器会部署插件代码、安装生产依赖,并在需要时重启对应运行时。 + +> 不建议直接 `npm install` 这个包。安装脚本会处理 Agent 检测、目录布局、配置初始化和运行时重启。 + +### Step 2:打开 Memory Viewer + +安装完成后,打开对应的 Memory Viewer: + +| Agent | Memory Viewer | +| --- | --- | +| OpenClaw | `http://127.0.0.1:18799` | +| Hermes | `http://127.0.0.1:18800` | + +如果你同时安装了 OpenClaw 和 Hermes,它们会使用各自独立的 Viewer 和本地数据目录。 + +### Step 3:在面板里完成配置 + +所有用户可见配置都从 Memory Viewer 修改: + +- **Settings → AI Models**:配置 Embedding、LLM、Skill Evolver,并用 Test 按钮确认可用。 +- **Settings → Team Sharing**:开启或关闭团队共享,配置团队地址与 token。 +- **Settings → General**:配置语言、日志详细程度、匿名 telemetry 等。 + +保存后,Viewer 会自动重启插件并加载新设置。 + +### Step 4:启动对应 Agent + +安装完成后,按你选择的 Agent 正常启动即可。插件会在 Agent 构建 prompt 前检索本地上下文,并在本轮任务结束后把对话、工具调用、观察结果和反馈写入本地记忆。 + +| Agent | 启动方式 | 插件接入方式 | +| --- | --- | --- | +| OpenClaw | 正常启动或重启 OpenClaw gateway | TypeScript 插件在 OpenClaw 进程内调用 `MemoryCore` | +| Hermes | 运行 `hermes chat` | Python Provider 通过 JSON-RPC 调用 Node.js 记忆核心 | + +如果 Hermes 所在机器无法运行 Node.js,Hermes Provider 会报告不可用,并回退到 Hermes 自身的内存模式。 + +### Step 5:验证记忆功能 + +回到 Memory Viewer,建议检查以下页面: + +1. **Overview**:确认核心状态、版本、事件流正常。 +2. **Memories**:确认对话和工具步骤被写入为 Trace。 +3. **Tasks / Policies / World Models / Skills**:查看经验如何逐步归纳和结晶。 +4. **Import**:导入旧版数据、OpenClaw 会话 JSONL、Hermes `MEMORY.md`,或导入 / 导出 JSON 备份。 +5. **Help**:查看每个字段含义,例如 `V`、`α`、`R_human`、`η`、support、gain 等。 + +## Agent 差异 + +| 项目 | OpenClaw | Hermes | +| --- | --- | --- | +| 接入方式 | TypeScript 插件,进程内调用 `MemoryCore` | Python `MemoryProvider`,通过 stdio JSON-RPC 调用 Node bridge | +| 默认 Viewer | `http://127.0.0.1:18799` | `http://127.0.0.1:18800` | +| 模型配置 | 在 OpenClaw Viewer 的 Settings → AI Models 中配置 | 在 Hermes Viewer 的 Settings → AI Models 中配置 | +| 数据共享 | 默认与 Hermes 隔离 | 默认与 OpenClaw 隔离 | + +两个 Agent 即使安装在同一台机器上,也会使用各自的数据库和 Viewer。只有显式开启 `hub:` 后,才会进行团队共享。 + +## 可用工具 + +OpenClaw 与 Hermes 会通过各自宿主暴露记忆工具,常见能力包括: + +| 工具 | 用途 | +| --- | --- | +| `memory_search` | 按查询检索相关 Skill、Trace/Episode、World Model。 | +| `memory_get` | 获取某条记忆详情。 | +| `memory_timeline` | 查看某个 episode / task 的时间线。 | +| `skill_list` | 列出可调用 Skill。 | +| `skill_get` | 获取某个 Skill 的调用指南。 | +| `memory_environment` | 查询 L3 World Model,了解项目结构、环境规律和约束。 | + +插件也会记录工具调用成功 / 失败结果,用于后续 decision repair。 + +## 数据管理 + +- **备份**:在 Viewer 的 Import 页面导出 JSON,或备份当前 Agent 的 `~/./memos-plugin/` 目录。 +- **仅清空记忆**:在确认已备份后删除运行目录下的 `data/` 和 `skills/`。 +- **清空日志**:删除 `logs/` 下普通日志。`audit.log` 会按月 gzip 保留。 +- **彻底重置**:删除整个 `~/./memos-plugin/`,下次启动会重新创建空目录。 + +## 更多资料 + +- [MemOS 本地插件项目](https://github.com/MemTensor/MemOS/tree/main/apps/memos-local-plugin) +- [云插件 vs 本地插件](/cn/openclaw/plugin_compare) diff --git a/docs/cn/openclaw/plugin_compare.md b/docs/cn/openclaw/plugin_compare.md new file mode 100644 index 00000000..b1e86974 --- /dev/null +++ b/docs/cn/openclaw/plugin_compare.md @@ -0,0 +1,81 @@ +--- +title: 云插件 vs 本地插件 +desc: 云插件面向快速接入 MemOS Cloud,本地插件面向 OpenClaw 与 Hermes 的本机长期记忆和自进化能力。本文将帮你快速理解两者差异,选择最适合自己的方案。 +--- + +## 插件简介 + +### 云插件 + +将记忆托管于 **MemOS Cloud**。安装 OpenClaw 云插件后,只需配置一个 MemOS Cloud API Key 即可使用,支持多 Agent 跨设备共享记忆,经基准测试可降低约 **72% 的 Token 消耗**,适合快速上手、跨设备协作和生产环境接入。 + +### 本地插件 + +新版本地插件为 
`@memtensor/memos-local-plugin`,是一套 **OpenClaw 与 Hermes 共用的本地优先记忆核心**。它把数据写入本机 SQLite,并沉淀为 L1 Trace、L2 Policy、L3 World Model 与可调用 Skill 四层记忆;同时通过反馈驱动自进化、三层检索和决策修复,让 Agent 在本机逐步积累可复用经验。适合对隐私、安全、本地化运行或可观测性有更高要求的开发者。 + +--- + +## 核心区别 + +| 对比维度 | ☁️ MemOS 云插件 | 🖥️ MemOS 本地插件 | +| --- | --- | --- | +| 💾 **数据存储与隐私** | **云端存储**:记忆数据存储在 MemOS Cloud,便于跨设备、多实例共享。 | **本地存储**:每个 Agent 拥有独立运行目录,OpenClaw 默认在 `~/.openclaw/memos-plugin/`,Hermes 默认在 `~/.hermes/memos-plugin/`。SQLite、Skill 包、日志和配置都保留在本机。 | +| 🤖 **Agent 支持** | 面向 OpenClaw 云插件,依托 MemOS Cloud 提供统一记忆服务。 | 同一套核心支持 OpenClaw 与 Hermes:OpenClaw 通过 TypeScript 插件进程内集成,Hermes 通过 Python Provider + JSON-RPC 桥接到 Node 核心。 | +| 🔑 **API 与模型配置** | 使用 MemOS Cloud API Key,由云端承担记忆处理、检索和演进。 | 通过 Memory Viewer 的 Settings 面板配置模型与团队共享。Embedding 默认可使用本地 provider,也可配置 OpenAI-compatible、Gemini、Cohere、Voyage、Mistral;OpenClaw 可继承宿主模型,Hermes 可在面板中配置 LLM provider 与 API Key。 | +| 🔍 **检索能力** | 云端语义向量检索 + 图检索,由服务端统一优化。 | 三层检索:Tier 1 Skill、Tier 2 Trace/Episode、Tier 3 World Model;同时融合向量、FTS5、关键词 pattern 与错误特征等通道,并通过 RRF + MMR 控制相关性和多样性。 | +| 🧠 **记忆进化** | 由云端服务自动完成:对写入记忆进行结构化处理、去冗余与自然语言纠错。 | Reflect2Evolve 本地流水线:对话与工具调用沉淀为 L1 Trace,跨任务归纳为 L2 Policy,再抽象为 L3 World Model;高价值策略会结晶为可调用 Skill,并根据反馈进入 active / retired 等生命周期。 | +| 🛠️ **决策修复** | 主要依赖云端召回更相关的历史记忆,降低重复上下文和无效 Token。 | 工具失败、负反馈和任务结果会进入反馈通道;失败模式可触发 decision repair,在下一轮注入纠偏上下文,帮助 Agent 避免重复踩坑。 | +| 👥 **多 Agent 与共享** | 支持多 Agent 场景和跨设备共享,适合团队协作。 | 默认按 Agent 隔离:OpenClaw 与 Hermes 拥有各自数据库和 Viewer。可选开启 Hub,在局域网 / VPN 内共享本地结晶的 Skill 和可选 Trace 摘要;共享不在算法关键路径上,失败会自动退化为本地模式。 | +| 👀 **可视化与可观测性** | 通过 MemOS Cloud Dashboard 管理 API Key 和云端记忆能力。 | 内置本地 Viewer:Overview、Memories、Tasks、Policies、World Models、Skills、Analytics、Logs、Import、Settings、Help 等页面;HTTP + SSE 实时展示事件、日志、检索、Skill 和健康状态。 | +| 🛠️ **部署与配置** | **极简**:三步完成(安装插件、获取 API Key、配置环境变量),主要依赖云服务。 | **极简**:安装与升级都是一行命令。安装器会自动检测系统中已安装的 OpenClaw / Hermes,安装 `@memtensor/memos-local-plugin`、创建运行目录并重启对应运行时。 | + +--- + +## 安装速览 + +### 云插件(3 步完成) + +1. **安装插件** + ```bash + openclaw plugins install @memtensor/memos-cloud-openclaw-plugin@latest + ``` + +2. **获取并配置 API Key** + + 获取 API Key:[MemOS Cloud Dashboard](https://memos-dashboard.openmem.net/cn/apikeys/) + + ```bash + mkdir -p ~/.openclaw && echo "MEMOS_API_KEY=mpg-..." > ~/.openclaw/.env + ``` + +3. 
**重启 gateway** + + ```bash + openclaw gateway restart + ``` + +**手动更新插件**: +```bash +openclaw plugins update @memtensor/memos-cloud-openclaw-plugin@latest +openclaw gateway restart +``` + +> 更多信息请参考 [Openclaw 云插件文档](/cn/openclaw/guide#快速开始) + +### 本地插件(一行命令) + +```bash +# 安装插件 +curl -fsSL https://raw.githubusercontent.com/MemTensor/MemOS/main/apps/memos-local-plugin/install.sh | bash +``` + +安装与升级使用同一条命令。安装器会自动检测本机是否安装 OpenClaw / Hermes。交互式终端会询问安装到哪个 Agent,非交互环境会自动安装到检测到的 Agent。 + +| Agent | 代码目录 | 数据与配置目录 | Viewer | +| --- | --- | --- | --- | +| OpenClaw | `~/.openclaw/plugins/memos-local-plugin/` | `~/.openclaw/memos-plugin/` | `http://127.0.0.1:18799` | +| Hermes | `~/.hermes/plugins/memos-local-plugin/` | `~/.hermes/memos-plugin/` | `http://127.0.0.1:18800` | + +> 升级或卸载插件代码不会删除已有本地数据、技能包或日志。OpenClaw 与 Hermes 各自运行独立 Viewer,没有共享端口或只读 peer 视图。 +> +> 模型、团队共享和常规选项都在对应 Agent 的 Memory Viewer 里配置:OpenClaw 默认 `http://127.0.0.1:18799`,Hermes 默认 `http://127.0.0.1:18800`。 diff --git a/docs/en/open_source/best_practice/common_errors_solutions.md b/docs/en/open_source/best_practice/common_errors_solutions.md new file mode 100644 index 00000000..f5ff2aa3 --- /dev/null +++ b/docs/en/open_source/best_practice/common_errors_solutions.md @@ -0,0 +1,75 @@ +--- +title: Common Errors and Solutions +--- + +## Configuration Errors + +### Missing Required Fields + +```python +# ✅ Always include required fields +llm_config = { + "backend": "openai", + "config": { + "api_key": "your-api-key", + "model_name_or_path": "gpt-4" + } +} +``` + +### Backend Mismatch + +```python +# ✅ KVCache requires HuggingFace backend +kv_config = { + "backend": "kv_cache", + "config": { + "extractor_llm": { + "backend": "huggingface", + "config": { + "model_name_or_path": "Qwen/Qwen3-1.7B" + } + } + } +} +``` + +## Service Connection Issues + +```bash +# Start required services as needed +docker run -p 6333:6333 qdrant/qdrant +ollama serve +``` + + +### Memory Loading Failures + +```python +try: + mem_cube.load("memory_dir") +except Exception: + mem_cube = GeneralMemCube(config) + mem_cube.dump("memory_dir") +``` + +### GPU Out Of Memory + +```python +import os +os.environ["CUDA_VISIBLE_DEVICES"] = "0" +# Use smaller models if GPU memory is limited: Qwen/Qwen3-0.6B +``` + +## User Management + +```python +# Register user first +mos.register_mem_cube(cube_path="path", user_id="user_id", cube_id="cube_id") + +# Check if user exists +try: + user_id = mos.create_user(user_name="john", role=UserRole.USER) +except ValueError: + user = mos.user_manager.get_user_by_name("john") +``` diff --git a/docs/en/open_source/best_practice/mcp_for_cozespace_and_tools.md b/docs/en/open_source/best_practice/mcp_for_cozespace_and_tools.md new file mode 100644 index 00000000..1253a0fc --- /dev/null +++ b/docs/en/open_source/best_practice/mcp_for_cozespace_and_tools.md @@ -0,0 +1,420 @@ +--- +title: MemOS MCP Integration Guide +description: Configure MemOS MCP service on platforms like Coze to seamlessly integrate agents with the memory system +--- + +This guide helps you configure MemOS MCP service in platforms like Coze Space, enabling seamless integration between your agent and the memory system. + +## Choose an MCP Deployment Method + +MemOS provides two MCP deployment options. Choose based on your needs: + +### Use MemOS Cloud Service (Recommended) + +If you want to connect quickly without deploying your own server, MemOS official cloud service is recommended. 
+ +**Advantages:** +- ✅ Out of the box, no deployment required +- ✅ High availability guarantees +- ✅ Automatic scaling and maintenance +- ✅ Supports multiple clients (Claude, Cursor, Cline, etc.) + +**How to configure:** + +Visit [MemOS Cloud MCP Configuration Guide](https://memos-docs.openmem.net/cn/mcp_agent/mcp/guide) for detailed instructions. + +Main steps: +1. Register and get an API Key in [MemOS API Console](https://memos-dashboard.openmem.net/cn/apikeys/) +2. Configure `@memtensor/memos-api-mcp` service in your MCP client +3. Set environment variables (`MEMOS_API_KEY`, `MEMOS_USER_ID`, `MEMOS_CHANNEL`) + +### Deploy MCP Service Yourself + +If you need a private deployment or custom requirements, you can deploy MCP service on your own server. + +**Advantages:** +- ✅ Fully private data +- ✅ Configurable and customizable +- ✅ Full control of the service +- ✅ Suitable for internal enterprise use + +**Prerequisites:** +- Python 3.9+ +- Neo4j database (or another supported graph database) +- HTTPS domain (required by platforms like Coze) + +Continue reading for detailed deployment steps. + +--- + +## Self-Hosted MCP Service Configuration + +The content below applies to users who deploy MCP service themselves. + +## Architecture + +Self-hosted MCP service uses the following architecture: + +``` +Client (Coze/Claude, etc.) + ↓ [HTTPS] +MCP Server (port 8002) + ↓ [HTTP calls] +Server API (port 8001) + ↓ +MemOS Core Service +``` + +**Component overview:** +- **Server API**: provides REST APIs (`/product/*`) to handle memory CRUD +- **MCP Server**: exposes the MCP protocol over HTTP and calls Server API to complete operations +- **HTTPS reverse proxy**: platforms like Coze require HTTPS secure connections + +::steps{level="3"} + +### Step 1: Start Server API + +Server API is the backend for MCP service and provides actual memory management capabilities. + +```bash +cd /path/to/MemOS +python src/memos/api/server_api.py --port 8001 +``` + +Verify whether Server API is running: + +```bash +curl http://localhost:8001/docs +``` + +If it returns the API documentation page, startup succeeded. + +::note +**Configuration file**
+Server API loads configuration automatically. Ensure Neo4j and other dependencies are configured correctly. You can refer to `examples/data/config/tree_config_shared_database.json` as an example configuration. +:: + +### Step 2: Start MCP HTTP Service + +Start MCP service in another terminal: + +```bash +cd /path/to/MemOS +python examples/mem_mcp/simple_fastmcp_serve.py --transport http --port 8002 +``` + +After MCP service starts, it will show information similar to: + +``` +╭──────────────────────────────────────────────────╮ +│ MemOS MCP via Server API │ +│ Transport: HTTP │ +│ Server URL: http://localhost:8002/mcp │ +╰──────────────────────────────────────────────────╯ +``` + +**Environment variable configuration (optional):** + +You can configure the Server API address via a `.env` file or environment variables: + +```bash +export MEMOS_API_BASE_URL="http://localhost:8001/product" +``` + +::note +**Tool list**
+MCP service provides the following tools: +- `add_memory`: add memory +- `search_memories`: search memories +- `chat`: chat with the memory system + +For the full tool list, see `examples/mem_mcp/simple_fastmcp_serve.py` +:: + +### Step 3: Configure an HTTPS Reverse Proxy + +Platforms like Coze require HTTPS. You need to set up an HTTPS reverse proxy (e.g., Nginx) to forward traffic to MCP service. + +**Nginx configuration example:** + +```nginx +server { + listen 443 ssl http2; + server_name your-domain.com; + + ssl_certificate /path/to/cert.pem; + ssl_certificate_key /path/to/key.pem; + + location /mcp { + proxy_pass http://localhost:8002/mcp; + proxy_http_version 1.1; + proxy_set_header Upgrade $http_upgrade; + proxy_set_header Connection "upgrade"; + proxy_set_header Host $host; + proxy_set_header X-Real-IP $remote_addr; + proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; + proxy_set_header X-Forwarded-Proto $scheme; + + # SSE support + proxy_buffering off; + proxy_cache off; + } +} +``` + +::warning +**HTTPS certificate**
+Make sure you use a valid SSL certificate. Self-signed certificates may not be accepted by platforms like Coze. You can use Let's Encrypt to obtain a free certificate. +:: + +### Step 4: Test MCP Service + +Use the client test script to verify the service: + +```bash +cd /path/to/MemOS +python examples/mem_mcp/simple_fastmcp_client.py +``` + +Example success output: + +``` +Working FastMCP Client +======================================== +Connected to MCP server + + 1. Adding memory... + Result: Memory added successfully + + 2. Searching memories... + Result: [search result] + + 3. Chatting... + Result: [AI response] + +✓ All tests completed! +``` + +:: + +## Configure MCP in Coze Space + +After the service is deployed, configure the MCP connection in Coze Space. + +::steps{level="3"} + +### Step 1: Open Coze Space and go to the tool configuration page + +![Coze Space configuration page](https://statics.memtensor.com.cn/memos/coze_space_1.png) + +### Step 2: Add a custom MCP tool + +Add a custom tool on the tool configuration page: + +![Add a custom tool](https://statics.memtensor.com.cn/memos/coze_space_2.png) + +### Step 3: Configure the MCP endpoint URL + +Configure the MCP endpoint URL with your HTTPS address: + +``` +https://your-domain.com/mcp +``` + +Available MCP tools: +- **add_memory**: add a new memory +- **search_memories**: search existing memories +- **chat**: memory-based chat + +::note +**Test connection**
After configuration, test whether the MCP connection works in Coze. Ensure each tool can be called successfully.
::

::

---

## Use REST API Directly (Advanced)

For scenarios that require more flexible integration, you can call Server API's REST endpoints directly.

::steps{level="3"}

### Step 1: Start Server API

```bash
cd /path/to/MemOS
python src/memos/api/server_api.py --port 8001
```

**Port notes**
- Server API runs on port 8001 by default
- Provides `/product/*` REST API endpoints

### Step 2: Configure custom tools in Coze IDE

1. In Coze, choose the "IDE plugin" creation method
2. Configure requests to your deployed Server API service

![Coze IDE plugin configuration](https://statics.memtensor.com.cn/memos/coze_tools_1.png)

### Step 3: Implement the add_memory tool

**Code example:** configure and publish the `add_memory` operation in the IDE:

![Configure add_memory operation](https://statics.memtensor.com.cn/memos/coze_tools_2.png)

Full code is as follows:

```python
import json
import requests
from runtime import Args
from typings.add_memory.add_memory import Input, Output

def handler(args: Args[Input]) -> Output:
    memory_content = args.input.memory_content
    user_id = args.input.user_id
    cube_id = args.input.cube_id

    # Call Server API add endpoint
    url = "https://your-domain.com:8001/product/add"
    payload = json.dumps({
        "user_id": user_id,
        "messages": memory_content,  # Supports string or message array
        "writable_cube_ids": [cube_id] if cube_id else None
    })
    headers = {
        'Content-Type': 'application/json'
    }

    response = requests.post(url, headers=headers, data=payload, timeout=30)
    response.raise_for_status()

    return response.json()
```

**Other tool implementations:**

Similarly, implement the search and chat tools:

```python
# Search tool
def search_handler(args: Args[Input]) -> Output:
    url = "https://your-domain.com:8001/product/search"
    payload = json.dumps({
        "user_id": args.input.user_id,
        "query": args.input.query,
    })
    headers = {
        'Content-Type': 'application/json'
    }

    response = requests.post(url, headers=headers, data=payload, timeout=30)
    response.raise_for_status()

    return response.json()

# Chat tool
def chat_handler(args: Args[Input]) -> Output:
    url = "https://your-domain.com:8001/product/chat/complete"
    payload = {
        "user_id": args.input.user_id,
        "query": args.input.query
    }
    # Pass the dict via `json=` so requests serializes it exactly once
    response = requests.post(url, json=payload, timeout=30)
    response.raise_for_status()
    return response.json()
```

### Step 4: Publish and test tools

After publishing, you can view the plugin under "My Resources":

![Published plugin resource](https://statics.memtensor.com.cn/memos/coze_tools_3.png)

### Step 5: Integrate into agent workflow

Add the plugin into the agent workflow:

1. Create a new agent or edit an existing agent
2. Add the published MemOS plugin to the tool list
3. Configure the workflow to call memory tools
4. Test memory write and retrieval functions

::
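
Before wiring these tools into an agent, it can help to sanity-check the Server API endpoint the search tool calls. A minimal sketch — the host, `user_id`, and query values are placeholders to adapt to your deployment:

```bash
# Verify the search endpoint used by the Coze search tool (placeholder values)
curl -X POST "http://localhost:8001/product/search" \
  -H "Content-Type: application/json" \
  -d '{"user_id": "demo_user", "query": "test query"}'
```

If this returns a JSON response rather than an error, the Coze tools above should be able to reach the same endpoints.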
---

## FAQ

### Q1: MCP service cannot connect to Server API

**Solution:**
- Check whether Server API is running: `curl http://localhost:8001/docs`
- Check whether environment variable `MEMOS_API_BASE_URL` is configured correctly
- Check MCP service logs and confirm the call address

### Q2: Coze cannot connect to MCP service

**Solution:**
- Make sure you use HTTPS
- Check whether the SSL certificate is valid
- Test reverse proxy configuration: `curl https://your-domain.com/mcp`
- Check firewall and security group settings

### Q3: Neo4j connection failed

**Solution:**
- Ensure Neo4j service is running
- Check connection info in the configuration file (uri, user, password)
- Refer to `examples/data/config/tree_config_shared_database.json` as an example configuration

### Q4: How to see complete API examples?

**Reference files:**
- MCP server: `examples/mem_mcp/simple_fastmcp_serve.py`
- MCP client: `examples/mem_mcp/simple_fastmcp_client.py`
- API tests: `examples/api/server_router_api.py`

---

## Summary

With this guide, you can:
- ✅ Choose a suitable MCP deployment option (cloud or self-hosted)
- ✅ Complete the full MCP service deployment process
- ✅ Integrate MemOS memory features into platforms like Coze
- ✅ Integrate directly via REST API

No matter which option you choose, MemOS can provide your agent with powerful memory management capabilities.

::note
**API parameter notes**
- Use the standard Server API parameter format
- `messages`: replaces the previous `memory_content`, supports string or message array
- `writable_cube_ids`: replaces the previous `mem_cube_id`, supports multiple cubes
- Server API runs on port 8001, and the path is `/product/add`
- Ensure it matches the MemOS Server API interface. You can refer to the example in `examples/api/server_router_api.py`

**IDE configuration**
In the IDE, you can customize tool parameters, return value formats, etc., ensuring consistency with MemOS API. Use this method to implement the search endpoint and user registration endpoint, then click Publish. +:: + +### Publish and Use the Plugin + +After publishing, you can view the plugin under "My Resources" and integrate it into the agent workflow as a plugin: + +![Published plugin resource](https://statics.memtensor.com.cn/memos/coze_tools_3.png) + +### Build an Agent and Test + +After building the simplest agent, you can test memory operations: + +1. Create a new agent +2. Add the published memory plugin +3. Configure the workflow +4. Test memory write and retrieval functions + +With the above configuration, you can successfully integrate MemOS memory features in Coze Space and provide powerful memory capabilities for your agent. diff --git a/docs/en/open_source/best_practice/memory_structure_design.md b/docs/en/open_source/best_practice/memory_structure_design.md new file mode 100644 index 00000000..a790b18f --- /dev/null +++ b/docs/en/open_source/best_practice/memory_structure_design.md @@ -0,0 +1,136 @@ +--- +title: Memory Structure Design Best Practices +--- + +## Memory Type Selection + +### TreeTextMemory + +**Best for**: Knowledge management, research assistants, hierarchical data +```python +tree_config = { + "backend": "tree_text", + "config": { + "extractor_llm": { + "backend": "ollama", + "config": { + "model_name_or_path": "qwen3:0.6b" + } + }, + "graph_db": { + "backend": "neo4j", + "config": { + "host": "localhost", + "port": 7687 + } + } + } +} +``` + +### PreferenceTextMemory + +**Best for**:Personalized conversation, intelligent recommendation, customer service + +```python +preference_config = { + "backend": "preference_text", + "config": { + "extractor_llm": { + "backend": "ollama", + "config": { + "model_name_or_path": "qwen3:0.6b", + } + }, + "vector_db": { + "backend": "milvus", + "config": { + "collection_name": [ + "explicit_preference", + "implicit_preference" + ], + "vector_dimension": 768, + "distance_metric": "cosine", + "uri": "./milvus_demo.db" + } + }, + "embedder": { + "backend": "ollama", + "config": { + "model_name_or_path": "nomic-embed-text:latest" + } + }, + "reranker": { + "backend": "cosine_local", + "config": { + "level_weights": { + "topic": 1.0, + "concept": 1.0, + "fact": 1.0 + }, + "level_field": "background" + } + } + } +} +``` + +### GeneralTextMemory + +**Best for**: Conversational AI, personal assistants, FAQ systems +```python +general_config = { + "backend": "general_text", + "config": { + "extractor_llm": { + "backend": "ollama", + "config": { + "model_name_or_path": "qwen3:0.6b" + } + }, + "vector_db": { + "backend": "qdrant", + "config": { + "collection_name": "general" + } + }, + "embedder": { + "backend": "ollama", + "config": { + "model_name_or_path": "nomic-embed-text" + } + } + } +} +``` + +### NaiveTextMemory + +**Best for**: Simple applications, prototyping +```python +naive_config = { + "backend": "naive_text", + "config": { + "extractor_llm": { + "backend": "ollama", + "config": { + "model_name_or_path": "qwen3:0.6b" + } + } + } +} +``` + +## Capacity Planning + +If you enable the scheduler, you can set memory capacities to control resource usage: + +```python +scheduler_config = { + "memory_capacities": { + "working_memory_capacity": 20, # Active conversation + "user_memory_capacity": 500, # User knowledge + "long_term_memory_capacity": 2000 # Domain knowledge + } +} +``` diff --git 
a/docs/en/open_source/best_practice/network_workarounds.md b/docs/en/open_source/best_practice/network_workarounds.md new file mode 100644 index 00000000..ffeba60a --- /dev/null +++ b/docs/en/open_source/best_practice/network_workarounds.md @@ -0,0 +1,109 @@ +--- +title: Network Workarounds +desc: Here are some solutions to address the network issues you may encounter during developing. +--- + +## **Downloading Huggingface Models** + +### Mirror Site (HF-Mirror) + +To download Huggingface models using the mirror site, you can follow these steps: + +::steps{level="4"} + +#### Install Dependencies + +Install the necessary dependencies by running: + +```bash +pip install -U huggingface_hub +``` + +#### Set Environment Variable + +Set the environment variable `HF_ENDPOINT` to `https://hf-mirror.com`. + +#### Download Models or Datasets + +Use huggingface-cli to download models or datasets. For example: + +- To download a model: + + ```bash + huggingface-cli download --resume-download gpt2 --local-dir gpt2 + ``` +- To download a dataset: + ``` + huggingface-cli download --repo-type dataset --resume-download wikitext --local-dir wikitext + ``` + +:: + +For more detailed instructions and additional methods, please refer to [this link](https://hf-mirror.com/). + +### Alternative Sources +You may still encounter limitations accessing some models in your regions. In such cases, you can use modelscope: + +::steps{level="4"} + +#### Install ModelScope + +Install the necessary dependencies by running: + +```bash +pip install modelscope[framework] +``` + +#### Download Models or Datasets + +Use modelscope to download models or datasets. For example: + +- To download a model: + ```bash + modelscope download --model 'Qwen/Qwen2-7b' --local_dir 'path/to/dir' + ``` +- To download a dataset: + + ```bash + modelscope download --dataset 'Tongyi-DataEngine/SA1B-Dense-Caption' --local_dir './local_dir' + ``` + +:: + +For more detailed instructions and additional methods, please refer to the [official docs](https://modelscope.cn/docs/home). + +## **Using Poetry** + +### Network Errors during Installing +To address network errors when using "poetry install" in your regions, you can follow these steps: + +::steps{level="4"} + +#### Update Configuration + +Update the `pyproject.toml` file to use a mirror source by adding the following configuration: + +```toml +[[tool.poetry.source]] +name = "mirrors" +url = "https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple/" +priority = "primary" +``` + +#### Reconfigure Poetry + +Run the command `poetry lock` in the terminal to reconfigure Poetry with the new mirror source. + +:: + +**Tips:** +Be aware that `poetry lock` will modify both Pyproject.toml and poetry.lock files. To avoid committing redundant changes: + - Option 1: After successful `poetry install`, revert to the git HEAD node using `git reset --hard HEAD`. + - Option 2: When executing `git add`, exclude the Pyproject.toml and poetry.lock files by specifying other files. + +For future dependency management tasks like adding or removing packages, you can use the `poetry add` command: +```bash +poetry add +``` + +Refer to the [Poetry CLI documentation](https://python-poetry.org/docs/cli/) for more commands and details. 
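
If you prefer not to edit `pyproject.toml` by hand, recent Poetry versions (1.5+) can register the same mirror from the CLI — a sketch of the equivalent commands, assuming the Tsinghua mirror shown above:

```bash
# Equivalent to the [[tool.poetry.source]] block above (Poetry 1.5+)
poetry source add --priority=primary mirrors https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple/
poetry lock
```

Both approaches write the same source entry, so the tips above about avoiding redundant `pyproject.toml` and `poetry.lock` changes still apply.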
diff --git a/docs/en/open_source/best_practice/performance_tuning.md b/docs/en/open_source/best_practice/performance_tuning.md new file mode 100644 index 00000000..66b22d52 --- /dev/null +++ b/docs/en/open_source/best_practice/performance_tuning.md @@ -0,0 +1,55 @@ +--- +title: Performance Tuning +--- + +## Embedding Optimization + +```python +fast_embedder = { + "backend": "ollama", + "config": { + "model_name_or_path": "nomic-embed-text:latest" + } +} + +slow_embedder = { + "backend": "sentence_transformer", + "config": { + "model_name_or_path": "nomic-ai/nomic-embed-text-v1.5" + } +} +``` + +## Inference Speed + +```python +generation_config = { + "max_new_tokens": 256, # Limit response length + "temperature": 0.7, + "do_sample": True +} +``` + +## System Resource Optimization + +### Memory Capacity Limits + +```python +scheduler_config = { + "memory_capacities": { + "working_memory_capacity": 20, # Active context + "user_memory_capacity": 500, # User storage + "long_term_memory_capacity": 2000, # Domain knowledge + "transformed_act_memory_capacity": 50 # KV cache items + } +} +``` + +### Batch Processing + +```python +def batch_memory_operations(operations, batch_size=10): + for i in range(0, len(operations), batch_size): + batch = operations[i:i + batch_size] + yield batch # Process in batches +``` diff --git a/docs/en/open_source/contribution/commit_guidelines.md b/docs/en/open_source/contribution/commit_guidelines.md new file mode 100644 index 00000000..425ee506 --- /dev/null +++ b/docs/en/open_source/contribution/commit_guidelines.md @@ -0,0 +1,17 @@ +--- +title: Commit Guidelines +--- + +Please follow the [Conventional Commits](https://www.conventionalcommits.org/) format: + +- `feat:` for new features +- `fix:` for bug fixes +- `docs:` for documentation updates +- `style:` for formatting changes (no code logic change) +- `refactor:` for code refactoring +- `test:` for adding or updating tests +- `chore:` for other maintenance tasks +- `ci:` for CI/CD or workflow related changes + +**Example:** +`feat: add user authentication` diff --git a/docs/en/open_source/contribution/development_workflow.md b/docs/en/open_source/contribution/development_workflow.md new file mode 100644 index 00000000..f6c1596a --- /dev/null +++ b/docs/en/open_source/contribution/development_workflow.md @@ -0,0 +1,74 @@ +--- +title: Development Workflow +--- + +Follow these steps to contribute to the project. + +::steps{level="4"} + +#### Sync with Upstream + +If you've previously forked the repository, sync with the upstream changes: + +```bash +git checkout dev # switch to dev branch +git fetch upstream # fetch latest changes from upstream +git pull upstream dev # merge changes into your local dev branch +git push origin dev # push changes to your fork +``` + +#### Create a Feature Branch + +Create a new branch for your feature or fix: + +```bash +git checkout -b feat/descriptive-name +``` + +#### Make Your Changes + +Implement your feature, fix, or improvement in the appropriate files. + +- For example, you might add a function in `src/memos/hello_world.py` and create corresponding tests in `tests/test_hello_world.py`. 
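
For that hypothetical `hello_world` example, a matching test might look like the sketch below — the function name and return value are illustrative assumptions, not existing project code:

```python
# tests/test_hello_world.py — a minimal sketch for the hypothetical example above
from memos.hello_world import hello_world  # assumes you added this function


def test_hello_world():
    # The expected return value is an illustrative assumption
    assert hello_world() == "Hello, MemOS!"
```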
#### Test Your Changes

Run the test suite to ensure your changes work correctly:

```bash
make test
```

#### Commit Your Changes

Before committing or creating a PR, rebase onto the latest upstream/dev:

```bash
git fetch upstream
git rebase upstream/dev # Replay your feature branch on top of the latest dev
```

Follow the project's commit guidelines (see [Commit Guidelines](./commit_guidelines.md)) when committing your changes.

#### Push to Your Fork

Push your feature branch to your forked repository:

```bash
git push origin feat/descriptive-name
```

#### Create a Pull Request

Submit your changes for review:

- **Important:** Please create your pull request against
  - ✅ the `dev` branch of the upstream repository,
  - ❎ not the `main` branch of the upstream repository.
- Go to the original repository on GitHub
- Click on "Pull Requests"
- Click on "New Pull Request"
- Select `dev` as the base branch and your branch as the compare branch
- Fill in the PR description carefully.

::
diff --git a/docs/en/open_source/contribution/overview.md b/docs/en/open_source/contribution/overview.md new file mode 100644 index 00000000..0b56ee06 --- /dev/null +++ b/docs/en/open_source/contribution/overview.md @@ -0,0 +1,14 @@
---
title: Contributing to MemOS
desc: Welcome to the MemOS contribution guide! Learn how to set up your development environment, follow our workflow, write good commit messages, improve documentation, and add tests.
---

- **First-time contributors:** Please start by reading the [Setting Up](./setting_up.md) guide to prepare your development environment.
- **Ready to code?** The [Development Workflow](./development_workflow.md) guide will walk you through our process for submitting changes.
- **Writing good commit messages:** See our [Commit Guidelines](./commit_guidelines.md).
- **Contributing to documentation:** If you're helping us improve our docs, check out the [Writing Documentation](./writing_docs.md) guide.
- **Adding or improving tests:** The [Writing Tests](./writing_tests.md) guide is for you.

Your contributions make this project better! ✨ If you have any questions, feel free to open an issue, join the discussion, or scan the QR codes below to connect with us on Discord or WeChat.

QR Code
diff --git a/docs/en/open_source/contribution/setting_up.md b/docs/en/open_source/contribution/setting_up.md new file mode 100644 index 00000000..66f3fc73 --- /dev/null +++ b/docs/en/open_source/contribution/setting_up.md @@ -0,0 +1,212 @@
---
title: Setting Up Your Development Environment
desc: To contribute to MemOS, you'll need to set up your local development environment.
---

::steps{level="4"}

#### Fork & Clone the Repository

Set up the repository on your local machine:

- Fork the repository on GitHub
- Clone your fork to your local machine:

  ```bash
  git clone https://github.com/YOUR-USERNAME/MemOS.git
  cd MemOS
  ```

- Add the upstream repository as a remote:

  ```bash
  git remote add upstream https://github.com/MemTensor/MemOS.git
  ```

#### Prepare Development Dependencies

Ensure the following are installed locally:

- Git
- Python 3.9+
- Make

Verify Python:

```bash
python3 --version
```

#### Install Poetry

MemOS uses Poetry for dependency management.
We recommend using the official installer:

```bash
curl -sSL https://install.python-poetry.org | python3 -
```

Verify the installation:

```bash
poetry --version
```

If you see `poetry: command not found`, add the Poetry executable directory to your PATH as prompted by the installer, then restart your terminal and verify again.

For more installation options, see the [official installation guide](https://python-poetry.org/docs/#installing-with-the-official-installer).

#### Install Dependencies and Set Up Pre-commit Hooks

Install all project dependencies and development tools in the repository root:

```bash
make install
```

Tip:

- If you switch branches or dependencies change, you may need to **re-run `make install`** to keep the environment consistent.

### Understanding Memory Modules and Dependency Selection

Before setting up the environment, you need to understand how MemOS classifies its memory modules and which database each one depends on. This determines which components you need to install.

#### Memory Types

The MemOS memory system is mainly divided into two categories (identifiers for the `backend` config are in parentheses):

- **Textual Memory**: Fact-based memory; **you must choose one**.
  - `tree` (`tree_text`): Tree memory (recommended), the most structured option.
  - `general` (`general_text`): General memory, based on vector retrieval.
  - `naive` (`naive_text`): Naive memory, no special dependencies (for testing only).
- **Preference Memory**: User preferences; **optional**.
  - `pref`: Used for storing and retrieving user preferences.

#### Database Dependency Matrix

Different memory types require different database support:

| Memory Type | Component Dependency | Note |
| :--- | :--- | :--- |
| **Tree** | **Graph Database** | Required. Supports Neo4j Desktop, Neo4j Community, PolarDB |
| **General** | **Vector Database** | Required. Qdrant (or a compatible vector DB) is recommended |
| **Naive** | None | No database installation required |
| **Pref** | **Milvus** | If Preference Memory is enabled, Milvus must be installed |

#### About Tree Memory and Graph Database Selection

If you choose the most powerful `tree` memory (which is what most developers choose), you need to prepare a graph database. There are currently three options:

- **Neo4j Desktop** (recommended for PC): Installs directly on your PC, comes with a full GUI and feature set, and is the easiest option.
- **PolarDB**: Graph database service provided by Alibaba Cloud (paid).
- **Neo4j Community**: Open source and free, suitable for server or Linux environments.

**Special Note**:

- If you use **Neo4j Desktop**, it can usually handle the graph data on its own.
- **Neo4j Community** has **no native vector retrieval capabilities**, so you need to pair it with an additional vector database (Qdrant) for vector retrieval.

#### Configuration Scheme for This Tutorial

To help developers get started quickly, this tutorial uses the following configuration:

- **Memory Type**: `tree` (`tree_text`)
- **Graph Database**: **Neo4j Community** (download the installer or use Docker)
- **Vector Database**: **Qdrant (Local Mode)**

Since Neo4j Community lacks vector capabilities, we introduce Qdrant. To avoid running an extra Qdrant service (Docker), we configure Qdrant to run in **Local Embedded Mode** (reading and writing local files directly), so you don't need to install a separate Qdrant server. If no external configuration is provided, the system automatically creates a local database, as the sketch below illustrates.
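For reference, Qdrant's embedded mode is simply the official client pointed at a directory instead of a server. A minimal sketch with the `qdrant-client` package (the path is illustrative; MemOS wires this up from its own configuration):

```python
from qdrant_client import QdrantClient

# Local/embedded mode: data is read and written directly under this
# directory, so no separate Qdrant server or container is required.
client = QdrantClient(path="./qdrant_local_data")  # illustrative path

print(client.get_collections())  # empty collection list on first run
```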
#### Create Configuration File

For quick configuration of the `.env` contents, refer to [env config](/open_source/getting_started/installation#2.-.env-content) under the Docker installation.
For the detailed `.env` configuration, see [env configuration](/open_source/getting_started/rest_api_server/#running-locally).
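At a minimum, the file will look roughly like this (a sketch — the keys are taken from the quick configuration in the installation guide, and the values are placeholders):

```bash
# Minimal sketch — see the linked pages for the full set of keys
OPENAI_API_KEY=sk-xxx
OPENAI_API_BASE=http://xxx:3000/v1
MOS_CHAT_MODEL=qwen3-max
```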
::note
**Note**
The .env configuration file needs to be placed in the MemOS project root directory.
::

```bash
cd MemOS
touch .env
```

#### Configure Dockerfile
::note
**Note**
The Dockerfile is located in the docker directory.
::

```bash
# Enter the docker directory
cd docker
```

Two base images are provided — a slim package and a full package, each available for both ARM and x86:

- **Slim Package**: Strips heavy dependencies such as the NVIDIA stack, keeping the image lightweight for faster local deployment.
  - url: registry.cn-shanghai.aliyuncs.com/memtensor/memos-base:v1.0
  - url: registry.cn-shanghai.aliyuncs.com/memtensor/memos-base-arm:v1.0
- **Full Package**: Bundles all MemOS dependencies into the image for full functionality. Can be built and started directly by configuring the Dockerfile.
  - url: registry.cn-shanghai.aliyuncs.com/memtensor/memos-full-base:v1.0.0
  - url: registry.cn-shanghai.aliyuncs.com/memtensor/memos-full-base-arm:v1.0.0

```dockerfile
# Current example uses the slim package url
FROM registry.cn-shanghai.aliyuncs.com/memtensor/memos-base-arm:v1.0

WORKDIR /app

ENV HF_ENDPOINT=https://hf-mirror.com

ENV PYTHONPATH=/app/src

COPY src/ ./src/

EXPOSE 8000

CMD ["uvicorn", "memos.api.server_api:app", "--host", "0.0.0.0", "--port", "8000", "--reload"]
```

#### Start Docker Client

```bash
# If Docker is not installed, install the appropriate version from:
# https://www.docker.com/

# After installation, start Docker via the client or the command line
# Start Docker via the command line
sudo systemctl start docker

# After installation, check Docker status
docker ps

# Check Docker images (optional)
docker images
```
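If you prefer to build and run the image directly instead of using compose, a standard build-and-run looks like this (a sketch — the image tag is arbitrary, and the build context must be the repository root so that `COPY src/` can find the sources):

```bash
# Run from the repository root, pointing at the Dockerfile in docker/
docker build -f docker/Dockerfile -t memos-dev .
docker run --rm -p 8000:8000 --env-file .env memos-dev
```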
#### Build and Start Service

::note
**Note**
Build commands are also executed in the docker directory.
::

```bash
# In the docker directory
docker compose up neo4j
```

#### Open New Terminal to Start Server

```bash
cd MemOS
make serve
```
::
diff --git a/docs/en/open_source/contribution/writing_docs.md b/docs/en/open_source/contribution/writing_docs.md new file mode 100644 index 00000000..bc5c1aff --- /dev/null +++ b/docs/en/open_source/contribution/writing_docs.md @@ -0,0 +1,537 @@
---
title: Documentation Writing Guidelines
desc: This project uses Nuxt Content to build a documentation system that supports Markdown and rich Vue components.
---

## Creating New Documents

::steps
### Create Markdown File
Create a new `.md` file in the `content/` directory or its subdirectories. Choose an appropriate location based on your content type.

### Add Frontmatter
Add YAML frontmatter at the top of your file to provide metadata. The frontmatter supports the following fields:

::card{title="Frontmatter Fields"}
**Required Fields:**
- `title` (string) - The document title that appears in navigation and page headers

**Optional Fields:**
- `desc` (string) - Brief description of the document content
- `banner` (string) - URL to a banner image displayed at the top of the page
- `links` (array) - Array of related links with labels, URLs, and icons

![Frontmatter Example](https://statics.memtensor.com.cn/memos/frontmatter.png)
::

**Complete Frontmatter Example:**

```yaml
---
title: MemOS Documentation
desc: Welcome to the official documentation for MemOS – a Python package designed to empower large language models (LLMs) with advanced, modular memory capabilities.
banner: https://statics.memtensor.com.cn/memos/memos-banner.gif
links:
  - label: 'PyPI'
    to: https://pypi.org/project/MemoryOS/
    target: _blank
    avatar:
      src: https://statics.memtensor.com.cn/icon/pypi.svg
      alt: PyPI logo
  - label: 'Open Source'
    to: https://github.com/MemTensor/MemOS
    target: _blank
    icon: i-simple-icons-github
---
```

### Write Content
Use Markdown syntax and MDC components to write your documentation content. Take advantage of the available components to create engaging and well-structured content.

### Update Navigation
Add the new document to the `nav` section in `content/settings.yml` to make it accessible through the site navigation.

### Merge to Main Branch
Once changes are merged into the `main` branch, the documentation will be automatically updated and deployed.
::

## Component Examples

This project uses Nuxt Content's MDC (Markdown Components) syntax, which supports using Vue components within Markdown. These components help create engaging, well-structured documentation with consistent styling and improved user experience.

### Image References

When adding images to your documentation, you can use several methods to reference them:

#### Local Assets with Base64Image Component

For images stored in the `public/assets` directory, use the `Base64Image` component. This component provides better performance by embedding the image directly in the page:

```mdc
:Base64Image{src="/assets/memos-architecture.png" alt="MemOS Architecture"}
```

#### Remote Images with Markdown Syntax

For remote images (hosted on external servers), use standard Markdown image syntax:

```markdown
![MemOS Architecture](https://statics.memtensor.com.cn/memos/memos-architecture.png)
```

### Steps

Use `steps` to create step-by-step guides from document headings.
The `steps` component automatically numbers headings, creating a numbered guide for processes and tutorials.

::code-preview
---
class: "[&>div]:*:w-full"
---
  :::steps{level="4"}
  #### Install MemOS

  ```bash
  pip install MemoryOS
  ```

  #### Create a Minimal Config

  For this Quick Start, we'll use the built-in GeneralTextMemory.

  ```python
  from memos.configs.mem_os import MOSConfig

  # init MOSConfig
  mos_config = MOSConfig.from_json_file("examples/data/config/simple_memos_config.json")
  ```

  #### Create a User & Register a MemCube

  ```python
  import uuid
  from memos.mem_os.main import MOS

  mos = MOS(mos_config)

  # Generate a unique user ID
  user_id = str(uuid.uuid4())

  # Create the user
  mos.create_user(user_id=user_id)
  ```
  :::

#code
````mdc
::steps{level="4"}

#### Install MemOS

```bash
pip install MemoryOS
```

#### Create a Minimal Config

For this Quick Start, we'll use the built-in GeneralTextMemory.

```python
from memos.configs.mem_os import MOSConfig

# init MOSConfig
mos_config = MOSConfig.from_json_file("examples/data/config/simple_memos_config.json")
```

#### Create a User & Register a MemCube

```python
import uuid
from memos.mem_os.main import MOS

mos = MOS(mos_config)

# Generate a unique user ID
user_id = str(uuid.uuid4())

# Create the user
mos.create_user(user_id=user_id)
```

::
````
::


### Accordion

Use `accordion` and `accordion-item` to create collapsible content sections. Accordions are useful for organizing FAQs, expandable details, or grouped information in an interactive way.

::code-preview
---
class: "[&>div]:*:my-0"
---
  :::accordion
    ::::accordion-item
    ---
    icon: i-lucide-circle-help
    label: Is MemOS compatible with LLMs accessed via API?
    ---
    Yes. MemOS is designed to be as compatible as possible with various types of models. However, it's important to note that if you're using API-based models, activation and parametric memories cannot be utilized.
    ::::

    ::::accordion-item
    ---
    icon: i-lucide-circle-help
    label: How does MemOS improve the effectiveness of large language model applications?
+ --- + MemOS enhances large language model applications by providing structured, persistent memory with intelligent scheduling, long-term knowledge retention, and KV cache for fast inference. It supports fine-grained access control and user isolation, ensuring memory security in multi-user environments. Its modular architecture allows seamless integration of new memory types, LLMs, and storage backends, making it adaptable to a wide range of intelligent applications. + :::: + + ::::accordion-item{icon="i-lucide-circle-help" label="What is the pricing?"} + MemOS open-source is free. + :::: + ::: + +#code +```mdc +::accordion + +:::accordion-item{label="Is MemOS compatible with LLMs accessed via API?" icon="i-lucide-circle-help"} +Yes. MemOS is designed to be as compatible as possible with various types of models. However, it's important to note that if you're using API-based models, activation and parametric memories cannot be utilized. +::: + +:::accordion-item{label="How does MemOS improve the effectiveness of large language model applications?" icon="i-lucide-circle-help"} +MemOS enhances large language model applications by providing structured, persistent memory with intelligent scheduling, long-term knowledge retention, and KV cache for fast inference. It supports fine-grained access control and user isolation, ensuring memory security in multi-user environments. Its modular architecture allows seamless integration of new memory types, LLMs, and storage backends, making it adaptable to a wide range of intelligent applications. +::: + +:::accordion-item{label="What is the pricing?" icon="i-lucide-circle-help"} +MemOS open-source is free. +::: + +:: +``` +:: + +### Badge + +Use badge to display status indicators or labels. Badges are great for highlighting version numbers, statuses, or categories within your content. + +::code-preview +--- +label: Preview +--- + :::badge + **v1.0.0** + ::: + +#code +```mdc +::badge +**v1.0.0** +:: +``` +:: + + + +### Callout + +Use callout to emphasize important contextual information. Callouts draw attention to notes, tips, warnings, or cautions, making key information stand out. + +Customize with `icon` and `color` props or use `note`, `tip`, `warning`, `caution` shortcuts for pre-defined semantic styles. + +::code-preview +--- +class: "[&>div]:*:my-0 [&>div]:*:w-full" +--- + :::callout + This is a `callout` with full **markdown** support. + ::: + +#code +```mdc +::callout +This is a `callout` with full **markdown** support. +:: +``` +:: + +::code-preview + :::div{.flex.flex-col.gap-4.w-full} + ::::note{.w-full.my-0} + Basic note content + :::: + + ::::note{.w-full.my-0 to="/open_source/getting_started/quick_start"} + Note with link - click to navigate to quick start guide + :::: + + ::::note{.w-full.my-0 to="/open_source/modules/mem_cube" icon="ri:database-line"} + Note with custom icon - learn more about MemCube + :::: + + ::::tip{.w-full.my-0} + Here's a helpful suggestion. + :::: + + ::::warning{.w-full.my-0} + Be careful with this action as it might have unexpected results. + :::: + + ::::caution{.w-full.my-0} + This action cannot be undone. + :::: + ::: + +#code +```mdc +::note +Basic note content +:: + +::note{to="/open_source/getting_started/quick_start"} +Note with link - click to navigate to quick start guide +:: + +::note{to="/open_source/modules/mem_cube" icon="ri:database-line"} +Note with custom icon - learn more about MemCube +:: + +::tip +Here's a helpful suggestion. 
+:: + +::warning +Be careful with this action as it might have unexpected results. +:: + +::caution +This action cannot be undone. +:: +``` +:: + +### Card + +Use `card` to highlight content blocks. Cards are useful for showcasing features, resources, or related information in visually distinct and interactive containers. + +Customize with `title`, `icon`, and `color` props. Cards can also act as links using `` properties for navigation. + +::code-preview +--- +class: "[&>div]:*:my-0 [&>div]:*:w-full" +--- + :::card + --- + icon: i-simple-icons-github + target: _blank + title: Open Source + to: https://github.com/MemTensor/MemOS + --- + Use our open-source version + ::: + +#code +```mdc +::card +--- +title: Open Source +icon: i-simple-icons-github +to: https://github.com/MemTensor/MemOS +target: _blank +--- +Use our open-source version +:: +``` +:: + +### CardGroup + +Use `card-group` to arrange cards in a grid layout. `card-group` is ideal for displaying collections of cards in a structured, visually appealing, and responsive grid. + +::code-preview + :::card-group{.w-full} + ::::card + --- + icon: ri:play-line + title: Minimal Pipeline + to: /open_source/getting_started/examples#example-1-minimal-pipeline + --- + The smallest working pipeline — add, search, update and dump plaintext memories. + :::: + + ::::card + --- + icon: ri:tree-line + title: TreeTextMemory Only + to: /open_source/getting_started/examples#example-2-treetextmemory-only + --- + Use Neo4j-backed hierarchical memory to build structured, multi-hop knowledge graphs. + :::: + + ::::card + --- + icon: ri:database-2-line + title: KVCacheMemory Only + to: /open_source/getting_started/examples#example-3-kvcachememory-only + --- + Speed up sessions with short-term KV cache for fast context injection. + :::: + + ::::card + --- + icon: hugeicons:share-07 + title: Hybrid TreeText + KVCache + to: /open_source/getting_started/examples#example-4-hybrid + --- + Combine explainable graph memory with fast KV caching in a single MemCube. + :::: + ::: + +#code +```mdc +::card-group + +:::card +--- +icon: ri:play-line +title: Minimal Pipeline +to: /open_source/getting_started/examples#example-1-minimal-pipeline +--- +The smallest working pipeline — add, search, update and dump plaintext memories. +::: + +:::card +--- +icon: ri:tree-line +title: TreeTextMemory Only +to: /open_source/getting_started/examples#example-2-treetextmemory-only +--- +Use Neo4j-backed hierarchical memory to build structured, multi-hop knowledge graphs. +::: + +:::card +--- +icon: ri:database-2-line +title: KVCacheMemory Only +to: /open_source/getting_started/examples#example-3-kvcachememory-only +--- +Speed up sessions with short-term KV cache for fast context injection. +::: + +:::card +--- +icon: hugeicons:share-07 +title: Hybrid TreeText + KVCache +to: /open_source/getting_started/examples#example-4-hybrid +--- +Combine explainable graph memory with fast KV caching in a single MemCube. 
+::: + +:: +``` +:: + +## Navigation Icons + +When adding navigation entries in `content/settings.yml`, you can include icons using the syntax `(ri:icon-name)`: + +```yaml +- "(ri:home-line) Home": overview.md +- "(ri:team-line) User Management": modules/mos/users.md +- "(ri:flask-line) Writing Tests": contribution/writing_tests.md +``` + +Available icons can be found at: [https://icones.js.org/](https://icones.js.org/) + +## Local Preview + +To preview the documentation locally, run the following command from the project root: + + +```bash +## Make sure to install the dependencies: +pnpm install +``` + +```bash +pnpm dev +``` + +This command will start a local web server, usually accessible at `http://127.0.0.1:3000`. + +## Learn More + +### Nuxt Content and Typography + +This project uses Nuxt Content and supports rich Typography components and styles. To learn more about available components and customization options, please refer to: + +- [Nuxt UI Typography Documentation](https://ui.nuxt.com/getting-started/typography) + +## Best Practices + +::note +**Documentation Writing Tips** + +1. **Keep document structure clear**: Use appropriate heading levels to organize content logically +2. **Use components wisely**: Use note, card, and other components to improve readability and engagement +3. **Code examples**: Provide clear code examples for technical documentation with proper syntax highlighting +4. **Icon usage**: Use appropriate icons in navigation to enhance user experience and visual hierarchy +:: + +::card{title="Quick Reference"} +Remember to test your documentation locally before submitting. Use `npm run dev` to preview your changes and ensure all components render correctly. +:: diff --git a/docs/en/open_source/contribution/writing_tests.md b/docs/en/open_source/contribution/writing_tests.md new file mode 100644 index 00000000..c32f38d2 --- /dev/null +++ b/docs/en/open_source/contribution/writing_tests.md @@ -0,0 +1,43 @@ +--- +title: How to Write Unit Tests +desc: This project uses [pytest](https://docs.pytest.org/) for unit testing. +--- + +## Writing a Test + +1. Create a new Python file in the `tests/` directory. The filename should start with `test_`. +2. Inside the file, create functions whose names start with `test_`. +3. Use the `assert` statement to check for expected outcomes. + +Here is a basic example: + +```python +# tests/test_example.py + +def test_addition(): + assert 1 + 1 == 2 +``` + +## Running Tests + +To run all the tests, execute the following command from the root of the project: + +```bash +make test +``` + +This will discover and run all the tests in the `tests/` directory. + +## Advanced Techniques + +Pytest has many advanced features, such as fixtures and mocking. + +### Fixtures + +Fixtures are functions that can provide data or set up state for your tests. They are defined using the `@pytest.fixture` decorator. + +### Mocking + +Mocking is used to replace parts of your system with mock objects. This is useful for isolating the code you are testing. The `unittest.mock` library is commonly used for this, often with the `patch` function. + +For an example of mocking, see `tests/test_hello_world.py`. 
diff --git a/docs/en/open_source/getting_started/examples.md b/docs/en/open_source/getting_started/examples.md new file mode 100644 index 00000000..a433633e --- /dev/null +++ b/docs/en/open_source/getting_started/examples.md @@ -0,0 +1,581 @@
---
title: MemOS Examples
desc: "Congratulations - you've mastered the Quick Start and built your first working memory! Now it's time to see how far you can take MemOS by combining different memory types and features. Use these curated examples to inspire your own agents, chatbots, or knowledge systems."
---

::card-group

  :::card
  ---
  icon: ri:play-line
  title: Minimal Pipeline
  to: /open_source/getting_started/examples#example-1-minimal-pipeline
  ---
  The smallest working pipeline — add, search, update and dump plaintext memories.
  :::

  :::card
  ---
  icon: ri:tree-line
  title: Adding and retrieving multiple information sources
  to: /open_source/getting_started/examples#example-2-multi-modal
  ---
  Adding multi-source messages—including text, images, files, and tool calls—into memory and enabling their retrieval.
  :::

  :::card
  ---
  icon: ri:apps-line
  title: Multi-Cube addition and retrieval
  to: /open_source/getting_started/examples#example-3-multi-cube
  ---
  Add different memories to different Cubes and retrieve them simultaneously during a search.
  :::

  :::card
  ---
  icon: ri:database-2-line
  title: KVCacheMemory Only
  to: /open_source/getting_started/examples#example-4-kvcachememory-only
  ---
  Speed up sessions with short-term KV cache for fast context injection.
  :::

  :::card
  ---
  icon: hugeicons:share-07
  title: Hybrid TreeText + KVCache
  to: /open_source/getting_started/examples#example-5-hybrid
  ---
  Combine explainable graph memory with fast KV caching in a single MemCube.
  :::

  :::card
  ---
  icon: ri:calendar-check-line
  title: Multi-Memory Scheduling
  to: /open_source/getting_started/examples#example-6-multi-memory-scheduling
  ---
  Run dynamic memory orchestration for multi-user, multi-session agents.
  :::

::


## Example 1: Minimal Pipeline

### When to Use:
- You want the smallest possible working example.
- You only need to store simple plaintext memories in a vector DB and retrieve them.

### Key Points:
- Supports basic personal memory integration and search.

### Full Example Code
```python
import json
from memos.api.routers.server_router import add_memories, search_memories
from memos.api.product_models import APIADDRequest, APISearchRequest

user_id = "test_user_1"
add_req = APIADDRequest(
    user_id=user_id,
    writable_cube_ids=["cube_test_user_1"],
    messages = [
        {"role": "user", "content": "I’ve planned to travel to Guangzhou during the summer vacation. What chain hotels are available for accommodation?"},
        {"role": "assistant", "content": "You can consider [7 Days Inn, Ji Hotel, Hilton], etc."},
        {"role": "user", "content": "I’ll choose 7 Days Inn."},
        {"role": "assistant", "content": "Okay, feel free to ask me if you have any other questions."}
    ],
    async_mode="sync",
    mode="fine",
)

add_rsp = add_memories(add_req)
print("add_memories rsp: \n\n", add_rsp)

search_req = APISearchRequest(
    user_id=user_id,
    readable_cube_ids=["cube_test_user_1"],
    query="Please recommend a hotel that I haven’t stayed at before.",
    include_preference=True,
)

search_rsp = search_memories(search_req).data
print("\n\nsearch_rsp: \n\n", json.dumps(search_rsp, indent=2, ensure_ascii=False))
```
## Example 2: Adding and retrieving multi-source memories

### When to Use:
- In addition to plain text conversations, you need to add files, image content, or tool call history to memory.
- At the same time, you want to retrieve memories from these multiple sources.

### Key Points:
- Adds memories from multiple information sources.
- File and image URLs must be downloadable.
- The added information must strictly follow the OpenAI Messages format.
- The tool schema in the system prompt needs to be wrapped in a dedicated tag block.

### Full Example Code
Adding text and files to memory
```python
import json
from memos.api.routers.server_router import add_memories, search_memories
from memos.api.product_models import APIADDRequest, APISearchRequest

user_id = "test_user_2"
add_req = APIADDRequest(
    user_id=user_id,
    writable_cube_ids=["cube_test_user_2"],
    messages = [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Please read this file, summarize the key points, and provide a final conclusion."
                },
                {
                    "type": "file",
                    "file": {
                        "file_id": "file_123",
                        "filename": "report.md",
                        "file_data": "@http://139.196.232.20:9090/graph-test/algorithm/2025_11_13/1763043889_1763043782_PM1%E8%BD%A6%E9%97%B4PMT%E9%9D%B4%E5%8E%8B%E8%BE%B9%E5%8E%8B%E5%8E%8B%E5%8A%9B%E6%97%A0%E6%B3%95%E5%BB%BA%E7%AB%8B%E6%95%85%E9%9A%9C%E6%8A%A5%E5%91%8A20240720.md"
                    }
                },
            ]
        },
        {
            "role": "assistant",
            "content": [
                {
                    "type": "text",
                    "text": "Final Summary: During the PMT boot-pressure startup test of the PM1 workshop on July 20, 2024, the drive could not run because the edge pressures on both sides failed to reach the 2.5-bar interlock requirement. After troubleshooting, the PLC output signals, hydraulic pipelines, and valves were all found to be normal. The root cause was ultimately identified as poor contact at the negative terminal of the proportional valve’s DC 24V power supply inside the PLC cabinet, caused by a short-jumpered terminal block. After re-connecting the negative incoming lines in parallel, the equipment returned to normal operation. It is recommended to replace terminal blocks in batches, inspect instruments with uncertain service life, and optimize the troubleshooting process by tracing common-mode issues from shared buses and power supply sources."
                }
            ]
        }
    ],
    async_mode="sync",
    mode="fine",
)

add_rsp = add_memories(add_req)
print("add_memories rsp: \n\n", add_rsp)

search_req = APISearchRequest(
    user_id=user_id,
    readable_cube_ids=["cube_test_user_2"],
    query="Workshop PMT boot pressure startup test",
    include_preference=False,
)
search_rsp = search_memories(search_req).data
print("\n\nsearch_rsp: \n\n", json.dumps(search_rsp, indent=2, ensure_ascii=False))
```
Adding messages from multiple mixed information sources to memory
```python
import json
from memos.api.routers.server_router import add_memories, search_memories
from memos.api.product_models import APIADDRequest, APISearchRequest

user_id = "test_user_2"
add_req = APIADDRequest(
    user_id=user_id,
    writable_cube_ids=["cube_test_user_2"],
    messages = [
        {
            "role": "system",
            "content": [
                {
                    "type": "text",
                    "text": "You are a professional industrial fault analysis assistant. Please read the PDF, images, and instructions provided by the user and provide a professional technical summary.\n\n\n[\n  {\n    \"name\": \"file_reader\",\n    \"description\": \"Used to read the content of files uploaded by the user and return the text data (in JSON string format).\",\n    \"parameters\": [\n      {\"name\": \"file_id\", \"type\": \"string\", \"required\": true, \"description\": \"The file ID to be read\"}\n    ],\n    \"returns\": {\"type\": \"text\", \"description\": \"Returns the extracted text content of the file\"}\n  }\n]\n"
                }
            ]
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Please read this file and image, summarize the key points, and provide a final conclusion."
                },
                {
                    "type": "file",
                    "file": {
                        "file_id": "file_123",
                        "filename": "report.pdf",
                        "file_data": "@http://139.196.232.20:9090/graph-test/algorithm/2025_11_13/1763043889_1763043782_PM1%E8%BD%A6%E9%97%B4PMT%E9%9D%B4%E5%8E%8B%E8%BE%B9%E5%8E%8B%E5%8E%8B%E5%8A%9B%E6%97%A0%E6%B3%95%E5%BB%BA%E7%AB%8B%E6%95%85%E9%9A%9C%E6%8A%A5%E5%91%8A20240720.md"
                    }
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://play-groud-test-1.oss-cn-shanghai.aliyuncs.com/%E5%9B%BE%E7%89%871.jpeg"
                    }
                }
            ]
        },
        {
            "role": "assistant",
            "tool_calls": [
                {
                    "id": "call_file_reader_001",
                    "type": "function",
                    "function": {
                        "name": "file_reader",
                        "arguments": "{\"file_id\": \"file_123\"}"
                    }
                }
            ]
        },
        {
            "role": "tool",
            "tool_call_id": "call_file_reader_001",
            "content": [
                {
                    "type": "text",
                    "text": "{\"file_id\":\"file_123\",\"extracted_text\":\"PM1 workshop PMT boot pressure startup test record… Final fault cause: poor contact at the negative terminal of the DC 24V power supply circuit due to a short-jumped terminal block.\"}"
                }
            ]
        },
        {
            "role": "assistant",
            "content": [
                {
                    "type": "text",
                    "text": "Final Summary: During the PMT boot-pressure startup test of the PM1 workshop on July 20, 2024, the drive could not run because the edge pressures on both sides failed to reach the 2.5-bar interlock requirement. After troubleshooting, the PLC output signals, hydraulic pipelines, and valves were all found to be normal. The root cause was ultimately identified as poor contact at the negative terminal of the proportional valve’s DC 24V power supply inside the PLC cabinet, caused by a short-jumpered terminal block. After re-connecting the negative incoming lines in parallel, the equipment returned to normal operation. It is recommended to replace terminal blocks in batches, inspect instruments with uncertain service life, and optimize the troubleshooting process by tracing common-mode issues from shared buses and power supply sources."
                }
            ]
        }
    ],
    async_mode="sync",
    mode="fine",
)

add_rsp = add_memories(add_req)
print("add_memories rsp: \n\n", add_rsp)

search_req = APISearchRequest(
    user_id=user_id,
    readable_cube_ids=["cube_test_user_2"],
    query="Workshop PMT boot pressure startup test",
    include_preference=False,
)

search_rsp = search_memories(search_req).data
print("\n\nsearch_rsp: \n\n", json.dumps(search_rsp, indent=2, ensure_ascii=False))
```

## Example 3: Multi-Cube addition and retrieval

### When to Use:

- You want to add memories to separate, isolated Cube spaces
- You want to retrieve memories from different Cube spaces simultaneously

### Key Points:

- Pass a `readable_cube_ids` list containing multiple cube IDs when searching

### Full Example Code

```python
import json
from memos.api.routers.server_router import add_memories, search_memories
from memos.api.product_models import APIADDRequest, APISearchRequest

user_id = "test_user_3"
add_req = APIADDRequest(
    user_id=user_id,
    writable_cube_ids=["cube_test_user_3_1"],
    messages = [
        {"role": "user", "content": "I’ve planned to travel to Guangzhou during the summer vacation. What chain hotels are available for accommodation?"},
        {"role": "assistant", "content": "You can consider [7 Days Inn, Ji Hotel, Hilton], etc."},
        {"role": "user", "content": "I’ll choose 7 Days Inn."},
        {"role": "assistant", "content": "Okay, feel free to ask me if you have any other questions."}
    ],
    async_mode="sync",
    mode="fine",
)

add_rsp = add_memories(add_req)
print("add_memories rsp: \n\n", add_rsp)

add_req = APIADDRequest(
    user_id=user_id,
    writable_cube_ids=["cube_test_user_3_2"],
    messages = [
        {"role": "user", "content": "I love you, I need you."},
        {"role": "assistant", "content": "Wow, I love you too"},
    ],
    async_mode="sync",
    mode="fine",
)

add_rsp = add_memories(add_req)
print("add_memories rsp: \n\n", add_rsp)

search_req = APISearchRequest(
    user_id=user_id,
    readable_cube_ids=["cube_test_user_3_1", "cube_test_user_3_2"],
    query="Please recommend a hotel, Love u u",
    include_preference=True,
)

search_rsp = search_memories(search_req).data
print("\n\nsearch_rsp: \n\n", json.dumps(search_rsp, indent=2, ensure_ascii=False))
```

## Example 4: KVCacheMemory Only

### When to Use:
- You want short-term working memory for faster multi-turn conversation.
- Useful for chatbot session acceleration or prompt reuse.
- Best for caching hidden states / KV pairs.

### Key Points:
- Uses KVCacheMemory with no explicit text memory.
- Demonstrates extract → add → merge → get → delete.
- Shows how to dump/load KV caches.
### Full Example Code

```python
from memos.configs.memory import MemoryConfigFactory
from memos.memories.factory import MemoryFactory

# Create config for KVCacheMemory (HuggingFace backend)
config = MemoryConfigFactory(
    backend="kv_cache",
    config={
        "extractor_llm": {
            "backend": "huggingface",
            "config": {
                "model_name_or_path": "Qwen/Qwen3-0.6B",
                "max_tokens": 32,
                "add_generation_prompt": True,
                "remove_think_prefix": True,
            },
        },
    },
)

# Instantiate KVCacheMemory
kv_mem = MemoryFactory.from_config(config)

# Extract a KVCacheItem (DynamicCache)
prompt = [
    {"role": "user", "content": "What is MemOS?"},
    {"role": "assistant", "content": "MemOS is a memory operating system for LLMs."},
]
print("===== Extract KVCacheItem =====")
cache_item = kv_mem.extract(prompt)
print(cache_item)

# Add the cache to memory
kv_mem.add([cache_item])
print("All caches:", kv_mem.get_all())

# Get by ID
retrieved = kv_mem.get(cache_item.id)
print("Retrieved:", retrieved)

# Merge caches (simulate multi-turn)
item2 = kv_mem.extract([{"role": "user", "content": "Tell me a joke."}])
kv_mem.add([item2])
merged = kv_mem.get_cache([cache_item.id, item2.id])
print("Merged cache:", merged)

# Delete one
kv_mem.delete([cache_item.id])
print("After delete:", kv_mem.get_all())

# Dump & load caches
kv_mem.dump("tmp/kv_mem")
print("Dumped to tmp/kv_mem")
kv_mem.delete_all()
kv_mem.load("tmp/kv_mem")
print("Loaded caches:", kv_mem.get_all())
```

## Example 5: Hybrid

### When to Use:
- You want long-term explainable memory and short-term fast context together.
- Ideal for complex agents that plan, remember facts, and keep chat context.
- Demonstrates multi-memory orchestration.

### How It Works:

- **TreeTextMemory** stores your long-term knowledge in a graph DB (Neo4j).
- **KVCacheMemory** stores recent or stable context as activation caches.
- Both work together in a single **MemCube**, managed by your `MOS` pipeline.


### Full Example Code

```python
import os

from memos.configs.mem_cube import GeneralMemCubeConfig
from memos.configs.mem_os import MOSConfig
from memos.mem_cube.general import GeneralMemCube
from memos.mem_os.main import MOS

# 1. Setup CUDA (if needed) — for local GPU inference
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

# 2. Define user & paths
user_id = "root"
cube_id = "root/mem_cube_kv_cache"
tmp_cube_path = "/tmp/default/mem_cube_5"

# 3. Initialize MOSConfig
mos_config = MOSConfig.from_json_file("examples/data/config/simple_treekvcache_memos_config.json")
mos = MOS(mos_config)

# 4. Initialize the MemCube (TreeTextMemory + KVCacheMemory)
cube_config = GeneralMemCubeConfig.from_json_file(
    "examples/data/config/simple_treekvcache_cube_config.json"
)
mem_cube = GeneralMemCube(cube_config)

# 5. Dump the MemCube to disk
try:
    mem_cube.dump(tmp_cube_path)
except Exception as e:
    print(e)

# 6. Register the MemCube explicitly
mos.register_mem_cube(tmp_cube_path, mem_cube_id=cube_id, user_id=user_id)

# 7. Extract and add a KVCache memory (simulate stable context)
extract_kvmem = mos.mem_cubes[cube_id].act_mem.extract("I like football")
mos.mem_cubes[cube_id].act_mem.add([extract_kvmem])

# 8. Start chatting — now your chat uses:
#    - TreeTextMemory: for structured multi-hop retrieval
#    - KVCacheMemory: for fast context injection
while True:
    user_input = input("👤 [You] ").strip()
    if not user_input:
        break  # empty input ends the chat
    print()
    response = mos.chat(user_input)
    print(f"🤖 [Assistant] {response}\n")

print("📢 [System] MemChat has stopped.")
```
## Example 6: Multi-Memory Scheduling

### When to Use:
- You want to manage multiple users, multiple MemCubes, or dynamic memory flows.
- Good for SaaS agents or multi-session LLMs.
- Demonstrates MemScheduler + config YAMLs.

### Key Points:
- Uses parse_yaml to load MOSConfig and MemCubeConfig.
- Dynamic user and cube creation.
- Shows runtime scheduling of memories.

### Full Example Code

```python
import shutil
import uuid
from pathlib import Path

from memos.configs.mem_cube import GeneralMemCubeConfig
from memos.configs.mem_os import MOSConfig
from memos.mem_cube.general import GeneralMemCube
from memos.mem_os.main import MOS
from memos.mem_scheduler.utils import parse_yaml

# Load main MOS config with MemScheduler
config = parse_yaml("./examples/data/config/mem_scheduler/memos_config_w_scheduler.yaml")
mos_config = MOSConfig(**config)
mos = MOS(mos_config)

# Create user with dynamic ID
user_id = str(uuid.uuid4())
mos.create_user(user_id=user_id)

# Create MemCube config and dump it
config = GeneralMemCubeConfig.from_yaml_file(
    "./examples/data/config/mem_scheduler/mem_cube_config.yaml"
)
mem_cube_id = "mem_cube_5"
mem_cube_name_or_path = f"./outputs/mem_scheduler/{user_id}/{mem_cube_id}"

# Remove old folder if it exists
if Path(mem_cube_name_or_path).exists():
    shutil.rmtree(mem_cube_name_or_path)
    print(f"{mem_cube_name_or_path} is not empty, and has been removed.")

# Dump new cube
mem_cube = GeneralMemCube(config)
mem_cube.dump(mem_cube_name_or_path)

# Register MemCube for this user
mos.register_mem_cube(
    mem_cube_name_or_path=mem_cube_name_or_path,
    mem_cube_id=mem_cube_id,
    user_id=user_id
)

# Add messages
messages = [
    {
        "role": "user",
        "content": "I like playing football."
    },
    {
        "role": "assistant",
        "content": "I like playing football too."
    },
]
mos.add(messages, user_id=user_id, mem_cube_id=mem_cube_id)

# Chat loop: show TreeTextMemory nodes + KVCache
while True:
    user_input = input("👤 [You] ").strip()
    print()
    response = mos.chat(user_input, user_id=user_id)
    retrieved_memories = mos.get_all(mem_cube_id=mem_cube_id, user_id=user_id)

    print(f"🤖 [Assistant] {response}")

    # Show WorkingMemory nodes in TreeTextMemory
    for node in retrieved_memories["text_mem"][0]["memories"]["nodes"]:
        if node["metadata"]["memory_type"] == "WorkingMemory":
            print(f"[WorkingMemory] {node['memory']}")

    # Show Activation Memory
    if retrieved_memories["act_mem"][0]["memories"]:
        for act_mem in retrieved_memories["act_mem"][0]["memories"]:
            print(f"⚡ [KVCache] {act_mem['memory']}")
    else:
        print("⚡ [KVCache] None\n")
```
::note
**Keep in Mind**

Use dump() and load() to persist your memory cubes.

Always check that your vector DB dimension matches your embedder.

For graph memory, you'll need Neo4j Desktop (community version support coming soon).
::

## Next Steps
You're just getting started! Next, try:

- Pick the example that matches your use case.
- Combine modules to build smarter, more persistent agents!

Need more?
See the API Reference or contribute your own example!
diff --git a/docs/en/open_source/getting_started/installation.md b/docs/en/open_source/getting_started/installation.md new file mode 100644 index 00000000..8042e23b --- /dev/null +++ b/docs/en/open_source/getting_started/installation.md @@ -0,0 +1,574 @@
---
title: "Installation Guide"
desc: "Complete installation guide for MemOS."
---


::card-group

  :::card
  ---
  icon: ri:database-2-line
  title: Install via Docker
  to: /open_source/getting_started/installation#from-docker
  ---
  Ideal for quick deployment: one-click startup for services and dependencies.
  :::

  :::card
  ---
  icon: ri:play-line
  title: Install from Source
  to: /open_source/getting_started/installation#from-source
  ---
  Ideal for development and contribution: editable installation, run tests, local debugging.
  :::

  :::card
  ---
  icon: ri:tree-line
  title: Install via pip
  to: /open_source/getting_started/installation#from-pip
  ---
  The simplest installation method: get started with MemOS quickly.
  :::

::


:span{id="from-docker"}
## Install via Docker
```bash
git clone https://github.com/MemTensor/MemOS.git
cd MemOS
```

#### Create .env Configuration File
::note
**Please Note**
The .env file must be placed in the MemOS project root directory.
::

::steps{level="4"}

#### 1. Create .env
```bash
cd MemOS
touch .env
```

#### 2. .env Contents

Here is a quick .env configuration example:
```bash
# OpenAI API Key (Required configuration)
OPENAI_API_KEY=sk-xxx
# OpenAI API Base URL
OPENAI_API_BASE=http://xxx:3000/v1
# Default model name
MOS_CHAT_MODEL=qwen3-max

# Memory Reader LLM Model
MEMRADER_MODEL=qwen3-max
# Memory Reader API Key
MEMRADER_API_KEY=sk-xxx
# Memory Reader API Base URL
MEMRADER_API_BASE=http://xxx:3000/v1

# Embedder Model Name
MOS_EMBEDDER_MODEL=text-embedding-v4
# Configure embedding backend: ollama | universal_api
MOS_EMBEDDER_BACKEND=universal_api
# Embedder API Base URL
MOS_EMBEDDER_API_BASE=http://xxx:8081/v1
# Embedder API Key
MOS_EMBEDDER_API_KEY=xxx
# Embedding Vector Dimension
EMBEDDING_DIMENSION=1024
# Reranker Backend (http_bge | etc.)
MOS_RERANKER_BACKEND=cosine_local

# Neo4j Connection URI
# Options: neo4j-community | neo4j | nebular | polardb
NEO4J_BACKEND=neo4j-community
# Required when backend=neo4j*
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=12345678
NEO4J_DB_NAME=neo4j
MOS_NEO4J_SHARED_DB=false

# Whether to use the Redis scheduler
DEFAULT_USE_REDIS_QUEUE=false

# Enable Chat API
ENABLE_CHAT_API=true
# Chat model list; keys can be applied for through Bailian. Models are customizable.
CHAT_MODEL_LIST=[{"backend": "qwen", "api_base": "https://xxx/v1", "api_key": "sk-xxx", "model_name_or_path": "qwen3-max", "extra_body": {"enable_thinking": true} ,"support_models": ["qwen3-max"]}]
```
#### .env Configuration Example using Bailian
```bash
# Keys can be applied for through the Bailian platform
# https://bailian.console.aliyun.com/?spm=a2c4g.11186623.0.0.2f2165b08fRk4l&tab=api#/api
# After a successful application, get the API_KEY and BASE_URL; a configuration example follows

# OpenAI API Key (Use Bailian API_KEY)
OPENAI_API_KEY=you_bailian_api_key
# OpenAI API Base URL
OPENAI_API_BASE=https://dashscope.aliyuncs.com/compatible-mode/v1
# Default model name
MOS_CHAT_MODEL=qwen3-max

# Memory Reader LLM Model
MEMRADER_MODEL=qwen3-max
# Memory Reader API Key (Use Bailian API_KEY)
MEMRADER_API_KEY=you_bailian_api_key
# Memory Reader API Base URL
MEMRADER_API_BASE=https://dashscope.aliyuncs.com/compatible-mode/v1

# The embedder model name can refer to the link below
# https://bailian.console.aliyun.com/?spm=a2c4g.11186623.0.0.2f2165b08fRk4l&tab=api#/api/?type=model&url=2846066
MOS_EMBEDDER_MODEL=text-embedding-v4
# Configure embedding backend: ollama | universal_api
MOS_EMBEDDER_BACKEND=universal_api
# Embedder API Base URL
MOS_EMBEDDER_API_BASE=https://dashscope.aliyuncs.com/compatible-mode/v1
# Embedder API Key (Use Bailian API_KEY)
MOS_EMBEDDER_API_KEY=you_bailian_api_key
# Embedding Vector Dimension
EMBEDDING_DIMENSION=1024
# Reranker Backend (http_bge | etc.)
MOS_RERANKER_BACKEND=cosine_local

# Neo4j Connection URI
# Options: neo4j-community | neo4j | nebular | polardb
NEO4J_BACKEND=neo4j-community
# Required when backend=neo4j*
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=12345678
NEO4J_DB_NAME=neo4j
MOS_NEO4J_SHARED_DB=false

# Whether to use the Redis scheduler
DEFAULT_USE_REDIS_QUEUE=false

# Enable Chat API
ENABLE_CHAT_API=true

CHAT_MODEL_LIST=[{"backend": "qwen", "api_base": "https://dashscope.aliyuncs.com/compatible-mode/v1", "api_key": "you_bailian_api_key", "model_name_or_path": "qwen3-max-preview", "extra_body": {"enable_thinking": true} ,"support_models": ["qwen3-max-preview"]}]
```
![MemOS bailian](https://cdn.memtensor.com.cn/img/get_key_url_by_bailian_compressed.png)
Example of applying for the API_KEY and BASE_URL in Bailian
::

#### Configure Dockerfile
::note
**Please Note**
The Dockerfile is located in the docker directory.
::

```bash
# Enter the docker directory
cd docker
```
Two base images are provided — a lite package and a full package, each available for both ARM and x86:

- **Lite Package**: Strips heavy dependencies such as the NVIDIA stack, keeping the image lightweight for faster local deployment.
  - url: registry.cn-shanghai.aliyuncs.com/memtensor/memos-base:v1.0
  - url: registry.cn-shanghai.aliyuncs.com/memtensor/memos-base-arm:v1.0
- **Full Package**: Bundles all MemOS dependencies into the image for a full-feature experience. Can be built and started directly by configuring the Dockerfile.
  - url: registry.cn-shanghai.aliyuncs.com/memtensor/memos-full-base:v1.0.0
  - url: registry.cn-shanghai.aliyuncs.com/memtensor/memos-full-base-arm:v1.0.0

```dockerfile
# This example uses the lite package url
FROM registry.cn-shanghai.aliyuncs.com/memtensor/memos-base-arm:v1.0

WORKDIR /app

ENV HF_ENDPOINT=https://hf-mirror.com

ENV PYTHONPATH=/app/src

COPY src/ ./src/

EXPOSE 8000

CMD ["uvicorn", "memos.api.server_api:app", "--host", "0.0.0.0", "--port", "8000", "--reload"]
```

#### Start Docker Client
```bash
# If Docker is not installed, install the appropriate version from:
# https://www.docker.com/

# After installation, Docker can be started through the client or the command line
# Command-line start:
sudo systemctl start docker

# After installation, check Docker status
docker ps

# Check Docker images (optional)
docker images
```
#### Build and Start Service:
::note
**Please Note**
The build command must also be executed in the docker directory.
::
```bash
# In the docker directory
docker compose up
```
![MemOS buildComposeupSuccess](https://cdn.memtensor.com.cn/img/memos_build_composeup_success_compressed.png)
Example image; the port depends on your custom Docker configuration
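Once the containers are up, you can quickly confirm the server responds from the command line (a sketch; `/openapi.json` is FastAPI's default schema endpoint, and the port follows the compose configuration above):

```bash
# Should print the start of the OpenAPI schema if the server is up
curl -s http://localhost:8000/openapi.json | head -c 200
```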
+ +#### Access the API at [http://localhost:8000/docs](http://localhost:8000/docs). + +![MemOS Architecture](https://cdn.memtensor.com.cn/img/memos_run_server_success_compressed.png) + +#### Search Memory +```bash +curl --location --request POST 'http://127.0.0.1:8000/product/search' \ +--header 'Content-Type: application/json' \ +--data-raw '{ + "query": "What do I like to eat", + "user_id": "8736b16e-1d20-4163-980b-a5063c3facdc", + "readable_cube_ids": ["b32d0977-435d-4828-a86f-4f47f8b55bca"], + "top_k":20 + }' + +# response +{ + "code": 200, + "message": "Search completed successfully", + "data": { + "text_mem": [ + { + "cube_id": "7231eda8-6c57-4f6e-97ce-98b699eebb98", + "memories": [ + { + "id": "2f40be8f-736c-4a5f-aada-9489037769e0", + "memory": "[user opinion] User likes strawberries.", + "metadata": { + "user_id": "de8215e3-3beb-4afc-9b64-ae594d62f1ea", + "session_id": "root_session", + "status": "activated", + "type": "fact", + "key": "User's preference for strawberries", + "confidence": 0.99, + "source": null, + "tags": [ + "preference", + "strawberries" + ], + "visibility": null, + "updated_at": "2025-09-18T08:23:44.625479000+00:00", + "memory_type": "UserMemory", + "sources": [], + "embedding": [], + "created_at": "2025-09-18T08:23:44.625511000+00:00", + "usage": [ + "{ + "time": "2025-09-18T08:24:17.759748", + "info": { + "user_id": "de8215e3-3beb-4afc-9b64-ae594d62f1ea", + "session_id": "root_session" + } + }" + ], + "background": "User expressed a preference for strawberries, indicating a tendency in dietary preferences.", + "relativity": 0.6349761312470591, + "vector_sync": "success", + "ref_id": "[2f40be8f]", + "id": "2f40be8f-736c-4a5f-aada-9489037769e0", + "memory": "[user opinion] User likes strawberries." + }, + "ref_id": "[2f40be8f]" + }, + ... + } + } + ], + "act_mem": [], + "para_mem": [] + } +} +``` + + +:span{id="from-source"} +## Install from Source +```bash +git clone https://github.com/MemTensor/MemOS.git +cd MemOS +``` + +#### Create .env Configuration File +The MemOS server_api relies on environment variables to start, so you need to create a .env file in the startup directory. +1. Create .env file +```bash +cd MemOS +touch .env +``` + +2. .env contents +Please refer to the Docker installation for quick configuration[env configuration](/open_source/getting_started/installation#from-docker) +For detailed .env configuration, please refer to [env configuration](/open_source/getting_started/rest_api_server/#local-deployment) + +::note +**Please Note**
The .env file must be placed in the MemOS project root directory.
::

#### Install Dependencies
```bash
# Execute the installation command
pip install -e .
pip install --no-cache-dir -r ./docker/requirements.txt
# Point PYTHONPATH at the absolute path of the project's src directory
export PYTHONPATH=/******/MemOS/src
```

#### Neo4j Support

::note
**Neo4j Desktop Requirement**
If you plan to use Neo4j for graph memory, please install Neo4j Desktop.
+Additionally, you need to set **NEO4J_BACKEND=neo4j** in .env file +:: + + +#### Start MemOS Server +```bash +# project root directory +uvicorn memos.api.server_api:app --host 0.0.0.0 --port 8000 --workers 1 +``` + +#### Add Memory +```bash +curl --location --request POST 'http://127.0.0.1:8000/product/add' \ +--header 'Content-Type: application/json' \ +--data-raw '{ + + "messages": [{ + "role": "user", + "content": "I like eating strawberries" + }], + "user_id": "8736b16e-1d20-4163-980b-a5063c3facdc", + "writable_cube_ids":["b32d0977-435d-4828-a86f-4f47f8b55bca"] +}' + +# response +{ + "code": 200, + "message": "Memory created successfully", + "data": null +} +``` + +#### Search Memory +```bash +curl --location --request POST 'http://127.0.0.1:8000/product/search' \ +--header 'Content-Type: application/json' \ +--data-raw '{ + "query": "What do I like to eat", + "user_id": "8736b16e-1d20-4163-980b-a5063c3facdc", + "readable_cube_ids": ["b32d0977-435d-4828-a86f-4f47f8b55bca"], + "top_k":20 + }' + +# response +{ + "code": 200, + "message": "Search completed successfully", + "data": { + "text_mem": [ + { + "cube_id": "7231eda8-6c57-4f6e-97ce-98b699eebb98", + "memories": [ + { + "id": "2f40be8f-736c-4a5f-aada-9489037769e0", + "memory": "[user opinion] User likes strawberries.", + "metadata": { + "user_id": "de8215e3-3beb-4afc-9b64-ae594d62f1ea", + "session_id": "root_session", + "status": "activated", + "type": "fact", + "key": "User's preference for strawberries", + "confidence": 0.99, + "source": null, + "tags": [ + "preference", + "strawberries" + ], + "visibility": null, + "updated_at": "2025-09-18T08:23:44.625479000+00:00", + "memory_type": "UserMemory", + "sources": [], + "embedding": [], + "created_at": "2025-09-18T08:23:44.625511000+00:00", + "usage": [ + "{ + "time": "2025-09-18T08:24:17.759748", + "info": { + "user_id": "de8215e3-3beb-4afc-9b64-ae594d62f1ea", + "session_id": "root_session" + } + }" + ], + "background": "User expressed a preference for strawberries, indicating a tendency in dietary preferences.", + "relativity": 0.6349761312470591, + "vector_sync": "success", + "ref_id": "[2f40be8f]", + "id": "2f40be8f-736c-4a5f-aada-9489037769e0", + "memory": "[user opinion] User likes strawberries." + }, + "ref_id": "[2f40be8f]" + }, + ... + } + } + ], + "act_mem": [], + "para_mem": [] + } +} +``` + + +:span{id="from-pip"} +## Install via pip +The simplest way to install MemOS is using pip. + +::steps{level="4"} + +#### Create and Activate Conda Environment (Recommended) + +To avoid dependency conflicts, it is strongly recommended to use a dedicated Conda environment. + +```bash +conda create -n memos python=3.11 +conda activate memos +``` + +#### Install MemOS from PyPI +Install MemOS with all optional components: + +```bash +pip install -U "MemoryOS[all]" +``` + +After installation, you can verify it was successful: + +```bash +python -c "import memos; print(memos.__version__)" +``` + + +::note +**Optional Dependencies**

MemOS provides several optional dependency groups for different features. You can install them based on your needs.

| Feature | Package Name |
| ---------------- | ------------------------- |
| Tree Memory | `MemoryOS[tree-mem]` |
| Memory Reader | `MemoryOS[mem-reader]` |
| Memory Scheduler | `MemoryOS[mem-scheduler]` |

Example installation commands (quoted so they also work in zsh):

```bash
pip install "MemoryOS[tree-mem]"
pip install "MemoryOS[tree-mem,mem-reader]"
pip install "MemoryOS[mem-scheduler]"
pip install "MemoryOS[tree-mem,mem-reader,mem-scheduler]"
```
::

#### Create .env Configuration File
The MemOS server_api reads its configuration from environment variables at startup, so you need to create a .env file in the startup directory.
1. Create the .env file
```bash
touch .env
```

2. Example .env contents
```text
# ========== Required Configuration ==========
CHAT_MODEL_LIST='[
  {
    "name": "default",
    "backend": "openai",
    "config": {
      "model": "gpt-4o-mini",
      "api_key": "YOUR_API_KEY"
    }
  }
]'

# ========== Optional Configuration ==========
MEMOS_LOG_LEVEL=INFO
```

::note
**Please Note**
The .env file must be placed in the directory from which you start the server; otherwise the configuration will not be picked up.
::

For detailed development environment setup, workflow guidelines, and contribution best practices, please see our [Contribution Guide](/open_source/contribution/overview).

#### Start MemOS Server
MemOS does not automatically load .env files, so start the server through python-dotenv:
```bash
python -m dotenv run -- \
    uvicorn memos.api.server_api:app \
    --host 0.0.0.0 \
    --port 8000
```
After successful startup, you will see output similar to:
```text
INFO: Uvicorn running on http://0.0.0.0:8000
INFO: Application startup complete.
```

#### Verify Service is Running

Open [http://localhost:8000/docs](http://localhost:8000/docs) in your browser; if the OpenAPI page loads, the service is up.

::

#### Ollama Support
To use MemOS with [Ollama](https://ollama.com/), first install the Ollama CLI:

```bash
curl -fsSL https://ollama.com/install.sh | sh
```

#### Transformers Support

To use functionalities based on the `transformers` library, ensure you have [PyTorch](https://pytorch.org/get-started/locally/) installed (CUDA version recommended for GPU acceleration).

#### Neo4j Support

::note
**Neo4j Desktop Requirement**
If you plan to use Neo4j for graph memory, please install Neo4j Desktop. +:: + +#### Download Examples + +To download example code, data, and configurations, run the following command: + +```bash +memos download_examples +``` diff --git a/docs/en/open_source/getting_started/rest_api_server.md b/docs/en/open_source/getting_started/rest_api_server.md new file mode 100644 index 00000000..9dab5474 --- /dev/null +++ b/docs/en/open_source/getting_started/rest_api_server.md @@ -0,0 +1,500 @@ +--- +title: REST API Server +desc: MemOS provides a REST API service written using FastAPI. Users can perform all operations via REST interfaces. +--- + +![MemOS Architecture](https://cdn.memtensor.com.cn/img/memos_run_server_success_compressed.png) +
APIs supported by MemOS REST API Server

### Features

- Add new memory: Create a new memory for a specific user.
- Search memories: Search memory content for a specific user.
- Get all user memories: Retrieve all memory content for a specific user.
- Memory feedback: Submit feedback on memory content for a specific user.
- Chat with MemOS: Chat with MemOS; returns an SSE streaming response.


## Run Locally

### 1. Download the Code
```bash
# Download the code to a local folder
git clone https://github.com/MemTensor/MemOS
```

### 2. Configure Environment Variables
```bash
# Enter the folder
cd MemOS
```

#### Create a `.env` file in the root directory and set your environment variables.
##### The quick-mode .env configuration is shown below; for the complete configuration, refer to `.env.example`.

```bash

# OpenAI API Key (custom configuration required)
OPENAI_API_KEY=sk-xxx
# OpenAI API Base URL
OPENAI_API_BASE=http://xxx:3000/v1
# Default model name
MOS_CHAT_MODEL=qwen3-max

# Memory Reader LLM model
MEMRADER_MODEL=qwen3-max
# Memory Reader API Key
MEMRADER_API_KEY=sk-xxx
# Memory Reader API Base URL
MEMRADER_API_BASE=http://xxx:3000/v1

# Embedder model name
MOS_EMBEDDER_MODEL=text-embedding-v4
# Default embedding backend: ollama | universal_api
MOS_EMBEDDER_BACKEND=universal_api
# Embedder API Base URL
MOS_EMBEDDER_API_BASE=http://xxx:8081/v1
# Embedder API Key
MOS_EMBEDDER_API_KEY=xxx
# Embedding vector dimension
EMBEDDING_DIMENSION=1024
# Reranker backend (http_bge, etc.)
MOS_RERANKER_BACKEND=cosine_local

# Neo4j backend
# Optional values: neo4j-community | neo4j | nebular | polardb
NEO4J_BACKEND=neo4j-community
# Required when backend=neo4j*
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=12345678
NEO4J_DB_NAME=neo4j
MOS_NEO4J_SHARED_DB=false

# Whether to use the Redis scheduler
DEFAULT_USE_REDIS_QUEUE=false

# Enable the chat API
ENABLE_CHAT_API=true
# The chat model list can be obtained through Bailian; models are selectable.
CHAT_MODEL_LIST=[{"backend": "qwen", "api_base": "https://xxx/v1", "api_key": "sk-xxx", "model_name_or_path": "qwen3-max", "extra_body": {"enable_thinking": true} ,"support_models": ["qwen3-max"]}]

```

### 3. Customize the Configuration, Using Bailian as an Example

```bash
# You can apply for access through the Bailian platform
# https://bailian.console.aliyun.com/?spm=a2c4g.11186623.0.0.2f2165b08fRk4l&tab=api#/api
# After a successful application, obtain the API_KEY and BASE_URL.
# The example configuration is as follows:

# OpenAI API Key (use the Bailian API_KEY)
OPENAI_API_KEY=your_bailian_api_key
# OpenAI API Base URL
OPENAI_API_BASE=https://dashscope.aliyuncs.com/compatible-mode/v1
# Default model name
MOS_CHAT_MODEL=qwen3-max

# Memory Reader LLM model
MEMRADER_MODEL=qwen3-max
# Memory Reader API Key (use the Bailian API_KEY)
MEMRADER_API_KEY=your_bailian_api_key
# Memory Reader API Base URL
MEMRADER_API_BASE=https://dashscope.aliyuncs.com/compatible-mode/v1

# Embedder model name; see the following link for available models
# https://bailian.console.aliyun.com/?spm=a2c4g.11186623.0.0.2f2165b08fRk4l&tab=api#/api/?type=model&url=2846066
MOS_EMBEDDER_MODEL=text-embedding-v4
# Default embedding backend: ollama | universal_api
MOS_EMBEDDER_BACKEND=universal_api
# Embedder API Base URL
MOS_EMBEDDER_API_BASE=https://dashscope.aliyuncs.com/compatible-mode/v1
# Embedder API Key (use the Bailian API_KEY)
MOS_EMBEDDER_API_KEY=your_bailian_api_key
# Embedding vector dimension
EMBEDDING_DIMENSION=1024
# Reranker backend (http_bge, etc.)
MOS_RERANKER_BACKEND=cosine_local

# Neo4j backend
# Optional values: neo4j-community | neo4j | nebular | polardb
NEO4J_BACKEND=neo4j-community
# Required when backend=neo4j*
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=12345678
NEO4J_DB_NAME=neo4j
MOS_NEO4J_SHARED_DB=false

# Whether to use the Redis scheduler
DEFAULT_USE_REDIS_QUEUE=false

# Enable the chat API
ENABLE_CHAT_API=true

CHAT_MODEL_LIST=[{"backend": "qwen", "api_base": "https://dashscope.aliyuncs.com/compatible-mode/v1", "api_key": "your_bailian_api_key", "model_name_or_path": "qwen3-max-preview", "extra_body": {"enable_thinking": true} ,"support_models": ["qwen3-max-preview"]}]

```
![MemOS bailian](https://cdn.memtensor.com.cn/img/get_key_url_by_bailian_compressed.png)
Bailian application API_KEY and BASE_URL example
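Before starting Docker, it can help to sanity-check that your `.env` file is actually loadable. Below is a minimal sketch using python-dotenv (the same loader the startup commands in this guide rely on); the key names are taken from the quick-mode configuration above, so adjust them to whatever you configured:

```python
# check_env.py - a minimal sanity check; assumes python-dotenv is installed
from dotenv import dotenv_values

# Key names follow the quick-mode configuration shown above
REQUIRED = [
    "OPENAI_API_KEY",
    "OPENAI_API_BASE",
    "MOS_CHAT_MODEL",
    "MOS_EMBEDDER_MODEL",
    "NEO4J_URI",
]

config = dotenv_values(".env")  # reads the .env file in the current directory
missing = [key for key in REQUIRED if not config.get(key)]
print("Missing or empty keys:", missing if missing else "none")
```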

Dependency versions can be adjusted in docker/requirements.txt (optional); refer to requirements.txt for the complete set.

### 4. Start Docker
```bash
# If Docker is not installed, install the appropriate version from:
# https://www.docker.com/

# After installation, Docker can be started through the client or the command line.
# Command-line start:
sudo systemctl start docker

# Check Docker status
docker ps
# Check Docker images (optional)
docker images

```


### Method 1: Start from a Prebuilt Repository Image (Recommended)
::steps{level="4"}
```bash
# Enter the docker directory
cd docker
```

#### Configure the .env file as described in the environment variable section above

#### Choose a base image

Two modes are available, using either simplified packages (x86 and arm) or full packages (x86 and arm):
```bash
# Simplified package: strips the oversized NVIDIA-related dependencies to produce a
# lightweight image, making local deployment lighter and faster.
url: registry.cn-shanghai.aliyuncs.com/memtensor/memos-base:v1.0
url: registry.cn-shanghai.aliyuncs.com/memtensor/memos-base-arm:v1.0

# Full package: bundles all MemOS dependencies into the image for the complete
# feature set. By configuring the Dockerfile, you can build and start it directly.
url: registry.cn-shanghai.aliyuncs.com/memtensor/memos-full-base:v1.0.0
url: registry.cn-shanghai.aliyuncs.com/memtensor/memos-full-base-arm:v1.0.0
```

#### Configure the Dockerfile (in the docker directory)
```bash
# The current example uses a simplified package url
FROM registry.cn-shanghai.aliyuncs.com/memtensor/memos-base:v1.0

WORKDIR /app

ENV HF_ENDPOINT=https://hf-mirror.com

ENV PYTHONPATH=/app/src

COPY src/ ./src/

EXPOSE 8000

CMD ["uvicorn", "memos.api.server_api:app", "--host", "0.0.0.0", "--port", "8000", "--reload"]

```
#### Build and start the service with docker compose:
```bash
# Run from the docker directory
docker compose up
```
![MemOS buildComposeupSuccess](https://cdn.memtensor.com.cn/img/memos_build_composeup_success_compressed.png)
Example image, port as per docker custom configuration
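Once the container reports success, you can optionally verify it from a script before opening a browser. The snippet below is a quick probe using only the Python standard library; it assumes the default port mapping (8000) from the compose configuration above:

```python
# probe_server.py - a quick liveness check using only the standard library
from urllib.request import urlopen

# Port 8000 matches the EXPOSE/CMD settings in the Dockerfile above
with urlopen("http://localhost:8000/docs", timeout=5) as resp:
    print("Server responded with HTTP", resp.status)  # 200 means the API is up
```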

#### Access the API via [http://localhost:8000/docs](http://localhost:8000/docs)

![MemOS Architecture](https://cdn.memtensor.com.cn/img/memos_run_server_success_compressed.png)


#### Test cases (Add user memory -> Query user memory): refer to the Docker Compose test cases

::



### Method 2: Client Install with Docker Compose
::steps{level="4"}
The development Docker Compose setup comes pre-configured with Qdrant and Neo4j.
Running the server requires the `OPENAI_API_KEY` environment variable.


#### Enter the docker folder
```bash
# Enter the docker folder from the current directory
cd docker
```

#### Install the corresponding dependency modules
```bash
pip install --upgrade pip && pip install --no-cache-dir -r requirements.txt
# Install dependencies using the Aliyun mirror
# pip install --upgrade pip && pip install --no-cache-dir -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple/
```


#### Start the container with Docker Compose in the docker directory (ensure your network/VPN connection is working):

```bash
# A build is required for the first run
docker compose up --build
# Not required for subsequent runs
docker compose up
```

#### Access the API via [http://localhost:8000/docs](http://localhost:8000/docs)

#### Example process

##### (Query user memory (stop if none) -> Add user memory -> Query user memory)

##### Add User Memory http://localhost:8000/product/add (POST)
```bash
# Request params
{
    "user_id": "8736b16e-1d20-4163-980b-a5063c3facdc",
    "mem_cube_id": "b32d0977-435d-4828-a86f-4f47f8b55bca",
    "messages": [
        {
            "role": "user",
            "content": "I like strawberry"
        }
    ],
    "memory_content": "",
    "doc_path": "",
    "source": "",
    "user_profile": false
}
# Response
{
    "code": 200,
    "message": "Memory created successfully",
    "data": null
}
```

##### Query User Memory http://localhost:8000/product/search (POST)
```bash
# Request params
{
    "query": "What do I like",
    "user_id": "8736b16e-1d20-4163-980b-a5063c3facdc",
    "mem_cube_id": "b32d0977-435d-4828-a86f-4f47f8b55bca"
}
# Response (truncated)
{
    "code": 200,
    "message": "Search completed successfully",
    "data": {
        "text_mem": [
            {
                "cube_id": "7231eda8-6c57-4f6e-97ce-98b699eebb98",
                "memories": [
                    {
                        "id": "2f40be8f-736c-4a5f-aada-9489037769e0",
                        "memory": "[user viewpoint] User likes strawberries.",
                        "metadata": {
                            "user_id": "de8215e3-3beb-4afc-9b64-ae594d62f1ea",
                            "session_id": "root_session",
                            "status": "activated",
                            "type": "fact",
                            "key": "User preference for strawberries",
                            "confidence": 0.99,
                            "source": null,
                            "tags": ["preference", "strawberry"],
                            "visibility": null,
                            "updated_at": "2025-09-18T08:23:44.625479000+00:00",
                            "memory_type": "UserMemory",
                            "sources": [],
                            "embedding": [],
                            "created_at": "2025-09-18T08:23:44.625511000+00:00",
                            "usage": [
                                "{\"time\": \"2025-09-18T08:24:17.759748\", \"info\": {\"user_id\": \"de8215e3-3beb-4afc-9b64-ae594d62f1ea\", \"session_id\": \"root_session\"}}"
                            ],
                            "background": "The user expressed a preference for strawberries, indicating their inclination towards dietary preferences.",
                            "relativity": 0.6349761312470591,
                            "vector_sync": "success",
                            "ref_id": "[2f40be8f]"
                        },
                        "ref_id": "[2f40be8f]"
                    },
                    ...
                ]
            }
        ],
        "act_mem": [],
        "para_mem": []
    }
}


# If the response fails, troubleshoot as follows:
# In src/memos/api/config.py, check the "neo4j_vec_db" and "EMBEDDING_DIMENSION"
# values configured in the get_neo4j_community_config method
```


#### Modifications to server code or library code will automatically reload the server.


::

### Method 3: Client Install via CLI Commands

::steps{level="4"}

#### Install dependencies

```bash
# pip install --upgrade pip && pip install --no-cache-dir -r ./docker/requirements.txt
# Install dependencies using the Aliyun mirror
pip install --no-cache-dir -r ./docker/requirements.txt -i https://mirrors.aliyun.com/pypi/simple/
```

#### Open a terminal and run the following commands:

```bash
# The Neo4j and Qdrant images may currently need to be loaded manually
# from local archives: neo4j.5.26.4.tar and qdrant.v1.15.3.tar
docker load -i neo4j.5.26.4.tar
docker load -i qdrant.v1.15.3.tar
# Verify the images were loaded
docker images
# Check whether the containers are running
docker ps -a


# Run from the repository root
uvicorn memos.api.server_api:app --host 0.0.0.0 --port 8000 --workers 1

# If "ModuleNotFoundError: No module named 'memos'" appears during startup,
# the module path is not set; run:
export PYTHONPATH=/your-absolute-path/MemOS/src
```

#### Access the API

After startup is complete, access the API via [http://localhost:8000/docs](http://localhost:8000/docs).


::

### Method 4: Without Docker
::steps{level="4"}
#### Configure the .env file as described in the environment variable section above

#### Install Poetry for dependency management:

```bash
curl -sSL https://install.python-poetry.org | python3 -
```

#### Poetry environment variable configuration:

```bash
# To start using Poetry, its bin directory (e.g. /Users/jinyunyuan/.local/bin)
# must be on your "PATH". Modern macOS systems default to zsh; confirm as follows.

# 1) Determine which shell you are using
echo $SHELL
# If the output is /bin/zsh or /usr/bin/env zsh, you are using zsh.
# (On older systems it may still be bash, in which case the output is /bin/bash.)

# 2) Open the corresponding shell config file
# If using zsh (the vast majority of cases):
# use the nano editor (recommended for beginners)
nano ~/.zshrc
# or use the vim editor
# vim ~/.zshrc
# If using bash:
nano ~/.bash_profile
# or
nano ~/.bashrc

# 3) Add the PATH environment variable
# At the very end of the opened file, start a new line and paste the path
# from the installation prompt:
export PATH="/your-path/.local/bin:$PATH"

# 4) Save and exit the editor
# nano: press Ctrl+O to write (save), Enter to confirm the filename, then Ctrl+X to exit.
# vim: press i to enter insert mode, paste the line, press ESC, then type :wq and press Enter.

# 5) Reload the configuration so it takes effect in the current terminal
# For zsh:
source ~/.zshrc
# For bash:
source ~/.bash_profile
# 6) Verify the installation
# Run the following command to check that everything is ready:
poetry --version
# On success, Poetry prints its version, e.g. Poetry (version 2.2.0)

```

#### Install all project dependencies and development tools:

```bash
make install
```

#### First, start Neo4j and Qdrant in Docker

#### Start the FastAPI server (in the MemOS directory):

```bash
uvicorn memos.api.product_api:app --host 0.0.0.0 --port 8000 --reload
```

#### Once the server is running, you can test the API via the OpenAPI docs at [http://localhost:8000/docs](http://localhost:8000/docs) or [http://127.0.0.1:8000/docs](http://127.0.0.1:8000/docs)

#### Test cases (Register user -> Add user memory -> Query user memory): refer to the Docker Compose test cases

::


### Method 5: Start Using PyCharm

#### Run server_api
```bash
# 1) Open MemOS/docker/Dockerfile and adjust the run configuration.
#    The container starts the server with:
CMD ["uvicorn", "memos.api.server_api:app", "--host", "0.0.0.0", "--port", "8000", "--reload"]

# 2) Alternatively, go to MemOS/src/memos/api and run server_api.py directly.
```
diff --git a/docs/en/open_source/getting_started/your_first_memory.md b/docs/en/open_source/getting_started/your_first_memory.md
new file mode 100644
index 00000000..28af0180
--- /dev/null
+++ b/docs/en/open_source/getting_started/your_first_memory.md
@@ -0,0 +1,268 @@
---
title: Your First Memory
desc: Let’s build your first plaintext memory in MemOS! **GeneralTextMemory** is the easiest way to get hands-on with extracting, embedding, and searching simple text memories.
---

## What You'll Learn

By the end of this guide, you will:
- Extract memories from plain text or chat messages.
- Store them as semantic vectors.
- Search and manage them using vector similarity.

## How It Works

### Memory Structure

Every memory is stored as a `TextualMemoryItem`:
- `memory`: the main text content (e.g., "The user loves tomatoes.")
- `metadata`: extra details to make the memory searchable and manageable — type,
  time, source, confidence, entities, tags, visibility, and updated_at.

These fields make each piece of memory queryable, filterable, and easy to govern.

For each `TextualMemoryItem`:

| Field | Example | What it means |
| ------------- | ------------------------- | ------------------------------------------ |
| `type` | `"opinion"` | Classify if it's a fact, event, or opinion |
| `memory_time` | `"2025-07-02"` | When it happened |
| `source` | `"conversation"` | Where it came from |
| `confidence` | `100.0` | Certainty score (0–100) |
| `entities` | `["tomatoes"]` | Key concepts |
| `tags` | `["food", "preferences"]` | Extra labels for grouping |
| `visibility` | `"private"` | Who can access it |
| `updated_at` | `"2025-07-02T00:00:00Z"` | Last modified |

::note
**Best Practice**
You can define any metadata fields that make sense for your use case! +:: + + + +### The Core Steps +When you run this example: + +1. **Extract:** +Your messages go through an `extractor_llm`, which returns a JSON list of `TextualMemoryItem`s. + +2. **Embed:** +Each memory's `memory` field is turned into an embedding vector via `embedder`. + +3. **Store:** +The embeddings are saved into a local **Qdrant** collection. + +4. **Search & Manage:** +You can now `search` by semantic similarity, `update` by ID, or `delete` memories. + +::note +**Hint**
Make sure your embedder's output dimension matches your vector DB's `vector_dimension`. + Mismatch may cause search errors! +:: + + + +::note +**Hint**
If your search results are too noisy or irrelevant, check whether your embedder config and vector DB are properly initialized. +:: + +### Example Flow + +**Input Messages:** + +```json +[ + {"role": "user", "content": "I love tomatoes."}, + {"role": "assistant", "content": "Great! Tomatoes are healthy."} +] +``` + +**Extracted Memory:** + +```json +{ + "memory": "The user loves tomatoes.", + "metadata": { + "type": "opinion", + "memory_time": "2025-07-02", + "source": "conversation", + "confidence": 100.0, + "entities": ["tomatoes"], + "tags": ["food", "preferences"], + "visibility": "private", + "updated_at": "2025-07-02T00:00:00" + } +} +``` + +Here's a minimal script to create, extract, store, and search a memory: + +::steps{level="4"} + +#### Create a Memory Config + +First, create your minimal GeneralTextMemory config. +It contains three key parts: +- extractor_llm: uses an LLM to extract plaintext memories from conversations. +- embedder: turns each memory into a vector. +- vector_db: stores vectors and supports similarity search. + +```python +from memos.configs.memory import MemoryConfigFactory +from memos.memories.factory import MemoryFactory + +config = MemoryConfigFactory( + backend="general_text", + config={ + "extractor_llm": { + "backend": "ollama", + "config": { + "model_name_or_path": "qwen3:0.6b", + "temperature": 0.0, + "remove_think_prefix": True, + "max_tokens": 8192, + }, + }, + "vector_db": { + "backend": "qdrant", + "config": { + "collection_name": "test_textual_memory", + "distance_metric": "cosine", + "vector_dimension": 768, + }, + }, + "embedder": { + "backend": "ollama", + "config": { + "model_name_or_path": "nomic-embed-text:latest", + }, + }, + }, +) + +m = MemoryFactory.from_config(config) +``` + + +#### Extract Memories from Messages +Give your LLM a simple dialogue and see how it extracts structured plaintext memories. + +```python +memories = m.extract( + [ + {"role": "user", "content": "I love tomatoes."}, + {"role": "assistant", "content": "Great! Tomatoes are delicious."}, + ] +) +print("Extracted:", memories) +``` +You'll get a list of TextualMemoryItem, with each of them like: +```text +TextualMemoryItem( + id='...', + memory='The user loves tomatoes.', + metadata=... +) +``` + +#### Add Memories to Your Vector DB + +Save the extracted memories to your vector DB and demonstrate adding a custom plaintext memory manually (with a custom ID). + +```python +m.add(memories) +m.add([ + { + "id": "a19b6caa-5d59-42ad-8c8a-e4f7118435b4", + "memory": "User is Chinese.", + "metadata": {"type": "opinion"}, + } +]) +``` + + +#### Search Memories + +Now test similarity search! +Type any natural language query and find related memories. +```python +results = m.search("Tell me more about the user", top_k=2) +print("Search results:", results) +``` + +#### Get Memories by ID + +Fetch any memory directly by its ID: +```python +print("Get one by ID:", m.get("a19b6caa-5d59-42ad-8c8a-e4f7118435b4")) +``` + +#### Update a Memory + +Need to fix or refine a memory? +Update it by ID and re-embed the new version. 

```python
m.update(
    "a19b6caa-5d59-42ad-8c8a-e4f7118435b4",
    {
        "memory": "User is Canadian.",
        "metadata": {
            "type": "opinion",
            "confidence": 85,
            "memory_time": "2025-05-24",
            "source": "conversation",
            "entities": ["Canadian"],
            "tags": ["happy"],
            "visibility": "private",
            "updated_at": "2025-05-19T00:00:00",
        },
    }
)
print("Updated:", m.get("a19b6caa-5d59-42ad-8c8a-e4f7118435b4"))
```

#### Delete Memories

Remove one or more memories cleanly:
```python
m.delete(["a19b6caa-5d59-42ad-8c8a-e4f7118435b4"])
print("Remaining:", m.get_all())
```

#### Dump Memories to Disk

Finally, dump all your memories to local storage:
```python
m.dump("tmp/mem")
print("Memory dumped to tmp/mem")
```
By default, your memories are saved to:
```
<dump_directory>/<memory_filename>
```
They can be reloaded anytime with `load()`.

::note
By default, your dumped memories are saved to the file path you set in your config.
Always check `config.memory_filename` if you want to customize it.
::

::

Now your agent remembers — no more stateless chatbots!

## What's Next?

Ready to level up?

- **Try your own LLM backend:** Swap to OpenAI, HuggingFace, or Ollama.
- **Explore [TreeTextMemory](/open_source/modules/memories/tree_textual_memory):** Build a graph-based,
  hierarchical memory.
- **Add [Activation Memory](/open_source/modules/memories/kv_cache_memory):** Cache key-value
  states for faster inference.
- **Dive deeper:** Check the [API Reference](/api-reference/search-memories) and [Examples](/open_source/getting_started/examples) for advanced workflows.

::note
**Try Graph Textual Memory**
Try switching to +TreeTextMemory to add a graph-based, hierarchical structure to your memories.
Perfect for scenarios that need explainability and long-term structured knowledge. +:: diff --git a/docs/en/open_source/home/architecture.md b/docs/en/open_source/home/architecture.md new file mode 100644 index 00000000..b0a3c38a --- /dev/null +++ b/docs/en/open_source/home/architecture.md @@ -0,0 +1,98 @@ +--- +title: Architecture +desc: MemOS is made up of **core modules** that work together to turn your LLM into a truly **memory-augmented system** — from orchestration to storage to retrieval. +--- + +## Core Modules + +### MOS (Memory Operating System) + +The orchestration layer of MemOS — it + manages predictive, asynchronous scheduling across multiple memory types (plaintext, activation, parametric) and orchestrates **multi-user, multi-session** memory workflows. + +MOS connects memory containers (**MemCubes**) with LLMs via a unified API for adding, searching, updating, transferring, or rolling back memories. It also supports cross-model, cross-device interoperability through a unified Memory Interchange Protocol (MIP). + +### MemCube +A modular, portable **memory container** — think of it like a flexible cartridge that can hold one or more memory types for a **user, agent, or session**. + +MemCubes can be dynamically registered, updated, or removed. They support containerized storage that is transferable across sessions, models, and devices. + +### Memories + + MemOS supports several specialized memory types for different needs: + +#### 1. **Parametric Memory**(**Coming Soon**) + +Embedded in model weights; + long-term, + high-efficiency, but hard to edit. + +#### 2. **Activation Memory** + +Runtime hidden states & KV-cache; short-term, +transient, steering dynamic behavior. + +#### 3. Plaintext Memory + +Structured or unstructured knowledge +blocks; editable, traceable, suitable for fast updates, personalization & multi-agent sharing. + +- **GeneralTextMemory:** Flexible, vector-based storage for unstructured +textual knowledge with semantic search and metadata filtering. +- **TreeTextMemory:** Hierarchical, graph-style memory for structured +knowledge — combining **tree-based hierarchy** and **cross-branch linking** for dynamic, evolving knowledge graphs. It supports long-term organization and multi-hop reasoning (often Neo4j-backed). + +::note +**Best Practice**
+Start simple with GeneralTextMemory — then scale to graph or KV-cache as your needs grow. +:: + +#### Basic Modules + +Includes chunkers, embedders, LLM connectors, parsers, and interfaces for vector/graph databases. These provide the building blocks for memory extraction, semantic embedding, storage, and retrieval. + +## Code Structure + +MemOS project is organized for clarity and plug-and-play: + +``` +src/memos/ + api/ # API definitions + chunkers/ # Text chunking utilities + configs/ # Configuration schemas + context/ # Log context + embedders/ # Embedding models + graph_dbs/ # Graph database backends (e.g., Neo4j) + vec_dbs/ # Vector database backends (e.g., Qdrant) + llms/ # LLM connectors + mem_agent/ # Deep search + mem_chat/ # Memory-augmented chat logic + mem_cube/ # MemCube management + mem_feedback # Memory feedback + mem_os/ # MOS orchestration + mem_reader/ # Memory readers + mem_scheduler/ # Memory scheduling module + memories/ # Memory type implementations + multi_mem_cube/# Multi-view Cube + parsers/ # Parsing utilities + reranker/ # Reranker module + templates/ # Prompt templates + types/ # Type definitions +``` + +::note +**Pro Tip**
+Use examples/ for quick experimentation and docs/ for module deep dives. +:: + +## Extensibility + +MemOS is **modular by design**. +Add your own memory types, storage backends, or LLM connectors with minimal changes — thanks to its **unified config and factory patterns**. + + +::note +**Pro Tip**

[Contribute](/open_source/contribution/overview) a new backend or share your custom memory
type — it’s easy to plug in.
::
diff --git a/docs/en/open_source/home/core_concepts.md b/docs/en/open_source/home/core_concepts.md
new file mode 100644
index 00000000..54c4ddaa
--- /dev/null
+++ b/docs/en/open_source/home/core_concepts.md
@@ -0,0 +1,107 @@
---
title: Core Concepts
desc: MemOS treats memory as a first-class citizen. Its core design revolves around how to orchestrate, store, retrieve, and govern memory for your LLM applications.
---

## Overview

* [MOS (Memory Operating System)](#mos-memory-operating-system)
* [MemCube](#memcube)
* [Memory Types](#memory-types)
* [Cross-Cutting Concepts](#cross-cutting-concepts)


## MOS (Memory Operating System)

**What it is:**
The orchestration layer that coordinates multiple MemCubes and memory operations. It connects your LLMs with structured, explainable memory for reasoning and planning.

**When to use:**
Use MOS whenever you need to bridge users, sessions, or agents with consistent, auditable memory workflows.

## MemCube

**What it is:**
A MemCube is like a flexible, swappable memory cartridge. Each user, session, or task can have its own MemCube, which can hold one or more memory types.

**When to use:**
Use different MemCubes to isolate, reuse, or scale your memory as your system grows.

## Memory Types

MemOS treats memory like a living system — not just static data but evolving knowledge. Here's how the three core memory types work together:

| Memory Type | Description | When to Use |
|----------------|----------------------------------------------|---------------------------------------------|
| **Parametric** | Knowledge distilled into model weights | Evergreen skills, stable domain expertise |
| **Activation** | Short-term KV cache and hidden states | Fast reuse in dialogue, multi-turn sessions |
| **Plaintext** | Text, docs, graph nodes, or vector chunks | Searchable, inspectable, evolving knowledge |

### Parametric Memory

**What:**
Knowledge embedded directly into the model's weights — think of this as the model's "cortex". It's always on, providing zero-latency reasoning.

**When to use:**
Perfect for stable domain knowledge, distilled FAQs, or skills that rarely change.

### Activation Memory

**What:**
Activation Memory is your model's reusable "working memory" — it includes precomputed key-value caches and hidden states that can be directly injected into the model's attention mechanism.
Think of it as pre-cooked context that saves your LLM from repeatedly
re-encoding static or frequently used information.

**Why it matters:**
By storing stable background content (like FAQs or known facts) in a KV-cache, your model can skip redundant computation during the prefill phase.
This dramatically reduces Time To First Token (TTFT) and improves throughput for multi-turn conversations or retrieval-augmented generation.

**When to use:**
- Reuse background knowledge across many user queries.
- Speed up chatbots that rely on the same domain context each turn.
- Combine with MemScheduler to auto-promote stable plaintext memory to KV format.

### Plaintext Memory

**What:**
Structured or unstructured knowledge units (also called explicit memory) — user-visible, explainable. These can be documents, chat logs, graph nodes, or vector embeddings.

**When to use:**
Best for semantic search, user preferences, or traceable facts that evolve over time. Supports tags, provenance, and lifecycle states, as sketched below.
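For intuition, here is a rough sketch of what a plaintext memory unit can look like. The field names follow the `TextualMemoryItem` shape used in the Getting Started guide; treat this as an illustration, not a normative schema:

```python
# Illustrative sketch of a plaintext memory unit; field names follow the
# TextualMemoryItem shape from the Getting Started guide, not a fixed schema.
plaintext_memory = {
    "memory": "The user prefers vegetarian restaurants.",
    "metadata": {
        "type": "opinion",            # fact | event | opinion
        "source": "conversation",     # provenance: where the memory came from
        "confidence": 90.0,           # certainty score (0-100)
        "tags": ["food", "preferences"],
        "visibility": "private",      # access control
        "status": "activated",        # lifecycle state (e.g. activated / merged / archived)
    },
}
```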
+ + +## How They Work Together + +MemOS lets you orchestrate all three memory types in a living loop: + +- Hot plaintext memories can be distilled into parametric weights. +- High-frequency activation paths become reusable KV templates. +- Stale parametric or activation units can be downgraded to plaintext nodes for traceability. + +With MemOS, your AI doesn't just store facts — it **remembers**, **understands**, and **grows**. + +::note +**Insight**
+ Over time, frequently used plaintext memories can be distilled into parametric form. + Rarely used weights or caches can be demoted to plaintext storage for auditing and retraining. +:: + +## Cross-Cutting Concepts + +### Hybrid Retrieval + +Combines vector similarity and graph traversal for robust, context-aware search. + +### Governance & Lifecycle + +Every memory unit supports states (active, merged, archived), provenance tracking, and fine-grained access control — essential for auditing and compliance. + +::note +**Compliance Reminder**
+Always track provenance and state changes for each memory unit. + This helps meet audit and data governance requirements. +:: + +## Key Takeaway + +With MemOS, your LLM applications gain structured, evolving memory — empowering agents to plan, reason, and adapt like never before. diff --git a/docs/en/open_source/home/memos_intro.md b/docs/en/open_source/home/memos_intro.md new file mode 100644 index 00000000..e957cb55 --- /dev/null +++ b/docs/en/open_source/home/memos_intro.md @@ -0,0 +1,117 @@ +--- +title: What is MemOS? +desc: "**MemOS** is a **Memory Operating System** for large language models (LLMs) and autonomous agents. It treats memory as a **first-class, orchestrated, and explainable resource**, rather than an opaque layer hidden inside model weights." +--- + +![MemOS Architecture](https://statics.memtensor.com.cn/memos/memos-architecture.png) + + +As LLMs advance to handle complex tasks — like multi-turn dialogue, long-term planning, decision-making, and personalized user experiences — their ability to **structure, manage, and evolve memory** becomes critical for achieving true long-term intelligence and adaptability. + +However, most mainstream LLMs still rely heavily on static parametric memory (model weights). This makes it difficult to update knowledge, track memory usage, or accumulate evolving user preferences. The result? High costs to refresh knowledge, brittle behaviors, and limited personalization. + +**MemOS** solves these challenges by redefining memory as a **core, modular system resource** with a unified structure, lifecycle management, and scheduling logic. It provides a Python-based layer that sits between your LLM and external knowledge sources, enabling **persistent, structured, and efficient memory operations**. + +With MemOS, your LLM can retain knowledge over time, manage context more robustly, and reason with memory that's explainable and auditable — unlocking more intelligent, reliable, and adaptive AI behaviors. + + +::note +**Tip**
MemOS helps bridge the gap between static parametric weights and dynamic, user-specific memory. + Think of it as your agent's "brain", with plug-and-play modules for text, graph, and activation memory. +:: + +## Why do we need a Memory OS? + +Modern LLMs are powerful—but static. +They rely heavily on **parametric memory** (the weights) that is hard to inspect, update, or share. +Typical vector search (RAG) helps retrieve external facts, but lacks unified governance, lifecycle control, or cross-agent sharing. + +**MemOS** changes this. +Think of it like an OS for memory: +just as an operating system schedules CPU, RAM, and files, MemOS **schedules, +transforms, and governs** multiple memory types — from parametric weights to ephemeral caches to plaintext, traceable knowledge. + +::note +**Insight**
MemOS helps your LLM evolve, by blending parametric, activation, and plaintext memory into a living loop. +:: + + +## Core Building Blocks +### MemCubes + +**Flexible containers** that hold one or more memory types. +Each user, session, or agent can have its own MemCube — swappable, reusable, and traceable. + +### Memory Lifecycle + +Each memory unit can flow through states like: + +- **Generated** → **Activated** → **Merged** → **Archived** → **Frozen** + +Every step is versioned with **provenance tracking** and audit logs. +Old memories can be "time-machined" back to prior versions for recovery or counterfactual simulations. + + +### Operation & Governance + +Modules like: + +- **MemScheduler** — dynamically transforms memory types for optimal reuse. +- **MemLifecycle** — manages state transitions, merging, and archiving. +- **MemGovernance** — handles access control, redaction, compliance, and audit trails. + + +::note +**Compliance Reminder**
Every memory unit carries full provenance metadata, so you can audit who created, modified, or queried it. +:: + + +## Multi-Perspective Memory + +MemOS blends **three memory forms** in a living loop: + +| Type | Description | Use Case | +|----------------| ---------------------------------------------------- | ---------------------------------------------- | +| **Parametric** | Knowledge distilled into model weights | Evergreen skills, stable domain facts | +| **Activation** | KV-caches and hidden states for inference reuse | Fast multi-turn chat, low-latency generation | +| **Plaintext** | Text, docs, graphs, vector chunks, user-visible facts| Semantic search, evolving, explainable memory | + +Over time: + +- Hot plaintext memories can be distilled into parametric weights. +- Stable context is promoted to KV-cache for rapid injection. +- Cold or outdated knowledge can be demoted for auditing. + + +## What makes MemOS different? + +- Hybrid retrieval — symbolic & semantic, vector + graph. +- Multi-agent & multi-user graphs — private and shared. +- Provenance & audit trail — every memory unit is governed and explainable. +- Automatic KV-cache promotion for stable context reuse. +- Lifecycle-aware scheduling — no more stale facts or bloated weights. + + +## Who is it for? + +- Conversational agents needing **multi-turn, evolving memory** +- Enterprise copilots handling **compliance, domain updates, and personalization** +- Multi-agent systems collaborating on a **shared knowledge graph** +- AI builders wanting modular, inspectable memory instead of black-box prompts + +## Key Takeaway + +**MemOS** upgrades your LLM from "just predicting tokens" +to an intelligent, evolving system that can **remember**, **reason**, and **adapt** — +like an operating system for your agent's mind. + +**With MemOS, your AI doesn't just store facts — it grows.** + +## Key Features + +- **Modular Memory Architecture**: Support for textual, activation (KV cache), and parametric (adapters/LoRA) memory. +- **MemCube**: Unified container for all memory types, with easy load/save and API access. +- **MOS**: Memory-augmented chat orchestration for LLMs, with plug-and-play memory modules. +- **Graph-based Backends**: Native support for Neo4j and other graph DBs for structured, explainable memory. +- **Easy Integration**: Works with HuggingFace, Ollama, and custom LLMs. +- **Extensible**: Add your own memory modules or backends. diff --git a/docs/en/open_source/home/overview.md b/docs/en/open_source/home/overview.md new file mode 100644 index 00000000..2c240a4e --- /dev/null +++ b/docs/en/open_source/home/overview.md @@ -0,0 +1,46 @@ +--- +title: MemOS Documentation +desc: Welcome to the official documentation for MemOS – a Python package designed to empower large language models (LLMs) with advanced, modular memory capabilities. +banner: https://statics.memtensor.com.cn/memos/memos-banner.gif +links: + - label: 'PyPI' + to: https://pypi.org/project/MemoryOS/ + target: _blank + avatar: + src: https://statics.memtensor.com.cn/icon/pypi.svg + alt: PyPI logo + - label: 'Open Source' + to: https://github.com/MemTensor/MemOS + target: _blank + icon: i-simple-icons-github +--- + +## What is MemOS? + +As large language models (LLMs) evolve to tackle advanced tasks—such as multi-turn dialogue, planning, decision-making, and personalized agents—their ability to manage and utilize memory becomes crucial for achieving long-term intelligence and adaptability. 
However, mainstream LLM architectures often struggle with weak memory structuring, management, and integration, leading to high knowledge update costs, unsustainable behavioral states, and difficulty in accumulating user preferences. + +**MemOS** addresses these challenges by redefining memory as a core, first-class resource with unified structure, lifecycle management, and scheduling strategies. It provides a Python package that delivers a unified memory layer for LLM-based applications, enabling persistent, structured, and efficient memory operations. This empowers LLMs with long-term knowledge retention, robust context management, and memory-augmented reasoning, supporting more intelligent and adaptive behaviors. + +![MemOS Architecture](https://statics.memtensor.com.cn/memos/memos-architecture.png) + +## Key Features + +- **Modular Memory Architecture**: Support for textual, activation (KV cache), and parametric (adapters/LoRA) memory. +- **MemCube**: Unified container for all memory types, with easy load/save and API access. +- **MOS**: Memory-augmented chat orchestration for LLMs, with plug-and-play memory modules. +- **Graph-based Backends**: Native support for Neo4j and other graph DBs for structured, explainable memory. +- **Easy Integration**: Works with HuggingFace, Ollama, and custom LLMs. +- **Extensible**: Add your own memory modules or backends. + +## Installation + +Please refer to our [installation guide](/open_source/getting_started/installation) for complete installation instructions, including basic installation, optional dependencies, and external dependencies. + +## Contributing + +We welcome contributions! Please see the [contribution guidelines](/open_source/contribution/overview) for details on setting up your environment and +submitting pull requests. + +## License + +MemOS is released under the Apache 2.0 License. diff --git a/docs/en/open_source/modules/mem_chat.md b/docs/en/open_source/modules/mem_chat.md new file mode 100644 index 00000000..073a8039 --- /dev/null +++ b/docs/en/open_source/modules/mem_chat.md @@ -0,0 +1,180 @@ +--- +title: MemChat +desc: MemChat is your "memory diplomat". It coordinates user input, memory retrieval, and LLM generation to create coherent conversations with long-term memory. +--- + +## 1. Introduction + +**MemChat** is the conversation control center of MemOS. + +It is not just a chat interface, but a bridge connecting "instant conversation" and "long-term memory". During interactions with users, MemChat is responsible for real-time retrieval of relevant background information from MemCube (Memory Cube), building context, and crystallizing new conversation content into new memories. With it, your Agent is no longer "goldfish memory", but a truly intelligent companion that can understand the past and continuously grow. + +--- + +## 2. Core Capabilities + +### Memory-Augmented Chat +Before answering user questions, MemChat automatically retrieves relevant Textual Memory from MemCube and injects it into the Prompt. This enables the Agent to answer questions based on past interaction history or knowledge bases, rather than relying solely on the LLM's pre-trained knowledge. + +### Auto-Memorization +After conversation, MemChat uses Extractor LLM to automatically extract valuable information from the conversation flow (such as user preferences, factual knowledge) and store it in MemCube. The entire process is fully automated without manual user intervention. 
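Conceptually, auto-memorization boils down to an extract-then-add loop over each finished turn. The sketch below illustrates the idea using the `GeneralTextMemory` interface from the Getting Started guide, where `m` is a configured memory instance; SimpleMemChat wires this up internally, so you do not call it yourself:

```python
# Minimal sketch of the auto-memorization loop; illustrative, not SimpleMemChat's exact code.
# `m` is a configured GeneralTextMemory instance (see the Getting Started guide).
turn = [
    {"role": "user", "content": "My birthday is on May 3rd."},
    {"role": "assistant", "content": "Got it, I'll remember that."},
]

new_memories = m.extract(turn)  # extractor LLM -> structured TextualMemoryItems
m.add(new_memories)             # embed and persist into the MemCube's textual memory
```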
+ +### Context Management +Automatically manages conversation history window (`max_turns_window`). When conversations become too long, it intelligently trims old context while relying on retrieved long-term memory to maintain conversation coherence, effectively solving the LLM Context Window limitation problem. + +### Flexible Configuration +Supports configurable toggles for different types of memory (textual memory, activation memory, etc.) to adapt to different application scenarios. + +--- + +## 3. Code Structure + +Core logic is located under `memos/src/memos/mem_chat/`. + +* **`simple.py`**: **Default implementation (SimpleMemChat)**. This is an out-of-the-box REPL (Read-Eval-Print Loop) implementation containing complete "retrieve -> generate -> store" loop logic. +* **`base.py`**: **Interface definition (BaseMemChat)**. Defines the basic behavior of MemChat, such as `run()` and `mem_cube` properties. +* **`factory.py`**: **Factory class**. Responsible for instantiating concrete MemChat objects based on configuration (`MemChatConfig`). + +--- + +## 4. Key Interface + +The main interaction entry point is the `MemChat` class (typically created by `MemChatFactory`). + +### 4.1 Initialization +You need to first create a configuration object, then create an instance through the factory method. After creation, you must mount the `MemCube` instance to `mem_chat.mem_cube`. + +### 4.2 `run()` +Starts an interactive command-line conversation loop. Suitable for development and debugging, it handles user input, calls memory retrieval, generates replies, and prints output. + +### 4.3 Properties +* **`mem_cube`**: Associated MemCube object. MemChat reads and writes memories through it. +* **`chat_llm`**: LLM instance used to generate replies. + +--- + +## 5. Workflow + +A typical conversation round in MemChat includes the following steps: + +1. **Receive Input**: Get user text input. +2. **Memory Recall**: (If `enable_textual_memory` is enabled) Use user input as Query to retrieve Top-K relevant memories from `mem_cube.text_mem`. +3. **Prompt Construction**: Concatenate system prompt, retrieved memories, and recent conversation history into a complete Prompt. +4. **Generate Response**: Call `chat_llm` to generate a reply. +5. **Memorization**: (If `enable_textual_memory` is enabled) Send this round's conversation (User + Assistant) to `mem_cube`'s extractor, extract new memories and store them in the database. + +--- + +## 6. Development Example + +Below is a complete code example showing how to configure MemChat and mount a MemCube based on Qdrant and OpenAI. 
+ +### 6.1 Code Implementation + +```python +import os +import sys + +# Ensure src module can be imported +sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), "../../../src"))) + +from memos.configs.mem_chat import MemChatConfigFactory +from memos.configs.mem_cube import GeneralMemCubeConfig +from memos.mem_chat.factory import MemChatFactory +from memos.mem_cube.general import GeneralMemCube + +def get_mem_chat_config() -> MemChatConfigFactory: + """Generate MemChat configuration""" + return MemChatConfigFactory.model_validate( + { + "backend": "simple", + "config": { + "user_id": "user_123", + "chat_llm": { + "backend": "openai", + "config": { + "model_name_or_path": os.getenv("MOS_CHAT_MODEL", "gpt-4o"), + "temperature": 0.8, + "max_tokens": 1024, + "api_key": os.getenv("OPENAI_API_KEY"), + "api_base": os.getenv("OPENAI_API_BASE"), + }, + }, + "max_turns_window": 20, + "top_k": 5, + "enable_textual_memory": True, # Enable explicit memory + }, + } + ) + +def get_mem_cube_config() -> GeneralMemCubeConfig: + """Generate MemCube configuration""" + return GeneralMemCubeConfig.model_validate( + { + "user_id": "user03alice", + "cube_id": "user03alice/mem_cube_tree", + "text_mem": { + "backend": "general_text", + "config": { + "cube_id": "user03alice/mem_cube_general", + "extractor_llm": { + "backend": "openai", + "config": { + "model_name_or_path": os.getenv("MOS_CHAT_MODEL", "gpt-4o"), + "api_key": os.getenv("OPENAI_API_KEY"), + "api_base": os.getenv("OPENAI_API_BASE"), + }, + }, + "vector_db": { + "backend": "qdrant", + "config": { + "collection_name": "user03alice_mem_cube_general", + "vector_dimension": 1024, + }, + }, + "embedder": { + "backend": os.getenv("MOS_EMBEDDER_BACKEND", "universal_api"), + "config": { + "provider": "openai", + "api_key": os.getenv("MOS_EMBEDDER_API_KEY", "EMPTY"), + "model_name_or_path": os.getenv("MOS_EMBEDDER_MODEL", "bge-m3"), + "base_url": os.getenv("MOS_EMBEDDER_API_BASE"), + }, + }, + }, + }, + } + ) + +def main(): + print("Initializing MemChat...") + mem_chat = MemChatFactory.from_config(get_mem_chat_config()) + + print("Initializing MemCube...") + mem_cube = GeneralMemCube(get_mem_cube_config()) + + # Critical step: mount the memory cube + mem_chat.mem_cube = mem_cube + + print("Starting Chat Session...") + try: + mem_chat.run() + finally: + print("Saving memory cube...") + mem_chat.mem_cube.dump("new_cube_path") + +if __name__ == "__main__": + main() +``` + +--- + +## 7. Configuration Description + +When configuring `MemChatConfigFactory`, the following parameters are crucial: + +* **`user_id`**: Required. Used to identify the current user in the conversation, ensuring memory isolation. +* **`chat_llm`**: Chat model configuration. Recommend using a capable model (such as GPT-4o) for better reply quality and instruction-following ability. +* **`enable_textual_memory`**: `True` / `False`. Whether to enable textual memory. If enabled, the system will perform retrieval before conversation and storage after conversation. +* **`max_turns_window`**: Integer. Number of conversation turns to retain in history. History beyond this limit will be truncated, relying on long-term memory to supplement context. +* **`top_k`**: Integer. How many most relevant memory fragments to retrieve from the memory library and inject into the Prompt each time. 
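To see how `top_k` plays out at runtime, the following is a simplified sketch of the recall step described in the Workflow section above; the actual prompt construction in SimpleMemChat is more elaborate, and `mem_cube` is assumed to be an attached MemCube:

```python
# Simplified sketch of memory recall before generation; not SimpleMemChat's exact code.
user_input = "Where should we go for dinner?"

# Retrieve the top_k most relevant memories for this query
relevant = mem_cube.text_mem.search(user_input, top_k=5)

# Inject them into the prompt ahead of the recent conversation window
memory_context = "\n".join(item.memory for item in relevant)
prompt = f"Known facts about the user:\n{memory_context}\n\nUser: {user_input}"
```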
diff --git a/docs/en/open_source/modules/mem_cube.md b/docs/en/open_source/modules/mem_cube.md new file mode 100644 index 00000000..09a6e265 --- /dev/null +++ b/docs/en/open_source/modules/mem_cube.md @@ -0,0 +1,285 @@ +--- +title: MemCube +desc: "`MemCube` is your memory container that manages three types of memories: textual memory, activation memory, and parametric memory. It provides a simple interface for loading, saving, and operating on multiple memory modules, making it easy to build, save, and share memory-augmented applications." +--- + +## What is MemCube? + +**MemCube** contains three major types of memory: + +- **Textual Memory**: Stores text knowledge, supporting semantic search and knowledge management. +- **Activation Memory**: Stores intermediate reasoning results, accelerating LLM responses. +- **Parametric Memory**: Stores model adaptation weights, used for personalization. + +Each memory type can be independently configured and flexibly combined based on application needs. + +## Structure + +MemCube is defined by a configuration (see `GeneralMemCubeConfig`), which specifies the backend and settings for each memory type. The typical structure is: + +``` +MemCube + ├── user_id + ├── cube_id + ├── text_mem: TextualMemory + ├── act_mem: ActivationMemory + └── para_mem: ParametricMemory +``` + +All memory modules are accessible via the MemCube interface: + +- `mem_cube.text_mem` +- `mem_cube.act_mem` +- `mem_cube.para_mem` + +## View Architecture + +Starting from MemOS 2.0, runtime operations (add/search) should go through the **View architecture**: + +### SingleCubeView + +Use this to manage a single MemCube. When you only need one memory space. + +```python +from memos.multi_mem_cube.single_cube import SingleCubeView + +view = SingleCubeView( + cube_id="my_cube", + naive_mem_cube=naive_mem_cube, + mem_reader=mem_reader, + mem_scheduler=mem_scheduler, + logger=logger, + searcher=searcher, + feedback_server=feedback_server, # Optional +) + +# Add memories +view.add_memories(add_request) + +# Search memories +view.search_memories(search_request) +``` + +### CompositeCubeView + +Use this to manage multiple MemCubes. When you need unified operations across multiple memory spaces. + +```python +from memos.multi_mem_cube.composite_cube import CompositeCubeView + +# Create multiple SingleCubeViews +view1 = SingleCubeView(cube_id="cube_1", ...) +view2 = SingleCubeView(cube_id="cube_2", ...) + +# Composite view for multi-cube operations +composite = CompositeCubeView(cube_views=[view1, view2], logger=logger) + +# Search across all cubes +results = composite.search_memories(search_request) +# Results contain cube_id field to identify source +``` + +## API Request Fields + +When using the View architecture for add/search operations, specify these parameters: + +| Field | Type | Description | +| :--- | :--- | :--- | +| `writable_cube_ids` | `list[str]` | Target cubes for add operations. Can specify multiple; the system will write to all targets in parallel. | +| `readable_cube_ids` | `list[str]` | Target cubes for search operations. Can search across multiple cubes; results include source information. | +| `async_mode` | `str` | Execution mode: `"sync"` for synchronous processing (wait for results), `"async"` for asynchronous processing (push to background queue, return task ID immediately). | + +## Core Methods (`GeneralMemCube`) + +**GeneralMemCube** is the standard implementation of MemCube, managing all system memories through a unified interface. 
Here are the main methods to complete memory lifecycle management. + +### Initialization + +```python +from memos.mem_cube.general import GeneralMemCube +mem_cube = GeneralMemCube(config) +``` + +### Static Data Operations + +| Method | Description | +| :--- | :--- | +| `init_from_dir(dir)` | Load a MemCube from a local directory | +| `init_from_remote_repo(repo, base_url)` | Load a MemCube from a remote repository (e.g., Hugging Face) | +| `load(dir)` | Load all memories from a directory into the existing instance | +| `dump(dir)` | Save all memories to a directory for persistence | + +## File Structure + +A MemCube directory contains the following files, with each file corresponding to a memory type: + +- `config.json` (MemCube configuration) +- `textual_memory.json` (textual memory) +- `activation_memory.pickle` (activation memory) +- `parametric_memory.adapter` (parametric memory) + +## Usage Examples + +### Export Example (dump_cube.py) + +```python +import json +import os +import shutil + +from memos.api.handlers import init_server +from memos.api.product_models import APIADDRequest +from memos.log import get_logger +from memos.multi_mem_cube.single_cube import SingleCubeView + +logger = get_logger(__name__) +EXAMPLE_CUBE_ID = "example_dump_cube" +EXAMPLE_USER_ID = "example_user" + +# 1. Initialize server +components = init_server() +naive = components["naive_mem_cube"] + +# 2. Create SingleCubeView +view = SingleCubeView( + cube_id=EXAMPLE_CUBE_ID, + naive_mem_cube=naive, + mem_reader=components["mem_reader"], + mem_scheduler=components["mem_scheduler"], + logger=logger, + searcher=components["searcher"], + feedback_server=components["feedback_server"], +) + +# 3. Add memories via View +result = view.add_memories(APIADDRequest( + user_id=EXAMPLE_USER_ID, + writable_cube_ids=[EXAMPLE_CUBE_ID], + messages=[ + {"role": "user", "content": "This is a test memory"}, + {"role": "user", "content": "Another memory to persist"}, + ], + async_mode="sync", # Use sync mode to ensure immediate completion +)) +print(f"✓ Added {len(result)} memories") + +# 4. Export data for the specific cube_id +output_dir = "tmp/mem_cube_dump" +if os.path.exists(output_dir): + shutil.rmtree(output_dir) +os.makedirs(output_dir, exist_ok=True) + +# Export graph data (only data for the current cube_id) +json_data = naive.text_mem.graph_store.export_graph( + include_embedding=True, # Include embeddings to support semantic search + user_name=EXAMPLE_CUBE_ID, # Filter by cube_id +) + +# Fix embedding format: parse string to list for import compatibility +import contextlib +for node in json_data.get("nodes", []): + metadata = node.get("metadata", {}) + if "embedding" in metadata and isinstance(metadata["embedding"], str): + with contextlib.suppress(json.JSONDecodeError): + metadata["embedding"] = json.loads(metadata["embedding"]) + +print(f"✓ Exported {len(json_data.get('nodes', []))} nodes") + +# Save to file +memory_file = os.path.join(output_dir, "textual_memory.json") +with open(memory_file, "w", encoding="utf-8") as f: + json.dump(json_data, f, indent=2, ensure_ascii=False) +print(f"✓ Saved to: {memory_file}") +``` + +### Import and Search Example (load_cube.py) + +> **Embedding Compatibility Note**: The sample data uses the **bge-m3** model with **1024 dimensions**. If your environment uses a different embedding model or dimension, semantic search after import may be inaccurate or fail. Ensure your `.env` configuration matches the embedding settings used during export. 
+ +```python +import json +import os + +from memos.api.handlers import init_server +from memos.api.product_models import APISearchRequest +from memos.log import get_logger +from memos.multi_mem_cube.single_cube import SingleCubeView + +logger = get_logger(__name__) +EXAMPLE_CUBE_ID = "example_dump_cube" +EXAMPLE_USER_ID = "example_user" + +# 1. Initialize server +components = init_server() +naive = components["naive_mem_cube"] + +# 2. Create SingleCubeView +view = SingleCubeView( + cube_id=EXAMPLE_CUBE_ID, + naive_mem_cube=naive, + mem_reader=components["mem_reader"], + mem_scheduler=components["mem_scheduler"], + logger=logger, + searcher=components["searcher"], + feedback_server=components["feedback_server"], +) + +# 3. Load data from file into graph_store +load_dir = "examples/data/mem_cube_tree" +memory_file = os.path.join(load_dir, "textual_memory.json") + +with open(memory_file, encoding="utf-8") as f: + json_data = json.load(f) + +naive.text_mem.graph_store.import_graph(json_data, user_name=EXAMPLE_CUBE_ID) + +nodes = json_data.get("nodes", []) +print(f"✓ Imported {len(nodes)} nodes") + +# 4. Display loaded data +print(f"\nLoaded {len(nodes)} memories:") +for i, node in enumerate(nodes[:3], 1): # Show first 3 + metadata = node.get("metadata", {}) + memory_text = node.get("memory", "N/A") + mem_type = metadata.get("memory_type", "unknown") + print(f" [{i}] Type: {mem_type}") + print(f" Content: {memory_text[:60]}...") + +# 5. Semantic search verification +query = "test memory dump persistence demonstration" +print(f'\nSearching: "{query}"') + +search_result = view.search_memories( + APISearchRequest( + user_id=EXAMPLE_USER_ID, + readable_cube_ids=[EXAMPLE_CUBE_ID], + query=query, + ) +) + +text_mem_results = search_result.get("text_mem", []) +memories = [] +for group in text_mem_results: + memories.extend(group.get("memories", [])) + +print(f"✓ Found {len(memories)} relevant memories") +for i, mem in enumerate(memories[:2], 1): # Show first 2 + print(f" [{i}] {mem.get('memory', 'N/A')[:60]}...") +``` + +### Complete Examples + +See examples in the code repository: + +- `MemOS/examples/mem_cube/dump_cube.py` - Export MemCube data (add + export) +- `MemOS/examples/mem_cube/load_cube.py` - Import MemCube data and perform semantic search (import + search) + +### Legacy API Notes + +The old approach of directly calling `mem_cube.text_mem.get_all()` is deprecated. Please use the View architecture. Legacy examples have been moved to `MemOS/examples/mem_cube/_deprecated/`. + +## Developer Notes + +* MemCube enforces schema consistency to ensure safe loading and dumping +* Each memory type can be independently configured, tested, and extended +* See `/tests/mem_cube/` for integration tests and usage examples diff --git a/docs/en/open_source/modules/mem_feedback.md b/docs/en/open_source/modules/mem_feedback.md new file mode 100644 index 00000000..71dd682a --- /dev/null +++ b/docs/en/open_source/modules/mem_feedback.md @@ -0,0 +1,152 @@ +--- +title: MemFeedback +desc: MemFeedback is your "memory error notebook". It enables your Agent to understand 'You remembered it wrong' and automatically correct the memory database. It is a key component for achieving self-evolving memory. +--- + +## 1. Introduction + +**MemFeedback** is the "regret medicine" for MemOS. + +In long-term memory systems, the biggest headache is often not "forgetting," but "remembering wrong and unable to change." 
When a user says, "No, my birthday is tomorrow" or "Change the project code to X," a plain RAG system is usually helpless.
+
+MemFeedback understands these natural-language instructions, automatically locates the conflicting memories in the database, and executes atomic correction operations (such as archiving the old memory and writing the new one). With it, your Agent can correct its own mistakes and keep learning during interactions, much like a human.
+
+---
+
+## 2. Core Capabilities
+
+It handles four common feedback scenarios:
+
+### Correction
+When the user points out a factual error, the system does not hard-delete the old data. Instead, it **archives** the old memory and writes the corrected one, fixing the error while preserving version history (traceability). If the error is in an ongoing conversation (WorkingMemory), it updates in place to keep the context continuous.
+
+### Addition
+If the user simply supplements new information that does not conflict with existing memories, the system saves it directly as a new node in the memory database.
+
+### Keyword Replacement (Global Refactor)
+Similar to "Global Refactor" in an IDE. For example, if the user says, "Change 'Zhang San' to 'Li Si' in all documents," the system uses the Reranker to determine the set of affected documents and updates all relevant memories in batches.
+
+### Preference Evolution
+Handles preferences such as "I don't eat cilantro" or "I like Python." The system records the context in which each preference arose, continuously enriching the user profile so the Agent becomes better attuned to the user.
+
+---
+
+## 3. Code Structure
+
+The core logic is located under `memos/src/memos/mem_feedback/`.
+
+* **`simple_feedback.py`**: **Recommended entry point**. The officially packaged version that wires together the LLM, vector database, and searcher, ready to use out of the box.
+* **`feedback.py`**: Core implementation class `MemFeedback`. The heavy lifting happens here: intent recognition, conflict comparison, and safety checks.
+* **`base.py`**: Interface definitions.
+* **`utils.py`**: Shared utility helpers.
+
+---
+
+## 4. Key Interface
+
+There is only one main entry point: `process_feedback()`. It is usually called asynchronously after the RAG flow finishes and the user gives feedback.
+
+### 4.1 Input Parameters
+
+| Parameter | Description |
+| :--- | :--- |
+| `user_id` / `user_name` | User identification and Cube ID. |
+| `chat_history` | Conversation history, giving the LLM the context of the dialogue. |
+| `feedback_content` | The user's feedback sentence (e.g., "No, it's 5 o'clock"). |
+| **`retrieved_memory_ids`** | **Strongly recommended**. Pass in the memory IDs retrieved in the previous RAG round. This gives the system a "target," telling it which memories to correct. If omitted, the system has to search the entire memory store again, which is slower and more error-prone. |
+| `corrected_answer` | Whether to also generate a corrected response for the user. |
+
+### 4.2 Output Result
+
+Returns a dictionary describing what changed in this operation:
+* **`record`**: Database change details (e.g., `{ "add": [...], "update": [...] }`).
+* **`answer`**: Natural-language response to the user.
+
+---
+
+## 5. Workflow
+
+The workflow of MemFeedback is like a rigorous editorial office:
+
+1. **Review (Intent Recognition)**: First, determine whether the user is correcting an error, adding information, or renaming something.
+2. **Locate (Recall)**: Find the memories to be modified (skipped if you passed `retrieved_memory_ids`).
+3. 
**Proofread (Comparison)**: The LLM carefully compares the new and old information to determine whether it is entirely new content (ADD) or a revision of an existing memory (UPDATE).
+4. **Risk Control (Security Check)**: Prevents the LLM from making reckless changes. For example: is the target ID valid? Is it about to wipe out an entire long document? Suspicious operations are intercepted by thresholds.
+5. **Publish (Write)**: Finally, executes the graph database operations: archive the old memory and write the new one.
+
+---
+
+## 6. Development Example
+
+Here is a condensed example showing how to initialize the service, preset an "incorrect memory," and then correct it through user feedback.
+
+### 6.1 Preparation
+
+First, initialize the `SimpleMemFeedback` service.
+
+```python
+# Assuming components like llm, embedder, graph_db are initialized via Factory
+# For complete initialization code, please refer to examples/mem_feedback/example_feedback.py
+
+from memos.mem_feedback.simple_feedback import SimpleMemFeedback
+
+feedback_server = SimpleMemFeedback(
+    llm=llm,
+    embedder=embedder,
+    graph_store=graph_db,
+    memory_manager=memory_manager,
+    mem_reader=mem_reader,
+    searcher=searcher,
+    reranker=mem_reranker,
+    pref_mem=None,
+)
+```
+
+### 6.2 Simulate Scenario and Execute Feedback
+
+Scenario: the system incorrectly remembers "You like apples, dislike bananas," and we want to correct it.
+
+```python
+import json
+from memos.mem_feedback.utils import make_mem_item
+
+# 1. Simulate chat history:
+# the user asks for preferences and the assistant answers wrongly
+history = [
+    {"role": "user", "content": "What fruits do I like and dislike?"},
+    {"role": "assistant", "content": "You like apples, dislike bananas."},
+]
+
+# 2. Preset the "incorrect memory"
+# by manually inserting an incorrect fact into the database
+mem_text = "You like apples, dislike bananas"
+# ... (Omitted detailed parameters of make_mem_item, see source code) ...
+memory_manager.add([make_mem_item(mem_text, ...)], ...)
+
+# 3. User feedback
+feedback_content = "Wrong, actually I like mangosteens."
+print(f"Feedback Input: {feedback_content}")
+
+# 4. Execute the correction
+# MemFeedback detects the conflict, archives the old memory, and writes the new one ("likes mangosteens")
+res = feedback_server.process_feedback(
+    ...,
+    chat_history=history,
+    feedback_content=feedback_content,
+    ...
+)
+
+# 5. View the result
+print(json.dumps(res, indent=4))
+```
+
+---
+
+## 7. Configuration Description
+
+To make MemFeedback work, you need to configure the following components (usually in `.env` or YAML):
+
+* **LLM (`extractor_llm`)**: Needs a capable model; GPT-4o-level models are recommended. Keep the temperature low (e.g., 0), since the task is logical analysis and should not be too divergent.
+* **Embedder (`embedder`)**: Converts new memories into vectors.
+* **GraphDB (`graph_db`)**: The graph database that determines where and how memories are stored.
+* **MemReader (`mem_reader`)**: Parses memories that are entirely new (pure additions).
diff --git a/docs/en/open_source/modules/mem_reader.md b/docs/en/open_source/modules/mem_reader.md
new file mode 100644
index 00000000..0b5294b8
--- /dev/null
+++ b/docs/en/open_source/modules/mem_reader.md
@@ -0,0 +1,181 @@
+---
+title: "MemReader"
+desc: MemReader is your "memory translator". It translates messy user inputs (chat, documents, images) into structured memory fragments the system can understand.
+---
+
+## 1. Overview
+
+When building AI applications, we often run into this problem: users send all kinds of things—casual chat messages, PDF documents, and images. 
**MemReader** turns these raw inputs (Raw Data) into standard memory blocks (Memory Items) with embeddings and metadata by "chewing" and "digesting" them. + +In short, it does three things: +1. **Normalization**: Whether you send a string or JSON, it first converts everything into a standard format. +2. **Chunking**: It splits long conversations or documents into appropriately sized chunks for downstream processing. +3. **Extraction**: It calls an LLM to extract unstructured information into structured knowledge points (Fine mode), or directly generates snapshots (Fast mode). + +--- + +## 2. Core Modes + +MemReader provides two modes, corresponding to the needs for "speed" and "accuracy": + +### ⚡ Fast Mode (speed first) +* **Characteristics**: **Does not call an LLM**, only performs chunking and embeddings. +* **Use cases**: + * Users are sending messages quickly and the system needs millisecond-level responses. + * You only need to keep "snapshots" of the conversation, without deep understanding. +* **Output**: raw text chunks + vector index + provenance tracking (Sources). + +### 🧠 Fine Mode (carefully crafted) +* **Characteristics**: **Calls an LLM** for deeper analysis. +* **Use cases**: + * Long-term memory writing (needs key facts extracted). + * Document analysis (needs core ideas summarized). + * Multimodal understanding (needs to understand what's in an image). +* **Output**: structured facts + key information extraction (Key) + background (Background) + vector index + provenance tracking (Sources) + multimodal details. + +--- + +## 3. Code Structure + +MemReader's code structure is straightforward and mainly includes: + +* **`base.py`**: defines the interface contract that all Readers must follow. +* **`simple_struct.py`**: **the most commonly used implementation**. Focuses on pure-text conversations and local documents; lightweight and efficient. +* **`multi_modal_struct.py`**: **an all-rounder**. Handles images, file URLs, tool calls, and other complex inputs. +* **`read_multi_modal/`**: contains various parsers, such as `ImageParser` for images and `FileParser` for files. + +--- + +## 4. How to Choose? + +| Your need | Recommended choice | Why | +| :--- | :--- | :--- | +| **Only process plain text chats** | `SimpleStructMemReader` | Simple, direct, and performant. | +| **Need to handle images and file links** | `MultiModalStructMemReader` | Built-in multimodal parsing. | +| **Upgrade from Fast to Fine** | Any Reader's `fine_transfer` method | Supports a progressive "store first, refine later" strategy. | + +--- + +## 5. API Overview + +### Unified Factory: `MemReaderFactory` + +Don't instantiate readers directly; using the factory pattern is best practice: + +```python +from memos.configs.mem_reader import MemReaderConfigFactory +from memos.mem_reader.factory import MemReaderFactory + +# Create a Reader from configuration +cfg = MemReaderConfigFactory.model_validate({...}) +reader = MemReaderFactory.from_config(cfg) +``` + +### Core Method: `get_memory()` + +This is the method you will call most often. + +```python +memories = reader.get_memory( + scene_data, # your input data + type="chat", # type: chat or doc + info=user_info, # user info (user_id, session_id) + mode="fine" # mode: fast or fine (highly recommended to specify explicitly!) +) +``` + +**Return value**: `list[list[TextualMemoryItem]]` + +:::note +Why a nested list? +Because a long conversation may be split into multiple windows (Window). 
The outer list represents windows, and the inner list represents memory items extracted from that window. +::: + +--- + +## 6. Practical Development + +### Scenario 1: Processing simple chat logs + +This is the most basic usage, with `SimpleStructMemReader`. + +```python +# 1. Prepare input: standard OpenAI-style conversation format +conversation = [ + [ + {"role": "user", "content": "I have a meeting tomorrow at 3pm"}, + {"role": "assistant", "content": "What is the meeting about?"}, + {"role": "user", "content": "Discussing the Q4 project deadline"}, + ] +] + +# 2. Extract memory (Fine mode) +memories = reader.get_memory( + conversation, + type="chat", + mode="fine", + info={"user_id": "u1", "session_id": "s1"} +) + +# 3. Result +# memories will include extracted facts, e.g., "User has a meeting tomorrow at 3pm about the Q4 project deadline" +``` + +### Scenario 2: Processing multimodal inputs + +When users send images or file links, switch to `MultiModalStructMemReader`. + +```python +# 1. Prepare input: a complex message containing files and images +scene_data = [ + [ + { + "role": "user", + "content": [ + {"type": "text", "text": "Check this file and image"}, + # Files support automatic download and parsing via URL + {"type": "file", "file": {"file_data": "https://example.com/readme.md"}}, + # Images support URL + {"type": "image_url", "image_url": {"url": "https://example.com/chart.png"}}, + ] + } + ] +] + +# 2. Extract memory +memories = multimodal_reader.get_memory( + scene_data, + type="chat", + mode="fine", # Only Fine mode invokes the vision model to parse images + info={"user_id": "u1", "session_id": "s1"} +) +``` + +### Scenario 3: Progressive optimization (Fine Transfer) + +For better UX, you can first store the conversation quickly in Fast mode, then "refine" it into Fine memories when the system is idle. + +```python +# 1. Store quickly first (millisecond-level) +fast_memories = reader.get_memory(conversation, mode="fast", ...) + +# ... store into the database ... + +# 2. Refine asynchronously in the background +refined_memories = reader.fine_transfer_simple_mem( + fast_memories_flat_list, # Note: pass a flattened list of Items here + type="chat" +) + +# 3. Replace the original fast_memories with refined_memories +``` + +--- + +## 7. Configuration Notes + +In `.env` or configuration files, you can adjust these key parameters: + +* **`chat_window_max_tokens`**: **sliding window size**. Default is 1024. It determines how much context is packed together for processing. Too small may lose context; too large may exceed the LLM token limit. +* **`remove_prompt_example`**: **whether to remove examples from the prompt**. True = save tokens but may reduce extraction quality; False = keep few-shot examples for better accuracy but consume more tokens. +* **`direct_markdown_hostnames`** (multimodal only): **hostname allowlist**. If a file URL's hostname is in this list (e.g., `raw.githubusercontent.com`), the Reader treats it as Markdown text directly instead of trying OCR or conversion, which is more efficient. diff --git a/docs/en/open_source/modules/mem_scheduler.md b/docs/en/open_source/modules/mem_scheduler.md new file mode 100644 index 00000000..147663ac --- /dev/null +++ b/docs/en/open_source/modules/mem_scheduler.md @@ -0,0 +1,495 @@ +--- +title: "MemScheduler" +desc: MemScheduler is your "memory organization scheduler". 
It asynchronously manages memory flow and updates in the background, coordinating interactions between working memory, long-term memory, and activation memory, enabling conversational systems to dynamically organize and utilize memories. +--- + +## Key Features + +- 🚀 **Concurrent operation with MemOS system**: Runs in independent threads/processes without blocking main business logic. +- 🧠 **Multi-memory coordination**: Intelligently manages the flow of working memory, long-term memory, and user-personalized memory. +- ⚡ **Event-driven scheduling**: Asynchronous task distribution based on message queues (Redis/Local). +- 🔍 **Efficient retrieval**: Integrated vector and graph retrieval for quick location of relevant memories. +- 📊 **Comprehensive monitoring**: Real-time monitoring of memory utilization, task queue status, and scheduling latency. +- 📝 **Detailed logging**: Full-chain tracing of memory operations for debugging and system analysis. + +## MemScheduler Architecture + +`MemScheduler` adopts a three-layer modular architecture: + +### Scheduling Layer (Core) +1. **Scheduler (Router)**: Intelligent message router that dispatches tasks to corresponding handlers based on message types (e.g., `QUERY`, `ANSWER`, `MEM_UPDATE`). +2. **Message Processing**: Event-driven business logic through messages with specific labels, defining message formats and processing rules. + +### Execution Layer (Guarantee) +3. **Task Queue**: Supports both Redis Stream (production) and Local Queue (development/testing) modes, providing asynchronous task buffering and persistence. +4. **Memory Management**: Executes read/write, compression, forgetting, and type conversion operations on three-layer memory (Working/Long-term/User). +5. **Retrieval System**: Hybrid retrieval module combining user intent, scenario management, and keyword matching for quick memory location. + +### Support Layer (Auxiliary) +6. **Monitoring**: Tracks task accumulation, processing latency, and memory health status. +7. **Logging**: Maintains full-chain memory operation logs for debugging and analysis. + +## MemScheduler Initialization + +In the MemOS architecture, `MemScheduler` is initialized as part of the server components during startup. + +### Initialization in Server Router + +In `src/memos/api/routers/server_router.py`, the scheduler is automatically loaded through the `init_server()` function: + +```python +from memos.api import handlers +from memos.api.handlers.base_handler import HandlerDependencies +from memos.mem_scheduler.base_scheduler import BaseScheduler +from memos.mem_scheduler.utils.status_tracker import TaskStatusTracker + +# ... other imports ... + +# 1. Initialize all server components (including DB, LLM, Memory, Scheduler) +# init_server() reads environment variables and initializes global singleton components +components = handlers.init_server() + +# Create dependency container for handlers +dependencies = HandlerDependencies.from_init_server(components) + +# Initialize handlers... +# search_handler = SearchHandler(dependencies) +# ... + +# 2. Get the scheduler instance from the components dictionary +# The scheduler is already initialized and started inside init_server (if enabled) +mem_scheduler: BaseScheduler = components["mem_scheduler"] + +# 3. Users can also get other scheduling-related components from components (optional, for custom task handling) +# redis_client is used for direct Redis operations or monitoring task status +redis_client = components["redis_client"] +# ... 
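+# Other components initialized by init_server() are exposed through the same
+# dictionary and are used throughout these docs, for example:
+# components["mem_reader"], components["searcher"], components["feedback_server"]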
+``` + +## Scheduling Tasks and Data Models + +The scheduler distributes and executes tasks through a message-driven approach. This section introduces supported task types, message structures, and execution logs. + +### Message Types and Handlers + +The scheduler dispatches and executes tasks by registering specific task labels (Label) with handlers (Handler). The following are the default supported scheduling tasks in the current version (based on `GeneralScheduler` and `OptimizedScheduler`): + +| Message Label | Constant | Handler Method | Description | +| :--- | :--- | :--- | :--- | +| `query` | `QUERY_TASK_LABEL` | `_query_message_consumer` | Processes user queries, triggers intent recognition, memory retrieval, and converts them to memory update tasks. | +| `answer` | `ANSWER_TASK_LABEL` | `_answer_message_consumer` | Processes AI responses and logs conversations. | +| `mem_update` | `MEM_UPDATE_TASK_LABEL` | `_memory_update_consumer` | Core task. Executes the long-term memory update process, including extracting Query Keywords, updating Monitor, retrieving relevant memories, and replacing Working Memory. | +| `add` | `ADD_TASK_LABEL` | `_add_message_consumer` | Handles logging of new memory additions (supports local and cloud logs). | +| `mem_read` | `MEM_READ_TASK_LABEL` | `_mem_read_message_consumer` | Deep processing and importing external memory content using `MemReader`. | +| `mem_organize` | `MEM_ORGANIZE_TASK_LABEL` | `_mem_reorganize_message_consumer` | Triggers memory reorganization and merge operations. | +| `pref_add` | `PREF_ADD_TASK_LABEL` | `_pref_add_message_consumer` | Handles extraction and addition of user preference memory (Preference Memory). | +| `mem_feedback` | `MEM_FEEDBACK_TASK_LABEL` | `_mem_feedback_message_consumer` | Processes user feedback for correcting or reinforcing preferences. | +| `api_mix_search` | `API_MIX_SEARCH_TASK_LABEL` | `_api_mix_search_message_consumer` | (OptimizedScheduler only) Executes asynchronous hybrid search tasks combining fast and fine retrieval. | + +### Message Data Structure (ScheduleMessageItem) + +The scheduler uses a unified `ScheduleMessageItem` structure to pass messages in the queue. + +> **Note**: The `mem_cube` object itself is not directly included in the message model; instead, it is resolved by the scheduler at runtime through `mem_cube_id`. + +| Field | Type | Description | Default/Remarks | +| :--- | :--- | :--- | :--- | +| `item_id` | `str` | Unique message identifier (UUID) | Auto-generated | +| `user_id` | `str` | Associated user ID | (Required) | +| `mem_cube_id` | `str` | Associated Memory Cube ID | (Required) | +| `label` | `str` | Task label (e.g., `query`, `mem_update`) | (Required) | +| `content` | `str` | Message payload (typically JSON string or text) | (Required) | +| `timestamp` | `datetime` | Message submission time | Auto-generated (UTC now) | +| `session_id` | `str` | Session ID for context isolation | `""` | +| `trace_id` | `str` | Trace ID for full-chain log association | Auto-generated | +| `user_name` | `str` | User display name | `""` | +| `task_id` | `str` | Business-level task ID (for associating multiple messages) | `None` | +| `info` | `dict` | Additional custom context information | `None` | +| `stream_key` | `str` | (Internal use) Redis Stream key name | `""` | + +### Execution Log Structure (ScheduleLogForWebItem) + +The scheduler generates structured log messages for frontend display or persistent storage. 
+
+| Field | Type | Description | Remarks |
+| :--- | :--- | :--- | :--- |
+| `item_id` | `str` | Unique log entry identifier | Auto-generated |
+| `task_id` | `str` | Associated parent task ID | Optional |
+| `user_id` | `str` | User ID | (Required) |
+| `mem_cube_id` | `str` | Memory Cube ID | (Required) |
+| `label` | `str` | Log category (e.g., `addMessage`, `addMemory`) | (Required) |
+| `log_content` | `str` | Brief log description text | (Required) |
+| `from_memory_type` | `str` | Source memory area | e.g., `UserInput`, `LongTermMemory` |
+| `to_memory_type` | `str` | Destination memory area | e.g., `WorkingMemory` |
+| `memcube_log_content` | `list[dict]` | Structured detailed content | Contains specific memory text, reference IDs, etc. |
+| `metadata` | `list[dict]` | Memory item metadata | Contains confidence, status, tags, etc. |
+| `status` | `str` | Task status | e.g., `completed`, `failed` |
+| `timestamp` | `datetime` | Log creation time | Auto-generated |
+| `current_memory_sizes` | `MemorySizes` | Current memory quantity snapshot for each area | For monitoring dashboard display |
+| `memory_capacities` | `MemoryCapacities` | Memory capacity limits for each area | For monitoring dashboard display |
+
+## Scheduling Function Examples
+
+### 1. Message Processing and Custom Handlers
+
+The scheduler's most powerful feature is support for registering custom message handlers. You can define specific message types (e.g., `MY_CUSTOM_TASK`) and write functions to handle them.
+
+```python
+import time
+import uuid
+from datetime import datetime
+
+# 1. Import necessary type definitions and scheduler instance
+# Note: mem_scheduler needs to be imported from server_router as it's a global singleton
+from memos.api.routers.server_router import mem_scheduler
+from memos.mem_scheduler.schemas.message_schemas import ScheduleMessageItem
+
+# Define a custom task label
+MY_TASK_LABEL = "MY_CUSTOM_TASK"
+
+
+# Define a handler function
+def my_task_handler(messages: list[ScheduleMessageItem]):
+    """
+    Function to handle custom tasks
+    """
+    for msg in messages:
+        print(f"⚡️ [Handler] Received task: {msg.item_id}")
+        print(f"📦 Content: {msg.content}")
+        # Execute your business logic here, e.g., call LLM, write to database, trigger other tasks, etc.
+
+
+# 2. Register the handler to the scheduler
+# This step mounts your custom logic to the scheduling system
mem_scheduler.register_handlers({
+    MY_TASK_LABEL: my_task_handler
+})
+
+# 3. Submit a task
+task = ScheduleMessageItem(
+    item_id=str(uuid.uuid4()),
+    user_id="user_123",
+    mem_cube_id="cube_001",
+    label=MY_TASK_LABEL,
+    content="This is a test message",
+    timestamp=datetime.now()
+)
+
+# If the scheduler has not been started yet, the task waits in the queue;
+# in local queue mode you may need to call mem_scheduler.start() first
+mem_scheduler.submit_messages([task])
+
+print(f"Task submitted: {task.item_id}")
+
+# Give the background scheduler time to process the task before the script exits
+time.sleep(10)
+```
+
+### 2. Redis Queue vs Local Queue
+
+- **Local Queue**:
+  - **Use case**: Unit tests, simple single-machine scripts.
+  - **Characteristics**: Fast, but data is lost after process restart; does not support multi-process/multi-instance sharing.
+  - **Configuration**: `MOS_SCHEDULER_USE_REDIS_QUEUE=false`
+
+- **Redis Queue (Redis Stream)**:
+  - **Use case**: Production environment, distributed deployment.
+  - **Characteristics**: Data persistence, supports consumer groups allowing multiple scheduler instances to handle tasks together (load balancing). 
+ - **Configuration**: `MOS_SCHEDULER_USE_REDIS_QUEUE=true` + - **Debugging**: Use the `show_redis_status.py` script to check queue accumulation. + +## Comprehensive Application Scenarios + +### Scenario 1: Basic Conversation Flow and Memory Update + +The following is a complete example demonstrating how to initialize the environment, register custom logic, simulate conversation flow, and trigger memory updates. + +```python +import asyncio +import json +import os +import sys +import time +from pathlib import Path + +# --- Environment Setup --- +# 1. Add project root to sys.path to ensure memos module can be imported +FILE_PATH = Path(__file__).absolute() +BASE_DIR = FILE_PATH.parent.parent.parent +sys.path.insert(0, str(BASE_DIR)) + +# 2. Set necessary environment variables (simulating .env configuration) +os.environ["ENABLE_CHAT_API"] = "true" +os.environ["MOS_ENABLE_SCHEDULER"] = "true" +# Choose between Redis or Local queue +os.environ["MOS_SCHEDULER_USE_REDIS_QUEUE"] = "false" + +# --- Import Components --- +# Note: Importing server_router triggers component initialization, +# ensure environment variables are set before this import +from memos.api.product_models import APIADDRequest, ChatPlaygroundRequest +from memos.api.routers.server_router import ( + add_handler, + chat_stream_playground, + mem_scheduler, # mem_scheduler here is already an initialized singleton +) +from memos.log import get_logger +from memos.mem_scheduler.schemas.message_schemas import ScheduleMessageItem +from memos.mem_scheduler.schemas.task_schemas import ( + MEM_UPDATE_TASK_LABEL, + QUERY_TASK_LABEL, +) + +logger = get_logger(__name__) + +# Global variable for demonstrating memory retrieval results +working_memories = [] + +# --- Custom Handlers --- + +def custom_query_handler(messages: list[ScheduleMessageItem]): + """ + Handle user query messages: + 1. Print query content + 2. Convert message to MEM_UPDATE task, triggering memory retrieval/update process + """ + for msg in messages: + print(f"\n[Scheduler 🟢] Received user query: {msg.content}") + + # Copy message and change label to MEM_UPDATE, a common "task chaining" pattern + new_msg = msg.model_copy(update={"label": MEM_UPDATE_TASK_LABEL}) + + # Submit new task back to scheduler + mem_scheduler.submit_messages([new_msg]) + + +def custom_mem_update_handler(messages: list[ScheduleMessageItem]): + """ + Handle memory update tasks: + 1. Use retriever to find relevant memories + 2. Update global working memory list + """ + global working_memories + search_args = {} + top_k = 2 + + for msg in messages: + print(f"[Scheduler 🔵] Retrieving memories for query...") + # Call core retrieval functionality + results = mem_scheduler.retriever.search( + query=msg.content, + user_id=msg.user_id, + mem_cube_id=msg.mem_cube_id, + mem_cube=mem_scheduler.current_mem_cube, + top_k=top_k, + method=mem_scheduler.search_method, + search_args=search_args, + ) + + # Simulate working memory update + working_memories.extend(results) + working_memories = working_memories[-5:] # Keep the latest 5 + + for mem in results: + # Print retrieved memory fragments + print(f" ↳ [Memory Found]: {mem.memory[:50]}...") + +# --- Mock Business Data --- + +def get_mock_data(): + """Generate mock conversation data""" + conversations = [ + {"role": "user", "content": "I just adopted a golden retriever puppy named Max."}, + {"role": "assistant", "content": "That's exciting! 
Max is a great name."}, + {"role": "user", "content": "He loves peanut butter treats but I am allergic to nuts."}, + {"role": "assistant", "content": "Noted. Peanut butter for Max, no nuts for you."}, + ] + + questions = [ + {"question": "What is my dog's name?", "category": "Pet"}, + {"question": "What am I allergic to?", "category": "Allergy"}, + ] + return conversations, questions + +# --- Main Flow --- + +async def run_demo(): + print("==== MemScheduler Demo Start ====") + conversations, questions = get_mock_data() + + user_id = "demo_user_001" + mem_cube_id = "cube_demo_001" + + print(f"1. Initialize user memory library ({user_id})...") + # Use API Handler to add initial memories (synchronous mode) + add_req = APIADDRequest( + user_id=user_id, + writable_cube_ids=[mem_cube_id], + messages=conversations, + async_mode="sync", + ) + add_handler.handle_add_memories(add_req) + print(" Memory addition completed.") + + print("\n2. Start conversation testing (triggering background scheduling tasks)...") + for item in questions: + query = item["question"] + print(f"\n>> User: {query}") + + # Initiate chat request + chat_req = ChatPlaygroundRequest( + user_id=user_id, + query=query, + readable_cube_ids=[mem_cube_id], + writable_cube_ids=[mem_cube_id], + ) + + # Get streaming response + response = chat_stream_playground(chat_req) + + # Handle streaming output (simplified) + full_answer = "" + buffer = "" + async for chunk in response.body_iterator: + if isinstance(chunk, bytes): + chunk = chunk.decode("utf-8") + buffer += chunk + while "\n\n" in buffer: + msg, buffer = buffer.split("\n\n", 1) + for line in msg.split("\n"): + if line.startswith("data: "): + try: + data = json.loads(line[6:]) + if data.get("type") == "text": + full_answer += data["data"] + except: pass + + print(f">> AI: {full_answer}") + + # Wait a moment for background scheduler to process tasks and print logs + await asyncio.sleep(1) + +if __name__ == "__main__": + # 1. Register our custom handlers + # This will override or add to the default scheduling logic + mem_scheduler.register_handlers( + { + QUERY_TASK_LABEL: custom_query_handler, + MEM_UPDATE_TASK_LABEL: custom_mem_update_handler, + } + ) + + # 2. Ensure scheduler is started + if not mem_scheduler._running: + mem_scheduler.start() + + try: + asyncio.run(run_demo()) + except KeyboardInterrupt: + pass + finally: + # Prevent scheduler main process from exiting prematurely + time.sleep(10) + + print("\n==== Stopping scheduler ====") + mem_scheduler.stop() +``` + +### Scenario 2: Concurrent Asynchronous Tasks and Checkpoint Restart (Redis) + +This example demonstrates how to use Redis queues to achieve concurrent asynchronous task processing and checkpoint restart functionality. Running this example requires Redis environment configuration. 
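+
+Before running it, enable the scheduler and switch the queue to Redis before importing `server_router`. A minimal sketch using only the switches shown in this document (your Redis connection settings live in `.env` and are environment-specific):
+
+```python
+import os
+
+# Must be set before importing memos.api.routers.server_router,
+# which initializes components on import.
+os.environ["MOS_ENABLE_SCHEDULER"] = "true"
+os.environ["MOS_SCHEDULER_USE_REDIS_QUEUE"] = "true"  # use the Redis Stream queue
+```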
+ +```python +from pathlib import Path +from time import sleep + +from memos.api.routers.server_router import mem_scheduler +from memos.mem_scheduler.schemas.message_schemas import ScheduleMessageItem + + +# Debug: Print scheduler configuration +print("=== Scheduler Configuration Debug ===") +print(f"Scheduler type: {type(mem_scheduler).__name__}") +print(f"Config: {mem_scheduler.config}") +print(f"use_redis_queue: {mem_scheduler.use_redis_queue}") +print(f"Queue type: {type(mem_scheduler.memos_message_queue).__name__}") +print(f"Queue maxsize: {getattr(mem_scheduler.memos_message_queue, 'maxsize', 'N/A')}") +print("=====================================\n") + +queue = mem_scheduler.memos_message_queue + + +# Define handler function +def my_test_handler(messages: list[ScheduleMessageItem]): + print(f"My test handler received {len(messages)} messages: {[one.item_id for one in messages]}") + for msg in messages: + # Create file based on task_id (use item_id as numeric ID 0..99) + task_id = str(msg.item_id) + file_path = tmp_dir / f"{task_id}.txt" + try: + sleep(5) + file_path.write_text(f"Task {task_id} processed.\n") + print(f"writing {file_path} done") + except Exception as e: + print(f"Failed to write {file_path}: {e}") + + +def submit_tasks(): + mem_scheduler.memos_message_queue.clear() + + # Create 100 messages (task_id 0..99) + users = ["user_A", "user_B"] + messages_to_send = [ + ScheduleMessageItem( + item_id=str(i), + user_id=users[i % 2], + mem_cube_id="test_mem_cube", + label=TEST_HANDLER_LABEL, + content=f"Create file for task {i}", + ) + for i in range(100) + ] + # Batch submit messages and print completion info + print(f"Submitting {len(messages_to_send)} messages to the scheduler...") + mem_scheduler.memos_message_queue.submit_messages(messages_to_send) + print(f"Task submission done! 
tasks in queue: {mem_scheduler.get_tasks_status()}") + + +# Register handler function +TEST_HANDLER_LABEL = "test_handler" +mem_scheduler.register_handlers({TEST_HANDLER_LABEL: my_test_handler}) + +# 5 second restart +mem_scheduler.orchestrator.tasks_min_idle_ms[TEST_HANDLER_LABEL] = 5_000 + +tmp_dir = Path("./tmp") +tmp_dir.mkdir(exist_ok=True) + +# Test stop and restart: if tmp has >1 files, skip submission and print info +existing_count = len(list(Path("tmp").glob("*.txt"))) if Path("tmp").exists() else 0 +if existing_count > 1: + print(f"Skip submission: found {existing_count} files in tmp (>1), continue processing") +else: + submit_tasks() + +# Wait until tmp has 100 files or timeout +poll_interval = 1 +expected = 100 +tmp_dir = Path("tmp") +tasks_status = mem_scheduler.get_tasks_status() +mem_scheduler.print_tasks_status(tasks_status=tasks_status) +while ( + mem_scheduler.get_tasks_status()["remaining"] != 0 + or mem_scheduler.get_tasks_status()["running"] != 0 +): + count = len(list(tmp_dir.glob("*.txt"))) if tmp_dir.exists() else 0 + tasks_status = mem_scheduler.get_tasks_status() + mem_scheduler.print_tasks_status(tasks_status=tasks_status) + print(f"[Monitor] Files in tmp: {count}/{expected}") + sleep(poll_interval) +print(f"[Result] Final files in tmp: {len(list(tmp_dir.glob('*.txt')))})") + +# Stop scheduler +sleep(20) +print("Stopping the scheduler...") +mem_scheduler.stop() +``` diff --git a/docs/en/open_source/modules/memories/general_textual_memory.md b/docs/en/open_source/modules/memories/general_textual_memory.md new file mode 100644 index 00000000..da5861d5 --- /dev/null +++ b/docs/en/open_source/modules/memories/general_textual_memory.md @@ -0,0 +1,145 @@ +--- +title: "GeneralTextMemory: General-Purpose Textual Memory" +desc: "`GeneralTextMemory` is a flexible, vector-based textual memory module in MemOS, designed for storing, searching, and managing unstructured knowledge. It is suitable for conversational agents, personal assistants, and any system requiring semantic memory retrieval." 
+---
+
+## Table of Contents
+
+- [Memory Structure](#memory-structure)
+  - [Metadata Fields (`TextualMemoryMetadata`)](#metadata-fields-textualmemorymetadata)
+- [API Summary (`GeneralTextMemory`)](#api-summary-generaltextmemory)
+  - [Initialization](#initialization)
+  - [Core Methods](#core-methods)
+- [File Storage](#file-storage)
+- [Example Usage](#example-usage)
+- [Extension: Internet Retrieval](#extension-internet-retrieval)
+- [Advanced: Using MultiModal Reader](#advanced-using-multimodal-reader)
+- [Developer Notes](#developer-notes)
+
+
+## Memory Structure
+
+Each memory is represented as a `TextualMemoryItem`:
+
+| Field | Type | Description |
+| ---------- | --------------------------- | ---------------------------------- |
+| `id` | `str` | UUID (auto-generated if omitted) |
+| `memory` | `str` | The main memory content (required) |
+| `metadata` | `TextualMemoryMetadata` | Metadata for search/filtering |
+
+### Metadata Fields (`TextualMemoryMetadata`)
+
+| Field | Type | Description |
+| ------------- | -------------------------------------------------- | ----------------------------------- |
+| `type` | `"procedure"`, `"fact"`, `"event"`, `"opinion"` | Memory type |
+| `memory_time` | `str (YYYY-MM-DD)` | Date/time the memory refers to |
+| `source` | `"conversation"`, `"retrieved"`, `"web"`, `"file"` | Source of the memory |
+| `confidence` | `float (0-100)` | Certainty/confidence score |
+| `entities` | `list[str]` | Key entities/concepts |
+| `tags` | `list[str]` | Thematic tags |
+| `visibility` | `"private"`, `"public"`, `"session"` | Access scope |
+| `updated_at` | `str` | Last update timestamp (ISO 8601) |
+
+All values are validated. Invalid values will raise errors.
+
+### Search Mechanism
+Unlike NaiveTextMemory, which relies on keyword matching, GeneralTextMemory utilizes vector-based semantic search.
+
+## Algorithm Comparison
+
+| Feature | Keyword Matching | Vector Semantic Search |
+| ------------------ | ---------------------------------- | ------------------------------------------ |
+| **Semantic Understanding** | ❌ Doesn't understand synonyms | ✅ Understands similar concepts |
+| **Resource Usage** | ✅ Extremely low | ⚠️ Requires embedding model and vector DB |
+| **Execution Speed** | ✅ Fast (O(n)) | ⚠️ Slower (indexing + querying) |
+| **Suitable Scale** | < 1K memories | 10K - 100K memories |
+| **Predictability** | ✅ Intuitive results | ⚠️ Black box model |
+
+
+## API Summary (`GeneralTextMemory`)
+
+### Initialization
+```python
+GeneralTextMemory(config: GeneralTextMemoryConfig)
+```
+
+### Core Methods
+| Method | Description |
+| ------------------------ | --------------------------------------------------- |
+| `extract(messages)` | Extracts memories from message list (LLM-based) |
+| `add(memories)` | Adds one or more memories (items or dicts) |
+| `search(query, top_k)` | Retrieves top-k memories using vector similarity |
+| `get(memory_id)` | Fetch single memory by ID |
+| `get_by_ids(ids)` | Fetch multiple memories by IDs |
+| `get_all()` | Returns all memories |
+| `update(memory_id, new)` | Update a memory by ID |
+| `delete(ids)` | Delete memories by IDs |
+| `delete_all()` | Delete all memories |
+| `dump(dir)` | Serialize all memories to JSON file in directory |
+| `load(dir)` | Load memories from saved file |
+
+## File Storage
+
+When calling `dump(dir)`, the system stores the memories to:
+
+```
+<dir>/<memory_filename>
+```
+
+where `<memory_filename>` defaults to `textual_memory.json`. This file contains a JSON list of all memory items, which can be reloaded using `load(dir)`. 
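+
+Besides LLM-based `extract`, `add` also accepts plain dicts, which is handy when you already hold structured facts. A minimal sketch using the metadata fields above (`m` is a configured `GeneralTextMemory` instance as in the example below; the field values are illustrative):
+
+```python
+# Hand-written memory item passed as a plain dict; all metadata values
+# are validated against the schema described above.
+m.add([
+    {
+        "memory": "User is allergic to nuts.",
+        "metadata": {
+            "type": "fact",
+            "memory_time": "2025-01-01",
+            "source": "conversation",
+            "confidence": 95.0,
+            "entities": ["nuts"],
+            "tags": ["health", "allergy"],
+            "visibility": "private",
+        },
+    }
+])
+```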
+ +## Example Usage + +```python +import os +from memos.configs.memory import MemoryConfigFactory +from memos.memories.factory import MemoryFactory + +config = MemoryConfigFactory( + backend="general_text", + config={ + "extractor_llm": { ... }, + "vector_db": { ... }, + "embedder": { ... }, + }, +) +m = MemoryFactory.from_config(config) + +# Extract and add memories +memories = m.extract([ + {"role": "user", "content": "I love tomatoes."}, + {"role": "assistant", "content": "Great! Tomatoes are delicious."}, +]) +m.add(memories) + +# Search +results = m.search("Tell me more about the user", top_k=2) + +# Update +m.update(memory_id, {"memory": "User is Canadian.", ...}) + +# Delete +m.delete([memory_id]) + +# Dump/load +m.dump("tmp/mem") +m.load("tmp/mem") +``` + +::note +**Extension: Internet Retrieval**
+GeneralTextMemory can be combined with Internet Retrieval to extract content from web pages and add it to memory.
+View example: [Retrieve Memories from the Internet](./tree_textual_memory#retrieve-memories-from-the-internet-optional) +:: + +::note +**Advanced: Using MultiModal Reader**
+For processing images, URLs, or files within conversations, see the comprehensive MultiModal Reader examples.
+View documentation: [Using MultiModalStructMemReader](./tree_textual_memory#using-multimodalstructmemreader-advanced) +:: + +## Developer Notes + +* Uses Qdrant (or compatible) vector DB for fast similarity search +* Embedding and extraction models are configurable (Ollama/OpenAI supported) +* All methods are covered by integration tests in `/tests` diff --git a/docs/en/open_source/modules/memories/kv_cache_memory.md b/docs/en/open_source/modules/memories/kv_cache_memory.md new file mode 100644 index 00000000..4bb709e9 --- /dev/null +++ b/docs/en/open_source/modules/memories/kv_cache_memory.md @@ -0,0 +1,267 @@ +--- +title: "KVCacheMemory: Key-Value Cache for Activation Memory" +desc: "`KVCacheMemory` is a specialized memory module in MemOS for storing and managing key-value (KV) caches, primarily used to accelerate large language model (LLM) inference and support efficient context reuse. It is especially useful for activation memory in conversational and generative AI systems." +--- + +## KV-cache Memory Use Cases + +In MemOS, KV-cache memory is best suited for storing **semantically stable and frequently reused background content** such as: + +- Frequently asked questions (FAQs) or domain-specific knowledge +- Prior conversation history + +These stable **plaintext memory items** are automatically identified and managed by the `MemScheduler` module. Once selected, they are converted into KV-format representations (`KVCacheItem`) ahead of time. This precomputation step stores the activation states (Key/Value tensors) of the memory in a reusable format, allowing them to be injected into the model’s attention cache during inference. + +Once converted, these KV memories can be **reused across queries without requiring re-encoding** of the original content. This reduces the computational overhead of processing and storing large amounts of text, making it ideal for applications that require **rapid response times** and **high throughput**. + + +## Why KV-cache Memory +Integrating `MemScheduler` with KV-cache memory enables significant performance optimization, particularly in the **prefill phase** of LLM inference. + +### Without KVCacheMemory + +- Each new query is appended to the full prompt, including the background memory. +- The model must **recompute token embeddings and attention** over the full sequence — even for unchanged memory. + +### With KVCacheMemory + +- The background content is **cached once** as Key/Value tensors. +- For each query, only the new user input (query tokens) is encoded. +- The previously cached KV is injected directly into the attention mechanism. + +### Benefits + +This separation reduces redundant computation in the prefill phase and leads to: + +- Skipping repeated encoding of background content +- Faster attention computation between query tokens and cached memory +- **Lower Time To First Token (TTFT)** latency during generation + +This optimization is especially valuable in: + +- Multi-turn chatbot interactions +- Retrieval-augmented or context-augmented generation (RAG, CAG) +- Assistants operating over fixed documentation or FAQ-style memory + + +### KVCacheMemory Acceleration Evaluation + +To validate the performance impact of KV-based memory injection, we conducted a set of controlled experiments simulating real memory reuse in MemOS. + +#### Experiment Setup + +During typical usage, the `MemScheduler` module continuously tracks interaction patterns and promotes high-frequency, stable plaintext memory into KV format. 
These KV memories are loaded into GPU memory as activation caches and reused during inference. + +The evaluation compares two memory injection strategies: + +1. **Prompt-based injection**: background memory is prepended as raw text. +2. **KV-cache injection**: memory is injected directly into the model’s attention cache. + +We test these strategies across: + +- **Three context sizes**: short, medium, and long +- **Three query types**: short-form, medium-form, and long-form + +The primary metric is **Time To First Token (TTFT)**, a key latency indicator for responsive generation. + +#### Results + +The following table shows results across three models (Qwen3-8B, Qwen3-32B, Qwen2.5-72B). TTFT under KV-cache injection is consistently lower than prompt-based injection, while the output tokens remain identical across both strategies. + +::note{icon="ri:bnb-fill"} +`Build (s)` refers to the one-time preprocessing cost of converting the memory to KV format, amortized across multiple queries. +:: + +| Model | Ctx | CtxTok | Qry | QryTok | Build (s) | KV TTFT (s) | Dir TTFT (s) | Speedup (%) | +| ----------- | ------ | ------ | ------ | ------ | --------- | ----------- | ------------ | ----------- | +| Qwen3-8B | long | 6064 | long | 952.7 | 0.92 | 0.50 | 2.37 | 79.1 | +| | | | medium | 302.7 | 0.93 | 0.19 | 2.16 | 91.1 | +| | | | short | 167 | 0.93 | 0.12 | 2.04 | 94.2 | +| | medium | 2773 | long | 952.7 | 0.41 | 0.43 | 1.22 | 64.6 | +| | | | medium | 302.7 | 0.41 | 0.16 | 1.08 | 85.1 | +| | | | short | 167 | 0.43 | 0.10 | 0.95 | 89.7 | +| | short | 583 | long | 952.7 | 0.12 | 0.39 | 0.51 | 23.0 | +| | | | medium | 302.7 | 0.12 | 0.14 | 0.32 | 55.6 | +| | | | short | 167 | 0.12 | 0.08 | 0.29 | 71.3 | +| Qwen3-32B | long | 6064 | long | 952.7 | 0.71 | 0.31 | 1.09 | 71.4 | +| | | | medium | 302.7 | 0.71 | 0.15 | 0.98 | 84.3 | +| | | | short | 167 | 0.71 | 0.11 | 0.96 | 88.8 | +| | medium | 2773 | long | 952.7 | 0.31 | 0.24 | 0.56 | 56.9 | +| | | | medium | 302.7 | 0.31 | 0.12 | 0.47 | 75.1 | +| | | | short | 167 | 0.31 | 0.08 | 0.44 | 81.2 | +| | short | 583 | long | 952.7 | 0.09 | 0.20 | 0.24 | 18.6 | +| | | | medium | 302.7 | 0.09 | 0.09 | 0.15 | 39.6 | +| | | | short | 167 | 0.09 | 0.07 | 0.14 | 53.5 | +| Qwen2.5-72B | long | 6064 | long | 952.7 | 1.26 | 0.48 | 2.04 | 76.4 | +| | | | medium | 302.7 | 1.26 | 0.23 | 1.82 | 87.2 | +| | | | short | 167 | 1.27 | 0.15 | 1.79 | 91.4 | +| | medium | 2773 | long | 952.7 | 0.58 | 0.39 | 1.05 | 62.7 | +| | | | medium | 302.7 | 0.58 | 0.18 | 0.89 | 79.2 | +| | | | short | 167 | 0.71 | 0.23 | 0.82 | 71.6 | +| | short | 583 | long | 952.7 | 0.16 | 0.33 | 0.43 | 23.8 | +| | | | medium | 302.7 | 0.16 | 0.15 | 0.27 | 43.2 | +| | | | short | 167 | 0.16 | 0.10 | 0.25 | 60.5 | + +#### vLLM-based Performance + +MemOS now supports using vLLM to manage activation memory. To evaluate the impact of KV Cache prefilling for different prefix text lengths, we conducted performance tests on a system equipped with 8x `H800 80GB GPUs (112 vCPUs, 1920 GiB Memory)` and a system equipped with 8x `RTX4090-24G-PCIe (112 vCPUs, 960 GiB Memory)`. The evaluation covered two core models: Qwen3-32B and Qwen2.5-72B. + +The benchmarks were run across a range of memory and context length combinations to simulate various activation memory scenarios: +- **Memory Text Lengths (tokens)**: 500, 1000, 2000 +- **Context Text Lengths (tokens)**: 500, 1000, 2000, 4000 + +The following table summarizes the benchmark results. 
+ +**Qwen2.5-72B** +- On 4090 (2 Nodes 16 GPUs) + +| mem tks | prompt tks | TTFT (without cache, ms) | TTFT (With cache, ms) | TTFT Speedup (%) | Abs Dis(ms) | +| --- | --- | --- | --- | --- | --- | +| 0.5k | 0.5k | 1787.21 | 851.47 | 52.358% | 935.74 | +| 0.5k | 1k | 2506.26 | 1290.68 | 48.502% | 1215.58 | +| 0.5k | 2k | 3843.48 | 2897.97 | 24.600% | 945.51 | +| 0.5k | 4k | 6078.01 | 5200.86 | 14.432% | 877.15 | +| 1k | 0.5k | 2274.61 | 920.16 | 59.546% | 1354.45 | +| 1k | 1k | 2907.17 | 1407.65 | 51.580% | 1499.52 | +| 1k | 2k | 4278.53 | 2916.47 | 31.835% | 1362.06 | +| 1k | 4k | 6897.99 | 5218.94 | 24.341% | 1679.05 | +| 2k | 0.5k | 3460.12 | 782.73 | 77.379% | 2677.39 | +| 2k | 1k | 4443.34 | 1491.24 | 66.439% | 2952.10 | +| 2k | 2k | 5733.14 | 2758.48 | 51.885% | 2974.66 | +| 2k | 4k | 8152.76 | 5627.41 | 30.975% | 2525.35 | + + +- On H800 (4 GPUs) + +| mem tks | prompt tks | TTFT (without cache, ms) | TTFT (With cache, ms) | TTFT Speedup (%) | Abs Dis(ms) | +| --- | --- | --- | --- | --- | --- | +| 0.5k | 0.5k | 51.65 | 52.17 | \-1.007% | \-0.52 | +| 0.5k | 1k | 55.70 | 57.03 | \-2.388% | \-1.33 | +| 0.5k | 2k | 74.23 | 78.56 | \-5.833% | \-4.33 | +| 0.5k | 4k | 77.56 | 77.45 | 0.142% | 0.11 | +| 1k | 0.5k | 55.90 | 55.73 | 0.304% | 0.17 | +| 1k | 1k | 55.35 | 52.89 | 4.444% | 2.46 | +| 1k | 2k | 80.14 | 73.82 | 7.886% | 6.32 | +| 1k | 4k | 82.83 | 73.51 | 11.252% | 9.32 | +| 2k | 0.5k | 75.82 | 71.31 | 5.948% | 4.51 | +| 2k | 1k | 80.60 | 78.71 | 2.345% | 1.89 | +| 2k | 2k | 83.91 | 78.60 | 6.328% | 5.31 | +| 2k | 4k | 99.15 | 80.12 | 19.193% | 19.03 | + +**Qwen3-32B** + +- On 4090 (1 Nodes 8 GPUs) + +| mem tks | prompt tks | TTFT (without cache, ms) | TTFT (With cache, ms) | TTFT Speedup (%) | Abs Dis(ms) | +| --- | --- | --- | --- | --- | --- | +| 0.5k | 0.5k | 288.72 | 139.29 | 51.756% | 149.43 | +| 0.5k | 1k | 428.72 | 245.85 | 42.655% | 182.87 | +| 0.5k | 2k | 683.65 | 538.59 | 21.218% | 145.06 | +| 0.5k | 4k | 1170.48 | 986.94 | 15.681% | 183.54 | +| 1k | 0.5k | 409.83 | 137.96 | 66.337% | 271.87 | +| 1k | 1k | 507.95 | 262.21 | 48.379% | 245.74 | +| 1k | 2k | 743.48 | 539.71 | 27.408% | 203.77 | +| 1k | 4k | 1325.34 | 1038.59 | 21.636% | 286.75 | +| 2k | 0.5k | 686.01 | 147.34 | 78.522% | 538.67 | +| 2k | 1k | 762.96 | 246.22 | 67.728% | 516.74 | +| 2k | 2k | 1083.93 | 498.05 | 54.051% | 585.88 | +| 2k | 4k | 1435.39 | 1053.31 | 26.619% | 382.08 | + + +- On H800 (2 GPUs) + +| mem tks | prompt tks | TTFT (without cache, ms) | TTFT (With cache, ms) | TTFT Speedup (%) | Abs Dis(ms) | +| --- | --- | --- | --- | --- | --- | +| 0.5k | 0.5k | 161.18 | 97.61 | 39.440% | 63.57 | +| 0.5k | 1k | 164.00 | 121.39 | 25.982% | 42.61 | +| 0.5k | 2k | 257.34 | 215.20 | 16.375% | 42.14 | +| 0.5k | 4k | 365.14 | 317.95 | 12.924% | 47.19 | +| 1k | 0.5k | 169.45 | 100.52 | 40.679% | 68.93 | +| 1k | 1k | 180.91 | 128.25 | 29.108% | 52.66 | +| 1k | 2k | 271.69 | 210.00 | 22.706% | 61.69 | +| 1k | 4k | 389.30 | 314.64 | 19.178% | 74.66 | +| 2k | 0.5k | 251.43 | 130.92 | 47.930% | 120.51 | +| 2k | 1k | 275.81 | 159.60 | 42.134% | 116.21 | +| 2k | 2k | 331.11 | 218.17 | 34.110% | 112.94 | +| 2k | 4k | 451.06 | 334.80 | 25.775% | 116.26 | + +The results clearly demonstrate that integrating vLLM's KV Cache reuse provides a transformative performance improvement for MemOS. + +## KV-cache Memory Structure + +KV-based memory reuse via `KVCacheMemory` offers substantial latency reduction across model sizes and query types, while maintaining identical output. 
By shifting reusable memory from plaintext prompts into precomputed KV caches, MemOS eliminates redundant context encoding and achieves faster response times—especially beneficial in real-time, memory-augmented LLM applications. + +Each cache is stored as a `KVCacheItem`: + +| Field | Type | Description | +| ------------- | -------------- | ------------------------------------------- | +| `kv_cache_id` | `str` | Unique ID for the cache (UUID) | +| `kv_cache` | `DynamicCache` | The actual key-value cache (transformers) | +| `metadata` | `dict` | Metadata (source, extraction time, etc.) | + + +## API Summary (`KVCacheMemory`) + +### Initialization +```python +KVCacheMemory(config: KVCacheMemoryConfig) +``` + +### Core Methods +| Method | Description | +| ------------------------ | -------------------------------------------------------- | +| `extract(text)` | Extracts a KV cache from input text using the LLM | +| `add(memories)` | Adds one or more `KVCacheItem` to memory | +| `get(memory_id)` | Fetch a single cache by ID | +| `get_by_ids(ids)` | Fetch multiple caches by IDs | +| `get_all()` | Returns all stored caches | +| `get_cache(cache_ids)` | Merge and return a combined cache from multiple IDs | +| `delete(ids)` | Delete caches by IDs | +| `delete_all()` | Delete all caches | +| `dump(dir)` | Serialize all caches to a pickle file in directory | +| `load(dir)` | Load caches from a pickle file in directory | +| `from_textual_memory(mem)` | Convert a `TextualMemoryItem` to a `KVCacheItem` | +| `build_vllm_kv_cache( messages)` | Build a vLLM KV cache from a list of messages | + + +When calling `dump(dir)`, the system writes to: + +``` +/ +``` + +This file contains a pickled dictionary of all KV caches, which can be reloaded using `load(dir)`. + + +## How to Use + +```python +from memos.configs.memory import KVCacheMemoryConfig +from memos.memories.activation.kv import KVCacheMemory + +config = KVCacheMemoryConfig( + extractor_llm={ + "backend": "huggingface", + "config": {"model_name_or_path": "Qwen/Qwen3-1.7B"} + } +) +mem = KVCacheMemory(config) + +# Extract and add a cache +cache_item = mem.extract("The capital of France is Paris.") +mem.add([cache_item]) + +# Retrieve and merge caches +merged_cache = mem.get_cache([cache_item.kv_cache_id]) + +# Save/load +mem.dump("tmp/act_mem") +mem.load("tmp/act_mem") +``` + + +## Developer Notes + +* Uses HuggingFace `DynamicCache` for efficient key-value storage +* Pickle-based serialization for fast load/save +* All methods are covered by integration tests in `/tests` diff --git a/docs/en/open_source/modules/memories/naive_textual_memory.md b/docs/en/open_source/modules/memories/naive_textual_memory.md new file mode 100644 index 00000000..5c447dfe --- /dev/null +++ b/docs/en/open_source/modules/memories/naive_textual_memory.md @@ -0,0 +1,504 @@ +--- +title: "NaiveTextMemory: Simple Plain Text Memory" +desc: "The most lightweight memory module in MemOS, designed for rapid prototyping and simple scenarios. No vector database required—quickly retrieve memories using keyword matching." +--- + +Let's get started with the MemOS memory system in the simplest way possible! + +NaiveTextMemory is a lightweight, memory-based, plain-text memory module. It stores memories in an in-memory list and retrieves them using keyword matching. It is the perfect starting point for learning MemOS, as well as an ideal choice for demos, testing, and small-scale applications. 
+ +## Table of Contents + +- [What You'll Learn](#what-youll-learn) +- [Why Choose NaiveTextMemory](#why-choose-naivetextmemory) +- [Core Concepts](#core-concepts) + - [Memory Structure](#memory-structure) + - [Metadata Fields](#metadata-fields-textualmemorymetadata) + - [Search Mechanism](#search-mechanism) +- [API Reference](#api-reference) + - [Initialization](#initialization) + - [Core Methods](#core-methods) + - [Configuration Parameters](#configuration-parameters) +- [Hands-On Practice](#hands-on-practice) + - [Quick Start](#quick-start) + - [Complete Example](#complete-example) + - [File Storage](#file-storage) +- [Use Case Guide](#use-case-guide) +- [Comparison with Other Memory Modules](#comparison-with-other-memory-modules) +- [Best Practices](#best-practices) +- [Next Steps](#next-steps) + +## What You'll Learn + +By the end of this guide, you will be able to: +- Automatically extract structured memories from conversations using LLM +- Store and manage memories in memory (no database required) +- Search memories using keyword matching +- Persist and restore memory data +- Understand when to use NaiveTextMemory and when to upgrade to other modules + +## Why Choose NaiveTextMemory + +### Key Advantages + +::list{icon="ph:check-circle-duotone"} +- **Zero Dependencies**: No vector database or embedding model required +- **Fast Startup**: Up and running in just a few lines of code +- **Lightweight & Efficient**: Low resource footprint, fast execution +- **Simple & Intuitive**: Keyword matching with predictable results +- **Easy to Debug**: All memories in memory, easy to inspect +- **Perfect Starting Point**: The best entry point for learning MemOS +:: + +### Suitable Scenarios + +::list{icon="ph:lightbulb-duotone"} +- Rapid prototyping and proof of concept +- Simple conversational agents (< 1000 memories) +- Testing and demo scenarios +- Resource-constrained environments (cannot run embedding models) +- Keyword search scenarios (queries directly match memories) +:: + +::note +**Performance Tip**
+When memory count exceeds 1000, it's recommended to upgrade to [GeneralTextMemory](/open_source/modules/memories/general_textual_memory), which uses vector search for better performance. +:: + +## Core Concepts + +### Memory Structure + +Each memory is represented as a `TextualMemoryItem` object with the following fields: + +| Field | Type | Required | Description | +| ---------- | --------------------------- | -------- | ------------------------------------ | +| `id` | `str` | ✗ | Unique identifier (auto-generated UUID) | +| `memory` | `str` | ✓ | Main text content of the memory | +| `metadata` | `TextualMemoryMetadata` | ✗ | Metadata (for categorization, filtering, and retrieval) | + +### Metadata Fields (`TextualMemoryMetadata`) + +Metadata provides rich contextual information for categorization, filtering, and organizing memories: + +| Field | Type | Default | Description | +| ------------- | -------------------------------------------------- | ---------- | ---------------------------------- | +| `type` | `"procedure"` / `"fact"` / `"event"` / `"opinion"` | `"fact"` | Memory type classification | +| `memory_time` | `str (YYYY-MM-DD)` | Current date | Time associated with the memory | +| `source` | `"conversation"` / `"retrieved"` / `"web"` / `"file"` | - | Source of the memory | +| `confidence` | `float (0-100)` | 80.0 | Certainty/confidence score | +| `entities` | `list[str]` | `[]` | Mentioned entities or concepts | +| `tags` | `list[str]` | `[]` | Topic tags | +| `visibility` | `"private"` / `"public"` / `"session"` | `"private"` | Access control scope | +| `updated_at` | `str` | Auto-generated | Last update timestamp (ISO 8601) | + +## API Reference + +### Initialization + +```python +from memos.memories.textual.naive import NaiveTextMemory +from memos.configs.memory import NaiveTextMemoryConfig + +memory = NaiveTextMemory(config: NaiveTextMemoryConfig) +``` + +### Core Methods + +| Method | Parameters | Returns | Description | +| ------------------------ | ------------------------------------- | ----------------------------- | --------------------------------------------- | +| `extract(messages)` | `messages: list[dict]` | `list[TextualMemoryItem]` | Extract structured memories from conversation using LLM | +| `add(memories)` | `memories: list / dict / Item` | `None` | Add one or more memories | +| `search(query, top_k)` | `query: str, top_k: int` | `list[TextualMemoryItem]` | Retrieve top-k memories using keyword matching | +| `get(memory_id)` | `memory_id: str` | `TextualMemoryItem` | Get a single memory by ID | +| `get_by_ids(ids)` | `ids: list[str]` | `list[TextualMemoryItem]` | Batch retrieve memories by ID list | +| `get_all()` | - | `list[TextualMemoryItem]` | Return all memories | +| `update(memory_id, new)` | `memory_id: str, new: dict` | `None` | Update content or metadata of specified memory | +| `delete(ids)` | `ids: list[str]` | `None` | Delete one or more memories | +| `delete_all()` | - | `None` | Clear all memories | +| `dump(dir)` | `dir: str` | `None` | Serialize memories to JSON file | +| `load(dir)` | `dir: str` | `None` | Load memories from JSON file | + +### Search Mechanism + +Unlike `GeneralTextMemory`'s vector semantic search, `NaiveTextMemory` uses a **keyword matching algorithm**: + +::steps{} + +#### Step 1: Tokenization +Break down the query and each memory content into lists of tokens + +#### Step 2: Calculate Match Score +Count the number of overlapping tokens between query and memory + +#### Step 3: Sort +Sort all memories by match count in 
descending order + +#### Step 4: Return Results +Return the top-k memories as search results + + +::note +**Example Comparison**
+Query: "cat"
+- **Keyword Matching**: Only matches memories containing "cat"
+- **Semantic Search**: Also matches memories about "pet", "kitten", "feline", etc. +:: + +### Configuration Parameters + +**NaiveTextMemoryConfig** + +| Parameter | Type | Required | Default | Description | +| ------------------ | ---------------------- | -------- | ---------------------- | ---------------------------------------------- | +| `extractor_llm` | `LLMConfigFactory` | ✓ | - | LLM configuration for extracting memories from conversations | +| `memory_filename` | `str` | ✗ | `textual_memory.json` | Filename for persistent storage | + +**Configuration Example** + +```json +{ + "backend": "naive_text", + "config": { + "extractor_llm": { + "backend": "openai", + "config": { + "model_name_or_path": "gpt-4o-mini", + "temperature": 0.8, + "max_tokens": 1024, + "api_base": "xxx", + "api_key": "sk-xxx" + } + }, + "memory_filename": "my_memories.json" + } +} +``` + +## Hands-On Practice + +### Quick Start + +Get started with NaiveTextMemory in just 3 steps: + +::steps{} + +#### Step 1: Create Configuration + +```python +from memos.configs.memory import MemoryConfigFactory + +config = MemoryConfigFactory( + backend="naive_text", + config={ + "extractor_llm": { + "backend": "openai", + "config": { + "model_name_or_path": "gpt-4o-mini", + "api_key": "your-api-key", + "api_base": "your-api-base" + }, + }, + }, +) +``` + +#### Step 2: Initialize Memory Module + +```python +from memos.memories.factory import MemoryFactory + +memory = MemoryFactory.from_config(config) +``` + +#### Step 3: Extract and Add Memories + +```python +# Automatically extract memories from conversation +memories = memory.extract([ + {"role": "user", "content": "I love tomatoes."}, + {"role": "assistant", "content": "Great! Tomatoes are delicious."}, +]) + +# Add to memory store +memory.add(memories) +print(f"✓ Added {len(memories)} memories") +``` + +::note +**Advanced: Using MultiModal Reader**
+If you need to process multimodal content such as images, URLs, or files, use `MultiModalStructMemReader`.
+View complete example: [Using MultiModalStructMemReader (Advanced)](./tree_textual_memory#using-multimodalstructmemreader-advanced) +:: + +:: + +### Complete Example + +Here's a complete end-to-end example demonstrating all core functionality: + +```python +from memos.configs.memory import MemoryConfigFactory +from memos.memories.factory import MemoryFactory + +# ======================================== +# 1. Initialization +# ======================================== +config = MemoryConfigFactory( + backend="naive_text", + config={ + "extractor_llm": { + "backend": "openai", + "config": { + "model_name_or_path": "gpt-4o-mini", + "api_key": "your-api-key", + }, + }, + }, +) +memory = MemoryFactory.from_config(config) + +# ======================================== +# 2. Extract and Add Memories +# ======================================== +memories = memory.extract([ + {"role": "user", "content": "I love tomatoes."}, + {"role": "assistant", "content": "Great! Tomatoes are delicious."}, +]) +memory.add(memories) +print(f"✓ Added {len(memories)} memories") + +# ======================================== +# 3. Search Memories +# ======================================== +results = memory.search("tomatoes", top_k=2) +print(f"\n🔍 Found {len(results)} relevant memories:") +for i, item in enumerate(results, 1): + print(f" {i}. {item.memory}") + +# ======================================== +# 4. Get All Memories +# ======================================== +all_memories = memory.get_all() +print(f"\n📊 Total {len(all_memories)} memories") + +# ======================================== +# 5. Update Memory +# ======================================== +if memories: + memory_id = memories[0].id + memory.update( + memory_id, + { + "memory": "User loves tomatoes.", + "metadata": {"type": "opinion", "confidence": 95.0} + } + ) + print(f"\n✓ Updated memory: {memory_id}") + +# ======================================== +# 6. Persist to Storage +# ======================================== +memory.dump("tmp/mem") +print("\n💾 Memories saved to tmp/mem/textual_memory.json") + +# ======================================== +# 7. Load Memories +# ======================================== +memory.load("tmp/mem") +print("✓ Memories loaded from file") + +# ======================================== +# 8. Delete Memories +# ======================================== +if memories: + memory.delete([memories[0].id]) + print(f"\n🗑️ Deleted 1 memory") + +# Delete all memories +# memory.delete_all() +``` + +::note +**Extension: Internet Retrieval**
+NaiveTextMemory focuses on local memory management. For retrieving information from the internet and adding it to your memory store, see:
+[Retrieve Memories from the Internet (Optional)](./tree_textual_memory#retrieve-memories-from-the-internet-optional)
::

### File Storage

When calling `dump(dir)`, the system saves memories to:

```
<dir>/<memory_filename>
```

**Default File Structure**

```json
[
  {
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "memory": "User loves tomatoes.",
    "metadata": {
      "type": "opinion",
      "confidence": 95.0,
      "entities": ["user", "tomatoes"],
      "tags": ["food", "preference"],
      "updated_at": "2026-01-14T10:30:00Z"
    }
  },
  ...
]
```

Use `load(dir)` to fully restore all memory data.

::note
**Important Note**
+Memories are held in process memory and are lost when the process exits. Remember to call `dump()` regularly to persist your data!
::

## Use Case Guide

### Best Suited For

::list{icon="ph:check-circle-duotone"}
- **Rapid Prototyping**: No need to configure vector databases; get started in minutes
- **Simple Conversational Agents**: Small-scale applications with < 1000 memories
- **Testing and Demos**: Quickly validate memory extraction and retrieval logic
- **Resource-Constrained Environments**: Scenarios where embedding models or vector databases cannot run
- **Keyword Search**: Scenarios where query content directly matches memory text
- **Learning and Teaching**: The best starting point for understanding the MemOS memory system
::

### Not Recommended For

::list{icon="ph:x-circle-duotone"}
- **Large-Scale Applications**: More than 10,000 memories (search performance degrades)
- **Semantic Search Needs**: Queries must match synonyms (e.g., "cat" and "pet")
- **Production Environments**: Strict performance and accuracy requirements
- **Multilingual Scenarios**: Cross-language semantic understanding is needed
- **Complex Relationship Reasoning**: Understanding relationships between memories is needed
::

::alert{type="info"}
**Upgrade Path**
+For the scenarios not recommended above, consider upgrading to:
- [GeneralTextMemory](/open_source/modules/memories/general_textual_memory) - Vector semantic search, suitable for 1K-100K memories
- [TreeTextMemory](/open_source/modules/memories/tree_textual_memory) - Graph structure storage, supports relationship reasoning and multi-hop queries
::

## Comparison with Other Memory Modules

Choosing the right memory module is crucial for project success. This comparison helps you make that decision:

| Feature | **NaiveTextMemory** | **GeneralTextMemory** | **TreeTextMemory** |
| ------------------ | --------------------- | -------------------------- | --------------------------- |
| **Search Method** | Keyword matching | Vector semantic search | Graph structure + vector search |
| **Dependencies** | LLM only | LLM + Embedder + Vector DB | LLM + Embedder + Graph DB |
| **Suitable Scale** | < 1K | 1K - 100K | 10K - 1M |
| **Query Complexity** | O(n) linear scan | O(log n) approximate NN | O(log n) + graph traversal |
| **Semantic Understanding** | ❌ | ✅ | ✅ |
| **Relationship Reasoning** | ❌ | ❌ | ✅ |
| **Multi-Hop Queries** | ❌ | ❌ | ✅ |
| **Storage Backend** | In-memory list | Vector DB (Qdrant, etc.) | Graph DB (Neo4j/PolarDB) |
| **Configuration Complexity** | Low ⭐ | Medium ⭐⭐ | High ⭐⭐⭐ |
| **Learning Curve** | Minimal | Moderate | Steep |
| **Production Ready** | ❌ Prototype/demo only | ✅ Suitable for most cases | ✅ Suitable for complex apps |

::alert{type="success"}
**Selection Guide**
+- **Just getting started?** → Start with NaiveTextMemory
+- **Need semantic search?** → Use GeneralTextMemory
+- **Need relationship reasoning?** → Choose TreeTextMemory +:: + +## Best Practices + +Follow these recommendations to make the most of NaiveTextMemory: + +::steps{} + +### 1. Persist Data Regularly + +```python +# Save immediately after critical operations +memory.add(new_memories) +memory.dump("tmp/mem") # ✓ Persist immediately + +# Regular automatic backups +import schedule +schedule.every(10).minutes.do(lambda: memory.dump("tmp/mem")) +``` + +### 2. Control Memory Scale + +```python +# Regularly clean old memories +if len(memory.get_all()) > 1000: + old_memories = sorted( + memory.get_all(), + key=lambda m: m.metadata.updated_at + )[:100] # Oldest 100 + + memory.delete([m.id for m in old_memories]) + print("✓ Cleaned 100 old memories") +``` + +### 3. Optimize Search Queries + +```python +# ❌ Poor: Vague query +results = memory.search("thing", top_k=5) + +# ✅ Good: Use specific keywords +results = memory.search("tomato", top_k=5) +``` + +### 4. Use Metadata Wisely + +```python +# Set clear metadata when adding memories +memory.add({ + "memory": "User prefers dark mode", + "metadata": { + "type": "opinion", # ✓ Clear classification + "tags": ["UI", "preference"], # ✓ Easy filtering + "confidence": 90.0, # ✓ Mark confidence + "entities": ["user", "dark mode"] # ✓ Entity annotation + } +}) +``` + +### 5. Plan Upgrade Path + +```python +# Monitor memory count and upgrade timely +memory_count = len(memory.get_all()) +if memory_count > 800: + print("⚠️ Memory count approaching limit, consider upgrading to GeneralTextMemory") + # Migration code reference: + # 1. Export existing memories: memory.dump("backup") + # 2. Create GeneralTextMemory configuration + # 3. Import memories to new module +``` + +:: + +## Next Steps + +Congratulations! You've mastered the core usage of NaiveTextMemory. Next, you can: + +::list{icon="ph:arrow-right-duotone"} +- **Upgrade to Vector Search**: Learn about [GeneralTextMemory](/open_source/modules/memories/general_textual_memory)'s semantic retrieval capabilities +- **Explore Graph Structure**: Understand [TreeTextMemory](/open_source/modules/memories/tree_textual_memory)'s relationship reasoning features +- **Integrate into Applications**: Check [Complete API Documentation](/api-reference/search-memories) to build production-grade applications +- **Run Example Code**: Browse the `/examples/` directory for more practical cases +- **Learn Graph Databases**: If you need advanced features, learn about [Neo4j](/open_source/modules/memories/neo4j_graph_db) or [PolarDB](/open_source/modules/memories/polardb_graph_db) +:: + +::alert{type="success"} +**Tip**
+NaiveTextMemory is the perfect starting point for learning MemOS. When your application needs more powerful features, you can seamlessly migrate to other memory modules! +:: diff --git a/docs/en/open_source/modules/memories/nebula_graph_db.md b/docs/en/open_source/modules/memories/nebula_graph_db.md new file mode 100644 index 00000000..85b54fdf --- /dev/null +++ b/docs/en/open_source/modules/memories/nebula_graph_db.md @@ -0,0 +1,125 @@ +--- +title: NebulaGraph-Based Plaintext Memory Backend +desc: This module provides graph-based memory storage and querying + capabilities based on **NebulaGraph** for memory-augmented systems such as RAG pipelines, cognitive agents, or personal assistants. It inherits from `BaseGraphDB`, supports multi-user isolation, structured search, external vector indexing, and is well-suited for large-scale graph construction and reasoning. +--- + +## Why Choose NebulaGraph? + +* Designed for large-scale distributed deployment +* Flexible support for labels and properties on both nodes and edges +* Built-in vector index support (starting from Nebula 5) + +## Recommended Configuration Template + +Ideal for production environments with multi-tenant isolation support: + +```json +"graph_db": { + "backend": "nebular", + "config": { + "uri": ["localhost:9669"], + "user": "root", + "password": "your_password", + "space": "database_name", + "user_name": "user_name", + "use_multi_db": false, + "auto_create": true, + "embedding_dimension": 1024 + } +} +``` + +* `space`: The Nebula graph space name, equivalent to a database +* `user_name`: Used for logical isolation between users (automatically added as a filter condition) +* `embedding_dimension`: Should match your embedding model (e.g., `text-embedding-3-large` = 3072) +* `auto_create`: Whether to automatically create the graph space and schema (recommended for testing) + +## Multi-Tenant Usage Patterns + +The NebulaGraph backend supports two multi-tenant architectures: + +### Shared DB with Logical User Isolation (`user_name`) + +Best for scenarios where multiple users or agents share one graph space with logical separation: + +```python +GraphDBConfigFactory( + backend="nebular", + config={ + "space": "shared_graph", + "user_name": "alice", + "use_multi_db": False, + ... + }, +) +``` + +### Dedicated DB per User (Multi-DB) + +Recommended for stronger resource isolation. Each user has their own dedicated graph space: + +```python +GraphDBConfigFactory( + backend="nebular", + config={ + "space": "user_alice_graph", + "use_multi_db": True, + "auto_create": True, + ... 
+    },
)
```

## Quick Usage Example

```python
import json
import os
from datetime import datetime

from memos.graph_dbs.factory import GraphStoreFactory
from memos.configs.graph_db import GraphDBConfigFactory

# Note: TextualMemoryItem and TreeNodeTextualMemoryMetadata must also be
# imported from MemOS's textual memory module, and embed_memory_item below is a
# placeholder for your own embedding helper (it must return a vector matching
# embedding_dimension).

config = GraphDBConfigFactory(
    backend="nebular",
    config={
        # NEBULAR_HOSTS must contain a JSON list, e.g. '["localhost:9669"]'
        "uri": json.loads(os.getenv("NEBULAR_HOSTS", '["localhost:9669"]')),
        "user": os.getenv("NEBULAR_USER", "root"),
        "password": os.getenv("NEBULAR_PASSWORD", "xxxxxx"),
        "space": os.getenv("space", "your_space_name"),
        "use_multi_db": True,
        "auto_create": True,
        # Environment variables are strings, so cast to int
        "embedding_dimension": int(os.getenv("embedding_dimension", 1024)),
    },
)

graph = GraphStoreFactory.from_config(config)

topic = TextualMemoryItem(
    memory="This research addresses long-term multi-UAV navigation for energy-efficient communication coverage.",
    metadata=TreeNodeTextualMemoryMetadata(
        memory_type="LongTermMemory",
        key="Multi-UAV Long-Term Coverage",
        hierarchy_level="topic",
        type="fact",
        memory_time="2024-01-01",
        source="file",
        sources=["paper://multi-uav-coverage/intro"],
        status="activated",
        confidence=95.0,
        tags=["UAV", "coverage", "multi-agent"],
        entities=["UAV", "coverage", "navigation"],
        visibility="public",
        updated_at=datetime.now().isoformat(),
        embedding=embed_memory_item(
            "This research addresses long-term "
            "multi-UAV navigation for "
            "energy-efficient communication "
            "coverage."
        ),
    ),
)

graph.add_node(
    id=topic.id, memory=topic.memory, metadata=topic.metadata.model_dump(exclude_none=True)
)
```
diff --git a/docs/en/open_source/modules/memories/neo4j_graph_db.md b/docs/en/open_source/modules/memories/neo4j_graph_db.md
new file mode 100644
index 00000000..5be17637
--- /dev/null
+++ b/docs/en/open_source/modules/memories/neo4j_graph_db.md
@@ -0,0 +1,188 @@
---
title: Neo4j Graph Database
desc: "This module provides graph-based memory storage and querying for memory-augmented systems such as RAG, cognitive agents, or personal memory assistants.
+It defines a clean abstraction (`BaseGraphDB`) and includes a production-ready implementation using **Neo4j**."
---

## Why Graph for Memory?

Unlike flat vector stores, a graph database allows:

- Structuring memory into **chains, hierarchies, and causal links**
- Performing **multi-hop reasoning** and **subgraph traversal**
- Supporting memory **deduplication, conflict detection, and scheduling**
- Dynamically evolving a memory graph over time

This forms the backbone of long-term, explainable, and compositional memory reasoning.

## Features

- Unified interface across different graph databases
- Built-in support for Neo4j
- Support for vector-enhanced retrieval (`search_by_embedding`)
- Modular, pluggable, and testable
- [v0.2.1 New!] Supports a **multi-tenant graph memory architecture** (shared DB, per-user logic)
- [v0.2.1 New!] Compatible with **Neo4j Community Edition** environments

## Directory Structure

```
src/memos/graph_dbs/
├── base.py      # Abstract interface: BaseGraphDB
├── factory.py   # Factory to instantiate GraphDB from config
├── neo4j.py     # Neo4jGraphDB: production implementation
```

## How to Use

```python
from memos.graph_dbs.factory import GraphStoreFactory
from memos.configs.graph_db import GraphDBConfigFactory

# Step 1: Build factory config
config = GraphDBConfigFactory(
    backend="neo4j",
    config={
        "uri": "bolt://localhost:7687",
        "user": "your_neo4j_user_name",
        "password": "your_password",
        "db_name": "memory_user1",
        "auto_create": True,
        "embedding_dimension": 768
    }
)

# Step 2: Instantiate the graph store
graph = GraphStoreFactory.from_config(config)

# Step 3: Add memory
graph.add_node(
    id="node-001",
    memory="Today I learned about retrieval-augmented generation.",
    metadata={"type": "WorkingMemory", "tags": ["RAG", "AI"], "timestamp": "2025-06-05", "sources": []}
)
```

## Pluggable Design

### Interface: `BaseGraphDB`

````
Function introduction:

1. Node operations:
   Insert: add_node (add a single node)
           add_nodes_batch (add multiple nodes in batch)
   Query:  get_node (retrieve a single node)
           get_nodes (retrieve multiple nodes)
           get_memory_count (count nodes)
           node_not_exist (check whether a node exists)
           search_by_embedding (vector search; supports filter conditions.
               For filter usage, see neo4j_example.example_complex_shared_db_search_filter)
   Update: update_node (update a single node)
   Delete: delete_node (delete a single node)
           clear (delete all nodes associated with a given user_name attribute.
               See neo4j_example.example_complex_shared_db_delete_memory for full method docs)

2. Edge operations:
   Insert: add_edge (add a triple/relation as a memory element)
   Query:  get_edges (retrieve multiple relations/edges)
           edge_exists (check whether a relation/edge exists)
           get_children_with_embeddings (retrieve child nodes of the PARENT relation type)
           get_subgraph (query multi-hop nodes / retrieve a subgraph)
   Delete: delete_edge (delete a relation/edge)

3. Import/export operations:
   import_graph (import an entire graph from a serialized dict of the form {"nodes": [], "edges": []})
   export_graph (export all graph nodes and edges in a structured format, with pagination support)

See src/memos/graph_dbs/base.py for full method docs.
+```` +### Current Backend: + +| Backend | Status | File | +| ------- | ------ | ---------- | +| Neo4j | Stable | `neo4j.py` | + +## Shared DB, Multi-Tenant Support + +By specifying the `user_name` field, MemOS can isolate memory graphs for multiple users in a single Neo4j database. Ideal for collaborative systems or multi-agent applications: + +```python +config = GraphDBConfigFactory( + backend="neo4j", + config={ + "uri": "bolt://localhost:7687", + "user": "neo4j", + "password": "your_password", + "db_name": "shared-graph", + "user_name": "alice", + "use_multi_db": False, + "embedding_dimension": 768, + }, +) +``` + +User data is logically isolated via the `user_name` field. Filtering is handled automatically during reads, writes, and searches. + +:::note +**Example? You bet.**
+No blah blah, just go check the code:
`examples/basic_modules/neo4j_example.example_complex_shared_db(db_name="shared-traval-group-complex-new")`
:::

## Neo4j Community Edition Support

New backend identifier: `neo4j-community`

Usage is similar to standard Neo4j, but Enterprise-only features are disabled:

- ❌ No support for `auto_create` databases
- ❌ No native vector indexes (an external vector store is required; currently only Qdrant is supported)
- ✅ Enforces `user_name`-based logical isolation (suitable when Community Edition users belong to the same business and strong isolation is not required)

Example configuration:

```python
config = GraphDBConfigFactory(
    backend="neo4j-community",
    config={
        "uri": "bolt://localhost:7687",
        "user": "neo4j",
        "password": "12345678",
        "db_name": "paper",
        "user_name": "bob",
        "auto_create": False,
        "embedding_dimension": 768,
        "use_multi_db": False,
        "vec_config": {
            "backend": "qdrant",
            "config": {
                "host": "localhost",
                "port": 6333,
                "collection_name": "neo4j_vec_db",
                "vector_dimension": 768,
                "distance_metric": "cosine"
            },
        },
    },
)
```

:::note
**Example? You bet.**
+No blah blah, just go check the code: +`examples/basic_modules/neo4j_example.example_complex_shared_db(db_name="paper", community=True)` +::: + +## Extending + +You can add support for any other graph engine (e.g., **TigerGraph**, **DGraph**, **Weaviate hybrid**) by: + +1. Subclassing `BaseGraphDB` +2. Creating a config dataclass (e.g., `DgraphConfig`) +3. Registering it in: + + * `GraphDBConfigFactory.backend_to_class` + * `GraphStoreFactory.backend_to_class` + +See `src/memos/graph_dbs/neo4j.py` as a reference for implementation. diff --git a/docs/en/open_source/modules/memories/overview.md b/docs/en/open_source/modules/memories/overview.md new file mode 100644 index 00000000..a90e7d67 --- /dev/null +++ b/docs/en/open_source/modules/memories/overview.md @@ -0,0 +1,279 @@ +--- +title: "Memory Modules Overview" +desc: "Complete guide to MemOS memory systems - from lightweight text memory to advanced graph structures, choose the right memory module for your needs" +--- + + +The Memory Module provides Agents with essential long-term memory capabilities. Instead of acting as a static database, it mimics human cognitive processes by automatically extracting, organizing, and linking information. Choosing different memory modules allows you to customize and enhance your Agent's skills. + +## 🎯 Quick Selection Guide + +::alert{type="info"} +**Not sure which to choose?** Follow this decision tree: +- 🚀 **Quick testing/demo**: Get started easily with no additional software → [NaiveTextMemory](#naivetextmemory-simple-textual-memory) +- 📝 **General text memory**: Retain chat history or massive documents with semantic search capabilities → [GeneralTextMemory](#generaltextmemory-general-purpose-textual-memory) +- 👤 **User preference management**:Specifically designed for building and managing user profiles → [PreferenceTextMemory](#preferencetextmemory-preference-memory) +- 🌳 **Structured knowledge graph**: Ideal for data with complex logical relationships and interconnections → [TreeTextMemory](#treetextmemory-hierarchical-structured-memory) +- ⚡ **Inference acceleration**: Optimized for high-traffic scenarios to ensure stable and rapid responses → [KVCacheMemory](#kvcachememory-activation-memory) +:: + +--- + +## 📚 Memory Module Categories + +### I. Textual Memory Series + +Focused on storing and retrieving text-based memories, suitable for most application scenarios. + +#### NaiveTextMemory: Simple Textual Memory +::card +**Use Cases:** Rapid prototyping, demos, teaching, small-scale applications + +**Core Features:** +- ✅ Zero dependencies, pure in-memory storage +- ✅ Keyword-based retrieval +- ✅ Minimal API, get started in 5 minutes +- ✅ File persistence support + +**Limitations:** +- ❌ No vector semantic search +- ❌ Not suitable for large-scale data +- ❌ Limited retrieval precision + +📖 [View Documentation](./naive_textual_memory) +:: + +#### GeneralTextMemory: General-Purpose Textual Memory +::card +**Use Cases:** Conversational agents, personal assistants, knowledge management systems + +**Core Features:** +- ✅ Vector-based semantic search +- ✅ Rich metadata support (type, time, source, etc.) +- ✅ Flexible filtering and querying +- ✅ Suitable for medium to large-scale applications + +**Technical Requirements:** +- Requires vector database (Qdrant, etc.) 
+- Requires embedding model + +📖 [View Documentation](./general_textual_memory) +:: + +#### PreferenceTextMemory: Preference Memory +::card +**Use Cases:** Personalized recommendations, user profiling, intelligent assistants + +**Core Features:** +- ✅ Automatic detection of explicit and implicit preferences +- ✅ Preference deduplication and conflict detection +- ✅ Filter by preference type and strength +- ✅ Vector semantic retrieval + +**Specialized Functions:** +- Dual preference extraction (explicit/implicit) +- Preference strength scoring +- Temporal decay support + +📖 [View Documentation](./preference_textual_memory) +:: + +#### TreeTextMemory: Hierarchical Structured Memory +::card +**Use Cases:** Knowledge graphs, complex relationship reasoning, multi-hop queries + +**Core Features:** +- ✅ Graph database-based structured storage +- ✅ Support for hierarchical relationships and causal chains +- ✅ Multi-hop reasoning capabilities +- ✅ Deduplication, conflict detection, memory scheduling + +**Advanced Features:** +- Supports MultiModal Reader (images, URLs, files) +- Supports Internet Retrieval (BochaAI, Google, Bing) +- Working memory replacement mechanism + +**Technical Requirements:** +- Requires graph database (Neo4j, etc.) +- Requires vector database and embedding model + +📖 [View Documentation](./tree_textual_memory) +:: + +--- + +### II. Specialized Memory Modules + +Memory systems optimized for specific scenarios. + +#### KVCacheMemory: Activation Memory +::card +**Use Cases:** LLM inference acceleration, high-frequency background knowledge reuse + +**Core Features:** +- ⚡ Pre-computed KV Cache, skip repeated encoding +- ⚡ Significantly reduce prefill phase computation +- ⚡ Suitable for high-throughput scenarios + +**Typical Use Cases:** +- FAQ caching +- Conversation history reuse +- Domain knowledge preloading + +**How It Works:** +Stable text memory → Pre-convert to KV Cache → Direct injection during inference + +📖 [View Documentation](./kv_cache_memory) +:: + +#### ParametricMemory: Parametric Memory +::card +**Status:** 🚧 Under Development + +**Design Goals:** +- Encode knowledge into model weights (LoRA, expert modules) +- Dynamically load/unload capability modules +- Support multi-task, multi-role architecture + +**Future Features:** +- Parameter module generation and compression +- Version control and rollback +- Hot-swappable capability modules + +📖 [View Documentation](./parametric_memory) +:: + +--- + +### III. Graph Database Backends + +Provide graph storage capabilities for TreeTextMemory. 
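
The backends below share one configuration surface, so you can switch engines without rewriting memory code. As a minimal sketch (field values are illustrative; backend identifiers and any backend-specific extra fields follow each backend's own page):

```python
from memos.configs.graph_db import GraphDBConfigFactory
from memos.graph_dbs.factory import GraphStoreFactory

# Swap the backend string to change engines; the calling code stays the same.
config = GraphDBConfigFactory(
    backend="neo4j",  # or "nebular" / "polardb", as documented on each page
    config={
        "uri": "bolt://localhost:7687",
        "user": "neo4j",
        "password": "your_password",
        "db_name": "memory_user1",
        "embedding_dimension": 768,
    },
)
graph = GraphStoreFactory.from_config(config)
```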
+
#### Neo4j Graph DB
::card
**Recommendation:** ⭐⭐⭐⭐⭐

**Features:**
- Complete graph database functionality
- Support for vector-enhanced retrieval
- Multi-tenant architecture (v0.2.1+)
- Compatible with Community Edition

📖 [View Documentation](./neo4j_graph_db)
::

#### Nebula Graph DB
::card
**Features:**
- Distributed graph database
- High availability
- Suitable for large-scale deployment

📖 [View Documentation](./nebula_graph_db)
::

#### PolarDB Graph DB
::card
**Features:**
- Alibaba Cloud PolarDB graph computing
- Cloud-native architecture
- Enterprise-grade reliability

📖 [View Documentation](./polardb_graph_db)
::

---

## 📊 Feature Comparison Table

| Feature | Naive | General | Preference | Tree | KVCache |
|---------|-------|---------|------------|------|---------|
| **Search Method** | Keyword | Vector Semantic | Vector Semantic | Vector+Graph | N/A |
| **Metadata Support** | ⭐ | ⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐ | - |
| **Relationship Reasoning** | ❌ | ❌ | ❌ | ✅ | - |
| **Deduplication** | ❌ | ⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐ | - |
| **Scalability** | Small | Medium-Large | Medium-Large | Large | - |
| **Deployment Complexity** | Minimal | Medium | Medium | Higher | Medium |
| **Inference Acceleration** | - | - | - | - | ⭐⭐⭐⭐⭐ |

---

## 🛠️ Usage Scenario Recommendations

### Scenario 1: Rapid Prototyping
**Recommended:** [NaiveTextMemory](./naive_textual_memory)
```python
from memos.configs.memory import MemoryConfigFactory
from memos.memories.factory import MemoryFactory

# Configure the extractor LLM (required by NaiveTextMemory)
config = MemoryConfigFactory(backend="naive_text", config={
    "extractor_llm": {"backend": "openai", "config": {"model_name_or_path": "gpt-4o-mini", "api_key": "your-api-key"}},
})
memory = MemoryFactory.from_config(config)
memory.add([{"memory": "User likes coffee"}])
results = memory.search("coffee", top_k=5)
```

### Scenario 2: Chatbot Memory
**Recommended:** [GeneralTextMemory](./general_textual_memory)
- Supports semantic search
- Filter by time, type, source
- Suitable for conversation history management

### Scenario 3: Personalized Recommendation System
**Recommended:** [PreferenceTextMemory](./preference_textual_memory)
- Automatic user preference extraction
- Preference conflict detection
- Strength scoring and filtering

### Scenario 4: Knowledge Graph Applications
**Recommended:** [TreeTextMemory](./tree_textual_memory)
- Multi-hop relationship queries
- Hierarchical structure management
- Complex reasoning scenarios

### Scenario 5: High-Performance LLM Services
**Recommended:** [KVCacheMemory](./kv_cache_memory)
- FAQ systems
- Customer service bots
- High-volume request processing

---

## 🔗 Advanced Features

### MultiModal Reader (Multimodal Reading)
Supported in TreeTextMemory for processing:
- 📷 Images in conversations
- 🌐 Web URLs
- 📄 Local files (PDF, DOCX, TXT, Markdown)
- 🔀 Mixed mode (text+images+URLs)

👉 [View Examples](./tree_textual_memory#using-multimodalstructmemreader-advanced)

### Internet Retrieval
Fetch real-time information from the web and add to memory:
- 🔍 BochaAI search
- 🌍 Google search
- 🔎 Bing search

👉 [View Examples](./tree_textual_memory#retrieve-memories-from-the-internet-optional)

---

## 🚀 Quick Start

1. **Choose Memory Module** - Select the appropriate module based on the guide above
2. **Read Documentation** - Click the corresponding link to view detailed documentation
3. **Hands-On Practice** - Each module has complete code examples
4.
**Production Deployment** - Refer to the best practices section + +--- + +## 📖 Related Resources + +- [API Reference](/api) +- [Best Practices Guide](/best-practices) +- [Example Code Repository](https://github.com/MemOS/examples) +- [FAQ](/faq) + +--- + +::alert{type="tip"} +**Beginner Suggestion:** Start with NaiveTextMemory, understand the basic concepts, then explore GeneralTextMemory and TreeTextMemory. +:: diff --git a/docs/en/open_source/modules/memories/parametric_memory.md b/docs/en/open_source/modules/memories/parametric_memory.md new file mode 100644 index 00000000..ef98aa33 --- /dev/null +++ b/docs/en/open_source/modules/memories/parametric_memory.md @@ -0,0 +1,49 @@ +--- +title: Parametric Memory *(Coming Soon)* +--- + +::note +**Coming Soon** +This feature is still under active development. Stay tuned for updates! +:: + +`Parametric Memory` is the core **long-term knowledge and capability store** inside MemOS. +Unlike plaintext or activation memories, parametric memory is embedded directly within a model’s weights — encoding deep representations of language structure, world knowledge, and general reasoning abilities. + +In the MemOS architecture, parametric memory does not just refer to static pre-trained weights. It also includes modular weight components such as **LoRA adapters** and plug-in expert modules. These allow you to incrementally expand or specialize your LLM’s capabilities without retraining the entire model. + +For example, you could distill structured or stable knowledge into parametric form, save it as a **capability block**, and dynamically load or unload it during inference. This makes it easy to create “expert sub-models” for tasks like legal reasoning, financial analysis, or domain-specific summarization — all managed by MemOS. + + +## Design Goals + +::list{icon="ph:check-circle-duotone"} +- **Controllability** — Generate, load, swap, or compose parametric modules + on demand. +- **Plasticity** — Evolve alongside plaintext and activation memories; support knowledge distillation and rollback. +- **Traceability** *(Coming Soon)* — Versioning and governance for parametric blocks. +:: + +## Current Status + +`Parametric Memory` is currently under design and prototyping. +APIs for generating, compressing, and hot-swapping parametric modules will be released in future versions — supporting multi-task, multi-role, and multi-agent architectures. + +Stay tuned! + + +## Related Modules + +While parametric memory is under development, try out these today: +- **[GeneralTextMemory](/open_source/modules/memories/general_textual_memory)**: Flexible vector-based semantic storage. +- **[TreeTextMemory](/open_source/modules/memories/tree_textual_memory)**: Structured, hierarchical knowledge graphs. +- **[Activation Memory](/open_source/modules/memories/kv_cache_memory)**: Efficient runtime state caching. + +## Developer Note + +Parametric Memory will complete MemOS’s vision of a unified **Memory³** architecture: +- **Parametric**: Embedded knowledge +- **Activation**: Ephemeral runtime states +- **Plaintext**: Structured, traceable external memories + +Bringing all three together enables adaptable, evolvable, and explainable intelligent systems. 
diff --git a/docs/en/open_source/modules/memories/polardb_graph_db.md b/docs/en/open_source/modules/memories/polardb_graph_db.md new file mode 100644 index 00000000..f781216a --- /dev/null +++ b/docs/en/open_source/modules/memories/polardb_graph_db.md @@ -0,0 +1,464 @@ +--- +title: "PolarDB Graph Database" +desc: "Configuration and usage of PolarDB graph database in the MemOS framework. MemOS supports using **PolarDB** (based on Apache AGE extension) as a graph database backend for storing and retrieving knowledge graph-style memory data. PolarDB combines the powerful capabilities of PostgreSQL with the flexibility of graph databases, making it particularly suitable for scenarios requiring both relational and graph data queries." +--- + + + + +## Features + +::list{icon="ph:check-circle-duotone"} +- Complete graph database operations: node CRUD, edge management +- Vector embedding search: semantic retrieval with IVFFlat index support +- Connection pool management: automatic database connection management with high concurrency support +- Multi-tenant isolation: supports both physical and logical isolation modes +- JSONB property storage: flexible metadata storage +- Batch operations: supports batch insertion of nodes and edges +- Automatic timestamps: automatically maintains `created_at` and `updated_at` +- SQL injection protection: built-in parameterized queries and string escaping +:: + +## Directory Structure + +``` +MemOS/ +└── src/ + └── memos/ + ├── configs/ + │ └── graph_db.py # PolarDBGraphDBConfig configuration class + └── graph_dbs/ + ├── base.py # BaseGraphDB abstract base class + ├── factory.py # GraphDBFactory factory class + └── polardb.py # PolarDBGraphDB implementation +``` + +## Quick Start + +### 1. Install Dependencies + +```bash +# Install psycopg2 driver (choose one) +pip install psycopg2-binary # Recommended: pre-compiled version +# or +pip install psycopg2 # Requires PostgreSQL development libraries + +# Install MemOS +pip install MemoryOS -U +``` + +### 2. Configure PolarDB + +#### Method 1: Using Configuration File (Recommended) + +```json +{ + "graph_db_store": { + "backend": "polardb", + "config": { + "host": "localhost", + "port": 5432, + "user": "postgres", + "password": "your_password", + "db_name": "memos_db", + "user_name": "alice", + "use_multi_db": true, + "auto_create": false, + "embedding_dimension": 1024, + "maxconn": 100 + } + } +} +``` + +#### Method 2: Code Initialization + +```python +from memos.configs.graph_db import PolarDBGraphDBConfig +from memos.graph_dbs.polardb import PolarDBGraphDB + +# Create configuration +config = PolarDBGraphDBConfig( + host="localhost", + port=5432, + user="postgres", + password="your_password", + db_name="memos_db", + user_name="alice", + use_multi_db=True, + embedding_dimension=1024, + maxconn=100 +) + +# Initialize database +graph_db = PolarDBGraphDB(config) +``` + +### 3. 
Basic Operation Examples + +```python +# ======================================== +# Step 1: Add Node +# ======================================== +node_id = graph_db.add_node( + label="Memory", + properties={ + "content": "Python is a high-level programming language", + "memory_type": "Knowledge", + "tags": ["programming", "python"] + }, + embedding=[0.1, 0.2, 0.3, ...], # 1024-dimensional vector + user_name="alice" +) +print(f"✓ Node created: {node_id}") + +# ======================================== +# Step 2: Update Node +# ======================================== +graph_db.update_node( + id=node_id, + fields={ + "content": "Python is an interpreted, object-oriented high-level programming language", + "updated": True + }, + user_name="alice" +) +print("✓ Node updated") + +# ======================================== +# Step 3: Create Relationship +# ======================================== +# First create a second node +node_id_2 = graph_db.add_node( + label="Memory", + properties={ + "content": "Django is a web framework for Python", + "memory_type": "Knowledge" + }, + embedding=[0.15, 0.25, 0.35, ...], + user_name="alice" +) + +# Create edge +edge_id = graph_db.add_edge( + source_id=node_id, + target_id=node_id_2, + edge_type="RELATED_TO", + properties={ + "relationship": "framework and language", + "confidence": 0.95 + }, + user_name="alice" +) +print(f"✓ Relationship created: {edge_id}") + +# ======================================== +# Step 4: Vector Search +# ======================================== +query_embedding = [0.12, 0.22, 0.32, ...] # Query vector + +results = graph_db.search_by_embedding( + embedding=query_embedding, + top_k=5, + memory_type="Knowledge", + user_name="alice" +) + +print(f"\n🔍 Found {len(results)} similar nodes:") +for node in results: + print(f" - {node.get('content')} (similarity: {node.get('score', 'N/A')})") + +# ======================================== +# Step 5: Delete Node +# ======================================== +graph_db.delete_node(id=node_id, user_name="alice") +print(f"✓ Node {node_id} deleted") +``` + +## Configuration Details + +### PolarDBGraphDBConfig Parameters + +| Parameter | Type | Default | Required | Description | +|------|------|--------|------|------| +| `host` | str | - | ✓ | Database host address | +| `port` | int | 5432 | ✗ | Database port | +| `user` | str | - | ✓ | Database username | +| `password` | str | - | ✓ | Database password | +| `db_name` | str | - | ✓ | Target database name | +| `user_name` | str | None | ✗ | Tenant identifier (for logical isolation) | +| `use_multi_db` | bool | True | ✗ | Whether to use multi-database physical isolation | +| `auto_create` | bool | False | ✗ | Whether to automatically create database | +| `embedding_dimension` | int | 1024 | ✗ | Vector embedding dimension | +| `maxconn` | int | 100 | ✗ | Maximum connections in connection pool | + +### Multi-Tenant Mode Comparison + +| Feature | Physical Isolation
(`use_multi_db=True`) | Logical Isolation
(`use_multi_db=False`) | +|------|-----------------------------------|-------------------------------------| +| **Isolation Level** | Database level | Application layer tag filtering | +| **Configuration Requirements** | `db_name` typically equals `user_name` | Must provide `user_name` | +| **Performance** | Better (independent resources) | Good (shared resources) | +| **Cost** | High (independent DB per tenant) | Low (shared database) | +| **Use Cases** | Enterprise customers, high security requirements | SaaS multi-tenant, development testing | +| **Data Migration** | Convenient (full database export) | Requires filtering by tags | + +### Configuration Examples + +#### Example 1: Physical Isolation (Recommended for Enterprise) + +```json +{ + "graph_db_store": { + "backend": "polardb", + "config": { + "host": "prod-polardb.example.com", + "port": 5432, + "user": "admin", + "password": "secure_password", + "db_name": "customer_001", + "user_name": null, + "use_multi_db": true, + "auto_create": false, + "embedding_dimension": 1536, + "maxconn": 200 + } + } +} +``` + +#### Example 2: Logical Isolation (Recommended for SaaS) + +```json +{ + "graph_db_store": { + "backend": "polardb", + "config": { + "host": "shared-polardb.example.com", + "port": 5432, + "user": "app_user", + "password": "app_password", + "db_name": "shared_memos", + "user_name": "tenant_alice", + "use_multi_db": false, + "auto_create": false, + "embedding_dimension": 768, + "maxconn": 50 + } + } +} +``` + +## Advanced Features + +### 1. Batch Insert Nodes + +```python +# Batch add nodes (high performance) +nodes_data = [ + { + "label": "Memory", + "properties": {"content": f"Node {i}", "memory_type": "Test"}, + "embedding": [0.1 * i] * 1024, + } + for i in range(100) +] + +node_ids = graph_db.add_nodes_batch( + nodes=nodes_data, + user_name="alice" +) +print(f"✓ Batch created {len(node_ids)} nodes") +``` + +### 2. Complex Query Examples + +```python +# Find memories of specific type and sort by time +def get_recent_memories(graph_db, memory_type, limit=10): + """Get recent memory nodes""" + query = f""" + SELECT * FROM "{graph_db.db_name}_graph"."Memory" + WHERE properties->>'memory_type' = %s + AND properties->>'user_name' = %s + ORDER BY updated_at DESC + LIMIT %s + """ + + conn = graph_db._get_connection() + try: + with conn.cursor() as cursor: + cursor.execute(query, [memory_type, "alice", limit]) + results = cursor.fetchall() + return results + finally: + graph_db._return_connection(conn) + +# Usage example +recent = get_recent_memories(graph_db, "WorkingMemory", limit=5) +print(f"Recent 5 working memories: {len(recent)} items") +``` + +### 3. Vector Index Optimization + +```python +# Create or update vector index +graph_db.create_index( + label="Memory", + vector_property="embedding", + dimensions=1024, + index_name="memory_vector_index" +) +print("✓ Vector index optimized") +``` + +### 4. Connection Pool Monitoring + +```python +# View connection pool status (for debugging only) +import logging +logging.basicConfig(level=logging.DEBUG) + +# Detailed logs will be output when acquiring connection +conn = graph_db._get_connection() +# [DEBUG] [_get_connection] Successfully acquired connection from pool +graph_db._return_connection(conn) +# [DEBUG] [_return_connection] Successfully returned connection to pool +``` + +## BaseGraphDB Interface + +PolarDB implements all methods of the `BaseGraphDB` abstract class, ensuring interoperability with other graph database backends. 
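
Because every backend implements `BaseGraphDB`, code written against the base interface runs unchanged when you swap PolarDB for another engine. A minimal sketch using the `add_node` signature documented below (the `remember` helper and sample values are illustrative):

```python
from memos.graph_dbs.base import BaseGraphDB


def remember(store: BaseGraphDB, text: str, vector: list[float], user: str) -> str:
    """Store one memory node and return its ID; works with any BaseGraphDB backend."""
    return store.add_node(
        label="Memory",
        properties={"content": text, "memory_type": "Knowledge"},
        embedding=vector,
        user_name=user,
    )


# The same call works whether graph_db is PolarDBGraphDB or another implementation.
node_id = remember(graph_db, "PolarDB combines SQL and graph queries.", [0.1] * 1024, "alice")
```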
+ +### Core Methods + +| Method | Description | Parameters | +|------|------|------| +| `add_node()` | Add a single node | label, properties, embedding, user_name | +| `add_nodes_batch()` | Batch add nodes | nodes, user_name | +| `update_node()` | Update node properties | id, fields, user_name | +| `delete_node()` | Delete node | id, user_name | +| `delete_node_by_params()` | Delete nodes by conditions | params, user_name | +| `add_edge()` | Create relationship | source_id, target_id, edge_type, properties, user_name | +| `update_edge()` | Update relationship properties | edge_id, properties, user_name | +| `delete_edge()` | Delete relationship | edge_id, user_name | +| `search_by_embedding()` | Vector similarity search | embedding, top_k, memory_type, user_name | +| `get_node()` | Get a single node | id, user_name | +| `get_memory_count()` | Count nodes | memory_type, user_name | +| `remove_oldest_memory()` | Clean old memories | memory_type, keep_latest, user_name | + +### Complete Method Signature Examples + +```python +from typing import Any + +# Add node +def add_node( + self, + label: str = "Memory", + properties: dict[str, Any] | None = None, + embedding: list[float] | None = None, + user_name: str | None = None +) -> str: + """Add a new node to the graph database""" + pass + +# Vector search +def search_by_embedding( + self, + embedding: list[float], + top_k: int = 10, + memory_type: str | None = None, + user_name: str | None = None, + filters: dict[str, Any] | None = None +) -> list[dict[str, Any]]: + """Perform similarity search based on vector embedding""" + pass + +# Batch operations +def add_nodes_batch( + self, + nodes: list[dict[str, Any]], + user_name: str | None = None +) -> list[str]: + """Batch add multiple nodes""" + pass +``` + +## Extension Development Guide + +If you need to implement custom functionality based on PolarDB, you can inherit the `PolarDBGraphDB` class: + +```python +from memos.graph_dbs.polardb import PolarDBGraphDB +from memos.configs.graph_db import PolarDBGraphDBConfig + +class CustomPolarDBGraphDB(PolarDBGraphDB): + """Custom PolarDB graph database implementation""" + + def __init__(self, config: PolarDBGraphDBConfig): + super().__init__(config) + # Custom initialization logic + self.custom_index_created = False + + def create_custom_index(self): + """Create custom index""" + conn = self._get_connection() + try: + with conn.cursor() as cursor: + cursor.execute(f""" + CREATE INDEX IF NOT EXISTS idx_custom_field + ON "{self.db_name}_graph"."Memory" + ((properties->>'custom_field')); + """) + conn.commit() + self.custom_index_created = True + print("✓ Custom index created") + except Exception as e: + print(f"❌ Failed to create index: {e}") + conn.rollback() + finally: + self._return_connection(conn) + + def search_by_custom_field(self, field_value: str): + """Search based on custom field""" + query = f""" + SELECT * FROM "{self.db_name}_graph"."Memory" + WHERE properties->>'custom_field' = %s + """ + + conn = self._get_connection() + try: + with conn.cursor() as cursor: + cursor.execute(query, [field_value]) + results = cursor.fetchall() + return results + finally: + self._return_connection(conn) + +# Use custom implementation +config = PolarDBGraphDBConfig( + host="localhost", + port=5432, + user="postgres", + password="password", + db_name="custom_db" +) + +custom_db = CustomPolarDBGraphDB(config) +custom_db.create_custom_index() +results = custom_db.search_by_custom_field("special_value") +``` + +## Reference Resources + +- [Apache AGE Official 
Documentation](https://age.apache.org/) +- [PostgreSQL Connection Pool Documentation](https://www.psycopg.org/docs/pool.html) +- [PolarDB Official Documentation](https://www.alibabacloud.com/product/polardb) +- [MemOS GitHub Repository](https://github.com/MemOS-AI/MemOS) + +## Next Steps + +- Learn about using [Neo4j Graph Database](./neo4j_graph_db.md) +- Check out [General Textual Memory](./general_textual_memory.md) configuration +- Explore advanced features of [Tree Textual Memory](./tree_textual_memory.md) diff --git a/docs/en/open_source/modules/memories/preference_textual_memory.md b/docs/en/open_source/modules/memories/preference_textual_memory.md new file mode 100644 index 00000000..ab715be5 --- /dev/null +++ b/docs/en/open_source/modules/memories/preference_textual_memory.md @@ -0,0 +1,226 @@ +--- +title: "PreferenceTextMemory: Textual Memory for User Preferences" +desc: "`PreferenceTextMemory` is a textual memory module in MemOS for storing and managing user preferences. It is suitable for scenarios where memory retrieval needs to be based on user preferences." +--- + +## Table of Contents + +- [Why Preference Memory is Needed](#why-preference-memory-is-needed) + - [Key Features](#key-features) + - [Application Scenarios](#application-scenarios) +- [Core Concepts and Workflow](#core-concepts-and-workflow) + - [Memory Structure](#memory-structure) + - [Metadata Fields (`PreferenceTextualMemoryMetadata`)](#metadata-fields-preferencetextualmemorymetadata) + - [Core Workflow](#core-workflow) +- [API Reference](#api-reference) + - [Initialization](#initialization) + - [Core Methods](#core-methods) + - [File Storage](#file-storage) +- [Hands-on Practice: From Zero to One](#hands-on-practice-from-zero-to-one) + - [Create PreferenceTextMemory Configuration](#create-preferencetextmemory-configuration) + - [Initialize PreferenceTextMemory](#initialize-preferencetextmemory) + - [Extract Structured Memory](#extract-structured-memory) + - [Search Memory](#search-memory) + - [Backup and Restore](#backup-and-restore) + - [Complete Code Example](#complete-code-example) + + +## Why Preference Memory is Needed + +### Key Features + +::list{icon="ph:check-circle-duotone"} +- **Dual Preference Extraction**: Automatically identifies explicit and implicit preferences +- **Semantic Understanding**: Uses vector embeddings to understand the deep meaning of preferences +- **Smart Deduplication**: Automatically detects and merges duplicate or conflicting preferences +- **Precise Retrieval**: Semantic search based on vector similarity +- **Persistent Storage**: Supports vector databases (Qdrant/Milvus) +- **Scalability**: Supports large-scale preference data management +- **Personalization Enhancement**: Maintains independent preference profiles for each user +:: + +### Application Scenarios + +::list{icon="ph:lightbulb-duotone"} +- Personalized conversational agents (remembering user likes/dislikes) +- Intelligent recommendation systems (recommendations based on preferences) +- Customer service systems (providing customized services) +- Content filtering systems (filtering content based on preferences) +- Learning assistance systems (adapting to learning styles) +:: + + +In conclusion, when you need to build systems that can "remember" user preferences and provide personalized services accordingly, `PreferenceTextMemory` is the best choice. 
+## Core Concepts and Workflow

### Memory Structure

In MemOS, preference memory is represented by `PreferenceTextMemory`, where each memory item is a `TextualMemoryItem` stored in a Milvus database.
- `id`: Unique memory ID (automatically generated if omitted)
- `memory`: Main text content
- `metadata`: Includes hierarchical structure information, embeddings, tags, entities, sources, and status

Preference memories fall into two categories, explicit and implicit:
- **Explicit Preference Memory**: Preferences that users state directly. **Examples**:
  - "I like dark mode"
  - "I don't eat spicy food"
  - "Please use short answers"
  - "I prefer technical documentation over video tutorials"

- **Implicit Preference Memory**: Preferences inferred from user behavior and conversation patterns. **Examples**:
  - User always asks for code examples → prefers practice-oriented learning
  - User frequently requests detailed explanations → prefers in-depth understanding
  - User mentions environmental topics multiple times → concerned about sustainable development

::note
**Intelligent Extraction**
+`PreferenceTextMemory` automatically extracts both explicit and implicit preferences from conversations using an LLM; no manual annotation is required!
::

### Metadata Fields (`PreferenceTextualMemoryMetadata`)

| Field | Type | Description |
| ------------- | -------------------------------------------------- | ----------------------------------- |
| `preference_type` | `"explicit_preference"`, `"implicit_preference"` | Preference memory type: explicit or implicit |
| `dialog_id` | `str` | Dialog ID, used to associate a preference memory with a specific dialog |
| `original_text` | `str` | Original text containing the user preference information |
| `embedding` | `str` | Embedding vector for semantic search and retrieval |
| `preference` | `str` | User preference information |
| `create_at` | `str` | Creation timestamp (ISO 8601) |
| `mem_cube_id` | `str` | Memory cube ID, used to associate a preference memory with a specific memory cube |
| `score` | `float` | Similarity score between the preference memory and the query in search results |

### Core Workflow

When you run this example, your workflow will:

1. **Extraction:** Use an LLM to extract structured memories from raw text.

2. **Embedding:** Generate vector embeddings for similarity search.

3. **Storage:** Store preference memories in the Milvus database while updating metadata fields.

4. **Search:** Return the most relevant preference memories through vector similarity queries.

## API Reference

### Initialization

```python
PreferenceTextMemory(config: PreferenceTextMemoryConfig)
```

### Core Methods

| Method | Description |
| --------------------------- | ----------------------------------------------------- |
| `get_memory(messages)` | Extract preference memories from original dialogues. |
| `search(query, top_k)` | Retrieve top-k preference memories using vector similarity. |
| `load(dir)` | Load preference memories from stored files. |
| `dump(dir)` | Serialize all preference memories to JSON files in the directory. |
| `add(memories)` | Batch add preference memories to the Milvus database. |
| `get_with_collection_name(collection_name, memory_id)` | Get a specific type of preference memory by collection name and memory ID. |
| `get_by_ids_with_collection_name(collection_name, memory_ids)` | Batch get a specific type of preference memory by collection name and memory IDs. |
| `get_all()` | Get all preference memories. |
| `get_memory_by_filter(filter)` | Get preference memories matching filter conditions. |
| `delete(memory_ids)` | Delete preference memories by the specified IDs. |
| `delete_by_filter(filter)` | Delete preference memories matching filter conditions. |
| `delete_with_collection_name(collection_name, memory_ids)` | Delete preference memories with the specified collection name and IDs. |
| `delete_all()` | Delete all preference memories. |


### File Storage

When calling `dump(dir)`, MemOS serializes all preference memories to JSON files under:
```
<dir>/<memory_filename>
```

---

## Hands-on Practice: From Zero to One

::steps{}

### Create PreferenceTextMemory Configuration
Define:
- Your embedding model (e.g., `nomic-embed-text:latest`),
- Your Milvus database backend,
- An LLM-based memory extractor (optional).
+ +```python +from memos.configs.memory import PreferenceTextMemoryConfig + +config = PreferenceTextMemoryConfig.from_json_file("examples/data/config/preference_config.json") +``` + +### Initialize PreferenceTextMemory + +```python +from memos.memories.textual.preference import PreferenceTextMemory + +preference_memory = PreferenceTextMemory(config) +``` + +### Extract Structured Memory + +Use the memory extractor to parse dialogues, files, or documents into multiple `TextualMemoryItem`. + +```python +scene_data = [[ + {"role": "user", "content": "Tell me about your childhood."}, + {"role": "assistant", "content": "I loved playing in the garden with my dog."} +]] + +memories = preference_memory.get_memory(scene_data, type="chat", info={"user_id": "1234"}) +preference_memory.add(memories) +``` + +### Search Memory + +```python +results = preference_memory.search("Tell me more about the user", top_k=2) +``` + +### Backup and Restore +Support persistent storage and on-demand reloading of preference memories: +```python +preference_memory.dump("tmp/pref_memories") +preference_memory.load("tmp/pref_memories") +``` + +:: + +### Complete Code Example + +This example integrates all the above steps, providing an end-to-end complete workflow — copy and run! + +```python +from memos.configs.memory import PreferenceTextMemoryConfig +from memos.memories.textual.preference import PreferenceTextMemory + +# Create PreferenceTextMemory +config = PreferenceTextMemoryConfig.from_json_file("examples/data/config/preference_config.json") + +preference_memory = PreferenceTextMemory(config) +preference_memory.delete_all() + +scene_data = [[ + {"role": "user", "content": "Tell me about your childhood."}, + {"role": "assistant", "content": "I loved playing in the garden with my dog."} +]] + +# Extract preference memories from original dialogues and add to Milvus database +memories = preference_memory.get_memory(scene_data, type="chat", info={"user_id": "1234"}) +preference_memory.add(memories) + +# Search memory +results = preference_memory.search("Tell me more about the user", top_k=2) + +# Persist preference memories +preference_memory.dump("tmp/pref_memories") +``` diff --git a/docs/en/open_source/modules/memories/tree_textual_memory.md b/docs/en/open_source/modules/memories/tree_textual_memory.md new file mode 100644 index 00000000..436cd6ed --- /dev/null +++ b/docs/en/open_source/modules/memories/tree_textual_memory.md @@ -0,0 +1,515 @@ +--- +title: "TreeTextMemory: Structured Hierarchical Textual Memory" +desc: > + Let’s build your first **graph-based, tree-structured memory** in MemOS! +
+ **TreeTextMemory** helps you organize, link, and retrieve memories with rich context and explainability. +
+ [Neo4j](/open_source/modules/memories/neo4j_graph_db) is the current backend, with support for additional graph stores planned in the future.
+---
+
+## Table of Contents
+
+- [What You’ll Learn](#what-youll-learn)
+- [Core Concepts and Workflow](#core-concepts-and-workflow)
+  - [Memory Structure](#memory-structure)
+  - [Metadata Fields](#metadata-fields-treenodetextualmemorymetadata)
+  - [Core Workflow](#core-workflow)
+- [API Reference](#api-reference)
+- [Hands-on: From 0 to 1](#hands-on-from-0-to-1)
+  - [Create TreeTextMemory Config](#create-treetextmemory-config)
+  - [Initialize TreeTextMemory](#initialize-treetextmemory)
+  - [Extract Structured Memories](#extract-structured-memories)
+  - [Search Memories](#search-memories)
+  - [Retrieve Memories from the Internet (Optional)](#retrieve-memories-from-the-internet-optional)
+  - [Replace Working Memory](#replace-working-memory)
+  - [Backup & Restore](#backup--restore)
+  - [Full Code Example](#full-code-example)
+- [Why Choose TreeTextMemory](#why-choose-treetextmemory)
+- [What’s Next](#whats-next)
+
+## What You’ll Learn
+
+By the end of this guide, you will:
+- Extract structured memories from raw text or conversations.
+- Store them as **nodes** in a graph database.
+- Link memories into **hierarchies** and semantic graphs.
+- Search them using **vector similarity + graph traversal**.
+
+## Core Concepts and Workflow
+
+### Memory Structure
+
+Every node in your `TreeTextMemory` is a `TextualMemoryItem`:
+- `id`: Unique memory ID (auto-generated if omitted).
+- `memory`: The main text.
+- `metadata`: Includes hierarchy info, embeddings, tags, entities, source, and status.
+
+### Metadata Fields (`TreeNodeTextualMemoryMetadata`)
+
+| Field | Type | Description |
+| --------------- |-------------------------------------------------------| ------------------------------------------ |
+| `memory_type` | `"WorkingMemory"`, `"LongTermMemory"`, `"UserMemory"` | Lifecycle category |
+| `status` | `"activated"`, `"archived"`, `"deleted"` | Node status |
+| `visibility` | `"private"`, `"public"`, `"session"` | Access scope |
+| `sources` | `list[str]` | List of sources (e.g. files, URLs) |
+| `source` | `"conversation"`, `"retrieved"`, `"web"`, `"file"` | Original source type |
+| `confidence` | `float (0-100)` | Certainty score |
+| `entities` | `list[str]` | Mentioned entities or concepts |
+| `tags` | `list[str]` | Thematic tags |
+| `embedding` | `list[float]` | Vector embedding for similarity search |
+| `created_at` | `str` | Creation timestamp (ISO 8601) |
+| `updated_at` | `str` | Last update timestamp (ISO 8601) |
+| `usage` | `list[str]` | Usage history |
+| `background` | `str` | Additional context |
+
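+To make the node structure concrete, here is a minimal sketch that builds one item by hand; the import path and constructor style are assumptions, so adapt them to your MemOS version:
+
+```python
+# Hypothetical import path — check your installation.
+from memos.memories.textual.item import TextualMemoryItem, TreeNodeTextualMemoryMetadata
+
+# One graph node: the main text plus metadata fields from the table above.
+node = TextualMemoryItem(
+    memory="User grows tomatoes in a small garden.",
+    metadata=TreeNodeTextualMemoryMetadata(
+        memory_type="LongTermMemory",
+        status="activated",
+        visibility="private",
+        source="conversation",
+        confidence=90.0,
+        entities=["tomatoes", "garden"],
+        tags=["gardening"],
+        background="Mentioned while discussing weekend hobbies.",
+    ),
+)
+```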
+::note
+**Best Practice**
+Use meaningful tags and background — they help organize your graph for multi-hop reasoning.
+::
+
+### Core Workflow
+
+When you run this example, your workflow will:
+
+1. **Extract:** Use an LLM to pull structured memories from raw text.
+
+2. **Embed:** Generate vector embeddings for similarity search.
+
+3. **Store & Link:** Add nodes to your graph database (Neo4j) with relationships.
+
+4. **Search:** Query by vector similarity, then expand results by graph hops.
+
+::note
+**Hint**
Graph links help retrieve context that pure vector search might miss! +:: + +## API Reference + +### Initialization + +```python +TreeTextMemory(config: TreeTextMemoryConfig) +``` + +### Core Methods + +| Method | Description | +| --------------------------- | ----------------------------------------------------- | +| `add(memories)` | Add one or more memories (items or dicts) | +| `replace_working_memory()` | Replace all WorkingMemory nodes | +| `get_working_memory()` | Get all WorkingMemory nodes | +| `search(query, top_k)` | Retrieve top-k memories using vector + graph search | +| `get(memory_id)` | Fetch single memory by ID | +| `get_by_ids(ids)` | Fetch multiple memories by IDs | +| `get_all()` | Export the full memory graph as dictionary | +| `update(memory_id, new)` | Update a memory by ID | +| `delete(ids)` | Delete memories by IDs | +| `delete_all()` | Delete all memories and relationships | +| `dump(dir)` | Serialize the graph to JSON in directory | +| `load(dir)` | Load graph from saved JSON file | +| `drop(keep_last_n)` | Backup graph & drop database, keeping N backups | + +### File Storage + +When calling `dump(dir)`, the system writes to: + +``` +/ +``` + +This file contains a JSON structure with `nodes` and `edges`. It can be reloaded using `load(dir)`. + +--- + +## Hands-on: From 0 to 1 + +::steps{} + +### Create TreeTextMemory Config +Define: +- your embedder (to create vectors), +- your graph DB backend (Neo4j), +- and your extractor LLM (optional). + +```python +from memos.configs.memory import TreeTextMemoryConfig + +config = TreeTextMemoryConfig.from_json_file("examples/data/config/tree_config.json") +``` + + +### Initialize TreeTextMemory + +```python +from memos.memories.textual.tree import TreeTextMemory + +tree_memory = TreeTextMemory(config) +``` + +### Extract Structured Memories + +Use your extractor to parse conversations, files, or docs into `TextualMemoryItem`s. + +```python +from memos.mem_reader.simple_struct import SimpleStructMemReader + +reader = SimpleStructMemReader.from_json_file("examples/data/config/simple_struct_reader_config.json") + +scene_data = [[ + {"role": "user", "content": "Tell me about your childhood."}, + {"role": "assistant", "content": "I loved playing in the garden with my dog."} +]] + +memories = reader.get_memory(scene_data, type="chat", info={"user_id": "1234"}) +for m_list in memories: + tree_memory.add(m_list) +``` + +#### Using MultiModalStructMemReader (Advanced) + +`MultiModalStructMemReader` supports processing multimodal content (text, images, URLs, files, etc.) 
and intelligently routes to different parsers: + +```python +from memos.configs.mem_reader import MultiModalStructMemReaderConfig +from memos.mem_reader.multi_modal_struct import MultiModalStructMemReader + +# Create MultiModal Reader configuration +multimodal_config = MultiModalStructMemReaderConfig( + llm={ + "backend": "openai", + "config": { + "model_name_or_path": "gpt-4o-mini", + "api_key": "your-api-key" + } + }, + embedder={ + "backend": "openai", + "config": { + "model_name_or_path": "text-embedding-3-small", + "api_key": "your-api-key" + } + }, + chunker={ + "backend": "text_splitter", + "config": { + "chunk_size": 1000, + "chunk_overlap": 200 + } + }, + extractor_llm={ + "backend": "openai", + "config": { + "model_name_or_path": "gpt-4o-mini", + "api_key": "your-api-key" + } + }, + # Optional: specify which domains should return Markdown directly + direct_markdown_hostnames=["github.com", "docs.python.org"] +) + +# Initialize MultiModal Reader +multimodal_reader = MultiModalStructMemReader(multimodal_config) + +# ======================================== +# Example 1: Process conversations with images +# ======================================== +scene_with_image = [[ + { + "role": "user", + "content": [ + {"type": "text", "text": "This is my garden"}, + {"type": "image_url", "image_url": {"url": "https://example.com/garden.jpg"}} + ] + }, + { + "role": "assistant", + "content": "Your garden looks beautiful!" + } +]] + +memories = multimodal_reader.get_memory( + scene_with_image, + type="chat", + info={"user_id": "1234", "session_id": "session_001"} +) +for m_list in memories: + tree_memory.add(m_list) +print(f"✓ Added {len(memories)} multimodal memories") + +# ======================================== +# Example 2: Process web URLs +# ======================================== +scene_with_url = [[ + { + "role": "user", + "content": "Please analyze this article: https://example.com/article.html" + }, + { + "role": "assistant", + "content": "I'll help you analyze this article" + } +]] + +url_memories = multimodal_reader.get_memory( + scene_with_url, + type="chat", + info={"user_id": "1234", "session_id": "session_002"} +) +for m_list in url_memories: + tree_memory.add(m_list) +print(f"✓ Extracted and added {len(url_memories)} memories from URL") + +# ======================================== +# Example 3: Process local files +# ======================================== +# Supported file types: PDF, DOCX, TXT, Markdown, HTML, etc. 
+file_paths = [ + "./documents/report.pdf", + "./documents/notes.md", + "./documents/data.txt" +] + +file_memories = multimodal_reader.get_memory( + file_paths, + type="doc", + info={"user_id": "1234", "session_id": "session_003"} +) +for m_list in file_memories: + tree_memory.add(m_list) +print(f"✓ Extracted and added {len(file_memories)} memories from files") + +# ======================================== +# Example 4: Mixed mode (text + images + URLs) +# ======================================== +mixed_scene = [[ + { + "role": "user", + "content": [ + {"type": "text", "text": "Here's my project documentation:"}, + {"type": "text", "text": "https://github.com/user/project/README.md"}, + {"type": "image_url", "image_url": {"url": "https://example.com/diagram.png"}} + ] + } +]] + +mixed_memories = multimodal_reader.get_memory( + mixed_scene, + type="chat", + info={"user_id": "1234", "session_id": "session_004"} +) +for m_list in mixed_memories: + tree_memory.add(m_list) +print(f"✓ Extracted and added {len(mixed_memories)} memories from mixed content") +``` + +::note +**MultiModal Reader Advantages**
+- **Smart Routing**: Automatically identifies the content type (image/URL/file) and selects the appropriate parser
+- **Format Support**: Supports PDF, DOCX, Markdown, HTML, images, and more
+- **URL Parsing**: Automatically extracts web content (including GitHub, documentation sites, etc.)
+- **Large File Handling**: Automatically chunks oversized files to avoid token limits
+- **Context Preservation**: Uses a sliding window to maintain context continuity between chunks
+::
+
+::note
+**Configuration Tips**
+- Use the `direct_markdown_hostnames` parameter to specify which domains should return Markdown format
+- Supports both `mode="fast"` and `mode="fine"` extraction modes; fine mode extracts more details
+- See complete examples: `/examples/mem_reader/multimodal_struct_reader.py`
+::
+
+### Search Memories
+
+Try a vector + graph search:
+```python
+results = tree_memory.search("Talk about the garden", top_k=5)
+for i, node in enumerate(results):
+    print(f"{i}: {node.memory}")
+```
+
+### Retrieve Memories from the Internet (Optional)
+
+You can also fetch real-time web content using search engines such as Google, Bing, or Bocha, and automatically extract it into structured memory nodes. MemOS provides a unified interface for this purpose.
+
+The following example demonstrates how to retrieve web content related to **“Alibaba 2024 ESG report”** and convert it into structured memories:
+
+```python
+from memos.configs.embedder import EmbedderConfigFactory
+from memos.embedders.factory import EmbedderFactory
+
+# InternetRetrieverConfigFactory / InternetRetrieverFactory come from your
+# MemOS installation; the exact import path may vary by version.
+
+# Create the embedder
+embedder = EmbedderFactory.from_config(
+    EmbedderConfigFactory.model_validate({
+        "backend": "ollama",
+        "config": {"model_name_or_path": "nomic-embed-text:latest"},
+    })
+)
+
+# Configure the retriever (using BochaAI as an example)
+retriever_config = InternetRetrieverConfigFactory.model_validate({
+    "backend": "bocha",
+    "config": {
+        "api_key": "sk-xxx",  # Replace with your BochaAI API Key
+        "max_results": 5,
+        "reader": {  # Reader config for automatic chunking
+            "backend": "simple_struct",
+            "config": ...,  # Your mem-reader config
+        },
+    }
+})
+
+# Instantiate the retriever
+retriever = InternetRetrieverFactory.from_config(retriever_config, embedder)
+
+# Perform internet search
+results = retriever.retrieve_from_internet("Alibaba 2024 ESG report")
+
+# Add results to the memory graph
+for m in results:
+    tree_memory.add(m)
+```
+
+Alternatively, you can configure the `internet_retriever` field directly in the `TreeTextMemoryConfig`. For example:
+
+```json
+{
+  "internet_retriever": {
+    "backend": "bocha",
+    "config": {
+      "api_key": "sk-xxx",
+      "max_results": 5,
+      "reader": {
+        "backend": "simple_struct",
+        "config": ...
+      }
+    }
+  }
+}
+```
+
+With this setup, when you call `tree_memory.search(query)`, the system will automatically trigger an internet search (via BochaAI, Google, or Bing) and merge the results with local memory nodes in a unified ranked list — no need to manually call `retriever.retrieve_from_internet`.
+
+### Replace Working Memory
+
+Replace your current `WorkingMemory` nodes with new ones:
+```python
+tree_memory.replace_working_memory(
+    [{
+        "memory": "User is discussing gardening tips.",
+        "metadata": {"memory_type": "WorkingMemory"}
+    }]
+)
+```
+
+### Backup & Restore
+Dump your entire tree structure to disk and reload it anytime:
+```python
+tree_memory.dump("tmp/tree_memories")
+tree_memory.load("tmp/tree_memories")
+```
+
+::
+
+### Full Code Example
+
+This combines all the steps above into one end-to-end example — copy & run!
+ +```python +from memos.configs.embedder import EmbedderConfigFactory +from memos.configs.memory import TreeTextMemoryConfig +from memos.configs.mem_reader import SimpleStructMemReaderConfig +from memos.embedders.factory import EmbedderFactory +from memos.mem_reader.simple_struct import SimpleStructMemReader +from memos.memories.textual.tree import TreeTextMemory + +# Setup Embedder +embedder_config = EmbedderConfigFactory.model_validate({ + "backend": "ollama", + "config": {"model_name_or_path": "nomic-embed-text:latest"} +}) +embedder = EmbedderFactory.from_config(embedder_config) + +# Create TreeTextMemory +tree_config = TreeTextMemoryConfig.from_json_file("examples/data/config/tree_config.json") +my_tree_textual_memory = TreeTextMemory(tree_config) +my_tree_textual_memory.delete_all() + +# Setup Reader +reader_config = SimpleStructMemReaderConfig.from_json_file( + "examples/data/config/simple_struct_reader_config.json" +) +reader = SimpleStructMemReader(reader_config) + +# Extract from conversation +scene_data = [[ + { + "role": "user", + "content": "Tell me about your childhood." + }, + { + "role": "assistant", + "content": "I loved playing in the garden with my dog." + }, +]] +memory = reader.get_memory(scene_data, type="chat", info={"user_id": "1234", "session_id": "2222"}) +for m_list in memory: + my_tree_textual_memory.add(m_list) + +# Search +results = my_tree_textual_memory.search( + "Talk about the user's childhood story?", + top_k=10 +) +for i, r in enumerate(results): + print(f"{i}'th result: {r.memory}") + +# [Optional] Add from documents +doc_paths = ["./text1.txt", "./text2.txt"] +doc_memory = reader.get_memory( + doc_paths, "doc", info={ + "user_id": "your_user_id", + "session_id": "your_session_id", + } +) +for m_list in doc_memory: + my_tree_textual_memory.add(m_list) + +# [Optional] Dump & Drop +my_tree_textual_memory.dump("tmp/my_tree_textual_memory") +my_tree_textual_memory.drop() +``` + +## Why Choose TreeTextMemory + +- **Structured Hierarchy:** Organize memories like a mind map — nodes can +have parents, children, and cross-links. +- **Graph-Style Linking:** Beyond pure hierarchy — build multi-hop reasoning + chains. +- **Semantic Search + Graph Expansion:** Combine the best of vectors and + graphs. +- **Explainability:** Trace how memories connect, merge, or evolve over time. + +::note +**Try This**
+Add memory nodes from documents or web content. Link them manually or auto-merge similar nodes!
+::
+
+## What’s Next
+
+- **Learn more about [Neo4j](/open_source/modules/memories/neo4j_graph_db):** TreeTextMemory is powered by a graph database backend.
+  Understanding how Neo4j handles nodes, edges, and traversal will help you design more efficient memory hierarchies, multi-hop reasoning, and context linking strategies.
+- **Add [Activation Memory](/open_source/modules/memories/kv_cache_memory):** Experiment with runtime KV-cache for session state.
+- **Explore Graph Reasoning:** Build workflows for multi-hop retrieval and answer synthesis.
+- **Go Deep:** Check the [API Reference](/api-reference/search-memories) for advanced usage, or run more examples in `examples/`.
+
+Now your agent remembers not just facts — but the connections between them!
diff --git a/docs/en/open_source/modules/model_backend.md b/docs/en/open_source/modules/model_backend.md
new file mode 100644
index 00000000..164352f9
--- /dev/null
+++ b/docs/en/open_source/modules/model_backend.md
@@ -0,0 +1,104 @@
+---
+title: LLMs and Embeddings
+desc: "A practical guide to configuring and using Large Language Models (LLM) and Embedders in **MemOS**."
+---
+
+## Overview
+MemOS decouples **model logic** from **runtime config** via two Pydantic factories:
+
+| Factory | Produces | Typical backends |
+|---------|----------|------------------|
+| `LLMFactory` | Chat model | `ollama`, `openai`, `azure`, `qwen`, `deepseek`, `huggingface`, `huggingface_singleton`, `vllm`, `openai_new` |
+| `EmbedderFactory` | Text embedder | `ollama`, `sentence_transformer`, `ark`, `universal_api` |
+
+Both factories accept a `*_ConfigFactory.model_validate(...)` blob, so you can switch providers with a single `backend=` swap.
+
+## LLM Module
+
+### Supported LLM Backends
+| Backend | Notes | Example model_name_or_path |
+|---|---|---|
+| `ollama` | Local Ollama server | `qwen3:0.6b` |
+| `openai` | OpenAI-compatible Chat Completions | `gpt-4.1-nano` |
+| `azure` | Azure OpenAI Chat Completions | `` |
+| `qwen` | DashScope OpenAI-compatible API | `qwen-plus` |
+| `deepseek` | DeepSeek OpenAI-compatible API | `deepseek-chat` / `deepseek-reasoner` |
+| `huggingface` | Local transformers pipeline | `Qwen/Qwen3-1.7B` |
+| `huggingface_singleton` | Same as `huggingface` + singleton reuse | `Qwen/Qwen3-1.7B` |
+| `vllm` | OpenAI-compatible vLLM server | `Qwen/Qwen2.5-7B-Instruct` |
+| `openai_new` | OpenAI Responses API wrapper | `gpt-4.1` |
+
+### LLM Config Schema
+
+Common fields:
+
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+| `model_name_or_path` | str | – | Model id or local tag |
+| `temperature` | float | 0.7 | Sampling temperature |
+| `max_tokens` | int | 8192 | Maximum tokens to generate |
+| `top_p` / `top_k` | float / int | 0.95 / 50 | Nucleus / top-k sampling parameters |
+| *API-specific* | e.g. `api_key`, `api_base` | – | OpenAI-compatible creds |
+| `remove_think_prefix` | bool | False | Remove content within think tags from the generated text |
+
+### Factory Usage
+```python
+from memos.configs.llm import LLMConfigFactory
+from memos.llms.factory import LLMFactory
+
+cfg = LLMConfigFactory.model_validate({
+    "backend": "ollama",
+    "config": {"model_name_or_path": "qwen3:0.6b"}
+})
+llm = LLMFactory.from_config(cfg)
+```
+
+### LLM Core APIs
+| Method | Purpose |
+|--------|---------|
+| `generate(messages: list)` | Return full string response |
+| `generate_stream(messages)` | Yield streaming chunks |
+
+### Streaming & CoT
+```python
+messages = [{"role": "user", "content": "Let’s think step by step: …"}]
+for chunk in llm.generate_stream(messages):
+    print(chunk, end="")
+```
+
+::note
+**Full code**
+Find all scenarios in `examples/basic_modules/llm.py`.
+::
+
+### Performance Tips
+- Use `qwen3:0.6b` for a <2 GB footprint when prototyping locally.
+- Combine with KV Cache (see the *KVCacheMemory* doc) to cut TTFT (time to first token).
+
+## Embedding Module
+
+### Supported Embedder Backends
+| Backend | Notes | Example model_name_or_path |
+|---|---|---|
+| `ollama` | Local Ollama server | `nomic-embed-text:latest` |
+| `sentence_transformer` | Local sentence-transformers | `nomic-ai/nomic-embed-text-v1.5` |
+| `ark` | Volcano Engine Ark embeddings | `` |
+| `universal_api` | Universal provider wrapper (e.g. OpenAI) | `text-embedding-3-large` |
+
+### Embedder Config Schema
+Shared keys: `model_name_or_path`, optional API creds (`api_key`, `base_url`), etc.
+
+### Factory Usage
+```python
+from memos.configs.embedder import EmbedderConfigFactory
+from memos.embedders.factory import EmbedderFactory
+
+cfg = EmbedderConfigFactory.model_validate({
+    "backend": "ollama",
+    "config": {"model_name_or_path": "nomic-embed-text:latest"}
+})
+embedder = EmbedderFactory.from_config(cfg)
+```
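+As a quick sanity check, you can embed a string and inspect the vector shape. This is a minimal sketch, assuming the embedder exposes an `embed` method that takes a list of strings and returns one vector per input:
+
+```python
+# `embed` is an assumption about the embedder interface — verify against your version.
+vectors = embedder.embed(["MemOS decouples model logic from runtime config."])
+print(len(vectors), len(vectors[0]))  # 1 vector; the dimension depends on the model
+```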
diff --git a/docs/en/open_source/modules/mos/memos_mcp.md b/docs/en/open_source/modules/mos/memos_mcp.md
new file mode 100644
index 00000000..4118e4cb
--- /dev/null
+++ b/docs/en/open_source/modules/mos/memos_mcp.md
@@ -0,0 +1,110 @@
+---
+title: MCP (Model Context Protocol) Setup Guide
+desc: The Model Context Protocol (MCP) is a standard protocol that enables AI assistants to securely access and interact with local and remote resources. In the MemOS project, MCP provides a standardized interface for memory operations, allowing external applications to interact with the memory system through well-defined tools and resources.
+---
+
+## Configuration
+
+### Environment Variables
+
+Create a `.env` file in your project root with the following configuration:
+
+```bash
+# OpenAI Configuration
+OPENAI_API_KEY=your_openai_api_key_here
+OPENAI_API_BASE=https://api.openai.com/v1
+
+# Memory System Configuration
+MOS_TEXT_MEM_TYPE=tree_text
+
+# Neo4j Configuration (required for tree_text memory type)
+NEO4J_URI=bolt://localhost:7687
+NEO4J_USER=neo4j
+NEO4J_PASSWORD=your_neo4j_password
+```
+
+## Starting the MCP Server
+
+### Method 1: Using the Built-in Server Script
+
+```bash
+# Navigate to the project root
+cd /path/to/MemOS
+
+# Run with default stdio transport
+python src/memos/api/mcp_serve.py
+
+# Run with HTTP transport
+python src/memos/api/mcp_serve.py --transport http --host localhost --port 8000
+
+# Run with SSE transport (deprecated but supported)
+python src/memos/api/mcp_serve.py --transport sse --host localhost --port 8000
+```
+
+### Method 2: Using the Example Script
+
+```bash
+# Navigate to the examples directory
+cd examples/mem_mcp
+
+# Run the server
+python simple_fastmcp_serve.py --transport http --port 8000
+```
+
+### Transport Options
+
+The MCP server supports three transport methods:
+
+1. **stdio** (default): Standard input/output for local applications
+2. **http**: HTTP-based transport for web applications
+3. **sse**: Server-Sent Events (deprecated but still supported)
+
+### Command Line Arguments
+
+- `--transport`: Choose transport method (`stdio`, `http`, `sse`)
+- `--host`: Host address for HTTP/SSE transport (default: `localhost`)
+- `--port`: Port number for HTTP/SSE transport (default: `8000`)
+
+## MCP Client Usage
+
+### Basic Client Example
+
+The project includes a sample client that demonstrates how to interact with the MCP server:
+
+```bash
+# Ensure the MCP server is running on HTTP transport
+cd examples/mem_mcp
+python simple_fastmcp_serve.py --transport http --port 8000
+
+# In another terminal, run the client
+cd examples/mem_mcp
+python simple_fastmcp_client.py
+```
+
+## MCP Configuration
+
+For Cursor IDE integration (and other local MCP clients), add this configuration to your `desktop_config.json`:
+
+```json
+{
+  "mcpServers": {
+    "memos-fastmcp": {
+      "command": "/path/to/your/conda/envs/memos/bin/python",
+      "args": [
+        "-m", "memos.api.mcp_serve",
+        "--transport", "stdio"
+      ],
+      // "cwd": "/path/to/your/MemOS",  // optional; not needed when MemOS is installed via pip
+      "env": {
+        "OPENAI_API_KEY": "sk-your-openai-key-here",
+        "OPENAI_API_BASE": "https://api.openai.com/v1",
+        "MOS_TEXT_MEM_TYPE": "tree_text",
+        "NEO4J_URI": "bolt://localhost:7687",
+        "NEO4J_USER": "neo4j",
+        "NEO4J_PASSWORD": "your-neo4j-password"
+      }
+    }
+  }
+}
+```
diff --git a/docs/en/open_source/modules/mos/memos_neo.md b/docs/en/open_source/modules/mos/memos_neo.md
new file mode 100644
index 00000000..86b1ebb1
--- /dev/null
+++ b/docs/en/open_source/modules/mos/memos_neo.md
@@ -0,0 +1,171 @@
+---
+title: MemOS NEO Version
+desc: Get up and running with MemOS in minutes using `MOS.simple()` - the fastest way to start building memory-enhanced applications.
+---
+
+## Quick Setup
+
+### Environment Variables
+
+Set your API credentials:
+
+```bash
+export OPENAI_API_KEY="sk-your-api-key-here"
+export OPENAI_API_BASE="https://api.openai.com/v1"  # Optional
+export MOS_TEXT_MEM_TYPE="general_text"  # or "tree_text" for advanced use
+
+# Note: general_text supports only a single user when MOS is initialized
+```
+
+### One-Line Setup
+
+```python
+from memos.mem_os.main import MOS
+
+# Auto-configured instance
+memory = MOS.simple()
+```
+::note
+**Warning:**
+`MOS.simple()` uses the default embedding model `text-embedding-3-large` (3072 dimensions). If you used an earlier MemOS version with a different embedder, delete the `~/.memos` directory so a fresh Qdrant collection is created, or drop the Neo4j database.
+::
+
+## Basic Usage
+
+```python
+#!/usr/bin/env python3
+import os
+from memos.mem_os.main import MOS
+
+# Set environment variables
+os.environ["OPENAI_API_KEY"] = "sk-your-api-key"
+os.environ["MOS_TEXT_MEM_TYPE"] = "general_text"
+
+# Create memory system
+memory = MOS.simple()
+
+# Add memories
+memory.add("My favorite color is blue")
+memory.add("I work as a software engineer")
+memory.add("I live in San Francisco")
+
+# Chat with memory context
+response = memory.chat("What is the user's favorite color?")
+print(response)  # e.g. "Your favorite color is blue!"
+
+response = memory.chat("Tell me about the user's job and location")
+print(response)  # Uses stored memories to respond
+```
+
+## Memory Types
+
+### General Text Memory (Recommended for Beginners)
+- **Storage**: Local JSON files + Qdrant vector database
+- **Setup**: No external dependencies
+- **Best for**: Most use cases, quick prototyping
+
+```bash
+export MOS_TEXT_MEM_TYPE="general_text"
+```
+
+### Tree Text Memory (Advanced)
+- **Storage**: Neo4j graph database
+- **Setup**: Requires a Neo4j server
+- **Best for**: Complex relationship reasoning
+
+```bash
+export MOS_TEXT_MEM_TYPE="tree_text"
+export NEO4J_URI="bolt://localhost:7687" # Optional
+export NEO4J_PASSWORD="your-password"    # Optional
+```
+
+## NEO Version Overview
+
+`MOS.simple()` automatically creates a complete configuration using sensible defaults:
+
+### Default Settings
+- **LLM**: GPT-4o-mini with temperature 0.8
+- **Embedder**: OpenAI text-embedding-3-large
+- **Chunking**: 512 tokens with 128 overlap
+- **Graph DB**: Neo4j (used for `tree_text` memory)
+
+### Default Configuration Utilities
+
+MemOS provides three main configuration utilities in `default_config.py`:
+
+- **`get_default_config()`**: Creates a complete MOS configuration with sensible defaults
+- **`get_default_cube_config()`**: Creates a MemCube configuration for memory storage
+- **`get_default()`**: Returns both the MOS config and a MemCube instance together
+
+```python
+from memos.mem_os.utils.default_config import get_default, get_default_cube_config
+
+# Get both MOS config and MemCube instance
+mos_config, default_cube = get_default(
+    openai_api_key="sk-your-key",
+    text_mem_type="general_text"
+)
+
+# Or create just the MemCube config
+cube_config = get_default_cube_config(
+    openai_api_key="sk-your-key",
+    text_mem_type="general_text"
+)
+```
+
+### Manual Configuration (Optional)
+
+If you need more control, use the configuration utilities:
+
+```python
+from memos.mem_os.main import MOS
+from memos.mem_os.utils.default_config import get_default_config

+# Custom configuration
+config = get_default_config(
+    openai_api_key="sk-your-key",
+    text_mem_type="general_text",
+    user_id="my_user",
+    model_name="gpt-4",  # Different model
+    temperature=0.5,     # Lower creativity
+    chunk_size=256,      # Smaller chunks
+    top_k=10             # More search results
+)
+
+memory = MOS(config)
+```
+
+### Advanced Features
+
+Enable additional capabilities:
+
+```python
+config = get_default_config(
+    openai_api_key="sk-your-key",
+    enable_activation_memory=True,  # KV-cache memory
+    enable_mem_scheduler=True,      # Background processing
+)
+```
+
+## Other Tips
+
+1. **Start Simple**: Use the `general_text` memory type initially
+2. **Environment Setup**: Keep API keys in environment variables
+3. **Memory Quality**: Add specific, factual information for best results
+4. **Batch Operations**: Add multiple related memories together
+5. **User Context**: Use the `user_id` parameter for multi-user scenarios (supported only with `tree_text`)
+
+## Troubleshooting
+
+### Common Issues
+
+**Missing API Key Error**:
+```bash
+# Ensure the environment variable is set
+echo $OPENAI_API_KEY
+```
+
+**Neo4j Connection Error** (tree_text mode):
+```bash
+# Check that Neo4j is running (Neo4j Desktop for local use, or your enterprise Neo4j server)
+```
diff --git a/docs/en/open_source/modules/mos/overview.md b/docs/en/open_source/modules/mos/overview.md
new file mode 100644
index 00000000..07225482
--- /dev/null
+++ b/docs/en/open_source/modules/mos/overview.md
@@ -0,0 +1,105 @@
+---
+title: MemOS API Development Guide (Components & Handlers Architecture)
+desc: MemOS v2.0 adopts a more modular and decoupled architecture. The legacy MOS class is deprecated; Components + Handlers is now the recommended development pattern.
+---
+
+This architecture separates "system components" (Components) from "business logic execution" (Handlers), making the system easier to extend, test, and maintain.
+
+## 1. Core Concepts
+
+### 1.1 Components (Core Components)
+
+Components are the "organs" of MemOS. They are initialized when the server starts (via `init_server()`) and reused throughout the system lifecycle.
+
+Core components include:
+
+#### Core Memory Components
+
+1. **MemCube**: A memory container that isolates memories across different users and application scenarios, managing multiple memory modules in a unified way.
+2. **MemReader**: A memory processor that parses user inputs (chat, documents, images) into standardized memory items that the system can persist.
+3. **MemScheduler**: A background scheduler that handles asynchronous processing of memory operations—storage, indexing, and organization—supporting concurrent task execution.
+4. **MemChat**: A conversation controller responsible for orchestrating the memory-augmented dialogue loop: "retrieve memory → generate response → store new memory".
+5. **MemFeedback**: A memory correction engine that understands users' natural-language feedback and performs atomic-level updates to memories (correction, addition, replacement).
+
+### 1.2 Handlers (Business Processors)
+
+Handlers are the "brain" of MemOS. They encapsulate concrete business logic by coordinating and calling the capabilities of Components to complete user-facing tasks.
+
+#### Core Handlers Overview
+
+| Handler | Purpose | Key Methods |
+| :--- | :--- | :--- |
+| **AddHandler** | Add memories (chat / documents / text) | `handle_add_memories` |
+| **SearchHandler** | Search memories (semantic retrieval) | `handle_search_memories` |
+| **ChatHandler** | Chat (with memory augmentation) | `handle_chat_complete`, `handle_chat_stream` |
+| **FeedbackHandler** | Feedback (correct memories / human feedback) | `handle_feedback_memories` |
+| **MemoryHandler** | Manage (get details / delete) | `handle_get_memory`, `handle_delete_memories` |
+| **SchedulerHandler** | Scheduling (query async task status) | `handle_scheduler_status`, `handle_scheduler_wait` |
+| **SuggestionHandler** | Suggestions (generate recommended questions) | `handle_get_suggestion_queries` |
+
+## 2. API Details
+
+### 2.1 Initialization
+Initialization is the foundation of system startup. All Handlers rely on a unified component registry and dependency-injection mechanism.
+
+- Component loading (`init_server`): When the system starts, it initializes all core components, including the LLM, storage layers (vector DB, graph DB), scheduler, and various Memory Cubes.
+- Dependency injection (`HandlerDependencies`): To ensure loose coupling and testability, all components are wrapped into a `HandlerDependencies` container. When a Handler is instantiated, it receives this container and can access needed resources—such as `naive_mem_cube`, `mem_reader`, or `feedback_server`—without duplicating initialization logic (see the sketch below).
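+A minimal wiring sketch of this pattern follows; the constructor arguments and handler names mirror the prose above, but the exact signatures are assumptions and may differ in your MemOS version:
+
+```python
+# Hypothetical wiring: bundle initialized components into the dependency
+# container, then hand the container to each Handler.
+deps = HandlerDependencies(
+    naive_mem_cube=naive_mem_cube,
+    mem_reader=mem_reader,
+    feedback_server=feedback_server,
+)
+add_handler = AddHandler(deps)       # each handler pulls what it needs from deps
+search_handler = SearchHandler(deps)
+```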
+
+### 2.2 Add Memories (AddHandler)
+AddHandler is the brain's "memory intake instruction", responsible for converting external information into system memories. It handles not only intake and conversion of various information types, but also automatically recognizes feedback and routes it to dedicated feedback processing.
+
+- Core capabilities:
+  - Multimodal support: Processes user conversations, documents, images, and other input types, converting them into standardized memory objects.
+  - Sync and async modes: Controlled via `async_mode`. **Sync mode** ("sync"): processes immediately and blocks until completion, suitable for debugging. **Async mode** ("async"): pushes tasks to a background queue for concurrent processing by MemScheduler and returns a task ID immediately, suitable for production to improve response speed.
+  - Automatic feedback routing: If the request sets `is_feedback=True`, the Handler automatically extracts the last user message as feedback content and routes it to MemFeedback processing, instead of adding it as a normal memory.
+  - Multi-target writes: Supports writing to multiple MemCubes simultaneously. When multiple targets are specified, the system processes all write tasks in parallel; when only one target is specified, it uses a lightweight approach.
+
+### 2.3 Search Memories (SearchHandler)
+SearchHandler is the brain's "memory retrieval instruction", providing semantic-based intelligent memory query capabilities and serving as a key component for RAG (Retrieval-Augmented Generation).
+
+- Core capabilities:
+  - Semantic retrieval: Uses embedding technology to recall relevant memories based on semantic similarity, understanding user intent more accurately than simple keyword matching.
+  - Flexible search scope: Supports specifying the target data range for retrieval. For example, you can search only within a specific user's memory, or search across multiple users' shared public memories, meeting different privacy and business needs.
+  - Multiple retrieval modes: Flexibly choose between speed and accuracy based on application scenarios. **Fast mode** suits scenarios requiring high real-time performance, **fine mode** suits scenarios pursuing high retrieval accuracy, and **mixed mode** balances both.
+  - Multi-step reasoning retrieval: For complex questions, supports deep reasoning capability to progressively approach the most relevant memories through multiple rounds of understanding and retrieval.
+
+### 2.4 Chat (ChatHandler)
+ChatHandler is the brain's "dialogue coordination instruction", responsible for converting user dialogue requirements into a complete business process. It does not directly operate on memories; instead, it coordinates other Handlers to complete end-to-end dialogue tasks.
+
+- Core capabilities:
+  - Orchestration: Automatically executes the complete dialogue loop of "retrieve memory → generate response → store memory". Each user query benefits from historical memories for smarter responses, and each dialogue is crystallized as new memory, achieving "chat-as-learning".
+  - Context management: Handles the assembly of `history` (past conversation) and `query` (current question) to ensure the LLM understands the complete dialogue context and avoids information loss.
+  - Multiple interaction modes: Supports standard request-response mode and streaming response mode. Standard mode suits simple questions, streaming mode suits long-text replies, meeting different frontend interaction needs.
+  - Message push (optional): Supports automatically pushing results to third-party platforms (such as DingTalk) after generating responses, enabling multi-channel integration.
+
+### 2.5 Feedback and Correction (FeedbackHandler)
+FeedbackHandler is the brain's "feedback correction instruction", responsible for understanding users' natural-language feedback about AI performance and automatically locating and correcting relevant memory content.
+
+- Core capabilities:
+  - Memory correction: When users point out AI errors (such as "the meeting location is Shanghai, not Beijing"), the Handler automatically updates or marks old memories. The system uses version management rather than direct deletion, maintaining traceability of modification history.
+  - Positive and negative feedback: Supports users marking specific memory quality through upvote or downvote. The system adjusts the memory's weight and credibility accordingly, making subsequent retrieval more accurate.
+  - Precise targeting: Supports two feedback modes. One is automatic conflict detection based on dialogue history; the other allows users to directly specify memories to correct, improving feedback effectiveness and accuracy.
+
+### 2.6 Memory Management (MemoryHandler)
+MemoryHandler is the brain's "memory management instruction", providing low-level CRUD capabilities for memory data, primarily for system admin backends or data cleanup scenarios.
+
+- Core capabilities:
+  - Fine-grained management: Unlike AddHandler's business-level writes, this Handler allows fetching detailed information of a single memory or performing physical deletion by memory ID. This direct operation bypasses business logic packaging, primarily for debugging, auditing, or system cleanup.
+  - Direct backend access: Some management operations need to interact directly with the underlying memory component (naive_mem_cube) to provide the most efficient and lowest-latency data operations, meeting system operations needs.
+
+### 2.7 Scheduler Status (SchedulerHandler)
+SchedulerHandler is the brain's "task monitoring instruction", responsible for tracking the real-time execution status of all async tasks in the system, allowing users to understand background task progress and results.
+
+- Core capabilities:
+  - Status tracking: Tracks task status in real time (queued, running, completed, failed). This is important for users in async mode who need to know when tasks complete.
+  - Result fetching: Provides a task result query interface. When async tasks complete, users can fetch the final execution result or error information through this interface, understanding whether operations succeeded and the reasons for failure.
+  - Sync wait (debugging tool): During development and integration testing, provides a tool to force async tasks into synchronous waits, allowing developers to debug async flows like debugging synchronous code, improving development efficiency.
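+To make the async flow concrete, here is a hypothetical polling sketch. The method names follow the table in section 1.2, but the exact signatures and return shapes are assumptions and may differ in your MemOS version:
+
+```python
+# Submit an add task asynchronously (hypothetical payload and signature).
+task = add_handler.handle_add_memories(payload, async_mode="async")
+task_id = task["task_id"]  # async mode returns a task ID immediately
+
+# Poll the scheduler for the task's state (queued / running / completed / failed).
+status = scheduler_handler.handle_scheduler_status(task_id=task_id)
+
+# In tests, force a synchronous wait so the async flow can be debugged linearly.
+if status["state"] != "completed":
+    result = scheduler_handler.handle_scheduler_wait(task_id=task_id)
+```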
+ +### 2.8 Suggested Questions (SuggestionHandler) +SuggestionHandler is the brain's "suggestion generation instruction", predicting users' potential needs and proactively recommending related questions to help users explore system capabilities and discover topics of interest. + +- Core capabilities: + - Dual-mode generation: + - Conversation-based suggestions: When users provide recent conversation records, the system analyzes dialogue context and infers potential follow-up topics of interest, generating 3 related recommended questions. + - Memory-based suggestions: When there is no conversation context, the system infers user interests and status from recent memories, generating recommended questions related to the user's recent life or work. This suits dialogue initiation or topic transitions. + - Multi-language support: Recommended questions automatically adapt to user language settings, supporting Chinese, English, and other languages, improving experience for different users. diff --git a/docs/en/open_source/modules/mos/users.md b/docs/en/open_source/modules/mos/users.md new file mode 100644 index 00000000..8ded0175 --- /dev/null +++ b/docs/en/open_source/modules/mos/users.md @@ -0,0 +1,306 @@ +--- +title: User Management +desc: The **MOS** provides comprehensive user management capabilities to support multi-user, multi-session memory operations. This document details the user management methods available in the MOS. +--- + +## User Roles + +MOS supports four user roles with different permission levels: + +| Role | Description | Permissions | +|------|-------------|-------------| +| `ROOT` | System administrator | Full access to all cubes and users, cannot be deleted | +| `ADMIN` | Administrative user | Can manage users and cubes, access to all cubes | +| `USER` | Standard user | Can create and manage own cubes, access shared cubes | +| `GUEST` | Limited user | Read-only access to shared cubes, cannot create cubes | + +## User Management Methods + +### 1. `create_user` + +Creates a new user in the MOS system. + +**Parameters:** +- `user_id` (str): Unique identifier for the user +- `role` (UserRole, optional): User role. Defaults to `UserRole.USER` +- `user_name` (str, optional): Display name for the user. If not provided, uses `user_id` + +**Returns:** +- `str`: The created user ID + +**Example:** +```python +import uuid +from memos.mem_user.user_manager import UserRole + +# Create a standard user +user_id = str(uuid.uuid4()) +memory.create_user(user_id=user_id, role=UserRole.USER, user_name="John Doe") + +# Create an admin user +admin_id = str(uuid.uuid4()) +memory.create_user(user_id=admin_id, role=UserRole.ADMIN, user_name="Admin User") + +# Create a guest user +guest_id = str(uuid.uuid4()) +memory.create_user(user_id=guest_id, role=UserRole.GUEST, user_name="Guest User") +``` + +**Notes:** +- If a user with the same `user_name` already exists, the method returns the existing user's ID +- The system automatically creates a root user during initialization +- User IDs must be unique across the system + +### 2. `list_users` + +Retrieves information about all active users in the system. 
+ +**Parameters:** +- None + +**Returns:** +- `list`: List of dictionaries containing user information: + - `user_id` (str): Unique user identifier + - `user_name` (str): Display name of the user + - `role` (str): User role (root, admin, user, guest) + - `created_at` (str): ISO format timestamp of user creation + - `is_active` (bool): Whether the user account is active + +**Example:** +```python +# List all users +users = memory.list_users() +for user in users: + print(f"User: {user['user_name']} (ID: {user['user_id']})") + print(f"Role: {user['role']}") + print(f"Active: {user['is_active']}") + print(f"Created: {user['created_at']}") + print("---") +``` + +**Output Example:** +``` +User: root (ID: root) +Role: root +Active: True +Created: 2024-01-15T10:30:00 +--- +User: John Doe (ID: 550e8400-e29b-41d4-a716-446655440000) +Role: user +Active: True +Created: 2024-01-15T11:00:00 +--- +``` + +### 3. `create_cube_for_user` + +Creates a new memory cube for a specific user as the owner. + +**Parameters:** +- `cube_name` (str): Name of the cube +- `owner_id` (str): User ID of the cube owner +- `cube_path` (str, optional): Local file path or remote repository URL for the cube +- `cube_id` (str, optional): Custom cube identifier. If not provided, a UUID is generated + +**Returns:** +- `str`: The created cube ID + +**Example:** +```python +import uuid + +# Create a user first +user_id = str(uuid.uuid4()) +memory.create_user(user_id=user_id, user_name="Alice") + +# Create a cube for the user +cube_id = memory.create_cube_for_user( + cube_name="Alice's Personal Memory", + owner_id=user_id, + cube_path="/path/to/alice/memory", + cube_id="alice_personal_cube" +) + +print(f"Created cube: {cube_id}") +``` + +**Notes:** +- The owner automatically gets full access to the created cube +- The cube owner can share the cube with other users +- If `cube_path` is provided, it can be a local directory path or a remote repository URL +- Custom `cube_id` must be unique across the system + +### 4. `get_user_info` + +Retrieves detailed information about the current user and their accessible cubes. 
+ +**Parameters:** +- None + +**Returns:** +- `dict`: Dictionary containing user information and accessible cubes: + - `user_id` (str): Current user's ID + - `user_name` (str): Current user's display name + - `role` (str): Current user's role + - `created_at` (str): ISO format timestamp of user creation + - `accessible_cubes` (list): List of dictionaries for each accessible cube: + - `cube_id` (str): Cube identifier + - `cube_name` (str): Cube display name + - `cube_path` (str): Cube file path or repository URL + - `owner_id` (str): ID of the cube owner + - `is_loaded` (bool): Whether the cube is currently loaded in memory + +**Example:** +```python +# Get current user information +user_info = memory.get_user_info() + +print(f"Current User: {user_info['user_name']} ({user_info['user_id']})") +print(f"Role: {user_info['role']}") +print(f"Created: {user_info['created_at']}") +print("\nAccessible Cubes:") +for cube in user_info['accessible_cubes']: + print(f"- {cube['cube_name']} (ID: {cube['cube_id']})") + print(f" Owner: {cube['owner_id']}") + print(f" Loaded: {cube['is_loaded']}") + print(f" Path: {cube['cube_path']}") +``` + +**Output Example:** +``` +Current User: Alice (550e8400-e29b-41d4-a716-446655440000) +Role: user +Created: 2024-01-15T11:00:00 + +Accessible Cubes: +- Alice's Personal Memory (ID: alice_personal_cube) + Owner: 550e8400-e29b-41d4-a716-446655440000 + Loaded: True + Path: /path/to/alice/memory +- Shared Project Memory (ID: project_cube) + Owner: bob_user_id + Loaded: False + Path: /path/to/project/memory +``` + +### 5. `share_cube_with_user` + +Shares a memory cube with another user, granting them access to the cube's contents. + +**Parameters:** +- `cube_id` (str): ID of the cube to share +- `target_user_id` (str): ID of the user to share the cube with + +**Returns:** +- `bool`: `True` if sharing was successful, `False` otherwise + +**Example:** +```python +# Share a cube with another user +success = memory.share_cube_with_user( + cube_id="alice_personal_cube", + target_user_id="bob_user_id" +) + +if success: + print("Cube shared successfully") +else: + print("Failed to share cube") +``` + +**Notes:** +- The current user must have access to the cube being shared +- The target user must exist and be active +- Sharing a cube grants the target user read and write access to the cube +- Cube owners can always share their cubes +- Users with access to a cube can share it with other users (if they have appropriate permissions) + +## Complete User Management Workflow + +Here's a complete example demonstrating user management operations: + +```python +import uuid +from memos.configs.mem_os import MOSConfig +from memos.mem_os.main import MOS +from memos.mem_user.user_manager import UserRole + +# Initialize MOS +mos_config = MOSConfig.from_json_file("examples/data/config/simple_memos_config.json") +memory = MOS(mos_config) + +# 1. Create users +alice_id = str(uuid.uuid4()) +bob_id = str(uuid.uuid4()) + +memory.create_user(user_id=alice_id, user_name="Alice", role=UserRole.USER) +memory.create_user(user_id=bob_id, user_name="Bob", role=UserRole.USER) + +# 2. List all users +print("All users:") +users = memory.list_users() +for user in users: + print(f"- {user['user_name']} ({user['role']})") + +# 3. 
Create cubes for users +alice_cube_id = memory.create_cube_for_user( + cube_name="Alice's Personal Memory", + owner_id=alice_id, + cube_path="/path/to/alice/memory" +) + +bob_cube_id = memory.create_cube_for_user( + cube_name="Bob's Work Memory", + owner_id=bob_id, + cube_path="/path/to/bob/work" +) + +# 4. Share cubes between users +memory.share_cube_with_user(alice_cube_id, bob_id) +memory.share_cube_with_user(bob_cube_id, alice_id) + +# 5. Get user information +alice_info = memory.get_user_info() +print(f"\nAlice's accessible cubes: {len(alice_info['accessible_cubes'])}") + +# 6. Add memory to cubes +memory.add( + messages=[ + {"role": "user", "content": "I like playing football."}, + {"role": "assistant", "content": "That's great! Football is a wonderful sport."} + ], + user_id=alice_id, + mem_cube_id=alice_cube_id +) + +# 7. Search memories +retrieved = memory.search( + query="What does Alice like?", + user_id=alice_id +) +print(f"Retrieved memories: {retrieved['text_mem']}") +``` + +## Error Handling + +The user management methods include comprehensive error handling: + +- **User Validation**: Methods validate that users exist and are active before operations +- **Cube Access Validation**: Ensures users have appropriate access to cubes before operations +- **Duplicate Prevention**: Handles duplicate user names and cube IDs gracefully +- **Permission Checks**: Validates user roles and permissions for sensitive operations + +## Database Persistence + +User management data is persisted in a SQLite database: +- **Location**: Defaults to `~/.memos/memos_users.db` +- **Tables**: `users`, `cubes`, `user_cube_association` +- **Relationships**: Many-to-many relationship between users and cubes +- **Soft Deletes**: Users and cubes are soft-deleted (marked as inactive) rather than permanently removed + +## Security Considerations + +- **Role-based Access Control**: Different user roles have different permissions +- **Cube Ownership**: Cube owners have full control over their cubes +- **Access Validation**: All operations validate user access before execution +- **Root User Protection**: Root user cannot be deleted and has full system access diff --git a/docs/en/open_source/modules/mos/users_configurations.md b/docs/en/open_source/modules/mos/users_configurations.md new file mode 100644 index 00000000..0453bbcc --- /dev/null +++ b/docs/en/open_source/modules/mos/users_configurations.md @@ -0,0 +1,719 @@ +--- +title: MemOS Configuration Guide +desc: This document provides a comprehensive overview of all configuration fields and initialization methods across the different components in the MemOS system. +--- + +1. [Configuration Overview](#configuration-overview) +2. [MOS Configuration](#mos-configuration) +3. [LLM Configuration](#llm-configuration) +4. [MemReader Configuration](#memreader-configuration) +5. [MemCube Configuration](#memcube-configuration) +6. [Memory Configuration](#memory-configuration) +7. [Embedder Configuration](#embedder-configuration) +8. [Vector Database Configuration](#vector-database-configuration) +9. [Graph Database Configuration](#graph-database-configuration) +10. [Scheduler Configuration](#scheduler-configuration) +11. [Initialization Methods](#initialization-methods) +12. [Configuration Examples](#configuration-examples) + +## Configuration Overview + +MemOS uses a hierarchical configuration system with factory patterns for different backends. 
+Each component has:
+- A base configuration class
+- Backend-specific configuration classes
+- A factory class that creates the appropriate configuration based on the backend
+
+## MOS Configuration
+
+The main MOS configuration that orchestrates all components.
+
+### MOSConfig Fields
+
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+| `user_id` | str | "root" | User ID for MOS; used as the default user |
+| `session_id` | str | auto-generated UUID | Session ID for the MOS |
+| `chat_model` | LLMConfigFactory | required | LLM configuration for chat |
+| `mem_reader` | MemReaderConfigFactory | required | MemReader configuration |
+| `mem_scheduler` | SchedulerFactory | optional | Scheduler configuration |
+| `max_turns_window` | int | 15 | Maximum conversation turns to keep |
+| `top_k` | int | 5 | Maximum memories to retrieve per query |
+| `enable_textual_memory` | bool | True | Enable textual memory |
+| `enable_activation_memory` | bool | False | Enable activation memory |
+| `enable_parametric_memory` | bool | False | Enable parametric memory |
+| `enable_mem_scheduler` | bool | False | Enable the memory scheduler |
+
+### Example MOS Configuration
+
+```json
+{
+  "user_id": "root",
+  "chat_model": {
+    "backend": "huggingface",
+    "config": {
+      "model_name_or_path": "Qwen/Qwen3-1.7B",
+      "temperature": 0.1,
+      "remove_think_prefix": true,
+      "max_tokens": 4096
+    }
+  },
+  "mem_reader": {
+    "backend": "simple_struct",
+    "config": {
+      "llm": {
+        "backend": "ollama",
+        "config": {
+          "model_name_or_path": "qwen3:0.6b",
+          "temperature": 0.8,
+          "max_tokens": 1024,
+          "top_p": 0.9,
+          "top_k": 50
+        }
+      },
+      "embedder": {
+        "backend": "ollama",
+        "config": {
+          "model_name_or_path": "nomic-embed-text:latest"
+        }
+      },
+      "chunker": {
+        "backend": "sentence",
+        "config": {
+          "tokenizer_or_token_counter": "gpt2",
+          "chunk_size": 512,
+          "chunk_overlap": 128,
+          "min_sentences_per_chunk": 1
+        }
+      }
+    }
+  },
+  "max_turns_window": 20,
+  "top_k": 5,
+  "enable_textual_memory": true,
+  "enable_activation_memory": false,
+  "enable_parametric_memory": false
+}
+```
+
+## LLM Configuration
+
+Configuration for different Large Language Model backends.
+ +### Base LLM Fields + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `model_name_or_path` | str | required | Model name or path | +| `temperature` | float | 0.8 | Temperature for sampling | +| `max_tokens` | int | 1024 | Maximum tokens to generate | +| `top_p` | float | 0.9 | Top-p sampling parameter | +| `top_k` | int | 50 | Top-k sampling parameter | +| `remove_think_prefix` | bool | False | Remove think tags from output | + +### Backend-Specific Fields + +#### OpenAI LLM +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `api_key` | str | required | OpenAI API key | +| `api_base` | str | "https://api.openai.com/v1" | OpenAI API base URL | + +#### Ollama LLM +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `api_base` | str | "http://localhost:11434" | Ollama API base URL | + +#### HuggingFace LLM +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `do_sample` | bool | False | Use sampling vs greedy decoding | +| `add_generation_prompt` | bool | True | Apply generation template | + +### Example LLM Configurations + +```json +// OpenAI +{ + "backend": "openai", + "config": { + "model_name_or_path": "gpt-4o", + "temperature": 0.8, + "max_tokens": 1024, + "top_p": 0.9, + "top_k": 50, + "api_key": "sk-...", + "api_base": "https://api.openai.com/v1" + } +} + +// Ollama +{ + "backend": "ollama", + "config": { + "model_name_or_path": "qwen3:0.6b", + "temperature": 0.8, + "max_tokens": 1024, + "top_p": 0.9, + "top_k": 50, + "api_base": "http://localhost:11434" + } +} + +// HuggingFace +{ + "backend": "huggingface", + "config": { + "model_name_or_path": "Qwen/Qwen3-1.7B", + "temperature": 0.1, + "remove_think_prefix": true, + "max_tokens": 4096, + "do_sample": false, + "add_generation_prompt": true + } +} +``` + +## MemReader Configuration + +Configuration for memory reading components. + +### Base MemReader Fields + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `created_at` | datetime | auto-generated | Creation timestamp | +| `llm` | LLMConfigFactory | required | LLM configuration | +| `embedder` | EmbedderConfigFactory | required | Embedder configuration | +| `chunker` | chunkerConfigFactory | required | chunker configuration | + +### Backend Types + +- `simple_struct`: Structured memory reader + +### Example MemReader Configuration + +```json +{ + "backend": "simple_struct", + "config": { + "llm": { + "backend": "ollama", + "config": { + "model_name_or_path": "qwen3:0.6b", + "temperature": 0.0, + "remove_think_prefix": true, + "max_tokens": 8192 + } + }, + "embedder": { + "backend": "ollama", + "config": { + "model_name_or_path": "nomic-embed-text:latest" + } + }, + "chunker": { + "backend": "sentence", + "config": { + "tokenizer_or_token_counter": "gpt2", + "chunk_size": 512, + "chunk_overlap": 128, + "min_sentences_per_chunk": 1 + } + } + } +} +``` + +## MemCube Configuration + +Configuration for memory cube components. 
+
+### GeneralMemCubeConfig Fields
+
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+| `user_id` | str | "default_user" | User ID for the MemCube |
+| `cube_id` | str | auto-generated UUID | Cube ID for the MemCube |
+| `text_mem` | MemoryConfigFactory | required | Textual memory configuration |
+| `act_mem` | MemoryConfigFactory | required | Activation memory configuration |
+| `para_mem` | MemoryConfigFactory | required | Parametric memory configuration |
+
+### Allowed Backends
+
+- **Text Memory**: `naive_text`, `general_text`, `tree_text`, `uninitialized`
+- **Activation Memory**: `kv_cache`, `uninitialized`
+- **Parametric Memory**: `lora`, `uninitialized`
+
+### Example MemCube Configuration
+
+```json
+{
+  "user_id": "root",
+  "cube_id": "root/mem_cube_kv_cache",
+  "text_mem": {},
+  "act_mem": {
+    "backend": "kv_cache",
+    "config": {
+      "memory_filename": "activation_memory.pickle",
+      "extractor_llm": {
+        "backend": "huggingface",
+        "config": {
+          "model_name_or_path": "Qwen/Qwen3-1.7B",
+          "temperature": 0.8,
+          "max_tokens": 1024,
+          "top_p": 0.9,
+          "top_k": 50,
+          "add_generation_prompt": true,
+          "remove_think_prefix": false
+        }
+      }
+    }
+  },
+  "para_mem": {
+    "backend": "lora",
+    "config": {
+      "memory_filename": "parametric_memory.adapter",
+      "extractor_llm": {
+        "backend": "huggingface",
+        "config": {
+          "model_name_or_path": "Qwen/Qwen3-1.7B",
+          "temperature": 0.8,
+          "max_tokens": 1024,
+          "top_p": 0.9,
+          "top_k": 50,
+          "add_generation_prompt": true,
+          "remove_think_prefix": false
+        }
+      }
+    }
+  }
+}
+```
+
+## Memory Configuration
+
+Configuration for different types of memory systems.
+
+### Base Memory Fields
+
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+| `cube_id` | str | None | Unique MemCube identifier; the cube name or path can serve as the default |
+
+### Textual Memory Configurations
+
+#### Base Text Memory
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+| `memory_filename` | str | "textual_memory.json" | Filename for storing memories |
+
+#### Naive Text Memory
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+| `extractor_llm` | LLMConfigFactory | required | LLM for memory extraction |
+
+#### General Text Memory
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+| `extractor_llm` | LLMConfigFactory | required | LLM for memory extraction |
+| `vector_db` | VectorDBConfigFactory | required | Vector database configuration |
+| `embedder` | EmbedderConfigFactory | required | Embedder configuration |
+
+#### Tree Text Memory
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+| `extractor_llm` | LLMConfigFactory | required | LLM for memory extraction |
+| `dispatcher_llm` | LLMConfigFactory | required | LLM for memory dispatching |
+| `embedder` | EmbedderConfigFactory | required | Embedder configuration |
+| `graph_db` | GraphDBConfigFactory | required | Graph database configuration |
+
+### Activation Memory Configurations
+
+#### Base Activation Memory
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+| `memory_filename` | str | "activation_memory.pickle" | Filename for storing memories |
+
+#### KV Cache Memory
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+| `extractor_llm` | LLMConfigFactory | required | LLM for memory extraction (must be huggingface) |
+
+### Parametric Memory Configurations
+
+#### Base Parametric Memory +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `memory_filename` | str | "parametric_memory.adapter" | Filename for storing memories | + +#### LoRA Memory +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `extractor_llm` | LLMConfigFactory | required | LLM for memory extraction (must be huggingface) | + +### Example Memory Configurations + +```json +// Tree Text Memory +{ + "backend": "tree_text", + "config": { + "memory_filename": "tree_memory.json", + "extractor_llm": { + "backend": "ollama", + "config": { + "model_name_or_path": "qwen3:0.6b", + "temperature": 0.0, + "remove_think_prefix": true, + "max_tokens": 8192 + } + }, + "dispatcher_llm": { + "backend": "ollama", + "config": { + "model_name_or_path": "qwen3:0.6b", + "temperature": 0.0, + "remove_think_prefix": true, + "max_tokens": 8192 + } + }, + "embedder": { + "backend": "ollama", + "config": { + "model_name_or_path": "nomic-embed-text:latest" + } + }, + "graph_db": { + "backend": "neo4j", + "config": { + "uri": "bolt://localhost:7687", + "user": "neo4j", + "password": "12345678", + "db_name": "user08alice", + "auto_create": true, + "embedding_dimension": 768 + } + } + } +} +``` + +## Embedder Configuration + +Configuration for embedding models. + +### Base Embedder Fields + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `model_name_or_path` | str | required | Model name or path | +| `embedding_dims` | int | None | Number of embedding dimensions | + +### Backend-Specific Fields + +#### Ollama Embedder +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `api_base` | str | "http://localhost:11434" | Ollama API base URL | + +#### Sentence Transformer Embedder +No additional fields beyond base configuration. + +### Example Embedder Configurations + +```json +// Ollama Embedder +{ + "backend": "ollama", + "config": { + "model_name_or_path": "nomic-embed-text:latest", + "api_base": "http://localhost:11434" + } +} + +// Sentence Transformer Embedder +{ + "backend": "sentence_transformer", + "config": { + "model_name_or_path": "all-MiniLM-L6-v2", + "embedding_dims": 384 + } +} +``` + +## Vector Database Configuration + +Configuration for vector databases. + +### Base Vector DB Fields + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `collection_name` | str | required | Name of the collection | +| `vector_dimension` | int | None | Dimension of the vectors | +| `distance_metric` | str | None | Distance metric (cosine, euclidean, dot) | + +### Qdrant Vector DB Fields + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `host` | str | None | Qdrant host | +| `port` | int | None | Qdrant port | +| `path` | str | None | Qdrant local path | + +### Example Vector DB Configuration + +```json +{ + "backend": "qdrant", + "config": { + "collection_name": "memories", + "vector_dimension": 768, + "distance_metric": "cosine", + "path": "/path/to/qdrant" + } +} +``` + +## Graph Database Configuration + +Configuration for graph databases. 
+ +### Base Graph DB Fields + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `uri` | str | required | Database URI | +| `user` | str | required | Database username | +| `password` | str | required | Database password | + +### Neo4j Graph DB Fields + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `db_name` | str | required | Target database name | +| `auto_create` | bool | False | Create DB if it doesn't exist | +| `embedding_dimension` | int | 768 | Vector embedding dimension | + +### Example Graph DB Configuration + +```json +{ + "backend": "neo4j", + "config": { + "uri": "bolt://localhost:7687", + "user": "neo4j", + "password": "12345678", + "db_name": "user08alice", + "auto_create": true, + "embedding_dimension": 768 + } +} +``` + +## Scheduler Configuration + +Configuration for memory scheduling systems that manage memory retrieval and activation. + +### Base Scheduler Fields + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `top_k` | int | 10 | Number of top candidates to consider in initial retrieval | +| `top_n` | int | 5 | Number of final results to return after processing | +| `enable_parallel_dispatch` | bool | True | Whether to enable parallel message processing using thread pool | +| `thread_pool_max_workers` | int | 5 | Maximum worker threads in pool (1-20) | +| `consume_interval_seconds` | int | 3 | Interval for consuming messages from queue in seconds (0-60) | + +### General Scheduler Fields + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `act_mem_update_interval` | int | 300 | Interval in seconds for updating activation memory | +| `context_window_size` | int | 5 | Size of the context window for conversation history | +| `activation_mem_size` | int | 5 | Maximum size of the activation memory | +| `act_mem_dump_path` | str | auto-generated | File path for dumping activation memory | + +### Backend Types + +- `general_scheduler`: Advanced scheduler with activation memory management + +### Example Scheduler Configuration + +```json +{ + "backend": "general_scheduler", + "config": { + "top_k": 10, + "top_n": 5, + "act_mem_update_interval": 300, + "context_window_size": 5, + "activation_mem_size": 1000, + "thread_pool_max_workers": 10, + "consume_interval_seconds": 3, + "enable_parallel_dispatch": true + } +} +``` + +## Initialization Methods + +### From JSON File + +```python +from memos.configs.mem_os import MOSConfig + +# Load configuration from JSON file +mos_config = MOSConfig.from_json_file("path/to/config.json") +``` + +### From Dictionary + +```python +from memos.configs.mem_os import MOSConfig + +# Create configuration from dictionary +config_dict = { + "user_id": "root", + "chat_model": { + "backend": "huggingface", + "config": { + "model_name_or_path": "Qwen/Qwen3-1.7B", + "temperature": 0.1 + } + } + # ... 
other fields +} + +mos_config = MOSConfig(**config_dict) +``` + +### Factory Pattern Usage + +```python +from memos.configs.llm import LLMConfigFactory + +# Create LLM configuration using factory +llm_config = LLMConfigFactory( + backend="huggingface", + config={ + "model_name_or_path": "Qwen/Qwen3-1.7B", + "temperature": 0.1 + } +) +``` + +## Configuration Examples + +### Complete MOS Setup + +```python +from memos.configs.mem_os import MOSConfig +from memos.mem_os.main import MOS + +# Load configuration +mos_config = MOSConfig.from_json_file("examples/data/config/simple_memos_config.json") + +# Initialize MOS +mos = MOS(mos_config) + +# Create user and register cube +user_id = "user_123" +mos.create_user(user_id=user_id) +mos.register_mem_cube("path/to/mem_cube", user_id=user_id) + +# Use MOS +response = mos.chat("Hello, how are you?", user_id=user_id) +``` + +### Tree Memory Configuration + +```python +from memos.configs.memory import MemoryConfigFactory + +# Create tree memory configuration +tree_memory_config = MemoryConfigFactory( + backend="tree_text", + config={ + "memory_filename": "tree_memory.json", + "extractor_llm": { + "backend": "ollama", + "config": { + "model_name_or_path": "qwen3:0.6b", + "temperature": 0.0, + "max_tokens": 8192 + } + }, + "dispatcher_llm": { + "backend": "ollama", + "config": { + "model_name_or_path": "qwen3:0.6b", + "temperature": 0.0, + "max_tokens": 8192 + } + }, + "embedder": { + "backend": "ollama", + "config": { + "model_name_or_path": "nomic-embed-text:latest" + } + }, + "graph_db": { + "backend": "neo4j", + "config": { + "uri": "bolt://localhost:7687", + "user": "neo4j", + "password": "password", + "db_name": "memories", + "auto_create": True, + "embedding_dimension": 768 + } + } + } +) +``` + +### Multi-Backend LLM Configuration + +```python +from memos.configs.llm import LLMConfigFactory + +# OpenAI configuration +openai_config = LLMConfigFactory( + backend="openai", + config={ + "model_name_or_path": "gpt-4o", + "temperature": 0.8, + "max_tokens": 1024, + "api_key": "sk-...", + "api_base": "https://api.openai.com/v1" + } +) + +# Ollama configuration +ollama_config = LLMConfigFactory( + backend="ollama", + config={ + "model_name_or_path": "qwen3:0.6b", + "temperature": 0.8, + "max_tokens": 1024, + "api_base": "http://localhost:11434" + } +) + +# HuggingFace configuration +hf_config = LLMConfigFactory( + backend="huggingface", + config={ + "model_name_or_path": "Qwen/Qwen3-1.7B", + "temperature": 0.1, + "remove_think_prefix": True, + "max_tokens": 4096, + "do_sample": False, + "add_generation_prompt": True + } +) +``` + +This comprehensive configuration system allows for flexible and extensible setup of the MemOS system with different backends and components. diff --git a/docs/en/openclaw/changes.md b/docs/en/openclaw/changes.md new file mode 100644 index 00000000..be1755ff --- /dev/null +++ b/docs/en/openclaw/changes.md @@ -0,0 +1,118 @@ +--- +title: OpenClaw Plugin Changelog +--- + +::OpenclawReleaseTimeline +--- +releases: + - date: '2026-05-08' + plugins: + - title: 'Cloud Plugin' + version: 'v0.1.15' + sections: + - title: 'Added' + items: + - 'Added `activation.onCapabilities: ["hook"]` to the OpenClaw, Moltbot, and ClawDBot plugin manifests.' + - 'Added compatibility with the plugin loading mechanism introduced in OpenClaw 5.3 and later. 
OpenClaw evaluates capability declarations before plugin registration; this declaration ensures the plugin is recognized and loaded as a lifecycle hook plugin, allowing hooks such as `before_agent_start` and `agent_end` to continue registering correctly.' + - title: 'Improved' + items: + - 'Adjusted the automatic `hooks.allowConversationAccess: true` patching flow to run after the gateway is ready, allowing the host config update to trigger a gateway restart and apply the required hook permission.' + + - date: '2026-04-29' + plugins: + - title: 'Cloud Plugin' + version: 'v0.1.14' + summary: 'Added compatibility support for the agent_end permission restriction introduced in OpenClaw 2026.4.23 and later: when the gateway starts, the plugin automatically checks the host config and adds `hooks.allowConversationAccess: true` for this plugin, helping users avoid memory-write hook failures caused by missing permissions.' + - date: '2026-04-16' + plugins: + - title: 'Cloud Plugin' + version: 'v0.1.13' + summary: 'Fully supports shared knowledge base access and collaborative processing in multi-agent mode.' + sections: + - title: 'Shared Knowledge Base Support (Multi-Agent Scenario)' + items: + - '**Multi-Agent Knowledge Base Support**: Fully supported collaborative access and processing of the knowledge base by multiple agents. Allows different agent nodes to share, retrieve, and invoke data from the same knowledge base, improving knowledge acquisition efficiency and context consistency during multi-agent collaboration in complex tasks.' + + - date: '2026-04-03' + plugins: + - title: 'Cloud Plugin' + version: 'v0.1.12' + summary: 'Introduced local visual configuration interface, deeply refactored configuration resolution architecture, and adapted to OpenClaw plugin security review.' + sections: + - title: 'Visual Configuration UI (Config UI)' + items: + - '**Local Configuration Service**: Built-in HTTP service provides a plugin management backend, supporting visual configuration viewing and modification in the browser, and real-time synchronization of configuration changes (default URL is `http://127.0.0.1:38463`).' + - '**Startup Stability Assurance**: Introduced gateway readiness detection (`waitForGatewayReady`) in the service startup process to ensure stable service status.' + - '**UI Experience Optimization**: Added responsive layout and collapsible floating navigation tools, along with new SVG icons.' + - title: 'Architecture Optimization & Security Compliance' + items: + - '**Security Review Adaptation (Subprocess Removed)**: To comply with strict plugin sandbox and security requirements, completely removed `child_process` `spawn`/`exec` calls. The auto-update mechanism was changed from "silent background download and force update" to "version detection only with manual update command prompts in logs", eliminating the risk of background process escape.' + - '**Security Review Adaptation (Default Overstep Removed)**: Removed all `default` value settings in the `plugin.json` declaration files to ensure the plugin does not trigger unauthorized or unexpected calls when no explicit configuration is provided.' + - '**Centralized Schema Management**: Refactored configuration resolution logic (`getConfigResolution`) to centrally manage priority strategies for environment variables, user configurations, and default values, enhancing code security and robustness.' 
+

  - date: '2026-03-30'
    plugins:
      - title: 'Cloud Plugin'
        version: 'v0.1.11'
        summary: 'Strengthened fine-grained control for multi-agent scenarios and enhanced dynamic user identity extraction capabilities.'
        sections:
          - title: 'Session & User Identity Management'
            items:
              - '**Direct Session User ID Support**: Added `useDirectSessionUserId` configuration. When enabled, it directly parses and extracts the real session user ID from the `sessionKey`, meeting data isolation needs in complex agent scenarios.'
          - title: 'Multi-Agent Configuration Enhancements'
            items:
              - '**Agent Execution Whitelist**: Added the `allowedAgents` configuration item, allowing memory recall and recording to be triggered only for specific agents in multi-agent mode, avoiding redundant consumption caused by global interception.'
              - '**Differentiated Override Mechanism (Agent Overrides)**: Introduced the `agentOverrides` configuration object, supporting individual overrides for core parameters such as knowledge base IDs (`knowledgebaseIds`), recall limit (`memoryLimitNumber`), and feature switches (`recallEnabled`) for different agents.'

  - date: '2026-03-24'
    plugins:
      - title: 'Cloud Plugin'
        version: 'v0.1.10'
        sections:
          - items:
              - '**Improved memory ingestion quality:** Added and strengthened cleanup for OpenClaw inbound metadata, timestamp wrappers, and trailing Feishu system hints to reduce noisy writes into memory.'
              - '**Multi-channel message prefix cleanup improvements**: Expanded and standardized envelope/prefix stripping for channels such as WebChat, WhatsApp, Telegram, Slack, Discord, and Zalo, reducing platform wrapper noise in memory ingestion and improving recall quality.'
              - '**More accurate recall display**: Recall timestamps now prioritize update time for better temporal consistency.'
              - '**More robust Recall Filter**: Default parameters are aligned with runtime fallback values (timeout and retries), improving stability in local model scenarios.'
              - '**Timeout and resource management optimization**: Fixed timer cleanup behavior to prevent resource leaks on exceptional code paths.'
              - '**Configuration completeness**: Completed Recall Filter-related fields in the plugin schema for more complete and controllable configuration.'
              - '**Enhanced observability**: Added before/after filtering count logs to make recall quality and filter effect troubleshooting easier.'
  - date: '2026-03-13'
    plugins:
      - title: 'Cloud Plugin'
        version: 'v0.1.9'
        summary: 'Silent upgrade and memory recall optimization. This release includes the following improvements to enhance usability and Token efficiency:'
        sections:
          - title: 'Silent Self-Detection and Upgrade'
            items:
              - 'Added a plugin version self-check mechanism that periodically checks the latest version from the NPM registry in the background.'
              - 'When a new version is detected, a silent upgrade is triggered automatically so users can continuously receive the latest capabilities and fixes without manual actions.'
          - title: 'Support Custom Models for Memory Recall'
            items:
              - 'Introduced LLM-based secondary filtering for memory recall.'
              - 'Added configuration options such as recallFilterModel and recallFilterBaseUrl, allowing an independent model to evaluate relevance.'
              - 'Effectively removes noisy results and keeps only memory snippets that are truly useful for the current conversation.'
+ - title: 'Lean Prompt Injection (System Prompt Optimization)' + items: + - 'Refactored memory injection logic by moving static protocols and instructions to appendSystemContext.' + - 'prependContext now keeps only dynamically retrieved memory-list data.' + - 'Significantly reduces Token usage caused by repetitive prompts and improves model focus on core memory.' + - date: '2026-03-09' + plugins: + - title: 'Cloud Plugin' + version: 'v0.1.8' + summary: 'Added support for multi-agent mode, enabling agent identification from context for memory isolation, with a compatibility switch for older versions.' + + - date: '2026-03-05' + plugins: + - title: 'Cloud Plugin' + version: 'v0.1.7' + summary: 'Added support for user-defined relativity in the searchMemory API.' + + - date: '2026-02-26' + plugins: + - title: 'Cloud Plugin' + version: 'Other Historical Versions (Core Capabilities)' + summary: 'Supports searchMemory in the before_agent_start event and addMessage in the agent_end event.' +--- +:: diff --git a/docs/en/openclaw/examples/hermes_usage.md b/docs/en/openclaw/examples/hermes_usage.md new file mode 100644 index 00000000..ff48bf6b --- /dev/null +++ b/docs/en/openclaw/examples/hermes_usage.md @@ -0,0 +1,112 @@ +--- +title: Local Plugin Usage +desc: Basic usage, memory tools, team sharing, and multi-agent examples for the MemOS local plugin in OpenClaw and Hermes. +--- + +## Basic Usage + +`@memtensor/memos-local-plugin` supports both OpenClaw and Hermes. After installation, start the agent you use as usual. The plugin injects local memory context before each task and writes Trace, Policy, World Model, and Skill data after the task finishes. + +| Agent | How to start | Viewer | +| --- | --- | --- | +| OpenClaw | Start or restart the OpenClaw gateway normally | `http://127.0.0.1:18799` | +| Hermes | `hermes chat` | `http://127.0.0.1:18800` | + +### Verify Memory is Working + +1. Have a conversation with OpenClaw or Hermes. +2. Open the corresponding Memory Viewer and confirm the conversation appears in **Memories** / **Tasks**. +3. In a new conversation, ask the agent to recall what you discussed: + +```text +You: Do you remember what I asked you to help me with before? +Agent: (Calls memory_search) Yes, we previously discussed... +``` + +--- + +## Memory Tools + +The local plugin exposes memory tools through each agent host. Exact tool presentation may differ by host, but the core capabilities are shared. + +| Tool | Purpose | +| --- | --- | +| `memory_search` | Search across Skill, Trace/Episode, and World Model tiers. | +| `memory_get` | Fetch a memory detail. | +| `memory_timeline` | Inspect an episode / task timeline. | +| `skill_list` | List currently available Skills. | +| `skill_get` | Fetch a Skill invocation guide. | +| `memory_environment` | Query L3 World Models for project structure, environment behavior, and constraints. | + +### Call Examples + +```text +Agent call: + memory_search("Nginx deployment config") + → Returns relevant Skills, Trace snippets, and environment knowledge + +Agent call: + skill_get("nginx-proxy") + → Returns executable steps, applicability, and caveats +``` + +The plugin also records tool successes and failures for later decision repair. + +--- + +## Team Sharing + +By default, OpenClaw and Hermes use separate local databases. For collaboration, enable Team Sharing from the Memory Viewer to share locally crystallized Skills and optional trace excerpts with other instances on the same LAN / VPN. 
+ +### How to Configure + +Open the Memory Viewer for the target agent, go to **Settings → Team Sharing**, fill in the team address and tokens as prompted, then save. The Viewer restarts the plugin and loads the new settings. + +### Expected Results + +- Private local data stays in the current agent's runtime home by default. +- Explicitly shared Skills can be discovered and reused by other instances. +- Hub is not on the algorithm critical path. If sharing fails, local writes, retrieval, and Skill lookup continue to work. + +--- + +## Multi-Agent Scenarios + +When OpenClaw and Hermes are installed on the same machine, their ports and data are isolated: + +| Resource | OpenClaw | Hermes | +| --- | --- | --- | +| Viewer | `18799` | `18800` | +| Data directory | `~/.openclaw/memos-plugin/` | `~/.hermes/memos-plugin/` | +| Config entry | Viewer → Settings | Viewer → Settings | + +```text +OpenClaw: + memory_search("deploy config") + → prioritizes OpenClaw's local experience + +Hermes: + memory_search("deploy config") + → prioritizes Hermes' local experience + +With Hub enabled: + both can explicitly reuse team-shared Skills +``` + +--- + +## Viewer Management + +The Memory Viewer provides these common entry points: + +| Page | Purpose | +| --- | --- | +| Overview | Inspect core status, version, event stream, and health. | +| Memories | Inspect L1 Traces and raw execution records. | +| Tasks | Inspect conversations and execution results grouped by task. | +| Policies | Inspect strategies induced from multiple Traces. | +| World Models | Inspect environment knowledge and constraints. | +| Skills | Inspect, search, or retire crystallized Skills. | +| Import | Import legacy plugin data, OpenClaw session JSONL, Hermes `MEMORY.md`, or import/export JSON backups. | +| Settings | Configure models, team sharing, logs, and telemetry. | +| Help | Look up field meanings such as `V`, `α`, `R_human`, `η`, support, and gain. | diff --git a/docs/en/openclaw/examples/multi_agent.md b/docs/en/openclaw/examples/multi_agent.md new file mode 100644 index 00000000..a28ea1ca --- /dev/null +++ b/docs/en/openclaw/examples/multi_agent.md @@ -0,0 +1,98 @@ +--- +title: Multi-Agent Memory Isolation +--- + +## Cloud Plugin + +The MemOS OpenClaw Cloud plugin supports complete isolation of memory and message history across multiple Agents. Each Agent can only access its own memory, preventing cross-agent interference. + +### How to Use in Cloud Plugin + +With a simple configuration, different Agents can have independent memory spaces. Both auto-detection and static assignment are supported. + +#### 1. Enable Multi-Agent Mode + +Add the following to your `openclaw.json`: + +```json +{ + "plugins": { + "entries": { + "memos-cloud-openclaw-plugin": { + "config": { + "multiAgentMode": true + } + } + } + } +} +``` + +Or set the environment variable: + +```bash +MEMOS_MULTI_AGENT_MODE=true +``` + +#### 2. Auto-detect Agent + +Once enabled, the plugin automatically reads `ctx.agentId` and isolates memory for each Agent. No extra configuration is required. + +#### 3. 
Statically Assign Agent (Optional)

If you need to pin a specific Agent ID, set it in the config:

```json
{
  "config": {
    "agentId": "marketing_agent"
  }
}
```

### Principles

- **/search/memory**: Memory retrieval — returns only the current Agent's memories
- **/add/message**: Record insertion — automatically tags data for the current Agent
- **Backward compatibility**: Default Agent `"main"` is ignored to keep existing single-Agent data unaffected

### Use Cases

- **Multi-role collaboration**: Strategy, business, marketing, and engineering Agents can work in parallel
- **Business-line isolation**: Agents from different business lines run independently without interference
- **Persona consistency**: Preserve each Agent's long-term persona and behavior style

---

## Local Plugin

`@memtensor/memos-local-plugin` supports both OpenClaw and Hermes. By default, each agent uses its own runtime home and local database. If multiple sessions / agents share one runtime, retrieval is scoped toward the current agent context. For cross-instance collaboration, enable team sharing from **Viewer → Settings → Team Sharing**.

### Rules

- **Isolated by default**: OpenClaw uses `~/.openclaw/memos-plugin/`, while Hermes uses `~/.hermes/memos-plugin/`. They do not share databases automatically.
- **Current agent first**: retrieval prioritizes the current agent / session's Traces, Policies, World Models, and Skills.
- **Optional sharing**: when `hub.enabled` is on, instances can share locally crystallized Skills and optional trace excerpts over a LAN / VPN.
- **Graceful fallback**: Hub is not on the algorithm critical path. If sharing is unavailable, the plugin falls back to local-only memory.

### Example Workflow

```text
OpenClaw:
  memory_search("deploy config")
  → prioritizes OpenClaw's local Skill / Trace / World Model store

Hermes:
  memory_search("deploy config")
  → prioritizes Hermes' local Skill / Trace / World Model store

With Hub enabled:
  OpenClaw / Hermes can pull team-shared Skills
  private Traces remain local to each machine and runtime home by default
```

### Expected Results

- OpenClaw and Hermes do not read each other's local database by default
- Team members can explicitly share high-value Skills to avoid repeating mistakes
- Local writes, retrieval, and skill lookup continue to work even if Hub is unavailable
diff --git a/docs/en/openclaw/examples/recall_filter.md b/docs/en/openclaw/examples/recall_filter.md
new file mode 100644
index 00000000..93c51713
--- /dev/null
+++ b/docs/en/openclaw/examples/recall_filter.md
@@ -0,0 +1,108 @@
---
title: Secondary Filtering for Memory Recall
---

## Cloud Plugin

The MemOS OpenClaw cloud plugin supports secondary filtering of recalled memories with a specified large language model. After filtering, only memories that are highly relevant to the current task are injected into context, which reduces irrelevant noise and saves tokens.

### How to Use

Configure an OpenAI-compatible model endpoint (such as local Ollama or a third-party LLM API) and enable the filter switch to turn on secondary memory filtering.

#### 1. Enable Memory Filtering

When configuring an LLM for memory filtering, you **must** configure the API Key and Base URL. 
+ +Add the following in your `openclaw.json` config: +```json +{ + "plugins": { + "entries": { + "memos-cloud-openclaw-plugin": { + "config": { + "recallFilterEnabled": true, + "recallFilterBaseUrl": "http://127.0.0.1:11434/v1", + "recallFilterApiKey": "sk-...", + "recallFilterModel": "qwen2.5_7b" + } + } + } + } +} +``` + +Or set environment variables: +```bash +MEMOS_RECALL_FILTER_ENABLED=true +MEMOS_RECALL_FILTER_BASE_URL="http://127.0.0.1:11434/v1" +MEMOS_RECALL_FILTER_API_KEY="sk-..." +MEMOS_RECALL_FILTER_MODEL="qwen2.5_7b" +``` + +#### 2. Configure Authentication and Advanced Parameters (Optional) + +If you need to adjust timeout and failure strategy, you can specify them in the config: +```json +{ + "config": { + "recallFilterTimeoutMs": 6000, + "recallFilterFailOpen": true + } +} +``` + +### How It Works +- **Post-recall interception**: Before each conversation round, after memories are recalled from the cloud, the plugin sends candidate memory entries to your configured filtering model for secondary screening. +- **Precise retention**: After model judgment, only entries marked as `keep` are retained and injected into the agent context. +- **High-availability fallback**: Fail-open (`recallFilterFailOpen: true`) is enabled by default. If the filtering model times out or fails, it automatically falls back to full injection without filtering, so the current conversation is not interrupted. + +### Typical Use Cases +- **Pruning long-term memory**: In long-running conversations with many accumulated memories, remove content unrelated to the current prompt to significantly reduce main-model context token usage. +- **Improving reasoning accuracy**: For agents handling complex tasks, filter out early irrelevant memories to improve reasoning quality on the core task. +- **Working with local models**: Use a locally running small model (such as `qwen2.5_7b` via Ollama) as a low-cost pre-filter to improve memory injection quality without increasing main-model API costs. + +--- + +## Local Plugin + +`@memtensor/memos-local-plugin` includes multi-stage local retrieval filtering. It first recalls candidates from Skill, Trace/Episode, and World Model tiers, then applies RRF + MMR for fusion and deduplication. If an LLM is configured, it can also run a final relevance check before injection to drop items that only share surface keywords with the current task. + +### How to Configure + +Configure this directly in the Memory Viewer for the target agent: + +| Agent | Memory Viewer | +| --- | --- | +| OpenClaw | `http://127.0.0.1:18799` | +| Hermes | `http://127.0.0.1:18800` | + +Steps: + +1. Open the Memory Viewer. +2. Go to **Settings → AI Models**. +3. In the **LLM** section, choose a provider and fill in endpoint, API Key, model, and related fields. +4. Click **Test** to confirm the model works. +5. Save the settings. The Viewer restarts the plugin and loads the new config. + +After saving, local retrieval can use that LLM for a relevance check after recall and RRF/MMR ranking. If no LLM is configured, the plugin still uses built-in multi-channel recall and mechanical threshold filtering. 
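To make the keep/drop pattern concrete, here is a minimal sketch of an LLM relevance filter with fail-open behavior, in the spirit of the filters described above. Everything in it is illustrative: the endpoint, model name, prompt format, and `filter_memories` helper are assumptions for the sketch, not the plugins' actual implementation. It assumes any OpenAI-compatible `/chat/completions` endpoint (such as a local Ollama server).

```python
# Illustrative sketch only -- not the plugins' actual code. Assumes an
# OpenAI-compatible /chat/completions endpoint (e.g. a local Ollama server).
import json
import urllib.request

FILTER_BASE_URL = "http://127.0.0.1:11434/v1"  # assumption: local Ollama
FILTER_MODEL = "qwen2.5_7b"                    # assumption: any small local model
TIMEOUT_SECONDS = 6                            # mirrors recallFilterTimeoutMs: 6000


def filter_memories(query: str, memories: list[str], fail_open: bool = True) -> list[str]:
    """Ask a small LLM which recalled memories to keep for the current query.

    On timeout or any error, fail-open returns the full list unfiltered,
    mirroring recallFilterFailOpen: true.
    """
    numbered = "\n".join(f"{i}: {m}" for i, m in enumerate(memories))
    prompt = (
        "For the user query below, reply with only a JSON list of the indices "
        "of memories worth keeping.\n"
        f"Query: {query}\nMemories:\n{numbered}"
    )
    body = json.dumps({
        "model": FILTER_MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0,
    }).encode()
    req = urllib.request.Request(
        f"{FILTER_BASE_URL}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json", "Authorization": "Bearer sk-..."},
    )
    try:
        with urllib.request.urlopen(req, timeout=TIMEOUT_SECONDS) as resp:
            content = json.load(resp)["choices"][0]["message"]["content"]
        keep = set(json.loads(content))
        return [m for i, m in enumerate(memories) if i in keep]
    except Exception:
        # Fail-open: a broken or slow filter must not interrupt the conversation.
        return memories if fail_open else []
```

The design point mirrored here is the fail-open default: when the filter times out or errors, the system degrades to full injection rather than blocking the turn, which is exactly the high-availability fallback both plugins describe.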
+

### Local Retrieval Flow

```text
User request
→ Build retrieval query and tags
→ Tier 1: Skill candidates
→ Tier 2: Trace / Episode candidates
→ Tier 3: World Model candidates
→ Multi-channel recall: vector / FTS5 / pattern / error signatures
→ RRF fusion + MMR diversity control
→ Optional LLM relevance check
→ Inject into the agent
```

### Expected Results

- Injected memory context is more focused and less noisy
- Skill, Trace/Episode, and World Model hits are not selected by vector similarity alone
- If the LLM is unavailable, retrieval falls back to stricter mechanical thresholds without breaking basic recall
diff --git a/docs/en/openclaw/guide.md b/docs/en/openclaw/guide.md
new file mode 100644
index 00000000..1bb75b42
--- /dev/null
+++ b/docs/en/openclaw/guide.md
@@ -0,0 +1,343 @@
---
title: OpenClaw Cloud Plugin
desc: Enhance your OpenClaw's memory and reduce token usage by 72%. The MemOS OpenClaw plugin is now live!
---

OpenClaw's going viral lately. But if you've actually used it for a while, you'll find two issues you can hardly avoid:

1. **Tokens burn way too quickly**: OpenClaw can handle many long-tail tasks, but the cost is that each run consumes a huge number of tokens. When you have it monitoring your screen, running scheduled tasks, or handling complex workflows, the token consumption is painfully fast.

   > ("u know token is money🫠")

2. **Its memory function is rather poor**: Many claim OpenClaw's memory outperforms ChatGPT. Yet in practice, you'll find it does retain some information—but often not what you need. Crucial preferences may be forgotten, while trivial chatter is remembered in vivid detail.

   > ("can u please remember something really matter to me???")

::tip
**NOT OpenClaw's fault, ALL AI agents suffer from this.**
::

This tutorial guides you through using the MemOS OpenClaw plugin to address these pain points:
- **Significantly reduce token consumption** — intelligently retrieve relevant memories without indiscriminately loading all history
- **Make memories genuinely useful** — professional memory categorisation and management, remembering what should be retained and forgetting what should be discarded
- **Preserve OpenClaw's core strengths** — cross-device control, proactive interaction, and human-like experience remain intact

---

## Why is OpenClaw now a Token Killer🥷?

### Issues with OpenClaw

```plaintext
1st convo: 500 tokens
2nd convo: 500 + 800 = 1,300 tokens
3rd convo: 1,300 + 600 = 1,900 tokens
10th convo: 10,000+ tokens
```

When you have OpenClaw monitoring your screen, executing tasks, and running on a schedule, this figure increases even more rapidly.

### Three pain points in OpenClaw's native memory management

OpenClaw's memories reside in local `.md` files, categorised as global memories and daily memories. While this sounds promising, practical use reveals three unavoidable issues:

#### 1. Global memories become bloated
As global memories accumulate, context overload ensues. Moreover, these memories persistently interfere with current conversations. You might simply wish to ask a straightforward question, yet it dredges up every utterance from three months prior.

#### 2. Daily memory recall proves difficult
Accumulating daily memories invariably makes retrieval cumbersome. To recall yesterday's activities, one must undergo an additional retrieval process. Maintaining cross-session memory becomes nearly impossible.

#### 3. 
Memory relies on the model's proactive logging +OpenClaw's memory system relies on the model to log information itself, rather than automatic logging. This means it frequently misses details—you mention something, and it promptly forgets. + +> I've encountered this several times myself: I'd explicitly emphasised a particular project configuration, yet when restarting the conversation the next day, it had no recollection whatsoever, requiring me to explain it all over again. + +--- + +## OpenClaw vs OpenClaw + MemOS: Memory Solution Comparison + +### OpenClaw Native Memory Solution + +#### Memory Storage Solution + +**Core Philosophy: File is Truth** — Abandoning opaque vector databases in favor of Markdown files as the core carrier of memory. + +![Memory Storage Solution](https://cdn.memtensor.com.cn/img/1772697758585_b155tx_compressed.png) + + +#### Memory Retrieval Solution: Dual-Engine Drive + +| Engine | Technology | Features | +|-----|------|------| +| **Vector Search** | Cosine Similarity | Captures semantic associations, excels at "concept matching", e.g., associating "login flow" with "authentication" | +| **BM25 Search** (Lexical Matching) | FTS5-based lexical matching | Handles "exact tokens", such as error codes, function names, or specific IDs | + +**Retrieval Trigger**: Triggered via Prompt, model decides automatically + +**Weighted Score Fusion**: `Score = (0.7 * VectorScore) + (0.3 * BM25Score)` + +#### Pain Points of Existing Solutions + +- **Rudimentary Retrieval Algorithms**: Unstable recall, weak relevance, Agent repeats trial and error, Token accumulates rapidly +- **Excessive Context Injection**: Fixed reading of today + yesterday + long-term memory, high proportion of invalid context +- **Lack of Structure and Deduplication in Memory**: Tool call long outputs are written directly and re-transmitted repeatedly, costs snowball + +### OpenClaw + MemOS Memory Solution + +![MemOS-OpenClaw](https://cdn.memtensor.com.cn/img/1772679552943_lsuh81_compressed.png) + +#### Three Core Effects + +**Effect 1: Controllable Token Costs 💰** +> From "Full Context Stuffing" to "Precise Recall per Task" + +OpenClaw no longer stuffs today+yesterday+long-term memory every time. Instead, MemOS retrieves the most relevant few memories based on the current task (recall budget/count can be set), significantly reducing the proportion of invalid context and avoiding Token snowballing. + +**Effect 2: More Stable and Accurate Retrieval 🎯** +> Reduce repeated trial and error and re-asking, improve one-shot hit rate + +MemOS provides stronger memory organization and retrieval capabilities (structured, hierarchical/multi-granular, semantic retrieval + rule filtering, etc.), making OpenClaw's recalled content more relevant and stable, reducing repeated reasoning and confirmation caused by "unstable recall". + +**Effect 3: Cleaner and More Usable Memory ✨** +> Structured + Deduplicated + High Compression, avoiding "Long Output Pollution" + +Long outputs from tool calls (such as traversal results, config/schema, etc.) are not written back to the context verbatim repeatedly; MemOS can summarize/compress, deduplicate, and archive, making it "cleaner" over long-term operation, with memory quality improving rather than deteriorating over time. 
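To make the cost difference concrete, here is a back-of-envelope sketch. The numbers are illustrative assumptions taken from the figures in this guide (roughly 500 tokens of new history per turn, a 2,000–3,000 token recall budget), not measurements:

```python
# Back-of-envelope comparison using illustrative numbers from this guide:
# every turn adds ~500 tokens of new history, versus a fixed recall budget.
TURN_TOKENS = 500      # assumed new tokens per conversation turn
RECALL_BUDGET = 2500   # assumed fixed context target with precise recall (2k-3k)


def full_history_cost(turns: int) -> int:
    """Stuffing all history: turn n pays for every previous turn again."""
    return sum(n * TURN_TOKENS for n in range(1, turns + 1))


def precise_recall_cost(turns: int) -> int:
    """Fixed recall budget: each turn pays roughly the same context size."""
    return turns * RECALL_BUDGET


for turns in (10, 50, 100):
    print(f"{turns:>3} turns: full history ~{full_history_cost(turns):,} tokens, "
          f"precise recall ~{precise_recall_cost(turns):,} tokens")
# 10 turns:  ~27,500 vs ~25,000  -- similar at first
# 100 turns: ~2,525,000 vs ~250,000 -- an order of magnitude apart
```

Full-history stuffing grows quadratically with conversation length, while a fixed recall budget grows linearly, which is exactly the shift described in the next section.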
+

---

## After integrating the MemOS OpenClaw plugin👇🏻

- ✅ Retrieve only 3–5 relevant memories at a time
- ✅ Maintain context stability within 2,000–3,000 tokens
- ✅ Cost remains manageable regardless of dialogue length

### MemOS plugins can enhance your OpenClaw

| Feature | Description |
|-----|------|
| **Automatically remember all conversations** | without relying on models to actively log, ensuring no critical information is missed |
| **Precise recall** | retrieve relevant memories based on current task intent, avoiding irrelevant historical data |
| **Remember user preferences** | categorise and store preference information specifically, remaining effective across sessions |

MemOS OpenClaw has restructured the token consumption model, transforming costs from a 'historical length function' into a 'task relevance function'. Your local OpenClaw costs become manageable, and the system operates more stably.

---

## Quick Start

Three steps to boost your Agent with basic memory capabilities.

### 1. Install OpenClaw

Ensure that the OpenClaw environment is installed on your system:

```bash
# Install the newest version
npm install -g openclaw@latest

# Initialize and configure startup
openclaw onboard
```

### 2. Get and configure your API Key

#### 2.1 Get your Key

Log in to or register with MemOS Cloud to get your API Key 🔗 [MemOS Cloud](https://memos-dashboard.openmem.net/apikeys/)

![image.png](https://cdn.memtensor.com.cn/img/1772443326905_kkxve6_compressed.webp)

#### 2.2 Set Environment Variables

The plugin tries env files in order (**openclaw → moltbot → clawdbot**). For each key, the first file with a value wins.
If none of these files exist (or the key is missing), it falls back to the process environment.

**Where to configure**
- Files (priority order):
  - `~/.openclaw/.env`
  - `~/.moltbot/.env`
  - `~/.clawdbot/.env`
- Each line is `KEY=value`

**Quick setup (shell)**
```bash
echo 'export MEMOS_API_KEY="mpg-..."' >> ~/.zshrc
source ~/.zshrc
# or

echo 'export MEMOS_API_KEY="mpg-..."' >> ~/.bashrc
source ~/.bashrc
```

**Quick setup (Windows PowerShell)**
```powershell
[System.Environment]::SetEnvironmentVariable("MEMOS_API_KEY", "mpg-...", "User")
```

If `MEMOS_API_KEY` is missing, the plugin will warn with setup instructions and the API key URL.

**Minimal config**
```env
MEMOS_API_KEY=YOUR_TOKEN
```

### 3. Install Plugins

#### Option A — NPM (Recommended)

```bash
openclaw plugins install @memtensor/memos-cloud-openclaw-plugin@latest
openclaw gateway restart
```

> Note for Windows users: If you encounter `Error: spawn EINVAL`, this is a known issue with OpenClaw's plugin installer on Windows. Please use Option B (Manual Install) below.

Make sure it's enabled in `~/.openclaw/openclaw.json`:

```json
{
  "plugins": {
    "entries": {
      "memos-cloud-openclaw-plugin": { "enabled": true }
    }
  }
}
```

#### Option B — Manual Install (Workaround for Windows)

1. Download the latest `.tgz` from [NPM](https://www.npmjs.com/package/@memtensor/memos-cloud-openclaw-plugin).
2. Extract it to a local folder (e.g., `C:\Users\YourName\.openclaw\extensions\memos-cloud-openclaw-plugin`).
3. 
Configure `~/.openclaw/openclaw.json` (or `%USERPROFILE%\.openclaw\openclaw.json`):

```json
{
  "plugins": {
    "entries": {
      "memos-cloud-openclaw-plugin": { "enabled": true }
    },
    "load": {
      "paths": [
        "C:\\Users\\YourName\\.openclaw\\extensions\\memos-cloud-openclaw-plugin\\package"
      ]
    }
  }
}
```

::tip
Note: The extracted folder usually contains a `package` subfolder. Point to the folder containing `package.json`.
::

Restart the gateway after config changes.

### 4. Update Plugin

You can manually update the cloud plugin to the latest version using the following commands:

```bash
openclaw plugins update @memtensor/memos-cloud-openclaw-plugin@latest
openclaw gateway restart
```

## Advanced Configuration for Open-Source Projects

If you want to unlock further possibilities, you can explore and configure additional features via the MemOS GitHub project!

### Visual Configuration UI (Config UI)

Starting from version `v0.1.12`, the Cloud Plugin features a built-in local visual configuration service, allowing you to manage and modify plugin settings more intuitively.

**How to access:**
1. Start your OpenClaw node or host gateway.
2. Once the plugin is successfully loaded and detects that the gateway is ready, it will automatically start the Config UI service in the background.
3. An access link will be printed in the terminal console logs (the default URL is typically `http://127.0.0.1:38463`).
4. Open this link in your browser to access the plugin's visual management backend.

**Features:**
- **Intuitive Editing**: Supports form-based editing of all core configurations (such as Knowledge Base IDs, LLM retrieval parameters, multi-agent override rules, etc.).
- **Real-time Synchronization**: Configuration changes saved via the interface take effect immediately during plugin runtime, without requiring a service restart.
- **Status Monitoring**: The interface provides heartbeat detection with the host gateway to ensure the configuration synchronization link is healthy.

### Multi-Agent Support & Isolation

The plugin provides powerful native support for multi-agent architectures (via the `agent_id` parameter), making it ideal for complex workflows or team agent scenarios.

**1. Enable & Data Isolation**
- **How to enable**: Set `"multiAgentMode": true` in the config or configure the environment variable `MEMOS_MULTI_AGENT_MODE=true`.
- **Automatic Isolation**: When enabled, the plugin automatically reads `ctx.agentId` from the context. This Agent identifier is attached to memory retrieval and writing, ensuring complete data isolation between different Agents under the same user (Note: the default `"main"` Agent is ignored to maintain legacy data compatibility).

**2. Memory Switch per Agent (Whitelist Control)**
In Multi-Agent mode, if you do not want all Agents to consume memory, you can use `allowedAgents` to precisely control the whitelist:
```json
{
  "plugins": {
    "entries": {
      "memos-cloud-openclaw-plugin": {
        "enabled": true,
        "config": {
          "multiAgentMode": true,
          "allowedAgents": ["research-agent", "coding-agent"]
        }
      }
    }
  }
}
```
*(Tip: 1. If `allowedAgents` is not configured or is an empty array `[]`, **all Agents** are allowed to use memory retrieval and writing. 2. If it is configured, Agents not in the list are skipped entirely and only the listed Agents perform memory retrieval and writing, thereby avoiding Token waste.)*

**3. 
Per-Agent Configuration (agentOverrides)**
Beyond simple toggles, you can use `agentOverrides` to **configure different memory parameters for each Agent**. For example, giving a research assistant a looser retrieval threshold, while restricting a coding assistant to read only a specific codebase knowledge base:

```json
{
  "plugins": {
    "entries": {
      "memos-cloud-openclaw-plugin": {
        "enabled": true,
        "config": {
          "multiAgentMode": true,
          "allowedAgents": ["research-agent", "coding-agent"],
          "memoryLimitNumber": 6,
          "relativity": 0.45,

          "agentOverrides": {
            "research-agent": {
              "knowledgebaseIds": ["kb-research-papers"],
              "memoryLimitNumber": 12,
              "relativity": 0.3,
              "queryPrefix": "research context: "
            },
            "coding-agent": {
              "knowledgebaseIds": ["kb-codebase"],
              "memoryLimitNumber": 9,
              "addEnabled": false
            }
          }
        }
      }
    }
  }
}
```
*(In the example above, memory writing is disabled for the `coding-agent`, and it can only retrieve the top 9 highly relevant memories from the `kb-codebase` knowledge base.)*

### Deep customisation of environment variables

In addition to the required API Key, you may also adjust the plugin's behaviour via environment variables.

Further configuration details can be found in [the MemTensor official plugin repo](https://github.com/MemTensor/MemOS/tree/main/apps/MemOS-Cloud-OpenClaw-Plugin).

## Testing

Now, you can engage in multi-turn conversations with your Agent, for example:

**First convo:**
- "My favourite programming language is Python"
- "I'm developing an e-commerce project"

**Second convo (new convo):**
- "Do you recall which programming language I prefer?"
- "How is the project I mentioned previously progressing?"

Now, your OpenClaw will retrieve memories from MemOS Cloud and provide accurate responses ✅
diff --git a/docs/en/openclaw/hermes_local_plugin.md b/docs/en/openclaw/hermes_local_plugin.md
new file mode 100644
index 00000000..0c2d5077
--- /dev/null
+++ b/docs/en/openclaw/hermes_local_plugin.md
@@ -0,0 +1,8 @@
---
title: Hermes Local Plugin Merged
desc: The Hermes local plugin guide has been merged into the unified MemOS Local Plugin guide.
---

The Hermes local plugin guide has been merged into the unified **Local Plugin** documentation. The new `@memtensor/memos-local-plugin` uses one local-first memory core for both OpenClaw and Hermes Agent.

See the [Local Plugin installation guide](/openclaw/local_plugin).
diff --git a/docs/en/openclaw/local_plugin.md b/docs/en/openclaw/local_plugin.md
new file mode 100644
index 00000000..c415fcf4
--- /dev/null
+++ b/docs/en/openclaw/local_plugin.md
@@ -0,0 +1,127 @@
---
title: Local Plugin
desc: Use @memtensor/memos-local-plugin to bring local-first long-term memory, three-tier retrieval, skill crystallization, and an observable management panel to OpenClaw and Hermes Agent.
---

`@memtensor/memos-local-plugin` is the new MemOS local plugin: one local-first memory core for both **OpenClaw** and **Hermes Agent**. It does not host your memory data in the cloud. Instead, it maintains SQLite data, skill packages, and logs on your own machine so the agent can accumulate reusable experience locally.

If you want a cloud-hosted memory service for OpenClaw with the simplest API Key setup, see the [OpenClaw Cloud Plugin](/openclaw/guide). If you care more about privacy, local runtime, observability, or using the same local memory capability across OpenClaw / Hermes, use this local plugin. 
+ +## Core Capabilities + +| Capability | Description | +| --- | --- | +| Local-first | OpenClaw and Hermes each get an isolated runtime home. SQLite, skills, logs, and config stay on your machine. | +| Dual-agent support | OpenClaw integrates through an in-process TypeScript plugin; Hermes integrates through a Python Provider that talks to the same Node.js memory core over JSON-RPC. | +| Four memory layers | L1 Trace records each execution step, L2 Policy induces cross-task strategies, L3 World Model compresses environment knowledge, and Skill turns high-value experience into callable capabilities. | +| Three-tier retrieval | Retrieval runs across Skill → Trace/Episode → World Model, combining vector, FTS5, keyword pattern, and error-signature channels with RRF + MMR. | +| Feedback-driven evolution | Tool outcomes, environment feedback, and explicit user feedback update memory value and drive policy induction, skill crystallization, and decision repair. | +| Local Viewer | Includes Overview, Memories, Tasks, Policies, World Models, Skills, Analytics, Logs, Import, Settings, and Help pages. | +| Import and migration | Supports JSON import/export, legacy plugin migration, and agent-specific native imports for OpenClaw session JSONL or Hermes `MEMORY.md`. | +| Optional team sharing | Isolated by default. Enable sharing from the Memory Viewer's Team Sharing panel to share crystallized Skills and optional trace excerpts over a LAN / VPN. | + +## How It Works + +Before each task, the plugin retrieves relevant context and injects it into the agent. After the task ends, it stores conversations, tool calls, observations, and feedback in the local pipeline. High-value patterns gradually become Policies, World Models, and callable Skills. The next time a similar task appears, the agent receives guidance about what to do and what to avoid. + +| Stage | What Happens | Output | +| --- | --- | --- | +| 1. Agent adapter | OpenClaw / Hermes send conversations, tool calls, and feedback to the shared `MemoryCore` through their adapters. | Standardized turns, tool outcomes, feedback | +| 2. Local capture | `MemoryCore` turns the execution process into grounded, traceable step records. | L1 Trace | +| 3. Experience induction | Similar Traces are induced into cross-task strategies, then compressed into environment knowledge. | L2 Policy, L3 World Model | +| 4. Skill crystallization | High-value strategies become callable Skills and keep updating reliability from later feedback. | Skill, η, lifecycle status | +| 5. Retrieval injection | Before the next task, Retriever recalls context from Skill, Trace/Episode, and World Model tiers. | Local memory context injected into the agent | + +## Quick Start + +### Step 1: Install or Upgrade with One Command + +Installation and upgrades use the same command. The current installer targets macOS / Linux: + +```bash +curl -fsSL https://raw.githubusercontent.com/MemTensor/MemOS/main/apps/memos-local-plugin/install.sh | bash +``` + +The installer auto-detects whether OpenClaw and/or Hermes are installed. In an interactive terminal, it asks which agent to install for; in non-interactive environments, it installs for the detected agent(s). It deploys plugin code, installs production dependencies, and restarts the target runtime when needed. + +> Do not use direct `npm install` as the primary path. The installer handles agent detection, directory layout, config initialization, and runtime restart. 
### Step 2: Open the Memory Viewer

After installation, open the corresponding Memory Viewer:

| Agent | Memory Viewer |
| --- | --- |
| OpenClaw | `http://127.0.0.1:18799` |
| Hermes | `http://127.0.0.1:18800` |

If you install both OpenClaw and Hermes, they use separate Viewers and separate local data directories.

### Step 3: Configure from the Panel

All user-facing configuration is done from the Memory Viewer:

- **Settings → AI Models**: configure Embedding, LLM, Skill Evolver, and use Test to confirm connectivity.
- **Settings → Team Sharing**: enable or disable team sharing, then configure team address and tokens.
- **Settings → General**: configure language, detailed logs, anonymous telemetry, and related options.

After saving, the Viewer restarts the plugin and loads the new settings.

### Step 4: Start the Target Agent

After installation, start the agent you selected as usual. The plugin retrieves local context before the agent builds its prompt, then writes conversations, tool calls, observations, and feedback into local memory after the turn finishes.

| Agent | How to start | Plugin integration |
| --- | --- | --- |
| OpenClaw | Start or restart the OpenClaw gateway normally | TypeScript plugin calls `MemoryCore` in the OpenClaw process |
| Hermes | Run `hermes chat` | Python Provider calls the Node.js memory core over JSON-RPC |

If the Hermes machine cannot run Node.js, the Hermes Provider reports unavailable and falls back to Hermes' own in-memory mode.

### Step 5: Verify Memory

Back in the Memory Viewer, check:

1. **Overview**: confirm core status, version, and event stream.
2. **Memories**: confirm conversations and tool steps are written as Traces.
3. **Tasks / Policies / World Models / Skills**: inspect how experience is induced and crystallized.
4. **Import**: migrate legacy data, import OpenClaw session JSONL, import Hermes `MEMORY.md`, or import/export JSON backups.
5. **Help**: look up field meanings such as `V`, `α`, `R_human`, `η`, support, and gain.

## Agent Differences

| Item | OpenClaw | Hermes |
| --- | --- | --- |
| Integration | TypeScript plugin, in-process calls to `MemoryCore` | Python `MemoryProvider`, stdio JSON-RPC to Node bridge |
| Default Viewer | `http://127.0.0.1:18799` | `http://127.0.0.1:18800` |
| Model configuration | Configure in OpenClaw Viewer Settings → AI Models | Configure in Hermes Viewer Settings → AI Models |
| Data sharing | Isolated from Hermes by default | Isolated from OpenClaw by default |

Even on the same machine, the two agents use separate databases and Viewers. They only share data after you explicitly enable team sharing (`hub.enabled`).

## Available Tools

OpenClaw and Hermes expose memory tools through their own host interfaces. Common capabilities include:

| Tool | Purpose |
| --- | --- |
| `memory_search` | Search across relevant Skills, Trace/Episodes, and World Models. |
| `memory_get` | Fetch a memory detail. |
| `memory_timeline` | Inspect an episode / task timeline. |
| `skill_list` | List callable Skills. |
| `skill_get` | Fetch a Skill invocation guide. |
| `memory_environment` | Query L3 World Models for project structure, environment behavior, and constraints. |

The plugin also records tool successes and failures for later decision repair.

## Data Management

- **Back up**: export JSON from the Viewer's Import page, or back up the current agent's runtime home (`~/.openclaw/memos-plugin/` or `~/.hermes/memos-plugin/`). 
- **Clear only memory**: after confirming you have a backup, delete `data/` and `skills/` under the runtime home.
- **Clear logs**: delete regular files under `logs/`. Audit logs are gzipped monthly and kept by default.
- **Full reset**: delete the entire runtime home directory (`~/.openclaw/memos-plugin/` or `~/.hermes/memos-plugin/`). It will be recreated empty on the next start.

## More

- [MemOS local plugin project](https://github.com/MemTensor/MemOS/tree/main/apps/memos-local-plugin)
- [Cloud Plugin vs Local Plugin](/openclaw/plugin_compare)
diff --git a/docs/en/openclaw/plugin_compare.md b/docs/en/openclaw/plugin_compare.md
new file mode 100644
index 00000000..ea9ed9fe
--- /dev/null
+++ b/docs/en/openclaw/plugin_compare.md
@@ -0,0 +1,82 @@
---
title: Cloud Plugin vs Local Plugin
desc: The cloud plugin is for quick MemOS Cloud adoption, while the local plugin brings local-first long-term memory and self-evolution to OpenClaw and Hermes. This guide helps you choose the right option.
---

## Overview

### Cloud Plugin

Stores memories in **MemOS Cloud**. After installing the OpenClaw cloud plugin, a single MemOS Cloud API Key is all you need to get started. It supports multi-agent memory sharing across devices, and benchmarks show up to **72% reduction in Token usage** — ideal for quick setup, cross-device collaboration, and production use.

### Local Plugin

The new local plugin is `@memtensor/memos-local-plugin`: a **local-first memory core shared by OpenClaw and Hermes**. It stores data in local SQLite and evolves it into four layers: L1 Trace, L2 Policy, L3 World Model, and callable Skills. With feedback-driven self-evolution, three-tier retrieval, and decision repair, the agent accumulates reusable experience on your own machine. It is best for developers who care most about privacy, local deployment, and observability.

---

## Core Differences

| Comparison Dimension | ☁️ MemOS Cloud Plugin | 🖥️ MemOS Local Plugin |
| --- | --- | --- |
| 💾 **Data Storage & Privacy** | **Cloud storage**: Memory data is stored in MemOS Cloud, making cross-device and multi-instance sharing easy. | **Local storage**: Each agent has its own runtime home. OpenClaw defaults to `~/.openclaw/memos-plugin/`, and Hermes defaults to `~/.hermes/memos-plugin/`. SQLite, skill packages, logs, and config all stay on the local machine. |
| 🤖 **Agent Support** | Built for the OpenClaw cloud plugin, backed by MemOS Cloud as the unified memory service. | One shared core supports both OpenClaw and Hermes: OpenClaw integrates through an in-process TypeScript plugin; Hermes integrates through a Python Provider that talks to the Node core over JSON-RPC. |
| 🔑 **API & Model Config** | Uses a MemOS Cloud API Key. Memory processing, retrieval, and evolution are handled by the cloud service. | Uses the Memory Viewer's Settings panel for model and team-sharing configuration. Embeddings can use the local provider by default or OpenAI-compatible, Gemini, Cohere, Voyage, and Mistral providers. OpenClaw can inherit the host model; Hermes can configure an LLM provider and API Key in the panel. |
| 🔍 **Retrieval Capability** | Cloud-based semantic vector retrieval + graph retrieval, optimized by the service. | Three-tier retrieval: Tier 1 Skill, Tier 2 Trace/Episode, and Tier 3 World Model. It combines vector, FTS5, keyword pattern, and error-signature channels, then uses RRF + MMR for relevance and diversity. 
| +| 🧠 **Memory Evolution** | Automatically handled by cloud services: written memories are structured, deduplicated, and corrected in natural language. | Local Reflect2Evolve pipeline: conversations and tool calls become L1 Traces, cross-task patterns become L2 Policies, policies roll up into L3 World Models, and high-value strategies crystallize into callable Skills with active / retired lifecycle states. | +| 🛠️ **Decision Repair** | Mainly relies on cloud retrieval to bring back more relevant memory and reduce repeated context. | Tool failures, negative feedback, and task outcomes enter the feedback channel. Failure patterns can trigger decision repair, injecting corrective context into the next turn so the agent avoids repeating the same mistake. | +| 👥 **Multi-Agent & Sharing** | Supports multi-agent scenarios and cross-device sharing, making it suitable for teams. | Isolated by default: OpenClaw and Hermes have separate databases and viewers. Optional Hub sharing can publish locally crystallized Skills and optional trace excerpts inside a LAN / VPN; hub failures degrade back to local-only mode. | +| 👀 **Visualization & Observability** | Managed through the MemOS Cloud Dashboard for API Key and cloud memory capabilities. | Includes a local Viewer with Overview, Memories, Tasks, Policies, World Models, Skills, Analytics, Logs, Import, Settings, and Help pages. HTTP + SSE streams expose events, logs, retrieval, skills, and health status in real time. | +| 🛠️ **Deployment & Configuration** | **Very simple**: Done in 3 steps (install plugin, get API Key, configure env vars), mainly relying on cloud services. | **Very simple**: Installation and upgrades are both one command. The installer auto-detects installed OpenClaw / Hermes agents, installs `@memtensor/memos-local-plugin`, creates runtime folders, and restarts the target runtime. | + +--- + +## Quick Install + +### Cloud Plugin (3 steps) + +1. **Install the plugin** + + ```bash + openclaw plugins install @memtensor/memos-cloud-openclaw-plugin@latest + ``` + +2. **Get and configure API Key** + + Get your API Key: [MemOS Cloud Dashboard](https://memos-dashboard.openmem.net/apikeys/) + + ```bash + mkdir -p ~/.openclaw && echo "MEMOS_API_KEY=mpg-..." > ~/.openclaw/.env + ``` + +3. **Restart the gateway** + + ```bash + openclaw gateway restart + ``` + +**Manually update the plugin**: +```bash +openclaw plugins update @memtensor/memos-cloud-openclaw-plugin@latest +openclaw gateway restart +``` + +> For more details, see the [OpenClaw Cloud Plugin documentation](/openclaw/guide#quick-start). + +### Local Plugin (one command) + +```bash +# Install the plugin +curl -fsSL https://raw.githubusercontent.com/MemTensor/MemOS/main/apps/memos-local-plugin/install.sh | bash +``` + +Installation and upgrades use the same command. The installer auto-detects whether OpenClaw and/or Hermes are installed. In an interactive terminal, it asks which agent to install for; in non-interactive environments, it installs for the detected agent(s). + +| Agent | Code directory | Data and config directory | Viewer | +| --- | --- | --- | --- | +| OpenClaw | `~/.openclaw/plugins/memos-local-plugin/` | `~/.openclaw/memos-plugin/` | `http://127.0.0.1:18799` | +| Hermes | `~/.hermes/plugins/memos-local-plugin/` | `~/.hermes/memos-plugin/` | `http://127.0.0.1:18800` | + +> Upgrading or uninstalling plugin code does not delete existing local data, skill packages, or logs. OpenClaw and Hermes each run their own Viewer; there is no shared port or read-only peer view. 
+> +> Configure models, team sharing, and general options from the Memory Viewer for the target agent: OpenClaw defaults to `http://127.0.0.1:18799`, and Hermes defaults to `http://127.0.0.1:18800`. diff --git a/docs/openapi.json b/docs/openapi.json deleted file mode 100644 index d9ef710b..00000000 --- a/docs/openapi.json +++ /dev/null @@ -1,3569 +0,0 @@ -{ - "openapi": "3.1.0", - "info": { - "title": "MemOS Server REST APIs", - "description": "A REST API for managing multiple users with MemOS Server.", - "version": "1.0.1" - }, - "paths": { - "/product/search": { - "post": { - "tags": [ - "Server API" - ], - "summary": "Search memories", - "description": "Search memories for a specific user.\n\nThis endpoint uses the class-based SearchHandler for better code organization.", - "operationId": "search_memories_product_search_post", - "requestBody": { - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/APISearchRequest" - } - } - }, - "required": true - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/SearchResponse" - } - } - } - }, - "422": { - "description": "Validation Error", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/HTTPValidationError" - } - } - } - } - } - } - }, - "/product/add": { - "post": { - "tags": [ - "Server API" - ], - "summary": "Add memories", - "description": "Add memories for a specific user.\n\nThis endpoint uses the class-based AddHandler for better code organization.", - "operationId": "add_memories_product_add_post", - "requestBody": { - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/APIADDRequest" - } - } - }, - "required": true - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/MemoryResponse" - } - } - } - }, - "422": { - "description": "Validation Error", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/HTTPValidationError" - } - } - } - } - } - } - }, - "/product/scheduler/allstatus": { - "get": { - "tags": [ - "Server API" - ], - "summary": "Get detailed scheduler status", - "description": "Get detailed scheduler status including running tasks and queue metrics.", - "operationId": "scheduler_allstatus_product_scheduler_allstatus_get", - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/AllStatusResponse" - } - } - } - } - } - } - }, - "/product/scheduler/status": { - "get": { - "tags": [ - "Server API" - ], - "summary": "Get scheduler running status", - "description": "Get scheduler running status.", - "operationId": "scheduler_status_product_scheduler_status_get", - "parameters": [ - { - "name": "user_id", - "in": "query", - "required": true, - "schema": { - "type": "string", - "description": "User ID", - "title": "User Id" - }, - "description": "User ID" - }, - { - "name": "task_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "description": "Optional Task ID to query a specific task", - "title": "Task Id" - }, - "description": "Optional Task ID to query a specific task" - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": 
"#/components/schemas/StatusResponse" - } - } - } - }, - "422": { - "description": "Validation Error", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/HTTPValidationError" - } - } - } - } - } - } - }, - "/product/scheduler/task_queue_status": { - "get": { - "tags": [ - "Server API" - ], - "summary": "Get scheduler task queue status", - "description": "Get scheduler task queue backlog/pending status for a user.", - "operationId": "scheduler_task_queue_status_product_scheduler_task_queue_status_get", - "parameters": [ - { - "name": "user_id", - "in": "query", - "required": true, - "schema": { - "type": "string", - "description": "User ID whose queue status is requested", - "title": "User Id" - }, - "description": "User ID whose queue status is requested" - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/TaskQueueResponse" - } - } - } - }, - "422": { - "description": "Validation Error", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/HTTPValidationError" - } - } - } - } - } - } - }, - "/product/scheduler/wait": { - "post": { - "tags": [ - "Server API" - ], - "summary": "Wait until scheduler is idle for a specific user", - "description": "Wait until scheduler is idle for a specific user.", - "operationId": "scheduler_wait_product_scheduler_wait_post", - "parameters": [ - { - "name": "user_name", - "in": "query", - "required": true, - "schema": { - "type": "string", - "title": "User Name" - } - }, - { - "name": "timeout_seconds", - "in": "query", - "required": false, - "schema": { - "type": "number", - "default": 120.0, - "title": "Timeout Seconds" - } - }, - { - "name": "poll_interval", - "in": "query", - "required": false, - "schema": { - "type": "number", - "default": 0.5, - "title": "Poll Interval" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": {} - } - } - }, - "422": { - "description": "Validation Error", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/HTTPValidationError" - } - } - } - } - } - } - }, - "/product/scheduler/wait/stream": { - "get": { - "tags": [ - "Server API" - ], - "summary": "Stream scheduler progress for a user", - "description": "Stream scheduler progress via Server-Sent Events (SSE).", - "operationId": "scheduler_wait_stream_product_scheduler_wait_stream_get", - "parameters": [ - { - "name": "user_name", - "in": "query", - "required": true, - "schema": { - "type": "string", - "title": "User Name" - } - }, - { - "name": "timeout_seconds", - "in": "query", - "required": false, - "schema": { - "type": "number", - "default": 120.0, - "title": "Timeout Seconds" - } - }, - { - "name": "poll_interval", - "in": "query", - "required": false, - "schema": { - "type": "number", - "default": 0.5, - "title": "Poll Interval" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": {} - } - } - }, - "422": { - "description": "Validation Error", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/HTTPValidationError" - } - } - } - } - } - } - }, - "/product/chat/complete": { - "post": { - "tags": [ - "Server API" - ], - "summary": "Chat with MemOS (Complete Response)", - "description": "Chat with MemOS for a specific user. 
Returns complete response (non-streaming).\n\nThis endpoint uses the class-based ChatHandler.", - "operationId": "chat_complete_product_chat_complete_post", - "requestBody": { - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/APIChatCompleteRequest" - } - } - }, - "required": true - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": {} - } - } - }, - "422": { - "description": "Validation Error", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/HTTPValidationError" - } - } - } - } - } - } - }, - "/product/chat/stream": { - "post": { - "tags": [ - "Server API" - ], - "summary": "Chat with MemOS", - "description": "Chat with MemOS for a specific user. Returns SSE stream.\n\nThis endpoint uses the class-based ChatHandler which internally\ncomposes SearchHandler and AddHandler for a clean architecture.", - "operationId": "chat_stream_product_chat_stream_post", - "requestBody": { - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/ChatRequest" - } - } - }, - "required": true - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": {} - } - } - }, - "422": { - "description": "Validation Error", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/HTTPValidationError" - } - } - } - } - } - } - }, - "/product/chat/stream/playground": { - "post": { - "tags": [ - "Server API" - ], - "summary": "Chat with MemOS playground", - "description": "Chat with MemOS for a specific user. Returns SSE stream.\n\nThis endpoint uses the class-based ChatHandler which internally\ncomposes SearchHandler and AddHandler for a clean architecture.", - "operationId": "chat_stream_playground_product_chat_stream_playground_post", - "requestBody": { - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/ChatPlaygroundRequest" - } - } - }, - "required": true - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": {} - } - } - }, - "422": { - "description": "Validation Error", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/HTTPValidationError" - } - } - } - } - } - } - }, - "/product/suggestions": { - "post": { - "tags": [ - "Server API" - ], - "summary": "Get suggestion queries", - "description": "Get suggestion queries for a specific user with language preference.", - "operationId": "get_suggestion_queries_product_suggestions_post", - "requestBody": { - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/SuggestionRequest" - } - } - }, - "required": true - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/SuggestionResponse" - } - } - } - }, - "422": { - "description": "Validation Error", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/HTTPValidationError" - } - } - } - } - } - } - }, - "/product/get_all": { - "post": { - "tags": [ - "Server API" - ], - "summary": "Get all memories for user", - "description": "Get all memories or subgraph for a specific user.\n\nIf search_query is provided, returns a subgraph based on the query.\nOtherwise, returns all memories of the specified type.", - "operationId": "get_all_memories_product_get_all_post", - 
"requestBody": { - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/GetMemoryPlaygroundRequest" - } - } - }, - "required": true - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/MemoryResponse" - } - } - } - }, - "422": { - "description": "Validation Error", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/HTTPValidationError" - } - } - } - } - } - } - }, - "/product/get_memory": { - "post": { - "tags": [ - "Server API" - ], - "summary": "Get memories for user", - "operationId": "get_memories_product_get_memory_post", - "requestBody": { - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/GetMemoryRequest" - } - } - }, - "required": true - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/GetMemoryResponse" - } - } - } - }, - "422": { - "description": "Validation Error", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/HTTPValidationError" - } - } - } - } - } - } - }, - "/product/get_memory/{memory_id}": { - "get": { - "tags": [ - "Server API" - ], - "summary": "Get memory by id", - "operationId": "get_memory_by_id_product_get_memory__memory_id__get", - "parameters": [ - { - "name": "memory_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "title": "Memory Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/GetMemoryResponse" - } - } - } - }, - "422": { - "description": "Validation Error", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/HTTPValidationError" - } - } - } - } - } - } - }, - "/product/delete_memory": { - "post": { - "tags": [ - "Server API" - ], - "summary": "Delete memories for user", - "operationId": "delete_memories_product_delete_memory_post", - "requestBody": { - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/DeleteMemoryRequest" - } - } - }, - "required": true - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/DeleteMemoryResponse" - } - } - } - }, - "422": { - "description": "Validation Error", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/HTTPValidationError" - } - } - } - } - } - } - }, - "/product/feedback": { - "post": { - "tags": [ - "Server API" - ], - "summary": "Feedback memories", - "description": "Feedback memories for a specific user.\n\nThis endpoint uses the class-based FeedbackHandler for better code organization.", - "operationId": "feedback_memories_product_feedback_post", - "requestBody": { - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/APIFeedbackRequest" - } - } - }, - "required": true - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/MemoryResponse" - } - } - } - }, - "422": { - "description": "Validation Error", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/HTTPValidationError" - } - } - } - } - } - } - }, - "/product/get_user_names_by_memory_ids": { - "post": { - 
"tags": [ - "Server API" - ], - "summary": "Get user names by memory ids", - "description": "Get user names by memory ids.", - "operationId": "get_user_names_by_memory_ids_product_get_user_names_by_memory_ids_post", - "requestBody": { - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/GetUserNamesByMemoryIdsRequest" - } - } - }, - "required": true - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/GetUserNamesByMemoryIdsResponse" - } - } - } - }, - "422": { - "description": "Validation Error", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/HTTPValidationError" - } - } - } - } - } - } - }, - "/product/exist_mem_cube_id": { - "post": { - "tags": [ - "Server API" - ], - "summary": "Check if mem cube id exists", - "description": "Check if mem cube id exists.", - "operationId": "exist_mem_cube_id_product_exist_mem_cube_id_post", - "requestBody": { - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/ExistMemCubeIdRequest" - } - } - }, - "required": true - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/ExistMemCubeIdResponse" - } - } - } - }, - "422": { - "description": "Validation Error", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/HTTPValidationError" - } - } - } - } - } - } - } - }, - "components": { - "schemas": { - "APIADDRequest": { - "properties": { - "user_id": { - "type": "string", - "title": "User Id", - "description": "User ID" - }, - "session_id": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "Session Id", - "description": "Session ID. If not provided, a default session will be used." - }, - "task_id": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "Task Id", - "description": "Task ID for monitering async tasks" - }, - "writable_cube_ids": { - "anyOf": [ - { - "items": { - "type": "string" - }, - "type": "array" - }, - { - "type": "null" - } - ], - "title": "Writable Cube Ids", - "description": "List of cube IDs user can write for multi-cube add" - }, - "async_mode": { - "type": "string", - "enum": [ - "async", - "sync" - ], - "title": "Async Mode", - "description": "Whether to add memory in async mode. Use 'async' to enqueue background add (non-blocking), or 'sync' to add memories in the current call. Default: 'async'.", - "default": "async" - }, - "mode": { - "anyOf": [ - { - "type": "string", - "enum": [ - "fast", - "fine" - ] - }, - { - "type": "null" - } - ], - "title": "Mode", - "description": "(Internal) Add mode used only when async_mode='sync'. If set to 'fast', the handler will use a fast add pipeline. Ignored when async_mode='async'." - }, - "custom_tags": { - "anyOf": [ - { - "items": { - "type": "string" - }, - "type": "array" - }, - { - "type": "null" - } - ], - "title": "Custom Tags", - "description": "Custom tags for this add request, e.g. ['Travel', 'family']. These tags can be used as filters in search." - }, - "info": { - "anyOf": [ - { - "additionalProperties": true, - "type": "object" - }, - { - "type": "null" - } - ], - "title": "Info", - "description": "Additional metadata for the add request. All keys can be used as filters in search. 
Example: {'agent_id': 'xxxxxx', 'app_id': 'xxxx', 'source_type': 'web', 'source_url': 'https://www.baidu.com', 'source_content': '西湖是杭州最著名的景点'}." - }, - "messages": { - "anyOf": [ - { - "type": "string" - }, - { - "items": { - "anyOf": [ - { - "$ref": "#/components/schemas/ChatCompletionSystemMessageParam" - }, - { - "$ref": "#/components/schemas/ChatCompletionUserMessageParam" - }, - { - "$ref": "#/components/schemas/ChatCompletionAssistantMessageParam" - }, - { - "$ref": "#/components/schemas/ChatCompletionToolMessageParam" - } - ] - }, - "type": "array" - }, - { - "items": { - "anyOf": [ - { - "$ref": "#/components/schemas/ChatCompletionContentPartTextParam" - }, - { - "$ref": "#/components/schemas/File" - } - ] - }, - "type": "array" - }, - { - "type": "null" - } - ], - "title": "Messages", - "description": "List of messages to store. Supports: - system / user / assistant messages with 'content' and 'chat_time'; - tool messages including: * tool_description (name, description, parameters), * tool_input (call_id, name, argument), * raw tool messages where content is str or list[str], * tool_output with structured output items (input_text / input_image / input_file, etc.). Also supports pure input items when there is no dialog." - }, - "chat_history": { - "anyOf": [ - { - "items": { - "anyOf": [ - { - "$ref": "#/components/schemas/ChatCompletionSystemMessageParam" - }, - { - "$ref": "#/components/schemas/ChatCompletionUserMessageParam" - }, - { - "$ref": "#/components/schemas/ChatCompletionAssistantMessageParam" - }, - { - "$ref": "#/components/schemas/ChatCompletionToolMessageParam" - } - ] - }, - "type": "array" - }, - { - "type": "null" - } - ], - "title": "Chat History", - "description": "Historical chat messages used internally by algorithms. If None, internal stored history will be used; if provided (even an empty list), this value will be used as-is." - }, - "is_feedback": { - "type": "boolean", - "title": "Is Feedback", - "description": "Whether this request represents user feedback. Default: False.", - "default": false - }, - "mem_cube_id": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "Mem Cube Id", - "description": "(Deprecated) Target cube ID for this add request (optional for developer API)." - }, - "memory_content": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "Memory Content", - "description": "(Deprecated) Plain memory content to store. Prefer using `messages`." - }, - "doc_path": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "Doc Path", - "description": "(Deprecated / internal) Path to document to store." - }, - "source": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "Source", - "description": "(Deprecated) Simple source tag of the memory. Prefer using `info.source_type` / `info.source_url`." - }, - "operation": { - "anyOf": [ - { - "items": { - "$ref": "#/components/schemas/PermissionDict" - }, - "type": "array" - }, - { - "type": "null" - } - ], - "title": "Operation", - "description": "(Internal) Operation definitions for multi-cube write permissions." - } - }, - "type": "object", - "title": "APIADDRequest", - "description": "Request model for creating memories." 
- }, - "APIChatCompleteRequest": { - "properties": { - "user_id": { - "type": "string", - "title": "User Id", - "description": "User ID" - }, - "query": { - "type": "string", - "title": "Query", - "description": "Chat query message" - }, - "readable_cube_ids": { - "anyOf": [ - { - "items": { - "type": "string" - }, - "type": "array" - }, - { - "type": "null" - } - ], - "title": "Readable Cube Ids", - "description": "List of cube IDs user can read for multi-cube chat" - }, - "writable_cube_ids": { - "anyOf": [ - { - "items": { - "type": "string" - }, - "type": "array" - }, - { - "type": "null" - } - ], - "title": "Writable Cube Ids", - "description": "List of cube IDs user can write for multi-cube chat" - }, - "history": { - "anyOf": [ - { - "items": { - "anyOf": [ - { - "$ref": "#/components/schemas/ChatCompletionSystemMessageParam" - }, - { - "$ref": "#/components/schemas/ChatCompletionUserMessageParam" - }, - { - "$ref": "#/components/schemas/ChatCompletionAssistantMessageParam" - }, - { - "$ref": "#/components/schemas/ChatCompletionToolMessageParam" - } - ] - }, - "type": "array" - }, - { - "type": "null" - } - ], - "title": "History", - "description": "Chat history" - }, - "mode": { - "$ref": "#/components/schemas/SearchMode", - "description": "search mode: fast, fine, or mixture", - "default": "fast" - }, - "system_prompt": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "System Prompt", - "description": "Base system prompt to use for chat" - }, - "top_k": { - "type": "integer", - "title": "Top K", - "description": "Number of results to return", - "default": 10 - }, - "session_id": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "Session Id", - "description": "Session ID for soft-filtering memories" - }, - "include_preference": { - "type": "boolean", - "title": "Include Preference", - "description": "Whether to handle preference memory", - "default": true - }, - "pref_top_k": { - "type": "integer", - "title": "Pref Top K", - "description": "Number of preference results to return", - "default": 6 - }, - "model_name_or_path": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "Model Name Or Path", - "description": "Model name to use for chat" - }, - "max_tokens": { - "anyOf": [ - { - "type": "integer" - }, - { - "type": "null" - } - ], - "title": "Max Tokens", - "description": "Max tokens to generate" - }, - "temperature": { - "anyOf": [ - { - "type": "number" - }, - { - "type": "null" - } - ], - "title": "Temperature", - "description": "Temperature for sampling" - }, - "top_p": { - "anyOf": [ - { - "type": "number" - }, - { - "type": "null" - } - ], - "title": "Top P", - "description": "Top-p (nucleus) sampling parameter" - }, - "add_message_on_answer": { - "type": "boolean", - "title": "Add Message On Answer", - "description": "Add dialogs to memory after chat", - "default": true - }, - "filter": { - "anyOf": [ - { - "additionalProperties": true, - "type": "object" - }, - { - "type": "null" - } - ], - "title": "Filter", - "description": "\n Filter for the memory, example:\n {\n \"`and` or `or`\": [\n {\"id\": \"uuid-xxx\"},\n {\"created_at\": {\"gt\": \"2024-01-01\"}},\n ]\n }\n " - }, - "internet_search": { - "type": "boolean", - "title": "Internet Search", - "description": "Whether to use internet search", - "default": false - }, - "threshold": { - "type": "number", - "title": "Threshold", - "description": "Threshold for filtering references", - "default": 0.5 - }, - 
"mem_cube_id": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "Mem Cube Id", - "description": "Cube ID to use for chat" - }, - "moscube": { - "type": "boolean", - "title": "Moscube", - "description": "(Deprecated) Whether to use legacy MemOSCube pipeline", - "default": false - } - }, - "type": "object", - "required": [ - "user_id", - "query" - ], - "title": "APIChatCompleteRequest", - "description": "Request model for chat operations." - }, - "APIFeedbackRequest": { - "properties": { - "user_id": { - "type": "string", - "title": "User Id", - "description": "User ID" - }, - "session_id": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "Session Id", - "description": "Session ID for soft-filtering memories", - "default": "default_session" - }, - "task_id": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "Task Id", - "description": "Task ID for monitering async tasks" - }, - "history": { - "anyOf": [ - { - "items": { - "anyOf": [ - { - "$ref": "#/components/schemas/ChatCompletionSystemMessageParam" - }, - { - "$ref": "#/components/schemas/ChatCompletionUserMessageParam" - }, - { - "$ref": "#/components/schemas/ChatCompletionAssistantMessageParam" - }, - { - "$ref": "#/components/schemas/ChatCompletionToolMessageParam" - } - ] - }, - "type": "array" - }, - { - "type": "null" - } - ], - "title": "History", - "description": "Chat history" - }, - "retrieved_memory_ids": { - "anyOf": [ - { - "items": { - "type": "string" - }, - "type": "array" - }, - { - "type": "null" - } - ], - "title": "Retrieved Memory Ids", - "description": "Retrieved memory ids at last turn" - }, - "feedback_content": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "Feedback Content", - "description": "Feedback content to process" - }, - "feedback_time": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "Feedback Time", - "description": "Feedback time" - }, - "writable_cube_ids": { - "anyOf": [ - { - "items": { - "type": "string" - }, - "type": "array" - }, - { - "type": "null" - } - ], - "title": "Writable Cube Ids", - "description": "List of cube IDs user can write for multi-cube add" - }, - "async_mode": { - "type": "string", - "enum": [ - "sync", - "async" - ], - "title": "Async Mode", - "description": "feedback mode: sync or async", - "default": "async" - }, - "corrected_answer": { - "type": "boolean", - "title": "Corrected Answer", - "description": "Whether need return corrected answer", - "default": false - }, - "info": { - "anyOf": [ - { - "additionalProperties": true, - "type": "object" - }, - { - "type": "null" - } - ], - "title": "Info", - "description": "Additional metadata for the add request. All keys can be used as filters in search. Example: {'agent_id': 'xxxxxx', 'app_id': 'xxxx', 'source_type': 'web', 'source_url': 'https://www.baidu.com', 'source_content': 'West Lake is the most famous scenic spot in Hangzhou'}." - }, - "mem_cube_id": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "Mem Cube Id", - "description": "(Deprecated) Single cube ID to search in. Prefer `readable_cube_ids` for multi-cube search." - } - }, - "type": "object", - "required": [ - "user_id", - "history", - "feedback_content" - ], - "title": "APIFeedbackRequest", - "description": "Request model for processing feedback info." 
- }, - "APISearchRequest": { - "properties": { - "query": { - "type": "string", - "title": "Query", - "description": "User search query" - }, - "user_id": { - "type": "string", - "title": "User Id", - "description": "User ID" - }, - "readable_cube_ids": { - "anyOf": [ - { - "items": { - "type": "string" - }, - "type": "array" - }, - { - "type": "null" - } - ], - "title": "Readable Cube Ids", - "description": "List of cube IDs that are readable for this request. Required for algorithm-facing API; optional for developer-facing API." - }, - "mode": { - "$ref": "#/components/schemas/SearchMode", - "description": "Search mode: fast, fine, or mixture.", - "default": "fast" - }, - "session_id": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "Session Id", - "description": "Session ID used as a soft signal to prioritize more relevant memories. Only used for weighting, not as a hard filter." - }, - "top_k": { - "type": "integer", - "minimum": 1.0, - "title": "Top K", - "description": "Number of textual memories to retrieve (top-K). Default: 10.", - "default": 10 - }, - "dedup": { - "anyOf": [ - { - "type": "string", - "enum": [ - "no", - "sim" - ] - }, - { - "type": "null" - } - ], - "title": "Dedup", - "description": "Optional dedup option for textual memories. Use 'no' for no dedup, 'sim' for similarity dedup. If None, default exact-text dedup is applied." - }, - "pref_top_k": { - "type": "integer", - "minimum": 0.0, - "title": "Pref Top K", - "description": "Number of preference memories to retrieve (top-K). Default: 6.", - "default": 6 - }, - "include_preference": { - "type": "boolean", - "title": "Include Preference", - "description": "Whether to retrieve preference memories along with general memories. If enabled, the system will automatically recall user preferences relevant to the query. Default: True.", - "default": true - }, - "search_tool_memory": { - "type": "boolean", - "title": "Search Tool Memory", - "description": "Whether to retrieve tool memories along with general memories. If enabled, the system will automatically recall tool memories relevant to the query. Default: True.", - "default": true - }, - "tool_mem_top_k": { - "type": "integer", - "minimum": 0.0, - "title": "Tool Mem Top K", - "description": "Number of tool memories to retrieve (top-K). Default: 6.", - "default": 6 - }, - "filter": { - "anyOf": [ - { - "additionalProperties": true, - "type": "object" - }, - { - "type": "null" - } - ], - "title": "Filter", - "description": "\n Filter for the memory, example:\n {\n \"`and` or `or`\": [\n {\"id\": \"uuid-xxx\"},\n {\"created_at\": {\"gt\": \"2024-01-01\"}},\n ]\n }\n " - }, - "internet_search": { - "type": "boolean", - "title": "Internet Search", - "description": "Whether to enable internet search in addition to memory search. Primarily used by internal algorithms. Default: False.", - "default": false - }, - "threshold": { - "anyOf": [ - { - "type": "number" - }, - { - "type": "null" - } - ], - "title": "Threshold", - "description": "Internal similarity threshold for searching plaintext memories. If None, default thresholds will be applied." 
- }, - "search_memory_type": { - "type": "string", - "title": "Search Memory Type", - "description": "Type of memory to search: All, WorkingMemory, LongTermMemory, UserMemory, OuterMemory, ToolSchemaMemory, ToolTrajectoryMemory", - "default": "All" - }, - "chat_history": { - "anyOf": [ - { - "items": { - "anyOf": [ - { - "$ref": "#/components/schemas/ChatCompletionSystemMessageParam" - }, - { - "$ref": "#/components/schemas/ChatCompletionUserMessageParam" - }, - { - "$ref": "#/components/schemas/ChatCompletionAssistantMessageParam" - }, - { - "$ref": "#/components/schemas/ChatCompletionToolMessageParam" - } - ] - }, - "type": "array" - }, - { - "type": "null" - } - ], - "title": "Chat History", - "description": "Historical chat messages used internally by algorithms. If None, internal stored history may be used; if provided (even an empty list), this value will be used as-is." - }, - "mem_cube_id": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "Mem Cube Id", - "description": "(Deprecated) Single cube ID to search in. Prefer `readable_cube_ids` for multi-cube search." - }, - "moscube": { - "type": "boolean", - "title": "Moscube", - "description": "(Deprecated / internal) Whether to use legacy MemOSCube path.", - "default": false - }, - "operation": { - "anyOf": [ - { - "items": { - "$ref": "#/components/schemas/PermissionDict" - }, - "type": "array" - }, - { - "type": "null" - } - ], - "title": "Operation", - "description": "(Internal) Operation definitions for multi-cube read permissions." - }, - "source": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "Source", - "description": "Source of the search query [plugin will router diff search]" - } - }, - "type": "object", - "required": [ - "query", - "user_id" - ], - "title": "APISearchRequest", - "description": "Request model for searching memories." - }, - "AllStatusResponse": { - "properties": { - "code": { - "type": "integer", - "title": "Code", - "description": "Response status code", - "default": 200 - }, - "message": { - "type": "string", - "title": "Message", - "default": "Scheduler status summary retrieved successfully" - }, - "data": { - "anyOf": [ - { - "$ref": "#/components/schemas/AllStatusResponseData" - }, - { - "type": "null" - } - ], - "description": "Response data" - } - }, - "type": "object", - "title": "AllStatusResponse", - "description": "Response model for full scheduler status operations." - }, - "AllStatusResponseData": { - "properties": { - "scheduler_summary": { - "$ref": "#/components/schemas/TaskSummary", - "description": "Aggregated status for scheduler-managed tasks" - }, - "all_tasks_summary": { - "$ref": "#/components/schemas/TaskSummary", - "description": "Aggregated status for all tracked tasks" - } - }, - "type": "object", - "required": [ - "scheduler_summary", - "all_tasks_summary" - ], - "title": "AllStatusResponseData", - "description": "Aggregated scheduler status metrics." 
- }, - "Audio": { - "properties": { - "id": { - "type": "string", - "title": "Id" - } - }, - "type": "object", - "required": [ - "id" - ], - "title": "Audio" - }, - "ChatCompletionAssistantMessageParam": { - "properties": { - "role": { - "type": "string", - "const": "assistant", - "title": "Role" - }, - "audio": { - "anyOf": [ - { - "$ref": "#/components/schemas/Audio" - }, - { - "type": "null" - } - ] - }, - "content": { - "anyOf": [ - { - "type": "string" - }, - { - "items": { - "anyOf": [ - { - "$ref": "#/components/schemas/ChatCompletionContentPartTextParam" - }, - { - "$ref": "#/components/schemas/ChatCompletionContentPartRefusalParam" - } - ] - }, - "type": "array" - }, - { - "$ref": "#/components/schemas/ChatCompletionContentPartTextParam" - }, - { - "$ref": "#/components/schemas/ChatCompletionContentPartRefusalParam" - }, - { - "type": "null" - } - ], - "title": "Content" - }, - "refusal": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "Refusal" - }, - "tool_calls": { - "anyOf": [ - { - "items": { - "anyOf": [ - { - "$ref": "#/components/schemas/ChatCompletionMessageFunctionToolCallParam" - }, - { - "$ref": "#/components/schemas/ChatCompletionMessageCustomToolCallParam" - } - ] - }, - "type": "array" - }, - { - "$ref": "#/components/schemas/ChatCompletionMessageFunctionToolCallParam" - }, - { - "$ref": "#/components/schemas/ChatCompletionMessageCustomToolCallParam" - } - ], - "title": "Tool Calls" - }, - "chat_time": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "Chat Time" - }, - "message_id": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "Message Id" - } - }, - "type": "object", - "required": [ - "role" - ], - "title": "ChatCompletionAssistantMessageParam" - }, - "ChatCompletionContentPartImageParam": { - "properties": { - "image_url": { - "$ref": "#/components/schemas/ImageURL" - }, - "type": { - "type": "string", - "const": "image_url", - "title": "Type" - } - }, - "type": "object", - "required": [ - "image_url", - "type" - ], - "title": "ChatCompletionContentPartImageParam" - }, - "ChatCompletionContentPartInputAudioParam": { - "properties": { - "input_audio": { - "$ref": "#/components/schemas/InputAudio" - }, - "type": { - "type": "string", - "const": "input_audio", - "title": "Type" - } - }, - "type": "object", - "required": [ - "input_audio", - "type" - ], - "title": "ChatCompletionContentPartInputAudioParam" - }, - "ChatCompletionContentPartRefusalParam": { - "properties": { - "refusal": { - "type": "string", - "title": "Refusal" - }, - "type": { - "type": "string", - "const": "refusal", - "title": "Type" - } - }, - "type": "object", - "required": [ - "refusal", - "type" - ], - "title": "ChatCompletionContentPartRefusalParam" - }, - "ChatCompletionContentPartTextParam": { - "properties": { - "text": { - "type": "string", - "title": "Text" - }, - "type": { - "type": "string", - "const": "text", - "title": "Type" - } - }, - "type": "object", - "required": [ - "text", - "type" - ], - "title": "ChatCompletionContentPartTextParam" - }, - "ChatCompletionMessageCustomToolCallParam": { - "properties": { - "id": { - "type": "string", - "title": "Id" - }, - "custom": { - "$ref": "#/components/schemas/Custom" - }, - "type": { - "type": "string", - "const": "custom", - "title": "Type" - } - }, - "type": "object", - "required": [ - "id", - "custom", - "type" - ], - "title": "ChatCompletionMessageCustomToolCallParam" - }, - "ChatCompletionMessageFunctionToolCallParam": { - 
"properties": { - "id": { - "type": "string", - "title": "Id" - }, - "function": { - "$ref": "#/components/schemas/Function" - }, - "type": { - "type": "string", - "const": "function", - "title": "Type" - } - }, - "type": "object", - "required": [ - "id", - "function", - "type" - ], - "title": "ChatCompletionMessageFunctionToolCallParam" - }, - "ChatCompletionSystemMessageParam": { - "properties": { - "content": { - "anyOf": [ - { - "type": "string" - }, - { - "items": { - "$ref": "#/components/schemas/ChatCompletionContentPartTextParam" - }, - "type": "array" - }, - { - "$ref": "#/components/schemas/ChatCompletionContentPartTextParam" - } - ], - "title": "Content" - }, - "role": { - "type": "string", - "const": "system", - "title": "Role" - }, - "name": { - "type": "string", - "title": "Name" - }, - "chat_time": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "Chat Time" - }, - "message_id": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "Message Id" - } - }, - "type": "object", - "required": [ - "content", - "role" - ], - "title": "ChatCompletionSystemMessageParam" - }, - "ChatCompletionToolMessageParam": { - "properties": { - "content": { - "anyOf": [ - { - "type": "string" - }, - { - "items": { - "anyOf": [ - { - "$ref": "#/components/schemas/ChatCompletionContentPartTextParam" - }, - { - "$ref": "#/components/schemas/ChatCompletionContentPartImageParam" - }, - { - "$ref": "#/components/schemas/ChatCompletionContentPartInputAudioParam" - }, - { - "$ref": "#/components/schemas/File" - } - ] - }, - "type": "array" - }, - { - "$ref": "#/components/schemas/ChatCompletionContentPartTextParam" - }, - { - "$ref": "#/components/schemas/ChatCompletionContentPartImageParam" - }, - { - "$ref": "#/components/schemas/ChatCompletionContentPartInputAudioParam" - }, - { - "$ref": "#/components/schemas/File" - } - ], - "title": "Content" - }, - "role": { - "type": "string", - "const": "tool", - "title": "Role" - }, - "tool_call_id": { - "type": "string", - "title": "Tool Call Id" - }, - "chat_time": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "Chat Time" - }, - "message_id": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "Message Id" - } - }, - "type": "object", - "required": [ - "content", - "role", - "tool_call_id" - ], - "title": "ChatCompletionToolMessageParam" - }, - "ChatCompletionUserMessageParam": { - "properties": { - "content": { - "anyOf": [ - { - "type": "string" - }, - { - "items": { - "anyOf": [ - { - "$ref": "#/components/schemas/ChatCompletionContentPartTextParam" - }, - { - "$ref": "#/components/schemas/ChatCompletionContentPartImageParam" - }, - { - "$ref": "#/components/schemas/ChatCompletionContentPartInputAudioParam" - }, - { - "$ref": "#/components/schemas/File" - } - ] - }, - "type": "array" - }, - { - "$ref": "#/components/schemas/ChatCompletionContentPartTextParam" - }, - { - "$ref": "#/components/schemas/ChatCompletionContentPartImageParam" - }, - { - "$ref": "#/components/schemas/ChatCompletionContentPartInputAudioParam" - }, - { - "$ref": "#/components/schemas/File" - } - ], - "title": "Content" - }, - "role": { - "type": "string", - "const": "user", - "title": "Role" - }, - "name": { - "type": "string", - "title": "Name" - }, - "chat_time": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "Chat Time" - }, - "message_id": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], 
- "title": "Message Id" - } - }, - "type": "object", - "required": [ - "content", - "role" - ], - "title": "ChatCompletionUserMessageParam" - }, - "ChatPlaygroundRequest": { - "properties": { - "user_id": { - "type": "string", - "title": "User Id", - "description": "User ID" - }, - "query": { - "type": "string", - "title": "Query", - "description": "Chat query message" - }, - "readable_cube_ids": { - "anyOf": [ - { - "items": { - "type": "string" - }, - "type": "array" - }, - { - "type": "null" - } - ], - "title": "Readable Cube Ids", - "description": "List of cube IDs user can read for multi-cube chat" - }, - "writable_cube_ids": { - "anyOf": [ - { - "items": { - "type": "string" - }, - "type": "array" - }, - { - "type": "null" - } - ], - "title": "Writable Cube Ids", - "description": "List of cube IDs user can write for multi-cube chat" - }, - "history": { - "anyOf": [ - { - "items": { - "anyOf": [ - { - "$ref": "#/components/schemas/ChatCompletionSystemMessageParam" - }, - { - "$ref": "#/components/schemas/ChatCompletionUserMessageParam" - }, - { - "$ref": "#/components/schemas/ChatCompletionAssistantMessageParam" - }, - { - "$ref": "#/components/schemas/ChatCompletionToolMessageParam" - } - ] - }, - "type": "array" - }, - { - "type": "null" - } - ], - "title": "History", - "description": "Chat history" - }, - "mode": { - "$ref": "#/components/schemas/SearchMode", - "description": "search mode: fast, fine, or mixture", - "default": "fast" - }, - "system_prompt": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "System Prompt", - "description": "Base system prompt to use for chat" - }, - "top_k": { - "type": "integer", - "title": "Top K", - "description": "Number of results to return", - "default": 10 - }, - "session_id": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "Session Id", - "description": "Session ID for soft-filtering memories" - }, - "include_preference": { - "type": "boolean", - "title": "Include Preference", - "description": "Whether to handle preference memory", - "default": true - }, - "pref_top_k": { - "type": "integer", - "title": "Pref Top K", - "description": "Number of preference results to return", - "default": 6 - }, - "model_name_or_path": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "Model Name Or Path", - "description": "Model name to use for chat" - }, - "max_tokens": { - "anyOf": [ - { - "type": "integer" - }, - { - "type": "null" - } - ], - "title": "Max Tokens", - "description": "Max tokens to generate" - }, - "temperature": { - "anyOf": [ - { - "type": "number" - }, - { - "type": "null" - } - ], - "title": "Temperature", - "description": "Temperature for sampling" - }, - "top_p": { - "anyOf": [ - { - "type": "number" - }, - { - "type": "null" - } - ], - "title": "Top P", - "description": "Top-p (nucleus) sampling parameter" - }, - "add_message_on_answer": { - "type": "boolean", - "title": "Add Message On Answer", - "description": "Add dialogs to memory after chat", - "default": true - }, - "filter": { - "anyOf": [ - { - "additionalProperties": true, - "type": "object" - }, - { - "type": "null" - } - ], - "title": "Filter", - "description": "\n Filter for the memory, example:\n {\n \"`and` or `or`\": [\n {\"id\": \"uuid-xxx\"},\n {\"created_at\": {\"gt\": \"2024-01-01\"}},\n ]\n }\n " - }, - "internet_search": { - "type": "boolean", - "title": "Internet Search", - "description": "Whether to use internet search", - "default": false - }, - 
"threshold": { - "type": "number", - "title": "Threshold", - "description": "Threshold for filtering references", - "default": 0.5 - }, - "moscube": { - "type": "boolean", - "title": "Moscube", - "description": "(Deprecated) Whether to use legacy MemOSCube pipeline.", - "default": false - }, - "mem_cube_id": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "Mem Cube Id", - "description": "(Deprecated) Single cube ID to use for chat. Prefer `readable_cube_ids` / `writable_cube_ids` for multi-cube chat." - }, - "beginner_guide_step": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "Beginner Guide Step", - "description": "Whether to use beginner guide, option: [first, second]" - } - }, - "type": "object", - "required": [ - "user_id", - "query" - ], - "title": "ChatPlaygroundRequest", - "description": "Request model for chat operations in playground." - }, - "ChatRequest": { - "properties": { - "user_id": { - "type": "string", - "title": "User Id", - "description": "User ID" - }, - "query": { - "type": "string", - "title": "Query", - "description": "Chat query message" - }, - "readable_cube_ids": { - "anyOf": [ - { - "items": { - "type": "string" - }, - "type": "array" - }, - { - "type": "null" - } - ], - "title": "Readable Cube Ids", - "description": "List of cube IDs user can read for multi-cube chat" - }, - "writable_cube_ids": { - "anyOf": [ - { - "items": { - "type": "string" - }, - "type": "array" - }, - { - "type": "null" - } - ], - "title": "Writable Cube Ids", - "description": "List of cube IDs user can write for multi-cube chat" - }, - "history": { - "anyOf": [ - { - "items": { - "anyOf": [ - { - "$ref": "#/components/schemas/ChatCompletionSystemMessageParam" - }, - { - "$ref": "#/components/schemas/ChatCompletionUserMessageParam" - }, - { - "$ref": "#/components/schemas/ChatCompletionAssistantMessageParam" - }, - { - "$ref": "#/components/schemas/ChatCompletionToolMessageParam" - } - ] - }, - "type": "array" - }, - { - "type": "null" - } - ], - "title": "History", - "description": "Chat history" - }, - "mode": { - "$ref": "#/components/schemas/SearchMode", - "description": "search mode: fast, fine, or mixture", - "default": "fast" - }, - "system_prompt": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "System Prompt", - "description": "Base system prompt to use for chat" - }, - "top_k": { - "type": "integer", - "title": "Top K", - "description": "Number of results to return", - "default": 10 - }, - "session_id": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "Session Id", - "description": "Session ID for soft-filtering memories" - }, - "include_preference": { - "type": "boolean", - "title": "Include Preference", - "description": "Whether to handle preference memory", - "default": true - }, - "pref_top_k": { - "type": "integer", - "title": "Pref Top K", - "description": "Number of preference results to return", - "default": 6 - }, - "model_name_or_path": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "Model Name Or Path", - "description": "Model name to use for chat" - }, - "max_tokens": { - "anyOf": [ - { - "type": "integer" - }, - { - "type": "null" - } - ], - "title": "Max Tokens", - "description": "Max tokens to generate" - }, - "temperature": { - "anyOf": [ - { - "type": "number" - }, - { - "type": "null" - } - ], - "title": "Temperature", - "description": "Temperature for sampling" - }, - "top_p": { 
- "anyOf": [ - { - "type": "number" - }, - { - "type": "null" - } - ], - "title": "Top P", - "description": "Top-p (nucleus) sampling parameter" - }, - "add_message_on_answer": { - "type": "boolean", - "title": "Add Message On Answer", - "description": "Add dialogs to memory after chat", - "default": true - }, - "filter": { - "anyOf": [ - { - "additionalProperties": true, - "type": "object" - }, - { - "type": "null" - } - ], - "title": "Filter", - "description": "\n Filter for the memory, example:\n {\n \"`and` or `or`\": [\n {\"id\": \"uuid-xxx\"},\n {\"created_at\": {\"gt\": \"2024-01-01\"}},\n ]\n }\n " - }, - "internet_search": { - "type": "boolean", - "title": "Internet Search", - "description": "Whether to use internet search", - "default": false - }, - "threshold": { - "type": "number", - "title": "Threshold", - "description": "Threshold for filtering references", - "default": 0.5 - }, - "moscube": { - "type": "boolean", - "title": "Moscube", - "description": "(Deprecated) Whether to use legacy MemOSCube pipeline.", - "default": false - }, - "mem_cube_id": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "Mem Cube Id", - "description": "(Deprecated) Single cube ID to use for chat. Prefer `readable_cube_ids` / `writable_cube_ids` for multi-cube chat." - } - }, - "type": "object", - "required": [ - "user_id", - "query" - ], - "title": "ChatRequest", - "description": "Request model for chat operations.\n\nThis model is used as the algorithm-facing chat interface, while also\nremaining backward compatible with older developer-facing APIs." - }, - "Custom": { - "properties": { - "input": { - "type": "string", - "title": "Input" - }, - "name": { - "type": "string", - "title": "Name" - } - }, - "type": "object", - "required": [ - "input", - "name" - ], - "title": "Custom" - }, - "DeleteMemoryRequest": { - "properties": { - "writable_cube_ids": { - "items": { - "type": "string" - }, - "type": "array", - "title": "Writable Cube Ids", - "description": "Writable cube IDs" - }, - "memory_ids": { - "anyOf": [ - { - "items": { - "type": "string" - }, - "type": "array" - }, - { - "type": "null" - } - ], - "title": "Memory Ids", - "description": "Memory IDs" - }, - "file_ids": { - "anyOf": [ - { - "items": { - "type": "string" - }, - "type": "array" - }, - { - "type": "null" - } - ], - "title": "File Ids", - "description": "File IDs" - }, - "filter": { - "anyOf": [ - { - "additionalProperties": true, - "type": "object" - }, - { - "type": "null" - } - ], - "title": "Filter", - "description": "Filter for the memory" - } - }, - "type": "object", - "title": "DeleteMemoryRequest", - "description": "Request model for deleting memories." - }, - "DeleteMemoryResponse": { - "properties": { - "code": { - "type": "integer", - "title": "Code", - "description": "Response status code", - "default": 200 - }, - "message": { - "type": "string", - "title": "Message", - "description": "Response message" - }, - "data": { - "anyOf": [ - { - "additionalProperties": true, - "type": "object" - }, - { - "type": "null" - } - ], - "title": "Data", - "description": "Response data" - } - }, - "type": "object", - "required": [ - "message" - ], - "title": "DeleteMemoryResponse", - "description": "Response model for deleting memories." 
- }, - "ExistMemCubeIdRequest": { - "properties": { - "mem_cube_id": { - "type": "string", - "title": "Mem Cube Id", - "description": "Mem cube ID" - } - }, - "type": "object", - "required": [ - "mem_cube_id" - ], - "title": "ExistMemCubeIdRequest", - "description": "Request model for checking if mem cube id exists." - }, - "ExistMemCubeIdResponse": { - "properties": { - "code": { - "type": "integer", - "title": "Code", - "description": "Response status code", - "default": 200 - }, - "message": { - "type": "string", - "title": "Message", - "description": "Response message" - }, - "data": { - "anyOf": [ - { - "additionalProperties": { - "type": "boolean" - }, - "type": "object" - }, - { - "type": "null" - } - ], - "title": "Data", - "description": "Response data" - } - }, - "type": "object", - "required": [ - "message" - ], - "title": "ExistMemCubeIdResponse", - "description": "Response model for checking if mem cube id exists." - }, - "File": { - "properties": { - "file": { - "$ref": "#/components/schemas/FileFile" - }, - "type": { - "type": "string", - "const": "file", - "title": "Type" - } - }, - "type": "object", - "required": [ - "file", - "type" - ], - "title": "File" - }, - "FileFile": { - "properties": { - "file_data": { - "type": "string", - "title": "File Data" - }, - "file_id": { - "type": "string", - "title": "File Id" - }, - "filename": { - "type": "string", - "title": "Filename" - } - }, - "type": "object", - "title": "FileFile" - }, - "Function": { - "properties": { - "arguments": { - "type": "string", - "title": "Arguments" - }, - "name": { - "type": "string", - "title": "Name" - } - }, - "type": "object", - "required": [ - "arguments", - "name" - ], - "title": "Function" - }, - "GetMemoryPlaygroundRequest": { - "properties": { - "user_id": { - "type": "string", - "title": "User Id", - "description": "User ID" - }, - "memory_type": { - "type": "string", - "enum": [ - "text_mem", - "act_mem", - "param_mem", - "para_mem" - ], - "title": "Memory Type", - "description": "Memory type" - }, - "mem_cube_ids": { - "anyOf": [ - { - "items": { - "type": "string" - }, - "type": "array" - }, - { - "type": "null" - } - ], - "title": "Mem Cube Ids", - "description": "Cube IDs" - }, - "search_query": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "Search Query", - "description": "Search query" - } - }, - "type": "object", - "required": [ - "user_id", - "memory_type" - ], - "title": "GetMemoryPlaygroundRequest", - "description": "Request model for getting memories." - }, - "GetMemoryRequest": { - "properties": { - "mem_cube_id": { - "type": "string", - "title": "Mem Cube Id", - "description": "Cube ID" - }, - "user_id": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ], - "title": "User Id", - "description": "User ID" - }, - "include_preference": { - "type": "boolean", - "title": "Include Preference", - "description": "Whether to handle preference memory", - "default": true - }, - "page": { - "anyOf": [ - { - "type": "integer" - }, - { - "type": "null" - } - ], - "title": "Page", - "description": "Page number (starts from 1). If None, exports all data without pagination." - }, - "page_size": { - "anyOf": [ - { - "type": "integer" - }, - { - "type": "null" - } - ], - "title": "Page Size", - "description": "Number of items per page. If None, exports all data without pagination." - } - }, - "type": "object", - "required": [ - "mem_cube_id" - ], - "title": "GetMemoryRequest", - "description": "Request model for getting memories." 
- }, - "GetMemoryResponse": { - "properties": { - "code": { - "type": "integer", - "title": "Code", - "description": "Response status code", - "default": 200 - }, - "message": { - "type": "string", - "title": "Message", - "description": "Response message" - }, - "data": { - "anyOf": [ - { - "additionalProperties": true, - "type": "object" - }, - { - "type": "null" - } - ], - "title": "Data", - "description": "Response data" - } - }, - "type": "object", - "required": [ - "message" - ], - "title": "GetMemoryResponse", - "description": "Response model for getting memories." - }, - "GetUserNamesByMemoryIdsRequest": { - "properties": { - "memory_ids": { - "items": { - "type": "string" - }, - "type": "array", - "title": "Memory Ids", - "description": "Memory IDs" - } - }, - "type": "object", - "required": [ - "memory_ids" - ], - "title": "GetUserNamesByMemoryIdsRequest", - "description": "Request model for getting user names by memory ids." - }, - "GetUserNamesByMemoryIdsResponse": { - "properties": { - "code": { - "type": "integer", - "title": "Code", - "description": "Response status code", - "default": 200 - }, - "message": { - "type": "string", - "title": "Message", - "description": "Response message" - }, - "data": { - "anyOf": [ - { - "additionalProperties": { - "anyOf": [ - { - "type": "string" - }, - { - "type": "null" - } - ] - }, - "type": "object" - }, - { - "type": "null" - } - ], - "title": "Data", - "description": "Response data" - } - }, - "type": "object", - "required": [ - "message" - ], - "title": "GetUserNamesByMemoryIdsResponse", - "description": "Response model for getting user names by memory ids." - }, - "HTTPValidationError": { - "properties": { - "detail": { - "items": { - "$ref": "#/components/schemas/ValidationError" - }, - "type": "array", - "title": "Detail" - } - }, - "type": "object", - "title": "HTTPValidationError" - }, - "ImageURL": { - "properties": { - "url": { - "type": "string", - "title": "Url" - }, - "detail": { - "type": "string", - "enum": [ - "auto", - "low", - "high" - ], - "title": "Detail" - } - }, - "type": "object", - "required": [ - "url" - ], - "title": "ImageURL" - }, - "InputAudio": { - "properties": { - "data": { - "type": "string", - "title": "Data" - }, - "format": { - "type": "string", - "enum": [ - "wav", - "mp3" - ], - "title": "Format" - } - }, - "type": "object", - "required": [ - "data", - "format" - ], - "title": "InputAudio" - }, - "MemoryResponse": { - "properties": { - "code": { - "type": "integer", - "title": "Code", - "description": "Response status code", - "default": 200 - }, - "message": { - "type": "string", - "title": "Message", - "description": "Response message" - }, - "data": { - "anyOf": [ - { - "items": {}, - "type": "array" - }, - { - "type": "null" - } - ], - "title": "Data", - "description": "Response data" - } - }, - "type": "object", - "required": [ - "message" - ], - "title": "MemoryResponse", - "description": "Response model for memory operations." - }, - "PermissionDict": { - "properties": { - "permissions": { - "items": { - "type": "string", - "enum": [ - "read", - "write", - "delete", - "execute" - ] - }, - "type": "array", - "title": "Permissions" - }, - "mem_cube_id": { - "type": "string", - "title": "Mem Cube Id" - } - }, - "type": "object", - "title": "PermissionDict", - "description": "Typed dictionary for chat message dictionaries." - }, - "SearchMode": { - "type": "string", - "enum": [ - "fast", - "fine", - "mixture" - ], - "title": "SearchMode", - "description": "Enumeration for search modes." 
-      },
-      "SearchResponse": {
-        "properties": {
-          "code": {
-            "type": "integer",
-            "title": "Code",
-            "description": "Response status code",
-            "default": 200
-          },
-          "message": {
-            "type": "string",
-            "title": "Message",
-            "description": "Response message"
-          },
-          "data": {
-            "anyOf": [
-              {
-                "additionalProperties": true,
-                "type": "object"
-              },
-              {
-                "type": "null"
-              }
-            ],
-            "title": "Data",
-            "description": "Response data"
-          }
-        },
-        "type": "object",
-        "required": [
-          "message"
-        ],
-        "title": "SearchResponse",
-        "description": "Response model for search operations."
-      },
-      "StatusResponse": {
-        "properties": {
-          "code": {
-            "type": "integer",
-            "title": "Code",
-            "description": "Response status code",
-            "default": 200
-          },
-          "message": {
-            "type": "string",
-            "title": "Message",
-            "default": "Memory get status successfully"
-          },
-          "data": {
-            "anyOf": [
-              {
-                "items": {
-                  "$ref": "#/components/schemas/StatusResponseItem"
-                },
-                "type": "array"
-              },
-              {
-                "type": "null"
-              }
-            ],
-            "title": "Data",
-            "description": "Response data"
-          }
-        },
-        "type": "object",
-        "title": "StatusResponse",
-        "description": "Response model for scheduler status operations."
-      },
-      "StatusResponseItem": {
-        "properties": {
-          "task_id": {
-            "type": "string",
-            "title": "Task Id",
-            "description": "The ID of the task"
-          },
-          "status": {
-            "type": "string",
-            "enum": [
-              "in_progress",
-              "completed",
-              "waiting",
-              "failed",
-              "cancelled"
-            ],
-            "title": "Status",
-            "description": "The current status of the task"
-          }
-        },
-        "type": "object",
-        "required": [
-          "task_id",
-          "status"
-        ],
-        "title": "StatusResponseItem",
-        "description": "Individual task status item."
-      },
-      "SuggestionRequest": {
-        "properties": {
-          "user_id": {
-            "type": "string",
-            "title": "User Id",
-            "description": "User ID"
-          },
-          "mem_cube_id": {
-            "type": "string",
-            "title": "Mem Cube Id",
-            "description": "Cube ID"
-          },
-          "language": {
-            "type": "string",
-            "enum": [
-              "zh",
-              "en"
-            ],
-            "title": "Language",
-            "description": "Language for suggestions",
-            "default": "zh"
-          },
-          "message": {
-            "anyOf": [
-              {
-                "type": "string"
-              },
-              {
-                "items": {
-                  "anyOf": [
-                    {
-                      "$ref": "#/components/schemas/ChatCompletionSystemMessageParam"
-                    },
-                    {
-                      "$ref": "#/components/schemas/ChatCompletionUserMessageParam"
-                    },
-                    {
-                      "$ref": "#/components/schemas/ChatCompletionAssistantMessageParam"
-                    },
-                    {
-                      "$ref": "#/components/schemas/ChatCompletionToolMessageParam"
-                    }
-                  ]
-                },
-                "type": "array"
-              },
-              {
-                "items": {
-                  "anyOf": [
-                    {
-                      "$ref": "#/components/schemas/ChatCompletionContentPartTextParam"
-                    },
-                    {
-                      "$ref": "#/components/schemas/File"
-                    }
-                  ]
-                },
-                "type": "array"
-              },
-              {
-                "type": "null"
-              }
-            ],
-            "title": "Message",
-            "description": "List of messages to store."
-          }
-        },
-        "type": "object",
-        "required": [
-          "user_id",
-          "mem_cube_id"
-        ],
-        "title": "SuggestionRequest",
-        "description": "Request model for getting suggestion queries."
-      },
-      "SuggestionResponse": {
-        "properties": {
-          "code": {
-            "type": "integer",
-            "title": "Code",
-            "description": "Response status code",
-            "default": 200
-          },
-          "message": {
-            "type": "string",
-            "title": "Message",
-            "description": "Response message"
-          },
-          "data": {
-            "anyOf": [
-              {
-                "additionalProperties": {
-                  "items": {
-                    "type": "string"
-                  },
-                  "type": "array"
-                },
-                "type": "object"
-              },
-              {
-                "type": "null"
-              }
-            ],
-            "title": "Data",
-            "description": "Response data"
-          }
-        },
-        "type": "object",
-        "required": [
-          "message"
-        ],
-        "title": "SuggestionResponse",
-        "description": "Response model for suggestion operations."
-      },
-      "TaskQueueData": {
-        "properties": {
-          "user_id": {
-            "type": "string",
-            "title": "User Id",
-            "description": "User ID the query is scoped to"
-          },
-          "user_name": {
-            "anyOf": [
-              {
-                "type": "string"
-              },
-              {
-                "type": "null"
-              }
-            ],
-            "title": "User Name",
-            "description": "User name if available"
-          },
-          "mem_cube_id": {
-            "anyOf": [
-              {
-                "type": "string"
-              },
-              {
-                "type": "null"
-              }
-            ],
-            "title": "Mem Cube Id",
-            "description": "MemCube ID if a single cube is targeted; otherwise None"
-          },
-          "stream_keys": {
-            "items": {
-              "type": "string"
-            },
-            "type": "array",
-            "title": "Stream Keys",
-            "description": "Matched Redis stream keys for this user"
-          },
-          "users_count": {
-            "type": "integer",
-            "title": "Users Count",
-            "description": "Distinct users currently present in queue streams"
-          },
-          "pending_tasks_count": {
-            "type": "integer",
-            "title": "Pending Tasks Count",
-            "description": "Count of pending (delivered, not acked) tasks"
-          },
-          "remaining_tasks_count": {
-            "type": "integer",
-            "title": "Remaining Tasks Count",
-            "description": "Count of enqueued tasks (xlen)"
-          },
-          "pending_tasks_detail": {
-            "items": {
-              "type": "string"
-            },
-            "type": "array",
-            "title": "Pending Tasks Detail",
-            "description": "Per-stream pending counts, formatted as '{stream_key}:{count}'"
-          },
-          "remaining_tasks_detail": {
-            "items": {
-              "type": "string"
-            },
-            "type": "array",
-            "title": "Remaining Tasks Detail",
-            "description": "Per-stream remaining counts, formatted as '{stream_key}:{count}'"
-          }
-        },
-        "type": "object",
-        "required": [
-          "user_id",
-          "stream_keys",
-          "users_count",
-          "pending_tasks_count",
-          "remaining_tasks_count",
-          "pending_tasks_detail",
-          "remaining_tasks_detail"
-        ],
-        "title": "TaskQueueData",
-        "description": "Queue-level metrics for scheduler tasks."
-      },
-      "TaskQueueResponse": {
-        "properties": {
-          "code": {
-            "type": "integer",
-            "title": "Code",
-            "description": "Response status code",
-            "default": 200
-          },
-          "message": {
-            "type": "string",
-            "title": "Message",
-            "default": "Scheduler task queue status retrieved successfully"
-          },
-          "data": {
-            "anyOf": [
-              {
-                "$ref": "#/components/schemas/TaskQueueData"
-              },
-              {
-                "type": "null"
-              }
-            ],
-            "description": "Response data"
-          }
-        },
-        "type": "object",
-        "title": "TaskQueueResponse",
-        "description": "Response model for scheduler task queue status."
-      },
-      "TaskSummary": {
-        "properties": {
-          "waiting": {
-            "type": "integer",
-            "title": "Waiting",
-            "description": "Number of tasks waiting to run",
-            "default": 0
-          },
-          "in_progress": {
-            "type": "integer",
-            "title": "In Progress",
-            "description": "Number of tasks currently running",
-            "default": 0
-          },
-          "pending": {
-            "type": "integer",
-            "title": "Pending",
-            "description": "Number of tasks fetched by workers but not yet acknowledged",
-            "default": 0
-          },
-          "completed": {
-            "type": "integer",
-            "title": "Completed",
-            "description": "Number of tasks completed",
-            "default": 0
-          },
-          "failed": {
-            "type": "integer",
-            "title": "Failed",
-            "description": "Number of tasks failed",
-            "default": 0
-          },
-          "cancelled": {
-            "type": "integer",
-            "title": "Cancelled",
-            "description": "Number of tasks cancelled",
-            "default": 0
-          },
-          "total": {
-            "type": "integer",
-            "title": "Total",
-            "description": "Total number of tasks counted",
-            "default": 0
-          }
-        },
-        "type": "object",
-        "title": "TaskSummary",
-        "description": "Aggregated counts of tasks by status."
-      },
-      "ValidationError": {
-        "properties": {
-          "loc": {
-            "items": {
-              "anyOf": [
-                {
-                  "type": "string"
-                },
-                {
-                  "type": "integer"
-                }
-              ]
-            },
-            "type": "array",
-            "title": "Location"
-          },
-          "msg": {
-            "type": "string",
-            "title": "Message"
-          },
-          "type": {
-            "type": "string",
-            "title": "Error Type"
-          }
-        },
-        "type": "object",
-        "required": [
-          "loc",
-          "msg",
-          "type"
-        ],
-        "title": "ValidationError"
-      }
-    }
-  }
-}
diff --git a/docs/product-api-tests.md b/docs/product-api-tests.md
deleted file mode 100644
index cff807e0..00000000
--- a/docs/product-api-tests.md
+++ /dev/null
@@ -1,65 +0,0 @@
-## Product API smoke tests (local 0.0.0.0:8001)
-
-Source: https://github.com/MemTensor/MemOS/issues/518
-
-### Prerequisites
-- Service is running: `python -m uvicorn memos.api.server_api:app --host 0.0.0.0 --port 8001`
-- `.env` is configured for Redis, embeddings, and the vector DB (current test setup: Redis reachable, Qdrant Cloud connected).
-
-### 1) /product/add
-- Purpose: Write a memory (sync/async).
-- Example request (sync):
-
-  ```bash
-  curl -s -X POST http://127.0.0.1:8001/product/add \
-    -H 'Content-Type: application/json' \
-    -d '{
-      "user_id": "tester",
-      "mem_cube_id": "default_cube",
-      "memory_content": "Apple is a fruit rich in fiber.",
-      "async_mode": "sync"
-    }'
-  ```
-
-- Observed result: `200`, message: "Memory added successfully", returns the written `memory_id` and related info.
-
-### 2) /product/get_all
-- Purpose: List all memories for the user/type to confirm writes.
-- Example request:
-
-  ```bash
-  curl -s -X POST http://127.0.0.1:8001/product/get_all \
-    -H 'Content-Type: application/json' \
-    -d '{
-      "user_id": "tester",
-      "memory_type": "text_mem",
-      "mem_cube_ids": ["default_cube"]
-    }'
-  ```
-
-- Observed result: `200`, shows the recently written apple memories (WorkingMemory/LongTermMemory/UserMemory present, `vector_sync=success`).
-
-### 3) /product/search
-- Purpose: Vector search memories.
-- Example request:
-
-  ```bash
-  curl -s -X POST http://127.0.0.1:8001/product/search \
-    -H 'Content-Type: application/json' \
-    -d '{
-      "query": "What fruit is rich in fiber?",
-      "user_id": "tester",
-      "mem_cube_id": "default_cube",
-      "top_k": 5,
-      "pref_top_k": 3,
-      "include_preference": false
-    }'
-  ```
-
-- Observed result: previously returned 400 because payload indexes (e.g., `vector_sync`) were missing in Qdrant. Index creation is now automatic during Qdrant initialization (memory_type/status/vector_sync/user_name).
-- If results are empty or errors persist, verify indexes exist (auto-created on restart) or recreate/clean the collection.
-
-### Notes / Next steps
-- `/product/add` and `/product/get_all` are healthy.
-- `/product/search` still returns empty results even with vectors present; likely related to search filters or vector retrieval.
-- Suggested follow-ups: inspect `SearchHandler` flow, filter conditions (user_id/session/cube_name), and vector DB search calls; capture logs or compare with direct `VecDBFactory.search` calls.

From 1331eadefed5f98425d7f57be3f67ddf445ba21e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E5=B8=AD=E9=98=B3=E9=98=B3?=
Date: Tue, 12 May 2026 16:22:59 +0800
Subject: [PATCH 3/4] docs: move docs

---
 .github/CONTRIBUTING                         |  3 ---
 .../cn/open_source/modules/dream.md          |  0
 .../openai_memory_locomo_eval_guide.md       |  6 ++---
 .../en/open_source/evaluation/overview.md    | 24 +++++++++----------
 .../en/open_source/modules/api_deployment.md |  0
 .../en/open_source/modules/dream.md          |  0
 .../modules/mem_reader_examples.md           |  2 +-
 7 files changed, 16 insertions(+), 19 deletions(-)
 delete mode 100644 .github/CONTRIBUTING
 rename src/memos/dream/README_ZH.md => docs/cn/open_source/modules/dream.md (100%)
 rename {evaluation/scripts/locomo => docs/en/open_source/evaluation}/openai_memory_locomo_eval_guide.md (95%)
 rename evaluation/README.md => docs/en/open_source/evaluation/overview.md (81%)
 rename src/memos/api/README_api.md => docs/en/open_source/modules/api_deployment.md (100%)
 rename src/memos/dream/README.md => docs/en/open_source/modules/dream.md (100%)
 rename examples/mem_reader/README.md => docs/en/open_source/modules/mem_reader_examples.md (91%)

diff --git a/.github/CONTRIBUTING b/.github/CONTRIBUTING
deleted file mode 100644
index bbb218a6..00000000
--- a/.github/CONTRIBUTING
+++ /dev/null
@@ -1,3 +0,0 @@
-Please read https://memos-docs.openmem.net/contribution/overview to learn how to contribute to this repository. 🌟
-
-请阅读 https://memos-docs.openmem.net/contribution/overview 了解如何为此项目贡献代码。🌟
diff --git a/src/memos/dream/README_ZH.md b/docs/cn/open_source/modules/dream.md
similarity index 100%
rename from src/memos/dream/README_ZH.md
rename to docs/cn/open_source/modules/dream.md
diff --git a/evaluation/scripts/locomo/openai_memory_locomo_eval_guide.md b/docs/en/open_source/evaluation/openai_memory_locomo_eval_guide.md
similarity index 95%
rename from evaluation/scripts/locomo/openai_memory_locomo_eval_guide.md
rename to docs/en/open_source/evaluation/openai_memory_locomo_eval_guide.md
index dc92bd5c..9545e418 100644
--- a/evaluation/scripts/locomo/openai_memory_locomo_eval_guide.md
+++ b/docs/en/open_source/evaluation/openai_memory_locomo_eval_guide.md
@@ -102,11 +102,11 @@ The memories are currently saved per session. You need to write a simple script
 
 ### Step 2.4: Automated Evaluation
 
-Once the memories for all conversations have been extracted and saved, you can run the automated [evaluation script](../run_openai_eval.sh). This script will handle the process of generating answers, evaluating them, and calculating metrics.
+Once the memories for all conversations have been extracted and saved, you can run the automated [evaluation script](../../../../evaluation/scripts/run_openai_eval.sh). This script will handle the process of generating answers, evaluating them, and calculating metrics.
 
 ```bash
-# Edit the configuration in ./scripts/run_openai_eval.sh
-./scripts/run_openai_eval.sh
+# Edit the configuration in evaluation/scripts/run_openai_eval.sh
+evaluation/scripts/run_openai_eval.sh
 ```
 
 ## 3. Considerations
diff --git a/evaluation/README.md b/docs/en/open_source/evaluation/overview.md
similarity index 81%
rename from evaluation/README.md
rename to docs/en/open_source/evaluation/overview.md
index 8c189694..34ae1990 100644
--- a/evaluation/README.md
+++ b/docs/en/open_source/evaluation/overview.md
@@ -7,7 +7,7 @@ This repository provides tools and scripts for evaluating the `LoCoMo`, `LongMem
 1. Set the `PYTHONPATH` environment variable:
    ```bash
    export PYTHONPATH=../src
-   cd evaluation
+   cd evaluation # run from the repository root
    ```
 
 2. Install the required dependencies:
    ```bash
@@ -44,24 +44,24 @@ And give unofficial implementations for the following memory frameworks:`zep`,
 
 ## Evaluation Scripts
 
 ### LoCoMo Evaluation
-⚙️ To evaluate the **LoCoMo** dataset using one of the supported memory frameworks — run the following [script](./scripts/run_locomo_eval.sh):
+⚙️ To evaluate the **LoCoMo** dataset using one of the supported memory frameworks — run the following [script](../../../../evaluation/scripts/run_locomo_eval.sh):
 
 ```bash
 # Edit the configuration in ./scripts/run_locomo_eval.sh
 # Specify the model and memory backend you want to use (e.g., mem0, zep, etc.)
-./scripts/run_locomo_eval.sh
+evaluation/scripts/run_locomo_eval.sh
 ```
 
-✍️ For evaluating OpenAI's native memory feature with the LoCoMo dataset, please refer to the detailed guide: [OpenAI Memory on LoCoMo - Evaluation Guide](./scripts/locomo/openai_memory_locomo_eval_guide.md).
+✍️ For evaluating OpenAI's native memory feature with the LoCoMo dataset, please refer to the detailed guide: [OpenAI Memory on LoCoMo - Evaluation Guide](./openai_memory_locomo_eval_guide.md).
 
 ### LongMemEval Evaluation
 First prepare the dataset `longmemeval_s` from https://huggingface.co/datasets/xiaowu0162/longmemeval-cleaned , and save it as `data/longmemeval/longmemeval_s.json`
 
 ```bash
-# Edit the configuration in ./scripts/run_lme_eval.sh
+# Edit the configuration in evaluation/scripts/run_lme_eval.sh
 # Specify the model and memory backend you want to use (e.g., mem0, zep, etc.)
-./scripts/run_lme_eval.sh
+evaluation/scripts/run_lme_eval.sh
 ```
 
 #### Question date and `reference_time`
@@ -72,19 +72,19 @@ LongMemEval gives each question a **question date**; evaluation should use that date
 
 ### PrefEval Evaluation
 Downloading benchmark_dataset/filtered_inter_turns.json from https://github.com/amazon-science/PrefEval/blob/main/benchmark_dataset/filtered_inter_turns.json and save it as `./data/prefeval/filtered_inter_turns.json`.
-To evaluate the **Prefeval** dataset — run the following [script](./scripts/run_prefeval_eval.sh):
+To evaluate the **Prefeval** dataset — run the following [script](../../../../evaluation/scripts/run_prefeval_eval.sh):
 
 ```bash
-# Edit the configuration in ./scripts/run_prefeval_eval.sh
+# Edit the configuration in evaluation/scripts/run_prefeval_eval.sh
 # Specify the model and memory backend you want to use (e.g., mem0, zep, etc.)
-./scripts/run_prefeval_eval.sh
+evaluation/scripts/run_prefeval_eval.sh
 ```
 
 ### PersonaMem Evaluation
 get `questions_32k.csv` and `shared_contexts_32k.jsonl` from https://huggingface.co/datasets/bowen-upenn/PersonaMem and save them at `data/personamem/`
 
 ```bash
-# Edit the configuration in ./scripts/run_pm_eval.sh
+# Edit the configuration in evaluation/scripts/run_pm_eval.sh
 # Specify the model and memory backend you want to use (e.g., mem0, zep, etc.)
-# If you want to use MIRIX, edit the the configuration in ./scripts/personamem/config.yaml
-./scripts/run_pm_eval.sh
+# If you want to use MIRIX, edit the configuration in evaluation/scripts/personamem/config.yaml
+evaluation/scripts/run_pm_eval.sh
 ```
diff --git a/src/memos/api/README_api.md b/docs/en/open_source/modules/api_deployment.md
similarity index 100%
rename from src/memos/api/README_api.md
rename to docs/en/open_source/modules/api_deployment.md
diff --git a/src/memos/dream/README.md b/docs/en/open_source/modules/dream.md
similarity index 100%
rename from src/memos/dream/README.md
rename to docs/en/open_source/modules/dream.md
diff --git a/examples/mem_reader/README.md b/docs/en/open_source/modules/mem_reader_examples.md
similarity index 91%
rename from examples/mem_reader/README.md
rename to docs/en/open_source/modules/mem_reader_examples.md
index 3677d050..151adcdf 100644
--- a/examples/mem_reader/README.md
+++ b/docs/en/open_source/modules/mem_reader_examples.md
@@ -1,6 +1,6 @@
 # MemReader Examples
 
-This directory contains examples and sample code demonstrating how to use the `MemReader` module in MemOS. `MemReader` is responsible for parsing various types of input data (text, chat history, files, images) into structured memory formats.
+This page documents the examples and sample code (located in [`examples/mem_reader/`](../../../../examples/mem_reader/)) demonstrating how to use the `MemReader` module in MemOS. `MemReader` is responsible for parsing various types of input data (text, chat history, files, images) into structured memory formats.
 
 ## 📂 Directory Structure

From e877a72a4256096ce7fce37513efad1c0eefc0f2 Mon Sep 17 00:00:00 2001
From: CaralHsi
Date: Tue, 12 May 2026 16:27:43 +0800
Subject: [PATCH 4/4] docs: rename contributing.md to CONTRIBUTING.md

---
 contributing.md => CONTRIBUTING.md | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 rename contributing.md => CONTRIBUTING.md (100%)

diff --git a/contributing.md b/CONTRIBUTING.md
similarity index 100%
rename from contributing.md
rename to CONTRIBUTING.md