longmemeval

Here are 22 public repositories matching this topic...

ZenSystemAI / Zengram

A Multi-Agent Memory MCP That Connects Agents Across Systems and Machines

Updated Jun 18, 2026
JavaScript

jaylfc / taosmd

Local-first AI memory — runs offline on any machine with 8 GB+ RAM (SBC, mini PC, laptop, workstation). Zero-loss verbatim archive, knowledge graph, hybrid retrieval. Framework-agnostic, no cloud.

arm sqlite knowledge-graph framework-agnostic orange-pi embedding onnx locomo rag vector-search edge-ai local-first llm rk3588 ai-memory offline-ai longmemeval

Updated Jun 23, 2026
Python

JubaKitiashvili / context-mem

Your AI forgets everything between sessions. This fixes that — 98%+ retrieval accuracy, 100% on LongMemEval, 99% token savings. 44 MCP tools. Fully local, zero cost.

retrieval memory sqlite mcp knowledge-graph summarization cursor bm25 entity-extraction ai-agents vector-search context-window ai-memory ai-coding mcp-server context-optimization claude-code longmemeval

Updated Apr 19, 2026
TypeScript

skynetcmd / m3-memory

Local-first Memory Framework for AI Agents · 99.2% LongMemEval-S retrieval @ k=10 · Supports Claude · Gemini · Antigravity · OpenCode · OpenClaw · Hermes · MCP-native and plugins · Hybrid search (FTS5 + vector + MMR) · GDPR · FIPS 140-3 ready · 100% local (fully offline) or cloud capable

Updated Jun 23, 2026
Python

ContextFit / cf

Token-native agent memory retrieval for LLMs, without embedding APIs or vector databases.

python retrieval mcp ai-agents llm agent-memory longmemeval token-native

Updated May 31, 2026
Python

Sibyl-Labs / memory-bench-kit

Benchmark results, scorer, and reproducibility kit for Sibyl Memory. LongMemEval 95.6% (#2). Verify it yourself.

benchmark memory beam reproducibility scorer llm-memory ai-memory agent-memory longmemeval file-based-memory

Updated May 25, 2026
JavaScript

Alienfader / continuity-benchmarks

Reproducible benchmarks for execution-intent memory in long-horizon AI coding agents. ID-RAG cross-corpus matrix + LongMemEval-S subset; BYO API keys.

benchmark memory evaluation ai-agents rag llm retrieval-augmented-generation longmemeval

Updated May 13, 2026
TypeScript

pgmnemo / pgmnemo

Multi-agent memory substrate for PostgreSQL — provenance-gated, vector-hybrid recall

memory postgresql provenance magma postgresql-extension ai-agents zep locomo vector-search llm pgvector memgpt bge-m3 mem0 agent-memory letta longmemeval dragon-encoder

Updated Jun 23, 2026
PLpgSQL

recallrai / sdk-python

Official Python SDK for RecallrAI – a revolutionary contextual memory system that enables AI assistants to form meaningful connections between conversations, just like human memory.

ai memory long-term-memory memora getzep mem0 mem0ai contextual-memory halucination recallrai cognee longmemeval

Updated May 31, 2026
Python

rkceve / HAMIB

Retrain-free attention patch that makes Llama 3.3 70B ~1.3× more accurate on long-conversation memory

attention llama large-language-models llm long-context llama3 longmemeval

Updated Jun 6, 2026
Python

AgentBrainHQ / agentbrain-benchmarks

Public, reproducible benchmarks for Agent Brain on LongMemEval-M. 71.7% accuracy (Test 0). Companion code to https://doi.org/10.5281/zenodo.19673132 (Concept DOI → latest version, currently v3).

benchmark ai memory knowledge-graph rag llm fsrs agent-memory longmemeval dream-cycle

Updated Apr 21, 2026
Python

hifriendbot / cogmemai-mcp-agent-sdk-quickstart

Smallest possible working example of CogmemAi (95.1% LongMemEval) wired into the Claude Agent SDK. Two-session demo: save in session 1, recall in session 2.

memory mcp persistent-memory ai-agent claude-code agent-sdk longmemeval cogmemai

Updated May 14, 2026
HTML

qishengdong / touchstone-longmemeval

100-question 6-dimension long-conversation memory benchmark for Chinese-healthcare AI. Sivon reference: 92/100 mean (2026-05-27).

benchmark retrieval memory chinese-nlp healthcare-ai long-context llm-evaluation glp-1 longmemeval perimenopause sivon

Updated May 27, 2026
JavaScript

marklubin / lens-benchmark

LENS - AI Memory Benchmark - Memory as Experience, Not Facts

agent benchmark memory knowledge-graph rag vector-search context-engineering longmemeval

Updated Apr 4, 2026
HTML

peterjohannmedina / heurchain-benchmarks

Benchmark harness for HeurChain on LongMemEval-S — reproduce the R@10, MRR, NDCG, and latency numbers from heurchain.com

benchmark ai memory ai-agents rag vector-search llm longmemeval heurchain

Updated May 22, 2026
Python

megawer93 / Project-Shadows

Multi-agent strategic intelligence system with hybrid memory retrieval. Research project.

multi-agent agents multi-agent-systems llm llm-agents agent-memory longmemeval strategic-intelligence

Updated Apr 19, 2026
Python

shhahhussain / mnemo-benchmarks

LongMemEval-S benchmark results for Mnemo - 80.8% strict (404/500).

benchmark ai memory ai-agents rag llm longmemeval

Updated Jun 18, 2026

Evanyuan-builder / memory-core-eval

Reproducible evaluation harness for agent memory systems (LongMemEval and beyond).

benchmark retrieval memory evaluation agents rag llm longmemeval

Updated May 30, 2026
Python

zbl1998-sdjn / MASE-agent-memory

Anti-RAG dual-whitebox memory for LLM agents. 2.72 MB SQLite + Markdown kernel, no vector DB, no embeddings. Lifts qwen2.5:7b from 1.79% to 60.71% on NoLiMa-32k (+58.9pp), 88.71% on LV-Eval EN 256k, 84.8% on LongMemEval-S. Restart-safe, concurrency-bullet-proof, 100% transparent.

Updated Jun 14, 2026
Python

hermes-labs-ai / fidelis

fidelis is zero-LLM agent memory for Claude Code and AI agents: a local-first memory layer whose default retrieval path uses BM25, dense vectors, and reciprocal rank fusion with no LLM call. It returns your original passages verbatim instead of paraphrasing and runs fully local. Benchmarked on LongMemEval-S. MIT, by Hermes Labs.

retrieval mcp bm25 fidelity ai-agents rag local-first llm llm-memory agent-memory claude-code longmemeval ai-reliability zero-llm hermes-labs

Updated Jun 22, 2026
Python
