feat(models): add Kimi/Moonshot as overflow fallback by jonathanpeterwu · Pull Request #13 · stackmemoryai/stackmemory

jonathanpeterwu · 2026-04-29T13:46:18Z

Summary

Adds Moonshot/Kimi K2.6 as a provider across the routing stack (model-router, provider-adapter, Zod schemas)
When Claude CLI or Anthropic API hits rate limits/quota errors, tasks automatically overflow to Kimi at ~10x lower cost ($0.60/$2.50 per MTok)
Sensitive content stays on Anthropic via existing sensitive-guard — moonshot is intentionally NOT in APPROVED_PROVIDERS
Kimi added as first entry in FALLBACK_CHAIN and CHEAP_PROVIDERS for cost-optimal routing
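
The routing rule in the summary can be sketched roughly as follows. This is a minimal illustration, not the PR's code: only the names FALLBACK_CHAIN and APPROVED_PROVIDERS come from the description above, while the `Provider` and `Task` types, the `overflowCandidates` helper, and the `placeholder` provider are hypothetical.

```typescript
// Hypothetical sketch: the real contents of these constants are not shown in the PR.
type Provider = "anthropic" | "moonshot" | "placeholder";

// Assumed overflow order once the primary provider fails; moonshot comes first.
const FALLBACK_CHAIN: readonly Provider[] = ["moonshot", "placeholder"];

// Providers allowed to receive sensitive content; moonshot is deliberately absent.
const APPROVED_PROVIDERS: ReadonlySet<Provider> = new Set<Provider>(["anthropic"]);

interface Task {
  prompt: string;
  sensitive: boolean;
}

/** Returns the providers a task may overflow to, in priority order. */
function overflowCandidates(task: Task): Provider[] {
  // Sensitive tasks may only overflow to approved providers.
  return FALLBACK_CHAIN.filter(
    (p) => !task.sensitive || APPROVED_PROVIDERS.has(p),
  );
}
```

Under these assumptions a sensitive task gets an empty candidate list, so it never leaves Anthropic, while a non-sensitive task tries moonshot first.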

Test plan

46/46 tests passing (model-router + subagent-client)
New tests: Kimi token limits, moonshot cheap provider routing, fallback chain priority
New tests: quota error pattern detection, Kimi overflow behavior, graceful failure without MOONSHOT_API_KEY
Build passes (esbuild + tsc --noEmit clean)
Manual: set MOONSHOT_API_KEY and verify overflow triggers on Claude 429
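
The quota error detection and the graceful failure without MOONSHOT_API_KEY listed in the test plan might look something like the sketch below. The pattern list, `isQuotaError`, and `decideOverflow` are assumptions made for illustration; the PR does not show the actual patterns or function names.

```typescript
// Assumed patterns; the actual list used by the PR is not shown.
const QUOTA_ERROR_PATTERNS: readonly RegExp[] = [
  /\b429\b/,
  /rate.?limit/i,
  /quota/i,
];

/** True when an error message looks like a rate-limit or quota failure. */
function isQuotaError(message: string): boolean {
  return QUOTA_ERROR_PATTERNS.some((re) => re.test(message));
}

type OverflowDecision =
  | { overflow: true; provider: "moonshot" }
  | { overflow: false; reason: string };

/**
 * Decides whether a failed call should overflow to Kimi.
 * The environment is passed in so the function stays easy to test.
 */
function decideOverflow(
  errorMessage: string,
  env: Record<string, string | undefined>,
): OverflowDecision {
  if (!isQuotaError(errorMessage)) {
    return { overflow: false, reason: "not a quota error" };
  }
  if (!env["MOONSHOT_API_KEY"]) {
    // Fail gracefully instead of throwing when the key is missing.
    return { overflow: false, reason: "MOONSHOT_API_KEY is not set" };
  }
  return { overflow: true, provider: "moonshot" };
}
```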

…a exhausted

Adds moonshot (Kimi K2.6) as a provider throughout the routing stack. When the Claude CLI or API hits rate limits or quota errors, tasks automatically overflow to Kimi at ~10x lower cost ($0.60/$2.50 per MTok). Sensitive content stays on Anthropic via the existing guard.

Analyzes stored traces to detect repeated failure patterns (lint, test, build, timeout, rate-limit), verification gaps, retry loops, and context thrash. Generates actionable recommendations with confidence scores and persists reports to .stackmemory/build/.
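
As a rough illustration of the repeated-failure detection described above, the sketch below counts failure kinds across traces and reports the ones that recur. The `Trace` and `Recommendation` shapes, the `findRepeatedFailures` name, and the confidence formula (share of all failures) are placeholders chosen for this example; the actual analyzer and its scoring are not shown in the PR.

```typescript
type FailureKind = "lint" | "test" | "build" | "timeout" | "rate-limit";

// Hypothetical trace shape: only the failure kind matters for this sketch.
interface Trace {
  id: string;
  failure?: FailureKind;
}

interface Recommendation {
  kind: FailureKind;
  occurrences: number;
  confidence: number; // placeholder score: share of all observed failures
}

/** Reports failure kinds that occur at least `minOccurrences` times. */
function findRepeatedFailures(traces: Trace[], minOccurrences = 2): Recommendation[] {
  const counts = new Map<FailureKind, number>();
  for (const t of traces) {
    if (t.failure) counts.set(t.failure, (counts.get(t.failure) ?? 0) + 1);
  }
  const totalFailures = Array.from(counts.values()).reduce((a, b) => a + b, 0);
  return Array.from(counts.entries())
    .filter(([, n]) => n >= minOccurrences)
    .map(([kind, n]) => ({ kind, occurrences: n, confidence: n / totalFailures }))
    .sort((a, b) => b.occurrences - a.occurrences);
}
```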

- Wire ContentCache into MCP server tool dispatch (28 cacheable read-only tools)
- Add lookupByKey/putByKey for input-addressed caching (tool+args → result)
- Add cache_stats + cache_lookup MCP tools for in-session token savings visibility
- Print cache hit rate + tokens saved to stderr on MCP server exit
- Add `stackmemory pack` CLI (install/list/search/show/init/fork/publish/uninstall)
- Support local dirs, pack.yaml paths, GitHub URLs, and namespace/name shorthand
- Create 3 first-party skill packs: coding/typescript-react, coding/python-fastapi, ops/decision-recovery
- 6 new tests for key-based cache operations, all 2172 tests passing
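
Input-addressed caching (tool+args → result) could be sketched as below. The method names lookupByKey and putByKey come from the commit message, but their signatures here are assumed, and `KeyedCache`, `keyFor`, `stableStringify`, and `hitRate` are hypothetical. The key is a plain serialized string rather than a hash, to keep the example dependency-free.

```typescript
interface CacheEntry {
  result: string;
  hits: number;
}

/** Serializes JSON-like values with object keys sorted, so key order does not matter. */
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) return `[${value.map(stableStringify).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const body = Object.keys(obj)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${stableStringify(obj[k])}`)
      .join(",");
    return `{${body}}`;
  }
  return JSON.stringify(value);
}

class KeyedCache {
  private entries = new Map<string, CacheEntry>();
  private lookups = 0;

  /** Builds a cache key from the tool name and its arguments. */
  static keyFor(tool: string, args: unknown): string {
    return `${tool}:${stableStringify(args)}`;
  }

  putByKey(key: string, result: string): void {
    this.entries.set(key, { result, hits: 0 });
  }

  /** Returns the cached result, or undefined on a miss. */
  lookupByKey(key: string): string | undefined {
    this.lookups += 1;
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    entry.hits += 1;
    return entry.result;
  }

  /** Fraction of lookups that were served from the cache. */
  hitRate(): number {
    if (this.lookups === 0) return 0;
    let hits = 0;
    this.entries.forEach((e) => {
      hits += e.hits;
    });
    return hits / this.lookups;
  }
}
```

Sorting object keys before serializing means two calls with the same arguments in a different order share one cache entry.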

…tools

Three new core modules for stackmemory Q1:
- src/core/provenance/: TraceEvent spec (ASI-shaped), confidence scorer (ported from provenantai), provenance store with supersession + lineage
- src/core/skill-packs/: pack.yaml Zod schema, YAML parser, SQLite registry with FTS5 search, namespace/runtime filtering
- src/core/cache/: Content-hash token cache with SHA-256 dedup, hit counting, eviction, and savings stats

MCP server wired with 7 new tools: cache_lookup, cache_stats, pack_list, pack_search, pack_get, record_trace, score_confidence.

102 tests passing across all modules.

Self-contained SDK at packages/sdk/ with typed facade over:
- ContentCache (SHA-256 dedup, token savings tracking)
- SkillPackRegistry (install/search/list, FTS5, pack.yaml parser)
- ProvenanceStore (TraceEvent spec, lineage, supersession)
- scoreConfidence() (decision detection, weighted signals)

Usage: new StackMemory({ dataDir }) → sm.cache / sm.packs / sm.provenance

16 tests passing. Zero type errors. Ready for npm publish.

- Add ASI-shaped TraceEvent type matching kickoff spec (score, feedback, provenance, cost, tokens)
- Add TraceEventStore with SQLite persistence, filtered queries, batch recording, annotation
- Add provenance columns to frames + anchors tables (source, derivation, confidence, superseded_by)
- Populate provenance automatically on frame/anchor creation
- Add `stackmemory cache stats/clear/search` CLI for terminal-printable token savings
- 19 new tests for trace event store, all 2191 tests passing
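
To make the TraceEvent shape and store operations concrete, here is an in-memory sketch. It is not the PR's SQLite-backed TraceEventStore: the `id` and `tool` fields, the field types, the `InMemoryTraceEventStore` class, and all method signatures are assumptions. Only the field names score, feedback, provenance, cost, and tokens come from the commit message.

```typescript
// Assumed field types; the kickoff spec itself is not shown in the PR.
interface TraceEvent {
  id: string; // hypothetical identifier
  tool: string; // hypothetical: which tool call produced the event
  score?: number;
  feedback?: string;
  provenance?: string;
  cost?: number;
  tokens?: number;
}

interface TraceFilter {
  tool?: string;
  minScore?: number;
}

class InMemoryTraceEventStore {
  private events: TraceEvent[] = [];

  record(event: TraceEvent): void {
    // Store a copy so later caller-side mutation does not affect the store.
    this.events.push({ ...event });
  }

  recordBatch(events: TraceEvent[]): void {
    events.forEach((e) => this.record(e));
  }

  /** Returns events matching every filter field that is set. */
  query(filter: TraceFilter = {}): TraceEvent[] {
    return this.events.filter(
      (e) =>
        (filter.tool === undefined || e.tool === filter.tool) &&
        (filter.minScore === undefined ||
          (e.score !== undefined && e.score >= filter.minScore)),
    );
  }

  /** Attaches feedback to an existing event; returns false if the id is unknown. */
  annotate(id: string, feedback: string): boolean {
    const event = this.events.find((e) => e.id === id);
    if (!event) return false;
    event.feedback = feedback;
    return true;
  }
}
```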

- Wire TraceEventStore into MCP server with query/stats/record handlers
- Add verification commands to multimodal harness (custom pass/fail checks)
- Deterministic critique now checks verification results
- Update MCP tool definitions + docs

SDK ready for npm publish at 17.9 kB.


- Add @StackMemory Python SDK (packages/python-sdk/) with cache, packs, provenance
- Zero external deps (stdlib sqlite3), 12 tests passing
- Wire TraceEventStore into MCP server: every tool call recorded as ASI-shaped event
- Add trace_events, trace_event_stats, trace_event_annotate MCP tools
- Add stackmemory cache stats/clear/search CLI commands

StackMemory Bot (CLI) added 9 commits April 29, 2026 09:45

docs(sdk): add README, .npmignore, repo metadata for npm publish

3e7f8b6


jonathanpeterwu merged commit c68bbf6 into main May 3, 2026
3 of 6 checks passed

jonathanpeterwu merged 9 commits into main from feature/kimi-overflow-fallback
