feat(models): add Kimi/Moonshot as overflow fallback #13

Merged
jonathanpeterwu merged 9 commits into main from
feature/kimi-overflow-fallback
May 3, 2026

Conversation

@jonathanpeterwu
Collaborator

Summary

  • Adds Moonshot/Kimi K2.6 as a provider across the routing stack (model-router, provider-adapter, Zod schemas)
  • When Claude CLI or Anthropic API hits rate limits/quota errors, tasks automatically overflow to Kimi at ~10x lower cost ($0.60/$2.50 per MTok)
  • Sensitive content stays on Anthropic via existing sensitive-guard — moonshot is intentionally NOT in APPROVED_PROVIDERS
  • Kimi added as first entry in FALLBACK_CHAIN and CHEAP_PROVIDERS for cost-optimal routing
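
The routing rules in the summary can be illustrated with a minimal sketch. Only `moonshot`, `FALLBACK_CHAIN`, and `APPROVED_PROVIDERS` are names taken from this PR; the `Task` shape, the `pickOverflowProvider` helper, the `other-provider` entry, and the contents of both constants are assumptions made for illustration, not the actual model-router code.

```typescript
// Hypothetical sketch: how an overflow provider might be chosen.
// "other-provider" stands in for whatever else the real chain contains.
type Provider = "anthropic" | "moonshot" | "other-provider";

// Assumed contents: the PR only states that moonshot is the first entry.
const FALLBACK_CHAIN: readonly Provider[] = ["moonshot", "other-provider"];

// Assumed contents: the PR only states that moonshot is NOT in this set.
const APPROVED_PROVIDERS: ReadonlySet<Provider> = new Set<Provider>(["anthropic"]);

interface Task {
  prompt: string;
  sensitive: boolean; // assumed to be set upstream by the sensitive guard
}

// Returns the first usable fallback provider, or null when none qualifies.
function pickOverflowProvider(
  task: Task,
  configured: ReadonlySet<Provider>, // providers that have an API key set
): Provider | null {
  for (const provider of FALLBACK_CHAIN) {
    // Sensitive tasks may only go to approved providers.
    if (task.sensitive && !APPROVED_PROVIDERS.has(provider)) continue;
    // Skip providers without credentials (e.g. no MOONSHOT_API_KEY).
    if (!configured.has(provider)) continue;
    return provider;
  }
  return null;
}
```

Under these assumptions a sensitive task never overflows to Kimi, because every entry in the fallback chain is filtered out by the approval check and the caller gets `null` back.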

Test plan

  • 46/46 tests passing (model-router + subagent-client)
  • New tests: Kimi token limits, moonshot cheap provider routing, fallback chain priority
  • New tests: quota error pattern detection, Kimi overflow behavior, graceful failure without MOONSHOT_API_KEY
  • Build passes (esbuild + tsc --noEmit clean)
  • Manual: set MOONSHOT_API_KEY and verify overflow triggers on Claude 429
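
For reviewers who want a feel for what the quota-pattern and overflow tests exercise, here is a rough sketch. The pattern list, `isQuotaError`, and `runWithOverflow` are hypothetical names and behavior chosen for illustration; the real detection logic in subagent-client may differ.

```typescript
// Hypothetical sketch of quota-error detection and overflow.
// The patterns below are assumptions, not the PR's actual list.
const QUOTA_ERROR_PATTERNS: readonly RegExp[] = [
  /\b429\b/,       // HTTP status code embedded in an error message
  /rate.?limit/i,  // "rate limit", "rate-limit", "rate_limit", "ratelimit"
  /quota/i,        // "quota exceeded", "quota exhausted"
];

function isQuotaError(err: unknown): boolean {
  const message = err instanceof Error ? err.message : String(err);
  return QUOTA_ERROR_PATTERNS.some((pattern) => pattern.test(message));
}

// Runs the primary call; on a quota error, retries once on the overflow
// provider. When no overflow is available (for example, MOONSHOT_API_KEY
// is unset), the original error is rethrown unchanged.
async function runWithOverflow<T>(
  primary: () => Promise<T>,
  overflow: (() => Promise<T>) | null,
): Promise<T> {
  try {
    return await primary();
  } catch (err) {
    if (overflow === null || !isQuotaError(err)) throw err;
    return overflow();
  }
}
```

Errors that are not quota-related propagate as before, so in this sketch the overflow path cannot mask unrelated failures.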

StackMemory Bot (CLI) added 9 commits April 29, 2026 09:45
…a exhausted

Adds moonshot (Kimi K2.6) as a provider throughout the routing stack.
When Claude CLI or API hits rate limits/quota, tasks automatically
overflow to Kimi at ~10x lower cost ($0.60/$2.50 per MTok).
Sensitive content stays on Anthropic via existing guard.

Analyzes stored traces to detect repeated failure patterns (lint, test,
build, timeout, rate-limit), verification gaps, retry loops, and
context thrash. Generates actionable recommendations with confidence
scores and persists reports to .stackmemory/build/.

- Wire ContentCache into MCP server tool dispatch (28 cacheable read-only tools)
- Add lookupByKey/putByKey for input-addressed caching (tool+args → result)
- Add cache_stats + cache_lookup MCP tools for in-session token savings visibility
- Print cache hit rate + tokens saved to stderr on MCP server exit
- Add `stackmemory pack` CLI (install/list/search/show/init/fork/publish/uninstall)
- Support local dirs, pack.yaml paths, GitHub URLs, and namespace/name shorthand
- Create 3 first-party skill packs: coding/typescript-react, coding/python-fastapi, ops/decision-recovery
- 6 new tests for key-based cache operations, all 2172 tests passing
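
Input-addressed caching (tool+args → result) can be sketched as follows. The method names `lookupByKey` and `putByKey` come from the commit message, but their signatures, the `ToolResultCache` class, the `cacheKey` and `stableStringify` helpers, and the key format are assumptions for illustration and do not reflect the actual ContentCache implementation.

```typescript
import { createHash } from "node:crypto";

// Serializes a value with object keys sorted, so that argument order
// does not change the resulting cache key.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const entries = Object.keys(obj)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${stableStringify(obj[key])}`);
    return `{${entries.join(",")}}`;
  }
  // JSON.stringify returns undefined for undefined or functions at runtime.
  return JSON.stringify(value) ?? "null";
}

// Hypothetical key format: SHA-256 over the tool name plus canonical args.
function cacheKey(tool: string, args: unknown): string {
  return createHash("sha256")
    .update(`${tool}\n${stableStringify(args)}`)
    .digest("hex");
}

// In-memory stand-in for a key-addressed cache with hit-rate tracking.
class ToolResultCache {
  private readonly entries = new Map<string, string>();
  private hits = 0;
  private misses = 0;

  lookupByKey(key: string): string | undefined {
    const result = this.entries.get(key);
    if (result === undefined) this.misses += 1;
    else this.hits += 1;
    return result;
  }

  putByKey(key: string, result: string): void {
    this.entries.set(key, result);
  }

  hitRate(): number {
    const total = this.hits + this.misses;
    return total === 0 ? 0 : this.hits / total;
  }
}
```

Sorting keys before hashing is a design choice in this sketch: two calls to the same read-only tool with identical arguments in a different property order resolve to the same entry.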
…tools

Three new core modules for stackmemory Q1:

- src/core/provenance/ — TraceEvent spec (ASI-shaped), confidence scorer
  (ported from provenantai), provenance store with supersession + lineage
- src/core/skill-packs/ — pack.yaml Zod schema, YAML parser, SQLite
  registry with FTS5 search, namespace/runtime filtering
- src/core/cache/ — Content-hash token cache with SHA-256 dedup, hit
  counting, eviction, and savings stats

MCP server wired with 7 new tools: cache_lookup, cache_stats, pack_list,
pack_search, pack_get, record_trace, score_confidence.

102 tests passing across all modules.

Self-contained SDK at packages/sdk/ with typed facade over:
- ContentCache (SHA-256 dedup, token savings tracking)
- SkillPackRegistry (install/search/list, FTS5, pack.yaml parser)
- ProvenanceStore (TraceEvent spec, lineage, supersession)
- scoreConfidence() (decision detection, weighted signals)

Usage: new StackMemory({ dataDir }) → sm.cache / sm.packs / sm.provenance

16 tests passing. Zero type errors. Ready for npm publish.

- Add ASI-shaped TraceEvent type matching kickoff spec (score, feedback, provenance, cost, tokens)
- Add TraceEventStore with SQLite persistence, filtered queries, batch recording, annotation
- Add provenance columns to frames + anchors tables (source, derivation, confidence, superseded_by)
- Populate provenance automatically on frame/anchor creation
- Add `stackmemory cache stats/clear/search` CLI for terminal-printable token savings
- 19 new tests for trace event store, all 2191 tests passing
- Wire TraceEventStore into MCP server with query/stats/record handlers
- Add verification commands to multimodal harness (custom pass/fail checks)
- Deterministic critique now checks verification results
- Update MCP tool definitions + docs
- Add @StackMemory Python SDK (packages/python-sdk/) with cache, packs, provenance
- Zero external deps (stdlib sqlite3), 12 tests passing
- Wire TraceEventStore into MCP server — every tool call recorded as ASI-shaped event
- Add trace_events, trace_event_stats, trace_event_annotate MCP tools
- Add stackmemory cache stats/clear/search CLI commands
@jonathanpeterwu jonathanpeterwu merged commit c68bbf6 into main May 3, 2026
3 of 6 checks passed