v0.35.0
🎉 Overview
v0.35.0 adds support for OpenAI-compatible APIs as model providers, enabling the use of OpenAI, Ollama, vLLM, LM Studio, Together AI, RunPod, and any other service that exposes an OpenAI-compatible chat completions endpoint. Truncation strategies now preserve the first user message across summarization to retain the original task instructions, and the truncation headroom has been doubled to reduce the chance of hitting context limits immediately after truncation.
✨ New Features
OpenAIVlmProvider— VLM provider for any OpenAI-compatible API (OpenAI, vLLM, LM Studio, Together AI, etc.) by @philipph-askui in #268OpenAIImageQAProvider— image Q&A provider for any OpenAI-compatible API by @philipph-askui in #268OllamaVlmProvider— convenience wrapper for local Ollama instances with sensible defaults (base_url=http://localhost:11434/v1,model_id=qwen3.5) by @philipph-askui in #268OllamaImageQAProvider— image Q&A via local Ollama instances by @philipph-askui in #268OpenAICompatibleVlmProvider— VLM provider for endpoints that require an exact URL (e.g., RunPod, custom proxies) where the OpenAI SDK's automatic path appending would break the request by @philipph-askui in #268OpenAIMessagesApi— full translation layer between the internalMessageParamformat and OpenAI's chat completions API, handling tool calls, image content, thinking blocks, and role alternation by @philipph-askui in #268OpenAIGetModel—GetModelimplementation for OpenAI-compatible APIs with structured output support by @philipph-askui in #268- Built-in pricing data for
gpt-5.4,gpt-5.4-mini, andgpt-5.4-nanomodels by @philipph-askui in #268
🔧 Improvements
- Truncation strategies now preserve the first user message across summarization, ensuring the original task instructions are never lost when the conversation is truncated by @philipph-askui in #280
MAX_INPUT_TOKENSincreased from 100k to 200k andTRUNCATION_THRESHOLDlowered from 0.7 to 0.56, roughly doubling the headroom after truncation to reduce the chance of re-triggering truncation immediately by @philipph-askui in #280process_idparameter inlist_process_windowstool is now auto-converted toint, preventing tool errors when the agent passes it as a string by @philipph-askui in #279
🐛 Bug Fixes
AgentSpeakernow handles the case where the model returnsstop_reason='tool_use'but no actual tool call blocks in the content, preventing stopped executions by prompting the model to retry with a valid tool call by @philipph-askui in #278
Full Changelog: v0.34.0...v0.35.0