Add minicpm5 tool call parser by zhangtao2-1 · Pull Request #23802 · ggml-org/llama.cpp

zhangtao2-1 · 2026-05-28T07:34:57Z

Overview

Add MiniCPM5 tool call support for llama-server.

MiniCPM5 outputs tool calls as XML (<function name="..."><param name="...">...</param></function>), not JSON. This PR adds a peg-minicpm5 parser to detect the template, parse tool calls (including streaming), normalize common output quirks, and expose OpenAI-compatible tool_calls.
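
For readers unfamiliar with the format, here is a minimal sketch of pulling a function name and its parameters out of that XML shape. It is not the PR's PEG parser: it uses `std::regex`, ignores streaming and malformed output, and the names `tool_call_sketch` and `parse_minicpm5_xml_sketch` are hypothetical.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical container for one parsed call (not a llama.cpp type).
struct tool_call_sketch {
    std::string name;
    std::vector<std::pair<std::string, std::string>> params;
};

// Extracts every complete <function name="...">...</function> block and the
// <param name="...">...</param> pairs inside it. Incomplete blocks are skipped.
std::vector<tool_call_sketch> parse_minicpm5_xml_sketch(const std::string & text) {
    static const std::regex fn_re(R"rx(<function name="([^"]+)">([\s\S]*?)</function>)rx");
    static const std::regex param_re(R"rx(<param name="([^"]+)">([\s\S]*?)</param>)rx");

    std::vector<tool_call_sketch> calls;
    const std::sregex_iterator end;
    for (std::sregex_iterator it(text.begin(), text.end(), fn_re); it != end; ++it) {
        tool_call_sketch call;
        call.name = (*it)[1].str();
        const std::string body = (*it)[2].str();
        for (std::sregex_iterator pit(body.begin(), body.end(), param_re); pit != end; ++pit) {
            call.params.emplace_back((*pit)[1].str(), (*pit)[2].str());
        }
        calls.push_back(std::move(call));
    }
    return calls;
}
```

A regex pass like this only works on complete output, which is one reason a grammar-based parser is the better fit for streaming.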

Also fixes a peg mapper streaming bug (use tool index instead of dangling pointer) and adds Jinja min/max filters needed by the MiniCPM5 template.

Test plan

test-chat-peg-parser-minicpm5
test-jinja (min / max)
Manual test with MiniCPM5 GGUF + tools on llama-server

zhangtao2-1 · 2026-05-28T07:37:09Z

@CISC would you mind taking a look at this PR? MiniCPM5 tool call parsing — would appreciate your review.

CISC · 2026-05-28T07:49:03Z

> @CISC would you mind taking a look at this PR? MiniCPM5 tool call parsing — would appreciate your review.

I'll review the jinja part, leaving the parser to @ggml-org/llama-common

aldehir

Thanks for the PR. The code present deviates too much from what is already established. I don't really see anything that would necessitate the need for these deviations.

I'll let @pwilkin weigh in if he thinks a dedicated parser is necessary or if this can be handled by the autoparser with some tweaks.

aldehir · 2026-05-28T08:10:14Z

 std::string & common_chat_peg_mapper::args_target() {
-    return (current_tool && !current_tool->name.empty()) ? current_tool->arguments : args_buffer;
+    common_chat_tool_call * tool = active_tool();
+    return (tool && !tool->name.empty()) ? tool->arguments : args_buffer;
+}
+
+common_chat_tool_call * common_chat_peg_mapper::active_tool() {
+    if (committed_tool_idx.has_value()) {
+        return &result.tool_calls.at(committed_tool_idx.value());
+    }
+    if (pending_tool_call.has_value()) {
+        return &pending_tool_call.value();
+    }
+    return nullptr;


If it's not broke, don't fix it.

aldehir · 2026-05-28T08:12:10Z

    }
 }
+
+common_peg_parser common_chat_peg_builder::minicpm5_xml_tool_calls(const ordered_json & tools,


Inline in the chat param init for the model template.

aldehir · 2026-05-28T08:13:26Z

+            auto arg_choice = choice();
+            for (const auto & el : params.at("properties").items()) {
+                const std::string & prop_name = el.key();
+                const std::string & prop_type = el.value().value("type", "string");


Please use the established patterns in the repo for determining if a type is a string. See common_schema_info.

aldehir · 2026-05-28T08:16:34Z

+    auto replace_all = [](std::string & s, const std::string & from, const std::string & to) {
+        if (from.empty()) {
+            return;
+        }
+        for (size_t pos = 0; (pos = s.find(from, pos)) != std::string::npos;) {
+            s.replace(pos, from.size(), to);
+            pos += to.size();
+        }
+    };
+
+    replace_all(out, "<functionname=", "<function name=");
+    replace_all(out, "<paramname=", "<param name=");
+
+    // Some GGUF outputs drop opening tag names but keep attributes, e.g.
+    // ` name="python"> name="code">...` instead of `<function name="python">...`.
+    static const std::regex LEADING_FUNC_ATTR(R"((?:^|[\n\r])\s*name=\"([^\"]+)\">)");
+    out = std::regex_replace(out, LEADING_FUNC_ATTR, "\n<function name=\"$1\">");
+    static const std::regex PARAM_ATTR(R"(>\s*name=\"([^\"]+)\">)");
+    out = std::regex_replace(out, PARAM_ATTR, "><param name=\"$1\">");
+
+    static const std::string IM_END = "<|im_end|>";
+    if (out.size() >= IM_END.size() &&
+        out.compare(out.size() - IM_END.size(), IM_END.size(), IM_END) == 0) {
+        out.erase(out.size() - IM_END.size());
+        while (!out.empty() && (out.back() == '\n' || out.back() == ' ')) {
+            out.pop_back();
+        }
+    }
+
+    return out;


All of this should be defined in the parser. None of this looks like it needs to be done post process.

aldehir · 2026-05-28T08:17:19Z

+    static const std::string SP_SPACE = "\xC4\xA0"; // U+0120
+    static const std::string SP_NL    = "\xC4\x8A"; // U+010A
+
+    for (size_t pos = 0; (pos = out.find(SP_SPACE, pos)) != std::string::npos;) {
+        out.replace(pos, SP_SPACE.size(), " ");
+        pos += 1;
+    }
+    for (size_t pos = 0; (pos = out.find(SP_NL, pos)) != std::string::npos;) {
+        out.replace(pos, SP_NL.size(), "\n");
+        pos += 1;
+    }


This seems easy enough to do in a single pass.
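
A minimal sketch of what a single pass could look like, assuming the two markers are U+0120 (UTF-8 bytes C4 A0) and U+010A (UTF-8 bytes C4 8A). The helper name `replace_bpe_markers_sketch` is hypothetical and not part of the PR.

```cpp
#include <cstddef>
#include <string>

// Scans the input once. Both markers share the lead byte 0xC4, so one check on
// the lead byte plus a look at the next byte decides what to emit.
std::string replace_bpe_markers_sketch(const std::string & in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(in[i]);
        if (c == 0xC4 && i + 1 < in.size()) {
            const unsigned char next = static_cast<unsigned char>(in[i + 1]);
            if (next == 0xA0) { out += ' ';  ++i; continue; } // U+0120 -> space
            if (next == 0x8A) { out += '\n'; ++i; continue; } // U+010A -> newline
        }
        out += in[i]; // any other byte is copied unchanged
    }
    return out;
}
```

Building a new string avoids the repeated shifting that in-place `replace` calls cause, and other characters with the same lead byte (for example U+0104) pass through untouched.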

aldehir · 2026-05-28T08:22:47Z

+            reasoning = p.reasoning(p.until_one_of(TOOL_START_MARKERS)) + p.optional(p.literal("\n")) +
+                        p.optional(p.literal(THINK_END) + p.optional(p.literal("\n\n")));


Too many optionals. This can be a choice with two branches: one that completes a thought with the end thinking tag and one that preempts thinking with the start of a tool call. Use p.space() to consume whitespace.

aldehir · 2026-05-28T08:23:06Z

+                        p.optional(p.literal(THINK_END) + p.optional(p.literal("\n\n")));
+        }
+
+        auto suffix = p.optional(p.literal("<|im_end|>") + p.optional(p.literal("\n")));


Should not be needed.

aldehir · 2026-05-28T08:25:27Z

+    // MiniCPM5 tool calls are parsed post-hoc via peg-minicpm5 (XML output).
+    // Do not attach JSON-schema GBNF here — it is invalid for this format and
+    // can destabilize llama-server. SGLang/vLLM use parsers only for MiniCPM5.
+    data.grammar.clear();
+    data.grammar_lazy     = false;
+    data.grammar_triggers = {};
+    (void) include_grammar;


Even if you don't use the JSON schema to grammar implementation, the PEG parser will compose GBNF rules to enforce tool call structure. Please don't deviate from established patterns.

aldehir · 2026-05-28T08:26:22Z

+    const std::string normalized_input = params.format == COMMON_CHAT_FORMAT_PEG_MINICPM5 ?
+        common_chat_normalize_minicpm5_output(input) :
+        input;
+
    const std::string effective_input = params.generation_prompt.empty()
-        ? input
-        : params.generation_prompt + input;
+        ? normalized_input
+        : params.generation_prompt + normalized_input;


Normalization should occur in a dedicated mapper. Not here.

aldehir · 2026-05-28T08:27:24Z

 endif()

 llama_build_and_test(test-chat-peg-parser.cpp peg-parser/simple-tokenize.cpp)
+llama_build_and_test(test-chat-peg-parser-minicpm5.cpp)


To reiterate, do not deviate from established patterns. Chat template tests are located under test-chat.cpp.

zhangtao2-1 · 2026-05-28T08:41:50Z

@aldehir Thanks. Understood on aligning with established patterns — I'll refactor accordingly. Also happy to wait for @pwilkin's input on dedicated parser vs. autoparser before going too far in either direction.

pwilkin · 2026-05-28T09:27:25Z

Can you add the chat template here? Then we'll be able to just test it with the autoparser and see how much works out of the box and what needs to be possibly fixed.

zhangtao2-1 · 2026-05-28T10:16:35Z

@pwilkin @CISC @aldehir
Pushed 5187d06 with review fixes: align with established patterns (inline PEG, common_schema_info, dedicated minicpm5_mapper, GBNF restored, no generic mapper changes), jinja min/max API + tests, MiniCPM5 template fixture for autoparser experiments, tests in test-chat.cpp. test-chat / test-jinja / test-chat-peg-parser pass. PTAL — especially on dedicated parser vs autoparser.

zhangtao2-1 · 2026-05-31T07:30:28Z

Hi @pwilkin @aldehir
When you have a moment, could you please review my PR? Thanks!

lexasub · 2026-06-01T19:21:23Z

@pwilkin review pls

lexasub · 2026-06-02T19:57:14Z

@zhangtao2-1, hello. I fetched your branch, rebuilt it, and tried testing tool calling (mistral vibe, opencode) with your template: ./bin/llama-server --chat-template-file ../models/templates/MiniCPM5-1B.jinja --host 0.0.0.0 --jinja -fa on --port 1113 --ctx-size 200000 --model ./MiniCPM5-1B-Q4_K_M.gguf (qwopus is the model's name in the mistral vibe config)

zhangtao2-1 · 2026-06-03T06:16:02Z

Hi @lexasub
Thanks for the repro — the two screenshots point to two different issues:

1. HTTP 500 (Failed to parse tool call arguments as JSON)
   On multi-turn requests, malformed/truncated tool_calls.arguments in history (e.g. {"url") caused func_args_not_string() to throw → 500. Fixed on this branch: parse failures now log a warning and fall back to {} instead of aborting. Added a multi-turn regression test in test-chat.cpp.
2. <command_utilization> + H5H5... repetition
   Not a MiniCPM5 XML parser issue: the model emitted mistral vibe/opencode's format, not <param ...>. The H5 loop is small-model repetition (1B Q4 under agent load).
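
The fallback described for the HTTP 500 case can be sketched roughly as follows. This is an illustration under assumptions, not the branch's code: `arguments_or_empty_object_sketch` is a hypothetical name, and the JSON parser is passed in as a callback that throws on invalid input so the example stays self-contained.

```cpp
#include <functional>
#include <iostream>
#include <stdexcept>
#include <string>

// Returns the raw arguments string when the supplied parser accepts it,
// otherwise logs a warning and returns an empty JSON object instead of
// letting the exception propagate (which would surface as an HTTP 500).
std::string arguments_or_empty_object_sketch(
        const std::string & raw,
        const std::function<void(const std::string &)> & parse_json) {
    try {
        parse_json(raw); // expected to throw std::exception on malformed input
        return raw;
    } catch (const std::exception & e) {
        std::cerr << "warning: invalid tool call arguments, using {}: " << e.what() << "\n";
        return "{}";
    }
}
```

With a strict parser plugged in, a truncated history entry such as `{"url"` would come back as `{}` while well-formed arguments pass through unchanged.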

For agent testing: use --reasoning-budget 256, "enable_thinking": false for tool tasks, max_tokens ≥ 1024, and prefer F16 if possible.

The 500 on follow-up turns should be fixed on the latest branch — would appreciate a retest. Thanks!

pwilkin · 2026-06-03T09:47:18Z

Sorry for the delay, I had a busy time recently and was AFK all weekend, could only access comments.

From what I determined, the specifics of the MiniCPM5 template format are that it doesn't have a tool call marker per se - its tool calls start directly with the function marker, but it has a call separator marker. I think the cleanest solution will be to add support for this type of behavior in the autoparser, if that fails, we can fall back to the dedicated parser solution.

zhangtao2-1 · 2026-06-03T13:53:16Z

Thanks @pwilkin. I've refactored to the autoparser-first approach: removed the dedicated MiniCPM5 parser, extended the autoparser for direct calls with call_separator, and added a small diff-analyzer workaround only where needed. Local tests pass.

JINZIPING · 2026-06-05T13:10:49Z

I retested latest 829d6f6. MiniCPM5 XML tool parsing still has a partial/streaming edge case.

For output like:

<function name="category_menu"><param name="category">Dessert</param></function>

I can still get args shaped like:

{"_raw_arguments":"{}\"category\":\"Dessert\"}"}

It looks like the partial PEG mapper finalizes the initial { placeholder into {} before the first argument arrives, so later arg deltas are appended after {}.

A narrow local fix was:

- treat target == "{}" as an empty placeholder when the first arg name is parsed
- in partial parsing, do not close braces for an empty tool-arg placeholder before any arg has been seen

I verified this locally with test-chat and test-chat-peg-parser.
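
The two rules above can be sketched in isolation like this. The helper names `append_arg_sketch` and `partial_args_sketch` are hypothetical, the real mapper in the PR is structured differently, and values are assumed to need no JSON escaping.

```cpp
#include <string>

// Rule 1: when the first argument arrives, treat "" or "{}" as an empty
// placeholder and restart from "{" so nothing is appended after a closed object.
void append_arg_sketch(std::string & target, bool & seen_arg,
                       const std::string & key, const std::string & value) {
    if (!seen_arg) {
        if (target.empty() || target == "{}") {
            target = "{";
        }
    } else {
        target += ",";
    }
    target += "\"" + key + "\":\"" + value + "\"";
    seen_arg = true;
}

// Rule 2: a partial snapshot only closes the brace once an argument has been
// seen; an untouched placeholder is returned as-is.
std::string partial_args_sketch(const std::string & target, bool seen_arg) {
    if (!seen_arg) {
        return target;
    }
    return target + "}";
}
```

Starting from a prematurely closed `{}` placeholder, appending `category` / `Dessert` then yields a well-formed object rather than the `{}"category":...` shape reported above.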

pwilkin · 2026-06-05T15:49:35Z

Is the short syntax really a thing? Would it hurt to constrain the model to just the normal form? I'm not a huge fan of adding the alt markers to autoparser because of the way they're implemented here - the autoparser is supposed to rely on automatic detection mechanisms and here the detection is impossible.

zhangtao2-1 · 2026-06-08T03:51:35Z

Hi @pwilkin @JINZIPING Follow-up pushed: streaming {} placeholder fix (per JINZIPING), removed alt/compact XML markers (per pwilkin), plus a category_menu regression test. test-chat / test-chat-peg-parser pass. PTAL.

zhangtao2-1 · 2026-06-09T02:12:22Z

Hi @pwilkin @CISC @aldehir
Can we run CI tests next?

zhangtao2-1 · 2026-06-10T01:58:21Z

Hi @CISC Pushed 809b392 to fix test-jinja-py CI failures (skip min/max attribute tests in -py mode). Could a maintainer please Approve and run the pending workflows? Thanks!

CISC · 2026-06-10T04:42:18Z

+    // attribute= is not implemented in C++ yet; skip in -py mode (Python Jinja2 renders output)
+    if (!g_python_mode) {


Don't do this, restore the tests unconditionally, but put the real expected output instead of "".

zhangtao2-1

Restored attribute tests with real expected output ({'x': 1} / {'x': 2}). Could you approve the pending workflows when you have a moment? Thanks!

Add minicpm5 tool call parser

f50d698

zhangtao2-1 requested review from a team, CISC and ggerganov as code owners May 28, 2026 07:34

CISC reviewed May 28, 2026

Comment thread common/jinja/value.cpp

Comment thread tests/test-jinja.cpp

zhangtao2-1 mentioned this pull request May 28, 2026

Eval bug: Toolcalling not working for MiniCPM5-1B #23781

Open

aldehir requested changes May 28, 2026

github-actions bot added the testing and jinja parser labels May 28, 2026

Refactor MiniCPM5 PEG parser per review feedback

5187d06

zhangtao2-1 requested a review from pwilkin as a code owner May 28, 2026 10:10

CISC reviewed May 28, 2026

Comment thread common/jinja/value.cpp Outdated

Comment thread common/jinja/value.cpp

Comment thread tests/test-jinja.cpp Outdated

Comment thread tests/test-jinja.cpp Outdated

Comment thread tests/test-jinja.cpp Outdated

Fix jinja min/max API to match Jinja2

4188894

modify by review

6288bdb

MiniCPM5: use autoparser for XML tool calls and fix grammar preserved-token triggers

829d6f6

JINZIPING mentioned this pull request Jun 5, 2026

MiniCPM5 Tool Calling Support in llama.cpp #23860

Open

MiniCPM5: fix streaming tool-arg placeholder and remove alt XML markers

f3e81e1

skip min/max attribute tests in -py mode

809b392

CISC reviewed Jun 10, 2026

test-jinja: use real expected output for min/max attribute tests

781a640
