Got HTTP 500 (Internal Server Error) when sending a prompt with "\n" in it #23

@palominoforever

Name and Version

I compiled and ran llama-server from the "gemma4-mtp" branch. While using your mtp-bench.py, I noticed that the "qa_factual" test triggers an HTTP 500 error from llama-server. Changing the prompt from "Q: What are the four fundamental forces of physics?\nA:" to "Q: What are the four fundamental forces of physics? A:" completely resolves the error. The HTTP 500 error occurs ONLY when MTP is enabled.

llama-server version:
$ ~/am17an-llama.cpp/build/bin/llama-server --version
version: 9541 (65eef95)
built with GNU 16.1.1 for Linux x86_64

There is also another issue: in the "summarize" test, the returned text is empty, with or without MTP enabled. I don't know whether this is a gemma4 issue.
payload: {"model": "gemma-4-12B", "prompt": "Summarize in two sentences: The Industrial Revolution began in Britain in the late 18th century, transforming manufacturing through mechanization, steam power, and the factory system. It spread to continental Europe and North America during the 19th century.", "n_predict": 192, "temperature": 0.0, "seed": 42, "cache_prompt": false, "stream": false}
return: {"choices":[{"text":"","index":0,"logprobs":null,"finish_reason":"stop"}],"created":1780747864,"model":"gemma-4-12B","system_fingerprint":"b9541-65eef9549","object":"text_completion","usage":{"completion_tokens":1,"prompt_tokens":52,"total_tokens":53,"prompt_tokens_details":{"cached_tokens":0}},"id":"chatcmpl-6tDfXh0F4norklRfyeSJo7fowybIaxt3","timings":{"cache_n":0,"prompt_n":52,"prompt_ms":53.573,"prompt_per_token_ms":1.03025,"prompt_per_second":970.638194612958,"predicted_n":1,"predicted_ms":0.001,"predicted_per_token_ms":0.001,"predicted_per_second":1000000.0}}
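A minimal sketch for triaging the empty "summarize" result, assuming the response is JSON shaped like the one above. The helper name `summarize_response` and the trimmed `SAMPLE` string are illustrative and not part of mtp-bench.py. It pulls out the fields that show the server stopped after a single predicted token, which would be consistent with generation ending immediately, although the report does not confirm the cause.

```python
import json

# Trimmed copy of the response above; only the fields inspected here are kept.
SAMPLE = (
    '{"choices":[{"text":"","index":0,"logprobs":null,"finish_reason":"stop"}],'
    '"usage":{"completion_tokens":1,"prompt_tokens":52,"total_tokens":53},'
    '"timings":{"predicted_n":1}}'
)


def summarize_response(response_text: str) -> dict:
    """Extract the fields that indicate an empty completion (hypothetical helper)."""
    resp = json.loads(response_text)
    choice = resp["choices"][0]
    return {
        "empty": choice["text"] == "",
        "finish_reason": choice["finish_reason"],
        "completion_tokens": resp["usage"]["completion_tokens"],
        "predicted_n": resp["timings"]["predicted_n"],
    }


if __name__ == "__main__":
    print(summarize_response(SAMPLE))
```

Running the same helper on responses captured with and without MTP would show whether both cases stop after one token.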

Operating systems

Linux

GGML backends

CUDA

Hardware

Ryzen 3700 + RTX 5080

Models

https://huggingface.co/unsloth/gemma-4-12b-it-GGUF/blob/main/gemma-4-12b-it-Q8_0.gguf
https://huggingface.co/colefuoco00/gemma-4-12B-it-assistant-GGUF/blob/main/gemma-4-12B-it-assistant-Q8_0.gguf

Problem description & steps to reproduce

Here are my llama-server parameters:
[gemma-4-12B]
model = /home/palomino/models/gemma-4-12b-it-Q8_0.gguf
model-draft = /home/palomino/models/gemma-4-12B-it-assistant-Q8_0.gguf
c = 131072
threads = 8
flash-attn = true
#reasoning-format = deepseek
#reasoning-budget = 4096
temp = 1.0
top-p = 0.95
top-k = 64
min-p = 0.0
presence-penalty = 0.0
repeat-penalty = 1.0
n-predict = -1
parallel = 1
n-gpu-layers = 99
no-mmap = true
b = 256
ub = 256
ctk = q8_0
ctv = q8_0
spec-type = draft-mtp
spec-draft-n-max = 4
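The comparison between the two "qa_factual" prompts can be scripted roughly as follows. This is a sketch under stated assumptions, not the code from mtp-bench.py: the server address `http://127.0.0.1:8080` and the `/v1/completions` path are guesses (the `text_completion` object in the response above suggests an OpenAI-style completions endpoint), the request fields are copied from the "summarize" payload rather than from the actual "qa_factual" request, and the `\n` is assumed to be a real newline character rather than a literal backslash followed by `n`.

```python
import json
import urllib.error
import urllib.request

# Assumptions (not stated in the report): server address and endpoint path.
BASE_URL = "http://127.0.0.1:8080"
ENDPOINT = "/v1/completions"

PROMPT_WITH_NEWLINE = "Q: What are the four fundamental forces of physics?\nA:"
PROMPT_WITHOUT_NEWLINE = "Q: What are the four fundamental forces of physics? A:"


def build_payload(prompt: str) -> dict:
    """Build a request body mirroring the fields of the 'summarize' payload."""
    return {
        "model": "gemma-4-12B",
        "prompt": prompt,
        "n_predict": 192,
        "temperature": 0.0,
        "seed": 42,
        "cache_prompt": False,
        "stream": False,
    }


def post(prompt: str) -> tuple[int, str]:
    """Send one completion request and return (HTTP status, response body)."""
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    req = urllib.request.Request(
        BASE_URL + ENDPOINT,
        data=data,
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req, timeout=300) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # An HTTP 500 ends up here; keep the body so the parse error is visible.
        return err.code, err.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    cases = [
        ("with newline", PROMPT_WITH_NEWLINE),
        ("without newline", PROMPT_WITHOUT_NEWLINE),
    ]
    for label, prompt in cases:
        status, body = post(prompt)
        print(f"{label}: HTTP {status}: {body[:200]}")
```

Running it once with the `spec-type` and `spec-draft-n-max` lines active and once with them commented out should show whether the HTTP 500 appears only when MTP is enabled.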

First Bad Commit

No response

Relevant log output

result: {"error":{"code":500,"message":"Failed to parse input at pos 0:  �<|channel>thought\n<|channel>thought\n<channel|>The four fundamental forces of physics are the basic interactions that govern how all matter and energy in the universe behave. They are:\n\n### 1. Gravitational Force\n*   **What it does:** It is the force of attraction between any two objects that have mass or energy.\n*   **Scope:** It is the weakest of the four forces, but it has an infinite range and is responsible for the structure of the universe (keeping planets in orbit, stars together, and galaxies intact).\n*   **Governing Equation:** Newton's Law of Universal Gravitation / Einstein's General Relativity.\n\n### 2. Electromagnetic Force\n*   **What it does:** It acts between electrically charged particles. It includes both electric forces (attraction/repulsion of charges) and magnetic forces.\n*   **Scope:** It is much stronger than gravity. It governs almost everything we experience in","type":"server_error"}}

