From b0fdae951a6a49f84cc87c30309f9596f840ab73 Mon Sep 17 00:00:00 2001
From: Claude
Date: Sun, 7 Jun 2026 22:09:35 +0000
Subject: [PATCH 1/3] docs(CLAUDE): link cross-repo JPMS module-descriptor policy (javadoc Java-bump trap)

---
 CLAUDE.md | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/CLAUDE.md b/CLAUDE.md
index 56680cb2..5ae7041a 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -701,6 +701,14 @@ See [`../workspace/policies/jqwik-prompt-injection.md`](../workspace/policies/jq
 
 See [`../workspace/policies/lombok-config.md`](../workspace/policies/lombok-config.md).
 
+## JPMS Module Descriptor
+
+This repo ships a `module-info.java` compiled in a separate `release 9` execution. Javadoc
+currently runs in **classpath mode** (javadoc `` is `1.8`), which is the *only* thing
+keeping it clear of the JPMS module-mode javadoc trap that bit BAF. **Before raising the Java /
+javadoc source level to ≥ 9, read**
+[`../workspace/policies/jpms-module-descriptor.md`](../workspace/policies/jpms-module-descriptor.md).
+
 ## Open TODOs
 
 Open TODOs for this repo live in [`TODO.md`](TODO.md). Cross-repo status

From 763abe1659e2bfea4f387fc028f3b56be8ce1a4f Mon Sep 17 00:00:00 2001
From: Claude
Date: Mon, 8 Jun 2026 06:51:00 +0000
Subject: [PATCH 2/3] Upgrade llama.cpp from b9549 to b9553

API compatibility: the only breaking change in this range is
common_sampler_types_from_names() dropping its `bool allow_alt_names`
parameter (common/sampling.h). All call sites (common/arg.cpp,
common/common.cpp, tools/server/server-task.cpp) are upstream-compiled
translation units that upstream updated in the same patch; grep confirms
zero references to the symbol in src/main/cpp / src/test/cpp, so no
project C++ source change is required.

New behaviour gained for free: server-task.cpp previously passed
allow_alt_names=false, so the project's "samplers" JSON field only
matched canonical snake_case names. b9553 always accepts aliases
(top-k, topk, nucleus, temp, typ) and is case-insensitive.
Added 5 params_from_json_cmpl tests in test_server.cpp pinning this.

Other changes in range are no-ops for the JNI build: the llama-kv-cache
shared-cells refactor (internal src/ headers not included by the project;
new `using llama_kv_cells_vec`) and two Python conversion-script .get()
robustness fixes.

Updated GIT_TAG (CMakeLists.txt), the README badge/link, and the
CLAUDE.md pinned version. cmake configure verified clean against b9553;
full build + ctest verification and the breaking-changes log row to
follow.

---
 CLAUDE.md                    |  2 +-
 CMakeLists.txt               |  2 +-
 README.md                    |  2 +-
 src/test/cpp/test_server.cpp | 51 ++++++++++++++++++++++++++++++++++++
 4 files changed, 54 insertions(+), 3 deletions(-)

diff --git a/CLAUDE.md b/CLAUDE.md
index 5ae7041a..82a00664 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -6,7 +6,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 
 Java bindings for [llama.cpp](https://github.com/ggerganov/llama.cpp) via JNI, providing a high-level API for LLM inference in Java. The Java layer communicates with a native C++ library through JNI.
 
-Current llama.cpp pinned version: **b9549**
+Current llama.cpp pinned version: **b9553**
 
 ## Upgrading CUDA Version
 
diff --git a/CMakeLists.txt b/CMakeLists.txt
index 0391d119..a670485c 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -114,7 +114,7 @@ set(LLAMA_BUILD_APP OFF CACHE BOOL "" FORCE)
 FetchContent_Declare(
     llama.cpp
     GIT_REPOSITORY https://github.com/ggerganov/llama.cpp.git
-    GIT_TAG b9549
+    GIT_TAG b9553
 )
 FetchContent_MakeAvailable(llama.cpp)
 
diff --git a/README.md b/README.md
index 12ee7bdd..c2db4e7d 100644
--- a/README.md
+++ b/README.md
@@ -1,7 +1,7 @@
 **Build:**
 ![Java 8+](https://img.shields.io/badge/Java-8%2B-informational)
 ![Platform](https://img.shields.io/badge/Platform-Linux%20%7C%20macOS%20%7C%20Windows%20%7C%20Android-lightgrey)
-[![llama.cpp b9549](https://img.shields.io/badge/llama.cpp-%23b9549-informational)](https://github.com/ggml-org/llama.cpp/releases/tag/b9549)
+[![llama.cpp b9553](https://img.shields.io/badge/llama.cpp-%23b9553-informational)](https://github.com/ggml-org/llama.cpp/releases/tag/b9553)
 [![JPMS](https://img.shields.io/badge/JPMS-modular%20JAR-25A162)](https://openjdk.org/projects/jigsaw/)
 ![JUnit](https://img.shields.io/badge/tested%20with-JUnit6-25A162)
 [![JSpecify](https://img.shields.io/badge/JSpecify-1.0.0%20%40NullMarked-25A162)](https://jspecify.dev)
diff --git a/src/test/cpp/test_server.cpp b/src/test/cpp/test_server.cpp
index b52552d9..0db128ac 100644
--- a/src/test/cpp/test_server.cpp
+++ b/src/test/cpp/test_server.cpp
@@ -1681,6 +1681,57 @@ TEST(ParamsFromJsonCmpl, NCmpl_AliasedFromN) {
     EXPECT_EQ(p.n_cmpl, 1);
 }
 
+// ============================================================
+// params_from_json_cmpl — "samplers" name matching (llama.cpp b9553)
+// common_sampler_types_from_names dropped its allow_alt_names flag:
+// the server path (params_from_json_cmpl) now ALWAYS accepts aliases and
+// is case-insensitive. Before b9553 the server passed allow_alt_names=false,
+// so only the canonical snake_case names matched and "top-k" / "TOP_K" were
+// skipped. These tests pin the more lenient behaviour the project's
+// "samplers" JSON field now exposes for free.
+// ============================================================
+
+TEST(ParamsFromJsonCmpl, Samplers_CanonicalNames_Parsed) {
+    const auto p = parse_params({{"samplers", {"top_k", "top_p", "min_p", "temperature"}}});
+    ASSERT_EQ(p.sampling.samplers.size(), 4u);
+    EXPECT_EQ(p.sampling.samplers[0], COMMON_SAMPLER_TYPE_TOP_K);
+    EXPECT_EQ(p.sampling.samplers[1], COMMON_SAMPLER_TYPE_TOP_P);
+    EXPECT_EQ(p.sampling.samplers[2], COMMON_SAMPLER_TYPE_MIN_P);
+    EXPECT_EQ(p.sampling.samplers[3], COMMON_SAMPLER_TYPE_TEMPERATURE);
+}
+
+TEST(ParamsFromJsonCmpl, Samplers_KebabCaseAlias_NowAccepted) {
+    // "top-k" / "min-p" alt names were rejected by the server before b9553.
+    const auto p = parse_params({{"samplers", {"top-k", "min-p"}}});
+    ASSERT_EQ(p.sampling.samplers.size(), 2u);
+    EXPECT_EQ(p.sampling.samplers[0], COMMON_SAMPLER_TYPE_TOP_K);
+    EXPECT_EQ(p.sampling.samplers[1], COMMON_SAMPLER_TYPE_MIN_P);
+}
+
+TEST(ParamsFromJsonCmpl, Samplers_CaseInsensitive) {
+    const auto p = parse_params({{"samplers", {"TOP_K", "Temperature", "Min-P"}}});
+    ASSERT_EQ(p.sampling.samplers.size(), 3u);
+    EXPECT_EQ(p.sampling.samplers[0], COMMON_SAMPLER_TYPE_TOP_K);
+    EXPECT_EQ(p.sampling.samplers[1], COMMON_SAMPLER_TYPE_TEMPERATURE);
+    EXPECT_EQ(p.sampling.samplers[2], COMMON_SAMPLER_TYPE_MIN_P);
+}
+
+TEST(ParamsFromJsonCmpl, Samplers_MiscAliases_Parsed) {
+    // "nucleus" -> top_p, "temp" -> temperature, "typ" -> typical_p
+    const auto p = parse_params({{"samplers", {"nucleus", "temp", "typ"}}});
+    ASSERT_EQ(p.sampling.samplers.size(), 3u);
+    EXPECT_EQ(p.sampling.samplers[0], COMMON_SAMPLER_TYPE_TOP_P);
+    EXPECT_EQ(p.sampling.samplers[1], COMMON_SAMPLER_TYPE_TEMPERATURE);
+    EXPECT_EQ(p.sampling.samplers[2], COMMON_SAMPLER_TYPE_TYPICAL_P);
+}
+
+TEST(ParamsFromJsonCmpl, Samplers_UnknownName_SkippedNotError) {
+    // unknown names are warned and skipped, not a hard error.
+    const auto p = parse_params({{"samplers", {"top_k", "definitely_not_a_sampler"}}});
+    ASSERT_EQ(p.sampling.samplers.size(), 1u);
+    EXPECT_EQ(p.sampling.samplers[0], COMMON_SAMPLER_TYPE_TOP_K);
+}
+
 // ============================================================
 // params_from_json_cmpl — reasoning_budget_tokens
 // reasoning_budget_tokens defaults to -1 (disabled).

From 483bf838ffe3883ed786ec53a93c2d222fb2f965 Mon Sep 17 00:00:00 2001
From: Claude
Date: Mon, 8 Jun 2026 06:52:05 +0000
Subject: [PATCH 3/3] docs(history): log b9549->b9553 breaking-changes audit + verified 440/440 build

Records the common_sampler_types_from_names signature change (and the
free lenient-sampler-name behaviour), the llama-kv-cache shared-cells
refactor, and the Python conversion-script fixes. Build + ctest verified
clean against b9553: 440/440 tests pass (435 prior + 5 new Samplers_*
tests).

---
 docs/history/llama-cpp-breaking-changes.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/docs/history/llama-cpp-breaking-changes.md b/docs/history/llama-cpp-breaking-changes.md
index 0700c8db..58e39320 100644
--- a/docs/history/llama-cpp-breaking-changes.md
+++ b/docs/history/llama-cpp-breaking-changes.md
@@ -321,3 +321,7 @@ Used during `llama.cpp` version bumps: when upgrading, scan this file from the r
 | ~b9543–b9549 | `.github/workflows/docker.yml` (upstream CI) | Upstream's `cuda13` Docker image bumped from CUDA `13.1.1` to `13.3.0`. Upstream's own CI only; this project ships its own `publish.yml` and pins CUDA 13.2 via `.github/build_cuda_linux.sh` (see CLAUDE.md "Upgrading CUDA Version"). No impact |
 | ~b9543–b9549 | project `CMakeLists.txt` (pre-existing latent bug, fixed in this bump) | **Not an upstream change** — surfaced while build-testing this bump locally. The OS/arch detection block invoked `net.ladenthin.llama.OSInfo`, but the class had moved to `net.ladenthin.llama.loader.OSInfo` in the earlier layered-package restructure, so `cmake -B build` failed with "Could not determine OS name" on any host that does not pass `-DOS_NAME`/`-DOS_ARCH` explicitly (CI does, which is why it went unnoticed). Fixed both `execute_process` invocations (`--os` and `--arch`) to the `loader.OSInfo` FQN. Same stale-FQN-after-restructure class as the earlier `spotbugs-exclude.xml` / PIT-`targetClasses` repairs — the standing reminder to re-validate every FQN-bearing config after a package move now also covers `CMakeLists.txt` |
 | ~b9543–b9549 | upstream build / verification | Local build with `GIT_TAG b9549` verified clean on Linux x86_64: `cmake -B build -DBUILD_TESTING=ON` configures cleanly (after the `loader.OSInfo` FQN fix above), `cmake --build build --config Release -j$(nproc)` links `libjllama.so` + `jllama_test` with zero warnings on any project translation unit (incl. the changed `server-context.cpp`), and `ctest --test-dir build --output-on-failure` reports 435/435 tests passing. All upstream breaking changes in this range are absorbed inside upstream-compiled translation units; no project C++ source edits were required for the version bump itself |
+| ~b9549–b9553 | `common/sampling.h` + `common/sampling.cpp` + `common/arg.cpp` + `common/common.cpp` + `tools/server/server-task.cpp` | `common_sampler_types_from_names()` **dropped its `bool allow_alt_names` parameter** — the signature is now `common_sampler_types_from_names(const std::vector & names)`. The body was rewritten to (a) auto-generate kebab-case (`top-k`) and no-dash (`topk`) aliases from the canonical snake_case names, plus misc aliases (`nucleus`→top_p, `temp`→temperature, `typ`→typical_p), and (b) lowercase the input so matching is **case-insensitive**; aliases are now *always* accepted (the old gate is gone). All three call sites were updated upstream (`arg.cpp` / `common.cpp` dropped the `, true` arg; `server-task.cpp` dropped the `, false` arg). **Project impact: none at the source level** — `grep -rn common_sampler_types_from_names src/main/cpp src/test/cpp` returns zero matches; the symbol is reached only through the upstream-compiled `server-task.cpp` linked into `jllama`. **New behaviour exposed for free:** because `server-task.cpp` previously passed `allow_alt_names=false`, the project's `InferenceParameters` `samplers` JSON array only matched canonical names like `top_k`; it now also accepts `top-k` / `topk` / `nucleus` / `temp` / `typ` and is case-insensitive (`TOP_K`, `Min-P`). Pinned by 5 new `ParamsFromJsonCmpl.Samplers_*` tests in `test_server.cpp` |
+| ~b9549–b9553 | `src/llama-kv-cache.cpp` + `src/llama-kv-cache.h` + `src/llama-kv-cells.h` | KV-cache shared-cells refactor (continues `TAG_KV_CACHE_SHARE_CELLS`, used by the Gemma4-assistant MTP head): the `v_cells` member changed from a by-value `std::vector` to a `std::shared_ptr v_cells_impl` plus a `llama_kv_cells_vec & v_cells` reference, so a target cache now *views* the source cache's cells instead of copying them in `apply_ubatch()`; the constructor also clamps `kv_size` down to the shared source's size. New type alias `using llama_kv_cells_vec = std::vector;` in `llama-kv-cells.h`. All internal `src/` headers the JNI build does **not** include (the project pulls public `llama.h` / `llama-cpp.h`, never `llama-kv-cache.h` / `llama-kv-cells.h`) — verified via `grep -rn "llama_kv_cells\|llama-kv-cache" src/main/cpp src/test/cpp` → zero matches. No project source changes required |
+| ~b9549–b9553 | `conversion/mistral.py` + `convert_hf_to_gguf.py` | Python conversion-script robustness only: `hparams["llama_4_scaling"]` and `"moe" in hparams` replaced with `hparams.get(...)` / `is not None` guards so a present-but-null key no longer crashes conversion. Python tooling, not part of the JNI build. No impact |
+| ~b9549–b9553 | upstream build / verification | Local build with `GIT_TAG b9553` verified clean on Linux x86_64: `cmake -B build -DBUILD_TESTING=ON` configures cleanly, `cmake --build build --config Release -j$(nproc)` links `libjllama.so` + `jllama_test` with zero warnings on any project translation unit, and `ctest --test-dir build --output-on-failure` reports **440/440 tests passing** (435 prior + 5 new `Samplers_*` tests). The sole breaking change in this range (the `common_sampler_types_from_names` signature) is absorbed inside upstream-compiled translation units; no project C++ source edits were required for the version bump itself |
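
Note for readers of this series: the lenient name matching described in the `common/sampling.h` log row can be sketched roughly as below. This is a standalone illustration under stated assumptions, not upstream's implementation. The `sampler_type` enum, `build_sampler_name_map()` and `sampler_types_from_names()` are hypothetical stand-ins, and the alias set is limited to the names this patch mentions (upstream may define more).

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for upstream's sampler enum (not the real type).
enum class sampler_type { top_k, top_p, min_p, typical_p, temperature };

// Builds the lookup table: canonical snake_case names, their derived
// kebab-case and no-dash variants, and the misc aliases named in the log row.
static std::unordered_map<std::string, sampler_type> build_sampler_name_map() {
    const std::vector<std::pair<std::string, sampler_type>> canonical = {
        {"top_k", sampler_type::top_k},
        {"top_p", sampler_type::top_p},
        {"min_p", sampler_type::min_p},
        {"typical_p", sampler_type::typical_p},
        {"temperature", sampler_type::temperature},
    };

    std::unordered_map<std::string, sampler_type> map;
    for (const auto & entry : canonical) {
        map.emplace(entry.first, entry.second);

        // kebab-case alias: "top_k" -> "top-k"
        std::string kebab = entry.first;
        std::replace(kebab.begin(), kebab.end(), '_', '-');
        map.emplace(kebab, entry.second);

        // no-dash alias: "top_k" -> "topk"
        std::string nodash = entry.first;
        nodash.erase(std::remove(nodash.begin(), nodash.end(), '_'), nodash.end());
        map.emplace(nodash, entry.second);
    }

    // Misc aliases listed in the log row.
    map.emplace("nucleus", sampler_type::top_p);
    map.emplace("temp", sampler_type::temperature);
    map.emplace("typ", sampler_type::typical_p);
    return map;
}

// Lowercases each input name, looks it up, and skips names it does not know.
static std::vector<sampler_type> sampler_types_from_names(const std::vector<std::string> & names) {
    static const auto name_map = build_sampler_name_map();

    std::vector<sampler_type> result;
    result.reserve(names.size());
    for (std::string name : names) {
        std::transform(name.begin(), name.end(), name.begin(),
                       [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
        const auto it = name_map.find(name);
        if (it != name_map.end()) {
            result.push_back(it->second);
        }
        // Unknown names fall through here and are dropped (upstream also warns).
    }
    return result;
}
```

In this sketch the aliases are generated once into a single map, so adding a canonical name automatically yields its `top-k` and `topk` style variants, which mirrors the behaviour the five `Samplers_*` tests pin.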