bernardladenthin · bernardladenthin · Jun 8, 2026 · Jun 8, 2026 · Jun 8, 2026
@@ -6,7 +6,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 
 Java bindings for [llama.cpp](https://github.com/ggerganov/llama.cpp) via JNI, providing a high-level API for LLM inference in Java. The Java layer communicates with a native C++ library through JNI.
 
-Current llama.cpp pinned version: **b9553**
+Current llama.cpp pinned version: **b9555**
 
 ## Upgrading CUDA Version
 

@@ -114,7 +114,7 @@ set(LLAMA_BUILD_APP OFF CACHE BOOL "" FORCE)
 FetchContent_Declare(
 	llama.cpp
 	GIT_REPOSITORY https://github.com/ggerganov/llama.cpp.git
-	GIT_TAG        b9553
+	GIT_TAG        b9555
 )
 FetchContent_MakeAvailable(llama.cpp)
 

@@ -1,7 +1,7 @@
 **Build:**  
 ![Java 8+](https://img.shields.io/badge/Java-8%2B-informational)  
 ![Platform](https://img.shields.io/badge/Platform-Linux%20%7C%20macOS%20%7C%20Windows%20%7C%20Android-lightgrey)  
-[![llama.cpp b9553](https://img.shields.io/badge/llama.cpp-%23b9553-informational)](https://github.com/ggml-org/llama.cpp/releases/tag/b9553)  
+[![llama.cpp b9555](https://img.shields.io/badge/llama.cpp-%23b9555-informational)](https://github.com/ggml-org/llama.cpp/releases/tag/b9555)  
 [![JPMS](https://img.shields.io/badge/JPMS-modular%20JAR-25A162)](https://openjdk.org/projects/jigsaw/)  
 ![JUnit](https://img.shields.io/badge/tested%20with-JUnit6-25A162)  
 [![JSpecify](https://img.shields.io/badge/JSpecify-1.0.0%20%40NullMarked-25A162)](https://jspecify.dev)  
@@ -110,7 +110,7 @@ Access this library via Maven (released versions on Maven Central):
 <dependency>
     <groupId>net.ladenthin</groupId>
     <artifactId>llama</artifactId>
-    <version>5.0.1</version>
+    <version>5.0.2</version>
 </dependency>
 ```
 
@@ -168,22 +168,22 @@ Pick at most one — they are mutually exclusive.
 <dependency>
     <groupId>net.ladenthin</groupId>
     <artifactId>llama</artifactId>
-    <version>5.0.1</version>
+    <version>5.0.2</version>
 </dependency>
 
 <!-- CUDA on Linux x86-64 (requires CUDA 13 runtime on the host) -->
 <dependency>
     <groupId>net.ladenthin</groupId>
     <artifactId>llama</artifactId>
-    <version>5.0.1</version>
+    <version>5.0.2</version>
     <classifier>cuda13-linux-x86-64</classifier>
 </dependency>
 
 <!-- OpenCL/Adreno on Android (requires device-provided OpenCL ICD) -->
 <dependency>
     <groupId>net.ladenthin</groupId>
     <artifactId>llama</artifactId>
-    <version>5.0.1</version>
+    <version>5.0.2</version>
     <classifier>opencl-android-aarch64</classifier>
 </dependency>
 ```

@@ -325,3 +325,4 @@ Used during `llama.cpp` version bumps: when upgrading, scan this file from the r
 | ~b9549&ndash;b9553 | `src/llama-kv-cache.cpp` + `src/llama-kv-cache.h` + `src/llama-kv-cells.h` | KV-cache shared-cells refactor (continues `TAG_KV_CACHE_SHARE_CELLS`, used by the Gemma4-assistant MTP head): the `v_cells` member changed from a by-value `std::vector<llama_kv_cells>` to a `std::shared_ptr<llama_kv_cells_vec> v_cells_impl` plus a `llama_kv_cells_vec & v_cells` reference, so a target cache now *views* the source cache's cells instead of copying them in `apply_ubatch()`; the constructor also clamps `kv_size` down to the shared source's size. New type alias `using llama_kv_cells_vec = std::vector<llama_kv_cells>;` in `llama-kv-cells.h`. All internal `src/` headers the JNI build does **not** include (the project pulls public `llama.h` / `llama-cpp.h`, never `llama-kv-cache.h` / `llama-kv-cells.h`) &mdash; verified via `grep -rn "llama_kv_cells\|llama-kv-cache" src/main/cpp src/test/cpp` &#x2192; zero matches. No project source changes required |
 | ~b9549&ndash;b9553 | `conversion/mistral.py` + `convert_hf_to_gguf.py` | Python conversion-script robustness only: `hparams["llama_4_scaling"]` and `"moe" in hparams` replaced with `hparams.get(...)` / `is not None` guards so a present-but-null key no longer crashes conversion. Python tooling, not part of the JNI build. No impact |
 | ~b9549&ndash;b9553 | upstream build / verification | Local build with `GIT_TAG b9553` verified clean on Linux x86_64: `cmake -B build -DBUILD_TESTING=ON` configures cleanly, `cmake --build build --config Release -j$(nproc)` links `libjllama.so` + `jllama_test` with zero warnings on any project translation unit, and `ctest --test-dir build --output-on-failure` reports **440/440 tests passing** (435 prior + 5 new `Samplers_*` tests). The sole breaking change in this range (the `common_sampler_types_from_names` signature) is absorbed inside upstream-compiled translation units; no project C++ source edits were required for the version bump itself |
+| ~b9553&ndash;b9555 | `.devops/intel.Dockerfile` + `ggml/src/ggml-metal/ggml-metal-device.cpp` + `tests/test-backend-ops.cpp` | Tiny maintenance bump &mdash; **no API change and no new feature**. (1) `intel.Dockerfile`: Intel GPU userspace driver pins bumped (IGC `v2.20.5`&#x2192;`v2.34.4`, compute-runtime `25.40.35563.10`&#x2192;`26.18.38308.1`, IGDGMM `22.8.2`&#x2192;`22.10.0`) with the old multi-GPU-safe versions commented out; upstream's own Docker image only &mdash; this project ships its own `publish.yml` and does not consume `.devops/`. No impact. (2) `ggml-metal-device.cpp`: bugfix to the Metal im2col pipeline selector &mdash; the standard-vs-`_ext` kernel choice now keys off the actual conv-kernel footprint (`KH*KW`, with `KH = is_2D ? ne01 : 1`, `KW = ne00`) instead of the raw `ne00*ne01` product, fixing kernel selection for 1-D convolutions. Backend-internal Metal TU compiled via FetchContent; no API surface visible to `jllama.cpp`, and only affects the macOS/Metal backend at runtime. (3) `tests/test-backend-ops.cpp`: one extra `test_im2col` case (`{3000,384,1,1}` / `{3,384,384,1}`) added &mdash; upstream test only, not linked into the JNI build. **No project source changes required; no new Java-API-exposable feature.** Build verification deferred to CI (`publish.yml`) / a developer host as usual |
@@ -12,7 +12,7 @@ SPDX-License-Identifier: MIT
 
 	<groupId>net.ladenthin</groupId>
 	<artifactId>llama</artifactId>
-	<version>5.0.2-SNAPSHOT</version>
+	<version>5.0.2</version>
 	<packaging>jar</packaging>
 
 	<name>${project.groupId}:${project.artifactId}</name>