Merged
2 changes: 1 addition & 1 deletion CLAUDE.md
@@ -6,7 +6,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co

Java bindings for [llama.cpp](https://github.com/ggerganov/llama.cpp) via JNI, providing a high-level API for LLM inference in Java. The Java layer communicates with a native C++ library through JNI.

-Current llama.cpp pinned version: **b9553**
+Current llama.cpp pinned version: **b9555**

## Upgrading CUDA Version

2 changes: 1 addition & 1 deletion CMakeLists.txt
@@ -114,7 +114,7 @@ set(LLAMA_BUILD_APP OFF CACHE BOOL "" FORCE)
FetchContent_Declare(
llama.cpp
GIT_REPOSITORY https://github.com/ggerganov/llama.cpp.git
-GIT_TAG b9553
+GIT_TAG b9555
)
FetchContent_MakeAvailable(llama.cpp)

10 changes: 5 additions & 5 deletions README.md
@@ -1,7 +1,7 @@
**Build:**
![Java 8+](https://img.shields.io/badge/Java-8%2B-informational)
![Platform](https://img.shields.io/badge/Platform-Linux%20%7C%20macOS%20%7C%20Windows%20%7C%20Android-lightgrey)
-[![llama.cpp b9553](https://img.shields.io/badge/llama.cpp-%23b9553-informational)](https://github.com/ggml-org/llama.cpp/releases/tag/b9553)
+[![llama.cpp b9555](https://img.shields.io/badge/llama.cpp-%23b9555-informational)](https://github.com/ggml-org/llama.cpp/releases/tag/b9555)
[![JPMS](https://img.shields.io/badge/JPMS-modular%20JAR-25A162)](https://openjdk.org/projects/jigsaw/)
![JUnit](https://img.shields.io/badge/tested%20with-JUnit6-25A162)
[![JSpecify](https://img.shields.io/badge/JSpecify-1.0.0%20%40NullMarked-25A162)](https://jspecify.dev)
@@ -110,7 +110,7 @@ Access this library via Maven (released versions on Maven Central):
<dependency>
<groupId>net.ladenthin</groupId>
<artifactId>llama</artifactId>
-<version>5.0.1</version>
+<version>5.0.2</version>
</dependency>
```

@@ -168,22 +168,22 @@ Pick at most one — they are mutually exclusive.
<dependency>
<groupId>net.ladenthin</groupId>
<artifactId>llama</artifactId>
-<version>5.0.1</version>
+<version>5.0.2</version>
</dependency>

<!-- CUDA on Linux x86-64 (requires CUDA 13 runtime on the host) -->
<dependency>
<groupId>net.ladenthin</groupId>
<artifactId>llama</artifactId>
-<version>5.0.1</version>
+<version>5.0.2</version>
<classifier>cuda13-linux-x86-64</classifier>
</dependency>

<!-- OpenCL/Adreno on Android (requires device-provided OpenCL ICD) -->
<dependency>
<groupId>net.ladenthin</groupId>
<artifactId>llama</artifactId>
-<version>5.0.1</version>
+<version>5.0.2</version>
<classifier>opencl-android-aarch64</classifier>
</dependency>
```
1 change: 1 addition & 0 deletions docs/history/llama-cpp-breaking-changes.md
@@ -325,3 +325,4 @@ Used during `llama.cpp` version bumps: when upgrading, scan this file from the r
| ~b9549&ndash;b9553 | `src/llama-kv-cache.cpp` + `src/llama-kv-cache.h` + `src/llama-kv-cells.h` | KV-cache shared-cells refactor (continues `TAG_KV_CACHE_SHARE_CELLS`, used by the Gemma4-assistant MTP head): the `v_cells` member changed from a by-value `std::vector<llama_kv_cells>` to a `std::shared_ptr<llama_kv_cells_vec> v_cells_impl` plus a `llama_kv_cells_vec & v_cells` reference, so a target cache now *views* the source cache's cells instead of copying them in `apply_ubatch()`; the constructor also clamps `kv_size` down to the shared source's size. New type alias `using llama_kv_cells_vec = std::vector<llama_kv_cells>;` in `llama-kv-cells.h`. All internal `src/` headers the JNI build does **not** include (the project pulls public `llama.h` / `llama-cpp.h`, never `llama-kv-cache.h` / `llama-kv-cells.h`) &mdash; verified via `grep -rn "llama_kv_cells\|llama-kv-cache" src/main/cpp src/test/cpp` &#x2192; zero matches. No project source changes required |
| ~b9549&ndash;b9553 | `conversion/mistral.py` + `convert_hf_to_gguf.py` | Python conversion-script robustness only: `hparams["llama_4_scaling"]` and `"moe" in hparams` replaced with `hparams.get(...)` / `is not None` guards so a present-but-null key no longer crashes conversion. Python tooling, not part of the JNI build. No impact |
| ~b9549&ndash;b9553 | upstream build / verification | Local build with `GIT_TAG b9553` verified clean on Linux x86_64: `cmake -B build -DBUILD_TESTING=ON` configures cleanly, `cmake --build build --config Release -j$(nproc)` links `libjllama.so` + `jllama_test` with zero warnings on any project translation unit, and `ctest --test-dir build --output-on-failure` reports **440/440 tests passing** (435 prior + 5 new `Samplers_*` tests). The sole breaking change in this range (the `common_sampler_types_from_names` signature) is absorbed inside upstream-compiled translation units; no project C++ source edits were required for the version bump itself |
+| ~b9553&ndash;b9555 | `.devops/intel.Dockerfile` + `ggml/src/ggml-metal/ggml-metal-device.cpp` + `tests/test-backend-ops.cpp` | Tiny maintenance bump &mdash; **no API change and no new feature**. (1) `intel.Dockerfile`: Intel GPU userspace driver pins bumped (IGC `v2.20.5`&#x2192;`v2.34.4`, compute-runtime `25.40.35563.10`&#x2192;`26.18.38308.1`, IGDGMM `22.8.2`&#x2192;`22.10.0`) with the old multi-GPU-safe versions commented out; upstream's own Docker image only &mdash; this project ships its own `publish.yml` and does not consume `.devops/`. No impact. (2) `ggml-metal-device.cpp`: bugfix to the Metal im2col pipeline selector &mdash; the standard-vs-`_ext` kernel choice now keys off the actual conv-kernel footprint (`KH*KW`, with `KH = is_2D ? ne01 : 1`, `KW = ne00`) instead of the raw `ne00*ne01` product, fixing kernel selection for 1-D convolutions. Backend-internal Metal TU compiled via FetchContent; no API surface visible to `jllama.cpp`, and only affects the macOS/Metal backend at runtime. (3) `tests/test-backend-ops.cpp`: one extra `test_im2col` case (`{3000,384,1,1}` / `{3,384,384,1}`) added &mdash; upstream test only, not linked into the JNI build. **No project source changes required; no new Java-API-exposable feature.** Build verification deferred to CI (`publish.yml`) / a developer host as usual |
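Reviewer note on item (2) of the new row: the footprint change can be illustrated with a minimal sketch. This is not the upstream Metal code. The function names and the threshold `kHypotheticalMaxFootprint` are hypothetical, and the assumption that a larger footprint selects the `_ext` kernel is mine; only the `KH`/`KW` formulas and the `ne00*ne01` product come from the changelog entry. The example shape assumes `{3,384,384,1}` in the added `test_im2col` case is the kernel, i.e. `ne00 = 3`, `ne01 = 384`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical limit for illustration only; the real value used by the
// Metal backend is not stated in the changelog entry.
constexpr int64_t kHypotheticalMaxFootprint = 1024;

// Pre-fix footprint as described in the entry: the raw ne00*ne01 product,
// regardless of whether the convolution is 1-D or 2-D.
constexpr int64_t footprint_old(int64_t ne00, int64_t ne01, bool /*is_2D*/) {
    return ne00 * ne01;
}

// Post-fix footprint: KH*KW with KH = is_2D ? ne01 : 1 and KW = ne00.
constexpr int64_t footprint_new(int64_t ne00, int64_t ne01, bool is_2D) {
    return (is_2D ? ne01 : 1) * ne00;
}

// Assumed selection rule (direction of the comparison is an assumption):
// fall back to the "_ext" kernel when the footprint exceeds the limit.
constexpr bool needs_ext(int64_t footprint) {
    return footprint > kHypotheticalMaxFootprint;
}

// Runs the 1-D example: both formulas on the same kernel shape.
inline void demo_1d_case() {
    const int64_t ne00 = 3, ne01 = 384;
    assert(footprint_old(ne00, ne01, false) == 1152); // 3 * 384
    assert(footprint_new(ne00, ne01, false) == 3);    // KH = 1, KW = 3
}
```

Under these assumptions, a 1-D convolution with a 3-wide kernel and 384 channels yields a footprint of 1152 with the old product but 3 with the corrected formula, so the two rules can disagree on which kernel to pick. For 2-D convolutions both formulas give the same value.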
2 changes: 1 addition & 1 deletion pom.xml
@@ -12,7 +12,7 @@ SPDX-License-Identifier: MIT

<groupId>net.ladenthin</groupId>
<artifactId>llama</artifactId>
-<version>5.0.2-SNAPSHOT</version>
+<version>5.0.2</version>
<packaging>jar</packaging>

<name>${project.groupId}:${project.artifactId}</name>