Merged
19 commits
69ea3f6
Restructure flat root package into a layered package hierarchy
claude Jun 7, 2026
fb36b38
ArchUnit: per-module banned-import rule for Jackson
claude Jun 7, 2026
139d49f
ArchUnit: one-package-per-layer strict layered architecture
claude Jun 7, 2026
207e105
CI: add early code-style gate (spotless:check) + jdeps package-graph …
claude Jun 7, 2026
68ea95d
Typed-exception audit: add ModelUnavailableExceptionTest
claude Jun 7, 2026
40f299e
build: add Hamcrest 3.0 test dependency for cross-repo parity
claude Jun 7, 2026
b5b9465
test: adopt Hamcrest matchers in the four highest-value jllama test c…
claude Jun 7, 2026
4a5b21b
test: adopt Hamcrest matchers in 12 more jllama test classes
claude Jun 7, 2026
94f817f
test(value,exception): reach 100% PIT mutation coverage; gate both pa…
claude Jun 7, 2026
eed57e3
test(args,json): extend 100% PIT gate to args.* enums + json.TimingsL…
claude Jun 7, 2026
2b61213
docs(TODO): refresh PIT mutation gate status (27-class value/exceptio…
claude Jun 7, 2026
d291940
fix: RerankResponseParser to 100% PIT; repair stale spotbugs-exclude …
claude Jun 7, 2026
7b707d4
Drive ChatResponseParser and CompletionResponseParser to 100% PIT cov…
claude Jun 7, 2026
78cfef1
Upgrade llama.cpp from b9543 to b9549
claude Jun 7, 2026
811d614
fix(loader): anchor native-library resource path to fixed package root
claude Jun 7, 2026
b4443b1
chore(deps): bump codecov/codecov-action from 6 to 7
claude Jun 7, 2026
e26e1ea
fix(jni): update FindClass FQNs for classes moved in the layered rest…
claude Jun 7, 2026
ed5a82a
test(loader): add model-free native-load smoke + document the procedure
claude Jun 7, 2026
d45e352
test: drop Enum.ordinal() dependence (Error Prone EnumOrdinal) in two…
claude Jun 7, 2026
25 changes: 22 additions & 3 deletions .github/workflows/publish.yml
@@ -56,6 +56,25 @@ jobs:
# Cross-compile jobs (Docker / dockcross) — produce release artifacts, no testing
# ---------------------------------------------------------------------------

  code-style:
    name: Code style (spotless) + package graph
    needs: startgate
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6
      - uses: actions/setup-java@v5
        with:
          java-version: '21'
          distribution: temurin
      - name: Spotless check (fail fast on format violations)
        run: mvn -B --no-transfer-progress spotless:check
      - name: Print internal package dependency graph (jdeps, informational)
        continue-on-error: true
        run: |
          mvn -B --no-transfer-progress -DskipTests -Denforcer.skip=true compile
          echo "=== internal package dependency graph (jdeps, bytecode) ==="
          jdeps -verbose:package target/classes | grep 'net.ladenthin.llama' || true

crosscompile-linux-x86_64-cuda:
name: Cross-Compile manylinux_2_28 x86_64 (CUDA)
needs: startgate
@@ -794,7 +813,7 @@ jobs:
format: jacoco
continue-on-error: true
- name: Codecov
uses: codecov/codecov-action@v6
uses: codecov/codecov-action@v7
with:
token: ${{ secrets.CODECOV_TOKEN }}
files: target/site/jacoco/jacoco.xml
@@ -822,7 +841,7 @@ jobs:

publish-snapshot:
name: Publish Snapshot to Central
needs: [check-snapshot, crosscompile-linux-x86_64-cuda, crosscompile-android-aarch64-opencl]
needs: [check-snapshot, crosscompile-linux-x86_64-cuda, crosscompile-android-aarch64-opencl, code-style]
if: needs.check-snapshot.result == 'success'
runs-on: ubuntu-latest
environment: maven-central
@@ -898,7 +917,7 @@ jobs:
publish-release:
name: Publish Release to Central
if: needs.check-tag.result == 'success'
needs: [check-tag, crosscompile-linux-x86_64-cuda, crosscompile-android-aarch64-opencl]
needs: [check-tag, crosscompile-linux-x86_64-cuda, crosscompile-android-aarch64-opencl, code-style]
runs-on: ubuntu-latest
environment: maven-central
permissions:
45 changes: 44 additions & 1 deletion CLAUDE.md
@@ -6,7 +6,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co

Java bindings for [llama.cpp](https://github.com/ggerganov/llama.cpp) via JNI, providing a high-level API for LLM inference in Java. The Java layer communicates with a native C++ library through JNI.

Current llama.cpp pinned version: **b9543**
Current llama.cpp pinned version: **b9549**

## Upgrading CUDA Version

@@ -303,6 +303,49 @@ be exercised either in CI (via `.github/workflows/publish.yml`) or on a
developer machine with HF access; pre-staged models can also be uploaded
into `models/` out-of-band.

**Verifying the native library *loads* without models (model-free smoke).**
Even with HuggingFace blocked you can still do the one piece of *real native*
verification that does not need a GGUF: confirm the library loads and its
`JNI_OnLoad` resolves every Java class it looks up by name. The model-gated
tests cannot do this in a restricted sandbox — they self-skip via
`Assume.assumeTrue(model present)` **before** the lib is ever loaded, so a plain
`mvn test` is silent on load-time breakage. The full local recipe:

```bash
# 1. Build the native lib locally (FetchContent pulls llama.cpp from GitHub,
# which is reachable even when huggingface.co is not):
mvn -q compile
cmake -B build -DBUILD_TESTING=ON
cmake --build build --config Release -j$(nproc) # -> src/main/resources/.../<os>/<arch>/libjllama.so
# 2. Force LlamaModel.<clinit> (System.load -> JNI_OnLoad) with no model:
mvn test -Dtest=NativeLibraryLoadSmokeTest
```

`NativeLibraryLoadSmokeTest` (in the `loader` package) calls
`Class.forName("net.ladenthin.llama.LlamaModel")`, which runs
`LlamaLoader.initialize() -> System.load() -> JNI_OnLoad`, which in turn calls
`FindClass(...)` for every JNI-referenced Java class. It **passes** when the lib
loads cleanly, **fails** if the native-resource path in `LlamaLoader` is wrong
(lib not found) or a `FindClass`/field-signature FQN in
`src/main/cpp/jllama.cpp` is stale after a Java package move (lib loads but
`JNI_OnLoad` throws `NoClassDefFoundError: net/ladenthin/llama/...`), and
**self-skips** when `libjllama` is not on the classpath (pure-Java checkout, no
CMake build) so it never breaks a build-less `mvn test`.
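
The load chain described above hinges on `Class.forName` running a class's static initializer. A minimal, stdlib-only sketch of that mechanism follows. It is not the repository's `NativeLibraryLoadSmokeTest`; `StaticInitProbe`, `Holder` and `forceInit` are hypothetical names, and the static block merely stands in for the point where a real class would call `System.load`.

```java
// Illustrative sketch only: StaticInitProbe, Holder and forceInit are hypothetical
// names, not classes or methods from this repository.
final class StaticInitProbe {

    /** Flipped by Holder's static initializer; stands in for "the native load ran". */
    static boolean initialized = false;

    /** Stand-in for a class whose static initializer loads a native library. */
    static final class Holder {
        static {
            // A real binding class would trigger the native library load here.
            StaticInitProbe.initialized = true;
        }
    }

    /**
     * Forces static initialization of the named class.
     * Returns null on success, or the Throwable raised by lookup or initialization.
     */
    static Throwable forceInit(String className) {
        try {
            // initialize=true runs the static initializer if it has not run yet.
            Class.forName(className, true, StaticInitProbe.class.getClassLoader());
            return null;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers UnsatisfiedLinkError, NoClassDefFoundError
            // and ExceptionInInitializerError.
            return e;
        }
    }

    public static void main(String[] args) {
        // Holder.class.getName() does not initialize Holder; Class.forName does.
        Throwable failure = forceInit(Holder.class.getName());
        System.out.println("initialized=" + initialized + ", failure=" + failure);
    }
}
```

A smoke test built on this idea would treat a `null` result as a pass and a `LinkageError` as a load-time failure; how the real test decides to self-skip is not shown here.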

Both of those failure modes shipped on a branch once — the layered-package
restructure left (a) `LlamaLoader.getNativeResourcePath()` deriving the resource
root from the loader's own package (which moved to `…loader`) and (b)
`jllama.cpp` still `FindClass`-ing the old flat paths — and neither was visible
to a local `mvn test` (model tests skipped) or to the pure-Java unit tests.
**When you move a Java class the JNI layer references by name** (`LlamaModel`
[root], `exception.LlamaException`, `value.LogLevel`, `args.LogFormat`,
`callback.LoadProgressCallback`), update the matching `FindClass` / `"L…;"`
signature string in `src/main/cpp/jllama.cpp` and keep the native-resource root
anchored at `net/ladenthin/llama/` in `LlamaLoader.NATIVE_RESOURCE_BASE` (it must
not track the loader's own Java package). This is the same
"FQN/path not updated after a package move" class as the stale
`spotbugs-exclude.xml`, PIT `targetClasses`, and `CMakeLists.txt` OSInfo repairs.
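
To make the two failure modes concrete, here is a small stdlib-only sketch of why a package-derived resource root drifts after a class move while a fixed constant does not, and of how a dotted Java FQN maps to the slash-separated form that `FindClass` and `"L…;"` signatures expect. `NativePathSketch` and its methods are hypothetical; only the base path `net/ladenthin/llama/` and the package names come from the text above, and the actual `LlamaLoader` code may differ.

```java
// Illustrative sketch only: NativePathSketch and its methods are hypothetical and
// do not reproduce the repository's LlamaLoader implementation.
final class NativePathSketch {

    /** Fixed root: independent of the package the loader class lives in. */
    static final String NATIVE_RESOURCE_BASE = "net/ladenthin/llama/";

    /** Fragile variant: the root follows the package of the given class name. */
    static String derivedRoot(String loaderClassFqn) {
        int lastDot = loaderClassFqn.lastIndexOf('.');
        if (lastDot < 0) {
            return ""; // default package: no prefix
        }
        return loaderClassFqn.substring(0, lastDot).replace('.', '/') + "/";
    }

    /** Robust variant: moving the loader class cannot change the result. */
    static String anchoredRoot() {
        return NATIVE_RESOURCE_BASE;
    }

    /** JNI FindClass takes slash-separated internal names. */
    static String jniInternalName(String fqn) {
        return fqn.replace('.', '/');
    }

    /** JNI field and method signatures reference object types as "L<internal name>;". */
    static String jniDescriptor(String fqn) {
        return "L" + jniInternalName(fqn) + ";";
    }

    public static void main(String[] args) {
        System.out.println(derivedRoot("net.ladenthin.llama.LlamaLoader"));
        System.out.println(derivedRoot("net.ladenthin.llama.loader.LlamaLoader"));
        System.out.println(anchoredRoot());
        System.out.println(jniDescriptor("net.ladenthin.llama.value.LogLevel"));
    }
}
```

The derived root gains a `loader/` segment as soon as the class moves, which is the "lib not found" case; the descriptor helper shows the string shape that has to be updated by hand in the C++ source after such a move.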

### Code Formatting
```bash
clang-format -i src/main/cpp/*.cpp src/main/cpp/*.hpp # Format C++ code
6 changes: 3 additions & 3 deletions CMakeLists.txt
@@ -114,7 +114,7 @@ set(LLAMA_BUILD_APP OFF CACHE BOOL "" FORCE)
FetchContent_Declare(
llama.cpp
GIT_REPOSITORY https://github.com/ggerganov/llama.cpp.git
GIT_TAG b9543
GIT_TAG b9549
)
FetchContent_MakeAvailable(llama.cpp)

@@ -159,7 +159,7 @@ if(NOT DEFINED OS_NAME)
find_package(Java REQUIRED)
find_program(JAVA_EXECUTABLE NAMES java)
execute_process(
COMMAND ${JAVA_EXECUTABLE} -cp ${CMAKE_SOURCE_DIR}/target/classes net.ladenthin.llama.OSInfo --os
COMMAND ${JAVA_EXECUTABLE} -cp ${CMAKE_SOURCE_DIR}/target/classes net.ladenthin.llama.loader.OSInfo --os
OUTPUT_VARIABLE OS_NAME
OUTPUT_STRIP_TRAILING_WHITESPACE
)
@@ -177,7 +177,7 @@ if(NOT DEFINED OS_ARCH)
find_package(Java REQUIRED)
find_program(JAVA_EXECUTABLE NAMES java)
execute_process(
COMMAND ${JAVA_EXECUTABLE} -cp ${CMAKE_SOURCE_DIR}/target/classes net.ladenthin.llama.OSInfo --arch
COMMAND ${JAVA_EXECUTABLE} -cp ${CMAKE_SOURCE_DIR}/target/classes net.ladenthin.llama.loader.OSInfo --arch
OUTPUT_VARIABLE OS_ARCH
OUTPUT_STRIP_TRAILING_WHITESPACE
)
2 changes: 1 addition & 1 deletion README.md
@@ -1,7 +1,7 @@
**Build:**
![Java 8+](https://img.shields.io/badge/Java-8%2B-informational)
![Platform](https://img.shields.io/badge/Platform-Linux%20%7C%20macOS%20%7C%20Windows%20%7C%20Android-lightgrey)
[![llama.cpp b9543](https://img.shields.io/badge/llama.cpp-%23b9543-informational)](https://github.com/ggml-org/llama.cpp/releases/tag/b9543)
[![llama.cpp b9549](https://img.shields.io/badge/llama.cpp-%23b9549-informational)](https://github.com/ggml-org/llama.cpp/releases/tag/b9549)
[![JPMS](https://img.shields.io/badge/JPMS-modular%20JAR-25A162)](https://openjdk.org/projects/jigsaw/)
![JUnit](https://img.shields.io/badge/tested%20with-JUnit6-25A162)
[![JSpecify](https://img.shields.io/badge/JSpecify-1.0.0%20%40NullMarked-25A162)](https://jspecify.dev)
33 changes: 31 additions & 2 deletions TODO.md
@@ -69,12 +69,41 @@ These are JNI plumbing items for upstream API additions. Policy: add only after
(`07109cc`): 25 sites. The same rule is suppressed in BAF
(`52c8c95`) for identical reasons.

- **Additional ArchUnit rules to consider** — layered-architecture rules (`layeredArchitecture().consideringAllDependencies()`), per-module banned-imports lists, public-API-surface constraints (no public mutable static state, etc.). Partial progress: `7b6667d` covers the "no public field that is not final" sub-rule.
- **Additional ArchUnit rules to consider** — the full **`layeredArchitecture()`** rule and a **per-module banned-import** rule (`jacksonBannedFromContractsAndLoader` — Jackson kept out of `args`/`callback`/`exception`/`loader`) are now DONE. Still open: more per-module banned-imports if useful, public-API-surface constraints (no public mutable static state, etc.). Partial progress: `7b6667d` covers the "no public field that is not final" sub-rule.

- **Cross-repo code-quality TODOs** — see [`../workspace/policies/code-quality-todos.md`](../workspace/policies/code-quality-todos.md) for the canonical `@VisibleForTesting` design-fit review, package hierarchy review, and class/method naming review. This repo has no `@VisibleForTesting` usages today; package and naming reviews remain open.

## Done (kept for history)

### Layered package restructure (flat root package → layered hierarchy)

The flat `net.ladenthin.llama` root package was split (via `git mv`, history
preserved) into layered packages so boundaries align with the layers, enforced
by a new `layeredArchitecture()` ArchUnit rule (Api → Loader → Marshalling →
Foundation):

- **Foundation**: `value` (18 DTOs: ChatMessage, ContentPart, Pair, LlamaOutput,
…), `callback` (CancellationToken, LoadProgressCallback, ToolHandler),
`exception` (LlamaException, ModelUnavailableException), `args` (existing leaf).
- **Marshalling**: `json` (response parsers + `TimingsLogger`, its only consumer),
`parameters` (Inference/Model/Json/Cli parameters + `ParameterJsonSerializer` +
`ChatRequest`).
- **Loader** (internal, NOT exported): `loader` (LlamaLoader, OSInfo,
ProcessRunner, NativeLibraryPermissionSetter, Java8CompatibilityHelper,
SkipDownloadFailureTranslator, LlamaSystemProperties).
- **Api** (root): LlamaModel, Session, LlamaIterable, LlamaIterator.

Cycle-breaking moves: `TimingsLogger` root→`json`, `ParameterJsonSerializer`
`json`→`parameters`, `ChatRequest` root→`parameters` (it carries an
`InferenceParameters` customizer). Test classes mirrored into their subjects'
packages; cross-layer members promoted to `public`. Cross-package Javadoc
`{@link}` references fully-qualified (palantir's `removeUnusedImports` strips
javadoc-only imports). `module-info` exports the new public-API packages and
keeps `loader` internal. All 11 ArchUnit rules green; `javadoc:jar` clean.

**Breaking change**: public-API FQNs changed (e.g. `net.ladenthin.llama.ChatMessage`
→ `net.ladenthin.llama.value.ChatMessage`) — ship under a major version bump.
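
The layer ordering above can be illustrated with a stdlib-only sketch of the underlying check: each package is assigned a layer rank, and a dependency is accepted only when it does not point upward. This is not the project's ArchUnit rule. `LayerCheckSketch` and `isAllowed` are hypothetical, and two simplifications are assumptions rather than facts from the source: a layer may reach any lower layer (not only the adjacent one), and packages within the same layer may reference each other.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: LayerCheckSketch is hypothetical and is not the
// ArchUnit layeredArchitecture() rule used by the project.
final class LayerCheckSketch {

    static final String ROOT = "net.ladenthin.llama";

    /** Package name -> layer rank; a higher rank means a higher layer. */
    static final Map<String, Integer> LAYER_OF = new LinkedHashMap<>();

    static {
        LAYER_OF.put(ROOT, 3);                 // Api
        LAYER_OF.put(ROOT + ".loader", 2);     // Loader
        LAYER_OF.put(ROOT + ".json", 1);       // Marshalling
        LAYER_OF.put(ROOT + ".parameters", 1); // Marshalling
        LAYER_OF.put(ROOT + ".value", 0);      // Foundation
        LAYER_OF.put(ROOT + ".callback", 0);   // Foundation
        LAYER_OF.put(ROOT + ".exception", 0);  // Foundation
        LAYER_OF.put(ROOT + ".args", 0);       // Foundation
    }

    /**
     * Returns true when a dependency from one package to another does not point
     * to a higher layer. Assumption: same-layer and layer-skipping dependencies
     * are both accepted in this sketch.
     */
    static boolean isAllowed(String fromPackage, String toPackage) {
        Integer from = LAYER_OF.get(fromPackage);
        Integer to = LAYER_OF.get(toPackage);
        if (from == null || to == null) {
            throw new IllegalArgumentException("unknown package: " + fromPackage + " -> " + toPackage);
        }
        return from >= to;
    }

    public static void main(String[] args) {
        System.out.println("api -> value allowed: " + isAllowed(ROOT, ROOT + ".value"));
        System.out.println("value -> json allowed: " + isAllowed(ROOT + ".value", ROOT + ".json"));
    }
}
```

Under this model the cycle-breaking moves listed above make sense: a class that sits in a low layer but needs a type from a higher one has to move up, or the type it needs has to move down.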

- **Reactive `LlamaPublisher` removed in favour of consumer-side adapters.**
The hand-rolled `LlamaPublisher` + `LlamaModel.streamPublisher` /
`streamChatPublisher` (shipped in PR #188 as §2.3 of the Kotlin SDK
@@ -95,7 +124,7 @@ These are JNI plumbing items for upstream API additions. Policy: add only after
- **`javac -Werror` + `-Xlint:all,-serial,-options,-classfile,-processing`** — `3e2efbb`. ~20 EP warnings addressed first (EqualsGetClass on `Pair` via instanceof; MissingOverride on `PoolingType` / `RopeScalingType`; JdkObsolete `LinkedList` → `ArrayList` in `LlamaLoader`; StringSplitter inline-suppressed; 3× StringCaseLocaleUsage `Locale.ROOT` in `OSInfo`; EmptyCatch in `OSInfo.isAlpineLinux`; FutureReturnValueIgnored in `LlamaModel.completeAsync`; Finalize on `LlamaModel.finalize`; MixedMutabilityReturnType in 4 parser methods; EnumOrdinal in `InferenceParameters.setMiroStat`; EscapedEntity in `InferenceParameters` javadoc; 4× TypeParameterUnusedInFormals; AnnotateFormatMethod on `Java8CompatibilityHelper.formatted`; SafeVarargs + varargs on `Java8CompatibilityHelper.listOf`).
- **`-parameters` javac arg** — `4350cf2`.
- **`--release N`** — `4350cf2` (`<release>8</release>`).
- **Mutation-testing threshold enforcement (PIT)** — `62f8a00` + `bb93a8f` (docs) + `3bfa51f` (README badge). "Single class, full plumbing" pattern: PIT runs every CI build with `<mutationThreshold>100</mutationThreshold>`, `<targetClasses>` narrowed to `net.ladenthin.llama.Pair`.
- **Mutation-testing threshold enforcement (PIT)** — `62f8a00` + `bb93a8f` (docs) + `3bfa51f` (README badge). Runs every CI build with `<mutationThreshold>100</mutationThreshold>`. **Scope expanded 2026-06-07** from the original single `Pair` target (which was stale after the restructure — `llama.Pair`→`value.Pair` matched nothing) to `value.*` + `exception.*` + `args.*` + `json.TimingsLogger` = 27 classes / 163 mutations, all killed. Still open (optional): `json.ChatResponseParser` / `CompletionResponseParser` private-helper survivors (`RerankResponseParser` is excluded — equivalent empty-list mutant).
- **Checker Framework as a second static-nullness pass** — `c63870b`. The original
`@PolyNull` on `JsonParameters.toJsonString` was simplified to plain `@Nullable`
(the only `@PolyNull` site in production; eliminated in a later cleanup).