From b4b5a85a22a947f087ee583eb4f45f9c4b0c6efa Mon Sep 17 00:00:00 2001
From: Michal Harakal
Date: Sat, 6 Jun 2026 22:01:19 +0200
Subject: [PATCH 1/2] =?UTF-8?q?release:=200.28.1=20=E2=80=94=20engine=20pi?=
 =?UTF-8?q?n=200.28.1,=20gemma3=20exports=20+=20compiles=20to=20vmfb?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- gradle/libs.versions.toml: skainet 0.27.0 -> 0.28.1
- gradle.properties: VERSION_NAME 0.26.0 -> 0.28.1 (version-aligned with engine)
- CHANGELOG [0.28.1]: engine pin picks up the completed DAG-DSL -> StableHLO ->
  IREE export path (SKaiNET #673/#675); gemma3 traces and iree-compiles to a vmfb
- README: Current release + What's new in 0.28.1; roadmap StableHLO bullet
  updated; transformers-bom snippet 0.25.0 -> 0.28.1
- docs: tutorial BOM/version refs 0.25.0 -> 0.28.1

Verified vs published 0.28.1: :llm-inference:gemma:jvmTest green
(GemmaMlirDumpTest 1/1, GemmaTraceTest 1/1); full assemble green.
---
 CHANGELOG.md                                  | 25 +++++++++++++
 README.md                                     | 35 ++++++++++++++-----
 .../pages/tutorials/getting-started-java.adoc |  4 +--
 .../pages/tutorials/llama3-tool-calling.adoc  |  2 +-
 gradle.properties                             |  2 +-
 gradle/libs.versions.toml                     |  2 +-
 6 files changed, 56 insertions(+), 14 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 837c79d..5d79cc3 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,6 +7,31 @@ version line is kept in lock-step with the underlying SKaiNET engine
 The format roughly follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [0.28.1] — 2026-06-06
+
+Version-aligned with **SKaiNET 0.28.1**. Skips 0.26.x / 0.27.x —
+SKaiNET-transformers tracked the engine internally across that window without a
+tagged release.
+
+### Changed
+
+- **`gradle/libs.versions.toml` `skainet` pin: 0.27.0 → 0.28.1.** Picks up the
+  completed Kotlin DSL → StableHLO → IREE export path.
SKaiNET 0.28.0/0.28.1
+  closed the remaining DAG-DSL export bugs: shape-changing ops now declare their
+  inferred output type instead of echoing operand-0 — `reshape`/`matmul`/`concatenate`
+  ([SKaiNET #673](https://github.com/SKaiNET-developers/SKaiNET/issues/673)) and
+  `conv1d`/`gather`/`maxpool2d`/`avgpool2d`/`flatten`
+  ([SKaiNET #675](https://github.com/SKaiNET-developers/SKaiNET/issues/675)) — and
+  `reduce_window` is emitted in IREE's generic region form. A full gemma3 graph
+  traced through `GemmaMlirDumpTest` / `GemmaTraceTest` now lowers to StableHLO
+  that `iree-compile`s to a `vmfb`. No transformers-side API changes; existing
+  callers compile unchanged.
+
+### Verified
+
+- `:llm-inference:gemma:jvmTest` green against the published SKaiNET 0.28.1
+  (`GemmaMlirDumpTest` 1/1, `GemmaTraceTest` 1/1).
+
 ## [0.25.0] — 2026-05-25
 
 Version-aligned with **SKaiNET 0.25.0**. Skips 0.24.x — SKaiNET-transformers has
diff --git a/README.md b/README.md
index ee11e0b..f5901dd 100644
--- a/README.md
+++ b/README.md
@@ -97,23 +97,28 @@ Honest status — see the project-status note at the top of this README.
   before extending scope.
 - Verify each generative architecture end-to-end with smoke tests.
 - Wire the **StableHLO / native compilation path** for full transformer models.
+  As of 0.28.1 a full gemma3 graph exports to StableHLO and `iree-compile`s to a
+  `vmfb` (`GemmaMlirDumpTest`); next is running the compiled module and extending
+  the same path to the other families.
 
 ## Current release
 
-The current release is **0.25.0** — version-aligned with **SKaiNET 0.25.0**.
-Skips 0.24.x: the engine bumped 0.23.1 → 0.25.0 in the same release window
-without a tagged 0.24.x on either side.
Brings the new
-[hybrid adaptive DSL with optional dtype constraints](https://github.com/SKaiNET-developers/SKaiNET/pull/616)
-(`DTypePolicy.Any | Require | Prefer | OneOf`) to every `*NetworkLoader`,
-turns the catalog BOM-only so every internal build now exercises
-`sk.ainet:skainet-bom` end-to-end, and locks three reference models
-(`@Tag("smoke-reference")`) for the smoke tier.
+The current release is **0.28.1** — version-aligned with **SKaiNET 0.28.1**.
+Skips 0.26.x / 0.27.x: SKaiNET-transformers tracked the engine internally across
+that window without a tagged release. The headline is that the engine's
+**Kotlin DSL → StableHLO → IREE export path is now complete** — a full gemma3
+graph traces and lowers to StableHLO that `iree-compile`s to a `vmfb`
+(`GemmaMlirDumpTest` / `GemmaTraceTest` are green against 0.28.1). SKaiNET
+0.28.0/0.28.1 fixed the remaining export bugs: result-type inference for
+`reshape`/`matmul`/`concatenate` ([#673](https://github.com/SKaiNET-developers/SKaiNET/issues/673))
+and `conv1d`/`gather`/pooling/`flatten` shapes plus the `reduce_window` emission
+form ([#675](https://github.com/SKaiNET-developers/SKaiNET/issues/675)).
 
 The recommended way to consume is via the BOM. It pins every published `skainet-transformers-*` artifact and re-exports the upstream `sk.ainet:skainet-bom`, so the engine-side `sk.ainet.core:skainet-*` artifacts get the matching version too — you only need to declare the BOM version in one place.
 
 ```kotlin
 dependencies {
-    implementation(platform("sk.ainet.transformers:skainet-transformers-bom:0.25.0"))
+    implementation(platform("sk.ainet.transformers:skainet-transformers-bom:0.28.1"))
 
     // Versions resolved from the BOM:
     implementation("sk.ainet.transformers:skainet-transformers-core")
@@ -190,6 +195,18 @@ try (KLlamaSession session = KLlamaJava.loadGGUF(modelPath, /* systemPrompt */ n
 See `llm-test/llm-test-java/src/test/java/.../KLlamaJavaToolCallingTest.java` for a
 runnable reference.
 
+## What's new in 0.28.1
+
+- **Engine pin `skainet 0.27.0 → 0.28.1`.** Picks up the completed Kotlin DSL →
+  StableHLO → IREE export path. Every shape-changing op now declares its inferred
+  output type (`reshape`/`matmul`/`concatenate`, [#673](https://github.com/SKaiNET-developers/SKaiNET/issues/673);
+  `conv1d`/`gather`/pooling/`flatten`, [#675](https://github.com/SKaiNET-developers/SKaiNET/issues/675)),
+  and `reduce_window` is emitted in IREE's generic region form — so a full gemma3
+  graph traced via `GemmaMlirDumpTest` lowers to StableHLO that `iree-compile`s to
+  a `vmfb`. No transformers-side API changes; existing callers compile unchanged.
+- Verified end-to-end: `:llm-inference:gemma:jvmTest` green against the published
+  0.28.1 (`GemmaMlirDumpTest`, `GemmaTraceTest` pass).
+
 ## What's new in 0.25.0
 
 - **`DTypePolicy` on every `*NetworkLoader.fromGguf` / `.fromSafeTensors`
diff --git a/docs/modules/ROOT/pages/tutorials/getting-started-java.adoc b/docs/modules/ROOT/pages/tutorials/getting-started-java.adoc
index 873a84c..d5e51c8 100644
--- a/docs/modules/ROOT/pages/tutorials/getting-started-java.adoc
+++ b/docs/modules/ROOT/pages/tutorials/getting-started-java.adoc
@@ -25,7 +25,7 @@ In your `build.gradle.kts`:
 [source,kotlin]
 ----
 dependencies {
-    implementation(platform("sk.ainet.transformers:skainet-transformers-bom:0.25.0"))
+    implementation(platform("sk.ainet.transformers:skainet-transformers-bom:0.28.1"))
 
     implementation("sk.ainet.transformers:skainet-transformers-runtime-kllama")
     implementation("sk.ainet.transformers:skainet-transformers-agent")
@@ -41,7 +41,7 @@ Or in Maven (Maven needs the `-jvm` classifier suffix on platform artifacts):
         <dependency>
             <groupId>sk.ainet.transformers</groupId>
             <artifactId>skainet-transformers-bom</artifactId>
-            <version>0.25.0</version>
+            <version>0.28.1</version>
             <type>pom</type>
             <scope>import</scope>
         </dependency>
diff --git a/docs/modules/ROOT/pages/tutorials/llama3-tool-calling.adoc b/docs/modules/ROOT/pages/tutorials/llama3-tool-calling.adoc
index 6152eeb..710da06 100644
--- a/docs/modules/ROOT/pages/tutorials/llama3-tool-calling.adoc
+++
b/docs/modules/ROOT/pages/tutorials/llama3-tool-calling.adoc
@@ -52,7 +52,7 @@ The pieces you need live in three modules:
 [source,kotlin]
 ----
 dependencies {
-    implementation(platform("sk.ainet.transformers:skainet-transformers-bom:0.25.0"))
+    implementation(platform("sk.ainet.transformers:skainet-transformers-bom:0.28.1"))
 
     implementation("sk.ainet.transformers:skainet-transformers-runtime-kllama")
     implementation("sk.ainet.transformers:skainet-transformers-agent")
diff --git a/gradle.properties b/gradle.properties
index 6a60740..7efd6cc 100644
--- a/gradle.properties
+++ b/gradle.properties
@@ -1,5 +1,5 @@
 GROUP=sk.ainet.transformers
-VERSION_NAME=0.26.0
+VERSION_NAME=0.28.1
 
 POM_DESCRIPTION=SKaiNET-transformers
 
diff --git a/gradle/libs.versions.toml b/gradle/libs.versions.toml
index 415110f..188f043 100644
--- a/gradle/libs.versions.toml
+++ b/gradle/libs.versions.toml
@@ -1,5 +1,5 @@
 [versions]
-skainet = "0.27.0"
+skainet = "0.28.1"
 agp = "9.2.1"
 jacksonDatabind = "2.22.0"
 jsonSchemaValidator = "3.0.3"

From cfa34b7f59a961b75313f44ee4eaa685bb10d942 Mon Sep 17 00:00:00 2001
From: Michal Harakal
Date: Sat, 6 Jun 2026 22:19:22 +0200
Subject: [PATCH 2/2] docs: drop the 'Add a Compute Backend' how-to (belongs in upstream SKaiNET docs)

Compute backends are an engine (sk.ainet.core) concern, not a transformers one;
the how-to (and its nav link) referenced a core skainet-backend-metal artifact
and duplicated upstream material. Removed the page and its nav entry.
---
 docs/modules/ROOT/nav.adoc                |   1 -
 .../pages/how-to/add-compute-backend.adoc | 239 ------------------
 2 files changed, 240 deletions(-)
 delete mode 100644 docs/modules/ROOT/pages/how-to/add-compute-backend.adoc

diff --git a/docs/modules/ROOT/nav.adoc b/docs/modules/ROOT/nav.adoc
index 46ddc95..4262b66 100644
--- a/docs/modules/ROOT/nav.adoc
+++ b/docs/modules/ROOT/nav.adoc
@@ -10,7 +10,6 @@
 
 .How-to Guides
 * xref:how-to/add-model.adoc[Add a New Model Architecture]
-* xref:how-to/add-compute-backend.adoc[Add a Compute Backend]
 * xref:how-to/add-tool.adoc[Add a Custom Tool]
 * xref:how-to/run-unified-cli.adoc[Use the Unified CLI]
 * xref:how-to/benchmarking.adoc[Run Benchmarks]
diff --git a/docs/modules/ROOT/pages/how-to/add-compute-backend.adoc b/docs/modules/ROOT/pages/how-to/add-compute-backend.adoc
deleted file mode 100644
index c3388c4..0000000
--- a/docs/modules/ROOT/pages/how-to/add-compute-backend.adoc
+++ /dev/null
@@ -1,239 +0,0 @@
-= Add a Compute Backend
-:description: Implement and register a new compute backend (Metal GPU, MLX, CUDA, Vulkan) for the SKaiNET inference engine.
-
-This guide explains how to implement and register a new compute backend for the SKaiNET inference engine.
-
-== Architecture Overview
-
-----
-llm-core (commonMain)
-  └── BackendProvider        ← interface you implement
-  └── BackendRegistry        ← expect object (discovery)
-
-llm-core (jvmMain)
-  └── BackendRegistry.jvm    ← ServiceLoader-based discovery
-
-llm-core (registryBasedMain)
-  └── BackendRegistry        ← manual registry (native, JS, Wasm, Android)
-
-llm-runtime/kllama
-  └── CpuBackendProvider     ← reference implementation
-  └── META-INF/services/...
← JVM SPI registration
-  └── BackendActual.kt       ← native registration per platform
-----
-
-Backends are discovered differently per target:
-
-[cols="1,3", options="header"]
-|===
-| Target | Mechanism
-
-| JVM
-| `java.util.ServiceLoader` — auto-discovers from JAR
-
-| Native (macOS, Linux, iOS)
-| `registerPlatformBackends()` at startup
-
-| Android, JS, Wasm
-| `BackendRegistry.register()` called by host app
-|===
-
-The registry auto-selects the *highest priority available* backend.
-Users can override with `--backend=NAME` on the CLI.
-
-== Step 1: Implement BackendProvider
-
-Create a class that implements `BackendProvider` from
-`llm-core/src/commonMain/kotlin/sk/ainet/apps/llm/backend/BackendProvider.kt`:
-
-[source,kotlin]
-----
-package sk.ainet.apps.mybackend
-
-import sk.ainet.apps.llm.backend.BackendProvider
-import sk.ainet.context.ExecutionContext
-
-class MetalBackendProvider : BackendProvider {
-    override val name: String = "metal"
-    override val displayName: String = "Metal GPU"
-    override val priority: Int = 100  // GPU > CPU (0)
-
-    override fun isAvailable(): Boolean {
-        // Runtime check: is the hardware/driver present?
-        // Return false if not — the registry will skip this backend.
-        return try {
-            MetalExecutionContext()
-            true
-        } catch (_: Throwable) {
-            false
-        }
-    }
-
-    override fun createContext(): ExecutionContext {
-        return MetalExecutionContext()
-    }
-}
-----
-
-*Priority guidelines:*
-
-[cols="2,1", options="header"]
-|===
-| Backend | Priority
-
-| CPU | 0
-| GPU (Metal, Vulkan) | 100
-| Specialized accelerator | 200
-|===
-
-`isAvailable()` must be safe to call on any platform — return `false`
-if the hardware or native library is not present.
-
-== Step 2: Register the backend
-
-Registration differs by target.
-
-=== JVM — ServiceLoader (automatic)
-
-Create a service file in your module's JVM resources:
-
-----
-src/jvmMain/resources/META-INF/services/sk.ainet.apps.llm.backend.BackendProvider
-----
-
-Contents (one fully-qualified class name per line):
-
-----
-sk.ainet.apps.mybackend.MetalBackendProvider
-----
-
-That's it. Adding the JAR to the classpath makes it discoverable.
-The Shadow JAR's `mergeServiceFiles()` handles combining service files
-from multiple JARs.
-
-=== Native — manual registration
-
-In the platform-specific `BackendActual.kt`, add a `register()` call
-inside `registerPlatformBackends()`:
-
-[source,kotlin]
-----
-// llm-runtime/kllama/src/macosMain/kotlin/.../BackendActual.kt
-
-internal actual fun registerPlatformBackends() {
-    BackendRegistry.register(CpuBackendProvider())
-    BackendRegistry.register(MetalBackendProvider())  // ← add this
-}
-----
-
-This is called once at CLI startup before backend selection happens.
-
-=== Android / JS / Wasm
-
-Call `BackendRegistry.register()` from your application's initialization
-code before any inference calls:
-
-[source,kotlin]
-----
-// In your app's startup
-BackendRegistry.register(CpuBackendProvider())
-BackendRegistry.register(MyGpuBackendProvider())
-----
-
-== Step 3: Add the dependency
-
-=== As a separate module
-
-If the backend lives in its own Gradle module or external JAR:
-
-[source,kotlin]
-----
-// build.gradle.kts of the consuming module
-sourceSets {
-    val jvmMain by getting {
-        dependencies {
-            implementation("sk.ainet.core:skainet-backend-metal:0.21.0")
-        }
-    }
-    val macosMain by getting {
-        dependencies {
-            implementation("sk.ainet.core:skainet-backend-metal:0.21.0")
-        }
-    }
-}
-----
-
-=== Native bridge libraries
-
-If the backend wraps a native C/C{plus}{plus} library (Metal, MLX, Vulkan),
-configure linker opts in the native binary block:
-
-[source,kotlin]
-----
-macosArm64 {
-    binaries {
-        executable {
-            linkerOpts(
-                "-L/path/to/bridge", "-lmetal_bridge",
-                "-framework",
"Metal",
-                "-framework", "MetalPerformanceShaders",
-                "-framework", "Accelerate",
-            )
-        }
-    }
-}
-----
-
-== Step 4: Verify
-
-=== List backends
-
-[source,bash]
-----
-# JVM
-./gradlew :llm-runtime:kllama:runJvm --args="--list-backends"
-
-# Native (macOS)
-./llm-runtime/kllama/build/bin/macosArm64/debugExecutable/kllama.kexe --list-backends
-----
-
-Expected output:
-
-----
-Available backends:
-  metal   Metal GPU    (priority=100, available)
-  cpu     CPU (SIMD)   (priority=0, available)
-----
-
-=== Run with a specific backend
-
-[source,bash]
-----
-./gradlew :llm-runtime:kllama:runJvm \
-    --args="--backend=metal -m model.gguf 'Hello'"
-----
-
-=== Auto-selection
-
-Without `--backend`, the registry picks the highest-priority available
-backend automatically (Metal over CPU in this example).
-
-== Reference: Existing implementation
-
-The CPU backend serves as the reference implementation:
-
-* *Provider:* `llm-runtime/kllama/src/commonMain/kotlin/sk/ainet/apps/kllama/CpuBackendProvider.kt`
-* *JVM SPI file:* `llm-runtime/kllama/src/jvmMain/resources/META-INF/services/sk.ainet.apps.llm.backend.BackendProvider`
-* *Native registration:* `llm-runtime/kllama/src/{macosMain,linuxMain,iosMain}/kotlin/.../BackendActual.kt`
-* *Interface:* `llm-core/src/commonMain/kotlin/sk/ainet/apps/llm/backend/BackendProvider.kt`
-* *Registry:* `llm-core/src/commonMain/kotlin/sk/ainet/apps/llm/backend/BackendRegistry.kt`
-
-== File checklist for a new backend
-
-* [ ] BackendProvider implementation class
-* [ ] JVM: META-INF/services file listing the provider class
-* [ ] Native: register() call in platform BackendActual.kt
-* [ ] build.gradle.kts: dependency declaration
-* [ ] Native: linkerOpts if wrapping C/C{plus}{plus} bridge
-* [ ] Verify: --list-backends shows the new backend
-* [ ] Verify: --backend=NAME runs inference with it